---
title: pkgconf, CVE-2023-24056 and disinformation
date: 2023-01-24
---
Readers will have noticed that two maintenance releases of pkgconf were cut over the weekend,
1.9.4 and 1.8.1 respectively, to address [CVE-2023-24056][cve], a pkg-config specific variation
of the now-classic "[billion laughs attack][bla]". While fixing software defects is important,
a lot went wrong with how this CVE was reported and the motivations behind its disclosure, and
for my own catharsis, I want to talk about this.

[cve]: https://nvd.nist.gov/vuln/detail/CVE-2023-24056
[bla]: https://en.wikipedia.org/wiki/Billion_laughs_attack
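For readers unfamiliar with this class of bug, here is a minimal sketch in Python of how nested `${variable}` references can blow up during expansion, and how capping the output size bounds the damage. It is an illustration only, not pkgconf's actual C implementation: the `expand` function, the `ExpansionTooLarge` exception, the `limit` and `max_depth` parameters, and the `v0` to `v9` variable names are all hypothetical.

```python
import re

# Matches ${name} references; the set of allowed name characters is an assumption.
VAR_REF = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")


class ExpansionTooLarge(Exception):
    """Raised when an expansion would exceed the configured limits."""


def expand(value, variables, limit=65536, max_depth=64, _depth=0):
    """Recursively expand ${name} references, producing at most `limit` characters."""
    if _depth > max_depth:
        # Also catches self-referential definitions such as a=${a}.
        raise ExpansionTooLarge("variable references nested too deeply")
    pieces = []
    used = 0  # characters emitted so far
    pos = 0
    for match in VAR_REF.finditer(value):
        literal = value[pos:match.start()]
        pieces.append(literal)
        used += len(literal)
        if used > limit:
            raise ExpansionTooLarge("expansion exceeds the size limit")
        # Hand the nested expansion only the space that is still available.
        # Undefined variables expand to an empty string (an assumption).
        nested = expand(variables.get(match.group(1), ""), variables,
                        limit - used, max_depth, _depth + 1)
        pieces.append(nested)
        used += len(nested)
        pos = match.end()
    tail = value[pos:]
    used += len(tail)
    if used > limit:
        raise ExpansionTooLarge("expansion exceeds the size limit")
    pieces.append(tail)
    return "".join(pieces)


# Ten levels of definitions, each referencing the previous level ten times.
variables = {"v0": "lol"}
for i in range(1, 10):
    variables[f"v{i}"] = f"${{v{i - 1}}}" * 10

print(len(expand("${v2}", variables)))  # prints 300

try:
    expand("${v9}", variables)
except ExpansionTooLarge as exc:
    print("refused:", exc)
```

Each level multiplies the expanded size by ten, so without the cap this handful of short definitions would expand `${v9}` to roughly three billion characters.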
## The origin of `pkgconf`
To hopefully explain why I am so bothered by all of this, let's first understand the history of
pkgconf: a project I began noodling on in March 2011.

2011 was a particularly rough year for me. In January, my father was diagnosed with pancreatic
cancer, and declined to disclose this to anyone. When I came back to Oklahoma to visit my
parents in early March, I walked into my dad's house and found him jaundiced. I drove him to
the emergency room, and was informed that he only had a few months to live due to the pancreatic
cancer he allowed to progress to stage 4. This was *shocking* to me, especially considering I
was 23 at the time. The stress of it led to me breaking up with my then-boyfriend.

I did the only thing I could do given the situation: spent as much time with him as possible.
The hospital had installed Wi-Fi earlier that year, so I was able to take my computer and work
on my projects while I spent time with him. This worked out well, because it gave us a common
ground of subjects to talk about: my dad was the person who originally pushed me into getting
involved with software engineering as a profession in the first place. While he himself never
worked as a software engineer, he developed a number of small utilities and demo programs for
MS-DOS. Later, he became heavily interested in BSD, and then Slackware.

During this time period, pkg-config 0.26 was released, which required either a complicated
bootstrap procedure to satisfy the glib2 requirements by hand, or a pre-existing copy of
pkg-config to exist. Alpine was impacted by this bootstrap problem, and we ultimately decided
to hold back pkg-config on the 0.25 version because the bootstrapping problem was too complex
to solve for the pending release.

At the same time, I was looking for something, *anything* to work on that would serve as a
distraction and conversation piece. This created an opportunity: I could work on a replacement
pkg-config implementation that did not have the bootstrap requirement that the freedesktop
implementation required. I began working on pkgconf, specifically the .pc file parsing and
dependency graph walking code, while my dad was in the hospital. He found talking about it
*fascinating*, and so we discussed the various aspects of implementing a parser, and walking
dependency graphs in C. In a limited way, it was a project we collaborated on, in that I would
write code, tell him about it, and he'd point out ways my assumptions probably didn't hold
true.

Sadly, my father passed away in early April, so he didn't get to see the first viable
release, or to see pkgconf integrated into Linux distributions. After he passed away, I
quit working on it for a while, until a few friends of mine decided to pick it up and
experiment with it in Gentoo and FreeBSD.
## Maintaining a production-quality build tool at scale
These days, pkgconf is basically everywhere. It is the default pkg-config implementation in
every mainstream Linux distribution except Ubuntu. It is used heavily in embedded Linux
development and in plenty of other scenarios. My distfiles server, `distfiles.dereferenced.org`,
logs dozens of pkgconf downloads every second of the day.

The success of pkgconf is not without its problems though. There are aspects of the software
which, given what I know today, I would probably implement substantially differently. The
technical debt is real. I've been working, however, as time permits, to improve these problems
in the `pkgconf-1.9.x` release series.

But when pkgconf does something which is unexpected, and breaks a user's build... those
interactions are rarely fun. Many times, the user with the issue shows up on the issue
tracker, or worse, my personal inbox in a bad mood, which results in a triage experience
that is suboptimal for everyone involved. Thankfully, this doesn't happen so much
anymore, as we have worked hard to balance compatibility and developer-friendly output
from the tool.

But as smooth as things are these days, maintaining a production build tool imposes a lot
of burden that you cannot begin to expect until you've done it before. It is not enough
to simply tell a user that the framework he is using is doing things wrong, for example,
underspecifying its dependencies. You must consider "self-service" features: ones which
allow the user to diagnose the issues in his build and correct them himself. By doing
so, you provide the user with a good experience, and keep support requests from annoyed
users much lower. All of this has to be designed and implemented in production build
tools.
## The appearance of "competition"
The past weekend has been a wild ride for me. I recently moved to Seattle, and have
been getting settled in. A few people brought [u-config: a new, lean pkg-config clone][ucblog]
to my attention. At first, I shrugged it off, and mostly would have continued to do
so. An implementation of pkg-config on Windows would be good for me, personally, as
I do not develop pkgconf on Windows, and different people who contribute to the
maintenance of pkgconf's Windows support have different goals. This has led to some
significant fragmentation of pkgconf on the Windows side, with different tools bundling
it supporting specific aspects of the pkg-config format in different ways.

[ucblog]: https://nullprogram.com/blog/2023/01/18/

I have a number of social and technical observations about u-config. Some good,
some not so good. To start off with the social aspects: I don't particularly
appreciate the level of aggression directed toward pkgconf. While that alone would
not normally be a turn-off for me (one has to have a reasonably thick skin when
being a FOSS maintainer), casually dropping the "billion laughs" 0day with a snide
comment about how we should use ASan (we do) when developing pkgconf was too much,
and the bug itself (a mistake in accounting for available buffer space during variable
expansion) was overstated.

There are a lot of good things about u-config. By focusing on only the minimally
required functionality, the author was able to write an excellent tool which has
the potential to someday be a replacement for pkgconf. I am open to talking about
such a deprecation, even.

However, after the initial blogpost (which contained disinformation about both
freedesktop pkg-config *and* pkgconf), there was additional disinformation from
another person who is enthusiastic about the u-config project. Notably, he
submitted a patch, which amongst other things, could be misinterpreted by readers
to conclude that `pkgconf` does not consider `/usr/include` as a system include
path. When configured correctly, it definitely does. For example, on Alpine Linux:

    pestilence:~$ pkgconf --dump-personality
    Triplet: default
    DefaultSearchPaths: /usr/lib/pkgconfig /usr/share/pkgconfig
    SystemIncludePaths: /usr/include
    SystemLibraryPaths: /usr/lib

But this [particular disinformation was merged by the author of the software][uc-disinfo], without
regard for checking the comment for disinformation, despite how absurd it would be
if it were true.

[uc-disinfo]: https://github.com/skeeto/u-config/commit/c069c94d77e1381cf7d67b8283601c5e79a91534#diff-c1f8e1880984a1a513fbb1c1191ea62910de9f1656c89f30d41609fb7317080bR1563

*Update (28 January 2023):* Since the initial publication of this blog, the comment
introduced in the above patch has been corrected to reflect a specific edge case
relating to `-I/usr/include` versus `-I /usr/include`. I believe the discrepancy
in the handling of both fragments to be a bug, one which was not reported to me,
but rather discussed only in the source code comment. The contributor of the patch
in question to u-config, in particular, has pointed out the fact that they later changed
the source code comment to clarify the issue, as part of an attempt to deflect from
the point of this blog: discussing how the u-config author and contributors have
chosen to engage in bad faith with other pkg-config implementations (especially
pkgconf) from the beginning of their project. While I plan to fix the non-reported
discrepancy in the next pkgconf release, I will note that the u-config authors have
so far [chosen to not handle this edge case][uc-comment-2].

[uc-comment-2]: https://github.com/skeeto/u-config/blob/7b5d32f/u-config.c#L1679-L1686
## `pkg-config` implementations do specific things for a reason
In the UNIX environment, the behavior of the system toolchain is static and
must be well-defined. Tools which act adjacently to the system C toolchain
must behave in ways which are aware of how the C toolchain is configured
to behave. This is why `pkgconf` checks several different environment
variables to learn about how the system toolchain has been configured, and
what deviations, if any, have been configured via the environment.

A frequent pattern in UNIX pkg-config files is to write things like:

    prefix=/usr
    includedir=${prefix}/include
    libdir=${prefix}/lib

    Name: whatever
    Description: example library
    Version: 0
    Cflags: -I${includedir}
    Libs: -L${libdir} -lwhatever

On Windows, `pkg-config` implementations have `--define-prefix`, which is
used to override the `${prefix}` variable for this reason.

If `pkg-config` is not aware of `/usr/include` being a *system* include path,
then a disaster can happen when querying for multiple dependencies at the same
time. Consider this other pkg-config file:

    prefix=/usr
    includedir=${prefix}/include/OtherLib
    libdir=${prefix}/lib

    Name: OtherLib
    Description: example library
    Version: 0
    Cflags: -I${includedir}
    Libs: -L${libdir} -lother

Now let's say that `OtherLib` has a `/usr/include/OtherLib/math.h` file which
uses `#include_next` to enhance the `math.h` header. A real-world example of
a library which does this is `libbsd`. Well, if you query pkg-config with
`pkg-config --cflags --libs whatever OtherLib`, then you will get:

    pestilence:~$ pkgconf --with-path=examples/ --personality=examples/broken.personality --cflags --libs whatever OtherLib
    -I/usr/include -I/usr/include/OtherLib -lwhatever -lother

This means that `/usr/include/math.h` will be preferred over `/usr/include/OtherLib/math.h`,
and your build will fail.
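The filtering that avoids this outcome can be sketched roughly as follows, in Python. This is a minimal illustration, not pkgconf's actual C implementation: `system_include_paths` and `filter_cflags` are hypothetical helper names, and the sketch assumes POSIX paths, a colon-separated `PKG_CONFIG_SYSTEM_INCLUDE_PATH` override, and `/usr/include` as the only default system include directory.

```python
import os
import posixpath
import shlex


def system_include_paths(environ=None, default=("/usr/include",)):
    """Return the system include directories.

    Uses PKG_CONFIG_SYSTEM_INCLUDE_PATH when it is set, otherwise `default`.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("PKG_CONFIG_SYSTEM_INCLUDE_PATH")
    if not raw:
        return list(default)
    return [path for path in raw.split(":") if path]


def filter_cflags(cflags, system_dirs):
    """Drop -I fragments that point at a system include directory."""
    system = {posixpath.normpath(d) for d in system_dirs}
    tokens = shlex.split(cflags)
    kept = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "-I" and i + 1 < len(tokens):
            # "-I /usr/include": the flag and the path are separate tokens.
            path, step = tokens[i + 1], 2
        elif token.startswith("-I") and len(token) > 2:
            # "-I/usr/include": the flag and the path share one token.
            path, step = token[2:], 1
        else:
            # Not an include fragment: pass it through unchanged.
            kept.append(token)
            i += 1
            continue
        if posixpath.normpath(path) not in system:
            kept.append("-I" + path)
        i += step
    return " ".join(kept)


print(filter_cflags("-I/usr/include -I/usr/include/OtherLib",
                    system_include_paths({})))
# prints: -I/usr/include/OtherLib
```

The sketch treats `-I/usr/include` and `-I /usr/include` identically. Once the system directory is dropped, the compiler only reaches `/usr/include` through its built-in search path, after `/usr/include/OtherLib`, which is the ordering that `#include_next` in a wrapper header relies on.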
So this type of filtering, and the other types of filtering that pkgconf does, is very important
in the UNIX environment. The author of u-config will unfortunately have to learn these things
one by one as users come to him with bug reports.

There is probably an alternate reality where u-config and pkgconf work together to deprecate pkgconf,
and someday I hope that will be the reality here. But until the disinformation and putdowns are
addressed, it will unfortunately be impossible to collaborate.

Anyway, if you got through all of this, thanks for reading, I guess.