diff --git a/content/blog/a-brief-history-of-configuration-defined-image-builders.md b/content/blog/a-brief-history-of-configuration-defined-image-builders.md new file mode 100644 index 0000000..cbf31e6 --- /dev/null +++ b/content/blog/a-brief-history-of-configuration-defined-image-builders.md @@ -0,0 +1,46 @@

---
title: "A Brief History of Configuration-Defined Image Builders"
date: "2021-04-06"
---

When you think of a configuration-defined image builder, most likely you think of Docker (which builds images for containers). But before Docker, there were several other projects, all of which came out of a vibrant community of Debian-using sysadmins looking for better ways to build VM and container images. That work led to a series of projects, each building on the last to make something better.

## Before KVM, there was Xen

The [Xen hypervisor](https://xenproject.org/developers/teams/xen-hypervisor/) is likely something you've heard of, and that's where this story begins. The mainstream desire to programmatically create OS images came about as Xen became a popular hypervisor in the mid 2000s. The first development in that regard was [xen-tools](https://www.xen-tools.org/software/xen-tools/), which automated installation of Debian, Ubuntu and CentOS guests by generating images for them using custom Perl scripts. The world has largely moved on from Xen, but it still sees wide use.

## ApplianceKit and ApplianceKit-NG

The methods used in xen-tools, while generally effective, lacked flexibility. Hosting providers needed a way to allow end-users to customize the images they deployed. In my case, we solved this by creating ApplianceKit. That particular venture was sold to another hosting company, and for whatever reason, I started another one. In that venture, we created ApplianceKit-NG.

ApplianceKit and ApplianceKit-NG took different approaches internally to solve the same basic problem: taking an XML description of a software image and reproducing it, for example:

```xml
<appliance>
  <description>LAMP appliance based on Debian squeeze</description>
  <author>
    <name>Ariadne Conill</name>
    <email>ariadne@dereferenced.org</email>
  </author>
  <distribution>squeeze</distribution>
  <packages>
    <package>apache2</package>
    <package>libapache2-mod-php5</package>
    <package>mysql-server</package>
    <package>mysql-client</package>
  </packages>
</appliance>
```

As you can see here, the XML description described a _desired state_ for the image to be in at deployment time. ApplianceKit did this through an actor model: different modules would act on elements in the configuration description. [ApplianceKit-NG](https://bitbucket.org/tortoiselabs/appliancekit-ng/src/master/) instead treated this as a matter of compilation: first, a high-level pass converted the XML into a [mid-level IR](https://bitbucket.org/tortoiselabs/appliancekit-ng/src/master/ADL.md), then the mid-level IR was converted into a low-level IR, and finally that IR was converted into a series of commands that were evaluated like a shell script. (Had I known about skarnet's execline at that time, I would have used it.)

## Docker

Another company that was active in the Debian community and experimenting with configuration-defined image building was dotCloud. dotCloud took a similar evolutionary path, with the final image building system they made being Docker. Docker evolved the concept outlined in ApplianceKit-NG further by simplifying everything: instead of explicitly configuring a desired state, you simply use image layering:

```dockerfile
FROM debian:squeeze
MAINTAINER ariadne@dereferenced.org
RUN apt-get update && apt-get install apache2 libapache2-mod-php5 mysql-server mysql-client
```

By taking a simpler approach, Docker has won out.
Everything is built on top of Docker these days, Kubernetes included, and this is a good thing. Even though some projects like Packer have further advanced the state of the art, Docker remains the go-to for this task, simply because it's simple enough for people to mostly understand.

The main takeaway is that simply advancing the state of the art is not good enough to make a project compelling. It must advance the state of simplicity too.

diff --git a/content/blog/a-silo-can-never-provide-digital-autonomy-to-its-users.md b/content/blog/a-silo-can-never-provide-digital-autonomy-to-its-users.md new file mode 100644 index 0000000..2c7d68b --- /dev/null +++ b/content/blog/a-silo-can-never-provide-digital-autonomy-to-its-users.md @@ -0,0 +1,18 @@

---
title: "a silo can never provide digital autonomy to its users"
date: "2022-07-01"
---

Lately there has been a lot of discussion about various silos and their activities, notably GitHub and an up-and-coming alternative to Tumblr called Cohost. By analyzing the behavior of both of these silos, I'd like to make the point that silos, by design, do not and cannot elevate user freedoms, even when they are run with the best of intentions.

It is said that if you are not paying for a service, you are the product. To look at this, we will start with GitHub, which has had a significant controversy over the past year with its now-commercial Copilot service. Copilot is a paid service which provides code suggestions using a neural network model that was trained using the entirety of publicly posted source code on GitHub as its corpus. As many have noted, this is likely a problem from a copyright point of view.

Microsoft claims that this use of the GitHub public source code is ethically correct and legal, citing fair use as their justification for data mining the entire GitHub public source corpus. Interestingly, in the EU, there is a "text and data mining" exception to the copyright directive, [which may provide some precedent for this thinking](https://deliverypdf.ssrn.com/delivery.php?ID=380124069122109084081011069119068081059089022064027023064104069125083028119005007123033062000029047123108125065064093118008030058071007053078069071085069007101073030038014010096097074114126065017112027071084124110068123116074098119115105064007068091122&EXT=pdf&INDEX=TRUE). While the legal construction they use to justify the way they trained the Copilot model is interesting, it is important to note that we, as consumers of the GitHub service, enabled Microsoft to do this by uploading source code to their service.

Now let's talk about [Cohost](https://cohost.org), a recently launched alternative to Tumblr which is paid for by its subscribers, and promises that it will never sell out to a third party. While I think that Cohost will likely be one of the more ethically-run silos out there, it is still a silo, and like Microsoft's GitHub, it has business interests (subscriber retention) which [place it in conflict with the goals of digital autonomy](https://techautonomy.org/). Specifically, like all silos, Cohost's platform is designed to keep users inside the Cohost platform, just as GitHub uses the network effect of its own silo to make it difficult to use anything other than GitHub for collaboration on software.

Some have argued that, due to the network effects of silos, the only thing which can defeat a bad silo is a good silo.
The problem with this argument is that it requires one to accept the supposition that there can be a good silo. Silos, by their very nature as centralized services under the control of the privileged, cannot be good if you look at the power structures they impose. Instead, we should use our privilege to lift others up, something that commercial silos, by design, are incapable of doing.

How do we do this, though? One way is to embrace networks of consent. From a technical point of view, the IndieWeb people have worked on a number of simple, easy-to-implement protocols which allow web services to interact openly with each other, but in a way that lets a website owner define policy over what content they will accept. From a social point of view, we should avoid commercial silos such as GitHub and use our own infrastructure, either through self-hosting or through membership in a cooperative or public society.

Although I understand that both of these goals can be difficult to achieve, they make more sense than jumping from one silo to the next every time one crosses the line. You control where you choose to participate -- for me, that means shifting my participation so that I use commercial silos only when absolutely necessary. We should choose to participate in power structures which value our communal membership, rather than our ability to generate revenue.

diff --git a/content/blog/a-slightly-delayed-monthly-status-update.md b/content/blog/a-slightly-delayed-monthly-status-update.md new file mode 100644 index 0000000..43a80b5 --- /dev/null +++ b/content/blog/a-slightly-delayed-monthly-status-update.md @@ -0,0 +1,52 @@

---
title: "A slightly-delayed monthly status update"
date: "2021-06-04"
---

A few weeks ago, I announced the [creation of a security response team for Alpine](https://ariadne.space/2021/04/20/building-a-security-response-team-in-alpine/), of which I am presently the chair.

Since then, the team has been fully chartered by both the previous Alpine core team and the new Alpine council, and we have gotten a few members on board working on security issues in Alpine. Once the Technical Steering Committee is fully formed, the security team will report to the TSC and fall under its purview.

Accordingly, I thought it would be prudent to start writing monthly updates summarizing what I've been up to. This one is a little delayed because we've been focused on getting Alpine 3.14 out the door (the first RC should come out on Monday)!

## secfixes-tracker

One of the primary activities of the security team is to manage the [security database](https://secdb.alpinelinux.org). This is largely done using the secfixes-tracker application I wrote in April. At AlpineConf, I gave a bubble talk about the new security team, including a demonstration of how we use the secfixes-tracker application to research and mitigate security vulnerabilities.

From the creation of the security team through the Alpine 3.14 release cycle, other security team volunteers and I have mitigated over 100 vulnerabilities through patching or non-maintainer security upgrades in the pending 3.14 release alone, and many more in past releases which are still supported.

All of this work in finding unpatched vulnerabilities is done using secfixes-tracker. However, while it finds many vulnerabilities, it is not perfect. There are both false positives and false negatives, which we are working on improving.
+ +The next step for secfixes-tracker is to integrate it into GitLab, so that maintainers can log in and reject CVEs they deem irrelevant in their packages instead of having to attribute a security fix to version `0`.  I am also [working on a protocol to allow security trackers to share data](https://docs.google.com/document/d/11-m_aXnrySM6KeA5I6BjdeGeSIxymfip4hseg2Y0UKw/edit#heading=h.bz0hbmpvjhfb) with each other in an automated way. + +## Infrastructure + +Another role of the security team is to advise the infrastructure team on security-related matters.  In the past few weeks, this primarily focused around two issues: how to [securely relay patches from the alpine-aports mailing list into GitLab without compromising the security of `aports.git`](https://gitlab.alpinelinux.org/mailinglist-bot) and [our response to recent changes in freenode](https://freenode.net/news/freenode-is-foss), where it was the recommendation of the security team to [leave freenode in favor of OFTC](https://alpinelinux.org/posts/Switching-to-OFTC.html). + +## Reproducible Builds + +Another project of mine personally is working to prove the reproducibility of Alpine package builds, as part of the [Reproducible Builds project](https://reproducible-builds.org/).  To this end, I hope to have the Alpine 3.15 build fully reproducible.  This will require some changes to `abuild` so that it produces buildinfo files, as well as a rebuilder backend.  We plan to use the same buildinfo format as Arch, and will likely adapt some of the other reproducible builds work Arch has done to Alpine. + +I plan to have a meeting within the next week or two to formulate an official reproducible builds team inside Alpine and lay out the next steps for what we need to do in order to get things going.  In the meantime, join `#alpine-reproducible` on `irc.oftc.net` if you wish to follow along. + +I plan for reproducible builds (perhaps getting all of main reproducible) to be a sprint in July, once the prerequisite infrastructure is in place to support it, so stay tuned on that. + +## apk-tools 3 + +On this front, there's not much to report yet.  My goal is to integrate the security database into our APKINDEX, so that we can have `apk list --upgradable --security`, which lists all of the security fixes you need to apply.  Unfortunately, we are still working to finalize the ADB format which is a prerequisite for providing the security database in ADB format.  It does look like Timo is almost done with this, so once he is done, I will be able to start working on a way to reflect the security database into our APKINDEX files. + +## The `linux-distros` list + +There is a mailing list which is intended to allow linux distribution security personnel to discuss security issues in private.  As Alpine now has a security team, it is possible for Alpine to take steps to participate on this list. + +However... participation on this list comes with a few restrictions: you have to agree to follow all embargo terms in a precise way.  For example, if an embargoed security vulnerability is announced there and the embargo specifies you may not patch your packages until XYZ date, then you must follow that or you will be kicked off the list. + +I am not sure it is necessarily appropriate or even valuable for Alpine to participate on the list.  At present, if an embargoed vulnerability falls off a truck and Alpine notices it, we can fix it immediately.  
If we join the `linux-distros` list, then we may be put in a position where we have to hide problems, which I didn't sign up for.  I consider it a feature that the Alpine security team is operating fully in the open for everyone to see, and want to preserve that as much as possible. + +The other problem is that distributions which participate [bind their package maintainers to an NDA](https://wiki.gentoo.org/wiki/Project:Security/Pre-Release-Disclosure) in order to look at data relevant to their packages.  I don't like this at all and feel that it is not in the spirit of free software to make contributors acknowledge an NDA. + +We plan to discuss this over the next week and see if we can reach consensus as a team on what to do.  I prefer to fix vulnerabilities, not wait to fix vulnerabilities, but obviously I am open to being convinced that there is value to Alpine's participation on that list. + +## Acknowledgement + +My activities relating to Alpine security work are presently sponsored by Google and the Linux Foundation.  Without their support, I would not be able to work on security full time in Alpine, so thanks! diff --git a/content/blog/a-tail-of-two-bunnies.md b/content/blog/a-tail-of-two-bunnies.md new file mode 100644 index 0000000..075cbcd --- /dev/null +++ b/content/blog/a-tail-of-two-bunnies.md @@ -0,0 +1,36 @@ +--- +title: "a tail of two bunnies" +date: "2021-08-21" +--- + +As many people know, I collect stuffed animals.  Accordingly, I get a lot of questions about what to look for in a quality stuffed animal which will last a long time.  While there are a lot of factors to consider when evaluating a design, I hope the two examples I present here in contrast to each other will help most people get the basic idea. + +## the basic things to look for + +A stuffed animal is basically a set of fabric patches sewn together around some stuffing material.  Therefore, the primary mode of failure for a stuffed animal is when one or more seams suffers a tear or rip in its stitching.  A trained eye can look at a design and determine both the likelihood of failure and the most vulnerable seams, even in a high quality stuffed animal. + +There are two basic ways to sew together a stuffed animal: the fabric patches can be sewn together to form inward-facing seams, or they can be sewn together to form outward-facing seams.  Generally, the stuffed animals that have inward facing seams have more robust construction.  This means that if you can easily see the seam lines that the quality is likely to be low.  Similarly, if eyes and other accessories are sewn in along a main seam line, they become points of vulnerability in the design. + +Materials also matter: if the purpose of the stuffed animal is to be placed on a bed, or in a crib, it should be made out of fire-retardant materials.  Higher quality stuffed animals will use polyester fill with a wool-polyester blend for the outside, while lower quality stuffed animals may use materials like cotton.  In the [event of a fire](https://www.sikkerhverdag.no/en/safe-products/clothes-and-equipment/these-clothes-are-the-most-flammable/), polyester can potentially melt onto skin, but materials like cotton will burn much more vigorously than polyester (which is fire retardant). + +Finally, it is important to verify that the stuffed animal has been certified to a well-known safety standard.  Look for compliance with the European Union's EN71 safety standard or the ASTM F963 standard.  
Do not buy any stuffed animal made by a company which is not compliant with these standards.  Stuffed animals bought off maker-oriented websites like Etsy will most likely not be certified, in these cases, you may wish to verify with the maker that they are familiar with the EN71 and ASTM F963 standards and have designed around those standards. + +## a good example: the jellycat bashful bunny + +![A jellycat bashful bunny, cream colored, size: really big. it is approximately 4 feet tall.](images/BARB1BC-300x300.jpg) + +One of my favorite bunny designs is the [Jellycat Bashful Bunny](https://www.jellycat.com/us/bashful-cream-bunny-bas3bc/).  I have several of them, ranging from small to the largest size available. + +This is what I would consider to be a high quality design.  While the seam line along his tummy is visible, it is a very small seam line, which is indicative that the stitching is inward-facing.  There are no other visible seam lines.  Cared for properly, this stuffed animal will last a very long time. + +## a bad example: build a bear's pawlette + +![Jumbo Pawlette, from build a bear. This variant is 3 feet tall.](images/25756Alt1x-300x300.jpg) + +A few people have asked me about [Build a Bear's Pawlette design](https://www.buildabear.com/online-exclusive-jumbo-pawlette/025756.html) recently, as it looks very similar to the Jellycat Bashful Bunny.  I don't think it is a very good design. + +To start with, you can see that there are 21 separate panels stitched together: 4 for the ears, 3 for the head, 4 for the arms, 2 for the tummy, 2 for the back, 4 for the legs, and 2 for the feet.  The seam lines are very visible, which indicates that there is a high likelihood that the stitching is outward rather than inward.  That makes sense, because it's a lot easier to stitch up a stuffed animal in store that way.  Additionally, you can see that the eyes are anchored to the seam lines that make up the face, which means detachment of the eyes is a likely possibility as a failure mode. + +Build a Bear has some good designs that are robustly constructed, but Pawlette is not one of them.  I would avoid that one. + +Hopefully this is helpful to somebody, at the very least, I can link people to this post now when they ask about this stuff. diff --git a/content/blog/a-tale-of-two-envsubst-implementations.md b/content/blog/a-tale-of-two-envsubst-implementations.md new file mode 100644 index 0000000..7110781 --- /dev/null +++ b/content/blog/a-tale-of-two-envsubst-implementations.md @@ -0,0 +1,107 @@ +--- +title: "A tale of two envsubst implementations" +date: "2021-04-15" +--- + +Yesterday, Dermot Bradley brought up in IRC that gettext-tiny's lack of an `envsubst` utility could be a potential problem, as many Alpine users [use it to generate configuration from templates](https://www.robustperception.io/environment-substitution-with-docker).  So I decided to look into writing a replacement, as the tool did not seem that complex.  That rewrite is [now available on GitHub](https://github.com/kaniini/envsubst), and is already in Alpine testing for experimental use. + +## What `envsubst` does + +The `envsubst` utility is designed to take a set of strings as input and replace variables in them, in the same way that shells do variable substitution.  Additionally, the variables that will be substituted can be restricted to a defined set, which is nice for reliability purposes. 
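To make that behavior concrete, here is a rough Python sketch of that kind of restricted substitution; it is purely illustrative (the `envsubst` helper and the variable names are made up for this example) and is not how GNU `envsubst` or my replacement is actually implemented:

```python
import os
import re

def envsubst(text, allowed=None):
    """Replace $VAR and ${VAR} with values from the environment.

    If `allowed` is given, only those variable names are substituted;
    anything else is passed through untouched.
    """
    pattern = re.compile(r"\$(\w+)|\$\{(\w+)\}")

    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)           # not in the allowed set: leave it alone
        return os.environ.get(name, "")     # unset variables become empty strings

    return pattern.sub(replace, text)

os.environ["PORT"] = "8080"
print(envsubst("listen ${PORT}; root $WEBROOT;", allowed={"PORT"}))
# -> listen 8080; root $WEBROOT;
```

Roughly speaking, the real tool substitutes everything from the environment when given no restriction, and behaves like the `allowed` case when one is supplied.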
Because it provides a simple way to perform substitutions in a file without having to mess with `sed` and other similar utilities, it is seen as a helpful tool for building configuration files from templates: you just install the `cmd:envsubst` provider with apk and perform the substitutions.

Unfortunately, GNU `envsubst` is quite deficient in terms of both functionality and interface.

## Good tool design is important

When building a tool like `envsubst`, it is important to think about how it will be used. One of the things that is really important is making sure a tool is satisfying to use: a tool which has non-obvious behavior or implies functionality that is not actually there is a badly designed tool. Sadly, while sussing out a list of requirements for my replacement `envsubst` tool, I found that GNU `envsubst` has several deficiencies that are quite disappointing.

### GNU `envsubst` does not actually implement POSIX variable substitution like a shell would

In POSIX, variable substitution is more than simply replacing a variable with the value it is set to. In GNU `envsubst`, the documentation speaks of _shell variables_, and then outlines the `$FOO` and `${FOO}` formats for representing those variables. The latter format implies that POSIX variable substitution is supported, but it's not.

In a POSIX-conformant shell, you can do:

```
% FOO="abc_123"
% echo ${FOO%_*}
abc
```

Unfortunately, this isn't supported by GNU `envsubst`:

```
% FOO="abc_123" envsubst
$FOO
abc_123
${FOO}
abc_123
${FOO%_*}
${FOO%_*}
```

It's not yet supported by my implementation either, [but it's on the list of things to do](https://github.com/kaniini/envsubst/issues/1).

### Defining a restricted set of environment variables is bizarre

GNU `envsubst` describes taking an optional `[SHELL-FORMAT]` parameter. The way this feature is implemented is truly bizarre, as seen below:

```
% envsubst -h
Usage: envsubst [OPTION] [SHELL-FORMAT]
...
Operation mode:
  -v, --variables             output the variables occurring in SHELL-FORMAT
...
% FOO="abc123" BAR="xyz456" envsubst FOO
$FOO
$FOO
% FOO="abc123" envsubst -v FOO
% FOO="abc123" envsubst -v \$FOO
FOO
% FOO="abc123" BAR="xyz456" envsubst \$FOO
$FOO
abc123
$BAR
$BAR
% FOO="abc123" BAR="xyz456" envsubst \$FOO \$BAR
envsubst: too many arguments
% FOO="abc123" BAR="xyz456" envsubst \$FOO,\$BAR
$FOO
abc123
$BAR
xyz456
$BAZ
$BAZ
% envsubst -v
envsubst: missing arguments
%
```

As discussed above, `[SHELL-FORMAT]` is a very strange thing to call this, because it is not really a shell variable substitution format at all.

Then there's the matter of requiring variable names to be provided in this shell-like variable format. That requirement gives a shell script author the ability to easily break their script by accident, for example:

```
% echo 'Your home directory is $HOME' | envsubst $HOME
Your home directory is $HOME
```

Because you forgot to escape `$HOME` as `\$HOME`, the substitution list was empty:

```
% echo 'Your home directory is $HOME' | envsubst \$HOME
Your home directory is /home/kaniini
```

The correct way to handle this would be to accept `HOME` without having to describe it as a variable. That approach is supported by my implementation:

```
% echo 'Your home directory is $HOME' | ~/.local/bin/envsubst HOME
Your home directory is /home/kaniini
```

Then there's the matter of not supporting multiple variables in the traditional UNIX style (as separate tokens).
Being forced to use a comma on top of using a variable sigil for this is just bizarre and makes the tool absolutely unpleasant to use with this feature. For example, this is how you're supposed to add two variables to the substitution list in GNU `envsubst`:

```
% echo 'User $USER with home directory $HOME' | envsubst \$USER,\$HOME
User kaniini with home directory /home/kaniini
```

While my implementation supports doing it that way, it also supports the more natural UNIX way:

```
% echo 'User $USER with home directory $HOME' | ~/.local/bin/envsubst USER HOME
User kaniini with home directory /home/kaniini
```

## This is common with GNU software

This isn't just about GNU `envsubst`. A lot of other GNU software is equally broken. Even the GNU C library [has design deficiencies which are similarly frustrating](https://drewdevault.com/2020/09/25/A-story-of-two-libcs.html). The reason I wish to replace GNU software in Alpine is that, in many cases, it is _defective by design_. Whether the design defects are caused by apathy or by politics does not matter. The end result is the same: we get defective software. I want better security and better reliability, which means we need better tools.

We can talk about the FSF political issue, and many are debating that at length. But the larger picture is that the tools made by the GNU project are, for the most part, clunky and unpleasant to use. That's the real issue that needs solving.

diff --git a/content/blog/activitypub-the-present-state-or-why-saving-the-worse-is-better-virus-is-both-possible-and-important.md b/content/blog/activitypub-the-present-state-or-why-saving-the-worse-is-better-virus-is-both-possible-and-important.md new file mode 100644 index 0000000..43f4f33 --- /dev/null +++ b/content/blog/activitypub-the-present-state-or-why-saving-the-worse-is-better-virus-is-both-possible-and-important.md @@ -0,0 +1,110 @@

---
title: "ActivityPub: the present state, or why saving the 'worse is better' virus is both possible and important"
date: "2019-01-10"
---

> This is the second article in a series that will be a fairly critical review of ActivityPub from a trust & safety perspective. Stay tuned for more.

In [our previous episode](https://blog.dereferenced.org/activitypub-the-worse-is-better-approach-to-federated-social-networking), I laid out some personal observations about implementing an AP stack from scratch over the past year. When we started this arduous task, there were only three other AP implementations in progress: Mastodon, Kroeg and PubCrawl (the AP transport for Hubzilla), so it has been a pretty significant journey.

I also described how ActivityPub was a student of the 'worse is better' design philosophy. Some people felt a little hurt by this, but they shouldn't have: after all, UNIX (of which modern Linux and BSD systems are a derivative) is also a student of the 'worse is better' philosophy. And much like the unices of yesteryear, ActivityPub right now has a lot of missing pieces. But that's alright, as long as the participants in this experiment understand the limitations.

For the first time in decades, the success of ActivityPub, in part by way of its aggressive adoption of the 'worse is better' philosophy (which enabled its authors to ship _something_), has gained enough traction to inspire people to believe that perhaps we can take back the Web and make it open again. This in itself is a wonderful thing, and we must do our best to seize this opportunity and run with it.
+ +As I mentioned, there have been a huge amount of projects looking to implement AP in some way or other, many not yet in a public stage but seeking guidance on how to write an AP stack. My DMs have been quite busy with questions over the past couple of months about ActivityPub. + +## Let's talk about the elephant in the room, actually no not that one. + +ActivityPub has been brought this far by the [W3C Social CG](https://www.w3.org/community/socialcg/). This is a Community Group that was chartered by the W3C to advance the Social Web. + +While they did a good job at getting some of the best minds into the same room and talking about building a federated social web, a lot of decisions were already predetermined (using pump.io as a basis) or left underspecified to satisfy other groups inside W3C. Finally, the ActivityPub specification itself claimed that pure JSON could be used to implement ActivityPub, but the W3C kept pushing for layered specs on top like [JSON-LD Linked Data Signatures](https://w3c-dvcg.github.io/ld-signatures/), a spec that is not yet finalized but depends on JSON-LD. + +[LDS has a lot of problems](https://blog.dereferenced.org/the-case-for-blind-key-rotation), but I already covered them already. You can read about some of those problems by reading up on a mitigation known as [Blind Key Rotation](https://blog.dereferenced.org/the-case-for-blind-key-rotation). Anyway, this isn't _really_ about W3C pushing for use of LDS in AP, that is just one illustrated example of trying to bundle JSON-LD and dependencies into ActivityPub to make JSON-LD a defacto requirement. + +Because of this bundling issue, we established a new community group, called [LitePub](https://litepub.social/litepub), this was meant to be a workspace for people actually implementing ActivityPub stacks so that they could get documentation and support for using ActivityPub without JSON-LD, or using JSON-LD in a safe way. To date, the LitePub community is one of the best resources for asking questions about ActivityPub and getting real answers that can be used in production today. + +But to build the next generation of ActivityPub, the LitePub group isn't enough. Is W3C still interested? Unfortunately, from what I can tell, not really: [they are pursuing another system that was developed in house called SOLID](https://www.w3.org/community/solid/), which is built on the [Linked Data Platform](https://www.w3.org/TR/ldp/). Since SOLID is being developed by W3C top brass, I would assume that they aren't interested in stewarding a new revision of ActivityPub. And why would they be? SOLID is essentially a semantic web retread of ActivityPub, which gives the W3C top brass exactly what they wanted in the first place. + +In some ways, I argue that W3C's perceived disinterest in Social Web technologies other than SOLID largely has to do with fediverse projects having a very luke warm response to JSON-LD and LDS. + +The good news is that there have been some initial conversations between a few projects on what a working group to build the next generation of ActivityPub would look like, how it would be managed, and how it would be funded. We will be having more of these conversations over the next few months. + +## ActivityPub: the present state + +In the first blog post, I went into [a little detail about the present state of ActivityPub](https://blog.dereferenced.org/activitypub-the-worse-is-better-approach-to-federated-social-networking). But is it really as bad as I said? 
+ +I am going to break down a few examples of faults in the protocol and talk about their current state as well as what we are doing for short-term mitigations and where we are doing them. + +### Ambiguous addressing: is it a DM or just a post directly addressed to a circle of friends? + +As Osada and Hubzilla started to get attention, Mastodon and Pleroma users started to see weird behavior in their notifications and timelines: messages from people they didn't necessarily follow which got directly addressed to the user. These are messages sent to a group of selected friends, but can otherwise be forwarded (boosted/repeated/announced) to other audiences. + +In other words, they do not have the same _semantic_ meaning as a DM. But due to the way they were addressed, Mastodon and Pleroma saw them as a DM. + +Mastodon fixed this issue in 2.6 by adding heuristics: if a message has recipients in both the `to` and `cc` fields, then it's a public message that is addressed to a group of recipients, and not a DM. Unfortunately, Mastodon treats it similarly to a followers-only post and does not infer the correct rights. + +Meanwhile, Pleroma and Friendica came up with the idea to add a semantic hint to the message with the `litepub:directMessage` field. If this is set to true, it should be considered as a direct message. If the field is set to false, then it should be considered a group message. If the field is unset, then heuristics are used to determine the message type. + +Pleroma has a branch in progress which adds both support for the `litepub:directMessage` field as well as the heuristics. It should be landing shortly (it needs a rebase and I need to fix up some of the heuristics). + +So overall, the issue is reasonably mitigated at this point. + +### Fake direction attacks + +Several months ago, [Puckipedia](https://puckipedia.com/) did some fake direction testing against mainstream ActivityPub implementations. Fake direction attacks are especially problematic because they allow spoofing to happen. + +She found vulnerabilities in Mastodon, Pleroma and PixelFed, as well as [recently a couple of other fediverse software](https://puckipedia.com/mn1n-7nny). + +The vulnerabilities she reported in Mastodon, Pleroma and PixelFed have been fixed, but the class of vulnerability as she observes keeps appearing. + +In part, we can mitigate this by writing excellent security documentation and referring people to read it. This is something that I hope the LitePub group can do in the future. + +But for now, I would say this issue is not fully mitigated. + +### Leakage caused by Mastodon's followers-only scope + +Software which is directly compatible with the Mastodon followers-only scope have a few problems, I am grouping them together here: + +- New followers can see content that was posted before they were authorized to view any followers-only content +- Replies to followers-only posts are addressed to their _own_ followers instead of the followers collection of the OP at the time the post was created (which creates metadata leaks about the OP) +- Software which does not support the followers-only scope can dereference the OP's followers collection in any way they wish, including interpreting it as `as:Public` (this is explicitly allowed by the ActivityStreams 2.0 specification, you can't even make this up) + +Mitigation of this is actually incredibly easy, which makes me question why Mastodon didn't do it to begin with: simply expand the followers collection when preparing to send the message outbound. 
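To illustrate the idea, here is a small sketch of what expansion at delivery time looks like; this is not Pleroma's actual code, and the actor URLs are made up. The followers-collection URI in the addressing fields is replaced with the concrete follower IDs that exist when the post is created:

```python
def expand_followers(activity, followers_uri, follower_ids):
    """Replace a followers-collection URI with the current list of followers."""
    for field in ("to", "cc"):
        recipients = activity.get(field, [])
        if followers_uri in recipients:
            recipients = [r for r in recipients if r != followers_uri]
            recipients.extend(follower_ids)
            activity[field] = recipients
    return activity

activity = {
    "type": "Create",
    "actor": "https://example.social/users/alice",
    "to": ["https://example.social/users/alice/followers"],
    "cc": [],
}

expand_followers(
    activity,
    "https://example.social/users/alice/followers",
    ["https://example.social/users/bob", "https://remote.example/users/carol"],
)
```

Because the recipients are pinned to the state of the collection at posting time, a new follower cannot retroactively see the post, and a receiving server no longer has to dereference (or guess at) the collection.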
An implementation of this will be landing in Pleroma soon to harden the followers-only scope, as well as to fix followers-only threads to be more usable.

Implementation of this mitigation also brings followers-only threads to Friendica and Hubzilla in a safe and compatible way: all fediverse software will be able to properly interact with the threads.

### The “don't @ me” problem

> Some of this interpretation about Zot may be slightly wrong; it is based on reading the specification for Zot and Zot 6.

Other federated protocols such as DFRN, Zot and Zot 6 provide a rich framework for defining what interactions are allowed with a given message. ActivityPub doesn't.

DFRN provides UI hints on each object that hint at what may be done with the object, but uses a capabilities system under the hood. Capability enforcement is done by the “feed producer,” which either accepts your request or denies it. If you comment on a post in DFRN, it is the responsibility of the parent “feed producer” to forward your post onward through the network.

Zot uses a similar capabilities system but provides a magic signature in response to consuming the capability, which you then forward as proof of acceptance. Zot 6 uses a similar authentication scheme, except using OpenWebAuth instead of the original Zot authentication scheme.

For ActivityPub, my proposal is to use a system of capability URIs and proof objects that are cross-checked by the receiving server. In terms of the proof objects themselves, cryptographic signatures are not a component of this proof system; it is strictly capability-based. Cryptographic verification could be provided by leveraging HTTP Signatures to sign the response, if desired. I am still working out the details on how precisely this will work, and that will probably be what the next blog post is about.

As a datapoint: in Pleroma, we already use this cross-checking technique to verify objects which have been forwarded to us due to ActivityPub §7.1.2. This allows us to avoid JSON-LD and LDS signatures, and it is the recommended way to verify forwarded objects in LitePub implementations.

### Unauthenticated object fetching

Right now, due to the nature of ActivityPub and the design motivations behind it, fetching public objects is entirely unauthenticated.

This has led to a few incidents where fediverse users have gotten upset over their posts still arriving at servers they have blocked, since they naturally expect that posts won't arrive at servers they have blocked.

Mastodon has implemented an extension for post fetching where fetching private posts is authenticated using the HTTP Signature of the user who is fetching the post. This is a possible way of solving the authentication problem: instances can be identified based on which actor signed the request.

However, I don't think that fetching private posts in this way is a good idea (those fetches should simply always fail), and I wouldn't recommend it. With that said, a more generalized approach based on using HTTP Signatures to fetch public posts could be workable.

But I do not think the AP server should use a random user's key to sign the requests: instead, there should be an AP actor which explicitly represents the whole instance, and the instance actor's key should be used to sign the fetch requests. That way, information about individual users isn't leaked, and signatures aren't created without the express consent of a random instance user.
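As a rough sketch of what that could look like (the URLs and layout here are hypothetical, not a specification), the instance-level actor is just an ordinary ActivityPub actor whose key is used for signing fetches:

```python
# Hypothetical instance-level "fetch" actor; a remote server resolves the
# keyId of a signed fetch back to this object and learns which instance
# (rather than which user) is asking.
instance_actor = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.social/internal/fetch",
    "type": "Application",
    "inbox": "https://example.social/internal/fetch/inbox",
    "publicKey": {
        "id": "https://example.social/internal/fetch#main-key",
        "owner": "https://example.social/internal/fetch",
        "publicKeyPem": "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----",
    },
}
```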
+ +Once object fetches are properly authenticated in a way that instances are identifiable, then objects can be selectively disclosed. This also hardens object fetching via third parties such as crawlers. + +## Conclusion + +In this particular blog entry, I discussed why ActivityPub is still the hero we need despite being designed with the 'worse is better' philosophy, as well as discussed some early plans for cross-project collaboration on a next generation ActivityPub-based protocol, and discussed a few of the common problem areas with ActivityPub and how we can mitigate them in the future. + +And with that, despite the present issues we face with ActivityPub, I will end this by borrowing a common saying from the cryptocurrency community: the future is bright, the future is decentralized. diff --git a/content/blog/activitypub-the-worse-is-better-approach-to-federated-social-networking.md b/content/blog/activitypub-the-worse-is-better-approach-to-federated-social-networking.md new file mode 100644 index 0000000..2b92c17 --- /dev/null +++ b/content/blog/activitypub-the-worse-is-better-approach-to-federated-social-networking.md @@ -0,0 +1,54 @@ +--- +title: "ActivityPub: The “Worse Is Better” Approach to Federated Social Networking" +date: "2019-01-07" +--- + +> This is the first article in a series that will be a fairly critical review of ActivityPub from a trust & safety perspective. Stay tuned for more. + +In the modern day, myself and many other developers working on libre software have been exposed to a protocol design philosophy that emphasizes safety and correctness. That philosophy can be summarized with these goals: + +- Simplicity: the protocol must be simple to implement. It is more important for the protocol to be simple than the backend implementation. +- Correctness: the protocol must be verifiably correct. Incorrect behavior is simply not allowed. +- Safety: the protocol must be designed in a way that is safe. Behavior and functionality which risks safety is considered incorrect. +- Completeness: the protocol must cover as many situations as is practical. All reasonably expected cases must be covered. Simplicity is not a valid excuse to reduce completeness. + +Most people would correctly refer to these as good characteristics and overall the right way to approach designing protocols, especially in a federated and social setting. In many ways, the [Diaspora protocol](https://diaspora.github.io/diaspora_federation/) could be considered as an example of this philosophy of design. + +The “worse is better” approach to protocol design is only slightly different: + +- Simplicity: the protocol must be simple to implement. It is important for the backend implementation to be equally simple as the protocol itself. Simplicity of both implementation and protocol are the most important considerations in the design. +- Correctness: the protocol must be correct when tested against reasonably expected cases. It is more important to be simple than correct. Inconsistencies between real implementations and theoretical implementations are acceptable. +- Safety: the protocol must be safe when tested against basic use cases. It is more important to be simple than safe. +- Completeness: the protocol must cover reasonably expected cases. It is more important for the protocol to be simple than complete. Under-specification is acceptable when it improves the simplicity of the protocol. 
[OStatus](https://indieweb.org/OStatus) and [ActivityPub](https://www.w3.org/tr/activitypub) are examples of the "worse is better" approach to protocol design. I have intentionally portrayed this design approach in a way that attempts to convince you that it is a really bad approach.

However, I do believe that this approach, even though it is a considerably worse approach to protocol design, one which creates technologies that people simply cannot trust or have confidence in their safety while using, has better survival characteristics.

To understand why, we have to look at both what the expected security features of federated social networks are, and what people mostly use social networks for.

When you ask people what security features they expect of a federated social networking service such as Mastodon or Pleroma, they usually reply with a list like this:

- I should be able to interact with my friends.
- The messages I share only with my friends should be handled in a secure manner. I should be able to depend on the software to not compromise my private posts.
- Blocking should work reasonably well: if I block someone, they should disappear from my experience.

These requirements sound reasonable, right? And of course, ActivityPub mostly gets the job done. After all, the main use of social media is shitposting, posting selfies and sharing pictures of your dog. But would these users be better served by a different protocol? Absolutely.

See, the thing is, ActivityPub is like a virus. The protocol is simple enough to implement that people can actually do it. And they are, aren't they? There are over 40 applications presently in development that use ActivityPub as the basis of their networking stack.

Why is this? Because, _despite_ the design flaws in ActivityPub, it is generally _good enough_: you can interact with your friends, and in compliant implementations, addressing ensures that nobody except those you explicitly authorize will read your messages.

But it's not good enough: [for example, people have expressed that they want others to be able to read messages, but not reply to them](https://github.com/tootsuite/mastodon/issues/8565).

Had ActivityPub been a capability-based system instead of a signature-based system, this would never have been a concern to begin with: replies to the message would have gone to a special capability URI and then been accepted or rejected.

There are similar problems with things like the Mastodon "followers-only" posts and general concerns like direct messaging: these types of messages imply specific policy, but there is no mechanism in ActivityPub to convey these semantics. (This is in part solved by the LitePub `litepub:directMessage` flag, but that's a kludge, to be honest.)

I've also mentioned before that a large number of instances where there has been discourse about Mastodon versus Pleroma have actually been caused by complete design failures of ActivityPub.

An example of this is instances you've banned still being able to see threads from your instance: what happens is that somebody from a third instance interacts with the thread, and then the software (either Mastodon or Pleroma) reconstructs the entire thread. Since there is no authentication requirement to retrieve a thread, these blocked instances can successfully reconstruct the threads they weren't allowed to receive in the first place.
The only difference between Mastodon and Pleroma here is that Pleroma allows the general public to view the shared timelines without using a third party tool, which exposes the leaks caused by ActivityPub's bad design. + +In an ideal world, the number of ActivityPub implementations would be zero. But of course this is not an ideal world, so that leaves us with the question: “where do we go from _here_?” + +And honestly, I don't know how to answer that yet. Maybe we can save ActivityPub by extending it to be properly capability-based and eventually dropping support for the ActivityPub of today. But this will require coordination between all the vendors. And with 40+ projects out there, it's not going to be easy. And do we even care about those 40+ projects anyway? diff --git a/content/blog/actually-bsd-kqueue-is-a-mountain-of-technical-debt.md b/content/blog/actually-bsd-kqueue-is-a-mountain-of-technical-debt.md new file mode 100644 index 0000000..c10faf3 --- /dev/null +++ b/content/blog/actually-bsd-kqueue-is-a-mountain-of-technical-debt.md @@ -0,0 +1,72 @@ +--- +title: "actually, BSD kqueue is a mountain of technical debt" +date: "2021-06-06" +--- + +A side effect of [the whole freenode kerfluffle](https://ariadne.space/2021/05/20/the-whole-freenode-kerfluffle/) is that I've been looking at IRCD again.  IRC, is of course a very weird and interesting place, and the smaller community of people who run IRCDs are largely weirder and even more interesting. + +However, in that community of IRCD administrators there happens to be a few incorrect systems programming opinions that have been cargo culted around for years.  This particular blog is about one of these bikesheds, namely the _kqueue vs epoll debate_. + +You've probably heard it before.  It goes something like this, _"BSD is better for networking, because it has kqueue.  Linux has nothing like kqueue, epoll doesn't come close."_  While I agree that epoll doesn't come close, I think that's actually a feature that has lead to a much more flexible and composable design. + +## In the beginning... + +Originally, IRCD like most daemons used `select` for polling sockets for readiness, as this was the first polling API available on systems with BSD sockets.  The `select` syscall works by taking a set of three bitmaps, with each bit describing a file descriptor number: bit 1 refers to file descriptor 1 and so on.  The bitmaps are the `read_set`, `write_set` and `err_set`, which map to sockets that can be read, written to or have errors accordingly.  Due to design defects with the `select` syscalls, it can only support up to `FD_SETSIZE` file descriptors on most systems.  This can be mitigated by making `fd_set` an arbitrarily large bitmap and depending on `fdmax` to be the upper bound, which is what WinSock has traditionally done on Windows. + +The `select` syscall clearly had some design deficits that negatively affected scalability, so AT&T introduced the `poll` syscall in System V UNIX.  The `poll` syscall takes an array of `struct pollfd` of user-specified length, and updates a bitmap of flags in each `struct pollfd` entry with the current status of each socket.  Then you iterate over the `struct pollfd` list.  This is naturally a lot more efficient than `select`, where you have to iterate over all file descriptors up to `fdmax` and test for membership in each of the three bitmaps to ascertain each socket's status. 
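Python's standard `select` module wraps both interfaces, so the difference in calling convention can be shown in a few lines (a sketch for illustration only; an IRCD would of course be doing this in C):

```python
import select
import socket

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()

# select(): the full readable/writable/error sets are passed in on every call,
# and the kernel hands back the subsets that are ready.
readable, writable, errored = select.select([listener], [], [], 1.0)

# poll(): interest is registered once, and each call returns (fd, eventmask)
# pairs for only the descriptors that have events.
poller = select.poll()
poller.register(listener.fileno(), select.POLLIN)
for fd, mask in poller.poll(1000):  # timeout in milliseconds
    if mask & select.POLLIN:
        print("fd", fd, "is readable")
```

The structural difference is the same one that matters at the C level: `select` makes you restate (and rescan) the whole descriptor set on every call, while `poll` hands back only the descriptors that actually have events.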
+ +It can be argued that `select` was bounded by `FD_SETSIZE` (which is usually 1024 sockets), while `poll` begins to have serious scalability issues at around `10240` sockets.  These arbitrary benchmarks have been referred to as the C1K and C10K problems accordingly.  Dan Kegel has a [very lengthy post on his website](http://www.kegel.com/c10k.html) about his experiences mitigating the C10K problem in the context of running an FTP site. + +## Then there was kqueue... + +In July 2000, Jonathan Lemon introduced kqueue into FreeBSD, which quickly propagated into the other BSD forks as well.  kqueue is a kernel-assisted event notification system using two syscalls: `kqueue` and `kevent`.  The `kqueue` syscall creates a handle in the kernel represented as a file descriptor, which a developer uses with `kevent` to add and remove _event filters_.  Event filters can match against file descriptors, processes, filesystem paths, timers, and so on. + +This design allows for a single-threaded server to process hundreds of thousands of connections at once, because it can register all of the sockets it wishes to monitor with the kernel and then lazily iterate over the sockets as they have events. + +Most IRCDs have supported `kqueue` for the past 15 to 20 years. + +## And then epoll... + +In October 2002, Davide Libenzi got [his `epoll` patch](http://www.xmailserver.org/linux-patches/nio-improve.html) merged into Linux 2.5.44.  Like with kqueue, you use the `epoll_create` syscall to create a kernel handle which represents the set of descriptors to monitor.  You use the `epoll_ctl` syscall to add or remove descriptors from that set.  And finally, you use `epoll_wait` to wait for kernel events. + +In general, the scalability aspects are the same to the application programmer: you have your sockets, you use `epoll_ctl` to add them to the kernel's `epoll` handle, and then you wait for events, just like you would with `kevent`. + +Like `kqueue`, most IRCDs have supported `epoll` for the past 15 years. + +## What is a file descriptor, anyway? + +To understand the argument I am about to make, we need to talk about _file descriptors_.  UNIX uses the term _file descriptor_ a lot, even when referring to things which are clearly _not_ files, like network sockets.  Outside the UNIX world, a file descriptor is usually referred to as a _kernel handle_.  Indeed, in Windows, kernel-managed resources are given the `HANDLE` type, which makes this relationship more clear.  Essentially, a kernel handle is basically an opaque reference to an object in kernel space, and the astute reader may notice some similarities to the [object-capability model](https://en.wikipedia.org/wiki/Object-capability_model) as a result. + +Now that we understand that file descriptors are actually just kernel handles, we can now talk about `kqueue` and `epoll`, and why `epoll` is actually the correct design. + +## The problem with event filters + +The key difference between `epoll` and `kqueue` is that `kqueue` operates on the notion of _event filters_ instead of _kernel handles_.  This means that any time you want `kqueue` to do something new, you have to add a new type of _event filter_. + +[FreeBSD presently has 10 different event filter types](https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2): `EVFILT_READ`, `EVFILT_WRITE`, `EVFILT_EMPTY`, `EVFILT_AIO`, `EVFILT_VNODE`, `EVFILT_PROC`, `EVFILT_PROCDESC`, `EVFILT_SIGNAL`, `EVFILT_TIMER` and `EVFILT_USER`.  Darwin has additional event filters concerning monitoring Mach ports. 
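For illustration, registering one of these filters pairs a target with a filter type; in Python, which exposes kqueue on the BSDs and macOS, it looks roughly like this (a sketch, not production code):

```python
import select
import socket

sock = socket.socket()
sock.bind(("127.0.0.1", 0))
sock.listen()

kq = select.kqueue()
# "tell me when this descriptor becomes readable"
change = select.kevent(
    sock.fileno(),
    filter=select.KQ_FILTER_READ,
    flags=select.KQ_EV_ADD | select.KQ_EV_ENABLE,
)
kq.control([change], 0)            # register the filter, fetch no events
events = kq.control(None, 8, 1.0)  # wait up to 1s for at most 8 events
```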
+ +Other than `EVFILT_READ`, `EVFILT_WRITE` and `EVFILT_EMPTY`, all of these different event filter types are related to entirely different concerns in the kernel: they don't monitor kernel handles, but instead other specific subsystems than sockets. + +This makes for a powerful API, but one which lacks [composability](https://en.wikipedia.org/wiki/Composability). + +## `epoll` is better because it is composable + +It is possible to do almost everything that `kqueue` can do on FreeBSD in Linux, but instead of having a single monolithic syscall to handle _everything_, Linux takes the approach of providing syscalls which allow almost anything to be represented as a _kernel handle_. + +Since `epoll` strictly monitors _kernel handles_, you can register _any_ kernel handle you have with it and get events back when its state changes.  As a comparison to Windows, this basically means that `epoll` is a kernel-accelerated form of `WaitForMultipleObjects` in the Win32 API. + +You are probably wondering how this works, so here's a table of commonly used `kqueue` event filters and the Linux syscall used to get a kernel handle for use with `epoll`. + +| BSD event filter | Linux equivalent | +| --- | --- | +| `EVFILT_READ`, `EVFILT_WRITE`, `EVFILT_EMPTY` | Pass the socket with `EPOLLIN` etc. | +| `EVFILT_VNODE` | `inotify` | +| `EVFILT_SIGNAL` | `signalfd` | +| `EVFILT_TIMER` | `timerfd` | +| `EVFILT_USER` | `eventfd` | +| `EVFILT_PROC`, `EVFILT_PROCDESC` | `pidfd`, alternatively bind processes to a `cgroup` and monitor `cgroup.events` | +| `EVFILT_AIO` | `aiocb.aio_fildes` (treat as socket) | + +Hopefully, as you can see, `epoll` can automatically monitor _any_ kind of kernel resource without having to be modified, due to its composable design, which makes it superior to `kqueue` from the perspective of having less technical debt. + +Interestingly, [FreeBSD has added support for Linux's `eventfd` recently](https://www.freebsd.org/cgi/man.cgi?query=eventfd&apropos=0&sektion=2&manpath=FreeBSD+13.0-RELEASE+and+Ports&arch=default&format=html), so it appears that they may take `kqueue` in this direction as well.  Between that and FreeBSD's [process descriptors](https://www.freebsd.org/cgi/man.cgi?query=procdesc&sektion=4&apropos=0&manpath=FreeBSD+13.0-RELEASE+and+Ports), it seems likely. diff --git a/content/blog/alpineconf-2021-recap.md b/content/blog/alpineconf-2021-recap.md new file mode 100644 index 0000000..51c645d --- /dev/null +++ b/content/blog/alpineconf-2021-recap.md @@ -0,0 +1,144 @@ +--- +title: "AlpineConf 2021 recap" +date: "2021-05-18" +--- + +Last weekend was AlpineConf, the first one ever.  We held it as a virtual event, and over 700 participants came and went during the weekend.  Although there were many things we learned up to and during the conference that could be improved, I think that the first AlpineConf was a great success!  If you're interested in rewatching the event, both days have mostly full recordings on the [Alpine website](https://alpinelinux.org/conf). + +## What worked + +We held the conference on a [BigBlueButton](https://bigbluebutton.org) instance I set up and used the [Alpine Gitlab for organizing](https://gitlab.alpinelinux.org/alpine/alpineconf-cfp).  BigBlueButton scaled well, even when we had nearly 100 active participants, the server performed quite well.  Similarly, using issue tracking in Gitlab helped us to keep the CFP process simple.  I think in general, we will keep this setup for future events, as it worked quite well. 
+ +## What didn't work so well + +A major problem with BigBlueButton was attaching conference talks from YouTube.  This caused problems with several privacy extensions which blocked the YouTube player from running.  Also, the YouTube video playback segments are missing from the recordings.  I'm going to investigate alternative options for this which should hopefully help with making the recorded talks play back correctly next time. + +Maybe if a BigBlueButton developer sees this, they can work to improve the YouTube viewing feature as well so that it works on the recording playback.  That would be a really nice feature to have. + +Other than that, we only had one scheduling SNAFU, and that was basically my fault -- I didn't confirm the timeslot I scheduled the cloud team talk in, and so naturally, the cloud team was largely asleep because they were in US/Pacific time. + +Overall though, I think things went well and many people said they enjoyed the conference.  Next year, as we will have some experience to draw from, things will be even better, hopefully. + +## The talks on day 1... + +The first day was very exciting with a lot of talks and [blahaj representation](https://www.ikea.com/us/en/p/blahaj-soft-toy-shark-90373590/).  The talks mostly focused around user stories about Alpine.  We learned about where and how Alpine was being used... from phones, to data centers, to windmills, to the science community.  Here is the list of talks on the first day and my thoughts! + +#### The Beauty of Simplicity, by Cameron Seid (@deltaryz) + +This was the first talk of the conference and largely focused on how Cameron managed his Alpine server.  It was a good starting talk for the conference, I think, because it showed how people use Alpine at home in their personal infrastructure.  The talk was prerecorded and Cameron spent a lot of time on editing to make it look flashy. + +#### pmbootstrap: The Swiss Army Knife of postmarketOS development, by Oliver Smith (@ollieparanoid) + +postmarketOS is a distribution of Alpine for phones and other embedded devices. + +In this talk, Oliver went into `pmbootstrap`, a tool which helps to automate many of the tasks of building postmarketOS images and packages.  About halfway through the talk, a user joined who I needed to make moderator, but I clicked the wrong button and made them presenter instead.  Thankfully, Oliver was a good sport about it and we were able to fix the video playback quickly.  I learned a lot about how `pmbootstrap` can be used for any sort of embedded project, and that opens up a lot of possibilities for collaborating with the pmOS team in other embedded applications involving Alpine. + +#### Using Alpine Linux in DataCenterLight, by Nico Schottelius (@telmich) + +In this talk, Nico walks us through how Alpine powers many devices in his data center project called DataCenterLight.  He is using Alpine in his routing infrastructure with 10 gigabit links!  The talk went over everything from routing all the way down to individual customer services, and briefly compared Alpine to Debian and Devuan from both a user and development point of view. + +#### aports-qa-bot: automating aports, by Rasmus Thomsen (@Cogitri) + +Rasmus talked about the `aports-qa-bot` he wrote which helps maintainers and the mentoring team review merge requests from contributors.  He went into some detail about the modular design of the bot and how it can be easily extended for other teams and also Alpine derivatives.  
The postmarketOS team asked about deploying it for their downstream `pmaports` repo, so you'll probably be seeing the bot there soon. + +#### apk-polkit-rs: Using APK without the CLI, by Rasmus Thomsen (@Cogitri) + +Rasmus had the next slot as well, where he talked about his `apk-polkit-rs` project which provides a DBus service that can be called for installing and upgrading packages using apk.  He also talked about the rust crate he is working on to wrap the apk-tools 3 API.  Overall, the future looks very interesting for working with apk-tools from rust! + +#### Alpine building infrastructure update, by Natanael Copa (@ncopa) + +Next, Natanael gave a bubble talk about the Alpine building infrastructure.  For me this was largely a trip down memory lane, as I witnessed the build infrastructure evolve first hand.  He talked about how the first generation build infrastructure was a series of IRC bots which reacted to IRC messages in order to trigger a new build, and how the IRC infrastructure evolved from IRC to ZeroMQ to MQTT. + +He then showed how the builders work, using a live builder as an example, walking through the design and implementation of the build scripts.  Finally, he proposed some ideas for building a more robust system that allowed for parallelizing the build process where possible. + +#### postmarketOS demo, by Martijn Braam (@MartijnBraam) + +Martijn showed us postmarketOS in action on several different phones.  Did I mention he has a lot of phones?  I asked in the Q&A afterwards and he said he had like 6 pinephones and somewhere around 60 other phones. + +I have to admire the dedication to reverse engineering phones that would lead to somebody acquiring 60+ phones to tinker with. + +#### Sxmo: Simple X Mobile - A minimalist environment for Linux smartphones, by Maarten van Gompel (@proycon) + +Maarten van Gompel, Anjandev Momi and Miles Alan gave a talk about and demonstration of Sxmo, their lightweight phone environment based on dwm, dmenu and a bunch of other tools as plumbing. + +The UI reminds me a lot of palmOS.  I suspect if palmOS were still alive and kicking today, it would look like Sxmo.  **Phone calls and text messages are routed through shell scripts**, a feature I didn't know I needed until I saw it in action.  Sxmo probably is _the_ killer app for running an actual Linux distribution on your phone. + +This UI is absolutely _begging_ for jog-wheels to come back, and I for one hope they do. + +#### Alpine and the larger musl ecosystem (a roundtable discussion) + +This got off to a rocky start because I don't know how to organize stuff like this.  I should have found somebody else to run the discussion, but it was really fruitful.  We came to the conclusion that we needed to work more closely together in the musl distribution ecosystem to proactively deal with issues like misinformed upstreams and so on, so that we do not have another Rust-like situation again.  That lead to the formation of `#musl-distros` on freenode to coordinate on these issues. + +#### Taking Alpine to the Edge and Beyond With Linux Foundation's Project EVE, by Roman Shaposhnik (@rvs) + +Roman talked about Project EVE, an edge computing solution being developed under the auspices of the LF Edge working group at Linux Foundation.  EVE (Edge Virtualization Engine) is a distribution of Alpine built with Docker's LinuxKit, which has multiple Alpine-based containers working together in order to provide an edge computing solution. 
+ +He talked about how the cloud has eroded software freedom (after all, you can't depend on free-as-in-freedom computing when it's on hardware you don't own) by encouraging users to trade it for convenience, and how edge computing brings that same convenience in-house, thus solving the software freedom issue. + +Afterward, he demonstrated how EVE is deployed on windmills to analyze audio recordings from the windmill to determine their health.  All of that, including the customer application, is running on Alpine. + +He concluded the talk with a brief update on the `riscv64` port.  It looks like we are well on the way to having the port in Alpine 3.15. + +#### BinaryBuilder.jl: The Subtle Art of Binaries that "Just Work", by Elliot Saba and Mosè Giordano + +Elliot and Mosè talked about BinaryBuilder, which they use to cross-compile software for all platforms supported by the Julia programming language.  They do this by building the software in an Alpine-based environment under Linux namespaces or Docker (on mac). + +Amongst other things, they have a series of wrapper scripts around programs like `uname` which allow them to emulate the userspace commands of the target operating system, which helps convince badly written autoconf scripts to cooperate. + +All in all, it was a fascinating talk! + +## The talks on day 2... + +The talks on day 2 were primarily about the technical plumbing of Alpine. + +#### Future of Alpine Linux community chats (a roundtable discussion) + +We talked about the [current situation on freenode](https://news.ycombinator.com/item?id=27153338).  The conclusion we came to regarding that was to support the freenode staff in their efforts to find a solution until the end of the month, at which point we would evaluate the situation again. + +This lead to a discussion about enhancing the IRC experience for new contributors, and the possibility of just setting up an internal IRC server for the project to use, as well as working with Element to set up a hosted Matrix server alternative. + +We also talked for the first time about the Alpine communities which are growing on non-free services such as Discord.  Laurent observed that there is value in meeting users where they already are for outreach purposes, and also pointed out that the nature of proprietary IRC networks imposes a software freedom issue that doesn't exist with self-hosting our own.  Most people agreed with these points, so we concluded that we would figure out plans to start integrating these unofficial communities into Alpine properly. + +#### Security tracker demo and security team Q&A + +This was kind of a bubble talk.  I gave a demo of the new security.alpinelinux.org tracker, as well as an overview of how the current CVE system works with the NVD and CIRCL feeds and so on.  We then talked a bit about how the CVE system could be improved by the Linked Data proposal I am working on, which will be published shortly. + +Afterwards, we talked about initiatives like bringing `clang`'s Control Flow Integrity into Alpine and a bunch of other topics about security in Alpine.  It was a fun talk and we covered a lot of topics.  It went for an hour and a half, as a talk was cancelled in the 15:00 slot. + +#### Alpine s390x port discussion, by me + +After the security talk, I talked a bit about running Alpine on mainframes, how they work, and why people still want to use them in 2021.  In the Q&A we talked about big vs little endian and why people aren't mining Monero on mainframes. 
+ +#### Simplified networking configuration with ifupdown-ng, by me + +This was an expanded talk about ifupdown-ng loosely based on the one Max gave at VirtualNOG last year.  I adapted his talk, replacing Debian-specific content with Alpine content and talked a bit about NSL (RIP).  The talk seemed to go well, in the Q&A we talked primarily about SR-IOV, which is not yet supported by ifupdown-ng. + +#### Declarative networking configuration with ifstate, by Thomas Liske (@liske) + +After the ifupdown-ng talk, Thomas talked about and demonstrated his `ifstate` project, which is available as an alternative to `ifupdown` in Alpine.  Unlike ifupdown-ng which takes a hybrid approach, and ifupdown which takes an imperative approach, ifstate is a fully declarative implementation.  The YAML syntax is quite interesting.  I think ifstate will be quite popular for Alpine users requiring fully declarative configuration. + +#### AlpineConf 2.0 planning discussion + +After the networking track, we talked about AlpineConf next year.  The conclusion was that AlpineConf has most value being a virtual event, and that if we want to have a physical event there's events like FOSDEM out there which we can use for that. + +#### Alpine cloud team talk and Q&A + +This wound up being a bit of a bubble talk because I failed to actually confirm whether anyone from the cloud team could give a talk at this time.  Nonetheless the talk was a huge success.  We talked about Alpine in the cloud and how to build on it. + +#### systemd: the good parts, by Christine Dodrill (@Xe) + +Christine gave a talk about systemd's feature set that she would like to see implemented in Alpine somehow.  In the chat, Laurent provided some commentary... + +It was a fun talk that was at least somewhat amusing. + +#### Governance event + +Finally to close out the conference, Natanael talked about Alpine governance.  In this event, he announced the dissolution of the Alpine Core Team and implementation of the Alpine Council instead.  The Alpine Council will be initially managed by Natanael Copa, Carlo Landmeter and Kevin Daudt in the interim.  This group will handle the administrative responsibilities of the project, while a technical steering committee will handle the technical planning for the project.  This arrangement is likely familiar to anyone who has used Fedora, I think it makes sense to copy what works! + +Afterwards, we talked a little bit informally about everyone's thoughts on the conference. + +## In closing... + +Thanks to [Natanael Copa for proposing the idea of AlpineConf last year](https://lists.alpinelinux.org/~alpine/devel/%3C20200521160527.718c2d2c%40ncopa-desktop.copa.dup.pw%3E), Kevin Daudt for helping push the buttons and keeping things going (especially when my internet connection failed due to bad weather), all of the wonderful presenters who gave talks (many of which gave talks for their first time ever!) and everyone who dropped in to participate in the conference! + +We will be having a technically-oriented Alpine miniconf in November, and then AlpineConf 2022 next May!  Hopefully you will be at both.  Announcements will be forthcoming about both soon. 
diff --git a/content/blog/an-inside-look-into-the-illicit-ad-industry.md b/content/blog/an-inside-look-into-the-illicit-ad-industry.md new file mode 100644 index 0000000..f517cbe --- /dev/null +++ b/content/blog/an-inside-look-into-the-illicit-ad-industry.md @@ -0,0 +1,89 @@ +--- +title: "an inside look into the illicit ad industry" +date: "2021-11-04" +--- + +So, you want to work in ad tech, do you? Perhaps this will be a cautionary tale... + +I have worked my entire life as a contractor. This has had advantages and disadvantages. For example, I am free to set my own schedule, and undertake engagements at my own leisure, but as a result my tax situation is more complicated. Another advantage is that sometimes, you get involved in an engagement that is truly fascinating. This is the story of such an engagement. Some details have been slightly changed, and specific names are elided. + +A common theme amongst contractors in the technology industry is to band together to take on engagements which cannot be reasonably handled by a single contractor. Our story begins with such an engagement: a friend of mine ran a bespoke IT services company, which provided system administration, free software consulting and development. His company also handled the infrastructure deployment needs of customers who did not want to build their own infrastructure. I frequently worked with my friend on various consulting engagements over the years, including this one. + +One day, I was chilling in IRC, when I got a PM from my friend: he had gotten an inquiry from a possible client that needed help reverse engineering a piece of obfuscated JavaScript. I said something like "sounds like fun, send it over, and I'll see what I come up with." The script in question was called `popunder.js` and did exactly what you think it does. The customer in question had started a popunder ad network, and needed help adapting this obfuscated popunder script to work with his system, which he built using [a software called Revive Adserver](https://en.wikipedia.org/wiki/Revive_Adserver), a fork of the last GPL version of OpenX. + +I rolled my eyes and reverse engineered the script for him, allowing him to adapt it for his ad network. The adaptation was a success, and he wired me a sum that was triple my quoted hourly rate. This, admittedly, resulted in me being very curious about his business, as at the time, I was not used to making that kind of money. Actually, I'm still not. + +A few weeks passed, and he approached me with a proposition: he needed somebody who could reverse engineer the JavaScript programs delivered by ad networks and figure out how the scripts worked. As he was paying considerably more than my advertised hourly rate, I agreed, and got to work reverse engineering the JavaScript programs he required. It was nearly a full time job, as these programs kept evolving. + +In retrospect, he probably wasn't doing anything with the reports I wrote on each piece of JavaScript I reverse engineered, as that wasn't the actual point of the exercise: in reality, he wanted me to become familiar with the techniques ad networks used to detect fraud, so that we could develop countermeasures. In other words, the engagement evolved into a red-team type engagement, except that we weren't testing the ad networks for their sake, but instead ours. 
+ +## so-called "domain masking": an explanation + +Years ago, you might have browsed websites like The Pirate Bay and saw advertising for a popular game, or some sort of other advertisement that you wouldn't have expected to see on The Pirate Bay. I assure you, brands were not knowingly targeting users on TPB: they were being duped via a category of techniques called _domain masking_. + +This is a type of scam that black-hat ad networks do in order to launder illicit traffic into clean traffic: they will set up fake websites and apply for advertisements on those websites through a shell company. This gives them a clean advertising feed to serve ads from. The next step is to launder the traffic by serving those tags on empty pages on the website, so that you can use them with an ` + +LVis demonstration of complex waveforms + +At this point, I think it is at a point where others can start playing with it and contributing presets. The core language DSL is basically stable — I don't expect to change anything in a way that would cause breakage. So, please download it and send me your presets! diff --git a/content/blog/introducing-witchery-tools-for-building-distroless-images-with-alpine.md b/content/blog/introducing-witchery-tools-for-building-distroless-images-with-alpine.md new file mode 100644 index 0000000..efb710f --- /dev/null +++ b/content/blog/introducing-witchery-tools-for-building-distroless-images-with-alpine.md @@ -0,0 +1,71 @@ +--- +title: "introducing witchery: tools for building distroless images with alpine" +date: "2021-09-09" +coverImage: "Screen-Shot-2021-09-09-at-7.37.24-AM.png" +--- + +As I noted [in my last blog](https://ariadne.space/2021/09/07/bits-relating-to-alpine-security-initiatives-in-august/), I have been working on a set of tools which enable the building of so-called "distroless" images based on Alpine.  These tools have now evolved to a point where they are usable for testing in lab environments, thus I am happy to announce [the witchery project](https://github.com/kaniini/witchery). + +For the uninitiated, a "distroless" image is one which contains _only_ the application and its dependencies.  This has some desirable qualities: since the image is only the application and its immediate dependencies, there is less attack surface to worry about.  For example, a simple hello-world application built with witchery clocks in at 619kB, while that same hello-world application deployed on `alpine:3.14` clocks in at 5.6MB.  There are also drawbacks: a distroless image typically does not include a package manager, so there is generally no ability to add new packages to a distroless image. + +As for why it's called witchery: we are using Alpine's package manager in new ways to perform truly deep magic.  The basic idea behind witchery is that you use it to stuff your application into an `.apk` file, and then use `apk` to install _only_ that `.apk` and its dependencies into a rootfs: no `alpine-base`, no `apk-tools`, no `busybox` (though witchery allows you to install those things if you want them). + +## Deploying an an example application with witchery + +For those who want to see the source code without commentary, you [can find the `Dockerfile` for this example on the witchery GitHub repo](https://github.com/kaniini/witchery/blob/master/examples/hello-world/Dockerfile).  For everyone else, I am going to try to break down what each part is doing, so that you can hopefully understand how it all fits together. 
We will be looking at the `Dockerfile` in the `hello-world` example. + +The first thing the reader will likely notice is that Docker images built with witchery are done in three stages.  First, you build the application itself, then you use witchery to build what will become the final image, and finally, you copy that image over to a blank filesystem. + +FROM alpine:3.14 AS build +WORKDIR /root +COPY . . +RUN apk add --no-cache build-base && gcc -o hello-world hello-world.c + +The first stage to build the application is hopefully self explanatory, and is aptly named `build`.  We fetch the `alpine:3.14` image from Dockerhub, then install a compiler (`build-base`) and finally use `gcc` to build the application. + +The second stage has a few steps to it, that I will split up so that its easier to follow along. + +FROM kaniini/witchery:latest AS witchery + +First, we fetch the `kaniini/witchery:latest` image, and name it `witchery`.  This image contains `alpine-sdk`, which is needed to make packages, and the witchery tools which drive the `alpine-sdk` tools, such as `abuild`. + +RUN adduser -D builder && addgroup builder abuild +USER builder +WORKDIR /home/builder + +Anybody who is familiar with `abuild` will tell you that it cannot be used as root.  Accordingly, we create a user for running `abuild`, and add it to the `abuild` group.  We then tell Docker that we want to run commands as this new user, and do so from its home directory. + +COPY --from=build /root/hello-world . +RUN mkdir -p payloadfs/app && mv hello-world payloadfs/app/hello-world +RUN abuild-keygen -na && fakeroot witchery-buildapk -n payload payloadfs/ payloadout/ + +The next step is to package our application.  The first step in doing so involves copying the application from our `build` stage.  We ultimately want the application to wind up in `/app/hello-world`, so we make a directory for the package filesystem, then move the application into place.  Finally, we generate a signing key for the package, and then generate a signed `.apk` for the application named `payload`. + +At this point, we have a signed `.apk` package containing our application, but how do we actually build the image?  Well, just as we drove `abuild` with `witchery-buildapk` to build the `.apk` package and sign it, we will have `apk` build the image for us.  But first, we need to switch back to being root: + +USER root +WORKDIR /root + +Now that we are root again, we can generate the image.  But first, we need to add the signing key we generated in the earlier step to `apk`'s trusted keys.  To do that, we simply copy it from the builder user's home directory. + +RUN cp /home/builder/.abuild/\*.pub /etc/apk/keys + +And finally, we build the image.  Witchery contains a helper tool, `witchery-compose` that makes doing this with `apk` really easy. + +RUN witchery-compose -p ~builder/payloadout/payload\*.apk -k /etc/apk/keys -X http://dl-cdn.alpinelinux.org/alpine/v3.14/main /root/outimg/ + +In this case, we want `witchery-compose` to grab the application package from `~builder/payloadout/payload*.apk`.  We use a wildcard there because we don't know the full filename of the generated package.  There are options that can be passed to `witchery-buildapk` to allow you to control all parts of the `.apk` package's filename, so you don't necessarily have to do this.  We also want `witchery-compose` to use the system's trusted keys for validating signatures, and we want to pull dependencies from an Alpine mirror. 
+ +Once `witchery-compose` finishes, you will have a full image in `/root/outimg`.  The final step is to copy that to a new blank image. + +FROM scratch +CMD \["/app/hello-world"\] +COPY --from=witchery /root/outimg/ . + +And that's all there is to it! + +## Things left to do + +There are still a lot of things left to do.  For example, we might want to implement layers that users can build from when deploying their apps, like one containing `s6` for example.  We also don't have a great answer for applications written in things like Python yet, so far this only works well for programs that are compiled in the traditional sense. + +But its a starting point none the less.  I'll be writing more about witchery over the coming months as the tools evolve into something even more powerful.  This is only the beginning. diff --git a/content/blog/it-is-correct-to-refer-to-gnu-linux-as-gnu-linux.md b/content/blog/it-is-correct-to-refer-to-gnu-linux-as-gnu-linux.md new file mode 100644 index 0000000..bde36e2 --- /dev/null +++ b/content/blog/it-is-correct-to-refer-to-gnu-linux-as-gnu-linux.md @@ -0,0 +1,44 @@ +--- +title: "it is correct to refer to GNU/Linux as GNU/Linux" +date: "2022-03-30" +--- + +You've probably seen the "I'd like to interject for a moment" quotation that is frequently attributed to Richard Stallman about how Linux should be referred to as GNU/Linux. While I disagree with _that_ particular assertion, I do believe it is important to refer to GNU/Linux distributions as such, because GNU/Linux is a distinct operating system in the family of operating systems which use the Linux kernel, and it is technically correct to recognize this, especially as different Linux-based operating systems have different behavior, and different advantages and disadvantages. + +For example, besides GNU/Linux, there are the Alpine and OpenWrt ecosystems, and last but not least, Android. All of these operating systems exist outside the GNU/Linux space and have significant differences, both between GNU/Linux and also each other. + +## what is GNU/Linux? + +I believe part of the problem which leads people to be confused about the alternative Linux ecosystems is the lack of a cogent GNU/Linux definition, in part because many GNU/Linux distributions try to downplay that they are, in fact, GNU/Linux distributions. This may be for commercial or marketing reasons, or it may be because they do not wish to be seen as associated with the FSF. Because of this, others, who are fans of the work of the FSF, tend to overreach and claim other Linux ecosystems as being part of the GNU/Linux ecosystem, which is equally harmful. + +It is therefore important to provide a technically accurate definition of GNU/Linux that provides actual useful meaning to consumers, so that they can understand the differences between GNU/Linux-based operating systems and other Linux-based operating systems. To that end, I believe a reasonable definition of the GNU/Linux ecosystem to be distributions which: + +- use the GNU C Library (frequently referred to as glibc) +- use the GNU coreutils package for their base UNIX commands (such as `/bin/cat` and so on). + +From a technical perspective, an easy way to check if you are on a GNU/Linux system would be to attempt to run the `/lib/libc.so.6` command. If you are running on a GNU/Linux system, this will print the glibc version that is installed. 
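The same check can also be done from inside a program. Here is a rough Python sketch (my own illustration, not part of any particular tool), which relies on the fact that the `gnu_get_libc_version` symbol is only exported by glibc, so the lookup fails on musl and bionic systems:

```python
import ctypes

libc = ctypes.CDLL(None)  # handle to the C library already loaded into this process
try:
    get_version = libc.gnu_get_libc_version  # symbol only exists in glibc
    get_version.restype = ctypes.c_char_p
    print("GNU/Linux detected, glibc", get_version().decode())
except AttributeError:
    print("not a GNU/Linux system (musl, bionic, ...)")
```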
This technical definition of GNU/Linux also provides value, because some drivers and proprietary applications, such as the nVidia proprietary graphics driver, only support GNU/Linux systems. + +Given this rubric, we can easily test a few popular distributions and make some conclusions about their capabilities: + +- Debian-based Linux distributions, including Debian itself, and also Ubuntu and elementary, meet the above preconditions and are therefore GNU/Linux distributions. +- Fedora and the other distributions published by Red Hat also meet the same criterion to be defined as a GNU/Linux distribution. +- ArchLinux also meets the above criterion, and therefore is also a GNU/Linux distribution. Indeed, the preferred distribution of the FSF, Parabola, describes itself as GNU/Linux and is derived from Arch. +- Alpine does not use the GNU C library, and therefore is not a GNU/Linux distribution. Compatibility with GNU/Linux programs should not be assumed. More on that in a moment. +- Similarly, OpenWrt is not a GNU/Linux distribution. +- Android is also not a GNU/Linux, nor is Replicant, despite being sponsored by the FSF. + +## on compatibility between distros + +Even between GNU/Linux distributions, compatibility is difficult. Different GNU/Linux distributions upgrade their components at different times, and due to dynamic linking, this means that a program built against a specific set of components with a specific set of build configurations may or may not successfully run between GNU/Linux systems, but some amount of binary compatibility is otherwise possible as long as you take care to deal with that. + +On top of this, there is no binary compatibility between Linux ecosystems at large. GNU/Linux binaries require the gcompat compatibility framework to run on Alpine, and it generally is not possible to run OpenWrt binaries on Alpine or vice versa. The situation is the same with Android: without a compatibility tool (such as Termux), it is not possible to run binaries from other ecosystems there. + +Exacerbating the problem, developers also target specific APIs only available in their respective ecosystems: + +- systemd makes use of glibc-specific APIs, which are not part of POSIX +- Android makes use of bionic-specific APIs, which are not part of POSIX +- Alpine and OpenWrt both make use of internal frameworks, and these differ between the two ecosystems (although there are active efforts to converge both ecosystems). + +As a result, as a developer, it is important to note which ecosystems you are targeting, and it is important to refer to individual ecosystems, rather than saying "my program supports Linux." There are dozens of ecosystems which make use of the Linux kernel, and it is unlikely that a program supports all of them, or that the author is even aware of them. + +To conclude, it is both correct and important, to refer to GNU/Linux distributions as GNU/Linux distributions. Likewise, it is important to realize that non-GNU/Linux distributions exist, and are not necessarily compatible with the GNU/Linux ecosystem for your application. Each ecosystem is distinct, with its own strengths and weaknesses. 
diff --git a/content/blog/its-time-for-arm-to-embrace-traditional-hosting.md b/content/blog/its-time-for-arm-to-embrace-traditional-hosting.md new file mode 100644 index 0000000..76cbd2c --- /dev/null +++ b/content/blog/its-time-for-arm-to-embrace-traditional-hosting.md @@ -0,0 +1,72 @@ +--- +title: "It's time for ARM to embrace traditional hosting" +date: "2021-07-10" +--- + +ARM is everywhere these days -- from phones to hyperscale server deployments.  There is even an [ARM workstation available that has decent specs at an acceptable price](https://www.solid-run.com/arm-servers-networking-platforms/honeycomb-workstation/).  Amazon and Oracle tout white paper after white paper about how their customers have switched to ARM, gotten performance wins and saved money.  Sounds like everything is on the right track, yes?  Well, actually it's not. + +## ARM for the classes, x86 for the masses + +For various reasons, I've been informed that I need to start rethinking my server infrastructure arrangements.  We won't go into that here, but the recent swearing at San Francisco property developers on my Twitter is highly related. + +As I am highly allergic to using any infrastructure powered by x86 CPUs, due to the fact that [Intel and AMD both include firmware in the CPU which allow for computation to occur without my consent](https://en.wikipedia.org/wiki/Intel_Management_Engine) (also known as a backdoor) so that Hollywood can implement a largely pointless (especially on a server) digital restrictions management scheme, I decided to look at cloud-based hosting solutions using ARM CPUs, as that seemed perfectly reasonable at first glance. + +Unfortunately, what I found is that ARM hosting is not deployed in a way where individual users can access it at cost-competitive prices. + +## AWS Graviton (bespoke Neoverse CPUs) + +In late 2018, AWS announced the [Graviton CPU, which was based on a core design they got when they acquired Annapurna Labs](https://aws.amazon.com/blogs/aws/new-ec2-instances-a1-powered-by-arm-based-aws-graviton-processors/).  This was followed up in 2020 with [Graviton2, which is based on the ARM Neoverse N1 core design](https://www.anandtech.com/show/15578/cloud-clash-amazon-graviton2-arm-against-intel-and-amd).  These are decent chips, the performance is quite good, and costs are quite low. + +But, how much does it cost for an average person to actually make use of it?  We will assume that the 1 vCPU / 4GB RAM `m6g.medium` configuration is suitable for this comparison, as it is the most comparable to a modest x86 VPS. + +The `m6g.medium` instance does not come with any transfer, but the first GB is always free on EC2.  Further transfer is $0.09/GB up to 10TB.  By comparison, the Linode 4GB RAM plan comes with 4TB of transfer, so we will use that for our comparison. + +
| Item | Cost |
| --- | --- |
| Hourly price (m6g.medium) | $0.0385 |
| × 720 hours | $27.72 |
| + 3.999TB of transfer ($0.09 × 3999) | $359.90 |
| Total: | $387.62 |
+ +Transfer charges aside, the $27.72 monthly charge is quite competitive to Linode, clocking in at only $7.72 more expensive for comparable performance.  But the data transfer charges have the potential to make using Graviton on EC2 very costly. + +### What about AWS Lightsail? + +An astute reader might note that AWS actually _does_ provide traditional VPS hosting as a product, under its [Lightsail brand](https://aws.amazon.com/lightsail/).  But the Lightsail VPS product is x86-only for now. + +Amazon could make a huge impact in terms of driving ARM adoption in the hosting ecosystem by bringing Graviton to their Lightsail product.  Capturing Lightsail users into the Graviton ecosystem and then scaling them up to EC2 seems like a no-brainer sales strategy too.  But so far, they haven't implemented this. + +## Oracle Cloud Infrastructure + +A few months ago, [Oracle introduced instances based on Ampere's Altra CPUs](https://blogs.oracle.com/cloud-infrastructure/post/arm-based-cloud-computing-is-the-next-big-thing-introducing-arm-on-oracle-cloud-infrastructure), which are also based on the Neoverse N1 core. + +The base configuration (Oracle calls it a shape) is priced at $0.01/hourly, includes a single vCPU and 6GB of memory.  These instances do not come with any data transfer inclusive, but like AWS, data transfer is pooled.  A major difference between Oracle and AWS, however, is that the first 10TB of transfer is included gratis. + +
| Item | Cost |
| --- | --- |
| Hourly price | $0.01 |
| × 720 hours | $7.20 |
| + 4TB transfer (included gratis) | $0 |
| Total: | $7.20 |
+ +I really, _really_ wanted to find a reason to hate on Oracle here.  I mean, they are Oracle.  But I have to admit that Oracle's cloud product is a lot more similar to traditional VPS hosting than Amazon's EC2 offerings.  **Update:** Haha, nevermind!  They came up with a reason for me to hate on them when they [terminated my account for no reason](https://ariadne.space/2021/07/14/oracle-cloud-sucks/). + +So, we have one option for a paid ARM VPS, and that is only an option if you are willing to deal with Oracle, which are Oracle.  Did I mention they are Oracle? + +![Oracle federating its login service with itself](images/Screen-Shot-2021-07-09-at-9.14.41-PM-300x189.png) + +## Scaleway + +Tons of people told me that Scaleway had ARM VPS for a long time.  And indeed, they used to, but they don't anymore.  Back when they launched ARMv8 VPS on ThunderX servers, I actually used a Scaleway VPS to port `libucontext`. + +Unfortunately, they no longer offer ARM VPS of any kind, and only overpriced x86 ones that are not remotely cost competitive to anything else on that market. + +## Mythic Beasts, miniNodes, etc. + +These companies offer ARM instances, but they are Raspberry Pi instances.  The pricing is also rather expensive when considering that they are Raspberry Pi instances.  I don't consider these offers competitive in any way. + +## Equinix Metal + +You can still buy ARM servers on the Equinix Metal platform, but [you have to request permission to buy them](https://metal.equinix.com/developers/docs/servers/server-specs/#arm-servers).  In testing a couple of years ago, I was able to provision a `c1.large.arm` server on the spot market for $0.25/hour, which translates to $180/monthly. + +However, the problem with buying on the spot market is that your server might go away at any time, which means you can't actually depend on it. + +There is also the problem with data transfer: Equinix Metal follows the same billing practices for data transfer as AWS, meaning actual data transfer gets expensive quickly. + +However, the folks who run Equinix Metal are great people, and I feel like ARM could work with them to get some sort of side project going where they get ARM servers into the hands of developers at reasonable pricing.  They already have an arrangement like that for FOSS projects with the Works on ARM program. + +## Conclusions + +Right now, as noted above, Oracle is the best game in town for the average person (like me) to buy an ARM VPS.  We need more options.  Amazon should make Graviton available on its Lightsail platform. + +It is also possible that as a side effect of marcan's [Asahi Linux project](https://asahilinux.org/), we might have cheap Linux dedicated servers on Apple M1 mac minis soon.  That's also a space to watch. diff --git a/content/blog/its-time-to-boycott-aws.md b/content/blog/its-time-to-boycott-aws.md new file mode 100644 index 0000000..c470069 --- /dev/null +++ b/content/blog/its-time-to-boycott-aws.md @@ -0,0 +1,20 @@ +--- +title: "It’s time to boycott AWS" +date: "2021-10-26" +--- + +I woke up this morning not planning to write anything on this blog, much less anything about AWS. But then, as I was eating breakfast, I read a horrifying story in Mother Jones about how an AWS employee was treated as [he did his best to cope with his wife’s terminal cancer](https://www.motherjones.com/politics/2021/09/my-wife-was-dying-of-brain-cancer-my-boss-at-amazon-told-me-to-perform-or-quit/?utm_source=twitter&utm_campaign=naytev&utm_medium=social). 
+ +In the free software community, Amazon (more specifically AWS) has been criticized for years for taking a largely exploitative position concerning FOSS projects. These conversations frequently result in proposals to use licensing as a weapon against AWS. In general, I believe that it would be difficult to target AWS with licensing, as statutory licenses must be fair, reasonable and non-discriminatory. But the issue of exploitation remains: AWS takes from the commons of FOSS projects and productizes that work, frequently without giving anything back. + +They are, of course, allowed to do this, but at the same time, in doing so, they have frequently undercut the efforts of developers to monetize the labor involved in software maintenance, which leads to projects adopting licenses like SSPL and Commons Clause, which are significantly problematic for the commons. + +On top of this, licensing-based attacks are unlikely to be effective against AWS anyway, because in the process of productization, they wind up significantly modifying the software anyway. This means that it is only another step further to just completely rewrite the software, which is something they have done in the past, and will likely do again in the future. + +But my issue isn’t just the exploitative relationship AWS has with the commons (which is largely specific to AWS by the way), but rather the corporate culture of AWS. When I read the story in Mother Jones this morning, I saw no reason to disbelieve it, as I have heard many similar stories in the past from AWS employees. + +As participants in the technology industry, we are free to choose our suppliers. This freedom comes with a damning responsibility, however. When we choose to engage with AWS as a supplier, we are enabling and affirming the way they do business as a company. We are affirming their exploitation of the commons. + +We are also affirming their exploitative practice of placing AWS employees on a “pivot” (their parlance for a Performance Improvement Plan), which involves working employees to the bone, saying they failed to meet their PIP objectives and then firing them. + +The free software community must stand against both kinds of exploitation. We must stand against it by boycotting AWS until they recalibrate their relationship with the commons, and their relationship with their employees. We must also encourage the adoption and proliferation of humane, freedom-respecting technology. diff --git a/content/blog/json-ld-is-ideal-for-cloud-native-technologies.md b/content/blog/json-ld-is-ideal-for-cloud-native-technologies.md new file mode 100644 index 0000000..526ab79 --- /dev/null +++ b/content/blog/json-ld-is-ideal-for-cloud-native-technologies.md @@ -0,0 +1,154 @@ +--- +title: "JSON-LD is ideal for Cloud Native technologies" +date: "2022-02-11" +--- + +Frequently I have been told by developers that it is impossible to have extensible JSON documents underpinning their projects, because there may be collisions later. For those of us who are unaware of more capable graph serializations such as JSON-LD and Turtle, this seems like a reasonable position. Accordingly, I would like to introduce you all to JSON-LD, using a practical real-world deployment as an example, as well as how one might use JSON-LD to extend something like OCI container manifests. + +You might feel compelled to look up JSON-LD on Google before continuing with reading this. 
My suggestion is to not do that, because [the JSON-LD website](https://json-ld.org/) is really aimed towards web developers, and this explanation will hopefully explain how a systems engineer can make use of JSON-LD graphs in practical terms. And, if it doesn't, feel free to DM me on Twitter or something. + +## what JSON-LD can do for you + +Have you ever wanted any of the following in the scenarios where you use JSON: + +- Conflict-free extensibility +- Strong typing +- Compatibility with the RDF ecosystem (e.g. XQuery, SPARQL, etc) +- Self-describing schemas +- Transparent document inclusion + +If you answered yes to any of these, then JSON-LD is for you. Some of these capabilities are also provided by the IETF's [JSON Schema project](http://json-schema.org/), but it has a much higher learning curve than JSON-LD. + +This post will be primarily focused on how namespaces and aliases can be used to provide extensibility while also providing backwards compatibility for clients that are not JSON-LD aware. In general, I believe strongly that any open standard built on JSON should actually be built on JSON-LD, and hopefully my examples will demonstrate why I believe this. + +## ActivityPub: a real-world case study + +[ActivityPub is a protocol](https://www.w3.org/TR/activitypub) that is used on the federated social web (thankfully entirely unrelated to Web3), that is built on the ActivityStreams 2.0 specification. Both ActivityPub and ActivityStreams are RDF vocabularies that are represented as JSON-LD documents, but you don't really need to know or care about this part. + +This is a very simplified representation of an ActivityPub actor object: + +```json +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + { + "alsoKnownAs": { + "@id": "as:alsoKnownAs", + "@type": "@id" + }, + "sec": "https://w3id.org/security#", + "owner": { + "@id": "sec:owner", + "@type": "@id" + }, + "publicKey": { + "@id": "sec:publicKey", + "@type": "@id" + }, + "publicKeyPem": "sec:publicKeyPem", + } + ], + "alsoKnownAs": "https://corp.example.org/~alice", + "id": "https://www.example.com/~alice", + "inbox": "https://www.example.com/~alice/inbox", + "name": "Alice", + "type": "Person", + "publicKey": { + "id": "https://www.example.com/~alice#key", + "owner": "https://www.example.com/~alice", + "publicKeyPem": "..." + } +} +``` + +Pay attention to the `@context` variable here, it is doing a few things: + +1. It pulls in the entire ActivityStreams and ActivityPub vocabularies by reference. These can be downloaded on the fly or bundled with the application using context preloading. +2. It then defines a few terms outside of those vocabularies: `alsoKnownAs`, `sec`, `owner`, `publicKey` and `publicKeyPem`. + +When an application that is JSON-LD aware parses this document, it will receive a document that looks like this: + +```json +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + { + "alsoKnownAs": { + "@id": "as:alsoKnownAs", + "@type": "@id" + }, + "sec": "https://w3id.org/security#", + "owner": { + "@id": "sec:owner", + "@type": "@id" + }, + "publicKey": { + "@id": "sec:publicKey", + "@type": "@id" + }, + "publicKeyPem": "sec:publicKeyPem", + } + ], + "@id": "https://www.example.com/~alice", + "@type": "Person", + "as:alsoKnownAs": "https://corp.example.org/~alice", + "as:inbox": "https://www.example.com/~alice/inbox", + "as:name": "Alice", + "sec:publicKey": { + "@id": "https://www.example.com/~alice#key", + "sec:owner": "https://www.example.com/~alice", + "sec:publicKeyPem": "..." 
+ } +} +``` + +This allows extensions to interoperate with minimal conflicts, as the application is operating on a normalized version of the document that has as many things namespaced as possible, without the user having to worry about it. This allows a parser to easily ignore things it does not know about, as they aren't defined in the context (which does not actually have to be defined, you can preload a root context), and so they aren't placed in a namespace. + +In other words, that `@context` variable can be built into the application, or stored in an S3 bucket somewhere, or whatever you want to do. If you are planning to have an interoperable protocol, however, providing a useful `@context` is crucial. + +## How OCI image manifests could benefit from JSON-LD + +There was a discussion on Twitter this evening about how extending the OCI image spec with signature references has taken a year. If OCI used JSON-LD (ironically, its JSON vocabulary is already similar to several pre-existing JSON-LD ones), then implementations could just store the pre-existing metadata, mapped to a namespace. In the case of an OCI image, this might look something like: + +```json +{ + "@context": [ + "https://opencontainers.org/ns", + { + "sigstore": "https://sigstore.dev/ns", + "reference": { + "@type": "@id", + "@id": "sigstore:reference" + } + } + ], + "config": { + "mediaType": "application/vnd.oci.image.config.v1+json", + "digest": "sha256:d539cd357acb4a6df2a4ef99db5fe70714458349232dad0ec73e1ed65f6a0e13", + "size": 585 + }, + "layers": [ + { + "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", + "digest": "sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3", + "size": 2818413 + }, + { + "mediaType": "application/vnd.example.signature+json", + "size": 3514, + "digest": "sha256:19387f68117dbe07daeef0d99e018f7bbf7a660158d24949ea47bc12a3e4ba17", + "reference": { + "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", + "digest": "sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3", + "size": 2818413 + } + } + ] +} +``` + +The differences are minimal from a current OCI image manifest. Namely, `schemaVersion` has been deleted, because JSON-LD handles this detail automatically, and the signature reference extension has been added as the `sigstore:reference` property. Hopefully you can imagine how the rest of the document looks namespace wise. + +One last thing about this example. You might notice that I am using URIs when I define namespaces in the `@context`. This is a great feature of the RDF ecosystem: you can put up a webpage at those URIs defining how to make use of the terms defined in the namespace, meaning that JSON-LD tooling can have rich documentation built in. + +Also, since I am well aware that basically all of these OCI tools are written in Go, it should be noted that Go has an [excellent implementation of JSON-LD](https://pkg.go.dev/github.com/go-ap/jsonld), and for those concerned that W3C proposals are sometimes not in touch with reality, the creator of JSON-LD has [some words about it that are interesting](http://manu.sporny.org/2014/json-ld-origins-2/). Now, please, use JSON-LD and stop worrying about extensibility in open technology, this problem is totally solved. 
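If you want to see the normalization behaviour described above for yourself, any conformant JSON-LD processor can show it to you. As a quick illustration (not from the original specification or any particular project), here is a minimal sketch using the Python `pyld` library; expansion rewrites every term to its absolute IRI, which is the same property that makes the prefixed view shown earlier conflict-free:

```python
from pyld import jsonld

doc = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        {"sec": "https://w3id.org/security#", "publicKeyPem": "sec:publicKeyPem"},
    ],
    "type": "Person",
    "name": "Alice",
    "publicKeyPem": "...",
}

# Expansion resolves every term against the @context (the remote context is
# fetched on the fly, or can be preloaded), yielding absolute IRIs such as
# https://www.w3.org/ns/activitystreams#name and
# https://w3id.org/security#publicKeyPem, so independent extensions cannot collide.
print(jsonld.expand(doc))
```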
diff --git a/content/blog/lets-build-a-new-service-manager-for-alpine.md b/content/blog/lets-build-a-new-service-manager-for-alpine.md new file mode 100644 index 0000000..0963e71 --- /dev/null +++ b/content/blog/lets-build-a-new-service-manager-for-alpine.md @@ -0,0 +1,22 @@ +--- +title: "Let's build a new service manager for Alpine!" +date: "2021-03-25" +--- + +**Update (April 27)**: Please visit Laurent's website on this issue [for a more detailed proposal](https://skarnet.com/projects/service-manager.html).  If you work at a company which has budget for this, please get in touch with him directly. + +As many of you already know, Alpine presently uses an fairly [modified version of OpenRC](https://git.alpinelinux.org/aports/tree/main/openrc) as its service manager.  Unfortunately, OpenRC maintenance has stagnated: the last release was over a year ago. + +We feel now is a good time to start working on a replacement service manager based on user feedback and design discussions we've had over the past few years which can be simply summarized as _systemd done right_.  But what does _systemd done right_ mean? + +Our plan is to build a supervision-first service manager that consumes and reacts to events, using declarative unit files similar to systemd, so that administrators who are familiar with systemd can easily learn the new system.  In order to build this system, we plan to work with [Laurent Bercot](http://skarnet.org/), a globally recognized domain expert on process supervision systems and author of the [s6 software supervision suite](https://skarnet.org/software/s6/). + +This work will also build on the work we've done with ifupdown-ng, as ifupdown-ng will be able to reflect its own state into the service manager allowing it to start services or stop them as the network state changes.  OpenRC does not support reacting to arbitrary events, which is why this functionality is not yet available. + +Corporate funding of this effort would have meaningful impact on the timeline that this work can be delivered in.  It really comes down to giving Laurent the ability to work on this full time until it is done.  If he can do that, like Lennart was able to, he should be able to build the basic system in a few months. + +Users outside the Alpine ecosystem will also benefit from this work.  Our plan is to introduce a true contender to systemd that is completely competitive as a service manager.  If you believe real competition to systemd will be beneficial toward driving innovation in systemd, you should also want to sponsor this work. + +Alpine has gotten a lot of mileage out of OpenRC, and we are open to contributing to its future maintenance while Alpine releases still include it as part of the base system, but our long-term goal is to adopt the s6-based solution. + +If you're interested in sponsoring Laurent's work on this project, you can contact him [via e-mail](ska-remove-this-if-you-are-not-a-bot@skarnet.org) or via [his Twitter account](https://twitter.com/laurentbercot). 
diff --git a/content/blog/leveraging-json-ld-compound-typing-for-behavioural-hinting-in-activitypub.md b/content/blog/leveraging-json-ld-compound-typing-for-behavioural-hinting-in-activitypub.md new file mode 100644 index 0000000..909160b --- /dev/null +++ b/content/blog/leveraging-json-ld-compound-typing-for-behavioural-hinting-in-activitypub.md @@ -0,0 +1,63 @@ +--- +title: "Leveraging JSON-LD compound typing for behavioural hinting in ActivityPub" +date: "2019-10-02" +--- + +ActivityStreams provides for a multitude of different actor and object types, which ActivityPub capitalizes on effectively. However, neither ActivityPub nor ActivityStreams provide a method for hinting how a given actor or object should be interpreted in the vocabulary. + +The purpose of this blog post is to document how the litepub community intends to provide behavioural hinting in ActivityPub, as well as demonstrate an edge case where behavioural hinting is useful. + +## A Quick Refresher: what unhinted ActivityStreams objects look like + +This is an example actor, which is a relay service. It represents how relay services appear now. + +``` +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + "https://pleroma.site/schemas/litepub-0.1.jsonld" + ], + "id": "https://pleroma.site/relay", + "type": "Application", + "endpoints": { + "sharedInbox": "https://pleroma.site/inbox" + }, + "followers": "https://pleroma.site/relay/followers", + "following": "https://pleroma.site/relay/following", + "inbox": "https://pleroma.site/relay/inbox" +} +``` + +As you can tell, the `type` is set to `Application`, which when interpreted as a JSON-LD document expands to `https://www.w3.org/ns/activitystreams#Application`. + +## Hinting objects through compound typing + +In ActivityPub, different activities impose different side effects, but in many cases, it is not necessarily optimal to impose all side effects in all contexts. To know when we want to impose certain side effects or not, we need more semantic knowledge of the _intention_ behind an object. + +To solve this semantic quandry, JSON-LD provides a mechanism known as compound typing. In other words, an object can be two or more different types at once. For example, a `Person` object could also be a `Mother` or a `Partner` object as well. + +How does this apply to ActivityPub? By using the same mechanism, we can effectively hint the object to indicate how an implementation should ideally treat it: + +``` +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + "https://pleroma.site/schemas/litepub-0.1.jsonld", + {"Invisible": "litepub:Invisible"} + ], + "id": "https://pleroma.site/relay", + "type": ["Application", "Invisible"], + "endpoints": { + "sharedInbox": "https://pleroma.site/inbox" + }, + "followers": "https://pleroma.site/relay/followers", + "following": "https://pleroma.site/relay/following", + "inbox": "https://pleroma.site/relay/inbox" +} +``` + +Voila! Now an implementation which understands type hinting will understand that this relay service should not be visible to end users, which means that side effects caused by it doing it's job shouldn't be visible either. + +Of course, respecting such hinting is not mandatory, and therefore any security-dependent functionality shouldn't depend on behavioural hints. But security aside, they do have their uses. 
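To make that concrete, a receiving implementation might check for the hint along these lines (a simplified Python sketch of my own, not taken from any particular server; a real implementation should expand the document first so that aliased types resolve correctly):

```python
relay = {
    "id": "https://pleroma.site/relay",
    "type": ["Application", "Invisible"],
}

def has_type(obj, wanted):
    # ActivityStreams allows "type" to be a single string or a list of strings
    types = obj.get("type", [])
    if isinstance(types, str):
        types = [types]
    return wanted in types

if has_type(relay, "Invisible"):
    # skip purely cosmetic side effects, e.g. don't surface this actor's
    # Announce activities in user-facing timelines
    print("hiding side effects for", relay["id"])
```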
+ +I have assigned the `litepub:Invisible` type as the first behavioural hint, for cases where side effects should not be visible to end users, as in the case of relaying and group chats (what matters in both cases is that the peer discovers the referenced message instead of showing the Announce activity directly). diff --git a/content/blog/libreplayer-toward-a-generic-interface-for-replayer-cores-and-music-players.md b/content/blog/libreplayer-toward-a-generic-interface-for-replayer-cores-and-music-players.md new file mode 100644 index 0000000..3c84dbd --- /dev/null +++ b/content/blog/libreplayer-toward-a-generic-interface-for-replayer-cores-and-music-players.md @@ -0,0 +1,16 @@ +--- +title: "libreplayer: toward a generic interface for replayer cores and music players" +date: "2019-09-08" +--- + +I've been taking a break from focusing on fediverse development for the past couple of weeks — I've done some things, but it's not my focus right now because I'm waiting for Pleroma's `develop` tree to stabilize enough to branch it for the 1.1 stable releases. So, I've been doing some multimedia coding instead. + +The most exciting aspect of this has been `libreplayer`, which is essentially a generic interface between replayer emulation cores and audio players. For example, it will be possible for a replayer emulation core to simply target libreplayer and an audio player to target libreplayer and allow these two things (the emulation core and the libreplayer client) to work together to produce audio. + +The first release of libreplayer will drop soon. It will contain a PSF1/PSF2 player that is free of binary blobs. This is an important milestone because the only other PSF1/PSF2 replayer that is blob-free has many emulation bugs due to the use of incorrectly modified CPU emulation code from MAME. Highly Experimental's dirty secret is that it contains an obfuscated version of the PS2 firmware that has been stripped down. + +And so, the naming of libreplayer is succinct, in two ways: one, it's self-descriptive, libreplayer obviously conveys that it's a library for accessing replayers, but also due to the emulation cores included in the source kit being blob-free, it implies that the replayer emulation cores we include are _free as in freedom_, which is also important to me. + +What does this mean for audacious? Well, my intention is to replace the uglier replayer emulation plugins in audacious with a libreplayer client and clean-room implementations of each replayer core. I also intend to introduce replayer emulation cores that are not yet supported in audacious in a good way. + +Hopefully this allows for the emulation community to be more effective stewards of their own emulation cores, while allowing for projects like audacious to focus on their core competencies. I also hope that having high-quality clean-room implementations of emulator cores written to modern coding practice will help to improve the security of the emulation scene in general. Time will tell. diff --git a/content/blog/monitoring-for-process-completion-in-2021.md b/content/blog/monitoring-for-process-completion-in-2021.md new file mode 100644 index 0000000..7266b95 --- /dev/null +++ b/content/blog/monitoring-for-process-completion-in-2021.md @@ -0,0 +1,129 @@ +--- +title: "Monitoring for process completion in 2021" +date: "2021-09-20" +--- + +A historical defect in the `ifupdown` suite has been the lack of proper supervision of processes run by the system in order to bring up and down interfaces. 
Specifically, it is possible in historical `ifupdown` for a process to hang forever, at which point the system will fail to finish configuring interfaces. As interface configuration is part of the boot process, this means that the boot process can potentially hang forever and fail to complete. Accordingly, we have [introduced correct supervision of processes run by `ifupdown-ng` in the upcoming version 0.12](https://github.com/ifupdown-ng/ifupdown-ng/pull/161), with a 5 minute timeout.

Because `ifupdown-ng` is intended to be portable, we had to implement two versions of the process completion monitoring routine. The portable version is a busy loop which sleeps for 50 milliseconds between iterations, and the non-portable version uses Linux process descriptors, a feature introduced in Linux 5.3. On earlier kernels, `ifupdown-ng` will fall back to the portable implementation. There are also a couple of other ways that one can monitor for process completion using notifications, but they were not appropriate for the `ifupdown-ng` design.

## Busy-waiting with `waitpid(2)`

The portable version, as previously noted, uses a busy loop which sleeps for short durations of time. A naive version of a routine which does this would look something like:

```c
/* return true if process exited successfully, false in any other case */
bool
monitor_with_timeout(pid_t child_pid, int timeout_sec)
{
    int status;
    int ticks = 0;

    while (ticks < timeout_sec * 10)
    {
        /* waitpid returns the child PID on success */
        if (waitpid(child_pid, &status, WNOHANG) == child_pid)
            return WIFEXITED(status) && WEXITSTATUS(status) == 0;

        /* sleep 100ms */
        usleep(100000);
        ticks++;
    }

    /* timeout exceeded, kill the child process and error */
    kill(child_pid, SIGKILL);
    waitpid(child_pid, &status, WNOHANG);
    return false;
}
```

This approach, however, has some performance drawbacks. If the process has not already completed by the time that monitoring of it has begun, then you will be delayed at least 100ms. In the case of `ifupdown-ng`, almost all processes are very short-lived, so this is not a major issue; however, we can do better by tightening the event loop. Another optimization is to split the sleep into two steps, giving the initial call to `waitpid` a better chance of reaping the completed process:

```c
/* return true if process exited successfully, false in any other case */
bool
monitor_with_timeout(pid_t child_pid, int timeout_sec)
{
    int status;
    int ticks = 0;

    while (ticks < timeout_sec * 20)
    {
        /* sleep 50usec to allow the child PID to complete */
        usleep(50);

        /* waitpid returns the child PID on success */
        if (waitpid(child_pid, &status, WNOHANG) == child_pid)
            return WIFEXITED(status) && WEXITSTATUS(status) == 0;

        /* sleep 49.95ms */
        usleep(49950);
        ticks++;
    }

    /* timeout exceeded, kill the child process and error */
    kill(child_pid, SIGKILL);
    waitpid(child_pid, &status, WNOHANG);
    return false;
}
```

This works fairly well in practice: there is no performance regression on the `ifupdown-ng` test suite with this implementation.

## The self-pipe trick

Daniel J. Bernstein described a technique in the early 90s, known as [the self-pipe trick](https://cr.yp.to/docs/selfpipe.html), which allows process completion notifications to be delivered via a pollable file descriptor. It is portable to any POSIX-compliant system, and can be used with `poll` or whatever you wish to use.
The downside of this approach is that you have to write quite a bit of code, and you have to track which pipe FD is associated with which PID. It also wastes a file descriptor per process, since you have a file descriptor for both sides of the pipe.

## Linux's `signalfd`

What if we could turn delivery of signals into a pollable file descriptor? This is precisely what Linux's `signalfd` does. The basic idea here is to open a `signalfd`, associate `SIGCHLD` with it, and then do the `waitpid(2)` call when `SIGCHLD` is received at the `signalfd`. The downside of this approach is similar to the self-pipe trick: you have to keep global state in order to accomplish it, as there can only be a single `SIGCHLD` handler.

## Process descriptors

[FreeBSD introduced support for process descriptors in 2010](http://lackingrhoticity.blogspot.com/2010/10/process-descriptors-in-freebsd-capsicum.html) as part of the Capsicum framework. A process descriptor is an opaque handle to a specific process in the kernel. This is helpful as it avoids race conditions involving the recycling of PIDs. And since they are kernel handles, they can be waited on with `kqueue` like other kernel objects, by using `EVFILT_PROCDESC`.

There have been a few attempts to introduce process descriptors to Linux over the years. The attempt which [finally succeeded was Christian Brauner's `pidfd` API](https://lwn.net/Articles/801319/), completely landing in Linux 5.4, although parts of it were functional in prior releases. Like FreeBSD's process descriptors, a `pidfd` is an opaque reference to a specific `struct task_struct` in the kernel, and is also pollable, making it quite suitable for notification monitoring.

A problem with using the `pidfd` API, however, is that it is not presently implemented in either glibc or musl, which means that applications will need to provide stub implementations of the API themselves for now. The issue of having to write our own stub aside, the solution is quite elegant:

```c
#include <sys/syscall.h>
#include <sys/wait.h>
#include <signal.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

#if defined(__linux__) && defined(__NR_pidfd_open)

static inline int
local_pidfd_open(pid_t pid, unsigned int flags)
{
	return syscall(__NR_pidfd_open, pid, flags);
}

/* return true if process exited successfully, false in any other case */
bool
monitor_with_timeout(pid_t child_pid, int timeout_sec)
{
	int status;
	int pidfd = local_pidfd_open(child_pid, 0);

	if (pidfd < 0)
		return false;

	struct pollfd pfd = {
		.fd = pidfd,
		.events = POLLIN,
	};

	/* poll(2) returns the number of ready FDs; if it is less than
	 * one, it means our process has timed out.
	 */
	if (poll(&pfd, 1, timeout_sec * 1000) < 1)
	{
		close(pidfd);
		kill(child_pid, SIGKILL);
		waitpid(child_pid, &status, WNOHANG);
		return false;
	}

	/* if poll did return a ready FD, the process completed. */
	waitpid(child_pid, &status, WNOHANG);
	close(pidfd);

	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

#endif
```

It will be interesting to see process supervisors (and other programs which perform short-lived supervision) adopt these new APIs. As for me, I will probably prepare patches to include `pidfd_open` and the other syscalls in musl as soon as possible.
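As a postscript, the `signalfd` approach described earlier is the one variant not shown in code above, so here is a rough sketch of it. The function name is my own invention rather than anything in `ifupdown-ng`, and error handling, signal-mask restoration, and the already-exited-child race are glossed over:

```c
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <signal.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* illustrative sketch: monitor a single child using signalfd */
bool
monitor_with_signalfd(pid_t child_pid, int timeout_sec)
{
	sigset_t mask;
	int status;

	sigemptyset(&mask);
	sigaddset(&mask, SIGCHLD);

	/* SIGCHLD must be blocked so it is delivered through the
	 * signalfd rather than via the default disposition */
	if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
		return false;

	int sfd = signalfd(-1, &mask, 0);
	if (sfd < 0)
		return false;

	struct pollfd pfd = { .fd = sfd, .events = POLLIN };

	/* wait for a SIGCHLD notification, or give up after the timeout */
	if (poll(&pfd, 1, timeout_sec * 1000) > 0)
	{
		struct signalfd_siginfo si;
		(void) read(sfd, &si, sizeof si);
	}

	close(sfd);

	/* in either case, try to reap the child; kill it if still running */
	if (waitpid(child_pid, &status, WNOHANG) == child_pid)
		return WIFEXITED(status) && WEXITSTATUS(status) == 0;

	kill(child_pid, SIGKILL);
	waitpid(child_pid, &status, WNOHANG);
	return false;
}
```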
diff --git a/content/blog/moving-my-blog-to-oracle-cloud.md b/content/blog/moving-my-blog-to-oracle-cloud.md new file mode 100644 index 0000000..823658e --- /dev/null +++ b/content/blog/moving-my-blog-to-oracle-cloud.md @@ -0,0 +1,62 @@ +--- +title: "Moving my blog to Oracle cloud" +date: "2021-07-18" +--- + +In my past few blog posts, I have been talking about the [current state of affairs concerning ARM VPS hosting](https://ariadne.space/2021/07/10/its-time-for-arm-to-embrace-traditional-hosting/).  To put my money where my mouth is, I have now migrated my blog to the ARM instances Oracle has to offer, as an actual production use of their cloud.  You might find this surprising, [given the last post](https://ariadne.space/2021/07/14/oracle-cloud-sucks/), but Oracle reached out and explained why their system terminated my original account and we found a solution for that problem. + +## What happened, anyway? + +Back at the end of May, [Oracle announced that they were offering ARM VPS servers running on Ampere Altra CPUs](https://blogs.oracle.com/cloud-infrastructure/post/arm-based-cloud-computing-is-the-next-big-thing-introducing-arm-on-oracle-cloud-infrastructure).  Accordingly, I was curious, so I signed up for an account on the free tier.  All went well, except that as I was signing up, my now-previous bank declined the initial charge to verify that I had a working credit card. + +I was able to sign up anyway, but then a few days later, they charged my card again, which was also declined by my previous bank's overzealous fraud protection.  Then a few weeks later, I attempted to upgrade, and the same thing happened again: first charge was declined, I got a text message and retried, and everything went through.  This weirdness with the card being declined reliably on the first try, however, made Oracle's anti-fraud team anxious, and so they decided to understandably cover their own asses and terminate my account. + +I'm going to talk in more depth about my relationship with my previous bank soon, but I want to close my accounts out fully with them before I complain about how awful they are: one does not talk smack about somebody who is holding large sums of your savings, after all.  Needless to say, if you find yourself at a bank being acquired by another bank, run like hell. + +Given that Oracle was very proactive in addressing my criticism, and that the issue was caused by something neither myself nor Oracle had any control over (my bank demonstrating very loudly that they needed to be replaced), I decided to give them another chance, and move some of my production services over. + +At least, at the moment, since I will no longer be operating my own network as of September, I plan on running my services on a mix of Vultr, Oracle and Linode VMs, as this allows me to avoid Intel CPUs (Oracle have ARM, but also AMD EPYC VMs available, while Vultr and Linode also use AMD EPYC).  I will probably run the more FOSS-centric infrastructure on fosshost's ARM infrastructure, assuming they accept my application anyway. + +## Installing Alpine on Oracle Cloud + +At present, Alpine images are not offered on Oracle's cloud.  I intend to talk with some of the folks running the service who reached out about getting official Alpine images running in their cloud, as it is a quite decent hosting option. + +In the meantime, it is pretty simple to install Alpine.  The first step is to provision an ARM (or x86) instance in their control panel.  
You can just use the stock Oracle Linux image, as we will be blasting it away anyway. + +Once the image is running, you'll be presented with a control panel like so: + +![A control panel for the newly created VPS instance.](images/Screen-Shot-2021-07-17-at-10.55.35-PM-300x135.png) + +The next step is to create an SSH-based serial console.  You will need this to access the Alpine installer.  Scroll down to the resources section and click "Console Connection."  Then click "Create Console Connection": + +![Console connections without any created yet.](images/Screen-Shot-2021-07-17-at-10.59.04-PM-300x58.png) + +This will open a modal dialog, where you can specify the SSH key to use.  You'll need to use an RSA key, as this creation wizard doesn't yet recognize Ed25519 keys.  Select "Paste public key" and then paste in your RSA public key, then click "Create console connection" at the bottom of the modal dialog. + +The console connection will be created.  Click the menu icon for it, and then click "Copy Serial Console Connection for Linux/Mac." + +![Copying the SSH connection command.](images/Screen-Shot-2021-07-17-at-11.01.55-PM-300x115.png) + +Next, open a terminal and paste the command that was copied to your clipboard, and you should be able to access the VPS serial console after dealing with the SSH prompts. + +![VPS serial console running Oracle Linux](images/Screen-Shot-2021-07-17-at-11.04.05-PM-300x180.png) + +The next step is to SSH into the machine and download the Alpine installer.  This will just be `ssh opc@1.2.3.4` where 1.2.3.4 is the IP of the instance.  We will want to download the installer ISO to `/run`, which is a ramdisk, and then write it to `/dev/sda` and then sysrq b to reboot.  Here's what that looks like: + +![Preparing the Alpine installer](images/Screen-Shot-2021-07-17-at-11.09.40-PM-300x180.png) + +If you monitor your serial console window, you'll find that you've been dropped into the Alpine installer ISO. + +![Alpine installer shell](images/Screen-Shot-2021-07-17-at-11.11.39-PM-300x180.png) + +From here, you can run `setup-alpine` and follow the directions as usual.  You will want to overwrite the boot media, so answer yes when it asks. + +![Installing Alpine](images/Screen-Shot-2021-07-17-at-11.15.02-PM-300x179.png) + +At this point, you can reboot, and it will dump you into your new Alpine image.  You might want to set up `cloud-init`, or whatever, but that's not important to cover here. + +## Future plans + +At the moment, the plan is to see how things perform, and if they perform well, migrate more services over.  I might also create OCIs with `cloud-init` enabled for other users of Alpine on Oracle cloud. + +Stay tuned! diff --git a/content/blog/nfts-a-scam-that-artists-should-avoid.md b/content/blog/nfts-a-scam-that-artists-should-avoid.md new file mode 100644 index 0000000..2400322 --- /dev/null +++ b/content/blog/nfts-a-scam-that-artists-should-avoid.md @@ -0,0 +1,61 @@ +--- +title: "NFTs: A Scam that Artists Should Avoid" +date: "2021-03-21" +coverImage: "Screenshot_2021-03-21-NFTMagic-NFTs-on-Ardor-fully-stored-on-blockchain-and-low-fees.png" +--- + +Non-fungible tokens (NFTs) are the latest craze being pitched toward the artistic communities.  But, they are ultimately a meaningless token which fails to accomplish any of the things artists are looking for in an NFT-based solution. + +Let me explain... + +## So, What are NFTs? + +Non-fungible tokens are a form of smart contracts (program) which runs on a decentralized finance platform. 
They are considered "non-fungible" because they reference a specific asset, while a fungible token would represent an asset that is not specific.  An example of a non-fungible token in the physical world would be the title to your car or house, while a fungible token would be currency.

These smart contracts could, if correctly implemented, represent title to a physical asset, but implementation of an effective NFT regime would require substantive changes to contract law in order to be enforceable in the current legal system.

## How do NFTs apply to artwork?

Well, the simple answer is that they don't.  You might hear differently from companies that are selling NFT technology to you, as an artist or art collector, but there is no actual mechanism to enforce the terms of the smart contract in a court, and there is no actual mechanism to enforce that the guarantees of the smart contract itself cannot be bypassed.

NFT platforms like NFTMagic try to make it look like it is possible to enforce restrictions on your intellectual property using NFTs.  For example, [this NFT is listed as being limited to 1000 copies](https://nftmagic.io/view?asset=12588085591129036864):

![](images/Screenshot_2021-03-21-NFTMagic-300x151.png)

However, there is no mechanism to actually enforce this.  It is possible that only 1000 instances of the NFT can be _sold_, but this does not restrict the actual number of copies to 1000.  To demonstrate this, I have made a copy of this artwork and reproduced it on my own website, which exists outside of the world where the NFT has any enforcement mechanism.

![](images/12588085591129036864-169x300.jpeg)

As you can see, there are now at least 1001 available copies of this artwork.  Except you can download that one for free.

All of these platforms have various ways of finding the master copy of the artwork and thus enabling you to make your own copy.  I am planning on writing a tool soon to allow anyone to download their own personal copy of any NFT.  I'll write about the internals of the smart contracts and how to get the goods later.

## Well, what about archival?

Some companies, like NFTMagic, claim that your artwork is stored on the blockchain forever.  In practice, this is a very bold claim, because it requires that:

- Data is never evicted from the blockchain in order to make room for new data.
- The blockchain will continue to exist forever.
- Appending large amounts of data to the blockchain will always be possible.

Let's look into this for NFTMagic, since they make such a bold claim.

NFTMagic runs on the [Ardor blockchain platform](https://www.jelurida.com/ardor), which is [written in Java](https://github.com/NXTARDOR/ARDOR).  This is already somewhat concerning, because Java is not a very efficient language for writing this kind of software in.  But how is the blockchain stored to disk?

For that purpose, the [Ardor software uses H2](https://github.com/NXTARDOR/ARDOR/blob/master/src/java/nxt/db/BasicDb.java#L21).  H2 is basically an SQLite clone for Java.  So far, this is not very confidence inspiring.  By comparison, Bitcoin and Ethereum use LevelDB, which is far more suited for this task (maintaining a content-addressable append-only log) than an SQL database of any kind.

How is the data actually archived to the blockchain?  In the case of Ardor, it works by having some participants act as _archival nodes_.  These archival nodes maintain a full copy of the asset blockchain -- you either get everything or you get nothing.
 By comparison, other archival systems, like IPFS, allow you to specify what buckets you would like to duplicate. + +How does data get to the archival nodes?  It is split into 42KB chunks and committed to the chain with a manifest.  Those 42KB chunks and manifest are then stored in the SQL database. + +Some proponents of the NFTMagic platform claim that this design ensures that there will be at least one archival node available at all times.  But I am skeptical, because the inefficient data storage mechanism in combination with the economics of being an archival node make this unlikely to be sustainable in the long term.  If things play out the way I think they will, there will be a point in the future where zero archival nodes exist. + +However, there is a larger problem with this design: if some archival nodes choose to be evil, they can effectively deny the rightful NFT owner's ability to download the content she has purchased.  I believe implementing an evil node on the Ardor network would actually not be terribly difficult to do.  A cursory examination of the `getMessage` API does not seem to provide any protections against evil archival nodes.  At the present size of the network, a nation state would have more than sufficient resources to pull off this kind of attack. + +## Conclusion + +In closing, I hope that by looking at the NFTMagic platform and it's underlying technology (Ardor), I have effectively conveyed that these NFT platforms are not terribly useful to artists for more than a glorified "certificate of authenticity" solution.  You could, incidentally, do a "certificate of authenticity" by simply writing one and signing it with something like Docusign.  That would be more likely to be enforceable in a court too. + +I also hope I have demonstrated that NFTs by design cannot restrict anyone from making a copy.  Be careful when evaluating the claims made by NFT vendors.  When in doubt, check your fingers and check your wallet, because they are most assuredly taking you for a ride. diff --git a/content/blog/on-centralized-development-forges.md b/content/blog/on-centralized-development-forges.md new file mode 100644 index 0000000..f78e0aa --- /dev/null +++ b/content/blog/on-centralized-development-forges.md @@ -0,0 +1,34 @@ +--- +title: "On centralized development forges" +date: "2021-12-02" +--- + +Since the launch of SourceForge in 1999, development of FOSS has started to concentrate in centralized development forges, the latest one of course being GitHub, now owned by Microsoft. While the centralization of development talent achieved by GitHub has had positive effects on software development output towards the commons, it is also a liability: GitHub is now effectively a single point of failure for the commons, since the overwhelming majority of software is developed there. + +In other words, for the sake of convenience, we have largely traded our autonomy as software maintainers to GitHub, GitLab.com, Bitbucket and SourceForge, all of which are owned by corporate interests which, by definition, are aligned with profitability, not with our interests as maintainers. + +It is indeed convenient to use GitHub or GitLab.com for software development: you get all the pieces you need in order to maintain software with modern workflows, but it really does come at a cost: SourceForge, for example, was [caught redistributing Windows builds of projects under their care with malware](https://lwn.net/Articles/564250/). 
+ +While GitHub or the other forges besides SourceForge have not yet attempted anything similar, it does serve as a reminder that we are trusting forges to not tamper with the packages we release as maintainers. There are other liabilities too, for example, a commercial forge may unilaterally decide to kick your project off of their service, or terminate the account of a project maintainer. + +In order to protect the commons from this liability, it is imperative to build a more robust ecosystem, one which is a federated ecosystem of software development forges, which are either directly run by projects themselves, or are run by communities which directly represent the interests of the maintainers which participate in them. + +## Building a community of islands + +One of the main arguments in favor of centralization is that everyone else is already using a given service, and so you should as well. In other words, the concentrated social graph. However, it is possible to build systems which allow the social graph to be distributed across multiple instances. + +Networks like the ActivityPub fediverse (what many people incorrectly call the Mastodon network), despite their flaws, demonstrate the possibility of this. To that end, [ForgeFed is an adaptation of ActivityPub](https://forgefed.peers.community/) allowing development forges to federate (share social graph data) with other forges. With proliferation of standards like ForgeFed, it is possible to build a replacement ecosystem that is actually trustworthy and representative of the voices and needs of software maintainers. + +ForgeFed is moving along, albeit slowly. There is a [reference implementation called Vervis](https://dev.angeley.es/s/fr33domlover/r/vervis/s), and there is work ongoing to integrate ForgeFed into [Gitea](https://github.com/go-gitea/gitea/issues/14186) and [Gitlab CE](https://gitlab.com/gitlab-org/gitlab/-/issues/14116). As this work comes to fruition, forges will be able to start federating with each other. + +## A side-note on Radicle + +A competing proposal, known as Radicle, has been making waves lately. It should be ignored: it is just the latest in "Web3" cryptocurrency grifting, the software development equivalent to NFT mania. All problems solved by Radicle have better solutions in traditional infrastructure, or in ForgeFed. For example, to use Radicle, you must download a specialized client, and then download a blockchain with that client. This is not something most developers are going to want to do in order to just send a patch to a maintainer. + +## Setting up my own forge with CI + +[Treehouse](https://treehouse.systems/discord), the community I started by accident over labor day weekend, is now [offering a gitea instance with CI](https://gitea.treehouse.systems). It is my intention that this instance become communally governed, for the benefit of participants in the Treehouse community. We have made some modifications to gitea UI to make it more tolerable, and plan to implement ForgeFed as soon as patches are available, but it is admittedly still a work in progress. Come join us in [`#gitea` on the Treehouse discord](https://treehouse.systems/discord)! + +I have begun moving my own projects to this gitea instance. If you're interested in doing the same, the instance is open to anybody who wants to participate. I will probably be publishing the specific kubernetes charts to enable this setup on your own infrastructure in the next few days, as I clean them up to properly use secrets. 
I also plan to do a second blog outlining the setup once everything is figured out. + +It is my goal that we can move from large monolithic forges to smaller community-oriented ones, which federate with each other via ForgeFed to allow seamless collaboration without answering to corporate interests. Realization of this effort is a high priority of mine for 2022, and I intend to focus as much resources as I can on it. diff --git a/content/blog/on-cve-2019-5021.md b/content/blog/on-cve-2019-5021.md new file mode 100644 index 0000000..caaa658 --- /dev/null +++ b/content/blog/on-cve-2019-5021.md @@ -0,0 +1,16 @@ +--- +title: "On CVE-2019-5021" +date: "2021-11-22" +--- + +A few years ago, [it was discovered that the `root` account was not locked out in Alpine's Docker images](https://talosintelligence.com/vulnerability_reports/TALOS-2019-0782). This was not the first time that this was the case, [an actually exploitable case of this was first fixed with a hotfix in 2015](https://github.com/gliderlabs/docker-alpine/pull/109), but when the hotfix was replaced with appropriate use of `/etc/securetty`, the regression was inadvertently reintroduced for some configurations. + +It should be noted that I said **some** configurations there. Although CVE-2019-5021 was issued a CVSSv2 score of 9.8, in reality I have yet to find any Alpine-based docker image that is actually vulnerable to CVE-2019-5021. Of course, this doesn't mean that Alpine shouldn't have been locking out the root user on its `minirootfs` releases: that was a mistake, which I am glad was quickly rectified. + +Lately, however, there have been a few incidents involving CVE-2019-5021 involving less than honest actors in the security world. For example, a person named Donghyun Lee started [mass-filing CVEs against Alpine-based images without actually verifying if the image was vulnerable or not](https://github.com/donghyunlee00/CVE), which [Jerry Gamblin called out on Twitter last year](https://twitter.com/JGamblin/status/1338999330469011459). Other less than honest actors, have focused instead on attempting to use CVE-2019-5021 to sell their remediation solutions, implying a risk of vulnerability, where most likely none actually exists. + +So, what configurations are actually vulnerable to CVE-2019-5021? Well, you must install both the `shadow` and `linux-pam` packages in the container to have any possibility of vulnerability to this issue. I have yet to find a single container which has installed these packages: think about it, Docker containers do not run multi-user, so there is no reason to configure PAM inside them. In essence, CVE-2019-5021 was a vulnerability due to the fact that the PAM configuration was not updated to align with the new Busybox configuration introduced in 2015. + +And, for that matter, why is being able to escalate to `root` scary in a container? Well, if you are running a configuration without UID namespaces, `root` in the container is equivalent to `root` on the host: if the user can pivot outside the container filesystem, they can have full root access to the machine. Docker-in-docker setups with an Alpine-based container providing the Docker CLI, for example, would be easy to break out of if they were running PAM in a misconfigured way. + +But in practice, nobody combines PAM with the Alpine Docker images, as there's no reason to do so. Accordingly, be wary of marketing materials discussing CVE-2019-5021, in practice your configuration was most likely never vulnerable to it. 
diff --git a/content/blog/on-the-topic-of-community-management-cocs-etc.md b/content/blog/on-the-topic-of-community-management-cocs-etc.md new file mode 100644 index 0000000..5d261b4 --- /dev/null +++ b/content/blog/on-the-topic-of-community-management-cocs-etc.md @@ -0,0 +1,34 @@ +--- +title: "On the topic of community management, CoCs, etc." +date: "2021-08-08" +--- + +Many people may remember that at one point, Alpine [had a rather troubled community](https://lists.alpinelinux.org/~alpine/devel/%3CCA%2BT2pCE2H8Z8ERg5vS3mnr98Ets%2B0-m0aZpp4ZPzQ%2BhuKMPOjA%40mail.gmail.com%3E#%3CCA+T2pCE2H8Z8ERg5vS3mnr98Ets+0-m0aZpp4ZPzQ+huKMPOjA@mail.gmail.com%3E), which to put it diplomatically, resulted in a developer leaving the project.  This was the result of not properly managing the Alpine community as it grew -- had we taken early actions to ensure appropriate moderation and community management, that particular incident would never have happened. + +We did ultimately fix this issue and now have a community that tries to be friendly, welcoming and constructive, but it took a lot of work to get there.  As I was one of the main people who did that work, I think it might be helpful to talk about what I've learned through that process. + +## Moderation is critical + +For large projects like Alpine, active moderation is the most crucial aspect.  It is basically the part that makes or breaks everything else you try to do.  Building the right moderation team is also important: it needs to be a team that everyone can believe in. + +That means that the people who are pushing for community management may or may not be the right people to do the actual day to day moderation work, and should rather focus on policy.  This is because there will be bias against the people pushing for changes in the way the community is managed by some members.  Building a moderation team that gently enforces established policy, but is otherwise perceived as neutral is critical to success. + +## Policy statements (such as Codes of Conduct) + +It is not necessarily a requirement to write a Code of Conduct.  However, if you are retrofitting one into a pre-existing community, it needs to be done from the bottom up, allowing everyone to say their thoughts.  Yes, you will get people who present bad faith arguments, because they are resistant to change, or perhaps they see no problem with the status quo.  In most cases, however, it is likely because people are resistant to change.  By including the community in the discussion about its community management goals, you ensure they will generally believe in the governance decisions made. + +Alpine did ultimately adopt a Code of Conduct.  Most people have never read it, and it doesn't matter.  When we wrote it, we were writing it to address specific patterns of behavior we wanted to remove from the community space.  The real purpose of a Code of Conduct is simply to set expectations, both from participants _and_ the moderation team. + +However, if you _do_ adopt a Code of Conduct, you must actually enforce it as needed, which brings us back to moderation.  I have unfortunately seen many projects in the past few years, which have simply clicked the "Add CoC" button on GitHub and attached a copy of the Contributor Covenant, and then went on to do exactly nothing to actually align their community with the Code of Conduct they published.  Simply publishing a Code of Conduct is an optional first step to improving community relations, but it is _never_ the last step. 
+ +## Fostering inclusivity + +The other key part of building a healthy community is to build a community where everyone feels like they are represented.  This is achieved by encouraging community participation in governance, both at large, and in a targeted way: the people making the decisions and moderating the community should ideally look like the people who actually use the software created. + +This means that you should try to encourage women, people of color and other marginalized people to participate in project governance.  One way of doing so is by amplifying their work in your project.  You should also amplify the work of other contributors, too.  Basically, if people are doing cool stuff, the community team should make everyone aware of it.  A great side effect of a community team actively doing this is that it encourages people to work together constructively, which reinforces the community management goals. + +## Final thoughts + +Although it was not easy, Alpine ultimately implemented all of the above, and the community is much healthier than it was even a few years ago.  People are happy, code is being written, and we're making progress on substantive improvements to the Alpine system, as a community. + +Change is scary, but in the long run, I think everyone in the Alpine community agrees by now that it was worth it.  Hopefully other communities will find this advice helpful, too. diff --git a/content/blog/open-cores-isas-etc-what-is-actually-open-about-them.md b/content/blog/open-cores-isas-etc-what-is-actually-open-about-them.md new file mode 100644 index 0000000..c253952 --- /dev/null +++ b/content/blog/open-cores-isas-etc-what-is-actually-open-about-them.md @@ -0,0 +1,32 @@ +--- +title: "open cores, ISAs, etc: what is actually open about them?" +date: "2021-12-06" +--- + +In the past few years, with the launch of RISC-V, and IBM's OpenPOWER initiative (backed up with hardware releases such as Talos) there has been lots of talk about open hardware projects, and vendors talking about how anyone can go and make a RISC-V or OpenPOWER CPU. While there is a modicum of truth to the assertion that an upstart company could start fabricating their own RISC-V or OpenPOWER CPUs tomorrow, the reality is a lot more complex, and it basically comes down to patents. + +## Components of a semiconductor design + +The world of semiconductors from an intellectual property point of view is a complex one, especially as the majority of semiconductor companies have become "fabless" companies, meaning that they outsource the production of their products to other companies called foundries. This is even true of the big players, for example, AMD has been a fabless company since 2009, when they [spun off their foundry division into its own company called GlobalFoundries](http://web.archive.org/web/20081209040024/https://www.nytimes.com/2008/10/07/technology/07chip.html). + +Usually semiconductors are designed with an automated electronics design language such as Verilog or VHDL. When a company wishes to make a semiconductor, they contract out to a foundry, which provides the company with a customized Verilog or VHDL toolchain, which generates the necessary data to generate a semiconductor according to the foundry's processes. 
When you hear about a chip being made on "the TSMC 5nm process" or "Intel 7 process," (sidenote: [the Intel 7 process is actually 10nm](https://www.anandtech.com/show/16823/intel-accelerated-offensive-process-roadmap-updates-to-10nm-7nm-4nm-3nm-20a-18a-packaging-foundry-emib-foveros)) this is what they are talking about. + +The processes and tooling used by the foundries are protected by a combination of copyright and relevant patents, which are licensed to the fabless company as part of the production contract. However, these contracts are complicated: for example, in some cases, the IP rights for the generated silicon mask for a semiconductor may actually belong to the foundry, not the company which designed it. Other contracts might impose a vendor exclusivity agreement, where the fabless company is locked into using one, and only one foundry for their chip fabrication needs. + +As should be obvious by now, there is no situation where these foundry processes and tools are open source. At best, the inputs to these tools are, and this is true for RISC-V and OpenPOWER: there are VHDL cores, such as the [Microwatt OpenPOWER core](https://github.com/antonblanchard/microwatt) and [Alibaba's XuanTie RISC-V core](https://github.com/T-head-Semi/openc910), which can be downloaded and, with the appropriate tooling and contracts, synthesized into ASICs that go into products. These inputs are frequently described as SIP cores, or IP cores, short for Semiconductor Intellectual Property Core, the idea being that you can license a set of cores, wire them together, and have a chip. + +## The value of RISC-V and OpenPOWER + +As discussed above, a company looking to make a SoC or similar chip would usually license a bunch of IP cores and glue them together. For example, they might license a CPU core and memory controller from ARM, a USB and PCIe controller from Synopsys, and a GPU core from either ARM or Imagination Technologies. None of the IP cores in the above configuration are open source, the company making the SoC pays a royalty to use all of the licensed IP cores in their product. Notable vendors in this space include MediaTek and Rockchip, but there are many others. + +In practice, it is possible to replace the CPU core in the above designs with one of the aforementioned RISC-V or OpenPOWER ones, and there are other IP cores that can be used from, for example, [the OpenCores project to replace others](https://opencores.org/). However, that may, or may not, actually reduce licensing costs, as many IP cores are licensed as bundles, and there are usually third-party patents that have to be licensed. + +## Patents + +Ultimately, we come to the unavoidable topic, patents. Both RISC-V and OpenPOWER are described as patent-free, or patent-unencumbered, but what does that actually mean? In both cases, it means that the ISA itself is unencumbered by patents... in the case of RISC-V, the ISA itself is patent-free, and in the case of OpenPOWER, there is [a very liberal patent licensing pool](https://openpowerfoundation.org/final-draft-of-the-power-isa-eula-released/). + +But therein lies the rub: in both cases, the patent situation only covers the ISA itself. Implementation details and vendor extensions are not covered by the promises made by both communities. In other words, [SiFive](https://patents.justia.com/assignee/sifive-inc) and [IBM still have entire portfolios](https://patents.justia.com/company/ibm) they can assert against any competitor in their space. 
RISC-V, as noted before, does not have a multilateral patent pool, and these microarchitectural patents are not covered by the OpenPOWER patent pool, as that covers the POWER ISA only. + +This means that anybody competing with SiFive or IBM respectively, would have to be a patent licensee, if they are planning to produce chips which compete with SiFive or IBM, and these licensing costs are ultimately passed through to the companies licensing the SoC cores. + +There are steps which both communities could take to improve the patent problems: for example, RISC-V could establish a patent pool, and require ecosystem participants to cross-license their patents through it, and IBM could widen the scope of the OpenPOWER patent pool to cover more than the POWER ISA itself. These steps would significantly improve the current situation, enabling truly free (as in freedom) silicon to be fabricated, through a combination of a RISC-V or OpenPOWER core and a set of supporting cores from OpenCores. diff --git a/content/blog/oracle-cloud-sucks.md b/content/blog/oracle-cloud-sucks.md new file mode 100644 index 0000000..029ff3d --- /dev/null +++ b/content/blog/oracle-cloud-sucks.md @@ -0,0 +1,13 @@ +--- +title: "Oracle cloud sucks" +date: "2021-07-14" +coverImage: "Screen-Shot-2021-07-13-at-10.49.59-PM.png" +--- + +**Update:** Oracle have made this right, and I am in fact, now running [production services on their cloud](https://ariadne.space/2021/07/18/moving-my-blog-to-oracle-cloud/).  Thanks to Ross and the other Oracle engineers who reached out offering assistance.  The rest of the blog post is retained for historical purposes. + +In my previous blog, I said that [Oracle was the best option for cheap ARM hosting](https://ariadne.space/2021/07/10/its-time-for-arm-to-embrace-traditional-hosting/). + +Yesterday, Oracle rewarded me for that praise by demonstrating they are, in fact, Oracle and terminating my account.  When I contacted their representative, I was told that I was running services on my instance not allowed by their policies (I was running a non-public IRC server that only connected to other IRC servers, and their policies did not discuss IRC at all) and that the termination decision was final.  Accordingly, I can no longer recommend using Oracle's cloud services for anything -- if you use their service, **you are at risk of losing your hosting at any time**, for any reason they choose to invent, regardless of whether you are a paying customer or not. + +That leaves us with exactly zero options for cheap ARM hosting.  Hopefully Amazon will bring ARM options to Lightsail soon. diff --git a/content/blog/pleroma-litepub-activitypub-and-json-ld.md b/content/blog/pleroma-litepub-activitypub-and-json-ld.md new file mode 100644 index 0000000..a8949d6 --- /dev/null +++ b/content/blog/pleroma-litepub-activitypub-and-json-ld.md @@ -0,0 +1,72 @@ +--- +title: "Pleroma, LitePub, ActivityPub and JSON-LD" +date: "2018-11-12" +--- + +A lot of people make assumptions about my position on whether or not JSON-LD is actually good or not. The reality is that my view is more nuanced than that: there are _great_ uses for JSON-LD, but it's not appropriate in the scenario it is used in ActivityPub. + +## What is JSON-LD anyway? + +JSON-LD stands for _JSON Linked Data_. 
Linked Data is a “Big Data” technique which involves creating large graphs of interlinked pieces of data, intended to help enrich data sets with more semantic context (this is known as _graph coloring_), as well as additional data linked by URI (hence why it's called _linked data_). The Linked Data concept can be extremely powerful for data analysis when used in the appropriate context. A good example of where linked data is _useful_ is healthcare.gov, where they use it to help compare performance and value versus cost of US health insurance plans.

## ActivityPub and JSON-LD

Another example where JSON-LD is ostensibly used is ActivityPub. ActivityPub inherits its JSON-LD dependency from ActivityStreams 2.0, which is a data format that enjoys wide use outside of the ActivityPub ecosystem: for example, Twitter, Instagram, Facebook and Tumblr all use variations of ActivityStreams 2.0 objects in various places inside their APIs.

These services find the JSON-LD concept useful because their advertising customers can leverage JSON-LD (in Facebook's case, the _open graph_ concept they frequently pitch to advertisers is built in part on top of JSON-LD) to optimize their advertising campaigns.

But does JSON-LD provide any value in a social networking environment which does not have advertising? In my opinion, not really: it's just an artifact of the “if you're not the customer, you're the product” nature of the proprietary social networking services. As previously stated, the primary advantage of JSON-LD and the linked data philosophy in general is _data enrichment_, and _data enrichment_ is largely useful to two groups: _advertisers_ and _intelligence_ (public or private).

Since the federated social networking services don't have advertising, that just leaves _intelligence_.

### Private intelligence and social networking, how data enrichment can impact your credit score

There are various kinds of _private intelligence_ firms out there which collect information about you, me, and everyone else. You've probably heard of some of them, and some of the products they sell: companies like Experian, InfoCheckUSA and Equifax sell various products like FICO credit scores and background reports which determine everything from whether or not you can rent or buy a car or house to whether or not you can get a job.

But did you know these companies crawl your use of the proprietary social networking services? There are companies like [FriendlyScore](https://friendlyscore.com/) which sell credit-related data based on how you utilize social networking services. Those “social” credit scores are directly enabled by technology such as JSON-LD and ActivityStreams 2.0.

### Public intelligence and social networking, how data enrichment can get you killed

We've all heard about Predator drones and drone strikes in the news. In the past decade, drone strikes have been used to attack countless targets. But how do our public intelligence agencies determine who is a target? It's very similar to how the private intelligence agencies determine whether you should own a house or have a job: they use big data methods to analyze all of the metadata they collected.

If you write a post on a social networking service and attach GPS data to it, they can use that information to determine a general pattern of _when_ and _where_ you are, and then feed it into a machine learning algorithm to determine _when_ and _where_ you will likely be in the _future_.
They can also use this metadata analysis to prove certain assertions about your identity to a level of certainty which determines if you become a target, even if you're not really the same person they are trying to find. + +### Conclusion: safety is more important than data enrichment + +These techniques that are used both in the public and private sector are what the press tend to refer to as “Big Data” techniques. JSON-LD is a “Big Data” technology that can be leveraged in these ways. But at the same time, we can leverage some “Big Data” techniques in such a way that JSON-LD parsers will automatically do what we want them to do. + +In my opinion, it is a _critical_ obligation of federated social networking service developers to ensure that handling of data is done in the most secure way possible, built on proven fundamentals. I view the inclusion of JSON-LD in the ActivityPub and ActivityStreams 2.0 standards to be harmful toward that obligation. + +## Pleroma and JSON-LD + +As you may know, there are two mainstream ActivityPub servers that are in wide use: Mastodon and Pleroma. Mastodon uses JSON-LD and Pleroma does not. But they are able to interoperate just fine despite this. This is largely because Pleroma provides JSON-LD attributes in the messages it generates without actively using them itself. + +### Handling ActivityPub in a world without JSON-LD + +![The origin of the Transmogrifier name](images/Transmogrifier.png) + +Instead, Pleroma has a module called `Transmogrifier` that translates between _real_ ActivityPub and our _ActivityPub internal representation_. The use of AP constructs in our internal representation is the origin of the statement that Pleroma uses ActivityPub internally, and to an extent it is a very truthful statement: our internal representation and object graph are directly derived from an earlier ActivityPub draft, but it's not _quite_ the same, and there have been a few bugs where things have not been translated correctly which have resulted in leaks and other problems. + +Besides the `Transmogrifier`, we have two functions which fetch new pieces into the graphs we build: `Object.normalize()` and `Activity.normalize()`. This could be considered to be a similar approach to JSON-LD except that it's explicit instead of implicit. The explicit fetching of new graph pieces is a security feature: it allows us to validate that we actually trust what we're fetching before we do it. This helps us to prevent various “fake direction” attacks which can be used for spoofing. + +## LitePub and JSON-LD + +[LitePub](https://litepub.social/litepub) is a recent initiative that was started between Pleroma and a few other ActivityPub implementations to slim down the ActivityPub standard into something that is minimalist and secure. While LitePub itself does not require JSON-LD, LitePub implementations follow some JSON-LD like behaviors where it makes sense, and LitePub provides a `@context` which allows JSON-LD parsers to transparently parse LitePub messages. + +### Leveraging Linked Data for Object Capability Enforcement + +The main principle LitePub is built on is the use of leveraging the linked data paradigm to perform object capability enforcement. This can work either _explicitly_ (as is done in Pleroma) or _implicitly_ (as is done in Mastodon when parsing a LitePub activity). + +We do this by treating every `Object` ID in LitePub as a _capability URI_. 
When processing messages that reference a _capability URI_, we check to make sure the _capability URI_ is still valid by re-fetching the object. If fetching the object fails, then the _capability URI_ is no longer valid. This prevents zombie activities. + +### A note on Zombie Activities + +There are two primary ways of securing ActivityPub implementations with digital signatures: [JSON Linked Data Signatures (LDSigs)](https://w3c-dvcg.github.io/ld-signatures/) and the construction built on [HTTP Signatures that is leveraged in LitePub](https://litepub.social/litepub/overview.html). These can be referred to as _inline_ signatures and _transient_ signatures, respectively. + +The problem with _inline_ signatures is that they are valid forever. LDSig signatures have no expiration and have no revocation method. Because of this, if an `Object` is deleted, it can come back to life. The solution created by the LDSig advocates is to use `Tombstone` objects for all deletions, but that creates a potential metadata leak that proves a post once existed which harms plausible deniability. + +The LitePub approach on the other hand is to treat all objects as _capability URIs_. This means when an object is deleted, future attempts to access the _capability URI_ fail and thus the object cannot come back to life through boosting or other means. + +## Conclusion + +Hopefully this clarifies my views on JSON-LD and it's applications in the fediverse. Feel free to ask me questions if you have any. diff --git a/content/blog/software-does-not-make-a-product.md b/content/blog/software-does-not-make-a-product.md new file mode 100644 index 0000000..f65ce1f --- /dev/null +++ b/content/blog/software-does-not-make-a-product.md @@ -0,0 +1,32 @@ +--- +title: "Software Does Not Make A Product" +date: "2019-04-28" +--- + +> [Some fediverse developers](https://mastodon.social/@Gargron) approach project management from the philosophy that they are building a product in it's own right instead of a tool. But does that approach really make sense for the fediverse? + +It's that time again, [patches have been presented which improve Mastodon's compatibility](https://github.com/tootsuite/mastodon/pull/10629) with the rest of the fediverse. However, [the usual suspect has expressed disinterest in clicking the merge button](https://github.com/tootsuite/mastodon/pull/10629#issuecomment-485831461). The users protest loudly about this unilateral decision, as is expected by the astute reader. Threats of hard forks are made. GitHub's emoji reactions start to arrive, mostly negative. The usual suspect [fires back saying that the patches do not fit into his personal vision](https://github.com/tootsuite/mastodon/pull/10629#issuecomment-485927096), leading to more negative reactions. But why? + +I believe the main issue at stake is whether or not fediverse software is _the_ product, or if it is the instances themselves which are _the_ product. Yes, both the software and the instance itself, are products, but the question, really, is which one is actually more impactful? + +Gargron (the author of Mastodon), for whatever reason, sees Mastodon itself as the core product. This is obvious based on the marketing copy he writes to promote the Mastodon software and the 300,000+ user instance he personally administrates where he is followed by all new signups by default. It is also obvious based on the dictatorial control he exerts over the software. + +But is this view aligned with reality? 
Mastodon has very few configurable options, but [admins have made modifications to the software](https://github.com/glitch-soc/mastodon), which add configuration options that contradict Gargron's personal vision. These features are frequently deployed by Mastodon admins and, to an extent, Mastodon instances compete with each other on various configuration differences: custom emoji, theming, formatting options and even the maximum length of a post. This competition, largely, has been enabled by the existence of “friendly” forks that add the missing configuration options. + +My view is different. I see fediverse software as a tool that is used to build a community which optionally exists in a community of communities (the fediverse). In my view, users should be empowered to choose an instance which provides the features they want, with information about what features are available upfront. In essence, it is the instances themselves which are competing for users, not the software. + +Monoculture harms competitiveness, there are thousands of Mastodon instances to choose from, but how many of them are truly memorable? How many are shipping stock Mastodon with the same old default color scheme and theme? + +Outside of Mastodon, the situation is quite different. Most of us see the software we work on as a tool for facilitating community building. Accordingly, we try to do our best to give admins as many ways as possible to make their instance look and feel as they want. They are building the product that actually matters, we're just facilitating their work. After all, they are the ones who have to spend time customizing, promoting and managing the community they build. This is why Pleroma has extensive configuration and theming options that are presented in a way that is very easy to leverage. Likewise, Friendica, Hubzilla and even GNU Social can be customized in the same way: you're in control as the admin, not a product designer. + +But Mastodon is still problematic when it comes to innovation in the fediverse at large. Despite the ability that other fediverse software give to users and admins to present their content in whatever form they want, Mastodon presently fails to render the content correctly: + +![Mastodon presents lists in an incorrect way.](images/image.png?name=image.png) + +The [patches I referred to earlier](https://github.com/tootsuite/mastodon/pull/10629) correct this problem by changing how Mastodon processes posts from remote instances. They also provide a path toward improving usability in the fediverse by allowing us to work toward phasing out the use of Unicode mathematical constants as a substitute for proper formatting. The majority of fediverse microblogging software has supported this kind of formatting for a long time, many implementations predating Mastodon itself. Improved interoperability with other fediverse implementations sounds like a good thing, right? Well, it's not aligned with the Mastodon vision, so it's rejected. + +The viewpoint that the software itself is primarily what matters is stifling fediverse development. As developers, we should be working together to improve the security and expressiveness of the underlying technology. This means that some amount of flexibility is required. Quoting [RFC791](https://www.ietf.org/rfc/rfc0791.txt): + +> In general, an implementation must be conservative in its sending behavior, and liberal in its receiving behavior. + +There is no God of the fediverse. 
The fediverse exists and operates smoothly because we work together, as developers, in concert with the admin and user community at large. Accomplishing this requires compromise, not unilateral decision making.

diff --git a/content/blog/spelunking-through-the-apk-tools-dependency-solver.md b/content/blog/spelunking-through-the-apk-tools-dependency-solver.md new file mode 100644 index 0000000..ccc9634 --- /dev/null +++ b/content/blog/spelunking-through-the-apk-tools-dependency-solver.md @@ -0,0 +1,48 @@

---
title: "spelunking through the apk-tools dependency solver"
date: "2021-10-31"
---

In our previous episode, I [wrote a high level overview](https://ariadne.space/2021/04/25/why-apk-tools-is-different-than-other-package-managers/) of apk's differences versus traditional package managers, which many have cited as a helpful resource for understanding the behavior of apk when it does something different than a traditional package manager would. But that article didn't go into enough detail to explain how it all actually works. This one hopefully will.

## A high level view of the moving parts

Our adventure begins at the `/etc/apk/world` file. This file contains the basic set of constraints imposed on the system: every constraint listed here must be solvable in order for the system to be considered correct, and no transaction may be committed that is incorrect. In other words, the package management system can be proven to be in a correct state every time a constraint is added or removed with the apk add/del commands.

Note I used the word transaction there: at its core, apk is a transactional package manager, though we have not fully exploited the transactional capabilities yet. A transaction is created by copying the current constraint list (`db->world`), manipulating it with `apk_deps_add` and then committing it with `apk_solver_commit`. The commitment phase does pre-flight checks directly and returns an error if the transaction fails to pass.

This means that removing packages works the same way: you copy the current constraint set, remove the desired constraint, and then commit the result, which either errors out or updates the installed constraint set after the transaction is committed.

## A deeper look into the solver itself

As noted above, the primary entry point into the solver is to call the `apk_solver_commit` function, which, at the time that I am writing this, is located in the [apk-tools source code at src/commit.c:679](https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/commit.c#L679). This function does a few pre-flight checks and then calls into the solver itself, using `apk_solver_solve`, which generates the actual transaction to be committed. If there are errors, the generated transaction is discarded and a report is printed instead; otherwise, the generated transaction is committed using `apk_solver_commit_changeset`.

In essence, the code in src/commit.c can be thought of as the middle layer between the applets and the core solver. The core solver itself lives in src/solver.c and, as previously noted, the main entry point is `apk_solver_solve`, which generates a proposed transaction to satisfy the requested constraints. This function [lives at src/solver.c:1021](https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/solver.c#L1021), and is the only entry point into the solver itself.

The first thing the solver does is alphabetically sort the constraint set.
If you've noticed that `/etc/apk/world` is always in alphabetical order, this is a side effect of that sorting.

Once the world constraints (the ones in `/etc/apk/world`) are alphabetically ordered, the next step is to figure out what package, if any, presently satisfies the constraint. This is handled by the `discover_name` function, which is called recursively on every constraint applicable to the system, starting with the world constraint.

The next step is to generate a fuzzy solution. This is done by walking the dependency graph again, calling the `apply_constraint` function. This step does basic dependency resolution, removing possible solutions which explicitly conflict. Reverse dependencies (`install_if`) are partially evaluated in this phase, but complex constraints (such as those involving a version constraint or multiple solutions) are not evaluated yet.

Once basic constraints are applied to the proposed updated world, the next step is to walk the dependency graph again, reconsidering the fuzzy solution generated in the step above. This step is done by the `reconsider_name` function, which walks over parts of the dependency graph that are still ambiguous. Finally, packages are selected to resolve these ambiguities using the `select_package` function. Afterwards, the final changeset is emitted by the `generate_changeset` function.

### A deep dive into `reconsider_name` and `select_package`

As should hopefully be obvious by now, the really complicated cases are handled by the `reconsider_name` function. These cases include scenarios such as virtual providers, situations where more than one package satisfies the constraint set, and so on. For these scenarios, it is the responsibility of the `reconsider_name` function to select the optimal package. Similarly, it is the responsibility of the `select_package` function to check the work done by `reconsider_name` and finalize the package selection, if appropriate, by removing the constraint from the ambiguous list.

The primary purpose of the `reconsider_name` function is to use `discover_name` and `apply_constraint` to move more specific constraints upwards and downwards through the dependency graph, narrowing the possible set of packages which can satisfy a given constraint, ideally to a single package or none. These simplified dependency nodes are then fed into `select_package` to deduce the best package selection to make.

The `select_package` function checks each constraint, and the list of remaining candidate packages, and then picks the best package for each constraint. This is done by calling `compare_providers` for each possible package until the best one is found. The heuristics checked by `compare_providers` are, in order:

1. The packages are checked to see if they are `NULL` or not. The one that isn't `NULL` wins. This is mostly a safety check.
2. We check to see if the user is using `--latest` or not. If they are, then the behavior changes a little bit. The details aren't so important; you can read the source if you really want to know. Basically, in this step, we determine how fresh a package is, in alignment with what the user's likely opinion on freshness would be.
3. The provider versions are compared, if applicable. Highest version wins.
4. The package versions themselves are compared. Highest version wins.
5. The already installed package is preferred if the version is the same (this is helpful in upgrade transactions to make them less noisy).
6. The `provider_priority` field is compared.
Highest priority wins. Since the version comparisons above run first, this means that `provider_priority` is **only** checked for unversioned providers. +7. Finally, the earliest repository in `/etc/apk/repositories` is preferred if all else is the same. + +Hopefully, this demystifies some of the common misconceptions around how the solver works, especially how `provider_priority` works. Personally, I think in retrospect, despite working on the spec and implementing it in apk-tools, that `provider_priority` was a mistake, and the preferred solution should be to always use versioned providers (e.g. `provides="foo=100"`) instead. The fact that we have moved to versioning `cmd:` providers in this way demonstrates that `provider_priority` isn't really a good design. + +Next time: what is the maximum number of entries allowed in `/etc/apk/repositories` and why is it so low? diff --git a/content/blog/stop-defining-feature-test-macros-in-your-code.md b/content/blog/stop-defining-feature-test-macros-in-your-code.md new file mode 100644 index 0000000..2a8fc5b --- /dev/null +++ b/content/blog/stop-defining-feature-test-macros-in-your-code.md @@ -0,0 +1,32 @@ +--- +title: "stop defining feature-test macros in your code" +date: "2021-12-21" +--- + +If there is any change in the C world I would like to see in 2022, it would be the abolition of `#define _GNU_SOURCE`. In many cases, defining this macro in C code can have harmful side effects ranging from subtle breakage to miscompilation, because of how feature-test macros work. + +When writing or studying code, you've likely encountered something like this: + +``` +#define _GNU_SOURCE +#include <string.h> +``` + +Or worse: + +``` +#include <stdlib.h> +#include <unistd.h> +#define _XOPEN_SOURCE +#include <string.h> +``` + +The `#define _XOPEN_SOURCE` and `#define _GNU_SOURCE` in those examples are defining something known as a _feature-test macro_, which is used to selectively expose function declarations in the headers. These macros are necessary because some standards have conflicting definitions of functions, which are therefore aliased to other symbols so the conflicting functions can co-exist; however, only one version of a given function may be declared at a time, so the feature-test macros allow the user to select which definitions they want. + +The correct way to use these macros is by defining them at compile time with compiler flags, e.g. `-D_XOPEN_SOURCE` or `-std=gnu11`. This ensures that the feature-test macros are defined consistently across every file compiled in the project. + +As for the reason why `#define _GNU_SOURCE` is a thing? It's because we have documentation which does not correctly explain the role of feature-test macros. Instead, in a given manual page, you might see language like "this function is only enabled if the `_GNU_SOURCE` macro is defined." + +To find out the actual way to use those macros, you would have to read [feature\_test\_macros(7)](https://man7.org/linux/man-pages/man7/feature_test_macros.7.html), which is usually not referenced from individual manual pages. And while that manual page does present examples like the ones above as bad practice, it understates just how bad the practice actually is, and such an example is one of the first pieces of code you see on that page. + +In conclusion, never use `#define _GNU_SOURCE`; always set feature-test macros with compiler flags.
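+
+As a concrete illustration of the compile-time approach, here is a minimal sketch; the file name is made up, and `asprintf()` is simply one example of a GNU extension exposed by `_GNU_SOURCE`:
+
+```
+/* demo.c -- note: no feature-test macro is defined in this file */
+#include <stdio.h>
+#include <stdlib.h>
+
+int main(void)
+{
+	char *msg;
+
+	/* asprintf() is a GNU extension; its declaration is visible here only
+	   because the feature-test macro comes from the build system. */
+	if (asprintf(&msg, "hello, %s\n", "world") < 0)
+		return 1;
+
+	fputs(msg, stdout);
+	free(msg);
+	return 0;
+}
+```
+
+Building it with something like `cc -D_GNU_SOURCE -o demo demo.c`, or with the macro added to `CFLAGS`, keeps the definition consistent for every translation unit in the project, which is exactly what defining it at the top of individual source files fails to guarantee.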
diff --git a/content/blog/the-alpine-release-process.md b/content/blog/the-alpine-release-process.md new file mode 100644 index 0000000..fbd4478 --- /dev/null +++ b/content/blog/the-alpine-release-process.md @@ -0,0 +1,34 @@ +--- +title: "the Alpine release process" +date: "2021-10-22" +--- + +It's almost Halloween, which means it's almost time for an Alpine release, and all hands are on deck to make sure the process goes smoothly. But what goes into making an Alpine release? What are all the moving parts? Since we are in the process of cutting a new release series, I figured I would write about how it is actually done. + +## the beginning of the development cycle + +The development cycle for an Alpine release is 6 months long: it begins immediately once the release is branched in `aports.git`; at that point, there is no longer a development freeze, and minor changes start flowing in. + +Prior to the beginning of the development cycle, larger changes are proposed as system change proposals, an example being the [change proposal introducing Rust to main](https://gitlab.alpinelinux.org/alpine/tsc/-/issues/21) for the Alpine 3.16 development cycle. The largest, most invasive proposals are coordinated by the [Technical Steering Committee](https://gitlab.alpinelinux.org/alpine/tsc), while others may be coordinated by smaller teams or individual maintainers. Anybody may create a system change proposal and drive it in Alpine, regardless of whether or not they have developer rights in the project. + +As these system change proposals are accepted (possibly after a few rounds of revision), the underlying steps needed to implement the change are sequenced into the overall development schedule if needed. Otherwise, they are implemented at the discretion of the contributor driving the change proposal. + +## soft freeze + +About three weeks before release time, we set up new builders and initiate a mass rebuild of the distribution for the next release. These new builders will continue to follow changes to the `edge` branch until the final release is cut, at which point they will be switched to follow the branch set up for the release. + +At this point, the `edge` branch is limited to minor, low-risk changes only, unless explicitly granted an exception by the TSC. Efforts are focused primarily on bug-fix changes, such as resolving failure-to-build-from-source (FTBFS) issues discovered during the rebuild. + +## release candidates + +The next step before release is to do a few test releases. Release candidates are automatically produced by the builders when a developer updates the `alpine-base` package and tags that commit with an appropriate git tag. These candidates get uploaded to the mirror network and users begin testing them, which usually results in a few bugs being reported which get fixed prior to the final release. + +If you are curious, [you can read the code that is run to generate the releases yourself](https://git.alpinelinux.org/aports/tree/scripts/mkimage.sh); it is located in the `aports.git` repository in the `scripts` folder. The main driver of the release generation process is `mkimage.sh`. + +## the final release + +A few days after each release candidate is cut, the TSC (or a release engineering team delegated by the TSC) evaluates user feedback from testing the new release, and a go/no-go decision is made on making the final release. If the TSC decides the release is not ready, a new release candidate is made.
+ +Otherwise, if the decision to release is made, then the `aports.git` tree is branched, the new builders are switched to following the new branch, and the final release is cut on that branch. At that point, the `edge` branch is reopened for unrestricted development. + +Hopefully, in just a few days, we will have shipped the Alpine 3.15.0 release to the world, with very few release candidates required to do so. So far, the release process has largely gone smoothly, but only time will tell. diff --git a/content/blog/the-case-for-blind-key-rotation.md b/content/blog/the-case-for-blind-key-rotation.md new file mode 100644 index 0000000..69d00ad --- /dev/null +++ b/content/blog/the-case-for-blind-key-rotation.md @@ -0,0 +1,43 @@ +--- +title: "The Case For Blind Key Rotation" +date: "2018-12-30" +--- + +ActivityPub uses cryptographic signatures, mainly for the purpose of authenticating messages. This is largely for the purpose of spoofing prevention, but as any observant person would understand, digital signatures carry strong forensic value. + +Unfortunately, while ActivityPub uses cryptographic signatures, the types of cryptographic signatures to use have been left unspecified. This has lead to various implementations having to choose on their own which signature types to use. + +The fediverse has settled on using not one but _two_ types of cryptographic signature: + +- **HTTP Signatures**: based on an [IETF internet-draft](https://tools.ietf.org/html/draft-cavage-http-signatures-10), HTTP signatures provide a cryptographic validation of the headers, including a Digest header which provides some information about the underlying object. HTTP Signatures are an example of _detached_ signatures. HTTP Signatures also generally sign the Date header which provides a defacto validity period. +- **JSON-LD Linked Data Signatures**: based on a [W3C community draft](https://w3c-dvcg.github.io/ld-signatures/), JSON-LD Linked Data Signatures provide an inline cryptographic validation of the JSON-LD document being signed. JSON-LD Linked Data Signatures are commonly referred to as LDS signatures or LDSigs because frankly the title of the spec is a mouthful. LDSigs are an example of _inline_ signatures. + +## Signatures and Deniability + +When we refer to _deniability_, what we're talking about is _forensic deniability_, or put simply the ability to plausibly argue in a court or tribunal that you did not sign a given object. In essence, _forensic deniability_ is the ability to argue _plausible deniability_ when presented with a piece of forensic evidence. + +Digital signatures are by their very nature harmful with regard to _forensic deniability_ because they are digital evidence showing that you signed something. But not all signature schemes are made equal, some are less harmful to deniability than others. + +A good signature scheme which does not harm deniability has the following basic attributes: + +- Signatures are ephemeral: they only hold validity for a given time period. +- Signatures are revocable: they can be invalidated during the validity period in some way. + +Both HTTP Signatures and LDSigs have weaknesses — specifically, both implementations do not allow for the possibility of future revocation of the signature, but LDSigs is even worse because LDSigs are intentionally forever. 
+ +## Mitigating the revocability problem with Blind Key Rotation + +Blind Key Rotation is a mitigation that builds on the fact that ActivityPub implementations must fetch a given actor again in the event that signature authentication fails, by using this fact to provide some level of revocability. + +The mitigation works as follows: + +1. You delete one or more objects in a short time period. +2. Some time after the deletions are processed, the instance rekeys your account. It does not send any Update message or similar because signing your new key with your old key defeats the purpose of this exercise. +3. When you next publish content, signature validation fails and the instance fetches your account's actor object again to learn the new keys. +4. With the new keys, signature validation passes and your new content is published. + +It is important to emphasize that in a Blind Key Rotation, you do not send out an Update message with new keys. The reason why this is, is because you do not want to create a cryptographic relationship between the keys. By creating a cryptographic relationship, you introduce new digital evidence which can be used to prove that you held the original keypair at some time in the past. + +## Questions? + +If you still have questions, contact me on the fediverse: [@kaniini@pleroma.site](https://pleroma.site/kaniini) diff --git a/content/blog/the-end-of-a-short-era.md b/content/blog/the-end-of-a-short-era.md new file mode 100644 index 0000000..9ec6682 --- /dev/null +++ b/content/blog/the-end-of-a-short-era.md @@ -0,0 +1,11 @@ +--- +title: "The End of a Short Era" +date: "2021-03-21" +coverImage: "Screenshot_2021-02-13-jejune-client.png" +--- + +Earlier this year, I started [a project called Jejune](https://github.com/kaniini/jejune) and migrated my blog to it.  For various reasons, I have decided to switch to WordPress instead. + +The main reason why is because WordPress has plugins which do everything I wanted Jejune to do, so using an already established platform provides more time for me to work on my more important projects. + +For posting to the fediverse, I plan to use a public Mastodon or Pleroma instance, though most of my social graph migrated back to Twitter so I probably won't be too active there.  After all, my main reason for using social platforms is to communicate with my friends, so I am going to be where my friends actually are.  Feel free to let me know suggestions, though! diff --git a/content/blog/the-end-of-freenode.md b/content/blog/the-end-of-freenode.md new file mode 100644 index 0000000..f7c4ba3 --- /dev/null +++ b/content/blog/the-end-of-freenode.md @@ -0,0 +1,69 @@ +--- +title: "the end of freenode" +date: "2021-06-14" +coverImage: "BitchX_logo_-_ACiD.png" +--- + +My first experience with IRC was in 1999.  I was in middle school, and a friend of mine ordered a [Slackware CD from Walnut Creek CDROM](https://web.archive.org/web/19980209123157/http://ftp.cdrom.com/titles/os/os.htm#linux_distributions).  This was Slackware 3.4, and contained the GNOME 1.x desktop environment on the disc, which came with the [BitchX IRC client](http://bitchx.sourceforge.net/). + +At first, I didn't really know what BitchX was, I just thought it was a cool program that displayed random ascii art, and then tried to connect to various servers.  After a while, I found out that an IRC client allowed you to connect to an IRC network, and get help with Slackware. + +At that time, freenode didn't exist.  
The Slackware IRC channel was on DALnet, and I started using DALnet to learn more about Slackware.  Like most IRC newbies, it didn't go so well: I got banned from `#slackware` in like 5 minutes or something.  I pleaded for forgiveness, in the way redolent of a middle schooler.  And eventually, I got unbanned and stuck around for a while.  That was my first experience with IRC. + +After a few months, I got bored of running Linux and reinstalled Windows 98 on my computer, because I wanted to play games that only worked on Windows, and so, largely, my interest in IRC waned. + +A few years passed... I was in eighth grade.  I found out that one of the girls in my class was [a witch](https://en.wikipedia.org/wiki/Wicca).  I didn't really understand what that meant, and so I pressed her for more details.  She said that she was a Wiccan, and that I should read more about it on the Internet if I wanted to know more.  I still didn't quite understand what she meant, but I looked it up on [AltaVista](https://en.wikipedia.org/wiki/AltaVista), which linked me to an entire category of sites on [dmoz.org](https://en.wikipedia.org/wiki/DMOZ).  So, I read through these websites and on one of them I saw: + +> Come join our chatroom on DALnet: `irc.dal.net #wicca` + +DALnet!  I knew what that was, so I looked for an IRC client that worked on Windows, and eventually installed [mIRC](https://mirc.co.uk/).  Then I joined DALnet again, this time to join `#wicca`.  I found out about a lot of other amazing ideas from the people on that channel, and wound up joining others like `#otherkin` around that time.  Many of my closest friends to this day are from those days. + +At this time, DALnet was the largest IRC network, with almost 150,000 daily users.  Eventually, my friends introduced me to mIRC script packs, like [NoNameScript](https://archives.darenet.org/?dir=irc/scripts/mirc/nonamescript), and I used that for a few years on and off, sometimes using BitchX on Slackware instead, as I figured out how to make my system dual boot at some point. + +## The DALnet DDoS attacks + +For a few years, all was well, until the end of July 2002, when DALnet started being the target of [Distributed Denial of Service](https://en.wikipedia.org/wiki/Denial-of-service_attack#Distributed_DoS) attacks.  We would of course, later find out that these attacks were at the request of [Jason Michael Downey (Nessun), who had just launched a competing IRC network called Rizon](https://en.wikipedia.org/wiki/Operation:_Bot_Roast). + +However, this resulted in `#slackware` and many other technical channels moving from DALnet to `irc.openprojects.net`, a network that was the predecessor to freenode.  Using `screen`, I was able to run two copies of the BitchX client, one for freenode, and one for DALnet, but I had difficulties connecting to the DALnet network due to the DDoS attacks. + +## Early freenode + +At the end of 2002, `irc.openprojects.net` became freenode.  At that time, freenode was a much different place, with community projects like `#freenoderadio`, a group of people who streamed various 'radio' shows on an Icecast server.  Freenode had less than 5,000 users, and it was a community where most people knew each other, or at least knew somebody who knew somebody else. + +At this time, freenode ran `dancer-ircd`, with `dancer-services`, which were written by the Debian developer Andrew Suffield and based on ircd-hybrid 6 and HybServ accordingly. 
+ +Dancer had a lot of bugs, the software would frequently do weird things and the services were quite spartan compared to what was available on DALnet.  I knew based on what was available over on DALnet, that we could make something better for freenode, and so I started to learn about IRCD. + +## Hatching a plan to make services better + +By this time, I was in my last year of high school, and was writing IRC bots in Perl.  I hadn't really tried to write anything in C yet, but I was learning a little bit about C by playing around with a test copy of UnrealIRCd on my local machine.  But I started to talk to `lilo` about improving the services.  I knew it could be done, but I didn't know how to do it yet, which lead me to start searching for services projects that were simple and understandable. + +In my searching for services software, I found `rakaur`'s [Shrike project](https://github.com/rakaur/shrike), which was a very simple clone of Undernet's X service which could be used with ircd-hybrid.  I talked with `rakaur`, and I learned more about C, and even added some features.  Unfortunately, we had a falling out at that time because a user on the network we ran together found out that he could make `rakaur`'s IRC bot run `rm -rf --no-preserve-root /`, and did so. + +After working on Shrike a bit, I finally knew what to do: extend Shrike into a full set of DALnet-like services.  I showed what I was working on to `lilo` and he was impressed: I became a freenode staff member, and continued to work on the services, and all went well for a while.  He also recruited my friend `jilles` to help with the coding, and we started fixing bugs in `dancer-ircd` and `dancer-services` as an interim solution.  And we started writing `atheme` as a longer-term replacement to `dancer-services`, originally under the auspices of freenode. + +## Spinhome + +In early 2006, `lilo` launched his Spinhome project.  Spinhome was a fundraising effort so that `lilo` could get a mobile home to replace the double-wide trailer he had been living in.  Some people saw him trying to fundraise while being the owner of freenode as a conflict of interest, which lead to a falling out with a lot of staffers, projects, etc.  OFTC went from being a small network to a much larger network during this time. + +One side effect of this was that the `atheme` project got spun out into its own organization: atheme.org, which continues to exist in some form to this day. + +The atheme.org project was founded on the concept of promoting _digital autonomy_, which is basically the network equivalent of software freedom, and has advocated in various ways to preserve IRC in the context of digital autonomy for years.  In retrospect, some of the ways we advocated for digital autonomy were somewhat obnoxious, but as they say, hindsight is always 20/20. + +## The hit and run + +In September 2006, `lilo` was hit by a motorist while riding his bicycle.  This lead to a managerial crisis inside freenode, where there were two rifts: one group which wanted to lead the network was lead by Christel Dahlskjaer, while the other group was lead by Andrew Kirch (`trelane`).  Christel wanted to update the network to use all of the new software we developed over the past few years, and so atheme.org gave her our support, which convinced enough of the sponsors and so on to also support her. + +A few months later, `lilo`'s brother tried to claim title to the network to turn into some sort of business.  
This lead to Christel and Richard Hartmann (`RichiH`) meeting with him in order to get him to back away from that attempt. + +After that, things largely ran smoothly for several years: freenode switched to `atheme`, and then they switched to `ircd-seven`, a customized version of `charybdis` which we had written to be a replacement for `hyperion` (our fork of `dancer-ircd`), after which things ran well until... + +## Freenode Limited + +In 2016, Christel incorporated freenode limited, under the guise that it would be used to organize [the freenode #live conferences](https://freenode.live).  In early 2017, she sold 66% of her stake in freenode limited to Andrew Lee, who I wrote [about in last month's chapter](https://ariadne.space/2021/05/20/the-whole-freenode-kerfluffle/). + +All of that lead to Andrew's takeover of the network last month, and last night they decided to remove the `#fsf` and `#gnu` channels from the network, and k-lined my friend Amin Bandali when he criticized them about it, which means freenode is definitely no longer a network about FOSS. + +Projects should use alternative networks, like OFTC or Libera, or better yet, operate their own IRC infrastructure.  Self-hosting is really what makes IRC great: you can run your own server for your community and not be beholden to anyone else.  As far as IRC goes, that's the future I feel motivated to build. + +This concludes my coverage of the freenode meltdown.  I hope people enjoyed it and also understand why freenode was important to me: without `lilo`'s decision to take a chance on a dumbfuck kid like myself, I wouldn't have ever really gotten as deeply involved in FOSS as I have, so to see what has happened has left me heartbroken. diff --git a/content/blog/the-fsfs-relationship-with-firmware-is-harmful-to-free-software-users.md b/content/blog/the-fsfs-relationship-with-firmware-is-harmful-to-free-software-users.md new file mode 100644 index 0000000..bcc8f58 --- /dev/null +++ b/content/blog/the-fsfs-relationship-with-firmware-is-harmful-to-free-software-users.md @@ -0,0 +1,48 @@ +--- +title: "the FSF’s relationship with firmware is harmful to free software users" +date: "2022-01-22" +--- + +The FSF has an unfortunate relationship with firmware, resulting in policies that made sense in the late 1980s, but actively harm users today, through recommending obsolescent equipment, requiring increased complexity in RYF-certified hardware designs and discouraging both good security practices and the creation of free replacement firmware. As a result of these policies, deficient hardware often winds up in the hands of those who need software freedom the most, in the name of RYF-certification. + +## the FSF and microcode + +The normal Linux kernel is not recommended by the FSF, because it allows for the use of proprietary firmware with devices. Instead, they recommend Linux-libre, which disables support for proprietary firmware by ripping out code which allows for the firmware to be loaded on to devices. Libreboot, being FSF-recommended, also has [this policy of disallowing firmware blobs in the source tree](https://libreboot.org/news/policy.html#problems-with-ryf-criteria), despite it being a source of nothing but problems. + +The end result is that users who deploy the FSF-recommended firmware and kernel wind up with varying degrees of broken configurations. 
Worse yet, the Linux-libre project [removes warning messages](https://lists.gnu.org/archive/html/info-gnu/2018-04/msg00002.html) which suggest a user may want to update their processor microcode to avoid Meltdown and Spectre security vulnerabilities. + +While it is true that processor microcode is a proprietary blob, from a security and reliability point of view, there are two types of CPU: you can have a broken CPU, or a less broken CPU, and microcode updates are intended to give you a less broken CPU. This is particularly important because microcode updates fix real problems in the CPU, and Libreboot has patches which [hack around problems caused by deficient microcode](https://browse.libreboot.org/lbmk.git/plain/resources/coreboot/default/patches/0012-fix-speedstep-on-x200-t400-Revert-cpu-intel-model_10.patch?id=9938fa14b1bf54db37c0c18bdfec051cae41448e) burned into the CPU at manufacturing time, since its own no-blobs policy does not allow it to update the microcode at early boot time. + +There is also a common misconception about the capabilities of processor microcode. Over the years, I have talked with numerous software freedom advocates about the microcode issue, and many of them believe that microcode is capable of reprogramming the processor as if it were an FPGA or something. In reality, the microcode is a series of hot patches to the instruction decode logic, which is largely part of a fixed-function execution pipeline. In other words, you can’t use a microcode update to add capabilities to a CPU or substantially change the ones it already has. + +By discouraging (or, in the case of Linux-libre, outright preventing) end users from exercising their freedom (a key tenet of software freedom being that the user has agency to do whatever she wants with her computer) to update their processor microcode, the FSF pursues a policy which leaves users at risk for vulnerabilities such as Meltdown and Spectre, which were [partially mitigated through a microcode update](https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html). + +## Purism’s Librem 5: a case study + +The FSF “[Respects Your Freedom](https://ryf.fsf.org/about/criteria)” certification has a loophole so large you could drive a truck through it, called the “secondary processor exception”. This exception exists because the FSF knows that, generally speaking, entirely libre devices with the capabilities people want do not presently exist. Purism used this loophole to sell a phone that had proprietary software blobs while passing it off as entirely free. The relevant text of the exception that allowed them to do this was: + +> However, there is one exception for secondary embedded processors. The exception applies to software delivered inside auxiliary and low-level processors and FPGAs, within which software installation is not intended after the user obtains the product. This can include, for instance, microcode inside a processor, firmware built into an I/O device, or the gate pattern of an FPGA. The software in such secondary processors does not count as product software. + +Purism was able to accomplish this by making the Librem 5 have not one, but _two_ processors: when the phone first boots, it uses a secondary CPU as a service processor, which loads all of the relevant blobs (such as those required to initialize the DDR4 memory) before starting the main CPU and shutting itself off.
In this way, they could have all the blobs they needed to use, without having to worry about them being user-visible from PureOS. Under the policy, that left them free and clear for certification. + +The problem, of course, is that by hiding these blobs in the service processor, users are largely unaware of their existence, and are unable to leverage their freedom to study, reverse engineer and replace these blobs with libre firmware, a remedy that would typically be made available to them as part of the [four freedoms](https://www.gnu.org/philosophy/free-sw.html). + +This means that users of the Librem 5 phone are objectively harmed in three ways: first, they are unaware of the existence of the blobs to begin with; second, they do not have the ability to study the blobs; and third, they do not have the ability to replace the blobs. By pursuing RYF certification, Purism released a device that is objectively worse for the practical freedom of their customers. + +The irony, of course, is that at the end of this effort Purism had created a device that harmed consumer freedoms, with increased complexity, just to satisfy the requirements of a certification program it ultimately failed to obtain certification from. + +## The Novena laptop: a second case study + +In 2012, Andrew “bunnie” Huang began a project to create a laptop with the most free components he could find, called [the Novena open laptop](https://hackaday.com/2012/12/16/bunnie-builds-a-laptop-for-himself-hopefully-us/). It was based on the Freescale (now NXP) i.MX 6 CPU, which has an integrated Vivante GPU and WiFi radio. Every single component in the design had data sheets freely available, and the schematic itself was published under a free license. + +But because the SoC used required blobs to boot the GPU and WiFi functionality, the FSF required that these components be mechanically disabled in the product in order to receive certification, despite an ongoing effort to write replacement firmware for both components. This replacement firmware was eventually released, and people are using these chips with that free firmware today. + +Had bunnie chosen to comply with the RYF certification requirements, customers who purchased the Novena laptop would have been unable to use the integrated GPU and WiFi functionality, as those components would have been physically disabled on the board, despite the availability of free replacement firmware for them. Thankfully, bunnie chose not to move forward on RYF certification, and thus the Novena laptop can be used with GPU acceleration and WiFi. + +## the hardware which remains + +In practice, it is difficult to get anything much more freedom-respecting than the Novena laptop. From a right-to-repair perspective, the Framework laptop is very good, but it still uses proprietary firmware. It is, however, built on a modern x86 CPU, and could be a reasonable target for corebooting, especially now that the [embedded controller firmware’s source code](https://github.com/FrameworkComputer/EmbeddedController) has been released under a free license. + +However, because of the Intel ME, the Framework laptop will rightly never be RYF-certified. Instead, the FSF promotes buying old ThinkPads from 2009 with Libreboot pre-installed. This is a total disservice to users, as a computer from 2009 is totally obsolete now, and as discussed above, Intel CPUs tend to be rather broken without their microcode updates.
+ +My advice is to ignore the RYF certification program, as it is actively harmful to the practical adoption of free software, and just buy whatever you can afford that will run a free OS well. At this point, total blob-free computing is a fool’s errand, so there are a lot of AMD Ryzen-based machines that will give you decent performance and GPU acceleration without the need for proprietary drivers. Vendors which use coreboot for their systems and open the source code for their embedded controllers should be at the front of the line. But the FSF will never suggest this as an option, because they have chosen unattainable ideological purity over the pragmatism of recommending what the market can actually provide. diff --git a/content/blog/the-long-term-consequences-of-maintainers-actions.md b/content/blog/the-long-term-consequences-of-maintainers-actions.md new file mode 100644 index 0000000..fd3d276 --- /dev/null +++ b/content/blog/the-long-term-consequences-of-maintainers-actions.md @@ -0,0 +1,20 @@ +--- +title: "The long-term consequences of maintainers' actions" +date: "2021-09-16" +--- + +OpenSSL 3 has entered Alpine, and we have been switching software to use it over the past week.  While OpenSSL 1.1 is not going anywhere any time soon, it will eventually leave the distribution, once it no longer has any dependents.  I mostly bring this up because it highlights a few examples of maintainers not thinking about the big picture, let me explain. + +First, the good news: in distribution-wide rebuilds, we already know that the overwhelming majority of packages in Alpine build just fine with OpenSSL 3, when individually built against it.  Roughly 85% of `main` builds just fine with OpenSSL 3, and 89% of `community` builds with it.  The rebuild effort is off to a good start. + +Major upgrades to OpenSSL are not without their fallout, however.  In many cases, we cannot upgrade packages to use OpenSSL 3 because they have dependencies which themselves cannot yet be built with OpenSSL 3.  So, that 15% of `main` ultimately translates to 30-40% of `main` once you take into account dependencies like `curl`, which builds just fine with OpenSSL 3, but has hundreds of dependents, some of which don't. + +A major example of this is `mariadb`.  It has been known that OpenSSL 3 was on the horizon for over 4 years now, and that the OpenSSL 3 release would remove support for the classical OpenSSL programming approach of touching random internals.  However, they are [just now beginning to update their OpenSSL support to use the modern APIs](https://jira.mariadb.org/browse/MDEV-25785).  Because of this, we wound up having to downgrade dozens of packages which would otherwise have supported OpenSSL 3 just fine, because the maintainers of those packages did their part and followed the OpenSSL deprecation warnings as they showed up in OpenSSL releases.  MariaDB is a highly profitable company, [who do business with the overwhelming majority of the Fortune 500 companies](https://www.crunchbase.com/organization/mariadb).  But yet, when OpenSSL 3 releases started to be cut, they weren't ready, and despite having years of warning they're still not, which accordingly limits what packages can get the OpenSSL 3 upgrade as a result. + +Another casualty will be Ansible: we have already moved it to `community`.  You are probably wondering why Ansible, a software package which does not use OpenSSL at all, would be a casualty, so please let me explain.  
Ansible uses `paramiko` for its SSH client, which is a great SSH library for Python, and a totally solid decision to make.  However, `paramiko` uses `cryptography` for its cryptographic functions, again a totally solid decision to make, since `cryptography` is a great library for developers to use. + +For distributions, however, the story is different: `cryptography` moved to using Rust, because its developers wanted to leverage all of the static analysis capabilities built into the language.  This, too, is a reasonable decision, from a development perspective.  From the ecosystem perspective, however, it is problematic, as the Rust ecosystem is still rapidly evolving, and so we cannot support a single branch of the Rust compiler for an entire 2-year lifecycle, which means Rust itself lives in `community`.  Our solution, historically, has been to hold `cryptography` at the latest version that did not require Rust to build.  However, that version is not compatible with OpenSSL 3, and so it will eventually need to be upgraded to a new version which is.  And so, since `cryptography` has to move to `community`, so do `paramiko` and Ansible. + +The ideology of moving fast and breaking things, while tolerated in the technology industry, does not translate to success in the world at large.  Outside of technology, the world prefers stability: the reason why banks still buy mainframes, and still use z/OS, is that the technology works and can be depended upon.  Similarly, the engine controllers in cars, and medical devices like pacemakers and insulin pumps, run on C.  They don't run on C because C is a good language with all the latest features; they run on C because the risks and mitigations for issues in C programs are [well-understood and documented as part of MISRA C](https://www.misra.org.uk/). + +Distributions exist to provide a similar set of stability and reliability guarantees.  If we cannot provide a long-term support lifecycle for a piece of technology your software depends on, then we are forced to provide a shorter support lifecycle for your software as well.  For some, that is fine, but I think many will be disappointed to see that they haven't fully gotten OpenSSL 3, or that Ansible has had to be moved to `community`. diff --git a/content/blog/the-problematic-gpl-or-later-clause.md b/content/blog/the-problematic-gpl-or-later-clause.md new file mode 100644 index 0000000..a3558ad --- /dev/null +++ b/content/blog/the-problematic-gpl-or-later-clause.md @@ -0,0 +1,30 @@ +--- +title: "the problematic GPL \"or later\" clause" +date: "2021-11-16" +--- + +The GNU General Public License started life as [the GNU Emacs Public License](https://www.free-soft.org/gpl_history/emacs_gpl.html) in 1987 (the linked version is from February 1988), and has been built on the principle of copyleft: [the use of the copyright system to enforce software freedom through licensing](https://www.gnu.org/licenses/copyleft.en.html). This prototype version of the GPL was used for other packages, such as GNU Bison (in 1988), and [Nethack (in 1989)](https://www.free-soft.org/gpl_history/nethack_gpl.html), and was most likely written by Richard Stallman himself. + +This prototype version was also referred to as [the _GNU General Public License_ in a 1988 bulletin](https://www.gnu.org/bulletins/bull5.html#SEC5), so we can think of it in a way as GPLv0. This version of the GPL, however, was mothballed in February 1989, [with the publication of the GPLv1](https://www.gnu.org/licenses/old-licenses/gpl-1.0.en.html).
One of the new features introduced in the newly rewritten GPLv1 license, was the "or later" clause: + +> 7\. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. +> +> Each version is given a distinguishing version number. If the Program specifies a version number of the license which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the license, you may choose any version ever published by the Free Software Foundation. +> +> Section 7 of the [GNU General Public License version 1](https://www.gnu.org/licenses/old-licenses/gpl-1.0.en.html) + +The primary motive for the version upgrade clause, at the time, was quite simple: the concept of using copyright to enforce software freedom, was, at the time, a new and novel concept, and there was a concern that the license might have flaws or need clarifications. Accordingly, to streamline the process, they added the version upgrade clause to allow authors to consent to using new versions of the GPL as an alternative. Indeed, in the [January 1991 release of the GNU Bulletin](https://www.gnu.org/bulletins/bull10.html#SEC6), plans to release the GPLv2 were announced as an effort to clarify the license: + +> We will also be releasing a version 2 of the ordinary GPL. There are no real changes in its policies, but we hope to clarify points that have led to misunderstanding and sometimes unnecessary worry. +> +> GNU Bulletin volume 1, number 10, "[New library license](https://www.gnu.org/bulletins/bull10.html#SEC6)" + +After that, not much happened in the GNU project regarding licensing for a long time, until the GPLv3 drafting process in 2006. From a governance point of view, the GPLv3 drafting process was a significant accomplishment in multi-stakeholder governance, as [outlined by Eben Moglen](https://softwarefreedom.org/resources/2013/A_History_of_the_GPLv3_Revision_Process.pdf). + +However, for all of the success of the GPLv3 drafting process, it must be noted that the GPL is ultimately published by the Free Software Foundation, an organization that many have questioned the long-term viability of lately. When the "or later version" clause was first introduced to the GPL, it was unthinkable that the Free Software Foundation could ever be in such a state of affairs, but now it is. + +And this is ultimately the problem: what happens if the FSF shuts down, and has to liquidate? What if an intellectual property troll acquires the GNU copyright assignments, or acquires the trademark rights to the FSF name, and publishes a new GPL version? There are many possibilities to be concerned about, but developers can do two things to mitigate the damage. + +First, they can stop using the "or later" clause in new GPL-licensed code. This will, effectively, limit those projects from being upgraded to new versions of the GPL, which may be published by a compromised FSF. In so doing, projects should be able to avoid relicensing discussions, as GPLv3-only code is compatible with GPLv3-or-later: the common denominator in this case is GPLv3. + +Second, they can stop assigning copyright to the FSF. 
In the event that the FSF becomes compromised, for example, by an intellectual property troll, this limits the scope of their possible war chest for malicious GPL enforcement litigation. As we have learned from the McHardy cases involving Netfilter, in a project with multiple copyright holders, effective GPL enforcement litigation is most effective when done as a class action. In this way, dilution of the FSF copyright assignment pool protects the commons over time from exposure to malicious litigation by a compromised FSF. diff --git a/content/blog/the-three-taps-of-doom.md b/content/blog/the-three-taps-of-doom.md new file mode 100644 index 0000000..a04b187 --- /dev/null +++ b/content/blog/the-three-taps-of-doom.md @@ -0,0 +1,32 @@ +--- +title: "the three taps of doom" +date: "2021-07-03" +--- + +A few years ago, I worked as the CTO of an advertising startup.  At first, we used Skype for messaging amongst the employees, and then later, we switched to Slack.  The main reason for switching to Slack was because they had an IRC gateway -- you could connect to a Slack workspace with an IRC client, which allowed for the people who wanted to use IRC to do so, while providing a polished experience for those who were unfamiliar with IRC. + +## the IRC gateway + +In the beginning, Slack had an IRC gateway.  On May 15th, 2018, Slack [discontinued the IRC gateway](https://web.archive.org/web/20180314224655/https://get.slack.help/hc/en-us/articles/201727913-Connect-to-Slack-over-IRC-and-XMPP), beginning my descent into [Cocytus](https://en.wikipedia.org/wiki/Cocytus#In_the_Divine_Comedy).  Prior to the shutdown of the IRC gateway, I had always interacted with the Slack workspace via IRC.  This was replaced with the Slack mobile and desktop apps. + +The IRC gateway, however, was quite buggy, so it was probably good that they got rid of it.  It did not comply with any reasonable IRC specifications, much less support anything from IRCv3, so the user experience was quite disappointing albeit serviceable. + +## the notifications + +Switching from IRC to the native Slack clients, I now got to deal with one of Slack's main features: notifications.  If you've ever used slack, you're likely familiar with the [unholy notification sound](https://www.youtube.com/watch?v=U7iGyCdA0xk), or as I have come to know it, the triple tap of existential doom.  Let me explain. + +At this point, we used slack for _everything_: chat, paging people, even monitoring tickets coming in.  The workflow was efficient, but due to matters outside my control, revenues were declining.  This lead to the CEO becoming quite antsy.  One day he discovered that he could use `@all`, `@tech` or `@sales` to page people with his complaints. + +This means that I would now get pages like: + +**Monitoring:** `@tech` Service `rtb-frontend-nyc` is degraded **CEO:** `@tech` I demand you implement a filtering feature our customer is requiring to scale up + +The monitoring pages were helpful, the CEO paging us demanding that we implement filtering features that spied on users and definitely would not actually result in scaled up revenue (because the customers were paying CPM) were not helpful. + +The pages in question were actually a lot more intense than I show here, these are tame examples, but it felt like I had to walk on eggshells in order to use Slack. + +## Quitting that job + +In the middle of 2018, I quit that job for various reasons.  And as a result, I uninstalled Slack, and immediately felt much better.  
But every time I hear the Slack notification sound, I still get anxious. + +The moral of this story is: if you use Slack, don't use it for paging, and make sure your CEO doesn't have access to the paging features.  It will be a disaster.  And if you're running a FOSS project, consider not using Slack, as there are likely many technical people who avoid Slack due to their own experiences with it. diff --git a/content/blog/the-tragedy-of-gethostbyname.md b/content/blog/the-tragedy-of-gethostbyname.md new file mode 100644 index 0000000..1ea90ea --- /dev/null +++ b/content/blog/the-tragedy-of-gethostbyname.md @@ -0,0 +1,70 @@ +--- +title: "the tragedy of gethostbyname" +date: "2022-03-27" +--- + +A frequent complaint expressed about Alpine on a certain website relates to deficiencies in the musl DNS resolver when querying large zones. In response, it is usually mentioned that applications which expect reliable DNS lookups should be using a dedicated DNS library for this task, not the `getaddrinfo` or `gethostbyname` APIs, but this is usually rebuffed by comments saying that these APIs are fine to use because they are allegedly reliable on GNU/Linux. + +For a number of reasons, the assertion that DNS resolution via these APIs under glibc is more reliable is false, but to understand why, we must look at the history of why a `libc` is responsible for shipping these functions to begin with, and how these APIs evolved over the years. For instance, did you know that `gethostbyname` originally didn't do DNS queries at all? And, the big question: why are these APIs blocking, when DNS is inherently an asynchronous protocol? + +Before we get into this, it is important to again restate that if you are an application developer, and your application depends on reliable DNS performance, you must absolutely use a dedicated DNS resolver library designed for this task. There are many libraries available that are good for this purpose, such as [c-ares](https://c-ares.org/), [GNU adns](https://www.gnu.org/software/adns/), [s6-dns](https://skarnet.org/software/s6-dns/) and [OpenBSD's libasr](https://github.com/OpenSMTPD/libasr). As should hopefully become obvious at the end of this article, the DNS clients included with `libc` are designed to provide basic functionality only, and there is no guarantee of portable behavior across client implementations. + +## the introduction of `gethostbyname` + +Where did `gethostbyname` come from, anyway? Most people believe this function came from BIND, the reference DNS implementation developed by the Berkeley CSRG. In reality, it was introduced to BSD in 1982, alongside the `sethostent` and `gethostent` APIs. I happen to have a copy of the 4.2BSD source code, so here is the implementation from 4.2BSD, which was released in early 1983: + +``` +struct hostent * +gethostbyname(name) +	register char *name; +{ +	register struct hostent *p; +	register char **cp; + +	sethostent(0); +	while (p = gethostent()) { +		if (strcmp(p->h_name, name) == 0) +			break; +		for (cp = p->h_aliases; *cp != 0; cp++) +			if (strcmp(*cp, name) == 0) +				goto found; +	} +found: +	endhostent(); +	return (p); +} +``` + +As you can see, the 4.2BSD implementation only checks the `/etc/hosts` file and nothing else. This answers the question of why `gethostbyname` and its successor, `getaddrinfo`, do DNS queries in a blocking way: the original function simply scanned a local file, so there was nothing asynchronous about it, and when DNS support was added later, the BSD developers did not want to introduce a replacement API for `gethostbyname` that was asynchronous.
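+
+For contrast with that historical implementation, here is a minimal sketch (not taken from any particular program) of how a caller typically uses the API today; the thing to notice is that the call blocks until the lookup finishes:
+
+```
+#include <stdio.h>
+#include <netdb.h>
+#include <arpa/inet.h>
+
+int main(int argc, char *argv[])
+{
+	if (argc < 2)
+		return 1;
+
+	/* blocks until the lookup completes or fails */
+	struct hostent *he = gethostbyname(argv[1]);
+	if (he == NULL) {
+		fprintf(stderr, "lookup failed, h_errno=%d\n", h_errno);
+		return 1;
+	}
+
+	/* print each address attached to the returned hostent */
+	for (char **ap = he->h_addr_list; *ap != NULL; ap++) {
+		char buf[INET6_ADDRSTRLEN];
+		printf("%s\n", inet_ntop(he->h_addrtype, *ap, buf, sizeof buf));
+	}
+	return 0;
+}
+```
+
+Every lookup made this way stalls the calling thread, which is exactly why applications with real DNS requirements end up reaching for the asynchronous libraries mentioned earlier.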
+ +## the introduction of DNS to `gethostbyname` + +DNS resolution was first introduced to `gethostbyname` in 1984, when it was introduced to BSD. [This version, which is too long to include here](https://github.com/dank101/4.3BSD-Reno/blob/00328b5a67ffe35e67baeba8f7ab75af79f7ae64/lib/libc/net/gethostnamadr.c#L213) also translated dotted-quad IPv4 addresses into a `struct hostent`. In essence, the 4.3BSD implementation does the following: + +1. If the requested hostname begins with a number, try to parse it as a dotted quad. If this fails, set `h_errno` to `HOST_NOT_FOUND` and bail. Yes, this means 4.3BSD would fail to resolve hostnames like `12-34-56-78.static.example.com`. +2. Attempt to do a DNS query using `res_search`. If the query was successful, return the first IP address found as the `struct hostent`. +3. If the DNS query failed, fall back to the original `/etc/hosts` searching algorithm above, now called `_gethtbyname` and using `strcasecmp` instead of `strcmp` (for consistency with DNS). + +A fixed version of this algorithm was also included with BIND's `libresolv` as `res_gethostbyname`, and the `res_search` and related functions were imported into BSD libc from BIND. + +## standardization of `gethostbyname` in POSIX + +The `gethostbyname` and `getaddrinfo` APIs were first standardized in X/Open Networking Services Issue 4 (commonly referred to as XNS4) specification, which itself was part of the X/Open Single Unix Specification version 3 (commonly referred to as SUSv3), released in 1995. Of note, X/Open tried to deprecate `gethostbyname` in favor of `getaddrinfo` as part of the XNS5 specification, [removing it entirely except for a mention in their specification for `netdb.h`](https://pubs.opengroup.org/onlinepubs/009619199/netdbh.htm#tagcjh_06_02). + +Later, it returned [as part of POSIX issue 6, released in 2004](https://pubs.opengroup.org/onlinepubs/009696799/functions/gethostbyaddr.html). That version says: + +> **Note:** In many cases it is implemented by the Domain Name System, as documented in RFC 1034, RFC 1035, and RFC 1886. +> +> POSIX issue 6, IEEE 1003.1:2004. + +Oh no, what is this about, and do application developers need to care about it? Very simply, it is about the [Name Service Switch](https://en.wikipedia.org/wiki/Name_Service_Switch), frequently referred to as NSS, which allows the `gethostbyname` function to have hotpluggable implementations. The Name Service Switch was a feature introduced to Solaris, which was implemented to allow support for Sun's NIS+ directory service. + +As developers of other operating systems wanted to support software like Kerberos and LDAP, it quickly was reimplemented in other systems as well, such as GNU/Linux. These days, systems running systemd frequently use this feature in combination with a custom NSS module named `nss-systemd` to force use of `systemd-resolved` as the DNS resolver, which has different behavior than the original DNS client derived from BIND that ships in most `libc` implementations. + +An administrator can disable support for DNS lookups entirely, simply by editing the `/etc/nsswitch.conf` file and removing the `dns` module, which means application developers depending on reliable DNS service need to care a lot about this: it means on systems with NSS, your application cannot depend on `gethostbyname` to actually support DNS at all. 
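+
+As an illustration, the `hosts` line of a typical `/etc/nsswitch.conf` on a glibc system looks something like this (the exact contents vary from distribution to distribution):
+
+```
+# /etc/nsswitch.conf (excerpt); sources are tried left to right
+hosts: files dns
+```
+
+Remove `dns` from that line and `gethostbyname` and `getaddrinfo` will consult `/etc/hosts` only, never sending a DNS query at all, which is precisely the scenario described above.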
## musl and DNS + +Given the background above, it should be obvious by now that musl's DNS client was written under the assumption that applications that have specific requirements for DNS would be using a specialized library for this purpose, as `gethostbyname` and `getaddrinfo` are not really suitable APIs, since their behavior is entirely implementation-defined and largely focused around blocking queries to a directory service. + +Because of this, the DNS client was written to behave as simply as possible, but the use of DNS for bulk data distribution, such as in DNSSEC, DKIM and other applications, has led to a desire to implement support for DNS over TCP as an extension to the musl DNS client. + +In practice, this will fix the remaining complaints about the musl DNS client once it lands in a musl release, but application authors depending on reliable DNS performance should really use a dedicated DNS client library for that purpose: using APIs that were designed to simply parse `/etc/hosts` and had DNS support shoehorned into them will always deliver unreliable results. diff --git a/content/blog/the-various-ways-to-check-if-an-integer-is-even.md b/content/blog/the-various-ways-to-check-if-an-integer-is-even.md new file mode 100644 index 0000000..803c155 --- /dev/null +++ b/content/blog/the-various-ways-to-check-if-an-integer-is-even.md @@ -0,0 +1,91 @@ +--- +title: "The various ways to check if an integer is even" +date: "2021-04-27" +--- + +You have probably seen that post going around on Twitter by now about how to check whether a number is even. + +But actually, the way most people test whether a number is even is _wrong._  It's not your fault; computers think differently than we do.  And in most cases, the compiler fixes your mistake for you.  But it's been a long day of talking about Alpine governance, so I thought I would have some fun. + +However, a quick note: for these examples, I am using ML, specifically the OCaml dialect of it.  Translating these expressions to your language, however, should not be difficult, and I will provide C-like syntax for the right answer below too. + +## Using the modulus operator and bitwise math + +The usual way people test whether a number is even is the way they teach you in grade school: `x mod 2 == 0`.  In a C-like language, this would be represented as `x % 2 == 0`.  However, this is actually quite slow, as the `div` instruction is quite expensive on most CPUs. + +There is a much faster way to check if a number is even or odd, but to understand why it is faster, we should discuss some number theory first.  Whether a number is even or odd ultimately comes down to a single number: `1`. + +There are two numbers in the entire universe that have the property that they are the same number in _any_ number system we use today: `0` is always zero, and `1` is always one.  This holds true for binary (base 2), octal (base 8), decimal (base 10), and hexadecimal (base 16). + +Accordingly, we can use binary logic to test whether a number is even or not by testing whether it ends in `1` when represented as binary.  But many programmers probably don't actually know how to do this -- it doesn't usually come up when you're writing a web app, after all. + +The answer is to use logical and: `x land 1 == 0` (or in C, `(x & 1) == 0`; the parentheses matter there, because `==` binds more tightly than `&` in C).
We can prove that both expressions are functionally equivalent, by defining both testing functions and testing for the same output: + +``` +# let evenMod x = x mod 2 == 0;; +val evenMod : int -> bool = <fun> +# let evenAnd x = x land 1 == 0;; +val evenAnd : int -> bool = <fun> +# let evenMatches x = evenMod(x) == evenAnd(x);; +val evenMatches : int -> bool = <fun> +# evenMatches(0);; +- : bool = true +# evenMatches(1);; +- : bool = true +# evenMatches(2);; +- : bool = true +# evenMatches(3);; +- : bool = true +``` + +As you can see, both are equivalent.  And to be clear, **this is the right way to test whether an integer is even**.  The other ways below are intended to be a joke.  Also, most compilers will optimize `x mod 2 == 0` to `x land 1 == 0`. + +## Using functional programming + +The nice thing about math is that there's always one way to prove something, especially when there's more than one way.  Modulus operator?  Bitwise logic?  Please.  We're going to solve this problem [the way Alonzo Church intended](https://en.wikipedia.org/wiki/Lambda_calculus).  But to do that, we need to think about what _actually makes a number even_.  The answer is simple, of course: an even number is _one which is not odd_.  But what is an odd number?  Well, one that isn't even of course. + +But can we really apply this circular logic to code?  Of course we can! + +``` +# let rec isEven x = +    x = 0 || isOdd (x - 1) +  and isOdd x = +    x <> 0 && isEven (x - 1);; +val isEven : int -> bool = <fun> +val isOdd : int -> bool = <fun> +# isEven(0);; +- : bool = true +# isEven(1);; +- : bool = false +# isEven(2);; +- : bool = true +# isEven(3);; +- : bool = false +``` + +As you can see, we've succeeded in proving that an even number is clearly not odd! + +## Using pattern matching + +In 1962, Bell Labs [invented pattern matching, as part of the SNOBOL language](https://en.wikipedia.org/wiki/SNOBOL).  Pattern matching has become a popular programming language feature, being implemented in not just SNOBOL, but also Erlang, Elixir, Haskell, ML, Rust and many more.  But can we use it to determine if a number is even or odd?  Absolutely. + +``` +# let rec isEven x = +    match x with +    | 0 -> true +    | 1 -> false +    | 2 -> true +    | x -> isEven(x - 2);; +val isEven : int -> bool = <fun> +# isEven(0);; +- : bool = true +# isEven(1);; +- : bool = false +# isEven(2);; +- : bool = true +# isEven(3);; +- : bool = false +# isEven(4);; +- : bool = true +# isEven(5);; +- : bool = false +``` + +As you can see, we have demonstrated many ways to test if a number is even or not.  Some are better than others, but others are more amusing. diff --git a/content/blog/the-vulnerability-remediation-lifecycle-of-alpine-containers.md b/content/blog/the-vulnerability-remediation-lifecycle-of-alpine-containers.md new file mode 100644 index 0000000..c13424d --- /dev/null +++ b/content/blog/the-vulnerability-remediation-lifecycle-of-alpine-containers.md @@ -0,0 +1,235 @@ +--- +title: "the vulnerability remediation lifecycle of Alpine containers" +date: "2021-06-08" +--- + +Anybody who has the responsibility of maintaining a cluster of systems knows about the vulnerability remediation lifecycle: vulnerabilities are discovered, disclosed to vendors, mitigated by vendors and then consumers deploy the mitigations as they update their systems. + +In the proprietary software world, the deployment phase is [colloquially known as Patch Tuesday](https://en.wikipedia.org/wiki/Patch_Tuesday), because many vendors release patches on the second and fourth Tuesday of each month.
But how does all of this actually _happen_, and how do you know what patches you actually _need_?

I thought it might be nice to look at all the moving pieces that exist in Alpine's remediation lifecycle, beginning from discovery of the vulnerability, to disclosure to Alpine, to user remediation.  For this example, we will track CVE-2016-20011, which I just fixed in Alpine: a minor vulnerability in the `libgrss` library concerning a lack of TLS certificate validation when fetching `https` URIs.

## The vulnerability itself

GNOME's `libsoup` is an HTTP client/server library for the GNOME platform, analogous to `libcurl`.  It has two sets of session APIs: the newer `SoupSession` API and the older `SoupSessionSync`/`SoupSessionAsync` family of APIs.  As a result of creating the newer `SoupSession` API, it was discovered at some point that the older `SoupSessionSync`/`SoupSessionAsync` APIs did not enable TLS certificate validation by default.

As a result of discovering that design flaw in `libsoup`, Michael Catanzaro, one of the `libsoup` maintainers, began to audit users of `libsoup` in the GNOME platform.  One such user of `libsoup` is `libgrss`, which did not take any steps to enable TLS certificate validation on its own, so Michael [opened a bug against it in 2016](https://bugzilla.gnome.org/show_bug.cgi?id=772647).

Five years passed, and he decided to check up on these bugs.  That led to the [filing of a new bug in GNOME's gitlab against `libgrss`](https://gitlab.gnome.org/GNOME/libgrss/-/issues/4), as the GNOME bugzilla service is in the process of being turned down.  As `libgrss` was still broken in 2021, he requested a CVE identifier for the vulnerability, and was issued [CVE-2016-20011](https://cve.circl.lu/cve/CVE-2016-20011).

### How do CVE identifiers get determined, anyway?

You might notice that the CVE identifier he was issued is CVE-**2016**-20011, even though it is presently 2021.  Normally, CVE identifiers use the current year, as requesting a CVE identifier is usually an early step in the disclosure process, but CVE identifiers are _actually_ grouped by the year in which a vulnerability was first publicly disclosed.  In the case of CVE-2016-20011, the identifier was assigned to 2016 because the public GNOME bugzilla report was filed in 2016.

The CVE website at MITRE has [more information about how CVE identifiers are grouped](https://cve.mitre.org/about/faqs.html#year_portion_of_cve_id) if you want to know more.

## The National Vulnerability Database

Our vulnerability was issued CVE-2016-20011, but how does Alpine actually find out about it?  The answer is quite simple: [the NVD](https://nvd.nist.gov).  When a CVE identifier is issued, information about the vulnerability is forwarded along to the National Vulnerability Database activity at NIST, a US government agency.  The NVD consumes CVE data and enriches it with additional links and information about the vulnerability.  They also generate Common Platform Enumeration (CPE) rules, which are intended to map the vulnerability to an actual product and set of versions.

Common Platform Enumeration rules consist of a CPE URI, which tries to map a vulnerability to an ecosystem and product name, and an optional set of version range constraints.  For CVE-2016-20011, the NVD staff issued a CPE URI of `cpe:2.3:a:gnome:libgrss:*:*:*:*:*:*:*:*` and a version range constraint of `<= 0.7.0`. 
+ +## security.alpinelinux.org + +The final step in vulnerability information making its way to Alpine is the security team's [issue tracker](https://security.alpinelinux.org/).  Every hour, we download the latest version of the `CVE-Modified` and `CVE-Recent` f[eeds offered by the National Vulnerability Database activity](https://nvd.nist.gov/vuln/data-feeds#JSON_FEED).  We then use those feeds to update our own internal vulnerability tracking database. + +Throughout the day, the security team pulls various reports from the vulnerability tracking database, [for example a list of potential vulnerabilities in `edge/community`](https://security.alpinelinux.org/branch/edge-community).  The purpose of checking these reports is to see if there are any new vulnerabilities to investigate. + +As `libgrss` is in `edge/community`, [CVE-2016-20011 appeared on that report](https://security.alpinelinux.org/vuln/CVE-2016-20011). + +## Mitigation + +Once we start to work a vulnerability, there are a few steps that we take.  First, we research the vulnerability, by checking the links provided to us through the CVE feed and other feeds the security tracker consumes.  The NVD staff are usually very quick at linking to `git` commits and other data we can use for mitigating the vulnerability.  However, sometimes, such as in the case of CVE-2016-20011, there is no longer an active upstream maintainer of the package, and [we have to mitigate the issue ourselves](https://gitlab.gnome.org/GNOME/libgrss/-/merge_requests/7). + +Once we have a patch that is known to fix the issue, we prepare a [software update and push it to `aports.git`](https://git.alpinelinux.org/aports/commit/?id=95041a2fc6599d30708c3ae51c58638ba7e27dbd).  We then [backport the security fix to other branches in `aports.git`](https://git.alpinelinux.org/aports/commit/?h=3.13-stable&id=2f6b3a650df492a31aebf6c040893cf0def8d3d1). + +Once the fix is committed to all of the appropriate branches, the [build servers take over](https://build.alpinelinux.org/), building a new version of the package with the fixes.  The build servers then upload the new packages to the master mirror, and from there, they get distributed [through the mirror network](https://mirrors.alpinelinux.org/) to Alpine's user community. + +## Remediation + +At this point, if you're a casual user of Alpine, you would just do something like `apk upgrade -Ua` and move on with your life, knowing that your system is up to date. + +But what if you're running a cluster of hundreds or thousands of Alpine servers and containers?  How would you know what to patch?  What should be prioritized? + +To solve those problems, there are security scanners, which can check containers, images and filesystems for vulnerabilities.  Some are proprietary software, but there are many options that are free.  However, security scanners are not perfect, like Alpine's vulnerability investigation tool, they sometimes generate both false positives and false negatives. + +Where do security scanners get their data?  In most cases for Alpine systems, they get their data from the [Alpine security database](https://secdb.alpinelinux.org/), a product maintained by the Alpine security team.  Using that database, they check the apk `installed` database to see what packages and versions are installed in the system.  Let's look at a few of them. + +## Creating a test case by mixing Alpine versions + +**Note:** You should never _actually_ mix Alpine versions like this.  
If done in an uncontrolled way, you risk system unreliability and your security scanning solution won't know what to do as each Alpine version's security database is specific to _that_ version of Alpine.  Don't create a franken-alpine! + +In the case of `libgrss`, we know that `0.7.0-r1` and newer have a fix for CVE-2016-20011, but the security fix has already been published.  So, where can we get `0.7.0-r0`?  We can get it from Alpine 3.12 of course.  Accordingly, we make a filesystem with `apk` and install Alpine 3.12 into it: + +nanabozho:~# apk add --root ~/test-image --initdb --allow-untrusted -X http://dl-cdn.alpinelinux.org/v3.12/main -X http://dl-cdn.alpinelinux.org/v3.12/community alpine-base libgrss-dev=0.7.0-r0 +\[...\] +OK: 126 MiB in 92 packages +nanabozho:~# apk upgrade --root ~/test-image -X http://dl-cdn.alpinelinux.org/v3.13/main -X http://dl-cdn.alpinelinux.org/v3.13/community +\[...\] +OK: 127 MiB in 98 packages +nanabozho:~# apk info --root ~/test-image libgrss +Installed: Available: +libgrss-0.7.0-r0 ? +nanabozho:~# cat ~/test-image/etc/alpine-release +3.13.5 + +Now that we have our image, lets see what detects the vulnerability, and what doesn't. + +### trivy + +[Trivy](https://github.com/aquasecurity/trivy) is considered by many to be the most reliable scanner for Alpine systems, but can it detect this vulnerability?  In theory, it should be able to. + +I have installed `trivy` to `/usr/local/bin/trivy` on my machine by downloading the go binary from the GitHub release.  They have a script that can do this for you, but I'm not a huge fan of `curl | sh` type scripts. + +To scan a filesystem image with trivy, you do `trivy fs /path/to/filesystem`: + +nanabozho:~# trivy fs -f json ~/test-image/ +2021-06-07T23:48:40.308-0600 INFO Detected OS: alpine +2021-06-07T23:48:40.308-0600 INFO Detecting Alpine vulnerabilities... +2021-06-07T23:48:40.309-0600 INFO Number of PL dependency files: 0 +\[ + { + "Target": "localhost (alpine 3.13.5)", + "Type": "alpine" + } +\] + +Hmm, that's strange.  I wonder why? + +nanabozho:~# trivy --debug fs ~/test-image/ +2021-06-07T23:42:54.036-0600 DEBUG Severities: UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL +2021-06-07T23:42:54.038-0600 DEBUG cache dir: /root/.cache/trivy +2021-06-07T23:42:54.039-0600 DEBUG DB update was skipped because DB is the latest +2021-06-07T23:42:54.039-0600 DEBUG DB Schema: 1, Type: 1, UpdatedAt: 2021-06-08 00:19:21.979880152 +0000 UTC, NextUpdate: **2021-06-08 12:19:21.979879952 +0000 UTC**, DownloadedAt: 2021-06-08 05:23:09.354950757 +0000 UTC + +Ah, trivy's security database only updates twice per day, so trivy has not become aware of CVE-2016-20011 being mitigated by `libgrss-0.7.0-r1` yet. + +I rebuilt trivy's database locally and put it in `~/.cache/trivy/db/trivy.db`: + +nanabozho:~# trivy fs -f json ~/test-image/ +2021-06-08T01:37:20.574-0600 INFO Detected OS: alpine +2021-06-08T01:37:20.574-0600 INFO Detecting Alpine vulnerabilities... 
+2021-06-08T01:37:20.576-0600 INFO Number of PL dependency files: 0 +\[ + { + "Target": "localhost (alpine 3.13.5)", + "Type": "alpine", + "Vulnerabilities": \[ + { + "VulnerabilityID": "CVE-2016-20011", + "PkgName": "libgrss", + "InstalledVersion": "0.7.0-r0", + "FixedVersion": "0.7.0-r1", + "Layer": { + "DiffID": "sha256:4bd83511239d179fb096a1aecdb2b4e1494539cd8a0a4edbb58360126ea8d093" + }, + "SeveritySource": "nvd", + "PrimaryURL": "https://avd.aquasec.com/nvd/cve-2016-20011", + "Description": "libgrss through 0.7.0 fails to perform TLS certificate verification when downloading feeds, allowing remote attackers to manipulate the contents of feeds without detection. This occurs because of the default behavior of SoupSessionSync.", + "Severity": "HIGH", + "CweIDs": \[ + "CWE-295" + \], + "CVSS": { + "nvd": { + "V2Vector": "AV:N/AC:L/Au:N/C:N/I:P/A:N", + "V3Vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N", + "V2Score": 5, + "V3Score": 7.5 + } + }, + "References": \[ + "https://bugzilla.gnome.org/show\_bug.cgi?id=772647", + "https://gitlab.gnome.org/GNOME/libgrss/-/issues/4" + \], + "PublishedDate": "2021-05-25T21:15:00Z", + "LastModifiedDate": "2021-06-01T17:03:00Z" + }, + { + "VulnerabilityID": "CVE-2016-20011", + "PkgName": "libgrss-dev", + "InstalledVersion": "0.7.0-r0", + "FixedVersion": "0.7.0-r1", + "Layer": { + "DiffID": "sha256:4bd83511239d179fb096a1aecdb2b4e1494539cd8a0a4edbb58360126ea8d093" + }, + "SeveritySource": "nvd", + "PrimaryURL": "https://avd.aquasec.com/nvd/cve-2016-20011", + "Description": "libgrss through 0.7.0 fails to perform TLS certificate verification when downloading feeds, allowing remote attackers to manipulate the contents of feeds without detection. This occurs because of the default behavior of SoupSessionSync.", + "Severity": "HIGH", + "CweIDs": \[ + "CWE-295" + \], + "CVSS": { + "nvd": { + "V2Vector": "AV:N/AC:L/Au:N/C:N/I:P/A:N", + "V3Vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N", + "V2Score": 5, + "V3Score": 7.5 + } + }, + "References": \[ + "https://bugzilla.gnome.org/show\_bug.cgi?id=772647", + "https://gitlab.gnome.org/GNOME/libgrss/-/issues/4" + \], + "PublishedDate": "2021-05-25T21:15:00Z", + "LastModifiedDate": "2021-06-01T17:03:00Z" + } + \] + } +\] + +Ah, that's better. + +### clair + +[Clair](https://github.com/quay/clair) is a security scanner previously written by the CoreOS team, and now maintained by Red Hat.  It is considered the gold standard for security scanning of containers.  How does it do with the filesystem we baked? + +nanabozho:~# clairctl report ~/test-image/ +2021-06-08T00:11:04-06:00 ERR error="UNAUTHORIZED: authentication required; \[map\[Action:pull Class: Name:root/test-image Type:repository\]\]" + +Oh, right, it can't just scan a filesystem.  One second. + +nanabozho:~$ cd ~/dev-src/clair +nanabozho:~$ make local-dev-up-with-quay +\[a bunch of commands later\] +nanabozho:~$ clairctl report test-image:1 +test-image:1 found libgrss 0.7.0-r0 CVE-2016-20011 (fixed: 0.7.0-r1) + +As you can see, clair does succeed in finding the vulnerability, when you bake an actual Docker image and publish it to a local quay instance running on localhost. + +But this is really a lot of work to just scan for vulnerabilities, so I wouldn't recommend clair for that. + +### grype + +[grype](https://github.com/anchore/grype) is a security scanner made by Anchore.  They talk a lot about how Anchore's products can also be used to build a Software Bill of Materials for a given image.  
Let's see how it goes with our test image: + +nanabozho:~# grype dir:~/test-image/ +✔ Vulnerability DB \[updated\] +✔ Cataloged packages \[98 packages\] +✔ Scanned image \[3 vulnerabilities\] +NAME INSTALLED FIXED-IN VULNERABILITY SEVERITY +libgrss 0.7.0-r0 (fixes indeterminate) CVE-2016-20011 High +libxml2 2.9.10-r7 (fixes indeterminate) CVE-2019-19956 High +openrc 0.42.1-r19 (fixes indeterminate) CVE-2018-21269 Medium + +grype does detect that a vulnerable `libgrss` is installed, but the `(fixes indeterminate)` seems fishy to me.  There also appear to be some other hits that the other scanners didn't notice.  Lets fact check this against a pure Alpine 3.13 container: + +nanabozho:~# grype dir:~/test-image-pure/ +✔ Vulnerability DB \[no update available\] +✔ Cataloged packages \[98 packages\] +✔ Scanned image \[3 vulnerabilities\] +NAME INSTALLED FIXED-IN VULNERABILITY SEVERITY +libgrss 0.7.0-r1 (fixes indeterminate) CVE-2016-20011 High +libxml2 2.9.10-r7 (fixes indeterminate) CVE-2019-19956 High +openrc 0.42.1-r19 (fixes indeterminate) CVE-2018-21269 Medium + +Oh no, it detects `0.7.0-r1` as vulnerable too, which I assume is simply because Anchore's database hasn't updated yet.  Researching the other two vulnerabilities, the `openrc` one seems to be a vulnerability we missed, while the `libxml2` one is a false positive. + +I think, however, it is important to note that Anchore's scanning engine assumes a package is vulnerable if there is a CVE and the distribution hasn't acknowledged a fix.  That may or may not actually be reliable enough of the time, but it is an admittedly interesting approach. + +## Conclusion + +For vulnerability scanning, I have to recommend either trivy or grype.  Clair is really complicated to set up and is really geared at people scanning entire container registries at once.  In general, I would recommend trivy over grype simply because it does not speculate about unconfirmed vulnerabilities, which I think is a distraction to developers, but I think grype has a lot of potential as well, though they may want to add the ability to only scan for confirmed vulnerabilities. + +In general, I hope this blog entry answers a lot of questions about the remediation lifecycle in general as well. diff --git a/content/blog/the-whole-freenode-kerfluffle.md b/content/blog/the-whole-freenode-kerfluffle.md new file mode 100644 index 0000000..cb7b8d8 --- /dev/null +++ b/content/blog/the-whole-freenode-kerfluffle.md @@ -0,0 +1,42 @@ +--- +title: "the whole freenode kerfluffle" +date: "2021-05-20" +--- + +> But the thing is IRC has always been a glorious thing. The infra has always been sponsored by companies or people. But the great thing about IRC is you can always vote and let the networks and world know which you choose - by using /server. +> +> — Andrew Lee (rasengan), chairman of freenode limited + +Yesterday, operational control over freenode was taken over by Andrew Lee, the person [who has been owner of freenode limited since 2017](https://find-and-update.company-information.service.gov.uk/company/10308021/officers).  Myself and others have had questions about this arrangement since we noticed the change in ownership interest in freenode limited back in 2017. + +Historically, freenode staff had stated that everything was under control and that Andrew's involvement in freenode limited had no operational impact on the network.  It turns out that Christel was lying to them: Andrew had operational control and legal authority over the freenode domains.  
This lead to [several current volunteers drafting their resignation letters](https://fuchsnet.ch/freenode-resign-letter.txt). + +When I asked Andrew about the [current state of the freenode domain](http://distfiles.dereferenced.org/stuff/rasengan-log.txt), one of his associates who I hadn't spoken to in months (since terminating the Ophion project I was doodling on during lockdown) came out of nowhere and started offering me [bribes of staff privileges and money for Alpine](https://distfiles.dereferenced.org/stuff/nirvana-log.txt).  These developments were concerning to the Alpine council and interim technical committee, so we scheduled an event at AlpineConf to talk about the situation. + +Our initial conclusion was that we should wait until the end of the month and see how the situation shakes out, and possibly plan to stand up our own IRC infrastructure or use another network.  Then this happened yesterday: + +\[02:54:38\] <-- ChanServ (ChanServ@services.) has quit (Killed (grumble (My fellow staff so-called 'friends' are about to hand over account data to a non-staff member. If you care about your data, drop your NickServ account NOW before that happens.))) + +Given that situation, members of the Alpine council and technical committee gathered together to discuss the situation.  We decided to move to OFTC immediately, as we wanted to give users the widest window of opportunity to delete their data.  This move [has now been concluded](https://alpinelinux.org/posts/Switching-to-OFTC.html), and I appreciate the help of the OFTC IRC network staff as well as the Alpine infrastructure team to migrate all of our IRC-facing services across.  The fact that we were able to move so quickly without much disruption is a testament to the fact that IRC and other open protocols like it are vital for the free software community. + +## So, why does he want to control freenode anyway? + +I have had the pleasure of using freenode since 2003, and have been a staff member on several occasions.  My work on IRC, such as starting the IRCv3 project and writing charybdis and atheme, was largely motivated by a desire to improve freenode.  It is unfortunate that one person's desire for control over an IRC network has lead to so much destruction. + +But why is he actually driven to control these IRC networks?  Many believe it is about data mining, or selling the services database, or some other boring but sensible explanation. + +But that's not why.  What I believe to be the real answer is actually much sadder. + +I spent several months talking to Andrew and his associate, Shane, last year during lockdown while I was [writing an IRCX server](https://github.com/ophion-project/ophion) (I didn't have much to do last summer during lockdown and I had always wanted to write an IRCX server).  Shane linked a server to my testnet because he was enthusiastic about IRCX, he had previously been a user on `irc.msn.com`.  Both he and Andrew acted as IRCops on the server they linked.  In that time, I learned a lot about both of them, what their thought processes are, how they operate. + +In December 2018, Andrew acquired the irc.com domain.  On that domain, he wrote a post titled [Let's take IRC further](http://web.archive.org/web/20181207230330/https://www.irc.com/lets-take-irc-further).  Based on this post, we can gather a few details about Andrew's childhood: he grew up as a marginalized person, and as a result of that marginalization, he was bullied.  
IRC was his outlet, a space for him that was actually safe for him to express himself.  Because of that, he was able to learn about technology and free software. + +Because of this, I believe Andrew's intention is to preserve IRC as it was formative in his transition from childhood to adulthood.  He finds IRC to be comforting, in the same way that I find the [bunny plushie](https://www.jellycat.com/us/bashful-cream-bunny-bas3bc/) I sleep with to be comforting.  This is understandable to me, as many people strongly desire to preserve the environment they proverbially grew up in. + +However, in implementing his desire to preserve the IRC network he grew up on, he has effectively destroyed it: projects are leaving or planning to leave en masse, which is sad. + +Whether you want to participate in [Andrew's imaginary kingdom or not](https://en.wikipedia.org/wiki/Role-playing) is up to you, but I believe the current situation to be untenable for the free software community.  We cannot depend on an IRC network where any criticism of Andrew may be perceived by him as a traumatic experience. + +I strongly encourage everyone to move their projects to either [OFTC](https://oftc.net/) or [Libera Chat](https://libera.chat/).  I will be disconnecting from freenode on May 22nd, and I have no plans to ever return. + +And to the volunteers who kept the network going, with whom I had the privilege on several occasions over the years of working with: I wish you luck with Libera Chat. diff --git a/content/blog/there-is-no-such-thing-as-a-glibc-based-alpine-image.md b/content/blog/there-is-no-such-thing-as-a-glibc-based-alpine-image.md new file mode 100644 index 0000000..c279509 --- /dev/null +++ b/content/blog/there-is-no-such-thing-as-a-glibc-based-alpine-image.md @@ -0,0 +1,34 @@ +--- +title: "there is no such thing as a \"glibc based alpine image\"" +date: "2021-08-26" +--- + +For whatever reason, [the `alpine-glibc` project](https://github.com/sgerrand/alpine-pkg-glibc) is [apparently being used in production](https://github.com/adoptium/containers/issues/1#issuecomment-905522460).  Worse yet, some are led to believe that Alpine officially supports or at least approves of its usage.  For the reasons I am about to outline, we don't.  I have also [proposed an update to Alpine which will block the installation of the `glibc` packages](https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/24647) produced by the `alpine-glibc` project, and have [referred acceptance of that update to the TSC](https://gitlab.alpinelinux.org/alpine/tsc/-/issues/17) to determine if we actually want to put our foot down or not.  I have additionally suggested that the TSC may wish to have the Alpine Council reach out to the `alpine-glibc` project to find a solution which appropriately communicates that the project is not supported in any way by Alpine.  It should be hopefully clear that there is no such thing as a "glibc based alpine image" because Alpine does not use glibc, it uses musl. + +**Update:** the TSC has decided that it is better to approach this problem as a documentation issue.  We will therefore try to identify common scenarios, including using the glibc package, that cause stability issues to Alpine and document them as scenarios that should ideally be avoided. + +## What the `alpine-glibc` project actually does + +The `alpine-glibc` project attempts to package the GNU C library (`glibc`) in such a way that it can be used on Alpine transparently.  
However, it is conceptually flawed, because it uses system libraries where available, which have been compiled against the `musl` C library.  Combining code built for `musl` with code built for `glibc` is like trying to run Windows programs on OS/2: both understand `.EXE` files to some extent, but they are otherwise very different. + +But why are they different?  They are both libraries designed to run [ELF binaries](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format), after all.  The answer is due to differences in the [application binary interface](https://en.wikipedia.org/wiki/Application_binary_interface), also known as an ABI.  Specifically, `glibc` supports and heavily uses a backwards compatibility technique called [symbol versioning](https://www.akkadia.org/drepper/symbol-versioning), and `musl` does not support it at all. + +## How symbol versioning works + +Binary programs, such as those compiled against `musl` or `glibc`, have something called a _symbol table_.  The symbol table contains a list of symbols needed from the system libraries, for example the C library functions like `printf` are known as _symbols_.  When a binary program is run, it is not executed directly by the kernel: instead, a special program known as an _ELF interpreter_ is loaded, which sets up the mapping from symbols in the symbol table to the actual locations where those symbols exist.  That mapping is known as the _global object table_. + +On a system with symbol versioning, additional data in the symbol table designates what version of a symbol is actually wanted.  For example, when you request `printf` on a glibc system, you might actually wind up requesting `printf@GLIBC_2_34` or some other kind of versioned symbol.  This allows newer programs to prefer the newer `printf` function, while older programs can reference an older version of the implementation.  That allows for low-cost backwards compatibility: all you have to do is keep around the old versions of the routines until you decide to drop support in the ABI for them. + +## Why mixing these two worlds is bad + +However, if you combine a world expecting symbol versioning and one which does not, you wind up with undefined behavior.  For very simple programs, it appears to work, but for more complicated programs, you will wind up with strange behavior and possible crashes, as the _global object table_ references routines with different behavior than expected by the program.  For example, a program expecting a C99 compliant `printf` routine will get one on musl if it asks for `printf`.  But a program expecting a C99 compliant `printf` routine on glibc will ask for `printf@GLIBC_2_12` or similar. + +The symbol versioning problem spreads to the system libraries too: on Alpine, libraries don't provide versioned symbols: instead, you get the latest version of each symbol.  But if a glibc program is expecting `foo` to be an older routine without the semantics of the current implementation of `foo`, then it will either crash or do something weird. + +This has security impacts: the lack of consistency for whether versioned symbols are actually supported by the system basically turns any interaction with versioned symbols into what is called a [weird machine](https://en.wikipedia.org/wiki/Weird_machine).  This means that an attacker possibly controls more attack surface than they would in a situation where the system ran either pure glibc or pure musl. 
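
To make the mismatch more concrete, here is a small sketch of what an explicitly versioned symbol reference looks like from the program's side.  This is illustrative only: normally the toolchain records a version for you when linking against glibc, and the `GLIBC_2.2.5` tag below is simply the x86_64 glibc baseline version node, not something musl provides.

```
/* illustrative sketch: bind our memcpy reference to a specific glibc
 * version node using the GNU assembler's .symver directive.  A normal
 * glibc-built binary ends up with references like this implicitly. */
#include <string.h>

__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

int main(void) {
    char dst[4];

    memcpy(dst, "abc", sizeof dst);
    return 0;
}
```

On a musl system there are no version nodes at all, so a binary carrying references like `memcpy@GLIBC_2.2.5` has nothing to bind them to, which is exactly the mismatch described above.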
+ +## Alternatives to `alpine-glibc` + +As `alpine-glibc` is primarily discussed in the context of containers, we will keep this conversation largely focused on that.  There are a few options if you want a small container to run a binary blob linked against `glibc` in a container that are far better than using `alpine-glibc`.  For example, you can use [Google's distroless tools](https://github.com/GoogleContainerTools/distroless), which will build a Debian-based container with only the application and its runtime dependencies, which allows for a fairly small container.  You can also try to use [the `gcompat` package](https://git.adelielinux.org/adelie/gcompat), which emulates the GNU C library ABI in the same way that WINE emulates the Windows ABI. + +But whatever you do, you shouldn't use the `alpine-glibc` project to do this.  You will wind up with something completely broken. diff --git a/content/blog/to-secure-the-supply-chain-you-must-properly-fund-it.md b/content/blog/to-secure-the-supply-chain-you-must-properly-fund-it.md new file mode 100644 index 0000000..1e6a5f4 --- /dev/null +++ b/content/blog/to-secure-the-supply-chain-you-must-properly-fund-it.md @@ -0,0 +1,12 @@ +--- +title: "to secure the supply chain, you must properly fund it" +date: "2021-12-11" +--- + +Yesterday, a new [0day vulnerability dropped in Apache Log4j](https://nvd.nist.gov/vuln/detail/CVE-2021-44228). It turned out to be worse than the initial analysis: because of recursive nesting of substitutions, [it is possible to execute remote code in any program which passes user data to Log4j for logging](https://twitter.com/_StaticFlow_/status/1469358229767475205?s=20). Needless to say, the way this disclosure was handled was a disaster, as it was quickly discovered that many popular services were using Log4j, but how did we get here? + +Like many projects, Log4j is only maintained by volunteers, and because of this, coordination of security response is naturally more difficult: a coordinated embargo is easy to coordinate, if you have a dedicated maintainer to do it. In the absence of a dedicated maintainer, you have chaos: as soon as a commit lands in git to fix a bug, the race is on: security maintainers are scurrying to reverse engineer what the bug you fixed was, which is why vulnerability embargoes can be helpful. + +It turns out that like many other software projects in the commons, Log4j does not have a dedicated maintainer, while corporations make heavy use of the project, and so, as usual, the maintainers have to beg for scraps from their fellow peers or the corporations that use the code. Incidentally, [one of the Log4j maintainers' GitHub sponsors profile is here](https://github.com/sponsors/rgoers), if you would like to contribute some money to his cause. + +When corporations sponsor the maintenance of the FOSS projects they use, they are effectively buying an insurance policy that guarantees a prompt, well-coordinated response to security problems. The newly established Open Source Program Offices at these companies should ponder which is more expensive: $100k/year salary for a maintainer of a project they are heavily dependent upon, or millions in damages from data breaches when a security vulnerability causes serious customer data exposure, like this one. 
diff --git a/content/blog/trustworthy-computing-in-2021.md b/content/blog/trustworthy-computing-in-2021.md new file mode 100644 index 0000000..ef1935c --- /dev/null +++ b/content/blog/trustworthy-computing-in-2021.md @@ -0,0 +1,137 @@ +--- +title: "Trustworthy computing in 2021" +date: "2021-10-19" +--- + +Normally, when you hear the phrase “trusted computing,” you think about schemes designed to create roots of trust for companies, rather than the end user. For example, Microsoft’s Palladium project during the Longhorn development cycle of Windows is a classically cited example of trusted computing used as a basis to enforce Digital Restrictions Management against the end user. + +However, for companies and software maintainers, or really anybody who is processing sensitive data, maintaining a secure chain of trust is paramount, and that root of trust is always the hardware. In the past, this was not so difficult: we had very simple computers, usually with some sort of x86 CPU and a BIOS, which was designed to be just enough to get DOS up and running on a system. This combination resulted in something trivial to audit and for the most part everything was fine. + +More advanced systems of the day, like the Macintosh and UNIX workstations such as those sold by Sun and IBM used implementations of IEEE-1275, [also known as Open Firmware](https://en.wikipedia.org/wiki/Open_Firmware). Unlike the BIOS used in the PC, Open Firmware was written atop a small Forth interpreter, which allowed for a lot more flexibility in handling system boot. Intel, noting the features that were enabled by Open Firmware, ultimately decided to create their own competitor called the [Extensible Firmware Interface](https://en.m.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface), which was launched with the Itanium. + +Intel’s EFI evolved into an architecture-neutral variant known as the Unified Extensible Firmware Interface, frequently referred to as UEFI. For the most part, UEFI won against Open Firmware: the only vendor still supporting it being IBM, and only as a legacy compatibility option for their POWER machines. Arguably the demise of Open Firmware was more related to industry standardization on x86 instead of the technical quality of UEFI however. + +So these days the most common architecture is x86 with UEFI firmware. Although many firmwares out there are complex, this in and of itself isn’t impossible to audit: most firmware is built on top of TianoCore. However, it isn’t ideal, and is not even the largest problem with modern hardware. + +## Low-level hardware initialization + +Most people when asked how a computer boots, would say that UEFI is the first thing that the computer runs, and then that boots into the operating system by way of a boot loader. And, for the most part, due to magic, this is a reasonable assumption for the layperson. But it isn’t true at all. + +In reality, most machines have either a dedicated service processor, or a special execution mode that they begin execution in. Regardless of whether a dedicated service processor (like the AMD PSP, older Intel ME, various ARM SoCs, POWER, etc.) or a special execution mode (newer Intel ME), system boot starts by executing code burned into a _mask rom_, which is part of the CPU circuitry itself. + +Generally the _mask rom_ code is designed to bring up just enough of the system to allow transfer of execution to a platform-provided payload. 
In other words, the _mask rom_ typically brings up the processor’s core complex, and then jumps into platform-specific firmware in NOR flash, which then gets you into UEFI or Open Firmware or whatever your device is running that is user-facing. + +Some _mask roms_ initialize more, others less. As they are immutable, they cannot be tampered with on a targeted basis. However, once the main core complex is up, sometimes the service processor (or equivalent) sticks around and is still alive. In situations where the service processor remains operational, there is the possibility that it can be used as a backdoor. Accordingly, the behavior of the service processor must be carefully considered when evaluating the trustworthiness of a system. + +One can ask a few simple questions to evaluate the trustworthiness of a system design, assuming that the worst case scenario is assumed for any question where the answer is unknown. These questions are: + +- How does the system boot? Does it begin executing code at a hardwired address or is there a service processor? +- If there is a service processor, what is the initialization process that the service processor does? Is the mask rom and intermediate firmware auditable? Has it already been audited by a trusted party? +- What components of the low level init process are stored in NOR flash or similar? What components are immutable? +- What other functions does the service processor perform? Can they be disabled? Can the service processor be instructed to turn off? + +## System firmware + +The next point of contention, of course, is the system firmware itself. On most systems today, this is an implementation of UEFI, either Aptio or InsydeH2O. Both are derived from the open source TianoCore EDK codebase. + +In most cases, these firmwares are too complicated for an end user to audit. However, some machines support [coreboot](https://en.wikipedia.org/wiki/Coreboot), which can be used to replace the proprietary UEFI with a system firmware of your choosing, including one built on TianoCore. + +From a practical perspective, the main point of consideration at the firmware level is whether the trust store can be modified. UEFI mandates the inclusion of Microsoft’s signing key by default, but if you can uninstall their key and install your own, it is possible to gain some trustworthiness from the implementation, assuming it is not backdoored. This should be considered a minimum requirement for gaining some level of trust in the system firmware, but ultimately if you cannot audit the firmware, then you should not extend high amounts of trust to it. + +## Resource isolation + +A good system design will attempt to [isolate resources using IOMMUs](https://en.wikipedia.org/wiki/Input–output_memory_management_unit). This is because external devices, such as those on the PCIe bus should not be trusted with unrestricted access to system memory, as they can potentially be backdoored. + +It is sometimes possible to use virtualization technology to create barriers between PCIe devices and the main OS. Qubes OS for example [uses the Xen hypervisor and dedicated VMs to isolate specific pieces of hardware and their drivers](https://www.qubes-os.org/doc/architecture/). + +Additionally, with appropriate use of IOMMUs, system stability is improved, as badly behaving hardware and drivers cannot crash the system. + +## A reasonably secure system + +Based on the discussion above, we can conclude some properties of what a secure system would look like. 
Not all systems evaluated later in this blog will have all of these properties. But we have a framework none the less, where the more properties that are there indicate a higher level of trustworthiness: + +- The system should have a hardware initialization routine that is as simple as possible. +- The service processor, if any, should be restricted to hardware initialization and tear down and should not perform any other functionality. +- The system firmware should be freely available and reproducible from source. +- The system firmware must allow the end user to control any signing keys enrolled into the trust store. +- The system should use IOMMUs to mediate I/O between the main CPU and external hardware devices like PCIe cards and so on. + +## How do systems stack up in the real world? + +Using the framework above, lets look at a few of the systems I own and see how trustworthy they actually are. The results may surprise you. These are systems that anybody can purchase, without having to do any sort of hardware modifications themselves, from reputable vendors. Some examples are intentionally silly, in that while they are secure, you wouldn't actually want to use them today for getting work done due to obsolescence. + +### Compaq DeskPro 486/33m + +The DeskPro is an Intel 80486DX system running at 33mhz. It has 16MB of RAM, and I haven't gotten around to unpacking it yet. But, it's reasonably secure, even when turned on. + +As [described in the 80486 programmer's manual](http://bitsavers.trailing-edge.com/components/intel/80486/i486_Processor_Programmers_Reference_Manual_1990.pdf), the 80486 is hardwired to start execution from `0xFFFFFFF0`. As long as there is a ROM connected to the chip in such a way that the `0xFFFFFFF0` address can be read, the system will boot whatever is there. This jumps into a BIOS, and then from there, into its operating system. We can audit the system BIOS if desired, or, if we have a CPLD programmer, replace it entirely with our own implementation, since it's socketed on the system board. + +There is no service processor, and booting from any device other than the hard disk can be restricted with a password. Accordingly, any practical attack against this machine would require disassembly of it, for example, to replace the hard disk. + +However, this machine does not use IOMMUs, as it predates IOMMUs, and it is too slow to use Xen to provide equivalent functionality. Overall it scores 3 out of 5 points on the framework above: simple initialization routine, no service controller, no trust store to worry about. + +**Where you can get one**: eBay, local PC recycler, that sort of thing. + +### Dell Inspiron 5515 (AMD Ryzen 5700U) + +This machine is my new workhorse for x86 tasks, since my previous x86 machine had a significant failure of the system board. Whenever I am doing x86-specific Alpine development, it is generally on this machine. But how does it stack up? + +Unfortunately, it stacks up rather badly. Like modern Intel machines, system initialization is controlled by a service processor, the [AMD Platform Security Processor.](https://en.wikipedia.org/wiki/AMD_Platform_Security_Processor) Worse yet, unlike Intel, the PSP firmware is distributed as a single signed image, and cannot have unwanted modules removed from it. + +The system uses InsydeH2O for its UEFI implementation, which is closed source. It does allow Microsoft's signing keys to be removed from the trust store. 
And while IOMMU functionality is available, it is available to virtualized guests only. + +So, overall, it scores only 1 out of 5 possible points for trustworthiness. It should not surprise you to learn that I don't do much sensitive computing on this device, instead using it for compiling only. + +**Where you can get one**: basically any electronics store you want. + +## IBM/Lenovo ThinkPad W500 + +This machine used to be my primary computer, quite a while ago, and ThinkPads are known for being able to take quite a beating. It is also the first computer I tried coreboot on. These days, you can use [Libreboot to install a deblobbed version of coreboot on the W500](https://stafwag.github.io/blog/blog/2019/02/10/how-to-install-libreboot-on-a-thinkspad-w500/). And, since it is based on the Core2 Quad CPU, it does not have the Intel Management Engine service processor. + +But, of course, the Core2 Quad is too slow for day to day work on an operating system where you have to compile lots of things. However, if you don't have to compile lots of things, it might be a reasonably priced option. + +When you use this machine with a coreboot distribution like Libreboot, it scores 4 out of 5 on the trustworthiness score, the highest of all x86 devices evaluated. Otherwise, with the normal Lenovo BIOS, it scores 3 out of 5, as the main differentiator is the availability of a reproducible firmware image: there is no Intel ME to worry about, and the UEFI BIOS allows removal of all preloaded signing keys. + +However, if you use an old ThinkPad, using Libreboot introduces modern features that are not available in the Lenovo BIOS, for example, you can build a firmware that fully supports the latest UEFI specification by using the TianoCore payload. + +**Where you can get it**: eBay, PC recyclers. The maintainer of Libreboot [sells refurbished ThinkPads on her website with Libreboot pre-installed](https://minifree.org/product/libreboot-w500/). Although her pricing is higher than a PC recycler, you are paying not only for a refurbished ThinkPad, but also to support the Libreboot project, hence the pricing premium. + +### Raptor Computing Systems Blackbird (POWER9 Sforza) + +A while ago, somebody sent me a Blackbird system they built after growing tired of the `#talos` community. The vendor promises that the system is built entirely on user-controlled firmware. How does it measure up? + +Firmware wise, it's true: you can compile every piece of firmware yourself, and [instructions are provided to do so](https://wiki.raptorcs.com/wiki/Compiling_Firmware). However, the OpenPOWER firmware initialization process is quite complicated. This is offset by the fact that you have all of the source code, of course. + +There is a service processor, specifically the BMC. It runs the OpenBMC firmware, and is potentially a network-connected element. However, you can compile the firmware that runs on it yourself. + +Overall, I give the Blackbird 5 out of 5 points, however, the pricing is expensive to buy directly from Raptor. A complete system usually runs in the neighborhood of about $3000-4000. There are also a lot of bugs with PPC64LE Linux still, too. + +**Where you can get it**: eBay sometimes, the [Raptor Computing Systems website](https://www.raptorcs.com/). + +### Apple MacBook Air M1 + +Last year, Apple announced machines based on their own ARM CPU design, the Apple M1 CPU. Why am I bringing this up, since I am a free software developer, and Apple is usually wanting to destroy software freedom? 
Great question: the answer, basically, is that Apple's M1 devices are designed in such a way that they have the potential to be trustworthy, performant and, unlike Blackbird, reasonably affordable. However, this is still a matter of potential: the [Asahi Linux project](https://asahilinux.org/), while making fast progress, has not yet arrived at production-quality support for this hardware. So how does it measure up?

Looking at [the Asahi docs for system boot](https://github.com/AsahiLinux/docs/wiki/M1-vs.-PC-Boot#iboot), there are three stages of system boot: SecureROM, and the two iBoot stages. The job of SecureROM is to initialize and load just enough to get the first iBoot stage running, while the first iBoot stage's job is only to get the second iBoot stage running. The second iBoot stage then starts whatever kernel is passed to it, as long as it matches the enrolled hash for secure boot, which is user-controllable. This means that the second iBoot stage can chainload into GRUB or similar to boot Linux. Notably, there is no PKI involved in the secure boot process; it is strictly based on hashes.

This means that the system initialization is as simple as possible, leaving the majority of work to the second stage bootloader. There are no keys to manage, which means no trust store. The end user may trust whatever kernel hash she wishes.

But what about the Secure Enclave? Does it act as a service processor? [No, it doesn't: it remains offline until it is explicitly started by MacOS](https://oftc.irclog.whitequark.org/asahi/2021-05-07#29846589;). And on the M1, everything is gated behind an IOMMU.

Therefore, the M1 actually gets 4 out of 5, making it roughly as trustworthy as the Libreboot ThinkPad, and slightly less trustworthy than the Blackbird. But unlike those devices, the performance is good, and the cost is reasonable. However... it's not quite ready for Linux users yet. That leaves the Libreboot machines as providing the best balance between usability and trustworthiness today, even though their performance is quite slow in comparison to more modern computers. If you're excited by these developments, you should [follow the Asahi Linux project](https://asahilinux.org) and perhaps donate to [marcan's Patreon](https://www.patreon.com/marcan).

**Where to get it**: basically any electronics store

### SolidRun Honeycomb (NXP LX2160A, 16x Cortex-A72)

My main `aarch64` workhorse at the moment is the SolidRun Honeycomb. I picked one up last year, and got Alpine running on it. Like the Blackbird, all firmware that can be flashed to the board is open source. SolidRun provides a build of u-boot or a build of TianoCore to use on the board. In general, they do a good job of enabling you to build your own firmware; the [process is reasonably documented](https://github.com/SolidRun/lx2160a_uefi), with the only binary blob being DDR PHY training data.

However, mainline Linux support is only starting to mature: networking support just landed in full with Linux 5.14, for example. There are also bugs with the PCIe controller. And at $750 for the motherboard and CPU module, it is expensive to get started, but not nearly as expensive as something like Blackbird.

If you're willing to put up with the PCIe bugs, however, it is a good starting point for a fully open system. In that regard, Honeycomb does get 5 out of 5 points, just like the Blackbird system. 
+ +**Where to get it**: [SolidRun's website](https://www.solid-run.com/arm-servers-networking-platforms/honeycomb-workstation/). + +## Conclusions + +While we have largely been in the dark for modern user-trustworthy computers, things are finally starting to look up. While Apple is a problematic company, for many reasons, they are at least producing computers which, once Linux is fully functional on them, are basically trustworthy, but at a sufficiently low price point verses other platforms like Blackbird. Similarly, Libreboot seems to be back up and running and will hopefully soon be targeting more modern hardware. diff --git a/content/blog/understanding-thread-stack-sizes-and-how-alpine-is-different.md b/content/blog/understanding-thread-stack-sizes-and-how-alpine-is-different.md new file mode 100644 index 0000000..efc0eb2 --- /dev/null +++ b/content/blog/understanding-thread-stack-sizes-and-how-alpine-is-different.md @@ -0,0 +1,113 @@ +--- +title: "understanding thread stack sizes and how alpine is different" +date: "2021-06-25" +--- + +From time to time, somebody reports a bug to some project about their program crashing on Alpine.  Usually, one of two things happens: the developer doesn't care and doesn't fix the issue, because it works under GNU/Linux, or the developer fixes their program to behave correctly _only_ for the Alpine case, and it remains silently broken on other platforms. + +## The Default Thread Stack Size + +In general, it is my opinion that if your program is crashing on Alpine, it is because your program is dependent on behavior that is not guaranteed to actually exist, which means your program is not actually portable.  When it comes to this kind of dependency, the typical issue has to deal with the _thread stack size limit_. + +You might be wondering: what is a thread stack, anyway?  The answer, of course, is quite simple: each thread has its own stack memory, because it's not really feasible for multiple threads to use the same stack memory, and on most platforms the size of that memory is much smaller than the main thread's stack, though programmers are not necessarily aware of that discontinuity. + +Here is a table of common `x86_64` platforms and their default stack sizes for the main thread (process) and child threads: + +| OS | Process Stack Size | Thread Stack Size | +| --- | --- | --- | +| Darwin (macOS, iOS, etc) | 8 MiB | 512 KiB | +| FreeBSD | 8 MiB | 2 MiB | +| OpenBSD (before 4.6) | 8 MiB | **64 KiB** | +| OpenBSD (4.6 and later) | 8 MiB | 512 KiB | +| Windows | 1 MiB | 1 MiB | +| Alpine 3.10 and older | 8 MiB | 80 KiB | +| Alpine 3.11 and newer | 8 MiB | 128 KiB | +| GNU/Linux | 8 MiB | **8 MiB** | + +I've highlighted the OpenBSD and GNU/Linux default thread stack sizes because they represent the smallest and largest possible default thread stack sizes. + +Because the Linux kernel has overcommit mode, GNU/Linux systems use 8 MiB by default, which leads to a potential problem when running code developed against GNU/Linux on other systems.  As most threads only need a small amount of stack memory, other platforms use smaller limits, such as OpenBSD using only 64 KiB and Alpine using at most 128 KiB by default.  This leads to crashes in code which assumes a full 8MiB is available for each thread to use. + +If you find yourself debugging a weird crash that doesn't make sense, and your application is multi-threaded, it likely means that you're exhausting the stack limit. + +## What can I do about it? 
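Before looking at fixes, here is a minimal sketch of the failure mode itself (build with `-pthread`): roughly 500 KB of stack usage in a child thread fits easily under an 8 MiB default, but typically crashes under Alpine's 128 KiB default thread stack.

```
#include <pthread.h>
#include <string.h>

/* ~500 KB of stack usage in a child thread: fine with an 8 MiB default,
 * well past a 128 KiB default. */
static void *worker(void *arg) {
    char scratchpad[500000];

    memset(scratchpad, 'A', sizeof scratchpad);
    return arg;
}

int main(void) {
    pthread_t thread;

    pthread_create(&thread, NULL, worker, NULL);
    pthread_join(thread, NULL);
    return 0;
}
```
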
To fix the issue, you will need to either change the way your program is written or change the way it is compiled.  There are a few options you can take to fix the problem, depending on how much time you're willing to spend.  In most cases, these sorts of crashes are caused by attempting to manipulate a large variable which is stored on the stack.  Generally, moving the variable off the stack is the best way to fix the issue, but there are alternative options.

### Moving the variable off the stack

Let's say that the code has a large array that is stored on the stack, which causes the stack exhaustion issue.  In this case, the easiest solution is to move it off the stack.  There are two main approaches you can use to do this: _thread-local storage_ and _heap storage_.  Thread-local storage is a way to reserve additional memory for thread variables; think of it like `static`, but bound to each thread.  Heap storage is what you're working with when you use `malloc` and `free`.

To illustrate, we will adjust this code to use both kinds of storage:

```
#include <string.h>

void some_function(void) {
    char scratchpad[500000];

    memset(scratchpad, 'A', sizeof scratchpad);
}
```

Thread-local variables are referenced with the `thread_local` keyword.  You must include `threads.h` in order to use it, and at block scope a thread-local variable must also be declared `static`:

```
#include <string.h>
#include <threads.h>

void some_function(void) {
    static thread_local char scratchpad[500000];

    memset(scratchpad, 'A', sizeof scratchpad);
}
```

You can also use the heap.  The most portable example would be the obvious one:

```
#include <stdlib.h>
#include <string.h>

const size_t scratchpad_size = 500000;

void some_function(void) {
    char *scratchpad = calloc(1, scratchpad_size);

    memset(scratchpad, 'A', scratchpad_size);

    free(scratchpad);
}
```

However, if you don't mind sacrificing portability outside `gcc` and `clang`, you can use the `cleanup` attribute.  The cleanup handler receives a pointer to the variable going out of scope, so it needs a small wrapper around `free`:

```
#include <stdlib.h>
#include <string.h>

/* the cleanup handler is passed a char **, so free what it points at */
static void free_cleanup(char **p) { free(*p); }
#define autofree __attribute__((cleanup(free_cleanup)))

const size_t scratchpad_size = 500000;

void some_function(void) {
    autofree char *scratchpad = calloc(1, scratchpad_size);

    memset(scratchpad, 'A', scratchpad_size);
}
```

This is probably the best way to fix code like this if you're not targeting compilers like the Microsoft one.

### Adjusting the thread stack size at runtime

`pthread_create` takes an optional `pthread_attr_t` pointer as the second parameter.  This can be used to set an alternate stack size for the thread at runtime:

#include <pthread.h>

pthread_t worker_thread;

/* pthread_create expects a start routine taking and returning void *,
 * so wrap some_function rather than passing it directly */
static void *worker_main(void *arg) {
    (void) arg;
    some_function();
    return NULL;
}

void launch_worker(void) {
    pthread_attr_t attr;

    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 1024768);

    pthread_create(&worker_thread, &attr, worker_main, NULL);
 +} + +By changing the stacksize when calling `pthread_create`, the child thread will have a larger stack. + +### Adjusting the stack size at link time + +In modern Alpine systems, since 2018, it is possible to set the default thread stack size at link time.  This can be done with a special `LDFLAGS` flag, like `-Wl,-z,stack-size=1024768`. + +You can also use tools like [chelf](https://github.com/Gottox/chelf) or [muslstack](https://github.com/yaegashi/muslstack) to patch pre-built binaries to use a larger stack, but this shouldn't be done inside Alpine packaging, for example. + +Hopefully, this article is helpful for those looking to learn how to solve the stack size issue. diff --git a/content/blog/using-otp-asn-1-support-with-elixir.md b/content/blog/using-otp-asn-1-support-with-elixir.md new file mode 100644 index 0000000..d717ccf --- /dev/null +++ b/content/blog/using-otp-asn-1-support-with-elixir.md @@ -0,0 +1,65 @@ +--- +title: "Using OTP ASN.1 support with Elixir" +date: "2019-10-21" +--- + +The OTP ecosystem which grew out of Erlang has all sorts of useful applications included with it, such as support for [encoding and decoding ASN.1 messages based on ASN.1 definition files](http://erlang.org/doc/apps/asn1/). + +I recently began work on [Cacophony](https://git.pleroma.social/pleroma/cacophony), which is a programmable LDAP server implementation, intended to be embedded in the Pleroma platform as part of the authentication components. This is intended to allow applications which support LDAP-based authentication to connect to Pleroma as a single sign-on solution. More on that later, that's not what this post is about. + +## Compiling ASN.1 files with `mix` + +The first thing you need to do in order to make use of the `asn1` application is install a mix task to compile the files. Thankfully, [somebody already published a Mix task to accomplish this](https://github.com/vicentfg/asn1ex). To use it, you need to make a few changes to your `mix.exs` file: + +1. Add `compilers: [:asn1] ++ Mix.compilers()` to your project function. +2. Add `{:asn1ex, git: "https://github.com/vicentfg/asn1ex"}` in the dependencies section. + +After that, run `mix deps.get` to install the Mix task into your project. + +Once you're done, you just place your ASN.1 definitions file in the `asn1` directory, and it will generate a parser in the `src` directory when you compile your project. The generated parser module will be automatically loaded into your application, so don't worry about it. + +For example, if you have `asn1/LDAP.asn1`, the compiler will generate `src/LDAP.erl` and `src/LDAP.hrl`, and the generated module can be called as `:LDAP` in your Elixir code. + +## How the generated ASN.1 parser works + +ASN.1 objects are marshaled (encoded) and demarshaled (parsed) to and from Erlang records. Erlang records are essentially tuples which begin with an atom that identifies the type of the record. + +[Elixir provides a module for working with records](https://hexdocs.pm/elixir/master/Record.html), which comes with some documentation that explain the concept in more detail, but overall the functions in the `Record` module are unnecessary and not really worth using, I just mention it for completeness. + +Here is an example of a record that contains sub-records inside it. We will be using this record for our examples. + +``` +message = {:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE} +``` + +This message maps to an LDAP `unbindRequest`, inside an LDAP envelope. 
The `unbindRequest` carries a null payload, which is represented by `:NULL`. + +The LDAP envelope (the outer record) contains three fields: the message ID, the request itself, and an optional access-control modifier, which we don't want to send, so we use the special `:asn1_NOVALUE` parameter. Accordingly, this message has an ID of 1 and represents an `unbindRequest` without any special access-control modifiers. + +### Encoding messages with the `encode/2` function + +To encode a message, you must represent it in the form of an Erlang record, as shown in our example. Once you have the Erlang record, you pass it to the `encode/2` function: + +``` +iex(1)> message = {:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE} +{:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE} +iex(2)> {:ok, msg} = :LDAP.encode(:LDAPMessage, message) +{:ok, <<48, 5, 2, 1, 1, 66, 0>>} +``` + +The first parameter is the Erlang record type of the outside message. An astute observer will notice that this signature has a peculiar quality: it takes the Erlang record type as a separate parameter as well as the record. This is because the generated encode and decode functions are recursive-descent, meaning they walk the passed record as a tree and recurse downward on elements of the record! + +### Decoding messages with the `decode/2` function + +Now that we have encoded a message, how do we decode one? Well, lets use our `msg` as an example: + +``` +iex(6)> {:ok, decoded} = :LDAP.decode(:LDAPMessage, msg) +{:ok, {:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE}} +iex(7)> decoded == message +true +``` + +As you can see, decoding works the same way as encoding, except the input and output are reversed: you pass in the binary message and get an Erlang record out. + +Hopefully this blog post is useful in answering questions that I am sure people have about making use of the `asn1` application with Elixir. There are basically no documentation or guides for it anywhere, which is why I wrote this post. diff --git a/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md b/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md new file mode 100644 index 0000000..cc9970c --- /dev/null +++ b/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md @@ -0,0 +1,186 @@ +--- +title: "using qemu-user emulation to reverse engineer binaries" +date: "2021-05-05" +--- + +QEMU is primarily known as the software which provides full system emulation under Linux's KVM.  Also, it can be used without KVM to do full emulation of machines from the hardware level up.  Finally, there is `qemu-user`, which allows for emulation of individual programs.  That's what this blog post is about. + +The main use case for `qemu-user` is actually _not_ reverse-engineering, but simply running programs for one CPU architecture on another.  For example, Alpine developers leverage `qemu-user` when they use `dabuild(1)` to cross-compile Alpine packages for other architectures: `qemu-user` is used to run the configure scripts, test suites and so on.  For those purposes, `qemu-user` works quite well: we are even considering using it to build the entire `riscv64` architecture in the 3.15 release. + +However, most people don't realize that you can run a `qemu-user` emulator which targets the same architecture as the host.  After all, that would be a little weird, right?  
Most also don't know that you can control the emulator using `gdb`, which is possible and allows you to debug binaries which detect if they are being debugged.
+
+You don't need `gdb` for this to be a powerful reverse engineering tool, however.  The emulator itself includes many powerful tracing features.  Let's look into them by writing and compiling a sample program that does some recursion by [calculating whether a number is even or odd inefficiently](https://ariadne.space/2021/04/27/the-various-ways-to-check-if-an-integer-is-even/):
+
+#include <stdio.h>
+#include <stdbool.h>
+
+bool isOdd(int x);
+bool isEven(int x);
+
+bool isOdd(int x) {
+   return x != 0 && isEven(x - 1);
+}
+
+bool isEven(int x) {
+   return x == 0 || isOdd(x - 1);
+}
+
+int main(void) {
+   printf("isEven(%d): %d\\n", 1025, isEven(1025));
+   return 0;
+}
+
+Compile this program with `gcc` by doing `gcc -ggdb3 -Os example.c -o example`.
+
+The next step is to install the `qemu-user` emulator for your architecture; in this case, we want the `qemu-x86_64` package:
+
+$ doas apk add qemu-x86\_64
+(1/1) Installing qemu-x86\_64 (6.0.0-r1)
+$
+
+Normally, you would also want to install the `qemu-openrc` package and start the `qemu-binfmt` service to allow the emulator to handle any program that couldn't be run natively, but that doesn't matter here, as we will be running the emulator directly.
+
+The first thing we will do is check to make sure the emulator can run our sample program at all:
+
+$ qemu-x86\_64 ./example
+isEven(1025): 0
+
+Alright, all seems to be well.  Before we jump into using `gdb` with the emulator, let's play around a bit with the tracing features.  Normally when reverse engineering a program, it is common to use tracing programs like `strace`.  These tracing programs are quite useful, but they suffer from a design flaw: they use `ptrace(2)` to accomplish the tracing, which can be detected by the program being traced.  However, we can use qemu-user to do the tracing in a way that is transparent to the program being analyzed:
+
+$ qemu-x86\_64 -d strace ./example
+22525 arch\_prctl(4098,274903714632,136818691500777464,274903714112,274903132960,465) = 0
+22525 set\_tid\_address(274903715728,274903714632,136818691500777464,274903714112,0,465) = 22525
+22525 brk(NULL) = 0x0000004000005000
+22525 brk(0x0000004000007000) = 0x0000004000007000
+22525 mmap(0x0000004000005000,4096,PROT\_NONE,MAP\_PRIVATE|MAP\_ANONYMOUS|MAP\_FIXED,-1,0) = 0x0000004000005000
+22525 mprotect(0x0000004001899000,4096,PROT\_READ) = 0
+22525 mprotect(0x0000004000003000,4096,PROT\_READ) = 0
+22525 ioctl(1,TIOCGWINSZ,0x00000040018052b8) = 0 ({55,236,0,0})
+isEven(1025): 0
+22525 writev(1,0x4001805250,0x2) = 16
+22525 exit\_group(0)
+
+But we can do even more.
For example, we can learn how a CPU would hypothetically break a program down into translation buffers full of micro-ops (these are TCG micro-ops but real CPUs are similar enough to gain a general understanding of the concept): + +$ qemu-x86\_64 -d op ./example +OP: +ld\_i32 tmp11,env,$0xfffffffffffffff0 +brcond\_i32 tmp11,$0x0,lt,$L0 + +---- 000000400185eafb 0000000000000000 +discard cc\_dst +discard cc\_src +discard cc\_src2 +discard cc\_op +mov\_i64 tmp0,$0x0 +mov\_i64 rbp,tmp0 + +---- 000000400185eafe 0000000000000031 +mov\_i64 tmp0,rsp +mov\_i64 rdi,tmp0 + +---- 000000400185eb01 0000000000000031 +mov\_i64 tmp2,$0x4001899dc0 +mov\_i64 rsi,tmp2 + +---- 000000400185eb08 0000000000000031 +mov\_i64 tmp1,$0xfffffffffffffff0 +mov\_i64 tmp0,rsp +and\_i64 tmp0,tmp0,tmp1 +mov\_i64 rsp,tmp0 +mov\_i64 cc\_dst,tmp0 + +---- 000000400185eb0c 0000000000000019 +mov\_i64 tmp0,$0x400185eb11 +sub\_i64 tmp2,rsp,$0x8 +qemu\_st\_i64 tmp0,tmp2,leq,0 +mov\_i64 rsp,tmp2 +mov\_i32 cc\_op,$0x19 +goto\_tb $0x0 +mov\_i64 tmp3,$0x400185eb11 +st\_i64 tmp3,env,$0x80 +exit\_tb $0x7f72ebafc040 +set\_label $L0 +exit\_tb $0x7f72ebafc043 +\[...\] + +If you want to trace the actual CPU registers for every instruction executed, that's possible too: + +$ qemu-x86\_64 -d cpu ./example +RAX=0000000000000000 RBX=0000000000000000 RCX=0000000000000000 RDX=0000000000000000 +RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000 RSP=0000004001805690 +R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000000 +R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000 +RIP=000000400185eafb RFL=00000202 \[-------\] CPL=3 II=0 A20=1 SMM=0 HLT=0 +ES =0000 0000000000000000 00000000 00000000 +CS =0033 0000000000000000 ffffffff 00effb00 DPL=3 CS64 \[-RA\] +SS =002b 0000000000000000 ffffffff 00cff300 DPL=3 DS   \[-WA\] +DS =0000 0000000000000000 00000000 00000000 +FS =0000 0000000000000000 00000000 00000000 +GS =0000 0000000000000000 00000000 00000000 +LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT +TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy +GDT=     000000400189f000 0000007f +IDT=     000000400189e000 000001ff +CR0=80010001 CR2=0000000000000000 CR3=0000000000000000 CR4=00000220 +DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 +DR6=00000000ffff0ff0 DR7=0000000000000400 +CCS=0000000000000000 CCD=0000000000000000 CCO=EFLAGS +EFER=0000000000000500 +\[...\] + +You can also trace with disassembly for each translation buffer generated: + +$ qemu-x86\_64 -d in\_asm ./example +---------------- +IN:   +0x000000400185eafb:  xor    %rbp,%rbp +0x000000400185eafe:  mov    %rsp,%rdi +0x000000400185eb01:  lea    0x3b2b8(%rip),%rsi        # 0x4001899dc0 +0x000000400185eb08:  and    $0xfffffffffffffff0,%rsp +0x000000400185eb0c:  callq  0x400185eb11 + +---------------- +IN:   +0x000000400185eb11:  sub    $0x190,%rsp +0x000000400185eb18:  mov    (%rdi),%eax +0x000000400185eb1a:  mov    %rdi,%r8 +0x000000400185eb1d:  inc    %eax +0x000000400185eb1f:  cltq     +0x000000400185eb21:  mov    0x8(%r8,%rax,8),%rcx +0x000000400185eb26:  mov    %rax,%rdx +0x000000400185eb29:  inc    %rax +0x000000400185eb2c:  test   %rcx,%rcx +0x000000400185eb2f:  jne    0x400185eb21 +\[...\] + +All of these options, and more, can also be stacked.  For more ideas, look at `qemu-x86_64 -d help`.  Now, lets talk about using this with `gdb` using qemu-user's gdbserver functionality, which allows for `gdb` to control a remote machine. 
+ +To start a program under gdbserver mode, we use the `-g` argument with a port number.  For example, `qemu-x86_64 -g 1234 ./example` will start our example program with a gdbserver listening on port 1234.  We can then connect to that gdbserver with `gdb`: + +$ gdb ./example +\[...\] +Reading symbols from ./example... +(gdb) target remote localhost:1234 +Remote debugging using localhost:1234 +0x000000400185eafb in ?? () +(gdb) br isEven +Breakpoint 1 at 0x4000001233: file example.c, line 12. +(gdb) c +Continuing. + +Breakpoint 1, isEven (x=1025) at example.c:12 +12          return x == 0 || isOdd(x - 1); +(gdb) bt full +#0  isEven (x=1025) at example.c:12 +No locals. +#1  0x0000004000001269 in main () at example.c:16 +No locals. + +All of this is happening without any knowledge or cooperation of the program.  As far as its concerned, its running as normal, there is no ptrace or any other weirdness. + +However, this is not 100% perfect: a program could be clever and run the `cpuid` instruction and check for `GenuineIntel` or `AuthenticAMD` and crash out if it doesn't see that it is running on a legitimate CPU.  Thankfully, qemu-user has the ability to spoof CPUs with the `-cpu` option. + +If you find yourself needing to spoof the CPU, you'll probably have the best results with a simple CPU type like `-cpu Opteron_G1-v1` or similar.  That CPU type spoofs an Opteron 240 processor, which was one of the first x86\_64 CPUs on the market.  You can get a full list of CPUs supported by your copy of the qemu-user emulator by doing `qemu-x86_64 -cpu help`. + +There's a lot more qemu-user emulation can do to help with reverse engineering, for some ideas, look at `qemu-x86_64 -h` or similar. diff --git a/content/blog/what-is-ocap-and-why-should-i-care.md b/content/blog/what-is-ocap-and-why-should-i-care.md new file mode 100644 index 0000000..c7541a2 --- /dev/null +++ b/content/blog/what-is-ocap-and-why-should-i-care.md @@ -0,0 +1,103 @@ +--- +title: "What is OCAP and why should I care?" +date: "2019-06-28" +--- + +OCAP refers to Object CAPabilities. Object Capabilities are one of many possible ways to achieve capability-based security. OAuth Bearer Tokens, for example, are an example of an OCAP-style implementation. + +In this context, OCAP refers to an adaptation of ActivityPub which utilizes capability tokens. + +But why should we care about OCAP? OCAP is a more flexible approach that allows for more efficient federation (considerably reduced cryptography overhead!) as well as conditional endorsement of actions. The latter enables things like forwarding `Create` activities using tokens that would not normally be authorized to do such things (think of this like sudo, but inside the federation). Tokens can also be used to authorize fetches allowing for non-public federation that works reliably without leaking metadata about threads. + +In short, OCAP fixes almost everything that is lacking about ActivityPub's security, because it defines a rigid, robust and future-proof security model for the fediverse to use. + +## How does it all fit together? + +This work is being done in the LitePub (maybe soon to be called SocialPub) working group. LitePub is to ActivityPub what the WHATWG is to HTML5. The examples I use here don't necessarily completely line up with what is _really_ in the spec, because they are meant to just be a basic outline of how the scheme works. 
+ +So the first thing that we do is extend the AS2 actor description with a new endpoint (`capabilityAcquisitionEndpoint`) which is used to acquire a new capability object. + +Example: Alyssa P. Hacker's actor object + +{ + "@context": "https://social.example/litepub-v1.jsonld", + "id": "https://social.example/~alyssa", + "capabilityAcquisitionEndpoint": "https://social.example/caps/new" + \[...\] +} + +Bob has a server which lives at `chatty.example`. Bob wants to exchange notes with Alyssa. To do this, Bob's instance needs to acquire a capability that he uses to federate in the future by POSTing a document to the `capabilityAcquisitionEndpoint` and signing it with HTTP Signatures: + +Example: Bob's instance acquires the inbox:write and objects:read capabilities + +{ + "@context": "https://chatty.example/litepub-v1.jsonld", + "id": "https://chatty.example/caps/request/9b2220dc-0e2e-4c95-9a5a-912b0748c082", + "type": "Request", + "capability": \["inbox:write", "objects:read"\], + "actor": "https://chatty.example" +} + +It should be noted here that Bob's instance itself makes the request, using an instance-specific actor. This is important because capability tokens are scoped to their actor. In this case, the capability token may be invoked by any children actors of the instance, because it's an instance-wide token. But the instance could request the token strictly on Bob's behalf by using Bob's actor and signing the request with Bob's key. + +Alyssa's instance responds with a capability object: + +Example: A capability token + +{ + "@context": "https://social.example/litepub-v1.jsonld", + "id": "https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11", + "type": "Capability", + "capability": \["inbox:write", "objects:read"\], + "scope": "https://chatty.example", + "actor": "https://social.example" +} + +There's a few peculiar things about this object that I'm sure you've probably noticed. Lets look at this object together: + +- The `scope` describes the actor which may use the token. Implementations check the scope for validity by merging it against the actor referenced in the message. +- The `actor` here describes the actor which _granted_ the capability. Usually this is an instance-wide actor, but it may also be any other kind of actor. + +In traditional ActivityPub the mechanism through which Bob authenticates and later authorizes federation is left undefined. This is the hole that got filled with signature-based authentication, and is being filled again with OCAP. + +But how do we invoke the capability to exchange messages? There's a couple of ways. + +When pushing messages, we can simply reference the capability by including it in the message: + +Example: Pushing a note using a capability + +{ + "@context": "https://chatty.example/litepub-v1.jsonld", + "id": "https://chatty.example/activities/63ffcdb1-f064-4405-ab0b-ec97b94cfc34", + "capability": "https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11", + "type": "Create", + "object": { + "id": "https://chatty.example/objects/de18ad80-879c-4ad2-99f7-e1c697c0d68b", + "type": "Note", + "attributedTo": "https://chatty.example/~bob", + "content": "hey alyssa!", + "to": \["https://social.example/~alyssa"\] + }, + "to": \["https://social.example/~alyssa"\], + "cc": \[\], + "actor": "https://chatty.example/~bob" +} + +Easy enough, right? Well, there's another way we can do it as well, which is to use the capability as a _bearer token_ (because it is one). 
This is useful when fetching objects: + +Example: Fetching an object with HTTP + capability token + +GET /objects/de18ad80-879c-4ad2-99f7-e1c697c0d68b HTTP/1.1 +Accept: application/activity+json +Authorization: Bearer https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11 + +HTTP/1.1 200 OK +Content-Type: application/activity+json + +\[...\] + +Because we have a valid capability token, the server can make decisions on whether or not to disclose the object based on the relationship associated with that token. + +This is basically OCAP in a nutshell. It's simple and easy for implementations to adopt and gives us a framework for extending it in the future to allow for all sorts of things without leakage of cryptographically-signed metadata. + +If this sort of stuff interests you, drop by [#litepub](https://blog.dereferenced.org/tag:litepub) on freenode! diff --git a/content/blog/what-would-activitypub-look-like-with-capability-based-security-anyway.md b/content/blog/what-would-activitypub-look-like-with-capability-based-security-anyway.md new file mode 100644 index 0000000..e8a0bf4 --- /dev/null +++ b/content/blog/what-would-activitypub-look-like-with-capability-based-security-anyway.md @@ -0,0 +1,177 @@ +--- +title: "What would ActivityPub look like with capability-based security, anyway?" +date: "2019-01-18" +--- + +> This is the third article in a series of articles about ActivityPub detailing the challenges of building a trustworthy, secure implementation of the protocol stack. +> +> In this case, it also does a significant technical deep dive into informally specifying a set of protocol extensions to ActivityPub. Formal specification of these extensions will be done in the Litepub working group, and will likely see some amount of change, **_so this blog entry should be considered non-normative in it's entirety_**. + +Over the past few years of creating and revising ActivityPub, many people have made a push for the inclusion of a capability-based security model as the core security primitive (instead, the core security primitive is “this section is non-normative,” but I'm not salty), but what would that look like? + +There's a few different proposals in the works at varying stages of development that could be used to retrofit capability-based security into ActivityPub: + +- [OCAP-LD](https://w3c-ccg.github.io/ocap-ld/), which adds a generic object capabilities framework for any consumer of JSON-LD (such as the Linked Data Platform, or the neutered version of LDP that is described as part of ActivityPub), +- Litepub Capability Enforcement, which is preliminarily described by this blog post, and +- [PolaPub aka CapabilityPub](https://gitlab.com/dustyweb/polapub) which is only an outline stored in an .org document. It is presumed that PolaPub or CapabilityPub or whatever it is called next week will be built on OCAP-LD, but in fairness, this is pure speculation. + +## Why capabilities instead of ACLs? + +ActivityPub, like the fediverse in general, is an open world system. Traditional ACLs fail to provide proper scalability to the possibility of 100s of millions of accounts across millions of instances. Object capabilities, on the other hand, are opaque tokens which allow the bearer to possibly consume a set of permissions. + +The capability enforcement proposals presently proposed would be deployed as a hybrid approach: capabilities to provide horizontal scalability for the large number of accounts and instances, and deny lists to block specific interactions from actors. 
The combination of capabilities and deny lists provides for a highly robust permissions system for the fediverse, and mimics previous work on federated open world systems. + +## Drawing inspiration from previous work: the Second Life Open Grid Protocol + +I've been following large scale interactive communications architectures for many years, which has allowed me to learn many things about the design and implementation of open world horizontally-scaled systems. + +One of the projects that I followed very closely was started in 2008, as a collaboration between Linden Lab, IBM and some other participants: the [Open Grid Protocol](http://wiki.secondlife.com/wiki/OGP_Explained). While the Open Grid Protocol itself ultimately did not work out for various reasons (largely political), a large amount of the work was recycled into a significant redesign of the Second Life service's backend, and the SL grid itself now resembles a federated network in many ways. + +OGP was built on the concept of using capability tokens as URIs, which would either map to an active web service or a confirmation. Since the capability token was opaque and difficult to forge, it provided sufficient proof of authentication without sharing any actual information about the authorization itself: the web services act on the session established by the capability URIs instead of on an account directly. + +Like ActivityPub, OGP is an actor-centric messaging protocol: when logging in, the login server provides a set of “seed capabilities”, which allow use of the other services. From the perspective of the other services, invocation of those capability URIs is seen as an account performing an action. Sound familiar in a way? + +The way Linden Lab implemented this part of OGP was by having a capabilities server which handled routing the invoked capability URIs to other web services. This step in and of itself is not particularly required, an OGP implementation could handle consumption of the capability URIs directly, as OpenSim does for example. + +## Bringing capability URIs into ActivityPub as a first step + +So, we have established that capability URIs are an opaque token that can be called as a substitute for whatever backend web service was going to be used in the first place. But, what does that get us? + +The simplest way to look at it is this way: there are activities which are _relayable_ and others which are _not relayable_. Both can become capability-enabled, but require separate strategies. + +### Relayable activities + +`Create` (in this context, thread replies) activities are relayable. This means the capability can simply be invoked by treating it as an `inbox`, and the server the capability is invoked on will relay the side effects forward. The exact mechanism for this is not yet defined, as it will require prototyping and verification, but it's not impossible. Capability URIs for relayable activities can likely be directly aliased to the `sharedInbox` if one is available, however. + +### Intransitive activities + +Intransitive activities (ones which act on a pre-existing object that is not supplied) like `Announce`, `Like`, `Follow` will require proofs. 
We can already provide proofs in the form of an `Accept` activity: + +``` +{ + "@context": "https://www.w3.org/ns/activitystreams", + "id": "https://example.social/proofs/fa43926a-63e5-4133-9c52-36d5fc6094fa", + "type": "Accept", + "actor": "https://example.social/users/bob", + "object": { + "id": "https://example.social/activities/12945622-9ea5-46f9-9005-41c5a2364f9c", + "type": "Announce", + "object": "https://example.social/objects/d6cb8429-4d26-40fc-90ef-a100503afb73", + "actor": "https://example.social/users/alyssa", + "to": ["https://example.social/users/alyssa/followers"], + "cc": ["https://www.w3.org/ns/activitystreams#Public"] + } +} +``` + +This proof can be optionally signed with LDS in the same way as OCAP-LD proofs. Signing the proof is not covered here, and the proof must be fetchable, as somebody looking to distribute their intransitive actions on objects known to be security labeled must validate the proof somehow. + +### Object capability discovery + +A security labelled object has a new field, `capabilities` which is an `Object` that contains a set of allowed actions and the corresponding capability URI for them: + +``` +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + "https://litepub.social/litepub/lice-v0.0.1.jsonld" + ], + "capabilities": { + "Announce": "https://example.social/caps/4f230498-5a01-4bb5-b06b-e3625fc03947", + "Create": "https://example.social/caps/d4c4d96a-36d9-4df5-b9da-4b8c74e02567", + "Like": "https://example.social/caps/21a946fb-1bad-48ae-82c1-e8d1d2ab28c3" + }, + [...] +} +``` + +### Example: Invoking a capability + +Bob makes a post, which he allows liking, and replying, but not announcing. That post looks like this: + +``` +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + "https://litepub.social/litepub/lice-v0.0.1.jsonld" + ], + "capabilities": { + "Create": "https://example.social/caps/d4c4d96a-36d9-4df5-b9da-4b8c74e02567", + "Like": "https://example.social/caps/21a946fb-1bad-48ae-82c1-e8d1d2ab28c3" + }, + "id": "https://example.social/objects/d6cb8429-4d26-40fc-90ef-a100503afb73", + "type": "Note", + "content": "I'm really excited about the new capabilities feature!", + "attributedTo": "https://example.social/users/bob" +} +``` + +As you can tell, the capabilities object does not include an `Announce` grant, which means that a proof will not be provided for `Announce` objects. + +Alyssa wants to like the post, so she creates a normal Like activity and sends it to the `Like` capability URI. The server responds with an `Accept` object that she can forward to her recipients: + +``` +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + "https://litepub.social/litepub/lice-v0.0.1.jsonld" + ], + "id": "https://example.social/proofs/fa43926a-63e5-4133-9c52-36d5fc6094fa", + "type": "Accept", + "actor": "https://example.social/users/bob", + "object": { + "id": "https://example.social/activities/12945622-9ea5-46f9-9005-41c5a2364f9c", + "type": "Like", + "object": "https://example.social/objects/d6cb8429-4d26-40fc-90ef-a100503afb73", + "actor": "https://example.social/users/alyssa", + "to": [ + "https://example.social/users/alyssa/followers", + "https://example.social/users/bob" + ] + } +} +``` + +Bob can be removed from the recipient list, as he already processed the side effects of the activity when he accepted it. Alyssa can then forward this object on to her followers, which can verify the proof by fetching it, or alternatively verifying the LDS signature if present. 
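+
+To make the "verify the proof by fetching it" step a little more concrete, here is a minimal sketch of what a receiving server might do, in Elixir. The module name, the use of HTTPoison and Jason, and the origin-matching heuristic are all illustrative assumptions on my part rather than anything from the Litepub spec, and LDS signature verification is left out entirely:
+
+```
+# Minimal proof-verification sketch. Assumes HTTPoison (HTTP client) and
+# Jason (JSON codec); all names here are illustrative, not normative.
+defmodule Litepub.ProofVerifier do
+  @headers [{"accept", "application/activity+json"}]
+
+  # `received` is the forwarded Accept activity, already decoded into a map.
+  def valid_proof?(%{"id" => id, "type" => "Accept"} = received) do
+    with {:ok, %HTTPoison.Response{status_code: 200, body: body}} <- HTTPoison.get(id, @headers),
+         {:ok, fetched} <- Jason.decode(body) do
+      # The origin server must serve the same proof we were handed, and the
+      # actor granting it should share an origin with the object being acted on.
+      fetched["type"] == "Accept" and
+        fetched["object"] == received["object"] and
+        same_origin?(fetched["actor"], get_in(fetched, ["object", "object"]))
+    else
+      _ -> false
+    end
+  end
+
+  def valid_proof?(_), do: false
+
+  defp same_origin?(a, b) when is_binary(a) and is_binary(b),
+    do: URI.parse(a).host == URI.parse(b).host
+
+  defp same_origin?(_, _), do: false
+end
+```
+
+A real implementation would want to be stricter, and would check the LDS signature when one is present, but fetch-and-compare is the core of the idea.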
+ +### Example: Invoking a relayable capability + +Some capabilities, like `Create` result in the server hosting the invoked capability relaying the message forward instead of using proofs. + +In this example, the post being relayed is assumed to be publicly accessible. Instances where a post is not publicly accessible should create a capability URI which returns the post object. + +Alyssa decides to post a reply to the message from Bob she just liked above: + +``` +{ + "@context": [ + "https://www.w3.org/ns/activitystreams", + "https://litepub.social/litepub/lice-v0.0.1.jsonld" + ], + "to": ["https://example.social/users/alyssa/followers"], + "cc": ["https://www.w3.org/ns/activitystreams#Public"], + "type": "Create", + "actor": "https://www.w3.org/users/alyssa", + "object": { + "capabilities": { + "Create": "https://example.social/caps/97706df4-86c0-480d-b8f5-f362a1f45a01", + "Like": "https://example.social/caps/6db4bec5-619d-45a2-b3d7-82e5a30ce8a5" + }, + "type": "Note", + "content": "I am really liking the new object capabilities feature too!", + "attributedTo": "https://example.social/users/alyssa" + } +} +``` + +An astute reader will note that the capability set is the same as the parent. This is because the parent reserves the right to reject any post which requests more rights than were in the parent post's capability set. + +Alyssa POSTs this message to the `Create` capability from the original message and gets back a `202 Accepted` status from the server. The server will then relay the message to her followers collection by dereferencing it remotely. + +A possible extension here would be to allow the Create message to become intransitive and combined with a proof. This could be done by leaving the `to` and `cc` fields empty, and specifying `audience` instead or something along those lines. + +## Considerations with backwards compatibility + +Obviously, it goes without saying that an ActivityPub 1.0 implementation can ignore these capabilities and do whatever they want to do. Thusly, it is suggested that messages with security labelling contrary to what is considered normal for ActivityPub 1.0 are not sent to ActivityPub 1.0 servers. + +Determining what servers are compatible ahead of time is still an area that needs significant research activity, but I believe it can be done! diff --git a/content/blog/why-apk-tools-is-different-than-other-package-managers.md b/content/blog/why-apk-tools-is-different-than-other-package-managers.md new file mode 100644 index 0000000..c45c0b7 --- /dev/null +++ b/content/blog/why-apk-tools-is-different-than-other-package-managers.md @@ -0,0 +1,32 @@ +--- +title: "Why apk-tools is different than other package managers" +date: "2021-04-25" +--- + +Alpine as you may know uses the [apk-tools package manager](https://gitlab.alpinelinux.org/alpine/apk-tools), which we built because pre-existing package managers did not meet the design requirements needed to build Alpine.  But what makes it different, and why does that matter? + +## `apk add` and `apk del` manipulate the desired state + +In traditional package managers like `dnf` and `apt`, requesting the installation or removal of packages causes those packages to be directly installed or removed, after a consistency check. + +In `apk`, when you do `apk add foo` or `apk del bar`, it adds `foo` or `bar` as a dependency constraint in `/etc/apk/world` which describes the desired system state.  Package installation or removal is done as a side effect of modifying this system state.  
It is also possible to edit `/etc/apk/world` with the text editor of your choice and then use `apk fix` to synchronize the installed packages with the desired system state. + +Because of this design, you can also add conflicts to the desired system state.  For example, we [recently had a bug in Alpine where `pipewire-pulse` was preferred over `pulseaudio` due to having a simpler dependency graph](https://gitlab.alpinelinux.org/alpine/apk-tools/-/issues/10742).  This was not a problem though, because users could simply add a conflict against `pipewire-pulse` by doing `apk add !pipewire-pulse`. + +Another result of this design is that `apk` will never commit a change to the system that leaves it unbootable.  If it cannot verify the correctness of the requested change, it will back out adding the constraint before attempting to change what packages are actually installed on the system.  This allows our dependency solver to be rigid: there is no way to override or defeat the solver other than providing a scenario that results in a valid solution. + +## Verification and unpacking is done in parallel to package fetching + +Unlike other package managers, when installing or upgrading packages, `apk` is completely driven by the package fetching I/O.  When the package data is fetched, it is verified and unpacked on the fly.  This allows package installations and upgrades to be extremely fast. + +To make this safe, package contents are initially unpacked to temporary files and then atomically renamed once the verification steps are complete and the package is ready to be committed to disk. + +## `apk` does not use a particularly advanced solver + +Lately, traditional package managers have bragged about having advanced SAT solvers for resolving complicated constraint issues automatically.  For example, [aptitude is capable of solving sudoku puzzles](https://web.archive.org/web/20080823224640/http://algebraicthunk.net/~dburrows/blog/entry/package-management-sudoku/).  `apk` is definitely not capable of that, and I consider that a feature. + +While it is true that `apk` does have a deductive dependency solver, it does not perform backtracking.  The solver is also constrained: it is not allowed to make changes to the `/etc/apk/world` file.  This ensures that the solver cannot propose a solution that will leave your system in an inconsistent state. + +Personally, I think that trying to make a smart solver instead of appropriately constraining the problem is a poor design choice.  I believe the fact that `apt`, `aptitude` and `dnf` have all written code to constrain their SAT solvers in various ways proves this point. + +To conclude, package managers can be made to go fast, and be safe while doing it, but require a careful design that is well-constrained.  `apk` makes its own tradeoffs: a less powerful but easy to audit solver, trickier parallel execution instead of phase-based execution.  These were the right decisions for us, but may not be the right decisions for other distributions. 
diff --git a/content/blog/why-rms-should-not-be-leading-the-free-software-movement.md b/content/blog/why-rms-should-not-be-leading-the-free-software-movement.md
new file mode 100644
index 0000000..304c77b
--- /dev/null
+++ b/content/blog/why-rms-should-not-be-leading-the-free-software-movement.md
@@ -0,0 +1,26 @@
+---
+title: "Why RMS should not be leading the free software movement"
+date: "2021-03-23"
+---
+
+Earlier today, I was [invited to sign the open letter calling for the FSF board to resign](https://rms-open-letter.github.io), [which I did](https://github.com/rms-open-letter/rms-open-letter.github.io/pull/44).  To me, it was obvious to sign the letter, which on its own makes a compelling argument for [why RMS should not be an executive director at FSF](https://rms-open-letter.github.io/appendix).
+
+But I believe there is an even more compelling reason.
+
+When we started Alpine 15 years ago, we largely copied the way other distributions handled things... and so Alpine copied the problems that many other FOSS projects had as well.  Reviews were harsh and lacking in empathy, which caused problems with contributor burnout.  During this time, Alpine had some marginal success: Docker eventually standardized on Alpine as the platform of choice for micro-containers, and postmarketOS was launched.
+
+Due to burnout, I turned in my commit privileges, resigned from the core team, and wound up taking a sabbatical from the project for almost a year.
+
+In the time I was taking a break from Alpine, an amazing thing happened: the core team decided to change the way things were done in the project.  Instead of giving harsh reviews as the bigger projects did, the project pivoted towards a philosophy of _collaborative kindness_.  As a result, burnout issues went away, and we started attracting all sorts of new talent.
+
+[Robert Hansen resigned as the GnuPG FAQ maintainer today](https://twitter.com/robertjhansen/status/1374242002653577216).  In his message, he advocates that we should _demand_ kindness from our leaders, and he's right: it gets better results.  Alpine would not be the success it is today had we not decided to change the way the project was managed.  I want that level of success for FOSS as a whole.
+
+Unfortunately, it has been proven again and again that RMS is not capable of being kind.
+
+He screams at interns when they do not do work to his exact specification (unfortunately, FSF staff are forced to sign an NDA that covers almost every minutia of their experience as an FSF staffer, so this is not easy to corroborate).
+
+He shows up on e-mail lists and overrides the decisions made by his subordinates, despite those decisions having strong community consensus.
+
+He is a wholly ineffective leader, and his continued leadership will ultimately be harmful to FSF.
+
+And so, that is why I signed the letter demanding his resignation yet again.
diff --git a/content/blog/you-cant-stop-the-corporate-music.md b/content/blog/you-cant-stop-the-corporate-music.md
new file mode 100644
index 0000000..3f714ab
--- /dev/null
+++ b/content/blog/you-cant-stop-the-corporate-music.md
@@ -0,0 +1,58 @@
+---
+title: "you can't stop the (corporate) music"
+date: "2021-09-28"
+---
+
+I've frequently said that marketing departments are the most damaging appendage of any modern corporation. However, there is one example of this which really proves the point: corporate songs, and more recently, corporate music videos.
These Lovecraftian horrors are usually created in order to raise employee morale, typically at the cost of hundreds of thousands of dollars and thousands of man-hours being wasted on meetings to compose the song by committee. But don't take my word for it: here's some examples. + +## HP's "Power Shift" + +https://www.youtube.com/watch?v=VLTh4uVJduI + +With a corporate song like this, it's no surprise that PA-RISC went nowhere. + +Lets say you're a middle manager at Hewlett-Packard in 1991 leading the PA-RISC workstation team. Would you wake up one day and say "I know! What we need is to produce a rap video to show why PA-RISC is cool"? No, you probably wouldn't. But that's what somebody at HP did. The only thing this song makes me want to do is **not** buy a PA-RISC workstation. The lyrics likely haunt the hired composer to this day, just look at the hook: + +You want power, +you want speed, +the 700 series, +is what you need! + +PA-RISC has set the pace, +Hewlett-Packard now leads the race! + +## Brocade One: A Hole in None + +https://www.youtube.com/watch?v=RzAnyfgpUcE + +This music video is so bad that Broadcom acquired Brocade to put it out of its misery a year later. + +Your company is tanking because Cisco and Juniper released new products, such as the Cisco Nexus and Juniper MX router, that were far better than the Brocade NetIron MLXe router. What do you do? Make a better router? Nah, that's too obvious. Instead, make a rap video talking about how your management tools are better! (I can speak from experience that Brocade's VCS solution didn't actually work reliably, but who cares about facts.) + +## PriceWaterhouseCoopers: talking about taxes, state and local + +https://www.youtube.com/watch?v=itWiTKU4nCo + +If you ever wondered if accountants could write hard hitting jams: the answer is no. + +At least this one sounds like they made it in-house: the synth lead sounds like it is a Casio home synthesizer, and the people singing it sound like they probably worked there. Outside of the completely blasé lyrics, this one is surprisingly tolerable, but one still has to wonder if it was a good use of corporate resources to produce. Most likely not. + +## The Fujitsu Corporate Anthem + +https://www.youtube.com/watch?v=FRTf3UXCpiE + +Fujitsu proves that this isn't just limited to US companies + +As far as corporate songs go, this one is actually quite alright. Fujitsu went all out on their propaganda exercise, hiring Mitsuko Miyake to sing their corporate power ballad, backed by a big band orchestra. If you hear it from them, Fujitsu exists to bring us all into a corporate utopia, powered by Fujitsu. Terrifying stuff, honestly. + +## The Gazprom Song: the Soviet Corporate Song + +https://www.youtube.com/watch?v=xGbI87tyr\_4 + +Remember when Gazprom had a gas leak sealed by detonating an atomic bomb? + +A Russian friend of mine sent me this when I noted I was looking for examples of corporate propaganda. This song is about Gazprom, the Russian state gas company. Amongst other things, it claims to be a national savior in this song. I have no idea if that's true or not, but they once had a gas leak sealed by [getting the Soviet military to detonate an atomic bomb](https://www.youtube.com/watch?v=3kwQfjGnVpw), so that seems pretty close. + +## Please stop making these songs + +While I appreciate material where the jokes write themselves, these songs represent the worst of corporatism. 
Spend the money on something employees would actually appreciate, like a gift card, instead of making these eldritch horrors. Dammit, I still have the PWC one stuck in my head. Gaaaaaaah!