Updating FreeBSD, and Re-Inventing the Wheel
Intro
Today’s post is a cautionary tale regarding re-inventing the wheel. From Wikipedia:
To reinvent the wheel is to attempt to duplicate—most likely with inferior results—a basic method that has already previously been created or optimized by others.
But where does FreeBSD come into play here? Read on…
Updating FreeBSD
FreeBSD 14.2-RELEASE was made available a couple of days ago as of this writing, on December 3rd. I’ve been running 13.3-RELEASE for a while now on one of my servers, so I decided to update. Generally this is fairly straightforward for FreeBSD; in fact, I have a machine that has been updating across major versions since version 10. Most often it works really well, and it makes me wish more Linux OS developers would get on the ball for their major version upgrades. Anyway, I digress.
In modern versions of FreeBSD, we can use the freebsd-update tool to perform major and minor version upgrades. In my specific example, 13.3-RELEASE → 14.2-RELEASE. So let’s try it! In the example below, I’ll skip past the steps that went smoothly:
> sudo freebsd-update fetch # all good!
Sweet, we’re ready for the upgrade command:
> sudo freebsd-update -r 14.2-RELEASE upgrade
Hold up! Fetching 6451 patches, but we’re “done” at 150? Something is wrong. I tried this a few times, and each run failed at a different point. Something seems to be up with the downloading of files. Let’s dig a bit:
> which freebsd-update
Oh, it’s just a shell script! Dig more…
Inspecting the freebsd-update script, we see some clues:
- A couple (why!) declarations of PHTTPGET=/usr/libexec/phttpget
- PHTTPGET then being used by passing lists of files to download. For example, the fragment below:
Attempt to fetch metadata patches
OK, so we read a patch list and break that up into file entries to feed to phttpget, which downloads them in parallel. Sounds reasonable. Let’s check out phttpget:
file /usr/libexec/phttpget
So this guy is a binary. Running with --help doesn’t yield anything. What about the man page?
man phttpget
PHTTPGET(8) FreeBSD System Manager's Manual PHTTPGET(8)
Oof, not a whole lot of options there. Let’s take a look at the code. A quick Google search landed me on the tool’s home on the internet. Oh man, alarm bells are already going off. From the page (cut+paste):
Note that phttpget is currently extremely minimalist. Of particular note:
* Phttpget can only issue GET requests.
* Phttpget cannot download files larger than 2GB (but this can be easily changed -- search for INT_MAX and replace it by something bigger).
* Phttpget blithely ignores HTTP errors and redirects... in fact, if the HTTP status code is anything other than 200, phttpget will skip over that file and move on to the next file.
* Phttpget ignores timestamps provided by the server. When it creates a file, the file's timestamp will be set to the current date, not the date provided by the server.
* Phttpget creates downloaded files in the current directory, with names equal to final segment of the download path (i.e., if it downloads http://www.example.com/foo/bar/baz then it will create a file named baz in the current directory). Phttpget makes not attempt to check for symlinks or other nastiness. Do not use phttpget if any other user can write to your current directory!
* If you already have a file where phttpget wants to create a file, it will silently remove the existing file.
* I wrote phttpget in about 28 hours, and finished under 12 hours ago. It has had very little testing and probably still contains lots of bugs. (12 hours later: bugcount--. Version 0.1 had a deadlock when fetching a very large number of files due to a missing "break"; this is fixed in version 0.2.)
Fair enough, but… this is the core tool for updating an OS distribution!? Oh my. The source is linked on the page, so let’s download it and take a look. It’s pretty small: a single phttpget.c file.
Skimming the code reveals a very simple implementation, which is great! But I can also spot various assumptions about HTTP that aren’t quite right; the author’s description on their page certainly helped in this area. One thing sticks out right away: a pipelined “option” in the code, but no way to set it via the CLI. It’s initialized to 0 (disabled), so how does it become enabled? Searching reveals this:
1 |
|
This is the one and only spot in the code in which this is flipped from 0. The comment is really helpful here, as it states the intent. Unfortunately, this isn’t a great assumption. hln here represents the minor version in an HTTP version string, e.g. HTTP/1.1 yields 1. This isn’t enough to assume pipelining, however. The server can stop serving requests for, well, really whatever reason it pleases. Skimming the code some more, I note no retries, no way to control timeouts, etc. Looking back at freebsd-update, the same holds there (single try → fail).
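To make the "no retries" point concrete, here is a minimal sketch of the kind of wrapper a shell script could put around any download command. The `retry` function and the `RETRY_DELAY` variable are hypothetical names of my own; nothing like this exists in freebsd-update or phttpget as far as I can see.

```shell
#!/bin/sh
# Hypothetical helper (not part of freebsd-update): run a command up to
# $1 times, pausing RETRY_DELAY seconds (default 2) between attempts.
retry() {
    attempts=$1
    shift
    n=1
    # Keep re-running the command until it exits successfully.
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "${RETRY_DELAY:-2}"
    done
}
```

A caller would then write something like `retry 3 some-download-command args...`. A wrapper like this only helps if the downloader exits non-zero on failure, so it is a sketch of the idea rather than a drop-in fix.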
For a system update utility, this is now blowing my mind… but I’ll move on for now.
Let’s patch this thing, and really keep the HTTP work as K.I.S.S. as possible: no pipelining. For this, simply comment out the above two lines and re-compile with make.
Fixing the Thing
Now that we’ve found a bug, let’s try the fix! I copied over my new binary, and tried again:
sudo freebsd-update -r 14.2-RELEASE upgrade
Certainly a bit slower, but not too bad… and would you look at that, it worked and I can now properly update.
The Lesson
So what is the lesson here? From my perspective, it’s this: don’t reinvent the wheel. We’ve been collectively using wget and cURL in the industry across the BSDs, Linux, Windows, and more. The wheel is there, it works, it’s well known, trusted, maintained, etc. Why make a new one? We’re now in a situation in which production FreeBSD servers are dependent on a buggy tool written “…in about 28 hours”. I would think this would raise some red flags.
I plan on reporting this properly, along with the suggestion of “Just use cURL”, and I hope the FreeBSD folk take it to heart. cURL even has parallel transfer support!
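For the curious, here is roughly what I have in mind: a sketch, not a patch. The `fetch_all` function, the `CURL` variable, and the specific retry, timeout, and parallelism numbers are my own assumptions for illustration. The curl flags themselves are real, and `--parallel` needs curl 7.66.0 or newer.

```shell
#!/bin/sh
# Assumption: CURL names a curl binary with --parallel support (7.66.0+).
CURL="${CURL:-curl}"

# Hypothetical helper: fetch_all BASE_URL
# Reads relative paths on stdin (one per line) and downloads each into
# the current directory, with retries, a connect timeout, and up to
# four transfers in flight at once.
fetch_all() {
    base=$1
    while IFS= read -r path; do
        printf '%s/%s\n' "$base" "$path"
    done | xargs "$CURL" --fail --retry 3 --connect-timeout 10 \
        --parallel --parallel-max 4 --remote-name-all
}
```

Feeding it a patch list would look like `fetch_all "$base_url" < patchlist`, where both names are placeholders. Retries, timeouts, and HTTP error handling all come along for free, which is the whole point.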