A Higgs-Bugson in the Linux Kernel
Ne02ptzero | 232 points | 16day ago | blog.janestreet.com
gnfargbl|15day ago
Calling this a "Higgs-Bugson" doesn't make a lot of sense. There's nothing uncertain or difficult to reproduce about the Higgs.
The reason that it took so long to find was that the cross-section of production is very low, the decay signatures are hard to separate from the background, the specific energy scale it existed at was not well-defined, and building the LHC was (to put it mildly) difficult and expensive.
Roughly, if you'll forgive a bad analogy from a long-lapsed physicist, it was the equivalent of trying to find a very weak glow from a specific type of bug hiding at an unknown location in a huge field of corn. Except that your vision was very bad, so you had to invent a new type of previously-unimaginably excellent eyeglasses to see the thing. Also before you could even start looking you had to expend a painful amount of time and money building a flashlight so incredibly huge that it needed new types of cryogenic cooling inventing, just to stop it from melting when you switched it on.
If you had a software bug that you were almost certain was there, but you needed half of the world's GPU clusters for three years to locate and prove it, then that would be a Higgs-Bugson.
lisper|15day ago
A bug that shows up in production but goes away when you try to debug it is usually called a Heisenbug.
Agingcoder|12day ago
Yes - happened to me once. I could reproduce it consistently until I turned tcpdump on, at which point it would go away. I resorted to eBPF as well.
alexpotato|15day ago
Regarding NFS, I've always loved this quote from the CTO at a hedge fund I once worked at:
"NFS is a lot like heroin:
at first, it seems amazing.
But then it ruins your life"
(This is a place that did almost EVERYTHING via NFS, including different applications communicating via shared files on NFS mounts. It also had the weird setup of using BOTH Linux AND Windows permissions on NFS mounts shared between user desktops [Windows] and servers [Linux].)
stavros|15day ago
The problem I have with reviews like these is that they're expressed in absolute terms. Yes, NFS might ruin my life, but if it ruins my life less than every other alternative, it's still a win.
eqvinox|15day ago
I'd go as far as saying most networked concurrent file access will ruin your life one way or another, because it's just a hard problem, and it's trying to solve it at a very odd layer; a "classic" fs can't really take advantage of higher layer transactional or other known constraints in order to make things work better…
fragmede|15day ago
Google Docs solved the problem at the right layer then.
eqvinox|15day ago
No, Google Docs solved a different problem at the right layer. Their solution isn't transferable to other specific problems that may currently be approached using networked file systems, let alone the generic case.
burnt-resistor|15day ago
The devil is in the details, the devil you know is preferable, and there are as yet no perfectly angelic systems or code (because of the widespread allergy to formal methods and job security)... which will lead to less evil, but still imperfect systems.
johncolanduoni|15day ago
Continuing the analogy, many people eventually discover that they used NFS because they didn’t understand their underlying problem clearly.