HN Offline: Real-time Linux is officially part of the kernel

jonbaer | 503 points | 9mon ago | arstechnica.com

jpfr|9mon ago

This is a big achievement after many years of work!

Here are a few links to see how the work is done behind the scenes. Sadly arstechnica has only funny links and doesn't provide the actual source (why LinkedIn?).

Most of the work was done by Thomas Gleixner and team. He founded Linutronix, now (I believe) owned by Intel.

Pull request for the last printk bits: https://marc.info/?l=linux-kernel&m=172623896125062&w=2

Pull request for PREEMPT_RT in the kernel config: https://marc.info/?l=linux-kernel&m=172679265718247&w=2

This is the log of the RT patches on top of kernel v6.11.

https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-...

I think there are still a few things you need on top of a vanilla kernel. For example the new printk infrastructure still needs to be adopted by the actual drivers (UART consoles and so on). But the size of the RT patchset is already much much smaller than before. And being configurable out-of-the-box is of course a big sign of confidence by Linus.

Congrats to the team!
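For anyone who wants to check whether the kernel they are running was built with that config option: one common approach is to look for the PREEMPT_RT flag in the kernel version string (what `uname -v` prints). A minimal sketch, assuming RT builds carry that flag in the version string; the function name `is_preempt_rt` is made up for illustration.

```python
import platform


def is_preempt_rt(version_string: str) -> bool:
    """Return True if a `uname -v` style string carries the PREEMPT_RT flag."""
    # Compare whole whitespace-separated tokens so that a different flag
    # such as PREEMPT_DYNAMIC is not mistaken for PREEMPT_RT.
    return "PREEMPT_RT" in version_string.split()


if __name__ == "__main__":
    version = platform.uname().version
    print(version)
    print("PREEMPT_RT kernel:", is_preempt_rt(version))
```

The check only tells you how the kernel was built; it says nothing about whether drivers or the rest of the system are tuned for low latency.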

weinzierl|9mon ago

Thomas Gleixner is one of the most prolific people I've heard of. He has been one of the most active kernel developers for more than a decade, leading the pack at times, and is currently ranked at position five:

https://lwn.net/Articles/956765/

froh|9mon ago

TIL in 2022, Linutronix became an "independent subsidiary" of Intel, indeed:

https://www.linutronix.de/company/history.php

femto|9mon ago

If you want to see the effect of the real-time kernel, build and run the cyclictest utility from the Linux Foundation.

https://wiki.linuxfoundation.org/realtime/documentation/howt...

It measures and displays the interrupt latency for each CPU core. Without the real-time patch, worst case latency can be double digit milliseconds. With the real-time patch, worst case drops to single digit microseconds. (To get consistently low latency you will also have to turn off any power saving states, as a transition between sleep states can hog the CPU, despite the RT kernel.) Cyclictest is an important tool if you're doing real-time with Linux.
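To give a feel for what is being measured, here is a rough imitation of the idea in Python: sleep until a periodic deadline and record how late each wakeup was. It is a sketch, not a replacement for cyclictest (the interpreter adds its own overhead, and the numbers will be far worse than what the real tool reports). The helper names are made up. The first helper assumes the usual Linux `/dev/cpu_dma_latency` interface, where writing a 32-bit latency limit and keeping the file open asks the kernel to stay out of deep idle states; that normally needs root.

```python
import contextlib
import os
import struct
import time


@contextlib.contextmanager
def hold_cpu_dma_latency(max_us=0, path="/dev/cpu_dma_latency"):
    """Ask the kernel to avoid deep idle states while the file stays open.

    Yields True if the request was placed, False if the device could not
    be opened (for example when not running as root).
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        yield False
        return
    try:
        os.write(fd, struct.pack("=i", max_us))  # 32-bit limit in microseconds
        yield True
    finally:
        os.close(fd)  # closing the file drops the request


def measure_wakeup_latency(interval_us=1000, loops=1000):
    """Sleep to periodic deadlines; return how late each wakeup was, in us."""
    interval_ns = interval_us * 1000
    deadline = time.monotonic_ns() + interval_ns
    latencies = []
    for _ in range(loops):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        latencies.append((time.monotonic_ns() - deadline) / 1000)
        deadline += interval_ns
    return latencies


if __name__ == "__main__":
    with hold_cpu_dma_latency() as held:
        samples = measure_wakeup_latency()
    print(f"idle-state request held: {held}")
    print(f"min {min(samples):.1f} us, avg {sum(samples) / len(samples):.1f} us, "
          f"max {max(samples):.1f} us")
```

The number to watch is the max, not the average: run it while the machine is busy and compare the worst case with and without the RT kernel.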

As an example, if you're doing processing for software defined radio, it's the difference between the system occasionally having "blips" and the system having rock solid performance, doing what it is supposed to every time. With the real-time kernel in place, I find I can do acid-test things, like running GNOME and LibreOffice on the same laptop as an SDR, and the SDR doesn't skip a beat. Without the real-time kernel it would be dropping packets all over the place.

aero-glide2|9mon ago

Interestingly, whenever I touch my touchpad, the worst case latency shoots up 20x, even with the RT patch. What could be causing this? And this is always on core 5.

femto|9mon ago

Perhaps the code associated with the touchpad has a priority greater than the one you used to run cyclictest (80?). Does it still happen if you boost the priority of cyclictest to the highest possible, using the option:

--priority=99

Apply priority 99 with care to your own code. A tight endless loop with priority 99 will override pretty well everything else, so about the only way to escape will be to turn your computer off. Been there, done that :-)

snvzz|9mon ago

The most important thing is to set the policy, described in sched(7), rather than the priority.

Notice that without setting the priority, the default policy is SCHED_OTHER, which is the standard one most processes get unless they request otherwise.

By setting a priority (while not specifying a policy), the policy becomes SCHED_FIFO, the highest, which is meant to give the process the CPU immediately and not preempt it until the process releases it.

This implicit change in policy is why you see such a brutal effect from setting the priority.
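To make the policy/priority distinction concrete: in the sched(7) model the two are set together, and a process under SCHED_OTHER has a real-time priority of 0. A minimal sketch using Python's `os.sched_*` wrappers (Linux only); the helper names are made up, and switching to SCHED_FIFO is expected to fail without the right privileges.

```python
import os

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
}


def current_policy(pid=0):
    """Return (policy name, priority) for a process; pid 0 means ourselves."""
    policy = os.sched_getscheduler(pid)
    priority = os.sched_getparam(pid).sched_priority
    return POLICY_NAMES.get(policy, f"unknown({policy})"), priority


def try_switch_to_fifo(priority):
    """Try to move this process to SCHED_FIFO; return True on success."""
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically a permission error: this needs root, CAP_SYS_NICE
        # or a suitable RLIMIT_RTPRIO.
        return False
    return True


if __name__ == "__main__":
    print("before:", current_policy())
    lo = os.sched_get_priority_min(os.SCHED_FIFO)
    hi = os.sched_get_priority_max(os.SCHED_FIFO)
    print(f"SCHED_FIFO priority range: {lo}..{hi}")
    # Deliberately use the lowest real-time priority here, not the highest.
    if try_switch_to_fifo(lo):
        print("after:", current_policy())
    else:
        print("not permitted to use SCHED_FIFO")
```

Even the lowest SCHED_FIFO priority runs ahead of every SCHED_OTHER task, which is why a busy loop under a real-time policy can lock up a machine regardless of the exact number chosen.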

femto|9mon ago

Thanks.

robocat|9mon ago

Perhaps an SMM ring -2 touchpad driver?

If you're developing anything on x86 that needs realtime - how do you disable SMM drivers causing unexpected latency?

jabl|9mon ago

Buy HW that can be flashed with coreboot?

And while it won't (completely) remove SMM, https://github.com/corna/me_cleaner might get rid of some stuff. I think that's more about getting rid of spyware and ring -1 security bugs than improving real-time behavior though.

angus-g|9mon ago

Maybe a PS/2 touchpad that is triggering (a bunch of) interrupts? Not sure how hardware interrupts work with RT!

jabl|9mon ago

One of the features of PREEMPT_RT is that it converts interrupt handlers to running in their own threads (with some exceptions, I believe), instead of being tacked on top of whatever thread context was active at the time like with the softirq approach the "normal" kernel uses. This allows the scheduler to better decide what should run (e.g. your RT process rather than serving interrupts for downloading cat pictures).
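A practical consequence is that the handler threads are visible like any other thread, conventionally named `irq/<number>-<device>`, so you can look up their policy and priority and compare them with your own RT task. A sketch that scans /proc for them; the helper names are made up, and it assumes that naming convention and a Linux /proc layout (thread names in `comm` may be truncated).

```python
import os
import re

IRQ_THREAD_RE = re.compile(r"^irq/(\d+)-(.*)$")


def parse_irq_thread(comm):
    """Split a thread name like 'irq/12-i8042' into (12, 'i8042'), else None."""
    match = IRQ_THREAD_RE.match(comm.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def list_irq_threads(proc="/proc"):
    """Return [(pid, irq, name, policy, priority)] for visible IRQ threads."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "comm")) as f:
                parsed = parse_irq_thread(f.read())
            if parsed is None:
                continue
            pid = int(entry)
            policy = os.sched_getscheduler(pid)
            priority = os.sched_getparam(pid).sched_priority
        except OSError:
            continue  # thread exited or is not accessible
        found.append((pid, parsed[0], parsed[1], policy, priority))
    return sorted(found, key=lambda row: row[1])


if __name__ == "__main__":
    for pid, irq, name, policy, priority in list_irq_threads():
        fifo = "SCHED_FIFO" if policy == os.SCHED_FIFO else f"policy {policy}"
        print(f"pid {pid:>6}  irq {irq:>4}  {name:<12} {fifo} prio {priority}")
```

If a device's handler thread shows a higher real-time priority than your measurement task, that would be consistent with latency spikes whenever the device fires; `chrt -p <pid>` shows the same information for a single thread.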

monero-xmr|9mon ago

Touchpad support is very poor in Linux. I use System76 and the touchpad is always a roll of the dice with every kernel upgrade, despite it being a "good" distro / vendor.

dijit|9mon ago

Quiet reminder that "real-time" is almost better thought of as "consistent-time".

The problem space is such that it doesn't necessarily mean "faster" or lower latency in any way, just that where there is latency, it's consistent.

amiga386|9mon ago

I always viewed it as "the computer needs to control things that are happening in real time and won't wait for it if it's late".

PhilipRoman|9mon ago

Indeed, some of my colleagues worked on a medical device which must be able to reset itself in 10 seconds, in case something goes wrong. 10 seconds is plenty of time on average, the real problem is eliminating those remaining 0.01% cases.

froh|9mon ago

Consistent as in reliably bounded, that is.
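The "bounded, not fast" point is easy to show with numbers. A small worked example with made-up data in the spirit of the 10 second reset above: the average looks comfortable, yet the deadline is still missed, because only the worst case counts. The function names and the sample values are hypothetical.

```python
from statistics import mean


def meets_deadline(samples, deadline):
    """Hard real-time view: every single sample must be within the deadline."""
    return max(samples) <= deadline


def miss_ratio(samples, deadline):
    """Fraction of samples that exceeded the deadline."""
    return sum(1 for s in samples if s > deadline) / len(samples)


if __name__ == "__main__":
    deadline_s = 10.0
    # Hypothetical data: 9999 resets take 2 s, one takes 12 s.
    reset_times = [2.0] * 9999 + [12.0]
    print(f"mean:  {mean(reset_times):.3f} s")
    print(f"worst: {max(reset_times):.1f} s")
    print(f"miss ratio: {miss_ratio(reset_times, deadline_s):.4%}")
    print("meets deadline:", meets_deadline(reset_times, deadline_s))
```

A system that always takes 9 seconds passes this check, while one that averages 2 seconds with a rare 12 second outlier does not.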