Is there a way to keep Linux responsive when at ~100% CPU usage?
One big difference that I've noticed between Windows and Linux is that Windows does a much better job ensuring that the system stays responsive even under heavy load.
For instance, I often need to compile Rust code. Anyone who writes Rust knows that the Rust compiler is very good at using all your cores and all the CPU time it can get its hands on (which is good, you want it to compile as fast as possible after all). But that means that for a time while my Rust code is compiling, I will be maxing out all my CPU cores at 100% usage.
When this happens on Windows, I've never really noticed. I can use my web browser or my code editor just fine while the code compiles, so I've never really thought about it.
However, on Linux when all my cores reach 100%, I start to notice it. It seems like every window I have open starts to lag and I get stuttering as the programs struggle to get a little bit of CPU that's left. My web browser starts lagging with whole seconds of no response and my editor behaves the same. Even my KDE Plasma desktop environment starts lagging.
I suppose Windows must be doing something clever to somehow prioritize user-facing GUI applications even in the face of extreme CPU starvation, while Linux doesn't seem to do a similar thing (or doesn't do it as well).
Is this an inherent problem of Linux at the moment or can I do something to improve this? I'm on Kubuntu 24.04 if it matters. Also, I don't believe it is a memory or I/O problem as my memory is sitting at around 60% usage when it happens with 0% swap usage, while my CPU sits at basically 100% on all cores. I've also tried disabling swap and it doesn't seem to make a difference.
EDIT: Tried nice -n +19, still lags my other programs.
EDIT 2: Tried installing the Liquorix kernel, which is supposedly better for this kinda thing. I dunno if it's placebo but stuff feels a bit snappier now? My mouse feels more responsive. Again, dunno if it's placebo. But anyways, I tried compiling again and it still lags my other stuff.
By default the Linux kernel uses the CFS CPU scheduler, which tries to be fair to all processes at the same time - both foreground and background - for high throughput. Abstractly think "the kernel never knows what you intend to do", so it's sort of middle of the road as a default - every process gets a fair share of CPU time unless it's been intentionally nice'd or whatnot. People who need realtime work (the classic use is audio engineers who need near-zero latency on hardware inputs like a MIDI sequencer, but embedded hardware uses realtime a lot too) reconfigure their system(s) for that need; for desktop-priority users there are ways to alter the CFS scheduler to help maintain desktop responsiveness.
Have a look at GitHub projects such as this one to learn how and what to tweak - not that you necessarily need to use it, but it's a good starting point for understanding how the mojo works and what you can do on your own with a few sysctl tweaks to get a better desktop experience while your Rust code is compiling in the background. https://github.com/igo95862/cfs-zen-tweaks (in this project you're looking at the set-cfs-zen-tweaks.sh file and what it's tweaking in /proc, so you can get hints on where your research goals should lead - most of these can be set with a sysctl)
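As a starting point for that kind of research, here is a minimal read-only sketch that prints whichever scheduler tunables your kernel exposes. The function name list_sched_tunables is made up for illustration, and the two locations are an assumption: older kernels keep these knobs under /proc/sys/kernel, while newer ones appear to have moved several of them to debugfs, which usually needs root to read.

```shell
#!/bin/sh
# list_sched_tunables: hypothetical helper, read-only.
# Prints "path = value" for every scheduler knob it can read and
# silently skips anything that is missing or not readable.
list_sched_tunables() {
    for f in /proc/sys/kernel/sched_* /sys/kernel/debug/sched/*_ns; do
        if [ -f "$f" ] && [ -r "$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$f" 2>/dev/null)"
        fi
    done
    return 0
}

# Usage (run as root to also see the debugfs entries):
#   list_sched_tunables
```

Nothing is written here, so it is safe to run before deciding which values, if any, to change.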
There's a lot to learn about this so I hope this gets you started down the right path on searches for more information to get the exact solution/recipe which works for you.
Responsiveness for typical everyday usage is one of the main scenarios kernels like Zen/Liquorix and their out of the box scheduler configurations are meant to improve, and in my experience they help a lot. Maybe give them a go sometime!
Edit: For added context, I remember Zen significantly improving responsiveness under heavy loads such as the one OP is experiencing back when I was experimenting with some particularly computationally intensive tasks
The System76 scheduler helps to tune for better desktop responsiveness under high load: https://github.com/pop-os/system76-scheduler
I think if you use Pop!_OS this may be set up out-of-the-box.
Lots of bad answers here. Obviously the kernel should schedule the UI to be responsive even under high load. That’s doable; just prioritise running those over batch jobs. That’s a perfectly valid demand to have on your system.
This is one of the cases where Linux shows its history as a large shared unix system and its focus as a server OS; if the desktop is just a program like any other, who’s to say it should have more priority than Rust?
I’ve also run into this problem. I never found a solution for this, but I think one of those fancy new schedulers might work, or at least is worth a shot. I’d appreciate hearing about it if it does work for you!
Hopefully in a while there are separate desktop-oriented schedulers for the desktop distros (and ideally also better OOM handlers), but that seems to be a few years away maybe.
In the short term you may have some success in adjusting the priority of Rust with nice, an incomprehensibly named tool to adjust the priority of your processes. High numbers = low priority (the task is “nicer” to the system). You run it like this: nice -n5 cargo build.
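A small sketch of that, with the niceness pushed all the way to 19. run_low_priority is a hypothetical wrapper name, not part of nice or cargo. Calling nice with no arguments prints the current niceness, which is an easy way to check that the wrapper took effect.

```shell
#!/bin/sh
# run_low_priority: hypothetical wrapper that starts a command at the
# lowest CPU priority (niceness 19). Child processes inherit the value,
# so every compiler process spawned by the build is niced as well.
run_low_priority() {
    nice -n 19 "$@"
}

# Usage:
#   run_low_priority cargo build
# Check the niceness the wrapped command actually runs with:
#   run_low_priority nice
```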
"The kernel runs out of time to solve the NP-complete scheduling problem in time."
More responsiveness requires more context-switching, which then subtracts from the available total CPU bandwidth.
There is a point where the task scheduler and CPUs get so overloaded that a non-RT kernel can no longer guarantee timed events.
So, web browsing is basically poison for the task scheduler under high load. Unless you reserve some CPU bandwidth (with cgroups, etc.) beforehand for the foreground task.
Since SMT threads also aren't real cores (about ~0.4 - 0.7 of an actual core), putting 16 tasks on a 16/8 machine is only going to slow down the execution of all other tasks on the shared cores.
I usually leave one CPU thread for "housekeeping" if I need to do something else. If I don't, some random task is going to be very pleased by not having to share a core. That "spare" CPU thread will be running literally everything else, so it may get saturated by the kernel tasks alone.
nice +5 is more of a suggestion: "please run this task with worse latency on a contended CPU."
(I think I should benchmark make -j15 vs. make -j16 to see what the difference is)
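The "leave one CPU thread free" idea can be sketched like this. build_jobs is a hypothetical helper; it assumes nproc is available and that your build tool takes a -j job count, as make and cargo do.

```shell
#!/bin/sh
# build_jobs: hypothetical helper that prints a job count one lower than
# the number of CPU threads, but never less than 1.
# An explicit thread count can be passed as $1; otherwise nproc is used.
build_jobs() {
    cores=${1:-$(nproc)}
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        echo 1
    fi
}

# Usage:
#   make -j"$(build_jobs)"
#   cargo build -j "$(build_jobs)"
```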
Sounds like Kubuntu's fault to me. If they provide the desktop environment, shouldn't they be the ones making it play nice with the Linux scheduler? Linux is configurable enough to support real-time scheduling.
FWIW I run NixOS and I've never experienced lag while compiling Rust code.
If you compile on Windows Server the same problem happens. The server is basically gone. So there seems to be some special scheduler configuration in the Windows client OS.
You say it doesn't top out on memory, so you don't need the -p MemoryLimit=500M parameter. Set your compiler CPUQuota to maybe 80%, or whatever you can work out with trial and error.
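For reference, a sketch of what a CPUQuota invocation could look like, assuming systemd-run with a user scope (the exact command this reply refers to is not shown here). CPUQuota is measured relative to one CPU, so 80% of an 8-thread machine is 640%. cpu_quota is a hypothetical helper that does that multiplication.

```shell
#!/bin/sh
# cpu_quota: hypothetical helper that turns "percent of the whole machine"
# into a systemd CPUQuota value, which is relative to a single CPU.
#   $1 = percent of total CPU capacity to allow
#   $2 = number of CPU threads (defaults to nproc)
cpu_quota() {
    percent=$1
    cores=${2:-$(nproc)}
    echo "$((percent * cores))%"
}

# Usage (assumes the cpu cgroup controller is delegated to your user
# session, which may not be the case on every distro):
#   systemd-run --user --scope -p "CPUQuota=$(cpu_quota 80)" cargo build
```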
All the comments here are great. One other suggestion I didn't see: use chrt to start the build process with the SCHED_BATCH policy. It's lower than SCHED_OTHER, which most processes will be in, so the compilation processes should be bumped off the CPU for virtually everyone else.
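A minimal sketch of the chrt approach. run_batch is a hypothetical wrapper name; it assumes chrt from util-linux is installed. SCHED_BATCH only accepts a static priority of 0, and how much it helps is something to measure rather than assume, since the policy is a hint to the scheduler and not a hard guarantee.

```shell
#!/bin/sh
# run_batch: hypothetical wrapper that starts a command under the
# SCHED_BATCH policy. "-b" selects SCHED_BATCH and "0" is the only
# static priority that policy accepts. Child processes inherit it.
run_batch() {
    chrt -b 0 "$@"
}

# Usage:
#   run_batch cargo build
# Inspect the policy of a running process:
#   chrt -p <pid>
```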
While I ultimately think your solution is to use a different scheduler, and that the most useful responses you've gotten have been about that; and that I agree with your response that Linux distros should really be tuning the scheduler for the UI by default and let developers and server runners take the burden of tuning differently for their workloads... all that said, I can't let this comment on your post go by:
which is good, you want it to compile as fast as possible after all
If fast compile times are your priority, you're using the wrong programming language. One of Go's fundamental principles is fast compile times; even with add-on caching tooling in other languages, Go remains one of the fastest-compiling statically compiled, strongly typed programming languages available. I will not install Haskell programs unless they're precompiled bin packages, that's a hard rule. I will only reluctantly install Rust packages, and will always choose bins if available. But I'll pick a -git Go package without hesitation, because they build crazy fast.
Anyway, I hope you find the scheduler of your dreams and live happily ever after.
TLDR you might be interested in the rust-based scheduler one of the Canonical Devs released as a PoC. Seemed to be designed similar to your needs of keeping the system (particularly games) responsive even whilst running heavy tasks like kernel compilations. You can swap out schedulers at run time on Linux iirc?
My work Windows machine bogged down quite a bit when recompiling large projects (GBs of C/C++ code), so I set the build to use 19 of 20 "cores". That worked okay-ish but was not a snappy experience IMO (64GB RAM & SSD).
Actually, I've experienced the opposite. I find Windows lagging much more often than Linux when compiling something, especially since Linux switched to the EEVDF scheduler. The most important factor that influences lag on both systems seems to be the power profile though. If I set my power profile to save battery, the system lags from time to time but if I set it to performance it basically never happens (on GNOME you can change that in the quick menu, not sure about KDE). It might be that your Windows is simply tuned more towards performance by default at the cost of higher power consumption.
EDIT: Tried nice -n +19, still lags my other programs.
Yeah, this is the wrong way of doing things. You should get better results with CPU pinning. Raising the niceness of YOUR threads, which interact all the time with disk I/O, memory caches and display I/O, is the wrong end of the stick. You still need to display compilation progress and warnings, and access I/O.
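A sketch of what CPU pinning could look like with taskset from util-linux. run_pinned is a hypothetical wrapper, and the CPU list in the usage line is only an example for an 8-thread machine. Check your own topology (for instance with lscpu) before picking a range.

```shell
#!/bin/sh
# run_pinned: hypothetical wrapper that restricts a command, and every
# process it spawns, to the given CPU list.
#   $1   = CPU list in taskset syntax, e.g. "0-5" or "0,2,4"
#   rest = the command to run
run_pinned() {
    cpus=$1
    shift
    taskset -c "$cpus" "$@"
}

# Usage: keep the build on CPUs 0-5 and leave 6-7 to the desktop:
#   run_pinned 0-5 cargo build
```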
There's no way of knowing why your system is so slow without profiling it first. Taking any advice from here or elsewhere without telling us first what your machine is doing is missing the point. You need to find out what the problem is and report it at the source.
Hmm, I can't say that I've ever noticed this. I have a 3950x 16-core CPU and I often do video re-encoding with ffmpeg on all cores, and occasionally compile software on all cores too. I don't notice it in the GUI's responsiveness at all.
Are you absolutely sure it's not I/O related? A compile is usually doing a lot of random IO as well. What kind of drive are you running this on? Is it the same drive as your home directory is on?
Way back when I still had a much weaker 4-core CPU I had issues with window and mouse lagging when running certain heavy jobs as well, and it turned out that using ionice helped me a lot more than using nice.
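For anyone who wants to try the same thing, a minimal sketch that combines ionice with nice. run_idle_io is a hypothetical wrapper name. Class 3 is the "idle" I/O class, and whether it has any effect depends on the I/O scheduler your drive is using, so treat it as an experiment.

```shell
#!/bin/sh
# run_idle_io: hypothetical wrapper that starts a command in the idle
# I/O scheduling class (ionice -c 3) and at the lowest CPU priority.
# Both settings are inherited by child processes.
run_idle_io() {
    ionice -c 3 nice -n 19 "$@"
}

# Usage:
#   run_idle_io cargo build
```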
I also remember that fairly recently there was a KDE/plasma stutter bug due to it reading from ~/.cache constantly. Brodie Robertson talked about it: https://www.youtube.com/watch?v=sCoioLCT5_o
Yeah, I think the philosophy of Linux is to not assume what you are going to use it for. Why should Linux know where your priorities are better than you?
Some people want to run their rustc, ffmpeg or whatever intensive program and don't mind getting a coffee while that happens, or it's running on a non-user-facing server anyway, and they want the process to finish as soon as technically possible. Mind you that your case is not an "average use case" either; not everyone is a developer who does compilation tasks.
So you've got a point that the defaults could be improved for the desktop software developer user or somehow made more easily configurable. As suggested downthread, try the nice command, an optimized scheduler or kernel, or pick a distribution equipped with that kind of kernel by default. The beauty of Linux is that there are many ways to solve a problem, and with varying levels of effort you can get things to pretty much exactly where you want them, rather than some crowdpleasing default.
It sounds like the issue is that the Rust compiler uses 100% of your CPU capacity. Is there any command line option for it that throttles the amount of CPU it will use? This doesn't sound like an issue that you should be tackling at the OS level. Maybe you could wrap the compiler in a Docker container and use resource constraints?