1y ago

Is there a way to keep Linux responsive when at ~100% CPU usage?

One big difference that I've noticed between Windows and Linux is that Windows does a much better job ensuring that the system stays responsive even under heavy load.

For instance, I often need to compile Rust code. Anyone who writes Rust knows that the Rust compiler is very good at using all your cores and all the CPU time it can get its hands on (which is good, you want it to compile as fast as possible after all). But that means that for a time while my Rust code is compiling, I will be maxing out all my CPU cores at 100% usage.

When this happens on Windows, I've never really noticed. I can use my web browser or my code editor just fine while the code compiles, so I've never really thought about it.

However, on Linux when all my cores reach 100%, I start to notice it. It seems like every window I have open starts to lag and I get stuttering as the programs struggle to get a little bit of CPU that's left. My web browser starts lagging with whole seconds of no response and my editor behaves the same. Even my KDE Plasma desktop environment starts lagging.

I suppose Windows must be doing something clever to somehow prioritize user-facing GUI applications even in the face of extreme CPU starvation, while Linux doesn't seem to do a similar thing (or doesn't do it as well).

Is this an inherent problem of Linux at the moment or can I do something to improve this? I'm on Kubuntu 24.04 if it matters. Also, I don't believe it is a memory or I/O problem as my memory is sitting at around 60% usage when it happens with 0% swap usage, while my CPU sits at basically 100% on all cores. I've also tried disabling swap and it doesn't seem to make a difference.

EDIT: Tried nice -n +19, still lags my other programs.

EDIT 2: Tried installing the Liquorix kernel, which is supposedly better for this kinda thing. I dunno if it's placebo but stuff feels a bit snappier now? My mouse feels more responsive. Again, dunno if it's placebo. But anyways, I tried compiling again and it still lags my other stuff.

122 comments

Responsiveness for typical everyday usage is one of the main scenarios kernels like Zen/Liquorix and their out of the box scheduler configurations are meant to improve, and in my experience they help a lot. Maybe give them a go sometime!
Edit: For added context, I remember Zen significantly improving responsiveness under heavy loads such as the one OP is experiencing back when I was experimenting with some particularly computationally intensive tasks
- https://github.com/zen-kernel/zen-kernel/wiki/Detailed-Feature-List
  That's the reason I installed Zen too and use it as the default. While Zen is meant to improve responsiveness of interactive usage on the system, it comes at a price. The overall performance might decrease and it should require more power. But if someone needs to solve the problem of the OP (need to work on the computer while under heavy load), then Zen is probably the right tool. Some distributions have the Zen Kernel in their repository and the install process is straightforward.
  
  Very good points, it's all trade-offs at the end of the day. I've always found them more than worth it myself for non server workloads, but as always YMMV.

nice -n 5 cargo build
nice is a program that sets priorities for the CPU scheduler. Default is 0. Goes from -20, which is max prio, to +19, which is min prio.
This way other programs will get CPU time before cargo/rustc.
- So the better approach would be to spawn all desktop and base GUI things with nice -18 or something?
  
  No. This will wreak havoc. At most at -1 but I'd advise against that. Just spawn the lesser-prioritised programs with a positive value.
- It's more of a workaround than a solution. I don't want to have to do this for every intensive program I run. The desktop should just be responsive without any configuration.
  
  You could give your compiler a lower priority instead of upping everything else.
  
  Yes, this is a bad solution. No program should have that privilege, it needs to be an allowlist and not a blocklist.

The System76 scheduler helps to tune for better desktop responsiveness under high load: https://github.com/pop-os/system76-scheduler I think if you use Pop!_OS this may be set up out-of-the-box.
- I distro hop occasionally but always find myself coming back to popos. There are so many quality of life improvements that seem small but make all the difference.

You could try using nice to give the rust compiler less priority (higher number) for scheduling.
- This seems too complicated if I need to do that for other programs as well.
  
  You can just alias to do this in the programs you do use
  Sure, the first time you won't have this enabled, but after that it just works.
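
The alias idea in this sub-thread can be sketched as a small shell function. This is only a sketch under assumptions: the name `lowprio` is made up for illustration, and niceness 19 is simply the lowest CPU priority that nice can request.

```shell
# Hypothetical helper (the name `lowprio` is an assumption, not a standard tool).
# Runs any command at niceness 19, the lowest CPU priority nice can request.
lowprio() {
    nice -n 19 "$@"
}

# Optional: route every cargo call through it. nice looks up `cargo` on PATH,
# so this function does not call itself recursively.
cargo() {
    lowprio cargo "$@"
}
```

Dropped into ~/.bashrc, this would make `cargo build` run niced without extra typing, and `lowprio make -j16` would work the same way for other tools. Niceness only matters under contention, so the build still gets the whole CPU when nothing else wants it.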

Lots of bad answers here. Obviously the kernel should schedule the UI to be responsive even under high load. That’s doable; just prioritise running those over batch jobs. That’s a perfectly valid demand to have on your system.
This is one of the cases where Linux shows its history as a large shared unix system and its focus as a server OS; if the desktop is just a program like any other, who’s to say it should have more priority than Rust?
I’ve also run into this problem. I never found a solution for this, but I think one of those fancy new schedulers might work, or at least is worth a shot. I’d appreciate hearing about it if it does work for you!
Hopefully in a while there are separate desktop-oriented schedulers for the desktop distros (and ideally also better OOM handlers), but that seems to be a few years away maybe.
In the short term you may have some success in adjusting the priority of Rust with nice, an incomprehensibly named tool to adjust the priority of your processes. High numbers = low priority (the task is “nicer” to the system). You run it like this: nice -n5 cargo build.
- Obviously the kernel should schedule the UI to be responsive even under high load.
  Obviously... to you.
  This is one of the cases where Linux shows its history as a large shared unix system and its focus as a server OS; if the desktop is just a program like any other,
  Exactly.
  
  Obviously… to you.
  No. I'm sorry but if you are logged in with a desktop environment, obviously the UI of that desktop needs to stay responsive at all times, also under heavy load. If you don't care about such a basic requirement, you could run the system without a desktop or you could tweak it yourself. But the default should be that a desktop is prioritized and input from users is responded to as quickly as possible.
  This whole "Linux shouldn't assume anything"-attitude is not helpful. It harms Linux's potential as a replacement for Windows and macOS and also just harms its UX. Linux cannot ever truly replace Windows and macOS if it doesn't start thinking about these basic UX guarantees, like a responsive desktop.
  This is one of the cases where Linux shows its history as a large shared unix system and its focus as a server OS; if the desktop is just a program like any other,
  Exactly.
  You say that like it's a good thing; it is not. The desktop is not a program like any other, it is much more important that the desktop keeps being responsive than most other programs in the general case. Of course, you should have the ability to customize that but for the default and the general case, desktop responsiveness needs to be prioritized.
  
  I meant, obviously in the sense that Windows and macOS both apparently already do this and that it’s a desirable property to have, not that it’s technically easy.

It really depends on your desktop. For instance gnome handles high CPU very well in my experience.
I would run your compiler in a podman container with a CPU cap.
Edit: it might be related to me using Fedora

"The kernel runs out of time to solve the NP-complete scheduling problem in time."
More responsiveness requires more context-switching, which then subtracts from the available total CPU bandwidth. There is a point where the task scheduler and CPUs get so overloaded that a non-RT kernel can no longer guarantee timed events.
So, web browsing is basically poison for the task scheduler under high load. Unless you reserve some CPU bandwidth (with cgroups, etc.) beforehand for the foreground task.
Since SMT threads also aren't real cores (about ~0.4 - 0.7 of an actual core), putting 16 tasks on a 16/8 machine is only going to slow down the execution of all other tasks on the shared cores. I usually leave one CPU thread for "housekeeping" if I need to do something else. If I don't, some random task is going to be very pleased by not having to share a core. That "spare" CPU thread will be running literally everything else, so it may get saturated by the kernel tasks alone.
nice +5 is more of a suggestion to "please run this task with a worse latency on a contended CPU.".
(I think I should benchmark make -j15 vs. make -j16 to see what the difference is)
- That's all fine, but as I said, Windows seems to handle this situation without a hitch. Why can Windows do it when Linux can't?
  Also, it sounds like you suggest there is a tradeoff between bandwidth and responsiveness. That sounds reasonable. But shouldn't Linux then allow me to easily decide where I want that tradeoff to lie? Currently I only have workarounds. Why isn't there some setting somewhere to say "Yes, please prioritise responsiveness even if it reduces bandwidth a little bit". And that probably ought to be the default setting. I don't think a responsive UI should be questioned - that should just be a given.
  
  You're right of course. I think the issue is that Linux doesn't care about the UI. As far as it is concerned GUI is just another program. That's the same reason you don't have things like ctrl-alt-del on Linux.
  
  I agree that UI should always take priority. I shouldn't have to do anything to guarantee this.
  I have HZ_1000, tickless kernel with nohz_full set up. This all has a throughput/bandwidth cost (about 2%) in exchange for better responsiveness by default.
  But this is not enough, because the short burst UI tasks need near-zero wake-up latency... By the time the task scheduler has done its re-balancing the UI task is already sleeping/halted again, and this cycle repeats. So the nice/priorities don't work very well for UI tasks. Only way a UI task can run immediately is if it can preempt something or if the system has a somewhat idle CPU to put it on.
  The kernel doesn't know any better which tasks are like this. The on-going EEVDF, sched_ext scheduler projects attempt to improve the situation. (EEVDF should allow specifying the desired latency, while sched_ext will likely allow tuning the latency automatically)
  
  Why can Windows do it when Linux can’t?
  Windows lies to you. The only way they don't get this problem is that they are reserving some CPU bandwidth for the UI beforehand. Which explains the 1-2% y-cruncher worse results on windows.

I face a similar issue when updating Steam games, although I think that's related to disk reads and writes.
But either way, issues like these are gonna need to be addressed before we finally hit the year of the Linux desktop lol

Sounds like Kubuntu's fault to me. If they provide the desktop environment, shouldn't they be the ones making it play nice with the Linux scheduler? Linux is configurable enough to support real-time scheduling.
FWIW I run NixOS and I've never experienced lag while compiling Rust code.
- I have a worrying feeling that if I opened a bug for the KDE desktop about this, they'd just say it's a problem of the scheduler and that's the kernel so it's out of their hands. But maybe I should try?
  
  The kde peeps are insanely nice so I guess you should try.

If you compile on windows server the same problem happens. The server is basically gone. So there seems to be some special scheduler configuration in windows client os.
- I wonder if Linux should also provide server and desktop variants like Windows does, with different scheduler settings and such. The use cases are quite different after all, it's kinda weird they use the same settings.
  
  it's typically up to the distribution to configure things like that, and many Linux distributions do come in both server and desktop or workstation variants like Ubuntu desktop vs Ubuntu server, or RHEL server vs RHEL Workstation
  I can't say how well they tune these things as I haven't run them personally, but they do exist.

So I just tried using nice -n +19 and it still lags my browser and my UI. So that's not even a good workaround.
- Is your browser Firefox?
  What kind of storage devices do you have? NVMe?
  Did you check with tools like iotop to see if something is going on IO wise?
  You assumed that the problem is caused by the CPU being utilized at 100%.
  This may not be the case.
  A lot of us don't run a DE at all. I myself use Awesome WM.
  For non-tilers, Openbox with some toolbar would be the ideal setup.
  I mention this because we (non-DE users) would have no experience with some funky stuff like a possible KDE indexer running in the background killing IO performance and thrashing buffered/cached memory.
  Also, some of us run firefox with eatmydata because we hate fsync 🤨
  Neither KDE nor Gnome is peak Desktop Linux experience.
  Ubuntu and its flavors is not peak distro experience either.
  If you want to try Desktop Linux for real, you will need to dip your toes a little bit deeper.
  
  Yes Firefox, yes NVMe. No, there is no IO happening and again, sitting at relatively low memory usage. I was not running anything else than the compiler, my editor and Firefox. I'm fairly confident the CPU usage is the culprit as memory usage is not severely affected and disk usage by the compiler should be pretty minimal (and I don't see how disk usage would make Firefox slow if there's still plenty of RAM available).
  Neither KDE nor Gnome is peak Desktop Linux experience. Ubuntu and its flavors is not peak distro experience either.
  If you want to try Desktop Linux for real, you will need to dip your toes a little bit deeper.
  I've heard much of the opposite - KDE is touted as an easy-to-use desktop and Ubuntu is largely a popular "just works" distro. And honestly that has been my primary experience. Mostly everything works, but there are some hiccups here and there like the problem I posted about in this thread.
  What alternative would you suggest?

Found this for your problem of limiting one specific program such as the Rust compiler: https://askubuntu.com/questions/1367612/how-can-i-limit-the-cpu-and-ram-usage-for-a-process
- I don't really want to limit the Rust compiler. If I leave my computer running while I take a break, I don't want it to artificially throttle the compiler. I just want user input and responsiveness of open windows to take priority over the compiler.
- OP most likely wants the opposite for the compiler...

Firefox on my raspberry pi grinds the thing to a halt, so I created a shortcut:
systemd-run --scope -p MemoryLimit=500M -p CPUQuota=50% firefox-esr
You say it doesn't top out on memory, so you don't need the -p MemoryLimit=500M parameter. Set your compiler CPUQuota to maybe 80%, or whatever you can work out with trial and error.

All the comments here are great. One other suggestion I didn't see: use chrt to start the build process with the SCHED_BATCH policy. It's disfavored relative to SCHED_OTHER, which most processes will be in, so the compilation processes should be bumped off the CPU for virtually everyone else.
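
A minimal sketch of the chrt suggestion, assuming the util-linux `chrt` tool is installed. The helper name `batch` is hypothetical. The function probes first and falls back to a plain run if the policy change is refused (for example inside a restricted container).

```shell
# Hypothetical helper: run a command under the SCHED_BATCH policy when possible.
# Falls back to running it normally if chrt is missing or the change is refused.
batch() {
    if command -v chrt >/dev/null 2>&1 && chrt --batch 0 true 2>/dev/null; then
        # Priority must be 0 for SCHED_BATCH; the policy is inherited by children.
        chrt --batch 0 "$@"
    else
        "$@"
    fi
}
```

Usage would look like `batch cargo build --release`. SCHED_BATCH only mildly disfavors the task on wakeups, so whether this is enough is something to measure; `chrt --idle 0` (SCHED_IDLE) is the stronger variant if it is not.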

Yep, CPU scheduler is the correct answer. I'd recommend reading this Arch wiki page on it: https://wiki.archlinux.org/title/improving_performance

Linux defaults are optimized for performance and not for desktop usability.
- If that is the case, Linux will never be a viable desktop OS alternative.
  Either that needs to change or distributions targeting the desktop need to do it. Maybe we need desktop and server variants of Linux. It kinda makes sense as these use cases are quite different.
  EDIT: I'm curious about the down votes. Do people really believe that it benefits Linux to deprioritise user experience in this way? Do you really think Linux will become an actual commonplace OS if it keeps focusing on "performance" instead of UX?
  
  Linux is already a popular and viable desktop OS - for its target audience.
  The downvote comes from you implying people cannot dev in Linux when it's the platform of choice for this workload.
  Now surely the user experience could be polished, but advanced users are at this point used to the workflow, and basic ones will stick to Windows out of inertia no matter what. Therefore the incentive for improving this kind of thing is extremely low.
  
  "Desktop" Linux has existed in this state for decades. Who cares? Maybe we won't even have consumer desktops as a niche soon. Existing users are fine with that. Don't say you are expecting Linux to become "a viable desktop OS alternative" in the next few years.
  It's also not about "desktop and server variants". Desktop Linux is either conservative or underresourced. The conservatives will tell you that you are wrong and there is no issue, and they are the major Linux zealots. On the other side, someone needs to write the code and do the system design, and there are not many people for that. So it's better not to expect a solution anytime soon, unless you are planning to work on it yourself.

This hasn't been my experience when no swapping is involved (not a concern for me anymore with 32GiB physical RAM with 28GiB zram).
And I've been Rusting since v1.0, and Linuxing for even longer.
And my setup is boring (and stable), using Arch's LTS kernel which is built with CONFIG_HZ=300. Long gone are the days of running linux-ck.
Although I do use the Cranelift backend now day to day, so compiles don't take too long anyway.
- P.S. Since it wasn't mentioned already, look up cgroups.
  Back when I had a humble laptop (pre-Rust), using nice and co. didn't help much. Custom schedulers come with their own stability and worst-case-scenario baggage. cgroups should give you supported and well-tested tunable kernel-level resource usage control.
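
One way to try the cgroups route without writing cgroup files by hand is systemd's `CPUWeight=` property, which sets a proportional share (default 100) rather than a hard cap. The sketch below is an assumption-laden illustration: the helper name `low_weight`, the default weight of 20, and the `DRY_RUN` and `CPU_WEIGHT` variables are all made up, and it presumes a systemd user session with the cpu controller delegated.

```shell
# Hypothetical helper: run a command in its own transient scope with a low
# CPU weight. DRY_RUN=1 prints the command line instead of executing it.
low_weight() {
    # Prepend the systemd-run invocation to the caller's command line.
    set -- systemd-run --user --scope -p "CPUWeight=${CPU_WEIGHT:-20}" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

`DRY_RUN=1 low_weight cargo build` shows what would be run. Unlike `CPUQuota=`, a weight does not throttle the build on an otherwise idle machine; it only shifts the split when other cgroups are competing for the CPU.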

While I ultimately think your solution is to use a different scheduler, and that the most useful responses you've gotten have been about that; and that I agree with your response that Linux distros should really be tuning the scheduler for the UI by default and let developers and server runners take the burden of tuning differently for their workloads... all that said, I can't let this comment on your post go by:
which is good, you want it to compile as fast as possible after all
If fast compile times are your priority, you're using the wrong programming language. One of Go's fundamental principles is fast compile times; even with add-on caching tooling in other languages, Go remains one of the fastest-compiling statically compiled, strongly typed programming languages available. I will not install Haskell programs unless they're precompiled bin packages, that's a hard rule. I will only reluctantly install Rust packages, and will always choose bins if available. But I'll pick a -git Go package without hesitation, because they build crazy fast.
Anyway, I hope you find the scheduler of your dreams and live happily ever after.
- I only said as fast as possible - I generally think the compile times are fine and not a huge problem. Certainly worth it for all the benefits.
  
  There’s no free lunch after all. Go’s quick compilation also means the language is very simple, which means all the complexity shifts to the program’s code.

TLDR you might be interested in the rust-based scheduler one of the Canonical Devs released as a PoC. Seemed to be designed similar to your needs of keeping the system (particularly games) responsive even whilst running heavy tasks like kernel compilations. You can swap out schedulers at run time on Linux iirc?
https://www.phoronix.com/news/Rust-Linux-Scheduler-Experiment
- Interesting, thanks for sharing

I always did make -j$(nproc --ignore=1) to avoid this while building cpp code. But this problem seems to be less severe if there are a lot of cores.
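
The same trick can be carried over to cargo, as a sketch. `CARGO_BUILD_JOBS` is cargo's environment override for the `build.jobs` setting; leaving exactly one thread free is just this comment's rule of thumb, not a tuned value.

```shell
# Leave one hardware thread free for the desktop.
# nproc never prints less than 1, even on a single-core machine.
CARGO_BUILD_JOBS="$(nproc --ignore=1)"
export CARGO_BUILD_JOBS
echo "cargo will use $CARGO_BUILD_JOBS parallel jobs"
```

With this in the shell profile, a plain `cargo build` picks up the reduced job count; `cargo build -j "$(nproc --ignore=1)"` is the one-off equivalent.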

My work windoz machine clogged up quite a lot when recompiling large projects (GBs of C/C++ code), so I set it to use 19/20 "cores". Worked okayish but was not a snappy experience IMO (64GB RAM & SSD).
- What desktop?
  
  Wooden IKEA one.

Cpulimit?

Actually, I've experienced the opposite. I find Windows lagging much more often than Linux when compiling something, especially since Linux switched to the EEVDF scheduler. The most important factor that influences lag on both systems seems to be the power profile though. If I set my power profile to save battery, the system lags from time to time but if I set it to performance it basically never happens (on GNOME you can change that in the quick menu, not sure about KDE). It might be that your Windows is simply tuned more towards performance by default at the cost of higher power consumption.

Are you on X11 or Wayland? For me X11 behaves really badly in these situations, and Wayland is much, much snappier.
- I am on Wayland actually
  
  Then it's Wayland's fault haha! Nah, hopefully it gets better!

Install a real time kernel such as Xanmod

EDIT: Tried nice -n +19, still lags my other programs.
Yeah, this is the wrong way of doing things. You should have better results with CPU pinning. Increasing priority for YOUR threads that interact all the time with disk IO, memory caches and display IO is the wrong end of the stick. You still need to display compilation progress, warnings, and access IO.
There's no way of knowing why your system is so slow without profiling it first. Taking any advice from here or elsewhere without telling us first what your machine is doing is missing the point. You need to find out what the problem is and report it at the source.
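
A cheap first profiling step, sketched under the assumption of a kernel built with pressure stall information (PSI, available since Linux 4.20): the files under /proc/pressure report how much time tasks spent stalled waiting for CPU, I/O or memory. The helper name `psi_report` is hypothetical.

```shell
# Hypothetical helper: print the "some" stall line for each resource.
# avg10/avg60/avg300 are the percentage of time at least one task was stalled.
psi_report() {
    for res in cpu io memory; do
        f="/proc/pressure/$res"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$res" "$(head -n 1 "$f" 2>/dev/null)"
        else
            printf '%s: PSI not available\n' "$res"
        fi
    done
}

psi_report
```

Running it once while idle and once during the build gives a rough answer to the CPU-versus-I/O question: if only the cpu line climbs during compilation, CPU contention is the likely culprit; a rising io line points at the disk instead.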

Just use -j, --jobs: https://doc.rust-lang.org/cargo/commands/cargo-build.html#miscellaneous-options
- I actually tried that but I had to reduce it all the way to 4 jobs, which slows compilation down a lot.
- The CPU is already 100% busy, so changing number of compilation jobs won't help, CPU can't go faster than 100%.

I experience the exact same thing.
The key is that you need to allowlist processes in your OOM killer. There are different implementations like oomd or earlyoom.
Oomd freezes and doesn't kill, and I suppose distros do a bad job at allowlisting the desktop etc. in there.
- As I mention at the end, this situation has nothing to do with running out of memory. It's purely CPU starvation.
  
  Thanks, yes this can have 2 causes and both need to be fixed.
- Maybe it is distro specific
  In Fedora Workstation it does its job well. I sometimes run too many VMs at once and it hangs for a second before killing the VM.

Hmm, I can't say that I've ever noticed this. I have a 3950x 16-core CPU and I often do video re-encoding with ffmpeg on all cores, and occasionally compile software on all cores too. I don't notice it in the GUI's responsiveness at all.
Are you absolutely sure it's not I/O related? A compile is usually doing a lot of random IO as well. What kind of drive are you running this on? Is it the same drive as your home directory is on?
Way back when I still had a much weaker 4-core CPU I had issues with window and mouse lagging when running certain heavy jobs as well, and it turned out that using ionice helped me a lot more than using nice.
I also remember that fairly recently there was a KDE/plasma stutter bug due to it reading from ~/.cache constantly. Brodie Robertson talked about it: https://www.youtube.com/watch?v=sCoioLCT5_o
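
For anyone who wants to test the ionice idea, here is a minimal sketch assuming the util-linux `ionice` tool. The helper name `io_idle` is hypothetical. Class 3 is the "idle" I/O class, and the function falls back to nice alone if ionice is missing or refused.

```shell
# Hypothetical helper: run a command with idle I/O priority and low CPU priority.
io_idle() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        ionice -c 3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

Usage would be `io_idle cargo build`. I/O priorities are only honored by some I/O schedulers (BFQ, for example), so on an NVMe drive using the `none` scheduler this may change nothing.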

Yeah I think the philosophy of Linux is to not assume what you are going to use it for. Why should Linux know where your priorities are better than you?
Some people want to run their rustc, ffmpeg or whatever intensive program and don't mind getting a coffee while that happens, or it's running on a non-user facing server anyway, to ensure that the process happens as soon as technically possible. Mind you that your case is not an "average usecase" either, not everyone is a developer that does compilation tasks.
So you've got a point that the defaults could be improved for the desktop software developer user or somehow made more easily configurable. As suggested downthread, try the nice command, an optimized scheduler or kernel, or pick a distribution equipped with that kind of kernel by default. The beauty of Linux is that there are many ways to solve a problem, and with varying levels of effort you can get things to pretty much exactly where you want them, rather than some crowdpleasing default.
- There's a setting in windows to change the priority management, most people never see it.
  By default it's configured for user responsiveness, but you can set it for service responsiveness.
  Though this is nothing like the process priority management in Linux, it's one setting, that frankly I've never seen it make any difference. At least with Linux you can configure all sorts of priority management, on the fly no less.
  Even with a server, you'd still want the UI to have priority. God knows when you do have to remote in, it's because you gotta fix something, and odds are the server is gonna be misbehavin' already.
  
  Even with a server, you'd still want the UI to have priority. God knows when you do have to remote in, it's because you gotta fix something, and odds are the server is gonna be misbehavin' already.
  That's a fair point.
  I still contend that regularly running processes that hog every available CPU cycle they can get their hands on is not a common enough desktop use case to warrant changing the defaults. It should be up to the user to configure to their needs. That said, a toggle switch like the hidden Windows setting you described would be nice.
- Why should Linux know where your priorities are better than you?
  Because a responsive desktop is basic good UX that should never ever be questioned. That should at least be the default and if you don't want your desktop to have special priority, then you can configure it yourself.
  pick a distribution equipped with that kind of kernel by default.
  I'm running Kubuntu, an official variant of Ubuntu which is very much a "just works" kind of distribution - yet this doesn't just work.
  
  What if I know it will compile for several minutes so I leave it alone to go office chair jousting? It would be fair to lock up the UI in this case.

Ha, that's funny. When I run some Visual Studio builds on Windows it completely freezes at times.
Never have that issue on EOS with KDE.

No. And even worse is Linux's OOM behaviour - 99% of the time it just reboots the machine! Yes I have swap and zswap.
Linux is just really bad at desktop.
- Try oomd https://github.com/facebookincubator/oomd
  
  This should work out of the box!
- I haven't had that experience. (Systemd oom)

It sounds like the issue is that the Rust compiler uses 100% of your CPU capacity. Is there any command line option for it that throttles the amount of cpu it will use? This doesn't sound like an issue that you should be tackling at the OS level. Maybe you could wrap the compiler in a docker container and use resource constraints?
- Why is that a problem? You'd want a compiler to be as fast as possible. nice would be way easier to use than a container...
- It sounds like the issue is that the Rust compiler uses 100% of your CPU capacity.
  No, I definitely want it to use as many resources it can get. I just want the desktop and the windows I interact with to have priority over the compiler, so that the compiler doesn't steal CPU time from those programs.
  
  No, I definitely want it to use as many resources it can get.
  
  taskset -c 0 nice -n+5 bash -c 'while :; do :; done' & taskset -c 0 nice -n+0 bash -c 'while :; do :; done'
  Observe the CPU usage of the nice +5 job: it's roughly 1/3 of the nice +0 job (about 25% vs. 75% of the core). End one of the tasks and the remaining one jumps back to 100%.
  Nice'ing doesn't limit the max allowed cpu bandwidth of a task; it only matters when there is contention for that bandwidth, like running two tasks on the same CPU thread. To me, this sounds exactly what you want: run at full tilt when there is no contention.
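
The split under contention can be estimated with a little arithmetic. The sketch below assumes the usual rule of thumb that each nice level scales a task's scheduler weight by about 1.25x; the kernel uses a lookup table close to that, so treat the numbers as approximate. The function name `niced_share` is made up.

```shell
# Approximate CPU share (in percent) of a task at niceness N when it competes
# with a single nice-0 task on the same CPU.
# Assumption: weight ratio between the two tasks is about 1.25^N.
niced_share() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", 100 / (1 + 1.25 ^ n) }'
}

niced_share 5    # roughly a quarter of the CPU
niced_share 19   # low single digits
```

This only models two busy tasks pinned to one CPU, as in the taskset experiment. It says nothing about wake-up latency, which is what short-lived UI tasks actually care about.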
