RHEL Advanced #2: Kernel Tuning — sysctl, tuned, kdump

In #1 Boot Process we saw how the kernel gets loaded into memory. This post picks up from there. Once boot finishes, we tune the kernel parameters to fit the workload, swap workload profiles in a single line, and — for the rare occasion when the kernel dies in a panic — capture a memory dump at that exact moment for post-mortem analysis.

Position of this post in the RHEL Advanced series:

  • #1 Boot Process — GRUB2, dracut, Recovery Mode
  • #2 Kernel Tuning — sysctl, tuned, kdump ← this post
  • #3 Performance Analysis — sar, top/htop, iostat, vmstat, perf
  • #4 SELinux Advanced — Writing Policy, audit2allow
  • #5 Security Hardening — auditd, OpenSCAP, FIPS
  • #6 Subscription / Satellite / Insights
  • #7 Cockpit for GUI Management and Web Console

Where Each of the Three Tools Sits

| Tool | What it adjusts | When applied |
|---|---|---|
| sysctl | Kernel parameters (vm, net, kernel, fs) | Immediately at runtime + on every boot |
| tuned | Predefined workload profiles (sysctl + cpufreq + io-scheduler bundles) | Immediately when a profile is applied |
| kdump | Capture a memory dump at the moment of kernel panic | At the moment of panic |

sysctl is the per-line tool you touch directly, tuned is the abstraction that bundles those lines into workload-sized profiles and applies them at once, and kdump is the safety net that only kicks in when the kernel dies. Together they form a coherent operational set.

sysctl: Runtime Kernel Parameters

The Linux kernel exposes its parameters as files under /proc/sys/. sysctl is the command that reads and writes those files.

basic usage
# list every parameter
$ sudo sysctl -a | less

# read a specific parameter
$ sudo sysctl vm.swappiness
vm.swappiness = 30

# change immediately (lost on reboot)
$ sudo sysctl -w vm.swappiness=10

A dotted key like vm.swappiness is just notation for /proc/sys/vm/swappiness.

two equivalent forms
$ sudo sysctl vm.swappiness
$ cat /proc/sys/vm/swappiness   # same value
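
The key-to-path mapping can be sketched as a small shell function. The name sysctl_key_to_path is made up for this illustration, and the sketch assumes no path component contains a dot of its own.

```shell
# Map a dotted sysctl key to its file under /proc/sys.
# Assumption: no component contains a dot of its own (a VLAN interface
# name such as eth0.100 would need extra handling).
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_key_to_path vm.swappiness
sysctl_key_to_path net.ipv4.tcp_fin_timeout
```

The two calls print /proc/sys/vm/swappiness and /proc/sys/net/ipv4/tcp_fin_timeout, which are the files sysctl reads for those keys.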

Permanent settings: /etc/sysctl.d/

To apply settings on every boot, write them to a file. The RHEL 9 convention is to drop small, single-purpose *.conf files into /etc/sysctl.d/.

/etc/sysctl.d/99-tune.conf
# memory / swap
vm.swappiness = 10
vm.dirty_ratio = 20
vm.dirty_background_ratio = 5

# network
net.core.somaxconn = 4096
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_tw_reuse = 1

# file descriptors
fs.file-max = 2097152

apply
# only this file
$ sudo sysctl -p /etc/sysctl.d/99-tune.conf

# every standard location (what systemd-sysctl does automatically at boot)
$ sudo sysctl --system

The single file /etc/sysctl.conf still exists for backward compatibility, but the operational standard is the split files under /etc/sysctl.d/. It is easier to trace where a change came from by filename, and it pairs well with Ansible or packages that drop bundled settings as a single file.

Precedence and the filename convention

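sysctl --system gathers *.conf files from the directories below, merges them, and applies them in lexicographic order of file name. For a given key, the file applied later wins. When two directories contain a file with the same name, only the copy from the directory listed first is read; this is how a file in /etc/sysctl.d/ masks a vendor file of the same name in /usr/lib/sysctl.d/. /etc/sysctl.conf is applied after all of them.

directory priority for same-named files (first wins)
/etc/sysctl.d/*.conf
/run/sysctl.d/*.conf
/usr/lib/sysctl.d/*.conf

Because the order is by file name, operational settings usually carry a 99- prefix so they are applied after vendor defaults (e.g., 99-tune.conf).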
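
To see which files would be applied and in what order, the rule can be sketched in a few lines of shell. This is an illustration of the behaviour described in sysctl(8), not the tool's own code: for files sharing a name the first directory wins, and the surviving files are then sorted by file name. The function name sysctl_effective_order is hypothetical.

```shell
# Print sysctl.d files in the order they would be applied.
# Usage: sysctl_effective_order DIR...   (highest-priority directory first)
sysctl_effective_order() {
    for dir in "$@"; do
        for file in "$dir"/*.conf; do
            [ -e "$file" ] || continue                 # glob matched nothing
            printf '%s\t%s\n' "${file##*/}" "$file"    # basename <TAB> full path
        done
    done |
    awk -F '\t' '!seen[$1]++' |                        # first directory wins per name
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 |          # then order by file name
    cut -f2
}
```

On a real host it would be called as sysctl_effective_order /etc/sysctl.d /run/sysctl.d /usr/lib/sysctl.d. The last path printed is the last file applied before /etc/sysctl.conf.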

Frequently touched keys

A bundle of keys you reach for often in operations.

| Key | Meaning | Recommended |
|---|---|---|
| vm.swappiness | Page cache vs swap preference (0 = avoid swap, 100 = swap aggressively) | 10 for servers, 1 to 5 for DBs |
| vm.dirty_ratio | Dirty page limit (% of memory); writers flush synchronously past this | 20 |
| vm.overcommit_memory | Memory overcommit policy | 1 for Redis; for other databases follow the vendor's guidance |
| net.core.somaxconn | Max listen() queue size | 4096+ |
| net.ipv4.tcp_max_syn_backlog | SYN queue | 4096+ |
| net.ipv4.ip_local_port_range | Ephemeral port range | 1024 65535 |
| fs.file-max | System-wide file descriptor limit | 2 million+ |
| kernel.pid_max | Maximum PID | 4194304 with many containers |

DB workloads, web server workloads, and container hosts each call for different settings. Pin one standard file for each role, drop it under /etc/sysctl.d/, and keep consistency by copying the same file to every new machine.

Cases where a change does not stick

  • Read-only parameters: keys like kernel.osrelease cannot be changed after boot. Trying to write one fails with a permission denied error.
  • Boot-time-only settings: most vm.* keys are runtime-mutable, but some settings are only reliable when given on the kernel command line. A large vm.nr_hugepages reservation, for example, can fall short at runtime once memory is fragmented.
  • Inside a container: /proc/sys/ is partly namespaced. Namespaced keys (many net.* keys, for instance) affect only that container, while the rest are typically read-only there and must be changed on the host to take effect.
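
One way to catch a setting that did not stick is to compare a sysctl.d file with the live values. The sketch below assumes simple key = value lines; it does not handle sysctl's optional "-" prefix. The function name sysctl_drift and its optional PROC_ROOT argument are invented for this example, the latter so the function can be tried against a scratch directory.

```shell
# Report keys whose live value differs from a sysctl.d-style file.
# Usage: sysctl_drift FILE [PROC_ROOT]   (PROC_ROOT defaults to /proc/sys)
sysctl_drift() {
    conf=$1
    root=${2:-/proc/sys}
    # Drop comment lines (# or ;) and blank lines, then split at the first "=".
    grep -Ev '^[[:space:]]*([#;]|$)' "$conf" |
    while IFS='=' read -r key want; do
        key=$(printf '%s\n' "$key" | tr -d ' \t')
        want=$(printf '%s\n' "$want" | awk '{ $1 = $1; print }')   # normalise whitespace
        path=$root/$(printf '%s\n' "$key" | tr . /)
        if [ ! -r "$path" ]; then
            printf '%s: not readable here\n' "$key"
            continue
        fi
        have=$(awk '{ $1 = $1; print }' "$path")
        [ "$have" = "$want" ] || printf '%s: want %s, have %s\n' "$key" "$want" "$have"
    done
}
```

Running sysctl_drift /etc/sysctl.d/99-tune.conf after sysctl --system should print nothing. Any line it does print names a key worth checking against the three cases above.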

tuned: Workload Profiles

tuned is the daemon that bundles many tuning items — sysctl values + CPU governor + I/O scheduler + disk readahead and so on — into named profiles and applies them at once. RHEL 9 ships it pre-installed and starts it automatically at boot.

check enablement
$ sudo systemctl status tuned
$ sudo systemctl enable --now tuned

Listing and applying profiles

tuned-adm basics
# list available profiles
$ sudo tuned-adm list
Available profiles:
- accelerator-performance
- balanced                     - General non-specialized tuned profile
- desktop
- hpc-compute
- latency-performance          - Optimize for low latency at the cost of throughput
- network-latency
- network-throughput
- powersave
- throughput-performance       - Broadly applicable tuning that provides excellent...
- virtual-guest                - Optimize for running inside a virtual guest
- virtual-host                 - Optimize for running KVM guests
Current active profile: throughput-performance

# current profile
$ sudo tuned-adm active

# change profile
$ sudo tuned-adm profile virtual-guest

# recommended profile (RHEL inspects the machine and suggests one)
$ sudo tuned-adm recommend
virtual-guest

tuned-adm recommend automatically distinguishes bare metal, virtual machines, and laptops, and proposes a profile accordingly. Boot RHEL on a virtualized instance like EC2, and it suggests virtual-guest.
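
For fleet checks it can help to compare the active profile with an expected one in a script. The sketch below parses the "Current active profile:" line shown in the output above. The helper names are hypothetical, and the input is read from stdin so the functions can be exercised without tuned installed.

```shell
# Extract the profile name from `tuned-adm active` output on stdin.
tuned_active_profile() {
    sed -n 's/^Current active profile: //p'
}

# Succeed when the active profile (read from stdin) equals the expected name.
# Usage: tuned-adm active | tuned_profile_is virtual-guest
tuned_profile_is() {
    [ "$(tuned_active_profile)" = "$1" ]
}
```

A possible use is tuned-adm active | tuned_profile_is "$(tuned-adm recommend)", which returns non-zero when the machine has drifted away from the recommended profile.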

Frequently used profiles

| Profile | Where it fits |
|---|---|
| throughput-performance | General server default. CPU governor performance, relaxed dirty ratio |
| latency-performance | Response time first: trading systems, real-time processing |
| network-latency | latency-performance + network queue tuning |
| network-throughput | High-throughput networks (10G+ NICs) |
| virtual-guest | KVM/AWS/GCP guest default |
| virtual-host | KVM hypervisor host |
| powersave | Power saving (laptops, etc.) |
| accelerator-performance | GPU / accelerator workloads |

For DB machines start from throughput-performance or latency-performance; for cloud guests, virtual-guest.

What is inside a profile

profile definition
$ ls /usr/lib/tuned/throughput-performance/
tuned.conf

$ cat /usr/lib/tuned/throughput-performance/tuned.conf
# abridged; exact contents vary by tuned version
[main]
summary=...

[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100

[disk]
readahead=>4096

[sysctl]
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
vm.swappiness=10

Look at the [sysctl] section and you can see it is, in the end, just a bundle of sysctl keys. While the profile is active those keys hold the profile’s values; switching to a different profile reapplies the new values immediately.
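
To pull out just that bundle from any profile, a short awk filter is enough. This is a sketch with a made-up function name (tuned_sysctl_section). It reads a single file and does not follow include= to parent profiles.

```shell
# Print the key/value lines of the [sysctl] section of a tuned.conf.
# Usage: tuned_sysctl_section /usr/lib/tuned/throughput-performance/tuned.conf
tuned_sysctl_section() {
    awk '
        /^\[/ { in_sysctl = ($0 == "[sysctl]"); next }   # any section header
        in_sysctl && NF && $0 !~ /^[ \t]*#/              # body lines, minus comments
    ' "$1"
}
```

The output is a list of plain key = value lines, which makes it easy to diff two profiles or to compare a profile against a file under /etc/sysctl.d/.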

Custom profiles

It is common to inherit an existing profile and override only what you need.

custom profile directory
$ sudo mkdir -p /etc/tuned/myapp-throughput
$ sudo vi /etc/tuned/myapp-throughput/tuned.conf

/etc/tuned/myapp-throughput/tuned.conf
[main]
summary=Custom throughput profile for myapp
include=throughput-performance

[sysctl]
net.core.somaxconn = 16384
net.ipv4.tcp_max_syn_backlog = 16384
vm.swappiness = 1

[vm]
transparent_hugepages=never

apply
$ sudo tuned-adm profile myapp-throughput
$ sudo tuned-adm active
Current active profile: myapp-throughput

/etc/tuned/ is the user-defined area; /usr/lib/tuned/ is where the package ships its defaults. If a user profile and a system profile share the same name, the user profile wins.

Relationship between tuned and sysctl.d

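Values applied by tuned and values written under /etc/sysctl.d/ can collide. In principle the last write wins, but the ordering is not obvious. At boot, systemd-sysctl applies /etc/sysctl.d/ early and tuned starts later. By default, however, tuned reapplies the system sysctl files after its own [sysctl] section (reapply_sysctl = 1 in /etc/tuned/tuned-main.conf), so the /etc/sysctl.d/ values end up taking effect. With that option turned off, or for a value you set by hand with sysctl -w, running tuned-adm profile X overwrites the key at that moment.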

Operational guidance:

  • System-wide policy → /etc/sysctl.d/
  • Workload bundles → tuned profiles
  • If both touch the same key, unify on one side. Usually it is cleaner to move the key into the tuned profile and remove it from sysctl.d.
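
Finding the overlapping keys by eye gets tedious. The sketch below lists keys set both in a profile's [sysctl] section and in one or more sysctl.d files. The function name sysctl_tuned_collisions is invented here, and profiles pulled in through include= are not followed.

```shell
# List sysctl keys set both in a tuned profile's [sysctl] section and in
# one or more sysctl.d files.
# Usage: sysctl_tuned_collisions TUNED_CONF SYSCTL_FILE...
sysctl_tuned_collisions() {
    profile=$1
    shift
    awk -F '=' -v profile="$profile" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        /^[ \t]*([#;]|$)/ { next }                       # comments, blank lines
        FILENAME == profile {                            # first file: the profile
            if ($0 ~ /^\[/) { in_sysctl = ($0 == "[sysctl]"); next }
            if (in_sysctl) tuned[trim($1)] = 1
            next
        }
        (trim($1) in tuned) { print trim($1) ": " FILENAME }
    ' "$profile" "$@"
}
```

For the custom profile above, sysctl_tuned_collisions /etc/tuned/myapp-throughput/tuned.conf /etc/sysctl.d/*.conf would flag vm.swappiness and net.core.somaxconn if 99-tune.conf is also present.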

kdump: Memory Dump at Kernel Panic

If the kernel dies in a panic and you have a dump (vmcore) of memory captured at that moment, you can analyze it post-mortem with tools like crash after rebooting. kdump handles that capture.

How it works

The core idea of kdump is to keep two kernels in memory.

  1. At boot time, in addition to the normal kernel, a crash kernel is preloaded into memory (via the kexec mechanism).
  2. When the normal kernel panics, control jumps without a hardware reset to that preloaded crash kernel.
  3. The crash kernel writes the normal kernel’s memory region to disk as a vmcore file.
  4. The system then reboots normally.

Memory for the crash kernel is reserved at boot time, so a slice of RAM (typically a few hundred MB, sometimes more) is unavailable during normal operation. That cost is real, but kdump is close to mandatory on production machines where panic analysis matters.

Enablement

RHEL 9 usually ships with it enabled. Verify:

kdump status
$ sudo systemctl status kdump
$ sudo kdumpctl status

# memory load status
$ sudo cat /sys/kernel/kexec_crash_loaded
1     # 1 means loaded

If disabled:

enable
$ sudo dnf install -y kexec-tools
$ sudo systemctl enable --now kdump

The crashkernel parameter

The memory to reserve is specified with the crashkernel= kernel command-line argument. The crashkernel=auto setting from RHEL 8 is no longer supported in RHEL 9; the installer writes an explicit value instead, but some workloads require adjusting it.

check current value
$ cat /proc/cmdline
... crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M ...

# change to a specific value
$ sudo grubby --update-kernel=ALL --args="crashkernel=512M"
$ sudo reboot

crashkernel=512M reserves 512 MB regardless of RAM size. crashkernel=1G-4G:192M,4G-64G:256M,... picks a value by RAM size: each entry is range:size, and the first range that contains the system's RAM decides the reservation. Leaving the RHEL 9 default is usually safe.
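
The range syntax is easier to reason about with a small evaluator. The sketch below assumes ranges are start-inclusive and end-exclusive, handles only M and G suffixes, and ignores any @offset part. The function names are made up for the example, and RAM is passed in MiB as the kernel sees it, which is usually slightly less than the nominal size.

```shell
# Convert a size such as 192M or 4G to MiB.
to_mib() {
    case $1 in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print the reservation a crashkernel= value yields for a given RAM size.
# Usage: crashkernel_for SPEC RAM_MIB
crashkernel_for() {
    spec=$1
    ram=$2
    # A plain size (no colon) applies regardless of RAM.
    case $spec in *:*) ;; *) echo "$spec"; return 0 ;; esac
    old_ifs=$IFS; IFS=,
    set -- $spec                      # split the spec into range:size entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mib "${range%%-*}") || return 1
        end=${range#*-}               # empty for an open-ended range like 64G-
        if [ -n "$end" ]; then end=$(to_mib "$end") || return 1; fi
        if [ "$ram" -ge "$start" ] && { [ -z "$end" ] || [ "$ram" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    echo 0                            # no range matched: nothing is reserved
}
```

With the value shown above, crashkernel_for '1G-4G:192M,4G-64G:256M,64G-:512M' 8192 prints 256M, and a machine with less than 1 GiB of RAM gets no reservation at all.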

Where vmcore is stored

Configure where to store vmcore in /etc/kdump.conf.

key parts of /etc/kdump.conf
# default: local disk
path /var/crash
core_collector makedumpfile -l --message-level 7 -d 31

# send to NFS
# nfs nfs.example.com:/srv/crash

# send via SSH
# ssh user@dump-server.example.com
# sshkey /root/.ssh/kdump_id_rsa

# if the dump cannot be written, just reboot
# failure_action reboot

| Key | Meaning |
|---|---|
| path | vmcore storage path |
| core_collector makedumpfile -d 31 | Exclude zero, cache, user, and free pages from the dump to shrink it (-d 31 recommended) |
| nfs / ssh | Remote storage. Hedges against a broken local disk |
| failure_action | Action on dump failure (reboot, halt, poweroff, shell, dump_to_rootfs); older releases called this option default |

After changing settings:

rebuild initramfs and restart
$ sudo kdumpctl rebuild
$ sudo systemctl restart kdump

Test: trigger a panic on purpose

Never run this on a production machine — only on an isolated test machine:

forced panic (test environment only)
$ sudo sysctl -w kernel.sysrq=1
$ echo c | sudo tee /proc/sysrq-trigger

The machine panics immediately and the crash kernel writes a vmcore to /var/crash/<address>-<date>/vmcore. Verify after reboot.

check the result
$ ls /var/crash/
127.0.0.1-2026-04-27-10:30:00/

$ ls /var/crash/127.0.0.1-2026-04-27-10:30:00/
vmcore  vmcore-dmesg.txt

vmcore-dmesg.txt holds the kernel log from just before the panic, and on its own it is often enough for a first-pass diagnosis.

Analyze with crash

the crash tool
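$ sudo dnf install -y crash
# kernel-debuginfo must match the kernel that crashed; $(uname -r) is only right if you rebooted into that same kernel
$ sudo dnf install -y kernel-debuginfo-$(uname -r) --enablerepo=rhel-9-for-x86_64-baseos-debug-rpms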

$ sudo crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux \
             /var/crash/127.0.0.1-2026-04-27-10:30:00/vmcore

crash> bt        # stack trace at the moment of panic
crash> log       # kernel log
crash> ps        # process list
crash> mod       # loaded modules
crash> sys       # system info

A single bt (backtrace) shows which function the panic occurred in. Deep analysis is its own topic, but simply having a vmcore available is itself an operational safety net.

Common Pitfalls

  • Using only sysctl -w and not writing to a file — the change is lost at reboot. Permanent settings must go under /etc/sysctl.d/.
  • Cramming everything into the single /etc/sysctl.conf — you lose the ability to trace where a change came from. Split per topic into 99-app.conf, 99-network.conf, etc.
  • tuned and sysctl.d colliding on the same key — last applied wins. Unify on one side.
  • Insufficient kdump disk space — vmcore can be several GB. Keep room on the filesystem holding /var/crash.
  • Forgetting kdumpctl rebuild after editing kdump.conf — changes do not take effect. Always rebuild → restart.
  • Removing the crashkernel= argument — someone tidying up GRUB args drops it and kdump stops working from the next boot onward. Periodically verify with cat /proc/cmdline.
  • Setting vm.overcommit_memory=1 carelessly on a container host — it changes OOM patterns for some workloads. Validate per workload.
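
The two kdump pitfalls that fail silently (a missing crashkernel= argument and a crash kernel that is not loaded) can be turned into a periodic check. The sketch below uses a hypothetical function name, and the file paths are parameters so it can be tried against sample files.

```shell
# Check that kdump can actually capture a dump.
# Usage: kdump_preflight [CMDLINE_FILE] [LOADED_FILE]
kdump_preflight() {
    cmdline=${1:-/proc/cmdline}
    loaded=${2:-/sys/kernel/kexec_crash_loaded}
    status=0
    if ! grep -Eq '(^| )crashkernel=' "$cmdline"; then
        echo "crashkernel= missing from kernel command line"
        status=1
    fi
    if [ "$(cat "$loaded" 2>/dev/null)" != "1" ]; then
        echo "crash kernel not loaded"
        status=1
    fi
    return "$status"
}
```

Run from cron or a monitoring agent, a non-zero exit status means the next panic would leave no vmcore behind.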

Commands Worth Remembering

| Task | Command |
|---|---|
| Read sysctl value / change temporarily | sysctl <key> / sysctl -w <key>=<v> |
| Apply sysctl.d | sudo sysctl --system |
| Apply a single file | sudo sysctl -p /etc/sysctl.d/99-tune.conf |
| tuned active profile | sudo tuned-adm active |
| Switch tuned profile | sudo tuned-adm profile <name> |
| tuned recommendation | sudo tuned-adm recommend |
| kdump status | sudo kdumpctl status |
| Crash kernel loaded? | cat /sys/kernel/kexec_crash_loaded |
| Rebuild kdump | sudo kdumpctl rebuild && sudo systemctl restart kdump |
| Analyze vmcore | sudo crash <vmlinux> <vmcore> |

Wrap-up

  • sysctl: adjusts runtime kernel parameters; permanent settings go in separate files under /etc/sysctl.d/*.conf. A 99- prefix makes a file sort after vendor defaults so its values win.
  • tuned: bundles settings into workload profiles. throughput-performance (server default), virtual-guest (cloud guest); custom profiles live in /etc/tuned/, inherit a stock profile with include=, and override only what you need.
  • kdump: captures a memory dump at kernel panic. Reserve memory for the crash kernel via crashkernel=, store vmcore at /var/crash or send it to NFS / SSH. Analyze post-mortem with crash.
  • The roles: sysctl is per-line, tuned is the workload bundle, kdump is the panic safety net. When the same key collides, unify on one side.

The next post looks at what is eating the time on a machine where the kernel is running smoothly: performance analysis. We cover which tool to reach for — sar, top/htop, iostat, vmstat, perf — and which signals to read with each.
