RHEL Advanced #3: Performance Analysis — sar, top/htop, iostat, vmstat, perf

In #2 Kernel Tuning we looked at tools that tune the kernel for a given workload. This post points the other direction: when a machine is already running and has become slow, where to look first. We pin down which signal calls for which of the five tools — top, vmstat, iostat, sar, perf — and tie them together with one simple framework: USE (Utilization, Saturation, Errors).

The Four Resources and the USE Checklist

Performance problems ultimately trace back to one of four resources.

| Resource | What to look at |
| --- | --- |
| CPU | Utilization, run queue length, context switches |
| Memory | Usage, swap activity, page faults |
| Disk I/O | IOPS, throughput, await (response time), %util |
| Network | Bandwidth, packet rate, errors / drops |

For each one, look at the same three signals — Brendan Gregg’s USE methodology.

  • Utilization — how much of the resource is in use (% busy)
  • Saturation — is the resource piling up work it cannot handle (queue)
  • Errors — are there errors?

CPU may look busy (U high) but if nothing is queuing (S low) and there are no errors (E 0), the system is healthy. Conversely, CPU at 50% but a run queue of 30 means saturation is the real problem even though utilization looks ordinary. Always read U/S/E together for an accurate diagnosis.

First Glance: top / htop

The fastest tool to reach for, and standard procedure right after logging into a machine.

top basics
$ top
top - 14:23:15 up 12 days,  4:11,  2 users,  load average: 1.52, 0.98, 0.65
Tasks: 312 total,   2 running, 310 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.3 us,  2.1 sy,  0.0 ni, 88.5 id,  0.7 wa,  0.3 hi,  0.1 si,  0.0 st
MiB Mem :  15891.2 total,   3421.5 free,   8123.4 used,   4346.3 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   7234.8 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1234 postgres  20   0 1234567  98765  43210 R  45.2   0.6   3:12.45 postgres
 ...

First line: load average

The three numbers in load average: 1.52, 0.98, 0.65 are exponentially damped averages, over roughly the last 1 / 5 / 15 minutes, of the number of tasks that are runnable (R state) or in uninterruptible sleep (D state).

  • Compare with the CPU core count: on a 4-core machine, a load that stays above 4 suggests work is queuing for CPU, provided the tasks are runnable rather than blocked on I/O.
  • 1/5/15 trend: 1-min > 15-min means load is rising; the reverse means it is settling.
  • Because D state (usually disk I/O wait) is counted, a slow disk drives load up even when the CPU is idle.
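The core-count comparison and the 1-minute versus 15-minute trend can be sketched in a few lines of shell. This is a minimal illustration rather than a standard tool: load_per_core is a hypothetical helper name, and it takes the text of /proc/loadavg and a core count as arguments so it can be tried on saved values as well as live ones.

```shell
# load_per_core LOADAVG_LINE CORES
# LOADAVG_LINE is the text of /proc/loadavg, e.g. "1.52 0.98 0.65 2/312 4567".
# Prints each load figure divided by the core count, plus a rough trend
# (1-minute value compared with the 15-minute value).
load_per_core() {
    printf '%s\n' "$1" | awk -v n="$2" '{
        if (n <= 0) exit 1
        trend = ($1 > $3) ? "rising" : (($1 < $3) ? "settling" : "flat")
        printf "1m=%.2f 5m=%.2f 15m=%.2f per core, trend=%s\n", $1/n, $2/n, $3/n, trend
    }'
}

# Live usage on a Linux host:
#   load_per_core "$(cat /proc/loadavg)" "$(nproc)"
```

A per-core value above 1.00 only says that more tasks are runnable or blocked than there are cores; whether they are waiting on CPU or on disk still has to come from vmstat.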

CPU line: %Cpu(s)

| Field | Meaning |
| --- | --- |
| us | User-space CPU |
| sy | Kernel-space CPU |
| ni | User-space CPU spent on niced (lowered-priority) processes |
| id | Idle |
| wa | I/O wait: share of time the CPU is idle while I/O is outstanding |
| hi / si | hardirq / softirq |
| st | Steal: time the hypervisor took away in a virtualized environment |

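High wa points to the block I/O path, typically a slow or saturated disk. Ordinary network socket waits are not counted in wa, so check the network separately. High st (on cloud instances) hints at noisy-neighbor effects.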
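To make the fields less abstract, the sketch below derives the same breakdown from two samples of the aggregate cpu line in /proc/stat, which is roughly how top arrives at its percentages. cpu_breakdown is a hypothetical helper name; the field order it assumes (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented for /proc/stat.

```shell
# cpu_breakdown EARLIER LATER
# Each argument is one "cpu ..." line from /proc/stat; the earlier sample
# comes first. Prints the share of elapsed CPU time per category, in the
# same order as top's %Cpu(s) line.
cpu_breakdown() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i; next }
        {
            # $2..$9 = user nice system idle iowait irq softirq steal
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            if (total <= 0) exit 1
            printf "%.1f us, %.1f sy, %.1f ni, %.1f id, %.1f wa, %.1f hi, %.1f si, %.1f st\n",
                100*d[2]/total, 100*d[4]/total, 100*d[3]/total, 100*d[5]/total,
                100*d[6]/total, 100*d[7]/total, 100*d[8]/total, 100*d[9]/total
        }'
}

# Live usage on a Linux host:
#   a=$(head -n 1 /proc/stat); sleep 1; b=$(head -n 1 /proc/stat)
#   cpu_breakdown "$a" "$b"
```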

htop: a more readable top

install and use
# htop is packaged in EPEL, not in the base RHEL repositories, so EPEL must be enabled first
$ sudo dnf install -y htop
$ htop
  • Per-core usage bar graphs
  • Process tree view (F5)
  • Search (F3), filter (F4), kill (F9), nice change (F7/F8)
  • Mouse-click sorting

On production machines, assume only top is available: it is part of the default install, whereas htop has to be added from EPEL. As an interactive diagnostic tool, though, htop is much faster to work with.

Frequently used top shortcuts

| Key | Action |
| --- | --- |
| 1 | Expand per-core view |
| M | Sort by memory usage |
| P | Sort by CPU usage |
| T | Sort by cumulative time |
| c | Show full command lines |
| H | Expand thread-level view |
| k | Kill process |
| q | Quit |

vmstat: CPU, Memory, I/O on One Screen

vmstat at 1-second intervals
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 421888  35420 1234560    0    0     8    23  150  280  3  1 95  1  0
 0  0      0 421872  35420 1234560    0    0     0     0  120  240  1  0 99  0  0
 2  1      0 421860  35420 1234560    0    0     0   512  185  320  8  2 88  2  0

| Column | Meaning |
| --- | --- |
| r | Runnable processes: running or waiting for CPU (CPU saturation once it exceeds the core count) |
| b | Blocked: processes in uninterruptible sleep (usually I/O wait) |
| si / so | Swap in/out (KB/s): non-zero suggests memory pressure |
| bi / bo | Blocks in/out per second (1 KiB blocks, so effectively KB/s): disk reads / writes |
| in / cs | Interrupts / context switches per second |
| us / sy / id / wa / st | Same as top |

What to read

  • r > number of CPU cores — CPU saturation. Processes are queueing
  • si/so > 0 — pages going in / out of swap. Memory saturation
  • High wa and b > 0 — disk I/O bottleneck
  • Abnormally high cs — context-switch storm (too many threads, lock contention, etc.)
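The rules above can be applied mechanically to vmstat output. The following is a rough sketch under stated assumptions: flag_vmstat is a hypothetical helper name, the column positions are taken from the header shown earlier (r=1, b=2, si=7, so=8, wa=16), and the labels it prints are illustrative. It reads from stdin, so it works on saved output as well as a live run.

```shell
# flag_vmstat CORES
# Reads vmstat output on stdin and prints one line for every sample that
# trips at least one rule: r above the core count, any swap traffic, or
# processes blocked in uninterruptible sleep (wa is shown for context).
flag_vmstat() {
    awk -v cores="$1" '
        $1 !~ /^[0-9]+$/ { next }          # skip the two header lines
        {
            n++
            msg = ""
            if ($1 > cores)        msg = msg " cpu-saturated(r=" $1 ")"
            if ($7 > 0 || $8 > 0)  msg = msg " swapping(si=" $7 ",so=" $8 ")"
            if ($2 > 0)            msg = msg " blocked(b=" $2 ",wa=" $16 ")"
            if (msg != "") print "sample " n ":" msg
        }'
}

# Live usage: ten one-second samples, judged against the core count.
#   vmstat 1 10 | flag_vmstat "$(nproc)"
```

Sample 1 of a live run is the since-boot average that vmstat prints first, so a flag on that line says little about the present.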

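A single line of vmstat 1 shows CPU, memory, and disk I/O state together (network is the one resource it does not cover), which makes it the second tool to reach for in any diagnosis. Ignore the first line of output: it reports averages since boot, not the current interval.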

iostat: Drill Into Disks

iostat ships in the sysstat package.

install and basics
$ sudo dnf install -y sysstat
$ iostat -xz 1
Linux 5.14.0-... (host)   04/28/2026   _x86_64_   (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.32    0.00    1.84    2.41    0.00   90.43

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm  r_await  w_await aqu-sz  rareq-sz  wareq-sz  svctm  %util
nvme0n1         12.34   45.67    234.56    812.34     0.00     2.11   0.00   4.42     0.32     1.45   0.07     19.01     17.79   0.21   1.23
nvme0n1p1        0.00    0.01      0.02      0.05     0.00     0.00   0.00   0.00     0.10     0.50   0.00      8.00      8.00   0.05   0.00

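-x prints extended stats, -z hides devices with no activity during the interval, and 1 sets a one-second interval. The first report covers the time since boot; each later report covers one interval.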

| Column | Meaning |
| --- | --- |
| r/s / w/s | Read / write requests per second (IOPS) |
| rkB/s / wkB/s | Read / write throughput per second |
| r_await / w_await | Average response time (ms): queue wait + service time |
| aqu-sz | Average queue length (saturation) |
| %util | Share of elapsed time in which the device had at least one request in flight |

What to read

  • %util close to 100% + large aqu-sz — disk saturated. On SSDs, may be near the IOPS limit
  • Abnormally large await (HDD: 20ms+, SSD: 5ms+) — response-time problem
  • rkB/s + wkB/s close to disk specs — bandwidth saturation

%util alone can mislead. Modern NVMe handles requests in parallel, so 100% %util is not necessarily saturation. You need aqu-sz and await together for an accurate picture.
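Reading %util and aqu-sz together can be sketched as a small filter over iostat output. This is a minimal illustration: flag_iostat is a hypothetical helper name, and the default thresholds (90% and a queue of 1) are placeholders chosen for the example, not recommendations. Columns are looked up by header name because their order differs between sysstat versions.

```shell
# flag_iostat [UTIL_MIN] [QUEUE_MIN]
# Reads `iostat -x` output on stdin and prints devices whose %util AND
# aqu-sz are both at or above the given thresholds.
flag_iostat() {
    awk -v umin="${1:-90}" -v qmin="${2:-1}" '
        $1 == "Device" || $1 == "Device:" {
            for (i = 1; i <= NF; i++) col[$i] = i   # remember where each column sits
            ncols = NF
            next
        }
        # Device rows have as many fields as the header; avg-cpu rows do not.
        ncols && NF == ncols && ("%util" in col) && ("aqu-sz" in col) {
            util  = $(col["%util"]) + 0
            queue = $(col["aqu-sz"]) + 0
            if (util >= umin && queue >= qmin)
                printf "%s util=%.2f%% aqu-sz=%.2f\n", $1, util, queue
        }'
}

# Live usage: five one-second reports.
#   iostat -xz 1 5 | flag_iostat
```

Requiring both conditions is the point of the sketch: an NVMe device that sits at high %util with a short queue and low await is not reported.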

Together with CPU

CPU + disk in one shot
$ iostat -xz 1

iostat prints a CPU summary (avg-cpu) at the top of each report. High %iowait there, together with high %util and a growing aqu-sz on a device, points strongly to a disk bottleneck.

sar: Time Series

top/vmstat/iostat only show the present moment. They cannot tell you what happened at 3 AM yesterday. sar fills that gap. The sysstat package collects system stats every 10 minutes (by default) in the background and stores them under /var/log/sa/.

enablement — RHEL 9 usually automatic
$ sudo systemctl enable --now sysstat
$ sudo systemctl status sysstat-collect.timer

Querying by time

today
$ sar -u
Linux 5.14.0-...   04/28/2026

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      2.34      0.00      0.81      0.12      0.00     96.73
12:20:01 AM     all      3.21      0.00      0.92      0.14      0.00     95.73
...
another day — /var/log/sa/sa<day>
$ sar -u -f /var/log/sa/sa27   # day 27

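Per-resource options

| Option | Resource |
| --- | --- |
| -u | CPU |
| -r | Memory |
| -S | Swap space usage |
| -b | I/O totals |
| -d | Per-disk |
| -n DEV | Network interfaces |
| -n EDEV | Network errors |
| -q | Run queue, load average |
| -w | Context switches |
| -W | Swapping activity (pages swapped in / out) |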

Scoping by time and the average row

between 9 and 11 only
$ sar -u -s 09:00:00 -e 11:00:00

# average row only (adding -A would print the averages of every report, not just CPU)
$ sar -u | grep Average

This is the core tool for post-mortems: you did not catch the incident live, so you walk in afterward and reconstruct it. sar captures the timeline that top cannot.
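When reconstructing an incident, a common first question is which interval was the busiest. The sketch below picks the sample with the lowest %idle out of sar -u output; busiest_interval is a hypothetical helper name, and it assumes %idle is the last column, as in the output shown earlier.

```shell
# busiest_interval
# Reads `sar -u` output on stdin and prints the data row with the lowest
# %idle value. Header rows and the Average row are ignored.
busiest_interval() {
    awk '
        $NF == "%idle"   { next }     # column header
        $1 == "Average:" { next }     # summary row
        NF >= 7 && $NF ~ /^[0-9.]+$/ {
            if (min == "" || $NF + 0 < min) { min = $NF + 0; line = $0 }
        }
        END { if (line != "") print line }'
}

# Live usage: LC_ALL=C asks sar for 24-hour timestamps.
#   LC_ALL=C sar -u -s 09:00:00 -e 11:00:00 | busiest_interval
```

The timestamp on the returned row gives a window to pass back to sar with -s and -e for the other resources (-r, -d, -n DEV).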

sar retention

Default is 28 days. Increase via HISTORY= in /etc/sysconfig/sysstat.

6 months
$ sudo sed -i 's/^HISTORY=.*/HISTORY=180/' /etc/sysconfig/sysstat

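For machines that need long-term trend analysis, 90 to 180 days is a common choice. Once HISTORY exceeds 28, sysstat changes how it names or lays out the daily files under /var/log/sa/ (the exact scheme depends on the sysstat version), so list the directory before passing a path to -f.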
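A slightly more careful version of the retention change keeps a backup and prints the resulting line so the edit can be verified. This is a sketch, not a sysstat-provided command: set_sysstat_history is a hypothetical helper name, and like the sed one-liner above it only rewrites an existing uncommented HISTORY= line.

```shell
# set_sysstat_history DAYS [CONFIG_FILE]
# Rewrites the HISTORY= line in the sysstat config (default path as used
# above), keeps the previous version as CONFIG_FILE.bak, and prints the
# resulting HISTORY= line. Fails if DAYS is not a positive integer or if
# no HISTORY= line exists afterwards.
set_sysstat_history() {
    days=$1
    conf=${2:-/etc/sysconfig/sysstat}
    case $days in
        ''|*[!0-9]*) echo "DAYS must be a positive integer" >&2; return 1 ;;
    esac
    sed -i.bak "s/^HISTORY=.*/HISTORY=${days}/" "$conf" || return 1
    grep '^HISTORY=' "$conf"
}

# On a real host the file is root-owned, so run the function from a root shell:
#   set_sysstat_history 180
```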

perf: CPU Hotspots

When the CPU is busy and you want to know which functions are consuming the time, reach for perf. It is a call-stack-level profiler that includes kernel functions.

install
$ sudo dnf install -y perf

30-second whole-system profile

whole system
$ sudo perf record -F 99 -ag -- sleep 30
$ sudo perf report
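
| Option | Meaning |
| --- | --- |
| -F 99 | Sample at 99 Hz per CPU (99 rather than 100 avoids sampling in lockstep with periodic timer activity) |
| -a | All CPUs |
| -g | Record call stacks |
| -- sleep 30 | Profile for as long as the sleep command runs, i.e. 30 seconds |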

Specific PID

per process
$ sudo perf record -F 99 -p 1234 -g -- sleep 30
$ sudo perf report

Reading the result

perf report is a text UI. It shows CPU time share per function with call stacks.

report sketch
+   45.32%  postgres   postgres            [.] hash_search_with_hash_value
+   12.45%  postgres   postgres            [.] heap_hot_search_buffer
+    5.21%  postgres   libc-2.34.so        [.] __memcpy_avx_unaligned_erms
...

Flame Graph: visualization

The perf report text view makes deep call stacks awkward to scan. Convert to SVG with Brendan Gregg’s FlameGraph tool:

building a flame graph
# assumes the FlameGraph scripts (github.com/brendangregg/FlameGraph) are cloned into ./FlameGraph
$ sudo perf record -F 99 -ag -- sleep 30
$ sudo perf script > out.perf
$ ./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
$ ./FlameGraph/flamegraph.pl out.folded > flame.svg

Open flame.svg in a browser: the width of each frame is proportional to the share of samples in which that function was on the stack, so wide plateaus near the top of the graph are the hot spots.
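The folded file is itself easy to inspect: each line is a semicolon-separated stack followed by a sample count. As a rough sketch of what the graph encodes, the helper below (top_leaf_frames, a hypothetical name) sums the counts by the last frame of each stack, which approximates the "self" time shown at the top edge of a flame graph.

```shell
# top_leaf_frames
# Reads folded stacks ("frame1;frame2;...;leaf COUNT") on stdin and prints
# the ten leaf frames with the largest share of samples.
top_leaf_frames() {
    awk '{
        count = $NF
        stack = $0
        sub(/ [0-9]+$/, "", stack)        # drop the trailing sample count
        n = split(stack, frames, ";")
        self[frames[n]] += count          # credit the samples to the leaf frame
        total += count
    }
    END {
        for (f in self) printf "%.1f%% %s\n", 100 * self[f] / total, f
    }' | sort -rn | head -n 10
}

# Usage with the file produced above:
#   top_leaf_frames < out.folded
```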

Tying It Together

The order in which to reach for each tool when a problem comes in.

diagnosis flow
1. top / htop          ← first glance. load avg, big CPU/MEM picture
2. vmstat 1            ← r, b, si/so, wa — CPU,MEM,IO on one screen
3. once a resource is named — go deep with its dedicated tool
   ├─ CPU suspect  → perf record / report
   ├─ MEM suspect  → free -h, /proc/meminfo, smem
   ├─ Disk suspect → iostat -xz 1, biotop, iotop
   └─ Net suspect  → ss -s, sar -n DEV/EDEV, iftop
4. need timeline → sar -u/-r/-d/-n DEV (-f /var/log/sa/sa..)

| Signal | First tool | Second tool |
| --- | --- | --- |
| load avg above core count | top (r column in vmstat) | perf record |
| wa high | iostat -xz 1 | iotop or biotop (BCC) |
| si/so > 0 | free -h, vmstat | smem, identify OOM candidates |
| cs storm | pidstat -w 1 | suspect lock contention, perf next |
| timeline from 3 AM yesterday | sar -u -s 03:00:00 -e 04:00:00 -f /var/log/sa/saYY | other resources over the same window |

Common Pitfalls

  • Reading load average as CPU load alone: Linux load includes D state. A slow disk drives it up even when the CPU is idle. Always pair with vmstat r and b.
  • Judging disk saturation from %util alone: NVMe serves requests in parallel. You need aqu-sz and await together.
  • "%CPU totals over 100% must be wrong": it is normal. In top’s default (Irix) mode a process’s %CPU is measured against a single core, so a multithreaded process on 4 cores can reach 400%. Press I (Shift+i) to divide by the number of cores instead; press 1 to see per-core utilization in the summary area.
  • Forgetting to enable sar: RHEL 9 usually enables it automatically, but minimal installs may skip it. Enabling it after an incident does not recover that incident’s data. Turn it on proactively on any machine you manage.
  • Running perf record too long against production traffic: sampling carries some overhead. On production instances, keep runs to 30-second to 1-minute slices.
  • Reading top’s free at face value: Linux fills idle RAM with page cache, so a small free figure is normal. In the sample above, used already excludes buff/cache (free + used + buff/cache adds up to total). The reliable “real available” figure is avail Mem in top, or the available column in free -h.
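The memory pitfall can be checked directly against the kernel’s own estimate. The sketch below reads MemTotal and MemAvailable from /proc/meminfo, the same source free uses for its available column; mem_available_pct is a hypothetical helper name, and the optional file argument exists only so the function can be tried on a saved copy.

```shell
# mem_available_pct [MEMINFO_FILE]
# Prints MemAvailable in MiB and as a share of MemTotal.
# Values in /proc/meminfo are given in kB.
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0)
                printf "available=%d MiB (%.0f%% of total)\n", avail / 1024, 100 * avail / total
        }' "${1:-/proc/meminfo}"
}

# Live usage on a Linux host:
#   mem_available_pct
```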

Commands Worth Remembering

| Task | Command |
| --- | --- |
| First glance | top or htop |
| CPU/MEM/IO one screen | vmstat 1 |
| Drill into disks | iostat -xz 1 |
| Truly available memory | free -h |
| Per-process CPU/IO | pidstat 1 / pidstat -d 1 |
| Timeline (today) | sar -u, sar -r, sar -d |
| Timeline (specific day) | sar -u -f /var/log/sa/sa<day> |
| CPU profiling | sudo perf record -F 99 -ag -- sleep 30 && sudo perf report |
| Network stats | ss -s, sar -n DEV 1 |
| Context switches | vmstat 1 cs column, pidstat -w 1 |

Wrap-up

  • USE checklist: Utilization, Saturation, Errors. Read all three for each of CPU/MEM/Disk/Network.
  • First card is top/htop: load average and the big picture. Then vmstat 1 for CPU, memory, and disk I/O on one screen.
  • Once narrowed, dedicated tools: iostat -xz 1 for disks, perf record for CPU hotspots, free -h + smem for memory.
  • Timeline lives in sar: the post-incident workhorse. Stretch retention to 90 to 180 days for long-term trend analysis.
  • Do not confuse the signals: load avg includes D state, %util alone is not enough on NVMe, and a low free in top is not a memory shortage (check available instead).

The next post returns to security. We go beyond the standard tools introduced in Intermediate #1 Intro to SELinux and cover writing policy by hand and turning audit2allow output into a permanent module.
