RHEL in Practice #4 Monitoring: Cockpit, PCP


Once you have stood up a web server, a database, and container workloads in turn, it is time to look at what is actually happening on top of them. If you cannot see where the CPU spikes, which process is holding memory, or when the disk saturated, you have nowhere to start when an incident hits. This post organizes RHEL monitoring along two axes: the Cockpit web console for managing a server from the browser, and Performance Co-Pilot (PCP) for collecting and recording performance metrics.

The two tools play different roles. Cockpit is a management console that shows the state of this very moment at a glance and lets you work on services; PCP is a recording device that steadily accumulates metrics so you can look back at the past. On RHEL the two are designed to interlock, so you can view the performance graphs PCP has gathered right inside the Cockpit screen. Let’s start with Cockpit.

Cockpit: The Web Console for Managing a Server from the Browser

Cockpit is the web-based management console that Red Hat ships as part of RHEL. Using only a browser, with no SSH client, you can start and stop services, check storage, browse logs, configure networking, and even open a web terminal, all on one screen. It is especially handy when you operate a server together with a colleague who is not comfortable with SSH, or when you want to view several machines from one place.

Installation and Service Registration

A minimal RHEL install may not include Cockpit, so install it with dnf.

# install
sudo dnf install -y cockpit

# auto-start at boot + start immediately (in one shot)
sudo systemctl enable --now cockpit.socket

# check status
systemctl status cockpit.socket

The important point is that the unit you register here is cockpit.socket, not cockpit.service. Cockpit works through socket activation: normally only the socket waits, and the service wakes up the moment someone connects on port 9090. When there is no connection it uses almost no memory, so keeping the management console available all the time imposes little burden.
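
A quick way to watch socket activation at work is to compare the state of the two units. The sketch below wraps that check in a small helper. The name cockpit_state is made up for this example; the only thing it relies on is systemd's standard systemctl is-active subcommand.

```shell
# cockpit_state: hypothetical helper (not part of Cockpit) that prints the
# state of the listening socket and of the on-demand web service.
cockpit_state() {
    # is-active prints "active" or "inactive" and exits non-zero when the
    # unit is not active; inside echo's arguments that status is ignored.
    echo "socket=$(systemctl is-active cockpit.socket) service=$(systemctl is-active cockpit.service)"
}

# Usage on the server:
#   cockpit_state
```

On an idle server you would expect the socket to be active and the service inactive, with the service turning active while a browser session is open.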

Opening Port 9090 with firewalld

Cockpit uses port 9090. Following the same firewall flow covered in RHEL in Practice #1, we open it permanently by service unit. firewalld already includes a cockpit service definition, so you do not have to write the port number directly.

# permanently allow the cockpit service
sudo firewall-cmd --add-service=cockpit --permanent

# apply
sudo firewall-cmd --reload

# verify
sudo firewall-cmd --list-services

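Creating a permanent rule with --permanent and then applying it to the runtime with --reload is the basic two-step flow of RHEL firewall work. Each step guards against a different failure: without --permanent the rule exists only in the runtime configuration and disappears at the next reload or reboot, and without --reload the permanent rule is saved but 9090 stays blocked until firewalld next reloads or restarts.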
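
To check that the runtime and permanent configurations agree, you can compare the two service lists. The following is a minimal sketch, not a firewalld feature: firewall_drift is a hypothetical helper that takes the two lists as arguments, so the comparison logic itself does not need root.

```shell
# firewall_drift: hypothetical helper that compares two space-separated
# service lists and reports services present in only one of them.
#   $1 = runtime list   (from: firewall-cmd --list-services)
#   $2 = permanent list (from: firewall-cmd --permanent --list-services)
firewall_drift() {
    for svc in $1; do
        case " $2 " in
            *" $svc "*) ;;
            *) echo "runtime-only: $svc (lost at next reload or reboot)" ;;
        esac
    done
    for svc in $2; do
        case " $1 " in
            *" $svc "*) ;;
            *) echo "permanent-only: $svc (inactive until reload or reboot)" ;;
        esac
    done
}

# Usage on the server (as root):
#   firewall_drift "$(firewall-cmd --list-services)" \
#                  "$(firewall-cmd --permanent --list-services)"
```

No output means both configurations list the same services.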

Connecting from a Browser

Now connect from another PC’s browser to https://server-address:9090. Note that it is https, not http. Cockpit uses a self-signed certificate at first, so the browser raises a warning; on an internal network it is fine to proceed past it as an exception. If you expose it externally, it is safer to swap in a proper certificate or put it behind a reverse proxy.
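
If the browser cannot reach the page, it can help to probe the port from the command line first. This is a rough sketch using curl; probe_cockpit is a hypothetical helper name, and the host you pass in stands for your own server address.

```shell
# probe_cockpit: hypothetical helper that prints the HTTP status code
# returned by the Cockpit port on host $1.
probe_cockpit() {
    # -k skips certificate verification (needed for the self-signed cert),
    # -s silences progress output, -o /dev/null discards the page body,
    # -w prints only the status code.
    curl -k -s -o /dev/null -w '%{http_code}\n' "https://$1:9090/"
}

# Usage from another machine:
#   probe_cockpit server-address
```

A code of 000 is what curl prints when it could not get any HTTP response, which usually points at the firewall or the socket rather than at Cockpit itself. Because -k disables certificate checks, use it only on a network you trust.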

On the login screen you sign in with a regular user account on the server. After logging in, when you need to do something that requires privileges, turn on Administrative access at the top of the screen to switch to sudo privileges.

What You Do with Cockpit

After logging in, the left-hand menu lets you handle the following.

  • Overview. Shows CPU, memory, and disk utilization as real-time graphs. Hardware information and uptime are here too.
  • Services. The list of systemd units. With a single click you can start, stop, or restart a service and toggle whether it is enabled. It is essentially a screen-based stand-in for the systemctl commands covered in RHEL in Practice #1.
  • Logs. Filters journald logs by priority, time, and unit. A visualized version of the terminal’s journalctl.
  • Storage. Checks filesystem usage, mount state, and LVM volumes, and shows disk I/O trends.
  • Terminal. Opens a shell right inside the browser. Tasks not handled by the GUI you finish here with commands.

Extending Functionality with Add-ons

Cockpit can be extended by adding more packages. To manage the containers and virtualization from earlier posts from the GUI, install the following.

# manage Podman containers in Cockpit
sudo dnf install -y cockpit-podman

# manage virtual machines (KVM) in Cockpit
sudo dnf install -y cockpit-machines

Installing cockpit-podman lets you view the Podman containers you stood up in RHEL in Practice #3 from the Cockpit screen, pull images, and start or stop containers. cockpit-machines handles KVM virtual machines the same way. Add-ons appear as items in the left menu without any separate registration step; if a newly installed one does not show up, log out of Cockpit and back in. That said, the more add-ons you install, the wider the scope Cockpit can touch, so for a console exposed externally, installing only the add-ons you actually use is the way to keep the attack surface small.
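
To audit which add-ons a server currently carries, you can filter the installed package list. The sketch below assumes that cockpit-bridge, cockpit-ws, and cockpit-system count as the core packages and everything else named cockpit-* is an add-on; list_addons is a hypothetical helper name.

```shell
# list_addons: hypothetical filter that reads package names on stdin and
# keeps cockpit-* packages other than the core ones.
# Assumption: cockpit-bridge, cockpit-ws and cockpit-system are "core".
list_addons() {
    grep '^cockpit-' | grep -v -E '^cockpit-(bridge|ws|system)$' | sort
}

# Usage on the server:
#   rpm -qa --qf '%{NAME}\n' 'cockpit*' | list_addons
```

Anything in the output that you do not use is a candidate for removal on an externally exposed console.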

If you operate several RHEL machines, you can also pull other servers into one Cockpit and manage them together. Register another server over SSH from the host selector at the top left; as long as that server also has Cockpit installed, you can switch screens and move between machines from one console.

PCP: Performance Co-Pilot, the Recorder of Performance

If Cockpit shows you the present, PCP lets you look back at the past. Performance Co-Pilot is a performance monitoring framework that collects metrics from all over the system at regular intervals and accumulates them on disk as archives. When you get a report the next morning that the CPU spiked at 3 a.m., having a PCP archive lets you rewind to that moment and see what happened.

Installation and Service Registration

# install (pmrep, used below, ships in the separate pcp-system-tools package)
sudo dnf install -y pcp pcp-system-tools

# start the metric collector daemon and the logging daemon together
sudo systemctl enable --now pmcd
sudo systemctl enable --now pmlogger

# check status
systemctl status pmcd pmlogger

PCP has two core services. pmcd (Performance Metrics Collector Daemon) is the collector daemon that gathers metrics, and pmlogger is the logging daemon that periodically records those collected metrics to disk. For real-time queries alone pmcd is enough, but to look back at the past you must also turn on pmlogger to leave archives behind.
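
The division of labor between the two daemons can be turned into a quick self-check. This is a minimal sketch; pcp_mode is a hypothetical helper name, and it uses only systemctl is-active --quiet, which reports the unit state through its exit status.

```shell
# pcp_mode: hypothetical helper that reports what the current service state
# allows: nothing, real-time queries only, or real-time plus history.
pcp_mode() {
    if ! systemctl is-active --quiet pmcd; then
        echo "no metrics: pmcd is not running"
    elif ! systemctl is-active --quiet pmlogger; then
        echo "real-time only: pmlogger is not running"
    else
        echo "real-time and history"
    fi
}

# Usage on the server:
#   pcp_mode
```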

Querying Metrics in Real Time

PCP can pull metrics straight from the command line. Let’s go over the three you reach for most.

# print a vmstat-style system summary periodically
pmstat

# print specified metrics in tabular form (every 2s, 5 times)
pmrep -t 2 -s 5 kernel.all.load mem.util.free

# simply check the values of a single metric
pmval -t 1 -s 3 kernel.all.cpu.user

pmstat summarizes the system as a whole on one line, like vmstat, for a quick scan. pmrep is the tool for picking the metrics you want and laying them out in a table — handy when you want to see CPU, memory, and disk side by side on one screen. pmval tracks the value of just one metric, so it is best for zeroing in on a specific indicator. You can see the full list of available metric names with the pminfo command.
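
Because the metric namespace is large, searching it by keyword is often faster than scrolling through it. The sketch below wraps pminfo -t, which prints each metric name with its one-line help text, in a hypothetical helper called find_metric.

```shell
# find_metric: hypothetical helper that searches metric names and their
# one-line descriptions for a keyword (case-insensitive).
find_metric() {
    # pminfo -t lists every metric name followed by its one-line help text.
    pminfo -t | grep -i -- "$1"
}

# Usage on the server:
#   find_metric load
```

Once you have a name, pminfo -f followed by that name fetches its current value, and the same name can be passed to pmrep or pmval.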

Archiving Historical Data

The archives pmlogger leaves behind accumulate under /var/log/pcp/pmlogger in per-hostname directories. They are kept as daily files, and if you point the same query command at an archive with the -a option, it reads back the data from that point in time.

# check the archive directory
ls /var/log/pcp/pmlogger/$(hostname)/

# re-query the load for a specific time window from yesterday's archive
pmrep -a /var/log/pcp/pmlogger/$(hostname)/20260505 \
      -S @03:00 -T @04:00 kernel.all.load

-S sets the start time and -T the end time, so you can replay a recorded window from the past instead of watching live values. Archives do not grow without bound: the daily maintenance task bundled with PCP (pmlogger_daily) compresses old files and deletes those past the retention period, so all you have to do is set the retention policy to match your needs. The defaults comfortably accumulate a few days’ worth, but for a server where you need to see long-term trends, extend the retention period and set aside disk space to match.
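
Typing the date by hand gets tedious when you replay archives often. The following sketch builds the archive path from a date expression instead. It assumes GNU date (as shipped on RHEL) and the daily YYYYMMDD naming described above; archive_path is a hypothetical helper name, and the archive for the current day may carry a longer name until the daily maintenance run has merged it.

```shell
# archive_path: hypothetical helper that prints the daily archive path for
# the day given in $1 (any expression GNU date -d accepts, for example
# "yesterday", "2 days ago" or "2026-05-05").
archive_path() {
    echo "/var/log/pcp/pmlogger/$(hostname)/$(date -d "$1" +%Y%m%d)"
}

# Usage on the server:
#   pmrep -a "$(archive_path yesterday)" -S @03:00 -T @04:00 kernel.all.load
```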

If you operate several machines, you can also designate one as a central collection server and have its pmlogger pull metrics from the pmcd running on each of the other servers. That said, rather than reaching for centralization from the start, it is safer to begin with the basic configuration of leaving local archives on each server, get comfortable with operating it, then expand.

Wiring with Cockpit to View Performance Graphs

PCP’s real value shows when wired with Cockpit. Install the package that joins the two tools, and you can view the metrics PCP has gathered as graphs right inside the Cockpit screen.

# the package that joins Cockpit and PCP
sudo dnf install -y cockpit-pcp

After installing it and reconnecting to Cockpit, the usage graphs reached from the Overview screen go beyond a simple real-time display: they let you travel back through past windows to see CPU, memory, disk, and network trends. Instead of reading archives directly from the command line, you can move through the timeline on screen to pin down the moment you want, which is convenient for quickly narrowing down the time of an incident.

Keep the Basic Monitoring Commands at Your Fingertips

Even with Cockpit and PCP in place, the terminal’s basic commands are still the fastest diagnostic tools. When something looks off on a server you just SSH’d into, top, ss, and journalctl are available right away with no extra install. top shows the processes using the most CPU and memory in a live ranking, ss -tlnp shows which process is listening on which TCP port (run it with sudo to see processes owned by other users), and journalctl -u service-name -f follows a specific service’s logs in real time. sar is the lightest way to view historical data before you bring in PCP: once the sysstat package is installed and its collection is enabled, it lets you look back at CPU, memory, and disk statistics broken down by time slot.

Here is when to reach for each of these. Right after you connect on an incident report, do a first pass with top for the load culprit, ss for port state, and journalctl for the most recent logs. When the cause is not visible on the spot and you need to look back at the past, you move on to the PCP archive or sar; when you want to visually compare several indicators on one screen, stepping up to Cockpit is the natural flow. Having many tools does not mean firing them all up at once — stepping from the lightweight commands up to the heavier console, narrowing things down stage by stage, is what makes diagnosis fast.
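
The first pass described above can be collected into one function so that it runs the same way every time. This is only a sketch of one possible routine: first_pass is a hypothetical name, the unit name is whatever service the incident report points at, and the line counts are arbitrary choices.

```shell
# first_pass: hypothetical triage helper. $1 is the systemd unit to inspect.
first_pass() {
    unit=$1
    echo "== top CPU consumers =="
    # -b is batch (non-interactive) mode, -n 1 takes a single snapshot.
    top -b -n 1 | head -n 15
    echo "== listening TCP sockets =="
    ss -tlnp
    echo "== last 20 log lines for ${unit} =="
    journalctl -u "$unit" -n 20 --no-pager
}

# Usage on the server (sudo shows process names for all sockets):
#   first_pass httpd
```

If nothing stands out in this output, that is the signal to move on to the PCP archive, sar, or the Cockpit graphs.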

Operational Points

  • Cockpit uses socket activation. Enable cockpit.socket and the service only wakes up when there is a connection, so the everyday burden is light.
  • Port 9090 speaks HTTPS. Connect from the browser at https://server:9090, and when exposing it externally, put a proper certificate or reverse proxy in front.
  • PCP needs both pmcd and pmlogger. For real-time only, pmcd is enough, but to look back at the past, also turn on pmlogger to leave archives.
  • Replay archives with -a. Point a query command’s -a option at an archive under /var/log/pcp/pmlogger to look directly at a past point in time.
  • Wiring is cockpit-pcp. Install this package and you can scroll back through PCP graphs by time bar on the Cockpit screen, narrowing the time of an incident quickly.
  • Basic commands always come first. top, ss, and journalctl are first-pass diagnostic tools you can use immediately with no extra install; sar joins them once the sysstat package is installed and collecting.

Wrap-up

Let’s recap what we set up in this post. With Cockpit we handled services, storage, logs, and the terminal on one screen in the browser, and pulled in containers and virtual machines via add-ons. With PCP we queried metrics in real time and looked back at the past with pmlogger archives, then joined the two with cockpit-pcp to review performance graphs chronologically on screen. Finally we confirmed that basic commands like top, ss, journalctl, and sar are still the fastest first-pass means of diagnosis. Once you have both a tool that shows the present and a tool that records the past, you can finally find where to start when an incident hits.

Monitoring is not a set-it-and-forget-it task; it is closer to making the shape of the normal state familiar to your eye in ordinary times. You have to know what range the load usually moves in, because only then can you recognize an anomaly when it strays outside that range. The habit of accumulating a few days of PCP archives and glancing at Cockpit graphs now and then becomes your most reliable baseline when an incident actually strikes.

Next: Automating RHEL with Ansible

So far we have stood up web, DB, containers, and monitoring one machine at a time by hand, learning one full cycle of RHEL operations. But once servers grow from one to ten or a hundred, repeating the same work by hand soon hits its limit.

#5 Automating RHEL with Ansible: Bridging to the RHCE Track moves the install, service registration, firewall, and SELinux work we have done by hand into an Ansible playbook, organizing how to configure many RHEL machines consistently in one go. It also covers how this flow connects to Red Hat’s RHCE certification.
