#Infrastructure
300 posts
AWS Certified CloudOps Engineer - Associate (SOA-C03) #9 Domain 3-3 Deployment — Container Operations (ECS, EKS, ECR)
The ninth post of the SOA-C03 series covers container operations, a topic newly added in SOA-C03. It explains the difference between ECS and EKS, how to choose between the Fargate and EC2 launch types, how to store and scan images with ECR, container logging and monitoring, and deployment and scaling operations.
AWS Certified Developer - Associate (DVA-C02) #15 Full-Scale Multiple-Choice Mock Exam — 50 Questions + Explanations
The final post of the DVA-C02 series. The 50 questions are matched to the real exam's domain weights (development 32%, security 26%, deployment 24%, troubleshooting and optimization 18%), and each question's answer and explanation helps you find your weak domains. Solve them on the clock, then go back to the relevant domain post to shore up any gaps.
Certified Kubernetes Administrator (CKA) #24 Troubleshooting 3: Control Plane (apiserver/etcd/scheduler Down), etcd Recovery
The twenty-fourth post in the Certified Kubernetes Administrator (CKA) series. Starting from the fact that control plane components run as static Pods, we organize how to narrow down causes by symptom: an apiserver outage that leaves kubectl unresponsive, an etcd outage, and a scheduler or controller-manager outage. We get hands-on with inspecting containers directly via crictl and journalctl, and with fixing manifests so that kubelet restarts the affected components.
Certified Kubernetes Application Developer (CKAD) #19 Ingress and NetworkPolicy
The nineteenth post in the Certified Kubernetes Application Developer (CKAD) series. It covers Ingress, which routes external traffic at L7, and NetworkPolicy, which controls Pod-to-Pod communication with a whitelist, from a hands-on exam perspective. We will build everything from host/path routing and pathType, IngressClass, and TLS through to the podSelector-based default deny pattern, with YAML examples.
Certified Kubernetes Security Specialist (CKS) #17 Falco behavioral analysis, audit logs (Runtime)
The 17th post in the Certified Kubernetes Security Specialist (CKS) series. As the core of the final domain (Monitoring, Logging, and Runtime Security), we cover Falco, the syscall-based runtime threat-detection tool: its rule structure, writing custom rules, and reading its output. We then move on to the Kubernetes API audit log, covering policy levels and stages, apiserver flag configuration, and log analysis, all framed around the tasks that show up on the exam again and again.
Hardware Intermediate #1: Reading Performance Metrics — Turning Slow into Numbers
If Hardware Basics gave you the mental model of the four resources, the intermediate series starts on the operations floor. It uses utilization, saturation, and errors as the three questions for reading metrics, and explains what load average and %wa really mean as signs of hardware behavior.
Kubernetes and Cloud Native Associate (KCNA) #8: Exam Tips and Common Mistakes
A condensed recap to read one more time right before you walk into the KCNA exam. We cover time management for 60 questions in 90 minutes, the question formats people most often trip over (multiple response, double negatives), pairs of easily confused concepts (Deployment vs StatefulSet, CRI vs CNI vs CSI, HPA vs VPA, and more), techniques for narrowing down the options, a compact per-domain checklist, and a final pre-exam check for online-proctored sessions.
Red Hat Certified Engineer (RHCE) #16: RHCSA automation 3 — storage (LVM), filesystems (NFS)
The sixteenth post in the Red Hat Certified Engineer (RHCE) series. We automate the storage portion of RHCSA hands-on work with Ansible. We walk through carving partitions with parted, building a VG with lvg and an LV with lvol, then formatting with filesystem and handling both fstab and the live mount in one shot with mount — plus adding swap, mounting NFS remotely, and the storage system role alternative, all from an idempotency angle.
Red Hat Certified System Administrator (RHCSA) #13: SELinux in depth — contexts, booleans, troubleshooting (audit2allow)
The thirteenth post in the Red Hat Certified System Administrator (RHCSA) series. We cover switching SELinux between enforcing/permissive and making it permanent, the structure of file and process contexts and applying policy-correct labels with semanage fcontext and restorecon, the booleans you flip with getsebool and setsebool -P, how to open a non-standard port with semanage port, and the troubleshooting flow that traces denial logs with ausearch, sealert, and audit2allow to build policy. We work through the most common RHCSA situation by hand — a service blocked by SELinux.
AWS Certified CloudOps Engineer - Associate (SOA-C03) #8 Domain 3-2 Deployment: Systems Manager Operations Automation
The eighth post of the SOA-C03 series covers Systems Manager, the core tool of the deployment and automation domain. It explains how to manage config and secrets with Parameter Store, apply patches in bulk with Patch Manager, maintain a desired state with State Manager, and connect securely without keys via Session Manager, and it also covers Run Command and Automation.
AWS Certified Developer - Associate (DVA-C02) #14 Exam Tips and Frequently Missed Patterns
The post right before the wrap-up of the DVA-C02 series. It lays out time management for 65 questions in 130 minutes, how to filter answer choices with constraint keywords, handling multiple-response and BEST/MOST questions, the most frequently confused concept pairs in DVA (SQS vs SNS vs EventBridge, User Pool vs Identity Pool, Secrets Manager vs Parameter Store, etc.), a per-domain keyword→service quick mapping, and a just-before-the-exam checklist.
Certified Kubernetes Administrator (CKA) #23 Troubleshooting 2: Nodes and kubelet (NotReady, disk/memory pressure)
The twenty-third post in the Certified Kubernetes Administrator (CKA) series. We follow the diagnostic flow for a node that has dropped to NotReady from start to finish. We read conditions with k describe node, SSH into the node, and narrow down the cause with systemctl status kubelet and journalctl -u kubelet. We then fix each symptom in turn: a stopped kubelet, a stopped container runtime, a full disk, and memory pressure. We also cover how to isolate a problem node with cordon and drain.