All posts

Friday, June 19, 2026 5 min read

Advanced RAG #2: Chunking Strategies That Decide Retrieval Quality

The root of retrieval failures often lies not in retrieval itself but in the step before it: chunking. We cover the limits of fixed-size splitting, structure-based chunking, handling tables and code, metadata, and parent-child chunking that searches small and feeds large.

AI LLM Claude

Thursday, June 18, 2026 5 min read

Advanced RAG #1: Start by Finding Where RAG Goes Wrong

When RAG gives a strange answer, blindly tweaking the prompt is gambling. We start by splitting failures into retrieval failures and generation failures, then build a golden set and a baseline so every improvement becomes measurable.

Infrastructure Hardware

Thursday, June 18, 2026 8 min read

Hardware Advanced #7: Firmware, BMC, and the Lifecycle — The Other Computer Inside Your Server

A look at the BMC, the management computer that stays on independently of the main CPU. It covers remote console and power control, IPMI and Redfish, the firmware stack and update operations, failure prediction with SMART and ECC counters, management-network security, and the lifecycle from warranty expiry to disk disposal — closing out the Hardware Advanced series.

AI LLM Claude

Wednesday, June 17, 2026 6 min read

AI Agent Development #7: Capstone Project — An Issue Triage Agent

Tie together every piece from the series and finish a triage agent that classifies GitHub issues and proposes labels and replies. We cover read tools, write tools behind an approval gate, and evaluation with a golden set.

Infrastructure Hardware

Wednesday, June 17, 2026 9 min read

Hardware Advanced #6: Data Center Cooling and Racks — Electricity Always Becomes Heat

Nearly all the power that enters a server comes back out as heat. Starting from the basic airflow contract of front intake and rear exhaust, this post maps out data center cooling end to end: hot/cold aisle containment, rack density and the limits of air cooling, liquid cooling with D2C and immersion, and how ASHRAE temperature guidelines tie into PUE.

AI LLM Claude

Tuesday, June 16, 2026 5 min read

AI Agent Development #6: Building Your Own MCP Server

In Part 11 of LLM App Development we connected to MCP servers someone else built. This time we build our own tools as an MCP server. We cover writing a server with FastMCP, wiring it into our agent loop, and the criteria for splitting tools out into a server.

Infrastructure Hardware

Tuesday, June 16, 2026 9 min read

Hardware Advanced #5: Datacenter Power — The Real Reason You Can't Rack More Servers

Even with empty slots in the rack, new servers get rejected because of the power budget. This post walks through the power environment a server lives in, from an operator's point of view: PSU redundancy and A/B feeds, per-rack kW contracts, PDUs and UPS, generators and ATS, PUE, and the power density that GPU servers have driven up.

AI LLM Claude

Monday, June 15, 2026 5 min read

AI Agent Development #5: Dividing Work with Subagents

When one agent does everything, both its context and its responsibilities bloat. We cover why you delegate work to subagents, a delegate tool, the orchestrator-worker pattern with parallel execution, and rules to keep delegation from going too far.

Infrastructure Hardware

Monday, June 15, 2026 9 min read

Hardware Advanced #4: ZFS Deep Dive — When RAID and the Filesystem Become One

ZFS merged RAID, volume management, and the filesystem into a single layer, solving the structural problems of the traditional stack. This post walks through it all from an operations point of view: copy-on-write that eliminates the write hole, checksums that verify every read and enable self-healing, resilver that copies only live data, RAIDZ and the ARC, snapshots with send/recv, and lz4 compression.

AI LLM Claude

Sunday, June 14, 2026 6 min read

AI Agent Development #4: Context Management for Long-Running Work

The longer an agent runs, the closer its conversation gets to the context limit. We cover techniques for surviving long-running work: capping tool results, clearing old results, summary compression and server-side compaction, and a file-based scratchpad.

Framework Gin Go Golang

Sunday, June 14, 2026 5 min read

Gin Basics #7: Project Structure and a Mini REST API

Split the code accumulated in one file into layers, separate out configuration, and finish the series with a mini REST API.

Infrastructure Hardware

Sunday, June 14, 2026 9 min read

Hardware Advanced #3: Memory Deep Dive — Page Cache, THP, and Bandwidth

A tour inside the kernel memory machinery: the read and write paths through the page cache, the latency spikes THP creates, explicit hugepages and the TLB, how swappiness is actually implemented along with zswap, and the memory bandwidth bottleneck that keeps throughput flat even when cores sit idle.