#Claude

34 posts

Advanced RAG #2: Chunking Strategies That Decide Retrieval Quality
5 min read

Retrieval failures often originate not in retrieval itself but in the step before it: chunking. We cover the limits of fixed-size splitting, structure-based chunking, handling tables and code, metadata, and parent-child chunking, which searches over small chunks and feeds the larger parent chunk to the model.

Advanced RAG #1: Start by Finding Where RAG Goes Wrong
5 min read

When RAG gives a strange answer, blindly tweaking the prompt is gambling. We start by splitting failures into retrieval failures and generation failures, then build a golden set and a baseline so every improvement becomes measurable.

AI Agent Development #7: Capstone Project — An Issue Triage Agent
6 min read

Tie together every piece from the series and finish a triage agent that classifies GitHub issues and proposes labels and replies. We cover read tools, write tools behind an approval gate, and evaluation with a golden set.

AI Agent Development #6: Building Your Own MCP Server
5 min read

In Part 11 of LLM App Development we connected to MCP servers someone else built. This time we build our own tools as an MCP server. We cover writing a server with FastMCP, wiring it into our agent loop, and the criteria for splitting tools out into a server.

AI Agent Development #5: Dividing Work with Subagents
5 min read

When one agent does everything, both its context and its responsibilities bloat. We cover why you delegate work to subagents, a delegate tool, the orchestrator-worker pattern with parallel execution, and rules to keep delegation from going too far.

AI Agent Development #4: Context Management for Long-Running Work
6 min read

The longer an agent runs, the closer its conversation grows to the context limit. We cover techniques for surviving long-running work: capping tool results, clearing old results, summary compression and server-side compaction, and a file-based scratchpad.

AI Agent Development #3: Planning and Self-Correction
5 min read

An agent that takes on multi-step work needs rules of behavior and a plan. We cover system prompt design, getting the agent to plan first, mid-task verification and retries, and tuning thinking depth with adaptive thinking.

AI Agent Development #2: Designing Good Tools
5 min read

Most of the quality gap between agents comes from their tools. We cover the principles of tool design: the description the model reads, schema design, error messages, and classifying dangerous tools with a confirmation step.

AI Agent Development #1: Building a Robust Agent Loop
5 min read

Take the minimal agent loop from LLM App Development up to production level: handle every stop_reason, return tool errors as results, and add retries and logging. This is the starting point of the series.

LLM App Development #13: A Real-World Project — Internal Document Q&A Bot
5 min read

Bring all the pieces together and build a Q&A bot that answers from internal documents, start to finish. A finale combining RAG, streaming, grounding prompts, and conversation memory.

LLM App Development #12: Cost, Evaluation, and Observability
5 min read

What you need to actually operate the app you built: reducing token cost and using prompt caching, evaluation that measures quality, and observability that shows how the app behaves.

LLM App Development #11: Connecting Tools with MCP
5 min read

MCP (Model Context Protocol) is the standard for connecting tools. Instead of writing tools by hand every time, connect Claude to ready-made tool servers.