LLM App Development #5: Getting Structured Output

5 min read

In Part 4 we saw how to narrow the output format with a prompt. But even when you ask for “only one word,” the model may occasionally answer with a sentence. For code to consume the result directly, the format has to be stable. This post covers how to enforce the output format so you can plug the result straight into code.

A prompt alone is not enough #

Recall the classification example from Part 4. We instructed: “answer with only one of positive, negative, or neutral.” The model complies most of the time, but once in a while it might answer with a phrase such as “somewhat negative.” A person reading that has no problem, but what if code uses the result like this?

if answer == "negative":
    flag_review(review)

If the answer comes back as “somewhat negative” instead of “negative,” this condition simply misses. The moment the format slips even once, the handling breaks. To use a result mechanically in an app, the format has to be 100% guaranteed.
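
To make the failure concrete, here is a minimal sketch that uses hard-coded strings in place of real model replies. The should_flag helper is a hypothetical name for illustration, not part of any SDK.

```python
def should_flag(answer: str) -> bool:
    """Return True only when the answer is exactly the label 'negative'."""
    return answer == "negative"


# Hard-coded strings standing in for model replies (not real API output).
answers = ["negative", "somewhat negative", "Negative."]

for answer in answers:
    print(f"{answer!r}: flagged={should_flag(answer)}")
```

Only the first reply is flagged. The other two mean the same thing to a human reader, but they slip past the comparison.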

Structured output solves this. It forces the response to conform to a JSON schema you define. The model cannot step outside that format.

Defining the output format with Pydantic #

In Python, the most convenient way is to define the structure you want as a Pydantic model and pass it to messages.parse. The response comes back as a validated object.

structured_pydantic.py
from typing import Literal
from pydantic import BaseModel
import anthropic

class Review(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    summary: str
    score: int

client = anthropic.Anthropic()

response = client.messages.parse(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Analyze this review: Shipping was fast, but the product was defective. I did get a refund.",
    }],
    output_format=Review,
)

review = response.parsed_output
print(review.sentiment)  # always one of positive/negative/neutral
print(review.score)      # an integer

response.parsed_output is a validated Review instance. sentiment is guaranteed to be one of the three, and score is an integer. You do not have to parse a JSON string yourself or check the format. Without worrying about whether the model kept the format, you just pull out the object’s attributes and use them.

Writing the schema directly #

If you do not use Pydantic, you put a JSON schema directly into output_config.

structured_schema.py
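import json
import anthropic

# Create the client here so this file runs on its own.
client = anthropic.Anthropic()
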
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Analyze this review: Shipping was fast, but the product was defective. I did get a refund.",
    }],
    output_config={
        "format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "summary": {"type": "string"},
                    "score": {"type": "integer"},
                },
                "required": ["sentiment", "summary", "score"],
                "additionalProperties": False,
            },
        },
    },
)

text = next(b.text for b in response.content if b.type == "text")
data = json.loads(text)
print(data["sentiment"])

The first text block of the result is valid JSON that follows the schema. Only values in the enum come out, and every field marked required is present. So after json.loads you can access keys directly and safely.
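
One detail in the snippet above: next(...) without a default raises StopIteration when the response contains no text block. The sketch below shows a slightly more defensive variant. The first_text helper is a hypothetical name, and the content list is a hand-written stand-in for response.content, so the JSON string in it is made up for illustration.

```python
import json
from types import SimpleNamespace


def first_text(content):
    """Return the text of the first block whose type is 'text', or None."""
    return next((block.text for block in content if block.type == "text"), None)


# Stand-in for response.content; a real response holds SDK block objects.
content = [
    SimpleNamespace(
        type="text",
        text='{"sentiment": "negative", "summary": "Defective product, refunded.", "score": 2}',
    )
]

text = first_text(content)
data = json.loads(text) if text is not None else None
print(data["sentiment"])
```

Returning None lets the caller decide what to do with an empty response instead of crashing inside the generator expression.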

Extracting structured data from text #

Where structured output really shines is turning free-form text into tidy data. From things people wrote — emails, meeting notes, reviews — you can extract just the information you need and hold it as a structured object. Nested objects and lists work too.

extract_tasks.py
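from typing import Literal
from pydantic import BaseModel
import anthropic

# Create the client here so this file runs on its own.
client = anthropic.Anthropic()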

class Task(BaseModel):
    title: str
    priority: Literal["high", "medium", "low"]

class MeetingNotes(BaseModel):
    summary: str
    tasks: list[Task]

notes = """In today's meeting we set the launch schedule for next quarter.
The design mockups need to be finished quickly by next week,
and hiring more QA staff will be looked into slowly when there is time."""

response = client.messages.parse(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Organize the following meeting notes:\n\n{notes}"}],
    output_format=MeetingNotes,
)

result = response.parsed_output
for task in result.tasks:
    print(f"[{task.priority}] {task.title}")

Freely written meeting notes come back organized into a summary and a list of tasks. Each task is an object with a title and a priority, and the priority is always one of the three allowed values. Writing this kind of extraction by hand with regular expressions or string handling is fiddly; structured output lets the model do it in a single call.
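
Because the priority can only be one of three known values, downstream code can rely on it without defensive checks. The following sketch sorts tasks from high to low priority. It uses a plain dataclass as a stand-in for the Pydantic Task model so that it runs without an API call, and the task titles are hand-written examples, not actual model output. PRIORITY_ORDER and by_priority are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    title: str
    priority: str  # "high", "medium", or "low"


# Hand-written stand-ins for result.tasks; a real call may word titles differently.
tasks = [
    Task("Look into hiring more QA staff", "low"),
    Task("Finish the design mockups", "high"),
]

# Lower number sorts first.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def by_priority(items):
    """Return a new list of tasks sorted from high to low priority."""
    return sorted(items, key=lambda task: PRIORITY_ORDER[task.priority])


for task in by_priority(tasks):
    print(f"[{task.priority}] {task.title}")
```

The dictionary lookup would raise KeyError on an unexpected label, which is exactly the situation the schema is meant to rule out.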

Note
Schemas have limits. Fine-grained constraints such as numeric minimums and maximums or string lengths are not supported, and neither are recursive structures. The Python and TypeScript SDKs automatically strip unsupported constraints from the schema they send and check them on the client side instead, so write your Pydantic model naturally first. Structured output is available on claude-opus-4-8, claude-sonnet-4-6, and claude-haiku-4-5.
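
If you take the raw-schema route and parse the text with json.loads, a limit the schema cannot express has to be checked in your own code. Here is a minimal sketch of such a check. The check_score helper is hypothetical, and the 1 to 5 scale is an assumption made for illustration; this post does not define a range for score.

```python
def check_score(data: dict, low: int = 1, high: int = 5) -> int:
    """Return data['score'] if it lies within [low, high], else raise ValueError."""
    score = data["score"]
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside {low}..{high}")
    return score


# A hand-written dict standing in for the result of json.loads.
print(check_score({"score": 2}))
```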

Where people commonly trip up #

  • The JSON is cut off mid-way: If max_tokens is too small, the output stops (stop_reason is max_tokens) before the JSON closes, and parsing fails. For structured output, estimate the result length and set a generous limit.
  • The format is not guaranteed on a refusal: If the model refuses for safety reasons (stop_reason is refusal), the output may not follow the schema. In an app that handles user input, check for this case before parsing.
  • Changing the schema too often: A new schema incurs a one-time compilation cost on the first call, and the same schema is then cached for 24 hours. If you tweak the schema slightly on every call, you never hit that cache. It is better to keep the schema fixed.
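
The first two pitfalls can be caught by looking at stop_reason before parsing. Here is a minimal sketch, assuming you pass in response.stop_reason and the text of the first text block. The parse_structured helper is a hypothetical name, and raising RuntimeError is just one possible way to surface the problem.

```python
import json


def parse_structured(stop_reason: str, text: str) -> dict:
    """Parse the JSON text only when the response finished normally."""
    if stop_reason == "refusal":
        raise RuntimeError("model refused; the text may not follow the schema")
    if stop_reason == "max_tokens":
        raise RuntimeError("output hit max_tokens; the JSON is probably truncated")
    return json.loads(text)


# Hand-written values standing in for a real response.
print(parse_structured("end_turn", '{"sentiment": "neutral"}'))
```

Failing early with a specific message is easier to debug than a JSONDecodeError raised from a half-finished string.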

Wrapping up #

In this post we covered structured output, which forces the output format.

  • A prompt only narrows the format and does not guarantee it, but structured output forces conformance to a schema.
  • In Python, a Pydantic model and messages.parse give you a validated object directly.
  • Without Pydantic, you can put a JSON schema straight into output_config.
  • You can extract structured data, including nested objects and lists, from free-form text.

So far, Claude has returned either text or tidy data as its “answer.” In the next post, “LLM App Development #6: Connecting External Functions with Tool Calling,” the direction changes. We will let Claude call functions we define, connecting it to the outside world: search, databases, and external APIs.
