Python Testing #5: Testing the Outside World — Files, HTTP, Databases, and Web Frameworks

6 min read

In #4, on mock and monkeypatch, we covered how to replace external dependencies with fakes. But once you get comfortable with mocks, a new worry shows up. A test where everything is faked out is fast and stable, yet it guarantees nothing about whether a real file actually gets written or real data actually lands in the database. It’s entirely possible to write code where every mocked test passes and the deployment still breaks.

In this post we’ll take the four regions of the outside world — files, HTTP, databases, and web frameworks — and work out, region by region, where to use the real thing and where to start faking.

Files: use the real filesystem with tmp_path

File I/O is not something to mock. The local disk is fast and deterministic, so faking it costs more than it buys you. pytest’s built-in tmp_path fixture gives every test its own empty temporary directory.

tests/test_config.py
import json
from pathlib import Path

def save_config(path: Path, config: dict) -> None:
    path.write_text(json.dumps(config), encoding="utf-8")

def test_save_config(tmp_path):
    config_file = tmp_path / "config.json"

    save_config(config_file, {"debug": True, "port": 8000})

    # Read back with the same encoding save_config used when writing.
    assert json.loads(config_file.read_text(encoding="utf-8")) == {"debug": True, "port": 8000}

tmp_path is a pathlib.Path object. Each test gets a different directory, and pytest prunes old ones on its own: by default it keeps the directories from the last three runs, which is handy when you need to inspect what a failing test left behind. Because you wrote and read a real file, everything (encoding, path handling, serialization) has been verified for real. There’s one prerequisite: like save_config above, the code must accept the path as an argument so you can inject tmp_path. If the code operates relative to the working directory, you can instead move the current directory itself with monkeypatch.chdir(tmp_path).
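For the working-directory case, a minimal sketch. write_report is a hypothetical function (not from this series) that always writes report.txt relative to the current directory, so the test moves the current directory into tmp_path before calling it.

```python
from pathlib import Path

def write_report(lines: list[str]) -> None:
    # Hypothetical code under test: the output path is hard-coded
    # relative to the current working directory.
    Path("report.txt").write_text("\n".join(lines), encoding="utf-8")

def test_write_report(tmp_path, monkeypatch):
    # Move the process into the per-test directory; pytest's monkeypatch
    # restores the original working directory when the test finishes.
    monkeypatch.chdir(tmp_path)

    write_report(["first", "second"])

    assert (tmp_path / "report.txt").read_text(encoding="utf-8") == "first\nsecond"
```

Passing the path as an argument is still the cleaner design; chdir is mostly useful for code you can’t easily change.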

HTTP: fake only the network boundary

If you swap out the fetch_user function itself with mock.patch, none of the logic inside the function — URL construction, error handling — ever runs. A better place to intercept is the network boundary: run your own code all the way through, and intercept only at the moment a real socket would open. If you use httpx, respx plays this role. Install it with pip install respx.

tests/test_api_client.py
import httpx
import respx

def fetch_user(user_id: int) -> dict:
    response = httpx.get(f"https://api.example.com/users/{user_id}")
    response.raise_for_status()
    return response.json()

@respx.mock
def test_fetch_user():
    respx.get("https://api.example.com/users/1").mock(
        return_value=httpx.Response(200, json={"id": 1, "name": "curtis"})
    )

    assert fetch_user(1)["name"] == "curtis"

The URL construction, raise_for_status(), and JSON parsing in fetch_user all actually execute — only the socket is replaced with a fake response. Register an httpx.Response(404) and you can reproduce the error path the same way. If you use requests, the equivalent library is responses, with nearly identical usage. The pytest-httpx plugin we used in Modern Python in Practice #6 belongs to the same family. These libraries raise an error whenever a request goes to an unregistered URL, which also protects you from accidentally hitting the real network mid-test.

Separate tests that call the real API with a marker

But how would you know if your fake responses have drifted from the real API spec? Only a real call can verify that. Separate those tests with the markers we saw in #3.

real-call test
import httpx
import pytest

@pytest.mark.external
def test_real_api_contract():
    response = httpx.get("https://api.example.com/users/1")
    assert "name" in response.json()

pyproject.toml
[tool.pytest.ini_options]
markers = ["external: tests that call a real external API"]
addopts = "-m 'not external'"

They’re excluded from normal runs and only execute when you explicitly ask with pytest -m external; an -m given on the command line takes precedence over the one in addopts. You’ve moved them into a separate group that’s allowed to be slow and occasionally flaky.

Databases: give every test a clean state

The core problem with database tests is state. If data left behind by one test affects the next, results depend on execution order. There are three main strategies for a clean slate: in-memory SQLite (fastest), transaction rollback against your real schema, and a containerized database (slowest, but identical to production).

sqlite:///:memory: never even touches the disk, so it’s the fastest. But if production runs PostgreSQL, know the limits before relying on it. JSON operators, array types, type-checking strictness, and concurrency behavior all differ — a query that passes on SQLite can break on PostgreSQL, and vice versa.
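Type strictness is the easiest of these differences to see. A small sketch with the standard library’s sqlite3 module and a hypothetical items table (not from this series): SQLite accepts a string in an INTEGER column and stores it as text, whereas PostgreSQL rejects the same INSERT with a type error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, qty INTEGER)")

# SQLite treats a column type as a preference ("affinity"), not a constraint,
# so a non-numeric string goes into the INTEGER column without an error.
conn.execute("INSERT INTO items (qty) VALUES (?)", ("not a number",))

value, stored_type = conn.execute("SELECT qty, typeof(qty) FROM items").fetchone()
print(value, stored_type)
conn.close()
```

A test suite that only ever runs on SQLite would never notice a bug that writes the wrong type into that column.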

The transaction rollback pattern

Open a transaction when the test starts; when it ends, roll back instead of committing. No need to delete data row by row — you simply return to the initial state.

tests/conftest.py
import pytest
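from sqlalchemy import create_engine, event
from sqlalchemy.orm import Session
from app.models import Base

@pytest.fixture(scope="session")
def engine():
    engine = create_engine("sqlite:///:memory:")

    # SQLite only: the stdlib sqlite3 driver decides on its own when to emit
    # BEGIN, which can let a released SAVEPOINT commit for real. SQLAlchemy's
    # SQLite dialect notes suggest taking over transaction control like this.
    # Drop both listeners when the engine points at PostgreSQL.
    @event.listens_for(engine, "connect")
    def _disable_implicit_begin(dbapi_connection, connection_record):
        dbapi_connection.isolation_level = None

    @event.listens_for(engine, "begin")
    def _emit_begin(conn):
        conn.exec_driver_sql("BEGIN")

    Base.metadata.create_all(engine)
    yield engine
    engine.dispose()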

@pytest.fixture
def db_session(engine):
    connection = engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection, join_transaction_mode="create_savepoint")

    yield session

    session.close()
    transaction.rollback()
    connection.close()

Thanks to join_transaction_mode="create_savepoint", any commit() inside test code only goes as far as a SAVEPOINT, and the rollback() at the end of the fixture undoes everything. Tests just take db_session and call add() and commit() as usual; nothing persists in the database, so every test starts from an empty state.
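Stripped of the ORM, the mechanism looks roughly like this. It is a sketch using the standard library’s sqlite3 module; the users table and the run_isolated and add_user helpers are hypothetical stand-ins for the fixture and a test, not part of the code above.

```python
import sqlite3

# isolation_level=None stops the driver from issuing BEGIN on its own,
# so every transaction statement below is explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

def run_isolated(conn, test_body):
    """Run test_body inside a transaction that is always rolled back."""
    conn.execute("BEGIN")                 # fixture: connection.begin()
    conn.execute("SAVEPOINT test_scope")  # the session joins via a SAVEPOINT
    try:
        test_body(conn)
        conn.execute("RELEASE SAVEPOINT test_scope")  # the test's commit()
    finally:
        conn.execute("ROLLBACK")          # fixture: transaction.rollback()

def add_user(conn):
    conn.execute("INSERT INTO users (name) VALUES ('curtis')")
    count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    assert count == 1  # the row is visible inside the test

run_isolated(conn, add_user)

# After the outer ROLLBACK the table is empty again.
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(remaining)
```

Releasing the savepoint only folds the work into the outer transaction, which is why the final ROLLBACK can still undo it.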

If your queries depend on PostgreSQL-specific features, testcontainers can spin up a Docker container just for the test run. Start a PostgresContainer("postgres:16") once in a session-scoped fixture, and layer the rollback pattern above on top for per-test isolation. You pay a few seconds of startup time in exchange for verifying against the same database as production.

Web frameworks: send requests without a server

FastAPI’s TestClient calls your routes in-process, without starting a server.

tests/test_app.py
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

def test_health():
    response = client.get("/health")
    assert response.status_code == 200

Request parsing, dependency injection, response serialization — the entire framework actually runs. The detailed patterns, including dependency_overrides for swapping a real database dependency with a test one and the async client, are covered in Modern Python in Practice #6.

Django’s test client follows the same shape. Inside a TestCase you call views with self.client.get("/polls/"), and per-test transaction rollback for the database — the very pattern we saw above — is built into the framework. Django itself is covered in the Django Basics series.

The unit/integration boundary: trading speed for trust

To put every choice so far in a single line: at every boundary we’re making the same trade. The closer to real, the higher the trust and the lower the speed.

Region   Fast side                        Real side
Files    tmp_path (already real enough)   use as is
HTTP     respx / responses                real calls behind the external marker
DB       in-memory SQLite                 rollback pattern, testcontainers
Web      TestClient (in-process)          E2E against staging

A common default is to keep the fast side as your baseline and maintain a small, carefully chosen set of real-side tests. The fast tests run on every commit; the real tests run before deployment or once a day.

Recap

  • Don’t mock files — use the real filesystem with tmp_path
  • For HTTP, block the network boundary, not the function: respx for httpx, responses for requests
  • Separate real-API tests with the external marker and exclude them from normal runs
  • For databases, use the transaction rollback pattern for a clean state per test, and testcontainers for PostgreSQL-specific features
  • FastAPI’s TestClient and Django’s test client run the entire framework without a server
  • The closer to real, the slower and the more trustworthy — keep the default fast and the real tests few

In the next post, #6, we’ll cover test design and coverage: how many tests to write and of what kind, what separates good tests from bad ones, and how to read coverage numbers.
