Python Without the GIL Is Here: Where Free-Threading Stands and When to Use It


In Modern Python Advanced #5 we established that, because of the GIL, threads are pointless for CPU-bound work. Even with 8 cores, Python threads execute bytecode one at a time, so the right answer for using all your CPUs was to spawn processes with multiprocessing. That premise is now changing. A GIL-free Python build appeared as an experiment in 3.13 and entered the officially supported stage in 3.14. This post lays out what free-threading is, how far it has come as of mid-2026, and how to decide whether you should use it now.

What free-threading is

It’s the change proposed by PEP 703. In one line: rework CPython’s internals so the interpreter runs safely even with the GIL removed.

What the GIL chiefly protected was the consistency of object reference counts. Simply removing the lock would let multiple threads update a reference count simultaneously and corrupt memory. So the free-threaded build changed the reference counting scheme itself. Mechanisms like biased reference counting (which gives the thread that owns an object a fast, non-atomic count and routes every other thread through a separate atomic count), fine-grained per-object locks, and internal data structures readable without locks now do the GIL’s job.
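As a rough mental model of biased reference counting, the toy class below keeps one count that only the owning thread touches and a second, synchronized count for every other thread. It is a hypothetical Python illustration: the class and function names are invented here, and a lock stands in for the atomic operations that the C implementation would use. It is not CPython’s actual code.

```python
import threading


class BiasedRefCount:
    """Toy model: a fast path for the owning thread, a shared path for others."""

    def __init__(self):
        self.owner = threading.get_ident()    # thread that created the object
        self.local = 1                         # touched only by the owner, no lock
        self.shared = 0                        # touched by every other thread
        self._shared_lock = threading.Lock()   # stands in for atomic operations

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1                    # uncontended fast path
        else:
            with self._shared_lock:            # slower, synchronized path
                self.shared += 1


def demo(other_threads=4, increments=1000):
    """Bump the count from the owner and from several other threads."""
    ref = BiasedRefCount()
    for _ in range(increments):
        ref.incref()                           # owner thread: fast path

    def bump():
        for _ in range(increments):
            ref.incref()                       # non-owner thread: shared path

    workers = [threading.Thread(target=bump) for _ in range(other_threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return ref.local, ref.shared


print(demo())
```

The point of the split is that the common case, an object used mostly by the thread that created it, never pays for synchronization.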

The result, from the outside, is identical behavior. Code using threading.Thread runs as is. One thing changes: CPU-bound multithreading genuinely executes in parallel.

Timeline: from experiment to official support

  • Python 3.13 (October 2024): the free-threaded build shipped for the first time, at the experimental stage. It came as a separate build, 3.13t, and the single-thread overhead was substantial.
  • Python 3.14 (October 2025): PEP 779 moved it to the officially supported stage. The experimental label came off, and with the adaptive specializing interpreter (PEP 659) optimizations now enabled in free-threaded mode, performance improved significantly.

One caution: official support does not mean the default build changed. Free-threaded is still a separate, opt-in build. The default installer from python.org still ships with the GIL. The final stage — fully replacing the GIL build — has no scheduled date yet.

Install and verify

A t suffix on the version marks the free-threaded build. With uv, installation is one line.

# install the free-threaded build
uv python install 3.14t

# pin it for a project
uv init my-app --python 3.14t

# verify the build from inside the project (you'll see "free-threading build" in the output)
cd my-app
uv run python -VV

To confirm the GIL is actually off in a running process, check from code.

check GIL status
import sys

print(sys._is_gil_enabled())   # False means running without the GIL

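This check matters for a specific reason: importing a C extension that hasn’t declared free-threading support re-enables the GIL for the entire process. The import does not fail. CPython prints a RuntimeWarning naming the module, and from then on the process simply behaves like a GIL build, so the switch is easy to miss in noisy logs. Make a habit of importing all your dependencies and then checking sys._is_gil_enabled(). You can also force the GIL on in a free-threaded build with the environment variable PYTHON_GIL=1 or the runtime option -X gil=1. Setting PYTHON_GIL=0 or -X gil=0 keeps it off even when such an extension is loaded, at your own risk.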
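One way to build that habit is a small startup check that runs after every import. The sketch below is one possible shape for it, not an official recipe: gil_status is a hypothetical helper name and the three labels it returns are invented for this example. It falls back gracefully on interpreters older than 3.13, where sys._is_gil_enabled() does not exist.

```python
import sys
import sysconfig

# Import your real dependencies above this point, so that any extension
# which re-enables the GIL has already been loaded before the check runs.


def gil_status():
    """Return 'gil-build', 'gil-enabled', or 'free-threaded'."""
    # Py_GIL_DISABLED is truthy only on free-threaded builds.
    if not sysconfig.get_config_var("Py_GIL_DISABLED"):
        return "gil-build"
    # The function exists from 3.13 onward; assume the GIL is on if missing.
    is_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    return "gil-enabled" if is_enabled() else "free-threaded"


print(gil_status())
```

On a free-threaded build, a result of "gil-enabled" means either an extension switched the GIL back on or it was forced on through PYTHON_GIL or -X gil.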

What gets faster

Split CPU-bound work across threads and you can expect speedups that scale with your core count, up to the limits set by any serial portion of the program and by threading overhead.

code that goes truly parallel under free-threading
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit):
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(count_primes, [200_000] * 8))

On a GIL build, the 8 threads in this code run effectively in series and take about as long as one thread. On the free-threaded build, 8 cores work simultaneously and the speedup approaches the core count.
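To see the difference on your own machine, a small timing harness is enough. The sketch below reuses count_primes from the example above; compare is a hypothetical helper, and the job size and worker count are arbitrary defaults chosen to keep the run short. No expected timings are shown because they depend entirely on the build and the hardware.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def compare(limit=20_000, workers=4):
    """Time the same jobs run serially and on a thread pool."""
    jobs = [limit] * workers

    start = time.perf_counter()
    serial = [count_primes(n) for n in jobs]
    serial_s = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        threaded = list(pool.map(count_primes, jobs))
    threaded_s = time.perf_counter() - start

    assert serial == threaded  # both paths must compute the same counts
    return serial_s, threaded_s


serial_s, threaded_s = compare()
print(f"serial {serial_s:.2f}s, threaded {threaded_s:.2f}s, "
      f"ratio {serial_s / threaded_s:.1f}x")
```

Run it once under the default build and once under 3.14t: a ratio near 1 suggests the threads are taking turns, while a ratio near the worker count suggests they are running in parallel.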

The advantages over multiprocessing are clear too. No process-spawning cost, and no pickling data back and forth. Threads share large arrays or models in the same memory, as is. For workloads that wanted CPU parallelism and shared data at the same time, this is a structural improvement.
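The shared-memory point can be sketched with nothing but the standard library. In the example below every thread reads the same list in place and writes its result into its own slot, so nothing is copied or pickled. The names data, partial, and sum_slice are illustrative only, and a plain list stands in for a large array or model.

```python
import threading

data = list(range(1_000_000))   # one large structure, visible to every thread
workers = 4
partial = [0] * workers         # each thread writes only to its own slot


def sum_slice(index):
    chunk = len(data) // workers
    start = index * chunk
    # The last worker also takes any leftover elements.
    stop = len(data) if index == workers - 1 else start + chunk
    total = 0
    for i in range(start, stop):  # reads the shared list directly, no copy
        total += data[i]
    partial[index] = total


threads = [threading.Thread(target=sum_slice, args=(i,)) for i in range(workers)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(partial) == sum(data))
```

Giving each thread a private output slot avoids the need for a lock here. Shared mutable state that several threads write to still needs explicit synchronization.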

Cost 1: single-thread overhead

The GIL, being simple, was also a fast synchronization mechanism. Replace it with fine-grained locks and distributed reference counting, and single-thread performance actually drops.

As of 3.14, single-thread overhead is commonly quoted as roughly 5–10% depending on platform, though pyperformance benchmark averages span a wider range, from about 1% on macOS aarch64 to about 8% on x86-64 Linux. In the 3.13 experimental build the adaptive interpreter was disabled, making it far worse; 3.14 shrank it dramatically, and the CPython team’s direction is to keep shrinking it in subsequent versions.

The interpretation: if your program uses only one thread, the free-threaded build is pure loss. It only pays off when multicore speedups more than offset this overhead.
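A back-of-the-envelope way to reason about that break-even point is Amdahl’s law with the overhead folded in. This model is an assumption introduced here for illustration, not something the CPython documentation prescribes. estimated_speedup and its parameters are hypothetical names, and the default of 0.08 is the x86-64 Linux average quoted above.

```python
def estimated_speedup(parallel_fraction, cores, overhead=0.08):
    """Estimate speedup of a free-threaded run over a single-threaded GIL run.

    Assumes all work costs (1 + overhead) times as much, and that only
    parallel_fraction of it can be spread evenly across the cores.
    """
    serial_part = 1.0 - parallel_fraction
    runtime = (1.0 + overhead) * (serial_part + parallel_fraction / cores)
    return 1.0 / runtime


for fraction in (0.0, 0.5, 0.9):
    print(fraction, round(estimated_speedup(fraction, cores=8), 2))
```

Under these assumptions a fully serial program lands at about 0.93x, a half-parallel one on 8 cores at about 1.65x, and a 90%-parallel one at about 4.36x. Real numbers will differ, but the shape holds: the overhead is paid everywhere, the gain only on the parallel part.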

Cost 2: a C extension ecosystem in transition

Pure Python code mostly just works. The problem is C extensions. Extensions written over 30 years on the assumption that a GIL exists must be fixed to guarantee their own thread safety, and wheels for the free-threaded build (cp314t) must be built and published separately.

The state of things as of mid-2026:

  • NumPy: the 2.4 line keeps improving free-threading support and ships cp314t wheels for major platforms.
  • PyTorch: after preview wheels in 2.9, official cp314t wheels ship starting with 2.10.0 (January 2026).
  • Unsupported packages: foundational packages like grpcio still lack cp314t wheels. Any project depending on one of them is blocked all the way up the stack.

The compatibility status across the ecosystem is tracked package by package at py-free-threading.github.io. The summary: the core numerical libraries have mostly crossed over, and what remains is the long tail of small C extensions.

Should you use it in production now?

Let’s lay out the decision criteria.

For most cases, the default build is still the answer.

  • Web API servers, scripts, and I/O-bound workloads gain almost nothing from free-threading. You only pay the overhead.
  • If your dependency tree is heavy on C extensions and even one is unsupported, the GIL turns back on with nothing more than an import-time warning, and the entire point of adopting it evaporates.
  • You also need to verify separately that your production monitoring and profiling tools support the free-threaded build.

There are clearly cases worth trying.

  • Workloads that need multicore CPU computation and large shared data together. Code that suffered under multiprocessing’s IPC costs is the prime example.
  • Projects with few dependencies, or built mostly on libraries like NumPy and PyTorch that have already finished compatibility work.
  • New projects where you want to validate a concurrency design. uv python install 3.14t gives you an isolated experiment environment in one line, so the cost is low.

In short: a full switch is premature; this is the stage for experimenting with selected workloads. If you do adopt it, ship a sys._is_gil_enabled() check before deployment along with race condition tests under multithreaded load. Concurrency bugs that the GIL previously masked can surface under free-threading.
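A race condition test under load can start as small as the sketch below. Counter and hammer are hypothetical names made up for this example. The unlocked method performs a read-modify-write that is not atomic, so its total may or may not come up short on any given run, which is why no expected output is shown for it. The locked variant should always match.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        self.value += 1          # read-modify-write, not atomic

    def safe_increment(self):
        with self._lock:         # serialize the read-modify-write
            self.value += 1


def hammer(method_name, threads=8, iterations=50_000):
    """Call one Counter method from many threads; return (actual, expected)."""
    counter = Counter()
    method = getattr(counter, method_name)

    def work():
        for _ in range(iterations):
            method()

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value, threads * iterations


print("locked:  ", hammer("safe_increment"))
print("unlocked:", hammer("unsafe_increment"))  # may fall short of the expected total
```

A passing run of the unlocked case proves nothing, so tests like this are best repeated many times and under the free-threaded build specifically.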

What happens to multiprocessing and asyncio?

Even in a future where free-threading is the default, neither tool disappears — their roles just narrow. multiprocessing remains for cases where process isolation itself is the goal (crash isolation, separate memory limits, separate interpreter state), while the “we only used it because we had to for CPU parallelism” usage will likely migrate to threads. asyncio keeps its own territory, large-scale I/O concurrency, which never depended on the GIL. If anything, running multiple event loops, one per thread, on top of a free-threaded build is a combination that’s newly opening up. The change simplifies tool selection from “how do I dodge the GIL?” to “what fits the workload?”
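The one-event-loop-per-thread combination needs no special API. A minimal sketch, assuming each thread owns an independent batch of I/O: asyncio.run creates a fresh loop in whichever thread calls it. The function names are hypothetical, and asyncio.sleep stands in for real network calls.

```python
import asyncio
import threading


async def fetch_all(worker_id, n_tasks):
    async def fake_io(i):
        await asyncio.sleep(0.01)   # stands in for a network call
        return (worker_id, i)

    # gather returns results in the order the awaitables were passed
    return await asyncio.gather(*(fake_io(i) for i in range(n_tasks)))


def run_loop(worker_id, n_tasks, results):
    # asyncio.run creates and closes a fresh event loop in this thread
    results[worker_id] = asyncio.run(fetch_all(worker_id, n_tasks))


def run_workers(n_threads=4, n_tasks=5):
    results = [None] * n_threads
    threads = [
        threading.Thread(target=run_loop, args=(i, n_tasks, results))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_workers())
```

This runs on a GIL build too. What free-threading adds is that any CPU work done between awaits in different threads can proceed in parallel.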

Wrap-up

  • Free-threading is a separate CPython build reworked to run safely with the GIL removed (PEP 703).
  • It arrived as an experiment in 3.13 and reached official support in 3.14 via PEP 779. The default build still has the GIL.
  • Install with uv python install 3.14t, and after your imports, verify the GIL is really off with sys._is_gil_enabled().
  • CPU-bound multithreading can speed up by a factor approaching the core count, and memory is shared without multiprocessing’s serialization costs.
  • The costs: roughly 5–10% single-thread overhead (varies by platform) and a C extension ecosystem still in transition.
  • For now, experiment with selected workloads rather than switching wholesale — CPU-parallel plus shared-data workloads are the ones worth trying first.

For the general background on the GIL and the division of labor among threading, multiprocessing, and asyncio, see Modern Python Advanced #5 — worth reading first if you need the context.
