Chapter 30

E2E testing — Playwright

Automating full scenarios in a real browser with Playwright. Setup, locator rules, auth via storageState, the page object pattern, taming flaky tests, CI integration, and visual regression.

Chapter 29 covered automated tests at the component and hook level. They do a good job of verifying that each component behaves as intended in isolation. What unit tests cannot fully guarantee is that an entire user scenario, which only emerges when components come together (sign-up → login → add Todo → toggle done), runs correctly. This chapter looks at Playwright, the tool that fills that gap.

If the unit tests of Chapter 29 verify small pieces frequently, the E2E tests of this chapter verify large flows occasionally. The two levels complement each other; replacing one with the other quickly raises the cost. And in the final chapter of Part 5 (Chapter 33, Deploy and Observability), we extend this chapter’s E2E to run automatically against preview deploy environments.

What E2E catches #

Bugs that the unit / integration tests of Chapter 29 cannot catch usually take these shapes:

  • Contracts between components drift: component A sends a string, component B expects a number.
  • Routing / state preservation: a bug where the input disappears when you leave the page and return.
  • Auth state and permissions: accessing a protected page while logged out.
  • Production-build-only issues: dynamic import timing, hydration mismatches.
  • Integration with external dependencies: real DB / Server Action / auth flow.

These bugs only show up in an environment where every part runs together. Playwright launches a real browser and automates the click-and-type flow exactly as a user would do it.

Playwright setup #

install Playwright
pnpm create playwright@latest

Answer the prompts (TypeScript Yes, GitHub Actions Yes recommended).

After installation, these files appear:

generated files
modern-react-demo/
├── playwright.config.ts        ← config
├── tests/                       ← E2E test files
│   └── example.spec.ts
└── tests-examples/              ← learning examples (delete if not needed)

Key settings in playwright.config.ts.

playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: 'html',

  use: {
    baseURL: 'http://localhost:3000',
    trace: 'on-first-retry',
  },

  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],

  webServer: {
    command: 'pnpm dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});

The essentials.

  • baseURL saves you from writing the host in every test.
  • webServer brings up the Next.js dev server automatically before tests start. The same applies in CI.
  • projects runs the three browsers Chromium / Firefox / WebKit as a matrix.
  • retries: 2 (CI-only) auto-retries transient flakes up to two times.

First scenario — the home page renders #

tests/home.spec.ts
import { test, expect } from '@playwright/test';

test('renders the home page', async ({ page }) => {
  await page.goto('/');

  await expect(page.getByRole('heading', { name: 'Home' })).toBeVisible();
  await expect(page.getByRole('link', { name: 'About' })).toBeVisible();
});

test('navigates to the about page', async ({ page }) => {
  await page.goto('/');

  await page.getByRole('link', { name: 'About' }).click();

  await expect(page).toHaveURL('/about');
  await expect(page.getByRole('heading', { name: 'About' })).toBeVisible();
});

Run the tests.

run tests
pnpm exec playwright test

Adding the --ui flag opens UI mode, where you can walk through each test step visually.

UI mode
pnpm exec playwright test --ui

Locator priority #

The same principle from Testing Library in Chapter 29 applies. Prefer selectors a user can perceive on screen.

The recommended order:

  1. getByRole(role, { name }) — ARIA role + accessible name. The most stable.
  2. getByLabel(text) — form controls linked through a label.
  3. getByPlaceholder(text) — placeholders.
  4. getByText(text) — text content.
  5. getByTestId(id) — the data-testid attribute. The last resort.

CSS selectors (page.locator('.btn-primary')) work too, but CSS classes are fragile under design refactors. The same selector breaking after a design-system swap is a costly surprise. Role / label-based selectors are stable.

Use getByTestId only when every option above is impractical: data-testid attributes ship into the production build, so they are worth adding only when no other selector can express the target.

Automating the auth flow — storageState #

Most scenarios — Todo apps and the like — run from a logged-in state. Filling out the login form in every test is slow and flaky.

Playwright provides a standard pattern: save the logged-in state once, then have every test reuse it.

tests/auth.setup.ts
import { test as setup, expect } from '@playwright/test';

const authFile = 'playwright/.auth/user.json';

setup('saves the logged-in state', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill('test-password');
  await page.getByRole('button', { name: 'Log in' }).click();

  await expect(page).toHaveURL('/dashboard');

  await page.context().storageState({ path: authFile });
});

Set up the setup project and dependencies in playwright.config.ts.

playwright.config.ts (auth setup added)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'], storageState: 'playwright/.auth/user.json' },
      dependencies: ['setup'],
    },
  ],
});

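Now every chromium test starts already logged in. There is no need to walk through the login form each time, which speeds things up considerably. Because the saved file contains session cookies, keep the playwright/.auth directory out of version control.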

Per-role storageState #

If you handle scenarios for different permission levels — admin / regular user — create a separate storageState file per role. This dovetails naturally with the role-based pattern we will build in Chapter 32 (Auth and Sessions).
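
One way to keep several roles manageable is to derive the project list from a list of roles. The following is a minimal sketch, not a prescribed layout: the role names, the chromium-<role> project names, and the *.<role>.spec.ts file-naming convention are all assumptions made for illustration. It builds plain objects shaped like Playwright project entries, so it has no dependency on Playwright itself.

```typescript
// Hypothetical roles; adjust to the app's permission model.
type Role = 'admin' | 'user';

// Minimal shape of the project entries used in this sketch.
interface ProjectSketch {
  name: string;
  testMatch: RegExp;
  use?: { storageState: string };
  dependencies?: string[];
}

// One storageState file per role, next to the chapter's user.json.
function authFileFor(role: Role): string {
  return `playwright/.auth/${role}.json`;
}

// Builds the setup project plus one project per role.
// Assumed convention: admin scenarios live in *.admin.spec.ts, and so on.
function roleProjects(roles: Role[]): ProjectSketch[] {
  const setup: ProjectSketch = { name: 'setup', testMatch: /.*\.setup\.ts/ };
  const perRole = roles.map((role): ProjectSketch => ({
    name: `chromium-${role}`,
    testMatch: new RegExp(`.*\\.${role}\\.spec\\.ts`),
    use: { storageState: authFileFor(role) },
    dependencies: ['setup'],
  }));
  return [setup, ...perRole];
}
```

In playwright.config.ts the result could be passed as projects: roleProjects(['admin', 'user']), with the setup file logging in once per role and saving each session to authFileFor(role).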

The page object pattern #

As tests grow, the same selectors and actions repeat across files.

repeated selectors
test('adds a todo', async ({ page }) => {
  await page.goto('/todos');
  await page.getByLabel('Task').fill('Exercise');
  await page.getByRole('button', { name: 'Add' }).click();
  // ...
});

test('completes a todo', async ({ page }) => {
  await page.goto('/todos');
  await page.getByLabel('Task').fill('Reading');
  await page.getByRole('button', { name: 'Add' }).click();
  // ... selector duplication
});

The page object pattern gathers the selectors and actions for one page into a single object.

tests/pages/TodosPage.ts
import { type Page, type Locator } from '@playwright/test';

export class TodosPage {
  readonly input: Locator;
  readonly addButton: Locator;

  constructor(readonly page: Page) {
    this.input = page.getByLabel('Task');
    this.addButton = page.getByRole('button', { name: 'Add' });
  }

  async goto() {
    await this.page.goto('/todos');
  }

  async addTodo(text: string) {
    await this.input.fill(text);
    await this.addButton.click();
  }

  async toggleByText(text: string) {
    await this.page.getByRole('checkbox', { name: text }).check();
  }
}

test using the page object
import { test, expect } from '@playwright/test';
import { TodosPage } from './pages/TodosPage';

test('adds and completes a todo', async ({ page }) => {
  const todos = new TodosPage(page);
  await todos.goto();

  await todos.addTodo('Exercise');
  await expect(page.getByText('Exercise')).toBeVisible();

  await todos.toggleByText('Exercise');
  await expect(page.getByRole('checkbox', { name: 'Exercise' })).toBeChecked();
});

Caution: do not create page objects too early. Extract them once two or three scenarios accumulate and the same selectors start to repeat. Keeping the first one or two scenarios as plain tests reads better.

Making slow E2E faster #

E2E is slow by nature. The following tools speed it up.

1. Parallel execution #

With fullyParallel: true, tests inside a single file also run in parallel. The workers option controls the concurrency.

parallel config
import { defineConfig } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,
  workers: process.env.CI ? 4 : undefined,
});

Parallelization assumes tests are isolated from each other. Two tests sharing the same user ID running at the same time will clash. Patterns that isolate data per user / per test are required.
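
One simple isolation pattern is to give each test its own account or record name, derived from the worker index and the test title. The helper below is a sketch under assumptions: uniqueEmail is a hypothetical name, and the e2e- prefix and example.com domain are placeholders. Playwright passes a testInfo object as the second argument of a test function, and its workerIndex and title properties can feed the helper.

```typescript
// Builds an email address that differs per worker, per test, and per run.
// `now` is injectable so the result can be made deterministic when needed.
function uniqueEmail(workerIndex: number, label: string, now: number = Date.now()): string {
  // Turn an arbitrary test title into a short, address-safe slug.
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return `e2e-w${workerIndex}-${slug}-${now.toString(36)}@example.com`;
}
```

Inside a test this would read test('adds a todo', async ({ page }, testInfo) => { const email = uniqueEmail(testInfo.workerIndex, testInfo.title); ... }), so two workers running the same scenario never touch the same user.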

2. Deterministic selectors #

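A locator built from a common word, such as getByText('Done'), becomes ambiguous when several matches appear on screen, and Playwright's strict mode then fails the action instead of guessing. Narrow it down by scope, as in the following pattern.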

narrow by scope
await page.getByRole('listitem')
  .filter({ hasText: 'Exercise' })
  .getByRole('button', { name: 'Delete' })
  .click();

3. Lean on auto-wait #

Before each action, Playwright automatically waits until the target element is actionable (for a click: visible, stable, enabled, and able to receive events). You rarely need a waitFor of your own. The following is an anti-pattern.

🚫 unnecessary sleep
await page.waitForTimeout(2000);  // almost always a wrong signal
await page.click('button');

waitForTimeout is just “I’m not sure, wait a bit.” Too short and it goes flaky; too long and it is slow. Instead, state what you are waiting for.

✅ state what you wait for
await expect(page.getByText('Submitted!')).toBeVisible();
await page.click('button');
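
To see why a stated condition beats a fixed sleep, it helps to look at the general idea behind a retrying assertion: check the condition repeatedly, return as soon as it holds, and fail only when a deadline passes. The following is a conceptual sketch of that idea, not Playwright's actual implementation; waitUntil and its parameters are hypothetical names.

```typescript
// Polls `check` until it returns true or `timeoutMs` elapses.
// Resolves with the number of checks performed; rejects on timeout.
async function waitUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    // Return immediately once the condition holds: no time is wasted.
    if (await check()) return attempts;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms after ${attempts} attempts`);
    }
    // Short pause between checks instead of one long blind sleep.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A fast page finishes after the first check and a slow one gets the full timeout, which is exactly the trade-off a fixed waitForTimeout cannot make. In real tests, prefer Playwright's built-in retrying assertions; expect.poll covers conditions that are not tied to a locator.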

4. When retries are warranted #

A test that sometimes passes and sometimes fails with no code change is flaky. Turning on retries absorbs transient flakes automatically. That said, retries can become “a tool for hiding bugs.” A healthier flow is to flag flaky tests separately and chase down the cause.

flaky marker
test('an occasionally flaky scenario', async ({ page }) => {
  test.fixme();  // known flaky, fix in progress; Playwright skips this test until the marker is removed
  // ...
});
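
With retries enabled, Playwright's report distinguishes three outcomes: a test that passes on the first attempt is passed, one that fails first and passes on a retry is flaky, and one that fails every attempt is failed. The sketch below restates that rule as a small function so flaky tests can be listed and tracked instead of silently absorbed. The type and function names are hypothetical, and the input is assumed to be a per-test list of attempt results collected by some reporter or script.

```typescript
type AttemptResult = 'passed' | 'failed';
type Outcome = 'passed' | 'flaky' | 'failed';

// Classifies one test from its ordered attempts (first run, then retries).
function classifyOutcome(attempts: AttemptResult[]): Outcome {
  if (attempts.length === 0) throw new Error('at least one attempt is required');
  if (attempts[0] === 'passed') return 'passed';
  // The first attempt failed: a later pass means the failure was intermittent.
  return attempts.includes('passed') ? 'flaky' : 'failed';
}

// Returns the titles of tests that only passed thanks to a retry.
function flakyTitles(runs: Record<string, AttemptResult[]>): string[] {
  return Object.entries(runs)
    .filter(([, attempts]) => classifyOutcome(attempts) === 'flaky')
    .map(([title]) => title);
}
```

Reviewing the flaky list after each CI run keeps retries as a safety net while still surfacing the tests that need a root-cause fix.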

Visual regression (optional) #

When you want to verify pixel-level UI changes, use toHaveScreenshot.

visual regression test
test('home page visual consistency', async ({ page }) => {
  await page.goto('/');
  await expect(page).toHaveScreenshot('home.png');
});

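The first run has no baseline to compare against, so Playwright writes one and reports the missing snapshot; subsequent runs compare against that baseline pixel by pixel. Baselines are stored per browser and platform, and after an intentional UI change you regenerate them with pnpm exec playwright test --update-snapshots.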

The cost: fonts and antialiasing differ across environments, so false positives appear often. Do not put visual regression on every page; limit it to core components of the design system or pages where consistency really matters.
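
A small tolerance can absorb some of that rendering noise. The config fragment below is a sketch: the 0.01 ratio is an assumed starting value, not a recommendation from this chapter, and should be tuned against your own false-positive rate.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      // Allow up to 1% of pixels to differ before the comparison fails (assumed value).
      maxDiffPixelRatio: 0.01,
    },
  },
});
```

A looser threshold hides real regressions as easily as it hides noise, so generating baselines in the same environment that runs the comparison (for example, the CI image) usually helps more than raising the ratio.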

CI integration — GitHub Actions #

.github/workflows/e2e.yml:

.github/workflows/e2e.yml
name: e2e

on:
  push:
    branches: [main]
  pull_request:

jobs:
  e2e:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v3
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'pnpm'
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec playwright install --with-deps
      - run: pnpm exec playwright test
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 7

The essentials.

  • playwright install --with-deps installs the browsers along with their system dependencies.
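  • The if: failure() artifact upload stores the HTML report on GitHub Actions whenever the run fails. With trace: 'on-first-retry' in the config, the report includes traces for retried tests; videos and screenshots are included only if the video and screenshot options are enabled under use. Having these artifacts makes debugging far easier.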

E2E against preview deploy environments #

We will return to this in Chapter 33 (Deploy and Observability), but the most powerful pattern is running Playwright against the preview deploy URL from Vercel / Cloudflare Pages. Since it runs on the real production build with real environment variables, it catches production-only bugs well.

playwright.config.ts (preview deploy aware)
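import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Target the preview deployment when its URL is provided, otherwise localhost.
    baseURL: process.env.PREVIEW_URL ?? 'http://localhost:3000',
  },
  // Skip the local dev server when a preview URL is provided.
  webServer: process.env.PREVIEW_URL
    ? undefined
    : { command: 'pnpm dev', url: 'http://localhost:3000' },
});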

Unit / integration / E2E split — once more #

Let us fill in the table from Chapter 29 from this chapter’s standpoint.

Level       | Representative tool | What it verifies                        | Per-test speed | Recommended share
Unit        | Vitest              | Pure functions, small components, hooks | ~10 ms         | many
Integration | Vitest + jsdom      | Component cooperation, form flows       | ~100 ms        | medium
E2E         | Playwright          | User scenarios (sign-up → …)            | ~5 seconds     | few

Saying “keep E2E few” does not mean their value is low. One E2E test covers a wide swath of bugs, so scenario selection matters more than count.

What makes a good E2E scenario:

  • Business-critical flows (sign-up → payment → use, the critical path)
  • Areas where regressions occur often
  • Integrated behavior hard to express as units (for example, auth + permissions + routing)

Try it yourself — E2E for the Chapter 27 guestbook #

E2E-verify the guestbook unit-tested in Chapter 29.

  1. Post-a-message scenario: navigate to /guestbook, fill in name and message and submit → verify that the new message appears in the list.
  2. Validation-failure scenario: submit with an empty name → verify the error message appears on screen and nothing is added to the list.
  3. Delete scenario: after posting, click the delete button → verify that the message disappears.
  4. Extracting a page object: after writing the three scenarios above, extract the shared selectors into a GuestbookPage object. Compare the code before and after extraction and see for yourself how the readability changes.

Walk through these four steps, and both the rhythm of writing E2E tests and the right moment to extract a page object will become second nature.

Exercises #

  1. Splitting work between Vitest and Playwright. For each of the following five behaviors, decide which tool is the better fit. (a) Timezone handling in a formatDate utility, (b) the boolean toggle of the useToggle hook, (c) login → blocked access to a protected page, (d) empty-input validation on form submit, (e) the full payment flow. After answering, compare your choices with the recommended-share table in this chapter.
  2. Flaky analysis. Guess why the following test fails occasionally and fix it: await page.click('button'); await page.waitForTimeout(1000); expect(...). Refer to the “lean on auto-wait” section.
  3. Picking the scope for visual regression. Pick two pages from an app you built (or one built in this book) where visual regression would add value, and two where it would only add cost. Summarize your selection criterion in one sentence.

In short: Playwright is the standard tool for automating user scenarios in a real browser. Prefer role / label-based locators, and walk through auth once with storageState. Extract page objects after scenarios accumulate, and replace waitForTimeout with expects that state what you wait for. What matters across unit / integration / E2E is choosing the right scenarios, not hitting a count ratio (many / medium / few). Uploading the report and its traces as CI artifacts on failure cuts the debugging cost dramatically.

Next chapter #

The next chapter, Chapter 31 Performance · Bundles · Web Vitals, covers the tools for measuring and improving how quickly a built app reaches the user. Where Chapter 14 (Performance Optimization) was about React-internal re-render cost, Chapter 31 is about the three metrics users actually feel: LCP / INP / CLS. We also take another look at how the streaming of Chapter 26 (Suspense and use()) interacts with these metrics.
