Testing #1 — Why Test? The Place of Unit, Integration, and E2E
After finishing the React track and the TypeScript track, the next thing you naturally bump into is testing. There are many tools (Vitest, Jest, Cypress, Playwright), and the paradigms are split too (behavior vs implementation). In this first post, instead of tools, we’ll start with the picture.
This series, Testing, has six parts.
- #1 Why Test? The Place of Unit, Integration, and E2E ← this post
- #2 Vitest setup and your first unit test
- #3 React Testing Library — component tests
- #4 Async and network mocking — MSW
- #5 User events and form testing
- #6 E2E and CI integration with Playwright
There’s almost no code in this post. Every decision in later posts builds on this picture.
The real reasons people don’t test #
“Too busy” is the most common answer. But take one step in and you’ll see other layers.
- I don’t know what to test — the pressure of feeling like you have to write five tests per component. You don’t know where to stop.
- A breaking test is scarier — you change the code a little and 20 tests turn red. “If this is what testing is, I’d rather not write any.”
- CI is red and nobody looks — once it starts breaking, people go numb.
When these three pile up, you arrive at “tests are great, but they don’t fit our project.” In reality, that conclusion comes from a wrong call at one fork in the road: which kind of test to use where. That fork is the topic of this post.
The test pyramid — the oldest picture #
Let’s start with the familiar one.
            /\
           /  \
          / E2E\            ← few, expensive, slow
         /------\
        /        \
       /          \
      / Integration\        ← a moderate number
     /--------------\
    /                \
   /       Unit       \     ← many, cheap, fast
  /--------------------\
What the three layers mean:
- Unit — testing a single function/class/component in isolation from everything else. Usually on the order of milliseconds. You can carry hundreds or thousands of them with no pain.
- Integration — the area where multiple modules run together. A server handler running with a DB, a parent component interacting with its children, and so on.
- E2E (end-to-end) — real browser, real server, real DB, all together. The user scenario as it is.
The shape of the pyramid recommends one thing clearly: lots of unit tests, fewest E2E. Why:
- Cost — unit tests run in milliseconds, E2E in seconds. The time gap between 100 E2E and 100 unit tests is overwhelming.
- Stability — E2E is sensitive to variables like network, timing, and display. The same test passing one moment and failing the next — flakiness — gets bad fast.
- Debugging — when an E2E test breaks, it’s hard to track down “where” it broke.
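To make the bottom layer concrete, here is a minimal sketch of a unit-level check in plain TypeScript, with no test runner so it stands on its own. `slugify` and `assertEqual` are hypothetical names invented for this illustration; in the series itself the same check would sit inside a Vitest `test` block (set up in #2).

```typescript
// Hypothetical example: `slugify` is not from the original post.
// A pure function with no network, DB, or DOM. That isolation keeps unit tests fast.
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, "");    // drop leading and trailing hyphens
}

// Tiny stand-in for a test runner's `expect(actual).toBe(expected)`.
function assertEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

assertEqual(slugify("  Why Test? "), "why-test");
assertEqual(slugify("Unit, Integration & E2E"), "unit-integration-e2e");
```

Nothing here touches a browser, a server, or a clock, which is why hundreds of checks like this stay cheap to run.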
But — there are things unit tests alone don’t catch #
If you take the pyramid too literally, you fall into a trap. The classic anti-pattern:
import { test, expect } from 'vitest';
import { renderHook, act } from '@testing-library/react';
import { useCounter } from './useCounter'; // hypothetical path, for illustration

test('useCounter sets state correctly', () => {
  const { result } = renderHook(() => useCounter());
  act(() => result.current.setCount(5));
  expect(result.current.count).toBe(5);
});
What is this test catching? That calling setCount(5) makes count equal 5; in other words, that React’s useState works. It barely catches problems in our own code. Meanwhile, change the internals of useCounter and it breaks.
When tests like this pile up, you arrive at the point where refactoring becomes scary. “The behavior is the same, I just changed the implementation,” but 50 tests turn red — and you stop refactoring forever.
The direction of the fix is twofold.
- Test behavior: verify only what the user/caller sees. As long as the result is the same, internal changes pass.
- Don’t draw the unit boundary too narrowly: instead of testing useCounter alone, test it together with the component that uses it. Surprisingly, this is more stable.
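As a rough sketch of what “test behavior” buys you, the snippet below runs one behavior-level check against two different implementations of the same counter. Every name in it (`Counter`, `createCounterV1`, `createCounterV2`, `checkCounterBehavior`) is hypothetical and exists only for this illustration; it uses no React so that it runs on its own.

```typescript
// Hypothetical example: none of these names come from the original post.
interface Counter {
  increment(): void;
  value(): number;
}

// Version 1 keeps a plain number.
function createCounterV1(): Counter {
  let count = 0;
  return {
    increment: () => { count += 1; },
    value: () => count,
  };
}

// Version 2 is a refactor: it records every event and derives the count from them.
function createCounterV2(): Counter {
  const events: string[] = [];
  return {
    increment: () => { events.push("inc"); },
    value: () => events.length,
  };
}

// Behavior-level check: it only uses what a caller can see, never the internals.
function checkCounterBehavior(make: () => Counter): boolean {
  const counter = make();
  counter.increment();
  counter.increment();
  return counter.value() === 2;
}

const v1Passes = checkCounterBehavior(createCounterV1);
const v2Passes = checkCounterBehavior(createCounterV2);
```

Both versions pass the same check. A check that reached into `count` or `events` directly would have gone red on the refactor even though nothing changed for the caller.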
The testing trophy — a more modern picture #
Kent C. Dodds proposed a picture that captures this nuance.
┌──────────────┐
│     E2E      │ ← a few core scenarios
├──────────────┤
│              │
│ Integration  │ ← the most
│              │
├──────────────┤
│     Unit     │ ← complex logic only
├──────────────┤
│    Static    │ ← TS, ESLint, type checking
└──────────────┘
Two things differ from the pyramid.
- Static analysis (type checker, linter) sits at the bottom. Don’t make tests catch what TypeScript / ESLint already catches. It’s the strongest free safety net you can get.
- Integration is the thickest layer. “Multiple components running together” is closest to the user’s view and stable enough to survive refactoring.
This series follows the trophy view. In frontend territory like React/Next.js, the boundary between “unit” and “integration” is blurry — almost every RTL test is closer to integration anyway.
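A small sketch of what the static layer catches for free. `Status` and `statusLabel` are hypothetical names made up for this example; the `@ts-expect-error` comment is there only so the deliberately wrong call can sit in a file that still compiles.

```typescript
// Hypothetical example: `Status` and `statusLabel` are invented for illustration.
type Status = "idle" | "loading" | "error";

function statusLabel(status: Status): string {
  switch (status) {
    case "idle":
      return "Ready";
    case "loading":
      return "Loading...";
    case "error":
      return "Something went wrong";
  }
}

const loadingLabel = statusLabel("loading");

// The type checker rejects this call before any test runs.
// @ts-expect-error "done" is not a member of Status
statusLabel("done");
```

Writing a test that passes `"done"` and asserts on the result would spend time on something the compiler already refuses to build.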
What to test — a decision tree #
One tree that helps.
Would a user notice if this code broke?
├─ No → don't test it (or leave it to static analysis)
└─ Yes → test it
     ↓
   At which layer?
   ├─ A complex algorithm in a single function → unit
   ├─ Interaction between several modules/components → integration
   └─ A core user scenario (sign-up, checkout, etc.) → E2E
“Will the user notice?” is the big filter. Don’t worry about things like the variable names of an internal helper function. Test only what’s externally observable.
What it means to test “behavior” #
Let’s test the same component in two different ways.
import { useState } from 'react';

function LoginForm({ onSubmit }) {
  const [email, setEmail] = useState('');
  const [password, setPassword] = useState('');
  const handleSubmit = (e) => {
    e.preventDefault();
    onSubmit({ email, password });
  };
  return (
    <form onSubmit={handleSubmit}>
      <label>
        Email
        <input value={email} onChange={(e) => setEmail(e.target.value)} />
      </label>
      <label>
        Password
        <input
          type="password"
          value={password}
          onChange={(e) => setPassword(e.target.value)}
        />
      </label>
      <button type="submit">Log in</button>
    </form>
  );
}
Implementation-bound test (anti-pattern):
import { test, vi } from 'vitest';
import { render, fireEvent } from '@testing-library/react';

test('email state updates', () => {
  const { container } = render(<LoginForm onSubmit={vi.fn()} />);
  // Picks the input by position in the DOM
  const input = container.querySelectorAll('input')[0];
  fireEvent.change(input, { target: { value: 'a@b.com' } });
  // attempt to verify internal state ...
});
The index in querySelectorAll('input')[0] breaks the moment the component shifts even slightly: a reordering, or an extra input placed before it.
Behavior-focused test:
import { test, expect, vi } from 'vitest';
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';

test('when the user fills in and submits the form, onSubmit is called with the values', async () => {
  const onSubmit = vi.fn();
  render(<LoginForm onSubmit={onSubmit} />);
  // Find each field the way a user would: by its visible label
  await userEvent.type(screen.getByLabelText('Email'), 'a@b.com');
  await userEvent.type(screen.getByLabelText('Password'), 'secret');
  await userEvent.click(screen.getByRole('button', { name: 'Log in' }));
  // Assert only on what the component exposes to its caller
  expect(onSubmit).toHaveBeenCalledWith({
    email: 'a@b.com',
    password: 'secret',
  });
});
The differences are clear.
- Inputs are found by label rather than index → the same way a user fills out the form on screen.
- The button is found by role + name rather than just text → accessibility-friendly.
- Verification touches only the externally exposed onSubmit callback → renaming an internal state variable doesn’t break it.
Every component test in this series goes with the second form.
The place and limits of mocking #
In tests, external dependencies (API, DB, time, randomness) are usually mocked. Why this is needed:
- External calls are slow, depend on the network, and have a cost.
- External responses can change, making tests flaky.
- Scenarios like “a specific error response” are hard to produce in a real system.
But mocking is a dangerous tool.
- Mock too much and tests drift from the real system. This is the most common cause of “all tests passed but production broke.”
- Mock too deep and the mock and the real implementation evolve separately. When one side changes and the other can’t keep up, you have a bug waiting.
Principles:
- Mock only at external system boundaries: HTTP calls, DB connections, the file system.
- Internal modules are almost never mocked. Letting a unit test use the real result of another module flows naturally into integration testing.
- HTTP mocking is covered in #4 (MSW): instead of mocking fetch, you intercept at the network layer. From the code’s point of view, it’s as if the real fetch was called.
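One way to picture “mock only at the boundary” is the sketch below. It uses time as the external dependency rather than HTTP (which #4 handles with MSW), because a clock can be faked without any library. `Clock`, `Item`, `subtotal`, `orderTotal`, and the sale date are all hypothetical, chosen only for this illustration.

```typescript
// Hypothetical example: all names and the sale date are invented for illustration.

// The external boundary: wall-clock time. This is the only thing the test replaces.
interface Clock {
  now(): Date;
}

interface Item {
  price: number; // in cents, to keep the arithmetic in integers
  quantity: number;
}

// Internal module: the test uses it as-is and never mocks it.
function subtotal(items: Item[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

const SALE_ENDS = new Date("2025-01-01T00:00:00Z");

// 10% off while the sale is running, full price afterwards.
function orderTotal(items: Item[], clock: Clock): number {
  const sum = subtotal(items); // the real internal code runs
  const onSale = clock.now().getTime() < SALE_ENDS.getTime();
  return onSale ? sum - Math.floor(sum / 10) : sum;
}

// Fake clocks pinned to fixed instants, so the result never depends on today's date.
const duringSale: Clock = { now: () => new Date("2024-12-31T23:59:59Z") };
const afterSale: Clock = { now: () => new Date("2025-01-01T00:00:00Z") };

const items: Item[] = [{ price: 1000, quantity: 3 }];
const saleTotal = orderTotal(items, duringSale);   // 2700
const regularTotal = orderTotal(items, afterSale); // 3000
```

Because `subtotal` is the real thing, a bug in it would still fail this check. Had it been mocked as well, the check would only prove that the mock returns what the mock was told to return.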
“Tests first” or “tests later”? #
The TDD debate is endless. We’ll only sketch the outline here.
Where TDD works well:
- Clear algorithms, functions with well-defined inputs and outputs (parser, validator, calculator).
- Bug fixes — write a failing test first, then fix it; the bug doesn’t come back.
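The bug-fix case can be sketched as follows, under the assumption of a made-up bug report: “quantities typed with a thousands separator are read wrong.” `parseQuantityBuggy`, `parseQuantity`, and `regressionCheck` are hypothetical names for this illustration only.

```typescript
// Hypothetical example: these functions are invented for illustration.

// Before the fix: parseInt stops at the comma, so "1,000" becomes 1.
function parseQuantityBuggy(input: string): number {
  return parseInt(input, 10);
}

// After the fix: strip thousands separators before parsing.
function parseQuantity(input: string): number {
  return parseInt(input.replace(/,/g, ""), 10);
}

// The regression check, written before the fix. It encodes the bug report itself.
function regressionCheck(parse: (input: string) => number): boolean {
  return parse("1,000") === 1000;
}

const failsBeforeFix = !regressionCheck(parseQuantityBuggy); // true: red first
const passesAfterFix = regressionCheck(parseQuantity);       // true: green after the fix
```

Seeing the check fail first is the point: it shows the check actually detects this bug, so if the bug is reintroduced later, the check goes red again.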
Where TDD feels awkward:
- Components whose UI shape isn’t decided yet.
- Integration code where you don’t yet know the actual response shape of the external API.
- Exploratory coding (the stage of trying out whether a library is the right fit).
This series doesn’t force TDD. “Is there a test in the place where one should be?” is the more important question.
The trap of coverage #
Coverage is the number that shows “what % of the code did the tests execute.” You’ll be tempted to set a goal like 90%, but there are two traps.
- 90% coverage doesn’t mean 90% quality: it just means the lines were executed, not verified. Even a test without a single expect can have high coverage.
- The last 10% costs the most: error paths, edge cases, branches that rarely run. Pour time into them and tests start sliding into anti-patterns.
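A small sketch of the first trap, “executed is not verified.” `shippingFee` and both checks are hypothetical names for this illustration, and the off-by-one comparison is planted on purpose.

```typescript
// Hypothetical example: `shippingFee` and both checks are invented for illustration.

// Intended rule: orders of 50 or more ship free. The comparison is wrong on purpose.
function shippingFee(orderTotal: number): number {
  if (orderTotal > 50) { // bug: should be >=
    return 0;
  }
  return 5;
}

// Runs every line of shippingFee, so line coverage reports 100%, but verifies nothing.
function coverageOnlyCheck(): boolean {
  shippingFee(100);
  shippingFee(10);
  return true; // always "passes"
}

// Runs the same lines and also asserts on the results, including the boundary value.
function assertingCheck(): boolean {
  return shippingFee(100) === 0 && shippingFee(10) === 5 && shippingFee(50) === 0;
}

const coverageOnlyPasses = coverageOnlyCheck(); // true
const assertingPasses = assertingCheck();       // false: the boundary bug is caught
```

Both checks execute exactly the same lines, so a line-coverage report cannot tell them apart. Only the second one notices that an order of exactly 50 is charged a fee.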
The view I recommend:
- Write new code so that it’s naturally covered to about 80%. The parts that aren’t are usually meaningless code (dead code, unnecessary branches).
- Look at scenario coverage rather than branch/line coverage. Are core flows like “sign up → verification email → log in” all tested?
- Don’t aim for 100% coverage. Once the main flows are covered, trying to cover the last line is a bigger loss.
We’ll come back to coverage reports in CI in #6.
Allocating your time — the conclusion of this track #
Testing is a story about how you allocate time. With infinite time you could cover every line with unit, integration, and E2E tests, but nobody has infinite time. So where do you spend what you have?
- Static analysis (types, linter) — almost free. Keep it on.
- Integration tests (RTL + MSW) — the best value. More than half of your time.
- Unit tests — only for complex logic (calculator, parser, validator).
- E2E — 5–10 core user flows. Going beyond that turns into a maintenance burden.
This allocation sets the skeleton of the 6-part track.
#1: Getting the picture (this post)
#2: Unit testing tool (Vitest)
#3, #4, #5: Integration tests (RTL + MSW + userEvent)
#6: E2E (Playwright) + CI
Starting the series: what you should have on hand #
From the next post on, we get hands-on. What you should have:
- Node 22+ (so Vitest cleanly handles modern ESM/types)
- pnpm or npm
- TypeScript familiarity is nice but not required — the TS track is enough of a stepping stone.
Tool setup and the first test happen in #2 Vitest.
Wrap-up #
- The reason people don’t test is usually not “too busy” but the lack of a picture for what / where / how.
- More than the pyramid, the trophy — allocate weight in the order of static analysis → integration → unit → E2E. Integration is the thickest layer.
- Test behavior. Bind to implementation and refactoring becomes scary.
- Mock only at system boundaries. Mock internal modules and the test and the real thing evolve apart.
- Don’t cling to coverage numbers. Whether the core scenarios are covered matters more.
- The track flows: static analysis → Vitest unit → RTL+MSW integration → Playwright E2E.
In the next post (#2 Vitest setup and your first unit test), we set up the tools hands-on. We’ll attach Vitest to a project and write the first test on the simplest function, then look at what describe / it / expect mean and what watch mode is for.