Browser test failures that only happen on repeat runs are some of the most annoying bugs in automation. The app looks fine, the first run passes, then the second run fails on a login step, a dashboard assertion, or a navigation that should have been deterministic. The code under test did not change, but the browser did. That usually means stale client-side state is leaking between tests, and the usual suspects are cache, local storage, session storage, cookies, and IndexedDB.

When people talk about browser test failures caused by cache and local storage, they usually mean a broad class of state pollution problems, where one test run leaves behind data that changes the behavior of the next run. In modern web apps, that state can be subtle. It is not just a cached image or a remembered language preference. It can be auth tokens, feature flags, route metadata, offline data, service worker responses, or a partially synced IndexedDB record that makes the app behave differently on the next page load.

If a browser test is flaky only after the first pass, do not immediately blame waits or selectors. First ask, what did the browser remember?

What browser state pollution actually looks like

State pollution is easier to understand if you think in layers.

Cache

The browser cache can store network responses, static assets, and other fetch results. Sometimes the browser, a proxy, or a service worker serves old content, even when the server has already changed. This can cause:

  • stale JavaScript bundles being loaded after deploys,
  • old API responses hiding newly created records,
  • unexpected behavior after a test intentionally modifies server state,
  • tests passing locally but failing in CI because cache behavior differs.

Caching is not always a bug. For production users, it is a feature. For tests, it can be poison if the test assumes a clean first load.

For background on browser storage and caching concepts, the MDN cache storage documentation and service worker docs are worth reading.

Local storage

Local storage is simple, persistent key-value storage scoped to an origin. Many apps use it for theme preferences, auth-related hints, onboarding state, and app configuration. Because it survives page reloads and, in a regular (non-private) profile, browser restarts, it is a common source of test contamination.

Typical symptoms include:

  • a tutorial modal not appearing because a previous test dismissed it,
  • a feature flag remaining enabled from another run,
  • a login flow skipping steps because a token or flag exists,
  • the app opening to the wrong tenant, locale, or route.

MDN’s localStorage reference is a good reminder that the data is origin-bound, not test-bound.

IndexedDB

IndexedDB is where things get harder. Many modern apps use it for offline data, background sync queues, client-side caches, and app state that is too rich for local storage. Because it is structured, asynchronous, and often hidden behind application code, IndexedDB test failures can be harder to spot than simple local storage issues.

Typical symptoms include:

  • old records reappearing after a create/delete flow,
  • an offline-first app showing stale entities,
  • a search index or sync queue surviving between runs,
  • a migration behaving differently depending on previously stored schema versions.

If your app uses IndexedDB, the MDN IndexedDB guide is a useful reference.

Why these failures appear only on repeat runs

The first run often starts with a clean browser profile, especially in CI or a freshly launched local session. The second run reuses something that was written earlier. That reuse can come from several places:

  • the same browser context or profile is reused across tests,
  • browser storage is not cleared between specs,
  • service worker or HTTP cache survives test boundaries,
  • the app under test writes persistent state in setup steps,
  • test cleanup removes server data but not client-side state,
  • failures interrupt teardown, leaving the next run dirty.

The key idea is that browser automation often has two sources of truth: the backend and the browser. A test may reset the database but still keep an old auth token, cached API response, or IndexedDB record in the browser. The next run then begins from an inconsistent state.

How to prove the problem is browser state, not the app

Before changing your test suite, make sure you have evidence. It is easy to over-correct and start clearing everything after every test, which can hide real app issues and slow down the suite.

1. Reproduce with a fresh profile and then with a reused profile

Run the same failing test twice:

  • once in a brand-new browser profile or isolated context,
  • once in a reused profile or the same persistent context.

If the first run passes and the second fails, that is a strong sign that state is leaking.

2. Inspect browser storage after the failing step

Use devtools or automation APIs to inspect:

  • local storage keys and values,
  • IndexedDB databases and object stores,
  • cookies,
  • service worker registrations,
  • Cache Storage entries,
  • session storage if your runner reuses tabs.

In Chrome devtools, Application is the place to start. In automation, many tools expose APIs for reading storage directly. The goal is not to guess. The goal is to show which key or record changes the behavior.

3. Compare network behavior with a clean and dirty state

If the app loads different API responses, a network trace can reveal whether the browser is serving a cached response or the app is skipping a request entirely because it found persisted client-side data. Tools like the browser devtools network panel, Playwright traces, or Cypress logs are helpful here.
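
One way to make that comparison concrete is to record the request URLs from a clean run and from a dirty run, then list the requests that only the clean run made. The sketch below is a plain helper, and `missingRequests` and the example URLs are hypothetical names. How you collect the URL lists (a trace export, a request listener, runner logs) depends on your tooling.

```typescript
// Returns the URLs requested in the clean run that never appeared in the dirty run.
// A non-empty result suggests the dirty run was satisfied from a cache
// or from persisted client-side data instead of the network.
function missingRequests(cleanRun: string[], dirtyRun: string[]): string[] {
  const seenInDirtyRun = new Set(dirtyRun);
  const reported = new Set<string>();
  return cleanRun.filter(url => {
    // Skip URLs the dirty run also requested, and report each URL only once.
    if (seenInDirtyRun.has(url) || reported.has(url)) return false;
    reported.add(url);
    return true;
  });
}

// Hypothetical URL lists, for illustration only.
const cleanRunUrls = ['/api/session', '/api/projects', '/assets/app.js'];
const dirtyRunUrls = ['/assets/app.js'];
console.log(missingRequests(cleanRunUrls, dirtyRunUrls));
```

In this made-up example the dirty run never asked for `/api/session` or `/api/projects`, which is the kind of gap worth explaining before touching waits or selectors.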

4. Add a one-time diagnostic dump

During debugging, print a compact dump of relevant storage before the failing action.

typescript

const storage = await page.evaluate(() => ({
  localStorage: { ...localStorage },
  sessionStorage: { ...sessionStorage }
}));
console.log(storage);

That snippet only covers the easy part, but it is often enough to catch an unexpected flag or stale token.
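
If you capture that dump in both a passing and a failing run, a small diff helper can point at the exact keys that differ. This is a minimal sketch. `diffStorage` and the example keys are illustrative names, not part of any test runner.

```typescript
type StorageSnapshot = Record<string, string>;

interface StorageDiff {
  added: string[];   // keys present only in the dirty snapshot
  removed: string[]; // keys present only in the clean snapshot
  changed: string[]; // keys present in both, with different values
}

// Own-property check, so inherited names such as "toString" are not counted as keys.
const hasKey = (snapshot: StorageSnapshot, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(snapshot, key);

// Compares two storage dumps and reports which keys differ.
function diffStorage(clean: StorageSnapshot, dirty: StorageSnapshot): StorageDiff {
  return {
    added: Object.keys(dirty).filter(key => !hasKey(clean, key)),
    removed: Object.keys(clean).filter(key => !hasKey(dirty, key)),
    changed: Object.keys(clean).filter(
      key => hasKey(dirty, key) && clean[key] !== dirty[key]
    ),
  };
}

// Hypothetical dumps from a passing first run and a failing second run.
const cleanDump = { theme: 'light' };
const dirtyDump = { theme: 'dark', onboardingDismissed: 'true' };
console.log(diffStorage(cleanDump, dirtyDump));
```

Here the diff reports `onboardingDismissed` as added and `theme` as changed, which gives you two concrete keys to test rather than a vague suspicion about "storage".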

A clean failure is easier to diagnose than a hidden one. Dump state before you clear it.

A practical debugging order that saves time

When a browser test becomes flaky, debug in this order.

1. Start with test isolation

Ask whether each test gets its own browser context, incognito profile, or containerized browser process. If not, you may be debugging shared state by design.

2. Check the app’s own cleanup hooks

Some apps register logout flows, cache resets, or client-side teardown actions. These are often incomplete. A logout might remove a token but leave IndexedDB intact, which means a later test still sees stale app state.

3. Clear storage selectively and rerun

Try clearing only one storage layer at a time:

  • local storage,
  • IndexedDB,
  • cookies,
  • cache storage,
  • service workers.

If clearing local storage fixes the issue, you have already narrowed the blast radius. If not, move on to IndexedDB or cache storage.
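
The layer-by-layer approach can be sketched as a small loop. The names here (`StorageLayer`, `findGuiltyLayer`, `rerunTest`) are hypothetical, and the sketch assumes you supply one clearing function per layer plus a way to rerun the failing check. Note that clearing is cumulative in this version: once a layer has been cleared it stays cleared for the later steps.

```typescript
// A layer is a name plus a function that clears that layer in the browser.
interface StorageLayer {
  name: string;
  clear: () => Promise<void>;
}

// Clears layers one by one, in order, and reruns the test after each step.
// Returns the name of the first layer whose removal makes the test pass,
// or null if the test still fails after every layer has been cleared.
async function findGuiltyLayer(
  layers: StorageLayer[],
  rerunTest: () => Promise<boolean>
): Promise<string | null> {
  for (const layer of layers) {
    await layer.clear();
    if (await rerunTest()) {
      return layer.name;
    }
  }
  return null;
}
```

With Playwright, for example, the local storage entry's `clear` could wrap `page.evaluate(() => localStorage.clear())`, and `rerunTest` could reload the page and repeat the failing assertion. If you need strictly independent experiments, re-create the dirty state before each step instead of relying on the cumulative order.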

4. Disable service workers when appropriate

Service workers can make browser tests tricky because they intercept requests and serve responses from their own caches. If your test is not specifically validating offline behavior, consider disabling or bypassing the service worker in test runs.
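
As one example, Playwright Test exposes a `serviceWorkers` option that can block registration for every context. The fragment below is a minimal sketch that assumes a Playwright version recent enough to support the option and a standard `playwright.config.ts`. Other runners need their own mechanism.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Prevent pages from registering service workers in this project.
    serviceWorkers: 'block',
  },
});
```

Keep a separate project or spec with the default behavior if some tests do validate offline support.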

5. Check for cross-spec contamination

A test can be individually correct and still fail when run after a different spec. That usually means suite-level setup and teardown are incomplete. Make sure each spec is not inheriting state from a previous one.

Local storage cleanup strategies

Local storage is the easiest thing to clear, but there are tradeoffs.

Clear everything for the origin

This is the blunt approach, and in many test environments it is perfectly reasonable.

typescript

await page.evaluate(() => localStorage.clear());

This works when your app stores only test-owned data in local storage. If your app shares the origin with other tools, admin apps, or embedded widgets, you may need a more precise strategy.

Remove only known keys

If the app stores data you want to preserve during a test, remove just the keys that matter.

typescript

await page.evaluate(() => {
  localStorage.removeItem('auth_token');
  localStorage.removeItem('onboardingDismissed');
});

This is more surgical, but it requires discipline. You need a stable list of keys and a good reason not to clear the rest.

Use per-test browser contexts

The best fix is often structural, not procedural. If each test gets a new browser context, local storage pollution is much less likely to matter. In Playwright, isolated contexts are the default pattern. In Selenium, you may need to manage profiles more carefully. In Cypress, test isolation features help, but you still need to understand what persists across specs versus tests.

IndexedDB cleanup strategies

IndexedDB pollution is more common in apps with offline support, persistent drafts, client-side sync, or rich caching. It is also more painful because the API is asynchronous and the contents are schema-driven.

Delete the database by name

If you know the database name, deleting it is usually the cleanest approach.

typescript

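await page.evaluate(() => new Promise<void>((resolve, reject) => {
  // deleteDatabase returns a request object, not a promise, so wrap it
  const request = indexedDB.deleteDatabase('app-db');
  request.onsuccess = () => resolve();
  request.onerror = () => reject(request.error);
}));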

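This is often enough if the app uses a single database and you can recreate it on the next load. Keep in mind that the deletion stays pending while any page still holds an open connection to the database, so run it before the app opens the database or after the app has closed it.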

Clear all known databases in test setup

Some apps use multiple databases. In that case, list and clear them in a setup step if the browser supports it.

typescript

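await page.evaluate(async () => {
  // indexedDB.databases() is not available in every browser version
  const databases = await indexedDB.databases();
  await Promise.all(
    databases
      .filter(db => db.name)
      .map(db => new Promise<void>((resolve, reject) => {
        // deleteDatabase returns a request object, so wrap it in a promise
        const request = indexedDB.deleteDatabase(db.name!);
        request.onsuccess = () => resolve();
        request.onerror = () => reject(request.error);
      }))
  );
});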

indexedDB.databases() is not implemented in every browser version (it is notably missing from older Firefox releases), so treat this as a debugging and controlled-test-environment tool, not a universal production pattern.

Watch for schema migration side effects

IndexedDB failures are often not caused by the data itself, but by versioned schema migrations. A run that leaves behind version 3 data might cause the next run to take a different migration path than a clean profile. If your tests cover app startup, you need to know whether the app handles old and fresh schemas consistently.
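
To see why the starting version matters, it helps to model which upgrade steps run for a given old version. The sketch below is not a real upgrade handler. The migration table and step descriptions are hypothetical placeholders. It relies only on the fact that IndexedDB reports an old version of 0 when the database does not exist yet.

```typescript
// Each entry upgrades the schema *to* the given version.
// The step descriptions are hypothetical placeholders for real upgrade logic.
const migrations: Record<number, string> = {
  1: 'create drafts store',
  2: 'add updatedAt index',
  3: 'create syncQueue store',
  4: 'rename drafts.title to drafts.name',
};

// Lists the steps an upgrade handler would run for a given starting version.
function pendingMigrations(oldVersion: number, newVersion: number): string[] {
  const steps: string[] = [];
  for (let version = oldVersion + 1; version <= newVersion; version++) {
    const step = migrations[version];
    if (step) steps.push(step);
  }
  return steps;
}

console.log(pendingMigrations(0, 4)); // clean profile: all four steps run
console.log(pendingMigrations(3, 4)); // profile left at version 3: only the last step runs
```

A clean profile and a reused profile exercise different code paths at startup, so a bug in a single step can show up in only one of them.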

Inspect records when a test fails

If a bug only shows up on repeat runs, log the presence of a specific key or record rather than clearing everything immediately. For example, if a draft record causes a page to preload with stale data, verify whether that record exists before the app renders.

Cache pollution in browser tests, and how to control it

Cache pollution is a broader term than local storage cleanup because it can involve multiple browser mechanisms.

HTTP cache

If the browser reuses cached JS bundles, CSS files, or API responses, test behavior can change depending on whether the cached asset is stale or fresh. You may see symptoms like:

  • version mismatch between HTML and JS bundle,
  • old code paths being exercised after a deploy,
  • API requests unexpectedly not appearing in the network log.

You can reduce this risk by:

  • launching browsers in a fresh context,
  • disabling cache in devtools-driven test setups,
  • using unique asset hashes on deploy,
  • ensuring cache-control headers are appropriate for the asset type.
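
As a rough illustration of the last point, the helper below picks a Cache-Control value by asset type. The path patterns (`/api/`, an eight-or-more-character hex hash in the file name) are assumptions about a typical build layout, and the right policy for your app may differ.

```typescript
// Picks a Cache-Control value by asset type.
// The path patterns are assumptions, not rules every app follows.
function cacheControlFor(path: string): string {
  // Content-hashed bundles (e.g. app.3f9a1c2b.js) can be cached for a long time,
  // because a new deploy produces a new file name.
  if (/\.[0-9a-f]{8,}\.(js|css)$/.test(path)) {
    return 'public, max-age=31536000, immutable';
  }
  // API responses that tests mutate should not be stored by the browser.
  if (path.startsWith('/api/')) {
    return 'no-store';
  }
  // HTML and everything else: may be cached, but must be revalidated before reuse.
  return 'no-cache';
}

console.log(cacheControlFor('/assets/app.3f9a1c2b.js'));
console.log(cacheControlFor('/api/projects'));
console.log(cacheControlFor('/index.html'));
```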

Service worker cache

Service workers can cache responses independently of the browser HTTP cache. If your app uses a service worker, the test may load data from the service worker even when the network would have returned something different. That is a common source of confusion in offline-capable apps.

For service worker-heavy applications, consider whether your automated browser should register the worker at all. If the test is about core UI behavior and not offline support, a service worker can introduce unnecessary noise.

Asset caching versus data caching

Static assets and runtime data are different. A stale JS bundle is a deployment concern. A stale API response in Cache Storage is a state management concern. When debugging, separate them. It is possible for one test to fail because the app code itself is old, while another fails because the app code is fine but the data is stale.

Tool-specific patterns that help

Playwright

Playwright makes isolation easier because browser contexts are cheap and can be created per test. If you are seeing persistent state issues, verify that you are not reusing a context or page across unrelated tests.

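import { test } from '@playwright/test';

test('opens dashboard cleanly', async ({ browser }) => {
  // A new context starts with its own empty cookies and storage
  const context = await browser.newContext();
  try {
    const page = await context.newPage();
    await page.goto('https://example.com');
  } finally {
    // Close the context even if the test body throws
    await context.close();
  }
});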

For debugging, also consider tracing a failing run so you can inspect network calls, console output, and storage-related behavior.

Selenium

With Selenium, persistence often depends on the browser profile and how your grid or local driver is configured. If you reuse a profile across tests, you may inherit browser state. If you need a clean session, make sure each test creates a new driver instance or a truly isolated profile.

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--incognito')
driver = webdriver.Chrome(options=options)
driver.get('https://example.com')
driver.quit()

Incognito is not a universal cure, but it can help confirm whether the failure is tied to persistent profile data.

Cypress

Cypress runs in a controlled browser session, but state can still leak across tests or specs if cleanup is incomplete. Use cy.clearLocalStorage() when appropriate, but do not rely on it blindly if IndexedDB or service workers are involved.

beforeEach(() => {
  cy.clearLocalStorage();
  cy.clearCookies();
});

If your app uses IndexedDB, you may need a custom task or application-specific reset path, because Cypress does not treat IndexedDB cleanup as a one-line universal fix.

A debugging checklist that usually finds the culprit

When you suspect browser test failures caused by cache and local storage, run through this list:

  • Does the failure happen only on the second or third run?
  • Does a clean browser profile make the test pass?
  • Do local storage keys differ between passing and failing runs?
  • Does clearing IndexedDB fix the issue?
  • Are service workers registered in the test environment?
  • Are you reusing browser contexts, tabs, or profiles across tests?
  • Is the app storing auth or feature-flag state on the client side?
  • Are you resetting backend data but not browser data?
  • Does the failing step rely on a page that might be served from cache?
  • Is there a migration path in IndexedDB that only fails when old records exist?

This list is intentionally boring. That is because the problem is usually boring. A test is not haunted. It is remembering.

When to clear state, and when not to

There is a temptation to fix every flaky browser test by clearing everything before each test. That can work, but it can also hide real dependencies and slow the suite down.

Clear aggressively when

  • the app is small and the suite is not performance-sensitive,
  • tests are integration-style and need deterministic starting conditions,
  • the app uses persistent client-side state heavily,
  • failures are expensive and hard to diagnose.

Clear selectively when

  • you need to preserve a large setup cost,
  • only one feature area uses browser persistence,
  • the app under test shares origin with multiple systems,
  • you want to preserve realistic user behavior between actions in a single scenario.

Prefer isolation over cleanup when possible

If the browser context is isolated per test, you solve the root problem rather than treating the symptoms. Cleanup is still useful, especially for destructive tests or mixed workflows, but it should not be your only strategy.

Designing tests to avoid state pollution in the first place

The best debugging guide is one you rarely need. Some design choices reduce browser state pollution from the start.

Keep test data and browser state separate

Use backend API setup to create server-side entities, but create browser-side state only when the test actually needs it. Do not reuse a logged-in browser session across unrelated tests unless the scenario depends on it.

Make state visible

If the app reads local storage keys, feature flags, or IndexedDB records, surface that in helper functions or setup logs. Hidden state is hard to debug.

Reset from the app, not just the test runner

Sometimes the most reliable cleanup is an app endpoint or UI action that clears client state the same way a real user or logout flow would. This can be more stable than poking at browser internals, especially when multiple storage layers are involved.

Version your browser-persistent data

If your app depends on local storage or IndexedDB, version the stored data schema and handle migrations carefully. This helps both production users and tests, because stale state becomes a known compatibility issue rather than a mystery failure.
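
One possible shape for this, sketched under assumptions, is a versioned envelope around every persisted value. The `Versioned` type, the `CURRENT_VERSION` constant, and `readVersioned` are hypothetical names. The sketch simply discards anything written by a different schema version, whereas a real app might migrate the data instead.

```typescript
// Hypothetical envelope for anything the app persists in local storage.
interface Versioned<T> {
  version: number;
  data: T;
}

const CURRENT_VERSION = 2; // assumed current schema version

// Parses a stored value and returns its data only if the version matches.
// Anything unreadable or written by another schema version is treated as absent,
// so stale state becomes an explicit "start fresh" instead of a silent mismatch.
function readVersioned<T>(raw: string | null): T | null {
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== 'object' || parsed === null) return null;
    const envelope = parsed as Partial<Versioned<T>>;
    if (envelope.version !== CURRENT_VERSION || envelope.data === undefined) {
      return null;
    }
    return envelope.data;
  } catch {
    return null; // corrupted or non-JSON value
  }
}

// A value left behind by an older build is ignored rather than trusted.
console.log(readVersioned('{"version":1,"data":{"theme":"dark"}}'));
```

In the app this would typically be called as `readVersioned(localStorage.getItem(key))`. In tests, the same helper makes it obvious when a leftover value from an earlier run or build is being rejected.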

A simple decision tree

If a browser test fails on repeat runs, ask:

  1. Does it pass in a new browser context? If yes, suspect pollution.
  2. Does clearing local storage fix it? If yes, focus on key management.
  3. Does deleting IndexedDB fix it? If yes, inspect offline data or migrations.
  4. Does disabling cache or service workers fix it? If yes, focus on asset and response caching.
  5. Does the problem disappear when cookies are cleared? If yes, investigate auth and session state.
  6. Does none of the above help? Then the bug may be in the app logic, not the browser state.
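
The same questions can be written down as a small function, which is handy if you record the outcome of each experiment in a ticket or a script. This is only a sketch. The `Experiments` fields and the returned strings are illustrative, and each flag is true when that experiment made the failure go away.

```typescript
// Outcome of each experiment: true means the failure went away.
interface Experiments {
  freshContext: boolean;
  clearedLocalStorage: boolean;
  deletedIndexedDB: boolean;
  disabledCacheOrServiceWorker: boolean;
  clearedCookies: boolean;
}

// Walks the questions in order and returns where to look next.
function nextFocus(results: Experiments): string {
  if (results.clearedLocalStorage) return 'local storage key management';
  if (results.deletedIndexedDB) return 'offline data or IndexedDB migrations';
  if (results.disabledCacheOrServiceWorker) return 'asset and response caching';
  if (results.clearedCookies) return 'auth and session state';
  // A fresh context helped, but no single layer has been isolated yet.
  if (results.freshContext) return 'state pollution in a layer not yet isolated';
  return 'app logic rather than browser state';
}

console.log(nextFocus({
  freshContext: true,
  clearedLocalStorage: false,
  deletedIndexedDB: true,
  disabledCacheOrServiceWorker: false,
  clearedCookies: false,
}));
```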

Closing thoughts

Browser state pollution is one of those problems that looks random until you inspect the layers. Cache, local storage, IndexedDB, cookies, and service workers all exist to make web apps faster and more useful for humans, but they make automated tests more complicated. That is not a reason to avoid them. It is a reason to be explicit about isolation, cleanup, and the browser profile you are testing against.

If your team keeps seeing IndexedDB test failures, cache pollution in browser tests, or repeat-run flakes that vanish after a manual refresh, the fix is usually not another sleep or a longer timeout. It is a cleaner boundary between test runs and browser memory.

For more background on the broader testing practices behind these issues, see software testing, test automation, and continuous integration. Continuous integration is especially relevant here, because repeated runs are exactly where browser state pollution tends to surface.