If your browser suite passes on main but starts failing on pull requests, the problem is usually not the browser test itself. It is the environment around it. GitHub Actions can make two runs look identical while quietly changing permissions, caches, secrets, base refs, workflow triggers, and even how much concurrency your tests receive.

That is why the same Playwright, Cypress, or Selenium checks can look stable on main and suddenly become the tests that make every PR red. The behavior is frustrating, but it is also useful. A failure that appears only on pull requests is usually telling you something about CI environment drift, not just application behavior.

If a browser test is flaky only on PRs, assume the workflow context changed before you assume the UI changed.

This guide walks through the hidden differences between pull request test failures and main-branch runs, how to isolate them, and how to make GitHub Actions browser tests more predictable over time.

Why PR runs and main-branch runs are not the same

In GitHub Actions, a workflow is not just code plus a runner. The event that triggered it matters: push, pull_request, pull_request_target, workflow_dispatch, and scheduled runs each receive a different payload, ref, and set of default permissions. GitHub's own GitHub Actions documentation covers the core platform.

For browser testing, the important part is that PRs often run in a more restricted context than direct pushes to main.

Typical differences include:

  • different event payloads and environment variables
  • read-only or reduced permissions
  • missing secrets or masked values
  • different checkout behavior for merge commits vs branch heads
  • caches that miss more often on PR branches
  • conditional logic that only runs on main
  • different concurrency pressure when many PRs are open
  • changes in dependency resolution due to lockfile drift or matrix differences

A test can fail on PRs simply because the workflow setup is different, even if the app code is unchanged.

Start by confirming the failure mode

Before changing the workflow, answer three questions:

  1. Does the same commit fail on PR but pass after merge?
  2. Does it fail only for external forks, or for every PR?
  3. Is the failure deterministic or flaky?

These questions tell you whether you are looking at permissions, event context, timing, or actual app behavior.

Failure only on external forks

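If the browser tests fail only when a PR comes from a fork, the first suspect is permissions or secrets. For pull_request runs triggered from a fork, GitHub does not pass repository secrets to the runner, and the GITHUB_TOKEN it does provide is read-only by default. Any step that reads a secret in that context sees an empty value.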

That means your test may be depending on something hidden, like:

  • an authenticated test user token
  • a private API endpoint
  • a feature flag secret
  • a cloud storage credential used to seed data

If those values are missing, the app may render differently, redirect, or fail to load test data.
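To confirm that a failing run really came from a fork, you can compare the head and base repositories in the event payload. The sketch below assumes you have already parsed the JSON file that GitHub writes to the path in GITHUB_EVENT_PATH. The interface lists only the fields the function reads, and isForkPullRequest is an illustrative name, not part of any library.

```typescript
// Only the payload fields this sketch reads. The real pull_request
// event payload contains many more.
interface PullRequestPayload {
  pull_request?: {
    head: { repo: { full_name: string } | null };
    base: { repo: { full_name: string } };
  };
}

// Returns true when the PR head lives in a different repository than the base.
function isForkPullRequest(payload: PullRequestPayload): boolean {
  const pr = payload.pull_request;
  if (!pr) return false; // not a pull request event

  // Assumption: a missing head repo (for example a deleted fork) is
  // treated as a fork, because secrets would not be trusted there either.
  if (pr.head.repo === null) return true;

  return pr.head.repo.full_name !== pr.base.repo.full_name;
}
```

Logging this value at the start of the job makes it obvious, when you read a red run later, whether missing secrets were expected.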

Failure on every PR, but not on main

If every PR is affected, look for workflow differences, cache behavior, parallelism, and branch-specific code paths.

Common examples:

  • the PR workflow installs dependencies from scratch while main reuses cache
  • main publishes a build artifact before tests run, but PRs test the source directly
  • the PR run tests GitHub's temporary merge commit while main tests the pushed commit, so the code under test differs
  • the suite runs with fewer workers on PRs because of a condition in YAML

Failure is flaky, not deterministic

If the same PR passes on retry, the issue is likely timing, race conditions, animation, network waiting, or resource contention.

That is where browser tests become vulnerable to CI environment drift. A local laptop is usually faster, warmer, and less noisy than a shared runner. GitHub-hosted runners can vary in load, and your suite may be racing the browser, the app, or the backend.
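The three questions above can be turned into a rough triage function over your run history. This is a heuristic sketch, not a diagnosis: RunRecord is a hypothetical shape that you would fill from your own CI data, and the labels are illustrative.

```typescript
// Hypothetical record for one run of the browser suite.
interface RunRecord {
  sha: string;                    // commit that was tested
  event: "pull_request" | "push"; // what triggered the run
  fromFork: boolean;              // true when the PR head lives in a fork
  passed: boolean;
}

type FailureMode = "no-failures" | "flaky" | "fork-only" | "all-prs" | "everywhere";

function classifyFailures(runs: RunRecord[]): FailureMode {
  const failed = runs.filter((r) => !r.passed);
  if (failed.length === 0) return "no-failures";

  // Deterministic or flaky: did the same commit, under the same trigger,
  // both pass and fail?
  const isFlaky = failed.some((f) =>
    runs.some(
      (r) =>
        r.passed &&
        r.sha === f.sha &&
        r.event === f.event &&
        r.fromFork === f.fromFork
    )
  );
  if (isFlaky) return "flaky";

  // Failures on push runs mean the problem is not specific to PRs.
  if (failed.some((r) => r.event === "push")) return "everywhere";

  // Only fork PRs, or every PR?
  return failed.every((r) => r.fromFork) ? "fork-only" : "all-prs";
}
```

A "fork-only" result points toward secrets and permissions, "all-prs" toward workflow differences, and "flaky" toward timing and shared state.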

Hidden differences to check first

Here are the most common reasons browser tests fail on pull requests but not on main.

1. Different checkout refs

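On PRs, the checked-out code depends on your workflow. With the pull_request event, GITHUB_SHA and the default actions/checkout behavior point at a temporary merge commit of the PR head into the base branch, not at the head commit of the PR branch. On main, it is the pushed commit itself.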

That matters if your app behavior changes based on commit ancestry, generated version files, or release logic.

A good practice is to print the commit being tested:

- name: Show commit context
  run: |
    echo "event=$GITHUB_EVENT_NAME"
    echo "ref=$GITHUB_REF"
    echo "sha=$GITHUB_SHA"
    git log --oneline -1

If your test suite assumes one type of checkout but receives another, it may be testing the wrong thing.
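If you would rather record the checkout type in your test output than read it from logs, the value of GITHUB_REF is enough to tell the cases apart. The sketch below relies on the documented ref formats (refs/pull/<number>/merge for pull_request runs, refs/heads/<branch> for branch pushes). describeRef and its return shape are illustrative names.

```typescript
type RefKind =
  | { kind: "pr-merge"; prNumber: number }
  | { kind: "branch"; name: string }
  | { kind: "tag"; name: string }
  | { kind: "unknown"; ref: string };

// Classify a GITHUB_REF value so the test report can state what was checked out.
function describeRef(ref: string): RefKind {
  const pr = /^refs\/pull\/(\d+)\/merge$/.exec(ref);
  if (pr) return { kind: "pr-merge", prNumber: Number(pr[1]) };

  const branch = /^refs\/heads\/(.+)$/.exec(ref);
  if (branch) return { kind: "branch", name: branch[1] };

  const tag = /^refs\/tags\/(.+)$/.exec(ref);
  if (tag) return { kind: "tag", name: tag[1] };

  return { kind: "unknown", ref };
}
```

A "pr-merge" result is a reminder that the suite ran against the PR combined with the current base branch, not against the branch on its own.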

2. Secrets are missing on PRs

A browser test can pass on main because the app has access to the right tokens, then fail on PRs because those secrets are absent or empty.

Watch for these patterns:

  • login flows silently skip setup and send the test to a generic page
  • feature flags fall back to off
  • API calls return 401 or 403
  • test users are not created, so selectors never appear

Do not rely on secrets being available. Instead, make your test environment explicit and fail fast if required variables are missing.

const required = ['E2E_BASE_URL', 'TEST_USER_EMAIL', 'TEST_USER_PASSWORD'];
for (const key of required) {
  if (!process.env[key]) throw new Error(`Missing ${key}`);
}

3. Cache hits differ between branches

GitHub Actions caches are scoped by branch. A PR run can restore caches created on its own ref, on the base branch, and on the default branch, but not caches created by other PRs or sibling branches. A PR that changes the lockfile also changes any cache key derived from it, so its first run starts cold. If your workflow depends on cached browser binaries, package manager downloads, or build outputs, PRs may end up slower or differently provisioned than main.

This is a common source of flaky browser tests on GitHub Actions, because timeouts are often tuned to a warm cache.

Things to inspect:

  • package manager cache keys
  • browser binary cache keys
  • build artifact reuse
  • dependency install mode, frozen lockfile vs permissive install
  • whether the test job depends on a previous build job

A cache miss usually does not cause a functional failure directly, but it can increase setup time enough to trip race conditions, timeouts, or startup checks.
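To reason about why a particular PR run started cold, it can help to model the cache scope rule explicitly: a run may restore entries saved on its own ref, on the base branch of the PR, or on the default branch. The sketch below is a simplified model of that rule only. It ignores restore-keys prefix matching and eviction, and every name in it is hypothetical.

```typescript
// Simplified model of a saved cache entry.
interface CacheEntry {
  key: string;          // e.g. a key derived from the lockfile hash
  createdOnRef: string; // ref of the run that saved the entry
}

interface RunContext {
  ref: string;        // e.g. "refs/pull/13/merge" or "refs/heads/main"
  baseRef?: string;   // set for pull requests, e.g. "refs/heads/main"
  defaultRef: string; // ref of the repository's default branch
}

// True when this run could restore the entry under an exact key match.
function canRestore(entry: CacheEntry, key: string, run: RunContext): boolean {
  if (entry.key !== key) return false; // a changed lockfile changes the key

  return (
    entry.createdOnRef === run.ref ||
    entry.createdOnRef === run.defaultRef ||
    (run.baseRef !== undefined && entry.createdOnRef === run.baseRef)
  );
}
```

Under this model, a cache saved by one PR never warms another PR, and a PR that touches the lockfile misses even the cache that main saved.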

4. Conditional logic only runs on main

A workflow often contains conditions like if: github.ref == 'refs/heads/main' or if: github.event_name == 'push'. That can make the main branch behave like a special case.

Examples:

  • main deploys a preview build, PRs do not
  • main runs a build step before tests, PRs use source files directly
  • main sets environment variables that PRs do not receive
  • main runs more browser workers because it assumes a stable branch

If the test environment differs between branches, your browser suite may not actually be testing the same app.

5. Parallelism and resource contention change

Browser tests are sensitive to CPU, memory, and timing. A suite that passes with 2 workers on main may fail on PRs if the workflow adjusts the matrix, runner type, or concurrency.

Common failure patterns:

  • test isolation issues where two tests fight over the same user account
  • shared temp directories or local storage collisions
  • app startup taking longer under load
  • screenshots or video collection slowing the runner just enough to miss a timeout

In test automation terms, concurrency can turn a harmless timing assumption into a test failure. That is part of why continuous integration works best when the environment is predictable and intentionally constrained.

6. PRs test the merge result, not the branch alone

Depending on your workflow, PRs may test the merge commit that GitHub creates, not just the branch head. That is useful, but it means the browser test can fail because the PR branch and main interact differently.

This is especially relevant when the frontend depends on shared contracts, such as:

  • API response shape
  • environment-specific config
  • CSS or component changes that combine badly with already merged work
  • feature flag combinations

Sometimes the PR is not broken on its own, but it breaks the future merge result. That is the kind of failure CI is supposed to catch.

A debugging workflow that saves time

When browser tests fail on pull requests, resist the urge to tweak timeouts first. Start by reducing the number of moving parts.

Step 1: Make the workflow print everything relevant

Add a short diagnostics step at the beginning of the job.

- name: Debug context
  run: |
    node -v
    npm -v
    echo "event=$GITHUB_EVENT_NAME"
    echo "ref=$GITHUB_REF"
    echo "sha=$GITHUB_SHA"
    env | sort | grep -E '^(CI|GITHUB|NODE|PLAYWRIGHT|CYPRESS|APP_)'

This gives you a direct comparison between PR and main runs.
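Comparing two env dumps by eye gets tedious, so a small diff helper can do it for you. The sketch below assumes you have copied the KEY=value lines from one main run and one PR run into two strings. parseEnvDump and diffEnv are illustrative names, and values that span multiple lines are not handled.

```typescript
type Env = { [name: string]: string | undefined };

interface EnvDiff {
  onlyInMain: string[];
  onlyInPr: string[];
  changed: string[];
}

// Parse the KEY=value lines printed by the diagnostics step.
function parseEnvDump(dump: string): Env {
  const env: Env = {};
  for (const line of dump.split(/\r?\n/)) {
    const eq = line.indexOf("=");
    if (eq > 0) env[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return env;
}

function has(env: Env, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(env, key);
}

// Report variables that exist on one side only, or whose values differ.
function diffEnv(main: Env, pr: Env): EnvDiff {
  const diff: EnvDiff = { onlyInMain: [], onlyInPr: [], changed: [] };
  for (const key of Object.keys(main).sort()) {
    if (!has(pr, key)) diff.onlyInMain.push(key);
    else if (main[key] !== pr[key]) diff.changed.push(key);
  }
  for (const key of Object.keys(pr).sort()) {
    if (!has(main, key)) diff.onlyInPr.push(key);
  }
  return diff;
}
```

Some differences are expected noise, such as GITHUB_SHA or GITHUB_EVENT_NAME. The entries worth a closer look are application variables that appear under onlyInMain.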

Step 2: Compare the workflow file itself

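A surprising number of CI bugs happen because PR and main runs are not executing the same version of the workflow file. A pull_request run uses the workflow file from the PR's merge commit, so edits the PR itself makes to the workflow take effect in that run, while a pull_request_target run uses the workflow file from the base branch. If your workflow is in the same repository, verify which revision defines the job. If a reusable workflow is used, confirm the version or ref it is pinned to.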

Step 3: Turn one flaky test into a focused reproduction

Take the failing browser test and isolate it.

  • run only the failing spec
  • run only the failing browser
  • disable parallel execution temporarily
  • remove retries until you understand the timing

If the failure disappears when isolated, you probably have a shared state issue, not a selector issue.

Step 4: Remove hidden dependencies

Ask whether the test depends on any of these:

  • seed data from another job
  • network access to a third-party service
  • feature flags or LaunchDarkly-style toggles
  • artifacts from a previous workflow step
  • local file paths that exist only in one job

The fewer external dependencies your browser test has, the less likely it is to fail only on PRs.

Example: GitHub Actions workflow with explicit diagnostics

This is a small pattern that helps catch PR-only differences early.

name: e2e
on:
  pull_request:
  push:
    branches: [main]

jobs:
  browser-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      # Record what triggered the run and which commit is under test.
      - name: Print context
        run: |
          echo "event=$GITHUB_EVENT_NAME"
          echo "ref=$GITHUB_REF"
          echo "sha=$GITHUB_SHA"

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - run: npm ci
      - run: npm run build
      # Custom scripts need "npm run"; "npm test:e2e" is not a valid command.
      - run: npm run test:e2e

If the PR job fails after npm ci but main does not, compare dependency install output first. If it fails after npm run build, compare build artifacts and environment variables. If it fails only during browser execution, focus on timing, network, and app state.

Browser test failure patterns that often hide the real issue

“Element not found” on PR only

This often means the page rendered differently because some state was missing. That state might be a cookie, session, mock response, or feature flag.

It can also happen when the app is simply slower on PR runs and the test is checking too soon. Prefer explicit waits for visible UI state over hard sleeps.

await page.getByRole('button', { name: 'Save' }).waitFor({ state: 'visible' });
await page.getByRole('button', { name: 'Save' }).click();

Timeout on navigation

Navigation failures on PRs often point to backend startup, proxy configuration, or environment mismatch.

Questions to ask:

  • Does the app base URL differ between branches?
  • Is the API mock server running in both jobs?
  • Is the browser pointed at the right port?
  • Is the container or service healthy before tests begin?

Assertion passes locally but not in CI

This is usually a timing, rendering, or font difference issue. It can also be a data issue if the PR branch uses a different fixture or setup path.

Do not immediately add retries. Retries can hide a real synchronization bug and make CI look healthier than it is.

How to reduce PR-only instability

Make the environment deterministic

Lock down the things your tests depend on:

  • exact Node.js version
  • exact browser versions if your tool allows it
  • pinned dependencies via lockfiles
  • explicit environment variables
  • known-good test data seeding

This is standard test automation hygiene, and it matters more in CI than locally.

Separate build failures from browser failures

If your app build and browser tests happen in one long job, a failed install or build can look like a browser issue. Split the workflow into jobs where it makes sense, and pass artifacts explicitly.

That makes debugging much easier:

  • install and build once
  • upload build artifact
  • run browser tests against that artifact

Limit shared mutable state

Do not let browser tests share accounts, inboxes, shopping carts, or backend entities unless the system is designed for it. PR runs are often more parallelized and more crowded, so shared state collisions show up there first.

Use unique test IDs, fresh users, and isolated fixtures.
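One way to get fresh users without coordination between jobs is to derive the identity from values that are already unique per run and per worker. The sketch below assumes your app accepts plus-addressed emails and that you pass in GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, and the worker index your test runner exposes. The function name and the example.test domain are placeholders.

```typescript
// Build an email address that is unique per workflow run, re-run attempt,
// and parallel worker, so two tests never share the same account.
function uniqueTestEmail(
  runId: string,       // value of GITHUB_RUN_ID
  runAttempt: string,  // value of GITHUB_RUN_ATTEMPT
  workerIndex: number, // index of the parallel worker
  label: string        // human-readable hint, e.g. the spec name
): string {
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything unsafe into dashes
    .replace(/^-+|-+$/g, "");    // trim leading and trailing dashes
  return `e2e+${runId}-${runAttempt}-w${workerIndex}-${slug}@example.test`;
}
```

Including the run attempt matters because a re-run of the same workflow keeps the same run ID, and a leftover account from the first attempt could otherwise collide with the second.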

Prefer explicit waits and stable selectors

PR runs often expose tests that are waiting on the wrong thing. Instead of waiting for arbitrary time, wait for a real condition.

Bad pattern:

await page.waitForTimeout(3000);

Better pattern:

await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();

Stable selectors reduce the chance that minor rendering differences on PR branches break a test.

Review concurrency settings

If PRs are slower, reduce worker count for browser suites or separate heavy specs into another job. If one flaky spec is causing churn, quarantine it temporarily and fix the root cause, rather than leaving retries to paper it over.

A quick checklist for pull request test failures

When a PR breaks browser tests, check these in order:

  1. Event type, pull_request vs push
  2. Checkout ref and commit SHA
  3. Missing secrets or env vars
  4. Cache hit or miss behavior
  5. Browser install and dependency versions
  6. Build artifact consistency
  7. Worker count and timeouts
  8. Shared state, test data, and feature flags
  9. Network dependencies, proxies, or mock services
  10. Whether the failure reproduces on a minimal isolated run

If you can answer those ten items, you usually know whether the issue is CI environment drift or a real app regression.

When the fix is in the workflow, not the test

A lot of teams assume they should only edit the browser test when it fails. Sometimes that is correct, but often the workflow is the real bug.

Workflow-level fixes include:

  • making PR and main jobs use the same install and build steps
  • using the same runner image and browser versions
  • exporting the same environment variables in both paths
  • keying caches on lockfile and OS rather than on branch names, so PRs can restore what main saved
  • uploading logs, traces, screenshots, and videos for every failure
  • failing fast when required secrets or config values are missing

That last point is important. A missing secret should fail explicitly during setup, not halfway through a browser flow where the symptom looks unrelated.
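A setup-time check is easier to trust when it reports every missing value at once and treats empty strings as missing, since a secret that is unavailable to a run shows up as an empty value, not as an unset variable. The sketch below takes the environment as a parameter so it can be unit-tested. The function names are illustrative.

```typescript
type EnvVars = { [name: string]: string | undefined };

// Return the names of required variables that are unset, empty, or blank.
function findMissingConfig(env: EnvVars, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throw one error that lists everything missing, before any browser starts.
function assertConfig(env: EnvVars, required: string[]): void {
  const missing = findMissingConfig(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing required config: ${missing.join(", ")}`);
  }
}
```

In a Node-based runner you would typically call assertConfig(process.env, [...]) from a global setup hook, so the job fails in seconds with a message that names the missing variables.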

What good observability looks like for CI browser tests

Good CI browser test output should answer three questions quickly:

  • what code ran
  • what environment ran it
  • what the browser saw when it failed

For GitHub Actions, that usually means collecting:

  • job logs
  • test runner traces
  • screenshots on failure
  • browser console output
  • network failures
  • artifact versions for the app under test

This is where browser automation becomes much more useful than raw pass/fail. The more evidence you collect, the easier it is to distinguish a broken workflow from a broken feature.
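As one concrete way to collect that evidence, here is a sketch that assumes Playwright as the runner. It returns a plain object whose values are ones Playwright's trace, screenshot, and video options accept, which you could spread into the use section of your config. The helper name and the isCI parameter are assumptions, and other runners have their own equivalents.

```typescript
// Subset of the values Playwright accepts for these three options.
interface ArtifactSettings {
  trace: "off" | "on" | "retain-on-failure";
  screenshot: "off" | "on" | "only-on-failure";
  video: "off" | "on" | "retain-on-failure";
}

// Keep local runs fast, but retain evidence for every failure in CI.
function artifactSettings(isCI: boolean): ArtifactSettings {
  if (!isCI) {
    return { trace: "off", screenshot: "off", video: "off" };
  }
  return {
    trace: "retain-on-failure",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  };
}
```

Recording video adds load to the runner, so if the suite is already timing-sensitive on PRs, traces and screenshots alone may be the safer starting point.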

Conclusion

When browser tests fail on pull requests but not on main, the root cause is usually hidden in the CI setup: permissions, secrets, caches, checkout behavior, parallelism, or branch-specific logic. The browser is just the place where the mismatch becomes visible.

The most reliable way to fix it is to make PR and main runs as similar as possible, then isolate the differences one by one. Print the context. Compare the jobs. Reduce shared state. Keep selectors stable. Use explicit waits. Fail fast when config is missing.

That approach takes more discipline than adding retries, but it gives you a better system. And once your GitHub Actions browser tests stop being branch-sensitive, your team spends less time arguing with CI and more time fixing actual product bugs.

If you want a deeper conceptual backdrop, the fundamentals of software testing, test automation, and continuous integration are worth revisiting with your workflow in mind.