Why GitHub Actions Browser Tests Fail Only on Pull Requests, Not Main Branch

If your browser suite passes on main but starts failing on pull requests, the problem is usually not the browser test itself. It is the environment around it. GitHub Actions can make two runs look identical while quietly changing permissions, caches, secrets, base refs, workflow triggers, and even how much concurrency your tests receive.

That is why the same Playwright, Cypress, or Selenium checks can look stable on main and suddenly become the tests that make every PR red. The behavior is frustrating, but it is also useful. A failure that appears only on pull requests is usually telling you something about CI environment drift, not just application behavior.

If a browser test is flaky only on PRs, assume the workflow context changed before you assume the UI changed.

This guide walks through the hidden differences between pull request test failures and main-branch runs, how to isolate them, and how to make GitHub Actions browser tests more predictable over time.

Why PR runs and main-branch runs are not the same

In GitHub Actions, a workflow is not just code plus a runner. The event that triggered it matters. push, pull_request, pull_request_target, workflow_dispatch, and scheduled runs can differ in what they check out, which secrets they receive, and what the job token is allowed to do. GitHub describes each of these events in the GitHub Actions documentation.

For browser testing, the important part is that PRs often run in a more restricted context than direct pushes to main.

Typical differences include:

different event payloads and environment variables
read-only or reduced permissions
missing secrets or masked values
different checkout behavior for merge commits vs branch heads
caches that miss more often on PR branches
conditional logic that only runs on main
different concurrency pressure when many PRs are open
changes in dependency resolution due to lockfile drift or matrix differences

A test can fail on PRs simply because the workflow setup is different, even if the app code is unchanged.

Start by confirming the failure mode

Before changing the workflow, answer three questions:

Does the same commit fail on PR but pass after merge?
Does it fail only for external forks, or for every PR?
Is the failure deterministic or flaky?

These questions tell you whether you are looking at permissions, event context, timing, or actual app behavior.

Failure only on external forks

If the browser tests fail only when a PR comes from a fork, the first suspect is permissions or secrets. For pull_request runs triggered from a fork, GitHub does not pass repository secrets to the runner (GITHUB_TOKEN is the exception), and that token is normally read-only.

That means your test may be depending on something hidden, like:

an authenticated test user token
a private API endpoint
a feature flag secret
a cloud storage credential used to seed data

If those values are missing, the app may render differently, redirect, or fail to load test data.
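
One way to make that dependency visible is to detect fork PRs up front, then skip or stub the steps that need secrets. The sketch below assumes the event payload has already been parsed from the JSON file GitHub writes at GITHUB_EVENT_PATH. The interface covers only the fields used here, and the function name is illustrative.

```typescript
// Minimal shape of the pull_request event fields this sketch reads.
interface PullRequestPayload {
  pull_request?: {
    head: { repo: { full_name: string } | null };
    base: { repo: { full_name: string } };
  };
}

// Returns true when the PR head lives in a different repository than the base.
function isForkPullRequest(payload: PullRequestPayload): boolean {
  const pr = payload.pull_request;
  if (!pr) return false; // push, schedule, workflow_dispatch, ...
  const headRepo = pr.head.repo;
  // Assumption: a missing head repository (for example a deleted fork)
  // is treated as a fork, which is the more restrictive choice.
  if (headRepo === null) return true;
  return headRepo.full_name !== pr.base.repo.full_name;
}
```

A setup script could call this once and print a clear "fork PR, secrets unavailable" message instead of letting a login flow fail later.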

Failure on every PR, but not on main

If every PR is affected, look for workflow differences, cache behavior, parallelism, and branch-specific code paths.

Common examples:

the PR workflow installs dependencies from scratch while main reuses cache
main publishes a build artifact before tests run, but PRs test the source directly
branch protection or merge commit behavior changes the tested code version
the suite runs with fewer workers on PRs because of a condition in YAML

Failure is flaky, not deterministic

If the same PR passes on retry, the issue is likely timing, race conditions, animation, network waiting, or resource contention.

That is where browser tests become vulnerable to CI environment drift. A local laptop is usually faster, warmer, and less noisy than a shared runner. GitHub-hosted runners can vary in load, and your suite may be racing the browser, the app, or the backend.

Hidden differences to check first

Here are the most common reasons browser tests fail on pull requests but not on main.

1. Different checkout refs

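On pull_request runs, GITHUB_SHA points at a temporary merge commit (refs/pull/<number>/merge) that combines the PR head with the base branch, and actions/checkout checks out that commit by default. Workflows that override the ref test the PR head instead. On main, the checked-out commit is simply the one that was pushed.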

That matters if your app behavior changes based on commit ancestry, generated version files, or release logic.

A good practice is to print the commit being tested:

- name: Show commit context
  run: |
    echo "event=$GITHUB_EVENT_NAME"
    echo "ref=$GITHUB_REF"
    echo "sha=$GITHUB_SHA"
    git log --oneline -1

If your test suite assumes one type of checkout but receives another, it may be testing the wrong thing.
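
If you want the test setup itself to know what kind of checkout it received, the value of GITHUB_REF can be classified with a small helper. This is a sketch: the type and function names are illustrative, and it only recognizes the common ref shapes.

```typescript
type RefKind =
  | { kind: "pr-merge"; prNumber: number }
  | { kind: "pr-head"; prNumber: number }
  | { kind: "branch"; name: string }
  | { kind: "tag"; name: string }
  | { kind: "other"; ref: string };

// Classifies a Git ref such as the value of GITHUB_REF.
function classifyRef(ref: string): RefKind {
  const pr = /^refs\/pull\/(\d+)\/(merge|head)$/.exec(ref);
  if (pr) {
    const prNumber = Number(pr[1]);
    return pr[2] === "merge"
      ? { kind: "pr-merge", prNumber }
      : { kind: "pr-head", prNumber };
  }
  if (ref.startsWith("refs/heads/")) {
    return { kind: "branch", name: ref.slice("refs/heads/".length) };
  }
  if (ref.startsWith("refs/tags/")) {
    return { kind: "tag", name: ref.slice("refs/tags/".length) };
  }
  return { kind: "other", ref };
}
```

Logging the result next to the commit SHA makes it obvious whether a run tested the merge result or a plain branch commit.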

2. Secrets are missing on PRs

A browser test can pass on main because the app has access to the right tokens, then fail on PRs because those secrets are absent or empty.

Watch for these patterns:

login flows silently skip setup and send the test to a generic page
feature flags fall back to off
API calls return 401 or 403
test users are not created, so selectors never appear

Do not rely on secrets being available. Instead, make your test environment explicit and fail fast if required variables are missing.

const required = ['E2E_BASE_URL', 'TEST_USER_EMAIL', 'TEST_USER_PASSWORD'];
for (const key of required) {
  if (!process.env[key]) throw new Error(`Missing ${key}`);
}

3. Cache hits differ between branches

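GitHub Actions caches are scoped by ref. A PR run can restore caches saved on its own ref, on its base branch, and on the default branch, but caches saved by one PR are not visible to other PRs or to main. In practice a PR still misses more often: a changed lockfile produces a new cache key, and the first run of a new PR has no cache of its own. If your workflow depends on cached browser binaries, package managers, or build outputs, PRs may end up slower or differently provisioned than main.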

This is a common source of GitHub Actions flaky tests in browser suites, because timeouts are often tuned to a warm cache.

Things to inspect:

package manager cache keys
browser binary cache keys
build artifact reuse
dependency install mode, frozen lockfile vs permissive install
whether the test job depends on a previous build job

A cache miss usually does not cause a functional failure directly, but it can increase setup time enough to trip race conditions, timeouts, or startup checks.
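
To reason about why a particular run missed, it can help to write down which refs the run was allowed to restore from. The helper below is a simplified model of GitHub's documented cache scoping (current ref, base branch for PRs, default branch), not an API call. The names are illustrative, and the default branch name is something you pass in.

```typescript
interface RunRefs {
  ref: string;           // value of GITHUB_REF
  baseRef?: string;      // value of GITHUB_BASE_REF on PR runs, e.g. "main"
  defaultBranch: string; // the repository's default branch name
}

// Lists the refs whose caches this run may restore, most specific first.
function restorableCacheScopes(run: RunRefs): string[] {
  const scopes = [run.ref];
  if (run.baseRef) scopes.push(`refs/heads/${run.baseRef}`);
  scopes.push(`refs/heads/${run.defaultBranch}`);
  // Drop duplicates, e.g. when the base branch is also the default branch.
  return scopes.filter((scope, index) => scopes.indexOf(scope) === index);
}
```

If the key you expect was only ever saved on a ref outside this list, such as another PR, the miss is expected rather than a bug.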

4. Conditional logic only runs on `main`

A workflow often contains conditions like if: github.ref == 'refs/heads/main' or if: github.event_name == 'push'. That can make the main branch behave like a special case.

Examples:

main deploys a preview build, PRs do not
main runs a build step before tests, PRs use source files directly
main sets environment variables that PRs do not receive
main runs more browser workers because it assumes a stable branch

If the test environment differs between branches, your browser suite may not actually be testing the same app.

5. Parallelism and resource contention change

Browser tests are sensitive to CPU, memory, and timing. A suite that passes with 2 workers on main may fail on PRs if the workflow adjusts the matrix, runner type, or concurrency.

Common failure patterns:

test isolation issues where two tests fight over the same user account
shared temp directories or local storage collisions
app startup taking longer under load
screenshots or video collection slowing the runner just enough to miss a timeout

In test automation terms, concurrency can turn a harmless timing assumption into a test failure. That is part of why continuous integration works best when the environment is predictable and intentionally constrained.

6. PRs test the merge result, not the branch alone

Depending on your workflow, PRs may test the merge commit that GitHub creates, not just the branch head. That is useful, but it means the browser test can fail because the PR branch and main interact differently.

This is especially relevant when the frontend depends on shared contracts, such as:

API response shape
environment-specific config
CSS or component changes that combine badly with already merged work
feature flag combinations

Sometimes the PR is not broken on its own, but it breaks the future merge result. That is the kind of failure CI is supposed to catch.

A debugging workflow that saves time

When browser tests fail on pull requests, resist the urge to tweak timeouts first. Start by reducing the number of moving parts.

Step 1: Make the workflow print everything relevant

Add a short diagnostics step at the beginning of the job.

- name: Debug context
  run: |
    node -v
    npm -v
    echo "event=$GITHUB_EVENT_NAME"
    echo "ref=$GITHUB_REF"
    echo "sha=$GITHUB_SHA"
    env | sort | grep -E '^(CI|GITHUB|NODE|PLAYWRIGHT|CYPRESS|APP_)'

This gives you a direct comparison between PR and main runs.
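
Once you have the printed context from one PR run and one main run, the comparison can be automated. A minimal sketch, assuming each snapshot has been parsed into a plain key/value object (the type and function names are illustrative):

```typescript
type EnvSnapshot = Record<string, string | undefined>;

interface EnvDifference {
  key: string;
  pr: string | undefined;
  main: string | undefined;
}

// Returns every key whose value differs between the two snapshots,
// including keys that exist on only one side.
function diffContext(pr: EnvSnapshot, main: EnvSnapshot): EnvDifference[] {
  const keys = Object.keys(pr).concat(Object.keys(main));
  const uniqueKeys = keys.filter((key, index) => keys.indexOf(key) === index).sort();
  return uniqueKeys
    .filter((key) => pr[key] !== main[key])
    .map((key) => ({ key, pr: pr[key], main: main[key] }));
}
```

Keep secret values out of the snapshots. Comparing whether a variable is set is usually enough to spot the drift.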

Step 2: Compare the workflow file itself

A surprising number of CI bugs happen because the workflow file itself differs between the two runs. A pull_request run uses the workflow definition from the PR's merge commit, so a PR can change the workflow it runs under, while a pull_request_target run uses the definition from the base branch. Verify which revision defines the job. If a reusable workflow is used, confirm the version or ref it is pinned to.

Step 3: Turn one flaky test into a focused reproduction

Take the failing browser test and isolate it.

run only the failing spec
run only the failing browser
disable parallel execution temporarily
remove retries until you understand the timing

If the failure disappears when isolated, you probably have a shared state issue, not a selector issue.

Step 4: Remove hidden dependencies

Ask whether the test depends on any of these:

seed data from another job
network access to a third-party service
feature flags or LaunchDarkly-style toggles
artifacts from a previous workflow step
local file paths that exist only in one job

The fewer external dependencies your browser test has, the less likely it is to fail only on PRs.

Example: GitHub Actions workflow with explicit diagnostics

This is a small pattern that helps catch PR-only differences early.

name: e2e
on:
  pull_request:
  push:
    branches: [main]

jobs:
  browser-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - name: Print context
        run: |
          echo "event=$GITHUB_EVENT_NAME"
          echo "ref=$GITHUB_REF"
          echo "sha=$GITHUB_SHA"

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - run: npm ci
      - run: npm run build
      - run: npm run test:e2e

If the PR job fails after npm ci but main does not, compare dependency install output first. If it fails after npm run build, compare build artifacts and environment variables. If it fails only during browser execution, focus on timing, network, and app state.

Browser test failure patterns that often hide the real issue

“Element not found” on PR only

This often means the page rendered differently because some state was missing. That state might be a cookie, session, mock response, or feature flag.

It can also happen when the app is simply slower on PR runs and the test is checking too soon. Prefer explicit waits for visible UI state over hard sleeps.

await page.getByRole('button', { name: 'Save' }).waitFor({ state: 'visible' });
await page.getByRole('button', { name: 'Save' }).click();

Timeout on navigation

Navigation failures on PRs often point to backend startup, proxy configuration, or environment mismatch.

Questions to ask:

Does the app base URL differ between branches?
Is the API mock server running in both jobs?
Is the browser pointed at the right port?
Is the container or service healthy before tests begin?
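
The last question can be answered in code by polling for readiness before the first test starts, instead of assuming the app is up. The helper below is a sketch: `check` is whatever probe you supply (for example a request to a health endpoint, which is an assumption about your app), and the function name and default interval are illustrative.

```typescript
// Polls `check` until it resolves to true or the timeout elapses.
// Resolves to true when ready, false when the deadline would be exceeded.
async function waitUntilReady(
  check: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs: number = 500,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    try {
      if (await check()) return true;
    } catch {
      // A thrown error (connection refused, DNS failure) means "not ready yet".
    }
    // Stop if another sleep would take us past the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Failing the job when this returns false gives a clear "app never became ready" error rather than a navigation timeout inside an unrelated spec.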

Assertion passes locally but not in CI

This is usually a timing, rendering, or font difference issue. It can also be a data issue if the PR branch uses a different fixture or setup path.

Do not immediately add retries. Retries can hide a real synchronization bug and make CI look healthier than it is.
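
If retries stay enabled while you investigate, it helps to report a pass-after-retry as its own outcome instead of folding it into green. A minimal sketch, assuming your runner can give you the outcome of each attempt for a test (the type and function names are illustrative):

```typescript
type Outcome = "passed" | "failed";
type Verdict = "stable-pass" | "flaky" | "consistent-fail";

// Classifies a test from the outcomes of all its attempts in one run.
function classifyAttempts(attempts: Outcome[]): Verdict {
  if (attempts.length === 0) throw new Error("No attempts recorded");
  const passedCount = attempts.filter((outcome) => outcome === "passed").length;
  if (passedCount === attempts.length) return "stable-pass";
  if (passedCount === 0) return "consistent-fail";
  return "flaky"; // mixed results: a synchronization bug is likely hiding here
}
```

Tracking the "flaky" verdict per test over time shows which specs only pass because of retries.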

How to reduce PR-only instability

Make the environment deterministic

Lock down the things your tests depend on:

exact Node.js version
exact browser versions if your tool allows it
pinned dependencies via lockfiles
explicit environment variables
known-good test data seeding

This is standard test automation hygiene, and it matters more in CI than locally.

Separate build failures from browser failures

If your app build and browser tests happen in one long job, a failed install or build can look like a browser issue. Split the workflow into jobs where it makes sense, and pass artifacts explicitly.

That makes debugging much easier:

install and build once
upload build artifact
run browser tests against that artifact

Limit shared mutable state

Do not let browser tests share accounts, inboxes, shopping carts, or backend entities unless the system is designed for it. PR runs are often more parallelized and more crowded, so shared state collisions show up there first.

Use unique test IDs, fresh users, and isolated fixtures.
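
One simple way to get fresh users is to derive the account name from values that are unique per run and per worker. The sketch below assumes you pass in GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, and your test runner's worker index. The email format, the default domain, and the function name are all illustrative.

```typescript
interface RunIdentity {
  runId: string;       // e.g. the value of GITHUB_RUN_ID
  runAttempt: string;  // e.g. the value of GITHUB_RUN_ATTEMPT
  workerIndex: number; // index of the parallel worker, supplied by your runner
}

// Builds an email address that is unique per run, attempt, and worker,
// so parallel tests never log in as the same user.
function uniqueTestEmail(id: RunIdentity, domain: string = "example.test"): string {
  return `e2e+run${id.runId}-a${id.runAttempt}-w${id.workerIndex}@${domain}`;
}
```

Two PRs running at the same time, or two workers in the same job, then operate on different accounts without any coordination.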

Prefer explicit waits and stable selectors

PR runs often expose tests that are waiting on the wrong thing. Instead of waiting for arbitrary time, wait for a real condition.

Bad pattern:

await page.waitForTimeout(3000);

Better pattern:

await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();

Stable selectors reduce the chance that minor rendering differences on PR branches break a test.

Review concurrency settings

If PRs are slower, reduce worker count for browser suites or separate heavy specs into another job. If one flaky spec is causing churn, quarantine it temporarily and fix the root cause, rather than leaving retries to paper it over.
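
Whatever worker count you choose, computing it in one place keeps PR and main runs on the same rule. The helper below is one possible heuristic, not a recommendation from any test runner: in CI it uses half of the available cores and caps the result. The function name, the halving rule, and the default cap are assumptions.

```typescript
// Picks a worker count from the number of CPU cores.
// In CI, leave half the cores for the app, the browser, and the backend.
function pickWorkerCount(cpuCount: number, isCI: boolean, maxWorkers: number = 4): number {
  if (!isFinite(cpuCount) || cpuCount < 1) return 1;
  const budget = isCI ? Math.floor(cpuCount / 2) : cpuCount - 1;
  return Math.min(maxWorkers, Math.max(1, budget));
}
```

The result can be fed into your runner's worker setting so that the value depends only on the machine, never on which branch triggered the run.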

A quick checklist for pull request test failures

When a PR breaks browser tests, check these in order:

Event type, pull_request vs push
Checkout ref and commit SHA
Missing secrets or env vars
Cache hit or miss behavior
Browser install and dependency versions
Build artifact consistency
Worker count and timeouts
Shared state, test data, and feature flags
Network dependencies, proxies, or mock services
Whether the failure reproduces on a minimal isolated run

If you can answer those ten items, you usually know whether the issue is CI environment drift or a real app regression.

When the fix is in the workflow, not the test

A lot of teams assume they should only edit the browser test when it fails. Sometimes that is correct, but often the workflow is the real bug.

Workflow-level fixes include:

making PR and main jobs use the same install and build steps
using the same runner image and browser versions
exporting the same environment variables in both paths
keeping caches warm on the default branch so PR runs can restore them
uploading logs, traces, screenshots, and videos for every failure
failing fast when required secrets or config values are missing

That last point is important. A missing secret should fail explicitly during setup, not halfway through a browser flow where the symptom looks unrelated.

What good observability looks like for CI browser tests

Good CI browser test output should answer three questions quickly:

what code ran
what environment ran it
what the browser saw when it failed

For GitHub Actions, that usually means collecting:

job logs
test runner traces
screenshots on failure
browser console output
network failures
artifact versions for the app under test

This is where browser automation becomes much more useful than raw pass/fail. The more evidence you collect, the easier it is to distinguish a broken workflow from a broken feature.

Conclusion

When browser tests fail on pull requests but not on main, the root cause is usually hidden in the CI setup: permissions, secrets, caches, checkout behavior, parallelism, or branch-specific logic. The browser is just the place where the mismatch becomes visible.

The most reliable way to fix it is to make PR and main runs as similar as possible, then isolate the differences one by one. Print the context. Compare the jobs. Reduce shared state. Keep selectors stable. Use explicit waits. Fail fast when config is missing.

That approach takes more discipline than adding retries, but it gives you a better system. And once your GitHub Actions browser tests stop being branch-sensitive, your team spends less time arguing with CI and more time fixing actual product bugs.

If you want a deeper conceptual backdrop, the fundamentals of software testing, test automation, and continuous integration are worth revisiting with your workflow in mind.
