June 3, 2026
Why GitHub Actions Browser Tests Fail Only on Pull Requests, Not Main Branch
Debug why browser tests fail on pull requests in GitHub Actions but pass on main, including cache differences, permissions, env vars, parallelism, and CI drift.
If your browser suite passes on main but starts failing on pull requests, the problem is usually not the browser test itself. It is the environment around it. GitHub Actions can make two runs look identical while quietly changing permissions, caches, secrets, base refs, workflow triggers, and even how much concurrency your tests receive.
That is why the same Playwright, Cypress, or Selenium checks can look stable on main and suddenly become the tests that make every PR red. The behavior is frustrating, but it is also useful. A failure that appears only on pull requests is usually telling you something about CI environment drift, not just application behavior.
If a browser test is flaky only on PRs, assume the workflow context changed before you assume the UI changed.
This guide walks through the hidden differences between pull request test failures and main-branch runs, how to isolate them, and how to make GitHub Actions browser tests more predictable over time.
Why PR runs and main-branch runs are not the same
In GitHub Actions, a workflow is not just code plus a runner. The event that triggered it matters. push, pull_request, pull_request_target, workflow_dispatch, and scheduled runs all behave a little differently, and GitHub's own Actions documentation describes each trigger in detail.
For browser testing, the important part is that PRs often run in a more restricted context than direct pushes to main.
Typical differences include:
- different event payloads and environment variables
- read-only or reduced permissions
- missing secrets or masked values
- different checkout behavior for merge commits vs branch heads
- caches that miss more often on PR branches
- conditional logic that only runs on main
- different concurrency pressure when many PRs are open
- changes in dependency resolution due to lockfile drift or matrix differences
A test can fail on PRs simply because the workflow setup is different, even if the app code is unchanged.
Start by confirming the failure mode
Before changing the workflow, answer three questions:
- Does the same commit fail on PR but pass after merge?
- Does it fail only for external forks, or for every PR?
- Is the failure deterministic or flaky?
These questions tell you whether you are looking at permissions, event context, timing, or actual app behavior.
Failure only on external forks
If the browser tests fail only when a PR comes from a fork, the first suspect is permissions or secrets. Forked PRs do not get the same access to repository secrets by default, and they often run in a stricter sandbox.
That means your test may be depending on something hidden, like:
- an authenticated test user token
- a private API endpoint
- a feature flag secret
- a cloud storage credential used to seed data
If those values are missing, the app may render differently, redirect, or fail to load test data.
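One way to confirm the fork hypothesis is to check the event payload directly. The sketch below assumes the standard pull_request webhook payload, where the head and base repositories each carry a full_name; the interface is trimmed to the fields used here, and the repository names in any usage are hypothetical.

```typescript
// Minimal shape of the pull_request event payload fields used below.
interface PullRequestEvent {
  pull_request?: {
    head: { repo: { full_name: string } | null };
    base: { repo: { full_name: string } };
  };
}

// Returns true when the PR head lives in a different repository than the base.
// A null head repo (for example, a deleted fork) is treated as a fork.
function isForkPullRequest(event: PullRequestEvent): boolean {
  const pr = event.pull_request;
  if (!pr) return false; // not a pull_request event
  const head = pr.head.repo;
  return head === null || head.full_name !== pr.base.repo.full_name;
}
```

On a runner, the payload is the JSON file whose path is in GITHUB_EVENT_PATH, so a setup script could parse that file and log the result before any browser starts.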
Failure on every PR, but not on main
If every PR is affected, look for workflow differences, cache behavior, parallelism, and branch-specific code paths.
Common examples:
- the PR workflow installs dependencies from scratch while main reuses cache
- main publishes a build artifact before tests run, but PRs test the source directly
- branch protection or merge commit behavior changes the tested code version
- the suite runs with fewer workers on PRs because of a condition in YAML
Failure is flaky, not deterministic
If the same PR passes on retry, the issue is likely timing, race conditions, animation, network waiting, or resource contention.
That is where browser tests become vulnerable to CI environment drift. A local laptop is usually faster, warmer, and less noisy than a shared runner. GitHub-hosted runners can vary in load, and your suite may be racing the browser, the app, or the backend.
Hidden differences to check first
Here are the most common reasons browser tests fail on pull requests but not on main.
1. Different checkout refs
On pull_request events, actions/checkout checks out GitHub's temporary merge commit (refs/pull/<number>/merge) by default, unless the workflow passes an explicit ref such as the PR head SHA. On main, it is the pushed commit on the branch itself.
That matters if your app behavior changes based on commit ancestry, generated version files, or release logic.
A good practice is to print the commit being tested:
- name: Show commit context
  run: |
    echo "event=$GITHUB_EVENT_NAME"
    echo "ref=$GITHUB_REF"
    echo "sha=$GITHUB_SHA"
    git log --oneline -1
If your test suite assumes one type of checkout but receives another, it may be testing the wrong thing.
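The printed ref can also be classified in code, which is handy when a test helper needs to know what kind of checkout it is running against. This is a minimal sketch that assumes the usual GITHUB_REF formats (refs/pull/<number>/merge for pull_request events, refs/heads/<branch> for pushes); classifyRef is a hypothetical helper name.

```typescript
type CheckoutKind = "pr-merge" | "branch" | "tag" | "other";

// Classifies a GITHUB_REF value so logs and helpers can tell a PR merge ref
// apart from a plain branch or tag ref.
function classifyRef(ref: string): { kind: CheckoutKind; name: string } {
  const merge = /^refs\/pull\/(\d+)\/merge$/.exec(ref);
  if (merge) return { kind: "pr-merge", name: merge[1] };
  if (ref.startsWith("refs/heads/")) {
    return { kind: "branch", name: ref.slice("refs/heads/".length) };
  }
  if (ref.startsWith("refs/tags/")) {
    return { kind: "tag", name: ref.slice("refs/tags/".length) };
  }
  return { kind: "other", name: ref };
}
```

A "pr-merge" result means the tested code already includes the current base branch, so a failure may come from the combination rather than from the PR branch alone.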
2. Secrets are missing on PRs
A browser test can pass on main because the app has access to the right tokens, then fail on PRs because those secrets are absent or empty.
Watch for these patterns:
- login flows silently skip setup and send the test to a generic page
- feature flags fall back to off
- API calls return 401 or 403
- test users are not created, so selectors never appear
Do not rely on secrets being available. Instead, make your test environment explicit and fail fast if required variables are missing.
const required = ['E2E_BASE_URL', 'TEST_USER_EMAIL', 'TEST_USER_PASSWORD'];
for (const key of required) {
  if (!process.env[key]) throw new Error(`Missing ${key}`);
}
3. Cache hits differ between branches
GitHub Actions caches are scoped by ref. A PR run can restore caches created on its own ref or on the base and default branches, but it cannot see caches from sibling PR branches, and any change to a key input such as the lockfile hash produces a new key that misses on its first run. If your workflow depends on cached browser binaries, package managers, or build outputs, PRs may end up slower or differently provisioned than main.
This is a common source of GitHub Actions flaky tests in browser suites, because timeouts are often tuned to a warm cache.
Things to inspect:
- package manager cache keys
- browser binary cache keys
- build artifact reuse
- dependency install mode, frozen lockfile vs permissive install
- whether the test job depends on a previous build job
A cache miss usually does not cause a functional failure directly, but it can increase setup time enough to trip race conditions, timeouts, or startup checks.
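To reason about why a PR misses where main hits, it can help to model the lookup. The sketch below is a simplified approximation of exact-key-then-prefix matching; it ignores ref scoping and recency, and the key names in the usage are hypothetical.

```typescript
type CacheResult = { hit: "exact" | "partial" | "miss"; matched?: string };

// Simplified model of a cache lookup: try the exact key first, then each
// restore key as a prefix, in order. Real lookups also consider ref scope
// and which matching entry is most recent.
function resolveCache(
  key: string,
  restoreKeys: readonly string[],
  available: readonly string[],
): CacheResult {
  if (available.includes(key)) return { hit: "exact", matched: key };
  for (const prefix of restoreKeys) {
    const matched = available.find((candidate) => candidate.startsWith(prefix));
    if (matched) return { hit: "partial", matched };
  }
  return { hit: "miss" };
}
```

With only "linux-npm-abc123" available, a PR that changes the lockfile and asks for "linux-npm-def456" gets a partial hit if "linux-npm-" is listed as a restore key, and a full miss otherwise. Either outcome means a slower install than the warm run on main.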
4. Conditional logic only runs on main
A workflow often contains conditions like if: github.ref == 'refs/heads/main' or if: github.event_name == 'push'. That can make the main branch behave like a special case.
Examples:
- main deploys a preview build, PRs do not
- main runs a build step before tests, PRs use source files directly
- main sets environment variables that PRs do not receive
- main runs more browser workers because it assumes a stable branch
If the test environment differs between branches, your browser suite may not actually be testing the same app.
5. Parallelism and resource contention change
Browser tests are sensitive to CPU, memory, and timing. A suite that passes with 2 workers on main may fail on PRs if the workflow adjusts the matrix, runner type, or concurrency.
Common failure patterns:
- test isolation issues where two tests fight over the same user account
- shared temp directories or local storage collisions
- app startup taking longer under load
- screenshots or video collection slowing the runner just enough to miss a timeout
In test automation terms, concurrency can turn a harmless timing assumption into a test failure. That is part of why continuous integration works best when the environment is predictable and intentionally constrained.
6. PRs test the merge result, not the branch alone
Depending on your workflow, PRs may test the merge commit that GitHub creates, not just the branch head. That is useful, but it means the browser test can fail because the PR branch and main interact differently.
This is especially relevant when the frontend depends on shared contracts, such as:
- API response shape
- environment-specific config
- CSS or component changes that combine badly with already merged work
- feature flag combinations
Sometimes the PR is not broken on its own, but it breaks the future merge result. That is the kind of failure CI is supposed to catch.
A debugging workflow that saves time
When browser tests fail on pull requests, resist the urge to tweak timeouts first. Start by reducing the number of moving parts.
Step 1: Make the workflow print everything relevant
Add a short diagnostics step at the beginning of the job.
- name: Debug context
  run: |
    node -v
    npm -v
    echo "event=$GITHUB_EVENT_NAME"
    echo "ref=$GITHUB_REF"
    echo "sha=$GITHUB_SHA"
    env | sort | grep -E '^(CI|GITHUB|NODE|PLAYWRIGHT|CYPRESS|APP_)'
This gives you a direct comparison between PR and main runs.
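Once both runs have printed their context, the comparison itself can be automated. This is a small sketch, assuming each run's variables have been parsed into a plain object; diffContext is a hypothetical helper, and the variable names in the usage are illustrative.

```typescript
type Context = Record<string, string | undefined>;

// Lists every variable whose value differs between a PR run and a main run,
// including variables that exist in only one of them.
function diffContext(pr: Context, main: Context): string[] {
  const keys = Array.from(new Set(Object.keys(pr).concat(Object.keys(main))));
  return keys
    .filter((key) => pr[key] !== main[key])
    .sort()
    .map((key) => `${key}: pr=${pr[key] ?? "<unset>"} main=${main[key] ?? "<unset>"}`);
}
```

A variable that shows up as unset on the PR side is often the fastest lead, since it usually points at a secret or a branch-only condition. Compare presence rather than values for anything sensitive.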
Step 2: Compare the workflow file itself
A surprising number of CI bugs happen because the workflow changed on main, but the PR run is using an older version of the workflow file, or vice versa. If your workflow is in the same repository, verify which revision defines the job. If a reusable workflow is used, confirm the version or ref.
Step 3: Turn one flaky test into a focused reproduction
Take the failing browser test and isolate it.
- run only the failing spec
- run only the failing browser
- disable parallel execution temporarily
- remove retries until you understand the timing
If the failure disappears when isolated, you probably have a shared state issue, not a selector issue.
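For a Playwright suite, the isolation steps above map onto a handful of CLI flags. The sketch below builds the argument list, assuming the Playwright test runner is in use; the spec path and project name in the usage are hypothetical, and other runners have their own equivalents.

```typescript
interface IsolationOptions {
  spec: string; // path to the failing spec file
  project?: string; // Playwright project name, typically a single browser
}

// Builds arguments for a focused run: one spec, one worker, no retries,
// and optionally a single browser project.
function isolatedPlaywrightArgs({ spec, project }: IsolationOptions): string[] {
  const args = ["playwright", "test", spec, "--workers=1", "--retries=0"];
  if (project) args.push(`--project=${project}`);
  return args;
}
```

Joining the result with spaces and prefixing npx gives a command that can run both locally and in a temporary debug job, so the same reproduction is used in both places.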
Step 4: Remove hidden dependencies
Ask whether the test depends on any of these:
- seed data from another job
- network access to a third-party service
- feature flags or LaunchDarkly-style toggles
- artifacts from a previous workflow step
- local file paths that exist only in one job
The fewer external dependencies your browser test has, the less likely it is to fail only on PRs.
Example: GitHub Actions workflow with explicit diagnostics
This is a small pattern that helps catch PR-only differences early.
name: e2e
on:
  pull_request:
  push:
    branches: [main]
jobs:
  browser-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - name: Print context
        run: |
          echo "event=$GITHUB_EVENT_NAME"
          echo "ref=$GITHUB_REF"
          echo "sha=$GITHUB_SHA"
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npm run build
      # Assumes package.json defines a "test:e2e" script
      - run: npm run test:e2e
If the PR job fails after npm ci but main does not, compare dependency install output first. If it fails after npm run build, compare build artifacts and environment variables. If it fails only during browser execution, focus on timing, network, and app state.
Browser test failure patterns that often hide the real issue
“Element not found” on PR only
This often means the page rendered differently because some state was missing. That state might be a cookie, session, mock response, or feature flag.
It can also happen when the app is simply slower on PR runs and the test is checking too soon. Prefer explicit waits for visible UI state over hard sleeps.
typescript
await page.getByRole('button', { name: 'Save' }).waitFor({ state: 'visible' });
await page.getByRole('button', { name: 'Save' }).click();
Timeout on navigation
Navigation failures on PRs often point to backend startup, proxy configuration, or environment mismatch.
Questions to ask:
- Does the app base URL differ between branches?
- Is the API mock server running in both jobs?
- Is the browser pointed at the right port?
- Is the container or service healthy before tests begin?
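The last question can be enforced rather than assumed. Below is a minimal polling sketch that waits for a readiness probe before any test starts; waitUntilReady is a hypothetical helper, and the probe is whatever check suits your app.

```typescript
// Polls a readiness probe until it reports true, then returns the attempt
// number that succeeded. Throws once the attempts are exhausted.
async function waitUntilReady(
  probe: () => Promise<boolean>,
  { attempts = 30, delayMs = 1000 }: { attempts?: number; delayMs?: number } = {},
): Promise<number> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    // A probe that rejects (for example, connection refused) counts as not ready.
    const ready = await probe().catch(() => false);
    if (ready) return attempt;
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Service not ready after ${attempts} attempts`);
}
```

A probe might be as simple as an HTTP request to the app's base URL or a health endpoint, if your app exposes one. Running this in global setup turns a vague navigation timeout into an explicit "service not ready" error.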
Assertion passes locally but not in CI
This is usually a timing, rendering, or font difference issue. It can also be a data issue if the PR branch uses a different fixture or setup path.
Do not immediately add retries. Retries can hide a real synchronization bug and make CI look healthier than it is.
How to reduce PR-only instability
Make the environment deterministic
Lock down the things your tests depend on:
- exact Node.js version
- exact browser versions if your tool allows it
- pinned dependencies via lockfiles
- explicit environment variables
- known-good test data seeding
This is standard test automation hygiene, and it matters more in CI than locally.
Separate build failures from browser failures
If your app build and browser tests happen in one long job, a failed install or build can look like a browser issue. Split the workflow into jobs where it makes sense, and pass artifacts explicitly.
That makes debugging much easier:
- install and build once
- upload build artifact
- run browser tests against that artifact
Limit shared mutable state
Do not let browser tests share accounts, inboxes, shopping carts, or backend entities unless the system is designed for it. PR runs are often more parallelized and more crowded, so shared state collisions show up there first.
Use unique test IDs, fresh users, and isolated fixtures.
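One way to get fresh users is to derive each identity from the run and the worker. This is a sketch under assumptions: the run ID and attempt would come from GITHUB_RUN_ID and GITHUB_RUN_ATTEMPT, the worker index from your test runner, and the address format and example.test domain are purely illustrative.

```typescript
// Builds an email address that is unique per workflow run, per retry attempt,
// and per parallel worker, so concurrent PR runs never share an account.
function uniqueTestEmail(runId: string, attempt: string, workerIndex: number): string {
  return `e2e+${runId}-${attempt}-w${workerIndex}@example.test`;
}
```

Two workers in the same run then get different accounts, and a re-run of the same job does not collide with data left behind by the first attempt.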
Prefer explicit waits and stable selectors
PR runs often expose tests that are waiting on the wrong thing. Instead of waiting for arbitrary time, wait for a real condition.
Bad pattern:
typescript
await page.waitForTimeout(3000);
Better pattern:
typescript
await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
Stable selectors reduce the chance that minor rendering differences on PR branches break a test.
Review concurrency settings
If PRs are slower, reduce worker count for browser suites or separate heavy specs into another job. If one flaky spec is causing churn, quarantine it temporarily and fix the root cause, rather than leaving retries to paper it over.
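Worker count is easier to review when it is computed in one place instead of being spread across YAML conditions. The sketch below assumes a hypothetical E2E_WORKERS override variable and relies on CI being set on the runner; the ceiling of 2 is an illustrative default, not a recommendation for every suite.

```typescript
// Picks a worker count: an explicit positive integer override wins; otherwise
// CI runs are capped at 2 workers and local runs leave one core free.
function pickWorkers(env: Record<string, string | undefined>, cpuCount: number): number {
  const override = Number(env["E2E_WORKERS"]);
  if (Number.isInteger(override) && override > 0) return override;
  const ceiling = env["CI"] ? 2 : Math.max(1, cpuCount - 1);
  return Math.max(1, Math.min(ceiling, cpuCount));
}
```

Because PR and main runs call the same function, any difference in parallelism has to come from an explicit override that shows up in the logs.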
A quick checklist for pull request test failures
When a PR breaks browser tests, check these in order:
- Event type, pull_request vs push
- Checkout ref and commit SHA
- Missing secrets or env vars
- Cache hit or miss behavior
- Browser install and dependency versions
- Build artifact consistency
- Worker count and timeouts
- Shared state, test data, and feature flags
- Network dependencies, proxies, or mock services
- Whether the failure reproduces on a minimal isolated run
If you can answer those ten items, you usually know whether the issue is CI environment drift or a real app regression.
When the fix is in the workflow, not the test
A lot of teams assume they should only edit the browser test when it fails. Sometimes that is correct, but often the workflow is the real bug.
Workflow-level fixes include:
- making PR and main jobs use the same install and build steps
- using the same runner image and browser versions
- exporting the same environment variables in both paths
- making caches branch-agnostic where safe
- uploading logs, traces, screenshots, and videos for every failure
- failing fast when required secrets or config values are missing
That last point is important. A missing secret should fail explicitly during setup, not halfway through a browser flow where the symptom looks unrelated.
What good observability looks like for CI browser tests
Good CI browser test output should answer three questions quickly:
- what code ran
- what environment ran it
- what the browser saw when it failed
For GitHub Actions, that usually means collecting:
- job logs
- test runner traces
- screenshots on failure
- browser console output
- network failures
- artifact versions for the app under test
This is where browser automation becomes much more useful than raw pass/fail. The more evidence you collect, the easier it is to distinguish a broken workflow from a broken feature.
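If the suite uses Playwright, most of that evidence can be switched on through its artifact options. The values below are a sketch that assumes Playwright's trace, screenshot, and video settings; other runners have their own equivalents, and the constant name is hypothetical.

```typescript
// Artifact settings that keep evidence only for failing tests, which limits
// the extra load that recording puts on the runner.
const failureArtifacts = {
  trace: "retain-on-failure",
  screenshot: "only-on-failure",
  video: "retain-on-failure",
} as const;
```

These would be spread into the use section of the Playwright config, with the test output directory uploaded as a workflow artifact when the job fails.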
Conclusion
When browser tests fail on pull requests but not on main, the root cause is usually hidden in the CI setup: permissions, secrets, caches, checkout behavior, parallelism, or branch-specific logic. The browser is just the place where the mismatch becomes visible.
The most reliable way to fix it is to make PR and main runs as similar as possible, then isolate the differences one by one. Print the context. Compare the jobs. Reduce shared state. Keep selectors stable. Use explicit waits. Fail fast when config is missing.
That approach takes more discipline than adding retries, but it gives you a better system. And once your GitHub Actions browser tests stop being branch-sensitive, your team spends less time arguing with CI and more time fixing actual product bugs.
If you want a deeper conceptual backdrop, the fundamentals of software testing, test automation, and continuous integration are worth revisiting with your workflow in mind.