Broken internal links bleed crawl budget and tank user experience. Broken external links signal stale content. This tutorial walks through the exact filter-and-export workflow that turns a 4xx report into a prioritized fix list in 90 minutes.
Who this is for
Technical SEOs, in-house marketers, or owners who've completed a Screaming Frog crawl and see a Response Codes tab full of 4xx errors. If you're staring at 600 broken links and unsure where to start, this is the workflow.
Step 1
Open the Response Codes tab → Filter dropdown → "Client Error (4xx)" then export. Repeat for 5xx.
After your crawl completes, click the Response Codes tab in the main view.
Click the Filter dropdown above the table and select 'Client Error (4xx).' This narrows the view to every URL that returned a 400-series response when Screaming Frog requested it.
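404 is the most common: the page was removed or moved without a redirect. 403 means the URL exists but the crawler was blocked (usually by a WAF or authentication). 410 signals intentional, permanent removal; for link cleanup, treat it like a 404 (update or remove the links pointing to it), but don't try to restore the page.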
Click Export → CSV. Save the file as `4xx_audit_YYYY-MM-DD.csv` so you have a dated baseline to measure against after fixes ship.
Repeat with the filter set to 'Server Error (5xx).' 5xx counts should be near zero. If more than 1% of crawled URLs return 5xx, you likely have infrastructure problems (timeouts, app crashes) that need to be escalated to engineering before you do further SEO work.
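The 1% check on 5xx responses can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: it assumes you exported the unfiltered Response Codes view (so the total crawled URL count is available) and that the CSV has `Address` and `Status Code` columns. The function name and sample rows are hypothetical; adjust the column names to match your export.

```python
import csv
import io
from collections import Counter


def status_class_shares(rows):
    """Return the share of rows in each status class, e.g. {"2xx": 0.5, ...}."""
    counts = Counter()
    for row in rows:
        # "Status Code" is an assumed column name; check your export's header.
        code = (row.get("Status Code") or "").strip()
        if code.isdigit():
            counts[f"{code[0]}xx"] += 1
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()} if total else {}


# Hypothetical sample standing in for a real export file.
sample_export = io.StringIO(
    "Address,Status Code\n"
    "https://example.com/,200\n"
    "https://example.com/pricing,200\n"
    "https://example.com/old-page,404\n"
    "https://example.com/api/report,500\n"
)
shares = status_class_shares(csv.DictReader(sample_export))

# The 1% threshold comes from the step above.
needs_escalation = shares.get("5xx", 0.0) > 0.01
print(shares, needs_escalation)
```

For a real file, replace the `io.StringIO` sample with `open("your_export.csv", newline="", encoding="utf-8-sig")`.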
Step 2
Click any 4xx URL → bottom panel → Inlinks tab. This shows every page on your site linking to the broken URL.
Click any 4xx URL in the upper Response Codes table. The bottom panel updates to show details for that URL.
Click the Inlinks tab in the bottom panel. This is the most valuable view in Screaming Frog for broken-link work — it shows every internal page that links to the broken URL.
Note three things for each broken URL: (1) how many internal pages link to it (the inlink count), (2) which templates those pages share (header, footer, sidebar, product template), (3) which anchor texts are used.
A URL with 200+ inlinks from the global footer is one fix. A URL with 3 inlinks from random old blog posts is three separate fixes — usually deferred or auto-redirected.
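If you already have the source, target, and anchor text for each link, the inlink count and anchor list per broken URL can be computed as below. This is a minimal sketch: the function name and the sample tuples are hypothetical, and it assumes you have extracted the three fields from an export yourself.

```python
from collections import defaultdict


def summarize_inlinks(links):
    """links: iterable of (source_url, broken_url, anchor_text) tuples.

    Returns {broken_url: {"inlinks": <distinct source pages>, "anchors": [...]}}.
    """
    sources = defaultdict(set)
    anchors = defaultdict(set)
    for source, broken, anchor in links:
        sources[broken].add(source)
        anchors[broken].add(anchor)
    return {
        broken: {"inlinks": len(sources[broken]), "anchors": sorted(anchors[broken])}
        for broken in sources
    }


# Hypothetical sample data.
links = [
    ("https://example.com/", "https://example.com/old-pricing", "Pricing"),
    ("https://example.com/about", "https://example.com/old-pricing", "Pricing"),
    ("https://example.com/blog/launch", "https://example.com/old-pricing", "see our pricing"),
    ("https://example.com/blog/launch", "https://example.com/beta", "beta signup"),
]
summary = summarize_inlinks(links)
print(summary["https://example.com/old-pricing"])
```

A broken URL whose inlink count is close to the total number of crawled pages is usually linked from a shared template rather than from individual content.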
Step 3
Bulk Export → Response Codes → Client Error (4xx) Inlinks. Get every source-target pair in one CSV.
Top menu → Bulk Export → Response Codes → Client Error (4xx) Inlinks. This generates a CSV with every (source URL, target broken URL, anchor text, link type) tuple in one file.
Open in Sheets or Excel. Add a column for 'Template' and bucket each source URL by the template it lives on (e.g., '/blog post template,' '/product detail,' 'global footer,' 'sitewide nav').
Pivot or COUNTIF by Template + Target URL. Templates with 50+ pages and a single broken target are top priority — one template edit fixes 50+ links.
Bulk Export → External Links also produces a 4xx report for outbound links. Treat outbound 4xx separately — fix or remove, but lower priority than internal.
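The template bucketing and count from this step can also be scripted instead of done in a spreadsheet. This is a minimal sketch under stated assumptions: the path-prefix rules in `TEMPLATE_RULES` are hypothetical and must be replaced with your site's real URL patterns, and the input is a list of (source URL, broken target URL) pairs taken from the inlinks CSV.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical rules mapping a URL path prefix to a template name.
TEMPLATE_RULES = [
    ("/blog/", "blog post template"),
    ("/products/", "product detail"),
]


def template_for(source_url):
    """Bucket a source URL by the first matching path prefix."""
    path = urlparse(source_url).path
    for prefix, name in TEMPLATE_RULES:
        if path.startswith(prefix):
            return name
    return "other"


def pivot_by_template(pairs):
    """Count distinct source pages per (template, broken target), largest first."""
    counts = Counter()
    for source, target in set(pairs):  # set() drops repeated source-target pairs
        counts[(template_for(source), target)] += 1
    return counts.most_common()


# Hypothetical sample data.
pairs = [
    ("https://example.com/blog/a", "https://example.com/old-guide"),
    ("https://example.com/blog/b", "https://example.com/old-guide"),
    ("https://example.com/blog/a", "https://example.com/old-guide"),
    ("https://example.com/products/x", "https://example.com/old-guide"),
    ("https://example.com/about", "https://example.com/retired"),
]
pivot = pivot_by_template(pairs)
print(pivot[0])
```

Path prefixes cannot identify sitewide elements such as a global footer. A target that shows up under every template bucket is a hint that the link lives in a shared element.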
Step 4
Sort by inlink count descending. Cross-reference with GA4 or GSC to identify broken URLs that had real traffic.
Sort your CSV by inlink count, descending. As a rule of thumb, the top 10 broken URLs by inlink count account for most (roughly 70-80%) of the broken-link equity loss on the site.
Cross-reference the top broken URLs against GSC's 'Pages → Why pages aren't indexed' report. URLs flagged as 'Not found (404)' that previously ranked are the highest-priority restorations.
If a broken URL had backlinks (check via Ahrefs or Semrush Backlinks report), it's a top-priority 301 redirect — every external backlink to a 404 is wasted equity. Redirect to the closest semantic match.
For broken URLs with no backlinks and no inlinks (orphan 404s): ignore. They likely came from old crawls or removed sitemap entries.
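The prioritization rules in this step can be expressed as a small triage function. This is a minimal sketch: the record keys (`url`, `inlinks`, `referring_domains`) are hypothetical names for data you would merge yourself from the Screaming Frog export and a backlink tool, and the action labels simply restate the guidance above.

```python
def triage(records):
    """Order broken URLs by priority and attach the suggested action.

    records: list of dicts with keys "url", "inlinks", "referring_domains".
    """
    def action(rec):
        if rec["referring_domains"] > 0:
            return "301 redirect to closest match"
        if rec["inlinks"] == 0:
            return "ignore (orphan 404)"
        return "fix internal links"

    # URLs with backlinks come first, then everything else by inlink count, descending.
    ranked = sorted(
        records,
        key=lambda rec: (rec["referring_domains"] == 0, -rec["inlinks"]),
    )
    return [(rec["url"], action(rec)) for rec in ranked]


# Hypothetical sample data.
records = [
    {"url": "/old-post", "inlinks": 3, "referring_domains": 0},
    {"url": "/ghost", "inlinks": 0, "referring_domains": 0},
    {"url": "/linked-guide", "inlinks": 1, "referring_domains": 5},
    {"url": "/footer-link", "inlinks": 200, "referring_domains": 0},
]
plan = triage(records)
for url, step in plan:
    print(url, "->", step)
```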
Step 5
Group fixes by source template. One edit to the footer fixes 800 instances. One edit to a blog post template fixes 200.
For sitewide template links (footer, header nav, sidebar): one edit in the template fixes every instance. Coordinate with engineering — these are usually 15-30 minute edits.
For content links (inside blog posts, product descriptions): use a database update if the link is identical across N posts. WordPress's wp-cli `search-replace` command can update 10,000 instances in minutes.
For dynamic links (links to programmatically-generated pages that no longer exist): fix at the source — the script generating the link should check existence, or the destination page should be restored.
After every batch of fixes, trigger a re-crawl in Screaming Frog of just the affected templates. Confirm the 4xx count drops before moving to the next batch.
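To confirm a batch of fixes worked, you can diff the broken URLs in the dated baseline against the re-crawl. This is a minimal sketch: the function name is hypothetical, and it assumes you have loaded the broken URL column from each CSV into a list.

```python
def compare_crawls(before, after):
    """Compare broken URLs from the baseline crawl with those from the re-crawl."""
    before, after = set(before), set(after)
    return {
        "fixed": sorted(before - after),      # broken before, no longer reported
        "remaining": sorted(before & after),  # still broken after the fix batch
        "new": sorted(after - before),        # newly broken since the baseline
    }


# Hypothetical sample data.
baseline = ["/old-pricing", "/beta", "/retired"]
recrawl = ["/beta", "/typo-link"]
diff = compare_crawls(baseline, recrawl)
print(diff)
```

If "remaining" still contains URLs from the template you just edited, the template is probably still generating the link.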
Step 6
For broken URLs with external backlinks (per Ahrefs), set up 301 redirects to the closest semantic match — not blanket redirects to /.
Export your top-30 broken URLs into Ahrefs Batch Analysis. Note which have referring domains (backlinks).
For each broken URL with backlinks: identify the closest live page semantically. For a deleted /blog/old-post URL, redirect to the current article on the same topic — not to /blog or /.
Lazy blanket redirects (every 404 → /) destroy the equity you're trying to preserve. Google treats these as 'soft 404s' and ignores them. Redirect to relevant pages or accept the 404.
Implement redirects in your CDN (Cloudflare Rules), web server (.htaccess, nginx config), or framework (Next.js redirects in next.config.js). CDN-level is fastest; framework-level is most maintainable.
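Once you have a mapping from each broken URL to its closest live match, the server rules can be generated rather than typed by hand. This is a minimal sketch: `build_redirects` and the sample paths are hypothetical, it emits nginx exact-match `location` blocks or Apache `Redirect` lines, and it assumes plain paths without spaces or special characters. It refuses any mapping whose target is `/`, in line with the advice above.

```python
def build_redirects(mapping, style="nginx"):
    """Turn {old_path: new_path} into a list of 301 redirect rules."""
    rules = []
    for old, new in sorted(mapping.items()):
        if new == "/":
            # Blanket redirects to the homepage risk being treated as soft 404s.
            raise ValueError(f"refusing homepage redirect for {old}")
        if style == "nginx":
            rules.append(f"location = {old} {{ return 301 {new}; }}")
        elif style == "apache":
            # Note: Apache's Redirect directive matches by path prefix.
            rules.append(f"Redirect 301 {old} {new}")
        else:
            raise ValueError(f"unknown style: {style}")
    return rules


# Hypothetical mapping.
mapping = {"/blog/old-post": "/blog/current-post"}
nginx_rules = build_redirects(mapping)
apache_rules = build_redirects(mapping, style="apache")
print("\n".join(nginx_rules + apache_rules))
```

Review the generated rules before deploying them, and test a few with a browser or `curl -I` to confirm the 301 and its target.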
Common mistakes
Fixing broken links one at a time
What goes wrong: Three weeks of ticking off individual 404s while the same broken URL appears in 200 places. You finish the list and run a new crawl — the count is the same because the template still generates the broken link. Estimated wasted dev time: 30-60 hours, or $1,500-3,000 in agency hours.
How to avoid: Always group by source template before fixing. One template edit beats 200 URL edits.
Blanket-redirecting every 404 to the homepage
What goes wrong: Google detects the pattern, marks the redirects as 'soft 404s,' and drops them from the index. You lose every backlink-equity transfer those redirects were supposed to preserve. On a site with 100+ external backlinks to deleted pages, this is $5,000-20,000 in DR equity lost.
How to avoid: Redirect each broken URL to the closest semantic match — not /. If no good match exists, accept the 404 (it's better than a soft 404).
Ignoring outbound broken links
What goes wrong: Your blog posts link to 80 dead competitor URLs and tools that 404'd years ago. User experience tanks, dwell time drops, and the dead links signal stale, poorly maintained content. The problem compounds over time.
How to avoid: Bulk Export → External Links → Client Error (4xx). Replace or remove dead outbound links. Schedule quarterly.
Not differentiating 404 from 410
What goes wrong: You treat intentionally-removed pages (410 Gone) as broken links and try to restore them. You waste 4-8 hours on URLs that are gone on purpose.
How to avoid: Filter by exact status code (404, 410, 451) separately. Confirm 410s are intentional with engineering before acting.
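Splitting the 4xx export by exact status code takes only a few lines. This is a minimal sketch: the function name and sample rows are hypothetical, and it assumes you have read (URL, status code) pairs from the export.

```python
from collections import defaultdict


def group_by_status(rows):
    """rows: iterable of (url, status_code) pairs. Returns {code: [urls]}."""
    groups = defaultdict(list)
    for url, code in rows:
        groups[int(code)].append(url)
    return dict(groups)


# Hypothetical sample data.
rows = [
    ("/old-post", "404"),
    ("/sunset-feature", "410"),
    ("/typo-link", "404"),
    ("/blocked-region", "451"),
]
by_code = group_by_status(rows)
print(by_code[410])
```

Each group can then be reviewed on its own, with the 410 list going to engineering for confirmation before anyone acts on it.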
Skipping the Bulk Export step
What goes wrong: You click through 4xx URLs one at a time in the Inlinks panel. After 20 minutes you've reviewed 6 URLs. The full audit takes 8 hours instead of 90 minutes.
How to avoid: Always start with Bulk Export → Response Codes → Client Error (4xx) Inlinks. The CSV gives you the full picture in one file.
Not checking whether the WAF is producing false 4xx responses
What goes wrong: Cloudflare Bot Fight Mode returns 403 for 5% of your URLs. You spend a day 'fixing' broken pages that aren't actually broken. Real broken links go unfixed because the noise drowned them out.
How to avoid: Before fixing 4xx, spot-check 5-10 random 4xx URLs in a browser. If they load fine, the issue is at the WAF/crawl layer, not the site. Re-crawl with the SF user-agent allowlisted.
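Picking the spot-check URLs at random, rather than from the top of the list, avoids checking only one section of the site. This is a minimal sketch: the function name, the default sample size, and the optional seed are assumptions for illustration.

```python
import random


def spot_check_sample(urls, k=10, seed=None):
    """Return up to k distinct URLs chosen at random for a manual browser check."""
    unique = sorted(set(urls))
    rng = random.Random(seed)  # pass a seed to get a reproducible sample
    return rng.sample(unique, min(k, len(unique)))


# Hypothetical list of 4xx URLs from the export.
broken = [f"https://example.com/page-{i}" for i in range(20)]
to_check = spot_check_sample(broken, k=5, seed=42)
print(to_check)
```

Open each sampled URL in a normal browser session. If most of them load, investigate the WAF or crawl configuration before touching the site.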
Done — what's next
How to set up Screaming Frog and run your first crawl
Hand it off
Broken-link cleanup is recurring work — every CMS change, every product removal, every blog post update creates new ones. A vetted technical SEO specialist on EverestX will own the monthly crawl, the fix list, and the close-out — typically $400-700/mo at $14-16/hr.
Under 1% of crawled URLs returning 4xx is healthy. 1-3% needs attention but isn't urgent. Above 5% indicates either a recent migration without redirects, or a template generating broken links — investigate the pattern immediately.
Redirect any 404 with backlinks or meaningful organic traffic. Accept 404s for genuinely-removed content with no equity (old promotional pages, deleted user-generated content, expired campaigns). Soft 404s — blanket redirects to / — are worse than honest 404s.
When Screaming Frog reports 403s (or other 4xx codes) for URLs that load fine in a browser, there are three usual reasons: (1) WAF rate-limiting or bot blocking of the SF user-agent; (2) URLs requiring authentication (admin paths, member-only content); (3) Cloudflare's Bot Fight Mode returning 403 to non-browser traffic. Spot-check 5 random URLs manually before acting.
Broken links are not a direct signal in Google's algorithm, but they hurt user experience and dwell time, and they signal stale content. The compounding effect over years can drag rankings down. They are worth a quarterly cleanup, not a weekly one.
Yes, broken-link checks can be automated. Schedule Screaming Frog crawls with the built-in scheduler (File → Scheduling), or drive the command-line interface from a shell script and cron on Mac/Linux. Pipe the 4xx CSV into a weekly report. Even better: use the Screaming Frog API or a service like ContentKing for real-time detection.
Screaming Frog SEO Spider
Screaming Frog only earns its keep when the crawl matches how Googlebot actually sees your site. This walks through the install, license activation, memory tuning, and configuration choices that 90% of first-time users get wrong.
Screaming Frog SEO Spider
Redirect chains kill crawl budget and bleed link equity. Screaming Frog finds them in seconds; fixing them takes coordination. This walks the exact workflow from chain detection to clean 301s.
Ahrefs
Every site has dead backlinks pointing at 404'd pages. Reclaiming them is the highest ROI link work in SEO — the links already exist; you just have to redirect or replace the destination.
Google Search Console
GSC's Indexing report shows you what's broken — in language that often hides what to actually do about it. This is the field-tested decoder: every error type, what causes it, and the specific fix that works.
Screaming Frog SEO Spider
You've crawled the site. You have 6,000 issues. You're not sure which 30 actually matter. This is the honest decision framework for when self-managed technical SEO becomes false economy.