Crawler Update: Better Redirect Handling & Deeper Error Page Checks
Published on Feb 19, 2026

Over the past few days, I’ve been refining the crawling engine to make scan results more precise and technically reliable. This update focuses mainly on how redirects and error pages are handled internally.

Here’s what changed.
Redirects Are Now Analyzed Explicitly
URLs on the blacklist were supposed to be ignored entirely. In practice, however, when a blacklisted URL returned a redirect, the crawler still followed it.
Example:
- URL A → listed in blacklist
- URL A responds with 301 → redirect to URL B
- URL B returns 404
- The resulting 404 was counted in the scan statistics, even though the chain started at a blacklisted URL
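To make the failure mode concrete, here is a minimal sketch of the old behavior, assuming a generic HTTP client (Python’s requests; the blacklist set and URLs are hypothetical stand-ins, not the actual implementation). Because the client follows the redirect internally, the blacklist check never sees URL B:

```python
import requests

BLACKLIST = {"https://example.com/a"}  # hypothetical: URL A from the example above

def naive_fetch(url: str):
    if url in BLACKLIST:  # the check only ever sees the original URL
        return None
    # allow_redirects=True is the requests default: the 301 to URL B is
    # followed internally, and B's 404 ends up in the scan statistics.
    return requests.get(url, timeout=10)
```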
What Changed
Blacklist checks now happen before any redirect is processed.
The updated logic:
- If URL A is blacklisted → it is skipped entirely
- No redirect is followed
- No downstream URLs (B, C, …) are evaluated
- No error codes from redirect targets enter the statistics
Redirect chains originating from excluded URLs are now completely ignored.
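A minimal sketch of the new order of operations, again assuming a generic HTTP client in Python (is_blacklisted, fetch, and the exact-match blacklist are illustrative assumptions):

```python
import requests
from urllib.parse import urljoin

BLACKLIST = {"https://example.com/private"}  # hypothetical exclusion list

def is_blacklisted(url: str) -> bool:
    # Exact match for brevity; real matching might use patterns or prefixes.
    return url in BLACKLIST

def fetch(url: str, max_redirects: int = 10):
    """Follow redirects manually so the blacklist is consulted
    before every hop, not just the first request."""
    for _ in range(max_redirects):
        if is_blacklisted(url):
            return None  # skipped entirely: no redirect followed, nothing counted
        resp = requests.get(url, allow_redirects=False, timeout=10)
        if resp.is_redirect:
            # Resolve the next hop, then loop back to the blacklist check.
            url = urljoin(url, resp.headers["Location"])
            continue
        return resp
    return None  # redirect chain too long; give up
```

The key design point is disabling automatic redirect handling so that every hop in the chain passes through the same blacklist check as the original URL.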
Error Pages Are No Longer Ignored
In the past, HTTP error responses like 404 or 500 would stop the parsing process. That’s technically correct behavior, but not helpful for a website scan.
Now the crawler:
- Accepts 4xx and 5xx responses
- Parses HTML error pages
- Extracts and validates links on those pages
- Detects broken assets even on 404/500 templates
Most websites return fully styled error pages containing scripts, stylesheets, images, and navigation links. These are now checked just like any other page.
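In crawler terms, the change amounts to no longer treating 4xx/5xx status codes as a reason to skip parsing. A minimal sketch of the idea in Python (requests plus BeautifulSoup; extract_refs and the tag/attribute selection are illustrative assumptions, not the actual implementation):

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def extract_refs(url: str) -> list[str]:
    """Fetch a page and collect link/asset URLs, including
    from 4xx/5xx responses instead of discarding them."""
    resp = requests.get(url, timeout=10)
    # Deliberately no raise_for_status(): a styled 404/500 page is
    # still HTML whose links and assets are worth validating.
    if "text/html" not in resp.headers.get("Content-Type", ""):
        return []
    soup = BeautifulSoup(resp.text, "html.parser")
    refs = []
    for tag, attr in (("a", "href"), ("link", "href"),
                      ("script", "src"), ("img", "src")):
        for el in soup.find_all(tag):
            value = el.get(attr)
            if value:
                refs.append(urljoin(url, value))
    return refs
```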