Scanning Pipeline
Deep dive into each phase of the Surfbot scanning pipeline.
Pipeline Overview
Every Surfbot scan executes a sequential pipeline of five phases. Each phase builds on the results of the previous one, creating a comprehensive picture of your external attack surface.
Scans are on-demand — you trigger them when you need them. Scheduled/continuous scanning is on the roadmap.
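As a rough illustration of how a sequential pipeline can still surface results early, the sketch below chains phases so that each one consumes the previous phase's output and yields its result as soon as it finishes. The names `run_pipeline` and `demo_phases` are hypothetical and the phases are stand-ins; this is not Surfbot's actual implementation.

```python
from typing import Any, Callable, Iterator

# Hypothetical phase signature: takes the previous phase's output, returns its own.
Phase = Callable[[Any], Any]


def run_pipeline(target: str, phases: list[tuple[str, Phase]]) -> Iterator[tuple[str, Any]]:
    """Run phases in order, yielding (name, result) as soon as each one finishes."""
    data: Any = target
    for name, phase in phases:
        data = phase(data)  # each phase consumes the previous phase's output
        yield name, data


# Stand-in phases for illustration only; real phases would invoke the scanning tools.
demo_phases: list[tuple[str, Phase]] = [
    ("discovery", lambda domain: [f"www.{domain}", f"api.{domain}"]),
    ("port_scan", lambda hosts: {host: [443] for host in hosts}),
]

for phase_name, result in run_pipeline("example.com", demo_phases):
    print(phase_name, result)
```

Because `run_pipeline` is a generator, a caller can display each phase's output without waiting for later phases.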
Phase 1: Discovery
Tool: subfinder → dnsx
Goal: Map the full extent of your external footprint.
The discovery phase uses multiple data sources and techniques in parallel:
| Technique | Description | Typical Yield |
|---|---|---|
| Passive DNS | Historical records from SecurityTrails, VirusTotal, etc. | High |
| Certificate Transparency | Subdomains from CT log entries | High |
| DNS Brute-force | Dictionary of ~100k common subdomain names | Medium |
| Permutation | Generates variations of discovered names | Medium |
| Web crawling | Extracts linked domains/subdomains from responses | Low–Medium |
Output: A deduplicated list of subdomains and their resolved IP addresses.
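The deduplication step can be sketched as follows. The function `merge_candidates` and the sample source lists are hypothetical, and DNS resolution is left out so the example runs offline.

```python
def merge_candidates(*sources: list[str]) -> list[str]:
    """Normalize hostnames from several sources and return a sorted, deduplicated list."""
    seen: set[str] = set()
    for source in sources:
        for name in source:
            # DNS names are case-insensitive; drop whitespace and any trailing root dot.
            normalized = name.strip().lower().rstrip(".")
            if normalized:
                seen.add(normalized)
    return sorted(seen)


# Illustrative inputs, not real scan data.
passive_dns = ["API.example.com", "www.example.com."]
ct_logs = ["www.example.com", "staging.example.com"]

print(merge_candidates(passive_dns, ct_logs))
# ['api.example.com', 'staging.example.com', 'www.example.com']
```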
Phase 2: Port Scanning
Tool: naabu
Goal: Identify open ports and running services on every discovered host.
We use a SYN-based scanner for speed, followed by service fingerprinting on open ports.
For each open port, we capture:
- Port number and protocol
- Service name and version (via banner grabbing)
- TLS certificate details (if applicable)
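One way to picture the per-port record is the sketch below. The `OpenPort` and `TlsInfo` types and the `parse_banner` helper are assumptions made for illustration; the actual schema is not specified here, and real banners vary far more than the simple `name/version` form handled below.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TlsInfo:
    subject: str
    issuer: str
    not_after: str  # expiry date as an ISO 8601 string


@dataclass(frozen=True)
class OpenPort:
    host: str
    port: int
    protocol: str                   # e.g. "tcp"
    service: Optional[str] = None   # from banner grabbing, if available
    version: Optional[str] = None
    tls: Optional[TlsInfo] = None   # only set when the port speaks TLS


def parse_banner(banner: str) -> tuple[Optional[str], Optional[str]]:
    """Split a 'name/version' style banner into (service, version)."""
    token = banner.strip().split(" ", 1)[0]
    if not token:
        return None, None
    name, _, version = token.partition("/")
    return name, (version or None)


service, version = parse_banner("Apache/2.4.49 (Unix)")
record = OpenPort(host="www.example.com", port=80, protocol="tcp",
                  service=service, version=version)
print(record)
```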
Phase 3: HTTP Probing
Tool: httpx
Goal: Fingerprint web applications running on HTTP/HTTPS ports.
Every open port serving HTTP(S) is probed for:
- Response metadata — Status code, headers, redirect chains
- Technology stack — Framework, CMS, CDN, WAF detection (using Wappalyzer signatures)
- Content analysis — Page title, favicon hash, body content hash
This phase often reveals forgotten staging environments, admin panels, and shadow IT that organizations didn't know existed.
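The content-analysis fields can be approximated with the standard library, as in the sketch below. The function `summarize_response` is hypothetical, and SHA-256 is an assumed choice of hash; the algorithm the pipeline actually uses for body and favicon hashes is not stated here.

```python
import hashlib
import re

# Simple title matcher; a production probe would use a real HTML parser.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def summarize_response(status: int, headers: dict[str, str], body: bytes) -> dict:
    """Collect basic response metadata plus a page title and body hash."""
    text = body.decode("utf-8", errors="replace")
    match = TITLE_RE.search(text)
    # Collapse internal whitespace so titles compare consistently across scans.
    title = " ".join(match.group(1).split()) if match else None
    server = next((v for k, v in headers.items() if k.lower() == "server"), None)
    return {
        "status": status,
        "server": server,
        "title": title,
        "body_sha256": hashlib.sha256(body).hexdigest(),  # assumed hash algorithm
    }


page = b"<html><head><title>\n  Admin   Login </title></head><body></body></html>"
summary = summarize_response(200, {"Server": "nginx"}, page)
print(summary["title"], summary["server"])
```

Hashing the body makes it cheap to tell whether a page changed between scans without storing the full response.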
Phase 4: Vulnerability Assessment
Tool: Nuclei (8,000+ official templates + 19 custom Surfbot templates)
Goal: Identify known vulnerabilities, misconfigurations, and exposures.
The templates executed depend on the scan profile you selected:
Scan Profiles
| Profile | Tags/Severity Included | Use Case |
|---|---|---|
| Passive | Tech detection, SSL/TLS, DNS, info-severity | Safe recon — no intrusive checks |
| Standard | Misconfigs, exposures, CVEs, secrets (excludes DoS, intrusive) | Balanced assessment for most domains |
| Deep | Everything except denial-of-service templates | Comprehensive scan for domains you fully control |
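To make the profile table concrete, the sketch below filters a template list by tags and severity. The `PROFILES` mapping, the tag names, and the sample templates are all assumptions chosen to mirror the table; they are not Surfbot's real configuration.

```python
# Hypothetical profile definitions mirroring the table above.
# include_tags=None means "no tag restriction".
PROFILES = {
    "passive": {
        "include_tags": {"tech", "ssl", "dns"},
        "include_severity": {"info"},
        "exclude_tags": {"dos", "intrusive"},
    },
    "standard": {
        "include_tags": {"misconfig", "exposure", "cve", "secrets"},
        "include_severity": set(),
        "exclude_tags": {"dos", "intrusive"},
    },
    "deep": {
        "include_tags": None,
        "include_severity": set(),
        "exclude_tags": {"dos"},
    },
}


def select_templates(templates: list[dict], profile_name: str) -> list[str]:
    """Return the IDs of templates allowed by the named profile."""
    profile = PROFILES[profile_name]
    selected = []
    for template in templates:
        tags = set(template["tags"])
        if tags & profile["exclude_tags"]:
            continue  # exclusions always win
        include_tags = profile["include_tags"]
        if (
            include_tags is None
            or bool(tags & include_tags)
            or template["severity"] in profile["include_severity"]
        ):
            selected.append(template["id"])
    return selected


# Made-up templates for illustration.
sample_templates = [
    {"id": "tech-detect", "tags": ["tech"], "severity": "info"},
    {"id": "exposed-env", "tags": ["exposure"], "severity": "high"},
    {"id": "intrusive-check", "tags": ["cve", "intrusive"], "severity": "critical"},
    {"id": "dos-check", "tags": ["dos"], "severity": "high"},
]

for name in PROFILES:
    print(name, select_templates(sample_templates, name))
```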
What Gets Checked
CVE Detection: Matches service versions against known CVE databases. For example, if Phase 2 identified Apache/2.4.49, the scanner flags CVE-2021-41773 (path traversal).
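The Apache example above can be expressed as a simple version lookup. This is a minimal sketch with a one-entry table and a hypothetical `match_cves` helper; a real scanner relies on much richer detection logic than exact version matching.

```python
# Illustrative lookup table containing only the example from the text.
KNOWN_VULNERABLE = {
    ("apache", "2.4.49"): ["CVE-2021-41773"],  # path traversal
}


def match_cves(banner: str) -> list[str]:
    """Return CVE IDs for a 'Product/version' banner, or an empty list."""
    token = banner.strip().split(" ", 1)[0]
    product, _, version = token.partition("/")
    return KNOWN_VULNERABLE.get((product.lower(), version), [])


print(match_cves("Apache/2.4.49 (Unix)"))  # ['CVE-2021-41773']
print(match_cves("Apache/2.4.51"))         # []
```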
Misconfiguration Checks:
- Open directory listings
- Default credentials on admin panels
- Exposed .env, .git/config, phpinfo() files
- Missing security headers (HSTS, CSP, X-Frame-Options)
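The missing-header check, for instance, comes down to a case-insensitive lookup over the response headers. The sketch below uses a hypothetical `missing_security_headers` helper and covers only the three headers named in the list.

```python
# Short label -> actual HTTP header name.
REQUIRED_HEADERS = {
    "HSTS": "Strict-Transport-Security",
    "CSP": "Content-Security-Policy",
    "X-Frame-Options": "X-Frame-Options",
}


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return labels of required security headers absent from the response."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    return [
        label
        for label, header in REQUIRED_HEADERS.items()
        if header.lower() not in present
    ]


response_headers = {
    "strict-transport-security": "max-age=31536000",
    "Content-Type": "text/html",
}
print(missing_security_headers(response_headers))  # ['CSP', 'X-Frame-Options']
```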
Secret Exposure: Scans response bodies and JavaScript files for:
- API keys (AWS, GCP, Stripe, etc.)
- Hardcoded tokens and passwords
- Private keys and certificates
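Secret detection of this kind is typically pattern-based. The sketch below shows two illustrative patterns, one for AWS access key IDs and one for PEM private key headers; the `SECRET_PATTERNS` table and `find_secrets` helper are hypothetical and far from exhaustive. The key in the sample is the placeholder value used in AWS documentation.

```python
import re

# Illustrative patterns only; real rule sets are much larger.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for every pattern hit in text."""
    hits = []
    for kind, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group(0)))
    return hits


js_source = 'const key = "AKIAIOSFODNN7EXAMPLE";'
print(find_secrets(js_source))
```

Pattern matches are candidates rather than confirmed leaks, so results like these usually need validation before being reported.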
SSL/TLS Analysis:
- Expired or self-signed certificates
- Weak cipher suites
- Legacy protocol support that enables known attacks (SSLv3 for POODLE, SSLv2 for DROWN)
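The certificate checks can be sketched over already-parsed certificate fields. The `certificate_issues` helper and the dictionary layout are assumptions for illustration. Comparing subject and issuer is only a heuristic for self-signed certificates; confirming it properly requires verifying the signature against the certificate's own public key.

```python
from datetime import datetime, timezone


def certificate_issues(cert: dict, now: datetime) -> list[str]:
    """Flag expiry and likely self-signing from parsed certificate fields."""
    issues = []
    if cert["not_after"] < now:
        issues.append("expired")
    # Heuristic: a certificate whose subject equals its issuer is self-issued.
    if cert["subject"] == cert["issuer"]:
        issues.append("self-signed")
    return issues


# Made-up certificate data for illustration.
sample_cert = {
    "subject": "CN=staging.example.com",
    "issuer": "CN=staging.example.com",
    "not_after": datetime(2020, 1, 1, tzinfo=timezone.utc),
}
check_time = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(certificate_issues(sample_cert, check_time))  # ['expired', 'self-signed']
```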
Phase 5: Differential Analysis
Goal: Identify what changed since the last scan.
This is covered in detail in Differential Scanning. In short, every finding is compared against the previous scan to produce a clear delta:
- New assets, ports, or vulnerabilities
- Changed service versions, certificates, or configurations
- Resolved findings that are no longer present
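The three delta categories map naturally onto set operations. In this minimal sketch, each scan is a dictionary keyed by a finding identifier with the observed state as the value; the `diff_scans` helper and the key format are hypothetical.

```python
def diff_scans(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two scans keyed by finding ID and classify the differences."""
    prev_keys, curr_keys = set(previous), set(current)
    return {
        "new": sorted(curr_keys - prev_keys),
        "resolved": sorted(prev_keys - curr_keys),
        # Present in both scans but with a different observed state.
        "changed": sorted(
            key for key in prev_keys & curr_keys if previous[key] != current[key]
        ),
    }


# Made-up scan snapshots for illustration.
last_scan = {
    "www.example.com:443": "nginx/1.18.0",
    "old.example.com:80": "Apache/2.4.49",
}
this_scan = {
    "www.example.com:443": "nginx/1.24.0",
    "api.example.com:443": "envoy",
}

print(diff_scans(last_scan, this_scan))
```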
Timing
| Domain Size | Typical Duration |
|---|---|
| Small (< 50 subdomains) | 5–10 minutes |
| Medium (50–500 subdomains) | 10–30 minutes |
| Large (500+ subdomains) | 30–90 minutes |
Phases run in order, but work within a phase runs in parallel where possible. Results stream in as each phase completes, so you don't have to wait for the full pipeline to finish.