surfbot.

How It Works

A high-level overview of Surfbot's architecture and scanning methodology.

Architecture Overview

Surfbot operates as a distributed scanning platform. When you add a domain, our system orchestrates a multi-phase pipeline that mirrors what a skilled penetration tester would do manually — but runs continuously and at scale.

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌──────────────┐
│  Discovery   │───▶│  Port Scan   │───▶│ HTTP Probe   │───▶│  Vuln Scan   │───▶│  Diff Engine  │
│ (subdomains) │    │ (services)   │    │ (web finger) │    │ (CVEs/misc)  │    │ (change det.) │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘    └──────────────┘

Each phase feeds its output into the next. The final phase, the differential engine, compares the results of the current run against those of the previous scan.
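
To make the hand-off between phases concrete, here is a minimal Python sketch of a staged pipeline. The `ScanState` class, the stage functions, and the sample data are hypothetical placeholders chosen for illustration; they are not Surfbot's actual internals.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScanState:
    """Accumulates results as each phase runs (hypothetical structure)."""
    domain: str
    hosts: list[str] = field(default_factory=list)
    open_ports: dict[str, list[int]] = field(default_factory=dict)
    phases_run: list[str] = field(default_factory=list)


Stage = Callable[[ScanState], ScanState]


def discovery(state: ScanState) -> ScanState:
    # Placeholder: a real discovery phase would query DNS, CT logs, and so on.
    state.hosts = [f"www.{state.domain}", f"api.{state.domain}"]
    state.phases_run.append("discovery")
    return state


def port_scan(state: ScanState) -> ScanState:
    # Placeholder: consumes the hosts produced by the previous phase.
    state.open_ports = {host: [443] for host in state.hosts}
    state.phases_run.append("port_scan")
    return state


def run_pipeline(domain: str, stages: list[Stage]) -> ScanState:
    """Run each stage in order, passing the accumulated state along."""
    state = ScanState(domain=domain)
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline("example.com", [discovery, port_scan])
```

Because every stage takes and returns the same state object, later phases (HTTP probing, vulnerability checks, diffing) could be appended to the list without changing `run_pipeline`.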

Component Breakdown

Discovery Engine

The discovery phase maps your external footprint. It combines multiple techniques:

  • Passive DNS — Historical DNS records from public datasets
  • Certificate Transparency — Subdomains from CT log monitoring
  • DNS Brute-forcing — Dictionary-based subdomain enumeration
  • Recursive permutation — Generates permutations of discovered subdomains

This typically uncovers 2–10x more subdomains than most organizations are aware of.
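
As an illustration of the permutation technique, the sketch below combines the labels of already-known subdomains with a small wordlist. The function name, the wordlist, and the three join patterns are assumptions made for this example; the patterns Surfbot actually generates are not documented here.

```python
def permute_subdomains(known: set[str], domain: str, words: list[str]) -> set[str]:
    """Generate candidate hostnames from known subdomains and a wordlist."""
    suffix = "." + domain
    candidates: set[str] = set()
    for host in known:
        if not host.endswith(suffix):
            continue  # ignore hosts outside the target domain
        label = host[: -len(suffix)]  # "api" from "api.example.com"
        for word in words:
            candidates.add(f"{word}-{label}{suffix}")  # dev-api.example.com
            candidates.add(f"{label}-{word}{suffix}")  # api-dev.example.com
            candidates.add(f"{word}.{label}{suffix}")  # dev.api.example.com
    # Only return names that are not already known.
    return candidates - known


candidates = permute_subdomains(
    {"api.example.com"}, "example.com", ["dev", "staging"]
)
```

The output is only a list of candidates. Each name would still need to be resolved via DNS to confirm it exists, and confirmed names can be fed back in as `known` for another round, which is what makes the approach recursive.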

Port Scanner

Every discovered host is scanned for open ports. By default we scan the 1,000 most common ports; full-range scans covering all 65,535 ports are available on higher tiers.

The scanner identifies:

  • Open TCP ports
  • Running services and their versions (via banner grabbing)
  • TLS/SSL configuration and certificate details
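
The simplest way to test for an open TCP port is a full connect attempt. The sketch below uses that approach with Python's standard `socket` module; it is a generic TCP connect check, and the scanning technique Surfbot itself uses may differ. The function names and default timeout are assumptions.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def scan_ports(host: str, ports: list[int], timeout: float = 1.0) -> list[int]:
    """Return the subset of ports that accepted a connection."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

A sequential loop like this is slow across a thousand ports; a production scanner would run the checks concurrently.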

HTTP Prober

For every open HTTP/HTTPS port, the prober collects:

  • Response status codes and headers
  • Technology fingerprints (frameworks, CMS, CDN, WAF)
  • Screenshot captures
  • Page title and content hashes

This creates a rich picture of what's actually running on each endpoint.
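
The sketch below shows how a probe result could be reduced to a comparable record: status code, server header, page title, and a SHA-256 hash of the body. The field names and the `summarize_response` function are hypothetical, and the regex-based title extraction is a simplification that a real HTML parser would handle more robustly.

```python
import hashlib
import re

# Simplified title matcher; tolerant of attributes and line breaks.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def summarize_response(status: int, headers: dict[str, str], body: bytes) -> dict:
    """Reduce an HTTP response to a small, comparable record."""
    text = body.decode("utf-8", errors="replace")
    match = TITLE_RE.search(text)
    # Collapse internal whitespace so formatting changes do not alter the title.
    title = " ".join(match.group(1).split()) if match else None
    return {
        "status": status,
        "server": headers.get("Server"),  # assumes header names keep this casing
        "title": title,
        "content_sha256": hashlib.sha256(body).hexdigest(),
    }


record = summarize_response(
    200,
    {"Server": "nginx"},
    b"<html><title> Admin\n Login </title></html>",
)
# record["title"] is "Admin Login"
```

Storing a hash rather than the full body keeps records small while still letting a later scan detect that the page content changed.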

Vulnerability Scanner

The vuln scanner runs targeted checks against discovered services:

  • CVE Detection — Known vulnerabilities matched against service versions
  • Misconfiguration Checks — Default credentials, open admin panels, directory listings
  • Secret Exposure — API keys, tokens, and credentials in public responses
  • SSL/TLS Issues — Expired certs, weak ciphers, protocol downgrade vulnerabilities
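
As an example of the secret exposure check, the sketch below scans response text with a small set of regular expressions. The two patterns (the widely used `AKIA` prefix format for AWS access key IDs, and PEM private key headers) are illustrative only; the rule set Surfbot uses is not specified here, and the names `SECRET_PATTERNS` and `find_secrets` are hypothetical.

```python
import re

# Illustrative patterns only; a real rule set would be far larger.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) for every match in the text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


# The key below is the placeholder value used in AWS documentation.
sample = 'var cfg = {key: "AKIAIOSFODNN7EXAMPLE"};'
hits = find_secrets(sample)
# hits == [("aws_access_key_id", "AKIAIOSFODNN7EXAMPLE")]
```

Pattern matching of this kind produces false positives, so matches are best treated as candidates for review rather than confirmed leaks.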

Differential Engine

The final stage compares current results against the previous scan. This is Surfbot's key differentiator and is covered in detail in Differential Scanning.
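
At its core, comparing two scans is a set difference. The sketch below models each result as a `(host, port)` pair, which is an assumption made to keep the example short; real findings would carry more fields.

```python
Result = tuple[str, int]  # (host, port); simplified for illustration


def diff_scans(previous: set[Result], current: set[Result]) -> dict[str, list[Result]]:
    """Classify results as added, removed, or unchanged between two scans."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
        "unchanged": sorted(current & previous),
    }


previous = {("api.example.com", 443), ("www.example.com", 80)}
current = {("api.example.com", 443), ("api.example.com", 8080)}

delta = diff_scans(previous, current)
# delta["added"]     == [("api.example.com", 8080)]
# delta["removed"]   == [("www.example.com", 80)]
# delta["unchanged"] == [("api.example.com", 443)]
```

Reporting only the `added` and `removed` entries is what lets a continuous scanner surface changes instead of repeating the full result set on every run.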

Data Flow

All scan data flows through a central pipeline:

  1. Raw findings are normalized into a common schema
  2. Deduplication removes redundant entries
  3. Severity scoring assigns risk levels using CVSS and contextual factors
  4. The differential engine calculates deltas
  5. Results are persisted and made available via dashboard and API
  6. Webhooks fire for configured alert conditions
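
Steps 1 to 3 above can be sketched as follows. The `Finding` schema, the field names in the raw input, and the deduplication key are assumptions for illustration. The severity bands follow the CVSS v3.x qualitative rating scale; the contextual factors mentioned in step 3 are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical common schema for a normalized finding."""
    host: str
    port: int
    check_id: str
    cvss: float


def normalize(raw: dict) -> Finding:
    """Coerce a raw scanner record into the common schema."""
    return Finding(
        host=raw["host"].strip().lower().rstrip("."),
        port=int(raw["port"]),
        check_id=raw["check_id"].upper(),
        cvss=float(raw.get("cvss", 0.0)),
    )


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Keep the first finding seen for each (host, port, check_id) key."""
    seen: dict[tuple[str, int, str], Finding] = {}
    for finding in findings:
        seen.setdefault((finding.host, finding.port, finding.check_id), finding)
    return list(seen.values())


def severity(cvss: float) -> str:
    """Map a CVSS v3.x base score to its qualitative rating."""
    if cvss >= 9.0:
        return "critical"
    if cvss >= 7.0:
        return "high"
    if cvss >= 4.0:
        return "medium"
    if cvss > 0.0:
        return "low"
    return "none"


raw_findings = [
    {"host": "API.Example.com.", "port": "443", "check_id": "dir-listing", "cvss": 5.3},
    {"host": "api.example.com", "port": 443, "check_id": "DIR-LISTING", "cvss": 5.3},
]
findings = deduplicate([normalize(r) for r in raw_findings])
# One finding remains, rated "medium".
```

Normalizing before deduplicating matters: the two raw records above differ in casing, trailing dot, and port type, and only collapse into one entry once they share a schema.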

Infrastructure

Surfbot runs on distributed infrastructure across multiple regions. Scans are parallelized and load-balanced to minimize scan duration while respecting rate limits and avoiding disruption to your services.

All scan traffic originates from published IP ranges that you can allowlist if needed. See your dashboard settings for the current list.
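
If you want to verify on your side that a request came from a scanner address, a check against the published ranges could look like the sketch below. The ranges shown are placeholders taken from the RFC 5737 documentation blocks, not Surfbot's real ranges; substitute the list from your dashboard settings.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT Surfbot's real ranges.
SCANNER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_scanner_ip(addr: str) -> bool:
    """Return True if the address falls inside any configured scanner range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in network for network in SCANNER_RANGES)


is_scanner_ip("192.0.2.17")   # True with the placeholder ranges above
is_scanner_ip("203.0.113.5")  # False
```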
