Build Your Own Web Fingerprinting Tool in Rust
Create a Rust tool that detects CMS/tech stacks via headers, responses, and TLS hints, plus how to reduce exposure.
Fingerprint responsibly: gather headers and HTML hints, log them, and learn how to minimize your own exposure. This lab is fully local, with validation and cleanup.
What You’ll Build
- A mock website with custom headers and HTML meta tags.
- A Rust fingerprinting CLI that fetches a page, extracts headers/meta, and emits JSON.
- Safety controls (timeouts, UA tagging) and remediation tips.
Prerequisites
- macOS or Linux with Rust 1.80+.
- Python 3.10+.
- Run locally only unless you have written permission for a target.
Safety and Legal
- Fingerprint only systems you own/are authorized to test.
- Keep concurrency low and add delays; avoid hitting login or admin pages without consent.
- Strip sensitive data from logs if working on real targets.
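One way to enforce the local-only rule in code is to refuse any target that is not a loopback address before a request is sent. The sketch below is a hypothetical helper (`is_loopback_target` is not part of the tool built in this lab). It uses only the standard library and a deliberately simple URL split; a real tool would more likely parse the URL with a dedicated crate.

```rust
use std::net::IpAddr;

/// Returns true when the URL's host is `localhost` or a loopback IP.
/// Hypothetical helper: a simple string split, not a full URL parser.
fn is_loopback_target(url: &str) -> bool {
    // Accept only http:// or https:// and drop the scheme.
    let rest = match url
        .strip_prefix("http://")
        .or_else(|| url.strip_prefix("https://"))
    {
        Some(r) => r,
        None => return false,
    };
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Reject userinfo ("user@host") so the real host cannot be disguised.
    if authority.contains('@') {
        return false;
    }
    // Separate the host from an optional port, handling bracketed IPv6 literals.
    let host = if let Some(stripped) = authority.strip_prefix('[') {
        match stripped.split_once(']') {
            Some((h, _)) => h,
            None => return false,
        }
    } else {
        authority.split(':').next().unwrap_or("")
    };
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    host.parse::<IpAddr>().map(|ip| ip.is_loopback()).unwrap_or(false)
}
```

Calling this before any request and exiting when it returns false keeps accidental scans of third-party hosts out of the lab. An explicit opt-in flag for authorized targets would be a natural extension.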
Step 1) Start a mock site with obvious fingerprints
mkdir -p mock_site
cat > mock_site/index.html <<'HTML'
<html>
<head>
<title>Demo CMS</title>
<meta name="generator" content="ExampleCMS 3.2">
<script src="/static/jquery-3.7.0.js"></script>
</head>
<body><h1>Hello from the demo site</h1></body>
</html>
HTML
cat > mock_site/server.py <<'PY'
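from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler
from pathlib import Path

class Handler(SimpleHTTPRequestHandler):
    def version_string(self):
        return "demo-web/1.2"  # replaces the default SimpleHTTP/Python Server banner

    def end_headers(self):
        self.send_header("X-Powered-By", "ExampleCMS 3.2")
        super().end_headers()

if __name__ == "__main__":
    # Serve mock_site/ whatever the current directory is, on loopback only
    handler = partial(Handler, directory=str(Path(__file__).resolve().parent))
    HTTPServer(("127.0.0.1", 8010), handler).serve_forever()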
PY
python3 mock_site/server.py > mock_site/server.log 2>&1 &
Common fix: If port 8010 is busy, change to a free port in both the server and scanner commands.
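Before editing ports, a quick way to see whether 8010 is actually taken is to try binding it. This is a small standard-library sketch. `port_is_free` and `first_free_port` are hypothetical helper names, and a successful bind only shows the port was free at that instant.

```rust
use std::net::TcpListener;

/// Returns true if a listener can be bound to 127.0.0.1:`port` right now.
/// The listener is dropped (and the port released) before returning.
fn port_is_free(port: u16) -> bool {
    TcpListener::bind(("127.0.0.1", port)).is_ok()
}

/// Returns the first free port in `start..=end`, if any.
fn first_free_port(start: u16, end: u16) -> Option<u16> {
    (start..=end).find(|&p| port_is_free(p))
}
```

For example, `first_free_port(8010, 8020)` would suggest a replacement port to use in both the server script and the scanner command.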
Step 2) Create the Rust project
cargo new rust-fp
cd rust-fp
Step 3) Add dependencies
Replace Cargo.toml with:
[package]
name = "rust-fp"
version = "0.1.0"
edition = "2021"
[dependencies]
tokio = { version = "1.40", features = ["full"] }
reqwest = { version = "0.12", features = ["rustls-tls"] }
scraper = "0.19"
clap = { version = "4.5", features = ["derive"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
regex = "1.10"
Step 4) Implement the complete fingerprinting CLI
Replace src/main.rs with comprehensive fingerprinting code:
use clap::Parser;
use chrono::Utc;
use reqwest::Client;
use scraper::{Html, Selector};
use serde::Serialize;
use std::time::Duration;
use std::collections::HashMap;
#[derive(Parser, Debug)]
#[command(author, version, about)]
struct Args {
    /// Target URL (e.g. http://127.0.0.1:8010/)
    #[arg(long)]
    url: String,
    /// Request timeout in seconds
    #[arg(long, default_value_t = 10)]
    timeout: u64,
    /// Output format (json, csv)
    #[arg(long, default_value = "json")]
    format: String,
    /// Output file (optional)
    #[arg(long)]
    output: Option<String>,
}
#[derive(Serialize, Debug, Clone)]
struct Fingerprint {
    url: String,
    status: Option<u16>,
    server: Option<String>,
    powered_by: Option<String>,
    meta_generator: Option<String>,
    title: Option<String>,
    timestamp: String,
    // Enhanced fields
    headers: HashMap<String, String>,
    cookies: Vec<String>,
    technologies: Vec<String>,
    javascript_frameworks: Vec<String>,
    cms: Option<String>,
    ssl_info: Option<SslInfo>,
    response_time_ms: u64,
}

#[derive(Serialize, Debug, Clone)]
struct SslInfo {
    version: Option<String>,
    cipher: Option<String>,
    issuer: Option<String>,
}
struct Fingerprinter {
    client: Client,
    timeout: Duration,
}

impl Fingerprinter {
    fn new(timeout: u64) -> anyhow::Result<Self> {
        let client = Client::builder()
            .user_agent("rust-fp/1.0 (+security@example.com)")
            .timeout(Duration::from_secs(timeout))
            .danger_accept_invalid_certs(false)
            .build()?;
        Ok(Self {
            client,
            timeout: Duration::from_secs(timeout),
        })
    }
    async fn fingerprint(&self, url: &str) -> anyhow::Result<Fingerprint> {
        let start = std::time::Instant::now();
        let resp = match self.client.get(url).send().await {
            Ok(r) => r,
            Err(e) => {
                // Report the failure on stderr, then return an empty record
                eprintln!("request to {url} failed: {e}");
                return Ok(Fingerprint {
                    url: url.to_string(),
                    status: None,
                    server: None,
                    powered_by: None,
                    meta_generator: None,
                    title: None,
                    timestamp: Utc::now().to_rfc3339(),
                    headers: HashMap::new(),
                    cookies: Vec::new(),
                    technologies: Vec::new(),
                    javascript_frameworks: Vec::new(),
                    cms: None,
                    ssl_info: None,
                    response_time_ms: start.elapsed().as_millis() as u64,
                });
            }
        };
        let status = Some(resp.status().as_u16());
        let headers = resp.headers().clone();
        let body = resp.text().await.unwrap_or_default();
        let response_time = start.elapsed().as_millis() as u64;
        // Parse HTML
        let doc = Html::parse_document(&body);
        // Copy headers into a map (names are lowercase; non-UTF-8 values are skipped)
        let mut header_map = HashMap::new();
        for (name, value) in headers.iter() {
            if let Ok(val_str) = value.to_str() {
                header_map.insert(name.to_string(), val_str.to_string());
            }
        }
        // Extract cookies
        let cookies: Vec<String> = headers
            .get_all(reqwest::header::SET_COOKIE)
            .iter()
            .filter_map(|v| v.to_str().ok().map(|s| s.to_string()))
            .collect();
        // Extract basic info
        let title_sel = Selector::parse("title").unwrap();
        let meta_gen_sel = Selector::parse("meta[name=\"generator\"]").unwrap();
        let title = doc
            .select(&title_sel)
            .next()
            .and_then(|n| {
                let text = n.text().collect::<String>().trim().to_string();
                if text.is_empty() { None } else { Some(text) }
            });
        let meta_generator = doc
            .select(&meta_gen_sel)
            .next()
            .and_then(|n| n.value().attr("content"))
            .map(|s| s.to_string());
        let server = header_map.get("server").cloned();
        let powered_by = header_map.get("x-powered-by").cloned();
        // Detect technologies
        let technologies = self.detect_technologies(&header_map, &body);
        let javascript_frameworks = self.detect_javascript_frameworks(&body);
        let cms = self.detect_cms(&header_map, &body, &meta_generator);
        // TLS details are not collected yet; this placeholder always returns None
        let ssl_info = self.extract_ssl_info(&header_map);
        Ok(Fingerprint {
            url: url.to_string(),
            status,
            server,
            powered_by,
            meta_generator,
            title,
            timestamp: Utc::now().to_rfc3339(),
            headers: header_map,
            cookies,
            technologies,
            javascript_frameworks,
            cms,
            ssl_info,
            response_time_ms: response_time,
        })
    }
    fn detect_technologies(&self, headers: &HashMap<String, String>, body: &str) -> Vec<String> {
        let mut techs = Vec::new();
        // Server detection
        if let Some(server) = headers.get("server") {
            let server_lower = server.to_lowercase();
            if server_lower.contains("nginx") {
                techs.push("Nginx".to_string());
            } else if server_lower.contains("apache") {
                techs.push("Apache".to_string());
            } else if server_lower.contains("iis") {
                techs.push("IIS".to_string());
            }
        }
        // PHP detection: X-Powered-By header or PHP markers in the body.
        // Checked together so "PHP" is reported at most once.
        let php_header = headers
            .get("x-powered-by")
            .map_or(false, |pb| pb.to_lowercase().contains("php"));
        if php_header || body.contains("<?php") || body.contains(".php") {
            techs.push("PHP".to_string());
        }
        // WordPress detection
        if body.contains("wp-content") || body.contains("wp-includes") {
            techs.push("WordPress".to_string());
        }
        // React detection (plain substring match, so false positives are possible)
        if body.contains("react") || body.contains("__REACT_DEVTOOLS") {
            techs.push("React".to_string());
        }
        techs
    }
    fn detect_javascript_frameworks(&self, body: &str) -> Vec<String> {
        let mut frameworks = Vec::new();
        // jQuery
        if body.contains("jquery") || body.contains("jQuery") {
            frameworks.push("jQuery".to_string());
        }
        // React
        if body.contains("react") || body.contains("ReactDOM") {
            frameworks.push("React".to_string());
        }
        // Vue.js
        if body.contains("vue") || body.contains("Vue.js") {
            frameworks.push("Vue.js".to_string());
        }
        // Angular
        if body.contains("angular") || body.contains("ng-") {
            frameworks.push("Angular".to_string());
        }
        frameworks
    }

    fn detect_cms(&self, headers: &HashMap<String, String>, body: &str, meta_gen: &Option<String>) -> Option<String> {
        // WordPress
        if body.contains("wp-content") || body.contains("wp-includes") ||
            body.contains("wordpress") {
            return Some("WordPress".to_string());
        }
        // Drupal
        if body.contains("drupal") || headers.contains_key("x-drupal-cache") {
            return Some("Drupal".to_string());
        }
        // Joomla
        if body.contains("joomla") || body.contains("Joomla!") {
            return Some("Joomla".to_string());
        }
        // Check meta generator
        if let Some(gen) = meta_gen {
            let gen_lower = gen.to_lowercase();
            if gen_lower.contains("wordpress") {
                return Some("WordPress".to_string());
            } else if gen_lower.contains("drupal") {
                return Some("Drupal".to_string());
            } else if gen_lower.contains("joomla") {
                return Some("Joomla".to_string());
            }
        }
        None
    }

    fn extract_ssl_info(&self, _headers: &HashMap<String, String>) -> Option<SslInfo> {
        // In production, this would inspect the TLS handshake.
        // For now it is a placeholder that returns None.
        None
    }
}
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let args = Args::parse();
    let fingerprinter = Fingerprinter::new(args.timeout)?;
    let fingerprint = fingerprinter.fingerprint(&args.url).await?;
    // Render the result in the requested format
    let rendered = match args.format.as_str() {
        "json" => serde_json::to_string_pretty(&fingerprint)?,
        "csv" => {
            // Simple CSV output (fields are not quoted or escaped)
            format!(
                "url,status,server,powered_by,cms,technologies\n{},{},{},{},{},{}",
                fingerprint.url,
                fingerprint.status.unwrap_or(0),
                fingerprint.server.as_deref().unwrap_or(""),
                fingerprint.powered_by.as_deref().unwrap_or(""),
                fingerprint.cms.as_deref().unwrap_or(""),
                fingerprint.technologies.join(";")
            )
        }
        other => anyhow::bail!("Unsupported format: {other}"),
    };
    // Write to the output file if one was given, otherwise print to stdout
    if let Some(output_file) = &args.output {
        std::fs::write(output_file, &rendered)?;
        println!("Results saved to {}", output_file);
    } else {
        println!("{}", rendered);
    }
    Ok(())
}
cargo run -- --url http://127.0.0.1:8010/
Common fixes:
- connection refused: ensure mock_site/server.py is running and the URL/port match.
- HTML parsing empty: confirm the page has a <title> and <meta name="generator">.
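The CSV branch in the listing joins fields with commas without quoting them, so a header value that contains a comma or a quote would shift columns. A minimal sketch of RFC 4180-style quoting follows. `csv_field` and `csv_row` are hypothetical helpers, not part of the listing.

```rust
/// Quotes a field when it contains a comma, quote, or line break,
/// doubling any embedded quotes (RFC 4180 style).
fn csv_field(value: &str) -> String {
    if value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Joins fields into one CSV line (no trailing newline).
fn csv_row(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|f| csv_field(f))
        .collect::<Vec<_>>()
        .join(",")
}
```

Routing each value through `csv_field` before joining would keep the output loadable in a spreadsheet even when a banner contains punctuation.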
Understanding Why Web Fingerprinting Works
Why Fingerprinting is Effective
Information Leakage: Web servers leak information through headers, HTML, and TLS configurations.
Uniqueness: Each technology stack has distinct fingerprints that can be identified.
Low-Noise Reconnaissance: Basic fingerprinting needs only a few ordinary requests, which blend into normal traffic and rarely trigger security alerts.
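Distinct fingerprints only help if they are matched precisely. Plain substring checks such as `body.contains("vue")` also match unrelated words like "revenue", so one way to tighten detection is to accept a signature only at token boundaries. The sketch below is illustrative: the `SIGNATURES` table and the function names are hypothetical, and the table is far smaller than a real rule set.

```rust
/// (technology name, lowercase needle). Illustrative entries only.
const SIGNATURES: &[(&str, &str)] = &[
    ("WordPress", "wp-content"),
    ("jQuery", "jquery"),
    ("Vue.js", "vue"),
    ("React", "react"),
];

/// True if `needle` occurs in `haystack` without an ASCII letter or digit
/// directly before or after it, so "vue" matches "vue.min.js" but not "revenue".
fn contains_token(haystack: &str, needle: &str) -> bool {
    let bytes = haystack.as_bytes();
    haystack.match_indices(needle).any(|(start, m)| {
        let end = start + m.len();
        let before_ok = start == 0 || !bytes[start - 1].is_ascii_alphanumeric();
        let after_ok = end == bytes.len() || !bytes[end].is_ascii_alphanumeric();
        before_ok && after_ok
    })
}

/// Returns the names of all signatures found in the lowercased body.
fn detect_by_signature(body: &str) -> Vec<&'static str> {
    let lower = body.to_lowercase();
    SIGNATURES
        .iter()
        .filter(|(_, needle)| contains_token(&lower, needle))
        .map(|(name, _)| *name)
        .collect()
}
```

Keeping signatures in a table rather than in chained `if` statements also makes it easier to review and extend the rules without touching the matching logic.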
Why Fingerprinting is Dangerous
Attack Surface: Fingerprinting reveals technology stacks, enabling targeted attacks.
Vulnerability Disclosure: Fingerprints can reveal outdated software versions with known vulnerabilities.
Reconnaissance: Attackers use fingerprinting for initial reconnaissance before attacks.
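To see why a version-bearing banner matters, consider how little work it takes to turn `demo-web/1.2` or `ExampleCMS 3.2` into a product name plus a comparable version. The helper names below (`parse_banner`, `is_older_than`) are hypothetical. The sketch assumes the banner ends in a dotted integer version, so banners with trailing text are simply not parsed.

```rust
/// Splits a banner like "demo-web/1.2" or "ExampleCMS 3.2" into
/// (product, numeric version parts). Returns None when no version is found.
fn parse_banner(banner: &str) -> Option<(String, Vec<u32>)> {
    let banner = banner.trim();
    // The separator is the last '/' or space in the banner.
    let idx = banner.rfind(|c: char| c == '/' || c == ' ')?;
    let (product, version) = (&banner[..idx], &banner[idx + 1..]);
    if product.is_empty() || version.is_empty() {
        return None;
    }
    // Every dotted component must be a plain integer.
    let parts = version
        .split('.')
        .map(|p| p.parse::<u32>().ok())
        .collect::<Option<Vec<u32>>>()?;
    Some((product.to_string(), parts))
}

/// True when `found` sorts before `minimum` (component-wise comparison).
fn is_older_than(found: &[u32], minimum: &[u32]) -> bool {
    found < minimum
}
```

An attacker would compare the parsed version against public advisories. A defender can run the same check against their own banners to see what an outsider learns.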
Step 5) Reduce your own fingerprint (defense)
Why Defense Matters
Attack Prevention: Reducing fingerprints makes it harder for attackers to identify vulnerable technologies.
Privacy: Minimizing information leakage protects your infrastructure details.
Security: Hiding technology stacks raises the cost of targeted attacks; it complements patching rather than replacing it.
Production-Ready Defense
- Strip or generalize Server/X-Powered-By headers at your reverse proxy
- Remove meta generator tags in production builds
- Serve assets from a CDN/WAF that normalizes headers and TLS configs
- Avoid exposing internal hostnames in TLS SANs; rotate certs when names change
Enhanced Defense Example:
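# Nginx configuration to reduce fingerprinting
server {
    # Hide the nginx version in the Server header and on error pages
    server_tokens off;
    # Remove X-Powered-By from FastCGI (e.g. PHP-FPM) responses
    fastcgi_hide_header X-Powered-By;
    # Normalize headers; the more_* directives require the third-party
    # headers-more module, which is not part of stock nginx
    more_set_headers "Server: nginx";
    more_clear_headers "X-Powered-By";
    # Drop version headers set by proxied upstreams
    proxy_hide_header X-Version;
}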
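After changing the proxy configuration, it helps to re-check which response headers still disclose product or version details. The sketch below audits a header map like the one the CLI collects. `audit_headers` and the `LEAKY_HEADERS` list are hypothetical and cover only a few commonly cited headers.

```rust
use std::collections::HashMap;

/// Headers that usually exist only to advertise a product or version.
/// Illustrative list, not exhaustive.
const LEAKY_HEADERS: &[&str] = &["x-powered-by", "x-aspnet-version", "x-generator"];

/// Returns one finding per header that still leaks stack details.
/// Expects lowercase header names, as in the CLI's header map.
fn audit_headers(headers: &HashMap<String, String>) -> Vec<String> {
    let mut findings = Vec::new();
    for name in LEAKY_HEADERS {
        if let Some(value) = headers.get(*name) {
            findings.push(format!("{name}: {value} (remove this header)"));
        }
    }
    // A generic Server value is acceptable; digits suggest a version number.
    if let Some(server) = headers.get("server") {
        if server.chars().any(|c| c.is_ascii_digit()) {
            findings.push(format!("server: {server} (hide the version)"));
        }
    }
    findings
}
```

Run against the mock site's headers, this would flag both `server` and `x-powered-by`. Behind a proxy that returns a bare `Server: nginx`, it would report nothing.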
Advanced Scenarios
Scenario 1: Advanced Fingerprinting Techniques
Challenge: Detecting sophisticated fingerprinting attempts
Solution:
- Monitor for fingerprinting patterns
- Alert on unusual header analysis
- Use honeypots to detect reconnaissance
- Implement rate limiting
- Log fingerprinting attempts
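The rate limiting item can be sketched as a fixed-window counter per client. This is a simplified illustration: the `RateLimiter` type and its parameters are hypothetical, and production setups usually enforce limits at the reverse proxy or WAF instead.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Fixed-window limiter: at most `max_requests` per `window` for each client key.
struct RateLimiter {
    max_requests: u32,
    window: Duration,
    // client key -> (window start, requests seen in that window)
    clients: HashMap<String, (Instant, u32)>,
}

impl RateLimiter {
    fn new(max_requests: u32, window: Duration) -> Self {
        Self { max_requests, window, clients: HashMap::new() }
    }

    /// Records a request at time `now` and returns whether it is allowed.
    /// `now` is passed in so the logic can be tested without sleeping.
    fn allow(&mut self, client: &str, now: Instant) -> bool {
        let entry = self.clients.entry(client.to_string()).or_insert((now, 0));
        // Start a fresh window once the old one has expired.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        entry.1 += 1;
        entry.1 <= self.max_requests
    }
}
```

A fixed window is the simplest variant. It allows short bursts at window boundaries, which a sliding window or token bucket would smooth out.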
Scenario 2: Multi-Layer Fingerprinting
Challenge: Defending against comprehensive fingerprinting
Solution:
- Normalize all headers
- Remove version information
- Use generic error messages
- Standardize TLS configurations
- Regular security audits
Scenario 3: Fingerprinting Detection
Challenge: Identifying when attackers are fingerprinting
Solution:
- Monitor for header analysis patterns
- Track unusual request patterns
- Alert on fingerprinting tools
- Correlate with other reconnaissance
- Implement detection rules
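One possible detection rule is to count how many distinct, well-known fingerprinting paths a single client requests. The path list and threshold below are illustrative assumptions, and `ProbeTracker` is a hypothetical type rather than a feature of any particular product.

```rust
use std::collections::{HashMap, HashSet};

/// Paths that CMS fingerprinting commonly requests. Illustrative, not exhaustive.
const PROBE_PATHS: &[&str] = &[
    "/wp-login.php",
    "/administrator/",
    "/CHANGELOG.txt",
    "/readme.html",
];

/// Tracks which probe paths each client has requested.
struct ProbeTracker {
    threshold: usize,
    seen: HashMap<String, HashSet<&'static str>>,
}

impl ProbeTracker {
    fn new(threshold: usize) -> Self {
        Self { threshold, seen: HashMap::new() }
    }

    /// Records a request and returns true once the client has touched
    /// at least `threshold` distinct probe paths.
    fn record(&mut self, client: &str, path: &str) -> bool {
        if let Some(probe) = PROBE_PATHS.iter().copied().find(|p| *p == path) {
            self.seen.entry(client.to_string()).or_default().insert(probe);
        }
        self.seen
            .get(client)
            .map_or(false, |paths| paths.len() >= self.threshold)
    }
}
```

Counting distinct paths rather than raw hits keeps a user who reloads one page from being flagged. A site that really runs WordPress would need `/wp-login.php` removed from its own list to avoid false positives.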
Troubleshooting Guide
Problem: Fingerprinting still reveals information
Diagnosis:
- Review all headers
- Check HTML source
- Analyze TLS configuration
- Test with fingerprinting tools
Solutions:
- Remove all version headers
- Strip HTML comments
- Normalize TLS configs
- Use generic error pages
- Regular security reviews
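The "strip HTML comments" step can be illustrated with a small filter that drops `<!-- ... -->` blocks from a page before it is served. This is a naive sketch: `strip_html_comments` is a hypothetical helper that does not understand `<script>` or `<style>` contents, and a build-time HTML minifier is the more usual place to do this.

```rust
/// Removes every `<!-- ... -->` block from the input.
/// An unterminated comment drops the rest of the input.
fn strip_html_comments(html: &str) -> String {
    let mut out = String::with_capacity(html.len());
    let mut rest = html;
    while let Some(start) = rest.find("<!--") {
        // Keep everything before the comment opener.
        out.push_str(&rest[..start]);
        match rest[start + 4..].find("-->") {
            // Skip past the opener (4 bytes), the comment body, and the closer (3 bytes).
            Some(end) => rest = &rest[start + 4 + end + 3..],
            None => return out,
        }
    }
    out.push_str(rest);
    out
}
```

Build tools and templates often leave version strings in comments, so running a filter like this over generated pages and diffing the result is a quick way to find them.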
Problem: False positives in detection
Diagnosis:
- Review detection rules
- Analyze false positive patterns
- Check legitimate use cases
Solutions:
- Fine-tune detection rules
- Whitelist legitimate tools
- Use multiple detection methods
- Regular rule reviews
- Context-aware detection
Problem: Performance impact of defense
Diagnosis:
- Profile header processing
- Check response times
- Review resource usage
Solutions:
- Optimize header processing
- Use caching
- Minimize processing overhead
- Profile and optimize
- Consider CDN solutions
Code Review Checklist for Web Fingerprinting
Defense
- Server headers removed/generalized
- X-Powered-By headers removed
- Meta generator tags removed
- TLS configuration normalized
- Version information hidden
Detection
- Fingerprinting attempts monitored
- Alerting configured
- Logging implemented
- Rate limiting enabled
- Honeypots deployed
Monitoring
- Fingerprinting patterns tracked
- Reconnaissance alerts
- Regular security audits
- Threat intelligence integration
- Automated response
Quick Reference
- Collect headers + HTML meta to guess stacks; keep timeouts tight.
- Tag your UA and keep logs—transparency reduces abuse flags.
- Defend by minimizing banners, normalizing responses, and hiding generator tags.
Cleanup
cd ..
pkill -f "mock_site/server.py" || true
rm -rf rust-fp mock_site