How to Detect Bots on a Website: A Developer's Guide
By some industry estimates, bot traffic now accounts for over 47% of all web traffic, and not all of it is friendly. Scrapers steal your content and credential stuffers attack your login forms, so knowing how to detect bots is essential for any modern web application.
In this guide, we'll break down the most effective bot detection techniques and show you how to implement them in your stack.
What Are Bots?
Bots are automated programs that interact with websites and APIs. They range from helpful search engine crawlers to malicious scrapers and fraud bots.
Good bots include:
- Search engine crawlers (Googlebot, Bingbot)
- Monitoring services (uptime checkers, performance tools)
- Social media preview generators
- Authorized API clients
Bad bots include:
- Content scrapers stealing your data
- Credential stuffing bots testing stolen passwords
- Inventory hoarding bots (sneaker bots, ticket scalpers)
- Click fraud bots inflating ad metrics
- Spam bots flooding forms and comments
Detection Technique #1: IP Intelligence
The fastest way to identify bots is by analyzing the source IP address. Most bots operate from:
- Datacenter IPs: Ranges owned by AWS, Google Cloud, DigitalOcean, and similar hosts, which rarely carry real user traffic
- VPNs and Proxies: Used to hide bot origin or bypass rate limits
- Tor exit nodes: Anonymous networks often abused by attackers
- Known bot networks: IPs previously flagged for malicious activity
IP intelligence APIs like IPASIS provide real-time risk scores based on these signals, returning results in under 20ms.
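As a rough illustration of the datacenter signal above, the sketch below checks an address against a local list of CIDR ranges using Node's built-in `net.BlockList`. The ranges are placeholders taken from the RFC 5737 documentation blocks, and the names `HYPOTHETICAL_DATACENTER_RANGES` and `isDatacenterIp` are invented for this example. A real list would come from provider-published ranges or a commercial feed, and it only handles IPv4.

```javascript
const net = require('node:net');

// Placeholder ranges (RFC 5737 documentation blocks), not real datacenter data.
const HYPOTHETICAL_DATACENTER_RANGES = [
  { network: '192.0.2.0', prefix: 24 },
  { network: '198.51.100.0', prefix: 24 },
];

// BlockList is a general-purpose IP matcher; here it holds the datacenter ranges.
const datacenterList = new net.BlockList();
for (const { network, prefix } of HYPOTHETICAL_DATACENTER_RANGES) {
  datacenterList.addSubnet(network, prefix, 'ipv4');
}

// Returns true when the IPv4 address falls inside one of the listed ranges.
function isDatacenterIp(ip) {
  return net.isIPv4(ip) && datacenterList.check(ip, 'ipv4');
}

console.log(isDatacenterIp('192.0.2.15'));  // true
console.log(isDatacenterIp('203.0.113.9')); // false
```

A local list like this is cheap to query but goes stale quickly, which is one reason to prefer a maintained IP intelligence service for production traffic.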
Detection Technique #2: Behavioral Analysis
Human users behave differently than bots. Key behavioral signals include:
- Mouse movements: Bots often move in straight lines or don't move at all
- Typing speed: Automated form fills happen instantly
- Navigation patterns: Bots access pages in unusual sequences
- Time on page: Bots scrape content in milliseconds
- Session depth: Real users browse multiple pages
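Client-side JavaScript can record these signals and send them, or an anomaly score derived from them, to your backend. Validate them server-side rather than trusting the client's verdict, since a bot controls everything the browser reports.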
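To make the typing-speed signal concrete, here is a minimal sketch of how a backend might score keystroke timestamps collected by the page. The function name `analyzeTypingCadence` and both thresholds are assumptions chosen for illustration, not tuned values; real traffic would be needed to calibrate them.

```javascript
// Illustrative thresholds only; calibrate against your own traffic.
const MIN_HUMAN_MEAN_INTERVAL_MS = 40; // faster than this looks scripted
const MIN_HUMAN_VARIATION = 0.15;      // coefficient of variation (stddev / mean)

// keyTimestampsMs: ascending keydown timestamps in milliseconds.
function analyzeTypingCadence(keyTimestampsMs) {
  if (keyTimestampsMs.length < 3) {
    return { suspicious: false, reason: 'not enough keystrokes' };
  }

  // Gaps between consecutive keystrokes.
  const intervals = [];
  for (let i = 1; i < keyTimestampsMs.length; i++) {
    intervals.push(keyTimestampsMs[i] - keyTimestampsMs[i - 1]);
  }

  const mean = intervals.reduce((sum, x) => sum + x, 0) / intervals.length;
  const variance =
    intervals.reduce((sum, x) => sum + (x - mean) ** 2, 0) / intervals.length;
  const variation = mean === 0 ? 0 : Math.sqrt(variance) / mean;

  if (mean < MIN_HUMAN_MEAN_INTERVAL_MS) {
    return { suspicious: true, reason: 'typing too fast' };
  }
  if (variation < MIN_HUMAN_VARIATION) {
    return { suspicious: true, reason: 'typing too uniform' };
  }
  return { suspicious: false, reason: 'looks human' };
}

console.log(analyzeTypingCadence([0, 5, 10, 15, 20]).reason);       // typing too fast
console.log(analyzeTypingCadence([0, 100, 200, 300, 400]).reason);  // typing too uniform
```

Treat the result as one input to a combined score. Password managers and autofill also produce instant input, so this signal alone should not block anyone.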
Detection Technique #3: Browser Fingerprinting
Bots often run in headless browsers or with spoofed user agents. Fingerprinting detects inconsistencies:
- User-Agent mismatches: Claimed browser doesn't match actual capabilities
- Missing JavaScript features: Headless browsers lack certain APIs
- Canvas/WebGL fingerprints: Unique rendering patterns identify automation tools
- Font enumeration: Bots often have unusual installed fonts
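The consistency checks above can be sketched as a server-side function that inspects a report posted by the page. The report shape and the flag names are hypothetical. `navigator.webdriver` and the `HeadlessChrome` User-Agent token are real browser behaviors, but determined bots can hide both, so treat every flag as a hint rather than proof.

```javascript
// Input: a plain object a client-side script might collect and POST, e.g.
//   { userAgent: navigator.userAgent,
//     webdriver: navigator.webdriver,
//     hasChromeObject: typeof window.chrome !== 'undefined',
//     languages: navigator.languages }
// This shape is an assumption made for the example.
function findFingerprintFlags(report) {
  const flags = [];
  const ua = report.userAgent || '';

  // WebDriver-controlled browsers set navigator.webdriver to true.
  if (report.webdriver === true) flags.push('webdriver');

  // Headless Chrome announces itself unless the User-Agent is overridden.
  if (ua.includes('HeadlessChrome')) flags.push('headless-user-agent');

  // A User-Agent claiming Chrome without window.chrome is a capability mismatch.
  if (/Chrome\//.test(ua) && !report.hasChromeObject) {
    flags.push('ua-capability-mismatch');
  }

  // Real browsers normally report at least one preferred language.
  if (!Array.isArray(report.languages) || report.languages.length === 0) {
    flags.push('no-languages');
  }

  return flags;
}
```

An empty array means no inconsistency was found, not that the visitor is human.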
Detection Technique #4: CAPTCHA Challenges
When automated detection isn't enough, CAPTCHA provides a final barrier:
- reCAPTCHA v3: Invisible scoring (good UX, but can be bypassed)
- hCaptcha: Privacy-focused, more resistant to automation
- Cloudflare Turnstile: Low-friction, enterprise-grade
Best practice: Only trigger CAPTCHA for high-risk requests (low trust score, suspicious behavior) to avoid annoying legitimate users.
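Whichever CAPTCHA you choose, the token the widget produces has to be verified on the server. The sketch below does this for Cloudflare Turnstile using its siteverify endpoint and Node 18+'s built-in `fetch`. The function name `verifyTurnstileToken` and the injectable `fetchImpl` parameter (there so the function can be tested without network access) are choices made for this example.

```javascript
const TURNSTILE_VERIFY_URL =
  'https://challenges.cloudflare.com/turnstile/v0/siteverify';

// token: value the Turnstile widget added to the submitted form.
// secretKey: your Turnstile secret key; remoteIp is optional.
async function verifyTurnstileToken(token, secretKey, remoteIp, fetchImpl = fetch) {
  const body = new URLSearchParams({ secret: secretKey, response: token });
  if (remoteIp) body.set('remoteip', remoteIp);

  // A URLSearchParams body is sent as application/x-www-form-urlencoded.
  const res = await fetchImpl(TURNSTILE_VERIFY_URL, { method: 'POST', body });
  if (!res.ok) {
    return { success: false, errorCodes: [`http-${res.status}`] };
  }

  const data = await res.json();
  return {
    success: data.success === true,
    errorCodes: data['error-codes'] ?? [],
  };
}
```

In a route handler you would call this only for requests already marked as high risk, and continue with the normal flow when `success` is true.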
How IP Intelligence Powers Bot Detection
IP analysis is the fastest signal you can check—it happens server-side, before any JavaScript loads or forms are submitted.
Here's why it's effective:
- Most residential users browse without a VPN, so VPN or proxy use is a useful signal (though not proof of automation)
- Datacenter IPs are a red flag for automated traffic
- Known bot IPs can be blocked instantly
- No client-side code required, so disabling JavaScript does not bypass it
Implementation Example: Bot Detection with IPASIS
Here's how to integrate real-time bot detection into your API or signup flow:
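// Node.js / Express example
import express from 'express';
import fetch from 'node-fetch';

const app = express();

// Look up an IP address with IPASIS and normalize the response
async function checkForBot(ipAddress) {
  const response = await fetch(`https://api.ipasis.com/check/${ipAddress}`, {
    headers: {
      'Authorization': `Bearer ${process.env.IPASIS_API_KEY}`
    }
  });
  if (!response.ok) {
    throw new Error(`IPASIS lookup failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  // IPASIS returns a trust score: 0 (high risk) to 100 (trusted)
  return {
    isBot: data.trust_score < 50,
    riskLevel: data.risk_level, // 'low', 'medium', 'high'
    signals: {
      isVPN: data.vpn,
      isProxy: data.proxy,
      isTor: data.tor,
      isDatacenter: data.datacenter,
      isKnownBot: data.bot
    }
  };
}

// Use in your route handler
app.post('/api/signup', async (req, res, next) => {
  try {
    // X-Forwarded-For can hold a comma-separated chain of addresses; the first
    // entry is the original client. Only rely on it behind a proxy you control,
    // because clients can set this header themselves.
    const forwardedFor = req.headers['x-forwarded-for'];
    const clientIP = forwardedFor
      ? forwardedFor.split(',')[0].trim()
      : req.socket.remoteAddress;
    const botCheck = await checkForBot(clientIP);
    if (botCheck.isBot) {
      // Option 1: Block immediately
      return res.status(403).json({ error: 'Signup blocked' });
      // Option 2 (instead of blocking): Require additional verification
      // return res.json({ requiresCaptcha: true });
    }
    // Continue with normal signup flow
    // ...
  } catch (err) {
    next(err); // hand lookup failures to Express error handling
  }
});
Layered Defense: Combining Techniques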
The most effective bot detection uses multiple signals:
- First layer (server-side): Check IP reputation—instant decision
- Second layer (client-side): JavaScript fingerprinting and behavior tracking
- Third layer (on-demand): CAPTCHA for high-risk actions
This approach minimizes false positives while catching sophisticated bots.
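One way to picture the three layers is as a single decision function that consults the cheapest signal first. This is only a sketch: the `riskLevel` values mirror the `'low'`, `'medium'`, `'high'` field from the implementation example, while `clientFlags`, `captchaPassed`, and the function name `decide` are hypothetical.

```javascript
// riskLevel: 'low' | 'medium' | 'high' from the server-side IP check.
// clientFlags: anomalies reported by client-side fingerprinting or behavior checks.
// captchaPassed: true once the visitor has solved a challenge.
function decide({ riskLevel, clientFlags = [], captchaPassed = false }) {
  // Layer 1: IP reputation is available immediately, so check it first.
  if (riskLevel === 'high') return 'block';

  // Layer 2: client-side signals can raise suspicion on an otherwise clean IP.
  const needsChallenge = riskLevel === 'medium' || clientFlags.length > 0;
  if (!needsChallenge) return 'allow';

  // Layer 3: a CAPTCHA resolves the uncertain cases.
  return captchaPassed ? 'allow' : 'challenge';
}

console.log(decide({ riskLevel: 'low' }));                              // allow
console.log(decide({ riskLevel: 'low', clientFlags: ['webdriver'] }));  // challenge
console.log(decide({ riskLevel: 'high', captchaPassed: true }));        // block
```

Note that a solved CAPTCHA does not rescue a high-risk IP in this sketch. Whether it should is a policy choice for your application.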
Rate Limiting as a Bot Deterrent
Even if you can't identify every bot, rate limiting prevents abuse:
- Per IP: Max 100 requests per hour
- Per user session: Max 5 login attempts per 15 minutes
- Per API key: Enforce plan limits
Combine with IP intelligence: Apply stricter limits to datacenter IPs and looser limits to trusted residential IPs.
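The per-IP limit and the stricter treatment of datacenter addresses can be sketched with a fixed-window counter. The residential limit of 100 per hour comes from the list above. The datacenter limit of 20 is an arbitrary value for illustration, as are the names `LIMITS` and `allowRequest`.

```javascript
const WINDOW_MS = 60 * 60 * 1000; // one hour
const LIMITS = {
  residential: 100, // per-IP limit from the list above
  datacenter: 20,   // assumed stricter limit, not a recommendation
};

// ip -> { windowStart, count }. In-memory only, so it resets on restart
// and is not shared between processes.
const counters = new Map();

// Returns true if the request is within the limit for this IP's class.
function allowRequest(ip, ipClass, now = Date.now()) {
  const limit = LIMITS[ipClass] ?? LIMITS.residential;
  const entry = counters.get(ip);

  // Start a new window on first sight or once the old window has expired.
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    counters.set(ip, { windowStart: now, count: 1 });
    return true;
  }

  if (entry.count >= limit) return false;
  entry.count += 1;
  return true;
}
```

A multi-server deployment would need a shared store such as Redis in place of the `Map`, and a sliding window if bursts at window boundaries matter.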
Best Practices for Bot Detection
- Start with passive monitoring: Log bot signals before blocking to tune your thresholds
- Whitelist known good bots: Allow Googlebot, monitoring services, etc., but verify them (for example by reverse DNS) because User-Agent strings are easy to spoof
- Monitor false positives: Track user reports of "blocked by mistake"
- Update detection rules: Bot techniques evolve—review signals monthly
- Combine signals: No single technique is perfect; layer them
- Optimize for speed: IP checks should complete in <50ms to avoid UX lag
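Whitelisting good bots safely means confirming that a request claiming to be Googlebot really comes from Google. A common approach is a reverse DNS lookup followed by a forward lookup that must return the same IP. The sketch below uses Node's `dns.promises` API. The suffix list is an assumption to check against Google's current documentation, the function name is invented, and `resolve4` limits it to IPv4.

```javascript
const dns = require('node:dns').promises;

// Assumed hostname suffixes for Google crawlers; confirm against Google's docs.
const GOOGLE_CRAWLER_SUFFIXES = ['.googlebot.com', '.google.com'];

// resolver is injectable so the function can be tested without network access.
async function isVerifiedGooglebot(ip, resolver = dns) {
  try {
    // Step 1: reverse lookup, IP -> hostnames.
    const hostnames = await resolver.reverse(ip);
    for (const hostname of hostnames) {
      const isGoogleHost = GOOGLE_CRAWLER_SUFFIXES.some((suffix) =>
        hostname.endsWith(suffix)
      );
      if (!isGoogleHost) continue;

      // Step 2: forward-confirm, hostname -> IPs must include the original IP.
      const addresses = await resolver.resolve4(hostname);
      if (addresses.includes(ip)) return true;
    }
    return false;
  } catch {
    // Lookup failures are treated as "not verified".
    return false;
  }
}
```

DNS lookups are slow compared with the latency budget above, so cache the result per IP rather than resolving on every request.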
When to Block vs. Challenge
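Not all bot traffic should be blocked outright. Use this decision matrix, where the number is a risk score from 0 (safe) to 100 (risky). Note that this scale runs in the opposite direction to the trust score in the implementation example: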
| Risk Level | Action | Use Case |
|---|---|---|
| Low (0-39) | Allow | Normal residential IP, no red flags |
| Medium (40-79) | Challenge (CAPTCHA) | VPN user, unusual behavior |
| High (80-100) | Block | Datacenter IP, known bot network |
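The matrix translates directly into a small lookup function. The conversion helper assumes that a risk score is simply 100 minus a trust score. That is an assumption made for this sketch, not documented IPASIS behavior, and both function names are invented.

```javascript
// Maps a 0-100 risk score to the action in the decision matrix.
function actionForRiskScore(riskScore) {
  if (!Number.isFinite(riskScore) || riskScore < 0 || riskScore > 100) {
    throw new RangeError(`risk score out of range: ${riskScore}`);
  }
  if (riskScore >= 80) return 'block';     // High (80-100)
  if (riskScore >= 40) return 'challenge'; // Medium (40-79)
  return 'allow';                          // Low (0-39)
}

// Assumption: risk is the inverse of a 0-100 trust score.
function riskFromTrust(trustScore) {
  return 100 - trustScore;
}

console.log(actionForRiskScore(12));                // allow
console.log(actionForRiskScore(55));                // challenge
console.log(actionForRiskScore(riskFromTrust(10))); // block
```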
Monitoring Bot Traffic
Set up dashboards to track:
- Percentage of traffic flagged as bots
- Most common bot sources (IPs, ASNs, countries)
- Pages/endpoints most targeted by bots
- Conversion rate differences between trusted and challenged users
This data helps you tune detection thresholds and identify new attack patterns.
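Before wiring up a full dashboard, the first three metrics can be computed from request logs with a short aggregation. The event shape (`asn`, `path`, `flagged`) and the function name `summarizeBotTraffic` are hypothetical. Adapt them to whatever your logging pipeline records.

```javascript
// events: [{ asn: 'AS64500', path: '/login', flagged: true }, ...]
function summarizeBotTraffic(events, topN = 5) {
  const flagged = events.filter((event) => event.flagged);

  // Count flagged events by a field and return [value, count] pairs, highest first.
  const countBy = (key) => {
    const counts = new Map();
    for (const event of flagged) {
      counts.set(event[key], (counts.get(event[key]) ?? 0) + 1);
    }
    return [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, topN);
  };

  return {
    flaggedPercent:
      events.length === 0 ? 0 : (flagged.length / events.length) * 100,
    topAsns: countBy('asn'),
    topPaths: countBy('path'),
  };
}
```

Running this over a day of logs while still in passive monitoring mode gives a baseline to compare against once blocking is enabled.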
The Cost of Ignoring Bots
Undetected bots cause real damage:
- Skewed analytics: Bot traffic inflates metrics, hiding real user behavior
- Wasted resources: Server costs scale with bot requests
- Degraded UX: Bots can slow down your site for real users
- Security breaches: Credential stuffing leads to account takeovers
- Lost revenue: Inventory hoarding bots prevent real customers from buying
Investing in bot detection pays for itself through improved performance, security, and data quality.