How to Detect Bots on a Website: A Developer's Guide

March 9, 2026 · 8 min read

Bot traffic now accounts for over 47% of all web traffic—and not all of it is friendly. From scraping bots stealing your content to credential stuffers attacking your login forms, understanding how to detect bots is essential for any modern web application.

In this guide, we'll break down the most effective bot detection techniques and show you how to implement them in your stack.

What Are Bots?

Bots are automated programs that interact with websites and APIs. They range from helpful search engine crawlers to malicious scrapers and fraud bots.

Good bots include:

  • Search engine crawlers (Googlebot, Bingbot)
  • Monitoring services (uptime checkers, performance tools)
  • Social media preview generators
  • Authorized API clients

Bad bots include:

  • Content scrapers stealing your data
  • Credential stuffing bots testing stolen passwords
  • Inventory hoarding bots (sneaker bots, ticket scalpers)
  • Click fraud bots inflating ad metrics
  • Spam bots flooding forms and comments

Detection Technique #1: IP Intelligence

The fastest way to identify bots is by analyzing the source IP address. Most bots operate from:

  • Datacenter IPs: AWS, Google Cloud, DigitalOcean—uncommon for real users
  • VPNs and Proxies: Used to hide bot origin or bypass rate limits
  • Tor exit nodes: Anonymous networks often abused by attackers
  • Known bot networks: IPs previously flagged for malicious activity

IP intelligence APIs like IPASIS provide real-time risk scores based on these signals, returning results in under 20ms.
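To make the datacenter signal concrete, here is a minimal sketch that checks whether an IPv4 address falls inside a list of CIDR ranges. The ranges are reserved documentation blocks standing in for a real datacenter list, and `ipv4ToInt`, `inCidr`, and `isDatacenterIp` are hypothetical helper names. A production system would rely on a maintained data source rather than a hard-coded list.

```javascript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
// Returns null for anything that is not a valid IPv4 address.
function ipv4ToInt(ip) {
  if (typeof ip !== 'string') return null;
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True if `ip` lies inside `cidr` (for example "192.0.2.0/24").
function inCidr(ip, cidr) {
  const [base, bitsText] = String(cidr).split('/');
  if (!/^\d{1,2}$/.test(bitsText || '')) return false;
  const bits = Number(bitsText);
  if (bits > 32) return false;
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  const blockSize = 2 ** (32 - bits); // number of addresses in the range
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// Placeholder ranges (RFC 5737 documentation blocks), NOT real datacenter data.
const EXAMPLE_DATACENTER_RANGES = ['192.0.2.0/24', '198.51.100.0/24'];

function isDatacenterIp(ip, ranges = EXAMPLE_DATACENTER_RANGES) {
  return ranges.some((range) => inCidr(ip, range));
}
```

Arithmetic with `2 ** n` is used instead of bitwise operators because JavaScript bitwise operations work on signed 32-bit integers, which complicates comparisons for addresses above 127.255.255.255.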

Detection Technique #2: Behavioral Analysis

Human users behave differently than bots. Key behavioral signals include:

  • Mouse movements: Bots often move in straight lines or don't move at all
  • Typing speed: Automated form fills happen instantly
  • Navigation patterns: Bots access pages in unusual sequences
  • Time on page: Bots scrape content in milliseconds
  • Session depth: Real users browse multiple pages

Client-side JavaScript can collect these signals and send them, or an anomaly score derived from them, to your backend for validation.
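As an illustration of two of the signals above, the following sketch scores how straight a pointer path is and flags near-instant form fills. The function names, the input shape, and both thresholds (0.99 straightness, 500 ms fill time) are assumptions chosen for the example, not tuned values.

```javascript
// Ratio of straight-line distance to total path length for a list of
// {x, y} pointer samples. A value of 1 means a perfectly straight path.
// Returns null when there is no movement to measure.
function pathStraightness(points) {
  if (points.length < 2) return null;
  let pathLength = 0;
  for (let i = 1; i < points.length; i++) {
    pathLength += Math.hypot(
      points[i].x - points[i - 1].x,
      points[i].y - points[i - 1].y
    );
  }
  if (pathLength === 0) return null; // pointer never moved
  const first = points[0];
  const last = points[points.length - 1];
  return Math.hypot(last.x - first.x, last.y - first.y) / pathLength;
}

// Coarse behavioral flags for one form submission.
// Thresholds are illustrative assumptions; tune them on your own traffic.
function behaviorFlags({ pointerPoints, formFillMs }) {
  const straightness = pathStraightness(pointerPoints);
  return {
    noPointerMovement: straightness === null,
    straightLinePointer: straightness !== null && straightness > 0.99,
    instantFormFill: formFillMs < 500,
  };
}
```

Treat these flags as inputs to a combined score rather than as verdicts: touch devices and keyboard-only users, for instance, produce no pointer movement at all.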

Detection Technique #3: Browser Fingerprinting

Bots often run in headless browsers or with spoofed user agents. Fingerprinting detects inconsistencies:

  • User-Agent mismatches: Claimed browser doesn't match actual capabilities
  • Missing JavaScript features: Headless browsers lack certain APIs
  • Canvas/WebGL fingerprints: Unique rendering patterns identify automation tools
  • Font enumeration: Bots often have unusual installed fonts
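A few of these inconsistencies can be checked with very little code. The sketch below inspects a snapshot of browser properties and returns a list of automation hints. `automationHints` and the hint labels are hypothetical names; the properties it reads (`navigator.webdriver`, `navigator.userAgent`, `navigator.languages`, `navigator.platform`) are standard browser APIs, though determined bots can spoof all of them.

```javascript
// Inspect a snapshot of navigator properties and list automation hints.
// In the browser, build the snapshot like this and send it to your backend:
//   { userAgent: navigator.userAgent, webdriver: navigator.webdriver,
//     languages: navigator.languages, platform: navigator.platform }
function automationHints(snapshot) {
  const hints = [];
  const ua = snapshot.userAgent || '';

  // WebDriver-controlled browsers are expected to set navigator.webdriver.
  if (snapshot.webdriver === true) hints.push('webdriver-flag');

  // Headless Chrome has historically announced itself in the user agent.
  if (ua.includes('HeadlessChrome')) hints.push('headless-user-agent');

  // Ordinary browsers report at least one preferred language.
  if (!Array.isArray(snapshot.languages) || snapshot.languages.length === 0) {
    hints.push('no-languages');
  }

  // User agent claims Windows, but the platform string says otherwise.
  if (ua.includes('Windows') && snapshot.platform && !snapshot.platform.startsWith('Win')) {
    hints.push('platform-mismatch');
  }

  return hints;
}
```

An empty result does not prove the client is human; it only means none of these simple checks fired.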

Detection Technique #4: CAPTCHA Challenges

When automated detection isn't enough, CAPTCHA provides a final barrier:

  • reCAPTCHA v3: Invisible scoring (good UX, but can be bypassed)
  • hCaptcha: Privacy-focused, more resistant to automation
  • Cloudflare Turnstile: Low-friction, enterprise-grade

Best practice: Only trigger CAPTCHA for high-risk requests (low trust score, suspicious behavior) to avoid annoying legitimate users.
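Whichever provider you choose, the token returned by the widget has to be verified on your server. Here is a minimal sketch for Cloudflare Turnstile, posting to its siteverify endpoint. The function name `verifyTurnstileToken` and the injectable `fetchImpl` parameter are assumptions made for this example (the latter so the function can be tested without network access).

```javascript
// Verify a Turnstile token server-side.
// `fetchImpl` defaults to the global fetch available in Node.js 18+.
async function verifyTurnstileToken(token, secret, remoteIp, fetchImpl = fetch) {
  const body = new URLSearchParams({ secret, response: token });
  if (remoteIp) body.set('remoteip', remoteIp); // optional field

  const res = await fetchImpl(
    'https://challenges.cloudflare.com/turnstile/v0/siteverify',
    { method: 'POST', body }
  );

  // Treat a non-2xx response as a failed challenge.
  if (!res.ok) return false;

  const data = await res.json();
  return data.success === true;
}
```

In a signup handler, you would call this only for requests already flagged as high risk, and continue the normal flow when it resolves to `true`.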

How IP Intelligence Powers Bot Detection

IP analysis is the fastest signal you can check—it happens server-side, before any JavaScript loads or forms are submitted.

Here's why it's effective:

  • Most residential users browse without a VPN, so VPN and proxy traffic warrants closer scrutiny
  • Datacenter IPs are a red flag for automated traffic
  • Known bot IPs can be blocked instantly
  • No client-side code required (can't be bypassed with JavaScript disabled)

Implementation Example: Bot Detection with IPASIS

Here's how to integrate real-time bot detection into your API or signup flow:

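// Node.js / Express example
// (On Node.js 18+ you can drop the node-fetch import and use the built-in fetch.)
import express from 'express';
import fetch from 'node-fetch';

const app = express();

async function checkForBot(ipAddress) {
  // Encode the address: it may come from a client-supplied header
  const response = await fetch(`https://api.ipasis.com/check/${encodeURIComponent(ipAddress)}`, {
    headers: {
      'Authorization': `Bearer ${process.env.IPASIS_API_KEY}`
    }
  });

  // Surface HTTP errors instead of parsing an error body as a result
  if (!response.ok) {
    throw new Error(`IPASIS request failed with status ${response.status}`);
  }

  const data = await response.json();

  // IPASIS returns a trust score: 0 (high risk) to 100 (trusted)
  return {
    isBot: data.trust_score < 50,
    riskLevel: data.risk_level, // 'low', 'medium', 'high'
    signals: {
      isVPN: data.vpn,
      isProxy: data.proxy,
      isTor: data.tor,
      isDatacenter: data.datacenter,
      isKnownBot: data.bot
    }
  };
}

// Use in your middleware
app.post('/api/signup', async (req, res) => {
  // X-Forwarded-For can hold a comma-separated chain; the first entry is the
  // original client. Only trust this header when your own proxy sets it.
  const forwardedFor = String(req.headers['x-forwarded-for'] || '');
  const clientIP = forwardedFor.split(',')[0].trim() || req.socket.remoteAddress;

  let botCheck;
  try {
    botCheck = await checkForBot(clientIP);
  } catch (err) {
    // Design choice (an assumption; adjust to your risk tolerance):
    // fail open so an API outage doesn't block legitimate signups
    console.error('Bot check failed:', err);
    botCheck = { isBot: false };
  }

  if (botCheck.isBot) {
    // Option 1: Block immediately
    return res.status(403).json({ error: 'Signup blocked' });

    // Option 2 (instead of Option 1): Require additional verification
    // return res.json({ requiresCaptcha: true });
  }

  // Continue with normal signup flow
  // ...
});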

Layered Defense: Combining Techniques

The most effective bot detection uses multiple signals:

  1. First layer (server-side): Check IP reputation—instant decision
  2. Second layer (client-side): JavaScript fingerprinting and behavior tracking
  3. Third layer (on-demand): CAPTCHA for high-risk actions

This approach minimizes false positives while catching sophisticated bots.
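The ordering of the three layers can be sketched as a single decision function that stops at the cheapest layer able to decide. Everything here is illustrative: `layeredDecision`, the input field names, and the thresholds (trust score below 20 or 50, anomaly score above 0.7) are assumptions, with the trust score on the 0 (high risk) to 100 (trusted) scale used in the implementation example above.

```javascript
// Combine the three layers, stopping as early as possible.
//   ipTrustScore:       0 (high risk) to 100 (trusted), from the server-side IP check
//   clientAnomalyScore: 0 (normal) to 1 (anomalous), from client-side signals
//   captchaPassed:      true, false, or undefined if no challenge has been shown yet
function layeredDecision({ ipTrustScore, clientAnomalyScore, captchaPassed }) {
  // Layer 1: IP reputation alone decides the clearly bad cases.
  if (ipTrustScore < 20) return 'block';

  // Layer 2: client-side signals refine the middle ground.
  const suspicious = ipTrustScore < 50 || clientAnomalyScore > 0.7;
  if (!suspicious) return 'allow';

  // Layer 3: a CAPTCHA settles whatever is still ambiguous.
  if (captchaPassed === undefined) return 'challenge';
  return captchaPassed ? 'allow' : 'block';
}
```

Because most legitimate traffic exits at the second check, only a small share of users should ever see a challenge.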

Rate Limiting as a Bot Deterrent

Even if you can't identify every bot, rate limiting prevents abuse:

  • Per IP: Max 100 requests per hour
  • Per user session: Max 5 login attempts per 15 minutes
  • Per API key: Enforce plan limits

Combine with IP intelligence: Apply stricter limits to datacenter IPs and looser limits to trusted residential IPs.
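A minimal sketch of that idea follows: an in-memory fixed-window limiter whose limit depends on the IP signals. `FixedWindowLimiter` and `hourlyLimitFor` are hypothetical names, the limit of 20 for datacenter IPs is an assumed value, and the in-memory `Map` only works for a single process (a shared store such as Redis would be needed across servers).

```javascript
// Fixed-window rate limiter kept in memory (single process only).
// `now` is injectable so the clock can be controlled in tests.
class FixedWindowLimiter {
  constructor(windowMs, now = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
    this.counters = new Map(); // key -> { windowStart, count }
  }

  // Returns true if this request fits under `limit` for `key`.
  allow(key, limit) {
    const t = this.now();
    const entry = this.counters.get(key);

    // Start a fresh window if none exists or the old one has expired.
    if (!entry || t - entry.windowStart >= this.windowMs) {
      this.counters.set(key, { windowStart: t, count: 1 });
      return limit >= 1;
    }

    if (entry.count >= limit) return false;
    entry.count += 1;
    return true;
  }
}

// Assumed policy: stricter hourly limit for datacenter IPs.
function hourlyLimitFor(signals) {
  return signals.isDatacenter ? 20 : 100;
}

// Usage inside a request handler (sketch):
//   const limiter = new FixedWindowLimiter(60 * 60 * 1000);
//   if (!limiter.allow(clientIP, hourlyLimitFor(botCheck.signals))) {
//     return res.status(429).json({ error: 'Too many requests' });
//   }
```

This version never evicts old keys, so a long-running service would also need periodic cleanup of expired entries.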

Best Practices for Bot Detection

  • Start with passive monitoring: Log bot signals before blocking to tune your thresholds
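  • Whitelist known good bots: Allow Googlebot, monitoring services, etc., and verify them (for example, with a reverse DNS lookup), because User-Agent strings are easy to spoof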
  • Monitor false positives: Track user reports of "blocked by mistake"
  • Update detection rules: Bot techniques evolve—review signals monthly
  • Combine signals: No single technique is perfect; layer them
  • Optimize for speed: IP checks should complete in <50ms to avoid UX lag
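One way to enforce the latency budget from the last point is to race the IP check against a timer and fall back to a default when the check is too slow. The helper below is a sketch; `withTimeout` is a hypothetical name, and falling back to "not a bot" (failing open) is an assumption you may want to reverse for sensitive endpoints.

```javascript
// Resolve with `fallback` if `promise` has not settled within `ms`.
// If `promise` rejects before the timeout, the rejection propagates.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage with the checkForBot function from the implementation example:
//   const botCheck = await withTimeout(
//     checkForBot(clientIP),
//     50,
//     { isBot: false, timedOut: true }
//   );
```

Logging how often the fallback is used tells you whether the budget is realistic for your network path.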

When to Block vs. Challenge

Not all bot traffic should be blocked outright. Use this decision matrix:

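Risk Level        Action                 Use Case
Low (0-39)        Allow                  Normal residential IP, no red flags
Medium (40-79)    Challenge (CAPTCHA)    VPN user, unusual behavior
High (80-100)     Block                  Datacenter IP, known bot network

Note that these ranges are risk scores, where a higher number means higher risk. That is the opposite direction from the trust score in the code example above, where 0 is high risk and 100 is trusted.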
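The matrix maps directly onto a small lookup function. This sketch assumes a risk score from 0 to 100 where higher means riskier, as in the matrix; `actionForRisk` is a hypothetical name.

```javascript
// Map a 0-100 risk score (higher = riskier) to an action,
// using the ranges from the decision matrix.
function actionForRisk(riskScore) {
  if (!Number.isFinite(riskScore) || riskScore < 0 || riskScore > 100) {
    throw new RangeError('riskScore must be a number between 0 and 100');
  }
  if (riskScore >= 80) return 'block';
  if (riskScore >= 40) return 'challenge';
  return 'allow';
}
```

Throwing on out-of-range input keeps a malformed score from silently falling into the "allow" branch.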

Monitoring Bot Traffic

Set up dashboards to track:

  • Percentage of traffic flagged as bots
  • Most common bot sources (IPs, ASNs, countries)
  • Pages/endpoints most targeted by bots
  • Conversion rate differences between trusted vs. challenged users

This data helps you tune detection thresholds and identify new attack patterns.
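If you log one record per request, the first three dashboard metrics reduce to a short aggregation. The sketch below assumes a hypothetical log entry shape of `{ isBot, asn, path }`; `summarizeBotTraffic` is likewise an illustrative name.

```javascript
// Summarize logged requests: share of bot traffic plus the most common
// ASNs and paths among flagged requests.
// Each entry is assumed to look like { isBot: boolean, asn: string, path: string }.
function summarizeBotTraffic(entries, topN = 3) {
  const flagged = entries.filter((entry) => entry.isBot);

  // Count flagged entries by one field and return the topN values.
  const topBy = (field) => {
    const counts = new Map();
    for (const entry of flagged) {
      counts.set(entry[field], (counts.get(entry[field]) || 0) + 1);
    }
    return [...counts.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, topN)
      .map(([value, count]) => ({ value, count }));
  };

  return {
    botPercentage: entries.length === 0 ? 0 : (flagged.length / entries.length) * 100,
    topAsns: topBy('asn'),
    topPaths: topBy('path'),
  };
}
```

Running this over a sliding time window (for example, the last hour) makes sudden shifts in sources or targets easier to spot.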

The Cost of Ignoring Bots

Undetected bots cause real damage:

  • Skewed analytics: Bot traffic inflates metrics, hiding real user behavior
  • Wasted resources: Server costs scale with bot requests
  • Degraded UX: Bots can slow down your site for real users
  • Security breaches: Credential stuffing leads to account takeovers
  • Lost revenue: Inventory hoarding bots prevent real customers from buying

Investing in bot detection pays for itself through improved performance, security, and data quality.

Stop bots in under 20ms.

IPASIS provides real-time IP intelligence for bot detection. 1,000 free requests per day.