5 Critical IP Signals for Robust Bot Detection

February 14, 2026 | 7 min read

Modern botnets have evolved beyond simple volumetric attacks. They utilize residential proxies, rotate user agents, and mimic human behavior to bypass traditional WAFs. To detect sophisticated automation, security engineers must rely on deep context associated with the ingress IP address.

Here are the five specific IP-based signals that every production-grade bot detection system must evaluate before granting access.

1. Connection Type Classification (VPN, Proxy, Tor)

The highest-fidelity signal for bot detection is the connection type. Legitimate users typically connect via ISPs (residential) or Mobile networks. Traffic originating from Data Centers, VPN concentrators, or Tor exit nodes carries a significantly higher risk profile.

Automated scripts require infrastructure. While sophisticated attackers use residential proxies, the vast majority of scraping and credential stuffing attacks still originate from hosting providers (AWS, DigitalOcean, Hetzner) or public proxies.

Implementation Strategy: Block or challenge (for example, with a CAPTCHA) traffic flagged as is_proxy, is_vpn, or is_tor. Treat hosting traffic with extreme suspicion unless your API is specifically meant to be called by other servers.

Node.js Middleware Example

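const axios = require('axios');

async function ipRiskMiddleware(req, res, next) {
  // X-Forwarded-For can hold a comma-separated chain and is client-controlled:
  // take the first entry, and only trust it behind a proxy you operate.
  const forwarded = (req.headers['x-forwarded-for'] || '').split(',')[0].trim();
  const clientIp = forwarded || req.socket.remoteAddress;

  try {
    // Query IPASIS or internal cache; the timeout stops a slow lookup from stalling the request
    const response = await axios.get(
      `https://api.ipasis.com/json/${encodeURIComponent(clientIp)}`,
      { timeout: 2000 }
    );
    const data = response.data;

    // Block Tor and Data Center traffic immediately
    if (data.security?.is_tor || data.connection?.type === 'hosting') {
      return res.status(403).json({ error: 'Access denied due to high-risk connection type.' });
    }

    // Flag VPNs for CAPTCHA challenge
    if (data.security?.is_vpn) {
      req.needsCaptcha = true;
    }

    next();
  } catch (error) {
    // This fails open (the request proceeds when the lookup fails).
    // To fail closed instead, return an error response here, depending on security posture.
    next();
  }
}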

2. ASN and Organization Context

Not all Autonomous System Numbers (ASNs) are created equal. Analyzing the ASN allows you to categorize traffic based on the entity owning the network block.

  • Residential ISPs (e.g., Comcast, AT&T): Generally low risk, high volume.
  • Cloud Providers (e.g., AWS, GCP): High risk for login/signup endpoints. Legitimate users do not browse Amazon from an EC2 instance.
  • Boutique Hosting: Small or obscure providers, often used as bulletproof hosting that ignores abuse reports.

If your application targets consumers, blocklists should include ASNs belonging to major cloud providers to eliminate a large swath of bot traffic.
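The categorization above can be sketched as a small lookup, assuming your IP intelligence source returns a numeric ASN. The ASN sets below are a tiny illustrative sample (Amazon, Google, Microsoft, DigitalOcean, Hetzner, Comcast, AT&T), not a complete list, and the helper names classifyAsn and shouldBlock are hypothetical.

```javascript
// Illustrative sample only: a real deployment would load a maintained ASN list.
const CLOUD_ASNS = new Set([16509, 14618, 15169, 396982, 8075, 14061, 24940]);
const RESIDENTIAL_ASNS = new Set([7922, 7018]);

// Hypothetical helper: map an ASN to a coarse category.
function classifyAsn(asn) {
  if (CLOUD_ASNS.has(asn)) return 'cloud';
  if (RESIDENTIAL_ASNS.has(asn)) return 'residential';
  return 'unknown';
}

// Hypothetical policy: block cloud ASNs only on consumer-facing auth endpoints,
// so that legitimate server-to-server routes (webhooks, APIs) keep working.
function shouldBlock(asn, path) {
  const sensitivePrefixes = ['/login', '/signup'];
  const isSensitive = sensitivePrefixes.some((prefix) => path.startsWith(prefix));
  return isSensitive && classifyAsn(asn) === 'cloud';
}
```

Scoping the block to login and signup paths is one possible design choice; it avoids cutting off integrations that legitimately run in the cloud.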

3. Geolocation vs. Browser Locale Consistency

A common oversight in bot scripts is the mismatch between the IP's geolocation and the HTTP headers sent by the headless browser.

The Signal: Compare the IP geolocation (Country/Timezone) against the Accept-Language header and the client's reported timezone offset.

  • IP Location: US (United States)
  • Accept-Language: ru-RU,ru;q=0.9
  • Risk: High.

While expats and travelers exist, a statistically significant volume of traffic with this mismatch indicates a proxy network where the exit node location does not match the attacker's machine configuration.
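A minimal sketch of the language half of this check follows. It assumes you already have the IP's two-letter country code; the EXPECTED_LANGS table is a hypothetical, deliberately tiny mapping, and the timezone comparison is left out. Because of travelers and expats, the result is better used as one input to a risk score than as a block decision.

```javascript
// Hypothetical mapping from country code to languages commonly expected there.
const EXPECTED_LANGS = { US: ['en', 'es'], DE: ['de', 'en'], RU: ['ru'] };

// Parse Accept-Language into unique primary language tags,
// e.g. "ru-RU,ru;q=0.9" becomes ['ru'].
function primaryLanguages(acceptLanguage) {
  const langs = (acceptLanguage || '')
    .split(',')
    .map((part) => part.split(';')[0].trim().split('-')[0].toLowerCase())
    .filter((lang) => lang && lang !== '*');
  return [...new Set(langs)];
}

// True when none of the browser's languages is expected for the IP's country.
// Returns false when there is not enough data to judge.
function localeMismatch(ipCountry, acceptLanguage) {
  const expected = EXPECTED_LANGS[ipCountry];
  const langs = primaryLanguages(acceptLanguage);
  if (!expected || langs.length === 0) return false;
  return !langs.some((lang) => expected.includes(lang));
}
```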

4. Port Scanning and Open Port History

IPs that are part of botnets often act as both attackers and victims. If an IP address has a history of open ports commonly associated with compromised devices (e.g., MikroTik routers on port 8291, or default Telnet port 23), it is likely a residential proxy node.

Ingesting signals regarding open ports allows you to assign a risk score to residential IPs that would otherwise look legitimate. If an IP is exposing port 3389 (RDP) or 22 (SSH) to the public internet, treat requests from it with elevated scrutiny.
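One way to turn open-port observations into a score is a weighted sum, sketched below. It assumes your data source provides a list of ports seen open on the IP; the weights are illustrative values chosen for the example, not calibrated figures, and openPortRisk is a hypothetical helper name.

```javascript
// Ports named in the text, with assumed (uncalibrated) weights.
const PORT_WEIGHTS = new Map([
  [23, 40],   // Telnet
  [8291, 40], // MikroTik management port
  [3389, 20], // RDP
  [22, 10],   // SSH
]);

// Sum the weights of distinct risky ports, capped at 100.
function openPortRisk(openPorts) {
  let score = 0;
  for (const port of new Set(openPorts)) {
    score += PORT_WEIGHTS.get(port) || 0;
  }
  return Math.min(score, 100);
}
```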

5. Velocity and Abuse History

Static analysis is insufficient for zero-day attacks. You must correlate IP metadata with historical abuse data.

  • Velocity: How many unique sessions has this IP initiated in the last 60 seconds?
  • Abuse History: Has this IP been reported for spam, brute force, or DDoS in the last 24 hours?

This data must be available on the request path with minimal latency, ideally from an in-memory store (Redis/Memcached) rather than a synchronous remote call, so caching strategies are essential here. Do not rely solely on local counters; use a global threat feed API to check whether the IP is attacking other systems at the same time.
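The velocity question can be answered with a sliding-window counter. The sketch below keeps timestamps in process memory purely for illustration; VelocityTracker is a hypothetical class, and a multi-instance deployment would keep the same counts in a shared store such as Redis.

```javascript
class VelocityTracker {
  constructor(windowMs = 60000) {
    this.windowMs = windowMs;
    this.hits = new Map(); // ip -> array of session-start timestamps (ms)
  }

  // Record one session start for the IP and return how many
  // session starts fall inside the window, including this one.
  record(ip, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(ip) || []).filter((t) => t > cutoff);
    recent.push(now);
    this.hits.set(ip, recent);
    return recent.length;
  }
}
```

A caller would compare the returned count against a threshold of its choosing and escalate (challenge, throttle, or block) when it is exceeded.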

FAQ

Q: Should I block all VPN traffic? A: Not necessarily. Privacy-conscious users utilize VPNs. It is better to challenge VPN users with MFA or CAPTCHA rather than a hard block, unless your application manages highly sensitive financial data.

Q: How do I handle IPv6? A: Bot detection on IPv6 is harder due to address space size. Do not track individual IPv6 addresses; track the /64 subnet. An attacker can rotate the last 64 bits indefinitely, but they usually cannot change the subnet assigned by their ISP.
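As a rough sketch of the /64 approach, the hypothetical helper below reduces an address to a tracking key: IPv4 addresses are returned unchanged and IPv6 addresses are collapsed to their first 64 bits. It relies only on Node's built-in net module for validation; the output format is an assumption made for this example.

```javascript
const net = require('net');

// Return a key for rate limiting: the IPv4 address itself,
// or the /64 prefix for an IPv6 address.
function trackingKey(ip) {
  if (net.isIPv4(ip)) return ip;
  if (!net.isIPv6(ip)) throw new Error(`Not an IP address: ${ip}`);

  const address = ip.split('%')[0]; // drop a zone id such as %eth0
  // IPv4-mapped addresses (::ffff:a.b.c.d) are tracked as plain IPv4.
  const mapped = /^::ffff:(\d+\.\d+\.\d+\.\d+)$/i.exec(address);
  if (mapped) return mapped[1];

  const [head, tail] = address.split('::');
  const headGroups = head ? head.split(':') : [];
  const tailGroups = tail ? tail.split(':') : [];
  // "::" stands for as many zero groups as needed to reach 8 groups.
  const missing = address.includes('::') ? 8 - headGroups.length - tailGroups.length : 0;
  const groups = [...headGroups, ...Array(missing).fill('0'), ...tailGroups];

  // Keep the first four 16-bit groups (64 bits), lowercase, no leading zeros.
  const prefix = groups.slice(0, 4).map((g) => parseInt(g, 16).toString(16));
  return `${prefix.join(':')}::/64`;
}
```

Two addresses in the same /64 then share one key, so counters and blocks apply to the whole subnet rather than to each rotated address.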

Q: What is the latency impact of IP lookups? A: Third-party API lookups can add 50-200ms. To mitigate this, perform lookups asynchronously where possible, or cache results in Redis for 24 hours. The IP metadata for a specific address rarely changes minute-to-minute.
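The caching idea can be sketched with a small TTL cache. The version below lives in process memory as a stand-in for Redis, and both TtlCache and cachedLookup are hypothetical names; fetchIpData represents whatever function performs the remote lookup.

```javascript
class TtlCache {
  constructor(ttlMs = 24 * 60 * 60 * 1000) { // default: 24 hours
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Return the cached value, or undefined if missing or expired.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// Serve from the cache when possible; otherwise fetch once and store the result.
async function cachedLookup(cache, ip, fetchIpData) {
  const hit = cache.get(ip);
  if (hit !== undefined) return hit;
  const data = await fetchIpData(ip);
  cache.set(ip, data);
  return data;
}
```

With this in place, only the first request from a given IP within the TTL pays the remote lookup cost.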


Secure Your Perimeter with IPASIS

Don't let bots drain your resources. IPASIS provides enterprise-grade IP intelligence with precision detection for VPNs, proxies, and Tor nodes. Integrate our low-latency API today to filter traffic before it hits your application logic.

