Every domain name has a story. Who registered it? When does it expire? For decades, developers answered these questions by running whois example.com in a terminal and parsing a wall of unstructured text. That worked for one-off lookups, but it's a nightmare when building a security scanner that checks thousands of domains per hour, or a brand protection bot monitoring typo-squatting variations.

Modern applications need structured, programmatic access to domain registration data. Your cybersecurity dashboard can't scrape terminal output, and your automated monitoring tool can't handle inconsistent formats across 1,500+ top-level domains. You need an API that returns clean JSON, handles errors predictably, and scales with your infrastructure.

WHOIS vs. RDAP: The Technical Foundation

Legacy WHOIS was built in 1982. It uses TCP Port 43, returns unstructured ASCII text, and has no standardized format. RDAP (Registration Data Access Protocol), standardized by the IETF in RFC 7480, is the modern replacement. It's RESTful, returns structured JSON, includes proper HTTP status codes, and supports internationalization.

If you're building new infrastructure, use RDAP. The only reason to touch legacy WHOIS is if you're maintaining code written before 2015.

Integration Mechanics: How to Build It

IANA maintains a bootstrap service that maps TLDs to their RDAP servers. Here's how to query the bootstrap and look up a domain:

python
import requests

def query_domain_rdap(domain):
    tld_servers = {
        'com': 'https://rdap.verisign.com/com/v1',
        'net': 'https://rdap.verisign.com/net/v1',
        'org': 'https://rdap.publicinterestregistry.org/rdap'
    }

    tld = domain.split('.')[-1].lower()

    if tld not in tld_servers:
        raise ValueError(f"TLD '{tld}' not supported")

    base_url = tld_servers[tld]
    url = f"{base_url}/domain/{domain}"

    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.HTTPError as e:
        if e.response.status_code == 404:
            return None
        raise

Understanding the Response Structure

A successful RDAP query returns JSON with these key fields: ldhName (domain in ASCII format), events (array of dated events: registration, expiration, last update), status (EPP status codes), entities (registrar/registrant/admin contacts), and nameservers (DNS configuration). The status array is particularly useful — pendingDelete means a domain is about to drop and become available.

The Privacy Barrier: GDPR and Redaction

In 2018, GDPR changed WHOIS forever. Registrant and admin contact details are now typically REDACTED FOR PRIVACY. Registrar information and domain dates remain public. For brand protection tools, shift from "domains registered by known bad actors" to "domains with similar names registered recently."

The abuse contact isn't redacted — it's required to be visible under ICANN policy and is useful as a reporting channel for phishing sites and malware.

Building a Cache Layer

python
import redis
import json
from datetime import timedelta

class RDAPCache:
    """Redis-backed cache for RDAP queries."""

    def __init__(self, redis_client):
        self.redis = redis_client
        self.default_ttl = timedelta(hours=24)

    def _cache_key(self, domain):
        return f"rdap:{domain.lower()}"

    def get(self, domain):
        cached = self.redis.get(self._cache_key(domain))
        return json.loads(cached) if cached else None

    def set(self, domain, rdap_data, ttl=None):
        ttl = ttl or self.default_ttl
        self.redis.setex(
            self._cache_key(domain),
            int(ttl.total_seconds()),
            json.dumps(rdap_data)
        )

    def query_with_cache(self, domain):
        cached = self.get(domain)
        if cached:
            return cached
        rdap_data = query_domain_rdap(domain)
        self.set(domain, rdap_data)
        return rdap_data

Set TTLs based on monitoring needs: 24 hours for general monitoring, 4-6 hours for expiration monitoring, 1 hour for high-value acquisition targets, and 7 days for historical analysis.