429 Error: How to Handle Rate Limits When Scraping Websites

Learn about the HTTP 429 Too Many Requests error, why it occurs during web scraping, and effective strategies for handling rate limiting.

What is HTTP 429 Too Many Requests?

The 429 status code means "Too Many Requests" - the server is limiting the rate of requests from your IP address or user session. This is a protective measure to prevent abuse and ensure fair usage of server resources.
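
You can confirm that a server is rate limiting you by checking the status code and inspecting the response headers. Below is a minimal sketch, assuming a requests-based scraper; note that rate-limit headers such as Retry-After or X-RateLimit-Remaining vary between servers and may not be present at all:

import requests

response = requests.get('https://example.com/page')

if response.status_code == 429:
    # The server is throttling us; Retry-After (if sent) says how long to wait
    print('Rate limited. Retry-After:', response.headers.get('Retry-After'))
    print('Remaining quota (if exposed):', response.headers.get('X-RateLimit-Remaining'))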

Common Causes of 429 Errors

  • Rate limiting - Too many requests per minute/hour
  • API quotas - Exceeding API usage limits
  • IP-based throttling - Same IP making too many requests
  • Session-based limits - Too many requests per session
  • Concurrent request limits - Too many simultaneous requests (see the concurrency sketch after this list)
  • Resource protection - Server protecting against overload
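
Some of these limits can be worked around on the client side. The sketch below shows one way to keep the number of simultaneous requests low using a small thread pool; the worker count, URLs, and headers are placeholder values, not part of the original article:

import time
import random
from concurrent.futures import ThreadPoolExecutor
import requests

headers = {'User-Agent': 'Mozilla/5.0 (compatible; MyScraper/1.0)'}
urls = ['https://example.com/page1', 'https://example.com/page2']

def fetch(url):
    # Small random delay so pooled workers do not all fire at the same instant
    time.sleep(random.uniform(0.5, 1.5))
    return requests.get(url, headers=headers)

# max_workers caps how many requests run concurrently
with ThreadPoolExecutor(max_workers=2) as pool:
    responses = list(pool.map(fetch, urls))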

How to Handle Rate Limits

1. Implement Request Delays

Add realistic delays between requests:

import time
import random
import requests

# Example headers; replace with the headers your scraper normally sends
headers = {'User-Agent': 'Mozilla/5.0 (compatible; MyScraper/1.0)'}

def make_request(url):
    # Random delay between 1-3 seconds to avoid hammering the server
    delay = random.uniform(1, 3)
    time.sleep(delay)

    response = requests.get(url, headers=headers)
    return response

2. Use Exponential Backoff

Implement exponential backoff for retries:

import time
import random
import requests

# Example headers; replace with the headers your scraper normally sends
headers = {'User-Agent': 'Mozilla/5.0 (compatible; MyScraper/1.0)'}

def exponential_backoff(attempt):
    """Calculate delay with exponential backoff plus random jitter"""
    base_delay = 1
    max_delay = 300  # 5 minutes
    delay = min(base_delay * (2 ** attempt) + random.uniform(0, 1), max_delay)
    return delay

def make_request_with_retry(url, max_retries=5):
    for attempt in range(max_retries):
        try:
            response = requests.get(url, headers=headers)
            if response.status_code != 429:
                return response
        except requests.exceptions.RequestException:
            pass  # Network error; back off and retry

        if attempt < max_retries - 1:
            delay = exponential_backoff(attempt)
            print(f"Request throttled or failed, retrying in {delay:.2f} seconds...")
            time.sleep(delay)

    return None  # All retries exhausted

3. Check Retry-After Header

Respect the Retry-After header when provided:

import time
import requests

headers = {'User-Agent': 'Mozilla/5.0 (compatible; MyScraper/1.0)'}

def make_request_with_retry_after(url):
    response = requests.get(url, headers=headers)

    if response.status_code == 429:
        retry_after = response.headers.get('Retry-After')
        if retry_after:
            try:
                wait_time = int(retry_after)
                print(f"Server requested wait time: {wait_time} seconds")
                time.sleep(wait_time)
                # Retry the request after the requested wait
                response = requests.get(url, headers=headers)
            except ValueError:
                # Retry-After may be an HTTP date rather than a number of seconds;
                # fall back to a conservative default delay
                time.sleep(60)
                response = requests.get(url, headers=headers)

    return response

4. Use Proxy Rotation

Rotate IP addresses to distribute requests:

import random
import requests

headers = {'User-Agent': 'Mozilla/5.0 (compatible; MyScraper/1.0)'}

# Proxy URLs should include the scheme, e.g. 'http://host:port'
proxies = [
    {'http': 'http://proxy1:port', 'https': 'http://proxy1:port'},
    {'http': 'http://proxy2:port', 'https': 'http://proxy2:port'},
    {'http': 'http://proxy3:port', 'https': 'http://proxy3:port'},
]

def get_random_proxy():
    return random.choice(proxies)

def make_request_with_proxy(url):
    proxy = get_random_proxy()
    try:
        response = requests.get(url, headers=headers, proxies=proxy)
        return response
    except requests.exceptions.RequestException:
        # Try again with a different proxy
        proxy = get_random_proxy()
        response = requests.get(url, headers=headers, proxies=proxy)
        return response

Professional Solutions

For production scraping, consider using the ScrapingForge API:

  • Automatic rate limiting - Built-in protection against 429 errors
  • Residential proxies - High success rates with real IP addresses
  • Request queuing - Intelligent request timing and distribution
  • Global infrastructure - Distribute requests across multiple locations

A basic request through the API looks like this:

import requests

url = "https://api.scrapingforge.com/v1/scrape"
params = {
    'api_key': 'YOUR_API_KEY',
    'url': 'https://target-website.com',
    'country': 'US',
    'render_js': 'true'
}

response = requests.get(url, params=params)

Best Practices Summary

  1. Implement proper delays - Don't overwhelm the target server
  2. Use exponential backoff - Handle retries intelligently
  3. Respect rate limit headers - Follow server-provided limits
  4. Distribute requests - Use proxy rotation and queuing
  5. Monitor success rates - Track and adjust your approach (see the tracking sketch after this list)
  6. Consider professional tools - Use ScrapingForge for complex scenarios
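
As mentioned in point 5, tracking how often you receive 429 responses makes it easier to tune your delays. The sketch below is an illustrative example, not from the original article: it counts throttled responses and lengthens the base delay when they become frequent (the 10% threshold and 1.5x multiplier are arbitrary example values):

import time
import random
import requests

class RateMonitor:
    """Track 429 responses and increase the delay when they become frequent."""

    def __init__(self, base_delay=1.0):
        self.base_delay = base_delay
        self.total = 0
        self.throttled = 0

    def record(self, status_code):
        self.total += 1
        if status_code == 429:
            self.throttled += 1
        # If more than 10% of the last 20+ requests were throttled, back off harder
        if self.total >= 20 and self.throttled / self.total > 0.1:
            self.base_delay *= 1.5
            self.total = 0
            self.throttled = 0

    def wait(self):
        time.sleep(self.base_delay + random.uniform(0, 1))

monitor = RateMonitor()

def monitored_get(url, headers=None):
    monitor.wait()
    response = requests.get(url, headers=headers)
    monitor.record(response.status_code)
    return response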

Conclusion

HTTP 429 Too Many Requests errors are common but manageable obstacles in web scraping. By implementing proper delays, exponential backoff, proxy rotation, and monitoring, you can significantly reduce the occurrence of this error. For production scraping projects, consider using professional services like ScrapingForge that handle these challenges automatically.