Proxy Rotation¶
Proxy rotation distributes requests across multiple proxy servers to avoid IP-based rate limits, bypass geographical restrictions, and maintain anonymity.
Why Proxy Rotation?¶
IP-Based Rate Limits¶
Many services enforce rate limits based on IP address:
Without proxy rotation:
Your IP: 192.168.1.100
├─ Request 1 → API (count: 1/100)
├─ Request 2 → API (count: 2/100)
├─ Request 3 → API (count: 3/100)
...
└─ Request 100→ API (count: 100/100)
BLOCKED! ❌
With proxy rotation:
Request 1 → Proxy A (192.168.1.10) → API (count: 1/100)
Request 2 → Proxy B (192.168.1.11) → API (count: 1/100)
Request 3 → Proxy C (192.168.1.12) → API (count: 1/100)
Request 4 → Proxy A (192.168.1.10) → API (count: 2/100)
Request 5 → Proxy B (192.168.1.11) → API (count: 2/100)
...
All requests succeed! ✓
Use Cases¶
1. Web Scraping
   - Avoid IP blocks when scraping at scale
   - Distribute load across multiple IPs
   - Maintain anonymity
2. API Access
   - Bypass per-IP rate limits
   - Access geo-restricted APIs
   - Reduce risk of API key deactivation
3. Testing
   - Simulate requests from different locations
   - Test geo-specific functionality
   - Verify load balancing
4. Privacy
   - Hide your real IP address
   - Prevent tracking
   - Access region-locked content
Proxy Formats Supported¶
The proxy manager supports multiple proxy formats for flexibility:
Format 1: IP:PORT¶
Pattern: ^(\d{1,3}\.){3}\d{1,3}:\d{1,5}$
Format 2: IP:PORT:USER:PASS¶
Pattern: ^(\d{1,3}\.){3}\d{1,3}:\d{1,5}:[^:]+:[^:]+$
Format 3: Full URL with Credentials¶
Patterns:
- ^http://[^:]+:[^@]+@[^:]+:\d+$
- ^https://[^:]+:[^@]+@[^:]+:\d+$
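As a rough illustration of how these patterns separate the formats, the sketch below compiles them and reports which one a string matches. The names `PROXY_PATTERNS` (as a dict) and `classify` are illustrative and not taken from the library, and both URL patterns are written here with a colon before the port.

```python
import re
from typing import Optional

# Illustrative copies of the documented patterns (not imported from the library).
PROXY_PATTERNS = {
    "ip:port": r"^(\d{1,3}\.){3}\d{1,3}:\d{1,5}$",
    "ip:port:user:pass": r"^(\d{1,3}\.){3}\d{1,3}:\d{1,5}:[^:]+:[^:]+$",
    "http url": r"^http://[^:]+:[^@]+@[^:]+:\d+$",
    "https url": r"^https://[^:]+:[^@]+@[^:]+:\d+$",
}


def classify(proxy: str) -> Optional[str]:
    """Return the name of the first pattern that matches, or None."""
    for name, pattern in PROXY_PATTERNS.items():
        if re.match(pattern, proxy):
            return name
    return None
```

The patterns only check the shape of the string. A value such as `999.1.1.1:80` still matches the first pattern, which is why a separate octet range check is needed.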
Automatic Format Conversion¶
The proxy manager normalizes formats internally:
# Input in IP:PORT:USER:PASS format
"192.168.1.10:8080:admin:secret"
# Internally converted to URL format
"http://admin:secret@192.168.1.10:8080"
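A minimal sketch of such a conversion is shown below. The function name `normalize_proxy`, the default `http` scheme for bare `IP:PORT` input, and the `ValueError` on unknown shapes are assumptions made for illustration; the library's internal helper may differ.

```python
def normalize_proxy(proxy: str, scheme: str = "http") -> str:
    """Convert IP:PORT or IP:PORT:USER:PASS into URL form; pass URLs through."""
    if "://" in proxy:
        return proxy  # already a URL, leave untouched
    parts = proxy.split(":")
    if len(parts) == 2:
        host, port = parts
        return f"{scheme}://{host}:{port}"
    if len(parts) == 4:
        host, port, user, password = parts
        return f"{scheme}://{user}:{password}@{host}:{port}"
    raise ValueError(f"Unrecognized proxy format: {proxy!r}")
```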
Proxy Validation¶
The proxy manager validates proxies to ensure they're properly formatted before use.
Format Validation¶
@classmethod
def validate(cls, proxy: str) -> bool:
    """Validate proxy format."""
    if not proxy or not isinstance(proxy, str):
        return False
    for pattern in cls.PROXY_PATTERNS:
        if re.match(pattern, proxy):
            # For IP-based formats, validate octets
            if pattern in cls.PROXY_PATTERNS[:2]:
                ip_part = proxy.split(":")[0]
                if not cls._validate_ip_octets(ip_part):
                    return False
            return True
    return False
IP Octet Validation¶
IP addresses are validated to ensure each octet is in the valid range (0-255):
@classmethod
def _validate_ip_octets(cls, ip: str) -> bool:
    """Validate IP octets are in range 0-255."""
    octets = ip.split(".")
    if len(octets) != 4:
        return False
    try:
        return all(0 <= int(octet) <= 255 for octet in octets)
    except ValueError:
        return False
Validation Examples¶
ProxyManager.validate("192.168.1.10:8080") # ✓ Valid
ProxyManager.validate("192.168.1.10:8080:admin:pass") # ✓ Valid
ProxyManager.validate("http://admin:pass@192.168.1.10:8080") # ✓ Valid
ProxyManager.validate("256.1.1.1:8080") # ✗ Invalid octet
ProxyManager.validate("192.168.1:8080") # ✗ Invalid IP format
ProxyManager.validate("not-a-proxy") # ✗ Invalid format
Loading and Filtering¶
When proxies are loaded, invalid ones are filtered out:
def _load_proxies(self) -> None:
    proxies = []
    # Load from various sources
    if self._config.list:
        proxies.extend(self._config.list)
    # Filter invalid proxies
    self._proxies = []
    for proxy in proxies:
        if self.validate(proxy):
            self._proxies.append(proxy)
        else:
            logger.debug(f"Filtered invalid proxy format: {proxy}")
Failed Proxy Tracking¶
The proxy manager tracks failed proxies to avoid repeatedly using problematic ones.
Failed Proxy State¶
class ProxyManager:
    def __init__(self, config: ProxyConfig):
        self._proxies: List[str] = []
        self._failed_proxies: Dict[str, float] = {}
        self._lock = asyncio.Lock()
- _proxies: All valid proxies
- _failed_proxies: Failed proxies, each mapped to the timestamp at which it becomes eligible for retry
- _lock: asyncio lock that serializes access from concurrent tasks
Marking Failed Proxies¶
async def mark_failed(self, proxy: str) -> None:
    """Mark proxy as failed (unavailable for retry_delay)."""
    async with self._lock:
        if proxy in self._proxies:
            self._failed_proxies[proxy] = (
                time.time() + self._config.retry_delay
            )
When a proxy fails, it's marked with a timestamp indicating when it should be retried.
Marking Successful Proxies¶
async def mark_success(self, proxy: str) -> None:
    """Mark proxy as successful (clear failed status)."""
    async with self._lock:
        self._failed_proxies.pop(proxy, None)
If a proxy succeeds again, its failed status is cleared.
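One way a caller might report outcomes back to the manager is sketched below. The helper `send_with_proxy` and its `send` callback are hypothetical; only `get_next()`, `mark_failed()`, and `mark_success()` come from the code shown on this page. The sketch also assumes that any exception raised by `send` is the proxy's fault, which is a simplification.

```python
from typing import Awaitable, Callable, Optional


async def send_with_proxy(
    manager, send: Callable[[str], Awaitable[str]]
) -> Optional[str]:
    """Pick a proxy, run `send` through it, and report the outcome."""
    proxy = await manager.get_next()
    if proxy is None:
        return None  # nothing available right now
    try:
        result = await send(proxy)
    except Exception:
        # Assumption: every failure is attributed to the proxy.
        await manager.mark_failed(proxy)  # starts the retry_delay cooldown
        raise
    await manager.mark_success(proxy)  # clears any earlier failure
    return result
```

In practice it may be worth marking a proxy as failed only for connection-level errors, so that an error response from the target does not take a healthy proxy out of rotation.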
Getting Next Proxy¶
async def get_next(self) -> Optional[str]:
    """Get next available proxy."""
    async with self._lock:
        now = time.time()
        # Remove expired failed entries
        self._failed_proxies = {
            p: t for p, t in self._failed_proxies.items() if t > now
        }
        # Get available proxies (not in failed state)
        available = [
            p for p in self._proxies
            if p not in self._failed_proxies
        ]
        if not available:
            return None
        return random.choice(available)
Failed Proxy Lifecycle¶
Time 0s: Proxy fails → mark_failed() → _failed_proxies[proxy] = now + 60s
Time 10s: get_next() → Proxy excluded from selection
Time 30s: get_next() → Proxy excluded from selection
Time 60s: get_next() → Expired, removed from _failed_proxies
Time 60s: Proxy available for selection again
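The lifecycle above can be reproduced without waiting by using a fake clock. The sketch below is a synchronous toy model: `MiniProxyPool` and `FakeClock` are illustrative names, not part of the library, and only the cooldown logic mirrors the `mark_failed()` and `get_next()` code shown earlier.

```python
import random
from typing import Callable, Dict, List, Optional


class FakeClock:
    """Stand-in for time.time() so the example needs no real waiting."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


class MiniProxyPool:
    """Synchronous toy model of the cooldown logic (not the real ProxyManager)."""

    def __init__(self, proxies: List[str], retry_delay: float,
                 clock: Callable[[], float]) -> None:
        self._proxies = list(proxies)
        self._retry_delay = retry_delay
        self._clock = clock
        self._failed: Dict[str, float] = {}

    def mark_failed(self, proxy: str) -> None:
        if proxy in self._proxies:
            self._failed[proxy] = self._clock() + self._retry_delay

    def get_next(self) -> Optional[str]:
        now = self._clock()
        # Drop entries whose retry time has passed.
        self._failed = {p: t for p, t in self._failed.items() if t > now}
        available = [p for p in self._proxies if p not in self._failed]
        return random.choice(available) if available else None


clock = FakeClock()
pool = MiniProxyPool(["192.168.1.10:8080"], retry_delay=60.0, clock=clock)
pool.mark_failed("192.168.1.10:8080")  # t = 0s: retry time becomes 60.0
clock.now = 30.0
mid = pool.get_next()   # still cooling down, so nothing is available
clock.now = 60.0
late = pool.get_next()  # retry time reached, the proxy is eligible again
```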
Retry Delay Configuration¶
Default retry delay is 60 seconds. Adjust based on your needs:
# Short retry delay for quick testing
ProxyConfig(retry_delay=10.0)
# Long retry delay for production scraping
ProxyConfig(retry_delay=300.0) # 5 minutes
WebShare.io Integration¶
The proxy manager integrates with WebShare.io for easy proxy management.
Loading from WebShare¶
def _load_webshare_proxies(self, url: str) -> List[str]:
    """Load proxies from WebShare.io URL."""
    import requests
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        proxies = []
        for line in response.text.strip().split("\n"):
            line = line.strip()
            if not line:
                continue
            # WebShare format: IP:PORT:USER:PASS
            parts = line.split(":")
            if len(parts) >= 4:
                ip, port, user, pw = parts[:4]
                proxy = f"http://{user}:{pw}@{ip}:{port}"
                proxies.append(proxy)
        return proxies
    except Exception as e:
        raise ProxyValidationError(
            f"Failed to load webshare proxies: {e}"
        ) from e
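The parsing step can be tried offline by separating it from the HTTP request. The sketch below applies the same line-splitting logic to a string; `parse_webshare_lines` is a hypothetical helper name and the sample credentials are made up.

```python
from typing import List


def parse_webshare_lines(text: str) -> List[str]:
    """Turn IP:PORT:USER:PASS lines into http:// proxy URLs, skipping other lines."""
    proxies = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(":")
        if len(parts) >= 4:
            ip, port, user, pw = parts[:4]
            proxies.append(f"http://{user}:{pw}@{ip}:{port}")
    return proxies


sample = "10.0.0.1:8080:alice:s3cret\n\n10.0.0.2:8080:bob:hunter2\nmalformed-line\n"
urls = parse_webshare_lines(sample)
```

Because only the first four colon-separated fields are used, a password that itself contains a colon would be truncated by this logic.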
Using WebShare Proxies¶
from fastreq.utils.proxies import ProxyManager, ProxyConfig
config = ProxyConfig(
enabled=True,
webshare_url="https://your-webshare-proxy-list-url",
retry_delay=60.0,
)
manager = ProxyManager(config)
proxy = await manager.get_next()
Environment Variable Loading¶
Proxies can also be loaded from the PROXIES environment variable:
# Set environment variable
export PROXIES="192.168.1.10:8080,192.168.1.11:8080"
# Automatically loaded
manager = ProxyManager(config)
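How the variable might be parsed is sketched below. The comma separator is inferred from the example above; the helper name `proxies_from_env` and the tolerance for surrounding whitespace and trailing commas are assumptions, not documented behavior.

```python
import os
from typing import List, Mapping


def proxies_from_env(
    env: Mapping[str, str] = os.environ, var: str = "PROXIES"
) -> List[str]:
    """Split a comma-separated proxy list, ignoring blanks and whitespace."""
    raw = env.get(var, "")
    return [item.strip() for item in raw.split(",") if item.strip()]
```

Passing the environment as a mapping keeps the function easy to test without modifying the real process environment.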
Proxy Manager Internals¶
Thread Safety¶
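Operations that modify shared state are protected by an asyncio.Lock:
async def get_next(self) -> Optional[str]:
    async with self._lock:  # One task at a time
        ...  # Read and modify shared state
This ensures multiple concurrent tasks on the same event loop can safely access the proxy manager. Note that asyncio.Lock coordinates coroutines only; it does not protect against access from multiple OS threads.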
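To illustrate what the lock guards against, the sketch below has many tasks update a shared counter with an `await` in the middle of the update. It is a standalone demonstration and does not use the proxy manager; `run_demo` and `bump` are illustrative names.

```python
import asyncio


async def run_demo(tasks: int = 50) -> int:
    """Increment a shared counter from many tasks, holding a lock across an await."""
    lock = asyncio.Lock()
    state = {"count": 0}

    async def bump() -> None:
        async with lock:
            current = state["count"]
            # Yield to the event loop; other tasks must wait for the lock.
            await asyncio.sleep(0)
            state["count"] = current + 1

    await asyncio.gather(*(bump() for _ in range(tasks)))
    return state["count"]


total = asyncio.run(run_demo())
```

With the lock in place every increment is preserved. Removing the `async with lock` line lets tasks interleave at the `await`, and updates are lost.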
Proxy Statistics¶
The proxy manager provides statistics:
def count(self) -> int:
    """Get total number of proxies."""
    return len(self._proxies)

def count_available(self) -> int:
    """Get number of available proxies."""
    now = time.time()
    return sum(
        1 for p in self._proxies
        if p not in self._failed_proxies
        or self._failed_proxies[p] <= now
    )
Random Selection¶
Proxies are selected randomly, using random.choice over the currently available proxies, to distribute load.
Random selection helps avoid:
- Predictable patterns
- Uneven proxy usage
- Hotspots on specific proxies
Example Usage¶
Basic Proxy Rotation¶
from fastreq import FastRequests
from fastreq.utils.proxies import ProxyManager, ProxyConfig
# Configure proxy rotation
proxy_config = ProxyConfig(
enabled=True,
list=[
"192.168.1.10:8080",
"192.168.1.11:8080:admin:pass",
"http://user:pass@192.168.1.12:8080",
],
retry_delay=60.0,
)
# Create client with proxy rotation
client = FastRequests(
random_proxy=True, # Enable proxy rotation
concurrency=10,
)
WebShare Integration¶
proxy_config = ProxyConfig(
enabled=True,
webshare_url="https://your-api.webshare.io/api/v2/proxy",
retry_delay=120.0, # 2 minutes
)
manager = ProxyManager(proxy_config)
print(f"Loaded {manager.count()} proxies")
Monitoring Proxy Health¶
manager = ProxyManager(proxy_config)
# Check proxy status
print(f"Total proxies: {manager.count()}")
print(f"Available proxies: {manager.count_available()}")
# Get next available proxy
proxy = await manager.get_next()
if proxy:
    print(f"Using proxy: {proxy}")
else:
    print("No proxies available!")
Best Practices¶
1. Use Multiple Proxies¶
Don't rely on a single proxy:
# Good: Multiple proxies for rotation
ProxyConfig(list=[
"192.168.1.10:8080",
"192.168.1.11:8080",
"192.168.1.12:8080",
])
# Bad: Single proxy (no rotation benefit)
ProxyConfig(list=["192.168.1.10:8080"])
2. Handle Proxy Exhaustion¶
When every proxy is in the failed state, get_next() returns None. Check for this before sending the request:
proxy = await manager.get_next()
if not proxy:
    logger.error("All proxies failed!")
    # Handle gracefully: wait, alert, etc.
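One possible way to handle exhaustion is to poll until a cooldown expires. The helper below is a hypothetical sketch; it relies only on `get_next()` returning `None` when nothing is available, and the `attempts` and `delay` values are arbitrary.

```python
import asyncio
from typing import Optional


async def wait_for_proxy(
    manager, attempts: int = 5, delay: float = 1.0
) -> Optional[str]:
    """Poll get_next() until a proxy is available or attempts run out."""
    for attempt in range(attempts):
        proxy = await manager.get_next()
        if proxy is not None:
            return proxy
        if attempt < attempts - 1:
            await asyncio.sleep(delay)  # give cooldowns time to expire
    return None
```

The total wait should be at least as long as `retry_delay`, otherwise the loop can give up before any failed proxy becomes eligible again.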
3. Monitor Failed Proxies¶
Track proxy health and rotation:
logger.info(f"Proxies: {manager.count_available()}/{manager.count()}")
if manager.count_available() < manager.count() * 0.5:
    logger.warning("More than 50% of proxies failed!")
4. Use Appropriate Retry Delays¶
Adjust retry delay based on proxy quality:
# High-quality proxies: Short retry delay
ProxyConfig(retry_delay=30.0)
# Low-quality proxies: Long retry delay
ProxyConfig(retry_delay=300.0)
5. Validate Proxies Before Use¶
The manager validates only the format, not reachability, so consider testing connectivity before relying on a proxy.
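A connectivity check could look like the sketch below. It uses only the standard library rather than the proxy manager; `check_proxy` is a hypothetical helper, the test URL is a placeholder, and the proxy is expected in URL form (for example `http://user:pass@host:port`).

```python
import http.client
import urllib.request


def check_proxy(
    proxy_url: str,
    test_url: str = "http://example.com/",
    timeout: float = 5.0,
) -> bool:
    """Return True if a request through the proxy completes successfully."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, timeouts and refused connections are OSError subclasses.
        return False
```

A check like this only shows that the proxy was reachable at that moment. Proxies can still fail later, so the failed-proxy tracking remains necessary.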
Troubleshooting¶
All Proxies Failing¶
Problem: count_available() returns 0
Possible Causes:
1. All proxies marked as failed
2. Retry delay too long
3. Proxies genuinely offline
Solutions:
# Reduce retry delay
ProxyConfig(retry_delay=30.0) # Instead of 60.0
# Check proxy connectivity manually
# Consider using a proxy health check service
Invalid Proxy Format¶
Problem: Proxies being filtered out
Solution: Check format:
from fastreq.utils.proxies import ProxyManager
ProxyManager.validate("192.168.1.10:8080") # Should return True
ProxyManager.validate("invalid") # Should return False
Proxy Not Working¶
Problem: Requests still failing with proxy
Possible Causes:
1. Proxy is offline
2. Credentials incorrect
3. Proxy blocked by target
Solution: Test proxy manually:
import requests
proxy = "http://user:pass@192.168.1.10:8080"
try:
    response = requests.get(
        "https://httpbin.org/ip",
        proxies={"http": proxy, "https": proxy},
        timeout=10
    )
    print(response.json())  # Should show proxy IP
except Exception as e:
    print(f"Proxy failed: {e}")
Related Documentation¶
- How-to: Use Proxies - Practical proxy usage guide
- Architecture - How proxy rotation integrates with other components