In my web scraping journey with Selenium, I hit a roadblock when the website I was scraping suddenly started returning “403 Forbidden” errors. It turned out that my VPS’s IP address had been banned.


To get around the ban and keep scraping without interruption, I explored routing Selenium’s traffic through proxy servers. After some trial and error, I landed on a reliable solution: Selenium Wire.

Selenium Wire is an extension of Selenium that offers additional features, including seamless proxy integration, and installs with `pip install selenium-wire`. The first step is to obtain a list of proxies; the function below fetches them from Webshare’s proxy list API.

import requests

def get_proxies(api_key, page=1, page_size=25):
    """Fetch one page of proxies from Webshare as authenticated proxy URLs."""
    url = f"https://proxy.webshare.io/api/v2/proxy/list/?mode=direct&page={page}&page_size={page_size}"
    # Webshare's API expects the key prefixed with "Token "
    headers = {"Authorization": f"Token {api_key}"}
    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        proxies = []
        for proxy in response.json().get('results', []):
            # Build an authenticated proxy URL: http://user:pass@host:port
            proxies.append(f"http://{proxy['username']}:{proxy['password']}@{proxy['proxy_address']}:{proxy['port']}")
        return proxies

    print(f"Error fetching proxies: {response.status_code}")
    return []
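The function above fetches a single page. If your plan includes more proxies than one page holds, you can keep requesting pages until the API returns a short batch. Here’s a minimal sketch: `get_all_proxies` and its `fetch_page` parameter are illustrative names I’m introducing, where `fetch_page` is any callable taking `(page, page_size)` and returning a list — for example `lambda p, s: get_proxies(API_KEY, p, s)`:

```python
def get_all_proxies(fetch_page, page_size=25):
    """Collect proxies across pages until a page comes back short or empty.

    fetch_page: callable taking (page, page_size) and returning a list of
    proxy URLs, e.g. lambda p, s: get_proxies(API_KEY, p, s).
    """
    proxies = []
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        proxies.extend(batch)
        if len(batch) < page_size:  # a short (or empty) batch means the last page
            return proxies
        page += 1
```

Taking a callable instead of the API key keeps the pagination logic independent of the HTTP details, which also makes it easy to test.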

To use these proxies with Selenium, follow these steps:

  1. Fetch a random proxy from the list obtained using the get_proxies function.
  2. Configure the Selenium WebDriver to use this proxy with Selenium Wire.

Here’s a snippet of the code:

from seleniumwire import webdriver
import random

def configure_selenium_with_proxy(proxy_list, chrome_options):
    proxy = random.choice(proxy_list)
    # Selenium Wire expects the proxy as a mapping of scheme to proxy URL,
    # not a bare string
    seleniumwire_options = {
        'proxy': {
            'http': proxy,
            'https': proxy,
            'no_proxy': 'localhost,127.0.0.1',
        }
    }
    return webdriver.Chrome(options=chrome_options, seleniumwire_options=seleniumwire_options)

By implementing this approach with Selenium Wire, each new driver picks up a different proxy, letting you recover from IP bans and keep your scraping uninterrupted.
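When a proxy itself gets banned mid-run, it helps to retry the scrape with a different one. Here’s a minimal sketch of that rotation logic: `scrape_with_rotation` and `scrape_fn` are hypothetical names, where `scrape_fn` is assumed to take a proxy URL, do the actual Selenium work, and raise on failure (e.g. a 403):

```python
import random

def scrape_with_rotation(proxy_list, scrape_fn, max_attempts=3):
    """Try up to max_attempts different random proxies before giving up.

    scrape_fn: callable taking a proxy URL; returns the scraped result on
    success and raises an exception on failure (a hypothetical interface
    for illustration).
    """
    attempts = list(proxy_list)
    random.shuffle(attempts)  # avoid always hammering the same proxy first
    last_error = None
    for proxy in attempts[:max_attempts]:
        try:
            return scrape_fn(proxy)
        except Exception as exc:
            last_error = exc  # remember the failure and move to the next proxy
    raise RuntimeError(f"All {max_attempts} proxies failed") from last_error
```

In practice, `scrape_fn` would build a driver via `configure_selenium_with_proxy` with a single-proxy list, fetch the page, and quit the driver in a `finally` block.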