How to Bypass Captcha when Using Helium by CapSolver

Integrate Helium with CapSolver

1. Introduction: The Challenge of Web Automation

For developers, web automation is a crucial part of the workflow, whether for testing, scraping, or data collection. However, modern websites employ sophisticated anti-bot measures and CAPTCHA challenges that often bring automation scripts to a halt.

The combination of Helium and CapSolver offers an elegant solution to this persistent problem:

  • Helium: A lightweight Python library that wraps Selenium, providing a simple, human-readable API that drastically simplifies browser operations.
  • CapSolver: An AI-powered CAPTCHA solving service capable of handling major challenges like Cloudflare Turnstile, reCAPTCHA, and more.

Together, they let your automation scripts handle CAPTCHA challenges automatically instead of stalling on them.

1.1. Integration Objectives

This guide will walk you through achieving the following core goals:

  1. Simplify Browser Automation: Use Helium’s intuitive API for clean, readable code.
  2. Automate CAPTCHA Solving: Integrate CapSolver’s API to handle CAPTCHA challenges without manual intervention.
  3. Maintain Flexibility: Enjoy the simplicity of Helium while retaining full access to the power of the underlying Selenium WebDriver when needed.

2. Why Choose Helium?

Helium is a Python library designed to make Selenium much easier to use. It provides a high-level API that allows you to write browser automation code that reads like plain English instructions.

2.1. Key Features

  • Simple Syntax: Write click("Submit") instead of complex XPath or CSS selectors.
  • Auto-waiting: Automatically waits for elements to appear before acting on them, reducing the need for explicit wait code.
  • High Readability: Code is clear and instruction-like: write("Hello", into="Search Box").
  • Full Selenium Compatibility: You can access the underlying Selenium driver anytime via get_driver().
  • Lightweight: Minimal overhead on top of Selenium.

2.2. Installation

# Install Helium
pip install helium

# Install requests library for CapSolver API calls
pip install requests

2.3. Basic Usage Example

from helium import *

# Start browser and navigate
start_chrome("https://wikipedia.org")

# Type into the search box
write("Python programming", into=S("input[name='search']"))

# Click the search button
click(Button("Search"))

# Check if text exists
if Text("Python").exists():
    print("Found Python article!")

# Close browser
kill_browser()

3. CapSolver: AI-Powered CAPTCHA Solution

CapSolver is an AI-powered automatic CAPTCHA solving service that supports a wide range of CAPTCHA types. It offers a simple API for submitting CAPTCHA challenges and receiving solutions within seconds.

3.1. Supported CAPTCHA Types

  • Cloudflare Turnstile: One of the most widely deployed modern anti-bot challenges.
  • reCAPTCHA v2: Both image-based and invisible variants.
  • reCAPTCHA v3: Score-based verification.
  • AWS WAF: Amazon Web Services CAPTCHA.
  • DataDome: Enterprise bot protection.
  • And many more…

3.2. Getting Started with CapSolver

  1. Sign up at the CapSolver Dashboard.
  2. Add funds to your account.
  3. Retrieve your API key from the dashboard.

Bonus: Use code HELIUM when registering to receive bonus credits!
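
Once you have the API key from the dashboard, one way to keep it out of source control is to read it from an environment variable rather than hard-coding it. The sketch below is only an illustration: the variable name CAPSOLVER_API_KEY and the helper load_api_key are assumptions made for this example, not something CapSolver prescribes.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "CAPSOLVER_API_KEY",  # hypothetical variable name
) -> str:
    """Return the API key stored in the environment, or raise if it is unset."""
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your CapSolver API key"
        )
    return key
```

The examples later in this guide could then use CAPSOLVER_API_KEY = load_api_key() in place of the "YOUR_API_KEY" placeholder.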

3.3. API Endpoints

  • Server A: https://api.capsolver.com
  • Server B: https://api-stable.capsolver.com
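
The list above names two API hosts without saying when to use each. One plausible reading, and it is only an assumption here, is that Server B can act as a fallback when Server A is unreachable. The sketch below tries each host in order; post_with_fallback and its send parameter are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence

CAPSOLVER_ENDPOINTS = (
    "https://api.capsolver.com",         # Server A
    "https://api-stable.capsolver.com",  # Server B
)


def _requests_post(url: str, payload: dict) -> dict:
    """Default transport: POST JSON with requests and decode the JSON reply."""
    import requests  # imported lazily so the helper can be exercised without it

    return requests.post(url, json=payload, timeout=30).json()


def post_with_fallback(
    path: str,
    payload: dict,
    send: Callable[[str, dict], dict] = _requests_post,
    endpoints: Sequence[str] = CAPSOLVER_ENDPOINTS,
) -> dict:
    """Try each endpoint in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for base_url in endpoints:
        try:
            return send(f"{base_url}{path}", payload)
        except OSError as exc:
            # requests' exceptions derive from OSError, as does ConnectionError.
            last_error = exc
    raise ConnectionError(f"All CapSolver endpoints failed: {last_error}")
```

A call such as post_with_fallback("/createTask", {"clientKey": key, "task": task}) would then only raise once both hosts have failed.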

4. Integration Methods: API vs. Browser Extension

4.1. API Integration (Recommended)

The API integration method provides you with full control over the CAPTCHA solving process and works with any CAPTCHA type supported by CapSolver.

4.1.1. Core Integration Pattern

The following Python functions handle the core logic for creating a task and polling for the result:

import time
import requests
from helium import *

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def create_task(task_payload: dict) -> str:
    """Create a CAPTCHA solving task and return the task ID."""
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": task_payload
        }
    )
    result = response.json()
    if result.get("errorId") != 0:
        raise Exception(f"CapSolver Error: {result.get('errorDescription')}")
    return result["taskId"]


def get_task_result(task_id: str, max_attempts: int = 120) -> dict:
    """Poll for task result until solved or timeout."""
    for _ in range(max_attempts):
        response = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            }
        )
        result = response.json()

        if result.get("status") == "ready":
            return result["solution"]
        elif result.get("status") == "failed":
            raise Exception(f"Task Failed: {result.get('errorDescription')}")

        time.sleep(1)

    raise TimeoutError("CAPTCHA solving timed out")


def solve_captcha(task_payload: dict) -> dict:
    """Complete CAPTCHA solving workflow."""
    task_id = create_task(task_payload)
    return get_task_result(task_id)
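
The functions above take a task_payload dictionary but do not show how to build one. As a rough sketch, the helper below assembles payloads for the three task types used in the examples later in this guide. The helper name build_task_payload and the short kind labels are assumptions made for this illustration; the type strings and field names are the ones that appear in section 5.

```python
from typing import Any, Dict

# Task type strings as used in the examples in section 5.
TASK_TYPES = {
    "recaptcha_v2": "ReCaptchaV2TaskProxyLess",
    "recaptcha_v3": "ReCaptchaV3TaskProxyLess",
    "turnstile": "AntiTurnstileTaskProxyLess",
}


def build_task_payload(
    kind: str, page_url: str, site_key: str, **extra: Any
) -> Dict[str, Any]:
    """Build the "task" object for a createTask request."""
    if kind not in TASK_TYPES:
        raise ValueError(f"Unknown CAPTCHA kind: {kind!r}")
    payload: Dict[str, Any] = {
        "type": TASK_TYPES[kind],
        "websiteURL": page_url,
        "websiteKey": site_key,
    }
    # Optional fields, e.g. pageAction or minScore for reCAPTCHA v3.
    payload.update(extra)
    return payload
```

Combined with the functions above, solve_captcha(build_task_payload("turnstile", page_url, site_key)) would return the solution dictionary for a Turnstile challenge.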

4.2. Browser Extension Integration

You can also use the CapSolver browser extension alongside Helium for automatic CAPTCHA detection and solving.

4.2.1. Installation Steps

  1. Download the CapSolver extension from capsolver.com/en/extension.
  2. Extract the extension files.
  3. Configure your API key: Edit the config.js file within the extension folder.
  4. Load it into Chrome via Helium:

from helium import *
from selenium.webdriver import ChromeOptions

options = ChromeOptions()
# Ensure the path points to the extracted extension folder
options.add_argument('--load-extension=/path/to/capsolver-extension')

start_chrome(options=options)
# The extension will automatically detect and solve CAPTCHAs

Note: The extension must have a valid API key configured before it can solve CAPTCHAs automatically.
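
A wrong extension path is easy to miss when launching Chrome from a script. The sketch below checks that the folder looks like an unpacked Chrome extension (it contains a manifest.json) before building the --load-extension argument. The helper name extension_argument is hypothetical and introduced only for this example.

```python
from pathlib import Path


def extension_argument(extension_dir: str) -> str:
    """Return a --load-extension argument after checking the folder exists."""
    path = Path(extension_dir).expanduser().resolve()
    # Every unpacked Chrome extension has a manifest.json at its root.
    if not (path / "manifest.json").is_file():
        raise FileNotFoundError(
            f"No manifest.json in {path}; is this the extracted extension folder?"
        )
    return f"--load-extension={path}"
```

It could be used as options.add_argument(extension_argument('/path/to/capsolver-extension')) in the snippet above.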

5. Practical Code Examples

5.1. Solving reCAPTCHA v2

This example demonstrates solving reCAPTCHA v2 on Google’s demo page, including automatic site key detection:

import time
import requests
from helium import *
from selenium.webdriver import ChromeOptions

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def solve_recaptcha_v2(site_key: str, page_url: str) -> str:
    """Solve reCAPTCHA v2 and return the token."""
    # Create the task
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": {
                "type": "ReCaptchaV2TaskProxyLess",
                "websiteURL": page_url,
                "websiteKey": site_key,
            }
        }
    )
    result = response.json()

    if result.get("errorId") != 0:
        raise Exception(f"Error: {result.get('errorDescription')}")

    task_id = result["taskId"]
    print(f"Task created: {task_id}")

    # Poll for result
    while True:
        result = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            }
        ).json()

        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
        elif result.get("status") == "failed":
            raise Exception(f"Failed: {result.get('errorDescription')}")

        print("  Waiting for solution...")
        time.sleep(1)


def main():
    target_url = "https://www.google.com/recaptcha/api2/demo"

    # Configure browser with anti-detection options
    options = ChromeOptions()
    options.add_experimental_option('excludeSwitches', ['enable-automation'])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--disable-blink-features=AutomationControlled')

    print("Starting browser...")
    start_chrome(target_url, options=options)
    driver = get_driver()

    try:
        time.sleep(2)

        # Auto-detect site key from page
        recaptcha_element = driver.find_element("css selector", ".g-recaptcha")
        site_key = recaptcha_element.get_attribute("data-sitekey")
        print(f"Detected site key: {site_key}")

        # Solve the CAPTCHA with CapSolver
        print("\nSolving reCAPTCHA v2 with CapSolver...")
        token = solve_recaptcha_v2(site_key, target_url)
        print(f"Got token: {token[:50]}...")

        # Inject the token
        print("\nInjecting token...")
        driver.execute_script(f'''
            var responseField = document.getElementById('g-recaptcha-response');
            responseField.style.display = 'block';
            responseField.value = '{token}';
        ''')
        print("Token injected!")

        # Submit using Helium's simple syntax
        print("\nSubmitting form...")
        click("Submit")

        time.sleep(3)

        # Check for success
        if "Verification Success" in driver.page_source:
            print("\n=== SUCCESS! ===")
            print("reCAPTCHA was solved and form was submitted!")

    finally:
        kill_browser()


if __name__ == "__main__":
    main()
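
The example above interpolates the token directly into the JavaScript source with an f-string. An alternative, sketched here as one possible variant rather than the author's method, is to pass the token as an argument to Selenium's execute_script, which exposes it to the script as arguments[0] and avoids any quoting problems. The helper name inject_recaptcha_token is hypothetical.

```python
INJECT_RECAPTCHA_JS = """
var field = document.getElementById('g-recaptcha-response');
if (!field) { return false; }
field.value = arguments[0];
return true;
"""


def inject_recaptcha_token(driver, token: str) -> bool:
    """Write the token into the response field; return False if it is missing."""
    # execute_script(script, *args) makes extra arguments available in
    # JavaScript as arguments[0], arguments[1], and so on.
    return bool(driver.execute_script(INJECT_RECAPTCHA_JS, token))
```

With Helium, the driver argument would come from get_driver(), for example inject_recaptcha_token(get_driver(), token).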

5.2. Solving Cloudflare Turnstile

Cloudflare Turnstile is another common challenge. Here is how to solve it:

import time
import requests
from helium import *
from selenium.webdriver import ChromeOptions

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def solve_turnstile(site_key: str, page_url: str) -> str:
    """Solve Cloudflare Turnstile and return the token."""
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": {
                "type": "AntiTurnstileTaskProxyLess",
                "websiteURL": page_url,
                "websiteKey": site_key,
            }
        }
    )
    result = response.json()

    if result.get("errorId") != 0:
        raise Exception(f"Error: {result.get('errorDescription')}")

    task_id = result["taskId"]

    while True:
        result = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            }
        ).json()

        if result.get("status") == "ready":
            return result["solution"]["token"]
        elif result.get("status") == "failed":
            raise Exception(f"Failed: {result.get('errorDescription')}")

        time.sleep(1)


def main():
    target_url = "https://your-target-site.com"
    turnstile_site_key = "0x4XXXXXXXXXXXXXXXXX"  # Find this in the page source

    # Configure browser
    options = ChromeOptions()
    options.add_argument('--disable-blink-features=AutomationControlled')

    start_chrome(target_url, options=options)
    driver = get_driver()

    try:
        # Wait for Turnstile to load
        time.sleep(3)

        # Solve the CAPTCHA
        print("Solving Turnstile...")
        token = solve_turnstile(turnstile_site_key, target_url)
        print(f"Got token: {token[:50]}...")

        # Inject the token
        driver.execute_script(f'''
            document.querySelector('input[name="cf-turnstile-response"]').value = "{token}";

            // Trigger callback if present
            const callback = document.querySelector('[data-callback]');
            if (callback) {{
                const callbackName = callback.getAttribute('data-callback');
                if (window[callbackName]) {{
                    window[callbackName]('{token}');
                }}
            }}
        ''')

        # Submit the form using Helium
        if Button("Submit").exists():
            click("Submit")

        print("Turnstile bypassed!")

    finally:
        kill_browser()


if __name__ == "__main__":
    main()

5.3. Solving reCAPTCHA v3

reCAPTCHA v3 is score-based and does not require user interaction:

import time
import requests
from helium import *
from selenium.webdriver import ChromeOptions

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def solve_recaptcha_v3(
    site_key: str,
    page_url: str,
    action: str = "verify",
    min_score: float = 0.7
) -> str:
    """Solve reCAPTCHA v3 with specified action and minimum score."""
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": {
                "type": "ReCaptchaV3TaskProxyLess",
                "websiteURL": page_url,
                "websiteKey": site_key,
                "pageAction": action,
                "minScore": min_score
            }
        }
    )
    result = response.json()

    if result.get("errorId") != 0:
        raise Exception(f"Error: {result.get('errorDescription')}")

    task_id = result["taskId"]

    while True:
        result = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            }
        ).json()

        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
        elif result.get("status") == "failed":
            raise Exception(f"Failed: {result.get('errorDescription')}")

        time.sleep(1)


def main():
    target_url = "https://your-target-site.com"
    recaptcha_v3_key = "6LcXXXXXXXXXXXXXXXXXXXXXXXXX"

    # Set up a headless browser for v3
    options = ChromeOptions()
    options.add_argument('--headless')

    start_chrome(target_url, options=options)
    driver = get_driver()

    try:
        # Solve reCAPTCHA v3 with "login" action
        print("Solving reCAPTCHA v3...")
        token = solve_recaptcha_v3(
            recaptcha_v3_key,
            target_url,
            action="login",
            min_score=0.9
        )

        # Inject the token
        driver.execute_script(f'''
            var responseField = document.querySelector('[name="g-recaptcha-response"]');
            if (responseField) {{
                responseField.value = '{token}';
            }}
            // Call callback if exists
            if (typeof onRecaptchaSuccess === 'function') {{
                onRecaptchaSuccess('{token}');
            }}
        ''')

        print("reCAPTCHA v3 bypassed!")

    finally:
        kill_browser()


if __name__ == "__main__":
    main()

6. Best Practices and Tips

6.1. Browser Configuration: Mimicking Human Users

To prevent your automation script from being easily detected, configure Chrome options to make it appear more like a regular browser:

from helium import *
from selenium.webdriver import ChromeOptions

options = ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-automation'])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_argument('--window-size=1920,1080') # Set window size

start_chrome(options=options)

6.2. Combining the Best of Helium and Selenium

Use Helium’s simple syntax for most operations, but access the underlying Selenium driver for complex, low-level control:

from helium import *

start_chrome("https://target-site.com")

# Use Helium for simple interactions
write("username", into="Email")
write("password", into="Password")

# Access Selenium driver for complex operations
driver = get_driver()
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)") # Scroll to bottom

# Switch back to Helium
click("Login")

6.3. Rate Limiting and Random Delays

Avoid triggering rate limits by adding random delays to mimic human behavior:

import random
import time

from helium import *

def human_delay(min_sec=1.0, max_sec=3.0):
    """Random delay to mimic human behavior."""
    time.sleep(random.uniform(min_sec, max_sec))

# Use between actions
click("Next")
human_delay()
write("data", into="Input")

6.4. Error Handling and Retry Logic

Implement robust error handling and retry logic for CAPTCHA solving:

import time

# Relies on solve_captcha() from section 4.1.1.
def solve_with_retry(task_payload: dict, max_retries: int = 3) -> dict:
    """Solve CAPTCHA with retry logic."""
    for attempt in range(max_retries):
        try:
            return solve_captcha(task_payload)
        except TimeoutError:
            if attempt < max_retries - 1:
                print(f"Timeout, retrying... ({attempt + 1}/{max_retries})")
                time.sleep(5)
            else:
                raise
        except Exception as e:
            if "balance" in str(e).lower():
                raise  # Do not retry balance errors
            if attempt < max_retries - 1:
                time.sleep(2)
            else:
                raise
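
The retry helper above waits a fixed 5 or 2 seconds between attempts. A common alternative, shown here as a swapped-in technique rather than part of the original code, is exponential backoff with optional random jitter. The function name backoff_delays and its default values are assumptions chosen for illustration.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    base: float = 2.0,
    factor: float = 2.0,
    cap: float = 30.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one delay (in seconds) per retry attempt, growing exponentially."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(attempts):
        # base, base*factor, base*factor^2, ... limited to at most `cap`.
        delay = min(cap, base * factor ** attempt)
        # Add up to `jitter` seconds so parallel workers do not retry in lockstep.
        delay += rng.uniform(0, jitter)
        delays.append(delay)
    return delays
```

Inside a retry loop, time.sleep(delays[attempt]) could then replace the fixed time.sleep(5) and time.sleep(2) calls.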

7. Helium vs. Selenium: A Quick Comparison

Operation         | Selenium                                                           | Helium
------------------|--------------------------------------------------------------------|--------------------------------------
Click button      | driver.find_element(By.XPATH, "//button[text()='Submit']").click() | click("Submit")
Type text         | driver.find_element(By.NAME, "email").send_keys("test@test.com")   | write("test@test.com", into="Email")
Press Enter       | element.send_keys(Keys.ENTER)                                      | press(ENTER)
Check text exists | "Welcome" in driver.page_source                                    | Text("Welcome").exists()

8. Conclusion

The integration of Helium and CapSolver creates a powerful and elegant toolkit for web automation:

  • Helium provides a clean, highly readable API for browser automation.
  • CapSolver handles the core obstacle of CAPTCHAs with AI-powered solving.
  • Together, they enable developers to achieve seamless automation with minimal, maintainable code.

Whether you are building web scrapers, automated testing systems, or data collection pipelines, this combination keeps your code simple without giving up functionality.

Bonus: Use code HELIUM when signing up at CapSolver to receive bonus credits!

9. Frequently Asked Questions (FAQ)

9.1. Why choose Helium over pure Selenium?

Helium makes Selenium easier to use by offering:

  • Much simpler, human-readable syntax.
  • Automatic waiting for elements.
  • Less verbose code.
  • Full access to Selenium when needed.
  • Faster development time.

9.2. Which CAPTCHA types work best with this integration?

CapSolver supports all major CAPTCHA types. Cloudflare Turnstile and reCAPTCHA v2/v3 have the highest success rates. The integration works seamlessly with any CAPTCHA that CapSolver supports.

9.3. Can I use this in headless mode?

Yes! Helium supports headless mode via ChromeOptions. For reCAPTCHA v3 and other token-based CAPTCHAs, headless mode generally works well. For v2 visual CAPTCHAs, headed mode may provide better results.

9.4. How do I find the site key for a CAPTCHA?

Look in the page source for:

  • Turnstile: data-sitekey attribute or cf-turnstile elements.
  • reCAPTCHA: data-sitekey attribute on the g-recaptcha div.
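
As a minimal sketch of that lookup, the function below scans a page's HTML for data-sitekey attributes using only the standard library. The name find_site_keys is hypothetical. It also assumes the key is present as an attribute in the markup; pages that pass the key to the widget from JavaScript will not be covered by this approach.

```python
from html.parser import HTMLParser
from typing import List


class _SiteKeyParser(HTMLParser):
    """Collect the value of every data-sitekey attribute in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.site_keys: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        for name, value in attrs:
            if name == "data-sitekey" and value:
                self.site_keys.append(value)


def find_site_keys(page_source: str) -> List[str]:
    """Return all data-sitekey values found in the HTML, in document order."""
    parser = _SiteKeyParser()
    parser.feed(page_source)
    return parser.site_keys
```

With Helium, find_site_keys(get_driver().page_source) would return the keys present on the currently loaded page.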

9.5. What if CAPTCHA solving fails?

Common solutions:

  1. Verify your API key and balance.
  2. Ensure the site key is correct.
  3. Check that the page URL matches where the CAPTCHA appears.
  4. For v3, try adjusting the action parameter and minimum score.
  5. Implement retry logic with delays.

9.6. Can I still use Selenium features with Helium?

Yes! Call get_driver() to access the underlying Selenium WebDriver for any operation Helium doesn’t cover directly.

Note: The code examples in this article are for demonstration purposes only. Replace the placeholder API key, site keys, and target URLs with your own values.
