Browser Action

The POST /browser/action endpoint executes browser-level actions within an active adaptive-mode BBRE (Browser-Backed Request Engine) session. Browser actions give you direct control over a real browser instance running inside the BBRE engine, allowing you to navigate to URLs, click elements, fill input fields, type text character by character, select dropdown options, take screenshots, execute JavaScript, scroll the page, manage cookies, and wait for elements to appear. This is the core endpoint that transforms BBRE from a simple HTTP request proxy into a full browser automation platform.

Every browser action operates within the context of an existing session, which means the browser remembers its state between actions. If you navigate to a login page and fill in credentials with one action, the browser retains those filled values when you send the next action to click the submit button. Cookies set by the website persist across actions, JavaScript state is maintained, and the DOM reflects all previous interactions. This stateful behavior is what makes browser actions powerful for automating complex web workflows like multi-step form submissions, authenticated data extraction, checkout processes, and any scenario where you need to interact with a website the same way a human user would.

Browser actions are exclusively available in adaptive mode sessions. If you attempt to execute a browser action on a passive mode session, the action will fail because passive sessions do not maintain a browser instance. You must create your session with "mode": "adaptive" to use this endpoint.

The BBRE engine supports over 30 distinct browser actions organized into categories: navigation, interaction, query, wait, scroll, cookies, JavaScript execution, screenshot, and batch operations. Each action accepts specific parameters that control its behavior, and returns results appropriate to the action type. This page documents every available action with its parameters, provides working code examples in JavaScript, Python, and cURL, demonstrates SDK usage through the BrowserAPI class, and covers real-world workflow patterns that combine multiple actions into complete automation sequences.

Adaptive Mode Required

Browser actions only work with sessions created in "adaptive" mode. Adaptive mode sessions maintain a full browser instance with JavaScript execution, DOM rendering, and navigation history, and are designed for full browser automation where you need to interact with page elements, execute JavaScript, or handle websites that require a real browser environment. Passive mode sessions are designed for lightweight HTTP-level interactions where cookie persistence is sufficient, so they cannot run browser actions. If your session was created in "passive" mode, create a new session with "mode": "adaptive" before calling the POST /browser/action endpoint. To create an adaptive session, see the Create Session documentation.

Endpoint

POST https://bbre-solver-api.mydisct.com/browser/action

Authentication

All requests to the BBRE API require authentication via an API key. You must include your API key in the HTTP headers of every request. The API accepts the key through either the x-api-key header or the apikey header. Both header names are supported and functionally identical. If no API key is provided, the API returns a 401 error with the API_KEY_REQUIRED error code. If the provided key is invalid or belongs to a suspended or inactive account, the API returns a 403 error with the appropriate error code (INVALID_API_KEY, ACCOUNT_SUSPENDED, or ACCOUNT_INACTIVE).

Header Type Required Description
x-api-key string required Your BBRE API key. You can find this in your MyDisct Solver dashboard.
Content-Type string required Must be set to application/json for all requests.
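
The header rules above can be sketched in Python as follows. This is a minimal illustration rather than an official client: the build_headers function and its header_name argument are hypothetical, while the two accepted header names and the Content-Type requirement come from this page.

```python
# The two header names this page says the API accepts for the key.
ACCEPTED_KEY_HEADERS = ("x-api-key", "apikey")


def build_headers(api_key, header_name="x-api-key"):
    """Return the HTTP headers for a BBRE API request (hypothetical helper)."""
    if not api_key:
        # The API would answer 401 API_KEY_REQUIRED; fail early on the client.
        raise ValueError("an API key is required")
    if header_name not in ACCEPTED_KEY_HEADERS:
        raise ValueError(f"unsupported key header: {header_name}")
    return {
        "Content-Type": "application/json",
        header_name: api_key,
    }


headers = build_headers("YOUR_API_KEY")
print(headers)
```

Validating on the client side only saves a round trip; the API remains the authority on whether a key is valid.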

Request Body Parameters

The request body must be a JSON object containing the session identifier, the action to execute, and optionally an object of action-specific parameters. The sessionId identifies which adaptive session the action should run in, the action string specifies which browser action to perform, and the params object provides the inputs that the action needs. Different actions accept different parameters, which are documented in detail in the action-specific sections below. If you omit the params field, it defaults to an empty object {}, which is valid for actions that do not require any parameters like back, forward, getTitle, getUrl, and html.

Parameter Type Required Default Description
sessionId string required - The unique identifier of the adaptive mode session in which to execute the browser action. This must be a valid, active, non-expired session that was created with "mode": "adaptive". You receive this ID from the POST /session/create response.
action string required - The name of the browser action to execute. Must be one of the supported action names listed in the Available Actions section below. Action names are case-sensitive and must be written exactly as listed, including the camelCase names (e.g., "navigate", "click", "getTitle", "waitForElement").
params object optional {} An object containing action-specific parameters. The structure and required fields depend on the action being executed. For example, the navigate action requires a url field, while the click action requires a selector field. Actions that do not require parameters (like back, forward, getTitle) can omit this field entirely.
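
As a minimal sketch of the request body described above, the following Python helper assembles the JSON payload and defaults params to an empty object. The function name build_action_body is hypothetical and not part of any BBRE SDK; only the sessionId, action, and params field names come from this page.

```python
import json


def build_action_body(session_id, action, params=None):
    """Assemble the JSON body for POST /browser/action (hypothetical helper)."""
    if not session_id:
        raise ValueError("sessionId is required")
    if not action:
        raise ValueError("action is required")
    # An omitted params field defaults to an empty object.
    return {
        "sessionId": session_id,
        "action": action,
        "params": dict(params) if params else {},
    }


# Actions such as getTitle need no parameters.
title_body = build_action_body("YOUR_SESSION_ID", "getTitle")
# navigate requires a url field inside params.
nav_body = build_action_body(
    "YOUR_SESSION_ID", "navigate", {"url": "https://example.com"}
)
print(json.dumps(title_body))
print(json.dumps(nav_body))
```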

Available Actions Overview

The BBRE browser engine supports a comprehensive set of actions organized into functional categories. The following table provides a quick reference of every available action, its category, and a brief description. Detailed documentation for each action including parameters, defaults, and examples is provided in the category-specific sections that follow.

Category Action Description
Navigation navigate Navigate the browser to a specified URL and wait for the page to load.
Navigation goto Alias for navigate. Functionally identical.
Navigation reload Reload the current page.
Navigation back Navigate back in the browser history, equivalent to clicking the back button.
Navigation forward Navigate forward in the browser history, equivalent to clicking the forward button.
Interaction click Click an element identified by a CSS selector.
Interaction fill Fill an input field with text, replacing any existing content.
Interaction type Type text character by character with configurable delay between keystrokes.
Interaction select Select an option from a dropdown element by value.
Interaction selectDropdown Alias for select. Functionally identical.
Interaction clickCheckboxByText Click a checkbox identified by its label text.
Interaction fillForm Fill multiple form fields at once and optionally submit the form.
Interaction clear Clear the content of an input field.
Query html Get the full HTML content of the current page.
Query getText Get the text content of an element identified by a CSS selector.
Query getTitle Get the title of the current page.
Query getUrl Get the current URL of the browser.
Query find Find an element by CSS selector and return information about it.
Query findByText Find an element by its text content.
Screenshot screenshot Take a screenshot of the current page and return it as base64-encoded data.
Wait wait Wait for a specified number of seconds before continuing.
Wait waitForElement Wait for an element to appear on the page, with configurable timeout and polling.
Scroll scroll Perform a custom scroll operation with specified options.
Scroll scrollDown Scroll the page down by a specified number of pixels.
Scroll scrollUp Scroll the page up by a specified number of pixels.
Scroll scrollToTop Scroll to the top of the page.
Scroll scrollToBottom Scroll to the bottom of the page.
Cookies getCookies Get all cookies from the current browser context.
Cookies setCookie Set a cookie in the browser context.
Cookies deleteCookies Delete all cookies from the browser context.
JavaScript execute Execute JavaScript code in the page context and return the result.
JavaScript evaluate Alias for execute. Functionally identical.
Batch batch Execute multiple browser actions in sequence within a single API call.
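
Three of the actions in the table are aliases, so client code that logs or compares action names may want to normalize them first. The sketch below assumes you do this on the client side; the canonical_action helper is hypothetical, and the API itself accepts either name.

```python
# Alias pairs taken from the table above. The mapping is a client-side
# convenience, not something the API requires.
ACTION_ALIASES = {
    "goto": "navigate",
    "selectDropdown": "select",
    "evaluate": "execute",
}


def canonical_action(name):
    """Return the primary action name for an alias, else the name unchanged."""
    return ACTION_ALIASES.get(name, name)


print(canonical_action("goto"))
print(canonical_action("click"))
```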

Interaction Actions

Interaction actions let you interact with page elements the same way a human user would. You can click buttons and links, fill input fields, type text with realistic keystroke timing, select options from dropdowns, check checkboxes, fill entire forms at once, and clear input fields. These actions are the building blocks for automating any web workflow that involves user input.

click

Clicks an element identified by a CSS selector. The BBRE engine locates the element on the page, scrolls it into view if necessary, and performs a mouse click on it. The optional delay parameter adds a pause before the click, which can help with websites that detect instantaneous clicks as bot behavior.

Parameter Type Required Default Description
selector string required - CSS selector identifying the element to click. Examples: "#submit-btn", ".login-button", "button[type='submit']".
delay number optional 0.1 Delay in seconds before performing the click. A small delay makes the interaction appear more human-like.

fill

Fills an input field with the specified text, replacing any existing content in the field. This action first clears the field and then sets the value directly. It is faster than type because it does not simulate individual keystrokes. Use fill when speed is more important than realism, and use type when you need to simulate realistic typing behavior for bot detection evasion.

Parameter Type Required Default Description
selector string required - CSS selector identifying the input field to fill. Examples: "#email", "input[name='username']".
text string required - The text to fill into the input field. Any existing content in the field is replaced.

type

Types text into an input field character by character with a configurable delay between keystrokes. Unlike fill, which sets the value instantly, type simulates realistic keyboard input by dispatching individual key events for each character. This is important for websites that monitor typing patterns, validate input on each keystroke, or use JavaScript event listeners on keydown/keyup events. The delay parameter controls the time between each keystroke in seconds.

Parameter Type Required Default Description
selector string required - CSS selector identifying the input field to type into.
text string required - The text to type into the field, one character at a time.
delay number optional 0.05 Delay in seconds between each keystroke. The default of 0.05 seconds (50 milliseconds) simulates a fast but realistic typing speed.
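
The trade-off between fill and type can be sketched as a small Python helper that picks one of the two actions and estimates how long typing would take. Both function names are hypothetical, and the estimate assumes roughly one delay per character, which this page does not state explicitly.

```python
def build_text_entry(selector, text, realistic=False, delay=0.05):
    """Pick fill (fast) or type (keystroke by keystroke) for a text field."""
    if realistic:
        return {
            "action": "type",
            "params": {"selector": selector, "text": text, "delay": delay},
        }
    return {"action": "fill", "params": {"selector": selector, "text": text}}


def estimate_typing_seconds(text, delay=0.05):
    """Rough estimate, assuming about one delay per typed character."""
    return len(text) * delay


fast = build_text_entry("#email", "myuser")
slow = build_text_entry("#email", "myuser", realistic=True)
print(fast["action"], slow["action"])
print(estimate_typing_seconds("x" * 20))
```

With the default delay, a 20-character string takes on the order of one second to type, which is worth factoring into any request timeout you set on the client.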

select / selectDropdown

Selects an option from a <select> dropdown element by its value attribute. The selectDropdown action is an alias for select and behaves identically. The selector parameter identifies the dropdown element, and the value parameter specifies which option to select.

Parameter Type Required Default Description
selector string required - CSS selector identifying the <select> dropdown element.
value string required - The value attribute of the option to select. This must match the value attribute of one of the <option> elements inside the dropdown.

clickCheckboxByText

Clicks a checkbox identified by its associated label text. This is useful when checkboxes do not have predictable IDs or CSS selectors, but their labels contain identifiable text. The exact parameter controls whether the text match must be exact or can be a partial match.

Parameter Type Required Default Description
text string required - The label text associated with the checkbox. The engine searches for a checkbox whose label contains this text.
exact boolean optional false When true, the label text must match exactly. When false (default), a partial match is sufficient.

fillForm

Fills multiple form fields at once using a key-value mapping of CSS selectors to values. This is a convenience action that eliminates the need to send separate fill actions for each field. The optional submit parameter automatically submits the form after all fields are filled, which saves an additional click action on the submit button.

Parameter Type Required Default Description
formData object required - An object where keys are CSS selectors and values are the text to fill into each field. Example: {"#email": "user@example.com", "#password": "secret123"}
submit boolean optional false When true, the form is automatically submitted after all fields are filled. When false, the fields are filled but the form is not submitted.
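
A minimal Python sketch of building the fillForm parameters from a dictionary follows. The build_fill_form_params name is hypothetical, and rejecting an empty mapping is a client-side assumption rather than documented API behavior.

```python
def build_fill_form_params(fields, submit=False):
    """Build the params object for the fillForm action (hypothetical helper).

    fields maps CSS selectors to the text for each field.
    """
    if not fields:
        # Assumption: an empty form is treated as a caller mistake.
        raise ValueError("formData needs at least one selector")
    return {
        "formData": {selector: str(value) for selector, value in fields.items()},
        "submit": bool(submit),
    }


params = build_fill_form_params(
    {"#username": "myuser", "#password": "mypassword"}, submit=True
)
print(params)
```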

clear

Clears the content of an input field identified by a CSS selector. This removes all text from the field, leaving it empty. Use this before fill or type if you need to ensure the field is empty before entering new content.

Parameter Type Required Default Description
selector string required - CSS selector identifying the input field to clear.

Query Actions

Query actions let you extract information from the current page without modifying it. You can retrieve the full page HTML, get the text content of specific elements, read the page title and URL, and find elements by CSS selector or text content. These actions are essential for verifying that your navigation and interaction actions produced the expected results, and for extracting data from pages after you have navigated to them.

html

Returns the full HTML content of the current page. This action does not accept any parameters. The returned HTML includes the complete DOM as rendered by the browser, including any content generated by JavaScript. This is useful for parsing page content, extracting data, or debugging what the browser is currently displaying.

getText

Returns the text content of an element identified by a CSS selector. This extracts only the visible text, stripping all HTML tags. Use this to read the content of specific page elements like headings, paragraphs, table cells, or status messages.

Parameter Type Required Default Description
selector string required - CSS selector identifying the element whose text content to retrieve.

getTitle

Returns the title of the current page as defined by the <title> tag. This action does not accept any parameters. The page title is useful for verifying that navigation landed on the expected page.

getUrl

Returns the current URL of the browser. This action does not accept any parameters. The URL reflects the actual location of the browser after all navigations and redirects. Use this to verify that the browser is on the expected page, especially after form submissions or login attempts that may redirect to different URLs.

find

Finds an element by CSS selector and returns information about it. This is useful for checking whether an element exists on the page before attempting to interact with it.

Parameter Type Required Default Description
selector string required - CSS selector identifying the element to find.

findByText

Finds an element by its text content. This searches the page for elements containing the specified text and returns information about the first match. This is particularly useful when elements do not have predictable CSS selectors but contain identifiable text.

Parameter Type Required Default Description
text string required - The text content to search for on the page.

Wait Actions

Wait actions pause execution until a condition is met or a specified time has elapsed. These are critical for handling dynamic web pages where content loads asynchronously after the initial page load. Without proper waiting, your subsequent actions might fail because the target elements have not appeared yet. The BBRE engine provides both simple time-based waiting and intelligent element-based waiting with configurable polling intervals.

wait

Pauses execution for a specified number of seconds. This is a simple time-based wait that is useful when you know approximately how long to wait for a page transition, animation, or asynchronous operation to complete. For more precise waiting, use waitForElement instead, which waits for a specific condition rather than a fixed time.

Parameter Type Required Default Description
seconds number required - The number of seconds to wait. Accepts decimal values for sub-second precision (e.g., 0.5 for half a second).

waitForElement

Waits for an element to appear on the page. You can identify the target element by text content, CSS selector, or both. The engine polls the page at regular intervals checking for the element's presence. This is the recommended way to handle dynamic content because it waits only as long as necessary rather than using a fixed delay. The throwOnTimeout parameter controls whether the action fails or succeeds silently when the element does not appear within the timeout period.

Parameter Type Required Default Description
text string optional - Text content to search for. The engine waits for an element containing this text to appear on the page. Provide either text or selector (or both).
selector string optional - CSS selector to wait for. The engine waits for an element matching this selector to appear. Provide either text or selector (or both).
timeout number optional 30 Maximum time in seconds to wait for the element. If the element does not appear within this time, the behavior depends on the throwOnTimeout setting.
pollInterval number optional 0.5 How often in seconds to check for the element. Lower values detect the element faster but use more resources. The default of 0.5 seconds is a good balance for most use cases.
visible boolean optional true When true, the element must be visible (not hidden by CSS) to satisfy the wait condition. When false, the element only needs to exist in the DOM.
throwOnTimeout boolean optional false When true, the action fails with an error if the element does not appear within the timeout. When false (default), the action completes successfully even if the element was not found, allowing you to check the result and decide how to proceed.
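
The parameters above can be assembled as in the following Python sketch, which also computes how many polls fit into the timeout window. The helper names are hypothetical, and the poll count assumes the engine checks at a fixed interval, which is inferred from the description rather than confirmed.

```python
import math


def build_wait_for_element_params(text=None, selector=None, timeout=30,
                                  poll_interval=0.5, visible=True,
                                  throw_on_timeout=False):
    """Build params for waitForElement, requiring text, selector, or both."""
    if text is None and selector is None:
        raise ValueError("provide text, selector, or both")
    params = {
        "timeout": timeout,
        "pollInterval": poll_interval,
        "visible": visible,
        "throwOnTimeout": throw_on_timeout,
    }
    if text is not None:
        params["text"] = text
    if selector is not None:
        params["selector"] = selector
    return params


def max_polls(timeout=30, poll_interval=0.5):
    """Upper bound on the number of checks that fit in the timeout window."""
    return math.ceil(timeout / poll_interval)


params = build_wait_for_element_params(selector="#results", throw_on_timeout=True)
print(params)
print(max_polls())
```

With the documented defaults (30 second timeout, 0.5 second interval) this gives at most 60 checks. Because throwOnTimeout defaults to false, a caller that relies on the default should inspect the result before assuming the element exists.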

Scroll Actions

Scroll actions control the scroll position of the page. Scrolling is important for interacting with elements that are not currently visible in the viewport, triggering lazy-loaded content, and simulating realistic user behavior. Many websites load additional content as the user scrolls down (infinite scroll), and some anti-bot systems track scroll behavior as a signal of human interaction. The BBRE engine provides both directional scrolling with configurable pixel amounts and absolute scrolling to the top or bottom of the page.

scroll

Performs a custom scroll operation with the specified options. This is the most flexible scroll action, allowing you to control the exact scroll behavior through the options parameter.

Parameter Type Required Default Description
options object optional {} Scroll configuration options. The exact properties depend on the scroll behavior you want to achieve.

scrollDown

Scrolls the page down by a specified number of pixels. This is the most commonly used scroll action for loading lazy content and bringing lower page elements into view.

Parameter Type Required Default Description
pixels number optional 500 The number of pixels to scroll down. The default of 500 pixels is somewhat less than one viewport height on typical desktop screens, so consecutive scrolls overlap slightly.

scrollUp

Scrolls the page up by a specified number of pixels. Use this to return to previously viewed content or to bring upper page elements back into view.

Parameter Type Required Default Description
pixels number optional 500 The number of pixels to scroll up.

scrollToTop

Scrolls to the very top of the page. This action does not accept any parameters. It is equivalent to pressing the Home key in a browser.

scrollToBottom

Scrolls to the very bottom of the page. This action does not accept any parameters. It is useful for triggering lazy-loaded content at the bottom of the page or for reaching footer elements.
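
For lazy-loaded pages, one common pattern is to scroll in steps and pause between them so new content has time to load. The Python sketch below builds such a sequence from the scrollDown and wait actions documented on this page. The build_scroll_steps name and the one-second pause are assumptions chosen for illustration.

```python
def build_scroll_steps(total_pixels, step_pixels=500, pause_seconds=1.0):
    """Split a long scroll into scrollDown steps separated by wait actions."""
    if step_pixels <= 0:
        raise ValueError("step_pixels must be positive")
    steps = []
    remaining = total_pixels
    while remaining > 0:
        amount = min(step_pixels, remaining)
        steps.append({"action": "scrollDown", "params": {"pixels": amount}})
        steps.append({"action": "wait", "params": {"seconds": pause_seconds}})
        remaining -= amount
    return steps


plan = build_scroll_steps(1200)
for step in plan:
    print(step)
```

Each entry can be sent as its own request, or the whole list can be passed as the actions array of a batch action.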

JavaScript Execution

JavaScript execution actions let you run arbitrary JavaScript code in the context of the current page. This is the most powerful and flexible action type because it gives you direct access to the page's DOM, JavaScript variables, and browser APIs. You can use it to extract complex data structures, modify page behavior, trigger custom events, interact with JavaScript frameworks, or perform any operation that is possible through the browser's developer console. The evaluate action is an alias for execute and behaves identically.

execute / evaluate

Executes JavaScript code in the page context and returns the result. The script is wrapped and executed within the page, meaning it has access to the page's document, window, and all JavaScript variables and functions defined by the page. The return value of your script is serialized and included in the action result. If your script throws an error, the action fails with the error message.

Parameter Type Required Default Description
script string required - The JavaScript code to execute in the page context. The code has full access to the page DOM and JavaScript environment. The return value is included in the action result.
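
When a script is assembled from variable input, quoting mistakes are easy to make. The following Python sketch builds the script string for an element count and uses json.dumps to produce a safely quoted JavaScript string literal. The build_count_script helper is hypothetical; the top-level return statement mirrors the style of the request example on this page.

```python
import json


def build_count_script(selector):
    """Return a script that counts elements matching a CSS selector.

    json.dumps yields a double-quoted literal that is also valid JavaScript,
    so quotes inside the selector cannot break out of the string.
    """
    quoted = json.dumps(selector)
    return f"return document.querySelectorAll({quoted}).length;"


params = {"script": build_count_script("a[href='/cart']")}
print(params["script"])
```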

Screenshot

The screenshot action captures a visual snapshot of the current page and returns it as base64-encoded image data. Screenshots are invaluable for debugging automation workflows, verifying that the browser is displaying the expected content, monitoring visual changes on pages, and creating visual records of completed workflows. The screenshot captures the full visible viewport as rendered by the browser, including all CSS styling, images, and dynamically generated content.

screenshot

Parameter Type Required Default Description
options object optional {} Screenshot configuration options. Controls the format and scope of the captured image.

Batch Actions

The batch action lets you execute multiple browser actions in sequence within a single API call. Instead of making separate HTTP requests for each action, you can bundle them into one batch request. This reduces network overhead and latency, especially when you need to perform a series of related actions like navigating to a page, filling a form, and clicking submit. Each action in the batch is executed in order, and the results of all actions are returned together. The optional stopOnError parameter controls whether the batch stops at the first failure or continues executing remaining actions.

batch

Parameter Type Required Default Description
actions array required - An array of action objects, each containing an action string and an optional params object. Actions are executed in the order they appear in the array. Example: [{"action": "navigate", "params": {"url": "https://example.com"}}, {"action": "click", "params": {"selector": "#btn"}}]
stopOnError boolean optional false When true, the batch stops executing at the first action that fails. When false (default), the batch continues executing remaining actions even if one fails.
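
A login workflow bundled into one batch request might look like the Python sketch below. The selectors are placeholders borrowed from the request examples on this page, and the build_login_batch helper is hypothetical. Setting stopOnError to true is a design choice here: clicking submit makes little sense if an earlier fill failed.

```python
def build_login_batch(session_id, login_url, username, password):
    """Build a full request body that runs a login sequence as one batch."""
    actions = [
        {"action": "navigate", "params": {"url": login_url}},
        {"action": "fill", "params": {"selector": "#username", "text": username}},
        {"action": "fill", "params": {"selector": "#password", "text": password}},
        {"action": "click", "params": {"selector": "#login-button"}},
        # params is optional for actions that take no parameters.
        {"action": "getUrl"},
    ]
    return {
        "sessionId": session_id,
        "action": "batch",
        "params": {"actions": actions, "stopOnError": True},
    }


body = build_login_batch(
    "YOUR_SESSION_ID", "https://example.com/login", "myuser", "mypassword"
)
print([step["action"] for step in body["params"]["actions"]])
```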

Response Format

The browser action endpoint returns a JSON response indicating whether the action was executed successfully. On success, the response includes the action-specific result and optionally a browser profile object. On failure, the response includes an error code and a descriptive message. The response structure varies slightly depending on whether the action succeeded, failed with a partial result, or failed completely.

Success Response (200)

A successful browser action returns the following structure. The result field contains the output of the action, which varies by action type. For example, a getTitle action returns the page title string, a screenshot action returns base64 image data, and a navigate action returns navigation status information. The profile field contains the browser profile data if the BBRE engine generated or updated a profile during the action.

JSON
{
  "success": true,
  "service": "MyDisct Solver BBRE",
  "result": {
    "title": "Example Domain"
  },
  "profile": {
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "platform": "Win32",
    "language": "en-US",
    "timezone": "America/New_York"
  }
}
Field Type Description
success boolean true when the action executed successfully.
service string Always "MyDisct Solver BBRE". Identifies the service that processed the request.
result object The action-specific result. The structure depends on which action was executed. Navigation actions return status info, query actions return extracted data, screenshot returns base64 image data.
profile object / null The browser profile data generated by the BBRE engine. Contains fingerprint information like userAgent, platform, language, and timezone. May be null if no profile was generated.

Failure Response with Partial Result (200)

When the BBRE engine reports a failure but includes a partial result object, the API returns a 200 status code with "success": false. This can happen when an action partially completes before encountering an error. The partial result may contain useful information about what happened before the failure.

JSON
{
  "success": false,
  "result": {
    "error": "Element not found: #nonexistent-button"
  },
  "profile": {
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "platform": "Win32"
  }
}

Error Response Examples

400 - Session ID Required

JSON
{
  "success": false,
  "service": "MyDisct Solver BBRE",
  "error": {
    "code": "SESSION_ID_REQUIRED",
    "message": "Session ID is required."
  }
}

400 - Action Required

JSON
{
  "success": false,
  "service": "MyDisct Solver BBRE",
  "error": {
    "code": "INVALID_REQUEST",
    "message": "Browser action is required."
  }
}

404 - Session Not Found

JSON
{
  "success": false,
  "service": "MyDisct Solver BBRE",
  "error": {
    "code": "SESSION_NOT_FOUND",
    "message": "Session not found."
  }
}

500 - Browser Action Failed

JSON
{
  "success": false,
  "service": "MyDisct Solver BBRE",
  "error": {
    "code": "BROWSER_ACTION_FAILED",
    "message": "Browser action failed."
  }
}
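
The three response shapes shown above (success, failure with a partial result, and error envelope) can be told apart with a few checks. The Python sketch below is one way to do that. The classify_response name and the outcome labels are hypothetical, and the sample bodies are abbreviated versions of the examples on this page.

```python
def classify_response(status_code, body):
    """Return (outcome, detail) where outcome is "ok", "partial", or "error"."""
    if body.get("success") is True:
        return "ok", body.get("result")
    error = body.get("error")
    if isinstance(error, dict):
        # Error envelope: a code and a message, no result field.
        return "error", f"{status_code} {error.get('code')}: {error.get('message')}"
    # success is false but a result object is present: partial failure.
    return "partial", body.get("result")


ok = classify_response(200, {"success": True, "result": {"title": "Example Domain"}})
partial = classify_response(
    200, {"success": False, "result": {"error": "Element not found: #nonexistent-button"}}
)
failed = classify_response(
    404,
    {"success": False, "error": {"code": "SESSION_NOT_FOUND", "message": "Session not found."}},
)
print(ok, partial, failed, sep="\n")
```

Checking the success flag first matters because a partial failure arrives with a 200 status code, so the HTTP status alone does not tell you whether the action worked.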

Request Examples

The following examples demonstrate how to execute browser actions using the BBRE API in different programming languages. All examples show complete, working code that you can copy and run directly. Remember to replace YOUR_API_KEY with your actual BBRE API key and YOUR_SESSION_ID with a valid adaptive mode session ID.

Navigate to a URL

The most fundamental browser action. This navigates the browser to a specified URL and waits for the page to fully load. You should always start your browser automation workflow by navigating to the target page.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";

async function navigateToPage() {
  const response = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "navigate",
      params: {
        url: "https://example.com/login",
        waitUntil: "load",
        timeout: 30
      }
    },
    {
      headers: {
        "Content-Type": "application/json",
        "x-api-key": API_KEY
      }
    }
  );

  console.log("Success:", response.data.success);
  console.log("Result:", response.data.result);
  return response.data;
}

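// axios rejects on non-2xx responses, so handle the rejection explicitly.
navigateToPage().catch((err) => console.error("Request failed:", err.message));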
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "navigate",
        "params": {
            "url": "https://example.com/login",
            "waitUntil": "load",
            "timeout": 30
        }
    }
)

data = response.json()
print("Success:", data["success"])
# Error responses carry an "error" object instead of "result".
print("Result:", data.get("result"))
Bash
curl -X POST https://bbre-solver-api.mydisct.com/browser/action \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "sessionId": "YOUR_SESSION_ID",
    "action": "navigate",
    "params": {
      "url": "https://example.com/login",
      "waitUntil": "load",
      "timeout": 30
    }
  }'

Fill a Form and Click Submit

This example demonstrates filling input fields and clicking a submit button. This is one of the most common browser automation patterns, used for login forms, search forms, registration forms, and any other form-based interaction.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";
const authHeaders = {
  "Content-Type": "application/json",
  "x-api-key": API_KEY
};

async function fillAndSubmitForm() {
  await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "fill",
      params: { selector: "#username", text: "myuser" }
    },
    { headers: authHeaders }
  );

  await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "fill",
      params: { selector: "#password", text: "mypassword" }
    },
    { headers: authHeaders }
  );

  const clickResponse = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "click",
      params: { selector: "#login-button" }
    },
    { headers: authHeaders }
  );

  console.log("Click result:", clickResponse.data.result);
  return clickResponse.data;
}

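// axios rejects on non-2xx responses, so handle the rejection explicitly.
fillAndSubmitForm().catch((err) => console.error("Request failed:", err.message));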
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "fill",
        "params": {"selector": "#username", "text": "myuser"}
    }
)

requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "fill",
        "params": {"selector": "#password", "text": "mypassword"}
    }
)

click_response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "click",
        "params": {"selector": "#login-button"}
    }
)

data = click_response.json()
# Error responses carry an "error" object instead of "result".
print("Click result:", data.get("result"))
Bash
curl -X POST https://bbre-solver-api.mydisct.com/browser/action \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "sessionId": "YOUR_SESSION_ID",
    "action": "fill",
    "params": {"selector": "#username", "text": "myuser"}
  }'

curl -X POST https://bbre-solver-api.mydisct.com/browser/action \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "sessionId": "YOUR_SESSION_ID",
    "action": "fill",
    "params": {"selector": "#password", "text": "mypassword"}
  }'

curl -X POST https://bbre-solver-api.mydisct.com/browser/action \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "sessionId": "YOUR_SESSION_ID",
    "action": "click",
    "params": {"selector": "#login-button"}
  }'

Take a Screenshot

This example captures a screenshot of the current page. The screenshot is returned as base64-encoded image data that you can decode and save to a file. Screenshots are useful for debugging, visual verification, and creating records of automation results.

JavaScript
const axios = require("axios");
const fs = require("fs");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";

async function takeScreenshot() {
  const response = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "screenshot",
      params: {}
    },
    {
      headers: {
        "Content-Type": "application/json",
        "x-api-key": API_KEY
      }
    }
  );

  const base64Data = response.data.result.screenshot;
  const buffer = Buffer.from(base64Data, "base64");
  fs.writeFileSync("screenshot.png", buffer);
  console.log("Screenshot saved to screenshot.png");

  return response.data;
}

takeScreenshot();
Python
import requests
import base64

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "screenshot",
        "params": {}
    }
)

data = response.json()
base64_data = data["result"]["screenshot"]
image_bytes = base64.b64decode(base64_data)

with open("screenshot.png", "wb") as f:
    f.write(image_bytes)

print("Screenshot saved to screenshot.png")

Execute JavaScript

This example executes JavaScript in the page context to extract data that is not easily accessible through other actions. JavaScript execution gives you full access to the page DOM and any JavaScript variables or functions defined by the page.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";

async function executeJavaScript() {
  const response = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "execute",
      params: {
        script: "return document.querySelectorAll('.product-card').length"
      }
    },
    {
      headers: {
        "Content-Type": "application/json",
        "x-api-key": API_KEY
      }
    }
  );

  console.log("Product count:", response.data.result);
  return response.data;
}

executeJavaScript();
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "execute",
        "params": {
            "script": "return document.querySelectorAll('.product-card').length"
        }
    }
)

data = response.json()
print("Product count:", data["result"])

Wait for Element and Extract Text

This example waits for a specific element to appear on the page and then extracts its text content. This pattern is essential for handling dynamic pages where content loads asynchronously after the initial page load.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";
const authHeaders = {
  "Content-Type": "application/json",
  "x-api-key": API_KEY
};

async function waitAndExtract() {
  await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "waitForElement",
      params: {
        selector: ".dashboard-welcome",
        timeout: 15,
        visible: true
      }
    },
    { headers: authHeaders }
  );

  const textResponse = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "getText",
      params: { selector: ".dashboard-welcome" }
    },
    { headers: authHeaders }
  );

  console.log("Welcome message:", textResponse.data.result);
  return textResponse.data;
}

waitAndExtract();
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "waitForElement",
        "params": {
            "selector": ".dashboard-welcome",
            "timeout": 15,
            "visible": True
        }
    }
)

text_response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "getText",
        "params": {"selector": ".dashboard-welcome"}
    }
)

data = text_response.json()
print("Welcome message:", data["result"])

Batch Action Example

This example uses the batch action to execute multiple actions in a single API call. The batch navigates to a page, fills a search field, clicks the search button, and waits for the results to appear, all in one request. This reduces the number of HTTP round trips from four to one.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";

async function executeBatch() {
  const response = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "batch",
      params: {
        actions: [
          {
            action: "navigate",
            params: { url: "https://example.com/search" }
          },
          {
            action: "fill",
            params: { selector: "#search-input", text: "BBRE automation" }
          },
          {
            action: "click",
            params: { selector: "#search-button" }
          },
          {
            action: "waitForElement",
            params: { selector: ".search-results", timeout: 10 }
          }
        ],
        stopOnError: true
      }
    },
    {
      headers: {
        "Content-Type": "application/json",
        "x-api-key": API_KEY
      }
    }
  );

  console.log("Batch result:", response.data.result);
  return response.data;
}

executeBatch();
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "batch",
        "params": {
            "actions": [
                {
                    "action": "navigate",
                    "params": {"url": "https://example.com/search"}
                },
                {
                    "action": "fill",
                    "params": {"selector": "#search-input", "text": "BBRE automation"}
                },
                {
                    "action": "click",
                    "params": {"selector": "#search-button"}
                },
                {
                    "action": "waitForElement",
                    "params": {"selector": ".search-results", "timeout": 10}
                }
            ],
            "stopOnError": True
        }
    }
)

data = response.json()
print("Batch result:", data["result"])
Bash
curl -X POST https://bbre-solver-api.mydisct.com/browser/action \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "sessionId": "YOUR_SESSION_ID",
    "action": "batch",
    "params": {
      "actions": [
        {"action": "navigate", "params": {"url": "https://example.com/search"}},
        {"action": "fill", "params": {"selector": "#search-input", "text": "BBRE automation"}},
        {"action": "click", "params": {"selector": "#search-button"}},
        {"action": "waitForElement", "params": {"selector": ".search-results", "timeout": 10}}
      ],
      "stopOnError": true
    }
  }'

Node.js SDK Usage

The mydisctsolver-bbre Node.js SDK provides the BrowserAPI class that wraps the POST /browser/action endpoint with convenient methods for each browser action. When you create a BBRESession instance in adaptive mode, the session.browser property gives you access to the BrowserAPI instance. Each method on the BrowserAPI class corresponds to a browser action and handles the session ID, request formatting, and response parsing automatically. This makes browser automation code significantly cleaner and more readable compared to making raw API calls.

Basic SDK Browser Automation

JavaScript
const { BBRESession } = require("mydisctsolver-bbre");

const session = new BBRESession({
  apiKey: "YOUR_API_KEY",
  mode: "adaptive",
  sensibility: "medium"
});

async function basicBrowserAutomation() {
  await session.start();

  await session.browser.navigate("https://example.com");
  const title = await session.browser.getTitle();
  console.log("Page title:", title);

  const url = await session.browser.getUrl();
  console.log("Current URL:", url);

  const html = await session.browser.html();
  console.log("Page HTML length:", html.length);

  await session.close();
}

basicBrowserAutomation();

SDK Form Filling and Submission

JavaScript
const { BBRESession } = require("mydisctsolver-bbre");

const session = new BBRESession({
  apiKey: "YOUR_API_KEY",
  mode: "adaptive",
  sensibility: "high"
});

async function loginWithSDK() {
  await session.start();

  await session.browser.navigate("https://example.com/login");
  await session.browser.fill("#username", "myuser");
  await session.browser.fill("#password", "mypassword");
  await session.browser.click("#login-button");

  await session.browser.waitForElement({ text: "Dashboard", timeout: 10 });
  const title = await session.browser.getTitle();
  console.log("After login:", title);

  await session.close();
}

loginWithSDK();

SDK Scrolling and Data Extraction

JavaScript
const { BBRESession } = require("mydisctsolver-bbre");

const session = new BBRESession({
  apiKey: "YOUR_API_KEY",
  mode: "adaptive"
});

async function scrollAndExtract() {
  await session.start();

  await session.browser.navigate("https://example.com/products");

  await session.browser.scrollDown(1000);
  await session.browser.wait(2);

  await session.browser.scrollDown(1000);
  await session.browser.wait(2);

  const productCount = await session.browser.execute(
    "return document.querySelectorAll('.product-item').length"
  );
  console.log("Total products loaded:", productCount);

  await session.browser.scrollToTop();

  await session.close();
}

scrollAndExtract();

SDK Cookie Management

JavaScript
const { BBRESession } = require("mydisctsolver-bbre");

const session = new BBRESession({
  apiKey: "YOUR_API_KEY",
  mode: "adaptive"
});

async function manageCookies() {
  await session.start();

  await session.browser.navigate("https://example.com");

  await session.browser.setCookie({
    name: "session_token",
    value: "abc123def456",
    domain: "example.com",
    path: "/",
    secure: true
  });

  const cookies = await session.browser.getCookies();
  console.log("All cookies:", cookies);

  await session.browser.deleteCookies();
  console.log("All cookies cleared");

  await session.close();
}

manageCookies();

SDK Screenshot and Debugging

JavaScript
const { BBRESession } = require("mydisctsolver-bbre");
const fs = require("fs");

const session = new BBRESession({
  apiKey: "YOUR_API_KEY",
  mode: "adaptive"
});

async function screenshotWorkflow() {
  await session.start();

  await session.browser.navigate("https://example.com/dashboard");
  await session.browser.waitForElement({ selector: ".dashboard-content", timeout: 15 });

  const screenshotData = await session.browser.screenshot();
  const buffer = Buffer.from(screenshotData, "base64");
  fs.writeFileSync("dashboard.png", buffer);
  console.log("Dashboard screenshot saved");

  await session.close();
}

screenshotWorkflow();

Real-World Workflow Example

The following example demonstrates a complete browser automation workflow that combines session creation, navigation, form interaction, waiting, data extraction, and session cleanup. It represents a typical use case: navigating to a website, logging in, opening a data page, extracting information, and closing the session. The workflow runs inside a try/finally block so that the session is closed even when one of the actions fails.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const authHeaders = {
  "Content-Type": "application/json",
  "x-api-key": API_KEY
};

async function browserAction(sessionId, action, params) {
  const response = await axios.post(
    API_BASE + "/browser/action",
    { sessionId, action, params },
    { headers: authHeaders }
  );
  return response.data;
}

async function completeWorkflow() {
  const createResponse = await axios.post(
    API_BASE + "/session/create",
    {
      mode: "adaptive",
      sensibility: "high",
      timeout: 300
    },
    { headers: authHeaders }
  );

  const sessionId = createResponse.data.session.id;
  console.log("Session created:", sessionId);

  try {
    await browserAction(sessionId, "navigate", {
      url: "https://example.com/login",
      timeout: 30
    });
    console.log("Navigated to login page");

    await browserAction(sessionId, "waitForElement", {
      selector: "#login-form",
      timeout: 10
    });

    await browserAction(sessionId, "fill", {
      selector: "#email",
      text: "user@example.com"
    });

    await browserAction(sessionId, "fill", {
      selector: "#password",
      text: "securepassword"
    });

    await browserAction(sessionId, "click", {
      selector: "button[type='submit']"
    });
    console.log("Login form submitted");

    await browserAction(sessionId, "waitForElement", {
      text: "Welcome back",
      timeout: 15
    });
    console.log("Login successful");

    await browserAction(sessionId, "navigate", {
      url: "https://example.com/account/orders"
    });

    await browserAction(sessionId, "waitForElement", {
      selector: ".order-list",
      timeout: 10
    });

    const result = await browserAction(sessionId, "execute", {
      script: "return Array.from(document.querySelectorAll('.order-row')).map(row => ({id: row.querySelector('.order-id').textContent, total: row.querySelector('.order-total').textContent, status: row.querySelector('.order-status').textContent}))"
    });

    console.log("Orders extracted:", result.result);

    const titleResult = await browserAction(sessionId, "getTitle", {});
    console.log("Page title:", titleResult.result);

  } finally {
    await axios.post(
      API_BASE + "/session/close",
      { sessionId },
      { headers: authHeaders }
    );
    console.log("Session closed");
  }
}

completeWorkflow();
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

def browser_action(session_id, action, params=None):
    response = requests.post(
        API_BASE + "/browser/action",
        headers=headers,
        json={
            "sessionId": session_id,
            "action": action,
            "params": params or {}
        }
    )
    return response.json()

create_response = requests.post(
    API_BASE + "/session/create",
    headers=headers,
    json={
        "mode": "adaptive",
        "sensibility": "high",
        "timeout": 300
    }
)

session_id = create_response.json()["session"]["id"]
print("Session created:", session_id)

try:
    browser_action(session_id, "navigate", {
        "url": "https://example.com/login",
        "timeout": 30
    })
    print("Navigated to login page")

    browser_action(session_id, "waitForElement", {
        "selector": "#login-form",
        "timeout": 10
    })

    browser_action(session_id, "fill", {
        "selector": "#email",
        "text": "user@example.com"
    })

    browser_action(session_id, "fill", {
        "selector": "#password",
        "text": "securepassword"
    })

    browser_action(session_id, "click", {
        "selector": "button[type='submit']"
    })
    print("Login form submitted")

    browser_action(session_id, "waitForElement", {
        "text": "Welcome back",
        "timeout": 15
    })
    print("Login successful")

    browser_action(session_id, "navigate", {
        "url": "https://example.com/account/orders"
    })

    browser_action(session_id, "waitForElement", {
        "selector": ".order-list",
        "timeout": 10
    })

    result = browser_action(session_id, "execute", {
        "script": "return Array.from(document.querySelectorAll('.order-row')).map(row => ({id: row.querySelector('.order-id').textContent, total: row.querySelector('.order-total').textContent}))"
    })
    print("Orders:", result["result"])

finally:
    requests.post(
        API_BASE + "/session/close",
        headers=headers,
        json={"sessionId": session_id}
    )
    print("Session closed")

Complete SDK Workflow

The following example shows the same real-world workflow using the Node.js SDK. Notice how much cleaner the code is compared to the raw API version. The SDK handles session management, request formatting, and response parsing, letting you focus on the automation logic itself.

JavaScript
const { BBRESession } = require("mydisctsolver-bbre");

const session = new BBRESession({
  apiKey: "YOUR_API_KEY",
  mode: "adaptive",
  sensibility: "high"
});

async function completeSDKWorkflow() {
  await session.start();

  try {
    await session.browser.navigate("https://example.com/login");

    await session.browser.waitForElement({ selector: "#login-form", timeout: 10 });

    await session.browser.fill("#email", "user@example.com");
    await session.browser.fill("#password", "securepassword");
    await session.browser.click("button[type='submit']");

    await session.browser.waitForElement({ text: "Welcome back", timeout: 15 });
    console.log("Login successful");

    await session.browser.navigate("https://example.com/account/orders");
    await session.browser.waitForElement({ selector: ".order-list", timeout: 10 });

    const orders = await session.browser.execute(
      "return Array.from(document.querySelectorAll('.order-row')).map(row => ({id: row.querySelector('.order-id').textContent, total: row.querySelector('.order-total').textContent, status: row.querySelector('.order-status').textContent}))"
    );
    console.log("Orders:", orders);

    const title = await session.browser.getTitle();
    console.log("Page:", title);

  } finally {
    await session.close();
    console.log("Session closed");
  }
}

completeSDKWorkflow();

Error Codes

The following table lists the error codes that the browser action endpoint can return, along with their HTTP status codes, descriptions, and recommended solutions. Handling each of these codes explicitly lets your automation workflows recover from most failures instead of stopping at the first error.

| HTTP Status | Error Code | Description | Solution |
| --- | --- | --- | --- |
| 400 | SESSION_ID_REQUIRED | The sessionId field is missing from the request body. | Include the sessionId field in your request body. This is the session ID returned by POST /session/create. |
| 400 | INVALID_REQUEST | The action field is missing from the request body. The error message is "Browser action is required." | Include the action field in your request body with a valid action name like "navigate", "click", or "fill". |
| 400 | SESSION_CLOSED | The session has been explicitly closed and can no longer accept actions. | Create a new adaptive session using POST /session/create. Once a session is closed, it cannot be reopened. |
| 400 | SESSION_EXPIRED | The session has expired because its timeout period has elapsed. | Create a new adaptive session with a longer timeout value. Plan your workflow to complete within the session timeout window. |
| 401 | API_KEY_REQUIRED | No API key was provided in the request headers. | Include your API key in the x-api-key or apikey header. |
| 403 | INVALID_API_KEY | The provided API key is not valid or does not exist. | Verify your API key in the MyDisct Solver dashboard. Ensure you are copying the complete key without extra spaces. |
| 403 | ACCOUNT_SUSPENDED | Your account has been suspended. | Contact MyDisct Solver support for information about your account suspension. |
| 403 | ACCOUNT_INACTIVE | Your account is inactive. | Activate your account through the MyDisct Solver dashboard or contact support. |
| 404 | SESSION_NOT_FOUND | No session exists with the provided session ID. | Verify the session ID is correct. The session may have been closed or may have expired and been cleaned up. Create a new session if needed. |
| 500 | BROWSER_ACTION_FAILED | The browser action failed to execute. This can happen when an element is not found, a navigation times out, JavaScript execution throws an error, or the browser encounters an unexpected state. | Check the action parameters for correctness. Verify that the target element exists on the page using a find or waitForElement action before interacting with it. For navigation timeouts, increase the timeout parameter. For JavaScript errors, test your script in a browser console first. |
| 500 | SERVICE_ERROR | An internal server error occurred while processing the browser action. | Retry the action after a short delay (2-5 seconds). If the error persists, contact support. |
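
The recommended solutions above fall into a few recovery strategies, which can be sketched as a small lookup plus a retry loop. This is a minimal illustration, not part of the API or the SDK: the strategy names, `recovery_strategy`, and `run_with_retry` are hypothetical, and because this page does not show the shape of an error response, the caller supplies a `get_error_code` function that extracts the code from whatever the request returns.

```python
import time

# Error codes from the table, grouped by the kind of recovery the table suggests.
RETRYABLE = {"SERVICE_ERROR"}
NEEDS_NEW_SESSION = {"SESSION_CLOSED", "SESSION_EXPIRED", "SESSION_NOT_FOUND"}
FIX_REQUEST = {"SESSION_ID_REQUIRED", "INVALID_REQUEST",
               "API_KEY_REQUIRED", "INVALID_API_KEY"}
CONTACT_SUPPORT = {"ACCOUNT_SUSPENDED", "ACCOUNT_INACTIVE"}


def recovery_strategy(error_code):
    """Map an error code to a (hypothetical) recovery strategy name."""
    if error_code in RETRYABLE:
        return "retry"
    if error_code in NEEDS_NEW_SESSION:
        return "new_session"
    if error_code in FIX_REQUEST:
        return "fix_request"
    if error_code in CONTACT_SUPPORT:
        return "contact_support"
    if error_code == "BROWSER_ACTION_FAILED":
        return "check_action"
    return "unknown"


def run_with_retry(send, get_error_code, attempts=3, delay_seconds=2.0,
                   sleep=time.sleep):
    """Call send() and repeat it only while the error is retryable.

    send: zero-argument callable that performs the request.
    get_error_code: callable returning the error code of a response,
        or None on success (ASSUMPTION: supplied by the caller).
    """
    response = None
    for attempt in range(attempts):
        response = send()
        code = get_error_code(response)
        if code is None or recovery_strategy(code) != "retry":
            return response
        if attempt < attempts - 1:
            sleep(delay_seconds)  # the table suggests a 2-5 second pause
    return response
```

Only SERVICE_ERROR is retried here; repeating a request that failed with a session or authentication error would return the same error again.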

Using fillForm for Multi-Field Forms

The fillForm action is a powerful convenience method that fills multiple form fields in a single action call. Instead of sending separate fill actions for each field, you provide a formData object that maps CSS selectors to values. This is particularly useful for registration forms, checkout forms, and any form with many fields. When combined with the submit: true option, the form is automatically submitted after all fields are filled, eliminating the need for a separate click action on the submit button.

JavaScript
const axios = require("axios");

const API_BASE = "https://bbre-solver-api.mydisct.com";
const API_KEY = "YOUR_API_KEY";
const SESSION_ID = "YOUR_SESSION_ID";

async function fillRegistrationForm() {
  const response = await axios.post(
    API_BASE + "/browser/action",
    {
      sessionId: SESSION_ID,
      action: "fillForm",
      params: {
        formData: {
          "#first-name": "John",
          "#last-name": "Doe",
          "#email": "user@example.com",
          "#phone": "+1-555-0123",
          "#address": "123 Main Street",
          "#city": "New York",
          "#zip": "10001"
        },
        submit: true
      }
    },
    {
      headers: {
        "Content-Type": "application/json",
        "x-api-key": API_KEY
      }
    }
  );

  console.log("Form submitted:", response.data.success);
  return response.data;
}

fillRegistrationForm();
Python
import requests

API_BASE = "https://bbre-solver-api.mydisct.com"
API_KEY = "YOUR_API_KEY"
SESSION_ID = "YOUR_SESSION_ID"

headers = {
    "Content-Type": "application/json",
    "x-api-key": API_KEY
}

response = requests.post(
    API_BASE + "/browser/action",
    headers=headers,
    json={
        "sessionId": SESSION_ID,
        "action": "fillForm",
        "params": {
            "formData": {
                "#first-name": "John",
                "#last-name": "Doe",
                "#email": "user@example.com",
                "#phone": "+1-555-0123",
                "#address": "123 Main Street",
                "#city": "New York",
                "#zip": "10001"
            },
            "submit": True
        }
    }
)

data = response.json()
print("Form submitted:", data["success"])

Choosing Between type and fill

The fill and type actions both put text into input fields, but they work differently and are suited for different scenarios. Understanding when to use each one is important for building reliable automation workflows.

fill sets the input value directly and instantly. It clears the existing content and replaces it with the new text in a single operation. This is fast and reliable for most forms. Use fill when speed matters and the target website does not monitor individual keystrokes.

type simulates typing by dispatching individual key events (keydown, keypress, keyup) for each character with a configurable delay between them. This is slower but more realistic. Use type when the target website has JavaScript event listeners that respond to individual keystrokes (like autocomplete suggestions, real-time validation, or search-as-you-type features), or when the website monitors typing patterns as part of its bot detection system.

As a general rule, start with fill for all fields. If you encounter issues where the website does not recognize the input or triggers bot detection, switch to type for those specific fields. For websites with aggressive bot detection, using type with a delay of 0.03 to 0.08 seconds per character produces realistic typing patterns that are difficult to distinguish from human input.
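
The "start with fill, switch to type where needed" rule can be captured in a small request builder. The sketch below is illustrative only: `text_input_action` is a hypothetical helper, and it assumes that the type action accepts the same selector and text parameters as fill and that the per-character delay is passed as a `delay` parameter in seconds. Check the type action's parameter list before relying on those names.

```python
def text_input_action(session_id, selector, text, per_keystroke=False, delay=0.05):
    """Build a POST /browser/action body that uses fill by default
    and type when per_keystroke is True."""
    params = {"selector": selector, "text": text}
    if per_keystroke:
        # ASSUMPTION: the type action takes a "delay" parameter in seconds.
        # 0.03 to 0.08 seconds per character is the range suggested above.
        params["delay"] = delay
    return {
        "sessionId": session_id,
        "action": "type" if per_keystroke else "fill",
        "params": params,
    }


# Example: fill the username quickly, but type into a search-as-you-type field.
username_body = text_input_action("YOUR_SESSION_ID", "#username", "myuser")
search_body = text_input_action("YOUR_SESSION_ID", "#search-input", "BBRE",
                                per_keystroke=True, delay=0.04)
```

Keeping the choice in one place makes it easy to flip a single field from fill to type when a site turns out to ignore directly set values.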

Best Practices

Always Wait for Elements Before Interacting

Before clicking a button, filling a field, or extracting text from an element, use waitForElement to ensure the element is present and visible on the page. Web pages load asynchronously, and elements may not be available immediately after navigation. Attempting to interact with an element that has not loaded yet results in a BROWSER_ACTION_FAILED error. A reliable pattern is: navigate to the page, wait for a key element that indicates the page is ready, and then perform your interactions. Set reasonable timeout values based on the expected page load time.
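
One way to make this pattern hard to forget is to generate the wait step together with the interaction. The helper below is a sketch with a hypothetical name (`wait_then`); it only builds action objects in the same shape used by the examples on this page, so the result can be sent one action at a time or placed in a batch.

```python
def wait_then(action, selector, params=None, timeout=10):
    """Return [waitForElement step, interaction step] for one selector."""
    wait_step = {
        "action": "waitForElement",
        "params": {"selector": selector, "timeout": timeout, "visible": True},
    }
    step_params = {"selector": selector}
    step_params.update(params or {})  # e.g. {"text": "..."} for fill
    return [wait_step, {"action": action, "params": step_params}]


# Example: wait for the login button, then click it.
steps = wait_then("click", "#login-button", timeout=15)
```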

Use Batch Actions to Reduce Latency

When you need to perform a sequence of related actions, use the batch action to bundle them into a single API call. Each individual API call incurs network latency (typically 50-200ms depending on your location), so a workflow with 10 actions saves 9 round trips when batched. Enable stopOnError: true for workflows where each action depends on the previous one succeeding, and leave it as false for independent actions where you want to collect all results regardless of individual failures.
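
As a rough illustration, a batch request body can be assembled from a plain list of steps. `build_batch` and `round_trips_saved` are hypothetical helper names; the body layout (action "batch", an actions array, and stopOnError) follows the batch example earlier on this page.

```python
def build_batch(session_id, steps, stop_on_error=True):
    """Build a POST /browser/action body for the batch action.

    steps: list of (action_name, params_dict) tuples, in execution order.
    """
    actions = [{"action": name, "params": dict(params)} for name, params in steps]
    return {
        "sessionId": session_id,
        "action": "batch",
        "params": {"actions": actions, "stopOnError": stop_on_error},
    }


def round_trips_saved(step_count):
    """N separate calls collapse into one, so N - 1 round trips are saved."""
    return max(step_count - 1, 0)


body = build_batch("YOUR_SESSION_ID", [
    ("navigate", {"url": "https://example.com/search"}),
    ("fill", {"selector": "#search-input", "text": "BBRE automation"}),
    ("click", {"selector": "#search-button"}),
])
```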

Use High Sensibility for Bot-Protected Sites

When automating interactions on websites with advanced bot detection (DataDome, PerimeterX, Cloudflare Bot Management), create your session with "sensibility": "high". High sensibility applies human-like timing delays between actions, realistic mouse movement patterns, and advanced fingerprint consistency checks. Combine this with the type action instead of fill for input fields, and add small wait actions between interactions to simulate natural human pacing. The extra processing time is a worthwhile trade-off for significantly higher success rates on protected sites.

Close Sessions in a Finally Block

Always wrap your browser automation workflow in a try/finally block (or try/except/finally in Python) and close the session in the finally block. This ensures the session is released even when an action fails or an unexpected error occurs. Unclosed sessions remain active until they expire, consuming one of your 10 active session slots. If your automation runs frequently and sessions are not properly closed, you will quickly hit the SESSION_LIMIT_REACHED error.
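
In Python, the same guarantee can be packaged as a context manager so that every workflow gets the cleanup for free. This is a sketch under stated assumptions: `adaptive_session` is a hypothetical helper, and `post` is a caller-supplied function that sends a JSON body to an API path and returns the parsed response. The endpoints and the `session.id` response field are the ones used in the workflow example on this page.

```python
from contextlib import contextmanager


@contextmanager
def adaptive_session(post, sensibility="medium", timeout=300):
    """Create an adaptive session and always close it on exit.

    post: callable (path, body) -> parsed JSON dict. With requests it could be
        lambda path, body: requests.post(API_BASE + path,
                                         headers=headers, json=body).json()
    """
    created = post("/session/create", {
        "mode": "adaptive",
        "sensibility": sensibility,
        "timeout": timeout,
    })
    session_id = created["session"]["id"]
    try:
        yield session_id
    finally:
        # Runs on success and on any exception raised inside the with block.
        post("/session/close", {"sessionId": session_id})
```

A workflow then becomes `with adaptive_session(post) as session_id: ...`, and the session slot is released even if an action inside the block raises.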

Use Screenshots for Debugging Failed Workflows

When a browser action fails unexpectedly, take a screenshot before and after the failing action to see what the browser is actually displaying. The page might show a CAPTCHA challenge, a different layout than expected, a cookie consent dialog blocking the target element, or an error message from the website. Screenshots provide visual context that error messages alone cannot convey. Consider adding screenshot actions at key checkpoints in your workflow during development, and remove them once the workflow is stable to reduce processing time.
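
During development, a small helper that names each screenshot after its checkpoint keeps the debugging images easy to match to workflow steps. The function below is a hypothetical sketch; it assumes the screenshot response carries the base64 image in `result.screenshot`, as in the screenshot example on this page.

```python
import base64
from pathlib import Path


def save_checkpoint(response, label, directory="."):
    """Decode a screenshot action response and write it to <label>.png."""
    image_bytes = base64.b64decode(response["result"]["screenshot"])
    path = Path(directory) / f"{label}.png"
    path.write_bytes(image_bytes)
    return path
```

Calling it with labels such as "before-click" and "after-click" around a failing action gives a before/after pair to compare.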

Prefer CSS Selectors Over Text-Based Element Finding

When possible, use specific CSS selectors (IDs, classes, data attributes) to identify elements rather than text content. CSS selectors are more reliable because they are less likely to change between page loads or across different languages and locales. Use findByText and clickCheckboxByText as fallbacks when elements do not have stable CSS selectors. When using CSS selectors, prefer IDs (#element-id) over classes (.element-class), because an ID is meant to be unique within a page and is therefore less likely to match multiple elements.

Common Issues

Issue 1: BROWSER_ACTION_FAILED When Clicking Elements

Problem: You receive a 500 BROWSER_ACTION_FAILED error when trying to click a button or link. The error indicates that the element was not found on the page, even though you can see it when viewing the page in a regular browser.

Solution: This typically happens because the element has not loaded yet when the click action is executed. Web pages load content asynchronously, and the element may appear after the initial page load event. Add a waitForElement action before the click to ensure the element is present and visible. If the element is inside an iframe, it may not be directly accessible through CSS selectors on the main page. Also verify that your CSS selector is correct by testing it in the browser developer console using document.querySelector("your-selector"). If the element is dynamically generated with random class names (common in React and Angular applications), use a more stable selector like a data attribute or the element's text content with findByText.

Issue 2: Browser Actions Fail on Passive Mode Sessions

Problem: Every browser action request fails when the session was created in passive mode. No action executes, and the API returns an error response for each request.

Solution: Browser actions are exclusively available in adaptive mode sessions. Passive mode sessions do not maintain a browser instance and therefore cannot execute browser actions. You must create a new session with "mode": "adaptive" to use the POST /browser/action endpoint. If you initially created a passive session because you did not anticipate needing browser actions, close the passive session and create a new adaptive one. The session mode cannot be changed after creation.

Issue 3: Navigation Timeout on Heavy Pages

Problem: The navigate action fails with a timeout error when loading pages that have many resources (images, scripts, stylesheets) or depend on slow third-party services. The default 30-second timeout is not enough for the page to fully load.

Solution: Increase the timeout parameter in the navigate action params. For heavy pages, set the timeout to 60 seconds or more. If the page has many non-essential resources that slow down the load event, you can also try changing the waitUntil parameter to wait for a different load state. Additionally, some pages never fully "load" because they continuously fetch data or have long-running scripts. In these cases, use a shorter navigation timeout and then use waitForElement to wait for the specific content you need rather than waiting for the entire page to finish loading.
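
The "short navigation timeout, then wait for the content you need" approach might look like the sketch below. `navigate_and_wait` is a hypothetical helper, and `send` is a caller-supplied function that posts a body to /browser/action and returns the parsed JSON without raising on error statuses. The sketch also assumes every response carries a boolean `success` field, as read in the fillForm example on this page.

```python
def navigate_and_wait(send, session_id, url, ready_selector,
                      nav_timeout=15, wait_timeout=30):
    """Navigate with a short timeout, then wait for the element that matters."""
    nav = send({
        "sessionId": session_id,
        "action": "navigate",
        "params": {"url": url, "timeout": nav_timeout},
    })
    # A navigation timeout is not treated as fatal here: the page may still be
    # fetching non-essential resources while the needed content is rendered.
    ready = send({
        "sessionId": session_id,
        "action": "waitForElement",
        "params": {"selector": ready_selector, "timeout": wait_timeout},
    })
    return {
        "navigated": bool(nav.get("success")),
        "ready": bool(ready.get("success")),
    }
```

The caller can proceed whenever `ready` is true and only treat the run as failed when the content itself never appears.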

Issue 4: JavaScript Execution Returns Unexpected Results

Problem: The execute action returns null or undefined instead of the expected data, or the script throws an error about undefined variables or functions.

Solution: Make sure your script includes a return statement to return the value you want. Without an explicit return, the result is undefined. Also ensure that the page has fully loaded and the JavaScript variables or DOM elements your script references actually exist at the time of execution. Use waitForElement before executing JavaScript that depends on specific DOM elements. If your script references page-specific JavaScript variables or functions, verify they exist by first executing a simple check like return typeof variableName !== 'undefined'. Complex objects like DOM nodes cannot be serialized directly; extract the specific properties you need instead of returning entire DOM elements.
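
The points above (an explicit return, plain serializable values, and tolerance for missing elements) can be built into a script generator. The following is an illustrative sketch with a hypothetical name, `extraction_script`; it produces a script string for the execute action that returns a list of plain objects and uses null where a field's element is missing, instead of throwing.

```python
import json


def extraction_script(row_selector, fields):
    """Build a script for the execute action.

    row_selector: CSS selector matching each row element.
    fields: dict mapping output key -> CSS selector relative to the row.
    """
    # json.dumps yields valid JavaScript string and object literals.
    return (
        "return Array.from(document.querySelectorAll("
        + json.dumps(row_selector) + "))"
        ".map(row => Object.fromEntries(Object.entries("
        + json.dumps(fields) + ")"
        ".map(([key, sel]) => { const el = row.querySelector(sel);"
        " return [key, el ? el.textContent.trim() : null]; })))"
    )


script = extraction_script(".order-row", {
    "id": ".order-id",
    "total": ".order-total",
    "status": ".order-status",
})
```

The resulting string is passed as the `script` parameter of the execute action; because only text values are returned, nothing in the result depends on serializing DOM nodes.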