This document explains the Web Agent Scraper node, which automates web interactions and content extraction.

Node Inputs

Required Fields

Optional Fields

  • Use Advanced Scraping: Enable for restricted sites (more robust, but slower)

    Note: Adds 10 credits to the cost of the node

Node Outputs

  • Scraped URL: Final page URL after actions
  • Website Content: Extracted content/data

Available Actions

  1. Click:

    • Requires attributes
    • Clicks specific element
    Name: class or id
    Value: button-primary
    
  2. Hover:

    • Requires attributes
    • Moves cursor over element
    Name: class
    Value: header-menu
    
  3. Scroll:

    • No attributes needed
    • Scrolls full page
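
The Name/Value pairs shown for Click and Hover can be read as pointing at an element much like a CSS selector does. The sketch below illustrates that reading only. It assumes a class or id attribute maps to the usual "." or "#" selector form, and the helper name to_css_selector is hypothetical, not part of the node.

```python
def to_css_selector(name: str, value: str) -> str:
    """Build a CSS selector from an attribute Name/Value pair (illustrative only)."""
    name = name.strip().lower()
    if name == "id":
        return f"#{value}"
    if name == "class":
        return f".{value}"
    # Assumption: any other attribute falls back to an attribute selector.
    return f'[{name}="{value}"]'


print(to_css_selector("class", "button-primary"))  # .button-primary
print(to_css_selector("class", "header-menu"))     # .header-menu
```

Whether the node resolves attributes this way internally is not stated here, so treat the mapping as a mental model for choosing Name and Value.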

Input Actions

  1. Write:

    • Requires attributes
    • Types text into field
    Name: id
    Value: search-input
    
  2. Select from Dropdown:

    • Requires attributes
    • Chooses option
    Name: id
    Value: category-select
    
  3. Wait:

    • Pauses execution for the specified time, in milliseconds

Collection Actions

  1. Screenshot:

    • Two types:
      • Screenshot (captures only the area visible in the viewport)
      • Screenshot full page (captures the entire page)
  2. Scrape:

    • Scrapes the provided URL
  3. Scrape Source:

    • No attributes needed
    • Gets HTML source
  4. Get URL:

    • Returns current URL
  5. Get All URLs:

    • Returns all links found on the provided page

    Output: Links separated by commas

  6. Get Link by Label:

    • Takes the label or text of the link to search for on the page. The first link element that contains this text is used. Matching is case sensitive.
  7. Get All Components by Label:

    • Takes the value of the HTML attribute that identifies the target components, for example a class name or an id value.
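
The behavior described for Get All URLs (links joined by commas) and Get Link by Label (first link containing the text, case sensitive) can be sketched with Python's standard html.parser module. This is a minimal illustration under stated assumptions, not the node's implementation: the names LinkCollector, all_urls, and link_by_label are hypothetical, and only <a> elements with an href are counted as links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs, in page order
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def all_urls(html):
    """Mimics the Get All URLs output: links separated by commas."""
    return ",".join(href for href, _ in collect_links(html))


def link_by_label(html, label):
    """Returns the first link whose text contains label (case sensitive), else None."""
    for href, text in collect_links(html):
        if label in text:
            return href
    return None


page = (
    '<a href="/docs">Docs</a> '
    '<a href="/pricing">Pricing plans</a> '
    '<a href="/p2">pricing</a>'
)
print(all_urls(page))                  # /docs,/pricing,/p2
print(link_by_label(page, "Pricing"))  # /pricing
print(link_by_label(page, "pricing"))  # /p2
```

Because the match is case sensitive, "Pricing" and "pricing" resolve to different links in this example. When consuming the comma-separated output downstream, splitting on "," is the natural step, with the caveat that a URL containing a comma would be split as well.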

Important Considerations

  1. Actions run in sequence
  2. Advanced scraping adds 10 credits to the cost of the node
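
To make the two considerations above concrete, the following sketch models an action list that is checked and processed strictly in order, plus the 10-credit surcharge for advanced scraping. The Action class, validate, and extra_credits are hypothetical names invented for this illustration, and the model is simplified: it omits details such as the text for Write or the duration for Wait, and it does not reflect the node's actual configuration format.

```python
from dataclasses import dataclass
from typing import Optional

# Actions this document lists as "Requires attributes".
NEEDS_ATTRIBUTES = {"Click", "Hover", "Write", "Select from Dropdown"}
ADVANCED_SCRAPING_CREDITS = 10  # surcharge stated in this document


@dataclass
class Action:
    kind: str
    name: Optional[str] = None   # attribute name, e.g. "id" or "class"
    value: Optional[str] = None  # attribute value, e.g. "search-input"


def validate(actions):
    """Return the action kinds in execution order, or raise on a missing attribute."""
    order = []
    for step, action in enumerate(actions, start=1):
        if action.kind in NEEDS_ATTRIBUTES and not (action.name and action.value):
            raise ValueError(
                f"step {step}: {action.kind} requires an attribute name and value"
            )
        order.append(action.kind)
    return order


def extra_credits(use_advanced_scraping):
    """Credits added on top of the node's base cost."""
    return ADVANCED_SCRAPING_CREDITS if use_advanced_scraping else 0


plan = [
    Action("Write", "id", "search-input"),
    Action("Click", "class", "button-primary"),
    Action("Wait"),
    Action("Scrape Source"),
]
print(validate(plan))       # ['Write', 'Click', 'Wait', 'Scrape Source']
print(extra_credits(True))  # 10
```

Ordering matters here because each action operates on the page state left by the previous one: the Click only makes sense after the Write has filled the field, and Scrape Source captures whatever the page shows once the Wait has elapsed.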

In summary, the Web Agent Scraper node automates complex web interactions to collect content that requires multiple steps to access.