Web Agent Scraper
This document explains the Web Agent Scraper node, which automates web interactions and content extraction.
Node Inputs
Required Fields
- URL: Starting web address
  Example: "https://www.gumloop.com/"
- Actions: Sequence of interactions to perform
Optional Fields
- Use Advanced Scraping: Enable for restricted sites (more robust, but slower)
  Note: Costs additional credits
Node Outputs
- Scraped URL: Final page URL after actions
- Website Content: Extracted content/data
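Conceptually, the node takes a starting URL plus an ordered list of actions and returns the final URL along with the extracted content. The sketch below illustrates that contract using Playwright in Python; the function name, action format, and implementation details are assumptions made for illustration and do not reflect Gumloop's actual code.

```python
# Hypothetical illustration of the node's input/output contract, not Gumloop's
# real implementation. Requires: pip install playwright && playwright install
from playwright.sync_api import sync_playwright

def run_web_agent(url: str, actions: list[dict]) -> dict:
    """Open `url`, perform `actions` in order, and return the node's two outputs."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)

        content = ""
        for action in actions:
            if action["type"] == "scrape":          # only one action type sketched here
                content = page.inner_text("body")   # visible text of the page

        outputs = {
            "scraped_url": page.url,     # "Scraped URL": final page URL after all actions
            "website_content": content,  # "Website Content": extracted content/data
        }
        browser.close()
        return outputs

print(run_web_agent("https://www.gumloop.com/", [{"type": "scrape"}]))
```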
Available Actions
Navigation Actions
- Click:
  - Requires attributes
  - Clicks a specific element
- Hover:
  - Requires attributes
  - Moves the cursor over an element
- Scroll:
  - No attributes needed
  - Scrolls the full page
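These navigation actions correspond closely to standard browser-automation commands. Below is a minimal Playwright sketch of the same three operations; example.com and the "a" selector are placeholders, and the real node resolves elements from the attributes you configure rather than from hand-written selectors.

```python
from playwright.sync_api import sync_playwright

# Illustrative only: example.com and the "a" selector stand in for the
# attributes you would configure on the node.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/")

    page.hover("a")            # Hover: an attribute locates the element to move the cursor over
    page.mouse.wheel(0, 2000)  # Scroll: no attribute needed, scrolls the page down
    page.click("a")            # Click: an attribute locates the element to click

    browser.close()
```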
Input Actions
- Write:
  - Requires attributes
  - Types text into a field
- Select from Dropdown:
  - Requires attributes
  - Chooses an option
- Wait:
  - Pauses execution for a given number of milliseconds
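A rough Playwright equivalent of the input actions is shown below. The URL and the "#email"/"#country" selectors are invented placeholders for whatever form you are automating; only the shape of the calls is meant to be illustrative.

```python
from playwright.sync_api import sync_playwright

# Placeholder URL and selectors; substitute the form you actually need to fill.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/signup")

    page.fill("#email", "user@example.com")  # Write: types text into the matched field
    page.select_option("#country", "US")     # Select from Dropdown: chooses an option by value
    page.wait_for_timeout(2000)              # Wait: pauses execution for 2000 ms

    browser.close()
```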
Collection Actions
- Screenshot:
  - Two types:
    - Screenshot (captures the area visible in the viewport)
    - Screenshot Full Page (captures the entire page)
- Screenshot Full Page (Mobile):
  - Takes a full-page screenshot of the mobile version of the site
- Scrape:
  - Scrapes the provided URL
- Scrape Source:
  - No attributes needed
  - Gets the HTML source
- Get URL:
  - Returns the current URL
- Get All URLs:
  - Collects all links available on the provided page
  - Output: links separated by commas
- Get Link by Label:
  - Attribute: the label or text to search for on the page. The action finds the first link element that contains this text (case sensitive).
- Get All Components by Label:
  - Attribute: the value of the HTML attribute the action should be performed on, for example a class name or id value.
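The collection actions map onto familiar extraction primitives. The sketch below shows plausible Playwright counterparts for each one; the "Pricing" label and ".card" class are made-up examples, and the node's own output formatting (for instance the comma-separated link list) is approximated rather than reproduced exactly.

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://www.gumloop.com/")

    page.screenshot(path="viewport.png")              # Screenshot: area visible in the viewport
    page.screenshot(path="full.png", full_page=True)  # Screenshot Full Page: entire page
    # Screenshot Full Page (Mobile) would additionally use a mobile profile,
    # e.g. browser.new_context(**p.devices["iPhone 13"]).

    text = page.inner_text("body")                    # Scrape: visible page content
    html = page.content()                             # Scrape Source: raw HTML source
    current_url = page.url                            # Get URL: current page URL

    hrefs = page.eval_on_selector_all("a", "els => els.map(e => e.href)")
    all_urls = ", ".join(hrefs)                       # Get All URLs: links separated by commas

    link = page.locator("a", has_text="Pricing").first  # Get Link by Label: first link containing the text
    cards = page.query_selector_all(".card")             # Get All Components by Label: elements matching an attribute value

    browser.close()
```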
Important Considerations
- Actions run in sequence
- Advanced scraping adds 10 credits to the cost of the node
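Because actions run strictly in sequence, each step operates on whatever page state the previous step left behind. The hypothetical dispatcher below makes that ordering explicit; the action dictionaries are an invented format, not the node's real configuration schema.

```python
from playwright.sync_api import sync_playwright

# Invented action format; the node's real configuration is set in the Gumloop UI.
ACTIONS = [
    {"type": "scroll"},
    {"type": "wait", "ms": 1500},
    {"type": "scrape"},
]

def run_actions(page, actions):
    """Run each action in order; later steps see the page state left by earlier ones."""
    content = None
    for action in actions:
        if action["type"] == "scroll":
            page.mouse.wheel(0, 10000)
        elif action["type"] == "wait":
            page.wait_for_timeout(action["ms"])
        elif action["type"] == "scrape":
            content = page.inner_text("body")
    return content

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://www.gumloop.com/")
    print(run_actions(page, ACTIONS))
    browser.close()
```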
Relevant Templates
To get started quickly with website scraping, use one of these ready-made templates:
- Scrape YC Directory
- Scrape and Categorize Lead Websites
- LinkedIn Company Page Scraper
- Real Estate Listing Data Extractor
These templates are designed to simplify common scraping tasks and can be customized to fit your specific requirements.
In summary, the Web Agent Scraper node automates complex web interactions to collect content that requires multiple steps to access.