Website Scraper

This document explains the Website Scraper node, which extracts content from web pages.

Node Inputs

Required Fields

URL: Web address to scrape. Example: “https://www.gumloop.com/”

Optional Fields

Use Advanced Scraping: Enable this option to route requests through residential proxies. This helps avoid common blocks and restrictions imposed by websites, which makes data extraction more reliable and thorough.
Timeout: Maximum time (in seconds) to wait for the website to respond before the request is considered failed. This helps to handle slow-loading pages and avoid unnecessary delays.
Example: 30 for a 30-second timeout.
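To illustrate what a timeout means in practice, here is a minimal Python sketch using only the standard library. It is not the node's implementation: the function name fetch_with_timeout, the opener parameter, and the choice to return None on failure are assumptions made for illustration.

```python
import socket
import urllib.error
import urllib.request


def fetch_with_timeout(url, timeout_seconds=30, opener=urllib.request.urlopen):
    """Return the page body as text, or None if the request fails or times out.

    Hypothetical helper: `opener` is injectable so the behaviour can be
    demonstrated without network access.
    """
    try:
        # Give up if the site has not responded within `timeout_seconds`.
        with opener(url, timeout=timeout_seconds) as response:
            return response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, TimeoutError, socket.timeout):
        # The request is treated as failed instead of waiting indefinitely.
        return None
```

Calling fetch_with_timeout("https://www.gumloop.com/", 30) would wait at most 30 seconds for a response before treating the request as failed.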

Node Output

Website Content: Extracted text and data

Node Functionality

The Website Scraper node:

Visits web pages
Extracts readable content
Handles various content types
Bypasses common restrictions (when Use Advanced Scraping is enabled)
Supports batch processing
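As a rough illustration of what "extracts readable content" can mean, the sketch below pulls visible text out of an HTML string with Python's standard html.parser module, skipping script and style blocks. The node's actual extraction logic is not documented here, so the class and function names are hypothetical and the approach is only one simple possibility.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_readable_text(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Real pages usually need more than this (navigation menus, cookie banners, and dynamically rendered content are not handled), which is part of what a dedicated scraping node takes care of.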

Common Use Cases

Content Collection:

Input: Blog URLs
Output: Article content
Use: Research, analysis

Data Monitoring:

Input: Product pages
Output: Pricing, details
Use: Market research

Information Gathering:

Input: News sites
Output: Latest updates
Use: News aggregation

Loop Mode Pattern

Input: List of URLs
Process: Scrape each site
Output: Content from each URL
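The loop pattern above can be sketched in Python as follows. This is an illustrative analogy and not how Loop Mode is implemented: scrape_each and the scrape callable passed to it are hypothetical names, and recording errors per URL is an assumption about how one might keep a single failure from stopping the whole batch.

```python
def scrape_each(urls, scrape):
    """Apply `scrape` to every URL and return one result per URL, in input order.

    `scrape` is any callable that takes a URL and returns its content
    (hypothetical; stands in for a single run of the scraper).
    """
    results = []
    for url in urls:
        try:
            results.append({"url": url, "content": scrape(url), "error": None})
        except Exception as exc:
            # Record the failure and continue with the remaining URLs.
            results.append({"url": url, "content": None, "error": str(exc)})
    return results
```

Each entry in the returned list corresponds to one input URL, mirroring the "Content from each URL" output.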

Relevant Templates

To get started quickly with website scraping, use one of these ready-made templates:

These templates are designed to simplify common scraping tasks and can be customized to fit your specific requirements.

Important Considerations

URLs must include https:// or http://
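If URLs come from a spreadsheet or user input, it can help to check them before they reach the node. The following is a minimal sketch using Python's urllib.parse; the function name has_required_scheme is an assumption for illustration, and the node's own validation may differ.

```python
from urllib.parse import urlsplit


def has_required_scheme(url):
    """Return True if the URL starts with http:// or https:// and names a host."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

For example, has_required_scheme("https://www.gumloop.com/") is True, while has_required_scheme("www.gumloop.com") is False because the scheme is missing.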

In summary, the Website Scraper node helps you automatically collect web content, with options for handling both simple and restricted websites.
