A powerful and flexible web scraping tool that automatically crawls websites, extracts structured data, and discovers new links.
List of starting URLs to crawl
Maximum crawl depth, measured from the starting URLs (a starting page is depth 0)
Maximum number of pages to crawl per run (0 = unlimited)
Maximum time to wait for a page to load, in seconds
Condition that marks page navigation as complete
Whether to inject the jQuery library into each page
Whether to ignore SSL certificate errors
Whether to download images, videos, and other media (disable for faster crawling)
Whether to download CSS stylesheets (disable for faster crawling)
Whether to output detailed debug logs
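
The depth and page limits described above interact, so a small sketch may help. The following is an illustration only, not the tool's actual implementation: the names `CrawlLimits`, `start_urls`, `max_depth`, `max_pages`, and `crawl_order` are hypothetical, the crawl is assumed to be breadth-first, and pages exactly at the maximum depth are assumed to be crawled without following their links. Links come from an in-memory dictionary, so nothing is fetched over the network.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class CrawlLimits:
    # Hypothetical field names; the tool's real input keys are not shown in this listing.
    start_urls: list[str]
    max_depth: int = 1  # starting pages count as depth 0
    max_pages: int = 0  # 0 means no page limit


def crawl_order(
    limits: CrawlLimits,
    get_links: Callable[[str], Iterable[str]],
) -> list[tuple[str, int]]:
    """Return (url, depth) pairs in the order an assumed breadth-first crawl visits them."""
    queue: deque[tuple[str, int]] = deque()
    seen: set[str] = set()
    for url in limits.start_urls:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))

    visited: list[tuple[str, int]] = []
    while queue:
        # Stop once the page budget is used up (a budget of 0 disables this check).
        if limits.max_pages and len(visited) >= limits.max_pages:
            break
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= limits.max_depth:
            continue  # page is crawled, but its links are not followed
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# A made-up three-level site used only for demonstration.
SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/a/1"],
    "https://example.com/b": ["https://example.com/"],
}


def links_of(url: str) -> list[str]:
    return SITE.get(url, [])


if __name__ == "__main__":
    limits = CrawlLimits(["https://example.com/"], max_depth=2, max_pages=3)
    for page, depth in crawl_order(limits, links_of):
        print(depth, page)
```

Under these assumptions, a depth limit of 1 reaches the starting page and the pages it links to directly, while a non-zero page limit can end the run before the depth limit is ever reached.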