A cross-browser web scraping tool that uses Playwright for full browser rendering. It supports the Chromium, Firefox, and WebKit browser engines and is suited to dynamic pages, single-page applications (SPAs), infinite-scroll pages, and cross-browser testing scenarios.
URLs to start crawling from. Multiple URLs are supported.
CSS selector for finding links to follow.
Only crawl URLs matching these patterns (e.g., https://example.com/blog/*).
URL patterns to skip (e.g., /login, /admin, *.pdf).
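The include and skip patterns above can be pictured as a simple filter applied to every discovered link. The sketch below is an illustration only: it assumes glob-style matching (via Python's `fnmatch`), that skip patterns take precedence, and that a skip pattern may match either the full URL or just its path. The tool's real matching rules may differ, and `should_crawl` is a hypothetical helper name, not part of the tool.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def should_crawl(url, include_patterns, exclude_patterns):
    """Return True if url passes the include list and is not excluded.

    Assumption: patterns are globs, where * matches any run of characters.
    """
    path = urlsplit(url).path
    # Assumed precedence: an exclude match always wins.
    for pattern in exclude_patterns:
        if fnmatchcase(url, pattern) or fnmatchcase(path, pattern):
            return False
    # An empty include list is treated as "allow everything".
    if not include_patterns:
        return True
    return any(fnmatchcase(url, pattern) for pattern in include_patterns)


include = ["https://example.com/blog/*"]
exclude = ["/login", "/admin", "*.pdf"]
print(should_crawl("https://example.com/blog/post-1", include, exclude))
print(should_crawl("https://example.com/blog/report.pdf", include, exclude))
```

Note that `fnmatch` also treats `?` and `[...]` as wildcards, so a pattern containing a literal query string would need escaping under this scheme.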
Maximum crawl depth (0 = start pages only, 1 = follow links one level deep).
Maximum number of pages to crawl (0 = unlimited; ≤50 recommended for speed).
Maximum number of results to output (0 = unlimited).
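To make the depth and page limits concrete, here is a minimal sketch of a breadth-first crawl over an in-memory link map. It assumes breadth-first ordering and that the page limit counts visited pages; the tool's actual scheduling is not documented here. `crawl_order` and the `links` dictionary are hypothetical stand-ins for real page fetches.

```python
from collections import deque


def crawl_order(start_urls, links, max_depth, max_pages):
    """Return the visit order under a depth limit and a page limit.

    links: hypothetical map {url: [linked urls]} replacing real fetches.
    max_pages: 0 means unlimited, mirroring the setting described above.
    """
    start = list(dict.fromkeys(start_urls))  # drop duplicate start URLs
    queue = deque((url, 0) for url in start)
    seen = set(start)
    visited = []
    while queue:
        if max_pages and len(visited) >= max_pages:
            break  # page budget used up
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


site = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
print(crawl_order(["a"], site, max_depth=1, max_pages=0))  # ['a', 'b', 'c']
```

With `max_depth=0` only the start pages are visited, which matches the "0 = start pages only" convention. A cap on output results would be a separate counter applied when results are written.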
Number of concurrent browser tabs (3-5 recommended for best performance).
Page load timeout in seconds (a lower value detects failures sooner).
Page function execution timeout in seconds.
Number of retries for failed requests (0 = no retries).
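The concurrency, timeout, and retry settings above interact, and a small `asyncio` sketch may help show how. It is not the tool's implementation: `run_with_limits` and `crawl_all` are hypothetical names, and `task` stands in for whatever loads and processes a page. A semaphore caps how many tasks run at once, `asyncio.wait_for` enforces the timeout, and a loop retries failures.

```python
import asyncio


async def run_with_limits(task, url, semaphore, timeout_s, max_retries):
    """Run task(url) under a concurrency cap, a timeout, and a retry budget."""
    async with semaphore:  # at most `concurrency` tasks hold this at once
        for attempt in range(max_retries + 1):
            try:
                return await asyncio.wait_for(task(url), timeout=timeout_s)
            except Exception:
                if attempt == max_retries:
                    raise  # retry budget exhausted: surface the last error


async def crawl_all(urls, task, concurrency=3, timeout_s=30, max_retries=2):
    """Process all urls; failures are returned as exception objects."""
    semaphore = asyncio.Semaphore(concurrency)
    jobs = [
        run_with_limits(task, url, semaphore, timeout_s, max_retries)
        for url in urls
    ]
    return await asyncio.gather(*jobs, return_exceptions=True)
```

In this sketch a timed-out attempt counts as a failure and is retried like any other error, so a lower timeout shortens the worst-case time spent per page to roughly `timeout_s * (max_retries + 1)`.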
Event that marks page navigation as complete. 'domcontentloaded' is the fastest option.
Download images and media files (slower).
Download CSS stylesheets.
Bypass CORS and Content Security Policy restrictions.
Auto-close cookie consent popups.
Auto-scroll height in pixels (0 = disabled). Useful for infinite scroll pages.
Keep URL hash fragments in crawled links.
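Whether hash fragments are kept affects which links count as distinct pages. The sketch below shows one plausible reading, assuming that fragments are stripped before de-duplication when the option is off. `unique_links` is a hypothetical helper; `urldefrag` is the standard-library function for splitting off a fragment.

```python
from urllib.parse import urldefrag


def unique_links(urls, keep_fragment=False):
    """De-duplicate links, optionally treating #fragments as significant."""
    if keep_fragment:
        normalized = list(urls)
    else:
        # urldefrag splits "page#section" into ("page", "section")
        normalized = [urldefrag(url).url for url in urls]
    # dict.fromkeys keeps first-seen order while removing duplicates
    return list(dict.fromkeys(normalized))


found = [
    "https://example.com/docs#intro",
    "https://example.com/docs#usage",
    "https://example.com/docs",
]
print(unique_links(found))                      # one entry
print(unique_links(found, keep_fragment=True))  # three entries
```

Keeping fragments is mainly useful for SPAs that route on the hash (for example `#/settings`), where each fragment can render a different view.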
Ignore SSL certificate errors.
Enable detailed debug logging.
Log browser console messages.