A high-performance web scraper for RAG and AI, featuring Google search integration, dual-mode extraction (HTTP/Browser), and multi-format output.
Enter Google Search keywords or URLs to scrape. Each line runs as a separate task.
Maximum number of search results to return (1-100). Ignored if the query is a URL.
The format of the output content.
⚠️ Raw HTTP: Fast, but cannot handle JavaScript, Cloudflare, or anti-bot protection. Use for static HTML pages, APIs, and simple websites. ✅ Browser: Handles dynamic JS, SPAs, and Cloudflare, but is slower.
Maximum time in seconds to wait for a request to complete (1-300).
Number of retry attempts for failed requests.
Number of parallel scraping operations. Higher values run faster but use more resources.
Time to wait for dynamic page content (browser mode only).
Automatically remove cookie warning banners from pages.
Enable debug logging and performance metrics.
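The input rules above (one task per line, a 1-100 result limit that is ignored for URLs) can be illustrated with a minimal Python sketch. This is not the scraper's actual code: the names `Task`, `is_url`, and `build_tasks`, the default of 10 results, the skipping of blank lines, and the "http/https scheme plus host" test for URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class Task:
    """One unit of work derived from a single input line (hypothetical type)."""
    kind: str                   # "url" or "search"
    query: str                  # the URL itself, or the search keywords
    max_results: Optional[int]  # None for URL tasks, where the limit is ignored


def is_url(text: str) -> bool:
    """Assumed rule: a line is a URL if it has an http(s) scheme and a host."""
    parsed = urlparse(text)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_tasks(raw_input: str, max_results: int = 10) -> List[Task]:
    """Split multi-line input into tasks, one per non-empty line."""
    # Clamp the limit into the documented 1-100 range.
    limit = max(1, min(100, max_results))
    tasks: List[Task] = []
    for line in raw_input.splitlines():
        query = line.strip()
        if not query:
            continue  # assumption: blank lines are skipped
        if is_url(query):
            tasks.append(Task("url", query, None))
        else:
            tasks.append(Task("search", query, limit))
    return tasks


if __name__ == "__main__":
    for task in build_tasks("python web scraping\nhttps://example.com/docs", 20):
        print(task)
```

Keeping the classification in one place makes it easy to see why the result limit only applies to keyword lines: a URL task never reaches the search step, so the value is simply not carried along.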
Explore more popular scrapers from our marketplace
by CoreClaw
It queries the Google search engine by keyword and returns a structured SERP summary, including the final search parameters, organic results, related queries, and people-also-ask data.
by Kael Odin
Dedup Datasets Worker is a powerful tool for merging and deduplicating datasets from multiple JSON/JSONL files. Fully optimized for the CafeScraper platform with enhanced features and robust error handling.
by Kael Odin
A powerful Google Sheets data import/export tool designed for data synchronization, backup, and integration between Google Sheets and external systems. Supports three operation modes, two authentication methods, batch processing, data deduplication, and automatic backup.
by Kael Odin
A high-speed static page scraper based on Cheerio, designed specifically for static HTML pages. Uses Cheerio for HTML parsing, delivering speeds 10-50 times faster than full browser rendering.