

It queries the Google search engine by keyword and returns a structured SERP summary, including the final search parameters, organic results, related queries, and people-also-ask data.
It queries the Google SERP by keyword and outputs one row per organic result. Shared search-level fields are repeated on every row, while organic-result fields change from row to row.
The final output row count is capped by max_results. The scraper estimates the required page count at about 9 organic results per page, starts those page requests concurrently, then continues requesting later pages until it has streamed the requested number of unique rows or no more useful results are found.
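
The page estimate described above can be sketched as follows. This is a minimal illustration, not the scraper's actual code: the function name `plan_initial_offsets` and both constants are hypothetical, and only the figures (about 9 organic results per page, offsets stepping by 10) come from this document.

```python
import math

# Assumed constants, taken from the figures quoted in this document.
RESULTS_PER_PAGE_ESTIMATE = 9   # about 9 organic results per SERP page
GOOGLE_PAGE_STEP = 10           # Google offsets advance 0, 10, 20, ...


def plan_initial_offsets(max_results: int) -> list[int]:
    """Return the start offsets for the first concurrent batch of page requests.

    Hypothetical helper: the real pagination runner may plan differently.
    """
    if max_results <= 0:
        return []
    pages = math.ceil(max_results / RESULTS_PER_PAGE_ESTIMATE)
    return [page * GOOGLE_PAGE_STEP for page in range(pages)]
```

With the default max_results of 10, this plans two pages (offsets 0 and 10), because a single page is expected to yield only about 9 organic results.
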
Search-level fields such as keyword, current_page, google_domain, country, language, geo_location, safe_search, results_count_collected, results_total_text, related_queries, and people_also_ask are repeated on each organic-result row.
position is rewritten as the continuous output rank after pagination and de-duplication. Other organic-result fields such as title, source_name, display_url, url, clean_url, root_domain, redirect_url, snippet, highlighted_terms, image_alt, has_image, and favicon_url come from the current item in the organic list.
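
A rough sketch of how position could be rewritten during streaming is shown below. The de-duplication key is an assumption (clean_url, falling back to url); the document does not state which field the scraper compares, and `stream_unique_rows` is a hypothetical name.

```python
from typing import Iterable, Iterator


def stream_unique_rows(pages: Iterable[list[dict]], max_results: int) -> Iterator[dict]:
    """Yield unique organic items in page order with a continuous position.

    Assumption: duplicates are detected by clean_url (or url when it is missing).
    """
    if max_results <= 0:
        return
    seen: set[str] = set()
    rank = 0
    for page in pages:
        for item in page:
            key = item.get("clean_url") or item.get("url")
            if not key or key in seen:
                continue  # skip items without a URL and repeated results
            seen.add(key)
            rank += 1
            # Overwrite the per-page position with the continuous output rank.
            yield {**item, "position": rank}
            if rank >= max_results:
                return
```

Because the function is a generator, rows can be pushed as soon as they are known to be unique, and iteration stops once max_results rows have been emitted.
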
The script reads PROXY_AUTH and PROXY_DOMAIN, builds socks5://{PROXY_AUTH}@{PROXY_DOMAIN}, and sends it as input_proxy.
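
As an illustration, the proxy URL could be assembled like this. The function name `build_input_proxy` and the error handling for missing variables are assumptions; only the two variable names and the socks5 URL shape come from the text above.

```python
import os
from typing import Mapping, Optional


def build_input_proxy(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the socks5 proxy URL from PROXY_AUTH and PROXY_DOMAIN.

    Hypothetical helper; raising on missing values is an assumption.
    """
    env = os.environ if env is None else env
    auth = env.get("PROXY_AUTH", "")
    domain = env.get("PROXY_DOMAIN", "")
    if not auth or not domain:
        raise ValueError("PROXY_AUTH and PROXY_DOMAIN must both be set")
    return f"socks5://{auth}@{domain}"
```

Passing a plain dict instead of the process environment keeps the helper easy to exercise in tests.
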
It also extracts runtime values from PROXY_AUTH when frontend input does not override them:
- task_id: from taskId-...
- user_name: current default account segment
- user_id: current configured default user id unless explicitly provided

This is the shape of one returned row. The full result is still a row list at the platform level because the SDK uses table-style push_data.

| Column name | Description | Data type |
|---|---|---|
| keyword | Search keyword used for the request | Text |
| current_page | Current SERP page number from pagination.current_page | Number |
| start_position | Result offset used for the request | Number |
| google_domain | Google domain used for the search request | Text |
| country | Country parameter from the search request | Text |
| language | Language parameter from the search request | Text |
| geo_location | Resolved location value from location or uule | Text |
| safe_search | Safe search mode | Text |
| exclude_autocorrected_results | Whether auto-corrected results are excluded | Boolean |
| results_filtering | Google result filtering mode | Text |
| scraped_at | Scrape timestamp in ISO 8601 format | Text |
| search_url | Requested Google search URL from search_metadata.spider_url | Url |
| results_count_collected | Number of organic results collected on the current page | Number |
| results_total_text | Raw total-results text from search_information.total_results | Text |
| related_queries | Related query records repeated on each row | Array |
| position | Continuous organic result ranking position after pagination and de-duplication | Number |
| title | Organic result title | Text |
| source_name | Organic result source name | Text |
| display_url | Google display URL shown in the result | Url |
| url | Resolved organic result URL | Url |
| clean_url | Canonical URL without query string or fragment | Url |
| root_domain | Root domain extracted from the organic result URL | Text |
| redirect_url | Google redirect URL for the organic result | Url |
| snippet | Organic result snippet text | Text |
| highlighted_terms | Comma-separated highlighted terms from the snippet | Text |
| image_alt | Image alt text for the organic result | Text |
| has_image | Whether the organic result includes image metadata | Boolean |
| favicon_url | Raw favicon URL or data URI from the organic result | Text |
| people_also_ask | People-also-ask records repeated on each row | Array |
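
To make the clean_url column concrete, here is one way a URL can be reduced to a form without query string or fragment, using only the standard library. This is a sketch of the idea; the scraper's own mapping code may differ, and `to_clean_url` is a hypothetical name.

```python
from urllib.parse import urlsplit, urlunsplit


def to_clean_url(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

For example, `https://example.com/a/b?x=1#frag` becomes `https://example.com/a/b`.
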

related_queries

| Field | Description | Data type |
|---|---|---|
| block_position | Related-query block position in the SERP | Number |
| topic_title | Related query title text | Text |
| related_search_url | Google search URL for the related query | Url |
| item_position | Item position inside the related query block | Number |

people_also_ask

| Field | Description | Data type |
|---|---|---|
| position | Position in the people-also-ask list | Number |
| question | Question text | Text |
| answer | Answer or snippet text when available | Text |
| source_url | Source URL for the answer when available | Url |

| Param | Type | Required | Default | Description |
|---|---|---|---|---|
| keyword | String | Yes | — | Search keyword. A full Google search URL is also accepted and the builder will extract supported parameters from it when possible. |
| max_results | Integer | No | 10 | Maximum number of organic result rows to output. Internal pagination still uses Google offsets 0, 10, 20, and so on, but users only need to provide the desired row count. |
| domain | String | No | https://www.google.com/ | Google domain used during scraping. |
| gl | String | No | us | Region setting for search results using a two-letter country code. |
| hl | String | No | en | Interface language for Google search results. |
| cr | Array | No | [] | Restricts results to one or more countries or regions. Values are joined with \|, for example countryFR\|countryDE. |
| lr | Array | No | [] | Restricts results to one or more languages. Values use lang_XX and are joined with \|, for example lang_fr\|lang_de. |
| location | String | No | — | Geographic location text used to simulate local search context. |
| tbs | String | No | — | Advanced Google search filters, such as time or search-vertical filters. |
| safe | String | No | off | Adult-content filtering mode. Supported values are active and off. |
| nfpr | String | No | 0 | Controls spelling auto-correction. nfpr=1 disables Google auto-correction. |
| filter | String | No | 0 | Enables or disables Google's duplicate-like result filtering. |
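
The way the cr and lr arrays collapse into single query values can be sketched as below. The function name `build_restriction_params` is hypothetical, and omitting empty arrays from the request is an assumption; the pipe-joined format is the one described in the table.

```python
def build_restriction_params(cr: list[str], lr: list[str]) -> dict[str, str]:
    """Join cr and lr arrays with '|' and skip parameters that end up empty.

    Assumption: an empty array means the parameter is not sent at all.
    """
    params: dict[str, str] = {}
    for name, values in (("cr", cr), ("lr", lr)):
        # Ignore blank entries so stray empty strings do not produce "||".
        joined = "|".join(v.strip() for v in values if v and v.strip())
        if joined:
            params[name] = joined
    return params
```

Calling it with `["countryFR", "countryDE"]` and `["lang_fr", "lang_de"]` produces `cr=countryFR|countryDE` and `lr=lang_fr|lang_de` before URL encoding.
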

- domain=https://www.google.com/
- gl=us
- hl=en
- safe=off
- nfpr=0
- filter=0
- max_results=10

Hidden internal defaults still include uule, num=10, render_js, device, retry logic, 5 parallel probes per offset, a second same-offset probe batch when all 5 probes fail, and a lightweight pool capped at 50 in-flight probe requests.
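
The probe behavior mentioned above (5 parallel probes per offset, one repeat batch when all fail, and a cap of 50 in-flight requests) could look roughly like the following asyncio sketch. Everything here is illustrative: `probe_offset`, the `fetch` callable, and the choice to wait for a whole batch before picking a result are assumptions, not the scraper's actual implementation.

```python
import asyncio
from typing import Awaitable, Callable, Optional

PROBES_PER_OFFSET = 5   # parallel probes per offset (from this document)
MAX_BATCHES = 2         # first batch plus one same-offset repeat batch
MAX_IN_FLIGHT = 50      # pool cap for in-flight probe requests


async def probe_offset(
    fetch: Callable[[int], Awaitable[dict]],
    offset: int,
    pool: asyncio.Semaphore,
) -> Optional[dict]:
    """Return the first successful probe response for an offset, or None."""

    async def one_probe() -> Optional[dict]:
        async with pool:  # the shared semaphore caps concurrent requests
            try:
                return await fetch(offset)
            except Exception:
                return None  # a failed probe counts as "no result"

    for _ in range(MAX_BATCHES):
        # Simplification: wait for the whole batch, then take the first success.
        results = await asyncio.gather(*(one_probe() for _ in range(PROBES_PER_OFFSET)))
        for result in results:
            if result is not None:
                return result
    return None
```

A single `asyncio.Semaphore(MAX_IN_FLIGHT)` would be created once inside the running event loop and shared by every `probe_offset` call, so the cap applies across all offsets rather than per offset.
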
- request_builder.py: input parsing, normalization, validation, and request payload construction
- google_serp_client.py: spider interface HTTP request execution with retry support
- pagination_runner.py: concurrent internal pagination, unique-row tracking, and ordered streaming output
- response_mapper.py: interface response extraction and row-level output mapping
- main.py: task orchestration and CoreSDK integration
- output_schema.json: output field schema aligned with the current row format

The integration tests start a local mock CoreSDK gRPC server and a local mock spider interface server, then verify payload construction, retry behavior, and row-level output mapping.