Product Hunt Scraper

A CoreClaw Worker that scrapes trending products by keyword for market research, competitor tracking, lead generation, and AI startup trend monitoring. Supports API, feed, browser, and proxy access, with an auto strategy.
Product Hunt Scraper is a CoreClaw Worker for discovering trending products by keyword. It is designed for market research, competitor tracking, lead generation, product newsletter workflows, and AI/startup trend monitoring.
The worker is adapted from the open-source ph_ai_tracker / ProductHunt-Scraper project, but removes long-running services and local persistence so it fits CoreClaw's one-shot Worker model.
- AI agents, developer tools, sales automation, and data analytics.
- search_terms as the CoreClaw task-splitting field.
- auto strategy:
PROXY_DOMAIN and PROXY_AUTH.

| Name | Type | Default | Description |
|---|---|---|---|
| search_terms | array / stringList | AI agents, developer tools, sales automation | One keyword per line. CoreClaw splits tasks by this field. |
| limit | integer | 20 | Maximum products returned per search term. |
| strategy | select | auto | auto, browser, scraper, feed, search, or api. Use auto unless you have a specific reason. |
| api_token | string | empty | Optional Product Hunt API token. Required only for api. |
| recent_days | integer | 30 | Keep products posted within the last N days when timestamps exist. Use 0 to disable. |
| max_enrich | integer | 0 | Number of detail pages to visit for missing fields. 0 is fastest and recommended. |
| timeout_seconds | integer | 45 | Timeout for HTTP/API/browser navigation. Auto mode feed/search fallbacks use fixed short internal caps. |

CoreClaw may split search_terms and pass a single subtask as {"string": "AI agents"}. This Worker explicitly supports that flattened input shape.
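
A hedged sketch of how both input shapes might be normalized into a plain keyword list is shown here. The function names are hypothetical, and the exact payload CoreClaw delivers is an assumption: the sketch accepts a list of strings, a multi-line string, a list of {"string": ...} items, and the flattened {"string": ...} object, either at the top level or under search_terms.

```python
def _flatten(value):
    """Recursively collect non-empty keyword strings from a nested input value."""
    if value is None:
        return []
    if isinstance(value, str):
        # One keyword per line; blank lines are ignored.
        return [line.strip() for line in value.splitlines() if line.strip()]
    if isinstance(value, dict):
        # Flattened subtask shape: {"string": "AI agents"}
        return _flatten(value.get("string"))
    if isinstance(value, (list, tuple)):
        terms = []
        for item in value:
            terms.extend(_flatten(item))
        return terms
    return []


def normalize_search_terms(raw_input):
    """Return a list of keywords from either the full or the flattened input."""
    if isinstance(raw_input, dict):
        # Fall back to the input itself when search_terms is absent,
        # so a top-level {"string": ...} subtask is still understood.
        return _flatten(raw_input.get("search_terms", raw_input))
    return _flatten(raw_input)
```

For example, `normalize_search_terms({"string": "AI agents"})` and `normalize_search_terms({"search_terms": ["AI agents"]})` both yield a one-element list containing `"AI agents"`.
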

| Field | Description |
|---|---|
| status | success or failed |
| source | Actual provider used: api, browser, scraper, feed, or search |
| search_term | Search keyword that produced the row |
| rank | Rank within the current search term, sorted by votes when available |
| name | Product name |
| tagline | Short Product Hunt tagline |
| description | Product description when available |
| votes_count | Product Hunt votes |
| url | Product Hunt product URL |
| topics | Product topics/categories |
| posted_at | Product posted time when available |
| error | Failure reason for failed rows |

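
The rank field can be illustrated with a short sketch. The helper name is hypothetical, and treating a missing votes_count as 0 is an assumption made here; the Worker may order such rows differently.

```python
def rank_products(rows):
    """Assign a 1-based `rank` within each search term, highest votes first."""
    by_term = {}
    for row in rows:
        by_term.setdefault(row.get("search_term"), []).append(row)

    ranked = []
    for term_rows in by_term.values():
        # sorted() is stable, so rows with equal or missing votes keep
        # the order in which the provider returned them.
        ordered = sorted(term_rows, key=lambda r: -(r.get("votes_count") or 0))
        for position, row in enumerate(ordered, start=1):
            ranked.append({**row, "rank": position})
    return ranked
```

Each search term is ranked independently, so the top product for every keyword gets rank 1.
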

| Strategy | Best For | Notes |
|---|---|---|
| auto | Production CoreClaw runs | Recommended. Uses token API when available, then Product Hunt Atom feed and bounded site-search fallback. Does not enter page scraping paths by default. |
| browser | Product Hunt blocks HTTP with 403 | Uses CoreClaw remote fingerprint browser through ChromeWs + PROXY_AUTH. |
| scraper | Fast local parsing or simple cloud runs | Tries HTTP/browser paths and then non-page fallbacks when Product Hunt blocks pages. |
| feed | Product Hunt page access is blocked | Uses the public Product Hunt Atom feed. Best for recent launches. |
| search | Keyword-specific fallback | Uses product-page site-search results when Product Hunt pages/API/feed are insufficient. |
| api | Stable official API data | Requires Product Hunt API token. |

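
The fallback order described for auto can be sketched as a simple provider chain. This is an illustration under stated assumptions rather than the Worker's implementation: `run_auto` and the `providers` mapping are hypothetical names, and each provider is assumed to be a callable that takes a keyword and returns a list of rows.

```python
def run_auto(term, providers, api_token=""):
    """Try api (only with a token), then feed, then search; stop at the first hit."""
    order = (["api"] if api_token else []) + ["feed", "search"]
    errors = []
    for name in order:
        fetch = providers.get(name)
        if fetch is None:
            continue
        try:
            rows = fetch(term)
        except Exception as exc:  # a failing provider falls through to the next
            errors.append(f"{name}: {exc}")
            continue
        if rows:
            return {"status": "success", "source": name, "rows": rows}
    return {
        "status": "failed",
        "source": None,
        "rows": [],
        "error": "; ".join(errors) or "no results",
    }
```

The page-scraping strategies (browser, scraper) are deliberately absent from the chain, matching the note that auto does not enter page scraping paths by default.
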
CoreClaw cloud does not reliably allow direct outbound access. The Worker is designed to use platform-provided network access:
The Worker automatically builds:
Do not hard-code proxy credentials.
Syntax check:
The actual main.py entry requires CoreClaw's SDK gRPC service, so run full end-to-end tests on CoreClaw.
Removed from the original project:
Kept and adapted:
__NEXT_DATA__ parser

Product Hunt Scraper scrapes trending products from Product Hunt by keyword for market research, competitor tracking, lead generation, and startup trend monitoring.
If Product Hunt blocks HTTP requests with 403, switch to the browser strategy; it uses CoreClaw’s fingerprint browser and proxy.
search_terms is your keyword list; CoreClaw automatically splits tasks by this field.