Product Hunt Scraper: a CoreClaw Worker that scrapes trending products by keyword for market research, competitor tracking, lead generation, and AI startup trend monitoring. Supports API, feed, browser, and proxy auto strategies.
Product Hunt Scraper is a CoreClaw Worker for discovering trending products by keyword. It is designed for market research, competitor tracking, lead generation, product newsletter workflows, and AI/startup trend monitoring.
The worker is adapted from the open-source ph_ai_tracker / ProductHunt-Scraper project, but removes long-running services and local persistence so it fits CoreClaw's one-shot Worker model.
AI agents, developer tools, sales automation, and data analytics.
search_terms as the CoreClaw task-splitting field.
auto strategy:
PROXY_DOMAIN and PROXY_AUTH.

| Name | Type | Default | Description |
|---|---|---|---|
| search_terms | array / stringList | AI agents, developer tools, sales automation | One keyword per line. CoreClaw splits tasks by this field. |
| limit | integer | 20 | Maximum products returned per search term. |
| strategy | select | auto | auto, browser, scraper, feed, search, or api. Use auto unless you have a specific reason. |
| api_token | string | empty | Optional Product Hunt API token. Required only for api. |
| recent_days | integer | 30 | Keep products posted within the last N days when timestamps exist. Use 0 to disable. |
| max_enrich | integer | 0 | Number of detail pages to visit for missing fields. 0 is fastest and recommended. |
| timeout_seconds | integer | 45 | Timeout for HTTP/API/browser navigation. Auto mode feed/search fallbacks use fixed short internal caps. |
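The recent_days rule can be sketched roughly as follows. This is an illustration only, not the Worker's actual code: the helper name `filter_recent`, the dict-shaped rows, and the assumption that posted_at arrives as an ISO 8601 string are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def filter_recent(rows, recent_days, now=None):
    """Keep rows posted within the last `recent_days` days.

    Hypothetical helper: assumes each row is a dict whose optional
    `posted_at` value is an ISO 8601 string.
    """
    if recent_days <= 0:
        # 0 disables the filter entirely.
        return list(rows)
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=recent_days)
    kept = []
    for row in rows:
        raw = row.get("posted_at")
        if not raw:
            # No timestamp: the filter only applies when timestamps exist.
            kept.append(row)
            continue
        posted = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if posted.tzinfo is None:
            # Assumption: naive timestamps are treated as UTC.
            posted = posted.replace(tzinfo=timezone.utc)
        if posted >= cutoff:
            kept.append(row)
    return kept
```

Rows without a timestamp pass through unchanged in this sketch, which is one reading of "when timestamps exist".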
CoreClaw may split search_terms and pass a single subtask as {"string": "AI agents"}. This Worker explicitly supports that flattened input shape.
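A minimal sketch of how both input shapes might be normalized into one keyword list. The function name `normalize_search_terms` is hypothetical, and accepting a newline-separated string is an assumption based on the "one keyword per line" description; the Worker's real handling may differ.

```python
def normalize_search_terms(task_input):
    """Return a clean keyword list from a full or flattened input dict.

    Hypothetical helper. Handles a list under `search_terms`, a
    newline-separated string under `search_terms` (assumed), and the
    flattened subtask shape {"string": "AI agents"}.
    """
    if "string" in task_input and "search_terms" not in task_input:
        # Flattened subtask: a single keyword under the "string" key.
        raw = [task_input["string"]]
    else:
        raw = task_input.get("search_terms", [])
    if isinstance(raw, str):
        raw = raw.splitlines()
    terms = []
    for item in raw:
        item = str(item).strip()
        # Skip blanks and duplicates while preserving input order.
        if item and item not in terms:
            terms.append(item)
    return terms
```

With this shape, `normalize_search_terms({"string": "AI agents"})` and `normalize_search_terms({"search_terms": ["AI agents"]})` yield the same single-item list, so downstream code does not need to know whether the task was split.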
| Field | Description |
|---|---|
| status | success or failed |
| source | Actual provider used: api, browser, scraper, feed, or search |
| search_term | Search keyword that produced the row |
| rank | Rank within the current search term, sorted by votes when available |
| name | Product name |
| tagline | Short Product Hunt tagline |
| description | Product description when available |
| votes_count | Product Hunt votes |
| url | Product Hunt product URL |
| topics | Product topics/categories |
| posted_at | Product posted time when available |
| error | Failure reason for failed rows |
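The rank field could be derived along these lines. This is a sketch under assumptions: `assign_ranks` is a hypothetical name, rows are plain dicts, and placing rows without votes after rows with votes is a guess at what "when available" means.

```python
def assign_ranks(rows):
    """Sort rows by votes (descending) and number them from 1.

    Hypothetical helper: a missing or None `votes_count` sorts after
    rows that have votes; ties keep their original order.
    """
    def sort_key(row):
        votes = row.get("votes_count")
        # (0, -votes) for rows with votes, (1, 0) for rows without.
        return (0, -votes) if votes is not None else (1, 0)

    ranked = sorted(rows, key=sort_key)  # sorted() is stable
    # Copy each row so the caller's dicts are not mutated.
    return [dict(row, rank=i) for i, row in enumerate(ranked, start=1)]
```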
| Strategy | Best For | Notes |
|---|---|---|
| auto | Production CoreClaw runs | Recommended. Uses token API when available, then Product Hunt Atom feed and bounded site-search fallback. Does not enter page scraping paths by default. |
| browser | Product Hunt blocks HTTP with 403 | Uses CoreClaw remote fingerprint browser through ChromeWs + PROXY_AUTH. |
| scraper | Fast local parsing or simple cloud runs | Tries HTTP/browser paths and then non-page fallbacks when Product Hunt blocks pages. |
| feed | Product Hunt page access is blocked | Uses the public Product Hunt Atom feed. Best for recent launches. |
| search | Keyword-specific fallback | Uses product-page site-search results when Product Hunt pages/API/feed are insufficient. |
| api | Stable official API data | Requires Product Hunt API token. |
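The fallback order described for auto can be sketched as a simple provider chain. Everything here is illustrative: `run_auto` and the `providers` mapping are hypothetical names, the callables stand in for the real API, feed, and search code, and the shape of the failed row is an assumption based on the output fields.

```python
def run_auto(term, limit, api_token, providers):
    """Try providers in the order the auto strategy describes.

    `providers` maps a source name ("api", "feed", "search") to a callable
    taking (term, limit) and returning a list of row dicts. These callables
    are placeholders, not the Worker's real functions.
    """
    order = ["feed", "search"]
    if api_token:
        # The token API is attempted first only when a token is supplied.
        order.insert(0, "api")
    errors = []
    for source in order:
        try:
            rows = providers[source](term, limit)
        except Exception as exc:
            # A provider failure falls through to the next source.
            errors.append(f"{source}: {exc}")
            continue
        if rows:
            return [
                dict(row, status="success", source=source, search_term=term)
                for row in rows[:limit]
            ]
    # Every provider failed or returned nothing: emit one failed row.
    return [{
        "status": "failed",
        "search_term": term,
        "error": "; ".join(errors) or "no results",
    }]
```

Page-scraping sources are deliberately absent from the chain, mirroring the note that auto does not enter page scraping paths by default.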
CoreClaw cloud does not reliably allow direct outbound access. The Worker is designed to use platform-provided network access:
The Worker automatically builds:
Do not hard-code proxy credentials.
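One way to avoid hard-coding credentials is to read them from the environment at run time. The sketch below is an assumption-heavy illustration: the source names PROXY_DOMAIN and PROXY_AUTH but does not document their formats, so the "host:port" and "username:password" shapes, the `http://` scheme, and the helper name `proxy_url_from_env` are all hypothetical.

```python
import os
from urllib.parse import quote


def proxy_url_from_env(env=None):
    """Build a proxy URL from PROXY_DOMAIN and PROXY_AUTH, or return None.

    Assumed formats (not confirmed by the source):
      PROXY_DOMAIN = "host:port"
      PROXY_AUTH   = "username:password"
    """
    env = os.environ if env is None else env
    domain = env.get("PROXY_DOMAIN", "").strip()
    auth = env.get("PROXY_AUTH", "").strip()
    if not domain:
        # No platform proxy configured.
        return None
    if not auth:
        return f"http://{domain}"
    user, _, password = auth.partition(":")
    # Percent-encode credentials so special characters survive in the URL.
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{domain}"
```

Passing a plain dict as `env` makes the helper easy to exercise locally without touching real credentials.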
Syntax check:
The actual main.py entry requires CoreClaw's SDK gRPC service, so run full end-to-end tests on CoreClaw.
Removed from the original project:
Kept and adapted:
__NEXT_DATA__ parser
It scrapes trending products from Product Hunt by keyword for market research, competitor tracking, lead generation, and startup trend monitoring.
Switch to browser strategy; it uses CoreClaw’s fingerprint browser + proxy.
It’s your keyword list; CoreClaw automatically splits tasks by this field.