

Auto-detect URLs or keywords to bulk extract full metadata for YouTube videos, channels & playlists. Download subtitles & comments, export to JSON/CSV. Perfect for market research & competitor analysis.
Unified YouTube Scraper. Give it YouTube URLs or search keywords and it auto-detects the input type and returns videos with full metadata. One scraper covers every YouTube data type:
| Input | What it returns |
|---|---|
| Video URL (watch?v=, youtu.be, /shorts/) | That single video with full detail |
| Channel URL (/@handle, /channel/UC..., /c/, /user/) | The channel's videos / shorts / streams |
| Playlist URL (?list=) | Every video in the playlist |
| Search results URL (/results?search_query=) | Matching videos |
| Search keyword | Matching videos / Shorts / streams (respecting the filters) |
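
The auto-detection in the table above can be approximated with plain URL parsing. This is a minimal sketch, not the scraper's actual code: the function name `detect_input_type`, the returned labels, and the precedence rule (a `watch?v=` URL that also carries `list=` is treated as a video) are assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical labels; they mirror the sourceType values listed in the field table.
CHANNEL_PREFIXES = ("/@", "/channel/", "/c/", "/user/")


def detect_input_type(value: str) -> str:
    """Classify one input as video, channel, playlist, search, keyword, or unknown."""
    text = value.strip()
    if not text.lower().startswith(("http://", "https://")):
        return "keyword"  # anything that is not a URL is treated as a search keyword

    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    query = parse_qs(parsed.query)

    if host == "youtu.be":
        return "video"
    if host != "youtube.com" and not host.endswith(".youtube.com"):
        return "unknown"

    if parsed.path == "/results" and "search_query" in query:
        return "search"
    # Assumption: a watch URL is a video even when it also carries list=.
    if parsed.path == "/watch" and "v" in query:
        return "video"
    if parsed.path.startswith("/shorts/"):
        return "video"
    if "list" in query:
        return "playlist"
    if parsed.path.startswith(CHANNEL_PREFIXES):
        return "channel"
    return "unknown"
```

For example, `detect_input_type("https://www.youtube.com/@handle")` yields `"channel"` and `detect_input_type("python tutorial")` yields `"keyword"`.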
Per-type result caps are independent and applied per search term and per
channel: maxResults (videos), maxResultsShorts, maxResultStreams.
Search terms support the full YouTube search filters — Sorting order,
Date filter, Length filter, Video type filter and Features
(HD, Subtitles/CC, Creative Commons, 3D, Live, Purchased, 4K, 360°, Location,
HDR, VR180) — all encoded into the search sp parameter. Channels additionally
support publishedAfter (date) and Channel Sort By.
Optional add-ons (per video): subtitles / transcript (downloadSubtitles,
with subtitleFormat srt/vtt/plaintext/json and preferAutoGeneratedSubtitles)
and comments (includeComments).
Export formats: JSON, CSV.
Table columns = one column per logical field (stable column count — exactly the fields in the Field Dictionary below). Array/object fields (subtitles, descriptionLinks, hashtags, collaborators, aboutChannelInfo, comments, …) stay as a single column: the console collapses them to "N items" / "N fields" (click to expand); CSV/XLSX export serializes the whole value as a JSON string in that one cell.
Long-subtitle note: the full SRT is inlined inside the subtitles cell's JSON. CSV has no per-cell length limit, so long videos keep their full subtitles. Excel/XLSX caps cells at 32,767 characters, so very long subtitles are truncated in XLSX; export as CSV to get complete long subtitles.
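
The export behaviour described above (one column per field, array/object values serialized as a JSON string, and the 32,767-character XLSX cell cap) can be reproduced on the consumer side. The helper names below are hypothetical, and the sketch only illustrates the idea; it is not the platform's exporter.

```python
import csv
import io
import json

XLSX_CELL_LIMIT = 32_767  # Excel's per-cell character cap mentioned in the note above


def to_csv_row(record: dict) -> dict:
    """Serialize list/dict values to a JSON string so each field stays in one cell."""
    return {
        key: json.dumps(value, ensure_ascii=False) if isinstance(value, (list, dict)) else value
        for key, value in record.items()
    }


def cells_over_xlsx_limit(row: dict) -> list:
    """Return the field names whose text would be truncated in an XLSX cell."""
    return [key for key, value in row.items()
            if isinstance(value, str) and len(value) > XLSX_CELL_LIMIT]


def write_csv(records: list) -> str:
    """Write records to CSV text, using the first record's keys as the columns."""
    if not records:
        return ""
    rows = [to_csv_row(record) for record in records]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Running `cells_over_xlsx_limit` on each row before choosing an export format shows which records need CSV rather than XLSX.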
This is an example of what the results look like.
| Field | Type | Description |
|---|---|---|
| title | string | Video title |
| id | string | YouTube video ID |
| url | string | Canonical watch URL |
| type | string | Item type: video, short, or stream |
| sourceType | string | How the video was discovered: video, search, channel, playlist |
| input | string | The original input (URL or keyword) that produced this video |
| thumbnailUrl | string | Highest-resolution thumbnail URL |
| viewCount | string | View count text |
| date | string | Publish date or relative published time |
| likes | string | Like count text |
| commentsCount | string | Number of comments |
| duration | string | Video duration (H:MM:SS) |
| channelName | string | Channel name |
| channelUrl | string | Channel URL |
| numberOfSubscribers | string | Channel subscriber count text |
| text | string | Video description text |
| descriptionLinks | array | Links extracted from the description |
| subtitles | boolean | Whether subtitles / captions are available |
| subtitleLanguage | string | Language code of the downloaded subtitle track |
| subtitlesText | string | Full subtitle transcript in the chosen subtitleFormat (srt/vtt/plaintext/json) |
| transcript | array | Subtitle segments [{start, dur, text}] |
| comments | array | Top-level comments (only when Include Comments is on) [{cid, replyToCid, type, publishedTimeText, pageUrl, videoId, comment, author, authorIsChannelOwner, voteCount, replyCount, hasCreatorHeart, title, commentsCount}] |
| isMonetized | boolean | Whether the video appears monetized |
| commentsTurnedOff | boolean | Whether comments are disabled |
| liveStatus | string | Live status: is_live / is_upcoming / was_live / not_live |
| availableQualities | array | Available video qualities, e.g. ["2160p","1080p","720p"] (from the player response) |
| location | string | Recording location; null if the video has none |
| error | string | Error message (null on success) |
| error_code | string | Error code (null on success) |
| warning | string | Non-blocking warning message |
| warning_code | string | Non-blocking warning code |
| success | boolean | Whether this record was scraped successfully |
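
The transcript field's `[{start, dur, text}]` segments can be turned into SRT on the client side if a different format is needed after the run. This is a sketch under the assumption that `start` and `dur` are seconds (numbers or numeric strings); the source does not state the unit, and the function names are illustrative.

```python
def _srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def transcript_to_srt(segments: list) -> str:
    """Convert [{start, dur, text}] segments into SRT text (assumes seconds)."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = float(segment["start"])
        end = start + float(segment["dur"])
        blocks.append(
            f"{index}\n{_srt_timestamp(start)} --> {_srt_timestamp(end)}\n{segment['text']}"
        )
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```
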
| Param | Type | Required | Default | Description |
|---|---|---|---|---|
| startURLs | array | No | [] (empty) | Direct video / channel / playlist / search / shorts URLs (search filters do NOT apply here). When set, URLs take priority and search terms are ignored. |
| searchKeywords | array | No | python tutorial | Keywords to search; each resolves to matching results. Ignored when startURLs is provided. |
| maxResults | integer | No | 10 | Max regular videos per search term / channel (0 = skip videos) |
| maxResultsShorts | integer | No | 0 | Max Shorts per search term / channel (0 = skip) |
| maxResultStreams | integer | No | 0 | Max streams per search term / channel (0 = skip) |
| sortBy | string | No | relevance | Search sort: relevance / date / views / rating |
| dateFilter | string | No | any | Upload date: any / hour / today / week / month / year |
| lengthFilter | string | No | any | Duration: any / short (<4m) / medium (4–20m) / long (>20m) |
| videoTypeFilter | string | No | any | Search type chip: any / video / channel / playlist / movie |
| features | array | No | [] | Search features: hd, subtitles, creativeCommons, 3d, live, purchased, 4k, 360, location, hdr, vr180 |
| downloadSubtitles | boolean | No | false | Fetch each video's subtitles / transcript |
| subtitleLanguages | array | No | [] | Preferred subtitle language codes (e.g. en, es) |
| preferAutoGeneratedSubtitles | boolean | No | false | Prefer the auto-generated (ASR) track over manual captions |
| subtitleFormat | string | No | srt | subtitlesText format: srt / vtt / plaintext / json |
| publishedAfter | string | No | "" | Channel only: keep videos published on/after YYYY-MM-DD |
| channelSortBy | string | No | latest | Channel videos order: latest / popular / oldest |
| includeComments | boolean | No | false | Fetch top-level comments with author + interaction details |
| maxComments | integer | No | 20 | Max top-level comments per video |
| maxConcurrency | integer | No | 8 | Videos enriched in parallel (raise if your proxy sustains many concurrent connections; lower on empty-response churn) |
| perVideoTimeoutSecs | integer | No | 30 | Abandon a video if enrichment exceeds this many seconds (0 = no limit); recorded as a 504 |
At least one of startURLs or searchKeywords must be provided. If both are set, URLs take priority and search terms are ignored — which is why the defaults pre-fill searchKeywords and leave startURLs empty.
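
The precedence rule above can be checked before submitting a run. The sketch below assumes the input is a plain dict keyed by the parameter names in the table; `resolve_inputs` is a hypothetical helper, not part of the scraper.

```python
def resolve_inputs(payload: dict) -> tuple:
    """Return ("urls", [...]) or ("search", [...]) following the stated priority."""
    urls = [u.strip() for u in payload.get("startURLs", []) if u and u.strip()]
    keywords = [k.strip() for k in payload.get("searchKeywords", []) if k and k.strip()]

    if not urls and not keywords:
        raise ValueError("Provide at least one of startURLs or searchKeywords")
    if urls:
        # URLs take priority; search terms are ignored when both are set.
        return ("urls", urls)
    return ("search", keywords)
```

For instance, a payload such as `{"searchKeywords": ["python tutorial"], "maxResults": 10}` resolves to search mode, while adding any non-empty `startURLs` entry switches the run to URL mode.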
Requires PROXY_AUTH (username:password) and PROXY_DOMAIN (host:port)
environment variables; the request proxy is built as
socks5://{PROXY_AUTH}@{PROXY_DOMAIN}.
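
Building the proxy URL from those two environment variables might look like the following. The function name and the error handling are assumptions; only the variable names and the `socks5://{PROXY_AUTH}@{PROXY_DOMAIN}` shape come from the text above.

```python
import os


def build_proxy_url(env=None) -> str:
    """Assemble socks5://{PROXY_AUTH}@{PROXY_DOMAIN} from environment variables."""
    env = os.environ if env is None else env
    auth = env.get("PROXY_AUTH", "")      # expected as username:password
    domain = env.get("PROXY_DOMAIN", "")  # expected as host:port
    if not auth or not domain:
        raise RuntimeError("PROXY_AUTH and PROXY_DOMAIN must both be set")
    return f"socks5://{auth}@{domain}"
```

Passing a dict instead of relying on `os.environ` keeps the helper easy to test.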
Locale coherence (anti-detection): the scraper sends a self-consistent locale
(accept-language + PREF timezone + hl/gl) selected by the GEO env var
(US/GB/DE/FR/JP/BR/IN, default US); ACCEPT_LANGUAGE can override the
language. With a rotating multi-country proxy, pin the egress to the SAME country as
GEO (most rotating proxies accept a country-XX token in the proxy username) — a
fixed locale over random-country IPs (timezone/language ≠ IP country) is a strong bot
signal. InnerTube API calls (youtubei/v1/*) use a real fetch/XHR header profile
(accept: */*, origin, x-youtube-client-*, JSON content-type) rather than a page
navigation one, and the request identity (TLS + UA + client-hints) stays consistent
per process. A CONSENT cookie is sent to skip EU consent walls.
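
One way to keep the locale self-consistent is to derive every locale-bearing value from a single table keyed by GEO. The sketch below is illustrative only: the concrete language tags and time zones per country, the fallback to US for unknown codes, and the name `locale_profile` are assumptions, not values taken from the scraper.

```python
# Illustrative values only; the scraper's real per-country settings are not documented here.
LOCALE_PROFILES = {
    "US": {"accept_language": "en-US,en;q=0.9", "hl": "en", "gl": "US", "timezone": "America/New_York"},
    "GB": {"accept_language": "en-GB,en;q=0.9", "hl": "en-GB", "gl": "GB", "timezone": "Europe/London"},
    "DE": {"accept_language": "de-DE,de;q=0.9", "hl": "de", "gl": "DE", "timezone": "Europe/Berlin"},
    "FR": {"accept_language": "fr-FR,fr;q=0.9", "hl": "fr", "gl": "FR", "timezone": "Europe/Paris"},
    "JP": {"accept_language": "ja-JP,ja;q=0.9", "hl": "ja", "gl": "JP", "timezone": "Asia/Tokyo"},
    "BR": {"accept_language": "pt-BR,pt;q=0.9", "hl": "pt", "gl": "BR", "timezone": "America/Sao_Paulo"},
    "IN": {"accept_language": "en-IN,en;q=0.9", "hl": "en", "gl": "IN", "timezone": "Asia/Kolkata"},
}


def locale_profile(env: dict) -> dict:
    """Pick one coherent locale from GEO (default US); ACCEPT_LANGUAGE overrides the language."""
    geo = env.get("GEO", "US").upper()
    # Assumption: unknown country codes fall back to the US profile.
    profile = dict(LOCALE_PROFILES.get(geo, LOCALE_PROFILES["US"]))
    override = env.get("ACCEPT_LANGUAGE")
    if override:
        profile["accept_language"] = override
    return profile
```

Because all values come from one profile, the language header, `hl`/`gl`, and time zone cannot drift apart; the proxy's egress country still has to be pinned separately to the same GEO.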
This scraper reuses the repository's proven InnerTube fetch + video-detail /
comment parsers. The list-resolution paths are new and were validated against
live YouTube data: search → videos, channel → videos, and playlist → videos
all extract correctly. YouTube periodically migrates its videoRenderer /
lockupViewModel structures, so re-check on first run after long gaps.
Per-type caps & filters: maxResults / maxResultsShorts / maxResultStreams
are counted independently and applied per search term and per channel. The search
filters (sort / date / length / type / features) apply to search terms only — they
are encoded into the search sp protobuf — and are ignored for direct URLs.
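
For readers curious what "encoded into the search sp protobuf" means in practice, here is a minimal sketch of a protobuf varint encoder that produces sp values for sort, upload date, and type. The field numbers and enum codes are assumptions inferred from publicly observed sp strings (for example `CAI=` for sort by upload date and `EgIQAQ==` for type = video); they are not taken from the scraper's source and YouTube may change them.

```python
import base64

# Assumed codes, inferred from publicly observed sp values.
SORT_CODES = {"relevance": 0, "rating": 1, "date": 2, "views": 3}
DATE_CODES = {"hour": 1, "today": 2, "week": 3, "month": 4, "year": 5}
TYPE_CODES = {"video": 1, "channel": 2, "playlist": 3, "movie": 4}


def _varint(number: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = number & 0x7F
        number >>= 7
        if number:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def _varint_field(field_number: int, value: int) -> bytes:
    """Field with wire type 0 (varint)."""
    return _varint(field_number << 3) + _varint(value)


def _bytes_field(field_number: int, payload: bytes) -> bytes:
    """Field with wire type 2 (length-delimited), used for the nested filter message."""
    return _varint((field_number << 3) | 2) + _varint(len(payload)) + payload


def encode_sp(sort_by: str = "relevance", date_filter: str = "any", video_type: str = "any") -> str:
    """Build a base64 sp value; returns an empty string when nothing is set."""
    message = b""
    if sort_by != "relevance":
        message += _varint_field(1, SORT_CODES[sort_by])
    filters = b""
    if date_filter != "any":
        filters += _varint_field(1, DATE_CODES[date_filter])
    if video_type != "any":
        filters += _varint_field(2, TYPE_CODES[video_type])
    if filters:
        message += _bytes_field(2, filters)
    return base64.urlsafe_b64encode(message).decode("ascii")
```

The result still needs percent-encoding when placed in a URL, since the `=` padding becomes `%3D` in a query string.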
Channel date range: publishedAfter filters a channel's videos by publish date
using a hybrid strategy — a cheap relative-time early-stop while paginating plus an
exact ISO-date filter from each video's detail. channelSortBy=latest is exact;
popular / oldest are best-effort via the sort chip and fall back to latest when
YouTube does not expose it.
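
The hybrid strategy can be sketched as two checks per listed video: a cheap bound from the relative-time text that decides when to stop paginating, then an exact ISO-date comparison. Everything below is an illustrative assumption: the item shape (`id`, `relative`), the `fetch_iso_date` callback, the newest-first ordering, and the one-day safety margin.

```python
import re
from datetime import date, timedelta

# Shortest plausible length of each unit in days, so the bound errs on the recent side.
_MIN_DAYS = {"second": 0, "minute": 0, "hour": 0, "day": 1, "week": 7, "month": 28, "year": 365}
_RELATIVE = re.compile(r"(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago")


def latest_possible_date(relative_text: str, today: date):
    """Most recent date a video labelled e.g. '3 weeks ago' could have; None if unparseable."""
    match = _RELATIVE.search(relative_text.lower())
    if not match:
        return None
    count, unit = int(match.group(1)), match.group(2)
    return today - timedelta(days=count * _MIN_DAYS[unit]) + timedelta(days=1)  # safety margin


def filter_channel_videos(listing, fetch_iso_date, published_after: date, today: date) -> list:
    """Keep ids published on/after the cutoff; assumes the listing is newest first."""
    kept = []
    for item in listing:
        bound = latest_possible_date(item["relative"], today)
        if bound is not None and bound < published_after:
            break  # early stop: this and all later items are certainly too old
        exact = date.fromisoformat(fetch_iso_date(item["id"]))
        if exact >= published_after:
            kept.append(item["id"])
    return kept
```

The early stop only fires when even the most generous reading of the relative time predates the cutoff, so borderline videos still go through the exact check.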
Not supported: Apify's "Save subtitles to key-value store" — this platform has no
key-value store, so subtitles are returned only as the subtitlesText / transcript
fields.
Run model: runs as a single CoreClaw task (no per-URL subtask split), matching Apify — all search terms and all direct URLs are processed once in one run, with concurrency handled internally (the input schema sets no b).
Performance: inputs (search terms / channels / URLs) are resolved in parallel and
videos are enriched on a pipeline — enrichment starts as soon as the first targets are
found, overlapping ongoing resolution. Enrichment runs maxConcurrency videos at once
(default 8). Each video fans out several sub-requests (watch page, subtitles, comments),
so a very high concurrency can saturate a single proxy endpoint and trigger empty
responses regardless of how large the rotating IP pool is — raise maxConcurrency only
if your proxy sustains many concurrent connections, and lower it if you see empty-response
/ 404 churn. perVideoTimeoutSecs (default 30) bounds only the optional add-ons
(subtitles / comments): if they exceed the budget they are left empty and the video
record is still emitted with its full detail — a stuck add-on never discards a video.
Subtitles and comments for a video run concurrently and both reuse the watch page already
fetched for its detail (no duplicate page fetches).
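
The concurrency and timeout behaviour described above can be sketched with `asyncio`. This is not the scraper's implementation: the fetcher callbacks are hypothetical, the sketch covers only bounded parallelism and the add-on budget (not the overlap with input resolution), and on timeout it drops both add-ons together for simplicity.

```python
import asyncio


async def enrich_video(video_id, fetch_detail, fetch_subtitles, fetch_comments, timeout_secs):
    """Fetch detail first, then run both add-ons concurrently under one time budget."""
    detail = await fetch_detail(video_id)
    record = {"id": video_id, **detail, "subtitlesText": "", "comments": []}
    # Both add-ons receive the already-fetched detail, so neither refetches the watch page.
    addons = asyncio.gather(fetch_subtitles(video_id, detail), fetch_comments(video_id, detail))
    try:
        # 0 means no limit, which maps to timeout=None.
        subtitles, comments = await asyncio.wait_for(addons, timeout=timeout_secs or None)
        record["subtitlesText"], record["comments"] = subtitles, comments
    except asyncio.TimeoutError:
        pass  # add-ons stay empty; the record is still emitted
    return record


async def enrich_all(video_ids, max_concurrency=8, **fetchers):
    """Enrich videos with at most max_concurrency in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(video_id):
        async with semaphore:
            return await enrich_video(video_id, **fetchers)

    return await asyncio.gather(*(bounded(video_id) for video_id in video_ids))
```

The semaphore caps how many videos fan out sub-requests at the same time, which is the knob to lower when a single proxy endpoint starts returning empty responses.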
Subtitles / transcript: caption availability and language (subtitles,
subtitleLanguage) are detected reliably. For the transcript body the scraper
uses a yt-dlp-style cascade:
(1) The watch page's /api/timedtext track (json3 → xml). YouTube increasingly gates
this behind a PO / BotGuard token and returns an empty body.
(2) On a miss, it re-fetches the player response via several mobile / embedded / TV
InnerTube clients (ANDROID_VR → TVHTML5 → IOS → MWEB → WEB_EMBEDDED_PLAYER →
ANDROID), whose caption baseUrls are usually not PO-gated. Each client is gated
independently, so trying more raises the odds of hitting an un-gated track; the first one
that yields text wins. This is how yt-dlp recovers captions without a browser, and it
brings back the transcript for most gated videos.
(3) Finally, the InnerTube get_transcript endpoint.
Every step is plain InnerTube over curl_cffi: no browser, no JS runtime, no PO-token
provider. If all steps miss, subtitlesText / transcript come back empty and the scraper
degrades gracefully. The few videos gated across all clients would still need a PO-token
provider or a browser engine (e.g. Camoufox), which remains out of scope for this version.
Client constants mirror yt-dlp master and may need refreshing if YouTube rotates client
versions.
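
The cascade's control flow, try each source in order and return the first non-empty text, can be sketched independently of any network code. The three fetcher callbacks are hypothetical stand-ins for the timedtext, player-client, and get_transcript requests; only the client order is taken from the description above.

```python
PLAYER_CLIENTS = ["ANDROID_VR", "TVHTML5", "IOS", "MWEB", "WEB_EMBEDDED_PLAYER", "ANDROID"]


def build_cascade(fetch_timedtext, fetch_via_client, fetch_get_transcript) -> list:
    """Order the steps: timedtext, then each player client, then get_transcript."""
    steps = [("timedtext", fetch_timedtext)]
    for client in PLAYER_CLIENTS:
        # Bind client as a default argument so each lambda keeps its own value.
        steps.append((f"player:{client}", lambda video_id, c=client: fetch_via_client(video_id, c)))
    steps.append(("get_transcript", fetch_get_transcript))
    return steps


def first_transcript(video_id: str, steps: list) -> tuple:
    """Return (step_name, text) for the first step that yields text, else (None, "")."""
    for name, fetch in steps:
        try:
            text = fetch(video_id)
        except Exception:
            continue  # a failing step counts as a miss; move on to the next one
        if text and text.strip():
            return name, text
    return None, ""
```

Returning the step name alongside the text makes it easy to log which source recovered the transcript, and the empty result matches the graceful degradation described above.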