https://www.listennotes.com/podcast-datasets/keyword/#terms
Batch export all podcasts or episodes by full-text keyword search, e.g., people, brands, topics...
== Quick facts ==
The most up-to-date and comprehensive podcast database available
All languages & All countries
Includes over 3,500,000 podcasts
Features 35+ data fields, such as basic metadata, global rank, RSS feed (with audio URLs), Spotify links, and more
Delivered in SQLite format
Learn how we build a high quality podcast database: https://www.listennotes.help/article/105-high-quality-podcast-database-from-listen-notes
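Because the database is delivered in SQLite format, its tables and columns can be listed before any queries are written. The following is a minimal sketch using Python's standard sqlite3 module. The `podcasts` table and its columns are hypothetical stand-ins created in memory for illustration; they are not the actual Listen Notes schema.

```python
import sqlite3


def describe_database(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name in the database to its list of column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # Quote the table name as an identifier before using it in the PRAGMA.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        schema[table] = [
            row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")
        ]
    return schema


# Stand-in for the delivered file: an in-memory database with a hypothetical table.
# With a real file, this would be sqlite3.connect("path/to/file.sqlite") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE podcasts (id TEXT PRIMARY KEY, title TEXT, rss TEXT)")

schema = describe_database(conn)
print(schema)
```

Listing the schema first avoids guessing at field names; the full list of real data attributes is published on the Listen Notes fields page.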
== Use Cases ==
AI training, including speech recognition, generative AI, voice cloning / synthesis, and news analysis
Alternative data for investment research, such as sentiment analysis of executive interviews, market research and tracking investment themes
PR and marketing, including social monitoring, content research, outreach, and guest booking
...
== Data Attributes ==
See the full list of data attributes on this page: https://www.listennotes.com/podcast-datasets/fields/?filter=podcast_only
How to access podcast audio files: Our dataset includes RSS feed URLs for all podcasts. You can retrieve audio for over 170 million episodes directly from these feeds. With access to the raw audio, you’ll have high-quality podcast speech data ideal for AI training and related applications.
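As a rough illustration of retrieving audio from a feed: in RSS 2.0, each episode is an `item` element, and its audio file is normally referenced by the `url` attribute of an `enclosure` element. The sketch below parses an inline sample feed with Python's standard library. The feed content and the example.com URLs are made up for illustration, and a real pipeline would first download each feed from the RSS URL in the dataset.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; real feeds come from the RSS URLs in the dataset.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/audio/ep1.mp3" length="123456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/audio/ep2.mp3" length="654321" type="audio/mpeg"/>
    </item>
    <item>
      <title>Announcement without audio</title>
    </item>
  </channel>
</rss>"""


def audio_urls(rss_xml: str) -> list[str]:
    """Return the enclosure URL of every item that has one, in feed order."""
    root = ET.fromstring(rss_xml)
    urls = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        # Some items (e.g. text-only announcements) carry no enclosure.
        if enclosure is not None and enclosure.get("url"):
            urls.append(enclosure.get("url"))
    return urls


urls = audio_urls(SAMPLE_FEED)
print(urls)
```

Items without an enclosure are skipped rather than treated as errors, since feeds in the wild are not always well formed.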
== Custom Offers ==
We can provide custom datasets based on your needs, such as language-specific data, daily/weekly/monthly update frequency, or one-time purchases.
We also provide a RESTful API at PodcastAPI.com
Contact us: hello@listennotes.com
== Need Help? ==
If you have any questions about our products, feel free to reach out at hello@listennotes.com
== About Listen Notes, Inc. ==
Since 2017, Listen Notes, Inc. has provided the leading podcast search engine and podcast database.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: due to Zenodo limitations, only the metadata is hosted here. The whole dataset can be found at: https://drive.google.com/drive/u/0/folders/1tpg9WXkl4L0zU84AwLQjrFqnP-jw1t7z
We introduce PodcastMix, a dataset formalizing the task of separating background music and foreground speech in podcasts. It contains audio files at 44.1kHz and the corresponding metadata. For further details check the following paper and the associated GitHub repository:
This dataset contains four parts. Due to Zenodo's file size limit, we host the training dataset on Google Drive. The contents of the Zenodo archives are shown in brackets:
The training dataset, PodcastMix-synth, may be found in our Google Drive repository: https://drive.google.com/drive/folders/1tpg9WXkl4L0zU84AwLQjrFqnP-jw1t7z?usp=sharing . The archive comprises 450 GB of audio and metadata with the following structure:
Make sure you maintain the folder structure of the original dataset when you uncompress these files.
This dataset is created by Nicolas Schmidt, Marius Miron, Music Technology Group - Universitat Pompeu Fabra (Barcelona) and Jordi Pons. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 Unported License (CC BY-SA 4.0).
Please acknowledge PodcastMix in academic research. When the present dataset is used for academic research, we would highly appreciate it if authors cite the following publications:
The dataset and its contents are made available on an “as is” basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, the UPF is not liable for, and expressly excludes, all liability for loss or damage however and whenever caused to anyone by any use of the dataset or any part of it.
PURPOSES. The data is processed for the general purpose of carrying out research development and innovation studies, works or projects. In particular, but without limitation, the data is processed for the purpose of communicating with Licensee regarding any administrative and legal / judicial purposes.
The number of podcast consumers in the United States has been growing steadily. According to estimates, around *** million people consumed podcasts of any format. This marks an increase of around ** million Americans. For the first time, these estimates included both audio and video podcasts, compared to previous years, when the data only covered audio consumption.
https://choosealicense.com/licenses/cc/
Some Podcasts
Podcasts are taken from the PodcastFillers dataset. The PodcastFillers dataset consists of 199 full-length podcast episodes in English with manually annotated filler words and automatically generated transcripts. The podcast audio recordings, sourced from SoundCloud, are CC-licensed, gender-balanced, and total 145 hours of audio from over 350 speakers.
[!TIP] This dataset doesn't upload the PodcastFillers annotations, which are under a non-commercial license. See here… See the full description on the dataset page: https://huggingface.co/datasets/ylacombe/podcast_fillers_by_license.
The number of listeners in the 'Free (Ad-Supported) Podcast Listeners' segment of the media market in the United States was forecast to increase continuously between 2025 and 2030 by a total of **** million users (+***** percent). After the thirteenth consecutive year of growth, the number of listeners is estimated to reach ****** million users, a new peak, in 2030.
== Quick facts ==
The most up-to-date and comprehensive podcast database available
Includes over 3,500,000 podcasts and over 176 million episodes (including direct playable audio URLs)
Features 35+ data fields, such as basic metadata, global rank, RSS feed (with audio URLs), Spotify links, and more
Delivered in SQLite format
== Use Cases ==
AI training, including speech recognition, generative AI, voice cloning / synthesis, and news analysis
Alternative data for investment research, such as sentiment analysis of executive interviews, market research and tracking investment themes
PR and marketing, including social monitoring, content research, outreach, and guest booking
...
== Custom Offers ==
We can provide custom datasets based on your needs, such as language-specific data, daily/weekly/monthly update frequency, or one-time purchases.
We also provide a RESTful API at PodcastAPI.com
Contact us: hello@listennotes.com
== Need Help? ==
If you have any questions about our products, feel free to reach out at hello@listennotes.com
== About Listen Notes, Inc. ==
Since 2017, Listen Notes, Inc. has provided the leading podcast search engine and podcast database.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset of 1024 freely accessible podcast episodes. A link to each episode's audio file is provided.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We release a new dataset consisting of podcast metadata (title and description) for 29 539 shows. This dataset can be used to reproduce the experiments from the article Topic Modeling on Podcast Short-Text Metadata accepted at the ECIR 2022 conference.
More information about this data and how it should be used in experiments can be found in our paper and GitHub repository.
Please cite our paper if you use the code or data.
== Quick starts ==
Batch export podcast metadata to CSV files:
1) Export by search keyword: https://www.listennotes.com/podcast-datasets/keyword/
2) Export by category: https://www.listennotes.com/podcast-datasets/category/
== Quick facts ==
The most up-to-date and comprehensive podcast database available
All languages & All countries
Includes over 3,500,000 podcasts
Features 35+ data fields, such as basic metadata, global rank, RSS feed (with audio URLs), Spotify links, and more
Delivered in CSV format
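Since this export is delivered as CSV, rows can be filtered with nothing more than Python's standard csv module. The sketch below is illustrative only: the column names (`title`, `language`, `rss`) and the sample rows are assumptions made up for the example, not the actual export layout.

```python
import csv
import io

# Made-up sample rows; the column names are hypothetical.
SAMPLE_CSV = """title,language,rss
Example Show,English,https://example.com/feed1.xml
Ejemplo,Spanish,https://example.com/feed2.xml
"""


def rows_matching(csv_text: str, column: str, value: str) -> list[dict[str, str]]:
    """Return every row whose given column equals the given value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row.get(column) == value]


english = rows_matching(SAMPLE_CSV, "language", "English")
print([row["title"] for row in english])
```

For a real export, the same function could be fed the file contents read from disk, with the column name taken from the published list of data attributes.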
== Data Attributes ==
See the full list of data attributes on this page: https://www.listennotes.com/podcast-datasets/fields/?filter=podcast_only
How to access podcast audio files: Our dataset includes RSS feed URLs for all podcasts. You can retrieve audio for over 170 million episodes directly from these feeds. With access to the raw audio, you’ll have high-quality podcast speech data ideal for AI training and related applications.
== Custom Offers ==
We can provide custom datasets based on your needs, such as language-specific data, daily/weekly/monthly update frequency, or one-time purchases.
We also provide a RESTful API at PodcastAPI.com
Contact us: hello@listennotes.com
== Need Help? ==
If you have any questions about our products, feel free to reach out at hello@listennotes.com
== About Listen Notes, Inc. ==
Since 2017, Listen Notes, Inc. has provided the leading podcast search engine and podcast database.
https://www.listennotes.com/podcast-datasets/playlist/#terms
Batch export all podcasts or episodes in a specific playlist.
The Spotify Podcast Dataset consists of 105,360 episodes with transcripts and creator descriptions, and is provided as a training dataset for the summarization task.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A collection of podcast episodes hosted by Shannon Jamail and the Retreat Ranch team. Each episode explores retreat leadership, business strategy, and personal development for retreat hosts and venue owners. Learn actionable strategies for building, growing, and thriving in the retreat world.
https://www.sci-tech-today.com/privacy-policy
Podcast Statistics: Podcasts have become a significant medium for information and entertainment, revolutionizing content consumption globally. In 2024, the global podcast market reached USD 30.03 billion.
A podcast is an audio program available on the internet, typically released in episodes, allowing users to download and listen on demand, enhancing accessibility. In the United States, 67% of individuals aged 12 and older have listened to a podcast, with 47% engaging monthly and 34% weekly. Globally, podcast listenership is projected to reach 504.9 million by 2024, indicating a steady upward trend.
Demographically, 59% of U.S. individuals aged 12-34 are monthly podcast listeners, followed by 55% aged 35-54, and 27% aged 55 and above. The medium's appeal is further evidenced by 46% of weekly podcast listeners reporting product or service purchases based on podcast advertisements. Additionally, 23% of weekly listeners spend 10 hours or more engaging with podcasts each week. These statistics underscore podcasts' expansive reach and influential role in modern media consumption.
https://www.technavio.com/content/privacy-notice
Podcast Market Size 2025-2029
The podcast market size is forecast to increase by USD 33.44 billion at a CAGR of 39.9% between 2024 and 2029.
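The two figures in this forecast imply a starting market size, which can be backed out with a little arithmetic. The sketch below assumes that the USD 33.44 billion increase is measured from a 2024 base and that the 39.9% CAGR compounds over five annual periods (2024 to 2029); the report excerpt does not state either point explicitly, so the result is only an implied estimate.

```python
def implied_base(increase: float, cagr: float, years: int) -> float:
    """Solve increase = base * ((1 + cagr) ** years - 1) for base."""
    growth_factor = (1 + cagr) ** years
    return increase / (growth_factor - 1)


INCREASE_USD_BN = 33.44  # forecast increase, from the report summary
CAGR = 0.399             # 39.9% compound annual growth rate
YEARS = 5                # assumption: five compounding periods, 2024 to 2029

base_2024 = implied_base(INCREASE_USD_BN, CAGR, YEARS)
size_2029 = base_2024 + INCREASE_USD_BN

print(f"Implied 2024 base: USD {base_2024:.2f} billion")
print(f"Implied 2029 size: USD {size_2029:.2f} billion")
```

Under these assumptions the implied 2024 base is about USD 7.7 billion and the implied 2029 size is about USD 41.1 billion. Other sources may define the market differently and report very different totals.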
The market is experiencing significant growth, fueled by the increasing proliferation of podcast platforms and the rising use of data analytics for targeted content and advertising. This dynamic market is characterized by intense competition among podcast service providers, as they strive to cater to the diverse and inconsistent user preferences. Advertisements have emerged as a significant revenue stream, with advanced technologies like artificial intelligence (AI) and blockchain technologies enabling targeted advertising and transcription technology enhancing accessibility. The availability of a wide range of podcast genres and topics, coupled with advanced analytics capabilities, enables providers to deliver personalized content and targeted advertising, enhancing user experience and engagement.
However, this competitive landscape poses challenges for companies seeking to differentiate themselves and maintain a loyal user base. Effective strategies for content creation, user engagement, and data-driven marketing will be essential for companies looking to capitalize on the opportunities presented by this evolving market. With the proliferation of subscription-based services and playback devices, such as media players, computers, iPods, and mobile phones, podcast listeners have unprecedented access to a wide range of content.
What will be the Size of the Podcast Market during the forecast period?
Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
In the dynamic market, reach and impact continue to soar, with podcast partnerships and sponsorship opportunities driving growth. Podcast advertising networks optimize ad formats, including pre-roll, mid-roll, and post-roll, to maximize ROI for advertisers. Audio mastering ensures high-quality sound, while podcast content repurposing and syndication expand reach across various platforms. Podcast downloads and plays remain key engagement metrics, with podcast charts and ratings shaping listener preferences.
Podcast collaboration, sound design, and cross-promotion foster community and innovation. Podcast apps provide convenience, while podcast video content adds visual appeal. Podcast ad rates vary based on audience size and demographics, making it essential for businesses to assess potential returns before investing. Media players, computers, iPods, mobile phones, and smartphones enable easy access to audio learning resources and podcast directories.
How is this Podcast Industry segmented?
The podcast industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Type
  Interviews
  Conversational
  Solo
  Panels
  Repurposed content
Genre
  News and politics
  Society and culture
  Comedy
  Sports
  Others
Platform
  Streaming services
  Dedicated podcast apps
  Web-based platforms
  Smart speaker integration
Geography
  North America
    US
    Canada
    Mexico
  Europe
    Italy
    Spain
    Sweden
    UK
  APAC
    China
    India
    Japan
  Rest of World (ROW)
By Type Insights
The interviews segment is estimated to witness significant growth during the forecast period. The market is experiencing significant growth and innovation, with interview-led shows playing a pivotal role in audience engagement. Podcasts cater to diverse consumer interests, from business and technology to entertainment and wellness. Interviews feature dynamic conversations between hosts and a range of guests, including experts, industry leaders, celebrities, and everyday individuals with captivating experiences. This format fosters authenticity and connection, enabling listeners to explore various narratives and gain valuable expertise. Podcast production software, technology, and editing tools facilitate seamless content creation and distribution. Consumption habits continue to evolve, with listeners embracing shorter and longer episodes.
Content marketing, sponsorships, and transcription services enhance podcast monetization strategies. Ethical considerations, branding, and community building are essential components of podcast business models. Hosting platforms, licensing, and distribution networks ensure widespread access to diverse podcast content. Innovation in podcast technology, transcription, and monetization continues to shape the industry landscape. Furthermore, blockchain technologies have also entered the podcasting space, providing secure and decentralized distribution channels for podcasts.
The Interviews segment w
As of July 2025, the most popular podcast genre in the United States was comedy, with *** percent of survey respondents stating that they listened to podcasts designed to make them laugh. Podcasts on news and politics as well as sports were also popular choices.
Podcasts are becoming increasingly popular
Podcasts have become a go-to form of audio entertainment, with digital episodes on different topics being either streamable or downloadable for easily accessible consumption. The number of podcast listeners in the United States is estimated to surpass 100 million in 2028.
Podcast market leaders
Within the constantly growing podcast market, the globally leading podcast publisher in 2024 was iHeartRadio, with nearly *** million unique streams, downloads, and views. In 2022, a study found that YouTube, Apple Podcasts, and Spotify were the most popular platforms for accessing podcasts among Americans.
Upgrade AI's English with 28,399 hours of real podcast data on diverse topics, ideal for enhancing learning and conversation skills.
Dataset Card for "lexFridmanPodcast-transcript-audio"
Dataset Summary
This dataset was created by applying Whisper to the videos of the YouTube channel Lex Fridman Podcast. The transcripts were generated with a medium-size Whisper model.
Languages
Language: English
Dataset Structure
The dataset contains all the transcripts plus the audio of the different videos of Lex Fridman Podcast.
Data Fields
The dataset is composed of:
id: Id of the youtube… See the full description on the dataset page: https://huggingface.co/datasets/Whispering-GPT/lex-fridman-podcast.
Listen Notes Podcast API is the longest-running and most widely used Podcast API, trusted by over 10,000 developers and companies since 2017.
=> Get started at PodcastAPI.com
🛠️ Rich Endpoints & Metadata
25 versatile endpoints covering every common podcast use case
Detailed response schemas and examples; explore the full reference at docs.PodcastAPI.com
🚀 Why Choose Listen Notes Podcast API
1) Premium Data Quality
Aggregated from multiple sources and refreshed 24/7
AI-powered and manual cleansing of spammy content, malformed RSS feeds, broken audio links, and more
2) Speed, Reliability & Scalability
Fully managed backend infrastructure, with no ops overhead
99.999% uptime
Real-time system status at listennotesstatus.com
3) Cost & Time Savings
Skip hundreds of engineering hours building your own database
Avoid ongoing maintenance costs and focus on your product, not the plumbing
4) White-Glove Support
PRO & ENTERPRISE subscribers receive direct, rapid assistance from our very technical founder & CEO
Expert guidance from the team that built and maintains this API
5) Proven in Production
Powering podcast players, music apps, smart speakers, public transit entertainment systems, PR agencies, marketing platforms, EdTech products, and more
Trusted by 10,000+ companies & developers worldwide
6) Committed for the Long Haul
Operational since 2017 and here to stay
Continuous investment in new features, performance enhancements, and data quality
=> Visit PodcastAPI.com to sign up and start building today!
A comprehensive, community-powered dataset of podcasts, episodes, creators, and guests, providing detailed information, statistics, and relationship mapping within the podcasting industry. Considered the IMDb for podcasts.