Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 39 series, with data for years 1998 - 2004 (not all combinations necessarily have data for all years), and is no longer being released. This table contains data described by the following dimensions (Not all combinations are available): Geography (13 items: Canada;Newfoundland and Labrador;Prince Edward Island;Nova Scotia; ...), Age group (3 items: Total population;Children 2 to 11 years;Teens 12 to 17 years)
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 156 series, with data for years 1998 - 2004 (not all combinations necessarily have data for all years), and is no longer being released. This table contains data described by the following dimensions (Not all combinations are available): Geography (13 items: Canada;Newfoundland and Labrador;Prince Edward Island;Nova Scotia; ...), Sex (2 items: Males;Females), Age group (6 items: 18 years and over;18 to 24 years;25 to 34 years;35 to 49 years; ...).
According to the most recent data, U.S. viewers aged 15 years and older spent on average almost ***** hours watching TV per day in 2023. Adults aged 65 and above spent the most time watching television at over **** hours, whilst 15 to 19-year-olds watched TV for less than *** hours each day.
The dynamic TV landscape
The way people consume video entertainment has changed significantly in the past decade, with a forecast suggesting that time spent watching traditional TV in the U.S. will probably decline in the years ahead, while digital video will gain in popularity. Younger age groups in particular tend to cut the cord and subscribe to video streaming services such as Netflix, Hulu, and Amazon Prime Video.
TV advertising in a transition period
Similarly, the TV advertising market has shifted away from traditional linear TV towards online media. While ad spending on traditional TV in the U.S. generally increased until the end of the 2010s, it is projected to decline to below ** billion U.S. dollars in the next few years. By contrast, investments in connected TV advertising are expected to grow steadily, although they are projected to amount to just over half of traditional TV ad spend by 2025.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 1044 series, with data for years 1990 - 1998 (not all combinations necessarily have data for all years), and was last released on 2007-01-29. This table contains data described by the following dimensions (Not all combinations are available): Geography (29 items: Austria; Belgium (Flemish speaking); Belgium; Belgium (French speaking) ...), Sex (2 items: Males; Females ...), Age group (3 items: 11 years;15 years;13 years ...), Time spent (6 items: Not at all; Less than 1/2 hour;2 to 3 hours;1/2 hour to 1 hour ...).
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a comprehensive overview of the top 250 television shows listed on IMDB. It offers insights into various aspects of these shows, including their titles, the years they aired, the total number of episodes in each series, the age rating assigned to each show, the average user rating on IMDB, the number of votes each show has received, and the category of the show (either a TV Series or a TV Mini-Series).
The dataset is particularly useful for understanding audience preferences and trends in the television industry. For instance, the ratings and vote counts can reveal which shows are most popular among viewers, while the distribution of categories can shed light on the relative popularity of different types of television shows. Additionally, the year of release can be used to analyze trends in television production over time.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Avg Viewing Time: TV: 4 Years & Older data was reported at 3.270 Hour/Day on 31 Dec 2023. This records an increase from the previous figure of 3.160 Hour/Day for 24 Dec 2023. Avg Viewing Time: TV: 4 Years & Older data is updated weekly, averaging 3.280 Hour/Day (median) from Mar 2020 to 31 Dec 2023, with 195 observations. The data reached an all-time high of 4.490 Hour/Day on 29 Mar 2020 and a record low of 2.530 Hour/Day on 07 Aug 2022. Avg Viewing Time: TV: 4 Years & Older data remains in active status in CEIC and is reported by Médiamétrie. The data is categorized under Global Database’s France – Table FR.TB001: TV Audience: Average Viewing Time. [COVID-19-IMPACT]
This dataset was developed as part of a Dutch HOSAN research program exploring the feasibility of utilizing heritage datasets from the Netherlands to create speech models that represent all Dutch voices.
The dataset contains a large quantity of Dutch audio data from Dutch television broadcasts in the period 1972-2022, stored at the Netherlands Institute for Sound & Vision. The audio files add up to a total of 81k hours of audio, with most audio files having a length of 30 minutes to 1 hour.
An initial selection was made of material from the period 1972-2022 that met the following criteria:
This initial selection contained approximately 184k hours of TV and 128k hours of radio. For training speech models, only the TV data was selected. The set was further reduced by selecting specific genres (see genres.txt file), and by removing audio with a length longer than three hours. Only a single broadcast per day of any given series (e.g. one single edition of the Dutch public broadcaster's news programme per day) was selected, as it was a requirement for training the speech models that the set contained as little duplication of audio fragments as possible.
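The reduction steps described above can be sketched as a simple filter. This is only an illustration under stated assumptions: the record layout, the genre names, and the rule of keeping the first broadcast encountered for a given series and day are all hypothetical, since the source does not say which broadcast of the day was retained.

```python
from datetime import date

# Hypothetical records: (series, air_date, duration_hours, genre).
broadcasts = [
    ("News", date(2001, 5, 1), 0.5, "news"),
    ("News", date(2001, 5, 1), 0.5, "news"),        # second edition on the same day
    ("News", date(2001, 5, 2), 0.5, "news"),
    ("Marathon", date(2001, 5, 1), 4.0, "sport"),   # longer than three hours
    ("Quiz", date(2001, 5, 1), 1.0, "game show"),   # genre not selected
]


def select_broadcasts(records, allowed_genres, max_hours=3.0):
    """Keep one broadcast per (series, day), restricted to the allowed
    genres and to items no longer than max_hours."""
    seen = set()
    selected = []
    for series, air_date, hours, genre in records:
        if genre not in allowed_genres or hours > max_hours:
            continue
        key = (series, air_date)
        if key in seen:
            continue  # assumption: the first broadcast of the day is kept
        seen.add(key)
        selected.append((series, air_date, hours, genre))
    return selected


selected = select_broadcasts(broadcasts, allowed_genres={"news", "sport"})
```

With the sample records, only the two News broadcasts on different days survive the filter.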
Low-resolution versions of the MXF carriers were downloaded, the audio (in AAC format) extracted and this dataset delivered to the researchers under secure conditions with strict non-disclosure agreements in place regarding both the data and the resulting models.
Initial use of the data revealed that eighty-eight audio files contained a virtually flat audio signal. Investigation of a sample at Sound & Vision revealed that these came from videos for which the original analogue carriers contained no audio signal. The carrier IDs of these files are contained in the file 'no_audio.txt'.
This published version of the dataset contains the following files:
The audio files themselves are under copyright. The published dataset serves as a reference standard for detailing any research conducted using it.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a list of over 6,000 top-rated titles on IMDb, including both movies and TV series, with a minimum average user rating of 7 and over 10,000 votes.
This dataset is updated daily at 10:00 AM CET. If you find this dataset helpful, feel free to give it an upvote! 😊
You can find the IMDb (Unofficial) API at this link: IMDb API on RapidAPI. This API offers access to the entire IMDb database, including detailed ratings, episode information, cast details, and much more.
Hi, this is my first dataset. Hope you have fun analyzing it!
1) first_air_date - The date when the show was first aired on television
2) origin_country - The country where the show was created / originates from
3) original_language - The original language of the show
4) name - Name of the show in English. Note that names in original language are not included in this dataset.
5) popularity - A metric that measures how popular a TV show is based on consumer views
6) vote_average - Average of the votes (ratings) the show received
7) vote_count - The number of votes the show received
8) overview - A brief description of the show
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Television Brands Ecommerce Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/devsubhash/television-brands-ecommerce-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains 912 samples with 7 attributes. There are some missing values in this dataset.
Here are the columns in this dataset:
1. Brand: This indicates the manufacturer of the product (i.e. the television).
2. Resolution: This has multiple categories and indicates the type of display, e.g. LED, HD LED, etc.
3. Size: This indicates the screen size in inches.
4. Selling Price: This column has the selling price or the discounted price of the product.
5. Original Price: This includes the original price of the product from the manufacturer.
6. Operating system: This categorical variable shows the type of OS, such as Android, Linux, etc.
7. Rating: Average customer rating on a scale of 5.
Inspiration: This dataset could be used to explore the current market scenario for Televisions. There are various types of screens with different operating systems offered by several manufacturers at competitive prices. Some questions this dataset could be used to answer are -
--- Original source retains full ownership of the source dataset ---
Cable TV news is a data set of nearly 24/7 video, audio, and text captions from three U.S. cable TV networks (CNN, FOX, and MSNBC) from January 2010 to July 2019. Using machine learning tools, the authors detect faces in 244,038 hours of video, label each face's presented gender, identify prominent public figures, and align text captions to audio.
Open Broadcast Media Audio from TV (OpenBMAT) is an open, annotated dataset for the task of music detection that contains over 27 hours of TV broadcast audio from 4 countries, distributed over 1647 one-minute-long excerpts. It is designed to encompass several essential features for any music detection dataset and is the first one to include annotations about the loudness of music in relation to other simultaneous non-music sounds. OpenBMAT has been cross-annotated by 3 annotators, obtaining high inter-annotator agreement percentages, which validates the annotation methodology and ensures the reliability of the annotations.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Introduction: Screentime is ubiquitous among children, and parents are concerned and anxious about its effect on their children's well-being. This project uses the 2020 data from the National Survey of Children’s Health (NSCH) to determine if there is a correlation between the amount of weekday screentime in children ages 17 and younger and reported instances of mental health treatment and mental health treatment needed.
Objectives: The primary objective of this project is to determine if there is a correlation between screentime and the mental health of children, ages 17 and younger.
Methods: This project utilizes 2020 data from the NSCH, specifically the survey information collected about children ages 17 and younger on screentime, mental health professional treatment, and age of the child. Screentime refers to weekday time spent in front of a TV, computer, cellphone, or other electronic device watching programs, playing games, accessing the internet or using social media. After analyzing the three aforementioned variables, the percentage of mental health treatment occurrences by age group per screentime category indicates whether there is a correlation between children’s screentime and their mental health.
Results: Preschool-aged (0-5 years old) children who spent 2 hours per weekday in front of a screen had the highest occurrence of mental health treatment, double that of the other screentime categories. In school-aged (6-13 years old) children, there is a rise in mental health treatment needed as screentime increases. In adolescent (14-17 years old) children, there is a significant increase in the occurrence of mental health treatment as screentime increases: 60% of adolescents who require mental health treatment spent four or more hours in front of a screen.
Conclusions: There is a correlation between increased screentime and the occurrence of mental health treatment in children, particularly in the adolescent (14-17 years old) age group.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ITTV is a publicly available dataset of Italian TV programs introduced in
Alessandro Ilic Mezza, Paolo Sani, and Augusto Sarti, "Automatic TV Genre Classification Based on Visually-Conditioned Deep Audio Features," in 2023 31st European Signal Processing Conference (EUSIPCO), 2023.
ITTV consists of 2625 manually annotated YouTube videos, totaling over 670 hours. Each clip is assigned one of seven classes:
Cartoons
Commercials
Football
Music
News
Talk Shows
Weather Forecast
ITTV genre taxonomy is similar to that of the well-known RAI dataset described in
Maurizio Montagnuolo and Alberto Messina, "Parallel neural networks for multimodal video genre classification," Multimedia Tools and Applications, vol. 41, no. 1, pp. 125–159, 2009.
The dataset contains genre annotations and metadata in CSV format. Please note that audio data is not provided.
We provide the annotations for a balanced training (1575 clips) and validation (525 clips) split, as well as for a disjoint test set containing 525 installments from TV programs not included in the development set.
As YouTube continuously updates, some videos may not be available in the future. Although we intend to keep ITTV updated as best as possible, please note that some content may not be available at any given time.
Some YouTube videos (especially from the Football class and, to a lesser extent, the Cartoons class) may only be available in some countries due to regional restrictions imposed by the content creator. All videos are known to be accessible from Italy (last accessed on Nov. 25th, 2022.)
Please contact Alessandro Ilic Mezza for further questions (e-mail: alessandroilic.mezza@polimi.it).
Overview
The Broadcast Audio Fingerprinting (BAF) dataset is an open, annotated dataset, available upon request, for the task of music monitoring in broadcast. It contains 2,000 tracks from Epidemic Sound's private catalogue as reference tracks, representing 74 hours of audio. As queries, it contains over 57 hours of TV broadcast audio from 23 countries and 203 channels, distributed over 3,425 one-minute audio excerpts.
It has been annotated by six annotators in total and each query has been cross-annotated by three of them obtaining high inter-annotator agreement percentages, which validates the annotation methodology and ensures the reliability of the annotations.
Purpose of the dataset
This dataset aims to become the standard dataset to evaluate Audio Fingerprinting algorithms since it’s built on real data, without the use of any data-augmentation techniques. It is also the first dataset to address background music fingerprinting, which is a real problem in royalties distribution.
Dataset use
This dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis.
About the data
All audio files are monophonic, 8 kHz, 128 kb/s, pcm_s16le-encoded .wav files. Annotations mark which tracks (if any) sound in each query, either in the foreground or in the background, and the specific times at which each track starts and stops sounding in the query.
Note that there are 88 queries that do not have any matches.
For more information check the dedicated Github repository: https://github.com/guillemcortes/baf-dataset and the dataset datasheet included in the files.
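As a quick consistency check on the audio format stated above, the quoted 128 kb/s follows directly from mono, 8 kHz, 16-bit PCM. The sketch below works through that arithmetic and shows one way a file's header could be checked with Python's standard wave module. The helper names are illustrative, and the one-second silent clip is generated in memory rather than taken from the dataset.

```python
import io
import wave

SAMPLE_RATE_HZ = 8000    # "8kHz"
SAMPLE_WIDTH_BYTES = 2   # pcm_s16le: signed 16-bit little-endian samples
CHANNELS = 1             # monophonic


def pcm_bitrate(sample_rate_hz, sample_width_bytes, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * sample_width_bytes * 8 * channels


def wav_matches_format(fileobj):
    """Return True if the WAV header matches the format described above."""
    with wave.open(fileobj, "rb") as wav:
        return (
            wav.getnchannels() == CHANNELS
            and wav.getsampwidth() == SAMPLE_WIDTH_BYTES
            and wav.getframerate() == SAMPLE_RATE_HZ
        )


# Build a one-second silent clip in memory as a stand-in for a query file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(CHANNELS)
    wav.setsampwidth(SAMPLE_WIDTH_BYTES)
    wav.setframerate(SAMPLE_RATE_HZ)
    wav.writeframes(b"\x00\x00" * SAMPLE_RATE_HZ)
buffer.seek(0)

bitrate = pcm_bitrate(SAMPLE_RATE_HZ, SAMPLE_WIDTH_BYTES, CHANNELS)
format_ok = wav_matches_format(buffer)
```

Here 8,000 samples/s × 16 bits × 1 channel gives 128,000 bits per second, which matches the stated 128 kb/s.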
Dataset contents
The dataset is structured following this schema
baf-dataset/
├── baf_datasheet.pdf
├── annotations.csv
├── changelog.md
├── cross_annotations.csv
├── queries_info.csv
├── queries
│   ├── query_0001.wav
│   ├── query_0002.wav
│   ├── …
│   └── query_3425.wav
└── references
    ├── ref_0001.wav
    ├── ref_0002.wav
    ├── …
    └── ref_2000.wav
There are two folders named queries and references containing the wav files of TV broadcast recordings and the reference tracks, respectively.
annotations.csv file contains the annotations made by the 6 annotators, giving the following information:
query | reference | query_start | query_end | annotator |
---|---|---|---|---|
query_0692.wav | ref_1235.wav | 0.0 | 59.904 | annotator_6 |
cross_annotations.csv contains the resulting annotations after merging the overlapping annotations in annotations.csv file. x_tag has three different values:
single: the segment has only been annotated by one annotator.
majority: the segment has been annotated by two annotators.
unanimity: the segment has been annotated by the three annotators.
query | reference | query_start | query_end | annotators | x_tag |
---|---|---|---|---|---|
query_0693.wav | ref_1834.wav | 37.53 | 38.07 | ['annotator_3'] | single |
query_0693.wav | ref_1834.wav | 18.18 | 37.48 | ['annotator_3', 'annotator_5', 'annotator_3'] | unanimity |
query_0693.wav | ref_1834.wav | 37.48 | 37.53 | ['annotator_5', 'annotator_3'] | majority |
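A minimal sketch of how x_tag relates to the annotators column, assuming the tag depends only on how many entries the list holds. That assumption reproduces the three example rows above but is not a documented rule, and the function name is hypothetical. The annotators cell is parsed as a Python list literal, which is how it appears in the table.

```python
import ast

# Assumed mapping from the number of annotator entries to the x_tag value.
X_TAGS = {1: "single", 2: "majority", 3: "unanimity"}


def x_tag_from_cell(cell):
    """Derive the x_tag for one cross_annotations.csv row from the text of
    its annotators cell, e.g. "['annotator_3']"."""
    annotators = ast.literal_eval(cell)  # safely parse the list literal
    if len(annotators) not in X_TAGS:
        raise ValueError(f"unexpected number of annotators: {len(annotators)}")
    return X_TAGS[len(annotators)]


example_cells = [
    "['annotator_3']",
    "['annotator_3', 'annotator_5', 'annotator_3']",
    "['annotator_5', 'annotator_3']",
]
example_tags = [x_tag_from_cell(cell) for cell in example_cells]
```

Counting list entries rather than distinct names is a deliberate choice here, because the second example row lists annotator_3 twice and is still tagged unanimity.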
queries_info.csv contains information about the queries that can serve as a citation reference: the country, the channel, and the date and time of the broadcast.
filename | country | channel | datetime |
---|---|---|---|
query_0001.wav | Norway | Discovery Channel | 2021-02-26 14:45:26 |
changelog.md contains a curated, chronologically ordered list of notable changes for each version of the dataset.
baf_datasheet.pdf contains standardized documentation for the dataset.
Ownership of the data
Next, we specify the ownership of all the data included in BAF: Broadcast Audio Fingerprinting dataset. For licensing information, please refer to the “License” section.
Reference tracks
The reference tracks are owned by Epidemic Sound AB, which has given a worldwide, revocable, non-exclusive, royalty-free licence to use and reproduce this data collection consisting of 2,000 low-quality monophonic 8kHz downsampled audio recordings.
Query tracks
The query tracks come from publicly available TV broadcast emissions so the ownership of each recording belongs to the channel that emitted the content. We publish them under the right of quotation provided by the Berne Convention.
Annotations
Guillem Cortès, together with Alex Ciurana and Emilio Molina from BMAT Music Licensing S.L., managed the annotation; therefore, the annotations belong to BMAT.
Accessing the dataset
The dataset is available upon request. Please include, in the justification field, your academic affiliation (if you have one) and a brief description of your research topics and why you would like to use this dataset. Bear in mind that this information is important for the evaluation of every access request.
License
This dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis. Given the different ownership of the elements of the dataset, the dataset is licensed under the following conditions:
User’s access request
Research only, non-commercial purposes
No adaptations nor derivative works
Attribution to Epidemic Sound and the authors, as indicated in the "citation" section.
Acknowledgments
With the support of the Ministerio de Ciencia, Innovación y Universidades through the Retos-Colaboración call (reference: RTC2019-007248-7), and of the Industrial Doctorates Plan of the Secretariat of Universities and Research of the Department of Business and Knowledge of the Generalitat de Catalunya (reference: DI46-2020).
By Gove Allen [source]
The Law and Order Dataset is a comprehensive collection of data related to the popular television series Law and Order that aired from 1990 to 2010. This dataset, compiled by IMDB.com, provides detailed information about each episode of the show, including its title, summary, airdate, director, writer, guest stars, and IMDb rating.
With over 450 episodes spanning 20 seasons of the original series as well as its spin-offs like Law and Order: Special Victims Unit, this dataset offers a wealth of information for analyzing various facets of criminal justice and law enforcement portrayed in the show. Whether you are a student or researcher studying crime-related topics or simply an avid fan interested in exploring behind-the-scenes details about your favorite episodes or actors involved in them, this dataset can be a valuable resource.
By examining this extensive collection of data using SQL queries or other analytical techniques, one can gain insights into patterns such as common tropes used in different seasons or characters that appeared most frequently throughout the series. Additionally, researchers can investigate correlations between factors like episode directors/writers and their impact on viewer ratings.
This dataset allows users to dive deep into analyzing aspects like crime types covered within episodes (e.g., homicide cases versus white-collar crimes), how often certain guest stars made appearances (including famous actors who had early roles on the show), or which writers/directors contributed most consistently high-rated episodes. Such analyses provide opportunities for uncovering trends over time within Law and Order's narrative structure while also shedding light on societal issues addressed by the series.
By making this dataset available for educational purposes at collegiate levels specifically aimed at teaching SQL skills—a powerful tool widely used in data analysis—the intention is to empower students with real-world examples they can explore hands-on while honing their database querying abilities. The graphical representation accompanying this dataset further enhances understanding by providing visualizations that illustrate key relationships between different variables.
Whether you are a seasoned data analyst, a budding criminologist, or simply looking to understand the intricacies of one of the most successful crime dramas in television history, the Law and Order Dataset offers you a vast array of information ripe for exploration and analysis.
Understanding the Columns
Before diving into analyzing the data, it's important to understand what each column represents. Here is an overview:
Episode: The episode number within its respective season.
Title: The title of each episode.
Season: The season number to which each episode belongs.
Year: The year in which each episode was released.
Rating: IMDB rating for each episode (on a scale from 0-10).
Votes: Number of votes received by each episode on IMDB.
Description: Brief summary or description of each episode's plot.
Director: Director(s) responsible for directing an episode.
Writers: Writer(s) credited for writing an episode.
Stars: Actor(s) who starred in an individual episode.
Exploring Episode Data
The dataset allows you to explore various aspects of individual episodes as well as broader trends throughout different seasons:
1. Analyzing Ratings:
- You can examine how ratings vary across seasons using aggregation functions like average (AVG), minimum (MIN), maximum (MAX), etc., depending on your analytical goals.
- Identify popular episodes by sorting based on highest ratings or most votes received.
2. Trends over Time:
- Investigate how ratings have changed over time by visualizing them using line charts or bar graphs based on release years or seasons.
- Examine if there are any significant fluctuations in ratings across different seasons or years.
3. Directors and Writers:
- Identify episodes directed by a specific director or written by particular writers by filtering the dataset based on their names.
- Analyze the impact of different directors or writers on episode ratings.
4. Popular Actors:
- Explore episodes featuring popular actors from the show such as Mariska Hargitay (Olivia Benson), Christopher Meloni (Elliot Stabler), etc.
- Investigate whether episodes with popular actors received higher ratings compared to ...
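The rating analyses listed above can be sketched with SQL run through Python's built-in sqlite3 module. This is an illustration only: the table name episodes, the subset of columns used, and the three sample rows (with placeholder titles and made-up ratings) are assumptions and are not taken from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes ("
    "Season INTEGER, Episode INTEGER, Title TEXT, Rating REAL, Votes INTEGER)"
)

# Hypothetical sample rows; real values would come from the dataset itself.
sample_rows = [
    (1, 1, "Placeholder A", 8.0, 100),
    (1, 2, "Placeholder B", 7.0, 80),
    (2, 1, "Placeholder C", 9.0, 120),
]
conn.executemany("INSERT INTO episodes VALUES (?, ?, ?, ?, ?)", sample_rows)

# Average, minimum and maximum rating per season.
per_season = conn.execute(
    "SELECT Season, AVG(Rating), MIN(Rating), MAX(Rating) "
    "FROM episodes GROUP BY Season ORDER BY Season"
).fetchall()

# Highest-rated episode, with ties broken by the number of votes.
top_title = conn.execute(
    "SELECT Title FROM episodes ORDER BY Rating DESC, Votes DESC LIMIT 1"
).fetchone()[0]

conn.close()
```

The same GROUP BY pattern extends to the Year, Director, or Writers columns for the trend and contributor questions.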
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Netflix "Top 10" TV Shows and Films’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/dhruvildave/netflix-top-10-tv-shows-and-films on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Every Tuesday, Netflix publishes four global Top 10 lists for films and TV: Film (English), TV (English), Film (Non-English), and TV (Non-English). These lists rank titles based on weekly hours viewed: the total number of hours that members around the world watched each title from Monday to Sunday of the previous week.
Each season of a series and each film is considered on their own, so you might see both Stranger Things seasons 2 and 3 in the Top 10. Because titles sometimes move in and out of the Top 10, there is also the total number of weeks that a season of a series or film has spent on the list.
Netflix also publishes Top 10 lists for nearly 100 countries and territories (the same locations where there are Top 10 rows on Netflix). Country lists are also ranked based on hours viewed but don’t show country-level viewing directly.
Finally, Netflix provides a list of the Top 10 most popular Netflix films and TV (branded Netflix in any country) in each of the four categories based on the hours that each title was viewed during its first 28 days.
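The ranking logic described above, weekly ordering by hours viewed plus a running count of weeks on the list, can be sketched as follows. The row layout, the week labels, the titles, and the hour figures are all hypothetical and are not taken from the published lists.

```python
from collections import defaultdict

# Hypothetical rows: (week_ending, category, title, weekly_hours_viewed).
weekly_rows = [
    ("2021-07-04", "TV (English)", "Show A: Season 1", 50_000_000),
    ("2021-07-04", "TV (English)", "Show B: Season 2", 70_000_000),
    ("2021-07-11", "TV (English)", "Show A: Season 1", 40_000_000),
]


def rank_week(rows, week, category):
    """Titles for one week and category, ordered by hours viewed (descending)."""
    in_scope = [row for row in rows if row[0] == week and row[1] == category]
    ordered = sorted(in_scope, key=lambda row: row[3], reverse=True)
    return [row[2] for row in ordered]


def weeks_on_list(rows):
    """Number of distinct weeks each (category, title) pair appears in."""
    weeks = defaultdict(set)
    for week, category, title, _hours in rows:
        weeks[(category, title)].add(week)
    return {key: len(value) for key, value in weeks.items()}


ranking = rank_week(weekly_rows, "2021-07-04", "TV (English)")
weeks = weeks_on_list(weekly_rows)
```

Because each season is treated as its own title, the season is kept as part of the title string so that two seasons of one series are counted separately.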
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We investigate whether TV watching at ages 6-7 and 8-9 affects cognitive development measured by math and reading scores at ages 8-9, using a rich childhood longitudinal sample from NLSY79. Dynamic panel data models are estimated to handle the unobserved child-specific factor, endogeneity of TV watching, and dynamic nature of the causal relation. A special emphasis is placed on the last aspect, where TV watching affects cognitive development, which in turn affects future TV watching. When this feedback occurs, it is not straightforward to identify and estimate the TV effect. We develop a two-stage estimation method which can deal with the feedback feature; we also apply the standard econometric panel data approaches. Overall, for math score at ages 8-9, we find that watching TV during ages 6-7 and 8-9 has a negative total effect, mostly due to a large negative effect of TV watching at the younger ages 6-7. For reading score, there is evidence that watching no more than 2 hours of TV per day has a positive effect, whereas the effect is negative outside this range. In both cases, however, the effect magnitudes are economically small.
Number of average usual hours and average actual hours worked in a reference week by type of work (full- and part-time employment), job type (main or all jobs), gender, and age group, annual.
https://choosealicense.com/licenses/cc0-1.0/
The "Thorsten-Voice" dataset
This truly open source (CC0 license) German (🇩🇪) voice dataset contains about 40 hours of transcribed voice recordings by Thorsten Müller, a single male native speaker, in over 38,000 wave files.
Mono
Sample rate: 44,100 Hz
Trimmed silence at beginning/end
Denoised
Normalized to -24 dB
Disclaimer
"Please keep in mind, I am not a professional speaker, just an open source speech technology enthusiast who donates his voice. I contribute my personal… See the full description on the dataset page: https://huggingface.co/datasets/Thorsten-Voice/TV-44kHz-Full.