Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
YouTube was launched in 2005. It was founded by three PayPal employees: Chad Hurley, Steve Chen, and Jawed Karim, who ran the company from an office above a small restaurant in San Mateo. The first...
This dataset provides estimated YouTube RPM (Revenue Per Mille) ranges for different niches in 2025, based on ad revenue earned per 1,000 monetized views.
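As a rough illustration of the metric described above, RPM can be computed as ad revenue scaled to 1,000 monetized views. The sketch below follows that definition only; the function name and the sample figures are hypothetical and are not taken from the dataset.

```python
def rpm(ad_revenue: float, monetized_views: int) -> float:
    """Return ad revenue per 1,000 monetized views (RPM)."""
    if monetized_views <= 0:
        raise ValueError("monetized_views must be positive")
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return ad_revenue * 1000 / monetized_views


# Hypothetical figures, for illustration only (not from the dataset).
example_rpm = rpm(ad_revenue=450.0, monetized_views=150_000)
print(f"RPM: {example_rpm:.2f}")
```

A niche's RPM range in the dataset can be read the same way: the lower and upper bounds are estimates of this ratio, not guaranteed earnings.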
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These are statistics for the top 10 songs of various Spotify artists and their YouTube videos. The creators generated the data and uploaded it to Kaggle on February 6-7, 2023. The data is licensed as "CC0: Public Domain", so it may be copied, modified, distributed, and worked on without asking permission. The data is provided in CSV format with numerical and textual fields. The dataset contains the statistics and attributes of the top 10 songs of various artists worldwide. As described by the creators, it includes 26 variables for each song collected from Spotify. These variables are briefly described next:
Track: name of the song, as visible on the Spotify platform.
Artist: name of the artist.
Url_spotify: the URL of the artist.
Album: the album in which the song is contained on Spotify.
Album_type: indicates whether the song is released on Spotify as a single or contained in an album.
Uri: a Spotify link used to find the song through the API.
Danceability: describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
Energy: a measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
Key: the key the track is in. Integers map to pitches using standard Pitch Class notation, e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
Loudness: the overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
Speechiness: detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
Acousticness: a confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence that the track is acoustic.
Instrumentalness: predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
Liveness: detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides a strong likelihood that the track is live.
Valence: a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
Tempo: the overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
Duration_ms: the duration of the track in milliseconds.
Stream: number of streams of the song on Spotify.
Url_youtube: URL of the video linked to the song on YouTube, if it has any.
Title: title of the video clip on YouTube.
Channel: name of the channel that has published the video.
Views: number of views.
Likes: number of likes.
Comments: number of comments.
Description: description of the video on YouTube.
Licensed: indicates whether the video represents licensed content, which means that the content was uploaded to a channel linked to a YouTube content partner and then claimed by that partner.
official_video: boolean value that indicates whether the video found is the official video of the song.
The data was last updated on February 7, 2023.
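The Key and Speechiness encodings described above can be decoded with a few lines of code. The following is a minimal sketch based only on the variable descriptions in this listing; the function names and label strings are hypothetical, and the handling of the exact boundary values 0.33 and 0.66 is an assumption, since the description does not say which side they fall on.

```python
# Standard Pitch Class notation: index 0 is C, 1 is C♯/D♭, and so on up to 11 (B).
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def key_name(key: int) -> str:
    """Translate the integer Key column into a pitch name."""
    if key == -1:
        return "no key detected"
    if 0 <= key <= 11:
        return PITCH_CLASSES[key]
    raise ValueError(f"unexpected key value: {key}")


def speechiness_label(value: float) -> str:
    """Bucket a Speechiness value using the thresholds from the description."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("speechiness must be between 0.0 and 1.0")
    if value > 0.66:
        return "probably all spoken words"
    if value >= 0.33:  # assumption: boundary values count as mixed content
        return "music and speech"
    return "mostly music"


print(key_name(2), "|", speechiness_label(0.45))
```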
During the first half of 2023, the majority of copyright claims received by YouTube were spotted by the platform's Content ID tool, which cross-checks uploaded videos against a larger file database. Over 2.75 million claims were submitted via the Copyright Match Tool, while approximately two million claims were submitted to the platform via webforms.
How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
Who uses social media?
Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America spent the most time per day on social media.
What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Data set of materials in vessels
The handling of materials in glassware vessels is the main task in chemistry laboratory research as well as a large number of other activities. Visual recognition of the physical phase of the
materials is essential for many methods ranging from a simple task such as fill-level evaluation to the
identification of more complex properties such as solvation, precipitation, crystallization and phase
separation. To help train neural nets for this task, a new data set was created. The data set contains a
thousand images of materials, in different phases and involved in different chemical processes, in a
laboratory setting. Each pixel in each image is labeled according to several layers of classification, as
given below:
a. Vessel/Background: For each pixel assign value of one if it is part of the vessel and zero otherwise.
This annotation was used as the ROI map for the valve filter method.
b. Filled/Empty: This is similar to the above, but also distinguishes between the filled and empty
regions of the vessel. For each pixel, one of the following three values is assigned: 0 (background); 1
(empty vessel); or 2 (filled vessel).
c. Phase type: This is similar to the above but distinguishes between liquid and solid regions of the
filled vessel. For each pixel, one of the following four values is assigned: 0 (background); 1 (empty vessel); 2
(liquid); or 3 (solid).
d. Fine-grained physical phase type: This is similar to the above but distinguishes between specific
classes of physical phase. For each pixel, one of 15 values is assigned: 1 (background); 2 (empty
vessel); 3 (liquid); 4 (liquid phase two, in the case where more than one phase of the liquid appears in
the vessel); 5 (suspension); 6 (emulsion); 7 (foam); 8 (solid); 9 (gel); 10 (powder); 11 (granular); 12
(bulk); 13 (solid-liquid mixture); 14 (solid phase two, in the case where more than one phase of solid
exists in the vessel); and 15 (vapor).
The annotations are given as images of the size of the original image, where the pixel value is the
class number. The annotation of the vessel region (a) is used in the ROI input for the valve filter net.
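The coarser annotation layers follow from the phase type layer (c) by the definitions above. The sketch below illustrates that relationship on a tiny hand-made label map; it is not the authors' tooling, the function names are hypothetical, and the data set itself ships each layer as its own annotation image. Plain nested lists stand in for image arrays so the example needs no third-party libraries.

```python
def vessel_mask(phase_map):
    """Layer (a): 1 where the pixel belongs to the vessel, 0 for background."""
    return [[1 if label > 0 else 0 for label in row] for row in phase_map]


def filled_empty_map(phase_map):
    """Layer (b): 0 background, 1 empty vessel, 2 filled (liquid or solid)."""
    return [[min(label, 2) for label in row] for row in phase_map]


# Hypothetical 2x4 phase type map using the layer (c) codes:
# 0 background, 1 empty vessel, 2 liquid, 3 solid.
phase = [
    [0, 1, 1, 0],
    [0, 2, 3, 0],
]

print(vessel_mask(phase))       # vessel versus background
print(filled_empty_map(phase))  # liquid and solid both collapse to "filled"
```

The fine-grained layer (d) is numbered from 1 rather than 0, so it would need its own lookup table before being compared with the other layers.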
4.1. Validation/testing set
The data set is divided into training and testing sets. The testing set is itself divided into two subsets;
one contains images extracted from the same YouTube channels as the training set, and therefore was
taken under similar conditions as the training images. The second subset contains images extracted
from YouTube channels not included in the training set, and hence contains images taken under
different conditions from those used to train the net.
4.2. Creating the data set
The creation of a large number of images with a variety of chemical processes and settings could have
been a daunting task. Luckily, several YouTube channels dedicated to chemical experiments exist
which offer high-quality footage of chemistry experiments. Thanks to these channels, including
NurdRage, NileRed, and ChemPlayer, it was possible to collect a large number of high-quality images in a
short time. Pixel-wise annotation of these images was another challenging task, and was performed by
Alexandra Emanuel and Mor Bismuth.
For more details see: Setting attention region for convolutional neural networks using region selective features, for recognition of materials within glass vessels
This dataset was first published in 2017.8
For newer and bigger datasets, see:
https://zenodo.org/record/4736111#.YbG-RrtyZH4
https://zenodo.org/record/3697452#.YbG-TLtyZH4
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data set contains combined on-court performance data for NBA players in the 2016-2017 season, alongside salary, Twitter engagement, and Wikipedia traffic data.
Further information can be found in a series of articles for IBM Developerworks: "Explore valuation and attendance using data science and machine learning" and "Exploring the individual NBA players".
A talk about this dataset has slides from March, 2018, Strata:
Further reading on this dataset is available in Chapter 6 of the book Pragmatic AI: An Introduction to Cloud-Based Machine Learning, and in lesson 9 of Essential Machine Learning and AI with Python and Jupyter Notebook.
You can watch a breakdown of using cluster analysis on the Pragmatic AI YouTube channel
Learn to deploy a Kaggle project as a production machine learning service (sklearn + Flask + container) by reading Python for DevOps: Learn Ruthlessly Effective Automation, Chapter 14: MLOps and Machine Learning Engineering.
Use social media to predict a winning season with this notebook: https://github.com/noahgift/core-stats-datascience/blob/master/Lesson2_7_Trends_Supervized_Learning.ipynb
Learn to use the cloud for data analysis.
Data sources include ESPN, Basketball-Reference, Twitter, FiveThirtyEight, and Wikipedia. The source code for this dataset (in Python and R) can be found on GitHub. Links to more writing can be found at noahgift.com.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Have you ever wanted to create your own maps, or integrate and visualize spatial datasets to examine changes in trends between locations and over time? Follow along with these training tutorials on QGIS, an open source geographic information system (GIS) and learn key concepts, procedures and skills for performing common GIS tasks – such as creating maps, as well as joining, overlaying and visualizing spatial datasets. These tutorials are geared towards new GIS users. We’ll start with foundational concepts, and build towards more advanced topics throughout – demonstrating how with a few relatively easy steps you can get quite a lot out of GIS. You can then extend these skills to datasets of thematic relevance to you in addressing tasks faced in your day-to-day work.
In 2023, Meta Platforms had a total annual revenue of over 134 billion U.S. dollars, up from 116 billion in 2022. LinkedIn reported its highest annual revenue to date, generating over 15 billion USD, whilst Snapchat reported an annual revenue of 4.6 billion USD.
YouTuber Doug DeMuro is popular for his car reviews. At the end of each review, he gives the car a score, the DougScore.
The data was obtained from his website, which contains a description as well as a link to the Google Sheets source. The data is unchanged except for minor cleaning to convert it to CSV.
So how does the DougScore work? There are 10 separate categories, and they’re each judged on a scale of 1 through 10 — with “1” being the worst, and “10” being the best, meaning the highest possible DougScore is 100. The ten categories are split into two separate groups: “Weekend” and “Daily.” The “Weekend” categories measure a car’s appeal to enthusiasts; in other words, how much fun it would be to drive on the weekend. The “Daily” categories, meanwhile, focus on a car’s livability and practicality.
The Weekend categories are Styling, Acceleration, Handling, Fun Factor, and Cool Factor, while the Daily categories are Features, Comfort, Quality, Practicality, and Value. The categories in each group are summed as Weekend Total and Daily Total, respectively.
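The scoring scheme described above amounts to two five-category sums and a grand total. Here is a minimal sketch of that arithmetic; the function name, the dictionary keys, and the sample ratings are hypothetical and may not match the column names in the CSV.

```python
WEEKEND = ("Styling", "Acceleration", "Handling", "Fun Factor", "Cool Factor")
DAILY = ("Features", "Comfort", "Quality", "Practicality", "Value")


def doug_score(ratings: dict) -> dict:
    """Sum the ten 1-10 category ratings into Weekend, Daily, and overall totals."""
    for category in WEEKEND + DAILY:
        if not 1 <= ratings[category] <= 10:
            raise ValueError(f"{category} must be between 1 and 10")
    weekend_total = sum(ratings[c] for c in WEEKEND)
    daily_total = sum(ratings[c] for c in DAILY)
    return {
        "Weekend Total": weekend_total,
        "Daily Total": daily_total,
        "DougScore": weekend_total + daily_total,
    }


# Hypothetical ratings, for illustration only (not a real review).
sample = {
    "Styling": 7, "Acceleration": 6, "Handling": 5, "Fun Factor": 6, "Cool Factor": 8,
    "Features": 7, "Comfort": 6, "Quality": 7, "Practicality": 4, "Value": 5,
}
print(doug_score(sample))
```

With every category at 10, the overall total reaches the stated maximum of 100.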
Please upvote if you like this dataset, and don't forget Doug's channel. Thanks.
The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, a new peak, in 2027. Notably, the number of Facebook users has increased continuously over the past years. The user figures shown here for Facebook have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A curated dataset of over 400 unique and creative YouTube channel name ideas organized by popular niches such as gaming, travel, tech, beauty, vlogging, pets, DIY, education, and more. Includes a free YouTube channel name generator to help creators find inspiration for their brand.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The Indian Premier League (IPL) is a professional Twenty20 cricket league in India usually contested between March and May of every year by eight teams representing eight different cities or states in India. The league was founded by the Board of Control for Cricket in India (BCCI) in 2007. The IPL has an exclusive window in ICC Future Tours Programme.
The IPL is the most-attended cricket league in the world and in 2014 was ranked sixth by average attendance among all sports leagues. In 2010, the IPL became the first sporting event in the world to be broadcast live on YouTube. The brand value of the IPL in 2019 was ₹475 billion (US$6.7 billion), according to Duff & Phelps. According to BCCI, the 2015 IPL season contributed ₹11.5 billion (US$160 million) to the GDP of the Indian economy.
The dataset consists of data about IPL matches played from 2008 to 2019. The IPL is a professional Twenty20 cricket league founded by the Board of Control for Cricket in India (BCCI) in 2007, with its first season played in 2008. The league has 8 teams representing 8 different Indian cities or states. It enjoys tremendous popularity, and the brand value of the IPL in 2019 was estimated to be ₹475 billion (US$6.7 billion). So let’s analyze the IPL through stats.
The dataset has 18 columns. Let’s get acquainted with the columns.
- id: The IPL match id.
- season: The IPL season.
- city: The city where the IPL match was held.
- date: The date on which the match was held.
- team1: One of the teams of the IPL match.
- team2: The other team of the IPL match.
- toss_winner: The team that won the toss.
- toss_decision: The decision taken by the team that won the toss to ‘bat’ or ‘field’.
- result: The result (‘normal’, ‘tie’, ‘no result’) of the match.
- dl_applied: (1 or 0) Indicates whether the Duckworth-Lewis rule was applied or not.
- winner: The winner of the match.
- win_by_runs: The number of runs by which the team batting first won.
- win_by_wickets: The number of wickets by which the team batting second won.
- player_of_match: The outstanding player of the match.
- venue: The venue where the match was hosted.
- umpire1: One of the two on-field umpires who officiate the match.
- umpire2: One of the two on-field umpires who officiate the match.
- umpire3: The off-field umpire who officiates the match.
Facebook received 73,390 user data requests from federal agencies and courts in the United States during the second half of 2023. The social network produced some user data in 88.84 percent of requests from U.S. federal authorities. The United States accounts for the largest share of Facebook user data requests worldwide.
During a 2024 survey among marketers worldwide, approximately 83 percent selected increased exposure as a benefit of social media marketing. Increased traffic followed, mentioned by 73 percent of the respondents, while 65 percent cited generated leads.
The multibillion-dollar social media ad industry
Between 2019 – the last year before the pandemic – and 2024, global social media advertising spending skyrocketed by 140 percent, surpassing an estimated 230 billion U.S. dollars in the latter year. That figure was forecast to increase by nearly 50 percent by the end of the decade, exceeding 345 billion dollars in 2029. As of 2024, the social media networks with the most monthly active users were Facebook, with over three billion, and YouTube, with more than 2.5 billion.
Pros and cons of GenAI for social media marketing
According to another 2024 survey of professionals worldwide, the leading benefits of generative artificial intelligence (GenAI) for social media marketing included increased efficiency and easier idea generation. Third place was a tie between increased content production and enhanced creativity. All those advantages were cited by between 33 and 38 percent of the interviewees. As for GenAI's top challenges for global social media marketing, maintaining authenticity and the value of human creativity ranked first, mentioned by 43 and 40 percent of the respondents, respectively. Another 35 percent deemed ensuring that the content resonates an obstacle.
Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024.
The Portuguese footballer is the most-followed person on the photo-sharing platform, with 628 million followers. Instagram's own account was ranked first with roughly 672 million followers.
How popular is Instagram?
Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States and experts project this figure to surpass 127 million users in 2023.
Who uses Instagram?
Instagram audiences are predominantly young: recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media platforms for teens and one of the social networks with the biggest reach among teens in the United States.
Celebrity influencers on Instagram
Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
During a January 2024 global survey among marketers, nearly 60 percent reported plans to increase their organic use of YouTube for marketing purposes in the following 12 months. LinkedIn and Instagram followed, respectively mentioned by 57 and 56 percent of the respondents intending to use them more. According to the same survey, Facebook was the most important social media platform for marketers worldwide.
During a 2024 survey among marketers worldwide, around 86 percent reported using Facebook for marketing purposes. Instagram and LinkedIn followed, respectively mentioned by 79 and 65 percent of the respondents.
The global social media marketing segment
According to the same study, 59 percent of responding marketers intended to increase their organic use of YouTube for marketing purposes throughout that year. LinkedIn and Instagram followed with similar shares, rounding up the top three social media platforms attracting a planned growth in organic use among global marketers in 2024. Their main driver is increasing brand exposure and traffic, which led the ranking of benefits of social media marketing worldwide.
Social media for B2B marketing
Social media platform adoption rates among business-to-consumer (B2C) and business-to-business (B2B) marketers vary according to each subsegment's focus. While B2C professionals prioritize Facebook and Instagram – both run by Meta, Inc. – due to their popularity among online audiences, B2B marketers concentrate their endeavors on Microsoft-owned LinkedIn due to its goal to connect people and companies in a corporate context.
As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.
How much time do people spend on social media?
As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent on social media in the U.S. was just two hours and 16 minutes.
Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively.
People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.
Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general.
During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased a polarization in politics and heightened everyday distractions.