Netflix's global subscriber base surpassed *** million paid subscribers worldwide in the fourth quarter of 2024. This is an increase of nearly ** million subscribers over the previous quarter and solidifies Netflix's position as a dominant force in the streaming industry.
Adapting to customer losses
Netflix's growth has not always been consistent. During the first half of 2022, the streaming giant lost over *** million customers. In response, Netflix introduced an ad-supported tier in November of that same year. The move has paid off: the lower-cost plan had attracted ** million monthly active users globally by November 2024, demonstrating Netflix's ability to adapt to changing market conditions and consumer preferences.
Global expansion
Netflix continues to focus on international markets. One forecast suggests that the Asia Pacific region will see the most substantial growth in the coming years, potentially reaching around **** million subscribers by 2029. To meet the needs of its non-American audience, the company has invested heavily in international content in recent years, with Korean, Spanish, and Japanese being the most-watched non-English content languages on the platform.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Netflix subscription fee in different countries’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/prasertk/netflix-subscription-price-in-different-countries on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Which countries pay the most and least for Netflix in 2021?
Data source: https://www.comparitech.com/blog/vpn-privacy/countries-netflix-cost/ Cover image credit: https://www.pexels.com/photo/light-man-people-woman-5112410/
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Analysis of ‘Netflix Shows’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/netflix-showse on 13 February 2022.
--- Dataset description provided by original source is as follows ---
Background
Netflix in the past 5-10 years has captured a large population of viewers. With more viewers, there is most likely an increase in show variety. However, do people understand the distribution of ratings on Netflix shows?
Netflix Suggestion Engine
Because of the vast amount of time it would take to gather 1,000 shows one by one, the gathering method took advantage of Netflix’s suggestion engine. The suggestion engine recommends shows similar to the selected show. As part of this data set, I took 4 videos from each of 4 ratings (totaling 16 unique shows), then pulled 53 suggested shows per video. The ratings include: G, PG, TV-14, TV-MA. I chose not to pull from every rating (e.g. TV-G, TV-Y, etc.).
Source
Access to the study can be found at The Concept Center
This dataset was created by Chase Willden and contains around 1000 samples along with User Rating Score, Rating Description, technical information and other features such as: - Release Year - Title - and more.
- Analyze User Rating Size in relation to Rating
- Study the influence of Rating Level on User Rating Score
- More datasets
If you use this dataset in your research, please credit Chase Willden
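As a rough illustration of the second idea listed above (the influence of rating level on user rating score), the sketch below groups scores by rating and averages them. The sample rows and the field names `rating` and `user_rating_score` are made up for illustration; the column names in the actual file may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; these are NOT taken from the actual dataset.
SAMPLE_ROWS = [
    {"title": "Show A", "rating": "TV-MA", "user_rating_score": 90},
    {"title": "Show B", "rating": "TV-MA", "user_rating_score": 80},
    {"title": "Show C", "rating": "PG", "user_rating_score": 70},
    {"title": "Show D", "rating": "G", "user_rating_score": None},  # missing score
]


def mean_score_by_rating(rows):
    """Return {rating: mean user rating score}, skipping rows without a score."""
    scores = defaultdict(list)
    for row in rows:
        if row["user_rating_score"] is not None:
            scores[row["rating"]].append(row["user_rating_score"])
    return {rating: mean(values) for rating, values in scores.items()}


if __name__ == "__main__":
    for rating, avg in sorted(mean_score_by_rating(SAMPLE_ROWS).items()):
        print(rating, avg)
```

Ratings whose rows all lack a score simply do not appear in the result, which keeps missing values from being counted as zeros.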
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
In this post, you'll see how the Netflix platform is evolving, how many users Netflix has and how they perform against the growing competition.
https://creativecommons.org/publicdomain/zero/1.0/
Netflix in the past 5-10 years has captured a large population of viewers. With more viewers, there is most likely an increase in show variety. However, do people understand the distribution of ratings on Netflix shows?
Because of the vast amount of time it would take to gather 1,000 shows one by one, the gathering method took advantage of Netflix’s suggestion engine. The suggestion engine recommends shows similar to the selected show. As part of this data set, I took 4 videos from each of 4 ratings (totaling 16 unique shows), then pulled 53 suggested shows per video. The ratings include: G, PG, TV-14, TV-MA. I chose not to pull from every rating (e.g. TV-G, TV-Y, etc.).
The data set and the research article can be found at The Concept Center
I was watching Netflix with my wife and we asked ourselves, why are there so many R and TV-MA rating shows?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Netflix produced more than 2,769 hours of original content in 2019. This was a huge 80.15% increase compared to 2018. Netflix had over 2,000 originals at the beginning of 2021.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Analysis of ‘1000 Netflix Shows’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/chasewillden/netflix-shows on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Netflix in the past 5-10 years has captured a large population of viewers. With more viewers, there is most likely an increase in show variety. However, do people understand the distribution of ratings on Netflix shows?
Because of the vast amount of time it would take to gather 1,000 shows one by one, the gathering method took advantage of Netflix’s suggestion engine. The suggestion engine recommends shows similar to the selected show. As part of this data set, I took 4 videos from each of 4 ratings (totaling 16 unique shows), then pulled 53 suggested shows per video. The ratings include: G, PG, TV-14, TV-MA. I chose not to pull from every rating (e.g. TV-G, TV-Y, etc.).
The data set and the research article can be found at The Concept Center
I was watching Netflix with my wife and we asked ourselves, why are there so many R and TV-MA rating shows?
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Netflix has been met with tons of competition from major multinational companies. These are the key Netflix Statistics you need to know.
https://creativecommons.org/publicdomain/zero/1.0/
Netflix, Inc. operates as a streaming entertainment service company. The firm provides subscription service streaming movies and television episodes over the Internet and sending DVDs by mail. It operates through the following segments: Domestic Streaming, International Streaming and Domestic DVD. The Domestic Streaming segment derives revenues from monthly membership fees for services consisting of streaming content to its members in the United States. The International Streaming segment includes fees from members outside the United States. The Domestic DVD segment covers revenues from services consisting of DVD-by-mail. The company was founded by Marc Randolph and Wilmot Reed Hastings Jr. on August 29, 1997 and is headquartered in Los Gatos, CA.
Mutual fund holders: 49.41%
Individual stakeholders: 4.17%
Other institutional: 31.86%
Netflix, Inc. 100 Winchester Circle Los Gatos California 95032
P: (408) 540-3700 Investor Relations: (408) 809-5360 www.netflix.com
The data is collected from Yahoo Finance. Inspiration is the release of the fifth season of my favorite Netflix show Money Heist (La Casa de Papel)
https://scoop.market.us/privacy-policy
Streaming Services Statistics: Streaming services have transformed the entertainment landscape, revolutionizing how people consume content.
The advent of high-speed internet and the proliferation of smart devices have fueled the growth of these platforms, offering a wide array of movies, TV shows, music, and more, at the viewers' convenience.
This introduction provides an overview of key statistics that shed light on the impact, trends, and challenges within the streaming industry.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The average Netflix user spends 3.2 hours per day streaming content on Netflix.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The company reported that its users are 49% women and 51% men.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Background
Mental health conditions and psychiatric disorders are among the leading causes of illness, disability, and death among young people around the globe. In the United States, teen suicide has increased by about 30% in the last decade. Raising awareness of warning signs and promoting access to mental health resources can help reduce suicide rates for at-risk youth. However, death by suicide remains a taboo topic for public discourse and societal intervention. An unconventional approach to address taboo topics in society is the use of popular media.
Method
We conducted a quantitative content analysis of mainstream news reporting on the controversial Netflix series 13 Reasons Why Season 1. Using a combination of top-down and bottom-up search strategies, we arrived at a final sample of 97 articles published between March 31 and May 31, 2017, from 16 media outlets, comprising 3,150 sentences. We systematically examined the news framing in these articles in terms of content and valence, the salience of health/social issue related frames, and their compliance with the WHO guidelines.
Results
Nearly a third of the content directly addressed issues of our interest: 61.6% was about suicide and 38.4% was about depression, bullying, sexual assault, and other related health/social issues; it was more negative (42.8%) than positive (17.4%). The criticism focused on the risk of suicide contagion, glamorizing teen suicide, and the portrayal of parents and educators as indifferent and incompetent. The praise was about the show raising awareness of real and difficult issues young people struggle with in their everyday life and serving as a conversation starter to spur meaningful discussions. Our evaluation of WHO guideline compliance for reporting on suicide yielded mixed results. Although we found recommended practices across all major categories, they were minimal and could be improved.
Conclusion
Despite their good intentions and best efforts, the 13 Reasons Why production team missed several critical opportunities to be better prepared and more effective in creating social impact entertainment and fostering difficult dialogs. There is an urgent need to train news reporters about established health communication guidelines and promote best practices in media reporting on sensitive topics such as suicide.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Analysis of ‘FAANG- Complete Stock Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/aayushmishra1512/faang-complete-stock-data on 14 February 2022.
--- Dataset description provided by original source is as follows ---
There are a few companies that are considered to be revolutionary. These companies also happen to be a dream place to work for many people across the world. They include Facebook, Amazon, Apple, Netflix and Google, also known as FAANG! These companies make a ton of money, and they help others too by giving them a chance to invest in the companies via stocks and shares. This data was made targeting these stock prices.
The data contains information such as opening price of a stock, closing price, how much of these stocks were sold and many more things. There are 5 different CSV files in the data for each company.
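A minimal sketch of how such a file might be read and turned into daily returns, using only the standard library. The column names (`Date`, `Open`, `Close`, `Volume`) and the three sample rows are assumptions for illustration, not real quotes and not necessarily the headers used in the actual CSV files.

```python
import csv
import io

# Hypothetical excerpt; the column names and values are illustrative only.
SAMPLE_CSV = """Date,Open,Close,Volume
2020-01-02,100.0,102.0,1000
2020-01-03,102.0,99.96,1500
2020-01-06,100.0,104.958,1200
"""


def daily_returns(csv_text):
    """Return [(date, close-to-close return)] for every row after the first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    closes = [float(row["Close"]) for row in rows]
    return [
        (rows[i]["Date"], closes[i] / closes[i - 1] - 1.0)
        for i in range(1, len(rows))
    ]


if __name__ == "__main__":
    for date, ret in daily_returns(SAMPLE_CSV):
        print(f"{date}: {ret:+.2%}")
```

To use it on a real file, read the file contents into a string first, and adjust the column names to match that file's header row.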
--- Original source retains full ownership of the source dataset ---
https://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data
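Two of the technical indicators named above, moving averages and RSI, can be sketched in a few lines. This is an illustrative version only: the function names are hypothetical, and the RSI here uses plain averages of gains and losses over the last `period` price changes rather than Wilder's smoothed averages, so its values will differ from those of most charting tools.

```python
def simple_moving_average(prices, window):
    """Return the simple moving average for each full window of `prices`."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def rsi_simple(prices, period):
    """RSI over the last `period` changes, using unsmoothed sums (an assumption)."""
    if period <= 0 or len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        # No down moves: conventionally 100, or 50 if prices did not move at all.
        return 100.0 if gains > 0 else 50.0
    relative_strength = gains / losses
    return 100.0 - 100.0 / (1.0 + relative_strength)


if __name__ == "__main__":
    closes = [10, 11, 10, 11, 10]  # made-up closing prices
    print(simple_moving_average(closes, 3))
    print(rsi_simple(closes, 4))
```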
I love movies.
I tend to avoid Marvel-Transformers-standardized products, and prefer a mix of classic Hollywood golden-age and obscure Polish artsy movies. Throw in an occasional Japanese-zombie-slasher-giallo as an alibi. Good movies don't exist without bad movies.
On average I watch 200+ movies each year, with peaks at more than 500 movies. Nine years ago I started to log my movies to avoid watching the same movie twice, and also to assign scores. Over the years, it gave me a couple of insights on my viewing habits, but nothing more than what a tenth-grader would learn at school.
I've recently subscribed to Netflix and it pains me to see the global inefficiency of recommendation systems for people like me, who mostly swear by "la politique des auteurs". The term comes from the French New Wave critics around André Bazin's Cahiers du cinéma (it is usually credited to François Truffaut), and means that the quality of a movie is essentially linked to the director and the director's capacity to execute a vision with the crew. We could debate whether it depends on the movie production pipeline, but let's not for now. Practically, what it means is that I essentially watch movies from directors who made films I've liked.
I suspect Netflix calibrates its recommendation models on the way the "average joe" chooses a movie. A few months ago I read a study based on a survey, showing that people chose a movie mostly based on genre (55%), then on leading actors (45%). Director and release date were far behind, at around 10% each. It is not surprising, since most people I know don't care who the director is. Lots of US blockbusters don't even mention the director on the movie poster. I am aware that collaborative filtering is based on user proximity, which I believe decreases (or even eliminates) the need to characterize a movie. So here I'm more interested in content-based filtering, which is based on product proximity, for several reasons:
Users' tastes are not easily accessible. They are, after all, Netflix's treasure chest
The movie selection on Netflix is so poor for someone who likes auteur films that it wouldn't help
Modeling a movie's intrinsic qualities is a nice challenge
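To make "product proximity" concrete, here is a minimal sketch of content-based filtering: each movie is described by a set of features (director, genre, lead actor) and candidates are ranked by Jaccard similarity to a movie the user liked. The catalog, the feature encoding and the function names are all hypothetical, and Jaccard similarity is just one simple choice of proximity measure, not necessarily the one used in this project.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical catalog; titles and features are made up for illustration.
CATALOG = {
    "Film A": {"director:X", "genre:western", "actor:P"},
    "Film B": {"director:X", "genre:drama", "actor:Q"},
    "Film C": {"director:Y", "genre:western", "actor:P"},
    "Film D": {"director:Z", "genre:comedy", "actor:R"},
}


def recommend(liked_title, catalog, top_n=3):
    """Rank the other movies by feature similarity to `liked_title`."""
    liked = catalog[liked_title]
    scored = [
        (title, jaccard(liked, features))
        for title, features in catalog.items()
        if title != liked_title
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


if __name__ == "__main__":
    for title, score in recommend("Film A", CATALOG):
        print(title, score)
```

With equal weights, a shared genre and actor outrank a shared director here. A viewer who follows directors would want to weight the director feature more heavily, which this sketch does not attempt.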
Enough.
"*The secret of getting ahead is getting started*" (Mark Twain)
[Image: network graph] https://img11.hostingpics.net/pics/117765networkgraph.png
The primary source is www.themoviedb.org. If you watch obscure artsy Romanian homemade movies you may find only 95% of your movies referenced... but for anyone else it should be in the 98%+ range.
Movie details are from the www.themoviedb.org API: movies/details
Movie crew & casting are from the www.themoviedb.org API: movies/credits
Both can be joined by id
Together they cover about 350k movies, from the end of the 19th century up to August 2017. If you remove short movies from imdb you get a similar number of movies.
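Joining the two files by id can be sketched as follows. The rows and the field names other than `id` are invented for illustration; the real details and credits files have their own columns.

```python
# Hypothetical rows; only the shared "id" key is taken from the description above.
DETAILS = [
    {"id": 1, "title": "Film A", "release_year": 1950},
    {"id": 2, "title": "Film B", "release_year": 1975},
]
CREDITS = [
    {"id": 1, "director": "Director X"},
    {"id": 3, "director": "Director Z"},
]


def join_by_id(details, credits):
    """Inner join: keep only movies present in both tables, merging their fields."""
    credits_by_id = {row["id"]: row for row in credits}
    joined = []
    for detail in details:
        credit = credits_by_id.get(detail["id"])
        if credit is not None:
            joined.append({**detail, **credit})
    return joined


if __name__ == "__main__":
    for row in join_by_id(DETAILS, CREDITS):
        print(row)
```

Indexing one table by id first keeps the join linear in the number of rows, which matters at a few hundred thousand movies.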
I uploaded the program to retrieve incremental movie details on github : https://github.com/stephanerappeneau/scienceofmovies/tree/master/PycharmProjects/GetAllMovies (need a dev API key from themoviedb.org though)
I have tried various supervised (decision tree) / unsupervised (clustering, NLP) approaches described in the discussions, source code is on github : https://github.com/stephanerappeneau/scienceofmovies
As a bonus I've uploaded the bio summaries of the top 500 critically acclaimed directors from Wikipedia, for some interesting NLTK analysis
Here is an overview of the available sources that I've tried:
• Imdb.com free csv dumps (ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/temporaryaccess/) are badly documented, incomplete, loosely structured and impossible to join/merge. There's an API hosted by Amazon Web Services: 1€ per 100 000 requests. With around 1 million movies, it could become expensive, and the features are bare. So I've searched for other sources.
• www.themoviedb.org is based on crowdsourcing and has an excellent API, limited to 40 requests every 10 seconds. It is quite generous, well documented, and enough to sweep the 450 000 movies in a few days. For my purpose, data quality is not significantly worse than imdb, and as the imdb key is also included there's always the possibility to complete my dataset later (I actually did it)
• www.Boxofficemojo.com has some interesting budget/revenue figures (which are sorely lacking in both imdb & tmdb), but it actually tracks only a few thousand movies, mainly blockbusters. There are other professional sources that are used by the film industry to get better predictive / marketing insights, but that's beyond my reach for this experiment.
• www.wikipedia.com is an interesting source with no real cap on API calls. However, it requires a bit of web scraping, and for movies or directors the layout and quality vary a lot, so I suspected it would take a lot of work to get insights and gave this source a lower priority.
• www.google.com will ban you after a few minutes of web scraping because their job is to scrape data from others, then sell it, duh.
• It's worth mentioning that there are a few dumps of Netflix anonymized user tastes on kaggle, because they've organised a few competitions to improve their recommendation models. https://www.kaggle.com/netflix-inc/netflix-prize-data
• Online databases are largely white anglo-saxon centric, meaning the Bollywood offer (India is the 2nd biggest producer of movies) is mostly absent from datasets. I'm fine with that, as it's not my cup of tea, plus I lack domain knowledge. The sheer amount of Indian movies would probably skew my results anyway (I don't want to have too many martial-arts-musicals in my recommendations ;-)). I have, however, tremendous respect for the Indian movie industry, so I'd love to collaborate with an Indian cinephile!
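The "few days" estimate for the themoviedb.org sweep follows from the stated limit of 40 requests every 10 seconds. The sketch below redoes that arithmetic and shows one simple way to pace requests. It assumes two calls per movie (one for details, one for credits), which is my reading of the description rather than a stated figure, and the `Throttle` class is a hypothetical helper that does not call any real API.

```python
import time


def sweep_hours(n_items, calls_per_item, max_calls, window_seconds):
    """Hours needed to fetch everything when running at the rate limit."""
    total_calls = n_items * calls_per_item
    return total_calls * window_seconds / max_calls / 3600


class Throttle:
    """Space calls evenly so the long-run rate stays at max_calls per window."""

    def __init__(self, max_calls, window_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = window_seconds / max_calls
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request may be sent."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


if __name__ == "__main__":
    # 450 000 movies, 2 calls each (assumption), 40 calls per 10 seconds.
    hours = sweep_hours(450_000, 2, 40, 10)
    print(f"{hours} hours, about {hours / 24:.1f} days")
```

In a real crawler, `Throttle.wait()` would be called just before each HTTP request. The clock and sleep functions are injectable so the pacing logic can be tested without actually waiting.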
[Image: Westerns] https://img11.hostingpics.net/pics/340226westerns.png
Starting from there, I had multiple problem statements for both supervised / unsupervised machine learning
Can I program a tailored-recommendation system based on my own criteria ?
What are the characteristics of movies/directors I like the most ?
What is the probability that I will like my next movie ?
Can I find the data ?
One of the objectives of sharing my work here is to find cinephile data-scientists who might be interested and, hopefully, contribute or share insights :) Other interesting leads : use tagline for NLP/Clustering/Genre guessing, leverage on budget/revenue, link with other data sources using the imdb normalized title, etc.
[Image: Correlation matrix] https://img11.hostingpics.net/pics/977004matrice.png
I graduated from a French engineering school, majoring in artificial intelligence, but that was 17 years ago, right in the middle of the AI winter. Like a lot of white male rocket scientists, I ended up in one of the leading European investment banks, quickly abandoning IT development to specialize in trading/risk project management and internal politics. My recent appointment to the Data Office made me aware of recent breakthroughs in data science, and I thought that developing a side project would be an excellent occasion to learn something new. Plus it'd give me some much-needed credibility, which decision makers too often lack when it comes to data science.
I've worked on some of the features with Cédric Paternotte, a friend of mine who is a professor of philosophy of science at La Sorbonne. Working with someone with a different background seemed a good idea for motivation, creativity and rigor.
Kudos to the www.themoviedb.org and www.wikipedia.com sites, which really have a great attitude towards open data. This is typically NOT the case for modern big-data companies, which mostly keep data to themselves to try to monetize it. Such a huge contrast with the imdb or instagram APIs, which generously let you grab your last 3 comments at a miserable rate. Even if 15 years ago this seemed a mandatory path to get services for free, I predict that one day governments will need to break this data monopoly.
[Disclaimer : I apologize in advance for my engrish (I'm french ^-^), any bad-code I've written (there are probably hundreds of way to do it better and faster), any pseudo-scientific assumption I've made, I'm slowly getting back in statistics and lack senior guidance, one day I regress a non-stationary time series and the day after I'll discover I shouldn't have, and any incorrect use of machine-learning models]
[Image: powered by themoviedb.org] https://img11.hostingpics.net/pics/898068408x161poweredbyrectanglegreen.png