31 datasets found

TikTok global quarterly downloads 2018-2024

statista.com
de.statista.com

Cite

Statista Research Department, TikTok global quarterly downloads 2018-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Statista Research Department

Description

In the fourth quarter of 2024, TikTok generated around 186 million downloads from users worldwide. First launched in China by ByteDance as Douyin, the short-video format was popularized by TikTok and took over the global social media environment in 2020. In the first quarter of 2020, TikTok downloads peaked at over 313.5 million worldwide, up by 62.3 percent compared to the first quarter of 2019.
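As a quick sanity check on the year-over-year figure quoted above, the implied download count for the first quarter of 2019 can be backed out from the two numbers in the text. This is only a sketch: the variable names are illustrative, and the result is derived from the rounded figures in the description rather than read from the dataset itself.

```python
# Back out the implied Q1 2019 downloads from the figures quoted in the description.
q1_2020_downloads_m = 313.5  # millions, as quoted ("over 313.5 million")
yoy_growth_pct = 62.3        # percent increase versus Q1 2019, as quoted

# growth = (new - old) / old, so old = new / (1 + growth)
implied_q1_2019_m = q1_2020_downloads_m / (1 + yoy_growth_pct / 100)

print(f"Implied Q1 2019 downloads: about {implied_q1_2019_m:.1f} million")
```

Because both inputs are rounded, the implied value should be read as approximate.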

TikTok interactions: is there a magic formula for content success?

In 2024, TikTok registered an engagement rate of approximately 4.64 percent on video content hosted on its platform. During the same year, the social video app recorded over 1,100 interactions on average. These interactions consisted primarily of likes, with fewer than 20 comments per piece of content on average in 2024.
The platform has been actively monitoring the issue of fake interactions, removing around 236 million fake likes during the first quarter of 2024. Though there is no secret formula for maximizing these metrics, following the recommended video length may contribute to the success of content on TikTok.
As of the first quarter of 2024, the recommended video length for tiny TikTok accounts with up to 500 followers was around 2.6 minutes, while the ideal video duration for huge TikTok accounts with over 50,000 followers was 7.28 minutes. The average length of TikTok videos posted by creators in 2024 was around 43 seconds.

What’s trending on TikTok Shop?

Since its launch in September 2023, TikTok Shop has become one of the most popular online shopping platforms, offering consumers a wide variety of products. In 2023, TikTok shops featuring beauty and personal care items sold over 370 million products worldwide.
TikTok shops featuring womenswear and underwear, as well as food and beverages, followed with 285 and 138 million products sold, respectively. Similarly, in the United States market, health and beauty products were the best-selling items, accounting for 85 percent of sales made via the TikTok Shop feature during the first month of its launch. In 2023, Indonesia was the market with the largest number of TikTok Shops, hosting over 20 percent of all TikTok Shops. Thailand and Vietnam followed with 18.29 and 17.54 percent of the total shops listed on the short-video platform, respectively.

Refugee Admission to the US Ending FY 2018
data.world
csv, zip
Updated Nov 20, 2022
Cite
The Associated Press (2022). Refugee Admission to the US Ending FY 2018 [Dataset]. https://data.world/associatedpress/refugee-admissions-to-us-end-fy-2018
Available download formats: zip, csv
Dataset updated
Nov 20, 2022
Authors
The Associated Press
Time period covered
2009 - 2018
Area covered
United States
Description
Overview

At the end of the 2018 fiscal year, the U.S. had resettled 22,491 refugees -- a small fraction of the number of people who had entered in prior years. This is the smallest annual number of refugees since Congress passed a law in 1980 creating the modern resettlement system.

It's also well below the cap of 45,000 set by the administration for 2018, and less than thirty percent of the number granted entry in the final year of Barack Obama’s presidency. It's also significantly below the cap for 2019 announced by President Trump's administration, which is 30,000.

The Associated Press is updating its data on refugees through fiscal year 2018, which ended Sept. 30, to help reporters continue coverage of this story. Previous Associated Press data on refugees can be found here.

Data obtained from the State Department's Bureau of Population, Refugees and Migration show the mix of refugees also has changed substantially:

The numbers of Iraqi, Somali and Syrian refugees -- who made up more than a third of all resettlements in the U.S. in the prior five years -- have almost entirely disappeared. Refugees from those three countries comprise about two percent of the 2018 resettlements.

In 2018, Christians have made up more than sixty percent of the refugee population, while the share of Muslims has dropped from roughly 45 percent of refugees in fiscal year 2016 to about 15 percent. (This data is not available at the city or state level.)

Of the states that usually average at least 100 resettlements, Maine, Louisiana, Michigan, Florida, California, Oklahoma and Texas have seen the largest percentage decreases in refugees. All have had their refugee caseloads drop more than 75% when comparing 2018 to the average over the previous five years (2013-2017).

The past fiscal year marks a dramatic change in the refugee program, with only a fraction as many people entering. That affects refugees currently in the U.S., who may be waiting on relatives to arrive. It affects refugees in other countries, hoping to get to the United States for safety or other reasons. And it affects the organizations that work to house and resettle these refugees, who only a few years ago were dealing with record numbers of people. Several agencies have already closed their doors; others have laid off workers and cut back their programs.

Because there are wide geographic variations in resettlement depending on refugees' country of origin, some U.S. cities have been more affected by this than others. For instance, in past years, Iraqis have resettled most often in San Diego, Calif., or Houston. Now, with only a handful of Iraqis being admitted in 2018, those cities have seen some of the biggest drop-offs in resettlement numbers.

About This Data

Datasheets include:

Annual_refugee_data: This provides the rawest form of the data from Oct. 1, 2008 – Sept. 30, 2018, where each record is a combination of fiscal year, destination city and state, and origin. It also provides annual totals for the state.

City_refugees: This provides data grouped by city for refugee arrivals to a specific city and state and from a specific origin, showing totals for each year next to each other in different columns, so you can quickly see trends over time. Data is from Oct. 1, 2008 – Sept. 30, 2018, grouped by fiscal year. It also compares 2018 numbers to a five-year average from 2013-2017.

City_refugees_and_foreign_born_proportions: This provides the data in City_refugees along with data that gives context to the origins of the foreign born populations living in each city. There are regional columns, sub-regional columns and a column specific to the origin listed in the refugee data. Data is from the American Community Survey 5-year 2013-2017 Table B05006: PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION.

Caveats

According to the State Department: "This data tracks the movement of refugees from various countries around the world to the U.S. for resettlement under the U.S. Refugee Admissions Program." The data does not include other types of immigration or visits to the U.S.

The data tracks the refugees' stated destination in the United States. In many cases, this is where the refugees first lived, although many may have since moved.

Be aware that some cities with particularly high totals may be the locations of refugee resettlement programs -- for instance, Glendale, Calif., is home to both Catholic Charities of Los Angeles and the International Rescue Committee of Los Angeles, which work at resettling refugees.

About Refugee Resettlement

The data for refugees from other countries - or for any particular timeframe since 2002 - can be accessed through the State Department's Refugee Processing Center's site by clicking on "Arrivals by Destination and Nationality."

The Refugee Processing Center used to publish a state-by-state list of affiliate refugee organizations -- the groups that help refugees settle in the U.S. That list was last updated in January 2017, so it may now be out of date. It can be found here.

For general information about the U.S. refugee resettlement program, see this State Department description. For more detailed information about the program and proposed 2018 caps and changes, see the FY 2018 Report to Congress.

Queries

The Associated Press has set up a number of pre-written queries to help you filter this data and find local stories. Queries can be accessed by clicking on their names in the upper right hand bar.

Find Cities Impacted - Most Change -- Use this query to see the cities that have seen the largest drop-offs in refugee resettlements. Creates a five-year average of how many refugees of a certain origin have come in the past, and then measures 2018 by that. Be wary of small raw numbers when considering the percentages!

Total Refugees for Each City in Your State -- Use this query to get the number of total refugees who've resettled in your state's cities by year.

Total Refugees in Your State -- Use this query to get the number of total refugees who've resettled in your state by year.

Changes in Origin over Time -- Use this query to track how many refugees are coming from each origin by year. The initial query provides national numbers, but can be filtered for state or even for city.

Extract Raw Data for Your State -- Use this query to type in your state name to extract and download just the data in your state. This is the raw data from the State Department, so it may be slightly more difficult to see changes over time.

Contact AP Data Journalist Michelle Minkoff with questions, mminkoff@ap.org
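The comparison behind the "Find Cities Impacted - Most Change" query described above (2018 arrivals measured against the 2013-2017 average) could be sketched as follows. This is an illustration only, not the AP's actual query: the function name, the dictionary layout and the arrival counts are all made up for the example.

```python
from statistics import mean

# Hypothetical yearly arrival counts for one city/origin pair (made-up numbers).
arrivals_by_year = {2013: 120, 2014: 150, 2015: 130, 2016: 160, 2017: 90, 2018: 26}


def pct_change_vs_baseline(counts, target_year=2018, baseline_years=range(2013, 2018)):
    """Percent change of target_year relative to the mean over baseline_years."""
    baseline = mean(counts.get(year, 0) for year in baseline_years)
    if baseline == 0:
        return None  # no baseline arrivals, so a percentage is undefined
    return (counts.get(target_year, 0) - baseline) / baseline * 100


change = pct_change_vs_baseline(arrivals_by_year)
print(f"2018 vs 2013-2017 average: {change:.1f}%")
```

As the query description warns, a large percentage drop computed from a small baseline (a handful of arrivals per year) says little, so it is worth inspecting the raw counts next to the percentage.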
Urban Agglomeration Populations: 1950-2035
hub.arcgis.com
gis-for-secondary-schools-schools-be.hub.arcgis.com
Updated May 30, 2018
Cite
ArcGIS StoryMaps (2018). Urban Agglomeration Populations: 1950-2035 [Dataset]. https://hub.arcgis.com/datasets/4f1518f13f8d461fae54106692b54ea4
Dataset updated
May 30, 2018
Dataset authored and provided by
ArcGIS StoryMaps
Description
Cities ranking and mega cities

Tokyo is the world’s largest city with an agglomeration of 37 million inhabitants, followed by New Delhi with 29 million, Shanghai with 26 million, and Mexico City and São Paulo, each with around 22 million inhabitants. Today, Cairo, Mumbai, Beijing and Dhaka all have close to 20 million inhabitants. By 2020, Tokyo’s population is projected to begin to decline, while Delhi is projected to continue growing and to become the most populous city in the world around 2028.

By 2030, the world is projected to have 43 megacities with more than 10 million inhabitants, most of them in developing regions. However, some of the fastest-growing urban agglomerations are cities with fewer than 1 million inhabitants, many of them located in Asia and Africa. While one in eight people live in 33 megacities worldwide, close to half of the world’s urban dwellers reside in much smaller settlements with fewer than 500,000 inhabitants.

About the data

The 2018 Revision of the World Urbanization Prospects is published by the Population Division of the United Nations Department of Economic and Social Affairs (UN DESA). It has been issued regularly since 1988 with revised estimates and projections of the urban and rural populations for all countries of the world, and of their major urban agglomerations. The data set and related materials are available at: https://esa.un.org/unpd/wup/

Countries with the most Facebook users 2024

statista.com
de.statista.com

Cite

Stacy Jo Dixon, Countries with the most Facebook users 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

Which country has the most Facebook users?

There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country, it would rank third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: the United States, Indonesia, and Brazil, with 193.8 million, 119.05 million, and 112.55 million Facebook users, respectively.

Facebook – the most used social media

Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.

Facebook usage by device

As of July 2021, 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browsers; the company has also published several mobile apps through which users can access its products and services. As of the third quarter of 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.

LSIB 2017: Large Scale International Boundary Polygons, Simplified
developers.google.com
Updated Mar 30, 2017
+ more versions
Cite
United States Department of State, Office of the Geographer (2017). LSIB 2017: Large Scale International Boundary Polygons, Simplified [Dataset]. https://developers.google.com/earth-engine/datasets/catalog/USDOS_LSIB_SIMPLE_2017
Dataset updated
Mar 30, 2017
Dataset provided by
United States Department of State, Office of the Geographer
Time period covered
Mar 30, 2017
Area covered
Earth
Description
The United States Office of the Geographer provides the Large Scale International Boundary (LSIB) dataset. The detailed version (2013) is derived from two other datasets: a LSIB line vector file and the World Vector Shorelines (WVS) from the National Geospatial-Intelligence Agency (NGA). The interior boundaries reflect U.S. government policies on boundaries, boundary disputes, and sovereignty. The exterior boundaries are derived from the WVS; however, the WVS coastline data is outdated and generally shifted from between several hundred meters to over a kilometer. Each feature is the polygonal area enclosed by interior boundaries and exterior coastlines where applicable, and many countries consist of multiple features, one per disjoint region. Compared with the detailed LSIB, in this simplified dataset some disjointed regions of each country have been reduced to a single feature. Furthermore, it excludes medium and smaller islands. The resulting simplified boundary lines are rarely shifted by more than 100 meters from the detailed LSIB lines. Each of the 312 features is a part of the geometry of one of the 284 countries described in this dataset.
Soccer match event dataset
kaggle.com
zip
Updated May 31, 2022
Cite
Alejandro Espinosa (2022). Soccer match event dataset [Dataset]. https://www.kaggle.com/datasets/aleespinosa/soccer-match-event-dataset/code
Available download formats: zip (520002019 bytes)
Dataset updated
May 31, 2022
Authors
Alejandro Espinosa
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

THIS IS NOT MY DATASET. All credit goes to Luca Pappalardo and Emanuele Massucco, who shared their open data through this figshare website. Pappalardo, Luca; Massucco, Emanuele (2019): Soccer match event dataset. figshare. Collection. https://doi.org/10.6084/m9.figshare.c.4415000.v5
I am including this dataset in Kaggle because I am playing with it and I don't want to add the CSV files every time I use it.

Content

From the paper:

The data refer to season 2017/2018 of five national soccer competitions in Europe: Spanish first division, Italian first division, English first division, German first division, French first division. These competitions are the most important in Europe according to the UEFA country coefficient, which is used to rank the football associations of Europe and thus determine the number of clubs from an association that will participate in the UEFA Champions League and the UEFA Europa League (https://www.uefa.com/memberassociations/uefarankings/country/#/yr/2019). In addition, we provide the data of the World cup 2018 and the European cup 2016, which are competitions for national teams. In total, we provide seven data sets corresponding to information about all competitions, matches, teams, players, events, referees and coaches

From the source:

Soccer analytics is attracting an increasing interest of academia and industry, thanks to the availability of sensing technologies that provide high-fidelity data streams extracted from every match. Unfortunately, these detailed data are owned by specialized companies and hence are rarely publicly available for scientific research. To fill this gap, we provide to the public the largest open collection of soccer-logs ever released, collected by Wyscout (https://wyscout.com/) containing all the spatio-temporal events (passes, shots, fouls, etc.) that occur during all matches of an entire season of seven competitions (La Liga, Serie A, Bundesliga, Premier League, Ligue 1, FIFA World Cup 2018, UEFA Euro Cup 2016). A match event contains information about its position, time, outcome, player and characteristics. This dataset has been used recently during the Soccer Data Challenge (https://sobigdata-soccerchallenge.it/) and, to the best of our knowledge, it is the largest public collection of soccer-logs.

Acknowledgements

Again, all credit goes to Luca Pappalardo and Emanuele Massucco, who shared their open data through this figshare website.

Inspiration

Personally, I am a football lover and I want to merge my data science knowledge with football analytics to find new interesting patterns.

Extra Datasets

There are four extra CSV files, games.csv, actions.csv, features.csv and labels.csv, that include all the events in this dataset in the SPADL representation, produced with the socceraction package. These files were created using this notebook.
GDP by Country in AFRICA
tradingeconomics.com
csv, excel, json, xml
Updated May 27, 2017
+ more versions
Cite
TRADING ECONOMICS (2017). GDP by Country in AFRICA [Dataset]. https://tradingeconomics.com/country-list/gdp?continent=africa
Available download formats: xml, json, csv, excel
Dataset updated
May 27, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
Africa
Description
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
United States Balance of Trade
tradingeconomics.com
fr.tradingeconomics.com
+13 more
csv, excel, json, xml
Updated Nov 19, 2025
Cite
TRADING ECONOMICS (2025). United States Balance of Trade [Dataset]. https://tradingeconomics.com/united-states/balance-of-trade
Available download formats: json, excel, xml, csv
Dataset updated
Nov 19, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 31, 1950 - Aug 31, 2025
Area covered
United States
Description
The United States recorded a trade deficit of 59.55 USD Billion in August of 2025. This dataset provides the latest reported value for - United States Balance of Trade - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Facebook: countries with the highest Facebook reach 2024
statista.com
de.statista.com
Cite
Stacy Jo Dixon, Facebook: countries with the highest Facebook reach 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
As of April 2024, Facebook had an addressable ad audience reach of 131.1 percent in Libya, followed by the United Arab Emirates with 120.5 percent and Mongolia with 116 percent. Additionally, the Philippines and Qatar had addressable ad audience reaches of 114.5 percent and 111.7 percent, respectively.
Kenya - Service Delivery Indicators Health Survey 2018 - Harmonized Public...
datacatalog.worldbank.org
html
Updated Apr 2, 2021
+ more versions
Cite
Jane Chuma, The World Bank (2021). Kenya - Service Delivery Indicators Health Survey 2018 - Harmonized Public Use Data [Dataset]. https://datacatalog.worldbank.org/search/dataset/0048631/kenya-service-delivery-indicators-health-survey-2018-harmonized-public-use-data
Available download formats: html
Dataset updated
Apr 2, 2021
Dataset provided by
World Bank Group (http://www.worldbank.org/)
License
https://datacatalog.worldbank.org/public-licenses?fragment=research
Area covered
Kenya
Description
The Service Delivery Indicators (SDI) are a set of health and education indicators that examine the effort and ability of staff and the availability of key inputs and resources that contribute to a functioning school or health facility. The indicators are standardized, allowing comparison between and within countries over time.

The Health SDIs include healthcare provider effort, knowledge and ability, and the availability of key inputs (for example, basic equipment, medicines and infrastructure, such as toilets and electricity). The indicators provide a snapshot of the health facility and assess the availability of key resources for providing high quality care.

The Kenya SDI Health survey team visited a sample of 3,098 health facilities across Kenya between March and July 2018. The 2018 Kenya SDI is the largest to date. The survey team collected rosters covering 24,098 workers for absenteeism and assessed 4,499 health workers for competence using patient case simulation.
Instagram: countries with the highest audience reach 2024
statista.com
de.statista.com
Cite
Stacy Jo Dixon, Instagram: countries with the highest audience reach 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
As of April 2024, Bahrain was the country with the highest Instagram audience reach with 95.6 percent. Kazakhstan also had a high Instagram audience penetration rate, with 90.8 percent of the population using the social network. In the United Arab Emirates, Turkey, and Brunei, the photo-sharing platform was used by more than 85 percent of each country's population.
List of newspapers in India
kaggle.com
zip
Updated Apr 5, 2024
Cite
Saquib Hussain (2024). List of newspapers in India [Dataset]. https://www.kaggle.com/datasets/saquib7hussain/list-of-newspapers-in-india/discussion
Available download formats: zip (5229 bytes)
Dataset updated
Apr 5, 2024
Authors
Saquib Hussain
Area covered
India
Description
As of 31 March 2018, there were over 100,000 publications registered with the Registrar of Newspapers for India. India has the second-largest newspaper market in the world, with daily newspapers reporting a combined circulation of over 240 million copies as of 2018. There are publications produced in each of the 22 scheduled languages of India and in many of the other languages spoken throughout the country. Hindi-language newspapers have the largest circulation, followed by English and Telugu. Newsstand and subscription prices often cover only a small percentage of the cost to produce newspapers in India, and advertising is the primary source of revenue.

Social media as a news outlet worldwide 2024

statista.com
de.statista.com

+ more versions

Cite

Amy Watson, Social media as a news outlet worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Amy Watson

Description

During a 2024 survey, 77 percent of respondents from Nigeria stated that they used social media as a source of news. In comparison, just 23 percent of Japanese respondents said the same. Large portions of social media users around the world admit that they do not trust social platforms either as media sources or as a way to get news, and yet they continue to access such networks on a daily basis.

Social media: trust and consumption

Despite the majority of adults surveyed in each country reporting that they used social networks to keep up to date with news and current affairs, a 2018 study showed that social media is the least trusted news source in the world. Less than 35 percent of adults in Europe considered social networks to be trustworthy in this respect, yet more than 50 percent of adults in Portugal, Poland, Romania, Hungary, Bulgaria, Slovakia and Croatia said that they got their news on social media.

What is clear is that we live in an era where social media is such an enormous part of daily life that consumers will still use it in spite of their doubts or reservations. Concerns about fake news and propaganda on social media have not stopped billions of users accessing their favorite networks on a daily basis.
Most Millennials in the United States use social media for news every day, and younger consumers in European countries are much more likely to use social networks for national political news than their older peers.
Like it or not, reading news on social media is fast becoming the norm for younger generations, and this form of news consumption will likely increase further regardless of whether consumers fully trust their chosen network or not.

Data_Sheet_1_Aquaculture Impacts on China’s Marine Wild Fisheries Over the...
datasetcatalog.nlm.nih.gov
Updated Jul 26, 2021
Cite
Zhu, Konghao; Xu, Jun; Wang, Kang; Xu, Congjun; Xie, Jiayi; Zhao, Kangshun; Zhang, Min (2021). Data_Sheet_1_Aquaculture Impacts on China’s Marine Wild Fisheries Over the Past 30 Years.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000759628
Dataset updated
Jul 26, 2021
Authors
Zhu, Konghao; Xu, Jun; Wang, Kang; Xu, Congjun; Xie, Jiayi; Zhao, Kangshun; Zhang, Min
Area covered
China
Description
China is the world’s largest producer of aquaculture and capture fisheries. How this country develops its aquaculture sector and whether such development can relieve pressure on wild fisheries remain a contentious issue in the past and for the future. This study aims to provide a broad assessment on the impact of aquaculture development in different periods on marine wild fisheries on the basis of aquaculture and marine wild fish catch data from all the coastal provinces of China. China’s aquaculture and capture fisheries have undergone substantial changes. From 1989 to 2002, China’s aquaculture, especially mariculture, had a strong relationship with marine wild fisheries. However, from 2003 to 2018, the impact of mariculture was weakened, whereas that of freshwater aquaculture had increased. Although aquaculture still puts pressure on marine wild fisheries, China’s aquaculture is currently moving toward sustainable development pattern with low input and high output. These results provide the first statistical evidence on the effects of aquaculture development on marine wild fisheries and contribute to the sustainable management of China’s aquaculture and marine capture fisheries.
VGChartz (Games Dataset)
kaggle.com
zip
Updated Jan 23, 2024
Cite
Simon Garanin (2024). VGChartz (Games Dataset) [Dataset]. https://www.kaggle.com/datasets/gsimonx37/vgchartz/data
Available download formats: zip (1351159 bytes)
Dataset updated
Jan 23, 2024
Authors
Simon Garanin
License
https://www.gnu.org/licenses/gpl-3.0.html
Description
Data obtained using a program from the site vgchartz.com.

"Founded in 2005 by Brett Walton, VGChartz (Video Game Charts) is a business intelligence and research firm and publisher of the VGChartz.com websites. As an industry research firm, VGChartz publishes video game hardware estimates every week and hosts an ever-expanding game database with over 55,000 titles listed, featuring up-to-date shipment information and legacy sales data. The VGChartz.com website provides consumers with a range of content from news and sales features, to reviews and articles, to social networking and a community forum." - from the site vgchartz.com.

"Since the end of 2018 VGChartz no longer produces estimates for software sales. This is because the high digital market share for software was making it both more difficult to produce reliable retail estimates and also making those estimates increasingly unrepresentative of the wider performance of the games in question. As a result, on the software front we now only record official shipment/sales data, where such data is made available by developers and publishers. The legacy data remains on the site for those who are interested in browsing through it." - from the site vgchartz.com.

What can you do with the data set?

If you are new to data analytics, try answering the following questions:
- in what year did the active growth in the number of video games produced begin? What year was the most successful from this point of view? What can you conclude if you look at the number of video games released by country?
- on what day and month were the largest number of video games released? What could be the reason for this pattern?
- is there a dependence of the number of copies sold on the ratings of critics or users?
- which gaming platforms, publishers and developers are the most common (the largest number of video games have been released over time)?
- which gaming platforms, publishers and developers have the largest number of video game copies sold (over all time, the total number of copies sold was the largest)?

If you have enough experience, try solving a regression problem. Train a model that can predict the number of copies sold of video games:
- Which features can be used without leaking the target variable?
- How do outliers affect the quality of the model?
- Which metric should be chosen to evaluate the model?
- Can adding new data improve the predictive ability of the model?
- Do the residuals of the trained model show signs of heteroscedasticity? How does this affect the predictive ability of the model? What can you do about it?

Field descriptions:

The data contains the following fields:
1. name - name of the video game.
2. date - release date of the video game.
3. platform - gaming platform (All - all gaming platforms, Series - all video game series).
4. publisher - publisher.
5. developers - developer.
6. shipped - the number of copies sent (relevant for records with the values All and Series in the platform field).
7. total - total number of copies sold (millions of copies).
8. america - number of copies sold in America (millions of copies).
9. europe - number of copies sold in Europe (millions of copies).
10. japan - number of copies sold in Japan (millions of copies).
11. other - other sales in the world.
12. vgc - rating VGChartz.com.
13. critic - critics' assessment.
14. user - user rating.
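As a starting point for the platform questions above, here is a minimal sketch that sums the total field per platform. It assumes the data is available as CSV text with the documented column names, and it assumes that rows whose platform is All or Series are roll-ups that would double-count sales if mixed with per-platform rows. The sample rows and the helper name total_sales_by_platform are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that only mimic the documented field names (not real data).
SAMPLE_CSV = """name,date,platform,total
Game A,2010-01-01,PS3,1.5
Game A,2010-01-01,X360,2.0
Game A,2010-01-01,All,3.5
Series X,2005-06-01,Series,10.0
Game B,2012-03-15,PS3,0.5
"""

# Assumption: these platform values mark aggregate rows, not real platforms.
AGGREGATE_PLATFORMS = {"All", "Series"}


def total_sales_by_platform(csv_text):
    """Sum the `total` column (millions of copies) per platform."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["platform"] in AGGREGATE_PLATFORMS:
            continue  # skip roll-up rows to avoid double counting
        if not row["total"]:
            continue  # skip rows with no sales figure
        totals[row["platform"]] += float(row["total"])
    return dict(totals)


platform_totals = total_sales_by_platform(SAMPLE_CSV)
print(platform_totals)
```

With real data, the same loop could be pointed at the downloaded file instead of the in-memory sample.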

Found an error or inaccuracy in the data?

This dataset is the result of painstaking work. After collection and systematization, the data is checked for integrity and correctness. If you notice an error or inaccuracy in the data, or have a suggestion on how to improve the data set, please let me know.

You can see examples of working with the data in my GitHub repository.
Life Expectancy vs GDP, 1950-2018
kaggle.com
zip
Updated Jan 14, 2022
Cite
Luxolo Shilo Funde (2022). Life Expectancy vs GDP, 1950-2018 [Dataset]. https://www.kaggle.com/datasets/luxoloshilofunde/life-expectancy-vs-gdp-19502018
Available download formats: zip (215822 bytes)
Dataset updated
Jan 14, 2022
Authors
Luxolo Shilo Funde
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

Life expectancy at birth is defined as the average number of years that a newborn could expect to live if he or she were to pass through life subject to the age-specific mortality rates of a given period. The years are from 1950 to 2018.

Content

For regional- and global-level data pre-1950, data from a study by Riley was used, which draws from over 700 sources to estimate life expectancy at birth from 1800 to 2001.

Riley estimated life expectancy before 1800, which he calls "the pre-health transition period". "Health transitions began in different countries in different periods, as early as the 1770s in Denmark and as late as the 1970s in some countries of sub-Saharan Africa". As such, for the sake of consistency, we have assigned the period before the health transition to the year 1770. "The life expectancy values employed are averages of estimates for the period before the beginning of the transitions for countries within that region. ... This period has presumably the weakest basis, the largest margin of error, and the simplest method of deriving an estimate."

For country-level data pre-1950, Clio Infra's dataset was used, compiled by Zijdeman and Ribeira da Silva (2015).

For country-, regional- and global-level data post-1950, data published by the United Nations Population Division was used, since it is updated every year. This is possible because Riley writes that "for 1950-2001, I have drawn life expectancy estimates chiefly from various sources provided by the United Nations, the World Bank’s World Development Indicators, and the Human Mortality Database".

For the Americas from 1950-2015, the population-weighted average of Northern America and Latin America and the Caribbean was taken, using UN Population Division estimates of population size.
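The population-weighted average described above can be sketched as follows. The function name and the input numbers are hypothetical and serve only to show the arithmetic; they are not UN Population Division estimates.

```python
def population_weighted_mean(values, populations):
    """Average `values`, weighting each one by the matching population."""
    if len(values) != len(populations):
        raise ValueError("values and populations must have the same length")
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(v * p for v, p in zip(values, populations))
    return weighted_sum / total_population


# Hypothetical life expectancies (years) and populations (arbitrary units)
# for two sub-regions; replace with real estimates before drawing conclusions.
americas = population_weighted_mean([80.0, 70.0], [1.0, 3.0])
print(americas)  # 72.5
```

The larger sub-region pulls the combined figure toward its own value, which is why the result sits closer to 70 than to 80.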

Acknowledgements

Life expectancy:

Data publisher's source: https://www.lifetable.de/RileyBib.pdf
Data published by: James C. Riley (2005), "Estimates of Regional and Global Life Expectancy, 1800–2001", Population and Development Review, Volume 31, Issue 3, pages 537–543, September 2005; Zijdeman, Richard; Ribeira da Silva, Filipa, 2015, "Life Expectancy at Birth (Total)", http://hdl.handle.net/10622/LKYT53, IISH Dataverse, V1; and UN Population Division (2019).
Link: https://datasets.socialhistory.org/dataset.xhtml?persistentId=hdl:10622/LKYT53, http://onlinelibrary.wiley.com/doi/10.1111/j.1728-4457.2005.00083.x/epdf, https://population.un.org/wpp/Download/Standard/Population/
Dataset: https://ourworldindata.org/life-expectancy

GDP per capita:

Data publisher's source: The Maddison Project Database is based on the work of many researchers that have produced estimates of economic growth for individual countries.
Data published by: Bolt, Jutta and Jan Luiten van Zanden (2020), “Maddison style estimates of the evolution of the world economy. A new 2020 update”.
Link: https://www.rug.nl/ggdc/historicaldevelopment/maddison/releases/maddison-project-database-2020
Dataset: https://ourworldindata.org/life-expectancy

Inspiration

The life expectancy vs GDP per capita analysis.
Enterprise Survey 2009-2019, Panel Data - Slovenia
microdata.worldbank.org
catalog.ihsn.org
Updated Aug 6, 2020
+ more versions
Cite
World Bank Group (WBG) (2020). Enterprise Survey 2009-2019, Panel Data - Slovenia [Dataset]. https://microdata.worldbank.org/index.php/catalog/3762
Dataset updated
Aug 6, 2020
Dataset provided by
European Bank for Reconstruction and Development (http://ebrd.com/)
European Investment Bank (http://eib.org/)
World Bank Group (http://www.worldbank.org/)
Time period covered
2008 - 2019
Area covered
Slovenia
Description
Abstract

The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.

The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.

As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.

Geographic coverage

National

Analysis unit

The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.

Universe

As is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).

Kind of data

Sample survey data [ssd]

Sampling procedure

The samples for the Slovenia ES 2009, 2013, and 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.

Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.

For the Slovenia 2009 ES, industry stratification was designed as follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries, sample sizes were inflated by about 17% to account for potential non-response when requesting sensitive financial data and for likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals), sample sizes were inflated by about 12% to account for undersampling of firms in service industries.

For Slovenia 2013 ES, industry stratification was designed as follows: the universe was stratified into one manufacturing industry and two service industries (retail and other services).

Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).

For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
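The size strata above translate directly into a small classification rule. The sketch below is only an illustration: the function name is hypothetical, and returning None for establishments with fewer than 5 permanent full-time workers is an assumption, since the text defines no stratum for them.

```python
def size_stratum(permanent_full_time_workers):
    """Map a count of permanent full-time workers to an ES size stratum."""
    n = permanent_full_time_workers
    if n < 5:
        return None  # assumption: below the smallest stratum described
    if n <= 19:
        return "small"
    if n <= 99:
        return "medium"
    return "large"  # more than 99, i.e. 100 or more


for workers in (4, 5, 19, 20, 99, 100):
    print(workers, size_stratum(workers))
```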

For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.

For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.

Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).

Mode of data collection

Computer Assisted Personal Interview [capi]

Research instrument

Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing-specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail-specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.

Response rate

Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.

Item non-response was addressed by two strategies:
a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to record the refusal to respond as (-8).
b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
However, there were clear cases of low response.
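Because refusals are stored as the numeric code -8, they need to be separated from genuine answers before any averages are computed. The following is a minimal sketch under that assumption; the function name and the sample responses are hypothetical, and other special codes that the survey may use are not handled here.

```python
REFUSAL_CODE = -8  # refusal code described in the text above


def split_refusals(responses):
    """Return (valid_responses, refusal_rate) for one survey item."""
    valid = [r for r in responses if r != REFUSAL_CODE]
    refusals = len(responses) - len(valid)
    refusal_rate = refusals / len(responses) if responses else 0.0
    return valid, refusal_rate


# Hypothetical answers to a single sensitive question.
valid_answers, refusal_rate = split_refusals([1, 2, -8, 4])
print(valid_answers, refusal_rate)  # [1, 2, 4] 0.25
```

Leaving the -8 values in place would pull numeric summaries downward, so filtering them out first is the safer default.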

For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.

For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates for Slovenia may be selection bias and not frame inaccuracy.

For 2013, the share of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 44%.

Finally, for 2019, the share of interviews per contacted establishment was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
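The 2009 figure is reported as contacts per interview, while the 2013 and 2019 figures are reported as interviews per contact, so the three waves are easier to compare after converting to a common unit. The sketch below assumes the two measures are simple reciprocals of each other; the function names are hypothetical.

```python
def interviews_per_contact(contacts_per_interview):
    """Convert contacts per realized interview into an interview share."""
    return 1.0 / contacts_per_interview


def contacts_per_interview(interview_share):
    """Convert an interview share (0-1) into contacts per realized interview."""
    return 1.0 / interview_share


share_2009 = interviews_per_contact(6.18)   # reported: 6.18 contacts per interview
contacts_2013 = contacts_per_interview(0.25)   # reported: 25% interviews per contact
contacts_2019 = contacts_per_interview(0.097)  # reported: 9.7% interviews per contact

print(round(share_2009 * 100, 1))  # 16.2
print(contacts_2013)               # 4.0
print(round(contacts_2019, 1))     # 10.3
```

Under that assumption, roughly 16.2% of contacts led to an interview in 2009, against 25% in 2013 and 9.7% in 2019.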
Tanzania Tourism Classification Challenge
kaggle.com
zip
Updated Jun 1, 2022
Cite
Tevin Temu (2022). Tanzania Tourism Classification Challenge [Dataset]. https://www.kaggle.com/datasets/tevintemu/tanzania-tourism-classification-challenge
Available download formats: zip (527132 bytes)
Dataset updated
Jun 1, 2022
Authors
Tevin Temu
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Tanzania
Description
This challenge is open to users from English speaking African countries.

The Tanzanian tourism sector plays a significant role in the Tanzanian economy, contributing about 17% to the country’s GDP and 25% of all foreign exchange revenues. The sector, which provides direct employment for more than 600,000 people and up to 2 million people indirectly, generated approximately $2.4 billion in 2018 according to government statistics. Tanzania received a record 1.1 million international visitor arrivals in 2014, mostly from Europe, the US and Africa.

Tanzania is the only country in the world which has allocated more than 25% of its total area for wildlife, national parks, and protected areas. There are 16 national parks in Tanzania, 28 game reserves, 44 game-controlled areas, two marine parks and one conservation area.

Tanzania’s tourist attractions include the Serengeti plains, which host the largest terrestrial mammal migration in the world; the Ngorongoro Crater, the world’s largest intact volcanic caldera and home to the highest density of big game in Africa; Kilimanjaro, Africa’s highest mountain; and the Mafia Island marine park; among many others. The scenery, topography, rich culture and very friendly people provide for excellent cultural tourism, beach holidays, honeymooning, game hunting, historical and archaeological ventures, and certainly the best wildlife photography safaris in the world.

The objective of this hackathon is to develop a machine learning model that can classify the range of expenditures a tourist spends in Tanzania. The model can be used by different tour operators and the Tanzania Tourism Board to automatically help tourists across the world estimate their expenditure before visiting Tanzania.
GDP by Country in ASIA
tradingeconomics.com
csv, excel, json, xml
Updated Nov 13, 2025
+ more versions
Cite
TRADING ECONOMICS (2025). GDP by Country in ASIA [Dataset]. https://tradingeconomics.com/country-list/gdp?continent=asia
Available download formats: xml, json, csv, excel
Dataset updated
Nov 13, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
Asia
Description
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

Instagram: most popular posts as of 2024

statista.com
de.statista.com

Cite

Stacy Jo Dixon, Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

Instagram’s most popular post

As of April 2024, the most popular post on Instagram was a photo of Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, dethroned the previous record holder, 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, a photo of Kylie Jenner’s daughter totaling 18 million likes.
After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.

Instagram’s most popular accounts

As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.

Instagram influencers

In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.

Instagram around the globe

Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.

Cite

Statista Research Department, TikTok global quarterly downloads 2018-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

TikTok global quarterly downloads 2018-2024

Dataset provided by

Statista (http://statista.com/)

Authors

Statista Research Department

Description

TikTok interactions: is there a magic formula for content success?

In 2024, TikTok registered an engagement rate of approximately 4.64 percent on video content hosted on its platform. During the same year, the social video app recorded over 1,100 interactions on average. These interactions consisted primarily of likes, with fewer than 20 comments per piece of content on average in 2024.
The platform has been actively monitoring the issue of fake interactions, as it removed around 236 million fake likes during the first quarter of 2024. Though there is no secret formula to get the maximum of these metrics, recommended video length can possibly contribute to the success of content on TikTok.
It was recommended that tiny TikTok accounts with up to 500 followers post videos that are around 2.6 minutes long as of the first quarter of 2024, while the ideal video duration for huge TikTok accounts with over 50,000 followers was 7.28 minutes. The average length of TikTok videos posted by creators in 2024 was around 43 seconds.

What’s trending on TikTok Shop?

Since its launch in September 2023, TikTok Shop has become one of the most popular online shopping platforms, offering consumers a wide variety of products. In 2023, TikTok shops featuring beauty and personal care items sold over 370 million products worldwide.
TikTok shops featuring womenswear and underwear, as well as food and beverages, followed with 285 and 138 million products sold, respectively. Similarly, in the United States market, health and beauty products were the best-selling items, accounting for 85 percent of sales made via the TikTok Shop feature during the first month of its launch. In 2023, Indonesia was the market with the largest number of TikTok Shops, hosting over 20 percent of all TikTok Shops. Thailand and Vietnam followed with 18.29 and 17.54 percent of the total shops listed on the famous short video platform, respectively.
