100+ datasets found
  1. World Population Statistics - 2023

    • kaggle.com
    Updated Jan 9, 2024
    Cite
    Bhavik Jikadara (2024). World Population Statistics - 2023 [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/world-population-statistics-2023
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bhavik Jikadara
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description
    • The current US Census Bureau world population estimate in June 2019 shows that the current global population is 7,577,130,400 people on Earth, which far exceeds the world population of 7.2 billion in 2015. Our estimate based on UN data shows the world's population surpassing 7.7 billion.
    • China is the most populous country in the world with a population exceeding 1.4 billion. It is one of just two countries with a population of more than 1 billion, with India being the second. As of 2018, India has a population of over 1.355 billion people, and its population growth is expected to continue through at least 2050. By the year 2030, India is expected to become the most populous country in the world. This is because India’s population will grow, while China is projected to see a loss in population.
    • The following 11 countries that are the most populous in the world each have populations exceeding 100 million. These include the United States, Indonesia, Brazil, Pakistan, Nigeria, Bangladesh, Russia, Mexico, Japan, Ethiopia, and the Philippines. Of these nations, all are expected to continue to grow except Russia and Japan, which will see their populations drop by 2030 before falling again significantly by 2050.
    • Many other nations have populations of at least one million, while there are also countries that have just thousands. The smallest population in the world can be found in Vatican City, where only 801 people reside.
    • In 2018, the world’s population growth rate was 1.12%. Every five years since the 1970s, the population growth rate has continued to fall. The world’s population is expected to continue to grow larger but at a much slower pace. By 2030, the population will exceed 8 billion. In 2040, this number will grow to more than 9 billion. In 2055, the number will rise to over 10 billion, and another billion people won’t be added until near the end of the century. The current annual population growth estimates from the United Nations are in the millions - estimating that over 80 million new lives are added yearly.
    • This population growth will be significantly impacted by nine specific countries which are situated to contribute to the population growth more quickly than other nations. These nations include the Democratic Republic of the Congo, Ethiopia, India, Indonesia, Nigeria, Pakistan, Uganda, the United Republic of Tanzania, and the United States of America. Particularly of interest, India is on track to overtake China's position as the most populous country by 2030. Additionally, multiple nations within Africa are expected to double their populations before fertility rates begin to slow entirely.
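    As a rough consistency check on the figures quoted above, the 1.12% growth rate can be applied to the 7,577,130,400 population estimate. This is only a sketch: the two numbers come from different years (2018 and 2019), and the function name is illustrative rather than part of the dataset.

```python
def annual_increase(population: int, growth_rate_percent: float) -> float:
    """Return the number of people added in one year at the given growth rate."""
    return population * growth_rate_percent / 100.0


# Figures quoted in the description above (different reference years).
world_population_2019 = 7_577_130_400
growth_rate_2018_percent = 1.12

added = annual_increase(world_population_2019, growth_rate_2018_percent)
print(f"{added / 1e6:.1f} million people per year")
```

    The result lands between 84 and 85 million, which agrees with the description's statement that over 80 million people are added yearly.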

    Content

    • This dataset contains historical population data for every country/territory in the world, along with parameters such as area, continent, capital, density, population growth rate, population rank, and world population percentage.

    Dataset Glossary (Column-Wise):
    • Rank: Rank by Population.
    • CCA3: 3-letter Country/Territories Code.
    • Country/Territories: Name of the Country/Territories.
    • Capital: Name of the Capital.
    • Continent: Name of the Continent.
    • 2022 Population: Population of the Country/Territories in the year 2022.
    • 2020 Population: Population of the Country/Territories in the year 2020.
    • 2015 Population: Population of the Country/Territories in the year 2015.
    • 2010 Population: Population of the Country/Territories in the year 2010.
    • 2000 Population: Population of the Country/Territories in the year 2000.
    • 1990 Population: Population of the Country/Territories in the year 1990.
    • 1980 Population: Population of the Country/Territories in the year 1980.
    • 1970 Population: Population of the Country/Territories in the year 1970.
    • Area (km²): Area size of the Country/Territories in square kilometers.
    • Density (per km²): Population Density per square kilometer.
    • Growth Rate: Population Growth Rate by Country/Territories.
    • World Population Percentage: The population percentage by each Country/Territories.
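    The derived columns in the glossary (density and population share) can be recomputed from the raw columns. The sketch below uses only the Python standard library; the column names follow the glossary, but the two sample rows are made up for illustration, and the percentage is a share of the sample total rather than of the real world population.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the glossary.
SAMPLE_CSV = """Country/Territories,2022 Population,Area (km²)
Aland,1000,50
Borduria,3000,100
"""


def add_derived_columns(rows):
    """Compute density and population share for each row."""
    total = sum(int(r["2022 Population"]) for r in rows)
    derived = []
    for r in rows:
        population = int(r["2022 Population"])
        area = float(r["Area (km²)"])
        derived.append({
            "country": r["Country/Territories"],
            "density": population / area,         # people per km²
            "share_pct": 100 * population / total,  # percent of the sample total
        })
    return derived


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
derived = add_derived_columns(rows)
for d in derived:
    print(d["country"], d["density"], d["share_pct"])
```

    Comparing recomputed values like these against the published Density and World Population Percentage columns is one way to spot rows where the source figures are inconsistent.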
  2. United States Population

    • tradingeconomics.com
    • es.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Dec 15, 2024
    Cite
    TRADING ECONOMICS (2024). United States Population [Dataset]. https://tradingeconomics.com/united-states/population
    Explore at:
    Available download formats: excel, xml, csv, json
    Dataset updated
    Dec 15, 2024
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1900 - Dec 31, 2024
    Area covered
    United States
    Description

    The total population in the United States was estimated at 341.2 million people in 2024, according to the latest census figures and projections from Trading Economics. This dataset provides United States population data: actual values, historical data, forecasts, charts, statistics, an economic calendar, and news.

  3. Development Economics Data Group - Severely food insecure people (million)...

    • gimi9.com
    + more versions
    Cite
    Development Economics Data Group - Severely food insecure people (million) (3-year average) (FAO FS) | gimi9.com [Dataset]. https://gimi9.com/dataset/worldbank_fao_fs_210071/
    Explore at:
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Estimated number of people living in households classified as severely food insecure. It is calculated by multiplying the estimated percentage of people affected by severe food insecurity (I_2.5) by the total population.
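    The calculation described above is a single multiplication. A minimal sketch follows; the function name and the example inputs are hypothetical and are not values from the dataset.

```python
def severely_food_insecure_millions(prevalence_percent: float,
                                    total_population_millions: float) -> float:
    """Estimated people (in millions) living in severely food insecure households.

    prevalence_percent: estimated percentage of people affected by severe
        food insecurity (the I_2.5 indicator in the description).
    total_population_millions: total population, in millions.
    """
    return prevalence_percent * total_population_millions / 100.0


# Hypothetical inputs: 10% prevalence in a population of 50 million.
estimate = severely_food_insecure_millions(10.0, 50.0)
print(estimate)
```

    Since the indicator is published as a 3-year average, the prevalence passed in would presumably be the averaged figure rather than a single-year value.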

  4. Data from: Russian Troll Tweets

    • kaggle.com
    Updated Aug 1, 2018
    Cite
    FiveThirtyEight (2018). Russian Troll Tweets [Dataset]. https://www.kaggle.com/fivethirtyeight/russian-troll-tweets/tasks
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 1, 2018
    Dataset provided by
    Kaggle
    Authors
    FiveThirtyEight
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Russia
    Description

    3 million Russian troll tweets

    This data was used in the FiveThirtyEight story Why We’re Sharing 3 Million Russian Troll Tweets.

    This directory contains data on nearly 3 million tweets sent from Twitter handles connected to the Internet Research Agency, a Russian "troll factory" and a defendant in an indictment filed by the Justice Department in February 2018, as part of special counsel Robert Mueller's Russia investigation. The tweets in this database were sent between February 2012 and May 2018, with the vast majority posted from 2015 through 2017.

    FiveThirtyEight obtained the data from Clemson University researchers Darren Linvill, an associate professor of communication, and Patrick Warren, an associate professor of economics, on July 25, 2018. They gathered the data using custom searches on a tool called Social Studio, owned by Salesforce and contracted for use by Clemson's Social Media Listening Center.

    The basis for the Twitter handles included in this data are the November 2017 and June 2018 lists of Internet Research Agency-connected handles that Twitter provided to Congress. This data set contains every tweet sent from each of the 2,752 handles on the November 2017 list since May 10, 2015. For the 946 handles newly added on the June 2018 list, this data contains every tweet since June 19, 2015. (For certain handles, the data extends even earlier than these ranges. Some of the listed handles did not tweet during these ranges.) The researchers believe that this includes the overwhelming majority of these handles’ activity. The researchers also removed 19 handles that remained on the June 2018 list but that they deemed very unlikely to be IRA trolls.

    In total, the nine CSV files include 2,973,371 tweets from 2,848 Twitter handles. Also, as always, caveat emptor -- in this case, tweet-reader beware: In addition to their own content, some of the tweets contain active links, which may lead to adult content or worse.

    The Clemson researchers used this data in a working paper, Troll Factories: The Internet Research Agency and State-Sponsored Agenda Building, which is currently under review at an academic journal. The authors’ analysis in this paper was done on the data file provided here, limiting the date window to June 19, 2015, to Dec. 31, 2017.

    The files have the following columns:

    Header: Definition
    • external_author_id: An author account ID from Twitter
    • author: The handle sending the tweet
    • content: The text of the tweet
    • region: A region classification, as determined by Social Studio
    • language: The language of the tweet
    • publish_date: The date and time the tweet was sent
    • harvested_date: The date and time the tweet was collected by Social Studio
    • following: The number of accounts the handle was following at the time of the tweet
    • followers: The number of followers the handle had at the time of the tweet
    • updates: The number of “update actions” on the account that authored the tweet, including tweets, retweets and likes
    • post_type: Indicates if the tweet was a retweet or a quote-tweet
    • account_type: Specific account theme, as coded by Linvill and Warren
    • retweet: A binary indicator of whether or not the tweet is a retweet
    • account_category: General account theme, as coded by Linvill and Warren
    • new_june_2018: A binary indicator of whether the handle was newly listed in June 2018
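    To illustrate how these columns might be used, the sketch below counts non-retweet tweets per handle with the standard csv module. The column names come from the list above; the sample rows are invented, and the assumption that the binary retweet column is stored as "0"/"1" should be checked against the real files.

```python
import csv
import io
from collections import Counter

# Invented sample rows; a real file has all fifteen columns listed above.
SAMPLE_CSV = """author,content,publish_date,retweet,account_category
handle_a,example text one,1/1/2017 10:00,0,CategoryX
handle_b,example text two,1/2/2017 11:00,1,CategoryY
handle_a,example text three,1/3/2017 12:00,0,CategoryX
"""


def original_tweets_per_author(csv_text: str) -> Counter:
    """Count tweets that are not retweets, grouped by handle."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: retweet is encoded as the string "0" or "1".
        if row["retweet"] == "0":
            counts[row["author"]] += 1
    return counts


counts = original_tweets_per_author(SAMPLE_CSV)
print(counts.most_common())
```

    For the full nine-file dataset the same function could be called once per file and the resulting counters summed.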

    If you use this data and find anything interesting, please let us know. Send your projects to oliver.roeder@fivethirtyeight.com or @ollie.

    The Clemson researchers wish to acknowledge the assistance of the Clemson University Social Media Listening Center and Brandon Boatwright of the University of Tennessee, Knoxville.

  5. Number of global social network users 2017-2028

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

    Who uses social media?
    Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?
    Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

    What are the most popular social media platforms?
    Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  6. Input Digital Datasets for the Soil-Water Balance Groundwater Recharge Model...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Aug 15, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Input Digital Datasets for the Soil-Water Balance Groundwater Recharge Model of the Upper Colorado River Basin [Dataset]. https://catalog.data.gov/dataset/input-digital-datasets-for-the-soil-water-balance-groundwater-recharge-model-of-the-upper-
    Explore at:
    Dataset updated
    Aug 15, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Colorado River
    Description

    The Colorado River and its tributaries supply water to more than 35 million people in the United States and 3 million people in Mexico, irrigating more than 4.5 million acres of farmland, and generating about 12 billion kilowatt hours of hydroelectric power annually. Planning for the sustainable management of the Colorado River in future climates requires an understanding of the Upper Colorado River Basin groundwater system. The Upper Colorado River Basin, encompassing more than 110,000 square miles (mi²), contains the headwaters of the Colorado River and is an important source of snowmelt runoff to the River. Groundwater discharge also is an important source of water in the River and its tributaries, with estimates ranging from 21 to 58 percent of streamflow in the upper basin. A study by Castle and others (2014) using remotely sensed gravity observations from the NASA Gravity Recovery and Climate Experiment (GRACE) mission found that UCRB groundwater was depleted by more than 17 million acre-feet (ft) from December 2004 to November 2013. Understanding groundwater-budget components, including groundwater recharge, is important to sustainably manage both groundwater and surface-water supplies in the Colorado River Basin.

  7. Countries with the most Facebook users 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Countries with the most Facebook users 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Which country has the most Facebook users?

    There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.

    Facebook – the most used social media

    Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.

    Facebook usage by device
    As of July 2021, it was found that 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browser as the company has published several mobile apps for users to access their products and services. As of the third quarter 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
    
  8. North Carolina Age Cohorts Dataset: Children, Working Adults, and Seniors in...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    + more versions
    Cite
    Neilsberg Research (2025). North Carolina Age Cohorts Dataset: Children, Working Adults, and Seniors in North Carolina - Population and Percentage Analysis // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/4b987530-f122-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    North Carolina
    Variables measured
    Population Over 65 Years, Population Under 18 Years, Population Between 18 and 64 Years, Percent of Total Population for Age Groups
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age cohorts. We divided the age cohorts into three buckets: children (under the age of 18 years), working population (between 18 and 64 years), and senior population (over 65 years). For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the North Carolina population by age cohorts (Children: Under 18 years; Working population: 18-64 years; Senior population: 65 years or more). It lists the population in each age cohort group along with its percentage relative to the total population of North Carolina. The dataset can be utilized to understand the population distribution across children, working population and senior population for dependency ratio, housing requirements, ageing, migration patterns etc.

    Key observations

    The largest age group was 18 to 64 years with a population of 6.47 million (61.17% of the total population). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age cohorts:

    • Under 18 years
    • 18 to 64 years
    • 65 years and over

    Variables / Data Columns

    • Age Group: The age cohort for the North Carolina population analysis. Three groups are expected (Children, Working Population and Senior Population).
    • Population: The population of the age cohort in North Carolina.
    • Percent of Total Population: The cohort's population as a percent of the total population of North Carolina.
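    The three-cohort tabulation can be sketched as follows. The published dataset is built from aggregated ACS estimates, not individual ages, so the list of ages here is purely illustrative; the cohort boundaries follow the "Under 18", "18 to 64" and "65 years and over" groups listed above, and the function names are hypothetical.

```python
def cohort(age: int) -> str:
    """Map an age in years to one of the three cohorts described above."""
    if age < 18:
        return "Children"
    if age < 65:
        return "Working Population"
    return "Senior Population"


def cohort_summary(ages):
    """Return {cohort: (population, percent of total)} for a list of ages."""
    counts = {"Children": 0, "Working Population": 0, "Senior Population": 0}
    for age in ages:
        counts[cohort(age)] += 1
    total = len(ages)
    if total == 0:
        return {name: (0, 0.0) for name in counts}
    return {name: (n, 100.0 * n / total) for name, n in counts.items()}


# Illustrative ages only, not data from the ACS.
summary = cohort_summary([5, 17, 18, 40, 64, 65, 80, 30])
for name, (population, percent) in summary.items():
    print(f"{name}: {population} ({percent:.2f}%)")
```

    The output has the same shape as the dataset's three columns: one row per age group, with a population count and its percentage of the total.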

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for North Carolina Population by Age. You can refer to it here

  9. Max Foundation Bangladesh Healthy Village Tracker

    • kaggle.com
    Updated Dec 14, 2021
    Cite
    Remco Geervliet (2021). Max Foundation Bangladesh Healthy Village Tracker [Dataset]. https://www.kaggle.com/remcogeervliet/max-foundation-bangladesh-healthy-village-tracker/discussion
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 14, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Remco Geervliet
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Bangladesh
    Description

    Max Foundation

    Max Foundation is a Netherlands-based NGO that works towards a healthy start for every child in the most effective and long-lasting way. Over the past 15 years, our teams in Bangladesh and Ethiopia have reached almost 3 million people, supporting communities in reducing stunting and undernutrition by gaining better access to clean water, sanitation and hygiene, as well as healthy diets and care for mother and child.

    Maximising our impact and cost efficiency are at the core of our work, which makes quantifying and analysing our programmes crucial. We therefore collect a lot of information on the communities we work with; to understand them better and see where and how we can improve as an organisation.

    This dataset is one of many we are making publicly available because we believe that data in the development sector should be open: not as a goal in itself, but as a way to help the sector be more effective and create more impact.

    Content

    These data are collected quarterly at the village-level, in aggregate. In Max Foundation's Healthy Village Approach, our team has created several indicators to track how villages are progressing on WASH (water, sanitation and hygiene), nutrition, and SRHR (sexual and reproductive health and rights) & Baby WASH.

    Privacy and links to our other data

    All of Max Foundation's data are collected and processed according to GDPR standards and explicit informed consent is given by all respondents. They are also clearly informed that choosing not to participate in data collection will in no way affect their eligibility for, or receiving of, products or services from Max Foundation.

    Furthermore, we enforce strong privacy protections on our open data to minimise the risk of these data being used to cause harm or re-identify individuals.

    Concretely this means:
    • Villages are masked by random numbers. However, to ensure it is still possible to compare our data sets, these random numbers are consistent across all datasets. This means that village '1' in this data is the same as village '1' in all of our other Bangladesh datasets, unless stated otherwise. Higher level administrative units can be deduced from matching the village numbers to the bd_ loc_XX datasets in the Max Foundation Bangladesh 2018 WASH Census dataset.
    • Population counts have been bucketed. The values represent the mid-point of a given bucket: for the number of households in the village, which is bucketed by 20 households, the value 50 represents 40-60 households. The values have also been censored at the upper end, and some at the lower end as well. The column descriptions specify any transformations done to the data.
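    The bucketing described above can be sketched as a small function. This is an illustration under stated assumptions, not Max Foundation's actual procedure: the lower edge of each bucket is treated as inclusive, and the upper_cap parameter (with its example value) is a hypothetical stand-in for the upper-end censoring, whose real thresholds are given in the column descriptions.

```python
from typing import Optional


def bucket_midpoint(count: int, width: int = 20,
                    upper_cap: Optional[int] = None) -> int:
    """Replace a raw count with the mid-point of its bucket.

    With width=20, counts 40..59 all map to 50, matching the
    "value 50 represents 40-60 households" example in the text.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    lower_edge = (count // width) * width
    midpoint = lower_edge + width // 2
    if upper_cap is not None:
        # Hypothetical censoring: very large values collapse to the cap.
        midpoint = min(midpoint, upper_cap)
    return midpoint


print(bucket_midpoint(47))                  # falls in the 40-60 bucket
print(bucket_midpoint(530, upper_cap=210))  # censored at the assumed cap
```

    Because every value inside a bucket maps to the same mid-point, analyses on these columns should treat them as coarse ranges rather than exact counts.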

    A final note to anyone trying to link Max Foundation's various datasets; as data is self-reported, sometimes by individuals other times by whole communities, there may be differences in for instance the number of households or the number of stunted children in a given village in this dataset versus in another. Some differences can be explained by differences in definitions (a household is a concept that is often hard to define and its interpretation may vary from person to person), and others by a lack of information on the part of a respondent. We therefore encourage you to look at these differences and see which value makes the most sense for the specific analysis you are conducting.

    Acknowledgements

    These data could have not been collected without the generous support from the Embassy of the Kingdom of the Netherlands in Dhaka and numerous other donors who have supported us over the years. Special thanks to our Bangladesh team for their excellent work in guiding the data collection process.

    What you can do for our communities

    We invite you to share any interesting insights you have derived from the data with us. From visualising our impact, to uncovering which parts of our programmes are most strongly related with reducing stunting, to making new connections we may have not even considered; we are eager to hear how we can be more effective in what we do and how we do it.

    More detailed data insights are available from our internal data, such as the linking of households between datasets. Please note that we would be happy to share more detailed data with researchers, students and many others once proper agreements are in place.

    As we value impact above all else, we are happy to work with anyone who can help us to improve this. We are constantly adapting our approach based on internal and external findings, and invite you to join us on this journey. Together we can ensure that every child has a healthy start.

  10. FSboard

    • kaggle.com
    zip
    Updated Feb 25, 2025
    Cite
    Google Research (2025). FSboard [Dataset]. https://www.kaggle.com/datasets/googleai/fsboard/code
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Feb 25, 2025
    Dataset provided by
    Google (http://google.com/)
    Authors
    Google Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary

    FSboard is an American Sign Language fingerspelling dataset situated in a mobile text entry use case, collected from 147 paid and consenting Deaf signers using Pixel 4A selfie cameras in a variety of environments. At >3 million characters in length and >250 hours in duration, FSboard is the largest fingerspelling recognition dataset to date by a factor of >10x.

    We previously hosted a Kaggle competition using MediaPipe Holistic landmarks for the FSboard data; this release now includes the underlying RGB videos and val/test sets.

    See our paper for a more complete exposition of the dataset: FSboard: Over 3 million characters of ASL fingerspelling collected via smartphones

    The dataset consists of several categories of synthetically generated phrases (examples in the table below, not real PII) recorded as video clips of ASL fingerspelling (example frames in the figure below, faces blurred here but not in the dataset).

    Directory, category, and example:
    • "dmk", MacKenzie phrases: prevailing wind from the east
    • "daun", URLs: /dfinance/list.asp?id=418/
    • Addresses: 9841 gritt hill
    • Phone Numbers: 166-893-6320
    • Names: mohammed kim

    [Figure: example frames of ASL fingerspelling video clips, faces blurred]

    Responsible Use

    While facial expressions are an essential component of sign language and are therefore included in the dataset, we ask that you blur the signers’ faces when publicizing examples. You should not attempt to reidentify the signers or use their likenesses to generate and publish other content (deepfakes). Please be culturally respectful of the Deaf/Hard of Hearing community in your use of the dataset and do not exaggerate the significance of improving ASL fingerspelling performance, which is only one small component of American Sign Language.

    Landmarks

    Landmarks were extracted using MediaPipe Holistic. They are provided as tf.train.SequenceExample entries in TFRecordio files. There is also a script which converts these TFRecordio files to Parquet files in a similar format to the one used in the previous Kaggle Competition. Since each entry in the Parquet file represents a single landmark frame, the script also produces a supplemental csv file with video level information.

    Sensitive Content Filtering

    The synthetic URLs generated in the dataset were created by recombining parts from real URLs. As such, the full breadth of content available on the internet is represented. It is important not to infantilize the Deaf community, and therefore important to ensure that any applications in this space are able to produce arbitrary output. Imagine the frustration when your keyboard r*****s to produce certain ducking words. However, it's also important to ensure that an application doesn't easily produce offensive unintended content. In an effort to facilitate people making sane decisions with this data, we've run a sensitive content filter and keyword searches on the phrases used and manually reviewed the result to produce a boolean tag "sensitiveContent" which is available in the json files. Please ensure that the Deaf community is involved in the creation of any applications targeted to them.
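    A minimal sketch of filtering on the "sensitiveContent" tag follows. Only the tag name comes from the description above; the list-of-objects layout and the "phrase" field are assumptions made for illustration, so the real json files may need a different access pattern.

```python
import json

# Hypothetical layout: a list of entries, each carrying the boolean tag.
SAMPLE_JSON = """[
  {"phrase": "example phrase one", "sensitiveContent": false},
  {"phrase": "example phrase two", "sensitiveContent": true},
  {"phrase": "example phrase three"}
]"""


def non_sensitive(entries):
    """Keep only entries explicitly tagged sensitiveContent == false.

    Entries that lack the tag are dropped as well, which is the more
    conservative choice when the tag's presence cannot be guaranteed.
    """
    return [e for e in entries if e.get("sensitiveContent") is False]


entries = json.loads(SAMPLE_JSON)
kept = non_sensitive(entries)
print([e["phrase"] for e in kept])
```

    Whether to drop or keep untagged entries is a design decision for each application; the description above argues against blanket filtering, so the tag is probably best used to inform review rather than to silently discard data.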

    Attribution

    If you use FSboard in your work, please cite:

    @misc{georg2024fsboard3millioncharacters,
      title={FSboard: Over 3 million characters of ASL fingerspelling collected via smartphones},
      author={Manfred Georg and Garrett Tanzer and Saad Hassan and Maximus Shengelia and Esha Uboweja and Sam Sepah and Sean Forbes and Thad Starner},
      year={2024},
      eprint={2407.15806},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.15806},
    }

  11. GitHub Activity Data

    • console.cloud.google.com
    Updated Jun 23, 2022
    + more versions
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:GitHub&inv=1&invt=Ab41nA (2022). GitHub Activity Data [Dataset]. https://console.cloud.google.com/marketplace/product/github/github-repos
    Dataset updated
    Jun 23, 2022
    Dataset provided by
    GitHub (https://github.com/)
    Google (http://google.com/)
    Description

    GitHub is how people build software and is home to the largest community of open source developers in the world, with over 12 million people contributing to 31 million projects on GitHub since 2008. This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories, including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
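As a rough illustration of the regular-expression search mentioned above, the sketch below builds a BigQuery Standard SQL query and runs it with the google-cloud-bigquery client. The table `bigquery-public-data.github_repos.sample_contents` and the columns `sample_repo_name`, `sample_path`, and `content` are assumptions that should be checked in the BigQuery console, and `find_files_matching` is a hypothetical helper name. Running it requires Google Cloud credentials, and the bytes scanned count against the monthly processing quota.

```python
def build_search_sql(max_rows=10):
    """Return a Standard SQL query that regex-searches file contents.

    Table and column names are assumptions; verify them before running.
    The regex itself is passed separately as the @pattern query parameter.
    """
    return (
        "SELECT sample_repo_name, sample_path\n"
        "FROM `bigquery-public-data.github_repos.sample_contents`\n"
        "WHERE REGEXP_CONTAINS(content, @pattern)\n"
        f"LIMIT {int(max_rows)}"
    )


def find_files_matching(pattern, max_rows=10):
    """Run the search; needs google-cloud-bigquery and GCP credentials."""
    # Imported lazily so this module can be loaded without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("pattern", "STRING", pattern),
        ]
    )
    rows = client.query(build_search_sql(max_rows), job_config=job_config).result()
    return [(row.sample_repo_name, row.sample_path) for row in rows]


if __name__ == "__main__":
    # Only prints the query text; call find_files_matching() to execute it.
    print(build_search_sql(5))
```

Passing the pattern as a query parameter rather than formatting it into the SQL string avoids quoting problems with regular expressions that contain quotes or backslashes.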

  12. H-1B Visa Petitions 2011-2016

    • kaggle.com
    Updated Feb 28, 2017
    Cite
    Sharan Naribole (2017). H-1B Visa Petitions 2011-2016 [Dataset]. https://www.kaggle.com/nsharan/h-1b-visa/data
    Dataset updated
    Feb 28, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sharan Naribole
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    H-1B visas are a category of employment-based, non-immigrant visas for temporary foreign workers in the United States. For a foreign national to apply for an H-1B visa, a US employer must offer them a job and submit a petition for an H-1B visa to the US immigration department. This is also the most common visa status applied for and held by international students once they complete college or higher education and begin working in a full-time position.

    The following articles contain more information about the H-1B visa process:

    Content

    This dataset contains five years' worth of H-1B petition data, with approximately 3 million records overall. The columns in the dataset include case status, employer name, worksite coordinates, job title, prevailing wage, occupation code, and year filed.

    For more information on individual columns, refer to the column metadata. A detailed description of the underlying raw dataset is available in an official data dictionary.

    Acknowledgements

    The Office of Foreign Labor Certification (OFLC) generates program data, including data about H-1B visas. The disclosure data is updated annually and is available online.

    The raw data available is messy and not immediately suitable for analysis. A set of data transformations was performed to make the data more accessible for quick exploration. To learn more, refer to this blog post and to the complementary R Notebook.

    Inspiration

    • Is the number of petitions with the Data Engineer job title increasing over time?
    • Which part of the US has the most Hardware Engineer jobs?
    • Which industry has the largest number of Data Scientist positions?
    • Which employers file the most petitions each year?
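Questions like these reduce to grouping and counting rows. The sketch below uses only the Python standard library and a few made-up rows; the column names `JOB_TITLE`, `EMPLOYER_NAME`, and `YEAR` are assumptions and should be checked against the header of the real file, and both helper functions are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for the real petition file.
SAMPLE_CSV = """JOB_TITLE,EMPLOYER_NAME,YEAR
DATA ENGINEER,ACME CORP,2015
DATA ENGINEER,ACME CORP,2016
DATA ENGINEER,GLOBEX LLC,2016
HARDWARE ENGINEER,GLOBEX LLC,2016
"""


def petitions_per_year(rows, job_title):
    """Count petitions per filing year for one job title (case-insensitive)."""
    counts = Counter()
    for row in rows:
        if row["JOB_TITLE"].strip().upper() == job_title.upper():
            counts[int(row["YEAR"])] += 1
    return dict(sorted(counts.items()))


def top_employer_per_year(rows):
    """Return the employer with the most petitions in each year."""
    per_year = {}
    for row in rows:
        per_year.setdefault(int(row["YEAR"]), Counter())[row["EMPLOYER_NAME"]] += 1
    return {year: c.most_common(1)[0][0] for year, c in sorted(per_year.items())}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
data_engineer_counts = petitions_per_year(rows, "Data Engineer")
top_employers = top_employer_per_year(rows)
print(data_engineer_counts)
print(top_employers)
```

For the full file of roughly 3 million records, the same logic applies when reading from disk with `csv.DictReader(open(path, newline=""))`, since rows are processed one at a time.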

  13. Euro Area Population

    • tradingeconomics.com
    • pt.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Oct 10, 2012
    Cite
    TRADING ECONOMICS (2012). Euro Area Population [Dataset]. https://tradingeconomics.com/euro-area/population
    Available download formats: xml, excel, json, csv
    Dataset updated
    Oct 10, 2012
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1960 - Dec 31, 2025
    Area covered
    Euro Area
    Description

    The total population in the Euro Area was estimated at 351.4 million people in 2025, according to the latest census figures and projections from Trading Economics. This dataset provides the latest reported value for Euro Area Population, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.

  14. Google Landmarks Dataset v2

    • github.com
    • opendatalab.com
    Updated Sep 27, 2019
    Cite
    Google (2019). Google Landmarks Dataset v2 [Dataset]. https://github.com/cvdfoundation/google-landmark
    Dataset updated
    Sep 27, 2019
    Dataset provided by
    Google (http://google.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the second version of the Google Landmarks dataset (GLDv2), which contains images annotated with labels representing human-made and natural landmarks. The dataset can be used for landmark recognition and retrieval experiments. This version of the dataset contains approximately 5 million images, split into 3 sets of images: train, index and test. The dataset was presented in our CVPR'20 paper. In this repository, we present download links for all dataset files and relevant code for metric computation. This dataset was associated with two Kaggle challenges, on landmark recognition and landmark retrieval. Results were discussed as part of a CVPR'19 workshop. In this repository, we also provide scores for the top 10 teams in the challenges, based on the latest ground-truth version. Please visit the challenge and workshop webpages for more details on the data, tasks and technical solutions from top teams.

  15. Wikilinks: A Large-scale Cross-Document Coreference Corpus Labeled via Links...

    • academictorrents.com
    bittorrent
    Updated Mar 4, 2017
    + more versions
    Cite
    Sameer Singh and Amarnag Subramanya and Fernando Pereira and Andrew McCallum (2017). Wikilinks: A Large-scale Cross-Document Coreference Corpus Labeled via Links to Wikipedia (Original Dataset) [Dataset]. https://academictorrents.com/details/beefa2ec4161432cd1d9f693a88d3670aae68357
    Available download formats: bittorrent (1837946933)
    Dataset updated
    Mar 4, 2017
    Dataset authored and provided by
    Sameer Singh and Amarnag Subramanya and Fernando Pereira and Andrew McCallum
    License

    https://academictorrents.com/nolicensespecified

    Description

    Cross-document coreference resolution is the task of grouping the entity mentions in a collection of documents into sets that each represent a distinct entity. It is central to knowledge base construction and also useful for joint inference with other NLP components. Obtaining large, organic labeled datasets for training and testing cross-document coreference has previously been difficult. We use a method for automatically gathering massive amounts of naturally-occurring cross-document reference data to create the Wikilinks dataset, comprising 40 million mentions over 3 million entities. Our method is based on finding hyperlinks to Wikipedia from a web crawl and using anchor text as mentions. In addition to providing large-scale labeled data without human effort, we are able to include many styles of text beyond newswire and many entity types beyond people.

    Introduction

    The Wikipedia links (WikiLinks) data consists of web pages that satisfy the following two constraints: a. conta

  16. Facebook users worldwide 2017-2027

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Facebook users worldwide 2017-2027 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, a new peak, in 2027. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  17. Stable_Diffusion_3_Recaption

    • huggingface.co
    Updated Apr 23, 2025
    Cite
    Gabriel Mongaras (2025). Stable_Diffusion_3_Recaption [Dataset]. https://huggingface.co/datasets/gmongaras/Stable_Diffusion_3_Recaption
    Dataset updated
    Apr 23, 2025
    Authors
    Gabriel Mongaras
    License

    https://choosealicense.com/licenses/openrail/

    Description

    This dataset is the one specified in the stable diffusion 3 paper which is composed of the ImageNet dataset and the CC12M dataset.

    I used the ImageNet 2012 train/val data and captioned it as specified in the paper: "a photo of a 〈class name〉" (note all ids are 999,999,999).

    CC12M is a dataset with 12 million images created in 2021. Unfortunately, the downloader provided by Google has many broken links and the download takes forever. However, some people in the community publicized the dataset.… See the full description on the dataset page: https://huggingface.co/datasets/gmongaras/Stable_Diffusion_3_Recaption.

  18. Data from: GeoNames

    • data.wu.ac.at
    • huggingface.co
    zip
    Updated Oct 10, 2013
    Cite
    Open Geospatial Data (2013). GeoNames [Dataset]. https://data.wu.ac.at/schema/datahub_io/MzE1MTQ4YWYtZmQyOC00ZWJjLTg3MDEtZWVkMDExNTE3MDA0
    Available download formats: zip
    Dataset updated
    Oct 10, 2013
    Dataset provided by
    Open Geospatial Consortium (https://www.ogc.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The geonames.org geographical database is available for download free of charge under a Creative Commons Attribution license. It contains over eight million geographical names and consists of 6.3 million unique features, including 2.2 million populated places, and 1.8 million alternate names. All features are categorized into one of nine feature classes and further subcategorized into one of 645 feature codes.

    The data is accessible free of charge through a number of web services and a daily database export. Geonames.org already serves more than 3 million web service requests per day.

    Geonames integrates geographical data such as names of places in various languages, elevation, population and others from various sources. All lat/long coordinates are in WGS84 (World Geodetic System 1984). Users may manually edit, correct and add new names using a user-friendly wiki interface.

    TODO

    This is a large dataset and there are a whole bunch of specially exported subsets of data at http://download.geonames.org/export/dump/ which it might be worth turning into separate datasets (or at least listing here in Resources).

    Linked Data

    Geonames locations are available as linked data, see dataset:geonames-semantic-web

  19. Average daily time spent on social media worldwide 2012-2024

    • statista.com
    • es.statista.com
    + more versions
    Cite
    Stacy Jo Dixon, Average daily time spent on social media worldwide 2012-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How much time do people spend on social media?

    As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.

    Global social media usage

    Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

    Global impact of social media

    Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics and heightened everyday distractions.
    
  20. ‘COVID vaccination vs. mortality’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Aug 4, 2020
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘COVID vaccination vs. mortality ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-covid-vaccination-vs-mortality-cbd8/06c8ccd2/?iid=010-492&v=presentation
    Dataset updated
    Aug 4, 2020
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘COVID vaccination vs. mortality’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sinakaraji/covid-vaccination-vs-death on 12 November 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    The COVID-19 outbreak has brought the whole planet to its knees. More than 4.5 million people had died as of the writing of this notebook, and the only acceptable way out of the disaster is to vaccinate all parts of society. Despite the fact that the benefits of vaccination have been proved to the world many times, anti-vaccine groups are springing up all over the world. This data set was generated to investigate the impact of coronavirus vaccinations on coronavirus mortality.

    Content

    • country: country name
    • iso_code: ISO code for each country
    • date: date that this data belongs to
    • total_vaccinations: number of all doses of COVID vaccine usage in that country
    • people_vaccinated: number of people who got at least one shot of COVID vaccine
    • people_fully_vaccinated: number of people who got full vaccine shots
    • New_deaths: number of daily new deaths
    • population: 2021 country population
    • ratio: % of vaccinations in that country at that date = people_vaccinated/population * 100
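The ratio column is defined above as people_vaccinated/population * 100. A minimal sketch of that formula follows; the function name `vaccination_ratio` and the sample numbers are hypothetical and not taken from the dataset.

```python
def vaccination_ratio(people_vaccinated, population):
    """Percentage of the population with at least one dose.

    Implements ratio = people_vaccinated / population * 100.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return people_vaccinated / population * 100


# Hypothetical example: 500,000 vaccinated out of a population of 2,000,000.
example_ratio = vaccination_ratio(500_000, 2_000_000)
print(example_ratio)
```

Because people_vaccinated counts people rather than doses, this ratio should stay at or below 100 when the population figure is accurate, unlike a ratio built from total_vaccinations.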

    Data Collection

    This dataset is a combination of the following three datasets:

    1. https://www.kaggle.com/gpreda/covid-world-vaccination-progress

    2. https://covid19.who.int/WHO-COVID-19-global-data.csv

    3. https://www.kaggle.com/rsrishav/world-population

    You can find more detail about this dataset by reading this notebook:

    https://www.kaggle.com/sinakaraji/simple-linear-regression-covid-vaccination

    Countries in this dataset:

    Afghanistan; Albania; Algeria; Andorra; Angola
    Anguilla; Antigua and Barbuda; Argentina; Armenia; Aruba
    Australia; Austria; Azerbaijan; Bahamas; Bahrain
    Bangladesh; Barbados; Belarus; Belgium; Belize
    Benin; Bermuda; Bhutan; Bolivia (Plurinational State of); Brazil
    Bosnia and Herzegovina; Botswana; Brunei Darussalam; Bulgaria; Burkina Faso
    Cambodia; Cameroon; Canada; Cabo Verde; Cayman Islands
    Central African Republic; Chad; Chile; China; Colombia
    Comoros; Cook Islands; Costa Rica; Croatia; Cuba
    Curaçao; Cyprus; Denmark; Djibouti; Dominica
    Dominican Republic; Ecuador; Egypt; El Salvador; Equatorial Guinea
    Estonia; Ethiopia; Falkland Islands (Malvinas); Fiji; Finland
    France; French Polynesia; Gabon; Gambia; Georgia
    Germany; Ghana; Gibraltar; Greece; Greenland
    Grenada; Guatemala; Guinea; Guinea-Bissau; Guyana
    Haiti; Honduras; Hungary; Iceland; India
    Indonesia; Iran (Islamic Republic of); Iraq; Ireland; Isle of Man
    Israel; Italy; Jamaica; Japan; Jordan
    Kazakhstan; Kenya; Kiribati; Kuwait; Kyrgyzstan
    Lao People's Democratic Republic; Latvia; Lebanon; Lesotho; Liberia
    Libya; Liechtenstein; Lithuania; Luxembourg; Madagascar
    Malawi; Malaysia; Maldives; Mali; Malta
    Mauritania; Mauritius; Mexico; Republic of Moldova; Monaco
    Mongolia; Montenegro; Montserrat; Morocco; Mozambique
    Myanmar; Namibia; Nauru; Nepal; Netherlands
    New Caledonia; New Zealand; Nicaragua; Niger; Nigeria
    Niue; North Macedonia; Norway; Oman; Pakistan
    occupied Palestinian territory, including east Jerusalem
    Panama; Papua New Guinea; Paraguay; Peru; Philippines
    Poland; Portugal; Qatar; Romania; Russian Federation
    Rwanda; Saint Kitts and Nevis; Saint Lucia
    Saint Vincent and the Grenadines; Samoa; San Marino; Sao Tome and Principe; Saudi Arabia
    Senegal; Serbia; Seychelles; Sierra Leone; Singapore
    Slovakia; Slovenia; Solomon Islands; Somalia; South Africa
    Republic of Korea; South Sudan; Spain; Sri Lanka; Sudan
    Suriname; Sweden; Switzerland; Syrian Arab Republic; Tajikistan
    United Republic of Tanzania; Thailand; Togo; Tonga; Trinidad and Tobago
    Tunisia; Turkey; Turkmenistan; Turks and Caicos Islands; Tuvalu
    Uganda; Ukraine; United Arab Emirates; The United Kingdom; United States of America
    Uruguay; Uzbekistan; Vanuatu; Venezuela (Bolivarian Republic of); Viet Nam
    Wallis and Futuna; Yemen; Zambia; Zimbabwe

    --- Original source retains full ownership of the source dataset ---

Cite
Bhavik Jikadara (2024). World Population Statistics - 2023 [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/world-population-statistics-2023

World Population Statistics - 2023

Highlights From the 2023 World Population Data Sheet

Dataset updated
Jan 9, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Bhavik Jikadara
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
World
Description
  • The US Census Bureau world population estimate for June 2019 puts the global population at 7,577,130,400 people, which far exceeds the world population of 7.2 billion in 2015. Our estimate based on UN data shows the world's population surpassing 7.7 billion.
  • China is the most populous country in the world with a population exceeding 1.4 billion. It is one of just two countries with a population of more than 1 billion, with India being the second. As of 2018, India has a population of over 1.355 billion people, and its population growth is expected to continue through at least 2050. By the year 2030, India is expected to become the most populous country in the world. This is because India’s population will grow, while China is projected to see a loss in population.
  • The following 11 countries that are the most populous in the world each have populations exceeding 100 million. These include the United States, Indonesia, Brazil, Pakistan, Nigeria, Bangladesh, Russia, Mexico, Japan, Ethiopia, and the Philippines. Of these nations, all are expected to continue to grow except Russia and Japan, which will see their populations drop by 2030 before falling again significantly by 2050.
  • Many other nations have populations of at least one million, while there are also countries that have just thousands. The smallest population in the world can be found in Vatican City, where only 801 people reside.
  • In 2018, the world’s population growth rate was 1.12%. Every five years since the 1970s, the population growth rate has continued to fall. The world’s population is expected to continue to grow larger but at a much slower pace. By 2030, the population will exceed 8 billion. In 2040, this number will grow to more than 9 billion. In 2055, the number will rise to over 10 billion, and another billion people won’t be added until near the end of the century. The current annual population growth estimates from the United Nations are in the millions - estimating that over 80 million new lives are added yearly.
  • This population growth will be significantly affected by nine specific countries that are positioned to contribute to population growth more quickly than other nations. These nations include the Democratic Republic of the Congo, Ethiopia, India, Indonesia, Nigeria, Pakistan, Uganda, the United Republic of Tanzania, and the United States of America. Of particular interest, India is on track to overtake China's position as the most populous country by 2030. Additionally, multiple nations within Africa are expected to double their populations before fertility rates begin to slow.

Content

  • This dataset contains historical population data for every Country/Territory in the world, along with parameters such as Area Size of the Country/Territory, Name of the Continent, Name of the Capital, Density, Population Growth Rate, Ranking based on Population, World Population Percentage, etc.

Dataset Glossary (Column-Wise):

  • Rank: Rank by Population.
  • CCA3: 3-letter Country/Territories Code.
  • Country/Territories: Name of the Country/Territories.
  • Capital: Name of the Capital.
  • Continent: Name of the Continent.
  • 2022 Population: Population of the Country/Territories in the year 2022.
  • 2020 Population: Population of the Country/Territories in the year 2020.
  • 2015 Population: Population of the Country/Territories in the year 2015.
  • 2010 Population: Population of the Country/Territories in the year 2010.
  • 2000 Population: Population of the Country/Territories in the year 2000.
  • 1990 Population: Population of the Country/Territories in the year 1990.
  • 1980 Population: Population of the Country/Territories in the year 1980.
  • 1970 Population: Population of the Country/Territories in the year 1970.
  • Area (km²): Area size of the Country/Territories in square kilometers.
  • Density (per km²): Population Density per square kilometer.
  • Growth Rate: Population Growth Rate by Country/Territories.
  • World Population Percentage: The population percentage by each Country/Territories.
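Two of the glossary columns are simple derived quantities: density is population divided by area, and world population percentage is a country's share of the world total. The sketch below illustrates both with made-up rows; the country names and figures are hypothetical, and basing both values on the 2022 population is an assumption about how the dataset was built.

```python
def density_per_km2(population, area_km2):
    """Population density: people per square kilometer."""
    return population / area_km2


def world_population_percentage(population, world_population):
    """Share of the world total, expressed as a percentage."""
    return population / world_population * 100


# Hypothetical rows using the glossary's column names as keys.
rows = [
    {"Country/Territories": "Exampleland", "2022 Population": 1_000_000, "Area (km²)": 50_000},
    {"Country/Territories": "Samplestan", "2022 Population": 3_000_000, "Area (km²)": 60_000},
]

# In this toy example the "world" is just the two rows above.
world_total = sum(r["2022 Population"] for r in rows)

derived = {
    r["Country/Territories"]: (
        density_per_km2(r["2022 Population"], r["Area (km²)"]),
        world_population_percentage(r["2022 Population"], world_total),
    )
    for r in rows
}
print(derived)
```

Recomputing these columns from the raw population and area values is a quick consistency check before relying on the precomputed ones.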