51 datasets found
  1. United States Population

    • tradingeconomics.com
    • es.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States Population [Dataset]. https://tradingeconomics.com/united-states/population
    Available download formats: excel, xml, csv, json
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1900 - Dec 31, 2024
    Area covered
    United States
    Description

    The total population in the United States was estimated at 341.2 million people in 2024, according to the latest census figures and projections from Trading Economics. This dataset provides actual values, historical data, forecasts, charts, statistics, an economic calendar, and news for United States Population.

  2. Data release for Evidence of humans in North America during the Last Glacial...

    • catalog.data.gov
    • data.usgs.gov
    • +2 more
    Updated Jul 20, 2024
    Cite
    U.S. Geological Survey (2024). Data release for Evidence of humans in North America during the Last Glacial Maximum [Dataset]. https://catalog.data.gov/dataset/data-release-for-evidence-of-humans-in-north-america-during-the-last-glacial-maximum
    Dataset updated
    Jul 20, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    North America
    Description

    Archaeologists and researchers in allied fields have long sought to understand human colonization of North America. When, how, and from where did people migrate, and what were the consequences of their arrival for the established fauna and landscape are enduring questions. Here, we present evidence from excavated surfaces of in situ human footprints from White Sands National Park (New Mexico, USA), where multiple human footprints are stratigraphically constrained and bracketed by seed layers that yield calibrated 14C ages between ~23 and 21 ka. These findings confirm the presence of humans in North America during the Last Glacial Maximum, adding evidence to the antiquity of human colonization of the Americas and providing a temporal range extension for the coexistence of early inhabitants and Pleistocene megafauna.

  3. United States Age Group Population Dataset: A complete breakdown of United...

    • neilsberg.com
    csv, json
    Updated Sep 16, 2023
    Cite
    Neilsberg Research (2023). United States Age Group Population Dataset: A complete breakdown of United States age demographics from 0 to 85 years, distributed across 18 age groups [Dataset]. https://www.neilsberg.com/research/datasets/5fd2b2bb-3d85-11ee-9abe-0aa64bf2eeb2/
    Available download formats: json, csv
    Dataset updated
    Sep 16, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. We divided ages between 0 and 85 into roughly 5-year buckets and aggregated all ages over 85 into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the United States population distribution across 18 age groups. It lists the population in each age group along with each group's percentage of the total population of the United States. The dataset can be used to understand the population distribution of the United States by age. For example, using this dataset, we can identify the largest age group in the United States.

    Key observations

    The largest age group in the United States was 25-29 years, with a population of 22,854,328 (6.93%), according to the 2021 American Community Survey. The smallest age group was 80-84 years, with a population of 5,932,196 (1.80%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group in consideration
    • Population: The population for the specific age group in the United States is shown in this column.
    • % of Total Population: This column displays the population of each age group as a proportion of United States total population. Please note that the sum of all percentages may not equal one due to rounding of values.
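
    The "% of Total Population" column described above can be sketched as follows. This is only an illustration: the function name and the sample counts are hypothetical and are not taken from the dataset. It also shows why rounded shares may not add up exactly.

```python
from typing import Dict


def percent_of_total(population_by_group: Dict[str, int]) -> Dict[str, float]:
    """Return each group's share of the total, as a percentage rounded to 2 decimals."""
    total = sum(population_by_group.values())
    if total <= 0:
        raise ValueError("total population must be positive")
    return {
        group: round(100 * count / total, 2)
        for group, count in population_by_group.items()
    }


# Hypothetical counts, not values from the dataset.
sample = {"Under 5 years": 300, "5 to 9 years": 300, "10 to 14 years": 300}
shares = percent_of_total(sample)
print(shares)
# The rounded shares can sum to slightly less or more than 100.
print(sum(shares.values()))
```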

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for United States Population by Age. You can refer to the same here

  4. USA Name Data

    • kaggle.com
    zip
    Updated Feb 12, 2019
    Cite
    Data.gov (2019). USA Name Data [Dataset]. https://www.kaggle.com/datasets/datagov/usa-names
    Available download formats: zip (0 bytes)
    Dataset updated
    Feb 12, 2019
    Dataset provided by
    Data.gov (https://data.gov/)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Context

    Cultural diversity in the U.S. has led to great variations in names and naming traditions and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States

    Content

    This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.

    All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.

    Fork this kernel to get started with this dataset.

    Acknowledgements

    https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names

    https://cloud.google.com/bigquery/public-data/usa-names

    Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by @dcp from Unsplash.

    Inspiration

    What are the most common names?

    What are the most common female names?

    Are there more female or male names?

    Female names by a wide margin?
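
    The questions above could be explored along the following lines. This is a minimal sketch under stated assumptions: the (name, gender, count) record layout, the function names, and the sample rows are all hypothetical and do not reflect the actual table schema or data.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical record layout: (name, gender, count).
Record = Tuple[str, str, int]


def most_common_names(
    records: Iterable[Record], gender: Optional[str] = None, top: int = 3
) -> List[Tuple[str, int]]:
    """Sum counts per name (optionally for one gender) and return the top entries."""
    totals: Counter = Counter()
    for name, rec_gender, count in records:
        if gender is None or rec_gender == gender:
            totals[name] += count
    return totals.most_common(top)


def distinct_names_by_gender(records: Iterable[Record]) -> Dict[str, int]:
    """Count how many distinct names appear for each gender."""
    names: Dict[str, Set[str]] = {}
    for name, rec_gender, _count in records:
        names.setdefault(rec_gender, set()).add(name)
    return {g: len(s) for g, s in names.items()}


# Made-up rows for illustration only.
sample = [
    ("Mary", "F", 120),
    ("Anna", "F", 80),
    ("John", "M", 140),
    ("Mary", "F", 30),
    ("Emma", "F", 60),
]
print(most_common_names(sample))
print(most_common_names(sample, gender="F", top=2))
print(distinct_names_by_gender(sample))
```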

  5. U.S. population data for human identification markers

    • catalog.data.gov
    • s.cnmilf.com
    • +1 more
    Updated Jun 7, 2023
    Cite
    National Institute of Standards and Technology (2023). U.S. population data for human identification markers [Dataset]. https://catalog.data.gov/dataset/u-s-population-data-for-human-identification-markers
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Area covered
    United States
    Description

    The primary data consist of allele or haplotype frequencies for N=1036 anonymized U.S. population samples. Additional files are supplements to the associated publications. Any changes to spreadsheets are listed in the "Change Log" tab within each spreadsheet. DOI numbers for associated publications are listed below, under "References".

  6. Native American Multi-Year Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Native American Multi-Year Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-native-american
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Native American Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.

    Facial Image Data

    This dataset includes over 5,000 high-quality facial images, organized into individual participant sets, each containing:

    Historical Images: 22 facial images per participant captured across a span of 10 years
    Enrollment Image: One recent high-resolution facial image for reference or ground truth

    Diversity & Representation

    Geographic Coverage: Participants from the USA, Canada, Mexico, and other Native American regions
    Demographics: Individuals aged 18 to 70 years, with a gender distribution of 60% male and 40% female
    File Formats: All images are available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:

    Lighting Conditions: Images captured under various natural and artificial lighting setups
    Backgrounds: A wide range of indoor and outdoor backgrounds
    Device Quality: Captured using modern, high-resolution mobile devices for consistency and clarity

    Metadata

    Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:

    Unique participant ID
    File name
    Age at the time of image capture
    Gender
    Country of origin
    Demographic profile
    File format

    Use Cases & Applications

    This dataset is highly valuable for a wide range of AI and computer vision applications:

    Facial Recognition Systems: Train models for high-accuracy face matching across time
    KYC & Identity Verification: Improve time-spanning verification for banks, insurance, and government services
    Biometric Security Solutions: Build reliable identity authentication models
    Age Progression & Estimation Models: Train AI to predict aging patterns or estimate age from facial features
    Generative AI: Support creation and validation of synthetic age progression or longitudinal face generation

    Secure & Ethical Collection

    Platform: All data was securely collected and processed through FutureBeeAI’s proprietary systems
    Ethical Compliance: Full participant consent obtained with transparent communication of use cases
    Privacy-Protected: No personally identifiable information is included; all data is anonymized and handled with care

    Dataset Updates & Customization

    To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:


  7. Coronavirus (Covid-19) Data in the United States

    • nytimes.com
    • openicpsr.org
    • +2 more
    Cite
    New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
    Dataset provided by
    New York Times
    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
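
    Because the files hold cumulative counts, daily new cases have to be derived by differencing consecutive values. The sketch below illustrates that step; the function name and the sample series are hypothetical, and it assumes the counts are already ordered by date for a single state or county.

```python
from typing import List


def daily_new_cases(cumulative: List[int]) -> List[int]:
    """Convert a date-ordered cumulative series into per-day increments."""
    if not cumulative:
        return []
    # The first day's increment is the first cumulative value itself.
    daily = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        # A downward revision of the cumulative series yields a negative increment.
        daily.append(current - previous)
    return daily


# Made-up cumulative counts for illustration only.
print(daily_new_cases([1, 1, 2, 5, 5]))
```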

  8. census-bureau-usa

    • kaggle.com
    zip
    Updated May 18, 2020
    Cite
    Google BigQuery (2020). census-bureau-usa [Dataset]. https://www.kaggle.com/datasets/bigquery/census-bureau-usa
    Available download formats: zip (0 bytes)
    Dataset updated
    May 18, 2020
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Authors
    Google BigQuery
    Area covered
    United States
    Description

    Context

    The United States census count (also known as the Decennial Census of Population and Housing) is a count of every resident of the US. The census occurs every 10 years and is conducted by the United States Census Bureau. Census data is publicly available through the census website, but much of the data is available in summarized data and graphs. The raw data is often difficult to obtain, is typically divided by region, and it must be processed and combined to provide information about the nation as a whole.

    Update frequency: Historic (none)

    Dataset source

    United States Census Bureau

    Sample Query

    SELECT zipcode, population
    FROM `bigquery-public-data.census_bureau_usa.population_by_zip_2010`
    WHERE gender = ''
    ORDER BY population DESC
    LIMIT 10

    Terms of use

    This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/us-census-data

  9. human-vs-Ai-generated-dataset

    • huggingface.co
    Updated May 29, 2025
    Cite
    ahmadreza anaami (2025). human-vs-Ai-generated-dataset [Dataset]. https://huggingface.co/datasets/ahmadreza13/human-vs-Ai-generated-dataset
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 29, 2025
    Authors
    ahmadreza anaami
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    ahmadreza13/human-vs-Ai-generated-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. United States Census

    • kaggle.com
    zip
    Updated Apr 17, 2018
    Cite
    US Census Bureau (2018). United States Census [Dataset]. https://www.kaggle.com/census/census-bureau-usa
    Available download formats: zip (0 bytes)
    Dataset updated
    Apr 17, 2018
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    US Census Bureau
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Context

    The United States Census is a decennial census mandated by Article I, Section 2 of the United States Constitution, which states: "Representatives and direct Taxes shall be apportioned among the several States ... according to their respective Numbers."
    Source: https://en.wikipedia.org/wiki/United_States_Census

    Content

    The United States census count (also known as the Decennial Census of Population and Housing) is a count of every resident of the US. The census occurs every 10 years and is conducted by the United States Census Bureau. Census data is publicly available through the census website, but much of the data is available in summarized data and graphs. The raw data is often difficult to obtain, is typically divided by region, and it must be processed and combined to provide information about the nation as a whole.

    The United States census dataset includes nationwide population counts from the 2000 and 2010 censuses. Data is broken out by gender, age and location using ZIP Code Tabulation Areas (ZCTAs) and GEOIDs. ZCTAs are generalized representations of zip codes, and often, though not always, are the same as the zip code for an area. GEOIDs are numeric codes that uniquely identify all administrative, legal, and statistical geographic areas for which the Census Bureau tabulates data. GEOIDs are useful for correlating census data with other censuses and surveys.

    Fork this kernel to get started.

    Acknowledgements

    https://bigquery.cloud.google.com/dataset/bigquery-public-data:census_bureau_usa

    https://cloud.google.com/bigquery/public-data/us-census

    Dataset Source: United States Census Bureau

    Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by Steve Richey from Unsplash.

    Inspiration

    What are the ten most populous zip codes in the US in the 2010 census?

    What are the top 10 zip codes that experienced the greatest change in population between the 2000 and 2010 censuses?
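
    Both questions above come down to sorting. The sketch below works on in-memory dictionaries keyed by zip code; the function names, the placeholder codes, and the population figures are hypothetical and stand in for rows read from the 2000 and 2010 tables. "Greatest change" is interpreted here as the largest absolute difference, which is an assumption.

```python
from typing import Dict, List, Tuple


def top_populous(population: Dict[str, int], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n zip codes with the largest population."""
    return sorted(population.items(), key=lambda kv: kv[1], reverse=True)[:n]


def top_changes(
    pop_2000: Dict[str, int], pop_2010: Dict[str, int], n: int = 10
) -> List[Tuple[str, int]]:
    """Return the n zip codes with the largest absolute change between censuses."""
    # Only zip codes present in both censuses can be compared.
    common = pop_2000.keys() & pop_2010.keys()
    changes = {z: pop_2010[z] - pop_2000[z] for z in common}
    return sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# Placeholder codes and made-up counts for illustration only.
pop_2000 = {"A1": 1000, "B2": 5000, "C3": 300}
pop_2010 = {"A1": 1800, "B2": 4900, "C3": 300, "D4": 50}
print(top_populous(pop_2010, n=2))
print(top_changes(pop_2000, pop_2010, n=2))
```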

    https://cloud.google.com/bigquery/images/census-population-map.png

  11. Human-Like-DPO-Dataset

    • huggingface.co
    Updated May 19, 2024
    Cite
    Human-Like LLMs (2024). Human-Like-DPO-Dataset [Dataset]. https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 19, 2024
    Dataset authored and provided by
    Human-Like LLMs
    License

    https://choosealicense.com/licenses/llama3/

    Description

    Enhancing Human-Like Responses in Large Language Models

    Human-Like-DPO-Dataset

    This dataset was created as part of research aimed at improving conversational fluency and engagement in large language models. It is suitable for formats like Direct Preference Optimization (DPO) to guide models toward generating more human-like responses. The dataset includes 10,884 samples across 256 topics, including: Technology, Daily Life, Science… See the full description on the dataset page: https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset.

  12. United States of America: WOF Administrative Subdivisions and Human...

    • data.humdata.org
    • data.amerigeoss.org
    shp
    Updated Sep 1, 2025
    Cite
    Who's On First (2025). United States of America: WOF Administrative Subdivisions and Human Settlements [Dataset]. https://data.humdata.org/dataset/whosonfirst-data-admin-usa
    Available download formats: shp (464708199)
    Dataset updated
    Sep 1, 2025
    Dataset provided by
    Who's On First
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This dataset contains administrative polygons grouped by country (admin-0) with the following subdivisions according to Who's On First placetypes:
    - macroregion (admin-1 including region)
    - region (admin-2 including state, province, department, governorate)
    - macrocounty (admin-3 including arrondissement)
    - county (admin-4 including prefecture, sub-prefecture, regency, canton, commune)
    - localadmin (admin-5 including municipality, local government area, unitary authority, commune, suburb)

    The dataset also contains human settlement points and polygons for:
    - localities (city, town, and village)
    - neighbourhoods (borough, macrohood, neighbourhood, microhood)

    The dataset covers activities carried out by Who's On First (WOF) since 2015. Global administrative boundaries and human settlements are aggregated and standardized from hundreds of sources and available with an open CC-BY license. Who's On First data is updated on an as-needed basis for individual places, with annual sprints focused on improving specific countries or placetypes. Please refer to the README.md file for complete data source metadata. Refer to our blog post for an explanation of field names.

    Data corrections can be proposed using Write Field, a web app for making quick data edits. You’ll need a Github.com account to log in and propose edits, which are then reviewed by the Who's On First community using the Github pull request process. Approved changes are available for download within 24 hours. Please contact WOF admin about bulk edits.

  13. Data from: What We Eat In America (WWEIA) Database

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). What We Eat In America (WWEIA) Database [Dataset]. https://catalog.data.gov/dataset/what-we-eat-in-america-wweia-database-f7f35
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service
    Area covered
    United States
    Description

    What We Eat in America (WWEIA) is the dietary intake interview component of the National Health and Nutrition Examination Survey (NHANES). WWEIA is conducted as a partnership between the U.S. Department of Agriculture (USDA) and the U.S. Department of Health and Human Services (DHHS). Two days of 24-hour dietary recall data are collected through an initial in-person interview, and a second interview conducted over the telephone within three to 10 days. Participants are given three-dimensional models (measuring cups and spoons, a ruler, and two household spoons) and/or USDA's Food Model Booklet (containing drawings of various sizes of glasses, mugs, bowls, mounds, circles, and other measures) to estimate food amounts. WWEIA data are collected using USDA's dietary data collection instrument, the Automated Multiple-Pass Method (AMPM). The AMPM is a fully computerized method for collecting 24-hour dietary recalls either in-person or by telephone.

    For each 2-year data release cycle, the following dietary intake data files are available:

    • Individual Foods File: Contains one record per food for each survey participant. Foods are identified by USDA food codes. Each record contains information about when and where the food was consumed, whether the food was eaten in combination with other foods, amount eaten, and amounts of nutrients provided by the food.
    • Total Nutrient Intakes File: Contains one record per day for each survey participant. Each record contains daily totals of food energy and nutrient intakes, daily intake of water, intake day of week, total number of foods reported, and whether intake was usual, much more than usual or much less than usual. The Day 1 file also includes salt use in cooking and at the table; whether on a diet to lose weight or for other health-related reason and type of diet; and frequency of fish and shellfish consumption (examinees one year or older, Day 1 file only).

    DHHS is responsible for the sample design and data collection, and USDA is responsible for the survey’s dietary data collection methodology, maintenance of the databases used to code and process the data, and data review and processing. USDA also funds the collection and processing of Day 2 dietary intake data, which are used to develop variance estimates and calculate usual nutrient intakes.

    Resources in this dataset:

    Resource Title: What We Eat In America (WWEIA) main web page.
    File Name: Web Page, url: https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-human-nutrition-research-center/food-surveys-research-group/docs/wweianhanes-overview/
    Contains data tables, research articles, documentation data sets and more information about the WWEIA program. (Link updated 05/13/2020)
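
    The relationship between the two files described above (one record per food versus one record per participant per day) can be illustrated by rolling per-food records up into daily totals. This is only a sketch: the FoodRecord fields, the function name, and the sample values are hypothetical and do not correspond to the actual WWEIA variable names.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class FoodRecord(NamedTuple):
    """Hypothetical per-food record; real WWEIA files use different variable names."""
    participant_id: str
    day: int
    food_code: str
    energy_kcal: float


def daily_energy_totals(records: Iterable[FoodRecord]) -> Dict[Tuple[str, int], float]:
    """Sum food energy per (participant, intake day), one total per participant-day."""
    totals: Dict[Tuple[str, int], float] = defaultdict(float)
    for record in records:
        totals[(record.participant_id, record.day)] += record.energy_kcal
    return dict(totals)


# Made-up records for illustration only.
foods = [
    FoodRecord("P1", 1, "F100", 200.0),
    FoodRecord("P1", 1, "F200", 350.0),
    FoodRecord("P1", 2, "F100", 400.0),
    FoodRecord("P2", 1, "F300", 150.5),
]
print(daily_energy_totals(foods))
```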

  14. meta-shepherd-human-data

    • huggingface.co
    Updated Aug 23, 2023
    Cite
    Philipp Schmid (2023). meta-shepherd-human-data [Dataset]. https://huggingface.co/datasets/philschmid/meta-shepherd-human-data
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 23, 2023
    Authors
    Philipp Schmid
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for "meta-shepherd-human-data"

    Original Dataset: https://github.com/facebookresearch/Shepherd

    Example

    Question: Where on the planet would you expect a bald eagle to live?

    Here are the options: Option 1: colorado Option 2: outside Option 3: protection Option 4: zoo exhibit Option 5: world

    Please choose the correct option and justify your choice:

    Answer: Bald eagles are found throughout most of North America, from Alaska and Canada south to… See the full description on the dataset page: https://huggingface.co/datasets/philschmid/meta-shepherd-human-data.

  15. Total population worldwide 1950-2100

    • statista.com
    Updated Jul 28, 2025
    Cite
    Statista (2025). Total population worldwide 1950-2100 [Dataset]. https://www.statista.com/statistics/805044/total-population-worldwide/
    Dataset updated
    Jul 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038, and 10 billion in 2060, but it will peak around 10.3 billion in the 2080s before it then goes into decline.

    Regional variations

    The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure; however, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate).

    Changing projections

    The United Nations releases its World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections of when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s; however, an earlier and lower peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolonged development arc in Sub-Saharan Africa.

  16. COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    Updated Aug 12, 2023
    Cite
    data.ct.gov (2023). COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-by-race-ethnicity
    Dataset updated
    Aug 12, 2023
    Dataset provided by
    data.ct.gov
    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

    The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.

    The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6.

    The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22.

    The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada. To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.

    This dataset reports COVID-19 cases and associated deaths among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or the Department of Public Health (DPH) are included in the COVID-19 update.

    The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age adjustment is important in Connecticut because the median age among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age, which results in higher age-adjusted rates.

    The population data used to calculate rates are based on the CT DPH population statistics for 2019, which are available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used. Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf

    Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age-adjusted rates are calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic.

    Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records. 
Cause of death was determined by a death certifier (e.g., physician, APRN, medical
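
The crude and age-adjusted rates described above can be illustrated with a minimal sketch of direct standardization. It uses three hypothetical age groups and made-up counts rather than the 19 groups and the 2000 US standard population named in the description; the function names are illustrative and not part of the dataset.

```python
def crude_rate(events, population, per=100_000):
    """Total events divided by total population, scaled to `per` people."""
    return sum(events) / sum(population) * per


def age_adjusted_rate(events, population, standard_population, per=100_000):
    """Direct standardization: weight each age-specific rate by the
    share of the standard population that falls in that age group."""
    if not (len(events) == len(population) == len(standard_population)):
        raise ValueError("all inputs must cover the same age groups")
    standard_total = sum(standard_population)
    adjusted = 0.0
    for e, p, s in zip(events, population, standard_population):
        adjusted += (e / p) * (s / standard_total)
    return adjusted * per


# Hypothetical group that is older than the standard population.
events = [4, 30, 400]                        # deaths per age group (made up)
population = [20_000, 30_000, 50_000]        # group population per age group
standard = [400_000, 400_000, 200_000]       # standard population per age group

crude = crude_rate(events, population)
adjusted = age_adjusted_rate(events, population, standard)
print(round(crude, 1), round(adjusted, 1))
```

Because the hypothetical group is concentrated in the oldest age band, its age-adjusted rate comes out below its crude rate, which is the pattern the description reports for an older population.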

  17. Vital Signs: Migration - by county (simple)

    • data.bayareametro.gov
    csv, xlsx, xml
    Updated Dec 12, 2018
    + more versions
    Cite
    U.S. Census Bureau (2018). Vital Signs: Migration - by county (simple) [Dataset]. https://data.bayareametro.gov/dataset/Vital-Signs-Migration-by-county-simple-/qmud-33nk
    Explore at:
    Available download formats: csv, xml, xlsx
    Dataset updated
    Dec 12, 2018
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    U.S. Census Bureau
    Description

    VITAL SIGNS INDICATOR Migration (EQ4)

    FULL MEASURE NAME Migration flows

    LAST UPDATED December 2018

    DESCRIPTION Migration refers to the movement of people from one location to another, typically crossing a county or regional boundary. Migration captures both voluntary relocation – for example, moving to another region for a better job or lower home prices – and involuntary relocation as a result of displacement. The dataset includes metropolitan area, regional, and county tables.

    DATA SOURCE American Community Survey County-to-County Migration Flows 2012-2015 5-year rolling average http://www.census.gov/topics/population/migration/data/tables.All.html

    CONTACT INFORMATION vitalsigns.info@bayareametro.gov

    METHODOLOGY NOTES (across all datasets for this indicator) Data for migration comes from the American Community Survey; county-to-county flow datasets experience a longer lag time than other standard datasets available in FactFinder. 5-year rolling average data was used for migration for all geographies, as the Census Bureau does not release 1-year annual data. Data is not available at any geography below the county level; note that flows that are relatively small on the county level are often within the margin of error. The metropolitan area comparison was performed for the nine-county San Francisco Bay Area, in addition to the primary MSAs for the nine other major metropolitan areas, by aggregating county data based on current metropolitan area boundaries. Data prior to 2011 is not available on Vital Signs due to inconsistent Census formats and a lack of net migration statistics for prior years. Only counties with a non-negligible flow are shown in the data; all other pairs can be assumed to have zero migration.

    Given that the vast majority of migration out of the region was to other counties in California, California counties were bundled into the following regions for simplicity:
    • Bay Area: Alameda, Contra Costa, Marin, Napa, San Francisco, San Mateo, Santa Clara, Solano, Sonoma
    • Central Coast: Monterey, San Benito, San Luis Obispo, Santa Barbara, Santa Cruz
    • Central Valley: Fresno, Kern, Kings, Madera, Merced, Tulare
    • Los Angeles + Inland Empire: Imperial, Los Angeles, Orange, Riverside, San Bernardino, Ventura
    • Sacramento: El Dorado, Placer, Sacramento, Sutter, Yolo, Yuba
    • San Diego: San Diego
    • San Joaquin Valley: San Joaquin, Stanislaus
    • Rural: all other counties (23)
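
The regional bundling above can be expressed as a lookup table. The sketch below is a minimal illustration, assuming it is applied only to California county names; the `flows` figures are made up and the function names are not part of the Vital Signs dataset.

```python
from collections import Counter

# Region definitions copied from the methodology note above.
REGION_COUNTIES = {
    "Bay Area": ["Alameda", "Contra Costa", "Marin", "Napa", "San Francisco",
                 "San Mateo", "Santa Clara", "Solano", "Sonoma"],
    "Central Coast": ["Monterey", "San Benito", "San Luis Obispo",
                      "Santa Barbara", "Santa Cruz"],
    "Central Valley": ["Fresno", "Kern", "Kings", "Madera", "Merced", "Tulare"],
    "Los Angeles + Inland Empire": ["Imperial", "Los Angeles", "Orange",
                                    "Riverside", "San Bernardino", "Ventura"],
    "Sacramento": ["El Dorado", "Placer", "Sacramento", "Sutter", "Yolo", "Yuba"],
    "San Diego": ["San Diego"],
    "San Joaquin Valley": ["San Joaquin", "Stanislaus"],
}

# Invert the table so each county name points at its region.
COUNTY_TO_REGION = {
    county: region
    for region, counties in REGION_COUNTIES.items()
    for county in counties
}


def region_for(california_county):
    """Return the bundled region; unlisted California counties are 'Rural'."""
    return COUNTY_TO_REGION.get(california_county, "Rural")


def flows_by_region(county_flows):
    """Sum county-level migration flows into the bundled regions."""
    totals = Counter()
    for county, movers in county_flows.items():
        totals[region_for(county)] += movers
    return dict(totals)


# Hypothetical out-migration counts, for illustration only.
flows = {"Alameda": 120, "Fresno": 30, "Kern": 20, "Shasta": 5}
regional = flows_by_region(flows)
```

The seven named regions cover 35 counties, which leaves 23 of California's 58 counties in the Rural group, matching the count given in the note.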

    One key limitation of the American Community Survey migration data is that it is not able to track emigration (movement of current U.S. residents to other countries). This is despite the fact that it is able to quantify immigration (movement of foreign residents to the U.S.), generally by continent of origin. Thus the Vital Signs analysis focuses primarily on net domestic migration, while still specifically citing in-migration flows from countries abroad based on data availability.

  18. American Community Survey (ACS)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). American Community Survey (ACS) [Dataset]. http://doi.org/10.7910/DVN/DKI9L4
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    analyze the american community survey (acs) with r and monetdb experimental. think of the american community survey (acs) as the united states' census for off-years - the ones that don't end in zero. every year, one percent of all americans respond, making it the largest complex sample administered by the u.s. government (the decennial census has a much broader reach, but since it attempts to contact 100% of the population, it's not a survey). the acs asks how people live and although the questionnaire only includes about three hundred questions on demography, income, insurance, it's often accurate at sub-state geographies and - depending how many years pooled - down to small counties. households are the sampling unit, and once a household gets selected for inclusion, all of its residents respond to the survey. this allows household-level data (like home ownership) to be collected more efficiently and lets researchers examine family structure. the census bureau runs and finances this behemoth, of course.

    the downloadable american community survey ships as two distinct household-level and person-level comma-separated value (.csv) files. merging the two just rectangulates the data, since each person in the person-file has exactly one matching record in the household-file. for analyses of small, smaller, and microscopic geographic areas, choose one-, three-, or five-year pooled files. use as few pooled years as you can, unless you like sentences that start with, "over the period of 2006 - 2010, the average american ... [insert yer findings here]."

    rather than processing the acs public use microdata sample line-by-line, the r language brazenly reads everything into memory by default. to prevent overloading your computer, dr. thomas lumley wrote the sqlsurvey package principally to deal with this ram-gobbling monster. if you're already familiar with syntax used for the survey package, be patient and read the sqlsurvey examples carefully when something doesn't behave as you expect it to - some sqlsurvey commands require a different structure (i.e. svyby gets called through svymean) and others might not exist anytime soon (like svyolr).

    gimme some good news: sqlsurvey uses ultra-fast monetdb (click here for speed tests), so follow the monetdb installation instructions before running this acs code. monetdb imports, writes, recodes data slowly, but reads it hyper-fast. a magnificent trade-off: data exploration typically requires you to think, send an analysis command, think some more, send another query, repeat. importation scripts (especially the ones i've already written for you) can be left running overnight sans hand-holding.

    the acs weights generalize to the whole united states population including individuals living in group quarters, but non-residential respondents get an abridged questionnaire, so most (not all) analysts exclude records with a relp variable of 16 or 17 right off the bat.

    this new github repository contains four scripts:

    2005-2011 - download all microdata.R
    • create the batch (.bat) file needed to initiate the monet database in the future
    • download, unzip, and import each file for every year and size specified by the user
    • create and save household- and merged/person-level replicate weight complex sample designs
    • create a well-documented block of code to re-initiate the monet db server in the future

    fair warning: this full script takes a loooong time. run it friday afternoon, commune with nature for the weekend, and if you've got a fast processor and speedy internet connection, monday morning it should be ready for action. otherwise, either download only the years and sizes you need or - if you gotta have 'em all - run it, minimize it, and then don't disturb it for a week.

    2011 single-year - analysis examples.R
    • run the well-documented block of code to re-initiate the monetdb server
    • load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
    • perform the standard repertoire of analysis examples, only this time using sqlsurvey functions

    2011 single-year - variable recode example.R
    • run the well-documented block of code to re-initiate the monetdb server
    • copy the single-year 2011 table to maintain the pristine original
    • add a new age category variable by hand
    • add a new age category variable systematically
    • re-create then save the sqlsurvey replicate weight complex sample design on this new table
    • close everything, then load everything back up in a fresh instance of r
    • replicate a few of the census statistics. no muss, no fuss

    replicate census estimates - 2011.R
    • run the well-documented block of code to re-initiate the monetdb server
    • load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
    • match every nationwide statistic on the census bureau's estimates page, using sqlsurvey functions

    click here to view these four scripts

    for more detail about the american community survey (acs), visit: the us census...
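
The household/person merge described above can be sketched in a few lines. This is a Python illustration rather than the author's R and monetdb scripts, and it makes several assumptions: the two tiny inline extracts are made up, the shared key is taken to be a column named SERIALNO, and the other column names stand in for whatever the real files contain.

```python
import csv
import io

# Hypothetical miniature extracts; real ACS files have many more columns.
# SERIALNO as the shared household key is an assumption about the file layout.
HOUSEHOLD_CSV = """SERIALNO,TEN
1,1
2,3
"""
PERSON_CSV = """SERIALNO,SPORDER,AGEP
1,1,40
1,2,38
2,1,71
"""


def merge_person_household(person_file, household_file, key="SERIALNO"):
    """Attach household columns to every person row.

    Raises ValueError if a household appears twice or a person has no
    household, so the 'exactly one matching record' claim is checked.
    """
    households = {}
    for row in csv.DictReader(household_file):
        if row[key] in households:
            raise ValueError(f"duplicate household record for {key}={row[key]}")
        households[row[key]] = row

    merged = []
    for person in csv.DictReader(person_file):
        household = households.get(person[key])
        if household is None:
            raise ValueError(f"no household record for {key}={person[key]}")
        # Household columns first; person values win if a name clashes.
        merged.append({**household, **person})
    return merged


rows = merge_person_household(io.StringIO(PERSON_CSV), io.StringIO(HOUSEHOLD_CSV))
```

The result has one row per person, which is what "rectangulates" means here: the person count is unchanged and each row simply gains its household's columns.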

  19. Data for Spatial Accessibility to HIV (Human Immunodeficiency Virus)...

    • databank.illinois.edu
    Updated Aug 9, 2022
    Cite
    Jeon-Young Kang; Bita Fayaz Farkhad; Man-pui Sally Chan; Alexander Michels; Dolores Albarracin; Shaowen Wang (2022). Data for Spatial Accessibility to HIV (Human Immunodeficiency Virus) Testing, Treatment, and Prevention Services in Illinois and Chicago, USA [Dataset]. http://doi.org/10.13012/B2IDB-9096476_V1
    Dataset updated
    Aug 9, 2022
    Authors
    Jeon-Young Kang; Bita Fayaz Farkhad; Man-pui Sally Chan; Alexander Michels; Dolores Albarracin; Shaowen Wang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Illinois, United States, Chicago
    Dataset funded by
    U.S. National Science Foundation (NSF)
    U.S. National Institutes of Health (NIH)
    Description

    This dataset helps to investigate the Spatial Accessibility to HIV Testing, Treatment, and Prevention Services in Illinois and Chicago, USA. The main components are population data, healthcare data, GTFS feeds, and road network data:
    1) GTFS contains General Transit Feed Specification (GTFS) data provided by the Chicago Transit Authority (CTA) from Google's GTFS feeds. Documentation defining the format and structure of the files that comprise a GTFS dataset is available at https://developers.google.com/transit/gtfs/reference?csw=1.
    2) HealthCare contains shapefiles describing HIV healthcare providers in Chicago and Illinois respectively. The services come from Locator.HIV.gov.
    3) PopData contains population data for Chicago and Illinois respectively. Data come from the American Community Survey and AIDSVu. AIDSVu (https://map.aidsvu.org/map) provides data on PLWH in Chicago at the census tract level for the year 2017 and in the State of Illinois at the county level for the year 2016. The American Community Survey (ACS) provided the number of people aged 15 to 64 at the census tract level for the year 2017 and at the county level for the year 2016. The ACS provides annually updated information on demographic and socioeconomic characteristics of people and housing in the U.S.
    4) RoadNetwork contains the road networks for Chicago and Illinois respectively, obtained from OpenStreetMap using the Python osmnx package.

    The abstract for our paper is: Accomplishing the goals outlined in “Ending the HIV (Human Immunodeficiency Virus) Epidemic: A Plan for America Initiative” will require properly estimating and increasing access to HIV testing, treatment, and prevention services. In this research, a computational spatial method for estimating access was applied to measure distance to services from all points of a city or state while considering the size of the population in need for services as well as both driving and public transportation. Specifically, this study employed the enhanced two-step floating catchment area (E2SFCA) method to measure spatial accessibility to HIV testing, treatment (i.e., Ryan White HIV/AIDS program), and prevention (i.e., Pre-Exposure Prophylaxis [PrEP]) services. The method considered the spatial location of MSM (Men Who have Sex with Men), PLWH (People Living with HIV), and the general adult population 15-64 depending on what HIV services the U.S. Centers for Disease Control (CDC) recommends for each group. The study delineated service- and population-specific accessibility maps, demonstrating the method’s utility by analyzing data corresponding to the city of Chicago and the state of Illinois. Findings indicated health disparities in the south and the northwest of Chicago and particular areas in Illinois, as well as unique health disparities for public transportation compared to driving. The methodology details and computer code are shared for use in research and public policy.
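
The two steps of the E2SFCA method named in the abstract can be sketched generically. This is not the authors' code: the travel-time zones, their weights, and all input names and values are illustrative assumptions chosen only to show the shape of the calculation.

```python
# Assumed travel-time zones (minutes) and distance-decay weights.
# These values are illustrative, not taken from the dataset or the paper.
ZONES = ((10, 1.0), (20, 0.68), (30, 0.22))


def zone_weight(travel_minutes, zones=ZONES):
    """Weight for a trip: higher for nearer zones, 0 outside the catchment."""
    for limit, weight in zones:
        if travel_minutes <= limit:
            return weight
    return 0.0


def e2sfca(supply, demand, travel_time):
    """Enhanced two-step floating catchment area accessibility scores.

    supply:      {provider: capacity}
    demand:      {location: population in need}
    travel_time: {(location, provider): minutes}; missing pairs are unreachable
    """
    unreachable = float("inf")

    # Step 1: supply-to-demand ratio for each provider's weighted catchment.
    ratios = {}
    for provider, capacity in supply.items():
        weighted_demand = sum(
            people * zone_weight(travel_time.get((place, provider), unreachable))
            for place, people in demand.items()
        )
        ratios[provider] = capacity / weighted_demand if weighted_demand > 0 else 0.0

    # Step 2: each location sums the weighted ratios of providers in reach.
    return {
        place: sum(
            ratios[provider]
            * zone_weight(travel_time.get((place, provider), unreachable))
            for provider in supply
        )
        for place in demand
    }


# Hypothetical example: one clinic, two census tracts.
access = e2sfca(
    supply={"clinic_a": 2.0},
    demand={"tract_1": 100, "tract_2": 200},
    travel_time={("tract_1", "clinic_a"): 5, ("tract_2", "clinic_a"): 25},
)
```

In this toy case the nearer tract receives the clinic's full supply-to-demand ratio while the farther tract receives it discounted by the outer-zone weight, which is how the method separates driving from public-transport accessibility when the travel times differ by mode.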

  20. United States Nurses

    • tradingeconomics.com
    • jp.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Sep 12, 2023
    Cite
    TRADING ECONOMICS (2023). United States Nurses [Dataset]. https://tradingeconomics.com/united-states/nurses
    Explore at:
    Available download formats: json, xml, csv, excel
    Dataset updated
    Sep 12, 2023
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 31, 1999 - Dec 31, 2024
    Area covered
    United States
    Description

    Nurses in the United States increased to 12.71 per 1000 people in 2024 from 12.36 per 1000 people in 2023. This dataset includes a chart with historical data for the United States Nurses.

Cite
TRADING ECONOMICS, United States Population [Dataset]. https://tradingeconomics.com/united-states/population

United States Population

United States Population - Historical Dataset (1900-12-31/2024-12-31)

Explore at:
7 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: excel, xml, csv, json
Dataset authored and provided by
TRADING ECONOMICS
License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Time period covered
Dec 31, 1900 - Dec 31, 2024
Area covered
United States
Description

The total population in the United States was estimated at 341.2 million people in 2024, according to the latest census figures and projections from Trading Economics. This dataset provides - United States Population - actual values, historical data, forecast, chart, statistics, economic calendar and news.
