39 datasets found
  1. United States Age Group Population Dataset: A Complete Breakdown of United...

    • neilsberg.com
    csv, json
    Updated Jul 24, 2024
    Cite
    Neilsberg Research (2024). United States Age Group Population Dataset: A Complete Breakdown of United States Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aabf26b9-4983-11ef-ae5d-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. We divided ages below 85 into roughly 5-year buckets and aggregated all ages of 85 and over into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the United States population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of the United States. The dataset can be used to understand the population distribution of the United States by age. For example, using this dataset, we can identify the largest age group in the United States.

    Key observations

    The largest age group in the United States was 30 to 34 years, with a population of 22.71 million (6.86%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in the United States was 80 to 84 years, with a population of 6.25 million (1.89%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
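
    As a quick consistency check on the two figures quoted above, each group's count divided by its share should imply roughly the same national total. The following is a minimal Python sketch; the function name `implied_total_millions` is made up for illustration and is not part of the dataset.

```python
def implied_total_millions(population_millions: float, percent_of_total: float) -> float:
    """Back out the total population implied by one group's count and its share."""
    return population_millions / (percent_of_total / 100.0)


# Figures quoted in the key observations (ACS 2018-2022 5-Year Estimates).
from_largest = implied_total_millions(22.71, 6.86)   # 30 to 34 years
from_smallest = implied_total_millions(6.25, 1.89)   # 80 to 84 years

print(f"{from_largest:.1f} million vs {from_smallest:.1f} million")
```

    Both estimates land near 331 million; the small gap between them is what one would expect from rounding in the published values.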

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over
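
    The 18 labels above follow a regular pattern, so a raw age can be mapped to its group with a few lines of Python. This is only a sketch of the bucketing described in the measurement technique; the function name `age_group` is hypothetical.

```python
def age_group(age: int) -> str:
    """Map an age in whole years to one of the 18 age-group labels listed above."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 5:
        return "Under 5 years"
    if age >= 85:
        return "85 years and over"
    lower = (age // 5) * 5          # start of the 5-year bucket
    return f"{lower} to {lower + 4} years"


print(age_group(32))
print(age_group(90))
```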

    Variables / Data Columns

    • Age Group: This column displays the age group under consideration.
    • Population: This column shows the population of the specific age group in the United States.
    • % of Total Population: This column displays the population of each age group as a proportion of the total United States population. Please note that the sum of all percentages may not equal one due to rounding of values.
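
    The rounding caveat is easy to reproduce. The sketch below computes rounded percentage shares and shows that they need not add up exactly; the helper name `percent_shares` and the three equal groups are purely illustrative and are not values from the dataset.

```python
def percent_shares(populations: list[float], digits: int = 2) -> list[float]:
    """Return each value as a percentage of the total, rounded to `digits` places."""
    total = sum(populations)
    return [round(100.0 * p / total, digits) for p in populations]


# Three equal, purely illustrative groups: each share rounds to 33.33.
shares = percent_shares([1, 1, 1])
print(shares, sum(shares))
```

    Here the rounded shares sum to about 99.99 rather than 100, which is the effect the note warns about.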

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for United States Population by Age.

  2. Texas Population Breakdown by Gender and Age Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Texas Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e2049364-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Texas
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Texas by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Texas. The dataset can be used to understand the population distribution of Texas by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Texas. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group and to compare the male-to-female ratio across age groups in Texas.

    Key observations

    Largest age group (population): Male: 10-14 years (1.12 million) | Female: 10-14 years (1.08 million). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Texas population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population in Texas.
    • Population (Female): This column shows the female population in Texas.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Texas for each age group.
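
    The gender ratio column can be reproduced from the two population columns. As a minimal sketch, the Python below applies the "males per 100 females" definition to the 10-14 years figures quoted in the key observations; the function name `gender_ratio` is an assumption, and because the inputs are rounded to the nearest 0.01 million the result is only approximate.

```python
def gender_ratio(male_population: float, female_population: float) -> float:
    """Number of males per 100 females."""
    if female_population <= 0:
        raise ValueError("female_population must be positive")
    return 100.0 * male_population / female_population


# 10-14 years in Texas, in millions (ACS 2019-2023 5-Year Estimates).
ratio_10_to_14 = gender_ratio(1.12, 1.08)
print(f"{ratio_10_to_14:.1f} males per 100 females")
```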

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Texas Population by Gender.

  3. Survey of Consumer Finances

    • federalreserve.gov
    Updated Oct 18, 2023
    Cite
    Board of Governors of the Federal Reserve Board (2023). Survey of Consumer Finances [Dataset]. http://doi.org/10.17016/8799
    Dataset updated
    Oct 18, 2023
    Dataset provided by
    Federal Reserve Board of Governors
    Federal Reserve System: http://www.federalreserve.gov/
    Authors
    Board of Governors of the Federal Reserve Board
    Time period covered
    1962 - 2023
    Description

    The Survey of Consumer Finances (SCF) is normally a triennial cross-sectional survey of U.S. families. The survey data include information on families' balance sheets, pensions, income, and demographic characteristics.

  4. United States Money Supply M0

    • tradingeconomics.com
    • zh.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Oct 16, 2025
    Cite
    TRADING ECONOMICS (2025). United States Money Supply M0 [Dataset]. https://tradingeconomics.com/united-states/money-supply-m0
    Available download formats: json, excel, xml, csv
    Dataset updated
    Oct 16, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1959 - Oct 31, 2025
    Area covered
    United States
    Description

    Money Supply M0 in the United States increased to 53615000 USD Million in October from 5478000 USD Million in September of 2025. This dataset provides United States Money Supply M0 actual values, historical data, forecast, chart, statistics, economic calendar and news.

  5. Top 50 Companies By Revenue (USD Millions)

    • kaggle.com
    zip
    Updated Oct 24, 2024
    Cite
    Shubham Parihar 7 (2024). Top 50 Companies By Revenue (USD Millions) [Dataset]. https://www.kaggle.com/datasets/shubhamparihar7/top-50-companies-by-revenue-usd-millions
    Available download formats: zip (1638 bytes)
    Dataset updated
    Oct 24, 2024
    Authors
    Shubham Parihar 7
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Top 50 Companies by Revenue (USD Millions)

    Description: This dataset contains information on the largest companies in the world ranked by their revenue in USD millions. It includes key financial metrics and details about each company, making it a valuable resource for analysis and comparison.

    This list comprises the world's largest companies by consolidated revenue, according to the Fortune Global 500 2024 rankings and other sources. American retail corporation Walmart has been the world's largest company by revenue since 2014. The list is limited to the largest 50 companies, all of which have annual revenues exceeding US$130 billion. This list is incomplete, as not all companies disclose their information to the media or general public. Out of 50 largest companies 23 are American, 17 Asian and 10 European.

    Features:

    • Rank: The rank of the company based on its revenue.
    • Company_Name: The name of the company.
    • Industry: The industry in which the company operates.
    • Revenue (USD Millions): The total revenue of the company in millions of USD.
    • Profit (USD Millions): The total profit of the company in millions of USD.
    • Number of Employees: The total number of employees working for the company.
    • Headquarters: The country where the company's headquarters is located.

    Source: The data has been sourced from the Wikipedia page on List of Largest Companies by Revenue.

    Usage: This dataset can be used for various analyses, including:

    • Financial performance comparisons across industries.
    • Visualization of the largest global companies.
    • Insights into employment statistics in relation to revenue.
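
    As one way to look at employment in relation to revenue, the sketch below computes revenue per employee. It assumes the CSV headers match the feature names listed above and that the numeric fields contain plain digits; the two rows are placeholders and are not taken from the dataset.

```python
import csv
import io

# Placeholder rows, NOT real dataset values; only the column names follow the feature list.
SAMPLE_CSV = """Company_Name,Revenue (USD Millions),Number of Employees
Example A,200000,100000
Example B,150000,300000
"""


def revenue_per_employee(csv_text: str) -> dict[str, float]:
    """Map each company name to its revenue per employee, in USD."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        revenue_usd = float(row["Revenue (USD Millions)"]) * 1_000_000
        employees = int(row["Number of Employees"])
        result[row["Company_Name"]] = revenue_usd / employees
    return result


per_head = revenue_per_employee(SAMPLE_CSV)
for company, value in per_head.items():
    print(f"{company}: {value:,.0f} USD per employee")
```

    If the real file formats numbers with thousands separators or currency symbols, those would need to be stripped before conversion.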

    Beginner-Friendly: This dataset is suitable for beginners looking to practice data analysis, data visualization, and financial comparisons. It provides a straightforward structure with easily understandable features, making it an excellent starting point for those new to data science.

  6. Small Business Contact Data | North American Small Business Owners |...

    • datarade.ai
    Updated Oct 27, 2021
    Cite
    Success.ai (2021). Small Business Contact Data | North American Small Business Owners | Verified Contact Details from 170M Profiles | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/small-business-contact-data-north-american-small-business-o-success-ai
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    Success.ai
    Area covered
    Saint Pierre and Miquelon, Greenland, Honduras, Mexico, United States of America, Guatemala, Panama, Belize, Bermuda, Costa Rica
    Description

    Access B2B Contact Data for North American Small Business Owners with Success.ai—your go-to provider for verified, high-quality business datasets. This dataset is tailored for businesses, agencies, and professionals seeking direct access to decision-makers within the small business ecosystem across North America. With over 170 million professional profiles, it’s an unparalleled resource for powering your marketing, sales, and lead generation efforts.

    Key Features of the Dataset:

    Verified Contact Details

    Includes accurate and up-to-date email addresses and phone numbers to ensure you reach your targets reliably.

    AI-validated for 99% accuracy, eliminating errors and reducing wasted efforts.

    Detailed Professional Insights

    Comprehensive data points include job titles, skills, work experience, and education to enable precise segmentation and targeting.

    Enriched with insights into decision-making roles, helping you connect directly with small business owners, CEOs, and other key stakeholders.

    Business-Specific Information

    Covers essential details such as industry, company size, location, and more, enabling you to tailor your campaigns effectively. Ideal for profiling and understanding the unique needs of small businesses.

    Continuously Updated Data

    Our dataset is maintained and updated regularly to ensure relevance and accuracy in fast-changing market conditions. New business contacts are added frequently, helping you stay ahead of the competition.

    Why Choose Success.ai?

    At Success.ai, we understand the critical importance of high-quality data for your business success. Here’s why our dataset stands out:

    Tailored for Small Business Engagement: Focused specifically on North American small business owners, this dataset is an invaluable resource for building relationships with SMEs (Small and Medium Enterprises). Whether you’re targeting startups, local businesses, or established small enterprises, our dataset has you covered.

    Comprehensive Coverage Across North America: Spanning the United States, Canada, and Mexico, our dataset ensures wide-reaching access to verified small business contacts in the region.

    Categories Tailored to Your Needs: Includes highly relevant categories such as Small Business Contact Data, CEO Contact Data, B2B Contact Data, and Email Address Data to match your marketing and sales strategies.

    Customizable and Flexible: Choose from a wide range of filtering options to create datasets that meet your exact specifications, including filtering by industry, company size, geographic location, and more.

    Best Price Guaranteed: We pride ourselves on offering the most competitive rates without compromising on quality. When you partner with Success.ai, you receive superior data at the best value.

    Seamless Integration: Delivered in formats that integrate effortlessly with your CRM, marketing automation, or sales platforms, so you can start acting on the data immediately.

    Use Cases: This dataset empowers you to:

    • Drive Sales Growth: Build and refine your sales pipeline by connecting directly with decision-makers in small businesses.
    • Optimize Marketing Campaigns: Launch highly targeted email and phone outreach campaigns with verified contact data.
    • Expand Your Network: Leverage the dataset to build relationships with small business owners and other key figures within the B2B landscape.
    • Improve Data Accuracy: Enhance your existing databases with verified, enriched contact information, reducing bounce rates and increasing ROI.

    Industries Served: Whether you're in B2B SaaS, digital marketing, consulting, or any field requiring accurate and targeted contact data, this dataset serves industries of all kinds. It is especially useful for professionals focused on:

    • Lead Generation
    • Business Development
    • Market Research
    • Sales Outreach
    • Customer Acquisition

    What’s Included in the Dataset: Each profile provides:

    • Full Name
    • Verified Email Address
    • Phone Number (where available)
    • Job Title
    • Company Name
    • Industry
    • Company Size
    • Location
    • Skills and Professional Experience
    • Education Background

    With over 170 million profiles, you can tap into a wealth of opportunities to expand your reach and grow your business.

    Why High-Quality Contact Data Matters: Accurate, verified contact data is the foundation of any successful B2B strategy. Reaching small business owners and decision-makers directly ensures your message lands where it matters most, reducing costs and improving the effectiveness of your campaigns. By choosing Success.ai, you ensure that every contact in your pipeline is a genuine opportunity.

    Partner with Success.ai for Better Data, Better Results: Success.ai is committed to delivering premium-quality B2B data solutions at scale. With our small business owner dataset, you can unlock the potential of North America's dynamic small business market.

    Get Started Today Request a sample or customize your dataset to fit your unique...

  7. Diabetes Health Indicators

    • kaggle.com
    zip
    Updated Mar 7, 2025
    Cite
    Siamak Tahmasbi (2025). Diabetes Health Indicators [Dataset]. https://www.kaggle.com/datasets/siamaktahmasbi/diabetes-health-indicators
    Available download formats: zip (4413929 bytes)
    Dataset updated
    Mar 7, 2025
    Authors
    Siamak Tahmasbi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Diabetes is one of the most prevalent chronic diseases in the United States, affecting millions of Americans each year and placing a substantial financial burden on the economy. It is a serious chronic condition in which the body loses the ability to effectively regulate blood glucose levels, leading to a reduced quality of life and decreased life expectancy. During digestion, food is broken down into sugars, which enter the bloodstream. This triggers the pancreas to release insulin, a hormone that helps cells in the body use these sugars for energy. Diabetes is typically characterized by either insufficient insulin production or the body's inability to use insulin effectively.

    Chronic high blood sugar levels in individuals with diabetes can lead to severe complications, including heart disease, vision loss, kidney disease, and lower-limb amputation. Although there is no cure for diabetes, strategies such as maintaining a healthy weight, eating a balanced diet, staying physically active, and receiving medical treatments can help mitigate its effects. Early diagnosis is crucial, as it allows for lifestyle modifications and more effective treatment, making predictive models for assessing diabetes risk valuable tools for public health officials.

    The scale of the diabetes epidemic is significant. According to the Centers for Disease Control and Prevention (CDC), as of 2018, approximately 34.2 million Americans have diabetes, while 88 million have prediabetes. Alarmingly, the CDC estimates that 1 in 5 individuals with diabetes and about 8 in 10 individuals with prediabetes are unaware of their condition. Type II diabetes is the most common form, and its prevalence varies based on factors such as age, education, income, geographic location, race, and other social determinants of health. The burden of diabetes disproportionately affects those with lower socioeconomic status. The economic impact is also substantial, with the cost of diagnosed diabetes reaching approximately $327 billion annually, and total costs, including undiagnosed diabetes and prediabetes, nearing $400 billion each year.
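
    To put the awareness fractions into absolute terms, the short Python calculation below applies them to the CDC counts quoted above. It is simple arithmetic on the stated figures; the variable names are chosen here for illustration.

```python
# CDC figures quoted above (as of 2018), in millions of people.
diabetes_millions = 34.2
prediabetes_millions = 88.0

# "1 in 5" with diabetes and "about 8 in 10" with prediabetes are unaware.
unaware_diabetes = diabetes_millions * 1 / 5
unaware_prediabetes = prediabetes_millions * 8 / 10

print(f"Unaware of diabetes: about {unaware_diabetes:.2f} million")
print(f"Unaware of prediabetes: about {unaware_prediabetes:.1f} million")
```

    That works out to roughly 6.8 million people with undiagnosed diabetes and roughly 70 million unaware of their prediabetes.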

    Content

    The Behavioral Risk Factor Surveillance System (BRFSS) is a health-related telephone survey conducted annually by the CDC. Each year, the survey collects responses from over 400,000 Americans on health-related risk behaviors, chronic health conditions, and the use of preventative services. It has been conducted every year since 1984. For this project, an XPT file of the 2023 dataset available on the CDC website was used. This original dataset contains responses from 433,323 individuals and has 345 features. These features are either questions directly asked of participants, or calculated variables based on individual participant responses.

    I have selected 20 features from this dataset that are suitable for working on the topic of diabetes, and I have saved them in a CSV file without making any changes to the data. The goal of this is to make it easier to work with the data. For more information or to access updated data, you can refer to the CDC website. I initially examined the original dataset from the CDC and found no duplicate entries. That dataset contains 330 columns and features. Therefore, the duplicate cases in this dataset are not due to errors but rather represent individuals with similar conditions. In my opinion, removing these entries would both introduce errors and reduce accuracy.
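
    Readers who want to see how many repeated rows the reduced file contains before deciding whether to keep them could count them along these lines. This is a sketch only: the helper name `duplicate_summary` is hypothetical, and the four sample rows are invented three-value stand-ins for the 20 selected features.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (number of distinct repeated patterns, number of extra copies)."""
    counts = Counter(tuple(row) for row in rows)
    repeated = {pattern: n for pattern, n in counts.items() if n > 1}
    extra_copies = sum(n - 1 for n in repeated.values())
    return len(repeated), extra_copies


# Invented rows for illustration: the first pattern appears three times.
sample_rows = [(1, 0, 30), (1, 0, 30), (0, 1, 25), (1, 0, 30)]
print(duplicate_summary(sample_rows))
```

    With only a subset of columns retained, different respondents can share an identical answer pattern, which is why such repeats are not necessarily errors.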

    Explore some of the following research questions:

    • Can survey questions from the BRFSS provide accurate predictions of whether an individual has diabetes?
    • What risk factors are most predictive of diabetes risk?
    • Can we use a subset of the risk factors to accurately predict whether an individual has diabetes?
    • Can we create a short form of questions from the BRFSS using feature selection to accurately predict if someone might have diabetes or is at high risk of diabetes?

    Acknowledgements

    It is important to reiterate that I did not create this dataset; it is simply a summarized and reformatted dataset derived from the BRFSS 2023 dataset available on the CDC website. It is also worth noting that none of the data in this dataset discloses individuals' identities.

    Inspiration

    Zidian Xie et al., for Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques using the 2014 BRFSS, and Alex Teboul, for building the Diabetes Health Indicators dataset based on BRFSS 2015, were the inspiration for creating this dataset and exploring the BRFSS in general.

  8. Data from: Large Landing Trajectory Data Set for Go-Around Analysis

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1 more
    Updated Dec 16, 2022
    Cite
    Raphael Monstein; Benoit Figuet; Timothé Krauth; Manuel Waltert; Marcel Dettling (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7148116
    Dataset updated
    Dec 16, 2022
    Dataset provided by
    ZHAW
    Authors
    Raphael Monstein; Benoit Figuet; Timothé Krauth; Manuel Waltert; Marcel Dettling
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A large data set of go-arounds, also referred to as missed approaches. The data set supports the paper presented at the OpenSky Symposium on November the 10th.

    If you use this data for a scientific publication, please consider citing our paper.

    The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:

    go_arounds_minimal.csv.gz

    Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:

    Column name (type): description

    • time (date time): UTC time of landing or first GA attempt
    • icao24 (string): Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    • callsign (string): Aircraft identifier in air-ground communications
    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • has_ga (string): "True" if at least one GA was performed, otherwise "False"
    • n_approaches (integer): Number of approaches identified for this flight
    • n_rwy_approached (integer): Number of unique runways approached by this flight

    The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These usually have a large number of approaches, so an easy way to exclude them is to drop flights with n_approaches > 2.
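
    The filter described above can be sketched in Python as follows. The sample rows and airport codes are invented, the CSV headers are assumed to match the documented column names exactly, and expressing the rate per 1000 landings mirrors the ga_rate column of the aggregated file. For the real data one would pass a handle from gzip.open("go_arounds_minimal.csv.gz", "rt") instead of the in-memory sample.

```python
import csv
import io

# Invented sample using the documented column names (not real landings).
SAMPLE = """airport,runway,has_ga,n_approaches,n_rwy_approached
AAAA,09,False,1,1
AAAA,09,True,2,1
AAAA,27,True,7,2
BBBB,18,False,1,1
"""


def load_regular_landings(handle, max_approaches=2):
    """Keep landings with at most `max_approaches` approaches,
    dropping likely training and calibration flights."""
    kept = []
    for row in csv.DictReader(handle):
        if int(row["n_approaches"]) > max_approaches:
            continue
        # has_ga is stored as the strings "True" / "False".
        row["has_ga"] = row["has_ga"] == "True"
        kept.append(row)
    return kept


def ga_rate_per_1000(landings):
    """Go-arounds per 1000 landings."""
    if not landings:
        return 0.0
    return 1000.0 * sum(r["has_ga"] for r in landings) / len(landings)


landings = load_regular_landings(io.StringIO(SAMPLE))
print(len(landings), ga_rate_per_1000(landings))
```

    In this toy sample the flight with seven approaches is dropped, leaving three landings of which one is a go-around.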

    go_arounds_augmented.csv.gz

    Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:

    Column name (type): description

    • time (date time): UTC time of landing or first GA attempt
    • icao24 (string): Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    • callsign (string): Aircraft identifier in air-ground communications
    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • has_ga (string): "True" if at least one GA was performed, otherwise "False"
    • n_approaches (integer): Number of approaches identified for this flight
    • n_rwy_approached (integer): Number of unique runways approached by this flight
    • registration (string): Aircraft registration
    • typecode (string): Aircraft ICAO typecode
    • icaoaircrafttype (string): ICAO aircraft type
    • wtc (string): ICAO wake turbulence category
    • glide_slope_angle (float): Angle of the ILS glide slope in degrees
    • has_intersection (string): Boolean that is true if the runway has another runway intersecting it, otherwise false
    • rwy_length (float): Length of the runway in kilometres
    • airport_country (string): ISO Alpha-3 country code of the airport
    • airport_region (string): Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    • operator_country (string): ISO Alpha-3 country code of the operator
    • operator_region (string): Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
    • wind_speed_knts (integer): METAR, surface wind speed in knots
    • wind_dir_deg (integer): METAR, surface wind direction in degrees
    • wind_gust_knts (integer): METAR, surface wind gust speed in knots
    • visibility_m (float): METAR, visibility in m
    • temperature_deg (integer): METAR, temperature in degrees Celsius
    • press_sea_level_p (float): METAR, sea level pressure in hPa
    • press_p (float): METAR, QNH in hPa
    • weather_intensity (list): METAR, list of present weather codes: qualifier - intensity
    • weather_precipitation (list): METAR, list of present weather codes: weather phenomena - precipitation
    • weather_desc (list): METAR, list of present weather codes: qualifier - descriptor
    • weather_obscuration (list): METAR, list of present weather codes: weather phenomena - obscuration
    • weather_other (list): METAR, list of present weather codes: weather phenomena - other

    This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft database, the METAR information is from Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.

    go_arounds_agg.csv.gz

    Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

        Column name
        Type
        Description
    
    
    
    
        airport
        string
        ICAO airport code where the aircraft is landing
    
    
        runway
        string
        Runway designator on which the aircraft landed
    
    
        n_landings
        integer
        Total number of landings observed on this runway in 2019
    
    
        ga_rate
        float
        Go-around rate, per 1000 landings
    
    
        glide_slope_angle
        float
        Angle of the ILS glide slope in degrees
    
    
        has_intersection
        string
        Boolean that is true if the runway has another runway intersecting it, otherwise false
    
    
        rwy_length
        float
        Length of the runway in kilometres
    
    
        airport_country
        string
        ISO Alpha-3 country code of the airport
    
    
        airport_region
        string
        Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    

    This aggregated data set is used in the paper for the generalized linear regression model.
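    The relationship between per-landing rows and the aggregated columns n_landings and ga_rate can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that every row (landing or go-around) counts as one observed landing, and the function name aggregate_go_arounds and the sample rows are hypothetical.

```python
from collections import defaultdict


def aggregate_go_arounds(rows):
    """Aggregate per-landing rows into per-(airport, runway) statistics.

    Each row is a dict with the keys "airport", "runway" and "has_ga"
    (True when the approach ended in a go-around).
    Assumption: every row counts as one observed landing.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [n_landings, n_go_arounds]
    for row in rows:
        key = (row["airport"], row["runway"])
        counts[key][0] += 1
        counts[key][1] += 1 if row["has_ga"] else 0

    # ga_rate is expressed per 1000 landings, as in the column description.
    return {
        key: {"n_landings": n, "ga_rate": 1000.0 * n_ga / n}
        for key, (n, n_ga) in counts.items()
    }


# Hypothetical sample rows, not taken from the data set.
sample = [
    {"airport": "EGLC", "runway": "09", "has_ga": False},
    {"airport": "EGLC", "runway": "09", "has_ga": True},
    {"airport": "EGLC", "runway": "09", "has_ga": False},
    {"airport": "EGLC", "runway": "09", "has_ga": False},
    {"airport": "EGLC", "runway": "27", "has_ga": False},
]
result = aggregate_go_arounds(sample)
```

    With these made-up rows, one go-around in four landings on runway 09 corresponds to a rate of 250 per 1000 landings.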

    Downloading the trajectories

    Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical database with a few lines of Python code. For example, to get all the go-arounds on 4 January 2019 at London City Airport (EGLC), you can use the Traffic library for easy access to the database:

    import datetime
    from tqdm.auto import tqdm
    import pandas as pd
    from traffic.data import opensky
    from traffic.core import Traffic

    # load minimum data set
    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
    df["time"] = pd.to_datetime(df["time"])

    # select London City Airport, go-arounds, and 2019-01-04
    airport = "EGLC"
    start = datetime.datetime(year=2019, month=1, day=4).replace(
        tzinfo=datetime.timezone.utc
    )
    stop = datetime.datetime(year=2019, month=1, day=5).replace(
        tzinfo=datetime.timezone.utc
    )

    df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

    # iterate over flights and pull the data from OpenSky Network
    flights = []
    delta_time = pd.Timedelta(minutes=10)
    for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]):
        # take at most 10 minutes before and 10 minutes after the landing or go-around
        start_time = row["time"] - delta_time
        stop_time = row["time"] + delta_time

        # fetch the data from OpenSky Network
        flights.append(
            opensky.history(
                start=start_time.strftime("%Y-%m-%d %H:%M:%S"),
                stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"),
                callsign=row["callsign"],
                return_flight=True,
            )
        )

    # The flights can be converted into a Traffic object
    Traffic.from_flights(flights)

    Additional files

    Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:

    validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.

    validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.

  9. United States Government Debt

    • tradingeconomics.com
    • es.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Nov 15, 2025
    Cite
    TRADING ECONOMICS (2025). United States Government Debt [Dataset]. https://tradingeconomics.com/united-states/government-debt
    Explore at:
    Available download formats: csv, excel, json, xml
    Dataset updated
    Nov 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1942 - Oct 31, 2025
    Area covered
    United States
    Description

    Government Debt in the United States increased to 38040094 USD Million in October 2025 from 37637553 USD Million in September 2025. This dataset provides United States Government Debt actual values, historical data, forecasts, charts, statistics, an economic calendar, and news.

  10. 🌊 US Water Quality: 20+ Years of PFAS Monitoring

    • kaggle.com
    zip
    Updated Nov 13, 2024
    Cite
    Anudeep Adiraju (2024). 🌊 US Water Quality: 20+ Years of PFAS Monitoring [Dataset]. https://www.kaggle.com/datasets/anudeepadiraju/ucmr-1-5-combined-csv-data
    Explore at:
    Available download formats: zip (42971180 bytes)
    Dataset updated
    Nov 13, 2024
    Authors
    Anudeep Adiraju
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    UCMR Historical PFAS Contamination Dataset (2001-2024)

    Context

    This dataset contains comprehensive monitoring data of Per- and Polyfluoroalkyl Substances (PFAS) and other contaminants in U.S. public water systems, collected under the EPA's Unregulated Contaminant Monitoring Rule (UCMR) program from 2001 to 2024. The data represents a critical resource for understanding the prevalence and patterns of PFAS contamination in drinking water across different regions and time periods.

    Content

    The dataset combines results from multiple UCMR monitoring cycles (UCMR 1-5) and includes over 4 million observations of various contaminants, with a particular focus on PFAS compounds. Each record represents a single analytical measurement at a public water system.

    File Descriptions

    • combined_ucmr_data.csv (4,082,839 rows × 24 columns)

    Features Description

    Key Fields:

    • PWSID: Public Water System Identification number (string)
    • PWSName: Name of the Public Water System
    • Size: Size category of the water system (L: >10,000, S: ≤10,000 people served)
    • FacilityID: Unique identifier for the facility
    • FacilityName: Name of the facility
    • FacilityWaterType: Source water type
        • GW: Ground Water
        • SW: Surface Water
        • GU: Ground Water Under Direct Influence of Surface Water
        • MX: Mixed Water Types
    • SamplePointID: Unique identifier for the sampling location
    • SamplePointName: Description of the sampling location
    • SamplePointType: Type of sampling point (e.g., EP: Entry Point to distribution system)
    • CollectionDate: Date of sample collection
    • Contaminant: Name of the contaminant analyzed
    • MRL: Minimum Reporting Level in μg/L
    • Units: Measurement units (typically μg/L)
    • MethodID: EPA analytical method used
    • AnalyticalResultsSign: < for less than MRL, = for detected values
    • AnalyticalResultValue: Numerical result of the analysis
    • SampleEventCode: Sampling event identifier (SE1, SE2, SE3, SE4)
    • MonitoringRequirement: Type of monitoring (AM: Assessment Monitoring)
    • Region: EPA Region number (1-10)
    • State: Two-letter state code

    Key PFAS Compounds Monitored:

    • PFOA (Perfluorooctanoic acid)
    • PFOS (Perfluorooctanesulfonic acid)
    • PFHxS (Perfluorohexanesulfonic acid)
    • PFNA (Perfluorononanoic acid)
    • PFBS (Perfluorobutanesulfonic acid)
    • HFPO-DA (GenX chemicals)
    • And many others (29 PFAS compounds in total)

    Use Cases

    This dataset is valuable for:

    1. Environmental Science: Analyzing trends in PFAS contamination over time
    2. Public Health Research: Identifying areas with elevated PFAS levels
    3. Machine Learning:
        • Predicting future PFAS levels
        • Identifying patterns in contamination spread
        • Analyzing geographical and temporal trends
    4. Policy Analysis: Informing water quality regulations and standards

    Challenges in the Dataset

    1. Missing Values: Results below MRL are indicated with '<' sign
    2. Mixed Data Types: Combination of numeric and categorical variables
    3. Temporal Gaps: Different monitoring cycles with varying sampling frequencies
    4. Regional Variations: Inconsistent coverage across different regions
    5. Multiple Contaminants: Need to handle multiple PFAS compounds simultaneously
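    The first challenge above, results below the MRL being marked with a '<' sign, can be handled by separating detections from non-detects before any numeric analysis. The sketch below is one possible approach under stated assumptions: the helper names parse_measurement and detection_frequency are hypothetical, the sample records are made up, and non-detects are simply flagged rather than replaced by a substitute value.

```python
def parse_measurement(sign, value):
    """Return (detected, concentration) for one measurement.

    sign follows the AnalyticalResultsSign convention described above:
    "<" means below the MRL (non-detect), "=" means a detected value.
    """
    sign = sign.strip()
    if sign == "<":
        # Below the reporting level: keep the flag, do not invent a number.
        return (False, None)
    if sign == "=":
        return (True, float(value))
    raise ValueError(f"unexpected AnalyticalResultsSign: {sign!r}")


def detection_frequency(records):
    """Fraction of (sign, value) records that are reported as detected."""
    parsed = [parse_measurement(sign, value) for sign, value in records]
    if not parsed:
        raise ValueError("no records given")
    return sum(1 for detected, _ in parsed if detected) / len(parsed)


# Hypothetical records, not taken from the data set.
records = [("<", ""), ("=", "0.0052"), ("<", ""), ("=", "0.0110")]
frequency = detection_frequency(records)
```

    Keeping non-detects as a separate flag avoids treating them as zeros or as missing at random; whether to substitute a value for them later is an analysis choice the dataset description does not prescribe.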

    Citation

    Data sourced from EPA's UCMR program. When using this dataset, please cite:

    • EPA UCMR Program (https://www.epa.gov/dwucmr)
    • UCMR Data Files (2001-2024)

    Acknowledgments

    Special thanks to:

    • EPA for making this data publicly available
    • Public Water Systems for collecting and reporting the data
    • Environmental laboratories for analyzing the samples

    Inspiration

    1. Can we predict PFAS levels for 2024 based on historical trends?
    2. How do PFAS contamination patterns vary by region and water source type?
    3. What correlations exist between different PFAS compounds?
    4. How effective are current detection methods and reporting limits?
    5. Can we identify high-risk areas for future contamination?

    Additional Resources

  11. Reddit users in the United States 2019-2028

    • statista.com
    Updated Jul 30, 2025
    Cite
    Statista Research Department (2025). Reddit users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
    Explore at:
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Statista http://statista.com/
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of Reddit users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 10.3 million users (+5.21 percent). After the ninth consecutive year of growth, the Reddit user base is estimated to reach 208.12 million users in 2028, a new peak. Notably, the number of Reddit users has increased continuously over the past years. User figures, shown here for the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Reddit users in countries like Mexico and Canada.

  12. Small Business Contact Data | North American Entrepreneurs | Verified...

    • datarade.ai
    Updated Feb 12, 2018
    Cite
    Success.ai (2018). Small Business Contact Data | North American Entrepreneurs | Verified Contact Data & Business Details | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/small-business-contact-data-north-american-entrepreneurs-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Feb 12, 2018
    Dataset provided by
    Area covered
    Bermuda, Guatemala, Saint Pierre and Miquelon, El Salvador, Mexico, Nicaragua, Belize, Canada, Greenland, Honduras
    Description

    Success.ai delivers comprehensive access to Small Business Contact Data, tailored to connect you with North American entrepreneurs and small business leaders. Our extensive database includes verified profiles of over 170 million professionals, ensuring direct access to decision-makers in various industries. With AI-validated accuracy, continuously updated datasets, and a focus on compliance, Success.ai empowers businesses to enhance their marketing, sales, and recruitment efforts while staying ahead in a competitive market.

    Key Features of Success.ai's Small Business Contact Data:

    Extensive Coverage: Access profiles for small business owners and entrepreneurs across the United States, Canada, and Mexico. Our database spans multiple industries, from retail to technology, providing diverse business insights.

    Verified Contact Details: Each profile includes work emails, phone numbers, and firmographic data, enabling precise and effective outreach.

    Industry-Specific Data: Target key sectors such as e-commerce, professional services, healthcare, manufacturing, and more, with tailored datasets designed to meet your specific business needs.

    Real-Time Updates: Continuously updated to maintain a 99% accuracy rate, our data ensures that your campaigns are always backed by the most current information.

    Ethical and Compliant: Fully compliant with GDPR and other global data protection regulations, ensuring ethical use of all contact data.

    Why Choose Success.ai for Small Business Contact Data?

    Best Price Guarantee: Enjoy the most competitive pricing in the market, delivering exceptional value for comprehensive and verified contact data.

    AI-Validated Accuracy: Our advanced AI systems meticulously validate every data point to deliver unmatched reliability and precision.

    Customizable Data Solutions: From hyper-targeted regional datasets to comprehensive industry-wide insights, we tailor our offerings to meet your exact requirements.

    Scalable Access: Whether you're a startup or an enterprise, our solutions are designed to scale with your business needs.

    Comprehensive Use Cases for Small Business Contact Data:

    1. Targeted Marketing Campaigns:

    Refine your marketing strategy by leveraging verified contact details for small business owners. Execute highly personalized email, phone, and multi-channel campaigns with precision.

    2. Sales Prospecting:

    Identify and connect with decision-makers in key industries. Use detailed profiles to enhance your sales outreach, close deals faster, and build long-term client relationships.

    3. Recruitment and Talent Acquisition:

    Discover small business leaders and key players in specific industries to strengthen your recruitment pipeline. Access up-to-date profiles for sourcing top talent.

    4. Market Research:

    Gain insights into small business trends, operational challenges, and industry benchmarks. Leverage this data for competitive analysis and market positioning.

    5. Local Business Engagement:

    Foster partnerships with small businesses by identifying community leaders and entrepreneurial influencers in your target regions.

    APIs to Enhance Your Campaigns:

    Enrichment API: Integrate real-time updates into your CRM and marketing systems to maintain accurate and actionable contact data. Perfect for businesses looking to improve lead quality.

    Lead Generation API: Maximize your lead generation efforts with access to verified contact details, including emails and phone numbers. Tailored for precise targeting of small business decision-makers.

    Tailored Solutions for Diverse Needs:

    Marketing Agencies: Create targeted campaigns with verified data for small business owners across diverse sectors.

    Sales Teams: Drive revenue growth with detailed profiles and direct access to decision-makers.

    Recruiters: Build a talent pipeline with current and verified data on small business leaders and professionals.

    Consultants: Provide data-driven recommendations to clients by leveraging detailed small business insights.

    What Sets Success.ai Apart?

    170M+ Profiles: Access a vast and detailed database of small business owners and entrepreneurs.

    Global Standards Compliance: Rest assured knowing all data is ethically sourced and compliant with global privacy regulations.

    Flexible Integration: Seamlessly integrate data into your existing workflows with customizable delivery options.

    Dedicated Support: Our team of experts is always available to ensure you maximize the value of our solutions.

    Empower Your Outreach with Success.ai:

    Success.ai’s Small Business Contact Data is your gateway to building meaningful connections with North American entrepreneurs. Whether you're driving targeted marketing campaigns, enhancing sales prospecting, or conducting in-depth market research, our verified datasets provide the tools you need to succeed.

    Get started with Success.ai today and unlock the potential of verified Small Business ...

  13. Twitter users in the United States 2019-2028

    • statista.com
    Updated Jul 30, 2025
    Cite
    Statista Research Department (2025). Twitter users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
    Explore at:
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Statista http://statista.com/
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 85.08 million users in 2028, a new peak. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Twitter users in countries like Canada and Mexico.

  14. Diabetes Health Indicators Dataset

    • kaggle.com
    zip
    Updated Nov 8, 2021
    Cite
    Alex Teboul (2021). Diabetes Health Indicators Dataset [Dataset]. https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset/discussion
    Explore at:
    Available download formats: zip (6324278 bytes)
    Dataset updated
    Nov 8, 2021
    Authors
    Alex Teboul
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Diabetes is among the most prevalent chronic diseases in the United States, impacting millions of Americans each year and exerting a significant financial burden on the economy. Diabetes is a serious chronic disease in which individuals lose the ability to effectively regulate levels of glucose in the blood, and can lead to reduced quality of life and life expectancy. After different foods are broken down into sugars during digestion, the sugars are then released into the bloodstream. This signals the pancreas to release insulin. Insulin helps enable cells within the body to use those sugars in the bloodstream for energy. Diabetes is generally characterized by either the body not making enough insulin or being unable to use the insulin that is made as effectively as needed.

    Complications like heart disease, vision loss, lower-limb amputation, and kidney disease are associated with chronically high levels of sugar remaining in the bloodstream for those with diabetes. While there is no cure for diabetes, strategies like losing weight, eating healthily, being active, and receiving medical treatments can mitigate the harms of this disease in many patients. Early diagnosis can lead to lifestyle changes and more effective treatment, making predictive models for diabetes risk important tools for the public and public health officials.

    The scale of this problem is also important to recognize. The Centers for Disease Control and Prevention has indicated that as of 2018, 34.2 million Americans have diabetes and 88 million have prediabetes. Furthermore, the CDC estimates that 1 in 5 diabetics and roughly 8 in 10 prediabetics are unaware of their risk. While there are different types of diabetes, type II diabetes is the most common form and its prevalence varies by age, education, income, location, race, and other social determinants of health. Much of the burden of the disease falls on those of lower socioeconomic status as well. Diabetes also places a massive burden on the economy, with diagnosed diabetes costs of roughly $327 billion and total costs with undiagnosed diabetes and prediabetes approaching $400 billion annually.

    Content

    The Behavioral Risk Factor Surveillance System (BRFSS) is a health-related telephone survey that is collected annually by the CDC. Each year, the survey collects responses from over 400,000 Americans on health-related risk behaviors, chronic health conditions, and the use of preventative services. It has been conducted every year since 1984. For this project, a csv of the dataset available on Kaggle for the year 2015 was used. This original dataset contains responses from 441,455 individuals and has 330 features. These features are either questions directly asked of participants, or calculated variables based on individual participant responses.

    This dataset contains 3 files:

    1. diabetes_012_health_indicators_BRFSS2015.csv is a clean dataset of 253,680 survey responses to the CDC's BRFSS2015. The target variable Diabetes_012 has 3 classes. 0 is for no diabetes or only during pregnancy, 1 is for prediabetes, and 2 is for diabetes. There is class imbalance in this dataset. This dataset has 21 feature variables.
    2. diabetes_binary_5050split_health_indicators_BRFSS2015.csv is a clean dataset of 70,692 survey responses to the CDC's BRFSS2015. It has an equal 50-50 split of respondents with no diabetes and with either prediabetes or diabetes. The target variable Diabetes_binary has 2 classes. 0 is for no diabetes, and 1 is for prediabetes or diabetes. This dataset has 21 feature variables and is balanced.
    3. diabetes_binary_health_indicators_BRFSS2015.csv is a clean dataset of 253,680 survey responses to the CDC's BRFSS2015. The target variable Diabetes_binary has 2 classes. 0 is for no diabetes, and 1 is for prediabetes or diabetes. This dataset has 21 feature variables and is not balanced.
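    A balanced 50-50 file like the second one can be derived from an imbalanced binary target by random downsampling of the larger class. The sketch below shows that general technique only; the description does not say how the author built the balanced file, and the function name balanced_downsample, the seed, and the toy labels are hypothetical.

```python
import random


def balanced_downsample(labels, seed=0):
    """Return sorted row indices with equal numbers of class 0 and class 1.

    The larger class is randomly downsampled to the size of the smaller one.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    negatives = [i for i, y in enumerate(labels) if y == 0]
    positives = [i for i, y in enumerate(labels) if y == 1]
    n = min(len(negatives), len(positives))
    return sorted(rng.sample(negatives, n) + rng.sample(positives, n))


# Toy labels (8 negatives, 2 positives), not taken from the data set.
labels = [0] * 8 + [1] * 2
selected = balanced_downsample(labels)
```

    Downsampling discards rows from the majority class, which is why the balanced file is much smaller than the 253,680-row files.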

    Explore some of the following research questions:

    1. Can survey questions from the BRFSS provide accurate predictions of whether an individual has diabetes?
    2. What risk factors are most predictive of diabetes risk?
    3. Can we use a subset of the risk factors to accurately predict whether an individual has diabetes?
    4. Can we create a short form of questions from the BRFSS using feature selection to accurately predict if someone might have diabetes or is at high risk of diabetes?

    Acknowledgements

    It is important to reiterate that I did not create this dataset; it is just a cleaned and consolidated dataset created from the BRFSS 2015 dataset already on Kaggle. That dataset can be found here and the notebook I used for the data cleaning can be found here.

    Inspiration

    Zidian Xie et al fo...

  15. Replication data for: Let Them Have Choice: Gains from Shifting Away from...

    • datasearch.gesis.org
    • openicpsr.org
    Updated Oct 13, 2019
    Cite
    Dafny, Leemore; Ho, Kate; Varela, Mauricio (2019). Replication data for: Let Them Have Choice: Gains from Shifting Away from Employer-Sponsored Health Insurance and toward an Individual Exchange [Dataset]. http://doi.org/10.3886/E114813V1
    Explore at:
    Dataset updated
    Oct 13, 2019
    Dataset provided by
    da|ra (Registration agency for social science and economic data)
    Authors
    Dafny, Leemore; Ho, Kate; Varela, Mauricio
    Description

    Most nonelderly Americans purchase health insurance through their employers, which sponsor a limited number of plans. Using a panel dataset representing over ten million insured lives, we estimate employees' preferences for different health plans and use the estimates to predict their choices if more plans were made available to them on the same terms, i.e., with equivalent subsidies and at large-group prices. Using conservative assumptions, we estimate a median welfare gain of 13 percent of premiums. A proper accounting of the costs and benefits of a transition from employer-sponsored to individually-purchased insurance should include this nontrivial gain. (JEL G22, I13, J32)

  16. Instagram: most popular posts as of 2024

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista http://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    Instagram’s most popular post

    As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, which was 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, which was a photo of Kylie Jenner’s daughter totaling 18 million likes.

    After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.

    Instagram’s most popular accounts

    As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.

    Instagram influencers

    In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.

    Instagram around the globe

    Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
    
  17. Countries with the most Facebook users 2024

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Countries with the most Facebook users 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista http://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    Which country has the most Facebook users?

    There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: the United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.

    Facebook – the most used social media

    Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.

    Facebook usage by device

    As of July 2021, it was found that 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browsers, as the company has published several mobile apps for users to access its products and services. As of the third quarter of 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
    
  18. VGChartz (Games Dataset)

    • kaggle.com
    zip
    Updated Jan 23, 2024
    Cite
    Simon Garanin (2024). VGChartz (Games Dataset) [Dataset]. https://www.kaggle.com/datasets/gsimonx37/vgchartz/data
    Explore at:
    Available download formats: zip (1351159 bytes)
    Dataset updated
    Jan 23, 2024
    Authors
    Simon Garanin
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    Data obtained using a program from the site vgchartz.com.

    "Founded in 2005 by Brett Walton, VGChartz (Video Game Charts) is a business intelligence and research firm and publisher of the VGChartz.com websites. As an industry research firm, VGChartz publishes video game hardware estimates every week and hosts an ever-expanding game database with over 55,000 titles listed, featuring up-to-date shipment information and legacy sales data. The VGChartz.com website provides consumers with a range of content from news and sales features, to reviews and articles, to social networking and a community forum." - from the site vgchartz.com.

    "Since the end of 2018 VGChartz no longer produces estimates for software sales. This is because the high digital market share for software was making it both more difficult to produce reliable retail estimates and also making those estimates increasingly unrepresentative of the wider performance of the games in question. As a result, on the software front we now only record official shipment/sales data, where such data is made available by developers and publishers. The legacy data remains on the site for those who are interested in browsing through it." - from the site vgchartz.com.

    What can you do with the data set?

    If you are new to data analytics, try answering the following questions:

    • In what year did the active growth in the number of video games produced begin? What year was the most successful from this point of view? What can you conclude if you look at the number of video games released by country?
    • On what day and month were the largest number of video games released? What could be the reason for this pattern?
    • Is there a dependence of the number of copies sold on the ratings of critics or users?
    • Which gaming platforms, publishers and developers are the most common (the largest number of video games have been released over time)?
    • Which gaming platforms, publishers and developers have the largest number of video game copies sold (over all time, the total number of copies sold was the largest)?

    If you have enough experience, try solving a regression problem. Train a model that can predict the number of copies sold of video games:

    • Which features can be used without leaking the target variable?
    • How do outliers affect the quality of the model?
    • Which metric should be chosen to evaluate the model?
    • Can adding new data improve the predictive ability of the model?
    • Do the residuals of the trained model show signs of heteroscedasticity? How does this affect the predictive ability of the model? What can you do about it?

    Field descriptions:

    The data contains the following fields:

    1. name - name of the video game.
    2. date - release date of the video game.
    3. platform - gaming platform (All - all gaming platforms, Series - all video game series).
    4. publisher - publisher.
    5. developers - developer.
    6. shipped - the number of copies sent (relevant for records with the values All and Series in the platform field).
    7. total - total number of copies sold (millions of copies).
    8. america - number of copies sold in America (millions of copies).
    9. europe - number of copies sold in Europe (millions of copies).
    10. japan - number of copies sold in Japan (millions of copies).
    11. other - other sales in the world.
    12. vgc - rating VGChartz.com.
    13. critic - critics' assessment.
    14. user - user rating.
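
    As a starting point for the platform question above, the following is a minimal sketch that sums the total field per platform. The sample rows are invented for illustration and are not taken from the dataset. Two choices are assumptions rather than documented behavior: missing values are represented as empty strings, and rows whose platform is All or Series are skipped so that aggregate records are not counted twice.

```python
from collections import defaultdict

# Hypothetical sample rows using the field names described above.
# The values are invented for illustration only.
rows = [
    {"name": "Game A", "platform": "PS4", "total": "1.50"},
    {"name": "Game A", "platform": "All", "total": ""},
    {"name": "Game B", "platform": "PS4", "total": "0.25"},
    {"name": "Game C", "platform": "NS", "total": "2.00"},
    {"name": "Game D", "platform": "NS", "total": ""},
]

# Assumption: aggregate rows would double count per-platform sales.
AGGREGATE_PLATFORMS = {"All", "Series"}


def total_sold_by_platform(records):
    """Sum the 'total' field (millions of copies) for each platform."""
    totals = defaultdict(float)
    for record in records:
        if record["platform"] in AGGREGATE_PLATFORMS:
            continue  # skip All/Series aggregate rows
        if not record["total"]:
            continue  # skip rows with no sales figure
        totals[record["platform"]] += float(record["total"])
    return dict(totals)


# Rank platforms from most to fewest copies sold.
ranking = sorted(
    total_sold_by_platform(rows).items(),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

    The same grouping works for publisher or developers by changing the key that is read from each record.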

    Found an error or inaccuracy in the data?

    This dataset is the result of painstaking work. After collection and systematization, the data is checked for integrity and correctness. If you notice an error or inaccuracy in the data, or have a suggestion on how to improve the data set, please let me know.

    You can look at working with data in my github repository.

  19. The Opportunity Atlas

    • redivis.com
    application/jsonl +7
    Updated Apr 22, 2020
    Cite
    Stanford Center for Population Health Sciences (2020). The Opportunity Atlas [Dataset]. http://doi.org/10.57761/aw9b-jd83
    Explore at:
    Available download formats: arrow, spss, stata, avro, csv, sas, application/jsonl, parquet
    Dataset updated
    Apr 22, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Description

    Abstract

    The Opportunity Atlas has collected contextual data by county and tract. Rather than providing contextual socioeconomic data of where people currently live, the data represents average socioeconomic indicators (e.g., earnings) of where people grew up.

    Documentation

    A core element of Population Health Science is that health outcomes can only be fully understood when they are studied within their context. Therefore, we have a copy of The Opportunity Atlas, a dataset that provides socioeconomic data by county and tract.

    Several studies have shown that especially childhood neighborhoods drive adult outcomes and that residential areas lived in through adulthood have much smaller effects. The focus of the Opportunity Atlas is therefore on contextual data of where people grew up:

    "Traditional measures of poverty and neighborhood conditions provide snapshots of income and other variables for residents in an area at a given point in time. But to study how economic opportunity varies across neighborhoods, we really need to follow people over many years and see how one’s outcomes depend upon family circumstances and where one grew up. The Opportunity Atlas is the first dataset that provides such longitudinal information at a detailed neighborhood level. Using the Atlas, you can see not just where the rich and poor currently live – which was possible in previously available data from the Census Bureau – but whether children in a given area tend to grow up to become rich or poor. This focus on mobility out of poverty across generations allows us to trace the roots of outcomes such as poverty and incarceration back to where kids grew up, potentially permitting much more effective interventions."

    As such, The Opportunity Atlas data provides a rich source of data for researchers who wish to overlay health data with contextual data.

    Methodology

    Three sources of Census Bureau data are linked to compute the data:

    1. The 2000 and 2010 Decennial Census short form
    2. Federal income tax returns for 1989, 1994, 1995, 1998-2015
    3. The 2000 Decennial Census long form and the 2005-2015 American Community Surveys (ACS).

    20.5 million Americans born between 1978 and 1983 are sampled from these data and mapped back to the Census tracts they lived in through age 23. A range of outcomes is then estimated for each of the 70,000 tracts. To comply with federal data disclosure standards and protect the privacy of individuals, no estimates are published for tracts with 20 or fewer children, and noise (small random numbers) is added to all the estimates.
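
    The two disclosure rules described above (suppressing small tracts and adding noise) can be illustrated with a minimal sketch. This is not the procedure used to build the Atlas: the noise distribution, its scale (noise_sd), the tract identifiers and the estimates below are all assumptions chosen for illustration. Only the threshold of 20 children comes from the description.

```python
import random

# From the description: tracts with 20 or fewer children are not published.
MIN_CHILDREN = 20


def publishable_estimates(tracts, noise_sd=0.01, seed=0):
    """Drop small tracts and add random noise to the remaining estimates.

    tracts maps a tract id to (number_of_children, estimate).
    Gaussian noise with standard deviation noise_sd is an assumption;
    the source only says that small random numbers are added.
    """
    rng = random.Random(seed)
    published = {}
    for tract_id, (n_children, estimate) in tracts.items():
        if n_children <= MIN_CHILDREN:
            continue  # suppressed for privacy
        published[tract_id] = estimate + rng.gauss(0.0, noise_sd)
    return published


# Hypothetical tracts: (children in sample, estimated outcome).
tracts = {
    "tract_a": (15, 0.42),
    "tract_b": (20, 0.38),
    "tract_c": (250, 0.51),
}
result = publishable_estimates(tracts)
print(sorted(result))
```

    Note that a tract with exactly 20 children is suppressed, since the rule applies to "20 or fewer".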

    For more information on the data collection and methodology, please visit:

    Website

    Documentation

    Data availability

    Some variables are available for counties only. The table below gives you an overview. Open the table in a new tab for a larger view.

    [Image: data availability.png (https://redivis.com/fileUploads/ee6544ef-e1b1-473d-a75d-36618c91f4a5)]

  20. Video Game Sales and Ratings

    • kaggle.com
    zip
    Updated Dec 20, 2023
    Cite
    The Devastator (2023). Video Game Sales and Ratings [Dataset]. https://www.kaggle.com/datasets/thedevastator/video-game-sales-and-ratings/discussion
    Explore at:
    Available download formats: zip (553205 bytes)
    Dataset updated
    Dec 20, 2023
    Authors
    The Devastator
    Description

    Video Game Sales and Ratings

    Global Video Game Sales, Ratings, and User Insights

    By Sumit Kumar Shukla [source]

    About this dataset

    The Video Games Sales and Ratings Dataset provides an in-depth view into the dynamic world of video games, offering a comprehensive analysis of sales and ratings across diverse platforms and publishers. This dataset contains valuable facets of information that bring to light various insights about the video game industry over the years.

    The dataset includes critical aspects such as the Name of each individual video game which was accounted for in this data aggregation process. The name captures the branded title under which a specific game is marketed and sold within the global market.

    Key sales figures are also included. Global_Sales is the total number of copies each game has sold worldwide across all listed platforms, recorded in millions. NA_Sales gives the sales figures for North America, JP_Sales for Japan, and EU_Sales for Europe, all likewise in millions.

    There is more insightful granularity embedded within our Dataset including Other_Sales, that tallies copies sold outside of specifically mentioned regions (North American, European & Japanese markets), expanding our insights into an even wider spectrum.

    The dataset shares not only hard sales figures but also the opinions of professional critics and users. Critic_Count and User_Count record how many individuals reviewed each game. Critic_Score is the average rating given by critics, while User_Score reflects the ratings of regular consumers who purchased the game.

    Other fields describe the game itself. Genre captures the distinctive themes or style of gameplay of each title, while Platform lists where the title was played (for example PC, or consoles such as PS4 and Xbox). Publisher identifies the company that distributed the game, and Developer names the studio that designed and coded it.

    Last but not least, the Rating as per ESRB (Entertainment Software Rating Board) is included to give a sense of what demographic brackets/age each title was intended and marketed for - an essential aspect for parents and individuals mindful about content consumed within video games.

    In short, this Video Games Sales and Ratings Dataset offers insights into the world of video gaming from several perspectives, serving as a learning resource for anyone interested in understanding the industry or applying data-driven strategies within it.

    How to use the dataset

    Exploring the Dataset

    The first step in analyzing this dataset is getting familiarized with its structure:

    • Name: This attribute refers to the name of each video game included in the dataset.
    • Platform: This denotes the platform(s) on which a particular game operates.
    • Year_of_Release: The year in which a particular game was launched.
    • Genre: The genre of the video game.
    • Publisher & Developer: These fields give the company that published and the company that developed each game, respectively.
    • NA_Sales, EU_Sales, JP_Sales & Other_Sales: These give sales numbers for North America (NA), Europe (EU), Japan (JP) and other parts of the world, respectively (measured in millions).
    • Global_Sales: This field gives overall international sales for each game.
    • Critic_Score & User_Score: These give the average scores awarded by critics and users; a higher score indicates better reception, usually measured on a scale of 0–10 or 0–100.
    • Critic_Count & User_Count: These give how many critics and users, respectively, have rated a particular game.
    • Rating: ESRB's categorization of the game (e.g., E for Everyone, T for Teen, M for Mature).
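
    Because the two score fields may use different scales, it can help to put them on a common 0–1 scale before comparing them. The sketch below is illustrative only. It assumes Critic_Score is on a 0–100 scale and User_Score on a 0–10 scale, which the description does not state explicitly, and it treats empty or non-numeric entries as missing. The helper name to_unit_scale and the sample row are hypothetical.

```python
def to_unit_scale(raw, scale_max):
    """Convert a raw score to the range 0..1, or None if it is unusable.

    raw may be a string or number; anything that cannot be parsed as a
    number, or that falls outside 0..scale_max, is treated as missing.
    """
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    if not 0.0 <= value <= scale_max:
        return None  # also rejects NaN, since the comparison is False
    return value / scale_max


# Hypothetical row; the scales (100 and 10) are assumptions.
row = {"Critic_Score": "80", "User_Score": "7.5"}
critic = to_unit_scale(row["Critic_Score"], 100.0)
user = to_unit_scale(row["User_Score"], 10.0)
print(critic, user)
```

    It is worth checking the observed minimum and maximum of each column first, to confirm which scale each score actually uses.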

    Data Analysis Recommendations

    Here are some suggestions based on common practices:

    • Sales Performance: You can examine sales figures categorized regionally or globally, leading to understanding which games have ...
Cite
Neilsberg Research (2024). United States Age Group Population Dataset: A Complete Breakdown of United States Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aabf26b9-4983-11ef-ae5d-3860777c1fe6/

United States Age Group Population Dataset: A Complete Breakdown of United States Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition

Explore at:
Available download formats: csv, json
Dataset updated
Jul 24, 2024
Dataset authored and provided by
Neilsberg Research
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
United States
Variables measured
Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into buckets of roughly five years each. For ages over 85, we aggregated the data into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the United States population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of the United States. The dataset can be used to understand the population distribution of the United States by age. For example, using this dataset, we can identify the largest age group in the United States.

Key observations

The largest age group in the United States was 30 to 34 years, with a population of 22.71 million (6.86%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in the United States was 80 to 84 years, with a population of 6.25 million (1.89%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

Age groups:

  • Under 5 years
  • 5 to 9 years
  • 10 to 14 years
  • 15 to 19 years
  • 20 to 24 years
  • 25 to 29 years
  • 30 to 34 years
  • 35 to 39 years
  • 40 to 44 years
  • 45 to 49 years
  • 50 to 54 years
  • 55 to 59 years
  • 60 to 64 years
  • 65 to 69 years
  • 70 to 74 years
  • 75 to 79 years
  • 80 to 84 years
  • 85 years and over

Variables / Data Columns

  • Age Group: This column displays the age group in consideration
  • Population: The population for the specific age group in the United States is shown in this column.
  • % of Total Population: This column displays the population of each age group as a proportion of United States total population. Please note that the sum of all percentages may not equal one due to rounding of values.
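
The relationship between the Population and % of Total Population columns can be checked with a little arithmetic. The sketch below uses only the two figures quoted under "Key observations" to back out the total population they imply, then shows with a made-up three-group example why rounded percentages may not sum to exactly 100. The function names and the three-group data are hypothetical.

```python
def implied_total_millions(group_millions, percent):
    """Total population (millions) implied by one group's size and share."""
    return group_millions / (percent / 100.0)


def percent_shares(populations):
    """Each group's share of the total, in percent, rounded to two decimals."""
    total = sum(populations.values())
    return {
        group: round(100.0 * count / total, 2)
        for group, count in populations.items()
    }


# Figures quoted in the key observations above.
largest = implied_total_millions(22.71, 6.86)   # 30 to 34 years
smallest = implied_total_millions(6.25, 1.89)   # 80 to 84 years
print(round(largest, 1), round(smallest, 1))

# Hypothetical three equal groups: each rounds to 33.33 percent.
shares = percent_shares({"group_1": 1, "group_2": 1, "group_3": 1})
print(shares, sum(shares.values()))
```

Both quoted figures imply a total of roughly 331 million; the small difference between them comes from rounding in the published values.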

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for United States Population by Age. You can refer to the same here
