100+ datasets found
  1. Total population worldwide 1950-2100

    • statista.com
    Updated Feb 24, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Total population worldwide 1950-2100 [Dataset]. https://www.statista.com/statistics/805044/total-population-worldwide/
    Explore at:
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    World
    Description

    The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038, and 10 billion in 2060, but it will peak around 10.3 billion in the 2080s before it then goes into decline. Regional variations The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure, however, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate). Changing projections The United Nations releases their World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s, however a sooner and shorter peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolongued development arc in Sub-Saharan Africa.

  2. N

    United States Annual Population and Growth Analysis Dataset: A Comprehensive...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2025). United States Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in United States from 2000 to 2024 // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/united-states-population-by-year/
    Explore at:
    csv, jsonAvailable download formats
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2024, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from the 20 years data of U.S. Census Bureau Population Estimates Program (PEP) 2000 - 2024. To measure the variables, namely (a) population and (b) population change in ( absolute and as a percentage ), we initially analyzed and tabulated the data for each of the years between 2000 and 2024. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the United States population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of United States across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.

    Key observations

    In 2024, the population of United States was 340.11 million, a 0.98% increase year-by-year from 2023. Previously, in 2023, United States population was 336.81 million, an increase of 0.83% compared to a population of 334.02 million in 2022. Over the last 20 plus years, between 2000 and 2024, population of United States increased by 57.95 million. In this period, the peak population was 340.11 million in the year 2024. The numbers suggest that the population has not reached its peak yet and is showing a trend of further growth. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2024

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2024)
    • Population: The population for the specific year for the United States is shown in this column.
    • Year on Year Change: This column displays the change in United States population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for United States Population by Year. You can refer the same here

  3. N

    New York Annual Population and Growth Analysis Dataset: A Comprehensive...

    • neilsberg.com
    csv, json
    Updated Jul 30, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2024). New York Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in New York from 2000 to 2023 // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/bf4c50f9-4dd0-11ef-a154-3860777c1fe6/
    Explore at:
    json, csvAvailable download formats
    Dataset updated
    Jul 30, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New York
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2023, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from the 20 years data of U.S. Census Bureau Population Estimates Program (PEP) 2000 - 2023. To measure the variables, namely (a) population and (b) population change in ( absolute and as a percentage ), we initially analyzed and tabulated the data for each of the years between 2000 and 2023. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the New York population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of New York across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.

    Key observations

    In 2023, the population of New York was 19.57 million, a 0.52% decrease year-by-year from 2022. Previously, in 2022, New York population was 19.67 million, a decline of 0.91% compared to a population of 19.85 million in 2021. Over the last 20 plus years, between 2000 and 2023, population of New York increased by 574,257. In this period, the peak population was 20.1 million in the year 2020. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2023

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2023)
    • Population: The population for the specific year for the New York is shown in this column.
    • Year on Year Change: This column displays the change in New York population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for New York Population by Year. You can refer the same here

  4. NYC Open Data

    • kaggle.com
    zip
    Updated Mar 20, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    NYC Open Data (2019). NYC Open Data [Dataset]. https://www.kaggle.com/datasets/nycopendata/new-york
    Explore at:
    zip(0 bytes)Available download formats
    Dataset updated
    Mar 20, 2019
    Dataset authored and provided by
    NYC Open Data
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    NYC Open Data is an opportunity to engage New Yorkers in the information that is produced and used by City government. We believe that every New Yorker can benefit from Open Data, and Open Data can benefit from every New Yorker. Source: https://opendata.cityofnewyork.us/overview/

    Content

    Thanks to NYC Open Data, which makes public data generated by city agencies available for public use, and Citi Bike, we've incorporated over 150 GB of data in 5 open datasets into Google BigQuery Public Datasets, including:

    • Over 8 million 311 service requests from 2012-2016

    • More than 1 million motor vehicle collisions 2012-present

    • Citi Bike stations and 30 million Citi Bike trips 2013-present

    • Over 1 billion Yellow and Green Taxi rides from 2009-present

    • Over 500,000 sidewalk trees surveyed decennially in 1995, 2005, and 2015

    This dataset is deprecated and not being updated.

    Fork this kernel to get started with this dataset.

    Acknowledgements

    https://opendata.cityofnewyork.us/

    https://cloud.google.com/blog/big-data/2017/01/new-york-city-public-datasets-now-available-on-google-bigquery

    This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://data.cityofnewyork.us/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    By accessing datasets and feeds available through NYC Open Data, the user agrees to all of the Terms of Use of NYC.gov as well as the Privacy Policy for NYC.gov. The user also agrees to any additional terms of use defined by the agencies, bureaus, and offices providing data. Public data sets made available on NYC Open Data are provided for informational purposes. The City does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set made available on NYC Open Data, nor are any such warranties to be implied or inferred with respect to the public data sets furnished therein.

    The City is not liable for any deficiencies in the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set, or application utilizing such data set, provided by any third party.

    Banner Photo by @bicadmedia from Unplash.

    Inspiration

    On which New York City streets are you most likely to find a loud party?

    Can you find the Virginia Pines in New York City?

    Where was the only collision caused by an animal that injured a cyclist?

    What’s the Citi Bike record for the Longest Distance in the Shortest Time (on a route with at least 100 rides)?

    https://cloud.google.com/blog/big-data/2017/01/images/148467900588042/nyc-dataset-6.png" alt="enter image description here"> https://cloud.google.com/blog/big-data/2017/01/images/148467900588042/nyc-dataset-6.png

  5. N

    New Hampshire Annual Population and Growth Analysis Dataset: A Comprehensive...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2025). New Hampshire Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in New Hampshire from 2000 to 2024 // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/new-hampshire-population-by-year/
    Explore at:
    csv, jsonAvailable download formats
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New Hampshire, New Hampshire
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2024, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from the 20 years data of U.S. Census Bureau Population Estimates Program (PEP) 2000 - 2024. To measure the variables, namely (a) population and (b) population change in ( absolute and as a percentage ), we initially analyzed and tabulated the data for each of the years between 2000 and 2024. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the New Hampshire population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of New Hampshire across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.

    Key observations

    In 2024, the population of New Hampshire was 1.41 million, a 0.49% increase year-by-year from 2023. Previously, in 2023, New Hampshire population was 1.4 million, an increase of 0.40% compared to a population of 1.4 million in 2022. Over the last 20 plus years, between 2000 and 2024, population of New Hampshire increased by 168,647. In this period, the peak population was 1.41 million in the year 2024. The numbers suggest that the population has not reached its peak yet and is showing a trend of further growth. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2024

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2024)
    • Population: The population for the specific year for the New Hampshire is shown in this column.
    • Year on Year Change: This column displays the change in New Hampshire population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for New Hampshire Population by Year. You can refer the same here

  6. Data from: Robotic manipulation datasets for offline compositional...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jun 6, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Marcel Hussing; Jorge Mendez; Anisha Singrodia; Cassandra Kent; Eric Eaton (2024). Robotic manipulation datasets for offline compositional reinforcement learning [Dataset]. http://doi.org/10.5061/dryad.9cnp5hqps
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jun 6, 2024
    Dataset provided by
    University of Pennsylvania
    Massachusetts Institute of Technology
    Authors
    Marcel Hussing; Jorge Mendez; Anisha Singrodia; Cassandra Kent; Eric Eaton
    License

    https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html

    Description

    Offline reinforcement learning (RL) is a promising direction that allows RL agents to be pre-trained from large datasets avoiding recurrence of expensive data collection. To advance the field, it is crucial to generate large-scale datasets. Compositional RL is particularly appealing for generating such large datasets, since 1) it permits creating many tasks from few components, and 2) the task structure may enable trained agents to solve new tasks by combining relevant learned components. This submission provides four offline RL datasets for simulated robotic manipulation created using the 256 tasks from CompoSuite Mendez et al., 2022. In every task in CompoSuite, a robot arm is used to manipulate an object to achieve an objective all while trying to avoid an obstacle. There are for components for each of these four axes that can be combined arbitrarily leading to a total of 256 tasks. The component choices are * Robot: IIWA, Jaco, Kinova3, Panda* Object: Hollow box, box, dumbbell, plate* Objective: Push, pick and place, put in shelf, put in trashcan* Obstacle: None, wall between robot and object, wall between goal and object, door between goal and object The four included datasets are collected using separate agents each trained to a different degree of performance, and each dataset consists of 256 million transitions. The degrees of performance are expert data, medium data, warmstart data and replay data: * Expert dataset: Transitions from an expert agent that was trained to achieve 90% success on every task.* Medium dataset: Transitions from a medium agent that was trained to achieve 30% success on every task.* Warmstart dataset: Transitions from a Soft-actor critic agent trained for a fixed duration of one million steps.* Medium-replay-subsampled dataset: Transitions that were stored during the training of a medium agent up to 30% success. These datasets are intended for the combined study of compositional generalization and offline reinforcement learning. Methods The datasets were collected by using several deep reinforcement learning agents trained to the various degrees of performance described above on the CompoSuite benchmark (https://github.com/Lifelong-ML/CompoSuite) which builds on top of robosuite (https://github.com/ARISE-Initiative/robosuite) and uses the MuJoCo simulator (https://github.com/deepmind/mujoco). During reinforcement learning training, we stored the data that was collected by each agent in a separate buffer for post-processing. Then, after training, to collect the expert and medium dataset, we run the trained agents for 2000 trajectories of length 500 online in the CompoSuite benchmark and store the trajectories. These add up to a total of 1 million state-transitions tuples per dataset, totalling a full 256 million datapoints per dataset. The warmstart and medium-replay-subsampled dataset contain trajectories from the stored training buffer of the SAC agent trained for a fixed duration and the medium agent respectively. For medium-replay-subsampled data, we uniformly sample trajectories from the training buffer until we reach more than 1 million transitions. Since some of the tasks have termination conditions, some of these trajectories are trunctated and not of length 500. This sometimes results in a number of sampled transitions larger than 1 million. Therefore, after sub-sampling, we artificially truncate the last trajectory and place a timeout at the final position. This can in some rare cases lead to one incorrect trajectory if the datasets are used for finite horizon experimentation. However, this truncation is required to ensure consistent dataset sizes, easy data readability and compatibility with other standard code implementations. The four datasets are split into four tar.gz folders each yielding a total of 12 compressed folders. Every sub-folder contains all the tasks for one of the four robot arms for that dataset. In other words, every tar.gz folder contains a total of 64 tasks using the same robot arm and four tar.gz files form a full dataset. This is done to enable people to only download a part of the dataset in case they do not need all 256 tasks. For every task, the data is separately stored in an hdf5 file allowing for the usage of arbitrary task combinations and mixing of data qualities across the four datasets. Every task is contained in a folder that is named after the CompoSuite elements it uses. In other words, every task is represented as a folder named

  7. d

    The Data Group’s Fundraising Data Set with over 100 Million US Donor Data...

    • datarade.ai
    .json, .csv, .xls
    Updated May 18, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Data Group (2024). The Data Group’s Fundraising Data Set with over 100 Million US Donor Data records. Over 90%with Phones/Emails. Causes-Interests-Wealth-Political Data [Dataset]. https://datarade.ai/data-products/the-data-group-s-fundraising-data-set-with-over-100-million-u-the-data-group
    Explore at:
    .json, .csv, .xlsAvailable download formats
    Dataset updated
    May 18, 2024
    Dataset authored and provided by
    The Data Group
    Area covered
    United States
    Description

    In the dynamic realm of fundraising, every decision counts. It's not just about reaching out; it's about connecting, resonating, and ultimately converting. And at the heart of this transformative journey lies one invaluable asset: accurate and updated fundraising data. Let's delve into why choosing fundraising data is not just a choice but a strategic imperative for businesses aiming to exceed campaign goals.

    Why Choose The Data Group’s Fundraising Data?

    Donor data isn't merely a collection of information; it's a goldmine of insights waiting to be unearthed. By leveraging donor data, you gain a profound understanding of your audience - their preferences, behaviors, and inclinations. This knowledge empowers you to craft hyper-targeted campaigns that speak directly to the hearts and minds of your prospects.

    Key Features that Set Fundraising Data Apart

    ● Granular Segmentation: With donor data, segmentation reaches new heights of precision. Say goodbye to generic messaging; with detailed segmentation, you can tailor your communication to resonate with each segment's unique characteristics and interests.

    ● Predictive Analytics: Anticipate your donors' next move with predictive analytics. By analyzing past behavior patterns, fundraising data enables you to forecast future actions, empowering you to stay one step ahead in your marketing endeavors.

    ● Comprehensive Profiles: From demographic information and detailed wealth data to psychographic insights, donor data provides a comprehensive view of your audience. Armed with this knowledge, you can personalize your messaging and offerings to cater to each donor's specific needs and preferences.

    Applications of Fundraising Data & Donor Insight

    ● Personalized Campaigns: Bid farewell to one-size-fits-all marketing. Donor data enables you to create personalized campaigns that resonate on a personal level, driving engagement and fostering long-term relationships with your audience.

    ● Optimized Messaging: With donor data, you can bid adieu to guesswork. By understanding your audience's preferences and pain points, you can craft messaging that speaks directly to their needs, driving conversions and maximizing ROI.

    ● Enhanced Donor Retention : Retaining donors is just as crucial as acquiring them. Up to date Fundraising Data equips you with the insights needed to nurture existing relationships, identify at-risk donors, and implement strategies to foster loyalty and advocacy.

    Now available for our political clients is our new 2024 Election Issues Data. Inquire to learn more.

  8. Forecast revenue big data market worldwide 2011-2027

    • statista.com
    Updated Feb 13, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Forecast revenue big data market worldwide 2011-2027 [Dataset]. https://www.statista.com/statistics/254266/global-big-data-market-forecast/
    Explore at:
    Dataset updated
    Feb 13, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    Worldwide
    Description

    The global big data market is forecasted to grow to 103 billion U.S. dollars by 2027, more than double its expected market size in 2018. With a share of 45 percent, the software segment would become the large big data market segment by 2027.

    What is Big data?

    Big data is a term that refers to the kind of data sets that are too large or too complex for traditional data processing applications. It is defined as having one or some of the following characteristics: high volume, high velocity or high variety. Fast-growing mobile data traffic, cloud computing traffic, as well as the rapid development of technologies such as artificial intelligence (AI) and the Internet of Things (IoT) all contribute to the increasing volume and complexity of data sets.

    Big data analytics

    Advanced analytics tools, such as predictive analytics and data mining, help to extract value from the data and generate new business insights. The global big data and business analytics market was valued at 169 billion U.S. dollars in 2018 and is expected to grow to 274 billion U.S. dollars in 2022. As of November 2018, 45 percent of professionals in the market research industry reportedly used big data analytics as a research method.

  9. N

    United States Annual Population and Growth Analysis Dataset: A Comprehensive...

    • neilsberg.com
    csv, json
    Updated Jul 30, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2024). United States Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in United States from 2000 to 2023 // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/bf628b87-4dd0-11ef-a154-3860777c1fe6/
    Explore at:
    json, csvAvailable download formats
    Dataset updated
    Jul 30, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2023, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from the 20 years data of U.S. Census Bureau Population Estimates Program (PEP) 2000 - 2023. To measure the variables, namely (a) population and (b) population change in ( absolute and as a percentage ), we initially analyzed and tabulated the data for each of the years between 2000 and 2023. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the United States population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of United States across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.

    Key observations

    In 2023, the population of United States was 334.91 million, a 0.49% increase year-by-year from 2022. Previously, in 2022, United States population was 333.27 million, an increase of 0.37% compared to a population of 332.05 million in 2021. Over the last 20 plus years, between 2000 and 2023, population of United States increased by 52.75 million. In this period, the peak population was 334.91 million in the year 2023. The numbers suggest that the population has not reached its peak yet and is showing a trend of further growth. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2023

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2023)
    • Population: The population for the specific year for the United States is shown in this column.
    • Year on Year Change: This column displays the change in United States population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for United States Population by Year. You can refer the same here

  10. n

    eBird - Dataset - INTAROS Data Catalogue

    • catalog-intaros.nersc.no
    Updated Nov 5, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2020). eBird - Dataset - INTAROS Data Catalogue [Dataset]. https://catalog-intaros.nersc.no/dataset/ebird
    Explore at:
    Dataset updated
    Nov 5, 2020
    Description

    eBird is among the world’s largest biodiversity-related science projects, with more than 100 million bird sightings contributed annually by eBirders around the world and an average participation growth rate of approximately 20% year over year. eBird is managed by the Cornell Lab of Ornithology. Some data has been contributed under INTAROS WP4. eBird provides open data access in several formats to logged-in users, ranging from raw data to processed datasets geared toward more rigorous scientific modeling. eBird Basic Dataset (EBD) The EBD is the core dataset for accessing all raw eBird observations and associated metadata. The EBD is updated monthly (15th of each month), and is available by direct download through eBird to any logged-in user after completion of a data request form. The data request form allows us to gain some understanding of how the data will be used. Requests are typically approved within 7 days. Data are provided with documentation in spreadsheet format, which can be read by a variety of programs. Although Excel or similar programs work for basic analyses, for larger datasets (>1 million rows) or more sophisticated analyses, we recommend using programs like R. There are several R packages available for summarizing data, including one that is managed here at the Cornell Lab specifically for working with the EBD dataset. The data collection may enable a better understanding of bird population dynamics and the status of bird species including bird conservation management requirements.

  11. Data from: Indian Premier League Dataset

    • kaggle.com
    zip
    Updated Feb 16, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Saad Bin Manjur Adit (2021). Indian Premier League Dataset [Dataset]. https://www.kaggle.com/saadbinmanjuradit/indian-premier-league-dataset
    Explore at:
    zip(1261343 bytes)Available download formats
    Dataset updated
    Feb 16, 2021
    Authors
    Saad Bin Manjur Adit
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    Context

    The Indian Premier League (IPL) is a professional Twenty20 cricket league in India usually contested between March and May of every year by eight teams representing eight different cities or states in India. The league was founded by the Board of Control for Cricket in India (BCCI) in 2007. The IPL has an exclusive window in ICC Future Tours Programme.

    The IPL is the most-attended cricket league in the world and in 2014 was ranked sixth by average attendance among all sports leagues. In 2010, the IPL became the first sporting event in the world to be broadcast live on YouTube. The brand value of the IPL in 2019 was ₹475 billion (US$6.7 billion), according to Duff & Phelps. According to BCCI, the 2015 IPL season contributed ₹11.5 billion (US$160 million) to the GDP of the Indian economy.

    Content

    The dataset consist of data about IPL matches played from the year 2008 to 2019. IPL is a professional Twenty20 cricket league founded by the Board of Control for Cricket in India (BCCI) in 2008. The league has 8 teams representing 8 different Indian cities or states. It enjoys tremendous popularity and the brand value of the IPL in 2019 was estimated to be ₹475 billion (US$6.7 billion). So let’s analyze IPL through stats.

    The dataset has 18 columns. Let’s get acquainted with the columns. - id: The IPL match id. - season: The IPL season - city: The city where the IPL match was held. - date: The date on which the match was held. - team1: One of the teams of the IPL match - team2: The other team of the IPL match - toss_winner: The team that won the toss - toss_decision: The decision taken by the team that won the toss to ‘bat’ or ‘field’ - result: The result(‘normal’, ‘tie’, ‘no result’) of the match. - dl_applied: (1 or 0)indicates whether the Duckworth-Lewis rule was applied or not. - winner: The winner of the match. - win_by_runs: Provides the runs by which the team batting first won - win_by_runs: Provides the number of wickets by which the team batting second won. - player_of_match: The outstanding player of the match. - venue: The venue where the match was hosted. - umpire1: One of the two on-field umpires who officiate the match. - umpire2: One of the two on-field umpires who officiate the match. - umpire3: The off-field umpire who officiates the match

    Acknowledgements

    • Data source from 2008-2017 - CricSheet.org and Manas - Kaggle
    • Indian Premier League 2008-2019 Navaneesh Kumar - Kaggle
    • Data source for 2018-2019 - IPL T20 - Official website
  12. Z

    Data from: CESNET-QUIC22: A large one-month QUIC network traffic dataset...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Luxemburk, Jan (2024). CESNET-QUIC22: A large one-month QUIC network traffic dataset from backbone lines [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7409923
    Explore at:
    Dataset updated
    Feb 29, 2024
    Dataset provided by
    Lukačovič, Andrej
    Luxemburk, Jan
    Čejka, Tomáš
    Hynek, Karel
    Šiška, Pavel
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please refer to the original data article for further data description: Jan Luxemburk et al. CESNET-QUIC22: A large one-month QUIC network traffic dataset from backbone lines, Data in Brief, 2023, 108888, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2023.108888. We recommend using the CESNET DataZoo python library, which facilitates the work with large network traffic datasets. More information about the DataZoo project can be found in the GitHub repository https://github.com/CESNET/cesnet-datazoo. The QUIC (Quick UDP Internet Connection) protocol has the potential to replace TLS over TCP, which is the standard choice for reliable and secure Internet communication. Due to its design that makes the inspection of QUIC handshakes challenging and its usage in HTTP/3, there is an increasing demand for research in QUIC traffic analysis. This dataset contains one month of QUIC traffic collected in an ISP backbone network, which connects 500 large institutions and serves around half a million people. The data are delivered as enriched flows that can be useful for various network monitoring tasks. The provided server names and packet-level information allow research in the encrypted traffic classification area. Moreover, included QUIC versions and user agents (smartphone, web browser, and operating system identifiers) provide information for large-scale QUIC deployment studies. Data capture The data was captured in the flow monitoring infrastructure of the CESNET2 network. The capturing was done for four weeks between 31.10.2022 and 27.11.2022. The following list provides per-week flow count, capture period, and uncompressed size:

    W-2022-44

    Uncompressed Size: 19 GB Capture Period: 31.10.2022 - 6.11.2022 Number of flows: 32.6M W-2022-45

    Uncompressed Size: 25 GB Capture Period: 7.11.2022 - 13.11.2022 Number of flows: 42.6M W-2022-46

    Uncompressed Size: 20 GB Capture Period: 14.11.2022 - 20.11.2022 Number of flows: 33.7M W-2022-47

    Uncompressed Size: 25 GB Capture Period: 21.11.2022 - 27.11.2022 Number of flows: 44.1M CESNET-QUIC22

    Uncompressed Size: 89 GB Capture Period: 31.10.2022 - 27.11.2022 Number of flows: 153M

    Data description The dataset consists of network flows describing encrypted QUIC communications. Flows were created using ipfixprobe flow exporter and are extended with packet metadata sequences, packet histograms, and with fields extracted from the QUIC Initial Packet, which is the first packet of the QUIC connection handshake. The extracted handshake fields are the Server Name Indication (SNI) domain, the used version of the QUIC protocol, and the user agent string that is available in a subset of QUIC communications. Packet Sequences Flows in the dataset are extended with sequences of packet sizes, directions, and inter-packet times. For the packet sizes, we consider payload size after transport headers (UDP headers for the QUIC case). Packet directions are encoded as ±1, +1 meaning a packet sent from client to server, and -1 a packet from server to client. Inter-packet times depend on the location of communicating hosts, their distance, and on the network conditions on the path. However, it is still possible to extract relevant information that correlates with user interactions and, for example, with the time required for an API/server/database to process the received data and generate the response to be sent in the next packet. Packet metadata sequences have a length of 30, which is the default setting of the used flow exporter. We also derive three fields from each packet sequence: its length, time duration, and the number of roundtrips. The roundtrips are counted as the number of changes in the communication direction (from packet directions data); in other words, each client request and server response pair counts as one roundtrip. Flow statistics Flows also include standard flow statistics, which represent aggregated information about the entire bidirectional flow. The fields are: the number of transmitted bytes and packets in both directions, the duration of flow, and packet histograms. Packet histograms include binned counts of packet sizes and inter-packet times of the entire flow in both directions (more information in the PHISTS plugin documentation There are eight bins with a logarithmic scale; the intervals are 0-15, 16-31, 32-63, 64-127, 128-255, 256-511, 512-1024, >1024 [ms or B]. The units are milliseconds for inter-packet times and bytes for packet sizes. Moreover, each flow has its end reason - either it was idle, reached the active timeout, or ended due to other reasons. This corresponds with the official IANA IPFIX-specified values. The FLOW_ENDREASON_OTHER field represents the forced end and lack of resources reasons. The end of flow detected reason is not considered because it is not relevant for UDP connections. Dataset structure The dataset flows are delivered in compressed CSV files. CSV files contain one flow per row; data columns are summarized in the provided list below. For each flow data file, there is a JSON file with the number of saved and seen (before sampling) flows per service and total counts of all received (observed on the CESNET2 network), service (belonging to one of the dataset's services), and saved (provided in the dataset) flows. There is also the stats-week.json file aggregating flow counts of a whole week and the stats-dataset.json file aggregating flow counts for the entire dataset. Flow counts before sampling can be used to compute sampling ratios of individual services and to resample the dataset back to the original service distribution. Moreover, various dataset statistics, such as feature distributions and value counts of QUIC versions and user agents, are provided in the dataset-statistics folder. The mapping between services and service providers is provided in the servicemap.csv file, which also includes SNI domains used for ground truth labeling. The following list describes flow data fields in CSV files:

    ID: Unique identifier SRC_IP: Source IP address DST_IP: Destination IP address DST_ASN: Destination Autonomous System number SRC_PORT: Source port DST_PORT: Destination port PROTOCOL: Transport protocol QUIC_VERSION QUIC: protocol version QUIC_SNI: Server Name Indication domain QUIC_USER_AGENT: User agent string, if available in the QUIC Initial Packet TIME_FIRST: Timestamp of the first packet in format YYYY-MM-DDTHH-MM-SS.ffffff TIME_LAST: Timestamp of the last packet in format YYYY-MM-DDTHH-MM-SS.ffffff DURATION: Duration of the flow in seconds BYTES: Number of transmitted bytes from client to server BYTES_REV: Number of transmitted bytes from server to client PACKETS: Number of packets transmitted from client to server PACKETS_REV: Number of packets transmitted from server to client PPI: Packet metadata sequence in the format: [[inter-packet times], [packet directions], [packet sizes]] PPI_LEN: Number of packets in the PPI sequence PPI_DURATION: Duration of the PPI sequence in seconds PPI_ROUNDTRIPS: Number of roundtrips in the PPI sequence PHIST_SRC_SIZES: Histogram of packet sizes from client to server PHIST_DST_SIZES: Histogram of packet sizes from server to client PHIST_SRC_IPT: Histogram of inter-packet times from client to server PHIST_DST_IPT: Histogram of inter-packet times from server to client APP: Web service label CATEGORY: Service category FLOW_ENDREASON_IDLE: Flow was terminated because it was idle FLOW_ENDREASON_ACTIVE: Flow was terminated because it reached the active timeout FLOW_ENDREASON_OTHER: Flow was terminated for other reasons

    Link to other CESNET datasets

    https://www.liberouter.org/technology-v2/tools-services-datasets/datasets/ https://github.com/CESNET/cesnet-datazoo Please cite the original data article:

    @article{CESNETQUIC22, author = {Jan Luxemburk and Karel Hynek and Tomáš Čejka and Andrej Lukačovič and Pavel Šiška}, title = {CESNET-QUIC22: a large one-month QUIC network traffic dataset from backbone lines}, journal = {Data in Brief}, pages = {108888}, year = {2023}, issn = {2352-3409}, doi = {https://doi.org/10.1016/j.dib.2023.108888}, url = {https://www.sciencedirect.com/science/article/pii/S2352340923000069} }

  13. T

    United States Money Supply M2

    • tradingeconomics.com
    • pl.tradingeconomics.com
    • +17more
    csv, excel, json, xml
    Updated Mar 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    United States Money Supply M2 [Dataset]. https://tradingeconomics.com/united-states/money-supply-m2
    Explore at:
    json, xml, csv, excelAvailable download formats
    Dataset updated
    Mar 26, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1959 - Feb 28, 2025
    Area covered
    United States
    Description

    Money Supply M2 in the United States increased to 21447.60 USD Billion in November from 21311.20 USD Billion in October of 2024. This dataset provides - United States Money Supply M2 - actual values, historical data, forecast, chart, statistics, economic calendar and news.

  14. C

    Dataset Business

    • data.colorado.gov
    • data.wu.ac.at
    application/rdfxml +5
    Updated Mar 26, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CDOS (2025). Dataset Business [Dataset]. https://data.colorado.gov/Business/Dataset-Business/i8gq-uw2n
    Explore at:
    application/rdfxml, csv, application/rssxml, json, tsv, xmlAvailable download formats
    Dataset updated
    Mar 26, 2025
    Authors
    CDOS
    License

    U.S. Government Workshttps://www.usa.gov/government-works
    License information was derived automatically

    Description

    SOS Colorado Business Entities (corporations, LLCs, etc.) registered with the Colorado Secretary of State. The business registration goes back to the 1800's, with over 1 million records available. Contains business name, principal address, mailing address, owner name, owner address, entity status, type and creation date. Business location is geocoded and is referenced in the location field using the Principal Address information, however location is not guaranteed to be physical location of business in every case. As of January 2015 there are over 1.3 million unique business id's registered with the state of co, and there is a (very rough) average of 2,000 new businesses added each week.

  15. Hit Song Prediction (Million Song Dataset and Audio Features)

    • zenodo.org
    application/gzip, bin +1
    Updated Feb 9, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Eva Zangerle; Eva Zangerle (2020). Hit Song Prediction (Million Song Dataset and Audio Features) [Dataset]. http://doi.org/10.5281/zenodo.3258042
    Explore at:
    bin, application/gzip, csvAvailable download formats
    Dataset updated
    Feb 9, 2020
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Eva Zangerle; Eva Zangerle
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hit Song Prediction Dataset

    This dataset is based on the Million Song Dataset (MSD), which contains one million songs that are representative for western commercial music released between 1922 and 2011. The dataset contains release year information for 515,576 of the MSD songs. Please refer to http://millionsongdataset.com/ for further information on the million song dataset.

    For our hit song prediction experiments, we extract high- and low-level audio features using the Essentia toolkit (cf. https://essentia.upf.edu/). For the high-level features, we make use of the pre-trained classifiers as provided by Essentia. For a detailed description of the features, please visit the Essentia documentation.


    The dataset hence contains:

    • Audio features: the compressed msd_audio_features.tar.gz file contains the low- and high-level features for each track, stored as json files. Please note that we organize all MSD audio feature files based on the track's identifier with one folder holding all tracks with the same first letter of the track identifier to keep the files manageable. For each track, we provide two files: one containing the high-level and one containing the low-level features extracted by Essentia.
    • Billboard data: the folder billboard_data contains two files: msd_bb_matches.csv contains information about the MSD tracks that were also featured in the Billboard Hot 100 charts. Here, we provide the MSD id, Echo Nest id, artist name, track title, release year, peak position in Billboard charts and the number of weeks in the charts. The second file, msd_bb_non_matches.csv contains meta-information about the tracks of the MSD that were not featured in the Billboard Hot 100 and hence were used as negative samples. Here, we provide the MSD id, Echo Nest id, artist name, track title and the release year.


    If you make use of the dataset, please kindly cite the following paper:

    Eva Zangerle, Michael Vötter, Ramona Huber, and Yi-Hsuan Yang. Hit Song Prediction: Leveraging Low- and High-Level Audio Features. In Proceedings of the 20th International Society for Music Information Retrieval Conference 2019 (ISMIR 2019), 2019.


    @inproceedings{zangerle_ismir19,
    title = {{Hit Song Prediction: Leveraging Low- and High-Level Audio Features}},
    author = {Eva Zangerle and Ramona Huber and Michael V\"{o}tter and Yi-Hsuan Yang},
    year = {2019},
    booktitle = {{Proceedings of the 20th International Society for Music Information Retrieval Conference 2019 (ISMIR 2019)}},
    }

  16. Database Monitoring Software Market Size, Share, Growth and Industry Report

    • imarcgroup.com
    pdf,excel,csv,ppt
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    IMARC Group, Database Monitoring Software Market Size, Share, Growth and Industry Report [Dataset]. https://www.imarcgroup.com/database-monitoring-software-market
    Explore at:
    pdf,excel,csv,pptAvailable download formats
    Dataset provided by
    Imarc Group
    Authors
    IMARC Group
    License

    https://www.imarcgroup.com/privacy-policyhttps://www.imarcgroup.com/privacy-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    The global database monitoring software market size reached USD 5.2 Billion in 2024. Looking forward, IMARC Group expects the market to reach USD 16.0 Billion by 2033, exhibiting a growth rate (CAGR) of 13.3% during 2025-2033. The rising prevalence of data breaches and cyberattacks worldwide, increasing digitization, rapid growth in data volumes across diverse industry verticals, and surging penetration of cloud-based solutions are some of the major factors propelling the market.

    Report Attribute
    Key Statistics
    Base Year
    2024
    Forecast Years
    2025-2033
    Historical Years
    2019-2024
    Market Size in 2024USD 5.2 Billion
    Market Forecast in 2033USD 16.0 Billion
    Market Growth Rate (2025-2033)
    13.3%

    IMARC Group provides an analysis of the key trends in each segment of the global database monitoring software market report, along with forecasts at the global, regional, and country levels from 2025-2033. Our report has categorized the market based on database model, deployment model, organization size, and end use vertical.

  17. In-Memory Database Market Size, Share, Growth and Industry Report

    • imarcgroup.com
    pdf,excel,csv,ppt
    Updated Oct 6, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    IMARC Group (2020). In-Memory Database Market Size, Share, Growth and Industry Report [Dataset]. https://www.imarcgroup.com/in-memory-database-market
    Explore at:
    pdf,excel,csv,pptAvailable download formats
    Dataset updated
    Oct 6, 2020
    Dataset provided by
    Imarc Group
    Authors
    IMARC Group
    License

    https://www.imarcgroup.com/privacy-policyhttps://www.imarcgroup.com/privacy-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    The global in-memory database market size reached USD 8.0 Billion in 2024. Looking forward, IMARC Group expects the market to reach USD 29.9 Billion by 2033, exhibiting a growth rate (CAGR) of 14.93% during 2025-2033.

    Report Attribute
    Key Statistics
    Base Year
    2024
    Forecast Years
    2025-2033
    Historical Years
    2019-2024
    Market Size in 2024
    USD 8.0 Billion
    Market Forecast in 2033
    USD 29.9 Billion
    Market Growth Rate 2025-203314.93%

    IMARC Group provides an analysis of the key trends in each segment of the global in-memory database market report, along with forecasts at the global, regional and country levels from 2025-2033. Our report has categorized the market based on data type, application, end user and vertical.

  18. g

    MovieLens 1M

    • grouplens.org
    • kaggle.com
    Updated Mar 19, 2016
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2016). MovieLens 1M [Dataset]. https://grouplens.org/datasets/movielens/1m/
    Explore at:
    Dataset updated
    Mar 19, 2016
    Description

    Stable benchmark dataset. 1 million ratings from 6000 users on 4000 movies. Released 2/2003.

  19. Z

    CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 30, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Hynek, Karel (2024). CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13382426
    Explore at:
    Dataset updated
    Sep 30, 2024
    Dataset provided by
    Koumar, Josef
    Šiška, Pavel
    Čejka, Tomáš
    Hynek, Karel
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CESNET-TimeSeries24: The dataset for network traffic forecasting and anomaly detection

    The dataset called CESNET-TimeSeries24 was collected by long-term monitoring of selected statistical metrics for 40 weeks for each IP address on the ISP network CESNET3 (Czech Education and Science Network). The dataset encompasses network traffic from more than 275,000 active IP addresses, assigned to a wide variety of devices, including office computers, NATs, servers, WiFi routers, honeypots, and video-game consoles found in dormitories. Moreover, the dataset is also rich in network anomaly types since it contains all types of anomalies, ensuring a comprehensive evaluation of anomaly detection methods.Last but not least, the CESNET-TimeSeries24 dataset provides traffic time series on institutional and IP subnet levels to cover all possible anomaly detection or forecasting scopes. Overall, the time series dataset was created from the 66 billion IP flows that contain 4 trillion packets that carry approximately 3.7 petabytes of data. The CESNET-TimeSeries24 dataset is a complex real-world dataset that will finally bring insights into the evaluation of forecasting models in real-world environments.

    Please cite the usage of our dataset as:

    Josef Koumar, Karel Hynek, Tomáš Čejka, Pavel Šiška, "CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting", arXiv e-prints (2024): https://doi.org/10.48550/arXiv.2409.18874 @misc{koumar2024cesnettimeseries24timeseriesdataset, title={CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting}, author={Josef Koumar and Karel Hynek and Tomáš Čejka and Pavel Šiška}, year={2024}, eprint={2409.18874}, archivePrefix={arXiv}, primaryClass={cs.LG}, url={https://arxiv.org/abs/2409.18874}, }

    Time series

    We create evenly spaced time series for each IP address by aggregating IP flow records into time series datapoints. The created datapoints represent the behavior of IP addresses within a defined time window of 10 minutes. The vector of time-series metrics v_{ip, i} describes the IP address ip in the i-th time window. Thus, IP flows for vector v_{ip, i} are captured in time windows starting at t_i and ending at t_{i+1}. The time series are built from these datapoints.

    Datapoints created by the aggregation of IP flows contain the following time-series metrics:

    Simple volumetric metrics: the number of IP flows, the number of packets, and the transmitted data size (i.e. number of bytes)

    Unique volumetric metrics: the number of unique destination IP addresses, the number of unique destination Autonomous System Numbers (ASNs), and the number of unique destination transport layer ports. The aggregation of \textit{Unique volumetric metrics} is memory intensive since all unique values must be stored in an array. We used a server with 41 GB of RAM, which was enough for 10-minute aggregation on the ISP network.

    Ratios metrics: the ratio of UDP/TCP packets, the ratio of UDP/TCP transmitted data size, the direction ratio of packets, and the direction ratio of transmitted data size

    Average metrics: the average flow duration, and the average Time To Live (TTL)

    Multiple time aggregation: The original datapoints in the dataset are aggregated by 10 minutes of network traffic. The size of the aggregation interval influences anomaly detection procedures, mainly the training speed of the detection model. However, the 10-minute intervals can be too short for longitudinal anomaly detection methods. Therefore, we added two more aggregation intervals to the datasets--1 hour and 1 day.

    Time series of institutions: We identify 283 institutions inside the CESNET3 network. These time series aggregated per each institution ID provide a view of the institution's data.

    Time series of institutional subnets: We identify 548 institution subnets inside the CESNET3 network. These time series aggregated per each institution ID provide a view of the institution subnet's data.

    Data Records

    The file hierarchy is described below:

    cesnet-timeseries24/

     |- institution_subnets/
    
     |   |- agg_10_minutes/<id_institution>.csv
    
     |   |- agg_1_hour/<id_institution>.csv
    
     |   |- agg_1_day/<id_institution>.csv
    
     |   |- identifiers.csv
    
     |- institutions/
    
     |   |- agg_10_minutes/<id_institution_subnet>.csv
    
     |   |- agg_1_hour/<id_institution_subnet>.csv
    
     |   |- agg_1_day/<id_institution_subnet>.csv
    
     |   |- identifiers.csv
    
     |- ip_addresses_full/
    
     |   |- agg_10_minutes/<id_ip_folder>/<id_ip>.csv
    
     |   |- agg_1_hour/<id_ip_folder>/<id_ip>.csv
    
     |   |- agg_1_day/<id_ip_folder>/<id_ip>.csv
    
     |   |- identifiers.csv
    
     |- ip_addresses_sample/
    
     |   |- agg_10_minutes/<id_ip>.csv
    
     |   |- agg_1_hour/<id_ip>.csv
    
     |   |- agg_1_day/<id_ip>.csv
    
     |   |- identifiers.csv
    
     |- times/
    
     |   |- times_10_minutes.csv
    
     |   |- times_1_hour.csv
    
     |   |- times_1_day.csv
    
     |- ids_relationship.csv   |- weekends_and_holidays.csv
    

    The following list describes time series data fields in CSV files:

    id_time: Unique identifier for each aggregation interval within the time series, used to segment the dataset into specific time periods for analysis.

    n_flows: Total number of flows observed in the aggregation interval, indicating the volume of distinct sessions or connections for the IP address.

    n_packets: Total number of packets transmitted during the aggregation interval, reflecting the packet-level traffic volume for the IP address.

    n_bytes: Total number of bytes transmitted during the aggregation interval, representing the data volume for the IP address.

    n_dest_ip: Number of unique destination IP addresses contacted by the IP address during the aggregation interval, showing the diversity of endpoints reached.

    n_dest_asn: Number of unique destination Autonomous System Numbers (ASNs) contacted by the IP address during the aggregation interval, indicating the diversity of networks reached.

    n_dest_port: Number of unique destination transport layer ports contacted by the IP address during the aggregation interval, representing the variety of services accessed.

    tcp_udp_ratio_packets: Ratio of packets sent using TCP versus UDP by the IP address during the aggregation interval, providing insight into the transport protocol usage pattern. This metric belongs to the interval <0, 1> where 1 is when all packets are sent over TCP, and 0 is when all packets are sent over UDP.

    tcp_udp_ratio_bytes: Ratio of bytes sent using TCP versus UDP by the IP address during the aggregation interval, highlighting the data volume distribution between protocols. This metric belongs to the interval <0, 1> with same rule as tcp_udp_ratio_packets.

    dir_ratio_packets: Ratio of packet directions (inbound versus outbound) for the IP address during the aggregation interval, indicating the balance of traffic flow directions. This metric belongs to the interval <0, 1>, where 1 is when all packets are sent in the outgoing direction from the monitored IP address, and 0 is when all packets are sent in the incoming direction to the monitored IP address.

    dir_ratio_bytes: Ratio of byte directions (inbound versus outbound) for the IP address during the aggregation interval, showing the data volume distribution in traffic flows. This metric belongs to the interval <0, 1> with the same rule as dir_ratio_packets.

    avg_duration: Average duration of IP flows for the IP address during the aggregation interval, measuring the typical session length.

    avg_ttl: Average Time To Live (TTL) of IP flows for the IP address during the aggregation interval, providing insight into the lifespan of packets.

    Moreover, the time series created by re-aggregation contains following time series metrics instead of n_dest_ip, n_dest_asn, and n_dest_port:

    sum_n_dest_ip: Sum of numbers of unique destination IP addresses.

    avg_n_dest_ip: The average number of unique destination IP addresses.

    std_n_dest_ip: Standard deviation of numbers of unique destination IP addresses.

    sum_n_dest_asn: Sum of numbers of unique destination ASNs.

    avg_n_dest_asn: The average number of unique destination ASNs.

    std_n_dest_asn: Standard deviation of numbers of unique destination ASNs)

    sum_n_dest_port: Sum of numbers of unique destination transport layer ports.

    avg_n_dest_port: The average number of unique destination transport layer ports.

    std_n_dest_port: Standard deviation of numbers of unique destination transport layer ports.

    Moreover, files identifiers.csv in each dataset type contain IDs of time series that are present in the dataset. Furthermore, the ids_relationship.csv file contains a relationship between IP addresses, Institutions, and institution subnets. The weekends_and_holidays.csv contains information about the non-working days in the Czech Republic.

  20. Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles &...

    • datarade.ai
    Updated Jan 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Success.ai (2022). Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles & 70M Companies – Best Price and Quality Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-full-dataset-enrichment-api-700m-pu-success-ai
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txtAvailable download formats
    Dataset updated
    Jan 1, 2022
    Dataset provided by
    Area covered
    Svalbard and Jan Mayen, Equatorial Guinea, Tunisia, Guatemala, Saint Barthélemy, Jordan, United Republic of, Qatar, Greenland, Nicaragua
    Description

    Success.ai’s LinkedIn Data Solutions offer unparalleled access to a vast dataset of 700 million public LinkedIn profiles and 70 million LinkedIn company records, making it one of the most comprehensive and reliable LinkedIn datasets available on the market today. Our employee data and LinkedIn data are ideal for businesses looking to streamline recruitment efforts, build highly targeted lead lists, or develop personalized B2B marketing campaigns.

    Whether you’re looking for recruiting data, conducting investment research, or seeking to enrich your CRM systems with accurate and up-to-date LinkedIn profile data, Success.ai provides everything you need with pinpoint precision. By tapping into LinkedIn company data, you’ll have access to over 40 critical data points per profile, including education, professional history, and skills.

    Key Benefits of Success.ai’s LinkedIn Data: Our LinkedIn data solution offers more than just a dataset. With GDPR-compliant data, AI-enhanced accuracy, and a price match guarantee, Success.ai ensures you receive the highest-quality data at the best price in the market. Our datasets are delivered in Parquet format for easy integration into your systems, and with millions of profiles updated daily, you can trust that you’re always working with fresh, relevant data.

    API Integration: Our datasets are easily accessible via API, allowing for seamless integration into your existing systems. This ensures that you can automate data retrieval and update processes, maintaining the flow of fresh, accurate information directly into your applications.

    Global Reach and Industry Coverage: Our LinkedIn data covers professionals across all industries and sectors, providing you with detailed insights into businesses around the world. Our geographic coverage spans 259M profiles in the United States, 22M in the United Kingdom, 27M in India, and thousands of profiles in regions such as Europe, Latin America, and Asia Pacific. With LinkedIn company data, you can access profiles of top companies from the United States (6M+), United Kingdom (2M+), and beyond, helping you scale your outreach globally.

    Why Choose Success.ai’s LinkedIn Data: Success.ai stands out for its tailored approach and white-glove service, making it easy for businesses to receive exactly the data they need without managing complex data platforms. Our dedicated Success Managers will curate and deliver your dataset based on your specific requirements, so you can focus on what matters most—reaching the right audience. Whether you’re sourcing employee data, LinkedIn profile data, or recruiting data, our service ensures a seamless experience with 99% data accuracy.

    • Best Price Guarantee: We offer unbeatable pricing on LinkedIn data, and we’ll match any competitor.
    • Global Scale: Access 700 million LinkedIn profiles and 70 million company records globally.
    • AI-Verified Accuracy: Enjoy 99% data accuracy through our advanced AI and manual validation processes.
    • Real-Time Data: Profiles are updated daily, ensuring you always have the most relevant insights.
    • Tailored Solutions: Get custom-curated LinkedIn data delivered directly, without managing platforms.
    • Ethically Sourced Data: Compliant with global privacy laws, ensuring responsible data usage.
    • Comprehensive Profiles: Over 40 data points per profile, including job titles, skills, and company details.
    • Wide Industry Coverage: Covering sectors from tech to finance across regions like the US, UK, Europe, and Asia.

    Key Use Cases:

    • Sales Prospecting and Lead Generation: Build targeted lead lists using LinkedIn company data and professional profiles, helping sales teams engage decision-makers at high-value accounts.
    • Recruitment and Talent Sourcing: Use LinkedIn profile data to identify and reach top candidates globally. Our employee data includes work history, skills, and education, providing all the details you need for successful recruitment.
    • Account-Based Marketing (ABM): Use our LinkedIn company data to tailor marketing campaigns to key accounts, making your outreach efforts more personalized and effective.
    • Investment Research & Due Diligence: Identify companies with strong growth potential using LinkedIn company data. Access key data points such as funding history, employee count, and company trends to fuel investment decisions.
    • Competitor Analysis: Stay ahead of your competition by tracking hiring trends, employee movement, and company growth through LinkedIn data. Use these insights to adjust your market strategy and improve your competitive positioning.
    • CRM Data Enrichment: Enhance your CRM systems with real-time updates from Success.ai’s LinkedIn data, ensuring that your sales and marketing teams are always working with accurate and up-to-date information.
    • Comprehensive Data Points for LinkedIn Profiles: Our LinkedIn profile data includes over 40 key data points for every individual and company, ensuring a complete understandin...
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Statista (2025). Total population worldwide 1950-2100 [Dataset]. https://www.statista.com/statistics/805044/total-population-worldwide/
Organization logo

Total population worldwide 1950-2100

Explore at:
22 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Feb 24, 2025
Dataset authored and provided by
Statistahttp://statista.com/
Area covered
World
Description

The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038, and 10 billion in 2060, but it will peak around 10.3 billion in the 2080s before it then goes into decline. Regional variations The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure, however, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate). Changing projections The United Nations releases their World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s, however a sooner and shorter peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolongued development arc in Sub-Saharan Africa.

Search
Clear search
Close search
Google apps
Main menu