Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset highlights the demographics of upper-middle-class people living in Gachibowli, Hyderabad, India, and attempts to establish relationships among several of these demographic details through various methods of statistical analysis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Minnesota, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
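As a rough illustration of what "mean household income for each of the five quintiles" means, the sketch below sorts a list of household incomes, splits it into five equal-sized groups, and averages each group. It is a simplified, unweighted calculation on made-up numbers; the function name `quintile_means` is hypothetical, and the published ACS estimates are produced by the Census Bureau with survey weights, which this sketch ignores.

```python
from statistics import mean


def quintile_means(incomes):
    """Return the mean income of each quintile, lowest quintile first."""
    ordered = sorted(incomes)
    n = len(ordered)
    if n < 5:
        raise ValueError("need at least five households")
    # Integer boundaries that cut the sorted list into five nearly equal slices.
    bounds = [(i * n) // 5 for i in range(6)]
    return [mean(ordered[bounds[i]:bounds[i + 1]]) for i in range(5)]


# Made-up household incomes in U.S. dollars, for illustration only.
sample = [12_000, 18_000, 25_000, 31_000, 40_000,
          52_000, 61_000, 75_000, 98_000, 210_000]
print(quintile_means(sample))
```

The gap between the first and last values in the result is one simple way to see the spread across quintiles that the dataset describes.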
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Minnesota median household income. You can refer to the same here
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India Proportion of People Living Below 50 Percent Of Median Income: % data was reported at 9.800 % in 2021. This records a decrease from the previous number of 10.000 % for 2020. India Proportion of People Living Below 50 Percent Of Median Income: % data is updated yearly, averaging 6.200 % from Dec 1977 (Median) to 2021, with 14 observations. The data reached an all-time high of 10.300 % in 2019 and a record low of 5.100 % in 2004. India Proportion of People Living Below 50 Percent Of Median Income: % data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s India – Table IN.World Bank.WDI: Social: Poverty and Inequality. The percentage of people in the population who live in households whose per capita income or consumption is below half of the median income or consumption per capita. The median is measured at 2017 Purchasing Power Parity (PPP) using the Poverty and Inequality Platform (http://www.pip.worldbank.org). For some countries, medians are not reported due to grouped and/or confidential data. The reference year is the year in which the underlying household survey data was collected. In cases for which the data collection period bridged two calendar years, the first year in which data were collected is reported.;World Bank, Poverty and Inequality Platform. Data are based on primary household survey data obtained from government statistical agencies and World Bank country departments. Data for high-income economies are mostly from the Luxembourg Income Study database. For more information and methodology, please see http://pip.worldbank.org.;;The World Bank’s internationally comparable poverty monitoring database now draws on income or detailed consumption data from more than 2000 household surveys across 169 countries. See the Poverty and Inequality Platform (PIP) for details (www.pip.worldbank.org).
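The definition above (the share of people whose per capita income or consumption is below half of the median) can be sketched in a few lines. This is only an illustration on invented values: the function name `share_below_half_median` is hypothetical, and the published indicator is computed by the World Bank from weighted, PPP-adjusted survey data, which this unweighted sketch does not reproduce.

```python
from statistics import median


def share_below_half_median(values):
    """Percentage of people whose per capita income or consumption
    is strictly below 50 percent of the median."""
    if not values:
        raise ValueError("values must not be empty")
    threshold = 0.5 * median(values)
    below = sum(1 for v in values if v < threshold)
    return 100.0 * below / len(values)


# Invented per capita daily consumption values, for illustration only.
sample = [2.0, 3.5, 4.0, 5.0, 6.0, 6.5, 7.0, 8.0, 9.0, 20.0]
print(share_below_half_median(sample))
```

Because the threshold moves with the median, this is a relative measure: it can rise even when every household becomes better off, as long as the lower tail falls further behind the middle.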
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Real Median Personal Income in the United States (MEPAINUSA672N) from 1974 to 2023 about personal income, personal, median, income, real, and USA.
This data collection consists of 18 interview transcripts meant to explore the rationales and methods by which investors in Hong Kong buy properties in the UK. The life and impact of the residential choices of the 'super rich' has been a major strand in research by the research team. This work advanced the proposition that the upper-tier of income groups living in cities tend to exploit particular forms of service provision (such as education, cultural life and personal services), are largely distanced from the mundane flow of social life in urban areas and tend to be withdrawn from the civic life of cities more generally. Some of this work is underpinned by the literature on, for example, gated communities, but it has surprisingly been under-used as the guiding framework for close empirical work in affluent neighbourhoods, perhaps largely as a result of the perceived difficulty of working with such individuals. This project will allow us to generate insights into how super-rich neighbourhoods operate, how people come to live there and the social and economic tensions and trade-offs that exist as such processes are allowed to run. As many people question the role and value of wealth and identify inequality as a growing social problem this research will feed into public conversations and policymaker concerns about how socially vital cities can be maintained when capital investment may undermine such objectives on one level (the creation of neighbourhoods that are both exclusive and often 'abandoned' for large parts of the year), while potentially fulfilling broader ambitions at others (over tax receipts for example).
Social research has tended not to focus on the super-rich, largely because they are hard to locate, and even harder to collaborate with in research.
In this project we seek to address these concerns by focusing extensive research effort on the question of where and how the super-rich live and invest in the property markets of the cities of Hong Kong and London. We see these cities as exemplary in assisting in the construction of further insights and knowledge in how the super-rich seek residential investment opportunities, how they live there when they are 'at home' in such residences and how these patterns of investment shape the social, political and economic life of these cities more broadly. Given that the super-rich make such decisions on the basis of tax incentives and the attraction of major cultural infrastructure (such as galleries and theatre) we have proposed a program of research capable of offering an inside account of the practices that go to make-up these investment patterns including processes of searching for suitable property, its financing, the kinds of property deemed to be suitable and an analysis of how estate agents and city authorities seek to capitalise and retain the potentially highly mobile investment by the super-rich. In economic terms the life and functioning of rich neighbourhood spaces appears intuitively important. For example, attractive and safe spaces for captains of industry, senior figures in political and non-government organizations are often regarded as major markers of urban vitality and the foundation of social networks that may make-up the broader glue of civic and political society. Yet we know very little about how such neighbourhoods operate, who they attract and how they are linked to other cities and their neighbourhoods globally. Our aim in this research is to grapple with what might be described as the 'problem' of these super-rich neighbourhoods - sometime called the 'alpha territory' - and undertake research that will help us to understand more about the advantages and disadvantages of these kinds of property investment. 
The research was carried out using semi-structured interviews and participant observation at property fairs and development sites in Hong Kong and different cities in the UK. Moreover, semi-structured interviews were conducted to explore the rationales and methods by which investors in Hong Kong buy properties in the UK. Participants were recruited using searches for relevant key actors as well as accessing personal and professional networks that enabled snowballing techniques to elicit further contacts. Interviews were conducted with individual investors, local government officials, planning officers, inward investment agencies, city government officials and estate agents. Interviews were conducted in both English and Cantonese.
Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas, annual.
In 2022, San Francisco had the highest median household income of cities ranking within the top 25 in terms of population, with a median household income of 136,692 U.S. dollars. In that year, San Jose in California was ranked second, and Seattle, Washington third.
Following a fall after the Great Recession, median household income in the United States has been increasing in recent years. As of 2022, median household income by state was highest in Maryland, Washington, D.C., Utah, and Massachusetts. It was lowest in Mississippi, West Virginia, and Arkansas. Families with an annual income between 25,000 and 49,999 U.S. dollars made up the largest income bracket in America, with about 25.26 million households.
Data on median household income can be compared to statistics on personal income in the U.S. released by the Bureau of Economic Analysis. Personal income rose to around 21.8 trillion U.S. dollars in 2022, the highest value recorded. Personal income is a measure of the total income received by persons from all sources, while median household income is “the amount which divides the income distribution into two equal groups,” according to the U.S. Census Bureau. Half of the population in question lives above median income and half lives below. Though total personal income has increased in recent years, this wealth is not distributed throughout the population. In practical terms, income of most households has decreased. One additional statistic illustrates this disparity: for the lowest quintile of workers, mean household income has remained more or less steady for the past decade at about 13 to 16 thousand constant U.S. dollars annually. Meanwhile, income for the top five percent of workers has actually risen from about 285,000 U.S. dollars in 1990 to about 499,900 U.S. dollars in 2020.
https://creativecommons.org/publicdomain/zero/1.0/
Worldwide Bureaucracy Indicators (WWBI) dataset from the World Bank.
The Worldwide Bureaucracy Indicators (WWBI) database is a unique cross-national dataset on public sector employment and wages that aims to fill an information gap, thereby helping researchers, development practitioners, and policymakers gain a better understanding of the personnel dimensions of state capability, the footprint of the public sector within the overall labor market, and the fiscal implications of the public sector wage bill. The dataset is derived from administrative data and household surveys, thereby complementing existing, expert perception-based approaches.
The World Bank introduced the dataset with a series of four blogs:
Can you replicate the figures in the blogs? Can you display any of the data more clearly than in the blogs?
wwbi_data.csv
variable | class | description |
---|---|---|
country_code | character | 3-letter ISO_3166-1 code |
indicator_code | character | code identifying the indicator of bureaucracy |
year | numeric | year of the data |
value | numeric | numeric value of the data |
wwbi_series.csv
variable | class | description |
---|---|---|
indicator_code | character | code identifying the indicator of bureaucracy |
indicator_name | character | name of the indicator |
wwbi_country.csv
variable | class | description |
---|---|---|
country_code | character | 3-letter ISO_3166-1 code |
short_name | character | short or common name for the country |
table_name | character | more alphabetically sortable name of the country |
long_name | character | full name of the country |
x2_alpha_code | character | 2-letter ISO_3166-1 code |
currency_unit | character | currency unit |
special_notes | character | special notes |
region | character | region |
income_group | character | low, lower middle, upper middle, or high income |
wb_2_code | character | alternate 2-letter code |
national_accounts_base_year | integer | national accounts base year |
national_accounts_reference_year | integer | national accounts reference year |
sna_price_valuation | character | UN system of national accounts price valuation |
lending_category | character | International Development Association (IDA), International Bank for Reconstruction and Development (IBRD), a blend or neither |
other_groups | character | Heavily Indebted Poor Countries initiative (HIPC), or countries classified as the "Euro area" |
system_of_national_accounts | integer | which System of National Accounts methodology the country uses (1968, 1993, or 2008 version) |
balance_of_payments_manual_in_use | character | the version of the Balance of Payments Manual used by the country |
external_debt_reporting_status | character | estimate, preliminary, or actual |
system_of_trade | character | Under the general system imports include goods imported for domestic consumption and imports into bonded warehouses and free trade zones. Under the special system imports comprise goods imported for domestic consumption (including transformation and repair) and withdrawals for domestic consumption from bonded warehouses and free trade zones. Goods transported through a country en route to another are excluded. |
government_accounting_concept | character | government accounting concept |
imf_data_dissemination_standard | character | International Monetary Fund data-dissemination standard: Special Data Dissemination Standard (SDDS, 1996, created for countries that have or seek to have access to international markets), SDDS Plus (2012, the highest tier of data standards, intended for systemically important economies), enhanced GDDS (e-GDDS, 2015, encouraging participants to emphasize data publication) |
latest_household_survey | character | which household survey was most recently administered |
source_of_most_recent_income_and_expenditure_data | character | which survey serves as the basis for income and expenditure data |
vital_registration_complete | logical | whether the vital registration is complete |
latest_agricultural_census | integer | year of latest agricultural census |
latest_industrial_data | integer | year of latest industrial data |
latest_trade_data | in... |
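The three files share keys: `country_code` links `wwbi_data.csv` to `wwbi_country.csv`, and `indicator_code` links it to `wwbi_series.csv`. The sketch below shows one way to join them with the standard library only. The in-memory CSV text, the codes `AAA`, `BBB` and `IND.1`, and every value in it are invented placeholders, not rows from the real files; the helper names are hypothetical.

```python
import csv
import io

# Tiny stand-ins for the three files; every value here is invented.
DATA_CSV = """country_code,indicator_code,year,value
AAA,IND.1,2015,0.25
AAA,IND.1,2016,0.27
BBB,IND.1,2016,0.40
"""
SERIES_CSV = """indicator_code,indicator_name
IND.1,Hypothetical indicator
"""
COUNTRY_CSV = """country_code,short_name,income_group
AAA,Country A,High income
BBB,Country B,Low income
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_wwbi(data_rows, series_rows, country_rows):
    """Attach indicator names and country attributes to each data row."""
    names = {r["indicator_code"]: r["indicator_name"] for r in series_rows}
    countries = {r["country_code"]: r for r in country_rows}
    joined = []
    for row in data_rows:
        country = countries.get(row["country_code"], {})
        joined.append({
            "country": country.get("short_name"),
            "income_group": country.get("income_group"),
            "indicator": names.get(row["indicator_code"]),
            "year": int(row["year"]),
            "value": float(row["value"]),
        })
    return joined


rows = join_wwbi(read_rows(DATA_CSV), read_rows(SERIES_CSV), read_rows(COUNTRY_CSV))
for r in rows:
    print(r)
```

With the real files, `read_rows` would be replaced by opening each CSV from disk; grouping the joined rows by `income_group` is a natural starting point for comparing indicators across country groups.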
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population density per pixel at 100 metre resolution. WorldPop provides estimates of numbers of people residing in each 100x100m grid cell for every low and middle income country. Through integrating census, survey, satellite and GIS datasets in a flexible machine-learning framework, high resolution maps of population counts and densities for 2000-2020 are produced, along with accompanying metadata.
DATASET: Alpha version 2010 and 2015 estimates of numbers of people per grid square, with national totals adjusted to match UN population division estimates (http://esa.un.org/wpp/) and remaining unadjusted.
REGION: Africa
SPATIAL RESOLUTION: 0.000833333 decimal degrees (approx 100m at the equator)
PROJECTION: Geographic, WGS84
UNITS: Estimated persons per grid square
MAPPING APPROACH: Land cover based, as described in: Linard, C., Gilbert, M., Snow, R.W., Noor, A.M. and Tatem, A.J., 2012, Population distribution, settlement patterns and accessibility across Africa in 2010, PLoS ONE, 7(2): e31743.
FORMAT: Geotiff (zipped using 7-zip (open access tool): www.7-zip.org)
FILENAMES: Example - AGO10adjv4.tif = Angola (AGO) population count map for 2010 (10) adjusted to match UN national estimates (adj), version 4 (v4). Population maps are updated to new versions when improved census or other input data become available.
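The naming convention in the example above can be decoded programmatically. The sketch below is based only on the single documented example, `AGO10adjv4.tif`, and makes two assumptions that the source does not confirm: that unadjusted files simply omit the `adj` token, and that the two-digit year always refers to the 2000s. The function name `parse_worldpop_name` is hypothetical.

```python
import re

# ISO3 code, two-digit year, optional "adj" marker, "v" plus version number.
# Assumption: unadjusted files drop the "adj" token entirely.
_PATTERN = re.compile(
    r"^(?P<iso3>[A-Z]{3})(?P<yy>\d{2})(?P<adj>adj)?v(?P<version>\d+)\.tif$"
)


def parse_worldpop_name(filename):
    """Split a WorldPop-style file name into its documented components."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised file name: {filename!r}")
    return {
        "iso3": match["iso3"],
        "year": 2000 + int(match["yy"]),  # assumption: years are 20xx
        "un_adjusted": match["adj"] is not None,
        "version": int(match["version"]),
    }


print(parse_worldpop_name("AGO10adjv4.tif"))
```

File names that do not fit the pattern raise a `ValueError` rather than being guessed at, which is the safer choice when new versions may change the convention.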
https://www.futurebeeai.com/data-license-agreement
Welcome to the Middle Eastern Facial Images from Past Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.
This dataset comprises over 5,000 images, divided into participant-wise sets with each set including:
The dataset includes contributions from a diverse network of individuals across Middle Eastern countries:
To ensure high utility and robustness, all images are captured under varying conditions:
Each image set is accompanied by detailed metadata for each participant, including:
This metadata is essential for training models that can accurately recognize and identify Middle Eastern faces across different demographics and conditions.
This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:
This table presents income shares, thresholds, tax shares, and total counts of individual Canadian tax filers, with a focus on high income individuals (95% income threshold, 99% threshold, etc.). Income thresholds are based on national threshold values, regardless of selected geography; for example, the number of Nova Scotians in the top 1% will be calculated as the number of tax-filing Nova Scotians whose total income exceeded the 99% national income threshold. Different definitions of income are available in the table, namely market, total, and after-tax income, both with and without capital gains.
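The key point, that the threshold is computed nationally and then applied to a chosen geography, can be sketched as follows. The filer records and province labels are invented, the nearest-rank percentile rule is an assumption (the source does not state which percentile method is used), and the function names are hypothetical.

```python
def percentile_threshold(incomes, pct):
    """Nearest-rank percentile of a list of incomes (pct is an integer 1-100)."""
    ordered = sorted(incomes)
    # Ceiling division with integers avoids floating-point rounding surprises.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def count_above_national_threshold(filers, province, pct):
    """Count filers in one province whose income exceeds the NATIONAL threshold."""
    national = percentile_threshold([income for _, income in filers], pct)
    return sum(1 for prov, income in filers if prov == province and income > national)


# Invented (province, total income) records, for illustration only.
filers = [
    ("NS", 30_000), ("NS", 45_000), ("NS", 200_000),
    ("ON", 40_000), ("ON", 55_000), ("ON", 60_000), ("ON", 70_000),
    ("ON", 80_000), ("ON", 90_000), ("ON", 150_000),
]
print(count_above_national_threshold(filers, "NS", 90))
print(count_above_national_threshold(filers, "ON", 90))
```

A province-specific threshold would give a different answer: every province would then have roughly the same share of its own filers above the line, which is exactly what the national definition avoids.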
https://www.futurebeeai.com/data-license-agreement
Welcome to the Middle Eastern Human Facial Images Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.
This dataset comprises over 1500 Middle Eastern individual facial image sets, with each set including:
The dataset includes contributions from a diverse network of individuals across Middle Eastern countries.
To ensure high utility and robustness, all images are captured under varying conditions:
Each facial image set is accompanied by detailed metadata for each participant, including:
This metadata is essential for training models that can accurately recognize and identify faces across different demographics and conditions.
This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:
We understand the evolving nature of AI and machine learning requirements. Therefore, we continuously add more assets with diverse conditions to this off-the-shelf facial image dataset.
This data file includes the Gini coefficient calculated for different wealth welfare aggregates constructed for all Luxembourg Wealth Study (LWS) datasets in all waves (as of March 2022). It includes Gini coefficients calculated on:
• Disposable Net Worth
• Value of Principal residence
• Financial Assets
This project sought to renew the ESRC's invaluable financial support to LIS (formerly the Luxembourg Income Study) for a period of five more years. LIS is an independent, non-profit cross-national data archive and research institute located in Luxembourg. LIS relies on financial contributions from national science foundations, other research institutions and consortia, data-providing agencies, and supranational organisations to support data harmonisation and enable free and unlimited data access to researchers in the participating countries and to students world-wide. LIS' primary activity is to make harmonised household microdata available to researchers, thus enabling cross-national, interdisciplinary primary research into socio-economic outcomes and their determinants. Users of the Luxembourg Income Study Database and Luxembourg Wealth Study Database come from countries around the globe, including the UK. LIS has four goals: 1) to harmonise microdatasets from high- and middle-income countries that include data on income, wealth, employment, and demography; 2) to provide a secure method for researchers to query data that would otherwise be unavailable due to country-specific privacy restrictions; 3) to create and maintain a remote-execution system that sends research query results quickly back to users at off-site locations; and 4) to enable, facilitate, promote and conduct cross-national comparative research on the social and economic wellbeing of populations across countries. LIS contains the Luxembourg Income Study (LIS) Database, which includes income data, and the Luxembourg Wealth Study (LWS) Database, which focuses on wealth data.
LIS currently includes microdata from 46 countries in Europe, the Americas, Africa, Asia and Australasia. LIS contains over 250 datasets, organised into eight time "waves," spanning the years 1968 to 2011. Since 2007, seventeen more countries have been added to LIS, including the BRICS countries (Brazil, Russia, India, China, South Africa), Japan, South Korea and a number of other Latin American countries. LWS contains 20 wealth datasets from 12 countries, including the UK, and covers the period 1994 to 2007. All told, LIS and LWS datasets together cover 86% of world GDP and 64% of world population. Users submit statistical queries to the microdatabases using a Java-based job submission interface or standard email. The databases are especially valuable for primary research in that they offer access to cross-national data at the micro-level - at the level of households and persons. Users are economists, sociologists, political scientists, and policy analysts, among others, and they employ a range of statistical approaches and methods. LIS also provides extensive documentation - metadata - for both LIS and LWS, concerning technical aspects of the survey data, the harmonisation process, and the social institutions of income and wealth provision in participating countries. 
In the next five years, for which support is sought, LIS will: - expand LIS, adding Waves IX (2013) and X (2016), and add new middle-income countries; - develop LWS, adding another wave of datasets to existing countries; acquire new wealth datasets for 14 more countries in cooperation with the European Central Bank (based on the Household Finance and Consumption Survey); - create a state-of-the-art metadata search and storage system; - maintain international standards in data security and data infrastructure systems; - provide high-quality harmonised household microdata to researchers around the world; - enable interdisciplinary cross-national social science research covering 45+ countries, including the UK; - aim to broaden its reach and impact in academic and non-academic circles through focused communications strategies and collaborations. All surveyed households and their members are included in our estimates of Gini and Atkinson coefficients, percentile ratios, and poverty lines. Poverty lines are calculated based on the total population. Those lines are then used to calculate poverty rates among subgroups (children and the elderly). Thus, when calculating poverty rates, the subgroups vary, but the poverty lines remain constant within any given dataset. The data file includes the Gini coefficient calculated for different wealth welfare aggregates constructed for all LWS datasets in all waves (as of March 2022).
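For readers unfamiliar with the measure, the sketch below computes a Gini coefficient from a list of values using the standard sorted-rank formula G = 2 Σ i·x_i / (n Σ x_i) − (n + 1)/n, with i running from 1 to n over the sorted values. It is an unweighted illustration on invented numbers, not the procedure LIS uses on the LWS microdata; the function name `gini` is hypothetical. Note that wealth aggregates such as net worth can be negative, in which case the coefficient may exceed 1.

```python
def gini(values):
    """Unweighted Gini coefficient via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty list with a positive total")
    # Sum of rank * value, with ranks starting at 1 for the smallest value.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Invented wealth values, for illustration only.
print(gini([1, 1, 1, 1]))    # perfectly equal distribution
print(gini([0, 0, 0, 10]))   # one household holds everything
```

With n households and all wealth held by one of them, the formula gives (n − 1)/n rather than exactly 1, which is why small samples never reach the theoretical maximum.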
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ireland IE: Income Share Held by Highest 20% data was reported at 40.200 % in 2015. This records an increase from the previous number of 40.100 % for 2014. Ireland IE: Income Share Held by Highest 20% data is updated yearly, averaging 40.700 % from Dec 2003 (Median) to 2015, with 13 observations. The data reached an all-time high of 41.800 % in 2005 and a record low of 39.300 % in 2008. Ireland IE: Income Share Held by Highest 20% data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Ireland – Table IE.World Bank.WDI: Poverty. Percentage share of income or consumption is the share that accrues to subgroups of population indicated by deciles or quintiles. Percentage shares by quintile may not sum to 100 because of rounding.; ; World Bank, Development Research Group. Data are based on primary household survey data obtained from government statistical agencies and World Bank country departments. Data for high-income economies are from the Luxembourg Income Study database. For more information and methodology, please see PovcalNet (http://iresearch.worldbank.org/PovcalNet/index.htm).; ; The World Bank’s internationally comparable poverty monitoring database now draws on income or detailed consumption data from more than one thousand six hundred household surveys across 164 countries in six regions and 25 other high income countries (industrialized economies). While income distribution data are published for all countries with data available, poverty data are published for low- and middle-income countries and countries eligible to receive loans from the World Bank (such as Chile) and recently graduated countries (such as Estonia) only. See PovcalNet (http://iresearch.worldbank.org/PovcalNet/WhatIsNew.aspx) for definitions of geographical regions and industrialized countries.
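The "income share held by highest 20%" can be illustrated with a short calculation: sort incomes, take the top fifth of the population, and divide their combined income by the total. The sketch is unweighted and uses invented values; the function name `top_quintile_share` is hypothetical, and the World Bank figures are derived from survey microdata rather than from a calculation like this one.

```python
def top_quintile_share(incomes):
    """Percentage of total income received by the richest 20% of units."""
    ordered = sorted(incomes, reverse=True)
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("need a non-empty list with a positive total")
    k = max(1, len(ordered) // 5)  # number of units in the top fifth
    return 100.0 * sum(ordered[:k]) / total


# Invented incomes, for illustration only.
print(top_quintile_share([10, 20, 30, 40, 100]))
```

Under perfect equality the result would be 20 percent, so values such as those reported above can be read as roughly twice the equal-share benchmark.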
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population density per pixel at 100 metre resolution. WorldPop provides estimates of numbers of people residing in each 100x100m grid cell for every low and middle income country. Through integrating census, survey, satellite and GIS datasets in a flexible machine-learning framework, high resolution maps of population counts and densities for 2000-2020 are produced, along with accompanying metadata.
DATASET: Alpha version 2010 and 2015 estimates of numbers of people per grid square, with national totals adjusted to match UN population division estimates (http://esa.un.org/wpp/) and remaining unadjusted.
REGION: Africa
SPATIAL RESOLUTION: 0.000833333 decimal degrees (approx 100m at the equator)
PROJECTION: Geographic, WGS84
UNITS: Estimated persons per grid square
MAPPING APPROACH: Land cover based, as described in: Linard, C., Gilbert, M., Snow, R.W., Noor, A.M. and Tatem, A.J., 2012, Population distribution, settlement patterns and accessibility across Africa in 2010, PLoS ONE, 7(2): e31743.
FORMAT: Geotiff (zipped using 7-zip (open access tool): www.7-zip.org)
FILENAMES: Example - AGO10adjv4.tif = Angola (AGO) population count map for 2010 (10) adjusted to match UN national estimates (adj), version 4 (v4). Population maps are updated to new versions when improved census or other input data become available.
Democratic Republic of the Congo data available from WorldPop here.
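The stated resolution of 0.000833333 decimal degrees can be converted to approximate ground distances, which also shows why "approx 100m" holds only near the equator. The sketch assumes a spherical Earth with about 111,320 metres per degree; that constant and the function name `cell_size_m` are assumptions introduced here, not values from the WorldPop metadata.

```python
import math

CELL_DEG = 0.000833333          # cell size from the dataset description
METRES_PER_DEGREE = 111_320.0   # assumption: approximate length of one degree at the equator


def cell_size_m(latitude_deg):
    """Approximate (east-west width, north-south height) of one grid cell in metres."""
    height = CELL_DEG * METRES_PER_DEGREE
    # East-west distance per degree shrinks with the cosine of latitude.
    width = height * math.cos(math.radians(latitude_deg))
    return width, height


for lat in (0, 15, 30):
    width, height = cell_size_m(lat)
    print(f"latitude {lat:>2} deg: {width:.1f} m wide, {height:.1f} m tall")
```

Because cell area varies with latitude in a geographic projection, persons per grid square are not directly comparable as densities between places at different latitudes without an area correction.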
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
About
The Synthetic Sweden Mobility (SySMo) model provides a simplified yet statistically realistic microscopic representation of the real population of Sweden. The agents in this synthetic population contain socioeconomic attributes, household characteristics, and corresponding activity plans for an average weekday. This agent-based modelling approach derives the transportation demand from the agents’ planned activities using various transport modes (e.g., car, public transport, bike, and walking).
This open data repository contains four datasets:
(1) Synthetic Agents,
(2) Activity Plans of the Agents,
(3) Travel Trajectories of the Agents, and
(4) Road Network (EPSG: 3006)
(OpenStreetMap data were retrieved on August 28, 2023, from https://download.geofabrik.de/europe.html, and GTFS data were retrieved on September 6, 2023 from https://samtrafiken.se/)
The database can serve as input to assess the potential impacts of new transportation technologies, infrastructure changes, and policy interventions on the mobility patterns of the Swedish population.
Methodology
This dataset contains 10.2 million statistically simulated agents representing the population of Sweden, together with their socio-economic characteristics and activity plans for an average weekday. To prepare data for the MATSim simulation, we randomly divided all the agents into 10 batches. Each batch's agents are then simulated in MATSim on a multi-modal network that combines road networks and public transit data in Sweden, built with the package pt2matsim (https://github.com/matsim-org/pt2matsim).
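The random division of agents into 10 batches could look like the sketch below. It is only an illustration of the idea: the source does not describe how the split was implemented, and the function name `split_into_batches`, the fixed seed, and the round-robin assignment after shuffling are all assumptions.

```python
import random


def split_into_batches(agent_ids, n_batches=10, seed=0):
    """Shuffle agent IDs and deal them into n_batches nearly equal groups."""
    ids = list(agent_ids)
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    rng.shuffle(ids)
    # Taking every n-th element after shuffling gives batch sizes that differ by at most one.
    return [ids[i::n_batches] for i in range(n_batches)]


batches = split_into_batches(range(95))
print([len(b) for b in batches])
```

Every agent lands in exactly one batch, so the batches can be simulated independently and their outputs concatenated afterwards.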
The agents' daily activity plans, along with the road network, serve as the primary inputs to the MATSim environment, which replans iteratively while aiming for convergence on optimal activity plans for all agents. The individual mobility trajectories of the agents are then retrieved from the MATSim simulation output.
The activity plans of the individual agents extracted from the MATSim simulation output are then processed further. All agents with a negative utility score and a negative activity time for at least one activity are filtered out as ‘infeasible’ agents. The dataset ‘Synthetic Agents’ contains all synthetic agents regardless of their ‘feasibility’ (0 = excluded from, 1 = included in the plans and trajectories). The other datasets include only agents with feasible activity plans.
The simulation setup adheres to the MATSim 13.0 benchmark scenario, with slight adjustments. The replanning strategy combines BestScore (60%), TimeAllocationMutator (30%), and ReRoute (10%); the percentages denote the proportion of agents using each strategy. In each iteration of the simulation, the agents apply these strategies to adjust their activity plans. The "BestScore" strategy retains the plan with the highest score from the previous iteration, selecting the most successful strategy an agent has employed up to that point. The "TimeAllocationMutator" strategy modifies the end times of activities by introducing random shifts within a specified range, which allows different schedules to be explored. The "ReRoute" strategy enables agents to alter their current routes, potentially optimizing travel based on updated information or preferences. These strategies are described further in the work of W. Axhausen et al. (2016), which covers their implementation and impact in transport simulation modelling.
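The strategy shares above can be read as sampling weights. The sketch below is not MATSim code: it only illustrates drawing one strategy per agent with the stated 60/30/10 proportions and shifting an activity end time at random. The mutation range of 1800 seconds and the clamping to a single day are assumptions made for illustration.

```python
import random
from collections import Counter

# Shares taken from the text; everything else in this sketch is hypothetical.
STRATEGY_WEIGHTS = {
    "BestScore": 0.6,
    "TimeAllocationMutator": 0.3,
    "ReRoute": 0.1,
}


def pick_strategy(rng):
    """Draw one replanning strategy according to the configured shares."""
    names = list(STRATEGY_WEIGHTS)
    weights = list(STRATEGY_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=1)[0]


def mutate_end_time(end_s, rng, mutation_range_s=1800):
    """Shift an activity end time (seconds) by a random amount within the range.

    The result is clamped to one simulated day (0 to 86399 seconds).
    """
    shift = rng.randint(-mutation_range_s, mutation_range_s)
    return min(max(end_s + shift, 0), 86399)


rng = random.Random(42)
counts = Counter(pick_strategy(rng) for _ in range(10_000))
```

With enough draws, the observed counts approach the configured shares, which is the sense in which the percentages describe the proportion of agents using each strategy.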
Data Description
(1) Synthetic Agents
This dataset contains all agents in Sweden and their socioeconomic characteristics.
The attribute ‘feasibility’ has two categories: feasible agents (73%) and infeasible agents (27%). Infeasible agents are agents with a negative utility score and a negative activity time for at least one activity.
File name: 1_syn_pop_all.parquet
Column | Description | Data type | Unit
PId | Agent ID | Integer | -
Deso | Zone code of Demographic statistical areas (DeSO) [1] | |
kommun | Municipality code | |
marital | Marital Status (single/ couple/ child) | |
sex | Gender (0 = Male, 1 = Female) | |
age | Age | |
HId | A unique identifier for households | |
HHtype | Type of households (single/ couple/ other) | |
HHsize | Number of people living in the households | |
num_babies | Number of children less than six years old in the household | |
employment | Employment Status (0 = Not Employed, 1 = Employed) | |
studenthood | Studenthood Status (0 = Not Student, 1 = Student) | |
income_class | Income Class (0 = No Income, 1 = Low Income, 2 = Lower-middle Income, 3 = Upper-middle Income, 4 = High Income) | |
num_cars | Number of cars owned by an individual | |
HHcars | Number of cars in the household | |
feasibility | Status of the individual (1 = feasible, 0 = infeasible) | |
[1] https://www.scb.se/vara-tjanster/oppna-data/oppna-geodata/deso--demografiska-statistikomraden/
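As a minimal sketch of how the ‘feasibility’ flag might be used, the snippet below filters a few made-up agent records and computes the feasible share. The rows are hypothetical and are not taken from 1_syn_pop_all.parquet; in practice the file would probably be loaded with a Parquet reader such as `pandas.read_parquet`, which is an assumption about tooling rather than part of the dataset documentation.

```python
# Hypothetical records that reuse column names from the table above.
agents = [
    {"PId": 1, "HId": 10, "age": 34, "feasibility": 1},
    {"PId": 2, "HId": 10, "age": 36, "feasibility": 1},
    {"PId": 3, "HId": 11, "age": 71, "feasibility": 0},
    {"PId": 4, "HId": 12, "age": 19, "feasibility": 1},
]


def feasible_only(records):
    """Keep the agents that are included in the plans and trajectories."""
    return [r for r in records if r["feasibility"] == 1]


def feasible_share(records):
    """Return the fraction of agents flagged as feasible."""
    if not records:
        return 0.0
    return len(feasible_only(records)) / len(records)


feasible_agents = feasible_only(agents)
share = feasible_share(agents)
```

Applied to the full dataset, the same ratio should come out close to the 73% reported above.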
(2) Activity Plans of the Agents
This dataset contains the activity plans of the car agents (agents that use cars on the simulated day) for a simulated average weekday.
File name: 2_plans_i.parquet, i = 0, 1, 2, ..., 8, 9. (10 files in total)
Column | Description | Data type | Unit
act_purpose | Activity purpose (work/ home/ school/ other) | String | -
PId | Agent ID | Integer | -
act_end | End time of activity (0:00:00 – 23:59:59) | String | hour:minute:second
act_id | Activity index of each agent | Integer | -
mode | Transport mode to reach the activity location | String | -
POINT_X | Coordinate X of activity location (SWEREF99TM) | Float | metre
POINT_Y | Coordinate Y of activity location (SWEREF99TM) | Float | metre
dep_time | Departure time (0:00:00 – 23:59:59) | String | hour:minute:second
score | Utility score of the simulation day as obtained from MATSim | Float | -
trav_time | Travel time to reach the activity location | String | hour:minute:second
trav_time_min | Travel time in decimal minutes | Float | minute
act_time | Activity duration in decimal minutes | Float | minute
distance | Travel distance between the origin and the destination | Float | km
speed | Travel speed to reach the activity location | Float | km/h
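The relation between the time, distance, and speed columns can be sketched as follows. This assumes that `trav_time_min` is `trav_time` converted to decimal minutes and that `speed` is distance divided by travel time; the documentation does not state how the columns were actually derived, so both helper functions are illustrative only.

```python
def hms_to_minutes(hms):
    """Convert an 'hour:minute:second' string to decimal minutes."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 60 + minutes + seconds / 60


def speed_kmh(distance_km, trav_time_min):
    """Return the average speed in km/h, or None when travel time is not positive."""
    if trav_time_min <= 0:
        return None
    return distance_km / (trav_time_min / 60)


# Hypothetical trip: 10 km travelled in 0:30:00.
trip_minutes = hms_to_minutes("0:30:00")
trip_speed = speed_kmh(10.0, trip_minutes)
```

Returning None for non-positive travel times avoids a division by zero for activities that are reached without travelling.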
(3) Travel Trajectories of the Agents
This dataset contains the driving trajectories of all the agents on the road network, and the public transit vehicles used by these agents, including buses, ferries, trams etc. The files are produced by MATSim simulations and organised into 10 ‘*.parquet’ files (one per simulation batch), each corresponding to a plan file.
File name: 3_events_i.parquet, i = 0, 1, 2, ..., 8, 9. (10 files in total)
Column | Description | Data type | Unit
time | Time in seconds in a simulation day (0-86399) | Integer | second
type | Event type defined by MATSim simulation* | String |
person | Agent ID | Integer |
link | Nearest road link consistent with the road network | String |
vehicle | Vehicle ID identical to person | Integer |
from_node | Start node of the link | Integer |
to_node | End node of the link | Integer |
* One typical episode of MATSim simulation events: Activity ends (actend) -> Agent’s vehicle enters traffic (vehicle enters traffic) -> Agent’s vehicle moves from previous road segment to its next connected one (left link) -> Agent’s vehicle leaves traffic for activity (vehicle leaves traffic) -> Activity starts (actstart)
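The typical episode can be checked against a sequence of events with a short sketch. The event type strings are copied from the description above; the sample events, link names, and helper functions are hypothetical, and real event files may contain other event types as well.

```python
TYPICAL_EPISODE = [
    "actend",
    "vehicle enters traffic",
    "left link",
    "vehicle leaves traffic",
    "actstart",
]


def is_typical_episode(event_types):
    """Collapse consecutive repeated types, then compare with the typical order."""
    collapsed = []
    for event_type in event_types:
        if not collapsed or collapsed[-1] != event_type:
            collapsed.append(event_type)
    return collapsed == TYPICAL_EPISODE


def links_traversed(events, person):
    """List the links a person left, ordered by event time."""
    rows = [e for e in events if e["person"] == person and e["type"] == "left link"]
    return [e["link"] for e in sorted(rows, key=lambda e: e["time"])]


# Hypothetical events for one agent (person 7) driving across two links.
events = [
    {"time": 28800, "type": "actend", "person": 7, "link": "a"},
    {"time": 28801, "type": "vehicle enters traffic", "person": 7, "link": "a"},
    {"time": 28860, "type": "left link", "person": 7, "link": "a"},
    {"time": 28920, "type": "left link", "person": 7, "link": "b"},
    {"time": 28980, "type": "vehicle leaves traffic", "person": 7, "link": "c"},
    {"time": 28981, "type": "actstart", "person": 7, "link": "c"},
]

episode_ok = is_typical_episode([e["type"] for e in events])
path = links_traversed(events, 7)
```

Collapsing repeats lets a trip of any length match the pattern, since a longer route only adds more "left link" events in a row.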
(4) Road Network
This dataset contains the road network.
File name: 4_network.shp
Column | Description | Data type | Unit
length | The length of road link | Float | metre
freespeed | Free speed | Float | km/h
capacity | Number of vehicles | Integer |
permlanes | Number of lanes | Integer |
oneway | Whether the segment is one-way (0 = no, 1 = yes) | Integer |
modes | Transport mode | String |
from_node | Start node of the link | Integer |
to_node | End node of the link | Integer |
geometry | LINESTRING (SWEREF99TM) | geometry | metre
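A common use of the `length` and `freespeed` attributes is to estimate free-flow travel time per link. The sketch below assumes the units listed in the table (metres and km/h); the link values are made up, and the function names are hypothetical.

```python
def free_flow_time_s(length_m, freespeed_kmh):
    """Return the free-flow traversal time of a link in seconds."""
    if freespeed_kmh <= 0:
        raise ValueError("freespeed must be positive")
    # Convert km/h to m/s by dividing by 3.6, then divide length by speed.
    return length_m / (freespeed_kmh / 3.6)


def path_free_flow_time_s(links):
    """Sum the free-flow times over a sequence of (length_m, freespeed_kmh) pairs."""
    return sum(free_flow_time_s(length, speed) for length, speed in links)


# Hypothetical route: 1 km at 60 km/h followed by 500 m at 30 km/h.
route = [(1000.0, 60.0), (500.0, 30.0)]
route_time = path_free_flow_time_s(route)
```

Free-flow time is a lower bound on travel time, because it ignores congestion and the `capacity` attribute.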
Additional Notes
This research is funded by the RISE Research Institutes of Sweden, the Swedish Research Council for Sustainable Development (Formas, project number 2018-01768), and Transport Area of Advance, Chalmers.
Contributions
YL designed the simulation, analyzed the simulation data, and, along with CT, executed the simulation. CT, SD, FS, and SY conceptualized the model (SySMo), with CT and SD further developing the model to produce agents and their activity plans. KG wrote the data document. All authors reviewed, edited, and approved the final document.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
The urbanization process may affect the lifestyle of rural residents in China. Limited information exists on the extent of sedentarism and the physical activity (PA) level of rural residents in middle-income countries. This is the first survey on sedentary time (ST) and PA among rural residents in eastern China.
Methods
This cross-sectional observational study randomly sampled rural adults from Zhejiang Province in eastern China (n = 1,320). Participants' ST and PA levels were determined from the International Physical Activity Questionnaire Short Form through face-to-face interviews, and the factors influencing PA levels were assessed through multi-class logistic regression analysis.
Results
The daily ST of the participants ranged from 30 to 660 min, with a median of 240 min (P25, P75: 120, 240 min), and 54.6% of participants were sedentary for 240 min or more. The daily ST of men, people aged 18 to 44 years, people with a bachelor's degree or above, people working for government agencies or institutions, unmarried people, and people with an average income of < 2,000 Yuan was longer than that of the other respective groups (p < 0.01). In contrast, the daily ST of people with hypertension, osteoporosis, or osteopenia was shorter than that of people without these conditions (p < 0.01). Additionally, 69.4% of participants generally had a low level of PA (LPA). Compared with those living in northern Zhejiang, people living in southern Zhejiang who were aged 18–44 years, had a bachelor's degree or above, were farmers, and had household incomes below 10,000 Yuan per month were more likely to engage in LPA than people aged > 60 years, with high school or technical education levels or with junior college degrees, working in government agencies and institutions, and with household incomes above 10,000 Yuan per month (p < 0.05). Furthermore, there was no correlation between ST and PA levels.
Conclusion
Most rural residents in Zhejiang Province in eastern China had long daily ST and LPA. This was predominant in men, young people, highly educated people, unmarried people, and middle- to high-income people. Health education programs should be targeted toward specific population groups to decrease ST and increase PA.
The project, based at the University of Greenwich, UK and Stellenbosch University, South Africa, aimed to examine epidemiologic transitions by identifying and quantifying the drivers of change in CVD risk in the middle-income country of South Africa compared to the high-income nation of England. The project produced a harmonised dataset of national surveys measuring CVD risk factors in South Africa and England for others to use in future work. The harmonised dataset includes microdata from nationally-representative surveys in South Africa derived from the Demographic and Health Surveys, National Income Dynamics Study, South Africa National Health and Nutrition Examination Survey and Study on Global Ageing and Adult Health, covering 11 cross-sections and approximately 156,000 individuals aged 15+ years, representing South Africa’s adult population from 1998 to 2017.
Data for England come from 17 Health Surveys for England (HSE) over the same time period, covering over 168,000 individuals aged 16+ years, representing England’s adult population.
This study uses existing data to identify drivers of recent health transitions in South Africa compared to England. The global burden of non-communicable diseases (NCDs) on health is increasing. Cardiovascular diseases (CVD) in particular are the leading causes of death globally and often share characteristics with many major NCDs. Namely, they tend to increase with age and are influenced by behavioural factors such as diet, exercise and smoking. Risk factors for CVD are routinely measured in population surveys and thus provide an opportunity to study health transitions. Understanding the drivers of health transitions in countries that have not followed expected paths (eg, South Africa) compared to those that exemplified models of 'epidemiologic transition' (eg, England) can generate knowledge on where resources may best be directed to reduce the burden of disease. In the middle-income country of South Africa, CVD is the second leading cause of death after HIV/AIDS and tuberculosis (TB). Moreover, many of the known risk factors for NCDs like CVD are highly prevalent. Rates of hypertension are high, with recent estimates suggesting that over 40% of adults have high blood pressure. Around 60% of women and 30% of men over 15 are overweight in South Africa. In addition, excessive alcohol consumption, a risk factor for many chronic diseases, is high, with over 30% of men aged 15 and older having engaged in heavy episodic drinking within a 30-day period. Nevertheless, infectious diseases such as HIV/AIDS remain the leading cause of death, though many with HIV/AIDS and TB also have NCDs. In high-income countries like England, by contrast, NCDs such as CVD have been the leading causes of death since the mid-1900s. However, CVD and risk factors such as hypertension have been declining in recent decades due to increased prevention and treatment. 
The major drivers of change in disease burden have been attributed to factors including ageing, improved living standards, urbanisation, lifestyle change, and reduced infectious disease. Together, these changes are often referred to as the epidemiologic transition. However, recent research has questioned whether epidemiologic transition theory accurately describes the experience of many low- and middle-income countries or, in fact, of high-income nations such as England. Furthermore, few studies have empirically tested the relative contributions of demographic, behavioural, health and economic factors to trends in disease burden and risk, particularly on the African continent. In addition, many social and environmental factors are overlooked in this research. To address these gaps, our study will use population measurements of CVD risk derived from surveys in South Africa over nearly 20 years in order to examine whether and to what extent demographic, behavioural, environmental, medical, social and other factors contribute to recent health trends and transitions. We will compare these trends to those occurring in England over the same time period. Thus, this analysis seeks to illuminate the drivers of health transitions in a country which is assumed to still be 'transitioning' to a chronic disease profile but which continues to have a high infectious disease burden (South Africa) as compared to a country which is assumed to have already transitioned following epidemiological transition theory (England). The analysis will employ modelling techniques on pooled cross-sectional data to examine how various factors explain the variation in CVD risk over time in representative population samples from South Africa and England. The results of this analysis may help to identify some of the main contributors to recent changes in CVD risk in South Africa and England. 
Such information can be used to pinpoint potential areas for intervention, such as social policy and services, thereby helping to set priorities for governmental and...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset highlights the demographics of upper-middle-class people living in Gachibowli, Hyderabad, India, and attempts, through various methods of statistical analysis, to establish relationships between several of these demographic details.