The South African Population Research Infrastructure Network (SAPRIN) is a national research infrastructure funded through the Department of Science and Technology and hosted by the South African Medical Research Council. One of SAPRIN’s initial goals has been to harmonise the legacy longitudinal data from the three current Health and Demographic Surveillance System (HDSS) Nodes. These long-standing nodes are the MRC/Wits University Agincourt HDSS in Bushbuckridge District, Mpumalanga, established in 1993, with a population of 116 000 people; the University of Limpopo DIMAMO HDSS in the Capricorn District of Limpopo, established in 1996, with a current population of 100 000; and the Africa Health Research Institute (AHRI) HDSS in uMkhanyakude District, KwaZulu-Natal, established in 2000, with a current population of 125 000.
SAPRIN data are processed for longitudinal analysis by organising the demographic data into residence episodes at a geographical location, and membership episodes within a household. Start events include enumeration, birth, in-migration and relocating into a household from within the study population; exit events include death (by cause), out-migration, and relocating to another location in the study population. Variables routinely updated at individual level include health care utilisation, marital status, labour status and education status; household asset status is also recorded. Anticipated outcomes of SAPRIN include: (i) regular releases of up-to-date, longitudinal data, representative of South Africa’s fast-changing poorer communities, for research, interpretation and calibration of national datasets; (ii) national statistics triangulation, whereby longitudinal SAPRIN data are triangulated with National Census data for calibration of national statistics and for studying the mechanisms driving the national statistics; (iii) an interdisciplinary research platform for conducting observational and interventional research at population level; (iv) policy engagement to provide evidence to underpin policy-making for cost evaluation and targeting intervention programmes, thereby improving the accuracy and efficiency of pro-poor, health and wellbeing interventions; (v) scientific education through training at related universities; and (vi) community engagement, whereby coordinated engagement with communities will enable two-way learning between researchers and community members, and will enable research site communities and service providers to access and make effective use of research results.
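The residence-episode structure described at the start of this record can be illustrated with a minimal sketch. The event codes, the tuple layout and the function name below are assumptions made for illustration; they are not SAPRIN's actual schema or processing code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical event codes, loosely following the start and exit events named above.
START_EVENTS = {"enumeration", "birth", "in_migration", "internal_move_in"}
EXIT_EVENTS = {"death", "out_migration", "internal_move_out"}


@dataclass
class Episode:
    individual_id: int
    location_id: int
    start: date
    start_event: str
    end: Optional[date] = None       # None means the episode is still open
    end_event: Optional[str] = None


def build_episodes(events):
    """Pair start and exit events into residence episodes.

    `events` is an iterable of
    (individual_id, location_id, event_date, event_type) tuples.
    """
    open_episodes = {}  # (individual_id, location_id) -> open Episode
    episodes = []
    # Process each individual's events in date order (the sort is stable).
    for ind, loc, when, kind in sorted(events, key=lambda e: (e[0], e[2])):
        key = (ind, loc)
        if kind in START_EVENTS:
            episode = Episode(ind, loc, when, kind)
            open_episodes[key] = episode
            episodes.append(episode)
        elif kind in EXIT_EVENTS and key in open_episodes:
            episode = open_episodes.pop(key)
            episode.end, episode.end_event = when, kind
    return episodes
```

An individual who relocates within the study area would appear as one closed episode at the old location followed by an open episode at the new one.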
The Agincourt HDSS covers an area of approximately 420 km² and is located in Bushbuckridge District, Mpumalanga, in the rural north-east of South Africa close to the Mozambique border. DIMAMO is located in the Capricorn district, Limpopo Province, approximately 40 km from Polokwane, the capital city of Limpopo Province, and 15-50 km from the University of Limpopo (Turfloop Campus). The site covers an area of approximately 200 km². AHRI is situated in the south-east portion of the Umkhanyakude district of KwaZulu-Natal province near the town of Mtubatuba. It is bounded on the west by the Umfolozi-Hluhluwe nature reserve, on the south by the Umfolozi river, on the east by the N2 highway (except for portions where the KwaMsane township straddles the highway) and on the north by the Inyalazi river for portions of the boundary. The area is 438 km².
Exposure episodes
Households resident in dwellings within the study area will be eligible for inclusion in the household component of SAPRIN. All individuals identified by the household proxy informant as a member of the household will be enumerated. A resident household member is an individual that intends to sleep the majority of time at the dwelling occupied by the household over a four-month period. Households will include resident and non-resident members. An individual is a non-resident member if they have close ties to the household, but do not physically reside with the household most of the time. They can also be called temporary migrants and they are enumerated within the household list. Because household membership is not tied to physical residency, an individual may be a member of more than one household.
Event/transaction data
This dataset is not based on a sample but contains information from the complete demographic surveillance areas.
A Data Dictionary for the TSS Individual Reports with Comments reports.
Analytics refers to the methodical examination and calculation of data or statistics. Its purpose is to uncover, interpret, and convey meaningful patterns found within the data. Additionally, analytics involves utilizing these data patterns to make informed decisions. It proves valuable in domains abundant with recorded information, employing a combination of statistics, computer programming, and operations research to measure performance.
Businesses can leverage analytics to describe, predict, and enhance their overall performance. Various branches of analytics encompass predictive analytics, prescriptive analytics, enterprise decision management, descriptive analytics, cognitive analytics, Big Data Analytics, retail analytics, supply chain analytics, store assortment and stock-keeping unit optimization, marketing optimization and marketing mix modeling, web analytics, call analytics, speech analytics, sales force sizing and optimization, price and promotion modeling, predictive science, graph analytics, credit risk analysis, and fraud analytics. Due to the extensive computational requirements involved (particularly with big data), analytics algorithms and software utilize state-of-the-art methods from computer science, statistics, and mathematics.
Columns | Description |
---|---|
Company Name | Company Name refers to the name of the organization or company where an individual is employed. It represents the specific entity that provides job opportunities and is associated with a particular industry or sector. |
Job Title | Job Title refers to the official designation or position held by an individual within a company or organization. It represents the specific role or responsibilities assigned to the person in their professional capacity. |
Salaries Reported | Salaries Reported indicates the information or data related to the salaries of employees within a company or industry. This data may be collected and reported through various sources, such as surveys, employee disclosures, or public records. |
Location | Location refers to the specific geographical location or area where a company or job position is situated. It provides information about the physical location or address associated with the company's operations or the job's work environment. |
Salary | Salary refers to the monetary compensation or remuneration received by an employee in exchange for their work or services. It represents the amount of money paid to an individual on a regular basis, typically in the form of wages or a fixed annual income. |
This Dataset consists of salaries for Data Scientists, Machine Learning Engineers, Data Analysts, and Data Engineers in various cities across India (2022).
- Salary Dataset.csv
- Partially Cleaned Salary Dataset.csv
This Dataset is created from https://www.glassdoor.co.in/. If you want to learn more, you can visit the Website.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
EQA30 - Individuals who experienced discrimination in social settings. Published by Central Statistics Office. Available under the license Creative Commons Attribution 4.0 (CC-BY-4.0).
This dataset includes a table of the VOC concentrations detected in firefighter breath samples. QQ-plots for benzene, toluene, and ethylbenzene levels in breath samples as well as box-and-whisker plots of pre-, post-, and 1 h post-exposure breath levels of VOCs for firefighters participating in attack, search, and outside ventilation positions are provided. Graphs detailing the responses of individuals to pre-, post-, and 1 h post-exposure concentrations of benzene, toluene, and ethylbenzene are shown. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The original dataset contains identification information for the firefighters who participated in the controlled structure burns. The analyzed tables and graphs can be made publicly available. Format: The original dataset contains identification information for the firefighters who participated in the controlled structure burns. The analyzed tables and graphs can be made publicly available. This dataset is associated with the following publication: Wallace, A., J. Pleil, K. Oliver, D. Whitaker, S. Mentese, K. Fent, and G. Horn. Targeted GC-MS analysis of firefighters’ exhaled breath: Exploring biomarker response at the individual level. JOURNAL OF OCCUPATIONAL AND ENVIRONMENTAL HYGIENE. Taylor & Francis, Inc., Philadelphia, PA, USA, 16(5): 355-366, (2019).
Occupation data for 2021 and 2022 data files
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022 (https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/articles/revisionofmiscodedoccupationaldataintheonslabourforcesurveyuk/january2021toseptember2022).
Latest edition information
For the third edition (September 2023), the variables NSECM20, NSECMJ20, SC2010M, SC20SMJ, SC20SMN, SOC20M and SOC20O have been replaced with new versions. Further information on the SOC revisions can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022 (https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/articles/revisionofmiscodedoccupationaldataintheonslabourforcesurveyuk/january2021toseptember2022).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is the result of an online survey the authors conducted in the German agricultural science community in 2020. The survey inquires not only about the status quo, but also explicitly about the wishes and needs of users, representing the agricultural scientific research domain, of the in-progress NFDI (national research data infrastructure). Questions cover information about produced and (re-)used data, data quality aspects, information about the use of standards, publication practices and legal aspects of agricultural research data, the current situation in research data management in regards to awareness, consulting and curricula as well as needs of the agricultural community in respect to future developments. In total, the questionnaire contained 52 questions and was conducted using the Community Edition of the Open Source Survey Tool LimeSurvey (Version 3.19.3; LimeSurvey GmbH). The questions were accessible in English and German. The first set of questions (Questions 1-4) addressed the respondent’s professional background (i.e. career status, affiliation and subject area, but no personal data) and the user group. The user groups included data users, data providers as well as infrastructure service and information service providers. Subsequent questions were partly user group specific. All questions, the corresponding question types and addressed user groups can be found in the questionnaire files (Survey-Questions-2020-DE.pdf German Version; Survey-Questions-2020-EN.pdf English Version). The survey was accessible online between June 26th and July 21st 2020, could be completed anonymously and took about 20 minutes. The survey was promoted in an undirected manner via mail lists of agricultural institutes and agricultural-specific professional societies in Germany, via social media (e.g. Twitter) and announced during the first community workshop of NFDI4Agri on July 15th 2020 and other scientific events. 
After closing the survey, we exported the data from the LimeSurvey tool and initially screened it. We considered all questionnaires that contained at least one answered question in addition to the respondent’s professional background information (Questions 1-4). In total, we received 196 questionnaires, of which 160 were completed in full (although not every answer option was always used; empty cells are filled with “N/A”). The main data set contains all standardized answers from the respondents. For anonymization, respondents’ individual answers, for instance free text answers, comments and details in the category “other”, were removed from the main data set. The main data set only lists whether such information was provided (“Yes”) or not (“No” or “N/A”). In an additional file, respondents’ individual answers to questions 4-52 are listed alphabetically, so that it is not possible to trace the data back. In the rare cases where only one person has provided such individual information in an answer, it is traceable but does not contain any sensitive data. The main data set containing answers of the 196 questionnaires received can be found in the file Survey-2020-Main-DataSet-Answers.xlsx. The subsidiary data set containing the respondents’ individual answers to questions 4-52 (most answers are in German and are not translated), for instance free text answers, comments and details in the category “other” (alphabetically listed), can be found in Survey-2020-Subsidary-DataSet-Free_Text_Answers.xlsx.
The Cognitive Triad Dataset (CTD) is used to understand Beck's cognitive triad mechanism in an individual, which is crucial for early diagnosis and prognosis of depression. It contains 5886 messages, including 4706 from Twitter, 600 from the Time-to-Change blog, and 580 from Beyond Blue personal stories. Six well-trained annotators manually labeled the data. The data include six classes: self-negative, world-negative, future-negative, self-positive, world-positive, and future-positive. The CTD was evaluated on various sentiment classification algorithms. The dataset will assist in understanding Beck's Cognitive Triad Inventory (CTI) items in an individual's social media messages.
These data, which are part of a larger study undertaken by the University of Wisconsin-Milwaukee, evaluate the responses of criminal justice employees to affirmative action within criminal justice agencies. Information is provided on employees' (1) general mood, (2) attitudes across various attributes, such as race, sex, rank, education and length of service, and (3) demographic characteristics including age, sex, race, educational level, parents' occupations, and living arrangements. The use of criminal justice employees as the units of analysis provides attitudinal and perceptual data in assessing affirmative action programs within each agency. Variables include reasons for becoming a criminal justice employee, attitudes toward affirmative action status in general, and attitudes about affirmative action in criminal justice settings.
This dataset is made available under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). See LICENSE.pdf for details.
Dataset description
Parquet file, with:
35694 rows
154 columns
The file is indexed on [participant]_[month], such that 34_12 means month 12 from participant 34. All participant IDs have been replaced with randomly generated integers and the conversion table deleted.
Column names and explanations are included as a separate tab-delimited file. Detailed descriptions of feature engineering are available from the linked publications.
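As a small illustration of the index convention described above, the sketch below splits a key of the form [participant]_[month] into its two integer parts. The function name is hypothetical and not part of the dataset's tooling.

```python
from typing import Tuple


def split_index(key: str) -> Tuple[int, int]:
    """Split an index key such as '34_12' into (participant, month)."""
    participant, month = key.split("_")
    return int(participant), int(month)
```

For example, `split_index("34_12")` separates participant 34 from month 12, which makes it straightforward to group rows by participant or sort them by month.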
File contains aggregated, derived feature matrix describing person-generated health data (PGHD) captured as part of the DiSCover Project (https://clinicaltrials.gov/ct2/show/NCT03421223). This matrix focuses on individual changes in depression status over time, as measured by PHQ-9.
The DiSCover Project is a 1-year long longitudinal study consisting of 10,036 individuals in the United States, who wore consumer-grade wearable devices throughout the study and completed monthly surveys about their mental health and/or lifestyle changes, between January 2018 and January 2020.
The data subset used in this work comprises the following:
Wearable PGHD: step and sleep data from the participants’ consumer-grade wearable devices (Fitbit) worn throughout the study
Screener survey: prior to the study, participants self-reported socio-demographic information, as well as comorbidities
Lifestyle and medication changes (LMC) survey: every month, participants were requested to complete a brief survey reporting changes in their lifestyle and medication over the past month
Patient Health Questionnaire (PHQ-9) score: every 3 months, participants were requested to complete the PHQ-9, a 9-item questionnaire that has proven to be reliable and valid to measure depression severity
From these input sources we define a range of input features, both static (defined once, remain constant for all samples from a given participant throughout the study, e.g. demographic features) and dynamic (varying with time for a given participant, e.g. behavioral features derived from consumer-grade wearables).
The dataset contains a total of 35,694 rows, one for each month of data collection from each participant. We can generate 3-month long, non-overlapping, independent samples to capture changes in depression status over time with PGHD. We use the notation ‘SM0’ (sample month 0), ‘SM1’, ‘SM2’ and ‘SM3’ to refer to relative time points within each sample. Each 3-month sample consists of: PHQ-9 survey responses at SM0 and SM3, one set of screener survey responses, LMC survey responses at SM3 (as well as SM1 and SM2, if available), and wearable PGHD for SM3 (and SM1 and SM2, if available). The wearable PGHD includes data collected from 8 to 14 days prior to the PHQ-9 label generation date at SM3. Doing this generates a total of 10,866 samples from 4,036 unique participants.
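The windowing idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: it assumes that windows are chosen greedily in month order and that consecutive windows may share a boundary month (the SM3 of one window can be the SM0 of the next), so that the feature months SM1 to SM3 never overlap. The function name and input layout are hypothetical.

```python
def three_month_samples(phq9_by_month):
    """Return (SM0, SM3) month pairs with a PHQ-9 score at both ends.

    `phq9_by_month` maps month number -> PHQ-9 score for one participant.
    Assumption: a window starting at month m covers m..m+3, and the next
    window may start at m+3 or later, never earlier.
    """
    samples = []
    next_allowed_start = None
    for sm0 in sorted(phq9_by_month):
        if next_allowed_start is not None and sm0 < next_allowed_start:
            continue  # this month falls inside the previous window
        sm3 = sm0 + 3
        if sm3 in phq9_by_month:
            samples.append((sm0, sm3))
            next_allowed_start = sm3
    return samples
```

The change in depression status for a sample would then be derived by comparing the PHQ-9 scores at the two returned months.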
These family food datasets contain more detailed information than the ‘Family Food’ report and mainly provide statistics from 2001 onwards. The UK household purchases and the UK household expenditure spreadsheets include statistics from 1974 onwards. These spreadsheets are updated annually when a new edition of the ‘Family Food’ report is published.
The ‘purchases’ spreadsheets give the average quantity of food and drink purchased per person per week for each food and drink category. The ‘nutrient intake’ spreadsheets give the average nutrient intake (eg energy, carbohydrates, protein, fat, fibre, minerals and vitamins) from food and drink per person per day. The ‘expenditure’ spreadsheets give the average amount spent in pence per person per week on each type of food and drink. Several different breakdowns are provided in addition to the UK averages including figures by region, income, household composition and characteristics of the household reference person.
This dataset contains FEMA applicant-level data for the Individuals and Households Program (IHP). All PII information has been removed. The location is represented by county, city, and zip code. This dataset contains Individual Assistance (IA) applications from DR1439 (declared in 2002) to those declared over 30 days ago. The full data set is refreshed on an annual basis and refreshed weekly to update disasters declared in the last 18 months. This dataset includes all major disasters and includes only valid registrants (applied in a declared county, within the registration period, having damage due to the incident and damage within the incident period). Information about individual data elements and descriptions is listed in the metadata information within the dataset.
Valid registrants may be eligible for IA assistance, which is intended to meet basic needs and supplement disaster recovery efforts. IA assistance is not intended to return disaster-damaged property to its pre-disaster condition. Disaster damage to secondary or vacation homes does not qualify for IHP assistance.
Data comes from FEMA's National Emergency Management Information System (NEMIS) with raw, unedited, self-reported content and is subject to a small percentage of human error.
Any financial information is derived from NEMIS and not FEMA's official financial systems. Due to differences in reporting periods, status of obligations and application of business rules, this financial information may differ slightly from official publication on public websites such as usaspending.gov. This dataset is not intended to be used for any official federal reporting.
Citation: The Agency’s preferred citation for datasets (API usage or file downloads) can be found on the OpenFEMA Terms and Conditions page, Citing Data section: https://www.fema.gov/about/openfema/terms-conditions.
Due to the size of this file, tools other than a spreadsheet may be required to analyze, visualize, and manipulate the data. MS Excel will not be able to process files this large without data loss. It is recommended that a database (e.g., MS Access, MySQL, PostgreSQL, etc.) be used to store and manipulate data. Other programming tools such as R, Apache Spark, and Python can also be used to analyze and visualize data. Further, basic Linux/Unix tools can be used to manipulate, search, and modify large files.
If you have media inquiries about this dataset, please email the FEMA News Desk at FEMA-News-Desk@fema.dhs.gov or call (202) 646-3272. For inquiries about FEMA's data and Open Government program, please email the OpenFEMA team at OpenFEMA@fema.dhs.gov.
This dataset is scheduled to be superseded by Valid Registrations Version 2 by early CY 2024.
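As one way of working with a file too large for a spreadsheet, the following minimal sketch streams a CSV export row by row with Python's standard library and tallies one column, so the whole file is never held in memory. The function name and the column name used in the example are assumptions; consult the dataset's metadata for the real field names.

```python
import csv
from collections import Counter


def count_by_column(lines, column):
    """Stream CSV rows and count the values found in one column.

    `lines` is any iterable of text lines, such as an open file object.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts
```

In practice the function would be called with a file opened via `open(path, newline="")`, where `path` points to the downloaded CSV.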
Data used in this study has PII and cannot be made public. Please contact the corresponding author of the manuscript. Dhingra R, Keeler C, Staley BS, Jardel HV, Ward-Caviness C, Rebuli ME, Xi Y, Rappazzo K, Hernandez M, Chelminski AN, Jaspers I, Rappold AG. Wildfire smoke exposure and early childhood respiratory health: a study of prescription claims data. Environ Health. 2023 Jun 27;22(1):48. doi: 10.1186/s12940-023-00998-5. PMID: 37370168; PMCID: PMC10294519. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Please reach out to the corresponding author. Format: Data includes HIPAA protected information. This dataset is associated with the following publication: Dhingra, R., C. Keeler, B. Staley, H. Jardel, C. Ward-Caviness, M. Rebuli, Y. Xi, K. Rappazzo, M. Hernandez, A. Chelminski, i. Jaspers, and A. Rappold. Wildfire smoke exposure and early childhood respiratory health: a study of prescription claims data. ENVIRONMENTAL HEALTH. BioMed Central Ltd, London, UK, 22(1): 48, (2023).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This template covers section 2.5 Resource Fields: Entity and Attribute Information of the Data Discovery Form cited in the Open Data DC Handbook (2022). It completes documentation elements that are required for publication. Each field column (attribute) in the dataset needs a description clarifying the contents of the column. Data originators are encouraged to enter the code values (domains) of the column to help end-users translate the contents of the column where needed, especially when lookup tables do not exist.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
EQA69 - Individuals who experienced discrimination in accessing/using health services. Published by Central Statistics Office. Available under the license Creative Commons Attribution 4.0 (CC-BY-4.0).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The YJMob100K human mobility datasets (YJMob100K_dataset1.csv.gz and YJMob100K_dataset2.csv.gz) contain the movement of a total of 100,000 individuals across a 75-day period, discretized into 30-minute intervals and 500 meter grid cells. The first dataset contains the movement of 80,000 individuals across a 75-day business-as-usual period, while the second dataset contains the movement of 20,000 individuals across a 75-day period (including the last 15 days during an emergency) with unusual behavior.
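The discretization described above can be illustrated with a minimal sketch that maps a position in meters and a time of day to a grid cell and a 30-minute slot. The function name, the coordinate origin and the zero-based indexing are assumptions for illustration; the published files may use a different indexing convention.

```python
def discretize(x_m, y_m, seconds_since_midnight,
               cell_size_m=500, slot_seconds=1800):
    """Map a position (meters) and time of day to (cell_x, cell_y, time_slot).

    Indices are zero-based here, which is an assumption of this sketch.
    """
    cell_x = int(x_m // cell_size_m)
    cell_y = int(y_m // cell_size_m)
    time_slot = int(seconds_since_midnight // slot_seconds)
    return cell_x, cell_y, time_slot
```

With 30-minute slots, each day is divided into 48 time slots, so valid slot indices in this sketch run from 0 to 47.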
While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (cell_POIcat.csv.gz). The list of 85 POI categories can be found in POI_datacategories.csv.
For details of the dataset, see Data Descriptor:
Yabe, T., Tsubouchi, K., Shimizu, T., Sekimoto, Y., Sezaki, K., Moro, E., & Pentland, A. (2024). YJMob100K: City-scale and longitudinal dataset of anonymized human mobility trajectories. Scientific Data, 11(1), 397. https://www.nature.com/articles/s41597-024-03237-9
--- Details about the Human Mobility Prediction Challenge 2023 (ended November 13, 2023) ---
The challenge takes place in a mid-sized and highly populated metropolitan area, somewhere in Japan. The area is divided into 500 meters x 500 meters grid cells, resulting in a 200 x 200 grid cell space.
The human mobility datasets (task1_dataset.csv.gz and task2_dataset.csv.gz) contain the movement of a total of 100,000 individuals across a 90 day period, discretized into 30-minute intervals and 500 meter grid cells. The first dataset contains the movement of a 75 day business-as-usual period, while the second dataset contains the movement of a 75 day period during an emergency with unusual behavior.
There are 2 tasks in the Human Mobility Prediction Challenge.
In Task 1, participants are provided with the full time series data (75 days) for 80,000 individuals, and partial (only 60 days) time series movement data for the remaining 20,000 individuals (task1_dataset.csv.gz). Given the provided data, Task 1 of the challenge is to predict the movement patterns of those 20,000 individuals during days 60-74. Task 2 is a similar task but uses a smaller dataset of 25,000 individuals in total, 2,500 of whom have their locations during days 60-74 masked; these need to be predicted (task2_dataset.csv.gz).
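The task setup can be sketched as a split between records that remain visible and records to be predicted. The record layout (uid, day, time_slot, x, y), the function name and its parameters are assumptions made for this illustration and are not taken from the challenge materials.

```python
def split_observed_and_targets(records, target_uids,
                               first_day=60, last_day=74):
    """Separate the records to be predicted from those kept as input.

    `records` is an iterable of (uid, day, time_slot, x, y) tuples.
    Records of target individuals on days first_day..last_day become
    prediction targets; everything else stays observed.
    """
    observed, targets = [], []
    for record in records:
        uid, day = record[0], record[1]
        if uid in target_uids and first_day <= day <= last_day:
            targets.append(record)
        else:
            observed.append(record)
    return observed, targets
```

A model would be fitted on the observed records and scored against the held-out target records.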
While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (which is optional for use in the challenge) (cell_POIcat.csv.gz).
For more details, see https://connection.mit.edu/humob-challenge-2023
What We Eat in America (WWEIA) is the dietary intake interview component of the National Health and Nutrition Examination Survey (NHANES). WWEIA is conducted as a partnership between the U.S. Department of Agriculture (USDA) and the U.S. Department of Health and Human Services (DHHS). Two days of 24-hour dietary recall data are collected through an initial in-person interview, and a second interview conducted over the telephone within three to 10 days. Participants are given three-dimensional models (measuring cups and spoons, a ruler, and two household spoons) and/or USDA's Food Model Booklet (containing drawings of various sizes of glasses, mugs, bowls, mounds, circles, and other measures) to estimate food amounts. WWEIA data are collected using USDA's dietary data collection instrument, the Automated Multiple-Pass Method (AMPM). The AMPM is a fully computerized method for collecting 24-hour dietary recalls either in-person or by telephone. For each 2-year data release cycle, the following dietary intake data files are available: Individual Foods File - Contains one record per food for each survey participant. Foods are identified by USDA food codes. Each record contains information about when and where the food was consumed, whether the food was eaten in combination with other foods, amount eaten, and amounts of nutrients provided by the food. Total Nutrient Intakes File - Contains one record per day for each survey participant. Each record contains daily totals of food energy and nutrient intakes, daily intake of water, intake day of week, total number foods reported, and whether intake was usual, much more than usual or much less than usual. The Day 1 file also includes salt use in cooking and at the table; whether on a diet to lose weight or for other health-related reason and type of diet; and frequency of fish and shellfish consumption (examinees one year or older, Day 1 file only). 
DHHS is responsible for the sample design and data collection, and USDA is responsible for the survey’s dietary data collection methodology, maintenance of the databases used to code and process the data, and data review and processing. USDA also funds the collection and processing of Day 2 dietary intake data, which are used to develop variance estimates and calculate usual nutrient intakes. Resources in this dataset:Resource Title: What We Eat In America (WWEIA) main web page. File Name: Web Page, url: https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-human-nutrition-research-center/food-surveys-research-group/docs/wweianhanes-overview/ Contains data tables, research articles, documentation data sets and more information about the WWEIA program. (Link updated 05/13/2020)
For information about the City of Austin's Bluetooth travel sensor data, visit our documentation page: https://github.com/cityofaustin/hack-the-traffic/tree/master/docs Each row in this dataset represents one Bluetooth enabled device that detected at two locations in the City of Austin's Bluetooth sensor network. Each record contains a detected device’s anonymized Media Access Control (MAC) address along with contain information about origin and destination points at which the device was detected, as well the time, date, and distance traveled. How does the City of Austin use the Bluetooth travel sensor data? The data enables transportation engineers to better understand short and long-term trends in Austin’s traffic patterns, supporting decisions about systems planning and traffic signal timing. What information does the data contain? The sensor data is available in three datasets: Individual Address Records ( https://data.austintexas.gov/dataset/Bluetooth-Travel-Sensors-Individual-Addresses/qnpj-zrb9/data ) Each row in this dataset represents a Bluetooth device that was detected by one of our sensors. Each record contains a detected device’s anonymized Media Access Control (MAC) address along with the time and location the device was detected. These records alone are not traffic data but can be post-processed to measure the movement of detected devices through the roadway network Individual Traffic Matches ( https://data.austintexas.gov/dataset/Bluetooth-Travel-Sensors-Individual-Traffic-Matche/x44q-icha/data ) Each row in this dataset represents one Bluetooth enabled device that detected at two locations in the roadway network. Each record contains a detected device’s anonymized Media Access Control (MAC) address along with contain information about origin and destination points at which the device was detected, as well the time, date, and distance traveled. 
Traffic Summary Records ( https://data.austintexas.gov/dataset/Bluetooth-Travel-Sensors-Match-Summary-Records/v7zg-5jg9 ) The traffic summary records contain aggregate travel time and speed summaries based on the individual traffic match records. Each row in the dataset summarizes average travel time and speed along a sensor-equipped roadway segment in 15-minute intervals. Does this data contain personally identifiable information? No. The Media Access Control (MAC) addresses in these datasets are randomly generated.
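As a rough illustration of how individual traffic matches could be rolled up into 15-minute summaries, the sketch below groups match records by roadway segment and interval and averages travel time and speed. The record layout, segment identifier, units, and function names are assumptions made for this example, not the published schema or the City's actual processing.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical match records: (segment id, start time, travel time in seconds,
# distance in miles). Field layout and values are illustrative only.
matches = [
    ("seg_a", datetime(2020, 1, 6, 8, 2), 120, 1.0),
    ("seg_a", datetime(2020, 1, 6, 8, 11), 180, 1.0),
    ("seg_a", datetime(2020, 1, 6, 8, 16), 90, 1.0),
]


def floor_to_15_minutes(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def summarize(records):
    """Average travel time (s) and speed (mph) per segment and 15-minute bin."""
    bins = defaultdict(list)
    for segment, start, seconds, miles in records:
        bins[(segment, floor_to_15_minutes(start))].append((seconds, miles))

    summary = {}
    for key, values in bins.items():
        avg_seconds = mean(s for s, _ in values)
        # Speed of each match in miles per hour, then averaged over the bin.
        avg_speed = mean(m / (s / 3600) for s, m in values)
        summary[key] = (avg_seconds, avg_speed)
    return summary


for (segment, interval), (avg_s, avg_mph) in sorted(summarize(matches).items()):
    print(segment, interval.isoformat(), round(avg_s, 1), round(avg_mph, 1))
```

Averaging per-match speeds is only one possible choice; dividing total distance by total time within each interval would give a different figure.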
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Person County by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Person County. The dataset can be used to understand the population distribution of Person County by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Person County. It can also be used to see how the male-to-female ratio changes across age groups, from birth to the most senior age group.
Key observations
Largest age group (population): Male # 60-64 years (1,638) | Female # 65-69 years (1,626). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Person County Population by Gender.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains detailed financial and demographic data for 20,000 individuals, focusing on income, expenses, and potential savings across various categories. The data aims to provide insights into personal financial management and spending patterns.
Income: Monthly income in currency units.
Age: Age of the individual.
Dependents: Number of dependents supported by the individual.
Occupation: Type of employment or job role.
City_Tier: A categorical variable representing the living area tier (e.g., Tier 1, Tier 2).
Rent, Loan_Repayment, Insurance, Groceries, Transport, Eating_Out, Entertainment, Utilities, Healthcare, Education, and Miscellaneous record various monthly expenses.
Desired_Savings_Percentage and Desired_Savings: Targets for monthly savings.
Disposable_Income: Income remaining after all expenses are accounted for.
Groceries, Transport, Eating_Out, Entertainment, Utilities, Healthcare, Education, and Miscellaneous.
The South African Population Research Infrastructure Network (SAPRIN) is a national research infrastructure funded through the Department of Science and Technology and hosted by the South African Medical Research Council. One of SAPRIN’s initial goals has been to harmonise the legacy longitudinal data from the three current Health and Demographic Surveillance System (HDSS) Nodes. These long-standing nodes are the MRC/Wits University Agincourt HDSS in Bushbuckridge District, Mpumalanga, established in 1993, with a population of 116 000 people; the University of Limpopo DIMAMO HDSS in the Capricorn District of Limpopo, established in 1996, with a current population of 100 000; and the Africa Health Research Institute (AHRI) HDSS in uMkhanyakude District, KwaZulu-Natal, established in 2000, with a current population of 125 000.
SAPRIN data are processed for longitudinal analysis by organising the demographic data into residence episodes at a geographical location, and membership episodes within a household. Start events include enumeration, birth, in-migration and relocating into a household from within the study population; exit events include death (by cause), out-migration, and relocating to another location in the study population. Variables routinely updated at individual level include health care utilisation, marital status, labour status and education status; household asset status is also recorded. Anticipated outcomes of SAPRIN include: (i) regular releases of up-to-date, longitudinal data, representative of South Africa’s fast-changing poorer communities for research, interpretation and calibration of national datasets; (ii) national statistics triangulation, whereby longitudinal SAPRIN data are triangulated with National Census data for calibration of national statistics and studying the mechanisms driving the national statistics; (iii) an interdisciplinary research platform for conducting observational and interventional research at population level; (iv) policy engagement to provide evidence to underpin policy-making for cost evaluation and targeting intervention programmes, thereby improving the accuracy and efficiency of pro-poor, health and wellbeing interventions; (v) scientific education through training at related universities; and (vi) community engagement, whereby coordinated engagement with communities will enable two-way learning between researchers and community members, and enable research site communities and service providers to have access to and make effective use of research results.
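The episode structure described above can be pictured with a minimal sketch such as the following. The class names, field names, and event labels are hypothetical and chosen only for illustration; they are not the SAPRIN data model, and cause of death is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class StartEvent(Enum):
    ENUMERATION = "enumeration"
    BIRTH = "birth"
    IN_MIGRATION = "in-migration"
    RELOCATION_IN = "relocation from within the study population"


class ExitEvent(Enum):
    DEATH = "death"
    OUT_MIGRATION = "out-migration"
    RELOCATION_OUT = "relocation to another location in the study population"


@dataclass
class ResidenceEpisode:
    """One continuous period of residence at a single location (illustrative)."""
    individual_id: str
    location_id: str
    start: date
    start_event: StartEvent
    end: Optional[date] = None            # None: episode still open
    end_event: Optional[ExitEvent] = None

    def days_observed(self, censor_date: date) -> int:
        """Days of exposure, censoring open episodes at censor_date."""
        last = self.end if self.end is not None else censor_date
        return (last - self.start).days


closed = ResidenceEpisode("p1", "loc9", date(2000, 1, 1), StartEvent.BIRTH,
                          date(2000, 1, 31), ExitEvent.OUT_MIGRATION)
still_open = ResidenceEpisode("p2", "loc9", date(2001, 1, 1), StartEvent.ENUMERATION)
print(closed.days_observed(date(2002, 1, 1)))      # closed episode ignores the censor date
print(still_open.days_observed(date(2001, 1, 11)))
```

A membership episode within a household could be sketched the same way, with a household identifier in place of the location identifier.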
The Agincourt HDSS covers an area of approximately 420 km² and is located in Bushbuckridge District, Mpumalanga, in the rural north-east of South Africa close to the Mozambique border. DIMAMO is located in the Capricorn district, Limpopo Province, approximately 40 km from Polokwane, the capital city of Limpopo Province, and 15-50 km from the University of Limpopo (Turfloop Campus). The site covers an area of approximately 200 km². AHRI is situated in the south-east portion of the uMkhanyakude district of KwaZulu-Natal province near the town of Mtubatuba. It is bounded on the west by the Umfolozi-Hluhluwe nature reserve, on the south by the Umfolozi river, on the east by the N2 highway (except for portions where the KwaMsane township straddles the highway) and in the north by the Inyalazi river for portions of the boundary. The area is 438 km².
Exposure episodes
Households resident in dwellings within the study area will be eligible for inclusion in the household component of SAPRIN. All individuals identified by the household proxy informant as members of the household will be enumerated. A resident household member is an individual who intends to sleep the majority of the time at the dwelling occupied by the household over a four-month period. Households will include resident and non-resident members. An individual is a non-resident member if they have close ties to the household, but do not physically reside with the household most of the time. Non-resident members can also be called temporary migrants, and they are enumerated within the household list. Because household membership is not tied to physical residency, an individual may be a member of more than one household.
Event/transaction data
This dataset is not based on a sample but contains information from the complete demographic surveillance areas.