25 datasets found

High income tax filers in Canada, specific geographic area thresholds
www150.statcan.gc.ca
open.canada.ca
Updated Oct 28, 2024
Cite
Government of Canada, Statistics Canada (2024). High income tax filers in Canada, specific geographic area thresholds [Dataset]. http://doi.org/10.25318/1110005601-eng
Unique identifier
https://doi.org/10.25318/1110005601-eng
Dataset updated
Oct 28, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
This table presents income shares, thresholds, tax shares, and total counts of individual Canadian tax filers, with a focus on high-income individuals (95% income threshold, 99% threshold, etc.). Income thresholds are geography-specific; for example, the number of Nova Scotians in the top 1% is calculated as the number of tax-filing Nova Scotians whose total income exceeded the 99% income threshold of Nova Scotian tax filers. Different definitions of income are available in the table, namely market, total, and after-tax income, each with and without capital gains.
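The geography-specific threshold rule described above can be illustrated with a minimal sketch. The nearest-rank percentile definition, the function names, and the income figures are assumptions made for illustration; they are not taken from the Statistics Canada table or its methodology.

```python
import math

def percentile_threshold(values, pct):
    """Nearest-rank percentile (an assumed definition, for illustration only)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def top_group_counts(incomes_by_region, pct=99):
    """For each region, find that region's own threshold and count filers above it."""
    result = {}
    for region, incomes in incomes_by_region.items():
        threshold = percentile_threshold(incomes, pct)
        above = sum(1 for income in incomes if income > threshold)
        result[region] = (threshold, above)
    return result

# Synthetic incomes, not real data: 200 filers in one region, 100 in another.
sample = {
    "Region A": [1000 * i for i in range(1, 201)],
    "Region B": [500 * i for i in range(1, 101)],
}
counts = top_group_counts(sample)
```

Each region is compared only against its own threshold, so a filer counted in the top 1% of one region would not necessarily be counted in another.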
Median Household Income by Racial Categories in Norman, OK (2022)
neilsberg.com
csv, json
Updated Jan 3, 2024
Cite
Neilsberg Research (2024). Median Household Income by Racial Categories in Norman, OK (2022) [Dataset]. https://www.neilsberg.com/research/datasets/3622a9b8-8904-11ee-9302-3860777c1fe6/
Available download formats: csv, json
Dataset updated
Jan 3, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Norman, Oklahoma
Variables measured
Median Household Income for Asian Population, Median Household Income for Black Population, Median Household Income for White Population, Median Household Income for Some other race Population, Median Household Income for Two or more races Population, Median Household Income for American Indian and Alaska Native Population, Median Household Income for Native Hawaiian and Other Pacific Islander Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. To portray the median household income within each racial category identified by the US Census Bureau, we conducted an initial analysis and categorization of the data. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). It is important to note that the median household income estimates exclusively represent the identified racial categories and do not incorporate any ethnicity classifications. Households are categorized, and median incomes are reported, based on the self-identified race of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the median household income across different racial categories in Norman. It portrays the median household income of the head of household across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be used to gain insights into economic disparities and trends and to explore the variations in median household income for diverse racial categories.

Key observations

Based on our analysis of the distribution of Norman's population by race and ethnicity, the population is predominantly White: this category accounts for 74.83% of the total residents in Norman. White is both the largest group and the one with the highest median household income, which stands at $68,429.

Figure: Norman median household income diversity across racial categories (https://i.neilsberg.com/ch/norman-ok-median-household-income-by-race.jpeg)

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race of the head of household: This column presents the self-identified race of the household head, encompassing all relevant racial categories (excluding ethnicity) applicable in Norman.

Median household income: Median household income, adjusted for inflation and presented in 2022 inflation-adjusted dollars.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Norman median household income by race.
Meta Kaggle Code
kaggle.com
zip
Updated Mar 20, 2025
Cite
Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
Available download formats: zip (133186454988 bytes)
Dataset updated
Mar 20, 2025
Dataset authored and provided by
Kaggle (http://kaggle.com/)
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Explore our public notebook content!

Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

Why we’re releasing this dataset

By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

Sensitive data

While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

Joining with Meta Kaggle

The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.

File organization

The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files; for example, folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files; for example, 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
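The folder layout above can be expressed as simple integer arithmetic. The following is a minimal sketch; the function name is hypothetical, and whether folder names are zero-padded (for example `4` versus `004`) is not stated in the description, so the unpadded form here is an assumption.

```python
def version_folder(version_id: int) -> str:
    """Return the two-level folder for a KernelVersions id, e.g. '123/456'.

    Assumption: folder names are plain integers without zero padding.
    """
    if version_id < 0:
        raise ValueError("version_id must be non-negative")
    top = version_id // 1_000_000        # millions part: top-level folder
    sub = (version_id // 1_000) % 1_000  # thousands part: subfolder
    return f"{top}/{sub}"

folder = version_folder(123_456_789)
```

Every id from 123,456,000 to 123,456,999 maps to the same folder, matching the ranges given in the description.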

The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

Questions / Comments

We love feedback! Let us know in the Discussion tab.

Happy Kaggling!
Income of individuals by age group, sex and income source, Canada, provinces...
www150.statcan.gc.ca
open.canada.ca
Updated Apr 26, 2024
Cite
Government of Canada, Statistics Canada (2024). Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas [Dataset]. http://doi.org/10.25318/1110023901-eng
Unique identifier
https://doi.org/10.25318/1110023901-eng
Dataset updated
Apr 26, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas, annual.
U.S. median household income 2023, by education of householder
statista.com
Updated Sep 17, 2024
Cite
Statista (2024). U.S. median household income 2023, by education of householder [Dataset]. https://www.statista.com/statistics/233301/median-household-income-in-the-united-states-by-education/
Dataset updated
Sep 17, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
United States
Description
U.S. citizens with a professional degree had the highest median household income in 2023, at 172,100 U.S. dollars. In comparison, those with less than a 9th grade education made significantly less money, at 35,690 U.S. dollars.

Household income

The median household income in the United States has fluctuated since 1990, but rose to around 70,000 U.S. dollars in 2021. Maryland had the highest median household income in the United States in 2021. Maryland’s high level of wealth has several causes, including the state's proximity to the nation's capital.

Household income and ethnicity

The median income of white non-Hispanic households in the United States had been on the rise since 1990, but has been declining since 2019. Although it has also been on the rise, the median income of Hispanic households was much lower than that of white, non-Hispanic private households. However, the median income of Black households is even lower than that of Hispanic households. Income inequality is a problem without an easy solution in the United States, especially since ethnicity is a contributing factor. Systemic racism contributes to the non-White population suffering from income inequality, which causes the opportunity for growth to stagnate.
Income Bracket Analysis by Age Group Dataset: Age-Wise Distribution of...
neilsberg.com
csv, json
Updated Feb 25, 2025
Cite
Neilsberg Research (2025). Income Bracket Analysis by Age Group Dataset: Age-Wise Distribution of Milford Town, New York Household Incomes Across 16 Income Brackets // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/f35dae34-f353-11ef-8577-3860777c1fe6/
Available download formats: csv, json
Dataset updated
Feb 25, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
New York
Variables measured
Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with how many households in that bracket fall into each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the household distribution across 16 income brackets among four distinct age groups in Milford town: Under 25 years, 25-44 years, 45-64 years, and 65 years and over. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

Key observations

A closer examination of the distribution of households among age brackets reveals that there are 11 (1%) households where the householder is under 25 years old, 282 (25.75%) households with a householder aged between 25 and 44 years, 345 (31.51%) households with a householder aged between 45 and 64 years, and 457 (41.74%) households where the householder is 65 years old or over.

The age group of 25 to 44 years exhibits the highest median household income, while the largest number of households falls within the 65 years and over bracket. This distribution hints at economic disparities within the town of Milford, showcasing varying income levels among different age demographics.
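The percentage shares quoted in the key observations follow directly from the household counts. A minimal sketch, assuming shares are computed against the sum of the four age cohorts and rounded to two decimals (the function name is hypothetical):

```python
def household_shares(counts):
    """Return each cohort's share of all households, in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {cohort: round(100 * n / total, 2) for cohort, n in counts.items()}

# Household counts by age of householder, as quoted in the key observations.
milford = {
    "Under 25 years": 11,
    "25 to 44 years": 282,
    "45 to 64 years": 345,
    "65 years and over": 457,
}
shares = household_shares(milford)
```

With a total of 1,095 households, the computed shares agree with the published 1%, 25.75%, 31.51% and 41.74%.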

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Income brackets:

Less than $10,000

$10,000 to $14,999

$15,000 to $19,999

$20,000 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $59,999

$60,000 to $74,999

$75,000 to $99,999

$100,000 to $124,999

$125,000 to $149,999

$150,000 to $199,999

$200,000 or more

Variables / Data Columns

Household Income: This column showcases 16 income brackets ranging from Under $10,000 to $200,000+ (as listed above).

Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.

25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.

45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.

65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Milford town median household income by age.
Income Bracket Analysis by Age Group Dataset: Age-Wise Distribution of...
neilsberg.com
csv, json
Updated Feb 25, 2025
Cite
Neilsberg Research (2025). Income Bracket Analysis by Age Group Dataset: Age-Wise Distribution of Florence, AZ Household Incomes Across 16 Income Brackets // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/f34c1ceb-f353-11ef-8577-3860777c1fe6/
Available download formats: csv, json
Dataset updated
Feb 25, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Florence, Arizona
Variables measured
Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with how many households in that bracket fall into each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the household distribution across 16 income brackets among four distinct age groups in Florence: Under 25 years, 25-44 years, 45-64 years, and 65 years and over. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

Key observations

A closer examination of the distribution of households among age brackets reveals that there are 63 (1%) households where the householder is under 25 years old, 1,435 (22.84%) households with a householder aged between 25 and 44 years, 1,792 (28.52%) households with a householder aged between 45 and 64 years, and 2,994 (47.64%) households where the householder is 65 years old or over.

The age group of 25 to 44 years exhibits the highest median household income, while the largest number of households falls within the 65 years and over bracket. This distribution hints at economic disparities within the town of Florence, showcasing varying income levels among different age demographics.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Income brackets:

Less than $10,000

$10,000 to $14,999

$15,000 to $19,999

$20,000 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $59,999

$60,000 to $74,999

$75,000 to $99,999

$100,000 to $124,999

$125,000 to $149,999

$150,000 to $199,999

$200,000 or more

Variables / Data Columns

Household Income: This column showcases 16 income brackets ranging from Under $10,000 to $200,000+ (as listed above).

Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.

25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.

45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.

65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Florence median household income by age.
Dataset: A Systematic Literature Review on the topic of High-value datasets
data.niaid.nih.gov
zenodo.org
Updated Jul 11, 2024
Cite
Anastasija Nikiforova (2024). Dataset: A Systematic Literature Review on the topic of High-value datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7944424
Dataset updated
Jul 11, 2024
Dataset provided by
Magdalena Ciesielska
Nina Rizun
Charalampos Alexopoulos
Andrea Miletič
Anastasija Nikiforova
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains data collected during a study ("Towards High-Value Datasets determination for data-driven development: a systematic literature review") conducted by Anastasija Nikiforova (University of Tartu), Nina Rizun, Magdalena Ciesielska (Gdańsk University of Technology), Charalampos Alexopoulos (University of the Aegean) and Andrea Miletič (University of Zagreb). It is being made public both to act as supplementary data for the "Towards High-Value Datasets determination for data-driven development: a systematic literature review" paper (a pre-print is available in Open Access at https://arxiv.org/abs/2305.10234) and so that other researchers can use these data in their own work.

The protocol is intended for the systematic literature review (SLR) on the topic of high-value datasets, with the aim of gathering information on how the topic of high-value datasets (HVD) and their determination has been reflected in the literature over the years and what has been found by these studies to date, incl. the indicators used in them, involved stakeholders, data-related aspects, and frameworks. The data in this dataset were collected as a result of the SLR over Scopus, Web of Science, and the Digital Government Research library (DGRL) in 2023.

Methodology

To understand how HVD determination has been reflected in the literature over the years and what has been found by these studies to date, all relevant literature covering this topic has been studied. To this end, the SLR was carried out by searching digital libraries covered by Scopus, Web of Science (WoS), and the Digital Government Research library (DGRL).

These databases were queried for the keywords ("open data" OR "open government data") AND ("high-value data*" OR "high value data*"), which were applied to the article title, keywords, and abstract to limit the number of papers to those where these objects were primary research objects rather than merely mentioned in the body, e.g., as future work. After deduplication, 11 unique articles were found and were further checked for relevance. As a result, a total of 9 articles were further examined. Each study was independently examined by at least two authors.
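The search and deduplication step can be sketched as a small filter over bibliographic records. This is an illustration only: the record fields, the function names, the substring-based matching, and the title-based deduplication key are all assumptions, and the actual Scopus, WoS and DGRL query syntax differs from this.

```python
import re

OPEN_DATA_TERMS = ("open data", "open government data")
# "data*" in the query is a wildcard, so "data", "dataset", "datasets" all match.
HVD_PATTERN = re.compile(r"high[- ]value data\w*")

def matches_query(record):
    """Apply the boolean query to title, keywords and abstract only (not the body)."""
    fields = ("title", "keywords", "abstract")
    text = " ".join(record.get(field, "") for field in fields).lower()
    has_open_data = any(term in text for term in OPEN_DATA_TERMS)
    return has_open_data and bool(HVD_PATTERN.search(text))

def deduplicate(records):
    """Keep the first record per normalized title (an assumed deduplication key)."""
    seen, unique = set(), []
    for record in records:
        key = re.sub(r"\W+", " ", record["title"]).strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical records standing in for results from different databases.
records = [
    {"title": "High-value datasets in open government data", "source": "Scopus"},
    {"title": "HIGH-VALUE DATASETS IN OPEN GOVERNMENT DATA", "source": "WoS"},
    {"title": "Open data portals", "abstract": "A survey of portals.", "source": "Scopus"},
    {"title": "Smart cities", "body": "open data and high-value datasets", "source": "DGRL"},
]
hits = deduplicate([r for r in records if matches_query(r)])
```

The last record mentions both terms only in its body, so it is excluded, which mirrors the stated intent of restricting the query to title, keywords and abstract.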

To attain the objective of our study, we developed the protocol, where the information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information.

Test procedure

Each study was independently examined by at least two authors; after an in-depth examination of the full text of the article, the structured protocol was filled in for each study. The structure of the survey is available in the supplementary files (see Protocol_HVD_SLR.odt, Protocol_HVD_SLR.docx). The data collected for each study by two researchers were then synthesized into one final version by a third researcher.

Description of the data in this data set

Protocol_HVD_SLR provides the structure of the protocol.
Spreadsheet #1 provides the filled protocol for relevant studies.
Spreadsheet #2 provides the list of results after the search over three indexing databases, i.e. before filtering out irrelevant studies.

The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information

Descriptive information

1) Article number - a study number, corresponding to the study number assigned in an Excel worksheet
2) Complete reference - the complete source information to refer to the study
3) Year of publication - the year in which the study was published
4) Journal article / conference paper / book chapter - the type of the paper - {journal article, conference paper, book chapter}
5) DOI / Website - a link to the website where the study can be found
6) Number of citations - the number of citations of the article in Google Scholar, Scopus, Web of Science
7) Availability in OA - availability of an article in the Open Access
8) Keywords - keywords of the paper as indicated by the authors
9) Relevance for this study - what is the relevance level of the article for this study? {high / medium / low}

Approach- and research design-related information

10) Objective / RQ - the research objective / aim, established research questions
11) Research method (including unit of analysis) - the methods used to collect data, including the unit of analysis (country, organisation, specific unit that has been analysed, e.g., the number of use-cases, scope of the SLR etc.)
12) Contributions - the contributions of the study
13) Method - whether the study uses a qualitative, quantitative, or mixed methods approach?
14) Availability of the underlying research data - whether there is a reference to the publicly available underlying research data e.g., transcriptions of interviews, collected data, or explanation why these data are not shared?
15) Period under investigation - period (or moment) in which the study was conducted
16) Use of theory / theoretical concepts / approaches - does the study mention any theory / theoretical concepts / approaches? If any theory is mentioned, how is theory used in the study?

Quality- and relevance-related information

17) Quality concerns - whether there are any quality concerns (e.g., limited information about the research methods used)?
18) Primary research object - is the HVD a primary research object in the study? (primary - the paper is focused around the HVD determination, secondary - mentioned but not studied (e.g., as part of discussion, future work etc.))

HVD determination-related information

19) HVD definition and type of value - how is the HVD defined in the article and / or any other equivalent term?
20) HVD indicators - what are the indicators to identify HVD? How were they identified? (components & relationships, "input -> output")
21) A framework for HVD determination - is there a framework presented for HVD identification? What components does it consist of and what are the relationships between these components? (detailed description)
22) Stakeholders and their roles - what stakeholders or actors does HVD determination involve? What are their roles?
23) Data - what data do HVD cover?
24) Level (if relevant) - what is the level of the HVD determination covered in the article? (e.g., city, regional, national, international)

Format of the file

.xls, .csv (for the first spreadsheet only), .odt, .docx

Licenses or restrictions

CC-BY

For more info, see README.txt
Levels of obesity and inactivity related illnesses (physical illnesses):...
hub.arcgis.com
data.catchmentbasedapproach.org
Updated Apr 7, 2021
Cite
The Rivers Trust (2021). Levels of obesity and inactivity related illnesses (physical illnesses): Summary (England) [Dataset]. https://hub.arcgis.com/maps/theriverstrust::levels-of-obesity-and-inactivity-related-illnesses-physical-illnesses-summary-england
Dataset updated
Apr 7, 2021
Dataset authored and provided by
The Rivers Trust
Area covered

Description
SUMMARY

This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of physical illnesses that are linked with obesity and inactivity. Please read the below information to gain a full understanding of what the data shows and how it should be interpreted.

ANALYSIS METHODOLOGY

The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to:

- Asthma (in persons of all ages)
- Cancer (in persons of all ages)
- Chronic kidney disease (in adults aged 18+)
- Coronary heart disease (in persons of all ages)
- Diabetes mellitus (in persons aged 17+)
- Hypertension (in persons of all ages)
- Stroke and transient ischaemic attack (in persons of all ages)

This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.

For each of the above illnesses, the percentage of each MSOA’s population with that illness was estimated. This was achieved by calculating a weighted average based on:

- The percentage of the MSOA area that was covered by each GP practice’s catchment area
- Of the GPs that covered part of that MSOA: the percentage of patients registered with each GP that have that illness

The estimated percentage of each MSOA’s population with each illness was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with each illness, within the relevant age range.

For each illness, each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:

A) the PERCENTAGE of the population within that MSOA who are estimated to have that illness
B) the NUMBER of people within that MSOA who are estimated to have that illness

An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer to 1 the score, the greater both the number and percentage of the population in the MSOA predicted to have that illness, compared to other MSOAs. In other words, those are areas where a large number of people are predicted to suffer from an illness, and where those people make up a large percentage of the population, indicating there is a real issue with that illness within the population and the investment of resources to address that issue could have the greatest benefits.

The scores for each of the 7 illnesses were added together then converted to a relative score between 1 – 0 (1 = worst, 0 = best), to give an overall score for each MSOA: a score close to 1 would indicate that an area has high predicted levels of all obesity/inactivity-related illnesses, and these are areas where the local population could benefit the most from interventions to address those illnesses. A score close to 0 would indicate very low predicted levels of obesity/inactivity-related illnesses and therefore interventions might not be required.

LIMITATIONS

1. GPs do not have catchments that are mutually exclusive from each other: they overlap, with some geographic areas being covered by 30+ practices. This dataset should be viewed in combination with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset to identify where there are areas that are covered by multiple GP practices but at least one of those GP practices did not provide data. Results of the analysis in these areas should be interpreted with caution, particularly if the levels of obesity/inactivity-related illnesses appear to be significantly lower than the immediate surrounding areas.

2. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).

3. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness. By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.

4. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of obesity/inactivity-related illnesses, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of these illnesses.

TO BE VIEWED IN COMBINATION WITH:

This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:

- Health and wellbeing statistics (GP-level, England): Missing data and potential outliers

DOWNLOADING THIS DATA

To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.

DATA SOURCES

This dataset was produced using:

Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.

GP Catchment Outlines.
Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital. Data was cleaned by Ribble Rivers Trust before use.COPYRIGHT NOTICEThe reproduction of this data must be accompanied by the following statement:© Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.CaBA HEALTH & WELLBEING EVIDENCE BASEThis dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
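The weighted average and relative scoring described in the analysis methodology can be sketched roughly as follows. This is an illustration only, not Ribble Rivers Trust's code. It assumes that a "relative score" means min-max rescaling across MSOAs and that the catchment weights are normalised by their sum; the description states neither. All figures are made up.

```python
def weighted_prevalence(coverage_pcts, practice_prevalences):
    """Estimate MSOA prevalence (%) as an average of practice prevalences,
    weighted by the share of the MSOA each practice catchment covers.
    Normalising by the sum of weights is an assumption."""
    total = sum(coverage_pcts)
    return sum(c * p for c, p in zip(coverage_pcts, practice_prevalences)) / total


def rescale(values):
    """Min-max rescale to [0, 1]: 1 = highest (worst), 0 = lowest (best).
    Min-max is an assumed choice of rescaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def illness_scores(percentages, populations):
    """Combine score A (percentage) and score B (number of people)."""
    counts = [p / 100 * n for p, n in zip(percentages, populations)]
    score_a = rescale(percentages)
    score_b = rescale(counts)
    averaged = [(a + b) / 2 for a, b in zip(score_a, score_b)]
    return rescale(averaged)


# Hypothetical data for three MSOAs.
percentages = [4.0, 6.0, 10.0]    # estimated % of population with the illness
populations = [8000, 7000, 9000]  # mid-year population estimates
scores = illness_scores(percentages, populations)
print(scores)
```

The overall score would follow the same pattern: sum the seven per-illness scores for each MSOA and pass the sums through `rescale` once more.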
Aquifer framework datasets used to represent the Marshall aquifer, Michigan
catalog.data.gov
data.usgs.gov
Updated Sep 26, 2024
Cite
U.S. Geological Survey (2024). Aquifer framework datasets used to represent the Marshall aquifer, Michigan [Dataset]. https://catalog.data.gov/dataset/aquifer-framework-datasets-used-to-represent-the-marshall-aquifer-michigan
Explore at:
Dataset updated
Sep 26, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Michigan
Description
The Marshall aquifer underlies much of the Lower Peninsula of Michigan and has a maximum thickness of 493 feet (Lampe, 2009). The aquifer consists mainly of medium-grained sandstone and is overlain by Pennsylvanian-age rocks and glacial deposits and underlain by the Devonian-Mississippian-age confining unit. The Marshall aquifer is one of the most productive aquifers in the state where unconfined conditions occur (HA 730-J). This product provides source data for the Marshall aquifer framework, including:

Extent shapefiles:
1. p_32MRSHLL.shp: Polygon shapefile containing the areal extent of the Marshall aquifer (32MRSHLL_AqExtent). The extent file contains no aquifer subunits.

Point shapefiles:
1. po_32MRSHLL_top.shp: Point dataset containing altitude values, in feet referenced to the North American Vertical Datum of 1988 (NAVD88), across the top of the Marshall aquifer (Lampe, 2009). These data were used to create the ra_32MRSHLL_top.tif raster dataset.
2. po_32MRSHLL_bot.shp: Point dataset containing altitude values, in feet referenced to NAVD88, across the bottom of the Marshall aquifer (Lampe, 2009). These data were used to create the ra_32MRSHLL_bot.tif raster dataset.

Altitude raster files:
1. ra_32MRSHLL_top.tif: Altitude raster dataset of the top of the Marshall aquifer. The altitude values are in meters referenced to the NAVD88 vertical datum. This raster was interpolated from the po_32MRSHLL_top.shp point dataset.
2. ra_32MRSHLL_bot.tif: Altitude raster dataset of the bottom of the Marshall aquifer. The altitude values are in meters referenced to the NAVD88 vertical datum. This raster was interpolated from the po_32MRSHLL_bot.shp point dataset.

Depth raster files:
1. rd_32MRSHLL_top.tif: Depth raster dataset of the top of the Marshall aquifer. The depth values are in meters below land surface (NED, 100-meter).
2. rd_32MRSHLL_bot.tif: Depth raster dataset of the bottom of the Marshall aquifer. The depth values are in meters below land surface (NED, 100-meter).
NOAA Analysis of Record for Calibration (AORC) Dataset
registry.opendata.aws
Updated Mar 22, 2024
Cite
NOAA Analysis of Record for Calibration (AORC) Dataset [Dataset]. https://registry.opendata.aws/noaa-nws-aorc/
Explore at:
Dataset updated
Mar 22, 2024
Dataset provided by
National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
Description
The Analysis Of Record for Calibration (AORC) is a gridded record of near-surface weather conditions covering the continental United States and Alaska and their hydrologically contributing areas. It is defined on a latitude/longitude spatial grid with a mesh length of 30 arc seconds (~800 m), and a temporal resolution of one hour. Elements include hourly total precipitation, temperature, specific humidity, terrain-level pressure, downward longwave and shortwave radiation, and west-east and south-north wind components. It spans the period from 1979 across the Continental U.S. (CONUS) and from 1981 across Alaska, to the near-present (at all locations). This suite of eight variables is sufficient to drive most land-surface and hydrologic models and is used as input to the National Water Model (NWM) retrospective simulation. While the native AORC process generates netCDF output, the data is post-processed to create a cloud optimized Zarr formatted equivalent for dissemination using cloud technology and infrastructure.

AORC Version 1.1 dataset creation
The AORC dataset was created after reviewing, identifying, and processing multiple large-scale, observation, and analysis datasets. There are two versions of The Analysis Of Record for Calibration (AORC) data.

The initial AORC Version 1.0 dataset was completed in November 2019 and consisted of a grid with 8 elements at a resolution of 30 arc seconds. The AORC version 1.1 dataset was created to address issues (see Table 1 in Fall et al., 2023) in the version 1.0 CONUS dataset. Full documentation on version 1.1 of the AORC data and the related journal publication are provided below.

The native AORC version 1.1 process creates a dataset that consists of netCDF files with the following dimensions: 1 hour, 4201 latitude values (ranging from 25.0 to 53.0), and 8401 longitude values (ranging from -125.0 to -67.0).

The data creation runs with a 10-day lag to ensure the inclusion of any corrections to the input Stage IV and NLDAS data.

Note - The full extent of the AORC grid as defined in its data files exceeds the ranges cited above; the outermost rows and columns of the data grids are filled with missing values and are the remnant of an early set of required AORC extents that have since been adjusted inward.

AORC Version 1.1 Zarr Conversion

The goal of converting the AORC data from netCDF to Zarr was to allow users to load and use the data quickly and efficiently. For example, one year of data takes 28 minutes to load via NetCDF but only 3.2 seconds via Zarr (roughly 500 times faster). For longer periods of time, the relative speed-up from using Zarr (vs NetCDF) is even higher. Using Zarr also reduces memory and CPU utilization.

It was determined that the optimal conversion for the data was one year's worth of data per Zarr file, with a chunk size of 18 MB. The chunking was applied across all 8 variables. Each chunk has the following dimensions: 144 time, 128 latitude, and 256 longitude. To create the files in the Zarr format, the NetCDF files were rechunked using chunk() in Xarray. After rechunking, the files were converted to monthly Zarr files. The monthly Zarr files were then combined using to_zarr to create a Zarr file that represents a full year.
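As a quick check on the stated chunk size, the arithmetic can be sketched as below. It assumes each value is stored as a 4-byte number before compression; the description does not state the data type, so treat that as an assumption.

```python
# Chunk dimensions quoted in the description.
TIME, LAT, LON = 144, 128, 256

# Assumption: 4 bytes per value (e.g. 32-bit float or int), uncompressed.
BYTES_PER_VALUE = 4

values_per_chunk = TIME * LAT * LON
chunk_bytes = values_per_chunk * BYTES_PER_VALUE
chunk_mib = chunk_bytes / 2**20

# 144 hourly steps correspond to 6 days of data per chunk.
days_per_chunk = TIME / 24

print(f"{values_per_chunk:,} values, {chunk_bytes:,} bytes, {chunk_mib} MiB")
print(f"{days_per_chunk} days per chunk along the time axis")
```

Under that assumption the chunk comes to exactly 18 MiB, which matches the quoted "18 MB".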

Users wanting more than 1 year of data will be able to utilize Zarr utilities/libraries to combine multiple years up to the span of the full data set.

There are eight variables representing the meteorological conditions:

Total Precipitation (APCP_surface)

Hourly total precipitation (kg m-2 or mm)

Air Temperature (TMP_2maboveground)

Temperature (at 2 m above-ground-level (AGL)) (K)

Specific Humidity (SPFH_2maboveground)

Specific humidity (at 2 m AGL) (g g-1)

Downward Long-Wave Radiation Flux (DLWRF_surface)

Downward longwave (infrared) radiation flux (at the surface) (W m-2)

Downward Short-Wave Radiation Flux (DSWRF_surface)

Downward shortwave (solar) radiation flux (at the surface) (W m-2)

Pressure (PRES_surface)

Air pressure (at the surface) (Pa)

U-Component of Wind (UGRD_10maboveground)

U (west-east) component of the wind (at 10 m AGL) (m s-1)

V-Component of Wind (VGRD_10maboveground)

V (south-north) component of the wind (at 10 m AGL) (m s-1)
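Because wind is stored as separate west-east (U) and south-north (V) components, users often need to derive speed and direction themselves. A minimal sketch follows, using the usual meteorological convention that direction is the bearing the wind blows from, in degrees clockwise from north. The function names are illustrative and are not part of the AORC dataset.

```python
import math


def wind_speed(u, v):
    """Wind speed in m s-1 from the U (west-east) and V (south-north) components."""
    return math.hypot(u, v)


def wind_from_direction(u, v):
    """Bearing the wind blows FROM, in degrees clockwise from north."""
    return math.degrees(math.atan2(-u, -v)) % 360


# Illustrative values: a wind blowing toward the east, i.e. a westerly wind.
u, v = 5, 0
print(wind_speed(u, v), wind_from_direction(u, v))
```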

Precipitation and Temperature

The gridded AORC precipitation dataset contains one-hour Accumulated Surface Precipitation (APCP) ending at the “top” of each hour, in liquid water-equivalent units (kg m-2 to the nearest 0.1 kg m-2), while the gridded AORC temperature dataset consists of instantaneous, 2 m above-ground-level (AGL) temperatures at the top of each hour (in Kelvin, to the nearest 0.1).

Specific Humidity, Pressure, Downward Radiation, Wind

The development process for the six additional dataset components of the CONUS AORC [i.e., specific humidity at 2 m above ground (kg kg-1); downward longwave and shortwave radiation fluxes at the surface (W m-2); terrain-level pressure (Pa); and west-east and south-north wind components at 10 m above ground (m s-1)] has two distinct periods, based on datasets and methodology applied: 1979–2015 and 2016–present.
Dataset for: The Evolution of the Manosphere Across the Web
data.niaid.nih.gov
zenodo.org
Updated Aug 30, 2020
Cite
Manoel Horta Ribeiro (2020). Dataset for: The Evolution of the Manosphere Across the Web [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4007912
Explore at:
Dataset updated
Aug 30, 2020
Dataset provided by
Stephanie Greenberg
Barry Bradlyn
Jeremy Blackburn
Summer Long
Emiliano De Cristofaro
Gianluca Stringhini
Manoel Horta Ribeiro
Savvas Zannettou
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Evolution of the Manosphere Across the Web

We make available data related to subreddit and standalone forums from the manosphere.

We also make available Perspective API annotations for all posts.

You can find the code in GitHub.

Please cite this paper if you use this data:

@article{ribeiroevolution2021, title={The Evolution of the Manosphere Across the Web}, author={Ribeiro, Manoel Horta and Blackburn, Jeremy and Bradlyn, Barry and De Cristofaro, Emiliano and Stringhini, Gianluca and Long, Summer and Greenberg, Stephanie and Zannettou, Savvas}, booktitle = {{Proceedings of the 15th International AAAI Conference on Weblogs and Social Media (ICWSM'21)}}, year={2021} }

Reddit data

We make available data for forums and for relevant subreddits (56 of them, as described in subreddit_descriptions.csv). The Reddit data is available in /ndjson/reddit.ndjson, with one line per post. A sample entry is:

{ "author": "Handheld_Gaming", "date_post": 1546300852, "id_post": "abcusl", "number_post": 9.0, "subreddit": "Braincels", "text_post": "Its been 2019 for almost 1 hour And I am at a party with 120 people, half of them being foids. The last year had been the best in my life. I actually was happy living hope because I was redpilled to the death.

Now that I am blackpilled I see that I am the shortest of all men and that I am the only one with a recessed jaw.

Its over. Its only thanks to my age old friendship with chads and my social skills I had developed in the past year that a lot of men like me a lot as a friend.

No leg lengthening syrgery is gonna save me. Ignorance was a bliss. Its just horror now seeing that everyone can make out wirth some slin hoe at the party.

I actually feel so unbelivably bad for turbomanlets. Life as an unattractive manlet is a pain, I cant imagine the hell being an ugly turbomanlet is like. I would have roped instsntly if I were one. Its so unfair.

Tallcels are fakecels and they all can (and should) suck my cock.

If I were 17cm taller my life would be a heaven and I would be the happiest man alive.

Just cope and wait for affordable body tranpslants.", "thread": "t3_abcusl" }
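A file with one JSON object per line can be read incrementally with the standard library. The sketch below uses a small in-memory stand-in with made-up records rather than the real file. It also assumes that date_post in the Reddit data is a Unix timestamp in seconds (UTC), which the sample value suggests but the description does not state.

```python
import io
import json
from datetime import datetime, timezone


def iter_posts(lines):
    """Yield one dict per non-empty line of an ndjson stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def post_datetime(post):
    """Interpret date_post as Unix epoch seconds (an assumption)."""
    return datetime.fromtimestamp(post["date_post"], tz=timezone.utc)


# Made-up stand-in for the file; the field names follow the sample above.
sample = io.StringIO(
    '{"author": "user_a", "date_post": 1546300852, "subreddit": "example_a"}\n'
    '{"author": "user_b", "date_post": 1546300900, "subreddit": "example_a"}\n'
    '{"author": "user_c", "date_post": 1546301000, "subreddit": "example_b"}\n'
)

posts_per_subreddit = {}
first_seen = None
for post in iter_posts(sample):
    name = post["subreddit"]
    posts_per_subreddit[name] = posts_per_subreddit.get(name, 0) + 1
    if first_seen is None:
        first_seen = post_datetime(post)

print(posts_per_subreddit, first_seen.isoformat())
```

To process the real data, the `io.StringIO` object would be replaced by an open file handle, for example `open("ndjson/reddit.ndjson", encoding="utf-8")`, with the path adjusted to wherever the archive was extracted.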

Forums

Here we describe the .sqlite and .ndjson files that contain the data from the following forums:

- (avfm) --- https://d2ec906f9aea-003845.vbulletin.net
- (incels) --- https://incels.co/
- (love_shy) --- http://love-shy.com/lsbb/
- (redpilltalk) --- https://redpilltalk.com/
- (mgtow) --- https://www.mgtow.com/forums/
- (rooshv) --- https://www.rooshvforum.com/
- (pua_forum) --- https://www.pick-up-artist-forum.com/
- (the_attraction) --- http://www.theattractionforums.com/

The files are in the folders /sqlite/ and /ndjson/.

2.1 .sqlite

All the tables in the .sqlite datasets follow a very simple {key: value} format. Each key is a thread name (for example /threads/housewife-is-like-a-job.123835/) and each value is a Python dictionary or a list. Each file contains three tables:

idx: each key is the relative address to a thread and maps to a post. Each post is represented by a dict:

- "type": (list) in some forums you can add a descriptor such as [RageFuel] to each topic, and you may also have special types of posts, like sticky/pool/locked posts.
- "title": (str) title of the thread;
- "link": (str) link to the thread;
- "author_topic": (str) username that created the thread;
- "replies": (int) number of replies, may differ from number of posts due to difference in crawling date;
- "views": (int) number of views;
- "subforum": (str) name of the subforum;
- "collected": (bool) indicates if raw posts have been collected;
- "crawled_idx_at": (str) datetime of the collection.

processed_posts: each key is the relative address to a thread and maps to a list with posts (in order). Each post is represented by a dict:

- "author": (str) author's username;
- "resume_author": (str) author's little description;
- "joined_author": (str) date author joined;
- "messages_author": (int) number of messages the author has;
- "text_post": (str) text of the main post;
- "number_post": (int) number of the post in the thread;
- "id_post": (str) unique post identifier (depends), for sure unique within thread;
- "id_post_interaction": (list) list with other posts ids this post quoted;
- "date_post": (str) datetime of the post;
- "links": (tuple) nice tuple with the url parsed, e.g. ('https', 'www.youtube.com', '/S5t6K9iwcdw');
- "thread": (str) same as key;
- "crawled_at": (str) datetime of the collection.

raw_posts: each key is the relative address to a thread and maps to a list with unprocessed posts (in order). Each post is represented by a dict:

- "post_raw": (binary) raw html binary;
- "crawled_at": (str) datetime of the collection.

2.2 .ndjson

Each line consists of a json object representing a different comment with the following fields:

- "author": (str) author's username;
- "resume_author": (str) author's little description;
- "joined_author": (str) date author joined;
- "messages_author": (int) number of messages the author has;
- "text_post": (str) text of the main post;
- "number_post": (int) number of the post in the thread;
- "id_post": (str) unique post identifier (depends), for sure unique within thread;
- "id_post_interaction": (list) list with other posts ids this post quoted;
- "date_post": (str) datetime of the post;
- "links": (tuple) nice tuple with the url parsed, e.g. ('https', 'www.youtube.com', '/S5t6K9iwcdw');
- "thread": (str) same as key;
- "crawled_at": (str) datetime of the collection.

Perspective

We also ran each forum post and Reddit post through Perspective. The resulting files are located in the /perspective/ folder and are compressed with gzip. One example output:

{ "id_post": 5200, "hate_output": { "text": "I still can\u2019t wrap my mind around both of those articles about these c~~~s sleeping with poor Haitian Men. Where\u2019s the uproar?, where the hell is the outcry?, the \u201cpig\u201d comments or the \u201ccreeper comments\u201d. F~~~ing hell, if roles were reversed and it was an article about Men going to Europe where under 18 sex in legal, you better believe they would crucify the writer of that article and DEMAND an apology by the paper that wrote it.. This is exactly what I try and explain to people about the double standards within our modern society. A bunch of older women, wanna get their kicks off by sleeping with poor Men, just before they either hit or are at menopause age. F~~~ing unreal, I\u2019ll never forget going to Sweden and Norway a few years ago with one of my buddies and his girlfriend who was from there, the legal age of consent in Norway is 16 and in Sweden it\u2019s 15. I couldn\u2019t believe it, but my friend told me \u201c hey, it\u2019s normal here\u201d . Not only that but the age wasn\u2019t a big different in other European countries as well. One thing i learned very quickly was how very Misandric Sweden as well as Denmark were.", "TOXICITY": 0.6079781, "SEVERE_TOXICITY": 0.53744453, "INFLAMMATORY": 0.7279288, "PROFANITY": 0.58842486, "INSULT": 0.5511079, "OBSCENE": 0.9830818, "SPAM": 0.17009115 } }
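The gzip-compressed annotation files can be streamed without decompressing them to disk first. The sketch below assumes each compressed file holds one JSON object per line, shaped like the example output; the description does not state the layout inside the archive, so check a file before relying on this. The records here are made up.

```python
import gzip
import io
import json


def iter_annotations(source):
    """Yield annotation dicts from a gzip file (path or binary file object).
    Assumes one JSON object per line, which is not confirmed by the description."""
    with gzip.open(source, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def highly_toxic(annotations, threshold=0.8):
    """Return the id_post of records whose TOXICITY score meets the threshold."""
    return [
        record["id_post"]
        for record in annotations
        if record["hate_output"]["TOXICITY"] >= threshold
    ]


# Made-up, in-memory stand-in for one file in /perspective/.
raw = (
    b'{"id_post": 1, "hate_output": {"text": "a", "TOXICITY": 0.91}}\n'
    b'{"id_post": 2, "hate_output": {"text": "b", "TOXICITY": 0.12}}\n'
)
fake_file = io.BytesIO(gzip.compress(raw))

flagged = highly_toxic(iter_annotations(fake_file))
print(flagged)
```

The 0.8 threshold is arbitrary and only there to show how the scores might be filtered.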

Working with sqlite

A nice way to read some of the files of the dataset is using SqliteDict, for example:

from sqlitedict import SqliteDict

processed_posts = SqliteDict("./data/forums/incels.sqlite", tablename="processed_posts")

for key, posts in processed_posts.items():
    for post in posts:
        # here you could do something with each post in the dataset
        pass

Helpers

Additionally, we provide two .sqlite files that are helpers used in the analyses. These are related to reddit, and not to the forums! They are:

channel_dict.sqlite: a sqlite file where each key corresponds to a subreddit and values are lists of dictionaries of the users who posted on it, along with timestamps.

author_dict.sqlite: a sqlite file where each key corresponds to an author and values are lists of dictionaries of the subreddits they posted on, along with timestamps.

These are used in the paper for the migration analyses.

Examples and particularities for forums

Although we did our best to clean the data and be consistent across forums, this was not always possible. In the following subsections we discuss the particularities of each forum, point out directions to improve the parsing that were not pursued, and give some examples of how things work in each forum.

6.1 incels

Check out an archived version of the front page, the thread page and a post page, as well as a dump of the data stored for a thread page and a post page.

types: for the incel forums the special types associated with each thread in the idx table are “Sticky”, “Pool”, “Closed”, and the custom types added by users, such as [LifeFuel]. These last ones are all in brackets. You can see some examples of these on the example thread page.

quotes: quotes in this forum were quite nice and thus, all quotations are deterministic.

6.2 LoveShy

Check out an archived version of the front page, the thread page and a post page, as well as a dump of the data stored for a thread page and a post page.

types: no types were parsed. There are some rules in the forum, but not significant.

quotes: quotes were obtained from exact text+author match, or author match + a jaccard
Dataset covidgilance signals
zenodo.org
bin, csv +3
Updated Sep 25, 2020
Cite
Gaudinat Arnaud; Gaudinat Arnaud (2020). Dataset covidgilance signals [Dataset]. http://doi.org/10.5281/zenodo.4048460
Explore at:
csv, tsv, bin, text/x-python, txt (available download formats)
Unique identifier
https://doi.org/10.5281/zenodo.4048460
Dataset updated
Sep 25, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gaudinat Arnaud; Gaudinat Arnaud
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Research datasets on the top signals for COVID-19 (coronavirus), for study with Google Trends (GT) and SEO metrics

Website

The study is currently published on the https://covidgilance.org website (in French)

Datasets description

covid signals -> |selection| -> 4 datasets -> |serp.py| -> 4 serp datasets -> |aggregate_serp.pl| -> 4 aggregated datasets of serp -> |prepare datasets| -> 4 ranked top seo datasets

Original lists of signals (mainly covid symptoms) - dataset

Description: contains the original list of relevant signals for COVID-19 (that is, the list of queries for which a relevant signal can be seen in GT during the COVID-19 period)
Name: covid_signal_list.tsv

List of content:

- id: unique id for the topic
- topic-fr: name of the topic in French
- topic-en: name of the topic in English
- topic-id: GT topic id
- keyword fr: one or several keywords in French for GT
- keyword en: one or several keywords in English for GT
- fr-topic-url-12M: link to 12-months French query topic in GT in France
- en-topic-url-12M: link to 12-months English query topic in GT in US
- fr-url-12M: link to 12-months French queries in GT in France
- en-url-12M: link to 12-months English queries topic in GT in US
- fr-topic-url-5M: link to 5-months French query topic in GT in France
- en-topic-url-5M: link to 5-months English query topic in GT in US
- fr-url-5M: link to 5-months French queries in GT in France
- en-url-5M: link to 5-months English queries topic in GT in US

Tool to get SERP of covid signals - tool

Description: queries Google with a list of covid signals and obtains a list of SERPs in CSV (in fact TSV) file format
Name: serper.py

python serper.py

SERP files - datasets

Description: SERP results for 4 datasets of queries
Names:
simple version of covid signals from google.ch in French: serp_signals_20_ch_fr.csv
simple version of covid signals from google.com in English: serp_signals_20_en.csv
amplified version of covid signals from google.ch in French: serp_signals_covid_20_ch_fr.csv
amplified version of covid signals from google.com in English: serp_signals_covid_20_en.csv

"Amplified version" means that for each query we create two queries, one with the keyword "covid" and one with the keyword "coronavirus"

Tool to aggregate SERP results - tool

Description: loads the CSV SERP data and aggregates it to create a new CSV file where each line is a website and each column is a query.
Name: aggregate_serp.pl

perl aggregate_serp.pl > aggregated_signals_20_en.csv

datasets of top website from the SERP results - dataset

Description: an aggregated version of the SERP where each line is a website and each column a query
Names:
aggregated_signals_20_ch_fr.csv
aggregated_signals_20_en.csv
aggregated_signals_covid_20_ch_fr.csv
aggregated_signals_covid_20_en.csv

List of content:

- domain: domain name of the website
- signal 1: position of this website in the SERP for query 1 (signal 1); 30 is an arbitrary value indicating that the website is not present in the SERP
- signal ...: position of this website in the SERP for the query (signal); 30 is an arbitrary value indicating that the website is not present in the SERP
- signal n: position of this website in the SERP for query n (signal n); 30 is an arbitrary value indicating that the website is not present in the SERP
- total: average position (sum of all positions divided by the number of queries)
- missing: total number of queries for which this website is missing from the SERP

datasets ranked top seo - dataset

Description: a version of the aggregated SERP data ranked by weighted average position, where each line is a website and each column a query. The top 20 websites carry additional information about their type and HONcode validity (as of the collection date, September 2020)

Names:
ranked_signals_20_ch_fr.csv
ranked_signals_20_en.csv
ranked_signals_covid_20_ch_fr.csv
ranked_signals_covid_20_en.csv

List of content:

- domain: domain name of the website
- signal 1: position of this website in the SERP for query 1 (signal 1); 30 is an arbitrary value indicating that the website is not present in the SERP
- signal ...: position of this website in the SERP for the query (signal); 30 is an arbitrary value indicating that the website is not present in the SERP
- signal n: position of this website in the SERP for query n (signal n); 30 is an arbitrary value indicating that the website is not present in the SERP
- avg position: average position (sum of all positions divided by the number of queries)
- nb missing: total number of queries for which this website is missing from the SERP
- % presence: percentage of queries for which this website is present in the SERP
- weighted avg postion: combination of avg position and % of presence for final ranking
- honcode: status of the HONcode certificate for this website (none/valid/expired)
- type: type of the website (health, gov, edu or media)
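The per-website summary columns can be illustrated with a short sketch. It recomputes the average position, the number of missing results, and the percentage of presence from a row of positions, treating 30 as the "not present" placeholder described in the field lists. The weighted average position is left out because its formula is not given in the description. The sample rows and domains are made up.

```python
MISSING_POSITION = 30  # placeholder meaning "website not present in the SERP"


def summarize(positions):
    """Return (avg position, nb missing, % presence) for one website."""
    n_queries = len(positions)
    avg_position = sum(positions) / n_queries
    nb_missing = sum(1 for p in positions if p == MISSING_POSITION)
    pct_presence = 100 * (n_queries - nb_missing) / n_queries
    return avg_position, nb_missing, pct_presence


# Hypothetical rows: domain -> position for each of four queries.
rows = {
    "example-health.org": [1, 30, 5, 30],
    "example-news.com": [3, 2, 4, 7],
}

for domain, positions in rows.items():
    print(domain, summarize(positions))
```

Because absent results count as position 30, a website that is missing from many SERPs is pushed toward a worse average even when it ranks well where it does appear.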
Data from: ORION-AE: Multisensor acoustic emission datasets reflecting...
search.dataone.org
dataverse.harvard.edu
Updated Nov 19, 2023
Cite
Verdin, Benoit; Chevallier, Gaël; Ramasso, Emmanuel (2023). ORION-AE: Multisensor acoustic emission datasets reflecting supervised untightening of bolts in a jointed vibrating structure [Dataset]. http://doi.org/10.7910/DVN/FBRDU0
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/FBRDU0
Dataset updated
Nov 19, 2023
Dataset provided by
Harvard Dataverse
Authors
Verdin, Benoit; Chevallier, Gaël; Ramasso, Emmanuel
Description
Experiments were designed to reproduce the loosening phenomenon observed in aeronautics, automotive or civil engineering structures where parts are assembled together by means of bolted joints. The bolts can indeed be subject to self-loosening under vibrations. Therefore, it is of paramount importance to develop sensing strategies and algorithms for early loosening estimation. The test rig was specifically designed to make the vibration tests as repeatable as possible. The dataset ORION-AE is made of a set of time-series measurements obtained by untightening a bolt through seven different levels. The data have been sampled at 5 MHz on four different sensors, including three permanently attached acoustic emission sensors in contact with the structure, and one laser (contactless) measurement apparatus. This dataset can thus be used for performance benchmarking of supervised, semi-supervised or unsupervised learning algorithms, including deep and transfer learning for time-series data, with possibly seven classes. This dataset may also be useful to challenge denoising methods or wave-picking algorithms, for which the vibrometer measurements can be used for validation. ORION is a jointed structure made of two plates manufactured in a 2024 aluminium alloy, linked together by three bolts. The contact between the plates is done through machined overlays. Each contact patch has an area of 12x12 mm^2 and is 1 mm thick. The structure was subjected to a 100 Hz harmonic excitation force for about 10 seconds. The load was applied using a Tyra electromagnetic shaker, which can deliver a 200 N force. The force was measured using a PCB piezoelectric load cell and the vibration level was determined next to the end of the specimen using a Polytec laser vibrometer. The ORION-AE dataset is composed of five directories collected in five campaigns denoted as B, C, D, E and F in the sequel. Seven tightening levels were applied on the upper bolt. 
The tightening was first set to 60 cNm with a torque screwdriver. After a 10-second vibration test, the shaker was stopped and the vibration test was repeated after the torque was changed to 50 cNm. The torque was then changed in turn to 40, 30, 20, 10 and 5 cNm. Note that, for campaign C, the level 40 cNm is missing. During each cycle of the vibration test for a given tightening level, different AE sources can generate signals, and those sources may be activated or not depending on the tribological conditions within the contact between the beams, which are not controlled. The tightening levels can be used as a reference against which clustering or classification results can be compared. In that case, the main assumption is that the torque remained close to the level which was set at the beginning of every period of 10 s. This assumption cannot be checked in the current configuration of the tests. For each campaign, four sensors were used: a laser vibrometer and three different AE sensors (micro-200-HF, micro-80 and the F50A from Euro-Physical Acoustics) with various frequency bands, attached onto the lower plate (5 cm above the end of the plate). All data were sampled at 5 MHz using a Picoscope 4824 and a preamplifier (from Euro-Physical Acoustics) set to 60 dB. The velocimeter is used for different purposes, in particular to control the amplitude of the displacement of the top of the upper beam so that it remains constant whatever the tightening level. The sensors are expected to detect the stick-slip transitions or shocks in the interface that are known to generate small AE events during vibrations. The acoustic waves generated by these events are highly dependent on bolt tightening. These sources of AE signals have to be detected and identified from the data stream, which constitutes the challenge. 
Details of the folders and files

There is 1 folder per campaign, each composed of 7 subfolders corresponding to the 7 tightening levels: 5 cNm, 10 cNm, 20 cNm, 30 cNm, 40 cNm, 50 cNm, 60 cNm. So 7 levels are available per campaign, except for campaign C, for which 40 cNm is missing. There are about 10 seconds of continuous recording per level (the exact value can be derived from the number of files in each subfolder). The sampling frequency was set to 5 MHz on all channels of a Picoscope 4824, with a 60 dB preamplifier (model 2/4/6 preamplifier made by Euro-Physical Acoustics). The characteristics of both the Picoscope and the preamplifier are provided in the enclosed documentation.

Each subfolder is made of .mat files. There is about 1 file per second (this can vary a little depending on the buffering). The files in a subfolder are named according to their timestamps (time of recording). Each file is composed of vectors of data named:

A = micro80 sensor.
B = F50A sensor.
C = micro200HF sensor.
D = velocimeter.

Note ... Visit https://dataone.org/datasets/sha256%3A1448d7e6ddf29be42ecf7a171aae8a54a9d9ee5fd29055dfbe282f0cd5519f1e for complete metadata about this dataset.
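The file layout described above can be sketched in Python as follows. This is a minimal illustration, not code shipped with the dataset: the helper names (`duration_seconds`, `summarize`, `load_level`) are hypothetical, and reading the files with `scipy.io.loadmat` assumes they are stored in a MATLAB format that SciPy supports (v7.2 or older). Only the channel names A to D and the 5 MHz rate come from the description.

```python
from pathlib import Path

FS_HZ = 5_000_000  # sampling frequency stated in the description (5 MHz)

# Vector names inside each .mat file, as listed in the description.
CHANNELS = {
    "A": "micro80 sensor",
    "B": "F50A sensor",
    "C": "micro200HF sensor",
    "D": "velocimeter",
}


def duration_seconds(n_samples, fs_hz=FS_HZ):
    """Recording length implied by a sample count at the given rate."""
    return n_samples / fs_hz


def summarize(vectors):
    """Map each known channel name to (sensor label, duration in seconds)."""
    return {
        name: (CHANNELS[name], duration_seconds(len(data)))
        for name, data in vectors.items()
        if name in CHANNELS
    }


def load_level(subfolder):
    """Yield (file name, summary) for every .mat file of one tightening level.

    Assumption: the files are readable by scipy.io.loadmat. The import is
    local so that the helpers above work without SciPy installed.
    """
    from scipy.io import loadmat

    for path in sorted(Path(subfolder).glob("*.mat")):
        mat = loadmat(str(path))
        vectors = {k: v.ravel() for k, v in mat.items() if k in CHANNELS}
        yield path.name, summarize(vectors)
```

Summing the per-file durations returned by `load_level` for one subfolder would give the exact recording length for that tightening level, which the description says is about 10 seconds.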
Housing Receiving Incentives Open Data
opendata.cityofboise.org
housing-data-portal-boise.hub.arcgis.com
Updated Jul 5, 2023
Cite
City of Boise, Idaho (2023). Housing Receiving Incentives Open Data [Dataset]. https://opendata.cityofboise.org/documents/1423afcc749646649c82d7cdc718e4f5
Explore at:
Dataset updated
Jul 5, 2023
Dataset authored and provided by
City of Boise, Idaho
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Thumbnail image by Tony Moody.

This dataset includes all housing developments approved by the City of Boise’s (“city”) Planning Division since 2020 that are known by the city to have received or are expected to receive support or incentives from a government entity. Each row represents one development. Data may be unavailable for some projects and details are subject to change until construction is complete. Addresses are excluded for projects with fewer than five homes for privacy reasons.

The dataset includes details on the number of “homes” in a development. We use the word "home" to refer to any single unit of housing regardless of size, type, or whether it is rented or owned. For example, a building with 40 apartments counts as 40 homes, and a single detached house counts as one home.

The dataset includes details about the phase of each project. The process for building a new development is as follows: First, one must receive approval from the city’s Planning Division, which is also known as being “entitled.” Next, one must apply for and receive a permit from the city’s Building Division before beginning construction. Finally, once construction is complete and all city inspections have been passed, the building can be occupied.

The dataset also includes data on the affordability level of each development. To receive a government incentive, a developer is typically required to rent or sell a specified number of homes to households that have an income below limits set by the government, and their housing cost must not exceed 30% of their income. The federal government determines income limits based on a standard called “area median income.” The city considers housing affordable if it is targeted to households earning at or below 80% of the area median income. For a three-person household in Boise, that equates to an annual income of $60,650 and monthly rent or mortgage of $1,516. See Boise Income Guidelines for more details.

Project Address(es) – Includes all addresses that are included as part of the development project.
Address – The primary address for the development.
Parcel Number(s) – The identification code for all parcels of land included in the development.
Acreage – The number of acres for the parcel(s) included in the project.
Planning Permit Number – The identification code for all permits the development has received from the Planning Division for the City of Boise. The number and types of permits required vary based on the location and type of development.
Date Entitled – The date a development was approved by the city’s Planning Division.
Building Permit Number – The identification code for all permits the development has received from the city’s Building Division.
Date Building Permit Issued – Building permits are required to begin construction on a development.
Date Final Certificate of Occupancy Issued – A certificate of occupancy is the final approval by the city for a development, once construction is complete. Not all developments require a certificate of occupancy.
Studio – The number of homes in the development that are classified as a studio. A studio is typically defined as a home in which there is no separate bedroom. A single room serves as both a bedroom and a living room.
1-Bedroom – The number of homes in a development that have exactly one bedroom.
2-Bedroom – The number of homes in a development that have exactly two bedrooms.
3-Bedroom – The number of homes in a development that have exactly three bedrooms.
4+ Bedroom – The number of homes in a development that have four or more bedrooms.
# of Total Project Units – The total number of homes in the development.
# of units toward goals – The number of homes in a development that contribute to either the city’s goal to produce housing affordable at or under 60% of area median income, or the city’s goal to create permanent supportive housing for households experiencing homelessness.
Rent at or under 60% AMI – The number of homes in a development that are required to be rented at or below 60% of area median income. See the description of the dataset above for an explanation of area median income or see Boise Income Guidelines for more details. Boise defines a home as “affordable” if it is rented or sold at or below 80% of area median income.
Rent 61-80% AMI – The number of homes in a development that are required to be rented at between 61% and 80% of area median income. See the description of the dataset above for an explanation of area median income or see Boise Income Guidelines for more details. Boise defines a home as “affordable” if it is rented or sold at or below 80% of area median income.
Rent 81-120% AMI – The number of homes in a development that are required to be rented at between 81% and 120% of area median income. See the description of the dataset above for an explanation of area median income or see Boise Income Guidelines for more details.
Own at or under 60% AMI – The number of homes in a development that are required to be sold at or below 60% of area median income. See the description of the dataset above for an explanation of area median income or see Boise Income Guidelines for more details. Boise defines a home as “affordable” if it is rented or sold at or below 80% of area median income.
Own 61-80% AMI – The number of homes in a development that are required to be sold at between 61% and 80% of area median income. See the description of the dataset above for an explanation of area median income or see Boise Income Guidelines for more details. Boise defines a home as “affordable” if it is rented or sold at or below 80% of area median income.
Own 81-120% AMI – The number of homes in a development that are required to be sold at between 81% and 120% of area median income. See the description of the dataset above for an explanation of area median income or see Boise Income Guidelines for more details.
Housing Land Trust – “Yes” if a development receives or is expected to receive this incentive. The Housing Land Trust is a model in which the city owns land that it leases to a developer to build affordable housing.
City Investment – “Yes” if the city invests funding or contributes land to an affordable development.
Zoning Incentive – The city's zoning code provides incentives for developers to create affordable housing. Incentives may include the ability to build an extra floor or be subject to reduced parking requirements. “Yes” if a development receives or is expected to receive one of these incentives.
Project Management – The city provides a developer and their design team a single point of contact who works across city departments to simplify the permitting process, and assists the applicants in understanding the city’s requirements to avoid possible delays. “Yes” if a development receives or is expected to receive this incentive.
Low-Income Housing Tax Credit (LIHTC) – A federal tax credit available to some new affordable housing developments. The Idaho Housing and Finance Association is a quasi-governmental agency that administers these federal tax credits. “Yes” if a development receives or is expected to receive this incentive.
CCDC Investment – The Capital City Development Corp (CCDC) is a public agency that financially supports some affordable housing development in Urban Renewal Districts. “Yes” if a development receives or is expected to receive this incentive. If “Yes” the field identifies the Urban Renewal District associated with the development.
City Goal – The city has set goals to produce housing affordable to households at or below 60% of area median income, and to create permanent supportive housing for households experiencing homelessness. This field identifies whether a development contributes to one of those goals.
Project Phase – The process for building a new development is as follows: First, one must receive approval from the city’s Planning Division, which is also known as being “entitled.” Next, one must apply for and receive a permit from the city’s Building Division before beginning construction. Finally, once construction is complete and all city inspections have been passed, the building can be occupied.
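The affordability arithmetic in the description (housing cost capped at 30% of income, and the 60% / 80% / 120% area-median-income bands used by the rent and ownership fields) can be illustrated with a short Python sketch. The function names are hypothetical and the band labels simply mirror the field names; this is not code from the City of Boise.

```python
def max_monthly_housing_cost(annual_income, share=0.30):
    """Monthly housing cost at the stated cap of 30% of income."""
    return annual_income * share / 12


def ami_band(percent_of_ami):
    """Place a percentage of area median income into the dataset's bands."""
    if percent_of_ami <= 60:
        return "at or under 60% AMI"
    if percent_of_ami <= 80:
        return "61-80% AMI"
    if percent_of_ami <= 120:
        return "81-120% AMI"
    return "above 120% AMI"


def is_affordable(percent_of_ami):
    """The description treats housing at or below 80% AMI as affordable."""
    return percent_of_ami <= 80


# Three-person household example from the description: $60,650 per year.
monthly = max_monthly_housing_cost(60_650)
print(f"${monthly:,.2f} per month")
```

For the $60,650 example, 30% of income spread over 12 months comes to $1,516.25, which matches the $1,516 monthly figure quoted in the description once cents are dropped.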
Hard Hat Workers Object Detection Dataset - resize-416x416-reflectEdges
public.roboflow.com
zip
Updated Sep 30, 2022
Cite
Northeastern University - China (2022). Hard Hat Workers Object Detection Dataset - resize-416x416-reflectEdges [Dataset]. https://public.roboflow.com/object-detection/hard-hat-workers/1
Explore at:
Available download formats: zip
Dataset updated
Sep 30, 2022
Dataset authored and provided by
Northeastern University - China
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Variables measured
Bounding Boxes of Workers
Description
Overview

The Hard Hat dataset is an object detection dataset of workers in workplace settings that require a hard hat. Annotations also include examples of just "person" and "head," for when an individual may be present without a hard hat.

The original dataset has a 75/25 train-test split.

Example Image: https://i.imgur.com/7spoIJT.png

Use Cases

One could use this dataset to, for example, build a classifier that distinguishes workers who are abiding by the safety code within a workplace from those who may not be. It is also a good general dataset for practice.

Using this Dataset

Use the fork or Download this Dataset button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better. This particular dataset would be very well suited for Roboflow's new advanced Bounding Box Only Augmentations.

Dataset Versions:

Image Preprocessing | Image Augmentation | Modify Classes

* v1 (resize-416x416-reflect): generated with the original 75/25 train-test split | No augmentations
* v2 (raw_75-25_trainTestSplit): generated with the original 75/25 train-test split | These are the raw, original images
* v3 (v3): generated with the original 75/25 train-test split | Modify Classes used to drop person class | Preprocessing and Augmentation applied
* v5 (raw_HeadHelmetClasses): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop person class
* v8 (raw_HelmetClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop head and person classes
* v9 (raw_PersonClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop head and helmet classes
* v10 (raw_AllClasses): generated with a 70/20/10 train/valid/test split | These are the raw, original images
* v11 (augmented3x-AllClasses-FastModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied | 3x image generation | Trained with Roboflow's Fast Model
* v12 (augmented3x-HeadHelmetClasses-FastModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied, Modify Classes used to drop person class | 3x image generation | Trained with Roboflow's Fast Model
* v13 (augmented3x-HeadHelmetClasses-AccurateModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied, Modify Classes used to drop person class | 3x image generation | Trained with Roboflow's Accurate Model
* v14 (raw_HeadClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop person class, and remap/relabel helmet class to head
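The 75/25 and 70/20/10 splits named in the version list can be reproduced on any list of image identifiers with a few lines of Python. This is a generic sketch of a seeded random split; it is not Roboflow's implementation, and the function name `split_dataset` and its parameters are assumptions made for illustration.

```python
import random


def split_dataset(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and cut them into train / valid / test lists.

    fractions must sum to 1; the test split receives whatever remains
    after the train and valid sizes have been rounded.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# 70/20/10 train/valid/test split over 100 hypothetical image ids.
train, valid, test = split_dataset(range(100))

# 75/25 train-test split: give the valid split a zero fraction.
train75, _, test25 = split_dataset(range(100), fractions=(0.75, 0.0, 0.25))
```

Fixing the seed keeps the split stable between runs, which matters when several dataset versions are meant to share the same held-out images.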

Choosing Between Computer Vision Model Sizes | Roboflow Train

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.
Age-wise distribution of East Jordan, MI household incomes: Comparative...
neilsberg.com
csv, json
Updated Jan 9, 2024
Cite
Neilsberg Research (2024). Age-wise distribution of East Jordan, MI household incomes: Comparative analysis across 16 income brackets [Dataset]. https://www.neilsberg.com/research/datasets/859901fb-8dec-11ee-9302-3860777c1fe6/
Explore at:
Available download formats: csv, json
Dataset updated
Jan 9, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
East Jordan, Michigan
Variables measured
Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with how many of those households fall into each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimates, please contact us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the household distribution across 16 income brackets among four distinct age groups in East Jordan: Under 25 years, 25-44 years, 45-64 years, and over 65 years. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

Key observations

A closer examination of the distribution of households among age brackets shows 10 (1%) households where the householder is under 25 years old, 345 (34.60%) households with a householder aged between 25 and 44 years, 364 (36.51%) households with a householder aged between 45 and 64 years, and 278 (27.88%) households where the householder is over 65 years old.
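The percentages quoted in these key observations follow directly from the household counts. As a small check, the Python sketch below recomputes each age cohort's share of all households from the four counts given above; the function name `shares` and the dictionary keys are illustrative and are not column names confirmed for the downloadable files.

```python
def shares(counts):
    """Return each group's percentage of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 2) for group, n in counts.items()}


# Household counts by age of householder, taken from the key observations.
east_jordan = {
    "Under 25 years": 10,
    "25 to 44 years": 345,
    "45 to 64 years": 364,
    "65 years and over": 278,
}

for group, pct in shares(east_jordan).items():
    print(f"{group}: {pct:.2f}%")
```

The same function applies unchanged to the other places in this listing, since their key observations report the same four age cohorts.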

The age group of 25 to 44 years exhibits the highest median household income, while the largest number of households falls within the 45 to 64 years bracket. This distribution hints at economic disparities within the city of East Jordan, showcasing varying income levels among different age demographics.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income brackets:

Less than $10,000

$10,000 to $14,999

$15,000 to $19,999

$20,000 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $59,999

$60,000 to $74,999

$75,000 to $99,999

$100,000 to $124,999

$125,000 to $149,999

$150,000 to $199,999

$200,000 or more

Variables / Data Columns

Household Income: This column lists the 16 income brackets, ranging from under $10,000 to $200,000 or more (as listed above).

Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.

25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.

45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.

65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for East Jordan median household income by age.
Data from: Global Soil Types, 0.5-Degree Grid (Modified Zobler)
data.globalchange.gov
search.dataone.org
Updated Feb 1, 2001
Cite
(2001). Global Soil Types, 0.5-Degree Grid (Modified Zobler) [Dataset]. https://data.globalchange.gov/dataset/nasa-ornldaac-540
Explore at:
Dataset updated
Feb 1, 2001
Description
ABSTRACT: A global data set of soil types is available at 0.5-degree latitude by 0.5-degree longitude resolution. There are 106 soil units, based on Zobler's (1986) assessment of the FAO/UNESCO Soil Map of the World. This data set is a conversion of the Zobler 1-degree resolution version to a 0.5-degree resolution. The resolution of the data set was not actually increased. Rather, the 1-degree squares were divided into four 0.5-degree squares with the necessary adjustment of continental boundaries and islands. The computer code used to convert the original 1-degree data to 0.5-degree is provided as a companion file. A JPG image of the data is provided in this document.

The Zobler data (1-degree resolution) as distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] contains two columns, one column for continent and one column for soil type. The Soil Map of the World consists of 9 maps that represent parts of the world. The texture data that Webb et al. (1993) provided allowed for the fact that a soil type in one part of the world may have different properties than the same soil in a different part of the world. This continent-specific information is retained in this 0.5-degree resolution data set, as well as the soil type information, which is the second column.

A code was written (one2half.c) to take the file CONTIZOB.LER distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] and simply divide the 1-degree cells into quarters. This code also reads in a land/water file (land.wave) that specifies the cells that are land at 0.5 degrees. The code checks for consistency between the newly quartered map and the land/water map to which the quartered map is to be registered. If there is a discrepancy between the two, an attempt is made to make the two consistent using the following logic. If the cell is supposed to be water, it is forced to be water. If it is supposed to be land but was resolved to water at 1 degree, the code looks at the surrounding 8 cells, picks the most frequent soil type and assigns it to the cell. If there are no surrounding land cells, then it is kept as water in the hope that on the next pass one or more of the surrounding cells might be converted from water to a soil type. The whole map is iterated 5 times. The remaining cells that should be land but couldn't be determined from surrounding cells (mostly islands that are resolved at 0.5 degree but not at 1 degree) are printed out with coordinate information. A temporary map is output with -9 indicating where data is required. This is repeated for the continent code in CONTIZOB.LER as well. A separate map of the temporary continent codes is produced with -9 indicating required data. A nearly identical code (one2half.c) does the same for the continent codes.

The printout allows one to consult the printed versions of the soil map and look up the soil type with the largest coverage in the 0.5-degree cell. The program manfix.c then goes through the temporary map and prompts for input to correct both the soil codes and the continent codes for the map. This can be done manually or by preparing a file of changes (new_fix.dat) and redirecting stdin. A new complete version of the map is output. This is in the form of the original CONTIZOB.LER file (contizob.half) but four times larger. Original documentation and computer codes prepared by Post et al. (1996) are provided as companion files with this data set.

Image of 106 global soil types available at 0.5-degree by 0.5-degree resolution. Additional documentation from Zobler's assessment of FAO soil units is available from the NASA Center for Scientific Information.
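The quartering and land/water reconciliation logic described in the abstract can be sketched in Python. This is an illustrative reimplementation under stated assumptions, not the original one2half.c: the water code `WATER = 0`, the function names, the snapshot-per-pass update order, the tie-breaking between equally frequent soil types, and the absence of longitude wrap-around are all choices made here for the example.

```python
from collections import Counter

WATER = 0  # assumed code for water cells; the real files may use another value


def quarter(grid):
    """Split every 1-degree cell into four 0.5-degree cells of the same value."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def register_to_land_mask(soil, is_land, passes=5):
    """Make a quartered soil map consistent with a 0.5-degree land mask.

    Returns the adjusted map and the (row, col) cells that should be land
    but still have no soil type after the given number of passes.
    """
    rows, cols = len(soil), len(soil[0])
    # Cells the mask marks as water are forced to water.
    soil = [[soil[r][c] if is_land[r][c] else WATER for c in range(cols)]
            for r in range(rows)]
    for _ in range(passes):
        updated = [row[:] for row in soil]
        for r in range(rows):
            for c in range(cols):
                if not is_land[r][c] or soil[r][c] != WATER:
                    continue
                # Soil types of the (up to) 8 surrounding cells.
                neighbours = [
                    soil[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, rows))
                    for cc in range(max(c - 1, 0), min(c + 2, cols))
                    if (rr, cc) != (r, c) and soil[rr][cc] != WATER
                ]
                if neighbours:
                    updated[r][c] = Counter(neighbours).most_common(1)[0][0]
        soil = updated
    unresolved = [(r, c) for r in range(rows) for c in range(cols)
                  if is_land[r][c] and soil[r][c] == WATER]
    return soil, unresolved
```

The `unresolved` list plays the role of the printout mentioned in the abstract: these are the cells, mostly small islands, whose soil type would have to be looked up on the printed maps and entered by hand.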
Age-wise distribution of Jonesboro, IN household incomes: Comparative...
neilsberg.com
csv, json
Updated Jan 9, 2024
Cite
Neilsberg Research (2024). Age-wise distribution of Jonesboro, IN household incomes: Comparative analysis across 16 income brackets [Dataset]. https://www.neilsberg.com/research/datasets/85d48ce8-8dec-11ee-9302-3860777c1fe6/
Explore at:
Available download formats: csv, json
Dataset updated
Jan 9, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Jonesboro
Variables measured
Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with how many of those households fall into each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimates, please contact us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the household distribution across 16 income brackets among four distinct age groups in Jonesboro: Under 25 years, 25-44 years, 45-64 years, and over 65 years. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

Key observations

A closer examination of the distribution of households among age brackets shows 7 (1%) households where the householder is under 25 years old, 248 (35.53%) households with a householder aged between 25 and 44 years, 227 (32.52%) households with a householder aged between 45 and 64 years, and 216 (30.95%) households where the householder is over 65 years old.

The age group of 45 to 64 years exhibits the highest median household income, while the largest number of households falls within the 25 to 44 years bracket. This distribution hints at economic disparities within the city of Jonesboro, showcasing varying income levels among different age demographics.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income brackets:

Less than $10,000

$10,000 to $14,999

$15,000 to $19,999

$20,000 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $59,999

$60,000 to $74,999

$75,000 to $99,999

$100,000 to $124,999

$125,000 to $149,999

$150,000 to $199,999

$200,000 or more

Variables / Data Columns

Household Income: This column lists the 16 income brackets, ranging from under $10,000 to $200,000 or more (as listed above).

Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.

25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.

45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.

65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Jonesboro median household income by age.
Age-wise distribution of Grant township, Oceana County, Michigan household...
neilsberg.com
csv, json
Updated Jan 9, 2024
Cite
Neilsberg Research (2024). Age-wise distribution of Grant township, Oceana County, Michigan household incomes: Comparative analysis across 16 income brackets [Dataset]. https://www.neilsberg.com/research/datasets/85b6fe73-8dec-11ee-9302-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Jan 9, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Oceana County, Grant Township, Michigan
Variables measured
Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with how many of those households fall into each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimates, please contact us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the household distribution across 16 income brackets among four distinct age groups in Grant township: Under 25 years, 25-44 years, 45-64 years, and over 65 years. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

Key observations

A closer examination of the distribution of households among age brackets shows 10 (1%) households where the householder is under 25 years old, 336 (33.47%) households with a householder aged between 25 and 44 years, 415 (41.33%) households with a householder aged between 45 and 64 years, and 243 (24.20%) households where the householder is over 65 years old.

The age group of 25 to 44 years exhibits the highest median household income, while the largest number of households falls within the 45 to 64 years bracket. This distribution hints at economic disparities within Grant township, showcasing varying income levels among different age demographics.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Income brackets:

Less than $10,000

$10,000 to $14,999

$15,000 to $19,999

$20,000 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $59,999

$60,000 to $74,999

$75,000 to $99,999

$100,000 to $124,999

$125,000 to $149,999

$150,000 to $199,999

$200,000 or more

Variables / Data Columns

Household Income: This column lists the 16 income brackets, ranging from less than $10,000 to $200,000 or more (as listed above).

Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.

25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.

45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.

65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.
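
To illustrate how a file with these columns might be read, here is a hedged sketch that totals households per age group. It assumes the CSV headers match the column names described above exactly, which may not hold for the actual download. The sample rows are placeholder values for illustration only, not real figures, and `households_by_age` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT real figures.
# Headers are assumed to match the column names in the description.
SAMPLE_CSV = """Household Income,Under 25 years,25 to 44 years,45 to 64 years,65 years and over
"Less than $10,000",1,2,3,4
"$10,000 to $14,999",0,5,6,7
"""

AGE_COLUMNS = [
    "Under 25 years",
    "25 to 44 years",
    "45 to 64 years",
    "65 years and over",
]


def households_by_age(csv_text):
    """Sum the household counts in each age column across all income brackets."""
    totals = {col: 0 for col in AGE_COLUMNS}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for col in AGE_COLUMNS:
            totals[col] += int(row[col])
    return totals


totals = households_by_age(SAMPLE_CSV)
print(totals)
```

Summing down each age column gives the number of households per age group; summing across a row would give the number of households in a single income bracket.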

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Grant township median household income by age.
