100+ datasets found

Data from: College Completion Dataset
kaggle.com
Updated Dec 6, 2022
Cite
The Devastator (2022). College Completion Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/boost-student-success-with-college-completion-da
Dataset updated
Dec 6, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
College Completion Dataset

Graduation Rates, Race, Efficiency Measures and More

By Jonathan Ortiz [source]

About this dataset

This College Completion dataset provides insight into the success and progress of college students in the United States. It contains graduation rates, race and other data to offer a comprehensive view of college completion in America. The data comes from two primary sources: the National Center for Education Statistics (NCES) Integrated Postsecondary Education Data System (IPEDS) and the Voluntary System of Accountability's Student Success and Progress rate.

The graduation figures come from IPEDS for first-time, full-time, degree-seeking undergraduate students who entered college six years earlier at four-year institutions or three years earlier at two-year institutions. Furthermore, colleges report how many students completed their program within 100 percent and 150 percent of normal time, which at four-year institutions corresponds to graduation within four years or six years, respectively. Students reported as being of two or more races are included in totals but not shown separately.

For race and ethnicity data, NCES has classified student demographics since 2009 into seven categories: White non-Hispanic; Black non-Hispanic; American Indian/Alaskan Native; Asian/Pacific Islander; unknown race or ethnicity; and nonresident, with two new categories: Native Hawaiian or Other Pacific Islander (combined with Asian) and students belonging to several races. It is also worth noting that different classifications for graduation data stemming from 2008 could be due to variations in the time frame examined and the groupings used by particular colleges. Students who cannot be identified from National Student Clearinghouse records are not counted as a penalty against these institutions.

For efficiency measures, parameters such as "Awards per 100 Full-Time Undergraduate Students" are useful; this measure includes all undergraduate completions reported by a particular institution, including associate degrees and certificates from programs of less than four years. Other measures to consider include expenditure categories, Pell grant percentage, endowment values, average student aid amounts, and full-time faculty members contributing to instruction, research, or public service.

To quantify outcomes, the Median Estimated SAT score metric helps; it is derived on either a 25th percentile or a 75th percentile basis, and is further qualified by a 90% threshold applied when incoming students are considered for relevance. Last but not least, Average Student Aid is computed by dividing the amount granted by the institution over the total sum received against what was allotted that particular year.

All this analysis gives an opportunity get a holistic overview about performance , potential deficits &

How to use the dataset

This dataset contains data on student success, graduation rates, race and gender demographics, an efficiency measure to compare colleges across states and more. It is a great source of information to help you better understand college completion and student success in the United States.

In this guide we’ll explain how to use the data so that you can find out the best colleges for students with certain characteristics or focus on your target completion rate. We’ll also provide some useful tips for getting the most out of this dataset when seeking guidance on which institutions offer the highest graduation rates or have a good reputation for success in terms of completing programs within normal timeframes.

Before getting into specifics about interpreting this dataset, it is important to understand that each row represents a particular institution, with identifying fields such as its state, level (two-year vs four-year), control (public vs private), name and website. The columns contain various measures, such as the rate of awarding degrees compared to other institutions in its sector; race/ethnicity makeup; full-time faculty percentage; median SAT score among first-time students; awards/grants compared with the national or state average, as applicable to the institution's location; and more.

When using this dataset, our suggestion is that you begin by forming a hypothesis or research question concerning student completion at a given school based upon observable characteristics like financ...
Community Services Statistics for Children, Young People and Adults,...
gov.uk
Updated Dec 6, 2022
+ more versions
Cite
NHS Digital (2022). Community Services Statistics for Children, Young People and Adults, September 2022 [Dataset]. https://www.gov.uk/government/statistics/community-services-statistics-for-children-young-people-and-adults-september-2022
Dataset updated
Dec 6, 2022
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
NHS Digital
Description
This is a monthly report on publicly funded community services for children, young people and adults using data from the Community Services Data Set (CSDS) reported in England. The CSDS is a patient-level dataset and has been developed to help achieve better outcomes for children, young people and adults. It provides data that will be used to commission services in a way that improves health, reduces inequalities, and supports service improvement and clinical quality. These services can include NHS Trusts, health centres, schools, mental health trusts, and local authorities. The data collected in CSDS includes personal and demographic information, diagnoses including long-term conditions and disabilities and care events plus screening activities. These statistics are classified as experimental and should be used with caution. Experimental statistics are new official statistics undergoing evaluation. They are published in order to involve users and stakeholders in their development and as a means to build in quality at an early stage. More information about experimental statistics can be found on the UK Statistics Authority website. We hope this information is helpful and would be grateful if you could spare a couple of minutes to complete a short customer satisfaction survey. Please use the survey in the related links to provide us with any feedback or suggestions for improving the report.
PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IS BELOW THE POVERTY LEVEL -...
portal.tad3.org
Updated Jul 23, 2023
+ more versions
Cite
(2023). PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IS BELOW THE POVERTY LEVEL - DP03_MAN_P - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/percentage-of-families-and-people-whose-income-is-below-the-poverty-level--dp03_man_p
Dataset updated
Jul 23, 2023
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
SELECTED ECONOMIC CHARACTERISTICS: PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL - DP03
Universe - All families and all people
Survey-Program - American Community Survey 5-year estimates
Years - 2020, 2021, 2022
Poverty statistics in American Community Survey (ACS) products adhere to the standards specified by the Office of Management and Budget in Statistical Policy Directive 14. The Census Bureau uses a set of dollar value thresholds that vary by family size and composition to determine who is in poverty. Further, poverty thresholds for people living alone or with nonrelatives (unrelated individuals) vary by age (under 65 years or 65 years and older). The poverty thresholds for two-person families also vary by the age of the householder. If a family's total income is less than the dollar value of the appropriate threshold, then that family and every individual in it are considered to be in poverty. Similarly, if an unrelated individual's total income is less than the appropriate threshold, then that individual is considered to be in poverty.
Level Plains, AL Population Breakdown by Gender and Age Dataset: Male and...
neilsberg.com
csv, json
Updated Feb 24, 2025
+ more versions
Cite
Neilsberg Research (2025). Level Plains, AL Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1ec949c-f25d-11ef-8c1b-3860777c1fe6/
Available download formats: csv, json
Dataset updated
Feb 24, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Level Plains, Alabama
Variables measured
Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Level Plains by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Level Plains. The dataset can be used to understand the population distribution of Level Plains by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Level Plains. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across each age group for Level Plains.

Key observations

Largest age group (population): Male # 45-49 years (137) | Female # 60-64 years (95). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Scope of gender:

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

Variables / Data Columns

Age Group: This column displays the age group for the Level Plains population analysis. There are 18 expected values, defined above in the age groups section.

Population (Male): This column shows the male population in Level Plains.

Population (Female): This column shows the female population in Level Plains.

Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Level Plains for each age group.
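
The gender ratio column described above is simple arithmetic, and a short sketch may make the definition concrete. The function name `gender_ratio` and the choice to return `None` when an age group has no females are assumptions made for illustration; the source does not say how the published dataset handles that case.

```python
from typing import Optional


def gender_ratio(males: int, females: int) -> Optional[float]:
    """Return the number of males per 100 females.

    Returns None when there are no females, because the ratio is
    undefined in that case (an assumption, not the dataset's rule).
    """
    if males < 0 or females < 0:
        raise ValueError("population counts must be non-negative")
    if females == 0:
        return None
    return 100.0 * males / females


if __name__ == "__main__":
    # Illustrative counts only, not taken from the dataset.
    print(gender_ratio(50, 100))
```

A value above 100 means the age group has more males than females, and a value below 100 means the reverse.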

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Level Plains Population by Gender.
Healthy People 2020 Final Progress by Population Group Chart and Table
catalog.data.gov
data.virginia.gov
+2 more
Updated Apr 23, 2025
+ more versions
Cite
Centers for Disease Control and Prevention (2025). Healthy People 2020 Final Progress by Population Group Chart and Table [Dataset]. https://catalog.data.gov/dataset/healthy-people-2020-final-progress-by-population-group-chart-and-table-617d0
Dataset updated
Apr 23, 2025
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Description
[1] The Progress by Population Group analysis is a component of the Healthy People 2020 (HP2020) Final Review. The analysis included subsets of the 1,111 measurable HP2020 objectives that have data available for any of six broad population characteristics: sex, race and ethnicity, educational attainment, family income, disability status, and geographic location. Progress toward meeting HP2020 targets is presented for up to 24 population groups within these characteristics, based on objective data aggregated across HP2020 topic areas. The Progress by Population Group data are also available at the individual objective level in the downloadable data set.

[2] The final value was generally based on data available on the HP2020 website as of January 2020. For objectives that are continuing into HP2030, more recent data will be included on the HP2030 website as it becomes available: https://health.gov/healthypeople.

[3] For more information on the HP2020 methodology for measuring progress toward target attainment and the elimination of health disparities, see: Healthy People Statistical Notes, no 27; available from: https://www.cdc.gov/nchs/data/statnt/statnt27.pdf.

[4] Status for objectives included in the HP2020 Progress by Population Group analysis was determined using the baseline, final, and target value. The progress status categories used in HP2020 were:

a. Target met or exceeded. One of the following applies: (i) At baseline, the target was not met or exceeded, and the most recent value was equal to or exceeded the target (the percentage of targeted change achieved was equal to or greater than 100%); (ii) The baseline and most recent values were equal to or exceeded the target (the percentage of targeted change achieved was not assessed).

b. Improved. One of the following applies: (i) Movement was toward the target, standard errors were available, and the percentage of targeted change achieved was statistically significant; (ii) Movement was toward the target, standard errors were not available, and the objective had achieved 10% or more of the targeted change.

c. Little or no detectable change. One of the following applies: (i) Movement was toward the target, standard errors were available, and the percentage of targeted change achieved was not statistically significant; (ii) Movement was toward the target, standard errors were not available, and the objective had achieved less than 10% of the targeted change; (iii) Movement was away from the baseline and target, standard errors were available, and the percent change relative to the baseline was not statistically significant; (iv) Movement was away from the baseline and target, standard errors were not available, and the objective had moved less than 10% relative to the baseline; (v) No change was observed between the baseline and the final data point.

d. Got worse. One of the following applies: (i) Movement was away from the baseline and target, standard errors were available, and the percent change relative to the baseline was statistically significant; (ii) Movement was away from the baseline and target, standard errors were not available, and the objective had moved 10% or more relative to the baseline.

NOTE: Measurable objectives had baseline data.

SOURCE: National Center for Health Statistics, Healthy People 2020 Progress by Population Group database.
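
The four status categories in note [4] amount to a small decision procedure, which the following Python sketch tries to mirror. It is an illustration only: the function name `progress_status` and its parameters are hypothetical, the statistical significance test itself is not implemented (its result is passed in), and `pct_change` is assumed to hold the percentage of targeted change achieved when movement is toward the target, or the percent change relative to baseline when movement is away from it.

```python
from typing import Optional

TARGET_MET = "Target met or exceeded"
IMPROVED = "Improved"
LITTLE_CHANGE = "Little or no detectable change"
GOT_WORSE = "Got worse"


def progress_status(
    target_met_at_final: bool,
    toward_target: Optional[bool],
    significant: Optional[bool],
    pct_change: float,
) -> str:
    """Classify an objective following the rules quoted in note [4].

    toward_target: True (toward), False (away), None (no change observed).
    significant:   None when standard errors were not available.
    pct_change:    magnitude in percent, as described in the lead-in.
    """
    if target_met_at_final:
        return TARGET_MET  # category a, cases (i) and (ii)
    if toward_target is None:
        return LITTLE_CHANGE  # category c, case (v)
    if significant is not None:
        # Standard errors available: significance decides the category.
        changed = significant
    else:
        # No standard errors: fall back to the 10% rule.
        changed = abs(pct_change) >= 10.0
    if not changed:
        return LITTLE_CHANGE
    return IMPROVED if toward_target else GOT_WORSE


if __name__ == "__main__":
    print(progress_status(False, True, None, 12.5))
```

Passing the significance result as a flag keeps the sketch independent of whichever test the HP2020 methodology applies; Statistical Notes no 27, cited in note [3], is the place to look for that detail.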

Cost of International Education

kaggle.com

Updated May 7, 2025

Cite

Adil Shamim (2025). Cost of International Education [Dataset]. https://www.kaggle.com/datasets/adilshamim8/cost-of-international-education

Dataset updated

May 7, 2025

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Adil Shamim

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This Cost of International Education dataset compiles detailed financial information for students pursuing higher education abroad. It covers multiple countries, cities, and universities around the world, capturing the full tuition and living expenses spectrum alongside key ancillary costs. With standardized fields such as tuition in USD, living-cost indices, rent, visa fees, insurance, and up-to-date exchange rates, it enables comparative analysis across programs, degree levels, and geographies. Whether you’re a prospective international student mapping out budgets, an educational consultant advising on affordability, or a researcher studying global education economics, this dataset offers a comprehensive foundation for data-driven insights.

Description

Column	Type	Description
Country	`string`	ISO country name where the university is located (e.g., “Germany”, “Australia”).
City	`string`	City in which the institution sits (e.g., “Munich”, “Melbourne”).
University	`string`	Official name of the higher-education institution (e.g., “Technical University of Munich”).
Program	`string`	Specific course or major (e.g., “Master of Computer Science”, “MBA”).
Level	`string`	Degree level of the program: “Undergraduate”, “Master’s”, “PhD”, or other certifications.
Duration_Years	`integer`	Length of the program in years (e.g., `2` for a typical Master’s).
Tuition_USD	`numeric`	Total program tuition cost, converted into U.S. dollars for ease of comparison.
Living_Cost_Index	`numeric`	A normalized index (often based on global city indices) reflecting relative day-to-day living expenses (food, transport, utilities).
Rent_USD	`numeric`	Average monthly student accommodation rent in U.S. dollars.
Visa_Fee_USD	`numeric`	One-time visa application fee payable by international students, in U.S. dollars.
Insurance_USD	`numeric`	Annual health or student insurance cost in U.S. dollars, as required by many host countries.
Exchange_Rate	`numeric`	Local currency units per U.S. dollar at the time of data collection—vital for currency conversion and trend analysis if rates fluctuate.
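
Because the columns mix one-time, monthly and annual amounts, combining them into a single figure takes some care. The sketch below shows one way to do it, assuming the units are exactly as the table describes them (tuition for the whole program, rent per month, insurance per year, visa fee once). The class `ProgramCost`, the function `estimated_total_usd` and the sample figures are hypothetical; `Living_Cost_Index` is left out because it is an index rather than a dollar amount.

```python
from dataclasses import dataclass


@dataclass
class ProgramCost:
    """One row of the dataset, reduced to the dollar-valued columns."""
    duration_years: int    # Duration_Years
    tuition_usd: float     # Tuition_USD, total for the program
    rent_usd: float        # Rent_USD, per month
    visa_fee_usd: float    # Visa_Fee_USD, one-time
    insurance_usd: float   # Insurance_USD, per year


def estimated_total_usd(p: ProgramCost) -> float:
    """Rough total cost of the program in U.S. dollars."""
    rent_total = p.rent_usd * 12 * p.duration_years
    insurance_total = p.insurance_usd * p.duration_years
    return p.tuition_usd + rent_total + p.visa_fee_usd + insurance_total


if __name__ == "__main__":
    # Made-up figures for illustration, not values from the dataset.
    sample = ProgramCost(
        duration_years=2,
        tuition_usd=10_000,
        rent_usd=500,
        visa_fee_usd=100,
        insurance_usd=300,
    )
    print(estimated_total_usd(sample))
```

If the tuition column in the downloaded file turns out to be an annual figure, the tuition term would need to be multiplied by the duration as well.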

Potential Uses

Budget Planning: Prospective students can filter by country, program level, or university to forecast total expenses and compare across destinations.
Policy Analysis: Educational policymakers and NGOs can assess the affordability of international education and design support programs.
Economic Research: Economists can correlate living-cost indices and tuition levels with enrollment rates or student demographics.
University Benchmarking: Institutions can benchmark their fees and ancillary costs against peer universities worldwide.

Notes on Data Collection & Quality

Currency Conversions: All monetary values are unified to USD using contemporaneous exchange rates to facilitate direct comparison.
Living Cost Index: Derived from reputable city-index publications (e.g., Numbeo, Mercer) to standardize disparate cost-of-living metrics.
Data Currency: Exchange rates and fee schedules should be periodically updated to reflect market fluctuations and policy changes.

Feel free to explore, visualize, and extend this dataset for deeper insights into the true cost of studying abroad!

United States COVID-19 County Level of Community Transmission Historical...
data.virginia.gov
healthdata.gov
+1 more
csv, json, rdf, xsl
Updated Feb 23, 2025
+ more versions
Cite
Centers for Disease Control and Prevention (2025). United States COVID-19 County Level of Community Transmission Historical Changes - ARCHIVED [Dataset]. https://data.virginia.gov/dataset/united-states-covid-19-county-level-of-community-transmission-historical-changes-archived
Available download formats: json, rdf, xsl, csv
Dataset updated
Feb 23, 2025
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Area covered
United States
Description
On October 20, 2022, CDC began retrieving aggregate case and death data from jurisdictional and state partners weekly instead of daily. This dataset contains archived historical community transmission and related data elements by county. Although these data will continue to be publicly available, this dataset has not been updated since October 20, 2022. An archived dataset containing weekly historical community transmission data by county can also be found here: Weekly COVID-19 County Level of Community Transmission Historical Changes | Data | Centers for Disease Control and Prevention (cdc.gov).

Related data

CDC has been providing the public with two versions of COVID-19 county-level community transmission level data: this historical dataset with the daily county-level transmission data from January 22, 2020, and a dataset with the daily values as originally posted on the COVID Data Tracker. Like this dataset, the original dataset with daily data as posted was archived on 10/20/2022. It will continue to be publicly available but will no longer be updated. A new dataset containing community transmission data by county as originally posted is now published weekly and can be found at: Weekly COVID-19 County Level of Community Transmission as Originally Posted | Data | Centers for Disease Control and Prevention (cdc.gov).

This public use dataset has 7 data elements reflecting historical data for community transmission levels for all available counties and jurisdictions. It contains historical data for the county level of community transmission and includes updated data submitted by states and jurisdictions. Each day, the dataset was updated to include the most recent days’ data and incorporate any historical changes made by jurisdictions. This dataset includes data since January 22, 2020. Transmission level is set to low, moderate, substantial, or high using the calculation rules below.

Methods for calculating county level of community transmission indicator

The County Level of Community Transmission indicator uses two metrics: (1) total new COVID-19 cases per 100,000 persons in the last 7 days and (2) percentage of positive SARS-CoV-2 diagnostic nucleic acid amplification tests (NAAT) in the last 7 days. For each of these metrics, CDC classifies transmission values as low, moderate, substantial, or high (see below). If the values for each of these two metrics differ (e.g., one indicates moderate and the other low), then the higher of the two should be used for decision-making.

CDC core metrics of and thresholds for community transmission levels of SARS-CoV-2

Total New Case Rate Metric: "New cases per 100,000 persons in the past 7 days" is calculated by adding the number of new cases in the county (or other administrative level) in the last 7 days divided by the population in the county (or other administrative level) and multiplying by 100,000. "New cases per 100,000 persons in the past 7 days" is considered to have transmission level of Low (0-9.99); Moderate (10.00-49.99); Substantial (50.00-99.99); and High (greater than or equal to 100.00).
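
The case-rate calculation and its thresholds, together with the "higher of the two" rule stated in the methods paragraph, can be sketched in Python as follows. The function names are hypothetical, and treating the bands as half-open intervals (for example, anything below 10 is low) is an assumption about how values between the published two-decimal bounds are handled. The positivity metric is not classified here; its level is passed in as a label.

```python
# Levels ordered from lowest to highest so they can be compared by index.
LEVELS = ["low", "moderate", "substantial", "high"]


def new_cases_per_100k(new_cases_7d: int, population: int) -> float:
    """New cases in the last 7 days per 100,000 persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases_7d * 100_000 / population


def case_rate_level(rate: float) -> str:
    """Map a 7-day case rate to the transmission level quoted above."""
    if rate < 10:
        return "low"
    if rate < 50:
        return "moderate"
    if rate < 100:
        return "substantial"
    return "high"


def combined_level(case_level: str, positivity_level: str) -> str:
    """When the two metrics disagree, use the higher level."""
    return max(case_level, positivity_level, key=LEVELS.index)


if __name__ == "__main__":
    rate = new_cases_per_100k(25, 50_000)
    print(rate, case_rate_level(rate), combined_level("moderate", "low"))
```

Keeping the levels in an ordered list means the tie-breaking rule is a one-line `max`, and the same helper works whichever metric produced each label.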

Test Percent Positivity Metric: "Percentage of positive NAAT in the past 7 days" is calculated by dividing the number of positive tests in the county (or other administrative level) during the last 7 days by the total number of tests resulted over the last 7 days. "Percentage of positive NAAT in the past 7 days" is considered to have transmission level of Low (less than 5.00); Moderate (5.00-7.99); Substa
Student Grades
kaggle.com
Updated Mar 6, 2025
Cite
clemence travers (2025). Student Grades [Dataset]. https://www.kaggle.com/datasets/clemencetravers/student-grades/discussion
Dataset updated
Mar 6, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
clemence travers
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This Dataset comes from Mahmoud Elhemaly.

I've modified this dataset since it had no correlation between the variables. I've used it for data visualization in Tableau. Many columns contain inaccurate data.

Description

Student Performance & Behavior Dataset

This dataset consists of 5,000 real records collected from a private learning provider. The dataset includes key attributes necessary for exploring patterns, correlations, and insights related to academic performance.

Columns:

Student_ID: Unique identifier for each student.

First_Name: Student’s first name.

Last_Name: Student’s last name.

Email: Contact email (can be anonymized).

Gender: Male, Female, Other.

Age: The age of the student.

Department: Student's department (e.g., CS, Engineering, Business).

Attendance (%): Attendance percentage (0-100%).

Participation_Score: Score based on class participation (0-10).

Projects_Score: Project evaluation score (out of 100).

Total_Score: Weighted sum of all grades.

Grade: Letter grade (A, B, C, D, F).

Study_Hours_per_Week: Average study hours per week.

Extracurricular_Activities: Whether the student participates in extracurriculars (Yes/No).

Internet_Access_at_Home: Does the student have access to the internet at home? (Yes/No).

Parent_Education_Level: Highest education level of parents (None, High School, Bachelor's, Master's, PhD).

Family_Income_Level: Low, Medium, High.

Stress_Level (1-10): Self-reported stress level (1: Low, 10: High).

Sleep_Hours_per_Night: Average hours of sleep per night.

Sleep_Hours_per_Night_Entier: Average hours of sleep per night, as integers only.

Country: Country of origin

Dataset contains:

Missing values (nulls): in some records (e.g., Attendance, Assignments, or Parent Education Level).

Bias in some data (e.g., in grading: students with high attendance get slightly better grades).

Imbalanced distributions (e.g., some departments having more students).
Red Level, AL Population Breakdown by Gender and Age
neilsberg.com
csv, json
Updated Sep 14, 2023
+ more versions
Cite
Neilsberg Research (2023). Red Level, AL Population Breakdown by Gender and Age [Dataset]. https://www.neilsberg.com/research/datasets/676fe252-3d85-11ee-9abe-0aa64bf2eeb2/
Available download formats: csv, json
Dataset updated
Sep 14, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Red Level, Alabama
Variables measured
Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Red Level by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Red Level. The dataset can be used to understand the population distribution of Red Level by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Red Level. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across each age group for Red Level.

Key observations

Largest age group (population): Male # 50-54 years (31) | Female # 20-24 years (41). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Scope of gender:

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

Variables / Data Columns

Age Group: This column displays the age group for the Red Level population analysis. There are 18 expected values, defined above in the age groups section.

Population (Male): This column shows the male population of Red Level in each age group.

Population (Female): This column shows the female population of Red Level in each age group.

Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Red Level for each age group.
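
The gender ratio defined above (males per 100 females) can be sketched as a small calculation. This is only an illustration: the function name and the counts passed to it are hypothetical and are not taken from the dataset, and returning `None` when there are no females in an age group is an assumption about how to handle an undefined ratio.

```python
from typing import Optional


def gender_ratio(male: int, female: int) -> Optional[float]:
    """Return males per 100 females, or None when the ratio is undefined."""
    if female == 0:
        # No females in the age group: the ratio cannot be computed.
        return None
    return round(100 * male / female, 2)


# Illustrative counts only (not values from the dataset).
print(gender_ratio(30, 40))
print(gender_ratio(5, 0))
```

Because the underlying counts are survey estimates with a margin of error, ratios for small age groups can swing widely and are best read with caution.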

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Red Level Population by Gender. You can refer to the same here
United States COVID-19 Community Levels by County
data.cdc.gov
data.virginia.gov
+1more
application/rdfxml +5
Updated Nov 2, 2023
Cite
CDC COVID-19 Response (2023). United States COVID-19 Community Levels by County [Dataset]. https://data.cdc.gov/Public-Health-Surveillance/United-States-COVID-19-Community-Levels-by-County/3nnm-4jni
Explore at:
Available download formats: application/rdfxml, application/rssxml, csv, tsv, xml, json
Dataset updated
Nov 2, 2023
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Authors
CDC COVID-19 Response
License
https://www.usa.gov/government-works
Area covered
United States
Description
Reporting of Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.

This archived public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties.

The COVID-19 community levels were developed using a combination of three metrics — new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days. The COVID-19 community level was determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases acts as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge.

Using these data, the COVID-19 community level was classified as low, medium, or high.
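
The classification logic described above (take the higher of the two hospital metrics, with cut-offs that depend on the case rate) can be sketched as follows. The numeric thresholds are not stated in this listing; they are recalled from CDC's published community level table and should be treated as an assumption to verify against the CDC documentation. The function and variable names are hypothetical.

```python
LEVELS = ("low", "medium", "high")


def community_level(cases_per_100k: float,
                    admissions_per_100k: float,
                    inpatient_bed_pct: float) -> str:
    """Classify a county as low, medium, or high (assumed thresholds)."""
    if cases_per_100k < 200:
        # Lower case rate: all three levels are possible.
        adm = 0 if admissions_per_100k < 10 else 1 if admissions_per_100k < 20 else 2
        bed = 0 if inpatient_bed_pct < 10 else 1 if inpatient_bed_pct < 15 else 2
    else:
        # Higher case rate: the lowest possible level is medium.
        adm = 1 if admissions_per_100k < 10 else 2
        bed = 1 if inpatient_bed_pct < 10 else 2
    # The community level is the higher of the two hospital metrics.
    return LEVELS[max(adm, bed)]


print(community_level(150, 5, 5))
print(community_level(250, 12, 5))
```

Note that the case rate never sets the level directly in this sketch; it only selects which set of cut-offs is applied to the admissions and inpatient bed metrics.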

COVID-19 Community Levels were used to help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.

For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.

Archived Data Notes:

This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.

March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previously released.

March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.

March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.

March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.

March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).

March 31, 2022: Data values for columns, “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added since the week of 3/3/2022, thus the values were previously missing for records released the week prior.

April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.

April 21, 2022: COVID-19 Community Level (CCL) data released for counties in Nebraska for the week of April 21, 2022 have 3 counties identified in the high category and 37 in the medium category. CDC has been working with state officials to verify the data submitted, as other data systems are not providing alerts for substantial increases in disease transmission or severity in the state.

May 26, 2022: COVID-19 Community Level (CCL) data released for McCracken County, KY for the week of May 5, 2022 have been updated to correct a data processing error. McCracken County, KY should have appeared in the low community level category during the week of May 5, 2022. This correction is reflected in this update.

May 26, 2022: COVID-19 Community Level (CCL) data released for several Florida counties for the week of May 19th, 2022, have been corrected for a data processing error. Of note, Broward, Miami-Dade, Palm Beach Counties should have appeared in the high CCL category, and Osceola County should have appeared in the medium CCL category. These corrections are reflected in this update.

May 26, 2022: COVID-19 Community Level (CCL) data released for Orange County, New York for the week of May 26, 2022 displayed an erroneous case rate of zero and a CCL category of low due to a data source error. This county should have appeared in the medium CCL category.

June 2, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a data processing error. Tolland County, CT should have appeared in the medium community level category during the week of May 26, 2022. This correction is reflected in this update.

June 9, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a misspelling. The medium community level category for Tolland County, CT on the week of May 26, 2022 was misspelled as “meduim” in the data set. This correction is reflected in this update.

June 9, 2022: COVID-19 Community Level (CCL) data released for Mississippi counties for the week of June 9, 2022 should be interpreted with caution due to a reporting cadence change over the Memorial Day holiday that resulted in artificially inflated case rates in the state.

July 7, 2022: COVID-19 Community Level (CCL) data released for Rock County, Minnesota for the week of July 7, 2022 displayed an artificially low case rate and CCL category due to a data source error. This county should have appeared in the high CCL category.

July 14, 2022: COVID-19 Community Level (CCL) data released for Massachusetts counties for the week of July 14, 2022 should be interpreted with caution due to a reporting cadence change that resulted in lower than expected case rates and CCL categories in the state.

July 28, 2022: COVID-19 Community Level (CCL) data released for all Montana counties for the week of July 21, 2022 had case rates of 0 due to a reporting issue. The case rates have been corrected in this update.

July 28, 2022: COVID-19 Community Level (CCL) data released for Alaska for all weeks prior to July 21, 2022 included non-resident cases. The case rates for the time series have been corrected in this update.

July 28, 2022: A laboratory in Nevada reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate will be inflated in Clark County, NV for the week of July 28, 2022.

August 4, 2022: COVID-19 Community Level (CCL) data was updated on August 2, 2022 in error during performance testing. Data for the week of July 28, 2022 was changed during this update due to additional case and hospital data as a result of late reporting between July 28, 2022 and August 2, 2022. Since the purpose of this data set is to provide point-in-time views of COVID-19 Community Levels on Thursdays, any changes made to the data set during the August 2, 2022 update have been reverted in this update.

August 4, 2022: COVID-19 Community Level (CCL) data for the week of July 28, 2022 for 8 counties in Utah (Beaver County, Daggett County, Duchesne County, Garfield County, Iron County, Kane County, Uintah County, and Washington County) case data was missing due to data collection issues. CDC and its partners have resolved the issue and the correction is reflected in this update.

August 4, 2022: Due to a reporting cadence change, case rates for all Alabama counties will be lower than expected. As a result, the CCL levels published on August 4, 2022 should be interpreted with caution.

August 11, 2022: COVID-19 Community Level (CCL) data for the week of August 4, 2022 for South Carolina have been updated to correct a data collection error that resulted in incorrect case data. CDC and its partners have resolved the issue and the correction is reflected in this update.

August 18, 2022: COVID-19 Community Level (CCL) data for the week of August 11, 2022 for Connecticut have been updated to correct a data ingestion error that inflated the CT case rates. CDC, in collaboration with CT, has resolved the issue and the correction is reflected in this update.

August 25, 2022: A laboratory in Tennessee reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate may be inflated in many counties and the CCLs published on August 25, 2022 should be interpreted with caution.

August 25, 2022: Due to a data source error, the 7-day case rate for St. Louis County, Missouri, is reported as zero in the COVID-19 Community Level data released on August 25, 2022. Therefore, the COVID-19 Community Level for this county should be interpreted with caution.

September 1, 2022: Due to a reporting issue, case rates for all Nebraska counties will include 6 days of data instead of 7 days in the COVID-19 Community Level (CCL) data released on September 1, 2022. Therefore, the CCLs for all Nebraska counties should be interpreted with caution.

September 8, 2022: Due to a data processing error, the case rate for Philadelphia County, Pennsylvania,
School Learning Modalities, 2020-2021
kaggle.com
healthdata.gov
+3more
Updated Oct 26, 2023
Cite
Mohamad Javad Babaei (2023). School Learning Modalities, 2020-2021 [Dataset]. https://www.kaggle.com/datasets/mhmtaha/school-learning-modalities-2020-2021
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 26, 2023
Dataset provided by
Kaggle
Authors
Mohamad Javad Babaei
License
https://www.usa.gov/government-works/
Description
The 2020-2021 School Learning Modalities dataset provides weekly estimates of school learning modality (including in-person, remote, or hybrid learning) for U.S. K-12 public and independent charter school districts for the 2020-2021 school year, from August 2020 – June 2021.

These data were modeled using multiple sources of input data (see below) to infer the most likely learning modality of a school district for a given week. These data should be considered district-level estimates and may not always reflect true learning modality, particularly for districts in which data are unavailable. If a district reports multiple modality types within the same week, the modality offered for the majority of those days is reflected in the weekly estimate. All school district metadata are sourced from the National Center for Educational Statistics (NCES) for 2020-2021.

School learning modality types are defined as follows:

In-Person: All schools within the district offer face-to-face instruction 5 days per week to all students at all available grade levels.

Remote: Schools within the district do not offer face-to-face instruction; all learning is conducted online/remotely to all students at all available grade levels.

Hybrid: Schools within the district offer a combination of in-person and remote learning; face-to-face instruction is offered less than 5 days per week, or only to a subset of students.

Data Information

School learning modality data provided here are model estimates using combined input data and are not guaranteed to be 100% accurate. This learning modality dataset was generated by combining data from four different sources: Burbio [1], MCH Strategic Data [2], the AEI/Return to Learn Tracker [3], and state dashboards [4-20]. These data were combined using a Hidden Markov model which infers the sequence of learning modalities (In-Person, Hybrid, or Remote) for each district that is most likely to produce the modalities reported by these sources. This model was trained using data from the 2020-2021 school year. Metadata describing the location, number of schools and number of students in each district comes from NCES [21]. You can read more about the model in the CDC MMWR: COVID-19–Related School Closures and Learning Modality Changes — United States, August 1–September 17, 2021. The metrics listed for each school learning modality reflect totals by district and the number of enrolled students per district for which data are available. School districts represented here exclude private schools and include the following NCES subtypes:

Public school district that is NOT a component of a supervisory union

Public school district that is a component of a supervisory union

Independent charter district

“BI” in the state column refers to school districts funded by the Bureau of Indian Education.
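
To illustrate how a Hidden Markov model can infer the most likely sequence of modalities from noisy weekly reports, here is a minimal Viterbi decoding sketch. It is not the CDC model: it assumes a single combined report per week rather than four sources, and the start, transition, and emission probabilities are made-up placeholders chosen only to show the mechanics.

```python
import math

STATES = ("In-Person", "Hybrid", "Remote")

# Hypothetical parameters: districts tend to keep their modality from week
# to week, and a report matches the true modality 80% of the time.
START = {s: 1 / 3 for s in STATES}
TRANS = {a: {b: 0.8 if a == b else 0.1 for b in STATES} for a in STATES}
EMIT = {a: {b: 0.8 if a == b else 0.1 for b in STATES} for a in STATES}


def viterbi(observations):
    """Return the most likely hidden modality sequence for the reports."""
    # Log-probability of the best path ending in each state at week 0.
    best = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
            for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            # Best previous state to transition from into s.
            prev = max(STATES, key=lambda p: best[p] + math.log(TRANS[p][s]))
            scores[s] = best[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            pointers[s] = prev
        best = scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: best[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


reports = ["Remote", "Remote", "Hybrid", "Remote", "Remote"]
print(viterbi(reports))
```

With these placeholder parameters, a single out-of-pattern weekly report is treated as noise rather than a real change in modality, which is the smoothing behavior one would expect from a model with persistent states.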

Technical Notes

Data from September 1, 2020 to June 25, 2021 correspond to the 2020-2021 school year. During this timeframe, all four sources of data were available. Inferred modalities with a probability below 0.75 were deemed inconclusive and were omitted. Data for the month of July may show “In Person” status although most school districts are effectively closed during this time for summer break. Users may wish to exclude July data from use for this reason where applicable.

Sources

K-12 School Opening Tracker. Burbio 2021; https
PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IS BELOW THE POVERTY LEVEL -...
portal.tad3.org
Updated Jul 23, 2023
Cite
(2023). PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IS BELOW THE POVERTY LEVEL - DP03_SAR_T - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/percentage-of-families-and-people-whose-income-is-below-the-poverty-level--dp03_sar_t
Explore at:
Dataset updated
Jul 23, 2023
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
SELECTED ECONOMIC CHARACTERISTICS: PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL - DP03

Universe - All families and All People

Survey-Program - American Community Survey 5-year estimates

Years - 2020, 2021, 2022

Poverty statistics in American Community Survey (ACS) products adhere to the standards specified by the Office of Management and Budget in Statistical Policy Directive 14. The Census Bureau uses a set of dollar value thresholds that vary by family size and composition to determine who is in poverty. Further, poverty thresholds for people living alone or with nonrelatives (unrelated individuals) vary by age (under 65 years or 65 years and older). The poverty thresholds for two-person families also vary by the age of the householder. If a family’s total income is less than the dollar value of the appropriate threshold, then that family and every individual in it are considered to be in poverty. Similarly, if an unrelated individual’s total income is less than the appropriate threshold, then that individual is considered to be in poverty.
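
The family-level rule described above can be sketched in a few lines. This is a hypothetical illustration: the `Family` type, the member names, and the threshold value are placeholders, not actual Census Bureau thresholds, which vary by family size, composition, and age.

```python
from dataclasses import dataclass


@dataclass
class Family:
    members: list[str]
    total_income: float


def classify_family(family: Family, threshold: float) -> dict[str, bool]:
    """Mark every member as in poverty if family income is below the threshold."""
    # The comparison is strict: income equal to the threshold is not in poverty.
    in_poverty = family.total_income < threshold
    return {member: in_poverty for member in family.members}


# Placeholder threshold for illustration only.
example = Family(members=["parent", "child"], total_income=19_999)
print(classify_family(example, threshold=20_000))
```

The same comparison applies to an unrelated individual, treated as a one-person unit with the threshold appropriate to their age group.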
Planning Database: Tract Level
catalog.data.gov
Updated Jul 19, 2023
Cite
U.S. Census Bureau (2023). Planning Database: Tract Level [Dataset]. https://catalog.data.gov/dataset/planning-database-tract-level
Explore at:
Dataset updated
Jul 19, 2023
Dataset provided by
United States Census Bureau (http://census.gov/)
Description
The PDB is a database of U.S. housing, demographic, socioeconomic and operational statistics based on select 2010 Decennial Census and select 5-year American Community Survey (ACS) estimates. Data are provided at the census block group level of geography. These data can be used for many purposes, including survey field operations planning.
Climate Ready Boston Social Vulnerability
bostonopendata-boston.opendata.arcgis.com
data.boston.gov
+1more
Updated Sep 21, 2017
Cite
BostonMaps (2017). Climate Ready Boston Social Vulnerability [Dataset]. https://bostonopendata-boston.opendata.arcgis.com/datasets/34f2c48b670d4b43a617b1540f20efe3_0
Explore at:
Dataset updated
Sep 21, 2017
Dataset authored and provided by
BostonMaps
Area covered

Description
Social vulnerability is defined as the disproportionate susceptibility of some social groups to the impacts of hazards, including death, injury, loss, or disruption of livelihood. In this dataset from Climate Ready Boston, groups identified as being more vulnerable are older adults, children, people of color, people with limited English proficiency, people with low or no incomes, people with disabilities, and people with medical illnesses.

Source: The analysis and definitions used in Climate Ready Boston (2016) are based on "A framework to understand the relationship between social factors that reduce resilience in cities: Application to the City of Boston." Published 2015 in the International Journal of Disaster Risk Reduction by Atyia Martin, Northeastern University.

Population Definitions:

Older Adults: Older adults (those over age 65) have physical vulnerabilities in a climate event; they suffer from higher rates of medical illness than the rest of the population and can have some functional limitations in an evacuation scenario, as well as when preparing for and recovering from a disaster. Furthermore, older adults are physically more vulnerable to the impacts of extreme heat. Beyond the physical risk, older adults are more likely to be socially isolated. Without an appropriate support network, an initially small risk could be exacerbated if an older adult is not able to get help.

Data source: 2008-2012 American Community Survey 5-year Estimates (ACS) data by census tract for population over 65 years of age.

Attribute label: OlderAdult

Children: Families with children require additional resources in a climate event. When school is cancelled, parents need alternative childcare options, which can mean missing work. Children are especially vulnerable to extreme heat and stress following a natural disaster.

Data source: 2010 American Community Survey 5-year Estimates (ACS) data by census tract for population under 5 years of age.

Attribute label: TotChild

People of Color: People of color make up a majority (53 percent) of Boston’s population. People of color are more likely to fall into multiple vulnerable groups as well. People of color statistically have lower levels of income and higher levels of poverty than the population at large. People of color, many of whom also have limited English proficiency, may not have ready access in their primary language to information about the dangers of extreme heat or about cooling center resources. This risk to extreme heat can be compounded by the fact that people of color often live in more densely populated urban areas that are at higher risk for heat exposure due to the urban heat island effect.

Data source: 2008-2012 American Community Survey 5-year Estimates (ACS) data by census tract: Black, Native American, Asian, Island, Other, Multi, Non-white Hispanics.

Attribute label: POC2

Limited English Proficiency: Without adequate English skills, residents can miss crucial information on how to prepare for hazards. Cultural practices for information sharing, for example, may focus on word-of-mouth communication. In a flood event, residents can also face challenges communicating with emergency response personnel. If residents are more socially isolated, they may be less likely to hear about upcoming events. Finally, immigrants, especially ones who are undocumented, may be reluctant to use government services out of fear of deportation or general distrust of the government or emergency personnel.

Data source: 2008-2012 American Community Survey 5-year Estimates (ACS) data by census tract, defined as speaks English only or speaks English “very well”.

Attribute label: LEP

Low to no Income: A lack of financial resources impacts a household’s ability to prepare for a disaster event and to support friends and neighborhoods. For example, residents without televisions, computers, or data-driven mobile phones may face challenges getting news about hazards or recovery resources. Renters may have trouble finding and paying deposits for replacement housing if their residence is impacted by flooding. Homeowners may be less able to afford insurance that will cover flood damage. Having low or no income can create difficulty evacuating in a disaster event because of a higher reliance on public transportation. If unable to evacuate, residents may be more at risk without supplies to stay in their homes for an extended period of time. Low- and no-income residents can also be more vulnerable to hot weather if running air conditioning or fans puts utility costs out of reach.

Data source: 2008-2012 American Community Survey 5-year Estimates (ACS) data by census tract for low-to-no income populations. The data represents a calculated field that combines people who were 100% below the poverty level and those who were 100–149% of the poverty level.

Attribute label: Low_to_No

People with Disabilities: People with disabilities are among the most vulnerable in an emergency; they sustain disproportionate rates of illness, injury, and death in disaster events. People with disabilities can find it difficult to adequately prepare for a disaster event, including moving to a safer place. They are more likely to be left behind or abandoned during evacuations. Rescue and relief resources, like emergency transportation or shelters, for example, may not be universally accessible. Research has revealed a historic pattern of discrimination against people with disabilities in times of resource scarcity, like after a major storm and flood.

Data source: 2008-2012 American Community Survey 5-year Estimates (ACS) data by census tract for total civilian non-institutionalized population, including: hearing difficulty, vision difficulty, cognitive difficulty, ambulatory difficulty, self-care difficulty, and independent living difficulty.

Attribute label: TotDis

Medical Illness: Symptoms of existing medical illnesses are often exacerbated by hot temperatures. For example, heat can trigger asthma attacks or increase already high blood pressure due to the stress that high temperatures put on the body. Climate events can interrupt access to normal sources of healthcare and even life-sustaining medication. Special planning is required for people experiencing medical illness. For example, people dependent on dialysis will have different evacuation and care needs than other Boston residents in a climate event.

Data source: Medical illness is a proxy measure which is based on EASI data accessed through Simply Map. Health data at the local level in Massachusetts is not available beyond zip codes. EASI modeled the health statistics for the U.S. population based upon age, sex, and race probabilities using U.S. Census Bureau data. The probabilities are modeled against the census and current year and five year forecasts. Medical illness is the sum of asthma in children, asthma in adults, heart disease, emphysema, bronchitis, cancer, diabetes, kidney disease, and liver disease. A limitation is that these numbers may be over-counted as the result of people potentially having more than one medical illness. Therefore, the analysis may have greater numbers of people with medical illness within census tracts than actually present. Overall, the analysis was based on the relationship between social factors.

Attribute label: MedIllnes

Other attribute definitions:

GEOID10: Geographic identifier: State Code (25), County Code (025), 2010 Census Tract

AREA_SQFT: Tract area (in square feet)

AREA_ACRES: Tract area (in acres)

POP100_RE: Tract population count

HU100_RE: Tract housing unit count

Name: Boston Neighborhood
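
The GEOID10 attribute packs a state code, a county code, and a census tract code into one identifier. A minimal sketch of splitting it is shown below, assuming the standard 11-digit Census tract layout (2-digit state, 3-digit county, 6-digit tract). The function name and the sample identifier are hypothetical and not taken from the dataset.

```python
def parse_geoid10(geoid: str) -> dict[str, str]:
    """Split an 11-digit tract GEOID into state, county, and tract codes."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected an 11-digit GEOID, got {geoid!r}")
    return {
        "state": geoid[:2],    # e.g. 25 for Massachusetts
        "county": geoid[2:5],  # e.g. 025 within that state
        "tract": geoid[5:],    # 6-digit census tract code
    }


# Hypothetical identifier built from the state and county codes given above.
print(parse_geoid10("25025000100"))
```

Keeping the identifier as a string matters in general, because leading zeros in state or county codes would be lost if it were read as an integer.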
COVID-19 case rate per 100,000 population and percent test positivity in the...
datasets.ai
data.ct.gov
+1more
23, 40, 55, 8
Updated Sep 8, 2024
Cite
State of Connecticut (2024). COVID-19 case rate per 100,000 population and percent test positivity in the last 7 days by town - ARCHIVE [Dataset]. https://datasets.ai/datasets/covid-19-case-rate-per-100000-population-and-percent-test-positivity-in-the-last-7-days-by
Explore at:
23, 55, 40, 8Available download formats
Dataset updated
Sep 8, 2024
Dataset authored and provided by
State of Connecticut
Description
DPH note about change from 7-day to 14-day metrics: As of 10/15/2020, this dataset is no longer being updated. Starting on 10/15/2020, these metrics will be calculated using a 14-day average rather than a 7-day average. The new dataset using 14-day averages can be accessed here: https://data.ct.gov/Health-and-Human-Services/COVID-19-case-rate-per-100-000-population-and-perc/hree-nys2

As you know, we are learning more about COVID-19 all the time, including the best ways to measure COVID-19 activity in our communities. CT DPH has decided to shift to 14-day rates because these are more stable, particularly at the town level, as compared to 7-day rates. In addition, since the school indicators were initially published by DPH last summer, CDC has recommended 14-day rates and other states (e.g., Massachusetts) have started to implement 14-day metrics for monitoring COVID transmission as well.

With respect to geography, we also have learned that many people are looking at the town-level data to inform decision making, despite emphasis on the county-level metrics in the published addenda. This is understandable as there has been variation within counties in COVID-19 activity (for example, rates that are higher in one town than in most other towns in the county).

This dataset includes a weekly count and weekly rate per 100,000 population for COVID-19 cases, a weekly count of COVID-19 PCR diagnostic tests, and a weekly percent positivity rate for tests among people living in community settings. Dates are based on date of specimen collection (cases and positivity).

A person is considered a new case only upon their first COVID-19 testing result because a case is defined as an instance or bout of illness. If they are tested again subsequently and are still positive, it still counts toward the test positivity metric but they are not considered another case.

These case and test counts do not include cases or tests among people residing in congregate settings, such as nursing homes, assisted living facilities, or correctional facilities.

These data are updated weekly; the previous week period for each dataset is the previous Sunday-Saturday, known as an MMWR week (https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf). The date listed is the date the dataset was last updated and corresponds to a reporting period of the previous MMWR week. For instance, the data for 8/20/2020 corresponds to a reporting period of 8/9/2020-8/15/2020.
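
The relationship between an update date and its reporting period can be sketched as a small date calculation, assuming the reporting period is the most recent complete Sunday-Saturday week before the week containing the update date. The function name is hypothetical.

```python
from datetime import date, timedelta


def previous_mmwr_week(update_date: date) -> tuple[date, date]:
    """Return (Sunday, Saturday) of the week before the one containing update_date."""
    # date.weekday() is Monday=0 ... Sunday=6, so shift to count days since Sunday.
    days_since_sunday = (update_date.weekday() + 1) % 7
    current_week_start = update_date - timedelta(days=days_since_sunday)
    start = current_week_start - timedelta(days=7)
    end = current_week_start - timedelta(days=1)
    return start, end


start, end = previous_mmwr_week(date(2020, 8, 20))
print(start.isoformat(), end.isoformat())
```

For the update date of 8/20/2020 mentioned above, this yields the reporting period 8/9/2020 to 8/15/2020.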

Notes: 9/25/2020: Data for Mansfield and Middletown for the week of Sept 13-19 were unavailable at the time of reporting due to delays in lab reporting.
Hong Kong Social Contact Dynamics
kaggle.com
Updated Feb 5, 2023
Cite
The Devastator (2023). Hong Kong Social Contact Dynamics [Dataset]. https://www.kaggle.com/datasets/thedevastator/hong-kong-social-contact-dynamics
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 5, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Hong Kong
Description
Hong Kong Social Contact Dynamics

Understanding Age, Gender and Network Dynamics

By [source]

About this dataset

This dataset provides an in-depth look at the dynamics of social interaction in Hong Kong. It contains information on individuals, households, and contacts between individuals, including ages, genders, and the frequency and duration of contact. The data can be used to evaluate social and economic trends and behaviors at different levels. For example, it can be used to identify population-level patterns such as the age and gender mix of contacts, to investigate the structure of social networks, and to study the implications of contact patterns for health and economic outcomes. It also offers insight into different groups of people and the settings of their contacts (permanent residence, work, or leisure). Ultimately, this dataset supports a comprehensive understanding of social contact dynamics in Hong Kong's society today.

How to use the dataset

This dataset provides detailed information about the social contact dynamics in Hong Kong. With this dataset, it is possible to gain a comprehensive understanding of the patterns of various forms of social contact - from permanent residence and work contacts to leisure contacts. This guide will provide an overview and guidelines on how to use this dataset for analysis.

Exploring Trends and Dynamics:

To begin exploring the trends and dynamics of social contact in Hong Kong, start by looking at demographic factors such as age, gender, ethnicity, and educational attainment associated with different types of contacts (permanent residence/work/leisure). Consider the frequency and duration of contacts within these segments to identify any potential differences between them. Additionally, look at how these factors interact with each other – observe which segments have higher levels of interaction with each other or if there are any differences between different population groups based on their demographic characteristics. This can be done through visualizations such as line graphs or bar charts which can illustrate trends across timeframes or population demographics more clearly than raw numbers would alone.

Investigating Social Networks:

The data also allow investigation of social networks: understanding who connects with whom in real-life interactions and, if applicable, through digital channels. Focus first on individual or family networks rather than larger groups, to get a clearer picture without adding too much complexity to the analysis. Analyze commonalities among individuals within a network after controlling for factors that could affect interaction, such as age or gender, using clustering techniques if appropriate. Then compare networks between individuals or families using graph-theory measures such as degree (the number of links attached to one individual or family unit, whose average gives the mean number of relationships per person), path-length distributions, and centrality measures (identifying individuals who play an important role bridging two different parts of the network). These methods help reveal structural differences between large groups, rather than focusing only on small-scale personal connections among friends, colleagues or relatives, which may not always give an accurate picture because of their naturally limited scope.
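As a rough illustration of the graph measures mentioned above, the sketch below computes degree, mean degree and shortest-path lengths on a toy contact list. The contact pairs and all function names are hypothetical and are not taken from the dataset, whose actual column layout is not described here.

```python
from collections import deque

# Hypothetical contact records: each pair is one reported contact.
contacts = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_graph(edges):
    """Build an undirected adjacency map from contact pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees(graph):
    """Degree: the number of distinct contacts of each individual."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def shortest_path_lengths(graph, source):
    """Breadth-first search distances (in hops) from one individual."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


graph = build_graph(contacts)
deg = degrees(graph)
mean_degree = sum(deg.values()) / len(deg)
```

In this toy network, individual C has the highest degree and sits on every path between the A/B pair and the D/E pair, which is the kind of bridging role that centrality measures are meant to surface.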

Modeling Health Implications:

Finally, consider modeling the health implications of these observed patterns, particularly implications that simpler measures such as counts per contact hour (which do not differentiate by intensity) may not capture. Take into account aspects such as viral transmission risk by analyzing secondary effects of the contact events captured in the data, for example physical proximity when multiple people meet over multiple days.

Research Ideas

Analyzing the age, gender and contact dynamics of different areas within Hong Kong to understand the local population trends and behavior.

Investigating the structure of social networks to study how patterns of contact vary among socio economic backgro...
AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN
nationaldataplatform.org
Updated Feb 28, 2024
Cite
(2024). AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/airnow-air-quality-monitoring-data-current
Dataset updated
Feb 28, 2024
Description
This United States Environmental Protection Agency (US EPA) feature layer represents monitoring site data, updated hourly concentrations and Air Quality Index (AQI) values for the latest hour received from monitoring sites that report to AirNow.

Map and forecast data are collected using federal reference or equivalent monitoring techniques or techniques approved by the state, local or tribal monitoring agencies. To maintain "real-time" maps, the data are displayed after the end of each hour. Although preliminary data quality assessments are performed, the data in AirNow are not fully verified and validated through the quality assurance procedures monitoring organizations used to officially submit and certify data on the EPA Air Quality System (AQS).

This data sharing and centralization creates a one-stop source for real-time and forecast air quality data. The benefits include quality control, national reporting consistency, access to automated mapping methods, and data distribution to the public and other data systems. The U.S. Environmental Protection Agency, National Oceanic and Atmospheric Administration, National Park Service, tribal, state, and local agencies developed the AirNow system to provide the public with easy access to national air quality information. State and local agencies report the Air Quality Index (AQI) for cities across the US and parts of Canada and Mexico. AirNow data are used only to report the AQI, not to formulate or support regulation, guidance or any other EPA decision or position.

About the AQI

The Air Quality Index (AQI) is an index for reporting daily air quality. It tells you how clean or polluted your air is, and what associated health effects might be a concern for you. The AQI focuses on health effects you may experience within a few hours or days after breathing polluted air.
EPA calculates the AQI for five major air pollutants regulated by the Clean Air Act: ground-level ozone, particle pollution (also known as particulate matter), carbon monoxide, sulfur dioxide, and nitrogen dioxide. For each of these pollutants, EPA has established national air quality standards to protect public health. Ground-level ozone and airborne particles (often referred to as "particulate matter") are the two pollutants that pose the greatest threat to human health in this country.

A number of factors influence ozone formation, including emissions from cars, trucks, buses, power plants, and industries, along with weather conditions. Weather is especially favorable for ozone formation when it’s hot, dry and sunny, and winds are calm and light. Federal and state regulations, including regulations for power plants, vehicles and fuels, are helping reduce ozone pollution nationwide.

Fine particle pollution (or "particulate matter") can be emitted directly from cars, trucks, buses, power plants and industries, along with wildfires and woodstoves. But it also forms from chemical reactions of other pollutants in the air. Particle pollution can be high at different times of year, depending on where you live. In some areas, for example, colder winters can lead to increased particle pollution emissions from woodstove use, and stagnant weather conditions with calm and light winds can trap PM2.5 pollution near emission sources. Federal and state rules are helping reduce fine particle pollution, including clean diesel rules for vehicles and fuels, and rules to reduce pollution from power plants, industries, locomotives, and marine vessels, among others.

How Does the AQI Work?

Think of the AQI as a yardstick that runs from 0 to 500. The higher the AQI value, the greater the level of air pollution and the greater the health concern.
For example, an AQI value of 50 represents good air quality with little potential to affect public health, while an AQI value over 300 represents hazardous air quality.

An AQI value of 100 generally corresponds to the national air quality standard for the pollutant, which is the level EPA has set to protect public health. AQI values below 100 are generally thought of as satisfactory. When AQI values are above 100, air quality is considered to be unhealthy: at first for certain sensitive groups of people, then for everyone as AQI values get higher.

Understanding the AQI

The purpose of the AQI is to help you understand what local air quality means to your health. To make it easier to understand, the AQI is divided into six categories:

| Air Quality Index (AQI) Values | Levels of Health Concern | Colors |
| When the AQI is in this range: | ...air quality conditions are: | ...as symbolized by this color: |
|---|---|---|
| 0 to 50 | Good | Green |
| 51 to 100 | Moderate | Yellow |
| 101 to 150 | Unhealthy for Sensitive Groups | Orange |
| 151 to 200 | Unhealthy | Red |
| 201 to 300 | Very Unhealthy | Purple |
| 301 to 500 | Hazardous | Maroon |

Note: Values above 500 are considered Beyond the AQI. Follow recommendations for the Hazardous category. Additional information on reducing exposure to extremely high levels of particle pollution is available here.

Each category corresponds to a different level of health concern. The six levels of health concern and what they mean are:

"Good" AQI is 0 to 50. Air quality is considered satisfactory, and air pollution poses little or no risk.

"Moderate" AQI is 51 to 100. Air quality is acceptable; however, for some pollutants there may be a moderate health concern for a very small number of people. For example, people who are unusually sensitive to ozone may experience respiratory symptoms.

"Unhealthy for Sensitive Groups" AQI is 101 to 150. Although the general public is not likely to be affected at this AQI range, people with lung disease, older adults and children are at a greater risk from exposure to ozone, whereas persons with heart and lung disease, older adults and children are at greater risk from the presence of particles in the air.

"Unhealthy" AQI is 151 to 200. Everyone may begin to experience some adverse health effects, and members of the sensitive groups may experience more serious effects.

"Very Unhealthy" AQI is 201 to 300. This would trigger a health alert signifying that everyone may experience more serious health effects.

"Hazardous" AQI is greater than 300. This would trigger health warnings of emergency conditions. The entire population is more likely to be affected.

AQI colors

EPA has assigned a specific color to each AQI category to make it easier for people to understand quickly whether air pollution is reaching unhealthy levels in their communities. For example, the color orange means that conditions are "unhealthy for sensitive groups," while red means that conditions may be "unhealthy for everyone," and so on.

| Air Quality Index Levels of Health Concern | Numerical Value | Meaning |
|---|---|---|
| Good | 0 to 50 | Air quality is considered satisfactory, and air pollution poses little or no risk. |
| Moderate | 51 to 100 | Air quality is acceptable; however, for some pollutants there may be a moderate health concern for a very small number of people who are unusually sensitive to air pollution. |
| Unhealthy for Sensitive Groups | 101 to 150 | Members of sensitive groups may experience health effects. The general public is not likely to be affected. |
| Unhealthy | 151 to 200 | Everyone may begin to experience health effects; members of sensitive groups may experience more serious health effects. |
| Very Unhealthy | 201 to 300 | Health alert: everyone may experience more serious health effects. |
| Hazardous | 301 to 500 | Health warnings of emergency conditions. The entire population is more likely to be affected. |

Note: Values above 500 are considered Beyond the AQI. Follow recommendations for the "Hazardous" category. Additional information on reducing exposure to extremely high levels of particle pollution is available here.
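The category bands listed above lend themselves to a simple lookup. The following is a minimal sketch, assuming whole-number AQI values; the function name `aqi_category` is hypothetical and is not part of any AirNow API. The description does not assign a color to values above 500, so the sketch returns `None` for the color in that case.

```python
# AQI bands as listed in the description: (upper bound, category, color).
AQI_BANDS = [
    (50, "Good", "Green"),
    (100, "Moderate", "Yellow"),
    (150, "Unhealthy for Sensitive Groups", "Orange"),
    (200, "Unhealthy", "Red"),
    (300, "Very Unhealthy", "Purple"),
    (500, "Hazardous", "Maroon"),
]


def aqi_category(aqi):
    """Map a whole-number AQI value to its (category, color) pair."""
    if aqi < 0:
        raise ValueError("AQI values start at 0")
    for upper, category, color in AQI_BANDS:
        if aqi <= upper:
            return category, color
    # Above 500: "Beyond the AQI"; no color is given in the description.
    return "Beyond the AQI", None
```

For values above 500 the description says to follow the recommendations for the Hazardous category, so a caller may want to treat the two the same way.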
US county-level mortality
kaggle.com
Updated Nov 17, 2019
Cite
IHME (2019). US county-level mortality [Dataset]. https://www.kaggle.com/IHME/us-countylevel-mortality/data
Dataset updated
Nov 17, 2019
Dataset provided by
Kaggle
Authors
IHME
Area covered
United States
Description
Context

IHME United States Mortality Rates by County 1980-2014: National - All. (Deaths per 100,000 population)

To quickly get started creating maps, like the one below, see the Quick Start R kernel.

[Image: Neoplasms map, https://storage.googleapis.com/montco-stats/kaggleNeoplasms.png]

How the Dataset was Created

This Dataset was created from the Excel Spreadsheet, which can be found in the download. Or, you can view the source here. If you take a look at the row for United States, for the column Mortality Rate, 1980*, you'll see the set of numbers 1.52 (1.44, 1.61). Numbers in parentheses are the 95% uncertainty interval. The 1.52 is an age-standardized mortality rate for both sexes combined (deaths per 100,000 population).

In this Dataset 1.44 is placed in the column named Mortality Rate, 1980 (Min)* and 1.61 in the column named Mortality Rate, 1980 (Max)*. For information on how these age-standardized mortality rates were calculated, see the December 2016 JAMA article, which you can download for free.
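The split of a spreadsheet cell such as 1.52 (1.44, 1.61) into a rate, a minimum and a maximum can be sketched as follows. This is an illustration only, not the script used to build the Dataset; the function name and the dictionary keys are hypothetical.

```python
import re

# Matches cells of the form "1.52 (1.44, 1.61)": rate, then (lower, upper).
CELL_PATTERN = re.compile(
    r"^\s*([\d.]+)\s*\(\s*([\d.]+)\s*,\s*([\d.]+)\s*\)\s*$"
)


def split_rate_cell(cell):
    """Split a 'rate (min, max)' cell into three floats."""
    match = CELL_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    rate, low, high = (float(group) for group in match.groups())
    return {"rate": rate, "min": low, "max": high}


parsed = split_rate_cell("1.52 (1.44, 1.61)")
```

Here `parsed["min"]` and `parsed["max"]` correspond to the values stored in the (Min)* and (Max)* columns described above.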

[Image: Spreadsheet, https://storage.googleapis.com/montco-stats/kaggleUSMort.png]

Reference

JAMA Full Article

Video Describing this Study (Short and this is worth viewing)

Data Resources

How Americans Die May Depend On Where They Live, by Anna Maria Barry-Jester (FiveThirtyEight)

Interactive Map from healthdata.org

IHME Data

Acknowledgements

This Dataset was provided by IHME

Institute for Health Metrics and Evaluation 2301 Fifth Ave., Suite 600, Seattle, WA 98121, USA Tel: +1.206.897.2800 Fax: +1.206.897.2899 © 2016 University of Washington
Levels of obesity and inactivity related illnesses (physical illnesses):...
arc-gis-hub-home-arcgishub.hub.arcgis.com
data.catchmentbasedapproach.org
+1more
Updated Apr 7, 2021
Cite
The Rivers Trust (2021). Levels of obesity and inactivity related illnesses (physical illnesses): Summary (England) [Dataset]. https://arc-gis-hub-home-arcgishub.hub.arcgis.com/maps/theriverstrust::levels-of-obesity-and-inactivity-related-illnesses-physical-illnesses-summary-england
Dataset updated
Apr 7, 2021
Dataset authored and provided by
The Rivers Trust
Area covered

Description
SUMMARY

This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of physical illnesses that are linked with obesity and inactivity. Please read the below information to gain a full understanding of what the data shows and how it should be interpreted.

ANALYSIS METHODOLOGY

The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to:
- Asthma (in persons of all ages)
- Cancer (in persons of all ages)
- Chronic kidney disease (in adults aged 18+)
- Coronary heart disease (in persons of all ages)
- Diabetes mellitus (in persons aged 17+)
- Hypertension (in persons of all ages)
- Stroke and transient ischaemic attack (in persons of all ages)

This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.

For each of the above illnesses, the percentage of each MSOA’s population with that illness was estimated.
This was achieved by calculating a weighted average based on:
- The percentage of the MSOA area that was covered by each GP practice’s catchment area
- Of the GPs that covered part of that MSOA: the percentage of patients registered with each GP that have that illness

The estimated percentage of each MSOA’s population with each illness was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with each illness, within the relevant age range.

For each illness, each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:
A) the PERCENTAGE of the population within that MSOA who are estimated to have that illness
B) the NUMBER of people within that MSOA who are estimated to have that illness

An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer to 1 the score, the greater both the number and percentage of the population in the MSOA predicted to have that illness, compared to other MSOAs. In other words, those are areas where a large number of people are predicted to suffer from an illness, and where those people make up a large percentage of the population, indicating there is a real issue with that illness within the population and the investment of resources to address that issue could have the greatest benefits.

The scores for each of the 7 illnesses were added together then converted to a relative score between 1 and 0 (1 = worst, 0 = best), to give an overall score for each MSOA: a score close to 1 would indicate that an area has high predicted levels of all obesity/inactivity-related illnesses, and these are areas where the local population could benefit the most from interventions to address those illnesses. A score close to 0 would indicate very low predicted levels of obesity/inactivity-related illnesses and therefore interventions might not be required.

LIMITATIONS

1. GPs do not have catchments that are mutually exclusive from each other: they overlap, with some geographic areas being covered by 30+ practices. This dataset should be viewed in combination with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset to identify where there are areas that are covered by multiple GP practices but at least one of those GP practices did not provide data. Results of the analysis in these areas should be interpreted with caution, particularly if the levels of obesity/inactivity-related illnesses appear to be significantly lower than the immediate surrounding areas.

2. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).

3. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness. By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.

4. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of obesity/inactivity-related illnesses, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of these illnesses.

TO BE VIEWED IN COMBINATION WITH:

This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:
- Health and wellbeing statistics (GP-level, England): Missing data and potential outliers

DOWNLOADING THIS DATA

To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.

DATA SOURCES

This dataset was produced using:
- Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
- GP Catchment Outlines. Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital. Data was cleaned by Ribble Rivers Trust before use.

COPYRIGHT NOTICE

The reproduction of this data must be accompanied by the following statement:
© Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.

CaBA HEALTH & WELLBEING EVIDENCE BASE

This dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
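The weighted-average and scoring steps in the analysis methodology can be sketched roughly as below. Two points are assumptions rather than details confirmed by the description: the weighted average is normalised by total coverage (catchments overlap, so coverage shares need not sum to 1), and the "relative score" is taken to be min-max scaling. All function names and the sample numbers are hypothetical.

```python
def weighted_prevalence(coverage_and_rates):
    """Estimate an MSOA's illness percentage from overlapping GP catchments.

    coverage_and_rates: list of (share of MSOA area covered by a practice,
    percentage of that practice's patients with the illness).
    Assumption: weights are normalised by total coverage.
    """
    total_cover = sum(cover for cover, _ in coverage_and_rates)
    if total_cover == 0:
        raise ValueError("MSOA is not covered by any practice with data")
    weighted = sum(cover * rate for cover, rate in coverage_and_rates)
    return weighted / total_cover


def min_max(values):
    """Rescale values to the range 0 (best) to 1 (worst). Assumed method."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def illness_scores(percentages, counts):
    """Score A (percentage) and score B (number), averaged, then rescaled."""
    score_a = min_max(percentages)
    score_b = min_max(counts)
    averaged = [(a + b) / 2 for a, b in zip(score_a, score_b)]
    return min_max(averaged)


# Three hypothetical MSOAs: estimated percentage and number of people affected.
scores = illness_scores([4.0, 6.0, 8.0], [200, 600, 400])
```

The overall score across the 7 illnesses would follow the same pattern: sum the per-illness scores for each MSOA, then rescale the sums once more.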
CT School Learning Model Indicators by County (7-day metrics) - ARCHIVE
s.cnmilf.com
data.ct.gov
+1more
Updated Aug 12, 2023
Cite
data.ct.gov (2023). CT School Learning Model Indicators by County (7-day metrics) - ARCHIVE [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/ct-school-learning-model-indicators-by-county-7-day-metrics-archive
Dataset updated
Aug 12, 2023
Dataset provided by
data.ct.gov
Area covered
Connecticut
Description
DPH note about change from 7-day to 14-day metrics: As of 10/15/2020, this dataset is no longer being updated. Starting on 10/15/2020, the school learning model indicator metrics will be calculated using a 14-day average rather than a 7-day average. The new school learning model indicators dataset using 14-day averages can be accessed here: https://data.ct.gov/Health-and-Human-Services/CT-School-Learning-Model-Indicators-by-County-14-d/e4bh-ax24

As you know, we are learning more about COVID-19 all the time, including the best ways to measure COVID-19 activity in our communities. CT DPH has decided to shift to 14-day rates because these are more stable, particularly at the town level, as compared to 7-day rates. In addition, since the school indicators were initially published by DPH last summer, CDC has recommended 14-day rates and other states (e.g., Massachusetts) have started to implement 14-day metrics for monitoring COVID transmission as well.

With respect to geography, we also have learned that many people are looking at the town-level data to inform decision making, despite emphasis on the county-level metrics in the published addenda. This is understandable as there has been variation within counties in COVID-19 activity (for example, rates that are higher in one town than in most other towns in the county).

This dataset includes the leading and secondary metrics identified by the Connecticut Department of Health (DPH) and the Department of Education (CSDE) to support local district decision-making on the level of in-person, hybrid (blended), and remote learning model for Pre K-12 education. Data represent daily averages for each week by date of specimen collection (cases and positivity), date of hospital admission, or date of ED visit. Hospitalization data come from the Connecticut Hospital Association and are based on hospital location, not county of patient residence. COVID-19-like illness includes fever and cough or shortness of breath or difficulty breathing or the presence of coronavirus diagnosis code and excludes patients with influenza-like illness. All data are preliminary.

These data are updated weekly; the previous week period for each dataset is the previous Sunday-Saturday, known as an MMWR week (https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf). The date listed is the date the dataset was last updated and corresponds to a reporting period of the previous MMWR week. For instance, the data for 8/20/2020 corresponds to a reporting period of 8/9/2020-8/15/2020.

These metrics were adapted from recommendations by the Harvard Global Institute and supplemented by existing DPH measures. For national data on COVID-19, see COVID View, the national weekly surveillance summary of U.S. COVID-19 activity, at https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html

Notes:
9/25/2020: Data for Mansfield and Middletown for the week of Sept 13-19 were unavailable at the time of reporting due to delays in lab reporting.
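The mapping from a listed update date to its reporting period (the previous Sunday-Saturday week) can be sketched with the standard library. The function name is hypothetical, and the handling of an update date that itself falls on a Saturday is an assumption: the sketch treats that week as not yet complete.

```python
from datetime import date, timedelta


def previous_mmwr_week(update_date):
    """Return (Sunday, Saturday) of the last complete week before update_date."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    days_since_saturday = (update_date.weekday() - 5) % 7
    if days_since_saturday == 0:
        # Assumption: on a Saturday, the current week is still in progress.
        days_since_saturday = 7
    saturday = update_date - timedelta(days=days_since_saturday)
    sunday = saturday - timedelta(days=6)
    return sunday, saturday


period = previous_mmwr_week(date(2020, 8, 20))
```

For the 8/20/2020 example in the description, `period` runs from 8/9/2020 to 8/15/2020.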

College Completion Dataset

Graduation Rates, Race, Efficiency Measures and More

By Jonathan Ortiz [source]

About this dataset

This College Completion dataset offers insight into the success and progress of college students in the United States. It contains graduation rates, race and other data to give a broad view of college completion in America. The data come from two primary sources: the Integrated Postsecondary Education Data System (IPEDS) of the National Center for Education Statistics (NCES), and the Student Success and Progress rate of the Voluntary System of Accountability.

The graduation figures come from IPEDS and cover first-time, full-time, degree-seeking undergraduates who entered college six years earlier at four-year institutions or three years earlier at two-year institutions. Colleges also report how many students completed their program within 100 percent and 150 percent of normal time, which at four-year institutions corresponds to graduation within four years and six years respectively. Students reported as being of two or more races are included in totals but not shown separately.

For race and ethnicity data, NCES has classified student demographics since 2009 into seven categories: White non-Hispanic; Black non-Hispanic; American Indian/Alaskan Native; Asian/Pacific Islander; unknown race or ethnicity; and nonresident, with two new categories: Native Hawaiian or Other Pacific Islander (combined here with Asian) and students belonging to two or more races. Note also that classifications of graduation data from 2008 may differ because of variations in the time frame examined and the groupings used by particular colleges; students who cannot be identified in National Student Clearinghouse records do not count against these institutions.

For efficiency measures, parameters such as “Awards per 100 Full-Time Undergraduate Students” are useful; this figure includes all undergraduate completions reported by an institution, including associate degrees and certificates from programs of less than four years. Other measures considered include expenditure categories, Pell grant percentage, endowment values, average student aid amounts, and full-time faculty members contributing to instruction, research and public service.
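As a small worked example of the efficiency measure named above, the ratio can be computed as follows. This is a sketch of the arithmetic only; the function and parameter names are hypothetical, the sample figures are invented, and the dataset's own column names are not shown in this description.

```python
def awards_per_100(awards, full_time_undergrads):
    """Undergraduate completions per 100 full-time undergraduate students."""
    if full_time_undergrads <= 0:
        raise ValueError("full-time undergraduate count must be positive")
    return 100 * awards / full_time_undergrads


# Hypothetical institution: 450 completions, 2,000 full-time undergraduates.
example = awards_per_100(450, 2000)
```

With these invented figures the institution would report 22.5 awards per 100 full-time undergraduates.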

When quantifying outcomes, the median estimated SAT score metric is useful; it is derived from either the 25th-percentile or the 75th-percentile scores and, according to the description, is further qualified by a 90% threshold among incoming students. Finally, average student aid is described as the amount granted by the institution divided by the total received for that year.

All this analysis gives an opportunity to get a holistic overview of performance, potential deficits &

How to use the dataset

This dataset contains data on student success, graduation rates, race and gender demographics, an efficiency measure to compare colleges across states and more. It is a great source of information to help you better understand college completion and student success in the United States.

In this guide we’ll explain how to use the data so that you can find out the best colleges for students with certain characteristics or focus on your target completion rate. We’ll also provide some useful tips for getting the most out of this dataset when seeking guidance on which institutions offer the highest graduation rates or have a good reputation for success in terms of completing programs within normal timeframes.

Before interpreting this dataset, note that each row represents a particular institution, with fields such as its state, level (two-year vs. four-year), control (public vs. private), name and website. The columns contain information such as the rate of awarding degrees compared with other institutions in its sector; race/ethnicity makeup; full-time faculty percentage; median SAT score among first-time students; awards and grants compared with the national or state average, as applicable to the institution's location; and more.

When using this dataset, our suggestion is that you begin by forming a hypothesis or research question concerning student completion at a given school based upon observable characteristics like financ...
