Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
a The percentages for each city were computed from [58] using the country's percentage of children under 20 years old. Taiwan's percentage was obtained from [59].
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a hybrid gridded dataset of demographic data for the world, given as 5-year population bands at a 0.5 degree grid resolution.
This dataset combines the NASA SEDAC Gridded Population of the World version 4 (GPWv4) with the ISIMIP Histsoc gridded population data and the United Nations World Population Program (WPP) demographic modelling data.
Demographic fractions are given for the time period covered by the UN WPP model (1950-2050), while demographic totals are given for the time period covered by the combination of GPWv4 and Histsoc (1950-2020).
Method - demographic fractions
Demographic breakdown of country population by grid cell is calculated by combining the GPWv4 demographic data given for 2010 with the yearly country breakdowns from the UN WPP. This combines the spatial distribution of demographics from GPWv4 with the temporal trends from the UN WPP. This makes it possible to calculate exposure trends from 1980 to the present day.
To combine the UN WPP demographics with the GPWv4 demographics, we calculate for each country the proportional change in fraction of demographic in each age band relative to 2010 as:
\(\delta_{year,\ country,age}^{\text{wpp}} = f_{year,\ country,age}^{\text{wpp}}/f_{2010,country,age}^{\text{wpp}}\)
Where:
- \(\delta_{year,\ country,age}^{\text{wpp}}\) is the ratio of change in demographic fraction for a given age band and country from the UN WPP dataset, relative to 2010.
- \(f_{year,\ country,age}^{\text{wpp}}\) is the fraction of population in the UN WPP dataset for a given age band, country, and year.
- \(f_{2010,country,age}^{\text{wpp}}\) is the fraction of population in the UN WPP dataset for a given age band and country for the year 2010.
The gridded demographic fraction is then calculated relative to the 2010 demographic data given by GPWv4.
For each grid cell c in the subset of cells corresponding to a given country, the fraction of population in a given age band is calculated as:
\(f_{year,c,age}^{\text{gpw}} = \delta_{year,\ country,age}^{\text{wpp}}*f_{2010,c,\text{age}}^{\text{gpw}}\)
Where:
- \(f_{year,c,age}^{\text{gpw}}\) is the fraction of the population in a given age band for given year, for the grid cell c.
- \(f_{2010,c,age}^{\text{gpw}}\) is the fraction of the population in a given age band for 2010, for the grid cell c.
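The two formulas above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the dataset authors' code: the function names, the dictionary layout, and the sample numbers are all hypothetical.

```python
def wpp_change_ratio(wpp_fractions, year, base_year=2010):
    """Return delta[age] = f_wpp[year][age] / f_wpp[base_year][age] for one country."""
    base = wpp_fractions[base_year]
    return {age: wpp_fractions[year][age] / base[age] for age in base}


def scale_cell_fractions(gpw_fractions_2010, delta):
    """Scale one grid cell's 2010 GPWv4 age-band fractions by the country-level ratio."""
    return {age: delta[age] * frac for age, frac in gpw_fractions_2010.items()}


# Hypothetical UN WPP age-band fractions for a single country.
wpp = {
    2010: {"0-4": 0.10, "5-9": 0.09},
    2020: {"0-4": 0.08, "5-9": 0.09},
}
# Hypothetical GPWv4 age-band fractions for one grid cell of that country in 2010.
cell_2010 = {"0-4": 0.12, "5-9": 0.10}

delta_2020 = wpp_change_ratio(wpp, 2020)
cell_2020 = scale_cell_fractions(cell_2010, delta_2020)
```

As sketched here, the scaling does not renormalise the age-band fractions within a cell, so the scaled fractions of a cell are not guaranteed to sum to 1.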
The matching between grid cells and country codes is performed using the GPWv4 gridded country code lookup data and country name lookup table. The final dataset is assembled by combining the cells from all countries into a single gridded time series. This time series covers the whole period from 1950-2050, corresponding to the data available in the UN WPP model.
Method - demographic totals
Total population data from 1950 to 1999 is drawn from ISIMIP Histsoc, while data from 2000-2020 is drawn from GPWv4. These two gridded time series are simply joined at the cut-over date to give a single dataset covering 1950-2020.
The total population per age band per cell is calculated by multiplying the population fractions by the population totals per grid cell.
Note that as the total population data only covers until 2020, the time span covered by the demographic population totals data is 1950-2020 (not 1950-2050).
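The join and the multiplication described above might look like the following for a single grid cell. This is only a sketch: the function names, the per-year dictionaries, and the sample values are hypothetical, and the cut-over is assumed to fall at the start of 2000 as stated in the text.

```python
def join_totals(histsoc, gpw, cutover_year=2000):
    """Use Histsoc totals before the cut-over year and GPWv4 totals from it onward."""
    joined = {year: value for year, value in histsoc.items() if year < cutover_year}
    joined.update({year: value for year, value in gpw.items() if year >= cutover_year})
    return joined


def population_by_age(fractions, total):
    """Multiply each age-band fraction by the total population of the cell."""
    return {age: frac * total for age, frac in fractions.items()}


# Hypothetical total population of one grid cell, by year, from each source.
histsoc = {1998: 1000.0, 1999: 1010.0, 2000: 1015.0}
gpw = {2000: 1020.0, 2001: 1030.0}

totals = join_totals(histsoc, gpw)
bands_2001 = population_by_age({"0-4": 0.10, "5-9": 0.09}, totals[2001])
```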
Disclaimer
This dataset is a hybrid of different datasets with independent methodologies. No guarantees are made about the spatial or temporal consistency across dataset boundaries. The dataset may contain outlier points (e.g. single cells with demographic fractions >1). This dataset is produced on a 'best effort' basis and has been found to be broadly consistent with other approaches, but may contain inconsistencies which have not been identified.
Notation and descriptions of total population sizes.
https://www.ontario.ca/page/open-government-licence-ontario
Data includes: board and school information, grade 3 and 6 EQAO student achievements for reading, writing and mathematics, and grade 9 mathematics EQAO and OSSLT. Data excludes private schools, Education and Community Partnership Programs (ECPP), summer, night and continuing education schools.
How Are We Protecting Privacy?
Results for OnSIS and Statistics Canada variables are suppressed based on school population size to better protect student privacy. In order to achieve this additional level of protection, the Ministry has used a methodology that randomly rounds a percentage either up or down depending on school enrolment. In order to protect privacy, the ministry does not publicly report on data when there are fewer than 10 individuals represented.
The information in the School Information Finder is the most current available to the Ministry of Education at this time, as reported by schools, school boards, EQAO and Statistics Canada. The information is updated as frequently as possible.
This information is also available on the Ministry of Education's School Information Finder website by individual school.
Descriptions for some of the data types can be found in our glossary.
School/school board and school authority contact information are updated and maintained by school boards and may not be the most current version. For the most recent information please visit: https://data.ontario.ca/dataset/ontario-public-school-contact-information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains student achievement data for two Portuguese high schools. The data was collected using school reports and questionnaires, and includes student grades, demographics, social, parent, and school-related features.
Two datasets are provided regarding performance in two distinct subjects: Mathematics and Portuguese language. I have cleaned the original datasets so that they are easier to read and use.
Important note: the target attribute final_grade has a strong correlation with attributes grade_2 and grade_1. This occurs because final_grade is the final year grade (issued at the 3rd period), while grade_1 and grade_2 correspond to the 1st and 2nd period grades. It is more difficult to predict final_grade without grade_2 and grade_1, but these predictions will be much more useful.
Additional note: there are 382 students who belong to both datasets, though the IDs do not match. These students can be identified by searching for identical attributes that characterize each student.
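One way to search for students present in both datasets is to match rows on a tuple of identifying attributes. The sketch below is illustrative only: the column names `school`, `sex`, and `age` are hypothetical stand-ins for whatever attributes characterize a student in the cleaned files, and a match is kept only when it is unambiguous.

```python
def match_students(math_rows, portuguese_rows, key_columns):
    """Pair rows whose values agree on every column in key_columns."""
    index = {}
    for row in portuguese_rows:
        key = tuple(row[col] for col in key_columns)
        index.setdefault(key, []).append(row)

    pairs = []
    for row in math_rows:
        key = tuple(row[col] for col in key_columns)
        candidates = index.get(key, [])
        if len(candidates) == 1:  # skip missing or ambiguous matches
            pairs.append((row, candidates[0]))
    return pairs


# Hypothetical rows; real files contain many more attributes.
math_rows = [
    {"school": "GP", "sex": "F", "age": 18, "final_grade": 6},
    {"school": "GP", "sex": "M", "age": 17, "final_grade": 10},
]
portuguese_rows = [
    {"school": "GP", "sex": "F", "age": 18, "final_grade": 11},
    {"school": "MS", "sex": "M", "age": 17, "final_grade": 12},
]

pairs = match_students(math_rows, portuguese_rows, ["school", "sex", "age"])
```

Using more attributes in `key_columns` reduces the chance of pairing two different students who happen to share a few values.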
Please include this citation if you plan to use this database: P. Cortez and A. Silva. Using Data Mining to Predict Secondary School Student Performance. In A. Brito and J. Teixeira Eds., Proceedings of 5th FUture BUsiness TEChnology Conference (FUBUTEC 2008) pp. 5-12, Porto, Portugal, April, 2008, EUROSIS, ISBN 978-9077381-39-7.
This large, international dataset contains survey responses from N = 12,570 students from 100 universities in 35 countries, collected in 21 languages. We measured anxieties (statistics, mathematics, test, trait, social interaction, performance, creativity, intolerance of uncertainty, and fear of negative evaluation), self-efficacy, persistence, and the cognitive reflection test, and collected demographics, previous mathematics grades, self-reported and official statistics grades, and statistics module details. Data reuse potential is broad, including testing links between anxieties and statistics/mathematics education factors, and examining instruments’ psychometric properties across different languages and contexts. Note that the pre-registration can be found here: https://osf.io/xs5wf
TEDS-M examined how different countries prepare their teachers to teach mathematics in primary and lower-secondary schools. The study gathered information on various characteristics of teacher education institutions, programs, and curricula. It also collected information on the opportunities to learn within these contexts, and on future teachers’ knowledge and beliefs about mathematics and learning mathematics.
TEDS-M
Educational measurements and tests
Target population: Teachers of Mathematics. TEDS-M surveyed teacher education institutions, educators of future teachers, and future teachers (primary and secondary levels).
Stratified two-stage cluster sample design
A significant challenge in the field of biomedicine is the development of methods to integrate the multitude of dispersed data sets into comprehensive frameworks to be used to generate optimal clinical decisions. Recent technological advances in single cell analysis allow for high-dimensional molecular characterization of cells and populations, but to date, few mathematical models have attempted to integrate measurements from the single cell scale with other data types. Here, we present a framework that actionizes static outputs from a machine learning model and leverages these as measurements of state variables in a dynamic mechanistic model of treatment response. We apply this framework to breast cancer cells to integrate single cell transcriptomic data with longitudinal population-size data. We demonstrate that the explicit inclusion of the transcriptomic information in the parameter estimation is critical for identification of the model parameters and enables accurate prediction of new treatment regimens. Inclusion of the transcriptomic data improves predictive accuracy in new treatment response dynamics with a concordance correlation coefficient (CCC) of 0.89 compared to a prediction accuracy of CCC = 0.79 without integration of the single cell RNA sequencing (scRNA-seq) data directly into the model calibration. To the best of our knowledge, this is the first work that explicitly integrates single cell clonally-resolved transcriptome datasets with longitudinal treatment response data into a mechanistic mathematical model of drug resistance dynamics. We anticipate this approach to be a first step that demonstrates the feasibility of incorporating multimodal data sets into identifiable mathematical models to develop optimized treatment regimens from data.
Single cell RNA-seq of MDA-MB-231 cell line with chemotherapy treatment
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a hybrid gridded dataset of demographic data for China from 1979 to 2100, given as 21 five-year age groups of population divided by gender every year at a 0.5-degree grid resolution.
The historical period (1979-2020) part of this dataset combines the NASA SEDAC Gridded Population of the World version 4 (GPWv4, UN WPP-Adjusted Population Count) with gridded population from the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP, Histsoc gridded population data).
The projection (2010-2100) part of this dataset is resampled directly from Chen et al.’s data published in Scientific Data.
This dataset includes 31 provincial administrative districts of China, including 22 provinces, 5 autonomous regions, and 4 municipalities directly under the control of the central government (Taiwan, Hong Kong, and Macao were excluded due to missing data).
Method - demographic fractions by age and gender in 1979-2020
Age- and gender-specific demographic data by grid cell for each province in China are derived by combining historical demographic data in 1979-2020 with the national population census data provided by the National Statistics Bureau of China.
To combine the national population census data with the historical demographics, we constructed the provincial demographic fractions for each age group and gender from the fourth, fifth and sixth national population censuses, which are applied to the years 1979-1990, 1991-2000 and 2001-2020, respectively. The provincial fractions are computed as:
\(f_{year,province,age,gender} = \begin{cases} POP_{1990,province,age,gender}^{4^{th}census}/POP_{1990,province}^{4^{th}census}, & 1979\le\mathrm{year}\le1990\\ POP_{2000,province,age,gender}^{5^{th}census}/POP_{2000,province}^{5^{th}census}, & 1991\le\mathrm{year}\le2000\\ POP_{2010,province,age,gender}^{6^{th}census}/POP_{2010,province}^{6^{th}census}, & 2001\le\mathrm{year}\le2020 \end{cases}\)
Where:
- \(f_{year,province,age,gender}\) is the fraction of the population for a given age and gender in each province, taken from the national census that applies to the given year (1979-2020).
- \(POP_{year,province,age,gender}^{X^{th}census}\) is the total population for a given age and gender in each province from the Xth national census.
- \(POP_{year,province}^{X^{th}census}\) is the total population for all ages and both genders in each province from the Xth national census.
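The piecewise definition above amounts to choosing a census year for each data year and then dividing two census counts. A minimal Python sketch follows; the function names, the tuple-keyed dictionaries, and the sample counts are hypothetical and serve only to illustrate the lookup.

```python
def census_year_for(year):
    """Map a data year to the census year whose fractions apply to it."""
    if 1979 <= year <= 1990:
        return 1990  # fourth census
    if 1991 <= year <= 2000:
        return 2000  # fifth census
    if 2001 <= year <= 2020:
        return 2010  # sixth census
    raise ValueError(f"year {year} is outside 1979-2020")


def provincial_fraction(census_pop, census_total, year, province, age, gender):
    """Fraction of a province's population in one age and gender group."""
    census_year = census_year_for(year)
    group = census_pop[(census_year, province, age, gender)]
    return group / census_total[(census_year, province)]


# Hypothetical census counts for a province labelled "A".
census_pop = {
    (1990, "A", "0-4", "female"): 50.0,
    (2010, "A", "0-4", "female"): 30.0,
}
census_total = {(1990, "A"): 1000.0, (2010, "A"): 1200.0}

f_1985 = provincial_fraction(census_pop, census_total, 1985, "A", "0-4", "female")
f_2015 = provincial_fraction(census_pop, census_total, 2015, "A", "0-4", "female")
```

Because the fractions are held constant within each census period, they change in steps at 1991 and 2001 rather than smoothly.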
Method - demographic totals by age and gender in 1979-2020
The yearly gridded population for 1979-1999 is taken from the ISIMIP Histsoc gridded population data, and for 2000-2020 from the GPWv4 demographic data adjusted by the UN WPP (UN WPP-Adjusted Population Count, v4.11, https://beta.sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-adjusted-to-2015-unwpp-country-totals-rev11), which combines the spatial distribution of demographics from GPWv4 with the temporal trends from the UN WPP to improve accuracy. These two gridded time series are simply joined at the cut-over date to give a single dataset of historical demographic data covering 1979-2020.
Next, the gridded historical demographic data are matched to provinces using gridded provincial code lookup data and a name lookup table. The age- and gender-specific fractions are then multiplied by the historical demographic data at the provincial level to obtain the total population by age and gender per grid cell for China in 1979-2020.
Method - demographic totals and fractions by age and gender in 2010-2100
The grid population count data in 2010-2100 under different shared socioeconomic pathway (SSP) scenarios are drawn from Chen et al. published in Scientific Data with a resolution of 1km (~ 0.008333 degree). We resampled the data to 0.5 degree by aggregating the population count together to obtain the future population data per cell.
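Aggregating population counts to a coarser grid means summing every fine cell that falls inside each coarse cell. The sketch below assumes that the fine grid aligns exactly with the coarse grid and that missing values have already been set to zero; the function name and the toy grid are hypothetical. Going from roughly 0.008333 degrees to 0.5 degrees corresponds to a factor of about 60 in each direction, while the example uses a factor of 2 for readability.

```python
def aggregate_counts(grid, factor):
    """Sum population counts over non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")

    coarse = [[0.0] * (cols // factor) for _ in range(rows // factor)]
    for r in range(rows):
        for c in range(cols):
            coarse[r // factor][c // factor] += grid[r][c]
    return coarse


# Toy 4x4 grid of population counts, aggregated to 2x2.
fine = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = aggregate_counts(fine, 2)
```

Summing, rather than averaging, keeps the total population unchanged across resolutions.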
This previously published dataset also provides the age- and gender-specific population of each province, so we calculated the fraction of each age and gender group at the provincial level. We then multiplied these fractions by the gridded population count to obtain the total population per age group per cell for each gender.
Note that the projected population data from Chen’s dataset covers 2010-2020, while the historical population in our dataset also covers 2010-2020. The two datasets of that same period may vary because the original population data come from different sources and are calculated based on different methods.
Disclaimer
This dataset is a hybrid of different datasets with independent methodologies. Spatial or temporal consistency across dataset boundaries cannot be guaranteed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wastewater-based epidemiology is a promising public health tool that can yield a more representative view of the population than case reporting. However, only about 80% of the U.S. population is connected to public sewers, and the characteristics of populations missed by wastewater-based epidemiology are unclear. To address this gap, we used publicly available datasets to assess sewer connectivity in the U.S. by location, demographic groups, and economic groups. Data from the U.S. Census’ American Housing Survey revealed that sewer connectivity was lower than average when the head of household was American Indian and Alaskan Native, White, non-Hispanic, older, and for larger households and those with higher income, but smaller geographic scales revealed local variations from this national connectivity pattern. For example, data from the U.S. Environmental Protection Agency showed that sewer connectivity was positively correlated with income in Minnesota, Florida, and California. Data from the U.S. Census’ American Community Survey and Environmental Protection Agency also revealed geographic areas with low sewer connectivity, such as Alaska, the Navajo Nation, Minnesota, Michigan, and Florida. However, with the exception of the U.S. Census data, there were inconsistencies across datasets. Using mathematical modeling to assess the impact of wastewater sampling inequities on inferences about epidemic trajectory at a local scale, we found that in some situations, even weak connections between communities may allow wastewater monitoring in one community to serve as a reliable proxy for an interacting community with no wastewater monitoring, when cases are widespread. A systematic, rigorous assessment of sewer connectivity will be important for ensuring an equitable and informed implementation of wastewater-based epidemiology as a public health monitoring system.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data set is the analysis data set for the paper titled "Causal Inference for Interfering Units With Cluster and Population Level Treatment Allocation Programs". It includes key power plant covariates, area-level characteristics, and ambient ozone concentrations within 100 km of the power plant.
Descriptive statistics of PP population.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual math proficiency from 2011 to 2023 for La Canada High School vs. California and La Canada Unified School District
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual math proficiency from 2011 to 2023 for Washington G High School vs. Illinois and Chicago Public Schools District 299
Target population (students): All students in their final year of secondary school (often 12th grade) who are engaged in advanced mathematics and physics studies that prepare them to enter STEM programs in higher education.
https://cdla.io/sharing-1-0/
The dataset includes:
Roll Number: Represents the roll number of the student.
Gender: Useful for analyzing performance differences between male and female students.
Race/Ethnicity: Allows analysis of academic performance trends across different racial or ethnic groups.
Parental Level of Education: Indicates the educational background of the student's family.
Lunch: Shows whether students receive a free or reduced lunch, which is often a socioeconomic indicator.
Test Preparation Course: This tells whether students completed a test prep course, which could impact their performance.
Math Score: Provides a measure of each student’s performance in math, used to calculate averages or trends across various demographics.
Science Score: Evaluates students' science knowledge, which can be analyzed to assess the overall scientific knowledge of the student.
Reading Score: Measures performance in reading, allowing for insights into literacy and comprehension levels among students.
Writing Score: Evaluates students' writing skills, which can be analyzed to assess overall literacy and expression.
Total Score: Shows the total number achieved by the student out of 400.
Grade: Grade achieved by the student. "A" grade if Total marks >= 320, "B" grade if Total marks >= 250, "C" grade if Total marks >= 200, "D" grade if Total marks >= 150, and "Fail" if Total marks < 150.
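The grade thresholds can be expressed as a small function, which is handy for checking the Grade column against Total Score. The function name `assign_grade` is hypothetical; the thresholds are the ones listed above, tested from highest to lowest.

```python
def assign_grade(total_score):
    """Return the grade for a total score out of 400."""
    if total_score >= 320:
        return "A"
    if total_score >= 250:
        return "B"
    if total_score >= 200:
        return "C"
    if total_score >= 150:
        return "D"
    return "Fail"


grades = [assign_grade(score) for score in (320, 319, 200, 150, 149)]
```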
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual math proficiency from 2011 to 2023 for Haines City Senior High School vs. Florida and Polk School District
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual math proficiency from 2019 to 2023 for Grand Oaks High School vs. Texas and Conroe Independent School District
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
All data refer to pupils in schools with grades according to the target and knowledge-related grading system. Statistics as per cent from 2013. ‘Local municipality’ means pupils in both municipal and independent schools located in the municipality, regardless of where they are registered in the population register. Source: The National Agency for Education (Siris).
Participant demographics and summary statistics.