Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains the formation energies of the filtered BDE-db, QM9, PC9, QMugs, and QMugs1.1 datasets. The training, test, and validation sets were randomly split in a ratio of 0.8, 0.1, and 0.1, respectively. The filtering procedure is described in the article "Graph-based deep learning models for thermodynamic property prediction: The interplay between target definition, data distribution, featurization, and model architecture" and the code can be found at https://github.com/chimie-paristech-CTM/thermo_GNN. After application of the filtering procedure described in the article, the final versions of QM9 (127,007 data points), BDE-db (289,639 data points), PC9 (96,634 data points), QMugs (636,821 data points) and QMugs1.1 (70,546 data points) were obtained and used throughout this study.
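As an illustration of the random 0.8 / 0.1 / 0.1 split mentioned above, the following is a minimal sketch and not the authors' code (their implementation lives in the linked repository). The function name `split_dataset`, the fixed seed, and the rounding rule are assumptions made for this example.

```python
import random


def split_dataset(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items and cut them into train, test, and validation lists.

    `fractions` follows the order given in the description
    (training, test, validation). The seed is an assumption for
    reproducibility and is not taken from the source.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fractions[0] * n)
    n_test = int(fractions[1] * n)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    # Whatever remains after train and test goes to validation.
    valid = shuffled[n_train + n_test:]
    return train, test, valid


train, test, valid = split_dataset(range(1000))
```

Every item lands in exactly one subset, so the three lists together always reproduce the input.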
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2017 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2017 American Community Survey 1-Year Estimates
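The margin-of-error interpretation above can be made concrete with a short sketch. The estimate and margin of error below are made-up numbers, not values from any ACS table, and the function names are hypothetical. The factor 1.645 is the critical value associated with a 90 percent margin of error; treat its use here as an assumption to be checked against the Accuracy of the Data documentation for the release in question.

```python
Z_90 = 1.645  # assumed critical value for a 90 percent margin of error


def confidence_bounds(estimate, moe):
    """Lower and upper confidence bounds: estimate minus/plus the MOE."""
    return estimate - moe, estimate + moe


def standard_error(moe, z=Z_90):
    """Recover the standard error implied by a published MOE."""
    return moe / z


# Hypothetical published values, for illustration only.
lower, upper = confidence_bounds(25000, 1500)
se = standard_error(1500)
```

Symbol entries such as '**' or '*****' have no numeric margin of error, so they would need to be filtered out before applying either function.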
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and code associated with a paper by Yancy et al. titled "Evaluating the definition and distribution of spring ephemeral wildflowers in eastern North America". Metadata is included in the files where possible.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Definition of terms presented in calculating the index of distributional consistency.
https://spdx.org/licenses/CC0-1.0.html
In the study “CLIMATE-LIMITED VEGETATION CHANGE IN THE CONTERMINOUS UNITED STATES OF AMERICA”, published in the Global Change Biology journal, we evaluated the effects of climate conditions on vegetation composition and distribution in the conterminous United States (CONUS). To disentangle the direct effects of climate change from different non-climate factors, we applied "Liebig's law of the minimum" in a geospatial context, and determined the climate-limited potential for tree, shrub, herbaceous, and non-vegetation fractional cover change. We then compared these potential rates against observed change rates for the period 1986 to 2018 to identify areas of the CONUS where vegetation change is likely being limited by climatic conditions. This dataset contains the input and the resulting rasters for the study, which include a) the observed rates of vegetation change, b) the climate-derived potential vegetation rates of change, c) the difference between potential and observed values, and d) the identified climatic limiting factor.
Methods
Input data
We use the available data from the “Vegetative Lifeform Cover from Landsat SR for CONUS” product (https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1809) to evaluate the changes in vegetation fractional cover.
The information for the climate factors was derived from the TerraClimate data catalog (https://www.climatologylab.org/terraclimate.html). We downloaded data from this catalog for the period 1971 to 2018 for the following variables: minimum temperature (TMIN), precipitation (PPT), actual evapotranspiration (AET), potential evapotranspiration (PET), and climatic water deficit (DEF).
Preprocessing of vegetation fractional cover data
We resampled and aligned the maps of fractional cover using pixel averaging to the extent and resolution of the TerraClimate dataset (~ 4 km). Then, we calculated rates of lifeform cover change per pixel using the Theil-Sen slope analysis (Sen, 1968; Theil, 1992).
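For readers unfamiliar with the Theil-Sen slope, the sketch below computes it for a single pixel's time series as the median of all pairwise slopes. It is an illustration only: the function name and the sample values are hypothetical, and the study's own raster processing is not shown here.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(years, values):
    """Median of the slopes between every pair of observations."""
    slopes = [
        (v2 - v1) / (t2 - t1)
        for (t1, v1), (t2, v2) in combinations(zip(years, values), 2)
        if t2 != t1  # skip pairs with identical time stamps
    ]
    if not slopes:
        raise ValueError("need at least two distinct time points")
    return median(slopes)


# Hypothetical fractional cover (%) for one pixel; 1989 is an outlier.
years = [1986, 1987, 1988, 1989, 1990]
cover = [10.0, 12.0, 14.0, 40.0, 18.0]
slope = theil_sen_slope(years, cover)
```

Because the estimator takes a median rather than a least-squares fit, the single outlier in 1989 does not shift the result away from the underlying trend of 2 percentage points per year.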
Preprocessing of climate variables data
To process the climate data, we defined a year time step as the months from July of one year to July of the next. Following this definition, we constructed annual maps of each climate variable for the years 1971 to 2018.
The annual maps of each climate variable were further summarized per pixel into a mean and a slope (calculated as the Theil-Sen slope) across one-, two-, three-, four-, five-, ten-, and 15-year lags.
Estimation of climate potential
We constructed a final multilayer dataset of response and predictor variables for the CONUS including the resulting maps of fractional cover rate of change (four response variables), the mean and slope maps for the climate variables for all the time-lags (70 predictor variables), and the initial percent cover for each lifeform in the year 1986 (four predictor variables).
We evaluated for each pixel in the CONUS which of the predictor variables produced the minimum potential rate of change in fractional cover for each lifeform class. To do that, we first calculated the 100% quantile hull of the distribution of each predictor variable against each response variable.
To calculate the 100% quantile of the predictor variables’ distribution we divided the total range of each predictor variable into equal-sized bins. The size and number of bins were set specifically per variable due to differences in their data distribution. For each of the bins, we calculated the maximum value of the vegetation rate of change, which resulted in a lookup table with the lower and upper boundaries of each bin, and the associated maximum rate of change. We constructed a total of 296 lookup tables, one per lifeform class and predictor variable combination. The resulting lookup tables were used to construct spatially explicit maps of maximum vegetation rate of change from each of the predictor variable input rasters, and the final climate potential maps were constructed by stacking all the resulting maps per lifeform class and selecting for each pixel the minimum predicted rate of change and the predictor variable that produced that rate.
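The lookup-table procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's code: the predictor names (`tmin`, `ppt`), the bin counts, the sample values, and all function names are hypothetical, and real inputs would be rasters rather than short lists.

```python
def build_lookup(predictor, response, n_bins):
    """Equal-width bins over the predictor range, with the maximum
    observed response (rate of change) stored for each bin."""
    lo, hi = min(predictor), max(predictor)
    if hi == lo:
        raise ValueError("predictor has no range to bin")
    width = (hi - lo) / n_bins
    maxima = [None] * n_bins  # None marks a bin with no observations
    for x, y in zip(predictor, response):
        i = min(int((x - lo) / width), n_bins - 1)  # x == hi goes in last bin
        if maxima[i] is None or y > maxima[i]:
            maxima[i] = y
    return {"lo": lo, "width": width, "maxima": maxima}


def predict_max(table, x):
    """Maximum rate of change associated with the bin that contains x."""
    n_bins = len(table["maxima"])
    i = int((x - table["lo"]) / table["width"])
    i = min(max(i, 0), n_bins - 1)  # clamp values outside the fitted range
    return table["maxima"][i]


def climate_potential(tables, pixel):
    """Minimum predicted rate across predictors, and the predictor
    that produced it (the limiting factor)."""
    candidates = [(predict_max(t, pixel[name]), name) for name, t in tables.items()]
    candidates = [c for c in candidates if c[0] is not None]
    return min(candidates, key=lambda c: c[0])


# Hypothetical per-pixel observations: one response, two predictors.
rate = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6]
tables = {
    "tmin": build_lookup([0, 2, 4, 6, 8, 10], rate, n_bins=2),
    "ppt": build_lookup([600, 500, 400, 300, 200, 100], rate, n_bins=3),
}
potential, limiting = climate_potential(tables, {"tmin": 7, "ppt": 550})
```

Taking the minimum across predictors is what applies the law of the minimum: each lookup table gives an upper bound on the rate of change, and the tightest bound is treated as the climate potential for that pixel.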
Identifying climate-limited areas
We defined climate-limited areas as the parts of the CONUS with little or no differences between the estimated climate potential and the observed rates of change in fractional cover. To identify these areas, we subtracted the raster of observed rates of change from the raster of climate potential for each lifeform class.
Suggested Citation: Doyle, Jennifer. (2020) VERSION SUPERSEDED - Nephrops Underwater TV Survey FU22 The "Smalls". Marine Institute, Ireland. doi:10/dk22.
In 2024, 34.59 percent of all households in the United States were two-person households. In 1970, this figure was at 28.92 percent.
Single households
Single mother households are usually the most common households with children under 18 years old found in the United States. As of 2021, the District of Columbia and North Dakota had the highest share of single-person households in the United States. Household size in the United States has decreased over the past century, due to changing customs and traditions. Families are typically more nuclear, whereas in the past, multigenerational households were more common. Furthermore, fertility rates have also decreased, meaning that women do not have as many children as they used to.
Average households in Utah
Out of all states in the U.S., Utah was reported to have the largest average household size. This predominantly Mormon state has about three million inhabitants. The Church of the Latter-Day Saints, or Mormonism, plays a large role in Utah and can contribute to the state's high birth rate and household size. The Church of Latter-Day Saints promotes having many children and tight-knit families. Furthermore, Utah has a relatively young population, due to Mormons typically marrying and starting large families younger than those in other states.
The TEI Data Distribution packages in this folder contain the full Terrestrial Ecosystem Information (TEI) dataset split into Predictive Ecosystem Mapping (PEM) data and non-PEM data, which includes Terrestrial Ecosystem Mapping (TEM), Terrain Mapping (TER), Bioterrain Mapping (TBT), Terrain Stability Mapping (TSM), Sensitive Ecosystems Inventory (SEI), Soil Mapping (SOIL project boundaries only), and Wildlife Habitat Ratings (WHR project boundaries only) by Natural Resource Sector Region (see Index map .pdf). Data includes the Project Boundaries (with project metadata and links to related data such as reports), Long Table (detailed mapping polygons with the full RISC standard attribute table), Short Table (detailed mapping polygons with key and amalgamated (concatenated) attributes derived from Long Table), On-site Symbol features (point, line or polygon terrain features such as landslide tracks, scarps), Sample Sites (field sampling locations), and any user-defined tables. The data dictionary is also available. This data is in file geodatabase format.
Current version: v11 (published on 2024-10-03)
Previous versions: v10 (published on 2023-11-14), v9 (published on 2023-03-01), v8 (published on 2016-09-01)
Note that the Soil Mapping dataset is available from: http://www.env.gov.bc.ca/esd/distdata/ecosystems/Soil_Data/SOIL_DATA_FGDB/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains medical and lifestyle information for 1500 patients, designed to predict the presence of cancer based on various features. The dataset is structured to provide a realistic challenge for predictive modeling in the medical domain.
Age: Integer values representing the patient's age, ranging from 20 to 80.
Gender: Binary values representing gender, where 0 indicates Male and 1 indicates Female.
BMI: Continuous values representing Body Mass Index, ranging from 15 to 40.
Smoking: Binary values indicating smoking status, where 0 means No and 1 means Yes.
GeneticRisk: Categorical values representing genetic risk levels for cancer, with 0 indicating Low, 1 indicating Medium, and 2 indicating High.
PhysicalActivity: Continuous values representing the number of hours per week spent on physical activities, ranging from 0 to 10.
AlcoholIntake: Continuous values representing the number of alcohol units consumed per week, ranging from 0 to 5.
CancerHistory: Binary values indicating whether the patient has a personal history of cancer, where 0 means No and 1 means Yes.
Diagnosis: Binary values indicating the cancer diagnosis status, where 0 indicates No Cancer and 1 indicates Cancer.
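The value ranges and codes listed above lend themselves to a simple sanity check before modeling. The sketch below encodes them as constants and reports which fields of a row fall outside the documented limits. The function name and the sample rows are hypothetical, and treating the stated ranges as inclusive is an assumption.

```python
# Documented numeric ranges (assumed inclusive) and allowed codes.
RANGES = {
    "Age": (20, 80),
    "BMI": (15, 40),
    "PhysicalActivity": (0, 10),
    "AlcoholIntake": (0, 5),
}
CODES = {
    "Gender": {0, 1},          # 0 = Male, 1 = Female
    "Smoking": {0, 1},         # 0 = No, 1 = Yes
    "GeneticRisk": {0, 1, 2},  # 0 = Low, 1 = Medium, 2 = High
    "CancerHistory": {0, 1},   # 0 = No, 1 = Yes
    "Diagnosis": {0, 1},       # 0 = No Cancer, 1 = Cancer
}


def find_problems(row):
    """Return the names of fields whose values violate the data card."""
    problems = []
    for name, (lo, hi) in RANGES.items():
        if not lo <= row[name] <= hi:
            problems.append(name)
    for name, allowed in CODES.items():
        if row[name] not in allowed:
            problems.append(name)
    return problems


# Hypothetical rows, for illustration only.
good_row = {"Age": 45, "Gender": 1, "BMI": 27.5, "Smoking": 0, "GeneticRisk": 1,
            "PhysicalActivity": 3.2, "AlcoholIntake": 1.5, "CancerHistory": 0,
            "Diagnosis": 0}
bad_row = dict(good_row, Age=19, GeneticRisk=3)
```

Here `find_problems(good_row)` returns an empty list, while `find_problems(bad_row)` names the two out-of-range fields.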
This dataset is intended for training and testing machine learning models for cancer prediction.
This dataset has been preprocessed and cleaned to ensure that users can focus on the most critical aspects of their analysis. The preprocessing steps were designed to eliminate noise and irrelevant information, allowing you to concentrate on developing and fine-tuning your predictive models.
This dataset, shared by Rabie El Kharoua, is original and has never been shared before. It is made available under the CC BY 4.0 license, allowing anyone to use the dataset in any form as long as proper citation is given to the author. A DOI is provided for proper referencing. Please note that duplication of this work within Kaggle is not permitted.
This dataset is synthetic and was generated for educational purposes, making it ideal for data science and machine learning projects. It is an original dataset, owned by Mr. Rabie El Kharoua, and has not been previously shared. You are free to use it under the license outlined on the data card. The dataset is offered without any guarantees. Details about the data provider will be shared soon.
Dataset DOI: 10.5061/dryad.n02v6wx9q
The dataset contains a new graduated nativeness status for the Danish vascular flora. In addition, we list our version of species’ status from a binary definition of nativeness from the three sources: the Euro+Med Plantbase (Euro+Med 2006), the Danish Redlist (Moeslund 2023) and Atlas Flora Danica (Hartvig and Vestergaard 2015).
Data from Euro+Med (2006) were used to create the graduated definition of nativeness, with Denmark as the focal territory. Species given in that source as non-native in Denmark, but strictly native to one or more of the following neighbouring countries (or Euro+Med territories) were re-classified as ‘near-native’ : Sweden, Norway, Germany, the Netherlands, Poland, Latvia, Lithuania, Belgium with Luxembourg, the Czech Republic, Estonia, “Baltic states with Kalini...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Two common methods are used for collecting medical disease data: data integration across medical institutions and questionnaires. However, both require collecting data from the entire research area, which consumes a significant amount of manpower and material resources. Additionally, data integration is difficult and poses privacy protection challenges, resulting in a large amount of missing data in the dataset. The presence of incomplete data significantly reduces the quality of the published data, hindering timely analysis and the generation of reliable knowledge by epidemiologists, public health authorities, and researchers. Consequently, this affects the downstream tasks that rely on this data. To address the issue of discrete missing data in cardiac disease, this paper proposes the AGAN (Attribute Generative Adversarial Nets) architecture for missing data filling, based on generative adversarial networks. The algorithm takes advantage of the strong learning ability of generative adversarial networks. Because the meaning of filled data is ambiguous in other network structures, an attribute matrix is designed to convert the filled values directly into the corresponding data type, making the actual meaning of the filled data more evident. Furthermore, the distribution deviation between the generated data and the real data is integrated into the loss function of the generative adversarial network, improving training stability and ensuring consistency between the generated and real data distributions. This approach establishes a missing data filling mechanism based on generative adversarial networks that keeps the data distribution reasonable while filling the missing data samples.
The experimental results demonstrate that compared to other filling algorithms, the data matrix filled by the proposed algorithm in this paper has more evident practical significance, fewer errors, and higher accuracy in downstream classification prediction.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Set definition.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Tell us what you think. Provide feedback to help make American Community Survey data more useful for you.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2016 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
When information is missing or inconsistent, the Census Bureau logically assigns an acceptable value using the response to a related question or questions. If a logical assignment is not possible, data are filled using a statistical process called allocation, which uses a similar individual or household to provide a donor value. The "Allocated" section is the number of respondents who received an allocated value for a particular subject.
Workers include members of the Armed Forces and civilians who were at work last week.
The 12 selected states are Connecticut, Maine, Massachusetts, Michigan, Minnesota, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont, and Wisconsin.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2016 American Community Survey 1-Year Estimates
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2010-2014 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
Occupation codes are 4-digit codes and are based on Standard Occupational Classification 2010.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2010-2014 American Community Survey 5-Year Estimates
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2012 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
This table has been updated to include additional categories for detailed Asian groups. Multi-year estimates for these additional detailed groups will be produced after three single years of data is tabulated (beginning with the first 1-year release in 2011).
Total includes people who reported Asian only, regardless of whether they reported one or more detailed Asian groups.
Other Asian, specified. Includes respondents who provide a response of another Asian group not shown separately, such as Iwo Jiman, Maldivian, or Singaporean.
Other Asian, not specified. Includes respondents who checked the "Other Asian" response category on the ACS questionnaire and did not write in a specific group or wrote in a generic term such as "Asian," or "Asiatic."
Two or more Asian. Includes respondents who provided multiple Asian responses such as Asian Indian and Japanese; or Vietnamese, Chinese and Hmong.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2012 American Community Survey
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2011-2015 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
Questions for "wage and salary" and "tips, bonuses and commissions" were asked separately for the first time during non-response follow-up via Computer Assisted Telephone Interview (CATI) and Computer Assisted Personal Interview (CAPI). Prior to 2013 these questions were asked in combination, "wages, salary, tips, bonuses and commissions."
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2011-2015 American Community Survey 5-Year Estimates
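As a rough illustration of the margin-of-error note above, the lower and upper confidence bounds can be computed directly from a published estimate and its 90 percent margin of error. The function names and the sample figures below are hypothetical and do not come from any ACS table; the 1.645 factor is the usual z-score for a 90 percent confidence level.

```python
Z_90 = 1.645  # z-score for a 90 percent confidence level


def confidence_bounds(estimate, moe):
    """Return (lower, upper): estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe, z=Z_90):
    """Recover the approximate standard error from a 90 percent margin of error."""
    return moe / z


# Hypothetical estimate and margin of error, for illustration only.
lower, upper = confidence_bounds(52000, 1500)
print(lower, upper)
print(round(standard_error(1500), 1))
```

Entries flagged with symbols such as '**' or '***' carry no usable margin of error, so a calculation like this would not apply to them.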
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2012 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The 2009, 2010, 2011, and 2012 plumbing data for Puerto Rico will not be shown. Research indicates that the questions on plumbing facilities that were introduced in 2008 in the stateside American Community Survey and the 2008 Puerto Rico Community Survey may not have been appropriate for Puerto Rico.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2012 American Community Survey
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2009-2013 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
Fertility data are not available for certain geographic areas due to problems with data collection. See Errata Note #92 for details.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2009-2013 5-Year American Community Survey
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Note: For information on data collection, confidentiality protection, nonsampling error, and definitions, see the 2020 Island Areas Censuses Technical Documentation.
Note: For information on the codes used when processing the data in this table, see the 2020 Island Areas Censuses Technical Documentation.
Explanation of Symbols:
1. An "-" means the statistic could not be computed because there were an insufficient number of observations.
2. An "-" following a median estimate means the median falls in the lowest interval of an open-ended distribution.
3. An "+" following a median estimate means the median falls in the upper interval of an open-ended distribution.
4. An "N" means data are not displayed for the selected geographic area due to concerns with statistical reliability or an insufficient number of cases.
5. An "(X)" means not applicable.
Source: U.S. Census Bureau, 2020 Census, Commonwealth of the Northern Mariana Islands.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Note: For information on data collection, confidentiality protection, nonsampling error, and definitions, see the 2020 Island Areas Censuses Technical Documentation.
Due to COVID-19 restrictions impacting data collection for the 2020 Census of American Samoa, data tables reporting social and economic characteristics do not include the group quarters population in the table universe. As a result, impacted 2020 data tables should not be compared to 2010 and other past census data tables reporting the same characteristics. The Census Bureau advises data users to verify table universes are the same before comparing data across census years. For more information about data collection limitations and the impacts on American Samoa's data products, see the 2020 Island Areas Censuses Technical Documentation.
Note: Occupation categories are based on 4-digit codes from the Standard Occupational Classification 2018.
Explanation of Symbols:
1. An "-" means the statistic could not be computed because there were an insufficient number of observations.
2. An "-" following a median estimate means the median falls in the lowest interval of an open-ended distribution.
3. An "+" following a median estimate means the median falls in the upper interval of an open-ended distribution.
4. An "N" means data are not displayed for the selected geographic area due to concerns with statistical reliability or an insufficient number of cases.
5. An "(X)" means not applicable.
Source: U.S. Census Bureau, 2020 Census, American Samoa.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains the formation energies for the filtered BDE-db, QM9, PC9, QMugs, and QMugs1.1 datasets. The training, test, and validation sets were randomly split in a ratio of 0.8, 0.1, and 0.1, respectively. The filtering procedure is described in the article "Graph-based deep learning models for thermodynamic property prediction: The interplay between target definition, data distribution, featurization, and model architecture" and the code can be found at https://github.com/chimie-paristech-CTM/thermo_GNN. After application of the filtering procedure described in the article, the final versions of QM9 (127,007 data points), BDE-db (289,639 data points), PC9 (96,634 data points), QMugs (636,821 data points) and QMugs1.1 (70,546 data points) were obtained and used throughout this study.
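A random split in the stated 0.8/0.1/0.1 ratio could be sketched as follows. This is a minimal illustration and not the code from the thermo_GNN repository: the function name, the seed, the rounding of set sizes, and the choice to split integer row indices are all assumptions.

```python
import random


def split_indices(n, train_frac=0.8, test_frac=0.1, seed=0):
    """Randomly split range(n) into train, test, and validation index lists.

    Whatever remains after the train and test slices goes to validation,
    so every index lands in exactly one set.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumed)
    n_train = int(train_frac * n)
    n_test = int(test_frac * n)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    val = indices[n_train + n_test:]
    return train, test, val


# 127,007 is the filtered QM9 size quoted above.
train, test, val = split_indices(127007)
print(len(train), len(test), len(val))
```

Splitting indices rather than the data itself keeps the sketch independent of the file format used in the folder.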