Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
huggingface-projects/contribute-a-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Government spending in the United States was last recorded at 39.7 percent of GDP in 2024 . This dataset provides - United States Government Spending To Gdp- actual values, historical data, forecast, chart, statistics, economic calendar and news.
By Health [source]
The Behavioral Risk Factor Surveillance System (BRFSS) offers an expansive collection of data on the health-related quality of life (HRQOL) from 1993 to 2010. Over this time period, the Health-Related Quality of Life dataset consists of a comprehensive survey reflecting the health and well-being of non-institutionalized US adults aged 18 years or older. The data collected can help track and identify unmet population health needs, recognize trends, identify disparities in healthcare, determine determinants of public health, inform decision making and policy development, as well as evaluate programs within public healthcare services.
The HRQOL surveillance system has developed a compact set of HRQOL measures such as a summary measure indicating unhealthy days which have been validated for population health surveillance purposes and have been widely implemented in practice since 1993. Within this study's dataset you will be able to access information such as year recorded, location abbreviations & descriptions, category & topic overviews, questions asked in surveys and much more detailed information including types & units regarding data values retrieved from respondents along with their sample sizes & geographical locations involved!
For more datasets, click here.
- 🚨 Your notebook can be here! 🚨!
This dataset tracks the Health-Related Quality of Life (HRQOL) from 1993 to 2010 using data from the Behavioral Risk Factor Surveillance System (BRFSS). This dataset includes information on the year, location abbreviation, location description, type and unit of data value, sample size, category and topic of survey questions.
Using this dataset on BRFSS: HRQOL data between 1993-2010 will allow for a variety of analyses related to population health needs. The compact set of HRQOL measures can be used to identify trends in population health needs as well as determine disparities among various locations. Additionally, responses to survey questions can be used to inform decision making and program and policy development in public health initiatives.
- Analyzing trends in HRQOL over the years by location to identify disparities in health outcomes between different populations and develop targeted policy interventions.
- Developing new models for predicting HRQOL indicators at a regional level, and using this information to inform medical practice and public health implementation efforts.
- Using the data to understand differences between states in terms of their HRQOL scores and establish best practices for healthcare provision based on that understanding, including areas such as access to care, preventative care services availability, etc
If you use this dataset in your research, please credit the original authors. Data Source
See the dataset description for more information.
File: rows.csv | Column name | Description | |:-------------------------------|:----------------------------------------------------------| | Year | Year of survey. (Integer) | | LocationAbbr | Abbreviation of location. (String) | | LocationDesc | Description of location. (String) | | Category | Category of survey. (String) | | Topic | Topic of survey. (String) | | Question | Question asked in survey. (String) | | DataSource | Source of data. (String) | | Data_Value_Unit | Unit of data value. (String) | | Data_Value_Type | Type of data value. (String) | | Data_Value_Footnote_Symbol | Footnote symbol for data value. (String) | | Data_Value_Std_Err | Standard error of the data value. (Float) | | Sample_Size | Sample size used in sample. (Integer) | | Break_Out | Break out categories used. (String) | | Break_Out_Category | Type break out assessed. (String) | | **GeoLocation*...
This is the dataset used for the U.S. lead exposure risk hotspot analysis in Zartarian et al., 2024, ES&T. The data dictionary files explain the contents of the 2 included zipped data folders. The Figures 1&2 zipped folder contains the data for Figures 1 and 2, the Supplement A figures, and the data for all tables in the paper. The Supplement B zipped folder contains the Random Forest modeling methodology in Supplement B and corresponding data. This folder also includes the full national dataset version of Random Forest model version 1 and 2 used in the analysis (in .csv format). This dataset is associated with the following publication: Zartarian Morrison, V., J. Xue, A. Poulakos, R. Tornero-Velez, L. Stanek, E. Snyder, V. Helms Garrison, K. Egan, and J. Courtney. A U.S. Lead Exposure Hotspots Analysis. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 7(7): 3311-3321, (2024).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Every year, young women from across the United States compete for the title of Miss America. The competition is open to women between the ages of 17 and 25, and includes a talent portion, an interview, and a swimsuit competition (which was removed in 2018). The winner is crowned by the previous year's titleholder and goes on to tour the nation for about 20,000 miles a month, promoting her particular platform of interest.
The Miss America dataset contains information on all Miss America titleholders from 1921 to 2022. It includes columns for the year of the pageant, the name of the crowned winner, her state or district represented, awards won, talent performed, and notes about her win
This dataset contains information on Miss America titleholders from 1921 to 2022. The data includes the name of the winner, her state or district, the city she represented, her talent, and the year she won
- Miss America could be used to study changes in American culture over time. For example, the decline in the swimsuit competition could be seen as a sign of increasing body positivity in the US.
- The dataset could be used to study the effect of winning Miss America has on a woman's career. Does winning lead to more opportunities?
- The dataset could be used to study geographical patterns inMiss America winners. For example, are there any states that have produced more winners than others?
License
License: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit - Provide a link to the license, and indicate if changes were made. - ShareAlike - You must distribute your contributions under the same license as the original.
File: miss_america_titleholders.csv | Column name | Description | |:----------------------|:-----------------------------------------------------------------------| | year | The year the Miss America pageant was held. (Integer) | | crowned | The name of the Miss America titleholder. (String) | | winner | The name of the Miss America winner. (String) | | state_or_district | The state or district represented by the Miss America winner. (String) | | city | The city represented by the Miss America winner. (String) | | awards | The awards won by the Miss America winner. (String) | | talent | The talent performed by the Miss America winner. (String) | | notes | Notes about the Miss America winner. (String) |
File: eurovision_winners.csv | Column name | Description | |:--------------|:-------------------------------------------------------------------------| | Year | The year the pageant was held. (Integer) | | Date | The date the pageant was held. (Date) | | Host City | The city where the pageant was held. (String) | | Winner | The name of the pageant winner. (String) | | Song | The song performed by the pageant winner. (String) | | Performer | The name of the performer of the pageant winner's song. (String) | | Points | The number of points the pageant winner received. (Integer) | | Margin | The margin of points between the pageant winner and runner-up. (Integer) | | Runner-up | The name of the pageant runner-up. (String) |
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the distribution of median household income among distinct age brackets of householders in Lead. Based on the latest 2019-2023 5-Year Estimates from the American Community Survey, it displays how income varies among householders of different ages in Lead. It showcases how household incomes typically rise as the head of the household gets older. The dataset can be utilized to gain insights into age-based household income trends and explore the variations in incomes across households.
Key observations: Insights from 2023
In terms of income distribution across age cohorts, in Lead, the median household income stands at $74,444 for householders within the 25 to 44 years age group, followed by $74,135 for the 45 to 64 years age group. Notably, householders within the 65 years and over age group, had the lowest median household income at $44,219.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. All incomes have been adjusting for inflation and are presented in 2023-inflation-adjusted dollars.
Age groups classifications include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Lead median household income by age. You can refer the same here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GDP Growth Contribution Consumer Spending in the United States decreased to 1.21 percentage points in the first quarter of 2025 from 2.70 percentage points in the fourth quarter of 2024. This dataset includes a chart with historical data for the United States GDP Growth Contribution Consumer Spending.
Offshore wind (OSW) is an established technology in Europe, but it has not yet gained market share in the United States (U.S.). There is, however, increasing interest in and action supporting OSW development from many coastal states, predominantly along the Atlantic coast. As OSW grows in the U.S., as seems likely, it will displace existing and future generation assets. Depending on the energy resources used by those generators, emissions from the electric power sector will change. This research explores combinations of two energy sector drivers, OSW costs and carbon dioxide (CO2) mitigation stringency, to measure the changes in the energy mix and quantify OSW’s impact on the resulting emissions. This dataset is not publicly accessible because: This is a very large access file in a format specifically for the times model. It can be accessed through the following means: Contact Carol Lenox at lenox.carol@epa.gov. Format: The data resides in a number of excel files and in the corresponding TIMES model. This dataset is associated with the following publication: Browning, M., and C. Lenox. Contribution of Offshore Wind to the Power Grid: U.S. Air Quality Implications. Applied Energy. Elsevier B.V., Amsterdam, NETHERLANDS, 276: 115474, (2020).
This is a collection of dataset that I personally think it is useful in analysing COVID19 data. Since all of the data comes from the internet and majority of them originated from World Bank, I am use some Kaggle users has already uploaded similar data. However, I think it makes my life (and perhaps yours) easier by compiling all of these data together.
The following are some remarks for the dataset-
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GDP Growth Contribution Investment in the United States increased to 3.60 percentage points in the first quarter of 2025 from -1.03 percentage points in the fourth quarter of 2024. This dataset includes a chart with historical data for the United States GDP Growth Contribution Investment.
https://www.usa.gov/government-workshttps://www.usa.gov/government-works
Reporting of Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.
This archived public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties.
The COVID-19 community levels were developed using a combination of three metrics — new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days. The COVID-19 community level was determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases acts as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge.
Using these data, the COVID-19 community level was classified as low, medium, or high.
COVID-19 Community Levels were used to help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.
For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.
Archived Data Notes:
This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.
March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previous released.
March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.
March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.
March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.
March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).
March 31, 2022: Data values for columns, “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added since the week of 3/3/2022, thus the values were previously missing for records released the week prior.
April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.
April 21, 2022: COVID-19 Community Level (CCL) data released for counties in Nebraska for the week of April 21, 2022 have 3 counties identified in the high category and 37 in the medium category. CDC has been working with state officials to verify the data submitted, as other data systems are not providing alerts for substantial increases in disease transmission or severity in the state.
May 26, 2022: COVID-19 Community Level (CCL) data released for McCracken County, KY for the week of May 5, 2022 have been updated to correct a data processing error. McCracken County, KY should have appeared in the low community level category during the week of May 5, 2022. This correction is reflected in this update.
May 26, 2022: COVID-19 Community Level (CCL) data released for several Florida counties for the week of May 19th, 2022, have been corrected for a data processing error. Of note, Broward, Miami-Dade, Palm Beach Counties should have appeared in the high CCL category, and Osceola County should have appeared in the medium CCL category. These corrections are reflected in this update.
May 26, 2022: COVID-19 Community Level (CCL) data released for Orange County, New York for the week of May 26, 2022 displayed an erroneous case rate of zero and a CCL category of low due to a data source error. This county should have appeared in the medium CCL category.
June 2, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a data processing error. Tolland County, CT should have appeared in the medium community level category during the week of May 26, 2022. This correction is reflected in this update.
June 9, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a misspelling. The medium community level category for Tolland County, CT on the week of May 26, 2022 was misspelled as “meduim” in the data set. This correction is reflected in this update.
June 9, 2022: COVID-19 Community Level (CCL) data released for Mississippi counties for the week of June 9, 2022 should be interpreted with caution due to a reporting cadence change over the Memorial Day holiday that resulted in artificially inflated case rates in the state.
July 7, 2022: COVID-19 Community Level (CCL) data released for Rock County, Minnesota for the week of July 7, 2022 displayed an artificially low case rate and CCL category due to a data source error. This county should have appeared in the high CCL category.
July 14, 2022: COVID-19 Community Level (CCL) data released for Massachusetts counties for the week of July 14, 2022 should be interpreted with caution due to a reporting cadence change that resulted in lower than expected case rates and CCL categories in the state.
July 28, 2022: COVID-19 Community Level (CCL) data released for all Montana counties for the week of July 21, 2022 had case rates of 0 due to a reporting issue. The case rates have been corrected in this update.
July 28, 2022: COVID-19 Community Level (CCL) data released for Alaska for all weeks prior to July 21, 2022 included non-resident cases. The case rates for the time series have been corrected in this update.
July 28, 2022: A laboratory in Nevada reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate will be inflated in Clark County, NV for the week of July 28, 2022.
August 4, 2022: COVID-19 Community Level (CCL) data was updated on August 2, 2022 in error during performance testing. Data for the week of July 28, 2022 was changed during this update due to additional case and hospital data as a result of late reporting between July 28, 2022 and August 2, 2022. Since the purpose of this data set is to provide point-in-time views of COVID-19 Community Levels on Thursdays, any changes made to the data set during the August 2, 2022 update have been reverted in this update.
August 4, 2022: COVID-19 Community Level (CCL) data for the week of July 28, 2022 for 8 counties in Utah (Beaver County, Daggett County, Duchesne County, Garfield County, Iron County, Kane County, Uintah County, and Washington County) case data was missing due to data collection issues. CDC and its partners have resolved the issue and the correction is reflected in this update.
August 4, 2022: Due to a reporting cadence change, case rates for all Alabama counties will be lower than expected. As a result, the CCL levels published on August 4, 2022 should be interpreted with caution.
August 11, 2022: COVID-19 Community Level (CCL) data for the week of August 4, 2022 for South Carolina have been updated to correct a data collection error that resulted in incorrect case data. CDC and its partners have resolved the issue and the correction is reflected in this update.
August 18, 2022: COVID-19 Community Level (CCL) data for the week of August 11, 2022 for Connecticut have been updated to correct a data ingestion error that inflated the CT case rates. CDC, in collaboration with CT, has resolved the issue and the correction is reflected in this update.
August 25, 2022: A laboratory in Tennessee reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate may be inflated in many counties and the CCLs published on August 25, 2022 should be interpreted with caution.
August 25, 2022: Due to a data source error, the 7-day case rate for St. Louis County, Missouri, is reported as zero in the COVID-19 Community Level data released on August 25, 2022. Therefore, the COVID-19 Community Level for this county should be interpreted with caution.
September 1, 2022: Due to a reporting issue, case rates for all Nebraska counties will include 6 days of data instead of 7 days in the COVID-19 Community Level (CCL) data released on September 1, 2022. Therefore, the CCLs for all Nebraska counties should be interpreted with caution.
September 8, 2022: Due to a data processing error, the case rate for Philadelphia County, Pennsylvania,
Operational, financial, and land use data to estimate drinking water treatment cost functions for 2006 calendar year. Data are organized for surface water and groundwater systems. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: US EPA's Office of Water, Office of Science and Technology, Engineering and Analysis Division is the holder of the survey data. US EPA's Office of Water maintains a database of point coordinates for surface water intakes and wells used for public water supply. Format: Our dataset is described in detail in Section 3 of the paper. We include a link to the 2006 Community Water System Survey that excludes the identifiers. Other data are confidential business information.
This dataset is associated with the following publication: Price, J., and M. Heberling. The Effects of Agricultural and Urban Land Use on Drinking Water Treatment Costs: An Analysis of United States Community Water Systems. Water Economics and Policy. World Scientific Publishing Co. Pte. Ltd., 5 Toh Tuck Link, SINGAPORE, 6(4): 2050008, (2020).
This dataset provides the monthly total of gift contributions received by the U.S. Treasury that were donated to reduce the public debt. These donations can include money, outstanding government obligations, and property that is sold for cash.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
By US Open Data Portal, data.gov [source]
This dataset contains in-depth facility-level information on industrial combustion energy use in the United States. It provides an essential resource for understanding consumption patterns across different sectors and industries, as reported by large emitters (>25,000 metric tons CO2e per year) under the U.S. EPA's Greenhouse Gas Reporting Program (GHGRP). Our records have been calculated using EPA default emissions factors and contain data on fuel type, location (latitude, longitude), combustion unit type and energy end use classified by manufacturing NAICS code. Additionally, our dataset reveals valuable insight into the thermal spectrum of low-temperature energy use from a 2010 Energy Information Administration Manufacturing Energy Consumption Survey (MECS). This information is critical to assessing industrial trends of energy consumption in manufacturing sectors and can serve as an informative baseline for efficient or renewable alternative plans of operation at these facilities. With this dataset you're just a few clicks away from analyzing research questions related to consumption levels across industries, waste issues associated with unconstrained fossil fuel burning practices and their environmental impacts
For more datasets, click here.
- 🚨 Your notebook can be here! 🚨!
This dataset provides detailed information on industrial combustion energy end use in the United States. Knowing how certain industries use fuel can be valuable for those interested in reducing energy consumption and its associated environmental impacts.
To make the most out of this dataset, users should first become familiar with what's included by looking at the columns and their respective definitions. After becoming familiar with the data, users should start to explore areas of interest such as Fuel Type, Report Year, Primary NAICS Code, Emissions Indicators etc. The more granular and specific details you can focus on will help build a stronger analysis from which to draw conclusions from your data set.
Next steps could include filtering your data set down by region or end user type (such as direct related processes or indirect support activities). Segmenting your data set further can allow you to identify trends between fuel type used in different regions or compare emissions indicators between different processes within manufacturing industries etc. By taking a closer look through this lens you may be able to find valuable insights that can help inform better decision making when it comes to reducing energy consumption throughout industry in both public and private sectors alike.
if exploring specific trends within industry is not something that’s of particular interest to you but rather understanding general patterns among large emitters across regions then it may be beneficial for your analysis to group like-data together and take averages over larger samples which better represent total production across an area or multiple states (timeline varies depending on needs). This approach could open up new possibilities for exploring correlations between economic productivity metrics compared against industrial energy use over periods of time which could lead towards more formal investigations about where efforts are being made towards improved resource efficiency standards among certain industries/areas of production compared against other more inefficient sectors/regionsetc — all from what's already present here!
By leveraging the information provided within this dataset users have access to many opportunities for finding all sorts of interesting yet practical insights which can have important impacts far beyond understanding just another singular statistic alone; so happy digging!
- Analyzing the trends in combustion energy uses by region across different industries.
- Predicting the potential of transitioning to clean and renewable sources of energy considering the current end-uses and their magnitude based on this data.
- Creating an interactive web map application to visualize multiple industrial sites, including their energy sources and emissions data from this dataset combined with other sources (EPA’s GHGRP, MECS survey, etc)
If you use this dataset in your research, please credit the original authors. Data Source
**License: [CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication](https://creativecommons...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GDP Growth Contribution Exports in the United States increased to 0.19 percentage points in the first quarter of 2025 from -0.01 percentage points in the fourth quarter of 2024. This dataset includes a chart with historical data for the United States GDP Growth Contribution Exports.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the South Lead Hill population distribution across 18 age groups. It lists the population in each age group along with the percentage population relative of the total population for South Lead Hill. The dataset can be utilized to understand the population distribution of South Lead Hill by age. For example, using this dataset, we can identify the largest age group in South Lead Hill.
Key observations
The largest age group in South Lead Hill, AR was for the group of age 45 to 49 years years with a population of 8 (13.56%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in South Lead Hill, AR was the 80 to 84 years years with a population of 0 (0%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for South Lead Hill Population by Age. You can refer the same here
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Native American Facial Images from Past Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.
This dataset comprises over 5,000+ images, divided into participant-wise sets with each set including:
The dataset includes contributions from a diverse network of individuals across Native American countries:
To ensure high utility and robustness, all images are captured under varying conditions:
Each image set is accompanied by detailed metadata for each participant, including:
This metadata is essential for training models that can accurately recognize and identify Native American faces across different demographics and conditions.
This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:
This dataset package is focused on U.S construction materials and three construction companies: Cemex, Martin Marietta & Vulcan.
In this package, SpaceKnow tracks manufacturing and processing facilities for construction material products all over the US. By tracking these facilities, we are able to give you near-real-time data on spending on these materials, which helps to predict residential and commercial real estate construction and spending in the US.
The dataset includes 40 indices focused on asphalt, cement, concrete, and building materials in general. You can look forward to receiving country-level and regional data (activity in the North, East, West, and South of the country) and the aforementioned company data.
SpaceKnow uses satellite (SAR) data to capture activity and building material manufacturing and processing facilities in the US.
Data is updated daily, has an average lag of 4-6 days, and history back to 2017.
The insights provide you with level and change data for refineries, storage, manufacturing, logistics, and employee parking-based locations.
SpaceKnow offers 3 delivery options: CSV, API, and Insights Dashboard
Available Indices Companies: Cemex (CX): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates Martin Marietta (MLM): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates Vulcan (VMC): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates
USA Indices:
Aggregates USA Asphalt USA Cement USA Cement Refinery USA Cement Storage USA Concrete USA Construction Materials USA Construction Mining USA Construction Parking Lots USA Construction Materials Transfer Hub US Cement - Midwest, Northeast, South, West Cement Refinery - Midwest, Northeast, South, West Cement Storage - Midwest, Northeast, South, West
Why get SpaceKnow's U.S Construction Materials Package?
Monitor Construction Market Trends: Near-real-time insights into the construction industry allow clients to understand and anticipate market trends better.
Track Companies Performance: Monitor the operational activities, such as the volume of sales
Assess Risk: Use satellite activity data to assess the risks associated with investing in the construction industry.
Index Methodology Summary Continuous Feed Index (CFI) is a daily aggregation of the area of metallic objects in square meters. There are two types of CFI indices; CFI-R index gives the data in levels. It shows how many square meters are covered by metallic objects (for example employee cars at a facility). CFI-S index gives the change in data. It shows how many square meters have changed within the locations between two consecutive satellite images.
How to interpret the data SpaceKnow indices can be compared with the related economic indicators or KPIs. If the economic indicator is in monthly terms, perform a 30-day rolling sum and pick the last day of the month to compare with the economic indicator. Each data point will reflect approximately the sum of the month. If the economic indicator is in quarterly terms, perform a 90-day rolling sum and pick the last day of the 90-day to compare with the economic indicator. Each data point will reflect approximately the sum of the quarter.
Where the data comes from SpaceKnow brings you the data edge by applying machine learning and AI algorithms to synthetic aperture radar and optical satellite imagery. The company’s infrastructure searches and downloads new imagery every day, and the computations of the data take place within less than 24 hours.
In contrast to traditional economic data, which are released in monthly and quarterly terms, SpaceKnow data is high-frequency and available daily. It is possible to observe the latest movements in the construction industry with just a 4-6 day lag, on average.
The construction materials data help you to estimate the performance of the construction sector and the business activity of the selected companies.
The foundation of delivering high-quality data is based on the success of defining each location to observe and extract the data. All locations are thoroughly researched and validated by an in-house team of annotators and data analysts.
See below how our Construction Materials index performs against the US Non-residential construction spending benchmark
Each individual location is precisely defined to avoid noise in the data, which may arise from traffic or changing vegetation due to seasonal reasons.
SpaceKnow uses radar imagery and its own unique algorithms, so the indices do not lose their significance in bad weather conditions such as rain or heavy clouds.
→ Reach out to get free trial
...
analyze the health and retirement study (hrs) with r the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death d o us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking arou nd on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you. the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked. this new github repository contains five scripts: 1992 - 2010 download HRS microdata.R loop through every year and every file, download, then unzip everything in one big party impor t longitudinal RAND contributed files.R create a SQLite database (.db) on the local disk load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram) longitudinal RAND - analysis examples.R connect to the sql database created by the 'import longitudinal RAND contributed files' program create tw o database-backed complex sample survey object, using a taylor-series linearization design perform a mountain of analysis examples with wave weights from two different points in the panel import example HRS file.R load a fixed-width file using only the sas importation script directly into ram with < a href="http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html">SAScii parse through the IF block at the bottom of the sas importation script, blank out a number of variables save the file as an R data file (.rda) for fast loading later replicate 2002 regression.R connect to the sql database created by the 'import longitudinal RAND contributed files' program create a database-backed complex sample survey object, using a taylor-series linearization design exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document . click here to view these five scripts for more detail about the health and retirement study (hrs), visit: michigan's hrs homepage rand's hrs homepage the hrs wikipedia page a running list of publications using hrs notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you c an think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself. confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
huggingface-projects/contribute-a-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community