Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in La Cañada Flintridge, CA, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.
Key observations
[Chart: La Cañada Flintridge, CA median household income, by household size (in 2022 inflation-adjusted dollars)]
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for La Cañada Flintridge median household income. You can refer to it here.
This table presents income shares, thresholds, tax shares, and total counts of individual Canadian tax filers, with a focus on high-income individuals (95% income threshold, 99% threshold, etc.). Income thresholds are based on national threshold values, regardless of selected geography; for example, the number of Nova Scotians in the top 1% will be calculated as the number of tax-filing Nova Scotians whose total income exceeded the 99% national income threshold. Different definitions of income are available in the table, namely market, total, and after-tax income, both with and without capital gains.
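As a sketch of the convention described above, the following computes a national 99% threshold with Python's `statistics` module and counts provincial filers above it. The income values are purely illustrative; real tax-filer data would come from the table itself.

```python
import statistics

def top_share_count(national_incomes, provincial_incomes, pct=99):
    """Count provincial tax filers above the *national* income threshold.

    Mirrors the table's convention: the threshold is national, while the
    count is for the selected geography (e.g. Nova Scotia).
    """
    # statistics.quantiles with n=100 returns 99 cut points;
    # index pct-1 is the pct-th percentile boundary.
    threshold = statistics.quantiles(national_incomes, n=100)[pct - 1]
    return threshold, sum(1 for x in provincial_incomes if x > threshold)

national = list(range(1, 1001))       # hypothetical national incomes
nova_scotia = [100, 500, 995, 1000]   # hypothetical provincial incomes
threshold, n_top = top_share_count(national, nova_scotia, pct=99)
```

Note that `statistics.quantiles` uses the "exclusive" method by default; other percentile definitions would shift the threshold slightly.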
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.
By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.
Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.
The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!
While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.
The files contained here are a subset of the KernelVersions table in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
The files are organized into a two-level directory structure. Each top level folder contains up to 1 million files, e.g. - folder 123 contains all versions from 123,000,000 to 123,999,999. Each sub folder contains up to 1 thousand files, e.g. - 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
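Under the layout just described, a version id maps to its folder with simple integer arithmetic. The helper below is a sketch: the file extension varies by notebook language (.py, .R, .ipynb), and whether folder names are zero-padded is not stated above, so unpadded numbers are an assumption.

```python
def kernel_version_path(version_id: int) -> str:
    """Map a KernelVersions id to its folder path in Meta Kaggle Code.

    Per the layout described above: the top-level folder holds ids in
    blocks of 1,000,000, and each sub-folder holds blocks of 1,000.
    The returned path has no file extension (it varies by language).
    """
    top = version_id // 1_000_000        # e.g. 123 for 123,456,789
    sub = (version_id // 1_000) % 1_000  # e.g. 456 for 123,456,789
    return f"{top}/{sub}/{version_id}"

path = kernel_version_path(123456789)  # -> "123/456/123456789"
```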
The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays
We love feedback! Let us know in the Discussion tab.
Happy Kaggling!
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wages in China increased to 120,698 CNY/year in 2023 from 114,029 CNY/year in 2022. This dataset provides - China Average Yearly Wages - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main goal of this model is to help me create an app that counts how much money a picture contains.
Descriptions of each class type
I don't separate money by country, and I don't separate front and back faces.
EUR-1-cent
EUR-2-cent
EUR-5-cent
EUR-10-cent
EUR-20-cent
EUR-50-cent
EUR-1-euro
EUR-2-euro
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in Williams Bay, WI, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.
Key observations
[Chart: Williams Bay, WI median household income, by household size (in 2022 inflation-adjusted dollars)]
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Williams Bay median household income. You can refer to it here.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 1 row and is filtered to the book 101 great ways to sew a metre : look how much you can make with just one metre of fabric!. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Welcome to the Consolidated Open Source Global Development Dataset (COSGDD)!
The Consolidated Open Source Global Development Dataset (COSGDD) was created to address the growing need for accessible, consolidated, and diverse global datasets for education, research, and policy-making. By combining data from publicly available, open-source datasets, COSGDD provides a one-stop resource for analyzing key socio-economic, environmental, and governance indicators across the globe.
Streamlit Dashboard Link (The LIME explanation graph will take time to load) - https://cosgdd.streamlit.app/ Github Code Repo Link - https://github.com/AkhilByteWrangler/Consolidated-Open-Source-Global-Development-Dataset
Imagine having a magical map of the world that shows you not just the roads and mountains but also how happy people are, how much money they make, how clean the air is, and how fair their governments are. This dataset is that magical map - but in the form of organized data!
It combines facts and figures from trusted sources to help researchers, governments, companies, and YOU understand how the world works and how to make it better.
The world is complicated. Happiness doesn’t depend on just one thing like money; it’s also about health, fairness, relationships, and even how clean the air is. But these pieces of the puzzle are scattered across many places. This dataset brings everything together in one place, making it easier to:
- Answer big questions like:
- What makes people happy?
- Is wealth or freedom more important for well-being?
- How does urbanization affect happiness?
- Find patterns and trends across countries.
- Make smart decisions based on real-world data.
This dataset is for anyone curious about the world, including:
- Researchers: Study connections between happiness, governance, and sustainability.
- Policy Makers: Design better policies to improve quality of life.
- Data Enthusiasts: Explore trends and patterns using statistics or machine learning.
- Businesses: Understand societal needs to improve Corporate Social Responsibility (CSR).
This dataset consolidates data from well-established sources such as the World Happiness Report, The Economist Democracy Index, environmental databases, and more. It includes engineered features to deepen understanding of well-being and sustainability.
- Life Ladder: Self-reported happiness scores.
- Log GDP per capita: Log-transformed measure of wealth.
- Tax Revenue: Government revenue as a share of GDP.
- Social support: Proportion of people with reliable social networks.
- Freedom to make life choices: Self-reported freedom levels.
- Total Emissions: Aggregated greenhouse gas emissions.
- Renewables Production: Share of renewable energy production.
- Democracy_Index: Quantitative measure of democratic governance.
- Rule_of_Law_Index: Assessment of the legal system's strength.
- Freedom_Index: Combines wealth and freedom.
- Generosity_Per_Dollar: Normalized generosity against GDP.
- Environmental_Bonus: Evaluates environmental efficiency relative to economic output.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Corporate Profits in the United States decreased to 3,191.90 USD billion in the first quarter of 2025 from 3,312 USD billion in the fourth quarter of 2024. This dataset provides the latest reported value for - United States Corporate Profits - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
In the United States, city governments provide many services: they run public school districts, administer certain welfare and health programs, build roads and manage airports, provide police and fire protection, inspect buildings, and often run water and utility systems. Cities also get revenues through certain local taxes, various fees and permit costs, sale of property, and through the fees they charge for the utilities they run.
It would be interesting to compare all these expenses and revenues across cities and over time, but also quite difficult. Cities share many of these service responsibilities with other government agencies: in one particular city, some roads may be maintained by the state government, some law enforcement provided by the county sheriff, some schools run by independent school districts with their own tax revenue, and some utilities run by special independent utility districts. These governmental structures vary greatly by state and by individual city. It would be hard to make a fair comparison without taking into account all these differences.
This dataset takes into account all those differences. The Lincoln Institute of Land Policy produces what they call “Fiscally Standardized Cities” (FiSCs), aggregating all services provided to city residents regardless of how they may be divided up by different government agencies and jurisdictions. Using this, we can study city expenses and revenues, and how the proportions of different costs vary over time.
The dataset tracks over 200 American cities between 1977 and 2020. Each row represents one city for one year. Revenue and expenditures are broken down into more than 120 categories.
Values are available for FiSCs and also for the entities that make it up: the city, the county, independent school districts, and any special districts, such as utility districts. There are hence five versions of each variable, with suffixes indicating the entity. For example, taxes gives the FiSC’s tax revenue, while taxes_city, taxes_cnty, taxes_schl, and taxes_spec break it down for the city, county, school districts, and special districts.
The values are organized hierarchically. For example, taxes is the sum of tax_property (property taxes), tax_sales_general (sales taxes), tax_income (income tax), and tax_other (other taxes). And tax_income is itself the sum of tax_income_indiv (individual income tax) and tax_income_corp (corporate income tax) subcategories.
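A quick way to see the two decompositions (by entity and by category) is to check them on a hypothetical row. The numbers below are invented for illustration and satisfy the sums by construction; whether the FiSC total exactly equals the sum of the four entity versions is inferred from the description above.

```python
import math

# Hypothetical per-capita values (2020 dollars) for one FiSC-year row.
row = {
    "taxes": 2500.0,
    # by entity:
    "taxes_city": 1200.0, "taxes_cnty": 500.0,
    "taxes_schl": 600.0,  "taxes_spec": 200.0,
    # by revenue category:
    "tax_property": 1500.0, "tax_sales_general": 600.0,
    "tax_income": 300.0,    "tax_other": 100.0,
    # income-tax subcategories:
    "tax_income_indiv": 250.0, "tax_income_corp": 50.0,
}

# Entity versions sum to the FiSC total.
assert math.isclose(row["taxes"], row["taxes_city"] + row["taxes_cnty"]
                    + row["taxes_schl"] + row["taxes_spec"])
# Categories sum to the parent variable, at each level of the hierarchy.
assert math.isclose(row["taxes"], row["tax_property"]
                    + row["tax_sales_general"] + row["tax_income"]
                    + row["tax_other"])
assert math.isclose(row["tax_income"],
                    row["tax_income_indiv"] + row["tax_income_corp"])
```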
The revenue and expenses variables are described in this detailed table. Further documentation is available on the FiSC Database website, linked in References below.
All monetary data is already adjusted for inflation, and is given in terms of 2020 US dollars per capita. The Consumer Price Index is provided for each year if you prefer to use numbers not adjusted for inflation, scaled so that 2020 is 1; simply divide each value by the CPI to get the value in that year’s nominal dollars. The total population is also provided if you want total values instead of per-capita values.
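Following the dataset's stated convention (CPI scaled so that 2020 = 1, and dividing by a year's CPI yields that year's nominal dollars), the conversion is a single division. This sketch assumes that convention holds exactly as described:

```python
def to_nominal(value_2020_dollars: float, cpi: float) -> float:
    """Convert an inflation-adjusted per-capita value back to nominal
    dollars, per the dataset's convention (CPI scaled so 2020 = 1)."""
    return value_2020_dollars / cpi

# A per-capita value of $100 (2020 dollars) in a year whose scaled CPI
# is 0.5 becomes 200.0 in that year's nominal dollars under this rule.
nominal = to_nominal(100.0, 0.5)
```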
The National Hydrography Dataset Plus (NHDPlus) maps the lakes, ponds, streams, rivers and other surface waters of the United States. Created by the US EPA Office of Water and the US Geological Survey, the NHDPlus provides mean annual and monthly flow estimates for rivers and streams. Additional attributes provide connections between features, facilitating complicated analyses. For more information on the NHDPlus dataset see the NHDPlus v2 User Guide.

Dataset Summary
- Phenomenon Mapped: Surface waters and related features of the United States and associated territories, not including Alaska.
- Geographic Extent: The United States, not including Alaska, Puerto Rico, Guam, US Virgin Islands, Marshall Islands, Northern Marianas Islands, Palau, Federated States of Micronesia, and American Samoa
- Projection: Web Mercator Auxiliary Sphere
- Visible Scale: Visible at all scales, but the layer draws best at scales larger than 1:1,000,000
- Source: EPA and USGS
- Update Frequency: There is no new data since this 2019 version, so no updates are planned in the future
- Publication Date: March 13, 2019

Prior to publication, the NHDPlus network and non-network flowline feature classes were combined into a single flowline layer. Similarly, the NHDPlus Area and Waterbody feature classes were merged under a single schema. Attribute fields were added to the flowline and waterbody layers to simplify symbology and enhance the layer's pop-ups. Fields added include Pop-up Title, Pop-up Subtitle, On or Off Network (flowlines only), Esri Symbology (waterbodies only), and Feature Code Description. All other attributes are from the original NHDPlus dataset. No-data values -9999 and -9998 were converted to Null values for many of the flowline fields.

What can you do with this layer?
Feature layers work throughout the ArcGIS system. Generally your workflow with feature layers will begin in ArcGIS Online or ArcGIS Pro. Below are just a few of the things you can do with a feature service in Online and Pro.

ArcGIS Online
- Add this layer to a map in the map viewer. The layer is limited to scales of approximately 1:1,000,000 or larger, but a vector tile layer created from the same data can be used at smaller scales to produce a webmap that displays across the full range of scales. The layer or a map containing it can be used in an application.
- Change the layer's transparency and set its visibility range.
- Open the layer's attribute table and make selections. Selections made in the map or table are reflected in the other. Center on selection allows you to zoom to features selected in the map or table, and show selected records allows you to view the selected records in the table.
- Apply filters. For example, you can set a filter to show larger streams and rivers using the mean annual flow attribute or the stream order attribute.
- Change the layer's style and symbology.
- Add labels and set their properties.
- Customize the pop-up.
- Use as an input to the ArcGIS Online analysis tools. This layer works well as a reference layer with the trace downstream and watershed tools. The buffer tool can be used to draw protective boundaries around streams, and the extract data tool can be used to create copies of portions of the data.

ArcGIS Pro
- Add this layer to a 2D or 3D map.
- Use as an input to geoprocessing. For example, copy features allows you to select then export portions of the data to a new feature class.
- Change the symbology and the attribute field used to symbolize the data.
- Open the table and make interactive selections with the map.
- Modify the pop-ups.
- Apply definition queries to create sub-sets of the layer.

This layer is part of the ArcGIS Living Atlas of the World, which provides an easy way to explore the landscape layers and many other beautiful and authoritative maps on hundreds of topics.

Questions?
Please leave a comment below if you have a question about this layer, and we will get back to you as soon as possible.
https://www.icpsr.umich.edu/web/ICPSR/studies/36498/terms
The Population Assessment of Tobacco and Health (PATH) Study originally surveyed 45,971 adult and youth respondents. The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. 
This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.Dataset 0001 (DS0001) contains the data from the Master Linkage file. This file contains 14 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were for each wave. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Public-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.Dataset 1001 (DS1001) contains the data from the Wave 1 Adult Questionnaire. This data file contains 1,732 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1002 (DS1002) contains the data from the Youth and Parent Questionnaire. This file contains 1,228 variables and 13,651 cases.Dataset 2001 (DS2001) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,197 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire. Dataset 2002 (DS2002) contains the data from the Wave 2 Youth and Parent Questionnaire. 
This data file contains 1,389 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth." Dataset 3001 (DS3001) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,139 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire. Dataset 3002 (DS3002) contains the data from t
https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Unit (PSU)s and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. 
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study. Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases. Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS) code, state abbreviation, and full name of the state). 
The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment. Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respon
The Willamette Lowland basin-fill aquifers (hereinafter referred to as the Willamette aquifer) are located in Oregon and southern Washington. The aquifer is composed of unconsolidated deposits of sand and gravel, which are interlayered with clay units. The aquifer thickness varies from less than 100 feet to 800 feet. The aquifer is underlain by basaltic rock. Cities such as Portland, Oregon, depend on the aquifer for public and industrial use (HA 730-H). This product provides source data for the Willamette aquifer framework, including: Georeferenced images: 1. i_08WLMLWD_bot.tif: Georeferenced figure of altitude contour lines representing the bottom of the Willamette aquifer. The original figure was from Professional Paper 1424-A, Plate 2 (1424-A-P2). The contour lines from this figure were digitized to make the file c_08WLMLWD_bot.shp, and the fault lines were digitized to make f_08WLMLWD_bot.shp. Extent shapefiles: 1. p_08WLMLWD.shp: Polygon shapefile containing the areal extent of the Willamette aquifer (Willamette_AqExtent). The original shapefile was modified to create the shapefile included in this data release. It was modified to only include the Willamette Lowland portion of the aquifer. The extent file contains no aquifer subunits. Contour line shapefiles: 1. c_08WLMLWD_bot.shp: Contour line dataset containing altitude values, in feet, referenced to the National Geodetic Vertical Datum of 1929 (NGVD29), across the bottom of the Willamette aquifer. These data were used to create the ra_08WLMLWD_bot.tif raster dataset. Fault line shapefiles: 1. f_08WLMLWD_bot.shp: Fault line dataset containing fault lines across the bottom of the Willamette aquifer. These data were not used in raster creation but were included as supplementary information. Altitude raster files: 1. ra_08WLMLWD_top.tif: Altitude raster dataset of the top of the Willamette aquifer. The altitude values are in meters referenced to the North American Vertical Datum of 1988 (NAVD88). The top of the aquifer is assumed to be land surface based on available data and was interpolated from the digital elevation model (DEM) dataset (NED, 100-meter). 2. ra_08WLMLWD_bot.tif: Altitude raster dataset of the bottom of the Willamette aquifer. The altitude values are in meters referenced to NAVD88. This raster was interpolated from the c_08WLMLWD_bot.shp dataset. Depth raster files: 1. rd_08WLMLWD_top.tif: Depth raster dataset of the top of the Willamette aquifer. The depth values are in meters below land surface (NED, 100-meter). The top of the aquifer is assumed to be land surface based on available data. 2. rd_08WLMLWD_bot.tif: Depth raster dataset of the bottom of the Willamette aquifer. The depth values are in meters below land surface (NED, 100-meter).
On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.
Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.
There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.
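The grouping rule above can be sketched as a small function. How the dataset labels a July-June fiscal year is not stated here, so labeling it by the calendar year in which the fiscal year ends is an assumption of this sketch:

```python
from datetime import date

def report_year_groups(period_end: date):
    """Assign a hospital report to its calendar-year and fiscal-year group.

    Calendar year: the year the report period ends in.
    Fiscal year (July-June): labeled by the calendar year in which the
    fiscal year ends -- an assumed convention, not stated in the source.
    """
    calendar_year = period_end.year
    fiscal_year = period_end.year + 1 if period_end.month >= 7 else period_end.year
    return calendar_year, fiscal_year

# A report ending August 2023 falls in calendar year 2023 but fiscal
# year 2024 (July 2023 - June 2024) under this labeling.
groups = report_year_groups(date(2023, 8, 31))
```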
Open Data Commons Attribution License (ODC-By) v1.0https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
This dataset contains product prices from Amazon USA, with a focus on price prediction. With a good amount of data on what price points sell the most, you can train machine learning models to predict the optimal price for a product based on its features and product name.
If you find this dataset useful, make sure to show your appreciation by upvoting! ❤️✨
This dataset is a superset of my Amazon USA product price dataset. Another inspiration is this competition, which awarded $100K in prize money.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Business roles at AgroStar require a baseline of analytical skills, and it is also critical that we are able to explain complex concepts in a simple way to a variety of audiences. This test is structured so that someone with the baseline skills needed to succeed in the role should be able to complete this in under 4 hours without assistance.
Use the data in the included sheet to address the following scenario...
Since its inception, AgroStar has been leveraging an assisted marketplace model. Given that the market potential is huge and that the target customer appreciates a physical store nearby, we have taken a call to explore the offline retail model to drive growth. The primary objective is to get a larger wallet share for AgroStar among existing customers.
Assume you are back in time, in August 2018, and you have been asked to determine the location (taluka) of the first AgroStar offline retail store. 1. What are the key factors you would use to determine the location? Why? 2. Which taluka (across the three states) would you look to open in? Why?
-- (1) Please mention any assumptions you have made and the underlying thought process
-- (2) Please treat the assignment as standalone (it should be self-explanatory to someone who reads it), but we will have a follow-up discussion with you in which we will walk through your approach to this assignment.
-- (3) Mention any data that may be missing that would make this study more meaningful
-- (4) Kindly conduct your analysis within the spreadsheet; we would like to see the working sheet. If you face any issues due to the file size, kindly download the file and share an Excel sheet with us
-- (5) If you would like to append a word document/presentation to summarize, please go ahead.
-- (6) In case you use any external data source/article, kindly share the source.
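To illustrate one possible approach to the location question (not a prescribed solution), candidate talukas could be ranked with a weighted score over normalized factors. The talukas, factor names, weights, and numbers below are purely illustrative assumptions; a real analysis would derive them from the provided sheet:

```python
# Illustrative weighted-scoring sketch for ranking candidate talukas.
# All names, factors, weights, and values here are invented for
# demonstration only.
talukas = {
    "Taluka A": {"active_customers": 1200, "avg_order_value": 950,  "competitors": 3},
    "Taluka B": {"active_customers": 800,  "avg_order_value": 1400, "competitors": 1},
    "Taluka C": {"active_customers": 1000, "avg_order_value": 1200, "competitors": 2},
}
# A negative weight penalizes competition.
weights = {"active_customers": 0.5, "avg_order_value": 0.4, "competitors": -0.1}

def min_max(values):
    """Scale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

names = list(talukas)
scores = dict.fromkeys(names, 0.0)
for factor, w in weights.items():
    for name, norm in zip(names, min_max([talukas[n][factor] for n in names])):
        scores[name] += w * norm

best = max(scores, key=scores.get)
print(best)  # Taluka C scores highest under these illustrative weights
```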
The file CDNOW_master.txt contains the entire purchase history up to the end of June 1998 of the cohort of 23,570 individuals who made their first-ever purchase at CDNOW in the first quarter of 1997. This CDNOW dataset was first used by Fader and Hardie (2001).
Each record in this file, 69,659 in total, comprises four fields: the customer's ID, the date of the transaction, the number of CDs purchased, and the dollar value of the transaction.
CustID = CDNOW_master(:,1); % customer id
Date   = CDNOW_master(:,2); % transaction date
Quant  = CDNOW_master(:,3); % number of CDs purchased
Spend  = CDNOW_master(:,4); % dollar value (excl. S&H)
See "Notes on the CDNOW Master Data Set" (http://brucehardie.com/notes/026/) for details of how the 1/10th systematic sample (http://brucehardie.com/datasets/CDNOW_sample.zip) used in many papers was created.
Reference:
Fader, Peter S. and Bruce G. S. Hardie (2001), "Forecasting Repeat Sales at CDNOW: A Case Study," Interfaces, 31 (May-June), Part 2 of 2, S94-S107.
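The same four fields can be read in Python as well (a sketch using pandas rather than the MATLAB snippet above); the two sample rows here are invented for illustration, assuming the whitespace-delimited layout described above:

```python
import io
import pandas as pd

# Minimal sketch of parsing the four whitespace-delimited fields
# (customer ID, transaction date, CDs purchased, dollar value).
# The two rows below are made up for illustration.
sample = io.StringIO(
    "00001 19970101 1 11.77\n"
    "00001 19970118 2 28.27\n"
)
df = pd.read_csv(sample, sep=r"\s+",
                 names=["cust_id", "date", "quantity", "spend"])
df["date"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d")
print(df)
```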
I have merged all three datasets into one file and also performed some feature engineering.
Available Data: You will be given anonymized user gameplay data in the form of 3 csv files.
Fields in the data are as described below:
Gameplay_Data.csv contains the following fields:
* Uid: Alphanumeric unique Id assigned to user
* Eventtime: DateTime on which user played the tournament
* Entry_Fee: Entry Fee of tournament
* Win_Loss: ‘W’ if the user won that particular tournament, ‘L’ otherwise
* Winnings: How much money the user won in the tournament (0 for ‘L’)
* Tournament_Type: Type of tournament user played (A / B / C / D)
* Num_Players: Number of players that played in this tournament
Wallet_Balance.csv contains the following fields:
* Uid: Alphanumeric unique Id assigned to user
* Timestamp: DateTime at which user’s wallet balance is given
* Wallet_Balance: User’s wallet balance at the given timestamp
Demographic.csv contains the following fields:
* Uid: Alphanumeric unique Id assigned to user
* Installed_At: Timestamp at which user installed the app
* Connection_Type: User’s internet connection type (Ex: Cellular / Dial Up)
* Cpu_Type: CPU type of the device that the user is playing with
* Network_Type: Network type in encoded form
* Device_Manufacturer: Ex: Realme
* ISP: Internet Service Provider. Ex: Airtel
* Country
* Country_Subdivision
* City
* Postal_Code
* Language: Language that the user has selected for gameplay
* Device_Name
* Device_Type
Build a basic recommendation system that is able to rank/recommend relevant tournaments and entry prices to the user. The main objectives are: 1. A user should not have to scroll too much before selecting a tournament of their preference. 2. We would like the user to play as high an entry fee tournament as possible.
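A minimal popularity baseline consistent with the two objectives might look like the sketch below: rank the (tournament type, entry fee) combinations a user has played most often, breaking ties toward the higher entry fee. The rows are invented, mirroring the Gameplay_Data.csv fields above:

```python
from collections import Counter

# Popularity baseline: count each (tournament_type, entry_fee) pair the
# user has played; sort by count desc, then entry fee desc (objective 2).
# The history rows below are invented sample data.
history = [
    {"uid": "u1", "tournament_type": "A", "entry_fee": 10},
    {"uid": "u1", "tournament_type": "A", "entry_fee": 10},
    {"uid": "u1", "tournament_type": "B", "entry_fee": 50},
    {"uid": "u1", "tournament_type": "B", "entry_fee": 50},
    {"uid": "u1", "tournament_type": "C", "entry_fee": 50},
]

def recommend(uid, rows, k=3):
    counts = Counter((r["tournament_type"], r["entry_fee"])
                     for r in rows if r["uid"] == uid)
    # Sort by play count (desc), then entry fee (desc).
    return sorted(counts, key=lambda t: (counts[t], t[1]), reverse=True)[:k]

print(recommend("u1", history))  # [('B', 50), ('A', 10), ('C', 50)]
```

A real system would add fallbacks for new users (e.g., globally popular tournaments) and could learn per-user entry-fee affinity from Wallet_Balance.csv and Demographic.csv.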
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: In these datasets, a person is defined as up to date if they have received at least one dose of an updated COVID-19 vaccine. The Centers for Disease Control and Prevention (CDC) recommends that certain groups, including adults ages 65 years and older, receive additional doses.
Starting on July 13, 2022, the denominator for calculating vaccine coverage has been changed from age 5+ to all ages to reflect new vaccine eligibility criteria. Previously the denominator was changed from age 16+ to age 12+ on May 18, 2021, then changed from age 12+ to age 5+ on November 10, 2021, to reflect previous changes in vaccine eligibility criteria. The previous datasets based on age 12+ and age 5+ denominators have been uploaded as archived tables.
Starting June 30, 2021, the dataset has been reconfigured so that all updates are appended to one dataset to make it easier for API and other interfaces. In addition, historical data has been extended back to January 5, 2021.
This dataset shows full, partial, and at least 1 dose coverage rates by zip code tabulation area (ZCTA) for the state of California. Data sources include the California Immunization Registry and the American Community Survey’s 2015-2019 5-Year data.
This is the data table for the LHJ Vaccine Equity Performance dashboard. However, this data table also includes ZCTAs that do not have a VEM score.
This dataset also includes Vaccine Equity Metric score quartiles (when applicable), which combine the Public Health Alliance of Southern California’s Healthy Places Index (HPI) measure with CDPH-derived scores to estimate factors that impact health, such as income, education, and access to health care. ZCTAs range from less healthy community conditions in Quartile 1 to more healthy community conditions in Quartile 4.
The Vaccine Equity Metric is for weekly vaccination allocation and reporting purposes only. CDPH-derived quartiles should not be considered as indicative of the HPI score for these zip codes. CDPH-derived quartiles were assigned to zip codes excluded from the HPI score produced by the Public Health Alliance of Southern California due to concerns with statistical reliability and validity in populations smaller than 1,500 or where more than 50% of the population resides in a group setting.
These data do not include doses administered by the following federal agencies who received vaccine allocated directly from CDC: Indian Health Service, Veterans Health Administration, Department of Defense, and the Federal Bureau of Prisons.
For some ZCTAs, vaccination coverage may exceed 100%. This can occur when many people from outside the county come to that ZCTA to be vaccinated and providers report the county of administration as the county of residence, and/or when the DOF estimate of the ZCTA's population is too low. Please note that population numbers provided by DOF are projections and may not be accurate, especially given unprecedented shifts in population as a result of the pandemic.
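The coverage calculation and the over-100% flag described above can be sketched as follows; the ZCTA codes and counts are illustrative, not actual data:

```python
# Coverage sketch: doses among residents divided by the DOF population
# estimate, flagging ZCTAs whose reported coverage exceeds 100%.
# All numbers below are invented for illustration.
zctas = {
    "90001": {"at_least_one_dose": 9500, "population": 10000},
    "90002": {"at_least_one_dose": 5300, "population": 5000},
}
coverage = {z: d["at_least_one_dose"] / d["population"] for z, d in zctas.items()}
over_100 = [z for z, c in coverage.items() if c > 1.0]
print(coverage)  # 90002 exceeds 100%
print(over_100)
```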
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset used is the OASIS MRI dataset (https://sites.wustl.edu/oasisbrains/), which consists of 80,000 brain MRI images. The images have been divided into four classes based on Alzheimer's progression. The dataset aims to provide a valuable resource for analyzing and detecting early signs of Alzheimer's disease.
To make the dataset accessible, the original .img and .hdr files were converted into NIfTI format (.nii) using FSL (FMRIB Software Library). The converted MRI images of 461 patients have been uploaded to a GitHub repository, which can be accessed in multiple parts.
For the neural network training, 2D images were used as input. The brain images were sliced along the z-axis into 256 pieces, and slices ranging from 100 to 160 were selected from each patient. This approach resulted in a comprehensive dataset for analysis.
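The slicing step can be sketched with numpy, using a random array to stand in for a scan loaded from a .nii file (e.g., via nibabel); the in-plane dimensions below are illustrative, and slices 100-159 give 60 slices per patient:

```python
import numpy as np

# Sketch of the slicing step described above: each volume has 256
# slices along the z-axis, and slices in the 100-160 range are kept.
# The random array stands in for a loaded MRI scan; the 176x208
# in-plane size is an illustrative assumption.
rng = np.random.default_rng(0)
volume = rng.random((176, 208, 256))  # (x, y, z), with 256 z slices

selected = [volume[:, :, z] for z in range(100, 160)]
print(len(selected), selected[0].shape)  # 60 (176, 208)
```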
Patient classification was performed based on the provided metadata and Clinical Dementia Rating (CDR) values, resulting in four classes: demented, very mild demented, mild demented, and non-demented. These classes enable the detection and study of different stages of Alzheimer's disease progression.
During the dataset preparation, the .nii MRI scans were converted to .jpg files. Although this conversion presented some challenges, the files were successfully processed using appropriate tools. The resulting dataset size is 1.3 GB.
With this comprehensive dataset, the project aims to explore various neural network models and achieve optimal results in Alzheimer's disease detection and analysis.
Acknowledgments: “Data were provided by OASIS-1: Cross-Sectional: Principal Investigators: D. Marcus, R. Buckner, J. Csernansky, J. Morris; P50 AG05681, P01 AG03991, P01 AG026276, R01 AG021910, P20 MH071616, U24 RR021382”
Citation: OASIS-1: Cross-Sectional: https://doi.org/10.1162/jocn.2007.19.9.1498
If you are looking for a processed NIfTI image version of this dataset, please click here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books and has 1 row. It is filtered where the book is Ascendance of a bookworm : I'll do anything to become a librarian. Part 1, If there aren't any books, I'll just have to make some! Volume 1. It features 7 columns including book, author, publication date, language, and book publisher. The preview is ordered by publication date (descending).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in La Cañada Flintridge, CA, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.
Key observations
https://i.neilsberg.com/ch/la-canada-flintridge-ca-median-household-income-by-household-size.jpeg" alt="La Cañada Flintridge, CA median household income, by household size (in 2022 inflation-adjusted dollars)">
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for La Cañada Flintridge median household income. You can refer to the same here