Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in La Cañada Flintridge, CA, as reported by the U.S. Census Bureau. It highlights how median household income varies with household size, offering insight into economic trends and disparities across household sizes and aiding data analysis and decision-making.
Key observations
Chart: La Cañada Flintridge, CA median household income, by household size (in 2022 inflation-adjusted dollars) (image: https://i.neilsberg.com/ch/la-canada-flintridge-ca-median-household-income-by-household-size.jpeg)
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
The data in this dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for La Cañada Flintridge median household income, which you can refer to here.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in Williams Bay, WI, as reported by the U.S. Census Bureau. It highlights how median household income varies with household size, offering insight into economic trends and disparities across household sizes and aiding data analysis and decision-making.
Key observations
Chart: Williams Bay, WI median household income, by household size (in 2022 inflation-adjusted dollars) (image: https://i.neilsberg.com/ch/williams-bay-wi-median-household-income-by-household-size.jpeg)
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
The data in this dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Williams Bay median household income, which you can refer to here.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Real Median Personal Income in the United States (MEPAINUSA672N) from 1974 to 2023 about personal income, personal, median, income, real, and USA.
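For readers who want the series programmatically rather than via the FRED graph page, here is a minimal sketch using the pandas-datareader package (an assumption on my part; any FRED client would work), with the series ID taken from the description above and an illustrative date range:

```python
# Minimal sketch: fetch Real Median Personal Income (MEPAINUSA672N) from FRED.
# Requires the pandas-datareader package; the date range below is illustrative.
from datetime import datetime

import pandas_datareader.data as web

series = web.DataReader(
    "MEPAINUSA672N", "fred",
    start=datetime(1974, 1, 1),
    end=datetime(2023, 12, 31),
)
print(series.tail())
```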
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wages in China increased to 120,698 CNY/year in 2023 from 114,029 CNY/year in 2022. This dataset provides China Average Yearly Wages: actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median income data over a decade or more for males and females categorized by Total, Full-Time Year-Round (FT), and Part-Time (PT) employment in Greensboro. It showcases annual income, providing insights into gender-specific income distributions and the disparities between full-time and part-time work. The dataset can be utilized to gain insights into gender-based pay disparity trends and explore the variations in income for male and female individuals.
Key observations: Insights from 2022
Based on our analysis of ACS 2022 1-Year Estimates, we present the following observations: - All workers, aged 15 years and older: In Greensboro, the median income for all workers aged 15 years and older, regardless of work hours, was $37,291 for males and $26,937 for females.
These income figures indicate a substantial gender-based pay disparity: a gap of approximately 28% between the median incomes of males and females in Greensboro. With women, regardless of work hours, earning 72 cents for each dollar earned by men, this income disparity reveals a concerning trend toward wage inequality that demands attention in the city of Greensboro.
- Full-time workers, aged 15 years and older: In Greensboro, among full-time, year-round workers aged 15 years and older, males earned a median income of $53,807, while females earned $41,696, a 23% gender pay gap among full-time workers. This means women earn 77 cents for each dollar earned by men in full-time roles: a substantial income disparity persists even for women working full-time. Notably, across all roles, including non-full-time employment, women faced a similar gender pay gap percentage, indicating a consistent income pattern irrespective of employment type in Greensboro. The arithmetic behind these percentages is sketched below.
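For reference, the quoted gap percentages follow directly from the reported medians; a minimal sketch of the arithmetic:

```python
# Reproduce the quoted gender pay gap percentages from the reported medians.
def pay_gap(male_median: float, female_median: float) -> float:
    """Return the gap as a percentage of the male median."""
    return (1 - female_median / male_median) * 100

# All workers, aged 15+: $37,291 (male) vs $26,937 (female) -> ~28% gap (72 cents per dollar)
print(round(pay_gap(37_291, 26_937)))  # 28
# Full-time, year-round workers: $53,807 vs $41,696 -> ~23% gap (77 cents per dollar)
print(round(pay_gap(53_807, 41_696)))  # 23
```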
Chart: Greensboro, NC gender-based income disparity (image: https://i.neilsberg.com/ch/greensboro-nc-income-by-gender.jpeg)
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. All incomes have been adjusted for inflation and are presented in 2022 inflation-adjusted dollars.
Gender classifications include:
Employment type classifications include:
Variables / Data Columns
Good to know
Margin of Error
The data in this dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Greensboro median household income by gender, which you can refer to here.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0-licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.
By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.
Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.
The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!
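As a rough illustration of that join, here is a minimal pandas sketch. It assumes the Meta Kaggle KernelVersions.csv has been downloaded next to this dataset; the only thing taken from the description above is that the code file names match the KernelVersions ids, so verify column names against the actual CSV header.

```python
import pandas as pd

# Assumed file name from the Meta Kaggle dataset; verify locally.
kernel_versions = pd.read_csv("KernelVersions.csv")

# A Meta Kaggle Code file such as 123/456/123456789.ipynb (hypothetical id)
# is keyed by its KernelVersions id, so its metadata row can be looked up directly.
# "Id" is the assumed id column name; check the CSV header before relying on it.
code_file_id = 123456789
metadata = kernel_versions[kernel_versions["Id"] == code_file_id]
print(metadata)
```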
While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.
The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files, e.g. folder 123 contains all versions from 123,000,000 to 123,999,999. Each sub-folder contains up to 1 thousand files, e.g. 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions. A sketch of this id-to-path mapping is shown below.
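Based on the layout just described, here is a small sketch of how a version id maps to its two-level folder. Whether folder names are zero-padded and which file extension a given version uses are details to verify against the actual files, so treat this as an approximation.

```python
def kernel_version_dir(version_id: int) -> str:
    """Map a KernelVersions id to its two-level folder per the layout above.

    e.g. id 123,456,789 lives under folder 123 (the millions) and
    sub-folder 456 (the thousands within that million).
    """
    top = version_id // 1_000_000
    sub = (version_id // 1_000) % 1_000
    return f"{top}/{sub}"


print(kernel_version_dir(123_456_789))  # -> "123/456"
```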
The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays
We love feedback! Let us know in the Discussion tab.
Happy Kaggling!
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median income data over a decade or more for males and females categorized by Total, Full-Time Year-Round (FT), and Part-Time (PT) employment in Loudoun County. It showcases annual income, providing insights into gender-specific income distributions and the disparities between full-time and part-time work. The dataset can be utilized to gain insights into gender-based pay disparity trends and explore the variations in income for male and female individuals.
Key observations: Insights from 2022
Based on our analysis of ACS 2022 1-Year Estimates, we present the following observations: - All workers, aged 15 years and older: In Loudoun County, the median income for all workers aged 15 years and older, regardless of work hours, was $96,408 for males and $50,183 for females.
These income figures highlight a substantial gender-based income gap in Loudoun County. Women, regardless of work hours, earn 52 cents for each dollar earned by men. This significant gender pay gap of approximately 48% underscores concerning gender-based income inequality in Loudoun County.
- Full-time workers, aged 15 years and older: In Loudoun County, among full-time, year-round workers aged 15 years and older, males earned a median income of $124,133, while females earned $87,582, a 29% gender pay gap among full-time workers. This means women earn 71 cents for each dollar earned by men in full-time roles: a substantial income disparity persists even among full-time workers. Notably, the gender pay gap across all roles, including non-full-time employment, was larger than the full-time gap. This suggests that full-time employment offers a more equitable income scenario for women than other employment patterns in Loudoun County.
Chart: Loudoun County, VA gender-based income disparity (image: https://i.neilsberg.com/ch/loudoun-county-va-income-by-gender.jpeg)
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. All incomes have been adjusted for inflation and are presented in 2022 inflation-adjusted dollars.
Gender classifications include:
Employment type classifications include:
Variables / Data Columns
Good to know
Margin of Error
The data in this dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Loudoun County median household income by gender, which you can refer to here.
These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures in each week by subtracting off the median exposure amount for a given week and dividing by the interquartile range (IQR), as in the actual application to the true NC birth records data. The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

This dataset is not publicly accessible because EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means:

File format: R workspace file; "Simulated_Dataset.RData".

Metadata (including data dictionary):
• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of "true" critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code Abstract: We provide R statistical software code ("CWVS_LMC.txt") to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code ("Results_Summary.txt") to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description:
• "CWVS_LMC.txt": This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the "Simulated_Dataset.RData" workspace has been loaded into R, the code in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
• "Results_Summary.txt": This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the "CWVS_LMC.txt" code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Required R packages:
• For running "CWVS_LMC.txt": msm (sampling from the truncated normal distribution), mnormt (sampling from the multivariate normal distribution), BayesLogit (sampling from the Polya-Gamma distribution)
• For running "Results_Summary.txt": plotrix (plotting the posterior means and credible intervals)

Instructions for Use / Reproducibility
What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 of the presented simulation study. Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:
• Load the "Simulated_Dataset.RData" workspace
• Run the code contained in "CWVS_LMC.txt"
• Once the "CWVS_LMC.txt" code is complete, run "Results_Summary.txt"

Data: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This also allows the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, Oxford, UK, 1-30, (2019).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main goal of this model is to help me create an app that counts how much money a picture contains.
Descriptions of each class type
I don't separate coins by issuing country, and I don't separate front and back faces.
EUR-1-cent
EUR-2-cent
EUR-5-cent
EUR-10-cent
EUR-20-cent
EUR-50-cent
EUR-1-euro
EUR-2-euro
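Since the stated goal is to total up the money visible in a picture, here is a minimal sketch of that final step given the class list above; the detection step itself (whatever model produces the class labels) is assumed.

```python
# Map each detection class to its value in euros and total up a picture.
COIN_VALUES = {
    "EUR-1-cent": 0.01,
    "EUR-2-cent": 0.02,
    "EUR-5-cent": 0.05,
    "EUR-10-cent": 0.10,
    "EUR-20-cent": 0.20,
    "EUR-50-cent": 0.50,
    "EUR-1-euro": 1.00,
    "EUR-2-euro": 2.00,
}


def total_money(detected_classes: list[str]) -> float:
    """Sum the value of all coin classes detected in one picture."""
    return round(sum(COIN_VALUES[c] for c in detected_classes), 2)


# e.g. a picture with one 2-euro coin, one 50-cent coin and two 10-cent coins
print(total_money(["EUR-2-euro", "EUR-50-cent", "EUR-10-cent", "EUR-10-cent"]))  # 2.7
```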
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains information about high school students and their actual and predicted performance on an exam. Most of the information, including general information about the students and their grade on an exam, was based on an already existing dataset, while the predicted exam performance was based on a human experiment. In this experiment, participants were shown short descriptions of the students (based on the information in the original data) and had to rank and grade them according to their expected performance. Prior to this task, some participants were exposed to a "Stereotype Activation" manipulation suggesting that boys perform less well in school than girls.
Based on this dataset (which is also available on kaggle), we extracted a number of student profiles that participants had to make grade predictions for. For more information about this dataset we refer to the corresponding kaggle page: https://www.kaggle.com/datasets/uciml/student-alcohol-consumption
Note that we performed some preprocessing on the original data (a pandas sketch of these steps follows the list):
The original data consisted of two parts: the information about students following a Maths course and the information about students following a Portuguese course. Since in both datasets the same type of information was recorded, we merged both datasets and added a column "subject", to show which course each student belongs to
We excluded all data where G3 = 0 (i.e. the grade for the last exam = 0)
From original_data.csv we randomly sampled 856 students that participants in our study had to make grade predictions for.
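A minimal pandas sketch of the preprocessing steps listed above; the input file names and the ';' separator follow the original UCI/Kaggle student datasets, so treat them as assumptions to verify, and the random seed is arbitrary.

```python
import pandas as pd

# Assumed file names/separator from the original Kaggle/UCI release -- verify locally.
maths = pd.read_csv("student-mat.csv", sep=";")
portuguese = pd.read_csv("student-por.csv", sep=";")

# 1) Merge the two courses and record which course each student belongs to.
maths["subject"] = "Maths"
portuguese["subject"] = "Portuguese"
students = pd.concat([maths, portuguese], ignore_index=True)

# 2) Exclude rows where the final exam grade G3 is 0.
students = students[students["G3"] != 0].reset_index(drop=True)

# 3) Randomly sample 856 students for the grade-prediction experiment.
sampled = students.sample(n=856, random_state=0)
print(len(sampled))
```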
index - this column corresponds to the indices in the file "original_data.csv". Through these indices, it is possible to add columns from the original data to the dataset with the grade predictions
ParticipantID - the ID of the participant who made the performance predictions for the corresponding student. Predictions needed to be made for 856 students, and each participant made 8 predictions total. Thus there are 107 different participant IDs
name - to make the prediction task more engaging for participants, each of the 8 student profiles that participants had to grade and rank was randomly matched to one of four boys' or girls' names (depending on the sex of the student)
sex - the sex of each student, either female (F) or male (M). For benchmarking fair ML algorithms, this can be used as the sensitive attribute. We assume that in the fair version of the decision variable ("Pass"), no sex discrimination occurs. The biased versions of the variable ("Predicted Pass") are mostly discriminatory towards male students.
studytime - this variable is taken from the original dataset and denotes how long a student studied for their exam. In the original data this variable consisted of four levels (less than 2 hours vs. 2-5 hours vs. 5-10 hours vs. more than 10 hours). We binned the latter two levels together and encoded this column numerically from 1-3.
freetime - Originally, this variable ranged from 1 (very low) to 5 (very high). We binned this variable into three categories, where level 1 and 2 are binned, as well as level 4 and 5.
romantic - Binary variable, denoting whether the student is in a romantic relationship or not.
Walc - This variable shows how much alcohol each student consumes in the weekend. Originally it ranged from 1 to 5 (5 corresponding to the highest alcohol consumption), but we binned the last two levels together.
goout - This variable shows how often a student goes out in a week. Originally it ranged from 1 to 5 (5 corresponding to going out very often), but we binned the last two levels together.
Parents_edu - This variable was not present in the original dataset. Instead, the original dataset contained two variables, "mum_edu" and "dad_edu". We obtained "Parents_edu" by taking the higher of the two. The variable consists of 4 levels, where 4 = highest level of education.
absences - This variable shows the number of absences per student. Originally it ranged from 0 to 93, but because large numbers of absences were infrequent, we binned all absences of >= 7 into one level.
reason - The reason why a student chose to go to the school in question. The levels are: close to home, school's reputation, school's curriculum, and other
G3 - The actual grade each student received for the final exam of the course, ranging from 0-20.
Pass - A binary variable showing whether G3 is a passing grade (i.e. >=10) or not.
Predicted Grade - The grade the student was predicted to receive in our experiment
Predicted Rank - In our ex...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 1 row and is filtered where the book is "101 great ways to sew a metre : look how much you can make with just one metre of fabric!". It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This data is used for a broadband mapping initiative conducted by the Washington State Broadband Office. This dataset provides global fixed broadband and mobile (cellular) network performance metrics in zoom level 16 Web Mercator tiles (approximately 610.8 meters by 610.8 meters at the equator). Data is projected in EPSG:4326. Download speed, upload speed, and latency are collected via the Speedtest by Ookla applications for Android and iOS and averaged for each tile. Measurements are filtered to results containing GPS-quality location accuracy. The data was processed and published to ArcGIS Living Atlas by Esri.

About
Speedtest data is used today by commercial fixed and mobile network operators around the world to inform network buildout, improve global Internet quality, and increase Internet accessibility. Government regulators such as the United States Federal Communications Commission and the Malaysian Communications and Multimedia Commission use Speedtest data to hold telecommunications entities accountable and direct funds for rural and urban connectivity development. Ookla licenses data to NGOs and educational institutions to fulfill its mission: to help make the internet better, faster and more accessible for everyone. Ookla hopes to further this mission by distributing the data to make it easier for individuals and organizations to use it for the purposes of bridging the social and economic gaps between those with and without modern Internet access.

Data
Hundreds of millions of Speedtests are taken on the Ookla platform each month. In order to create a manageable dataset, we aggregate raw data into tiles. The size of a data tile is defined as a function of "zoom level" (or "z"). At z=0, the size of a tile is the size of the whole world. At z=1, the tile is split in half vertically and horizontally, creating 4 tiles that cover the globe. This tile-splitting continues as zoom level increases, causing tiles to become exponentially smaller as we zoom into a given region. By this definition, tile sizes are actually some fraction of the width/height of Earth according to the Web Mercator projection (EPSG:3857). As such, tile size varies slightly depending on latitude, but tile sizes can be estimated in meters. For the purposes of these layers, a zoom level of 16 (z=16) is used for the tiling. This equates to a tile that is approximately 610.8 meters by 610.8 meters at the equator (18 arcsecond blocks). The geometry of each tile is represented in WGS 84 (EPSG:4326) in the tile field. The data can be found at: https://github.com/teamookla/ookla-open-data

Update Cadence
The tile aggregates start in Q1 2019 and go through the most recent quarter. They will be updated shortly after the conclusion of the quarter.

Esri Processing
This layer is a best-available aggregation of the original Ookla dataset. This means that for each tile for which data is available, the most recent data is used. So, for instance, if data is available for a tile for Q2 2019 and for Q4 2020, the Q4 2020 data is awarded to the tile. The default visualization for the layer is the "broadband index". The broadband index is a bivariate index based on both the average download speed and the average upload speed. For Mobile, the score is indexed to a standard of 25 megabits per second (Mbps) download and 3 Mbps upload. A tile with average Speedtest results of 25/3 Mbps is awarded 100 points. Tiles with average speeds above 25/3 are shown in green; tiles with average speeds below this are shown in fuchsia. For Fixed, the score is indexed to a standard of 100 Mbps download and 20 Mbps upload. A tile with average Speedtest results of 100/20 Mbps is awarded 100 points. Tiles with average speeds above 100/20 are shown in green; tiles with average speeds below this are shown in fuchsia.

Tile Attributes
Each tile contains the following adjoining attributes:
The year and the quarter that the tests were performed.
The average download speed of all tests performed in the tile, represented in megabits per second.
The average upload speed of all tests performed in the tile, represented in megabits per second.
The average latency of all tests performed in the tile, represented in milliseconds.
The number of tests taken in the tile.
The number of unique devices contributing tests in the tile.
The quadkey representing the tile.

Quadkeys
Quadkeys can act as a unique identifier for the tile. This can be useful for joining data spatially from multiple periods (quarters), creating coarser spatial aggregations without using geospatial functions, spatial indexing, partitioning, and an alternative for storing and deriving the tile geometry.

Layers
There are two layers:
Ookla_Mobile_Tiles - Tiles containing tests taken from mobile devices with GPS-quality location and a cellular connection type (e.g. 4G LTE, 5G NR).
Ookla_Fixed_Tiles - Tiles containing tests taken from mobile devices with GPS-quality location and a non-cellular connection type (e.g. WiFi, ethernet).
The layers are set to draw at scales 1:3,000,000 and larger.

Time Period and Update Frequency
Layers are generated based on a quarter year of data (three months) and files will be updated and added on a quarterly basis. A /year=2020/quarter=1/ period, the first quarter of the year 2020, would include all data generated on or after 2020-01-01 and before 2020-04-01.
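Following up on the Quadkeys section above: because the tiles follow the standard Web Mercator (Bing Maps) tiling scheme, the z=16 quadkey for any longitude/latitude can be computed directly, which is handy for joining your own point data to these tiles without geospatial functions. A minimal sketch (not the publisher's own tooling):

```python
import math


def lonlat_to_quadkey(lon: float, lat: float, zoom: int = 16) -> str:
    """Convert a WGS 84 lon/lat to its Web Mercator tile quadkey at the given zoom."""
    lat = max(min(lat, 85.05112878), -85.05112878)  # clamp to Web Mercator bounds
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad)) / math.pi) / 2.0 * n)
    digits = []
    for z in range(zoom, 0, -1):
        digit = 0
        mask = 1 << (z - 1)
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


# Example: z=16 quadkey for a point in Seattle
print(lonlat_to_quadkey(-122.3321, 47.6062))
```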
https://www.icpsr.umich.edu/web/ICPSR/studies/36498/terms
The Population Assessment of Tobacco and Health (PATH) Study originally surveyed 45,971 adult and youth respondents. The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave of data collected by this longitudinal cohort study. These 45,971 adults and youth, along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1), make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

Dataset 0001 (DS0001) contains the data from the Master Linkage file. This file contains 14 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were for each wave.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Public-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 1001 (DS1001) contains the data from the Wave 1 Adult Questionnaire. This data file contains 1,732 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1002 (DS1002) contains the data from the Youth and Parent Questionnaire. This file contains 1,228 variables and 13,651 cases. Dataset 2001 (DS2001) contains the data from the Wave 2 Adult Questionnaire.
This data file contains 2,197 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire. Dataset 2002 (DS2002) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,389 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth." Dataset 3001 (DS3001) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,139 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire. Dataset 3002 (DS3002) contains the data from t
analyze the current population survey (cps) annual social and economic supplement (asec) with r. the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show. this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
• download the fixed-width file containing household, family, and person records
• import by separating this file into three tables, then merge 'em together at the person-level
• download the fixed-width file containing the person-level replicate weights
• merge the rectangular person-level file with the replicate weights, then store it in a sql database
• create a new variable - one - in the data table

2012 asec - analysis examples.R
• connect to the sql database created by the 'download all microdata' program
• create the complex sample survey object, using the replicate weights
• perform a boatload of analysis examples

replicate census estimates - 2011.R
• connect to the sql database created by the 'download all microdata' program
• create the complex sample survey object, using the replicate weights
• match the sas output shown in the png file below

2011 asec replicate weight sas output.png
• statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document

click here to view these three scripts. for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
• the census bureau's current population survey page
• the bureau of labor statistics' current population survey page
• the current population survey's wikipedia article

notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.
Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.
There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together. A sketch of this calendar-year vs. fiscal-year grouping appears below.
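A minimal pandas sketch of that calendar-year vs. fiscal-year grouping; the column name used for the report period end date is a placeholder, and labeling a July-June fiscal year by its ending year is an assumption to check against the published files.

```python
import pandas as pd

# Placeholder column name for the report period end date.
reports = pd.DataFrame({
    "report_period_end": pd.to_datetime(["2021-06-30", "2021-12-31", "2022-03-31"]),
})

# Calendar-year grouping: the year in which the report period ends.
reports["calendar_year"] = reports["report_period_end"].dt.year

# Fiscal-year grouping (July-June), labeled here by the year containing the June end:
# a period ending July-December is assigned to the fiscal year that ends the following June.
reports["fiscal_year"] = reports["report_period_end"].dt.year + (
    reports["report_period_end"].dt.month >= 7
).astype(int)

print(reports)
```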
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
This dataset is an extension of my previous work on creating a dataset for natural language processing tasks. It leverages binary representation to characterise various machine learning models. The attributes in the dataset are derived from a dictionary, which was constructed from a corpus of prompts typically provided to a large language model (LLM). These prompts reference specific machine learning algorithms and their implementations. For instance, consider a user asking an LLM or a generative AI to create a Multi-Layer Perceptron (MLP) model for a particular application. By applying this concept to multiple machine learning models, we constructed our corpus. This corpus was then transformed into the current dataset using a bag-of-words approach.

In this dataset, each attribute corresponds to a word from our dictionary, represented as a binary value: 1 indicates the presence of the word in a given prompt, and 0 indicates its absence. At the end of each entry, there is a label. Each entry in the dataset pertains to a single class, where each class represents a distinct machine learning model or algorithm. This dataset is intended for multi-class classification tasks, not multi-label classification, as each entry is associated with only one label and does not belong to multiple labels simultaneously.

This dataset has been utilised with a Convolutional Neural Network (CNN) using the Keras Automodel API, achieving impressive training and testing accuracy rates exceeding 97%. Post-training, the model's predictive performance was rigorously evaluated in a production environment, where it continued to demonstrate exceptional accuracy. For this evaluation, we employed a series of questions, which are listed below. These questions were intentionally designed to be similar to ensure that the model can effectively distinguish between different machine learning models, even when the prompts are closely related.
KNN
• How would you create a KNN model to classify emails as spam or not spam based on their content and metadata?
• How could you implement a KNN model to classify handwritten digits using the MNIST dataset?
• How would you use a KNN approach to build a recommendation system for suggesting movies to users based on their ratings and preferences?
• How could you employ a KNN algorithm to predict the price of a house based on features such as its location, size, and number of bedrooms etc?
• Can you create a KNN model for classifying different species of flowers based on their petal length, petal width, sepal length, and sepal width?
• How would you utilise a KNN model to predict the sentiment (positive, negative, or neutral) of text reviews or comments?
• Can you create a KNN model for me that could be used in malware classification?
• Can you make me a KNN model that can detect a network intrusion when looking at encrypted network traffic?
• Can you make a KNN model that would predict the stock price of a given stock for the next week?
• Can you create a KNN model that could be used to detect malware when using a dataset relating to certain permissions a piece of software may have access to?

Decision Tree
• Can you describe the steps involved in building a decision tree model to classify medical images as malignant or benign for cancer diagnosis and return a model for me?
• How can you utilise a decision tree approach to develop a model for classifying news articles into different categories (e.g., politics, sports, entertainment) based on their textual content?
• What approach would you take to create a decision tree model for recommending personalised university courses to students based on their academic strengths and weaknesses?
• Can you describe how to create a decision tree model for identifying potential fraud in financial transactions based on transaction history, user behaviour, and other relevant data?
• In what ways might you apply a decision tree model to classify customer complaints into different categories determining the severity of language used?
• Can you create a decision tree classifier for me?
• Can you make me a decision tree model that will help me determine the best course of action across a given set of strategies?
• Can you create a decision tree model for me that can recommend certain cars to customers based on their preferences and budget?
• How can you make a decision tree model that will predict the movement of star constellations in the sky based on data provided by the NASA website?
• How do I create a decision tree for time-series forecasting?

Random Forest
• Can you describe the steps involved in building a random forest model to classify different types of anomalies in network traffic data for cybersecurity purposes and return the code for me?
• In what ways could you implement a random forest model to predict the severity of traffic congestion in urban areas based on historical traffic patterns, weather...
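To make the bag-of-words encoding described above concrete, here is a minimal scikit-learn sketch; the prompts and labels are illustrative stand-ins for the real corpus, and the actual dictionary will of course differ.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Illustrative prompts and class labels (the real corpus is much larger).
prompts = [
    "Can you create a KNN model for classifying flowers by petal and sepal measurements?",
    "How do I create a decision tree for time-series forecasting?",
    "Can you describe the steps in building a random forest model for network anomalies?",
]
labels = ["KNN", "DecisionTree", "RandomForest"]

# binary=True yields 1 if a dictionary word appears in the prompt and 0 otherwise,
# matching the presence/absence encoding described for this dataset.
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(prompts)

print(vectorizer.get_feature_names_out()[:10])
print(X.toarray())
```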
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wages in Mexico decreased to 278.93 MXN/day in May 2025 from 621.89 MXN/day in April 2025. This dataset provides Mexico Average Daily Wages: actual values, historical data, forecast, chart, statistics, economic calendar and news.
https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Unit (PSU)s and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study. Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases. Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. 
The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment. Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used
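As a rough illustration of that linkage, here is a minimal pandas sketch joining a questionnaire extract to the State Identifier data on PERSONID; the CSV file names are placeholders for however you export the ICPSR files locally, so only the PERSONID key comes from the description above.

```python
import pandas as pd

# Placeholder file names -- use whatever names your local ICPSR export produces.
adult_wave1 = pd.read_csv("wave1_adult_questionnaire.csv")
state_ids = pd.read_csv("wave1_adult_state_identifier.csv")

# PERSONID links the State Identifier data to the questionnaire (and biomarker) data.
adult_with_state = adult_wave1.merge(state_ids, on="PERSONID", how="left")
print(adult_with_state.head())
```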
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
SynQA is a Reading Comprehension dataset created in the work "Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation" (https://aclanthology.org/2021.emnlp-main.696/). It consists of 314,811 synthetically generated questions on the passages in the SQuAD v1.1 (https://arxiv.org/abs/1606.05250) training set.
In this work, we use synthetic adversarial data generation to make QA models more robust to human adversaries. We develop a data generation pipeline that selects source passages, identifies candidate answers, generates questions, and then filters or re-labels them to improve quality. Using this approach, we amplify a smaller human-written adversarial dataset into a much larger set of synthetic question-answer pairs. By incorporating our synthetic data, we improve the state-of-the-art on the AdversarialQA (https://adversarialqa.github.io/) dataset by 3.7 F1 and improve model generalisation on nine of the twelve MRQA datasets. We further conduct a novel human-in-the-loop evaluation to show that our models are considerably more robust to new human-written adversarial examples: crowdworkers can fool our model only 8.8% of the time on average, compared to 17.6% for a model trained without synthetic data.
For full details on how the dataset was created, kindly refer to the paper.