14 datasets found
  1. Bank Loan Analysis Project in Excel

    • kaggle.com
    Updated May 4, 2024
    Cite
    Sanjana Murthy (2024). Bank Loan Analysis Project in Excel [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/bank-loan-analysis-project/discussion?sort=undefined
    Dataset updated
    May 4, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sanjana Murthy
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    About Datasets:
    • Domain: Finance
    • Project: Bank loan of customers
    • Datasets: Finance_1.xlsx & Finance_2.xlsx
    • Dataset Type: Excel Data
    • Dataset Size: Each Excel file has 39k+ records

    KPIs:
    1. Year-wise loan amount stats
    2. Grade- and sub-grade-wise revol_bal
    3. Total payment for Verified status vs. total payment for Non Verified status
    4. State-wise loan status
    5. Month-wise loan status
    6. Get more insights based on your understanding of the data

    Process:
    1. Understanding the problem
    2. Data collection
    3. Data cleaning
    4. Exploring and analyzing the data
    5. Interpreting the results

    The workbook uses Power Query, Power Pivot, merged data, clustered bar and clustered column charts, a line chart, a 3D pie chart, a dashboard, slicers, a timeline, and formatting techniques.
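
    The first two KPIs are grouped aggregations, which the project builds in Excel. As a rough illustration only, the same grouping could be sketched in Python. Only revol_bal is named in the description; the other column names (issue_year, grade, loan_amnt) and all values below are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: only revol_bal is named in the description;
# issue_year, grade and loan_amnt are assumed column names.
loans = [
    {"issue_year": 2010, "grade": "A", "loan_amnt": 5000, "revol_bal": 1200},
    {"issue_year": 2010, "grade": "B", "loan_amnt": 7000, "revol_bal": 800},
    {"issue_year": 2011, "grade": "A", "loan_amnt": 10000, "revol_bal": 400},
]


def group_values(rows, key, value):
    """Collect the values of one column, grouped by another column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return groups


# KPI 1: year-wise loan amount statistics (count, total, mean).
loan_stats = {
    year: {"count": len(amounts), "total": sum(amounts), "mean": mean(amounts)}
    for year, amounts in group_values(loans, "issue_year", "loan_amnt").items()
}

# KPI 2: grade-wise total revol_bal.
revol_by_grade = {
    grade: sum(balances)
    for grade, balances in group_values(loans, "grade", "revol_bal").items()
}
```

    In Excel the equivalent would be a pivot table with the year or grade on rows and the amount column as the value field.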

  2. Cleaned NHANES 1988-2018

    • figshare.com
    txt
    Updated Feb 18, 2025
    Cite
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet (2025). Cleaned NHANES 1988-2018 [Dataset]. http://doi.org/10.6084/m9.figshare.21743372.v9
    Available download formats: txt
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    figshare
    Authors
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Health and Nutrition Examination Survey (NHANES) provides data that have considerable potential for studying the health and environmental exposure of the non-institutionalized US population. However, as NHANES data are plagued with multiple inconsistencies, processing these data is required before deriving new insights through large-scale analyses. Thus, we developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey:

    • demographics (281 variables),
    • dietary consumption (324 variables),
    • physiological functions (1,040 variables),
    • occupation (61 variables),
    • questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood),
    • medications (29 variables),
    • mortality information linked from the National Death Index (15 variables),
    • survey weights (857 variables),
    • environmental exposure biomarker measurements (598 variables), and
    • chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).

    csv Data Record: The curated NHANES datasets and the data dictionaries include 23 .csv files and 1 Excel file. The curated NHANES datasets comprise 20 .csv files, two for each module, with one as the uncleaned version and the other as the cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments.

    • "dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS Number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES.
    • "dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables.
    • "dictionary_drug_codes.csv" contains the dictionary for descriptors of the drug codes.
    • "nhanes_inconsistencies_documentation.xlsx" is an Excel file that contains the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.

    R Data Record: For researchers who want to conduct their analysis in the R programming language, only the cleaned NHANES modules and the data dictionaries can be downloaded as a .zip file, which includes an .RData file and an .R file.

    • "w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data.
    • "m - nhanes_1988_2018.R" shows how we used the customized functions (i.e. our pipeline) to curate the original NHANES data.

    Example starter codes: The set of starter code to help users conduct exposome analysis consists of four R markdown files (.Rmd). We recommend going through the tutorials in order.

    • "example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together.
    • "example_1 - account_for_nhanes_design.Rmd" demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazard model, and a survey-weighted Cox proportional hazard model.
    • "example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one variable and multiple variables with and without accounting for the NHANES sampling design.
    • "example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
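
    To give a sense of what accounting for the sampling design changes in a summary statistic, the sketch below contrasts an unweighted mean with a survey-weighted mean, sum(w * x) / sum(w). It is not taken from the authors' R tutorials: the function name and all numbers are hypothetical, and a full design-based analysis would also use strata and sampling units for variance estimation, which this sketch omits.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w), skipping records with a missing value."""
    pairs = [(x, w) for x, w in zip(values, weights) if x is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no non-missing observations to average")
    return sum(x * w for x, w in pairs) / total_weight


# Hypothetical measurements and survey weights for four participants;
# the third participant has no measurement.
measurements = [10.0, 20.0, None, 40.0]
survey_weights = [1000.0, 3000.0, 500.0, 1000.0]

observed = [x for x in measurements if x is not None]
unweighted = sum(observed) / len(observed)
weighted = weighted_mean(measurements, survey_weights)
```

    Here the participant with the largest weight pulls the weighted estimate toward 20, which is why the two means differ.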

  3. SMARTDEST DATASET WP3 v1.0

    • data.europa.eu
    unknown
    Updated Jul 1, 2022
    Cite
    Zenodo (2022). SMARTDEST DATASET WP3 v1.0 [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-6787378
    Available download formats: unknown (9913124)
    Dataset updated
    Jul 1, 2022
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The SMARTDEST DATASET WP3 v1.0 includes data at sub-city level for 7 cities: Amsterdam, Barcelona, Edinburgh, Lisbon, Ljubljana, Turin, and Venice. It is made up of information extracted from public sources at the local level (mostly city council open data portals) or from volunteered geographic information, that is, geospatial content generated by non-professionals using mapping systems available on the Internet (e.g., Geofabrik). Details on data sources and variables are included in a ‘metadata’ spreadsheet in the Excel file. The same Excel file contains 5 additional spreadsheets. The first one, labelled #1, was used to perform the analysis on the determinants of the geographical spread of tourism supply in the SMARTDEST case study cities (in the main document D3.3, section 4.1). The second one (labelled #2) offers information that allows replication of the analysis on tourism-led population decline reported in section 4.3. Spreadsheets #3-AMS, #4-BCN, and #5-EDI refer to data sources and variables used to run the follow-up analyses discussed in section 5.1, with the objective of digging into the causes of depopulation in Amsterdam, Barcelona, and Edinburgh, respectively. The column ‘row’ can be used to merge the Excel file with the shapefile ‘db_task3.3_SmartDest’. Data are available at the buurt level in Amsterdam (an administrative unit roughly corresponding to a neighbourhood), at census tract level in Barcelona and Ljubljana, for data zones in Edinburgh, statistical zones in Turin, and località in Venice.
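
    The merge on the ‘row’ column amounts to a key-based join between the shapefile's features and the spreadsheet's records. A minimal sketch of that join follows, assuming both sources have already been read into lists of dictionaries. Only the key name ‘row’ comes from the description; the other field names and all values are invented for illustration.

```python
def left_join(features, attributes, key="row"):
    """Attach attribute columns to each feature that shares the same key value."""
    lookup = {record[key]: record for record in attributes}
    joined = []
    for feature in features:
        match = lookup.get(feature[key], {})
        # Feature fields win if both sides define the same column.
        joined.append({**match, **feature})
    return joined


# Hypothetical shapefile features (geometry omitted) and spreadsheet rows.
shapes = [
    {"row": 1, "city": "Amsterdam"},
    {"row": 2, "city": "Barcelona"},
    {"row": 3, "city": "Venice"},
]
sheet = [
    {"row": 1, "listings": 120},
    {"row": 2, "listings": 340},
]

merged = left_join(shapes, sheet)
```

    A left join keeps every spatial unit even when the spreadsheet has no matching record, so unmatched units are easy to spot afterwards.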

  4. Data from: Skepticism in science and punitive attitudes

    • openicpsr.org
    delimited
    Updated May 4, 2025
    Cite
    Jason Rydberg; Luke DeZago (2025). Skepticism in science and punitive attitudes [Dataset]. http://doi.org/10.3886/E228541V1
    Available download formats: delimited
    Dataset updated
    May 4, 2025
    Dataset provided by
    University of Massachusetts Lowell
    Authors
    Jason Rydberg; Luke DeZago
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Replication materials for the manuscript "Skepticism in Science and Punitive Attitudes", published in the Journal of Criminal Justice.

    Note that the GSS repeated cross sections for 1972 to 2018 are too large to upload here, but they can be accessed from https://gss.norc.org/content/dam/gss/get-the-data/documents/spss/GSS_spss.zip

    Included here are:
    • (A link to the repeated cross-sections data)
    • Each of the 3 wave panels (2006-2010; 2008-2012; 2010-2014)
    • Replication R script for the repeated cross sections cleaning and analysis
    • Replication R script for the panel data cleaning and analysis
    • An Excel spreadsheet with Uniform Crime Report data to merge to the cross sections.

  5. Superstore Sales Analysis

    • kaggle.com
    Updated Oct 21, 2023
    Cite
    Ali Reda Elblgihy (2023). Superstore Sales Analysis [Dataset]. https://www.kaggle.com/datasets/aliredaelblgihy/superstore-sales-analysis/code
    Dataset updated
    Oct 21, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ali Reda Elblgihy
    Description

    Analyzing sales data is essential for any business looking to make informed decisions and optimize its operations. In this project, we will utilize Microsoft Excel and Power Query to conduct a comprehensive analysis of Superstore sales data. Our primary objectives will be to establish meaningful connections between various data sheets, ensure data quality, and calculate critical metrics such as the Cost of Goods Sold (COGS) and discount values. Below are the key steps and elements of this analysis:

    1- Data Import and Transformation:

    • Gather and import relevant sales data from various sources into Excel.
    • Utilize Power Query to clean, transform, and structure the data for analysis.
    • Merge and link different data sheets to create a cohesive dataset, ensuring that all data fields are connected logically.

    2- Data Quality Assessment:

    • Perform data quality checks to identify and address issues like missing values, duplicates, outliers, and data inconsistencies.
    • Standardize data formats and ensure that all data is in a consistent, usable state.

    3- Calculating COGS:

    • Determine the Cost of Goods Sold (COGS) for each product sold by considering factors like purchase price, shipping costs, and any additional expenses.
    • Apply appropriate formulas and calculations to determine COGS accurately.

    4- Discount Analysis:

    • Analyze the discount values offered on products to understand their impact on sales and profitability.
    • Calculate the average discount percentage, identify trends, and visualize the data using charts or graphs.

    5- Sales Metrics:

    • Calculate and analyze various sales metrics, such as total revenue, profit margins, and sales growth.
    • Utilize Excel functions to compute these metrics and create visuals for better insights.

    6- Visualization:

    • Create visualizations, such as charts, graphs, and pivot tables, to present the data in an understandable and actionable format.
    • Visual representations can help identify trends, outliers, and patterns in the data.

    7- Report Generation:

    • Compile the findings and insights into a well-structured report or dashboard, making it easy for stakeholders to understand and make informed decisions.

    Throughout this analysis, the goal is to provide a clear and comprehensive understanding of the Superstore's sales performance. By using Excel and Power Query, we can efficiently manage and analyze the data, ensuring that the insights gained contribute to the store's growth and success.
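
    The COGS, discount, and sales-metric steps above are arithmetic on per-order fields. The project performs them in Excel and Power Query; the sketch below only illustrates the calculations in Python. The field names, the sample orders, and the choice to define the discount relative to a list price are all assumptions, not details from the dataset.

```python
# Hypothetical orders: field names and values are invented for illustration.
orders = [
    {"product": "Chair", "purchase_price": 40.0, "shipping_cost": 5.0,
     "other_costs": 1.0, "list_price": 80.0, "sale_price": 72.0},
    {"product": "Desk", "purchase_price": 120.0, "shipping_cost": 15.0,
     "other_costs": 5.0, "list_price": 200.0, "sale_price": 150.0},
]


def cogs(order):
    """Cost of goods sold: purchase price plus shipping and other expenses."""
    return order["purchase_price"] + order["shipping_cost"] + order["other_costs"]


def discount_pct(order):
    """Discount as a percentage of the list price (an assumed definition)."""
    return 100.0 * (order["list_price"] - order["sale_price"]) / order["list_price"]


def profit_margin_pct(order):
    """Profit as a percentage of the sale price."""
    return 100.0 * (order["sale_price"] - cogs(order)) / order["sale_price"]


total_revenue = sum(o["sale_price"] for o in orders)
total_cogs = sum(cogs(o) for o in orders)
average_discount = sum(discount_pct(o) for o in orders) / len(orders)
margins = {o["product"]: profit_margin_pct(o) for o in orders}
```

    A negative margin in such a table would flag a product sold below cost, which is the kind of pattern the discount analysis is meant to surface.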

  6. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • datasearch.gesis.org
    • openicpsr.org
    Updated Feb 19, 2020
    Cite
    Kaplan, Jacob (2020). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Property Stolen and Recovered (Supplement to Return A) 1960-2017 [Dataset]. http://doi.org/10.3886/E105403V3
    Dataset updated
    Feb 19, 2020
    Dataset provided by
    da|ra (Registration agency for social science and economic data)
    Authors
    Kaplan, Jacob
    Description

    For any questions about this data please email me at jacob@crimedatatool.com. If you use this data, please cite it.

    Version 3 release notes:
    • Adds data in the following formats: Excel.
    • Changes project name to avoid confusing this data for the ones done by NACJD.

    Version 2 release notes:
    • Adds data for 2017.
    • Adds a "number_of_months_reported" variable which says how many months of the year the agency reported data.

    Property Stolen and Recovered is a Uniform Crime Reporting (UCR) Program data set with information on the number of offenses (crimes included are murder, rape, robbery, burglary, theft/larceny, and motor vehicle theft), the value of the offense, and subcategories of the offense (e.g. robbery is broken down into subcategories including highway robbery, bank robbery, and gas station robbery). The majority of the data relates to theft. Theft is divided into subcategories such as shoplifting, theft of bicycle, theft from building, and purse snatching. For a number of items stolen (e.g. money, jewelry and precious metals, guns), the value of property stolen and the value of property recovered are provided. This data set is also referred to as the Supplement to Return A (Offenses Known and Reported).

    All the data was received directly from the FBI as text or .DTA files. I created a setup file based on the documentation provided by the FBI and read the data into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here: https://github.com/jacobkap/crime_data. The Word document file available for download is the guidebook the FBI provided with the raw data, which I used to create the setup file to read in data.

    There may be inaccuracies in the data, particularly in the group of columns starting with "auto." To reduce (but certainly not eliminate) data errors, I replaced the following values with NA for the group of columns beginning with "offenses" or "auto" as they are common data entry error values (e.g. are larger than the agency's population, are much larger than other crimes or months in same agency): 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99942. This cleaning was NOT done on the columns starting with "value."

    For every numeric column I replaced negative indicator values (e.g. "j" for -1) with the negative number they are supposed to be. These negative number indicators are not included in the FBI's codebook for this data but are present in the data. I used the values in the FBI's codebook for the Offenses Known and Clearances by Arrest data.

    To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. If an agency has used a different FIPS code in the past, check to make sure the FIPS code is the same as in this data.
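
    The two cleaning rules described above (suspect round values set to NA in "offenses" and "auto" columns, and negative indicators decoded) can be illustrated with a short sketch. The author's actual cleaning was done in R; this Python version is only an approximation of the stated rules. The column names in the example row are hypothetical, None stands in for R's NA, and only the "j" indicator is mapped because it is the only one the description gives.

```python
# Values the description lists as common data entry errors.
SUSPECT_VALUES = {
    1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000,
    20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99942,
}

# Only "j" -> -1 is stated in the description; other indicators are
# deliberately left out rather than guessed.
NEGATIVE_INDICATORS = {"j": -1}


def clean_value(column, raw):
    """Return the cleaned value, or None where the rule calls for NA."""
    value = NEGATIVE_INDICATORS.get(raw, raw)
    if column.startswith(("offenses", "auto")) and value in SUSPECT_VALUES:
        return None
    return value


def clean_row(row):
    """Apply clean_value to every numeric column of one agency record."""
    return {column: clean_value(column, raw) for column, raw in row.items()}


# Hypothetical record: the column names are invented for illustration.
example = clean_row({
    "offenses_robbery": 3000,   # suspect value in an "offenses" column
    "auto_stolen": "j",         # negative indicator
    "value_robbery": 3000,      # "value" columns are not cleaned
    "offenses_theft": 12,
})
```

    Note that the sketch leaves 3000 untouched in the "value" column, matching the statement that this cleaning was not applied there.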

  7. Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race,...

    • search.datacite.org
    • openicpsr.org
    Updated 2018
    Cite
    Jacob Kaplan (2018). Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race, 1980-2016 [Dataset]. http://doi.org/10.3886/e102263v5-10021
    Dataset updated
    2018
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    DataCite (https://www.datacite.org/)
    Authors
    Jacob Kaplan
    Description

    Version 5 release notes:
    • Removes support for SPSS and Excel data.
    • Changes the crimes that are stored in each file. There are more files now with fewer crimes per file. The files and their included crimes have been updated below.
    • Adds in agencies that report 0 months of the year.
    • Adds a column that indicates the number of months reported. This is generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime. They may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime.
    • Removes data on runaways.

    Version 4 release notes:
    • Changes column names from "poss_coke" and "sale_coke" to "poss_heroin_coke" and "sale_heroin_coke" to clearly indicate that these columns include the sale of heroin as well as similar opiates such as morphine, codeine, and opium. Also changes column names for the narcotic columns to indicate that they are only for synthetic narcotics.

    Version 3 release notes:
    • Add data for 2016.
    • Order rows by year (descending) and ORI.

    Version 2 release notes:
    • Fix bug where Philadelphia Police Department had incorrect FIPS county code.

    The Arrests by Age, Sex, and Race data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains highly granular data on the number of people arrested for a variety of crimes (see below for a full list of included crimes). The data sets here combine data from the years 1980-2015 into a single file. These files are quite large and may take some time to load.
    All the data was downloaded from NACJD as ASCII+SPSS Setup files and read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here: https://github.com/jacobkap/crime_data. If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com.

    I did not make any changes to the data other than the following. When an arrest column has a value of "None/not reported", I change that value to zero. This makes the (possibly incorrect) assumption that these values represent zero crimes reported. The original data does not have a value for when the agency reports zero arrests other than "None/not reported." In other words, this data does not differentiate between real zeros and missing values. Some agencies also incorrectly report the following numbers of arrests, which I change to NA: 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99999, 99998.

    To reduce file size and make the data more manageable, all of the data is aggregated yearly. All of the data is in agency-year units such that every row indicates an agency in a given year. Columns are crime-arrest category units. For example, if you choose the data set that includes murder, you would have rows for each agency-year and columns with the number of people arrested for murder. The ASR data breaks down arrests by age and gender (e.g. Male aged 15, Male aged 18). They also provide the number of adults or juveniles arrested by race. Because most agencies and years do not report the arrestee's ethnicity (Hispanic or not Hispanic) or juvenile outcomes (e.g. referred to adult court, referred to welfare agency), I do not include these columns.

    To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. Please note that some of the FIPS codes have leading zeros and if you open it in Excel it will automatically delete those leading zeros.
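
    One way to avoid losing those leading zeros is to read FIPS codes as text and, if they were already stripped, pad them back to their fixed widths (2 digits for state, 3 for county, 5 for place). The sketch below is an illustration under assumptions: the column names and the sample record are hypothetical and may not match the names in the actual files.

```python
import csv
import io

# Standard FIPS code widths; the column names themselves are assumed.
FIPS_WIDTHS = {"fips_state_code": 2, "fips_county_code": 3, "fips_place_code": 5}


def restore_fips(row):
    """Left-pad FIPS codes with zeros so they regain their fixed width."""
    fixed = dict(row)
    for column, width in FIPS_WIDTHS.items():
        if fixed.get(column):
            fixed[column] = fixed[column].zfill(width)
    return fixed


# Hypothetical file content in which a spreadsheet already dropped the zeros.
sample = io.StringIO(
    "ori,fips_state_code,fips_county_code,fips_place_code\n"
    "XX00100,1,73,7000\n"
)

# csv.DictReader returns every field as a string, so no zeros are lost on read.
rows = [restore_fips(record) for record in csv.DictReader(sample)]
```

    Empty codes are left empty rather than padded, so missing values are not turned into strings of zeros.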

    I created 9 arrest categories myself. The categories are:
    • Total Male Juvenile
    • Total Female Juvenile
    • Total Male Adult
    • Total Female Adult
    • Total Male
    • Total Female
    • Total Juvenile
    • Total Adult
    • Total Arrests

    All of these categories are based on the sums of the sex-age categories (e.g. Male under 10, Female aged 22) rather than using the provided age-race categories (e.g. adult Black, juvenile Asian). As not all agencies report the race data, my method is more accurate. These categories also make up the data in the "simple" version of the data. The "simple" file only includes the above 9 columns as the arrest data (all other columns in the data are just agency identifier columns). Because this "simple" data set needs fewer columns, I include all offenses.
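
    The nine totals can be derived from the sex-age counts alone. The sketch below shows one way to do that. It is not the author's R code: the input layout (a mapping keyed by sex and the youngest age in each age group), the output key names, and the cutoff of 18 for adults are assumptions made for the example.

```python
def arrest_totals(counts, adult_age=18):
    """Sum sex-age arrest counts into the nine aggregate categories.

    counts maps (sex, youngest age in the age group) -> number of arrests.
    """
    totals = {
        "total_male_juvenile": 0,
        "total_female_juvenile": 0,
        "total_male_adult": 0,
        "total_female_adult": 0,
    }
    for (sex, min_age), arrests in counts.items():
        group = "adult" if min_age >= adult_age else "juvenile"
        totals[f"total_{sex}_{group}"] += arrests

    totals["total_male"] = totals["total_male_juvenile"] + totals["total_male_adult"]
    totals["total_female"] = totals["total_female_juvenile"] + totals["total_female_adult"]
    totals["total_juvenile"] = totals["total_male_juvenile"] + totals["total_female_juvenile"]
    totals["total_adult"] = totals["total_male_adult"] + totals["total_female_adult"]
    totals["total_arrests"] = totals["total_male"] + totals["total_female"]
    return totals


# Hypothetical murder arrests for one agency-year.
murder_counts = {("male", 15): 2, ("male", 18): 5, ("female", 10): 0, ("female", 22): 1}
murder_totals = arrest_totals(murder_counts)
```

    Because every total is built from the same sex-age cells, the male and female totals always add up to the overall total, which would not be guaranteed if the race-based columns were mixed in.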

    As the arrest data is very granular, and each category of arrest is its own column, there are dozens of columns per crime. To keep the data somewhat manageable, there are nine different files: eight that contain different crimes, and the "simple" file. Each file contains the data for all years. The eight categories each have crimes belonging to a major crime category and do not overlap in crimes other than with the index offenses. Please note that the crime names provided below are not the same as the column names in the data. Due to Stata limiting column names to 32 characters maximum, I have abbreviated the crime names in the data. The files and their included crimes are:

    • Index Crimes: Murder, Rape, Robbery, Aggravated Assault, Burglary, Theft, Motor Vehicle Theft, Arson
    • Alcohol Crimes: DUI, Drunkenness, Liquor
    • Drug Crimes: Total Drug, Total Drug Sales, Total Drug Possession, Cannabis Possession, Cannabis Sales, Heroin or Cocaine Possession, Heroin or Cocaine Sales, Other Drug Possession, Other Drug Sales, Synthetic Narcotic Possession, Synthetic Narcotic Sales
    • Grey Collar and Property Crimes: Forgery, Fraud, Stolen Property
    • Financial Crimes: Embezzlement, Total Gambling, Other Gambling, Bookmaking, Numbers Lottery
    • Sex or Family Crimes: Offenses Against the Family and Children, Other Sex Offenses, Prostitution, Rape
    • Violent Crimes: Aggravated Assault, Murder, Negligent Manslaughter, Robbery, Weapon Offenses
    • Other Crimes: Curfew, Disorderly Conduct, Other Non-traffic, Suspicion, Vandalism, Vagrancy
    • Simple: This data set has every crime and only the arrest categories that I created (see above).

    If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com.

  8. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • datasearch.gesis.org
    • openicpsr.org
    Updated Feb 19, 2020
    Cite
    Kaplan, Jacob (2020). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Property Stolen and Recovered (Supplement to Return A) 1960-2018 [Dataset]. http://doi.org/10.3886/E105403
    Dataset updated
    Feb 19, 2020
    Dataset provided by
    da|ra (Registration agency for social science and economic data)
    Authors
    Kaplan, Jacob
    Description

    For any questions about this data please email me at jacob@crimedatatool.com. If you use this data, please cite it.

    Version 4 release notes:
    • Adds data for 2018.

    Version 3 release notes:
    • Adds data in the following formats: Excel.
    • Changes project name to avoid confusing this data for the ones done by NACJD.

    Version 2 release notes:
    • Adds data for 2017.
    • Adds a "number_of_months_reported" variable which says how many months of the year the agency reported data.

    Property Stolen and Recovered is a Uniform Crime Reporting (UCR) Program data set with information on the number of offenses (crimes included are murder, rape, robbery, burglary, theft/larceny, and motor vehicle theft), the value of the offense, and subcategories of the offense (e.g. robbery is broken down into subcategories including highway robbery, bank robbery, and gas station robbery). The majority of the data relates to theft. Theft is divided into subcategories such as shoplifting, theft of bicycle, theft from building, and purse snatching. For a number of items stolen (e.g. money, jewelry and precious metals, guns), the value of property stolen and the value of property recovered are provided. This data set is also referred to as the Supplement to Return A (Offenses Known and Reported).

    All the data was received directly from the FBI as text or .DTA files. I created a setup file based on the documentation provided by the FBI and read the data into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here: https://github.com/jacobkap/crime_data. The Word document file available for download is the guidebook the FBI provided with the raw data, which I used to create the setup file to read in data.

    There may be inaccuracies in the data, particularly in the group of columns starting with "auto." To reduce (but certainly not eliminate) data errors, I replaced the following values with NA for the group of columns beginning with "offenses" or "auto" as they are common data entry error values (e.g. are larger than the agency's population, are much larger than other crimes or months in same agency): 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99942. This cleaning was NOT done on the columns starting with "value."

    For every numeric column I replaced negative indicator values (e.g. "j" for -1) with the negative number they are supposed to be. These negative number indicators are not included in the FBI's codebook for this data but are present in the data. I used the values in the FBI's codebook for the Offenses Known and Clearances by Arrest data.

    To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. If an agency has used a different FIPS code in the past, check to make sure the FIPS code is the same as in this data.

  9. Data from: DATASET FOR: A multimodal spectroscopic approach combining...

    • zenodo.org
    • producciocientifica.uv.es
    bin, csv, zip
    Updated Aug 2, 2024
    Cite
    David Perez Guaita; David Perez Guaita (2024). DATASET FOR: A multimodal spectroscopic approach combining mid-infrared and near-infrared for discriminating Gram-positive and Gram-negative bacteria [Dataset]. http://doi.org/10.5281/zenodo.10523185
    Available download formats: bin, zip, csv
    Dataset updated
    Aug 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Perez Guaita; David Perez Guaita
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises a comprehensive set of files designed for the analysis and 2D correlation of spectral data, specifically focusing on ATR and NIR spectra. It includes MATLAB scripts and supporting functions necessary to replicate the analysis, as well as the raw datasets used in the study. Below is a detailed description of the included files:

    1. Data Analysis:

      • File Name: Data_Analysis.mlx
      • Description: This MATLAB Live Script file contains the main script used for the classification analysis of the spectral data. It includes steps for preprocessing, analysis, and visualization of the ATR and NIR spectra.
    2. 2D Correlation Data Analysis:

      • File Name: Data_Analysis_2Dcorr.mlx
      • Description: This MATLAB Live Script file is similar to the primary analysis script but is specifically tailored for performing 2D correlation analysis on the spectral data. It includes detailed steps and code for executing the 2D correlation.
    3. Functions:

      • Folder Name: Functions
      • Description: This folder contains all the necessary MATLAB function files required to replicate the analyses presented in the scripts. These functions handle various preprocessing steps, calculations, and visualizations.
    4. Datasets:

      • File Names: ATR_dataset.xlsx, NIR_dataset.xlsx, Reference_data.csv
      • Description: These Excel files contain the raw spectral data for ATR and NIR analyses, as well as reference datasets. Each file includes multiple sheets with detailed measurements and metadata.

    Usage Notes:

    • Software Requirements:
      • MATLAB is required to run the .mlx files and utilize the functions.
      • PLS_Toolbox: Necessary for certain preprocessing and analysis steps.
      • MIDAS 2010: Available at MIDAS 2010, required for the 2D correlation analysis.
    • Replication: Users can replicate the analyses by running the Data_Analysis.mlx and Data_Analysis_2Dcorr.mlx scripts in MATLAB, ensuring that the Functions folder is in the MATLAB path.
    • Data Handling: The datasets are provided in .xlsx format, which can be easily imported into MATLAB or other data analysis software.
  10. Data for Fig 3 and 4.xlsx -- Article: Combining intransitive and higher...

    • figshare.com
    xlsx
    Updated May 27, 2023
    Cite
    John Vandermeer (2023). Data for Fig 3 and 4.xlsx -- Article: Combining intransitive and higher order effects in a coupled oscillator framework: a case study of an ant community by John Vandermeer and Ivertte Perfecto [Dataset]. http://doi.org/10.6084/m9.figshare.23244365.v1
    Available download formats: xlsx
    Dataset updated
    May 27, 2023
    Dataset provided by
    figshare
    Figshare (http://figshare.com/)
    Authors
    John Vandermeer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Three excel sheets with location data (x coord and y coord) for coffee trees in the survey plots presented in figures 3 and 4 of the article.

  11. Data set, combining epidemiological, genetics, and government stringency...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 4, 2020
    Cite
    Balkanyi, Laszlo (2020). Data set, combining epidemiological, genetics, and government stringency data of COVID-19 pandemic. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4152998
    Explore at:
    Dataset updated
    Nov 4, 2020
    Dataset provided by
    Dorkó, Balázs
    Lukacs, Lajos
    Balkanyi, Laszlo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set combines epidemiological, genetics, and government stringency data on the COVID-19 pandemic, all from open data sources. The sources are Our World in Data, Worldometer, GISAID-Nextstrain, and the Oxford COVID-19 Government Response Tracker (OxCGRT). The cutoff date of the first version is the end of June 2020.

    This simple data set is provided as an Excel workbook whose first worksheet, "readme", describes the data in all the worksheets.

    This is a working data set, expected to be refreshed over time.

    The raw data are not cleaned; this collection is a tool for checking hypotheses about possible relations among the various data types. Simple visualisations of data relations are provided in a separate sheet.

  12. Scottish Census 2011 Population by Council Area

    • dtechtive.com
    • find.data.gov.scot
    xml, zip
    Updated Feb 21, 2017
    Cite
    University of Edinburgh (2017). Scottish Census 2011 Population by Council Area [Dataset]. http://doi.org/10.7488/ds/1908
    Explore at:
    Available download formats: zip (8.036 MB), xml (0.0038 MB)
    Dataset updated
    Feb 21, 2017
    Dataset provided by
    University of Edinburgh
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Scotland
    Description

    This data is sourced from the 2011 Census and shows the population and population density by council area. The raw data was sourced from http://www.scotlandscensus.gov.uk/en/censusresults/downloadablefiles.html and then manipulated in Excel to merge a number of tables. The resulting data was joined to a shapefile of Scottish council areas from ShareGeo (http://www.sharegeo.ac.uk/handle/10672/305). Both should be attributed as the sources of the base data. GIS vector data. This dataset was first accessioned in the EDINA ShareGeo Open repository on 2012-12-19 and migrated to Edinburgh DataShare on 2017-02-21.

  13. Excel spreadsheet containing the underlying numerical data for Figs 1C, 2C,...

    • figshare.com
    xlsx
    Updated Jun 6, 2023
    + more versions
    Cite
    Bihuan Chen; Xiaonan Liu; Yina Wang; Jie Bai; Xiangyu Liu; Guisheng Xiang; Wei Liu; Xiaoxi Zhu; Jian Cheng; Lina Lu; Guanghui Zhang; Ge Zhang; Zongjie Dai; Shuhui Zi; Shengchao Yang; Huifeng Jiang (2023). Excel spreadsheet containing the underlying numerical data for Figs 1C, 2C, 2D, 4B, 4C, 5A, 5B, S11, S12 and S14. [Dataset]. http://doi.org/10.1371/journal.pbio.3002131.s021
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS Biology
    Authors
    Bihuan Chen; Xiaonan Liu; Yina Wang; Jie Bai; Xiangyu Liu; Guisheng Xiang; Wei Liu; Xiaoxi Zhu; Jian Cheng; Lina Lu; Guanghui Zhang; Ge Zhang; Zongjie Dai; Shuhui Zi; Shengchao Yang; Huifeng Jiang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Excel spreadsheet containing the underlying numerical data for Figs 1C, 2C, 2D, 4B, 4C, 5A, 5B, S11, S12 and S14.

  14. Excel spreadsheet containing, in separate sheets for each figure, the...

    • plos.figshare.com
    xlsx
    Updated Jun 12, 2024
    Cite
    Yassine Cherrak; Miguel Angel Salazar; Koray Yilmaz; Markus Kreuzer; Wolf-Dietrich Hardt (2024). Excel spreadsheet containing, in separate sheets for each figure, the underlying and individual numerical data used for Figs 1B–1D, 2B, 2C, 3A–3D, 4A–4G, S1A, S1B, S1C, S1D, S1E, S1F, [Dataset]. http://doi.org/10.1371/journal.pbio.3002616.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    PLOS Biology
    Authors
    Yassine Cherrak; Miguel Angel Salazar; Koray Yilmaz; Markus Kreuzer; Wolf-Dietrich Hardt
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Excel spreadsheet containing, in separate sheets for each figure, the underlying and individual numerical data used for Figs 1B–1D, 2B, 2C, 3A–3D, 4A–4G, S1A, S1B, S1C, S1D, S1E, S1F,
