10 datasets found

Bank Loan Analysis Project in Excel
kaggle.com
Updated May 4, 2024
Cite
Sanjana Murthy (2024). Bank Loan Analysis Project in Excel [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/bank-loan-analysis-project/data
Dataset updated
May 4, 2024
Dataset provided by
Kaggle
Authors
Sanjana Murthy
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
About Datasets:
- Domain: Finance
- Project: Bank loan of customers
- Datasets: Finance_1.xlsx & Finance_2.xlsx
- Dataset Type: Excel Data
- Dataset Size: Each Excel file has 39k+ records

KPIs:
1. Year-wise loan amount stats
2. Grade- and sub-grade-wise revol_bal
3. Total payment for Verified status vs. total payment for Non-Verified status
4. State-wise loan status
5. Month-wise loan status
6. Get more insights based on your understanding of the data

Process:
1. Understanding the problem
2. Data collection
3. Data cleaning
4. Exploring and analyzing the data
5. Interpreting the results

The project uses Power Query, Power Pivot, merged data, clustered bar charts, clustered column charts, line charts, 3D pie charts, a dashboard, slicers, a timeline, and formatting techniques.
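The merge step and the first KPI can also be sketched outside Excel. The following is a minimal Python sketch using only the standard library and a few made-up rows. The column names `id`, `issue_year`, and `loan_amnt` are assumptions for illustration (only `revol_bal` is named in the description), and the real workbooks would first have to be exported or read with a spreadsheet library.

```python
from collections import defaultdict

# Hypothetical rows standing in for Finance_1.xlsx and Finance_2.xlsx.
# The key "id" and the columns "issue_year" and "loan_amnt" are assumed names.
finance_1 = [
    {"id": 1, "issue_year": 2010, "loan_amnt": 5000},
    {"id": 2, "issue_year": 2010, "loan_amnt": 7000},
    {"id": 3, "issue_year": 2011, "loan_amnt": 12000},
]
finance_2 = [
    {"id": 1, "revol_bal": 1300},
    {"id": 2, "revol_bal": 400},
    {"id": 3, "revol_bal": 9100},
]


def merge_on_id(left, right):
    """Inner-join two lists of row dicts on the shared "id" key."""
    right_by_id = {row["id"]: row for row in right}
    return [
        {**row, **right_by_id[row["id"]]}
        for row in left
        if row["id"] in right_by_id
    ]


def loan_amount_by_year(rows):
    """Return {year: (count, total, mean)} of loan_amnt per issue year."""
    amounts = defaultdict(list)
    for row in rows:
        amounts[row["issue_year"]].append(row["loan_amnt"])
    return {
        year: (len(vals), sum(vals), sum(vals) / len(vals))
        for year, vals in sorted(amounts.items())
    }


merged = merge_on_id(finance_1, finance_2)
yearly_stats = loan_amount_by_year(merged)
print(yearly_stats)
```

The same grouping pattern would apply to the other KPIs (for example, summing `revol_bal` per grade) once the actual column names are known.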
Cleaned NHANES 1988-2018
figshare.com
txt
Updated Feb 18, 2025
Cite
Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet (2025). Cleaned NHANES 1988-2018 [Dataset]. http://doi.org/10.6084/m9.figshare.21743372.v9
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.21743372.v9
Dataset updated
Feb 18, 2025
Dataset provided by
figshare
Authors
Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The National Health and Nutrition Examination Survey (NHANES) provides data with considerable potential for studying the health and environmental exposure of the non-institutionalized US population. However, because NHANES data are plagued with multiple inconsistencies, they must be processed before new insights can be derived through large-scale analyses. We therefore developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous NHANES (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey demographics (281 variables), dietary consumption (324 variables), physiological functions (1,040 variables), occupation (61 variables), questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood), medications (29 variables), mortality information linked from the National Death Index (15 variables), survey weights (857 variables), environmental exposure biomarker measurements (598 variables), and chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).

csv Data Record: The curated NHANES datasets and the data dictionaries include 23 .csv files and 1 Excel file. The curated NHANES datasets comprise 20 .csv files, two for each module: one uncleaned version and one cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments.
"dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS Number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES.
"dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables.
"dictionary_drug_codes.csv" contains the dictionary of descriptors for the drug codes.
"nhanes_inconsistencies_documentation.xlsx" is an Excel file that contains the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.

R Data Record: For researchers who want to conduct their analysis in the R programming language, only the cleaned NHANES modules and the data dictionaries can be downloaded, as a .zip file that includes an .RData file and an .R file.
"w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data.
"m - nhanes_1988_2018.R" shows how we used the customized functions (i.e. our pipeline) to curate the original NHANES data.

Example starter codes: The starter code to help users conduct exposome analysis consists of four R Markdown files (.Rmd). We recommend going through the tutorials in order.
"example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together.
"example_1 - account_for_nhanes_design.Rmd" demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazard model, and a survey-weighted Cox proportional hazard model.
"example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one variable and multiple variables with and without accounting for the NHANES sampling design.
"example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
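To illustrate why the tutorials distinguish statistics computed with and without the sampling design, here is a minimal Python sketch comparing an unweighted mean with a survey-weighted mean. The values and weights are invented, and the sketch covers only the point estimate: design-based standard errors also need the strata and sampling-unit information, which the authors' R tutorials are the place to look for.

```python
def unweighted_mean(values):
    """Plain arithmetic mean, ignoring the sampling design."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Mean in which each participant counts in proportion to a survey weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical measurements and survey weights for three participants.
values = [10.0, 20.0, 30.0]
weights = [1.0, 1.0, 2.0]

plain = unweighted_mean(values)
weighted = weighted_mean(values, weights)
print(plain, weighted)
```

Because the third participant carries twice the weight of the others, the weighted estimate is pulled toward that participant's value.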
Data from: Skepticism in science and punitive attitudes
openicpsr.org
delimited
Updated May 4, 2025
Cite
Jason Rydberg; Luke DeZago (2025). Skepticism in science and punitive attitudes [Dataset]. http://doi.org/10.3886/E228541V1
Available download formats: delimited
Unique identifier
https://doi.org/10.3886/E228541V1
Dataset updated
May 4, 2025
Dataset provided by
University of Massachusetts Lowell
Authors
Jason Rydberg; Luke DeZago
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Replication materials for the manuscript "Skepticism in Science and Punitive Attitudes", published in the Journal of Criminal Justice. Note that the GSS repeated cross sections for 1972 to 2018 are too large to upload here, but they can be accessed from https://gss.norc.org/content/dam/gss/get-the-data/documents/spss/GSS_spss.zip

Included here are:
- (A link to the repeated cross-sections data)
- Each of the 3 wave panels (2006-2010; 2008-2012; 2010-2014)
- Replication R script for the repeated cross sections cleaning and analysis
- Replication R script for the panel data cleaning and analysis
- An Excel spreadsheet with Uniform Crime Report data to merge to the cross sections.
Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...
datasearch.gesis.org
openicpsr.org
Updated Feb 19, 2020
Cite
Kaplan, Jacob (2020). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Property Stolen and Recovered (Supplement to Return A) 1960-2018 [Dataset]. http://doi.org/10.3886/E105403
Unique identifier
https://doi.org/10.3886/E105403
Dataset updated
Feb 19, 2020
Dataset provided by
da|ra (Registration agency for social science and economic data)
Authors
Kaplan, Jacob
Description
For any questions about this data please email me at jacob@crimedatatool.com. If you use this data, please cite it.

Version 4 release notes:
- Adds data for 2018.

Version 3 release notes:
- Adds data in the following formats: Excel.
- Changes project name to avoid confusing this data for the ones done by NACJD.

Version 2 release notes:
- Adds data for 2017.
- Adds a "number_of_months_reported" variable which says how many months of the year the agency reported data.

Property Stolen and Recovered is a Uniform Crime Reporting (UCR) Program data set with information on the number of offenses (crimes included are murder, rape, robbery, burglary, theft/larceny, and motor vehicle theft), the value of the offense, and subcategories of the offense (e.g. robbery is broken down into subcategories including highway robbery, bank robbery, and gas station robbery). The majority of the data relates to theft. Theft is divided into subcategories such as shoplifting, theft of bicycle, theft from building, and purse snatching. For a number of items stolen (e.g. money, jewelry and precious metals, guns), the value of property stolen and the value of property recovered are provided. This data set is also referred to as the Supplement to Return A (Offenses Known and Reported).

All the data was received directly from the FBI as text or .DTA files. I created a setup file based on the documentation provided by the FBI and read the data into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here: https://github.com/jacobkap/crime_data. The Word document file available for download is the guidebook the FBI provided with the raw data, which I used to create the setup file to read in data.

There may be inaccuracies in the data, particularly in the group of columns starting with "auto." To reduce (but certainly not eliminate) data errors, I replaced the following values with NA for the group of columns beginning with "offenses" or "auto", as they are common data entry error values (e.g. they are larger than the agency's population, or much larger than other crimes or months in the same agency): 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99942. This cleaning was NOT done on the columns starting with "value."

For every numeric column I replaced negative indicator values (e.g. "j" for -1) with the negative number they are supposed to be. These negative number indicators are not included in the FBI's codebook for this data but are present in the data. I used the values in the FBI's codebook for the Offenses Known and Clearances by Arrest data.

To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. If an agency has used a different FIPS code in the past, check to make sure the FIPS code is the same as in this data.
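The two cleaning rules described above can be sketched in a few lines. This is a Python illustration only (the author's actual cleaning was done in R); the column names in the example row are hypothetical, the indicator table contains only the single "j" mapping stated in the description, and the row is assumed to hold numeric columns only.

```python
# Values the description lists as common data entry errors.
SUSPECT_VALUES = (
    {n * 1000 for n in range(1, 10)}      # 1000 ... 9000
    | {n * 10000 for n in range(1, 11)}   # 10000 ... 100000
    | {99942}
)

# Only the "j" -> -1 mapping is given in the description; the other
# indicators come from an FBI codebook and are not reproduced here.
NEGATIVE_INDICATORS = {"j": -1}


def clean_row(row):
    """Apply both rules to one row, given as a {column_name: value} dict."""
    cleaned = {}
    for column, value in row.items():
        # Rule 2: turn a negative indicator into the number it stands for.
        if isinstance(value, str) and value in NEGATIVE_INDICATORS:
            value = NEGATIVE_INDICATORS[value]
        # Rule 1: blank out suspect values, but only in "offenses"/"auto"
        # columns; "value" columns are deliberately left alone.
        if column.startswith(("offenses", "auto")) and value in SUSPECT_VALUES:
            value = None  # None stands in for R's NA
        cleaned[column] = value
    return cleaned


# Hypothetical column names, for illustration only.
example = {
    "offenses_robbery": 5000,
    "auto_stolen": "j",
    "value_robbery": 5000,
    "offenses_theft": 12,
}
result = clean_row(example)
print(result)
```

Note that the same number, 5000, is removed from the "offenses" column but kept in the "value" column, matching the rule that the cleaning was not applied to columns starting with "value".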
Digitisation of Weather Records of Seungjeongwon Ilgi: A Historical Weather...
zenodo.org
bin, csv, json, txt
Updated Sep 27, 2023
Cite
Zeyu Lyu; Kohei Ichikawa; Yongchao Cheng; Hisashi Hayakawa; Yukiko Kawamoto (2023). Digitisation of Weather Records of Seungjeongwon Ilgi: A Historical Weather Dynamics Dataset of the Korean Peninsula (1623-1910) [Dataset]. http://doi.org/10.5281/zenodo.7453644
Available download formats: csv, json, bin, txt
Unique identifier
https://doi.org/10.5281/zenodo.7453644
Dataset updated
Sep 27, 2023
Dataset provided by
Zenodo http://zenodo.org/
Authors
Zeyu Lyu; Kohei Ichikawa; Yongchao Cheng; Hisashi Hayakawa; Yukiko Kawamoto
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Korea
Description
Introduction

This study draws on the daily weather records of Seungjeongwon Ilgi from the NIKH database. Seungjeongwon Ilgi (http://sjw.history.go.kr/main.do) is a daily record of the Seungjeongwon, the Royal Secretariat of the Joseon Dynasty of Korea. These diaries span 1623 to 1910 and generally include daily weather records in the entry header. The observation site was presumably located in Seoul (N37°35′, E126°59′). We extracted the weather records from the NIKH database and classified the daily weather using a text mining method. We also converted the report dates from the traditional lunisolar calendar to the Gregorian calendar, to better contextualise our data against contemporary daily measurements.

Data

We provide the data in several formats (csv, xlsx, json) to facilitate its use. The main fields are listed below.

ID: The unique identifier of a specific record in the metadata, which can also serve as the identifier to merge with external data in the NIKH digital database.

Traditional calendar: The original lunar dates in the NIKH digital database, listed in the date format "YYYY-MM-DD". More specifically, "L0" implies the leap year and "L1" implies the common year.

Leap: The identifier of a leap year.

Gregorian calendar: The Gregorian calendar date converted from the traditional calendar date.

Weather Text: The text that describes the weather conditions. Multiple weather descriptions for the same day have been combined.

Flag: The computed value that indicates different combinations of weather conditions.

Volume: The volume of text in the original record.

Herbal Volume: The volume of text in the herbal record.

Sunny: A dummy variable that represents whether the weather description contains the expression of sunny.

Cloudy: A dummy variable that represents whether the weather description contains the expression of cloudy.

Rainy: A dummy variable that represents whether the weather description contains the expression of rainy.

Snow: A dummy variable that represents whether the weather description contains the expression of snow.

Wind: A dummy variable that represents whether the weather description contains the expression of wind.
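The dummy variables above can be pictured as simple keyword tests on the Weather Text field. The sketch below is a minimal Python illustration under stated assumptions: the keyword lists are guesses (common classical Chinese weather characters), not the authors' actual text mining rules, and the sample strings are invented.

```python
# Assumed keywords per category; the authors' real rules may differ.
KEYWORDS = {
    "Sunny": ["晴"],
    "Cloudy": ["陰"],
    "Rainy": ["雨"],
    "Snow": ["雪"],
    "Wind": ["風"],
}


def weather_dummies(weather_text):
    """Return a 0/1 flag per category: 1 if any keyword occurs in the text."""
    return {
        label: int(any(keyword in weather_text for keyword in keywords))
        for label, keywords in KEYWORDS.items()
    }


# Invented sample descriptions.
clear_day = weather_dummies("晴")
mixed_day = weather_dummies("朝陰夕雨")
print(clear_day)
print(mixed_day)
```

A day whose combined text mentions several conditions gets several flags set, which is consistent with the Flag field describing combinations of weather conditions.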

Import Data

# Python
import pandas as pd

# CSV file
data = pd.read_csv('~/SJWilgi_Seoul_Weather_YR1623_1910.csv', encoding="utf-8")

# JSON file
data = pd.read_json('~/SJWilgi_Seoul_Weather_YR1623_1910.json', encoding="utf-8")

# Excel file
data = pd.read_excel('~/SJWilgi_Seoul_Weather_YR1623_1910.xlsx')

# R
# CSV file
library(readr)
data <- read_csv("~/SJWilgi_Seoul_Weather_YR1623_1910.csv")

# Excel file
library(readxl)
data <- read_excel("~/SJWilgi_Seoul_Weather_YR1623_1910.xlsx")
Highlands County GIS Interactive Application
highlands-county-gis-open-data-hcbcc.hub.arcgis.com
Updated Aug 2, 2019
Cite
Highlands County Board of County Commissioners (2019). Highlands County GIS Interactive Application [Dataset]. https://highlands-county-gis-open-data-hcbcc.hub.arcgis.com/datasets/highlands-county-gis-interactive-application
Dataset updated
Aug 2, 2019
Dataset authored and provided by
Highlands County Board of County Commissioners
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Area covered
Highlands County
Description
Users can manipulate the map environment by panning, zooming, etc. Tools include: Identify; Print; Share; Download; Buffer; Select; Get XY; Measure; Search (address, lat/long, parcel number, owner name, street name, schools, fire stations); Query Builder; Coordinate Converter; Drawing Tools; Attribute Table; Mail Merge; Download Excel Table; Grid Overlay; Create Drive Time Areas; Situation Awareness; Bookmarks; Basemap Changer; Emergency Response Guide.

Future work on this app includes: disclaimer review, About page information, testing, cover image, and checking all symbology.
Scottish Census 2011 Population by Council Area
dtechtive.com
find.data.gov.scot
xml, zip
Updated Feb 21, 2017
Cite
University of Edinburgh (2017). Scottish Census 2011 Population by Council Area [Dataset]. http://doi.org/10.7488/ds/1908
Available download formats: zip (8.036 MB), xml (0.0038 MB)
Unique identifier
https://doi.org/10.7488/ds/1908
Dataset updated
Feb 21, 2017
Dataset provided by
University of Edinburgh
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Scotland
Description
This data is sourced from the Census 2011 and shows the population and population density by council area. Raw data was sourced from http://www.scotlandscensus.gov.uk/en/censusresults/downloadablefiles.html and then manipulated in Excel to merge a number of tables. The resulting data was joined to a shapefile of Scottish Council areas from ShareGeo (http://www.sharegeo.ac.uk/handle/10672/305). Both sources should be attributed as the sources of the base data. GIS vector data. This dataset was first accessioned in the EDINA ShareGeo Open repository on 2012-12-19 and migrated to Edinburgh DataShare on 2017-02-21.
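The table merge and density calculation described above amount to a join on a shared area identifier followed by a division. Here is a minimal Python sketch of that idea; the codes, names, populations, and areas are invented placeholders rather than census figures, and the field names `code` and `area_km2` are assumptions.

```python
# Invented placeholder rows; not real census or boundary data.
population_rows = [
    {"code": "X01", "name": "Council X", "population": 100000},
    {"code": "Y02", "name": "Council Y", "population": 30000},
]
area_rows = [
    {"code": "X01", "area_km2": 250.0},
    {"code": "Y02", "area_km2": 1500.0},
]


def join_with_density(populations, areas):
    """Join the two tables on "code" and add people per square kilometre."""
    area_by_code = {row["code"]: row["area_km2"] for row in areas}
    joined = []
    for row in populations:
        area = area_by_code.get(row["code"])
        if area is None:
            continue  # skip areas with no matching boundary record
        joined.append({**row, "area_km2": area, "density": row["population"] / area})
    return joined


councils = join_with_density(population_rows, area_rows)
for council in councils:
    print(council["name"], council["density"])
```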
Excel spreadsheet containing, in separate sheets, underlying numerical data...
plos.figshare.com
xlsx
Updated Sep 15, 2023
Cite
Hoi Tong Wong; Adeline M. Luperchio; Sean Riley; Daniel J. Salamango (2023). Excel spreadsheet containing, in separate sheets, underlying numerical data used to generate the indicated figure panels. [Dataset]. http://doi.org/10.1371/journal.ppat.1011634.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.ppat.1011634.s001
Dataset updated
Sep 15, 2023
Dataset provided by
PLOS Pathogens
Authors
Hoi Tong Wong; Adeline M. Luperchio; Sean Riley; Daniel J. Salamango
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Excel spreadsheet containing, in separate sheets, underlying numerical data used to generate the indicated figure panels.
The National Archives Accessions to Repositories Data c.2007 - 2020
data.niaid.nih.gov
Updated Dec 16, 2022
Cite
Jones, Kevin (2022). The National Archives Accessions to Repositories Data c.2007 - 2020 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7442983
Dataset updated
Dec 16, 2022
Dataset authored and provided by
Jones, Kevin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Annual Accessions to Repositories survey is a UK-wide exercise conducted by The National Archives (TNA) that assesses what is being collected by UK repositories. The primary purpose of this exercise is to place some of this information onto TNA’s search engine, Discovery. More recently, the data has been used to communicate accessions trends to the wider archives sector, including information on what is being collected and where. Each year, TNA sends survey templates in the form of Excel spreadsheets to repositories in each part of the UK. The returns sent to TNA include information on the size of the record, the dates it covers, the creator of the record and a description of the record. Work has been under way since October 2021 to merge and standardise the accessions data held by TNA. This data repository presents the merged dataset.
Excel file of Excluded articles.
plos.figshare.com
xlsx
Updated May 6, 2025
Cite
Emma Begley; Jason Thomas; Carl Senior (2025). Excel file of Excluded articles. [Dataset]. http://doi.org/10.1371/journal.pone.0322324.s006
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0322324.s006
Dataset updated
May 6, 2025
Dataset provided by
PLOS ONE
Authors
Emma Begley; Jason Thomas; Carl Senior
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background
The incidence and prevalence of neurodegenerative diseases (NDs) are growing worldwide. In an environment where healthcare resources are already stretched, it is important to optimise treatment choice to help alleviate healthcare burden. This rapid review aims to consolidate evidence on factors that influence healthcare professionals (HCPs) to prescribe medication for NDs and map them to theoretical models of behaviour change to identify the behavioural determinants that may support optimising prescribing.

Methods and findings
Embase and Ovid MEDLINE were used to identify relevant empirical research studies. Screening, data extraction and quality assessment were carried out by three independent reviewers to ensure consistency. Factors influencing prescribing were mapped to the Theoretical Domains Framework (TDF) and key behavioural determinants were described using the Capability, Opportunity, Motivation – Behaviour (COM-B) model. An initial 3,099 articles were identified, of which 53 were included for data extraction. Fifty-six factors influencing prescribing were identified and categorised into patient, HCP or healthcare system groups, then mapped to TDF and COM-B domains. Prescribing was influenced by capability of HCPs, namely factors mapped to decision making (e.g., patient age or symptom burden) and knowledge (e.g., clinical understanding) behavioural domains. However, most factors were influenced by HCP opportunity, underpinned by factors mapped to social (e.g., prescribing support or culture) and contextual (e.g., lack of resources or medication availability) domains. Less evidence was available on factors influencing the motivation of HCPs; where evident, factors primarily related to HCP beliefs about consequences (e.g., side effects) and professional identity (e.g., level of specialism) were often described.

Conclusions
This systematic analysis of the literature provides an in-depth understanding of the behavioural determinants that may support optimising prescribing practices (e.g., drug costs or pressure from patients’ family members). Understanding these approaches provides an opportunity to identify relevant intervention functions and behaviour change techniques to target the factors that directly influence HCP prescribing behaviour.