54 datasets found

EDA on Cleaned Netflix Data
kaggle.com
Updated Jul 7, 2025
Cite
Nikhil raman K (2025). EDA on Cleaned Netflix Data [Dataset]. https://www.kaggle.com/datasets/nikhilramank/eda-on-cleaned-netflix-data/code
Dataset updated
Jul 7, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nikhil raman K
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This is a cleaned version of a Netflix movies dataset originally used for exploratory data analysis (EDA). The dataset contains information such as:

Title

Release Year

Rating

Genre

Votes

Description

Stars

Missing values have been handled using appropriate methods (mean, median, unknown), and new features like rating_level and popular have been added for deeper analysis.
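
The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's notebook: the sample rows, the choice of which column gets the mean versus the median, and the thresholds behind rating_level and popular are all assumptions made for the example.

```python
from statistics import mean, median

# Hypothetical rows; the keys follow the field list above,
# but the values are invented for illustration.
rows = [
    {"title": "A", "rating": 8.1, "votes": 1200, "genre": "Drama"},
    {"title": "B", "rating": None, "votes": 300, "genre": None},
    {"title": "C", "rating": 6.3, "votes": None, "genre": "Comedy"},
]

def fill_missing(rows):
    """Fill gaps: mean for rating, median for votes, 'Unknown' for genre."""
    rating_mean = mean([r["rating"] for r in rows if r["rating"] is not None])
    votes_median = median([r["votes"] for r in rows if r["votes"] is not None])
    cleaned = []
    for r in rows:
        cleaned.append({
            "title": r["title"],
            "rating": r["rating"] if r["rating"] is not None else rating_mean,
            "votes": r["votes"] if r["votes"] is not None else votes_median,
            "genre": r["genre"] if r["genre"] is not None else "Unknown",
        })
    return cleaned

def add_features(rows, high=7.5, popular_votes=1000):
    """Add rating_level and popular; both thresholds are assumptions."""
    out = []
    for r in rows:
        level = "high" if r["rating"] >= high else "low"
        out.append({**r, "rating_level": level,
                    "popular": r["votes"] >= popular_votes})
    return out

cleaned = add_features(fill_missing(rows))
```

Which statistic suits which column depends on the distribution; the median is usually the safer choice for skewed counts such as votes.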

The dataset is ready for EDA, data visualization, machine learning tasks, and dashboard building.

Used in the accompanying notebook
EDA - Percentage of University Center clients taking action as a result of...
performance.commerce.gov
application/rdfxml +5
Updated Mar 6, 2025
Cite
Economic Development Administration (2025). EDA - Percentage of University Center clients taking action as a result of the assistance facilitated by the University Center [Dataset]. https://performance.commerce.gov/KPI-EDA/EDA-Percentage-of-University-Center-clients-taking/prgp-nn7t
Available download formats: xml, csv, application/rdfxml, tsv, application/rssxml, json
Dataset updated
Mar 6, 2025
Dataset authored and provided by
Economic Development Administration (http://www.eda.gov/)
License
https://www.usa.gov/government-works
Description
This measure determines the perceived value added by the University Centers (UCs) to their clients. EDA funds UCs to provide technical assistance and specialized services (for example, feasibility studies, marketing research, economic analysis, environmental services, and technology transfer) to local officials and communities. This assistance improves the community’s capacity to plan and manage successful development projects. UCs develop client profiles and report findings to EDA, which evaluates the performance of each center once every 3 years and verifies the data. “Taking action as a result of the assistance facilitated” means to implement an aspect of the technical assistance provided by the UC in one of several areas: economic development initiatives and training session development; linkages to crucial resources; economic development planning; project management; community investment package development; geographic information system services; strategic partnering to public or private sector entities; increased organizational capacity; feasibility plans; marketing studies; technology transfer; new company, product, or patent development; and other services.
Data from: The Often-Overlooked Power of Summary Statistics in Exploratory...
acs.figshare.com
xlsx
Updated Jun 8, 2023
Cite
Tahereh G. Avval; Behnam Moeini; Victoria Carver; Neal Fairley; Emily F. Smith; Jonas Baltrusaitis; Vincent Fernandez; Bonnie. J. Tyler; Neal Gallagher; Matthew R. Linford (2023). The Often-Overlooked Power of Summary Statistics in Exploratory Data Analysis: Comparison of Pattern Recognition Entropy (PRE) to Other Summary Statistics and Introduction of Divided Spectrum-PRE (DS-PRE) [Dataset]. http://doi.org/10.1021/acs.jcim.1c00244.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jcim.1c00244.s002
Dataset updated
Jun 8, 2023
Dataset provided by
ACS Publications
Authors
Tahereh G. Avval; Behnam Moeini; Victoria Carver; Neal Fairley; Emily F. Smith; Jonas Baltrusaitis; Vincent Fernandez; Bonnie. J. Tyler; Neal Gallagher; Matthew R. Linford
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Unsupervised exploratory data analysis (EDA) is often the first step in understanding complex data sets. While summary statistics are among the most efficient and convenient tools for exploring and describing sets of data, they are often overlooked in EDA. In this paper, we show multiple case studies that compare the performance, including clustering, of a series of summary statistics in EDA. The summary statistics considered here are pattern recognition entropy (PRE), the mean, standard deviation (STD), 1-norm, range, sum of squares (SSQ), and X4, which are compared with principal component analysis (PCA), multivariate curve resolution (MCR), and/or cluster analysis. PRE and the other summary statistics are direct methods for analyzing data; they are not factor-based approaches. To quantify the performance of summary statistics, we use the concept of the “critical pair,” which is employed in chromatography. The data analyzed here come from different analytical methods. Hyperspectral images, including one of a biological material, are also analyzed. In general, PRE outperforms the other summary statistics, especially in image analysis, although a suite of summary statistics is useful in exploring complex data sets. While PRE results were generally comparable to those from PCA and MCR, PRE is easier to apply. For example, there is no need to determine the number of factors that describe a data set. Finally, we introduce the concept of divided spectrum-PRE (DS-PRE) as a new EDA method. DS-PRE increases the discrimination power of PRE. We also show that DS-PRE can be used to provide the inputs for the k-nearest neighbor (kNN) algorithm. We recommend PRE and DS-PRE as rapid new tools for unsupervised EDA.
Data from: Wrist-worn sensor validation for heart rate variability and...
data.niaid.nih.gov
data.mendeley.com
+1more
Updated Jul 11, 2024
+ more versions
Cite
Falivene, Anna (2024). Wrist-worn sensor validation for heart rate variability and electrodermal activity detection in a stressful driving environment [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8059241
Dataset updated
Jul 11, 2024
Dataset provided by
Costantini, Simone
Dei, Carla
Malerba, Giorgia
Falivene, Anna
Storm, Fabio
Biffi, Emilia
Chiappini, Mattia
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset helps assess the accuracy of the Empatica 4 (E4) wristband for detecting heart rate variability (HRV) and electrodermal activity (EDA) metrics in stress-inducing conditions and driving scenarios of growing risk. HRV and EDA signals were recorded over six experimental conditions (Baseline, Video Clip, Scream, No Risk Driving, Low-Risk Driving, and High-Risk Driving) with two measurement systems: the E4 device and a gold standard system. The raw quality of the physiological signals was enhanced by means of robust semi-automatic reconstruction algorithms. HRV time-domain parameters showed high accuracy in motion-free experimental conditions, while HRV frequency-domain parameters reported sufficient accuracy in almost every experimental condition.
EDA in Automotive Report
marketreportanalytics.com
doc, pdf, ppt
Updated Apr 3, 2025
+ more versions
Cite
Market Report Analytics (2025). EDA in Automotive Report [Dataset]. https://www.marketreportanalytics.com/reports/eda-in-automotive-56229
Available download formats: pdf, doc, ppt
Dataset updated
Apr 3, 2025
Dataset authored and provided by
Market Report Analytics
License
https://www.marketreportanalytics.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The automotive EDA (Electronic Design Automation) market is experiencing robust growth, driven by the increasing complexity of electric vehicles (EVs), autonomous driving systems, and advanced driver-assistance systems (ADAS). The shift towards software-defined vehicles necessitates sophisticated simulation and design tools to ensure safety, performance, and reliability. This is fueling demand for cloud-based EDA solutions, offering scalability and collaborative design capabilities. While on-premise solutions remain relevant for specific needs, the cloud's flexibility and cost-effectiveness are driving a significant market shift. OEMs are major drivers of market growth, investing heavily in developing next-generation vehicles. However, the 4S shops and other service providers are also seeing increased adoption for maintenance and repair purposes and aftermarket modifications. The market is segmented by application (OEMs, 4S shops, others) and type (cloud-based, on-premise). Leading players like ANSYS, Altair Engineering, and Dassault Systèmes are investing in R&D and strategic acquisitions to expand their market share and capabilities. Geographic growth is widespread, with North America and Europe currently leading, followed by the rapidly expanding Asia-Pacific region due to increasing vehicle production and technological advancements in countries like China and India. Challenges include the high cost of EDA software and the need for skilled professionals to operate these complex tools. However, the long-term growth outlook remains very positive, fueled by ongoing technological advancements and increasing vehicle electrification and autonomy. The forecast period (2025-2033) suggests a sustained high CAGR, reflecting the continued integration of electronics and software in vehicles. Assuming a conservative CAGR of 15% and a 2025 market size of $5 billion, the market is projected to reach approximately $17 Billion by 2033. 
This growth will be driven by factors such as increased adoption of EV and autonomous driving technologies, stricter regulatory compliance requirements necessitating extensive simulation and validation, and growing demand for high-performance computing resources for complex simulations. The competitive landscape is characterized by both established players and emerging innovative companies. Successful players will be those that can adapt to the evolving technological landscape and offer flexible, scalable, and user-friendly solutions.
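
The projection in the description is plain compound growth, which can be checked with a short sketch. The function name `project` is hypothetical; the inputs (a $5 billion base and a 15% CAGR) are the ones the description itself assumes. Counting 2025 to 2033 as eight compounding periods gives roughly $15.3 billion, while nine periods gives roughly $17.6 billion, so the quoted figure of about $17 billion sits closer to a nine-period count.

```python
def project(start_value, cagr, years):
    """Compound start_value forward by `years` periods at rate `cagr`."""
    return start_value * (1 + cagr) ** years

# Inputs taken from the description: $5 billion base, 15% CAGR.
eight = project(5.0, 0.15, 8)   # 2025 -> 2033 counted as 8 periods
nine = project(5.0, 0.15, 9)    # same span counted as 9 periods

print(f"8 periods: ${eight:.1f}B, 9 periods: ${nine:.1f}B")
```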
BIOGRID CURATED DATA FOR EDA (Homo sapiens)
thebiogrid.org
zip
Updated Oct 16, 2019
+ more versions
Cite
BioGRID Project (2019). BIOGRID CURATED DATA FOR EDA (Homo sapiens) [Dataset]. https://thebiogrid.org/108224/table/homo-sapiens/eda.html
Available download formats: zip
Dataset updated
Oct 16, 2019
Dataset authored and provided by
BioGRID Project
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for EDA (Homo sapiens) curated by BioGRID (https://thebiogrid.org); DEFINITION: ectodysplasin A
Supplemental data for "An Automated On-the-Go Unloading System Reduces...
asabe.figshare.com
txt
Updated Aug 13, 2024
Cite
Travis Burgers; Kusha Kamarei; Mukund Vora (2024). Supplemental data for "An Automated On-the-Go Unloading System Reduces Harvest Operator Stress Relative to Manual Operation" [Dataset]. http://doi.org/10.13031/26319289.v1
Available download formats: txt
Unique identifier
https://doi.org/10.13031/26319289.v1
Dataset updated
Aug 13, 2024
Dataset provided by
American Society of Agricultural and Biological Engineers
Authors
Travis Burgers; Kusha Kamarei; Mukund Vora
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset information: machine-operator_usage.csv
This file contains the machine-operator usage data for July, October, and November for three grain cart tractors. This is the data for Figure 3. The table includes the following variables for machine-powered days: day, machine ID, time spent driving at slow speed, time at medium speed, time at fast speed, time machine was powered before harvesting, time machine was powered after harvesting, time stationary during harvesting, total time machine was powered, total distance travelled.

Dataset information: stress_combine.csv, stress_grain_cart.csv
These files contain the stress rate and other unloading data for each on-the-go unload event (the ones that qualified for stress analysis). This is the data for Figures 4–9. These tables include the following variables for unload events: unload ID, state, crop, unload event start time, unload event end time, automation type, duration auger on, automation time percentage, number of stressful events, number of invalid wristband data points, mean tractor speed, mean combine speed, stress rate, combine operator experience, stress rate manual mean, subject-normalized stress rate, combine operator ID, grain cart operator experience, grain cart operator ID.

Dataset information: XTEt_XTEc.csv
This file contains cross-track error (XTE) standard deviation (SD) data from each on-the-go unload event that was used to evaluate steering performance. This is the data for Figure 10. The table includes the following variables for unload events: unload ID, state, crop, unload event start time, unload event end time, automation type, duration auger on, automation time percentage, mean tractor speed, mean combine speed, XTE SD combine, XTE SD tractor.
Electronic Design Automation (EDA) for Semiconductor Chips Market Report |...
dataintelo.com
csv, pdf, pptx
Updated Jun 25, 2024
Cite
Dataintelo (2024). Electronic Design Automation (EDA) for Semiconductor Chips Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-electronic-design-automation-eda-for-semiconductor-chips-market
Available download formats: csv, pptx, pdf
Dataset updated
Jun 25, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Electronic Design Automation (EDA) for Semiconductor Chips Market Outlook 2032

The global electronic design automation (EDA) for semiconductor chips market size was USD 15.95 Billion in 2023 and is projected to reach USD 37.69 Billion by 2032, expanding at a CAGR of 10% during 2024–2032. The market growth is attributed to the use of these chips in modern technology.
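
As a rough consistency check on these figures, the growth rate implied by the two endpoints can be recovered by inverting the compound-growth formula. The helper name `implied_cagr` is hypothetical, and treating 2023 to 2032 as nine compounding periods is an assumption; under it the implied rate comes out very close to the stated 10%.

```python
def implied_cagr(start_value, end_value, years):
    """Rate r such that start_value * (1 + r) ** years == end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# USD 15.95 billion (2023) to USD 37.69 billion (2032), assumed 9 periods.
rate = implied_cagr(15.95, 37.69, 9)
print(f"implied CAGR: {rate:.2%}")
```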

Increasingly, Electronic Design Automation (EDA) for semiconductor chips is becoming a cornerstone of modern technology. EDA, a category of software tools for designing electronic systems such as printed circuit boards and integrated circuits, plays a pivotal role in the creation of complex electronic systems. The sophistication of these tools allows for the design and production of semiconductor chips with unparalleled efficiency and precision, making them indispensable in an era defined by rapid technological advancement.

Rising regulatory scrutiny is leading to the introduction of new rules and regulations by governing bodies such as the International Electrotechnical Commission (IEC) and the Institute of Electrical and Electronics Engineers (IEEE). These regulations, aimed at ensuring the safety, reliability, and environmental sustainability of EDA products and services, are reshaping the market. The impact of these regulations is likely to be profound, driving innovation and fostering a greater emphasis on compliance within the industry.

Impact of Artificial Intelligence (AI) in Electronic Design Automation (EDA) for Semiconductor Chips Market

Artificial Intelligence (AI) has a considerable impact on the electronic design automation (EDA) for semiconductor chips market. AI accelerates the development process, reducing time-to-market and enhancing competitive advantage by automating intricate design tasks. It enables the creation of complex and powerful chips, driving innovation in a myriad of industries.

AI's predictive analytics capabilities facilitate proactive error detection, minimizing costly design flaws and improving product reliability. However, this technological shift necessitates a profound transformation in skill sets and processes, requiring significant investment in training and infrastructure.
Breast cancer dataset
kaggle.com
Updated Jul 30, 2025
Cite
Wasiq Ali (2025). Breast cancer dataset [Dataset]. https://www.kaggle.com/datasets/wasiqaliyasir/breast-cancer-dataset
Dataset updated
Jul 30, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Wasiq Ali
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Breast Cancer Dataset

Description

The Breast Cancer Dataset hosted on Kaggle is a powerful resource for researchers, data scientists, and machine learning enthusiasts looking to explore and develop predictive models for breast cancer diagnosis. This dataset, accessible via Kaggle, is designed for binary classification tasks to predict whether a breast tumor is benign or malignant. It provides a rich collection of features derived from digitized images of fine needle aspirates (FNA) of breast masses, making it an essential tool for advancing healthcare analytics and computational pathology. Below is a comprehensive, human-crafted description of the dataset, complete with examples and key highlights to make it engaging and informative.

Overview

The dataset originates from the Breast Cancer Wisconsin (Diagnostic) Data Set, a widely used benchmark in machine learning for medical diagnostics. It contains detailed measurements of cell nuclei from breast tissue samples, enabling the classification of tumors as either benign (non-cancerous) or malignant (cancerous). This dataset is particularly valuable for developing and testing machine learning models, such as logistic regression, support vector machines, or deep neural networks, to aid in early and accurate breast cancer detection.

Purpose: Binary classification to predict tumor type (benign or malignant).

Source: University of Wisconsin, provided through Kaggle.

Link: Breast Cancer Dataset on Kaggle.

Application: Ideal for medical research, machine learning model development, and educational purposes.

Dataset Structure

The dataset comprises 569 instances (rows) and 32 columns, including an ID column, a diagnosis label, and 30 numerical features describing cell nuclei characteristics. Each instance represents a single breast mass sample, with features computed from digitized FNA images.

Key Columns:

ID: A unique identifier for each sample (e.g., 842302).

Diagnosis: The target variable, labeled as M (Malignant), indicating a cancerous tumor, or B (Benign), indicating a non-cancerous tumor.

Features (30 columns): Numerical measurements of cell nuclei, such as radius, texture, perimeter, and area, derived from image analysis.

Feature Categories:

The 30 features are grouped into three main categories based on the characteristics of cell nuclei:

Mean: Average values of measurements (e.g., mean radius, mean texture).

Standard Error (SE): Variability of measurements (e.g., standard error of radius, standard error of area).

Worst: Largest (worst) values of measurements (e.g., worst radius, worst smoothness).

Each category includes 10 specific measurements:

Radius (mean of distances from center to points on the perimeter)

Texture (standard deviation of grayscale values)

Perimeter

Area

Smoothness (local variation in radius lengths)

Compactness (perimeter² / area - 1.0)

Concavity (severity of concave portions of the contour)

Concave points (number of concave portions of the contour)

Symmetry

Fractal dimension ("coastline approximation" - 1)

Example Data Point: Here’s a simplified example of a single row in the dataset:

ID Diagnosis Radius_mean Texture_mean Perimeter_mean Area_mean Smoothness_mean ...

842302 M 17.99 10.38 122.80 1001.0 0.11840 ...

Interpretation: This sample (ID 842302) is malignant (M), with a mean radius of 17.99 units, a mean texture of 10.38, and so on. The remaining 25 columns provide additional measurements (e.g., standard error and worst values).

Key Highlights

Balanced Classes: The dataset includes 357 benign and 212 malignant cases, offering a relatively balanced distribution for training robust models.

No Missing Values: The dataset is clean and preprocessed, with no missing or null values, making it ready for immediate analysis.

High Dimensionality: With 30 numerical features, the dataset supports complex modeling techniques, including feature selection and dimensionality reduction.

Real-World Impact: The dataset is widely used in research to improve diagnostic accuracy, contributing to early breast cancer detection and better patient outcomes.

Open Access: Freely available on Kaggle, encouraging collaboration and innovation in the data science community.

Potential Use Cases

Machine Learning: Train classification models (e.g., Random Forest, SVM, or Neural Networks) to predict tumor malignancy.

Feature Engineering: Explore correlations between features (e.g., radius and area) to identify key predictors of malignancy.

Data Visualization: Create visualizations (e.g., scatter plots, heatmaps) to understand feature distributions and relationships.

Medical Research: Support computational pathology studies by analyzing nuclear characteristics for diagnostic insights.

Educational Tool: Perfect for teaching data science concepts, such as preprocessing...
Multidimensional Dataset for APA Investigations in Cancer Patients
zenodo.org
Updated Sep 6, 2024
Cite
Marco Cascella; Alfonso Maria Ponsiglione; Vittorio Santoriello; Ornella Piazza; Francesco Amato; Maria Romano (2024). Multidimensional Dataset for APA Investigations in Cancer Patients [Dataset]. http://doi.org/10.5281/zenodo.13711426
Unique identifier
https://doi.org/10.5281/zenodo.13711426
Dataset updated
Sep 6, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marco Cascella; Alfonso Maria Ponsiglione; Vittorio Santoriello; Ornella Piazza; Francesco Amato; Maria Romano
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains data collected from patients suffering from cancer-related pain. The features extracted from clinical data (including typical cancer phenomena such as breakthrough pain) and the biosignal acquisitions contributed to the definition of a multidimensional dataset. This unique database can be useful for the characterization of the patient’s pain experience from a qualitative and quantitative perspective. We implemented measurable biosignal-related indicators of the individual’s pain response and of the overall Autonomic Nervous System (ANS) functioning. The most distinctive features extracted from EDA and ECG signals can be adopted to investigate the status and complex functioning of the ANS through the study of sympatho-vagal activations. Specifically, while EDA is mainly related to sympathetic activation, the Heart Rate Variability (HRV), which can be derived from ECG recordings, is strictly related to the interplay between sympathetic and parasympathetic functioning.

For the EDA signal, two types of analysis were performed: (i) the Trough-To-Peak (TTP) analysis, or min-max analysis, which measures the difference between the Skin Conductance (SC) at the peak of a response and its preceding minimum within pre-established time windows; (ii) the Continuous Decomposition Analysis (CDA), which decomposes SC data into continuous signals of tonic (basic level of conductance) and phasic (short-duration changes in the SC) activity. Before applying the TTP analysis or the CDA, the signal was filtered with a fifth-order Butterworth low-pass filter with a cutoff frequency of 1 Hz and downsampled to 10 Hz to reduce the computational burden of the analysis. Applying TTP and CDA allowed the detection and measurement of SC Responses (SCRs), and the following parameters were calculated for both the TTP and CDA methodologies:

Total number of detected SCRs.

Maximum value of SCRs [measured in μS].

Minimum value of SCRs [measured in μS].

Arithmetic mean of the SCRs [measured in μS].

Maximum interval between SCRs [measured in ms].

Minimum interval between SCRs [measured in ms].

Arithmetic mean of the intervals between SCRs [measured in ms].
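
The trough-to-peak idea can be illustrated with a simplified sketch. It is not the authors' pipeline: it skips the Butterworth filtering, the downsampling, and the pre-established time windows, and it covers only the amplitude statistics, not the inter-SCR intervals. The function name, the 0.1 μS threshold, and the sample signal are assumptions made for the example.

```python
from statistics import mean

def trough_to_peak(sc, min_amplitude=0.1):
    """Return SCR amplitudes as (peak - preceding trough), in the units of sc.

    Simplified: a trough is a local minimum and the peak is the next local
    maximum. min_amplitude is an assumed threshold, not a dataset value.
    """
    amplitudes = []
    trough = None
    for i in range(1, len(sc) - 1):
        if sc[i] <= sc[i - 1] and sc[i] < sc[i + 1]:
            trough = sc[i]                       # start of a rise
        elif sc[i] > sc[i - 1] and sc[i] >= sc[i + 1] and trough is not None:
            amp = sc[i] - trough                 # peak reached
            if amp >= min_amplitude:
                amplitudes.append(amp)
            trough = None
    return amplitudes

# Invented skin conductance samples in microsiemens.
signal = [2.0, 1.9, 2.4, 2.2, 2.1, 2.15, 2.1, 2.0, 2.6, 2.3]
amps = trough_to_peak(signal)
summary = {"count": len(amps), "max": max(amps),
           "min": min(amps), "mean": mean(amps)}
```

The small bump from 2.1 to 2.15 falls below the threshold and is not counted, which is the role a minimum-amplitude criterion usually plays in SCR detection.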

Concerning the ECG, the RR series of interbeat intervals (i.e., the time between successive R waves of the QRS complex on the ECG waveform) has been computed to extract time-domain parameters of the HRV. The R peak detection was carried out by adopting the Pan–Tompkins algorithm for QRS detection and R peak identification. The corresponding RR series of interbeat intervals were derived as the difference between successive R peaks.

The ECG-derived RR time series was then filtered by means of a recursive procedure to remove the intervals differing most from the mean of the surrounding RR intervals. Then, both the Time-Domain Analysis (TDA) and Frequency-Domain Analysis (FDA) of the HRV have been carried out to extract the main features characterizing the variability of the heart rhythm. Time-domain parameters are obtained from statistical analysis of the intervals between heart beats and are used to describe how much variability in the heartbeats is present at various time scales.

The parameters computed through the TDA include the following:

Arithmetic mean of the RR time series [measured in ms].

The standard deviation of the RR time series [measured in ms].

Mean value of heart rate [measured in bpm].

Standard deviation of the heart rate [measured in bpm].

Root Mean Square of Successive Differences of RR intervals [measured in ms], which is sensitive to high-frequency heart period fluctuations in the respiratory frequency range and has been used as an index of vagal cardiac control.

Number of successive RR intervals whose difference is higher than 50 ms.

Percentage of successive RR intervals whose difference is higher than 50 ms.
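
The time-domain parameters listed above follow directly from the RR series, as the sketch below shows. The function name and the example RR values are invented, and the use of the sample standard deviation and of instantaneous heart rate (60000 / RR) are assumptions; the dataset description does not say which conventions were applied.

```python
from math import sqrt
from statistics import mean, stdev

def hrv_time_domain(rr_ms):
    """Time-domain HRV parameters from RR intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    hr = [60000.0 / rr for rr in rr_ms]           # instantaneous heart rate, bpm
    nn50 = sum(1 for d in diffs if abs(d) > 50)   # successive differences > 50 ms
    return {
        "mean_rr": mean(rr_ms),
        "sd_rr": stdev(rr_ms),                    # sample standard deviation
        "mean_hr": mean(hr),
        "sd_hr": stdev(hr),
        "rmssd": sqrt(mean([d * d for d in diffs])),
        "nn50": nn50,
        "pnn50": 100.0 * nn50 / len(diffs),
    }

rr = [800, 810, 870, 860, 790]                    # invented example values
params = hrv_time_domain(rr)
```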

Frequency-domain parameters reflect the distribution of spectral power across different frequency bands and are used to assess specific components of HRV (e.g., thermoregulation control loop, baroreflex control loop, and respiration control loop, which are regulated by both sympathetic and vagal nerves of the ANS).
The FDA parameters were computed with Welch's periodogram method, based on the Discrete Fourier Transform (DFT), which expresses the RR series in the discrete frequency domain. Welch's method was chosen to cope with the non-stationarity of the RR series: it divides the signal into segments of constant length and applies the Fast Fourier Transform (FFT) to each segment individually. The periodogram is essentially a way of estimating the power spectral density of a time series.

The FDA parameters include the following:

Peak value in the Very Low Frequency Band of the HRV power density spectrum [measured in Hz].

Peak value in the Low Frequency Band of the HRV power density spectrum [measured in Hz].

Peak value in the High Frequency Band of the HRV power density spectrum [measured in Hz].

Power in the Very Low Frequency Band of the HRV power density spectrum [measured in ms^2].

Power in the Low Frequency Band of the HRV power density spectrum [measured in ms^2].

Power in the High Frequency Band of the HRV power density spectrum [measured in ms^2].

Total Power of the HRV power density spectrum [measured in ms^2].

Percentage power in the Very Low Frequency Band of the HRV power density spectrum with respect to the total power.

Percentage power in the Low Frequency Band of the HRV power density spectrum with respect to the total power.

Percentage power in the High Frequency Band of the HRV power density spectrum with respect to the total power.

Normalized power in the Low Frequency Band of the HRV power density spectrum with respect to the sum of LF and HF power.

Normalized power in the High Frequency Band of the HRV power density spectrum with respect to the sum of LF and HF power.

Sympathovagal balance measured as the ratio between power in the LF band and power in the HF band.
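
Once the band powers are known, the percentage, normalized, and ratio parameters are simple arithmetic. The sketch below assumes that total power is the sum of the VLF, LF, and HF bands, that normalized power is expressed as a percentage of LF + HF, and that sympathovagal balance is LF power over HF power (the usual convention); the function name and input values are invented.

```python
def band_ratios(vlf, lf, hf):
    """Percentage, normalized, and LF/HF values from band powers in ms^2."""
    total = vlf + lf + hf     # assumption: total power is the sum of the three bands
    return {
        "vlf_pct": 100.0 * vlf / total,
        "lf_pct": 100.0 * lf / total,
        "hf_pct": 100.0 * hf / total,
        "lf_nu": 100.0 * lf / (lf + hf),   # normalized to LF + HF
        "hf_nu": 100.0 * hf / (lf + hf),
        "lf_hf": lf / hf,                  # sympathovagal balance
    }

r = band_ratios(vlf=200.0, lf=600.0, hf=200.0)
```

Estimating the band powers themselves requires a spectral estimate of the RR series, for example with a Welch periodogram from a signal-processing library.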
APAC EDA (Electronic Design Automation) Market Outlook to 2030
kenresearch.com
pdf
Updated Dec 17, 2024
Cite
Ken Research (2024). APAC EDA (Electronic Design Automation) Market Outlook to 2030 [Dataset]. https://www.kenresearch.com/industry-reports/apac-eda-electronic-design-automation-market
Available download formats: pdf
Dataset updated
Dec 17, 2024
Dataset authored and provided by
Ken Research
License
https://www.kenresearch.com/terms-and-conditions
Description
The APAC EDA (Electronic Design Automation) market size was USD 7.89 billion in 2023. The report explores compliance trends, sourcing strategies, and the innovation pipeline to define go-to-market priorities.

hotel_booking_data

kaggle.com

Updated Dec 9, 2024

Cite

Sadidul Kabir (2024). hotel_booking_data [Dataset]. https://www.kaggle.com/datasets/chanchal57/hotel-booking-data

Dataset updated

Dec 9, 2024

Dataset provided by

Kaggle

Authors

Sadidul Kabir

Description

This data set contains booking information for a city hotel and a resort hotel and includes information such as when the booking was made, length of stay, the number of adults, children, and/or babies, and the number of available parking spaces, among other things.

All personally identifying information has been removed from the data.

NOTE: Names, Emails, Phone Numbers, and Credit Card numbers in the data are synthetic and not real information from people. The hotel data is real from the publication listed above.

Data Column Reference

Variable	Type	Description	Source/Engineering
ADR	Numeric	Average Daily Rate as defined by [5]	BO, BL and TR / Calculated by dividing the sum of all lodging transactions by the total number of staying nights
Adults	Integer	Number of adults	BO and BL
Agent	Categorical	ID of the travel agency that made the booking	BO and BL
ArrivalDateDayOfMonth	Integer	Day of the month of the arrival date	BO and BL
ArrivalDateMonth	Categorical	Month of arrival date with 12 categories: “January” to “December”	BO and BL
ArrivalDateWeekNumber	Integer	Week number of the arrival date	BO and BL
ArrivalDateYear	Integer	Year of arrival date	BO and BL
AssignedRoomType	Categorical	Code for the type of room assigned to the booking. Sometimes the assigned room type differs from the reserved room type due to hotel operation reasons (e.g. overbooking) or by customer request. Code is presented instead of designation for anonymity reasons	BO and BL
Babies	Integer	Number of babies	BO and BL
BookingChanges	Integer	Number of changes/amendments made to the booking from the moment the booking was entered on the PMS until the moment of check-in or cancellation	BO and BL / Calculated by adding the number of unique iterations that change some of the booking attributes, namely: persons, arrival date, nights, reserved room type or meal
Children	Integer	Number of children	BO and BL / Sum of both payable and non-payable children
Company	Categorical	ID of the company/entity that made the booking or responsible for paying the booking. ID is presented instead of designation for anonymity reasons	BO and BL
Country	Categorical	Country of origin. Categories are represented in the ISO 3155–3:2013 format [6]	BO, BL and NT
CustomerType	Categorical	Type of booking, assuming one of four categories: Contract – when the booking has an allotment or other type of contract associated to it; Group – when the booking is associated to a group; Transient – when the booking is not part of a group or contract, and is not associated to other transient booking; Transient-party – when the booking is transient, but is associated to at least other transient booking	BO and BL
DaysInWaitingList	Integer	Number of days the booking was in the waiting list before it was confirmed to the customer	BO / Calculated by subtracting the date the booking was confirmed to the customer from the date the booking entered on the PMS
DepositType	Categorical	Indication on if the customer made a deposit to guarantee the booking. This variable can assume three categories: No Deposit – no deposit was made; Non Refund – a deposit was made in the value of the total stay cost; Refundable – a deposit was made wi...	BO and TR / Value calculated based on the payments identified for the booking in the transaction (TR) table before the booking's arrival or cancellation date. In case no payments were found the value is “No Deposit”. If the payment was equal or exceeded the total cost of stay, the value is set as “Non Refund”. Otherwise the value is set as “Refundable”
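
The ADR and DepositType rules in the column reference can be sketched in a few lines of Python. This is an illustrative sketch only, not the publication's code: the function names `adr` and `deposit_type`, their parameters, and the choice to raise an error for zero staying nights are assumptions made for the example.

```python
def adr(lodging_transactions, staying_nights):
    """Average Daily Rate: sum of lodging transactions / total staying nights.

    Raising on zero nights is an assumption of this sketch; the column
    reference does not say how such bookings are handled.
    """
    if staying_nights <= 0:
        raise ValueError("staying_nights must be positive")
    return sum(lodging_transactions) / staying_nights


def deposit_type(payments_before_arrival, total_stay_cost):
    """Classify a booking from the payments found before arrival or cancellation."""
    if not payments_before_arrival:
        # No payments were found for the booking.
        return "No Deposit"
    if sum(payments_before_arrival) >= total_stay_cost:
        # Payment equalled or exceeded the total cost of stay.
        return "Non Refund"
    # Any smaller payment counts as a refundable deposit.
    return "Refundable"
```

For example, `adr([100.0, 120.0, 80.0], 3)` divides 300.0 by 3 nights, and `deposit_type([50], 300)` falls through to the "Refundable" branch.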

BIOGRID CURATED DATA FOR EDA (Escherichia coli (K12/W3110))
thebiogrid.org
zip
Updated Nov 5, 2016
Cite
BioGRID Project (2016). BIOGRID CURATED DATA FOR EDA (Escherichia coli (K12/W3110)) [Dataset]. https://thebiogrid.org/4259153/table/escherichia-coli/eda.html
Explore at:
Available download formats: zip
Dataset updated
Nov 5, 2016
Dataset authored and provided by
BioGRID Project
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for EDA (Escherichia coli (K12/W3110)) curated by BioGRID (https://thebiogrid.org); DEFINITION: multifunctional 2-keto-3-deoxygluconate 6-phosphate aldolase, 2-keto-4-hydroxyglutarate aldolase, and oxaloacetate decarboxylase
Data from: Photocatalytic Radical Decarboxylation [4 + 3] Annulation...
datasetcatalog.nlm.nih.gov
acs.figshare.com
Updated Feb 8, 2024
Cite
Liang, Chuyun; Wang, Shuzhong; Deji, Cuo; Wu, Xinxin; Huang, Huicai; Ke, Mingji; Zhan, Ruoting (2024). Photocatalytic Radical Decarboxylation [4 + 3] Annulation Reactions of Lactones via Dienoic Acid EDA Complexes [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001416619
Explore at:
Dataset updated
Feb 8, 2024
Authors
Liang, Chuyun; Wang, Shuzhong; Deji, Cuo; Wu, Xinxin; Huang, Huicai; Ke, Mingji; Zhan, Ruoting
Description
The process of radical decarboxylation is crucial in organic synthesis. Nevertheless, decarboxylation of dienoic acids presents a greater challenge compared to that of aliphatic carboxylic acids. Herein, catalyst- and additive-free visible-light-promoted [4 + 3] annulation of lactones and diamines was achieved via radical decarboxylation of dienoic acids. By means of this novel EDA-activated [4 + 3] annulation, the 1,5-benzodiazepines, which display a wide range of biological activities and are widely used in many fields, can be directly accessed in high yields under mild conditions. This visible-light-induced radical decarboxylation [4 + 3] annulation tolerates a broad array of functional groups and intricate molecules, including pharmaceutical-relevant compounds and natural products.
Cognitive Fatigue
figshare.com
csv
Updated Jun 4, 2025
Cite
Rui Varandas; Inês Silveira; Hugo Gamboa (2025). Cognitive Fatigue [Dataset]. http://doi.org/10.6084/m9.figshare.28188143.v3
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.28188143.v3
Dataset updated
Jun 4, 2025
Dataset provided by
figshare
Authors
Rui Varandas; Inês Silveira; Hugo Gamboa
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cognitive Fatigue

2.1. Experimental design

Cognitive fatigue (CF) is a phenomenon that arises following prolonged engagement in mentally demanding cognitive tasks. Thus, we developed an experimental procedure that involved three demanding tasks: a digital lesson in Jupyter Notebook format, three repetitions of the Corsi-Block task, and two repetitions of a concentration test.

Before the Corsi-Block task and after the concentration task there were baseline periods of two minutes. In our analysis, the first baseline period, although not explicitly present in the dataset, was designated as representing no CF, whereas the final baseline period was designated as representing the presence of CF. Between repetitions of the Corsi-Block task, there were baseline periods of 15 s after the task and of 30 s before the beginning of each repetition of the task.

2.2. Data recording

A sample of 10 volunteer participants (4 females) aged between 22 and 48 years old (M = 28.2, SD = 7.6) took part in this study. All volunteers were recruited at NOVA School of Science and Technology, were fluent in English and right-handed, and none reported suffering from psychological disorders or taking regular medication. Written informed consent was obtained before participation, and all ethical procedures approved by the Ethics Committee of NOVA University of Lisbon were thoroughly followed.

In this study, we omitted the data from one participant due to the insufficient duration of data acquisition.

2.3. Data labelling

The labels easy, difficult, very difficult and repeat found in the ECG_lesson_answers.txt files represent the subjects' opinion of the content read in the ECG lesson. The repeat label represents the most difficult level. It is called repeat because pressing it shows the answer to the question again. This system is based on the Anki system, which has been proposed and used to memorise information effectively.

In addition, the PB description JSON files include timestamps indicating the start and end of cognitive tasks, baseline periods, and other events, which are useful for defining CF states as defined in 2.1.

2.4. Data description

Biosignals include EEG, fNIRS (not converted to oxy- and deoxy-Hb), ECG, EDA, respiration (RIP), accelerometer (ACC), and push-button data (PB). All signals have already been converted to physical units. In each biosignal file, the first column corresponds to the timestamps.

HCI features encompass keyboard, mouse, and screenshot data. Below is a Python code snippet for extracting screenshot files from the screenshots CSV file.

    import base64
    from os import makedirs
    from os.path import join

    file = '...'
    with open(file, 'r') as f:
        lines = f.readlines()

    # Create the output folder once, before the loop; exist_ok avoids an
    # error if the folder is already there.
    makedirs('screenshot', exist_ok=True)

    for line in lines[1:]:  # skip the header row
        timestamp = line.split(',')[0]
        code = line.split(',')[-1][:-2]  # last column holds the base64 image
        imgdata = base64.b64decode(code)
        filename = str(timestamp) + '.jpeg'
        with open(join('screenshot', filename), 'wb') as f:
            f.write(imgdata)

A characterization file containing age and gender information for all subjects in each dataset is provided within the respective dataset folder (e.g., D2_subject-info.csv). Other complementary files include (i) a description of the pushbuttons to help segment the signals (e.g., D2_S2_PB_description.json) and (ii) labelling (e.g., D2_S2_ECG_lesson_results.txt). The files D2_Sx_results_corsi-block_board_1.json and D2_Sx_results_corsi-block_board_2.json show the results for the first and second iterations of the Corsi-Block task, where, for example, row_0_1 = 12 means that the subject got 12 pairs right in the first row of the first board, and row_0_2 = 12 means that the subject got 12 pairs right in the first row of the second board.
Table1_Functional and clinical analysis of five EDA variants associated with...
frontiersin.figshare.com
docx
Updated Aug 30, 2023
Cite
Sare Gökdere; Holm Schneider; Ute Hehr; Laure Willen; Pascal Schneider; Sigrun Maier-Wohlfart (2023). Table1_Functional and clinical analysis of five EDA variants associated with ectodermal dysplasia but with a hard-to-predict significance.DOCX [Dataset]. http://doi.org/10.3389/fgene.2022.934395.s002
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fgene.2022.934395.s002
Dataset updated
Aug 30, 2023
Dataset provided by
Frontiers
Authors
Sare Gökdere; Holm Schneider; Ute Hehr; Laure Willen; Pascal Schneider; Sigrun Maier-Wohlfart
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Deficiency of ectodysplasin A1 (EDA1) due to variants of the gene EDA causes X-linked hypohidrotic ectodermal dysplasia (XLHED), a rare genetic condition characterized by abnormal development of ectodermal structures. XLHED is defined by the triad of hypotrichosis, hypo- or anhidrosis, and hypo- or anodontia. Anhidrosis may lead to life-threatening hyperthermia. A definite genetic diagnosis is, thus, important for the patients’ management and amenability to a novel prenatal treatment option. Here, we describe five familial EDA variants segregating with the disease in three families, for which different prediction tools yielded discordant results with respect to their significance. Functional properties in vitro and levels of circulating serum EDA were compared with phenotypic data on skin, hair, eyes, teeth, and sweat glands. EDA1-Gly176Val, although associated with relevant hypohidrosis, still bound to the EDA receptor (EDAR). Subjects with EDA1-Pro389LeufsX27, -Ter392GlnfsX30, -Ser125Cys, and an EDA1 splice variant (c.924+7A > G) showed complete absence of pilocarpine-induced sweating. EDA1-Pro389LeufsX27 was incapable of binding to EDAR and undetectable in serum. EDA1-Ter392GlnfsX30, produced in much lower amounts than wild-type EDA1, could still bind to EDAR, and so did EDA1-Ser125Cys that was, however, undetectable in serum. The EDA splice variant c.924+7A > G resulted experimentally in a mix of wild-type EDA1 and EDA molecules truncated in the middle of the receptor-binding domain, with reduced EDA serum concentration. Thus, in vitro assays reflected the clinical phenotype in two of these difficult cases, but underestimated it in three others. Absence of circulating EDA seems to predict the full-blown phenotype of XLHED, while residual EDA levels may also be found in anhidrotic patients. 
This indicates that unborn subjects carrying variants of uncertain significance could benefit from an upcoming prenatal medical treatment even if circulating EDA levels or tests in vitro suggest residual EDA1 activity.
Retail Sales Dataset
kaggle.com
Updated Aug 22, 2023
Cite
Mohammad Talib (2023). Retail Sales Dataset [Dataset]. https://www.kaggle.com/datasets/mohammadtalib786/retail-sales-dataset/data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 22, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mohammad Talib
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Welcome to the Retail Sales and Customer Demographics Dataset! This synthetic dataset has been meticulously crafted to simulate a dynamic retail environment, providing an ideal playground for those eager to sharpen their data analysis skills through exploratory data analysis (EDA). With a focus on retail sales and customer characteristics, this dataset invites you to unravel intricate patterns, draw insights, and gain a deeper understanding of customer behavior.

Dataset Overview:

This dataset is a snapshot of a fictional retail landscape, capturing essential attributes that drive retail operations and customer interactions. It includes key details such as Transaction ID, Date, Customer ID, Gender, Age, Product Category, Quantity, Price per Unit, and Total Amount. These attributes enable a multifaceted exploration of sales trends, demographic influences, and purchasing behaviors.

Why Explore This Dataset?

Realistic Representation: Though synthetic, the dataset mirrors real-world retail scenarios, allowing you to practice analysis within a familiar context.

Diverse Insights: From demographic insights to product preferences, the dataset offers a broad spectrum of factors to investigate.

Hypothesis Generation: As you perform EDA, you'll have the chance to formulate hypotheses that can guide further analysis and experimentation.

Applied Learning: Uncover actionable insights that retailers could use to enhance their strategies and customer experiences.

Questions to Explore:

How does customer age and gender influence their purchasing behavior?

Are there discernible patterns in sales across different time periods?

Which product categories hold the highest appeal among customers?

What are the relationships between age, spending, and product preferences?

How do customers adapt their shopping habits during seasonal trends?

Are there distinct purchasing behaviors based on the number of items bought per transaction?

What insights can be gleaned from the distribution of product prices within each category?

Your EDA Journey:

Prepare to immerse yourself in a world of data-driven exploration. Through data visualization, statistical analysis, and correlation examination, you'll uncover the nuances that define retail operations and customer dynamics. EDA isn't just about numbers—it's about storytelling with data and extracting meaningful insights that can influence strategic decisions.

Embrace the Retail Sales and Customer Demographics Dataset as your canvas for discovery. As you traverse the landscape of this synthetic retail environment, you'll refine your analytical skills, pose intriguing questions, and contribute to the ever-evolving narrative of the retail industry. Happy exploring!
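
A few of the questions above can be approached with simple group-and-aggregate steps. The sketch below uses only the Python standard library and the column names listed in the dataset overview; the four sample rows, the category names, the ISO `YYYY-MM-DD` date format, and the helper names (`total_by`, `age_band`, `mean_spend_by_age_band`, `monthly_totals`) are all made up for illustration and are not taken from the dataset itself.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows; only the column names come from the dataset description.
rows = [
    {"Transaction ID": 1, "Date": "2023-01-05", "Customer ID": "C1", "Gender": "Female",
     "Age": 34, "Product Category": "Beauty", "Quantity": 2, "Price per Unit": 50, "Total Amount": 100},
    {"Transaction ID": 2, "Date": "2023-01-20", "Customer ID": "C2", "Gender": "Male",
     "Age": 26, "Product Category": "Clothing", "Quantity": 1, "Price per Unit": 500, "Total Amount": 500},
    {"Transaction ID": 3, "Date": "2023-02-11", "Customer ID": "C3", "Gender": "Female",
     "Age": 37, "Product Category": "Electronics", "Quantity": 3, "Price per Unit": 30, "Total Amount": 90},
    {"Transaction ID": 4, "Date": "2023-02-14", "Customer ID": "C4", "Gender": "Male",
     "Age": 52, "Product Category": "Clothing", "Quantity": 2, "Price per Unit": 25, "Total Amount": 50},
]


def total_by(rows, key):
    """Sum 'Total Amount' for each distinct value of the given column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["Total Amount"]
    return dict(totals)


def age_band(age, width=10):
    """Map an age to a label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def mean_spend_by_age_band(rows):
    """Average 'Total Amount' per transaction within each age band."""
    groups = defaultdict(list)
    for row in rows:
        groups[age_band(row["Age"])].append(row["Total Amount"])
    return {band: mean(values) for band, values in groups.items()}


def monthly_totals(rows):
    """Sum 'Total Amount' per calendar month, assuming ISO-formatted dates."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Date"][:7]] += row["Total Amount"]
    return dict(totals)
```

Calling `total_by(rows, "Product Category")` or `total_by(rows, "Gender")` addresses the category-appeal and demographic questions, while `monthly_totals(rows)` gives a first look at patterns across time periods.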
Data from Ectodysplasin A is increased in nonalcoholic fatty liver disease,...
figshare.com
docx
Updated Sep 3, 2020
Cite
Jackie Bayliss; Geraldine J. Ooi; William De Nardo; Yazmin Johari Halim Shah; Magdalene Montgomery; Catriona McLean; William Kemp; Stuart K Roberts; Wendy A. Brown; Paul R Burton; Matthew Watt (2020). Data from Ectodysplasin A is increased in nonalcoholic fatty liver disease, but is not associated with type 2 diabetes.docx [Dataset]. http://doi.org/10.6084/m9.figshare.12910088.v2
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.12910088.v2
Dataset updated
Sep 3, 2020
Dataset provided by
figshare
Authors
Jackie Bayliss; Geraldine J. Ooi; William De Nardo; Yazmin Johari Halim Shah; Magdalene Montgomery; Catriona McLean; William Kemp; Stuart K Roberts; Wendy A. Brown; Paul R Burton; Matthew Watt
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
CONTEXT: Ectodysplasin A (EDA) was recently identified as a liver-secreted protein that is increased in the liver and plasma of obese mice and causes skeletal muscle insulin resistance.
OBJECTIVE: To determine if liver and plasma EDA is associated with worsening non-alcoholic fatty liver disease (NAFLD) in obese patients and to evaluate plasma EDA as a biomarker for NAFLD.
DESIGN AND SETTING: Cross-sectional study in a public hospital.
PATIENTS, INTERVENTIONS AND MAIN OUTCOME MEASURES: Patients with a body mass index >30 kg/m² (n=152) underwent liver biopsy for histopathology assessment and fasting liver EDA mRNA. Fasting plasma EDA levels were also assessed. Non-alcoholic fatty liver (NAFL) was defined as >5% hepatic steatosis and nonalcoholic steatohepatitis (NASH) as NAFLD activity score ≥3.
RESULTS: Patients were divided into three groups: No NAFLD (n=45); NAFL (n=65); and NASH (n=42). Liver EDA mRNA was increased in patients with NASH compared with No NAFLD (P=0.05), but not NAFL. Plasma EDA levels were increased in NAFL and NASH compared with No NAFLD (P=0.03). Plasma EDA was related to worsening steatosis (P=0.02) and fibrosis (P=0.04), but not inflammation or hepatocellular ballooning. ROC analysis indicates that plasma EDA is not a reliable biomarker for NAFL or NASH. Plasma EDA was not increased in patients with type 2 diabetes and did not correlate with insulin resistance.
CONCLUSIONS: Plasma EDA is increased in NAFL and NASH, is related to worsening steatosis and fibrosis but is not a reliable biomarker for NASH. Circulating EDA is not associated with insulin resistance in human obesity.
Candidate Gene Analysis of Tooth Agenesis Identifies Novel Mutations in Six...
plos.figshare.com
xlsx
Updated Jun 6, 2023
Cite
Sirpa Arte; Satu Parmanen; Sinikka Pirinen; Satu Alaluusua; Pekka Nieminen (2023). Candidate Gene Analysis of Tooth Agenesis Identifies Novel Mutations in Six Genes and Suggests Significant Role for WNT and EDA Signaling and Allele Combinations [Dataset]. http://doi.org/10.1371/journal.pone.0073705
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0073705
Dataset updated
Jun 6, 2023
Dataset provided by
PLOS ONE
Authors
Sirpa Arte; Satu Parmanen; Sinikka Pirinen; Satu Alaluusua; Pekka Nieminen
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Failure to develop complete dentition, tooth agenesis, is a common developmental anomaly that manifests most often in isolation but is also associated with many developmental syndromes. It typically affects third molars or one or a few other permanent teeth, but severe agenesis is also relatively prevalent. Here we report mutational analyses of seven candidate genes in a cohort of 127 probands with non-syndromic tooth agenesis. Of these, 82 lacked more than five permanent teeth excluding third molars, a condition called oligodontia. We identified 28 mutations, 17 of which were novel. Together with our previous reports, we have identified two mutations in MSX1, AXIN2 and EDARADD, five in PAX9, four in EDA and EDAR, and nine in WNT10A. They were observed in 58 probands (44%), with a mean number of missing teeth of 11.7 (range 4 to 34). Almost all of these probands had severe agenesis. Only a few of the probands, but several relatives with heterozygous genotypes of WNT10A or EDAR, conformed to the common type of non-syndromic tooth agenesis, incisor-premolar hypodontia. Mutations in MSX1 and PAX9 affected predominantly posterior teeth, whereas both deciduous and permanent incisors were especially sensitive to mutations in EDA and EDAR. Many mutations in EDAR, EDARADD and WNT10A were present in several families. Biallelic or heterozygous genotypes of WNT10A were observed in 32 probands, and hemizygous or heterozygous genotypes of EDA, EDAR or EDARADD in 22 probands. An EDARADD variant was present in seven probands together with variants in EDAR or WNT10A, suggesting combined phenotypic effects of alleles in distinct genes.
Data_Sheet_1_Physiological synchrony in electrodermal activity predicts...
frontiersin.figshare.com
docx
Updated Jun 29, 2023
Cite
Ivo V. Stuldreher; Emma Maasland; Charelle Bottenheft; Jan B. F. van Erp; Anne-Marie Brouwer (2023). Data_Sheet_1_Physiological synchrony in electrodermal activity predicts decreased vigilant attention induced by sleep deprivation.docx [Dataset]. http://doi.org/10.3389/fnrgo.2023.1199347.s001
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fnrgo.2023.1199347.s001
Dataset updated
Jun 29, 2023
Dataset provided by
Frontiers
Authors
Ivo V. Stuldreher; Emma Maasland; Charelle Bottenheft; Jan B. F. van Erp; Anne-Marie Brouwer
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction
When multiple individuals are presented with narrative movie or audio clips, their electrodermal activity (EDA) and heart rate show significant similarities. Higher levels of such inter-subject physiological synchrony are related with higher levels of attention toward the narrative, as for instance expressed by more correctly answered questions about the narrative. We here investigate whether physiological synchrony in EDA and heart rate during watching of movie clips predicts performance on a subsequent vigilant attention task among participants exposed to a night of total sleep deprivation.
Methods
We recorded EDA and heart rate of 54 participants during a night of total sleep deprivation. Every hour from 22:00 to 07:00 participants watched a 10-min movie clip during which we computed inter-subject physiological synchrony. Afterwards, they answered questions about the movie and performed the psychomotor vigilance task (PVT) to capture attentional performance.
Results
We replicated findings that inter-subject correlations in EDA and heart rate predicted the number of correct answers on questions about the movie clips. Furthermore, we found that inter-subject correlations in EDA, but not in heart rate, predicted PVT performance. Individuals' mean EDA and heart rate also predicted their PVT performance. For EDA, inter-subject correlations explained more variance of PVT performance than individuals' mean EDA.
Discussion
Together, these findings confirm the association between physiological synchrony and attention. Physiological synchrony in EDA does not only capture the attentional processing during the time that it is determined, but also proves valuable for capturing more general changes in the attentional state of monitored individuals.
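
One common way to quantify inter-subject correlation is to correlate each participant's signal with the average signal of everyone else. The sketch below illustrates that leave-one-out idea on equally long, already aligned signals. It is an assumption made for illustration, not the procedure used in the study, which may differ (for example in windowing or pairwise averaging); the function names and the decision to return 0.0 for a constant signal are also choices of this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def inter_subject_correlation(signals):
    """Correlate each participant's signal with the mean signal of all others.

    signals: dict mapping a participant id to a list of samples; all lists
    are assumed to have the same length and to be time-aligned.
    """
    if len(signals) < 2:
        raise ValueError("need at least two participants")
    scores = {}
    for participant, own in signals.items():
        others = [s for other, s in signals.items() if other != participant]
        # Sample-by-sample average over the remaining participants.
        mean_others = [sum(values) / len(values) for values in zip(*others)]
        scores[participant] = pearson(own, mean_others)
    return scores
```

A participant whose signal rises and falls with the group gets a score near 1, while one who moves opposite to the group gets a score near -1.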
