7 datasets found
  1. Student Performance Dataset: Academic Insights 10K

    • kaggle.com
    zip
    Updated Dec 1, 2024
    Cite
    Nadeem Majeed (2024). Student Performance Dataset: Academic Insights 10K [Dataset]. https://www.kaggle.com/datasets/nadeemajeedch/students-performance-10000-clean-data-eda
    Available download formats: zip (129033 bytes)
    Dataset updated
    Dec 1, 2024
    Authors
    Nadeem Majeed
    License

    https://cdla.io/sharing-1-0/

    Description

    The dataset includes:

    Roll Number: Represents the roll number of the student.

    Gender: Useful for analyzing performance differences between male and female students.

    Race/Ethnicity: Allows analysis of academic performance trends across different racial or ethnic groups.

    Parental Level of Education: Indicates the educational background of the student's family.

    Lunch: Shows whether students receive a free or reduced lunch, which is often a socioeconomic indicator.

    Test Preparation Course: This tells whether students completed a test prep course, which could impact their performance.

    Math Score: Provides a measure of each student’s performance in math, used to calculate averages or trends across various demographics.

    Science Score: Evaluates students' science knowledge, which can be analyzed to assess the student's overall scientific knowledge.

    Reading Score: Measures performance in reading, allowing for insights into literacy and comprehension levels among students.

    Writing Score: Evaluates students' writing skills, which can be analyzed to assess overall literacy and expression.

    Total Score: Shows the total marks achieved by the student out of 400.

    Grade: Grade achieved by the student: "A" if total marks >= 320, "B" if total marks >= 250, "C" if total marks >= 200, "D" if total marks >= 150, and Fail if total marks < 150.
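
    The grade rule above can be sketched as a small Python function. This is only an illustration of the thresholds stated in the description; the function name `assign_grade` and the range check on the score are assumptions, not part of the dataset.

```python
def assign_grade(total_score: float) -> str:
    """Map a total score (out of 400) to a grade using the stated thresholds."""
    # Assumption: scores outside 0..400 are treated as invalid input.
    if not 0 <= total_score <= 400:
        raise ValueError("total score must be between 0 and 400")
    # Thresholds are checked from highest to lowest, so the first match wins.
    thresholds = [(320, "A"), (250, "B"), (200, "C"), (150, "D")]
    for minimum, grade in thresholds:
        if total_score >= minimum:
            return grade
    return "Fail"


if __name__ == "__main__":
    for score in (400, 320, 319, 250, 200, 150, 149):
        print(score, assign_grade(score))
```

    Checking the thresholds in descending order keeps the boundaries explicit: a total of exactly 320 is an "A", while 319 falls through to "B".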

  2. Students Performance EDA in R

    • kaggle.com
    zip
    Updated Sep 6, 2023
    Cite
    vikram amin (2023). Students Performance EDA in R [Dataset]. https://www.kaggle.com/datasets/vikramamin/students-performance
    Available download formats: zip (7847 bytes)
    Dataset updated
    Sep 6, 2023
    Authors
    vikram amin
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    We will be doing Exploratory Data Analysis on the Dataset.

    • Set the working directory and read the data
    • Check the summary of the data
    • Data Cleaning: No missing or duplicated values were found. The data types of 5 columns needed to be changed from character vectors to factors.
    • EDA: Renamed columns ‘race.ethnicity’ to ‘race’, ‘parental.level.of.education’ to ‘parents_edu’, ‘test.preparation.course’ to ‘test_prep’. Created new column ‘avg_score’ by taking the average score of columns ‘math.score’, ‘reading.score’, ‘writing.score’.
    • Load libraries for data visualisation: ‘dplyr’, ‘ggplot2’, ‘corrplot’, ‘tidyr’

    • Conclusion:
    • Female students (518) outnumber male students (482), out of 1,000 students in total.
    • 58% of students belong to Group C (180 females and 139 males) or Group D (129 females and 133 males); the fewest belong to Group A (53 females and 36 males, 89 in total). 22.6% of students' parents have some college education, followed closely by an associate's degree (22.2%). 5.9% of students' parents have a master's degree.
    • 35.5% of students have free or reduced lunch versus 64.5% who get standard lunch. Within this, 18.9% female students and 16.6% male students get free or reduced lunch versus 32.9% female students and 31.6% male students who get standard lunch.
    • Female students' total average score is higher than that of male students. This could also be due to the higher proportion of female students.
    • 35.8% of students had completed the test preparation course versus 64.2% who had not. Within this, 18.4% female students and 17.4% male students had completed it versus 33.4% female and 30.8% male students who had not.
    • The highest correlation is between writing score and reading score, i.e., 0.95.
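
    The original analysis was done in R. As a rough Python sketch of two of the steps described above (the derived ‘avg_score’ column and the correlation between two score columns), the following uses only the standard library. The helper names and the sample rows are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def add_avg_score(row: dict) -> dict:
    """Return a copy of the row with 'avg_score' = mean of the three subject scores."""
    subjects = ("math.score", "reading.score", "writing.score")
    avg = sum(row[s] for s in subjects) / len(subjects)
    return {**row, "avg_score": avg}


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical rows for illustration only; they are not taken from the dataset.
rows = [
    {"math.score": 72, "reading.score": 72, "writing.score": 74},
    {"math.score": 69, "reading.score": 90, "writing.score": 88},
    {"math.score": 47, "reading.score": 57, "writing.score": 44},
    {"math.score": 76, "reading.score": 78, "writing.score": 75},
]
rows = [add_avg_score(r) for r in rows]
reading = [r["reading.score"] for r in rows]
writing = [r["writing.score"] for r in rows]
print(rows[0]["avg_score"], round(pearson(reading, writing), 2))
```

    On the real data, the same correlation computed over all 1000 rows would be the figure to compare with the 0.95 reported above.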
  3. Electronic Design Automation (EDA) Market Report | Global Forecast From 2025...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Electronic Design Automation (EDA) Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/electronic-design-automation-eda-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Electronic Design Automation (EDA) Market Outlook



    The global Electronic Design Automation (EDA) market size was valued at approximately USD 11.5 billion in 2023 and is projected to reach USD 21.3 billion by 2032, growing at a compound annual growth rate (CAGR) of 7% from 2024 to 2032. This significant growth can be attributed to the increasing complexity of electronics systems and the corresponding need for advanced design solutions. EDA tools are becoming indispensable for developing complex semiconductor components and integrated circuits, particularly as industries push for miniaturization and enhanced functionality. The burgeoning demand for consumer electronics and advancements in wireless technologies further fuel this market's expansion.
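
    The growth figures quoted above can be cross-checked with the standard compound-growth formula. The sketch below is only an arithmetic illustration; the function names are hypothetical, and counting nine compounding periods from the 2023 base year to 2032 is an assumption about how the report applies its rate.

```python
def project_value(start_value: float, rate: float, years: int) -> float:
    """Compound start_value at a constant annual rate for the given number of years."""
    return start_value * (1 + rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Constant annual growth rate that takes start_value to end_value in `years` years."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the report summary (USD billions).
value_2023 = 11.5
value_2032 = 21.3
years = 2032 - 2023  # assumption: nine compounding periods from the 2023 base

print(f"Projected 2032 value at 7%: {project_value(value_2023, 0.07, years):.1f}")
print(f"Implied CAGR: {implied_cagr(value_2023, value_2032, years):.2%}")
```

    Under this nine-period assumption, compounding at 7% gives roughly USD 21.1 billion, and the rate implied by the two quoted endpoints is about 7.1%, so the quoted figures are consistent up to rounding.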



    One of the primary growth factors for the EDA market is the continuous advancement in semiconductor technologies, which necessitates sophisticated design tools. The semiconductor industry is the backbone of all electronic devices, and as devices become more integrated and advanced, the design processes also become more intricate. EDA tools provide crucial support for design verification, simulation, and testing, ensuring that complex designs meet stringent quality and performance standards. With the rise of Artificial Intelligence (AI) and Machine Learning (ML), EDA tools are being enhanced to meet these new computational requirements, offering designers more powerful capabilities to manage the increasing complexity of semiconductor designs.



    Another significant growth driver is the increasing investment in industrial automation and smart manufacturing. Across various sectors, there is a push towards automating processes and integrating smart technologies, which require sophisticated electronics and intricate design processes. EDA tools facilitate the design and development of these complex systems, making them indispensable for industries looking to innovate and optimize their operations. Furthermore, with the Internet of Things (IoT) gaining momentum, the demand for EDA tools is expected to rise as these technologies require precise design and verification to ensure efficiency and security in interconnected devices.



    Additionally, the trend towards miniaturization and the demand for high-performance, low-power electronic devices have led to the increased adoption of advanced EDA tools. Manufacturers are under pressure to deliver smaller, more efficient devices without compromising on performance. EDA tools provide the solutions necessary to design circuits that meet these demands while also allowing for rapid prototyping and reduced time-to-market. As competition in the consumer electronics market intensifies, the reliance on EDA tools for innovation and design efficiency becomes even more critical.



    EDA Tools are at the heart of the innovation driving the electronics industry forward. These tools provide the necessary framework for engineers to design, test, and validate complex electronic systems. As the demand for more sophisticated and efficient electronic devices grows, EDA tools evolve to meet these challenges by offering enhanced simulation and verification capabilities. This evolution is crucial for maintaining the pace of innovation, particularly in sectors where precision and reliability are paramount. The integration of EDA tools into the design process not only accelerates development timelines but also ensures that products meet the highest standards of quality and performance.



    Regionally, the EDA market's growth is most pronounced in the Asia Pacific region, driven by the presence of major semiconductor manufacturing hubs in countries like China, Taiwan, and South Korea. North America continues to be a significant player, with a strong base of technology companies and ongoing innovation in aerospace, defense, and consumer electronics sectors. Europe, while smaller in terms of market share, shows potential for steady growth due to its investments in automotive electronics and IoT applications. The Middle East & Africa and Latin America are emerging regions, gradually increasing their adoption of EDA tools as they advance their technological infrastructure and capabilities.



    Component Analysis



    Within the EDA market, the component segment is classified into software, hardware, and services, each playing a pivotal role in the ecosystem. The software component dominates the market due to its crucial role in enabling the design, simulation, and verification of electronic systems. EDA software solu

  4. Student performance

    • kaggle.com
    zip
    Updated Aug 30, 2020
    Cite
    Adithya Shetty (2020). Student performance [Dataset]. https://www.kaggle.com/adithyabshetty100/student-performance
    Available download formats: zip (8907 bytes)
    Dataset updated
    Aug 30, 2020
    Authors
    Adithya Shetty
    Description

    Objective

    To understand how the student's performance (test scores) is affected by the other variables (Gender, Ethnicity, Parental level of education, Lunch, Test preparation course).

    Acknowledgements

    I thank my teacher for sharing this dataset so that we could complete our EDA project.

    Inspiration

    I want to build an ML model using this dataset to predict students' marks. View my EDA at: https://www.kaggle.com/adithyabshetty100/exploratory-data-analysis-on-student-performance I would love to hear your feedback about my project in the comments!

    Thank you for reading :)

  5. Cricket Commentary Dataset

    • kaggle.com
    • huggingface.co
    zip
    Updated Dec 6, 2023
    Cite
    The Devastator (2023). Cricket Commentary Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/cricket-commentary-dataset
    Available download formats: zip (3693673 bytes)
    Dataset updated
    Dec 6, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Cricket Commentary Dataset

    Performance Validation for Cricket Commentary Model

    By Nirmalkumar Pajany (From Huggingface) [source]

    About this dataset

    The validation.csv file is specifically used for validating the accuracy and performance of a cricket commentary model. It contains data that can be used to assess how well the model performs in predicting or analyzing cricket commentary related information. This dataset can be used to fine-tune the model's parameters or evaluate its overall effectiveness.

    The train.csv file, on the other hand, contains data that is primarily utilized for training purposes. It includes a comprehensive set of cricket commentary-related information that can be used to train machine learning models or algorithms. The purpose here is to enable the model or algorithm to learn from this extensive dataset so that it can effectively analyze and generate accurate predictions about cricket commentary.

    Lastly, the test.csv file serves as a separate dataset solely for evaluating and validating the performance of trained models or algorithms. It acts as an unbiased measure of how well a model generalizes beyond its training data. By using this testing dataset, researchers and analysts are able to assess how accurately their models perform on unseen data - thereby ensuring their reliability when applied in real-world scenarios.

    How to use the dataset

    • Familiarize yourself with the files:

      • validation.csv: This file serves as a validation dataset and can be used to assess the accuracy and performance of your cricket commentary model.
      • train.csv: Use this file for training your machine learning models. It contains a comprehensive set of data related to cricket commentary.
      • test.csv: The test dataset contained in this file is ideal for evaluating and validating the performance of your models or algorithms on unseen data.
    • Understand the columns: The dataset contains multiple columns that provide valuable information about each entry. Some essential columns may include:

      • Commentator: The name or identifier of the commentator providing the ball-by-ball commentary.
      • Commentary: This column consists of textual descriptions that capture various aspects such as ball delivery, player actions, match events, etc. (Additional columns may be present depending on how comprehensive the dataset is)
    • Exploratory Data Analysis (EDA): Before creating any model or algorithm, it's recommended to perform an EDA on the training and validation datasets separately. This step involves looking for patterns in the text data using techniques such as word frequency analysis, sentiment analysis, and topic modeling (e.g., Latent Dirichlet Allocation).

    • Model Training: Once you have gained insights from EDA and pre-processed your text data (removing stopwords and punctuation, lemmatizing, tokenizing, etc.), start building your machine learning models using train.csv as your base dataset.

    • Model Evaluation: After training your model(s) using train.csv, use test.csv for evaluating how well they perform on previously unseen data.

    • Validation: Validate the final performance of your model(s) using validation.csv. This will help you assess the accuracy and compare the performance of different models or algorithms on cricket commentary analysis.
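
    As a minimal sketch of the word frequency step mentioned in the EDA bullet above, the following counts tokens in one text column of a CSV file using only the Python standard library. The column name `Commentary` follows the description but is an assumption about the actual file header; the stopword list and the sample rows are hypothetical.

```python
import csv
import io
import re
from collections import Counter

# A tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "for", "is", "in", "on"}


def tokenize(text: str) -> list:
    """Lowercase the text, keep runs of letters/apostrophes, and drop stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def word_frequencies(csv_file, column: str = "Commentary") -> Counter:
    """Count token frequencies over one text column of an open CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Missing or empty cells are treated as empty text.
        counts.update(tokenize(row.get(column) or ""))
    return counts


# Hypothetical sample standing in for train.csv; not taken from the dataset.
sample = io.StringIO(
    "Commentator,Commentary\n"
    'A,"Full and straight, driven to the boundary for four"\n'
    'B,"Short ball, pulled away for four"\n'
)
print(word_frequencies(sample).most_common(3))
```

    For the real files, the same function could be called with `open("train.csv", newline="", encoding="utf-8")` in place of the in-memory sample, assuming the file uses that encoding.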

    Research Ideas

    • Predicting player performance: This dataset can be used to train models that can analyze cricket commentary and predict the performance of players in future matches. By analyzing the commentary and understanding the specific mentions and descriptions related to players, a model can learn patterns and correlations that help in making predictions.
    • Analyzing match dynamics: The dataset can be used to analyze the dynamics of a cricket match. By considering various factors mentioned in the commentary such as scores, wickets, runs required, etc., insightful analysis can be performed on how different teams approach different situations or how certain events impact the match outcome.
    • Evaluating commentators' effectiveness: This dataset can also be used for evaluating the effectiveness of cricket commentators by analyzing their style of commentary, use of language, ability to convey information accurately and engagingly, etc. This analysis could help broadcasters or sports organizations identify effective commentators who resonate well with their audience and enhance viewer experience during live matches

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. [Data Source](https://huggingface.co/datase...

  6. Real Madrid UEFA Champions League Perform Analysis

    • kaggle.com
    zip
    Updated Aug 26, 2023
    Cite
    Joaco Romero Flores (2023). Real Madrid UEFA Champions League Perform Analysis [Dataset]. https://www.kaggle.com/datasets/joaquinaromerof/real-madrid-analysis
    Available download formats: zip (32668239 bytes)
    Dataset updated
    Aug 26, 2023
    Authors
    Joaco Romero Flores
    License

    https://cdla.io/permissive-1-0/

    Description

    Introduction

    In the high-stakes world of professional football, public opinion often forms around emotions, loyalties, and subjective interpretations. The project at hand aims to transcend these biases by delving into a robust, data-driven analysis of Real Madrid's performance in the UEFA Champions League over the past decade.

    Through a blend of traditional statistical methods, machine learning models, game theory, psychology, philosophy, and even military strategies, this investigation presents a multifaceted view of what contributes to a football team's success and how performance can be objectively evaluated.

    Exploratory Data Analysis (EDA)

    The EDA consists of two layers:

    1. Statistical Analysis:

    • Set-Up Process: Loading libraries, data frames, determining position relevancy, and calculating average minutes played.
    • Kurtosis: Understanding data variance and its internal behavior.
    • Feature Engineering: Preprocessing with standard scaler for later ML applications.
    • Sample Statistics, Distribution, and Standard Errors: Essential for inference.
    • Central Limit Theorem: A focus for understanding by experienced data scientists.
    • A/B Testing & ANOVA: Used for null hypothesis testing.
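
    Two of the steps listed above, standard scaling and kurtosis, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the project's own code: the function names and the sample values are hypothetical, and kurtosis is computed here as population excess kurtosis (the mean fourth power of the z-scores minus 3), which describes how heavy the tails of a distribution are.

```python
from math import sqrt


def standardize(values: list) -> list:
    """Rescale values to zero mean and unit variance (population standard deviation)."""
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mean) / std for v in values]


def excess_kurtosis(values: list) -> float:
    """Population excess kurtosis: mean fourth power of the z-scores, minus 3."""
    z = standardize(values)
    return sum(v ** 4 for v in z) / len(z) - 3.0


# Hypothetical minutes-played figures, for illustration only.
minutes = [90, 85, 78, 90, 12, 64, 90, 45]
print([round(v, 2) for v in standardize(minutes)])
print(round(excess_kurtosis(minutes), 2))
```

    Dividing by the population standard deviation matches the default behaviour of scikit-learn's `StandardScaler`, which the list above appears to refer to.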

    2. Machine Learning Models:

    • Ordinary Least Squares: To estimate the unknown parameters.
    • Linear Regression Models with scikit-learn: Predicting the dependent variable.
    • XGBoost & Cross-Validation: A powerful algorithm for making predictions.
    • Conformal Prediction: To create valid prediction regions.
    • Radar Maps: For visualizing player performance during their match campaigns.
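
    To make the Ordinary Least Squares item above concrete, here is a minimal closed-form fit for a single predictor in plain Python. It is a sketch only: the function name and the sample numbers are hypothetical, and the project itself may use several predictors and a library implementation.

```python
def fit_ols(xs: list, ys: list) -> tuple:
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: minutes played versus some performance metric.
intercept, slope = fit_ols([10, 30, 60, 90], [0.5, 1.4, 3.1, 4.4])
print(round(intercept, 3), round(slope, 4))
```

    The slope is the covariance of x and y divided by the variance of x, and the intercept makes the fitted line pass through the point of means.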

    Objectives

    The goal of this analysis is multifaceted:

    1. Unveil Hidden Statistics: To reveal the underlying patterns often overlooked in casual discussions.
    2. Demonstrate the Impact of Probability: How it shapes matches and seasons.
    3. Explore Interdisciplinary Influences: Including Game Theory, Strategy, Cooperation, Psychology, Physiology, Military Training, Luck, Economics, Philosophy, and even Freudian Analysis.
    4. Challenge Subjective Bias: By presenting a well-rounded, evidence-based view of football performance.

    Conclusion

    This project stands as a testament to the profound complexity of football performance and the nuanced insights that can be derived through rigorous scientific analysis. Whether you are a data science recruiter, a football fanatic, or a curious mind, the findings herein offer a unique perspective that bridges the gap between passion and empiricism.

  7. Engine Ratng Prediction

    • kaggle.com
    zip
    Updated Feb 28, 2023
    Cite
    Ved Prakash (2023). Engine Ratng Prediction [Dataset]. https://www.kaggle.com/datasets/ved1104/engine-ratng-prediction
    Available download formats: zip (3540393 bytes)
    Dataset updated
    Feb 28, 2023
    Authors
    Ved Prakash
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Your task is to write a small Python or R script that predicts the engine rating based on the inspection parameters using only the provided dataset. You need to find all the cases/outliers where the rating has been given incorrectly as compared to the current condition of the engine.

    This task is designed to test your Python or R ability, your knowledge of data science techniques, your ability to find trends and outliers and to assess the relative importance of variables for deviations in the target variable, and your ability to work effectively, efficiently, and independently within a commercial setting.

    This task is also designed to test your hyper-tuning abilities or lateral thinking.

    Deliverables:
    • One Python or R script
    • One requirements text file including an exhaustive list of packages and version numbers used in your solution
    • Summary of your insights
    • List of cases that are outliers or incorrectly rated as high or low, backed with analysis/reasons
    • Model object files for reproducibility

    Your solution should at a minimum do the following:
    • Load the data into memory
    • Prepare the data for modeling
    • EDA of the variables
    • Build a model on training data
    • Test the model on testing data
    • Provide some measure of performance
    • Outlier analysis and detection
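
    One possible way to approach the outlier step, sketched here under stated assumptions, is to compare each given rating with a model's prediction and flag cases whose residual lies far from the typical residual. The function name, the threshold of 2 standard deviations, and the sample numbers are all hypothetical; the task description does not prescribe a method.

```python
from statistics import mean, pstdev


def flag_rating_outliers(actual: list, predicted: list, threshold: float = 2.0) -> list:
    """Return indices whose residual (actual - predicted) lies more than
    `threshold` population standard deviations from the mean residual."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    residuals = [a - p for a, p in zip(actual, predicted)]
    center = mean(residuals)
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every residual is identical, so nothing stands out
    return [i for i, r in enumerate(residuals) if abs(r - center) > threshold * spread]


# Hypothetical ratings and model predictions, for illustration only.
given = [3.0, 3.5, 4.0, 4.5, 1.0, 4.0, 3.5, 3.0]
model = [3.1, 3.4, 4.1, 4.4, 4.2, 3.9, 3.6, 3.1]
print(flag_rating_outliers(given, model))
```

    In this made-up example only the fifth case is flagged, because its given rating is far below what the model predicts. Each flagged case would still need to be backed by analysis of the inspection parameters, as the deliverables require.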

