100+ datasets found
  1. student data analysis

    • kaggle.com
    Updated Nov 17, 2023
    Cite
    maira javeed (2023). student data analysis [Dataset]. https://www.kaggle.com/datasets/mairajaveed/student-data-analysis
    Dataset updated
    Nov 17, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    maira javeed
    Description

    In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.

    Key Objectives:

    1. Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.

    2. Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.

    3. Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.

    Dataset Details:

    • The dataset used in this analysis contains information about students, including their age, gender, parental education, lunch type, and test scores in subjects like mathematics, reading, and writing.

    Analysis Highlights:

    • We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.

    • By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.

    Why This Matters:

    Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.

    Acknowledgments:

    We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.

    Please Note:

    This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.

  2. Stock Prices Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 2, 2024
    + more versions
    Cite
    Bright Data (2024). Stock Prices Dataset [Dataset]. https://brightdata.com/products/datasets/financial/stock-price
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Dec 2, 2024
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Use our Stock prices dataset to access comprehensive financial and corporate data, including company profiles, stock prices, market capitalization, revenue, and key performance metrics. This dataset is tailored for financial analysts, investors, and researchers to analyze market trends and evaluate company performance.

    Popular use cases include investment research, competitor benchmarking, and trend forecasting. Leverage this dataset to make informed financial decisions, identify growth opportunities, and gain a deeper understanding of the business landscape. The dataset includes all major data points: company name, company ID, summary, stock ticker, earnings date, closing price, previous close, opening price, and much more.

  3. Streamflow-gain- and streamflow-loss data for streamgages in the Central...

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Oct 5, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Streamflow-gain- and streamflow-loss data for streamgages in the Central Valley Hydrologic Model [Dataset]. https://catalog.data.gov/dataset/streamflow-gain-and-streamflow-loss-data-for-streamgages-in-the-central-valley-hydrologic-
    Dataset updated
    Oct 5, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Central Valley
    Description

    This digital dataset contains 61 sets of annual streamflow gains and losses between 1961 and 1977 along Central Valley surface-water network for the Central Valley Hydrologic Model (CVHM). The Central Valley encompasses an approximate 50,000 square-kilometer region of California. The complex hydrologic system of the Central Valley is simulated using the USGS's numerical modeling code MODFLOW-FMP (Schmid and others, 2006). This simulation is referred to here as the CVHM (Faunt, 2009). Utilizing MODFLOW-FMP, the CVHM simulates groundwater and surface-water flow, irrigated agriculture, land subsidence, and other key processes in the Central Valley on a monthly basis from 1961-2003. The total active modeled area is 20,334 square-miles. The CVHM includes complex surface-water management processes. The hydrology of the present-day Central Valley and the CVHM model are driven by surface-water deliveries and associated groundwater pumpage. The Streamflow Routing Package (SFR1) is linked to MODFLOW-FMP to facilitate the simulated conveyance of surface-water deliveries. If surface-water deliveries do not meet the farm-delivery requirement, the FMP invokes simulated groundwater pumping to meet the demand. The surface-water network represents a subset of the entire stream network in the valley. Quantitative observations of streamflow gains and losses were available for 57 reaches of 20 major stream systems in the Central Valley for water years 1961-77 (Mullen and Nady, 1985). These observations were included in parameter estimation process and in the model-fit statistics. The CVHM is the most recent regional-scale model of the Central Valley developed by the U.S. Geological Survey (USGS). The CVHM was developed as part of the USGS Groundwater Resources Program (see "Foreword", Chapter A, page iii, for details).

  4. Dataset: Adult Age Differences in Remembering Gain- and Loss-Related...

    • data.niaid.nih.gov
    Updated Jun 11, 2021
    Cite
    Freund, Alexandra (2021). Dataset: Adult Age Differences in Remembering Gain- and Loss-Related Intentions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4923320
    Dataset updated
    Jun 11, 2021
    Dataset provided by
    Horn, Sebastian
    Freund, Alexandra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw data used in the analyses of the Registered Report: Horn, S. & Freund, A. Adult Age Differences in Remembering Gain- and Loss-Related Intentions. Cognition and Emotion.

  5. LinkedIn Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Mar 27, 2025
    Cite
    Bright Data (2025). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Mar 27, 2025
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions.

    Dataset Features

    • Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month.
    • Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records.
    • Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and the job market dynamics.

    Customizable Subsets for Specific Needs

    Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications.

    Popular Use Cases

    • Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data.
    • Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities.
    • Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies.
    • Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis.
    • AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

    Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.

  6. Netflix movies and tv shows dataset

    • crawlfeeds.com
    csv, zip
    Updated Jul 1, 2025
    Cite
    Crawl Feeds (2025). Netflix movies and tv shows dataset [Dataset]. https://crawlfeeds.com/datasets/netflix-movies-and-tv-shows-dataset
    Available download formats: zip, csv
    Dataset updated
    Jul 1, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Dive into the Netflix Movies and TV Shows Dataset, a detailed collection of web-scraped data featuring popular streaming titles. Discover trending movies, binge-worthy TV series, genres, ratings, release years, and audience preferences. Gain insights into Netflix originals, global streaming trends, and viewer favorites to inform market analysis and entertainment research.

    Perfect for exploring content diversity, production trends, and streaming platform dynamics.

  7. NBA WNBA play-by-play and shots data

    • kaggle.com
    zip
    Updated Jun 26, 2025
    Cite
    Vladislav Shufinskiy (2025). NBA WNBA play-by-play and shots data [Dataset]. https://www.kaggle.com/datasets/brains14482/nba-playbyplay-and-shotdetails-data-19962021
    Available download formats: zip (1683596108 bytes)
    Dataset updated
    Jun 26, 2025
    Authors
    Vladislav Shufinskiy
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The NBA and WNBA dataset is a large-scale play-by-play and shot-detail dataset covering both NBA and WNBA games, collected from multiple public sources (e.g., official league APIs and stats sites). It provides every in-game event, from period starts, jump balls, fouls, turnovers, rebounds, and field-goal attempts through free throws, along with detailed shot metadata (shot location, distance, result, assisting player, etc.).

    You can also download the dataset from GitHub or Google Drive.

    Tutorials

    1. NBA play-by-play dataset R example

    I will be grateful for ratings and stars on GitHub, but the best thanks is using the dataset in your projects.

    Useful links:

    Motivation

    I made this dataset because I want to simplify and speed up work with play-by-play data so that researchers spend their time studying data, not collecting it. Due to the limits on requests on the NBA and WNBA website, and also because you can get play-by-play of only one game per request, collecting this data is a very long process.

    Using this dataset, you can reduce the time to get information about one season from a few hours to a couple of seconds and spend more time analyzing data or building models.

    I also added play-by-play information from other sources: pbpstats.com, data.nba.com, cdnnba.com. This data will enrich information about the progress of each game and hopefully add opportunities to do interesting things.

    Contact Me

    If you have any questions or suggestions about the dataset, you can write to me in a convenient channel for you:

  8. TESLA Inc Last 5 Years Stock Historical Data

    • kaggle.com
    Updated Jun 22, 2023
    Cite
    Jims Chacko (2023). TESLA Inc Last 5 Years Stock Historical Data [Dataset]. https://www.kaggle.com/jimschacko/tesla-inc-last-5-years-stock-historical-data/discussion
    Dataset updated
    Jun 22, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jims Chacko
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Explore the fascinating journey of Tesla's stock performance over the past 5 years and gain valuable insights into its growth, trends, and market behavior. Tesla, the renowned electric vehicle manufacturer, has captured the world's attention with its groundbreaking innovations and exponential growth. In this blog post, we will dive into Tesla's stock performance over the past five years, unraveling key trends and providing valuable insights for investors and enthusiasts alike.

    The dataset holds Tesla stock prices from the last 5 years.

    Date: The first column represents the date.

    Open: Tesla stock opening price for the given date.

    High: The highest price Tesla stock reached on the given date.

    Low: The lowest price Tesla stock reached on the given date.

    Adj Close: Adjusted closing price of Tesla stock after taking dividends, stock splits, and new stock offerings into account.

    Volume: Amount of Tesla stock that changed hands over the course of the trading day.

    Source: https://finance.yahoo.com
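
    As a rough illustration of how the columns listed above might be used, the sketch below computes simple day-over-day returns from the adjusted close. The sample rows are invented, and the exact header names (`Date`, `Adj Close`, and so on) are an assumption about the real CSV, not something the listing confirms.

```python
import csv
import io

# Hypothetical sample rows. The header names follow the column description,
# but the exact spelling in the real file is an assumption.
SAMPLE_CSV = """Date,Open,High,Low,Adj Close,Volume
2023-06-19,100.0,105.0,99.0,102.0,1000
2023-06-20,102.0,110.0,101.0,107.1,1500
2023-06-21,107.0,108.0,100.0,101.745,1200
"""


def daily_returns(csv_text, price_column="Adj Close"):
    """Return (date, simple return) pairs computed from consecutive rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    results = []
    # Pair each row with the one before it; the first row has no return.
    for prev, curr in zip(rows, rows[1:]):
        prev_price = float(prev[price_column])
        curr_price = float(curr[price_column])
        results.append((curr["Date"], curr_price / prev_price - 1.0))
    return results


returns = daily_returns(SAMPLE_CSV)
for date, value in returns:
    print(f"{date}: {value:+.2%}")
```

    The adjusted close is used here because it accounts for splits and dividends, which keeps returns comparable across the five-year window.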

  9. Data for: A global-scale dataset of direct natural groundwater recharge...

    • opendata.eawag.ch
    Updated Jun 9, 2020
    + more versions
    Cite
    (2020). Data for: A global-scale dataset of direct natural groundwater recharge rates: A review of variables, processes and relationships - Package - ERIC [Dataset]. https://opendata.eawag.ch/dataset/globalscale_groundwater_moeck
    Dataset updated
    Jun 9, 2020
    Description

    Groundwater recharge indicates the existence of renewable groundwater resources and is therefore an important component in sustainability studies. However, recharge is also one of the least understood, largely because it varies in space and time and is difficult to measure directly. For most studies, only a relatively small number of measurements is available, which hampers a comprehensive understanding of processes driving recharge and the validation of hydrogeological model formulations for small- and large-scale applications. We present a new global recharge dataset encompassing >5000 locations. In order to gain insights into recharge processes, we provide a systematic analysis between the dataset and other global-scale datasets, such as climatic or soil-related parameters. Precipitation rates and seasonality in temperature and precipitation were identified as the most important variables in predicting recharge. The high dependency of recharge on climate indicates its sensitivity to climate change. We also show that vegetation and soil structure have an explanatory power for recharge. Since these conditions can be highly variable, recharge estimates based only on climatic parameters may be misleading. The freely available dataset offers diverse possibilities to study recharge processes from a variety of perspectives. By noting the existing gaps in understanding, we hope to encourage the community to initiate new research into recharge processes and subsequently make recharge data available to improve recharge predictions.

  10. Data_WP5_3_Monetary valuation of impacts and cost-benefit analysis

    • data.niaid.nih.gov
    • data.europa.eu
    Updated Mar 23, 2021
    Cite
    Tim Taylor (2021). Data_WP5_3_Monetary valuation of impacts and cost-benefit analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4605801
    Dataset updated
    Mar 23, 2021
    Dataset authored and provided by
    Tim Taylor
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes the results of the cost-effectiveness and cost-benefit analysis performed for each of the policy/measure options defined in Task 5.1. As such it complements the dataset “Data_WP5_2_Health effects at the community level data set” in providing the assessment of the economic dimension for the same policies/measures. Cost-effectiveness analysis examined the costs of these options and calculated, for example, the cost per ton of CO2eq. For the cost-benefit analysis the dataset includes all benefits, damages and costs, and the non-monetary (intangible) items which were transformed into monetary values (where possible), including social costs, monetized health impacts, monetized contributions to climate change, utility and gain losses. The full dataset is organized in three different files according to the sector addressed by the policy/measure options analyzed: the file named “CBA active transport” includes the full results for the active transport policies; the file named “CBA alternative fuel vehicles” includes the results for all the alternative fuel vehicles policies; and the file “CBA energy efficiency” includes the results for the energy efficiency policies. Each file includes multiple worksheets: one with a summary of all the CBA results for the policy sector addressed, and others with the detailed results for each specific policy up to the year 2040. The data are available in MS Excel xls(x) format to ensure full interoperability, allowing easy parsing and information exchange.
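
    The cost-per-ton figure mentioned above is a plain ratio, which the following minimal sketch makes concrete. The policy names and all numbers are invented for illustration; they are not taken from the dataset, and the project's actual calculation may include discounting or other adjustments not shown here.

```python
def cost_per_ton(total_cost, tons_co2eq_avoided):
    """Cost-effectiveness ratio: total cost divided by tons of CO2eq avoided."""
    if tons_co2eq_avoided <= 0:
        raise ValueError("tons_co2eq_avoided must be positive")
    return total_cost / tons_co2eq_avoided


# Hypothetical policy options: (total cost, tons of CO2eq avoided).
options = {
    "cycle lanes": (1_200_000.0, 8_000.0),
    "building retrofit": (3_000_000.0, 12_000.0),
}

# Rank the options from most to least cost-effective (lowest cost per ton first).
ranking = sorted(options, key=lambda name: cost_per_ton(*options[name]))
for name in ranking:
    print(name, cost_per_ton(*options[name]))
```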

  11. Data from: Dataset to accompany genomics combined with UAS data enhances...

    • rex.libraries.wsu.edu
    csv, gz
    Updated Dec 14, 2022
    Cite
    Osval A. Montesinos-López; Andrew W. Herr; Jose Crossa; Arron H. Carter (2022). Dataset to accompany genomics combined with UAS data enhances prediction of grain yield in winter wheat [Dataset]. https://rex.libraries.wsu.edu/esploro/outputs/dataset/Dataset-to-accompany-genomics-combined-with/99900914641301842
    Available download formats: gz (67338119 bytes), csv (3968871 bytes)
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Washington State University
    Authors
    Osval A. Montesinos-López; Andrew W. Herr; Jose Crossa; Arron H. Carter
    Time period covered
    2022
    Description

    With the human population continuing to increase worldwide, there is pressure to employ novel technologies to increase genetic gain in plant breeding programs that contribute to nutrition and food security. Genomic selection (GS) has the potential to increase genetic gain because it can accelerate the breeding cycle, increase the accuracy of estimated breeding values, and improve selection accuracy. However, with recent advances in high throughput phenotyping in plant breeding programs, the opportunity to integrate genomic and phenotypic data to increase prediction accuracy is present. In this paper, we applied GS to winter wheat data integrating two types of inputs: genomic and phenotypic. We observed the best prediction performance when combining both genomic and phenotypic inputs, while only using genomic information fared poorly. Interestingly, using only phenotypic information was slightly worse in some cases than the combination of both sources, whereas in other cases, using only phenotypic information provided the best prediction performance. Our results are encouraging because it is clear we can enhance the prediction accuracy of GS by integrating more related inputs in the models. Included here are: A .csv file with field trait and drone data from 2018 through 2022 used in model analysis. A .vcf file with genotype by sequencing (gbs) data of all tested wheat lines between 2015 and 2022. This data was also used in model analysis.

  12. College Placement Predictor Dataset

    • kaggle.com
    Updated Dec 28, 2023
    Cite
    SameerProgrammer (2023). College Placement Predictor Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/7298157
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SameerProgrammer
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    1. About the Dataset:

    Description: Dive into the world of college placements with this dataset designed to unravel the factors influencing student placement outcomes. The dataset comprises crucial parameters such as IQ scores, CGPA (Cumulative Grade Point Average), and placement status. Aspiring data scientists, researchers, and enthusiasts can leverage this dataset to uncover patterns and insights that contribute to a deeper understanding of successful college placements.

    2. Projects Ideas:

    Project Idea 1: Predictive Modeling for College Placements. Utilize machine learning algorithms to build a predictive model that forecasts a student's likelihood of placement based on their IQ scores and CGPA. Evaluate and compare the effectiveness of different algorithms to enhance prediction accuracy.

    Project Idea 2: Feature Importance Analysis. Conduct a feature importance analysis to identify the key factors that significantly influence placement outcomes. Gain insights into whether IQ, CGPA, or a combination of both plays a more dominant role in determining success.

    Project Idea 3: Clustering Analysis of Placement Trends. Apply clustering techniques to group students based on their placement outcomes. Explore whether distinct clusters emerge, shedding light on common characteristics or trends among students who secure placements.

    Project Idea 4: Correlation Analysis with External Factors. Investigate the correlation between the provided data (IQ, CGPA, placement) and external factors such as internship experience, extracurricular activities, or industry demand. Assess how these external factors may complement or influence placement success.

    Project Idea 5: Visualization of Placement Dynamics Over Time. Create dynamic visualizations to illustrate how placement trends evolve over time. Analyze trends, patterns, and fluctuations in placement rates to identify potential cyclical or seasonal influences on student placements.

    3. Columns Explanation:

    • IQ:

      • Definition: Intelligence Quotient, a measure of a person's intellectual abilities.
      • Data Type: Numeric
      • Range: Typically, IQ scores range from 70 to 130, with 100 being the average.
    • CGPA:

      • Definition: Cumulative Grade Point Average, a measure of a student's overall academic performance.
      • Data Type: Numeric
      • Range: Typically, CGPA is on a scale of 0 to 4, with 4 being the highest possible score.
    • Placement:

      • Definition: Binary variable indicating whether a student secured a placement (1) or not (0).
      • Data Type: Categorical (Binary)
      • Values: 1 (Placement secured) or 0 (No placement).

    These columns collectively provide a comprehensive snapshot of a student's intellectual abilities, academic performance, and their success in securing a placement. Analyzing this dataset can offer valuable insights into the dynamics of college placements and inform strategies for optimizing student outcomes.
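
    A first look at how the three columns relate could be as simple as the sketch below, which splits students at the median of one column and compares placement rates in the two halves. The records are invented for illustration, and the dictionary keys assume the column names are exactly `IQ`, `CGPA`, and `Placement`. This is a quick exploratory check, not a substitute for the predictive models suggested in the project ideas.

```python
from statistics import median

# Hypothetical records using the three columns described above.
students = [
    {"IQ": 95, "CGPA": 2.4, "Placement": 0},
    {"IQ": 110, "CGPA": 3.6, "Placement": 1},
    {"IQ": 102, "CGPA": 3.1, "Placement": 1},
    {"IQ": 88, "CGPA": 2.0, "Placement": 0},
    {"IQ": 120, "CGPA": 3.8, "Placement": 1},
    {"IQ": 99, "CGPA": 2.9, "Placement": 0},
]


def placement_rate_by_split(records, column):
    """Split records at the median of `column`; return the placement rate in each half."""
    cutoff = median(r[column] for r in records)
    low = [r["Placement"] for r in records if r[column] <= cutoff]
    high = [r["Placement"] for r in records if r[column] > cutoff]

    def rate(values):
        # Share of students with Placement == 1; NaN if the group is empty.
        return sum(values) / len(values) if values else float("nan")

    return {"cutoff": cutoff, "low": rate(low), "high": rate(high)}


cgpa_split = placement_rate_by_split(students, "CGPA")
iq_split = placement_rate_by_split(students, "IQ")
print("CGPA:", cgpa_split)
print("IQ:", iq_split)
```

    A large gap between the two halves for one column suggests that column is worth a closer look with a proper model.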

  13. Fantasy Premier League Player Data (2016-2024)

    • kaggle.com
    Updated May 14, 2024
    Cite
    Reeve Barreto (2024). Fantasy Premier League Player Data (2016-2024) [Dataset]. https://www.kaggle.com/datasets/reevebarreto/fantasy-premier-league-player-data-2016-2024
    Dataset updated
    May 14, 2024
    Dataset provided by
    Kaggle
    Authors
    Reeve Barreto
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides an archive of Fantasy Premier League (FPL) player performance data for eight seasons, spanning from 2016-2024.

    The data was originally collected from https://github.com/vaastav/Fantasy-Premier-League, a public repository for FPL data.

    The dataset has been meticulously cleaned and processed to ensure accuracy and consistency. This may include handling missing values, correcting inconsistencies, and standardizing formats.

    The dataset includes a wide range of player statistics captured on a gameweek-by-gameweek basis. This allows you to analyze trends, identify patterns, and gain valuable insights into player performance.

    This dataset can be a powerful tool for FPL enthusiasts and data scientists alike. Here are some potential applications:

    • Trend Analysis: Identify historical trends in player performance across different seasons and positions.
    • Predictive Modeling: Develop machine learning models to predict player points, performance, and transfers.
    • Informed Team Selection: Make data-driven decisions to optimize your FPL team for each gameweek.
    • Comparative Analysis: Compare player statistics across seasons and positions to uncover hidden gems and potential breakout stars.

    Using this dataset, you can gain a deeper understanding of FPL player performance and enhance your decision-making for the upcoming season.

  14. Multimodal SKEP dataset for attention regulation behaviors, knowledge gain,...

    • data.4tu.nl
    zip
    Updated Apr 21, 2023
    Cite
    Yoon Lee; Marcus Specht (2023). Multimodal SKEP dataset for attention regulation behaviors, knowledge gain, perceived learning experience, and perceived social presence in e-learning with a conversational agent [Dataset]. http://doi.org/10.4121/4c9de645-ca88-4b45-8fc7-2fc325f191dc.v1
    Available download formats: zip
    Dataset updated
    Apr 21, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Yoon Lee; Marcus Specht
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Reading on digital devices has become more commonplace, often challenging learners' attention. In this study, we hypothesized that allowing learners to reflect on their reading phases with an empathic social robot companion might enhance learners' attention in e-reading. To verify our assumption, we collected a novel SKEP dataset in an e-reading setting with social robot support.


    We designed two interfaces: 1) a GUI-based system with a monitor, mouse, and eye tracker implemented, and 2) an HRI-based system, which has a monitor, mouse, eye tracker, and Furhat Robot as physical components. See the footnote to check the specification of the Pupil Core eye tracker and Logitech C505 HD Webcam that was implemented. For both conditions, an informative e-reading material with technicality, "Waste management and critical raw materials," has been provided through a screen-based reader, which we explicitly developed for this study. The content has been chosen, aiming for an equal baseline knowledge for general readers. The text contains 4,750 words, divided into 29 pages covering seven subtopics. The text has been implemented with 47pt on a 27-inch monitor, having 2560*1440 resolution. The setting was optimized for the eye tracker implementation, which requires a bigger font size than the usual PDF readers for high-resolution data collection.


    We implemented four measurements that are direct and indirect attentional cues. Data features and granularity vary based on the data collection methods, collection timing, and data post-processing. Learners' self-regulatory behavior was collected through a video feed and annotated second-by-second by human labelers post hoc. Labels are observable behavioral cues that indicate learners' attentional shifts. Movements of the 1) eyebrow, 2) blink, 3) mumble, 4) hands, and 5) body work as good predictors of learners' self-awareness of attention loss; we annotated 60 video samples by applying six labels, including 6) a neutral state as opposed to the five attention regulation behavior labels. Additionally, we examined multimodal cues that are direct and indirect clues of attention: knowledge gain, perceived learning experience, and perceived social presence with interfaces (see readme.txt for descriptions of indicators).

  15. Global Salt Marsh Change, 2000-2019 - Dataset - NASA Open Data Portal

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • data.nasa.gov
    Updated Mar 20, 2025
    + more versions
    Cite
    nasa.gov (2025). Global Salt Marsh Change, 2000-2019 - Dataset - NASA Open Data Portal [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/global-salt-marsh-change-2000-2019-bc1eb
    Explore at:
    Dataset updated
    Mar 20, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This dataset provides global salt marsh change, including loss and gain for five-year periods from 2000-2019. Loss and gain at a 30 m spatial resolution were estimated with Normalized Difference Vegetation Index (NDVI) anomaly algorithm using Landsat 5, 7, and 8 collections within the known extent of salt marshes. The data are provided in cloud-optimized GeoTIFF format.
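
    For reference, NDVI is the standard normalized ratio of near-infrared and red reflectance, (NIR - Red) / (NIR + Red). The sketch below computes it and a simple anomaly against a baseline mean. The description above does not specify the anomaly algorithm, so the baseline-difference form, the function names, and the reflectance values are illustrative assumptions, not the dataset's actual method.

```python
from statistics import mean

def ndvi(nir: float, red: float) -> float:
    """Standard NDVI: (NIR - Red) / (NIR + Red); returns 0.0 when both bands are 0."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def ndvi_anomaly(current: float, baseline: list[float]) -> float:
    """Assumed anomaly definition: current NDVI minus the mean of baseline NDVI values."""
    return current - mean(baseline)

# Hypothetical surface reflectance values for one 30 m pixel (not from the dataset).
baseline_ndvi = [ndvi(0.40, 0.10), ndvi(0.42, 0.10), ndvi(0.38, 0.10)]
current_ndvi = ndvi(0.20, 0.15)
anomaly = ndvi_anomaly(current_ndvi, baseline_ndvi)
print(round(current_ndvi, 3), round(anomaly, 3))
```

    Under this assumed definition, a strongly negative anomaly would suggest vegetation loss relative to the baseline period and a positive one would suggest gain.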

  16. Fruits Dataset for Classification

    • data.mendeley.com
    Updated Feb 11, 2025
    + more versions
    Cite
    GTS GTS (2025). Fruits Dataset for Classification [Dataset]. http://doi.org/10.17632/rg254yr63x.1
    Explore at:
    Dataset updated
    Feb 11, 2025
    Authors
    GTS GTS
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    About Dataset (strawberries, peaches, pomegranates). Photo requirements: 1) white background, 2) .jpg format, 3) image size 300*300. The number of photos required is 250 of each fruit when it is fresh and 250 of each fruit when it is rotten, for a total of 1,500 images.

    Diverse Collection With a diverse collection of product images, the files provide an excellent foundation for developing and testing machine learning models designed for image recognition and allocation. Each image is captured under different lighting conditions and backgrounds, offering a realistic challenge for algorithms to overcome.

    Real-World Applications The variability in the dataset ensures that models trained on it can generalize well to real-world scenarios, making them robust and reliable. The dataset includes common fruits such as apples, bananas, oranges, and strawberries, among others, allowing for comprehensive training and evaluation.

    Industry Use Cases One of the significant advantages of using the Fruits Dataset for Classification is its applicability in various fields such as agriculture, retail, and the food industry. In agriculture, it can help automate the process of fruit sorting and grading, enhancing efficiency and reducing labor costs. In retail, it can be used to develop automated checkout systems that accurately identify fruits, streamlining the purchasing process.

    Educational Value The dataset is also valuable for educational purposes, providing students and educators with a practical tool to learn and teach machine learning concepts. By working with this dataset, learners can gain hands-on experience in data preprocessing, model training, and evaluation.

    Conclusion The Fruits Dataset for Classification is a versatile and indispensable resource for advancing the field of image classification. Its diverse and high-quality images, coupled with practical applications, make it a go-to dataset for researchers, developers, and educators aiming to improve and innovate in machine learning and computer vision.

    This dataset is sourced from Kaggle.

  17. Orange dataset table

    • figshare.com
    xlsx
    Updated Mar 4, 2022
    Cite
    Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    figshare
    Authors
    Rui Simões
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, contains no missing values, and was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

    Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.

  18. Political Tweets Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 23, 2024
    Cite
    Bright Data (2024). Political Tweets Dataset [Dataset]. https://brightdata.com/products/datasets/twitter/tweets/political
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Dec 23, 2024
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Utilize our Political Tweets dataset to enhance campaign strategies and gain insights into public discourse. This dataset offers a comprehensive view of political dynamics on social media, empowering organizations, researchers, and policymakers to analyze trends and sentiment. Access the full dataset or customize it with specific data points tailored to your needs. Popular use cases include:

    • Sentiment Analysis: Analyze publicly available political tweets to understand public sentiment on policies, events, and candidates, aiding campaign strategies and opinion research.
    • Trend Monitoring: Track trending topics and hashtags in political discourse to identify key issues and shifts in public priorities across demographics.
    • Misinformation Detection: Detect and analyze patterns of misinformation, supporting efforts to combat its spread effectively.

    Harness these insights to stay informed and adapt to the evolving political landscape.

  19. McKenzie County, ND annual median income by work experience and sex dataset:...

    • neilsberg.com
    csv, json
    Updated Feb 27, 2025
    + more versions
    Cite
    Neilsberg Research (2025). McKenzie County, ND annual median income by work experience and sex dataset: Aged 15+, 2010-2023 (in 2023 inflation-adjusted dollars) // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/mckenzie-county-nd-income-by-gender/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 27, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    McKenzie County, North Dakota
    Variables measured
    Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates. The dataset covers the years 2010 to 2023, representing 14 years of data. To analyze income differences between genders (male and female), we conducted an initial data analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series (R-CPI-U-RS) based on current methodologies. For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents median income data over a decade or more for males and females categorized by Total, Full-Time Year-Round (FT), and Part-Time (PT) employment in McKenzie County. It showcases annual income, providing insights into gender-specific income distributions and the disparities between full-time and part-time work. The dataset can be utilized to gain insights into gender-based pay disparity trends and explore the variations in income for male and female individuals.

    Key observations: Insights from 2023

    Based on our analysis of ACS 2019-2023 5-Year Estimates, we present the following observations:

    - All workers, aged 15 years and older: In McKenzie County, the median income for all workers aged 15 years and older, regardless of work hours, was $70,683 for males and $45,098 for females.

    These income figures highlight a substantial gender-based income gap in McKenzie County. Women, regardless of work hours, earn 64 cents for each dollar earned by men. This significant gender pay gap, approximately 36%, underscores concerning gender-based income inequality in McKenzie County.

    - Full-time workers, aged 15 years and older: In McKenzie County, among full-time, year-round workers aged 15 years and older, males earned a median income of $82,314, while females earned $52,974, leading to a 36% gender pay gap among full-time workers. This illustrates that women earn 64 cents for each dollar earned by men in full-time roles. This level of income gap emphasizes the urgency to address and rectify this ongoing disparity, where women, despite working full-time, still face a significant wage discrepancy compared to men in the same employment roles.

    Across all roles, including non-full-time employment, women faced a similar gender pay gap percentage. This indicates a consistent income pattern across employment types in McKenzie County, irrespective of employment status.
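
    The "64 cents" and "36%" figures can be reproduced from the quoted medians. The sketch below is one way to check the arithmetic; the function names are illustrative, and rounding to whole cents and whole percent is an assumption about how the figures were derived.

```python
def cents_per_dollar(female_median: float, male_median: float) -> int:
    """Female median income per dollar of male median income, in whole cents."""
    return round(100 * female_median / male_median)

def pay_gap_percent(female_median: float, male_median: float) -> int:
    """Gap as a percentage of male median income, rounded to a whole percent."""
    return round(100 * (1 - female_median / male_median))

# Median incomes quoted in the observations (2023 inflation-adjusted dollars).
all_workers = {"male": 70_683, "female": 45_098}
full_time = {"male": 82_314, "female": 52_974}

for label, group in (("all workers", all_workers), ("full-time", full_time)):
    cents = cents_per_dollar(group["female"], group["male"])
    gap = pay_gap_percent(group["female"], group["male"])
    print(f"{label}: {cents} cents per dollar, {gap}% gap")
```

    Both groups come out at 64 cents per dollar and a 36% gap, matching the stated observations.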

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. All incomes have been adjusted for inflation and are presented in 2023-inflation-adjusted dollars.

    Gender classifications include:

    • Male
    • Female

    Employment type classifications include:

    • Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.
    • Part-time: A part-time worker is a person who worked less than 35 hours per week during the previous calendar year.
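
    The two definitions above can be read as a simple threshold rule. The following is a minimal sketch of that reading; the function name is hypothetical, and the "other" bucket for people who worked 35 or more hours per week but fewer than 50 weeks is an assumption, since neither definition covers that case.

```python
def classify_worker(hours_per_week: float, weeks_worked: int) -> str:
    """Apply the thresholds quoted above; the 'other' bucket is an assumption."""
    if hours_per_week >= 35 and weeks_worked >= 50:
        return "full-time, year-round"
    if hours_per_week < 35:
        return "part-time"
    # 35+ hours per week but fewer than 50 weeks: not covered by either definition.
    return "other"

print(classify_worker(40, 52))
print(classify_worker(20, 52))
print(classify_worker(40, 30))
```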

    Variables / Data Columns

    • Year: This column presents the data year. Expected values are 2010 to 2023
    • Male Total Income: Annual median income, for males regardless of work hours
    • Male FT Income: Annual median income, for males working full time, year-round
    • Male PT Income: Annual median income, for males working part time
    • Female Total Income: Annual median income, for females regardless of work hours
    • Female FT Income: Annual median income, for females working full time, year-round
    • Female PT Income: Annual median income, for females working part time

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for McKenzie County median household income by race.

  20. JPL GRACE and GRACE-FO Mascon Ocean, Ice, and Hydrology Equivalent Water...

    • podaac.jpl.nasa.gov
    html
    Updated Sep 15, 2015
    + more versions
    Cite
    PO.DAAC (2015). JPL GRACE and GRACE-FO Mascon Ocean, Ice, and Hydrology Equivalent Water Height Coastal Resolution Improvement (CRI) Filtered Release 06 Version 02 [Dataset]. http://doi.org/10.5067/TEMSC-3JC62
    Explore at:
    Available download formats: html
    Dataset updated
    Sep 15, 2015
    Dataset provided by
    PO.DAAC
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Apr 4, 2002 - Present
    Variables measured
    GRAVITY ANOMALIES, SEA LEVEL, SEA LEVEL RISE
    Description

    This dataset contains gridded monthly global water storage/height anomalies relative to a time-mean, derived from GRACE and GRACE-FO and processed at JPL using the Mascon approach (Version2/RL06). These data are provided in a single data file in netCDF format, and can be used for analysis for ocean, ice, and hydrology phenomena. This version of the data employs a Coastal Resolution Improvement (CRI) filter that reduces signal leakage errors across coastlines. The water storage/height anomalies are given in equivalent water thickness units (cm). The solution provided here is derived from solving for monthly gravity field variations in terms of geolocated spherical cap mass concentration functions, rather than global spherical harmonic coefficients. Additionally, realistic geophysical information is introduced during the solution inversion to intrinsically remove correlated error. Thus, these Mascon grids do not need to be destriped or smoothed, like traditional spherical harmonic gravity solutions. The complete Mascon solution consists of 4,551 relatively independent estimates of surface mass change that have been derived using an equal-area 3-degree grid of individual mascons. A subset of these individual mascons span coastlines, and contain mixed land and ocean mass change signals. In a post-processing step, the CRI filter is applied to those mixed land/ocean Mascons to separate land and ocean mass. The land mask used to perform this separation is provided in the same directory as this dataset. Since the individual mascons act as an inherent smoother on the gravity field, a set of optional gain factors (for continental hydrology applications) that can be applied to the solution to study mass change signals at sub-mascon resolution is also provided within the same data directory as the Mascon data. Please refer to the 'Data Access' tab at the top of this page to gain direct access to the Mascon data. 
    For more information, please visit https://grace.jpl.nasa.gov/data/get-data/jpl_global_mascons/. For a detailed description of the Mascon solution, including the mathematical derivation, implementation of geophysical constraints, and solution validation, please see Watkins et al., 2015, doi:10.1002/2014JB011547. For a detailed description of the CRI filter implementation, please see Wiese et al., 2016, doi:10.1002/2016WR019344.

Cite
maira javeed (2023). student data analysis [Dataset]. https://www.kaggle.com/datasets/mairajaveed/student-data-analysis

student data analysis

Student Performance Analysis

Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Nov 17, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
maira javeed
Description

In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.

Key Objectives:

  1. Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.

  2. Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.

  3. Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.

Dataset Details:

  • The dataset used in this analysis contains information about students, including their age, gender, parental education, lunch type, and test scores in subjects like mathematics, reading, and writing.

Analysis Highlights:

  • We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.

  • By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.

Why This Matters:

Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.

Acknowledgments:

We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.

Please Note:

This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.
