15 datasets found
  1. Stock Market Simulation Dataset

    • kaggle.com
    Updated Mar 12, 2025
    Cite
    Samay Ashar (2025). Stock Market Simulation Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/11010423
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Samay Ashar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides realistic stock market data generated using Geometric Brownian Motion for price movements and Markov Chains for trend prediction. It is designed for time-series forecasting, financial modeling, and algorithmic trading simulations.
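
    As a rough illustration of the price-generation idea, the sketch below simulates a daily closing-price path with geometric Brownian motion. It is not the dataset author's generator: the function name `simulate_gbm`, the drift and volatility values, and the 252-trading-day convention are all assumptions chosen for the example.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, n_days, seed=0):
    """Simulate daily closing prices with geometric Brownian motion.

    s0: starting price; mu: annual drift; sigma: annual volatility.
    All parameter values used below are illustrative assumptions.
    """
    rng = random.Random(seed)
    dt = 1.0 / 252  # one trading day as a fraction of a year (assumed)
    prices = [s0]
    for _ in range(n_days):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        # Exact GBM step: S_next = S * exp((mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * z)
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        prices.append(prices[-1] * math.exp(step))
    return prices


prices = simulate_gbm(s0=100.0, mu=0.08, sigma=0.2, n_days=1000)
```

    Because each step multiplies the previous price by an exponential, simulated prices stay strictly positive, which is one reason this model is a common choice for synthetic price series.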

    Key Features

    • 1000 days of synthetic stock market data (from January 1, 2022, onwards).
    • Multiple companies from diverse industries (Technology, Finance, Healthcare, Energy, Consumer Goods, Automotive, Aerospace, etc.).
    • Stock price details: Open, High, Low, Close prices.
    • Trading volume and market capitalization.
    • Financial metrics: P/E Ratio, Dividend Yield, Volatility.
    • Sentiment Score: A measure of market sentiment (-1 to 1 scale).
    • Trend Labeling: Bullish, Bearish, or Stable, based on Markov Chain modeling.
    Column Name: Description
    Date: Trading date
    Company: Stock name (e.g., Apple, Tesla, JPMorgan, etc.)
    Sector: Industry classification
    Open: Opening price of the stock
    High: Highest price of the stock for the day
    Low: Lowest price of the stock for the day
    Close: Closing price of the stock
    Volume: Number of shares traded
    Market_Cap: Market capitalization (in USD)
    PE_Ratio: Price-to-Earnings ratio
    Dividend_Yield: Percentage of dividends relative to stock price
    Volatility: Measure of stock price fluctuation
    Sentiment_Score: Market sentiment (-1 to 1 scale)
    Trend: Stock market trend (Bullish, Bearish, or Stable)
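
    The Trend column is described as coming from Markov Chain modeling. A minimal sketch of how a three-state chain could produce such labels follows; the transition probabilities and the `simulate_trend` helper are hypothetical and are not taken from the dataset.

```python
import random

STATES = ["Bullish", "Bearish", "Stable"]

# Hypothetical transition probabilities (assumption); each row sums to 1
# and lists the chance of moving to Bullish, Bearish, Stable in that order.
TRANSITIONS = {
    "Bullish": [0.6, 0.1, 0.3],
    "Bearish": [0.1, 0.6, 0.3],
    "Stable": [0.25, 0.25, 0.5],
}


def simulate_trend(n_days, start="Stable", seed=0):
    """Return a list of n_days trend labels drawn from the Markov chain."""
    rng = random.Random(seed)
    state = start
    trend = [state]
    for _ in range(n_days - 1):
        # The next state depends only on the current state.
        state = rng.choices(STATES, weights=TRANSITIONS[state], k=1)[0]
        trend.append(state)
    return trend


trend = simulate_trend(1000)
```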

    Usage Scenarios

    🔹 Time-Series Forecasting: Train models like LSTMs, Transformers, or ARIMA for stock price prediction.
    🔹 Algorithmic Trading: Develop trading strategies based on trends and sentiment.
    🔹 Feature Engineering: Explore correlations between financial metrics and stock movements.
    🔹 Quantitative Finance Research: Analyze market trends using simulated yet realistic data.


  2. S2 Data -

    • plos.figshare.com
    txt
    Updated Dec 14, 2023
    Cite
    Mahadee Al Mobin; Md. Kamrujjaman (2023). S2 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0295803.s002
    Available download formats: txt
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mahadee Al Mobin; Md. Kamrujjaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data scarcity and discontinuity are common in healthcare and epidemiological datasets, yet such data are often needed to make informed decisions and forecast upcoming scenarios. To avoid these problems, the data are often processed as monthly or yearly aggregates, on which prevalent forecasting tools such as Autoregressive Integrated Moving Average (ARIMA), Seasonal Autoregressive Integrated Moving Average (SARIMA), and TBATS often fail to provide satisfactory results. Artificial data synthesis methods have proven to be a powerful tool for tackling these challenges. The paper proposes a novel algorithm, Stochastic Bayesian Downscaling (SBD), based on the Bayesian approach, that can regenerate downscaled time series of varying lengths from aggregated data while preserving most of the statistical characteristics and the aggregated sum of the original data. The paper presents two epidemiological time series case studies from Bangladesh (Dengue, Covid-19) to showcase the workflow of the algorithm. The case studies illustrate that the synthesized data agree with the original data in statistical properties, trend, seasonality, and residuals. In terms of forecasting performance, using the last 12 years of Dengue infection data in Bangladesh, we were able to decrease error terms by up to 72.76% using synthetic data over actual aggregated data.
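
    To make the "preserving the aggregated sum" constraint concrete, here is a deliberately simple sketch that splits each aggregate into finer values whose total equals the aggregate. It is not the SBD algorithm from the paper: it draws random Gamma weights and normalizes them (equivalent to a Dirichlet draw), and the `downscale` helper and the monthly figures are made up for illustration.

```python
import random


def downscale(aggregates, parts, seed=0):
    """Split each aggregate into `parts` non-negative values that sum back to it.

    Illustrative stand-in only, not the paper's method: weights are independent
    Gamma(2, 1) draws, normalized so each block reproduces its aggregate.
    """
    rng = random.Random(seed)
    series = []
    for total in aggregates:
        weights = [rng.gammavariate(2.0, 1.0) for _ in range(parts)]
        scale = total / sum(weights)  # forces the block to sum to `total`
        series.extend(w * scale for w in weights)
    return series


monthly = [120.0, 90.0, 150.0]  # made-up monthly case counts
daily = downscale(monthly, parts=30)
```

    A real downscaling method would also need to reproduce trend and seasonality within each block; this sketch only enforces the sum constraint.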

  3. Coefficients of ARIMA(7,0,7).

    • plos.figshare.com
    xls
    Updated Dec 14, 2023
    Cite
    Mahadee Al Mobin; Md. Kamrujjaman (2023). Coefficients of ARIMA(7,0,7). [Dataset]. http://doi.org/10.1371/journal.pone.0295803.t010
    Available download formats: xls
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mahadee Al Mobin; Md. Kamrujjaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data scarcity and discontinuity are common in healthcare and epidemiological datasets, yet such data are often needed to make informed decisions and forecast upcoming scenarios. To avoid these problems, the data are often processed as monthly or yearly aggregates, on which prevalent forecasting tools such as Autoregressive Integrated Moving Average (ARIMA), Seasonal Autoregressive Integrated Moving Average (SARIMA), and TBATS often fail to provide satisfactory results. Artificial data synthesis methods have proven to be a powerful tool for tackling these challenges. The paper proposes a novel algorithm, Stochastic Bayesian Downscaling (SBD), based on the Bayesian approach, that can regenerate downscaled time series of varying lengths from aggregated data while preserving most of the statistical characteristics and the aggregated sum of the original data. The paper presents two epidemiological time series case studies from Bangladesh (Dengue, Covid-19) to showcase the workflow of the algorithm. The case studies illustrate that the synthesized data agree with the original data in statistical properties, trend, seasonality, and residuals. In terms of forecasting performance, using the last 12 years of Dengue infection data in Bangladesh, we were able to decrease error terms by up to 72.76% using synthetic data over actual aggregated data.

  4. Delhi Power Load with Weather & Development

    • kaggle.com
    Updated Jan 12, 2025
    Cite
    Pratik Chougule (2025). Delhi Power Load with Weather & Development [Dataset]. https://www.kaggle.com/datasets/pratikyuvrajchougule/delhi-datset
    Dataset updated
    Jan 12, 2025
    Dataset provided by
    Kaggle
    Authors
    Pratik Chougule
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Delhi
    Description

    This dataset provides synthetic data designed to analyze and predict power load (in MW) in Delhi, incorporating a variety of influencing factors such as weather, holidays, festivals, and real estate development levels. With over a year of hourly data, this dataset is ideal for researchers, students, and practitioners working on energy systems, urban planning, and time-series forecasting.

    Key Features:

    • Weather Data: Temperature, humidity, wind speed, and rainfall measurements for each hour.
    • Socio-Economic Indicators: Information on public holidays, weekly holidays, and festival days.
    • Urban Development: Classification of areas into low, medium, and high development zones with respective percentages.
    • Power Load (MW): Target variable representing hourly electricity consumption in megawatts.

    Purpose: This dataset is intended for the following use cases:

    1. Power Load Forecasting: Build machine learning models to predict future electricity demand.
    2. Weather Impact Studies: Analyze how weather conditions influence power consumption patterns.
    3. Urban Development Insights: Explore the correlation between area development levels and energy usage.
    4. Policy Planning: Assist policymakers in understanding energy demand trends during holidays, festivals, and extreme weather.
    5. Time Series Analysis: Practice and research advanced time-series forecasting techniques.
    6. Renewable Energy Integration: Develop models to optimize energy distribution and reduce reliance on non-renewable sources.
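
    For the load-forecasting use case, any learned model is usually compared against a simple reference. The sketch below shows one such reference, a seasonal-naive forecast that repeats the last 24 hours. The hourly series is fabricated inside the snippet (a pure daily sine cycle), and `seasonal_naive` is a hypothetical helper; neither comes from the dataset.

```python
import math


def seasonal_naive(history, horizon, period=24):
    """Forecast by repeating the last full period of the history."""
    last = history[-period:]
    return [last[i % period] for i in range(horizon)]


# Made-up hourly load in MW with a clean daily cycle (assumption for the demo).
load = [3000 + 800 * math.sin(2 * math.pi * (h % 24) / 24) for h in range(24 * 7)]

train, test = load[:-24], load[-24:]
forecast = seasonal_naive(train, horizon=24)

# Mean absolute error of the baseline on the held-out day.
mae = sum(abs(f - a) for f, a in zip(forecast, test)) / len(test)
```

    On this perfectly periodic toy series the baseline error is zero; on data with weather, holiday, and festival effects, the gap between this baseline and a model gives a sense of how much those features contribute.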

    Potential Applications:

    • Building intelligent power grid systems.
    • Analyzing the impact of climate change on energy demand.
    • Supporting smart city initiatives with energy-efficient planning.
    • Creating educational tools for data science and machine learning learners.
  5. Selection of best model based on criteria.

    • plos.figshare.com
    xls
    Updated Dec 14, 2023
    Cite
    Mahadee Al Mobin; Md. Kamrujjaman (2023). Selection of best model based on criteria. [Dataset]. http://doi.org/10.1371/journal.pone.0295803.t009
    Available download formats: xls
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mahadee Al Mobin; Md. Kamrujjaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data scarcity and discontinuity are common in healthcare and epidemiological datasets, yet such data are often needed to make informed decisions and forecast upcoming scenarios. To avoid these problems, the data are often processed as monthly or yearly aggregates, on which prevalent forecasting tools such as Autoregressive Integrated Moving Average (ARIMA), Seasonal Autoregressive Integrated Moving Average (SARIMA), and TBATS often fail to provide satisfactory results. Artificial data synthesis methods have proven to be a powerful tool for tackling these challenges. The paper proposes a novel algorithm, Stochastic Bayesian Downscaling (SBD), based on the Bayesian approach, that can regenerate downscaled time series of varying lengths from aggregated data while preserving most of the statistical characteristics and the aggregated sum of the original data. The paper presents two epidemiological time series case studies from Bangladesh (Dengue, Covid-19) to showcase the workflow of the algorithm. The case studies illustrate that the synthesized data agree with the original data in statistical properties, trend, seasonality, and residuals. In terms of forecasting performance, using the last 12 years of Dengue infection data in Bangladesh, we were able to decrease error terms by up to 72.76% using synthetic data over actual aggregated data.

  6. Aerospace Artificial Intelligence (AI) Market Analysis North America,...

    • technavio.com
    Updated Feb 28, 2025
    Cite
    Technavio (2025). Aerospace Artificial Intelligence (AI) Market Analysis North America, Europe, APAC, Middle East and Africa, South America - US, Canada, UK, China, Germany, France, Italy, India, Japan, South Korea - Size and Forecast 2025-2029 [Dataset]. https://www.technavio.com/report/aerospace-artificial-intelligence-market-industry-analysis
    Dataset updated
    Feb 28, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Global
    Description

    Aerospace Artificial Intelligence Market Size 2025-2029

    The aerospace artificial intelligence (AI) market size is forecast to increase by USD 7.24 billion at a CAGR of 45.9% between 2024 and 2029.
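
    The headline figure combines an absolute increase with a compound annual growth rate (CAGR). As a worked example of how the two relate, the sketch below backs out the base-year market size they would imply. It assumes the increase is measured from 2024 to 2029, that growth compounds annually over five years, and that the CAGR applies to total market size; the report excerpt does not state a base-year size, so the result is only an implied value under those assumptions.

```python
def implied_base(increase, cagr, years):
    """Base-year size B such that B * ((1 + cagr) ** years - 1) == increase."""
    return increase / ((1.0 + cagr) ** years - 1.0)


# Figures from the summary: USD 7.24 billion increase at a 45.9% CAGR.
base_2024 = implied_base(increase=7.24, cagr=0.459, years=5)
size_2029 = base_2024 + 7.24  # implied end-of-period size, in USD billion

print(f"implied 2024 size: {base_2024:.2f}, implied 2029 size: {size_2029:.2f}")
```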

    Artificial Intelligence (AI) is revolutionizing the aerospace industry with its application in various domains, including software for flight simulation and virtual assistants for cockpit interaction. The rising trend of digital transformation in aviation is driving market growth, as AI enables automation in aircraft maintenance, threat detection systems, and additive manufacturing. The increasing use of drones equipped with sensors and data analytics capabilities is another significant trend, offering opportunities for real-time data collection and analysis. However, concerns surrounding data security and privacy are major challenges, necessitating strong cybersecurity measures. Machine learning algorithms, image recognition, and natural language processing are key technologies enabling AI in the aerospace sector, enhancing travel experiences and optimizing operational efficiency. The adoption of AI is set to continue, with the market expected to grow significantly in the coming years.
    

    What will be the Size of the Aerospace Artificial Intelligence (AI) Market During the Forecast Period?

    The market encompasses the application of AI models, including machine learning, computer vision, and natural language processing, to enhance various aspects of the aerospace sector. AI technologies are increasingly being integrated into flight operations for predictive maintenance, optimization of fuel consumption, and improving pilot training through computer vision and voice recognition. In customer service, virtual assistants and voice recognition systems facilitate efficient communication between airlines and passengers.
    Air traffic control benefits from AI's ability to analyze big data and identify data patterns for improved safety and efficiency. AI is also employed for observation tasks, such as analyzing time series data for anomaly detection and predictive maintenance in aircraft components. The aerospace AI market is poised for significant growth, as human intelligence is augmented by AI software to address complex challenges and optimize processes.
    

    How is this Aerospace Artificial Intelligence (AI) Industry segmented and which is the largest segment?

    The aerospace artificial intelligence (AI) industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Component
      Software
      Hardware
      Services
    End-user
      Defense and military
      Commercial aviation
      Aircraft manufacturers
      Space exploration
      Airports
    Application
      Machine learning
      Natural language processing
      Computer vision
      Context awareness computing
    Geography
      North America
        Canada
        US
      Europe
        Germany
        UK
        France
        Italy
      APAC
        China
        India
        Japan
        South Korea
      Middle East and Africa
      South America

    By Component Insights

    The software segment is estimated to witness significant growth during the forecast period.
    

    Aerospace Artificial Intelligence (AI) software plays a crucial role in the development and operation of autonomous systems for UAVs, drones, and spacecraft. AI algorithms, including machine learning, computer vision, and neural networks, enable navigation, obstacle detection, and real-time decision-making. For instance, Airbus SE's Air Superiority Tactical Assistance Real-Time Execution System (ASTares) digitizes human-level experience to support tactical coordination in the Future Combat Air System (FCAS). In the aerospace sector, AI software optimizes flight control systems by analyzing data from sensors and adjusting flight parameters in real-time. This leads to improved fuel efficiency, reduced emissions, and enhanced safety. AI models are also integrated into customer service applications, such as virtual assistants and chatbots, to streamline airline industry processes and improve customer satisfaction.

    The software segment was valued at USD 141.10 million in 2019 and showed a gradual increase during the forecast period.

    Regional Analysis

    North America is estimated to contribute 35% to the growth of the global market during the forecast period.
    

    Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period.

    The aerospace industry is embracing Artificial Intelligence (AI) to enhance operational efficiency and automate processes in North America. Machine le

  7. Global Synthetic Data Tool Market Research Report: By Type (Image...

    • wiseguyreports.com
    Updated Aug 10, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Synthetic Data Tool Market Research Report: By Type (Image Generation, Text Generation, Audio Generation, Time-Series Generation, User-Generated Data Marketplace), By Application (Computer Vision, Natural Language Processing, Predictive Analytics, Healthcare, Retail), By Deployment Mode (Cloud-Based, On-Premise), By Organization Size (Small and Medium Enterprises (SMEs), Large Enterprises) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/cn/reports/synthetic-data-tool-market
    Dataset updated
    Aug 10, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 8, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 7.98 (USD Billion)
    MARKET SIZE 2024: 9.55 (USD Billion)
    MARKET SIZE 2032: 40.0 (USD Billion)
    SEGMENTS COVERED: Type, Application, Deployment Mode, Organization Size, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: Growing Demand for Data Privacy and Security; Advancement in Artificial Intelligence (AI) and Machine Learning (ML); Increasing Need for Faster and More Efficient Data Generation; Growing Adoption of Synthetic Data in Various Industries; Government Regulations and Compliance
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: MostlyAI, Gretel.ai, H2O.ai, Scale AI, UNchart, Anomali, Replica, Big Syntho, Owkin, DataGenix, Synthesized, Verisart, Datumize, Deci, Datasaur
    MARKET FORECAST PERIOD: 2025 - 2032
    KEY MARKET OPPORTUNITIES: Data privacy compliance; Improved data availability; Enhanced data quality; Reduced data bias; Cost-effective
    COMPOUND ANNUAL GROWTH RATE (CAGR): 19.61% (2025 - 2032)
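
    The listed growth rate can be cross-checked against the listed market sizes with the standard CAGR formula, (end / start) ** (1 / years) - 1. The sketch below assumes the rate compounds from the 2024 figure (9.55) to the 2032 figure (40.0) over eight years; that reading is an assumption, since the listing labels the period as 2025 - 2032.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Market sizes in USD billion, taken from the listing above.
rate = cagr(start=9.55, end=40.0, years=8)  # 2024 -> 2032 (assumed span)
print(f"{rate:.2%}")
```

    Under that assumption the computed rate agrees with the stated 19.61% to two decimal places.
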
  8. Heat pump COP drop - synthetic faults

    • kaggle.com
    zip
    Updated Feb 28, 2023
    Cite
    Mathieu Vallee (2023). Heat pump COP drop - synthetic faults [Dataset]. https://www.kaggle.com/datasets/mathieuvallee/ai-dhc-heatpump-cop
    Available download formats: zip (68378018 bytes)
    Dataset updated
    Feb 28, 2023
    Authors
    Mathieu Vallee
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains data generated in the AI DHC project.

    This dataset contains synthetic fault data for a decrease in the COP (coefficient of performance) of a heat pump.

    The IEA DHC Annex XIII project “Artificial Intelligence for Failure Detection and Forecasting of Heat Production and Heat demand in District Heating Networks” is developing Artificial Intelligence (AI) methods for forecasting heat demand and heat production and is evaluating algorithms for detecting faults which can be used by interested stakeholders (operators, suppliers of DHC components and manufacturers of control devices).
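
    For orientation, the coefficient of performance of a heat pump is the heat delivered divided by the electrical power consumed. The sketch below computes it from two made-up signals and flags samples that fall a fixed fraction below a baseline. The readings, the baseline, the 15% tolerance, and the helper names are all hypothetical; this is not one of the fault-detection algorithms evaluated in the project.

```python
def cop(heat_out_kw, power_in_kw):
    """Coefficient of performance: heat delivered per unit of electrical input."""
    return heat_out_kw / power_in_kw


def flag_cop_drop(cops, baseline, tolerance=0.15):
    """Mark samples whose COP is more than `tolerance` below the baseline."""
    threshold = baseline * (1.0 - tolerance)
    return [c < threshold for c in cops]


# Made-up readings: heat output and electrical input in kW.
heat = [35.0, 34.5, 35.2, 30.0, 28.0]
power = [10.0, 10.0, 10.1, 10.0, 10.0]

cops = [cop(q, p) for q, p in zip(heat, power)]
flags = flag_cop_drop(cops, baseline=3.5)
```

    A fixed threshold is the simplest possible rule; in practice the expected COP varies with operating temperatures, so a baseline that depends on operating conditions would be needed.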

    See https://github.com/mathieu-vallee/ai-dhc for the models and Python scripts used to generate the dataset.

    Please cite this dataset as: Vallee, M., Wissocq T., Gaoua Y., Lamaison N., Generation and Evaluation of a Synthetic Dataset to improve Fault Detection in District Heating and Cooling Systems, 2023 (under review at the Energy journal)

    Disclaimer notice (IEA DHC): This project has been independently funded by the International Energy Agency Technology Collaboration Programme on District Heating and Cooling including Combined Heat and Power (IEA DHC).

    Any views expressed in this publication are not necessarily those of IEA DHC.

    IEA DHC can take no responsibility for the use of the information within this publication, nor for any errors or omissions it may contain.

    Information contained herein has been compiled or derived from sources believed to be reliable. Nevertheless, the authors or their organizations do not accept liability for any loss or damage arising from the use thereof. Using the given information is strictly your own responsibility.

    Disclaimer Notice (Authors):

    This publication has been compiled with reasonable skill and care. However, neither the authors nor the DHC Contracting Parties (of the International Energy Agency Technology Collaboration Programme on District Heating & Cooling) make any representation as to the adequacy or accuracy of the information contained herein, or as to its suitability for any particular application, and accept no responsibility or liability arising out of the use of this publication. The information contained herein does not supersede the requirements given in any national codes, regulations or standards, and should not be regarded as a substitute

    Copyright:

    All property rights, including copyright, are vested in IEA DHC. In particular, all parts of this publication may be reproduced, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise only by crediting IEA DHC as the original source. Republishing of this report in another format or storing the report in a public retrieval system is prohibited unless explicitly permitted by the IEA DHC Operating Agent in writing.

  9. Artificial Intelligence-As-A-Service (AIaaS) Market Analysis, Size, and...

    • technavio.com
    Cite
    Technavio, Artificial Intelligence-As-A-Service (AIaaS) Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), APAC (China, India, Japan, South Korea), Europe (France, Germany, Italy, UK), Middle East and Africa , and South America [Dataset]. https://www.technavio.com/report/artificial-intelligence-as-a-service-market-industry-analysis
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Global
    Description

    Artificial Intelligence-As-A-Service (AIaaS) Market Size 2025-2029

    The artificial intelligence-as-a-service (AIaaS) market size is forecast to increase by USD 60.24 billion at a CAGR of 42.6% between 2024 and 2029.

    The market is experiencing significant growth, driven by increasing investment in research and development and the integration of AIaaS with emerging technologies like Blockchain. These advancements enable organizations to harness the power of AI to streamline operations, enhance customer experiences, and gain competitive advantages. However, the market faces challenges, including data privacy concerns, as businesses grapple with securing sensitive information in a cloud-based environment. As AIaaS continues to evolve, it's crucial for businesses to stay informed about these trends and address the associated challenges to fully leverage the potential of AI technology.

    What will be the Size of the Artificial Intelligence-As-A-Service (AIaaS) Market During the Forecast Period?

    In this dynamic and evolving market, various advanced technologies are shaping the future of business intelligence. NoSQL databases are increasingly being adopted for their flexibility in handling large, complex datasets. Human-computer interaction is advancing with the integration of Virtual Reality (VR) and Mixed Reality (MR), enhancing user experiences. Reinforcement learning, deep learning, and transfer learning are revolutionizing decision-making processes, providing insights from vast datasets. Time series analysis and unsupervised learning are essential for predictive analytics and pattern recognition. Data warehousing and serverless computing optimize storage and processing capabilities, while cognitive computing and machine translation streamline business operations through automation and multilingual understanding. Sentiment analysis and text summarization are transforming customer engagement and market research, enabling businesses to gain valuable insights from unstructured data. Neural networks and quantum computing are pushing the boundaries of AI, offering unprecedented processing power and efficiency. The integration of AI technologies like semi-supervised learning, reinforcement learning, and deep learning in various applications, including VR, MR, and AR, is redefining industries and creating new opportunities for businesses. In the realm of big data, edge computing and serverless computing are becoming essential components, enabling real-time processing and analysis, while AI continues to drive innovation and growth.

    How is this Artificial Intelligence-As-A-Service (AIaaS) Industry segmented and which is the largest segment?

    The artificial intelligence-as-a-service (AIaaS) industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    End-user
      Retail and healthcare
      BFSI
      Telecommunication
      Government and defense
      Others
    Type
      Software
      Services
    Deployment
      Public cloud
      Private cloud
      Hybrid cloud
    Source
      Large enterprises
      SMEs
    Technology
      Machine learning
      Natural language processing
      Computer vision
      Others
    Geography
      North America
        US
        Canada
      Europe
        France
        Germany
        Italy
        UK
      APAC
        China
        India
        Japan
        South Korea
      Rest of World (ROW)

    By End-user Insights

    The retail and healthcare segment is estimated to witness significant growth during the forecast period.

    The market is experiencing significant growth as businesses seek to enhance their enterprise resource planning software with AI capabilities. Retail organizations, in particular, are modernizing their IT infrastructure to accommodate new technologies and meet evolving customer expectations. With the increasing competition in retail industries driven by the demand for convenient web and mobile shopping platforms, traditional businesses are expanding into e-commerce. Local retailers are also investing in IT solutions, including AIaaS, to remain competitive and generate additional revenue through online channels. AIaaS is being integrated into various applications, such as marketing automation, cost optimization, predictive analytics, security audits, virtual assistants, recommendation engines, performance optimization, and user interface/experience enhancement. Industry-specific solutions, mobile applications, agile development, and API integration are also gaining popularity. Businesses are leveraging AIaaS for data mining, technical support, natural language processing, business intelligence, content personalization, machine learning models, data visualization, customer service, and more. Additionally, AIaaS is being used for data analysis, fraud detection, computer vision, process automation, data security, training, and documentation, and software-as-a-service (SaaS) offerings. Cloud computing and open-source technologies are enabling the ado

  10. Comparative analysis of existing literature.

    • plos.figshare.com
    xls
    Updated Feb 5, 2025
    Cite
    Ghulam Mustafa; Muhammad Ali Moazzam; Asif Nawaz; Tariq Ali; Deema Mohammed Alsekait; Ahmed Saleh Alattas; Diaa Salama AbdElminaam (2025). Comparative analysis of existing literature. [Dataset]. http://doi.org/10.1371/journal.pone.0316682.t001
    Available download formats: xls
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ghulam Mustafa; Muhammad Ali Moazzam; Asif Nawaz; Tariq Ali; Deema Mohammed Alsekait; Ahmed Saleh Alattas; Diaa Salama AbdElminaam
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Accurate crop yield forecasting is vital for ensuring food security and making informed decisions. With a growing population and global warming, addressing food security has become a priority. Artificial Intelligence (AI) has significantly increased the accuracy of yield prediction. Existing Machine Learning (ML) methods use statistical measures such as regression, correlation, and the chi-square test to predict crop yield; all such models lead to low accuracy when the number of factors (variables), such as weather and soil conditions, wind, fertilizer quantity, seed quality, and climate, increases. The proposed methodology consists of several stages: Data Collection, Preprocessing, Feature Extraction with Support Vector Machine (SVM), correlation with Normalized Google Distance (NGD), and feature ranking with rising star. This study combines a Bidirectional Gated Recurrent Unit (Bi-GRU) and a Time Series CNN to predict crop yield and then make recommendations for further improvement. The proposed model showed very good results on all datasets and a significant improvement over baseline models. The ECP-IEM achieved an accuracy of 96.34%, precision of 94.56%, and recall of 95.23% on different datasets. Moreover, the proposed model was also evaluated on MAE, MSE, and RMSE, which produced values of 0.191, 0.0674, and 0.238, respectively. This will help improve crop production by giving an early view of the yield, which will then help farmers improve it.
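
    For readers unfamiliar with the three error measures named above, the sketch below shows how MAE, MSE, and RMSE are conventionally computed from paired actual and predicted values. The numbers are toy values invented for the example and have no connection to the study's data; `regression_errors` is a hypothetical helper name.

```python
import math


def regression_errors(actual, predicted):
    """Return (MAE, MSE, RMSE) for two equal-length sequences."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)    # mean squared error
    rmse = math.sqrt(mse)                             # root of the MSE
    return mae, mse, rmse


# Toy yields (arbitrary units), made up for illustration.
actual = [2.0, 3.0, 5.0, 4.0]
predicted = [2.5, 2.5, 5.0, 5.0]

mae, mse, rmse = regression_errors(actual, predicted)
```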

  11. Data from: From Chaos to Harmony: Addressing Data De-Noising, Complexity and...

    • curate.nd.edu
    pdf
    Updated Apr 28, 2025
    Cite
    Qianlong Wen (2025). From Chaos to Harmony: Addressing Data De-Noising, Complexity and Adaptability in Graph Machine Learning [Dataset]. http://doi.org/10.7274/28786127.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    University of Notre Dame
    Authors
    Qianlong Wen
    License

    https://www.law.cornell.edu/uscode/text/17/106

    Description

    Graph representation learning—especially via graph neural networks (GNNs)—has demonstrated considerable promise in modeling intricate interaction systems, such as social networks and molecular structures. However, the deployment of GNN-based frameworks in industrial settings remains challenging due to the inherent complexity and noise in real-world graph data. This dissertation systematically addresses these challenges by advancing novel methodologies to improve the comprehensiveness and robustness of graph representation learning, with a dual focus on resolving data complexity and denoising across diverse graph-learning scenarios. In addressing graph data denoising, we design auxiliary self-supervised optimization objectives that disentangle noisy topological structures and misinformation while preserving the representational sufficiency of critical graph features. These tasks operate synergistically with primary learning objectives to enhance robustness against data corruption. The efficacy of these techniques is demonstrated through their application to real-world opioid prescription time series data for predicting potential opioid over-prescription. To mitigate data complexity, the study investigates two complementary approaches: (1) multimodal fusion, which employs attentive integration of graph data with features from other modalities, and (2) hierarchical substructure mining, which extracts semantic patterns at multiple granularities to enhance model generalization in demanding contexts. Finally, the dissertation explores the adaptability of graph data in a range of practical applications, including E-commerce demand forecasting and recommendations, to further enhance prediction and reasoning capabilities.

  12. KPI prediction dataset

    • ieee-dataport.org
    Updated Jun 20, 2024
    Cite
    Hu Zhang (2024). KPI prediction dataset [Dataset]. https://ieee-dataport.org/documents/kpi-prediction-dataset
    Explore at:
    Dataset updated
    Jun 20, 2024
    Authors
    Hu Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    KPI prediction

  13. Spacecraft Thruster Firing Test Dataset

    • zenodo.org
    • data.niaid.nih.gov
    csv, zip
    Updated Jul 16, 2024
    Cite
    Patrick Fleith (2024). Spacecraft Thruster Firing Test Dataset [Dataset]. http://doi.org/10.5281/zenodo.7137930
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Patrick Fleith
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    WARNING

    This version of the dataset is not recommended for anomaly detection use cases. We discovered discrepancies in the anomalous sequences. A new version will be released. In the meantime, please ignore all sequences marked as anomalous.

    CONTEXT

    Testing hardware to qualify it for spaceflight is critical for modelling and verifying performance. Hot fire tests (also known as life-tests) are typically run during the qualification campaigns of satellite thrusters, but the results remain proprietary, which makes it difficult for the machine learning community to develop suitable data-driven predictive models. This synthetic dataset was generated partially based on the real-world physics of monopropellant chemical thrusters, to foster the development and benchmarking of new data-driven analytical methods (machine learning, deep learning, etc.).

    The PDF document "STFT Dataset Description" describes in detail the structure, context, use cases, and thruster domain knowledge that ML practitioners need to use the dataset.

    PROPOSED TASKS

    Supervised:

    • Performance Modelling: Prediction of thruster performance (the target can be thrust, mass flow rate, and/or average specific impulse)
    • Acceptance Test for Individualised Performance Model refinement: Taking into account the acceptance test of an individual thruster may help generate an individualised predictive model for that thruster
    • Uncertainty Quantification for Thruster-to-thruster reproducibility verification, i.e. to evaluate the prediction variability between several thrusters in order to construct uncertainty bounds around the prediction (predictive intervals) of the thrust and mass flow rate of future thrusters that may be used during an actual space mission

    Unsupervised / Anomaly Detection

    • Anomaly Detection: Anomalies can be detected in an unsupervised setting (outlier detection) or in a semi-supervised setting (novelty detection). The dataset includes a total of 270 anomalies. A simple approach is to predict whether a firing test sequence is anomalous or nominal. A more advanced approach is to predict which portion of a time series is anomalous. The dataset also provides detailed information on whether each time point is anomalous or nominal. For each anomaly, a code is provided that makes it possible to diagnose the detection system's performance on the different types of anomalies contained in the dataset.
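
    The two granularities described above (sequence-level and point-level detection) can be illustrated with a small novelty-detection sketch that is fitted on nominal sequences only. The per-time-step z-score rule, the threshold of 4, the function names, and the toy thrust traces are all assumptions made for illustration; they do not reflect the dataset's actual schema or any recommended method.

```python
import statistics

def fit_nominal_profile(nominal_sequences):
    """Per-time-step mean and standard deviation across nominal sequences."""
    steps = list(zip(*nominal_sequences))  # assumes equal-length sequences
    means = [statistics.fmean(s) for s in steps]
    stdevs = [statistics.stdev(s) for s in steps]
    return means, stdevs

def point_flags(sequence, means, stdevs, threshold=4.0):
    """Point-level detection: True where a sample is far from the nominal profile."""
    return [abs(x - m) > threshold * max(sd, 1e-9)
            for x, m, sd in zip(sequence, means, stdevs)]

def is_anomalous(sequence, means, stdevs, threshold=4.0):
    """Sequence-level detection: anomalous if any time point is flagged."""
    return any(point_flags(sequence, means, stdevs, threshold))

# Hypothetical thrust traces in arbitrary units, not taken from the dataset.
nominal = [
    [0.0, 0.9, 1.0, 1.0, 0.1],
    [0.0, 1.0, 1.1, 1.0, 0.0],
    [0.1, 1.0, 1.0, 0.9, 0.0],
    [0.0, 1.1, 0.9, 1.0, 0.1],
]
means, stdevs = fit_nominal_profile(nominal)
test_sequence = [0.0, 1.0, 0.2, 1.0, 0.0]  # thrust dip at index 2
print(point_flags(test_sequence, means, stdevs))
print(is_anomalous(test_sequence, means, stdevs))
```

    Fitting the profile on nominal data only corresponds to the semi-supervised (novelty detection) setting; an unsupervised variant would fit on all sequences without using labels.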

  14. Giant Mud Crab Molting Visual Dataset

    • data.mendeley.com
    Updated Dec 16, 2024
    Cite
    Dany Eka Saputra (2024). Giant Mud Crab Molting Visual Dataset [Dataset]. http://doi.org/10.17632/4kc36yjhdy.1
    Explore at:
    Dataset updated
    Dec 16, 2024
    Authors
    Dany Eka Saputra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains images of Giant Mud Crab growth before molting. It is time-series data showing the growth of several crabs before they molt. The data was collected as a basis for developing an AI model that can predict a crab's time to molt, specifically for the Giant Mud Crab species (Scylla serrata). The hypothesis behind the data collection is that the time of molting can be predicted by observing visual cues (e.g. growth of limbs) on the crab. The dataset contains images of 6 different crabs taken over the same time periods. The crabs have different molting times, so the dataset includes time-to-molt data for each image, indicating how long until the crab in the picture will molt. The dataset was gathered in November 2024 at a vertical crab farm in Surabaya, Indonesia.

  15. Cost of Living in Nairobi

    • kaggle.com
    Updated Feb 15, 2025
    Cite
    Yacooti (2025). Cost of Living in Nairobi [Dataset]. https://www.kaggle.com/datasets/yacooti/cost-of-living-in-nairobi/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yacooti
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Nairobi
    Description

    🏡 Cost of Living in Nairobi, Kenya

    📌 Overview

    This dataset provides a detailed time-series estimate of the monthly cost of living across 20 different areas in Nairobi, Kenya from 2019 to 2024. It covers essential expenses such as rent, food, transport, utilities, and miscellaneous costs, allowing for comprehensive cost-of-living analysis.

    This dataset is useful for:
    ✅ Individuals planning to move to Nairobi
    ✅ Researchers analyzing long-term cost trends
    ✅ Businesses assessing salary benchmarks based on inflation
    ✅ Data scientists developing predictive models for cost forecasting

    📊 Data Summary

    • Total Records: 60,000 (5 years of monthly data)
    • Columns:
      • 🏠 Area: The residential area in Nairobi
      • 💰 Rent: Estimated monthly rent (KES)
      • 🍽️ Food: Grocery and dining expenses (KES)
      • 🚕 Transport: Public and private transport costs (KES)
      • Utilities: Water, electricity, and internet bills (KES)
      • 🎭 Misc: Entertainment, personal care, and leisure expenses (KES)
      • 🏷️ Total: Sum of all expenses
      • 📆 Date: Monthly timestamp from January 2019 to December 2024

    📍 Areas Covered

    This dataset provides cost estimates for 20+ residential areas, including:
    - High-End Areas 🏡: Kileleshwa, Westlands, Karen
    - Mid-Range Areas 🏙️: South B, Langata, Ruaka
    - Affordable Areas 🏠: Embakasi, Kasarani, Githurai, Ruiru, Umoja
    - Satellite Towns 🌿: Ngong, Rongai, Thika, Kitengela, Kikuyu

    🛠️ How the Data Was Generated

    This dataset was synthetically generated using Python, incorporating realistic market variations. The process includes:

    • Inflation Modeling 📈 – A 2% annual increase in costs over time.
    • Seasonal Effects 📅 – Higher food and transport costs in December & January (holiday season), rent spikes in June & July.
    • Economic Shocks ⚠️ – A 5% chance per record of external economic effects (e.g., fuel price hikes, supply chain issues).
    • Random Fluctuations 🔄 – Expenses vary slightly month-to-month to simulate real-world spending behavior.
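
    The four ingredients above could be combined roughly as in the sketch below. Only the 2% annual inflation, the 5% per-record shock probability, and the seasonal months come from the description; the base amounts, shock size, seasonal premium, noise level, and monthly compounding are assumptions, and `simulate_series` is a hypothetical helper rather than the author's generator.

```python
import random

ANNUAL_INFLATION = 0.02    # 2% per year, as stated in the description
SHOCK_PROBABILITY = 0.05   # 5% chance per record, as stated in the description
SHOCK_SIZE = 0.10          # assumed size of an economic shock
SEASONAL_PREMIUM = 0.05    # assumed size of the seasonal increase
NOISE_SD = 0.02            # assumed month-to-month random fluctuation

def simulate_series(base, peak_months, months=72, start_year=2019, seed=0):
    """Return (YYYY-MM, value) pairs for one expense category in one area."""
    rng = random.Random(seed)
    rows = []
    for i in range(months):
        year, month = start_year + i // 12, i % 12 + 1
        value = base * (1 + ANNUAL_INFLATION) ** (i / 12)  # inflation, compounded monthly (assumption)
        if month in peak_months:
            value *= 1 + SEASONAL_PREMIUM                  # seasonal effect
        if rng.random() < SHOCK_PROBABILITY:
            value *= 1 + SHOCK_SIZE                        # economic shock
        value *= 1 + rng.gauss(0, NOISE_SD)                # random fluctuation
        rows.append((f"{year}-{month:02d}", round(value, 2)))
    return rows

# Hypothetical base amounts in KES; seasonal months follow the description.
rent = simulate_series(30_000, peak_months=(6, 7))
food = simulate_series(15_000, peak_months=(12, 1), seed=1)
print(rent[:3])
print(food[:3])
```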

    🔍 Potential Use Cases

    • 📊 Cost of Living Analysis – Compare affordability across different Nairobi areas.
    • 💵 Salary & Real Estate Benchmarking – Businesses can analyze salary expectations by location.
    • 📉 Time-Series Forecasting – Train predictive models (ARIMA, Prophet, LSTM) to estimate future living costs.
    • 📈 Inflation Impact Studies – Measure how economic conditions influence cost variations over time.
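
    Before training the models named in the forecasting use case (ARIMA, Prophet, LSTM), it can help to have a simple baseline to compare against. The sketch below uses a seasonal-naive forecast, which is a different and much simpler technique than those models; the function names and the toy monthly totals are assumptions and are not read from the CSV.

```python
def seasonal_naive_forecast(history, horizon, season_length=12):
    """Forecast each future month with the value from the same month one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]

def mean_absolute_error(actual, forecast):
    """Average absolute difference between actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical monthly totals in KES for two years: a linear trend plus a
# bump in January (index 0) and December (index 11).
totals = [50_000 + 500 * m + (3_000 if m % 12 in (0, 11) else 0) for m in range(24)]
train, test = totals[:12], totals[12:]
forecast = seasonal_naive_forecast(train, horizon=len(test))
print(mean_absolute_error(test, forecast))
```

    A trained model is only worth its complexity if it beats this kind of baseline on held-out months.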

    ⚠️ Limitations

    • Synthetic Data – The dataset is not based on real survey data but follows market trends.
    • No Lifestyle Adjustments – Differences in household size or spending habits are not factored in.
    • Inflation Approximation – While inflation is simulated at 2% annually, actual inflation rates may differ.

    📁 File Format & Access

    • nairobi_cost_of_living_time_series.csv – 60,000 records in CSV format (time-series structured).

    📢 Acknowledgments

    This dataset was generated for research and educational purposes. If you find it useful, consider citing it in your work. 🚀

    📥 Download and Explore the Data Now!

Cite
Samay Ashar (2025). Stock Market Simulation Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/11010423

Stock Market Simulation Dataset

📈 A Realistic Synthetic Dataset for Time-Series Forecasting & Stock Analysis

Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 12, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Samay Ashar
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset provides realistic stock market data generated using Geometric Brownian Motion for price movements and Markov Chains for trend prediction. It is designed for time-series forecasting, financial modeling, and algorithmic trading simulations.
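
As a rough illustration of the two techniques named above, the sketch below simulates closing prices with Geometric Brownian Motion and a trend label sequence with a three-state Markov chain. The drift, volatility, starting price, transition probabilities, and function names are assumptions chosen for the example; the description does not state the author's parameters or how trends and prices are linked, so the two series are generated independently here.

```python
import math
import random

def gbm_path(s0, mu, sigma, days, rng, dt=1 / 252):
    """Simulate prices with the exact GBM update:
    S[t+dt] = S[t] * exp((mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z), Z ~ N(0, 1)."""
    prices = [s0]
    for _ in range(days - 1):
        z = rng.gauss(0.0, 1.0)
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        prices.append(prices[-1] * math.exp(step))
    return prices

# Assumed transition probabilities; each row sums to 1.
TRANSITIONS = {
    "Bullish": {"Bullish": 0.6, "Stable": 0.3, "Bearish": 0.1},
    "Stable": {"Bullish": 0.25, "Stable": 0.5, "Bearish": 0.25},
    "Bearish": {"Bullish": 0.1, "Stable": 0.3, "Bearish": 0.6},
}

def markov_trend(days, rng, start="Stable"):
    """Sample a trend label per day, where each label depends only on the previous one."""
    states = [start]
    for _ in range(days - 1):
        row = TRANSITIONS[states[-1]]
        states.append(rng.choices(list(row), weights=list(row.values()))[0])
    return states

rng = random.Random(42)
prices = gbm_path(s0=100.0, mu=0.08, sigma=0.2, days=1000, rng=rng)
trends = markov_trend(days=1000, rng=rng)
print(prices[:3], trends[:5])
```

Because the GBM update multiplies by an exponential, simulated prices stay strictly positive regardless of the random draws.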

Key Features

  • 1000 days of synthetic stock market data (from January 1, 2022, onwards).
  • Multiple companies from diverse industries (Technology, Finance, Healthcare, Energy, Consumer Goods, Automotive, Aerospace, etc.).
  • Stock price details: Open, High, Low, Close prices.
  • Trading volume and market capitalization.
  • Financial metrics: P/E Ratio, Dividend Yield, Volatility.
  • Sentiment Score: A measure of market sentiment (-1 to 1 scale).
  • Trend Labeling: Bullish, Bearish, or Stable, based on Markov Chain modeling.

Column Name      Description
Date             Trading date
Company          Stock name (e.g., Apple, Tesla, JPMorgan, etc.)
Sector           Industry classification
Open             Opening price of the stock
High             Highest price of the stock for the day
Low              Lowest price of the stock for the day
Close            Closing price of the stock
Volume           Number of shares traded
Market_Cap       Market capitalization (in USD)
PE_Ratio         Price-to-Earnings ratio
Dividend_Yield   Percentage of dividends relative to stock price
Volatility       Measure of stock price fluctuation
Sentiment_Score  Market sentiment (-1 to 1 scale)
Trend            Stock market trend (Bullish, Bearish, or Stable)

Usage Scenarios

🔹 Time-Series Forecasting: Train models like LSTMs, Transformers, or ARIMA for stock price prediction.
🔹 Algorithmic Trading: Develop trading strategies based on trends and sentiment.
🔹 Feature Engineering: Explore correlations between financial metrics and stock movements.
🔹 Quantitative Finance Research: Analyze market trends using simulated yet realistic data.
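
For the feature engineering scenario, one possible starting point is to derive daily returns from the Close column and measure how they correlate with Sentiment_Score. The sketch below works on plain lists with made-up values for a single company; the helper names are hypothetical, and pairing each return with the sentiment of the day it ends on is an assumption.

```python
import math

def daily_returns(closes):
    """Simple day-over-day returns: (close[t] - close[t-1]) / close[t-1]."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical Close and Sentiment_Score values, not taken from the dataset.
closes = [100.0, 102.0, 101.0, 104.0, 103.0]
sentiment = [0.1, 0.4, -0.2, 0.6, -0.1]

returns = daily_returns(closes)
corr = pearson(sentiment[1:], returns)  # align each return with that day's sentiment
print(f"correlation between sentiment and daily return: {corr:.2f}")
```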

PS: If you find this dataset helpful, please consider upvoting :)
