90 datasets found
  1. DataSet_Upload_Practice

    • kaggle.com
    Updated Jun 20, 2020
    Cite
    John, Kim (2020). DataSet_Upload_Practice [Dataset]. https://www.kaggle.com/eaglekeeneye/dataset-upload-practice/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 20, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    John, Kim
    Description

    Dataset

    This dataset was created by John, Kim

    Contents

  2. testing file upload

    • kaggle.com
    Updated Nov 24, 2024
    Cite
    Ahmad Basher (2024). testing file upload [Dataset]. https://www.kaggle.com/datasets/ahmadbasher/testing-file-upload/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 24, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ahmad Basher
    Description

    Dataset

    This dataset was created by Ahmad Basher

    Contents

  3. Testing github actions for upload datasets

    • kaggle.com
    zip
    Updated Oct 12, 2020
    Cite
    Jaime Valero (2020). Testing github actions for upload datasets [Dataset]. https://www.kaggle.com/jaimevalero/my-new-dataset
    Explore at:
    zip (183 bytes)
    Dataset updated
    Oct 12, 2020
    Authors
    Jaime Valero
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Example of a dataset synchronized by GitHub Actions.
    Source: https://github.com/jaimevalero/test-actions and https://github.com/jaimevalero/push-kaggle-dataset

  4. upload

    • kaggle.com
    Updated Mar 30, 2023
    Cite
    Sica Chang (2023). upload [Dataset]. https://www.kaggle.com/datasets/sicachang/upload
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 30, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sica Chang
    Description

    Dataset

    This dataset was created by Sica Chang

    Contents

  5. Weather and electric load dataset

    • data.mendeley.com
    • kaggle.com
    Updated Oct 1, 2024
    + more versions
    Cite
    Manvi Mishra (2024). Weather and electric load dataset [Dataset]. http://doi.org/10.17632/vf4ckw66cy.1
    Explore at:
    Dataset updated
    Oct 1, 2024
    Authors
    Manvi Mishra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset combines hourly load variation with hourly weather parameters. The load data were obtained from a substation in Ahmedabad, and the weather parameters were extracted from the NASA open-source website.

  6. Load Forecasting Dataset

    • kaggle.com
    Updated Jul 24, 2025
    Cite
    Isuranga Nipun Kumara (2025). Load Forecasting Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/12567872
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Isuranga Nipun Kumara
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains historical data used for forecasting the electrical load of the national grid in Sri Lanka. The data spans from January 2020 to May 2025, with 189888 records representing electrical consumption in 15-minute intervals. This dataset is designed for use in predictive modeling, specifically for applying machine learning techniques such as Recurrent Neural Networks (RNNs) for short-term load forecasting.
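
    The stated record count can be sanity-checked against the stated time span. The sketch below assumes the series starts at 2020-01-01 00:00, ends at 2025-05-31 23:45, and has no gaps; the source names only the months, so both endpoints are assumptions.

```python
from datetime import datetime, timedelta

# Assumed endpoints: the description only says "January 2020 to May 2025".
start = datetime(2020, 1, 1, 0, 0)
end = datetime(2025, 5, 31, 23, 45)
step = timedelta(minutes=15)

# Number of 15-minute timestamps in the closed interval [start, end].
expected_records = (end - start) // step + 1
print(expected_records)
```

    Under these assumptions the count is 189888, which matches the figure in the description and is consistent with a gap-free 15-minute series.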

    Features:

    1. Timestamp: Date and time of the observation (15-minute intervals).
    2. Load Demand (kW): The amount of electrical load demand in kilowatts (kW), which is the target variable for forecasting.
    3. Temperature (°C): Average temperature (in Celsius) during the forecast period.
    4. Humidity (%): Average relative humidity during the forecast period.
    5. Wind Speed (m/s): Average wind speed during the forecast period.
    6. Rainfall (mm): Total rainfall (in millimeters) during the forecast period.
    7. Solar Irradiance (W/m²): Solar energy received in watts per square meter, which affects electricity demand.
    8. GDP (USD): The gross domestic product (GDP) in USD, representing economic activity that influences power demand.
    9. Per Capita Energy Use (kWh): Average energy usage per person (in kilowatt-hours).
    10. Electricity Price (LKR/kWh): Price of electricity (in LKR per kWh) during the time period.
    11. Day of Week: The day of the week (0 = Monday, 6 = Sunday).
    12. Hour of Day: The hour of the day (0 = midnight, 23 = 11 PM).
    13. Month: The month of the year (1 = January, 12 = December).
    14. Season: The season of the year (Summer, Winter, Fall).
    15. Public Event: A binary variable indicating if a public event occurred (1 = Yes, 0 = No).
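
    The calendar features (11 to 13) follow the same conventions as Python's `datetime`, so they could be re-derived from the timestamp. A minimal sketch; the helper name `calendar_features` and the dictionary keys are hypothetical, since the source does not give the actual column names.

```python
from datetime import datetime

def calendar_features(ts: datetime) -> dict:
    """Derive the calendar features described above from one timestamp."""
    return {
        "day_of_week": ts.weekday(),  # 0 = Monday, 6 = Sunday
        "hour_of_day": ts.hour,       # 0 = midnight, 23 = 11 PM
        "month": ts.month,            # 1 = January, 12 = December
    }

# 1 January 2020 was a Wednesday.
features = calendar_features(datetime(2020, 1, 1, 23, 45))
```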

    Purpose: The dataset can be used to train and evaluate models that predict electrical load demand for short-term forecasting. This is especially important for energy management, efficient resource allocation, and planning for future power generation in response to demand fluctuations.

    Use Case:

    This dataset is suitable for:

    • Predictive modeling using machine learning algorithms (particularly deep learning techniques like RNNs and LSTMs).
    • Load forecasting and demand-side management.
    • Economic planning related to electricity generation and transmission.
    • Optimization of resource usage in energy management systems.

    Data Collection:

    The data was collected from multiple sources including the Ceylon Electricity Board (CEB) and weather data providers, with a focus on historical load demand and environmental factors that influence power consumption.

    Licensing:

    This dataset is open for educational and research purposes under a Creative Commons license. Please attribute the dataset to the source when using it for analysis or research.

  7. upload fold-0 retrained model

    • kaggle.com
    Updated Jun 2, 2021
    + more versions
    Cite
    Sandiago (2021). upload fold-0 retrained model [Dataset]. https://www.kaggle.com/datasets/sandiago21/upload-fold0-retrained-model/tasks
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 2, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sandiago
    Description

    Dataset

    This dataset was created by Sandiago

    Contents

  8. upload_example1

    • kaggle.com
    Updated Jun 9, 2025
    Cite
    Robin (2025). upload_example1 [Dataset]. https://www.kaggle.com/datasets/robinluy/upload-example1
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 9, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Robin
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Robin

    Released under Apache 2.0

    Contents

  9. Furniture Sales Data

    • kaggle.com
    Updated Aug 26, 2024
    Cite
    RAJ AGRAWAL (2024). Furniture Sales Data [Dataset]. http://doi.org/10.34740/kaggle/dsv/9253879
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    RAJ AGRAWAL
    Description

    This dataset is generated for the purpose of analyzing furniture sales data using multiple regression techniques. It contains 2,500 rows with 15 columns, including 7 numerical columns and 7 categorical columns, along with a target variable (revenue) which represents the total revenue generated from furniture sales. The dataset captures various aspects of furniture sales, such as pricing, cost, sales volume, discount percentage, inventory levels, delivery time, and different categorical attributes like furniture type, material, color, and store location.
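
    As a rough illustration of the multiple-regression use case, the sketch below fits ordinary least squares with NumPy. The data are synthetic, and the three predictors (`price`, `discount`, `volume`) and their coefficients are hypothetical stand-ins rather than values from the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows = 2500  # same row count as the dataset; the values themselves are synthetic

# Hypothetical numerical predictors.
price = rng.uniform(50, 500, n_rows)
discount = rng.uniform(0, 30, n_rows)
volume = rng.uniform(1, 100, n_rows)

# Hypothetical ground truth, used only to generate the toy target.
true_coef = np.array([10.0, 2.0, -5.0, 8.0])  # intercept, price, discount, volume
X = np.column_stack([np.ones(n_rows), price, discount, volume])
revenue = X @ true_coef + rng.normal(0, 1.0, n_rows)

# Ordinary least squares: minimize ||X @ coef - revenue||^2.
coef, *_ = np.linalg.lstsq(X, revenue, rcond=None)
```

    The categorical columns (furniture type, material, color, store location) would need an encoding step such as one-hot encoding before they could enter a design matrix like `X`.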

    Please upload your notebooks for this dataset so that others can learn from your work.

  10. MeDAL Dataset

    • kaggle.com
    • opendatalab.com
    • +1 more
    zip
    Updated Nov 16, 2020
    Cite
    xhlulu (2020). MeDAL Dataset [Dataset]. https://www.kaggle.com/xhlulu/medal-emnlp
    Explore at:
    zip (7324382521 bytes)
    Dataset updated
    Nov 16, 2020
    Authors
    xhlulu
    Description


    Medical Dataset for Abbreviation Disambiguation for Natural Language Understanding (MeDAL) is a large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain. It was published at the ClinicalNLP workshop at EMNLP.

    💻 Code 🤗 Dataset (Hugging Face) 💾 Dataset (Kaggle) 💽 Dataset (Zenodo) 📜 Paper (ACL) 📝 Paper (Arxiv) Pre-trained ELECTRA (Hugging Face)

    Downloading the data

    We recommend downloading from Kaggle if you can authenticate through their API. The advantage to Kaggle is that the data is compressed, so it will be faster to download. Links to the data can be found at the top of the readme.

    First, you will need to create an account on kaggle.com. Afterwards, you will need to install the kaggle API:

    pip install kaggle

    Then, you will need to follow the instructions here to add your username and key. Once that's done, you can run:

    kaggle datasets download xhlulu/medal-emnlp

    Now, unzip everything and place them inside the data directory:

    unzip -nq crawl-300d-2M-subword.zip -d data
    mv data/pretrain_sample/* data/

    Loading FastText Embeddings

    For the LSTM models, we will need to use the fastText embeddings. To do so, first download and extract the weights:

    wget -nc -P data/ https://dl.fbaipublicfiles.com/fasttext/vectors-english/crawl-300d-2M-subword.zip
    unzip -nq data/crawl-300d-2M-subword.zip -d data/

    Model Quickstart

    Using Torch Hub

    You can directly load LSTM and LSTM-SA with torch.hub:

    ```python
    import torch

    lstm = torch.hub.load("BruceWen120/medal", "lstm")
    lstm_sa = torch.hub.load("BruceWen120/medal", "lstm_sa")
    ```

    If you want to use the Electra model, you need to first install transformers:

    pip install transformers

    Then, you can load it with torch.hub:

    import torch
    electra = torch.hub.load("BruceWen120/medal", "electra")

    Using Huggingface transformers

    If you are only interested in the pre-trained ELECTRA weights (without the disambiguation head), you can load it directly from the Hugging Face Repository:

    from transformers import AutoModel, AutoTokenizer
    
    model = AutoModel.from_pretrained("xhlu/electra-medal")
    tokenizer = AutoTokenizer.from_pretrained("xhlu/electra-medal")
    

    Citation

    Download the bibtex here, or copy the text below:

    @inproceedings{wen-etal-2020-medal,
      title = "{M}e{DAL}: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining",
      author = "Wen, Zhi and Lu, Xing Han and Reddy, Siva",
      booktitle = "Proceedings of the 3rd Clinical Natural Language Processing Workshop",
      month = nov,
      year = "2020",
      address = "Online",
      publisher = "Association for Computational Linguistics",
      url = "https://www.aclweb.org/anthology/2020.clinicalnlp-1.15",
      pages = "130--135",
    }

    License, Terms and Conditions

    The ELECTRA model is licensed under Apache 2.0. The licenses for the libraries used in this project (transformers, pytorch, etc.) can be found in their respective GitHub repositories. Our model is released under an MIT license.

    The original dataset was retrieved and modified from the NLM website. By using this dataset, you are bound by the terms and conditions specified by NLM:

    INTRODUCTION

    Downloading data from the National Library of Medicine FTP servers indicates your acceptance of the following Terms and Conditions: No charges, usage fees or royalties are paid to NLM for this data.

    MEDLINE/PUBMED SPECIFIC TERMS

    NLM freely provides PubMed/MEDLINE data. Please note some PubMed/MEDLINE abstracts may be protected by copyright.

    GENERAL TERMS AND CONDITIONS

    • Users of the data agree to:

      • acknowledge NLM as the source of the data by including the phrase "Courtesy of the U.S. National Library of Medicine" in a clear and conspicuous manner,
      • properly use registration and/or trademark symbols when referring to NLM products, and
      • not indicate or imply that NLM has endorsed its products/services/applications.
    • Users who republish or redistribute the data (services, products or raw data) agree to:

      • maintain the most current version of all distributed data, or
      • make known in a clear and conspicuous manner that the products/services/applications do not reflect the most current/accurate data available from NLM.
    • These data are produced with a reasonable standard of care, but NLM makes no warranties express or implied, including no warranty of merchantability or fitness for particular purpose, regarding the accuracy or completeness of the data. Users agree to hold NLM and the U.S. Government harmless from any liability resulting from errors in the data. NLM disclaims any liability for any consequences due to use, misuse, or interpretation of information contained or not contained in the data.

    • NLM does not provide legal advice regarding copyright, fair use, or other aspects of intellectual property rights. See the NLM Copyright page.

    • NLM reserves the right to change the type and format of its machine-readable data. NLM will take reasonable steps to inform users of any changes to the format of the data before the data are distributed via the announcement section or subscription to email and RSS updates.

  11. diabetic data upload

    • kaggle.com
    Updated Jul 9, 2024
    Cite
    Soyabul Islam Lincoln (samlin) (2024). diabetic data upload [Dataset]. https://www.kaggle.com/datasets/senjutylincoln/diabetic-data-upload/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Soyabul Islam Lincoln (samlin)
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Soyabul Islam Lincoln (samlin)

    Released under Apache 2.0

    Contents

  12. efficientnet-car-upload-at-0623-2236

    • kaggle.com
    Updated Jun 23, 2023
    Cite
    jeongjh180 (2023). efficientnet-car-upload-at-0623-2236 [Dataset]. https://www.kaggle.com/datasets/jeongjh180/efficientnet-car-upload-at-0623-2236
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 23, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    jeongjh180
    Description

    Dataset

    This dataset was created by jeongjh180

    Contents

  13. Images used for training, validation, and testing.

    • kaggle.com
    Updated Mar 15, 2024
    Cite
    Chrysthian Chrisley (2024). Images used for training, validation, and testing. [Dataset]. https://www.kaggle.com/datasets/chrysthian/images-used-for-training-validation-and-testing
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 15, 2024
    Dataset provided by
    Kaggle
    Authors
    Chrysthian Chrisley
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Imports:

    # All Imports
    import os
    from matplotlib import pyplot as plt
    import pandas as pd
    from sklearn.preprocessing import LabelEncoder
    import seaborn as sns
    import matplotlib.image as mpimg
    import cv2
    import numpy as np
    import pickle
    
    # TensorFlow and Keras: layers, models, optimizers, and losses
    import tensorflow as tf
    from tensorflow import keras
    from keras import Sequential
    from keras.layers import *
    
    # Optimizer
    from keras.optimizers import Adamax
    
    # PreTrained Model
    from keras.applications import *
    
    #Early Stopping
    from keras.callbacks import EarlyStopping
    import warnings 
    

    Warnings Suppression | Configuration

    # Warnings Remove 
    warnings.filterwarnings("ignore")
    
    # Define the base path for the training folder
    base_path = 'jaguar_cheetah/train'
    
    # Weights file
    weights_file = 'Model_train_weights.weights.h5'
    
    # Path to the saved or to save the model:
    model_file = 'Model-cheetah_jaguar_Treined.keras'
    
    # Model history
    history_path = 'training_history_cheetah_jaguar.pkl'
    
    # Initialize lists to store file paths and labels
    filepaths = []
    labels = []
    
    # Iterate over folders and files within the training directory
    for folder in ['Cheetah', 'Jaguar']:
      folder_path = os.path.join(base_path, folder)
      for filename in os.listdir(folder_path):
        file_path = os.path.join(folder_path, filename)
        filepaths.append(file_path)
        labels.append(folder)
    
    # Create the TRAINING dataframe
    file_path_series = pd.Series(filepaths , name= 'filepath')
    Label_path_series = pd.Series(labels , name = 'label')
    df_train = pd.concat([file_path_series ,Label_path_series ] , axis = 1)
    
    
    # Define the base path for the test folder
    directory = "jaguar_cheetah/test"
    
    filepath =[]
    label = []
    
    folds = os.listdir(directory)
    
    for fold in folds:
      f_path = os.path.join(directory , fold)
      
      imgs = os.listdir(f_path)
      
      for img in imgs:
        
        img_path = os.path.join(f_path , img)
        filepath.append(img_path)
        label.append(fold)
        
    # Create the TEST dataframe
    file_path_series = pd.Series(filepath , name= 'filepath')
    Label_path_series = pd.Series(label , name = 'label')
    df_test = pd.concat([file_path_series ,Label_path_series ] , axis = 1)
    
    # Display the first rows of the dataframe for verification
    #print(df_train)
    
    # Folders with Training and Test files
    data_dir = 'jaguar_cheetah/train'
    test_dir = 'jaguar_cheetah/test'
    
    # Image size 256x256
    IMAGE_SIZE = (256,256) 
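
    The two directory walks above build the same kind of (filepath, label) table for the train and test folders. A sketch of how they could share one helper follows; `collect_labeled_files` is a hypothetical name, and the demo runs on a temporary directory rather than the real `jaguar_cheetah` folders.

```python
import os
import tempfile

def collect_labeled_files(root):
    """Return sorted (filepath, label) pairs, using each subfolder name as the label."""
    pairs = []
    for label in sorted(os.listdir(root)):
        folder = os.path.join(root, label)
        if not os.path.isdir(folder):
            continue  # skip stray files next to the class folders
        for filename in sorted(os.listdir(folder)):
            pairs.append((os.path.join(folder, filename), label))
    return pairs

# Demo on a throwaway directory tree that mimics <root>/<class>/<image>.
with tempfile.TemporaryDirectory() as root:
    for label, name in [("Cheetah", "a.jpg"), ("Jaguar", "b.jpg")]:
        os.makedirs(os.path.join(root, label), exist_ok=True)
        open(os.path.join(root, label, name), "w").close()
    demo_pairs = collect_labeled_files(root)
    demo_labels = [label for _, label in demo_pairs]
```

    The resulting pairs could then be turned into a dataframe with `pd.DataFrame(pairs, columns=['filepath', 'label'])`, giving the same two columns as `df_train` and `df_test`.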
    

    Train | Test

    #print('Training Images:')
    
    # Create the TRAIN dataframe
    train_ds = tf.keras.utils.image_dataset_from_directory(
      data_dir,
      validation_split=0.1,
      subset='training',
      seed=123,
      image_size=IMAGE_SIZE,
      batch_size=32)
    
    # Validation data
    #print('Validation Images:')
    validation_ds = tf.keras.utils.image_dataset_from_directory(
      data_dir, 
      validation_split=0.1,
      subset='validation',
      seed=123,
      image_size=IMAGE_SIZE,
      batch_size=32)
    
    print('Testing Images:')
    test_ds = tf.keras.utils.image_dataset_from_directory(
      test_dir, 
      seed=123,
      image_size=IMAGE_SIZE,
      batch_size=32)
    
    # Extract labels
    train_labels = train_ds.class_names
    test_labels = test_ds.class_names
    validation_labels = validation_ds.class_names
    
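    # Encode labels
    # The class labels must match the folder names reported in class_names
    # ('Cheetah', 'Jaguar'); unseen labels make transform() raise a ValueError.
    class_labels = ['Cheetah', 'Jaguar']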
    
    # Instantiate (encoder) LabelEncoder
    label_encoder = LabelEncoder()
    
    # Fit the label encoder on the class labels
    label_encoder.fit(class_labels)
    
    # Transform the labels for the training dataset
    train_labels_encoded = label_encoder.transform(train_labels)
    
    # Transform the labels for the validation dataset
    validation_labels_encoded = label_encoder.transform(validation_labels)
    
    # Transform the labels for the testing dataset
    test_labels_encoded = label_encoder.transform(test_labels)
    
    # Normalize the pixel values
    
    # Train files 
    train_ds = train_ds.map(lambda x, y: (x / 255.0, y))
    # Validate files
    validation_ds = validation_ds.map(lambda x, y: (x / 255.0, y))
    # Test files
    test_ds = test_ds.map(lambda x, y: (x / 255.0, y))
    
    #TRAINING VISUALIZATION
    #Count the occurrences of each category in the column
    count = df_train['label'].value_counts()
    
    # Create a figure with 2 subplots
    fig, axs = plt.subplots(1, 2, figsize=(12, 6), facecolor='white')
    
    # Plot a pie chart on the first subplot
    palette = sns.color_palette("viridis")
    sns.set_palette(palette)
    axs[0].pie(count, labels=count.index, autopct='%1.1f%%', startangle=140)
    axs[0].set_title('Distribution of Training Categories')
    
    # Plot a bar chart on the second subplot
    sns.barplot(x=count.index, y=count.values, ax=axs[1], palette="viridis")
    axs[1].set_title('Count of Training Categories')
    
    # Adjust the layout
    plt.tight_layout()
    
    # Visualize
    plt.show()
    
    # TEST VISUALIZATION
    count = df_test['label'].value_counts()
    
    # Create a figure with 2 subplots
    fig, axs = plt.subplots(1, 2, figsize=(12, 6), facec...
    
  14. moa_upload_1

    • kaggle.com
    Updated Nov 15, 2020
    Cite
    Felix Klein (2020). moa_upload_1 [Dataset]. https://www.kaggle.com/datasets/felix613/moa-upload-1
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 15, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Felix Klein
    Description

    Dataset

    This dataset was created by Felix Klein

    Contents

  15. Import and Export by India

    • kaggle.com
    Updated Aug 4, 2017
    Cite
    Rajanand Ilangovan (2017). Import and Export by India [Dataset]. https://www.kaggle.com/datasets/rajanand/import-and-export-by-india
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 4, 2017
    Dataset provided by
    Kaggle
    Authors
    Rajanand Ilangovan
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    ### Context

    To better understand the imports and exports by India and how they changed over 3 years.

    ### Content

    Import and export data available by principal commodity and country for 3 years, from Apr'2014 to Mar'2017.

    ### Column Descriptions

    1. pc_code: Integer, Principal Commodity Code
    2. pc: String, Principal Commodity Name
    3. unit: String, measurement of quantity
    4. country_code: Integer, country code
    5. country_name: String, country name
    6. quantity: Integer, quantity of export or import
    7. value: Integer, monetary value of the quantity (in million USD)

    ### Acknowledgements

    [Ministry of Commerce and Industry](http://commerce.gov.in), Govt of India has published [these](https://data.gov.in/catalog/principal-commodity-wise-export) [datasets](https://data.gov.in/catalog/principal-commodity-wise-import) in Open Govt Data Platform India portal under [Govt. Open Data License - India](https://data.gov.in/government-open-data-license-india).

    ### Inspiration

    Some of the questions I would like to be answered are:

    1. Top countries by growth percentage.
    2. Top commodity by quantity or value.
    3. YoY growth of export and import.

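
    The year-over-year question in the Inspiration list reduces to a percentage change between consecutive years. A minimal sketch with made-up yearly totals follows; the numbers and the `yoy_growth` helper are hypothetical and not taken from the dataset.

```python
def yoy_growth(values_by_year):
    """Return {year: percent growth vs. the previous year} for consecutive years."""
    years = sorted(values_by_year)
    growth = {}
    for prev, curr in zip(years, years[1:]):
        base = values_by_year[prev]
        growth[curr] = (values_by_year[curr] - base) / base * 100.0
    return growth

# Hypothetical export totals in million USD, keyed by the starting year of each period.
exports = {2014: 200.0, 2015: 220.0, 2016: 198.0}
growth = yoy_growth(exports)
```

    A base-year value of zero would raise a `ZeroDivisionError` here, so real commodity-by-country rows would need that case handled before ranking by growth.
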
  16. EV charging Load Dataset and Optimal Routing

    • kaggle.com
    Updated Oct 11, 2024
    Cite
    DatasetEngineer (2024). EV charging Load Dataset and Optimal Routing [Dataset]. http://doi.org/10.34740/kaggle/dsv/9604807
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 11, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    DatasetEngineer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The EV Charging Dataset, used in this study, is a publicly available dataset on Kaggle that records real-world electric vehicle (EV) charging behavior and patterns across various locations. The dataset contains 26 key features, each providing valuable insights into the operational and environmental factors that influence EV charging loads. The features include vehicle-specific data, charging station details, and environmental metrics, which collectively contribute to a comprehensive understanding of the factors affecting EV charging demands and route optimization.

    • Vehicle ID: A unique identifier for each electric vehicle in the dataset, used for tracking individual vehicle charging behavior.
    • Battery Capacity (kWh): The total energy storage capacity of the EV battery, typically measured in kilowatt-hours.
    • State of Charge (SOC %): The current charge level of the vehicle's battery as a percentage of its total capacity.
    • Energy Consumption Rate (kWh/km): The rate at which the vehicle consumes energy per kilometer traveled, modeled based on real-world driving conditions.
    • Current and Destination Latitude/Longitude: Geographic coordinates providing the vehicle's current and intended location.
    • Distance to Destination (km): The remaining distance to the vehicle’s destination, which influences the decision-making process for when to charge.
    • Traffic Data: A count of vehicles on the road, providing insight into real-time congestion levels affecting the travel duration and energy consumption.
    • Road Conditions: A categorical feature (Good, Average, Poor) representing the state of the road, which can impact vehicle energy efficiency.
    • Charging Station ID: A unique identifier for each charging station where the vehicle connects for recharging.
    • Charging Rate (kW): The rate at which power is delivered to the vehicle’s battery while charging, influencing the time required to fully charge.
    • Queue Time (mins): The estimated waiting time before charging starts, influenced by the number of vehicles at the station.
    • Station Capacity (EVs): The maximum number of vehicles a charging station can accommodate simultaneously.
    • Time Spent Charging (mins): The duration for which a vehicle is connected to the charging station.
    • Energy Drawn (kWh): The amount of energy transferred to the vehicle's battery during the charging session.
    • Session Start Hour: The hour of the day when the charging session begins, represented as an integer from 0 to 23.
    • Fleet Size: The total number of vehicles in the fleet, which provides insights into overall charging demand.
    • Fleet Schedule: Indicates whether the fleet is on schedule or delayed (0 for on time, 1 for delayed).
    • Temperature (°C), Wind Speed (m/s), and Precipitation (mm): Environmental variables that affect EV performance and energy usage during travel.
    • Weekday: Coded as an integer from 0 to 6, representing the day of the week.
    • Charging Preferences: A binary variable indicating whether a vehicle or user has any specific preferences for charging stations (0 for no preference, 1 for preference).
    • Weather Conditions: The overall weather status (Clear, Cloudy, Rain, Storm), which influences travel and charging behavior.
    • Charging Load (kW): The target label representing the load on the charging station, used for forecasting and demand prediction.

    This dataset is essential for the development of machine learning models aimed at predicting EV charging demand and optimizing charging infrastructure usage. By analyzing the features provided, the dataset enables researchers to investigate patterns in EV charging behavior and explore route optimization strategies in the context of IoT-enabled electric vehicle networks.
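
    Several of these features combine into a simple reachability check: the energy left in the battery is capacity times SOC, and the energy needed is distance times consumption rate. The sketch below is illustrative only; the function name and the sample values are hypothetical, and a constant consumption rate is assumed.

```python
def can_reach_destination(battery_kwh, soc_percent, rate_kwh_per_km, distance_km):
    """Return (reachable, remaining_kwh) under a constant-consumption assumption."""
    available_kwh = battery_kwh * soc_percent / 100.0
    needed_kwh = rate_kwh_per_km * distance_km
    return available_kwh >= needed_kwh, available_kwh - needed_kwh

# Hypothetical vehicle: 60 kWh battery at 50% SOC, 0.2 kWh/km, 100 km to go.
reachable, remaining = can_reach_destination(60.0, 50.0, 0.2, 100.0)
```

    A negative `remaining` value would indicate how much energy a charging stop has to supply before the trip can be completed.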

    Location Dataset Description: The Location Dataset is a synthetic dataset designed for route optimization tasks, especially useful for logistics, fleet management, and EV route planning applications. The dataset consists of 30 key locations, each represented by its geographical coordinates and categorized based on its function (e.g., city, port, warehouse). This dataset allows for the computation of the optimal routes between locations using various optimization algorithms.

    • Location: A unique identifier for each point in the dataset, typically named after a city or functional node (e.g., A, B, C).
    • Type: The type of location, which indicates its role in the network. Types include:

      • City: Represents urban areas where fleet operations typically begin or end.
      • Port: Represents seaports or inland ports where goods are transferred between modes of transport.
      • Warehouse: Represents storage facilities that act as distribution points.
      • Power Plant: Represents energy generation sites, often used in energy logistics planning.
      • Industrial Zone: Represents areas designated for manufacturing and other industrial operations.
      • Mining Site: Represents remote locations where resources are extracted.
    • Latitude: The geographic coor...
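
    Route optimization over latitude/longitude pairs usually starts from a pairwise distance. One common choice is the haversine great-circle distance; this is an assumption here, since the source does not name a distance measure, and the helper name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a conventional approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude along the equator.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

    A matrix of such distances over the 30 locations could then feed whichever routing algorithm is being compared.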

  17. cardd100-kaggle

    • huggingface.co
    Updated Jul 15, 2025
    + more versions
    Victor Ruto (2025). cardd100-kaggle [Dataset]. https://huggingface.co/datasets/vickruto/cardd100-kaggle
    Dataset updated
    Jul 15, 2025
    Authors
    Victor Ruto
    Description

    Dataset Card for Dataset Name

    This is a FiftyOne dataset with 100 samples.

    Installation

    If you haven't already, install FiftyOne:

        pip install -U fiftyone

    Usage

        import fiftyone as fo
        from fiftyone.utils.huggingface import load_from_hub

        # Load the dataset
        # Note: other available arguments include 'max_samples', etc.
        dataset = load_from_hub("vickruto/cardd100-kaggle")

        # Launch the App
        session = fo.launch_app(dataset)

      Dataset Details… See the full description on the dataset page: https://huggingface.co/datasets/vickruto/cardd100-kaggle.
    
  18. RC Beam Capacity Optimization Dataset

    • kaggle.com
    Updated May 23, 2025
    Python Developer (2025). RC Beam Capacity Optimization Dataset [Dataset]. https://www.kaggle.com/datasets/programmer3/rc-beam-capacity-optimization-dataset
    Dataset updated
    May 23, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Python Developer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains 3,234 records of reinforced concrete (RC) beam design parameters and their corresponding load-bearing capacities. The data are based on realistic construction standards and include geometric, material, and reinforcement details such as beam dimensions, concrete grade, reinforcement ratios, and stirrup specifications.

  19. Elastic Load and Carbon Scheduling Dataset

    • kaggle.com
    Updated Jun 9, 2025
    Python Developer (2025). Elastic Load and Carbon Scheduling Dataset [Dataset]. https://www.kaggle.com/datasets/programmer3/elastic-load-and-carbon-scheduling-dataset
    Dataset updated
    Jun 9, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Python Developer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides data for analyzing and optimizing electricity load management in modern power systems. It captures key variables related to elastic load behavior, dynamic pricing, shared energy storage systems (SESS), power-to-gas (P2G) technology, and carbon emissions.

  20. Cognitive Load EEG Dataset for English Reading

    • kaggle.com
    Updated Jul 9, 2025
    Python Developer (2025). Cognitive Load EEG Dataset for English Reading [Dataset]. https://www.kaggle.com/datasets/programmer3/cognitive-load-eeg-dataset-for-english-reading
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Python Developer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains EEG-derived features, biometric indicators, and session metadata collected from 124 college students during English reading comprehension tasks. Each of the 1,310 rows represents a reading segment annotated with a cognitive load level (Low, Medium, or High). Features include power spectral densities across EEG bands (Delta, Theta, Alpha, Beta, Gamma), mental effort scores, signal entropy, and biometric markers such as heart rate variability and pupil dilation. Additional metadata such as student age, gender, and English proficiency level is included to support demographic analysis.
