100+ datasets found
  1. Magic The Gathering dataset

    • kaggle.com
    Updated Oct 22, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Antonio Pelusi (2022). Magic The Gathering dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/4373482
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 22, 2022
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Antonio Pelusi
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A simple dataset to start learning the basics of Data Analytics. You can find a simple script in Python (Pymongo to connect to MongoDB and pandas to print) that use this dataset in the code section.

    Click the following link to get more info about the usage of the Python Script mtgdb.py): MTGDB

  2. Top Rated Movies

    • kaggle.com
    Updated Sep 10, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Marium Masroor (2024). Top Rated Movies [Dataset]. https://www.kaggle.com/datasets/mariumfaheem666/top-rated-movies
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Marium Masroor
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Beginner Freindly:

    It is a beginner friendly data which is easy to understand and best to start any one's journey. Students of machine learning and data analytics can use this to understand basic libraries of python.

    Preprocessing:

    To make it less complex only four basic columns are included which are giving some quick information of top rated movies all over the world. This data set can help to build concepts of preprocessing and handling of data before applying any mathematical models.

    Visualizations are also giving some comprehensive description about popular movies.

    Lets get started. Happy coding!

  3. A deceptively simple dataset for beginners

    • kaggle.com
    Updated Jan 17, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ashish Gupta (2019). A deceptively simple dataset for beginners [Dataset]. https://www.kaggle.com/ashishguptadss/a-deceptively-simple-dataset-for-beginners/kernels
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 17, 2019
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Ashish Gupta
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Ashish Gupta

    Released under Database: Open Database, Contents: Database Contents

    Contents

  4. w

    Dataset of books called So simple a beginning : how four physical principles...

    • workwithdata.com
    Updated Apr 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2025). Dataset of books called So simple a beginning : how four physical principles shape our living world [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=So+simple+a+beginning+%3A+how+four+physical+principles+shape+our+living+world
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is So simple a beginning : how four physical principles shape our living world. It features 7 columns including author, publication date, language, and book publisher.

  5. A

    ‘Deep Learning A-Z - ANN dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Deep Learning A-Z - ANN dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-deep-learning-a-z-ann-dataset-3c75/cb36262b/?iid=013-159&v=presentation
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Deep Learning A-Z - ANN dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/filippoo/deep-learning-az-ann on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This is the dataset used in the section "ANN (Artificial Neural Networks)" of the Udemy course from Kirill Eremenko (Data Scientist & Forex Systems Expert) and Hadelin de Ponteves (Data Scientist), called Deep Learning A-Z™: Hands-On Artificial Neural Networks. The dataset is very useful for beginners of Machine Learning, and a simple playground where to compare several techniques/skills.

    It can be freely downloaded here: https://www.superdatascience.com/deep-learning/

    The story: A bank is investigating a very high rate of customer leaving the bank. Here is a 10.000 records dataset to investigate and predict which of the customers are more likely to leave the bank soon.

    The story of the story: I'd like to compare several techniques (better if not alone, and with the experience of several Kaggle users) to improve my basic knowledge on Machine Learning.

    Content

    I will write more later, but the columns names are very self-explaining.

    Acknowledgements

    Udemy instructors Kirill Eremenko (Data Scientist & Forex Systems Expert) and Hadelin de Ponteves (Data Scientist), and their efforts to provide this dataset to their students.

    Inspiration

    Which methods score best with this dataset? Which are fastest (or, executable in a decent time)? Which are the basic steps with such a simple dataset, very useful to beginners?

    --- Original source retains full ownership of the source dataset ---

  6. h

    simple-datasets

    • huggingface.co
    Updated Mar 23, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    fy Bot(Bot) (2024). simple-datasets [Dataset]. https://huggingface.co/datasets/fssdsdasdas123/simple-datasets
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 23, 2024
    Authors
    fy Bot(Bot)
    Description

    Dataset Card for "input-dataset"

    More Information needed

  7. h

    Simple-dataset

    • huggingface.co
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mark Kornev, Simple-dataset [Dataset]. https://huggingface.co/datasets/Datasetboy/Simple-dataset
    Explore at:
    Authors
    Mark Kornev
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Datasetboy/Simple-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. Z

    Simple Dataset from 1 to 10

    • data.niaid.nih.gov
    Updated Nov 25, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Borja, Fernando (2020). Simple Dataset from 1 to 10 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4287706
    Explore at:
    Dataset updated
    Nov 25, 2020
    Dataset authored and provided by
    Borja, Fernando
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a simple dataset from 1 to 10 updated

    This is more description to the second version

    This is further description

  9. h

    simple-complex-singleturn-dataset

    • huggingface.co
    Updated May 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SoftAge Information Technology Limited (2024). simple-complex-singleturn-dataset [Dataset]. https://huggingface.co/datasets/SoftAge-AI/simple-complex-singleturn-dataset
    Explore at:
    Dataset updated
    May 15, 2024
    Dataset authored and provided by
    SoftAge Information Technology Limited
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Simple/Complex Single-turn Prompts Dataset

      Description
    

    The dataset consists of 600 text-only prompts, each representing a fine-tuned instance of a single-turn user exchange in English. The samples are categorized into 10 distinct classes and cover 19 specific use cases. The dataset has been generated using ethically sourced human-in-the-loop data generation methods involving detailed insights of subject matter experts on labeled data for supervised fine-tuning to map… See the full description on the dataset page: https://huggingface.co/datasets/SoftAge-AI/simple-complex-singleturn-dataset.

  10. Pass/Fail

    • kaggle.com
    Updated Jul 16, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mrinmoy Borah (2023). Pass/Fail [Dataset]. https://www.kaggle.com/datasets/mrinmoyborah/passfail/data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 16, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Mrinmoy Borah
    Description

    subject1: marks scored by student out of 50 subject2: marks scored by student out of 50 passing criteria: sum of subject1 and subject2 must be greater than 70 pass = (subject1+subject2)>70 yes means pass no means fail

  11. Z

    Simple Geometric Shapes

    • data.niaid.nih.gov
    Updated Aug 19, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bos, Patrick (2021). Simple Geometric Shapes [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_5012824
    Explore at:
    Dataset updated
    Aug 19, 2021
    Dataset provided by
    Meijer, Christiaan
    Ranguelova, Elena
    Bos, Patrick
    Oostrum, Leon
    Liu, Yang
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset containing 10000 images of a geometric shape with varying sizes and gray shades and a uniform background. The set contains 5000 images of a circle and 5000 images of a triangle. All images have the same size of 64x64. A label is provided for every image. The images are split into train and test set.

    The dataset is intended as a toy dataset to explore machine learning with or, in our case specifically, to try out explainable AI (XAI) methods with.

    The dataset was created as part of the DIANNA project. See https://github.com/dianna-ai/.

  12. Machine Learning Materials Datasets

    • figshare.com
    txt
    Updated Sep 11, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dane Morgan (2018). Machine Learning Materials Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.7017254.v5
    Explore at:
    txtAvailable download formats
    Dataset updated
    Sep 11, 2018
    Dataset provided by
    Figsharehttp://figshare.com/
    figshare
    Authors
    Dane Morgan
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Three datasets are intended to be used for exploring machine learning applications in materials science. They are formatted in simple form and in particular for easy input into the MAterials Simulation Toolkit - Machine Learning (MAST-ML) package (see https://github.com/uw-cmg/MAST-ML).Each dataset is a materials property of interest and associated descriptors. For detailed information, please see the attached REAME text file.The first dataset for dilute solute diffusion can be used to predict an effective diffusion barrier for a solute element moving through another host element. The dataset has been calculated with DFT methods.The second dataset for perovskite stability gives energies of compostions of potential perovskite materials relative to the convex hull calculated with DFT. The perovskite dataset also includes columns with information about the A site, B site, and X site in the perovskite structure in order to perform more advanced grouping of the data.The third dataset is a metallic glasses dataset which has values of reduced glass transition temperature (Trg) for a variety of metallic alloys. An additional column is included for majority element for each alloy, which can be an interesting property to group on during tests.

  13. u

    PDMX

    • cseweb.ucsd.edu
    json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, PDMX [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    jsonAvailable download formats
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    We introduce PDMX: a Public Domain MusicXML dataset for symbolic music processing, including over 250k musical scores in MusicXML format. PDMX is the largest publicly available, copyright-free MusicXML dataset in existence. PDMX includes genre, tag, description, and popularity metadata for every file.

  14. E

    The Human Know-How Dataset

    • dtechtive.com
    • find.data.gov.scot
    pdf, zip
    Updated Apr 29, 2016
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2016). The Human Know-How Dataset [Dataset]. http://doi.org/10.7488/ds/1394
    Explore at:
    pdf(0.0582 MB), zip(19.67 MB), zip(0.0298 MB), zip(9.433 MB), zip(13.06 MB), zip(0.2837 MB), zip(5.372 MB), zip(69.8 MB), zip(20.43 MB), zip(5.769 MB), zip(14.86 MB), zip(19.78 MB), zip(43.28 MB), zip(62.92 MB), zip(92.88 MB), zip(90.08 MB)Available download formats
    Dataset updated
    Apr 29, 2016
    Description

    The Human Know-How Dataset describes 211,696 human activities from many different domains. These activities are decomposed into 2,609,236 entities (each with an English textual label). These entities represent over two million actions and half a million pre-requisites. Actions are interconnected both according to their dependencies (temporal/logical orders between actions) and decompositions (decomposition of complex actions into simpler ones). This dataset has been integrated with DBpedia (259,568 links). For more information see: - The project website: http://homepages.inf.ed.ac.uk/s1054760/prohow/index.htm - The data is also available on datahub: https://datahub.io/dataset/human-activities-and-instructions ---------------------------------------------------------------- * Quickstart: if you want to experiment with the most high-quality data before downloading all the datasets, download the file '9of11_knowhow_wikihow', and optionally files 'Process - Inputs', 'Process - Outputs', 'Process - Step Links' and 'wikiHow categories hierarchy'. * Data representation based on the PROHOW vocabulary: http://w3id.org/prohow# Data extracted from existing web resources is linked to the original resources using the Open Annotation specification * Data Model: an example of how the data is represented within the datasets is available in the attached Data Model PDF file. The attached example represents a simple set of instructions, but instructions in the dataset can have more complex structures. For example, instructions could have multiple methods, steps could have further sub-steps, and complex requirements could be decomposed into sub-requirements. ---------------------------------------------------------------- Statistics: * 211,696: number of instructions. From wikiHow: 167,232 (datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow). From Snapguide: 44,464 (datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide). * 2,609,236: number of RDF nodes within the instructions From wikiHow: 1,871,468 (datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow). From Snapguide: 737,768 (datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide). * 255,101: number of process inputs linked to 8,453 distinct DBpedia concepts (dataset Process - Inputs) * 4,467: number of process outputs linked to 3,439 distinct DBpedia concepts (dataset Process - Outputs) * 376,795: number of step links between 114,166 different sets of instructions (dataset Process - Step Links)

  15. P

    Simple Shapes Dataset Dataset

    • paperswithcode.com
    Updated Jul 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Benjamin Devillers; Léopold Maytié; Rufin VanRullen (2023). Simple Shapes Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/simple-shapes-dataset
    Explore at:
    Dataset updated
    Jul 4, 2023
    Authors
    Benjamin Devillers; Léopold Maytié; Rufin VanRullen
    Description

    It consists of 32x32 pixel images of shapes with multiple attributes (size, location, rotation, color). Each image is also paired with its ground truth information (attributes), and a natural language description (English) of the image.

    The dataset is composed of: a train set of 500,000 samples, a val and a test set of 1000 samples each.

    It also contains already processed 12-dimensional visual features (from a VAE), and presaved BERT features of the text descriptions.

    Link to dataset: https://zenodo.org/record/8112838

  16. h

    login-screen-dataset-simple

    • huggingface.co
    Updated Jun 25, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    John Smith (2023). login-screen-dataset-simple [Dataset]. https://huggingface.co/datasets/monroex/login-screen-dataset-simple
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 25, 2023
    Authors
    John Smith
    Description

    Dataset Card for "login-screen-dataset-simple"

    More Information needed

  17. h

    tdce-example-simple-dataset

    • huggingface.co
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Theethawat Savastham, tdce-example-simple-dataset [Dataset]. https://huggingface.co/datasets/theethawats98/tdce-example-simple-dataset
    Explore at:
    Authors
    Theethawat Savastham
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Example Dataset For Time-Driven Cost Estimation Learning Model

    This dataset is the inspired-simulated data (the actual data is removed). This data is related to the Time-Driven Activity-Based Costing (TDABC) Principle.

      Simple Dataset
    

    It include the data with low variation and low dimension. It includes 4 files that bring from the manufacturing management system, which can be listed as.

    Process Data (generated_process_data) it contains the manufacturing process data… See the full description on the dataset page: https://huggingface.co/datasets/theethawats98/tdce-example-simple-dataset.

  18. w

    Dataset of books called Whittling the Country Bear & his friends : 12 simple...

    • workwithdata.com
    Updated Apr 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2025). Dataset of books called Whittling the Country Bear & his friends : 12 simple projects for beginners [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Whittling+the+Country+Bear+%26+his+friends+%3A+12+simple+projects+for+beginners
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Whittling the Country Bear & his friends : 12 simple projects for beginners. It features 7 columns including author, publication date, language, and book publisher.

  19. h

    simple-wiki

    • huggingface.co
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Embedding Training Data, simple-wiki [Dataset]. https://huggingface.co/datasets/embedding-data/simple-wiki
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset authored and provided by
    Embedding Training Data
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "simple-wiki"

      Dataset Summary
    

    This dataset contains pairs of equivalent sentences obtained from Wikipedia.

      Supported Tasks
    

    Sentence Transformers training; useful for semantic search and sentence similarity.

      Languages
    

    English.

      Dataset Structure
    

    Each example in the dataset contains pairs of equivalent sentences and is formatted as a dictionary with the key "set" and a list with the sentences as "value". {"set":… See the full description on the dataset page: https://huggingface.co/datasets/embedding-data/simple-wiki.

  20. SELTO Dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated May 23, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sören Dittmer; David Erzmann; Henrik Harms; Rielson Falck; Marco Gosch; Sören Dittmer; David Erzmann; Henrik Harms; Rielson Falck; Marco Gosch (2023). SELTO Dataset [Dataset]. http://doi.org/10.5281/zenodo.7781392
    Explore at:
    application/gzipAvailable download formats
    Dataset updated
    May 23, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Sören Dittmer; David Erzmann; Henrik Harms; Rielson Falck; Marco Gosch; Sören Dittmer; David Erzmann; Henrik Harms; Rielson Falck; Marco Gosch
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A Benchmark Dataset for Deep Learning for 3D Topology Optimization

    This dataset represents voxelized 3D topology optimization problems and solutions. The solutions have been generated in cooperation with the Ariane Group and Synera using the Altair OptiStruct implementation of SIMP within the Synera software. The SELTO dataset consists of four different 3D datasets for topology optimization, called disc simple, disc complex, sphere simple and sphere complex. Each of these datasets is further split into a training and a validation subset.

    The following paper provides full documentation and examples:

    Dittmer, S., Erzmann, D., Harms, H., Maass, P., SELTO: Sample-Efficient Learned Topology Optimization (2022) https://arxiv.org/abs/2209.05098.

    The Python library DL4TO (https://github.com/dl4to/dl4to) can be used to download and access all SELTO dataset subsets.
    Each TAR.GZ file container consists of multiple enumerated pairs of CSV files. Each pair describes a unique topology optimization problem and contains an associated ground truth solution. Each problem-solution pair consists of two files, where one contains voxel-wise information and the other file contains scalar information. For example, the i-th sample is stored in the files i.csv and i_info.csv, where i.csv contains all voxel-wise information and i_info.csv contains all scalar information. We define all spatially varying quantities at the center of the voxels, rather than on the vertices or surfaces. This allows for a shape-consistent tensor representation.

    For the i-th sample, the columns of i_info.csv correspond to the following scalar information:

    • E - Young's modulus [Pa]
    • ν - Poisson's ratio [-]
    • σ_ys - a yield stress [Pa]
    • h - discretization size of the voxel grid [m]

    The columns of i.csv correspond to the following voxel-wise information:

    • x, y, z - the indices that state the location of the voxel within the voxel mesh
    • Ω_design - design space information for each voxel. This is a ternary variable that indicates the type of density constraint on the voxel. 0 and 1 indicate that the density is fixed at 0 or 1, respectively. -1 indicates the absence of constraints, i.e., the density in that voxel can be freely optimized
    • Ω_dirichlet_x, Ω_dirichlet_y, Ω_dirichlet_z - homogeneous Dirichlet boundary conditions for each voxel. These are binary variables that define whether the voxel is subject to homogeneous Dirichlet boundary constraints in the respective dimension
    • F_x, F_y, F_z - floating point variables that define the three spacial components of external forces applied to each voxel. All forces are body forces given in [N/m^3]
    • density - defines the binary voxel-wise density of the ground truth solution to the topology optimization problem

    How to Import the Dataset

    with DL4TO: With the Python library DL4TO (https://github.com/dl4to/dl4to) it is straightforward to download and access the dataset as a customized PyTorch torch.utils.data.Dataset object. As shown in the tutorial this can be done via:

    from dl4to.datasets import SELTODataset
    
    dataset = SELTODataset(root=root, name=name, train=train)

    Here, root is the path where the dataset should be saved. name is the name of the SELTO subset and can be one of "disc_simple", "disc_complex", "sphere_simple" and "sphere_complex". train is a boolean that indicates whether the corresponding training or validation subset should be loaded. See here for further documentation on the SELTODataset class.

    without DL4TO: After downloading and unzipping, any of the i.csv files can be manually imported into Python as a Pandas dataframe object:

    import pandas as pd
    
    root = ...
    file_path = f'{root}/{i}.csv'
    columns = ['x', 'y', 'z', 'Ω_design','Ω_dirichlet_x', 'Ω_dirichlet_y', 'Ω_dirichlet_z', 'F_x', 'F_y', 'F_z', 'density']
    df = pd.read_csv(file_path, names=columns)

    Similarly, we can import a i_info.csv file via:

    file_path = f'{root}/{i}_info.csv'
    info_column_names = ['E', 'ν', 'σ_ys', 'h']
    df_info = pd.read_csv(file_path, names=info_columns)

    We can extract PyTorch tensors from the Pandas dataframe df using the following function:

    import torch
    
    def get_torch_tensors_from_dataframe(df, dtype=torch.float32):
      shape = df[['x', 'y', 'z']].iloc[-1].values.astype(int) + 1
      voxels = [df['x'].values, df['y'].values, df['z'].values]
    
      Ω_design = torch.zeros(1, *shape, dtype=int)
      Ω_design[:, voxels[0], voxels[1], voxels[2]] = torch.from_numpy(data['Ω_design'].values.astype(int))
    
      Ω_Dirichlet = torch.zeros(3, *shape, dtype=dtype)
      Ω_Dirichlet[0, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['Ω_dirichlet_x'].values, dtype=dtype)
      Ω_Dirichlet[1, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['Ω_dirichlet_y'].values, dtype=dtype)
      Ω_Dirichlet[2, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['Ω_dirichlet_z'].values, dtype=dtype)
    
      F = torch.zeros(3, *shape, dtype=dtype)
      F[0, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['F_x'].values, dtype=dtype)
      F[1, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['F_y'].values, dtype=dtype)
      F[2, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['F_z'].values, dtype=dtype)
    
      density = torch.zeros(1, *shape, dtype=dtype)
      density[:, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['density'].values, dtype=dtype)
    
      return Ω_design, Ω_Dirichlet, F, density

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Antonio Pelusi (2022). Magic The Gathering dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/4373482
Organization logo

Magic The Gathering dataset

A simple dataset for beginners with all the standard card from the card game MTG

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 22, 2022
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Antonio Pelusi
License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

A simple dataset to start learning the basics of Data Analytics. You can find a simple script in Python (Pymongo to connect to MongoDB and pandas to print) that use this dataset in the code section.

Click the following link to get more info about the usage of the Python Script mtgdb.py): MTGDB

Search
Clear search
Close search
Google apps
Main menu