30 datasets found

Data from: Email Template Dataset
kaggle.com
Updated Aug 3, 2022
Cite
CS21M1005 (2022). Email Template Dataset [Dataset]. https://www.kaggle.com/datasets/cs21m1005/email-template-dataset/suggestions
Dataset updated
Aug 3, 2022
Dataset provided by
Kaggle: http://kaggle.com/
Authors
CS21M1005
Description
Dataset

This dataset was created by CS21M1005

Contents
summary img template
kaggle.com
Updated Oct 27, 2024
Cite
Trương Quang Sang (2024). summary img template [Dataset]. https://www.kaggle.com/datasets/sangtruong/summary-img-template/code
Dataset updated
Oct 27, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Trương Quang Sang
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by Trương Quang Sang

Released under Apache 2.0

Contents
meme project raw
kaggle.com
zip
Updated Apr 25, 2021
Cite
Zacchaeus (2021). meme project raw [Dataset]. https://www.kaggle.com/zacchaeus/meme-project-raw
Available download formats: zip (99797452 bytes)
Dataset updated
Apr 25, 2021
Authors
Zacchaeus
Description
Dataset

This dataset was created by Zacchaeus

Contents

It contains the following files:
Template Dataset Chatbot
kaggle.com
Updated Jun 3, 2024
Cite
Extraction (2024). Template Dataset Chatbot [Dataset]. https://www.kaggle.com/rizkynindra/template-dataset-chatbot/discussion
Dataset updated
Jun 3, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Extraction
Description
Dataset

This dataset was created by Extraction

Contents
(Blank) - Plant Tracker Template
kaggle.com
Updated Nov 26, 2023
Cite
Chad Mottershead (2023). (Blank) - Plant Tracker Template [Dataset]. https://www.kaggle.com/datasets/chadmottershead/blank-plant-tracker-template
Dataset updated
Nov 26, 2023
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Chad Mottershead
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Chad Mottershead

Released under CC0: Public Domain

Contents
‘WHO national life expectancy ’ analyzed by Analyst-2
analyst-2.ai
Updated Oct 30, 2020
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘WHO national life expectancy ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-who-national-life-expectancy-c4c7/d31e495e/?iid=008-942&v=presentation
Dataset updated
Oct 30, 2020
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘WHO national life expectancy ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mmattson/who-national-life-expectancy on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

I am developing my data science skills in areas outside of my previous work. An interesting problem for me was to identify which factors influence life expectancy on a national level. There is an existing Kaggle data set that explored this, but that information was corrupted. Part of the problem solving process is to step back periodically and ask "does this make sense?" Without reasonable data, it is harder to notice mistakes in my analysis code (as opposed to unusual behavior due to the data itself). I wanted to make a similar data set, but with reliable information.

This is my first time exploring life expectancy, so I had to guess which features might be of interest when making the data set. Some were included for comparison with the other Kaggle data set. A number of potentially interesting features (like air pollution) were left off due to limited year or country coverage. Since the data was collected from more than one server, some features are present more than once, to explore the differences.

Content

A goal of the World Health Organization (WHO) is to ensure that a billion more people are protected from health emergencies and provided with better health and well-being. The WHO provides public data collected from many sources to identify and monitor factors that are important for reaching this goal. This set was made primarily from GHO (Global Health Observatory) and UNESCO (United Nations Educational, Scientific and Cultural Organization) information. It covers the years 2000-2016 for 183 countries in a single CSV file. Missing data is left in place for the user to decide how to handle.
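Because missing values are left in place, a natural first step is to see how much is missing in each column. The following is a minimal sketch using only Python's standard library. The column names and sample rows are hypothetical placeholders, not taken from the actual CSV file, and the helper name `missing_counts` is invented for illustration.

```python
import csv
import io


def missing_counts(stream):
    """Count empty cells per column in a CSV stream that has a header row."""
    reader = csv.DictReader(stream)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in counts:
            # DictReader yields None for cells missing from short rows.
            if (row[name] or "").strip() == "":
                counts[name] += 1
    return counts


# Hypothetical sample with made-up column names and values.
SAMPLE_CSV = """country,year,life_expect,gni_capita
Country A,2000,71.2,
Country A,2001,,1200
Country B,2000,65.0,900
"""

if __name__ == "__main__":
    for column, n_missing in missing_counts(io.StringIO(SAMPLE_CSV)).items():
        print(f"{column}: {n_missing} missing")
```

With the real file, the same function could be called on an opened file object (for example `open(path, newline="")`), after which the user can decide per column whether to drop, impute, or keep the gaps.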

Three notebooks are provided for my cursory analysis, a comparison with the other Kaggle set, and a template for creating this data set.

Inspiration

There is a lot to explore, if the user is interested. The GHO server alone has over 2000 "indicators".
- How are the GHO and UNESCO life expectancies calculated, and what is causing the difference? That could also be asked for Gross National Income (GNI) and mortality features.
- How does the life expectancy after age 60 compare to the life expectancy at birth? Is the relationship with the features in this data set different for those two targets?
- What other indicators on the servers might be interesting to use? Some of the GHO indicators are different studies with different coverage. Can they be combined to make a more useful and robust data feature?
- Unraveling the correlations between the features would take significant work.

--- Original source retains full ownership of the source dataset ---
News_Summary_Dataset
huggingface.co
Updated Oct 15, 2002
Cite
Ayush Sur (2002). News_Summary_Dataset [Dataset]. https://huggingface.co/datasets/SurAyush/News_Summary_Dataset
Dataset updated
Oct 15, 2002
Authors
Ayush Sur
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Dataset Details

Dataset Description

Dataset Origin: [BBC News Summary]
Data Source by: [https://www.kaggle.com/datasets/pariza/bbc-news-summary/data]
Language(s) (NLP): [English]
License: [More Information Needed]

Uses

[Used to summarize a language model like T5, to produce concise and clean summaries to… See the full description on the dataset page: https://huggingface.co/datasets/SurAyush/News_Summary_Dataset.
Github actions ETL template example
kaggle.com
Updated Apr 28, 2025
Cite
Andrés Humberto Chirinos Lizondo (2025). Github actions ETL template example [Dataset]. https://www.kaggle.com/datasets/andreschirinos/github-actions-etl-template/versions/2
Dataset updated
Apr 28, 2025
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Andrés Humberto Chirinos Lizondo
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Github actions ELT template
Translation_words_and_sentences_english_french
huggingface.co
Updated Oct 19, 2023
Cite
Sanchez Pauline (2023). Translation_words_and_sentences_english_french [Dataset]. https://huggingface.co/datasets/PaulineSanchez/Translation_words_and_sentences_english_french
Dataset updated
Oct 19, 2023
Authors
Sanchez Pauline
Area covered
French
Description
Dataset Card for Dataset Name

Dataset Summary

This dataset card aims to be a base template for new datasets. It has been generated using this raw template. This dataset is a clean version (all NaN values removed) of this dataset: https://www.kaggle.com/datasets/devicharith/language-translation-englishfrench . I'm not the person who first posted it on Kaggle.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information… See the full description on the dataset page: https://huggingface.co/datasets/PaulineSanchez/Translation_words_and_sentences_english_french.
sensor-template
kaggle.com
Updated Sep 1, 2020
Cite
Azib Hasan (2020). sensor-template [Dataset]. https://www.kaggle.com/datasets/azibhasan/sensortemplate/versions/1
Dataset updated
Sep 1, 2020
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Azib Hasan
Description
Dataset

This dataset was created by Azib Hasan

Contents
‘Phishing website Detector’ analyzed by Analyst-2
analyst-2.ai
Updated Nov 12, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Phishing website Detector’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-phishing-website-detector-4557/b159e5cf/?iid=255-989&v=presentation
Dataset updated
Nov 12, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Phishing website Detector’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/eswarchandt/phishing-website-detector on 12 November 2021.

--- Dataset description provided by original source is as follows ---

Description

The data set is provided as both a text file and a CSV file. It includes the following resources, which can be used as inputs for model building:

A collection of website URLs for 11000+ websites. Each sample has 30 website parameters and a class label identifying it as a phishing website or not (1 or -1).

The code template containing these code blocks:
a. Import modules (Part 1)
b. Load data function + input/output field descriptions

The data set also serves as an input for project scoping and tries to specify the functional and non-functional requirements for it.

Background of Problem Statement :

You are expected to write the code for a binary classification model (phishing website or not) using Python Scikit-Learn that trains on the data and calculates the accuracy score on the test data. You have to use one or more of the classification algorithms to train a model on the phishing website data set.

Dataset Description:

The “.txt” file has no headers and contains only the column values.

The column headers are listed below; if you are using the '.txt' file, you can add the header manually. If you are using the '.csv' file, the column names are already included.

The header list (column names) is as follows:
[ 'UsingIP', 'LongURL', 'ShortURL', 'Symbol@', 'Redirecting//', 'PrefixSuffix-', 'SubDomains', 'HTTPS', 'DomainRegLen', 'Favicon', 'NonStdPort', 'HTTPSDomainURL', 'RequestURL', 'AnchorURL', 'LinksInScriptTags', 'ServerFormHandler', 'InfoEmail', 'AbnormalURL', 'WebsiteForwarding', 'StatusBarCust', 'DisableRightClick', 'UsingPopupWindow', 'IframeRedirection', 'AgeofDomain', 'DNSRecording', 'WebsiteTraffic', 'PageRank', 'GoogleIndex', 'LinksPointingToPage', 'StatsReport', 'class' ]

Brief Description of the features in data set

● UsingIP (categorical - signed numeric) : { -1,1 }
● LongURL (categorical - signed numeric) : { 1,0,-1 }
● ShortURL (categorical - signed numeric) : { 1,-1 }
● Symbol@ (categorical - signed numeric) : { 1,-1 }
● Redirecting// (categorical - signed numeric) : { -1,1 }
● PrefixSuffix- (categorical - signed numeric) : { -1,1 }
● SubDomains (categorical - signed numeric) : { -1,0,1 }
● HTTPS (categorical - signed numeric) : { -1,1,0 }
● DomainRegLen (categorical - signed numeric) : { -1,1 }
● Favicon (categorical - signed numeric) : { 1,-1 }
● NonStdPort (categorical - signed numeric) : { 1,-1 }
● HTTPSDomainURL (categorical - signed numeric) : { -1,1 }
● RequestURL (categorical - signed numeric) : { 1,-1 }
● AnchorURL (categorical - signed numeric) :
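As a rough illustration of adding the header manually to the headerless '.txt' file, the sketch below pairs each row with the column names from the header list. It assumes the values are comma-separated integers, which the description does not state. The sample row is made up for illustration, and the helper name `read_headerless_rows` is hypothetical.

```python
import csv
import io

# Column names copied from the header list in the description (30 features + 'class').
HEADER = [
    'UsingIP', 'LongURL', 'ShortURL', 'Symbol@', 'Redirecting//',
    'PrefixSuffix-', 'SubDomains', 'HTTPS', 'DomainRegLen', 'Favicon',
    'NonStdPort', 'HTTPSDomainURL', 'RequestURL', 'AnchorURL',
    'LinksInScriptTags', 'ServerFormHandler', 'InfoEmail', 'AbnormalURL',
    'WebsiteForwarding', 'StatusBarCust', 'DisableRightClick',
    'UsingPopupWindow', 'IframeRedirection', 'AgeofDomain', 'DNSRecording',
    'WebsiteTraffic', 'PageRank', 'GoogleIndex', 'LinksPointingToPage',
    'StatsReport', 'class',
]


def read_headerless_rows(stream, header=HEADER):
    """Read headerless comma-separated rows and attach the column names."""
    rows = []
    for values in csv.reader(stream):
        if not values:
            continue  # skip blank lines
        if len(values) != len(header):
            raise ValueError(f"expected {len(header)} values, got {len(values)}")
        rows.append({name: int(value) for name, value in zip(header, values)})
    return rows


# Made-up sample: one row with 30 feature values and a class label of -1.
SAMPLE_TXT = ",".join(["1"] * 30 + ["-1"]) + "\n"

if __name__ == "__main__":
    first = read_headerless_rows(io.StringIO(SAMPLE_TXT))[0]
    print(first["UsingIP"], first["class"])
```

Checking the row length against the header catches files that use a different delimiter or column count before any model is trained on misaligned features.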

--- Original source retains full ownership of the source dataset ---
ImageNet100
huggingface.co
paperswithcode.com
Updated Apr 24, 2024
Cite
ImageNet100 [Dataset]. https://huggingface.co/datasets/ilee0022/ImageNet100
Dataset updated
Apr 24, 2024
Authors
isaacNingLee
Description
Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Dataset Details

This is Huggingface dataset version of https://www.kaggle.com/datasets/ambityga/imagenet100. All credits are given to the original author and please cite the original author.

Acknowledgements

Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy… See the full description on the dataset page: https://huggingface.co/datasets/ilee0022/ImageNet100.
‘Indonesian Abusive and Hate Speech Twitter Text’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 14, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Indonesian Abusive and Hate Speech Twitter Text’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-indonesian-abusive-and-hate-speech-twitter-text-f777/latest
Dataset updated
Feb 14, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Indonesian Abusive and Hate Speech Twitter Text’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ilhamfp31/indonesian-abusive-and-hate-speech-twitter-text on 14 February 2022.

--- Dataset description provided by original source is as follows ---

About

The original author's GitHub repository: https://github.com/okkyibrohim/id-multi-label-hate-speech-and-abusive-language-detection

I uploaded it to Kaggle because I'm using it for my undergraduate project here. All credit goes to the original author.

Preprocessing

The original author preprocessed the data in 5 steps. Here's a kernel I made that tries to replicate those preprocessing steps: https://www.kaggle.com/ilhamfp31/preprocessing-the-indonesian-hate-abusive-text/data

Citation

Cite the original author if you use the data:

Muhammad Okky Ibrohim and Indra Budi. 2019. Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter. In ALW3: 3rd Workshop on Abusive Language Online, 46-57. (Each paper template may use a different citation style. LaTeX users can see citation.bib.)

--- Original source retains full ownership of the source dataset ---
Equiarea Shape from Template Deformations Dataset
kaggle.com
Updated Jan 25, 2025
Cite
David Fuentes Jimenez (2025). Equiarea Shape from Template Deformations Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/10579732
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10579732
Dataset updated
Jan 25, 2025
Dataset provided by
Kaggle: http://kaggle.com/
Authors
David Fuentes Jimenez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Equiareal Shape from Template Deformations Dataset is a comprehensive repository of real RGB-D videos, recorded with a Kinect V2, of the various experiments that appear in the paper Equiareal shape from template. Part of the code associated with each experiment is contained in the corresponding folder.

Developed through a collaborative effort between the University of Alcalá, the University of Clermont-Auvergne and EnCoV, the database is organized into folders, each corresponding to a specific experiment analyzed in the study. Inside each folder, users will find RGB images and MATLAB files with the associated tracking and ground truth.

This resource is intended to support researchers, educators, and students working on 3D deformable reconstruction and related fields, offering a practical tool for experimentation and analysis.
Data from: example-dataset
huggingface.co
Updated Nov 2, 2023
Cite
Carlos A. Catania (Harpo) (2023). example-dataset [Dataset]. https://huggingface.co/datasets/harpomaxx/example-dataset
Dataset updated
Nov 2, 2023
Authors
Carlos A. Catania (Harpo)
License
https://choosealicense.com/licenses/openrail/
Description
Dataset Card for Dataset Name

Dataset Summary

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

[More Information Needed]

Data Splits

[More Information Needed]

Dataset Creation… See the full description on the dataset page: https://huggingface.co/datasets/harpomaxx/example-dataset.
PhishingURLsDataset
huggingface.co
Updated Jan 12, 2024
Cite
Semih Güner (2024). PhishingURLsDataset [Dataset]. https://huggingface.co/datasets/semihGuner2002/PhishingURLsDataset
Dataset updated
Jan 12, 2024
Authors
Semih Güner
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
PhishingURLDataset

This dataset was created for training neural networks on phishing website detection. It has been generated using this raw template.

Dataset Details

This dataset contains phishing websites, which are labeled with "1" and are called "malignant", and benign websites, which are labeled with "0".

Dataset Sources

Kaggle Dataset on Phishing URLs: https://www.kaggle.com/datasets/siddharthkumar25/malicious-and-benign-urls USOM Phishing… See the full description on the dataset page: https://huggingface.co/datasets/semihGuner2002/PhishingURLsDataset.
arxiv-cs
huggingface.co
Updated Apr 21, 2007
Cite
Rafael Arias Calles (2007). arxiv-cs [Dataset]. https://huggingface.co/datasets/rjac/arxiv-cs
Dataset updated
Apr 21, 2007
Authors
Rafael Arias Calles
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Dataset Details

last_update: 2024-03-19

Dataset Description

Dataset Sources [optional]

Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Kaggle arXiv data filtered by Computer Science

Uses

Direct Use

[More… See the full description on the dataset page: https://huggingface.co/datasets/rjac/arxiv-cs.
toy_lr
kaggle.com
Updated Oct 9, 2022
Cite
David Dragon (2022). toy_lr [Dataset]. https://www.kaggle.com/datasets/daviddragon/toy-lr
Dataset updated
Oct 9, 2022
Dataset provided by
Kaggle: http://kaggle.com/
Authors
David Dragon
Description
A toy dataset for running linear regression! The dataset consists of inputs and targets. Inputs are of shape (1000, 10), where there are 1000 examples and 10 input features. Targets are of shape (1000,), one target per example. Submit learned weights and biases at https://forms.gle/R4gRgrSYcMTPXZUy9 to get a score! Template notebook to get started: https://www.kaggle.com/code/daviddragon/toy-lr-template/notebook
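To make the setup concrete, here is a minimal sketch of fitting weights and a bias for data with the shapes described above: 1000 examples, 10 input features, one target per example. It does not load the actual dataset. The data is synthetic, the function names are hypothetical, and plain batch gradient descent is only one of several ways to fit a linear regression. The code uses the standard library alone so that it runs without extra dependencies.

```python
import random


def make_toy_data(n_examples=1000, n_features=10, seed=0):
    """Create synthetic inputs (n_examples x n_features) and one target per example."""
    rng = random.Random(seed)
    true_w = [rng.uniform(-1, 1) for _ in range(n_features)]
    true_b = rng.uniform(-1, 1)
    inputs = [[rng.gauss(0, 1) for _ in range(n_features)]
              for _ in range(n_examples)]
    targets = [sum(w * x for w, x in zip(true_w, row)) + true_b + rng.gauss(0, 0.01)
               for row in inputs]
    return inputs, targets, true_w, true_b


def predict(row, w, b):
    """Linear model: dot(w, row) + b."""
    return sum(wj * xj for wj, xj in zip(w, row)) + b


def fit_linear_regression(inputs, targets, lr=0.1, epochs=100):
    """Fit weights and a bias by batch gradient descent on mean squared error."""
    n, d = len(inputs), len(inputs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, y in zip(inputs, targets):
            err = predict(row, w, b) - y
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        # Gradient of the mean squared error is (2 / n) * sum(err * x).
        w = [wj - lr * 2 * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * 2 * grad_b / n
    return w, b


def mean_squared_error(inputs, targets, w, b):
    """Average squared difference between predictions and targets."""
    return sum((predict(row, w, b) - y) ** 2
               for row, y in zip(inputs, targets)) / len(inputs)


if __name__ == "__main__":
    X, y, _, _ = make_toy_data()
    weights, bias = fit_linear_regression(X, y)
    print("MSE:", mean_squared_error(X, y, weights, bias))
```

The learned `weights` (10 values) and `bias` (one value) are the kind of quantities the description asks to be submitted for scoring.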
prompt_template
kaggle.com
Updated Dec 1, 2024
Cite
Odunayo Ogundepo (2024). prompt_template [Dataset]. https://www.kaggle.com/datasets/odunayoogundepo6386/prompt-template/discussion
Dataset updated
Dec 1, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Odunayo Ogundepo
Description
Dataset

This dataset was created by Odunayo Ogundepo

Contents
multiclass-sentiment-analysis-dataset
huggingface.co
Updated Jul 14, 2023
Cite
Shahriar Parvez (2023). multiclass-sentiment-analysis-dataset [Dataset]. https://huggingface.co/datasets/Sp1786/multiclass-sentiment-analysis-dataset
Dataset updated
Jul 14, 2023
Authors
Shahriar Parvez
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for Dataset Name

Dataset Summary

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

[More Information Needed]

Data Splits

[More Information Needed]

Dataset Creation… See the full description on the dataset page: https://huggingface.co/datasets/Sp1786/multiclass-sentiment-analysis-dataset.
