2 datasets found

hate_speech_dataset
huggingface.co
Updated Jul 27, 2024
Cite
Christina Christodoulou (2024). hate_speech_dataset [Dataset]. https://huggingface.co/datasets/christinacdl/hate_speech_dataset
Dataset updated
Jul 27, 2024
Authors
Christina Christodoulou
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
32,579 texts in total: 14,012 NOT hateful texts and 18,567 HATEFUL texts. All duplicate values were removed.
Split using sklearn into 80% train and 20% temporary test (stratified by label). The temporary test set was then split in half (0.50) into test and validation sets (stratified by label), giving an 80/10/10 split.
Train set label distribution: 0 ==> 11,210, 1 ==> 14,853, 26,063 in total
Validation set label distribution: 0 ==> 1,401, 1 ==> 1,857, 3,258 in total
Test set label distribution: 0 ==> 1,401, 1 ==> 1,857, 3,258 in…
See the full description on the dataset page: https://huggingface.co/datasets/christinacdl/hate_speech_dataset.
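The two-stage stratified split described above can be sketched roughly as follows. This is not the dataset author's script: the placeholder texts, the variable names, and `random_state=42` are assumptions made for illustration. Only the class counts and the 80/20 then 50/50 proportions come from the description.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Placeholder data (assumption): only the class counts come from the description.
labels = [0] * 14012 + [1] * 18567
texts = [f"text_{i}" for i in range(len(labels))]

# Stage 1: 80% train, 20% temporary test, stratified on the label.
X_train, X_temp, y_train, y_temp = train_test_split(
    texts, labels, test_size=0.20, stratify=labels, random_state=42
)

# Stage 2: split the temporary set in half into validation and test,
# again stratified on the label, which gives 80/10/10 overall.
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.50, stratify=y_temp, random_state=42
)

for name, y in [("train", y_train), ("validation", y_val), ("test", y_test)]:
    counts = Counter(y)
    print(f"{name}: 0 ==> {counts[0]}, 1 ==> {counts[1]}, {len(y)} in total")
```

Stratifying both stages keeps the ratio of label 0 to label 1 close to that of the full dataset in every split, which is what the per-split label distributions in the description reflect.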
Classifier Model
kaggle.com
Updated Feb 4, 2025
Cite
Jeriann L Rhymer (2025). Classifier Model [Dataset]. https://www.kaggle.com/datasets/jeriannlrhymer/regression-model
Dataset updated
Feb 4, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jeriann L Rhymer
License
https://cdla.io/sharing-1-0/
Description
The purpose of this data is linear regression.

It is intended for practicing the following steps: handling categorical features in a scikit-learn model, carrying out a train/test split, training a model, and evaluating that model on the testing data.

The mpg data set represents the fuel economy (in miles per gallon) for 38 popular models of car, measured between 1999 and 2008.

Factor        Type                   Description
manufacturer  multi-valued discrete  Vehicle manufacturer
model         multi-valued discrete  Model of the vehicle
displ         continuous             Size of engine [litres]
year          multi-valued discrete  Year of vehicle manufacture
cyl           multi-valued discrete  Number of ignition cylinders
trans         multi-valued discrete  Transmission type (manual or automatic)
drv           multi-valued discrete  Driven wheels (f=front, 4=4-wheel, r=rear wheel drive)
city          continuous             Miles per gallon, city driving conditions (fuel economy)
hwy           continuous             Miles per gallon, highway driving conditions (fuel economy)
fl            multi-valued discrete  Vehicle type
class         multi-valued discrete  Vehicle class (suv, compact, etc)
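The workflow this dataset is meant for (encode a categorical feature, split, train, evaluate) might look like the sketch below. The rows are synthetic stand-ins generated in the code, not the real mpg values, and the coefficients used to generate them are arbitrary assumptions. Only the column names `displ`, `drv`, and `hwy` are taken from the factor table.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (assumption): not the real mpg measurements.
rng = np.random.default_rng(0)
n = 200
displ = rng.uniform(1.6, 7.0, size=n)          # engine size in litres
drv = rng.choice(["f", "4", "r"], size=n)      # driven wheels (categorical)
drv_effect = {"f": 3.0, "4": -2.0, "r": 0.0}   # arbitrary illustrative offsets
hwy = (
    40.0
    - 3.5 * displ
    + np.array([drv_effect[d] for d in drv])
    + rng.normal(0.0, 1.0, size=n)
)

# Train/test split on row indices so every array is split consistently.
idx_train, idx_test = train_test_split(np.arange(n), test_size=0.2, random_state=0)

# One-hot encode the categorical feature, fitting the encoder on training rows only.
encoder = OneHotEncoder(handle_unknown="ignore")
cat_train = encoder.fit_transform(drv[idx_train].reshape(-1, 1)).toarray()
cat_test = encoder.transform(drv[idx_test].reshape(-1, 1)).toarray()

X_train = np.hstack([displ[idx_train].reshape(-1, 1), cat_train])
X_test = np.hstack([displ[idx_test].reshape(-1, 1), cat_test])

# Train a linear regression model and evaluate it on the held-out rows.
model = LinearRegression().fit(X_train, hwy[idx_train])
pred = model.predict(X_test)
r2 = r2_score(hwy[idx_test], pred)
mse = mean_squared_error(hwy[idx_test], pred)
print(f"R^2 = {r2:.3f}, MSE = {mse:.3f}")
```

Fitting the encoder on the training rows only, and passing `handle_unknown="ignore"`, keeps the test rows out of preprocessing and avoids an error if a category appears only in the test split.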
3 scholarly articles cite hate_speech_dataset.