6 datasets found

IMDB Movie Reviews (Binary Sentiment)
opendatabay.com
.csv
Updated Jun 18, 2025
Cite
Datasimple (2025). IMDB Movie Reviews (Binary Sentiment) [Dataset]. https://www.opendatabay.com/data/ai-ml/c48f7110-3d06-45be-9cae-aa8799720eec
Explore at:
Available download formats: .csv
Dataset updated
Jun 18, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Entertainment & Media Consumption
Description
Source Huggingface Hub: link

About this dataset: This is a large dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. It provides 25,000 highly polar movie reviews for training and 25,000 for testing. Additional unlabeled data is also available. The data fields are consistent across all splits of the dataset.

How to use the dataset: To use this dataset, first download the IMDB Large Movie Review Dataset. You can then either use it in its original form or split it into training and testing sets. To split it, create a new file called unsupervised.csv and copy the text column from train.csv into it, then split unsupervised.csv into two files: train_unsupervised.csv and test_unsupervised.csv.

With either the original dataset or the training and testing sets in hand, you can begin binary sentiment classification. This requires a machine learning algorithm capable of binary classification, such as logistic regression or a support vector machine. After training your model on the training set, evaluate its performance on the test set by predicting the labels of the reviews in test_unsupervised.csv.
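As a rough illustration of the classification step described above, here is a minimal sketch of logistic regression over bag-of-words counts, written in plain Python. The four toy reviews, the whitespace tokenizer, and the function names (featurize, predict_proba, train) are assumptions made for illustration; they are not part of the dataset or its documentation, and a library implementation would normally be used on the real 25,000-review splits.

```python
import math
from collections import Counter

# Hypothetical toy reviews standing in for rows of the real dataset
# (label 1 = positive, label 0 = negative).
TRAIN = [
    ("a wonderful moving film", 1),
    ("great acting and a great story", 1),
    ("boring plot and terrible acting", 0),
    ("a dull terrible waste of time", 0),
]

def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized review."""
    return Counter(text.lower().split())

def predict_proba(weights, bias, features):
    """Probability of the positive class: sigmoid of the weighted word counts."""
    z = bias + sum(weights.get(word, 0.0) * count for word, count in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            features = featurize(text)
            error = predict_proba(weights, bias, features) - label
            for word, count in features.items():
                weights[word] = weights.get(word, 0.0) - lr * error * count
            bias -= lr * error
    return weights, bias

weights, bias = train(TRAIN)
for review in ("a great wonderful story", "terrible and boring"):
    p = predict_proba(weights, bias, featurize(review))
    print(review, "->", "positive" if p >= 0.5 else "negative")
```

The same structure applies to the real data: featurize each review, fit on the training split, and compare predicted labels with the true labels of the held-out split.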

Research Ideas: This dataset can be used to train a binary sentiment classification model, to train a model that classifies movie reviews into positive and negative sentiment categories, or to build a large movie review database for research purposes.

License

CC0

Original Data Source: IMDB Movie Reviews (Binary Sentiment)
imdb_reviews
tensorflow.org
Updated Sep 20, 2024
Cite
(2024). imdb_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/imdb_reviews
Dataset updated
Sep 20, 2024
Description
Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('imdb_reviews', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.
OpenSubtitles Dataset
paperswithcode.com
Updated Jul 10, 2022
Cite
Pierre Lison; Jörg Tiedemann (2022). OpenSubtitles Dataset [Dataset]. https://paperswithcode.com/dataset/opensubtitles
Dataset updated
Jul 10, 2022
Authors
Pierre Lison; Jörg Tiedemann
Description
OpenSubtitles is a collection of multilingual parallel corpora. The dataset is compiled from a large database of movie and TV subtitles and includes a total of 1689 bitexts spanning 2.6 billion sentences across 60 languages.
Indian Movie Faces Dataset(IMFDB) Face Recognition
kaggle.com
Updated Oct 22, 2021
Cite
ANIRUDH SIMHACHALAM (2021). Indian Movie Faces Dataset(IMFDB) Face Recognition [Dataset]. https://www.kaggle.com/anirudhsimhachalam/indian-movie-faces-datasetimfdb-face-recognition/tasks
Dataset updated
Oct 22, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ANIRUDH SIMHACHALAM
Description
Details about IMFDB: The Indian Movie Face Database (IMFDB) is a large unconstrained face database consisting of 34512 images of 100 Indian actors collected from more than 100 videos. All the images are manually selected and cropped from the video frames, resulting in a high degree of variability in terms of scale, pose, expression, illumination, age, resolution, occlusion, and makeup. IMFDB is the first face database that provides a detailed annotation of every image in terms of age, pose, gender, expression and type of occlusion, which may help other face-related applications.

This dataset has been modified so that it is ready for training a face recognition model. For the dataset with the annotations mentioned above, download it from the official site: https://cvit.iiit.ac.in/projects/IMFDB/

Acknowledgements: https://cvit.iiit.ac.in/projects/IMFDB/ Shankar Setty, Moula Husain, Parisa Beham, Jyothi Gudavalli, Menaka Kandasamy, Radhesyam Vaddi, Vidyagouri Hemadri, J C Karure, Raja Raju, Rajan, Vijay Kumar and C V Jawahar. "Indian Movie Face Database: A Benchmark for Face Recognition Under Wide Variations" National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013.

Project Area

opendata-historicengland.hub.arcgis.com

Updated Sep 14, 2023

Cite

Historic England (2023). Project Area [Dataset]. https://opendata-historicengland.hub.arcgis.com/maps/historicengland::project-area-1

Dataset updated

Sep 14, 2023

Dataset provided by

Historic Buildings And Monuments Commission For England (https://historicengland.org.uk/)

Authors

Historic England

Description

Various data recorded by Historic England relating to aerial investigation and mapping projects. N.B. This is a dynamic dataset that is constantly evolving, not only with the addition of newly completed projects, but also with the reassessment of some earlier projects. See https://historicengland.org.uk/research/methods/airborne-remote-sensing/aerial-investigation/ for further details of Historic England's work with aerial sources.

It's currently not possible to provide download access to the earlier hand drawn projects, which are only available as raster files, but these can be viewed via the Aerial Archaeology Mapping Explorer. We aim to create vector monument polygons for these features as the next phase of the project.

More information and help with these layers

Detailed Mapping

This layer shows the detailed mapping of archaeological features derived from aerial imagery; this includes photographic imagery from many decades taken specifically for archaeological purposes, as well as other photography taken for other reasons and airborne lidar. The data are symbolised initially based on their physical form i.e. cut/negative (e.g. pit, ditch etc) or built/positive (e.g. mound, bank etc).

Field name | Field alias | Description | Mandatory Y/N
LAYER | LAYER | The layer used for mapping | Y
PROJECT | PROJECT | Project name | Y
PERIOD | PERIOD | The presumed date/period assigned to the feature (terminology from FISH thesaurus) | Y
MONUMENT_TYPE | MONUMENT_TYPE | The presumed type/function assigned to the feature (terminology from FISH thesaurus) | Y
EVIDENCE_1 | EVIDENCE_1 | The primary evidence for the feature e.g. cropmark, earthwork etc (terminology from FISH thesaurus) | Y
SOURCE_1 | SOURCE_1 | The primary source for the feature e.g. aerial photo reference, documentary source etc | Y
EVIDENCE_2 | EVIDENCE_2 | Where available, the latest evidence for the feature e.g. cropmark, earthwork etc (terminology from FISH thesaurus). N.B. This was the latest evidence seen and does not necessarily represent the current status of the feature. | N
SOURCE_2 | SOURCE_2 | Where available, the latest source for the feature. N.B. This was the latest evidence seen and does not necessarily represent the current status of the feature. | N
HE_UID | HE_UID | Composite of unique identifier(s) used by Historic England | Y
HER_NO | HER_NO | Composite of unique identifier(s) used by Historic Environment Records | N
DHEUID_1 | DHEUID_1 | Primary unique identifier used by Historic England | Y
DHEUID_2 | DHEUID_2 | Secondary unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N
DHEUID_3 ~ 5 | DHEUID_3 ~ 5 | Additional unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N
HE_URL1 | HE_URL1 | URL link to the relevant Historic England record in Heritage Gateway | Y
HE_URL2 | HE_URL2 | URL link to the relevant Historic England record in Heritage Gateway | N
HE_URL3 ~ 5 | HE_URL3 ~ 5 | URL link to the relevant Historic England record in Heritage Gateway | N
DHERNO_1 | DHERNO_1 | Primary unique identifier used by the relevant Historic Environment Record (HER) | Y
DHERNO_2 | DHERNO_2 | Secondary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N
DHERNO_3 ~ 5 | DHERNO_3 ~ 5 | Tertiary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N
DHERPREF_1 | DHERPREF_1 | Primary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID | Y
DHERPREF_2 | DHERPREF_2 | Secondary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N
DHERPREF_3 ~ 5 | DHERPREF_3 ~ 5 | Additional alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N
HER_LINK_1 | HER_LINK_1 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | Y
HER_LINK_2 | HER_LINK_2 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N
HER_LINK_3 ~ 5 | HER_LINK_3 ~ 5 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N
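The Mandatory Y/N column of the Detailed Mapping field list above lends itself to a simple completeness check. The following is a minimal sketch, assuming a feature's attributes are available as a Python dictionary keyed by field name; the function name missing_mandatory and the sample record are hypothetical, and the placeholder values are not real Historic England data.

```python
# Fields marked "Y" (mandatory) in the Detailed Mapping field list.
MANDATORY_FIELDS = (
    "LAYER", "PROJECT", "PERIOD", "MONUMENT_TYPE", "EVIDENCE_1", "SOURCE_1",
    "HE_UID", "DHEUID_1", "HE_URL1", "DHERNO_1", "DHERPREF_1", "HER_LINK_1",
)

def missing_mandatory(record):
    """Return the mandatory field names that are absent or empty in a record."""
    return [name for name in MANDATORY_FIELDS
            if record.get(name) in (None, "")]

# Hypothetical attribute record; the values are placeholders for illustration.
record = {
    "LAYER": "Ditch",
    "PROJECT": "Example Project",
    "PERIOD": "",
    "MONUMENT_TYPE": "ENCLOSURE",
}

print(missing_mandatory(record))
```

A record counts as complete under this sketch only when the returned list is empty; optional ("N") fields are ignored.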

The data are symbolised initially based on their physical form i.e. cut/negative (e.g. pit, ditch etc) or built/positive (e.g. mound, bank etc).

Layer name | Colour (Hex) | Description
Bank | #A50026 | Used to outline banks, platforms, mounds and spoil heaps.
Ditch | #313695 | Used to outline cut features such as ditches, ponds, pits or hollow ways.
Extent of Feature | #FDAE61 (Dashes) | Used to depict the extent of large area features such as airfields, military camps, or major extraction.
Ridge and Furrow Alignment | #74ADD1 | Line or arrow(s) (hand drawn not a symbol) depicting the direction of the rigs in a block of ridge and furrow.
Ridge and Furrow Area | #74ADD1 (Dots) | Used to outline a block of ridge and furrow.
Slope | #4575B4 | The top of the “T” indicates the top of slope and the body indicates the length and direction of the slope. Used to depict scarps, edges of platforms and other large earthworks.
Structure | #F46D43 | Used to outline structures including stone, concrete, metal and timber constructions e.g., buildings, Nissen huts, tents, radio masts, camouflaged airfields, wrecks, fish traps, etc.
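When recreating this symbology in software that expects numeric colour components, the hex codes listed above can be converted to RGB values. The sketch below is only an illustration: the dictionary LAYER_COLOURS and the helper hex_to_rgb are hypothetical names, and the line styles (dashes, dots) noted in the list are not represented.

```python
# Layer names and hex colours as listed in the symbology description.
LAYER_COLOURS = {
    "Bank": "#A50026",
    "Ditch": "#313695",
    "Extent of Feature": "#FDAE61",
    "Ridge and Furrow Alignment": "#74ADD1",
    "Ridge and Furrow Area": "#74ADD1",
    "Slope": "#4575B4",
    "Structure": "#F46D43",
}

def hex_to_rgb(hex_colour):
    """Convert '#RRGGBB' to an (R, G, B) tuple of integers in the range 0..255."""
    value = hex_colour.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))

for layer, colour in LAYER_COLOURS.items():
    print(f"{layer}: {colour} -> {hex_to_rgb(colour)}")
```

For example, the Bank colour #A50026 corresponds to red 165, green 0, blue 38.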

You can find instructions on how to create a QGIS style file (.qml) to recreate our mapping symbology in QGIS via our Open Data Downloads page under Aerial Investigation Mapping data.

Monument Extents

This layer shows the general extent of the monuments, created from multiple sources, primarily aerial imagery, but referring to other sources such as earthwork surveys, documentary evidence and any information available from the relevant Historic Environment Record etc. This differs from the 'Detailed Mapping' layer, which shows the individual features as they appear on the ground.

Field name | Field alias | Description | Mandatory Y/N
LAYER | LAYER | The layer used for mapping | Y
HE_UID | HE_UID | Composite of unique identifier(s) used by Historic England | Y
HER_NO | HER_NO | Composite of unique identifier(s) used by Historic Environment Records | N
HE_UID1 | HE_UID1 | Primary unique identifier used by Historic England | Y
HE_UID2 | HE_UID2 | Secondary unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N
HE_UID3 ~ 5 | HE-UID3 ~ 5 | Additional unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N
HE_URL1 | HE_URL1 | URL link to the relevant Historic England record in Heritage Gateway | Y
HE_URL2 | HE_URL2 | URL link to the relevant Historic England record in Heritage Gateway | N
HE_URL3 ~ 5 | HE_URL3 ~ 5 | URL link to the relevant Historic England record in Heritage Gateway | N
HERNO_1 | HERNO_1 | Primary unique identifier used by the relevant Historic Environment Record (HER) | Y
HERNO_2 | HERNO_2 | Secondary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N
HERNO_3 ~ 25 | HERNO_3 ~ 25 | Tertiary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N
HERPREF_1 | HERPREF_1 | Primary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID | Y
HERPREF_2 | HERPREF_2 | Secondary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N
HERPREF_3 ~ 25 | HERPREF_3 ~ 25 | Additional alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N
HER_LINK_1 | HER_LINK_1 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | Y
HER_LINK_2 | HER_LINK_2 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N
HER_LINK_3 ~ 25 | HER_LINK_3 ~ 25 | URL link to the relevant Historic England Record (HER) record in Heritage Gateway | N
PROJECT | project | Project name | Y
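Several entries in the field lists above use a shorthand such as "HERNO_3 ~ 25" or "HE_URL3 ~ 5". The sketch below expands that shorthand into individual field names, on the assumption (not stated in the source) that "~" denotes an inclusive range of numbered fields sharing one prefix. The helper name expand_field is hypothetical.

```python
import re

def expand_field(shorthand):
    """Expand 'HERNO_3 ~ 25' into ['HERNO_3', ..., 'HERNO_25'].

    Names without the '~' range shorthand are returned unchanged in a
    one-element list. Treating '~' as an inclusive range is an assumption.
    """
    match = re.fullmatch(r"(.*?)(\d+)\s*~\s*(\d+)", shorthand)
    if match is None:
        return [shorthand]
    prefix = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    return [f"{prefix}{n}" for n in range(start, end + 1)]

print(expand_field("HE_URL3 ~ 5"))
print(len(expand_field("HERNO_3 ~ 25")))
print(expand_field("LAYER"))
```

Under this reading, the Monument Extents layer would carry up to 25 numbered HERNO, HERPREF and HER_LINK fields, compared with up to 5 of each equivalent field in the Detailed Mapping layer.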

Project Area

This layer shows the extent of the

IMDb Movie Reviews Dataset
ieee-dataport.org
Updated Aug 2, 2022
Cite
Aditya Pal (2022). IMDb Movie Reviews Dataset [Dataset]. https://ieee-dataport.org/open-access/imdb-movie-reviews-dataset
Dataset updated
Aug 2, 2022
Authors
Aditya Pal
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically