6 datasets found
  1. IMDB Movie Reviews (Binary Sentiment)

    • opendatabay.com
    .csv
    Updated Jun 18, 2025
    Cite
    Datasimple (2025). IMDB Movie Reviews (Binary Sentiment) [Dataset]. https://www.opendatabay.com/data/ai-ml/c48f7110-3d06-45be-9cae-aa8799720eec
    Available download formats: .csv
    Dataset updated
    Jun 18, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Entertainment & Media Consumption
    Description

    Source Huggingface Hub: link

    About this dataset: This is a large dataset for binary sentiment classification that contains substantially more data than previous benchmark datasets. It provides 25,000 highly polar movie reviews for training and 25,000 for testing, plus additional unlabeled data. The data fields are consistent across all splits.

    How to use the dataset: First download the IMDB Large Movie Review Dataset. You can then use it in its original form or split it into training and testing sets. To split the dataset, create a new file called unsupervised.csv and copy the text column from train.csv into it, then split unsupervised.csv into two files: train_unsupervised.csv and test_unsupervised.csv.

    With either the original dataset or the training and testing sets in hand, train a model capable of binary classification, such as logistic regression or a support vector machine, on the training set. Then evaluate its performance on the test set by predicting the labels of the reviews in test_unsupervised.csv.

    Research ideas: This dataset can be used to train a model that classifies movie reviews as positive or negative, or to build a large movie review database for research purposes.
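
The train-then-evaluate workflow described above can be sketched in plain Python. This is a minimal illustration, not the dataset provider's method: it assumes each row is a (text, label) pair with 1 for positive and 0 for negative, uses a handful of invented toy reviews in place of the real files, and swaps in a small hand-written bag-of-words logistic regression so that it runs without third-party libraries. The names `featurize`, `train`, and `accuracy` are hypothetical helpers.

```python
import math
from collections import Counter

# Toy stand-ins for labeled rows (1 = positive, 0 = negative).
# These reviews are invented for illustration; they are not from the dataset.
TRAIN = [
    ("a wonderful moving film", 1),
    ("great acting and a great story", 1),
    ("wonderful story great cast", 1),
    ("a dull boring film", 0),
    ("terrible acting and a boring story", 0),
    ("dull plot terrible cast", 0),
]
TEST = [("great wonderful film", 1), ("boring terrible plot", 0)]


def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized review."""
    return Counter(text.lower().split())


def predict_proba(weights, bias, features):
    """Probability that a review is positive under the logistic model."""
    z = bias + sum(weights.get(tok, 0.0) * n for tok, n in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in rows:
            feats = featurize(text)
            error = predict_proba(weights, bias, feats) - label
            for tok, n in feats.items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * n
            bias -= lr * error
    return weights, bias


def accuracy(weights, bias, rows):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = 0
    for text, label in rows:
        predicted = 1 if predict_proba(weights, bias, featurize(text)) >= 0.5 else 0
        hits += predicted == label
    return hits / len(rows)


weights, bias = train(TRAIN)
test_accuracy = accuracy(weights, bias, TEST)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

On the real data, the toy lists would be replaced by rows read from the downloaded files, and a library classifier would normally be preferable to this hand-rolled one.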

    License

    CC0

    Original Data Source: IMDB Movie Reviews (Binary Sentiment)

  2. imdb_reviews

    • tensorflow.org
    Updated Sep 20, 2024
    Cite
    (2024). imdb_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/imdb_reviews
    Dataset updated
    Sep 20, 2024
    Description

    Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('imdb_reviews', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

  3. OpenSubtitles Dataset

    • paperswithcode.com
    Updated Jul 10, 2022
    Cite
    Pierre Lison; Jörg Tiedemann (2022). OpenSubtitles Dataset [Dataset]. https://paperswithcode.com/dataset/opensubtitles
    Dataset updated
    Jul 10, 2022
    Authors
    Pierre Lison; Jörg Tiedemann
    Description

    OpenSubtitles is a collection of multilingual parallel corpora. The dataset is compiled from a large database of movie and TV subtitles and includes a total of 1689 bitexts spanning 2.6 billion sentences across 60 languages.

  4. Indian Movie Faces Dataset(IMFDB) Face Recognition

    • kaggle.com
    Updated Oct 22, 2021
    Cite
    ANIRUDH SIMHACHALAM (2021). Indian Movie Faces Dataset(IMFDB) Face Recognition [Dataset]. https://www.kaggle.com/anirudhsimhachalam/indian-movie-faces-datasetimfdb-face-recognition/tasks
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ANIRUDH SIMHACHALAM
    Description

    Details about IMFDB: The Indian Movie Face Database (IMFDB) is a large unconstrained face database consisting of 34512 images of 100 Indian actors collected from more than 100 videos. All the images are manually selected and cropped from the video frames, resulting in a high degree of variability in terms of scale, pose, expression, illumination, age, resolution, occlusion, and makeup. IMFDB is the first face database that provides a detailed annotation of every image in terms of age, pose, gender, expression and type of occlusion, which may help other face-related applications.

    This dataset has been modified so that it is ready for training a face recognition model. For the dataset with the annotations mentioned above, download it from the official site: https://cvit.iiit.ac.in/projects/IMFDB/

    Acknowledgements: https://cvit.iiit.ac.in/projects/IMFDB/ Shankar Setty, Moula Husain, Parisa Beham, Jyothi Gudavalli, Menaka Kandasamy, Radhesyam Vaddi, Vidyagouri Hemadri, J C Karure, Raja Raju, Rajan, Vijay Kumar and C V Jawahar. "Indian Movie Face Database: A Benchmark for Face Recognition Under Wide Variations" National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013.

  5. Project Area

    • opendata-historicengland.hub.arcgis.com
    Updated Sep 14, 2023
    Cite
    Historic England (2023). Project Area [Dataset]. https://opendata-historicengland.hub.arcgis.com/maps/historicengland::project-area-1
    Dataset updated
    Sep 14, 2023
    Dataset provided by
    Historic Buildings And Monuments Commission For England (https://historicengland.org.uk/)
    Authors
    Historic England
    Description

    Various data recorded by Historic England relating to aerial investigation and mapping projects. N.B. This is a dynamic dataset that is constantly evolving, not only with the addition of newly completed projects, but also with the reassessment of some earlier projects. See https://historicengland.org.uk/research/methods/airborne-remote-sensing/aerial-investigation/ for further details of Historic England's work with aerial sources.

    It's currently not possible to provide download access to the earlier hand drawn projects, which are only available as raster files, but these can be viewed via the Aerial Archaeology Mapping Explorer. We aim to create vector monument polygons for these features as the next phase of the project.

    More information and help with these layers

    Detailed Mapping

    This layer shows the detailed mapping of archaeological features derived from aerial imagery; this includes photographic imagery from many decades taken specifically for archaeological purposes, as well as other photography taken for other reasons and airborne lidar. The data are symbolised initially based on their physical form, i.e. cut/negative (e.g. pit, ditch etc.) or built/positive (e.g. mound, bank etc.).

    | Field name | Field alias | Description | Mandatory Y/N |
    |---|---|---|---|
    | LAYER | LAYER | The layer used for mapping | Y |
    | PROJECT | PROJECT | Project name | Y |
    | PERIOD | PERIOD | The presumed date/period assigned to the feature (terminology from FISH thesaurus) | Y |
    | MONUMENT_TYPE | MONUMENT_TYPE | The presumed type/function assigned to the feature (terminology from FISH thesaurus) | Y |
    | EVIDENCE_1 | EVIDENCE_1 | The primary evidence for the feature e.g. cropmark, earthwork etc (terminology from FISH thesaurus) | Y |
    | SOURCE_1 | SOURCE_1 | The primary source for the feature e.g. aerial photo reference, documentary source etc | Y |
    | EVIDENCE_2 | EVIDENCE_2 | Where available the latest evidence for the feature e.g. cropmark, earthwork etc (terminology from FISH thesaurus) N.B. This was the latest evidence seen and does not necessarily represent the current status of the feature. | N |
    | SOURCE_2 | SOURCE_2 | Where available the latest source for the feature N.B. This was the latest evidence seen and does not necessarily represent the current status of the feature. | N |
    | HE_UID | HE_UID | Composite of Unique identifier(s) used by Historic England | Y |
    | HER_NO | HER_NO | Composite of Unique identifier(s) used by Historic Environment Records | N |
    | DHEUID_1 | DHEUID_1 | Primary Unique identifier used by Historic England | Y |
    | DHEUID_2 | DHEUID_2 | Secondary Unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
    | DHEUID_3 ~ 5 | DHEUID_3 ~ 5 | Additional Unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
    | HE_URL1 | HE_URL1 | URL link to the relevant Historic England record in Heritage Gateway | Y |
    | HE_URL2 | HE_URL2 | URL link to the relevant Historic England record in Heritage Gateway | N |
    | HE_URL3 ~ 5 | HE_URL3 ~ 5 | URL link to the relevant Historic England record in Heritage Gateway | N |
    | DHERNO_1 | DHERNO_1 | Primary unique identifier used by the relevant Historic Environment Record (HER) | Y |
    | DHERNO_2 | DHERNO_2 | Secondary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
    | DHERNO_3 ~ 5 | DHERNO_3 ~ 5 | Tertiary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
    | DHERPREF_1 | DHERPREF_1 | Primary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID | Y |
    | DHERPREF_2 | DHERPREF_2 | Secondary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
    | DHERPREF_3 ~ 5 | DHERPREF_3 ~ 5 | Additional alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
    | HER_LINK_1 | HER_LINK_1 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | Y |
    | HER_LINK_2 | HER_LINK_2 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
    | HER_LINK_3 ~ 5 | HER_LINK_3 ~ 5 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
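
As a rough illustration of how the Mandatory Y/N column of the Detailed Mapping field table might be used, the sketch below checks a record for missing mandatory fields. It assumes a record is available as a plain Python dict keyed by field name; that representation, the `missing_mandatory` helper, and the sample values are hypothetical and are not part of the Historic England data or tooling.

```python
# Field names marked "Y" in the Detailed Mapping field table.
# Optional fields (marked "N") are deliberately left out.
MANDATORY_FIELDS = [
    "LAYER", "PROJECT", "PERIOD", "MONUMENT_TYPE", "EVIDENCE_1", "SOURCE_1",
    "HE_UID", "DHEUID_1", "HE_URL1", "DHERNO_1", "DHERPREF_1", "HER_LINK_1",
]


def missing_mandatory(record):
    """Return mandatory field names that are absent or blank in a record."""
    missing = []
    for name in MANDATORY_FIELDS:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


# Hypothetical record; the values are invented for illustration.
sample = {"LAYER": "Ditch", "PROJECT": "Example Project", "PERIOD": ""}
print(missing_mandatory(sample))
```

A blank string is treated as missing here; whether the published data uses blanks, nulls, or placeholder text for absent values is not stated in the description.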
    

    The data are symbolised initially based on their physical form i.e. cut/negative (e.g. pit, ditch etc) or built/positive (e.g. mound, bank etc)

    | Layer name | Colour (Hex) | Description |
    |---|---|---|
    | Bank | #A50026 | Used to outline banks, platforms, mounds and spoil heaps. |
    | Ditch | #313695 | Used to outline cut features such as ditches, ponds, pits or hollow ways. |
    | Extent of Feature | #FDAE61 (Dashes) | Used to depict the extent of large area features such as airfields, military camps, or major extraction. |
    | Ridge and Furrow Alignment | #74ADD1 | Line or arrow(s) (hand drawn not a symbol) depicting the direction of the rigs in a block of ridge and furrow. |
    | Ridge and Furrow Area | #74ADD1 (Dots) | Used to outline a block of ridge and furrow. |
    | Slope | #4575B4 | The top of the “T” indicates the top of slope and the body indicates the length and direction of the slope. Used to depict scarps, edges of platforms and other large earthworks. |
    | Structure | #F46D43 | Used to outline structures including stone, concrete, metal and timber constructions e.g., buildings, Nissen huts, tents, radio masts, camouflaged airfields, wrecks, fish traps, etc. |
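
For anyone reproducing the symbology outside QGIS, the layer colours listed above could be held in a simple lookup. The sketch below is only an illustration: the hex values are copied from the table, while the dict layout and the `hex_to_rgb` and `layer_rgb` helpers are hypothetical. Line styles such as dashes and dots are not modelled.

```python
# Hex colours copied from the symbology table; line styles are omitted.
LAYER_COLOURS = {
    "Bank": "#A50026",
    "Ditch": "#313695",
    "Extent of Feature": "#FDAE61",
    "Ridge and Furrow Alignment": "#74ADD1",
    "Ridge and Furrow Area": "#74ADD1",
    "Slope": "#4575B4",
    "Structure": "#F46D43",
}


def hex_to_rgb(hex_colour):
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of ints in 0..255."""
    value = hex_colour.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_colour!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def layer_rgb(layer_name):
    """Look up a layer's colour and return it as an RGB tuple."""
    return hex_to_rgb(LAYER_COLOURS[layer_name])


print(layer_rgb("Bank"))  # (165, 0, 38)
```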
    

    You can find instructions on how to create a QGIS style file (.qml) to recreate our mapping symbology in QGIS via our Open Data Downloads page under Aerial Investigation Mapping data.

    Monument Extents

    This layer shows the general extent of the monuments, created from multiple sources, primarily aerial imagery, but referring to other sources such as earthwork surveys, documentary evidence and any information available from the relevant Historic Environment Record etc. This differs from the 'Detailed Mapping' layer, which shows the individual features as they appear on the ground.

    | Field name | Field alias | Description | Mandatory Y/N |
    |---|---|---|---|
    | LAYER | LAYER | The layer used for mapping | Y |
    | HE_UID | HE_UID | Composite of Unique identifier(s) used by Historic England | Y |
    | HER_NO | HER_NO | Composite of Unique identifier(s) used by Historic Environment Records | N |
    | HE_UID1 | HE_UID1 | Primary Unique identifier used by Historic England | Y |
    | HE_UID2 | HE_UID2 | Secondary Unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
    | HE_UID3 ~ 5 | HE-UID3 ~ 5 | Additional Unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
    | HE_URL1 | HE_URL1 | URL link to the relevant Historic England record in Heritage Gateway | Y |
    | HE_URL2 | HE_URL2 | URL link to the relevant Historic England record in Heritage Gateway | N |
    | HE_URL3 ~ 5 | HE_URL3 ~ 5 | URL link to the relevant Historic England record in Heritage Gateway | N |
    | HERNO_1 | HERNO_1 | Primary unique identifier used by the relevant Historic Environment Record (HER) | Y |
    | HERNO_2 | HERNO_2 | Secondary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
    | HERNO_3 ~ 25 | HERNO_3 ~ 25 | Tertiary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
    | HERPREF_1 | HERPREF_1 | Primary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID | Y |
    | HERPREF_2 | HERPREF_2 | Secondary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
    | HERPREF_3 ~ 25 | HERPREF_3 ~ 25 | Additional alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
    | HER_LINK_1 | HER_LINK_1 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | Y |
    | HER_LINK_2 | HER_LINK_2 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
    | HER_LINK_3 ~ 25 | HER_LINK_3 ~ 25 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
    | PROJECT | project | Project name | Y |
    

    Project Area

    This layer shows the extent of the

  6. IMDb Movie Reviews Dataset

    • ieee-dataport.org
    Updated Aug 2, 2022
    Cite
    Aditya Pal (2022). IMDb Movie Reviews Dataset [Dataset]. https://ieee-dataport.org/open-access/imdb-movie-reviews-dataset
    Dataset updated
    Aug 2, 2022
    Authors
    Aditya Pal
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    R

