CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Source Huggingface Hub: link
About this dataset
This is a large dataset for binary sentiment classification, containing substantially more data than previous benchmark datasets. It provides 25,000 highly polar movie reviews for training and 25,000 for testing, plus additional unlabeled data. The data fields are consistent across all splits of the dataset.
How to use the dataset
First download the IMDB Large Movie Review Dataset. You can then use it in its original form or split it into training and testing sets. To split it, create a new file called unsupervised.csv and copy the text column from train.csv into it, then split unsupervised.csv into two files: train_unsupervised.csv and test_unsupervised.csv.
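The splitting steps described here could be sketched with Python's standard csv module. This is only an illustration: the column name `text` comes from the description, while the helper names, the 50/50 split ratio, and the fixed shuffle seed are assumptions, not part of the dataset.

```python
import csv
import random


def extract_text_column(train_csv, unsupervised_csv, column="text"):
    """Copy one column of train_csv into a single-column CSV; return the row count."""
    with open(train_csv, newline="", encoding="utf-8") as src, \
         open(unsupervised_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow([column])
        count = 0
        for row in reader:
            writer.writerow([row[column]])
            count += 1
    return count


def split_csv(unsupervised_csv, train_out, test_out, test_fraction=0.5, seed=0):
    """Shuffle the data rows and write them to two files; return (n_train, n_test)."""
    with open(unsupervised_csv, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        rows = list(reader)
    # Assumption: a seeded shuffle, so the split is reproducible.
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    for path, part in ((test_out, rows[:n_test]), (train_out, rows[n_test:])):
        with open(path, "w", newline="", encoding="utf-8") as dst:
            writer = csv.writer(dst)
            writer.writerow(header)
            writer.writerows(part)
    return len(rows) - n_test, n_test
```

A call such as `extract_text_column("train.csv", "unsupervised.csv")` followed by `split_csv("unsupervised.csv", "train_unsupervised.csv", "test_unsupervised.csv")` would produce the three files named in the text.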
With either the original dataset or the training and testing sets in hand, you can use them for binary sentiment classification. This requires a machine learning algorithm capable of binary classification, such as logistic regression or a support vector machine. After training your model on the training set, evaluate its performance on the test set by predicting the labels of the reviews in test_unsupervised.csv.
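As a rough illustration of the train-then-evaluate workflow, here is a dependency-free sketch of logistic regression over binary bag-of-words features, trained with plain stochastic gradient descent. It assumes the reviews and their labels are already loaded as Python lists and that labels are encoded as 0 (negative) and 1 (positive). The function names and hyperparameters are hypothetical, and a library implementation would normally be preferable on the full 25,000-review split.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a review and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_logistic_regression(texts, labels, epochs=20, lr=0.1):
    """Fit one weight per token plus a bias with SGD; labels are 0 or 1."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            tokens = set(tokenize(text))
            z = bias + sum(weights[t] for t in tokens)
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # gradient of the log loss with respect to z
            bias -= lr * grad
            for t in tokens:
                weights[t] -= lr * grad
    return weights, bias


def predict(text, weights, bias):
    """Return 1 (positive) if the linear score is above zero, else 0."""
    z = bias + sum(weights.get(t, 0.0) for t in set(tokenize(text)))
    return 1 if z > 0 else 0


def accuracy(texts, labels, weights, bias):
    """Fraction of reviews whose predicted label matches the given label."""
    correct = sum(predict(t, weights, bias) == y for t, y in zip(texts, labels))
    return correct / len(texts)
```

Training would call `train_logistic_regression` on the training reviews, and `accuracy` would then be computed on held-out reviews whose labels are known.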
Research Ideas
This dataset can be used to train a binary sentiment classification model that classifies movie reviews as positive or negative. It can also be used to build a large movie review database for research purposes.
CC0
Original Data Source: IMDB Movie Reviews (Binary Sentiment)
Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('imdb_reviews', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
OpenSubtitles is a collection of multilingual parallel corpora. The dataset is compiled from a large database of movie and TV subtitles and includes a total of 1689 bitexts spanning 2.6 billion sentences across 60 languages.
Details about IMFDB: The Indian Movie Face Database (IMFDB) is a large unconstrained face database consisting of 34512 images of 100 Indian actors collected from more than 100 videos. All the images are manually selected and cropped from the video frames, resulting in a high degree of variability in terms of scale, pose, expression, illumination, age, resolution, occlusion, and makeup. IMFDB is the first face database that provides a detailed annotation of every image in terms of age, pose, gender, expression and type of occlusion, which may help other face-related applications.
This dataset has been modified so that it is ready for training a face recognition model. The dataset with the annotations mentioned above can be downloaded from the official site: https://cvit.iiit.ac.in/projects/IMFDB/
Acknowledgements: https://cvit.iiit.ac.in/projects/IMFDB/ Shankar Setty, Moula Husain, Parisa Beham, Jyothi Gudavalli, Menaka Kandasamy, Radhesyam Vaddi, Vidyagouri Hemadri, J C Karure, Raja Raju, Rajan, Vijay Kumar and C V Jawahar. "Indian Movie Face Database: A Benchmark for Face Recognition Under Wide Variations" National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013.
Various data recorded by Historic England relating to aerial investigation and mapping projects. N.B. This is a dynamic dataset that is constantly evolving, not only with the addition of newly completed projects, but also with the reassessment of some earlier projects. See https://historicengland.org.uk/research/methods/airborne-remote-sensing/aerial-investigation/ for further details of Historic England's work with aerial sources.
It is currently not possible to provide download access to the earlier hand-drawn projects, which are only available as raster files, but these can be viewed via the Aerial Archaeology Mapping Explorer. We aim to create vector monument polygons for these features as the next phase of the project.
More information and help with these layers
Detailed Mapping
This layer shows the detailed mapping of archaeological features derived from aerial imagery; this includes photographic imagery from many decades taken specifically for archaeological purposes, as well as photography taken for other reasons and airborne lidar.
| Field name | Field alias | Description | Mandatory Y/N |
|---|---|---|---|
| LAYER | LAYER | The layer used for mapping | Y |
| PROJECT | PROJECT | Project name | Y |
| PERIOD | PERIOD | The presumed date/period assigned to the feature (terminology from FISH thesaurus) | Y |
| MONUMENT_TYPE | MONUMENT_TYPE | The presumed type/function assigned to the feature (terminology from FISH thesaurus) | Y |
| EVIDENCE_1 | EVIDENCE_1 | The primary evidence for the feature e.g. cropmark, earthwork etc. (terminology from FISH thesaurus) | Y |
| SOURCE_1 | SOURCE_1 | The primary source for the feature e.g. aerial photo reference, documentary source etc. | Y |
| EVIDENCE_2 | EVIDENCE_2 | Where available, the latest evidence for the feature e.g. cropmark, earthwork etc. (terminology from FISH thesaurus). N.B. This was the latest evidence seen and does not necessarily represent the current status of the feature. | N |
| SOURCE_2 | SOURCE_2 | Where available, the latest source for the feature. N.B. This was the latest evidence seen and does not necessarily represent the current status of the feature. | N |
| HE_UID | HE_UID | Composite of unique identifier(s) used by Historic England | Y |
| HER_NO | HER_NO | Composite of unique identifier(s) used by Historic Environment Records | N |
| DHEUID_1 | DHEUID_1 | Primary unique identifier used by Historic England | Y |
| DHEUID_2 | DHEUID_2 | Secondary unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
| DHEUID_3 ~ 5 | DHEUID_3 ~ 5 | Additional unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
| HE_URL1 | HE_URL1 | URL link to the relevant Historic England record in Heritage Gateway | Y |
| HE_URL2 | HE_URL2 | URL link to the relevant Historic England record in Heritage Gateway | N |
| HE_URL3 ~ 5 | HE_URL3 ~ 5 | URL link to the relevant Historic England record in Heritage Gateway | N |
| DHERNO_1 | DHERNO_1 | Primary unique identifier used by the relevant Historic Environment Record (HER) | Y |
| DHERNO_2 | DHERNO_2 | Secondary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
| DHERNO_3 ~ 5 | DHERNO_3 ~ 5 | Tertiary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
| DHERPREF_1 | DHERPREF_1 | Primary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID | Y |
| DHERPREF_2 | DHERPREF_2 | Secondary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
| DHERPREF_3 ~ 5 | DHERPREF_3 ~ 5 | Additional alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
| HER_LINK_1 | HER_LINK_1 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | Y |
| HER_LINK_2 | HER_LINK_2 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
| HER_LINK_3 ~ 5 | HER_LINK_3 ~ 5 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
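One way to work with the Detailed Mapping field list is to check each attribute record against the fields marked mandatory. The sketch below assumes a record is available as a plain dictionary keyed by field name, and it reads "DHEUID_3 ~ 5" as the three separate fields DHEUID_3, DHEUID_4 and DHEUID_5. That reading and both helper names are assumptions made for illustration.

```python
# Fields marked "Y" in the Detailed Mapping field list.
MANDATORY_FIELDS = (
    "LAYER", "PROJECT", "PERIOD", "MONUMENT_TYPE", "EVIDENCE_1", "SOURCE_1",
    "HE_UID", "DHEUID_1", "HE_URL1", "DHERNO_1", "DHERPREF_1", "HER_LINK_1",
)


def missing_mandatory(record):
    """Return the mandatory field names that are absent or blank in record."""
    missing = []
    for name in MANDATORY_FIELDS:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def numbered_values(record, prefix, last=5):
    """Collect the non-blank values of prefix1..prefixN, e.g. DHEUID_1..DHEUID_5."""
    values = []
    for i in range(1, last + 1):
        value = record.get(f"{prefix}{i}")
        if value is not None and str(value).strip():
            values.append(value)
    return values
```

For example, `numbered_values(record, "DHEUID_")` gathers every Historic England identifier attached to a feature that relates to more than one record.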
The data are symbolised initially based on their physical form, i.e. cut/negative (e.g. pit, ditch etc.) or built/positive (e.g. mound, bank etc.).
| Layer name | Colour (Hex) | Description |
|---|---|---|
| Bank | #A50026 | Used to outline banks, platforms, mounds and spoil heaps. |
| Ditch | #313695 | Used to outline cut features such as ditches, ponds, pits or hollow ways. |
| Extent of Feature | #FDAE61 (Dashes) | Used to depict the extent of large area features such as airfields, military camps, or major extraction. |
| Ridge and Furrow Alignment | #74ADD1 | Line or arrow(s) (hand drawn, not a symbol) depicting the direction of the rigs in a block of ridge and furrow. |
| Ridge and Furrow Area | #74ADD1 (Dots) | Used to outline a block of ridge and furrow. |
| Slope | #4575B4 | The top of the 'T' indicates the top of slope and the body indicates the length and direction of the slope. Used to depict scarps, edges of platforms and other large earthworks. |
| Structure | #F46D43 | Used to outline structures including stone, concrete, metal and timber constructions e.g. buildings, Nissen huts, tents, radio masts, camouflaged airfields, wrecks, fish traps, etc. |
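For tools that expect RGB triples rather than hex strings, the layer colours listed here could be held in a small lookup table. This is a minimal sketch: the dictionary and the helper name are hypothetical, and the line styles (dashes, dots) are not represented.

```python
# Hex colours copied from the layer symbology list.
LAYER_COLOURS = {
    "Bank": "#A50026",
    "Ditch": "#313695",
    "Extent of Feature": "#FDAE61",
    "Ridge and Furrow Alignment": "#74ADD1",
    "Ridge and Furrow Area": "#74ADD1",
    "Slope": "#4575B4",
    "Structure": "#F46D43",
}


def hex_to_rgb(hex_colour):
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of ints in 0..255."""
    digits = hex_colour.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb(LAYER_COLOURS["Bank"]))  # (165, 0, 38)
```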
You can find instructions on how to create a QGIS style file (.qml) to recreate our mapping symbology in QGIS via our Open Data Downloads page under Aerial Investigation Mapping data.
Monument Extents
This layer shows the general extent of the monuments, created from multiple sources, primarily aerial imagery, but referring to other sources such as earthwork surveys, documentary evidence and any information available from the relevant Historic Environment Record etc. This differs from the 'Detailed Mapping' layer, which shows the individual features as they appear on the ground.
| Field name | Field alias | Description | Mandatory Y/N |
|---|---|---|---|
| LAYER | LAYER | The layer used for mapping | Y |
| HE_UID | HE_UID | Composite of unique identifier(s) used by Historic England | Y |
| HER_NO | HER_NO | Composite of unique identifier(s) used by Historic Environment Records | N |
| HE_UID1 | HE_UID1 | Primary unique identifier used by Historic England | Y |
| HE_UID2 | HE_UID2 | Secondary unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
| HE_UID3 ~ 5 | HE-UID3 ~ 5 | Additional unique identifier used by Historic England. Used where a feature may relate to more than one Historic England record | N |
| HE_URL1 | HE_URL1 | URL link to the relevant Historic England record in Heritage Gateway | Y |
| HE_URL2 | HE_URL2 | URL link to the relevant Historic England record in Heritage Gateway | N |
| HE_URL3 ~ 5 | HE_URL3 ~ 5 | URL link to the relevant Historic England record in Heritage Gateway | N |
| HERNO_1 | HERNO_1 | Primary unique identifier used by the relevant Historic Environment Record (HER) | Y |
| HERNO_2 | HERNO_2 | Secondary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
| HERNO_3 ~ 25 | HERNO_3 ~ 25 | Tertiary unique identifier used by the relevant Historic Environment Record. Used where a feature may relate to more than one HER record | N |
| HERPREF_1 | HERPREF_1 | Primary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID | Y |
| HERPREF_2 | HERPREF_2 | Secondary alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
| HERPREF_3 ~ 25 | HERPREF_3 ~ 25 | Additional alternative unique identifier used by the relevant Historic Environment Record. Some HERs use the same number for both the HER No. and the reference to link to the record; others use different numbers and give them different names e.g. MonUID. Used where a feature may relate to more than one HER record | N |
| HER_LINK_1 | HER_LINK_1 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | Y |
| HER_LINK_2 | HER_LINK_2 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
| HER_LINK_3 ~ 25 | HER_LINK_3 ~ 25 | URL link to the relevant Historic Environment Record (HER) record in Heritage Gateway | N |
| PROJECT | project | Project name | Y |
Project Area
This layer shows the extent of the
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically