The CiteSeer dataset consists of 3312 scientific publications classified into one of six classes. The citation network consists of 4732 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 3703 unique words.
The CiteSeer dataset contains 3312 scientific publications grouped into one of six categories. The citation network consists of 4732 links. Each publication in the dataset is described by a 0/1-valued word vector, indicating the absence/presence of the corresponding word in the dictionary. The dictionary consists of 3703 unique words. The README file in the dataset provides more details.
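As a rough illustration of the 0/1-valued word vector described above, the sketch below builds one for a single publication. The four-word dictionary, the sample words, and the function name `binary_word_vector` are all hypothetical and chosen for illustration; the real dictionary has 3703 entries, and the dataset's actual file format is not shown here.

```python
# Toy illustration of a 0/1-valued word vector (not the real CiteSeer data).

def binary_word_vector(words, dictionary):
    """Return a list with 1 where a dictionary word occurs in `words`, else 0."""
    present = set(words)  # repeated words still count only once
    return [1 if term in present else 0 for term in dictionary]

# Hypothetical four-word dictionary; the real one has 3703 unique words.
dictionary = ["agent", "database", "learning", "retrieval"]

# Hypothetical words appearing in one publication.
paper_words = ["learning", "agent", "learning"]

vector = binary_word_vector(paper_words, dictionary)
print(vector)  # [1, 0, 1, 0]
```

The vector records only presence or absence, so the repeated word "learning" contributes a single 1 rather than a count.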