A total of 12 software defect data sets from NASA were used in this study: five data sets (part I), namely CM1, JM1, KC1, KC2, and PC1, were obtained from the PROMISE software engineering repository (http://promise.site.uottawa.ca/SERepository/), and the other seven data sets (part II) were obtained from the tera-PROMISE repository (http://openscience.us/repo/defect/mccabehalsted/).
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
The different algorithms of the imbalanced-learn toolbox are evaluated on a set of common datasets, which are more or less imbalanced. This benchmark was proposed in [1]. The following section presents the main characteristics of this benchmark; a loading sketch follows the table.
ID | Name | Repository & Target | Ratio | # samples | # features |
---|---|---|---|---|---|
1 | Ecoli | UCI, target: imU | 8.6:1 | 336 | 7 |
2 | Optical Digits | UCI, target: 8 | 9.1:1 | 5,620 | 64 |
3 | SatImage | UCI, target: 4 | 9.3:1 | 6,435 | 36 |
4 | Pen Digits | UCI, target: 5 | 9.4:1 | 10,992 | 16 |
5 | Abalone | UCI, target: 7 | 9.7:1 | 4,177 | 8 |
6 | Sick Euthyroid | UCI, target: sick euthyroid | 9.8:1 | 3,163 | 25 |
7 | Spectrometer | UCI, target: >=44 | 11:1 | 531 | 93 |
8 | Car_Eval_34 | UCI, target: good, v good | 12:1 | 1,728 | 6 |
9 | ISOLET | UCI, target: A, B | 12:1 | 7,797 | 617 |
10 | US Crime | UCI, target: >0.65 | 12:1 | 1,994 | 122 |
11 | Yeast_ML8 | LIBSVM, target: 8 | 13:1 | 2,417 | 103 |
12 | Scene | LIBSVM, target: >one label | 13:1 | 2,407 | 294 |
13 | Libras Move | UCI, target: 1 | 14:1 | 360 | 90 |
14 | Thyroid Sick | UCI, target: sick | 15:1 | 3,772 | 28 |
15 | Coil_2000 | KDD, CoIL, target: minority | 16:1 | 9,822 | 85 |
16 | Arrhythmia | UCI, target: 06 | 17:1 | 452 | 279 |
17 | Solar Flare M0 | UCI, target: M->0 | 19:1 | 1,389 | 10 |
18 | OIL | UCI, target: minority | 22:1 | 937 | 49 |
19 | Car_Eval_4 | UCI, target: vgood | 26:1 | 1,728 | 6 |
20 | Wine Quality | UCI, wine, target: <=4 | 26:1 | 4,898 | 11 |
21 | Letter Img | UCI, target: Z | 26:1 | 20,000 | 16 |
22 | Yeast_ME2 | UCI, target: ME2 | 28:1 | 1,484 | 8 |
23 | Webpage | LIBSVM, w7a, target: minority | 33:1 | 49,749 | 300 |
24 | Ozone Level | UCI, ozone, data | 34:1 | 2,536 | 72 |
25 | Mammography | UCI, target: minority | 42:1 | 11,183 | 6 |
26 | Protein homo. | KDD CUP 2004, minority | 111:1 | 145,751 | 74 |
27 | Abalone_19 | UCI, target: 19 | 130:1 | 4,177 | 8 |
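These datasets can also be fetched programmatically. Below is a minimal sketch (assuming the imbalanced-learn package is installed; dataset keys are the lowercase, underscore-separated names from the table, e.g. "ecoli", "abalone_19"):

import numpy as np
from imblearn.datasets import fetch_datasets

# Fetch two of the 27 benchmark datasets by name.
datasets = fetch_datasets(filter_data=("ecoli", "abalone_19"))
for name, bunch in datasets.items():
    X, y = bunch.data, bunch.target
    classes, counts = np.unique(y, return_counts=True)
    print(name, X.shape, dict(zip(classes.tolist(), counts.tolist())))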
[1] Ding, Zejin, "Diversified Ensemble Classifiers for Highly Imbalanced Data Learning and their Application in Bioinformatics." Dissertation, Georgia State University, (2011).
[2] Blake, Catherine, and Christopher J. Merz. "UCI Repository of machine learning databases." (1998).
[3] Chang, Chih-Chung, and Chih-Jen Lin. "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology (TIST) 2.3 (2011): 27.
[4] Caruana, Rich, Thorsten Joachims, and Lars Backstrom. "KDD-Cup 2004: results and analysis." ACM SIGKDD Explorations Newsletter 6.2 (2004): 95-108.
Tabular Benchmark
Dataset Description
This dataset is a curation of various datasets from OpenML, assembled to benchmark the performance of various machine learning algorithms.
Repository: https://github.com/LeoGrin/tabular-benchmark/community
Paper: https://hal.archives-ouvertes.fr/hal-03723551v2/document
Dataset Summary
A benchmark made of a curation of various tabular data learning tasks, including:
Regression from Numerical and Categorical Features… See the full description on the dataset page: https://huggingface.co/datasets/inria-soda/tabular-benchmark.
This repository contains datasets to quickly test graph classification algorithms, such as graph kernels and graph neural networks. The datasets are constructed so that the features on the nodes and the adjacency matrix are completely uninformative if considered alone; therefore, an algorithm that relies only on the node features or only on the graph structure will fail to achieve good classification results. A more detailed description of the dataset construction can be found on the GitHub page (https://github.com/FilippoMB/Benchmark_dataset_for_graph_classification), in the original publication (Bianchi, Filippo Maria, Claudio Gallicchio, and Alessio Micheli. "Pyramidal Reservoir Graph Neural Network." Neurocomputing 470 (2022): 389-404), and in the README.txt file.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This repository contains the datasets and experiment results presented in our arXiv paper:
B. Hoffman, M. Cusimano, V. Baglione, D. Canestrari, D. Chevallier, D. DeSantis, L. Jeantet, M. Ladds, T. Maekawa, V. Mata-Silva, V. Moreno-González, A. Pagano, E. Trapote, O. Vainio, A. Vehkaoja, K. Yoda, K. Zacarian, A. Friedlaender, "A benchmark for computational analysis of animal behavior, using animal-borne tags," 2023.
Standardized code to implement, train, and evaluate models can be found at https://github.com/earthspecies/BEBE/.
Please note the licenses in each dataset folder.
Zip folders beginning with "formatted": These are the datasets we used to run the experiments reported in the benchmark paper.
Zip folders beginning with "raw": These are the unprocessed datasets used in BEBE. Code to process these raw datasets into the formatted ones used by BEBE can be found at https://github.com/earthspecies/BEBE-datasets/.
Zip folders beginning with "experiments": Results of the cross-validation experiments reported in the paper, as well as hyperparameter optimization. Confusion matrices for all experiments can also be found here. Note that dt, rf, and svm refer to the feature set from Nathan et al., 2012.
Results used in Fig. 4 of the arXiv paper (deep neural networks vs. classical models)
{dataset}_harnet_nogyr
{dataset}_CRNN
{dataset}_CNN
{dataset}_dt
{dataset}_rf
{dataset}_svm
{dataset}_wavelet_dt
{dataset}_wavelet_rf
{dataset}_wavelet_svm
Results used in Fig. 5D of the arXiv paper (full data setting)
If dataset contains gyroscope (HAR, jeantet_turtles, vehkaoja_dogs):
{dataset}_harnet_nogyr
{dataset}_harnet_random_nogyr
{dataset}_harnet_unfrozen_nogyr
{dataset}_RNN_nogyr
{dataset}_CRNN_nogyr
{dataset}_rf_nogyr
Otherwise:
{dataset}_harnet_nogyr
{dataset}_harnet_unfrozen_nogyr
{dataset}_harnet_random_nogyr
{dataset}_RNN_nogyr
{dataset}_CRNN
{dataset}_rf
Results used in Fig. 5E of the arXiv paper (reduced data setting)
If dataset contains gyroscope (HAR, jeantet_turtles, vehkaoja_dogs):
{dataset}_harnet_low_data_nogyr
{dataset}_harnet_random_low_data_nogyr
{dataset}_harnet_unfrozen_low_data_nogyr
{dataset}_RNN_low_data_nogyr
{dataset}_wavelet_RNN_low_data_nogyr
{dataset}_CRNN_low_data_nogyr
{dataset}_rf_low_data_nogyr
Otherwise:
{dataset}_harnet_low_data_nogyr
{dataset}_harnet_random_low_data_nogyr
{dataset}_harnet_unfrozen_low_data_nogyr
{dataset}_RNN_low_data_nogyr
{dataset}_wavelet_RNN_low_data_nogyr
{dataset}_CRNN_low_data
{dataset}_rf_low_data
CSV files: we also include summaries of the experimental results in experiments_summary.csv, experiments_by_fold_individual.csv, and experiments_by_fold_behavior.csv; a loading sketch follows the column descriptions below.
experiments_summary.csv - results averaged over individuals and behavior classes
dataset (str): name of dataset
experiment (str): name of model with experiment setting
fig4 (bool): True if dataset+experiment was used in Fig. 4 of the arXiv paper
fig5d (bool): True if dataset+experiment was used in Fig. 5D of the arXiv paper
fig5e (bool): True if dataset+experiment was used in Fig. 5E of the arXiv paper
f1_mean (float): mean of macro-averaged F1 score, averaged over individuals in test folds
f1_std (float): standard deviation of macro-averaged F1 score, computed over individuals in test folds
prec_mean, prec_std (float): analogous for precision
rec_mean, rec_std (float): analogous for recall
experiments_by_fold_individual.csv - results per individual in the test folds
dataset (str): name of dataset
experiment (str): name of model with experiment setting
fig4 (bool): True if dataset+experiment was used in Fig. 4 of the arXiv paper
fig5d (bool): True if dataset+experiment was used in Fig. 5D of the arXiv paper
fig5e (bool): True if dataset+experiment was used in Fig. 5E of the arXiv paper
fold (int): test fold index
individual (int): individuals are numbered zero-indexed, starting from fold 1
f1 (float): macro-averaged f1 score for this individual
precision (float): macro-averaged precision for this individual
recall (float): macro-averaged recall for this individual
experiments_by_fold_behavior.csv - results per behavior class, for each test fold
dataset (str): name of dataset
experiment (str): name of model with experiment setting
fig4 (bool): True if dataset+experiment was used in Fig. 4 of the arXiv paper
fig5d (bool): True if dataset+experiment was used in Fig. 5D of the arXiv paper
fig5e (bool): True if dataset+experiment was used in Fig. 5E of the arXiv paper
fold (int): test fold index
behavior_class (str): name of behavior class
f1 (float): f1 score for this behavior, averaged over individuals in the test fold
precision (float): precision for this behavior, averaged over individuals in the test fold
recall (float): recall for this behavior, averaged over individuals in the test fold
train_ground_truth_label_counts (int): number of timepoints labeled with this behavior class, in the training set
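A minimal loading sketch (assuming pandas, and that experiments_summary.csv sits in the working directory; column names follow the descriptions above):

import pandas as pd

summary = pd.read_csv("experiments_summary.csv")
# Keep only the dataset+experiment pairs behind Fig. 4 and rank them
# by mean macro-averaged F1 score.
fig4 = summary[summary["fig4"]]
ranked = fig4.sort_values("f1_mean", ascending=False)
print(ranked[["dataset", "experiment", "f1_mean", "f1_std"]].head())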
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset is a collection of benchmark data sets used in the experiments of the paper
"Compression for 2-Parameter Persistent Homology"
by Ulderico Fugacci, Michael Kerber, and Alexander Rolle.
A detailed description of the datasets can be found in that paper. The file format is partially firep (as described in the Rivet library documentation) and partially scc2020.
The repository also contains the scripts used to generate the instances and to run the benchmarks from the cited paper. Executing them requires several additional libraries: generating the geometric examples requires CGAL, and running the benchmarks requires mpfree, multi-chunk, phat, and Rivet.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
NeuroTask is a benchmark dataset designed to facilitate the development of accurate and efficient methods for analyzing multi-session, multi-task, and multi-subject neural data. NeuroTask integrates 6 datasets from motor cortical regions, covering 7 tasks across 19 subjects.
This dataset includes indices that uniquely identify each session using datasetID, animal, and session.
The file naming convention is as follows:
{datasetID}_{bin size}_{dataset name}_{task}.parquet
Check out the github repository for more resources and some example notebooks: https://github.com/catniplab/NeuroTask/
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
HumanEval-V: Benchmarking High-Level Visual Reasoning with Complex Diagrams in Coding Tasks
📄 Paper • 🏠 Home Page • 💻 GitHub Repository • 🏆 Leaderboard • 🤗 Dataset Viewer
HumanEval-V is a novel benchmark designed to evaluate the diagram understanding and reasoning capabilities of Large Multimodal Models (LMMs) in programming contexts. Unlike existing benchmarks, HumanEval-V focuses on coding tasks that require sophisticated visual reasoning over… See the full description on the dataset page: https://huggingface.co/datasets/HumanEval-V/HumanEval-V-Benchmark.
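A minimal loading sketch (assuming the Hugging Face datasets library; no split name is passed, since splits are not documented here):

from datasets import load_dataset

# Download the benchmark from the Hub and inspect its structure.
ds = load_dataset("HumanEval-V/HumanEval-V-Benchmark")
print(ds)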
RTL-Repo Benchmark
This repository contains the data for the RTL-Repo benchmark introduced in the paper RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects.
👋 Overview
RTL-Repo is a benchmark for evaluating LLMs' effectiveness in generating Verilog code autocompletions within large, complex codebases. It assesses the model's ability to understand and remember the entire Verilog repository context and generate new code that is correct, relevant… See the full description on the dataset page: https://huggingface.co/datasets/ahmedallam/RTL-Repo.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The HornMT repository contains data and the associated metadata for the project Machine Translation Benchmark Dataset for Languages in the Horn of Africa. It is a multi-way parallel corpus that will serve as a benchmark to accelerate progress in machine translation research and production systems for languages in the Horn of Africa.
Supported Languages
Language | ISO 639-3 code |
---|---|
Afar | aaf |
Amharic | amh |
English | eng |
Oromo | orm |
Somali | som |
Tigrinya | tir |
data/ contains one text file per language and each file contains news snippets in the same order for each language.
data
├── aar.txt
├── amh.txt
├── eng.txt
├── orm.txt
├── som.txt
└── tir.txt
metadata.tsv contains tab-separated data describing each news snippet. The metadata contains the following fields:
Scope - describes whether the news is global or local. It takes two values: Global news and Local news.
Category - News category covering the following 12 topics:
Art and Culture
Business and Economy
Conflicts and Attacks
Disaster and Accidents
Entertainment
Environment
Health
International Relations
Law and Crime
Politics
Science and Technology
Sport
Source - List of one or more URLs from which the news content was extracted or on which it is based.
Domain - TLD corresponding to the URL(s) in Source.
Date - The publication date of the source article. The format is yyyy-mm-dd.
Other formats
All the data and associated metadata together in one file is also available in other file formats.
HornMT.xlsx - data and associated metadata in xlsx format.
HornMT.json - data and associated metadata in json format.
Below is an example row.
{ "data":{ "eng":"The World Meteorological Organisation reports that the ozone layer is damaged to its worst extent ever in the Arctic.", "aaf":"Baad Metrolojih Eglali Areketekeh Addal Ozonih qelu faxe waktik lafetle calat biyakisem xayose.", "amh":"የአለም የአየር ንብረት ድርጅት በአርክቲክ አካባቢ ያለው የኦዞን ምንጣፍ ከፍተኛ ጉዳት እንደደረሰበት አስታወቀ፡፡", "orm":"Dhaabbanni Meetiroolojii Addunyaa baqqaanni oozonii Arkiitik keessatti gara sadarkaa isa hamaa haga ammaatti akka miidhame gabaase.", "som":"Ururka Saadaasha Hawada Adduunka ayaa ku warramaya in lakabka ozoneka ee Ka koreeya dhulka baraflayda uu waxyeelladii abid ugu darnaa soo gaadhay.", "tir":"ውድብ ሜትሮሎጂ ዓለም ኣብ ኣርክቲክ ዝርከብ ናሕሲ ኦዞን ኣዝዩ ብዝኸፍአ ደረጃ ከምዝተጎድአ ሓቢሩ፡፡" }, "metadata":{ "scope":"Global", "category":"Science and Technology", "source":"https://www.independent.co.uk/environment/climate-change/ozone-layer-damaged-by-unusually-harsh-winter-2263653.html", "domain":"www.independent.co.uk", "date":"2011-04-05" } }
Team
Afar
Mohammed Deresa
Yasin Nur
Amharic
Tigist Taye
Selamawit Hailemariam
Wako Tilahun
Oromo
Gemechis Melkamu
Galata Girmaye
Somali
Abdiselam Mohamed
Beshir Abdi
Tigrinya
Berhanu Abadi Weldegiorgis
Michael Minassie
Nureddin Mohammedshiek
Project Leaders
Asmelash Teka Hadgu asme@lesan.ai
Gebrekirstos G. Gebremeskel gebrekirstos.gebremeskel@ru.nl
Abel Aregawi abel@lesan.ai
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Python wrapper for the Penn Machine Learning Benchmarks (PMLB) data repository, a large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms. Distributed via PyPI (https://pypi.org/).
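A minimal usage sketch (assuming the pmlb package is installed from PyPI; "mushroom" is just an illustrative dataset name):

from pmlb import fetch_data, dataset_names

print(len(dataset_names))  # number of datasets available in PMLB
# Fetch a single dataset as a feature matrix and target vector.
X, y = fetch_data("mushroom", return_X_y=True)
print(X.shape, y.shape)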
🤗 MigrationBench is a large-scale code migration benchmark dataset at the repository level, across multiple programming languages.
The current and initial release includes Java 8 repositories with the Maven build system, as of May 2025.
It has 3 datasets:
🤗 migration-bench-java-full has 5,102 repos, and each of them has a test directory or at least one test case.
🤗 migration-bench-java-selected is a subset of migration-bench-java-full, with 300 repos.
🤗 migration-bench-java-utg contains 4,184 repos, complementary to migration-bench-java-full.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
[UPDATE] You can now access the MultiSen (GE and NA) collection through this portal: https://doi.theia.data-terra.org/ai4lcc/?lang=en
MultiSenGE is a new large-scale multimodal and multitemporal benchmark dataset covering one of the biggest administrative regions in the eastern part of France. It contains 8,157 patches of 256 × 256 pixels for Sentinel-2 L2A, Sentinel-1 GRD, and a regional LULC topographic database.
Every file follows a specific naming convention:
Sentinel-1 patches: {tile}_{date}_S1_{x-pixel-coordinate}_{y-pixel-coordinate}.tif
Sentinel-2 patches: {tile}_{date}_S2_{x-pixel-coordinate}_{y-pixel-coordinate}.tif
Ground reference patches: {tile}_GR_{x-pixel-coordinate}_{y-pixel-coordinate}.tif
JSON Labels: {tile}_{x-pixel-coordinate}_{y-pixel-coordinate}.json
where tile is the Sentinel-2 tile number, date is the acquisition date of the patch, and x-pixel-coordinate and y-pixel-coordinate are the coordinates of the patch within the tile.
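As an illustration, a small hypothetical helper (not part of the official MultiSenGE-Tools) that unpacks the Sentinel-1/Sentinel-2 naming convention; the example filename is invented:

import os

def parse_patch_name(path):
    # {tile}_{date}_{S1|S2}_{x-pixel-coordinate}_{y-pixel-coordinate}.tif
    stem = os.path.splitext(os.path.basename(path))[0]
    tile, date, sensor, x, y = stem.split("_")
    return {"tile": tile, "date": date, "sensor": sensor,
            "x": int(x), "y": int(y)}

print(parse_patch_name("T32ULU_20200801_S2_1024_2048.tif"))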
In addition, you can find a set of useful python tools for extracting information about the dataset on Github : https://github.com/r-wenger/MultiSenGE-Tools
The first experiments based on this dataset are in press in ISPRS Annals: Wenger, R., Puissant, A., Weber, J., Idoumghar, L., and Forestier, G.: MultiSenGE: A Multimodal and Multitemporal Benchmark Dataset for Land Use/Land Cover Remote Sensing Applications, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-3-2022, 635–640, https://doi.org/10.5194/isprs-annals-V-3-2022-635-2022, 2022.
Due to the large size of the dataset, this Zenodo repository only hosts the associated JSON files. To download the Sentinel-1 and Sentinel-2 patches and the reference data, use the following links:
Sentinel-1 temporal series patches: https://s3.unistra.fr/a2s_datasets/MultiSenGE/s1.tgz
Sentinel-2 temporal series patches: https://s3.unistra.fr/a2s_datasets/MultiSenGE/s2.tgz
Ground reference patches: https://s3.unistra.fr/a2s_datasets/MultiSenGE/ground_reference.tgz
JSON files for each patch: https://s3.unistra.fr/a2s_datasets/MultiSenGE/labels.tgz
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Our dataset "repository_survey" summarizes a comprehensive survey of over 150 data repositories, characterizing their metadata documentation and standardization, data curation and validation, and tracking of dataset use in the literature. In addition, "survey_model_evaluation" includes our findings on model evaluation for five benchmark repositories. Column descriptions and further details can be found in "README.pdf." The data are associated with our paper "Benchmark Data Repositories: Lessons and Recommendations."
Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0
This resource can be used for benchmarking different RDF modelling solutions for statement-level metadata, namely:
RDF Reification,
Singleton Property,
RDF* (RDF-star).
More details about this resource can be found in the following publication:
Fabrizio Orlandi, Damien Graux, Declan O'Sullivan, "Benchmarking RDF Metadata Representations: Reification, Singleton Property and RDF*", 15th IEEE International Conference on Semantic Computing (ICSC), 2021.
Pre-print available at: http://fabriziorlandi.net/pdf/2021/ICSC2021_REF-Benchmark.pdf
The dataset contains 3 different versions of the Biomedical Knowledge Repository (BKR) knowledge graph, as described in:
Vinh Nguyen, Olivier Bodenreider, Amit Sheth. "Don't Like RDF Reification? Making Statements About Statements Using Singleton Property" WWW 2014, doi: 10.1145/2566486.2567973.
and,
Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler, Amit Sheth and Krishnaprasad Thirunarayan. "Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data" in Sci Stat Database Manag. 2010; 6187: 461–470. doi: 10.1007/978-3-642-13818-8_32
The 3 knowledge graph dumps are packaged as gzipped RDF files in Turtle (and Turtle*) syntax:
BKR-R-fullKGdump.ttl.gz for the Reification method,
BKR-S-fullKGdump.ttl.gz for the Singleton method,
BKR-star-fullKGdump.ttls.gz for the RDF* (RDF-star) method.
The RDF REiFication Benchmark (REF) also includes a set of SPARQL (and SPARQL*) queries that can be used to compare the performance of different triplestores.
Details about the SPARQL queries, and the queries themselves, are included in the "REF-Benchmark.tar.gz" archive. The queries are named after the dataset they are designed for (BKR-R, BKR-S, or BKR-star), followed by a letter identifying the query set and a query number.
E.g., the query in the file "BKR-R_F-Q3.rq" is for the BKR-R (standard reification) dataset, belongs to query set "F", and is query number 3 of that set. The same query, translated for the RDF* dataset into SPARQL* syntax, is contained in "BKR-star_F-Q3.rq".
Sets "A" and "B" are derived from the queries introduced by V. Nguyen et al. in: "Don't Like RDF Reification? Making Statements About Statements Using Singleton Property" WWW 2014, doi: 10.1145/2566486.2567973. Set "F" has been designed more with RDF* in mind as part of this benchmark (see [Orlandi et al., ICSC 2021])
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Music recommender systems can offer users personalized and contextualized recommendations and are therefore important for music information retrieval. An increasing number of datasets have been compiled to facilitate research on different topics, such as content-based, context-based, or next-song recommendation. However, these topics are usually addressed separately using different datasets, due to the lack of a unified dataset that contains a large variety of feature types such as item features, user contexts, and timestamps. To address this issue, we propose a large-scale benchmark dataset called #nowplaying-RS, which contains 11.6 million music listening events (LEs) of 139K users and 346K tracks collected from Twitter. The dataset comes with a rich set of item content features and user context features, and the timestamps of the LEs. Moreover, some of the user context features imply the cultural origin of the users, and some others, like hashtags, give clues to the emotional state of a user underlying an LE. In this paper, we provide some statistics to give insight into the dataset, and some directions in which the dataset can be used for music recommendation. We also provide standardized training and test sets for experimentation, and some baseline results obtained by using factorization machines.
The dataset contains three files:
user_track_hashtag_timestamp.csv contains basic information about each listening event. For each listening event, we provide an id, the user_id, track_id, hashtag, and created_at.
context_content_features.csv: contains all context and content features. For each listening event, we provide the id of the event, user_id, track_id, artist_id, content features regarding the track mentioned in the event (instrumentalness, liveness, speechiness, danceability, valence, loudness, tempo, acousticness, energy, mode, key) and context features regarding the listening event (coordinates (as geoJSON), place (as geoJSON), geo (as geoJSON), tweet_language, created_at, user_lang, time_zone, entities contained in the tweet).
sentiment_values.csv contains sentiment information for hashtags. It contains the hashtag itself and the sentiment values gathered via four different sentiment dictionaries: AFINN, Opinion Lexicon, SentiStrength Lexicon, and VADER. For each of these dictionaries we list the minimum, maximum, sum, and average of all sentiments of the tokens of the hashtag (if available; else we list empty values). However, as most hashtags only consist of a single token, these values are equal in most cases. Please note that the lexica are rather diverse and are therefore able to resolve very different terms against a score; hence, the resulting CSV is rather sparse. In the column names, we abbreviate all scores gathered over the Opinion Lexicon with the prefix 'ol'; similarly, 'ss' stands for SentiStrength.
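A minimal sketch (assuming pandas; column names follow the file descriptions above) that attaches hashtag sentiment to each listening event:

import pandas as pd

events = pd.read_csv("user_track_hashtag_timestamp.csv")
sentiment = pd.read_csv("sentiment_values.csv")

# Left join keeps every listening event; since sentiment_values.csv is
# sparse, events whose hashtag has no dictionary entry get NaN scores.
merged = events.merge(sentiment, on="hashtag", how="left")
print(merged.head())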
Please also find the training and test-splits for the dataset in this repo. Also, prototypical implementations of a context-aware recommender system based on the dataset can be found at https://github.com/asmitapoddar/nowplaying-RS-Music-Reco-FM.
If you make use of this dataset, please cite the following paper where we describe and experiment with the dataset:
@inproceedings{smc18,
  title = {#nowplaying-RS: A New Benchmark Dataset for Building Context-Aware Music Recommender Systems},
  author = {Asmita Poddar and Eva Zangerle and Yi-Hsuan Yang},
  url = {http://mac.citi.sinica.edu.tw/~yang/pub/poddar18smc.pdf},
  year = {2018},
  date = {2018-07-04},
  booktitle = {Proceedings of the 15th Sound & Music Computing Conference},
  address = {Limassol, Cyprus},
  note = {code at https://github.com/asmitapoddar/nowplaying-RS-Music-Reco-FM},
  tppubtype = {inproceedings}
}
MIT License: https://opensource.org/licenses/MIT
AerialExtreMatch — Benchmark Dataset
Code | Project Page | Paper (WIP)
This repo contains the benchmark set for our paper AerialExtreMatch: A Benchmark for Extreme-View Image Matching and Localization. 32 difficulty levels are included. We also provide train and localization datasets.
Usage
Simply clone this repository and unzip the dataset files:
git clone git@hf.co:datasets/Xecades/AerialExtreMatch-Benchmark
cd AerialExtreMatch-Benchmark
unzip "*.zip"
rm -rf *.zip
… See the full description on the dataset page: https://huggingface.co/datasets/Xecades/AerialExtreMatch-Benchmark.
This repository contains the dataset used in the paper "Enhancing Kitchen Activity Recognition: A Benchmark Study of the Rostock KTA Dataset" by Dr. Samaneh Zolfaghari, Teodor Stoev, and Prof. Dr. Kristina Yordanova. If you use the dataset, please cite the paper using the BibTeX below:
@ARTICLE{10409517,
  author={Zolfaghari, Samaneh and Stoev, Teodor and Yordanova, Kristina},
  journal={IEEE Access},
  title={Enhancing Kitchen Activity Recognition: A Benchmark Study of the Rostock KTA Dataset},
  year={2024},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/ACCESS.2024.3356352}}
as well as the original KTA dataset paper "Kitchen task assessment dataset for measuring errors due to cognitive impairments" by Yordanova, Kristina and Hein, Albert and Kirste, Thomas:
@inproceedings{yordanova2020kitchen,
  title={Kitchen task assessment dataset for measuring errors due to cognitive impairments},
  author={Yordanova, Kristina and Hein, Albert and Kirste, Thomas},
  booktitle={2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)},
  pages={1--6},
  year={2020},
  organization={IEEE}}
Description of the files
All the archive files containing our data are in the folder data, which contains two other folders: all_actions (containing the experimental data and the labels we used for the evaluation of the classifier when trained with all action classes in the KTA dataset) and most_common_actions (containing the experimental data and labels we used to evaluate the classifier on the 6 most common actions).
Dataset Card for MMIU
Repository: https://github.com/OpenGVLab/MMIU
Paper: https://arxiv.org/abs/2408.02718
Project Page: https://mmiu-bench.github.io/
Point of Contact: Fanqing Meng
Introduction
MMIU encompasses 7 types of multi-image relationships, 52 tasks, 77K images, and 11K meticulously curated multiple-choice questions, making it the most extensive benchmark of its kind. Our evaluation of 24 popular MLLMs, including both open-source and proprietary models… See the full description on the dataset page: https://huggingface.co/datasets/FanqingM/MMIU-Benchmark.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Set of results for the benchmark dataset for long text similarity research. The results were tested with the following language models: all-MiniLM_6_v2, all-MiniLM-L12-v2, all-mpnet-base-v2, glove.6B.300d, Longformer, BigBird, GPT2, BART. The repository containing the code and dataset is available at: https://github.com/omarzatarain/long-texts-similarity