Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Table of three selected data events showing the initial analysis of the multimodal elements that constituted the data, which were: 1) Metadata, 2) Social actors, 3) Visual images (e.g., photo analysis), 4) Linguistic expressions of sentiment, 5) Non-linguistic reactions, and 6) Broader social-economic-political relations.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is the fourth of the four datasets that we have created for audio-text training tasks. They collect pairs of texts and audio clips, based on the audio-image pairs from our datasets [1, 2, 3], and are intended for research purposes only.
For the conversion, .csv tables were created in which the audio values were split across 16,000 columns and the images were converted into texts using the public BLIP model [4]. The original images are also preserved for future reference.
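As a rough sketch of this layout, such a table could be read as follows; the file name audio_text_pairs.csv, the caption column text, and the assumption that the 16,000 audio columns come first are placeholders, not the actual schema:

```python
import numpy as np
import pandas as pd

# Hypothetical file and column names; the real CSV layout may differ.
df = pd.read_csv("audio_text_pairs.csv")

# Assume the 16,000 audio samples occupy the first 16,000 columns
# and the BLIP-generated caption sits in a column named "text".
audio_cols = df.columns[:16000]
waveform = df.loc[0, audio_cols].to_numpy(dtype=np.float32)  # one clip of 16,000 samples
caption = df.loc[0, "text"]

print(waveform.shape)  # (16000,)
print(caption)
```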
To allow other researchers to quickly evaluate the potential usefulness of our datasets for their purposes, we have made a public page available where anyone can inspect 60 random samples extracted from all of our data [5].
[1] Jorge E. León. Image-audio pairs (1 of 3). 2024. URL: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-1-of-3.
[2] Jorge E. León. Image-audio pairs (2 of 3). 2024. URL: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-2-of-3.
[3] Jorge E. León. Image-audio pairs (3 of 3). 2024. URL: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-3-of-3.
[4] Junnan Li et al. "BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation". In: arXiv 2201.12086 (2022).
[5] Jorge E. León. AVT Multimodal Dataset. 2024. URL: https://jorvan758.github.io/AVT-Multimodal-Dataset/.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for TableBench
📚 Paper
🏆 Leaderboard
💻 Code
Dataset Summary
TableBench is a comprehensive and complex benchmark designed to evaluate Table Question Answering (TableQA) capabilities, aligning closely with the "Reasoning Complexity of Questions" dimension of real-world Table QA scenarios. It covers 18 question categories across 4 major categories, including… See the full description on the dataset page: https://huggingface.co/datasets/Multilingual-Multimodal-NLP/TableBench.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To read any dataset, you can use the following code:
>>> import numpy as np
>>> embed_image = np.load('embed_image.npy')
>>> embed_image.shape
(33962, 768)
>>> embed_text = np.load('embed_text.npy')
>>> embed_text.shape
(33962, 768)
>>> import pandas as pd
>>> items = pd.read_csv('items.txt')
>>> m = len(items)
>>> print(f'{m} items in dataset')
33962 items in dataset
>>> users = pd.read_csv('users.txt')
>>> n = len(users)
>>> print(f'{n} users in dataset')
14790 users in dataset
>>> train = pd.read_csv('train.txt')
>>> train
user item
0 13444 23557
1 13444 33739
... ... ...
317109 13506 29993
317110 13506 13931
>>> from scipy.sparse import csr_matrix
>>> train_matrix = csr_matrix((np.ones(len(train)), (train.user, train.item)), shape=(n,m))
This collection contains six datasets. Each dataset is duplicated with seven combinations of image and text encoders, so you should see 42 folders.
Each folder is the name of the dataset and the encoder used for the visual and textual parts. For example: bookcrossing-vit_bert.
The datasets are:
- Clothing, Shoes and Jewelry (Amazon)
- Home and Kitchen (Amazon)
- Musical Instruments (Amazon)
- Movies and TV (Amazon)
- Book-Crossing
- Movielens 25M
And the encoders are:
- CLIP (Image and Text) (*-clip_clip). This is the main one used in the experiments.
- ViT and BERT (*-vit_bert)
- CLIP (only visual data) *-clip_none
- ViT only *-vit_none
- BERT only *-none_bert
- CLIP (text only) *-none_clip
- No textual or visual information *-none_none
For each dataset, we have the following files, considering we have M items and N users, textual embeddings with D dimensions (e.g., 1024), and visual embeddings with E dimensions (e.g., 768):
- embed_image.npy: a NumPy array of M×E elements.
- embed_text.npy: a NumPy array of M×D elements.
- items.csv: a CSV with the Item ID in the original dataset (like the Amazon ASIN, the Movie ID, etc.) and the item number, an integer from 0 to M-1.
- users.csv: a CSV with the User ID in the original dataset (like the Amazon Reviewer ID) and the user number, an integer from 0 to N-1.
- train.txt, validation.txt and test.txt: CSV files with the train, validation and test portions of the reviews. Each row is a positive (user, item) pair, i.e. an item the user liked or reviewed positively.
We consider a review "positive" if the rating is four or more (or 8 or more for Book-crossing).
If an item does not have an image or text, the corresponding embedding vector is zeroed out.
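Items without an image or text can therefore be located by checking for all-zero rows in the embedding arrays, as in this minimal sketch:

```python
import numpy as np

embed_image = np.load('embed_image.npy')
embed_text = np.load('embed_text.npy')

# An all-zero row means the item had no image (or no text).
missing_image = ~embed_image.any(axis=1)
missing_text = ~embed_text.any(axis=1)
print(f'{missing_image.sum()} items without image, {missing_text.sum()} items without text')
```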
| Dataset | Users | Items | Ratings | Density |
|---|---|---|---|---|
| Clothing & Shoes & Jewelry | 23318 | 38493 | 178944 | 0.020% |
| Home & Kitchen | 5968 | 57645 | 135839 | 0.040% |
| Movies & TV | 21974 | 23958 | 216110 | 0.041% |
| Musical Instruments | 14429 | 29040 | 93923 | 0.022% |
| Book-crossing | 14790 | 33962 | 519613 | 0.103% |
| Movielens 25M | 162541 | 59047 | 25000095 | 0.260% |
For the Amazon datasets, only a small fraction of the original data was used, by considering reviews within a specific date range.
For the Book-Crossing dataset, only items with images were considered.
There are various other minor tweaks in how images and texts were obtained. The repo https://github.com/igui/MultimodalRecomAnalysis has the notebooks and scripts to reproduce the dataset extraction from scratch.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The first multi-modal Steam dataset with semantic search capabilities. 239,664 applications collected from official Steam Web APIs with PostgreSQL database architecture, vector embeddings for content discovery, and comprehensive review analytics.
Made by a lifelong gamer for the gamer in all of us. Enjoy!🎮
GitHub Repository https://github.com/vintagedon/steam-dataset-2025
Figure: 1024-dimensional game embeddings projected to 2D via UMAP reveal natural genre clustering in semantic space.
Unlike traditional flat-file Steam datasets, this is built as an analytically-native database optimized for advanced data science workflows:
☑️ Semantic Search Ready - 1024-dimensional BGE-M3 embeddings enable content-based game discovery beyond keyword matching
☑️ Multi-Modal Architecture - PostgreSQL + JSONB + pgvector in unified database structure
☑️ Production Scale - 239K applications vs typical 6K-27K in existing datasets
☑️ Complete Review Corpus - 1,048,148 user reviews with sentiment and metadata
☑️ 28-Year Coverage - Platform evolution from 1997-2025
☑️ Publisher Networks - Developer and publisher relationship data for graph analysis
☑️ Complete Methodology & Infrastructure - Full work logs document every technical decision and challenge encountered, while my API collection scripts, database schemas, and processing pipelines enable you to update the dataset, fork it for customized analysis, learn from real-world data engineering workflows, or critique and improve the methodology
Figure: Market segmentation and pricing strategy analysis across top 10 genres.
Core Data (CSV Exports):
- 239,664 Steam applications with complete metadata
- 1,048,148 user reviews with scores and statistics
- 13 normalized relational tables for pandas/SQL workflows
- Genre classifications, pricing history, platform support
- Hardware requirements (min/recommended specs)
- Developer and publisher portfolios
Advanced Features (PostgreSQL):
- Full database dump with optimized indexes
- JSONB storage preserving complete API responses
- Materialized columns for sub-second query performance
- Vector embeddings table (pgvector-ready; see the sketch below)
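As an illustration of the pgvector setup, the sketch below runs a cosine-distance search over the embeddings table; the connection string and the table and column names (game_embeddings, app_id, embedding) are placeholders rather than the actual schema of the dump:

```python
import numpy as np
import psycopg2

# Placeholder connection parameters and schema; adapt to the restored database dump.
conn = psycopg2.connect("dbname=steam_dataset user=postgres")
query_vec = np.random.rand(1024).astype(np.float32)  # stand-in for a BGE-M3 query embedding
vec_literal = "[" + ",".join(f"{x:.6f}" for x in query_vec) + "]"

with conn, conn.cursor() as cur:
    # pgvector's <=> operator returns cosine distance; smaller means more similar.
    cur.execute(
        "SELECT app_id, embedding <=> %s::vector AS distance "
        "FROM game_embeddings ORDER BY distance LIMIT 10;",
        (vec_literal,),
    )
    for app_id, distance in cur.fetchall():
        print(app_id, round(distance, 4))
```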
Documentation:
- Complete data dictionary with field specifications
- Database schema documentation
- Collection methodology and validation reports
Three comprehensive analysis notebooks demonstrate dataset capabilities. All notebooks render directly on GitHub with full visualizations and output:
- 28 years of Steam's growth, genre evolution, and pricing strategies. (View on GitHub | PDF Export)
- Content-based recommendations using vector embeddings across genre boundaries. (View on GitHub | PDF Export)
- Genre prediction from game descriptions, demonstrating text analysis capabilities. (View on GitHub | PDF Export)
Notebooks render with full output on GitHub. Kaggle-native versions planned for v1.1 release. CSV data exports included in dataset for immediate analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Modality pairs from the three 2D datasets:
- Modality A: Near-Infrared (NIR); Modality B: three colour channels (in B-G-R order)
- Modality A: Fluorescence Images; Modality B: Quantitative Phase Images (QPI)
- Modality A: Second Harmonic Generation (SHG); Modality B: Bright-Field (BF)
The evaluation set created from the above three publicly available 2D datasets consists of images that have undergone 4 levels of (rigid) transformations of increasing displacement. The level of a transformation is determined by the size of the rotation angle θ and the displacements tx & ty, detailed in the accompanying table. Each image sample is transformed exactly once at each transformation level, so that all levels have the same number of samples.
Radiological (3D) data:
- Modality A: T1-weighted MRI; Modality B: T2-weighted MRI
(Run make_rire_patches.py to generate the sub-volumes.)
Reference sub-volumes of size 210×210×70 voxels are cropped directly from the centres of the (non-displaced) resampled volumes. As for the aforementioned 2D datasets, random (uniformly distributed) transformations are composed of rotations θx, θy ∈ [-4, 4] degrees around the x- and y-axes, a rotation θz ∈ [-20, 20] degrees around the z-axis, translations tx, ty ∈ [-19.6, 19.6] voxels in the x and y directions, and a translation tz ∈ [-6.5, 6.5] voxels in the z direction. 40 rigid transformations of increasing displacement are applied to each volume. Transformed sub-volumes, of size 210×210×70 voxels, are cropped from the centres of the transformed and resampled volumes.
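A minimal sketch of drawing one such random rigid transformation within the stated ranges; the xyz Euler convention and the omission of the resampling step are assumptions of this illustration:

```python
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)

# Rotation angles (degrees) and translations (voxels) drawn uniformly from the stated ranges.
theta_x, theta_y = rng.uniform(-4, 4, size=2)
theta_z = rng.uniform(-20, 20)
tx, ty = rng.uniform(-19.6, 19.6, size=2)
tz = rng.uniform(-6.5, 6.5)

# 4x4 homogeneous rigid transform: rotation about x, y, z followed by translation.
T = np.eye(4)
T[:3, :3] = Rotation.from_euler("xyz", [theta_x, theta_y, theta_z], degrees=True).as_matrix()
T[:3, 3] = [tx, ty, tz]
print(T)
```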
In total, it contains 864 image pairs created from the aerial dataset, 5040 image pairs created from the cytological dataset, 536 image pairs created from the histological dataset, and metadata with scripts to create the 480 volume pairs from the radiological dataset. Each image pair consists of a reference patch \(I^{\text{Ref}}\) and its corresponding initial transformed patch \(I^{\text{Init}}\) in both modalities, along with the ground-truth transformation parameters to recover it.
Scripts to calculate the registration performance and to plot the overall results can be found in https://github.com/MIDA-group/MultiRegEval, and instructions to generate more evaluation data with different settings can be found in https://github.com/MIDA-group/MultiRegEval/tree/master/Datasets#instructions-for-customising-evaluation-data.
Metadata
In the *.zip files, each row in {Zurich,Balvan}_patches/fold[1-3]/patch_tlevel[1-4]/info_test.csv or Eliceiri_patches/patch_tlevel[1-4]/info_test.csv provides the information of an image pair as follows:
Filename: identifier(ID) of the image pair
X1_Ref: x-coordinate of the upper-left corner of reference patch IRef
Y1_Ref: y-coordinate of the upper-left corner of reference patch IRef
X2_Ref: x-coordinate of the lower-left corner of reference patch IRef
Y2_Ref: y-coordinate of the lower-left corner of reference patch IRef
X3_Ref: x-coordinate of the lower-right corner of reference patch IRef
Y3_Ref: y-coordinate of the lower-right corner of reference patch IRef
X4_Ref: x-coordinate of the upper-right corner of reference patch IRef
Y4_Ref: y-coordinate of the upper-right corner of reference patch IRef
X1_Trans: x-coordinate of the upper-left corner of transformed patch IInit
Y1_Trans: y-coordinate of the upper-left corner of transformed patch IInit
X2_Trans: x-coordinate of the lower-left corner of transformed patch IInit
Y2_Trans: y-coordinate of the lower-left corner of transformed patch IInit
X3_Trans: x-coordinate of the lower-right corner of transformed patch IInit
Y3_Trans: y-coordinate of the lower-right corner of transformed patch IInit
X4_Trans: x-coordinate of the upper-right corner of transformed patch IInit
Y4_Trans: y-coordinate of the upper-right corner of transformed patch IInit
Displacement: mean Euclidean distance between reference corner points and transformed corner points
RelativeDisplacement: the ratio of displacement to the width/height of image patch
Tx: randomly generated translation in the x-direction to synthesise the transformed patch IInit
Ty: randomly generated translation in the y-direction to synthesise the transformed patch IInit
AngleDegree: randomly generated rotation in degrees to synthesise the transformed patch IInit
AngleRad: randomly generated rotation in radian to synthesise the transformed patch IInit
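For reference, the Displacement column can be recomputed from these corner coordinates as the mean Euclidean distance between corresponding reference and transformed corners; a minimal sketch using one of the info_test.csv files:

```python
import numpy as np
import pandas as pd

info = pd.read_csv("Eliceiri_patches/patch_tlevel1/info_test.csv")

# Corner columns as listed above: X1_Ref, Y1_Ref, ..., X4_Trans, Y4_Trans.
ref = info[[f"{a}{i}_Ref" for i in range(1, 5) for a in ("X", "Y")]].to_numpy().reshape(-1, 4, 2)
trans = info[[f"{a}{i}_Trans" for i in range(1, 5) for a in ("X", "Y")]].to_numpy().reshape(-1, 4, 2)

# Mean Euclidean distance between the four corresponding corner points of each pair.
displacement = np.linalg.norm(ref - trans, axis=2).mean(axis=1)

# Sanity check against the stored column (small differences may arise from rounding).
print(np.allclose(displacement, info["Displacement"]))
```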
In addition, each row in RIRE_patches/fold[1-3]/patch_tlevel[1-4]/info_test.csv has the following columns:
Naming convention
Aerial data
- zh{ID}_{iRow}_{iCol}_{ReferenceOrTransformed}.png
- Example: zh5_03_02_R.png indicates the Reference patch of the 3rd row and 2nd column cut from the image with ID zh5.
Cytological data
- {cellline}_{treatment}_{fieldofview}_{iFrame}_{iRow}_{iCol}_{ReferenceOrTransformed}.png
- Example: PNT1A_do_1_f15_02_01_T.png indicates the Transformed patch of the 2nd row and 1st column cut from the corresponding cytological image.
Introduction
Machine learning (ML) is an effective tool for predicting mental states and is a key technology in digital psychiatry. This study aimed to develop ML algorithms to predict the upper tertile group of various anxiety symptoms based on multimodal data from virtual reality (VR) therapy sessions for social anxiety disorder (SAD) patients, and to evaluate their predictive performance across each data type.
Methods
This study included 32 SAD-diagnosed individuals and finalized a dataset of 132 samples from 25 participants. It utilized multimodal (physiological and acoustic) data from VR sessions that simulate social anxiety scenarios. The extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS) was employed for acoustic feature extraction, and statistical attributes were extracted from the time series-based physiological responses. We developed ML models that predict the upper tertile group for various anxiety symptoms in SAD using Random Forest, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost). The best parameters were explored through grid search or random search, and the models were validated using stratified cross-validation and leave-one-out cross-validation.
Results
The CatBoost model using multimodal features exhibited high performance, particularly for the Social Phobia Scale, with an area under the receiver operating characteristic curve (AUROC) of 0.852. It also showed strong performance in predicting cognitive symptoms, with the highest AUROC of 0.866 for the Post-Event Rumination Scale. For generalized anxiety, the LightGBM prediction for the State-Trait Anxiety Inventory-trait led to an AUROC of 0.819. In the same analysis, models using only physiological features had AUROCs of 0.626, 0.744, and 0.671, whereas models using only acoustic features had AUROCs of 0.788, 0.823, and 0.754.
Conclusions
This study showed that an ML algorithm using integrated multimodal data can predict upper tertile anxiety symptoms in patients with SAD with higher performance than acoustic or physiological data obtained during a VR session alone. The results can be used as evidence for personalized VR sessions and demonstrate the strength of multimodal data for clinical use.
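A minimal sketch of the evaluation scheme described above (leave-one-out cross-validation with AUROC on a binary upper-tertile label), using scikit-learn's random forest as a stand-in for the boosting models and placeholder data in place of the real multimodal features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(132, 40))   # placeholder multimodal feature matrix (132 samples)
scores = rng.normal(size=132)    # placeholder anxiety-symptom scores

# Binary target: upper tertile of the symptom score vs. the rest.
y = (scores >= np.quantile(scores, 2 / 3)).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Collect out-of-fold predicted probabilities across all leave-one-out folds,
# then compute a single AUROC over the full sample.
proba = cross_val_predict(clf, X, y, cv=LeaveOneOut(), method="predict_proba")[:, 1]
print("LOOCV AUROC:", roc_auc_score(y, proba))
```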
CDLA Permissive 1.0: https://cdla.io/permissive-1-0/
The MultiBanFakeDetect dataset consists of 9,600 text–image instances collected from online forums, news websites, and social media. It covers multiple themes — political, social, technology, and entertainment — with a balanced distribution of real and fake instances.
The dataset is split into:
| Type | Training | Testing | Validation |
|---|---|---|---|
| Misinformation | 1,288 | 161 | 162 |
| Rumor | 1,215 | 152 | 151 |
| Clickbait | 1,337 | 167 | 167 |
| Non-fake | 3,840 | 480 | 480 |
| Total | 7,680 | 960 | 960 |
| Label | Training | Testing | Validation |
|---|---|---|---|
| 1 (Fake) | 3,840 | 480 | 480 |
| 0 (Non-Fake) | 3,840 | 480 | 480 |
| Total | 7,680 | 960 | 960 |
| Category | Training | Testing | Validation |
|---|---|---|---|
| Entertainment | 640 | 80 | 80 |
| Sports | 640 | 80 | 80 |
| Technology | 640 | 80 | 80 |
| National | 640 | 80 | 80 |
| Lifestyle | 640 | 80 | 80 |
| Politics | 640 | 80 | 80 |
| Education | 640 | 80 | 80 |
| International | 640 | 80 | 80 |
| Crime | 640 | 80 | 80 |
| Finance | 640 | 80 | 80 |
| Business | 640 | 80 | 80 |
| Miscellaneous | 640 | 80 | 80 |
| Total | 7,680 | 960 | 960 |
@article{FARIA2025100347,
title = {MultiBanFakeDetect: Integrating advanced fusion techniques for multimodal detection of Bangla fake news in under-resourced contexts},
journal = {International Journal of Information Management Data Insights},
volume = {5},
number = {2},
pages = {100347},
year = {2025},
issn = {2667-0968},
doi = {https://doi.org/10.1016/j.jjimei.2025.100347},
url = {https://www.sciencedirect.com/science/article/pii/S2667096825000291},
author = {Fatema Tuj Johora Faria and Mukaffi Bin Moin and Zayeed Hasan and Md. Arafat Alam Khandaker and Niful Islam and Khan Md Hasib and M.F. Mridha},
keywords = {Fake news detection, Multimodal dataset, Textual analysis, Visual analysis, Bangla language, Under-resource, Fusion techniques, Deep learning}}
MMSci_Table
Dataset for the paper "Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning"
📑 Paper Github
MMSci Dataset Collection
The MMSci dataset collection consists of three complementary datasets designed for scientific multimodal table understanding and reasoning: MMSci-Pre, MMSci-Ins, and MMSci-Eval.
Dataset Summary
MMSci-Pre: A domain-specific pre-training dataset… See the full description on the dataset page: https://huggingface.co/datasets/yangbh217/MMSci_Table.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
A multi-modality, multi-activity, and multi-subject dataset of wearable biosignals.
Modalities: ECG, EMG, EDA, PPG, ACC, TEMP
Main activities: Lift object, Greet people, Gesticulate while talking, Jumping, Walking, and Running
Cohort: 17 subjects (10 male, 7 female); median age: 24
Devices: 2x ScientISST Core + 1x Empatica E4
Body locations: Chest, abdomen, left bicep, wrist, and index finger
No filter has been applied to the signals, but the correct transfer functions were applied, so the data is given in the relevant units (mV, uS, g, ºC).
In this repository, two formats are available:
a) LTBio Biosignal files. These should be opened like: x = Biosignal.load(path). LTBio package: https://pypi.org/project/LongTermBiosignals/. Under the directory biosignal, the following tree structure is found: subject/x.biosignal, where subject is the subject's code and x is any of { acc_chest, acc_wrist, ecg, eda, emg, ppg, temp }. Each file includes the signals recorded by every sensor that acquires the modality after which the file is named, independently of the device. Channels, activities, and time intervals can be easily indexed with the index operator []. A sneak peek of the signals can also be quickly plotted with x.preview.plot(). Any Biosignal can be easily converted to NumPy arrays or DataFrames if needed.
b) CSV files. These can be opened like: x = pandas.read_csv(path). Pandas package: https://pypi.org/project/pandas/. These files are found under the directory csv, named subject.csv, where subject is the subject's code. There is only one file per subject, containing the full session and all biosignal modalities. When read as tables, the time axis is in the first column, each sensor occupies one of the middle columns, and the activity labels are in the last column. Each row contains the samples of each sensor, if any, at a given timestamp; if a sensor has no sample at a timestamp, the acquisition was interrupted for that sensor, which happens between activities and sometimes for short periods during the running activity. The last column of each row contains one or more activity labels if an activity was taking place at that timestamp; multiple annotations are separated by vertical bars (e.g. 'run | sprint'), and if there are no annotations the column is empty for that timestamp.
Both formats include activity annotations; however, the LTBio Biosignal files have better time resolution and also include clinical and demographic data.
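A minimal sketch of working with the CSV format described above; the subject code and the activity label used for filtering are placeholders:

```python
import pandas as pd

df = pd.read_csv("csv/SUBJECT.csv")  # placeholder subject code

label_col = df.columns[-1]  # activity labels are in the last column

# Split multi-label annotations such as 'run | sprint' into lists; empty cells mean no activity.
activities = df[label_col].fillna("").str.split(r"\s*\|\s*")

# Keep only the rows recorded during the 'Walking' activity (label spelling is an assumption).
walking = df[activities.apply(lambda labels: "Walking" in labels)]
print(len(walking), "rows annotated as Walking")
```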
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is the first of the four datasets that we have created for audio-text training tasks. They collect pairs of texts and audio clips, based on the audio-image pairs from our datasets [1, 2, 3], and are intended for research purposes only.
For the conversion, .csv tables were created in which the audio values were split across 16,000 columns and the images were converted into texts using the public BLIP model [4]. The original images are also preserved for future reference.
To allow other researchers to quickly evaluate the potential usefulness of our datasets for their purposes, we have made a public page available where anyone can inspect 60 random samples extracted from all of our data [5].
[1] Jorge E. León. Image-audio pairs (1 of 3). 2024. URL: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-1-of-3.
[2] Jorge E. León. Image-audio pairs (2 of 3). 2024. URL: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-2-of-3.
[3] Jorge E. León. Image-audio pairs (3 of 3). 2024. URL: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-3-of-3.
[4] Junnan Li et al. "BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation". In: arXiv 2201.12086 (2022).
[5] Jorge E. León. AVT Multimodal Dataset. 2024. URL: https://jorvan758.github.io/AVT-Multimodal-Dataset/.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🧠 Visual-TableQA: Open-Domain Benchmark for Reasoning over Table Images
Welcome to Visual-TableQA, a project designed to generate high-quality synthetic question-answer datasets associated with images of tables. This resource is ideal for training and evaluating models on visually grounded table understanding tasks such as document QA, table parsing, and multimodal reasoning.
🚀 Latest Update
We have refreshed the dataset with newly generated QA pairs created by… See the full description on the dataset page: https://huggingface.co/datasets/AI-4-Everyone/Visual-TableQA.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The table presents the evaluation results of the selected formulas using the R2 metric, which measures the goodness of fit between the predicted values and the actual values. The R2 values for all the selected formulas are listed, providing a clear view of the fitting performance of each formula.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This comprehensive dataset contains 15,420 samples collected from 156 athletes over a 6-month monitoring period, designed for predictive modeling of sports injury risk using multimodal sensor data and machine learning techniques.
The dataset enables researchers and data scientists to:
- Predict sports injury risk using multimodal physiological and biomechanical data
- Develop real-time athlete monitoring systems for injury prevention
- Build deep learning models (CNN, LSTM, Transformers) for temporal pattern recognition
- Analyze pre-injury patterns and early warning indicators
- Study relationships between training load, fatigue, and injury occurrence
The target variable injury_occurred has three classes: Healthy, Low Risk, and High Risk/Injured (detailed below).
Physiological features:
| Feature | Unit | Range | Mean ± SD | Description | Sensor Type |
|---|---|---|---|---|---|
heart_rate | bpm | 40-180 | 72.4 ± 18.3 | Cardiovascular stress indicator | Chest-strap HR monitor |
body_temperature | °C | 35.8-39.2 | 37.1 ± 0.6 | Core body temperature | Infrared thermometer |
hydration_level | % | 45-100 | 78.3 ± 12.4 | Fluid balance status | Bioimpedance sensor |
sleep_quality | score | 2-10 | 6.8 ± 1.9 | Recovery quality indicator | Wearable sleep tracker |
recovery_score | score | 25-98 | 68.5 ± 15.2 | Overall recovery status | Composite metric |
stress_level | a.u. | 0.1-0.95 | 0.42 ± 0.18 | Physiological stress level | HRV-based estimate |
Biomechanical features:
| Feature | Unit | Range | Mean ± SD | Description | Sensor Type |
|---|---|---|---|---|---|
muscle_activity | μV | 10-850 | 245.6 ± 127.3 | Muscle activation level | Surface EMG |
joint_angles | degrees | 45-175 | 112.3 ± 28.4 | Joint range of motion | IMU sensors (9-axis) |
gait_speed | m/s | 0.8-3.5 | 1.85 ± 0.52 | Walking/running speed | Motion capture |
cadence | steps/min | 50-200 | 85.7 ± 22.1 | Step frequency | Accelerometer |
step_count | count | 2000-15000 | 7823 ± 2341 | Total steps per session | Pedometer |
jump_height | meters | 0.15-0.85 | 0.48 ± 0.14 | Vertical jump performance | Force plate |
ground_reaction_force | N | 800-2800 | 1654 ± 387 | Impact force during movement | Force plate |
range_of_motion | degrees | 60-180 | 124.5 ± 23.7 | Joint flexibility | Goniometer |
Environmental features:
| Feature | Unit | Range | Mean ± SD | Description |
|---|---|---|---|---|
ambient_temperature | °C | 15-38 | 24.8 ± 5.3 | Training environment temperature |
humidity | % | 30-85 | 58.3 ± 14.2 | Air humidity level |
altitude | meters | 0-1200 | 285 ± 234 | Training location elevation |
playing_surface | categorical | 0-4 | - | Surface type (0=Grass, 1=Turf, 2=Indoor, 3=Track, 4=Other) |
Training-load features:
| Feature | Unit | Range | Mean ± SD | Description |
|---|---|---|---|---|
training_intensity | RPE | 2-10 | 6.4 ± 1.8 | Perceived exertion level |
training_duration | minutes | 30-180 | 87.5 ± 28.3 | Session duration |
training_load | a.u. | 150-1800 | 568 ± 287 | Intensity × Duration |
fatigue_index | score | 15-85 | 48.3 ± 18.7 | Cumulative fatigue measure |
Identifiers and athlete metadata:
| Column | Type | Description |
|---|---|---|
athlete_id | Integer | Unique athlete identifier (1-156) |
session_id | Integer | Session number per athlete |
sport_type | Categorical | Sport discipline (Soccer, Basketball, Track, Other) |
gender | Categorical | Male (68%), Female (32%) |
age | Integer | Athlete age in years (18-35, Mean: 24.3 ± 4.2) |
bmi | Float | Body Mass Index (18.5-28.3, Mean: 23.1 ± 2.4) |
injury_occurred | Integer | Target variable (see below) |
The dataset includes a 3-class target variable, injury_occurred, for injury risk prediction:
| Class | Label | Count | Percentage | Description |
|---|---|---|---|---|
| 0 | Healthy | 9,869 | 64.0% | No injury risk indicators |
| 1 | Low Risk | 3,238 | 21.0% | Elevated fatigue or training load |
| 2 | High Risk/Injured | 2,313 | 15.0% | Injury occurred or imminent risk |
Imbalance Ratio: 4.27:1 (Majority:Minority)
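Given this imbalance, class weighting (or resampling) is typically needed when modelling; a minimal sketch deriving balanced class weights for the 3-class target from the counts above:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Class counts from the table above: Healthy, Low Risk, High Risk/Injured.
y = np.repeat([0, 1, 2], [9869, 3238, 2313])

weights = compute_class_weight(class_weight="balanced", classes=np.array([0, 1, 2]), y=y)
# The resulting dict can be passed as class_weight to many scikit-learn classifiers.
print(dict(zip([0, 1, 2], np.round(weights, 2))))
```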
Injury Definition: ...
table-benchmark/tablebench-tqa dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present a dataset that was created during a user study on the evaluation of explainability of artificial intelligence (AI) at the Jagiellonian University, as a collaborative work of computer science (GEIST team) and information sciences research groups. The main goal of the research was to explore effective explanations of AI model patterns for diverse audiences.
The dataset contains material collected from 39 participants during the interviews conducted by the Information Sciences research group. The participants were recruited from 149 candidates to form three groups that represented domain experts in the field of mycology (DE), students with data science and visualization background (IT) and students from social sciences and humanities (SSH). Each group was given an explanation of a machine learning model trained to predict edible and non-edible mushrooms and asked to interpret the explanations and answer various questions during the interview. The machine learning model and explanations for its decision were prepared by the computer science research team.
The resulting dataset was constructed from the surveys obtained from the candidates, anonymized transcripts of the interviews, the results of the thematic analysis, and the original explanations with modifications suggested by the participants. The dataset is complemented with source code allowing one to reproduce the initial machine learning model and explanations.
The general structure of the dataset is described in the following table. Files whose names contain [RR]_[SS]_[NN] hold the individual results obtained from a particular participant; this prefix is the participant ID.
| File | Description |
|---|---|
| SURVEY.csv | The results from a survey that was filled in by 149 candidates, out of which 39 were selected to form the final group of participants. |
| CODEBOOK.csv | The codebook used in thematic analysis and MAXQDA coding |
| QUESTIONS.csv | List of questions that the participants were asked during interviews. |
| SLIDES.csv | List of slides used in the study with their interpretation and reference to MAXQDA themes and VISUAL_MODIFICATIONS tables. |
| MAXQDA_SUMMARY.csv | Summary of thematic analysis performed with codes used in CODEBOOK for each participant |
| PROBLEMS.csv | List of problems that participants were asked to solve during interviews. They correspond to three instances from the dataset that the participants had to classify using knowledge gained from explanations. |
| PROBLEMS_RESPONSES.csv | Each participant's responses to the problems listed in PROBLEMS.csv |
| VISUALIZATION_MODIFICATIONS.csv | Information on how the order of the slides was modified by the participant, which slides (explanations) were removed, and what kind of additional explanation was suggested. |
| ORIGINAL_VISUZALIZATIONS.pdf | The PDF file containing the visualization of explanations presented to the participants during the interviews |
| VISUALIZATION_MODIFICATIONS.zip | A ZIP archive of PDF files containing the original slides from ORIGINAL_VISUZALIZATIONS.pdf with the modifications suggested by each participant. Each file is a PDF named with the participant ID, i.e. [RR]_[SS]_[NN].pdf |
| TRANSCRIPTS.zip | The anonymized transcripts of the interviews for each participant, zipped into one archive. Each transcript is named after the participant ID, i.e. [RR]_[SS]_[NN].csv, and contains text tagged with the slide number that it relates to, the question number from QUESTIONS.csv, and the problem number from PROBLEMS.csv. |
The detailed structure of the files presented in the previous Table is given in the Technical info section.
The source code used to train the ML model and to generate the explanations is available on GitLab.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
URMAT (Urban Materials Dataset) is a large-scale, multimodal synthetic dataset designed for training and benchmarking material-aware semantic segmentation, scene understanding, and electromagnetic wave simulation tasks in complex urban environments.
The dataset provides pixel-wise annotated images, depth maps, segmentation masks, physical material metadata, and aligned 3D point clouds, all derived from realistic 3D reconstructions of urban scenes including Trastevere, CityLife, Louvre, Canary Wharf, Bryggen, Siemensstadt, and Eixample.
14 material classes: Brick, Glass, Steel, Tiles, Limestone, Plaster, Concrete, Wood, Cobblestone, Slate, Asphalt, Plastic, Gravel, Unknown.
Multimodal data: RGB, depth, material masks, mesh segmentation
Physically annotated metadata: includes permittivity, reflectance, attenuation
7 diverse European city districts, georeferenced and stylistically accurate
Precomputed point clouds for 3D analysis or downstream simulation
Compatible with Unreal Engine, PyTorch, and MATLAB pipelines
At the root of the dataset:
*_mapping/ folders: mapping files, mesh metadata, camera poses
*_pointclouds/ folders: colored 3D point clouds with material labels
train/, val/, test/: standard splits for training and evaluation
Inside each split (train/, val/, test/):
| Folder Name | Description |
|---|---|
rgb/ | RGB images rendered from Unreal Engine |
depth_png/ | Depth maps as grayscale .png (normalized for visualization) |
depth_npy/ | Raw depth arrays saved as .npy |
segmentation_material_png/ | Color-encoded material segmentation masks for visualization |
segmentation_material_npy/ | Material masks in .npy format (integer IDs per pixel, for training) |
segmentation_mesh/ | Optional masks identifying the mesh origin of each pixel |
metadata/ | JSON metadata with material type and physical properties per mesh |
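A minimal sketch of loading one aligned sample from these folders; the file stem and the .png extension for the RGB render are assumptions about the naming, not guaranteed by the dataset:

```python
from pathlib import Path

import numpy as np
from PIL import Image

split = Path("train")
stem = "frame_0001"  # placeholder file stem; actual naming may differ

rgb = np.array(Image.open(split / "rgb" / f"{stem}.png"))
depth = np.load(split / "depth_npy" / f"{stem}.npy")                      # raw depth values
materials = np.load(split / "segmentation_material_npy" / f"{stem}.npy")  # integer material ID per pixel

print(rgb.shape, depth.shape, materials.shape)
print("material IDs present:", np.unique(materials))
```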
Material-aware semantic segmentation
Scene-level reasoning for 3D reconstruction
Ray tracing and wireless signal propagation simulation
Urban AI and Smart City research
Synthetic-to-real generalization studies
If you use URMAT v2 in your research, please cite the dataset.
Paper: "UR-MAT: A Multimodal, Material-Aware Synthetic Dataset of Urban Scenarios" (https://www.researchgate.net/publication/395193944_UR-MAT_A_Multimodal_Material-Aware_Synthetic_Dataset_of_Urban_Scenarios) - to appear in ACM Multimedia 2025, Dataset Track
https://www.wiseguyreports.com/pages/privacy-policy
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 4.49 (USD Billion) |
| MARKET SIZE 2025 | 5.59 (USD Billion) |
| MARKET SIZE 2035 | 50.0 (USD Billion) |
| SEGMENTS COVERED | Application, Deployment Model, End Use Industry, Model Type, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Technological advancements, Increasing data availability, Rising demand for automation, Enhancing user experience, Competitive landscape growth |
| MARKET FORECAST UNITS | USD Billion |
| KEY COMPANIES PROFILED | Adobe, OpenAI, Baidu, Microsoft, Google, C3.ai, Meta, Tencent, SAP, IBM, Amazon, Hugging Face, Alibaba, Salesforce, Nvidia |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Natural language processing integration, Enhanced personalization in services, Advanced healthcare applications, Smart automation in industries, Scalable cloud-based solutions |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 24.5% (2025 - 2035) |
Description: DefComm-DB comprises 261 genuine, non-acted dialogues between English-speaking individuals in 'real-world' settings, each featuring one of the four defensive behaviours outlined in Birkenbihl's model of communication failures (Attack, Flight, Greater, Smaller) [1].
[1] Birkenbihl, V. (2013). Kommunikationstraining: Zwischenmenschliche Beziehungen erfolgreich gestalten. Schritte 1–6. mvg Verlag.
Key statistics on the dataset are provided in Table 1. DefComm features a variety of video topics, including interviews with celebrities and professional athletes, political debates, legal trials, TV shows, and video footage obtained by paparazzi, among others. The situations, number of participants, gender, age, and ethnicity vary from scene to scene.
From each video, we retrieve audio, visual, and textual modalities. In this paper, we focus on the audio modality and the speech transcriptions.
| Label | # video clips | μ [s] | σ [s] | min [s] | max [s] | Σ duration [s] |
|---|---|---|---|---|---|---|
| Attack | 112 | 8 | 9 | 2 | 46 | 949 |
| Flight | 57 | 9 | 8 | 2 | 62 | 494 |
| Greater | 45 | 9 | 6 | 2 | 25 | 416 |
| Smaller | 47 | 12 | 8 | 3 | 49 | 556 |
| Total | 261 | 9 | 8 | 2 | 62 | 2415 |
Coverage data for patient profiling