Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious-actor synthetic data. The CERT Division, in partnership with ExactData, LLC, and under sponsorship from DARPA I2O, generated the collection; each dataset provides both synthetic background data and data from synthetic malicious actors. For more background on this data, please see the paper "Bridging the Gap: A Pragmatic Approach to Generating Insider Threat Data."

Datasets are organized according to the data generator release that created them. Most releases include multiple datasets (e.g., r3.1 and r3.2), and later releases generally include a superset of the data generation functionality of earlier releases. Each dataset file contains a readme that provides detailed notes about the features of that release. The answer key file answers.tar.bz2 contains the details of the malicious activity included in each dataset, including descriptions of the scenarios enacted and the identifiers of the synthetic users involved.
This data consists of 640 black-and-white face images of people taken with varying pose (straight, left, right, up), expression (neutral, happy, sad, angry), eyes (wearing sunglasses or not), and size.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
CMU-MOSEI: Computational Sequences (Unofficial Mirror)
This repository provides a mirror of the official computational sequence files from the CMU-MOSEI dataset, which are required for multimodal sentiment and emotion research. The original download links are currently down, so this mirror is provided for the research community.
Note: This is an unofficial mirror. All data originates from Carnegie Mellon University and the original authors. If you are a dataset creator and want this… See the full description on the dataset page: https://huggingface.co/datasets/reeha-parkar/cmu-mosei-comp-seq.
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
| Column | Description |
|---|---|
| id | file id (string) |
| file_path | file path to .wav file (string) |
| speech | transcription of the audio file (string) |
| speaker | speaker name; use this as the target variable for audio classification (string) |
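As an orientation aid, here is a minimal loading sketch. It assumes the columns above ship as a CSV file (the name metadata.csv is hypothetical) and that the file_path entries resolve from the working directory.

```python
# Minimal sketch: load the metadata table and one audio file.
# Assumption: the columns above are shipped as a CSV; "metadata.csv" is a
# hypothetical file name, and .wav paths are assumed relative to the cwd.
import pandas as pd
import soundfile as sf

meta = pd.read_csv("metadata.csv")          # columns: id, file_path, speech, speaker
print(meta["speaker"].value_counts())       # class balance if used for classification

row = meta.iloc[0]
audio, sample_rate = sf.read(row["file_path"])
print(row["id"], row["speech"], audio.shape, sample_rate)
```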
The CMU_ARCTIC databases were constructed at the Language Technologies Institute at Carnegie Mellon University as phonetically balanced, US-English single-speaker databases designed for unit selection speech synthesis research. A detailed report on the structure and content of the database, the recording environment, etc., is available as Carnegie Mellon University Language Technologies Institute Tech Report CMU-LTI-03-177 and is also available here.
The databases consist of around 1150 utterances carefully selected from out-of-copyright texts from Project Gutenberg. The databases include US English male (bdl) and female (slt) speakers (both experienced voice talent) as well as other accented speakers.
The 1132-sentence prompt list is available in cmuarctic.data.
The distributions include 16 kHz waveforms and simultaneous EGG signals. Full phonetic labeling was performed with CMU Sphinx using the FestVox-based labeling scripts. Complete, runnable Festival voices are included with the database distributions as examples, though better voices can be made by improving the labeling.
This work was partially supported by the U.S. National Science Foundation under Grant No. 0219687, "ITR/CIS Evaluation and Personalization of Synthetic Voices". Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Speaker embeddings extracted from CMU ARCTIC
There is one .npy file for each utterance in the dataset, 7931 files in total. The speaker embeddings are 512-element X-vectors. The CMU ARCTIC dataset divides the utterances among the following speakers:
bdl (US male), slt (US female), jmk (Canadian male), awb (Scottish male), rms (US male), clb (US female), ksp (Indian male)
The X-vectors were extracted using this script, which uses the speechbrain/spkrec-xvect-voxceleb model. Usage: from… See the full description on the dataset page: https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors.
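For orientation, a minimal sketch of working with one of the .npy files follows; the file name is hypothetical, and the cosine-similarity helper is just one common way to compare two speaker embeddings.

```python
# Minimal sketch: load one speaker embedding and sanity-check its shape.
# The file name below is illustrative; each utterance has its own .npy file
# containing a 512-element X-vector.
import numpy as np

xvector = np.load("cmu_us_bdl_arctic-wav-arctic_a0001.npy")  # hypothetical file name
assert xvector.shape == (512,)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Common way to compare how similar two speaker embeddings are.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```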
The performance and application support of all visible APs were measured at 13 hotspot locations around University Avenue in Seattle, WA, near the University of Washington, over the course of one week. Contact: jeffpang@cs.cmu.edu
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This experiment was performed in order to empirically measure the energy use of small, electric Unmanned Aerial Vehicles (UAVs). We autonomously direct a DJI® Matrice 100 (M100) drone to take off, carry a range of payload weights on a triangular flight pattern, and land. Between flights, we varied specified parameters through a set of discrete options: payload of 0 g, 250 g, and 500 g; altitude during cruise of 25 m, 50 m, 75 m, and 100 m; and speed during cruise of 4 m/s, 6 m/s, 8 m/s, 10 m/s, and 12 m/s. We simultaneously collect data from a broad array of on-board sensors:

* Wind sensor: FT Technologies FT205 UAV-mountable, pre-calibrated ultrasonic wind sensor with an accuracy of ±0.1 m/s and a refresh rate of 10 Hz.
* Position: 3DM-GX5-45 GNSS/INS sensor pack. These sensors use a built-in Kalman filtering system to fuse the GPS and IMU data. The sensor has a maximum output rate of 10 Hz with accuracy of ±2 m RMS horizontal and ±5 m RMS vertical.
* Current and voltage: Mauch Electronics PL-200 sensor. This sensor can record currents up to 200 A and voltages up to 33 V. Analogue readings from the sensor were converted into a digital format using an 8-channel, 17-bit analogue-to-digital converter (ADC).

Data syncing and recording were handled using the Robot Operating System (ROS) running on a low-power Raspberry Pi Zero W, with data recorded on the Raspberry Pi's microSD card. The data provided by each sensor were synchronized to a frequency of approximately 5 Hz using the ApproximateTime message filter policy of ROS. The number of flights performed varying the operational parameters (payload, altitude, speed) was 196. In addition, 13 recordings were made to assess the drone's ancillary power and hover conditions.
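The ~5 Hz synchronization mentioned above relies on ROS's ApproximateTime message-filter policy. Below is a minimal Python sketch of that pattern; the topic names and message types are illustrative assumptions, not the drone's actual configuration.

```python
#!/usr/bin/env python
# Minimal sketch of approximate-time sensor fusion in ROS 1.
# Topic names and message types are assumptions for illustration only.
import rospy
import message_filters
from sensor_msgs.msg import NavSatFix, BatteryState
from geometry_msgs.msg import Vector3Stamped

def synced_callback(gps_msg, power_msg, wind_msg):
    # All three messages arrive with timestamps within the allowed slop.
    rospy.loginfo("t=%.2f  lat=%.6f  V=%.2f  wind_x=%.2f",
                  gps_msg.header.stamp.to_sec(), gps_msg.latitude,
                  power_msg.voltage, wind_msg.vector.x)

rospy.init_node("uav_logger_sketch")
gps_sub = message_filters.Subscriber("/gnss/fix", NavSatFix)           # assumed topic
power_sub = message_filters.Subscriber("/power/battery", BatteryState)  # assumed topic
wind_sub = message_filters.Subscriber("/wind/velocity", Vector3Stamped) # assumed topic

# slop=0.2 s roughly matches a 5 Hz output rate; queue_size buffers unmatched messages.
sync = message_filters.ApproximateTimeSynchronizer(
    [gps_sub, power_sub, wind_sub], queue_size=20, slop=0.2)
sync.registerCallback(synced_callback)
rospy.spin()
```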
This dataset was created by Quang Trung Hồ.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an index of a subset of the CMU motion capture database. The corresponding *.bvh files can be downloaded from http://mocap.cs.cmu.edu/. Please cite this paper: Qinkun Xiao, Junfang Li, Qinhan Xiao, "Human Motion Capture Data Retrieval Based on Quaternion and EMD", International Conference on Intelligent Human-Machine Systems and Cybernetics, vol. 1, 2013, pp. 517–520.
The field of biomechanics is at a turning point, with marker-based motion capture set to be replaced by portable and inexpensive hardware, rapidly improving markerless tracking algorithms, and open datasets that will turn these new technologies into field-wide team projects. To expedite progress in this direction, we have collected the CMU Panoptic Dataset 2.0, which contains 86 subjects captured with 140 VGA cameras, 31 HD cameras, and 15 IMUs, performing on average 6.5 min of activities, including range of motion activities and tasks of daily living.
Data: The data are now available under Downloads > Data Share.
Code: The Fitting algorithm is published on Github: https://github.com/CMU-MBL/CMU_PanopticDataset_2.0
Citation: If you find this code or our data useful for your research, please cite the following paper:

@article{HALILAJ2021110650,
  title   = {American society of biomechanics early career achievement award 2020: Toward portable and modular biomechanics labs: How video and IMU fusion will change gait analysis},
  journal = {Journal of Biomechanics},
  volume  = {129},
  pages   = {110650},
  year    = {2021},
  issn    = {0021-9290},
  doi     = {https://doi.org/10.1016/j.jbiomech.2021.110650},
  url     = {https://www.sciencedirect.com/science/article/pii/S002192902100419X},
  author  = {Eni Halilaj and Soyong Shin and Eric Rapp and Donglai Xiang},
}
This project includes the following software/data packages:
Flourish/VL-CMU-CD dataset hosted on Hugging Face and contributed by the HF Datasets community.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
CMU-MOSEI is a comprehensive multimodal dataset designed to analyze emotions and sentiment in online videos. It's a valuable resource for researchers and developers working on automatic emotion recognition and sentiment analysis.
Key Features: Over 23,500 video clips from 1000+ speakers, covering diverse topics and monologues.
Multimodal data:
* Acoustics: features extracted from audio (CMU_MOSEI_COVAREP.csd)
* Labels: annotations for sentiment intensity and emotion categories (CMU_MOSEI_Labels.csd)
* Language: phonetic, word-level, and word vector representations (CMU_MOSEI_*.csd files under the languages folder)
* Visuals: features extracted from facial expressions (CMU_MOSEI_Visual*.csd files under the visuals folder)
Balanced for gender: The dataset ensures equal representation from male and female speakers.
Unlocking Insights: By exploring the various modalities within CMU-MOSEI, researchers can investigate the relationship between speech, facial expressions, and emotions expressed in online videos.
Download: The dataset is freely available for download at: http://immortal.multicomp.cs.cmu.edu/CMU-MOSEI/
Start exploring the world of emotions in videos with CMU-MOSEI!
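The modality files listed above are distributed as computational sequence (.csd) containers. As a rough starting point, the sketch below simply inspects one such file with h5py, on the assumption that .csd files are HDF5 containers; the CMU Multimodal SDK is the supported way to read them, so verify the layout there.

```python
# Minimal sketch: peek inside a downloaded computational sequence file.
# Assumption: .csd files are HDF5 containers readable with h5py; the exact
# internal layout should be confirmed against the CMU Multimodal SDK.
import h5py

with h5py.File("CMU_MOSEI_COVAREP.csd", "r") as f:
    # Print every group/dataset path so the structure can be explored
    # before committing to a particular parsing scheme.
    f.visit(print)
```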
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
In this study, subjects controlled an EEG-based BCI using motor execution or motor imagery of the fingers of their dominant hand to control the corresponding finger motions of a robotic hand in real time. Twenty-one right-handed subjects participated in one offline and two online sessions each for finger motor execution and motor imagery tasks. Sixteen of the twenty-one subjects completed three more motor imagery online sessions and two more online sessions with smoothed robotic control, one for finger motor execution tasks and one for finger motor imagery tasks. The dataset includes EEG recordings and real-time finger decoding results for each subject during multiple sessions.

A detailed description of the study can be found in the following publication: Ding, Y., Udompanyawit, C., Zhang, Y., & He, B. (2025). EEG-based brain-computer interface enables real-time robotic hand control at individual finger level. Nature Communications, 16(1), 5401. https://doi.org/10.1038/s41467-025-61064-x. If you use a part of this dataset in your work, please cite the above publication.

This dataset was collected under support from the National Institutes of Health via grants NS124564, NS131069, NS127849, and NS096761 to Dr. Bin He. Correspondence about the dataset: Dr. Bin He, Carnegie Mellon University, Department of Biomedical Engineering, Pittsburgh, PA 15213. E-mail: bhe1@andrew.cmu.edu
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Participants
Sixty participants (28 male, 32 female) were recruited locally from the Pittsburgh, Pennsylvania area as well as the U.S. Army Research Laboratory in Aberdeen, Maryland. Participants were neurologically healthy adults with no history of head trauma, neurological pathology, or psychological pathology. Participant ages ranged from 18 to 45 years old (mean age, 26.5 years). The study protocol for acquiring the human subjects data was reviewed and approved by the IRB at Carnegie Mellon University, and written informed consent was obtained for all participants. As the present work uses de-identified human data from the original CMU study, the Penn IRB deemed this study exempt from the requirement for ethical review.

MRI acquisition
All 60 participants were scanned at the Scientific Imaging and Brain Research Center at Carnegie Mellon University on a Siemens Verio 3T magnet fitted with a 32-channel head coil. An MPRAGE sequence was used to acquire a high-resolution (1 mm3 isotropic voxels, 176 slices) T1-weighted brain image for all participants. DSI data were acquired following the fMRI sequences using a 50 min, 257-direction, twice-refocused spin-echo EPI sequence with multiple q values (TR = 11,400 ms, TE = 128 ms, voxel size 2.4 mm3, field of view 231 x 231 mm, b-max 5000 s/mm2, 51 slices). Resting state fMRI (rsfMRI) data consisting of 210 T2*-weighted volumes were collected for each participant (56 participants) with a BOLD-contrast echo planar imaging (EPI) sequence (TR 2000 ms, TE 29 ms, voxel size 3.5 mm3, field of view 224 x 224 mm, flip angle 79 degrees).

Head motion is a major source of artifact in rsfMRI data. Although recently developed motion correction algorithms are far more effective than typical procedures, head motion was additionally minimized during image acquisition with a custom foam padding setup designed to minimize the variance of head motion along the pitch and yaw directions. The setup also included a chin restraint that held the participant's head to the receiving coil itself. Preliminary inspection of EPI images at the imaging center showed that the setup limited resting head motion to 1 mm maximum deviation for most subjects. Only 3 out of 56 subjects were excluded from the final analysis because they moved more than 2 voxels multiple times throughout the imaging session.

Diffusion MRI reconstruction
DSI Studio (http://dsi-studio.labsolver.org) was used to process all DSI images using a q-space diffeomorphic reconstruction method (Yeh et al., 2011). A nonlinear spatial normalization approach (Ashburner et al., 1999) was implemented through 16 iterations to obtain the spatial mapping function of quantitative anisotropy (QA) values from individual subject diffusion space to the FMRIB 1 mm fractional anisotropy (FA) atlas template. QA is an orientation distribution function (ODF) based index that is scaled with spin density information, which permits the removal of isotropic diffusion components from the ODF to filter false peaks, facilitating the resolution of fiber tracts using deterministic fiber tracking algorithms. For a detailed description and comparison of QA with standard FA techniques, see Yeh et al. (2013). The ODFs were reconstructed to a spatial resolution of 2 mm3 with a diffusion sampling length ratio of 1.25. Whole-brain ODF maps of all 60 subjects were averaged together to generate a template image of the average tractography space.

Fiber tractography analysis
Fiber tractography was performed using an ODF-streamline version of the FACT algorithm (Yeh et al., 2013) in DSI Studio, using the builds from September 23, 2013 and August 29, 2014. All fiber tractography was initiated from seed positions with random locations within the whole-brain seed mask and with random initial fiber orientations. Using a step size of 1 mm, the directional estimates of fiber progression within each voxel were weighted by 80% of the incoming fiber direction and 20% of the previous fiber direction. A streamline was terminated when the QA index fell below 0.05 or the turning angle exceeded 75 degrees. We performed region-based tractography to isolate streamlines between pairs of regional masks. All cortical masks were selected from an upsampled version of the original Automated Anatomical Labeling (AAL) atlas (Tzourio et al., 2002; Desikan et al., 2006) containing 90 cortical and subcortical regions of interest but no cerebellar structures or brainstem. This resampled version contains 600 regions and was created via a series of upsampling steps in which any given region is bisected perpendicular to its principal spatial axis to create two equally sized sub-regions (Hermundstad et al., 2014). The final atlas contained regions with an average size of 268 voxels and a standard deviation of 35 voxels. Diffusion-based tractography has been shown to exhibit a strong medial bias (Croxson et al., 2005) due to partial volume effects and poor resolution of complex fiber crossings (Jones et al., 2010). To counter the bias away from more lateral cortical regions, tractography was generated for each cortical surface mask separately.

Resting state fMRI preprocessing
SPM8 (Wellcome Department of Imaging Neuroscience, London) was used to preprocess all rsfMRI data collected from 53 of the 60 participants with DSI data. To estimate the normalization transformation for each EPI image, the mean EPI image was first selected as a source image and weighted by its mean across all volumes. Then, an MNI-space EPI template supplied with SPM was selected as the target image for normalization. The source image smoothing kernel was set to a FWHM of 4 mm, and all other estimation options were kept at the SPM8 defaults to generate a transformation matrix that was applied to each volume of the individual source images for further analyses. The time series was up-sampled to a 1 Hz TR using cubic-spline interpolation. Regions from the AAL600 atlas were used as seed points for the functional connectivity analysis (Hermundstad et al., 2014). A series of custom MATLAB functions were used to extract the voxel time series of activity for each region and to remove estimated noise from the time series by selecting the first five principal components from the white matter and CSF masks.

Dataset
Here we provide the streamline count matrices of all 60 participants in CMU_SC.zip, which contains one .mat file per participant. We also provide the average ROI BOLD fMRI time series of the 53 participants with low head motion in the CMU_BOLD.mat file.
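Since the structural connectivity matrices ship as per-participant .mat files and the BOLD time series as a single CMU_BOLD.mat, a minimal SciPy loading sketch is given below. The per-participant file name and the variable names inside the .mat files are assumptions, so the keys are printed before use.

```python
# Minimal sketch: inspect the provided MATLAB files.
# The file names inside CMU_SC.zip and the variable names stored in the .mat
# files are not documented above, so we list the keys before using them.
from scipy.io import loadmat

sc = loadmat("CMU_SC/participant_01.mat")   # hypothetical per-participant file name
print([k for k in sc if not k.startswith("__")])

bold = loadmat("CMU_BOLD.mat")              # average ROI BOLD time series (53 participants)
print([k for k in bold if not k.startswith("__")])
```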
Traffic analytics, rankings, and competitive metrics for cmu.edu as of September 2025.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Round-trip time-of-flight measurements from a supermarket.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an averaged template of a 257-direction DSI dataset that has been reconstructed using QSDR for use in DSI Studio. For more information, visit: http://www.psy.cmu.edu/~coaxlab/?page_id=305
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the segmentation dataset for CMU F1TENTH Team 1's final project. In this project, we set out to detect F1TENTH track walls and opponent cars in the RealSense camera frames, apply semantic segmentation, and then generate a Birds-Eye View (BEV) occupancy grid.
This dataset includes images from both our own ROS bags and images provided by the University of Pennsylvania as part of the F1TENTH vision lab. Note that the annotations for this dataset could also be used to train an instance segmentation model, which we did and uploaded to Roboflow; that version is attached to the second version of this dataset and is also available for download here.
You can find the ONNX weights for the trained UNET model (semantic segmentation) here. This model was trained on the second version of our dataset (V2) and achieved 93% recall, 97% precision, an F1 score of 0.95, and a mean IoU of 91%. We chose to train this semantic segmentation model in addition to the YOLOv8 instance segmentation model because UNET is supported by the NVIDIA Isaac ROS image segmentation package, which offers more efficient, optimized inference on Jetson platforms.
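As an illustration of how the exported weights might be run, here is a hedged ONNX Runtime sketch. The file name, input resolution, normalization, and output layout are assumptions rather than the project's documented pipeline; check the printed input signature before relying on them.

```python
# Minimal sketch: run exported UNET segmentation weights with ONNX Runtime.
# File name, input size/scaling, and output layout are assumptions.
import numpy as np
import onnxruntime as ort
import cv2

session = ort.InferenceSession("unet_f1tenth.onnx")       # hypothetical file name
inp = session.get_inputs()[0]
print(inp.name, inp.shape)                                 # inspect expected input shape

image = cv2.imread("frame.png")                            # a RealSense camera frame
image = cv2.resize(image, (512, 512)).astype(np.float32) / 255.0  # assumed size/scaling
tensor = np.transpose(image, (2, 0, 1))[np.newaxis, ...]   # HWC -> NCHW

logits = session.run(None, {inp.name: tensor})[0]
mask = np.argmax(logits, axis=1)[0]                        # per-pixel class ids (assumed NCHW output)
```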
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Machine and deep learning solutions have become the standard. However
This is a raw CSV version of the CMU MoCap dataset subset used in [Zhou et al., 2019]. No windowing, striding, or normalisation has been applied to the data. For more information concerning the structure of the data, please see the BeatGAN repo.
All of the data was concatenated into a single CSV file, data.csv. The accompanying labels.csv provides the label for each data sample.
Zhou et al. use three classes in their dataset. The first class, walking (labelled 0), is considered the normal class, while jogging (labelled 1) and jumping (labelled 2) are considered abnormal classes.
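A minimal loading sketch is shown below; it assumes the first column of labels.csv holds the integer class label, which is an assumption rather than a documented guarantee.

```python
# Minimal sketch: load the concatenated samples and their class labels.
# Assumption: the first column of labels.csv holds the integer label
# (0 = walking, 1 = jogging, 2 = jumping); the column layout of data.csv is
# not described above, so it is left untouched.
import pandas as pd

data = pd.read_csv("data.csv")
labels = pd.read_csv("labels.csv")

class_names = {0: "walking (normal)", 1: "jogging (abnormal)", 2: "jumping (abnormal)"}
print(labels.iloc[:, 0].map(class_names).value_counts())
print(data.shape)
```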
Original authors: This data is free for use in research projects. You may include this data in commercially sold products, but you may not resell this data directly, even in converted form. If you publish results obtained using this data, we would appreciate it if you would send the citation of your published paper to jkh+mocap@cs.cmu.edu and also add this text to your acknowledgments section: "The data used in this project was obtained from mocap.cs.cmu.edu. The database was created with funding from NSF EIA-0196217."