100+ datasets found
  1. Insider Threat Test Dataset

    • kilthub.cmu.edu
    txt
    Updated May 30, 2023
    + more versions
    Cite
    Brian Lindauer (2023). Insider Threat Test Dataset [Dataset]. http://doi.org/10.1184/R1/12841247.v1
    Explore at:
    txt (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    Carnegie Mellon University
    Authors
    Brian Lindauer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious-actor synthetic data. The CERT Division, in partnership with ExactData, LLC, and under sponsorship from DARPA I2O, generated the collection; the datasets provide both synthetic background data and data from synthetic malicious actors. For more background on this data, please see the paper "Bridging the Gap: A Pragmatic Approach to Generating Insider Threat Data."

    Datasets are organized according to the data generator release that created them. Most releases include multiple datasets (e.g., r3.1 and r3.2), and later releases generally include a superset of the data generation functionality of earlier releases. Each dataset file contains a readme that provides detailed notes about the features of that release. The answer key file answers.tar.bz2 contains the details of the malicious activity included in each dataset, including descriptions of the scenarios enacted and the identifiers of the synthetic users involved.
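    The answer key is a bzip2-compressed tar archive, so it can be unpacked with the Python standard library alone. A minimal sketch; the internal per-release folder layout is an assumption, not documented above:

      import tarfile
      from pathlib import Path

      # Extract the answer key downloaded from the dataset page.
      with tarfile.open("answers.tar.bz2", mode="r:bz2") as tar:
          tar.extractall(path="answers")

      # List the extracted files, e.g. per-release scenario descriptions
      # and synthetic-user identifiers (exact layout is an assumption).
      for path in sorted(Path("answers").rglob("*")):
          print(path)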

  2. CMU Face Images

    • kaggle.com
    zip
    Updated Sep 3, 2023
    Cite
    RAVI PRAKASH SRIVASTAVA (2023). CMU Face Images [Dataset]. https://www.kaggle.com/datasets/raviprakash22/cmu-face-images
    Explore at:
    zip (22363697 bytes; available download formats)
    Dataset updated
    Sep 3, 2023
    Authors
    RAVI PRAKASH SRIVASTAVA
    Description

    This dataset consists of 640 black-and-white face images of people taken with varying pose (straight, left, right, up), expression (neutral, happy, sad, angry), eyes (wearing sunglasses or not), and size.

  3. cmu-mosei-comp-seq

    • huggingface.co
    Updated Jun 30, 2025
    Cite
    Reeha Parkar (2025). cmu-mosei-comp-seq [Dataset]. https://huggingface.co/datasets/reeha-parkar/cmu-mosei-comp-seq
    Explore at:
    Dataset updated
    Jun 30, 2025
    Authors
    Reeha Parkar
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    CMU-MOSEI: Computational Sequences (Unofficial Mirror)

    This repository provides a mirror of the official computational sequence files from the CMU-MOSEI dataset, which are required for multimodal sentiment and emotion research. The original download links are currently down, so this mirror is provided for the research community.

    Note: This is an unofficial mirror. All data originates from Carnegie Mellon University and original authors. If you are a dataset creator and want this… See the full description on the dataset page: https://huggingface.co/datasets/reeha-parkar/cmu-mosei-comp-seq.

  4. Speaker Recognition - CMU ARCTIC

    • kaggle.com
    zip
    Updated Nov 21, 2022
    Cite
    Gabriel Lins (2022). Speaker Recognition - CMU ARCTIC [Dataset]. https://www.kaggle.com/datasets/mrgabrielblins/speaker-recognition-cmu-arctic
    Explore at:
    zip (1354293783 bytes; available download formats)
    Dataset updated
    Nov 21, 2022
    Authors
    Gabriel Lins
    License

    CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description
    • Can you predict which speaker is talking?
    • Can you predict what they are saying?

    This dataset makes both possible. Perfect for a school project, research project, or resume builder.

    File information

    • train.csv - file containing all the data you need for training, with 4 columns: id (file id), file_path (path to the .wav file), speech (transcription of the audio file), and speaker (target column)
    • test.csv - file containing all the data you need to test your model (20% of the total audio files); it has the same columns as train.csv
    • train/ - folder with the training data, subdivided into per-speaker folders
      • aew/ - folder containing audio files in .wav format for speaker aew
      • ...
    • test/ - folder containing audio files for the test data

    Column description

    • id - file id (string)
    • file_path - file path to the .wav file (string)
    • speech - transcription of the audio file (string)
    • speaker - speaker name; use this as the target variable if you are doing audio classification (string)
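    The CSV plus .wav layout above can be read with pandas and SciPy. A minimal sketch, assuming the script runs from the dataset root and the column names listed above:

      import pandas as pd
      from scipy.io import wavfile

      train = pd.read_csv("train.csv")
      print(train[["id", "speaker"]].head())

      # Load the first audio file referenced by the CSV; file_path is relative
      # to the dataset root per the file information above.
      rate, samples = wavfile.read(train.loc[0, "file_path"])
      print(rate, len(samples), train.loc[0, "speaker"])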

    More Details

    The CMU_ARCTIC databases were constructed at the Language Technologies Institute at Carnegie Mellon University as phonetically balanced, US-English single-speaker databases designed for unit selection speech synthesis research. A detailed report on the structure and content of the databases and the recording environment is available as Carnegie Mellon University Language Technologies Institute Tech Report CMU-LTI-03-177 and is also available here.

    The databases consist of around 1150 utterances carefully selected from out-of-copyright texts from Project Gutenberg. The databases include US English male (bdl) and female (slt) speakers (both experienced voice talent) as well as other accented speakers.

    The 1132-sentence prompt list is available from cmuarctic.data.

    The distributions include 16 kHz waveforms and simultaneous EGG signals. Full phonetic labeling was performed with CMU Sphinx using the FestVox-based labeling scripts. Complete runnable Festival voices are included with the database distributions as examples, though better voices can be made by improving the labeling, etc.

    Acknowledgements

    This work was partially supported by the U.S. National Science Foundation under Grant No. 0219687, "ITR/CIS Evaluation and Personalization of Synthetic Voices". Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

  5. cmu-arctic-xvectors

    • huggingface.co
    • opendatalab.com
    Updated Feb 7, 2023
    + more versions
    Cite
    Matthijs Hollemans (2023). cmu-arctic-xvectors [Dataset]. https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 7, 2023
    Authors
    Matthijs Hollemans
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Speaker embeddings extracted from CMU ARCTIC

    There is one .npy file for each utterance in the dataset, 7931 files in total. The speaker embeddings are 512-element X-vectors. The CMU ARCTIC dataset divides the utterances among the following speakers:

    bdl (US male), slt (US female), jmk (Canadian male), awb (Scottish male), rms (US male), clb (US female), ksp (Indian male)

    The X-vectors were extracted using this script, which uses the speechbrain/spkrec-xvect-voxceleb model. Usage: from… See the full description on the dataset page: https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors.
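    A minimal loading sketch, assuming the mirror exposes a "validation" split with "filename" and "xvector" fields (as in the commonly circulated SpeechT5 usage example); check the dataset card if the split or field names differ:

      from datasets import load_dataset
      import numpy as np

      ds = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
      example = ds[0]
      xvec = np.asarray(example["xvector"])
      print(example["filename"], xvec.shape)  # expected: a 512-element embedding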

  6. The CMU/Hotspot Dataset

    • impactcybertrust.org
    Updated Jan 21, 2019
    Cite
    External Data Source (2019). The CMU/Hotspot Dataset [Dataset]. http://doi.org/10.23721/100/1478821
    Explore at:
    Dataset updated
    Jan 21, 2019
    Authors
    External Data Source
    Description

    The performance and application support of all visible APs were measured at 13 hotspot locations around University Avenue, Seattle, WA, near the University of Washington, over the course of one week. Contact: jeffpang@cs.cmu.edu

  7. Data Collected with Package Delivery Quadcopter Drone

    • kilthub.cmu.edu
    • opendatalab.com
    txt
    Updated May 27, 2021
    Cite
    Thiago A. Rodrigues; Jay Patrikar; Arnav Choudhry; Jacob Feldgoise; Vaibhav Arcot; Aradhana Gahlaut; Sophia Lau; Brady Moon; Bastian Wagner; H Scott Matthews; Sebastian Scherer; Constantine Samaras (2021). Data Collected with Package Delivery Quadcopter Drone [Dataset]. http://doi.org/10.1184/R1/12683453.v1
    Explore at:
    txt (available download formats)
    Dataset updated
    May 27, 2021
    Dataset provided by
    Carnegie Mellon University
    Authors
    Thiago A. Rodrigues; Jay Patrikar; Arnav Choudhry; Jacob Feldgoise; Vaibhav Arcot; Aradhana Gahlaut; Sophia Lau; Brady Moon; Bastian Wagner; H Scott Matthews; Sebastian Scherer; Constantine Samaras
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This experiment was performed in order to empirically measure the energy use of small, electric Unmanned Aerial Vehicles (UAVs). We autonomously direct a DJI Matrice 100 (M100) drone to take off, carry a range of payload weights on a triangular flight pattern, and land. Between flights, we varied specified parameters through a set of discrete options: payload of 0 g, 250 g, and 500 g; altitude during cruise of 25 m, 50 m, 75 m, and 100 m; and speed during cruise of 4 m/s, 6 m/s, 8 m/s, 10 m/s, and 12 m/s. We simultaneously collect data from a broad array of on-board sensors:

    • Wind sensor: FT Technologies FT205 UAV-mountable, pre-calibrated ultrasonic wind sensor with an accuracy of ±0.1 m/s and a refresh rate of 10 Hz.
    • Position: 3DM-GX5-45 GNSS/INS sensor pack. These sensors use a built-in Kalman filtering system to fuse the GPS and IMU data. The sensor has a maximum output rate of 10 Hz with an accuracy of ±2 m RMS horizontal and ±5 m RMS vertical.
    • Current and voltage: Mauch Electronics PL-200 sensor. This sensor can record currents up to 200 A and voltages up to 33 V. Analogue readings from the sensor were converted into a digital format using an 8-channel, 17-bit analogue-to-digital converter (ADC).

    Data syncing and recording were handled using the Robot Operating System (ROS) running on a low-power Raspberry Pi Zero W, with data recorded on the Raspberry Pi's microSD card. The data provided by each sensor were synchronized to a frequency of approximately 5 Hz using the ApproximateTime message filter policy of ROS. The number of flights performed varying the operational parameters (payload, altitude, speed) was 196. In addition, 13 recordings were made to assess the drone's ancillary power and hover conditions.
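    Since each flight log pairs battery voltage and current samples at roughly 5 Hz, the total electrical energy per flight can be estimated by integrating P = V * I over time. A minimal sketch; the file name and the "time", "voltage", and "current" column names are assumptions about the released export, not documented above:

      import pandas as pd

      log = pd.read_csv("flight_001.csv")            # hypothetical per-flight export
      power_w = log["voltage"] * log["current"]      # instantaneous power in watts
      dt_s = log["time"].diff().fillna(0.0)          # sample spacing in seconds (~0.2 s at 5 Hz)
      energy_wh = (power_w * dt_s).sum() / 3600.0    # integrate and convert joules to watt-hours
      print(f"estimated energy used: {energy_wh:.1f} Wh")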

  8. CMU-MOSEI

    • kaggle.com
    zip
    Updated May 16, 2023
    + more versions
    Cite
    Quang Trung Hồ (2023). CMU-MOSEI [Dataset]. https://www.kaggle.com/datasets/gnurtqh/cmu-mosei
    Explore at:
    zip (3286247955 bytes; available download formats)
    Dataset updated
    May 16, 2023
    Authors
    Quang Trung Hồ
    Description

    Dataset

    This dataset was created by Quang Trung Hồ


  9. A subset of CMU motion capture database

    • figshare.com
    bin
    Updated Jun 5, 2023
    Cite
    Qinkun Xiao (2023). A subset of CMU motion capture database [Dataset]. http://doi.org/10.6084/m9.figshare.3773109.v2
    Explore at:
    bin (available download formats)
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Qinkun Xiao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is an index of a subset of the CMU motion capture database. The corresponding '*.bvh' files can be downloaded from http://mocap.cs.cmu.edu/. Please cite this paper: Qinkun Xiao, Junfang Li, Qinhan Xiao, "Human Motion Capture Data Retrieval Based on Quaternion and EMD," International Conference on Intelligent Human-Machine Systems and Cybernetics, v. 1, 2013, pp. 517-520.

  10. CMU Panoptic Dataset 2.0

    • simtk.org
    Updated Feb 23, 2021
    Cite
    Eni Halilaj; Soyong Shin (2021). CMU Panoptic Dataset 2.0 [Dataset]. https://simtk.org/frs/?group_id=1966
    Explore at:
    Dataset updated
    Feb 23, 2021
    Dataset provided by
    Carnegie Mellon University
    Authors
    Eni Halilaj; Soyong Shin
    Description

    The field of biomechanics is at a turning point, with marker-based motion capture set to be replaced by portable and inexpensive hardware, rapidly improving markerless tracking algorithms, and open datasets that will turn these new technologies into field-wide team projects. To expedite progress in this direction, we have collected the CMU Panoptic Dataset 2.0, which contains 86 subjects captured with 140 VGA cameras, 31 HD cameras, and 15 IMUs, performing on average 6.5 min of activities, including range of motion activities and tasks of daily living.

    Data: The data are now available under Downloads > Data Share.

    Code: The Fitting algorithm is published on Github: https://github.com/CMU-MBL/CMU_PanopticDataset_2.0

    Citation: If you find this code or our data useful for your research, please cite the following paper:

      @article{HALILAJ2021110650,
        title   = {American society of biomechanics early career achievement award 2020: Toward portable and modular biomechanics labs: How video and IMU fusion will change gait analysis},
        journal = {Journal of Biomechanics},
        volume  = {129},
        pages   = {110650},
        year    = {2021},
        issn    = {0021-9290},
        doi     = {https://doi.org/10.1016/j.jbiomech.2021.110650},
        url     = {https://www.sciencedirect.com/science/article/pii/S002192902100419X},
        author  = {Eni Halilaj and Soyong Shin and Eric Rapp and Donglai Xiang},
      }



    This project includes the following software/data packages:

    • Pilot Study: This is the pilot study of CMU Panoptic Dataset 2.0.

  11. VL-CMU-CD

    • huggingface.co
    Updated Apr 29, 2023
    Cite
    Guo-Hua Wang (2023). VL-CMU-CD [Dataset]. https://huggingface.co/datasets/Flourish/VL-CMU-CD
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 29, 2023
    Authors
    Guo-Hua Wang
    Description

    The Flourish/VL-CMU-CD dataset is hosted on Hugging Face and was contributed by the HF Datasets community.

  12. CMU_MOSEI

    • kaggle.com
    zip
    Updated Dec 13, 2024
    Cite
    Samar Warsi (2024). CMU_MOSEI [Dataset]. https://www.kaggle.com/datasets/samarwarsi/cmu-mosei
    Explore at:
    zip (31201301641 bytes; available download formats)
    Dataset updated
    Dec 13, 2024
    Authors
    Samar Warsi
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    CMU-MOSEI is a comprehensive multimodal dataset designed to analyze emotions and sentiment in online videos. It's a valuable resource for researchers and developers working on automatic emotion recognition and sentiment analysis.

    Key Features: Over 23,500 video clips from 1000+ speakers, covering diverse topics and monologues.

    Multimodal data:

    • Acoustic: features extracted from audio (CMU_MOSEI_COVAREP.csd)
    • Labels: annotations for sentiment intensity and emotion categories (CMU_MOSEI_Labels.csd)
    • Language: phonetic, word-level, and word-vector representations (CMU_MOSEI_*.csd files under the languages folder)
    • Visual: features extracted from facial expressions (CMU_MOSEI_Visual*.csd files under the visuals folder)

    Balanced for gender: The dataset ensures equal representation from male and female speakers.

    Unlocking Insights: By exploring the various modalities within CMU-MOSEI, researchers can investigate the relationship between speech, facial expressions, and emotions expressed in online videos.

    Download: The dataset is freely available for download at: http://immortal.multicomp.cs.cmu.edu/CMU-MOSEI/

    Start exploring the world of emotions in videos with CMU-MOSEI!
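    The .csd computational-sequence files are typically read with the CMU Multimodal SDK, and they can also be inspected directly as HDF5 containers, which is the assumption behind this minimal sketch; the internal group layout is not documented above, so it only walks and prints the keys:

      import h5py

      with h5py.File("CMU_MOSEI_Labels.csd", "r") as f:
          # Print every group/dataset path to discover the layout before
          # committing to a particular field (layout is an assumption).
          f.visit(print)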

  13. EEG-BCI Dataset for Real-time Robotic Hand Control at Individual Finger Level

    • kilthub.cmu.edu
    • figshare.com
    txt
    Updated Aug 20, 2025
    Cite
    Yidan Ding; Bin He (2025). EEG-BCI Dataset for Real-time Robotic Hand Control at Individual Finger Level [Dataset]. http://doi.org/10.1184/R1/29104040.v2
    Explore at:
    txt (available download formats)
    Dataset updated
    Aug 20, 2025
    Dataset provided by
    Carnegie Mellon University
    Authors
    Yidan Ding; Bin He
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    In this study, subjects controlled an EEG-based BCI using motor execution or motor imagery of the fingers of their dominant hand to control the corresponding finger motions of a robotic hand in real time. Twenty-one right-handed subjects participated in one offline and two online sessions each for finger motor execution and motor imagery tasks. Sixteen of the twenty-one subjects completed three more motor imagery online sessions and two more online sessions with smoothed robotic control, one for finger motor execution tasks and one for finger motor imagery tasks. The dataset includes EEG recordings and real-time finger decoding results for each subject during multiple sessions. A detailed description of the study can be found in the following publication: Ding, Y., Udompanyawit, C., Zhang, Y., & He, B. (2025). EEG-based brain-computer interface enables real-time robotic hand control at individual finger level. Nature Communications, 16(1), 5401. https://doi.org/10.1038/s41467-025-61064-x

    If you use a part of this dataset in your work, please cite the above publication. This dataset was collected under support from the National Institutes of Health via grants NS124564, NS131069, NS127849, and NS096761 to Dr. Bin He. Correspondence about the dataset: Dr. Bin He, Carnegie Mellon University, Department of Biomedical Engineering, Pittsburgh, PA 15213. E-mail: bhe1@andrew.cmu.edu

  14. CMU SC and BOLD fMRI

    • figshare.com
    bin
    Updated Apr 8, 2019
    Cite
    Arian Ashourvan; Timothy Verstynen (2019). CMU SC and BOLD fMRI [Dataset]. http://doi.org/10.6084/m9.figshare.7965065.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Apr 8, 2019
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Arian Ashourvan; Timothy Verstynen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Participants
    Sixty participants (28 male, 32 female) were recruited locally from the Pittsburgh, Pennsylvania area as well as the U.S. Army Research Laboratory in Aberdeen, Maryland. Participants were neurologically healthy adults with no history of head trauma, neurological pathology, or psychological pathology. Participant ages ranged from 18 to 45 years old (mean age, 26.5 years). The study protocol for acquiring the human subjects data was reviewed and approved by the IRB at Carnegie Mellon University, and written informed consent was obtained for all participants. As the present work uses de-identified human data from the original CMU study, the Penn IRB deemed this study exempt from the requirement for ethical review.

    MRI acquisition
    All 60 participants were scanned at the Scientific Imaging and Brain Research Center at Carnegie Mellon University on a Siemens Verio 3T magnet fitted with a 32-channel head coil. An MPRAGE sequence was used to acquire a high-resolution (1 mm3 isotropic voxels, 176 slices) T1-weighted brain image for all participants. DSI data were acquired following the fMRI sequences using a 50 min, 257-direction, twice-refocused spin-echo EPI sequence with multiple q values (TR = 11,400 ms, TE = 128 ms, voxel size 2.4 mm3, field of view 231 x 231 mm, b-max 5000 s/mm2, 51 slices). Resting state fMRI (rsfMRI) data consisting of 210 T2*-weighted volumes were collected for each participant (56 participants) with a BOLD-contrast echo planar imaging (EPI) sequence (TR 2000 ms, TE 29 ms, voxel size 3.5 mm3, field of view 224 x 224 mm, flip angle 79 degrees).

    Head motion is a major source of artifact in rsfMRI data. Although recently developed motion correction algorithms are far more effective than typical procedures, head motion was additionally minimized during image acquisition with a custom foam padding setup designed to minimize the variance of head motion along the pitch and yaw directions. The setup also included a chin restraint that held the participant's head to the receiving coil itself. Preliminary inspection of EPI images at the imaging center showed that the setup minimized resting head motion to 1 mm maximum deviation for most subjects. Only 3 out of 56 subjects were excluded from the final analysis because they moved more than 2 voxels multiple times throughout the imaging session.

    Diffusion MRI reconstruction
    DSI Studio (http://dsi-studio.labsolver.org) was used to process all DSI images using a q-space diffeomorphic reconstruction method (Yeh et al., 2011). A nonlinear spatial normalization approach (Ashburner et al., 1999) was implemented through 16 iterations to obtain the spatial mapping function of quantitative anisotropy (QA) values from individual subject diffusion space to the FMRIB 1 mm fractional anisotropy (FA) atlas template. QA is an orientation distribution function (ODF) based index that is scaled with spin density information, which permits the removal of isotropic diffusion components from the ODF to filter false peaks, facilitating the resolution of fiber tracts using deterministic fiber tracking algorithms. For a detailed description and comparison of QA with standard FA techniques, see Yeh et al. (2013). The ODFs were reconstructed to a spatial resolution of 2 mm3 with a diffusion sampling length ratio of 1.25. Whole-brain ODF maps of all 60 subjects were averaged together to generate a template image of the average tractography space.

    Fiber tractography analysis
    Fiber tractography was performed using an ODF-streamline version of the FACT algorithm (Yeh et al., 2013) in DSI Studio, using the builds from September 23, 2013 and August 29, 2014. All fiber tractography was initiated from seed positions with random locations within the whole-brain seed mask with random initial fiber orientations. Using a step size of 1 mm, the directional estimates of fiber progression within each voxel were weighted by 80% of the incoming fiber direction and 20% of the previous fiber direction. A streamline was terminated when the QA index fell below 0.05 or had a turning angle greater than 75 degrees. We performed region-based tractography to isolate streamlines between pairs of regional masks. All cortical masks were selected from an upsampled version of the original Automated Anatomical Labeling (AAL) atlas (Tzourio et al., 2002; Desikan et al., 2006) containing 90 cortical and subcortical regions of interest but not containing cerebellar structures or the brainstem. This resampled version contains 600 regions and is created via a series of upsampling steps in which any given region is bisected perpendicular to its principal spatial axis in order to create 2 equally sized sub-regions (Hermundstad et al., 2014). The final atlas contained regions of an average size of 268 voxels, with a standard deviation of 35 voxels. Diffusion-based tractography has been shown to exhibit a strong medial bias (Croxson et al., 2005) due to partial volume effects and poor resolution of complex fiber crossings (Jones et al., 2010). To counter the bias away from more lateral cortical regions, tractography was generated for each cortical surface mask separately.

    Resting state fMRI preprocessing
    SPM8 (Wellcome Department of Imaging Neuroscience, London) was used to preprocess all rsfMRI data collected from 53 of the 60 participants with DSI data. To estimate the normalization transformation for each EPI image, the mean EPI image was first selected as a source image and weighted by its mean across all volumes. Then, an MNI-space EPI template supplied with SPM was selected as the target image for normalization. The source image smoothing kernel was set to a FWHM of 4 mm, and all other estimation options were kept at the SPM8 defaults to generate a transformation matrix that was applied to each volume of the individual source images for further analyses. The time series was up-sampled to a 1 Hz TR using cubic-spline interpolation. Regions from the AAL600 atlas were used as seed points for the functional connectivity analysis (Hermundstad et al., 2014). A series of custom MATLAB functions were used to extract the voxel time series of activity for each region, and to remove estimated noise from the time series by selecting the first five principal components from the white matter and CSF masks.

    Dataset
    Here we provide the streamline count matrices of all 60 participants in CMU_SC.zip, which contains individual .mat files per participant. We also provide the average ROI BOLD fMRI time series of the 53 participants with low head motion in the CMU_BOLD.mat file.
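    A minimal sketch for a first look at the released files; the variable names inside the .mat files are not documented above, so it only lists the archive contents and the top-level keys:

      from zipfile import ZipFile
      from scipy.io import loadmat

      # One streamline-count .mat file per participant, per the dataset note above.
      with ZipFile("CMU_SC.zip") as z:
          print(z.namelist()[:5])

      # Average ROI BOLD time series for the 53 low-motion participants.
      bold = loadmat("CMU_BOLD.mat")
      print([k for k in bold if not k.startswith("__")])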

  15. cmu.edu Traffic Analytics Data

    • analytics.explodingtopics.com
    Updated Sep 1, 2025
    Cite
    (2025). cmu.edu Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/cmu.edu
    Explore at:
    Dataset updated
    Sep 1, 2025
    Variables measured
    Global Rank, Monthly Visits, Authority Score, US Country Rank, Education Category Rank
    Description

    Traffic analytics, rankings, and competitive metrics for cmu.edu as of September 2025

  16. CRAWDAD cmu/supermarket

    • ieee-dataport.org
    Updated Dec 12, 2022
    Cite
    CRAWDAD Team (2022). CRAWDAD cmu/supermarket [Dataset]. https://ieee-dataport.org/open-access/crawdad-cmusupermarket
    Explore at:
    Dataset updated
    Dec 12, 2022
    Authors
    CRAWDAD Team
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Round-trip Time-of-flight Measurements from a supermarket.

  17. CMU-30 DSI Template (02/13/2013 Build)

    • figshare.com
    application/gzip
    Updated Jun 5, 2023
    Cite
    Timothy Verstynen (2023). CMU-30 DSI Template (02/13/2013 Build) [Dataset]. http://doi.org/10.6084/m9.figshare.643852.v1
    Explore at:
    application/gzip (available download formats)
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Timothy Verstynen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is an averaged template of a 257 direction DSI dataset that has been reconstructed using QSDR for use in DSI Studio. For more information visit here: http://www.psy.cmu.edu/~coaxlab/?page_id=305

  18. Cmu F1 Tenth: Cars And Walls Dataset

    • universe.roboflow.com
    zip
    Updated Dec 16, 2023
    Cite
    f1tenthsegmentationcarwall (2023). Cmu F1 Tenth: Cars And Walls Dataset [Dataset]. https://universe.roboflow.com/f1tenthsegmentationcarwall-ovq2k/cmu-f1-tenth-cars-and-walls/model/2
    Explore at:
    zip (available download formats)
    Dataset updated
    Dec 16, 2023
    Dataset authored and provided by
    f1tenthsegmentationcarwall
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Walls And Cars Polygons
    Description

    F1TENTH: Towards Visual Understanding in Racing

    This is the segmentation dataset for CMU F1TENTH Team 1's final project. In this project, we set out to detect F1TENTH track walls and opponent cars in the RealSense camera frames, apply semantic segmentation, and then generate a Birds-Eye View (BEV) occupancy grid.

    This dataset includes images from both our own ROS bags and images provided by the University of Pennsylvania as part of the F1TENTH vision lab. Note that the annotations for this dataset could also be used to train an instance segmentation model, which we did and uploaded to Roboflow; that version is attached to the second version of this dataset and is also available for download here.

    You can find the ONNX weights for the trained UNET model (semantic segmentation) here. This model was trained on the second version of our dataset (V2) and achieved 93% recall, 97% precision, an F1 score of 0.95, and a mean IoU of 91%. We chose to train this semantic segmentation model on top of the YOLOv8 instance segmentation model because UNET is supported by the NVIDIA Isaac ROS image segmentation package, offering more efficient, optimized inference on Jetson platforms.
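    A minimal inference sketch with onnxruntime for the exported UNET weights; the file name, an NCHW float32 input scaled to [0, 1], and a fixed input size are assumptions to verify against the actual export:

      import cv2
      import numpy as np
      import onnxruntime as ort

      session = ort.InferenceSession("unet_f1tenth.onnx")   # hypothetical file name
      inp = session.get_inputs()[0]                         # assumes a fixed NCHW input shape

      img = cv2.imread("frame.png")                         # RealSense camera frame
      img = cv2.resize(img, (inp.shape[3], inp.shape[2]))   # cv2.resize expects (width, height)
      x = img.astype(np.float32).transpose(2, 0, 1)[None] / 255.0

      logits = session.run(None, {inp.name: x})[0]
      mask = logits.argmax(axis=1)[0]                       # per-pixel class ids (walls / cars / background)
      print(mask.shape, np.unique(mask))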

  19. CMU-SynTraffic-2022

    • ieee-dataport.org
    Updated May 19, 2022
    Cite
    Drake Cullen (2022). CMU-SynTraffic-2022 [Dataset]. https://ieee-dataport.org/documents/cmu-syntraffic-2022
    Explore at:
    Dataset updated
    May 19, 2022
    Authors
    Drake Cullen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    machine and deep learning solutions have become the standard. However

  20. CMU MoCap Dataset as used in BeatGAN

    • kaggle.com
    zip
    Updated Jan 3, 2022
    Cite
    MaximDolg (2022). CMU MoCap Dataset as used in BeatGAN [Dataset]. https://www.kaggle.com/datasets/maximdolg/cmu-mocap-dataset-as-used-in-beatgan
    Explore at:
    zip (271080 bytes; available download formats)
    Dataset updated
    Jan 3, 2022
    Authors
    MaximDolg
    Description

    This is a raw CSV version of the CMU MoCap dataset subset used in [Zhou et al., 2019]. No windowing, striding, or normalisation has been applied to the data. For more information concerning the structure of the data, please see the BeatGAN repo.

    Structure

    All of the data was concatenated into a single CSV file, data.csv. The accompanying labels.csv provides the label for each data sample.

    Labels

    Zhou et al. use three classes for their dataset: walking (labelled 0) is considered the normal class, while jogging (labelled 1) and jumping (labelled 2) are considered abnormal classes.
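    A minimal loading sketch, assuming data.csv holds one sample per row and labels.csv holds the matching class labels described above:

      import pandas as pd

      data = pd.read_csv("data.csv")
      labels = pd.read_csv("labels.csv")

      # 0 = walking (normal), 1 = jogging, 2 = jumping (abnormal), per the labels above.
      print(data.shape)
      print(labels.iloc[:, 0].value_counts())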

    License Notes from the original dataset authors:

    From the original authors: This data is free for use in research projects. You may include this data in commercially-sold products, but you may not resell this data directly, even in converted form. If you publish results obtained using this data, we would appreciate it if you would send the citation of your published paper to jkh+mocap@cs.cmu.edu, and also add this text to your acknowledgments section: "The data used in this project was obtained from mocap.cs.cmu.edu. The database was created with funding from NSF EIA-0196217."
