7 datasets found
  1. Trained models and testing datasets used in "Approach for the optimization...

    • kilthub.cmu.edu
    zip
    Updated Jul 12, 2024
    Cite
    Suguru Horimoto; Keane Lucas; Ljudevit Bauer (2024). Trained models and testing datasets used in "Approach for the optimization of machine learning models for calculating binary function similarity" [Dataset]. http://doi.org/10.1184/R1/26042788.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Carnegie Mellon University
    Authors
    Suguru Horimoto; Keane Lucas; Ljudevit Bauer
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This repository contains some trained multi-architecture models, along with testing datasets for those models, for the following paper: Suguru Horimoto, Keane Lucas, and Lujo Bauer. Approach for the optimization of machine learning models for calculating binary function similarity. In Proceedings of the 21st Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA '24), 2024.

    To use the models and the datasets, follow the instructions at https://github.com/sgr-ht/mam-for-cbfs

  2. FBX Conversion Of The CMU Graphics Lab Motion Capture Database

    • data.4tu.nl
    zip
    + more versions
    Cite
    Ward Dewaele; B. Hahne; CMU Graphics Lab; Anish Abhijit Diwan, FBX Conversion Of The CMU Graphics Lab Motion Capture Database [Dataset]. http://doi.org/10.4121/0448aab2-3332-449f-a8e2-d208cb58c7df.v1
    Explore at:
    Available download formats: zip
    Dataset provided by
    4TU.ResearchData
    Authors
    Ward Dewaele; B. Hahne; CMU Graphics Lab; Anish Abhijit Diwan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Carnegie-Mellon Graphics Lab Motion Capture Database (fbx conversion)


    This is a dataset of motion capture data in the .fbx format. The original dataset (free to use, modify, and share) is created by the CMU Graphics Lab and can be accessed here (http://mocap.cs.cmu.edu/). This version contains the .fbx conversion of the files along with a categorisation of some of the motion data for imitation learning tasks. This can directly be used with the data pipelines described here (https://github.com/anishhdiwan/diffusion_motion_priors) to view, process, and retarget the data for any other task. Please refer to the README in the repository for further documentation.


    Compiled from the individual CMU index files by B. Hahne. FBX conversion for Unity by Ward Dewaele from cMonkeys.


    CMU Notice

    When browsing for motions, start with the higher numbered subjects first. The lower numbers contain some of our earliest motion capture sessions, and may not be as high quality.


    This data is free for use in research projects. You may include this data in commercially-sold products, but you may not resell this data directly, even in converted form. If you publish results obtained using this data, we would appreciate it if you would send the citation to your published paper to jkh+mocap@cs.cmu.edu, and also would add this text to your acknowledgments section:

    The data used in this project was obtained from mocap.cs.cmu.edu. The database was created with funding from NSF EIA-0196217.

  3. EEG-BCI Dataset for "Continuous Tracking using Deep Learning-based Decoding...

    • kilthub.cmu.edu
    zip
    Updated May 6, 2024
    Cite
    Dylan Forenzo; Bin He (2024). EEG-BCI Dataset for "Continuous Tracking using Deep Learning-based Decoding for Non-invasive Brain-Computer Interface" [Dataset]. http://doi.org/10.1184/R1/25360300.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 6, 2024
    Dataset provided by
    Carnegie Mellon University
    Authors
    Dylan Forenzo; Bin He
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This EEG Brain Computer Interface (BCI) dataset was collected as part of the study titled: “Continuous Tracking using Deep Learning-based Decoding for Non-invasive Brain-Computer Interface”.

    If you use a part of this dataset in your work, please cite the following publication: D. Forenzo, H. Zhu, J. Shanahan, J. Lim, and B. He, “Continuous tracking using deep learning-based decoding for noninvasive brain–computer interface,” PNAS Nexus, vol. 3, no. 4, p. pgae145, Apr. 2024, doi: 10.1093/pnasnexus/pgae145.

    This dataset was collected under support from the National Institutes of Health via grants AT009263, NS096761, NS127849, EB029354, NS124564, and NS131069 to Dr. Bin He.

    Correspondence about the dataset: Dr. Bin He, Carnegie Mellon University, Department of Biomedical Engineering, Pittsburgh, PA 15213. E-mail: bhe1@andrew.cmu.edu

  4. test999

    • redivis.com
    application/jsonl +7
    Updated Mar 8, 2024
    Cite
    Carnegie Mellon University Libraries (2024). test999 [Dataset]. https://redivis.com/datasets/emqp-0e3w4k39t
    Explore at:
    Available download formats: application/jsonl, arrow, sas, spss, csv, stata, avro, parquet
    Dataset updated
    Mar 8, 2024
    Dataset provided by
    Redivis Inc.
    Authors
    Carnegie Mellon University Libraries
    Description

    Abstract

    test

    Description

    This dataset was created by CMULibraries on Fri, 08 Mar 2024 00:07:28 GMT.

  5. Dataset for "Detection and Discovery of Misinformation Sources using...

    • kilthub.cmu.edu
    txt
    Updated Feb 12, 2024
    + more versions
    Cite
    Peter Carragher; Kathleen Carley; Evan Williams (2024). Dataset for "Detection and Discovery of Misinformation Sources using Attributed Webgraphs" [Dataset]. http://doi.org/10.1184/R1/25174193.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Feb 12, 2024
    Dataset provided by
    Carnegie Mellon University
    Authors
    Peter Carragher; Kathleen Carley; Evan Williams
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We demonstrate that Search Engine Optimization (SEO) attributes provide strong signals for predicting news site reliability. We introduce a novel attributed webgraph dataset with labeled news domains and their connections to outlinking and backlinking domains. Finally, we introduce and evaluate a novel graph-based algorithm for discovering previously unknown misinformation news sources.

    This dataset is provided courtesy of Ahrefs.com. The associated paper is upcoming at ICWSM 2024.

  6. Historical changes of annual temperature and precipitation indices at...

    • kilthub.cmu.edu
    txt
    Updated Aug 22, 2024
    Cite
    Yuchuan Lai; David Dzombak (2024). Historical changes of annual temperature and precipitation indices at selected 210 U.S. cities [Dataset]. http://doi.org/10.1184/R1/7961012.v6
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 22, 2024
    Dataset provided by
    Carnegie Mellon University
    Authors
    Yuchuan Lai; David Dzombak
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Historical changes of annual temperature and precipitation indices at selected 210 U.S. cities

    This dataset provides:

    Annual average temperature, total precipitation, and temperature and precipitation extremes calculations for 210 U.S. cities.

    Historical rates of changes in annual temperature, precipitation, and the selected temperature and precipitation extreme indices in the 210 U.S. cities.

    Estimated thresholds (reference levels) for the calculations of annual extreme indices including warm and cold days, warm and cold nights, and precipitation amount from very wet days in the 210 cities.

    Annual averages of daily mean temperature, Tmax, and Tmin are included for annual average temperature calculations. Calculations were based on the compiled daily temperature and precipitation records at individual cities.

    Temperature and precipitation extreme indices include: warmest daily Tmax and Tmin, coldest daily Tmax and Tmin, warm days and nights, cold days and nights, maximum 1-day precipitation, maximum consecutive 5-day precipitation, and precipitation amounts from very wet days.
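As a rough illustration of two of the precipitation indices named above, the sketch below computes the maximum 1-day and maximum consecutive 5-day precipitation from a list of daily totals. The function names and sample values are hypothetical, and handling of missing days is not addressed; the dataset's README remains the reference for the exact definitions used.

```python
def max_1day_precip(daily_mm):
    """Largest single-day precipitation total in the series."""
    return max(daily_mm)


def max_consecutive_precip(daily_mm, window=5):
    """Largest precipitation total over any `window` consecutive days."""
    if len(daily_mm) < window:
        raise ValueError("series is shorter than the window")
    running = sum(daily_mm[:window])  # total for the first window
    best = running
    for i in range(window, len(daily_mm)):
        # Slide the window forward by one day.
        running += daily_mm[i] - daily_mm[i - window]
        best = max(best, running)
    return best


# Made-up daily totals in millimetres, for illustration only.
sample_daily_mm = [0.0, 2.0, 10.0, 0.0, 5.0, 1.0, 0.0, 20.0, 0.0, 0.0]
rx1day = max_1day_precip(sample_daily_mm)
rx5day = max_consecutive_precip(sample_daily_mm)
print(rx1day, rx5day)
```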

    The number of missing daily Tmax, Tmin, and precipitation values is included for each city.

    Rates of change were calculated using linear regression, with the Box-Cox transformation applied to some climate indices prior to the regression.
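The calculation described above can be sketched as an ordinary least-squares slope of an annual index against year, optionally after a Box-Cox transformation. This is a minimal sketch, not the authors' procedure: the description does not say which indices were transformed or how the Box-Cox parameter was chosen, so `lam` and the sample values below are assumptions.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values (lam is an assumed parameter)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def ols_slope(x, y):
    """Least-squares slope of y on x, i.e. the rate of change per unit of x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Made-up annual index values, for illustration only.
years = [2000, 2001, 2002, 2003, 2004]
index_values = [10.0, 12.0, 14.0, 16.0, 18.0]

raw_rate = ols_slope(years, index_values)
transformed_rate = ols_slope(years, box_cox(index_values, lam=0.5))
print(raw_rate, transformed_rate)
```

Note that a slope fitted to transformed values is expressed in transformed units, so it is not directly comparable to the slope of the raw series.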

    The historical observations from ACIS belong to the Global Historical Climatology Network - Daily (GHCN-D) datasets. The included stations were based on NRCC’s “ThreadEx” project, which combined daily temperature and precipitation extremes at 255 NOAA Local Climatological Locations, representing all large and medium size cities in the U.S. (See Owen et al. (2006) Accessing NOAA Daily Temperature and Precipitation Extremes Based on Combined/Threaded Station Records).

    Resources:

    See included README file for more information.

    Additional technical details and analyses can be found in: Lai, Y., & Dzombak, D. A. (2019). Use of historical data to assess regional climate change. Journal of climate, 32(14), 4299-4320. https://doi.org/10.1175/JCLI-D-18-0630.1

    Other datasets from the same project can be accessed at: https://kilthub.cmu.edu/projects/Use_of_historical_data_to_assess_regional_climate_change/61538

    ACIS database for historical observations: http://scacis.rcc-acis.org/

    GHCN-D datasets can also be accessed at: https://www.ncei.noaa.gov/data/global-historical-climatology-network-daily/

    Station information for each city can be accessed at: http://threadex.rcc-acis.org/

    • 2024 August updated -

      Annual calculations for 2022 and 2023 were added.

      Linear regression results and thresholds for extremes were updated because of the addition of 2022 and 2023 data.

      Note that future updates may be infrequent.

    • 2022 January updated -

      Annual calculations for 2021 were added.

      Linear regression results and thresholds for extremes were updated because of the addition of 2021 data.

    • 2021 January updated -

      Annual calculations for 2020 were added.

      Linear regression results and thresholds for extremes were updated because of the addition of 2020 data.

    • 2020 January updated -

      Annual calculations for 2019 were added.

      Linear regression results and thresholds for extremes were updated because of the addition of 2019 data.

      Thresholds for all 210 cities were combined into one single file – Thresholds.csv.

    • 2019 June updated -

      Baltimore was updated with the 2018 data (the previous version showed NA for 2018) and a new ID to reflect the GHCN ID of Baltimore-Washington International AP. The city_info file was updated accordingly.

      README file was updated to reflect the use of the "wet days" index in this study. The 95% thresholds for calculation of wet days utilized all daily precipitation data from the reference period and can be different from the same index in some other studies, where only days with at least 1 mm of precipitation were utilized to calculate the thresholds. Thus the thresholds in this study can be lower than the ones that would have been calculated from the 95th percentiles of wet days (i.e., with at least 1 mm of precipitation).
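The effect described in the June 2019 note can be seen with a small example: a 95th percentile taken over all days (including dry ones) comes out lower than one taken over wet days only. This is an illustrative sketch with made-up values; the linear-interpolation percentile and the function names are assumptions, and the reference period is not modeled.

```python
def percentile(values, q):
    """Percentile with linear interpolation between ranks (q in [0, 100])."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def wet_day_thresholds(daily_mm, q=95.0, wet_min_mm=1.0):
    """Return (threshold from all days, threshold from wet days only)."""
    all_days = percentile(daily_mm, q)
    wet_only = percentile([p for p in daily_mm if p >= wet_min_mm], q)
    return all_days, wet_only


# Made-up record: 14 dry days followed by 6 wet days (mm).
sample_mm = [0.0] * 14 + [2.0, 4.0, 6.0, 8.0, 10.0, 30.0]
all_days_threshold, wet_only_threshold = wet_day_thresholds(sample_mm)
print(all_days_threshold, wet_only_threshold)
```

With many dry days in the record, the all-days percentile falls well below the wet-days percentile, which is why thresholds computed this way can be lower than those reported in studies using the wet-days convention.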

  7. 7th CCDC CSP blind test training data for AIMNet2 machine-learned neural...

    • kilthub.cmu.edu
    hdf
    Updated Jun 12, 2025
    Cite
    Olexandr Isayev; Kamal Nayal (2025). 7th CCDC CSP blind test training data for AIMNet2 machine-learned neural network potentials [Dataset]. http://doi.org/10.1184/R1/29282318.v1
    Explore at:
    Available download formats: hdf
    Dataset updated
    Jun 12, 2025
    Dataset provided by
    Carnegie Mellon University
    Authors
    Olexandr Isayev; Kamal Nayal
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Training data used by Group 16 for both phases (structure generation and structure ranking) of the seventh CCDC CSP blind test. These target-specific datasets contain molecular clusters (n-mers) and their properties computed with the B97M/def2-TZVPP (meta-GGA DFT) method. Each data file contains a different number of n-mers, depending on the target. DFT calculations were performed with ORCA 5.0.3 software. Properties include energy, forces, atomic charges, and molecular dipole and quadrupole moments. The results and conclusions of the 7th CCDC CSP blind test have been published in Acta Crystallographica: 1) The Seventh Blind Test of Crystal Structure Prediction: Structure Generation Methods – Hunnisett et al., J. Acta. Cryst. B80, Dec 2024. 2) The Seventh Blind Test of Crystal Structure Prediction: Structure Ranking Methods – Hunnisett et al., J. Acta. Cryst., B80, Dec 2024.
