7 datasets found

Trained models and testing datasets used in "Approach for the optimization...
kilthub.cmu.edu
zip
Updated Jul 12, 2024
Cite
Suguru Horimoto; Keane Lucas; Ljudevit Bauer (2024). Trained models and testing datasets used in "Approach for the optimization of machine learning models for calculating binary function similarity" [Dataset]. http://doi.org/10.1184/R1/26042788.v1
Available download formats
zip
Unique identifier
https://doi.org/10.1184/R1/26042788.v1
Dataset updated
Jul 12, 2024
Dataset provided by
Carnegie Mellon University
Authors
Suguru Horimoto; Keane Lucas; Ljudevit Bauer
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This repository contains some trained multi-architecture models, along with testing datasets for them, for the following paper: Suguru Horimoto, Keane Lucas, and Lujo Bauer. Approach for the optimization of machine learning models for calculating binary function similarity. In Proceedings of the 21st Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA '24), 2024.

To use the models and the datasets, follow the instructions at https://github.com/sgr-ht/mam-for-cbfs
FBX Conversion Of The CMU Graphics Lab Motion Capture Database
data.4tu.nl
zip
Cite
Ward Dewaele; B. Hahne; CMU Graphics Lab; Anish Abhijit Diwan, FBX Conversion Of The CMU Graphics Lab Motion Capture Database [Dataset]. http://doi.org/10.4121/0448aab2-3332-449f-a8e2-d208cb58c7df.v1
Available download formats
zip
Unique identifier
https://doi.org/10.4121/0448aab2-3332-449f-a8e2-d208cb58c7df.v1
Dataset provided by
4TU.ResearchData
Authors
Ward Dewaele; B. Hahne; CMU Graphics Lab; Anish Abhijit Diwan
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
Carnegie-Mellon Graphics Lab Motion Capture Database (fbx conversion)

This is a dataset of motion capture data in the .fbx format. The original dataset (free to use, modify, and share) was created by the CMU Graphics Lab and can be accessed at http://mocap.cs.cmu.edu/. This version contains the .fbx conversion of the files along with a categorisation of some of the motion data for imitation learning tasks. It can be used directly with the data pipelines described at https://github.com/anishhdiwan/diffusion_motion_priors to view, process, and retarget the data for any other task. Please refer to the README in the repository for further documentation.

Compiled from the individual CMU index files by B. Hahne. FBX conversion for Unity by Ward Dewaele from cMonkeys.

CMU Notice
When browsing for motions, start with the higher numbered subjects first. The lower numbers contain some of our earliest motion capture sessions, and may not be as high quality.

This data is free for use in research projects.
You may include this data in commercially-sold products,
but you may not resell this data directly, even in converted form.
If you publish results obtained using this data, we would appreciate it
if you would send the citation to your published paper to jkh+mocap@cs.cmu.edu,
and also would add this text to your acknowledgments section:
The data used in this project was obtained from mocap.cs.cmu.edu.
The database was created with funding from NSF EIA-0196217.
EEG-BCI Dataset for "Continuous Tracking using Deep Learning-based Decoding...
kilthub.cmu.edu
zip
Updated May 6, 2024
Cite
Dylan Forenzo; Bin He (2024). EEG-BCI Dataset for "Continuous Tracking using Deep Learning-based Decoding for Non-invasive Brain-Computer Interface" [Dataset]. http://doi.org/10.1184/R1/25360300.v1
Available download formats
zip
Unique identifier
https://doi.org/10.1184/R1/25360300.v1
Dataset updated
May 6, 2024
Dataset provided by
Carnegie Mellon University
Authors
Dylan Forenzo; Bin He
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This EEG Brain Computer Interface (BCI) dataset was collected as part of the study titled: “Continuous Tracking using Deep Learning-based Decoding for Non-invasive Brain-Computer Interface”. If you use a part of this dataset in your work, please cite the following publication: D. Forenzo, H. Zhu, J. Shanahan, J. Lim, and B. He, “Continuous tracking using deep learning-based decoding for noninvasive brain–computer interface,” PNAS Nexus, vol. 3, no. 4, p. pgae145, Apr. 2024, doi: 10.1093/pnasnexus/pgae145. This dataset was collected under support from the National Institutes of Health via grants AT009263, NS096761, NS127849, EB029354, NS124564, and NS131069 to Dr. Bin He. Correspondence about the dataset: Dr. Bin He, Carnegie Mellon University, Department of Biomedical Engineering, Pittsburgh, PA 15213. E-mail: bhe1@andrew.cmu.edu
test999
redivis.com
application/jsonl +7
Updated Mar 8, 2024
Cite
Carnegie Mellon University Libraries (2024). test999 [Dataset]. https://redivis.com/datasets/emqp-0e3w4k39t
Available download formats
application/jsonl, arrow, sas, spss, csv, stata, avro, parquet
Dataset updated
Mar 8, 2024
Dataset provided by
Redivis Inc.
Authors
Carnegie Mellon University Libraries
Description
Abstract

test

Description

This dataset was created by CMULibraries on Fri, 08 Mar 2024 00:07:28 GMT.
Dataset for "Detection and Discovery of Misinformation Sources using...
kilthub.cmu.edu
txt
Updated Feb 12, 2024
Cite
Peter Carragher; Kathleen Carley; Evan Williams (2024). Dataset for "Detection and Discovery of Misinformation Sources using Attributed Webgraphs" [Dataset]. http://doi.org/10.1184/R1/25174193.v1
Available download formats
txt
Unique identifier
https://doi.org/10.1184/R1/25174193.v1
Dataset updated
Feb 12, 2024
Dataset provided by
Carnegie Mellon University
Authors
Peter Carragher; Kathleen Carley; Evan Williams
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
We demonstrate that Search Engine Optimization (SEO) attributes provide strong signals for predicting news site reliability. We introduce a novel attributed webgraph dataset with labeled news domains and their connections to outlinking and backlinking domains. Finally, we introduce and evaluate a novel graph-based algorithm for discovering previously unknown misinformation news sources.

This dataset is provided courtesy of Ahrefs.com. The associated paper is upcoming at ICWSM 2024.
Historical changes of annual temperature and precipitation indices at...
kilthub.cmu.edu
txt
Updated Aug 22, 2024
Cite
Yuchuan Lai; David Dzombak (2024). Historical changes of annual temperature and precipitation indices at selected 210 U.S. cities [Dataset]. http://doi.org/10.1184/R1/7961012.v6
Available download formats
txt
Unique identifier
https://doi.org/10.1184/R1/7961012.v6
Dataset updated
Aug 22, 2024
Dataset provided by
Carnegie Mellon University
Authors
Yuchuan Lai; David Dzombak
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Historical changes of annual temperature and precipitation indices at selected 210 U.S. cities

This dataset provides:

Annual average temperature, total precipitation, and temperature and precipitation extremes calculations for 210 U.S. cities.

Historical rates of changes in annual temperature, precipitation, and the selected temperature and precipitation extreme indices in the 210 U.S. cities.

Estimated thresholds (reference levels) for the calculations of annual extreme indices including warm and cold days, warm and cold nights, and precipitation amount from very wet days in the 210 cities.

Annual averages of daily mean temperature, Tmax, and Tmin are included for annual average temperature calculations. Calculations were based on the compiled daily temperature and precipitation records at individual cities.

Temperature and precipitation extreme indices include: warmest daily Tmax and Tmin, coldest daily Tmax and Tmin, warm days and nights, cold days and nights, maximum 1-day precipitation, maximum consecutive 5-day precipitation, and precipitation amounts from very wet days.

The number of missing daily Tmax, Tmin, and precipitation values is included for each city.

Rates of change were calculated using linear regression, with the Box-Cox transformation applied to some climate indices prior to the linear regression.

The historical observations from ACIS belong to the Global Historical Climatology Network - Daily (GHCN-D) datasets. The included stations were based on NRCC’s “ThreadEx” project, which combined daily temperature and precipitation extremes at 255 NOAA Local Climatological Locations, representing all large and medium-sized cities in the U.S. (see Owen et al. (2006), Accessing NOAA Daily Temperature and Precipitation Extremes Based on Combined/Threaded Station Records).
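
As a rough illustration of the rate-of-change calculation described above (linear regression, optionally preceded by a Box-Cox transformation), the sketch below fits an ordinary least-squares trend to an annual index. It is not the authors' code: the function names, the sample values, and the choice of lambda = 0.5 are all assumptions made for illustration, and the dataset's README should be consulted for the actual procedure.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a caller-chosen lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def linear_trend(years, values):
    """Ordinary least-squares slope and intercept of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical annual precipitation totals (mm), invented for illustration only.
years = [2015, 2016, 2017, 2018, 2019]
precip_mm = [900.0, 950.0, 1020.0, 980.0, 1100.0]

# Trend on the raw index (mm per year).
raw_slope, raw_intercept = linear_trend(years, precip_mm)

# Trend on the Box-Cox-transformed index; lambda = 0.5 is an arbitrary assumption.
bc_slope, bc_intercept = linear_trend(years, box_cox(precip_mm, lam=0.5))
```

Note that a slope fitted after the transformation is expressed in transformed units, so it is not directly comparable to the raw slope in mm per year.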

Resources:

See included README file for more information.

Additional technical details and analyses can be found in: Lai, Y., & Dzombak, D. A. (2019). Use of historical data to assess regional climate change. Journal of climate, 32(14), 4299-4320. https://doi.org/10.1175/JCLI-D-18-0630.1

Other datasets from the same project can be accessed at: https://kilthub.cmu.edu/projects/Use_of_historical_data_to_assess_regional_climate_change/61538

ACIS database for historical observations: http://scacis.rcc-acis.org/

GHCN-D datasets can also be accessed at: https://www.ncei.noaa.gov/data/global-historical-climatology-network-daily/

Station information for each city can be accessed at: http://threadex.rcc-acis.org/

August 2024 update:

Annual calculations for 2022 and 2023 were added.

Linear regression results and thresholds for extremes were updated because of the addition of 2022 and 2023 data.

Note that future updates may be infrequent.

January 2022 update:

Annual calculations for 2021 were added.

Linear regression results and thresholds for extremes were updated because of the addition of 2021 data.

January 2021 update:

Annual calculations for 2020 were added.

Linear regression results and thresholds for extremes were updated because of the addition of 2020 data.

January 2020 update:

Annual calculations for 2019 were added.

Linear regression results and thresholds for extremes were updated because of the addition of 2019 data.

Thresholds for all 210 cities were combined into one single file – Thresholds.csv.

June 2019 update:

Baltimore was updated with the 2018 data (the previous version showed NA for 2018) and a new ID to reflect the GHCN ID of Baltimore-Washington International AP. The city_info file was updated accordingly.

The README file was updated to reflect the use of the "wet days" index in this study. The 95% thresholds for the calculation of wet days use all daily precipitation data from the reference period and can differ from the same index in some other studies, where only days with at least 1 mm of precipitation were used to calculate the thresholds. Thus, the thresholds in this study can be lower than those that would have been calculated from the 95th percentile of wet days (i.e., days with at least 1 mm of precipitation).
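
The difference between the two threshold conventions described above can be sketched as follows. This is an illustrative example only: the daily values are invented, and the linear-interpolation percentile used here is an assumption rather than the method documented in the dataset's README.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Hypothetical daily precipitation (mm) for a reference period: mostly dry days.
daily_mm = [0.0] * 15 + [0.5, 2.0, 5.0, 12.0, 30.0]

# Convention described for this dataset: use all days in the reference period.
all_days_threshold = percentile(daily_mm, 95)

# Convention in some other studies: use only days with at least 1 mm.
wet_days_only = [p for p in daily_mm if p >= 1.0]
wet_days_threshold = percentile(wet_days_only, 95)
```

With these made-up values the all-days threshold comes out lower than the wet-days-only threshold, which matches the direction of the difference noted in the description.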
7th CCDC CSP blind test training data for AIMNet2 machine-learned neural...
kilthub.cmu.edu
hdf
Updated Jun 12, 2025
Cite
Olexandr Isayev; Kamal Nayal (2025). 7th CCDC CSP blind test training data for AIMNet2 machine-learned neural network potentials [Dataset]. http://doi.org/10.1184/R1/29282318.v1
Available download formats
hdf
Unique identifier
https://doi.org/10.1184/R1/29282318.v1
Dataset updated
Jun 12, 2025
Dataset provided by
Carnegie Mellon University
Authors
Olexandr Isayev; Kamal Nayal
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Training data used by Group 16 for both phases (structure generation and structure ranking) of the seventh CCDC CSP blind test. These target-specific datasets contain molecular clusters (n-mers) and their properties computed with the B97M/def2-TZVPP (meta-GGA DFT) method. Each data file contains a different number of n-mers, depending on the target. DFT calculations were performed with ORCA 5.0.3. Properties include energy, forces, atomic charges, and molecular dipole and quadrupole moments. The results and conclusions of the 7th CCDC CSP blind test have been published in Acta Crystallographica: 1) The Seventh Blind Test of Crystal Structure Prediction: Structure Generation Methods – Hunnisett et al., J. Acta. Cryst. B80, Dec 2024. 2) The Seventh Blind Test of Crystal Structure Prediction: Structure Ranking Methods – Hunnisett et al., J. Acta. Cryst. B80, Dec 2024.