Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the field of manufacturing, high-quality datasets are essential for optimizing production processes, improving energy efficiency, and developing predictive maintenance strategies. This repository introduces a comprehensive CNC machining data repository that includes three key data categories: (1) product geometry data, (2) NC code data, and (3) high frequency energy consumption data. This dataset is particularly valuable for researchers and engineers working in manufacturing analytics, energy-efficient machining, and machine learning applications in smart manufacturing. Potential use cases include optimizing machining parameters for energy reduction, predicting tool wear based on power consumption patterns, and enhancing digital twin models with real-world machining data. By making this dataset publicly available, we aim to support the development of data-driven solutions in modern manufacturing and facilitate benchmarking efforts across different machining strategies.
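As a hedged illustration of one of the use cases above, a high-frequency power trace can be integrated to obtain the energy consumed by a machining operation. The function below is a minimal sketch; it assumes a power signal in watts sampled at a fixed interval and makes no assumption about the file layout or column names used in the actual dataset.

```python
def energy_joules(power_watts, dt_seconds):
    """Trapezoidal integration of a high-frequency power trace (W)
    sampled at a fixed interval dt (s), returning energy in joules.
    The sampling interval and units are assumptions; consult the
    dataset documentation for the real signal layout."""
    return sum((p0 + p1) * 0.5 * dt_seconds
               for p0, p1 in zip(power_watts, power_watts[1:]))
```

For example, a constant 100 W draw over two 1 s intervals integrates to 200 J.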
Cloud-based data repository for storing, publishing and accessing scientific data. Mendeley Data creates a permanent location and issues FORCE11-compliant citations for uploaded data.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A total of 12 software defect data sets from NASA were used in this study: five data sets (part I), namely CM1, JM1, KC1, KC2, and PC1, were obtained from the PROMISE software engineering repository (http://promise.site.uottawa.ca/SERepository/); the other seven data sets (part II) were obtained from the tera-PROMISE repository (http://openscience.us/repo/defect/mccabehalsted/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Part 3/3 of the dataset for "Camargo, et al. A comprehensive, open-source dataset of lower limb biomechanics in multiple conditions of stairs, ramps, and level-ground ambulation and transitions." Dataset linked from https://doi.org/10.1016/j.jbiomech.2021.110320 (2021-02-24)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data consist of a collection of legitimate as well as phishing website instances. Each website is represented by a set of features that denote whether it is legitimate or not. The data can serve as input for a machine learning process.
In this repository the two variants of the Phishing Dataset are presented.
Full variant - dataset_full.csv
Total number of instances: 88,647
Number of legitimate website instances (labeled as 0): 58,000
Number of phishing website instances (labeled as 1): 30,647
Total number of features: 111
Small variant - dataset_small.csv
Total number of instances: 58,645
Number of legitimate website instances (labeled as 0): 27,998
Number of phishing website instances (labeled as 1): 30,647
Total number of features: 111
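For orientation, here is a minimal sketch of consuming one of the CSV variants with the Python standard library. The label column name "phishing" is an assumption (check the file header); only the 0/1 label convention comes from the description above.

```python
import csv

def class_balance(labels):
    """Count phishing (labeled 1) vs. legitimate (labeled 0) instances."""
    positives = sum(1 for v in labels if int(v) == 1)
    return positives, len(labels) - positives

def load_labels(path, label_col="phishing"):
    """Read the label column from dataset_full.csv or dataset_small.csv.
    The column name 'phishing' is an assumption; verify it against the
    actual CSV header before use."""
    with open(path, newline="") as f:
        return [row[label_col] for row in csv.DictReader(f)]
```

On dataset_full.csv this should report 30,647 phishing and 58,000 legitimate instances, matching the counts above.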
This data repository includes the datasets used both for training the predictive models and for evaluating the proposed solution (FLAS) as a whole. In addition, the processed quantitative data resulting from the evaluation are attached; these were used to generate the graphs that show the results.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A collection of 22 data sets of 50+ requirements each, expressed as user stories.
The dataset has been created by gathering data from web sources, and we are not aware of license agreements or intellectual property rights on the requirements / user stories. The curator took the utmost diligence in minimizing the risks of copyright infringement by using non-recent data that is less likely to be critical, by sampling a subset of the original requirements collection, and by qualitatively analyzing the requirements. In case of copyright infringement, please contact the dataset curator (Fabiano Dalpiaz, f.dalpiaz@uu.nl) to discuss the possibility of removal of that dataset [see Zenodo's policies].
The data sets have been originally used to conduct experiments about ambiguity detection with the REVV-Light tool: https://github.com/RELabUU/revv-light
This collection has been originally published in Mendeley data: https://data.mendeley.com/datasets/7zbk8zsd8y/1
The following text provides a description of the datasets, including links to the systems and websites, when available. The datasets are organized by macro-category and then by identifier.
g02-federalspending.txt
(2018) originates from early data in the Federal Spending Transparency project, which pertains to the website used to publicly share the spending data of the U.S. government. The website was created because of the Digital Accountability and Transparency Act of 2014 (DATA Act). The specific dataset pertains to a system called DAIMS (DATA Act Information Model Schema), also referred to as Data Broker. The sample that was gathered refers to a sub-project related to allowing the government to act as a data broker, thereby providing data to third parties. The data for the Data Broker project is currently not available online, although the backend seems to be hosted on GitHub under a CC0 1.0 Universal license. Current and recent snapshots of federal spending related websites, including many more projects than the one described in the shared collection, can be found here.
g03-loudoun.txt
(2018) is a set of requirements extracted from a document by Loudoun County, Virginia, that describes the to-be user stories and use cases for a land management readiness assessment system called Loudoun County LandMARC. The source document can be found here and is part of the Electronic Land Management System and EPlan Review Project RFP/RFQ issued in March 2018. More information about the overall LandMARC system and services can be found here.
g04-recycling.txt
(2017) concerns a web application for searching and locating recycling and waste disposal facilities. The application operates through the visualization of a map that the user can interact with. The dataset was obtained from a GitHub website and is at the basis of a students' project on website design; the code is available (no license).
g05-openspending.txt
(2018) is about the OpenSpending project (www), a project of the Open Knowledge Foundation which aims at transparency about how local governments spend money. At the time of the collection, the data was retrieved from a Trello board that is currently unavailable. The sample focuses on publishing, importing and editing datasets, and on how the data should be presented. Currently, OpenSpending is managed via a GitHub repository which contains multiple sub-projects with an unknown license.
g11-nsf.txt
(2018) is a collection of user stories for the NSF Site Redesign & Content Discovery project, which originates from a publicly accessible GitHub repository (GPL 2.0 license). In particular, the user stories refer to an early version of the NSF's website. The user stories can be found as closed issues.
g08-frictionless.txt
(2016) regards the Frictionless Data project, which offers open-source tooling for building data infrastructures, to be used by researchers, data scientists, and data engineers. Links to the many projects within the Frictionless Data project are on GitHub (with a mix of Unlicense and MIT licenses) and on the web. The specific set of user stories was collected in 2016 by GitHub user @danfowler and is stored in a Trello board.
g14-datahub.txt
(2013) concerns the open source project DataHub, which is currently developed via a GitHub repository (the code has Apache License 2.0). DataHub is a data discovery platform which has been developed over multiple years. The specific data set is an initial set of user stories, which we can date back to 2013 thanks to a comment therein.
g16-mis.txt
(2015) is a collection of user stories that pertains to a repository for researchers and archivists. The source of the dataset is a public Trello repository. Although the user stories do not have explicit links to projects, it can be inferred that they originate from a project related to the library of Duke University.
g17-cask.txt
(2016) refers to the Cask Data Application Platform (CDAP). CDAP is an open source application platform (GitHub, under Apache License 2.0) that can be used to develop applications within the Apache Hadoop ecosystem, an open-source framework for distributed processing of large datasets. The user stories are extracted from a document of requirements regarding dataset management for Cask 4.0, which includes the scenarios, user stories, and a design for their implementation. The raw data is available in the following environment.
g18-neurohub.txt
(2012) is concerned with the NeuroHub platform, a neuroscience data management, analysis and collaboration platform that enables researchers in neuroscience to collect, store, and share data with colleagues or with the research community. The user stories were collected at a time when NeuroHub was still a research project sponsored by the UK Joint Information Systems Committee (JISC). For information about the research project from which the requirements were collected, see the following record.
g22-rdadmp.txt
(2018) is a collection of user stories from the Research Data Alliance's working group on DMP Common Standards. Their GitHub repository contains a collection of user stories created by asking the community to suggest functionality that should be part of a website that manages data management plans. Each user story is stored as an issue on the GitHub page.
g23-archivesspace.txt
(2012-2013) refers to ArchivesSpace: an open source web application for managing archives information. The application is designed to support core functions in archives administration such as accessioning; description and arrangement of processed materials including analog, hybrid, and born-digital content; management of authorities and rights; and reference service. The application supports collection management through collection management records, tracking of events, and a growing number of administrative reports. ArchivesSpace is open source and its
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/
Abstract A FORTRAN 77 program is presented which calculates energy values, reaction matrix and corresponding radial wave functions in a coupled-channel approximation of the hyperspherical adiabatic approach. In this approach, a multi-dimensional Schrödinger equation is reduced to a system of the coupled second-order ordinary differential equations on the finite interval with homogeneous boundary conditions of the third type. The resulting system of radial equations which contains the potential matrix ...
Title of program: KANTBP Catalogue Id: ADZH_v1_0
Nature of problem In the hyperspherical adiabatic approach [2-4], a multi-dimensional Schrödinger equation for a two-electron system [5] or a hydrogen atom in magnetic field [6] is reduced by separating the radial coordinate ρ from the angular variables to a system of second-order ordinary differential equations which contain potential matrix elements and first-derivative coupling terms. The purpose of this paper is to present the finite element method procedure based on the use of high-order accuracy approximati ...
Versions of this program held in the CPC repository in Mendeley Data ADZH_v1_0; KANTBP; 10.1016/j.cpc.2007.05.016 ADZH_v2_0; KANTBP; 10.1016/j.cpc.2008.06.005 ADZH_v3_0; KANTBP; 10.1016/j.cpc.2014.08.002
This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)
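Schematically, the coupled-channel reduction described above leads to a radial system of the following form. The notation below is generic and assumed for illustration, not taken from the KANTBP write-up:

```latex
% Generic coupled second-order radial system on a finite interval
% (schematic; U_{ij}, Q_{ij}, \chi_i, E are assumed notation):
\left( -\frac{d^{2}}{d\rho^{2}} + U_{ii}(\rho) - 2E \right) \chi_{i}(\rho)
  + \sum_{j \neq i} \left( U_{ij}(\rho)
  + Q_{ij}(\rho)\,\frac{d}{d\rho}
  + \frac{d}{d\rho}\,Q_{ij}(\rho) \right) \chi_{j}(\rho) = 0 ,
\qquad \rho \in [\rho_{\min}, \rho_{\max}] ,
% with homogeneous boundary conditions of the third (Robin) type:
\left. \left( \frac{d\chi_{i}}{d\rho} - R\,\chi_{i} \right)
\right|_{\rho = \rho_{\min},\, \rho_{\max}} = 0 .
```

Here the off-diagonal potential matrix elements and the first-derivative coupling terms correspond to the couplings mentioned in the problem description.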
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.3/customlicense?persistentId=doi:10.7910/DVN/R33RS9
Harvard Dataverse => Digital Library - Projects & Theses - Prof. Dr. Scholz

Introduction and background information to "Digital Library - Projects & Theses - Prof. Dr. Scholz".
The URL of the dataverse: http://dataverse.harvard.edu/dataverse/LibraryProfScholz
The URL of this (introduction) dataset: http://doi.org/10.7910/DVN/R33RS9

YOU MAY HAVE BEEN DIRECTED HERE BECAUSE THE CALLING PAGE HAS NO OTHER ENTRY POINT (with DOI) INTO THIS DATAVERSE. Click on the title of this page to reach the start page of the dataverse!

Introduction to the Data in this Dataverse

This dataverse is about: Aircraft Design, Flight Mechanics, Aircraft Systems. It contains research data and software produced by students for their projects and theses on the above topics. Get linked to all other resources from their reports using the URN from the German National Library (DNB) as given in each dataset under "Metadata": https://nbn-resolving.org/html/urn:nbn:de:gbv:18302-aeroJJJJ-MM-DD.01x

Alternative sites that store the data given in this dataverse are http://library.ProfScholz.de and https://archive.org/details/@profscholz

To download data: open an "item"; under "DOWNLOAD OPTIONS" select the file (as far as available) called "ZIP" to download DataXxxx.zip. Alternatively, go to "SHOW ALL"; in the new window, click "View Contents" next to DataXxxx.zip, or select the URL next to "Data-list" to download a single file from DataXxxx.zip.

Data Publishing

Data publishing means publishing research data for (re)use by others. It consists of preparing single files or a dataset containing several files for access on the WWW. This practice is part of the open science movement. There is consensus about the benefits resulting from Open Data, especially in connection with Open Access publishing. It is important to link the publication (e.g. a thesis) with the underlying data and vice versa.

General (not disciplinary) and free data repositories are: Harvard Dataverse (this one!), figshare (emphasis: multimedia), Zenodo (emphasis: results from EU research, mainly text), and Mendeley Data (emphasis: data associated with journal articles). To find data repositories use http://re3data.org. Read more at https://en.wikipedia.org/wiki/Data_publishing
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data was generated from an experiment accompanying the paper "Leadership heuristic".
Other information: Published in Digital Commons Data - Mendeley Data Repository. License: https://creativecommons.org/licenses/by/4.0. See the dataset on the publisher's website: https://data.mendeley.com/datasets/4424s5pcwk/2
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Huntington’s disease (HD) is a rare neurodegenerative disorder caused by an expansion of the CAG trinucleotide repeat in exon 1 of the huntingtin (HTT) gene located on the short arm of chromosome 4. This expansion leads to an elongation of the polyglutamine tract in the mutant huntingtin (mHTT) protein. Based on the toxic gain-of-function hypothesis, mHTT-lowering interventions are currently among the most promising therapeutic approaches. As part of the effort to identify a second-generation mHTT PET ligand, a novel fluorinated radioligand, [18F]CHDI-650, was recently developed. [18F]CHDI-650 displayed suitable reversible kinetics in wild-type (WT) mice and non-human primates, as well as specific binding to mHTT aggregates in post-mortem human HD brain. To ensure the suitability of [18F]CHDI-650 in humans, a dosimetry study in a preclinical species is generally recommended to estimate the radiation exposure due to the radioligand.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository includes all data and code that was used to produce the empirical results of the paper Complementary bidding and cartel detection: Evidence from Nordic asphalt markets.
Sand banks in the SW North Sea are heavily influenced by the relationship between the critical shear force of the sediment and the shear force of the water motions. The data show how the sand bank distribution is bound to certain shear force conditions. It appears that the ratio between the shear force of the water motions and the critical shear force of the bedload (simply referred to as the shear force ratio) has to exceed a threshold of four to trigger sand bank formation to a greater extent. Previous prediction models have suggested that sand banks form when the shear force of the water motions exceeds the critical shear force of the sediment. These findings can therefore improve further predictions of sand bank formation.
The dataset contains:
Shear force ratio as raster data (unit: N/m^2). The shear force ratio was calculated using several raster datasets provided by the Helmholtz Centre of Geesthacht.
Sand bank shapefile
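The threshold rule stated in the description can be expressed directly. The sketch below assumes scalar shear force values; applying it cell-wise to the raster data would reproduce the described classification.

```python
THRESHOLD = 4.0  # threshold reported in the dataset description

def shear_force_ratio(tau_water, tau_critical):
    """Ratio of the shear force of the water motions to the critical
    shear force of the bedload (both in N/m^2)."""
    return tau_water / tau_critical

def sand_bank_formation_likely(tau_water, tau_critical, threshold=THRESHOLD):
    """True when the shear force ratio exceeds the reported threshold
    of four, i.e. when sand bank formation is launched to a greater
    extent according to the dataset description."""
    return shear_force_ratio(tau_water, tau_critical) > threshold
```

Note that, under this rule, the water shear force merely exceeding the critical value (ratio just above one), as assumed in previous prediction models, is not sufficient.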
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/
Title of program: COMMUMAT Catalogue Id: AABD_v1_0
Nature of problem To find the invariants, the polynomials in the generators, labelling the unitary group representations.
Versions of this program held in the CPC repository in Mendeley Data aabd_v1_0; COMMUMAT; 10.1016/0010-4655(85)90065-7
This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license
Abstract It is shown that whenever the multiplicative normalization of a fitting function is not known, least square fitting by χ² minimization can be performed with one parameter less than usual by converting the normalization parameter into a function of the remaining parameters and the data.
Title of program: FITM1 Catalogue Id: AEYG_v1_0
Nature of problem Least square minimization when one of the free parameters is the multiplicative normalization of the fitting function.
Versions of this program held in the CPC repository in Mendeley Data AEYG_v1_0; FITM1; 10.1016/j.cpc.2015.09.021
This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2018)
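The normalization elimination described in the abstract has a standard closed form for a model f(x; A, θ) = A·g(x; θ): setting the derivative of χ² with respect to A to zero yields A as a function of the remaining parameters and the data. The sketch below illustrates that idea; it is not the FITM1 code.

```python
def optimal_normalization(y, g, sigma):
    """Closed-form normalization A minimizing
    chi^2(A) = sum_i ((y_i - A*g_i) / sigma_i)^2.
    Setting d(chi^2)/dA = 0 gives
    A = (sum_i y_i*g_i/sigma_i^2) / (sum_i g_i^2/sigma_i^2)."""
    num = sum(yi * gi / si ** 2 for yi, gi, si in zip(y, g, sigma))
    den = sum(gi * gi / si ** 2 for gi, si in zip(g, sigma))
    return num / den

def chi2_without_norm(y, g, sigma):
    """chi^2 with the normalization eliminated analytically, so any
    outer minimization over the shape parameters runs with one
    parameter fewer."""
    a = optimal_normalization(y, g, sigma)
    return sum(((yi - a * gi) / si) ** 2 for yi, gi, si in zip(y, g, sigma))
```

For data that is an exact multiple of the model shape, the recovered A is that multiple and the residual χ² is zero.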
Abstract This paper describes the main features of a state-of-the-art Monte Carlo solver for radiation transport which has been implemented within COOLFluiD, a world-class open source object-oriented platform for scientific simulations. The Monte Carlo code makes use of efficient ray tracing algorithms (for 2D, axisymmetric and 3D arbitrary unstructured meshes) which are described in detail. The solver accuracy is first verified in testcases for which analytical solutions are available, then validated...
Title of program: COOLFluiD-MC Catalogue Id: AEZG_v1_0
Nature of problem Radiative processes play a fundamental role in countless science and engineering contexts, including combustion, astrophysics, atmospheric space re-entry, experiments in plasma facilities (e.g. shock tubes, arc jets). The problem we are interested in is the computation of radiative heat transfer on arbitrarily complex geometries, in particular to characterize thermal loads acting on the surface of space vehicles.
Versions of this program held in the CPC repository in Mendeley Data AEZG_v1_0; COOLFluiD-MC; 10.1016/j.cpc.2015.12.017
A database with R-groups frequently used in medicinal chemistry and their preferred replacements is provided. For frequently used R-groups, replacements are organized in hierarchies as specified in the readme.txt file. The data deposition accompanies a forthcoming publication by the authors in which the database will be described in detail.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Government regulations increasingly require mobile and web-based application (app) companies to standardize their data practices concerning the collection, use, and sharing of various types of information. A summary of these practices is communicated to users through online privacy policies. The challenge of acquiring requirements from data practice descriptions, however, is that privacy policies often contain ambiguities. Abstract and ambiguous terminology in requirements statements concerning information types (e.g., "we collect your device information") can reduce shared understanding among app developers, policy writers, and users. To address this challenge, we propose a syntax-driven method that first parses a given information type phrase (e.g. mobile device identifier) into its constituents using a context-free grammar and second infers semantic relationships between constituents using semantic rules. The inferred semantic relationships between a given phrase and its constituents generate a hierarchy that models the generality and ambiguity of phrases. Through this method, we infer relations from a lexicon consisting of a set of information type phrases to populate a partial ontology. The resulting ontology is a knowledge graph that can be used to guide requirements authors in the selection of the most appropriate information type terms.
We evaluate the method’s performance using two criteria: (1) expert assessment of relations between information types; and (2) non-expert preferences for relations between information types. The results suggest performance improvement when compared to a previously proposed method. We also evaluate the reliability of the method considering the information types extracted from different data practices (e.g., collection, usage, sharing, etc.) in privacy policies for mobile or web-based apps in various app domains.
This data repository contains lexicons and ontologies that we used to construct and evaluate our method.
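As a toy illustration of how a hierarchy over an information type phrase might look, the sketch below generates progressively more general phrases by dropping leading modifiers. This head-final heuristic is a simplified stand-in for the paper's context-free grammar and semantic rules, not the authors' implementation.

```python
def generalizations(phrase):
    """Return progressively more general information types by dropping
    leading modifiers, e.g. 'mobile device identifier' ->
    'device identifier' -> 'identifier'. Assumes (as a simplification)
    that the rightmost noun is the semantic head of the phrase."""
    tokens = phrase.lower().split()
    return [" ".join(tokens[i:]) for i in range(len(tokens))]
```

Each adjacent pair in the returned list can be read as a subsumption edge (more specific term below, more general term above) in a partial ontology.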
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/
Title of program: NGHBTRNS Catalogue Id: AAMD_v1_0
Nature of problem To find the set of neighbouring transpositions which is equivalent to each permutation of any symmetric group.
Versions of this program held in the CPC repository in Mendeley Data aamd_v1_0; NGHBTRNS; 10.1016/0010-4655(81)90131-4
This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license
Abstract Procedures to manipulate pseudo-differential operators in MAPLE are implemented in the program PSEUDO to perform calculations with integrable models. We use lazy evaluation and streams to represent and operate with pseudo-differential operators. No order of truncation is needed since terms are produced on demand. We give a series of concrete examples.
Title of program: PSEUDO Catalogue Id: ADUO_v1_0
Nature of problem Determination of equations of motion and conserved charges in the theory of integrable models.
Versions of this program held in the CPC repository in Mendeley Data ADUO_v1_0; PSEUDO; 10.1016/j.cpc.2004.08.001
This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2018)
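The on-demand production of terms mentioned in the abstract can be mimicked in Python with closures over coefficient functions. The sketch below illustrates the lazy-stream idea on the Cauchy product of two formal series; it is an illustration of the technique, not the MAPLE implementation.

```python
def cauchy_product(a, b):
    """Multiply two formal series given as coefficient functions a(n),
    b(n). Each coefficient c_n = sum_{k=0..n} a(k) * b(n-k) is computed
    only when asked for, so no truncation order has to be fixed in
    advance, mirroring the lazy-evaluation approach."""
    def c(n):
        return sum(a(k) * b(n - k) for k in range(n + 1))
    return c

def stream(coeff, upto):
    """Materialize the first `upto` coefficients of a lazy series."""
    return [coeff(n) for n in range(upto)]
```

For example, squaring the geometric series (all coefficients 1) lazily yields the coefficients 1, 2, 3, 4, ... on demand.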