CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset includes all experimental data used for the PhD thesis of Cong Liu, entitled "Software Data Analytics: Architectural Model Discovery and Design Pattern Detection". These data were generated by instrumenting both synthetic and real-life software systems, and are formatted according to the IEEE XES format. See http://www.xes-standard.org/ and https://www.win.tue.nl/ieeetfpm/lib/exe/fetch.php?media=shared:downloads:2017-06-22-xes-software-event-v5-2.pdf for further explanation.
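As an illustration of the XES structure (not a tool shipped with the dataset), a minimal Python sketch for listing the events of one log is given below; the file name and the attribute keys printed are assumptions based on the XES standard.

import xml.etree.ElementTree as ET

# Hypothetical file name; the dataset's actual log files may be named differently.
LOG_FILE = "software_event_log.xes"

def local(tag):
    """Strip an XML namespace, if present, from a tag name."""
    return tag.rsplit("}", 1)[-1]

tree = ET.parse(LOG_FILE)
for trace in tree.getroot():
    if local(trace.tag) != "trace":
        continue
    for event in trace:
        if local(event.tag) != "event":
            continue
        # XES events carry typed key/value attributes, e.g. concept:name.
        attrs = {child.get("key"): child.get("value") for child in event}
        print(attrs.get("concept:name"), attrs.get("time:timestamp"))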
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This repository contains the data generated during the PhD project: Structure-Based Prediction of Protein Behavior in Preparative Chromatography
By Tim Neijenhuis at Delft University of Technology
Supervisors: Marcel Ottens and Marieke Klijn
Department of Biotechnology, Section of Bioprocess Engineering.
When using the data, please cite:
Neijenhuis, T., Le Bussy, O., Geldhof, G., Klijn, M. E., & Ottens, M. (2024). Predicting protein retention in ion‐exchange chromatography using an open source QSPR workflow. Biotechnology Journal, 19(3), 2300708.
Keulen, D., Neijenhuis, T., Lazopoulou, A., Disela, R., Geldhof, G., Le Bussy, O., ... & Ottens, M. (2025). From protein structure to an optimized chromatographic capture step using multiscale modeling. Biotechnology Progress, 41(1), e3505.
Disela, R., Neijenhuis, T., Le Bussy, O., Geldhof, G., Klijn, M., Pabst, M., & Ottens, M. (2024). Experimental characterization and prediction of Escherichia coli host cell proteome retention during preparative chromatography. Biotechnology and Bioengineering, 121(12), 3848-3859.
The objective of the PhD project was to predict the behavior of proteins for different chromatographic columns using molecular modeling methods.
This repository contains all predicted and measured values for each protein for the different columns investigated during the research in a CSV format.
The name of each file is identical to the corresponding figure in the dissertation, stating first the chapter and then the figure (as {chapter}_data_{figure}.csv).
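Assuming the naming scheme above, a short Python sketch for collecting the files of one chapter into data frames could look like this (the chapter number used in the glob pattern is an assumption):

import glob
import pandas as pd

# Collect all CSV files for, e.g., chapter 3, following the
# {chapter}_data_{figure}.csv convention described above.
frames = {}
for path in sorted(glob.glob("3_data_*.csv")):
    frames[path] = pd.read_csv(path)
    print(path, frames[path].shape)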
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NOTE FOR WMT PARTICIPANTS: There is an easier version for MT available in Moses format (one sentence per line). The files start with moses_like.
If you use this dataset, please cite the following work:
@inproceedings{soares2018parallel,
  title={A Parallel Corpus of Theses and Dissertations Abstracts},
  author={Soares, Felipe and Yamashita, Gabrielli Harumi and Anzanello, Michel Jose},
  booktitle={International Conference on Computational Processing of the Portuguese Language},
  pages={345--352},
  year={2018},
  organization={Springer}
}
In Brazil, the governmental body responsible for overseeing and coordinating post-graduate programs, CAPES, keeps records of all theses and dissertations presented in the country. Information regarding such documents can be accessed online in the Theses and Dissertations Catalog (TDC), which contains abstracts in Portuguese and English, and additional data regarding such documents. Thus, this database can be a potential source of parallel corpora for the Portuguese and English languages. In this article, we present the development of a parallel corpus from TDC, which is made available by CAPES under the open data initiative. Approximately 240,000 documents were collected and aligned using the Hunalign algorithm. We demonstrate the capability of our developed corpus by training Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) models for both language directions, followed by a comparison with Google Translate (GT). Both of our translation models presented better BLEU scores than GT, with the NMT system being the most accurate one. Sentence alignment was also manually evaluated, presenting an average of XX% correctly aligned sentences. Our parallel corpus is freely available in TMX format, with complementary information regarding document metadata.
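Since the corpus is distributed in TMX (an XML-based translation memory format), a minimal, hedged sketch of iterating over translation units with Python's standard library is shown below; the file name is an assumption.

import xml.etree.ElementTree as ET

# Hypothetical file name; the actual TMX file in the corpus may be named differently.
TMX_FILE = "theses_abstracts.tmx"

def iter_translation_units(path):
    """Yield one {language code: segment text} dictionary per <tu> element."""
    tree = ET.parse(path)
    for tu in tree.getroot().iter("tu"):
        pair = {}
        for tuv in tu.iter("tuv"):
            # TMX stores the language code in the xml:lang attribute.
            lang = tuv.get("{http://www.w3.org/XML/1998/namespace}lang") or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                pair[lang] = "".join(seg.itertext()).strip()
        yield pair

for i, pair in enumerate(iter_translation_units(TMX_FILE)):
    print(pair)
    if i >= 4:  # show only the first few sentence pairs
        break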
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the supplementary material to the master's thesis:
"NUTS-3 Regionalization of Industrial Load Shifting Potential in Germany using a Time-Resolved Model"
LICENSE
All output data provided are under the Creative Commons Attribution 4.0 International Public License. All Python scripts provided are under the Apache License, Version 2.0. For the source and data license information of the input data, refer to the ‘Input data documentation’ file. For the full license texts, refer to the LICENSE files.
DATASET DESCRIPTION
The supplementary material is organized into four different subdirectories:
For more information refer to the README file.
For a detailed description of the approach developed by the author, the input data used and the generated results, refer to the master's thesis "NUTS-3 Regionalization of Industrial Load Shifting Potential in Germany using a Time-Resolved Model", available here: https://elib.dlr.de/134116/
In case of questions, please contact: bruno.schyska@dlr.de or wilko.heitkoetter@dlr.de
https://www.law.cornell.edu/uscode/text/17/106
Bottom-up proteomics (BUP) is a powerful analytical technique that involves digesting complex protein mixtures into peptides and analyzing them with liquid chromatography and tandem mass spectrometry to identify and quantify many proteins simultaneously. This produces massive multidimensional datasets that require informatics tools to analyze. The landscape of software tools for BUP analysis is vast and complex, and custom programs and scripts are often required to answer the biological questions of interest in any given experiment.
This dissertation introduces novel methods and tools for analyzing BUP experiments and applies those methods to new samples. First, PrIntMap-R, a custom application for intraprotein intensity mapping, is developed and validated. This application is the first open-source tool to allow for statistical comparisons of peptides within a protein sequence along with quantitative sequence coverage visualization. Next, innovative sample preparation techniques and informatics methods are applied to characterize MUC16, a key ovarian cancer biomarker. This includes the proteomic validation of a novel model of MUC16 differing from the dominant isoform reported in the literature. Shifting to bacterial studies, custom differential expression workflows are employed to investigate the role of virulence lipids in mycobacterial protein secretion by analyzing mutant strains of mycobacteria. This work links lipid presence and virulence factor secretion for the first time. Building on these efforts, OnePotN??TA, a labeling technique enabling quantification of N-terminal acetylation in mycobacterial samples, is introduced. This method is the first technique to simultaneously quantify protein and N-terminal acetylation abundance using bottom-up proteomics, advancing the field of post-translational modification quantification. This project resulted in the identification of 37 new putative substrates for an N-acetyltransferase, three of which have since been validated biochemically. These tools and methodologies are further applied to various biological research areas, including breast cancer drug characterization and insect saliva analysis, to perform the first proteomic studies of their kind with these respective treatments and samples. Additionally, a project focused on teaching programming skills relevant to analytical chemistry is presented. Collectively, this work enhances the analytical capabilities of bottom-up proteomics, providing novel tools and methodologies that advance protein characterization, post-translational modification analysis, and biological discovery across diverse research areas.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
A look at the data provided by the Open Access search engine BASE (http://base-search.net) shows that Open Science compliance among doctoral theses is stagnating. BASE distinguishes three categories of accessibility: Open Access, Unknown, and Non-Open Access. In the following tables and graphs, figures reported as "Open Access" have been categorised by BASE as Open Access. The tables and graphics show data from BASE (as of 06.03.2018) as follows:
a) Indexed theses, books and journal articles
b) Indexed theses, books and journal articles published by Open Access
c) Indexed theses, books and journal articles under Creative Commons licenses.
d) Indexed theses, books and journal articles published under open licenses, i.e. with terms of use in the spirit of open source.
Although doctoral theses already had a high share of open access by 2013 (43%), by 2017 it had risen by only 5% (2017: 48%). At the same time, the proportion of books published in open access rose by 14% (from 20% to 34%) and of articles by 17%, from 44% (2013) to 61% (2017). The same effect can be seen in the proportion of CC-licensed items: their share rose by 4% (from 9% to 13%) for doctoral theses, by 9% for books (from 4% to 13%) and by 8% for articles (from 10% to 18%) between 2013 and 2017. The effect is most pronounced for openly licensed items: their share did not increase for doctoral theses, remaining at 2% between 2013 and 2017; in the same period it increased by 5% (from 1% to 6%) for books and by 5% (from 5% to 10%) for articles.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Doctoral thesis. Open source probabilistic models for human functional genomics. Includes: the press release, the thesis, and LaTeX sources.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The files contain the dataset for the thesis "Development and Validation of Explainable Machine-Learning Prediction Systems: A Study of Biomedical and Clinical Data".
Chapter 3 includes a patient dataset with CDI (Clostridioides difficile infection) admissions from 2009-2014 in Hong Kong.
Chapter 4 includes a list of protein structure data derived from UniProt (www.uniprot.org) (release 2021_03) and their corresponding enzyme functions. The protein structure files can be downloaded from the open-access database Protein Data Bank (www.rcsb.org). A list of AlphaFold 2 predicted structures is also included; the corresponding structural data can be downloaded from www.alphafold.com.
Chapter 5 contains a list of PDB structures derived from UniProt (release 2023_01).
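For the structures referenced in chapters 4 and 5, coordinate files can be fetched from the Protein Data Bank by identifier; a minimal sketch follows (the example PDB ID is an assumption, not an entry from the dataset):

import urllib.request

# Download a single structure from the RCSB PDB; "1TIM" is just an example ID.
pdb_id = "1TIM"
url = f"https://files.rcsb.org/download/{pdb_id}.pdb"
with urllib.request.urlopen(url) as response, open(f"{pdb_id}.pdb", "wb") as out:
    out.write(response.read())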
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset corresponds to the raw data and experimental measurements of the PhD thesis "Experimental investigation of the effects of particle shape and friction on the mechanics of granular media" of Gustavo Pinzón (2023, Université Grenoble Alpes), available at: https://hal.science/tel-04202827v1.
The experiments correspond to drained triaxial compression tests of cylindrical granular specimens, a common testing procedure used in soil mechanics to characterise the mechanical response of a specimen under deviatoric loading. Each specimen is 140 mm in height and 70 mm in diameter, and is composed of more than 20000 ellipsoidal particles of a given aspect ratio and interparticle friction. The dataset comprises the tests of six specimens, resulting from the combination of 3 particle shapes (Flat, Medium, and Rounded) and 2 values of interparticle friction (Rough and Smooth). A naming system for the specimens is adopted to reflect the morphology of the composing particles (e.g., the test EFR corresponds to the specimen with Flat and Rough particles). Further details on the experimental methods are found in Ch. 2 of the thesis.
The compression tests are performed inside the x-ray scanner of Laboratoire 3SR in Grenoble (France), where the specimens are scanned every 0.5% of axial shortening, at an isotropic voxel size of 100 micrometres. The obtained radiographies are reconstructed using a Filtered Back Projection algorithm, using the software provided by the x-ray cabin manufacturer (RX Solutions, France). The series of obtained 16-bit greyscale 3D images are processed with the open-source software spam, version 0.6.2. The coordinate system of all the images is ZYX, where Z corresponds to the compression direction. Further details on the image analysis techniques are found in Ch. 3 of the thesis.
Additional greyscale images, raw projections, and x-ray tomography files are available upon request. For visualisation purposes, the 3D images in .tif format can be opened using Fiji.
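For programmatic inspection of the greyscale volumes (an alternative to Fiji; the thesis pipeline itself used spam), one possible sketch using the tifffile package reads a 3D .tif and reports its ZYX shape. The file name is an assumption:

import tifffile

# Hypothetical file name; the dataset's image files may be named differently.
volume = tifffile.imread("EFR_step00.tif")  # 16-bit greyscale, axes ordered Z, Y, X
print("shape (Z, Y, X):", volume.shape, "dtype:", volume.dtype)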
The evolution of a software system can be studied in terms of how various properties, as reflected by software metrics, change over time. Current models of software evolution have allowed for inferences to be drawn about certain attributes of the software system, for instance, regarding the architecture, complexity and its impact on the development effort. However, an inherent limitation of these models is that they do not provide any direct insight into where growth takes place. In particular, we cannot assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed in order to answer questions such as 'do developers tend to evenly distribute complexity as systems get bigger?', and 'do large and complex classes get bigger over time?'. These are questions of more than passing interest since by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. Information gained from an analysis of the distribution of growth will also show if there are consistent boundaries within which a software design structure exists. In our study of metric distributions, we focused on 10 different measures that span a range of size and complexity measures. The raw metric data (4 .txt files and 1 .log file in a .zip file measuring ~0.5 MB in total) are provided as comma-separated values (CSV) files, and the first line of each CSV file contains the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
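A brief sketch of how the comma-separated metric data could be loaded and its distribution summarised (the file name and column contents are assumptions; the header is read from the first line as described above):

import pandas as pd

# Hypothetical file name; the archive contains four .txt files and one .log file.
metrics = pd.read_csv("class_metrics.txt")  # first line is the header
print(metrics.describe())                   # per-metric distribution summary
print(metrics.skew(numeric_only=True))      # skewness hints at how growth is distributed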
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
Welcome to the FEIS (Fourteen-channel EEG with Imagined Speech) dataset.
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
The FEIS dataset comprises Emotiv EPOC+ [1] EEG recordings of:
21 participants listening to, imagining speaking, and then actually speaking 16 English phonemes (see supplementary, below)
2 participants listening to, imagining speaking, and then actually speaking 16 Chinese syllables (see supplementary, below)
For replicability and for the benefit of further research, this dataset includes the complete experiment set-up, including participants' recorded audio and 'flashcard' screens for audio-visual prompts, the Lua script and .mxs scenario for the OpenViBE [2] environment, as well as all Python scripts for the preparation and processing of data as used in the supporting studies (submitted in support of completion of the MSc Speech and Language Processing with the University of Edinburgh):
J. Clayton, "Towards phone classification from imagined speech using a lightweight EEG brain-computer interface," M.Sc. dissertation, University of Edinburgh, Edinburgh, UK, 2019.
S. Wellington, "An investigation into the possibilities and limitations of decoding heard, imagined and spoken phonemes using a low-density, mobile EEG headset," M.Sc. dissertation, University of Edinburgh, Edinburgh, UK, 2019.
Each participant's data comprise 5 .csv files -- these are the 'raw' (unprocessed) EEG recordings for the 'stimuli', 'articulators' (see supplementary, below), 'thinking', 'speaking' and 'resting' phases per epoch for each trial -- alongside a 'full' .csv file with the end-to-end experiment recording (for the benefit of calculating deltas).
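As a hedged illustration (the directory layout and column names are assumptions; consult the included Python scripts for the authoritative format), a per-phase recording can be loaded as follows:

import pandas as pd

# Hypothetical path; each participant's data include 'stimuli', 'articulators',
# 'thinking', 'speaking' and 'resting' CSV files plus a 'full' recording.
thinking = pd.read_csv("01/thinking.csv")
print(thinking.shape)
print(thinking.columns.tolist())  # expected to include the 14 EEG channels listed below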
To guard against software deprecation or inaccessibility, the full repository of open-source software used in the above studies is also included.
We hope the FEIS dataset will be of utility to future researchers, given the scarcity of similar open-access databases. As such, this dataset is made freely available for all academic and research purposes (non-profit).
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
REFERENCING
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
If you use the FEIS dataset, please reference:
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
LEGAL
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
The research supporting the distribution of this dataset has been approved by the PPLS Research Ethics Committee, School of Philosophy, Psychology and Language Sciences, University of Edinburgh (reference number: 435-1819/2).
This dataset is made available under the Open Data Commons Attribution License (ODC-BY): http://opendatacommons.org/licenses/by/1.0
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
ACKNOWLEDGEMENTS
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
The FEIS database was compiled by:
Scott Wellington (MSc Speech and Language Processing, University of Edinburgh)
Jonathan Clayton (MSc Speech and Language Processing, University of Edinburgh)
Principal Investigators:
Oliver Watts (Senior Researcher, CSTR, University of Edinburgh)
Cassia Valentini-Botinhao (Senior Researcher, CSTR, University of Edinburgh)
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
METADATA
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
For participants, dataset refs 01 to 21:
01 - NNS
02 - NNS
03 - NNS, Left-handed
04 - E
05 - E, Voice heard as part of 'stimuli' portions of trials belongs to participant 04, due to microphone becoming damaged and unusable prior to recording
06 - E
07 - E
08 - E, Ambidextrous
09 - NNS, Left-handed
10 - E
11 - NNS
12 - NNS, Only sessions one and two recorded (out of three total), as participant had to leave the recording session early
13 - E
14 - NNS
15 - NNS
16 - NNS
17 - E
18 - NNS
19 - E
20 - E
21 - E
E = native speaker of English
NNS = non-native speaker of English (>= C1 level)
For participants, dataset refs chinese-1 and chinese-2:
chinese-1 - C
chinese-2 - C, Voice heard as part of 'stimuli' portions of trials belongs to participant chinese-1
C = native speaker of Chinese
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
SUPPLEMENTARY
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
Under the international 10-20 system, the 14 channels of the Emotiv EPOC+ headset are:
F3 FC5 AF3 F7 T7 P7 O1 O2 P8 T8 F8 AF4 FC6 F4
The 16 English phonemes investigated in dataset refs 01 to 21:
/i/ /u:/ /æ/ /ɔ:/ /m/ /n/ /ŋ/ /f/ /s/ /ʃ/ /v/ /z/ /ʒ/ /p/ /t/ /k/
The 16 Chinese syllables investigated in dataset refs chinese-1 and chinese-2:
mā má mǎ mà mēng méng měng mèng duō duó duǒ duò tuī tuí tuǐ tuì
All references to 'articulators' (e.g. as part of filenames) refer to the 1-second 'fixation point' portion of trials. The name is a holdover from preliminary trials which were modelled on the KARA ONE database (http://www.cs.toronto.edu/~complingweb/data/karaOne/karaOne.html) [3].
<>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
<> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>
[1] Emotiv EPOC+. https://emotiv.com/epoc. Accessed online 14/08/2019.
[2] Y. Renard, F. Lotte, G. Gibert, M. Congedo, E. Maby, V. Delannoy, O. Bertrand, A. Lécuyer. “OpenViBE: An Open-Source Software Platform to Design, Test and Use Brain-Computer Interfaces in Real and Virtual Environments”, Presence: teleoperators and virtual environments, vol. 19, no 1, 2010.
[3] S. Zhao, F. Rudzicz. "Classifying phonological categories in imagined and articulated speech." In Proceedings of ICASSP 2015, Brisbane Australia, 2015.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data used for developing an application that mapped Fowler's code smells to Pylint code smells. First, 179 files containing Pylint results were used. Those files came from projects submitted for Babes-Bolyai University's "Fundamentals of programming" course in the 2019-2020 academic year. Since those files did not contain many refactoring code smells, the application was tested on another set of data. The second set also came from projects submitted for Babes-Bolyai University's "Formal Languages and Compilation Techniques" course in the 2019-2020 academic year. For the second set of data, access to the code was also provided, so it was easier to perform an analysis on the code as well. The problem was that, again, not many refactoring code smells were discovered here, and in order to test the application a bigger project was needed. The third set of data came from the open-source project "TensorFlow". This is an end-to-end machine learning platform. It features a large, flexible ecosystem of tools, libraries, and community resources that enables academics to push the boundaries of machine learning and developers to quickly build and deploy ML-powered apps. All of this data can be found in the file.
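For context on how such Pylint result files can be produced and summarised, a small, hedged sketch using Pylint's JSON output is shown below; the module name is an assumption:

import json
import subprocess
from collections import Counter

# Run Pylint on a hypothetical module and capture machine-readable output.
result = subprocess.run(
    ["pylint", "--output-format=json", "my_module.py"],
    capture_output=True, text=True,
)
messages = json.loads(result.stdout or "[]")

# Count occurrences of each message symbol, e.g. 'too-many-branches'.
counts = Counter(msg["symbol"] for msg in messages)
for symbol, count in counts.most_common():
    print(symbol, count)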
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A photogrammetry dataset was collected using an Unmanned Aerial Vehicle (quadcopter) over a river stretch of the Black Volta at Bamboi Bridge, Ghana. Ground Control Points (GCPs), black-and-white markers placed in the landscape, were also collected. These can be used to better geographically constrain the photogrammetric solution. GCPs have been associated with the row and column pixel location in each photo in which they appear.
The raw data was processed into a 3D point cloud using the open-source software platform WebOpenDroneMap (WebODM). The point cloud was analysed for removal of vegetation using spatial filtering techniques, with the intent to make a bare-earth topographical map of the dry part of the riverbed. The unfiltered and filtered point clouds were further processed into a Digital Surface Model (DSM) and a Digital Terrain Model (DTM), respectively. The unfiltered dataset was also processed into an RGB orthophoto. In the thesis work of Hoogendoorn (2023), further research was done on combining the results of these datasets and analyses with wet bathymetry points collected using a fishfinder equipped with Real-Time-Kinematics GNSS, and on using the resulting full bathymetry for hydraulic modelling and for understanding relationships between wetted geometry and river discharge. For more information, we refer to the MSc thesis work of Hoogendoorn (2023).
The data files consist of three (3) .zip files. Unzip these to get access to all underlying files. For a quick overview, a .qgs file can be opened in QGIS; this will display all layers in a simple GIS project. The point cloud is also visualized but may take significant time before being rendered, as points first need to be cached.
References: Hoogendoorn, N. J.: 3D River Discharge Modelling using UAV photogrammetry | TU Delft Repository, Delft University of Technology, Delft, The Netherlands, 2023.
Link: https://repository.tudelft.nl/record/uuid:d4088a50-3590-4675-9600-d715800841a3
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the related dataset for the PhD dissertation by G. A. A. Prana, "Can We Make It Better? Assessing and Improving Quality of GitHub Repositories", available at https://ink.library.smu.edu.sg/etd_coll/373/
The code hosting platform GitHub has gained immense popularity worldwide in recent years, with over 200 million repositories hosted as of June 2021. Due to its popularity, it has great potential to facilitate widespread improvements across many software projects. Naturally, GitHub has attracted much research attention, and the source code in the various repositories it hosts also provides an opportunity to apply techniques and tools developed by software engineering researchers over the years. However, much of the existing body of research applicable to GitHub focuses on the code quality of the software projects and ways to improve it. Fewer works focus on potential ways to improve the quality of GitHub repositories through other aspects, although the quality of a software project on GitHub is also affected by factors outside a project's source code, such as documentation, the project's dependencies, and the pool of contributors. The three works that form this dissertation focus on investigating aspects of GitHub repositories beyond code quality, and identify specific potential improvements that can be applied to improve a wide range of GitHub repositories.
In the first work, we aim to systematically understand the content of README files in GitHub software projects, and develop a tool that can process them automatically. The work begins with a qualitative study involving 4,226 README file sections from 393 randomly-sampled GitHub repositories, which reveals that many README files contain the "What" and "How" of the software project, but often do not contain the purpose and status of the project. This is followed by the development and evaluation of a multi-label classifier that can predict eight different README content categories with an F1 of 0.746. From our subsequent evaluation of the classifier, which involved twenty software professionals, we find that adding labels generated by the classifier to README files eases information discovery.
Our second work focuses on characteristics of vulnerabilities in open-source libraries used by 450 software projects on GitHub that are written in Java, Python, and Ruby. Using an industrial software composition analysis tool, we scanned every version of the projects after each commit made between November 1, 2017 and October 31, 2018. Our subsequent analyses of the discovered library names, versions, and associated vulnerabilities reveal, among others, that "Denial of Service" and "Information Disclosure" vulnerability types are common. In addition, we also find that most of the vulnerabilities persist throughout the observation period, and that attributes such as project size, project popularity, and experience level of commit authors do not translate to better or worse handling of vulnerabilities in dependent libraries. Based on the findings in the second work, we list a number of implications for library users, library developers, as well as researchers, and provide several concrete recommendations. This includes recommendations to simplify projects' dependency sets, as well as to encourage research into ways to automatically recommend libraries known to be secure to developers.
In our third work, we conduct a multi-region geographical analysis of gender inclusion on GitHub. We use a mixed-methods approach involving a quantitative analysis of the commit authors of 21,456 project repositories, followed by a survey strategically targeted at developers in various regions worldwide and a qualitative analysis of the survey responses. Among other findings, we discover differences in diversity levels between regions, with Asia and the Americas being highest. We also find no strong correlation between the gender and geographic diversity of a repository's commit authors. Further, from our survey respondents worldwide, we also identify barriers and motivations to contribute to open-source software. The results of this work provide insights on the current state of gender diversity in open-source software and potential ways to improve participation of developers from under-represented regions and genders, and subsequently improve the open-source software community in general. Such potential ways include the creation of codes of conduct, proximity-based mentorship schemes, and highlighting of women / regional role models.
https://www.law.cornell.edu/uscode/text/17/106
Mathematical models have become increasingly critical due to the rapid advances in computational methods in recent decades. However, the validation of these models often demands extensive and costly data, leading to time-consuming processes. Traditional design of experiments (DoE) methods struggle to choose informative experiments, especially for the typically large-scale, nonlinear, and dynamical science-based models in chemical and biomolecular engineering (CBE). In this dissertation, I propose a sequential model validation workflow powered by novel DoE and measurement optimization (MO) frameworks to improve data acquisition efficiency and accelerate the model building and validation process. The workflow relies on two scalable and tractable frameworks, along with their generalized open-source software tools:
Measurement optimization: determines what to measure in experiments to maximize the experimental information content. It guides apparatus preparation during the experimental setup stage, balancing the information content with practical constraints such as budgets.
Model-based design of experiments: quantifies experimental information content statistically and optimizes experiment selection based on updated model information. This framework is used throughout the model validation process, recommending new experiments after each iteration to update the model.
Both frameworks address the challenges of applying DoE and MO techniques to large-scale, nonlinear, and dynamical models in CBE, providing user-friendly open-source software tools for widespread application.
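To make the idea of quantifying experimental information content concrete, here is a generic sketch (not the dissertation's actual software) of greedy D-optimal experiment selection via the Fisher information matrix; the sensitivity values are random placeholders:

import numpy as np

# Each row of S is the parameter-sensitivity vector of one candidate experiment
# (d output / d parameters); values here are random placeholders.
rng = np.random.default_rng(0)
S = rng.normal(size=(20, 4))          # 20 candidate experiments, 4 model parameters

selected, fim = [], 1e-6 * np.eye(4)  # small prior keeps the FIM invertible
for _ in range(5):                    # greedily pick 5 experiments
    gains = [np.linalg.slogdet(fim + np.outer(s, s))[1] for s in S]
    best = int(np.argmax(gains))      # D-optimality: maximise log det of the FIM
    selected.append(best)
    fim += np.outer(S[best], S[best])

print("selected experiments:", selected)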
In this dissertation, I describe the development of the model-based DoE and the MO frameworks and their generalized tools to streamline the model validation workflow for complex models such as partial differential algebraic equations (PDAEs). I briefly discuss how these frameworks and the open-source software tools contribute to the broad DoE technique paradigm and its applications. I demonstrate the tractability and scalability of the frameworks with laboratory and pilot-scale carbon capture experiments. Moreover, generalized open-source software tools are developed and applied to carbon capture experiments, highlighting their versatility and practicality. This dissertation lays the groundwork for a sequential MO and MBDoE workflow that can be readily applied to various challenging problems in CBE and beyond, offering potential benefits to broader science, technology, engineering, and math (STEM) communities. I conclude with a discussion of future directions, and provide preliminary work on some of them as a starting point.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data set based on the city of Amsterdam, which has been used in various simulations using the open-source agent-based transport simulation model MATSim (https://www.matsim.org/). It contains networks, agents' plans (and a description of how these have been derived from the ALBATROSS data set), configuration files and additional information material.
Regarding the original ALBATROSS data set, please contact Prof. Soora Rasouli (TU Eindhoven).
egon-data provides a transparent and reproducible, open-data-based data processing pipeline for generating data models suitable for energy system modeling. The data is customized for the requirements of the research project eGon. The research project aims to develop tools for open and cross-sectoral planning of transmission and distribution grids. For further information, please visit the eGon project website or its GitHub repository.
egon-data retrieves and processes data from several different external input sources. As not all data dependencies can be downloaded automatically from external sources, we provide a data bundle to be downloaded by egon-data.
The following data sets are part of the available data bundle:
These data comprise the data of four chapters from the PhD thesis of Frauendorf (2022), entitled 'Causes for spatiotemporal variation in reproductive performance of Eurasian oystercatchers in a human-dominated landscape'. The thesis focusses on quantifying the anthropogenic impacts on the reproductive performance of oystercatchers across the Netherlands. The dataset contains data from chapters 3, 5 and 6, which were used in the thesis but have not been published open access yet.
For chapter 3, oystercatchers were caught during winter across their wintering grounds (Wadden Sea and Delta estuary) and their condition was measured (physiological measurements through blood samples and biometric measurements). The data also include resighting data from ringed individuals that were used for mark-recapture survival analysis (state and age matrices). In addition, environmental variables were collected from open-source data. Next, the birds were followed from their wintering ground to the breeding ground, where we measured their reproductive success.
Chapter 5 includes data about the reproductive performance of oystercatchers (available from several different data sources across the Netherlands). Next, we collected data on the environment from different (open access) data sources on, for instance, habitat type, land use intensity, predation and food availability.
In chapter 6, we used data on bill shape as a proxy for feeding specialization based on data from winter catches (chapter 3 in PhD thesis) to illustrate the proportion of birds with different feeding specialization in the studied population.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All Raw and Processed Data + written Thesis. Data and Figures are stored in the 'Figures_and_Data' Directory. Experimental Measurements were done by means of BLS Microscopy (group of H. Schultheiß at HZDR). Micromagnetic Simulations were done at the Hemera Cluster (Dr. A. Kakay at HZDR). Data Analysis was done in Python or Jupyter Notebooks (Open Source). All scripts are included. Graphics were done using OmniGraffle and Blender. Plotting was done using Python and 'Plot2' (Mac Only!). All Files/Data/Scripts are sorted by Figure! The entire LaTeX package is stored under 'Thesis_Hula' - Dissertation.tex is the main file and shows all required dependencies.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/YYGB58
In a democracy, the relationship between the preferences of the citizens and the policies of the government is, in principle, fundamental. Whether this principle holds in practice has been the subject of a long but inconclusive debate in the political science literature. This dissertation focuses primarily on a different question, namely, what are the determinants of mass preferences over welfare state policies? To answer this question, new quantitative methods are developed, implemented in a Free, Libre, and Open Source Software package, and applied to relatively recent data. The primary contributions of this dissertation to the social science literature are two-fold. First, we present new empirical results on mass political preferences that will be of interest to political scientists, economists, and researchers in other fields. Second, those empirical results are obtained from new estimators that are especially useful for modeling preferences but are also useful for modeling other multivariate phenomena. The strength of these empirical results will hopefully spur innovation on a third front, namely the way in which political economists develop theoretical models of the process by which political preferences are aggregated in democracies. The first chapter is largely empirical and tests traditional political economy theories of preferences for redistribution against theories of inequality aversion, using the method developed in the second chapter. The main empirical conclusion of the first chapter is that a plurality of the variance in preferences for redistribution is attributable to differences in inequality aversion. The second chapter is methodological and attempts to answer the question of how many explanatory variables went into the data-generating process for the outcome variables we observe. The third chapter develops another new estimator and applies it to empirical data on preferences for redistribution and immigration. The main empirical conclusion of the third chapter is that not only is inequality aversion important to our understanding of preferences for redistribution but that it is mostly exogenous to other factors in the model.