32 datasets found
  1. Experimental data for "Software Data Analytics: Architectural Model...

    • figshare.com
    • data.4tu.nl
    zip
    Updated Jun 6, 2023
    Cite
    Cong Liu (2023). Experimental data for "Software Data Analytics: Architectural Model Discovery and Design Pattern Detection" [Dataset]. http://doi.org/10.4121/uuid:ca1b0690-d9c5-4626-a067-525ec9d5881b
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Cong Liu
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset includes all experimental data used for the PhD thesis of Cong Liu, entitled "Software Data Analytics: Architectural Model Discovery and Design Pattern Detection". These data are generated by instrumenting both synthetic and real-life software systems, and are formatted according to the IEEE XES format. See http://www.xes-standard.org/ and https://www.win.tue.nl/ieeetfpm/lib/exe/fetch.php?media=shared:downloads:2017-06-22-xes-software-event-v5-2.pdf for further explanation.
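Logs in this format can be inspected with nothing more than Python's standard XML tooling. The sketch below parses a minimal, hypothetical XES fragment: the element and key names follow the XES standard's concept extension, but the trace and event names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Minimal XES event log (structure per http://www.xes-standard.org/);
# the "concept:name" keys follow the standard's concept extension.
XES_SAMPLE = """<log>
  <trace>
    <string key="concept:name" value="case_1"/>
    <event><string key="concept:name" value="methodA"/></event>
    <event><string key="concept:name" value="methodB"/></event>
  </trace>
</log>"""

def trace_activities(xes_text):
    """Return, per trace, the ordered list of event names."""
    root = ET.fromstring(xes_text)
    traces = []
    for trace in root.iter("trace"):
        names = [s.get("value")
                 for ev in trace.iter("event")
                 for s in ev.iter("string")
                 if s.get("key") == "concept:name"]
        traces.append(names)
    return traces

print(trace_activities(XES_SAMPLE))  # [['methodA', 'methodB']]
```

Real XES files from the dataset may declare XML namespaces and carry extra attributes (timestamps, lifecycle transitions), which a robust reader would also need to handle.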

  2. Data accompanying the PhD dissertation "Structure-Based Prediction of...

    • data.4tu.nl
    Updated Jul 8, 2025
    Cite
    Tim Neijenhuis (2025). Data accompanying the PhD dissertation "Structure-Based Prediction of Protein Behavior in Preparative Chromatography" [Dataset]. http://doi.org/10.4121/0c67a2a7-3ead-4f1d-acac-9e1ef843d0d2.v1
    Explore at:
    Dataset updated
    Jul 8, 2025
    Dataset provided by
    4TU.ResearchData
    Authors
    Tim Neijenhuis
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This repository contains the data generated during the PhD project: Structure-Based Prediction of Protein Behavior in Preparative Chromatography

    By Tim Neijenhuis at Delft University of Technology

    Supervisors: Marcel Ottens and Marieke Klijn

    Department of Biotechnology, Section of Bioprocess Engineering.

    When using the data, please cite:

    Neijenhuis, T., Le Bussy, O., Geldhof, G., Klijn, M. E., & Ottens, M. (2024). Predicting protein retention in ion‐exchange chromatography using an open source QSPR workflow. Biotechnology Journal, 19(3), 2300708.

    Keulen, D., Neijenhuis, T., Lazopoulou, A., Disela, R., Geldhof, G., Le Bussy, O., ... & Ottens, M. (2025). From protein structure to an optimized chromatographic capture step using multiscale modeling. Biotechnology Progress, 41(1), e3505.

    Disela, R., Neijenhuis, T., Le Bussy, O., Geldhof, G., Klijn, M., Pabst, M., & Ottens, M. (2024). Experimental characterization and prediction of Escherichia coli host cell proteome retention during preparative chromatography. Biotechnology and Bioengineering, 121(12), 3848-3859.


    The objective of the PhD project was to predict the behavior of proteins for different chromatographic columns using molecular modeling methods.

    This repository contains all predicted and measured values for each protein for the different columns investigated during the research, in CSV format.

    The name of each file is identical to the corresponding figure in the dissertation, stating first the chapter and then the figure (as {chapter}_data_{figure}.csv).
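As a sketch of how such files might be located and read, the snippet below parses the naming pattern described above and loads one file with the standard csv module. The column names and values are invented for illustration; only the {chapter}_data_{figure}.csv pattern comes from the description.

```python
import csv, io, re

# Hypothetical instance of the naming scheme "{chapter}_data_{figure}.csv",
# e.g. chapter 4, figure 2.
FILENAME = "4_data_2.csv"

def parse_name(filename):
    """Extract (chapter, figure) from a '{chapter}_data_{figure}.csv' name."""
    m = re.fullmatch(r"(\w+)_data_(\w+)\.csv", filename)
    if not m:
        raise ValueError(f"unexpected file name: {filename}")
    return m.group(1), m.group(2)

# Column names here are invented; the real files hold predicted and
# measured values per protein for each column investigated.
sample = io.StringIO("protein,predicted,measured\nlysozyme,0.82,0.79\n")
rows = list(csv.DictReader(sample))

print(parse_name(FILENAME))  # ('4', '2')
print(rows[0]["protein"])    # lysozyme
```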

  3. A Parallel Corpus of Thesis and Dissertations Abstracts

    • figshare.com
    Updated May 30, 2023
    Cite
    Felipe Soares (2023). A Parallel Corpus of Thesis and Dissertations Abstracts [Dataset]. http://doi.org/10.6084/m9.figshare.5995519.v2
    Explore at:
    Available download formats: application/x-sqlite3
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Felipe Soares
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NOTE FOR WMT PARTICIPANTS: There is an easier version for MT available in Moses format (one sentence per line). The files start with moses_like.

    If you use this dataset, please cite the following work:

    @inproceedings{soares2018parallel, title={A Parallel Corpus of Theses and Dissertations Abstracts}, author={Soares, Felipe and Yamashita, Gabrielli Harumi and Anzanello, Michel Jose}, booktitle={International Conference on Computational Processing of the Portuguese Language}, pages={345--352}, year={2018}, organization={Springer} }

    In Brazil, the governmental body responsible for overseeing and coordinating post-graduate programs, CAPES, keeps records of all theses and dissertations presented in the country. Information regarding such documents can be accessed online in the Thesis and Dissertations Catalog (TDC), which contains abstracts in Portuguese and English, and additional data regarding such documents. Thus, this database can be a potential source of parallel corpora for the Portuguese and English languages. In this article, we present the development of a parallel corpus from TDC, which is made available by CAPES under the open data initiative. Approximately 240,000 documents were collected and aligned using the Hunalign algorithm. We demonstrate the capability of our developed corpus by training Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) models for both language directions, followed by a comparison with Google Translator (GT). Both of our translation models presented better BLEU scores than GT, with the NMT system being the most accurate one. Sentence alignment was also manually evaluated, presenting an average of XX% correctly aligned sentences. Our parallel corpus is freely available in TMX format, with complementary information regarding document metadata.
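A minimal sketch of consuming the Moses-format variant: two plain-text files, one sentence per line, where line i of the source aligns with line i of the target. The sentences below are invented, and real files would be opened from disk rather than simulated in memory.

```python
import io

# Moses format: source and target files aligned by line number. The
# corpus's files start with "moses_like"; here we simulate two of them.
pt = io.StringIO("Esta tese estuda corpora paralelos.\nOs resumos foram alinhados.\n")
en = io.StringIO("This thesis studies parallel corpora.\nThe abstracts were aligned.\n")

def read_parallel(src, tgt):
    """Yield (source, target) sentence pairs aligned by line number."""
    for s, t in zip(src, tgt):
        yield s.strip(), t.strip()

pairs = list(read_parallel(pt, en))
print(len(pairs))   # 2
print(pairs[0][1])  # This thesis studies parallel corpora.
```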

  4. Supplementary material to the masters thesis: NUTS-3 Regionalization of...

    • zenodo.org
    zip
    Updated Jun 5, 2020
    Cite
    Danielle Schmidt (2020). Supplementary material to the masters thesis: NUTS-3 Regionalization of Industrial Load Shifting Potential in Germany using a Time-Resolved Model [Dataset]. http://doi.org/10.5281/zenodo.3613767
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 5, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Danielle Schmidt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    This is the supplementary material to the master's thesis:

    "NUTS-3 Regionalization of Industrial Load Shifting Potential in Germany using a Time-Resolved Model"

    LICENSE

    All output data provided is under Creative Commons Attribution 4.0 International Public License. All Python scripts provided are under Apache License, Version 2.0. Refer to the ‘Input data documentation’ file for source and data license information for the input data. For license information, refer to the LICENSE files.

    DATASET DESCRIPTION

    The supplementary material is organized into four different subdirectories:

    • The subdirectory ‘Industrial processes’ contains the input data, Python scripts, and output data for the estimation of NUTS-3 load shifting potential of suitable electrically powered industrial processes (cement milling, mechanical pulping, paper production, air separation).
    • The subdirectory ‘Process heat’ contains the input data, Python scripts, and output data for the estimation of NUTS-3 load shifting potential of industrial process heat applications that are powered by electricity.
    • The subdirectory ‘Future projections’ contains the input data, Python scripts, and output data for the estimation of NUTS-0 annual average industrial processes load shifting potential in the future up until 2050.
    • The subdirectory ‘Other tables for reference’ contains data tables that are not used as input data to the Python scripts, but which may serve as useful further reference for the reader. These tables are the original or intermediate tables from which the input data tables were created.

    For more information refer to the README file.

    For a detailed description of the approach developed by the author, the input data used, and the generated results, refer to the master's thesis "NUTS-3 Regionalization of Industrial Load Shifting Potential in Germany using a Time-Resolved Model", available here: https://elib.dlr.de/134116/

    In case of questions please contact: bruno.schyska@dlr.de or wilko.heitkoetter@dlr.de

  5. Data from: Innovative Approaches and Tool Development for Proteomics Data...

    • curate.nd.edu
    Updated Apr 3, 2025
    Cite
    ETD Depositor; Simon Dyck Weaver (2025). Innovative Approaches and Tool Development for Proteomics Data Analysis: Applications Across Diverse Biological Systems [Dataset]. http://doi.org/10.7274/28716458.v1
    Explore at:
    Dataset updated
    Apr 3, 2025
    Dataset provided by
    University of Notre Dame
    Authors
    ETD Depositor; Simon Dyck Weaver
    License

    https://www.law.cornell.edu/uscode/text/17/106

    Description

    Bottom-up proteomics (BUP) is a powerful analytical technique that involves digesting complex protein mixtures into peptides and analyzing them with liquid chromatography and tandem mass spectrometry to identify and quantify many proteins simultaneously. This produces massive multidimensional datasets which require informatics tools to analyze. The landscape of software tools for BUP analysis is vast and complex, and often custom programs and scripts are required to answer the biological questions of interest in any given experiment.

    This dissertation introduces novel methods and tools for analyzing BUP experiments and applies those methods to new samples. First, PrIntMap-R, a custom application for intraprotein intensity mapping, is developed and validated. This application is the first open-source tool to allow for statistical comparisons of peptides within a protein sequence along with quantitative sequence coverage visualization. Next, innovative sample preparation techniques and informatics methods are applied to characterize MUC16, a key ovarian cancer biomarker. This includes the proteomic validation of a novel model of MUC16 differing from the dominant isoform reported in the literature. Shifting to bacterial studies, custom differential expression workflows are employed to investigate the role of virulence lipids in mycobacterial protein secretion by analyzing mutant strains of mycobacteria. This work links lipid presence and virulence factor secretion for the first time. Building on these efforts, OnePotN??TA, a labeling technique enabling quantification of N-terminal acetylation in mycobacterial samples, is introduced. This method is the first technique to simultaneously quantify protein and N-terminal acetylation abundance using bottom-up proteomics, advancing the field of post-translational modification quantification. This project resulted in the identification of 37 new putative substrates for an N-acetyltransferase, three of which have since been validated biochemically. These tools and methodologies are further applied to various biological research areas, including breast cancer drug characterization and insect saliva analysis, performing the first proteomic studies of their kind with these respective treatments and samples. Additionally, a project focused on teaching programming skills relevant to analytical chemistry is presented.
Collectively, this work enhances the analytical capabilities of bottom-up proteomics, providing novel tools and methodologies that advance protein characterization, post-translational modification analysis, and biological discovery across diverse research areas.

  6. Numbers of Articles, Books and Dissertation theses indexed in BASE and...

    • data.niaid.nih.gov
    Updated Aug 2, 2024
    Cite
    Herb, Ulrich (2024). Numbers of Articles, Books and Dissertation theses indexed in BASE and percentages of items published Open Access, under Creative Commons Licenses and under Open Licenses (2013-2017) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1189806
    Explore at:
    Dataset updated
    Aug 2, 2024
    Dataset authored and provided by
    Herb, Ulrich
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    A look at the data provided by the Open Access search engine BASE (http://base-search.net) shows that Open Science compliance among dissertation theses is stagnating. BASE distinguishes three categories of accessibility: Open Access, Unknown, and Non-Open Access. In the following tables and graphs, figures reported as "Open Access" have been categorised by BASE as Open Access. The tables and graphics show data from BASE (as of 06.03.2018) as follows:

    a) Indexed theses, books and journal articles

    b) Indexed theses, books and journal articles published Open Access

    c) Indexed theses, books and journal articles under Creative Commons licenses

    d) Indexed theses, books and journal articles published under open licenses, i.e. licenses reflecting open-source terms of use

    Although doctoral theses already had a high share of Open Access by 2013 (43%), by 2017 it had risen by only 5 percentage points (2017: 48%). At the same time, the proportion of books published Open Access rose by 14 percentage points (from 20% to 34%) and articles by 17, from 44% (2013) to 61% (2017). The same effect can be seen in the proportion of CC-licensed items: their share rose by 4 percentage points (from 9% to 13%) for doctoral theses, by 9 for books (from 4% to 13%) and by 8 for articles (from 10% to 18%) between 2013 and 2017. The effect is most pronounced for the share of openly licensed items: it did not increase for doctoral theses, remaining at 2% between 2013 and 2017; in the same period it increased by 5 percentage points (from 1% to 6%) for books, and by 5 (from 5% to 10%) for articles.

  7. D.Sc. Thesis materials

    • figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Leo Lahti (2023). D.Sc. Thesis materials [Dataset]. http://doi.org/10.6084/m9.figshare.810514.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Leo Lahti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Doctoral thesis. Open source probabilistic models for human functional genomics. Includes: the press release, the thesis, and LaTeX sources.

  8. Supplementary data for the thesis "Development and Validation of Explainable...

    • datahub.hku.hk
    Updated Jul 18, 2024
    Cite
    Yui Lun Ng (2024). Supplementary data for the thesis "Development and Validation of Explainable Machine-Learning Prediction Systems: A Study of Biomedical and Clinical Data" [Dataset]. http://doi.org/10.25442/hku.26172664.v1
    Explore at:
    Dataset updated
    Jul 18, 2024
    Dataset provided by
    HKU Data Repository
    Authors
    Yui Lun Ng
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The files contain the dataset for the thesis "Development and Validation of Explainable Machine-Learning Prediction Systems: A Study of Biomedical and Clinical Data".

    Chapter 3 includes a patient dataset with CDI (Clostridioides difficile infection) admissions from 2009-2014 in Hong Kong.

    Chapter 4 includes a list of protein structure data derived from UniProt (www.uniprot.org) (release 2021_03) and their corresponding enzyme functions. The protein structure data file can be downloaded from the open-source database Protein Data Bank (www.rcsb.org). Additionally, a list of AlphaFold 2 predicted structures is also included, and the structural data can be downloaded from www.alphafold.com.

    Chapter 5 contains a list of PDB structures derived from UniProt (release 2023_01).
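As a small sketch of fetching the PDB structures mentioned above: RCSB serves coordinate files at a predictable per-entry URL, so a download list can be built from PDB IDs. The example ID '1AKI' is illustrative only, not taken from the thesis.

```python
from urllib.parse import urljoin

# RCSB's file download service exposes one coordinate file per
# 4-character PDB ID.
BASE = "https://files.rcsb.org/download/"

def pdb_url(pdb_id):
    """Return the download URL for a PDB entry (e.g. '1AKI')."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4:
        raise ValueError(f"not a 4-character PDB ID: {pdb_id}")
    return urljoin(BASE, pdb_id + ".pdb")

print(pdb_url("1aki"))  # https://files.rcsb.org/download/1AKI.pdb
# To actually fetch:
#   urllib.request.urlretrieve(pdb_url("1aki"), "1AKI.pdb")
```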

  9. Data from: Experimental investigation of the effects of particle shape and...

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Nov 20, 2023
    Cite
    Gustavo Pinzón; Edward Andò; Alessandro Tengattini; Gioacchino Viggiani (2023). Experimental investigation of the effects of particle shape and friction on the mechanics of granular media [Dataset]. http://doi.org/10.5281/zenodo.8014905
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gustavo Pinzón; Edward Andò; Alessandro Tengattini; Gioacchino Viggiani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset corresponds to the raw data and experimental measurements of the PhD thesis "Experimental investigation of the effects of particle shape and friction on the mechanics of granular media" of Gustavo Pinzón (2023, Université Grenoble Alpes), available at: https://hal.science/tel-04202827v1.

    The experiments correspond to drained triaxial compression tests of cylindrical granular specimens, a common testing procedure used in soil mechanics to characterise the mechanical response of a specimen under deviatoric loading. Each specimen is 140 mm in height and 70 mm in diameter, and is composed of more than 20000 ellipsoidal particles of a given aspect ratio and interparticle friction. The dataset comprises the tests of six specimens, resulting from the combination of 3 particle shapes (Flat, Medium, and Rounded) and 2 values of interparticle friction (Rough and Smooth). A naming system for the specimens is adopted to reflect the morphology of the composing particles (e.g., the test EFR corresponds to the specimen with Flat and Rough particles). Further details on the experimental methods are found in Ch. 2 of the thesis.

    The compression tests are performed inside the x-ray scanner of Laboratoire 3SR in Grenoble (France), where the specimens are scanned every 0.5% of axial shortening, at an isotropic voxel size of 100 micrometres. The obtained radiographies are reconstructed using a Filtered Back Projection algorithm, using the software provided by the x-ray cabin manufacturer (RX Solutions, France). The series of obtained 16-bit greyscale 3D images is processed with the open-source software spam, version 0.6.2. The coordinate system of all the images is ZYX, where Z corresponds to the compression direction. Further details on the image analysis techniques are found in Ch. 3 of the thesis.

    Additional greyscale images, raw projections, and x-ray tomography files are available upon request. For visualisation purposes, the 3D images in .tif format can be opened using Fiji.
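A small sketch of the ZYX indexing convention described above, using a synthetic stand-in volume (numpy only). The file name in the final comment is hypothetical; real volumes from this dataset would be loaded with a TIFF reader such as tifffile or Fiji.

```python
import numpy as np

# The reconstructed tomographies are 16-bit greyscale 3D images with axis
# order ZYX (Z = compression direction). A full specimen at 100 um/voxel
# would be roughly 1400 x 700 x 700 voxels; we use a tiny array here.
vol = np.zeros((14, 7, 7), dtype=np.uint16)

def vertical_profile(volume):
    """Mean grey value per Z-slice, i.e. along the compression axis."""
    return volume.reshape(volume.shape[0], -1).mean(axis=1)

print(vol.shape)                    # (14, 7, 7)
print(vertical_profile(vol).shape)  # (14,)
# Real images load the same way, e.g. volume = tifffile.imread(path),
# then index as volume[z, y, x].
```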

  10. Data from: Data files used to study the distribution of growth in software...

    • researchdata.edu.au
    Updated May 4, 2011
    + more versions
    Cite
    Swinburne University of Technology (2011). Data files used to study the distribution of growth in software systems [Dataset]. https://researchdata.edu.au/files-used-study-software-systems/14865
    Explore at:
    Dataset updated
    May 4, 2011
    Dataset provided by
    Swinburne University of Technology
    Description

    The evolution of a software system can be studied in terms of how various properties as reflected by software metrics change over time. Current models of software evolution have allowed for inferences to be drawn about certain attributes of the software system, for instance, regarding the architecture, complexity and its impact on the development effort. However, an inherent limitation of these models is that they do not provide any direct insight into where growth takes place. In particular, we cannot assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed in order to answer questions such as 'do developers tend to evenly distribute complexity as systems get bigger?', and 'do large and complex classes get bigger over time?'. These are questions of more than passing interest since by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. Information gained from an analysis of the distribution of growth will also show if there are consistent boundaries within which a software design structure exists.

    In our study of metric distributions, we focused on 10 different measures that span a range of size and complexity measures. The raw metric data (4 .txt files and 1 .log file in a .zip file measuring ~0.5MB in total) is provided as a comma-separated values (CSV) file, and the first line of the CSV file contains the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
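Reading a headered CSV file like the one described needs only the standard csv module. In the sketch below the header and values are invented for illustration, since the actual 10 metrics are not listed here.

```python
import csv, io

# The raw metric data is CSV with the header on the first line. Column
# names below are hypothetical stand-ins for the real size/complexity
# metrics recorded per class.
sample = io.StringIO(
    "class_name,methods,fan_out\n"
    "Parser,12,7\n"
    "Lexer,9,4\n"
)

reader = csv.DictReader(sample)
rows = list(reader)

print(reader.fieldnames)                      # ['class_name', 'methods', 'fan_out']
print(sum(int(r["methods"]) for r in rows))   # 21
```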

  11. Fourteen-channel EEG with Imagined Speech (FEIS) dataset

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1 more
    Updated Jan 24, 2020
    Cite
    Jonathan Clayton (2020). Fourteen-channel EEG with Imagined Speech (FEIS) dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3369178
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Scott Wellington
    Jonathan Clayton
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    Welcome to the FEIS (Fourteen-channel EEG with Imagined Speech) dataset.

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    The FEIS dataset comprises Emotiv EPOC+ [1] EEG recordings of:

    • 21 participants listening to, imagining speaking, and then actually speaking 16 English phonemes (see supplementary, below)

    • 2 participants listening to, imagining speaking, and then actually speaking 16 Chinese syllables (see supplementary, below)

    For replicability and for the benefit of further research, this dataset includes the complete experiment set-up, including participants' recorded audio and 'flashcard' screens for audio-visual prompts, Lua script and .mxs scenario for the OpenVibe [2] environment, as well as all Python scripts for the preparation and processing of data as used in the supporting studies (submitted in support of completion of the MSc Speech and Language Processing with the University of Edinburgh):

    • J. Clayton, "Towards phone classification from imagined speech using a lightweight EEG brain-computer interface," M.Sc. dissertation, University of Edinburgh, Edinburgh, UK, 2019.

    • S. Wellington, "An investigation into the possibilities and limitations of decoding heard, imagined and spoken phonemes using a low-density, mobile EEG headset," M.Sc. dissertation, University of Edinburgh, Edinburgh, UK, 2019.

    Each participant's data comprise 5 .csv files -- these are the 'raw' (unprocessed) EEG recordings for the 'stimuli', 'articulators' (see supplementary, below) 'thinking', 'speaking' and 'resting' phases per epoch for each trial -- alongside a 'full' .csv file with the end-to-end experiment recording (for the benefit of calculating deltas).
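A sketch of how one might organize those per-participant files. Only the phase names come from the description above; the directory layout and file names here are guesses, not the dataset's actual structure.

```python
# Each participant contributes one raw EEG .csv per phase plus a 'full'
# end-to-end recording. The path scheme below is hypothetical.
PHASES = ["stimuli", "articulators", "thinking", "speaking", "resting"]

def participant_files(ref):
    """Map each phase (plus 'full') to a guessed file name for one participant."""
    files = {phase: f"{ref}/{phase}.csv" for phase in PHASES}
    files["full"] = f"{ref}/full.csv"
    return files

files = participant_files("01")
print(len(files))         # 6
print(files["thinking"])  # 01/thinking.csv
```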

    To guard against software deprecation or inaccessibility, the full repository of open-source software used in the above studies is also included.

    We hope for the FEIS dataset to be of some utility for future researchers, due to the sparsity of similar open-access databases. As such, this dataset is made freely available for all academic and research purposes (non-profit).

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    REFERENCING

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    If you use the FEIS dataset, please reference:

    • S. Wellington, J. Clayton, "Fourteen-channel EEG with Imagined Speech (FEIS) dataset," v1.0, University of Edinburgh, Edinburgh, UK, 2019. doi:10.5281/zenodo.3369178

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    LEGAL

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    The research supporting the distribution of this dataset has been approved by the PPLS Research Ethics Committee, School of Philosophy, Psychology and Language Sciences, University of Edinburgh (reference number: 435-1819/2).

    This dataset is made available under the Open Data Commons Attribution License (ODC-BY): http://opendatacommons.org/licenses/by/1.0

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    ACKNOWLEDGEMENTS

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    The FEIS database was compiled by:

    Scott Wellington (MSc Speech and Language Processing, University of Edinburgh)
    Jonathan Clayton (MSc Speech and Language Processing, University of Edinburgh)

    Principal Investigators:

    Oliver Watts (Senior Researcher, CSTR, University of Edinburgh)
    Cassia Valentini-Botinhao (Senior Researcher, CSTR, University of Edinburgh)

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    METADATA

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    For participants, dataset refs 01 to 21:

    01 - NNS
    02 - NNS
    03 - NNS, Left-handed
    04 - E
    05 - E, Voice heard as part of 'stimuli' portions of trials belongs to participant 04, due to microphone becoming damaged and unusable prior to recording
    06 - E
    07 - E
    08 - E, Ambidextrous
    09 - NNS, Left-handed
    10 - E
    11 - NNS
    12 - NNS, Only sessions one and two recorded (out of three total), as participant had to leave the recording session early
    13 - E
    14 - NNS
    15 - NNS
    16 - NNS
    17 - E
    18 - NNS
    19 - E
    20 - E
    21 - E

    E = native speaker of English
    NNS = non-native speaker of English (>= C1 level)

    For participants, dataset refs chinese-1 and chinese-2:

    chinese-1 - C
    chinese-2 - C, Voice heard as part of 'stimuli' portions of trials belongs to participant chinese-1

    C = native speaker of Chinese

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    SUPPLEMENTARY

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    Under the international 10-20 system, the Emotiv EPOC+ headset's 14 channels are:

    F3 FC5 AF3 F7 T7 P7 O1 O2 P8 T8 F8 AF4 FC6 F4

    The 16 English phonemes investigated in dataset refs 01 to 21:

    /i/ /u:/ /æ/ /ɔ:/ /m/ /n/ /ŋ/ /f/ /s/ /ʃ/ /v/ /z/ /ʒ/ /p/ /t/ /k/

    The 16 Chinese syllables investigated in dataset refs chinese-1 and chinese-2:

    mā má mǎ mà mēng méng měng mèng duō duó duǒ duò tuī tuí tuǐ tuì

    All references to 'articulators' (e.g. as part of filenames) refer to the 1-second 'fixation point' portion of trials. The name is a holdover from preliminary trials, which were modelled on the KARA ONE database (http://www.cs.toronto.edu/~complingweb/data/karaOne/karaOne.html) [3].

    <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <>< <><

    <> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><> ><>

    [1] Emotiv EPOC+. https://emotiv.com/epoc. Accessed online 14/08/2019.

    [2] Y. Renard, F. Lotte, G. Gibert, M. Congedo, E. Maby, V. Delannoy, O. Bertrand, A. Lécuyer. “OpenViBE: An Open-Source Software Platform to Design, Test and Use Brain-Computer Interfaces in Real and Virtual Environments”, Presence: teleoperators and virtual environments, vol. 19, no 1, 2010.

    [3] S. Zhao, F. Rudzicz. "Classifying phonological categories in imagined and articulated speech." In Proceedings of ICASSP 2015, Brisbane Australia, 2015.

  12. Dissertation data.zip

    • figshare.com
    zip
    Updated Jun 14, 2021
    Cite
    Ioana Puscasu; Simona Motogna (2021). Dissertation data.zip [Dataset]. http://doi.org/10.6084/m9.figshare.14780073.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 14, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ioana Puscasu; Simona Motogna
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Data used for developing an application that mapped Fowler's code smells to Pylint code smells. First, 179 files containing Pylint results were used. Those files came from projects submitted for Babes-Bolyai University's "Fundamentals of Programming" course in the 2019-2020 academic year. Since those files did not contain many refactoring code smells, the application was tested on another set of data. The second set also came from projects submitted for Babes-Bolyai University's "Formal Languages and Compilation Techniques" course in the 2019-2020 academic year. For the second set of data, access to the code was also provided, so it was easier to perform an analysis on the code as well. The problem was that, again, not many refactoring code smells were discovered here, and in order to test the application a bigger project was needed. The third set of data came from the open-source project "TensorFlow", an end-to-end machine learning platform. It features a large, flexible ecosystem of tools, libraries, and community resources that enables academics to push the boundaries of machine learning and developers to quickly build and deploy ML-powered apps. All of this data can be found in the file.

  13. Drone / Unmanned Aerial Vehicle raw and processed photogrammetry data,...

    • data.4tu.nl
    zip
    Updated Jul 12, 2024
    Cite
    Niels Hoogendoorn; H.C. (Hessel) Winsemius; N.C. (Nick) van de Giesen; Stephen Mather; Hoes O.A.C.; Davide Wüthrich (2024). Drone / Unmanned Aerial Vehicle raw and processed photogrammetry data, supporting the MSc thesis work 3D River Discharge Modelling using UAV photogrammetry [Dataset]. http://doi.org/10.4121/63a75bfc-4845-4827-9840-da9f710efb36.v1
    Explore at:
    zip (available download formats)
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    4TU.ResearchData
    Authors
    Niels Hoogendoorn; H.C. (Hessel) Winsemius; N.C. (Nick) van de Giesen; Stephen Mather; Hoes O.A.C.; Davide Wüthrich
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2023
    Area covered
    Dataset funded by
    European Commission
    Description

    A photogrammetry dataset was collected using an Unmanned Aerial Vehicle (quadcopter) over a river stretch of the Black Volta at Bamboi Bridge, Ghana. Ground Control Points (GCPs), black-and-white markers in the landscape, were also collected. These can be used to better geographically constrain the photogrammetric solution. GCPs have been associated with the row and column pixel location in each photo in which they appear.

    The raw data was processed into a 3D point cloud using the open-source software platform WebOpenDroneMap (WebODM). The point cloud was analysed for removal of vegetation using spatial filtering techniques, with the intent to make a bare-earth topographical map of the dry part of the riverbed. Both filtered and unfiltered point clouds were further processed into a Digital Surface Model (unfiltered) and a Digital Terrain Model (DTM). The unfiltered dataset was also processed into an RGB orthophoto. In the thesis work of Hoogendoorn (2023), further research was done on combining the results of these datasets and analyses with wet bathymetry points collected using a fishfinder equipped with Real-Time-Kinematics GNSS, and on using the resulting full bathymetry for hydraulic modelling and for understanding relationships between wetted geometry and river discharge. For more information, we refer to the MSc thesis work of Hoogendoorn (2023).

    The data files consist of three (3) .zip files. Unzip these to get access to all underlying files. For a quick overview, a .qgs file can be opened in QGIS. This will display all layers in a simple GIS project. The point cloud is also visualized but may take significant time before being rendered, as points first need to be cached.
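The unzip step can be sketched with the standard library; the archive and member names below are fabricated, since the real archive names are not listed here.

```python
# Hedged sketch: extract every .zip in a directory before opening the .qgs
# project in QGIS. The demo archive built below is a placeholder.
import tempfile
import zipfile
from pathlib import Path

def extract_all(archive_dir, target_dir):
    """Extract every .zip found in archive_dir into target_dir; return member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            extracted.extend(zf.namelist())
    return extracted

# Self-contained demo: build a tiny archive, then extract it.
with tempfile.TemporaryDirectory() as tmp:
    with zipfile.ZipFile(Path(tmp) / "demo.zip", "w") as zf:
        zf.writestr("orthophoto/readme.txt", "placeholder")
    names = extract_all(tmp, Path(tmp) / "out")
print(names)  # ['orthophoto/readme.txt']
```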

    References: Hoogendoorn, N. J.: 3D River Discharge Modelling using UAV photogrammetry | TU Delft Repository, Delft University of Technology, Delft, The Netherlands, 2023.

    Link: https://repository.tudelft.nl/record/uuid:d4088a50-3590-4675-9600-d715800841a3

  14. Data from: Can we make it better? Assessing and improving quality of GitHub...

    • researchdata.smu.edu.sg
    zip
    Updated May 31, 2023
    Cite
    GEDE ARTHA AZRIADI PRANA (SMU) (2023). Data from: Can we make it better? Assessing and improving quality of GitHub repositories [Dataset]. http://doi.org/10.25440/smu.17073050.v1
    Explore at:
    zip (available download formats)
    Dataset updated
    May 31, 2023
    Dataset provided by
    SMU Research Data Repository (RDR)
    Authors
    GEDE ARTHA AZRIADI PRANA (SMU)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the related dataset for the PhD dissertation by G. A. A. Prana, "Can We Make It Better? Assessing and Improving Quality of GitHub Repositories", available at https://ink.library.smu.edu.sg/etd_coll/373/

    The code hosting platform GitHub has gained immense popularity worldwide in recent years, with over 200 million repositories hosted as of June 2021. Due to its popularity, it has great potential to facilitate widespread improvements across many software projects. Naturally, GitHub has attracted much research attention, and the source code in the various repositories it hosts also provides an opportunity to apply techniques and tools developed by software engineering researchers over the years. However, much of the existing body of research applicable to GitHub focuses on code quality of the software projects and ways to improve it. Fewer works focus on potential ways to improve the quality of GitHub repositories through other aspects, although the quality of a software project on GitHub is also affected by factors outside the project's source code, such as documentation, the project's dependencies, and its pool of contributors. The three works that form this dissertation investigate aspects of GitHub repositories beyond code quality and identify specific improvements that can be applied to a wide range of GitHub repositories.

    In the first work, we aim to systematically understand the content of README files in GitHub software projects and develop a tool that can process them automatically. The work begins with a qualitative study involving 4,226 README file sections from 393 randomly-sampled GitHub repositories, which reveals that many README files describe the "What" and "How" of the software project, but often omit its purpose and status. This is followed by the development and evaluation of a multi-label classifier that can predict eight different README content categories with an F1 of 0.746. From our subsequent evaluation of the classifier, which involved twenty software professionals, we find that adding labels generated by the classifier to README files eases information discovery.

    Our second work focuses on characteristics of vulnerabilities in open-source libraries used by 450 software projects on GitHub written in Java, Python, and Ruby. Using an industrial software composition analysis tool, we scanned every version of the projects after each commit made between November 1, 2017 and October 31, 2018. Our subsequent analyses of the discovered library names, versions, and associated vulnerabilities reveal, among other things, that "Denial of Service" and "Information Disclosure" vulnerability types are common. In addition, we find that most of the vulnerabilities persist throughout the observation period, and that attributes such as project size, project popularity, and experience level of commit authors do not translate to better or worse handling of vulnerabilities in dependent libraries. Based on these findings, we list a number of implications for library users, library developers, and researchers, and provide several concrete recommendations. These include recommendations to simplify projects' dependency sets, as well as to encourage research into ways to automatically recommend libraries known to be secure to developers.

    In our third work, we conduct a multi-region geographical analysis of gender inclusion on GitHub. We use a mixed-methods approach involving a quantitative analysis of commit authors of 21,456 project repositories, followed by a survey strategically targeted at developers in various regions worldwide and a qualitative analysis of the survey responses. Among other findings, we discover differences in diversity levels between regions, with Asia and the Americas being the highest. We also find no strong correlation between the gender and geographic diversity of a repository's commit authors. Further, from our survey respondents worldwide, we identify barriers and motivations to contribute to open-source software. The results of this work provide insights into the current state of gender diversity in open-source software and potential ways to improve participation of developers from under-represented regions and genders, and subsequently improve the open-source software community in general. Such potential ways include the creation of codes of conduct, proximity-based mentorship schemes, and the highlighting of women and regional role models.
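The multi-label labeling idea behind the README classifier can be illustrated with a toy stand-in. To be clear about assumptions: the dissertation's classifier is a trained model reaching F1 0.746; the keyword lists, category names beyond "What"/"How", and threshold below are invented for illustration only.

```python
# Toy multi-label section labeler: a section may receive zero, one, or several
# category labels. A keyword-overlap scorer stands in for the real classifier.
import re
from collections import Counter

CATEGORIES = {
    "What":   ["provides", "library", "parser", "tool", "project"],
    "How":    ["install", "pip", "usage", "run", "example"],
    "Status": ["status", "beta", "alpha", "maintained", "deprecated"],
}

def classify_section(text, threshold=1):
    """Return every category whose keyword overlap with `text` meets the threshold."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return [
        category
        for category, keywords in CATEGORIES.items()
        if sum(words[k] for k in keywords) >= threshold
    ]

print(classify_section("Usage: run pip install ourtool"))                # ['How']
print(classify_section("Development status: beta, not yet maintained"))  # ['Status']
```

A section matching several categories gets several labels, which is the multi-label behavior the thesis's tool exposes to README readers.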

  15. Data from: Model-based design of experiments and measurement optimization...

    • curate.nd.edu
    pdf
    Updated Nov 11, 2024
    Cite
    Jialu Wang (2024). Model-based design of experiments and measurement optimization frameworks based on scalable and tractable algorithms and software [Dataset]. http://doi.org/10.7274/25607808.v1
    Explore at:
    pdf (available download formats)
    Dataset updated
    Nov 11, 2024
    Dataset provided by
    University of Notre Dame
    Authors
    Jialu Wang
    License

    https://www.law.cornell.edu/uscode/text/17/106

    Description

    Mathematical models have become increasingly critical due to the rapid advances in computational methods in recent decades. However, the validation of these models often demands extensive and costly data, leading to time-consuming processes. Traditional design of experiments (DoE) methods struggle to choose informative experiments, especially for the typically large-scale, nonlinear, and dynamical science-based models in chemical and biomolecular engineering (CBE). In this dissertation, I propose a sequential model validation workflow powered by novel DoE and measurement optimization (MO) frameworks to improve data acquisition efficiency and accelerate the model building and validation process. The workflow relies on two scalable and tractable frameworks, along with their generalized open-source software tools:

    Measurement optimization (MO): determines what to measure in experiments to maximize the experimental information content. It guides apparatus preparation during the experimental setup stage, balancing information content against practical constraints such as budgets.
    Model-based design of experiments (MBDoE): quantifies experimental information content statistically and optimizes experiment selection based on updated model information. This framework is used throughout the model validation process, recommending new experiments after each iteration to update the model.
    

    Both frameworks focus on addressing the challenges of DoE and MO techniques leveraging large-scale, nonlinear, and dynamical models in CBE, providing user-friendly open-source software tools for widespread applications.

    In this dissertation, I describe the development of the model-based DoE and MO frameworks and their generalized tools to streamline the model validation workflow for complex models such as partial differential algebraic equations (PDAEs). I briefly discuss how these frameworks and the open-source software tool contribute to the broader DoE paradigm and its applications. I demonstrate the tractability and scalability of the frameworks with laboratory- and pilot-scale carbon capture experiments. Moreover, generalized open-source software tools are developed and applied to carbon capture experiments, highlighting their versatility and practicality. This dissertation lays the groundwork for a sequential MO and MBDoE workflow that can be readily applied to various challenging problems in CBE and beyond, offering potential benefits to the broader science, technology, engineering, and math (STEM) communities. I conclude with a discussion of future directions and provide preliminary work on some of them as a starting point.
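The core idea of quantifying experimental information content can be sketched in a few lines. Everything below is invented for illustration, not taken from the dissertation or its software: a toy model y = theta1*x + theta2*x**2, a hand-picked candidate list, and a greedy heuristic that adds the experiment maximizing the determinant of the Fisher Information Matrix (the D-optimality criterion).

```python
# Hedged sketch of greedy D-optimal experiment selection for a two-parameter
# toy model. Real MBDoE tools handle large nonlinear/PDAE models; this only
# shows the FIM / determinant mechanics.

def sensitivities(x):
    """dy/dtheta for the toy model y = theta1*x + theta2*x**2."""
    return [x, x * x]

def fim(design, sigma2=1.0, ridge=1e-6):
    """2x2 FIM: sum of outer products of sensitivity vectors over the design,
    plus a tiny ridge so the empty design is not singular."""
    m = [[ridge, 0.0], [0.0, ridge]]
    for x in design:
        s = sensitivities(x)
        for i in range(2):
            for j in range(2):
                m[i][j] += s[i] * s[j] / sigma2
    return m

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def greedy_d_optimal(candidates, n_runs):
    """Greedily add the candidate that maximizes the FIM determinant."""
    design = []
    for _ in range(n_runs):
        design.append(max(candidates, key=lambda x: det2(fim(design + [x]))))
    return design

print(greedy_d_optimal([0.5, 1.0, 2.0, 4.0], n_runs=2))  # [4.0, 2.0]
```

The greedy pick spreads the runs across distinct x values, since repeating an experiment adds a nearly linearly dependent sensitivity vector and barely grows the determinant.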

  16. Amsterdam Scenario MATSim

    • figshare.com
    • data.4tu.nl
    txt
    Updated Jul 28, 2020
    Cite
    Konstanze Winter; J. (Jishnu) Narayan (2020). Amsterdam Scenario MATSim [Dataset]. http://doi.org/10.4121/uuid:6108ed85-7b24-455e-bd95-89d84e6306fa
    Explore at:
    txt (available download formats)
    Dataset updated
    Jul 28, 2020
    Dataset provided by
    4TU.ResearchData
    Authors
    Konstanze Winter; J. (Jishnu) Narayan
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Amsterdam
    Description

    Data set based on the city of Amsterdam, which has been used in various simulations using the open-source agent-based transport simulation model MATSim (https://www.matsim.org/). It contains networks, agents' plans (and a description of how these were derived from the ALBATROSS data set), configuration files, and additional information material.
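As a rough illustration of inspecting agents' plans, a standard-library sketch follows. The inline XML only approximates MATSim's population format (person/plan/activity/leg elements); real files follow MATSim's population DTD and are best read with MATSim's own tools.

```python
# Schematic reader for a MATSim-style plans file: collect the leg modes of
# each person's selected plan. The tiny inline document is an approximation.
import xml.etree.ElementTree as ET

PLANS_XML = """
<population>
  <person id="1">
    <plan selected="yes">
      <activity type="home" x="4.89" y="52.37" end_time="08:00:00"/>
      <leg mode="car"/>
      <activity type="work" x="4.95" y="52.35"/>
    </plan>
  </person>
</population>
"""

def leg_modes_per_person(xml_text):
    """Map person id -> list of leg modes in that person's selected plan."""
    root = ET.fromstring(xml_text)
    modes = {}
    for person in root.iter("person"):
        for plan in person.iter("plan"):
            if plan.get("selected") == "yes":
                modes[person.get("id")] = [leg.get("mode") for leg in plan.iter("leg")]
    return modes

print(leg_modes_per_person(PLANS_XML))  # {'1': ['car']}
```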

    Regarding the original ALBATROSS data set, please contact Prof. Soora Rasouli (TU Eindhoven).

  17. Data bundle for egon-data: A transparent and reproducible data processing...

    • zenodo.org
    zip
    Updated Jun 10, 2022
    Cite
    Ilka Cußmann; Ilka Cußmann (2022). Data bundle for egon-data: A transparent and reproducible data processing pipeline for energy system modeling [Dataset]. http://doi.org/10.5281/zenodo.6630616
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 10, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ilka Cußmann; Ilka Cußmann
    Description

    egon-data provides a transparent and reproducible open-data-based processing pipeline for generating data models suitable for energy system modeling. The data is customized for the requirements of the research project eGon. The research project aims to develop tools for an open and cross-sectoral planning of transmission and distribution grids. For further information, please visit the eGon project website or its GitHub repository.

    egon-data retrieves and processes data from several different external input sources. As not all data dependencies can be downloaded automatically from external sources, we provide a data bundle to be downloaded by egon-data.

    The following data sets are part of the available data bundle:

    1. climate_zones_germany
      • Climate zones in Germany
      • source: Own representation based on DWD TRY climate zones
      • License: Attribution 4.0 International (CC BY 4.0)
    2. emobility
      • Data on eMobility mit_trip_data:
        motorized individual travel - individual trips of electric vehicles (EV) generated with a modified version of simBEV v0.1.3 (https://github.com/rl-institut/simbev/tree/1f87c716d14ccc4a658b8d2b01fd12b88a4334d5). simBEV generates driving profiles for BEVs and PHEVs based upon MID data (BMVI) per RegioStaR7 region type (BBSR).
      • Reiner Lemoine Institut, June 2022
      • License: Attribution 4.0 International (CC BY 4.0)
    3. geothermal_potential
    4. household_electricity_demand_profiles
      • Annual profiles in hourly resolution of electricity demand of private households for different household types (singles, couples, other) with varying number of elderly and children.
        The profiles were created using a bottom-up load profile generator by Fraunhofer IEE developed in the Bachelor's thesis "Auswirkungen verschiedener Haushaltslastprofile auf PV-Batterie-Systeme" by Jonas Haack, Fachhochschule Flensburg, December 2012.
        The columns are named as follows: "
      • License: Attribution 4.0 International (CC BY 4.0)
    5. household_heat_demand_profiles
      • Sample heat time series including hot water and space heating for single- and multi-family houses. The profiles were created using the load profile generator by Fraunhofer IEE, developed in the Master's thesis "Synthesis of a heat and electrical load profile for single and multi-family houses used for subsequent performance tests of a multi-component energy system", Simon Ruben Drauz, RWTH Aachen University, March 2016.
      • License: Attribution 4.0 International (CC BY 4.0)
    6. hydrogen_storage_potential_saltstructures
      • The data are taken from figure 7.1 in Donadei, S., et al. (2020), p. 7-5.
      • Source: Flach lagernde Salze, (c) BGR Hannover, 2021.
        Datenquelle: InSpEE-Salzstrukturen, (c) BGR, Hannover, 2015. &
        Donadei, S., Horváth, B., Horváth, P.-L., Keppliner, J., Schneider, G.-S., &
        Zander-Schiebenhöfer, D. (2020). Teilprojekt Bewertungskriterien und
        Potenzialabschätzung. BGR. Informationssystem Salz: Planungsgrundlagen,
        Auswahlkriterien und Potenzialabschätzung für die Errichtung von Salzkavernen
        zur Speicherung von Erneuerbaren Energien (Wasserstoff und Druckluft) –
        Doppelsalinare und flach lagernde Salzschichten: InSpEE-DS. Sachbericht.
        Hannover: BGR.
      • License: The original data are licensed under the GeoNutzV, see https://sg.geodatenzentrum.de/web_public/gdz/lizenz/geonutzv.pdf
    7. industrial_sites
      • Information about industrial sites with DSM potential in Germany, from a Master's thesis by Danielle Schmidt. The data set includes the author's own information on the coordinates of every industrial site.
      • source: Schmidt, Danielle. (2019). Supplementary material to the masters thesis: NUTS-3 Regionalization of Industrial Load Shifting Potential in Germany using a Time-Resolved Model [Data set]. Zenodo. https://doi.org/10.5281/zenodo.3613767
      • License: Attribution 4.0 International (CC BY 4.0)
    8. nep2035_version2021
      • Data extracted from the German grid development plan - power
      • source: Netzentwicklungsplan Strom 2035 (2021), erster Entwurf | Übertragungsnetzbetreiber (M) CC-BY-4.0
      • License: Attribution 4.0 International (CC BY 4.0)
    9. pipeline_classification_gas
    10. pypsa_eur_sec
      • Preliminary results from scenario generator pypsa-eur-sec
      • source: own calculation using pypsa-eur-sec fork (https://github.com/openego/pypsa-eur-sec)
      • License: Attribution 4.0 International (CC BY 4.0)
    11. regions_dynamic_line_rating
    12. re_potential_areas
      • Eligible areas for wind turbines and ground-mounted PV systems.
      • Reiner Lemoine Institut, January 2022
      • License: Attribution 4.0 International (CC BY 4.0)
    13. WZ_definition
      • Definitions of industrial and commercial branches
      • source: Klassifikation der Wirtschaftszweige (WZ 2008)
      • Extract from Terms of Use: © Statistisches Bundesamt, Wiesbaden 2008 Vervielfältigung und Verbreitung, auch auszugsweise, mit Quellenangabe gestattet.
    14. zensus_households
      • Dataset describing the number of people in Germany by family type, age class, sex, and household size, at federal-state resolution.
      • source: Data retrieved from Zensus Datenbank by performing these steps:
        • Search for: "1000A-2029"
        • or choose topic: "Bevölkerung kompakt"
        • Choose table code: "1000A-2029" with title "Personen: Alter (11 Altersklassen)/Geschlecht/Größe des privaten Haushalts - Typ des privaten Haushalts (nach Familien/Lebensform)"
        • Change the setting "GEOLK1" to "Bundesländer (16)"; the higher resolution "Landkreise und kreisfreie Städte (412)" is only accessible after registration.
      • Extract from Terms of Use: © Statistische Ämter des Bundes und der Länder 2021, Vervielfältigung und Verbreitung, auch auszugsweise, mit Quellennachweis gestattet.

  18. Causes for spatiotemporal variation in reproductive performance of Eurasian...

    • zenodo.org
    • explore.openaire.eu
    Updated Jan 10, 2022
    Cite
    Magali Frauendorf; Magali Frauendorf (2022). Causes for spatiotemporal variation in reproductive performance of Eurasian oystercatchers in a human-dominated landscape [Dataset]. http://doi.org/10.5281/zenodo.5831867
    Explore at:
    Dataset updated
    Jan 10, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Magali Frauendorf; Magali Frauendorf
    Description

    These data comprise the data of four chapters from the PhD thesis of Frauendorf (2022), entitled 'Causes for spatiotemporal variation in reproductive performance of Eurasian oystercatchers in a human-dominated landscape'. The thesis focuses on quantifying anthropogenic impacts on the reproductive performance of oystercatchers across the Netherlands. The dataset contains data from chapters 3, 5, and 6, which were used in the thesis but have not yet been published open access.

    For chapter 3, oystercatchers were caught during winter across their wintering grounds (Wadden Sea and Delta estuary) and their condition was measured (physiological measurements from blood samples and biometric measurements). The data also include resighting data from ringed individuals that were used for mark-recapture survival analysis (state and age matrices). In addition, environmental variables were collected from open-source data. Next, the birds were followed from their wintering ground to the breeding ground, where we measured their reproductive success.

    Chapter 5 includes data about the reproductive performance of oystercatchers (available from several different data sources across the Netherlands). Next, we collected environmental data from different (open access) sources on, for instance, habitat type, land use intensity, predation, and food availability.

    In chapter 6, we used data on bill shape as a proxy for feeding specialization, based on data from winter catches (chapter 3 of the PhD thesis), to illustrate the proportion of birds with different feeding specializations in the studied population.

  19. Data publication: Experimental characterization of four-magnon scattering...

    • rodare.hzdr.de
    zip
    Updated Feb 28, 2023
    Cite
    Hula, Tobias (2023). Data publication: Experimental characterization of four-magnon scattering processes in ferromagnetic conduits [Dataset]. http://doi.org/10.14278/rodare.2178
    Explore at:
    zip (available download formats)
    Dataset updated
    Feb 28, 2023
    Authors
    Hula, Tobias
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All raw and processed data, plus the written thesis. Data and figures are stored in the 'Figures_and_Data' directory. Experimental measurements were done by means of BLS microscopy (group of H. Schultheiß at HZDR). Micromagnetic simulations were done on the Hemera cluster (Dr. A. Kakay at HZDR). Data analysis was done in Python or Jupyter notebooks (open source); all scripts are included. Graphics were made using OmniGraffle and Blender. Plotting was done using Python and 'Plot2' (Mac only!). All files/data/scripts are sorted by figure. The entire LaTeX package is stored under 'Thesis_Hula'; Dissertation.tex is the main file and shows all required dependencies.

  20. Replication data for: It's Not All about the Benjamins: Essays on Political...

    • dataverse.harvard.edu
    Updated Jun 6, 2013
    Cite
    Benjamin Goodrich (2013). Replication data for: It's Not All about the Benjamins: Essays on Political Economy and Social Psychology Theories of Welfare State Preferences [Dataset]. http://doi.org/10.7910/DVN/YYGB58
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 6, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Benjamin Goodrich
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/YYGB58

    Time period covered
    Sep 2003 - May 2010
    Area covered
    Countries included in the ISSP and ESS surveys, World
    Description

    In a democracy, the relationship between the preferences of the citizens and the policies of the government is, in principle, fundamental. Whether this principle holds in practice has been the subject of a long but inconclusive debate in the political science literature. This dissertation focuses primarily on a different question: what are the determinants of mass preferences over welfare state policies? To answer this question, new quantitative methods are developed, implemented in a Free, Libre, and Open Source Software package, and applied to relatively recent data.

    The primary contributions of this dissertation to the social science literature are two-fold. First, we present new empirical results on mass political preferences that will be of interest to political scientists, economists, and researchers in other fields. Second, those empirical results are obtained from new estimators that are especially useful for modeling preferences but are also useful for modeling other multivariate phenomena. The strength of these empirical results will hopefully spur innovation on a third front, namely the way in which political economists develop theoretical models of the process by which political preferences are aggregated in democracies.

    The first chapter is largely empirical and tests traditional political economy theories of preferences for redistribution against theories of inequality aversion, using the method developed in the second chapter. Its main empirical conclusion is that a plurality of the variance in preferences for redistribution is attributable to differences in inequality aversion. The second chapter is methodological and attempts to answer the question of how many explanatory variables went into the data-generating process for the outcome variables we observe. The third chapter develops another new estimator and applies it to empirical data on preferences for redistribution and immigration. Its main empirical conclusion is that inequality aversion is not only important to our understanding of preferences for redistribution but is also mostly exogenous to the other factors in the model.
