Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
R Markdown files and Data sets for Manuscript: "HSP90 as an Evolutionary Capacitor Drives Adaptive Eye Size Reduction via atonal"
Authors: Rascha Sayed, Özge Şahin, Mohammed Errbii, Reshma R, Tobias Prüser, Lukas Schrader, Nora K. E. Schulz and Joachim Kurtz
The repository includes all R Markdown files and corresponding datasets (raw data in XLSX format) used to generate the manuscript's figures and statistical results. The files are organized in the same order as they appear in the manuscript. Each plot and statistical analysis can be reproduced using the raw data and the corresponding R script.
All figures and statistical analyses were produced in R using RStudio (version 2024.04.2+764).
Each Markdown file is named according to its corresponding figure number and includes plots and statistical analysis results for each panel within that figure. The associated R scripts needed to generate the plots and perform the statistical analyses for each panel are also provided.
The XLSX files are also named by figure number and contain sheets with all the raw data sets used for plotting and statistical analysis, as referenced in the corresponding Markdown file.
These folders include the original photos displayed in the manuscript, named according to their corresponding figure numbers.
All qPCR results are uploaded in a compressed file titled "Extended Data Figure 2_qPCR_raw_data".
The repository contains the following items:
01. Figure-1_Markdown_file.html: contains plots, statistical analysis and R scripts for:
02. Figure1_raw_data.xlsx: contains raw data used for:
03. Figure 1_Original photos.zip: contains the original photos for reduced-eye phenotype after HSP90 inhibition.
04. Figure-2_Markdown_file.html: contains plots, statistical analysis and R scripts for:
05. Figure2_raw_data.xlsx: contains raw data used for:
06. Figure 3_Original photos.zip: contains the original photos for reduced-eye phenotype after atonal knockdown.
07. Extended-Data-Figure-1_Markdown_file.html: contains plots, statistical analysis and R scripts for:
08. Extended Data Figure 1_raw_data.xlsx: contains raw data used for:
09. Extended Data Figure 1_raw_data.zip: contains the original photos for reduced-eye phenotype after HSP90 inhibition_lateral view.
10. Extended-Data-Figure-2_Markdown_file.html: contains plots and R scripts for:
11. Extended Data Figure 2_raw_data.xlsx: contains raw data used for:
12. Extended Data Figure 2_Original photos.zip: contains the original photos for gonads after Hsp83 knockdown.
13. Extended Data Figure 2_qPCR_raw_data.zip: contains raw data used for qPCR results_Hsp83 and Hsp68 genes.
14. Extended-Data-Figure-4_Markdown_file.html: contains plots, statistical analysis and R scripts for:
15. Extended Data Figure 4_raw_data.xlsx: contains raw data used for:
16. Extended-Data-Figure-6_Markdown_file.html: contains plots, statistical analysis and R scripts for:
17. Extended Data Figure 6_raw_data.xlsx: contains raw data used for:
18. Extended Data Figure 7_Original photos.zip: contains the original photos for: Abnormal phenotypes in adults after larval and pupal knockdown for candidate genes using RNAi.
19. Extended Data Table 2.xlsx: Genome-wide analysis of allele frequency differences in T. castaneum.
20. Extended Data Table 3.xlsx: Expansion of candidate region for eye phenotype.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Includes the following datasets:
Also included is the .py file with the code used to simulate power flows in the grid.
This work is supported by Hanze University of Applied Sciences.
This dataset includes all the data and R code needed to reproduce the analyses in a forthcoming manuscript: Copes, W. E., Q. D. Read, and B. J. Smith. Environmental influences on drying rate of spray applied disinfestants from horticultural production surfaces. PhytoFrontiers, DOI pending.
Study description: Instructions for disinfestants typically specify a dose and a contact time to kill plant pathogens on production surfaces. A problem occurs when disinfestants are applied to large production areas where the evaporation rate is affected by weather conditions. The common contact time recommendation of 10 min may not be achieved under hot, sunny conditions that promote fast drying. This study investigates how the evaporation rates of six commercial disinfestants vary when applied to six types of substrate materials under cool to hot and cloudy to sunny weather conditions. Initially, disinfestants with low surface tension spread out to provide 100% coverage, while disinfestants with high surface tension beaded up to provide about 60% coverage when applied to hard, smooth surfaces. Disinfestants applied to porous materials, such as wood and concrete, were quickly absorbed into the body of the material. Even though disinfestants evaporated faster under hot, sunny conditions than under cool, cloudy conditions, coverage was reduced considerably in the first 2.5 min under most weather conditions and fell to 50% coverage or less by 5 min.
Dataset contents: This dataset includes R code to import the data and fit Bayesian statistical models using the model-fitting software CmdStan, interfaced with R through the packages brms and cmdstanr. The models (one for 2022 and one for 2023) compare how quickly different spray-applied disinfestants dry, depending on which chemical was sprayed, which surface material it was sprayed onto, and what the weather conditions were at the time. Next, the statistical models are used to generate predictions and compare mean drying rates between the disinfestants, surface materials, and weather conditions. Finally, tables and figures are created.
These files are included:
Drying2022.csv: drying rate data for the 2022 experimental run
Weather2022.csv: weather data for the 2022 experimental run
Drying2023.csv: drying rate data for the 2023 experimental run
Weather2023.csv: weather data for the 2023 experimental run
disinfestant_drying_analysis.Rmd: RMarkdown notebook with all data processing, analysis, and table creation code
disinfestant_drying_analysis.html: rendered output of the notebook
MS_figures.R: additional R code to create figures formatted for journal requirements
fit2022_discretetime_weather_solar.rds: fitted brms model object for 2022; this allows users to reproduce the model prediction results without having to refit the model, which was originally fit on a high-performance computing cluster
fit2023_discretetime_weather_solar.rds: fitted brms model object for 2023
data_dictionary.xlsx: descriptions of each column in the CSV data files
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset provides simulated data on plastic and substance flows and stocks in buildings and infrastructure as described in the data article "Plastics in the German Building and Infrastructure Sector: A High-Resolution Dataset on Historical Flows, Stocks, and Legacy Substance Contamination". Besides simulated data, the repository contains input data and model files used to produce the simulated data.
Data & Data Visualization: The dataset contains input data and simulated data for the six main plastic applications in buildings and infrastructure in Germany in the period from 1950 to 2023, which are profiles, flooring, pipes, insulation material, cable insulations, and films. For each application the data are provided in a sub-directory (1_ ... 6_) following the structure described below.
Input Data:
The input data are stored in an xlsx-file with three sheets: flows, parameters, and data quality assessment. The data sources for all input data are detailed in the Supplementary Material of the linked Data in Brief article.
Simulated Data:
Simulated data are stored in a sub-folder, which contains:
Note: All files in the [product]/simulated_data folder are automatically replaced with updated model results upon execution of immec_dmfa_calculate_submodels.py.
To reduce storage requirements, data are stored in gzipped pickle files (.pkl.gz), while smaller files are provided as pickle files (.pkl). To open the files, users can use Python with the following code snippet:
import gzip
import pickle

# Load a gzipped pickle file
with gzip.open("filename.pkl.gz", "rb") as f:
    data = pickle.load(f)

# Load a regular pickle file
with open("filename.pkl", "rb") as f:
    data = pickle.load(f)
Please note that opening pickle files requires compatible versions of numpy and pandas, as the files may have been created using version-specific data structures. If you encounter errors, ensure your package versions match those used during file creation (pandas: 2.2.3, numpy: 2.2.4).
Simulated data are provided as Xarray datasets, a data structure designed for efficient handling, analysis, and visualization of multi-dimensional labeled data. For more details on using Xarray, please refer to the official documentation: https://docs.xarray.dev/en/stable/
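For orientation, here is a minimal sketch of inspecting one of these datasets after unpickling, assuming the loaded object is an xarray.Dataset (the dimension name "time" is an illustrative assumption, not taken from the file structure):
import gzip
import pickle

# Unpickle one of the gzipped simulated-data files
with gzip.open("filename.pkl.gz", "rb") as f:
    ds = pickle.load(f)

print(ds)  # overview of dimensions, coordinates, and data variables
# Label-based selection along an assumed "time" dimension, then export to pandas
subset = ds.sel(time=slice(1950, 2023))
df = subset.to_dataframe()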
Core Model Files:
Computational Considerations:
During model execution, large arrays are generated, requiring significant memory. To enable computation on standard computers, Monte Carlo simulations are split into multiple chunks:
Dependencies
The model relies on the ODYM framework. To run the model, ODYM must be downloaded from https://github.com/IndEcol/ODYM (S. Pauliuk, N. Heeren, ODYM — An open software framework for studying dynamic material systems: Principles, implementation, and data structures, Journal of Industrial Ecology 24 (2020) 446–458. https://doi.org/10.1111/jiec.12952.)
7_Model_Structure:
8_Additional_Data: This folder contains supplementary data used in the model, including substance concentrations, data quality assessment scores, open-loop recycling distributions, and lifetime distributions.
The dataset was generated using a dynamic material flow analysis (dMFA) model. For a complete methodology description, refer to the Data in Brief article (add DOI).
If you use this dataset, please cite: Schmidt, S., Verni, X.-F., Gibon, T., Laner, D. (2025). Dataset for: Plastics in the German Building and Infrastructure Sector: A High-Resolution Dataset on Historical Flows, Stocks, and Legacy Substance Contamination, Zenodo. DOI: 10.5281/zenodo.15049210
This dataset is licensed under CC BY-NC 4.0, permitting use, modification, and distribution for non-commercial purposes, provided that proper attribution is given.
For questions or further details, please contact:
Sarah Schmidt
Center for Resource Management and Solid Waste Engineering
University of Kassel
Email: sarah.schmidt@uni-kassel.de
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all the spectra and OTU table data used in the paper "Spectroscopic investigation of faeces with surface-enhanced Raman scattering: a case study with coeliac patients on gluten-free diet", plus the R code to import the TXT (ASCII) files into a dataset, preprocess the data, analyze the data, and generate the figures shown in the paper.
Spectral data are available in two different formats:
the original TXT files (as generated from the Raman instrument, 1 file = 1 spectrum)
as an RData file (a hyperSpec object including metadata), to be opened directly in R
The OTU table is available either as a single XLSX file or as an RData file to be opened in R.
The R code used to generate the figures is available as a single file "Rcode.R".
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The open data portal catalogue is a downloadable dataset containing some key metadata for the general datasets available on the Government of Canada's Open Data portal. Resource 1 is generated using the ckanapi tool. Resources 2-8 are generated using the Flatterer utility.
Description of resources:
1. Dataset is a JSON Lines file where the metadata of each Dataset/Open Information Record is one line of JSON. The file is compressed with GZip. The file is heavily nested and recommended for users familiar with working with nested JSON.
2. Catalogue is an XLSX workbook where the nested metadata of each Dataset/Open Information Record is flattened into worksheets for each type of metadata.
3. Datasets Metadata contains metadata at the dataset level (also referred to as the "package" in some CKAN documentation). This is the main table/worksheet in the SQLite database and XLSX output.
4. Resources Metadata contains the metadata for the resources contained within each dataset.
5. Resource Views Metadata contains the metadata for the views applied to each resource, if a resource has a view configured.
6. Datastore Fields Metadata contains the DataStore information for CSV datasets that have been loaded into the DataStore. This information is displayed in the Data Dictionary for DataStore-enabled CSVs.
7. Data Package Fields contains a description of the fields available in each of the tables within the Catalogue, as well as the count of the number of records each table contains.
8. Data Package Entity Relation Diagram displays the title and format for each column, in each table in the Data Package, in the form of an ERD diagram. The Data Package resource offers a text-based version.
9. SQLite Database is a .db database, similar in structure to Catalogue, which can be queried with database or analytical software tools.
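As a brief sketch (the local filename is hypothetical), the GZip-compressed JSON Lines resource can be streamed record by record in Python:
import gzip
import json

# Stream the catalogue without loading the whole file into memory
with gzip.open("catalogue.jsonl.gz", "rt", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)  # one Dataset/Open Information Record per line
        print(record.get("title"))
        break  # remove this to iterate over every record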
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A small Excel file with data used to generate plots in a journal publication: Novikov, A.V., Shokrollahzadeh Behbahani, S., Voskov, D., Hajibeygi, H. and Jansen, J.D. (2024): "Benchmarking analytical and numerical simulation of induced fault slip", Geomechanics and Geophysics for Geo-Energy and Geo-Resources. Special Issue "Selected Contributions from the 57th US Rock Mechanics/Geomechanics Symposium, Atlanta, GA, 2023."
Purpose of the publication is to provide analytical solutions that can serve as test problems for numerical tools to describe depletion-induced or injection-induced fault slip.
This data set replaces an earlier data set (DOI 10.4121/22240309) corresponding to the conference version of the paper. The current data set contains an additional figure (Fig. 14), while some of the other figures are displayed on a somewhat more detailed and/or more regular grid.
On 22 April 2025, a revised Excel file (Data_GGGG_NovikovEtAl2024_corrected.xlsx) was uploaded. In the original file, lines 24 to 27 (with data for Figures 9 & 10, right) were incorrectly copied from another line.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
"We believe that by accounting for the inherent uncertainty in the system during each measurement, the relationship between cause and effect can be assessed more accurately, potentially reducing the duration of research."
Short description
This dataset was created as part of a research project investigating the efficiency and learning mechanisms of a Bayesian adaptive search algorithm supported by the Imprecision Entropy Indicator (IEI) as a novel method. It includes detailed statistical results, posterior probability values, and the weighted averages of IEI across multiple simulations aimed at target localization within a defined spatial environment. Control experiments, including random search, random walk, and genetic algorithm-based approaches, were also performed to benchmark the system's performance and validate its reliability.
The task involved locating a target area centered at (100; 100) within a radius of 10 units (Research_area.png), inside a circular search space with a radius of 100 units. The search process continued until 1,000 successful target hits were achieved.
To benchmark the algorithm's performance and validate its reliability, control experiments were conducted using alternative search strategies, including random search, random walk, and genetic algorithm-based approaches. These control datasets serve as baselines, enabling comprehensive comparisons of efficiency, randomness, and convergence behavior across search methods, thereby demonstrating the effectiveness of our novel approach.
Uploaded files
The first dataset contains the average IEI values, generated by randomly simulating 300 x 1 hits for 10 bins per quadrant (4 quadrants in total) using the Python programming language, and calculating the corresponding IEI values. This resulted in a total of 4 x 10 x 300 x 1 = 12,000 data points. The summary of the IEI values by quadrant and bin is provided in the file results_1_300.csv. The calculation of IEI values for averages is based on likelihood, using an absolute difference-based approach for the likelihood probability computation. IEI_Likelihood_Based_Data.zip
The weighted IEI average values for likelihood calculation (Bayes formula) are provided in the file Weighted_IEI_Average_08_01_2025.xlsx
This dataset contains the results of a simulated target search experiment using Bayesian posterior updates and Imprecision Entropy Indicators (IEI). Each row represents a hit during the search process, including metrics such as Shannon entropy (H), Gini index (G), average distance, angular deviation, and calculated IEI values. The dataset also includes bin-specific posterior probability updates and likelihood calculations for each iteration. The simulation explores adaptive learning and posterior penalization strategies to optimize search efficiency; a minimal sketch of the posterior update rule follows this file list. Our Bayesian adaptive search system source code (search algorithm, 1,000 target searches) is IEI_Self_Learning_08_01_2025.py. This dataset contains the results of 1,000 iterations of a successful target search simulation; the simulation runs until the target is successfully located in each iteration. The dataset includes three further main outputs: a) Results files (results{iteration_number}.csv): details of each hit during the search process, including entropy measures, Gini index, average distance and angle, Imprecision Entropy Indicators (IEI), coordinates, and the bin number of the hit. b) Posterior updates (Pbin_all_steps_{iter_number}.csv): tracks the posterior probability updates for all bins during the search process across multiple steps. c) Likelihood analysis (likelihood_analysis_{iteration_number}.csv): contains the calculated likelihood values for each bin at every step, based on the difference between the measured IEI and pre-defined IEI bin averages. IEI_Self_Learning_08_01_2025.py
Based on the mentioned Python source code (see point 3, Bayesian adaptive search method with IEI values), we performed 1,000 successful target searches, and the outputs were saved in the Self_learning_model_test_output.zip file.
Bayesian Search (IEI) from different quadrants. This dataset contains the results of Bayesian adaptive target search simulations, including various outputs that represent the performance and analysis of the search algorithm. The dataset includes: a) Heatmaps (Heatmap_I_Quadrant, Heatmap_II_Quadrant, Heatmap_III_Quadrant, Heatmap_IV_Quadrant): these heatmaps represent the search results and the paths taken from each quadrant during the simulations; they indicate how frequently the system selected each bin during the search process. b) Posterior Distributions (All_posteriors, Probability_distribution_posteriors_values, CDF_posteriors_values): generated from posterior values, these files track the posterior probability updates, including cumulative distribution functions (CDF) and probability distributions. c) Macro Summary (summary_csv_macro): this file aggregates metrics and key statistics from the simulation, summarizing the results from the individual results.csv files. d) Heatmap Searching Method Documentation (Bayesian_Heatmap_Searching_Method_05_12_2024): this document visualizes the search algorithm's path, showing how frequently each bin was selected during the 1,000 successful target searches. e) One-Way ANOVA Analysis (Anova_analyze_dataset, One_way_Anova_analysis_results): this includes the database and SPSS calculations used to examine whether the starting quadrant influences the number of search steps required. The analysis was conducted at a 5% significance level, followed by a Games-Howell post hoc test [43] to identify which target-surrounding quadrants differed significantly in terms of the number of search steps. Results were saved in the Self_learning_model_test_results.zip file.
This dataset contains randomly generated sequences of bin selections (1-40) from a control search algorithm (random search) used to benchmark the performance of Bayesian-based methods. The process iteratively generates random numbers until a stopping condition is met (reaching target bins 1, 11, 21, or 31). This dataset serves as a baseline for analyzing the efficiency, randomness, and convergence of non-adaptive search strategies. The dataset includes the following: a) The Python source code of the random search algorithm. b) A file (summary_random_search.csv) containing the results of 1,000 successful target hits. c) A heatmap visualizing the frequency of search steps for each bin, providing insight into the distribution of steps across the bins. Random_search.zip
This dataset contains the results of a random walk search algorithm, designed as a control mechanism to benchmark adaptive search strategies (Bayesian-based methods). The random walk operates within a defined space of 40 bins, where each bin has a set of neighboring bins. The search begins from a randomly chosen starting bin and proceeds iteratively, moving to a randomly selected neighboring bin, until one of the stopping conditions is met (bins 1, 11, 21, or 31). The dataset provides detailed records of 1,000 random walk iterations, with the following key components: a) Individual Iteration Results: Each iteration's search path is saved in a separate CSV file (random_walk_results_.csv), listing the sequence of steps taken and the corresponding bin at each step. b) Summary File: A combined summary of all iterations is available in random_walk_results_summary.csv, which aggregates the step-by-step data for all 1,000 random walks. c) Heatmap Visualization: A heatmap file is included to illustrate the frequency distribution of steps across bins, highlighting the relative visit frequencies of each bin during the random walks. d) Python Source Code: The Python script used to generate the random walk dataset is provided, allowing reproducibility and customization for further experiments. Random_walk.zip
This dataset contains the results of a genetic search algorithm implemented as a control method to benchmark adaptive Bayesian-based search strategies. The algorithm operates in a 40-bin search space with predefined target bins (1, 11, 21, 31) and evolves solutions through random initialization, selection, crossover, and mutation over 1,000 successful runs. Dataset Components: a) Run Results: individual run data is stored in separate files (genetic_algorithm_run_.csv), detailing: Generation: the generation number. Fitness: the fitness score of the solution. Steps: the path length in bins. Solution: the sequence of bins visited. b) Summary File: summary.csv consolidates the best solutions from all runs, including their fitness scores, path lengths, and sequences. c) All Steps File: summary_all_steps.csv records all bins visited during the runs for distribution analysis. d) A heatmap was also generated for the genetic search algorithm, illustrating the frequency of bins chosen during the search process as a representation of the search pathways. Genetic_search_algorithm.zip
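To make the posterior-update logic above concrete, here is a minimal, hypothetical sketch in the spirit of the description (40 bins; likelihood from the absolute difference between a measured IEI and pre-defined bin averages; all names and values are illustrative, not taken from the source code):
import numpy as np

N_BINS = 40
rng = np.random.default_rng(0)

# Illustrative pre-defined IEI averages per bin, and a uniform prior over bins
iei_bin_averages = rng.uniform(0.0, 1.0, N_BINS)
posterior = np.full(N_BINS, 1.0 / N_BINS)

def update_posterior(posterior, measured_iei):
    # Likelihood shrinks as the gap between the measured IEI and a bin's
    # pre-defined average grows (absolute difference-based approach)
    likelihood = 1.0 / (np.abs(iei_bin_averages - measured_iei) + 1e-9)
    unnormalized = posterior * likelihood  # Bayes: prior times likelihood
    return unnormalized / unnormalized.sum()

posterior = update_posterior(posterior, measured_iei=0.37)
next_bin = int(np.argmax(posterior))  # search next where posterior mass concentrates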
Technical Information
The dataset files have been compressed into a standard ZIP archive using Total Commander (version 9.50). The ZIP format ensures compatibility across various operating systems and tools.
The XLSX files were created using Microsoft Excel Standard 2019 (Version 1808, Build 10416.20027)
The Python program was developed using Visual Studio Code (Version 1.96.2, user setup), with the following environment details: Commit fabd6a6b30b49f79a7aba0f2ad9df9b399473380f, built on 2024-12-19. The Electron version is 32.6, and the runtime environment includes Chromium 128.0.6263.186, Node.js 20.18.1, and V8 12.8.374.38-electron.0. The operating system is Windows NT x64 10.0.19045.
The statistical analysis included in this dataset was partially conducted using IBM SPSS Statistics, Version 29.0.1.0
The CSV files in this dataset were created following European standards, using a semicolon (;) as the delimiter instead of a comma, and are encoded in UTF-8 to ensure compatibility with a wide range of tools.
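As a minimal illustration (assuming pandas is installed), one of the summary files described above can be loaded by specifying the delimiter and encoding explicitly:
import pandas as pd

# Semicolon-delimited, UTF-8 encoded CSV (European convention)
df = pd.read_csv("summary_random_search.csv", sep=";", encoding="utf-8")
print(df.head())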
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DebDab: A database of supraglacial debris thickness and physical properties
DebDab is a database of measured and reported physical properties and thickness values of supraglacial debris; it is openly available and open to community submissions. The majority of the database (90%) is compiled from 172 sources in the literature, and the remaining 10% has not been published before. DebDab contains 8,286 data entries for supraglacial debris thickness, of which 1,852 entries also include sub-debris ablation rates, as well as 167 data entries for thermal conductivity of debris, 157 for aerodynamic surface roughness length, 77 for debris albedo, 56 for debris emissivity, and 37 for debris porosity. The data are distributed over 83 glaciers in 13 regions of the Global Terrestrial Network for Glaciers.
This is an initial version of the dataset for submission of a "Data descriptor" manuscript for publication in the scientific journal "Earth System Science Data (ESSD)" from Copernicus Publications.
The initial v1 submission consists of the following files:
(Note there are two versions on Zenodo because of corrections and new files added after submission to Earth System Science Data, but the database is still v1)
The data descriptor manuscript has been submitted, and the corresponding DOI will be published here as soon as the paper is in the preprint/open review stage.
DebDab is open to new data submissions, and therefore future data submissions of previously unpublished data to DebDab will entail co-authorship on the DebDab database on Zenodo.
Furthermore, authors from published literature that check, verify or update their data on DebDab will become co-authors on the DebDab database on Zenodo.
Important note on citations: DebDab data users should cite the data descriptor manuscript (Fontrodona-Bach et al., 2024), the DebDab Zenodo repository (Groeneveld et al., 2024), and the original data sources when using the database, given that DebDab is mostly a compilation of previously published data. To facilitate citation of the original data sources, each data entry in DebDab contains the corresponding original reference and its DOI.
The files and workflow will allow you to replicate the study titled "Exploring an extinct society through the lens of Habitus-Field theory and the Tocharian text corpus". This study utilized the CEToM corpus (https://cetom.univie.ac.at/) (Tocharian) to analyze the life-world of the elites of an extinct society situated in the Tarim Basin in modern northwestern China. To acquire the raw data needed for steps 1 & 2, please contact Melanie Malzahn (melanie.malzahn@univie.ac.at). We conducted a mixed-methods study consisting of close reading, content analysis, and multiple correspondence analysis (MCA). The Excel file titled "fragments_architecture_combined.xlsx" allows for replication of the MCA and corresponds to the third step of the workflow outlined below.
We used the following programming languages and packages to prepare the dataset and to analyze the data. Data preparation and merging procedures were carried out in Python (version 3.9.10) with the packages pandas (version 1.5.3), os (version 3.12.0), re (version 3.12.0), numpy (version 1.24.3), gensim (version 4.3.1), BeautifulSoup4 (version 4.12.2), pyasn1 (version 0.4.8), and langdetect (version 1.0.9). Multiple correspondence analyses were conducted in R (version 4.3.2) with the packages FactoMineR (version 2.9), factoextra (version 1.0.7), readxl (version 1.4.3), tidyverse (version 2.0.0), ggplot2 (version 3.4.4), and psych (version 2.3.9).
After requesting the necessary files, please open the scripts in the order outlined below and execute the code files to replicate the analysis:
Preparatory step: Create a folder for the Python and R scripts downloadable in this repository. Open the file 0_create folders.py and declare a root folder in line 19. This first script will generate the following folders:
"tarim-brahmi_database" = folder which contains Tocharian dictionaries and Tocharian text fragments.
"dictionaries" = contains Tocharian A and Tocharian B vocabularies, including linguistic features such as translations, meanings, part-of-speech tags, etc. A full overview of the words is provided at https://cetom.univie.ac.at/?words.
"fragments" = contains Tocharian text fragments as XML files.
"word_corpus_data" = folder that will contain Excel files of the corpus data after the first step.
"Architectural_terms" = contains the data on the architectural terms used in the dataset (e.g. dwelling, house).
"regional_data" = contains the data on the findspots (Tocharian and modern Chinese equivalent, e.g. Duldur-Akhur & Kucha).
"mca_ready_data" = the folder in which the Excel file with the merged data will be saved. Note that the prepared file named "fragments_architecture_combined.xlsx" can be saved into this directory; this allows you to skip steps 1 & 2 and reproduce the MCA of the content analysis based on the third step of our workflow (R script titled 3_conduct_MCA.R).
First step - run 1_read_xml-files.py: loops over the XML files in the dictionaries folder and identifies word metadata, including language (Tocharian A or B), keywords, part of speech, lemmata, word etymology, and loan sources. Then it loops over the XML text files and extracts a text ID number, language (Tocharian A or B), text title, text genre, text subgenre, prose type, verse type, material on which the text is written, medium, findspot, the source text in Tocharian, and the translation where available. After successful feature extraction, the resulting pandas DataFrame object is exported to the word_corpus_data folder.
Second step - run 2_merge_excel_files.py: merges all Excel files (corpus, data on findspots, word data) and reproduces the content analysis, which was originally based on close reading. A schematic sketch of this merge step appears below.
Third step - run 3_conduct_MCA.R: recodes, prepares, and selects the variables necessary to conduct the MCA; then produces the descriptive values before conducting the MCA, identifying typical texts per dimension, and exporting the PNG files uploaded to this repository.
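As a schematic illustration of the merge in the second step (the file names under the folders above and the join key are hypothetical; the actual logic lives in 2_merge_excel_files.py):
import pandas as pd

# Hypothetical sketch: combine the corpus export from step 1 with findspot data
corpus = pd.read_excel("word_corpus_data/corpus.xlsx")       # hypothetical filename
findspots = pd.read_excel("regional_data/findspots.xlsx")    # hypothetical filename
merged = corpus.merge(findspots, on="findspot", how="left")  # hypothetical join key
merged.to_excel("mca_ready_data/fragments_architecture_combined.xlsx", index=False)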
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data and plots presented in the journal paper of Ocean Engineering titled "Individual Pitch Control with Static Inverted Decoupling for Periodic Blade Load Reduction on Monopile Offshore Wind Turbines" by Manuel Lara, Mario L. Ruz, Sebastiaan Paul Mulders, Francisco Vázquez, and Juan Garrido.
The data supports the analysis and results discussed in Sections 4.1 and 4.2 of the manuscript, including optimization results (Tables 3 and 4, Figures 6 and 7) and simulation results (Figures 8 to 11).
All data, scripts, and functions are coded in MATLAB language (https://www.mathworks.com/products/matlab.html). The files were generated and run on a PC with Windows 10. The dataset encompasses a wide variety of data types, including:
These data are provided in two main folders: Optimization results and Simulation results. The description of these folders and their files is as follows:
A) The folder "Optimization results" includes:
*Figure 6:
Figure6.fig: MATLAB figure file containing the graphical representation of optimization results for controller gains and decoupling elements as a function of wind speed (Figure 6 in the manuscript).
Figure6_code.m: MATLAB script used to generate Figure 6 from the optimization data.
*Figure 7:
figure7.m: MATLAB script to process and plot the optimization results for controller parameters and DEL values (Figure 7 in the manuscript).
figure7.fig: Alternative MATLAB figure file for Figure 7.
*Tables 3 and 4:
controller_data.mat: MATLAB data file containing the optimized parameters (e.g., kt, ky, d12, d21) and DEL(M) values for Tables 3 and 4.
controller_table3.xlsx: Excel file with the optimization results for Table 3.
generate_Table3.m: MATLAB script to process controller_data.mat and generate Table 3.
generate_Table4.m: MATLAB script to process controller_data.mat and generate Table 4.
table4_relative_DEL.xlsx: Excel file with relative DEL(M) values for Table 4.
B) The folder "Simulation results" includes:
*Figure 8:
Figure8_plot.m: MATLAB script to plot the bar chart with results of average relative DEL and NAT values for each controller (Figure 8 in the manuscript).
Figure8_result.mat: MATLAB data file with average relative DEL and NAT values for each controller.
Figure8.fig: Alternative MATLAB figure file for Figure 8.
*Figures 9 to 11:
Figure9_plot.m: MATLAB script to generate Figure 9 (time responses in the rotating frame).
Figure9.fig: MATLAB figure file for Figure 9.
Figure10_plot.m: MATLAB script to generate Figure 10 (time responses in the non-rotating frame).
Figure10.fig: MATLAB figure file for Figure 10.
Figure11_plot.m: MATLAB script to generate Figure 11 (frequency responses).
Figure11.fig: MATLAB figure file for Figure 11.
sim_data_w18_I1.mat: MATLAB data file with temporal data for one seed in Case 2 corresponding to the IPC1 (I1).
sim_data_w18_ID.mat: MATLAB data file with temporal data for one seed in Case 2 corresponding to the IPC2 (ID).
sim_data_w18_I1D.mat: MATLAB data file with temporal data for one seed in Case 2 corresponding to the IPC3 (I1D).
sim_data_w18_I1D1.mat: MATLAB data file with temporal data for one seed in Case 2 corresponding to the IPC5 (I1D1).
Acknowledgments
This research was funded by the Spanish Ministry of Science, Innovation and Universities (MCIU/AEI/10.13039/501100011033/FEDER,UE), grant number PID2023-149181OB-I00.
Contact
For questions or additional information, contact the corresponding author: Manuel Lara (manuel.lara@uco.es), University of Cordoba, Spain.
CC0 1.0 https://spdx.org/licenses/CC0-1.0.html
To recommend strategies to improve discoverability of consumer health informatics (CHI) literature, we aimed to characterize the controlled vocabulary and author terminology applied to a subset of CHI literature on wearable technologies. A descriptive analysis of articles (N=2,522) from 2019 identified 308 (12.2%) CHI-related articles; the citations with PubMed identifiers for the included and excluded studies are provided. The 308 articles were published in 181 journals, which we classified by type of journal (health, informatics, technology, and other), as shown in the third file. We provide an aggregated file of the author-assigned keywords as they appeared in the PubMed records of the included studies, along with our decision about whether they represented consumer engagement. We also include an aggregated file of the Medical Subject Headings assigned to the included studies. The top 100 terms and their frequency scores for the titles and abstracts are also included. We did not include any of the terminology from CINAHL or the engineering databases (Compendex and Inspec together) due to copyright concerns.
Methods: This data set includes 14 files: 7 Microsoft Excel (.xlsx) files and those same 7 files as comma-delimited CSV files. We searched PubMed on December 19, 2020 using the strategy published in the associated article, limited to the publication year 2019, and retrieved 2,522 citations for the feasibility study, which we uploaded to Rayyan.ai for independent double screening by four reviewers with CHI expertise (CAS, KA, CM, SS). All 2,522 abstracts were divided equally across the team of four reviewers. Each reviewer independently screened 1261 abstracts (so that each abstract was reviewed by two reviewers); discussion and consensus were used as needed, with a third reviewer making the final decision. The inclusion and exclusion criteria that resulted in this data set appear below and also as a table in the published article.
Final Inclusion and Exclusion Criteria Applied in Screening and Selecting Articles for the Terminology Analysis
Inclusion criteria:
- At least one device in the article meets the following criteria, as identified by the screener but not necessarily stated explicitly by the author(s):
  - The device is consumer/patient-focused and can be worn and removed.
  - The wearable device measures a health or physiological characteristic relevant to health or well-being.
  - The consumer/patient can observe the device's data.
- The article is not solely a product advertisement or announcement and has an abstract or other substantive content in English. The article itself can be in a language other than English.
Exclusion criteria:
- All devices in the article meet at least one of the following criteria, as identified by the screener, but not necessarily stated explicitly by the author(s):
  - A device that is implanted or designed as part of a larger clinical medical device or system that is not typically available to consumers.
  - A device that does not measure anything relevant to health.
  - Monitoring or data that is generated is not available to the consumer/patient.
- The article is solely a product advertisement or announcement.
Files 1 and 2 contain a subset of the citation data exported from the Rayyan system.
File 3 was created by two authors (KA and RW), who categorized source journals into 4 groups: health, informatics (based on Wang et al.'s core journal list), technology, and other. The "health" category includes journals covering health topics exclusive of informatics or technology. Journals that did not focus on health, informatics, or technology were categorized as "Other"; examples included Systematic Reviews, and Evaluation and Program Planning.
Files 4 and 7, which count terms as single words or pre-existing phrases in author keywords and MeSH, were created by removing spaces within phrases to form single entities, which were then counted with a unique-word calculator (PlanetCalc Unique Words Count: https://planetcalc.com/3205/).
Term frequencies for Files 5 and 6 were created using the MonkeyLearn automated word cloud generator, which uses artificial intelligence to identify multi-word concepts and remove common stop words (MonkeyLearn Word Cloud Generator: https://monkeylearn.com/word-cloud).
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
ISO 3166-1-alpha-2 English country names and code elements. This list states the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements.