15 datasets found
  1. FooDrugs database: A database with molecular and text information about food...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 28, 2023
    Cite
    Lacruz Pleguezuelos, Blanca (2023). FooDrugs database: A database with molecular and text information about food - drug interactions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6638469
    Dataset updated
    Jul 28, 2023
    Dataset provided by
    Lacruz Pleguezuelos, Blanca
    Carrillo de Santa Pau, Enrique
    Pérez, David
    Laguna Lobo, Teresa
    Piette Gómez, Óscar
    Garranzo, Marco
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The FooDrugs database was developed by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain), in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (see more details about FNS-Cloud below).

    FooDrugs stores information extracted from transcriptomics data and text documents on food-drug interactions, and it is part of a demonstrator being developed in the FNS-Cloud project. The database was built using MySQL, an open source relational database management system. FooDrugs hosts information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:

    Table “study”: This table contains basic information about study identifiers from GEO, PubMed or platform, study type, title and abstract.

    Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.

    Table “misc_study”: This table contains additional information about different attributes of the study.

    Table “misc_sample”: This table contains additional information about different attributes of the sample.

    Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, and overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.

    Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.

    Table “topTable”: This table contains information about 150 over- and underexpressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and whether the gene is up- or downregulated.

    Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.
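
    As a rough illustration of how the “nodes” table ties “sample” to “cmap_foodrugs”, the sketch below joins the four tables to list strong food-drug links. It is not the FooDrugs schema: the table names come from the description above, but every column name, the toy rows and the |tau| >= 90 cutoff are assumptions made for the example, and sqlite3 stands in for MySQL so that the snippet runs without a server.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in with hypothetical columns and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (sample_id TEXT PRIMARY KEY, treatment TEXT);
        CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, sample_id TEXT);
        CREATE TABLE cmap (cmap_id INTEGER PRIMARY KEY, compound TEXT);
        CREATE TABLE cmap_foodrugs (node_id INTEGER, cmap_id INTEGER, tau REAL);
        """
    )
    # Toy rows invented for the example; they are not FooDrugs data.
    conn.executemany("INSERT INTO sample VALUES (?, ?)",
                     [("S1", "food compound A"), ("S2", "food compound B")])
    conn.executemany("INSERT INTO nodes VALUES (?, ?)", [(1, "S1"), (2, "S2")])
    conn.executemany("INSERT INTO cmap VALUES (?, ?)",
                     [(1, "drug X"), (2, "drug Y")])
    conn.executemany("INSERT INTO cmap_foodrugs VALUES (?, ?, ?)",
                     [(1, 1, 95.2), (1, 2, -40.0), (2, 2, -97.5)])
    return conn


def strongest_links(conn, min_abs_tau=90.0):
    """Return (treatment, compound, tau) rows with |tau| above the cutoff."""
    query = """
        SELECT s.treatment, c.compound, cf.tau
        FROM sample AS s
        JOIN nodes AS n ON n.sample_id = s.sample_id
        JOIN cmap_foodrugs AS cf ON cf.node_id = n.node_id
        JOIN cmap AS c ON c.cmap_id = cf.cmap_id
        WHERE ABS(cf.tau) >= ?
        ORDER BY ABS(cf.tau) DESC
    """
    return conn.execute(query, (min_abs_tau,)).fetchall()


if __name__ == "__main__":
    for row in strongest_links(build_demo_db()):
        print(row)
```

    Against the real database, the same join would need the actual column names from the FooDrugs documentation and a MySQL client in place of sqlite3.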

    In addition, FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:

    Table “texts”: This table contains all the documents, with their identifiers, in which interactions have been identified using the strategy described in section 4.

    Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and the end positions of the context for the interaction in the document.

    FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research addressing FNS, diet, health, and consumer behaviours as well as sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry, from developing effective solutions that make production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. FNS-Cloud will, through three Demonstrators (Agri-Food, Nutrition & Lifestyle, and NCDs & the Microbiome), facilitate: (1) analyses of regional and country-specific differences in diet including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) improved understanding of agricultural differences within Europe and what these mean in terms of creating sustainable, resilient food systems for healthy diets; and (3) clear definitions of boundaries and how these affect the compositions of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of the FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross-talk amongst them; access to FNS-Cloud data will be open access, underpinned by FAIR principles (findable, accessible, interoperable and re-useable). FNS-Cloud will work closely with the proposed Food, Nutrition and Health Research Infrastructure (FNHRI) as well as METROFOOD-RI and other existing ESFRI RIs (e.g. ELIXIR, ECRIN) in which several FNS-Cloud Beneficiaries are involved directly. (https://cordis.europa.eu/project/id/863059)

  2. Most popular database management systems worldwide 2024

    • statista.com
    Updated Jun 19, 2024
    Cite
    Statista (2024). Most popular database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/809750/worldwide-popularity-ranking-database-management-systems/
    Dataset updated
    Jun 19, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2024
    Area covered
    Worldwide
    Description

    As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of 1244.08; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.

    Database Management Systems

    As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL programming languages has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.

  3. SQL Injection Attack Netflow

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 28, 2022
    + more versions
    Cite
    Adrián Campazas (2022). SQL Injection Attack Netflow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6907251
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Ignacio Crespo
    Adrián Campazas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union query SQL injection and Blind SQL injection. The SQLMAP tool was used to perform the attacks.

    NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for the collection and monitoring of network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

    Datasets

    The first dataset (D1) was collected to train the detection models; the other (D2) was collected using different attacks than those used in training, to test the models and ensure their generalization.

    The datasets contain both benign and malicious traffic. All collected datasets are balanced.

    The version of NetFlow used to build the datasets is 5.

        Dataset | Aim      | Samples | Benign-malicious traffic ratio
        D1      | Training | 400,003 | 50%
        D2      | Test     | 57,239  | 50%

    Infrastructure and implementation

    Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has the ipt_netflow sensor installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.

    DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds or after it has been active for 1800 seconds (30 minutes).
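
    The two export conditions can be written as a small predicate. This is only a sketch of the rule as stated above, not ipt_netflow's implementation; the function name and the use of plain second timestamps are assumptions made for the example.

```python
INACTIVE_TIMEOUT_S = 15    # export once no packet has been seen for this long
ACTIVE_TIMEOUT_S = 1800    # export once the flow has been open for this long


def should_export(first_seen, last_seen, now):
    """Decide whether a flow record is due for export.

    All arguments are timestamps in seconds on the same clock.
    """
    idle_for = now - last_seen
    open_for = now - first_seen
    return idle_for >= INACTIVE_TIMEOUT_S or open_for >= ACTIVE_TIMEOUT_S


if __name__ == "__main__":
    # A flow that went quiet 20 seconds ago is exported on the inactive rule.
    print(should_export(first_seen=0, last_seen=100, now=120))
    # A long-lived, still busy flow is exported on the active rule.
    print(should_export(first_seen=0, last_seen=1799, now=1800))
```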

    Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it sends them to a NetFlow data generation node (this process is carried out similarly for packets received from the Internet).

    The malicious traffic collected (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.

    The attacks were executed on 16 nodes, which launched SQLMAP with the parameters in the following table.

        Parameters | Description
        '--banner','--current-user','--current-db','--hostname','--is-dba','--users','--passwords','--privileges','--roles','--dbs','--tables','--columns','--schema','--count','--dump','--comments','--schema' | Enumerate users, password hashes, privileges, roles, databases, tables and columns
        --level=5 | Increase the probability of a false positive identification
        --risk=3 | Increase the probability of extracting data
        --random-agent | Select the User-Agent randomly
        --batch | Never ask for user input, use the default behavior
        --answers="follow=Y" | Predefined answers to yes

    Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to the MySQL or SQL Server database engines (50% of the victim nodes deployed MySQL and the other 50% deployed SQL Server).

    The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24. The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQL Server databases.

    However, for D2, Blind SQL injection attacks were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic-generating nodes and 140.30.20.1/24 for victim nodes.
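
    When working with the flow records, it can help to map an address back to the dataset and node role it belongs to. The sketch below does this with Python's standard ipaddress module, using the four address spaces quoted above. The role labels and the locate function are names chosen for the example. Several of the quoted prefixes have host bits set (for example 182.168.1.1/24), so the networks are parsed with strict=False.

```python
import ipaddress

# Address spaces as quoted in the dataset description.
ADDRESS_SPACES = {
    ("D1", "traffic generators"): "182.168.1.1/24",
    ("D1", "victims"): "126.52.30.0/24",
    ("D2", "traffic generators"): "152.148.48.1/24",
    ("D2", "victims"): "140.30.20.1/24",
}

# strict=False accepts prefixes written with host bits set.
NETWORKS = {
    label: ipaddress.ip_network(cidr, strict=False)
    for label, cidr in ADDRESS_SPACES.items()
}


def locate(ip):
    """Return (dataset, role) for an address, or None if it is outside all four spaces."""
    address = ipaddress.ip_address(ip)
    for label, network in NETWORKS.items():
        if address in network:
            return label
    return None


if __name__ == "__main__":
    print(locate("126.52.30.17"))
    print(locate("152.148.48.200"))
    print(locate("10.0.0.1"))
```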

    MariaDB version 10.4.12 was used as the MySQL server. Microsoft SQL Server 2017 Express and PostgreSQL version 13 were also used.

  4. Most popular relational database management systems worldwide 2024

    • statista.com
    Updated Jun 19, 2024
    Cite
    Statista (2024). Most popular relational database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/1131568/worldwide-popularity-ranking-relational-database-management-systems/
    Dataset updated
    Jun 19, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2024
    Area covered
    Worldwide
    Description

    As of June 2024, the most popular relational database management system (RDBMS) worldwide was Oracle, with a ranking score of 1244.08. Oracle was also the most popular DBMS overall. MySQL and Microsoft SQL Server rounded out the top three.

  5. Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics

    • orda.shef.ac.uk
    txt
    Updated Oct 22, 2021
    Cite
    Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics [Dataset]. https://orda.shef.ac.uk/articles/dataset/Orphan_Drugs_-_Dataset_1_Twitter_issue-networks_as_excluded_publics/16447326
    Available download formats: txt
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    The University of Sheffield
    Authors
    Matthew Hanchard
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises two .csv format files used within workstream 2 of the Wellcome Trust funded ‘Orphan drugs: High prices, access to medicines and the transformation of biopharmaceutical innovation’ project (219875/Z/19/Z). They appear in various outputs, e.g. publications and presentations.

    The deposited data were gathered using the University of Amsterdam Digital Methods Institute’s ‘Twitter Capture and Analysis Toolset’ (DMI-TCAT) before being processed and extracted from Gephi. DMI-TCAT queries Twitter’s STREAM Application Programming Interface (API) using SQL and retrieves data on a pre-set text query. It then sends the returned data for storage on a MySQL database. The tool allows for output of that data in various formats. This process aligns fully with Twitter’s service user terms and conditions. The query for the deposited dataset gathered a 1% random sample of all public tweets posted between 10-Feb-2021 and 10-Mar-2021 containing the text ‘Rare Diseases’ and/or ‘Rare Disease Day’, storing it on a local MySQL database managed by the University of Sheffield School of Sociological Studies (http://dmi-tcat.shef.ac.uk/analysis/index.php), accessible only via a valid VPN such as FortiClient and through a permitted active directory user profile.

    The dataset was output raw from the MySQL database as a .gexf format file, suitable for social network analysis (SNA). It was then opened using Gephi (0.9.2) data visualisation software and anonymised/pseudonymised in Gephi as per the ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee on 02-Jun-201 (reference: 039187). The deposited dataset comprises two anonymised/pseudonymised social network analysis .csv files extracted from Gephi, one containing node data (Issue-networks as excluded publics – Nodes.csv) and another containing edge data (Issue-networks as excluded publics – Edges.csv). Where participants explicitly provided consent, their original username has been provided. Where they have provided consent on the basis that they not be identifiable, their username has been replaced with an appropriate pseudonym. All other usernames have been anonymised with a randomly generated 16-digit key. The level of anonymity for each Twitter user is provided in column C of the deposited file ‘Issue-networks as excluded publics – Nodes.csv’.

    This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 26-Aug-2021 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman institute/School of Sociological Studies. ORDA has full permission to store this dataset and to make it open access for public re-use without restriction under a CC BY license, in line with the Wellcome Trust commitment to making all research data Open Access.

    The University of Sheffield are the designated data controller for this dataset.

  6. Royal Institute for Cultural Heritage Radiocarbon and stable isotope...

    • pandoradata.earth
    • pandora.earth
    Updated Aug 17, 2021
    + more versions
    Cite
    IRPA/KIK (2021). Royal Institute for Cultural Heritage Radiocarbon and stable isotope measurements [Dataset]. https://pandoradata.earth/bg/dataset/royal-institute-for-cultural-heritage-radiocarbon-and-stable-isotope-measurements
    Dataset updated
    Aug 17, 2021
    Dataset provided by
    IRPA/KIK
    Description

    The Radiocarbon dating laboratory of IRPA/KIK was founded in the 1960s. Initially dates were reported at more or less regular intervals in the journal Radiocarbon (Schreurs 1968). Since the advent of radiocarbon dating in the 1950s it had been a common practice amongst radiocarbon laboratories to publish their dates in so-called ‘date-lists’ that were arranged per laboratory. This was first done in the Radiocarbon Supplement of the American Journal of Science and later in the specialised journal Radiocarbon. In the course of time the latter, with the added subtitle An International Journal of Cosmogenic Isotope Research, became a regular scientific journal shifting focus from date-lists to articles. Furthermore the world-wide exponential increase of radiocarbon dates made it almost impossible to publish them all in the same journal, even more so because of the broad range of applications that use radiocarbon analysis, ranging from archaeology and art history to geology and oceanography and recently also biomedical studies.

    The IRPA/KIK database

    From 1995 onwards IRPA/KIK’s Radiocarbon laboratory started to publish its dates in small publications, continuing the numbering of the preceding lists in Radiocarbon. The first booklet in this series was “Royal Institute for Cultural Heritage Radiocarbon dates XV” (Van Strydonck et al. 1995), followed by three more volumes (XVI, XVII, XVIII). The next list (XIX, 2005) was no longer printed but instead handed out as a PDF file on CD-rom.

    The ever increasing number of dates and the difficulties in handling all the data, however, made us look for a more permanent and easier solution. In order to improve data management and consultation, it was thus decided to gather all our dates in a web-based database. List XIX was in fact already a Microsoft Access database that was converted into a reader-friendly style and could also be printed as a PDF file. However, a Microsoft Access database is not the most practical solution for making information publicly available. Hence the structure of the database was recreated in MySQL and the existing content was transferred into the corresponding fields. To display the records, a web-based front-end was programmed in PHP/Apache. It features a full-text search function that allows for partial word-matching. In addition the records can be consulted in PDF format.

    Old records from the printed date-lists as well as new records are now added using the same Microsoft Access back-end, which is now connected directly to the MySQL database. The main problem with introducing the old data was that not all the current criteria were available in the past (e.g. stable isotope measurements). Furthermore, since all the sample information is given by the submitter, its quality largely depends on that person's willingness to contribute as well as on the accuracy and correctness of the information provided. Sometimes problems arise from the fact that a certain investigation (like an excavation) is carried out over a relatively long period (sometimes even more than ten years) and is directed by different people or even institutions. This can lead to differences in the labeling procedure of the samples, but also in the interpretation of structures and artifacts and in the orthography of the site’s name. Finally, the submitter might change address, while the names of institutions or even regions and countries might change as well (e.g. Zaire - Congo).

  7. Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and...

    • zenodo.org
    • explore.openaire.eu
    • +1more
    bin, csv, png, txt +1
    Updated Aug 3, 2024
    + more versions
    Cite
    Mefta Sadat; Ayse Basar Bener; Andriy V. Miranskyy (2024). Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and KDE [Dataset]. http://doi.org/10.5281/zenodo.400614
    Available download formats: csv, txt, bin, zip, png
    Dataset updated
    Aug 3, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mefta Sadat; Ayse Basar Bener; Andriy V. Miranskyy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. The datasets contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017) to capture the inter-relationships among duplicate defects.

    File Descriptions

    • apache.csv - Apache Defect Rediscovery dataset
    • eclipse.csv - Eclipse Defect Rediscovery dataset
    • kde.csv - KDE Defect Rediscovery dataset

    • apache.relations.csv - Inter-relations of rediscovered defects of Apache
    • eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse
    • kde.relations.csv - Inter-relations of rediscovered defects of KDE

    • neo4j_examples.txt - Sample Neo4j queries
    • mysql_examples.txt - Sample MySQL queries
    • rediscovery_eclipse_6325.png - Output of Neo4j example #1

    • distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project

  8. Malaria disease and grading system dataset from public hospitals reflecting...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Nov 10, 2023
    Cite
    Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie (2023). Malaria disease and grading system dataset from public hospitals reflecting complicated and uncomplicated conditions [Dataset]. http://doi.org/10.5061/dryad.4xgxd25gn
    Available download formats: zip
    Dataset updated
    Nov 10, 2023
    Dataset provided by
    Nasarawa State University
    Authors
    Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Malaria is the leading cause of death in the African region. Data mining can help extract valuable knowledge from available data in the healthcare sector. This makes it possible to train models to predict patient health faster than in clinical trials. Various machine learning algorithms, such as K-Nearest Neighbors, Bayes Theorem, Logistic Regression, Support Vector Machines, and Multinomial Naïve Bayes (MNB), have been applied to malaria datasets from public hospitals, but there are still limitations in modeling using the multinomial Naïve Bayes algorithm. This study applies the MNB model to explore the relationship between 15 relevant attributes of public hospital data. The goal is to examine how the dependency between attributes affects the performance of the classifier. MNB creates a transparent and reliable graphical representation between attributes with the ability to predict new situations. The model (MNB) has 97% accuracy. It is concluded that this model outperforms the GNB classifier which has 100% accuracy and the RF which also has 100% accuracy.

    Methods

    Prior to data collection, the researcher was guided by the ethical training certification on data collection and by the right to confidentiality and privacy, as required by the Institutional Review Board (IRB). Data were collected from the manual archives of hospitals purposively selected using a stratified sampling technique, transformed to electronic form, and stored in a MySQL database called malaria. Each patient file was extracted and reviewed for signs and symptoms of malaria, then checked for the laboratory confirmation result from diagnosis. The data were divided into two tables: the first table, data1, contains data for use in phase 1 of the classification, while the second table, data2, contains data for use in phase 2 of the classification.

    Data Source Collection

    The malaria incidence data set was obtained from public hospitals from 2017 to 2021. These are the data used for modeling and analysis, bearing in mind the geographical location and socio-economic factors available for patients inhabiting those areas. Naïve Bayes (Multinomial) is the model used to analyze the collected data for malaria disease prediction and grading.

    Data Preprocessing: Data preprocessing is done to remove noise and outliers.

    Transformation: The data are transformed from analog to electronic records.

    Data Partitioning

    The collected data are divided into two portions: one portion is extracted as a training set, while the other portion is used for testing. One training portion is taken from a table stored in the database and forms training set 1, while the other training portion is taken from another table stored in the database and forms training set 2. The dataset was split into two parts: 70% for training and 30% for testing. Then, using MNB classification algorithms implemented in Python, the models were trained on the training sample. The resulting models were tested on the remaining 30% of the data, and the results were compared with the other machine learning models using the standard metrics.

    Classification and prediction: Based on the nature of the variables in the dataset, this study uses Naïve Bayes (Multinomial) classification techniques in two stages: classification phase 1 and classification phase 2. The operation of the framework is as follows:

    i. Data collection and preprocessing are done.
    ii. Preprocessed data are stored in training set 1 and training set 2. These datasets are used during classification.
    iii. The test data set is stored in the database as the test data set.
    iv. Part of the test data set is classified using classifier 1 and the remaining part is classified with classifier 2, as follows:

    Classifier phase 1: It classifies records into positive or negative classes. If the patient has malaria, the patient is classified as positive (P); a patient who does not have malaria is classified as negative (N).

    Classifier phase 2: It classifies only the records that have been classified as positive by classifier 1, and further classifies them into the complicated and uncomplicated class labels. The classifier will also capture data on environmental factors, genetics, gender and age, cultural and socio-economic variables. The system will be designed such that the core parameters, as determining factors, supply their values.
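
    The 70/30 partition and the two-phase cascade described above can be sketched in a few lines of Python. This is an illustration of the control flow only, not the authors' code: the two classifiers are passed in as plain callables, the rule-based stand-ins at the bottom replace the trained MNB models, and the record fields lab_positive and severe_signs are hypothetical names.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle the records and return (training 70%, testing 30%)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


def grade(record, phase1, phase2):
    """Run the cascade: phase 2 only sees records that phase 1 labels 'P'."""
    if phase1(record) == "N":
        return "negative"
    return phase2(record)


# Rule-based stand-ins for the two trained classifiers (hypothetical fields).
def demo_phase1(record):
    return "P" if record["lab_positive"] else "N"


def demo_phase2(record):
    return "complicated" if record["severe_signs"] else "uncomplicated"


if __name__ == "__main__":
    patients = [
        {"lab_positive": False, "severe_signs": False},
        {"lab_positive": True, "severe_signs": False},
        {"lab_positive": True, "severe_signs": True},
    ]
    for patient in patients:
        print(grade(patient, demo_phase1, demo_phase2))
```

    Passing the classifiers as arguments keeps the cascade independent of the library used to train them, so fitted models can be substituted for the stand-ins by wrapping their prediction call.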

  9. SIEGE- Smoking Induced Epithelial Gene Expression

    • scicrunch.org
    • dknet.org
    Updated Jan 31, 2025
    Cite
    (2025). SIEGE- Smoking Induced Epithelial Gene Expression [Dataset]. http://identifiers.org/RRID:SCR_007925
    Dataset updated
    Jan 31, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. A database that provides access to data from several gene expression profile analyses of smokers and non-smokers. In the experiment, researchers first obtained brushings from intra-pulmonary airways (the right upper lobe carina) and scrapings from the buccal mucosa, from normal smoking and non-smoking volunteers. RNA was isolated from these samples and gene expression profiles from intra-pulmonary airway epithelial cells were analyzed using Affymetrix U133A human gene expression arrays. All microarray data from these experiments have been stored, preprocessed and analyzed in a relational MySQL database that is accessible through this website.

  10. Data from: KGCW 2024 Challenge @ ESWC 2024

    • zenodo.org
    • investigacion.usc.gal
    • +1more
    application/gzip
    Updated Apr 15, 2024
    Cite
    Dylan Van Assche; David Chaves-Fraga; Anastasia Dimou; Umutcan Serles; Ana Iglesias (2024). KGCW 2024 Challenge @ ESWC 2024 [Dataset]. http://doi.org/10.5281/zenodo.10973433
    Available download formats: application/gzip
    Dataset updated
    Apr 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dylan Van Assche; David Chaves-Fraga; Anastasia Dimou; Umutcan Serles; Ana Iglesias
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Knowledge Graph Construction Workshop 2024: challenge

    Knowledge graph construction of heterogeneous data has seen a lot of uptake
    in the last decade from compliance to performance optimizations with respect
    to execution time. Besides execution time as a metric for comparing knowledge
    graph construction, other metrics e.g. CPU or memory usage are not considered.
    This challenge aims at benchmarking systems to find which RDF graph
    construction system optimizes for metrics e.g. execution time, CPU,
    memory usage, or a combination of these metrics.

    Task description

    The task is to reduce and report the execution time and computing resources
    (CPU and memory usage) for the parameters listed in this challenge, compared
    to the state-of-the-art of the existing tools and the baseline results provided
    by this challenge. This challenge is not limited to execution times to create
    the fastest pipeline, but also computing resources to achieve the most efficient
    pipeline.

    We provide a tool which can execute such pipelines end-to-end. This tool also
    collects and aggregates the metrics such as execution time, CPU and memory
    usage, necessary for this challenge as CSV files. Moreover, the information
    about the hardware used during the execution of the pipeline is available as
    well to allow fairly comparing different pipelines. Your pipeline should consist
    of Docker images which can be executed on Linux to run the tool. The tool is
    already tested with existing systems, relational databases e.g. MySQL and
    PostgreSQL, and triplestores e.g. Apache Jena Fuseki and OpenLink Virtuoso
    which can be combined in any configuration. It is strongly encouraged to use
    this tool for participating in this challenge. If you prefer to use a different
    tool or our tool imposes technical requirements you cannot solve, please contact
    us directly.

    Track 1: Conformance

    The new specifications for the RDF Mapping Language (RML), established by the W3C Community Group on Knowledge Graph Construction, provide a set of test-cases for each module:

    These test-cases are evaluated in this Track of the Challenge to determine their feasibility, correctness, etc. by applying them in implementations. This Track is in Beta status because these new specifications have not seen any implementation yet, thus it may contain bugs and issues. If you find problems with the mappings, output, etc. please report them to the corresponding repository of each module.

    Note: validating the output of the RML Star module automatically through the provided tooling is currently not possible, see https://github.com/kg-construct/challenge-tool/issues/1.

    Through this Track we aim to spark development of implementations for the new specifications and improve the test-cases. Let us know your problems with the test-cases and we will try to find a solution.

    Track 2: Performance

    Part 1: Knowledge Graph Construction Parameters

    These parameters are evaluated using synthetically generated data to gain more
    insight into their influence on the pipeline.

    Data

    • Number of data records: scaling the data size vertically by the number of records with a fixed number of data properties (10K, 100K, 1M, 10M records).
    • Number of data properties: scaling the data size horizontally by the number of data properties with a fixed number of data records (1, 10, 20, 30 columns).
    • Number of duplicate values: scaling the number of duplicate values in the dataset (0%, 25%, 50%, 75%, 100%).
    • Number of empty values: scaling the number of empty values in the dataset (0%, 25%, 50%, 75%, 100%).
    • Number of input files: scaling the number of datasets (1, 5, 10, 15).

    Mappings

    • Number of subjects: scaling the number of subjects with a fixed number of predicates and objects (1, 10, 20, 30 TMs).
    • Number of predicates and objects: scaling the number of predicates and objects with a fixed number of subjects (1, 10, 20, 30 POMs).
    • Number of and type of joins: scaling the number of joins and type of joins (1-1, N-1, 1-N, N-M)

    Part 2: GTFS-Madrid-Bench

    The GTFS-Madrid-Bench provides insights into the pipeline with real data from the
    public transport domain in Madrid.

    Scaling

    • GTFS-1 SQL
    • GTFS-10 SQL
    • GTFS-100 SQL
    • GTFS-1000 SQL

    Heterogeneity

    • GTFS-100 XML + JSON
    • GTFS-100 CSV + XML
    • GTFS-100 CSV + JSON
    • GTFS-100 SQL + XML + JSON + CSV

    Example pipeline

    The ground truth dataset and baseline results are generated in different steps
    for each parameter:

    1. The provided CSV files and SQL schema are loaded into a MySQL relational database.
    2. Mappings are executed by accessing the MySQL relational database to construct a knowledge graph in the N-Triples RDF format.

    The pipeline is executed 5 times from which the median execution time of each
    step is calculated and reported. Each step with the median execution time is
    then reported in the baseline results with all its measured metrics.
    Knowledge graph construction timeout is set to 24 hours.
    The execution is performed with the following tool: https://github.com/kg-construct/challenge-tool,
    you can adapt the execution plans for this example pipeline to your own needs.
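
    The median-of-5 reporting described above can be illustrated with a minimal sketch. It is not taken from the challenge tool; the function name and the sample timings are hypothetical, and it only assumes that the run whose execution time equals the median is the one reported.

```python
from statistics import median


def pick_median_run(run_times):
    """Return (index, time) of the run whose execution time is the median.

    With an odd number of runs the median is always one of the measured
    values, so a concrete run (with all its other metrics) can be reported.
    """
    if len(run_times) % 2 == 0:
        raise ValueError("use an odd number of runs so the median is a real run")
    m = median(run_times)
    return run_times.index(m), m


# Hypothetical wall-clock times (seconds) for one step over 5 executions.
step_times = [12.4, 11.9, 13.1, 12.0, 12.7]
run_index, run_time = pick_median_run(step_times)
print(f"report run {run_index} with execution time {run_time} s")
```

    Selecting a concrete run, rather than averaging, keeps the CPU and memory figures reported alongside the execution time internally consistent, since they all come from the same execution.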

    Each parameter has its own directory in the ground truth dataset with the
    following files:

    • Input dataset as CSV.
    • Mapping file as RML.
    • Execution plan for the pipeline in metadata.json.

    Datasets

    Knowledge Graph Construction Parameters

    The dataset consists of:

    • Input dataset as CSV for each parameter.
    • Mapping file as RML for each parameter.
    • Baseline results for each parameter with the example pipeline.
    • Ground truth dataset for each parameter generated with the example pipeline.

    Format

    All input datasets are provided as CSV, depending on the parameter that is being
    evaluated, the number of rows and columns may differ. The first row is always
    the header of the CSV.

    GTFS-Madrid-Bench

    The dataset consists of:

    • Input dataset as CSV with SQL schema for the scaling and a combination of XML, CSV, and JSON is provided for the heterogeneity.
    • Mapping file as RML for both scaling and heterogeneity.
    • SPARQL queries to retrieve the results.
    • Baseline results with the example pipeline.
    • Ground truth dataset generated with the example pipeline.

    Format

    CSV datasets always have a header as their first row.
    JSON and XML datasets have their own schema.

    Evaluation criteria

    Submissions must evaluate the following metrics:

    • Execution time of all the steps in the pipeline. The execution time of a step is the difference between the begin and end time of a step.
    • CPU time as the time spent in the CPU for all steps of the pipeline. The CPU time of a step is the difference between the begin and end CPU time of a step.
    • Minimal and maximal memory consumption for each step of the pipeline. The minimal and maximal memory consumption of a step is the minimum and maximum calculated of the memory consumption during the execution of a step.
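
    As a rough illustration of the three metrics listed above, the following sketch derives them from begin/end readings and a series of memory samples. The class, function, and sample values are hypothetical and are not the challenge tool's API; how the readings are collected is left open.

```python
from dataclasses import dataclass


@dataclass
class StepMetrics:
    execution_time: float  # end wall-clock time minus begin wall-clock time
    cpu_time: float        # end CPU time minus begin CPU time
    memory_min: int        # smallest memory sample taken during the step
    memory_max: int        # largest memory sample taken during the step


def summarize_step(begin_wall, end_wall, begin_cpu, end_cpu, memory_samples):
    """Compute the per-step metrics from raw readings (hypothetical helper)."""
    if not memory_samples:
        raise ValueError("at least one memory sample is required")
    return StepMetrics(
        execution_time=end_wall - begin_wall,
        cpu_time=end_cpu - begin_cpu,
        memory_min=min(memory_samples),
        memory_max=max(memory_samples),
    )


# Made-up readings: seconds for the clocks, kilobytes for the memory samples.
metrics = summarize_step(100.0, 160.5, 20.0, 45.25, [512, 2048, 1024])
print(metrics)
```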

    Expected output

    Duplicate values

    Scale          Number of Triples
    0 percent      2000000 triples
    25 percent     1500020 triples
    50 percent     1000020 triples
    75 percent     500020 triples
    100 percent    20 triples

    Empty values

    Scale          Number of Triples
    0 percent      2000000 triples
    25 percent     1500000 triples
    50 percent     1000000 triples
    75 percent     500000 triples
    100 percent    0 triples
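
    The empty-values counts above fit a simple linear relation: the number of triples shrinks in proportion to the share of empty values. The sketch below only checks that arithmetic against the reported figures; the function name is hypothetical, and the reading that each empty value yields no triple is an assumption, not something the description states.

```python
BASE_TRIPLES = 2_000_000  # triple count reported for the 0 percent scale


def expected_triples(empty_percent: int) -> int:
    """Triples left if the count scales linearly with non-empty values."""
    return BASE_TRIPLES * (100 - empty_percent) // 100


# Figures copied from the "Empty values" expected-output table.
reported = {0: 2_000_000, 25: 1_500_000, 50: 1_000_000, 75: 500_000, 100: 0}

for percent, triples in reported.items():
    assert expected_triples(percent) == triples
print("all reported counts match the linear relation")
```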

    Mappings

    Scale          Number of Triples
    1TM + 15POM    1500000 triples
    3TM + 5POM     1500000 triples
    5TM + 3POM     1500000 triples
    15TM + 1POM    1500000 triples

    Properties

    Scale               Number of Triples
    1M rows 1 column    1000000 triples
    1M rows 10

  11. Most popular database management systems in software companies in Russia...

    • statista.com
    Updated Oct 8, 2024
    Statista (2024). Most popular database management systems in software companies in Russia 2022 [Dataset]. https://www.statista.com/statistics/1330732/most-popular-dbms-in-software-companies-russia/
    Explore at:
    Dataset updated
    Oct 8, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2022 - May 2022
    Area covered
    Russia
    Description

    Approximately 82 percent of the surveyed software companies in Russia mentioned PostgreSQL, making it the most popular database management system (DBMS) in the period between February and May 2022. MS SQL and MySQL followed, having been mentioned by 47 percent and 41 percent of respondents, respectively.

  12. KGCW 2023 Challenge @ ESWC 2023

    • investigacion.usc.es
    • investigacion.usc.gal
    Updated 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Van Assche, Dylan; Chaves-Fraga, David; Dimou, Anastasia; Şimşek, Umutcan; Iglesias, Ana (2023). KGCW 2023 Challenge @ ESWC 2023 [Dataset]. https://investigacion.usc.es/documentos/668fc445b9e7c03b01bd84ec?lang=en
    Explore at:
    Dataset updated
    2023
    Authors
    Van Assche, Dylan; Chaves-Fraga, David; Dimou, Anastasia; Şimşek, Umutcan; Iglesias, Ana
    Description

    Knowledge Graph Construction Workshop 2023: challenge

    Knowledge graph construction of heterogeneous data has seen a lot of uptake
    in the last decade from compliance to performance optimizations with respect
    to execution time. Besides execution time as a metric for comparing knowledge
    graph construction, other metrics e.g. CPU or memory usage are not considered.
    This challenge aims at benchmarking systems to find which RDF graph
    construction system optimizes for metrics e.g. execution time, CPU,
    memory usage, or a combination of these metrics.

    Task description

    The task is to reduce and report the execution time and computing resources
    (CPU and memory usage) for the parameters listed in this challenge, compared
    to the state-of-the-art of the existing tools and the baseline results provided
    by this challenge. This challenge is not limited to execution times to create
    the fastest pipeline, but also computing resources to achieve the most efficient
    pipeline.

    We provide a tool which can execute such pipelines end-to-end. This tool also
    collects and aggregates the metrics such as execution time, CPU and memory
    usage, necessary for this challenge as CSV files. Moreover, the information
    about the hardware used during the execution of the pipeline is available as
    well to allow fairly comparing different pipelines. Your pipeline should consist
    of Docker images which can be executed on Linux to run the tool. The tool is
    already tested with existing systems, relational databases e.g. MySQL and
    PostgreSQL, and triplestores e.g. Apache Jena Fuseki and OpenLink Virtuoso
    which can be combined in any configuration. It is strongly encouraged to use
    this tool for participating in this challenge. If you prefer to use a different
    tool or our tool imposes technical requirements you cannot solve, please contact
    us directly.

    Part 1: Knowledge Graph Construction Parameters

    These parameters are evaluated using synthetic generated data to have more
    insights of their influence on the pipeline.

    Data

    • Number of data records: scaling the data size vertically by the number of records with a fixed number of data properties (10K, 100K, 1M, 10M records).
    • Number of data properties: scaling the data size horizontally by the number of data properties with a fixed number of data records (1, 10, 20, 30 columns).
    • Number of duplicate values: scaling the number of duplicate values in the dataset (0%, 25%, 50%, 75%, 100%).
    • Number of empty values: scaling the number of empty values in the dataset (0%, 25%, 50%, 75%, 100%).
    • Number of input files: scaling the number of datasets (1, 5, 10, 15).

    Mappings

    • Number of subjects: scaling the number of subjects with a fixed number of predicates and objects (1, 10, 20, 30 TMs).
    • Number of predicates and objects: scaling the number of predicates and objects with a fixed number of subjects (1, 10, 20, 30 POMs).
    • Number of and type of joins: scaling the number of joins and type of joins (1-1, N-1, 1-N, N-M)

    Part 2: GTFS-Madrid-Bench

    The GTFS-Madrid-Bench provides insights in the pipeline with real data from the
    public transport domain in Madrid.

    Scaling

    • GTFS-1 SQL
    • GTFS-10 SQL
    • GTFS-100 SQL
    • GTFS-1000 SQL

    Heterogeneity

    • GTFS-100 XML + JSON
    • GTFS-100 CSV + XML
    • GTFS-100 CSV + JSON
    • GTFS-100 SQL + XML + JSON + CSV

    Example pipeline

    The ground truth dataset and baseline results are generated in different steps
    for each parameter:

    1. The provided CSV files and SQL schema are loaded into a MySQL relational database.
    2. Mappings are executed by accessing the MySQL relational database to construct a knowledge graph in N-Triples as RDF format.
    3. The constructed knowledge graph is loaded into a Virtuoso triplestore, tuned according to the Virtuoso documentation.
    4. The provided SPARQL queries are executed on the SPARQL endpoint exposed by Virtuoso.

    The pipeline is executed 5 times from which the median execution time of each
    step is calculated and reported. Each step with the median execution time is
    then reported in the baseline results with all its measured metrics.
    Query timeout is set to 1 hour and knowledge graph construction timeout
    to 24 hours. The execution is performed with the following tool

  13. Annotated transcriptome and associated datasets of flatback mud crabs...

    • data.griidc.org
    • search.dataone.org
    Updated Mar 24, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Hernan Vazquez Miranda (2017). Annotated transcriptome and associated datasets of flatback mud crabs (Eurypanopeus depressus) exposed to dispersed oil [Dataset]. http://doi.org/10.7266/N71C1TZC
    Explore at:
    Dataset updated
    Mar 24, 2017
    Dataset provided by
    GRIIDC
    Authors
    Hernan Vazquez Miranda
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Flatback mud crab (Eurypanopeus depressus) transcriptome was assembled from 15 individuals sequenced in 2 Illumina HiSeq2000 lanes PE100 using Trinity 2.0.3 and annotated with Trinotate 2.0.1 on a custom MySQL database from The Broad Institute. We exposed 3 flatback mud crabs to each of 4 treatments (total = 12 individuals): non-aerated control, aerated control, oil only (Marlin platform Dorado 1 g/l), and oil-dispersant (Marlin platform Dorado 1 g/l, COREXIT 9500 0.1 g/l) for 72 hours to assess the up- and downregulation of genes in muscle tissues. To account for stress caused by laboratory treatments, muscle tissue from three reference individuals that were sacrificed without being exposed to any lab treatments was also analyzed. This dataset reports upregulated and downregulated gene expression. NCBI accession numbers are provided for each sample.

  14. Usage Statistics for University of Tasmania EPrints Repository

    • researchdata.edu.au
    Updated 2019
    Arthur Sale; A H J Sale (2019). Usage Statistics for University of Tasmania EPrints Repository [Dataset]. https://researchdata.edu.au/usage-statistics-university-eprints-repository/1668483
    Explore at:
    Dataset updated
    2019
    Dataset provided by
    University of Tasmania, Australia
    Authors
    Arthur Sale; A H J Sale
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset is an active collection of access data to information items in the University of Tasmania's EPrints repository. Each night a task is scheduled to run, and this picks up in the Apache access logs from where it left off the previous night. Each download of an open access full-text item causes the generation of a database record in the MySQL database, together with a timestamp, and an approximate location of the computer system generating the download. This is achieved by looking up the IP address against the GeoIP database, with one significant difference. Downloads originating from a University of Tasmania IP address are separately identified, and removed from the Australia category. This eliminates vanity searches from achieving high significance. Countries are coded using the ISO3166 two-letter code.
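
    The classification step described above (look up each download's IP address, but count university addresses separately from Australia) can be sketched as follows. This is not the repository's PHP/Perl code: the network range is a documentation-only placeholder, the dictionary is a stand-in for a real GeoIP lookup, and the "UTAS" label is a made-up category name.

```python
import ipaddress
from collections import Counter

# Hypothetical values: a reserved documentation range stands in for the
# university network, and a small dict stands in for the GeoIP database.
UNIVERSITY_NETWORK = ipaddress.ip_network("192.0.2.0/24")
FAKE_GEOIP = {"198.51.100.7": "AU", "203.0.113.9": "GB"}
UNIVERSITY_CODE = "UTAS"  # made-up category label, not an ISO 3166 code
UNKNOWN_CODE = "??"       # used when the lookup has no entry


def classify(ip: str) -> str:
    """Return the category a download from this IP address is counted under."""
    if ipaddress.ip_address(ip) in UNIVERSITY_NETWORK:
        return UNIVERSITY_CODE  # kept out of the Australia category
    return FAKE_GEOIP.get(ip, UNKNOWN_CODE)


def count_downloads(ips):
    """Tally downloads per category for a batch of client IP addresses."""
    return Counter(classify(ip) for ip in ips)


counts = count_downloads(["192.0.2.15", "198.51.100.7", "203.0.113.9", "192.0.2.15"])
print(counts)
```

    Checking the university range before the country lookup is what keeps on-campus downloads from inflating the national figure.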

    The dataset has been used to analyse the usage made of the repository and to tune it to achieve maximal visibility for the University of Tasmania. Researchers with items in the repository have used it to identify the types of use being made of their work, and to find potential collaborators. The citation of a work in a journal or conference article, for example, causes a typical step in usage, and the citing article can be searched in Google or Google Scholar to identify the authors. This enhances the dissemination experience and its value.

    The software was written at the University of Tasmania by Professor Arthur Sale (in PHP) based on earlier work by the University of Melbourne (with permission). Mr Christian McGee wrote some critical sections of the code in Perl, and set up the cron scheduling.

    The dataset is generated by a computer program written by Professor Arthur Sale. The software was a test bed for ideas, and subsequently resulted in an official software set included in the EPrints distribution. This set expanded on the concepts significantly

  15. Site occupancy matrices, The River Ouse Project

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 13, 2020
    Pilkington, Margaret (2020). Site occupancy matrices, The River Ouse Project [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3691906
    Explore at:
    Dataset updated
    May 13, 2020
    Dataset provided by
    Pilkington, John
    Pilkington, Margaret
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    River Ouse
    Description

    The River Ouse Project was started by Dr Margaret Pilkington and colleagues in the Centre for Continuing Education, University of Sussex. Margaret is now retired with emeritus status and continues to run the project with a team of volunteers, in association with the University of Sussex. The team does botanical surveys of streamside grassland and steep wooded valleys (gills) in the upper reaches of the Sussex Ouse, a short flashy river arising on the southern slopes of the High Weald AONB (Area of Outstanding Natural Beauty). Survey sites are chosen on the basis of species richness, potential for restoration and contribution to flood control, and surveyed using the sampling methods outlined in Rodwell, J S (1992. British Plant Communities, Volume 3, Grasslands and Montane Communities). Survey data are transferred from the paper record taken in the field to Excel spreadsheets, and from there after validation and cleaning into two MySQL (MariaDB) databases, meadows and gills.

    The file is an extract from the meadows database. It contains binary data of the site occupancy for most of the plants encountered in meadow sites (stands, assemblies) sampled using five 2m x 2m quadrats. Details of the database are available here: River Ouse Project databases.
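
    For readers unfamiliar with the format, a binary site occupancy matrix has one row per site, one column per species, and a 1 wherever the species was recorded. The sketch below builds one from (site, species) records; the function name and the placeholder records are hypothetical, and treating a species as present if it was recorded at all at a site is an assumption about how the quadrat data are combined.

```python
def occupancy_matrix(records):
    """Build a binary presence/absence matrix from (site, species) pairs.

    Returns the sorted site labels (rows), the sorted species labels
    (columns), and the matrix as a list of 0/1 rows.
    """
    sites = sorted({site for site, _ in records})
    species = sorted({sp for _, sp in records})
    present = set(records)
    matrix = [
        [1 if (site, sp) in present else 0 for sp in species]
        for site in sites
    ]
    return sites, species, matrix


# Placeholder records, not taken from the meadows database.
records = [
    ("site_A", "species_1"),
    ("site_A", "species_2"),
    ("site_B", "species_2"),
    ("site_B", "species_3"),
]
sites, species, matrix = occupancy_matrix(records)
for site, row in zip(sites, matrix):
    print(site, row)
```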

    For further details and access to the full database contact the author.

Lacruz Pleguezuelos, Blanca (2023). FooDrugs database: A database with molecular and text information about food - drug interactions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6638469

FooDrugs database: A database with molecular and text information about food - drug interactions

Explore at:
Dataset updated
Jul 28, 2023
Dataset provided by
Lacruz Pleguezuelos, Blanca
Carrillo de Santa Pau, Enrique
Pérez, David
Laguna Lobo, Teresa
Piette Gómez, Óscar
Garranzo, Marco
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

FooDrugs database is a development done by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain), in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (See more details about FNS-Cloud below)

FooDrugs stores information extracted from transcriptomics and text documents for food-drug interactions and is part of a demonstrator to be built in the FNS-Cloud project. The database was built using MySQL, an open source relational database management system. FooDrugs hosts information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:

Table “study” : This table contains basic information about study identifiers from GEO, pubmed or platform, study type, title and abstract

Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.

Table “misc_study”: This table contains additional information about different attributes of the study.

Table “misc_sample”: This table contains additional information about different attributes of the sample.

Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.

Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.

Table “topTable”: This table contains information about 150 over- and underexpressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and whether the gene is up- or downregulated.

Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.

In addition, FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:

Table “texts”: This table contains all the documents, with their identifiers, in which interactions have been identified using the strategy described in section 4.

Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and the end positions of the context for the interaction in the document.
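
How the two text-mining tables relate can be sketched with a small join. The description names only the tables and the kind of information they hold, so every column name below is hypothetical, SQLite (in memory) stands in for the MySQL database, and the sample sentence is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns: the source describes the content of each table,
# not its actual schema.
conn.executescript("""
CREATE TABLE texts (
    text_id  INTEGER PRIMARY KEY,
    source   TEXT,
    document TEXT
);
CREATE TABLE TM_interactions (
    interaction_id INTEGER PRIMARY KEY,
    text_id        INTEGER REFERENCES texts(text_id),
    food           TEXT,
    drug           TEXT,
    start_pos      INTEGER,  -- 0-based start of the context in the document
    end_pos        INTEGER   -- exclusive end of the context
);
""")

# Made-up document and interaction context.
context = "Grapefruit juice may increase exposure to drug X."
document = "Background. " + context + " Further text."
start = document.index(context)
end = start + len(context)

conn.execute("INSERT INTO texts VALUES (1, 'PubMed', ?)", (document,))
conn.execute(
    "INSERT INTO TM_interactions VALUES (1, 1, 'grapefruit juice', 'drug X', ?, ?)",
    (start, end),
)

# substr() is 1-based in SQLite, hence start_pos + 1.
rows = conn.execute("""
    SELECT i.food, i.drug,
           substr(t.document, i.start_pos + 1, i.end_pos - i.start_pos)
    FROM TM_interactions AS i
    JOIN texts AS t ON t.text_id = i.text_id
""").fetchall()
print(rows)
```

Storing only the start and end positions keeps each document in one place while still letting a query recover the sentence that supports an interaction.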

FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research, addressing FNS, diet, health, and consumer behaviours as well as sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry, from developing effective solutions, making production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. FNS-Cloud will, through three Demonstrators (Agri-Food, Nutrition & Lifestyle, and NCDs & the Microbiome), facilitate: (1) Analyses of regional and country-specific differences in diet including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) Improved understanding of agricultural differences within Europe and what these mean in terms of creating sustainable, resilient food systems for healthy diets; and (3) Clear definitions of boundaries and how these affect the compositions of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of the FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross-talk amongst them; access to FNS-Cloud data will be open access, underpinned by FAIR principles (findable, accessible, interoperable and re-useable). FNS-Cloud will work closely with the proposed Food, Nutrition and Health Research Infrastructure (FNHRI) as well as METROFOOD-RI and other existing ESFRI RIs (e.g. ELIXIR, ECRIN) in which several FNS-Cloud Beneficiaries are involved directly. (https://cordis.europa.eu/project/id/863059)
