37 datasets found
  1. Disease or Syndrome Concepts and Types

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Disease or Syndrome Concepts and Types [Dataset]. https://www.johnsnowlabs.com/marketplace/disease-or-syndrome-concepts-and-types/
    Dataset authored and provided by
    John Snow Labs
    Area covered
    N/A
    Description

    This dataset contains the entire concept structure of UMLS Metathesaurus for the semantic type "Disease or Syndrome". One of the primary purposes of this dataset is to connect different names for all the concepts for a specific Semantic Type. There are 125 semantic types in the Semantic Network. Every Metathesaurus concept is assigned at least one semantic type; very few terms are assigned as many as five semantic types.

  2. Aerial Semantic Segmentation Drone Dataset

    • kaggle.com
    Updated Jan 10, 2021
    Cite
    Bulent Siyah (2021). Aerial Semantic Segmentation Drone Dataset [Dataset]. https://www.kaggle.com/datasets/bulentsiyah/semantic-drone-dataset/suggestions
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bulent Siyah
    Description

    Dataset Resource: https://www.tugraz.at/index.php?id=22387

    Citation: If you use this dataset in your research, please cite the following URL:

    http://dronedataset.icg.tugraz.at

    License: The Drone Dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data given that you agree:

    • That the dataset comes "AS IS", without express or implied warranty. Although every effort has been made to ensure accuracy, we (Graz University of Technology) do not accept any responsibility for errors or omissions.
    • That you include a reference to the Semantic Drone Dataset in any work that makes use of the dataset. For research papers or other media link to the Semantic Drone Dataset webpage.
    • That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works in as far as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not directly include any of our data) and do not allow to recover the dataset or something similar in character.
    • That you may not use the dataset or any derivative work for commercial purposes as, for example, licensing or selling the data, or using the data with a purpose to procure a commercial gain.
    • That all rights not expressly granted to you are reserved by us (Graz University of Technology).

    Dataset Overview: The Semantic Drone Dataset focuses on semantic understanding of urban scenes for increasing the safety of autonomous drone flight and landing procedures. The imagery depicts more than 20 houses from nadir (bird's eye) view acquired at an altitude of 5 to 30 meters above ground. A high resolution camera was used to acquire images at a size of 6000x4000px (24Mpx). The training set contains 400 publicly available images and the test set is made up of 200 private images.

    Person Detection: For the task of person detection, the dataset contains bounding box annotations of the training and test set.

    Semantic Segmentation: We prepared pixel-accurate annotation for the same training and test set. The complexity of the dataset is limited to 20 classes as listed in the following table.

    Table 1: Semantic classes of the Drone Dataset

    tree, grass, other vegetation, dirt, gravel, rocks, water, paved area, pool, person, dog, car, bicycle, roof, wall, fence, fence-pole, window, door, obstacle
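
    As a minimal sketch, the 20 class names listed above can be turned into a lookup table for training code. The integer ids below simply follow the order of the list and are an assumption; the dataset's own label encoding may differ.

```python
# Class names as listed in Table 1 (second entry spelled "grass" here).
CLASSES = [
    "tree", "grass", "other vegetation", "dirt", "gravel", "rocks", "water",
    "paved area", "pool", "person", "dog", "car", "bicycle", "roof", "wall",
    "fence", "fence-pole", "window", "door", "obstacle",
]

# Hypothetical ids: position in the list, NOT the dataset's official encoding.
CLASS_TO_ID = {name: idx for idx, name in enumerate(CLASSES)}
ID_TO_CLASS = {idx: name for name, idx in CLASS_TO_ID.items()}

print(len(CLASSES), CLASS_TO_ID["person"], ID_TO_CLASS[0])
```

    Keeping both directions of the mapping makes it easy to decode predicted masks back into readable class names.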

  3. Intelligent Semantic Data Service Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 2, 2025
    Cite
    Market Report Analytics (2025). Intelligent Semantic Data Service Report [Dataset]. https://www.marketreportanalytics.com/reports/intelligent-semantic-data-service-54001
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Variables measured
    Market Size
    Description

    The Intelligent Semantic Data Service (ISDS) market is experiencing robust growth, driven by the increasing need for businesses to derive actionable insights from complex and unstructured data. The market, estimated at $15 billion in 2025, is projected to expand at a Compound Annual Growth Rate (CAGR) of 20% from 2025 to 2033, reaching an estimated $70 billion by 2033.

    This growth is fueled by several key factors. Firstly, the rise of big data and the limitations of traditional data processing techniques are pushing organizations toward sophisticated solutions like ISDS to unlock the true potential of their data assets. Secondly, advancements in artificial intelligence (AI), natural language processing (NLP), and machine learning (ML) are enhancing the capabilities of ISDS, enabling more accurate and insightful data analysis. Thirdly, cloud-based deployments of ISDS are gaining significant traction, offering scalability, cost-effectiveness, and accessibility to a wider range of users.

    The enterprise segment currently dominates the market, driven by the need for improved operational efficiency, better decision-making, and enhanced customer experience. However, the personal segment is expected to witness faster growth due to increasing consumer adoption of AI-powered applications and smart devices. The competitive landscape is highly dynamic, with major technology companies like Google, IBM, Microsoft, Amazon, and Salesforce vying for market share. OpenAI, Alibaba, and Tencent are also making significant strides in the development and deployment of advanced ISDS solutions. North America currently holds the largest market share, fueled by early adoption and high technology investment. However, Asia-Pacific is expected to demonstrate the fastest growth, driven by rapid digital transformation in regions like China and India.

    Despite the significant opportunities, certain restraints remain. These include the high initial investment costs associated with ISDS implementation, the need for skilled professionals to manage and interpret the generated insights, and concerns related to data privacy and security. The market is further segmented by deployment type (cloud-based and on-premises) and application (enterprise and personal), reflecting the diverse needs and preferences of different user segments. Addressing these challenges will be crucial for continued market expansion and broader adoption of ISDS.
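
    The projection in this description is compound growth: a value V growing at an annual rate r for n years becomes V(1 + r)^n. A minimal sketch of that arithmetic, using only the figures quoted in the description (the function names are illustrative):

```python
def project(value, cagr, years):
    """Compound a starting value forward at a constant annual growth rate."""
    return value * (1 + cagr) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


YEARS = 2033 - 2025  # 8 compounding periods

# Figures from the description: $15 billion in 2025, 20% CAGR, $70 billion in 2033.
projected_2033 = project(15.0, 0.20, YEARS)
rate_for_70 = implied_cagr(15.0, 70.0, YEARS)

print(f"20% CAGR from $15B gives ${projected_2033:.1f}B in 2033")
print(f"Reaching $70B needs a CAGR of {rate_for_70:.1%}")
```

    With these inputs a strict 20% rate gives roughly $64.5 billion by 2033, and reaching $70 billion would need a rate of about 21%, so the quoted figures are best read as rounded.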

  4. tFood: Semantic Table Annotations Benchmark for Food Domain

    • zenodo.org
    zip
    Updated Dec 7, 2023
    Cite
    Ernesto Jimènez-Ruiz; Oktie Hassanzadeh; Birgitta König-Ries (2023). tFood: Semantic Table Annotations Benchmark for Food Domain [Dataset]. http://doi.org/10.5281/zenodo.10048187
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ernesto Jimènez-Ruiz; Oktie Hassanzadeh; Birgitta König-Ries
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    tFood is a dataset for matching tabular data to knowledge graphs. It is derived for the food domain and has two types of tables: Horizontal Relational Tables, in which each table represents a collection of entities, and Entity Tables, each of which represents a single entity. Ground truth data is provided from Wikidata as the target knowledge graph (KG).

    The supported tasks for semantic table annotations are:

    1. Topic Detection (TD) links the entire table to an entity or a class from the target KG.
    2. Cell Entity Annotation (CEA) maps individual table cells to entities from the target KG.
    3. Column Type Annotation (CTA) links individual table columns to classes from the target KG.
    4. Column Property Annotation (CPA) detects the relations between column pairs from the target knowledge graph.
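
    To make the four tasks concrete, here is a small sketch of how annotations for one toy table might be held in memory. The table contents, the `Q_...` and `P_...` identifiers, and the record layout are all made up for illustration; they are not real Wikidata ids and not the benchmark's actual file format.

```python
from dataclasses import dataclass, field


@dataclass
class TableAnnotations:
    topic: str = ""                                     # TD: whole table -> entity or class
    cells: dict = field(default_factory=dict)           # CEA: (row, col) -> entity
    column_types: dict = field(default_factory=dict)    # CTA: col -> class
    column_props: dict = field(default_factory=dict)    # CPA: (col_a, col_b) -> property


# Toy horizontal table: row 0 is the header.
table = [
    ["dish", "origin"],
    ["pizza", "Italy"],
    ["sushi", "Japan"],
]

ann = TableAnnotations(topic="Q_DISH")                  # placeholder ids throughout
ann.cells[(1, 0)] = "Q_PIZZA"
ann.cells[(1, 1)] = "Q_ITALY"
ann.column_types[0] = "Q_DISH"
ann.column_types[1] = "Q_COUNTRY"
ann.column_props[(0, 1)] = "P_COUNTRY_OF_ORIGIN"


def annotated_cells(table, ann):
    """Pair each annotated cell's text with its target entity, in row/column order."""
    return [(table[r][c], entity) for (r, c), entity in sorted(ann.cells.items())]


print(annotated_cells(table, ann))
```

    Keeping the four annotation kinds in separate fields mirrors the way each task is scored independently against its own ground truth.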

    This dataset version will be used during SemTab 2023 - Round 1. So, the ground truth data for the test set is currently hidden. We will add such ground truth after the conclusion of the challenge.

  5. tBiomed: Semantic Table Annotations Benchmark for Biomedical Domain

    • zenodo.org
    • data.niaid.nih.gov
    Updated Apr 19, 2024
    Cite
    Nora Abdelmageed; Ernesto Jimènez-Ruiz; Oktie Hassanzadeh; Birgitta König-Ries (2024). tBiomed: Semantic Table Annotations Benchmark for Biomedical Domain [Dataset]. http://doi.org/10.5281/zenodo.10996334
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nora Abdelmageed; Ernesto Jimènez-Ruiz; Oktie Hassanzadeh; Birgitta König-Ries
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    tBiomed is a dataset for matching tabular data to knowledge graphs. It is derived for the biomedical domain and has two types of tables: Horizontal Relational Tables, in which each table represents a collection of entities, and Entity Tables, each of which represents a single entity. Ground truth data is provided from Wikidata as the target knowledge graph (KG).

    tBiomed is generated by KG2Tables using two levels of a recursive hierarchy of related concepts in Wikidata.

    tBiomed contains 26,778 entity and horizontal tables. This repository contains only a validation fold of the original data, representing 20% of the entire benchmark, together with its ground truth data (gt). The full size of this dataset is 1 GB.

    We included the full version of the dataset. We will update this repository with the ground truth data of the test set in the future.

    The supported tasks for semantic table annotations are:

    1. Topic Detection (TD) links the entire table to an entity or a class from the target KG.
    2. Cell Entity Annotation (CEA) maps individual table cells to entities from the target KG.
    3. Column Type Annotation (CTA) links individual table columns to classes from the target KG.
    4. Column Property Annotation (CPA) detects the relations between column pairs from the target knowledge graph.
    5. Row Annotation (RA) annotates the entire row to a KG entity or property.
  6. Semantic Knowledge Graphing Market is Growing at a CAGR of 14.80% from 2024...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Cite
    Cognitive Market Research, Semantic Knowledge Graphing Market is Growing at a CAGR of 14.80% from 2024 to 2031. [Dataset]. https://www.cognitivemarketresearch.com/semantic-knowledge-graphing-market-report
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global semantic knowledge graphing market size is USD 1512.2 million in 2024 and will expand at a compound annual growth rate (CAGR) of 14.80% from 2024 to 2031.

    • North America held the largest share, around 40% of global revenue, with a market size of USD 604.88 million in 2024, and will grow at a CAGR of 13.0% from 2024 to 2031.
    • Europe accounted for a share of over 30% of the global market, with a market size of USD 453.66 million.
    • Asia Pacific held around 23% of global revenue, with a market size of USD 347.81 million in 2024, and will grow at a CAGR of 16.8% from 2024 to 2031.
    • Latin America held around 5% of global revenue, with a market size of USD 75.61 million in 2024, and will grow at a CAGR of 14.2% from 2024 to 2031.
    • Middle East and Africa held around 2% of global revenue, with a market size of USD 30.24 million in 2024, and will grow at a CAGR of 14.5% from 2024 to 2031.
    • Natural language processing knowledge graphing had the highest growth rate in the semantic knowledge graphing market in 2024.
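
    The regional figures can be sanity-checked by multiplying the global market size by each quoted share. A short sketch using only the numbers given in this description (the shares are the rounded percentages from the text, so small differences from the published values are expected):

```python
GLOBAL_2024_USD_M = 1512.2  # global market size in 2024, USD million

# Approximate regional shares as quoted in the description.
SHARES = {
    "North America": 0.40,
    "Europe": 0.30,
    "Asia Pacific": 0.23,
    "Latin America": 0.05,
    "Middle East and Africa": 0.02,
}

# Regional market size implied by each share, rounded to two decimals.
regional = {
    region: round(GLOBAL_2024_USD_M * share, 2)
    for region, share in SHARES.items()
}

for region, size in regional.items():
    print(f"{region}: USD {size} million")
```

    The implied values match the quoted regional sizes to two decimals, and the five shares sum to 100%.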
    

    Market Dynamics of Semantic Knowledge Graphing Market

    Key Drivers of Semantic Knowledge Graphing Market

    Growing Volumes of Structured, Semi-structured, and Unstructured Data to Increase the Global Demand
    

    The global demand for semantic knowledge graphing is escalating in response to the exponential growth of structured, semi-structured, and unstructured data. Enterprises are inundated with vast amounts of data from diverse sources such as social media, IoT devices, and enterprise applications. Structured data from databases, semi-structured data like XML and JSON, and unstructured data from documents, emails, and multimedia files present significant challenges in terms of organization, analysis, and deriving actionable insights. Semantic knowledge graphing addresses these challenges by providing a unified framework for representing, integrating, and analyzing disparate data types. By leveraging semantic technologies, businesses can unlock the value hidden within their data, enabling advanced analytics, natural language processing, and knowledge discovery. As organizations increasingly recognize the importance of harnessing data for strategic decision-making, the demand for semantic knowledge graphing solutions continues to surge globally.

    Demand for Contextual Insights to Propel the Growth
    

    The burgeoning demand for contextual insights is propelling the growth of semantic knowledge graphing solutions. In today's data-driven landscape, businesses are striving to extract deeper contextual meaning from their vast datasets to gain a competitive edge. Semantic knowledge graphing enables organizations to connect disparate data points, understand relationships, and derive valuable insights within the appropriate context. This contextual understanding is crucial for various applications such as personalized recommendations, predictive analytics, and targeted marketing campaigns. By leveraging semantic technologies, companies can not only enhance decision-making processes but also improve customer experiences and operational efficiency. As industries across sectors increasingly recognize the importance of contextual insights in driving innovation and business success, the adoption of semantic knowledge graphing solutions is poised to witness significant growth. This trend underscores the pivotal role of semantic technologies in unlocking the true potential of data for strategic advantage in today's dynamic marketplace.

    Restraint Factors of Semantic Knowledge Graphing Market

    Stringent Data Privacy Regulations to Hinder the Market Growth
    

    Stringent data privacy regulations present a significant hurdle to the growth of the Semantic Knowledge Graphing market. Regulations such as GDPR (General Data Protection Regulation) in Europe and CCPA (California Consumer Privacy Act) in the United States impose strict requirements on how organizations collect, store, process, and share personal data. Compliance with these regulations necessitates robust data protection measures, including anonymization, encryption, and access controls, which can complicate the implementation of semantic knowledge graphing systems. Moreover, concerns about data breach...

  7. Semantic Segmentation of Crop Type in Ghana

    • cmr.earthdata.nasa.gov
    Updated Oct 10, 2023
    Cite
    (2023). Semantic Segmentation of Crop Type in Ghana [Dataset]. http://doi.org/10.34911/rdnt.ry138p
    Time period covered
    Jan 1, 2020 - Jan 1, 2023
    Description

    Automatic, accurate crop type maps can provide unprecedented information for understanding food systems, especially in developing countries where ground surveys are infrequent. However, little work has applied existing methods to these data-scarce environments, which also pose unique challenges: irregularly shaped fields, frequent cloud coverage, small plots, and a severe lack of training data. To address this gap in the literature, we provide the first crop type semantic segmentation dataset of smallholder farms, specifically in Ghana and South Sudan. We are also the first to utilize high-resolution, high-frequency satellite data in segmenting smallholder farms.

    The dataset includes time series of satellite imagery from Sentinel-1, Sentinel-2, and PlanetScope satellites throughout 2016 and 2017. For each tile/chip in the dataset, there are time series of imagery from each of the satellites, as well as a corresponding label that defines the crop type at each pixel. The label has only one value at each pixel location, and assumes that the crop type remains the same across the full time span of the satellite image time series. In many cases where ground truth was not available, pixels have no label and are set to a value of 0.
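
    Because unlabeled pixels are stored as 0, any metric computed on these labels should skip them. A minimal pure-Python sketch of masked pixel accuracy follows; the tiny 2x3 arrays and the crop ids 1 and 2 are invented for illustration.

```python
UNLABELED = 0  # per the description, pixels without ground truth are set to 0


def masked_accuracy(pred, label):
    """Fraction of labeled pixels predicted correctly; unlabeled pixels are ignored."""
    correct = 0
    total = 0
    for pred_row, label_row in zip(pred, label):
        for p, y in zip(pred_row, label_row):
            if y == UNLABELED:
                continue  # no ground truth here, so the pixel does not count
            total += 1
            correct += int(p == y)
    return correct / total if total else float("nan")


# Hypothetical label tile and prediction (1 and 2 are made-up crop ids).
label = [[1, 0, 2],
         [0, 2, 2]]
pred = [[1, 1, 2],
        [2, 1, 2]]

acc = masked_accuracy(pred, label)
print(acc)
```

    Here four pixels carry labels and three of them are predicted correctly; the two unlabeled pixels do not affect the score regardless of what the model predicts there.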

  8. tBiodiv: Semantic Table Annotations Benchmark for Biodiversity Domain

    • data.niaid.nih.gov
    Updated Apr 19, 2024
    Cite
    König-Ries, Birgitta (2024). tBiodiv: Semantic Table Annotations Benchmark for Biodiversity Domain [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10283014
    Dataset provided by
    Hassanzadeh, Oktie
    Jimènez-Ruiz, Ernesto
    Abdelmageed, Nora
    König-Ries, Birgitta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    tBiodiv is a dataset for matching tabular data to knowledge graphs. It is derived for the biodiversity domain and has two types of tables: Horizontal Relational Tables, in which each table represents a collection of entities, and Entity Tables, each of which represents a single entity. Ground truth data is provided from Wikidata as the target knowledge graph (KG).

    tBiodiv is generated by KG2Tables using two levels of a recursive hierarchy of related concepts in Wikidata.

    We updated this repository with the full version of the dataset. We will update it again with the test ground truth (gt) data in the future.

    The supported tasks for semantic table annotations are:

    1. Topic Detection (TD) links the entire table to an entity or a class from the target KG.
    2. Cell Entity Annotation (CEA) maps individual table cells to entities from the target KG.
    3. Column Type Annotation (CTA) links individual table columns to classes from the target KG.
    4. Column Property Annotation (CPA) detects the relations between column pairs from the target knowledge graph.
    5. Row Annotation (RA) annotates the entire row to a KG entity or property.

  9. Replication data for: Prefix variation in путать: в-. за-, пере- and с-

    • dataverse.azure.uit.no
    • dataone.org
    bin +1
    Updated Sep 29, 2023
    Cite
    Maria Nordrum (2023). Replication data for: Prefix variation in путать: в-. за-, пере- and с- [Dataset]. http://doi.org/10.18710/0JC95M
    Available download formats: text/plain; charset=us-ascii (10169), text/plain; charset=us-ascii (411), bin (110637)
    Dataset provided by
    DataverseNO
    Authors
    Maria Nordrum
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Russia, Norway
    Description

    This case study of the four Natural Perfectives of the Russian simplex verb путать ‘tangle’ sheds light on the following questions: Is it possible to predict the choice of prefix when there is prefix variation in Russian? And if yes, how? Since these questions are particularly relevant for second-language learners, the author also discusses how the present study, and similar ones, can be used to make second-language learning of Russian more effective. The analysis is based on a database of 630 sentences from the Russian National Corpus (RNC) and takes two factors into consideration: type of construction and semantic category of the internal argument.

    The uploaded data contain 3 files:

    • "Database, everything": Each sentence is tagged according to prefix, form of the verb (Active vs Passive), type of construction and semantic category of the internal argument. The four types of constructions and four types of semantic categories are explained with examples from the database inside the article.
    • "Database_simplified": This version of the database contains the three parameters for the sentences: prefix, type of construction and semantic category of the internal argument. The simplified database was created to do statistical analyses in R.
    • "R_putat": The R script that was used to produce the cTree presented in the article.

  10. Data from: Code4ML: a Large-scale Dataset of annotated Machine Learning Code...

    • zenodo.org
    csv
    Updated Sep 15, 2023
    Cite
    Anonymous authors (2023). Code4ML: a Large-scale Dataset of annotated Machine Learning Code [Dataset]. http://doi.org/10.5281/zenodo.6607065
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymous authors
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present Code4ML: a Large-scale Dataset of annotated Machine Learning Code, a corpus of Python code snippets, competition summaries, and data descriptions from Kaggle.

    The data is organized in a table structure. Code4ML includes several main objects: competition information, raw code blocks collected from Kaggle, and manually marked-up snippets. Each table is in .csv format.

    Each competition has the text description and metadata, reflecting competition and used dataset characteristics as well as evaluation metrics (competitions.csv). The corresponding datasets can be loaded using Kaggle API and data sources.

    The code blocks and their metadata are collected into data frames according to the publishing year of the initial kernels. The current version of the corpus includes two code block files: snippets from kernels up to 2020 (сode_blocks_upto_20.csv) and those from 2021 (сode_blocks_21.csv), with corresponding metadata. The corpus consists of 2 743 615 ML code blocks collected from 107 524 Jupyter notebooks.

    Marked up code blocks have the following metadata: anonymized id, the format of the used data (for example, table or audio), the id of the semantic type, a flag for the code errors, the estimated relevance to the semantic class (from 1 to 5), the id of the parent notebook, and the name of the competition. The current version of the corpus has ~12 000 labeled snippets (markup_data_20220415.csv).

    As marked up code blocks data contains the numeric id of the code block semantic type, we also provide a mapping from this number to semantic type and subclass (actual_graph_2022-06-01.csv).
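
    Since the labelled snippets carry only a numeric semantic type id, they need to be joined with the mapping table to get readable labels. The sketch below shows such a join with the standard `csv` module. The column names (`code_block_id`, `graph_vertex_id`, `code_block`, `semantic_type`) and the sample rows are assumptions for illustration, so check the headers of the actual files before relying on them.

```python
import csv
import io

# Stand-ins for the markup and mapping CSV files (hypothetical columns and rows).
markup_csv = io.StringIO(
    "code_block_id,graph_vertex_id,code_block\n"
    "1,7,import pandas as pd\n"
    '2,12,"model.fit(X, y)"\n'
)
graph_csv = io.StringIO(
    "graph_vertex_id,semantic_type\n"
    "7,import\n"
    "12,model training\n"
)

# Build the id -> semantic type lookup from the mapping table.
id_to_type = {
    row["graph_vertex_id"]: row["semantic_type"]
    for row in csv.DictReader(graph_csv)
}

# Attach the readable semantic type to every labelled snippet.
labelled = [
    {**row, "semantic_type": id_to_type.get(row["graph_vertex_id"], "unknown")}
    for row in csv.DictReader(markup_csv)
]

for row in labelled:
    print(row["code_block_id"], row["semantic_type"], "|", row["code_block"])
```

    With the real files, the two `io.StringIO` objects would be replaced by opened file handles; ids are compared as strings here because `csv.DictReader` does not convert types.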

    The dataset can help solve various problems, including code synthesis from a prompt in natural language, code autocompletion, and semantic code classification.

  11. epoch data after pre-processing

    • figshare.com
    hdf
    Updated Nov 24, 2022
    Cite
    Yali Pan (2022). epoch data after pre-processing [Dataset]. http://doi.org/10.6084/m9.figshare.21206990.v4
    Dataset provided by
    figshare
    Authors
    Yali Pan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The epoch data are a FieldTrip-format structure. All epochs (ending with 'WrdOn') are aligned with fixation onset to a given word (as time 0), with a length of one second ([-0.5 0.5] s). Epochs were named as the combination of the acquisition date, subject code, and data type. Epochs ending with 'BL_Cross' are the baseline period before the presentation of the sentences. The field 'trialinfo' holds the information about each trial; its columns are as follows:

    1. sentence_id: sentence number for this epoch
    2. word_loc: the location of the current word in a sentence
    3. loc2targ: location distance between the current word and target word; loc2targ for pre-target, target, post-target are -1, 0, and 1
    4. saccade2this_duration: saccade duration toward this word
    5. fixation_on_MEG: MEG trigger for fixation onset to this word
    6. fixation_duration
    7. NextOrder: next word location minus the current word location; negative value indicates saccade backward to the previous words
    8. FirstPassFix: whether this fixation is the first for this word or not
    9. PreviousOrder: previous word location minus the current word location; negative value indicates saccade forward to the next words
    10. SentenceCondition: the current word is in a sentence with incongruent or congruent target word; 11 -- incongruent, 2 -- congruent
    11. PupilSize: averaged pupil size during this fixation
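
    The column order of 'trialinfo' can be captured once so that rows are read by name rather than by position. A small sketch; the sample row values are invented, and only the column names and their order come from the description.

```python
# Column names in the order given in the description of 'trialinfo'.
TRIALINFO_COLUMNS = [
    "sentence_id", "word_loc", "loc2targ", "saccade2this_duration",
    "fixation_on_MEG", "fixation_duration", "NextOrder", "FirstPassFix",
    "PreviousOrder", "SentenceCondition", "PupilSize",
]


def row_to_dict(row):
    """Turn one positional trialinfo row into a name -> value mapping."""
    if len(row) != len(TRIALINFO_COLUMNS):
        raise ValueError(
            f"expected {len(TRIALINFO_COLUMNS)} values, got {len(row)}"
        )
    return dict(zip(TRIALINFO_COLUMNS, row))


def is_target(trial):
    """loc2targ is 0 when the fixated word is the target word."""
    return trial["loc2targ"] == 0


# Hypothetical row, for illustration only.
sample_row = [3, 5, 0, 0.03, 101, 0.21, 1, 1, -1, 2, 812.4]
trial = row_to_dict(sample_row)
print(trial["word_loc"], is_target(trial))
```

    The length check guards against silently misaligned columns if a file with a different layout is loaded.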

  12. Mapping from the data types to the oligonucleotides.

    • figshare.com
    xls
    Updated May 31, 2023
    Cite
    Heng Sun; Jian Weng; Guangchuang Yu; Richard H. Massawe (2023). Mapping from the data types to the oligonucleotides. [Dataset]. http://doi.org/10.1371/journal.pone.0077090.t001
    Dataset provided by
    PLOS ONE
    Authors
    Heng Sun; Jian Weng; Guangchuang Yu; Richard H. Massawe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mapping from the data types to the oligonucleotides.

  13. Grammar transformations of topographic feature type annotations of the U.S....

    • data.usgs.gov
    • datasets.ai
    Updated Jul 11, 2024
    Cite
    Emily Abbott (2024). Grammar transformations of topographic feature type annotations of the U.S. to structured graph data. [Dataset]. http://doi.org/10.5066/P1BDPXKZ
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Emily Abbott
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    1994 - 1999
    Area covered
    United States
    Description

    These data were used to examine grammatical structures and patterns within a set of geospatial glossary definitions. Objectives of our study were to analyze the semantic structure of input definitions, use this information to build triple structures of RDF graph data, upload our lexicon to a knowledge graph software, and perform SPARQL queries on the data. Upon completion of this study, SPARQL queries were proven to effectively convey graph triples which displayed semantic significance. These data represent and characterize the lexicon of our input text which are used to form graph triples. These data were collected in 2024 by passing text through multiple Python programs utilizing spaCy (a natural language processing library) and its pre-trained English transformer pipeline. Before data was processed by the Python programs, input definitions were first rewritten as natural language and formatted as tabular data. Passages were then tokenized and characterized by their part-of-spee ...
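
    The triple-building and querying steps can be pictured with a few lines of plain Python. This is only a sketch of subject-predicate-object triples and a wildcard pattern match in the spirit of a SPARQL basic graph pattern; the example terms and predicate names are invented, and the study itself used spaCy and a knowledge graph tool rather than this code.

```python
# Hypothetical triples of the form (subject, predicate, object).
triples = {
    ("arroyo", "isA", "watercourse"),
    ("watercourse", "isA", "feature"),
    ("arroyo", "foundIn", "arid region"),
}


def match(pattern, triples):
    """Return triples matching (s, p, o); None in the pattern acts as a wildcard."""
    return sorted(
        t for t in triples
        if all(q is None or q == v for q, v in zip(pattern, t))
    )


# Everything stated about "arroyo", similar to: SELECT ?p ?o WHERE { :arroyo ?p ?o }
about_arroyo = match(("arroyo", None, None), triples)
print(about_arroyo)
```

    A real triple store adds namespaces, typed literals, and joins across several patterns; the sketch only shows the single-pattern case.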

  14. Data from: Entity Typing Datasets

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Mar 2, 2023
    Cite
    Russa Biswas (2023). Entity Typing Datasets [Dataset]. http://doi.org/10.5281/zenodo.7688590
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Russa Biswas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are the datasets used in the Entity Type Prediction task for Knowledge Graph Completion.

    • DB630k_Fine-grained_Hierarchical.zip dataset has been used in the papers [1] and [2]. It is an extended version of DBpedia630k dataset originally created for Text classification and is available here.
    • FIGER.zip dataset has also been used in the papers [1] and [2].
    • MultilingualETdata.zip dataset has been used in the paper [3]
    • NamesETdata.zip dataset has been used in the paper [4]. The CaLiGraph test dataset can also be downloaded here.

    [1] Biswas R, Sofronova R, Sack H, Alam M. Cat2type: Wikipedia category embeddings for entity typing in knowledge graphs. In Proceedings of the 11th on Knowledge Capture Conference 2021 Dec 2 (pp. 81-88).

    [2] Biswas R, Portisch J, Paulheim H, Sack H, Alam M. Entity type prediction leveraging graph walks and entity descriptions. In The Semantic Web–ISWC 2022: 21st International Semantic Web Conference, Virtual Event, October 23–27, 2022, Proceedings 2022 Oct 16 (pp. 392-410). Cham: Springer International Publishing.

    [3] Biswas R, Chen Y, Paulheim H, Sack H, Alam M. It’s All in the Name: Entity Typing Using Multilingual Language Models. In The Semantic Web: ESWC 2022 Satellite Events: Hersonissos, Crete, Greece, May 29–June 2, 2022, Proceedings 2022 Jul 20 (pp. 36-41). Cham: Springer International Publishing.

    [4] Biswas R, Sofronova R, Alam M, Heist N, Paulheim H, Sack H. Do judge an entity by its name! entity typing using language models. In The Semantic Web: ESWC 2021 Satellite Events: Virtual Event, June 6–10, 2021, Revised Selected Papers 18 2021 (pp. 65-70). Springer International Publishing.

  15. Knowledge Graph Technology Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 2, 2025
    Cite
    Market Report Analytics (2025). Knowledge Graph Technology Report [Dataset]. https://www.marketreportanalytics.com/reports/knowledge-graph-technology-53638
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Variables measured
    Market Size
    Description

    The Knowledge Graph Technology market is experiencing robust growth, driven by the increasing need for enhanced data interoperability, improved data analysis capabilities, and the rising adoption of artificial intelligence (AI) and machine learning (ML) across various industries. The market's expansion is fueled by the advantages of knowledge graphs in improving decision-making processes, streamlining operations, and fostering innovation. Specific applications, such as semantic search, personalized recommendations, and fraud detection, are witnessing significant traction. While precise market size figures are unavailable, a conservative estimate places the 2025 market value at $5 billion, with a Compound Annual Growth Rate (CAGR) of 25% projected through 2033. This growth trajectory is supported by the escalating demand for efficient data management solutions in sectors like healthcare, finance, and retail, where knowledge graphs can significantly enhance operational efficiency and strategic decision-making. Technological advancements, particularly in graph database technologies and semantic web technologies, further bolster market expansion. However, the market faces challenges such as the complexity of knowledge graph implementation, the need for specialized expertise, and data integration issues across disparate sources. Despite these challenges, the long-term outlook for knowledge graph technology remains positive, driven by continuous technological innovations and the growing recognition of its transformative potential across diverse sectors. The segmentation of the Knowledge Graph Technology market reveals significant opportunities within various application areas and technology types. Application-wise, semantic search and recommendation engines are currently leading the market, while emerging applications in areas such as risk management and supply chain optimization are poised for rapid growth in the coming years. 
In terms of technology types, ontology engineering and graph databases are experiencing high demand. Regionally, North America and Europe currently dominate the market due to early adoption and established technological infrastructure. However, the Asia-Pacific region is projected to witness significant growth, spurred by increasing digitalization and investments in AI and ML initiatives. Competitive landscape analysis reveals a mix of established technology providers and emerging startups, creating a dynamic and competitive ecosystem. The continuous evolution of technologies and the expansion into new applications will continue to shape the market's growth and trajectory over the forecast period.
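
    The growth figures quoted above can be turned into a rough projection. The sketch below is illustrative only: it assumes annual compounding over the eight years from 2025 to 2033, which the description does not state explicitly, and the function name project_value is hypothetical. The input values ($5 billion, 25% CAGR) are the description's own estimates.

```python
def project_value(start_value, cagr, years):
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


# Figures from the description: $5 billion in 2025, 25% CAGR through 2033.
# Assumption: growth compounds once per year for 2033 - 2025 = 8 years.
estimate_2033 = project_value(5.0, 0.25, 2033 - 2025)
print(f"Implied 2033 market size: ${estimate_2033:.1f} billion")
```

    Under these assumptions the implied 2033 value is roughly $29.8 billion; this is arithmetic on the quoted estimate, not a figure taken from the report.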

  16. Summary of the best performing measures for different applications.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Gaston K. Mazandu; Nicola J. Mulder (2023). Summary of the best performing measures for different applications. [Dataset]. http://doi.org/10.1371/journal.pone.0113859.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Gaston K. Mazandu; Nicola J. Mulder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of the best performing functional similarity measures, term specificity and semantic similarity approaches for different biological data, including Enzyme Commission (EC), Pfam domain, Sequence Similarity (Seq. Sim.), Protein-Protein Interaction (PPI) and Co-expression Network (CN) or Gene Expression (microarray) data.

  17. NPClassifier: A Deep Neural Network-Based Structural Classification Tool for...

    • figshare.com
    xlsx
    Updated Jun 9, 2023
    Cite
    Hyun Woo Kim; Mingxun Wang; Christopher A. Leber; Louis-Félix Nothias; Raphael Reher; Kyo Bin Kang; Justin J. J. van der Hooft; Pieter C. Dorrestein; William H. Gerwick; Garrison W. Cottrell (2023). NPClassifier: A Deep Neural Network-Based Structural Classification Tool for Natural Products [Dataset]. http://doi.org/10.1021/acs.jnatprod.1c00399.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    ACS Publications
    Authors
    Hyun Woo Kim; Mingxun Wang; Christopher A. Leber; Louis-Félix Nothias; Raphael Reher; Kyo Bin Kang; Justin J. J. van der Hooft; Pieter C. Dorrestein; William H. Gerwick; Garrison W. Cottrell
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Computational approaches such as genome and metabolome mining are becoming essential to natural products (NPs) research. Consequently, a need exists for an automated structure-type classification system to handle the massive amounts of data appearing for NP structures. An ideal semantic ontology for the classification of NPs should go beyond the simple presence/absence of chemical substructures, but also include the taxonomy of the producing organism, the nature of the biosynthetic pathway, and/or their biological properties. Thus, a holistic and automatic NP classification framework could have considerable value to comprehensively navigate the relatedness of NPs, and especially so when analyzing large numbers of NPs. Here, we introduce NPClassifier, a deep-learning tool for the automated structural classification of NPs from their counted Morgan fingerprints. NPClassifier is expected to accelerate and enhance NP discovery by linking NP structures to their underlying properties.

  18. Data from: Orthographic-semantic consistency effects in lexical decision:...

    • data.mendeley.com
    Updated Apr 14, 2025
    Cite
    Yasushi Hino (2025). Orthographic-semantic consistency effects in lexical decision: What types of neighbors are responsible for the effects? [Dataset]. http://doi.org/10.17632/m3hryjj7h5.5
    Explore at:
    Dataset updated
    Apr 14, 2025
    Authors
    Yasushi Hino
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data and analysis code to reproduce the results reported in "Orthographic-Semantic Consistency Effects in Lexical Decision: What types of Neighbors are Responsible for the Effects?" authored by Yasushi Hino, Debra Jared and Steve Lupker. In Data Analyses 1, 2 and 3, lexical decision latency and accuracy data from the English Lexicon Project (Balota, Yap, Cortese, Hutchison, Kessler, Loftus, Neely, Nelson, Simpson & Treiman, 2007) are analyzed to examine whether an orthographic-semantic consistency effect is observed in lexical decision data. In the Experiment, on the other hand, behavioral data are collected in an online lexical decision experiment to examine whether the orthographic-semantic consistency effect is observed when consistency is manipulated based on either addition neighbors or substitution neighbors. In addition, we provide analysis code to reproduce the results for Table 16 in the paper as well as the results of the Tables and Figures reported in the Supplementary Materials.

  19. Resources of IncRML: Incremental Knowledge Graph Construction from...

    • zenodo.org
    bin, text/x-python +1
    Updated Mar 18, 2024
    Cite
    Dylan Van Assche; Julian Andres Rojas Melendez; Ben De Meester; Pieter Colpaert (2024). Resources of IncRML: Incremental Knowledge Graph Construction from Heterogeneous Data Sources [Dataset]. http://doi.org/10.5281/zenodo.10171157
    Explore at:
    Available download formats: xz, text/x-python, bin
    Dataset updated
    Mar 18, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dylan Van Assche; Julian Andres Rojas Melendez; Ben De Meester; Pieter Colpaert
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 8, 2023
    Description

    IncRML resources

    This Zenodo dataset contains all the resources of the paper 'IncRML: Incremental Knowledge Graph Construction from Heterogeneous Data Sources' submitted to the Semantic Web Journal's Special Issue on Knowledge Graph Construction. This resource aims to make the paper's experiments fully reproducible through our experiment tool written in Python, which was previously used in the Knowledge Graph Construction Challenge by the ESWC 2023 Workshop on Knowledge Graph Construction. The exact Java JAR file of the RMLMapper (rmlmapper.jar) that was used to execute the experiments is also provided in this dataset. This JAR file was executed with Java OpenJDK 11.0.20.1 on Ubuntu 22.04.1 LTS (Linux 5.15.0-53-generic). Each experiment was executed 5 times, and the median values are reported together with the standard deviation of the measurements.

    Datasets

    We provide both dataset dumps of the GTFS-Madrid-Benchmark and of real-life use cases from Open Data in Belgium.
    GTFS-Madrid-Benchmark dumps are used to analyze the impact on execution time and resources, while the real-life use cases aim to verify the approach on different types of datasets since the GTFS-Madrid-Benchmark is a single type of dataset which does not advertise changes at all.

    Benchmarks

    • GTFS-Madrid-Benchmark: change types with fixed data size and amount of changes: additions-only, modifications-only, deletions-only (11 versions)
    • GTFS-Madrid-Benchmark: amount of changes with fixed data size: 0%, 25%, 50%, 75%, and 100% changes (11 versions)
    • GTFS-Madrid-Benchmark: data size with fixed amount of changes: scales 1, 10, 100 (11 versions)

    Real-life use cases

    • Traffic control center Vlaams Verkeerscentrum (Belgium): traffic board messages data (1 day, 28760 versions)
    • Meteorological institute KMI (Belgium): weather sensor data (1 day, 144 versions)
    • Public transport agency NMBS (Belgium): train schedule data (1 week, 7 versions)
    • Public transport agency De Lijn (Belgium): bus schedule data (1 week, 7 versions)
    • Bike-sharing company BlueBike (Belgium): bike-sharing availability data (1 day, 1440 versions)
    • Bike-sharing company JCDecaux (EU): bike-sharing availability data (1 day, 1440 versions)
    • OpenStreetMap (World): geographical map data (1 day, 1440 versions)

    Remarks

    1. The first version of each dataset is always used as a baseline. All subsequent versions are applied as updates to the existing version. The reported results cover only the updates, since these constitute the actual incremental generation.
    2. GTFS-Change-50_percent-{ALL, CHANGE}.tar.xz datasets are not uploaded separately because they are identical to GTFS-Madrid-Benchmark scale 100: both share the same parameters (50% changes, scale 100). Please use GTFS-Scale-100-{ALL, CHANGE}.tar.xz in place of GTFS-Change-50_percent-{ALL, CHANGE}.tar.xz
    3. All datasets are compressed with XZ and provided as TAR archives. Be aware that you need sufficient space to decompress these archives: 2 TB of free space is advised to decompress all benchmarks and use cases. The expected output is provided as a ZIP file in each TAR archive; decompressing these requires even more space (4 TB).

    Reproducing

    Using our experiment tool, you can reproduce the experiments as follows:

    1. Download one of the TAR.XZ archives and unpack it.
    2. Clone the GitHub repository of our experiment tool and install the Python dependencies with 'pip install -r requirements.txt'.
    3. Download the rmlmapper.jar JAR file from this Zenodo dataset and place it inside the experiment tool root folder.
    4. Execute the tool by running: './exectool --root=/path/to/the/root/of/the/tarxz/archive --runs=5 run'. The argument '--runs=5' is used to perform the experiment 5 times.
    5. Once executed, you can generate the statistics by running: './exectool --root=/path/to/the/root/of/the/tarxz/archive stats'.
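
    Steps 4 and 5 above can be scripted. The following is a minimal sketch, not part of the published experiment tool: the helper name build_commands is hypothetical, and only the './exectool' invocations quoted in the steps ('--root', '--runs', 'run', 'stats') are taken from the description. The sketch only assembles and prints the command lines; it does not execute them.

```python
import shlex


def build_commands(root, runs=5, tool="./exectool"):
    """Return the two command lines from steps 4 and 5 as argument lists.

    root: path to the root of the unpacked TAR.XZ archive.
    runs: number of repetitions (the description uses 5).
    """
    root_arg = f"--root={root}"
    return [
        [tool, root_arg, f"--runs={runs}", "run"],  # step 4: run the experiments
        [tool, root_arg, "stats"],                  # step 5: generate statistics
    ]


# Print the commands for inspection; the path below is the placeholder
# used in the description, not a real location.
for cmd in build_commands("/path/to/the/root/of/the/tarxz/archive"):
    print(shlex.join(cmd))
```

    To actually run the commands, each argument list could be passed to subprocess.run(cmd, check=True) from inside the experiment tool's root folder, assuming rmlmapper.jar has been placed there as described in step 3.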

    Testcases

    Test cases to verify the integration of RML and LDES with IncRML are available at https://doi.org/10.5281/zenodo.10171394

  20. Global Healthcare Data Interoperability Market By Type (Solutions,...

    • zionmarketresearch.com
    pdf
    Updated May 18, 2025
    Cite
    Zion Market Research (2025). Global Healthcare Data Interoperability Market By Type (Solutions, Services), By Level (Foundational, Structural, and Semantic), By Deployment (Cloud-based, On-premise), By Application (Diagnosis, Treatment, Others), By Model (Centralized, Hybrid, Decentralized), By End-users (Ambulatory Surgical Centers, Hospitals): Global Industry Perspective, Comprehensive Analysis and Forecast, 2020 - 2026 [Dataset]. https://www.zionmarketresearch.com/report/healthcare-data-interoperability-market
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 18, 2025
    Dataset authored and provided by
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The global healthcare interoperability solutions market was valued at USD 2.5 billion in 2019, and is expected to reach USD 4.9 billion by 2026, at a CAGR of 11.2%.
