3 datasets found
  1. Templates Recommendation in the Open Research Knowledge Graph

    • data.uni-hannover.de
    json
    Updated Jun 3, 2022
    Cite
    TIB (2022). Templates Recommendation in the Open Research Knowledge Graph [Dataset]. https://data.uni-hannover.de/ne/dataset/5a34c82b-fdb3-4058-a98e-004f14e19bd1
    Explore at:
    Available download formats: json (3666489), json (896976), json (414787), json (351327), json (4081193)
    Dataset updated
    Jun 3, 2022
    Dataset authored and provided by
    TIB
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    This dataset was created to implement a content-based recommender system for the Open Research Knowledge Graph (ORKG). The recommender system accepts a research paper's title and abstract as input and recommends existing ORKG templates that are semantically relevant to the given paper.

    Two approaches were trained on this dataset as part of a master's thesis (https://doi.org/10.15488/11834): a Natural Language Inference (NLI) approach based on SciBERT embeddings and an unsupervised approach based on ElasticSearch.

    This publication therefore consists of one general dataset, two training sets (one for each approach), a validation set for the supervised approach, and a test set for both approaches.

  2. Templates Recommendation in the Open Research Knowledge Graph - Vdataset -...

    • service.tib.eu
    Updated Jun 3, 2022
    Cite
    (2022). Templates Recommendation in the Open Research Knowledge Graph - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/luh-templates-recommendation-in-the-open-research-knowledge-graph
    Explore at:
    Dataset updated
    Jun 3, 2022
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    This dataset was created to implement a content-based recommender system for the Open Research Knowledge Graph (ORKG). The recommender system accepts a research paper's title and abstract as input and recommends existing ORKG templates that are semantically relevant to the given paper. Two approaches were trained on this dataset as part of a master's thesis (https://doi.org/10.15488/11834): a Natural Language Inference (NLI) approach based on SciBERT embeddings and an unsupervised approach based on ElasticSearch. This publication therefore consists of one general dataset, two training sets (one for each approach), a validation set for the supervised approach, and a test set for both approaches.

  3. Dataset - Templates Recommendation in the Open Research Knowledge Graph

    • zenodo.org
    json
    Updated Jun 3, 2022
    Cite
    Omar Arab Oghli (2022). Dataset - Templates Recommendation in the Open Research Knowledge Graph [Dataset]. http://doi.org/10.5281/zenodo.6607165
    Explore at:
    Available download formats: json
    Dataset updated
    Jun 3, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Omar Arab Oghli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was created to implement a content-based recommender system for the Open Research Knowledge Graph (ORKG). The recommender system accepts a research paper's title and abstract as input and recommends existing ORKG templates that are semantically relevant to the given paper.

    Two approaches were trained on this dataset as part of a master's thesis (https://doi.org/10.15488/11834): a Natural Language Inference (NLI) approach based on SciBERT embeddings and an unsupervised approach based on ElasticSearch.

    This publication therefore consists of one general dataset, two training sets (one for each approach), a validation set for the supervised approach, and a test set for both approaches.

    dataset.json

    The main JSON object consists of a list of templates and a list of neutral papers.

    Each template object has an ID, a label, a list of research fields, a list of properties, and a list of papers that use the template. Each paper object has an ID, a label, a DOI, a research field, and an abstract.

    Each neutral paper object has the same schema as a paper object listed under a template.

    See an example instance below.

    {
      "templates": [
        {
          "id": "R138668",
          "label": "Psychiatric Disorders AI Overview",
          "research_fields": [
            {
              "id": "http://orkg.org/orkg/resource/R133",
              "label": "Artificial Intelligence"
            }
            ...
          ],
          "properties": [
            "Study cohort",
            ...
          ],
          "papers": [
            {
              "id": "R138698",
              "label": "Application of Autoencoder in Depression Diagnosis",
              "doi": "10.12783/dtcse/csma2017/17335",
              "research_field": {
                "id": "R104",
                "label": "Bioinformatics"
              },
              "abstract": "Major depressive disorder (MDD) is a mental disorder characterized by at least two weeks of low mood which is present across most situations. Diagnosis of MDD using rest-state functional magnetic resonance imaging (fMRI) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability. No method can automatically extract discriminative features from the origin time series in fMRI images for MDD diagnosis. In this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge. An autoencoder was used to learn pre-training parameters of a dimensionality reduction process using 3-D convolution network. Through comparison with the other three feature extraction methods, our method achieved the best classification performance. This method can be used not only in MDD diagnosis, but also other similar disorders."
            },
            ...
          ]
        },
        ...
      ],
      "neutral_papers": [
        {
          "id": "R109377",
          "label": "Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants",
          "doi": "10.1016/j.jpha.2020.03.009",
          "research_field": {
            "id": "R104",
            "label": "Bioinformatics"
          },
          "abstract": "Abstract The recent outbreak of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 in December 2019 raised global health concerns. The viral 3-chymotrypsin-like cysteine protease (3CLpro) enzyme controls coronavirus replication and is essential for its life cycle. 3CLpro is a proven drug discovery target in the case of severe acute respiratory syndrome coronavirus (SARS-CoV) and middle east respiratory syndrome coronavirus (MERS-CoV). Recent studies revealed that the genome sequence of SARS-CoV-2 is very similar to that of SARS-CoV. Therefore, herein, we analysed the 3CLpro sequence, constructed its 3D homology model, and screened it against a medicinal plant library containing 32,297 potential anti-viral phytochemicals/traditional Chinese medicinal compounds. Our analyses revealed that the top nine hits might serve as potential anti- SARS-CoV-2 lead molecules for further optimisation and drug development process to combat COVID-19."
        },
        ...
      ]
    }
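
A minimal Python sketch of how this structure could be traversed, assuming only the keys shown in the example above. The helper name `index_papers` and the shortened sample are illustrative and not part of the dataset; the abstracts are replaced by a placeholder.

```python
import json

# Shortened stand-in for dataset.json that uses the same keys as the example
# above. The abstracts are placeholders, not the real text.
SAMPLE = """
{
  "templates": [
    {
      "id": "R138668",
      "label": "Psychiatric Disorders AI Overview",
      "research_fields": [
        {"id": "http://orkg.org/orkg/resource/R133", "label": "Artificial Intelligence"}
      ],
      "properties": ["Study cohort"],
      "papers": [
        {
          "id": "R138698",
          "label": "Application of Autoencoder in Depression Diagnosis",
          "doi": "10.12783/dtcse/csma2017/17335",
          "research_field": {"id": "R104", "label": "Bioinformatics"},
          "abstract": "(abstract omitted)"
        }
      ]
    }
  ],
  "neutral_papers": [
    {
      "id": "R109377",
      "label": "Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants",
      "doi": "10.1016/j.jpha.2020.03.009",
      "research_field": {"id": "R104", "label": "Bioinformatics"},
      "abstract": "(abstract omitted)"
    }
  ]
}
"""


def index_papers(dataset):
    """Map each paper ID to the IDs of the templates that use it.

    Neutral papers are mapped to an empty list.
    """
    index = {}
    for template in dataset.get("templates", []):
        for paper in template.get("papers", []):
            index.setdefault(paper["id"], []).append(template["id"])
    for paper in dataset.get("neutral_papers", []):
        index.setdefault(paper["id"], [])
    return index


dataset = json.loads(SAMPLE)
paper_index = index_papers(dataset)
print(paper_index)
```

For the real file, the same function could be applied to `json.load(open("dataset.json", encoding="utf-8"))`, assuming the file has been downloaded to the working directory.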

    All other files

    The main JSON object consists of a list of entailments, a list of contradictions and a list of neutrals.

    Every object in these lists has the same schema: an instance_id created by concatenating the template_id (when it exists) with the paper_id, a template_id, a paper_id, a premise (in the example below, the lowercased template label followed by its properties), a hypothesis (in the example below, the lowercased paper title followed by its abstract), their concatenation as a sequence, and the target class.

    See an example instance below.

    {
      "entailments": [
        {
          "instance_id": "R138668xR138698",
          "template_id": "R138668",
          "paper_id": "R138698",
          "premise": "psychiatric disorders ai overview study cohort outcome assessment aims performance findings used models data",
          "hypothesis": "application of autoencoder in depression diagnosis major depressive disorder (mdd) is a mental disorder characterized by at least two weeks of low mood which is present across most situations diagnosis of mdd using rest state functional magnetic resonance imaging (fmri) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability no method can automatically extract discriminative features from the origin time series in fmri images for mdd diagnosis in this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge an autoencoder was used to learn pre training parameters of a dimensionality reduction process using 3 d convolution network through comparison with the other three feature extraction methods, our method achieved the best classification performance this method can be used not only in mdd diagnosis, but also other similar disorders",
          "sequence": "[CLS] psychiatric disorders ai overview study cohort outcome assessment aims performance findings used models data [SEP] application of autoencoder in depression diagnosis major depressive disorder (mdd) is a mental disorder characterized by at least two weeks of low mood which is present across most situations diagnosis of mdd using rest state functional magnetic resonance imaging (fmri) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability no method can automatically extract discriminative features from the origin time series in fmri images for mdd diagnosis in this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge an autoencoder was used to learn pre training parameters of a dimensionality reduction process using 3 d convolution network through comparison with the other three feature extraction methods, our method achieved the best classification performance this method can be used not only in mdd diagnosis, but also other similar disorders [SEP]",
          "target": "entailment"
        },
       ...
       ],
      "contradictions": [ ... ],
      "neutrals": [ ... ]
    } 
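
The way the fields of one instance relate to each other can be sketched in Python as follows. This is an illustration inferred from the example above, not the author's preprocessing code: the function name `build_instance` is hypothetical, the `x` separator and the `[CLS] ... [SEP] ... [SEP]` layout are copied from the example, and the handling of a missing template_id is an assumption. Text normalization (lowercasing, punctuation handling) is deliberately left out because the source does not specify it.

```python
def build_instance(template_id, paper_id, premise, hypothesis, target):
    """Assemble one NLI instance in the shape shown in the example above."""
    if template_id:
        # Matches the example: "R138668" + "x" + "R138698".
        instance_id = f"{template_id}x{paper_id}"
    else:
        # Assumption: without a template, the paper ID alone is used.
        instance_id = paper_id
    return {
        "instance_id": instance_id,
        "template_id": template_id,
        "paper_id": paper_id,
        "premise": premise,
        "hypothesis": hypothesis,
        # BERT-style sentence-pair layout, as in the example's "sequence" field.
        "sequence": f"[CLS] {premise} [SEP] {hypothesis} [SEP]",
        "target": target,
    }


instance = build_instance(
    template_id="R138668",
    paper_id="R138698",
    premise="psychiatric disorders ai overview study cohort",
    hypothesis="application of autoencoder in depression diagnosis",
    target="entailment",
)
print(instance["instance_id"])
print(instance["sequence"])
```

The premise and hypothesis passed in here are shortened versions of the example strings; in the published files they are already preprocessed.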

    Statistics

    Class          Training (supervised)  Validation (supervised)  Training (unsupervised)  Test
    Entailment                       180                       20                      200    52
    Neutral                          180                       20                      200    64
    Contradiction                    736                       84                        0     0
    Total                           1096                      124                      400   116
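
The statistics above can be checked with a few lines of Python. The numbers are copied from the statistics; the dictionary layout and the helper names `totals_consistent` and `class_shares` are illustrative only. The check also makes the class imbalance visible: contradictions account for 736 of the 1096 supervised training instances.

```python
# Instance counts per split, copied from the statistics above.
STATS = {
    "Training (supervised)": {"Entailment": 180, "Neutral": 180, "Contradiction": 736, "Total": 1096},
    "Validation (supervised)": {"Entailment": 20, "Neutral": 20, "Contradiction": 84, "Total": 124},
    "Training (unsupervised)": {"Entailment": 200, "Neutral": 200, "Contradiction": 0, "Total": 400},
    "Test": {"Entailment": 52, "Neutral": 64, "Contradiction": 0, "Total": 116},
}


def totals_consistent(stats):
    """Return, per split, whether the class counts add up to the stated total."""
    return {
        split: sum(n for cls, n in counts.items() if cls != "Total") == counts["Total"]
        for split, counts in stats.items()
    }


def class_shares(counts):
    """Return each class's fraction of the split total."""
    total = counts["Total"]
    return {cls: n / total for cls, n in counts.items() if cls != "Total"}


checks = totals_consistent(STATS)
shares = class_shares(STATS["Training (supervised)"])
print(checks)
print(shares)
```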
