100+ datasets found
  1. Data from: Slovenian datasets for contextual synonym and antonym detection

    • live.european-language-grid.eu
    binary format
    Updated Oct 25, 2022
    Cite
    (2022). Slovenian datasets for contextual synonym and antonym detection [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/20526
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    Slovenia
    Description

    Slovenian datasets for contextual synonym and antonym detection can be used for training machine learning classifiers as described in the MSc thesis of Jasmina Pegan "Semantic detection of synonyms and antonyms with contextual embeddings" (https://repozitorij.uni-lj.si/IzpisGradiva.php?id=141456). Datasets contain example pairs of synonyms and antonyms in contexts together with additional information on a sense pair. Candidates for synonyms and antonyms were retrieved from the dataset created in the BSc thesis of Jasmina Pegan "Antonym detection with word embeddings" (https://repozitorij.uni-lj.si/IzpisGradiva.php?id=110533). Example sentences were retrieved from The comprehensive Slovenian-Hungarian dictionary (VSMS) (https://www.clarin.si/repository/xmlui/handle/11356/1453). Each dataset is class balanced and contains an equal number of examples and counterexamples. An example is a pair of example sentences where the two words are synonyms/antonyms. A counterexample is a pair of example sentences where the two words are not synonyms/antonyms. Note that the word pair in a counterexample can still be synonymous or antonymous in some other sense of the two words (but not in the given context).

    Datasets are divided into two categories, datasets for synonyms and datasets for antonyms. Each category is further divided into base and updated datasets. These contain three dataset files: train, validation and test dataset. Base datasets include only manually-reviewed sense pairs. These are generated from all pairs of VSMS sense examples for all confirmed pairs of antonym and synonym senses. Updated datasets include automatically generated sense pairs while constraining the maximal number of examples per word. In this way, the dataset is more balanced word-wise, but is not fully manually-reviewed and contains less accurate data.

    A single dataset entry contains the information on the base word, followed by data on synonym/antonym candidate. The last column discerns whether the sense pair is a pair of synonyms/antonyms or not. More details on this can be found inside the included README file.
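
    The class balance described above can be checked by counting the values in the last column. This is only a minimal sketch: it assumes tab-separated rows and a "1"/"0" label, neither of which is confirmed by the description (the README in the download is the authority), and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def label_counts(tsv_text):
    """Count the values found in the last column of tab-separated rows."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[-1] for row in reader if row)


# Hypothetical rows: base word, candidate, label (1 = pair, 0 = not a pair).
sample = "hiter\tbrz\t1\nhiter\tpočasen\t0\n"

counts = label_counts(sample)
# A class-balanced file has the same count for every label value.
is_balanced = len(set(counts.values())) == 1
```

    For a real file, the text passed to label_counts would come from reading the train, validation or test file after checking its actual delimiter and column order.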

  2. Data from: Synonym, new species and checklist of the genus Fissocantharis...

    • gbif.org
    Updated Nov 28, 2024
    Cite
    Yu-Xia Yang; Yûichi Okushima; Xing-Ke Yang (2024). Synonym, new species and checklist of the genus Fissocantharis Pic from Taiwan (Coleoptera, Cantharidae) [Dataset]. http://doi.org/10.5281/zenodo.213031
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    Plazi
    Authors
    Yu-Xia Yang; Yûichi Okushima; Xing-Ke Yang
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Taiwan
    Description

    This dataset contains the digitized treatments in Plazi based on the original journal article Yang, Yu-Xia, Okushima, Yûichi, Yang, Xing-Ke (2012): Synonym, new species and checklist of the genus Fissocantharis Pic from Taiwan (Coleoptera, Cantharidae). Zootaxa 3262 (1): 46-53, DOI: 10.11646/zootaxa.3262.1.4, URL: https://biotaxa.org/Zootaxa/article/view/zootaxa.3262.1.4

  3. Noun Compound Synonym Substitution in Books – NCSSB datasets

    • orda.shef.ac.uk
    txt
    Updated Feb 26, 2024
    Cite
    Thomas Pickard; Aline Villavicencio; Agne Knietaite; Adam Allsebrook; Anton Minkov; Adam Tomaszewski; Norbert Slinko; Richard Johnson (2024). Noun Compound Synonym Substitution in Books – NCSSB datasets [Dataset]. http://doi.org/10.15131/shef.data.25259722.v1
    Dataset provided by
    The University of Sheffield
    Authors
    Thomas Pickard; Aline Villavicencio; Agne Knietaite; Adam Allsebrook; Anton Minkov; Adam Tomaszewski; Norbert Slinko; Richard Johnson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Noun Compound Synonym Substitution in Books (NCSSB) datasets contain in-context instances of potentially idiomatic English noun compounds, obtained by substituting idioms for synonyms occurring in public domain books forming part of the Project Gutenberg corpus.

  4. Data from: Three new synonyms of the genus Kamimuria (Plecoptera, Perlidae)

    • gbif.org
    Updated May 31, 2025
    Cite
    Liang-Liang Zeng (2025). Three new synonyms of the genus Kamimuria (Plecoptera, Perlidae) [Dataset]. http://doi.org/10.3897/bdj.13.e153697
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    Biodiversity Data Journal
    Authors
    Liang-Liang Zeng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Currently, 11 species of Kamimuria have been reported in Guizhou Province, China. However, the original illustrations of Kamimuria magnimacula Du, 2005 and K. extremispina Du, 2006, lack the necessary detail to accurately assess the spine patterns on the endophallus, which is a key diagnostic feature. To resolve this issue, a re-examination of the type materials, complemented by high-resolution colour photographs, is crucial to ensure precise identification and reliable documentation of these species. Based on a detailed examination of the type materials of Kamimuria magnimacula Du, 2005 and K. extremispina Du, 2006, we propose that K. hunanensis Li & Li, 2022 be considered a synonym of K. magnimacula, K. circumspina Li, Mo & Yang, 2019 and K. dabieshana Yan, Kong & Li, 2021 be regarded as synonyms of K. extremispina. Additionally, we have provided holotype photographs of K. magnimacula and K. extremispina, along with a distribution map for both species in this paper.

  5. Data from: Two new synonyms for Chaerophyllum bulbosum based on...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1 more
    zip
    Updated May 2, 2022
    Cite
    ÖZLEM ÇETİN; Mustafa Çelik (2022). Two new synonyms for Chaerophyllum bulbosum based on morphological, anatomical and molecular data [Dataset]. http://doi.org/10.5061/dryad.hqbzkh1jf
    Dataset provided by
    Selçuk University
    Authors
    ÖZLEM ÇETİN; Mustafa Çelik
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The aim of the present study was to determine support for reducing Chaerophyllum karsianum and C. posofianum, both locally endemic to northeast Anatolia, to synonyms of C. bulbosum. Chaerophyllum karsianum is closely related to C. bulbosum but distinguished from it by its pink petals, ciliate bracteoles, entire leaf segments, and 12–16 rays according to the protologue and Flora of Turkey and the East Aegean Islands. Chaerophyllum posofianum is also closely related to C. bulbosum but is distinguished from it by its entire leaf segments, ciliate bracteoles, and purple anthers. Flower color of C. bulbosum ranges from white to purple within the same populations or even within the same individuals. The bracteole margin ranges from entire to ciliate in C. bulbosum. Our field observations and examination of herbarium specimens showed that morphological characteristics overlap in all of the examined samples. We also investigated and compared the anatomical and micromorphological characteristics of C. bulbosum, C. karsianum, and C. posofianum fruit. The nucleotide sequence data reported in the present study showed that the internal transcribed spacer sequences of C. karsianum and C. posofianum were identical to that of C. bulbosum. Our results strongly support the conclusion that C. karsianum and C. posofianum are conspecific with C. bulbosum.

  6. Bad Synonyms: bad synonyms

    • zenodo.org
    bin
    Updated Aug 6, 2024
    Cite
    script (2024). Bad Synonyms: bad synonyms [Dataset]. http://doi.org/10.5281/zenodo.13239699
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    script
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Dec 9, 2019
    Description

    taxonIDs of synonyms that should be removed from DH 1.1

  7. Unfiltered Depositor-Provided Chemical Synonyms for Substance Records in...

    • zenodo.org
    application/gzip
    Updated May 28, 2025
    Cite
    Sunghwan Kim; Bo Yu; Qingliang Li; Evan E. Bolton (2025). Unfiltered Depositor-Provided Chemical Synonyms for Substance Records in PubChem [Dataset]. http://doi.org/10.5281/zenodo.11194943
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sunghwan Kim; Bo Yu; Qingliang Li; Evan E. Bolton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This gzipped text file contains a list of all (live) substance records in PubChem with their "unfiltered" depositor-provided chemical synonyms, downloaded from PubChem in June 2017. Each line has a Substance ID (SID) and its chemical synonym, separated by a tab. The SID-synonym pairs in this file were used in the paper “PubChem Synonym Filtering Process Using Crowdsourcing” by Sunghwan Kim et al., published in the Journal of Cheminformatics (https://doi.org/10.1186/s13321-024-00868-3). The up-to-date version of this file can be downloaded from the PubChem FTP Site (https://ftp.ncbi.nlm.nih.gov/pubchem/Substance/Extras/).
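
    The line format described above (an SID and one synonym separated by a tab, inside a gzipped text file) could be read roughly as follows. This is a sketch, not the authors' tooling: the function name read_sid_synonyms is hypothetical, UTF-8 encoding is an assumption, and the demo data is made up rather than taken from PubChem.

```python
import gzip
import io
from collections import defaultdict


def read_sid_synonyms(fileobj):
    """Group synonyms by SID from a gzip stream of 'SID<TAB>synonym' lines."""
    synonyms = defaultdict(list)
    # gzip.open accepts a path or an open binary file object; "rt" decodes text.
    with gzip.open(fileobj, mode="rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the first tab only, in case a synonym contains tabs.
            sid, _, name = line.partition("\t")
            synonyms[int(sid)].append(name)
    return synonyms


# Tiny in-memory stand-in for the real file (values are invented).
demo = io.BytesIO(
    gzip.compress(b"1\taspirin\n1\tacetylsalicylic acid\n2\tcaffeine\n")
)
by_sid = read_sid_synonyms(demo)
```

    For the full file, holding every synonym in memory may be impractical; processing the lines one at a time inside the loop is the safer pattern.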

  8. Data from: Lappula duplicicarpa var. brevispinula C.J.Wang (Boraginaceae) is...

    • datadryad.org
    zip
    Updated Mar 1, 2025
    Cite
    Danhui Liu (2025). Lappula duplicicarpa var. brevispinula C.J.Wang (Boraginaceae) is a synonym of L. macrantha based on morphological and molecular data [Dataset]. http://doi.org/10.5061/dryad.j9kd51cnz
    Dataset provided by
    Dryad
    Authors
    Danhui Liu
    Description

    Molecular and morphological data of Lappula duplicicarpa var. brevispinula

    https://doi.org/10.5061/dryad.j9kd51cnz

    Description of the data and file structure

    Molecular and morphological data of Lappula duplicicarpa var. brevispinula.

    Total genomic DNA was extracted from silica-gel dried leaves of L. duplicicarpa var. duplicicarpa, L. duplicicarpa var. brevispinula, and L. macrantha using a modified CTAB method (Li et al. 2013). High-quality genomic DNA was sent to Novogene Biotechnology in Tianjin for library construction and sequencing. Libraries were sequenced on the NovaSeq 6000 platform, generating paired-end reads of 2×150 bp, with approximately 10 Gb of raw data per sample. Plastome assembly was performed using GetOrganelle version 1.7.5 (Jin et al. 2020) with default parameters, and annotation was carried out using the online tool CpGAVAS2 (Shi et al. 2019), with Lappula lasiocarpa (Accession: NC_077516) as the refer...

  9. Data from: Taxonomic notes on the genus Piper (Piperaceae)

    • data.niaid.nih.gov
    • search.dataone.org
    • +1 more
    zip
    Updated Apr 27, 2016
    Cite
    Chalermpol Suwanphakdee; David A. Simpson; Trevor R. Hodkinson; Pranom Chantaranothai (2016). Taxonomic notes on the genus Piper (Piperaceae) [Dataset]. http://doi.org/10.5061/dryad.qp50f
    Dataset provided by
    Royal Botanic Gardens, Kew
    Trinity College
    Kasetsart University
    Khon Kaen University
    Authors
    Chalermpol Suwanphakdee; David A. Simpson; Trevor R. Hodkinson; Pranom Chantaranothai
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Asia
    Description

    Sixteen lectotypifications of Asian Piper species are provided. Piper argyrites, P. baccatum, P. leptostachyum, P. majusculum, P. peepuloides, P. quinqueangulatum and P. sulcatum are accepted as species and many new synonyms are proposed. Useful diagnostic characters are described and geographical distribution data of each species are provided.

  10. Data from: Thesaurus of Modern Slovene 2.0

    • live.european-language-grid.eu
    binary format
    Updated Nov 14, 2023
    Cite
    (2023). Thesaurus of Modern Slovene 2.0 [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/23182
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Thesaurus of Modern Slovene is the largest automatically generated open-access collection of Slovene synonyms. It is sourced from the data in two principal language resources: The Oxford®-DZS Comprehensive English-Slovenian Dictionary and the Gigafida 1.0 corpus of written Slovene. The links identified between synonyms were additionally confirmed using the Dictionary of Standard Slovenian Language (SSKJ). The data extraction and structure for the Thesaurus were based on the frequency and manner in which words co-occur in translation strings of the Oxford-DZS Dictionary. This information is the basis for discriminating between ‘core’ and ‘near’ synonyms, with ‘core’ synonyms exhibiting a greater connection to the keyword. In the following step, an approach combining balanced co-occurrence graphs and the Personal PageRank algorithm automatically divides the synonyms into subgroups and ranks them according to the degree of semantic relatedness to the keyword, as well as their frequency in language use. For the creation methodology, see Krek et al. (2017) in the provided references.

    The database includes dictionary entries: single- and multiword headwords, their part-of-speech and other linguistic features, as well as automatically extracted synonyms, their type (core or near) and relevancy rank. In version 2.0, 4,544 manually revised antonyms were added to the database. Additionally, for a part of the database, synonyms were distributed under the corresponding word senses. Pertaining to how much lexicographic revision was involved in their preparation, database entries can have one of the following three statuses: (a) ssss-automatic (96,064 entries): no manual revision was conducted; (b) ssss-manual (3,421 entries): word senses and semantic indicators were prepared by lexicographers, and synonyms were manually distributed under each corresponding sense; (c) ssss-hybrid (1,352 entries): manually revised senses are combined with data compiled automatically. For novelties of v2.0, see Arhar Holdt et al. (2023) in the provided references.

  11. Data from: Knowledge Graph Consolidation by Unifying Synonymous...

    • figshare.com
    bz2
    Updated Sep 9, 2019
    Cite
    Jan-Christoph Kalo (2019). Knowledge Graph Consolidation by Unifying Synonymous Relationships [Dataset]. http://doi.org/10.6084/m9.figshare.8490134.v2
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jan-Christoph Kalo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets for reproducing the results of the paper "Knowledge Graph Consolidation by Unifying Synonymous Relationships" published at ISWC 2019.

  12. Detecting Synonymous Relationships by Shared Data-driven Definitions

    • figshare.com
    txt
    Updated Dec 9, 2019
    Cite
    Jan-Christoph Kalo (2019). Detecting Synonymous Relationships by Shared Data-driven Definitions [Dataset]. http://doi.org/10.6084/m9.figshare.11343785.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jan-Christoph Kalo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets that can be used together with the Code in: https://github.com/JanKalo/RuleAlign

  13. Hard Synonyms MySQL dump 8.1

    • figshare.com
    txt
    Updated Jan 20, 2016
    Cite
    Ana Uban (2016). Hard Synonyms MySQL dump 8.1 [Dataset]. http://doi.org/10.6084/m9.figshare.1584665.v1
    Dataset provided by
    figshare
    Authors
    Ana Uban
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database dump containing words from 4 languages: English, Romanian, French and Spanish, and their translations.

  14. Data from: A new combination and a new synonym of Gesneriaceae in China

    • gbif.org
    Updated Nov 29, 2024
    Cite
    Zheng-Long Li; Zhang-Jie Huang; Da-Wei Chen; Xin Hong; Fang Wen (2024). A new combination and a new synonym of Gesneriaceae in China [Dataset]. http://doi.org/10.15468/3rtu4z
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    Plazi
    Authors
    Zheng-Long Li; Zhang-Jie Huang; Da-Wei Chen; Xin Hong; Fang Wen
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    China
    Description

    This dataset contains the digitized treatments in Plazi based on the original journal article Li, Zheng-Long, Huang, Zhang-Jie, Chen, Da-Wei, Hong, Xin, Wen, Fang (2023): A new combination and a new synonym of Gesneriaceae in China. PhytoKeys 232: 99-107, DOI: http://dx.doi.org/10.3897/phytokeys.232.108644, URL: http://dx.doi.org/10.3897/phytokeys.232.108644

  15. Synonym Finance Price Metrics

    • tokenterminal.com
    csv, json
    Updated Apr 26, 2025
    Cite
    Token Terminal (2025). Synonym Finance Price Metrics [Dataset]. https://tokenterminal.com/explorer/projects/synonym-finance
    Dataset authored and provided by
    Token Terminal
    License

    https://tokenterminal.com/terms

    Time period covered
    2020 - Present
    Variables measured
    Price
    Description

    Detailed Price metrics and analytics for Synonym Finance, including historical data and trends.

  16. Replication Data for: Words That Stick Predicting Decision Making and...

    • search.dataone.org
    • data.mendeley.com
    Updated Nov 8, 2023
    Cite
    Dvir, Nimrod (2023). Replication Data for: Words That Stick Predicting Decision Making and Synonym Engagement Using Cognitive Biases and Computational Linguistics [Dataset]. http://doi.org/10.7910/DVN/J5LTYE
    Dataset provided by
    Harvard Dataverse
    Authors
    Dvir, Nimrod
    Description

    This research utilizes cognitive neuroscience and information systems research to predict user engagement and decision-making in digital platforms. By applying Natural Language Processing (NLP) techniques and cognitive bias theories, we investigate user interactions with synonyms in digital content. Our approach incorporates four cognitive biases - representativeness, ease-of-use (processing fluency), affect-biased attention, and distribution/availability (R.E.A.D) - into a comprehensive model. The model's predictive capacity was evaluated using a large user survey, revealing that synonyms representative of core concepts, easy to process, emotionally resonant, and readily available, fostered increased user engagement. Importantly, our research provides a novel perspective on human-computer interaction, digital habits, and decision-making processes. Findings underscore the potential of cognitive biases as powerful predictors of user engagement, emphasizing their role in effective digital content design across education, marketing, and beyond.

  17. Synonym Finance Code commits Metrics

    • tokenterminal.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Token Terminal (2025). Synonym Finance Code commits Metrics [Dataset]. https://tokenterminal.com/explorer/projects/synonym-finance
    Dataset authored and provided by
    Token Terminal
    License

    https://tokenterminal.com/terms

    Time period covered
    2020 - Present
    Variables measured
    Code commits
    Description

    Detailed Code commits metrics and analytics for Synonym Finance, including historical data and trends.

  18. Data from: Three new synonyms of Rungia stolonifera (Acanthaceae) from China...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jan 30, 2020
    Cite
    Zheli Lin; Van Hai Do; Yunfei Deng (2020). Three new synonyms of Rungia stolonifera (Acanthaceae) from China and Vietnam [Dataset]. http://doi.org/10.5061/dryad.pc866t1k0
    Dataset provided by
    South China Botanical Garden
    Instituto de Ciencia y Tecnología de Alimentos y Nutrición
    Authors
    Zheli Lin; Van Hai Do; Yunfei Deng
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Vietnam, China
    Description

    Examination of relevant type materials and living plants reveals that Rungia axilliflora, R. densiflora and R. evrardii are conspecific with R. stolonifera. Lectotypes are designated for the names R. evrardii and R. stolonifera.

  19. The Dataset of Camellia Cultivar Names in the World

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1 more
    Updated Nov 25, 2020
    Cite
    Yanan Wang; Huifu Zhuang; Yunguang Sheng; Yuhua Wang; Zhonglang Wang (2020). The Dataset of Camellia Cultivar Names in the World [Dataset]. http://doi.org/10.5281/zenodo.4289784
    Authors
    Yanan Wang; Huifu Zhuang; Yunguang Sheng; Yuhua Wang; Zhonglang Wang
    Area covered
    World
    Description

    The Camellia Cultivar Names were widely collected from books, journals and new registrations throughout the world every year, then reviewed by experts in the online working platform, the Database of International Camellia Register. After treating some important issues in camellia names, especially the many re-used names and the diacritical marks in Japanese cultivars, a dataset of Camellia names was summarized, covering the years 1253 to 2019 throughout the world. The data is contained in an Excel table file (.xlsx format) with two sheets, entitled 'Cultivars' and 'Synonyms'. The 'Cultivars' sheet mainly records the name and description of each cultivar, while the corresponding synonyms can be found in the 'Synonyms' sheet. The fields and their descriptions are given below.

    Data fields in sheet 'Cultivars':
    • CultivarId: A unique number for each cultivar.
    • CultivarEpithet: The cultivar epithet for each cultivar.
    • ScientificName: The scientific name for each cultivar.
    • ChineseName: The Chinese name for each cultivar.
    • JapaneseName: The Japanese name for each cultivar.
    • Hiragana: The phonetic sounds in Japanese for each cultivar.
    • SpeciesOrCombination: The cultivar's origin or cross parentage.
    • Meaning: The explanation of the name.
    • CultivarType: The type of economic value: For Ornamental, For Tea, or For Oil.
    • DescriptionEn: The English description for each cultivar.
    • DescriptionCn: The Chinese description for each cultivar.
    • DescriptionJp: The Japanese description for each cultivar.
    • YearPublished: The year of first publication.
    • Country: The name of the country that released the cultivar.
    • DefaultPhoto: The type image.
    • DefaultPhotoChosenBy: The name of the specialist who determined the type image.
    • DefaultPhotoChosenDate: The date on which the specialist chose the type image.
    • IsExtinct: Whether the cultivar is extinct.

    Data fields in sheet 'Synonyms':
    • SynonymId: A unique number for each cultivar synonym.
    • Synonym: The synonym used for the cultivar.
    • Reference: The reference in which the synonym was recorded.
    • CultivarEpithet: The corresponding cultivar epithet for the synonym.
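
    Because both sheets carry a CultivarEpithet field, synonyms can be attached to their cultivars by joining on it. The sketch below uses plain dictionaries with invented rows in place of the real .xlsx contents; loading the spreadsheet itself would need a reader library and is left out. The helper name synonyms_by_epithet is hypothetical.

```python
from collections import defaultdict


def synonyms_by_epithet(synonym_rows):
    """Index 'Synonyms' sheet rows by their CultivarEpithet value."""
    index = defaultdict(list)
    for row in synonym_rows:
        index[row["CultivarEpithet"]].append(row["Synonym"])
    return index


# Invented rows that only mimic the documented field names.
cultivars = [
    {"CultivarId": 1, "CultivarEpithet": "Example One"},
    {"CultivarId": 2, "CultivarEpithet": "Example Two"},
]
synonyms = [
    {"SynonymId": 10, "Synonym": "Alt One", "Reference": "ref A",
     "CultivarEpithet": "Example One"},
]

index = synonyms_by_epithet(synonyms)
# Each cultivar gets a (possibly empty) list of its recorded synonyms.
merged = [
    {**c, "Synonyms": index.get(c["CultivarEpithet"], [])} for c in cultivars
]
```

    Joining on CultivarEpithet rather than CultivarId follows the field list above, since the 'Synonyms' sheet is described as carrying the epithet and not the numeric identifier.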

  20. Datasets for Out-of-KB Mention Discovery with Entity Linking

    • data.niaid.nih.gov
    • zenodo.org
    Updated Aug 10, 2023
    Cite
    Dong, Hang (2023). Datasets for Out-of-KB Mention Discovery with Entity Linking [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8228370
    Dataset provided by
    Horrocks, Ian
    Yinan, Liu
    He, Yuan
    Chen, Jiaoyan
    Dong, Hang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The repository contains datasets for out-of-KB mention discovery from texts, documented in the work, Reveal the Unknown: Out-of-Knowledge-Base Mention Discovery with Entity Linking, on arXiv: https://arxiv.org/abs/2302.07189 (CIKM 2023).

    Each data setting (as a sub-folder) contains train, valid, and test files and also 100 random sample files for each data split for debugging.

    Data folder names with “syn_full” at the end are synonym augmented data (each synonym as an entity) for the setting.

    Ontology .jsonl files come in two versions each: the "syn_attr" setting treats synonyms as attributes, while the "syn_full" setting treats synonyms as entities.

    Data scripts are available at https://github.com/KRR-Oxford/BLINKout#data-scripts

    Acknowledgement of the data sources below:

    ShARe/CLEF 2013 dataset is from https://physionet.org/content/shareclefehealth2013/1.0/

    MedMention dataset is from https://github.com/chanzuckerberg/MedMentions

    UMLS (versions 2012AB, 2014AB, 2017AA) is from https://www.nlm.nih.gov/research/umls/index.html

    SNOMED CT (corresponding versions) is from https://www.nlm.nih.gov/healthit/snomedct/index.html

    NILK dataset is from https://zenodo.org/record/6607514

    WikiData 2017 dump is from https://archive.org/download/enwiki-20170220/enwiki-20170220-pages-articles.xml.bz2
