4 datasets found
  1. GoogleNews-vectors-negative300.bin.gz (GZ format)

    • kaggle.com
    zip
    Updated Apr 26, 2023
    Cite
    Suraj (2023). GoogleNews-vectors-negative300.bin.gz (GZ format) [Dataset]. https://www.kaggle.com/datasets/suraj520/googlenews-vectors-negative300bingz-gz-format
    Available download formats
    zip (1760926034 bytes)
    Authors
    Suraj
    Description

    Dataset

    This dataset was created by Suraj

    Contents
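
    The file is, presumably, the standard pretrained Google News word2vec binary (3 million words, 300 dimensions), shipped gzipped. A minimal loading sketch with gensim follows; the file name is taken from the dataset title, and gensim reads .bin.gz directly, so no manual decompression is needed:

    from gensim.models import KeyedVectors

    # Load the pretrained Google News vectors straight from the gzipped binary.
    kv = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin.gz", binary=True)

    print(kv["computer"].shape)               # (300,)
    print(kv.most_similar("computer", topn=3))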

  2. GoogleNews-vectors-negative300 2.bin

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Lintong Jiang (2023). GoogleNews-vectors-negative300 2.bin [Dataset]. http://doi.org/10.6084/m9.figshare.7441400.v1
    Available download formats
    bin
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Lintong Jiang
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    For word embeddings.

  3. NLP-Word2Vec-Embeddings(pretrained)

    • kaggle.com
    zip
    Updated Feb 1, 2018
    Cite
    pkugoodspeed (2018). NLP-Word2Vec-Embeddings(pretrained) [Dataset]. https://www.kaggle.com/pkugoodspeed/nlpword2vecembeddingspretrained
    Available download formats
    zip (2645946681 bytes)
    Authors
    pkugoodspeed
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Context

    Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.
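
    As a concrete illustration of the corpus-in, vectors-out workflow described above, here is a minimal sketch that trains a toy model with gensim 4.x; the corpus and hyperparameters are illustrative, not taken from this dataset:

    from gensim.models import Word2Vec

    # A toy corpus: each sentence is a list of tokens.
    corpus = [
        ["king", "rules", "the", "kingdom"],
        ["queen", "rules", "the", "kingdom"],
        ["dog", "chases", "the", "cat"],
    ]

    # Train a small skip-gram model (sg=1); vector_size is the embedding
    # dimensionality (the pretrained files below use 50-300 dimensions).
    model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

    # Words sharing contexts ("king"/"queen") end up near each other.
    print(model.wv.most_similar("king", topn=2))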

    Content

    Existing Word2Vec embeddings:

    GoogleNews-vectors-negative300.bin
    glove.6B.50d.txt
    glove.6B.100d.txt
    glove.6B.200d.txt
    glove.6B.300d.txt
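
    The glove.6B files are plain-text, headerless vectors, while the Google News file uses the binary word2vec format. A minimal loading sketch for both, assuming gensim >= 4.0 (which added the no_header flag; older versions needed the glove2word2vec conversion script):

    from gensim.models import KeyedVectors

    # GloVe .txt files lack the "vocab_size dims" header line, hence no_header.
    glove = KeyedVectors.load_word2vec_format(
        "glove.6B.100d.txt", binary=False, no_header=True)

    # The Google News vectors ship in the binary word2vec format.
    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    print(glove["king"].shape)   # (100,)
    print(w2v["king"].shape)     # (300,)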


  4. Data from: Negative Sampling Improves Hypernymy Extraction Based on Projection Learning

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Ustalov, Dmitry; Arefyev, Nikolay; Panchenko, Alexander; Biemann, Chris (2020). Negative Sampling Improves Hypernymy Extraction Based on Projection Learning [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_290524
    Dataset provided by
    Ural Federal University
    University of Hamburg
    Moscow State University
    Authors
    Ustalov, Dmitry; Arefyev, Nikolay; Panchenko, Alexander; Biemann, Chris
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    We present a new approach to the extraction of hypernyms based on projection learning and word embeddings. In contrast to classification-based approaches, projection-based methods require no candidate hyponym-hypernym pairs. While it is natural to use both positive and negative training examples in supervised relation extraction, the impact of negative examples on hypernym prediction has not been studied so far. In this paper, we show that explicit negative examples used for regularization of the model significantly improve performance compared to the state-of-the-art approach on three datasets from different languages.
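
    To make the idea concrete, here is a small numpy sketch of projection learning with an explicit negative term: a matrix Phi is learned that maps hyponym vectors toward their hypernym vectors while being pushed away from negative examples. This is an illustrative formulation on toy data, not the paper's exact loss; the released models were trained with TensorFlow (see the environment listings below):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 50                                # embedding dimensionality (toy)
    X = rng.normal(size=(200, d))         # hyponym vectors
    Y = rng.normal(size=(200, d))         # corresponding hypernym vectors
    Z = rng.normal(size=(200, d))         # explicit negative examples

    Phi = np.eye(d)                       # projection matrix, start at identity
    lr, lam = 1e-3, 0.1                   # step size, weight of negative term

    for epoch in range(200):
        P = X @ Phi.T                     # projected hyponyms, shape (200, d)
        grad = 2.0 * (P - Y).T @ X / len(X)          # pull toward hypernyms
        grad -= lam * 2.0 * (P - Z).T @ X / len(X)   # push away from negatives
        Phi -= lr * grad

    # In practice the negative term is bounded (e.g. with a margin) so that
    # maximizing distance to negatives cannot diverge.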

    The Russian model:

    $ python -V; pip show tensorflow numpy scipy scikit-learn gensim | egrep -i '(name|version)'
    Python 3.5.2 :: Continuum Analytics, Inc.
    Name: tensorflow
    Version: 0.12.1
    Name: numpy
    Version: 1.12.0
    Name: scipy
    Version: 0.18.1
    Name: scikit-learn
    Version: 0.18.1
    Name: gensim
    Version: 0.13.4.1

    The english-combined model has been trained, using the well-known word embeddings dataset based on Google News (GoogleNews-vectors-negative300.bin), on the combination of the EVALution, BLESS, K&H+N, and ROOT09 datasets. The english-evalution model is trained on EVALution only:

    $ python -V; pip show tensorflow numpy scipy scikit-learn gensim | egrep -i '(name|version)'
    Python 3.5.2 :: Anaconda custom (64-bit)
    Name: tensorflow
    Version: 0.12.1
    Name: numpy
    Version: 1.11.3
    Name: scipy
    Version: 0.18.1
    Name: scikit-learn
    Version: 0.18.1
    Name: gensim
    Version: 0.13.4.1
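
    A hypothetical usage sketch: once a projection matrix has been learned, it can be applied to the Google News vectors to retrieve hypernym candidates. Phi below is a placeholder (loading the actual trained matrix from the TensorFlow checkpoint is not shown), and "apple" is an arbitrary query word:

    import numpy as np
    from gensim.models import KeyedVectors

    kv = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)
    Phi = np.eye(kv.vector_size)          # stand-in for the learned projection

    predicted = Phi @ kv["apple"]         # project a hyponym vector
    # Nearest neighbours of the projected vector are hypernym candidates.
    print(kv.similar_by_vector(predicted, topn=5))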

  5. Not seeing a result you expected?
    Learn how you can add new datasets to our index.
