4 datasets found
  1. GoogleNews-vectors-negative300.bin.gz (GZ format)

    • kaggle.com
    zip
    Updated Apr 26, 2023
    Cite
    Suraj (2023). GoogleNews-vectors-negative300.bin.gz (GZ format) [Dataset]. https://www.kaggle.com/datasets/suraj520/googlenews-vectors-negative300bingz-gz-format
    Available download formats
    zip (1760926034 bytes)
    Authors
    Suraj
    Description

    Dataset

    This dataset was created by Suraj

    Contents
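
    The file is, presumably, the standard pretrained Google News word2vec binary (3 million words, 300 dimensions), shipped gzipped. A minimal loading sketch with gensim follows; the file name is taken from the dataset title, and gensim reads .bin.gz directly, so no manual decompression is needed:

    from gensim.models import KeyedVectors

    # Load the pretrained Google News vectors straight from the gzipped binary.
    kv = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin.gz", binary=True)

    print(kv["computer"].shape)               # (300,)
    print(kv.most_similar("computer", topn=3))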

  2. GoogleNews-vectors-negative300 2.bin

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Lintong Jiang (2023). GoogleNews-vectors-negative300 2.bin [Dataset]. http://doi.org/10.6084/m9.figshare.7441400.v1
    Available download formats
    bin
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Lintong Jiang
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    For word embeddings.

  3. NLP-Word2Vec-Embeddings(pretrained)

    • kaggle.com
    zip
    Updated Feb 1, 2018
    Cite
    pkugoodspeed (2018). NLP-Word2Vec-Embeddings(pretrained) [Dataset]. https://www.kaggle.com/pkugoodspeed/nlpword2vecembeddingspretrained
    Available download formats
    zip (2645946681 bytes)
    Authors
    pkugoodspeed
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Context

    Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.
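
    As a concrete illustration of the corpus-in, vectors-out workflow described above, here is a minimal sketch that trains a toy model with gensim 4.x; the corpus and hyperparameters are illustrative, not taken from this dataset:

    from gensim.models import Word2Vec

    # A toy corpus: each sentence is a list of tokens.
    corpus = [
        ["king", "rules", "the", "kingdom"],
        ["queen", "rules", "the", "kingdom"],
        ["dog", "chases", "the", "cat"],
    ]

    # Train a small skip-gram model (sg=1); vector_size is the embedding
    # dimensionality (the pretrained files below use 50-300 dimensions).
    model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

    # Words sharing contexts ("king"/"queen") end up near each other.
    print(model.wv.most_similar("king", topn=2))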

    Content

    Existing Word2Vec embeddings:

    GoogleNews-vectors-negative300.bin
    glove.6B.50d.txt
    glove.6B.100d.txt
    glove.6B.200d.txt
    glove.6B.300d.txt
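
    The glove.6B files are plain-text, headerless vectors, while the Google News file uses the binary word2vec format. A minimal loading sketch for both, assuming gensim >= 4.0 (which added the no_header flag; older versions needed the glove2word2vec conversion script):

    from gensim.models import KeyedVectors

    # GloVe .txt files lack the "vocab_size dims" header line, hence no_header.
    glove = KeyedVectors.load_word2vec_format(
        "glove.6B.100d.txt", binary=False, no_header=True)

    # The Google News vectors ship in the binary word2vec format.
    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    print(glove["king"].shape)   # (100,)
    print(w2v["king"].shape)     # (300,)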


  4. Data from: Negative Sampling Improves Hypernymy Extraction Based on Projection Learning

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Ustalov, Dmitry; Arefyev, Nikolay; Panchenko, Alexander; Biemann, Chris (2020). Negative Sampling Improves Hypernymy Extraction Based on Projection Learning [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_290524
    Dataset provided by
    Ural Federal University
    University of Hamburg
    Moscow State University
    Authors
    Ustalov, Dmitry; Arefyev, Nikolay; Panchenko, Alexander; Biemann, Chris
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    We present a new approach to the extraction of hypernyms based on projection learning and word embeddings. In contrast to classification-based approaches, projection-based methods require no candidate hyponym-hypernym pairs. While it is natural to use both positive and negative training examples in supervised relation extraction, the impact of negative examples on hypernym prediction has not been studied so far. In this paper, we show that explicit negative examples used for regularization of the model significantly improve performance compared to the state-of-the-art approach on three datasets from different languages.
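
    To make the idea concrete, here is a small numpy sketch of projection learning with an explicit negative term: a matrix Phi is learned that maps hyponym vectors toward their hypernym vectors while being pushed away from negative examples. This is an illustrative formulation on toy data, not the paper's exact loss; the released models were trained with TensorFlow (see the environment listings below):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 50                                # embedding dimensionality (toy)
    X = rng.normal(size=(200, d))         # hyponym vectors
    Y = rng.normal(size=(200, d))         # corresponding hypernym vectors
    Z = rng.normal(size=(200, d))         # explicit negative examples

    Phi = np.eye(d)                       # projection matrix, start at identity
    lr, lam = 1e-3, 0.1                   # step size, weight of negative term

    for epoch in range(200):
        P = X @ Phi.T                     # projected hyponyms, shape (200, d)
        grad = 2.0 * (P - Y).T @ X / len(X)          # pull toward hypernyms
        grad -= lam * 2.0 * (P - Z).T @ X / len(X)   # push away from negatives
        Phi -= lr * grad

    # In practice the negative term is bounded (e.g. with a margin) so that
    # maximizing distance to negatives cannot diverge.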

    The Russian model:

    $ python -V; pip show tensorflow numpy scipy scikit-learn gensim | egrep -i '(name|version)'
    Python 3.5.2 :: Continuum Analytics, Inc.
    Name: tensorflow
    Version: 0.12.1
    Name: numpy
    Version: 1.12.0
    Name: scipy
    Version: 0.18.1
    Name: scikit-learn
    Version: 0.18.1
    Name: gensim
    Version: 0.13.4.1

    The english-combined model has been trained, using the well-known word embeddings dataset based on Google News (GoogleNews-vectors-negative300.bin), on the combination of the EVALution, BLESS, K&H+N, and ROOT09 datasets. The english-evalution model is trained on EVALution only:

    $ python -V; pip show tensorflow numpy scipy scikit-learn gensim | egrep -i '(name|version)'
    Python 3.5.2 :: Anaconda custom (64-bit)
    Name: tensorflow
    Version: 0.12.1
    Name: numpy
    Version: 1.11.3
    Name: scipy
    Version: 0.18.1
    Name: scikit-learn
    Version: 0.18.1
    Name: gensim
    Version: 0.13.4.1
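
    A hypothetical usage sketch: once a projection matrix has been learned, it can be applied to the Google News vectors to retrieve hypernym candidates. Phi below is a placeholder (loading the actual trained matrix from the TensorFlow checkpoint is not shown), and "apple" is an arbitrary query word:

    import numpy as np
    from gensim.models import KeyedVectors

    kv = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)
    Phi = np.eye(kv.vector_size)          # stand-in for the learned projection

    predicted = Phi @ kv["apple"]         # project a hyponym vector
    # Nearest neighbours of the projected vector are hypernym candidates.
    print(kv.similar_by_vector(predicted, topn=5))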

  5. Not seeing a result you expected?
    Learn how you can add new datasets to our index.
