Commit c347e7f

Merge pull request nltk#1011 from longdt219/fix_embedding
Fix bug import nltk#1002
2 parents: 6ae247b + 980f2d3

2 files changed (+2, -4 lines)

nltk/corpus/__init__.py
Lines changed: 0 additions & 2 deletions

@@ -64,7 +64,6 @@
 from nltk.tokenize import RegexpTokenizer
 from nltk.corpus.util import LazyCorpusLoader
 from nltk.corpus.reader import *
-from nltk.data import find

 abc = LazyCorpusLoader(
     'abc', PlaintextCorpusReader, r'(?!\.).*\.txt', encoding=[
@@ -253,7 +252,6 @@
     'semcor', SemcorCorpusReader, r'brown./tagfiles/br-.*\.xml',
     wordnet) # Must be defined *after* wordnet corpus.

-word2vec_sample = str(find('models/word2vec_sample/pruned.word2vec.bin'))

 def demo():
     # This is out-of-date:
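The module-level lookup removed above appears to be the source of the reported import failure: nltk.data.find() raises LookupError when the requested resource is not installed, so evaluating it at module scope broke `import nltk.corpus` for users who had not downloaded the word2vec_sample model. A minimal sketch of that failure mode follows; the guarded form is illustrative only and is not part of the commit.

    # Illustrative only: reproduces the failure mode this commit removes.
    # nltk.data.find() raises LookupError if the resource is missing, so the
    # old module-level call made `import nltk.corpus` fail on machines
    # without the word2vec_sample model installed.
    from nltk.data import find

    try:
        word2vec_sample = str(find('models/word2vec_sample/pruned.word2vec.bin'))
    except LookupError:
        # The removed code had no such guard, so this error propagated out of
        # the import of nltk.corpus itself.
        word2vec_sample = None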

nltk/test/gensim.doctest
Lines changed: 2 additions & 2 deletions

@@ -38,8 +38,8 @@ Using the pre-trained model
 ~~~~~~~~~~~~~~~~~~~
 NLTK also include a pre-trained model which is part of a model that is trained on 100 billion words from Google News Dataset.
 The full model is from https://code.google.com/p/word2vec/ which is about 3 Gb.
-
->>> from nltk.corpus import word2vec_sample
+>>> from nltk.data import find
+>>> word2vec_sample = str(find('models/word2vec_sample/pruned.word2vec.bin'))
 >>> model = gensim.models.Word2Vec.load(word2vec_sample)

 We pruned the model to only include the most common words (~44k words).
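For context, a sketch of how the updated doctest lines are used end to end. It assumes the word2vec_sample package has been fetched (for example via nltk.download('word2vec_sample')) and a gensim version whose Word2Vec.load() can read this pickled sample, as in the doctest; the most_similar() call is only an illustrative query and is not part of the commit.

    # Sketch following the updated doctest; assumes the word2vec_sample model
    # is installed and that the installed gensim exposes Word2Vec.load() and
    # most_similar() as in the version the doctest targets.
    import gensim
    from nltk.data import find

    word2vec_sample = str(find('models/word2vec_sample/pruned.word2vec.bin'))
    model = gensim.models.Word2Vec.load(word2vec_sample)

    # Query the pruned (~44k word) model for nearest neighbours.
    print(model.most_similar('university', topn=5))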
