Commit 1d8fd23

neoneo40 authored and luispedro committed
Modify UnicodeDecodeError text in Python 2.x
1 parent 22bc140 commit 1d8fd23

1 file changed: +3 -1 lines changed

ch05/classify.py

Lines changed: 3 additions & 1 deletion
@@ -54,7 +54,9 @@ def prepare_sent_features():
         if not text:
             meta[pid]['AvgSentLen'] = meta[pid]['AvgWordLen'] = 0
         else:
-            text = text.decode('utf-8')
+            from platform import python_version
+            if python_version().startswith('2'):
+                text = text.decode('utf-8')
             sent_lens = [len(nltk.word_tokenize(
                 sent)) for sent in nltk.sent_tokenize(text)]
             meta[pid]['AvgSentLen'] = np.mean(sent_lens)
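
The change above decodes only when the interpreter reports a 2.x version. An alternative, shown here only as a sketch and not as part of this commit, is to branch on the type of the value rather than on the version string. The helper name `to_text` is hypothetical and does not exist in ch05/classify.py.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding only when it is a byte string.

    Hypothetical helper for illustration; not part of the commit.
    """
    # On Python 2, `bytes` is an alias for `str`, so plain strings are
    # decoded to `unicode`. On Python 3, only real `bytes` objects are
    # decoded and `str` values pass through unchanged.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


if __name__ == "__main__":
    print(to_text(b"caf\xc3\xa9"))   # UTF-8 bytes are decoded to text
    print(to_text(u"already text"))  # text is returned as-is
```

Under this assumption, the line in `prepare_sent_features()` could read `text = to_text(text)` with no version check, and it would also cope with byte strings reaching the function on Python 3.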