(1) Find the data source on GitHub: https://github.com/nltk/nltk_data

(2) As shown in the figure, download the packages folder from that repository.

(3) Open Jupyter and enter the following two lines of code:

import nltk
nltk.data.find(".")

This displays the directory where the NLTK data is stored:

FileSystemPathPointer('C:\\ProgramData\\Anaconda3\\nltk_data')

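Copy the downloaded packages folder to that location and rename it from packages to nltk_data, so that its contents sit directly under C:\ProgramData\Anaconda3\nltk_data.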
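The copy-and-rename step can also be scripted. The following is a minimal sketch rather than an official NLTK procedure: the helper name install_offline_data and the download path in the usage comment are hypothetical, and dirs_exist_ok requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def install_offline_data(packages_dir, data_dir):
    """Copy the downloaded packages folder so that its contents end up in data_dir.

    Both arguments are paths chosen by the caller (hypothetical helper).
    """
    src = Path(packages_dir)
    dest = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"downloaded folder not found: {src}")
    # dirs_exist_ok=True merges into an nltk_data folder that already exists.
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest


# Example call (the source path is an assumption, adjust it to your download):
# install_offline_data(r"C:\Downloads\nltk_data\packages",
#                      r"C:\ProgramData\Anaconda3\nltk_data")
```

Copying instead of moving leaves the downloaded folder intact, so the step can be repeated if the target directory turns out to be wrong.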

Then run an example to check the installation. It may still fail at this point with a file-not-found error, for example when running:

sents = nltk.corpus.treebank_raw.sents()


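If you open the data directory, however, you will see that there is no treebank_raw folder. Instead there is a treebank folder with a raw folder beneath it, which suggests a version mismatch between the program and the data layout: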

 

C:\ProgramData\Anaconda3\nltk_data\corpora\treebank\raw
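To inspect the layout without opening a file browser, a small standard-library sketch such as the one below can list what actually sits under a corpus folder. The helper name list_corpus_subfolders is hypothetical, and the data root in the usage comment is the one reported earlier in this post.

```python
from pathlib import Path


def list_corpus_subfolders(data_root, corpus_name):
    """Return the sorted subfolder names under <data_root>/corpora/<corpus_name>.

    Returns an empty list if the corpus folder does not exist.
    """
    corpus_dir = Path(data_root) / "corpora" / corpus_name
    if not corpus_dir.is_dir():
        return []
    return sorted(p.name for p in corpus_dir.iterdir() if p.is_dir())


# Example call:
# list_corpus_subfolders(r"C:\ProgramData\Anaconda3\nltk_data", "treebank")
```

An empty result for treebank_raw together with a non-empty result for treebank would match the situation described above.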

How to test thoroughly whether the NLTK data is installed correctly is covered elsewhere; this post gives only the simplest check:

from nltk.book import *


output:

*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908

The Treebank corpus can then be read through nltk.corpus.treebank:

sents = nltk.corpus.treebank.sents()
sents


output:

[['Pierre', 'Vinken', ',', '61', 'years', 'old', ',', 'will', 'join', 'the', 'board', 'as', 'a', 'nonexecutive', 'director', 'Nov.', '29', '.'], ['Mr.', 'Vinken', 'is', 'chairman', 'of', 'Elsevier', 'N.V.', ',', 'the', 'Dutch', 'publishing', 'group', '.'], ...]
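A more targeted check than importing nltk.book is to ask NLTK whether it can locate individual resources. The sketch below relies on nltk.data.find raising LookupError when a resource is missing. The helper name resource_available is hypothetical, and the two resource names are just examples.

```python
def resource_available(resource_name):
    """Return True if NLTK can locate resource_name, False otherwise."""
    try:
        import nltk
    except ImportError:
        return False  # NLTK itself is not installed in this environment
    try:
        nltk.data.find(resource_name)
        return True
    except LookupError:
        return False  # NLTK searched its data path and found nothing


# Report the status of a couple of corpora used in this post.
for name in ("corpora/treebank", "corpora/gutenberg"):
    print(name, resource_available(name))
```

A False result for a corpus that was copied by hand usually means the folder is not in one of the directories NLTK searches.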

For an offline installation on Windows, the data path can also be configured through the NLTK_DATA environment variable.
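As a rough sketch of the environment-variable approach, the snippet below sets NLTK_DATA from inside Python before NLTK is imported. The folder is the one reported earlier in this post; replace it with your own. Setting os.environ only affects the current Python process, so a permanent setting still has to be made in the Windows system settings.

```python
import os

# Folder that holds the unpacked data (taken from this post, adjust as needed).
data_dir = r"C:\ProgramData\Anaconda3\nltk_data"

# NLTK reads NLTK_DATA when it is first imported, so set it beforehand.
os.environ["NLTK_DATA"] = data_dir

try:
    import nltk

    # If NLTK was already imported in this session, the variable is not
    # re-read, so add the folder to the search path directly as well.
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
    print(nltk.data.path)
except ImportError:
    print("NLTK is not installed in this environment")
```

Appending to nltk.data.path is a fallback for notebooks where NLTK is already loaded; in a fresh process the environment variable alone should be enough.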
