Python NLTK

Author: 碎花洋裙的优雅 | Published: 2025-09-07

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

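# Download the required resources
nltk.download('punkt')
nltk.download('punkt_tab')  # recent NLTK releases load the tokenizer data from this package
nltk.download('stopwords')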

# Sample text
text = "This is an example sentence demonstrating the use of NLTK in Python."

# Tokenize
tokens = word_tokenize(text)
print("Tokens:", tokens)

# Remove stop words (case-insensitive comparison)
stop_words = set(stopwords.words('english'))
filtered_sentence = [word for word in tokens if word.lower() not in stop_words]
print("After stop-word removal:", filtered_sentence)

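Explanation:

Importing modules: we first import nltk, along with stopwords from nltk.corpus and word_tokenize from nltk.tokenize.
Downloading resources: nltk.download() fetches the data that tokenization and stop-word filtering depend on.
Sample text: a simple English sentence serves as the input.
Tokenization: word_tokenize splits the sentence into tokens, and the result is stored in the tokens list.
Removing stop words: stopwords.words('english') returns the English stop-word list. We convert it to a set and keep only the tokens whose lowercase form is not in that set, which gives the final filtered_sentence.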
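
One detail the filtering step leaves open: punctuation tokens such as "." are not in the English stop-word list, so they survive in filtered_sentence. The following is a minimal sketch of one way to drop them as well, not part of the original example. The helper name remove_stopwords is hypothetical, and the sample token list and the small stop-word set are hand-written stand-ins (an assumption) so that the sketch runs without any NLTK data.

```python
import string


def remove_stopwords(tokens, stop_words):
    """Return the tokens that are neither stop words nor pure punctuation."""
    kept = []
    for token in tokens:
        if token.lower() in stop_words:
            continue  # stop word, compared case-insensitively as in the example
        if all(ch in string.punctuation for ch in token):
            continue  # token consists only of punctuation, e.g. "."
        kept.append(token)
    return kept


# Hand-written stand-ins (assumption): in the real script, pass
# word_tokenize(text) and set(stopwords.words('english')) instead.
sample_tokens = ["This", "is", "an", "example", "sentence", "demonstrating",
                 "the", "use", "of", "NLTK", "in", "Python", "."]
sample_stop_words = {"this", "is", "an", "the", "of", "in"}

print(remove_stopwords(sample_tokens, sample_stop_words))
```

With the script above, the same call would be remove_stopwords(tokens, stop_words). Keeping the punctuation check separate from the stop-word check makes it easy to switch off when punctuation matters, for example in sentence-level analysis.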

Hopefully this code helps you understand how to use nltk for basic natural language processing tasks.
