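Python raises "'gbk' codec can't decode byte 0x8a in position 87: illegal multibyte sequence" when reading a file containing Chinese characters
Reading a file that contains Chinese text raises the error in the title: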
read3.py
# -*- coding: utf-8 -*-
def read1(path):
    # No encoding is passed, so open() uses the platform default (GBK here)
    with open(path,'r') as f:
        for line in f:
            print(line)

if __name__ == '__main__':
    path = r'D:\IStudy\Java\workspace\mypy\com\dgb\test\2017-05-10.txt'
    read1(path)
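Running it raised the error in the title. When no encoding is specified, open decodes the file with the platform's default encoding, which was GBK here, and the file's bytes are not valid GBK. The documentation shows that open also accepts an encoding parameter. After modifying the file: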
# -*- coding: utf-8 -*-
def read1(path):
    # Decode the file explicitly as UTF-8 instead of the platform default
    with open(path,'r',encoding='UTF-8') as f:
        for line in f:
            print(line)

if __name__ == '__main__':
    path = r'D:\IStudy\Java\workspace\mypy\com\dgb\test\2017-05-10.txt'
    read1(path)
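The mismatch can be reproduced without the original file. The following is only a minimal sketch: the helper names `read_lines` and `demo_encoding`, the temporary file, and the sample text are all made up for illustration. It writes UTF-8 bytes to disk and then reads them back once as UTF-8 and once as GBK.

```python
import os
import tempfile


def read_lines(path, encoding):
    """Return all lines of a text file, decoded with the given encoding."""
    with open(path, 'r', encoding=encoding) as f:
        return f.readlines()


def demo_encoding():
    """Write UTF-8 text, then read it back as UTF-8 and as GBK (hypothetical demo)."""
    text = '中\n文\n'  # sample content, not taken from the original file
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'sample.txt')
        # newline='' keeps '\n' as-is on every platform
        with open(path, 'w', encoding='utf-8', newline='') as f:
            f.write(text)

        ok = read_lines(path, 'utf-8')  # matches how the file was written
        try:
            read_lines(path, 'gbk')     # wrong codec for these bytes
            failed = False
        except UnicodeDecodeError:
            failed = True
    return ok, failed


if __name__ == '__main__':
    lines, gbk_failed = demo_encoding()
    print(lines, gbk_failed)
```

Note that decoding with the wrong codec does not always raise. Some byte sequences happen to be valid GBK and come back as garbled text, so passing the encoding the file was actually saved in is the safer habit.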
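The file now prints correctly, but the spacing between lines clearly differs from the spacing in the source file. Python's str has a strip method; the official description is: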
str.

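When Python tries to read a file containing Chinese characters with the GBK codec, it raises "'gbk' codec can't decode byte 0x8a in position 87: illegal multibyte sequence". To fix this, pass the encoding parameter to open so that the file is read as UTF-8, for example `open(path,'r',encoding='UTF-8')`. The printed output may then show wider line spacing than the source file, because each line read from the file still ends with its newline and print appends another one. Calling `line.strip()` before printing removes the leading and trailing whitespace of each line, including that newline, so the output no longer has the extra blank lines.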
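The spacing effect is easy to check in isolation. This is a small sketch under assumptions: the function name `demo_spacing` and the sample lines are invented, and `io.StringIO` stands in for the console so the output can be inspected. It also uses `rstrip('\n')` rather than `strip()`, a substitution made here because `strip()` would remove leading indentation as well.

```python
import io


def demo_spacing():
    """Compare print(line) with print(line.rstrip('\\n')) on sample lines."""
    # Sample lines as they would come from iterating over a file:
    # each one still carries its trailing newline.
    lines = ['  第一行\n', '第二行\n']

    doubled = io.StringIO()
    for line in lines:
        print(line, file=doubled)               # newline from file + newline from print

    fixed = io.StringIO()
    for line in lines:
        print(line.rstrip('\n'), file=fixed)    # drop only the trailing newline

    return doubled.getvalue(), fixed.getvalue()


if __name__ == '__main__':
    doubled, fixed = demo_spacing()
    print(repr(doubled))
    print(repr(fixed))
```

An alternative that leaves the line untouched is `print(line, end='')`, which tells print not to append its own newline.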



