Python: reading from a file and saving as UTF-8

一尘不染

python

I'm having trouble reading from a file, processing its string, and saving the result to a UTF-8 file.

Here is the code:

try:
    filehandle = open(filename,"r")
except:
    print("Could not open file " + filename)
    quit()

text = filehandle.read()
filehandle.close()

I then do some processing on the variable text.

And then:

try:
    writer = open(output,"w")
except:
    print("Could not open file " + output)
    quit()

#data = text.decode("iso 8859-15")    
#writer.write(data.encode("UTF-8"))
writer.write(text)
writer.close()

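This writes the file without errors, but according to my editor the output is in ISO 8859-15. The same editor recognizes the input file (the one named in the variable filename) as UTF-8, so I don't understand why this happens. As far as my research shows, the commented-out lines should solve the problem. However, when I use them, the resulting file has garbled special characters, mainly in words with tildes and accents, since the text is in Spanish. I'm stumped and would really appreciate any help.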

2020-12-20

1 answer

一尘不染

Convert text to and from Unicode at the I/O boundaries of your program, using the codecs module:

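import codecs

# Decode the input file as UTF-8 while reading
with codecs.open(filename, 'r', encoding='utf8') as f:
    text = f.read()
# process Unicode text
# Encode as UTF-8 while writing to the output file named in the question
with codecs.open(output, 'w', encoding='utf8') as f:
    f.write(text)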
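As a hedged illustration of why the commented-out lines in the question are likely to produce gibberish: if the input bytes are already UTF-8, decoding them as ISO 8859-15 and then re-encoding as UTF-8 double-encodes every non-ASCII character. The sketch below assumes Python 3 and a made-up sample word. `double_encode` is a hypothetical helper for demonstration, not part of any library.

```python
def double_encode(raw_bytes):
    """Decode UTF-8 bytes with the wrong codec, then re-encode as UTF-8."""
    # Each byte of a multi-byte UTF-8 sequence becomes its own character here.
    wrong_text = raw_bytes.decode("iso-8859-15")
    return wrong_text.encode("utf-8")


original = "a\u00f1o"                      # the Spanish word "año"
utf8_bytes = original.encode("utf-8")      # b'a\xc3\xb1o'
garbled = double_encode(utf8_bytes)        # b'a\xc3\x83\xc2\xb1o'

# Read back as UTF-8, the single character U+00F1 now shows up as two
# characters, U+00C3 followed by U+00B1.
garbled_text = garbled.decode("utf-8")
print(len(original), len(garbled_text))
```

Decoding with the encoding the file really uses, as in the codecs example above, avoids the extra round trip.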

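Edit: The io module is now recommended over codecs, and it is compatible with Python 3's open syntax. If you are on Python 3 and do not need Python 2 compatibility, you can simply use the built-in open.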

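import io

# io.open accepts the same encoding argument as Python 3's built-in open
with io.open(filename, 'r', encoding='utf8') as f:
    text = f.read()
# process Unicode text
# Write to the output file named in the question, encoded as UTF-8
with io.open(output, 'w', encoding='utf8') as f:
    f.write(text)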
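On Python 3 only, the same round trip can be sketched with the built-in open, which takes an encoding argument directly. The helper name `convert_to_utf8` and its `source_encoding` parameter are hypothetical, chosen for illustration, and the processing step is left as a placeholder.

```python
def convert_to_utf8(input_path, output_path, source_encoding="utf-8"):
    """Read text in source_encoding and write it back out as UTF-8."""
    # Decode at the input boundary: bytes on disk become a Unicode str.
    with open(input_path, "r", encoding=source_encoding) as src:
        text = src.read()

    # ... process the Unicode text here ...

    # Encode at the output boundary: the str is written as UTF-8 bytes.
    with open(output_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return text
```

If the input really is UTF-8, as the editor in the question reports, the default `source_encoding` applies. Passing something like `source_encoding="iso-8859-15"` would only be appropriate for a file actually stored in that encoding.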

2020-12-20