A very common source of encoding errors is that Python 2 silently coerces a str to unicode when you add the two together. This can lead to mixed-encoding problems that are very hard to debug.
For example:
import urllib
import webbrowser
name = raw_input("What's your name?\nName: ")
greeting = "Hello, %s" % name
if name == "John":
    greeting += u' (Feliz cumplea\xf1os!)'
webbrowser.open('http://lmgtf\x79.com?q=' + urllib.quote_plus(greeting))
If you enter "John", this fails with a cryptic error:
/usr/lib/python2.7/urllib.py:1268: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
Traceback (most recent call last):
  File "feliz.py", line 7, in <module>
    webbrowser.open('http://lmgtf\x79.com?q=' + urllib.quote_plus(greeting))
  File "/usr/lib/python2.7/urllib.py", line 1273, in quote_plus
    s = quote(s, safe + ' ')
  File "/usr/lib/python2.7/urllib.py", line 1268, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xf1'
This is especially hard to track down when the error surfaces far from the place where the coercion actually happened.
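For readers unfamiliar with the mechanism: when Python 2 concatenates a str with a unicode, it first decodes the str using the default encoding, which is normally ASCII. The sketch below mimics that rule in a hypothetical helper, py2_style_concat (not part of any library), written so that it also runs on Python 3, where bytes stands in for Python 2's str. It illustrates the behaviour and is not the interpreter's actual code.

```python
# Hypothetical helper: mimics what Python 2 does for  str + unicode.
# Python 2 implicitly decodes the byte string with the default
# encoding (normally ASCII) before concatenating.

def py2_style_concat(byte_string, text):
    """Return byte_string + text the way Python 2 would compute it."""
    return byte_string.decode("ascii") + text


# Pure-ASCII bytes coerce silently, so the mixed types go unnoticed.
combined = py2_style_concat(b"Hello, John", u" (Feliz cumplea\xf1os!)")
print(combined.encode("unicode_escape"))

# Non-ASCII bytes make the hidden decode step fail.
try:
    py2_style_concat(u"Jos\xe9".encode("utf-8"), u"!")
except UnicodeDecodeError as exc:
    print("implicit decode failed: %s" % exc)
```

The first call is the dangerous case: nothing fails at the point of coercion, and the resulting unicode object only causes trouble later, as in the urllib traceback above.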
How can Python be configured to issue a warning or raise an exception as soon as a string is coerced to unicode?
Just install the unicodenazi module and run your program like this:
python -Werror -municodenazi myprog.py
You will get a traceback at the point where the coercion happens:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "SITE-PACKAGES/unicodenazi.py", line 128, in <module>
    main()
  File "SITE-PACKAGES/unicodenazi.py", line 119, in main
    execfile(sys.argv[0], main_mod.__dict__)
  File "myprog.py", line 4, in <module>
    print foo()
  File "myprog.py", line 2, in foo
    return 'bar' + u'baz'
  File "SITE-PACKAGES/unicodenazi.py", line 34, in warning_decode
    stacklevel=2)
UnicodeWarning: Implicit conversion of str to unicode
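The -Werror option is a standard interpreter flag that promotes warnings to exceptions, which is why the UnicodeWarning appears as a traceback here. Below is a minimal sketch of the same effect using the standard warnings module. The function risky_concat and its explicit warnings.warn call are stand-ins for the place where a coercion would be detected; they are assumptions for illustration, not unicodenazi's actual code.

```python
import warnings


def risky_concat():
    # Stand-in for the point where an implicit coercion is detected.
    warnings.warn("Implicit conversion of str to unicode",
                  UnicodeWarning, stacklevel=2)
    return u"barbaz"


def run_strict():
    """Same effect as -Werror: the warning is raised as an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", UnicodeWarning)
        try:
            risky_concat()
        except UnicodeWarning as exc:
            return "raised: %s" % exc
    return "no exception"


def run_lenient():
    """Without -Werror: the call succeeds and the warning is only reported.

    The warning is recorded in a list here so it can be inspected;
    by default Python would print it to stderr instead.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", UnicodeWarning)
        result = risky_concat()
    return result, [str(w.message) for w in caught]


print(run_strict())
print(run_lenient())
```

In strict mode execution stops at the offending line, which is what makes the traceback useful. In lenient mode the program carries on and the warning is the only trace left behind.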
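If you are working with Python libraries that trigger implicit coercion themselves, and you cannot catch the resulting exceptions or otherwise work around them, you can leave out -Werror: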
python -municodenazi myprog.py
You will then at least see a warning printed on stderr whenever it happens:
/SITE-PACKAGES/unicodenazi.py:119: UnicodeWarning: Implicit conversion of str to unicode
  execfile(sys.argv[0], main_mod.__dict__)
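Once a warning has pinpointed a coercion, a common remedy is to make the conversion explicit: decode bytes to text where data enters the program, and encode text back to bytes where a byte-oriented API needs it. The following is a hedged sketch for the greeting example from the question. The helper build_query and the choice of UTF-8 are assumptions, and the import fallback is there only so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib import quote_plus          # Python 2
except ImportError:
    from urllib.parse import quote_plus    # Python 3


def build_query(name_bytes, encoding="utf-8"):
    """Build the URL-quoted greeting with every conversion spelled out."""
    # bytes -> text at the input boundary
    name = name_bytes.decode(encoding)
    greeting = u"Hello, %s" % name
    if name == u"John":
        greeting += u" (Feliz cumplea\xf1os!)"
    # text -> bytes before handing the value to the quoting function
    return quote_plus(greeting.encode(encoding))


print(build_query(b"John"))
```

Because every value is either text or bytes by construction, no implicit coercion is left for the interpreter to perform.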