一尘不染

Reading ASCII-delimited text with the csv module?

python

You may or may not know about ASCII-delimited text, which has the nice advantage of using non-keyboard characters to separate fields and lines.
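For readers who have not seen the format, here is a minimal sketch of what one record looks like. It assumes the usual convention of ASCII 31 (unit separator) between fields and ASCII 30 (record separator) at the end of each record; the names `US` and `RS` are only illustrative.

```python
# ASCII control characters used as delimiters (names are illustrative).
US = chr(31)  # unit separator: goes between fields
RS = chr(30)  # record separator: goes at the end of each record

fields = ['Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue']
record = US.join(fields) + RS

# repr() makes the otherwise invisible control characters visible.
print(repr(record))
```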

Writing it out is simple:

import csv

with open('ascii_delim.adt', 'w') as f:
    writer = csv.writer(f, delimiter=chr(31), lineterminator=chr(30))
    writer.writerow(('Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue'))
    writer.writerow(('Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!'))

And, sure enough, the data gets dumped correctly. However, when reading, lineterminator does nothing, and if I try this:

open('ascii_delim.adt', newline=chr(30))

it throws ValueError: illegal newline value:
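The error arises because open() only accepts a fixed set of newline values: None, '', '\n', '\r' and '\r\n'. A small sketch to confirm this, using a temporary file created purely for the demonstration (the variable names are made up for this example):

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)

accepted = []
rejected = []
for candidate in (None, '', '\n', '\r', '\r\n', chr(30)):
    try:
        with open(path, newline=candidate):
            accepted.append(candidate)
    except ValueError:
        # open() refuses any newline value outside the fixed set
        rejected.append(candidate)

os.remove(path)
print('accepted:', accepted)
print('rejected:', rejected)
```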

So how do I read my ASCII-delimited file? Am I relegated to doing line.split(chr(30))?


2021-01-20

1 answer

一尘不染

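You can do it by effectively translating the end-of-line characters in the file into the newline characters that csv.reader is hard-coded to recognize: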

import csv

with open('ascii_delim.adt', 'w') as f:
    writer = csv.writer(f, delimiter=chr(31), lineterminator=chr(30))
    writer.writerow(('Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue'))
    writer.writerow(('Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!'))

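def readlines(f, newline='\n'):
    # Yield lines from f one at a time, replacing each `newline`
    # terminator with '\n' so that csv.reader recognizes it.
    while True:
        line = []
        while True:
            ch = f.read(1)
            if ch == '':  # end of file?
                if line:  # emit a final record that has no terminator
                    yield ''.join(line)
                return
            elif ch == newline:  # end of line?
                line.append('\n')
                break
            line.append(ch)
        yield ''.join(line)

# Open in text mode so read(1) returns str (in binary mode it returns bytes,
# which never compare equal to ''); newline='' disables newline translation.
with open('ascii_delim.adt', newline='') as f:
    reader = csv.reader(readlines(f, newline=chr(30)), delimiter=chr(31))
    for row in reader:
        print(row)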

Output:

['Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue']
['Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!']
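
If reading one character at a time turns out to be too slow on large files, one possible variation reads in chunks and splits on the record separator. This is only a sketch: `read_records` and `chunk_size` are names invented for this example, and an `io.StringIO` stands in for the file written above.

```python
import csv
import io

def read_records(f, newline=chr(30), chunk_size=4096):
    # Yield newline-terminated lines from f, splitting on `newline`.
    buffer = ''
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last terminator is a complete record;
        # the remainder stays in the buffer until more data arrives.
        *complete, buffer = buffer.split(newline)
        for record in complete:
            yield record + '\n'
    if buffer:  # trailing record without a terminator
        yield buffer + '\n'

# In-memory stand-in for the contents of 'ascii_delim.adt'.
data = io.StringIO(
    chr(31).join(['Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue']) + chr(30)
    + chr(31).join(['Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!']) + chr(30)
)

rows = list(csv.reader(read_records(data), delimiter=chr(31)))
print(rows)
```

Like the character-by-character version, this treats every chr(30) as a record boundary, so it does not handle a quoted field that itself contains the separator.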
2021-01-20