How can I read one line from standard input and pass the rest to a subprocess?

小能豆

If you call readline() on sys.stdin, passing the rest of the input to a subprocess does not seem to work.

import subprocess
import sys

header = sys.stdin.buffer.readline()
print(header)
subprocess.run(['nl'], check=True)

(I use sys.stdin.buffer to avoid any encoding issues; this handle returns raw bytes.)

It runs, but I get no output from the subprocess:

bash$ printf '%s\n' foo bar baz | python demo1.py
b'foo\n'

If I take out the readline and so on, the subprocess reads standard input and produces the output I expect.

bash$ printf '%s\n' foo bar baz |
> python -c 'import subprocess; subprocess.run(["nl"], check=True)'
     1  foo
     2  bar
     3  baz

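Is Python buffering the rest of stdin when I start reading, or what is going on here? Running with python -u does not remove the problem (in fact, its documentation only mentions that it changes the behavior of stdout and stderr). But if I pass in a larger amount of data, I do get some of it: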

bash$ wc -l /etc/services
   13921 /etc/services

bash$ python demo1.py </etc/services  | head -n 3
     1     27/tcp     # NSW User System FE
     2  #                          Robert Thomas <BThomas@F.BBN.COM>
     3  #                28/tcp    Unassigned
 (... traceback from broken pipe elided ...)

bash$  fgrep -n 'NSW User System FE' /etc/services 
91:nsw-fe           27/udp     # NSW User System FE
92:nsw-fe           27/tcp     # NSW User System FE

bash$ sed -n '1,/NSW User System FE/p' /etc/services | wc
      91     449    4082

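(So it looks like it ate 4096 bytes from the beginning.)

Is there any way to avoid this behavior, though? I just want to read one line from the start and pass the rest on to the subprocess.

Calling sys.stdin.buffer.readline(-1) repeatedly in a loop does not help.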

2024-12-18

1 answer

小能豆

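This happens because sys.stdin is created with the built-in open function in its default buffered mode, which uses a buffer of size io.DEFAULT_BUFFER_SIZE; on most systems that is 4096 or 8192 bytes. The first readline() therefore pulls up to a whole buffer of data out of the pipe, and the child process only sees whatever is left after that.

To make the parent process consume exactly one line of text from standard input, you can open it with buffering disabled by passing 0 as the buffering argument to open or os.fdopen: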
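The over-read can be reproduced without a shell pipeline. The following is a minimal sketch, assuming a POSIX-like system where os.pipe is available; the helper name bytes_left_after_buffered_readline is made up for illustration. It writes three lines into a pipe, reads one line through a default-buffered file object, and then asks the raw descriptor what is still there.

```python
import os


def bytes_left_after_buffered_readline(data):
    """Return (first line, bytes still readable from the pipe afterwards).

    Hypothetical helper for illustration; data must be small enough to fit
    in the pipe's kernel buffer, because it is written before anyone reads.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, data)
    os.close(write_fd)  # close the write end so the reader can see EOF

    # buffering=-1 is the default, which wraps the descriptor in a
    # buffered reader, much like sys.stdin.buffer.
    with os.fdopen(read_fd, "rb", buffering=-1) as reader:
        line = reader.readline()
        # Bypass the file object and read directly from the descriptor.
        leftover = os.read(read_fd, 4096)
    return line, leftover


print(bytes_left_after_buffered_readline(b"foo\nbar\nbaz\n"))
```

With this small input the call should print (b'foo\n', b''): readline() returned a single line, but the buffered reader had already drained the other two lines from the pipe, so nothing is left for anyone else who reads the same descriptor.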

# subp1.py
import os
import sys
import subprocess

# or with the platform-dependent device file:
# unbuffered_stdin = open('/dev/stdin', 'rb', buffering=0)
unbuffered_stdin = os.fdopen(sys.stdin.fileno(), 'rb', buffering=0)

print(unbuffered_stdin.readline())
subprocess.run(['nl'], check=True)

so that:

printf "foo\nbar\n" | python subp1.py

outputs:

b'foo\n'
     1  bar

2024-12-18