i want to read bytes. sys.stdin is opened in text mode, yet it has a buffer that can be used to read bytes: sys.stdin.buffer.
my problem is that when i pipe data into python, i only seem to have 2 options if i want readahead; anything else gives me an io.UnsupportedOperation: File or stream is not seekable.
- reading buffered text from sys.stdin, encoding that text back to bytes, and seeking back (sys.stdin.read(1).encode(); sys.stdin.seek(-1, io.SEEK_CUR)). unacceptable due to undecodable bytes in the input stream.
- using peek to get some bytes from stdin’s buffer, slicing that to the appropriate number, and praying, as peek doesn’t guarantee anything: it may give less or more than you request… (sys.stdin.buffer.peek(1)[:1]). peek is really underdocumented and gives you a bunch of bytes that you have to performance-intensively slice.
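to make the second option concrete, here is a minimal sketch of it. the helper name peek_byte is made up for illustration, and an io.BufferedReader around an io.BytesIO stands in for sys.stdin.buffer so the snippet runs without a pipe. it assumes a blocking stream, where (as far as i understand the documented behaviour) peek only comes back empty at end of input.

```python
import io

def peek_byte(stream):
    """Return the next byte of a BufferedReader without consuming it.

    The result is a bytes object of length 1, or b"" at end of input.
    Assumption: the stream is blocking, so peek() is only empty at EOF.
    """
    # peek() may return more (or fewer) bytes than requested,
    # so the result is sliced down to at most one byte.
    return stream.peek(1)[:1]

# Stand-in for sys.stdin.buffer; the first byte is not valid UTF-8 on purpose.
demo = io.BufferedReader(io.BytesIO(b"\xffhi"))

first = peek_byte(demo)    # looks at the byte, cursor stays put
again = peek_byte(demo)    # same byte again
consumed = demo.read(1)    # now the cursor actually advances
```

with real input this would be called as peek_byte(sys.stdin.buffer). no text decoding is involved, so undecodable bytes are not a problem, but the slicing cost mentioned above is still there.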
btw. that error really only applies when piping: for ./myscript.py <somefile, sys.stdin.buffer supports seeking. yet sys.stdin is always the same hierarchy of objects:
$ cat testio.py
#!/usr/bin/env python3
from sys import stdin
print(stdin)
print(stdin.buffer)
print(stdin.buffer.raw)
$ ./testio.py
<_io.TextIOWrapper name='<stdin>' mode='r' encoding='UTF-8'>
<_io.BufferedReader name='<stdin>'>
<_io.FileIO name='<stdin>' mode='rb'>
$ ./testio.py <somefile
[the same as above]
$ echo hi | ./testio.py
[the same as above]
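the repr is identical in all three cases, but seekable() does tell them apart. a small sketch, assuming a POSIX system: the function name seekability_demo is made up for illustration, os.pipe() stands in for `echo hi | ./testio.py`, and a temporary file stands in for `<somefile`.

```python
import os
import tempfile

def seekability_demo():
    """Return (pipe_is_seekable, file_is_seekable) for two binary readers."""
    # A pipe, as the script would see it with `echo hi | ./testio.py`.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"hi")
    os.close(write_fd)
    with open(read_fd, "rb") as piped:      # a BufferedReader, like stdin.buffer
        pipe_is_seekable = piped.seekable()

    # A regular file, as with `./testio.py <somefile`.
    with tempfile.TemporaryFile() as regular:
        regular.write(b"hi")
        regular.seek(0)
        file_is_seekable = regular.seekable()

    return pipe_is_seekable, file_is_seekable
```

so checking sys.stdin.buffer.seekable() up front is one way to find out which situation the script is in, instead of waiting for the io.UnsupportedOperation.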
some initial ideas like wrapping the byte stream into a random access buffer fail with the same error as mentioned above: BufferedRandom(sys.stdin.buffer).seek(0) ⇒ io.UnsupportedOperation…
finally, for your convenience i present:
Python’s io class hierarchy
IOBase
├RawIOBase
│└FileIO
├BufferedIOBase (buffers a RawIOBase)
│├BufferedWriter┐
│├BufferedReader│
││ └─────┴BufferedRandom (implements seeking)
│├BufferedRWPair (pairs a readable and a writable raw stream)
│└BytesIO (wraps a bytes)
└TextIOBase
├TextIOWrapper (wraps a BufferedIOBase)
└StringIO (wraps a str)
and in case you forgot the question: how do i get the next byte from stdin without de/encoding anything, and without advancing the stream’s cursor?
