boto3 StreamingBody to BytesIO
boto3 class intro
- A Session manages state about a particular configuration. A custom session can be created when the default one is not enough.
- A Resource is an object-oriented interface to AWS. Every resource instance has a number of attributes and methods. These can conceptually be split up into identifiers, attributes, actions, references, sub-resources, and collections.
- A Client provides a low-level interface whose methods map closely to the service APIs.
The S3 Service Resource has Bucket and Object sub-resources, as well as related actions.
- Bucket is an abstract resource representing an S3 bucket.
- Object is an abstract resource representing an S3 object.
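The pieces listed above can be sketched together roughly as follows. This is only an illustration: the helper name build_s3_handles, its parameters, and the default region are hypothetical, and nothing is sent to AWS until a method such as get_object() is called.

```python
def build_s3_handles(bucket_name, key, region_name="us-east-1", profile_name=None):
    """Return (session, client, bucket, obj) for one S3 object.

    Hypothetical helper for illustration; the arguments are placeholders.
    """
    # Imported here so the sketch can be loaded even where boto3 is missing.
    import boto3

    # A custom Session pins the configuration (profile, region) explicitly.
    session = boto3.session.Session(
        profile_name=profile_name, region_name=region_name
    )

    # Low-level client: its methods map closely to the S3 service APIs.
    client = session.client("s3")

    # Object-oriented resource with its Bucket and Object sub-resources.
    s3 = session.resource("s3")
    bucket = s3.Bucket(bucket_name)
    obj = s3.Object(bucket_name, key)
    return session, client, bucket, obj
```

With valid credentials, both client.get_object(Bucket=..., Key=...)["Body"] and obj.get()["Body"] should hand back a botocore.response.StreamingBody, which is the object discussed in the rest of this post.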
read S3 object and pipe it to mdfreader
There were a few attempts. The first was to stream the S3 object as a BufferedReader, which gives a file-like object that supports read(). But this BufferedReader behaves more like an I/O stream than a file: it cannot seek.
botocore.response.StreamingBody as BufferedReader
The following discussion is really helpful:
boto3 issue #426: how to use botocore.response.StreamingBody as stdin PIPE
Looking at the code of StreamingBody, it seems to be a wrapper around a class inheriting from io.IOBase, but only the read method of the raw stream is exposed, so it is not really a file-like object. It would make a lot of sense to expose the io.IOBase interface in StreamingBody, as we could then wrap S3 objects in an io.BufferedReader or an io.TextIOWrapper. read() returns a byte string. The actual file-like object is found in the ._raw_stream attribute of the StreamingBody class.
However, this buff_reader is not seekable, which makes mdfreader fail, because its file operations need the seek() method.
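The failure can be reproduced without S3. In the sketch below, NonSeekableRaw is a hypothetical stand-in for the raw network stream (it is not the real ._raw_stream object), built so that it can be read forward but cannot seek:

```python
import io


class NonSeekableRaw(io.RawIOBase):
    """Hypothetical stand-in for a network stream: readable, forward-only."""

    def __init__(self, payload):
        super().__init__()
        self._source = io.BytesIO(payload)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, buffer):
        # Copy the next chunk into the caller's buffer and report its size.
        chunk = self._source.read(len(buffer))
        buffer[:len(chunk)] = chunk
        return len(chunk)


buff_reader = io.BufferedReader(NonSeekableRaw(b"0123456789"))
header = buff_reader.read(4)        # reading forward works
can_seek = buff_reader.seekable()   # reports the raw stream's answer

try:
    buff_reader.seek(0)             # what a file parser would attempt
    seek_error = None
except io.UnsupportedOperation as exc:
    seek_error = exc
```

Any library that rewinds or jumps around in the file, as mdfreader does, hits the same io.UnsupportedOperation.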
stream a non-seekable file-like object
stdio stream to seekable file-like object
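So I considered converting the BufferedReader into a seekable file-like object. First, I needed to understand why it is not seekable. A buffered stream is only as seekable as the raw stream underneath it: BufferedRandom requires a seekable raw stream, whereas BufferedReader and BufferedWriter accept any raw stream and simply report whether it can seek. A BufferedReader over a disk file is seekable; here the raw stream is a network response, which is not. Buffered streams design: BufferedRandom is only suitable when the file is open for reading and writing. The 'rb' and 'wb' modes should return BufferedReader and BufferedWriter, respectively.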
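A quick way to check how the buffered classes relate to open() modes and to seekability is the sketch below. The scratch file and the ForwardOnlyRaw class are made up for the illustration.

```python
import io
import os
import tempfile

# Write a small scratch file that can be opened in different modes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

try:
    with open(path, "rb") as reader:
        reader_type = type(reader).__name__
        reader_seekable = reader.seekable()   # a disk file can seek
    with open(path, "r+b") as random_access:
        random_type = type(random_access).__name__
finally:
    os.remove(path)


class ForwardOnlyRaw(io.RawIOBase):
    """Hypothetical raw stream that can be read but never seeks."""

    def readable(self):
        return True

    def writable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, buffer):
        return 0  # always at end of stream


# BufferedRandom refuses a raw stream that cannot seek.
try:
    io.BufferedRandom(ForwardOnlyRaw())
    random_error = None
except io.UnsupportedOperation as exc:
    random_error = exc
```

So the class name alone does not decide seekability: the 'rb' file gives a seekable BufferedReader, while the forward-only raw stream cannot be wrapped in BufferedRandom at all.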
Is it possible to first read() the content of the BufferedReader into memory, then hand it to a BufferedRandom? That led me to try BufferedReader.read(), which reads all the bytes and stores them in memory (call it in-memoryA). The good news: in-memory binary streams are also available as BytesIO objects:
f = io.BytesIO(b"some in-memory binary data")
What if in-memoryA is wrapped in a BytesIO? That really gives a seekable object:
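import io
# mf4_ is assumed to be the response dict returned by get_object()
fid_ = io.BufferedReader(mf4_['Body']._raw_stream)
read_in_memory = fid_.read()  # pulls the whole object into memory
bio_ = io.BytesIO(read_in_memory)  # seekable in-memory copy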
The BytesIO object is then much more file-like: it supports both read() and seek().
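The same conversion can be tried end to end without S3. In this sketch an OS pipe plays the role of the non-seekable stream, and to_seekable is a hypothetical helper name, not part of boto3 or mdfreader:

```python
import io
import os


def to_seekable(stream):
    """Read a forward-only binary stream fully and return a seekable BytesIO."""
    return io.BytesIO(stream.read())


# A pipe behaves like a network stream: it can be read once, front to back.
read_fd, write_fd = os.pipe()
with open(write_fd, "wb") as writer:
    writer.write(b"header|body")

with open(read_fd, "rb") as forward_only:
    was_seekable = forward_only.seekable()
    bio_ = to_seekable(forward_only)

first = bio_.read(6)            # read the first six bytes
bio_.seek(0)                    # rewind, which the pipe could not do
again = bio_.read(6)
bio_.seek(-4, io.SEEK_END)      # jump relative to the end
tail = bio_.read()
```

The trade-off is that the whole object sits in memory, so this suits files that comfortably fit in RAM.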
references
what is the concept behind file pointer or stream pointer
using io.BufferedReader on a stream obtained with open
working with binary data in python