Answer a question

I need to pickle a Python3 object to a string which I want to unpickle from an environmental variable in a Travis CI build. The problem is that I can't seem to find a way to pickle to a portable string (unicode) in Python3:

import os, pickle    

from my_module import MyPickleableClass


obj = {'cls': MyPickleableClass, 'other_stuf': '(...)'}

pickled = pickle.dumps(obj)

# raises TypeError: str expected, not bytes
os.environ['pickled'] = pickled

# raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb (...)
os.environ['pickled'] = pickled.decode('utf-8')

pickle.loads(os.environ['pickled'])

Is there a way to serialize complex objects like datetime.datetime to unicode or to some other string representation in Python3 which I can transfer to a different machine and deserialize?

Update

I have tested the solutions suggested by @kindall, but the pickle.dumps(obj, 0).decode() raises a UnicodeDecodeError. Nevertheless the base64 approach works but it needed an extra decode/encode step. The solution works on both Python2.x and Python3.x.

# encode returns bytes so it needs to be decoded to string
pickled = pickle.loads(codecs.decode(pickled.encode(), 'base64')).decode()

type(pickled)  # <class 'str'>

unpickled = pickle.loads(codecs.decode(pickled.encode(), 'base64'))

Answers

pickle.dumps() produces a bytes object. Expecting these arbitrary bytes to be valid UTF-8 text (the assumption you are making by trying to decode it to a string from UTF-8) is pretty optimistic. It'd be a coincidence if it worked!

One solution is to use the older pickling protocol that uses entirely ASCII characters. This still comes out as bytes, but since those bytes contain only ASCII code points, it can be converted to a string without stress:

pickled = str(pickle.dumps(obj, 0))

You could also use some other encoding method to encode a binary-pickled object to text, such as base64:

import codecs
pickled = codecs.encode(pickle.dumps(obj), "base64").decode()

Decoding would then be:

unpickled = pickle.loads(codecs.decode(pickled.encode(), "base64"))

Using pickle with protocol 0 seems to result in shorter strings than base64-encoding binary pickles (and abarnert's suggestion of hex-encoding is going to be even larger than base64), but I haven't tested it rigorously or anything. Test it with your data and see.

Logo

Python社区为您提供最前沿的新闻资讯和知识内容

更多推荐