# A tool for inspecting Python pickles

AUTHORS:

• Carl Witty (2009-03)

The explain_pickle function takes a pickle and produces Sage code that will evaluate to the contents of the pickle. Ideally, the combination of explain_pickle to produce Sage code and sage_eval to evaluate the code would be a 100% compatible implementation of cPickle’s unpickler; this is almost the case now.

EXAMPLES:

sage: explain_pickle(dumps(12345))
pg_make_integer = unpickle_global('sage.rings.integer', 'make_integer')
pg_make_integer('c1p')
sage: explain_pickle(dumps(polygen(QQ)))
pg_Polynomial_rational_flint = unpickle_global('sage.rings.polynomial.polynomial_rational_flint', 'Polynomial_rational_flint')
pg_unpickle_PolynomialRing = unpickle_global('sage.rings.polynomial.polynomial_ring_constructor', 'unpickle_PolynomialRing')
pg_RationalField = unpickle_global('sage.rings.rational_field', 'RationalField')
pg = unpickle_instantiate(pg_RationalField, ())
pg_make_rational = unpickle_global('sage.rings.rational', 'make_rational')
pg_Polynomial_rational_flint(pg_unpickle_PolynomialRing(pg, ('x',), None, False), [pg_make_rational('0'), pg_make_rational('1')], False, True)
sage: sage_eval(explain_pickle(dumps(polygen(QQ)))) == polygen(QQ)
True


By default (as above) the code produced contains calls to several utility functions (unpickle_global, etc.); this is done so that the code is truly equivalent to the pickle. If the pickle can be loaded into a future version of Sage, then the code that explain_pickle produces today should work in that future Sage as well.

It is also possible to produce simpler code that is tied to the current version of Sage; here are the above two examples again:

sage: explain_pickle(dumps(12345), in_current_sage=True)
from sage.rings.integer import make_integer
make_integer('c1p')
sage: explain_pickle(dumps(polygen(QQ)), in_current_sage=True)
from sage.rings.polynomial.polynomial_rational_flint import Polynomial_rational_flint
from sage.rings.polynomial.polynomial_ring_constructor import unpickle_PolynomialRing
from sage.rings.rational import make_rational
Polynomial_rational_flint(unpickle_PolynomialRing(RationalField(), ('x',), None, False), [make_rational('0'), make_rational('1')], False, True)


The explain_pickle function has several use cases.

• Write pickling support for your classes

You can use explain_pickle to see what will happen when a pickle is unpickled. Consider: is this sequence of commands something that can be easily supported in all future Sage versions, or does it expose internal design decisions that are subject to change?

• Debug old pickles

If you have a pickle from an old version of Sage that no longer unpickles, you can use explain_pickle to see what it is trying to do, to figure out how to fix it.

• Use explain_pickle in doctests to help maintenance

If you have a loads(dumps(S)) doctest, you could also add an explain_pickle(dumps(S)) doctest. Then if something changes in a way that would invalidate old pickles, the output of explain_pickle will also change. At that point, you can add the previous output of explain_pickle as a new set of doctests (and then update the explain_pickle doctest to use the new output), to ensure that old pickles will continue to work.

As mentioned above, explain_pickle has several output modes that trade fidelity against simplicity of the output. For example, the GLOBAL instruction takes a module name and a class name and produces the corresponding class. So GLOBAL of sage.rings.integer, Integer is approximately equivalent to sage.rings.integer.Integer.

However, this class lookup process can be customized (using sage.misc.persist.register_unpickle_override). For instance, if some future version of Sage renamed sage/rings/integer.pyx to sage/rings/knuth_was_here.pyx, old pickles would no longer work unless register_unpickle_override was used; in that case, GLOBAL of 'sage.rings.integer', 'Integer' would mean sage.rings.knuth_was_here.Integer.

By default, explain_pickle will map this GLOBAL instruction to unpickle_global('sage.rings.integer', 'Integer'). Then when this code is evaluated, unpickle_global will look up the current mapping in the register_unpickle_override table, so the generated code will continue to work even in hypothetical future versions of Sage where integer.pyx has been renamed.
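The override lookup can be imitated with the standard library alone. The following is only a sketch and not Sage's implementation: OVERRIDES, register_override, lookup_global and OverridingUnpickler are hypothetical names, and fractions.Fraction stands in for a class whose module has been renamed.

```python
import importlib
import io
import pickle

# Hypothetical override table: (module, name) -> replacement object.
OVERRIDES = {}

def register_override(module, name, replacement):
    """Record that GLOBAL (module, name) should resolve to replacement."""
    OVERRIDES[(module, name)] = replacement

def lookup_global(module, name):
    """Resolve a GLOBAL reference, consulting the override table first."""
    try:
        return OVERRIDES[(module, name)]
    except KeyError:
        return getattr(importlib.import_module(module), name)

class OverridingUnpickler(pickle.Unpickler):
    """An Unpickler whose class lookup goes through lookup_global."""
    def find_class(self, module, name):
        return lookup_global(module, name)

# Without an override, the lookup is an ordinary import plus getattr.
default = lookup_global('fractions', 'Fraction')

# Pretend that old pickles refer to a module that no longer exists.
register_override('old_fractions', 'Fraction', default)
redirected = lookup_global('old_fractions', 'Fraction')

# A hand-written protocol-0 pickle: GLOBAL 'old_fractions Fraction', STOP.
old_pickle = b'cold_fractions\nFraction\n.'
loaded = OverridingUnpickler(io.BytesIO(old_pickle)).load()
```

Because the table is consulted when the pickle is loaded rather than when it is written, the same old pickle keeps working after the rename, which is the property the default output of explain_pickle is meant to preserve.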

If you pass the flag in_current_sage=True, then explain_pickle will generate code that may only work in the current version of Sage, not in future versions. In this case, it would generate:

from sage.rings.integer import Integer


and if you ran explain_pickle in a hypothetical future Sage, it would generate:

from sage.rings.knuth_was_here import Integer

but the code generated today would not work in that future Sage.

If you pass the flag default_assumptions=True, then explain_pickle will generate code that would work in the absence of any special unpickling information. That is, in either current Sage or hypothetical future Sage, it would generate:

from sage.rings.integer import Integer


The intention is that default_assumptions output is prettier (more human-readable), but may not actually work; so it is only intended for human reading.

There are several functions used in the output of explain_pickle. Here I give a brief description of what they usually do, as well as how to modify their operation (for instance, if you’re trying to get old pickles to work).

• unpickle_global(module, classname): unpickle_global('sage.foo.bar', 'baz') is usually equivalent to sage.foo.bar.baz, but this can be customized with register_unpickle_override.
• unpickle_newobj(klass, args): Usually equivalent to klass.__new__(klass, *args). If klass is a Python class, then you can define __new__() to control the result (this result actually need not be an instance of klass). (This doesn’t work for Cython classes.)
• unpickle_build(obj, state): If obj has a __setstate__() method, then this is equivalent to obj.__setstate__(state). Otherwise uses state to set the attributes of obj. Customize by defining __setstate__().
• unpickle_instantiate(klass, args): Usually equivalent to klass(*args). Cannot be customized.
• unpickle_appends(lst, vals): Appends the values in vals to lst. If not isinstance(lst, list), can be customized by defining an append() method.
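The helper names above appear to mirror pickle opcodes (GLOBAL, NEWOBJ, BUILD, APPENDS and so on). As a rough illustration that uses only the standard library, the sketch below lists the opcodes in a protocol 2 pickle of an ordinary Python object; argparse.Namespace is an arbitrary stand-in chosen because it is a plain class with instance attributes.

```python
import argparse
import pickle
import pickletools

# An ordinary instance with one attribute, pickled with protocol 2.
obj = argparse.Namespace(hello=42)
data = pickle.dumps(obj, protocol=2)

# Collect the opcode names in the order the unpickler would execute them.
names = [op.name for op, _arg, _pos in pickletools.genops(data)]

# GLOBAL finds the class, NEWOBJ creates the instance without calling
# __init__, and BUILD installs the attribute dictionary.
interesting = [n for n in names if n in ('GLOBAL', 'NEWOBJ', 'BUILD')]
```

Each of these opcodes is a point where the default output of explain_pickle would call one of the helper functions instead of hard-coding the current behaviour.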
class sage.misc.explain_pickle.EmptyNewstyleClass

Bases: object

A featureless new-style class (inherits from object); used for testing explain_pickle.

class sage.misc.explain_pickle.EmptyOldstyleClass

A featureless old-style class (does not inherit from object); used for testing explain_pickle.

class sage.misc.explain_pickle.PickleDict(items)

Bases: object

An object which can be used as the value of a PickleObject. The items is a list of key-value pairs, where the keys and values are SageInputExpressions. We use this to help construct dictionary literals, instead of always starting with an empty dictionary and assigning to it.

class sage.misc.explain_pickle.PickleExplainer(sib, in_current_sage=False, default_assumptions=False, pedantic=False)

Bases: object

An interpreter for the pickle virtual machine, that executes symbolically and constructs SageInputExpressions instead of directly constructing values.
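To make the idea of symbolic execution concrete, here is a heavily simplified sketch built on pickletools.genops. It is not PickleExplainer: it handles only a handful of protocol 2 opcodes, produces plain strings rather than SageInputExpressions, and ignores sharing. The names MARK, render, items_above_mark and explain_simple are hypothetical.

```python
import pickle
import pickletools

MARK = object()  # sentinel standing in for the VM's 'mark'

def render(value):
    """Turn a stack value (source string or pending list) into source text."""
    if isinstance(value, list):
        return '[' + ', '.join(render(v) for v in value) + ']'
    return value

def items_above_mark(stack):
    """Pop everything above the topmost mark, returned in push order."""
    items = []
    while stack[-1] is not MARK:
        items.append(stack.pop())
    stack.pop()  # discard the mark itself
    return items[::-1]

def explain_simple(data):
    """Symbolically run a small subset of protocol-2 opcodes."""
    stack, memo = [], {}
    for op, arg, _pos in pickletools.genops(data):
        name = op.name
        if name == 'PROTO':
            pass                                  # no stack effect
        elif name in ('BININT', 'BININT1', 'BININT2', 'BINUNICODE'):
            stack.append(repr(arg))               # a literal is its own source
        elif name == 'NONE':
            stack.append('None')
        elif name == 'EMPTY_LIST':
            stack.append([])                      # stays mutable until rendered
        elif name == 'MARK':
            stack.append(MARK)
        elif name in ('BINPUT', 'LONG_BINPUT'):
            memo[arg] = stack[-1]
        elif name in ('BINGET', 'LONG_BINGET'):
            stack.append(memo[arg])
        elif name == 'APPEND':
            item = stack.pop()
            stack[-1].append(item)
        elif name == 'APPENDS':
            items = items_above_mark(stack)
            stack[-1].extend(items)
        elif name in ('TUPLE1', 'TUPLE2', 'TUPLE3'):
            n = int(name[-1])
            items = [render(stack.pop()) for _ in range(n)][::-1]
            stack.append('(' + ', '.join(items) + (',)' if n == 1 else ')'))
        elif name == 'STOP':
            return render(stack.pop())
        else:
            raise NotImplementedError(name)

value = [1, 'a', (2, None)]
code = explain_simple(pickle.dumps(value, protocol=2))
```

Evaluating the returned string gives back a value equal to the original, which is the same round-trip property that explain_pickle combined with sage_eval aims for on real pickles.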

APPEND()
APPENDS()
BINFLOAT(f)
BINGET(n)
BININT(n)
BININT1(n)
BININT2(n)
BINPERSID()
BINPUT(n)
BINSTRING(s)
BINUNICODE(s)
BUILD()
DICT()
DUP()
EMPTY_DICT()
EMPTY_LIST()
EMPTY_TUPLE()
EXT1(n)
EXT2(n)
EXT4(n)
FLOAT(f)
GET(n)
GLOBAL(name)
INST(name)
INT(n)
LIST()
LONG(n)
LONG1(n)
LONG4(n)
LONG_BINGET(n)
LONG_BINPUT(n)
MARK()
NEWFALSE()
NEWOBJ()
NEWTRUE()
NONE()
OBJ()
PERSID(id)
POP()
POP_MARK()
PROTO(proto)
PUT(n)
REDUCE()
SETITEM()
SETITEMS()
SHORT_BINSTRING(s)
STOP()
STRING(s)
TUPLE()
TUPLE1()
TUPLE2()
TUPLE3()
UNICODE(s)
check_value(v)

Check that the given value is either a SageInputExpression or a PickleObject. Used for internal sanity checking.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.check_value(7)
Traceback (most recent call last):
...
AssertionError
sage: pe.check_value(sib(7))

is_mutable_pickle_object(v)

Test whether a PickleObject is mutable (has never been converted to a SageInputExpression).

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: v = PickleObject(1, sib(1))
sage: pe.is_mutable_pickle_object(v)
True
sage: sib(v)
{atomic:1}
sage: pe.is_mutable_pickle_object(v)
False

pop()

Pop a value from the virtual machine’s stack, and return it.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push(sib(7))
sage: pe.pop()
{atomic:7}

pop_to_mark()

Pop all values down to the ‘mark’ from the virtual machine’s stack, and return the values as a list.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push_mark()
sage: pe.push(sib(7))
sage: pe.push(sib('hello'))
sage: pe.pop_to_mark()
[{atomic:7}, {atomic:'hello'}]

push(v)

Push a value onto the virtual machine’s stack.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push(sib(7))
sage: pe.stack[-1]
{atomic:7}

push_and_share(v)

Push a value onto the virtual machine’s stack; also mark it as shared for sage_input if we are in pedantic mode.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push_and_share(sib(7))
sage: pe.stack[-1]
{atomic:7}
sage: pe.stack[-1]._sie_share
True

push_mark()

Push a ‘mark’ onto the virtual machine’s stack.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push_mark()
sage: pe.stack[-1]
'mark'
sage: pe.stack[-1] is the_mark
True

run_pickle(p)

Given an (uncompressed) pickle as a string, run the pickle in this virtual machine. Once a STOP has been executed, return the result (a SageInputExpression representing code which, when evaluated, will give the value of the pickle).

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: sib(pe.run_pickle('T\5\0\0\0hello.'))  # py2
{atomic:'hello'}

share(v)

Mark a sage_input value as shared, if we are in pedantic mode.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: v = sib(7)
sage: v._sie_share
False
sage: pe.share(v)
{atomic:7}
sage: v._sie_share
True

class sage.misc.explain_pickle.PickleInstance(klass)

Bases: object

An object which can be used as the value of a PickleObject. Unlike other possible values of a PickleObject, a PickleInstance doesn’t represent an exact value; instead, it gives the class (type) of the object.

class sage.misc.explain_pickle.PickleObject(value, expression)

Bases: object

Pickles have a stack-based virtual machine. The explain_pickle pickle interpreter mostly uses SageInputExpressions, from sage_input, as the stack values. However, sometimes we want some more information about the value on the stack, so that we can generate better (prettier, less confusing) code. In such cases, we push a PickleObject instead of a SageInputExpression. A PickleObject contains a value (which may be a standard Python value, or a PickleDict or PickleInstance), an expression (a SageInputExpression), and an “immutable” flag (which records whether this object has been converted to a SageInputExpression; if it has, then we must not mutate the object, since the SageInputExpression would not reflect the changes).

class sage.misc.explain_pickle.TestAppendList

Bases: list

A subclass of list, with deliberately-broken append and extend methods. Used for testing explain_pickle.

append()

A deliberately broken append method.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: v = TestAppendList()
sage: v.append(7)  # py2
Traceback (most recent call last):
...
TypeError: append() takes exactly 1 argument (2 given)
sage: v.append(7)  # py3
Traceback (most recent call last):
...
TypeError: append() takes 1 positional argument but 2 were given

We can still append by directly using the list method:

sage: list.append(v, 7)
sage: v
[7]

extend()

A deliberately broken extend method.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: v = TestAppendList()
sage: v.extend([3,1,4,1,5,9])  # py2
Traceback (most recent call last):
...
TypeError: extend() takes exactly 1 argument (2 given)
sage: v.extend([3,1,4,1,5,9])  # py3
Traceback (most recent call last):
...
TypeError: extend() takes 1 positional argument but 2 were given

We can still extend by directly using the list method:

sage: list.extend(v, (3,1,4,1,5,9))
sage: v
[3, 1, 4, 1, 5, 9]

class sage.misc.explain_pickle.TestAppendNonlist

Bases: object

A list-like class, carefully designed to test exact unpickling behavior. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestBuild

Bases: object

A simple class with a __getstate__ but no __setstate__. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestBuildSetstate

A simple class with a __getstate__ and a __setstate__. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestGlobalFunnyName

Bases: object

A featureless new-style class which has a name that’s not a legal Python identifier.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: globals()['funny$name'] = TestGlobalFunnyName # see comment at end of file
sage: TestGlobalFunnyName.__name__
'funny$name'
sage: globals()['funny$name'] is TestGlobalFunnyName
True

class sage.misc.explain_pickle.TestGlobalNewName

Bases: object

A featureless new-style class. When you try to unpickle an instance of TestGlobalOldName, it is redirected to create an instance of this class instead. Used for testing explain_pickle.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
TestGlobalNewName

class sage.misc.explain_pickle.TestGlobalOldName

Bases: object

A featureless new-style class. When you try to unpickle an instance of this class, it is redirected to create a TestGlobalNewName instead. Used for testing explain_pickle.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
TestGlobalNewName

class sage.misc.explain_pickle.TestReduceGetinitargs

An old-style class with a __getinitargs__ method. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestReduceNoGetinitargs

An old-style class with no __getinitargs__ method. Used for testing explain_pickle.

sage.misc.explain_pickle.explain_pickle(pickle=None, file=None, compress=True, **kwargs)

Explain a pickle. That is, produce source code such that evaluating the code is equivalent to loading the pickle. Feeding the result of explain_pickle to sage_eval should be totally equivalent to loading the pickle with cPickle.

INPUT:

• pickle – the pickle to explain, as a string (default: None)
• file – a filename of a pickle (default: None)
• compress – if False, don’t attempt to decompress the pickle (default: True)
• in_current_sage – if True, produce potentially simpler code that is tied to the current version of Sage. (default: False)
• default_assumptions – if True, produce potentially simpler code that assumes that generic unpickling code will be used. This code may not actually work. (default: False)
• eval – if True, then evaluate the resulting code and return the evaluated result. (default: False)
• preparse – if True, then produce code to be evaluated with Sage’s preparser; if False, then produce standard Python code; if None, then produce code that will work either with or without the preparser. (default: True)
• pedantic – if True, then carefully ensures that the result has at least as much sharing as the result of cPickle (it may have more, for immutable objects). (default: False)

Exactly one of pickle (a string containing a pickle) or file (the filename of a pickle) must be provided.

EXAMPLES:

sage: explain_pickle(dumps({('a', 'b'): [1r, 2r]}))
{('a', 'b'):[1r, 2r]}
sage: explain_pickle(dumps(RR(pi)), in_current_sage=True)
from sage.rings.real_mpfr import __create__RealNumber_version0
from sage.rings.real_mpfr import __create__RealField_version0
__create__RealNumber_version0(__create__RealField_version0(53r, False, 'RNDN'), '3.4gvml245kc0@0', 32r)
sage: s = 'hi'
sage: explain_pickle(dumps((s, s)))
('hi', 'hi')
sage: explain_pickle(dumps((s, s)), pedantic=True)
si = 'hi'
(si, si)
sage: explain_pickle(dumps(5r))
5r
sage: explain_pickle(dumps(5r), preparse=False)
5
sage: explain_pickle(dumps(5r), preparse=None)
int(5)
sage: explain_pickle(dumps(22/7))
pg_make_rational = unpickle_global('sage.rings.rational', 'make_rational')
pg_make_rational('m/7')
sage: explain_pickle(dumps(22/7), in_current_sage=True)
from sage.rings.rational import make_rational
make_rational('m/7')
sage: explain_pickle(dumps(22/7), default_assumptions=True)
from sage.rings.rational import make_rational
make_rational('m/7')

sage.misc.explain_pickle.explain_pickle_string(pickle, in_current_sage=False, default_assumptions=False, eval=False, preparse=True, pedantic=False)

This is a helper function for explain_pickle. It takes a decompressed pickle string as input; other than that, its options are all the same as explain_pickle.

EXAMPLES:

sage: sage.misc.explain_pickle.explain_pickle_string(dumps("Hello, world", compress=False))
'Hello, world'


(See the documentation for explain_pickle for many more examples.)

sage.misc.explain_pickle.name_is_valid(name)

Test whether a string is a valid Python identifier. (We use a conservative test, that only allows ASCII identifiers.)

EXAMPLES:

sage: from sage.misc.explain_pickle import name_is_valid
sage: name_is_valid('fred')
True
sage: name_is_valid('Yes!ValidName')
False
sage: name_is_valid('_happy_1234')
True
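A conservative ASCII-only identifier test of this kind can be written with a single regular expression. This is a sketch under that assumption, not necessarily the code Sage uses; ascii_identifier is a hypothetical name.

```python
import re

# One ASCII letter or underscore, then ASCII letters, digits or underscores.
_ASCII_IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')

def ascii_identifier(name):
    """Return True if name consists solely of an ASCII-only identifier."""
    return _ASCII_IDENTIFIER.fullmatch(name) is not None

results = [ascii_identifier(n) for n in ('fred', 'Yes!ValidName', '_happy_1234')]
```

The test is deliberately stricter than Python itself, which also accepts many non-ASCII identifiers; rejecting a legal name only costs a less pretty output, whereas accepting an illegal one would produce code that does not parse.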

sage.misc.explain_pickle.test_pickle(p, verbose_eval=False, pedantic=False, args=())

Tests explain_pickle on a given pickle p. p can be:

• a string containing an uncompressed pickle (which will always end with a ‘.’)
• a string containing a pickle fragment (not ending with ‘.’); test_pickle will synthesize a pickle that will push args onto the stack (using persistent IDs), run the pickle fragment, and then STOP (if the string ‘mark’ occurs in args, then a mark will be pushed)
• an arbitrary object; test_pickle will pickle the object

Once it has a pickle, test_pickle will print the pickle’s disassembly, run explain_pickle with in_current_sage=True and False, print the results, evaluate the results, unpickle the object with cPickle, and compare all three results.

If verbose_eval is True, then test_pickle will print messages before evaluating the pickles; this is to allow for tests where the unpickling prints messages (to verify that the same operations occur in all cases).

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: test_pickle(['a'])  # py2
0: \x80 PROTO      2
2: ]    EMPTY_LIST
3: q    BINPUT     1
5: U    SHORT_BINSTRING 'a'
8: a    APPEND
9: .    STOP
highest protocol among opcodes = 2
explain_pickle in_current_sage=True/False:
['a']
result: ['a']

sage.misc.explain_pickle.unpickle_appends(lst, vals)

Given a list (or list-like object) and a sequence of values, appends the values to the end of the list. This is careful to do so using the exact same technique that cPickle would use. Used by explain_pickle.

EXAMPLES:

sage: v = []
sage: unpickle_appends(v, (1, 2, 3))
sage: v
[1, 2, 3]
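The distinction between lists and list-like objects can be sketched as follows. This assumes the behaviour described in this document (real lists and their subclasses are extended directly, bypassing overridden methods, while anything else goes through its append() method); appends_sketch, BrokenList and Recorder are hypothetical names.

```python
def appends_sketch(lst, vals):
    """Append vals to lst roughly the way an unpickler would."""
    if isinstance(lst, list):
        # Call the base-class method so overridden append/extend are bypassed.
        list.extend(lst, vals)
    else:
        append = lst.append
        for v in vals:
            append(v)

class BrokenList(list):
    """A list subclass whose own append and extend cannot be called normally."""
    def append(self):
        raise NotImplementedError
    def extend(self):
        raise NotImplementedError

class Recorder:
    """A non-list object that only provides append()."""
    def __init__(self):
        self.seen = []
    def append(self, v):
        self.seen.append(v)

broken = BrokenList()
appends_sketch(broken, (1, 2, 3))   # works despite the broken methods

recorder = Recorder()
appends_sketch(recorder, 'ab')      # goes through Recorder.append
```

BrokenList plays the same role here as TestAppendList in this module: it shows that the list branch never calls the subclass's own methods.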

sage.misc.explain_pickle.unpickle_build(obj, state)

Set the state of an object. Used by explain_pickle.

EXAMPLES:

sage: from sage.misc.explain_pickle import *
sage: v = EmptyNewstyleClass()
sage: unpickle_build(v, {'hello': 42})
sage: v.hello
42
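A simplified sketch of the BUILD step follows, assuming only the two cases described earlier (use __setstate__() when it exists, otherwise set attributes from a dictionary). The real function may handle more forms of state; build_sketch, Plain and WithSetstate are hypothetical names.

```python
def build_sketch(obj, state):
    """Install state on obj, preferring obj.__setstate__ when defined."""
    setstate = getattr(obj, '__setstate__', None)
    if setstate is not None:
        setstate(state)
    else:
        # No hook: treat state as a mapping of attribute names to values.
        for key, value in state.items():
            setattr(obj, key, value)

class Plain:
    """No __setstate__, so attributes are set directly."""

class WithSetstate:
    """Defines __setstate__, so it decides what to do with the state."""
    def __setstate__(self, state):
        self.restored = sorted(state)

plain = Plain()
build_sketch(plain, {'hello': 42})

custom = WithSetstate()
build_sketch(custom, {'b': 2, 'a': 1})
```

Defining __setstate__() is the customization point: a class that changes its internal layout can still accept the state stored in old pickles.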

sage.misc.explain_pickle.unpickle_extension(code)

Takes an integer index and returns the extension object with that index. Used by explain_pickle.

EXAMPLES:

sage: from six.moves.copyreg import *
sage: add_extension('sage.misc.explain_pickle', 'EmptyNewstyleClass', 42)
sage: unpickle_extension(42)
<class 'sage.misc.explain_pickle.EmptyNewstyleClass'>
sage: remove_extension('sage.misc.explain_pickle', 'EmptyNewstyleClass', 42)
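The extension registry belongs to the standard copyreg module, so its effect on a pickle can be observed without Sage. The sketch below registers an arbitrary code for collections.OrderedDict, purely as an illustration; the code 240 and the choice of class are assumptions made for this example.

```python
import collections
import copyreg
import pickle
import pickletools

CODE = 240  # arbitrary extension code chosen for this sketch

copyreg.add_extension('collections', 'OrderedDict', CODE)
try:
    # With protocol 2, a registered global is written as a short EXT opcode
    # carrying the code, instead of the module and class names.
    data = pickle.dumps(collections.OrderedDict, protocol=2)
    ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(data)]
    loaded = pickle.loads(data)
finally:
    # Always undo the registration so other code is unaffected.
    copyreg.remove_extension('collections', 'OrderedDict', CODE)
```

Such a pickle can only be loaded while the same code is registered for the same object, which is why explain_pickle has to route these opcodes through a lookup by index.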

sage.misc.explain_pickle.unpickle_instantiate(fn, args)

Instantiate a new object of class fn with arguments args. Almost always equivalent to fn(*args). Used by explain_pickle.

EXAMPLES:

sage: unpickle_instantiate(Integer, ('42',))
42

sage.misc.explain_pickle.unpickle_newobj(klass, args)

Create a new object; this corresponds to the C code klass->tp_new(klass, args, NULL). Used by explain_pickle.

EXAMPLES:

sage: unpickle_newobj(tuple, ([1, 2, 3],))
(1, 2, 3)
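The practical difference between creating an object with __new__ and instantiating it by calling the class is whether __init__ runs. A pure-Python sketch, using the approximations klass.__new__(klass, *args) and klass(*args) given earlier in this document; newobj_sketch, instantiate_sketch and Loud are hypothetical names.

```python
def newobj_sketch(klass, args):
    """Create an instance without running __init__."""
    return klass.__new__(klass, *args)

def instantiate_sketch(klass, args):
    """Create an instance the ordinary way, running __init__."""
    return klass(*args)

class Loud:
    """A class whose __init__ leaves a visible trace."""
    def __init__(self):
        self.initialised = True

bare = newobj_sketch(Loud, ())
has_init_run = hasattr(bare, 'initialised')

fresh = instantiate_sketch(Loud, ())

# For immutable built-ins, __new__ does all the work.
as_tuple = newobj_sketch(tuple, ([1, 2, 3],))
```

An object created this way usually receives its attributes afterwards from a separate build step, so the class's __init__ is never involved in unpickling it.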

sage.misc.explain_pickle.unpickle_persistent(s)

Takes an integer index and returns the persistent object with that index; works by calling whatever callable is stored in unpickle_persistent_loader. Used by explain_pickle.

EXAMPLES:

sage: import sage.misc.explain_pickle
sage: sage.misc.explain_pickle.unpickle_persistent_loader = lambda n: n+7
sage: unpickle_persistent(35)
42
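Persistent IDs are a standard pickle feature: the pickler stores an identifier in place of an object, and the unpickler calls a loader to turn that identifier back into an object. The sketch below uses only the standard library and mirrors the n+7 loader from the example above; Token, ByKeyPickler and ByKeyUnpickler are hypothetical names.

```python
import io
import pickle

class Token:
    """Stand-in for an object that lives outside the pickle."""
    def __init__(self, key):
        self.key = key

class ByKeyPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store only the key for Token objects; pickle everything else normally.
        return obj.key if isinstance(obj, Token) else None

class ByKeyUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # The loader decides what a stored identifier means.
        return pid + 7

buf = io.BytesIO()
ByKeyPickler(buf, protocol=2).dump([Token(35), 'plain'])
buf.seek(0)
result = ByKeyUnpickler(buf).load()
```

The pickle itself contains only the integer 35 for the Token; everything about how that becomes a value is supplied at load time, which is the role unpickle_persistent_loader plays here.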