3.7. Refactoring for evo3
¶
Code fragments in this section are taken from
rt2/src/main/java/.../vsj2/evo3
in the project source, and are exercised by unit tests inrt2/src/test/java/.../vsj2/evo3
.
Code in the last few sections has all resided in package evo2
.
Reflecting on how this has developed,
several choices made along the way could be improved upon.
We don’t want to invalidate the commentary that has been written so far,
so that readers can no longer find the code referred to.
The only solution is to make a copy of everything in a new package evo3
,
and refactor the code there.
3.7.1. Static Factory Methods in Py
¶
We have made extensive use of constructors to create Python objects in Java. There are some potential advantages to using static functions: brevity in the client code, the potential to share commonly used values, and the possibility of returning specialised sub-classes. This will be done mostly for immutable value types.
It seems sensible (as in Jython 2.7) to implement these in class Py
.
We name the factory methods after the Python type they produce.
In place of new PyUnicode("a")
,
we should write Py.str("a")
,
in place of new PyTuple(...)
, Py.tuple(...)
,
and so on.
Py.int(42)
and Py.float(1.0)
aren’t valid Java,
so we take advantage of overloading and write
Py.val(42)
and Py.val(1.0)
.
srcgen.py
supports this with a pluggable code generator,
generating new tests in the new pattern,
from the same examples used previously.
3.7.2. Some Re-naming¶
In evo3
we rename Cell
to PyCell
having realised it must be a PyObject
.
We rename PyDictionary
to PyDict
to be closer to
the names used in CPython (PyDict_New
and so on).
We can’t quite bring ourselves to rename PyUnicode
to PyStr
,
although it would be consistent and save space.
3.7.3. Re-working PyType
¶
Untangling the Type Initialisation¶
In evo2
,
the implementation class of each Python object
created an instance of concrete class PyType
by calling a constructor
from the static initialisation of the class.
This has been ok so far,
but it results in a bit of a recursive scramble,
with the risk that we call methods on fundamental types
before they are ready.
(The Jython 2.7 type registry is also delicate on this way.)
We should like the option to create variants of PyType
,
according (for example) to different mutability patterns,
and whether the type has numerical or sequence nature.
This might extend to sub-classing PyType
if necessary.
Constraints on the coding of constructors
(on when super
and this
or instance methods may be called)
limit the possibilities for expression.
A series of static factory methods and helpers is more flexible, but it is complicated to express different desired behaviours even to static calls, and undesirable that a type once created be mutated freely. We therefore introduce a “specification” object:
class PyType implements PyObject {
// ...
/** Specification for a type. A data structure with mutators. */
static class Spec {
final static EnumSet<Flag> DEFAULT_FLAGS =
EnumSet.of(Flag.BASETYPE);
final String name;
final Class<? extends PyObject> implClass;
final List<PyType> bases = new LinkedList<>();
EnumSet<Flag> flags = EnumSet.copyOf(DEFAULT_FLAGS);
/** Create (begin) a specification for a {@link PyType}. */
Spec(String name, Class<? extends PyObject> implClass) {
this.name = name;
this.implClass = implClass;
}
Spec base(PyType b) { ... }
Spec flag(Flag f) { ... }
Spec flagNot(Flag f) { ... }
PyType[] getBases() { ... }
// ...
}
}
One may observe in this the beginnings of support for new features,
including an attempt to support Python inheritance, and
a field called flags
(an EnumSet
),
with the same general purpose as tp_flags
(a bit set) in CPython.
Note
As things develop, we expect this code to evolve (so it may not be the same in the code base as in the text).
CPython also has a PyType_Spec
structure (for a PyHeapTypeObject
s),
which shares name
and flags
with ours,
but it otherwise runs mainly to a description of the slots,
which we seem to be doing adequately by a naming convention.
Following CPython,
we introduce a static factory method to interpret the PyType.Spec
:
class PyType implements PyObject {
// ...
/** Construct a type from the given specification. */
static PyType fromSpec(Spec spec) {
return new PyType(spec.name, spec.implClass, spec.getBases(),
spec.flags);
}
private PyType(String name, Class<? extends PyObject> implClass,
PyType[] declaredBases, EnumSet<Flag> flags) {
this.name = name;
this.implClass = implClass;
this.flags = flags;
// Fix-up base and MRO from bases array
setMROfromBases(declaredBases);
// Fill slots from implClass or bases
setAllSlots();
}
// ...
}
As an example of its use, consider PyBool:
class PyBool extends PyLong {
static final PyType TYPE = PyType.fromSpec( //
new PyType.Spec("bool", PyBool.class) //
.base(PyLong.TYPE) //
.flagNot(PyType.Flag.BASETYPE));
// ...
Here we are saying that bool
has int
as a base (in Python)
and may not itself be a base for further derivation (in Python).
Flattening the Slot-function Table¶
In the section Representing a Python Class,
we noted that in the CPython PyTypeObject
,
some slots were directly in the type object (e.g. tp_repr
),
while most were arranged in sub-tables,
pointed to by fields (that may be NULL
) in the type object.
The motivation is surely to save space on type objects that do not need
the full set of slots.
The cost is some testing and indirection where these slots are used.
A common idiom in the CPython source is something like:
m = o->ob_type->tp_as_mapping;
if (m && m->mp_subscript) {
PyObject *item = m->mp_subscript(o, key);
return item;
}
We observe that types defined in Python (PyHeapTypeObject
)
always create all the tables,
so only types defined in C benefit from this parsimony.
As there are 80 slots in total,
the benefit cannot exceed 640 bytes per type (64-bit pointers),
which is a minor saving even if there are a few hundred such types.
In evo3
we have chosen an implementation in which
all the slots are fields directly in the type object.
This significantly simplifies the code to create them,
and saves an indirection or two with each operation.
With the trick of using EmptyException
in place of a test,
the equivalent Java code is just:
PyType oType = o.getType();
try {
return (PyObject) oType.mp_subscript.invokeExact(o, key);
} catch (EmptyException e) {}
The supporting fields in PyType
are all MethodHandle
s as before:
class PyType implements PyObject {
//...
// Standard type slots table see CPython PyTypeObject
MethodHandle tp_hash;
MethodHandle tp_repr;
//...
// Number slots table see CPython PyNumberMethods
MethodHandle nb_negative;
MethodHandle nb_add;
//...
// Sequence slots table see CPython PySequenceMethods
MethodHandle sq_length;
MethodHandle sq_repeat;
MethodHandle sq_item;
//...
// Mapping slots table see CPython PyMappingMethods
MethodHandle mp_length;
MethodHandle mp_subscript;
//...
Previously we gave these short names within their sub-table structures,
but in the larger scope
it seems wise to flag them as slots by using the same names as CPython.
This will simplify translation of code from CPython to Jython, and
it avoids clashes,
e.g. between sq_length
and mp_length
,
or between Java int
and nb_int
.
We shall not name all the PyType
fields with the tp_
prefix:
fields like name
, bases
and mro
are not slots in this sense.
A revised version of Slot.java
(about half the length it was)
now defines an enum with constants for each slot:
enum Slot {
tp_hash(Signature.LEN), //
tp_repr(Signature.UNARY), //
//...
nb_negative(Signature.UNARY, "-", "neg"), //
nb_add(Signature.BINARY, "+", "add"), //
//...
sq_length(Signature.LEN, null, "length"), //
sq_repeat(Signature.SQ_INDEX), //
sq_item(Signature.SQ_INDEX), //
sq_ass_item(Signature.SQ_ASSIGN), //
mp_length(Signature.LEN, null, "length"), //
mp_subscript(Signature.BINARY), //
mp_ass_subscript(Signature.MP_ASSIGN);
final MethodType type;
final String methodName;
final String opName;
final MethodHandle empty;
final VarHandle slotHandle;
Slot(Signature signature, String opName, String methodName) {
this.opName = opName == null ? name() : opName;
this.methodName = methodName == null ? name() : methodName;
this.type = signature.type;
this.empty = signature.empty;
this.slotHandle = Util.slotHandle(this);
}
Slot(Signature signature) { this(signature, null, null); }
Slot(Signature signature, String opName) {
this(signature, opName, null);
}
// ...
}
As with the previous Slot.TP
, Slot.NB
, and so on,
the enum
encapsulates a lot of behaviour (not shown),
supporting its use.
The design is the same one outlined in How we Fill the Slots,
but we no longer have to repeat the logic for Slot.NB
, Slot.SQ
, etc..
The name of the slot is the same as that of the enum
constant, and
unless otherwise given,
so is the name of the method in the implementation class.
Consequently,
this change has us changing the names of some of these methods,
to match the distinctive slot name, e.g.:
class PyTuple implements PyObject {
//...
static int length(PyObject s) {
//...
}
static PyObject sq_item(PyObject s, int i) {
//...
}
static PyObject mp_subscript(PyObject s, PyObject item)
throws Throwable {
//...
}
}
Note that in the definition of enum Slot
,
we defined the implementation method name of sq_length
and mp_length
,
to be "length"
in both cases.
This reproduces the behaviour we had before,
but it is not necessarily right.
In all cases in the CPython core where both are defined,
one method serves both slots,
but they are not always both defined.
The initialisation of the PyType
now uses a single loop
to initialise all the slots,
as may be seen in Inheritance of Slot Functions.
Manipulation of Slots and PyType.flags
¶
We have made the slots package-accessible
so that we may use them directly in the implementation of
methods in the abstract object API (in Abstract.java
, for example).
It is not intended that they be accessible to user-written Java,
or be updated directly, even by the runtime.
Rather, we must allow the type to police updates:
in some cases, it is necessary for the PyType
to co-ordinate additional changes.
Certain Python types allow setting the slots in a controlled way.
In others they are immutable.
In CPython this is controlled by a bit in an int tp_flags
field,
and in our implementation by an element of EnumSet flags
.
In CPython, the question of mutability is conflated with
whether the type is a “heap type” (that is, allocated dynamically).
This second issue concerns where the type object is allocated.
Built-in types like int
are not heap types,
and are not mutable,
while user-defined classes are heap types and are mutable.
All objects in Java are allocated by its runtime,
so we do not need the second, literal sense idea of “heap type”.
When reading CPython source to emulate it,
we must be alert to which sense of Py_TPFLAGS_HEAPTYPE
is being used.
We will use PyType.Flag.MUTABLE
to signify that slots may be written,
or conversely, may be depended upon never to change.
Inheritance of Slot Functions¶
We noted in A bool Implementation that
bool
inherited the slot functions of int
,
because the look-up of (say) add
on PyBool
found PyLong.add
.
This made the test pass,
but resulted in taking the slow path in Number.binary_op1
.
We actually need to copy the add
slot from the type of PyLong
to the type of PyBool
,
and not to create a new MethodHandle
for the same method.
Secondly,
we should ensure that when PyBool
gives a different meaning to a slot,
this is the one that applies to a Python bool
.
An example of this is the boolean binary operations &
, |
and ^
,
which are bit-wise operations in int
,
but when both operands are bool
, yield True
or False
.
We address this in the refactoring by copying the slots in a new type from the base. In fact we allow for multiple bases, as we shall have to eventually. The first one to supply the slot wins:
class PyType implements PyObject {
// ...
private void setAllSlots() {
for (Slot s : Slot.values()) {
final MethodHandle empty = s.getEmpty();
MethodHandle mh = s.findInClass(implClass);
for (int i = 0; mh == empty && i < bases.length; i++) {
mh = s.getSlot(bases[i]);
}
s.setSlot(this, mh);
}
}
// ...
The equivalent code in CPython is typeobject.c::inherit_slots()
.
The logic there is more complex
as it has to deal with the sub-table structure.
Note
At the time of writing,
the author has not worked out why in CPython the “base of the base”
is involved in the SLOTDEFINED
,
so that seems only to copy slots originating in the immediate base.
Eventually, we aim to:
Compute an MRO by Python rules (or approximation).
Choose the unique
__base__
by Python rules.
These can wait until we introduce class definitions in Python, through generated tests.
3.7.4. Lightweight EmptyException
¶
In the section A Unary Operation UNARY_NEGATIVE,
we introduced the idea that empty slots would not be null
,
as in CPython,
but would throw an EmptyException
when invoked.
Code in the run-time would catch this exception,
and this would save us a test for null
.
It is normally advised that, in Java,
the run-time cost of exceptions precludes
using them for normal transfers of control.
As a result,
the logic for binary operations (see A Binary Operation BINARY_ADD)
reserves this technique for use only where
an empty slot would lead to an exception anyway (often a TypeError
).
At the point where finding an empty slot is a normal occurrence,
the attempt is preceded by a test for BINARY_EMPTY
:
class Number extends Abstract {
// ...
private static PyObject binary_op1(PyObject v, PyObject w,
Slot binop) throws Slot.EmptyException, Throwable {
PyType vtype = v.getType();
PyType wtype = w.getType();
MethodHandle slotv = binop.getSlot(vtype);
MethodHandle slotw;
if (wtype == vtype || (slotw = binop.getSlot(wtype)) == slotv)
// Both types give the same result
return (PyObject) slotv.invokeExact(v, w);
else if (!wtype.isSubTypeOf(vtype)) {
// Ask left (if not empty) then right.
if (slotv != BINARY_EMPTY) {
PyObject r = (PyObject) slotv.invokeExact(v, w);
if (r != Py.NotImplemented)
return r;
}
return (PyObject) slotw.invokeExact(v, w);
} else {
// Right is sub-class: ask first (if not empty).
if (slotw != BINARY_EMPTY) {
PyObject r = (PyObject) slotw.invokeExact(v, w);
if (r != Py.NotImplemented)
return r;
}
return (PyObject) slotv.invokeExact(v, w);
}
}
In Abstract.getItem
and Abstract.setItem
(see Tuple Creation and Indexing and List Creation and Indexing)
we try the mapping slot first and do not guard it with a test,
choosing instead to catch the exception.
In almost all built-in types,
mp_subscript
is defined if sq_item
is,
so the exception is rare.
The alternative is a call to Slot.isDefinedFor()
.
The cost of the test is paid every time,
while the cost of the exception only arises if it is thrown.
This may be a good trade if exceptions are cheap enough,
but how cheap do they have to be?
It has been shown (see “The Exceptional Performance of Lil’ Exception” [Shipilev 2014]) that the cost of throwing an exception is very high, and consists of:
the creation of a stack-trace, and
the subsequent unwinding to the catch point.
The first cost may be eliminated by creating one instance statically and
throwing it every time.
Obviously the trace is then misleading, but we don’t ever use it,
and we could suppress it altogether.
We can make that change in a refactoring of Slot
:
enum Slot {
tp_hash(Signature.LEN), //
tp_repr(Signature.UNARY), //
// ...
enum Signature implements ClassShorthand {
UNARY(O, O), // NB.negative, NB.invert
// ...
Signature(Class<?> returnType, Class<?>... ptype) {
// em = λ : throw Util.EMPTY
// (with nominally-correct return type)
MethodHandle em = MethodHandles.throwException(returnType,
EmptyException.class).bindTo(Util.EMPTY);
// empty = λ u v ... : throw Util.EMPTY
this.empty = MethodHandles.dropArguments(em, 0, ptype);
// All handles in the slot must have the same type as empty
this.type = this.empty.type(); // = (ptype...)returnType
}
}
static class Util {
static final EmptyException EMPTY = new EmptyException();
// ...
}
}
The unwinding cost will be greatly reduced, to no more than the cost of a jump, if the compiler has in-lined the intervening calls. Shipilev suggests that if the exception is thrown more than one time in 104, a test is to be preferred, although for a compiler that in-lines successfully, his figures suggest the break-even is more like one time in 100.
Shipilev warns against relying on the in-lining, but in our case (Java 11 and 6 years on) it may now be reliable. The modern compiler is said to in-line method handle graphs well.
It is difficult to decide the issue without performance tests that would deflect from our architectural investigation. The benefit of the static exception trick is to make it matter less which we choose in any given piece of code.
Note
Exceptions in Python are recommended
as a normal form of flow control.
It may be necessary to revisit ours with this technique in mind,
if our PyException
s are to have adequate performance.
3.7.5. Type Cast in the Method Handle¶
The implementation of the neg
method of PyLong
provides a simple example of this change.
The slot nb_negative
may be invoked on any object.
As far as MethodHandle.invoke()
is concerned
it must have the signature (PyObject)PyObject
,
but this leads to ugly implementations of neg
and many other methods.
Previously,
the implementation method neg
looked like this:
/** The Python {@code int} object. */
class PyLong implements PyObject {
// ...
static PyObject neg(PyObject v) {
BigInteger a = valueOrError(v);
return new PyLong(a.negate());
}
// ...
private static BigInteger valueOrError(PyObject v)
throws InterpreterError {
try {
return ((PyLong) v).value;
} catch (ClassCastException cce) {
throw PyObjectUtil.typeMismatch(v, TYPE);
}
}
}
We had to cast v
to the correct type before working on it.
The ClassCastException
should never be thrown in practice,
since only operations on a PyLong
instance
should ever lead the interpreter to the PyType
for int
,
and in no other type (except sub-classes) is a slot bound to this method.
Of course, it will happen that sometimes we end up here incorrectly, but that would signify a bug in the interpreter. We should perhaps let the cast fail, catch it outside the interpreter loop, and let the Java stack trace lead us to the problem.
In evo3
we introduce a new type Slot.Self
,
exclusively for use as a placeholder in method signatures.
When looking for an implementation method with Slot.findInClass()
,
Self
gets replaced with with the class being searched.
Therefore the new implementation will be found that looks like this:
static PyObject neg(PyLong v) {
return new PyLong(v.value.negate());
}
This is much clearer to read,
here and in many other types and methods,
than what preceded it.
The same effect appears in CPython,
but because C is not type-safe,
it is just a matter of casting the implementation function pointer
to the required type for the slot.
Objects/longobject.c
provides an example:
static PyObject *
long_neg(PyLongObject *v)
{
PyLongObject *z;
if (Py_ABS(Py_SIZE(v)) <= 1)
return PyLong_FromLong(-MEDIUM_VALUE(v));
z = (PyLongObject *)_PyLong_Copy(v);
if (z != NULL)
Py_SIZE(z) = -(Py_SIZE(v));
return (PyObject *)z;
}
static PyNumberMethods long_as_number = {
/* ... */
(unaryfunc)long_neg, /*nb_negative*/
The support for this is to revise the signatures,
and to re-work Util.findInClass()
as follows:
enum Slot {
// ...
interface Self extends PyObject {}
private interface ClassShorthand {
static final Class<PyObject> O = PyObject.class;
static final Class<?> S = Self.class;
// ...
}
enum Signature implements ClassShorthand {
UNARY(O, S), // nb_negative, nb_invert
BINARY(O, O, O), // +, -, u[v]
TERNARY(O, S, O, O), // **
PREDICATE(B, S), // nb_bool
LEN(I, S), // sq_length
RICHCMP(O, S, O, CMP), // (richcmpfunc) tp_richcompare only
SQ_INDEX(O, S, I), // (ssizeargfunc) sq_item, sq_repeat only
SQ_ASSIGN(V, S, I, O), // (ssizeobjargproc) sq_ass_item only
MP_ASSIGN(V, S, O, O); // (objobjargproc) mp_ass_subscript only
// ...
Signature(Class<?> returnType, Class<?>... ptypes) {
// The signature is recorded exactly as given
this.type = MethodType.methodType(returnType, ptypes);
// In the type of this.empty, replace Self with PyObject.
MethodType slotType = Util.replaceSelf(this.type, O);
// em = λ : throw Util.EMPTY
MethodHandle em = MethodHandles
.throwException(returnType, EmptyException.class)
.bindTo(Util.EMPTY);
// empty = λ u v ... : throw Util.EMPTY
this.empty = MethodHandles.dropArguments(em, 0,
slotType.parameterArray());
}
}
// ...
private static class Util {
static MethodHandle findInClass(Slot slot, Class<?> c) {
try {
// The method has the same name in every implementation
String name = slot.getMethodName();
// The implementation has c where slot.type has Self
MethodType mtype = replaceSelf(slot.type, c);
MethodHandle impl = LOOKUP.findStatic(c, name, mtype);
// The invocation type remains that of slot.empty
return impl.asType(slot.empty.type());
} catch (NoSuchMethodException | IllegalAccessException e) {
return slot.empty;
}
}
static MethodType replaceSelf(MethodType type, Class<?> c) {
int n = type.parameterCount();
for (int i = 0; i < n; i++) {
if (type.parameterType(i) == Self.class) {
type = type.changeParameterType(i, c);
}
}
return type;
}
}
}
The cast that cluttered the implementation code is not avoided at runtime,
but is bound into the MethodHandle
we create for the slot,
by the call to MethodHandle.asType()
.
If we ever need the invocation MethodType
of a slot s
,
it is simply s.empty.type()
.
It is interesting to inspect which arguments can be Slot
type.
UNARY(O, S)
facilitates the simplification we have discussed.
BINARY(O, O, O)
cannot take slot types because the same method must
compute the operation and its reflected version.
For example, following CPython exactly, PyFloat.add
handles
float+float
, int+float
and float+int
.
(Jython 2 effectively has add
and radd
slot functions,
and it is worth asking whether this is not better.)
In contrast, we can write RICHCMP(O, S, O, CMP)
,
because reversed operations reverse the operands and comparison type,
and so the target type is always the first argument.
Return types are never Slot types, because they may be any Python type, especially when an operation wraps a dunder method implemented in Python.
We make a final observation concerning inheritance.
bool
should inherit int
’s definition of nb_negative
.
PyBool
is an acceptable argument to PyLong.neg
because PyBool
extends PyLong
.
The function definition is no longer a match in findInClass
,
because it does not match (PyBool)PyObject
,
but inheritance occurs as we require it
by slot copy in PyType.setAllSlots()
.