7.8. Functions Defined in Java

7.8.1. Introduction

This section discusses conventions for defining and calling functions defined in Java. It might cover:

  • Functions in modules defined in Java.

  • Abstract API intended to be public.

  • Private methods within the interpreter.

  • Methods in “discovered” Java objects.

What signatures can these have? What conversion rules apply at the call interface?

Note

At the time of writing, we have a mixed style, but lean towards a signature in Java that reflects constraints documented and enforced in CPython.

We have to consider the precedent set by CPython, in which public API is usually expressed only in terms of PyObject *, even when the documentation says the object must be of a particular type. In contrast, the implementation functions of type slots are sometimes declared with the actual type expected, as are functions in modules defined in C where Argument Clinic has inserted a type-checking wrapper.

We must also look ahead to the possibility of “acceptable implementations”. This may mean that a method currently declared to expect exactly PyUnicode will have to accept String as well.

7.8.2. Patterns of Method Definition

In the course of exploring types and modules defined in C in CPython, we encounter several patterns in Python, visible to the user of the language. We must reproduce these when implementing types and modules in Java. We do not necessarily commit to implementing the same types and modules in Java that are implemented in C in CPython.

We study the CPython implementation for guidance on how to achieve this visible behaviour, and find some useful structures and concepts, but we do not have to follow the pattern if there is a better way in Java. Here are the visible kinds of method as they appear in the dictionary of built-in types and modules, and some notes on their implementation in C.

Python built-in method m of type T or module M

Kind      Dictionary entry T.__dict__[m.__name__]             Notes
--------  --------------------------------------------------  ---------
special   wrapper_descriptor(T, m)
instance  method_descriptor(T, m)
class     classmethod_descriptor(T, m)
static    staticmethod(builtin_function_or_method(m, None))   (1)(2)(3)
__new__   builtin_function_or_method(T.__new__, T)            (4)(5)
module    builtin_function_or_method(m, M)                    (6)

Notes:

  1. In CPython, the builtin_function_or_method constructor is in fact given T to bind and stores it in the field m_self, but the surface behaviour is that __self__ is None. (Accessors return NULL internally when the flag METH_STATIC is set in the attached PyMethodDef.)

  2. The C signature of a function exposed as static begins with an ignored argument which receives NULL when called.

  3. The entry in T.__dict__['m'] is wrapped in the staticmethod decorator.

  4. In CPython, the builtin_function_or_method exposing __new__ also stores T in the field m_self, which accessors return because the special PyMethodDef for __new__ does not set METH_STATIC.

  5. The entry in T.__dict__['__new__'] is a builtin_function_or_method that actually references a private function tp_new_wrapper in C via a private PyMethodDef. It most resembles a method of type bound to instance T. It is this private method that invokes the static method posted by T to its tp_new slot.

  6. The C signature of a method in a module begins with a module instance, which in a call is supplied by the bound value (M).

The first three method kinds in the table are all represented in the dictionary by descriptors: wrapper_descriptor, method_descriptor and classmethod_descriptor. They have binding behaviour through a __get__ method that implements the . operator. When any of them is called directly, one has to supply a self object (or class) as the first argument. When bound to a (non-type) object using . the result is a bound object:

>>> type(int.__dict__['__add__'])
<class 'wrapper_descriptor'>
>>> type(b:=(7).__add__)
<class 'method-wrapper'>
>>> b.__self__
7
>>>
>>> type(int.__dict__['bit_length'])
<class 'method_descriptor'>
>>> type(b:=(7).bit_length)
<class 'builtin_function_or_method'>
>>> b.__self__
7
>>>
>>> type(int.__dict__['from_bytes'])
<class 'classmethod_descriptor'>
>>> type(b:=(7).from_bytes)
<class 'builtin_function_or_method'>
>>> b.__self__
<class 'int'>
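
The direct-call and binding behaviour described above can be checked in a short script. This is a minimal sketch, assuming a recent CPython 3; the variable names are chosen here for illustration only.

```python
# Descriptors taken straight from the type dictionary (nothing bound yet).
add = int.__dict__['__add__']            # wrapper_descriptor
bit_length = int.__dict__['bit_length']  # method_descriptor
from_bytes = int.__dict__['from_bytes']  # classmethod_descriptor

# Called directly, each needs self (or the class) as its first argument.
assert add(7, 3) == 10
assert bit_length(7) == 3
assert from_bytes(int, b'\x01\x00', 'big') == 256

# Calling __get__ explicitly mirrors the binding that the . operator performs.
bound = bit_length.__get__(7, int)
assert bound.__self__ == 7
assert bound() == 3

# Binding a classmethod_descriptor captures the class, not the instance.
bound_cls = from_bytes.__get__(7, int)
assert bound_cls.__self__ is int
```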

A static method is also represented by a descriptor, staticmethod, but its __get__ simply returns the inner builtin_function_or_method.

>>> type(b:=str.__dict__['maketrans'])
<class 'staticmethod'>
>>> type(str.maketrans)
<class 'builtin_function_or_method'>
>>> assert str.maketrans is b.__func__
>>> assert "x".maketrans is b.__func__
>>> type(str.maketrans.__self__)
<class 'NoneType'>

The staticmethod decorator is applied to methods in classes defined in Python to disable the binding that would otherwise take place. Much the same effect could be obtained by storing the builtin_function_or_method directly in the dictionary, since that has no binding behaviour, although to dispense with the wrapper would be a visible divergence from Python behaviour.
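
A small experiment illustrates the point. It is only a sketch, assuming CPython, and the class C with its attributes wrapped and bare is hypothetical: both entries behave the same when called through an instance, and the difference shows only when the dictionary is inspected.

```python
class C:
    # Hypothetical class: the same built-in stored with and without the wrapper.
    wrapped = staticmethod(len)
    bare = len

c = C()

# A builtin_function_or_method has no __get__, so access via . does not bind it.
assert not hasattr(len, '__get__')
assert c.bare is len
assert c.wrapped is len

# Either way the call receives only the explicit argument.
assert c.wrapped("abc") == 3
assert c.bare("abc") == 3

# The visible divergence is in the type of the dictionary entry.
assert type(C.__dict__['wrapped']) is staticmethod
assert type(C.__dict__['bare']) is type(len)
```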

In summary of the first four method kinds, the result of any binding is a builtin_function_or_method, except when it is a method-wrapper, which is similar.

The last two method kinds in the table do not have binding behaviour. Python enters in the dictionary an already-bound builtin_function_or_method representing the callable. The use of . to access them via an object simply returns the value in the dictionary of the type.

>>> type(int.__dict__['__new__'])
<class 'builtin_function_or_method'>
>>> assert int.__new__ is int.__dict__['__new__']
>>> int.__new__.__self__
<class 'int'>
>>>
>>> import math
>>> type(math.__dict__['sqrt'])
<class 'builtin_function_or_method'>
>>> assert math.sqrt is math.__dict__['sqrt']
>>> math.sqrt.__self__
<module 'math' (built-in)>

__new__ is a static method when defined by the type, but the builtin_function_or_method that exposes it differs from the staticmethod that is the exposed form of other static methods of a built-in type. It is not clear why this is so, or how essential this difference is to the semantics of Python. We return __new__ from the exposer as any other static method, but process __new__ specially in type construction, where we can reproduce expected behaviour as closely as proves necessary.
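
The behaviour we may need to reproduce can be observed from Python. The following is a sketch, assuming CPython; MyInt is a hypothetical subclass used only for illustration. It shows that the exposed __new__ is bound to the defining type, yet still takes the class to instantiate as an explicit first argument, which it checks.

```python
new = int.__dict__['__new__']

# The entry is a builtin_function_or_method bound to the defining type.
assert type(new) is type(len)
assert new.__self__ is int

# The class to instantiate is still passed explicitly as the first argument.
assert new(int, "42") == 42

class MyInt(int):
    """Hypothetical subclass, for illustration only."""

x = new(MyInt, 5)
assert type(x) is MyInt and x == 5

# The wrapper rejects a first argument that is not a subtype of the bound type.
try:
    new(str)
    rejected = False
except TypeError:
    rejected = True
assert rejected
```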

7.8.3. Argument and Return Value Coercion

Where to use null and where Py.None?

A null should never escape as the value of a Python variable. In some cases, assigning null will mean “delete” or “deleted”. A function returning a PyObject should not return null but Py.None. A function returning a specific sub-type T of PyObject, but which may produce no result, should probably return an Optional<T>, thus forcing the client to acknowledge the possibility explicitly.

It is convenient to have strongly-typed fields and arguments in the implementation of Python types. Sometimes these fields (as Python attributes) are optional, or admit None as their effective value. For example, function.__defaults__ “must be set to a tuple object”, according to the error message, but None is also acceptable, and internally to PyFunction, CPython converts it to a NULL.
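
The behaviour of function.__defaults__ can be confirmed from Python. This is a minimal sketch, assuming CPython; the function f is hypothetical.

```python
def f(a, b=2):
    return a + b

assert f.__defaults__ == (2,)

# None is accepted on the way in and is what we read back.
f.__defaults__ = None
assert f.__defaults__ is None

# With no defaults, b has become a required argument.
try:
    f(1)
    needs_b = False
except TypeError:
    needs_b = True
assert needs_b

# Anything other than a tuple or None is rejected.
try:
    f.__defaults__ = [2]
    message = None
except TypeError as e:
    message = str(e)
assert message is not None and "tuple" in message
```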

In arguments, we cannot pass Py.None where a specific type is expected. We may, however, make the argument null, and that might be what we store.

Generally it appears we should support null on the way in and Py.None on the way out.