Python 中的对象模型
is instance of
+-----+ +---------------------+ +------------------------+
| | | is instance of | | is instance of |
| | | | | |
| +-+--v------+ +--+-v-----+ +-----+------+
+---> | | | | |
| type | | Class A | | instance |
| | | | | |
+---+----^--+ +-----+----+ +------------+
| | |
is sub-class of | |is instance of |
| | |
| | |
+---v----+--+ |
| | is sub-class of|
| object <-----------------+
| |
+-----------+
上图中有两个比较特殊的对象:object 和 type。
object 是所有 Class 的基类,所有的 Class 都直接或间接继承至 object,所以 class 是 object 的子类。
>>> issubclass(int, object)
True
>>> issubclass(str, object)
True
>>>type 是 metaclass,class 是通过 metaclass 创建的,所以 class 是 type 的实例。
>>> isinstance(int, type)
Truetype 是特殊的对象,它的父类是 object,由于 metaclass 可以继承自父类,所以 type 的 metaclass 是它本身。
>>> issubclass(type, object)
True
>>> isinstance(object, type)
True
>>> isinstance(type, type)
True
>>>内部类(例如 int,str,list,dict)使用 PyTypeObject 表示。
然而对于用户自定义的类,Python 并不知道用户会在类中定义多少变量和属性,所以必须使用另外一种结构来表示用户自定义的类,那就是 PyHeapTypeObject。
/* The *real* layout of a type object when allocated on the heap */
typedef struct _heaptypeobject {
/* Note: there's a dependency on the order of these members
in slotptr() in typeobject.c . */
PyTypeObject ht_type;
PyNumberMethods as_number;
PyMappingMethods as_mapping;
PySequenceMethods as_sequence; /* as_sequence comes after as_mapping,
so that the mapping wins when both
the mapping and the sequence define
a given operator (e.g. __getitem__).
see add_operators() in typeobject.c . */
PyBufferProcs as_buffer;
PyObject *ht_name, *ht_slots;
/* here are optional user slots, followed by the members. */
} PyHeapTypeObject;PyType_Ready 函数负责对 PyTypeObject 进行初始化,主要工作有:
-
设置 tp_base(默认为 PyBaseObject_Type,即 Python 层面的 object)
-
设置 ob_type(一般继承自基类的 ob_type)
-
设置 tp_bases
-
初始化 tp_dict(重点)
-
add_operators
-
add_methods
-
add_members
-
add_getset
-
-
计算 MRO(method resolution order)
-
将本 type 添加到所有父类的 subclass 中
定义:
/* Table mapping __foo__ names to tp_foo offsets and slot_tp_foo wrapper
functions. The offsets here are relative to the 'PyHeapTypeObject'
structure, which incorporates the additional structures used for numbers,
sequences and mappings.
Note that multiple names may map to the same slot (e.g. __eq__,
__ne__ etc. all map to tp_richcompare) and one name may map to multiple
slots (e.g. __str__ affects tp_str as well as tp_repr). The table is
terminated with an all-zero entry. (This table is further initialized and
sorted in init_slotdefs() below.) */
struct wrapperbase {
char *name;
int offset;
void *function;
wrapperfunc wrapper;
char *doc;
int flags;
PyObject *name_strobj;
};slotdef 表示 PyTypeObject 中的一个个操作,例如 __add__,__str__。
在 slot 中,包含了很多关于一个操作的信息,但是很可惜,在 tp_dict 中,与 “getitem” 关联在一起的,一定不会是一个 slot。原因很简单,slot 不是一个 PyObject,它不能存放在 dict 对象中。当然,如果我们再深入地思考一下,会发现,slot 也不会被调用。既然 slot 不是一个 PyObject,那么它就没有 type,也就无从谈起什么 tp_call 了,所以 slot 是无论如何也不能满足前面描述的 Python 中的 “可调用” 这个概念。
前面我们说过,Python 虚拟机在 tp_dict 找到 “getitem” 对应的 “操作” 后,会调用该 “操作”,所以在 tp_dict 中与 “getitem” 对应的只能是另一个包装了 slot 的 PyObject,在 Python 中,这是一个我们称之为 descriptor 的东西。
在 Python 内部,存在多种 descriptor,与 PyTypeObject 中的操作对应的是 PyWrapper- DescrObject。在此后的描述中,我们将用术语 descriptor 来专门表示 PyWrapperDescr- Object。一个 descriptor 包含一个 slot,其创建是通过 PyDescr_NewWrapper 完成的。
定义:
#define PyDescr_COMMON \
PyObject_HEAD \
PyTypeObject *d_type; \
PyObject *d_name
typedef struct {
PyDescr_COMMON;
struct wrapperbase *d_base;
void *d_wrapped; /* This can be any function pointer */
} PyWrapperDescrObject;
static PyDescrObject *
descr_new(PyTypeObject *descrtype, PyTypeObject *type, const char *name)
{
PyDescrObject *descr;
descr = (PyDescrObject *)PyType_GenericAlloc(descrtype, 0);
if (descr != NULL) {
Py_XINCREF(type);
descr->d_type = type;
descr->d_name = PyString_InternFromString(name);
if (descr->d_name == NULL) {
Py_DECREF(descr);
descr = NULL;
}
}
return descr;
}
PyObject *
PyDescr_NewWrapper(PyTypeObject *type, struct wrapperbase *base, void *wrapped)
{
PyWrapperDescrObject *descr;
descr = (PyWrapperDescrObject *)descr_new(&PyWrapperDescr_Type,
type, base->name);
if (descr != NULL) {
descr->d_base = base;
descr->d_wrapped = wrapped;
}
return (PyObject *)descr;
}/* This function is called by PyType_Ready() to populate the type's
dictionary with method descriptors for function slots. For each
function slot (like tp_repr) that's defined in the type, one or more
corresponding descriptors are added in the type's tp_dict dictionary
under the appropriate name (like __repr__). Some function slots
cause more than one descriptor to be added (for example, the nb_add
slot adds both __add__ and __radd__ descriptors) and some function
slots compete for the same descriptor (for example both sq_item and
mp_subscript generate a __getitem__ descriptor).
In the latter case, the first slotdef entry encoutered wins. Since
slotdef entries are sorted by the offset of the slot in the
PyHeapTypeObject, this gives us some control over disambiguating
between competing slots: the members of PyHeapTypeObject are listed
from most general to least general, so the most general slot is
preferred. In particular, because as_mapping comes before as_sequence,
for a type that defines both mp_subscript and sq_item, mp_subscript
wins.
This only adds new descriptors and doesn't overwrite entries in
tp_dict that were previously defined. The descriptors contain a
reference to the C function they must call, so that it's safe if they
are copied into a subtype's __dict__ and the subtype has a different
C function in its slot -- calling the method defined by the
descriptor will call the C function that was used to create it,
rather than the C function present in the slot when it is called.
(This is important because a subtype may have a C function in the
slot that calls the method from the dictionary, and we want to avoid
infinite recursion here.) */
static int
add_operators(PyTypeObject *type)
{
PyObject *dict = type->tp_dict;
slotdef *p;
PyObject *descr;
void **ptr;
init_slotdefs();
for (p = slotdefs; p->name; p++) {
if (p->wrapper == NULL)
continue;
ptr = slotptr(type, p->offset);
if (!ptr || !*ptr)
continue;
if (PyDict_GetItem(dict, p->name_strobj))
continue;
descr = PyDescr_NewWrapper(type, p, *ptr);
if (descr == NULL)
return -1;
if (PyDict_SetItem(dict, p->name_strobj, descr) < 0)
return -1;
Py_DECREF(descr);
}
if (type->tp_new != NULL) {
if (add_tp_new_wrapper(type) < 0)
return -1;
}
return 0;
}class A(object):
name = 'Python'
def __init__(self):
print 'A::__init__'
def f(self):
print 'A::f'
def g(self, aValue):
self.value = aValue
print self.value
a = A()
a.f()
a.g(10)模块的字节码:
1 0 LOAD_CONST 0 ('A')
3 LOAD_NAME 0 (object)
6 BUILD_TUPLE 1
9 LOAD_CONST 1 (<code object A at 0x55abcddf3288, file "class_00.py", line 1>)
12 MAKE_FUNCTION 0
15 CALL_FUNCTION 0
18 BUILD_CLASS
19 STORE_NAME 1 (A)
13 22 LOAD_NAME 1 (A)
25 CALL_FUNCTION 0
28 STORE_NAME 2 (a)
14 31 LOAD_NAME 2 (a)
34 LOAD_ATTR 3 (f)
37 CALL_FUNCTION 0
40 POP_TOP
15 41 LOAD_NAME 2 (a)
44 LOAD_ATTR 4 (g)
47 LOAD_CONST 2 (10)
50 CALL_FUNCTION 1
53 POP_TOP
54 LOAD_CONST 3 (None)
57 RETURN_VALUEclass A 的字节码:
1 0 LOAD_NAME 0 (__name__)
3 STORE_NAME 1 (__module__)
2 6 LOAD_CONST 0 ('Python')
9 STORE_NAME 2 (name)
3 12 LOAD_CONST 1 (<code object __init__ at 0x55abcddf30a8, file "class_00.py", line 3>)
15 MAKE_FUNCTION 0
18 STORE_NAME 3 (__init__)
6 21 LOAD_CONST 2 (<code object f at 0x55abcddf3030, file "class_00.py", line 6>)
24 MAKE_FUNCTION 0
27 STORE_NAME 4 (f)
9 30 LOAD_CONST 3 (<code object g at 0x55abcddf3210, file "class_00.py", line 9>)
33 MAKE_FUNCTION 0
36 STORE_NAME 5 (g)
39 LOAD_LOCALS # f_locals 中包含 class 的动态元信息:类属性和方法
# BUILD_CLASS 指令会用到 f_locals
40 RETURN_VALUEcase BUILD_CLASS:
u = TOP(); // class 的动态信息
v = SECOND(); // 基类 tuple
w = THIRD(); // 类名
STACKADJ(-2);
x = build_class(u, v, w);
SET_TOP(x);
Py_DECREF(u);
Py_DECREF(v);
Py_DECREF(w);
break;static PyObject *
build_class(PyObject *methods, PyObject *bases, PyObject *name)
{
// 1. find metaclass
// 2. build class by metaclass: PyObject_CallFunctionObjArgs
PyObject *metaclass = NULL, *result, *base;
if (PyDict_Check(methods))
metaclass = PyDict_GetItemString(methods, "__metaclass__");
if (metaclass != NULL)
Py_INCREF(metaclass);
else if (PyTuple_Check(bases) && PyTuple_GET_SIZE(bases) > 0) {
base = PyTuple_GET_ITEM(bases, 0);
metaclass = PyObject_GetAttrString(base, "__class__");
if (metaclass == NULL) {
PyErr_Clear();
metaclass = (PyObject *)base->ob_type;
Py_INCREF(metaclass);
}
}
else {
PyObject *g = PyEval_GetGlobals();
if (g != NULL && PyDict_Check(g))
metaclass = PyDict_GetItemString(g, "__metaclass__");
if (metaclass == NULL)
metaclass = (PyObject *) &PyClass_Type;
Py_INCREF(metaclass);
}
result = PyObject_CallFunctionObjArgs(metaclass, name, bases, methods, NULL);
Py_DECREF(metaclass);
if (result == NULL && PyErr_ExceptionMatches(PyExc_TypeError)) {
/* A type error here likely means that the user passed
in a base that was not a class (such the random module
instead of the random.random type). Help them out with
by augmenting the error message with more information.*/
PyObject *ptype, *pvalue, *ptraceback;
PyErr_Fetch(&ptype, &pvalue, &ptraceback);
if (PyString_Check(pvalue)) {
PyObject *newmsg;
newmsg = PyString_FromFormat(
"Error when calling the metaclass bases\n %s",
PyString_AS_STRING(pvalue));
if (newmsg != NULL) {
Py_DECREF(pvalue);
pvalue = newmsg;
}
}
PyErr_Restore(ptype, pvalue, ptraceback);
}
return result;
}调用流程:
PyObject_CallFunctionObjArgs -> PyObject_Call -> metaclass.ob_type.tp_call -> type.type_call -> metaclass.tp_new
metaclass 创建 class
metaclass.__new__(meta_class, name, bases, attrs)
疑问
- 没有看懂 type_new 中对
__slot__的处理
22 LOAD_NAME 1 (A)
25 CALL_FUNCTION 0
28 STORE_NAME 2 (a)CALL_FUNCTION 的调用路径:call_function -> do_call -> PyObject_Call -> tp_call -> tp_new(object_new),最终调用 class 的 tp_new 方法创建 instance
class 创建 instance
- `meta_class.__call__`
- `class.__new__`
- `class.__init__`
-
尝试从 mro 列表查找属性 (_PyType_Lookup),假设查找结果为 descr
1.1 如果 descr 是 data descriptor,则调用 descr.ob_type.tp_descr_get 获取真正的属性。读取属性结束。
-
尝试从实例的__dict__中查找属性,如果查找成功,读取属性结束。
-
如果 descr.ob_type.tp_descr_get不等于 NULL,则调用 descr.ob_type.tp_descr_get 获取真正的属性。读取属性结束。
-
如果 descr不等于 NULL,则返回 descr。读取属性结束。
-
返回 NULL。
-
类 class
-
创建流程
-
观察各个 tp_xxx 变量
-
用户自定义__init__, __new__等方法,观察 fixup_slot_dispatchers 如何修改默认方法。
-
-
实例instace
-
创建流程:new, init
-
如何读取属性
-
descriptor
-
bound method, unbound method
-