oop 2.7 __new__ - What are metaclasses in Python?





7 Answers

Classes as objects

Before understanding metaclasses, you need to master classes in Python. And Python has a very peculiar idea of what classes are, borrowed from the Smalltalk language.

In most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:

>>> class ObjectCreator(object):
...       pass
...

>>> my_object = ObjectCreator()
>>> print(my_object)
<__main__.ObjectCreator object at 0x8974f2c>

But classes are more than that in Python. Classes are objects too.

Yes, objects.

As soon as you use the keyword class, Python executes it and creates an OBJECT. The instruction

>>> class ObjectCreator(object):
...       pass
...

creates in memory an object with the name "ObjectCreator".

This object (the class) is itself capable of creating objects (the instances), and this is why it's a class.

But still, it's an object, and therefore:

  • you can assign it to a variable
  • you can copy it
  • you can add attributes to it
  • you can pass it as a function parameter

e.g.:

>>> print(ObjectCreator) # you can print a class because it's an object
<class '__main__.ObjectCreator'>
>>> def echo(o):
...       print(o)
...
>>> echo(ObjectCreator) # you can pass a class as a parameter
<class '__main__.ObjectCreator'>
>>> print(hasattr(ObjectCreator, 'new_attribute'))
False
>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class
>>> print(hasattr(ObjectCreator, 'new_attribute'))
True
>>> print(ObjectCreator.new_attribute)
foo
>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable
>>> print(ObjectCreatorMirror.new_attribute)
foo
>>> print(ObjectCreatorMirror())
<__main__.ObjectCreator object at 0x8997b4c>

Creating classes dynamically

Since classes are objects, you can create them on the fly, like any object.

First, you can create a class in a function using class:

>>> def choose_class(name):
...     if name == 'foo':
...         class Foo(object):
...             pass
...         return Foo # return the class, not an instance
...     else:
...         class Bar(object):
...             pass
...         return Bar
...
>>> MyClass = choose_class('foo')
>>> print(MyClass) # the function returns a class, not an instance
<class '__main__.Foo'>
>>> print(MyClass()) # you can create an object from this class
<__main__.Foo object at 0x89c6d4c>

But it's not so dynamic, since you still have to write the whole class yourself.

Since classes are objects, they must be generated by something.

When you use the class keyword, Python creates this object automatically. But as with most things in Python, it gives you a way to do it manually.

Remember the function type? The good old function that lets you know what type an object is:

>>> print(type(1))
<type 'int'>
>>> print(type("1"))
<type 'str'>
>>> print(type(ObjectCreator))
<type 'type'>
>>> print(type(ObjectCreator()))
<class '__main__.ObjectCreator'>

Well, type has a completely different ability, it can also create classes on the fly. type can take the description of a class as parameters, and return a class.

(I know, it's silly that the same function can have two completely different uses according to the parameters you pass to it. It's an issue due to backwards compatibility in Python)

type works this way:

type(name of the class,
     tuple of the parent class (for inheritance, can be empty),
     dictionary containing attributes names and values)

e.g.:

>>> class MyShinyClass(object):
...       pass

can be created manually this way:

>>> MyShinyClass = type('MyShinyClass', (), {}) # returns a class object
>>> print(MyShinyClass)
<class '__main__.MyShinyClass'>
>>> print(MyShinyClass()) # create an instance with the class
<__main__.MyShinyClass object at 0x8997cec>

You'll notice that we use "MyShinyClass" as the name of the class and as the variable to hold the class reference. They can be different, but there is no reason to complicate things.

type accepts a dictionary to define the attributes of the class. So:

>>> class Foo(object):
...       bar = True

Can be translated to:

>>> Foo = type('Foo', (), {'bar':True})

And used as a normal class:

>>> print(Foo)
<class '__main__.Foo'>
>>> print(Foo.bar)
True
>>> f = Foo()
>>> print(f)
<__main__.Foo object at 0x8a9b84c>
>>> print(f.bar)
True

And of course, you can inherit from it, so:

>>>   class FooChild(Foo):
...         pass

would be:

>>> FooChild = type('FooChild', (Foo,), {})
>>> print(FooChild)
<class '__main__.FooChild'>
>>> print(FooChild.bar) # bar is inherited from Foo
True

Eventually you'll want to add methods to your class. Just define a function with the proper signature and assign it as an attribute.

>>> def echo_bar(self):
...       print(self.bar)
...
>>> FooChild = type('FooChild', (Foo,), {'echo_bar': echo_bar})
>>> hasattr(Foo, 'echo_bar')
False
>>> hasattr(FooChild, 'echo_bar')
True
>>> my_foo = FooChild()
>>> my_foo.echo_bar()
True

And you can add even more methods after you dynamically create the class, just like adding methods to a normally created class object.

>>> def echo_bar_more(self):
...       print('yet another method')
...
>>> FooChild.echo_bar_more = echo_bar_more
>>> hasattr(FooChild, 'echo_bar_more')
True

You see where we are going: in Python, classes are objects, and you can create a class on the fly, dynamically.

This is what Python does when you use the keyword class, and it does so by using a metaclass.

What are metaclasses (finally)

Metaclasses are the 'stuff' that creates classes.

You define classes in order to create objects, right?

But we learned that Python classes are objects.

Well, metaclasses are what create these objects. They are the classes' classes, you can picture them this way:

MyClass = MetaClass()
my_object = MyClass()

You've seen that type lets you do something like this:

MyClass = type('MyClass', (), {})

It's because the function type is in fact a metaclass. type is the metaclass Python uses to create all classes behind the scenes.

Now you wonder why the heck is it written in lowercase, and not Type?

Well, I guess it's a matter of consistency with str, the class that creates strings objects, and int the class that creates integer objects. type is just the class that creates class objects.

You see that by checking the __class__ attribute.

Everything, and I mean everything, is an object in Python. That includes ints, strings, functions and classes. All of them are objects. And all of them have been created from a class:

>>> age = 35
>>> age.__class__
<type 'int'>
>>> name = 'bob'
>>> name.__class__
<type 'str'>
>>> def foo(): pass
>>> foo.__class__
<type 'function'>
>>> class Bar(object): pass
>>> b = Bar()
>>> b.__class__
<class '__main__.Bar'>

Now, what is the __class__ of any __class__ ?

>>> age.__class__.__class__
<type 'type'>
>>> name.__class__.__class__
<type 'type'>
>>> foo.__class__.__class__
<type 'type'>
>>> b.__class__.__class__
<type 'type'>

So, a metaclass is just the stuff that creates class objects.

You can call it a 'class factory' if you wish.

type is the built-in metaclass Python uses, but of course, you can create your own metaclass.

The __metaclass__ attribute

In Python 2, you can add a __metaclass__ attribute when you write a class (see next section for the Python 3 syntax):

class Foo(object):
    __metaclass__ = something...
    [...]

If you do so, Python will use the metaclass to create the class Foo.

Careful, it's tricky.

You write class Foo(object) first, but the class object Foo is not created in memory yet.

Python will look for __metaclass__ in the class definition. If it finds it, it will use it to create the object class Foo. If it doesn't, it will use type to create the class.

Read that several times.

When you do:

class Foo(Bar):
    pass

Python does the following:

Is there a __metaclass__ attribute in Foo?

If yes, create in memory a class object (I said a class object, stay with me here), with the name Foo by using what is in __metaclass__.

If Python can't find __metaclass__, it will look for a __metaclass__ at the MODULE level, and try to do the same (but only for classes that don't inherit anything, basically old-style classes).

Then if it can't find any __metaclass__ at all, it will use the Bar's (the first parent) own metaclass (which might be the default type) to create the class object.

Be careful here that the __metaclass__ attribute will not be inherited, the metaclass of the parent (Bar.__class__) will be. If Bar used a __metaclass__ attribute that created Bar with type() (and not type.__new__()), the subclasses will not inherit that behavior.

Now the big question is, what can you put in __metaclass__ ?

The answer is: something that can create a class.

And what can create a class? type, or anything that subclasses or uses it.

Metaclasses in Python 3

The syntax to set the metaclass has been changed in Python 3:

class Foo(object, metaclass=something):
    [...]

i.e. the __metaclass__ attribute is no longer used, in favor of a keyword argument in the list of base classes.

The behaviour of metaclasses however stays largely the same.

Custom metaclasses

The main purpose of a metaclass is to change the class automatically, when it's created.

You usually do this for APIs, where you want to create classes matching the current context.

Imagine a stupid example, where you decide that all classes in your module should have their attributes written in uppercase. There are several ways to do this, but one way is to set __metaclass__ at the module level.

This way, all classes of this module will be created using this metaclass, and we just have to tell the metaclass to turn all attributes to uppercase.

Luckily, __metaclass__ can actually be any callable, it doesn't need to be a formal class (I know, something with 'class' in its name doesn't need to be a class, go figure... but it's helpful).

So we will start with a simple example, by using a function.

# the metaclass will automatically get passed the same argument
# that you usually pass to `type`
def upper_attr(future_class_name, future_class_parents, future_class_attr):
    """
      Return a class object, with the list of its attribute turned
      into uppercase.
    """

    # pick up any attribute that doesn't start with '__' and uppercase it
    uppercase_attr = {}
    for name, val in future_class_attr.items():
        if not name.startswith('__'):
            uppercase_attr[name.upper()] = val
        else:
            uppercase_attr[name] = val

    # let `type` do the class creation
    return type(future_class_name, future_class_parents, uppercase_attr)

__metaclass__ = upper_attr # this will affect all classes in the module

class Foo(): # global __metaclass__ won't work with "object" though
    # but we can define __metaclass__ here instead to affect only this class
    # and this will work with "object" children
    bar = 'bip'

print(hasattr(Foo, 'bar'))
# Out: False
print(hasattr(Foo, 'BAR'))
# Out: True

f = Foo()
print(f.BAR)
# Out: 'bip'

Now, let's do exactly the same, but using a real class for a metaclass:

# remember that `type` is actually a class like `str` and `int`
# so you can inherit from it
class UpperAttrMetaclass(type):
    # __new__ is the method called before __init__
    # it's the method that creates the object and returns it
    # while __init__ just initializes the object passed as parameter
    # you rarely use __new__, except when you want to control how the object
    # is created.
    # here the created object is the class, and we want to customize it
    # so we override __new__
    # you can do some stuff in __init__ too if you wish
    # some advanced use involves overriding __call__ as well, but we won't
    # see this
    def __new__(upperattr_metaclass, future_class_name,
                future_class_parents, future_class_attr):

        uppercase_attr = {}
        for name, val in future_class_attr.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        return type(future_class_name, future_class_parents, uppercase_attr)

But this is not really OOP. We call type directly and we don't override or call the parent __new__. Let's do it:

class UpperAttrMetaclass(type):

    def __new__(upperattr_metaclass, future_class_name,
                future_class_parents, future_class_attr):

        uppercase_attr = {}
        for name, val in future_class_attr.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        # reuse the type.__new__ method
        # this is basic OOP, nothing magic in there
        return type.__new__(upperattr_metaclass, future_class_name,
                            future_class_parents, uppercase_attr)

You may have noticed the extra argument upperattr_metaclass. There is nothing special about it: __new__ always receives the class it's defined in, as first parameter. Just like you have self for ordinary methods which receive the instance as first parameter, or the defining class for class methods.

Of course, the names I used here are long for the sake of clarity, but like for self, all the arguments have conventional names. So a real production metaclass would look like this:

class UpperAttrMetaclass(type):

    def __new__(cls, clsname, bases, dct):

        uppercase_attr = {}
        for name, val in dct.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        return type.__new__(cls, clsname, bases, uppercase_attr)

We can make it even cleaner by using super, which will ease inheritance (because yes, you can have metaclasses, inheriting from metaclasses, inheriting from type):

class UpperAttrMetaclass(type):

    def __new__(cls, clsname, bases, dct):

        uppercase_attr = {}
        for name, val in dct.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        return super(UpperAttrMetaclass, cls).__new__(cls, clsname, bases, uppercase_attr)

That's it. There is really nothing more about metaclasses.

The reason behind the complexity of the code using metaclasses is not because of metaclasses, it's because you usually use metaclasses to do twisted stuff relying on introspection, manipulating inheritance, vars such as __dict__, etc.

Indeed, metaclasses are especially useful to do black magic, and therefore complicated stuff. But by themselves, they are simple:

  • intercept a class creation
  • modify the class
  • return the modified class

Why would you use metaclasses classes instead of functions?

Since __metaclass__ can accept any callable, why would you use a class since it's obviously more complicated?

There are several reasons to do so:

  • The intention is clear. When you read UpperAttrMetaclass(type), you know what's going to follow
  • You can use OOP. Metaclass can inherit from metaclass, override parent methods. Metaclasses can even use metaclasses.
  • Subclasses of a class will be instances of its metaclass if you specified a metaclass-class, but not with a metaclass-function.
  • You can structure your code better. You never use metaclasses for something as trivial as the above example. It's usually for something complicated. Having the ability to make several methods and group them in one class is very useful to make the code easier to read.
  • You can hook on __new__, __init__ and __call__. Which will allow you to do different stuff. Even if usually you can do it all in __new__, some people are just more comfortable using __init__.
  • These are called metaclasses, damn it! It must mean something!

Why would you use metaclasses?

Now the big question. Why would you use some obscure error prone feature?

Well, usually you don't:

Metaclasses are deeper magic that 99% of users should never worry about. If you wonder whether you need them, you don't (the people who actually need them know with certainty that they need them, and don't need an explanation about why).

Python Guru Tim Peters

The main use case for a metaclass is creating an API. A typical example of this is the Django ORM.

It allows you to define something like this:

class Person(models.Model):
    name = models.CharField(max_length=30)
    age = models.IntegerField()

But if you do this:

guy = Person(name='bob', age='35')
print(guy.age)

It won't return an IntegerField object. It will return an int, and can even take it directly from the database.

This is possible because models.Model defines __metaclass__ and it uses some magic that will turn the Person you just defined with simple statements into a complex hook to a database field.

Django makes something complex look simple by exposing a simple API and using metaclasses, recreating code from this API to do the real job behind the scenes.

The last word

First, you know that classes are objects that can create instances.

Well in fact, classes are themselves instances. Of metaclasses.

>>> class Foo(object): pass
>>> id(Foo)
142630324

Everything is an object in Python, and they are all either instances of classes or instances of metaclasses.

Except for type.

type is actually its own metaclass. This is not something you could reproduce in pure Python, and is done by cheating a little bit at the implementation level.

Secondly, metaclasses are complicated. You may not want to use them for very simple class alterations. You can change classes by using two different techniques:

99% of the time you need class alteration, you are better off using these.

But 98% of the time, you don't need class alteration at all.

factory inheritance multiple

What are metaclasses and what do we use them for?




One use for metaclasses is adding new properties and methods to an instance automatically.

For example, if you look at Django models, their definition looks a bit confusing. It looks as if you are only defining class properties:

class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)

However, at runtime the Person objects are filled with all sorts of useful methods. See the source for some amazing metaclassery.




I think the ONLamp introduction to metaclass programming is well written and gives a really good introduction to the topic despite being several years old already.

http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html (archived at https://web.archive.org/web/20080206005253/http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html)

In short: A class is a blueprint for the creation of an instance, a metaclass is a blueprint for the creation of a class. It can be easily seen that in Python classes need to be first-class objects too to enable this behavior.

I've never written one myself, but I think one of the nicest uses of metaclasses can be seen in the Django framework. The model classes use a metaclass approach to enable a declarative style of writing new models or form classes. While the metaclass is creating the class, all members get the possibility to customize the class itself.

The thing that's left to say is: If you don't know what metaclasses are, the probability that you will not need them is 99%.




Python 3 update

There are (at this point) two key methods in a metaclass:

  • __prepare__, and
  • __new__

__prepare__ lets you supply a custom mapping (such as an OrderedDict) to be used as the namespace while the class is being created. You must return an instance of whatever namespace you choose. If you don't implement __prepare__ a normal dict is used.

__new__ is responsible for the actual creation/modification of the final class.

A bare-bones, do-nothing-extra metaclass would like:

class Meta(type):

    def __prepare__(metaclass, cls, bases):
        return dict()

    def __new__(metacls, cls, bases, clsdict):
        return super().__new__(metacls, cls, bases, clsdict)

A simple example:

Say you want some simple validation code to run on your attributes -- like it must always be an int or a str. Without a metaclass, your class would look something like:

class Person:
    weight = ValidateType('weight', int)
    age = ValidateType('age', int)
    name = ValidateType('name', str)

As you can see, you have to repeat the name of the attribute twice. This makes typos possible along with irritating bugs.

A simple metaclass can address that problem:

class Person(metaclass=Validator):
    weight = ValidateType(int)
    age = ValidateType(int)
    name = ValidateType(str)

This is what the metaclass would look like (not using __prepare__ since it is not needed):

class Validator(type):
    def __new__(metacls, cls, bases, clsdict):
        # search clsdict looking for ValidateType descriptors
        for name, attr in clsdict.items():
            if isinstance(attr, ValidateType):
                attr.name = name
                attr.attr = '_' + name
        # create final class and return it
        return super().__new__(metacls, cls, bases, clsdict)

A sample run of:

p = Person()
p.weight = 9
print(p.weight)
p.weight = '9'

produces:

9
Traceback (most recent call last):
  File "simple_meta.py", line 36, in <module>
    p.weight = '9'
  File "simple_meta.py", line 24, in __set__
    (self.name, self.type, value))
TypeError: weight must be of type(s) <class 'int'> (got '9')

Note: This example is simple enough it could have also been accomplished with a class decorator, but presumably an actual metaclass would be doing much more.

The 'ValidateType' class for reference:

class ValidateType:
    def __init__(self, type):
        self.name = None  # will be set by metaclass
        self.attr = None  # will be set by metaclass
        self.type = type
    def __get__(self, inst, cls):
        if inst is None:
            return self
        else:
            return inst.__dict__[self.attr]
    def __set__(self, inst, value):
        if not isinstance(value, self.type):
            raise TypeError('%s must be of type(s) %s (got %r)' %
                    (self.name, self.type, value))
        else:
            inst.__dict__[self.attr] = value



A metaclass is a class that tells how (some) other class should be created.

This is a case where I saw metaclass as a solution to my problem: I had a really complicated problem, that probably could have been solved differently, but I chose to solve it using a metaclass. Because of the complexity, it is one of the few modules I have written where the comments in the module surpass the amount of code that has been written. Here it is...

#!/usr/bin/env python

# Copyright (C) 2013-2014 Craig Phillips.  All rights reserved.

# This requires some explaining.  The point of this metaclass excercise is to
# create a static abstract class that is in one way or another, dormant until
# queried.  I experimented with creating a singlton on import, but that did
# not quite behave how I wanted it to.  See now here, we are creating a class
# called GsyncOptions, that on import, will do nothing except state that its
# class creator is GsyncOptionsType.  This means, docopt doesn't parse any
# of the help document, nor does it start processing command line options.
# So importing this module becomes really efficient.  The complicated bit
# comes from requiring the GsyncOptions class to be static.  By that, I mean
# any property on it, may or may not exist, since they are not statically
# defined; so I can't simply just define the class with a whole bunch of
# properties that are @property @staticmethods.
#
# So here's how it works:
#
# Executing 'from libgsync.options import GsyncOptions' does nothing more
# than load up this module, define the Type and the Class and import them
# into the callers namespace.  Simple.
#
# Invoking 'GsyncOptions.debug' for the first time, or any other property
# causes the __metaclass__ __getattr__ method to be called, since the class
# is not instantiated as a class instance yet.  The __getattr__ method on
# the type then initialises the class (GsyncOptions) via the __initialiseClass
# method.  This is the first and only time the class will actually have its
# dictionary statically populated.  The docopt module is invoked to parse the
# usage document and generate command line options from it.  These are then
# paired with their defaults and what's in sys.argv.  After all that, we
# setup some dynamic properties that could not be defined by their name in
# the usage, before everything is then transplanted onto the actual class
# object (or static class GsyncOptions).
#
# Another piece of magic, is to allow command line options to be set in
# in their native form and be translated into argparse style properties.
#
# Finally, the GsyncListOptions class is actually where the options are
# stored.  This only acts as a mechanism for storing options as lists, to
# allow aggregation of duplicate options or options that can be specified
# multiple times.  The __getattr__ call hides this by default, returning the
# last item in a property's list.  However, if the entire list is required,
# calling the 'list()' method on the GsyncOptions class, returns a reference
# to the GsyncListOptions class, which contains all of the same properties
# but as lists and without the duplication of having them as both lists and
# static singlton values.
#
# So this actually means that GsyncOptions is actually a static proxy class...
#
# ...And all this is neatly hidden within a closure for safe keeping.
def GetGsyncOptionsType():
    class GsyncListOptions(object):
        __initialised = False

    class GsyncOptionsType(type):
        def __initialiseClass(cls):
            if GsyncListOptions._GsyncListOptions__initialised: return

            from docopt import docopt
            from libgsync.options import doc
            from libgsync import __version__

            options = docopt(
                doc.__doc__ % __version__,
                version = __version__,
                options_first = True
            )

            paths = options.pop('<path>', None)
            setattr(cls, "destination_path", paths.pop() if paths else None)
            setattr(cls, "source_paths", paths)
            setattr(cls, "options", options)

            for k, v in options.iteritems():
                setattr(cls, k, v)

            GsyncListOptions._GsyncListOptions__initialised = True

        def list(cls):
            return GsyncListOptions

        def __getattr__(cls, name):
            cls.__initialiseClass()
            return getattr(GsyncListOptions, name)[-1]

        def __setattr__(cls, name, value):
            # Substitut option names: --an-option-name for an_option_name
            import re
            name = re.sub(r'^__', "", re.sub(r'-', "_", name))
            listvalue = []

            # Ensure value is converted to a list type for GsyncListOptions
            if isinstance(value, list):
                if value:
                    listvalue = [] + value
                else:
                    listvalue = [ None ]
            else:
                listvalue = [ value ]

            type.__setattr__(GsyncListOptions, name, listvalue)

    # Cleanup this module to prevent tinkering.
    import sys
    module = sys.modules[__name__]
    del module.__dict__['GetGsyncOptionsType']

    return GsyncOptionsType

# Our singlton abstract proxy class.
class GsyncOptions(object):
    __metaclass__ = GetGsyncOptionsType()



The tl;dr version

The type(obj) function gets you the type of an object.

The type() of a class is its metaclass.

To use a metaclass:

class Foo(object):
    __metaclass__ = MyMetaClass



The type() function can return the type of an object or create a new type,

for example, we can create a Hi class with the type() function and do not need to use this way with class Hi(object):

def func(self, name='mike'):
    print('Hi, %s.' % name)

Hi = type('Hi', (object,), dict(hi=func))
h = Hi()
h.hi()
Hi, mike.

type(Hi)
type

type(h)
__main__.Hi

In addition to using type() to create classes dynamically, you can control creation behavior of class and use metaclass.

According to the Python object model, the class is the object, so the class must be an instance of another certain class. By default, a Python class is instance of the type class. That is, type is metaclass of most of the built-in classes and metaclass of user-defined classes.

class ListMetaclass(type):
    def __new__(cls, name, bases, attrs):
        attrs['add'] = lambda self, value: self.append(value)
        return type.__new__(cls, name, bases, attrs)

class CustomList(list, metaclass=ListMetaclass):
    pass

lst = CustomList()
lst.add('custom_list_1')
lst.add('custom_list_2')

lst
['custom_list_1', 'custom_list_2']

Magic will take effect when we passed keyword arguments in metaclass, it indicates the Python interpreter to create the CustomList through ListMetaclass. new (), at this point, we can modify the class definition, for example, and add a new method and then return the revised definition.




Related