Django model default and custom behaviors: methods, managers and meta options

Problem

You want to understand the default functionality in all Django models and also customize these default behaviors.

How it works

When you create a Django model class, it always inherits its behavior from the django.db.models.Model class. This class gives a Django model a great deal of functionality, including basic operations via methods like save() and delete(), as well as naming conventions and query behaviors for the model as a whole. In most circumstances the behaviors provided by the django.db.models.Model class are sufficient, but in other cases you'll want to customize some of these default behaviors.

Next, I'll enumerate a Django model's various constructs provided via the django.db.models.Model class. For popular constructs that have the most elaborate behavior (e.g. save() method) I'll use a dedicated section and for simpler constructs that are related to one another (e.g. validation methods) I'll group them into their own section.

save() method

The save() method offers one of the most common operations for Django models: to save (i.e. create or update) a record on a database. Once you create or have a reference to a Django model instance, you can call the save() method to create/update the instance on a database. Listing 1 illustrates this process.

Listing 1 - Django model use of the save() method

# Import Django model class
from coffeehouse.stores.models import Store

# Create a model Store instance
store_corporate = Store(name='Corporate',address='624 Broadway',city='San Diego',state='CA',email='corporate@coffeehouse.com')
# Invoke the save() method to create/save the record
# No record id reference, so operation is create and reference is updated with id
store_corporate.save() 
# Change field on instance
store_corporate.address='625 Broadway'
# Invoke the save() method to update/save the record
# Record has id reference from prior save() call, so operation is update 
store_corporate.save()

As you can see in listing 1, two calls are made to the save() method on the same reference: the first one creates the record on the database and the second one updates it. Can you tell how Django knows when to create and when to update a record with the same save() method? It's not in plain sight, so don't worry if you can't spot it.

In the previous recipe when you first set up Django models I mentioned Django automatically adds an id field as a primary key to all Django models to make queries easier and more efficient. The presence of this id primary key is what Django uses to determine if the save() method performs a create or update operation.

Notice the initial Store reference in listing 1 lacks an explicit id primary key value. Once you invoke the save() method on this reference, Django attempts to create a new record because it can't find an id primary key value. If the create operation is successful, the database assigns the record an id primary key value, which is returned to Django so it can update the reference with this value.

Since Django automatically updates the record reference with an id primary key, on subsequent calls to the save() method on the same reference Django detects the id primary key value and performs an update operation based on it. In case you're wondering, if you add an explicit id primary key value to a record reference, Django also attempts an update first, because the id is the flag it looks for to decide between creating and updating a record; only if no record with that id exists does it fall back to a create operation. So be aware that placing an explicit id primary key on a reference updates/overwrites the database record associated with that id.
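The create-or-update decision can be sketched without Django at all. The following is a minimal illustration, not Django's implementation: it uses Python's built-in sqlite3 module, a plain dictionary in place of a model instance, and a hypothetical save() helper and store table that I made up for the example. It also models the fallback Django applies when a reference carries an id that matches no existing row, in which case the record is created.

```python
import sqlite3


def save(conn, record):
    """Toy stand-in for save(): pick INSERT or UPDATE by looking at the id."""
    if record.get("id") is None:
        # No id yet: create the row and copy the database-assigned id back.
        cur = conn.execute(
            "INSERT INTO store (name, address) VALUES (?, ?)",
            (record["name"], record["address"]),
        )
        record["id"] = cur.lastrowid
        return "insert"
    # An id is present: try to update the matching row first.
    cur = conn.execute(
        "UPDATE store SET name = ?, address = ? WHERE id = ?",
        (record["name"], record["address"], record["id"]),
    )
    if cur.rowcount == 0:
        # No row carries this id, so fall back to creating it.
        conn.execute(
            "INSERT INTO store (id, name, address) VALUES (?, ?, ?)",
            (record["id"], record["name"], record["address"]),
        )
        return "insert"
    return "update"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, address TEXT)"
)

store_corporate = {"id": None, "name": "Corporate", "address": "624 Broadway"}
first = save(conn, store_corporate)    # no id yet, so a row is created
store_corporate["address"] = "625 Broadway"
second = save(conn, store_corporate)   # id is now set, so the row is updated
```

The important detail is that the first call writes the new id back onto the in-memory record, which is what makes the second call take the update branch.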

Now that you have a firm understanding of a Django model's save() method default behaviors, I'll describe the various options available for the save() method. The save() method accepts a series of arguments to override its default behavior. Table 1 lists these arguments, their default values and their behavior.

Table 1 - Django model save() method arguments
Argument | Default | Description
force_insert | force_insert=False | Explicitly tells Django to force a create operation on a record (e.g. .save(force_insert=True)). This is rarely used but can be helpful for cases when you don't or can't rely on Django detecting a create operation (i.e. via the id primary key)
force_update | force_update=False | Explicitly tells Django to force an update operation on a record (e.g. .save(force_update=True)). This is rarely used but can be helpful for cases when you don't or can't rely on Django detecting an update operation (i.e. via the id primary key)
using | using=DEFAULT_DB_ALIAS, where DEFAULT_DB_ALIAS is a constant with a value of default | Allows the save() to perform the operation against a database that's not the default value in settings.py (e.g. .save(using='oracle') performs the operation against the oracle database, where oracle is a key in the DATABASES variable in settings.py, see Set up a database for a Django project for more details on this topic)
update_fields | update_fields=None | Accepts a list of fields to update (e.g. .save(update_fields=['name']) only updates a record's name value). Helpful when you have large models and want to do a more efficient/granular update, because by default Django updates all model fields.

The option you'll likely end up using the most from table 1 is update_fields because it produces a performance boost. However, table 1 gives you the full series of options in case you hit an edge-case with the save() method.
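To make the effect of update_fields more concrete, here is a small Django-free sketch of the idea. The build_update() helper is hypothetical and is not how Django compiles its SQL; it only shows how limiting the field list shrinks the UPDATE statement that reaches the database.

```python
def build_update(table, values, pk, update_fields=None):
    """Return (sql, params) for an UPDATE, limited to update_fields when given."""
    # Without update_fields every column is written, mirroring the default behavior.
    columns = list(values) if update_fields is None else list(update_fields)
    unknown = [c for c in columns if c not in values]
    if unknown:
        raise ValueError("Unknown fields: %s" % ", ".join(unknown))
    assignments = ", ".join("%s = ?" % c for c in columns)
    sql = "UPDATE %s SET %s WHERE id = ?" % (table, assignments)
    params = [values[c] for c in columns] + [pk]
    return sql, params


values = {"name": "Corporate", "address": "625 Broadway", "city": "San Diego"}

# Default: all fields are part of the statement.
sql_all, params_all = build_update("store", values, pk=1)

# With update_fields: only the name column is touched.
sql_name, params_name = build_update("store", values, pk=1, update_fields=["name"])
```

With three fields the difference is small, but on a model with dozens of columns the second statement sends far less data and leaves untouched columns alone.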

Finally, to close our discussion on the save() method, it's possible to define an implementation of the save() method on each Django model to execute custom logic or do what's technically known as override the method's behavior. Listing 2 illustrates this process.

Listing 2 - Django model with custom save() method


from django.db import models

class Store(models.Model):
    name = models.CharField(max_length=30)
    address = models.CharField(max_length=30)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)

    def save(self, *args, **kwargs):
        # Do custom logic here (e.g. validation, logging, call third party service)
        # Run default save() method
        super(Store,self).save(*args, **kwargs)

Notice the save() method in listing 2 is declared alongside the Django model's fields. In this case, when a call is made to save() on a reference for this type of model (e.g. downtown.save()), Django runs the model's custom save() method. This is helpful in circumstances where you want to perform other actions (e.g. log a message, call a third party service) when a model instance is created or updated. The last statement in the custom save() method, super(Store,self).save(*args, **kwargs), tells Django to run the base save() method from django.db.models.Model. Without this call the record is never written to the database.

delete() method

The delete() method is used to eliminate a record from the database through a reference. For example, if in listing 1 you call store_corporate.delete() -- where store_corporate is the record reference -- Django removes the record from the database. Under the hood, the delete() method relies on the id primary key to remove the record, so it's a requirement for a reference to have this value.

When delete() is called on a reference, its id primary key value is removed but the record's remaining values remain in memory. In addition, the delete() method returns a tuple with the total number of deleted records and a dictionary with the number of deletions per model (e.g. (1, {u'stores.Store_amenities': 0, u'stores.Store': 1}) indicates 1 Store record was deleted, where stores.Store_amenities represents a relationship on the main Store model).

Similar to the save() method, the delete() method supports arguments, in this case two: using=DEFAULT_DB_ALIAS and keep_parents=False. The using argument allows you to specify an alternate database to perform the delete() operation on -- see table 1 for more details on this type of argument -- whereas the keep_parents argument is useful when the delete() operation takes place on a model with a relationship and you wish to keep the parent model's data intact, instead of removing it, which is the default. I'll provide more details on the use of keep_parents in upcoming recipes that address Django model relationships.

And finally, it's also possible to define a custom delete() method on a Django model class -- just like save() in listing 2 -- to execute custom logic (e.g. create an audit trail) when a record is removed.

Validation: clean_fields(), clean(), validate_unique(), full_clean()

When you create or update a Django model instance with the save() method, the instance values are expected to comply with the model definition. For example, if you use the model field name = models.CharField(max_length=30), the name value must be text with at most 30 characters. The tricky part to understand about Django model instance validation is that it's done on two layers: the database layer and the Django/Python layer.

Once you have a Django model and create its initial migration -- as described in the Set up Django models and understand the migrations workflow recipe -- Django generates the database DDL (Data definition language) to create a database table in accordance with the model definition (e.g. the Django model field CharField(max_length=30) generates a varchar(30) NOT NULL database column type). Therefore, due to this initial database DDL, Django model values that don't comply with the rules expressed in the DDL are rejected at the database layer, where it's the actual database that performs the validation.

Although relying on Django model validation at the database layer is perfectly valid, Django also supports model instance validation at the Django/Python layer. Using model validation at the Django/Python layer has the advantage of supporting more complex validation rules, as well as reducing database load for operations that will end up being rejected by the database. However, unlike database layer validation, which is in place automatically after Django's first migration, Django/Python layer validation requires that you call, and in some cases implement, one of a series of Django model methods designed for validation, which are the topic of this section.

Note DDL definitions for Django model data types

The next recipe Django model data types: Options and validations describes the DDL generated for each Django model data type to enforce model validation at the database layer.

Let's explore the Django model validation clean_fields() method first. Listing 3 illustrates a simple model definition, followed by a call sequence that uses the clean_fields() method.

Listing 3 - Django model use of validation clean_fields() method

class Store(models.Model):
    name = models.CharField(max_length=30)    
    address = models.CharField(max_length=30,unique=True)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)
    email = models.EmailField()

# Create a model Store instance, that violates the max_length rule
store_corporate = Store(name='This is a very long name for the Corporate store that exceeds the 30 character limit',address='624 Broadway',city='San Diego',state='AZ',email='corporate@coffeehouse.com')
# No error yet
# You could call save() and let the database reject the instance...
# But you can also validate at the Django/Python level with the clean_fields() method 
store_corporate.clean_fields()
Traceback (most recent call last):
    raise ValidationError(errors)
ValidationError: {'name': [u'Ensure this value has at most 30 characters (it has 84).']}

First off, notice the model's name field in listing 3 uses the max_length=30 option to cap values at 30 characters. After the model definition, you can see the store_corporate instance breaks this rule with a value longer than 30 characters and yet no error is raised, which means Django doesn't detect broken model rules at instance creation. While you could attempt to call save() on this instance and let the database reject the operation via its DDL, you can instead call the clean_fields() method on the instance to tell Django to check the instance values against the model data types and raise an error.

Also notice the clean_fields() method raises a ValidationError exception that carries a dictionary. This dictionary maps each model field to its error messages, making it easy to identify multiple validation errors and reuse this data for other purposes (e.g. logging, presenting the error in a template).

While the clean_fields() method validates model values individually against their data types, the clean() method can be used to enforce more elaborate rules (e.g. relationships or specific values). By default, the clean() method does nothing, so you must provide an implementation for it as illustrated in listing 4.

Listing 4 - Django model use of validation clean() method

from django.core.exceptions import ValidationError
from django.db import models

class Store(models.Model):
    name = models.CharField(max_length=30)
    address = models.CharField(max_length=30,unique=True)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)
    email = models.EmailField()

    def clean(self):
        # Don't allow 'San Diego' city entries that have state different than 'CA'
        if self.city == 'San Diego' and self.state != 'CA':
            raise ValidationError('Wait San Diego is in CA!, are you sure there is another San Diego in %s ?' % (self.state))


# Create a model Store instance, that violates the San Diego/CA rule in clean()
store_corporate = Store(name='Corporate',address='624 Broadway',city='San Diego',state='AZ',email='corporate@coffeehouse.com')

# To enforce more complex rules call the clean() method implemented on a model
store_corporate.clean()
Traceback (most recent call last):
    raise ValidationError('Wait San Diego is in CA!, are you sure there is another San Diego in %s ?' % (self.state))
ValidationError: [u'Wait San Diego is in CA!, are you sure there is another San Diego in AZ ?']

Notice in listing 4 the Django model class defines the clean() method. In this case, the method enforces that if an instance's city value is San Diego its state value must be CA; if this condition is not met, a ValidationError exception is raised. Next, notice that once you invoke the clean() method on a model instance, a ValidationError exception is raised, similar to the clean_fields() method.

Another Django validation mechanism you can use is the validate_unique() method, which enforces that no two instances have the same value for a field that uses unique* options. Listing 5 illustrates the use of the validate_unique() method.

Listing 5 - Django model use of validation validate_unique() method with unique* fields

class Store(models.Model):
    name = models.CharField(max_length=30)    
    address = models.CharField(max_length=30,unique=True)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)
    email = models.EmailField()

# Create a model Store instance
store_corporate = Store(name='Downtown',address='624 Broadway',city='San Diego',state='AZ',email='corporate@coffeehouse.com')

# Save instance 
store_corporate.save()

# Create another instance to violate uniqueness of address field 
store_uptown = Store(name='Uptown',address='624 Broadway', city='San Diego',state='CA')
# You could call save() and let the database reject the instance...
# But you can also validate at the Django/Python level with the validate_unique() method 
store_uptown.validate_unique()
Traceback (most recent call last):
    raise ValidationError(errors)
ValidationError: {'address': [u'Store with this Address already exists.']}

Look at how the address field of the Store model in listing 5 uses unique=True, which tells Django not to allow two Store instances with the same address value. Next, we create a Store instance with address='624 Broadway' and save it to the database. Right after, we create another Store instance with the same address='624 Broadway' value, but because the address model field has the unique option this new instance is in violation of the rule. Therefore when you call the validate_unique() method on the store_uptown reference Django raises a ValidationError exception indicating there's already a Store record with the same address in the database. Note the next recipe Django model data types: Options and validations, unique values describes Django's various unique* options for model fields.

In addition to performing validation on fields marked with unique* options, the validate_unique() method also enforces validation for the unique_together option declared in a model's Meta class. This variation is illustrated in listing 6.

Listing 6 - Django model use of validation validate_unique() method with Meta unique_together option

class Store(models.Model):
    name = models.CharField(max_length=30)    
    address = models.CharField(max_length=30,unique=True)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)
    email = models.EmailField()
    class Meta:
        unique_together = ("name", "email")

# Create instance to show use of validate_unique() via Meta option
store_downtown_horton = Store(name='Downtown',address='Horton Plaza',city='San Diego',state='CA',email='downtown@coffeehouse.com')
# Save instance to DB
store_downtown_horton.save()

# Create additional instance that violates unique_together rule in Meta class
store_downtown_fv = Store(name='Downtown',address='Fashion Valley',city='San Diego',state='CA',email='downtown@coffeehouse.com')

# You could call save() and let the database reject the instance but let's use validate_unique()
store_downtown_fv.validate_unique()
Traceback (most recent call last):
ValidationError: {'__all__': [u'Store with this Name and Email already exists.']}

Notice how the class in listing 6 declares the Meta class followed by unique_together = ("name", "email"), which tells Django not to allow two Store instances with the same name and email value.

Next, two Store records are created with the same name and email value. Because this is in violation of the Django model meta option unique_together = ("name", "email"), after you save the first record store_downtown_horton and call the validate_unique() method on the second record store_downtown_fv, Django raises a ValidationError exception indicating there's already a Store record with the same name and email values in the database. Toward the end of this recipe I'll describe a Django model's Meta class options in greater detail.

Finally, the last validation method available on all Django models is the full_clean() method which is a shortcut to run the clean_fields(), clean() and validate_unique() methods -- in that order.
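The call order inside full_clean() can be sketched with plain Python classes. This is a simplified stand-in and not Django's code: the ValidationError class, the Store class and the existing_addresses set are all made up for the illustration. Keep in mind that save() does not call full_clean() on its own, so you need to invoke it explicitly if you want Django/Python layer validation before saving.

```python
class ValidationError(Exception):
    """Minimal stand-in for django.core.exceptions.ValidationError."""

    def __init__(self, errors):
        super().__init__(errors)
        self.errors = errors


class Store:
    # Pretend these addresses already exist in the database.
    existing_addresses = {"624 Broadway"}

    def __init__(self, name, address, city, state):
        self.name, self.address, self.city, self.state = name, address, city, state
        self.calls = []  # records the order in which the checks run

    def clean_fields(self):
        self.calls.append("clean_fields")
        if len(self.name) > 30:
            raise ValidationError({"name": ["Name has more than 30 characters."]})

    def clean(self):
        self.calls.append("clean")
        if self.city == "San Diego" and self.state != "CA":
            raise ValidationError({"__all__": ["San Diego stores must use state CA."]})

    def validate_unique(self):
        self.calls.append("validate_unique")
        if self.address in self.existing_addresses:
            raise ValidationError({"address": ["Store with this Address already exists."]})

    def full_clean(self):
        # Run the three checks in order and gather every error before raising.
        errors = {}
        for check in (self.clean_fields, self.clean, self.validate_unique):
            try:
                check()
            except ValidationError as exc:
                for field, messages in exc.errors.items():
                    errors.setdefault(field, []).extend(messages)
        if errors:
            raise ValidationError(errors)


bad_store = Store("x" * 40, "624 Broadway", "San Diego", "AZ")
try:
    bad_store.full_clean()
    collected = {}
except ValidationError as exc:
    collected = exc.errors
```

Because the errors are gathered before anything is raised, a single full_clean() call can report a field problem, a clean() problem and a uniqueness problem at the same time.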

objects reference or Django model manager

The save() and delete() methods, as well as the multiple validation methods -- clean_fields(), clean(), validate_unique(), full_clean() -- all operate on Django model instance references, which means they're designed to be called on individual database records. On many occasions though, you'll need to perform operations that span various Django model instances or database records, which is where the objects reference comes into play.

The objects reference -- technically known as a Django model manager -- is available on all Django models and is charged with managing all query operations associated with a Django model. This means that if you want to read any amount of Django model records or perform bulk model operations (e.g. create, update or delete multiple records) you'll end up using a Django model's objects reference.

A Django model's objects reference is used directly on a model class. For example, to read all Store model records you would use the Store.objects.all() syntax and to delete all Store model records you would use the Store.objects.all().delete() syntax. Because the functionality of a model's objects reference is very extensive, I won't go deeper than these two simple examples. The CRUD operations with multiple records and Django models recipe is dedicated to exploring the majority of the objects reference's functionality, including some of its more subtle behaviors associated with QuerySet classes.

Finally, it's worth mentioning the objects reference is a convention for a Django model's default model manager. You can however change this default model manager reference to any other name (e.g. mgr) or even create multiple model managers for a single model. Because customizing Django model managers is only required for a minority of cases, I'll leave this discussion for another occasion.

refresh_from_db(), from_db() and get_deferred_fields() methods

The refresh_from_db() method is a helpful aid if you want to update a pre-existing model instance with data from the database, either because the database was updated by another process or because you accidentally (or purposely) changed the model instance and want it to reflect the data in the database once again. Using the refresh_from_db() method is as simple as executing it on a model reference (e.g. downtown.refresh_from_db() updates the downtown instance from values in the database).

Although the refresh_from_db() method is generally called without arguments, it does support two optional arguments. The using argument can be used to specify an alternate database from which to perform the refresh operation, a mechanism that works just like the option used in the save() and delete() methods and is described in table 1. The fields argument can be used to selectively refresh certain model fields; if no fields argument is provided, the refresh_from_db() method refreshes all model fields.
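The difference between a full refresh and a refresh limited by the fields argument can be shown with a small stand-in. The sketch below does not use Django: the refresh_from_db() function, the store table and the dictionary that plays the role of a model instance are all hypothetical and built on Python's sqlite3 module.

```python
import sqlite3


def refresh_from_db(conn, record, fields=None):
    """Toy stand-in: reload values for `record` from the store table by id."""
    # Reload every non-id column unless a subset was requested.
    # Column names are trusted here because this is only a sketch.
    columns = list(fields) if fields else [k for k in record if k != "id"]
    row = conn.execute(
        "SELECT %s FROM store WHERE id = ?" % ", ".join(columns),
        (record["id"],),
    ).fetchone()
    for column, value in zip(columns, row):
        record[column] = value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO store (id, name, city) VALUES (1, 'Downtown', 'San Diego')")

downtown = {"id": 1, "name": "Downtown", "city": "San Diego"}

# Another process changes the row, and the in-memory copy is edited by accident.
conn.execute("UPDATE store SET city = 'La Jolla' WHERE id = 1")
downtown["name"] = "Oops"

refresh_from_db(conn, downtown, fields=["city"])  # only city is reloaded
city_only = dict(downtown)
refresh_from_db(conn, downtown)                   # every field is reloaded
```

After the first call the accidental name change is still in memory, while the second call restores it from the database.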

In most circumstances, the initial loading mechanism for Django model instances is reasonable and sufficient. However, if you want to customize the default loading mechanism you can define the from_db() method. Unlike the refresh_from_db() method, which you call on a model instance, the from_db() method is a class method that you don't normally call yourself: you define it on a model class and Django calls it every time a model instance is created from database data. So what would be a good reason to use the from_db() method? One is to control how instances are built when the loading of some model field data is deferred.

For example, if you start to work with large Django models (e.g. more than 10 fields) you may notice a performance hit from accessing large amounts of data at once. To minimize this performance hit, you can defer the loading of certain model fields in a query (e.g. with a QuerySet's defer() or only() methods), instead of having Django load the full field data set at once, which it does by default, and use a from_db() method to customize how these partially loaded instances are created.

Complementing the functionality of deferred model fields is the get_deferred_fields() method, which returns a set with the names of the model fields that have been deferred from loading. Although the from_db() and get_deferred_fields() methods don't have as many usage scenarios as the refresh_from_db() method and are not as widely used as something like the save() or delete() methods, you may encounter a need for these two model methods once you work with larger and more complex models.

Django model custom methods

All the methods I've described up to this point come from Django's django.db.models.Model class. While it's important to learn how to use these methods and provide your own implementation for them, this doesn't necessarily mean a Django model class is restricted to using just these methods. You can use your own custom model class methods, as illustrated in listing 7.

Listing 7 - Django model with custom method


class Store(models.Model):
    name = models.CharField(max_length=30)
    address = models.CharField(max_length=30)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)

    def latitude_longitude(self):
        # Call remote service to get latitude & longitude
        # (geocoding_method is a placeholder for your own geocoding helper)
        latitude, longitude = geocoding_method(self.address, self.city, self.state)
        return latitude, longitude

The latitude_longitude method in listing 7 gives the Django model the ability to offer a common calculation on the model instance. For example, if you had a Store instance called downtown you could call downtown.latitude_longitude() to get a result based on the instance's address, city and state values aided by a remote service. This type of custom method is helpful because it keeps the logic on the Django model where it favors encapsulation.

Django model Meta class and options

In a previous section -- in listing 6 -- I made use of the Meta class on a Django model to enforce the uniqueness of model field values. In this section I'll expand on the purpose and various options available for a Django model's Meta class.

The Meta class in a Django model is intended to define behaviors associated with a Django model as a whole, whereas Django model data types work at a granular level, defining characteristics of individual model fields, like how a field's data is stored and validated. So through Django's Meta class options it's possible to declare behaviors that operate across multiple model fields or influence the Django model as a whole.

By convention, the Meta class and its options are declared after a Django model's fields, as illustrated in listing 8.

Listing 8 - Django model with Meta class and ordering option


class Store(models.Model):
    name = models.CharField(max_length=30)
    address = models.CharField(max_length=30)
    city = models.CharField(max_length=30)
    state = models.CharField(max_length=2)

    class Meta:
        ordering = ['-state']

In listing 8 you can see the class Meta: statement declares the ordering = ['-state'] option. In this case, the ordering option tells Django that when a query is made on the model it should order the results by the state field in descending order. The ordering meta option is helpful because it sets a default query order for the model -- which is otherwise not guaranteed -- and it avoids the need to constantly and explicitly declare a model query's sort order.
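If the '-state' notation is new to you, the following Django-free sketch shows how a list of ordering names can be read: a plain name sorts ascending and a leading minus sign sorts descending. The apply_ordering() helper and the sample rows are hypothetical, and in Django the sorting is done by the database through an ORDER BY clause rather than in Python.

```python
def apply_ordering(rows, ordering):
    """Sort dict rows by Django-style names; a leading '-' means descending."""
    result = list(rows)
    # Sort by the last key first: Python's sort is stable, so the first key
    # in the list ends up being the most significant one.
    for key in reversed(ordering):
        descending = key.startswith("-")
        field = key[1:] if descending else key
        result.sort(key=lambda row: row[field], reverse=descending)
    return result


stores = [
    {"name": "Downtown", "state": "CA"},
    {"name": "Uptown", "state": "CA"},
    {"name": "Phoenix", "state": "AZ"},
    {"name": "Austin", "state": "TX"},
]

by_state_desc = apply_ordering(stores, ["-state"])
by_state_then_name = apply_ordering(stores, ["-state", "name"])
```

The second call shows that an ordering list can hold several names, with later names only breaking ties left by earlier ones.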

Now that you have a better idea of a Django model's Meta class, in the upcoming sections I'll classify the various Meta options by category so you can easily identify them and use them appropriately.

Data Definition Language (DDL) Meta options: db_table, db_tablespace, managed, required_db_vendor, required_db_features, index_together

By default, the database table name for a Django model is based on the app name and model, with all lowercase letters and separated by an underscore. For example, if the app name is stores (e.g. django-admin.py startapp stores) and a model class in its models.py file is Amenity, by default Django stores this model's records in the stores_amenity database table. You can provide an explicit database table name for a Django model with the meta db_table option.
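The default naming rule is simple enough to express in one line. The default_db_table() helper below is a hypothetical illustration of the rule and not a Django function; it assumes the app name is already lowercase, which is the usual case.

```python
def default_db_table(app_label, model_name):
    """Mirror the default rule: <app name>_<model name>, model name lowercased."""
    return "%s_%s" % (app_label, model_name.lower())


amenity_table = default_db_table("stores", "Amenity")
# A multi-word class name is lowercased as-is; no underscore is added between words.
store_amenity_table = default_db_table("stores", "StoreAmenity")
```

The second example is worth noting because the resulting name can be hard to read, which is one practical reason to set db_table explicitly.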

By default, if a Django project's backing database brand (e.g. Oracle) supports the concept of a tablespace, Django uses the DEFAULT_TABLESPACE variable in settings.py as the default tablespace. It's possible to specify an explicit tablespace for a Django model through the meta db_tablespace option. Note that if a Django project's backing database doesn't support the concept of a tablespace, this option is ignored.

All Django models are subject to the life-cycle described in the Set up Django models and understand the migrations workflow recipe. As part of this life-cycle, Django manages the DDL that creates and/or destroys the backing database table for every Django model. If you want to disable Django executing a model's default DDL against a database, you can do so with the managed=False option in the Meta class. The managed=False option is useful when a model's backing database table is created by some other means and you therefore don't want Django to interfere with the management of this structure.

Because Django can work with different database back-ends (e.g. MySQL, Oracle, PostgreSQL) you can have situations where certain model definitions are designed to work with features that are not available on all database back-ends. To ensure a Django model is deployed against a certain database back-end, you can use two Meta class options. The required_db_vendor option accepts the values sqlite, postgresql, mysql or oracle to ensure a project's underlying database connection is of a given database vendor; if the connection does not match the specified vendor, the model is not migrated against the database. The required_db_features option is used to ensure a backing database connection supports a given list of features; if the connection does not have the specified features, the model is not migrated against the database.

Finally, the index_together option allows you to define a multi-field index on a Django model. When you declare the index_together option (e.g. index_together=['city','state']) Django executes the necessary DDL (e.g. CREATE INDEX...) to create this database structure, which speeds up query lookups on those fields.
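To get a feel for the database structure a multi-field index produces, the sketch below creates one by hand with Python's sqlite3 module. The table name and the index name stores_store_city_state_idx are assumptions for the example; Django generates its own index names and its DDL varies by database back-end. Also be aware that recent Django releases favor the Meta indexes option over index_together, so check the documentation for the version you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stores_store (id INTEGER PRIMARY KEY, name TEXT, city TEXT, state TEXT)"
)

# A single index that covers both columns, in the declared order.
conn.execute("CREATE INDEX stores_store_city_state_idx ON stores_store (city, state)")

# Inspect what the database now holds for this table.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('stores_store')")]
indexed_columns = [
    row[2] for row in conn.execute("PRAGMA index_info('stores_store_city_state_idx')")
]
```

The column order matters: an index on (city, state) helps queries that filter on city or on city and state together, but it is generally of little use for queries that filter on state alone.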

Naming convention Meta options: app_label, verbose_name, verbose_name_plural, label, label_lower

By default, Django models are placed in models.py files inside Django apps, as described in the Set up Django models and understand the migrations workflow recipe.

Inheritance Meta options: abstract, proxy

The meta abstract option allows a Django model to function as a base class that doesn't have a backing database table and is used as a foundation for other Django model classes. Listing x illustrates a set of Django models that use the abstract option.

Query Meta options: get_latest_by, order_with_respect_to, ordering, unique_together, default_manager_name, base_manager_name
Permission Meta options: permissions, default_permissions