You want to make CRUD operations for multiple database records that rely on Django models.
To create multiple records associated with a Django model use the
bulk_create() method with an argument list of model instances. Be aware the
bulk_create method does not support pre-save and post-save model signals or models that span multiple tables (i.e. have relationships among one another). To overcome the limitations of the
bulk_create() method, you can use the standard
save() method and manage transactions manually to speed up a multiple record creation operation.
To read multiple records using a Django model you can use the
exclude() methods, all of which can be chained to create more complex queries. The
exclude() methods produce a
QuerySet data type which has a built-in cache mechanism and is lazy (i.e. it doesn't hit the database until certain triggers are reached).
To update multiple database records you can use the
update() method with a set of arguments to update, as well the
select_for_update() method which acquires a lock on the query rows that gives you time to perform complex updates that can't be expressed in a single statement. To delete multiple database records you can call the
delete() method to a query.
Working with multiple records in Django models is just as easy as working with single records that were described in the previous recipe. However, because working with multiple records can entail multiple database calls, other factors such as caching and bulk operations need to be taken into account to minimize execution times.
To create multiple records based on a Django model you can use the built-in
bulk_create() method. The advantage of the
bulk_create() method is that it creates all entries in a single query, so it's very efficient if you have a list of a dozen or a hundred entries you wish to create. Listing 1 illustrates the process to create multiple records for the
As you can see in listing 1, the
bulk_create() method accepts a list of model instances to create all records in a single step.
As efficient as the
bulk_create() method is, you should be aware it has certain limitations:
save()method to create single records, the
bulk_create()method does not execute pre-save and post-save model signals. In case you're unfamiliar with the concepts, pre-save and post-save model signals allow the execution of custom logic prior and after a model record is saved.
save()method which does support creating records that span multiple tables.
In case you face some of the limitations outlined for the
bulk_create() method, then the only alternative you have is to loop over each record and use the
save() method to create each entry individually. Listing 2 shows this process.
As I mentioned at the start, the process in listing 2 can be highly inefficient particularly if its done for dozens or hundreds of records, but sometimes it's the only option for creating multiple records in bulk. However, the speed issues related to listing 2 can be dramatically improved if you manually deal with database transactions.
Listing 3 illustrates how to use the
save() method and group the entire record creation process in a single transaction to speed up the bulk creation process.
save()method in a single transaction
As you can see in listing 3, there are two ways to create bulk operations in a single database transaction, both using the
django.db.transaction package.The first instance uses the
with transaction.atomic(): statement, so any nested code within this statement is run in a single transaction. The second instance uses the
@transaction.atomic method decorator, by decorating a method this way all of its operations are run in a single transaction.
|Be careful with explicit transactions|
There is a reason Django's default database transaction mechanism creates transactions for every single query, it's to err on the safe side and minimize the potential for data loss.
If you do decide to use explicit transactions as illustrated in listing 3 to improve performance, be aware that either all records will be created or none will. Although this might even be a desired behavior, for certain circumstances it might lead to unexpected results. Make sure you understand the implications of transactions on the data you're working with.
To read multiple records associated with a Django model you can use several methods, which include
exclude(). The purpose of the
all() method should be self explanatory, it retrieves all the records of a given model. The
filter() method is used to restrict query results on a given model property, for example
filter(state='CA') is a query to get all model records with
state='CA'. And the
exclude() method is used to execute a query that excludes records on a given model property, for example
exclude(state='AZ') is a query to get all model records except those with
It's also possible to chain
exclude() methods to create more complex multiple record queries. For example,
filter(state='CA').exclude(city='San Diego') is a query to get all model records with
state='CA' and exclude those with
city='San Diego'. Listing 4 illustrates more multiple record query examples.
| Append |
Sometimes it can be helpful or even necessary to view the actual SQL executed by a Django model query. You can do so by appending
As you can see in the previous snippet, you can output the SQL query to a Python logger or use the 'quick & dirty'
|More complex mutliple record queries with OR, AND, IN, LIKE and other variations|
In addition to the previous multiple record queries that match exact criteria (e.g.
See the recipe Django model queries classified by SQL keywords for a classification of the different Django query options classified by SQL keyword.
All the queries in listing 4 have a common behavior, their end result is always a Django
QuerySet data type. The
QuerySet data type is produced for all Django model queries that use either the
exclude() methods. So before we move on to the final two parts of this recipe -- how to update and delete multiple records -- we'll take a brief detour to explore the
QuerySet data type.
QuerySet: Lazy evaluation & caching
The first important characteristic of a
QuerySet data type is technically known as lazy evaluation. This means a
QuerySet isn't executed against the database right away, it's just lazy until its evaluated. In other words, the act of running a snippet like
all_stores = Store.objects.all() doesn't involve any database activity. Listing 5 illustrates how you can even chain query after query and still not trigger any database activity.
As you can see in listing 5, there are three different statements chained using the
exclude() methods. In the end, listing 5 makes a query for
Store records with
state='CA' and excludes those with
city='San Diego'. Though it can appear listing 5 makes three database calls, it actually doesn't make any.
So when does a query made up of a
QuerySet data type hit the database ? There are many triggers that make a
QuerySet evaluate and invoke an actual database call. Table 1 illustrates the various triggers.
|Iteration||Creating a loop on a |
|Slicing with 'step' argument||Slicing a |
|Pickling*||Pickling a |
|repr() method||Calling |
|len() method||Calling |
|list() method||Calling |
|Boolean tests (bool(), or, and or if statements)||Making a boolean test on a |
|* Pickling is Python's standard mechanism for object serialization, a process that converts a Python object into a character stream. The character stream contains all the information necessary to reconstruct the object at a later time. Pickling in the context of Django queries is typically used for heavyweight queries in an attempt to save resources (e.g. make a heavyweight query, pickle it and on subsequent occasions consult the pickled query). You can consider pickling Django queries a rudimentary form of caching.|
Now that you know about the evaluation triggers that cause a
QuerySet to make a call to a DB, let's take a final look at another important
QuerySet subject: caching.
QuerySet contains a cache to minimize database access which is empty when the
QuerySet is created. But the first time a
QuerySet is evaluated and the actual database query takes place -- see evaluation triggers in table 1 -- Django saves the results in the
QuerySet's cache for later use.
QuerySet's cache is most useful when an application has a recurring need to use the same data, as it leads to less hits on a database. However, leveraging a
QuerySet's cache comes with a few subtleties tied to the evaluation of a
QuerySet. The rule of thumb is to first evaluate a
QuerySet you plan to use more than once and then proceed to use its data to leverage the
QuerySet cache. This is best explained with the examples presented in listing 6.
As you can see in the examples in listing 6, those sequences that leverage a
QuerySet's cache trigger the evaluation of the
QuerySet right away and then use a reference to the evaluated
QuerySet to access the cached data. Those sequences that don't use a
QuerySet cache, either constantly create identical
QuerySet statements or make the evaluation process late and for each data assignment.
The only edge case for caching
QuerySet's that doesn't exactly fit the previous behavior is the second to last example in listing 8. If you trigger a partial evaluation of
QuerySet by slicing it (e.g.  or [1:5]) the cache is not populated. To ensure the use of a
QuerySet cache you must fully evaluate a
QuerySet and then you can slice the results, as illustrated in the last example in listing 8.
In the previous recipe on single record CRUD operations, we explored how to update single records with the
update() method, this same method can handle updating multiple records. This process is illustrated in listing 7.
The first example in listing 7 uses the
update() method to update all
Store records and set their
email@example.com. The second example uses a Django F expression and the
update() method to update all
Drink records and set their
stock value to the current
stock value plus
100. For the moment, don't worry to much about Django F expressions, just realize they allow you to reference model fields within a query, which is necessary in this case to perform the update in a single operation.
Even though the
update() method guarantees everything is done in a single operation to avoid race conditions, on certain occasions the
update() method may not be enough to do complex updates. Offering another alternative is Django's
select_for_update() method which locks rows on the given query until the update is marked as done. Listing 8 illustrates an example of the
Listing 8 shows two variations for
select_for_update(), one using an explicit transaction and the other decorating a method to scope it inside a transaction. Both variations use the same logic, they first create a query with
select_for_update(), then loop over the results to update each record and use
save() to update individual records. In this manner the rows touched by the specified query remain locked from inadvertent changes until the transaction finishes.
Be aware that when using the
select_for_update() it's absolutely necessary to use transactions using any of the techniques described in listing 8. If you run the
select_for_update() method in a database that supports it and you don't use transactions as illustrated in listing 8 -- maintaining Django's default auto-commit mode -- Django throws a
TransactionManagementError error because the rows cannot be locked as a group. Using the
select_for_update method in a database that offers no support for it has no effect (i.e. you won't see an error).
Under the hood the Django
In addition, there's the special argument
To delete multiple records you use the
delete() method and append it to a query. Listing 9 illustrates this process.
delete()method on query
The example in listing 9 uses the
delete() method to delete the
Store records with