CRUD multiple records in Django models

In this section you'll learn how to work with multiple records in Django models. Although the process is just as easy as working with single model records, working with multiple records can require multiple database calls, as well as caching techniques and bulk operations, all of which need to be taken into account to minimize execution times.

Create multiple records with bulk_create()

To create multiple records based on a Django model you can use the built-in bulk_create() method. The advantage of the bulk_create() method is that it creates all entries in a single query, so it's very efficient if you have a list of a dozen or a hundred entries you wish to create. Listing 8-12 illustrates the process to create multiple records for the Store model.

Listing 8-12 Create multiple records of a Django model with the bulk_create() method

# Import Django model class
from coffeehouse.stores.models import Store

# Create model Store instances
store_corporate = Store(name='Corporate',address='624 Broadway',
city ='San Diego',state='CA',email='corporate@coffeehouse.com')
store_downtown = Store(name='Downtown',address='Horton Plaza',
city ='San Diego',state='CA',email='downtown@coffeehouse.com')
store_uptown = Store(name='Uptown',address='240 University Ave',
city ='San Diego',state='CA',email='uptown@coffeehouse.com')
store_midtown = Store(name='Midtown',address='784 W Washington St',
city ='San Diego',state='CA',email='midtown@coffeehouse.com')

# Create store list
store_list = [store_corporate,store_downtown,store_uptown,store_midtown]

# Call bulk_create to create records in a single call
Store.objects.bulk_create(store_list)

In listing 8-12 you can see the bulk_create() method accepts a list of model instances to create all records in one step. But as efficient as the bulk_create() method is, you should be aware it has certain limitations:

It does not support pre-save and post-save model signals.- To speed things up and unlike the save() method to create single records, the bulk_create() method does not execute pre-save and post-save model signals. In you're unfamiliar with the model signal concept, pre-save and post-save model signals allow the execution of custom logic prior and after a model record is saved, a topic covered in the previous chapter.
It does not support models that span multiple tables (i.e. have relationships among one another).- Because records are created in bulk, there is no way to obtain primary key references for the first type of created records, which are then used to create related child records. If a model spans multiple tables, then you must individually create each record using the save() method which does support creating records that span multiple tables.

If you face these limitations for the bulk_create() method, the only alternative is to loop over each record and use the save() method to create each entry, as illustrated in listing 8-13.

Listing 8-13 Create multiple records with the save() method

# Same store_list as listing 8-12

# Loop over each store and invoke save() on each entry
# save() method called on each list member to create record
for store in store_list:
    store.save()

As I mentioned when I introduced the bulk_create() method, the process in listing 8-13 can be highly inefficient if its done for dozens or hundreds of records, but sometimes it's the only option to create multiple records in bulk. However, the speed issues related to listing 8-13 can be improved if you manually deal with model transactions.

Listing 8-14 illustrates how to use the save() method and group the entire record creation process in a single transaction to speed up the bulk creation process.

Listing 8-14 Create multiple records with save() method in a single transaction

# Import Django model and transaction class 
from coffeehouse.stores.models import Store
from django.db import transaction

# Create store list, with same references from listing 8-12
first_store_list = [store_corporate,store_downtown]
second_store_list = [store_uptown,store_midtown]

# Trigger atomic transaction so loop is executed in a single transaction
with transaction.atomic():
    # Loop over each store and invoke save() on each entry
    for store in first_store_list:
        # save() method called on each member to create record        
        store.save()

# Method decorated with @transaction.atomic to ensure logic is executed in single transaction
@transaction.atomic
def bulk_store_creator(store_list):
    # Loop over each store and invoke save() on each entry
    for store in store_list:
        # save() method called on each member to create record
        store.save()

# Call bulk_store_creator with Store list
bulk_store_creator(second_store_list)

As you can see in listing 8-14, there are two ways to create bulk operations in a single database transaction, both using the django.db.transaction package. The first instance uses the with transaction.atomic(): statement, so any nested code within this statement is run in a single transaction. The second instance uses the @transaction.atomic method decorator, which ensures the method operations are run in a single transaction.

Be careful with explicit transactions

There's a reason Django's default database transaction mechanism creates transactions on every query, it's to err on the safe side and minimize the potential for data loss.

If you decide to use explicit transactions to improve performance -- as illustrated in listing 8-14 -- be aware that either all or no records are created. Although this can be a desired behavior, for certain circumstances it might lead to unexpected results. Make sure you understand the implications of transactions on the data you're working with. The previous chapter contains a section discussing the topic of Django model transactions in greater detail.

Read multiple records with all(), filter(), exclude() or in_bulk()

To read multiple records associated with a Django model you can use several methods, which include: all(), filter(), exclude() and in_bulk(). The purpose of the all() method should be self explanatory, it retrieves all the records of a given model. The filter() method is used to restrict query results on a given model property, for example filter(state='CA') is a query to get all model records with state='CA'. And the exclude() method is used to execute a query that excludes records on a given model property, for example exclude(state='AZ') is a query to get all model records except those with state='AZ'.

It's also possible to chain filter() and exclude() methods to create more complex multiple record queries. For example, filter(state='CA').exclude(city='San Diego') is a query to get all model records with state='CA' and exclude those with city='San Diego'. Listing 8-15 illustrates more multiple record query examples.

Listing 8-15. Read multiple records with with all(), filter() and exclude() methods

# Import Django model class
from coffeehouse.stores.models import Store

# Query with all() method or equivalent SQL: 'SELECT * FROM ...'
all_stores = Store.objects.all()

# Query with include() method or equivalent SQL: 'SELECT....WHERE city = "San Diego"'
san_diego_stores = Store.objects.filter(city='San Diego') 

# Query with exclude() method or equivalent SQL: 'SELECT....WHERE NOT (city = "San Diego")'
non_san_diego_stores = Store.objects.exclude(city='San Diego')

# Query with include() and exclude() methods or equivalent SQL:
# 'SELECT....WHERE STATE='CA' AND NOT (city = "San Diego")'
ca_stores_without_san_diego = Store.objects.filter(state='CA').exclude(city='San Diego')

Append .query to view the actual SQL

Sometimes it can be helpful or even necessary to view the actual SQL executed by a Django model query. You can do so by appending .query to a query, as illustrated in the following listing:

from coffeehouse.stores.models import Storeimport loggingstdlogger = logging.getLogger(__name__) # Get the Store records with city San Diego san_diego_stores = Store.objects.filter(city='San Diego') stdlogger.debug("Query %s" % str(san_diego_stores.query)) # You can also use print(san_diego_stores.query)

As you can see in the previous snippet, you can output the SQL query to a Python logger or use the 'quick & dirty' print statement. Note that .query only works with queries that output QuerySets, so it doesn't work with queries like with the get() method -- more on QuerySets shortly. Chapter 5 describes other alternatives to inspect the SQL used by model queries (e.g. Django debug toolbar) and Chapter 3 shows how to output SQL queries in Django templates (e.g. Debug context processor sql_queries variable).

Tip In addition to single fields -- city="San Diego" or state="CA" -- the all(), filter() and exclude() methods can also accept multiple fields to produce an and query (e.g. filter(city="San Diego", state="CA") to get records were both city and state match). See the later section in the chapter on queries classified by SQL keyword.

Besides the all(), filter() and exclude() methods, Django models also support the in_bulk() method. The in_bulk() method is designed to efficiently read many records, just like the bulk_create() method -- described in the past section -- is used to efficiently create many records.

The in_bulk() method is more efficient to read many records vs. the all(), filter() and exclude() methods, because all the latter methods produce a QuerySet and the former produces a standard Python dictionary. Listing 8-16 illustrates the use of the in_bulk() method.

Listing 8-16. Read multiple records with with in_bulk() method

# Import Django model class
from coffeehouse.stores.models import Store

# Query with in_bulk() all 
Store.objects.in_bulk()
# Outputs: {1: <Store: Corporate (San Diego,CA)>, 2: <Store: Downtown
# (San Diego,CA)>, 3: <Store: Uptown (San Diego,CA)>, 4: <Store: Midtown (San Diego,CA)>}

# Compare in_bulk query to all() that produces QuerySet
Store.objects.all()
# Outputs: <QuerySet [<Store: Corporate (San Diego,CA)>, <Store: Downtown
# (San Diego,CA)>, <Store: Uptown (San Diego,CA)>, <Store: Midtown (San Diego,CA)>]>

# Query to get single Store by id
Store.objects.in_bulk([1])
# Outputs: {1: <Store: Corporate (San Diego,CA)>}

# Query to get multiple Stores by id
Store.objects.in_bulk([2,3])
# Outputs: {2: <Store: Downtown (San Diego,CA)>, 3: <Store: Uptown (San Diego,CA)>}

The first example in listing 8-16 uses the in_bulk() method without any arguments to produce a dictionary with the records of the Store model (i.e. just like the all() method). However, notice how the output of the in_bulk() method is a standard Python dictionary, where each key corresponds to an id value of the record.

The remaining examples in listing 8-16 illustrate how the in_bulk() method can accept a list of values to specify which record id's should be read from the database. Here again, notice that although the behavior is similar to the filter() or exclude() methods, the output is a standard Python dictionary vs. a QuerySet data structure.

Now that you have a clear understanding of the various methods that can read multiple model records and how some methods produce a QuerySet and other don't, it begets the question, what is a QuerySet and why is it used in the first place ? So before we move on to the next parts of this broader section -- on how to do CRUD operations on multiple records -- we'll take a brief detour to explore the QuerySet data type.

Understanding a QuerySet: Lazy evaluation & caching

The first important characteristic of a QuerySet data type is technically known as lazy evaluation. This means a QuerySet isn't executed against the database right away, it just waits until its evaluated. In other words, the act of running a snippet like Store.objects.all() doesn't involve any database activity right away. Listing 8-17 illustrates how you can even chain query after query and still not trigger database activity.

Listing 8-17. Chained model methods to illustrate concept of QuerySet lazy evaluation.

# Import Django model class
from coffeehouse.stores.models import Store

# Query with all() method
stores = Store.objects.all()

# Chain filter() method on query
stores = stores.filter(state='CA')

# Chain exclude() method on query
stores = stores.exclude(city='San Diego')

Notice the three different statements in listing 8-17 that chain the all(), filter() and exclude() methods. Although it can appear listing 8-17 makes three database calls to get Store records with state='CA' and excludes those with city='San Diego', there is no database activity!

This is how QuerySet data structures are designed to work. So when does a query made on a QuerySet data type hit the database ? There are many triggers that make a QuerySet evaluate and invoke an actual database call. Table 8-1 illustrates the various triggers.

Table 8-1. Django QuerySet evaluation triggers that invoke an actual database call

Evaluation Trigger	Description	Example
Iteration	Creating a loop on a QuerySet triggers a database call.	for store in Store.objects.all():
Slicing with 'step' argument	Slicing a QuerySet with a third argument (a.k.a. 'step' or 'stride' argument) triggers a database call. NOTE: Slicing a Queryset with 1 or 2 arguments just creates another QuerySet	# A list of every 5th record, for the first 100 records Store.objects.all()[:100:5] # Records 50 to 100 # This does NOT trigger a database hit, (2 arguments) Store.objects.all()[49:99] # All records starting from the 6th # This does NOT trigger a database hit (1 argument) Store.objects.all()[5:] # First record # This does NOT trigger a database hit (1 argument) Store.objects.all()[0]
Pickling*	Pickling a QuerySet forces all the results to be loaded into memory prior to pickling.	import pickle stores = Store.objects.all() pickled_stores = pickle.dumps(stores)
repr() method	Calling repr() on a QuerySet triggers a database call. NOTE: This is for convenience in the Python interactive interpreter, so you can immediately see query results.	repr(Store.objects.all())
len() method	Calling len() on a QuerySet triggers a database call. NOTE: If you only want the number of records, it's more efficient to use the Django model count() method.	total_stores = len(Store.objects.all()) #NOTE: The count() method is more efficient to get a total count efficient_total_stores = Store.objects.count()
list() method	Calling list() on a QuerySet triggers a database call.	store_list = list(Store.objects.all())
Boolean tests (bool(), or, and or if statements)	Making a boolean test on a QuerySet triggers a database call. NOTE: If you only want to check if a record exists, it's more efficient to use the Django model exists() method.	# Check if there's a store with city='San Diego' if Store.objects.filter(city='San Diego'): # There is a store in 'San Diego' pass #NOTE: The exists() method is more efficient for a boolean check san_diego_stores = Store.objects.exists(city='San Diego')

* Pickling is Python's standard mechanism for object serialization, a process that converts a Python object into a character stream. The character stream contains all the information necessary to reconstruct the object at a later time. Pickling in the context of Django queries is typically used for heavyweight queries in an attempt to save resources (e.g. make a heavyweight query, pickle it and on subsequent occasions consult the pickled query). You can consider pickling Django queries a rudimentary form of caching.

Now that you know the triggers that cause a QuerySet to make a call to a database, let's take a look at other important QuerySet subject: caching.

Every QuerySet contains a cache to minimize database access. The first time a QuerySet is evaluated and a database query takes place -- see evaluation triggers in table 8-1 -- Django saves the results in the QuerySet's cache for later use.

A QuerySet's cache is most useful when an application has a recurring need to use the same data, as it leads to less hits on a database. However, leveraging a QuerySet's cache comes with a few subtleties tied to the evaluation of a QuerySet. A rule of thumb is to first evaluate a QuerySet you plan to use more than once and proceed to use its data to leverage the QuerySet cache. This is best explained with the examples presented in listing 8-18.

Listing 8-18 - QuerySet caching behavior.

# Import Django model class
from coffeehouse.stores.models import Store

# CACHE USING SEQUENCE
# Query awaiting evaluation
lazy_stores = Store.objects.all()
# Iteration triggers evaluation and hits database
store_emails = [store.email for store in lazy_stores]
# Uses QuerySet cache from lazy_stores, since lazy_stores is evaluated in previous line  
store_names = [store.name for store in lazy_stores] 


# NON-CACHE SEQUENCE 
# Iteration triggers evaluation and hits database
heavy_store_emails = [store.email for store in Store.objects.all()]
# Iteration triggers evaluation and hits database again, because it uses another QuerySet ref  
heavy_store_names = [store.name for store in Store.objects.all()] 


# CACHE USING SEQUENCE
# Query wrapped as list() for immediate evaluation
stores = list(Store.objects.all()) 
# Uses QuerySet cache from stores
first_store  = stores[0] 
# Uses QuerySet cache from stores
second_store = stores[1] 
# Uses QuerySet cache from stores, set() is just used to eliminate duplicates
store_states = set([store.state for store in stores])
 # Uses QuerySet cache from stores, set() is just used to eliminate duplicates 
store_cities = set([store.city for store in stores])


# NON-CACHE SEQUENCE 
# Query awaiting evaluation
all_stores = Store.objects.all()
# list() triggers evaluation and hits database
store_one = list(all_stores[0:1]) 
# list() triggers evaluation and hits database again, because partially evaluating
# a QuerySet does not populate the cache
store_one_again = list(all_stores[0:1]) 


# CACHE USING SEQUENCE 
# Query awaiting evaluation
coffee_stores = Store.objects.all()
# Iteration triggers evaluation and hits database
[store for store in coffee_stores] 
# Uses QuerySet cache from coffee_stores, because it's evaluated fully in previous line
store_1 = coffee_stores[0] 
# Uses QuerySet cache from coffee_stores, because it's already evaluated in full
store_1_again = coffee_stores[0]

As you can see in the examples in listing 8-18, sequences that leverage a QuerySet's cache, trigger the evaluation of the QuerySet right away and then use a reference to the evaluated QuerySet to access the cached data. Sequences that don't use a QuerySet cache, either constantly create identical QuerySet statements or make the evaluation process late and for each data assignment.

The only edge case for caching QuerySet's that doesn't fit the previous behavior is the second to last example in listing 8-18. If you trigger a partial evaluation of QuerySet by slicing it (e.g. [0] or [1:5]) the cache is not populated. So to ensure a QuerySet cache is used, you must evaluate a QuerySet and then slice the results, as illustrated in the last example in listing 8-18.

Read performance methods: defer() ,only(), values(), values_list(), iterator(), exists() and none()

Although QuerySet data structures represent a step forward toward dealing with multiple data records by integrating lazy evaluation and caching mechanisms, they don't cover the entire performance spectrum needed to deal with large data queries.

A common performance problem you'll face with large data queries is related to reading unnecessary record fields. Although selectively choosing which fields to read from a database record can be an afterthought in most circumstances, it can have an important impact for queries made on Django models with more than a couple of fields.

The first methods available to increase performance while reading model records are the defer() and only() methods, both of which are intended to delimit which fields to read in a query. The defer() and only() methods accept a list of fields to defer or load, respectively, and are complementary to one another depending on what you want to achieve. For example, if you want to defer loading the majority of model fields, it's simpler to specify which fields to load with only(), if you want to defer loading one or a few fields in a model you can specify the fields in the defer() method. Listing 8-19 illustrates the use of the defer() and only() methods.

Listing 8-19 -- Read performance with defer() and only() to selectively read record fields.

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item

# Item names on the breakfast menu 
breakfast_items = Item.objects.filter(menu__name='Breakfast').only('name')

# All Store records with no email 
all_stores = Store.objects.defer('email').all()

# Confirm loaded fields on overall query 
breakfast_items.query.get_loaded_field_names()
# Outputs: {<class 'coffeehouse.items.models.Item'>: {'id', 'name'}}

all_stores.query.get_loaded_field_names()
# Outputs: {<class 'coffeehouse.stores.models.Store'>: {'id', 'address', 'state', 'city', 'name'}}

# Confirm deferred fields on individual model records
breakfast_items[0].get_deferred_fields()
# Outputs: {'calories', 'stock', 'price', 'menu_id', 'size', 'description'}

all_stores[1].get_deferred_fields()
# Outputs: {'email'}

# Access deferred fields, note each call on a deferred field implies a database hit
breakfast_items[0].price
breakfast_items[0].size
all_stores[1].email

As you can see in listing 8-19, both the defer() and only() methods can be chained to a model manager (i.e. objects) either at the start or end of a query, as well as be used in conjunction with other methods like all() and filter(). In addition, notice how both methods can accept a list of fields to defer or load.

To verify which model fields have been deferred or loaded, listing 8-19 illustrates two alternatives. The first technique consists of calling the get_loaded_field_names() on the query reference of a query statement to get a list of loaded fields. The second technique consists of calling the get_deferred_fields() method on a model instance to obtain a list of deferred fields.

So how do you obtain deferred fields ? Easy, you cast call them. Toward the end of listing 8-18, notice how even though the breakfast_items represents a query that only loads the name field, a call is made to the get the value of the price and size fields. Similarly, the all_stores reference in listing 8-19 represents a query that defers the email field, nevertheless you can get a record's email field value by just calling it. Although this last technique requires an additional database hit to get the deferred field(s), it also illustrates how easy it's to get a record's entire fields even if they're deferred.

The values() and values_list() methods offer another alternative to delimit the fields fetched by a query. Unlike the defer() and only() methods which produce a QuerySet of model instances, the values() and values_list() methods produce QuerySet instances composed of plain dictionaries, tuples or lists. This has the performance advantage of not creating full-fledged model instances, albeit this also has the disadvantage of not having access to full-fledged model instances.

The the values() and values_list() methods accept a list of fields to load as part of a query, a process that's illustrated in listing 8-20.

Tip You can use the values() and values_list() methods without any field argument to produce full model records as plain dictionaries, tuples or lists.

Listing 8-20 -- Read performance with values() and values_list() to selectively read record fields.

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item

# Item names on the breakfast menu 
breakfast_items = Item.objects.filter(menu__name='Breakfast').values('name')
print(breakfast_items)
# Outputs: <QuerySet [{'name': 'Whole-Grain Oatmeal'}, {'name': 'Bacon, Egg & Cheese Biscuit'}]>

# All Store records with no email 
all_stores = Store.objects.values_list('email','name','city').all()
print(all_stores)
# Outputs: <QuerySet [('corporate@coffeehouse.com', 'Corporate', 'San Diego'),
# ('downtown@coffeehouse.com', 'Downtown', 'San Diego'), ('uptown@coffeehouse.com', 'Uptown', 'San Diego'),
# ('midtown@coffeehouse.com', 'Midtown', 'San Diego')]>

all_stores_flat = Store.objects.values_list('email',flat=True).all()
print(all_stores_flat)
# Outputs: <QuerySet ['corporate@coffeehouse.com', 'downtown@coffeehouse.com',
# 'midtown@coffeehouse.com', 'uptown@coffeehouse.com']>

# It isn't possible to access undeclared model fields with values() and values_list() 
breakfast_items[0].price #ERROR
# Outputs AttributeError: 'dict' object has no attribute 'price'

The first variation in listing 8-20 generates an Item QuerySet with the name field, which as you can see produces a list of dictionaries with only the name field and value. Next, a query is made to get the email, name and city fields for all Store models using the values_list() method. Notice that unlike the values() method, the values_list() method produces a more compact structure in the form of a tuple. In listing 8-20 you can also see the values_list() method accepts the optional flat=True argument to flatten the resulting tuple into a plain list.

Finally, toward the end of listing 8-20 you can see that when using the values() and values_list() methods it isn't possible to obtain undeclared fields by just calling them, like it's possible with the defer() and only() methods. This behavior is due to the watered-down QuerySet produced by the values() and values_list() methods which aren't full-fledged model objects.

The iterator() method is yet another option available in Django models that creates an iterator over the results of a QuerySet. The iterator() method is ideal for large queries that are intended to be used once, as this lowers the required memory to store data which is an inherent property of all Python iterators. Listing 8-21 illustrates a query that uses the iterator() method and appendix A describes the core concepts behind Python iterators.

Listing 8-21 -- Read performance with iterator(), exists() and none().

from coffeehouse.stores.models import Store

# All Store with iterator()
stores_on_iterator = Store.objects.all().iterator()
print(stores_on_iterator)
# Outputs: <generator object __iter__ at 0x7f2864db8fc0>

# Advance through iterator with __next__()
stores_on_iterator.__next__()
# Outputs: <Store: Corporate (San Diego,CA)>
stores_on_iterator.__next__()
# Outputs: <Store: Downtown (San Diego,CA)>

# Check if Store object with id=5 exists
Store.objects.filter(id=5).exists()
# Outputs: False

# Create empty QuerySet on Store model
Store.objects.none()
# Outputs: <QuerySet []>

Another Django model read performance technique is the exists() method, which is illustrated in listing 8-21 and is used to verify if a query returns data. Although the exists() method executes a query against the database, the query used by exists() is a simplified version compared to a standard query, in addition to the exists() method returning a boolean True or False value compared to a full-fledged QuerySet. This makes the exists() method a good option for queries that operate on conditionals, where it's only necessary to verify if model records exists and the actual records data is unnecessary.

Finally, the Django model none() method -- illustrated at the end of listing 8-21 -- is used to generate an empty QuerySet, specifically of a sub-class named EmptyQuerySet. The none() method is helpful for cases where you knowingly need to assign an empty model QuerySet, such as edge cases related to Django model forms or Django templates, that expect a QuerySet instance in one way or another. In such cases, it becomes necessary to create a dummy QuerySet, instead of inefficiently creating QuerySet that returns data and deleting its contents.

As you've learned in this sub-section, in addition to the QuerySet data structure, Django also offers many methods specifically designed to efficiently read large or small amount of records associated with Django models.

Tip Remember the in_bulk() method from the past section also provides read performance over the basic all(), filter() and exclude() methods.

Update multiple records with update() or select_for_update().

In the section on single record CRUD operations, you explored how to update single records with the update() method, this same method can handle updating multiple records. This process is illustrated in listing 8-22.

Listing 8-22. Update multiple records with the update() method

from coffeehouse.stores.models import Store

Store.objects.all().update(email="contact@coffeehouse.com")

from coffeehouse.items.models import Item
from django.db.models import F

Item.objects.all().update(stock=F('stock') +100)

The first example in listing 8-22 uses the update() method to update all Store records and set their email value to contact@coffeehouse.com. The second example uses a Django F expression and the update() method to update all Drink records and set their stock value to the current stock value plus 100. Django F expressions allow you to reference model fields within a query, which is necessary in this case to perform the update in a single operation.

Although the update() method guarantees everything is done in a single operation to avoid race conditions, on certain occasions the update() method may not be enough to do complex updates. Offering another alternative to update multiple records is the select_for_update() method which locks rows on the given query until the update is marked as done. Listing 8-23 illustrates an example of the select_for_update() method.

select_for_update() support is database dependent

Under the hood, the Django select_for_update() method is based on SQL's SELECT...FOR UPDATE syntax which is not supported by all databases. Postgres, Oracle and MySQL databases support this functionality, but SQLite does not.
In addition, there's the special argument nowait (e.g. select_for_update(nowait=True) to make a query non-blocking) . By default, if another transaction acquires a lock on one of the selected rows, the select_for_update() query blocks until the lock is released. If you use nowait, this allows a query to run right away and in case a conflicting lock is already acquired by another transaction the DatabaseError is raised when the QuerySet is evaluated. Be aware though, MySQL does not support the nowait argument and if used with MySQL, Django throws a DatabaseError.

Listing 8-23 Update multiple records with a Django model with the select_for_update() method

# Import Django model class
from coffeehouse.stores.models import Store
from django.db import transaction

# Trigger atomic transaction so loop is executed in a single transaction
with transaction.atomic():
    store_list = Store.objects.select_for_update().filter(state='CA')
    # Loop over each store to update and invoke save() on each entry
    for store in store_list:
        # Add complex update logic here for each store
        # save() method called on each member to update
        store.save()

# Method decorated with @transaction.atomic to ensure logic is executed in single transaction
@transaction.atomic
def bulk_store_updae(store_list):
    store_list = Store.objects.select_for_update().exclude(state='CA')
    # Loop over each store and invoke save() on each entry
    for store in store_list:
        # Add complex update logic here for each store
        # save() method called on each member to update
        store.save()        

# Call bulk_store_update to update store records
bulk_store_update(store_list_to_update)

Listing 8-23 shows two variations for select_for_update(), one using an explicit transaction and the other decorating a method to scope it inside a transaction. Both variations use the same logic, they first create a query with select_for_update(), then loop over the results to update each record and use save() to update individual records. In this manner the rows touched by the query remain locked to other changes until the transaction finishes.

Be aware that when using the select_for_update() it's absolutely necessary to use transactions using any of the techniques described in listing 8-23. If you run the select_for_update() method in a database that supports it and you don't use transactions as illustrated in listing 8-23 -- maintaining Django's default auto-commit mode -- Django throws a TransactionManagementError error because the rows cannot be locked as a group. Using the select_for_update method in a database that offers no support for it has no effect (i.e. you won't see an error).

Delete multiple records with delete()

To delete multiple records you use the delete() method and append it to a query. Listing 8-24 illustrates this process.

Listing 8-24 Delete model records with the delete() method

from coffeehouse.stores.models import Store

Store.objects.filter(city='San Diego').delete()

The example in listing 8-24 uses the delete() method to delete the Store records with city='San Diego'.