What does the word “SARGable” really mean

terminology

SQL Server users use the term "sargable". I'm wondering if there is an objective implementation-agnostic timeless definition for "sargable."

For instance, WHERE foo LIKE '%bar%' is said by many to be not sargable, but some RDBMSs are able to use indexes on such queries. What then does "not sargable" mean?

Other References

Best Answer

The term "sargable" was first introduced by P. Griffiths Selinger et al. in their 1979 paper "Access Path Selection in a Relational Database Management System", published by ACM. For non-ACM members there's a copy of that paper at http://cs.stanford.edu/people/chrismre/cs345/rl/selinger.pdf

The term is defined in this paragraph:

Both index and segment1 scans may optionally take a set of predicates, called search arguments (or SARGS), which are applied to a tuple before it is returned to the RSI2 caller. If the tuple satisfies the predicates, it is returned; otherwise the scan continues until it either finds a tuple which satisfies the SARGS or exhausts the segment or the specified index value range. This reduces cost by eliminating the overhead of making RSI calls for tuples which can be efficiently rejected within the RSS. Not all predicates are of the form that can become SARGS. A sargable predicate is one of form (or which can be put into the form) "column comparison-operator value". SARGS are expressed as a boolean expression of such predicates in disjunctive normal form.

In other words, a sargable predicate is such that can be resolved by the storage engine (access method) by directly observing the table or index record. A non-sargable predicate, conversely, requires a higher level of the DBMS to take action. For example, the outcome of WHERE lastname = 'Doe' can be decided by the storage engine by simply looking at the contents of the field lastname of each record. On the other hand, WHERE UPPER(lastname) = 'DOE' requires execution of a function by the SQL engine, which means the storage engine will have to return all rows it reads (provided they match possible other, sargable predicates) back to the SQL engine for evaluation, incurring additional CPU costs.

You can see from the original definition that sargable predicates can apply not only to index scans, but also to table (segment in System R terminology) scans, as long as the conditions "column comparison-operator value" are met and therefore they can be evaluated by the storage engine. This is indeed the case with Db2, a descendant of System R in many ways:

Index sargable predicates are not used to bracket a search, but are evaluated from the index if one is chosen, because the columns involved in the predicate are part of the index key. These predicates are also evaluated by the index manager.

Data sargable predicates are predicates that cannot be evaluated by the index manager, but can be evaluated by Data Management Services (DMS). Typically, these predicates require the access of individual rows from a base table. If necessary, DMS will retrieve the columns needed to evaluate the predicate,

The fact that in SQL Server-speak sargable predicates are only those that can be resolved using index seeks is probably determined by its storage engine's inability to apply such predicates during table scans.

Sargable and non-sargable predicates are sometimes described as "stage 1" and "stage 2" predicates respectively (this also comes from Db2 terminology). Stage 1 predicates can be evaluated at the lowest level of query processing, while reading table or index records. Rows that match stage 1 conditions, if any, are sent to the next level, stage 2, of evaluation.


1 -- Segment in System R is the physical storage of a table's tuples; a segment scan is somewhat equivalent to a table scan in other DBMSes.

2 -- RSI -- RSS3 Interface, a tuple-oriented query interface. The interface function relevant to this discussion is NEXT, which returns the next row matching query predicates.

3 -- RSS, or Research Storage System, the storage subsystem of System R.