Monday, 8 July 2013
In my last post, we saw how a query featuring a scalar aggregate could be transformed by the optimizer to a more efficient form. As a reminder, here’s the schema again:
Wednesday, 26 June 2013
Optimization Phases and Missed Opportunities
There are two complementary skills that are very useful in query tuning. One is the ability to read and interpret execution plans. The second is knowing a bit about how the query optimizer works to translate SQL text into an execution plan.
Putting the two things together can help us spot times when an expected optimization was not applied, resulting in an execution plan that is not as efficient as it could be.
The lack of documentation around exactly which optimizations SQL Server can apply (and in what circumstances) means that a lot of this comes down to experience, however.
Monday, 17 June 2013
Improving Partitioned Table Join Performance
The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example of that, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing.
Tuesday, 11 June 2013
Hello Operator, My Switch Is Bored
This post is in two parts. The first part looks at the Switch execution plan operator. The second part is about an invisible plan operator and cardinality estimates on filtered indexes.
Thursday, 4 April 2013
Optimizer Limitations with Filtered Indexes
One of the filtered index use cases mentioned in the product documentation concerns a column that contains mostly NULL values. The idea is to create a filtered index that excludes the NULLs, resulting in a smaller nonclustered index that requires less maintenance than the equivalent unfiltered index.
Another popular use of filtered indexes is to filter NULLs from a UNIQUE index, giving the behaviour users of other database engines might expect from a default UNIQUE index or constraint: uniqueness enforced only for non-NULL values.
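Roughly speaking, the two patterns look like this (the table and column names here are invented purely for illustration):

-- A smaller nonclustered index that excludes the mostly-NULL values
CREATE NONCLUSTERED INDEX IX_Example_SparseColumn
ON dbo.Example (SparseColumn)
WHERE SparseColumn IS NOT NULL;

-- Uniqueness enforced only for non-NULL values
CREATE UNIQUE NONCLUSTERED INDEX UQ_Example_AlternateKey
ON dbo.Example (AlternateKey)
WHERE AlternateKey IS NOT NULL;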
Unfortunately, the query optimizer has limitations where filtered indexes are concerned. This post looks at a couple of less well-known examples.
Wednesday, 20 March 2013
The Problem with Window Functions and Views
Introduction
Since their introduction in SQL Server 2005, window functions like ROW_NUMBER and RANK have proven to be extremely useful in solving a wide variety of common T-SQL problems. In an attempt to generalize such solutions, database designers often look to incorporate them into views to promote code encapsulation and reuse.
Unfortunately, a limitation in the SQL Server query optimizer often means that views containing window functions do not perform as well as expected. This post works through an illustrative example of the problem, details the reasons, and provides a number of workarounds.
Note: The limitation described here was first fixed in SQL Server 2017 CU 30. Optimizer fixes must be enabled using trace flag 4199 or the database scoped configuration option. The fix is standard behaviour without optimizer hotfixes under compatibility level 160 (SQL Server 2022).
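As a rough sketch of the pattern (the view, table, and column names are invented, and the behaviour depends on the version and fixes noted above):

CREATE VIEW dbo.RankedOrders
AS
SELECT
    O.OrderID,
    O.CustomerID,
    O.OrderDate,
    rn = ROW_NUMBER() OVER (
        PARTITION BY O.CustomerID
        ORDER BY O.OrderDate DESC)
FROM dbo.Orders AS O;
GO
-- Filtering on the partitioning column using a variable or parameter is the
-- problematic case: the predicate may not be pushed below the window function,
-- so row numbers are computed for every customer before the filter is applied.
DECLARE @CustomerID integer = 42;
SELECT RO.OrderID, RO.rn
FROM dbo.RankedOrders AS RO
WHERE RO.CustomerID = @CustomerID;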
Friday, 8 March 2013
Execution Plan Analysis: The Mystery Work Table
I love SQL Server execution plans. It is often easy to spot the cause of a performance problem just by looking at one closely. That task is considerably easier if the plan includes run-time information (a so-called ‘actual’ execution plan), but even a compiled plan can be very useful.
Nevertheless, there are still times when the execution plan does not tell the whole story, and we need to think more deeply about query execution to really understand a problem. This post looks at one such example, based on a question I answered.
Thursday, 21 February 2013
Halloween Protection – The Complete Series
I have written a four-part series on the Halloween Problem.
Some of you will never have heard about this issue. Those that have might associate it only with T-SQL UPDATE queries. In fact, the Halloween Problem affects execution plans for INSERT, UPDATE, DELETE, and MERGE statements.
This is a topic I have been meaning to write about properly for years, ever since I read Craig Freedman’s 2008 blog post on the topic, which ended with the cryptic comment:
“…although I’ve used update statements for all of the examples in this post, some insert and delete statements also require Halloween protection, but I’ll save that topic for a future post.”
That future post never materialized, so I thought I would have a go. The four parts of the series are summarized and linked below. I hope you find the material interesting.
Wednesday, 20 February 2013
The Halloween Problem – Part 4
The Halloween Problem can have a number of important effects on execution plans. In this final part of the series, we look at the tricks the optimizer can employ to avoid the Halloween Problem when compiling plans for queries that add, change or delete data.
Monday, 18 February 2013
The Halloween Problem – Part 3
The MERGE statement (introduced in SQL Server 2008) allows us to perform a mixture of INSERT, UPDATE, and DELETE operations using a single statement.
The Halloween Protection issues for MERGE are mostly a combination of the requirements of the individual operations, but there are some important differences and a couple of interesting optimizations that apply only to MERGE.
Friday, 15 February 2013
The Halloween Problem – Part 2
In the first part of this series, we saw how the Halloween Problem applies to UPDATE queries. To recap briefly, the problem was that an index used to locate records to update had its keys modified by the update operation itself (another good reason to use included columns in an index rather than extending the keys). The query optimizer introduced an Eager Table Spool operator to separate the reading and writing sides of the execution plan to avoid the problem. In this post, we will see how the same underlying issue can affect INSERT and DELETE statements.
Wednesday, 13 February 2013
The Halloween Problem – Part 1
Much has been written over the years about understanding and optimizing SELECT queries, but rather less about data modification. This series looks at an issue that is specific to INSERT, UPDATE, DELETE, and MERGE queries – the Halloween Problem.
The phrase “Halloween Problem” was originally coined with reference to a SQL UPDATE query that was supposed to give a 10% raise to every employee who earned less than $25,000. The problem was that the query kept giving 10% raises until everyone earned at least $25,000.
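A sketch of that classic query (the table is hypothetical):

-- If this update locates rows via an index on Salary, rows it has just raised
-- can be encountered again, and raised again, until they reach $25,000.
UPDATE dbo.Employees
SET Salary = Salary * 1.10
WHERE Salary < 25000;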
We will see later on in this series that the underlying issue also applies to INSERT, DELETE, and MERGE queries, but for this first entry, it will be helpful to examine the UPDATE problem in a bit of detail.
Wednesday, 6 February 2013
Incorrect Results with Indexed Views
If you use MERGE, indexed views, and foreign keys, your queries might return incorrect results. Microsoft have released a fix for incorrect results returned when querying an indexed view. The problem applies to:
- SQL Server 2012
- SQL Server 2008 R2
- SQL Server 2008
The Knowledge Base article does not go into detail, or provide a reproduction script, but this blog post does.
Friday, 1 February 2013
A creative use of IGNORE_DUP_KEY
Let’s say you have a big table with a clustered primary key, and an application that inserts batches of rows into it. The nature of the business is that the batch will inevitably sometimes contain rows that already exist in the table.
The default SQL Server INSERT behaviour for such a batch is to throw error 2627 (primary key violation), terminate the statement, roll back all the inserts (not just the rows that conflicted), and keep any active transaction open:
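A minimal sketch of that default behaviour (the table and values are illustrative only):

CREATE TABLE dbo.BigTable
(
    pk integer NOT NULL CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED,
    payload varchar(50) NOT NULL
);

INSERT dbo.BigTable (pk, payload) VALUES (1, 'existing row');

-- The batch below contains one row that already exists, so the whole statement
-- fails with error 2627 and none of the three rows is inserted.
INSERT dbo.BigTable (pk, payload)
VALUES (1, 'duplicate'), (2, 'new'), (3, 'new');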
Saturday, 26 January 2013
Optimizing T-SQL queries that change data
Most tuning efforts for data-changing operations concentrate on the SELECT side of the query plan. Sometimes people will also look at storage engine considerations (like locking or transaction log throughput) that can have dramatic effects. A number of common practices have emerged, such as avoiding large numbers of row locks and lock escalation, splitting large changes into smaller batches of a few thousand rows, and combining a number of small changes into a single transaction in order to optimize log flushes.
This is all good, but what about the data-changing side of the query plan: the INSERT, UPDATE, DELETE, or MERGE operation itself? Are there any query processor considerations we should take into account? The short answer is yes.
The query optimizer considers different plan options for the write-side of an execution plan, though there isn’t a huge amount of T-SQL language support that allows us to affect these choices directly. Nevertheless, there are things to be aware of, and things we can look to change.
Monday, 10 December 2012
MERGE Bug with Filtered Indexes
A MERGE statement can fail, and incorrectly report a unique key violation, when:
- The target table uses a unique filtered index; and
- No key column of the filtered index is updated; and
- A column from the filtering condition is updated; and
- Transient key violations are possible
Monday, 15 October 2012
Cardinality Estimation Bug with Lookups
Estimated row counts on Key or RID Lookups where a filtering predicate is applied can be wrong in SSMS execution plans.
This error does not affect the optimizer’s ultimate plan selection, but it does look odd.
There are other cases where estimated row counts are inconsistent (for defensible reasons), but the behaviour shown in this post is certainly a bug.
Wednesday, 12 September 2012
Why Doesn’t Partition Elimination Work?
Given a partitioned table and a simple SELECT query that compares the partitioning column to a single literal value, why does SQL Server read all the partitions when it seems obvious that only one partition needs to be examined?
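The query shape in question is roughly this (table and column names are assumed for illustration):

-- Only one partition should contain this value, yet the plan may touch them all
SELECT COUNT_BIG(*)
FROM dbo.PartitionedTable AS PT
WHERE PT.PartitioningColumn = '2012-09-05';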
Wednesday, 5 September 2012
Compute Scalars, Expressions and Execution Plan Performance
The humble Compute Scalar is one of the least well-understood of the execution plan operators, and usually the last place people look for query performance problems. It often appears in execution plans with a very low (or even zero) cost, which goes some way to explaining why people ignore it.
Some readers will already know that a Compute Scalar can contain a call to a user-defined function, and that any T-SQL function with a BEGIN…END block in its definition can have truly disastrous consequences for performance (see When is a SQL function not a function? by Rob Farley for details).
This post is not about those sorts of concerns.
Friday, 31 August 2012
Deletes that Split Pages and Forwarded Ghosts
Can DELETE operations cause pages to split?
Yes. It sounds counter-intuitive on the face of it. Deleting rows frees up space on a page, and page splitting occurs when a page needs additional space. Nevertheless, there are circumstances when deleting rows causes them to expand before they can be deleted.
Friday, 17 August 2012
Temporary Table Caching Explained
SQL Server (since 2005) caches temporary tables and table variables referenced in stored procedures for reuse, reducing contention on tempdb allocation structures and catalogue tables.
A number of things can prevent this caching (none of which are allowed when working with table variables):
- Named constraints (bad idea anyway, since concurrent executions can cause a name collision)
- DDL after creation (though what is considered DDL is interesting)
- Creation using dynamic SQL
- Table created in a different scope
- Procedure executed using WITH RECOMPILE
Temporary objects are often created and destroyed at a high rate in production systems, so caching can be an important optimization.
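For example, a sketch of the first item in the list above (the procedure and table names are invented):

CREATE PROCEDURE dbo.CachingDemo
AS
BEGIN
    -- Named constraint: prevents this temporary table from being cached
    CREATE TABLE #NotCached
    (
        pk integer NOT NULL CONSTRAINT PK_NotCached PRIMARY KEY
    );

    -- System-named constraint: this temporary object can be cached and reused
    CREATE TABLE #Cached
    (
        pk integer NOT NULL PRIMARY KEY
    );
END;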
Wednesday, 15 August 2012
Temporary Table Caching in Stored Procedures
Introduction
Ask anyone what the primary advantage of temporary tables over table variables is, and the chances are they will say that temporary tables support statistics and table variables do not.
This is true, of course. The indexes that enforce PRIMARY KEY and UNIQUE constraints on table variables do not have populated statistics associated with them. Neither do any non-constraint table variable indexes (using inline index definitions, available starting with SQL Server 2014). Finally, it is not possible to manually create statistics on table variables.
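To make the distinction concrete (names are illustrative only):

-- Temporary table: the primary key index has statistics, and more can be created
CREATE TABLE #Orders
(
    OrderID integer NOT NULL PRIMARY KEY,
    CustomerID integer NOT NULL
);
CREATE STATISTICS s_CustomerID ON #Orders (CustomerID);   -- allowed

-- Table variable: constraint and inline indexes exist, but they carry no
-- populated statistics, and CREATE STATISTICS is not supported
DECLARE @Orders table
(
    OrderID integer NOT NULL PRIMARY KEY,
    CustomerID integer NOT NULL INDEX IX_CustomerID
);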
Intuitively, then, any query that has alternative execution plans to choose from ought to benefit from using a temporary table rather than a table variable. This is also true, up to a point.
Thursday, 3 May 2012
Parallel Execution Plans Suck
Summary: A deep dive into SQL Server parallelism, and a potential performance problem with parallel plans that use TOP.
Tuesday, 1 May 2012
Query Optimizer Deep Dive - Part 4
This is the final part in a series of posts based on the content of the Query Optimizer Deep Dive presentations I have given over the last month or so at the Auckland SQL Users’ Group, and SQL Saturday events in Wellington, New Zealand and Adelaide, Australia.
Links to other parts of this series: Part 1 Part 2 Part 3
Beating the Optimizer
Our AdventureWorks test query produces an optimized physical execution plan that is quite different from the logical form of the query.
The estimated cost of the execution plan shown below is 0.0295 units.
Since we know the database schema very well, we might wonder why the optimizer did not choose to use the unique nonclustered index on Name in the Product table to filter rows based on the LIKE predicate.
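One way to explore that question is to force the index with a table hint, along these lines (a sketch only: the series’ actual test query is not reproduced in this summary, and the LIKE pattern here is illustrative; AK_Product_Name is the name of that unique index in AdventureWorks):

-- Force the unique nonclustered index on Name to see how the optimizer
-- costs that alternative against the plan it chose by itself
SELECT P.ProductID, P.Name
FROM Production.Product AS P WITH (INDEX(AK_Product_Name))
WHERE P.Name LIKE N'K%';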
Sunday, 29 April 2012
Query Optimizer Deep Dive – Part 3
This is the third in a series of posts based on the content of the Query Optimizer Deep Dive presentations I have given over the last month or so at the Auckland SQL Users’ Group, and SQL Saturday events in Wellington, New Zealand and Adelaide, Australia.
Links to other parts of this series: Part 1 Part 2 Part 4
Storage of Alternative Plans
We saw in part 2 how optimizer rules are used to explore logical alternatives for parts of the query tree, and how implementation rules are used to find physical operations to perform each logical step.
To keep track of all these options, the cost-based part of the SQL Server query optimizer uses a structure called the Memo. This structure is part of the Cascades general optimization framework developed by Goetz Graefe.