About This Blog

Including my content originally published on 𝕏, SQLperformance.com, and SQLblog.com
Showing posts with label Query Plans. Show all posts
Showing posts with label Query Plans. Show all posts

Tuesday, 17 September 2024

Why a Self-Join Requires Halloween Protection

Title image

This article was originally published on 𝕏.

I was asked recently why Halloween Protection was needed for data modification statements that include a self-join of the target table. This gives me a chance to explain, while also covering some interesting product bug history from the SQL Server 7 and 2000 days.

If you already know all there is to know about the Halloween Problem as it applies to SQL Server, you can skip the background section.

Sunday, 15 September 2024

Current State of the ANY Aggregate Transformation

Title image
This article was originally published on 𝕏.

SQL Server provides a way to select any one row from a group of rows, provided you write the statement using a specific syntax. This method returns any one row from each group, not the minimum, maximum or anything else. In principle, the one row chosen from each group is unpredictable.

The general idea of the required syntax is to logically number rows starting with 1 in each group in no particular order, then return only the rows numbered 1. The outer statement must not select the numbering column for this query optimizer transformation (SelSeqPrjToAnyAgg) to work.

Friday, 13 September 2024

A Small Sample of SQL Server Chaos

Title image
This article was originally published on 𝕏.

Background

Since SQL Server indexed views don’t allow MIN or MAX aggregates, I recently found myself writing a trigger instead. The trigger’s job was to keep a summary table in sync with a source query (which featured a MAX aggregate).

There’s a cost to running a trigger after every insert, update, or delete (with up to three trigger invocations per merge statement) but fast access to the summary data was worth it in this case. Though a trigger is a bit more expensive than the inline materialised view maintenance automatically added to the source statement’s execution plan by SQL Server, efficient trigger code and good indexing can help with the performance aspect (as always).

Friday, 30 August 2024

A Nonclustered Index Update Disaster

Title image
This article was originally published on 𝕏.

Introduction

Update execution plans are not something the T-SQL statement writer has much control over. You can affect the data reading side of the plan with query rewrites and hints, but there’s not nearly as much tooling available to affect the writing side of the plan.

Update processing can be extremely complex and reading data-changing execution plans correctly can also be difficult. Many important details are hidden away in obscure and poorly documented properties, or simply not present at all.

In this article, I want to you show a particularly bad update plan example. It has value in and of itself, but it will also give me a chance to describe some less well-known SQL Server details and behaviours.

Monday, 12 August 2024

Don't Mix with Datetime

Title image
This article was originally published on 𝕏.

Introduction

Microsoft encourages us not to use the datetime data type:

Avoid using datetime for new work. Instead, use the time, date, datetime2, and datetimeoffset data types. These types align with the SQL Standard, and are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications.

Well, ok. Sensible and well-informed people might still choose to use datetime for performance reasons. Common date and time functions have optimised implementations in the SQL Server expression service for the datetime and smalldatetime data types.

Friday, 31 May 2024

Impossible Execution Plan Timings

Erik Darling (@erikdarlingdata) shared an interesting SQL Server execution plan with me recently. The demo script is at the end of this article.

The important section is shown below: 

Impossible timings?






The Gather Streams operator appears to execute for less time (2.16s) than the Sort operator below it (5.431s). This seems impossible on the face of it. 

The Parallelism (Gather Streams) operator runs in row mode (as always), while the Sort and Hash Match (Inner Join) operators both run in batch mode. This mixed mode plan adds a little complexity to interpreting plan timings because: 
  • A batch mode operator reports CPU and elapsed times for that operator alone 
  • A row mode operator reports times for itself and all its children 
I've written about those aspects before in Understanding Execution Plan Operator Timings, which also covers a confusing situation that can arise in exclusively row mode parallel plans.

I showed a hidden option to make all operators report only their individual times in More Consistent Execution Plan Timings in SQL Server 2022. That feature isn't complete yet, so the results aren't perfect, and it's not documented or supported.

I mention all that in case you are interested in the background. None of the foregoing explains what we see in this mixed mode plan. The row mode Gather Streams elapsed time ought to include its children. The batch mode Sort should just be reporting its own elapsed time. With that understanding in mind, there's no way the Sort could run for longer than the Gather Streams. What's going on here?

Monday, 13 November 2023

Why Batch Mode Sort Spills Are So Slow

Why Batch Mode Sort Spills Are So Slow

Batch mode sorting was added to SQL Server in the 2016 release under compatibility level 130. Most of the time, a batch mode sort will be much faster than the row mode equivalent.

This post is about an important exception to this rule, as recently reported by Erik Darling (video).

No doubt you’ll visit both links before reading on, but to summarize, the issue is that batch mode sorts are very slow when they spill—much slower than an equivalent row mode sort.

This also seems like a good opportunity to write down some sorting details I haven’t really covered before. If you’re not interested in those details and background to the current issue, you can skip down to the section titled, “Erik’s Demo”.

Friday, 20 October 2023

Fast Key Optimization for Row Mode Sorts

Fast Key Optimization for Row Mode Sorts

SQL Server row-mode sorts generally use a custom implementation of the well-known merge sort algorithm to order data.

As a comparison-based algorithm, this performs a large number of value comparisons during sorting—usually many more than the number of items to sort.

Although each comparison is typically not expensive, even moderately sized sorting can involve a very large number of comparisons.

SQL Server can be called upon to sort a variety of data types. To facilitate this, the sorting code normally calls out to a specific comparator to determine how two compared values should sort: lower, higher, or equal.

Although calling comparator code has low overhead, performing enough of them can cause noticeable performance differences.

To address this, SQL Server has always (since at least version 7) supported a fast key optimization for simple data types. This optimization performs the comparison using highly optimized inline code rather than calling out to a separate routine.

Saturday, 23 July 2022

More Consistent Execution Plan Timings in SQL Server 2022

More Consistent Execution Plan Timings in SQL Server 2022

The updated showplan schema shipped with SSMS 19 preview 2 contains an interesting comment:

ExclusiveProfileTimeActive: true if the actual elapsed time (ActualElapsedms attribute) and the actual CPU time (ActualCPUms attribute) represent the time interval spent exclusively within the relational iterator.

What does this mean?

Tuesday, 4 August 2020

SQL Server 2019 Aggregate Splitting

SQL Server 2019 Aggregate Splitting

The SQL Server 2019 query optimizer has a new trick available to improve the performance of large aggregations. The new exploration abilities are encoded in two new closely-related optimizer rules:

  • GbAggSplitToRanges
  • SelOnGbAggSplitToRanges

The extended event query_optimizer_batch_mode_agg_split is provided to track when this new optimization is considered. The description of this event is:

Occurs when the query optimizer detects batch mode aggregation is likely to spill and tries to split it into multiple smaller aggregations.

Other than that, this new feature hasn’t been documented yet. This article is intended to help fill that gap.

Sunday, 26 July 2020

A bug with Halloween Protection and the OUTPUT Clause

A bug with Halloween Protection and the OUTPUT Clause

Background

The OUTPUT clause can be used to return results from an INSERT, UPDATE, DELETE, or MERGE statement. The data can be returned to the client, inserted to a table, or both.

There are two ways to add OUTPUT data to a table:

  1. Using OUTPUT INTO
  2. With an outer INSERT statement.

For example:

-- Test table
DECLARE @Target table
(
    id integer IDENTITY (1, 1) NOT NULL, 
    c1 integer NULL
);

-- Holds rows from the OUTPUT clause
DECLARE @Output table 
(
    id integer NOT NULL, 
    c1 integer NULL
);

Sunday, 5 July 2020

How MAXDOP Really Works

How MAXDOP Really Works

A few days ago I ran a Twitter poll:

Twitter poll

The most popular answer gets highlighted by Twitter at the end of the poll, but as with many things on social media, that doesn’t mean it is correct:

Friday, 23 January 2015

Allocation Ordered Scans

Allocation Order Scans

When an execution plan includes a scan of a b-tree index structure, the storage engine may be able to choose between two physical access strategies when the plan is executed:

  1. Follow the index b-tree structure; or,
  2. locate pages using internal page allocation information.

Where a choice is available, the storage engine makes the runtime decision on each execution. A plan recompilation is not required for it to change its mind.

The b-tree strategy starts at the root of the tree, descends to an extreme edge of the leaf level (depending on whether the scan is forward or backward), then follows leaf-level page links until the other end of the index is reached.

The allocation strategy uses Index Allocation Map (IAM) structures to locate database pages allocated to the index. Each IAM page maps allocations to a 4GB interval in a single physical database file, so scanning the IAM chains associated with an index tends to access index pages in physical file order (at least as far as SQL Server can tell).

Saturday, 31 August 2013

Nested Loops Prefetching

Nested Loops Prefetching

Nested loops join query plans can be a lot more interesting (and complicated) than is commonly realized.

One query plan area I get asked about a lot is prefetching. It is not documented in full detail anywhere, so this seems like a good topic to address in a blog post.

The examples used in this article are based on questions asked by Adam Machanic.

Wednesday, 24 July 2013

Two Partitioning Peculiarities

Two Partitioning Peculiarities

Table partitioning in SQL Server is essentially a way of making multiple physical tables (row sets) look like a single table. This abstraction is performed entirely by the query processor, a design that makes things simpler for users, but which makes complex demands of the query optimizer.

This post looks at two examples which exceed the optimizer’s abilities in SQL Server 2008 onward.

Wednesday, 20 March 2013

The Problem with Window Functions and Views

The Problem with Window Functions and Views

Introduction

Since their introduction in SQL Server 2005, window functions like ROW_NUMBER and RANK have proven to be extremely useful in solving a wide variety of common T-SQL problems. In an attempt to generalize such solutions, database designers often look to incorporate them into views to promote code encapsulation and reuse.

Unfortunately, a limitation in the SQL Server query optimizer often means that views1 containing window functions do not perform as well as expected. This post works through an illustrative example of the problem, details the reasons, and provides a number of workarounds.

Note: The limitation described here was first fixed in SQL Server 2017 CU 30. Optimizer fixes must be enabled using trace flag 4199 or the database scoped configuration option. The fix is standard behaviour without optimizer hotfixes under compatibility level 160 (SQL Server 2022).

Friday, 8 March 2013

Execution Plan Analysis: The Mystery Work Table

Execution Plan Analysis: The Mystery Work Table

I love SQL Server execution plans. It is often easy to spot the cause of a performance problem just by looking at one closely. That task is considerably easier if the plan includes run-time information (a so-called ‘actual’ execution plan), but even a compiled plan can be very useful.

Nevertheless, there are still times when the execution plan does not tell the whole story, and we need to think more deeply about query execution to really understand a problem. This post looks at one such example, based on a question I answered.

Thursday, 21 February 2013

Halloween Protection – The Complete Series

Halloween Protection – The Complete Series

I have written a four-part series on the Halloween Problem.

Some of you will never have heard about this issue. Those that have might associate it only with T-SQL UPDATE queries. In fact, the Halloween Problem affects execution plans for INSERT, UPDATE, DELETE and MERGE statements.

This is a topic I have been meaning to write about properly for years, ever since I read Craig Freedman’s 2008 blog post on the topic, which ended with the cryptic comment:

“…although I’ve used update statements for all of the examples in this post, some insert and delete statements also require Halloween protection, but I’ll save that topic for a future post.”

That future post never materialized, so I thought I would have a go. The four parts of the series are summarized and linked below, I hope you find the material interesting.

Wednesday, 20 February 2013

The Halloween Problem – Part 4

The Halloween Problem – Part 4

The Halloween Problem can have a number of important effects on execution plans. In this final part of the series, we look at the tricks the optimizer can employ to avoid the Halloween Problem when compiling plans for queries that add, change or delete data.

Monday, 18 February 2013

The Halloween Problem – Part 3

The Halloween Problem – Part 3

The MERGE statement (introduced in SQL Server 2008) allows us to perform a mixture of INSERT, UPDATE, and DELETE operations using a single statement.

The Halloween Protection issues for MERGE are mostly a combination of the requirements of the individual operations, but there are some important differences and a couple of interesting optimizations that apply only to MERGE.