Page Free Space: Parallelism

Showing posts with label Parallelism. Show all posts

Thursday, 20 June 2024

SQL Server Parallel Index Builds

Parallel Index Building Execution Plan

SQL Server doesn't support parallel modifications to a b-tree index.

That might sound surprising. After all, you can certainly write to the same b-tree index from multiple sessions concurrently. For example, two sessions can happily write alternating odd and even numbers to the same integer b-tree index. So long as both sessions execute on different schedulers and take row locks, there will be no blocking and you'll get true concurrency.

No, what I mean is: A single session can't write to a b-tree index using more than one thread. No parallel plan modifications of a b-tree index, in other words. It's a bit like the lack of parallel backward ordered scans. There's no reason it couldn't be implemented, but it hasn't been so far.

You may have thought SQL Server would use a regular parallel scan to read the index source data, optionally sort it into index key order, then add those rows to the index in parallel. This would indeed work, even without sorting, but SQL Server just can't do it.

In case you're wondering, sorting into destination key order is an optimization. The resulting index would still be correct without it, but you'd be inserting rows essentially at random into a b-tree, with all the random I/O and page splitting that would entail.

Ok, you say, but what about parallel index builds? They've been around for a long time in premium editions and certainly seem to modify a single b-tree in parallel. Yes, they do seem to, but SQL Server cheats.

Read the full article on 𝕏.

https://t.co/0f59kSlLzQ
— Paul White (@SQL_Kiwi) June 20, 2024

Friday, 31 May 2024

Impossible Execution Plan Timings

Erik Darling (@erikdarlingdata) shared an interesting SQL Server execution plan with me recently. The demo script is at the end of this article.

The important section is shown below:

Impossible timings?

The Gather Streams operator appears to execute for less time (2.16s) than the Sort operator below it (5.431s). This seems impossible on the face of it.

The Parallelism (Gather Streams) operator runs in row mode (as always), while the Sort and Hash Match (Inner Join) operators both run in batch mode. This mixed mode plan adds a little complexity to interpreting plan timings because:

A batch mode operator reports CPU and elapsed times for that operator alone
A row mode operator reports times for itself and all its children

I've written about those aspects before in Understanding Execution Plan Operator Timings, which also covers a confusing situation that can arise in exclusively row mode parallel plans.

I showed a hidden option to make all operators report only their individual times in More Consistent Execution Plan Timings in SQL Server 2022. That feature isn't complete yet, so the results aren't perfect, and it's not documented or supported.

I mention all that in case you are interested in the background. None of the foregoing explains what we see in this mixed mode plan. The row mode Gather Streams elapsed time ought to include its children. The batch mode Sort should just be reporting its own elapsed time. With that understanding in mind, there's no way the Sort could run for longer than the Gather Streams. What's going on here?

Read the full article on 𝕏.

https://t.co/DrANqdwngA
— Paul White (@SQL_Kiwi) May 30, 2024

Tuesday, 30 August 2022

Reducing Contention on the NESTING_TRANSACTION_FULL latch

Each additional worker thread in a parallel execution plan executes inside a nested transaction associated with the single parent transaction.

Parallel worker access to shared parent transaction structures is protected by a latch. A NESTING_TRANSACTION_READONLY latch is used for a read-only transaction. A NESTING_TRANSACTION_FULL latch is used if the transaction has modified the database.

This design has its roots in SQL Server 7, where read-only query parallelism was introduced. SQL Server 2000 built on this with parallel index builds, which for the first time allowed multiple threads to cooperate to change a persistent database structure. Many improvements have followed since then, but the fundamental parent-child transaction design remains today.

Though lightweight, a latch can become a point of contention when requested sufficiently frequently in incompatible modes by many different threads. Some contention on shared resources is to be expected; it becomes a problem when latch waits start to affect CPU utilisation and throughput.

Read the full article »

Saturday, 23 July 2022

More Consistent Execution Plan Timings in SQL Server 2022

The updated showplan schema shipped with SSMS 19 preview 2 contains an interesting comment:

ExclusiveProfileTimeActive: true if the actual elapsed time (ActualElapsedms attribute) and the actual CPU time (ActualCPUms attribute) represent the time interval spent exclusively within the relational iterator.

What does this mean?

Read the full article »

Saturday, 5 June 2021

Empty Parallel Zones

An empty parallel zone is an area of the plan bounded by exchanges (or the leaf level) containing no operators.

How and why does SQL Server sometimes generate a parallel plan with an empty parallel zone?

Read the full article »

Wednesday, 24 March 2021

Incorrect Results with Parallel Eager Spools and Batch Mode

You might have noticed a warning at the top of the release notes for SQL Server 2016 SP2 CU 16:

Note: After you apply CU 16 for SQL Server 2016 SP2, you might encounter an issue in which DML (insert/update/delete) queries that use parallel plans cannot complete any execution and encounter HP_SPOOL_BARRIER waits. You can use the trace flag 13116 or MAXDOP=1 hint to work around this issue. This issue is related to the introduction of fix for 13685819 and it will be fixed in the next Cumulative Update.

That warning links to bug reference 13685819 on the same page. There isn’t a separate KB article, only the description:

Fixes an issue with insert query in SQL Server 2016 that reads the data from the same table and uses a parallel execution plan may produce duplicate rows

Read the full article »

Sunday, 5 July 2020

How MAXDOP Really Works

A few days ago I ran a Twitter poll:

Twitter poll

The most popular answer gets highlighted by Twitter at the end of the poll, but as with many things on social media, that doesn’t mean it is correct:

Read the full article »

Thursday, 3 May 2012

Parallel Execution Plans Suck

Summary: A deep dive into SQL Server parallelism, and a potential performance problem with parallel plans that use TOP.

Read the full article »

Friday, 23 December 2011

Forcing a Parallel Query Execution Plan

This article is for SQL Server developers who have experienced the special kind of frustration that only comes from spending hours trying to convince the query optimizer to generate a parallel execution plan.

This situation often occurs when making an apparently innocuous change to the text of a moderately complex query — a change which somehow manages to turn a parallel plan that executes in ten seconds, into a five-minute serially-executing monster.

Read the full article »

Thursday, 7 July 2011

Bitmap Magic (or… how SQL Server uses bitmap filters)

Question

Can a parallel query use less CPU than the same serial query, while executing faster?

The answer is yes! To demonstrate, I’ll use the following two (heap) tables, each containing a single column typed as integer:

Tables #BuiltInt and #Probe

Read the full article »

Saturday, 5 March 2011

How Parallelism Works in SQL Server

You might have noticed that January was a quiet blogging month for me.

Part of the reason was that I was working on an article for Simple Talk, looking at how parallel query execution really works. The first part is published today at:

Understanding and Using Parallelism in SQL Server.

This introductory piece is not quite as technical as normal, but I hope there be enough interesting material there to make it worth a read.

Thursday, 4 November 2010

Myth: SQL Server Caches a Serial Plan with every Parallel Plan

Many people believe that whenever SQL Server creates an execution plan that uses parallelism, an alternative serial plan is also cached.

The idea seems to be that the execution engine then decides between the parallel and serial alternatives at runtime. I’ve seen this on forums, in blogs, and even in books.

In fairness, a lot of the official documentation is not as clear as it might be on the subject. In this post I will show that only a single (parallel) plan is cached. I will also show that SQL Server can execute a parallel plan on a single thread.

Read the full article »

About This Blog

Thursday, 20 June 2024

Friday, 31 May 2024

Tuesday, 30 August 2022

Saturday, 23 July 2022

Saturday, 5 June 2021

Wednesday, 24 March 2021

Sunday, 5 July 2020

Thursday, 3 May 2012

Friday, 23 December 2011

Thursday, 7 July 2011

Question

Saturday, 5 March 2011

Thursday, 4 November 2010