About This Blog

Including my content originally published on 𝕏, SQLperformance.com, and SQLblog.com

Saturday, 24 August 2019

Batch Mode Bitmap Demos

Batch Mode Bitmap Demos

This is a companion post to my main article Batch Mode Bitmaps in SQL Server. This post provides demos and illustrations to supplement the technical article.

The scripts presented here were run on SQL Server 2017 CU 16.

Setup

The demos use a table of numbers with ten million rows:

-- Itzik Ben-Gan's row generator
WITH
  L0   AS (SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
SELECT
    -- Destination column type integer NOT NULL
    ISNULL(CONVERT(integer, N.n), 0) AS n
INTO dbo.Numbers
FROM Nums AS N
WHERE N.n >= 1
AND N.n <= 10 * 1000 * 1000
OPTION (MAXDOP 1);
GO
ALTER TABLE dbo.Numbers
ADD CONSTRAINT PK_Numbers_n
PRIMARY KEY CLUSTERED (n)
WITH (SORT_IN_TEMPDB = ON, MAXDOP = 1, FILLFACTOR = 100);

Bitmap Selectivity and Rounding

The query optimizer introduces a bitmap for a batch mode hash join if the estimated selectivity of a hypothetical semi join on the hash join input keys is 0.75 or lower.

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1 (c1)
SELECT N.n
FROM dbo.Numbers AS N
WHERE N.n BETWEEN 1 AND 102400 * 3 / 4;

INSERT #T2 (c1)
SELECT N.n
FROM dbo.Numbers AS N
WHERE N.n BETWEEN 1 AND 102400;

SELECT COUNT_BIG(*) 
FROM #T1 AS T1
JOIN #T2 AS T2 ON T2.c1 = T1.c1
OPTION (MAXDOP 1, QUERYTRACEON 3604, QUERYTRACEON 2363);

The TF 2363 output shows a semi join selectivity of 0.75:

Begin selectivity computation

Input tree:
  LogOp_LeftSemiJoin
      CStCollBaseTable(ID=2, CARD=102400 TBL: #T2 AS TBL: T2)
      CStCollBaseTable(ID=1, CARD=76800 TBL: #T1 AS TBL: T1)
      ScaOp_Comp x_cmpEq
          ScaOp_Identifier QCOL: [T1].c1
          ScaOp_Identifier QCOL: [T2].c1

Plan for computation:
  CSelCalcExpressionComparedToExpression( QCOL: [T2].c1 x_cmpEq QCOL: [T1].c1 )

Selectivity: 0.75

Stats collection generated: 
  CStCollJoin(ID=9, CARD=76800 x_jtLeftSemi)
      CStCollBaseTable(ID=2, CARD=102400 TBL: #T2 AS TBL: T2)
      CStCollBaseTable(ID=1, CARD=76800 TBL: #T1 AS TBL: T1)

End selectivity computation

The execution plan shows the hash join is a bitmap creator, with a generated name of Opt_Bitmap1006. The probe-side bitmap filter is pushed down from the hash join to the columnstore scan:

Batch Mode Bitmap Plan Properties

The optimizer expects 25% of the probe-side rows to be eliminated by the bitmap. This is just enough for the optimizer to specify that the hash join should build a bitmap.

The estimated number of rows produced by the columnstore scan is 102,400, which matches the total cardinality of that table. The selectivity of 0.75 is rounded to a power of ten for cardinality estimation purposes. This reflects the optimizer’s uncertainty about the effectiveness of the bitmap at runtime. See the main article for the details of the rounding calculation.

In this case, selectivity is rounded up to 1, meaning all rows qualify. This only affects the cardinality estimate on the probe side of the hash join. The decision to use a bitmap has already been made.

No bitmap

Adding one more matching row to the build-side table is enough to convince the optimizer not to build a bitmap:

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1 (c1)
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 102400 * 3 / 4 + 1;

INSERT #T2 (c1) 
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 102400;

SELECT COUNT_BIG(*)
FROM #T1 AS T1 
JOIN #T2 AS T2 ON T2.c1 = T1.c1
OPTION (MAXDOP 1, QUERYTRACEON 3604, QUERYTRACEON 2363);

The selectivity of the semi join is calculated to be 0.75001:

Begin selectivity computation

Input tree:

  LogOp_LeftSemiJoin
      CStCollBaseTable(ID=2, CARD=102400 TBL: #T2 AS TBL: T2)
      CStCollBaseTable(ID=1, CARD=76801 TBL: #T1 AS TBL: T1)
      ScaOp_Comp x_cmpEq
          ScaOp_Identifier QCOL: [T1].c1
          ScaOp_Identifier QCOL: [T2].c1

Plan for computation:

  CSelCalcExpressionComparedToExpression( QCOL: [T2].c1 x_cmpEq QCOL: [T1].c1 )

Selectivity: 0.75001

Stats collection generated: 

  CStCollJoin(ID=9, CARD=76801 x_jtLeftSemi)
      CStCollBaseTable(ID=2, CARD=102400 TBL: #T2 AS TBL: T2)
      CStCollBaseTable(ID=1, CARD=76801 TBL: #T1 AS TBL: T1)

End selectivity computation

When a batch mode hash join does not build a bitmap, the Bitmap Creator property is absent, and there is no probe-side filter.

Inconsistent estimates

The optimizer assumes the rounded selectivity of a batch mode bitmap applies to all operators on the probe side of the join. In other words, it expects the bitmap filter to be pushed all the way down.

Pushing the bitmap filter is performed in a separate pass over the finished execution plan. If the filter cannot be pushed all the way down, estimates will look inconsistent:

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1 (c1) 
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 102400 * 316 / 1000;

INSERT #T2 (c1) 
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 102400;

SELECT COUNT_BIG(*)
FROM #T1 AS T1 
JOIN #T2 AS T2 ON T2.c1 + 1 = T1.c1 
OPTION (MAXDOP 1, QUERYTRACEON 3604, QUERYTRACEON 2363);

This demo produces a semi join selectivity of 0.315986:

Begin selectivity computation

Input tree:

  LogOp_LeftSemiJoin
      CStCollProject(ID=11, CARD=102400)
          CStCollBaseTable(ID=2, CARD=102400 TBL: #T2 AS TBL: T2)
      CStCollBaseTable(ID=1, CARD=32358 TBL: #T1 AS TBL: T1)
      ScaOp_Comp x_cmpEq
          ScaOp_Identifier COL: Expr1005 
          ScaOp_Identifier QCOL: [T1].c1

Plan for computation:

  CSelCalcExpressionComparedToExpression

Selectivity: 0.315986

Stats collection generated: 

  CStCollJoin(ID=15, CARD=32357 x_jtLeftSemi)
      CStCollProject(ID=11, CARD=102400)
          CStCollBaseTable(ID=2, CARD=102400 TBL: #T2 AS TBL: T2)
      CStCollBaseTable(ID=1, CARD=32358 TBL: #T1 AS TBL: T1)

End selectivity computation

The selectivity is rounded to 0.1 for the probe-side bitmap filter. The optimizer assumes this selectivity will apply all the way down to the columnstore scan, but the calculation in the ON clause prevents the bitmap filter being pushed down that far.

The estimated execution plan shows the cardinality of the columnstore table (102,400 rows) being multiplied by the bitmap selectivity (0.1) even though the bitmap is not applied until the later Filter operator:

Inconsistent estimates

This simplification allows the optimizer to skip calculating many alternative positions for the bitmap filter, but it can result in strange-looking estimates, and potentially poor plan choices. Something to be aware of.

Bitmap Type

While the optimizer decides whether a hash join batch mode bitmap should be used, the batch mode execution engine decides the type of bitmap to use at runtime, after the hash join hash table has been built (without spilling).

The type of bitmap (simple or complex) is reported by the extended event query_execution_column_store_segment_scan_started.

Simple bitmap

The following script creates a build-side table with 1,048,576 values and a probe-side columnstore table with 4 segments of 1,048,576 values each:

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1 (c1) 
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 1048576;

INSERT #T2 (c1) 
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 4 * 1048576;

This batch mode execution engine chooses a small SIMPLE bitmap for the following query at runtime because the range of build-side values fits in 1MB:

DBCC TRACEON (3604, 646, -1);
GO
SELECT COUNT_BIG(*)
FROM #T1 AS T1 
JOIN #T2 AS T2 ON T2.c1 = T1.c1 
OPTION (MAXDOP 1);
GO
DBCC TRACEOFF (3604, 646, -1);

The extended event reports:

Extended event details

The trace flag reports that segment elimination was achieved using information about the range of values in the bitmap compared with columnstore minimum and maximum data value metadata:

Xact (xactid=372361906496) skipped row group
    (dbid=2 rowsetid=6989586646587736064 rowgroupid=3)
Xact (xactid=372361906496) skipped row group
    (dbid=2 rowsetid=6989586646587736064 rowgroupid=2)
Xact (xactid=372361906496) skipped row group
    (dbid=2 rowsetid=6989586646587736064 rowgroupid=1)

Complex bitmap

The following demo creates the same probe-side table as before (with four fully-populated segments). The build-side table has only two rows, containing the values 1 and (8 * 1048576 + 1).

This range is one bit too big to fit in an 1MB small simple bitmap. The execution engine chooses between a large simple bitmap and a complex bitmap at runtime. In this case, it chooses a complex bitmap since it will be much smaller because there are only two distinct values.

To force a complex bitmap, make the upper value (8 * 8 * 1048576 + 1) instead. This exceeds the number of bits in an 8MB bitmap, which is the maximum for a large simple bitmap.

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1 (c1)
VALUES (1), (8 * 1048576 + 1);

INSERT #T2 (c1) 
SELECT N.n 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 4 * 1048576;
GO
DBCC TRACEON (3604, 646, -1);
GO
SELECT COUNT_BIG(*)
FROM #T1 AS T1 
JOIN #T2 AS T2 ON T2.c1 = T1.c1 
OPTION (MAXDOP 1);
GO
DBCC TRACEOFF (3604, 646, -1);

The extended event now reports a complex bitmap:

Complex bitmap

Three of the four columnstore segments are still skipped, thanks to the range table built in addition to the complex bitmap:

Xact (xactid=372361906544) skipped row group
    (dbid=2 rowsetid=7854277775534391296 rowgroupid=3)
Xact (xactid=372361906544) skipped row group
    (dbid=2 rowsetid=7854277775534391296 rowgroupid=2)
Xact (xactid=372361906544) skipped row group
    (dbid=2 rowsetid=7854277775534391296 rowgroupid=1)

The post-execution plan shows perfect early row filtering (despite a Bloom filter being used):

Perfect Bloom filtering

This is not unexpected because there are only two unique values to encode, and only one actually exists on the probe side.

Compiled bitmap on compressed data

The next demo creates a columnstore table with 32 duplicates of each integer value from 0 to 32,767 inclusive. The build-side table contains just the two values 1 and 32767.

The duplicate values encourage SQL Server to use a dictionary. There are not enough values to justify run-length encoding, so the dictionary values are bit-pack compressed.

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1
VALUES (1), (32767)

INSERT #T2 (c1) 
SELECT N.n % 32768 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 1024 * 1024;
GO
SELECT COUNT_BIG(*)
FROM #T1 AS T1
JOIN #T2 AS T2 ON T2.c1 = T1.c1 
OPTION (MAXDOP 1);

The scan started extended event shows that a new bitmap filter was created on the compressed data:

Filter on compressed data

The extended event column_store_expression_filter_set shows the new bitmap has one bit for each of the 32,768 dictionary entries, and two bits set — one for each value set in the simple bitmap:

Expression filter

Compressed data comparison bitmap filter

If you change the values in the build-side table to form a contiguous single range, the expression filter bitmap will change from RAWBITMAP to one of the comparison types such a LT for a less-than filter.

DROP TABLE IF EXISTS #T1, #T2;
GO
CREATE TABLE #T1 (c1 bigint NOT NULL);
CREATE TABLE #T2 (c1 bigint NOT NULL, INDEX c CLUSTERED COLUMNSTORE);

INSERT #T1
VALUES (123), (124), (125)

INSERT #T2 (c1) 
SELECT N.n % 32768 
FROM dbo.Numbers AS N 
WHERE N.n BETWEEN 1 AND 1024 * 1024;
GO
SELECT COUNT_BIG(*)
FROM #T1 AS T1 
JOIN #T2 AS T2 ON T2.c1 = T1.c1 
OPTION (MAXDOP 1);

The compressed data bitmap now has three bits set:

Three bits set

The change to a contiguous range of values on the build side (123 to 125 inclusive) means the scan started event now reports a LTGT comparison filter instead of a RAWBITMAP:

LTGT bitmap filter

That concludes the demos for my batch mode bitmap article.

No comments:

Post a Comment

All comments are reviewed before publication.