This article describes the structure of a sql_handle
and shows how the batch text hash component is calculated.
SQL handle structure
The 44-byte sql_handle
has the following components (in order):
- 4 bytes: Byte-reversed integer code for the target cache e.g.
0x02000000
(2) for SQL Plans (CACHESTORE_SQLCP
)0x03000000
(3) for Object Plans (CACHESTORE_OBJCP
)
- 4 bytes: Byte-reversed integer object id
- For adhoc and prepared SQL this is an internal hash of the batch text as returned by
@@PROCID
orobjectid
insys.dm_exec_plan_attributes
.
- For adhoc and prepared SQL this is an internal hash of the batch text as returned by
- 16 bytes: MD5 hash of the batch text including parameter definitions as a prefix (if present).
- 20 bytes
0000000000000000000000000000000000000000
Batch Text Hash
The second component (object id) of the SQL handle is described in the documentation as:
This is the object ID stored in sys.objects for database objects (procedures, views, triggers, and so on). For plans of type “Adhoc” or “Prepared”, it is an internal hash of the batch text.
Modules are easy; it is the ad-hoc SQL and prepared statements we are interested in. What exactly is the “internal hash of the batch text”?
The first question to ask is, what is the batch text? For batches without parameters, it is exactly the text of the whole batch. For batches with parameters, the whole batch text is prefixed with the parameter definitions enclosed in parentheses.
In both cases, every character (including spaces, carriage returns and whatnot) is significant.
Computing the Hash
T-SQL lacks several functions commonly used by hashing routines, like bit shifting and integer arithmetic that ignores overflow. To make the hash computation somewhat easier to read and understand, I will start by creating several helper scalar functions. Performance is not the goal here, but each of these functions is capable of being in-lined on SQL Server 2019+.
-- Shift arithmetic right 2 places
CREATE FUNCTION dbo.SAR2 (@i integer)
RETURNS integer
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
RETURN IIF(@i >= 0, @i / 4, ~(~@i / 4));
END;
-- Shift left 5 places
CREATE FUNCTION dbo.SHL5 (@i integer)
RETURNS integer
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
RETURN
CONVERT(integer,
CONVERT(binary(4),
CONVERT(bigint, @i) * 32));
END;
-- Integer add with wraparound
CREATE FUNCTION dbo.IntegerAdd (@x integer, @y integer)
RETURNS integer
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
RETURN
CONVERT(integer,
CONVERT(binary(4),
CONVERT(bigint, @x) + CONVERT(bigint, @y)));
END;
-- Integer substraction with wraparound
CREATE FUNCTION dbo.IntegerSub (@x integer, @y integer)
RETURNS integer
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
RETURN
CONVERT(integer,
CONVERT(binary(4),
CONVERT(bigint, @x) - CONVERT(bigint, @y)));
END;
-- Integer multiply with wraparound
CREATE FUNCTION dbo.IntegerMultiply (@x integer, @y integer)
RETURNS integer
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
RETURN
CONVERT(integer,
CONVERT(binary(4),
CONVERT(bigint, @x) * CONVERT(bigint, @y)));
END;
GO
The hash computation itself cannot be in-lined, and contains several magic numbers:
CREATE OR ALTER FUNCTION dbo.BatchTextHash
(
@BatchText nvarchar(max)
)
RETURNS integer
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
DECLARE
@bin varbinary(max) = CONVERT(varbinary(max), @BatchText, 0),
@b integer = 0,
@c integer = 0,
@d integer = 0,
@r integer = 0,
@x integer = 1;
DECLARE
@8 integer = DATALENGTH(@bin) - 2,
@9 integer = DATALENGTH(@bin);
-- Step 1
IF @8 <= 0 GOTO J2;
J1:
SET @x += 4;
SET @c = dbo.IntegerAdd(dbo.SHL5(@b), dbo.SAR2(@b));
SET @c = dbo.IntegerAdd(@c, CONVERT(integer, SUBSTRING(@bin, @x - 3, 1) + SUBSTRING(@bin, @x - 4, 1)));
SET @b ^= @c;
SET @c = dbo.IntegerAdd(CONVERT(integer, SUBSTRING(@bin, @x - 1, 1) + SUBSTRING(@bin, @x - 2, 1)), dbo.SHL5(@d));
SET @c = dbo.IntegerAdd(@c, dbo.SAR2(@d));
SET @d ^= @c;
IF @x < @8 GOTO J1;
J2:
IF @x < @9
BEGIN
SET @c = dbo.IntegerAdd(dbo.SHL5(@b), dbo.SAR2(@b));
SET @c = dbo.IntegerAdd(@c, CONVERT(integer, SUBSTRING(@bin, @x + 1, 1) + SUBSTRING(@bin, @x, 1)));
SET @b ^= @c;
END;
-- Step 2
SELECT
@d = dbo.IntegerMultiply(@d, 314159269),
@b = dbo.IntegerMultiply(@b, 1179605760);
SET @d = dbo.IntegerSub(@d, @b);
SET @x = IIF(@d >= 0, 0, -1);
SET @r = @d;
SET @r ^= @x;
SET @r = dbo.IntegerSub(@r, @x);
SET @x =
CONVERT(integer,
SUBSTRING(
CONVERT(binary(8),
CONVERT(bigint, 1152921497) *
CONVERT(bigint, @r)),
1, 4));
-- SAR (28) inlined
SET @x = IIF(@x >= 0, @x / 268435456, ~(~@x / 268435456));
SET @c = IIF(@x >= 0, 0, 1);
SET @x = dbo.IntegerAdd(@x, @c);
SET @x = dbo.IntegerMultiply(@x, 1000000007);
SET @r = dbo.IntegerSub(@r, @x);
IF @r = 0 SET @r = 1;
RETURN @r;
END;
Test 1 — No parameters
This shows the @@PROCID
for an ad-hoc SQL batch and the computed hash for the same text (including the carriage return and line feed):
GO
SELECT @@PROCID AS objectid;
GO
DECLARE @BatchText nvarchar(max) =
N'SELECT @@PROCID AS objectid;
';
SELECT dbo.BatchTextHash(@BatchText);
Both return the value 836550104
.
Test 2 — Prepared statement
This shows the @@PROCID
for a parameterized batch executed with sp_executesql
, and the same value computed using the hash function. Note the @Params
text sent to sp_executesql
is reused, after wrapping in parentheses:
DECLARE @SQL nvarchar(max) =
N'
SELECT
cnt = COUNT_BIG(*),
objectid = @@PROCID
FROM master..spt_values AS N
WHERE
N.number < @n;';
DECLARE @Params nvarchar(max) = N'@n integer';
EXECUTE sys.sp_executesql
@SQL,
@Params,
@n = 100;
SELECT dbo.BatchTextHash
(
N'(' +
@Params +
N')' +
@SQL
) AS objectid;
Again, the @@PROCID
and computes values are the same: 456334234
.
Final Thoughts
The code above is provided as-is. I believe it is a faithful T-SQL translation of the SQL Server routine in sqllang!CSQLStringsStream::ComputeObjidAndLength
, but it comes with no guarantees or support. I present for educational value, not for use in real applications.
This work is licensed under CC BY-NC-SA 4.0
Possibly saving time to others.
ReplyDeleteWhen creating the batch text to generate the hashes from, all char parameters must have also the size specified, ie. char(1), varchar(15) etc. varchar(max) also has to have a space after the closing bracket. I shit you not.
example:
(@Name varchar(max) )
select * from Orders where Name = @Name
Thanks, that is useful to add.
DeleteIt looks like that it doesn't work when the query has some unicode characters such as € then the function generates different hash :(
ReplyDeleteThat's unfortunate.
DeleteDespite the declaimer at the end of the text, I have found the problem and fixed it in the code above. There was an off-by-one error with SUBSTRING, and a failure to byte-swap for endianness. Works with double-byte characters now. Thank you for the feedback.
Delete