You guys are still here! Synonyms are from the devil?

So... I have a large db designed to support "full text search" for documents that I manage, and I have been researching an issue of slowness when trying to do some of the searches. I have discovered that the synonyms in my query appear to have a significant (negative) impact on performance.

Set up:
2 dbs:
1 - Doc info and metadata
2 - Indexing info

Synonyms are used to 'link' the dbs together so I don't have to hardcode db names into each. Which allows me to have differently named pairs of these dbs for testing/dev/etc.

Query in question:

With synonym indexdb | Without synonym IndexDb.dbo:

( Select EntryId From indexdb.tbl_EntryTokenSummary with(nolock) Where TokenId = indexdb.GetTokenId(4,518529899282903344,898) and (substring(BitMask,1,1) & 8) = 8 Union Select EntryId From indexdb.tbl_EntryTokenSummary with(nolock) Where TokenId = indexdb.GetTokenId(4,518529899282903345,899) and (substring(BitMask,1,1) & 8) = 8 Union Select EntryId From indexdb.tbl_EntryTokenSummary with(nolock) Where TokenId = indexdb.GetTokenId(4,518529899282903346,900) and (substring(BitMask,1,1) & 8) = 8 ) Intersect ( Select EntryId From indexdb.tbl_EntryTokenSummary with(nolock) Where TokenId = indexdb.GetTokenId(4,2608536602748007472,7636274) Union Select EntryId From indexdb.tbl_EntryTokenSummary with(nolock) Where TokenId = indexdb.GetTokenId(4,2622845642773639265,28336751) ) Option(Recompile)

I ran this query with synonyms, without, and again with and got very different time performance. On the surface, the query plan looks the same:

Is there something I am missing about synonyms?
Why would they affect performance so much?
My current plan is to just 'translate' the aliases before I run the query... are there better options?

Thanks!

seventhnight

What is the difference between dmi.tbl_EntryTokenSummary and dbo.tbl_EntryTokenSummary?

Would need to see the actual execution plans - not pictures. I am also confused by the statement that you have a synonym called 'indexdb' to replace 'indexdb.dbo'. Synonyms are used for the full object reference and not a partial reference.

The query you show isn't displaying a difference between using a synonym vs not using a synonym, so I can only guess at the real differences. With that said - that query is going to have performance issues either way, since you are using a scalar function in the where clause - multiple times in addition to using SUBSTRING on a bit mask with bitwise operation.

And - throw in the UNION vs UNION ALL so you have multiple sort operations - and yeah, going to have some problems with performance.

dmi.object is a synonym in the dmi schema that references an object of the same name in the dbo schema of the index db.

For example:

dmi.tbl_EntryTokenSummary references the object DMI.dbo.tbl_EntryTokenSummary

So in both cases the SAME sql object is ultimate referenced.

Couldn't figure out how to fix the formatting in the original question... so here is the SQL again.

(
Select EntryId From *dmi*.tbl_EntryTokenSummary with(nolock) Where TokenId = *dmi*.GetTokenId(4,518529899282903344,898) and (substring(BitMask,1,1) & 8) = 8
Union
Select EntryId From *dmi*.tbl_EntryTokenSummary with(nolock) Where TokenId = *dmi*.GetTokenId(4,518529899282903345,899) and (substring(BitMask,1,1) & 8) = 8
Union
Select EntryId From *dmi*.tbl_EntryTokenSummary with(nolock) Where TokenId = *dmi*.GetTokenId(4,518529899282903346,900) and (substring(BitMask,1,1) & 8) = 8
)
Intersect
(
Select EntryId From *dmi*.tbl_EntryTokenSummary with(nolock) Where TokenId = *dmi*.GetTokenId(4,2608536602748007472,7636274)
Union
Select EntryId From *dmi*.tbl_EntryTokenSummary with(nolock) Where TokenId = *dmi*.GetTokenId(4,2622845642773639265,28336751)
)
Option(Recompile)

This only shows the code that is using the synonym - so no way to compare to the other code that is not using a synonym.

The actual execution plans would also help - the plan with the synonym vs the plan without.

One question comes to mind - who is the owner of the DMI schema?

I feel like I haven't done a great job presenting this information, so, sorry for that.
The crux of the issue is now: "Why does using synonyms cause a massive amount of scans?"

I'm working in the DM db, so the synonym DM.dmi.tbl_EntryTokenSummary points to DMI.dbo.tbl_EntryTokenSummary.

The purpose of the synonyms are to help 'connect' to databases with having to explicitly name them in generated code.
For example, DM1 db pairs with DMI1 db and DM2 db pairs with DMI2 db.

Slow Query:

(
    Select EntryId From DM.dmi.tbl_EntryTokenSummary with(nolock) Where TokenId = DM.dmi.GetTokenId(4,518529899282903344,898) and (substring(BitMask,1,1) & 8) = 8
    Union
    Select EntryId From DM.dmi.tbl_EntryTokenSummary with(nolock) Where TokenId = DM.dmi.GetTokenId(4,518529899282903345,899) and (substring(BitMask,1,1) & 8) = 8
    Union
    Select EntryId From DM.dmi.tbl_EntryTokenSummary with(nolock) Where TokenId = DM.dmi.GetTokenId(4,518529899282903346,900) and (substring(BitMask,1,1) & 8) = 8
)
Intersect
(
    Select EntryId From DM.dmi.tbl_EntryTokenSummary with(nolock) Where TokenId = DM.dmi.GetTokenId(4,2608536602748007472,7636274)
    Union
    Select EntryId From DM.dmi.tbl_EntryTokenSummary with(nolock) Where TokenId = DM.dmi.GetTokenId(4,2622845642773639265,28336751)
)
Option(Recompile)

Fast Query:

(
    Select EntryId From DMI.dbo.tbl_EntryTokenSummary with(nolock) Where TokenId = DMI.dbo.GetTokenId(4,518529899282903344,898) and (substring(BitMask,1,1) & 8) = 8
    Union
    Select EntryId From DMI.dbo.tbl_EntryTokenSummary with(nolock) Where TokenId = DMI.dbo.GetTokenId(4,518529899282903345,899) and (substring(BitMask,1,1) & 8) = 8
    Union
    Select EntryId From DMI.dbo.tbl_EntryTokenSummary with(nolock) Where TokenId = DMI.dbo.GetTokenId(4,518529899282903346,900) and (substring(BitMask,1,1) & 8) = 8
)
Intersect
(
    Select EntryId From DMI.dbo.tbl_EntryTokenSummary with(nolock) Where TokenId = DMI.dbo.GetTokenId(4,2608536602748007472,7636274)
    Union
    Select EntryId From DMI.dbo.tbl_EntryTokenSummary with(nolock) Where TokenId = DMI.dbo.GetTokenId(4,2622845642773639265,28336751)
)
Option(Recompile)

When looking comparing the execution profiles, I noticed that 1 step was out of order, but otherwise looked very similar:

Kicker was looking at the I/O:


The scan count when using the synonyms is through the roof.

Query Design Concerns:

Scalar Function:
The scalar function is used in the where clause and encapsulate the construction of a varbinary value. Being in the where clause with static values and being deterministic, I'm reasonably sure it only gets calculated once. I tested with the resultant values and saw no significant difference between that query and the query using the function via direct object reference.

Substring:
Unavoidable. Still doesn't appear to be too much of a performance impact.

Union vs Union All & Intersect:
Union is the appropriate use as I do not want duplicates. The table is designed so that the clustered index (pk) is being hit and the data is already sorted in the appropriate order.
Interestingly, this query doesn't have ANY sort operations.

Volume:
The poor performing synonym query is running in ~53 seconds, but the direct reference query runs in ~.350 seconds. The table being referenced in each (5 times) has 9,602,395,633 rows.

Update:

Through experimentation, I've discovered that the performance issue appears to be directly related to the use of the scalar function via the synonym in the where clause.

If the function is used by synonym reference [DM.dmi.GetTokenId()], results are SLOOOWW.
If the function is used by direct reference [DMI.dbo.GetTokenId()], results are Fast.
If the function is bypassed using literals [0x0000000ETC], results are Fast.
There does not appear to be a performance distinction between direct function reference and literal.

Using the synonym or direct reference on the table has no discernable performance impact.

So, I can mitigate the performance issues, BUT I still don't understand why the function being referenced via synonym has such an impact?

Which version of SQL are you on? If it's a newer version that can in-line scalar functions, maybe SQL doesn't inline the function if an alias is used to call it?

1 Like

When you run the code using direct references - is the database context the same as the 3-part naming being used? In other words, are you running the code with a direct reference while connected to the DMI database?

I just noticed you are using 3-part naming to access that function. What you should be doing is using 2-part names in the database where the code is called.

Instead of calling that function as DM.dmi.GetTokenId() you should be calling it as dmi.GetTokenID() with the synonym defined in the database where the code is being called.

In the DMI database - CREATE SYNONYM dmi.GetTokenId() FOR DMI.dbo.GetTokenId();

If possible, you should see if you can convert that function to an inline-table valued function and then use CROSS/OUTER apply to call it.

Pretty sure we are 2019, so that could be a definite possibility.
I'm double checking with my DBA.

This code has been in Production since the end of August, so I'm wondering why it's just now surfacing as an issue.

Maybe a SQL Patch tweaked the query engine somehow and introduced this issue?

Hmm, and you already have RECOMPILE, which is another thing I would have suggested trying.

You can determine the version and CU level with

SELECT @@VERSION;

So you can....

Microsoft SQL Server 2019 (RTM-CU30) (KB5049235) - 15.0.4415.2 (X64)