We have an IDENTITY column in every table (usually the PKey, rarely the Clustered Index). It means that we have a "safe" single-column unique value to use in URLs and the like (the "like" includes IDs for HTML elements too, such as drill-down <DIV> that are replaced by AJAX).
I have a report which (now) needs to run on data from Production. We have warehouse of that data, but it is impractical to refresh it - the report is to find exceptions (in Production table), they are fixed, then User wants to re-run the report to look for further issues, clearly waiting for an import to Warehouse is not practical.
Production is a 3rd party DB, Warehouse is an in-house DB. I can make reasonable changes to the 3rd Party DB, but we try to avoid doing that (in case complications with normal operation and/or during upgrades)
Also, the performance of the report if run on Production tables is terrible. I haven't looked into that, but probably we already have more suitable indexes in Warehouse - that's fair enough, the two DBs are doing different jobs.
So I've been doing:
SELECT MyID, PKCol, Col2, Col3, ...
FROM WAREHOUSE.dbo.MyTable
WHERE ...
and I can change that to:
SELECT PKCol, Col2, Col3, ...
, [PKCol_ENCODED] = dbo.MyEncodeFunction(PKCol)
FROM PRODUCTION.dbo.MyTable
WHERE ...
and then using PKCol_ENCODED
anywhere that can't handle embedded characters like SPACE or "",
"&" (yes, I know, but they HAVE been putting things like that in the PKCol, and the APP does nothing to prevent that) and they drive both the URL and the <DIV ID="XXX\YYY">
mad ...
And now I find that my Cunning Plan! to replace any non alphanumeric with "_" is causing duplicates ... XXX\YYY
and XXX&YYY
both map to the same thing ...
My first thought is to create a translation table for the PKeys
CREATE TABLE PKeyTranslation
(
[MyID] int IDENTITY(1,1) NOT NULL
, [MyPKey] varchar(10) NOT NULL
)
and then to freshen this up on every run of the report (which will find no, or very few, new rows each time)
But its a right palaver to do (adding to lots of different reports), and maybe the reports are going to be too slow anyway (because querying PRODUCTION with inappropriate structure/indexes etc)
In fairness the ID is only required to provide a unique ID (for URL and <DIV ID>) and if the user Drills Down and I get the ID as a @Parameter to my SProc that will only be a single-row decode lookup in the PKeyTranslation table
I suppose I could use a single PKeyTranslation table regardless of how many physical tables there are (provided they all have a one-part varchar column for their PKey), as I don't really care about the mapping, I just need a unique, non-changing, ID ...
... but I'm more inclined to match the PKey to the actual WAREHOUSE table's ID so that whether it is displaying WAREHOUSE data or PRODUCTION data the ID will always represent the same row.
That will mean (presumably?) changing the IDENTITY in the Warehouse Table to INT and putting the IDENTITY in the "translation table" [one for each actual table] instead.
Grrrr ... I hate 3rd Party Databases !!
Ding! Possibly eureka moment. Add a TRIGGER to PRODUCTION so that Translation Table in WAREHOUSE is maintained in real time?