Best Practice on using SCHEMABINDING on all Views and Functions?

Kristen · July 30, 2015, 11:37am

Following some earlier chatter about Views which reference Tables which then change (particularly if the VIEW does a SELECT * - which is not something that every actually happens here) it got me thinking about sp_RefreshView (which we use when we rollout scripts that may have changed Tables) and SCHEMABINDING.

It raises the question why would I not use SCHEMABINDING on all views?

Obviously some hassle if I want to change a Table as I will have to drop all related views and recreate them. Not sure that is any harder than Drop/Recreate constraints, Triggers and so on?? which I have to do now anyway (well, not for a simple DDL change, but many of mine require a drop/re-create of the table)

As it happens we have all VIEW scripts in individual files. Our rollout process consists of assembling a script from all scripts with a newer modified-date than "last time". So if I just "touch" the date on a VIEW script it will be included in the next rollout - thus I just have to add a series of DROP VIEW statements to my script prior to the ALTER TABLE statement and any necessary Re-create Views will happen as part of my rollout process. (Sidenote: we use ALTER for pre-existing Views that are re-created, and I suppose being able to see a Create and a Modify date might have a benefit, but with this process the original view will be Dropped so my Create & Modify dates will be the same)

Upside is that I will never make a change to a Table which has a knock-on effect on a view without also considering which views might be effected and need changing.

That lead me to an interesting article on using SCHEMBABINDING in Functions that do simple calculations. I have quite a few of those which I use as shorthand to save me repeating a small snippet of code. For example a "fnSafeInt2String(MyInt)" which converts an INT to a VARCHAR and if the INT is NULL it returns a blank string, and thus can safely be concatenated without propagating any NULL values. Clearly no Tables or Column involved there, but when I check OBJECTPROPERTYEX() on the function I find that:
SYSTEMDATAACCESS = 1
USERDATAACCESS = 1
IsDeterministic = 0
IsPrecise = 0
IsSystemVerified = 0

and all those values are swapped if I use SCHEMBABINDING. Apparently the absence of SCHEMABINDING will cause the query optimizer to assume that the function may access data and it will add a spool operator ... so adding SCHEMBABINDING improves the query plan (I read that it accounts for a performance improvement of 22% but I haven't checked that)

There was also an additional benefit (which doesn't apply to me) where a Function is persisted in, say, a computed column / index.

So ... ... any reason not to use SCHEMABINDING on all, suitable, Functions and all Views (other than the hassle of Drop/Create all relevant views)?

References:
blogs.msdn.com/b/sqlprogrammability/archive/2006/05/12/596424.aspx
http://sqlstudies.com/2014/08/06/schemabinding-what-why/
http://sqlstudies.com/2014/09/15/i-schemabound-my-scalar-udf-and-you-wont-believe-what-happend-next/

robert_volk · July 31, 2015, 10:57am

From my purely anecdotal, grumpy DBA experience, I like the idea of using SCHEMABINDING as much as possible (although I haven't it used it much, typically only when required like indexed views, etc.)

I like that it forces some kind of discipline, you can't just through SELECT * in there, you need to specify schema and tables, etc. I've worked on too many fire-and-forget database designs, it's tiring. I didn't know that it could improve performance, so thanks for that! More ammo for when I make the suggestion. And if it prevents people from changing or dropping the table schema easily, GREAT! Let them think it through first.

I've used SCHEMABINDING in a presentation as an example to prevent accidental alter/drop, and it gets excellent feedback. (Seems like a lot of developers have no trouble dropping and creating tables all the time) I even demo'd an uber-view that includes every table with SCHEMABINDING simply to prevent them being dropped, and it does no actual data access.

Kristen · July 31, 2015, 11:46am

Thank's Rob, good points.

I'm loving that ... presumably you then DENY all permissions?

robert_volk · July 31, 2015, 12:16pm

Actually no, I hadn't thought of that.

The general form is:

CREATE VIEW NoDROP(A) WITH SCHEMABINDING AS
SELECT TOP 0 1 FROM dbo.Table1 WHERE 1=0 UNION ALL
SELECT TOP 0 1 FROM dbo.Table2 WHERE 1=0 UNION ALL
SELECT TOP 0 1 FROM dbo.Table3 WHERE 1=0 UNION ALL
SELECT TOP 0 1 FROM dbo.Table4 WHERE 1=0 UNION ALL ...

The TOP 0 and WHERE clauses are technically redundant but they guarantee you don't return any data, and since you're not SELECTing any actual columns you get a single constant scan regardless of number of tables or actual schema. So it's really fast at not returning any data! And you can still ALTER the table any way you like.

And for those pesky devs that have sysadmin access or enough permissions to DROP the view, just create thousands of them.

Kristen · July 31, 2015, 12:37pm

You wouldn't would you?
You did already?

robert_volk · July 31, 2015, 1:05pm

I haven't yet. I have tested thousands of constraints on a table to gauge their performance impact, with really promising results. These are CHECK constraints that don't access data and have low overhead:

http://weblogs.sqlteam.com/robv/archive/2011/12/13/t-sql-tuesday-025-ndash-check-constraint-tricks.aspx

Again, it's just to annoy folks that have sysadmin privileges I can't remove. I'm considering using some of them in my current job for an existing codebase, because our DEV and STAGE environments are using a wild-west coding "standard". I also just discovered a procedure in production that routinely drops and creates tables, I'd like to strangle it and the folks who wrote it.

Kristen · July 31, 2015, 1:36pm

Aggghhhhh!

Although ... it reminds me that I've been thinking about raising some Coding Standards thoughts for discussion. Such topics usually dissolve into all-out-war ... but I'm interested in debating not the "We do it this way BECAUSE WE DO" but whether Style-A or Style-B is better "defensive programming" and likely to lead to fewer bugs / more issues spotted before Code Review (or beyond!).

Our coding standards are very strongly adhered to, so no Wild West Standards here, I'm relieved to say! but we do have some "variances" to get around THIS and THAT which bother me ... but they are there with good reason.

I'll start a new topic ...

ScottPletcher · July 31, 2015, 4:35pm

well, not for a simple DDL change, but many of mine require a drop/re-create of the table

What table changes require a drop/re-create?

Kristen · July 31, 2015, 5:34pm

data type change

We like to group columns into logical groups - makes it easier for the Users to find them in their report writers, so we will deliberately insert a column at a point in the Column Order that helps that requirement. (Clearly not a requirement for the APP to work properly)

robert_volk · July 31, 2015, 8:35pm

Have you considered using schema transfer:

http://sqlperformance.com/2012/08/t-sql-queries/t-sql-tuesday-schema-switch-a-roo
http://sqlperformance.com/2013/04/t-sql-queries/schema-switch-a-roo-part-2

You can create your new table structure under a different schema, migrate data, then transfer. I started using this about 5 years ago and it's really, really handy. Tables stay online, no data is lost, and everything can be rolled back easily. You can keep the same names on constraints, and if you're careful you can still use triggers properly without dropping them, just make sure to use 1-part names only. (I mentioned this to Aaron at a conference but I don't think he updated the article).

Kristen · August 1, 2015, 10:00am

Its scary what I don't know ... including things that have not yet been invented!!

Interesting for rolling out DDL changes. On balance I suspect that our rollouts are so widespread & destructive, and usually huge, that I think the benefits (not locking out the users until quite a bit later in the rollout process) would be minor ...

Worht thinking about the Backout benefits though ... right now we would take a Full Backup at the outset and then just restore back to that. On a large database that's not a quick process though ...

That said, each individual DDL change to a table originates from what was a single-change in DEV and it might be that the Schema Change trick might be useful there. We could hang-on to the AlternativeSchema version after making the real change, which might be handy for a DIFF ... but fx:knocking-on-wood! the times that we have restored from backup after a DDL change went wrong are incredibly rare, so I'm not sure we'd make frequent use of an AlternativeSchema for DIFF.

That said, we do make multiple changes (when the first change was poorly thought through and subsequent program changes revealed additional DDL chagnes that were needed). We make each change in DEV and then the rollout using the same sequential steps the same sequence - so that could result in multiple, sequential, changes to a table. If the table was huge we might consolidate them to save time, during rollout, but generally we prefer "These, exact, steps worked fine in DEV so let's go with that on TEST and if that is OK them PRODUCTION too"

But with an AlternativeSchema it would be much easier to bring-back the original table and have-another-go, more intelligently, and that might well be better. It would certainly make it easier for us to consolidate multiple-DDL-updates into a single-DDL-update.

Secondly, we have loads of Data Integration, as we call it, where we pull "changes only" from other systems and then merge them into our database. Some of those are into destination tables that may also be written to within our APP (so I guess they have to happen in real-time, as before), but any Read Only tables would hugely benefit from the Schema Swap method, thanks for enlightening me to that.

ScottPletcher · August 3, 2015, 7:22pm

Your users should be seeing a view, not the actual table definition, if column grouping is important. In a major environment, it would be impossible to drop, recreate and reload tables just to add a column!