Indexed views, also known as materialized views, can offer significant performance benefits when properly implemented.
A co-worker asked me to take a look at a query he was running that was behaving in a bizarre way. The query seemed simple enough… A couple joins, and 2 columns in the result set. The problem was that the WHERE clause had a WHERE column_value = 1 in it, but it wasn’t working. To make it even more perplexing, the identical query was working on a copy of the database, on a different server. The copy had an additional days worth of data in it; otherwise the two databases were identical.
We went through the typical checks… data types of the columns, indexes, consistency of the tables between the two databases. We looked at the compatibility level of the databases. We looked at the service pack levels of the two engines. Everything matched up perfectly.Finally, we looked at the actual query plan that was used. Voila! The database that was not returning the expected result set was using a view, an indexed view, to get the data; the other database that was working correctly was using the base table.
So here’s the lesson… Indexed views may be substituted by the query optimizer if it is determined that the cost of the query will be less than using the base table. (Note: This only applies to Enterprise, Developer and Data Center editions of SQL Server).
In our particular case, the indexed view that was chosen by the optimizer filtered out the desired rows the query was intended to select. The optimizer chose this view, probably because there were way fewer rows in the view than the base table.
Unfortunately for my associate, the database he was working on uses indexed views heavily. I’ll even go as far as to say indiscriminately. An additional challenge is that the data is accessed from the database with third party tools that may not have the ability to force the query plan to use the base table. The good news is that the applications seldom access the data from the base tables directly, so the unintentional substitution of the incorrect index view for the base table may be the exception.
We identified 3 possible workarounds that will guarantee that the optimizer would not substitute the indexed view (would use the base table) even if the query plan cost is less:
1. Use the table hint WITH (EXPAND). Not a great option considering the third party reporting tools. It would also need to be applied to every SELECT statement (and possible UPDATE and DELETE too!)
2. Downgrade to SQL Server Standard edition. This was not very popular…
3. Set one of the SET OPTIONs that are necessary for the index view to be used by the optimizer, such as SET NUMERIC_ROUNDABOUT to ON, instead of OFF. The other potential candidates are ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL and QUOTED_IDENTIFIER
We tested #1 and #3 and they both forced the query optimizer to utilize the base table instead of the indexed view. Option #2 we’ll leave as an exercise for the reader.
I looked to see if there is an option to over-ride the auto substitution of the view for the base table when the view is created, ie., to emulate the view behavior as if the edition was STANDARD, but did not see anything.
SQL Server BoL describes the many benefits of using indexed views, including the benefit that the query optimizer will use an indexed view instead of the base table if the cost of the plan is the least. Unfortunately, in our case, the view didn’t contain the data we needed. I recall reading about horizontally partitioned indexed views, which at the time I interpreted as partitioned tables / views (partition function, partition scheme, etc), so I may need to go back and dig deeper in this area. By creating a view that is filtering on a column’s value, we have essentially partitioned the base table.
Another cost of indexed views is that the data is actually stored. By applying the clustered index we are creating a copy of the data. So if you have several indexed views on a single table, you essentially have several copies of the data, along with the statistics of the data and the indexes and all of the overhead that goes with it.
It only seems reasonable that new records inserted into the base table will also result in inserted rows in the indexed view(s) that are SCHEMA bound to the table. Likewise with DELETE and UPDATE operations… the cost of these operators will be multiplied by the number of indexed views.
And anyone who indiscriminately implements indexed views that call indexed views… layers of indexed views… should be publicly flogged. That’s just my opinion and is unlikely to be deployed.
CONCLUSION: The overuse of indexed views can be like winding a watch with a wrench. It seems like it’ll work great but something will eventually break. In a way it’s like the over use of triggers… the invisible stored procedures… They have their place in production systems, but they should be implemented in moderation and with a full understanding of their positive and negative impacts. Unintended side effects can have significant costs that totally negate the upside of their use in the first place.
Obviously, this is a big topic and this posting just raises some of the issues. I’m certain others have hit this issue and have additional methods to avoid some of the ill-intended side effects of improper/over use of indexed views.
No comments:
Post a Comment
Hi,
Thanks for your visit to this blog.
We would be happy with your Queries/Suggestions.