SQLTeam.com | Weblogs | Forums

A little help on indexes


#1

Hi.

declare @DateFrom datetime
declare @DateTo datetime
set @DateFrom = '20161130'
set @DateTo = '20161201'

declare @dindexFrom datetime
declare @dindexTo datetime
set @dindexFrom = @DateFrom - 30
set @dindexTo = @DateTo + 30

SELECT *
FROM tblTrans_Cash WITH (INDEX = indDateTime)
     CROSS JOIN tblCinema C
WHERE TransC_strType IN ('M')
  AND TransC_dtmRealTransTime BETWEEN @DateFrom AND @DateTo
  AND TransC_dtmDateTime BETWEEN @dindexFrom AND @dindexTo
ORDER BY 1

I have a table with a missing index on TransC_dtmRealTransTime. I cannot add it, because this is a 3rd-party database and I am not allowed to change it, so I am forcing an existing index on the query so that I at least get a non-clustered index seek on tblTrans_Cash followed by a RID lookup (you can see the pictures).
That way the query takes 2 seconds vs 18 seconds without the forced index.

Is there anything else I can try, knowing that I cannot create indexes on the table?
Note also that there is no clustered index on the table.
I can't seem to find a better solution by myself, and this is a simple, sargable query, as you can see.

thanks.

P.S. The predicate "TransC_dtmDateTime BETWEEN @dindexFrom AND @dindexTo" is not actually needed, but it is how I trigger the index; without it the index is not used, even when forced.
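For completeness, two other hint variations that are sometimes worth trying when a specific index cannot be created: FORCESEEK asks for a seek without naming an index, and OPTION (RECOMPILE) lets the optimizer see the values of the local variables (which are otherwise opaque to it at compile time). This is only a sketch against the table and column names above, not tested on this schema:

```sql
DECLARE @DateFrom datetime = '20161130', @DateTo datetime = '20161201';
DECLARE @dindexFrom datetime = @DateFrom - 30, @dindexTo datetime = @DateTo + 30;

SELECT *
FROM tblTrans_Cash WITH (FORCESEEK)   -- ask for a seek without naming the index
     CROSS JOIN tblCinema C
WHERE TransC_strType IN ('M')
  AND TransC_dtmRealTransTime BETWEEN @DateFrom AND @DateTo
  AND TransC_dtmDateTime BETWEEN @dindexFrom AND @dindexTo
ORDER BY 1
OPTION (RECOMPILE);   -- compile with the actual variable values visible
```

Both hints require SQL Server 2008 or later, and FORCESEEK will error out if no seekable index exists for the predicates, so treat this as something to try, not a guaranteed win.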


#2

Can you create a VIEW in another database (created specifically for the purpose, if necessary) referencing that table, and then create an INDEX on that? I can't remember if an indexed view across databases is possible ...

Or INSERT the data from that table into your own DB and then query it from there? It depends how real-time the results need to be, I expect ...
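The second idea, sketched in T-SQL. The ReportDB name and the trimmed column list are hypothetical; only the table and column names come from post #1:

```sql
-- One-off setup in a database where indexes ARE allowed:
CREATE TABLE ReportDB.dbo.tblTrans_Cash_Copy (
    TransC_strType          char(1)  NOT NULL,
    TransC_dtmRealTransTime datetime NOT NULL,
    TransC_dtmDateTime      datetime NOT NULL
    -- ... any other columns the reports need
);
CREATE CLUSTERED INDEX ix_RealTransTime
    ON ReportDB.dbo.tblTrans_Cash_Copy (TransC_dtmRealTransTime);

-- Refresh step (scheduled, or run just before reporting):
TRUNCATE TABLE ReportDB.dbo.tblTrans_Cash_Copy;
INSERT INTO ReportDB.dbo.tblTrans_Cash_Copy
       (TransC_strType, TransC_dtmRealTransTime, TransC_dtmDateTime)
SELECT TransC_strType, TransC_dtmRealTransTime, TransC_dtmDateTime
FROM   SourceDB.dbo.tblTrans_Cash;
```

Copying only the columns the reports need keeps the refresh cheap, and the clustered index on TransC_dtmRealTransTime gives exactly the seek the source table is missing.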


#3

Hi.
Unfortunately I can't do that, because the view requires some changes to the database options (I can't remember 100%, it was in another topic I posted here), for example something like OPTION STRICT being set differently (not really, that is just an example to give a general idea), and the database would have to be altered again to support the indexing.


#4

Another approach is to have a separate reporting database just containing the data you need. You can then add whatever indexes/stats you want.

The reporting DB could be refreshed, say, overnight or if you need near real time data you could look at transactional replication etc.


#5

Hi.
The reporting database has been suggested before, but the problem here is that many reports we issue, including this one, are on the fly, with new data.
We already have a database that is refreshed overnight from all sources, but we can't query it for transaction data that needs to be shown now and contain new data.

There is also another issue I see with the example above.
When the data retrieved is relatively small, from a narrow date range (say 20170101 - 20170113), I get a huge time difference.
When many rows are retrieved over a wider date range (say 20170101 - 20170313), I get almost the same time with or without the forced index.
Usually the data is needed on a per-month basis, where the hint makes a difference, but with more data I don't see any real difference.

thanks.


#6

Don't suppose that your source database has a ChangeDateTime column on the tables, such that you can copy across JUST the changed data? If so, it might be OK to copy over the new data regularly (we do that every 10 minutes and it is quick enough that it does not disrupt the servers during the working day).

Really REALLY longshot! :

Create partitioned views in your REPORTING database (referencing the tables in the Source Database)

Not sure that is going to work unless there is already a suitable date-based index (i.e. appropriate for the partitions in the views) on the source table. Also, no idea if you can have partitioned views from one DB to another? It might come back to the same "options" problem you mentioned earlier.
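To make the partitioned-view idea concrete, it would look something like this. The monthly source tables are purely hypothetical (the thread later confirms the source is not split up this way), and whether the cross-database three-part names behave well here is exactly the unverified part:

```sql
-- In the reporting database; for partition elimination each source table
-- would need a CHECK constraint on the partitioning (date) column.
CREATE VIEW dbo.vwTrans_Cash
AS
SELECT * FROM SourceDB.dbo.tblTrans_Cash_201701
UNION ALL
SELECT * FROM SourceDB.dbo.tblTrans_Cash_201702
UNION ALL
SELECT * FROM SourceDB.dbo.tblTrans_Cash_201703;
```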


#7

We do not have partitioned views. Another issue is that the data is copied from databases on different SQL Server versions, so we have 2005, 2008 and 2015 databases that get copied.

Actually we do have a timestamp column that we use to load the daily import database, but we can't use it against the live databases (or at least I'm not good at it), as there have been many occurrences of transactions being lost when we try to meddle with the database during live hours.
This happens because the database is not our own, and the general advice from the 3rd party is not to query live data unless it is done by their software (which does it with temp sessions, somehow), so there is a real danger of corruption here :frowning:


#8

OK, noted.

Sounds like your SQL reports on the live data are also going to fall foul of that "transaction" issue?

My thought was to create some new partitioned views in a separate, "reporting", DB.

Any chance that you could only report on "Enquiry Data" - which I presume is static "as of last night" ?


#9

Transactional replication will not interfere with the source DB transactions, as it works by reading the log.
It does have overhead, so your server will need enough capacity to cope.
It is also a pain to set up and monitor, but you might get some long-term benefits.

https://technet.microsoft.com/en-us/library/ms151706(v=sql.105).aspx


#10

Hi.
Yes, we are taking the risk of running our own queries (if you remember, in many examples I use READ UNCOMMITTED, so we may lose some data on live runs), but they are small scale, not big daily imports, and if we can avoid running something on live, we tend to.
I'm not sure about the term "enquiry data". If it is data taken from last night's backup, then I don't think I can use it.
Or is it something else?
Thanks.


#11

Ifor, is transactional replication suitable for data that is created now?
If so, I can show that to our DB admin.
thanks.


#12

Transactional Replication is near real time. If it is working well it is normally only a few seconds behind the source DB.

To implement Transactional Replication it is worth spending time up front on planning. You should only bring over the data you actually need etc. It also needs monitoring in case it stops or otherwise goes wrong.
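To illustrate "only bring over the data you actually need": a transactional publication can be limited to specific articles, and each article can be row-filtered. A rough sketch using the standard replication procedures (the publication name and filter are hypothetical, and the distributor is assumed to be configured already):

```sql
-- On the publisher database:
EXEC sp_addpublication
     @publication = N'CashTransPub',
     @repl_freq   = N'continuous';          -- transactional, near real time

EXEC sp_addarticle
     @publication   = N'CashTransPub',
     @article       = N'tblTrans_Cash',
     @source_object = N'tblTrans_Cash',
     @filter_clause = N'TransC_strType = ''M''';  -- replicate only report rows
```

In practice the wizard in Management Studio generates these calls for you; the point is that the article list and the filter are where you control the volume.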


#13

Yep.
I'm reading it right now.
I can't be sure if we can use it, as it needs time and effort and it's our DB admin's responsibility, so I can't say that we will use it, but I will mark it down.
Thanks.


#14

In addition to losing some data, you will also get some data twice (when an index page is split and your query has just processed that page, and then your query ALSO processes the new split page).

I can't see how that is useful; the data is frequently going to be "incomplete", and if you run the query again immediately you will get a different answer, because data is being added.

If the vendor recommends that you don't query the live database, I would imagine that is because it is not transactionally atomic, so you are likely to get "bits of data", I expect? For example, "some of the invoice details", because not all of it has been committed yet.


#15

Actually the issue is that we get a lot of deadlocks when querying the data.
We might be OK with losing or getting extra data on SOME reports, but we can't afford the deadlocks, as we may lose money on incomplete orders.

The complete invoices for the accounting department, which are also very important, are issued to them the next day through the overnight backup.
So the most important data is preserved, but the deadlocks can and have caused unexpected problems.

Thanks.


#16

We avoid that by using Read Committed Snapshot Isolation on our OLTP databases, but the system was designed and tested, from the ground up, to be compatible with RCSI, so it's not something you could just "turn on" safely without testing. It doesn't sound like your APP developers are using it, which is either a pity, or they have some technical reason why not.

Given those sorts of issues I would only want to be querying a "known transactionally complete" copy database, but if you are happy with the level of inaccuracy you get on your financial reports, I guess it is OK.

I'm used to people saying to me "Inaccuracy is fine provided it is not on anything important" ... similarly to "Loss of any data is fine, provided it isn't important data" ...
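For reference, RCSI itself is a single database-level setting; the testing described above is the hard part, not the switch. A sketch, with a hypothetical database name:

```sql
-- Needs exclusive access; ROLLBACK IMMEDIATE kicks out open transactions.
ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Verify the setting took effect:
SELECT name, is_read_committed_snapshot_on
FROM   sys.databases
WHERE  name = N'YourDb';
```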


#17

Sapator,

The more you reveal about the problems with your system, the more I am convinced there are some fundamental problems with it; you seem to have to fight very hard just to get some basic reports. While ideas like RCSI, Transactional Replication etc. might be part of the solution, it may be possible to avoid them if you know the underlying problems. I would start by downloading the following free script and running it out of hours:

https://www.brentozar.com/blitz/

I would also run the following scripts for latency and waits:

Once you have a list of the current problems and the results of these scripts you should then be in a position to decide how to proceed. You might have to work with your supplier and/or external DB consultants to get this system working well.

Good luck.


#18

Thanks everyone.
Even the 3rd-party developers are not hitting the DB directly.
They do everything in a temp session. It's not something that they reveal, but I have run the profiler a lot on the databases and that's what I can see.
For that part, the apps are working fine, but when we start querying data directly on their DBs we get into a lot of issues, as you may have suspected.

I could probably have our admin run the tools, but the fact remains that I cannot change any 3rd-party SPs or use my own indexes without consulting with them, and that would take a lot of time.

What I am trying to do is, at least, run any parametric reports or queries that hit the main databases with as few problems as possible.

I appreciate all the help from you and again, thanks a bunch.


#19

If it was me I would stop querying the database direct and put my efforts into maintaining a copy-database that was sufficiently "fresh" for reporting against.

You said (IIRC) that you were using a NOLOCK equivalent and were happy with some inaccuracy, so it seems reasonable to me that data that is a few minutes old, but 100% accurate, would be just as "accurate" in practical terms.


#20

They are probably caching data in the middle tier. I would investigate if there is some sort of published interface to use this data for reporting. If there is, it will probably use web services.

Like Kristen, I suspect you should not be querying the production DB directly; even NOLOCK holds some locks:
https://blogs.sentryone.com/aaronbertrand/bad-habits-nolock-everywhere/

I think someone needs to stand back and take a strategic view. This may involve discussing requirements and options with your suppliers.