Hello,
What would be the best approach when using entities with values valid over different datetime intervals?
Like VAT percentages, for example:
RateId  ValidFrom   ValidTo     Value
1       2010-01-01  2016-12-31  10%
1       2017-01-01  2018-03-31  15%
1       2018-04-01  2019-12-31  12%
2       2018-06-01  2019-12-31  6%
Should I put the "Between ValidFrom And ValidTo" in the Join clause or in Where?
Like: SELECT * FROM Products p INNER JOIN VAT v ON v.RateId = p.RateId AND workdate BETWEEN v.ValidFrom AND v.ValidTo
or: SELECT * FROM Products p INNER JOIN VAT v ON v.RateId = p.RateId WHERE workdate BETWEEN v.ValidFrom AND v.ValidTo
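FWIW, for an INNER join the two placements are logically equivalent; it's only with an OUTER join that moving the predicate between ON and WHERE changes the result. A minimal sketch of the equivalence, using SQLite via Python's sqlite3 rather than SQL Server (the ProductId column and the sample rows are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE Products (ProductId INTEGER, RateId INTEGER)")
c.execute("CREATE TABLE VAT (RateId INTEGER, ValidFrom TEXT, ValidTo TEXT, Value TEXT)")
c.executemany("INSERT INTO Products VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
c.executemany("INSERT INTO VAT VALUES (?, ?, ?, ?)", [
    (1, "2010-01-01", "2016-12-31", "10%"),
    (1, "2017-01-01", "2018-03-31", "15%"),
    (1, "2018-04-01", "2019-12-31", "12%"),
    (2, "2018-06-01", "2019-12-31", "6%"),
])
workdate = "2018-06-15"

# Interval predicate in the ON clause
in_join = c.execute(
    "SELECT p.ProductId, v.Value FROM Products p "
    "INNER JOIN VAT v ON v.RateId = p.RateId "
    "AND ? BETWEEN v.ValidFrom AND v.ValidTo "
    "ORDER BY p.ProductId", (workdate,)).fetchall()

# Interval predicate in the WHERE clause
in_where = c.execute(
    "SELECT p.ProductId, v.Value FROM Products p "
    "INNER JOIN VAT v ON v.RateId = p.RateId "
    "WHERE ? BETWEEN v.ValidFrom AND v.ValidTo "
    "ORDER BY p.ProductId", (workdate,)).fetchall()

print(in_join == in_where)  # True: identical results for an INNER join
```

With a LEFT JOIN the two would differ: the ON version keeps products with no matching rate (NULL rate columns), while the WHERE version filters those rows out.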
Do you think a composite index on RateId,ValidFrom would help?
(not for this example case but for other interval-based tables)
Thank you,
Indeed, it's exactly my case, but I don't understand what the conclusion of that post is.
However, I think both versions are fine; most likely SQL Server is smart enough to treat them the same.
Regarding the second part of my question, do you think a composite index on Id + ValidFrom would improve the join? (I have a table growing at a rate of 500,000 records/month joined with a 25,000-row interval table.)
ok @harishgg1 ... what's up with that? You just pointed someone you suspect of being a beginner at a post that uses a different RDBMS altogether. It's laced with code proprietary to that RDBMS and has no chance of working in T-SQL, and even if it did, it's one of the worst examples in the world when it comes to both table structure and example data.
And it's not difficult even for a beginner to test whether an index helps. Run the existing code before adding the index and measure the duration, etc., add the index, and then run the same code again. It just doesn't get simpler than that to find out if an index will work (short of checking whether the code is SARGable or not).
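That before-and-after check can also be scripted. A minimal sketch using SQLite via Python's sqlite3 (not T-SQL; the Intervals table and index name are invented for illustration), inspecting the plan rather than timing it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE Intervals (Id INTEGER, ValidFrom TEXT, ValidTo TEXT)")
c.executemany("INSERT INTO Intervals VALUES (?, ?, ?)",
              [(i % 100, "2018-01-01", "2019-12-31") for i in range(10_000)])

query = "SELECT COUNT(*) FROM Intervals WHERE Id = ? AND ? BETWEEN ValidFrom AND ValidTo"

def plan():
    # EXPLAIN QUERY PLAN reports how SQLite would access the table
    rows = c.execute("EXPLAIN QUERY PLAN " + query, (42, "2018-06-15"))
    return " ".join(r[3] for r in rows)

before = plan()   # full-table SCAN: no usable index yet
c.execute("CREATE INDEX ix_id_from ON Intervals (Id, ValidFrom)")
after = plan()    # SEARCH using the new composite index
print(before)
print(after)
```

In SQL Server the analogous check would be SET STATISTICS TIME,IO ON (or comparing actual execution plans) before and after creating the index.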
I also modified the code from that article to work in T-SQL. The code that took 100 SECONDS in PostgreSQL took only 3 ms duration and 0 ms CPU to execute in T-SQL. The supposedly improved code (which is actually a CTE) took 6 ms duration and 15 ms CPU to execute.
SQL is NOT SQL! You can try using examples from other SQL dialects, but don't just post them as an answer, because they're frequently going to be wrong for T-SQL.
Yeah, sorry about the recommendation to simply do a test (although that's what I'd have done). The composite index you proposed could be ok, or reversing the columns in the index might work better. It all depends on which column has the highest cardinality (number of distinct values). The column with the highest cardinality should usually go first. I say "usually" because the optimizer in SQL Server has a "brain" of its own, and what we think will work best based on all that is holy in SQL sometimes will not.
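One concrete way the column order matters, beyond cardinality, is which predicates can actually seek on the index: with the equality column (RateId) first, the index can seek on RateId and then range-scan ValidFrom, while with ValidFrom first only the leading range is seekable. A sketch, again using SQLite via Python's sqlite3 with made-up names rather than T-SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE VAT (RateId INTEGER, ValidFrom TEXT, ValidTo TEXT, Value TEXT)")
c.executemany("INSERT INTO VAT VALUES (?, ?, ?, ?)",
              [(r, "2018-01-01", "2019-12-31", "10%") for r in range(1000)])

query = "SELECT Value FROM VAT WHERE RateId = ? AND ? BETWEEN ValidFrom AND ValidTo"

def plan(params):
    # EXPLAIN QUERY PLAN shows which index and key columns would be used
    return " ".join(row[3] for row in c.execute("EXPLAIN QUERY PLAN " + query, params))

c.execute("CREATE INDEX ix_rate_from ON VAT (RateId, ValidFrom)")
p1 = plan((500, "2018-06-15"))
c.execute("DROP INDEX ix_rate_from")
c.execute("CREATE INDEX ix_from_rate ON VAT (ValidFrom, RateId)")
p2 = plan((500, "2018-06-15"))
print(p1)  # seek on the RateId equality, then a range on ValidFrom
print(p2)  # at best a range seek on ValidFrom only, possibly a plain scan
```

The same B-tree reasoning applies to SQL Server's composite indexes, though, as Jeff says, the optimizer can still surprise you, so test.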
I'm one that sincerely appreciates humility, but that's starting to sound like you've never tried to run the code in the article. Maybe you never even read it?
Thank you Jeff,
For the sake of example I over-simplified the issues I deal with.
... and, yes, I did some tests before posting this question.
However, the composite index had no effect whatsoever.
The gain comes from the link Harish found, about avoiding joins in queries. I had a join with a five-record table which I managed to get rid of, and the difference was huge. I'm still not clear about the usage of composite indexes...
I guess I'd have to see your actual query and know more about the tables and indexes involved.
And I have to tell you that I tested the methods in the post that Harish cited. The original query that supposedly took 100 seconds ran faster than the query that supposedly fixed the problem; that second query also ran nasty fast, but slower than the first query that supposedly had the problem.
I do agree that the "Divide'n'Conquer" method in the second query does sometimes make a difference but, as with all else in SQL Server, "It Depends". Again, the only way to tell would be to know more about your query and the tables. One way to do a good part of that would be to save the actual execution plan and attach it.
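For what it's worth, here's a rough sketch of the "Divide'n'Conquer" idea using SQLite via Python's sqlite3 (names invented for illustration; in T-SQL you'd use a #temp table instead of a TEMP table): resolve the rates in effect into a small intermediate table first, then do the simple join against that, and confirm the result matches the one-shot join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE Products (ProductId INTEGER, RateId INTEGER)")
c.execute("CREATE TABLE VAT (RateId INTEGER, ValidFrom TEXT, ValidTo TEXT, Value TEXT)")
c.executemany("INSERT INTO Products VALUES (?, ?)", [(1, 1), (2, 2), (3, 1)])
c.executemany("INSERT INTO VAT VALUES (?, ?, ?, ?)", [
    (1, "2017-01-01", "2018-03-31", "15%"),
    (1, "2018-04-01", "2019-12-31", "12%"),
    (2, "2018-06-01", "2019-12-31", "6%"),
])
workdate = "2018-06-15"

# One-shot join against the interval table
direct = c.execute(
    "SELECT p.ProductId, v.Value FROM Products p "
    "INNER JOIN VAT v ON v.RateId = p.RateId "
    "WHERE ? BETWEEN v.ValidFrom AND v.ValidTo "
    "ORDER BY p.ProductId", (workdate,)).fetchall()

# Divide'n'Conquer: isolate the rates in effect first, then the simple join
c.execute("CREATE TEMP TABLE CurrentRate (RateId INTEGER PRIMARY KEY, Value TEXT)")
c.execute("INSERT INTO CurrentRate "
          "SELECT RateId, Value FROM VAT WHERE ? BETWEEN ValidFrom AND ValidTo",
          (workdate,))
staged = c.execute(
    "SELECT p.ProductId, r.Value FROM Products p "
    "INNER JOIN CurrentRate r ON r.RateId = p.RateId "
    "ORDER BY p.ProductId").fetchall()

print(direct == staged)  # True: same answer, computed in two smaller steps
```

Whether the staged version is actually faster is exactly the "It Depends" part: it pays off when the intermediate result is small and the one-shot plan is poor, which is why the actual execution plan is the thing to look at.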