Running Total very slow

Ah, it looks like you don't want the current entry in the running total, so change it to ">" rather than ">=".
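For illustration only, the correlated-subquery shape being discussed might look something like this (the table and column names are assumed from later posts in this thread):

SELECT g.[Customer No_], g.[Entry No_],
       ( SELECT SUM(g2.Amount)
         FROM #gl_entries g2
         WHERE g2.[Customer No_] = g.[Customer No_]
           AND g.[Entry No_] >= g2.[Entry No_] -- ">=" includes the current entry; ">" excludes it
       ) AS RunningTotal
FROM #gl_entries g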

Thanks for all your help. I guess the last thing is why the running total is not summing correctly. I verified the insert into the temp table and the entries look correct: there are several negative entries and several positive entries as far as amounts go, but looking at the running total they all show negative. I can't figure it out. The amount of $124,782.63 (positive, which means the customer has a balance) is the verified correct 'high balance'. Something is going on with the way this is summing in the subquery. Hmmmm.

So I am trying to work through this. I verified that the records in the temp table are the same as what I queried from the regular table; the problem is how they are being added to achieve the running total MAX. I have messed with the join, added a primary key to the temp table, and also changed around the index. Nothing. I am still having issues. It is just very strange to me that even when I use the same script that I know works but takes way too long (the reason I initially posted), it seems to run so much faster because of the temp-table-then-insert approach... but it is adding incorrectly. Is this inherent in the way this is done?

No, it isn't, and that's what's so frustrating here. It should work. If you don't need the date, you can drop that out of the table. I used a unique key in the table; I just don't make it a "primary key" unless I absolutely have to, because then you can't have a NULL value in any of the key columns.

IF OBJECT_ID('tempdb..#gl_entries') IS NOT NULL
    DROP TABLE #gl_entries

CREATE TABLE #gl_entries (
    [Customer No_] varchar(10) NULL,
    [Entry No_] int NULL,
    Amount money NULL,
    -- unique clustered key in running-total order; this is what the totaling query seeks on
    UNIQUE CLUSTERED ( [Customer No_], [Entry No_] )
)

The speed is because there's an index that perfectly matches the total. If you could put some actual rows and expected totals in the thread, that would help a lot. Edit: wait, I see a few there; when I can, I'll test with those, although another customer's data wouldn't hurt :-).

@osupratt1,

I'll admit that I've not done a deep dive on all of the responses so far, but I have noticed that they all use some form of "Triangular Join", which is really half of a full CROSS JOIN (Cartesian Product/Square Join), or at least it seems that way from my quick scan. Those can get pretty slow pretty quickly.
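To make the shape concrete, here is a hypothetical triangular-join running total (the table and column names are assumed from the temp table posted earlier). For a customer with N entries, each row re-aggregates every earlier row, so the work grows on the order of N*N/2 internal row reads:

SELECT a.[Customer No_], a.[Entry No_],
       SUM(b.Amount) AS RunningTotal
FROM #gl_entries a
JOIN #gl_entries b
  ON  b.[Customer No_] = a.[Customer No_]
  AND b.[Entry No_] <= a.[Entry No_] -- each row joins to itself and all prior rows
GROUP BY a.[Customer No_], a.[Entry No_]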

I have a method that will do a running total (or grouped running total or both) on a million rows in seconds but I need more information.

  1. What is the actual order of the rows that you want the running total to be calculated by? For example, do you want the running total based on Posting Date using Entry No as the tie breaker for entries that occur on the same date?
  2. It would appear from your original post that you're looking for the "high water mark" for each Customer No for the given year. Is that correct? In other words, you want the running total to work like a checkbook, but for each Customer, correct? And the first entry for each customer would be the first balance for that customer, correct?
  3. As a point of curiosity, how many total rows of data are we talking about? And how many unique Customer No's are we talking about?

For best performance, refer to this URL:

http://sqlperformance.com/2012/07/t-sql-queries/running-totals

Thanks, guys, for your help. What Scott has given has worked. For some reason I had thought that the Entry No_ was an NVARCHAR data type, but I was in error; it was an INT, like Scott thought in the beginning. I am now capturing everything out of the temp table correctly. We do not have licensing or access rights to change our underlying accounting system, Navision. Ideally I would like to just add indexes to these tables and then test, but I might have to do it the temp table way. Let me say thanks to everyone for their help, Scott especially.

Oddly enough, that link dismisses the most powerful method as an unsupported anomaly. Too bad there is so little understanding there.

Again, just curious, how many rows are you talking about and how long is it now taking?

I didn't think the quirky update was needed here, particularly given its level of "black box"-ness. You just sort of have to trust that it works -- a lot of people, even some well-known ones, just aren't willing to do that.

Like I said... lots of misunderstanding and FUD. You don't have to trust that it works. There's an easy way to have it check for errors to let you know whether it worked or not.

Try the solution used in the post below to calculate a running total in SQL Server:
Running Total in SQL Server

So can you index when running a CTE? Can this be used in a view, since temp tables can't? It seems a bit slower, but it could solve my problem, as my DBA states that he needs a view to place in an SSIS package.

@osupratt1,

Please answer my previous question. How many total rows are we talking about?

We have 8 companies in my division. Each company has from 100 to 1,000 customers. The average quantity of records per customer is about 10k. So company 1, with 50 customers at 10k records per customer, would be 500k records grouped by customer number, where we would see 50 records showing the MAX running total (high balance) for the current year. Multiplied by 8.

The CTE solution does not calculate the correct running total; the verified totals do not match. I only want the entries for the current year. If I filter for one particular customer, it shows the same number of records, and the 'amount' (the amount that is getting totaled) is the same. It is not calculating correctly, so I'm not sure if there is a >= entry number comparison that needs to go somewhere?

I am coming to the party late, so I don't know the details of the CTE solution that you referred to in the above post, and I couldn't see one in the previous posts. If you filter for one customer and you are still getting the same results, that usually indicates something wrong with the joins. If you are able to post a test case (one that can be copied and run in an SSMS window), that would help.

If you are concerned about performance in running totals, there are 3 options that I can think of:

a) If you are on SQL 2012 or later, use windowing functions (i.e., SUM() OVER(); see the sketch after this list).

b) On an earlier version, use the quirky update. It is an undocumented feature, but as far as I know it has never been shown to give wrong results if you follow the rules. See @JeffModen's article here to learn what you need to be aware of (scroll down to the rules section).

c) If you are on SQL Server 2005 or later, use CLR aggregates. Here is an article that describes it.
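For option (a), a minimal sketch against the temp table posted earlier in the thread (the partition and order columns are assumptions based on the discussion); the outer query then pulls the "high balance" that the original post was after:

SELECT [Customer No_],
       MAX(RunningTotal) AS HighBalance
FROM (
    SELECT [Customer No_],
           SUM(Amount) OVER ( PARTITION BY [Customer No_]
                              ORDER BY [Entry No_]
                              ROWS UNBOUNDED PRECEDING ) AS RunningTotal
    FROM #gl_entries
) t
GROUP BY [Customer No_]

Note that ROWS UNBOUNDED PRECEDING (rather than the default RANGE frame) gives checkbook-style row-by-row semantics and is also the faster frame on SQL 2012 and later.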

One other question, @osupratt1. As I was reading back over previous posts, I saw that I had suggested that what you are trying to do is not a "running total" in the conventional sense that people understand it.

Let us say you had the data shown below:

Date    Amount
Jan 1   10
Jan 2    5
Jan 3   -2
Jan 4    7

Running total in the conventional sense is the following:

Date    Amount  Running Total
Jan 1   10      10
Jan 2    5      15
Jan 3   -2      13
Jan 4    7      20

Is that what you are trying to compute?

Although I have a grand appreciation (love it) for the "Quirky Update" method that you used, the way you did it is dangerous because it doesn't guarantee the processing order in any way. There's nothing to check for errors; if parallelism occurs, you're screwed, and if someone adds a unique index that's not in the same order as your clustered index, you'll be making wrong answers all over the place.

There are a fair number of rules that you have to follow to keep from shooting yourself or your project in the head when using the "Quirky Update" (the 3-part UPDATE based on the underlying "pseudo-cursor"). You must use TABLOCK (TABLOCKX is better), you must use MAXDOP 1, and, doing it the way you did it, you must use an INDEX(1) hint (which also means you need the correct clustered index, which you don't have).

It would also be extremely smart to build in a "safety counter" to ensure that the rows are being processed in the correct order, or to raise a noticeable error if they are not.
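As a sketch only (not anyone's published code), a quirky update with those hints plus a simple order check might look something like this. It assumes a RunningTotal money column has been added to #gl_entries, and the divide-by-zero is just one way to force a hard error if rows ever arrive out of clustered-index order:

DECLARE @RunningTotal money       = 0,
        @PrevCustomer varchar(10) = '',
        @PrevEntryNo  int         = -2147483648;

UPDATE t
SET @RunningTotal = RunningTotal =
        CASE
            -- safety check: rows must arrive in clustered-index order
            WHEN [Customer No_] = @PrevCustomer AND [Entry No_] <= @PrevEntryNo
                THEN 1/0 -- forces an error instead of a silently wrong total
            WHEN [Customer No_] = @PrevCustomer
                THEN @RunningTotal + Amount
            ELSE Amount -- first row of a new customer restarts the total
        END,
    @PrevEntryNo  = [Entry No_],
    @PrevCustomer = [Customer No_]
FROM #gl_entries t WITH (TABLOCKX, INDEX(1))
OPTION (MAXDOP 1);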

OK... with that thought in mind, how long is it taking now?