Sql update a column on the same row

Gideon · August 19, 2016, 8:36am

I am trying to update a column based on the average (excluding zero values) of five columns on the same row.

I have the following

Update SQLTableBlokke
set blokgemiddeld = (SELECT AVG(tha_min4 + tha_min3 + tha_min2 + tha_min1 + tha_huidig) FROM SQLTableBlokke)

where tha_min4<>0 AND
tha_min4<>0 AND
tha_min3<>0 AND
tha_min2<>0 AND
tha_min1<>0 AND
tha_huidig<>0

This is not working since it uses the value of the total column ( not just the row value), thus giving the blokgemiddeld column for all the rows the same value.

Any help would be much appreciated.

Regards

Kristen · August 19, 2016, 8:58am

Not sure I understand how you can have an average of only one row.

Do you mean you want to add up all the non-zero columns, and divide by the number of (non-zero) columns, for each row?

Gideon · August 19, 2016, 10:16am

Hello Kristen. See the below image.

I want to average the five columns and update the sql table's last column (blokgemiddeld) with the value on each row (ignoring any 0 values in any of the five columns)

Regards

Kristen · August 19, 2016, 10:43am

Something like this

Update U
set blokgemiddeld = MyAverage
FROM SQLTableBlokke AS U
    CROSS APPLY
    (
        SELECT [MyAverage] = (tha_min4 + tha_min3 + tha_min2 + tha_min1 + tha_huidig)
                             / NullIf(
                                   CASE WHEN tha_huidig= 0 THEN 0 ELSE 1 END
                                   + CASE WHEN tha_min1 = 0 THEN 0 ELSE 1 END
... etc ...
                              , 0)
    ) AS X
WHERE blokgemiddeld <> MyAverage
      OR (blokgemiddeld IS NULL AND MyAverage IS NOT NULL)
      OR (blokgemiddeld IS NOT NULL AND MyAverage IS NULL)

If you post CREATE TABLE statement and some sample data with INSERT statements, rather than an image, then folk here can test their answers. I've no idea if what I have posted will work ... !!

ScottPletcher · August 19, 2016, 4:52pm

Update SQLTableBlokke
set blokgemiddeld = (SELECT (tha_min4 + tha_min3 + tha_min2 + tha_min1 + tha_huidig) / 5.00)
where tha_min4<>0 AND
tha_min3<>0 AND
tha_min2<>0 AND
tha_min1<>0 AND
tha_huidig<>0

Kristen · August 19, 2016, 7:03pm

I don't reckon yours satisfies the last row in the sample grid provided @ScottPletcher

ScottPletcher · August 19, 2016, 7:20pm

True enough. The example is definitely different than the initial description:

Update SQLTableBlokke
set blokgemiddeld = ((tha_min4 + tha_min3 + tha_min2 + 
        tha_min1 + tha_huidig) / 
    NULLIF(((ABS(sign(tha_min4))+ABS(sign(tha_min3))+ABS(sign(tha_min2))+
        ABS(sign(tha_min1))+ABS(sign(tha_huidig))) * 1.00),0)
    )

Kristen · August 20, 2016, 8:08am

All the Specs I get to build are the same!

Gideon · August 23, 2016, 6:17am

Thank you and apologies for my poor initial description.

Gideon · August 23, 2016, 6:17am

Thank you Scott and sorry for my poor initial description.

simon155 · August 23, 2016, 11:52am

UPDATE SQLTableBlokke
SET blokgemiddeld =
(tha_min1+tha_min2+tha_min3+tha_min4+tha_huidig) /
(IIF(tha_min4=0,0,1)+IIF(tha_min3=0,0,1)+IIF(tha_min2=0,0,1)+IIF(tha_min1=0,0,1)+IIF(tha_huidig=0,0,1))

Just a small abbreviation.

Kristen · August 23, 2016, 2:06pm

I doubt if it matters in this context, but my suspicion is that IIF() will be slower than other means of counting the number of non-zero values

Personally I would also like to avoid updating a row that already has the correct value (although SQL may optimise that out, and not perform any physical disk operation if there are no column changes to a row)

simon155 · August 23, 2016, 2:26pm

The query doesn't count non zero values. It's column based. Scroll up
If you were counting values within a single column, you might use COUNT(NULLIF(column,0))

If you're not sure about suspicions, SET STATISTICS IO ON before running, or run a trace if in doubt.

Yes - adding a WHERE clause in would prevent unnecessary updates, though the question implied the correct values had yet to be calculated. I think those were his expected answers, but it's immaterial really

simon155 · August 23, 2016, 2:29pm

I should also add that IIF is the same in performance as a CASE statement which you used yourself. It's just a shorter more convenient syntax, available from 2012 onwards.

Kristen · August 23, 2016, 2:48pm

Ermmm ... how would you describe it? You, me and Scott are all counting the number of non-zero columns as the means of calculating the average - of only the non-zero columns. Or I'm missing something - in which case happy to be enlightened.

Thanks, I do know about that, but I don't have the time, or inclination, to do that for the O/P, I just wanted to alert them to the possibility, if this code is running often / on lots of rows.

I don't read it like that, but you may well be right. My reading is that this cannot be a one-time-operation as otherwise the column would not be correctly populated in future. Of course, in future, it might be populated by a trigger or somesuch, and thus only effecting the changed rows, but even then the trigger might be subjected to many rows (even "all rows in the table") being updated, and therefore, again, I wanted to draw the O/P's attention to the possibly issue of performance. Personally, if I was building this code into a trigger I would avoid updating rows where there would be no change (e.g. some other column in the row might be being changed, which had no impact on the average but would still fire the trigger)

Correct, and I should have pointed that out. Scotts code is more elegant / efficient than mine, it followed mine, and I thought it was better. Yours came after that, so I just wanted to point out to the O/P that IIF was unlikely to be as efficient (I should have explicitly pointed out that I felt that Scotts code was more efficient that yours, and mine, rather than just suggesting that your code was inefficient, per se; my apologies, no offence was intended)