Differing results on similiar queries

I am attempting to 'compact' a bit of a TSQL query but getting massively differing results telling me that something is amiss.

The original query which gave OK results more in line with what I expected to see was:

DECLARE @start_date date
DECLARE @end_date date
SET @start_date =  '01-01-2022'
SET @end_date = '12-31-2022'

SELECT d.DESCRIPTION,
       sum(amount) as TOTAL
  FROM tbEXPENSES as e 
  INNER JOIN tbDESCRIPT as d ON e.DESCRIPT_ID=d.ID
 WHERE REC_DATE BETWEEN @start_date AND @end_date
AND e.DESCRIPT_ID IN ('81', '83', '86', '87', '88', '89', '90', '91', '92', '93',  '97', '98', '99', 
'100', '101', '107', '108', '109', '111',  '146', '161', '166', '176', '181', 
'182', '186', '187', '191', '197', '198', '199', '200', '201', '211', '216', 
'221', '246', '256', '271', '276', '280', '281', '286', '311', '317', '416', 
'435', '446', '451', '461', '466', '471', '476', '481', '491', '492', '496', 
'497', '501', '503', '506', '511', '516', '521', '523', '526', '527', '531', 
'532', '551', '556', '561', '562', '563', '571', '581', '586', '611', '621', 
'626', '631', '636', '641', '646', '655', '661', '666', '667', '681', '696', 
'706', '711', '730', '731', '732', '740', '741', '742', '743', '744', '745', 
'746', '747')
GROUP BY DESCRIPTION
ORDER BY DESCRIPTION

The second TSQL one I attempted gives crazy high numbers and that one is:

DECLARE @start_date date
DECLARE @end_date date
SET @start_date =  '01-01-2022'
SET @end_date = '12-31-2022'

SELECT d.DESCRIPTION,
       sum(amount) as TOTAL
  FROM tbEXPENSES as e 
  INNER JOIN tbDESCRIPT as d ON e.DESCRIPT_ID=d.ID
 WHERE REC_DATE BETWEEN @start_date AND @end_date
   AND e.DESCRIPT_ID BETWEEN '81' AND '221' 
    OR e.DESCRIPT_ID BETWEEN '461' AND '546'
    OR e.DESCRIPT_ID BETWEEN '266' AND '296'
    OR e.DESCRIPT_ID BETWEEN '551' AND '563'
    OR e.DESCRIPT_ID BETWEEN '676' AND '706'
    OR e.DESCRIPT_ID BETWEEN '740' AND '747'
    OR e.DESCRIPT_ID IN ('246', '256', '311', '317', '416', '435', '451', '571',  '581', 
                                       '586', '611', '621', '626', '631', '636', '641', '646', '655', 
                                       '661', '666', '667', '711', '730', '731', '732')
GROUP BY DESCRIPTION
ORDER BY DESCRIPTION

I would like to condense my SQL a bit but something is definitely not correct. Am I using the OR statement correctly? Can someone point me in the correct direction please?

The @start_date and @end_date are parameters I pass via the program. I have hard coded it here for ease of understanding.

Since DESCRIPT_ID is not numeric, any range you specify is based on char value, not a numeric value.

The best approach would be to put the DESCRIPT_ID values into a table (temp or permanent; i.e., not a table variable) -- uniquely clustered on that value, just in case it matters -- and do an INNER JOIN to that table.

hi

best thing is Scotts idea

another idea is 
**INSTEAD of inner join**  

..after putting DESCRIPT_ID  values as INT into temp table 

WHERE 
    cast( e.DESCRIPT_ID  as INT )  
     IN 
    ( SELECT DESCRIPT_ID  FROM #DESCRIPT_ID_TEMPTABLE)
another idea is 

cast( e.DESCRIPT_ID  as INT ) BETWEEN 81 and 221

The reason you are not getting similar results is because of the OR statements. If you add parenthesis it should result in similar results:

WHERE REC_DATE BETWEEN @start_date AND @end_date
   AND ( 
       e.DESCRIPT_ID BETWEEN '81' AND '221' 
    OR e.DESCRIPT_ID BETWEEN '461' AND '546'
    OR e.DESCRIPT_ID BETWEEN '266' AND '296'
    OR e.DESCRIPT_ID BETWEEN '551' AND '563'
    OR e.DESCRIPT_ID BETWEEN '676' AND '706'
    OR e.DESCRIPT_ID BETWEEN '740' AND '747'
    OR e.DESCRIPT_ID IN ('246', '256', '311', '317', '416', '435', '451', '571',  '581', 
                                       '586', '611', '621', '626', '631', '636', '641', '646', '655', 
                                       '661', '666', '667', '711', '730', '731', '732')
       )
GROUP BY DESCRIPTION
ORDER BY DESCRIPTION

I would validate that DESCRIPT_ID is in fact a string value - based on how that is used in the JOIN I would expect it to be a numeric value. If it is truly a numeric value then change your statement - either structure - to use numeric values. That will eliminate a hidden implicit conversion...

1 Like

Along with what Scott suggested (which is spot-on, IMHO), I believe another problem may be that the ranges in the ORs encompass more values that the original IN.

For example, you use the range of ...

BETWEEN '81' AND '221' 

... but there was, for example, no 82, 84, 85 values in the original IN. You need to check to see if the missing values in that range in the original IN actually DO have values.

effw8713 Your notation of the missed parenthesis has resolved the issue. Much obliged there. I never even considered that!

To explain and hopefully allow for better understanding of why I did the SQL as I did and to address the several ideas offered:

tbEXPENSE schema

ID	int	Unchecked
SECT_ID	int	Checked
DESCRIPT_ID	int	Checked
REC_DATE	datetime2(7)	Checked
AMOUNT	money	Checked
POSTEXPENSE	char(1)	Checked
POSTLOAD	char(1)	Checked
POSTPERSONAL	char(1)	Checked
LOADNUMBER	int	Checked
DATE_ENTERED	date	Checked

The tbEXPENSE table has both the SECT_ID and DESCRIPT_ID in Int format. and example of the tbEXPENSE raw return

ID	SECT_ID	DESCRIPT_ID	REC_DATE	AMOUNT	POSTEXPENSE	POSTLOAD	POSTPERSONAL	LOADNUMBER	DATE_ENTERED
239	1	11	2014-01-09 00:00:00.0000000	22.68	N	NULL	NULL	0	NULL
240	7	171	2014-01-09 00:00:00.0000000	907.00	N	NULL	NULL	0	NULL
241	7	106	2014-01-09 00:00:00.0000000	0.25	Y	NULL	NULL	0	NULL

The DB is set up to allow for gaps in the DESCRIPT_ID for expansion or inclusion into a specific SECT_ID group.

In other words The program has 20 different Sections which are assigned via a tbSECTIONS table.
(Example: Income, Savings Expense, Load Expenses, Truck Expenses etc.)

Each Description as found in the tbDESCRIPT table hinges upon input by the recorder where they chose what section then what 'Description' fits the input... so In other words using the above short posted tbEXPENSE raw data:

In the user portal they would chose Section: "TRUCK EXPENSES" and the DESCRIPTION: "Truck Payment" via separate drop downs as an example. This then sends over and saves to the tbEXPENSE table as
7 / 171 (Section 7 / Description 171).

During reporting the SQL (which is reflected above) is looking for ONLY the DESCRIPTION ID number (171) so there is no need, with this specific report, to reference or join to the tbSECTIONS table. I have many reports where I do the exact opposite and only have need of the tbSECTIONS table and include everything within a Section only as well as a mix of needing and joining both the tbSECTIONS and tbDESCRIPTIONS tables.

The DESCRIPTION column (included in the tbDESCRIPT table) extract below: is simply a translation of the numerically assigned request as described above and translates or prints that to an actual description of, in the case of "171", 'Truck Payment' and gives the total sum for the month/quarter/year as requested by the user.

ID	SECT_ID	DESCRIPTION	EXPENSENAME	LOADEXPENSE	PERSONALEXPENSE
68  166	7	Truck Maintenance (Include PM Cost)	NULL	REPAIR_COSTS	NULL
69  171	7	Truck Payment	TRUCK_PAYMENTS	NULL	NULL
70  176	7	Truck Registration (Plates)	TRUCK_REGISTRATION	NULL	NULL
71  181	7	Truck Supplies (Maps; T/S Guides; Gloves etc.)	NULL	TRUCK_SUPPLIES	NULL
72  182	7	DAT/Loadboard Fees	LOAD_BOARD_FEES	NULL	NULL
73  186	7	Truck Tires (Drives or Steers)	BREAKDOWN_TOTAL	REPAIR_COSTS	NULL
74  187	7	Truck Wash	TRUCK_WASHING	EQUIP_WASHING_COST	NULL
75  191	7	Qualcomm Fee	QC_RENTAL	NULL	NULL

Thus, the reason for the numeric gaps as I allowed a gap of 5-10 numbers (depending on the section) to allow for future expansion to any section. So in the referenced note above you note there is no 82, 84, 85 and that would be correct. Should I have need of entering something into the DB under the Truck Expenses Section, I would have the SECT_ID 7 / DESCRIPT_ID 82, 84, 85 'slots' I can still fill, Additionally there are some deliberate gaps especially in the higher Description numbers as the items assigned to those DESCRIPT_ID's would not be relevant to the specific report being generated.

I certainly hope this makes sense as it is hard to explain... but it works perfectly and is very quick.

DESCRIPT_ID is in fact an int column - make sure you change the code to remove the single-quotes around those values.