SQLTeam.com | Weblogs | Forums

Missing month

Sql Server 2008

I think sample data is self explainatory.

declare @t table ( code varchar(6), yearmonth bigint)
insert @t values ('220012',201901)
, ('220012',201902)
, ('220012',201903)
, ('220012',201904)
, ('220012',201905)
, ('220012',201907)
, ('220012',201908)
, ('220012',201909)
, ('220012',201910)
, ('220012',201912)
, ('220012',202001)
, ('220012',202002)
, ('220012',202003)
, ('220012',202005)
, ('220012',202006)

The first (min) month in data is 201901
The last (max) month in data is 202006
Show all months missing between 201901 and 202006

Expected Result
code yearmonth
222012 201906
222012 201911
222012 202004

please see what i tried

obviously Data can break my SQL very easily ..

its just a rough patch NOT the complete solution

; with cte as 
(
	select 
				*
			,	abs(yearmonth - lead(yearmonth) over(order by yearmonth)) as ok  
    from 
	    @t
) 
select 
    yearmonth +1 
from 
   cte 
where 
    ok =2

@harishgg1
working in 2012 as expected.
require in 2008 as mentioned.

Thanks

for 2008

; with cte_12 as 
(
SELECT   N=number
FROM     master..spt_values
WHERE    type = 'P'   AND number between 1 and 12 
) , cte_main as 
(
select cast(cast(a.year as varchar)+ RIGHT('00' + cast(cte_12.N as varchar) , 2) as bigint)ok  
from   cte_12 cross join ( select distinct yearmonth/100 as year from @t )  a 
)
select * from cte_main 
where  ok not in ( select yearmonth from @t) 
and ok <= (select max(yearmonth) from @t)

@hareshgg1
I tried many version of data and it worked.
I had to add
and ok >= (select min(yearmonth) from @t)

Thanks

@harishgg1
It is my fault that I provided data for one code.

Please help for multiple code
If I add , ('220013',202003)
, ('220013',202004)
, ('220013',202006)
then how to modify the query
Expected result would be

(No column name) ok
220012 201906
220012 201911
220012 202004
220013 202005

hi

hope this helps ..

; with cte_12 as 
(
SELECT   N=number
FROM     master..spt_values
WHERE    type = 'P'   AND number between 1 and 12 
) , cte_main as 
(
select b.code,c.ok 
from ( select distinct code from @t ) b ,
(
select cast(cast(a.year as varchar)+ RIGHT('00' + cast(cte_12.N as varchar) , 2) as bigint) ok  
from   cte_12 cross join ( select distinct yearmonth/100 as year from @t )   a
) c
),cte_min_max as 
(
select code,yearmonth/100 as year,min(yearmonth)  as minm, max(yearmonth) as maxm  from @t
group by code , yearmonth/100
) , cte_actual_main as 
(
select a.* from cte_main a join cte_min_max b on a.code = b.code and a.ok between b.minm and b.maxm
) 
select * from cte_actual_main
except 
select * from @t
1 Like
;WITH
cte_tally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
),
cte_tally1000 AS (
    SELECT 0 AS number UNION ALL
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number 
    FROM cte_tally10 c1 CROSS JOIN cte_tally10 c2 CROSS JOIN cte_tally10 c3
),
cte_code_date_ranges AS (
    SELECT code, 
        CAST(CAST(MIN(yearmonth) * 100 + 01 AS varchar(8)) AS date) AS date_min, 
        CAST(CAST(MAX(yearmonth) * 100 + 01 AS varchar(8)) AS date) AS date_max 
    FROM @t
    GROUP BY code
)
SELECT cdr.code, CONVERT(varchar(6), DATEADD(MONTH, t.number, cdr.date_min), 112) AS yearmonth
FROM cte_code_date_ranges cdr
INNER JOIN cte_tally1000 t ON t.number BETWEEN 0 AND DATEDIFF(MONTH, cdr.date_min, cdr.date_max)
WHERE NOT EXISTS(SELECT 1 FROM @t WHERE code = cdr.code AND 
    yearmonth = CONVERT(varchar(6), DATEADD(MONTH, t.number, cdr.date_min), 112))
ORDER BY cdr.code, yearmonth
1 Like

Sorry, missed a part of the requirements and so took the post down. I'll be back.

Ok.. sorry for the delay...

@Mateen,

Scott is definitely on the right track. To make things even easier, I strongly recommend that you add the proverbial "Swiss Army Knife" of SQL to your database(s). It will also keep "Tally Overrun" accidents from happening.

You can find it at the article at the following link.

https://www.sqlservercentral.com/scripts/create-a-tally-function-fntally

If your DBA complains about functions, explain that it's a very high performance iTVF (inline Table Valued Function) that works as if the code for the function were inline like any of the other code.

After that, things of this nature become nearly trivial to solve.

   WITH 
cteDateLimits AS
(--==== Find the min/max DATEs (which are easier to work with) for each code.
 SELECT  code
        ,MinYYYYMM = MIN(CONVERT(DATE,CONVERT(CHAR(8),yearmonth*100+1),112))
        ,MaxYYYYMM = MAX(CONVERT(DATE,CONVERT(CHAR(8),yearmonth*100+1),112))
   FROM @T
  GROUP BY code
)
,cteDateRange AS
(--==== "DRY" out the formulas while calculating all yearmonth values for each code.
 SELECT  code
        ,yearmonth = CONVERT(INT,CONVERT(CHAR(6),DATEADD(mm,t.N,MinYYYYMM),112))
   FROM cteDateLimits
  CROSS APPLY dbo.fnTally(0,DATEDIFF(mm,MinYYYYMM,MaxYYYYMM)) t
)--==== Find the skipped dates for each code.
 SELECT  code
        ,OK = yearmonth 
   FROM cteDateRange drng
  WHERE NOT EXISTS (SELECT * FROM @t t WHERE t.code = drng.code AND t.yearmonth = drng.yearmonth)
  ORDER BY code, yearmonth
;

I also recommend that the first column in your table should actually be a CHAR(6) instead of a VARCHAR(6) because the latter takes an extra 2 bytes per row and that you should never store temporal data as INTs. In this case, use the DATE datatype and don't format them as YYYYMM until it's display/report time. Certainly there is no need to store a 6 digit INT as a BIGINT.

Oldest rule in the books... keep the data layer separate from the presentation layer. In most cases, it will make small miracles much easier to perform and usually a good bit faster because you'll only suffer one CONVERT.

1 Like

@harishgg1

Checked in actual data with 900 different codes having 1 to 5 years data.

It worked and did show missing month to resolve

Thanks and regards

Hi @ScottPletcher

Checked in actual data with 900 different codes having 1 to 5 years data.

It worked and did show missing month to resolve

Thanks and regards

@JeffModen
Thanks for all your valuable advice.

I have already the tally function and your famous "Tally OH" as well with a lot of usage
in my codes.

Your code worked in actual data.

Thanks and regards

@JeffModen

CROSS APPLY dbo.fnTally(0,DATEDIFF(mm,MinYYYYMM,MaxYYYYMM)) t

Removing not required numbers from tally
or
Building required numbers from tally

was really fun and fantastic for me.

Love you!

@ScottPletcher, @JeffModen

By the way I have a table "salary_dates" with all yearmonths (no missing)

select yearmonth from salary_dates order by 1

200504
200505
200506
200507
200508
200509
200510
200511
200512
200601
200602
200603
200604
200605
200606
200607
200608
200609
200610
200611
200612
200701
200702
200703
200704
200705
200706
200707
200708
200709
200710
200711
200712
200801
200802
200803
200804
200805
200806
200807
200808
200809
200810
200811
200812
200901
200902
200903
200904
200905
200906
200907
200908
200909
200910
200911
200912
201001
201002
201003
201004
201005
201006
201007
201008
201009
201010
201011
201012
201101
201102
201103
201104
201105
201106
201107
201108
201109
201110
201111
201112
201201
201202
201203
201204
201205
201206
201207
201208
201209
201210
201211
201212
201301
201302
201303
201304
201305
201306
201307
201308
201309
201310
201311
201312
201401
201402
201403
201404
201405
201406
201407
201408
201409
201410
201411
201412
201501
201502
201503
201504
201505
201506
201507
201508
201509
201510
201511
201512
201601
201602
201603
201604
201605
201606
201607
201608
201609
201610
201611
201612
201701
201702
201703
201704
201705
201706
201707
201708
201709
201710
201711
201712
201801
201802
201803
201804
201805
201806
201807
201808
201809
201810
201811
201812
201901
201902
201903
201904
201905
201906
201907
201908
201909
201910
201911
201912
202001
202002
202003
202004
202005
202006
202007
202008
202009
202010
202011
202012
202101

@JeffModen

if salary_dates is tally table
and am_aug_closing is data table
then I simulate your code and was successful

WITH cteDateLimits AS
( select card_code
, minmt = min(yearmonth)
, maxmt = max(yearmonth)
from am_aug_closing g
group by g.card_code
)
,cteDateRange AS
 (SELECT  card_code
        , sd.yearmonth 
   FROM cteDateLimits
   CROSS APPLY salary_dates sd
)
SELECT  drng.card_code, dl.minmt, dl.maxmt, drng.yearmonth
FROM cteDateRange drng
join cteDateLimits dl on (drng.card_code = dl.card_code 
                             and drng.yearmonth between dl.minmt and dl.maxmt)
where not exists ( select 1
                   from am_aug_closing g3
				   where g3.card_code = drng.card_code
				   and   g3.yearmonth = drng.yearmonth)

But though both brings result within one second
, your code is fast
Mine takes a longer pause within the one second.

You were right!!!

@JeffModen

Your code!!!!

ALTER procedure [dbo].[chk_missing_months_am_aug_closing] as

  WITH 
cteDateLimits AS
(--==== Find the min/max DATEs (which are easier to work with) for each card_code.
 SELECT  card_code
        ,MinYYYYMM = MIN(CONVERT(DATE,CONVERT(CHAR(8),yearmonth*100+1),112))
        ,MaxYYYYMM = MAX(CONVERT(DATE,CONVERT(CHAR(8),yearmonth*100+1),112))
   FROM am_aug_closing
  GROUP BY card_code
)
,cteDateRange AS
(--==== "DRY" out the formulas while calculating all yearmonth values for each card_code.
 SELECT  card_code
        ,yearmonth = CONVERT(INT,CONVERT(CHAR(6),DATEADD(mm,t.N,MinYYYYMM),112))
   FROM cteDateLimits
  CROSS APPLY dbo.fnTally(0,DATEDIFF(mm,MinYYYYMM,MaxYYYYMM)) t
)--==== Find the skipped dates for each card_code.
 SELECT  card_code
        ,OK = yearmonth 
   FROM cteDateRange drng
  WHERE NOT EXISTS (SELECT * FROM am_aug_closing t WHERE t.card_code = drng.card_code AND t.yearmonth = drng.yearmonth)
  ORDER BY card_code, yearmonth

@Mateen ... Thank you for the feedback.

The problem with the code that you have using the salary_dates table is in the CROSS APPLY... you have no limits in that CROSS APPLY and so you're actually ending up with a Cartesian Product. That's kind of what I meant by an "Accidental Tally Overrun" except yours would be classified as an "Accidental Date Overrun".

If you would post the CREATE TABLE statement (including any and all indexes, please), we'd have a better chance at helping you actually use that table (which none of us knew existed until now) for this problem.

Its ok JeffModen

Since the use of tally is giving faster result I would go with that.

A tally ITVF, a tally building within the query or using actual table.
The best part is that I did learn how to code such situation.

Thanks to you and all.

@Mateen ,

Works for me. Glad to have been of service. The part you said about learning is a great reflection on you. Well done.