SQLTeam.com | Weblogs | Forums

Map INT Dates to SQL Dates


#1

I'm trying to pull some data from a 3rd party SQL Database into my tables. The dates are all INT and make no sense to me :frowning: FWIW the APP is based on Cobol. I was expecting "Days since Epoch" or something like that ... but it doesn't seem to map

I reckon that the Cobol datatype is an UNsigned INT and, as such, the dates run from 0 to 32,767 and then from -32,768 to -1 ... but even that isn't mapping for me.

I have a handful of INT numbers and their actual date counterparts - in case anyone can figure out the mapping please :slight_smile:

SELECT	[T_ID] = IDENTITY(int, 1, 1)
	, [T_DateNo] = T1_DateNo
	, [T_AbsDateNo]
	, [T_Date]
	, [T_DateDiff]=DATEDIFF(Day, T_Date, '19801119')
	, [T_DateNoDiff] = 65536 -T_AbsDateNo 
INTO	#TEMP
FROM
(
	SELECT	[T1_DateNo]=12345, [T1_Date]='yyyymmdd' WHERE 1=0
	UNION ALL SELECT  31091, '19801119'
	UNION ALL SELECT  31484, '19811128'
	UNION ALL SELECT  31635, '19820419'
	UNION ALL SELECT  31843, '19821103'
	UNION ALL SELECT  32086, '19830622'
	UNION ALL SELECT -32765, '19850403'
	UNION ALL SELECT -32744, '19850424'
	UNION ALL SELECT -32735, '19850501'
) AS T1
	CROSS APPLY
	(
		SELECT	[T_Date]=CONVERT(date, [T1_Date])
			, [T_AbsDateNo] = CASE WHEN T1_DateNo < 0 THEN 65536 + T1_DateNo
						ELSE T1_DateNo
						END
	) AS X
ORDER BY T_Date
--
SELECT	T0.T_DateNo
	, T0.T_Date
	, T0.T_AbsDateNo
	, [AbsDiff]=T0.T_AbsDateNo-T1.T_AbsDateNo
	, [DateDiff]=DATEDIFF(Day, T1.T_Date, T0.T_Date)
	, [Variance] = (T0.T_AbsDateNo-T1.T_AbsDateNo) - DATEDIFF(Day, T1.T_Date, T0.T_Date)
FROM	#TEMP AS T0
	LEFT OUTER JOIN #TEMP AS T1
		 ON T1.T_ID = T0.T_ID - 1
--
ORDER BY T0.T_ID

DROP TABLE #TEMP
T_DateNo    T_Date     T_AbsDateNo AbsDiff     DateDiff    Variance    
----------- ---------- ----------- ----------- ----------- ----------- 
31091       1980-11-19 31091       NULL        NULL        NULL
31484       1981-11-28 31484       393         374         19
31635       1982-04-19 31635       151         142         9
31843       1982-11-03 31843       208         198         10
32086       1983-06-22 32086       243         231         12
-32765      1985-04-03 32771       685         651         34
-32744      1985-04-24 32792       21          21          0
-32735      1985-05-01 32801       9           7           2

(8 row(s) affected)

I'll try to get some sample data that is more "decades" apart


#2

Just guessing here, but could it be that the dateno represents number of days from at given start date minus "holydays" from another table?
Regarding the "integer overflow", you could turn negative numbers to positive like this: (t1_dateno&0x8000)+(t1_dateno&0x7fff)


#3

I think that's unlikely as the particular APP is a 365/24/7 sort of thing, so won't be "avoiding" public holidays, as such.

I wondered if it was allowing 31 days for each month, or something like that. The two dates in 1985-04-NN have zero variance ...

I'll work on getting some more sample dates which might help narrow it down.

That's much more handy than my code, thanks.


#4

You were right

just with the twist of calculating 32 days in a month.

Try this:

with cobolstuff(t1_dateno,t1_date)
  as (          SELECT  31091, '19801119'
      UNION ALL SELECT  31484, '19811128'
      UNION ALL SELECT  31635, '19820419'
      UNION ALL SELECT  31843, '19821103'
      UNION ALL SELECT  32086, '19830622'
      UNION ALL SELECT -32765, '19850403'
      UNION ALL SELECT -32744, '19850424'
      UNION ALL SELECT -32735, '19850501'
     )
select ((t1_dateno&0x8000)+(t1_dateno&0x7fff)) as t1_dateno
      ,cast(t1_date as date) as t1_date
      ,dateadd(day
              ,((t1_dateno&0x8000)+(t1_dateno&0x7fff))%32-1
              ,dateadd(month
                      ,((t1_dateno&0x8000)+(t1_dateno&0x7fff))/32
                      ,cast('1899-12-01' as date)
                      )
              )
       as calc_date
  from cobolstuff
;

#5

Seriously? What would be the matter with having 31 days in the month? ...

... I'm finding some more data (wider date range) - I have to find official documents that match, and then tie-up the INT date with the official date ... not exactly a Speedy Cut&Paste Job !!


#6

On second thought, I was wrong - they did use 31 days for each month, but to calc back to the date, we have to divide by 32. Sorry for the mistake


#7

Hehehe ... and there I was worrying about why they had "wasted" a day for each month!

I still find it inconceivable they would have come up with such a concept ... can't do "Days between these dates" or anything like that. "Days since Epoch" would have been far more useful, but I also cannot understand why, when they moved from ISAM to SQL, they didn't convert the dates to SQL DataType DATE / DATETIME - no possibility of doing any useful Date reports in SQL - other than ORDER BY MyDateColumn - and then even that doesn't work when the dates bridge the positive/negative point


#8

Now run with a larger range of dates and your formula has worked flawlessly, many thanks for that :toast:


#9

Many years ago, I worked at a company where out developer/production environment was cobol on a HP3000. We used ISAM and as I recall, there was a kind of "sql layer" on top of the ISAM, so we were able to query using sql. But as "the old cobol guys" were used do the "start/fetch next" thingy, I was not allowed to incorporate sql queries in the programs. May it is the same you encountered, where their programs was not changed to do sql queries ?!?
Anyway - if they were concerned to optimize for size (as it was common "back in the day"), they still "wasted" a day, as the value range zero to 31 counts 32 days (I know I'm splitting hairs now).


#10

Yup, that's exactly it. Its pretty smart actually, the COBOL program talks to the ISAM thingie, and that is capable of saying "Right, that stuff is now in SQL" and getting it from whichever store is now "current", so the data can be ported over time. But here we are in 2017 and still 80% of their data is in ISAM ...

But its not a SQL database as we would know it. No clustered indexes, no PKeys, no FKeys, no NULLs ... along with "No DATE datatype :slight_smile: ... so it would be pretty hard to use SQL to query it effectively.

I've created a VIEW but the performance will be dreadful for general SQL queries - 50% of the 100+ columns are dates with your magic date formula :sunglasses:

	, [MyDateCol] = DATEADD(Day
		, ((NullIf(CONVERT(int, MyDateCol), 0)&0x8000)+(CONVERT(int, MyDateCol)&0x7fff))%32-1
		, DATEADD(Month
		, ((CONVERT(int, MyDateCol)&0x8000)+(CONVERT(int, MyDateCol)&0x7fff))/32
		, CONVERT(Date, '18991201')))

(Yeah, all their date columns are NUMERIC(5,0) instead of INT so can't Bitwise-AND them without CAST to INT ... and so it goes on!)

and then there are all the "text" columns that need a "blank to NULL" conversion:

	, [MyTextCol] = NullIf(MyTextCol, '')

Yup ... my VIEW going to run like a tortoise! But the intention is to import the data and merge it with our core data, so its only a short term performance problem.

just seems like a lost opportunity not to actually convert the data to SQL DataType whilst migrating it from ISAM to SQL ... Ho!Hum!


#11

Don't suppose anyone will ever stumble over a need to use this! but just in case they do:

The BITWISE Maths is very efficient ... but I have a slight worry that the data MIGHT have a modulo-32 INT value that maps to 31-Feb :frowning: Using the "Add Days and Months" BITWISE maths would just add "a few extra days" to the date, and I would wind up with 03-Mar (or 02-Mar in a leap year) rather than an error. I would prefer the error ...

I've done a check on a couple of the date columns and did not find any 31-XXX in 30-day months, or 29-31 Feb, so perhaps all is well, but I might change the DATEADD(Day, ... DATEADD(Month, ... to use DATEFROMPARTS() instead.