Cleanup of output from select statement

I'm attempting to output a date format, however, the values in the column are not overly conducive to a date format of mm/dd/yy. Any suggestions on how to potentially approach this would be greatly appreciated.

Select max(ARP.InstallDate0) as 'Install Date'
From
Table

Sample results below. The first value shown below is unusable and should be converted to a null value. The same would apply if any other similar unusable values appeared. Not sure if this is feasible. The second, third, and fourth value formats will need to be converted to dd/mm/yy.

AAAA-01-JJ
Fri Apr 07 12:11:12 EDT 2017
9/9/2020
20210127

The problem is that you have a string - not a date. To get to a date you need to be able to convert the data to a valid date. Because you have such disparate data in the results - this is going to be almost impossible to accomplish.

To start - you need to identify all of the different formats available - then use a convert statement specific to that format, or - modify the string to an acceptable format and then convert.

For example - you can use try_cast(InstallDate0 As datetime) on the values 9/9/2020 and 202101127 and get a valid datetime, but there is nothing that will work for 'Fri Apr 07 12:11:12 EDT 2017' - which would need to be reconstructed as 'Apr 07 2017 12:11:12' to be cast as a datetime.

The first step is to identify all of the available formats - then figure out the code to convert that string to a usable format - then use cast or try_cast to cast as a datetime. Or - use convert with an appropriate style.

1 Like

This was a big help.

try_cast(MAX(ARP.InstallDate0) as date) as 'Install Date' resolved the following three formats.

AAAA-01-JJ
9/9/2020
20210127

The following is the only other format that will need to be addressed. However, try_cast converted it to a null value. I'm assuming I would need to convert the format to 'Apr 07 2017' prior to applying try_cast. I'm still new to this and would be very grateful if you could perhaps post a link to a page with sample code on how to do this.

Fri Apr 07 12:11:12 EDT 2017

I think I might have the answer. Currently testing the following.

case
when len(MAX(ARP.InstallDate0)) > 10
then try_cast(stuff(stuff(MAX(ARP.InstallDate0), 12, 13,''), 1,4,'') as date)
else try_cast(MAX(ARP.InstallDate0) as date)
end as 'Install Date'

How many rows with this "date" format? And is the pattern consistent ie
ddd Mmm Nn hh:mm:sec TZ yyyy

hi

please see if this helps :slight_smile:

The first thing I would do is:

Select Distinct ARP.InstallDate0
Into #tempDates
From ... ARP

I would then evaluate all of the different formats - and start categorizing them. For example - if you have dates like 'Thu Apr 06 11:13:12 PDT 2016' and similar dates but all with the same format - that becomes one category.

Dates as nn/nn/yyyy need further review - is that dd/mm/yyyy or mm/dd/yyyy and can you be sure they are the correct format?

Once you have identified the different available formats - you can the derive code to convert each type to a valid date string that can then be cast/converted.

Sample code:

Declare @dateString varchar(30) = 'Fri Apr 07 12:11:12 EDT 2017';

 Select *
      , stuff(s1.DateString, 7, 0, s1.DateStringYear)
      , try_cast(stuff(s1.DateString, 7, 0, s1.DateStringYear) As datetime)
   From (Values (@dateString)) As dt(DateString)
  Cross Apply (Values (substring(dt.DateString, 5, 16), right(dt.DateString, 5))) As s1(DateString, DateStringYear)

You also seem to be looking for the MAX value - which isn't going to work out how you expect. You need to convert the values to valid dates before getting the max value - simply because 'Fri' comes before 'Mon' but 'Mon Apr 10 13:13:13 PDT 2018' is later than 'Fri Apr 07 12:11:12 EDT 2017'.

The important part of the above code is this:

substring( {date string}, 5, 16)  --gets the first part of the date
right( {date string}, 5)  --gets the year with leading blank
stuff( {first part of date}, 7, 0, {year})  --builds valid date string for cast/convert

This only works for those date strings with that specific format - so a check for that format is needed so you apply the above code only to those strings that match.

Thank everyone for your assistance. After further research I discovered additional formats were being used and was able to account for them using all of your suggestions. The MAX command was needed to ensure I obtained a date as there were duplicate rows of data whereby one of the duplicate rows contained a date and the other duplicate row contained a null value.