This month I wish to share an interesting voyage of discovery to do with TIMESTAMP calculations.
How Interesting …
It all started decades ago, when I wanted to find out how many microseconds there were between two timestamps. A simple idea for a trace in various programs which, when analyzed, could highlight problematic code paths.
Use DISPLAY Stupid!
Just DISPLAY the current timestamp at the start of each SECTION. The trace analysis program would then simply subtract two timestamps – et Voila! You have a difference!
Nope … That Fails
Well, I found out very quickly that math on TIMESTAMPs does *not* work like that! What you actually get is a “duration” which, in my humble opinion, is worthless! Here’s an example:
SELECT TIMESTAMP('2023-12-21-15.06.40.120234') - TIMESTAMP('2023-12-21-15.06.40.120034') FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+-------- ---------+---------+---------+---------+-------- .000200
Looks good doesn’t it? Exactly 200 microseconds – as it should be.
Now look at this example:
SELECT TIMESTAMP('2023-12-21-15.06.41.120234') - TIMESTAMP('2023-10-22-17.08.55.130234') FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+-------- ---------+---------+---------+---------+-------- 129215745.990000
What?
Yep, what the calculation does is really “subtract”… What you get is a decimal 14 containing the YYYY (leading zeroes blanked of course!) years, MM months, DD days HHMMSS then a decimal point and then your usual six digits for microseconds.
Not Good!
This, for me, was not actually usable! So, what I did, was then extract all the fields multiplying by the different values and adding them all together to get a SECONDS field. This gave me what I wanted but was a bit messy.
Months?
Whoever dreamed up the western calendar obviously was not a programmer! 31, 30, 28 sometimes 29 days in a month but then not always… The days in month has been, and always will be, a real PITA. However, IBM Db2 has a nice Built-in Function (BiF) called DAYS (not DAY – That just extracts the day portion of the calendar!)
DAYS ( expression ) – The result is 1 more than the number of days from January 1, 0001 to D, where D is the date that would occur if the DATE function were applied to the argument.
All in UTC and this is *very* handy as it bypasses the days-in-the-month problem completely!
Armed with this it made the COBOL code simpler and better.
TIMESTAMPDIFF to the Rescue!
Then in Db2 V8 came a new BiF – TIMESTAMPDIFF – will it be our savior?
When it was announced I thought: Wow! This is great – it will do all the work for me – just tell it what you want as output and give it two timestamps and you are done!
TIMESTAMPDIFF( numeric-expression, string-expression)
Not Really …
The problem is in the first sentence of the docu that most people do not bother to read:
The TIMESTAMPDIFF function returns an estimated number of intervals of the type that is defined by the first argument, based on the difference between two timestamps.
Seconds is Good?
Naturally, I used 2 (to get seconds) as the numeric-expression and, to begin with, was a happy bunny with the results.
Guestimate?
Then I noticed that all was not well in the world of TIMSTAMPDIFF and that the “estimated” number can be difficult to judge! The problem gets clearer when you review the “assumptions” list at the end of the docu:
The following assumptions are used in estimating a difference:
- One year has 365 days
- One year has 52 weeks
- One year has 12 months
- One month has 30 days
- One day has 24 hours
- One hour has 60 minutes
- One minute has 60 seconds
Now we all know that this is not true… Not all years have 365 days or even 52 weeks and nearly all months do not have 30 days! Now in my trace program it was just a bit irritating, but if you are using TIMESTAMPDIFF believing you really get the number of DAYS between two dates then it is time to think again!
Seeing is Believing
Here’s an example showing where it first works fine and then one day earlier and it all goes astray:
SELECT TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') - TIMESTAMP('2023-10-22-00.00.01'))) AS TSDIFF , DAYS( TIMESTAMP('2023-12-21-00.00.01')) - DAYS( TIMESTAMP('2023-10-22-00.00.01')) AS DAYS FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+---------+---------+--------- TSDIFF DAYS ---------+---------+---------+---------+---------+---------+--------- 5184000 60 DSNE610I NUMBER OF ROWS DISPLAYED IS 1 SELECT TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') - TIMESTAMP('2023-10-21-00.00.01'))) AS TSDIFF , DAYS( TIMESTAMP('2023-12-21-00.00.01')) - DAYS( TIMESTAMP('2023-10-21-00.00.01')) AS DAYS FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+---------+---------+--------- TSDIFF DAYS ---------+---------+---------+---------+---------+---------+--------- 5184000 61 DSNE610I NUMBER OF ROWS DISPLAYED IS 1
As you can easily see, the day has changed by one but the TSDIFF has not…Not good! The problem is naturally caused by going over some internal threshold. If you do not care about accuracy, it is fine.
I Do!
So, what I have done is write my own “timestampdiff” in SQL:
SELECT MIDNIGHT_SECONDS(TIMESTAMP('2023-12-21-00.00.01')) - MIDNIGHT_SECONDS(TIMESTAMP('2023-10-22-00.00.01')) + ( 86400 * (DAYS(TIMESTAMP('2023-12-21-00.00.01')) - DAYS(TIMESTAMP('2023-10-22-00.00.01')))) AS DIFF ,TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') - TIMESTAMP('2023-10-22-00.00.01'))) AS TSDIFF , DAYS( TIMESTAMP('2023-12-21-00.00.01')) - DAYS( TIMESTAMP('2023-10-22-00.00.01')) AS DAYS FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+---------+---------+--------- DIFF TSDIFF DAYS ---------+---------+---------+---------+---------+---------+--------- 5184000 5184000 60 DSNE610I NUMBER OF ROWS DISPLAYED IS 1 DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100 ---------+---------+---------+---------+---------+---------+--------- SELECT MIDNIGHT_SECONDS(TIMESTAMP('2023-12-21-00.00.01')) - MIDNIGHT_SECONDS(TIMESTAMP('2023-10-21-00.00.01')) + ( 86400 * (DAYS(TIMESTAMP('2023-12-21-00.00.01')) - DAYS(TIMESTAMP('2023-10-21-00.00.01')))) AS DIFF ,TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') - TIMESTAMP('2023-10-21-00.00.01'))) AS TSDIFF , DAYS( TIMESTAMP('2023-12-21-00.00.01')) - DAYS( TIMESTAMP('2023-10-21-00.00.01')) AS DAYS FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+---------+---------+--------- DIFF TSDIFF DAYS ---------+---------+---------+---------+---------+---------+--------- 5270400 5184000 61 DSNE610I NUMBER OF ROWS DISPLAYED IS 1 DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100 ---------+---------+---------+---------+---------+---------+--------- SELECT MIDNIGHT_SECONDS(TIMESTAMP('2023-12-21-00.00.01')) - MIDNIGHT_SECONDS(TIMESTAMP('2023-10-20-00.00.01')) + ( 86400 * (DAYS(TIMESTAMP('2023-12-21-00.00.01')) - DAYS(TIMESTAMP('2023-10-20-00.00.01')))) AS DIFF ,TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') - TIMESTAMP('2023-10-20-00.00.01'))) AS TSDIFF , DAYS( TIMESTAMP('2023-12-21-00.00.01')) - DAYS( TIMESTAMP('2023-10-20-00.00.01')) AS DAYS FROM SYSIBM.SYSDUMMY1 ; ---------+---------+---------+---------+---------+---------+ DIFF TSDIFF DAYS ---------+---------+---------+---------+---------+---------+ 5356800 5270400 62 DSNE610I NUMBER OF ROWS DISPLAYED IS 1
It all looks a lot better and actually returns the correct number of seconds between two timestamps!
How Does it Work?
The key part is just some math:
MIDNIGHT_SECONDS(TIMESTAMP('from ts')) - MIDNIGHT_SECONDS(TIMESTAMP('to ts')) + ( 86400 * (DAYS(TIMESTAMP('from ts')) - DAYS(TIMESTAMP('to ts')))) AS DIFF
I use the MIDNIGHT_SECONDS BiF (First appeared in DB2 V6.1) that returns the number of seconds from midnight up to the given timestamp. I then simply subtract the “to ts” value from the “from ts” value and then use the DAYS BiF again subtracting from each other to get the days difference multiplying by 86400 as seconds/day. There is no need to subtract an extra day as, if that was required, it is automatically handled by the MIDNIGHT_SECONDS subtraction. Then these two results are simply added together.
Code Review Time?
I have re-reviewed all my code to make sure that any use of TIMESTAMPDIFF can accept “approximate” answers and in all other cases replaced it with the above code.
I might even open an Aha! Idea about getting this as a BiF – it is not that complex and is “better” than the best guess system we have now. In fact, I have – DB24ZOS-I-1599 – please vote if you think it would be great to get this as a simple BiF!
Db2 Does Not Stand Alone Here!
We are also not alone in this problem! Oracle, MySQL and JAVA all have the same “approximation” routines. I did a Google search for a web-based timestamp calculator and it made exactly the same mistakes. I guess that there is a “fast algorithm” out there that does the best guess quickly…but sadly not accurately!
This problem is right up there with Daylight Saving Time and the grief that causes! See my earlier blog.
TTFN,
Roy Boxwell