Monday, August 26, 2013

DATEDIFF Mistakes Case Study 2: Foxhound


Previously on . . . Documenting DATEDIFF described how the peculiar workings of SQL Anywhere's DATEDIFF() function can lead to coding errors, and DATEDIFF Mistakes Case Study: This Blog was a bug hunt for those errors in some sample code.

Now the bug hunt continues, this time in production code for an application that depends heavily on DATEDIFF(): the Foxhound performance monitor.
An early design decision is reflected in the code throughout Foxhound: All time intervals are calculated in milliseconds, even long intervals measured in seconds, hours or even days. As a result, even though DATEDIFF ( MILLISECOND, x, y ) is often called with TIMESTAMP values for x and y, and even though that can result in an error of (almost) one whole millisecond in the value returned by DATEDIFF, it doesn't matter: Foxhound only displays intervals to the nearest 0.1 second, so an error of 0.001 second doesn't affect the result.

Here's an example; the internal Foxhound function rroad_f_msecs_as_d_h_m_s() takes a BIGINT value in milliseconds and formats it to return a string like '1h 16m 56s' or '9.9s'
rroad_f_msecs_as_d_h_m_s ( DATEDIFF ( MILLISECOND, active_alert.recorded_at, @current_timestamp ) ),
Only small values (less than one hour) are shown to the nearest tenth of a second, like '9.9s', and large values are only shown to the nearest second like '1h 16m 56s', so DATEDIFF errors of one millisecond or less don't matter.

Foxhound is awash in DATEDIFF MILLISECOND calls, and they all share the same characteristic: The errors don't matter because the code doesn't care about milliseconds. Foxhound isn't a financial application that cares about the pennies, it isn't even an execution profiler that cares about how long a single statement takes to execute; it is a performance monitor that gathers samples every 10 seconds.


Are there any DATEDIFF something-other-than-MILLISECOND calls in Foxhound?

Yes, there are a few. This one checks a user input datetime for validity:
WHEN DATEDIFF ( DAY, @FOXHOUND3UPGRADE_timestamp, CURRENT TIMESTAMP ) > 100000 THEN
A @FOXHOUND3UPGRADE_timestamp value more than 100,000 days in the future is ignored because it is "way too big". Since DATEDIFF DAY counts the number of day boundaries between the two timestamps, the return value could be wrong by (almost) one entire day, which means a value only 99,999 days in the future could be incorrectly rejected.

Yeah, that's a bug... one that's not going to be fixed, or even documented in the Foxhound FAQ, but a bug nonetheless... the Foxhound Development Team promises to do better!


Here's another example; DATEDIFF DAY is called calculate how many days are left before the current rental period expires:
IF  @edition_name = 'Rental'
AND @expiry_date < '9999-12-31' THEN

   SET @rental_period_will_end_in_days = DATEDIFF ( DAY, CURRENT DATE, DATEADD ( DAY, 1, @expiry_date ) );

ELSE

   SET @rental_period_will_end_in_days = 9223372036854775807; -- never ends; i.e., it's not a rental

END IF;
In this case, the second and third arguments to DATEDIFF are only precise to the nearest DAY, so counting the number of day boundaries is the same as counting the number of days, and DATEDIFF DAY returns the right answer.

What's the score?

Here's how DATEDIFF usage is scored according to the previous Case Study:
  • FAIL means the DATEDIFF usage is badly flawed; it shouldn't have been coded that way, and it should be fixed.

  • LUCKY means the DATEDIFF usage may be flawed but it doesn't matter given the data values involved.

  • OK means the DATEDIFF usage is OK given the data types involved.
Foxhound gets a LUCKY for all those DATEDIFF MILLISECOND calls where the errors don't matter (the first example above), a FAIL for the DATEDIFF DAY bug that rejects 99,999 as being "way too big", and an OK for the DATEDIFF DAY calculation involving the expiry date.

This bug hunt didn't turn up anything worth changing... like all bug hunts it did find some stuff in need of fixing, just not DATEDIFF :)

No comments: