Wild stuff: Nines complement date format

March 5, 2010 by · 7 Comments
Filed under: English 

For German readers: Es gibt eine deutsche Version dieses Blogeintrags.

 
Photo from Picture Taker 2 at flickr.com

This is the first in what I hope to be a series of some wild stuff I have to deal with: Strange formats, encodings and data modeling decisions that I or others come across. 

This post is about a date format that we are calling Nine’s complement. I couldn’t find anything about this on wikipedia, and googling I could only find some mention in an obscure IBM documentation.

What we found

Here’s what we found: In one of SAP’s screens you have a ‘valid to’ date field. You can enter some date until the business information was valid (like ‘31.12.2006’). When we looked for the corresponding field in the database, the column was easy to find (called VALIDTO), but it contained values like ‘79938768’ that surely didn’t look like a date. Also the column was defined as a CHAR(8) field. CHAR(8) is not unusual in itself for a date column, storing the date in a format like ‘YYYYMMDD’ so you can sort a date by sorting a number. In that way, ‘31.12.2006’ would be stored as ‘20061231’.

It took some asking around to figure out what was going on with the strange values we were seeing in the database. What you do is you take the YYYYMMDD-format and subtract that from 99999999. Here is the whole story:

Date 31.12.2006
as YYYYMMDD 20061231
   
8 Nines: 99999999
minus Date-String 20061231
Date in Nine’s Complement 79938768

 

In order to convert the values found in the database back into a proper date, you can perform the same operations in reverse (i.e. subtract the value from ‘8 Nines’, and you get a date in the YYYYDDMM-format).

But why?

The “official” reason behind this format is that you reverse the sorting order of the dates, so that the most recent date comes first. This way, you can simulate a descending sort order in a database that does not support this concept.

As we were working with an SAP system and SAP has built an abstraction layer on top of the underlying databases, that may well be the case here. Some of the code dates back to the early 1980s, and databases were quite different back then. Sometimes I think that they even built there own database independent indexing mechanism, although these days most indexes defined in SAP correspond to indexes on the database level.

However, it is still not clear to me why defining a descending sort order in an index is important . I’ve asked a question on StackOverflow about why you would defined a descending order in an index definition, and the answers pointed to reasonably obscure situations that do not apply to the situation we were seeing. Unless there is stuff like clustered tables involved, the ordering of an index shouldn’t be important, if you’re doing an order, a binary search or traversing a b-tree, the order would not have a material effect on performance.

It would be quite interesting to evaluate the performance impacts of a “normal” date storage and a regular ascending index. I’m quite confident that there will not be much of a difference. So I can only assume that if there was a reason in the past to do this, the reasoning no longer applies. However, my guess is that someone optimized something that didn’t need to be optimized. Probably some “old COBOL mainframe programmer” from back in the days when all you could do was sequential access to a file …