Duplicate Contacts – Where do they come from? (Or: McFly, you’re a slacker)

March 29, 2012
Filed under: DataQuality, iOS, SmarterContacts 

When dealing with data quality issues such as duplicates, the question people usually focus on is “how do I get rid of the duplicate records?” While this is important, it does not remove the cause of the problem and usually leads to ongoing or recurring cleaning efforts. Therefore, if you really want to resolve a data quality issue, you have to ask the question “where do the duplicates come from?”

As with almost all data quality issues, there are easy answers to this question:

“McFly, you’re a slacker”
(image from http://images4.wikia.nocookie.net)

After the teacher who needles Marty, I’m calling this the “Strickland Explanation”.

Or, a bit less harsh:

“I don’t have time!”
(image from http://images.forum-auto.com/)

While these explanations may be true in some cases, they are not very helpful: they insult the people who enter the data, making them a lot less willing to help you resolve the problem. They also show that you haven’t thought about the problem or discussed the issue with the users – there are always other explanations that take a bit of effort to unearth.

Here’s a list of issues that I have found to cause duplicates in an address book:

  1. Technical Limitations
    Older versions of address books were limited in the number of fields you had available. One typical issue is that you could only have one email address or phone number. If you wanted to store multiple phone numbers for a person (e.g. the home number, the work number and the cell phone), you had to create multiple records with the same name but different details. This is especially true for address book programs from older phones.
  2. Synching Gone Wrong
    Synching is a surprisingly difficult problem, and almost everyone has their own horror stories of synchs that have gone wrong. Typical issues are an extra copy of each record, information showing up in unexpected places (for example a zip code being stored in a phone number field) or some information being lost during synch (one of my pet peeves is the birthdate).
  3. Information Hoarders
    Some email programs had the option of storing the email address of each person that sent you an email in your address book, resulting in a large number of “sparse” records (records that may consist of only an email address or just a phone number, but not a proper name). Also, the large number of resulting records makes it hard to figure out if multiple records belong to the same contact or if you already have a record for a person.
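
To make the three causes above a bit more concrete, here is a minimal Python sketch of how duplicate candidates and “sparse” records could be spotted in an exported contact list. It is only an illustration and not how SmarterContacts works: the dictionary layout, the function names (`normalize_name`, `is_sparse`, `find_duplicate_candidates`) and the sample data are all assumptions made for this example, and matching on the name alone will miss many real-world duplicates.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase and collapse whitespace so 'Marty  McFly' matches 'marty mcfly'."""
    return " ".join((name or "").lower().split())


def is_sparse(record):
    """Treat a record as 'sparse' if it has no usable name (hypothetical rule)."""
    return not normalize_name(record.get("name"))


def find_duplicate_candidates(records):
    """Group named records by normalized name; keep groups with 2+ records."""
    groups = defaultdict(list)
    for record in records:
        if not is_sparse(record):
            groups[normalize_name(record["name"])].append(record)
    return {name: recs for name, recs in groups.items() if len(recs) > 1}


# Made-up sample data: one person split across two records (one detail each),
# one sparse record harvested from an email, and one clean record.
contacts = [
    {"name": "Marty McFly", "phone": "555-0100"},
    {"name": "marty  mcfly", "email": "marty@example.com"},
    {"name": "", "email": "doc@example.com"},
    {"name": "Jennifer Parker", "phone": "555-0101"},
]

duplicates = find_duplicate_candidates(contacts)
sparse = [record for record in contacts if is_sparse(record)]

for name, recs in duplicates.items():
    print(f"{name}: {len(recs)} records")
print(f"sparse records: {len(sparse)}")
```

Sparse records are reported separately because without a name they cannot be grouped this way; deciding which contact they belong to still takes a human look.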

I’m sure that there are even more causes for duplicate contacts. Please note that these causes require completely different approaches than the “Strickland Explanation”. I will have a closer look at these in a future post.

Where do the duplicates in your address book come from?

If you want to find duplicate contacts, please give my iPhone app “SmarterContacts” a try. You can find it on the app store. Please let me know when you identify other causes for duplicates so I can update this post and provide additional functionality in my app.

Low Data Quality in your iPhone Address Book – Why care?

February 9, 2012
Filed under: DataQuality, SmarterContacts 

As with all data quality issues, it is important to understand the consequences of a low quality address book. Too often this discussion is skipped, resulting in only half-hearted cleanup efforts and a quick slide back into old habits.

Here are a few scenarios that show the impact of low data quality in your address book – for the sake of this discussion I equate a low quality address book with one that has lots of duplicate records, i.e. more than one record for a single contact.

Which is the right record to use?

When you want to use the information in your address book, it is difficult to pick “the right” record to use. Consider the following excerpt from (fictional) iPhone contacts:


If you want to send an email to McFly, which address should you use?


There are a couple of contacts, but you can’t see the context of each email address. Some might be from his school (is he still going to school?) or from a college, some seem work-related. Just from looking at this list of potential addresses, there is no way to figure out which one is still valid, let alone which one to use.

(This example is not unusual, as older programs were only able to store one email address per contact.)

While you may have some additional information to help you decide (you know that Marty has already left high school and college, and that he has been “terminated” from his job at Fujitsu Enterprises), making that decision is next to impossible for Siri, the iPhone’s digital assistant. Here’s a look at what happens if you ask Siri to “Call McFly”:


If you try to narrow it down with “Martin” or “Martin McFly”, you’ll still have to choose between two records (not necessarily the right ones) – and then you’re stuck and can’t even get Siri to pick one of them:


Which is the right record to update?

A similar problem arises when you want to update the contact information for Marty. If Marty lets you know he’s got a new job and had to move to a new postal address, will you remember to delete the old address? If you just add the new address as a new record or just update one of the records, after a month or two you will have no way to decide which is the right address. If you have multiple records for Marty and decide to finally write down his birthday after missing it a few times – which record should you update? Just one (at least you’ve committed the info to your digital memory) or all (better safe than sorry)?

There is no good answer if you have more than one record for a person. Things get easy if you have just one record for Marty – only one place to store his birthday, and you’ll see what other addresses you already have for Marty when you enter a new one, so you can delete those that are no longer valid.
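
As a rough illustration of the “just one record” idea, the following Python sketch folds several records for the same person into one. It is a sketch under stated assumptions, not a description of any particular app: the record layout, the `merge_records` function and all sample values (including the birthday) are made up for this example.

```python
def merge_records(records):
    """Merge several records for one person into a single record.

    Multi-valued fields (emails, phones) are unioned in first-seen order;
    single-valued fields (name, birthday) keep the first non-empty value.
    """
    merged = {"name": "", "birthday": "", "emails": [], "phones": []}
    for record in records:
        for field in ("name", "birthday"):
            if not merged[field] and record.get(field):
                merged[field] = record[field]
        for field in ("emails", "phones"):
            for value in record.get(field, []):
                if value not in merged[field]:
                    merged[field].append(value)
    return merged


# Made-up example: three records for the same person, each holding one detail.
marty = merge_records([
    {"name": "Marty McFly", "emails": ["marty@school.example"]},
    {"name": "Marty McFly", "emails": ["marty@work.example"], "phones": ["555-0100"]},
    {"name": "Marty McFly", "birthday": "1970-01-01", "emails": ["marty@school.example"]},
])

print(marty["emails"])
print(marty["birthday"])
```

Note that the merge only puts all addresses side by side in one place. It cannot tell which of them is still valid; that decision stays with you, but it is much easier to make once everything is in a single record.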

What’s the data quality in your address book?

If you want to find out how many duplicates you have in your address book, please give my iPhone app “SmarterContacts” a try. You can find it on the app store.