Predicting Social Security Numbers from Public Data

Alessandro Acquisti and Ralph Gross have recently published a provocative article, "Predicting Social Security Numbers from Public Data," in the Proceedings of the National Academy of Sciences. According to the abstract:

Information about an individual’s place and date of birth can be exploited to predict his or her Social Security number (SSN). Using only publicly available information, we observed a correlation between individuals’ SSNs and their birth data and found that for younger cohorts the correlation allows statistical inference of private SSNs. The inferences are made possible by the public availability of the Social Security Administration’s Death Master File and the widespread accessibility of personal information from multiple sources, such as data brokers or profiles on social networking sites. Our results highlight the unexpected privacy consequences of the complex interactions among multiple data sources in modern information economies and quantify privacy risks associated with information revelation in public forums.

Acquisti and Gross’s study has generated significant media attention, including articles by Bob Sullivan for MSNBC and by Hadley Leggett for Wired. As Sullivan writes:

The two say they can guess the first 5 digits of the Social Security number of anyone born after 1988 within two guesses, knowing only birth date and location. The last four digits, while harder to guess, can be had within a few hundred guesses in many situations — a trivial hurdle for criminals using automated tools.
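To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes a naive search space of every nine-digit string (ignoring ranges that are never issued) and uses 300 as an illustrative stand-in for "a few hundred"; neither figure comes from the paper, and the function name is just for illustration.

```python
# Back-of-the-envelope comparison of search spaces.
# The guess counts below are illustrative assumptions, not figures from the paper.

NAIVE_SSN_SPACE = 10 ** 9    # every 9-digit string, ignoring never-issued ranges

FIRST_FIVE_GUESSES = 2       # "within two guesses", per Sullivan's summary
LAST_FOUR_GUESSES = 300      # assumed stand-in for "a few hundred"


def candidate_count(first_five_guesses: int, last_four_guesses: int) -> int:
    """Number of full SSNs to try if every first-five guess is paired
    with every last-four guess."""
    return first_five_guesses * last_four_guesses


candidates = candidate_count(FIRST_FIVE_GUESSES, LAST_FOUR_GUESSES)
reduction = NAIVE_SSN_SPACE // candidates

print(f"Candidates to try: {candidates}")
print(f"Naive search space is roughly {reduction:,} times larger")
```

Under these assumptions an attacker faces about 600 candidates rather than a billion, which is why a few hundred automated attempts is described as a trivial hurdle.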

SSNs are currently used by numerous businesses and organizations to allow access to accounts – they function as a kind of password. They are also used to verify identity when people sign up for a new credit card or other account. They are thus a very useful tool for identity thieves and fraudsters who want to impersonate people to improperly access their accounts or obtain credit cards in their name.

The current focus of policymakers has been to provide better protections against the disclosure of SSNs.

Acquisti and Gross’s paper provides a powerful demonstration that protecting against the disclosure of SSNs does not give consumers enough protection. The article shows that no matter how well SSNs are protected against disclosure, they can still be inferred from other public information.

Congress or the FTC should prohibit companies, organizations, and government entities from using SSNs as a means of verifying identity, whether to provide access to existing accounts or to create new ones. Merely protecting against the disclosure of SSNs is insufficient, since Acquisti and Gross demonstrate that they can readily be predicted.

Both the government and businesses are at fault here. Too many businesses and organizations use the SSN improperly as a means to verify identity. And the government is at fault for creating the SSN and allowing it to be used improperly in ways that harm people.

2 Responses

  1. Jon Sheck says:

    Hopefully this is the final nail in the coffin to STOP the Social Security Number from being improperly used by both government and private companies as a form of ID for everything from driver’s licenses, passports, utilities, bank accounts, etc.

  2. joe says:

    My bank recently asked me to say the last four of my SSN to verify my identity. I told them I didn’t think my SSN was much of a secret and could they instead ask me something else. And they were happy to ask me something else.

    @Jon: SSNs can still be used to ID people, that’s what they’re for… it’s the using them as *authenticators* (passwords) that’s the problem.