Healthcare.Gov – 70,000 records in 4 minutes
In light of the recent news regarding the Healthcare.gov site, there seems to be a lot of confusion about whether the site has been compromised. Much of this confusion stems from a statement by Mr. David Kennedy of TrustedSEC that “he was able to import some 70,000 records in about four minutes”.
This statement is raw meat for a press hungry for clickbait headlines, and so it was picked up by a large number of news outlets. The situation is summarized quite well in a Boing Boing post.
Beyond the sensationalist headline, this statement has an impact for a very distinct reason: it may be an admission of a criminal act.
Prior Case Law
The case of Andrew ‘Weev’ Auernheimer is very similar. In his case, he accessed AT&T servers and extracted user information. The process he used did not involve an actual compromise of the server, yet he was still able to access user data. Auernheimer was charged with “conspiracy to access a computer without authorization” for his actions, found guilty, and sentenced to 41 months in prison.
The apparent similarities between the cases were pointed out to Mr. Kennedy. At this point, he began to soften his wording of the actions taken and started undertaking an effort to “correct” media statements. His explanation was that no “hacking” occurred, that the attack was performed with a simple browser with “passive reconnaissance techniques”. His exact quote was:
“You can literally just open up your browser, go to this, and extract all this information without actually having to hack the website itself,” he added.
This is not accurate. If we view the original graphic from the TrustedSEC site we observe:
This screenshot depicts a custom Python tool targeting the healthcare.gov “infrastructure”. It is able to extract user data, including fields such as “ID”, “DisplayName”, “Login”, “screenName”, “Permissions” and “Email”. The output from the Python tool is sanitized; however, we can still observe the value “[admin]” in the “Permissions” field. If this data originated from a Google advanced search, as Mr. Kennedy states, we should be able to repeat the results in a browser.
The results from the Google search demonstrate that Mr. Kennedy’s statement is most likely not accurate. The data he claims he pulled from Google does not actually exist there. The data he extracted, as stated by his own Python attack tool, originated from the healthcare.gov infrastructure.
Even if the data were stored on Google’s servers (also known as the Google Cache), Mr. Kennedy would not be able to mine it in four minutes because of Google’s restrictions against automated queries. 70,000 automated queries would trigger those restrictions, as documented by Google.
Google.com or Healthcare.gov?
“There’s a technique called, what we call ‘passive reconnaissance,’” Kennedy explained to “Fox News Sunday” host Chris Wallace, “which allows us to query and look at how the website operates and performs.”
“And these type of attacks that I’m mentioning here, and the 70,000 [personal records Kennedy found] that you’re referencing, is very easy to do,” Kennedy continued. “It’s a rudimentary type attack that doesn’t actually attack the website itself. It extracts information from it without actually having to go into the system.”
“And 70,000 was just one of the numbers that I was able to go up to and I stopped after that,” he said. “You know, I’m sure it’s hundreds of thousands, if not more, and it was done within about a 4 minute timeframe. So, it’s just wide open.”
Mr. Kennedy describes two contradictory methods of data extraction: both “from the website itself” and from Google.
Mr. Kennedy is currently claiming that the media is sensationalizing his statements. This may be the case, yet the interview on Fox News Sunday left little to distort, as it was a plain admission of a crime directly from Mr. Kennedy himself.
Other aspects of Mr. Kennedy’s statements are awkward and not feasible. For instance, he has repeatedly claimed that “anyone” can access the data with a simple web browser. However, there is no way to access 70,000 separate data records in four minutes manually; the process has to be programmatically automated. Furthermore, when confronted with this on Twitter, he stated that he used a Python shared library, which he claimed was the browser “anyone” can use to access the data.
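To put the four-minute figure in perspective, here is a rough back-of-the-envelope sketch. It assumes one request per record and a hypothetical five seconds per record for a person clicking through a browser by hand; neither assumption comes from Mr. Kennedy’s statements, and his tool may have fetched records in batches.

```python
# Back-of-the-envelope check of "70,000 records in about four minutes".
# ASSUMPTIONS (not from the original statements): one request per record,
# and a hypothetical 5 seconds per record for a person working by hand.

RECORDS = 70_000
WINDOW_SECONDS = 4 * 60            # "about four minutes"
MANUAL_SECONDS_PER_RECORD = 5      # hypothetical manual pace


def required_rate(records: int, seconds: int) -> float:
    """Records that must be fetched per second to finish within the window."""
    return records / seconds


def manual_hours(records: int, seconds_per_record: float) -> float:
    """Hours a person would need at the assumed manual pace."""
    return records * seconds_per_record / 3600


rate = required_rate(RECORDS, WINDOW_SECONDS)
hours = manual_hours(RECORDS, MANUAL_SECONDS_PER_RECORD)

print(f"Required rate: {rate:.1f} records per second")
print(f"Manual effort at the assumed pace: {hours:.1f} hours")
```

Under these assumptions the required pace is in the hundreds of records per second, while working by hand would take days rather than minutes. The exact numbers shift with the assumptions, but the gap is large enough that only an automated process fits the claim.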
It is my belief that Mr. Kennedy accessed user data directly from the healthcare.gov infrastructure. He reported this to Congress but did not elaborate on the methods he used, as he was concerned about the legal repercussions. Now that Mr. Kennedy has made multiple media appearances on the topic, the holes in his explanation are becoming apparent.
Is Healthcare.gov hackable?
It is hard to determine whether the site has been compromised. If Mr. Kennedy’s statements are correct, then at the very least we know that the site does not store information securely. It is conceivable he was able to access the user data without hacking the server.
For all practical purposes, it appears that Mr. Kennedy took the same actions against healthcare.gov as Auernheimer did against AT&T. Whether he did is a question we will not get a direct answer to without an official investigation.