A server containing a staggering 24-plus million financial documents collected over more than a decade was discovered to be hemorrhaging data. The documents contained some of the most sensitive personal information possible: financial and banking information relating to mortgages and loans written by several banks in the United States. A data breach of this scale is difficult to comprehend. What makes this even harder to understand is that the incident was an unforced error: the server was totally unprotected.
Security researcher Bob Diachenko discovered the leaky Elasticsearch server on January 10 using nothing more than public search engines. His team took samples of the data and, after seeing how sensitive it was, “immediately initiated a responsible disclosure protocol” to privately alert the named server owner of the error. Diachenko also sent a responsible disclosure notice to CitiFinancial (also on January 10), which was referenced in “a massive amount of the documents.”
A statement from Citi to Diachenko and Zack Whitaker of TechCrunch stated that none of their systems had been compromised, and that “it appears the third party is a company that had purchased the loans…”
After a little more digging, Diachenko and Whitaker traced the exposed server to Ascension Data & Analytics, a Texas-based company that specializes in services for the financial industry. One of its specialties is document management, including optical character recognition (OCR), which converts typed or handwritten text in scanned images into machine-readable text.
Whitaker was able to reach Ascension, which said that one of its vendors, OpticsML, another document management company located in New York, “had mishandled the data and was to blame for the data leak.”
The personally identifiable information (PII) stored on the exposed server had been scanned using OCR. While the PII was not immediately readable by the average person, someone with a cursory understanding of metadata would have no problem sifting through the data.
According to Diachenko’s original report, the server was taken offline on January 15.
Digging Deeper into the Leak
Like any good researcher, Diachenko didn’t stop digging after the initial leak was shut down. Whitaker, in a follow-up report, notes that Diachenko discovered the same data exposed on another server. This time it was an Amazon S3 bucket, and it contained the original documents rather than their OCR-scanned counterparts.
Whitaker states that “anyone who went to an easy-to-guess web address in their web browser could have accessed the storage server to see—and download—the files stored inside.”
As if the story couldn’t get worse, Diachenko notes that he was “very surprised” by this second discovery, since Amazon storage servers are “private by default” and therefore shouldn’t be accessible to the public. Someone made the choice to make this server publicly accessible.
This second exposed server contained 23,000 pages of documents cobbled together into 21 files. A brief comparison of the data from the Elasticsearch and Amazon servers showed that some of the PII appeared in both leaks. These original documents included loan and mortgage agreements, W-2 tax forms, and loan repayment schedules, among other things, and clearly displayed names, addresses, phone numbers, and Social Security numbers.
Whitaker and Diachenko tried to contact OpticsML regarding the S3 bucket, but its website had been taken offline and the company’s phone was disconnected. After tracking down an e-mail address through some Internet Archive sleuthing, they were able to reach the chief executive officer (CEO) and chief technology officer (CTO) of OpticsML, who took the server offline “within the hour.”
CTO John Brozena told TechCrunch that OpticsML “are working with the appropriate authorities and a forensic team to analyze the full extent of the situation regarding the exposed Elasticsearch server.” He also noted that the company is “working to notify all affected parties,” a step required by New York’s data breach notification law.
Preventing the Preventable
Far too many data breaches like this one are completely preventable through basic security practices. Diachenko points out that this incident was akin to “leaving the front door open.” A publicly accessible configuration allows a cybercriminal to gain full administrative privileges on the server, where they can install malware to steal, destroy, or ransom the data. Since this was a treasure trove of sensitive PII, it likely could have sold easily on the Dark Web.
Like many other security professionals, Diachenko recommends taking a proactive stance when it comes to data security. In a note to Whitaker, he said that he “would assume that after such publicity like these guys had, first thing you would do is to check if your cloud storage is down or, at least, password-protected.” Sound advice.
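Diachenko's suggested check can be roughed out in a few lines of code. The Python sketch below is only an illustration, not a tool used by anyone in this story: it requests a URL with no credentials and reports whether an outsider would get data back, be asked to authenticate, or find nothing there. The function names and the verdict labels are assumptions made up for this example, and a real audit would look at far more than a single HTTP status code.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an unauthenticated request to a rough verdict."""
    if status in (401, 403):
        # The server asked for credentials or refused access outright.
        return "protected"
    if 200 <= status < 300:
        # The server answered an anonymous request with content.
        return "exposed"
    return "inconclusive"


def check_endpoint(url, timeout=5.0):
    """Request `url` without credentials and report what an outsider would see."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout: the server is not reachable.
        return "unreachable"
```

Calling check_endpoint on an address you own (for example a hypothetical "http://search.example.internal:9200/") from outside your network should come back "protected" or "unreachable". A result of "exposed" means the front door is open.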
In the age of GDPR and expanding data privacy laws in the United States, companies need to stay ahead of these kinds of issues. Securing cloud storage or other online servers is a great place to start. If your organization is thinking about migrating to cloud storage, reach out to a credentialed security consultant to help you through the process. While the upfront costs might seem steep, they are a good deal cheaper than penalties and damages imposed after a data breach.
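For S3 in particular, one common way a bucket ends up public is a bucket policy that grants access to everyone. The reports do not say how the OpticsML bucket was opened up, so the Python sketch below is purely illustrative: it scans a policy document for "Allow" statements whose principal is the wildcard "*". The bucket name and helper names are hypothetical. The check ignores access control lists and skips statements that carry a "Condition", which would need manual review.

```python
import json


def _as_list(value):
    """Policy fields may hold a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def _is_wildcard_principal(principal):
    """True if the principal is "*" or a mapping such as {"AWS": "*"}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return any("*" in _as_list(v) for v in principal.values())
    return False


def public_statements(policy_json):
    """Return the unconditional Allow statements that apply to any principal."""
    policy = json.loads(policy_json)
    flagged = []
    for statement in _as_list(policy.get("Statement", [])):
        if (statement.get("Effect") == "Allow"
                and _is_wildcard_principal(statement.get("Principal"))
                and "Condition" not in statement):
            flagged.append(statement)
    return flagged


# Hypothetical policy for illustration only; the bucket name is made up.
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicRead",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-loan-docs/*",
    }],
})
```

Running public_statements(EXAMPLE_POLICY) returns the single "PublicRead" statement. Any statement flagged this way deserves a second look before the bucket holds anything sensitive.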