April Trillion Updates

By Jon Inns - March 25th, 2019 Posted in Trillion Announcements

In this post we’re going to take a look at the new features that will be rolled out to the Trillion platform in our April release.  We’re excited about this release because it’s all about data viability and context.

Ever since the very first release of Trillion we’ve remained dedicated to providing industry-leading breach information alongside the best possible actionable data, and this April release further enhances the risk accuracy of our output.  In this release we are focusing on two new features: breach source metadata and proof of life.  We’ll explain why each of these matters in more detail in this post.

#1 Breach Metadata

Over the last 12 months we have detected a rapid increase in the creation and distribution of massive credential stuffing lists.  Very frequently these lists contain a clutch of newly sourced breach data merged with old data lists, and the result is repackaged as brand new user credentials ready for use in account takeover.  This data rewashing is causing a real headache for many of those trying to track and mitigate it.

Collection #1 contained over 700 million user credentials

In January 2019 we saw the release and distribution of a huge credential stuffing list now known as Collection #1.  For those that haven’t yet heard about it, Collection #1 is a massive collection of data breaches which were aggregated together and then sold or traded, with over 700 million rows of usernames and passwords contained within it.  For the organisations that are at least security savvy enough to use something like HaveIBeenPwned to monitor for these leaks, the release of Collection #1 caused a great deal of news and alarm.  Users and security teams began getting notified that they may have had credentials in it, but that wasn’t really the problem.  The real issue wasn’t that their user secrets might have been exposed, but that they didn’t know how to usefully react to the information being presented.  Without finding the original data themselves and then investing time in manually analysing it, they were unable to figure out where the leaks really originated and how harmful they might have been in relation to their corporate assets.  To make matters worse, shortly after the release of Collection #1, Collections #2, #3, #4, and #5 were also released, once again leaving the industry scratching its head about what to do next.

Trillion has been solving this issue since the beginning, as we run various processes over every line of data we inspect rather than just loading in these large collections as a single source.  Trillion runs analysis and probability contests on the data, checking whether we might have seen it before, breaking file structures back apart into their originators, and correlating individual user risk profiles based on the characteristics of user data as it has been distributed across multiple breaches.
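To give a feel for the "have we seen it before" step, here is a minimal sketch in Python.  It is not Trillion's actual implementation: the fingerprint scheme, the `attribute_rows` helper and the source names are all assumptions made up for illustration.  The idea is simply that each row is attributed to the source where it was first seen, rather than to whichever collection it most recently turned up in.

```python
import hashlib
from collections import defaultdict


def fingerprint(username: str, secret: str) -> str:
    """Stable fingerprint for one credential row (hypothetical scheme)."""
    normalised = f"{username.strip().lower()}\x00{secret}"
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def attribute_rows(rows, collection_name, seen):
    """Group the rows of a collection by the source they were first seen in.

    rows: iterable of (username, secret) pairs
    seen: dict mapping fingerprint -> name of the first source (updated in place)
    Returns a dict mapping source name -> list of normalised usernames.
    """
    by_source = defaultdict(list)
    for username, secret in rows:
        key = fingerprint(username, secret)
        # Rows never seen before are attributed to the current collection.
        source = seen.setdefault(key, collection_name)
        by_source[source].append(username.strip().lower())
    return dict(by_source)


# Illustrative data only: one older breach, then a repackaged collection.
seen = {}
attribute_rows([("alice@example.com", "hunter2")], "OldForumBreach", seen)
result = attribute_rows(
    [("Alice@example.com ", "hunter2"), ("bob@example.com", "s3cret")],
    "Collection #1",
    seen,
)
print(result)
```

In this toy run the repackaged row for alice is traced back to the older source, and only bob's row is treated as new to the collection.  A real pipeline would need far more than exact matching, but the per-row lookup is the part that stops a rewashed list looking like a brand new breach.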

While we have been doing this since the beginning, we didn’t previously expose some of that information to the platform operators.  In our April release, users will be able to access a complete Breach Dictionary of every breach source that has been processed by Trillion.


In addition to the Breach Dictionary, full breach metadata will be directly available from our link charts, so analysts will be able to identify at a glance their “at risk” users, the risk score of each user (based on the distribution and availability of lost secrets), the source of their leaks and a full summary of the individual breaches.


Breach Metadata is now woven into the fabric of Trillion and will provide a powerful support reference for Security Analysts.

#2 Proof Of Life

One of the big challenges with breach data is that it’s very difficult to order into a helpful chronological series.  Sometimes data is stolen and not released for many months or years.  Sometimes it’s old, so it appears not to matter anymore when it actually does (old data can frequently be as harmful as new breaches).  And sometimes new data throws up artefacts that look important but actually aren’t.  Here are a couple of examples:

That’s an old breach.  Why should I care about that?

  1. A breach occurred 4 years ago at a popular networking site and your corporate users’ passwords have been compromised.  Would you care? Would you want to know? (Pro-tip: you should care regardless of data age.  Users frequently keep the same passwords for years across many different systems.)
  2. A breach occurred yesterday and you have users in it.  Would you care? Is it more important than the old breach?  What if you have high staff turnover?  Just because the breach happened yesterday doesn’t mean the accounts are recent: they could have been created years ago, and a lot of the affected users may have already left your organisation.

As you can see, it’s tricky to get the balance right.  Regardless of the date or size of the breach, it’s very difficult to know how it impacts your organisation, and the larger the organisation, the harder it can be to identify the risks.

To remove this instinctive focus on “Breach Date”, we’ve introduced a new filter which allows you to refocus on “User Relevance” instead.  In our April release Trillion will go through a number of background steps to attempt to determine, for each of your detected at-risk users, whether the username it found looks to be active in your organisation.  It will then give you the option to prioritise the data for the users that look Alive (hence the heartbeat), so if you’re required to oversee the security of a large workforce, Trillion immediately bubbles the most at-risk and relevant accounts up to the top.

Heartbeats indicate likelihood of users still being active in your organisation

With our Proof of Life feature, the age of the data-set that has your users’ data in it becomes a reduced decision factor.  Trillion will automatically attempt to triage for you, identifying the users who are active in your business and who have had their records leaked.  Our risk scoring engine will highlight the risk of the data located, and give you the ability to send the secrets securely down to your users for viewing.
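As a rough illustration of what "User Relevance" ordering means in practice, here is a small Python sketch.  The `AtRiskUser` record, its fields and the 0-100 score scale are hypothetical and are not the Trillion data model; the liveness flag stands in for whatever background checks decide that a username looks active.

```python
from dataclasses import dataclass


@dataclass
class AtRiskUser:
    username: str
    risk_score: float   # hypothetical 0-100 scale, higher is worse
    looks_active: bool  # outcome of an assumed liveness check


def prioritise(users):
    """Order users so active accounts come first, highest risk first."""
    # False sorts before True, so "not looks_active" puts active users on top.
    return sorted(users, key=lambda u: (not u.looks_active, -u.risk_score))


# Illustrative data only.
users = [
    AtRiskUser("old.leaver", 90.0, False),
    AtRiskUser("active.low", 20.0, True),
    AtRiskUser("active.high", 75.0, True),
]
ordered = [u.username for u in prioritise(users)]
print(ordered)
```

Note that the account with the highest raw score ends up last here because it no longer looks active, which is the whole point of sorting on relevance rather than on breach date or score alone.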

We’re excited about this release and we have lots more in development that we can’t wait to share.  Stay safe.
