In this post I want to share some of the day-to-day ethical considerations we have to take into account when handling, and sometimes even talking about, the data we uncover. It’s quite long because it’s a complex topic, but read on if it interests you.
When we originally created our core platform (Trillion) we had one overriding principle, which was to ensure it would only ever be used for enhancing security on the internet and nothing we would do, to the very best of our abilities, would have an adverse impact on anyone. With that in mind, not only did our software have to be designed from the ground up with solid security principles, but so did all of our working practices around data discovery, collection, and disclosure. Now, since everything starts with discovery, let’s talk about that first and then move through some of the other interesting areas.
Ethical Issue #1 – Data Acquisition From Dark Markets
I suppose the first thing to say is that we at Threat Status will never pay for leaked data, and the reason for that is very simple. When people pay money for things it stimulates a market. Our services exist because there is already an established underground market for leaked user credentials, but the last thing we want to do is encourage its growth. The fundamental purpose of Trillion (and in the future ARC) is to try to defuse and minimise the usefulness of credentials from third-party data leaks. Organisations and researchers who are willing to pay for breach data are having the opposite effect. They are feeding the sharks, which is just going to make them come closer to shore. Don’t feed the animals!
However, there is an argument made by some that buying data would enable us to report on it faster. That is certainly true, but in our view the negative impacts of doing that significantly outweigh the positive, and buying stolen goods from criminals just doesn’t feel like a good way to run a business. I’d like to think that this is not even worthy of being pointed out, but recently there have been reports of companies doing exactly that. We’re not talking about buying decryption keys to unlock devices infected with ransomware, but actually buying stolen data with the argument that it’s in their customers’ interest to get hold of it in case they might be impacted.
This argument just doesn’t stack up. Customers who might benefit from that scenario also have to realise that if they ever stop being customers of that supplier, they will be living with the knowledge that there is a company out there willing to pay criminals for data that has probably been stolen, maybe even from them, and that it might be their money that helped pay for it.
Under no circumstances does it make sense to use a security monitoring service that is paying for illegally obtained material. So, if it’s a claim being spun by your Threat Intel partner, you should think carefully about engaging.
Ethical Issue #2 – Data Acquisition From Unknown “Researchers”
Sometimes data is located and discovered by security researchers and they’re happy to share it to help disarm the threat. This is when the infosec community is at its best, when it collaborates to solve a problem ethically and with the best of intentions.
There is another side to that coin though. Sometimes (and I could identify some) I have seen so-called security researchers living a double life. By day they claim to be good infosec professionals, working in the interest of clients, but by night they are on hacker forums bigging up their elite hacking profiles and sharing data they have either stolen themselves, or collaborated with others to obtain.
When services are happy to receive data from unvetted researchers and then credit them publicly through news blasts or social media for doing a good deed, simply in order to encourage the feed of more data from the individual in the future, they may well be complicit in the marketing of a criminal activity or the advertisement of criminal goods.
Now, I need to be clear here… we welcome the support of security researchers with an established or verifiable history of honest behaviour, as there is a genuine benefit to us, our service, our clients and the researchers in trying to get ahead of any problem the leaked data might present. That’s the whole point of responsible vulnerability disclosure.
Giving large public shout-outs to just anyone who hands over data is irresponsible though, and we won’t do that.
Ethical Issue #3 – Publicly Announcing New Data Discoveries
There is a lot of easy-to-obtain media buzz around data breach discoveries, and many organisations and services are very quick to announce that they have obtained data from a new breach, publishing blogs and articles about it to drum up as much media attention as they possibly can. However, what is usually happening “under the hood” is that, at the same time as they are announcing to the world that there is new data “on the market”, a feeding frenzy begins below the surface on the criminal forums. The surface web media announcements drive interest and demand in the underground material, and we see a lot of cases where data is being deliberately given to those who can shout the loudest about it, so that underground value and demand increases.
This is difficult to balance because, as a business, media attention is always welcome, but we made a decision early on that we wouldn’t announce new data finds. Initially this was because of exactly the scenario mentioned above, but latterly there is also just so much data in circulation that we would never take a breath. Had we been driven by media attention, we might have ended up only reporting on the juicy data that would get the best coverage, which would mean we were frankly just shouting about breaches for the sake of it, and we don’t want to do that.
For that reason you won’t ever hear us “announcing” that we have located new data breaches. We do still want to be very transparent about our work though, and so we make sure that anyone with an authenticated account on Trillion has full visibility of the many thousands of data breaches we have located so far. We’re just not going to brag in the news.
Ethical Issue #4 – To Verify Or Not To Verify
The next issue is about verifying the validity of data breaches. There are two schools of thought here. Option 1: verify every data breach by contacting the alleged source of the leak and asking them to confirm it. Option 2: don’t bother. It might surprise you to hear that we’ve gone for option 2. Now let me explain why.
Getting data owners to confirm they have lost data is incredibly time-consuming and prone to false negative results. Many times organisations will flat out deny they have lost data, or they simply don’t know (and may never know). That is very difficult to deal with.
Early on we made attempts to contact numerous organisations asking if they could help us verify data legitimacy. On almost every occasion the organisation did not want to engage, and it’s completely understandable why. It could be for any number of reasons: they were busy, they were scared, they didn’t know us, they didn’t trust us, they had outsourced the application so were blind to the answer, they didn’t have any technical resource to help confirm it, and so on. These are all totally fair reasons not to engage, but it meant that for 95% of the data we find we’re probably never going to know the real answer to how or where it came from. So what should we do? Ignore it? Not warn our customers in case we were wrong? No. Let’s fail safe. We give the customers the data and let them decide if it is a threat, so we built tools and capabilities that enable our clients to verify data accuracy as part of a crowd.
The fact that we don’t publicly shout into the internet horn about what we’ve located or where we think it leaked from also means we’re not shaming organisations at any point in time about data loss. Let’s not forget they’ve become a victim too. It’s not helpful to do that and it distracts from the real issue of protecting the user accounts as quickly as possible. Our internal crowd will confirm or reject the data accuracy over time.
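To make the crowd idea a little more concrete, here is a minimal sketch of how confirm/reject feedback from clients could be rolled up into a status for a data set. This is purely illustrative and not how Trillion actually does it: the function name, the status labels, the minimum vote count and the 80% threshold are all assumptions made for the example.

```python
def crowd_status(votes: list[bool], min_votes: int = 5, threshold_pct: int = 80) -> str:
    """Summarise crowd feedback on whether a leaked data set looks genuine.

    votes         -- True means a client confirmed the data, False means rejected.
    min_votes     -- hypothetical minimum number of votes before deciding anything.
    threshold_pct -- hypothetical share of agreement (in percent) needed to decide.
    """
    total = len(votes)
    if total < min_votes:
        # Fail safe: the data is still shown to customers, just not yet verified.
        return "unverified"

    confirms = sum(votes)
    rejects = total - confirms

    # Integer arithmetic avoids floating-point surprises at the threshold.
    if confirms * 100 >= threshold_pct * total:
        return "confirmed"
    if rejects * 100 >= threshold_pct * total:
        return "rejected"
    return "disputed"
```

The important design point is the first branch: a lack of verification does not hide the data from the people it might affect, it only changes how the data is labelled.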
Ethical Issue #5 – Handling Passwords
This is a thorny issue. Passwords are (and should always be) the secrets of the individual that created them and no-one else. Seeing passwords in the clear is like reading someone’s mind. They are things that are assumed to be privately stored in someone’s memory and never known by anyone else. We believe that passwords are basically someone’s private thoughts and they should stay that way.
One of the challenges we had when we developed Trillion was deciding what the correct and most secure way to handle passwords would be. We felt at the very beginning it was critical that we made them available, because without them it is impossible to validate the authenticity of the data or determine the threat of the leak either to the individual, or to an organisation.
There are services that alert on generic data dumps without having the details accurately broken out but this can be highly misleading. If a user receives an alert with “a breach of company X has happened, it had passwords in it, and you were in it” then that’s a pretty scary and alarming warning. However, if a more accurate description was that there were passwords in the data but your passwords weren’t, then they really need to know that. As anyone in security will know, false positives are bad and lead to a rapid loss of faith in the information being presented.
For example, when the Canva breach was released, it had approximately 137 million user accounts in it, but only around 10% of the data had actual passwords included too. Services like HaveIBeenPwned have made decisions (for their own safety and security) not to associate passwords with user records, but this results in a very high number of users being warned about a data breach that may actually have no risk to them at all, and the 10% who were actually at risk were not able to know which password had been leaked.
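As a rough illustration of why breaking the data out per user matters, the sketch below builds a different alert depending on whether a password was actually leaked for that specific account. The `BreachRecord` structure, the function names and the message wording are all hypothetical and only there to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreachRecord:
    """One user's entry in a leaked data set (hypothetical structure)."""
    email: str
    has_password: bool  # True only if a password was leaked for this account


def alert_message(record: BreachRecord, breach_name: str) -> str:
    """Build an alert that states whether this user's password was exposed."""
    if record.has_password:
        return (f"{record.email} was found in the {breach_name} breach, "
                "and a password was leaked for this account.")
    return (f"{record.email} was found in the {breach_name} breach, "
            "but no password was leaked for this account.")


def password_exposure_rate(records: list[BreachRecord]) -> float:
    """Fraction of records in a breach that include a password."""
    if not records:
        return 0.0
    return sum(r.has_password for r in records) / len(records)
```

With figures like those quoted above, a rate of around 10% across roughly 137 million accounts means something in the region of 13.7 million users with a leaked password, and well over 120 million who would otherwise receive a scarier warning than their situation deserves.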
That’s one perspective.
The flip side of this is, as I said at the beginning, passwords are people’s private thoughts and should be protected as such. We made design decisions very early on that ONLY the person that created the password should be able to see it. If we could provide ample metadata, risk indicators and interactive tools for corporate security teams to be able to determine the risk to their business of the data leaks, there would be no need for passwords to be seen by anyone except the user. This is very different to other services that will just share all usernames and passwords when asked.
This has been a decision of principle for us from the outset and we believe it’s the correct one. Knowingly enabling the sharing of passwords with anyone but the user who created them is a violation of trust to that individual (whether we’ve ever interacted with them or not), and it also exposes others to claims of abuse of power and perhaps accusations of criminal activity. If a user knows that someone else can see their secret passwords, then there is a potential case (rightly or wrongly) for accusation of unlawful access.
Have we spoken to people who don’t like this approach and have decided to look elsewhere, because they feel it is their right to see their users’ passwords? Yes. Did it change our mind? No.
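The principle can be sketched in a few lines. This is an assumption-laden illustration rather than Trillion’s real implementation: the `LeakedCredential` type, the `view_for` function and the single `password_present` indicator are invented for the example, and a real service would identify viewers through proper authentication rather than a plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeakedCredential:
    """A leaked username/password pair (hypothetical structure)."""
    owner: str      # the account the credential belongs to
    password: str   # the leaked secret; empty string if none was leaked
    breach: str     # label for the breach it was found in


def view_for(viewer: str, cred: LeakedCredential) -> dict:
    """Return what a given viewer is allowed to see about a leaked credential.

    Everyone with access gets metadata; only the owner gets the password itself.
    """
    view = {
        "owner": cred.owner,
        "breach": cred.breach,
        # Illustrative risk indicator: enough to assess exposure without the secret.
        "password_present": bool(cred.password),
    }
    if viewer == cred.owner:
        view["password"] = cred.password
    return view
```

A security team member calling `view_for` for someone else’s record can see that a password was exposed and where, which is enough to act on, but the secret itself never appears in their view.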
Ethical Issue #6 – User Rights To Privacy
The last one I want to talk about here is user privacy.
We find data leaks from all over the internet: educational websites, social media, dating sites, adult entertainment, shopping. You name a category and I can show you sites being compromised and having had data leaked. For every breach we locate (and we have processed many thousands) we try to determine what information the site was hosting when it was compromised. Our team then (for each and every source) will try to determine whether that site holds what the ICO considers “special category data”. If it does, then we keep that source private and hidden from everyone except the actual user who was in the breach.
Again – there are two sides to this thinking.
On one hand, some will argue that it’s important that they know where this leak came from because it could have a bearing on the individual user and what they are doing.
On the other hand, the only reason anyone would probably ever have known that the user was using these sites or applications is because of a catastrophic security failure by the site that was trusted with the user’s data in the first place. Had that security been robust then that user’s data would have remained private forever. We think keeping it private is the correct way to treat data from sites that we consider Special Category.
Our thinking here is that it was not the individual user’s fault that the site security has failed them, and I’ll refer back to the principle I mentioned at the beginning of this post which is “[Trillion] would only ever be used for enhancing security on the internet and nothing we would do, to the very best of our abilities, would have an adverse impact on anyone”. For us again, it feels like the right balance between protecting the internet, protecting organisations, and protecting individual users’ privacy.
The one exception to this would be if we ever uncovered a data breach for a highly illegal site (such as one related to child abuse). In those situations that data wouldn’t be processed by Trillion at all, but turned straight over to law enforcement. They can then do what they do best and deal with any identified individuals using the most appropriate procedures and investigative skills.
I hope that’s some useful insight into why we have made some of the decisions we have made and the trade-offs that need to be considered when handling sensitive breach data. It’s a complex topic with lots of opinions and perspectives, but these are mine.
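Put together, the rules in this section amount to a small visibility policy, which could be sketched as follows. The names here (`SourceClass`, `source_visible_to`) are hypothetical, and the assumption that a non-special-category source is visible to any authorised viewer is a simplification for the example.

```python
from enum import Enum


class SourceClass(Enum):
    """How a breach source has been classified (hypothetical labels)."""
    STANDARD = "standard"
    SPECIAL_CATEGORY = "special_category"
    ILLEGAL = "illegal"


def source_visible_to(viewer: str, affected_user: str, source_class: SourceClass) -> bool:
    """Decide whether a viewer may see which site a user's data leaked from."""
    if source_class is SourceClass.ILLEGAL:
        # This data is never processed at all; it goes to law enforcement instead.
        raise ValueError("source must not be processed; refer to law enforcement")
    if source_class is SourceClass.SPECIAL_CATEGORY:
        # Only the person who was in the breach may see the source.
        return viewer == affected_user
    # Assumption: standard sources are visible to any authorised viewer.
    return True
```

Raising an error for the illegal case, rather than returning `False`, is deliberate in this sketch: it makes the point that such data should never reach the stage where visibility is even a question.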