It’s not just what you say that’s sensitive in a message
“Metadata is far more intimate than our conversations. It shows where we go, our interests, our relationships—it shows who we are.” —Bruce Schneier
We talk a lot about metadata in the context of online advertising, social networks, and websites, but we rarely define what it means for mobile messaging. Understanding mobile metadata is essential to understanding mobile messaging security, privacy, and anonymity.
Metadata definition, and why it’s crucial for privacy
As the quote above suggests, a proper metadata definition should make clear that metadata can be more comprehensive and more personal than the messages themselves. Metadata puts what we do and say online into a much larger context of who we are, what we’re doing, and who we’re connected to.
Systematic use of metadata can build a frighteningly accurate profile of us, which is why people who are concerned about privacy and security care so much about how metadata is handled by the apps they use. The more metadata an app gathers, stores, and uses, the more the company knows about us. And the more a company knows, the more that could be shared or stolen in a hack.
Here’s a metadata definition with some visual help:
We all know how much information websites gather about us, and how that information is used for advertising and personalization. Website metadata paints a pretty detailed picture of you, almost a unique fingerprint. Messaging apps like Signal and WhatsApp, encrypted or not, gather related but somewhat different kinds of information. These can include your:
- Username
- Phone number
- Carrier
- Specific device identification number
- Contacts’ usernames, real names, and phone numbers
- Contact times and who you contacted
- Contact list on your phone and sometimes your contacts’ contacts (if they are using the same messaging app)
- Mobile operating system
- Device
- IP address
- Location
The first seven items on the list identify you right away, and any messaging app that collects and stores that information can’t be considered secure by any measure. But what about the last four? With only a few data points, your phone activity can be uncovered, including the location, time, and length of a conversation and who you were communicating with. And if your metadata is matched up with the metadata from the other half of the conversation, an observer can confirm that “this person communicated with that person on this day for this long”. Nobody has to know exactly what was said for this to be a privacy problem.
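To make the matching step concrete, here is a minimal Python sketch of how an observer could pair up connection records from two devices using nothing but start times and durations. The record fields, device names, timestamps, and the five-second tolerance are all invented for illustration. They are assumptions, not a description of how any particular app or agency actually does this.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ConnectionRecord:
    """Hypothetical metadata for one connection. Note: no message content."""
    device_id: str
    start: datetime
    duration_s: int
    location: str


def likely_conversations(side_a, side_b, tolerance_s=5):
    """Pair records whose start time and duration nearly coincide."""
    tolerance = timedelta(seconds=tolerance_s)
    matches = []
    for a in side_a:
        for b in side_b:
            same_start = abs(a.start - b.start) <= tolerance
            same_length = abs(a.duration_s - b.duration_s) <= tolerance_s
            if same_start and same_length:
                matches.append((a, b))
    return matches


# Made-up records for two devices (assumed data, for illustration only).
device_a = [
    ConnectionRecord("device-A", datetime(2013, 6, 7, 2, 24, 0), 1080, "Oakland"),
    ConnectionRecord("device-A", datetime(2013, 6, 7, 9, 15, 0), 120, "Oakland"),
]
device_b = [
    ConnectionRecord("device-B", datetime(2013, 6, 7, 2, 24, 2), 1079, "San Jose"),
    ConnectionRecord("device-B", datetime(2013, 6, 7, 18, 0, 0), 300, "San Jose"),
]

for a, b in likely_conversations(device_a, device_b):
    print(
        f"{a.device_id} ({a.location}) and {b.device_id} ({b.location}) "
        f"likely talked on {a.start:%Y-%m-%d} for about {a.duration_s // 60} minutes"
    )
```

The sketch never touches message content, yet it still links two devices, two locations, a date, and a duration. Real traffic analysis is more involved, but the principle is the same.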
A metadata definition from the EFF
Here are some examples from a 2013 EFF post about phone metadata and the inferences, right or wrong, that can be drawn from it. Someone who looks at your metadata knows that you:
- Rang a phone sex service at 2:24 am and spoke for 18 minutes. But they don’t know what you talked about.
- Called the suicide prevention hotline from the Golden Gate Bridge. But the topic of the call remains a secret.
- Spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don’t know what was discussed.
- Received a call from the local NRA office while it was having a campaign against gun legislation, and then called your senators and congressional representatives immediately after. But the content of those calls remains safe from government intrusion.
- Called a gynecologist, spoke for a half hour, and then called the local Planned Parenthood’s number later that day. But nobody knows what you spoke about.
These examples relate to the first revelations about NSA gathering of telephone metadata, but the same reasoning applies to mobile data as well. If a messaging app gathers this data and shares it with advertisers, the way WhatsApp does with its parent company Facebook (WhatsApp privacy policy), your activities are not anonymous or secure. Nobody knows what was said, but the metadata reveals enough for some easy assumptions to be made.
Let’s look at this in a broader context from a paper by Nidhi Rastogi and James Hendler of Rensselaer Polytechnic Institute:
“In the privacy domain, there have been concerns related to user metadata as well. WhatsApp encrypts the communication channel between users using end-to-end encryption. The metadata of the user is encrypted as well when data is in motion on the communication channel between various parties. It is essential to understand that information stored in metadata is just as important in preserving privacy of the users, as is the data itself. The company’s legal terms allow them to store information associated with successfully delivered messages such as time of delivery, mobile phone numbers involved in the messages, size of any digital content swapped between the two parties (Bernstein 2006).
Also, the app persists the user to share one’s entire contact list with the app. This is a way to further gather information about who is in a particular social network of a user. It is like trading the convenience of having the app to figure out who uses it amongst one’s contacts for giving up the entire list of which one contacts regularly, including those who don’t use the app. There is still no option of selectively adding contacts to the WhatsApp list. Any addition of this feature in the future will not help existing users as they have already shared this detail with the app. A smartphone metadata reflects a wealth of details both at the level of individual calls and when analyzed in aggregate. Computer scientists and researchers have proved this a number of times in the past.
It is here where WhatsApp falters.
While the metadata is encrypted during transit, phone numbers, timestamps, connection duration, connection frequency, as well as user location are being stored on the company’s servers. This metadata is sufficient to create a profile and draw some strong inferences between the communicating parties. And as we’ve seen very often, both governments and hackers can get their hands on the metadata if they really go after it.”
For messaging apps to work, some metadata must be collected and stored for at least a short time. The key for secure messaging apps is minimizing what is collected and stored to only what is absolutely needed in order to offer full anonymity.
How much metadata is just enough?
For best-of-breed secure messaging apps, like SKY ECC, you only need the sender’s and recipients’ SKY ECC IDs to send the message from one person to another. Encrypted messages are stored only if one person is offline. Unlike other apps in the market, we provide true anonymity for our users, capturing the least amount of metadata possible. We believe that secure messaging means gathering and storing as little information as possible about what our users are doing.
The less information we have, the more protection we can offer our users.
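As a rough illustration of what “as little as possible” can look like, here is a toy Python relay that routes already-encrypted messages by ID and holds a message only while the recipient is offline. This is a sketch under stated assumptions, not SKY ECC’s actual implementation: the class name `MinimalRelay`, its methods, and the callback-based delivery are all hypothetical.

```python
from collections import defaultdict, deque


class MinimalRelay:
    """Toy relay (hypothetical): routes opaque ciphertext by ID and keeps no logs."""

    def __init__(self):
        self.online = {}                   # user ID -> delivery callback
        self.pending = defaultdict(deque)  # user ID -> queued (sender ID, ciphertext)

    def connect(self, user_id, deliver):
        """Register a delivery callback and flush anything queued for this user."""
        self.online[user_id] = deliver
        queue = self.pending.pop(user_id, deque())
        while queue:
            sender_id, ciphertext = queue.popleft()
            deliver(sender_id, ciphertext)  # once delivered, nothing is kept

    def disconnect(self, user_id):
        self.online.pop(user_id, None)

    def send(self, sender_id, recipient_id, ciphertext):
        """Deliver immediately if possible; otherwise hold only the ciphertext."""
        deliver = self.online.get(recipient_id)
        if deliver is not None:
            deliver(sender_id, ciphertext)
        else:
            self.pending[recipient_id].append((sender_id, ciphertext))

    def stored_message_count(self):
        return sum(len(queue) for queue in self.pending.values())


relay = MinimalRelay()
inbox = []

relay.send("id-A", "id-B", b"opaque-ciphertext-1")  # id-B is offline: held
print(relay.stored_message_count())

relay.connect("id-B", lambda sender, blob: inbox.append((sender, blob)))
print(relay.stored_message_count())                 # queue was flushed on connect
```

The relay never records timestamps, locations, IP addresses, or contact lists, and it has nothing to hand over once a message is delivered. A production system would also need authentication, expiry of undelivered messages, and transport security, all of which are left out here.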
Metadata becomes a big problem when it is used incorrectly, without your knowledge, or for things you didn’t agree to. In the case of secure messaging, the less metadata gathered and kept, the better.