Even if you have encrypted your traffic with a VPN (or the Tor Network), advanced traffic analysis is a growing threat against your privacy. Therefore, we now introduce DAITA.
Even if you have encrypted your traffic with a VPN (or the Tor Network), advanced traffic analysis is a growing threat against your privacy. Therefore, we now introduce DAITA.
Through constant packet sizes, random background traffic and data pattern distortion we are taking the first step in our battle against sophisticated traffic analysis.
Good for Mullvad. Long overdue, as they say. Wish I could still be a customer. Whoever induced them to turn off port forwarding did the world a disservice.
The possibility that some of those bad customers might've had the specific objective of permanently crippling the service as they successfully did is often under-appreciated.
By observing that when someone visits site X, it loads resources A, B, C, etc in a specific order with specific sizes, then with enough distinguishable resources loaded like that someone would be able to determine that you're loading that site, even if it's loaded inside a VPN connection. Think about when you load Lemmy.world, it loads the main page, then specific images and style sheets that may be recognizable sizes and are generally loaded in a particular order as they're encountered in the main page, scripts, and things included in scripts. With enough data, instead of writing static rules to say x of size n was loaded, y of size m was loaded, etc, it can instead be used with an AI model trained on what connections to specific sites typically look like. They could even generate their own data for sites in both normal traffic and the VPN encrypted forms and correlate them together to better train their model for what it might look like when a site is accessed over a VPN. Overall, AI allows them to simplify and automate the identification process when given enough samples.
Mullvad is working on enabling their VPN apps to: 1. pad the data to a single size so that the different resources are less identifiable and 2. send random data in the background so that there is more noise that has to be filtered out when matching patterns. I'm not sure about 3 to be honest.
Thanks very much, I believe I understand that part now, like a fingerprint to associate to site components like pulled in js, css, etc. I still don't understand, though, how they associate that to a particular user of a VPN. Does each request done through a VPN include some sort of identifier for each of us or is AI also doing something to put these requests in a particular user's bucket?