“What would it mean for your business if you could target potential clients who are actively discussing their need for your services in their day-to-day conversations? No, it's not a Black Mirror episode—it's Voice Data, and CMG has the capabilities to use it to your business advantage.”
The part of CMG advertising the capability is CMG Local Solutions. CMG itself is owned by Apollo Global Management and Cox Enterprises, which includes the ISP Cox Communications. CMG operates a wide array of local news television and radio stations.
Cox Enterprises isn’t some random company. It’s one of the largest privately owned companies in the US. They are certainly capable of doing things like this.
Having worked with Cox Enterprises, I can tell you it’s just a massive amalgamation of disparate acquisitions that have never been remotely integrated in a meaningful way, so this is a slightly dubious claim. It would require much more coordination across entities than I believe is possible at the CMG I knew pre-pandemic.
What about modern capitalism makes you optimistic? I know for a fact this is happening. I bought a pair of Bose earbuds. I was pretty excited about them, but they were defective. The app they tried to get me to download required me to sign away permission to “map” my head movements, intercept any audio I actively play through the headphones…AND “passively record any sound around you.”
And when I saw that shit, I got right the fuck out of there, even though seeing that shit required me to click through three sub-menus into entirely different legal documents, all of which I would otherwise have agreed to like every other privacy policy: absentmindedly.
After getting right the fuck out of there, I went to their website to contact customer service about the defect. I opened an SMS chat with customer service, where I was told “replying to this chat is tacit agreement to our CUSTOMER SERVICE PRIVACY POLICY,” which I opened. Initially I was fine with it, because it seemed like a different policy that just allowed them to record the conversation “for training purposes.” Until I clicked through one, two, three, and now FOUR sub-menus to find I WOULD’VE AGREED TO THE SAME FUCKING PRIVACY POLICY.
So I fucking called Bose. I wanted to know if I could use these headphones without ever agreeing to the privacy policy. But of course customer service couldn’t even conceive of my question. I asked to get transferred to the legal dept.
Lol of course not. What the fuck was I thinking.
So fuck them, I returned those fuckers as fast as I could.
How often are you digging into the sub-pages and cited clauses of the privacy policies you’re agreeing to day-to-day? Because I will tell you, they were making me sign away the right to ALL of that information, and their specific info on how they were using it (a different sub-contract) was pretty lax about who they could share it with.
I fully believe this has been happening WAY longer than just recently. Capitalism is trading on our data in the most invasive ways imaginable. The spying capabilities have reached dystopian levels. How long ago did those CIA leaks come out about smart TVs being used to eavesdrop? That was the Vault 7 release in 2017. Seven goddamn years ago.
"Nah I've already got 4 tin-foil hats on and I'm destroying anything made after the 1950s right now. Kids included, they are microchipped with the vaccines. It's okay because I'll plead insanity." -way too many people
Literally anyone can run the basic numbers on the bandwidth that would be involved. There are two options:
1. They stream the audio out to their own servers, which process it there. The bandwidth involved would be INSTANTLY obvious: continuously streaming audio out is non-trivial, and anyone can pop open their phone to monitor network usage. You'd hit your data limit within a day or two (see the back-of-envelope numbers after this list).
2. They have the app always on, listening for "wake words" that trigger recording, and only then does it stream audio out. "Wake wordS", plural, is doing a LOT of heavy lifting here. A single wake word takes a tremendous amount of training and money; the countless wake words required for what people are claiming would cost a LOT more. And that's not all: running that sort of program is extremely resource-intensive, and once again, you can monitor your phone's resource usage. You'd see the app at the top, burning through your battery like no tomorrow. Android and iPhone both notify you when a specific app is using a lot of battery power, so you'd instantly notice such an app running.
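To put rough numbers on option 1, here's a back-of-envelope sketch in Python; the sample rate and bitrate are illustrative assumptions, not measurements of any real app:

```python
# Back-of-envelope numbers for option 1 (continuous audio upload).
# All figures below are illustrative assumptions, not measurements.

SECONDS_PER_DAY = 24 * 60 * 60

# Raw 16 kHz, 16-bit mono PCM: 32,000 bytes/second before compression.
raw_bytes_per_sec = 16_000 * 2
raw_gb_per_day = raw_bytes_per_sec * SECONDS_PER_DAY / 1e9
print(f"Raw PCM upload: ~{raw_gb_per_day:.1f} GB/day")           # ~2.8 GB/day

# Even a modest 64 kbit/s compressed stream adds up fast:
compressed_gb_per_day = (64_000 / 8) * SECONDS_PER_DAY / 1e9
print(f"64 kbit/s stream: ~{compressed_gb_per_day:.2f} GB/day")  # ~0.69 GB/day
```

Either way, an always-on upload would blow through a typical mobile data cap within days, exactly as described above.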
I think a big part of this misunderstanding comes from the fact that Alexa/Google devices seem so small and trivial for their wakewords.
What people don't know, though, is that Alexa and Google Home devices have an entire dedicated board with its own processor JUST for detecting their ONE wake word, and on top of that they explicitly chose phrases that are easy to listen for.
"Okay Google" and "Hey Alexa" have a non-trivial amount of engineering baked into making sure they are distinct and less likely to get mistaken for other words, and even despite that they have false positives constantly.
If that's the amount of resources involved for just one wake word/phrase, you have to understand that targeted marketing would require hundreds of times that. It's not viable for your phone to do it 24/7 without also doubling as a hand warmer in your pocket all day long.
The point of "OK Google" is to start listening for commands, so it needs to be really good and accurate. Whereas the point of "fluffy blanket" is to show you an ad for fluffy blankets, so it can be poorly trained and wildly inaccurate. It wouldn’t take that much money to train a model to listen for some ad keywords and be just accurate enough to get a return on investment.
(I’m not saying they are monitoring you, just that it would probably be a lot less expensive than you think.)
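To make "accurate enough to get a return on investment" concrete, here's toy arithmetic; every number is invented for illustration, not taken from any real campaign:

```python
# Toy arithmetic: why an ad-keyword spotter can afford to be sloppy.
# Every figure below is a made-up illustration, not industry data.

impressions = 1_000_000      # ads triggered by "fluffy blanket" detections
precision = 0.30             # only 30% of triggers were really about blankets
uplift = 0.002               # extra conversion rate among the true hits
value_per_conversion = 5.00  # dollars of margin per resulting sale
cost_per_impression = 0.001  # dollars paid to serve each ad

revenue = impressions * precision * uplift * value_per_conversion
cost = impressions * cost_per_impression
print(f"revenue ~ ${revenue:,.0f}, cost ~ ${cost:,.0f}")
# revenue ~ $3,000, cost ~ $1,000 -> positive ROI despite 70% false positives
```

Even at 30% precision the invented numbers come out ahead, which is the point above: an ad trigger doesn't need assistant-grade accuracy.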
If it's random sampled no one would notice. "Oh my battery ran low today." Tomorrow it's fine.
Google used to (probably still does) A/B test Play services that caused battery drain. You never knew if something was wrong or you were the unlucky chosen one out of 1000 that day.
Bandwidth for voice is tiny. The AMR-WB standard goes down to 6.6 kbit/s and includes voice activity detection, so it's only sending ~6.6 kbit/s, and only when it detects voice.
Given that a single webpage today averages 2 megabytes, an additional 825 bytes of data each second could easily go unnoticed.
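That 825 figure checks out (6,600 bits/s divided by 8 is 825 bytes/s). A quick sketch extending it with an assumed duty cycle:

```python
# Sanity-checking the AMR-WB figure above.
AMR_WB_LOWEST_MODE = 6_600          # bits per second (the 6.60 kbit/s mode)
bytes_per_sec = AMR_WB_LOWEST_MODE / 8
print(f"{bytes_per_sec:.0f} B/s")   # 825 B/s, matching the comment

# Assume voice activity detection keeps the encoder on 2 hours a day
# (an invented duty cycle, just for scale):
active_seconds = 2 * 60 * 60
mb_per_day = bytes_per_sec * active_seconds / 1e6
print(f"~{mb_per_day:.1f} MB/day")  # ~5.9 MB/day, roughly three webpage loads
```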
This is simply not true. Low-bitrate compressed audio is a small amount of bandwidth you would never notice on home internet. And recognizing wake words? Tiny, tiny amounts of processing. Google's design is for accuracy and control; a marketing team cares nothing about that. They'll use an algorithm that just grabs everything.
Yes, this would be battery intensive on phones when not plugged in. But triggering on power, via CarPlay, or on smart speakers is trivial.
I'm still skeptical, but not because of this.
Edit:
For creds: I'm a developer specializing in algorithm creation, and I've previously rolled my own hardware and branch for MyCroft.
FYI, the Snapdragon 855 from 2019 could detect two wake words at the same time. With the exponential increase in NPU power since then, it wouldn't be shocking if newer chips can detect hundreds.
But what about a car? Cars are as smart as smartphones now, and you certainly wouldn't notice the small amount of power needed to collect and transfer data compared to what it takes to drive the car. Some car manufacturers' TOS agreements seemingly admit that they collect and use your in-car conversations (including any passengers, whom they claim it is your duty to inform that they are being recorded). Almost all the manufacturers are equally bad on privacy and data collection.
What you're saying makes sense, but I can't believe nobody has brought up the fact that a lot of our phones are constantly listening for music and displaying the song details on our lock screen. That all happens without the little green microphone-active light, and with minimal battery and bandwidth consumption.
I know next to nothing about the technology involved, but it doesn't seem like it's very far from listening for advertising keywords.
That uses a similar approach to the wake word technology, but slightly differently applied.
I am not a computer or ML scientist but this is the gist of how it was explained to me:
Your smartphone has a low-powered chip connected to your microphone that, when the phone is idle, runs a local AI model (this is how it works offline) asking one thing: is this music or is it not music? Once that model decides it's music, it wakes up the main CPU, which looks up a snippet of that audio against a database of audio snippets corresponding to popular/likely songs, and then displays a song match (a sketch of this two-stage gate follows the list below).
To answer your questions about how it's different:
- The song ID runs with system-level access, so it doesn't go through the normal audio permission system and thus doesn't trigger the microphone access notification.
- Because it uses a low-powered detection stage rather than keeping the full microphone pipeline on, it runs with much less battery usage.
- As I understand it, it's a lot easier to tell whether audio seems like music than whether it contains a specific intelligible word you may or may not be looking for, which you'd then have to process into language linked to metadata, etc.
The on-device database is fairly small, since what's downloaded is a selection of audio patterns that the snippet is compared against. This database gets rotated over time, and the song ID apps often also let you send your audio snippet to the online megadatabases (Apple's music library/Google's music library) for a better match, but overall the data transfer isn't very noticeable. Searching for arbitrary hot words can't be nearly as optimized as assistant activations or music detection, especially if it's not built into the system.
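A minimal sketch of that two-stage gate, using a cheap spectral-flatness heuristic to stand in for the real low-power model; the threshold and the lookup stage are placeholders, not how any shipping implementation necessarily works:

```python
import numpy as np

def looks_like_music(samples):
    """Cheap first-stage gate: strongly tonal audio (music) has low
    spectral flatness, while broadband noise is near 1. The 0.3
    threshold is an illustrative guess, not a tuned value."""
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(samples)))) + 1e-12
    flatness = np.exp(np.mean(np.log(spectrum))) / np.mean(spectrum)
    return flatness < 0.3

def identify_song(snippet):
    """Expensive second stage, reached only after the gate fires. A real
    system would fingerprint the snippet and query a database; this
    placeholder just marks where that lookup would happen."""
    return "lookup(fingerprint(snippet))"

sr = 16_000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)  # tonal, music-like test signal
noise = np.random.randn(sr)         # broadband, speech/noise-like
print(looks_like_music(tone), looks_like_music(noise))  # True False
```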
And that's about it....for now.
All of this is built on current knowledge of researchers analysing data traffic, OS functions, ML audio detection, mobile computation capabilities, and traditional mobile assistants. It's possible that this may change radically in the near future, where arbitrary audio detection/collection somehow becomes much cheaper computationally, or generative AI makes it easy to extrapolate conversations from low quality audio snippets, or something else I don't know yet.
I'm very skeptical of their claims, but it's possible they've partnered with some small number of apps so that they can claim that this is technically working.
This is why I generally ensure my phone is configured ahead of time to block ads in most cases. I don't need this garbage on my device.
As for how they could listen? It's pretty easy.
By waiting until the phone is completely still, and potentially on a charger, it can collect a lot of data. Phones typically live on the nightstand by your bed at night and could be listening intently while charging.
Similarly, it could start listening whenever it hears extended conversation, simply by polling the microphone for human speech every x minutes for y minutes. Then it can record snippets, encode them quickly, and upload them for processing. This would be thermally undetectable.
Finally, it could simply start listening in certain situations, like when it detects other devices nearby (via Bluetooth), and capture as many small snippets of your conversation as it could.
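A sketch of that scheduling logic; every device API below is hypothetical, standing in for the sensor and power state a sufficiently privileged app could read:

```python
import time

# Hypothetical device APIs -- none of these exist in any real SDK; they
# stand in for accelerometer, power, and recording access.
def device_is_still() -> bool: ...
def is_charging() -> bool: ...
def record_snippet(seconds: int) -> bytes: ...
def upload(blob: bytes) -> None: ...

def opportunistic_listener(check_every=300, snippet_len=10):
    """Only record when the cost (battery drain, heat) is least likely
    to be noticed: phone motionless and sitting on the charger."""
    while True:
        if device_is_still() and is_charging():
            blob = record_snippet(snippet_len)  # short compressed snippet
            upload(blob)                        # a few kB, easy to miss
        time.sleep(check_every)                 # poll every x seconds
```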
Both Android and iOS do enforce permissions against applications that have not been granted explicit access to listen constantly.
For example, the Google Assistant is often a privileged app, and it is allowed to listen. It does so by listening efficiently for one kind of sound: the hotword "Ok Google".
Other applications not only have to obtain user permission, but oftentimes that permission is restricted to "While app is in use", meaning the app is on the screen, notifying the user, in the foreground, or recently opened. That restriction prevents most abuses of the microphone unless someone is actively using the app.
The phone's processor has the wake word hardcoded, so it's not like an ad company can add a new one on a whim. And it uses passive listening, so it's not recording everything you say. I've seen it compared to sitting in a class and not paying attention until the teacher says your name.
For that I think they use special hardware; that's the reason you can't modify the wake word, and why they still notify you when the voice assistant is disabled.
I don't know if this is actually true, or the companies try to hide behind this, or I just remember it incorrectly.
Of course this is possible. Is it practical? Nope. There is already so much data harvested by the likes of Google and Facebook that they can tell what you like, what videos or articles you read, what you share, and in some cases who you talk to. Importing a shit ton of audio data is pointless; they already know what you like.
You just need to process the audio on the device and then send keywords to Google etc. It's technically trivial, since most phones already have dedicated hardware for that. Your phone listens for activation words all the time unless you disable it. There's no reason it couldn't also forward anything else it hears as text.
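To see how little would leave the device under that scheme, here's a toy illustration; the device ID and keyword counts are invented:

```python
import json

# Invented output of an hour of hypothetical on-device speech-to-text,
# boiled down to the only thing an ad network would need:
keywords = {"fluffy blanket": 3, "vacation": 1, "dog food": 2}

payload = json.dumps({"device": "abc123", "kw": keywords}).encode()
print(len(payload), "bytes")  # ~80 bytes, noise next to ordinary app telemetry
```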
I don't know why, given recent impressive developments, but I've always met the idea that this is really happening with heavy skepticism, and I still do. This is definitely the most concrete thing I've ever heard, and I definitely don't doubt companies would do this, I just... I don't know, it's hard to believe they really are.
One reason is that they'd be absolutely overwhelmed by useless data. AI isn't cheap to run, and it'd be so hard to link a captured conversation to a genuine sentiment, then to an ad connecting to that person, and then a purchasing decision to that ad. This is scary for sure, but it feels like marketing hype aimed at marketeers more than a real thing.
Will be watching closely. I feel like this might actually be that bridge too far that the mainstream of society will demand action against, if it gets widely adopted and widely known. Even if it technically works and is provably effective for advertisers, I think you'd need Google or Amazon to be the ones pulling it off, and to have done so silently, so we all just kinda assume they're doing it but don't know. If a company "starts" offering this service in a way the public can latch on to, it would likely cause a massive backlash that would hopefully scupper such plans.
The biggest criticism of the idea that phones are always listening and sending that data somewhere is that they would also be listening in on other corporations and their meetings. Even if multi-billion-dollar corporations can just waltz over the rights of normal people, other companies would be very interested in knowing this is happening.
Also, I feel like they already know this stuff, so they gain very fucking little by listening to us. You saw an interesting website two days ago and spent more time on it than normal. Then you meet up with friends who are known to have similar tastes, so why the hell wouldn't ad companies show ads for the same page/item/event to those people? It doesn't matter at all whether you mention it out loud. Companies already know what products and brands you like; if your friends search for something, obviously they get ads for products that are interesting to their circle of friends. The items/brands/whatever get talked about because they're interesting to that circle of people, which companies already know.
All you need is a list of advertising keywords. Have the device treat those like wake words just like Alexa and then target ads to the device based on which words it heard most often.
This is just the simplest version; it's easy to elaborate from there.
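A minimal sketch of that simplest version, with a made-up keyword list and transcript:

```python
from collections import Counter

AD_KEYWORDS = {"blanket", "mattress", "vacation", "mortgage"}  # illustrative

def keyword_hits(transcript: str) -> Counter:
    """Treat each ad keyword like a wake word: count how often it fires."""
    return Counter(w for w in transcript.lower().split() if w in AD_KEYWORDS)

heard = keyword_hits("i need a new blanket and honestly a new mattress too blanket weather")
top_keyword, _ = heard.most_common(1)[0]
print(f"target ads for: {top_keyword}")  # target ads for: blanket
```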
Not sure about every company lol, but this article is helpful. I had everything off but the driving one, and I can confirm I got ads for stuff I mentioned in the car the other day.
I wonder if they're gathering this audio data from their own cable boxes, where the data transmission wouldn't be noticed; they have remotes with microphones for voice commands.
CMG’s website addresses this with a section that starts “We know what you are thinking…”
“Is this legal? YES- it is totally legal for phones and devices to listen to you. That's because consumers usually give consent when accepting terms and conditions of software updates or app downloads,” the website says.
Well, yes, but actually no. I have no idea how this might play out in parts of the world other than the US, but in most places you'd usually need the consent of all parties involved. If my neighbor were to install an (infected) app like this, then carried his phone around and talked to me, I did not consent, and it would be illegal to record me, even if he wasn't tricked into consenting but knowingly accepted it. Worse yet, in that last scenario he might be on the hook for legal consequences too...
Besides that legal minefield, I think it's a bluff. The tech is either way less accurate than they claim, or quite resource-intensive, either eating through your data plan on a mobile phone or draining your battery. My bet is on a PR stunt.
Nowadays you need two devices if you want to use custom ROMs.
To have access to your bank account, you need Google Play Integrity... so you're forced to use Google.
Democracy and free markets at their finest...