Google Home - Loving it!

Status
Not open for further replies.
Originally Posted by PeterPolyol
Originally Posted by Quattro Pete
The conversion from speech to text happens in the cloud, not on the device itself, so yes, compressed audio is transmitted to the cloud.

Are we certain about this? I'm not doubting that voice data is uploaded, but are we sure that it's restricted to this by some means?


If I was making a smart TV I sure as heck would have the TV hold the speech-to-text libraries on-board and do the actual processing. That would:

1) offload the power consumption to the consumer
2) make the deciphered text available without any network or further processing delay
3) use *orders of magnitude* less data to send the text to some central server for [whatever nefarious purpose is served by Samsung listening to you converse in your living room].
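To put a rough number on point 3, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 16 kbps speech bitrate, the 3-second utterance, and the sample phrase are made up, not measured from any real device.

```python
def audio_bytes(seconds: float, bitrate_kbps: float) -> int:
    """Approximate size of a compressed audio clip at a constant bitrate."""
    return int(seconds * bitrate_kbps * 1000 / 8)


def text_bytes(transcript: str) -> int:
    """Size of the same utterance sent as UTF-8 text."""
    return len(transcript.encode("utf-8"))


# Hypothetical utterance and codec settings (assumptions, not device specs).
utterance = "what is the population of Iceland"
clip_seconds = 3.0
bitrate_kbps = 16.0

audio_size = audio_bytes(clip_seconds, bitrate_kbps)
text_size = text_bytes(utterance)
ratio = audio_size / text_size

print(f"audio: {audio_size} bytes, text: {text_size} bytes, ratio: {ratio:.0f}x")
```

Under these assumptions the audio clip is roughly two orders of magnitude larger than the text, and the gap grows with higher bitrates or longer clips.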
 
Originally Posted by uc50ic4more
Originally Posted by PeterPolyol
Originally Posted by Quattro Pete
The conversion from speech to text happens in the cloud, not on the device itself, so yes, compressed audio is transmitted to the cloud.

Are we certain about this? I'm not doubting that voice data is uploaded, but are we sure that it's restricted to this by some means?


If I was making a smart TV I sure as heck would have the TV hold the speech-to-text libraries on-board and do the actual processing. That would:

1) offload the power consumption to the consumer
2) make the deciphered text available without any network or further processing delay
3) use *orders of magnitude* less data to send the text to some central server for [whatever nefarious purpose is served by Samsung listening to you converse in your living room].

A smart TV will presumably have a predetermined and limited number of phrases that the user will throw at it. With a device like Google Home, you can ask it anything. I can imagine that is harder to execute locally on limited hardware. But again, I don't work for Google and don't know its inner workings other than what I have read.
 
Originally Posted by Quattro Pete
[don't think text-to-speech is the same as speech-to-text.

Yeah, messed that up. I did obv mean speech-to-text. Still a total cakewalk compared to other AI and facial recognition, which is done locally in the device. Again, uploading raw audio (or video) data to be processed remotely is a logistical nightmare and a very bad idea on many fronts. It's a huge leap of faith to believe the machine is only listening for its command to open a connection, and nothing else, to the point that it's just not realistically fathomable. These are data gathering companies and people are trusting them to voluntarily be "hands off" from gathering data! Who would expect that?

My sister has this "Smart Home Monitoring" system. I asked, do you feel comfortable being on camera all the time? She said "The only camera is the security cam looking at the driveway". I took her over to the unit and pointed to the built-in console camera and said, "but this one is in the house, in the foyer". She said "well idk if that's a camera (it is) but it's obviously not on" to which I replied "well, wouldn't that suck to have this camera in your house, and you're the only one without access to it".
 
Originally Posted by PeterPolyol
[Again, uploading raw audio (or video) data to be processed remotely is a logistical nightmare and a very bad idea on many fronts.
It's not raw audio. It's heavily compressed audio which in small snippets does not take up that much space. Certainly these companies have massive storage facilities, so it's not a problem for them, and most people's internet connections are fast enough these days where such audio transmissions won't impact the user. My guess is Google and Amazon want the actual audio capture so that they can analyze how successfully their AI deciphered it and converted it to text. If S2T was done locally, then all they'd get would be the text, and they'd never know if that text was truly what the person had said or if it was just a poor speech-to-text job done by the local device.

Again, I don't know for sure that processing is done remotely. If you can find info to the contrary, then post it.

Here is what I found:


"Once a wake word is detected, audio recorded by the device is sent to Google or Amazon Servers for processing and, by default, long term storage. A response is crafted by Amazon and Google servers and sent back to the user."

https://www.wired.com/story/amazon-echo-and-google-home-voice-data-delete/

https://www.hallsteninnovations.com/amazon-echo-google-home-listening2/
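As a rough illustration of the flow described in that quote, here is a minimal sketch of wake-word gating. It is a toy model under stated assumptions: text strings stand in for audio frames, the `WAKE_WORD` value and the `capture_len` parameter are hypothetical, and whether the wake phrase itself is also transmitted is not something the quote settles. It says nothing about what the real devices actually do internally.

```python
from typing import Iterable, List

WAKE_WORD = "ok google"  # illustrative wake phrase for this sketch only


def frames_to_upload(frames: Iterable[str], capture_len: int = 2) -> List[str]:
    """Return only the frames that would leave the device in this toy model.

    Frames are checked locally; nothing is collected for upload until the
    wake word is heard, after which the next `capture_len` frames are sent.
    """
    uploaded: List[str] = []
    remaining = 0  # frames still to capture after the last wake word
    for frame in frames:
        if remaining > 0:
            uploaded.append(frame)
            remaining -= 1
        elif WAKE_WORD in frame.lower():
            remaining = capture_len
    return uploaded


# Simulated living-room audio, one string per frame.
stream = [
    "chatting about dinner",
    "OK Google",
    "what is the weather",
    "in Toronto",
    "back to dinner talk",
]
print(frames_to_upload(stream))
```

In this model only the two frames after the wake phrase are uploaded; the conversation before and after stays on the device.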


Quote
It's a huge leap of faith to believe the machine is only listening for its command to open a connection, and nothing else, to the point that it's just not realistically fathomable. These are data gathering companies and people are trusting them to voluntarily be "hands off" from gathering data! Who would expect that?
I don't disagree. It's a recording device, and as such could be made to record anything at any time, either intentionally by the provider, or intentionally by a hacker.
 
Originally Posted by JeepWJ19
I love the irony :]

Maybe, maybe not. It's one thing to do things online intentionally. Yes, some put way too much of themselves online, but that's another issue. One does certain things online by choice. However, there are things that certainly don't need to be online, and only gain very limited real functionality by being connected. I have little use for a thermostat that goes online, a television that goes online itself, or a connected fridge.

We don't have a dichotomy where to be private one must live in a log cabin in the forest 100 miles from the nearest settlement versus having a smartphone, smart watch, smart TV, Google home, and tweeting every moment of life. I go online to do things I enjoy and to make certain other things easier (i.e. paying a bill online is much easier than going downtown). I certainly don't need every other thing I own that has an on/off switch to also be connected to the net somehow. I like browsing BITOG, so I do it. My car rides don't get any more enjoyable by using iThings, so I don't do that.
 
Originally Posted by JeepWJ19
Originally Posted by Quattro Pete
Originally Posted by Smokescreen
I don't have any smart TV's, NEST or the like in my domain.

...I am unplugged from the matrix and prefer to remain that way.

Yet you are on the internet, posting here.


I love the irony :]


There is a large difference between initiating the connection on the internet and then disconnecting. I get to choose to engage or not. With these smart devices you are always connected.
 
Originally Posted by Smokescreen
There is a large difference between initiating the connection on the internet and then disconnecting. I get to choose to engage or not. With these smart devices you are always connected.

So you manually switch off your computer's network adapter every time you're done browsing the internet and you always keep your phone in Airplane Mode if you're not using it?

Regardless who initiates the connection and for how long, you are always leaving a trail of activity, one way or another.
 
Originally Posted by Quattro Pete
Originally Posted by Smokescreen
There is a large difference between initiating the connection on the internet and then disconnecting. I get to choose to engage or not. With these smart devices you are always connected.

So you manually switch off your computer's network adapter every time you're done browsing the internet and you always keep your phone in Airplane Mode if you're not using it?

Regardless who initiates the connection and for how long, you are always leaving a trail of activity, one way or another.



The computer is usually only on when I am on the internet, after which it gets turned off. Phone, likewise goes into airplane mode when not on the internet. I don't have any social media either.

Leaving a trail while online is a known, calculated risk. I have the level of control and access I want. Nothing I have gets activated or accessed without me knowing.
 
Not only do I turn off the Internet, I put aluminum foil over the computers, my windows, and my baseball cap. You just NEVER KNOW.

We are not alone.......
 
Originally Posted by Quattro Pete
So you manually switch off your computer's network adapter every time you're done browsing the internet and you always keep your phone in Airplane Mode if you're not using it?

I can't speak for others, but I often turn off the power bar to which my modem and router are connected and I don't have a cell phone, at least nothing that would be recognizable as a cell phone by most born anywhere around the turn of the century.
 
Originally Posted by Quattro Pete
It's not raw audio. It's heavily compressed audio which in small snippets does not take up that much space. Certainly these companies have massive storage facilities, so it's not a problem for them, and most people's internet connections are fast enough these days where such audio transmissions won't impact the user. My guess is Google and Amazon want the actual audio capture so that they can analyze how successfully their AI deciphered it and converted it to text. If S2T was done locally, then all they'd get would be the text, and they'd never know if that text was truly what the person had said or if it was just a poor speech-to-text job done by the local device.

Yes, plenty of storage and plenty of bandwidth right down to the consumer level. Of course. Also, plenty of device processing power and very advanced stages of biometric tech, including face/voice/speech recognition as well!
Let me start by getting the semantics out of the way: "raw audio" in this context would mean recorded audio, regardless of format or compression. Sorry for not being clear.
Recording everything as audio and uploading it to "cloud" servers for subsequent processing is just a generally bad idea. Uploading data at conspicuous times is also generally a very bad idea. How obvious and stupid would that be, especially since people WILL be looking for anomalous data uploads from this new "device" concept, as clearly evidenced in this very thread?? Who ever would conceptualize such a product as "voice" driven smart home bugging devices, and campaign them into being accepted as one of your standard array of household devices, without thinking about these basic concerns of suspicious bugging behaviour??

Why wouldn't these devices use both means, to minimize data transit and subsequent suspicion?? S2T "inspection" of all speech that the machine hears, for trigger phrases, and the recording of audio upon triggering a word/phrase, for contextual evidence if required. For marketing and consumer data purposes, vague "mentions" of trigger words (like 'eggs' or 'walmart' or 'football' or 'beer') deciphered and uploaded as strings of text in an encrypted stream from your unique device ID would make the most sense. If they need the audio from your machine, then it would probably wait for trigger phrases like "murder" or "kill" or "terror" or "bomb"... and this isn't really new to these types of home devices, and the whole thing about it is that it can never become known that this is happening or even possible.
 
I don't care what you consider a good or bad idea. That's for everyone in this thread to decide for themselves. I was just pointing to the information I found that describes how it presumably works.

There is a fine line between caution and paranoia.
 
Originally Posted by Quattro Pete
I don't care what you consider a good or bad idea. That's for everyone in this thread to decide for themselves. I was just pointing to the information I found that describes how it presumably works.


What does that mean? Everyone in this thread is in the development stage of designing a 'smart home device' and is determining what design elements of an already finished data gathering product are a good or bad idea? Thanks for sharing the consumer electronics magazine article, I'm convinced! The only local speech recognition is the wake word! Duh, how could I be so stupid to think the code could work on more than one single phrase for more than a single purpose...oh, the paranoia. I see now, Google doesn't really have any business plan beyond informing us about the population of Iceland via voice interaction... I see the error of my paranoid ways... I now denounce the aluminum foil conspiracies and embrace the drooling...
... does that about cover it?
 
Originally Posted by PeterPolyol
Thanks for sharing the consumer electronics magazine article, I'm convinced!
I am still waiting for you to post links to something of substance to validate your claims.

Quote
The only local speech recognition is the wake word! Duh, how could I be so stupid to think the code could work on more than one single phrase for more than a single purpose.

Even I stated earlier that there could be additional wake words.
 
My wife and I always talk about top secret things and shout out our credit card, bank account numbers, etc. while standing in our living room in front of our Amazon (Echo) listening device.

Need to put tin foil over it.......
 
Originally Posted by Quattro Pete
Originally Posted by PeterPolyol
Thanks for sharing the consumer electronics magazine article, I'm convinced!
I am still waiting for you to post links to something of substance to validate your claims.



Well, I'm just not going to do that. I wouldn't reasonably expect to find any solid proof of the sort in the public domain, either. My concerns about the entire concept, however, won't be easily anaesthetized. If you call that approach paranoid, then it probably won't help... you.
A friendly reminder that you're free to ignore anything I said as pure paranoid hogwash and subsequently shut down. It'll make someone think, maybe not you or 98% of people on here, and that's fine, but at least someone will be inspired to think critically about it.
 
Originally Posted by CKN
My wife and I always talk about top secret things and shout out our credit card, bank account numbers, etc. while standing in our living room in front of our Amazon (Echo) listening device.

Need to put tin foil over it.......

How does it feel to confuse 'vapid' and 'basic' with funny?
 
Originally Posted by PeterPolyol
Originally Posted by CKN
My wife and I always talk about top secret things and shout out our credit card, bank account numbers, etc. while standing in our living room in front of our Amazon (Echo) listening device.

Need to put tin foil over it.......

How does it feel to confuse 'vapid' and 'basic' with funny?



Oh-you mean like other posts on this forum?
 
I wonder how many people would use a Google product if this company was forced to label the outside of their packages much like cigarettes.

Something like this should be required: "Warning: This Device Packages and Sells the Information of the Person Using It to the Highest Bidder in the World."
 
We have 2 Minis and love them. We mainly use them for controlling the thermostat, setting alarms, and adding things to the grocery list.
 