Snap! v10.6

Folks, we've just published a minor release: Snap! 10.6.
You might need to clear your browser cache. Afterwards you can play with a new block in the "Text to Speech, Voice to Text" library that lets you input text by speaking into your device's microphone.

This block uses a new-ish web API and works quite differently across major browsers. In my tests the best results - across spoken languages - are obtained in Chrome and mobile Safari, whereas Safari on Mac only seems to work with the language of your device, and not with other ones.
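
For readers curious how a spoken language is usually selected with this kind of web API, here is a minimal sketch. It is not Snap!'s actual implementation. The `lang`, `interimResults` and `maxAlternatives` properties belong to the standard SpeechRecognition interface, while `RecognizerLike` and `configureRecognizer` are hypothetical names made up for illustration.

```typescript
// The subset of the Web Speech API's SpeechRecognition interface used here.
// `RecognizerLike` is a hypothetical name, introduced only for this sketch.
interface RecognizerLike {
  lang: string;            // BCP 47 language tag, e.g. "it-IT" or "en-US"
  interimResults: boolean; // whether partial results are reported
  maxAlternatives: number; // how many candidate transcripts to return
}

// Hypothetical helper: asks the recognizer for one final transcript
// in the given language. Whether the browser honors the requested
// language is up to the browser, as described above.
function configureRecognizer<T extends RecognizerLike>(
  recognizer: T,
  languageTag: string
): T {
  recognizer.lang = languageTag;
  recognizer.interimResults = false;
  recognizer.maxAlternatives = 1;
  return recognizer;
}
```

In a browser that supports the API, the object passed in would be a real recognizer instance. The sketch only sets properties, so it can also be tried with a plain object.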

This is a fun - still somewhat experimental - feature that we've added for a joint research project with the University of Esslingen (Germany), in which we are creating an iPad app for anonymous surveys among preliterate children with different spoken languages in daycare institutions. As a side benefit you now also get to play with speech input, cheers! :slight_smile:

10.6.0:

  • New Features:
    • new "recognize speech" reporter block in the "Text to Speech, Voice to Text" library
    • new "tts_recognize" extension block
  • Notable Changes:
    • renamed the "Text to Speech" library into "Text to Speech, Voice to Text"
    • MQTT library update with base64 encoding, thanks, Xavier and Simon!
  • Notable Fixes:
    • fixed a costume-loss issue for multi-scene projects stored in the cloud
    • reduced processor load when idling
  • Translation Updates:
    • Armenian, thanks, Antrohoos Education Foundation!

Honestly, I've found the ecraft2learn one [ecraft2learn commands script pic] to be a bit more reliable than the Snap! one [untitled script pic], but I'm wondering how it works for others. For me, the Snap! one keeps failing to recognize the words "what" and "when".

Oh cool!

Great!

That may be because they're probably not relying on the browser API.

This is going to be a lot of fun!!!

I tried it in Chrome, and it works both for Italian and for my bad English.

Instead, in Firefox I get this error message when I run this very simple script (the same that works well in Chrome):

I think @toontalk is using the same SpeechRecognition web API in ecraft2learn...

interesting. does it always produce the same results on your end?

yeah, firefox doesn't have speech recognition.

Oh yes, Firefox is still trailing behind, see: SpeechRecognition - Web APIs | MDN
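
For anyone who wants to check for support before running a script, here is a minimal feature-detection sketch. It assumes the usual situation where a browser exposes the constructor either as `SpeechRecognition` or under the prefixed name `webkitSpeechRecognition`. The names `SpeechGlobals`, `findSpeechRecognition` and `describeSupport` are hypothetical and not part of Snap! or any library.

```typescript
// Minimal shape of the global object we inspect; in a browser this
// would be `window`. `SpeechGlobals` is a hypothetical name.
type SpeechGlobals = {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
};

// Returns the recognition constructor if one is exposed, else null.
function findSpeechRecognition(globals: SpeechGlobals): unknown {
  return globals.SpeechRecognition ?? globals.webkitSpeechRecognition ?? null;
}

// Turns the lookup into a short message a script could show the user.
function describeSupport(globals: SpeechGlobals): string {
  return findSpeechRecognition(globals) === null
    ? "speech recognition unavailable"
    : "speech recognition available";
}
```

In a browser the argument would be the global `window` object (with a type cast if your TypeScript setup requires one). A browser that exposes neither name would yield the "unavailable" message instead of an error.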

It is a shame. But, luckily enough, browsers are free :slight_smile:

Cool!

But, whose API? It sounds like we need to revise our posted privacy policy to note that this is an exception to the "we don't track you" part.

Sadly, it's either you get tracked buuuutt you get all the newest and shiniest web features, or you don't get tracked and maybe get them a year after they're out

It's a browser (JavaScript) API that is (I'm pretty sure) all done locally, and is not being used to track you (or at least not by anything but Google Chrome; idk if Google Chrome uses it to track you, but a privacy-focused Chromium browser should be completely safe).

edit: never mind, I just looked at SpeechRecognition - Web APIs | MDN

Note: On some browsers, like Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won't work offline.

My best guess is that it's sending it to Google.

Offline speech recognition is available for high end devices. E.g. Google's Pixel phones: An All-Neural On-Device Speech Recognizer

But in general it is sent off to servers by the browser provider. Google, and I assume the others, claim to respect privacy.

I assume over time more and more will be running on the device.

Though, Mozilla recently posted a new privacy policy for Firefox that, among other things, allows them to store any data that could be needed to make some feature "work", including training AI features.

I can finally finish my LCARS recreation project! Finally, I can talk to the computer!

So in other words yes, we do have to update the privacy policy. :~(

No, Brian, we're not using somebody's web service, but the official speech-to-text API.

But...

I read this as saying that the official speech-to-text API may use a web server, depending on your browser and OS. Am I wrong?

That's what it says, right here: SpeechRecognition - Web APIs | MDN

But, we also offer XHR in the url block, and let people publish projects that query any web service - I've personally used the Chuck Norris web API a couple of times - therefore I think this is not to be considered a breach of privacy whatsoever.

A quick test revealed that Safari does it completely offline, whereas Chrome currently seems to send it somewhere. So this isn't us but whatever your OS and browser decide to do.