Snap! v10.6

jens · March 14, 2025, 3:35pm

Folks, we've just published a minor release: Snap! 10.6.
You might need to clear your browser cache. Afterwards you can play with a new block in the "Text to Speech / Voice to Text" library that lets you input text by speaking into your device's microphone:

This block uses a new-ish web API and works quite differently across major browsers. In my tests the best results - across spoken languages - are obtained in Chrome and mobile Safari, whereas Safari on Mac only seems to work with the language of your device, and not with different ones.

This is a fun - still somewhat experimental - feature that we've added for a joint research project with the University of Esslingen (Germany), in which we're creating an iPad app for anonymous surveys among preliterate children with different spoken languages in daycare institutions. As a side benefit you now also get to play with speech input. Cheers!
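For readers curious how a spoken language gets selected with the browser's SpeechRecognition API, here is a minimal TypeScript sketch. It is not Snap!'s actual implementation: `RecognizerLike` and `listenOnce` are hypothetical names made up for illustration, and the interface only mirrors the few members of the real API that the sketch touches (`lang`, `onresult`, `onerror`, `start()`).

```typescript
// Hypothetical minimal shape of a speech recognizer; it mirrors only the
// members of the browser's SpeechRecognition object used below.
interface RecognizerLike {
  lang: string;
  onresult: ((event: { results: ArrayLike<ArrayLike<{ transcript: string }>> }) => void) | null;
  onerror: ((event: { error: string }) => void) | null;
  start(): void;
}

// Ask the recognizer for one utterance in the given language (a BCP 47 tag
// such as "it-IT" or "en-US") and hand the first transcript to `onText`.
function listenOnce(
  recognizer: RecognizerLike,
  lang: string,
  onText: (text: string) => void,
  onFail: (reason: string) => void
): void {
  recognizer.lang = lang;
  // The first alternative of the first result is the recognizer's best guess.
  recognizer.onresult = (event) => onText(event.results[0][0].transcript);
  recognizer.onerror = (event) => onFail(event.error);
  recognizer.start();
}
```

In a browser that supports the API, the `recognizer` argument would be an instance of the SpeechRecognition constructor. Whether a requested language is actually honoured is up to the browser, which would be consistent with the Safari-on-Mac behaviour described above.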

sathvikrias · March 14, 2025, 3:33pm

10.6.0:

New Features:
- new "recognize speech" reporter block in the "Text to Speech, Voice to Text" library
- new "tts_recognize" extension block
Notable Changes:
- renamed the "Text to Speech" library into "Text to Speech, Voice to Text"
- MQTT library update with base64 encoding, thanks, Xavier and Simon!
Notable Fixes:
- fixed a costume-loss issue for multi-scene projects stored in the cloud
- reduced processor load when idling
Translation Updates:
- Armenian, thanks, Antrohoos Education Foundation!

sathvikrias · March 14, 2025, 3:35pm

Honestly, I've found the ecraft2learn one to be a bit more reliable than the Snap! one, but I'm wondering how it works for others. For me, the Snap! one keeps failing to recognize the words "what" and "when".

owlsss · March 14, 2025, 3:54pm

Oh cool!

emodrow · March 14, 2025, 4:03pm

Great!

ego-lay_atman-bay · March 14, 2025, 4:08pm

That may be because they're probably not relying on the browser api.

s_federici · March 14, 2025, 4:12pm

This is going to be a lot of fun!!!

I tried it in Chrome, and it works both for Italian and for my bad English.

In Firefox, however, I get this error message when I run this very simple script (the same one that works well in Chrome):

jens · March 14, 2025, 4:35pm

I think @toontalk is using the same SpeechRecognition web API in ecraft2learn...

sathvikrias · March 14, 2025, 4:39pm

Interesting. Does it always produce the same results on your end?

Yeah, Firefox doesn't have speech recognition.

jens · March 14, 2025, 4:47pm

Oh yes, Firefox is still trailing behind, see: SpeechRecognition - Web APIs | MDN
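Since support differs between browsers, a script can check for the API before relying on it. The following TypeScript is a rough sketch of such a check, not Snap!'s own code: `findSpeechRecognition` and `speechInputStatus` are hypothetical helper names, and the `scope` parameter exists only so the check can be tried outside a browser. Some browsers expose the constructor under the prefixed name `webkitSpeechRecognition`, so both names are looked up.

```typescript
// Hypothetical helper type: any constructor that can be called with `new`.
type RecognizerCtor = new () => object;

// Return the speech recognition constructor found on `scope`, or null if
// neither the standard nor the prefixed name is defined there.
function findSpeechRecognition(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): RecognizerCtor | null {
  const candidate = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof candidate === "function" ? (candidate as RecognizerCtor) : null;
}

// Turn the check into a message that could be shown to the user.
function speechInputStatus(scope?: Record<string, unknown>): string {
  return findSpeechRecognition(scope) !== null
    ? "speech input available"
    : "speech input not supported in this browser";
}
```

Called without arguments in a page, `speechInputStatus()` inspects the global object, so in a browser without the API it would report that speech input is not supported instead of failing with an error.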

s_federici · March 14, 2025, 5:10pm

It is a shame. But, luckily enough, browsers are free.

bh · March 14, 2025, 6:34pm

Cool!

But, whose API? It sounds like we need to revise our posted privacy policy to note that this is an exception to the "we don't track you" part.

owlsss · March 14, 2025, 7:01pm

Sadly, either you get tracked but get all the newest and shiniest web features, or you don't get tracked and maybe get them a year after they're out.

ego-lay_atman-bay · March 14, 2025, 7:05pm

It's a browser (JavaScript) API that is (I'm pretty sure) all done locally, so it isn't being used to track you, at least outside Google Chrome. I don't know whether Chrome uses it to track you, but a privacy-focused Chromium browser should be completely safe.

Edit: never mind, I just looked at SpeechRecognition - Web APIs | MDN:

Note: On some browsers, like Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won't work offline.

My best guess is that it's sending the audio to Google.

toontalk · March 15, 2025, 11:10am

Offline speech recognition is available on high-end devices, e.g. Google's Pixel phones: An All-Neural On-Device Speech Recognizer

But in general the audio is sent off to servers by the browser provider. Google, and I assume the others, claim to respect privacy.

I assume that over time more and more of it will run on the device.

bluebaritone21 · March 18, 2025, 2:21am

Though, Mozilla recently posted a new privacy policy for Firefox that, among other things, allows them to store any data that could be needed to make some feature "work", including training AI features.

I can finally finish my LCARS recreation project! Finally, I can talk to the computer!

bh · March 18, 2025, 3:16am

So in other words yes, we do have to update the privacy policy. :~(

jens · March 18, 2025, 5:48am

No, Brian, we're not using somebody's web service, but the official speech-to-text API.

bh · March 18, 2025, 9:40am

But...

I read this as saying that the official speech-to-text API may use a web server, depending on your browser and OS. Am I wrong?

jens · March 18, 2025, 1:57pm

That's what it says, right here: SpeechRecognition - Web APIs | MDN

But we also offer XHR in the url block, and let people publish projects that query any web service - I've personally used the Chuck Norris web API a couple of times - therefore I think this is not to be considered a breach of privacy whatsoever.

A quick test revealed that Safari does it completely offline, whereas Chrome currently seems to send it somewhere. So this isn't us but whatever your OS and browser decide to do.