How are hyperblocks so fast? (with inside views from @jens and @bh)

Runtime performance often isn't that important, especially not for an education-oriented language. Besides, no code written in an interpreted language on top of JavaScript is going to be faster than (equally well-written) code in JavaScript itself, I suppose. Still, when I "moved" from Scratch to Snap! in 2022 - mainly because Snap! supports recursion, and lists can be stored in variables - I was somewhat disappointed at first by its sluggish execution; I found that a 1:1 translation of a Scratch program to Snap! would typically take between 1.5 and 8 times as long to run.

Then I discovered higher-order functions (HOFs, e.g. [script pic]), hyperblocks (e.g. [Dijkstra's shortest distance script pic]), and recently built-in blocks like [script pic]. It turned out that many Snap! programs are actually faster than their Scratch counterparts, once you've learned about these features (which work on whole lists) and gained some experience using them. I understand that the development of many of these functions was actually driven by a desire to speed up Snap! program execution. Nowadays I sometimes even catch myself deliberately choosing a construction with a short execution time over a more logically understandable, elegant structure; that is perhaps a disadvantage of such features, although it can be compensated for with comments.

Mostly out of curiosity I would be interested to know, at least superficially, how these speed boosts were realized.

  • Is it because the JavaScript code implementing the Snap! function in question has been optimized to process a whole series of data at once?
  • Are higher-order JavaScript functions being called that are themselves more efficient at processing series of data?
  • Or, in some cases (I'm thinking especially of hyperblocks), is GPU hardware perhaps used instead of the CPU? (I don't know much about hardware, so I'm just speculating; Snap! of course doesn't work directly on the hardware layer, but maybe that's how it works out in practice.)

Thanks in advance for explaining!

BTW, are there any plans to further improve processing speed? (I'm asking this even though, IMO, updating the online help and the Reference Manual is a higher priority.)

I agree about the higher priority, but luckily (I guess luckily) I'm in charge of the documentation and Jens is in charge of the code, so those things can happen in parallel.

Yes, we too have found ourselves writing ugly Snap! code to make it run fast. My proposal about that is to have a "compiler" that compiles HOF code into hyperblock code whenever possible, or even just whenever easily possible. This is a very recent idea, so we haven't tried it so far. The nice thing about it is that, with metaprogramming, the compiler could be written in Snap!.
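
A rough sketch of what that rewrite could look like, in plain JavaScript rather than in Snap! blocks (the expression shape and helper names here are invented for illustration, not Snap!'s actual internals):

```js
// Toy "compiler" pass: if a MAP's ring contains nothing but one known
// primitive applied to the ring's own parameter, replace the whole MAP
// with a direct whole-list ("hyper") operation.
const HYPER_PRIMITIVES = {
  sqrt: xs => xs.map(Math.sqrt),
  neg:  xs => xs.map(x => -x)
};

function compileMap(expr) {
  if (expr.op === 'map' && HYPER_PRIMITIVES[expr.fn]) {
    return { op: 'hyper', run: () => HYPER_PRIMITIVES[expr.fn](expr.data) };
  }
  return expr;  // general lambdas stay with the ordinary interpreter
}

console.log(compileMap({ op: 'map', fn: 'sqrt', data: [1, 4, 9] }).run());
// -> [1, 2, 3]
```

The same kind of pattern matching could be expressed with Snap!'s metaprogramming blocks, walking a script's syntax tree and swapping in hyperized equivalents where they apply.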

I blush to admit that another thing that speeds up hyperblocks is that they convert linked lists to arrays before doing anything else. Then we get the advantage of JS's built-in array support, and we also don't have to keep checking at runtime which flavor of list we have. Of course this depends on using iterative rather than recursive code; for the latter you really need constant-time ALL BUT FIRST OF. That's part of the uglification cost. So it'd be great to compile tail-recursive code into iteration. Of course branched recursion, such as tree traversal, can't be turned into iteration without introducing an explicit stack. (Luckily that's not so hard with first-class continuations!)
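
A rough JavaScript sketch of that first trick (toy code, not Snap!'s actual internals): convert the linked list to an array once, then work iteratively with constant-time element access instead of repeatedly taking ALL BUT FIRST OF.

```js
// A Snap!-style linked list: { first, rest } pairs ending in null.
const cons = (first, rest) => ({ first, rest });

// Convert once up front; afterwards every element access is O(1) and
// there's no need to keep checking which flavor of list we're holding.
function asArray(list) {
  const out = [];
  for (let p = list; p !== null; p = p.rest) out.push(p.first);
  return out;
}

// Iterative sum over the array: one pass, no repeated ALL BUT FIRST OF,
// which on an array representation would cost O(n) per call.
function sum(list) {
  let total = 0;
  for (const x of asArray(list)) total += x;
  return total;
}

console.log(sum(cons(1, cons(2, cons(3, null)))));  // 6
```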

On the third hand, if you have a really enormous data set, you want to write your program using streams, so it doesn't all have to be in memory at once, and I think that's hard with arrays. :~)
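
For what it's worth, here's a toy JavaScript sketch of that streaming style, using generators so that only the elements actually requested ever exist in memory (just an illustration of the idea, not Snap!'s stream library):

```js
// An infinite stream of natural numbers, produced on demand.
function* naturals() {
  for (let n = 1; ; n++) yield n;
}

// Lazily transform a stream without materializing it.
function* mapStream(stream, fn) {
  for (const x of stream) yield fn(x);
}

// Take only the first `count` elements.
function* take(stream, count) {
  for (const x of stream) {
    if (count-- <= 0) return;
    yield x;
  }
}

// Only ten squares are ever computed or stored, despite the infinite source.
console.log([...take(mapStream(naturals(), x => x * x), 10)]);
// -> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```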

Thanks for explaining!
I agree a compiler would be great for execution speed (would that be JIT compilation?). I guess the interpreter will still be available for debugging, so one can enjoy the best of both worlds.

Hyperblocks are fast because of the halting problem.

Let me explain:

I'm slowing down Snap! execution - even drastically so - in various ways to

  1. support parallel animations (Snap!'s microworld is 2D cartoons) across a wide variety of hardware platforms with wildly differing performance specs, so I'm shooting for the least performant, slowest options, and to
  2. let learners follow along and mentally trace what's happening in their programs, so they can interact and intervene, e.g. in the case of infinite recursion, which otherwise would crash their environment (as it does in almost every other programming language).

Allowing users to manually stop a process gone awry gives them the security to experiment: you're not supposed to be able to make a "bad mistake" that leaves your environment "hanging" in such a way that you lose work. Also, I loathe the concept (often taught over here in Germany) that informatics is some kind of chess in which you are rewarded for thinking ahead around 10 corners, and punished for missing a detail. The bottom line is: I know how to make things fast; that's not even the hard part. It's how to make things reasonably slow and interactive that's interesting and surprisingly hard. First-class environments are not "slow" per se, but making the programming language that supports them interactive requires filling in hooks and interrupts all along the way.
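
To make the "hooks and interrupts" idea concrete, here's a toy JavaScript sketch of an interruptible evaluator (not Snap!'s actual scheduler): each script is a generator that yields between steps, so the browser stays responsive and a stop flag can halt even an infinite loop.

```js
// Each user script is modeled as a generator that yields between steps,
// giving the scheduler (and the red stop button) a chance to step in.
function* countForever() {
  let n = 0;
  while (true) {   // an infinite loop, on purpose
    n++;
    yield n;       // interrupt point: control returns to the scheduler
  }
}

function runInterruptible(proc, shouldStop, stepsPerSlice = 1000) {
  (function slice() {
    for (let i = 0; i < stepsPerSlice; i++) {
      if (shouldStop()) return;        // user pressed stop
      if (proc.next().done) return;    // script finished on its own
    }
    setTimeout(slice, 0);              // yield to the browser, then continue
  })();
}

let stopped = false;
runInterruptible(countForever(), () => stopped);
setTimeout(() => { stopped = true; }, 100);  // the "red stop button", 100 ms later
```

Every one of those yield points is overhead that a plain tight loop wouldn't pay, which is exactly the trade-off being described here.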

Being built around parallel animations makes Snap! great for introductory-level little games and simulations, but it also promises more than it delivers. Because it's easy to get started with a little interactive project, folks are led to believe that this style of interactive exploration easily scales up to any mathematical task, but it doesn't. As you play with recursion, e.g. by drawing fractals, the time it takes to accomplish something grows exponentially. And even if you don't use recursion but are "simply" crunching data the traditional imperative way, using loops and variables, it takes unexpectedly long, even though there is no visible animation involved. Scratch addresses these as special cases: Scratch scans the blocks inside a loop for visible side effects, and if there are none, it doesn't yield. But Scratch also doesn't support custom reporters and higher-order functions, so it can afford these special cases, which a more general-purpose programming language such as Snap! cannot.
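
Roughly, that special case could be modeled like this in JavaScript (a toy sketch under my own assumptions about block names, not Scratch's real implementation):

```js
// If the loop body contains no blocks with visible side effects,
// run it flat out; otherwise yield after every pass so the result
// can be seen on the stage.
const VISIBLE_BLOCKS = new Set(['move', 'say', 'switch costume', 'pen down']);

function runLoop(times, body, execute, yieldToRedraw) {
  const invisible = body.every(block => !VISIBLE_BLOCKS.has(block.op));
  for (let i = 0; i < times; i++) {
    body.forEach(execute);
    if (!invisible) yieldToRedraw();  // skipped entirely for "pure" loops
  }
}

// A loop that only changes a variable never yields:
runLoop(3, [{ op: 'change variable' }],
        block => console.log('run', block.op),
        () => console.log('yield'));
```

With custom reporters and rings inside the body, a static scan like this can no longer tell what the loop will actually do, which is the generality problem mentioned above.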

The problem with lambda is that it's often way too generic. It's super powerful and Turing-complete on its own, but it's also like taking a sledgehammer to crack a nut. In theory it's wonderful to compose your own higher-order functions, but in practice you pretty much only ever need / use three of them (map, filter, fold). And even with these you mostly don't need the generality of being able to pass in arbitrary "funargs", especially not in the case of "fold" (in practice you almost always just want to aggregate some basic numerical operator or concatenate text), and to a lesser degree also in the case of "map". Another detail that makes lambda unwieldy for scaling and parallelization is closures. It's wonderful - and one of my favorite things to demo - to be able to bootstrap your own object system from nothing but rings, but it also ends up stacking and nesting scopes in ways that make every single variable look-up cost more than in systems that don't support it. Keep in mind that Scratch doesn't even support any variable or function scopes, so no wonder it's "faster": it doesn't do anything interesting!
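
Here's a small JavaScript sketch of that look-up cost (again a toy model, not Snap!'s actual code): with nested lexical scopes, every variable reference may have to walk a chain of frames.

```js
// Look a variable up by walking the chain of nested scope frames.
function lookup(name, frame) {
  for (let f = frame; f !== null; f = f.parent) {
    if (name in f.vars) return f.vars[name];
  }
  throw new Error('variable not found: ' + name);
}

const globalFrame = { vars: { n: 10 },   parent: null };
const outerFrame  = { vars: { step: 2 }, parent: globalFrame };
const innerFrame  = { vars: { i: 0 },    parent: outerFrame };

console.log(lookup('n', innerFrame));  // 10, found after walking three frames
// A flat, scope-less model is a single object look-up: cheaper, but it
// can't express closures, script variables or rings at all.
```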

Don't get me wrong here. As I'm re-reading what I just wrote it might seem as if I'm suggesting that lambda, recursion and higher-order functions are somehow inferior to other abstractions when it comes to scaling up projects. That's totally not what I'm saying, though. In fact, I believe that lambda, recursion and higher-order functions are the closest thing there is to a Math Gospel! Everybody, especially every child, should be offered the opportunity to find out about the glorious universality of functions! But the beauty of functions lies in their expressivity, and it can be very hard to understand / trace what's actually happening. This is precisely what Snap! aims to address: we want to encourage learners to explore the awesomeness of functions knowing that their work is secure, and that they (the learners, the programmers) will always be in control, not "the computer" or "the language". That's what makes Snap! "slow"!

Hyperblocks are really nothing but a special case for abstracting away "map" for a finite catalog of concrete operations. In the "real" map you can put literally anything into the arg ring, including an infinitely recursive (buggy!) expression. That's why I need to make it so you can press the red stop button or click on the running script to halt it. Otherwise your system would crash, and if you hadn't saved before running it you'd lose your work, become frustrated with programming and decide to study business administration and produce bullshit on LinkedIn for a living :). In the case of hyperblocks, however, I know the exact "funarg" you're mapping over the data, so I can be (reasonably) confident that this operation - given non-circular data - will actually terminate and not go on forever. I know this not because I've solved the "halting problem", but because I wrote the thing that goes into the imaginary map-ring (e.g. the plus / times / join reporter), not you. Therefore, for these special cases of known and supported reporters I can simply compute the result directly, without having to support parallelism, closures and scopes. That's why you get what appears to be miraculous performance with hyperblocks.
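
As an illustration of "computing the result directly", here's a toy JavaScript version of what a hyperized + can boil down to; the real implementation covers many more cases, but the shape of the idea is the same:

```js
// Extend a known scalar primitive element-wise over (nested) lists,
// APL-style, with no user-supplied ring, no scheduler and no closures.
function hyper(fn) {
  const h = (a, b) => {
    if (Array.isArray(a) && Array.isArray(b)) return a.map((x, i) => h(x, b[i]));
    if (Array.isArray(a)) return a.map(x => h(x, b));
    if (Array.isArray(b)) return b.map(y => h(a, y));
    return fn(a, b);  // plain scalars: just apply the primitive
  };
  return h;
}

const add = hyper((x, y) => x + y);
console.log(add([1, 2, 3], 10));        // [11, 12, 13]
console.log(add([1, 2], [10, 20]));     // [11, 22]
console.log(add([[1, 2], [3, 4]], 1));  // [[2, 3], [4, 5]]
```

Because the primitive is known to terminate, the whole computation can run as one tight loop with no interrupt points at all.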

Yes, I've abstracted much of the underlying linear algebra into a set of generalized higher-order functions (e.g. "hyper()"), and we have explored various hardware accelerations in the past, but so far haven't made them available in production. I'm also exploring using the GPU for matrix multiplication, but it's still a long shot. Not so much because of the technology, but because of the pedagogy.

BTW: Here's a synopsis (in German) I've recently written up for a German computing journal comparing the different ways to compute n-grams in Snap! functionally, imperatively, and hyper-dimensionally:

addendum: No, hyperblocks weren't and aren't motivated by (hardware) performance characteristics. We love APL and (want to) teach linear algebra with it.

Thanks for your (@jens) explanation, too!

Another example of how a function may be implemented in many different ways, with varying characteristics:

(actually posts 3, 5, 6, 15, and 17)