Streams library 2.0 development part 2 (Part 1)

jens · July 10, 2024, 8:29am

I've deployed your new streams library to dev yesterday, thak you! In order to see new versions of a library you need to explicitly clear your browser cache.

qw23 · July 10, 2024, 8:47am

I tried a different computer and it works now, thanks!

@bh : strange as it may seem (at least to myself), now that it has been deployed I am somehow more inclined to look at it with the eyes of a novice - and think it could do with a tutorial block at the top of the palette. If you agree I'll try to write create one later this month, integrating the contents of demo block.

bh · July 10, 2024, 4:07pm

Interesting idea. I wonder if we should do that for some other libraries.

pajamaclaws21 · July 11, 2024, 11:29pm

MQTT has a tutorial/example block (or had it at one point) and I find it super helpful when I occasionally forget what a block in the library does or how it can be used. It's very useful.

bh · July 12, 2024, 4:20am

Thanks. Yeah, the streams library already has one like that, with examples of using the library. What @qw23 is suggesting is something more tutorial than that, with a relative lot of text explaining what streams are. (The MQTT one doesn't do that; it doesn't even say what "MQTT" is an acronym for.) It'd be a lot of work to do a good job of it. (Actually if I were going to try it, I'd start by just copy-pasting the relevant section from SICP.)

bh · August 1, 2024, 4:16am

New streams library featured in Snap!shot 2024 "What's new in v10" keynote.

https://people.eecs.berkeley.edu/~bh/streams.mpg

qw23 · August 1, 2024, 8:20pm

Nice presentation! Much better than I would have done

bh · August 2, 2024, 1:06am

Thanks! :~)

bh · October 29, 2024, 2:48am

From a discussion on github:

But, whoa, on the way of doing this testing, I found a performance bug in your imperative looping when using your "repeat" and "for" blocks (all of them) that you use in an imperative implementation for your "list ___ items of stream" and "item ___ of stream" blocks that takes most of the time to loop for a thousand times (over 10 seconds). Rewriting these to using recursive loops results in about a twenty times speed up so a hundred thousand numbers in about thirteen seconds. This still isn't fast as it is still about five hundred times slower than straight JavaScript, but at least it makes streams somewhat usable...

Could you please look into this?

If you want to see the rest of the conversation, it's at

github.com/jmoenig/Snap

Streams library "sieve" block producing primes is NOT a Sieve of Eratosthenes...

opened 06:00PM - 27 Oct 24 UTC

GordonBGood

In the help notes for this "sieve" block included in the demos for the Streams l…ibrary, you say "It's called SIEVE because the algorithm it uses is the Sieve of Eratosthenes: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes ", yet when one reads the linked Wikipedia article (at least as updated over the last 15 years or so), one finds the following statement: "The widely known 1975 [functional](https://en.wikipedia.org/wiki/Functional_programming) sieve code by [David Turner](https://en.wikipedia.org/wiki/David_Turner_(computer_scientist))[[13]](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#cite_note-13) is often presented as an example of the sieve of Eratosthenes[[7]](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#cite_note-Runciman-7) but is actually a sub-optimal trial division sieve.[[2]](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#cite_note-ONeill-2)". That David Turner sieve is exactly what you have implemented, as expressed in Haskell as follows: ```haskell primes = sieve [2..] sieve (p : xs) = p : sieve [x | x <− xs, x `mod` p > 0] ``` or more as you have written it in Snap! (using Haskell's "filter" instead of the equivalent Snap! "keep"): ```haskell primes = sieve [2..] sieve (p : xs) = p : sieve (filter (\ c -> c `mod` p > 0) xs) ``` This has much worse performance than a real functional "incremental" Sieve of Eratosthenes (SoE) so as to be almost unusable for your Snap! version, taking over 10 seconds on a fairly high end desktop computer to find the primes just to a thousand; it is slow because rather than using SoE sieving by only incrementing the culled composites per base prime by addition of a constant span, it uses trial division (via the "mod" function) to test ALL remaining prime candidates for even division by all found primes - thus, it has an exponential computational complexity rather than the O(n log n (log log n)) computational complexity for the true incremental SoE. Using the work [from the Wikipedia article referenced article](http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf), one could write a true incremental SoE using a Priority Queue as O'Neill preferred but that would require the extra work of implementing the Priority Queue using either a binary tree structure or using Snap!'s List's in imperative mode (not functional linked-list mode). In the Epilogue of the article, reference is made to a "Richard Bird list-based version" of the SoE, which is much easier to implement. However, at that time, this sieve still hadn't been optimized to use "infinite tree folding" to reduce the complexity of the base primes composites merging to a computational complexity of about O(n log n) rather than O(n sqrt n) without the tree folding, which is a very significant difference as prime ranges get larger. A refined version of this sieve (sieves odds-only) in Haskell is as follows: ```haskell primes = 2 : oddprimes() where merge xs@(x : xtl) ys@(y : ytl) = case compare x y of EQ -> x : merge xtl ytl GT -> y : merge xs ytl LT -> x : merge xtl ys composites ((hd : tl) : rest) = hd : merge tl (composites $ pairs rest) where pairs (f : s : rest) = merge f s : pairs rest testprmfrm n cs@(c : rest) = if n >= c then testprmfrm (n + 2) rest else n : testprmfrm (n + 2) cs oddprimes() = (3 :) $ testprmfrm 5 $ composites $ map (\ bp -> [bp * bp, bp * bp + bp + bp .. ]) $ oddprimes() ``` In contrast to the David Turner sieve, instead of progressively sieving composite numbers from the input list by filtering the entire remaining list, this tree folding algorithm rather builds a merged (and therefore ascending ordered) lazy sequence of all of the composite values based on the recursively determined "oddprimes" function (non-sharing) and then the final output sequence of prime numbers is all of the (in this case odd) numbers that aren't in the composites sequence. It is relatively easy to arrange the merged sequence of composites in ascending order because each of the sub lists of culls from each of the base primes is in order but also because each new sub sequence based on the next base prime will start at a new increased value from the last (starting at the square of the base prime). For this fully functional "incremental" algorithm, there is a cost in merging for all of the sub sequences, but the binary tree depth is limited: for instance, when finding the primes up to a million, the base primes sequence only goes up to the square root or a thousand, and there are only 167 odd primes up to a thousand, meaning that the binary tree only has a depth of about eight (two to the power of eight is 256) in the worst case (some merges won't need to traverse to the bottom of the tree) Now, as all of these sequence states don't refer back to previous states, there is no need to memoize the states as in a "lazy list" or as in your "streams" library, and a simpler Co-Inductive Stream (CIS) can be used, as follows in Haskell: ```haskell data CIS a = CIS !a !(() -> CIS a) primes = CIS 2 $! oddprimes where merge xs@(CIS x xtl) ys@(CIS y ytl) = case compare x y of EQ -> CIS x $! \ () -> merge (xtl()) (ytl()) GT -> CIS y $! \ () -> merge xs (ytl()) LT -> CIS x $! \ () -> merge (xtl()) ys composites (CIS (CIS hd tl) rest) = CIS hd $! \ () -> merge (tl()) (composites $! pairs (rest())) where pairs (CIS f ftl) = let (CIS s rest) = ftl() in CIS (merge f s) $! \ () -> pairs (rest()) allmults (CIS bp bptl) = CIS (cullfrmstp (bp * bp) (bp + bp)) $! \ () -> allmults (bptl()) where cullfrmstp c adv = CIS c $! \ () -> cullfrmstp (c + adv) adv testprmfrm n cs@(CIS c rest) = if n >= c then testprmfrm (n + 2) (rest()) else CIS n $! \ () -> testprmfrm (n + 2) cs oddprimes() = CIS 3 $! \ () -> testprmfrm 5 $! composites $! allmults $! oddprimes() ``` The above Haskell code is written so as to better understand as this "CIS" implementation is written in non-lazy languages as it doesn't take advantage of Haskell's built-in laziness and bypasses the compiler "strictness analyzer" so may be less efficient that the lazy list version above in spite of not doing the extra work to do memoization of each sequence value... The above has been translated to Snap! [as per this attached file](https://github.com/user-attachments/files/17534667/TreeFoldingPrimes.blocks.zip) and can be tested with "listncis 168 primes" to show the list of the primes up to a thousand in about a second, and works in a little over ten seconds showing primes up to about ten thousand; it thus has more of a linear execution time with increasing ranges... I recommend you fix your Streams demo to use an implementation of this algorithm, although as noted above, the algorithm doesn't really required the overhead of the Streams memoization...

qw23 · October 30, 2024, 8:52am

I’m not entirely sure what you’re asking me to look into, as Gordon ‘s comments cover a lot of ground. Here’s what I found so far:

I agree that ringify & call consumes relatively much runtime - I replicated Gordon’s testLoop blocks, and the second one needs 6 times more runtime on my iPad.
I also agree with Gordon that the library’s current sieve code is inefficient - while agreeing with you that, from a pedagogical point of view, simpler is better.
As for implementing a recursive list [number] items of [stream] (which I suspect is what you’re aiming at), I’ve run some tests, with two different recursive versions. For relatively small numbers (say, < 1,000) the recursive versions are considerably faster than the library’s repeat-based version. With larger numbers (say, > ~~100,000~~ 20,000) I’ve encountered stability issues - it feels like a deja vu, I need to look up our discussions earlier in this log.

Edit
Hmm … that didn't bring me any farther.

I asked Gordon in Github if he would present his proposal for a recursive version of e.g. list [number] items of [stream] … and @jens asked him to move the discussion here.
Let’s see if Gordon can help us speed up Streams operations within Snap!

bh · October 30, 2024, 2:56pm

Yeah, sounds great.

gordonbgoodnew · October 30, 2024, 4:05pm

@qw23,

I’m not entirely sure what you’re asking me to look into...

I'm not sure what you are interested in doing either; The issue that I filed isn't a request for a new implementation of the primes sieving algorithm, but rather registers an objection that the current implementation states that it is a Sieve of Eratosthenes, which it isn't, and shows the better performance of an alternate true SoE.

If you are interested in adding better primes sieve's, the following is an expression of the algorithms in Haskell syntax:

The current primes sieve is a version of a Trial Division sieve and worse than that, in the interests of making the very shortest functional sieve, it is incredibly inefficient with `O(n ^ 2) computational complexity, also requiring huge amounts of heap memory in building a heap structure of nested closures with one for each prime up to the range used.

Both problems can be greatly reduced at a slight cost in increased code complexity by O'Neill's suggested alternative, although it is still a Trial Division sieve and not a SoE, expressed in Haskell as follows:

primes = 2 : [x | x <− [3..], isprime x] where
  isprime x = all (\p −> x ‘mod‘ p > 0) (factorsToTry x) where
    factorsToTry x = takeWhile (\p −> p*p <= x) primes

with greatly reduced computational complexity of about O(n ^ 1.5) and heap storage for nested closures also greatly reduced by about the same amount.

The above Trial Division sieve could be somewhat simplified and sped up by about a constant factor by only sieving the odd primes, as follows:

primes = 2 : oddprimes where
  oddprimes = 3 : [ x <- [ 5, 7 .. ], isprime x ] where
    isprime x = all (\ bp -> x `mod` bp <= x) (factorsToTry x) where
      factorsToTry x = takeWhile (\ p -> p * p <= x) oddprimes

Now this requires an infinite-stream-by-step block (which I think you already have), a take-stream-while-predicate block, and an all-predicate-for-stream block (which must be limited to non-infinite streams in order to return), but otherwise would be easy enough to implement in Snap!.

If you want to introduce your users to about the simplest true functional SoE trying to re-use the stream blocks (such as merge - non-duplicate-eliminating, map, and numbers-from-step) that already exist, you could implement the Richard Bird sieve, which in Haskell is as follows:

primes = 2 : oddprimes
  oddprimes = (3 :) $ minus 5 $ combine $ map (\ bp -> [ bp * bp, bp * bp + bp + bp .. ]) oddprimes where
    minus n cs@(c : cstl) =
      if n < cs then n : minus (n + 2) cs
      else minus (n + 2) (adv cstl) where
        adv ncs@(hd : tl) = if hd > n then ncs else adv tl
   combine ((c : ctl) : cstl) = c : merge ctl (combine cstl)

Note that this re-uses your merge block, which doesn't eliminate duplicates in the merged streams.

The above will be perhaps slightly faster than the better TD sieve only because it doesn't use division (implied by modulo) but has the same O(n ^ 1.5) complexity, although not eliminating duplicate composites in the merge function will cost something. If one wants O(n log n) complexity, one adds one extra pairs function to the above combine block to get infinite tree folding, as follows:

primes = 2 : oddprimes
  oddprimes = (3 :) $ minus 5 $ combine $ map (\ bp -> [ bp * bp, bp * bp + bp + bp .. ]) oddprimes where
    minus n cs@(c : cstl) =
      if n < cs then n : minus (n + 2) cs
      else minus (n + 2) (adv cstl) where
        adv ncs@(hd : tl) = if hd > n then ncs else adv tl
    combine ((c : ctl) : cstl) = c : merge ctl (combine $ pairs cstl) where
      pairs (cs0 : cs1 cstl) = merge cs0 cs1 : pairs cstl

The way it works isn't really that complicated although it takes a few lines of code to implement: In words - the odd primes are primes starting at 3 (base case to avoid a data race) followed by all the numbers starting at five minus the stream of composite numbers, which are formed by the ordered combining of all the composite streams resulting from the base odd primes (recursive). The extra refinement of infinite tree folding is implemented by merging pairs of the composite number streams, so (since no pair is taken for the head), the threes stream, merged with the merged pair of the fives and sevens stream, merged with the merged pair of pairs of the elevens/thirteens/seventeens/nineteens streams, etc. Note that this implementation automatically only uses enough base primes up to the square root of the currently processed value; it could be improved slightly for storage requirements when the output stream is consumed as it is used by using a "non-sharing" recursion of the oddprimes stream (ie. using oddprimes as a function to get a new separate base primes stream feed).

The above is likely the simplest true incremental (functional) SoE that one can write, and in some ways is no harder to write than the better TD sieve as no extra takeWhile and all blocks need be defined. This last SoE sieve may allow Snap! to find primes up to a million in a reasonable time, and in fact doesn't absolutely require the memoizing Streams library but could be written to use a non-memoizing stream for an extra constant factor better speed...

An alternate to the entirely stream based Richard Bird sieve or the tree folding sieve is an implementation using Priority Queue's, but that would be a little more complex as one would first have to essentially create a Binary Heap Tree based Priority Queue library (not that hard but extra work), with a priority queue version generally a little faster than the tree folding version by a constant factor due to not having functions nested quite so deeply (also avoiding extra use of the memoized streams if that is what is used). I suppose that providing a functional priority queue library would add to the value of Snap! as a teaching aid in providing an implementation of a frequently useful data structure...

bh · October 30, 2024, 4:22pm

I think the issue to be discussed here isn't about primes, but about your proposal to replace some iterative stream library procedures with recursive versions.

qw23 · October 30, 2024, 4:24pm

Hello @gordonbgoodnew, welcome to the Snap! forum!

I’m not primarily interested in improving the library’s sieve algorithm - that’s just an demonstration of what you can do with streams, and actually @bh ‘s hobby horse - but mostly in any suggestions for reducing runtime a/o boosting stability of “infrastructural” blocks, like list [number] items of [stream]. I suspect you devised and tested a recursive solution - if so, will you share it so I can try it out? Thx in advance!

BTW I don’t have any working command of Haskell.

bh · October 30, 2024, 8:17pm

What format works for posting blocks in this forum? I tried posting images, but the forum doesn't seem to allow that?

Was this the post you said on github you deleted by mistake? I could recover it if you want.

Brand-new forum users can't post pictures; this is an anti-porn-spam feature. Mostly by people's second day on the forum they're out of that category; the criterion is something like how many messages you've read. (I can never remember exactly.) It might take you longer if you read very selectively. See postscript below.

The best way to post a picture of a script, from the block editor or from the scripting area, is to right-click on the script and choose "script pic" from the context menu. This makes a .png that includes the script's xml form in the metadata, so if you drag it into a forum tab it appears as a picture, but if you drag in into a Snap! tab the actual script is loaded (unless you're in the Costumes tab).

PS After posting this I suddenly realized that I can set your trust level manually. You can now post images.

gordonbgoodnew · October 30, 2024, 9:21pm

@bh,

Never mind, I'll reconstruct it following:

Yes, that's what I have been doing, although I hadn't reallized that the metadata also contained the xml form - neat!

Thank you for that; I'll now post what I intended to post here:

@Qw23,

I think the issue to be discussed here isn't about primes, but about your proposal to replace some iterative stream library procedures with recursive versions.

In thinking about this problem, I am wondering whether the imperative loops are so slow because they are doing constant conversion between imperative and linked-list versions of the list values used. So in testing I bypassed the use of the imperative loop blocks entirely by using functional recursive "loops"...

any suggestions for reducing runtime a/o boosting stability of “infrastructural” blocks, like list [number] items of [stream] . I suspect you devised and tested a recursive solution - if so, will you share it so I can try it out?

Sure, but as my needs were only for an infinite stream version, I haven't done much testing for a stream that might terminate with an empty stream. Thus, the following isn't completely tested:

BTW I don’t have any working command of Haskell.

Ah, sorry - I assumed anyone working on functional algorithms such as streams would also know Haskell; I'll only post Snap! images of blocks then as I don't care for LISP and exported blocks in .xml format are so ugly...

Ah, I discovered Cloud sharing and publishing...

The recursive list-n-items-of-stream block, along with the internal block it uses, also including a new list reverse block and its internal block as follows:

The recursive item-n-of-stream block, as follows:

These seem to be about twenty times faster than the current stream blocks of the same function...

qw23 · October 30, 2024, 11:31pm

Thank you for sharing your code (and apparently you picked up how to use the forum pretty quickly!).

It’s bedtime in my part of the world, so I’m going to fiddle with it later - for now I’lm just informing you about the existence of a fast multi-purpose Snap! primitive block you may want to employ reversing a list:

.

BTW this picture was made using the result pic pop-up menu option from the editor.

gordonbgoodnew · October 30, 2024, 11:39pm

@qw23,

Thanks for your information on the reverse primitive! There are no doubt all sorts of primitives I am not using...

Anyway, while this method of reversing the list would have saved me the work of creating a custom block, it won't change the time perceptibly in this case as it is just the clean up reporting after the recursive loop...

dardoro · October 31, 2024, 12:13am

Imperative loops are not inherently slow. They just yield at every iteration with a 60 times/s rate.
Without yield ...

Streams library is targeting multiple goals at once - FIFO buffer, async producer-consumer abstraction, lazy dynamic list. Unconditional "warp" may hinder some of those goals.

gordonbgoodnew · October 31, 2024, 5:07pm

@dardoro,

Imperative loops are not inherently slow. They just yield at every iteration with a 60 times/s rate.

Ah, I understand about imperative loops. So recursive loops don't "yield" at the animation frame rate?

I understand that "yield" points are necessary to make Snap! have interactive "liveliness"...

Streams library is targeting multiple goals at once - FIFO buffer, async producer-consumer abstraction, lazy dynamic list. Unconditional "warp" may hinder some of those goals.

You guys have done some incredible work in making it possible by using Snap! to do functional programming essentially writing LISP scripts without doing any typing other than for block, names, labels, variable names, numbers, and strings, and have extended Scratch to be actually an interesting language to me (as a primarily functional programmer). However, one might wonder what your target audience might be, as you have moved well beyond the children who are the target audience for Scratch (other than for child prodigies), and most children who would be ready for the more advanced concepts would move on to a language that involves some typing, at which they have become adapt by that time, by moving to Python (SHUDDER...) or something like Elm or maybe Fable (Professor Christopher Anand of McMaster University uses Elm to teach computer programming using SVG graphics to children above 12 years old through an Outreach extension program).

But in spite of its limitations, getting back into Snap! has been somewhat of a blast in exploring those limitations, and I think that the biggest program I might write is to use Legendre's inclusion/exclusion principle to count the number of primes to a billion, a task that took Meissel something like ten years of hand calculation in the 1870's! In order to get it to run this problem in an acceptable amount of time before my Google Chrome browser crashes (about 30 seconds to a minute), I may need to resort to manipulating imperative lists-used-as-arrays but otherwise use recursive loops in order to avoid the problem you've outlined with imperative loops, as I don't want to freeze the tab for the many seconds it may take to compute by using "warp"...

If I need to use partial sieving in order to make this fast enough, I'll also need some fairly big arrays, as in about three or four list/arrays of about 32623 number elements each, but from what I see that shouldn't be a problem...

I see in the documentation that you have addressed Tail Call Optimization, but you don't mention it in the manual and especially the example using a recursive loop to produce a list of even numbers doesn't have the recursive call in Tail Call Position (the recursive call returns a new list that needs to be returned to be passed into the in-front-of list composition function/block, so therefore not in Tail Call Position). For large lists, there are ways to handle this including using Continuation Passing Style or Trampolining (although ugly) and moving the result list to be used as a recursive loop parameter that then would need to be reversed on completion, but the manual doesn't mention whether there is a potential problem or not or the solutions if there is...

I find that the Snap! language feels a bit dated, even though it succeeds in not having to deal with matching LISP parenthesis with the graphics building process, but I suppose that isn't so surprising as many of the contributors seem to have a LISP bias. Just saying...