How do I optimize this merge sort I made?

polymations · February 6, 2024, 1:28pm

I know the merge block Itself works, and it's fast, it's just the sorting algorithm i made with it that is slow.
It's apparently acting like O(N^2) and takes a few seconds to organize a 100000 items list. It's not as slow as selection sort and that stuff though.

And just in case:

qw23 · February 6, 2024, 3:13pm

A small adaptation that will make your code slightly faster (though IDK if it's noticeable), and shorter, is using the predicate. Also, instead of deleting the first item of a list, you could raise the index by 1.

Instead of keep items you could use something like , it's much faster.

Anyway, a more elegant way of writing code (though not necessarily faster at execution time) is with these blocks:

polymations · February 6, 2024, 3:15pm

I've decided. Do you think hybridizing it with selection for items under 10 or 5 length will optimize it?

polymations · February 6, 2024, 3:17pm

I used the wikipedia pseudocode to make this.

qw23 · February 6, 2024, 3:28pm

It might. If you want to test how fast an algorithm is, try a block such as this one I adopted from @dardoro:

Programming tools DEVLAB script pic

(if the block to be tested is a reporter, use e.g. ignore (from the Iteration library) as a connector)

There are many pseudocode versions of merge sort at the relevant Wikipedia page.

For inspiration, here's one with lists (an adaptation from code by @bh, I think):

mergesort script pic (3)

mergesort script pic (4)

The comparator may be any predicate with 2 inputs. Using a variable comparator makes for somehwat slower execution BTW. But flexibility matters, too.

polymations · February 6, 2024, 3:34pm

Made my own ignore and duration block.

Definitely O(n log(n)), but slow O(n log(n)).
I'll optimize it using selection sort.

qw23 · February 6, 2024, 3:38pm

Well done!

polymations · February 6, 2024, 3:44pm

Suprisingly, it performed worse! But the selection case is for 5 or less items.

polymations · February 6, 2024, 3:50pm

qw23 · February 6, 2024, 3:56pm

If you really really want optimum speed for some application, other than experimenting with sorting algorithms as a goal for its own sake (which is interesting enough, IMO), Snap!'s built-in block is very hard to beat.
In my experience there are two cases though where, for fastest processing, you'd want to write your own code (building on top of Snap!'s built-in sort):

If you're only looking for the top x% of a list's items ("partial sorting");
If you want to sort by a different key than by the item itself, in ascending order.

polymations · February 6, 2024, 4:59pm

geokid · February 6, 2024, 5:56pm

Whoa!

When was this feature added?

polymations · February 6, 2024, 7:29pm

before i joined the snap forums

bh · February 6, 2024, 7:49pm

That last REPORT at the end of QUICK SORT will never happen (because the IF always reports), which is a good thing, because a recursive call with the same input (rather than a smaller input) would infinite loop!

I may be misremembering, especially if Wikipedia is telling you to use selection sort, but my impression was that the official quicksort algorithm uses insertion sort for small input lists.

The fastest (but ugliest) way to sort small lists is with explicit hard-coded comparisons, e.g.,

I think that's right, but I wouldn't swear to it. You'd have a similar procedure for lengths 0, 1, 2 (you already have those), 3 (above), 4, and 5. The one for length 3 does at most three comparisons, two if you're lucky. Note that there's no recursion in these small-list procedures, so if you have one for a length-five list, you'll never use the length-three procedure. You need the ones for lengths zero to four only in the case in which the original toplevel list is strictly smaller than length five.

polymations · February 6, 2024, 8:01pm

What is that

polymations · February 6, 2024, 8:01pm

I meant for Merge Sort.

Good thing to know, because i thought it would return "Reporter wouldn't Report"

bh · February 6, 2024, 9:34pm

I know, but (I should have said) Quicksort is a specific algorithm, for which someone got a PhD way back when, and it includes a whole bunch of optimizations, one of which is the one about using a simpler sort for small sublists. And I think it uses insertion sort, for a reason having to do with the fact that it performs better when the input data are already almost sorted. There isn't an official sorting algorithm for small sublists with mergesort because there isn't an official mergesort algorithm altogether. It's not that big a deal.

That would be if the script ended without doing a REPORT at all. You're expecting the computer to be really smart, to look at the input expression in the REPORT block and say "hey, wait a minute, you can't do a recursive call with the same input you were given." Computers aren't that smart. (People aren't that smart, either; in general it's theoretically impossible to predict infinite loops. You can find simple cases, but even in your example, the input expression is L (insert rant about sans serif fonts here) (insert rant about case-sensitive symbols here), not LIST, in the recursive call at the end. So it'd have to prove that L=LIST to know that this is an infinite recursion.

For all its wonderfulness, Wikipedia isn't going to teach you the habit of functional programming. If you apply DELETE to items in your input list, the caller of your sort function will be surprised that the original is gone. That is, the contract a reporter has with the user is that it doesn't change anything; it just reports a value. So if the user says SET [SORTED] TO (MERGE SORT (INCOMING)) they should be able to assume that they have two lists, INCOMING and SORTED, the first of which is still in its original order.

system · March 7, 2024, 9:34pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.