Given that it takes a long time, I have a couple of ideas about making it faster.
Some things that you do using MAP and/or KEEP might be faster using the hyperblock feature, i.e., instead of
you can just say
But this doesn't work for =, alas, because it isn't hyperized.
I think there are quadratic-time parts of the algorithm, especially if there are many unique colors in the original. Some of those might be sped up by sorting the pixels (n log n time) before doing anything else. You're thinking that you can't do that because you need to know where in the costume each pixel is, but you can start by making a list in which each RGBA pixel becomes (LIST (rgba) (index)) and then sort on
Then rearrange the result into a list of (LIST (rgba) (list-of-indices)), and use that instead of UNIQUES.
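To make the sort-then-group idea concrete, here is a rough sketch in Python rather than Snap! blocks. The function name `group_pixels_by_color` is made up for illustration, pixels are assumed to be (r, g, b, a) tuples in costume order, and indices are 0-based here (Snap! lists are 1-based).

```python
from itertools import groupby

def group_pixels_by_color(pixels):
    """Return a list of (rgba, list_of_indices), one entry per distinct color.

    pixels: sequence of (r, g, b, a) tuples in costume order (assumed layout).
    """
    # Tag each pixel with its position, then sort by color: O(n log n).
    tagged = sorted((rgba, index) for index, rgba in enumerate(pixels))
    # Equal colors are now adjacent, so one linear pass groups them.
    return [
        (rgba, [index for _, index in group])
        for rgba, group in groupby(tagged, key=lambda pair: pair[0])
    ]

pixels = [
    (255, 0, 0, 255),
    (0, 0, 255, 255),
    (255, 0, 0, 255),
    (0, 0, 255, 255),
    (0, 255, 0, 255),
]
groups = group_pixels_by_color(pixels)
```

Each distinct color shows up once, together with every position it occupies, so later steps can loop over colors instead of rescanning the whole pixel list per color.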
If the overall algorithm is quadratic time, little constant-time speedups may not be worth worrying about, but there are an awful lot of bounds checks, if color<0 and if color>255, that could be avoided by making sure that doesn't happen in the first place.
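One way to avoid the per-value checks, sketched in Python: clamp the ends of the range once, so nothing inside the loop can fall outside 0..255. This assumes the checks guard a loop over channel values within some tolerance of a target, which is a guess about the project; `channel_range` and `tolerance` are hypothetical names.

```python
def channel_range(value, tolerance):
    """Channel values within `tolerance` of `value`, already limited to 0..255."""
    lo = max(0, value - tolerance)    # clamp the lower end once
    hi = min(255, value + tolerance)  # clamp the upper end once
    return range(lo, hi + 1)          # no bounds checks needed per value

near_white = list(channel_range(250, 10))
near_black = list(channel_range(3, 5))
```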
The problem was the fact that it wasn’t checking whether or not the color existed in the list, so it was calling the replace color function for all of the colors in the range.
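Roughly what that fix looks like, sketched in Python and not taken from the actual project. `pixels_by_color` is assumed to be a dictionary from color to pixel indices, and `replace_color` is a stand-in for the real replace function; both names are hypothetical.

```python
def replace_present_colors(candidates, pixels_by_color, replace_color):
    """Call replace_color only for candidate colors that occur in the image.

    Returns the number of calls made, so the saving is easy to see.
    """
    calls = 0
    for color in candidates:
        # Dictionary lookup is constant time on average, so the check is cheap.
        if color in pixels_by_color:
            replace_color(color, pixels_by_color[color])
            calls += 1
    return calls

pixels_by_color = {(255, 0, 0, 255): [0, 2], (0, 0, 255, 255): [1, 3]}
candidates = [(r, 0, 0, 255) for r in range(250, 256)]  # six colors in the range
replaced = []
calls = replace_present_colors(
    candidates, pixels_by_color, lambda color, indices: replaced.append(color)
)
```

Only one of the six candidate colors is in the image, so the replace function runs once instead of six times.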
Interestingly, I got roughly half your time last night, and roughly half your time again today with your new optimisations.
With my Windows machine I can tell you it does that on one core and only one core. I don't know if you could find that data on an iPad, but it probably only uses one core as well.
I'm using a 13th Gen Intel Core i5.
The one score that was different was Alonzo, which took ten seconds. Odd. In my mind, it's much simpler and much smaller, so it should have been much faster. Are you still calculating the whole stage? (A brief check tells me it should just be calculating the costume, hrmmm)
You may have misunderstood me, or I may have explained poorly. Last night I ran it before bed and it took 5 minutes, while your comment said it took a little over 11 minutes. When I woke up this morning you had optimized it to be a lot faster, but my times were consistently about half of your benchmarks, whatever the absolute speed.
672 (You) == 300 (Me) / 51.888 (You) == 26 (Me)
What phone did you run it on?
Samsung A70
PFP == 45 seconds
4333 == Error.
Uh, ok, nevermind lol. Snap barely runs on my phone let alone trying to render something. So I won't try the others lol. My PC does your PFP in 14 seconds.
Also, keep in mind that all the devices are probably only using a single core regardless. Snap! does support threading, but I dunno if you could actually get the PC to run the code on multiple cores.
These times would be ludicrously lower on even a dual-core setup, and that's even before we wander into the weeds that is the GPU.