I think I have an idea, but it is pretty complex to create properly.
it would convert a string like this
abcaacbcbbb
into
a1b1c1a2c1b1c1b3
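For what it's worth, here is a minimal sketch of that idea in Python (the function names are just mine, and it assumes the data itself contains no digits):

```python
def rle_compress(s):
    """'abcaacbcbbb' -> 'a1b1c1a2c1b1c1b3': each run becomes char + count."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append(s[i] + str(j - i))
        i = j
    return "".join(out)

def rle_decompress(s):
    """Inverse of rle_compress; counts are read greedily, so the original
    data must not contain digit characters."""
    out, i = [], 0
    while i < len(s):
        ch, i = s[i], i + 1
        n = ""
        while i < len(s) and s[i].isdigit():
            n, i = n + s[i], i + 1
        out.append(ch * int(n))
    return "".join(out)
```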
However, for list items, another idea I had was doing something like this:
abcaacbcbbb
would turn into
A, 1,4,5
B, 2,7,9,10,11
C, 3,6,8
This also fixes the repeated-item issue, but in a much smaller way.
I managed to create the last one; however, decompressing it is something I'm not sure about.
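If it helps, decompressing that positional format is mostly a matter of writing each symbol back at its positions. A rough Python sketch, assuming 1-based positions as in the example above:

```python
def positions_compress(s):
    """'abcaacbcbbb' -> {'a': [1, 4, 5], 'b': [2, 7, 9, 10, 11], 'c': [3, 6, 8]}"""
    table = {}
    for i, ch in enumerate(s, start=1):
        table.setdefault(ch, []).append(i)
    return table

def positions_decompress(table):
    """Rebuild the original string from the symbol -> positions table."""
    length = max(p for positions in table.values() for p in positions)
    out = [""] * length
    for ch, positions in table.items():
        for p in positions:
            out[p - 1] = ch
    return "".join(out)
```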
I do want to point this out. I did watch a video a while ago about how Super Mario Bros. compresses its levels, and it uses a combination of RLE and just the raw data. What I mean is, abcdeefffgabbccc
turns into 0abcd2e3f0ga2b3c
This solves the issue of a run of single values that aren't the same.
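If I'm reading that format right, a leading 0 marks a stretch of raw (non-repeating) characters that runs until the next digit, and any other digit is a repeat count for the character after it. A Python sketch of that reading (it assumes the data contains no digits and runs are at most 9 long):

```python
def smb_rle_encode(s):
    """'abcdeefffgabbccc' -> '0abcd2e3f0ga2b3c'."""
    out, literals, i = [], [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        run = j - i
        if run == 1:
            literals.append(s[i])          # single value: goes into a raw stretch
        else:
            if literals:
                out.append("0" + "".join(literals))
                literals = []
            out.append(str(run) + s[i])    # run of 2+ becomes '<count><char>'
        i = j
    if literals:
        out.append("0" + "".join(literals))
    return "".join(out)

def smb_rle_decode(s):
    out, i = [], 0
    while i < len(s):
        if s[i] == "0":                    # raw stretch runs until the next digit
            i += 1
            while i < len(s) and not s[i].isdigit():
                out.append(s[i])
                i += 1
        else:                              # '<count><char>'
            out.append(s[i + 1] * int(s[i]))
            i += 2
    return "".join(out)
```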
Different picture file formats use different algorithms, but yes, the good ones have a variety of algorithms on tap. But in order to know which is best for a particular picture, the meta-algorithm has to actually run all of them and compare result sizes. So that's not going to be infinitely fast! Decompressing is easier because the file tells the software what algorithm was used to compress this part.
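For illustration, that "run them all, keep the smallest" step can be mimicked with off-the-shelf compressors. A toy Python sketch, where the tag prefix plays the role of the file telling the decompressor which algorithm was used for this part:

```python
import zlib, bz2, lzma

def best_compress(data: bytes) -> bytes:
    """Try several compressors and keep whichever output is smallest."""
    candidates = {
        b"raw":  data,                     # sometimes not compressing wins
        b"zlib": zlib.compress(data, 9),
        b"bz2":  bz2.compress(data, 9),
        b"lzma": lzma.compress(data),
    }
    tag, blob = min(candidates.items(), key=lambda kv: len(kv[1]))
    return tag + b":" + blob

def best_decompress(blob: bytes) -> bytes:
    """The stored tag says which algorithm to undo."""
    tag, payload = blob.split(b":", 1)
    return {b"raw":  lambda d: d,
            b"zlib": zlib.decompress,
            b"bz2":  bz2.decompress,
            b"lzma": lzma.decompress}[tag](payload)
```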
I guess such a meta-algorithm, when handling very large objects, would employ sampling, don't you think? Besides, storage efficiency (rather than runtime speed) would be the principal consideration.
I do not share the project, as cymplecy wants to encourage younger Snappers...
ABCD - RGBA
E - chunk length
F - chunk start, for further experiments with RGBA groups
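In code terms, my reading of that layout (just a sketch, not the actual project; the names are made up) would be building one record per run of identical pixels:

```python
def to_chunks(pixels):
    """Group a flat list of (R, G, B, A) pixels into runs.
    Each chunk records the color (ABCD), the run length (E) and the start index (F)."""
    chunks, i = [], 0
    while i < len(pixels):
        j = i
        while j < len(pixels) and pixels[j] == pixels[i]:
            j += 1
        chunks.append((*pixels[i], j - i, i))   # R, G, B, A, length, start
        i = j
    return chunks
```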
Yes, but I think not the way you mean. Pretty sure the png meta-algorithm uses different algorithms for different chunks of picture, so, e.g., it could handle a cartoon character on a photo background efficiently. And I'm pretty sure that means it looks at the entire picture before compressing anything.
But png isn't lossless, I think. It has a quality dial, which has to mean how precisely the compressed version captures every pixel. So cartoons, compressed by RLE, will be perfect regardless of the dial setting, but not photos. As opposed to gzip, which is lossless, but has a dial for speed vs. efficiency of compression.
Png is usually lossless. I'm pretty sure the dial you're talking about is the amount of compression to do, or the bit depth (which usually doesn't change the quality, unless you choose 4- or 2-bit depth). Now, I said it's usually lossless, because there is a png mode that has a color palette that maps onto the pixels. This palette is not infinite, so you do need to sacrifice quality to use it. But for normal use, png is lossless, and it's only lossy if you tweak specific settings to make it lossy.
Implementing Huffman coding is far from trivial because of its variable code length, not corresponding to (multiples of) byte size.
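For anyone curious, a rough Python sketch of why that hurts: building the codes is short, but the codes are arbitrary-length bit strings, so you still have to pack them into bytes and remember the padding. This is only a sketch (no tree serialization, no decoder):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman codebook: symbol -> bit string (e.g. 'a' -> '10')."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # heap entries: [weight, tiebreak, node]; a node is a symbol or a (left, right) pair
    heap = [[w, i, sym] for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        heapq.heappush(heap, [lo[0] + hi[0], tiebreak, (lo[2], hi[2])])
        tiebreak += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a symbol
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

def pack_bits(bits):
    """The awkward part: pad the bit string to whole bytes and keep the pad size."""
    pad = (8 - len(bits) % 8) % 8
    bits += "0" * pad
    return pad, bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

# usage: codes = huffman_codes("abcaacbcbbb")
#        pad, blob = pack_bits("".join(codes[c] for c in "abcaacbcbbb"))
```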
With a single constraint (fitting cartoon-type images): max. 256 different RGBAs … a mix of RLE and one other simple compression method will achieve very similar results.
With regions grouped there are 75*4 {RGBA} + 2049 {chunks} * 2 {chunk start, length} => 4398 numbers.
Of course, this metric has a serious flaw. JS numbers are 64 bits long, so without variable-length encoding the effective size (4398 * 8 bytes ≈ 35 kB) will be ridiculous compared to the PNG's 4.7 kB, or the WEBP's 3.7 kB.
To test the compression ratio, a text representation may be used. Here are the results:
Pixels (rest of the results listing not preserved)
Strangely enough, a plain, byte-sized, uncompressed image buffer has ~ 4*10K = 40 kB, so even this not-so-effective encoding offers real benefits.
Given that the CSV is redundant, you might do better to encode the run lengths into decimal digits! :~)
We still talking about the Alonzo costume? It has, what, four colors? The dark border, the yellow, black, and white. If you make a palette a la GIF format, you can reduce the 32-bit color of each pixel to two bits. That's already, what, a 16-fold compression! Cartoons are easy.
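As a rough sketch of that arithmetic (names and packing are mine, not GIF's actual on-disk format): with at most four palette entries each pixel index fits in 2 bits, so four 32-bit pixels pack into a single byte instead of sixteen:

```python
def palettize(pixels):
    """Map RGBA pixels to small palette indices, GIF-style.
    Assumes at most 4 distinct colors, so each index fits in 2 bits."""
    palette, indices = [], []
    for p in pixels:
        if p not in palette:
            palette.append(p)
        indices.append(palette.index(p))
    packed = bytearray()
    for i in range(0, len(indices), 4):           # four 2-bit indices per byte
        group = indices[i:i + 4] + [0] * (4 - len(indices[i:i + 4]))
        packed.append(group[0] << 6 | group[1] << 4 | group[2] << 2 | group[3])
    return palette, bytes(packed)
```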