I think I have an idea, but it is pretty complex to create properly.
it would convert a string like this
abcaacbcbbb
into
a1b1c1a2c1b1c1b3
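For what it's worth, here is a minimal sketch of that idea in Python (the function names are just mine, and it assumes the data itself contains no digits):

```python
def rle_compress(s):
    """'abcaacbcbbb' -> 'a1b1c1a2c1b1c1b3': each run becomes char + count."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append(s[i] + str(j - i))
        i = j
    return "".join(out)

def rle_decompress(s):
    """Inverse of rle_compress; counts are read greedily, so the original
    data must not contain digit characters."""
    out, i = [], 0
    while i < len(s):
        ch, i = s[i], i + 1
        n = ""
        while i < len(s) and s[i].isdigit():
            n, i = n + s[i], i + 1
        out.append(ch * int(n))
    return "".join(out)
```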
However, for list items, another idea I had was doing something like this:
abcaacbcbbb
would turn into
A, 1,4,5
B, 2,7,9,10,11
C, 3,6,8
This also fixes the repeated-item issue, but in a much smaller way.
I managed to create the last one; however, decompressing it is something I'm not sure about.
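If it helps, decompressing that positional format is mostly a matter of writing each symbol back at its positions. A rough Python sketch, assuming 1-based positions as in the example above:

```python
def positions_compress(s):
    """'abcaacbcbbb' -> {'a': [1, 4, 5], 'b': [2, 7, 9, 10, 11], 'c': [3, 6, 8]}"""
    table = {}
    for i, ch in enumerate(s, start=1):
        table.setdefault(ch, []).append(i)
    return table

def positions_decompress(table):
    """Rebuild the original string from the symbol -> positions table."""
    length = max(p for positions in table.values() for p in positions)
    out = [""] * length
    for ch, positions in table.items():
        for p in positions:
            out[p - 1] = ch
    return "".join(out)
```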
I do want to point this out. I did watch a video a while ago about how Super Mario Bros. compresses its levels, and it uses a combination of RLE and just the raw data. What I mean is, abcdeefffgabbccc
turns into 0abcd2e3f0ga2b3c
This solves the issue of a run of single values that aren't the same.
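If I'm reading that format right, a leading 0 marks a stretch of raw (non-repeating) characters that runs until the next digit, and any other digit is a repeat count for the character after it. A Python sketch of that reading (it assumes the data contains no digits and runs are at most 9 long):

```python
def smb_rle_encode(s):
    """'abcdeefffgabbccc' -> '0abcd2e3f0ga2b3c'."""
    out, literals, i = [], [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        run = j - i
        if run == 1:
            literals.append(s[i])          # single value: goes into a raw stretch
        else:
            if literals:
                out.append("0" + "".join(literals))
                literals = []
            out.append(str(run) + s[i])    # run of 2+ becomes '<count><char>'
        i = j
    if literals:
        out.append("0" + "".join(literals))
    return "".join(out)

def smb_rle_decode(s):
    out, i = [], 0
    while i < len(s):
        if s[i] == "0":                    # raw stretch runs until the next digit
            i += 1
            while i < len(s) and not s[i].isdigit():
                out.append(s[i])
                i += 1
        else:                              # '<count><char>'
            out.append(s[i + 1] * int(s[i]))
            i += 2
    return "".join(out)
```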
Different picture file formats use different algorithms, but yes, the good ones have a variety of algorithms on tap. But in order to know which is best for a particular picture, the meta-algorithm has to actually run all of them and compare result sizes. So that's not going to be infinitely fast! Decompressing is easier because the file tells the software what algorithm was used to compress this part.
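For illustration, that "run them all, keep the smallest" step can be mimicked with off-the-shelf compressors. A toy Python sketch, where the tag prefix plays the role of the file telling the decompressor which algorithm was used for this part:

```python
import zlib, bz2, lzma

def best_compress(data: bytes) -> bytes:
    """Try several compressors and keep whichever output is smallest."""
    candidates = {
        b"raw":  data,                     # sometimes not compressing wins
        b"zlib": zlib.compress(data, 9),
        b"bz2":  bz2.compress(data, 9),
        b"lzma": lzma.compress(data),
    }
    tag, blob = min(candidates.items(), key=lambda kv: len(kv[1]))
    return tag + b":" + blob

def best_decompress(blob: bytes) -> bytes:
    """The stored tag says which algorithm to undo."""
    tag, payload = blob.split(b":", 1)
    return {b"raw":  lambda d: d,
            b"zlib": zlib.decompress,
            b"bz2":  bz2.decompress,
            b"lzma": lzma.decompress}[tag](payload)
```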
I guess such a meta-algorithm, when handling very large objects, would employ sampling, don't you think? Besides, storage efficiency (rather than runtime speed) would be the principal consideration.
I do not share the project, as cymplecy wants to encourage younger Snappers...
ABCD - RGBA
E - chunk length
F - chunk start, for further experiments with RGBA groups
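In code terms, my reading of that layout (just a sketch, not the actual project; the names are made up) would be building one record per run of identical pixels:

```python
def to_chunks(pixels):
    """Group a flat list of (R, G, B, A) pixels into runs.
    Each chunk records the color (ABCD), the run length (E) and the start index (F)."""
    chunks, i = [], 0
    while i < len(pixels):
        j = i
        while j < len(pixels) and pixels[j] == pixels[i]:
            j += 1
        chunks.append((*pixels[i], j - i, i))   # R, G, B, A, length, start
        i = j
    return chunks
```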
Yes, but I think not the way you mean. Pretty sure the png meta-algorithm uses different algorithms for different chunks of picture, so, e.g., it could handle a cartoon character on a photo background efficiently. And I'm pretty sure that means it looks at the entire picture before compressing anything.
But png isn't lossless, I think. It has a quality dial, which has to mean how precisely the compressed version captures every pixel. So cartoons, compressed by RLE, will be perfect regardless of the dial setting, but not photos. As opposed to gzip, which is lossless, but has a dial for speed vs. efficiency of compression.
Png is usually lossless. I'm pretty sure the dial you're talking about is the amount of compression to do, or the bit depth (which usually doesn't change the quality, unless you choose 4- or 2-bit depth). Now, I said it's usually lossless, because there is a png mode that has a color palette that maps onto the pixels. This palette is not infinite, so you do need to sacrifice quality to use it. But for normal use, png is lossless, and it's only lossy if you tweak specific settings to make it lossy.
Implementing Huffman coding is far from trivial because of its variable code length, not corresponding to (multiples of) byte size.
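For anyone curious, a rough Python sketch of why that hurts: building the codes is short, but the codes are arbitrary-length bit strings, so you still have to pack them into bytes and remember the padding. This is only a sketch (no tree serialization, no decoder):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman codebook: symbol -> bit string (e.g. 'a' -> '10')."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # heap entries: [weight, tiebreak, node]; a node is a symbol or a (left, right) pair
    heap = [[w, i, sym] for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        heapq.heappush(heap, [lo[0] + hi[0], tiebreak, (lo[2], hi[2])])
        tiebreak += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a symbol
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

def pack_bits(bits):
    """The awkward part: pad the bit string to whole bytes and keep the pad size."""
    pad = (8 - len(bits) % 8) % 8
    bits += "0" * pad
    return pad, bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

# usage: codes = huffman_codes("abcaacbcbbb")
#        pad, blob = pack_bits("".join(codes[c] for c in "abcaacbcbbb"))
```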
With a single constraint (fitting cartoon-type images): max. 256 different RGBAs … a mix of RLE and one other simple compression method will achieve very similar results.
With regions grouped there are 75*4 {RGBA} + 2049 {chunks} * 2 {chunk start, length} => 4398 numbers.
Of course, this metric has a serious flaw. JS numbers are 64 bits long, so without variable-length encoding the effective size (4398 * 8 bytes ≈ 35 kB) will be ridiculous compared to the PNG's 4.7 kB, or the WEBP's 3.7 kB.
To test the compression ratio, a text representation may be used. Here are the results:
Pixels (rest of the results listing not preserved)
Strangely enough, a plain, byte-sized, uncompressed image buffer has ~ 4*10K = 40 kB, so even this not-so-effective encoding offers real benefits.
Given that the CSV is redundant, you might do better to encode the run lengths into decimal digits! :~)
We still talking about the Alonzo costume? It has, what, four colors? The dark border, the yellow, black, and white. If you make a palette a la GIF format, you can reduce the 32-bit color of each pixel to two bits. That's already, what, a 16-fold compression! Cartoons are easy.
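As a rough sketch of that arithmetic (names and packing are mine, not GIF's actual on-disk format): with at most four palette entries each pixel index fits in 2 bits, so four 32-bit pixels pack into a single byte instead of sixteen:

```python
def palettize(pixels):
    """Map RGBA pixels to small palette indices, GIF-style.
    Assumes at most 4 distinct colors, so each index fits in 2 bits."""
    palette, indices = [], []
    for p in pixels:
        if p not in palette:
            palette.append(p)
        indices.append(palette.index(p))
    packed = bytearray()
    for i in range(0, len(indices), 4):           # four 2-bit indices per byte
        group = indices[i:i + 4] + [0] * (4 - len(indices[i:i + 4]))
        packed.append(group[0] << 6 | group[1] << 4 | group[2] << 2 | group[3])
    return palette, bytes(packed)
```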