Hash Algorithm

Yes! A 100% real, working hash algorithm that uses only a few blocks.

But, what's a hash?

A hash function is a function that encodes text so that it can never be decoded again! The most famous family is SHA (Secure Hash Algorithm), known for its effectiveness and security. But I created a very secure one, and you can set the output length as well!

When should I use it?

Use it, for example, to verify a password. When a user creates an account with a password, it is stored on the server, but hashed! And when that user logs in, the server verifies the password by hashing what was typed and comparing the result with the stored hash.
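Here is a minimal Python sketch of that register-then-verify flow. It does not use the Snap! block from this post; it assumes the standard library's PBKDF2 as a stand-in hash, and the random salt and the function names `register` and `verify` are illustrative additions that are not part of the original description.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # Stand-in hash (assumption): PBKDF2-HMAC-SHA256 from the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def register(password: str):
    # The server keeps only the salt and the hash, never the password itself.
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    # Hash the login attempt the same way and compare it with the stored hash.
    attempt = hash_password(password, salt)
    return hmac.compare_digest(attempt, stored_hash)

salt, stored = register("correct horse")
print(verify("correct horse", salt, stored))
print(verify("wrong guess", salt, stored))
```

The comparison is done on hashes only, so the server never needs to keep the original password around.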

Is there a way to decode it?

It's really..., buuuuuuuuut really hard to decode hashed text. The only way to decode it is by guessing. That explains why it is important to have a strong, hard-to-guess password. By the way, when a server leaks private passwords by accident, only the hashed versions are exposed!
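To illustrate why guessing is the weak point, here is a small Python sketch. It assumes plain unsalted SHA-256 as a stand-in for the hash, and the candidate list and the name `guess_password` are made up for the example.

```python
import hashlib

def sha256_hex(text: str) -> str:
    # Stand-in hash (assumption): unsalted SHA-256, hex encoded.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def guess_password(target_hash: str, candidates):
    # Hash every candidate and see whether one matches the leaked hash.
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None

common_passwords = ["123456", "password", "letmein", "qwerty"]

weak_hash = sha256_hex("letmein")
strong_hash = sha256_hex("T7#vq!9zLp-river-42")

print(guess_password(weak_hash, common_passwords))
print(guess_password(strong_hash, common_passwords))
```

A password that appears on a list of common guesses falls immediately; one that does not appear on any list survives this kind of attack.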


Here are some previews:




Source

Copy and Import:

<blocks app="Snap! 7, https://snap.berkeley.edu" version="2"><block-definition s="hash %&apos;text&apos; output length: %&apos;length&apos;" type="reporter" category="operators"><header></header><code></code><translations></translations><inputs><input type="%txt">text</input><input type="%n">20<options>default=20&#xD;256 bits=76&#xD;384 bits=115&#xD;512 bits=152</options></input></inputs><script><block s="doIfElse"><block s="reportIsA"><block var="length"/><l><option>number</option></l></block><script><block s="doIf"><block s="reportOr"><block s="reportGreaterThan"><l>16</l><block var="length"/></block><block s="reportGreaterThan"><block var="length"/><l>256</l></block></block><script><block s="doApplyExtension"><l>err_error(msg)</l><list><block s="reportJoinWords"><list><l>The output length provided (</l><block s="reportSum"><block var="length"/><l>0</l></block><l>) is outside the range [16, 256]</l></list></block></list></block></script></block></script><script><block s="doApplyExtension"><l>err_error(msg)</l><list><block s="reportJoinWords"><list><l>Expecting a number but getting a </l><block s="reportTypeOf"><block var="length"/></block></list></block></list></block></script></block><block s="doReport"><block s="reportJoinWords"><block s="reportAtomicMap"><block s="reifyReporter"><autolambda><block s="reportLetter"><l><option>last</option></l><block s="reportRound"><block s="reportQuotient"><block s="reportProduct"><block s="reportDifference"><block var="length"/><block s="reportAtomicCombine"><block var="value"/><block s="reifyReporter"><autolambda><block s="reportSum"><l></l><l></l></block></autolambda><list></list></block></block></block><block s="reportMonadic"><l><option>10^</option></l><l>12</l></block></block><block s="reportSum"><block s="reportAtomicCombine"><block s="reportListAttribute"><l><option>flatten</option></l><block var="list"/></block><block s="reifyReporter"><autolambda><block 
s="reportSum"><l></l><l></l></block></autolambda><list></list></block></block><block var="index"/></block></block></block></block></autolambda><list><l>value</l><l>index</l><l>list</l></list></block><block s="reportListAttribute"><l><option>columns</option></l><block s="reportReshape"><block s="reportTextSplit"><block s="reportProduct"><block s="reportUnicode"><block var="text"/></block><l>97187</l></block><l><option>letter</option></l></block><list><block s="reportSum"><block s="reportStringSize"><block var="text"/></block><l>1</l></block><block var="length"/></list></block></block></block></block></block></script></block-definition></blocks>

(C) 2022 ScratchModification

Can you add a link to a project?
Also this looks like it's susceptible to collisions...

There's no project. Just click Source

Yeah, I was thinking that as well. But in my tests, I didn't find something that I can worry about.

I meant that you could upload the code to a project for easier testing and viewing.

What are your criteria for "something that I can worry about"?

Many people have asked me to share a project. But I don't like to publish a lot of projects.

Collisions are common at output lengths from 0 to 7; that's why I disabled them.
If there's any problem with the other ones, that probably means I'll have to spend more hours programming again :confounded:
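One rough way to test this kind of claim is to hash many inputs and count how often two of them land on the same output. The Python sketch below assumes SHA-256 truncated to a given number of hex characters as a stand-in (it is not the Snap! algorithm from this thread), and `short_hash` and `count_collisions` are hypothetical helper names.

```python
import hashlib

def short_hash(text: str, length: int) -> str:
    # Stand-in (assumption): the first `length` hex characters of SHA-256.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]

def count_collisions(inputs, length: int) -> int:
    # Counts inputs whose hash was already produced by an earlier input.
    # Assumes the inputs themselves are all distinct.
    seen = set()
    collisions = 0
    for text in inputs:
        digest = short_hash(text, length)
        if digest in seen:
            collisions += 1
        else:
            seen.add(digest)
    return collisions

inputs = [f"word{i}" for i in range(1000)]
for length in (1, 2, 4, 8, 64):
    print(length, count_collisions(inputs, length))
```

With only a few output characters there are fewer possible outputs than inputs, so collisions are unavoidable; the count drops quickly as the output gets longer.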

I think it would be an improvement to find the closest prime number to the user's length and use that instead of what the user said.

Also, if you assume most of the text is letters (not punctuation or emoji or whatever), then the leftmost three bits of all the letters (within a given case) are the same. That makes hash collisions likely, and if you're using the hash for crypto I think also makes it easier to find patterns in the ciphertext. That's why, when hashing text, people usually left-shift the running value by five bits before adding each character, so that the "bad" (so to speak) bits line up with good bits of the previous character.

But I am totally not a crypto expert. When I use hash tables, it's to keep track of symbols (variable and procedure names).

Why do you want to publish your code on the forum, but not publish it the usual way?

I've updated the algorithm so it's less susceptible to collisions.
What did the user say to implement?

Really?

Because I don't want to fill my whole inventory:

I mean, if the user says length 100, you find the nearest prime to 100, which I think is 101, and use that instead.
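A minimal Python sketch of that "nearest prime" step, using simple trial division. The names `is_prime` and `nearest_prime` are illustrative, and breaking ties toward the larger prime is an assumption of this sketch, not something specified above.

```python
def is_prime(n: int) -> bool:
    # Trial division; fine for small lengths like the ones discussed here.
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True

def nearest_prime(n: int) -> int:
    # Search outward from n; on a tie the larger prime wins (assumption).
    offset = 0
    while True:
        if is_prime(n + offset):
            return n + offset
        if is_prime(n - offset):
            return n - offset
        offset += 1

print(nearest_prime(100))  # 101
```

For a requested length of 100 this returns 101, since 99 is not prime and 97 is farther away.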

I'm not sure what your picture is supposed to prove. I said, "if you assume most of the text is letters (not punctuation or emoji or whatever)." Capital letters are all 010xxxxx and lower case letters are all 011xxxxx. So if you're just adding Unicode values, you're wasting those bits in the hash.
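The bit patterns are easy to check. This short Python sketch prints the top three bits of every ASCII capital and lower case letter; the helper name `top_three_bits` is just for illustration.

```python
import string

def top_three_bits(ch: str) -> str:
    # Format the character code as 8 binary digits and keep the leftmost three.
    return format(ord(ch), "08b")[:3]

upper_prefixes = {top_three_bits(c) for c in string.ascii_uppercase}
lower_prefixes = {top_three_bits(c) for c in string.ascii_lowercase}

print(upper_prefixes)  # {'010'}
print(lower_prefixes)  # {'011'}
```

Every capital letter shares one prefix and every lower case letter shares another, so those three bits carry almost no information about which letter it is.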

You know that window scrolls if you have more projects than fit, right? ;~)

Well, that makes the code larger and harder to understand.

I don't understand you

Yes, but it will look ugly :no_mouth:

Okay, here's a vastly simplified picture. Let's say a character is eight bits wide. There are only three letters: A, with Unicode value 00000001; B, with Unicode value 00000010; and C, with Unicode value 00000011. You want to hash the text "CABACA" so you add 3+1+2+1+3+1 and get 11, which in binary is 00001011. The leftmost four bits are still zero, as they will be for any text without a lot of letters. So instead you left shift five bits between adding characters. So, you start with C, which is 00000011.
Before adding the next letter, you shift left (well actually you rotate left so you don't lose any of the bits) by five, getting 01100000, then you add the A, which is 00000001, so the sum is 01100001. Before adding the next letter you rotate five bits again, getting 00101100. The next letter is B, 00000010, so we add that and get 00101110. Now rotate by five bits again, getting 11000101. Then add A, getting 11000110. Rotate five bits, getting 11011000. Then add C, getting 11011011. Rotate five bits, 01111011. Add A, 01111100. This way, the bits that differ between letters (just the last two bits, in our simplified alphabet) get spread out throughout the 8-bit hash.
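The toy example can be run as a short Python sketch. It uses the three-letter alphabet defined above (A = 1, B = 2, C = 3) and an 8-bit hash; the names `rotl8`, `rotate_add_hash` and `plain_sum` are made up for this illustration.

```python
# Toy alphabet from the example: three letters with tiny codes.
CODES = {"A": 0b00000001, "B": 0b00000010, "C": 0b00000011}

def rotl8(value: int, shift: int) -> int:
    # Rotate an 8-bit value left; bits pushed off the top wrap to the bottom.
    shift %= 8
    return ((value << shift) | (value >> (8 - shift))) & 0xFF

def plain_sum(text: str) -> int:
    # Just add the codes, as in 3+1+2+1+3+1.
    return sum(CODES[ch] for ch in text) & 0xFF

def rotate_add_hash(text: str, shift: int = 5) -> int:
    # Rotate the running value, then add the next letter, keeping 8 bits.
    h = 0
    for ch in text:
        h = (rotl8(h, shift) + CODES[ch]) & 0xFF
        print(ch, format(h, "08b"))
    return h

print("sum   ", format(plain_sum("CABACA"), "08b"))
print("rotate", format(rotate_add_hash("CABACA"), "08b"))
```

The plain sum leaves the high bits at zero, while the rotate-and-add version ends up with set bits spread across the whole byte.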

Does that help?

Yes, but I don't see how this helps my algorithm.

At the bottom of your code, you add up all the character codes. I'm suggesting a slightly more complicated algorithm that shifts and adds.
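One visible consequence of just adding character codes is that the order of the letters stops mattering. The Python sketch below compares a plain sum with a shift-and-add variant on real text; the 32-bit width and the names `sum_hash` and `shift_add_hash` are assumptions for the example and are not taken from the Snap! source.

```python
def sum_hash(text: str) -> int:
    # Plain sum of character codes: any rearrangement gives the same value.
    return sum(ord(ch) for ch in text)

def rotl32(value: int, shift: int) -> int:
    # Rotate a 32-bit value left by `shift` bits (0 < shift < 32).
    return ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF

def shift_add_hash(text: str, shift: int = 5) -> int:
    # Rotate the running value by five bits before adding each character.
    h = 0
    for ch in text:
        h = (rotl32(h, shift) + ord(ch)) & 0xFFFFFFFF
    return h

print(sum_hash("listen") == sum_hash("silent"))      # True: anagrams collide
print(shift_add_hash("ab") == shift_add_hash("ba"))  # False: order matters
```

This is still nowhere near a cryptographic hash; it only shows why shifting between additions mixes the input better than a bare sum.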

Have you considered the possibility that someone might be deliberately malicious?

What does that mean?

Someone might try to make a collision on purpose

Could be, but I'm rewriting the code. I'm sure it will be harder than before...

how do we import this into a project?

Can't import the project on mobile or on a chromebook.

Just copy the source, open Notepad, paste it in, and save it as a TXT file. Then go to Snap! and import the file.

chrome os doesn't have a notepad
ok...