Things I want that would be incompatible with current Snap!

You can see that I replied to this:

Body is too similar to what you recently posted ???

I know. I am asking how is that relevant to his question?

Hm?

Follow the reply chain.

I think he was asking about normal constants, and concepts in CS come largely from math.

Explanations may vary from language to language so it is a good idea not to point to any language in particular, in my opinion.

:slightly_smiling_face: ok

maybe in settings add an option to turn case-sensitivity on/off

You have a point. But how many unique words are in "We call turkeys turkey because people thought they came from Turkey."?

Admittedly case sensitivity can get very complicated in some languages - https://www.w3.org/International/wiki/Case_folding

I have a project that needs to find the first proper noun in a sentence - a heuristic that mostly works is to pick out the first word with an initial upper case letter ignoring the first word unless it is the only one.
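
A minimal sketch of that heuristic in Python, purely for illustration. The function name `first_proper_noun` and the punctuation it strips are my assumptions, not taken from the actual project:

```python
def first_proper_noun(sentence):
    """Return the first capitalized word, skipping the sentence-initial word.

    The first word only counts when it is the only word in the sentence.
    Returns None when no word qualifies.
    """
    # Split on whitespace and trim common punctuation (an assumed set).
    words = [w.strip('.,;:!?"\'()') for w in sentence.split()]
    words = [w for w in words if w]
    if not words:
        return None
    if len(words) == 1:
        return words[0] if words[0][0].isupper() else None
    for word in words[1:]:
        if word[0].isupper():
            return word
    return None
```

As the thread says, this only mostly works: a mid-sentence "I" would be picked up, and a proper noun in first position is missed whenever other words follow it.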

With the unicode block one can figure out the case of first letters. But it is odd that unicode(x) might not equal unicode(y) despite x = y and unicode being a function.

By the way, too many text editors think they are being helpful: when I write "I think that Snap! is cool" they turn it into "I think that Snap! Is cool"

Yeah and two spaces after the ! also.

But if your sentence were "Turkeys are called turkey because..." you still couldn't use case to help.

Thinking as a programmer, not a linguist, my question is "which policy on case-folding makes it easier to write programs?" And the answer is that when I tell the user "Enter yes or no" it's easier if I don't also have to check separately for Yes and No. Or, I ask myself "which kind of bug will be harder to find, one in which words the user expected to be unequal compare equal, or one in which words the user expected to be equal compare unequal?" That one doesn't have as clear an answer, but I lean toward thinking that it's easier to notice the pun of the bird and the country having the same name than to notice why I typed "brian harvey" into a name field and it didn't match any record -- which never happens, I point out, in real software, because all real software correctly case folds.
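
To make the "Enter yes or no" point concrete, here is a rough Python sketch. The names `is_yes`, `records`, and `find_record` are made up for the example, and `str.casefold` is just one possible folding policy:

```python
def is_yes(answer):
    """Accept yes, Yes, YES, ... without listing each spelling."""
    return answer.strip().casefold() == "yes"


# Hypothetical name table, keyed by the case-folded name.
records = {"brian harvey": "record found"}


def find_record(name):
    """Look a name up regardless of how the user capitalized it."""
    return records.get(name.strip().casefold())
```

With folding done in one place, the caller never has to enumerate the capitalizations a user might type.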

I think you are largely winning this argument but

The plural for the first word saves this. But one could say "Turkey is what we call a bird of the species Meleagris gallopavo because people thought turkeys came from Turkey".

And I too find way too many websites that annoyingly force me to reenter text because they didn't like the casing.

But the unicode function that isn't a function does bother me. For the unicode block to be a function

x = y <=> unicode(x) = unicode(y)
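
Here is the mismatch spelled out in Python, as a sketch only. `snap_equals` is a hypothetical stand-in for a case-insensitive =, and `ord` plays the role of the unicode block:

```python
def snap_equals(x, y):
    """Hypothetical case-insensitive equality on text."""
    return x.casefold() == y.casefold()


x, y = "A", "a"

# The two inputs compare equal under the case-insensitive test...
equal_inputs = snap_equals(x, y)

# ...yet the code points differ (65 versus 97).
equal_codes = ord(x) == ord(y)

# Folding before taking the code point makes the two agree again.
equal_folded_codes = ord(x.casefold()) == ord(y.casefold())
```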

This also bothers me

image

Maybe we should start using

image

instead for case insensitive equals. (Admittedly impractical but mathematically pure.)

Case conversion aside, = doesn't measure mathematical equality because for lists it reports True if the lists look the same, even if they're distinct lists. We have IDENTICAL TO for a narrower equivalence class.
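
For readers who know Python, the same split exists there: `==` compares contents and `is` compares identity. This is only an analogy to = and IDENTICAL TO, not a description of how Snap! implements them:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a distinct list that looks the same
c = a           # a second name for the very same list

looks_same = a == b        # contents match
identical_ab = a is b      # but they are different objects
identical_ac = a is c      # same object under two names

# The distinction only becomes visible once a list is mutated.
c.append(4)
a_changed = a == [1, 2, 3, 4]
b_unchanged = b == [1, 2, 3]
```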

The underlying problem is that we don't make the distinction between symbols and strings that Scheme makes. Two strings are equal only if they're Unicode-equal, but there's a ci-equal? function ("ci" = "case ignored"). Two symbols are equal if they're case-folded equal, in real Scheme. (In R6RS or later, which isn't Scheme even though they call it that, symbols have to be Unicode-equal too.)

(Besides case-folding, I think we should do looks-the-same folding on Unicodes. There are three or four lambdas in Unicode, one for Greek letters, one for mathematical Greek letters, one for special symbols I think... I forget. I'm not really pushing for this because I know it's opening Pandora's box (is the symbol for AND the same as capital lambda?) but I wish they built something like that into Unicode, so you could say canonical_Unicode_character(char) and get a character that looks like the input, such that all other characters that look like the input have the same canonical character.)
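
Unicode's compatibility normalization (NFKC) covers part of this wish, though not all of it. A small Python sketch using the standard `unicodedata` module; the helper name `looks_like_key` is an assumption for the example:

```python
import unicodedata


def looks_like_key(text):
    """Approximate 'looks-the-same' folding: NFKC, then case folding."""
    return unicodedata.normalize("NFKC", text).casefold()


greek_lambda = "\u03bb"        # GREEK SMALL LETTER LAMDA
math_lambda = "\U0001d6cc"     # MATHEMATICAL BOLD SMALL LAMDA
capital_lambda = "\u039b"      # GREEK CAPITAL LETTER LAMDA
logical_and = "\u2227"         # LOGICAL AND

# The mathematical lambda folds to the ordinary Greek letter.
same_lambda = looks_like_key(math_lambda) == looks_like_key(greek_lambda)

# The AND symbol has no compatibility mapping to capital lambda,
# so the Pandora's box question stays unanswered by NFKC.
and_is_lambda = looks_like_key(logical_and) == looks_like_key(capital_lambda)
```

NFKC is defined by compatibility mappings rather than by visual similarity, so it is at best a partial stand-in for a true canonical "looks like" character.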

But [scratchblocks]<is [A] identical to [a]?::operators>[/scratchblocks] returns true.

Yes, because as presently defined it's the same as = for everything except lists. But we could define it as whatever narrower-than-= class is appropriate for other data types, such as case-sensitive comparison for strings.

But actually I'm leaning toward a Setting, because then it would automatically affect things like CONTAINS and other blocks that work by making equality comparisons. Conversely, we could fix < and > to do case folding before comparisons when the setting says to. (Talk about misbehaving functions, (A) = (a) is True, and (A) < (a) is True also!)
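
The (A) = (a) and (A) < (a) oddity can be reproduced in a few lines of Python. This is a sketch under the assumption that = folds case while < compares raw code points; the function names are invented for the example:

```python
def folded_equals(x, y):
    """Equality after case folding, like a case-insensitive =."""
    return x.casefold() == y.casefold()


def raw_less_than(x, y):
    """Ordering by raw code points, with no folding."""
    return x < y


def folded_less_than(x, y):
    """Ordering after case folding, consistent with folded_equals."""
    return x.casefold() < y.casefold()


both_true = folded_equals("A", "a") and raw_less_than("A", "a")
consistent = folded_equals("A", "a") and not folded_less_than("A", "a")
```

Folding before the ordering comparison removes the contradiction: two values that compare equal are then never also less than one another.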

I would have thought that mathematical equality for lists is if they look the same they are the same. The notion of distinct lists only matters if side effects on lists are permitted. And side effects are not part of the mathematical view of things.

The bigger picture here is how much should Snap! strive to match mathematical conventions. I think we largely agree that when we can we should, e.g. using "=" for equality and not assignment. And this should be fixed:

But sometimes there are trade-offs. As Brian has argued case sensitivity often leads to frustrating obscure bugs and yet avoiding it causes some conflicts with the mathematical notion of equality (e.g. the unicode function behaves differently for inputs that are tagged as "equal").

My vote would not be for a setting, since libraries may depend upon a different setting. If IS IDENTICAL TO were fixed to be case sensitive and two new primitives were introduced for case-sensitive versions of < and >, then the appropriate behaviour could be specified locally.

But we would also need case-sensitive versions of index of, remove duplicates from, assoc, and probably a dozen other things I'm forgetting that depend on testing for equality. The setting solution automatically fixes them. In other words I'm arguing that libraries' behavior does depend on the setting, but that the library is happy to work either way. This is an empirical question, and before any serious effort along these lines I'd have to go through all the libraries to see what the reality is for each.

If IS IDENTICAL TO was intended for lists, why is it not in the Lists palette, and why are its inputs not list inputs?

Sigh. Its domain is all values, including but not limited to lists. It is defined as being the same as = except for lists, for which it is true only for pointer equality. That's why.

The reason behind the reason is partly to avoid having to give error messages, partly to allow for the case where only one of the inputs is a list, and partly to allow us to consider giving it a specialized behavior for other kinds of values.

Good point.

By setting I assume you mean a global setting. The problem is that if I construct a block that depends upon case sensitivity, then I should restore the setting when I return. I would discourage the use of the command for setting case sensitivity and instead use something like this

image

Though it should restore the setting if there is an error calling blocks. I guess the library catch errors block could be used for this.

Yeah if we had the feature at all we could wrap all kinds of interfaces around it. :~)