"List entanglement" glitch (or feature?)

ch · December 17, 2022, 1:58am

I ran into this bug again recently and wanted to know more about it.
In Snap!, if you set a variable to another variable that is a list, both lists get "entangled". Basically, if you modify one of the lists, the other list is modified too. I made this project to demonstrate it:
https://snap.berkeley.edu/project?username=ch__&projectname=List%20entanglement

ego-lay_atman-bay · December 17, 2022, 2:01am

This is intentional. It's called linked lists. To make a copy of a list without having the copy modify the source, use this block

bh · December 17, 2022, 6:16am

It's pretty common among programming languages that when you assign a list to another variable you get the same list, not a copy. That's for efficiency; putting a pointer in the new variable takes much less time than copying a large list.
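As a rough illustration in Python (chosen here only because Snap! blocks can't be typed as text; the variable names are made up for the example), assignment shares the list, while an explicit copy produces an independent one:

```python
original = [1, 2, 3]
alias = original            # no copy: both names refer to the same list
snapshot = list(original)   # explicit shallow copy: a separate list

original.append(4)          # mutate the list through one of its names

print(alias)                # [1, 2, 3, 4]  (same list, so the change shows)
print(snapshot)             # [1, 2, 3]     (the copy is unaffected)
print(alias is original)    # True
```

The assignment costs the same no matter how long the list is; the copy has to visit every item.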

jens · December 17, 2022, 10:08pm

Linked lists have nothing to do with it.

It's interesting how this exact behavior keeps surprising people over and over again. I have this theory that this is - contrary to what most CS educators believe - both the hardest concept in all of CS and the very heart of CS. Referring to things by their name is the very essence of abstraction. Once you refer to things by a name it's clear that someone else might refer to the same thing by a different name. Take my home country, for instance: It's called Germany for some, Allemagne for others and for us it's Deutschland, but it's nonetheless the same country. Those different names are different variables pointing to the same data. Likewise Stefani Germanotta and Lady Gaga are different names for the same person. If you stream Lady Gaga's music you will affect Stefani Germanotta's income. Are the two people "entangled"? No, because they are the same.

The ability to refer to the same data by different variables is essential to CS. I'm curious: Which programming language do you use that does not exhibit what you consider to be this "glitch"?

cymplecy · December 17, 2022, 11:01pm

Country example is very good

But I'd like to make the case for muggle computer people

I think the CS community embraces the concept of references due to history

In olden times, memory was VERY expensive so it was better to just make a reference to the same data than make a copy of it.

And in lots of cases, this is a good thing but it IS counter-intuitive to muggles - it really is

It's all down to what a language defines its assignment operator to do - does it make a copy in a different memory location or does it just refer to the same memory location/object

Gives 1 which is what muggles think it should

Muggles get cognitive dissonance when the same thing doesn't work for large variable types such as lists
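A small Python sketch of that contrast (Python stands in for the Snap! blocks here, and the variable names are invented for the example):

```python
# Numbers: reassigning a leaves b alone
a = 1
b = a
a = 2          # a now names a different number
print(b)       # 1

# Lists: changing an item through xs is visible through ys
xs = [1]
ys = xs
xs[0] = 2      # changes the one list that both names refer to
print(ys)      # [2]
```

The two halves look alike, but the first reassigns a name while the second modifies the thing the names share.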

#DefenseRests

snapenilk · December 18, 2022, 1:49am

tl;dr: you assigned the same list (not a copy) to the other variable; there is no other list

bh · December 18, 2022, 1:50am

I agree that it's a little more confusing than Jens makes out, especially for beginners who don't know what a pointer is.

In these days of Data Science it's not unusual to have million-item lists or bigger. So even though computers are bigger and faster than the ones I grew up with, efficiency of time and memory are both still important.

It doesn't help with the confusion that sometimes we do copy lists. For example, if you say

then every time you call FOO we make a new copy of the list, so that mutating BAZ doesn't implicitly change the code of FOO.

Note that this copying only happens for literal lists made with LIST in the procedure body. If you say
SET BAZ TO (NUMBERS FROM 1 TO 100)
then the question doesn't arise, because there isn't a literal list in the code, and nobody is surprised that NUMBERS makes a new list each time it's called.
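A loose Python analogy, not a description of how Snap! implements it: a list literal inside a function body is evaluated afresh on each call, so mutating one result never leaks into the next call. The names foo and baz mirror the ones above; first and second are made up.

```python
def foo():
    baz = [1, 2, 3]     # the literal builds a new list on every call
    return baz

first = foo()
first.append(99)        # mutate the list returned by the first call

second = foo()
print(second)           # [1, 2, 3]  (unaffected by the earlier mutation)
print(first is second)  # False
```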

P.S. Another fine point is that the question arises only for lists, not because (or not only because) lists are big, but because only lists are mutable. For things like numbers, users have no way to know whether all references to 87 in the program point to the same 87 or separate 87s. CHANGE A BY 1, despite "change" in its name, doesn't modify any existing datum; it really makes a new number and attaches the name A to it.

snapenilk · December 18, 2022, 1:52am

No, that is where each list item has a pointer/reference to the next item wherever it is (maybe also the previous) in addition to the value.
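For what it's worth, a minimal singly linked list might look like this in Python (the Node class and to_python_list helper are hypothetical names for this sketch, and it leaves out the optional previous pointer):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None   # reference to the next item, or None at the end


def to_python_list(head):
    """Walk the chain of next references and collect the values."""
    items = []
    node = head
    while node is not None:
        items.append(node.value)
        node = node.next
    return items


head = Node(1, Node(2, Node(3)))
print(to_python_list(head))         # [1, 2, 3]
```

That is a way of storing a list internally, which is a separate matter from two variables naming the same list.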

jens · December 18, 2022, 2:23pm

Numbers - and sometimes text - are special cases of data, not the regular case! The regular case is mutable data such as sprites or lists. When introducing variables we often start by assigning a sprite to a variable. You can see the sprite in the variable watcher, and if you, for example, write a script that turns the sprite to another direction you can watch its reference inside the variable watcher also turn. That's the regular case of a reference.

Numbers - and sometimes text - are special cases of data because they are symbols. A symbol is literally "half a meaning". It's derived from ancient Greek and used to describe a piece of clay broken into two pieces. You would give one of the pieces to a friend of yours, and they would pass it on to their child, and when after a generation or so that child would come back to visit you, they would bring their "symbol" along, and if it fit into your counterpart that would prove them to be related to your old friend. Point here is: A symbol alone doesn't mean anything, symbols are half an association. The number 3 doesn't mean anything by itself. You can use it to describe how many children you have, to set the price for a service, to estimate the time spent on an activity, or the distance between two walls. Because symbols themselves don't mean anything it doesn't make sense to change them. What would it even mean to "change 3"? On the contrary: Changing a symbol would make it lose its ability to associate.

The issue we're encountering here, distinguishing between mutable information and immutable symbolic data, is - I'm convinced - one of the hardest, most pernicious and challenging pedagogical quests that's unsolved so far. All I do know is that it doesn't help to start introducing variables with symbolic data, especially numbers.

Aaaaand, my question remains unanswered:

cymplecy · December 18, 2022, 2:51pm

I think Java uses copies of arrays rather than references to the same data

But I'm no CS expert.

I'm just saying that it's confusing for students and is a barrier that has to be overcome and leads to unexpected results if the programmer doesn't keep track, in their head, as to whether the data they are working on is entangled or not

bh · December 18, 2022, 6:26pm

Quantum entanglement is a really cute metaphor, but it's misleading in this context. Two entangled particles are two different particles, which is why it's so bizarre that information can pass between them at lightspeed. But what you're calling "entangled" lists are the same list. It's not surprising at all that if you say "Hey, Brian, let's have lunch" you get to the restaurant and bh is there waiting for you.

I think it's futile to argue about whether mutable or immutable data is the ordinary case. What makes this seem hard in languages other than Lisp (including Snap! as a Lisp) is that they use the same notation for variable assignment and array mutation. That is,

foo=bar

does something radically different from

foo[87]=781

The first one changes the binding of the name foo so that it points to an entirely different value from whatever value it had before. The second one doesn't change the binding of foo, but rather mutates the thing to which it's bound. And they make it worse by using the equal sign, which means a predicate function in the mind of anyone who's taken algebra. Lisps avoid confusion by using wholly different notations for the two kinds of "assignment":

(define foo bar)

and

(array-set! foo 87 781)

In our case these are

and

People find this mysterious only if they've learned some bad language before learning Lisp, which is unfortunately usually true.
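To make the distinction concrete, here is a sketch in Python, which is one of the languages that writes both operations with an equal sign. The names foo and bar and the values 87 and 781 come from the text above; everything else is made up for the example.

```python
bar = list(range(100))
foo = bar              # binding: foo now names the same list as bar

foo[87] = 781          # mutation: changes the one list both names refer to
print(bar[87])         # 781

foo = [0] * 100        # rebinding: foo now names a brand-new list
foo[87] = -1           # mutates the new list only
print(bar[87])         # 781  (bar still names the original list)
print(foo is bar)      # False
```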

snapenilk · December 18, 2022, 8:53pm

No, Java arrays are reference types.

int[] a = {5};            //create an array with one element: 5
int[] b = a;              //assign the same array reference
a[0] = 7;                 //reassign the array element to 7
System.out.println(b[0]); //7
/*
System.arraycopy(src, srcPos, dest, destPos, length)
can be used to copy elements from one array to another; both arrays must already be constructed
*/

cymplecy · December 18, 2022, 9:28pm

I'm wrong again then