I copied a list to another variable, and then when I change the copy, it changes the original too!

Short answer: You didn't copy the list to the variable, you gave the same list another name. [The pictures below are a lie. See below the divider -------- for the full story.]

Here's what it looks like:



The list is made of pairs, each of which points to one list item (left half) and to another pair (right half). The right half of the last pair is a special empty-list object.


The variable FOO points to the list, and the SET copies the pointer to BAR:


Why not copy the entire list? One reason is efficiency; even if the list contains a million items, it just takes a tiny, fixed amount of time to copy the pointer. But also, this technique allows for partial sharing of list structure:





-------- dual-format lists

Lists made out of pairs are called linked lists. They're ideal for writing recursive functions, like this:



ITEM 1 OF, ALL BUT FIRST OF, and IN FRONT OF are all fast constant-time functions for linked lists. But ITEM n OF, for arbitrary n, is not a constant-time operation; it takes time proportional to n. So if you write PLURALS in imperative style:


then the total time required for this version of PLURALS using linked lists is proportional to the square of the size of the list. (n calls to ITEM OF, each taking average time n/2.) For this style of programming, it's more efficient to represent the list as a dynamic array, in which ITEM n OF takes constant time regardless of the value of n. But when using dynamic arrays, ALL BUT FIRST OF or IN FRONT OF require making a new copy of the entire list, so arrays take quadratic time for the functional version of PLURALS.

Our goal in designing Snap! was that beginners should be able to write programs in any style, without thinking about how lists are represented, and still get good performance. So, for each list separately, we take note of whether you last used IN FRONT OF and friends, or last used the commands ADD, REPLACE, etc., and change the internal list structure accordingly. As long as you use a list consistently (functionally or imperatively), you get good performance. If you mix and match functional and imperative styles for a single list, we keep converting back and forth, and performance is terrible.

But, getting back to the original question, even for dynamic arrays, when you assign a list value to a variable, only the pointer to the list is copied, not the list itself.

(This is actually true for any value, not just lists, but it's only in the case of lists that anyone notices, because only lists are mutable--they can be changed without making a new copy.)

If I add something to one of the lists, how does it not get added to the other list?
I created two variables, a and b. I set a to a list of numbers and b to all but first of (a).
add [thing] to (a) only adds thing to a. add [thing] to (b) behaves differently attached to the script or separate: attached to the same script as the set blocks, thing is added to both lists, but separate, thing is only added to b.

The last pair in your image was
but shouldn't it be
b ?
The reason I think that is because the last pair is still a pair with two items: "Ringo" and an empty list. Otherwise the image should be
c .

Your proposed notation means the same thing as mine. A diagonal filling a box means "empty list." It's just a notational convention.

Oy. This is a good example of why you shouldn't mix functional and imperative style with the same list structure. Here's the sequence of events:

  1. You use the LIST block to make a list of numbers. It's stored as a dynamic array, just because.
  2. You apply ALL BUT FIRST OF to A. This converts it to a linked list. So B is also a linked list.
  3. Suppose you ADD THING TO A. This converts A to a dynamic array. But B still points to a shorter list that's in linked form, and B doesn't know anything about what you did to A, so now they are separate data structures. B remains linked.
  4. Undo that step and instead ADD THING TO B. This converts B to dynamic array form. A is now a "hybrid" list, a pair whose car is A and whose cdr is the dynamic array B. They still share storage.

It all works fine, but it's not easy to follow.

How does add [thing] to (b) work differently within the same script or separate?
If the block is separate, a is still changed but the variable monitor isn't.

You can't edit linked lists. If you only use linked lists, then this difference is impossible to detect. The only problem is if you use a list that is something in front of an array. For example, if you set foo and bar up with
then the list structure of foo will be (array "x" "y" "z") and bar will be (cons "w" foo), meaning that bar is (cons "w" (array "x" "y" "z")). Now you add "k" to foo, and because it is a operation on foo, and doesn't use set, it changes bar too. Now foo is (array "x" "y" "z" "k") and bar is (cons "w" (array "x" "y" "z" "k")). However, if you set it up again and this time add "k" to bar, then bar is a linked list. Because of that, the first step is to convert it to an array meaning that foo stays as (array "x" "y" "z") but bar changes from (cons "w" (array "x" "y" "z")) where the array is foo, to a new array (array "w" "x" "y" "z"). Then k is added to bar, ending with foo (array "x" "y" "z") and bar (array "w" "x" "y" "z" "k"). Ignoring the list type, bar ends up the same, but foo is missing an item. This is probably the most confusing way to use lists, which is why the cons block should first convert the cdr to linked.

Good analysis, thank you for clarifying that.

I still say, if you use functional programming on your list, it'll be a linked list and behave just like a Lisp list. If you use imperative programming, it'll be, and behave like, a Javascript array. If you mix mode, your program will run slowly and we make no promises.

Having said that, maybe we should look into how to maintain cdr-coded lists (that's what those mixed mode ones are called). There's probably something we can do short of abandoning cdr-coding, which has turned out to be very efficient overall in Lisp programs. (This claim is based on Lisp Machine Lisp, which is where I learned about cdr-coding.)

Having used Tcl as my primary scripting language for 30 years, I thought I'd seen enough 'shimmering' to last a lifetime. In Tcl, because of the commitment to treat 'everything as a string', you can get atrocious performance if you inadvertently force conversion of a chunk of data to string and whatever it's underlying type oughta be, and then back again. The beauty of the Tcl implementation, though, is that, functionally, it is completely opaque to the user.
Isn't the problem that you want to treat linked lists as immutable, so that as soon as someone does a direct modification of the list, you convert it to an array? What would be lost by allowing for mutable lists in the sense of allowing list insertion/deletion and direct node replacement? Certainly, some programs that depend on the semantics of Lisp lists might break, but that can't be a lot of beginner programs.
Of course, then you'd need to figure out how to explicitly create arrays, I guess.
Just speculating.

That's not exactly the case. As an old Lispian, I'm happy to write my own projects that mutate linked lists, with the goal that upstream lists will see the change. But what we tell people is that if you write purely functional code, you get linked lists and efficient processing; if you write purely imperative code (Scratch-style), you get dynamic arrays and efficient processing. But if you do both to the same list, then we don't make any guarantees.

To give users complete control, we should invent linked-list-specific mutators set-car! and set-cdr! that don't convert to array. We haven't done that, mainly just because it hasn't been urgent. But if some TEALS project needed it, or more likely one of Jens's German CS professor friends, we could do it quickly.

Well, no, I'm going to partly take that back. Once you let users make circular list structures, you have to worry about breaking out of infinite loops when they try to print them -- or, try to print them in a circularity-aware manner, like Common Lisp. (Dunno if they've shoehorned that into the Scheme standard yet; I got depressed around R6RS and it's only gotten worse since.) And then if you want to preserve shared structure when you save a project it slows things down a lot, which is why it's one of the shift-click options. Meanwhile iirc we actually do catch certain kinds of circularity when displaying a list.

https://snap.berkeley.edu/snapsource/dev/snap.html#present:Username=spaceelephant&ProjectName=mutable%20linked%20lists. The only problem I see is that, for circular lists, there is no data about how circular it is, only the number infinity. It needs blocks for the singular part length and the repeating part length. (for standard lists, singular is everything and repeating is empty)
When a watcher is required, it makes a table watcher that shows the whole list (it should show the singular part and the repeating part separated by something) and scrolling works, but clicking, left or right, on the watcher crashes snap.
EDIT: There is now a split length block. I do not have a fix for the watcher because I do not understand it at all. Split length of a finite length list is [length, 0]. For a list z where it some CDR, CDDR etc is z, the split length is 0 and the number of Ds. For any other linked list, it is the same as the split length of the CDR except for the first element being one higher.