Logic programming library development log

I find commenting my code really useful for 3 reasons:

  1. To explain the code (how to use, inner workings) to others
  2. To enable my future self to remember the code’s intricacies and limitations even after a few months or years
  3. To fully understand what I’m doing at development time

If you think the comment options (and how I used them) in Amalog v1.4 were over the top: currently Amalog (v1.5) has the option of multiple comments, which I’ve used to include and discuss pieces of Amalog “code”. I’m even contemplating in-code comment options for a future version. :smirk:

Brian's potted lecture on program documentation:

A program is a tree structure. It has some big tasks to carry out (parsing user input, say). The big tasks have medium-size constituent tasks, etc. Each of these tasks is carried out by a family of procedures.

So in a perfect world, in which programmers had liberal educations and knew how to write, there would be a documentation tree isomorphic to the program tree. There would be a Program Logic Manual describing the overall structure of the program. (That's what they were called at IBM when I was a child; they were themselves products IBM would sell you.) Then there would be more detailed documentation on the intermediate program chunks, or at least on the interesting ones. (If a chunk did something really new, or did it in a really new way, such intermediate documentation might include a journal article to announce the new thing to the world. Think about the MapReduce article as a paradigmatic example.) And then, out at the leaves of the documentation tree, there would be inline comments attached to the code of each procedure.

So, it isn't a perfect world, and programmers would much rather program than write documentation. So you're lucky if you get user documentation, let alone internal documentation.

Given that state of affairs, if you (the later maintainer of a program, your own or someone else's) can only have a subset of the complete documentation tree, which subset would you like? I claim that you want the top levels of the tree, documenting the overall program and its major components. If you understand those levels, you can read the code, which should be largely self-documenting anyway. But, alas, what you get instead is the very bottom level of the documentation tree, the inline comments per procedure, which are almost always a direct translation of the code into English. (Yes, I know there are other human languages, but not so much in the computing world. When I visited China, a week after Mao died, the only person I met who spoke English other than the professional interpreters was the head of a university CS department.) Those inline comments rarely give me any insight into the code, my own or someone else's.

So, for example, in Snap!, we don't really have separate internal documentation, but what we do have is the great big comment at the beginning of morphic.js, and that's what you really want to read, to understand how Snap! works. The inline comments are less useful, imho. (There is also some intermediate-level documentation in the form of large comments at the beginning of the implementation of an object class.)

I'm an extremist about this. I tell my students that any time I'm tempted to put a comment in my code, I take that as a sign that I'm being too clever and should instead rewrite the code to be understandable. (Really, this is true; when I revisit my own code years later, and there's a comment, the code gives me a headache and the comment doesn't help a bit.)

The paradigmatic example of inline commenting is in Unix v6, the first one to be distributed outside of Bell Labs. In the scheduler, the switch from one user process to another is done in one instruction, which just assigns a new value to the hardware register containing the stack pointer. And the comment is

// You are not expected to understand this.

It's the only thing everyone still remembers from v6.

And you didn't really answer

Anyway, we already have the ability to attach comments to blocks, so there's no need for you to invent another way. And your program block is going to be gigantic even without the inline comments, if you imagine extending the grandmother example to a general family tree package.

Speaking of packages, I know you know this, because you said "in the current version," but what we need way more than inline comments are the ability to include a package of rules by reference, and an interactive REPL! (I want you to get it perfect, so that we can make this an official library and I don't have to write one!)

I hear what you’re saying. Actually many of my current comments are either of a tutorial nature, or about limitations and development to-do’s.

IMO this mostly proves the developers had a sense of humour.

Some things I don’t particularly like about the Snap! IDE’s comment facility:

  • comments are positioned to the right of the code, and therefore effectively require a wider window for reading;
  • larger comments may partially cover each other;
  • comments allow only text - I would like to include literal code fragments, in blocks;
  • comments may get unattached from the code, and get lost.

That’s why I developed my own facility. But if you disapprove of it for library inclusion, I can of course use the standard comment facility eventually.

My program block can already do that, if I understand you correctly:

What do you imagine that would be like?

Thanks for the implied compliment!

So I picked a random method in morphic.js that has no comments and asked ChatGPT 4 "Please add comments to the following: ..."

If we think it is too detailed that is easy to fix. I then entered "Good. Now add only the important comments."

Yeah, this is exactly why I hate comments. Your "only the important" comments include several -- almost all the comments -- things like

if (onBeforeDrop) { onBeforeDrop(); } // Execute onBeforeDrop if provided

Now, really: How does that add one iota to the understanding of the person reading the comment? And it's not that ChatGPT is especially bad at commenting! No, that's exactly the kind of comment that human beings write; they did a good job of scraping human-written code for comments.

Oh, except, arguably, the comment on reactToDropOf is plain wrong, because it suggests that it's THIS rather than ORIGIN that does the reacting.

Now, wouldn't it be better to have a page of English, in Baskerville, saying

The following group of methods deal with the user moving a morph by direct manipulation with the mouse. Here are things the user can do:

  • Grab the morph with the mouse.
  • Move a grabbed morph around the World window.
  • Let go of the morph, perhaps in a different parent morph from where it started.
  • Click on the morph without moving it.
  • Double-click on the morph.
  • Hover over the morph.

In all of these methods, this.parent is the morph in which this morph was located before being moved. this.position.parent is the morph over which the morph is released after being moved. [Note: I am making these up. I have no idea what the truth is. But it's the sort of thing I'd like to be told. -bh] If the morph is released outside of the World window, then this.position will be null.

The isPressed method is not called unless the morph is clicked and released within morph.clickMaxTime milliseconds (default 2000) without moving more than morph.clickMaxMotion turtle steps (default 8). If the morph is held clicked in one place for a long time, the click is ignored; none of these methods is called. If the morph moves a significant distance, the isMoved method is called, but the isPressed and isReleased methods are not called. [I'm making this up too.]

And probably lots more, including everything Jens has ever had to stop and ask himself, or read the code to find out.

On our list. Taken some baby steps in that direction. Don't Hold Your Breath.™

That is what I meant, but I want to be able to keep the selected package in the global environment while interacting with the REPL.

You know,

Okay, actually, why do you need the PROGRAM block at all? Why not just have FACT and RULE and GOAL blocks that can be used in any order, and/or can be put in a plain old Snap! stack? And the package thing I was asking for would just be a stack not including any goals.

And I guess I would make GOAL a reporter, which would report a stream, like the logic programming language in SICP. That's a really elegant way to cope with infinite loops; you'd get an infinite-length stream, which you could manipulate like any stream, printing the first ten items, etc. As you said earlier, this works elegantly only if all the desired solutions appear in a finite first-n-items of the stream.
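To make the stream idea concrete, here is a minimal sketch in Python rather than Snap!, using a generator as the stream. The names `goal_nat` and `first` are made up for illustration, and the "goal" is a stand-in that simply has infinitely many solutions; it is not how Amalog or SICP implements anything.

```python
from itertools import count, islice

def goal_nat():
    """Hypothetical GOAL for a query with infinitely many answers.

    Yields one binding environment per solution and never terminates
    on its own, which is harmless because nothing is computed until
    a consumer asks for the next item.
    """
    for n in count():
        yield {"?n": n}

def first(n, stream):
    """Take the first n items of a (possibly infinite) stream."""
    return list(islice(stream, n))

first_ten = first(10, goal_nat())
print([env["?n"] for env in first_ten])
```

The consumer decides how much of the stream to force, so the infinite loop only bites if the wanted solutions never show up in a finite prefix.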

So the FACT and RULE blocks could do some antibugging by reordering the database to put base cases first, etc. And look for obvious circularities like sibling(a,b) :- sibling(b,a). But all the real work is done in GOAL, which starts with a stream containing a single empty environment and processes a query by resolving it with each item of the database; a resolution can drop an environment from the stream if this database item is incompatible with it, or it can extend an environment if the resolution gives rise to new bindings for it.
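A rough sketch of that filtering and extending, in Python and for facts only (no rules, so no real unification is needed). It starts, as SICP does, from a stream holding one empty environment. The function names and the sample family data are invented for illustration.

```python
def is_var(term):
    """Variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")

def match(goal, fact, env):
    """Return env extended so that goal equals fact, or None if they clash."""
    if len(goal) != len(fact):
        return None
    env = dict(env)  # copy, so other branches keep their own bindings
    for p, f in zip(goal, fact):
        if is_var(p) and p in env:
            p = env[p]          # already bound: compare the bound value
        if is_var(p):
            env[p] = f          # unbound: this fact supplies a new binding
        elif p != f:
            return None         # incompatible: drop this environment
    return env

def resolve(goal, database, envs):
    """Map one goal over a stream of environments, lazily."""
    for env in envs:
        for fact in database:
            extended = match(goal, fact, env)
            if extended is not None:
                yield extended

def solve(goals, database):
    """Run a conjunction of goals, threading the environment stream through."""
    envs = iter([{}])  # one empty environment to start with
    for goal in goals:
        envs = resolve(goal, database, envs)
    return envs

# Made-up sample data.
database = [("parent", "ann", "bob"), ("parent", "bob", "cat"), ("parent", "bob", "dan")]
goals = [("parent", "?x", "?y"), ("parent", "?y", "?z")]  # grandparent as a conjunction
solutions = list(solve(goals, database))
print(solutions)
```

Each environment coming out of the first goal either dies in the second goal (no fact fits) or comes out with `?z` added, once per matching fact.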

But why am I trying to say all this? It's in SICP 4.4 Logic Programming.

Oh yes, of course, but the joke is that they are acknowledging that there's no way that a comment on that one line of code can explain how it works. Rather, what you need is an overall understanding of the various data structures that make up a process, and also an understanding of how the PDP-11 hardware did memory management (and now I think about it, I must have been wrong in my description; the one register they changed must have been the pointer to the page table, in which page 0 contains all the processor registers). In other words, what you'd need is a node in the second depth level of the documentation tree.

It suddenly occurs to me that we are starting with different ideas of the task, and that partly explains why what I want is a little different from what you're doing. I think that your idea is to implement Prolog, period, and you happen to be using Snap! as the implementation language. But your example Amalog program really doesn't connect with Snap! in any substantial way; you write the program in 100% Prolog notation and you get an answer from Amalog and that's that. Maybe it's just your example that has that flavor rather than your view of the entire language. But I'm envisioning a user writing what's mainly a Snap! project, but using an Amalog extension that provides answers to queries, with a typical style of use wherein the GOAL block is inside a FOR EACH ITEM block that does something or other with each answer. In fact, maybe what GOAL reports isn't a stream of printable solutions, but rather a stream of environments of bindings of variables, and there's a separate ENVIRONMENT->SOLUTION that turns such an environment into a copy of the goal but with variables filled in. Or maybe even an intermediate ENVIRONMENT->VALUES that takes an environment in which variables can be bound to other variables or expressions and reports an environment, same data structure, but one in which every variable is bound directly to a constant value. And then VALUES->SOLUTION that etc.
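The two conversion steps could look something like this in Python. This is only a sketch: the function names mirror the hypothetical ENVIRONMENT->VALUES and VALUES->SOLUTION blocks, environments are assumed to be plain dicts, and compound terms are assumed to be tuples.

```python
def is_var(term):
    """Variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")

def deref(term, env):
    """Chase bindings until reaching a constant, a compound term, or an unbound variable."""
    seen = set()
    while is_var(term) and term in env and term not in seen:
        seen.add(term)  # guards against a cycle such as ?x -> ?y -> ?x
        term = env[term]
    if isinstance(term, tuple):
        return tuple(deref(part, env) for part in term)
    return term

def environment_to_values(env):
    """Same data structure, but every variable is bound directly to its final value."""
    return {var: deref(var, env) for var in env}

def values_to_solution(goal, values):
    """A copy of the goal with the variables filled in."""
    return deref(goal, values)

# Made-up example: ?x is bound to another variable, ?z to an expression.
env = {"?x": "?y", "?y": "liz", "?z": ("pair", "?x", "bill")}
values = environment_to_values(env)
solution = values_to_solution(("knows", "?x", "?z"), values)
print(values)
print(solution)
```

Keeping the steps separate means a FOR EACH ITEM loop can work with whichever form it needs: raw environments, flattened values, or printable solutions.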

So this is why you do things like putting a program name field in your program block. Lispians don't do that, generally; if you want to name something, you bind a variable to it. And even with respect to comments, what you really want is to attach a comment to a specific rule, but you don't like our notation for that. (I don't either, honestly, but since I don't do line-by-line commenting it isn't a problem for me!)

That’s why I DIY-ed it :wink:

OK, I’ll have to give that some thinking, how to achieve that. Especially since you’re also …

Apparently you’re thinking out loud, which I agree is helpful at this stage. I’m not sure if the stream approach is going to work here, though it may be useful for the REPL user interface.

As I see it now, there would have to be a common part, a REPL user interface, and an API. And within these main parts, modules for specific tasks like ranking clauses and preventing infinite loops. I suppose I could do some experiments with respect to the software architecture over the next 1-2 months or so, using this topic for sharing ideas and intermediate results.

Didn’t read that section, yet - or if I did I must have skimmed it, some time ago. Thanks for the suggestion … OMG, it’s SICP’s longest section.

Meanwhile I did another little experiment. One of the advantages of using a block language is that it's easier not to make spelling mistakes. I actually spent an hour or so recently, debugging an Amalog program where I had made a spelling error concerning one of the variables within a rule, I recall.

So I wondered if I could extend the concept of blocks for variables (or constants, or functors) to Amalog, too. This is what I came up with:

Logic programming DEVLAB script pic

Logic programming DEVLAB script pic (3)

And here's an example of its use:

Or even introducing new identifiers en passant:

The whole feature is optional, of course. One may as well use literals, or any combination of literals and this reporter.

Why a chick in an egg?

The item is being spawned (like in a game). I searched for “spawn” on an emoji-webpage, and this was their no. 1 result. Do you like it?

Oh. I see. That makes sense!

This alternative I like, too:

Logic programming DEVLAB script pic 3

But first I’m going to read all 72 (!) pages of SICP section 4.4 now :flushed:

I agree that that's more block languagy. I'm struggling to love the distinction between constants and variables, like the CONST declaration in Those Other Languages, although otoh in the Prolog context it really isn't like T.O.L. in which a "constant" is really just a variable (living with the other variables in memory) with a straitjacket. In fact I'm wondering if you shouldn't have a [image: untitled script pic (2)] one for variables. Ideally when you drag out the upvar your copy would have a question mark prepended, but that would require a mod to Snap!.

You didn't invent this nomenclature, but I'm wondering why Prolog treats the relation as a different kind of thing from its args. It's treated uniformly in unification, right? You can use a variable in the functor position, to ask "what relation holds between liz and bill?"

The exercise with identifier blocks is a mere experiment for now - I’m not sure if this approach is turning out to be beneficial on balance. I tried the ? reporter before, but (as you write) the upvar name will not automatically be prefixed with “?”, so the “?” needs to be part of the upvar name proper: [image: Logic programming DEVLAB script pic 5]; thus the generating block would be like: [image: Logic programming DEVLAB script pic 4], looking odd.


As for the difference Prolog makes between functor and arguments: I agree it seems unnecessary, so the Amalog interpreter doesn’t, really:

Indeed one can ask for various relations between “arguments”:

One step further, though this final step is mostly cosmetic, would be to simply define a goal as a tuple:

In Amalog, the only real difference between item 1 (functor) and the following arguments is that a Snap! script may be used as functor, with the final argument (= item last) as the reported result of the script (function / predicate) having been applied on the other arguments - unless the script is a command, in which case the result is “true” by definition.

Logic programming DEVLAB script pic 9

Don't forget to handle (?relation ?person bill). :~)

There are many optimizations that constant functors enable. And for a particular task if one wanted variable relations then every clause could be (relation ?relation ?arg-1 ... ?arg-n)

Furthermore, constant relations make it easy to interpret logic programs as first-order logic (really the Horn clause subset).

Which relatives is bill having arguments with? :wink: I'm afraid that's a subject for juicier fora.

Interesting. Such as ... ?

That looks a bit overdone for most practical purposes. OTOH it's probably easy to transform a database to enable it.

What practical advantages does that bring?

Yes. But can't you recognize the common special case of a constant functor and do the optimizations? IIRC SICP does that; its databases of assertions and rules are indexed on the functor, if constant, so you can quickly find relevant facts and rules, but also you can use a variable and search the entire database.
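Roughly, the special-casing could look like this. It is a Python sketch with invented names (`Database`, `candidates`), covering facts with a constant first item only; SICP's actual index also deals with rules and with assertions whose first item is not a constant symbol.

```python
from collections import defaultdict

def is_var(term):
    """Variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")

class Database:
    """Facts stored twice: in one flat list, and bucketed by their first item."""

    def __init__(self):
        self.all_facts = []
        self.by_functor = defaultdict(list)

    def add(self, fact):
        self.all_facts.append(fact)
        self.by_functor[fact[0]].append(fact)

    def candidates(self, goal):
        """Facts worth trying to match against this goal."""
        functor = goal[0]
        if is_var(functor):
            return self.all_facts                 # variable relation: search everything
        return self.by_functor.get(functor, [])   # constant relation: one table lookup

# Made-up sample data.
db = Database()
db.add(("parent", "liz", "bill"))
db.add(("sibling", "bill", "ann"))
db.add(("parent", "ann", "joe"))

print(db.candidates(("parent", "?x", "bill")))   # only the two parent facts
print(db.candidates(("?relation", "liz", "bill")))  # all three facts
```

The common case pays for one dictionary lookup, and the variable-relation query still works, just more slowly.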

I mean, don't you ever think "Julia is some sort of cousin of mine, but I forget exactly what kind" and then wouldn't you like to follow that with "Let me ask Prolog"? (I used to give implementing "nth cousin k times removed" in logic programming as an exam question, so I have a warm spot in my heart for those obscure relations.)

I hate the name "functor," by the way, because it only encourages students to try to do composition of functions

grandparent(x,y) :- parent(x, parent(y))  // wrong!

Why aren't they called "relations"?

Although SICP has optimizations for constant ones below the abstraction barrier, in its user-level documentation it treats assertions, rules, and queries as arbitrary token strings, so you can say

(assert! (Brian likes potstickers))

and ask questions such as (?who likes potstickers) and (Brian likes ?what).
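A small Python sketch of that uniform treatment, in which an assertion is just a tuple of tokens and a variable may sit in any position, including the one a Prolog programmer would call the functor. The `ask` helper and the second and third assertions are made up for illustration.

```python
def is_var(token):
    """Variables are strings that start with '?'."""
    return isinstance(token, str) and token.startswith("?")

def match(pattern, assertion):
    """Return the bindings that make pattern equal assertion, or None."""
    if len(pattern) != len(assertion):
        return None
    bindings = {}
    for p, a in zip(pattern, assertion):
        if is_var(p):
            # A repeated variable must match the same token every time.
            if bindings.setdefault(p, a) != a:
                return None
        elif p != a:
            return None
    return bindings

# Only the first assertion comes from the discussion; the others are invented.
assertions = [
    ("Brian", "likes", "potstickers"),
    ("Brian", "likes", "dumplings"),
    ("Jens", "likes", "potstickers"),
]

def ask(pattern):
    """All binding sets under which the pattern matches some assertion."""
    results = []
    for assertion in assertions:
        bindings = match(pattern, assertion)
        if bindings is not None:
            results.append(bindings)
    return results

print(ask(("?who", "likes", "potstickers")))
print(ask(("Brian", "likes", "?what")))
print(ask(("Brian", "?relation", "potstickers")))
```

Nothing in `match` knows which token is the relation, so `(parent of ?x is ?y)` style assertions would work the same way.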

P.S. That would also be more Snap!ly, because we have Smalltalk-style title text interspersed with arguments.

P.P.S. And anyway I can never remember the order of arguments to a relation such as PARENT, and it'd be easier for me if the relation were

(parent of ?x is ?y)

I don’t like it either - to me it sounds like a very strict bureaucrat from former East Germany.

Interesting line of thought. Perhaps even something like ... ?

Logic programming DEVLAB script pic (8)

Logic programming DEVLAB script pic (6)

This enables us to style a fact / goal / head as:

  • a postfix relation of 1 argument
  • an infix relation with 2 arguments
  • an infix relation with 1 leading argument, and any number of trailing arguments

The relation would always be item 1 of the block's output.

A separate block would support calling Snap! functions.

The most obvious is that a goal need only be matched with the heads of those clauses with the same predicate. Either a quick table lookup or a pointer to those clauses.

For many logic programs one can understand them both procedurally and declaratively. Sometimes instead of debugging a predicate procedurally one looks at the code and asks "Is this saying something true?"