Logic programming library development log

sonic123cdmania · February 8, 2024, 2:43pm

Why did you reply to ME when that should be a reply to @qw23? But a sudoku solver WOULD be nice.

bluebaritone21 · February 8, 2024, 2:44pm

With prolog, it's easy.

sonic123cdmania · February 8, 2024, 2:45pm

... You only replied to the 2nd part, with the sudoku solver, but I had a question before that part.

bluebaritone21 · February 8, 2024, 2:48pm

I forgot who I was replying to. I just wanted to quote qw23.

sonic123cdmania · February 8, 2024, 2:53pm

Ah, Ok.

qw23 · February 8, 2024, 4:57pm

Amalog (the name of my logic programming language) is a kind of simple Prolog (and the interpreter is only a few days old, so it may still have serious bugs). Within the project I made a few small demo programs (family tree, a potentially infinite loop, and fibonacci). So I'm not sure if you can actually write a sudoku solver yet.
BTW I'm working on a short tutorial.

bluebaritone21 · February 8, 2024, 4:58pm

Okay.

qw23 · February 9, 2024, 12:50am

done that!

In the current version (1.4) each AMALOG program is a self-contained block reporting its result:

Link to project: see post #19.

bh · February 10, 2024, 5:47pm

Interesting. I think programs that deal with data structures could be tough to analyze because we may know things about what counts as valid input to this program that the analysis program doesn't know. So there might be an obvious infinite loop for data structures of a certain type, but we know that those structures will never come up, so the program works correctly. But I am only an egg.

I'm not sure if "retracted" has some special meaning in this context. But what I meant is that, suppose the query matches the conclusions of two rules. So you look at the first rule, and try to evaluate its body in an environment containing bindings that make the query match the conclusion. The body is itself a query, and as you unify that secondary query with some rule's conclusion you create additional bindings, then try to evaluate that rule's body, etc etc, and then finally you find that you can't, in fact, satisfy the body of the first rule. So, now you want to match the original query against the second candidate rule. But you have to do that in an environment that doesn't include any of the bindings created while trying to satisfy the first rule.

None of this is problematic in principle, because in principle each query is evaluated with respect to an environment, and bindings come and go as queries are entered or exited, same as for procedure calls in conventional languages. But what started this sub-conversation was you saying you were going to memoize bindings. "Memoize" means that those bindings are going to stick around even when the query that gives rise to them finishes. So that's what worries me.

You're right, you have to satisfy all the conditions of a conjunction. That's why it's a tree and not just a linear task list. So you have to get down to the root to have a complete solution. But that's not the question; the question is whether you have to build an explicit depth-first tree search into your code. And no, you don't. Every time you resolve a query against a rule, that's analogous to calling a procedure in a procedural language. You put this new task on a stack (which is just a linear data structure, not a tree). When this resolution is finished, you pop it off the stack and continue with the previously-current task. Because one of these tasks may involve several subtasks, one at a time, you may push and pop tasks on and off the stack repeatedly before popping off the next level down. Since the subtasks happen one at a time, the stack doesn't become a tree.

Does that help? Or am I answering the wrong question?

bh · February 10, 2024, 5:56pm

Cool. Is the program ID used for anything? Why not instead, if you want to name one, put it in a ring and assign it to a (Snap!) variable?

The comment thing pushes my buttons. Your example comment says only and exactly the same thing that the code says! How is that helpful? Maybe if it were a little more abstracted: "This program is about a family tree. The facts in the database are about parenthood; the rules define other relationships (grandparent, cousin, sibling, etc.)." But even that should be obvious from the code if the relations have reasonable names.

Seems to me that you've been bitten by a high school computer science teacher. :~(

qw23 · February 10, 2024, 8:39pm

I’m currently struggling with the Loop Stopper^TM, as I’ve called the mechanism to a avoid infinite loops. Or, to put a positive spin on it, it’s my primary area of interest right now. The main challenge is not to prevent infinite loops, but rather to prevent the Loop Stopper from aborting promising branches. As a general rule, any program should contain facts before rules, more specific rules before more general rules, non- before autorecursive rules, and l’ve also adopted some policies for use within a rule’s body. Even so, in one of my programs there’s a clause now that, when added behind the other clauses, unexpectedly reduces the number of solutions (This particular case has been solved in v1.5, but I’m far from sure that it can’t occur again, in a more complex program, to come)… It’s a work in progress!

You must have misunderstood what I wrote in post #27. My interpreter memoizes solutions per goal.

bh:

You're right, you have to satisfy all the conditions of a conjunction. That's why it's a tree and not just a linear task list. So you have to get down to the root to have a complete solution. But that's not the question; the question is whether you have to build an explicit depth-first tree search into your code. And no, you don't. Every time you resolve a query against a rule, that's analogous to calling a procedure in a procedural language. You put this new task on a stack (which is just a linear data structure, not a tree). When this resolution is finished, you pop it off the stack and continue with the previously-current task. Because one of these tasks may involve several subtasks, one at a time, you may push and pop tasks on and off the stack repeatedly before popping off the next level down. Since the subtasks happen one at a time, the stack doesn't become a tree.

Does that help? Or am I answering the wrong question?

I truly don’t know. I’ll look into it later.

The program name doesn't do anything (yet), it’s just a field for the programmer’s reference now.

This, too, is work in progress.

Actually when I was in secondary school, decades ago, there was no such thing as computer science being taught there. I read, and loved, GEB though.

bh · February 10, 2024, 9:36pm

Yeah, that's what I was trying to ask.

Oh. I can help. Just delete that input. ;~P

Well someone must have taught you comment fetishism! I don't believe anyone thinks that way naturally.

GEB is wonderful! Not least because it's typeset in Baskerville.

qw23 · February 11, 2024, 6:34pm

I find commenting my code really useful for 3 reasons:

To explain the code (how to use, inner workings) to others
To enable my future self to remember the code’s intricacies and limitations even after a few months or years
To fully understand what I’m doing at development time

If you think the comment options (and how used them) in Amalog v1.4 were over the top: currently Amalog (v1.5) has the option of multiple comments, which I’ve used to include and discuss pieces of Amalog “code”. I’m even contemplating in-code comment options for a future version.

bh · February 11, 2024, 7:38pm

Brian's potted lecture on program documentation:

A program is a tree structure. It has some big tasks to carry out (parsing user input, say). The big tasks have medium-size constituent tasks, etc. Each of these tasks is carried out by a family of procedures.

So in a perfect world, in which programmers had liberal educations and knew how to write, there would be a documentation tree isomorphic to the program tree. There would be a Program Logic Manual describing the overall structure of the program. (That's what they were called at IBM when I was a child; they were themselves products IBM would sell you.) Then there would be more detailed documentation on the intermediate program chunks, or at least on the interesting ones. (If a chunk did something really new, or did it in a really new way, such intermediate documentation might include a journal article to announce the new thing to the world. Think about the Mapreduce article as a paradigmatic example.) And then, out at the leaves of the documentation tree, there would be inline comments attached to the code of each procedure.

So, it isn't a perfect world, and programmers would much rather program than write documentation. So you're lucky if you get user documentation, let alone internal documentation.

Given that state of affairs, if you (the later maintainer of a program, your own or someone else's) can only have a subset of the complete documentation tree, which subset would you like? I claim that you want the top levels of the tree, documenting the overall program and its major components. If you understand those levels, you can read the code, which should be largely self-documenting anyway. But, alas, what you get instead is the very bottom level of the documentation tree, the inline comments per procedure, which are almost always a direct translation of the code into English. (Yes, I know there are other human languages, but not so much in the computing world. When I visited China, a week after Mao died, the only person I met who spoke English other than the professional interpreters was the head of a university CS department.) Those inline comments rarely give me any insight into the code, my own or someone else's.

So, for example, in Snap!, we don't really have separate internal documentation, but what we do have is the great big comment at the beginning of morphic.js, and that's what you really want to read, to understand how Snap! works. The inline comments are less useful, imho. (There is also some intermediate-level documentation in the form of large comments at the beginning of the implementation of an object class.)

I'm an extremist about this. I tell my students that any time I'm tempted to put a comment in my code, I take that as a sign that I'm being too clever and should instead rewrite the code to be understandable. (Really, this is true; when I revisit my own code years later, and there's a comment, the code gives me a headache and the comment doesn't help a bit.)

The paradigmatic example of inline commenting is in Unix v6, the first one to be distributed outside of Bell Labs. In the scheduler, the switch from one user process to another is done in one instruction, which just assigns a new value to the hardware register containing the stack pointer. And the comment is

// You are not expected to understand this.

It's the only thing everyone still remembers from v6.

And you didn't really answer

Anyway, we already have the ability to attach comments to blocks, so there's no need for you to invent another way. And your program block is going to be gigantic even without the inline comments, if you imagine extending the grandmother example to a general family tree package.

Speaking of packages, I know you know this, because you said "in the current version," but what we need way more than inline comments are the ability to include a package of rules by reference, and an interactive REPL! (I want you to get it perfect, so that we can make this an official library and I don't have to write one!)

qw23 · February 12, 2024, 1:26am

I hear what you’re saying. Actually many of my current comments are either of a tutorial nature, or about limitations and development to-do’s.

IMO this mostly proves the developers had a sense of humour.

Some things I don’t particularly like about the Snap! IDE’s comment facility:

comments are positioned to the right of the code, and therefore effectively require a wider window for reading;
larger comments may partially cover each other;
comments allow only text - I would like to include literal code fragments, in blocks;
comments may get unattached from the code, and get lost.
That’s why I developed my own facility. But if you disapprove of it for library inclusion, I can of course use the standard comment facility eventually.

My program block can already do that, if I understand you correctly:

What do you imagine that would be like?

Thanks for the implied compliment!

toontalk · February 12, 2024, 2:35am

So I picked a random method in morphic.js that has no comments and asked ChatGPT 4 "Please add comments to the following: ..."

If we think it is too detailed that is easy to fix. I then entered "Good. Now add only the important comments."

bh · February 12, 2024, 4:26am

Yeah, this is exactly why I hate comments. Your "only the important" comments include several -- almost all the comments -- things like

if (onBeforeDrop) { onBeforeDrop(); } // Execute onBeforeDrop if provided

Now, really: How does that add one iota to the understanding of the person reading the comment? And it's not that ChatGPT is especially bad at commenting! No, that's exactly the kind of comment that human beings write; they did a good job of scraping human-written code for comments.

Oh, except, arguably, the comment on reactToDropOf is plain wrong, because it suggests that it's THIS rather than ORIGIN that does the reacting.

Now, wouldn't it be better to have a page of English, in Baskerville, saying

The following group of methods deal with the user moving a morph by direct manipulation with the mouse. Here are things the user can do:

Grab the morph with the mouse.

Move a grabbed morph around the World window.

Let go of the morph, perhaps in a different parent morph from where it started.

Click on the morph without moving it.

Double-click on the morph.

Hover over the morph.

In all of these methods, this.parent is the morph in which this morph was located before being moved. this.position.parent is the morph over which the morph is released after being moved. [Note: I am making these up. I have no idea what the truth is. But it's the sort of thing I'd like to be told. -bh] If the morph is released outside of the World window, then this.position will be null.

The isPressed method is not called unless the morph is clicked and released within morph.clickMaxTime milliseconds (default 2000) without moving more than morph.clickMaxMotion turtle steps (default 8). If the morph is held clicked in one place for a long time, the click is ignored; none of these methods is called. If the morph moves a significant distance, the isMoved method is called, but the isPressed and isReleased methods are not called. [I'm making this up too.]

And probably lots more, including everything Jens has ever had to stop and ask himself, or read the code to find out.

bh · February 12, 2024, 6:47am

On our list. Taken some baby steps in that direction. Don't Hold Your Breath.™

That is what I meant, but I want to be able to keep the selected package in the global environment while interacting with the REPL.

You know,

Okay, actually, why do you need the PROGRAM block at all? Why not just have FACT and RULE and GOAL blocks that can be used in any order, and/or can be put in a plain old Snap! stack? And the package thing I was asking for would just be a stack not including any goals.

And I guess I would make GOAL a reporter, which would report a stream, like the logic programming language in SICP. That's a really elegant way to cope with infinite loops; you'd get an infinite-length stream, which you could manipulate like any stream, printing the first ten items, etc. As you said earlier, this works elegantly only if all the desired solutions appear in a finite first-n-items of the stream.

So the FACT and RULE blocks could do some antibugging by reordering the database to put base cases first, etc. And look for obvious circularities like sibling(a,b) :- sibling(b,a). But all the real work is done in GOAL, which starts with an initial empty stream of environments and processes a query by resolving it with each item of the database; a resolution can omit an item from the stream of environments if this database item is incompatible with that environment, or it can expand an item from the stream of environments if the resolution gives rise to new bindings for that environment.

But why am I trying to say all this? It's in SICP 4.4 Logic Programming.

Oh yes, of course, but the joke is that they are acknowledging that there's no way that a comment on that one line of code can explain how it works. Rather, what you need is an overall understanding of the various data structures that make up a process, and also an understanding of how the PDP-11 hardware did memory management (and now I think about it, I must have been wrong in my description; the one register they changed must have been the pointer to the page table, in which page 0 contains all the processor registers). In other words, what you'd need is a node in the second depth level of the documentation tree.

It suddenly occurs to me that we are starting with different ideas of the task, and that partly explains why what I want is a little different from what you're doing. I think that your idea is to implement Prolog, period, and you happen to be using Snap! as the implementation language. But your example Amalog program really doesn't connect with Snap! in any substantial way; you write the program in 100% Prolog notation and you get an answer from Amalog and that's that. Maybe it's just your example that has that flavor rather than your view of the entire language. But I'm envisioning a user writing what's mainly a Snap! project, but using an Amalog extension that provides answers to queries, with a typical style of use wherein the GOAL block is inside a FOR EACH ITEM block that does something or other with each answer. In fact, maybe what GOAL reports isn't a stream of printable solutions, but rather a stream of environments of bindings of variables, and there's a separate ENVIRONMENT->SOLUTION that turns such an environment into a copy of the goal but with variables filled in. Or maybe even an intermediate ENVIRONMENT->VALUES that takes an environment in which variables can be bound to other variables or expressions and reports an environment, same data structure, but one in which every variable is bound directly to a constant value. And then VALUES->SOLUTION that etc.

So this is why you do things like putting a program name field in your program block. Lispians don't do that, generally; if you want to name something, you bind a variable to it. And even with respect to comments, what you really want is to attach a comment to a specific rule, but you don't like our notation for that. (I don't either, honestly, but since I don't do line-by-line commenting it isn't a problem for me!)

qw23 · February 12, 2024, 9:39am

That’s why I DIY-ed it

OK, I’ll have to give that some thinking, how to achieve that. Especially since you’re also …

bh:

… envisioning a user writing what's mainly a Snap*!* project, but using an Amalog extension that provides answers to queries, with a typical style of use wherein the GOAL block is inside a FOR EACH ITEM block that does something or other with each answer. In fact, maybe what GOAL reports isn't a stream of printable solutions, but rather a stream of environments of bindings of variables, and there's a separate ENVIRONMENT->SOLUTION that turns such an environment into a copy of the goal but with variables filled in. Or maybe even an intermediate ENVIRONMENT->VALUES that takes an environment in which variables can be bound to other variables or expressions and reports an environment, same data structure, but one in which every variable is bound directly to a constant value. And then VALUES->SOLUTION that etc.

Apparently you’re thinking out loud, which I agree is helpful at this stage. I’m not sure if the stream approach is going to work here, though it may be useful for the REPL user interface.

As I see it now, there would have to be a common part, a REPL user interface, and an API. And within these main parts, modules for specific tasks like ranking clauses, and preventing infinite loops, I suppose I could do some experiments with respect to the software architecture, over the next 1-2 months or so, using this topic for sharing ideas and intermediate results.

Didn’t read that section, yet - or if I did I must have skimmed it, some time ago. Thanks for the suggestion … OMG, it’s SICP’s longest section.

qw23 · February 12, 2024, 11:56am

Meanwhile I did another little experiment. One of the advantages of using a block language is that it's easier not to make spelling mistakes. I actually spent an hour or so recently, debugging an Amalog program where I had made a spelling error concerning one of the variables within a rule, I recall.

So I wondered if I could extend the concoept of blocks for variables (or constants, or functors) to Amalog, too. This is what I came up with:

Logic programming DEVLAB script pic

Logic programming DEVLAB script pic (3)

And here's an example of its use:

Or even introducing new identifiers en passant:

The whole feature is optional, of course. One may as well use literals, or any combination of literals and this reporter.