Logic programming library development log

But why does the logic have to be first order to do that? Mathematicians evaluate the truth of higher order relations all the time.

Why not allow any number of left arguments, if you're going to distinguish the relation as an upvar?

But in a perfect world I could say

?list items from ?a to ?b are ?result

and you'd make a relation named _ items from _ to _ are _. But the user would refer to it the way I wrote it above. I know, upvars would have to be more flexible.

Because people have figured out how to evaluate the Horn clause subset of first-order logic very efficiently. And I'm pretty sure higher-order is computationally expensive to evaluate.

Yeah. I remember AI people at both MIT and Stanford working on that sort of thing in my even more ignorant youth. But still, one should be able to use easy methods on easy problems and still have hard methods for hard problems. (Not so much conceptually hard, I think, as just exponential time.)

OK. I was going to look into that option anyway.

There’s practical reasoning behind that: dropping a reporter block onto a double arrow ( ◁▷ ) that is followed by at least one slot in the same line, is rather difficult, or close to impossible (granted, one may make a copy of the block, paste that onto the arrow, subsequently delete the original). Therefore I only use multi-inputs at the end of a line - if necessary I’ll insert a line break, which IMO isn’t suitable here. OTOH, it’s not very likely that anyone is going to use an input list here. So I’ll reconsider.

… back to SICP

In order to make a reasonably efficient logic programming interpreter, I’ll need to implement memoization. I edited some code I used earlier for memoizing recursive operations:

(corrected version; the original contained a bug)

When I’m going to use this for storing solutions to a goal (“relation” will be item 1 of the arguments), I need to make sure that the number of arguments of that type of relation is always the same, for this algorithm to work.. So if a type of relation’s arguments number is variable, I may have to insert the arity as an extra argument for memoization.

The function uses the following simple lookup function:

I would have preferred to use item (key) of (table) - unfortunately that doesn’ t work for numbers, so I made this one up.

You can have the value part of each kv-pair to be itself a pair consisting of a value and a sub-dictionary. Then, if there are no more levels you report the value part, and if there are more levels you do the recursive call on the sub-dictionary part. (Pretty sure this is in Chapter 3. ;~P)

I searched SICP for multilevel memoization, but didn’t find it. Pretty sure that what you suggest is exactly what my code does, though.

Ah, sorry, I didn't make myself clear. Memoization is an application of the table data structure. (This is true in general, not just in SICP-land.) There isn't anything specifically about multilevel memoization, but that's just memoization in a multilevel table. Tables are Section 3.3.3 and n-dimensional tables are ex. 3.25. The detail about allowing both a value and a subtable for the same key is, I guess, one of my exam problems. :~)

Taking into account @bh’s suggestions (or directions, actually), I am manually translating relevant SICP code to Snap! - so I can at least see how it works (not necessarily going to use all of it in the end).

One piece of code I'm not sure I don't understand (at all) is on p. 642 (§ 4.4.4.2):

(define (always-true ignore frame-stream) frame-stream)

I now guess this translates as: a reporter always-true with two arguments (such as another reporter, and a stream), ignoring the first argument and reporting the second.

(put ‘always-true ‘qeval always-true)

And this would make an entry into a table implementing polymorphism, I suppose. The put function (with its sibling get) is a bit hard to find within the SICP text, it's not mentioned in the index and I believe I saw two different implementations but I can't remember where exactly, probably somewhere between pages 150 and 600-something. in § 2.4.3, and § 3.3.3, and perhaps somewhere else too; really confusing is in § 4.4.4.2 it appears as if the arguments have been switched.

Could anyone knowledgeable (@bh, @toontalk?) explain what this means, and how that would translate into Snap!confirm this ?

Also, I’d like to be sure that (p. 656, § 4.4.4.7):

(define (contents exp)
_ (if (pair? exp)
_ _ (cdr exp)
_ _ (error “unknown expression CONTENTS” exp)))

[BTW the “_”s mean nothing, I am merely using them for layout (indentation)]

… can safely be translated into Snap! as:

… or even just:

… ?

Uh oh, am I being too controlling?

ALWAYS-TRUE: The general form of a rule is
(rule (conclusion) body)
analogous to
(define (function args...) body)
in Scheme.

How do you try using a rule to satisfy a query? There are two steps: (1) Unify the conclusion of the rule with the query. (2) If that succeeds (i.e., doesn't report FAILED), then qeval the body in the frame-stream it reports. This gives rise to a possibly-extended-or-shortened stream of possibly-extended frames.

But some rules have empty bodies, e.g.,
(rule (append () ?b ?b))
They could have made that a special case in APPLY-A-RULE this way (my added code in capital letters -- this is why symbols should be case-insensitive!):

(define (apply-a-rule rule query-pattern query-frame)
(let ((clean-rule (rename-variables-in rule)))
(let ((unify-result
(unify-match query-pattern
(conclusion clean-rule)
query-frame)))
(if (eq? unify-result 'failed)
the-empty-stream
(IF (NULL? (RULE-BODY CLEAN-RULE))       ; no body?
(SINGLETON-STREAM UNIFY-RESULT)    ; return unify result
(qeval (rule-body clean-rule)
(singleton-stream unify-result)))))))


but instead they are avoiding having to make that test by being extra clever in the selector RULE-BODY:

(define (rule-body rule)
(if (null? (cddr rule))
'(always-true)


so that instead of seeing an empty body, APPLY-A-RULE sees (ALWAYS-TRUE).

The use of PUT here isn't about polymorphism; it's about recognizing special forms in the query language, which mostly means AND, OR, and NOT. But the PUT makes ALWAYS-TRUE also a special form, which means that instead of treating it as a query (remember, bodies are made of queries) and trying to match it against the database of assertions and rules, qeval will just call the Scheme procedure that's PUT into the table as the value associated with the key ALWAYS-TRUE. (The 'QEVAL as the second argument to PUT is just because back in Chapter 2 they defined PUT to use a 2D table, I think. Just take PUT on faith.)

So what happens when the evaluator finds a special form?

(define (qeval query frame-stream)
(LET ((QPROC (GET (TYPE QUERY) 'QEVAL)))
(IF QPROC
(QPROC (CONTENTS QUERY) FRAME-STREAM)
(simple-query query frame-stream))))


This is the code from the book; I haven't changed it. The capitalized part is what handles special forms. We look up the car of the query (that's what TYPE does) in the get/put table, and if we find it, call the procedure in the table value with two inputs: the special form with its keyword removed (CONTENTS is cdr) and the frame stream. They remove the keyword because in the usual case of AND, OR, or NOT, the contents part of the special form is a list of queries (AND query query ...) and so the handler can call itself recursively on the cdr of that list.

But in the case of ALWAYS-TRUE, the CONTENTS will be empty; there aren't any sub-queries to evaluate. So ALWAYS-TRUE has to be written to expect two arguments: an empty list of queries, which it ignores, and a frame stream, which it returns unchanged.

CONTENTS: Yes, your translation is okay, although PAIR? isn't the same as LIST?. It's making sure the EXPRESSION is a nonempty list. That's so that the query language won't accept (AND) etc.; those special forms require at least one sub-query (exactly one sub-query in the case of NOT). I guess also a perverse user could say (AND . FOO) to get a non-list CONTENTS.

I'm not saying you have to exactly copy the SICP code! Just that you should understand the SICP code and in particular how they use the idea of a stream of frames to make the whole thing functional.

Thanks for the explanation!

Yeah, that has been my interpretation, too. The SICP code from this section - and building on code from other sections - is kind of intimidating to me as a non-Schemer. I haven’t gotten around yet to analyzing how streams are to be integrated into my approach of the interpreter, but I will eventually . “Don’t hold your breath”

Well … perhaps you could help me out a bit more by telling if there are any ready-to-use Snap! equivalents of:

delay
force
stream-ref (would that be: item (n) of show stream?)
display-stream
cons-stream (= in front of stream ?)
the-empty-stream (= list() ??)
singleton-stream (= stream (x))
stream-flatmap (= sum (map over stream) ?)
interleave-delayed (why is it different from interleave, which in itself is supposed to handle streams?)
qeval
flatten-stream

BTW while studying SICP’s streams section I tried to implement the pairs function in Snap!, but something is not working well. If both streams are finite, show stream will raise an error if the number of elements to be shown is larger than the length of the stream of pairs:

This is the “project” (I simplified the test case as much as possible).
I’m pretty sure show stream is not causing the issue (while debugging I found it encounters a pair of two empties as item 1 of a Snap! record representing a stream), but I don’t know what is.

Solved it: the pairs function was devised for infinite streams only.

I'm not saying you have to exactly copy the SICP code! Just that you should understand the SICP code and in particular how they use the idea of a stream of frames to make the whole thing functional.

Thanks not what I was trying to do either. I just wanted to get the framework functional so I could get in and examine the innards of snap and javascript in real time. Just a meta-circular evaluator in snap would be incredible. INCREDIBLE, let alone what I could add on top.

The problem is, the claim is that "Snap! is Scheme disguised as scratch" and it's not, because it has features that 1996's scheme will never have, and certain expectations that SICP asks for have been subtly changed and there is a barrier there, and it's not even there intentional, it's the side effect of the book being nearly 30 years old and the next version of the book written for a language that's powerful for all the wrong reasons.

Thinking aloud, I’m considering to add and, or and not blocks to Amalog (or whatever a logic programming library will be called in the end).

I don’t think it’s absolutely necessary though:

• and is the standard for a series of sub-goals within a rule’s body within the current version.
• the effect of or can be realized by adding extra clauses, e.g.:

(child, ?a, ?b) :- (father, ?b, ?a) or (mother, ?b, ?a)

… is logically equivalent to my current implementation’s:

(child, ?a, b?) :- (parent, ?b, ?a);
(parent, ?a, ?b) :- (father, ?a, ?b);
(parent, ?a, ?b) :- (mother, ?a, ?b)

• implementing not is also strictly unnecessary, since e.g. SICP’s example (§ 4.4.3, sub “problems with not”; my paraphrase):

… would be equivalent to:

[note]

However, I suspect and, or, and not are easier to use, and may even enable implementation optimizations.

note (to myself): the final argument (result) of a goal denoting a Snap! predicate test is actually superfluous, as a test can always be rewritten to be successful when true.

(delay x) is more or less (lambda () x) (so it's a special form, not an ordinary procedure). I say "more or less" because it's actually memoized, so when you force it for a second time it doesn't recompute the underlying expression. So

(let ((already-run? #f)
(value #f))
(lambda ()
value
(let ((result X))
(set! value result)
result))))


where the X is the unevaluated text of the input to DELAY. (If you wanted to implement this in Scheme, you'd write a macro. In Snap! you can just make it a block whose input is Any(unevaluated). We rule.)

(define (force promise) (promise))
This is the opposite of DELAY. Since PROMISE is a thunk (a procedure of no inputs), we can just call it; unlike DELAY, FORCE is an ordinary procedure.

stream-ref: Yes, you could implement it using SHOW-STREAM, but it's usually just defined in terms of STREAM-CAR and STREAM-CDR:

(define stream-ref index strm)
(if (= index 0)   ; of course in Snap! it's one-origin
(stream-car strm)
(stream-ref (stream-cdr (- index 1) strm))))


display-stream is what we call SHOW STREAM, except that their version works only for finite streams, whereas we have an extra input saying the maximum number of items to display.

(cons-stream a b) is a macro for (cons a (delay b)). Again, in Snap! we can do it with an unevaluated input. CONS, for lists, is our IN FRONT OF.

I actually didn't use DELAY, though, because our IN FRONT OF doesn't allow a non-list in the second input. I'm too tired to change that tonight.

the-empty-stream can actually be any value that STREAM-NULL? can find unambiguously. We use the empty list, but it could be (LIST [THE EMPTY LIST]) too.

Jumping down a bit, because flattening is complicated, QEVAL is the central piece of what you're writing! It evaluates a query (a goal, which is basically just list-structured text) with respect to a stream of frames, reporting another stream of frames. So, no, neither Scheme nor Snap! provides this.

It's a stream of frames as input and as result because maybe you've already partly evaluated another query by applying a rule, which means (1) unify the query with the conclusion (the part before the :- ) of the rule, maybe generating some variable bindings, and (2) in the context of those bindings, evaluate the body of the rule (the part after the :- ), which may generate zero or more sets of additional bindings. Since the result of evaluating a query can be more than one set of bindings, we have to collect all the frames. In principle we could make an ordinary list of frames, but we use streams because (1) maybe there are infinitely many solutions, and we'd like the user to see some of them before we give up, and (2) even if the set of solutions isn't infinite, we want to get all the way to the end of one solution even though we write the code as if we're finding all the solutions in parallel.

This is the point about streams in general; the goal is to write programs that look like they use higher order functions:

but run as if they use iterative loops:

(Try them both with 1000000 as input if you don't see the efficiency difference.) Then try the stream version:

There's a lot more to say about the structure of QEVAL, but I don't want to write all of 3.5 here!

Okay, now to discuss flattening. FLATTEN (for lists) is

(define (flatten lists)
(accumulate append '() lists))


i.e., one degree of flattening, not like , which flattens all the way down to atoms. (This is a confusion of nomenclature, and I wish we'd agreed with Scheme even though Jens is right that what we do is closer to what the name suggests in English.)

(define (flatmap func data)
(flatten (map func data)))


is a frequently useful abbreviation, for situations in which FUNC returns a list, and we want to combine all the elements of all the lists.

Trivial example:

Less trivial example:

Note that there's an outer FLATMAP whose mapped function is a plain MAP.

Okay, now we move from lists to streams.

(define (stream-flatmap fn strm)
(flatten-stream (stream-map fn strm)))


is an ordinary procedure, not a special form. STREAM-MAP is of course just like MAP but using CONS-STREAM, STREAM-CAR, and STREAM-CDR instead of CONS, CAR, and CDR.

But FLATTEN-STREAM is tricky:

(define (flatten-stream strm)
(if (stream-null? strm)
the-empty-stream
(interleave-delayed
(stream-car strm)
(DELAY (flatten-stream (stream-cdr strm))))))


Why can't we use plain old INTERLEAVE? If you look at the examples that use INTERLEAVE, it's always in the second input of a CONS-STREAM, e.g., just above ex. 3.66,

(define (pairs s t)
(CONS-STREAM
(list ...)
(INTERLEAVE
(stream-map ...)
(pairs ...))))


As a result, the call to INTERLEAVE, including the evaluation of its inputs, is delayed. (And when it's forced, the recursive call to PAIRS again has its call to INTERLEAVE as the second input to CONS-STREAM and therefore delayed.) Note ex. 3.68, in which Louis Reasoner tries to use INTERLEAVE by itself, not inside a CONS-STREAM, and gets in trouble.

I think you could work around this by pulling out the first item of the first stream, to use as the first input to CONS-STREAM, and then juggling things around to put all the rest of the problem in the second input to CONS-STREAM, like this:

(define (flatten-stream strm)
(cond ((stream-null? strm) the-empty-stream)
((stream-null? (stream-car strm))
(flatten-stream (stream-cdr strm)))
(else (CONS-STREAM
(stream-car (stream-car strm))
(INTERLEAVE
(flatten-stream (stream-cdr strm))
(stream-cdr (stream-car strm)))))))


There's an extra base case because I'm trying to extract the first item of the first stream, so I have to handle specially the situation in which the first stream doesn't have a first item.

But they chose a different solution, which doesn't require rearranging the inputs and kludgily pulling the first item out of the first stream. Instead, they explicitly DELAY the recursive call to FLATTEN-STREAM, and to compensate for that, they need a special version of INTERLEAVE, called INTERLEAVE-DELAYED, which FORCEs its second input.

I guess what they're thinking is that they'll need INTERLEAVE-DELAYED more than once. They could have made it a special form, like CONS-STREAM, but I think they thought it was too complicated not to have it shown as a procedure that students can read and understand. (Look at where they explain CONS-STREAM instead of defining it.)

Goodnight...

85 posts were split to a new topic: Streams library 2.0 development

Thanks, bh, for moving posts on Streams library 2 to a separate topic.

So … flatmap is simply append input list: map (function) over (data) ?

And stream-flatmap is combine (map (function) over stream (s)) using (interleave streams) ?

(or variadic interleave input list: map (function) over stream)

Have you thought about how to avoid duplicates in the stream of rational numbers? Recall that the concept of a countable set is that one can place the set in a one-to-one correspondence with the natural numbers. A stream with duplicates isn't a set.

True. The above was just to see if I'm correctly interpreting SICP flatmap. To make it a set one may e.g. insert for example. Or check if GCD = 1 with every fraction calculated:

(using GCD from the APL library, plus the weird but convenient map over keep from map over stream block )