Regular expression parser

warpedwartwars · October 5, 2022, 3:04am

re

@bh how do I parse infix operators? E.g., a|b -> (| a b)

bh · October 5, 2022, 11:28am

Look in CSLS vol 3 and search on the page for "Expressions and Precedence."

warpedwartwars · October 5, 2022, 8:20pm

I can't quite understand it.

bh · October 5, 2022, 8:22pm

Can you ask a more specific question?

warpedwartwars · October 5, 2022, 8:28pm

I think I would have more of a chance of understanding it if you explained it, especially if it's in Snap! terms.

bh · October 5, 2022, 11:47pm

Umm, maybe tomorrow... sorry...

18001767679 · October 5, 2022, 11:48pm

lol it's morning for me and you are rapidly posting haha

warpedwartwars · October 5, 2022, 11:52pm

Ok.

@18001767679 it's midafternoon here.

18001767679 · October 5, 2022, 11:52pm

haha ik

bh · October 6, 2022, 6:08am

Okay, let's see what I can do.

The heading of this thread is "regular expression parser" but that's not what we're doing at all, right? We're doing parsing of arithmetic expressions, even though your example is a boolean rather than an arithmetic operator.

So, you've put the program text through a tokenizer, so now you just have meaningful tokens to deal with and not spacing and so on. In particular, instead of a text string such as "3 + 4" or "3+4" you have a list (3, +, 4).
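As a rough illustration of that tokenizing step (in Python rather than Snap!), here is a minimal sketch. The name `tokenize` and the set of recognized operators are assumptions made for this example, not anything from the project being discussed.

```python
import re

# Assumed token shapes: unsigned numbers, four operators, and parentheses.
# Characters matching none of these (such as spaces) are silently skipped.
TOKEN = re.compile(r"\d+(?:\.\d+)?|[-+*/×()]")

def tokenize(text):
    """Return the list of tokens found in text, ignoring spacing."""
    return TOKEN.findall(text)

print(tokenize("3 + 4"))  # ['3', '+', '4']
print(tokenize("3+4"))    # ['3', '+', '4']
```

Both spellings of the input give the same token list, which is the point: the parser never has to think about spacing.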

Now what you want to do is scan through that list left to right and get something you can run. And the problem you have is that when you've scanned (3, +, 4) you can't add 3 and 4 yet because maybe the input expression is actually 3+4×5 and you have to do the multiplication before you can add the result to 3.
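One common way to handle this is precedence climbing: before an operator takes its right operand, any tighter-binding operator that follows gets to claim it first. The Python sketch below is only an illustration under assumed names (`PRECEDENCE`, `parse_expression`) and an assumed precedence table; it handles no parentheses and is not the approach from CSLS.

```python
# Assumed binding strengths; a higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "×": 2, "/": 2}

def parse_expression(tokens, pos=0, min_prec=1):
    """Parse tokens[pos:] and return (prefix_tree, next_pos)."""
    left = tokens[pos]  # an operand; this sketch has no parentheses
    pos += 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Only operators that bind tighter than op may extend the right operand.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = [op, left, right]
    return left, pos

tree, _ = parse_expression(["3", "+", "4", "×", "5"])
print(tree)  # ['+', '3', ['×', '4', '5']]
```

The multiplication ends up nested inside the addition, so it is evaluated first even though it was scanned second.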

Is that where we're starting?

Let's get agreement on the starting point and then I can continue from there.

warpedwartwars · October 7, 2022, 1:57am

No, I am trying to parse regular expressions; | just happens to be an infix binary RE operator.

I'm trying to parse it the way the lisplistparse project parses Lisp lists.

bh · October 7, 2022, 2:31am

Umm, DuckDuckGo doesn't know what that is, and neither do I...

I think your situation is basically the same as arithmetic; you have to handle infix operators with different precedences.
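To make that concrete for regular expressions, here is a hedged Python sketch of a recursive-descent parser with one function per precedence level. It assumes a tiny dialect (single-character literals, `|`, postfix `*`, parentheses), reads characters directly instead of using a separate tokenizer, and uses the made-up operator name `cat` for concatenation. The function names are hypothetical too.

```python
def parse_regex(text):
    """Parse a tiny regex dialect into nested prefix lists."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def advance():
        nonlocal pos
        pos += 1

    def alternation():  # lowest precedence: a|b
        left = concatenation()
        while peek() == "|":
            advance()
            left = ["|", left, concatenation()]
        return left

    def concatenation():  # middle precedence: ab, with no explicit operator
        left = repetition()
        while peek() is not None and peek() not in "|)":
            left = ["cat", left, repetition()]
        return left

    def repetition():  # highest precedence: a*
        node = atom()
        while peek() == "*":
            advance()
            node = ["*", node]
        return node

    def atom():
        ch = peek()
        if ch is None or ch in "|)*":
            raise SyntaxError(f"unexpected {ch!r} at position {pos}")
        advance()
        if ch == "(":
            node = alternation()
            if peek() != ")":
                raise SyntaxError("missing )")
            advance()
            return node
        return ch

    tree = alternation()
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return tree

def to_lisp(tree):
    """Render the nested lists as a Lisp-style string."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(to_lisp(part) for part in tree) + ")"

print(to_lisp(parse_regex("a|b")))    # (| a b)
print(to_lisp(parse_regex("ab|c*")))  # (| (cat a b) (* c))
```

Each level calls the next tighter level for its operands, which is what makes `|` bind more loosely than concatenation and `*` bind most tightly.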

The way Lisp parses lists is trivial: just call the parser recursively when you see a left paren, and report your current value to the caller when you see a right paren. That's it, except for a couple of details, such as turning 'FOO into (QUOTE FOO). Anything else that comes along is just an item in the list you're building.
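A minimal Python sketch of that idea follows. It assumes the input is already a token list in which each paren and each quote mark is its own token, and the name `read_datum` is made up for the example. Here each call hands back the position it stopped at, so the caller knows where to resume.

```python
def read_datum(tokens, pos=0):
    """Read one datum starting at tokens[pos]; return (datum, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    pos += 1
    if token == "(":
        items = []
        # Left paren: recurse for each item until the matching right paren.
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = read_datum(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing )")
        return items, pos + 1  # step past the right paren
    if token == ")":
        raise SyntaxError("unexpected )")
    if token == "'":
        # 'FOO becomes (QUOTE FOO)
        quoted, pos = read_datum(tokens, pos)
        return ["QUOTE", quoted], pos
    return token, pos  # anything else is just an item

tokens = ["(", "A", "(", "B", "C", ")", "'", "FOO", ")"]
print(read_datum(tokens)[0])  # ['A', ['B', 'C'], ['QUOTE', 'FOO']]
```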

warpedwartwars · October 7, 2022, 2:41am

lisplistparse
And you do know what it is--you wrote it after all!

bh · October 7, 2022, 5:05am

Ugh, hard to believe I wrote that. It's because I was trying to meet the OP halfway.

Part of the hair is from the fact that it's trying to tokenize and parse all at once. That's doable, but it confuses the issue. Much better to separate out those parens (which is all there is to tokenization in Lisp, apart from a few special cases) first:
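In a text language, separating out the parens can be very short. The Python sketch below is an assumption-laden illustration, not the Snap! script: `tokenize_lisp` is a made-up name, and it ignores the special cases (strings, quote marks, comments).

```python
def tokenize_lisp(text):
    """Split Lisp text into tokens, making each paren its own token."""
    # Pad every paren with spaces, then split on whitespace.
    return text.replace("(", " ( ").replace(")", " ) ").split()

print(tokenize_lisp("(define (square x) (* x x))"))
# ['(', 'define', '(', 'square', 'x', ')', '(', '*', 'x', 'x', ')', ')']
```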

After this I started to write the actual parser, but it has a bug and I have to get packed before my 6am flight tomorrow (8 hours from now). So I'll show you the current state of things. First, as I said earlier, I want to turn the input token list into a buffer object that includes a pointer saying where I'm up to. This is necessary because the recursive calls for sublists have to tell their caller how far they've read. So instead of calling ALL BUT FIRST OF all the time, I just increment the pointer by mutation, and the token list that the caller already has stays up to date:
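For readers who think in text languages, here is a rough Python analogue of such a buffer. The class `TokenBuffer`, its methods, and the helper `read_list` are hypothetical names for this sketch; it is not the Snap! code and not the PARSE block.

```python
class TokenBuffer:
    """A token list plus a pointer to the next unread token."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.position = 0

    def peek(self):
        if self.position >= len(self.tokens):
            raise SyntaxError("ran out of tokens")
        return self.tokens[self.position]

    def next(self):
        token = self.peek()
        self.position += 1  # mutation: every holder of the buffer sees the change
        return token

def read_list(buffer):
    """Read items up to the matching right paren (left paren already consumed)."""
    items = []
    while buffer.peek() != ")":
        token = buffer.next()
        # The recursive call advances the shared pointer, so nothing
        # about "how far it read" has to be passed back explicitly.
        items.append(read_list(buffer) if token == "(" else token)
    buffer.next()  # consume the right paren
    return items

buffer = TokenBuffer(["(", "a", "(", "b", ")", ")", "c"])
buffer.next()              # read the outer "("
print(read_list(buffer))   # ['a', ['b']]
print(buffer.peek())       # c
```

After the nested call returns, the caller's buffer is already positioned at `c`, which is the effect the pointer mutation is meant to achieve.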

Once that's clear, you're ready to read PARSE:

So, find my bug for me. :~)

warpedwartwars · October 7, 2022, 5:14am

Just to be clear, the OP was me. I had wanted to have a SPLIT BY XML block and eventually you posted this, and then I ended up failing.

bh · October 7, 2022, 5:17am

Right, okay, trying to meet you halfway! :~P

warpedwartwars · October 7, 2022, 5:18am

Also, why were you trying to meet me halfway?

bh · October 7, 2022, 5:19am

Because I'm a teacher, and because the way to get someone to understand something is to meet them where they are and move them one step at a time.

warpedwartwars · October 7, 2022, 5:23am

Ah.