10 dim a$(2)
15 def fn a(x) = x * 2
20 a$(0) = "abc"
30 input a$(1)
40 if a$(0)=a$(1)then print sin(1), fna(2)
This piece of code doesn't make any sense, but it's a nice test case for #Genesis64's pre-parser. And now you ask yourself: "Why a pre-parser (and what is it)?"
When I first started to work on this, the Genesis64 parser was actually an interpreter: you entered #C64 #BASIC code and it interpreted it using a boatload of RegExes (because, hey, BASIC looks so easy that you could parse it with regular expressions alone - boy, was I wrong).
Because I used RegExes to parse the code, it seemed like a good idea to make arrays easier to find by using the C-like [ and ] instead of ( and ). I used the same reasoning for =, thinking that simple comparisons would be easier to find if I used == instead of =.
Here's the output of the (very first) pre-parser:
10 dim a$[2]
15 def fn a[x] = x * 2
20 a$[0] == "abc"
30 input a$[1]
40 if a$[0]==a$[1]then print sin[1], fna[2]
It used this nice RegEx to detect the "start" of an array in code:
/[a-z]+\d*[\%\$]*\s*\(/
It finds everything that looks like an array up to the opening parenthesis, so you just run it over each line, find the matching closing parenthesis and replace ( and ) with [ and ].
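To make the idea concrete, here is a minimal sketch of that naive replacement step. It is written in Python purely for illustration and is not the actual Genesis64 code; the names `ARRAY_START`, `find_closing` and `naive_brackets` are made up for this example. It assumes lowercase input, like the listing above, and it does not handle parentheses inside string literals.

```python
import re

# The RegEx from above, ported to Python: letters, optional digits,
# optional % or $ suffix, optional whitespace, then an opening parenthesis.
ARRAY_START = re.compile(r"[a-z]+\d*[%$]*\s*\(")


def find_closing(line: str, open_pos: int) -> int:
    """Return the index of the ')' that matches the '(' at open_pos, or -1."""
    depth = 0
    for i in range(open_pos, len(line)):
        if line[i] == "(":
            depth += 1
        elif line[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1


def naive_brackets(line: str) -> str:
    """Turn every 'name( ... )' pair into 'name[ ... ]' without further checks."""
    chars = list(line)  # edited copy; positions stay aligned with 'line'
    pos = 0
    while True:
        match = ARRAY_START.search(line, pos)
        if match is None:
            break
        open_pos = match.end() - 1  # the '(' is the last matched character
        close_pos = find_closing(line, open_pos)
        if close_pos != -1:
            chars[open_pos] = "["
            chars[close_pos] = "]"
        pos = match.end()  # continue inside the parentheses to catch nesting
    return "".join(chars)


for basic_line in ["10 dim a$(2)", "30 input a$(1)", "40 if a$(0)=a$(1)then print sin(1), fna(2)"]:
    print(naive_brackets(basic_line))
```

Run on the test program, this reproduces the bracket part of the listing above, including the places where it should not have touched anything.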
Almost .... correct.
Let's start with line 10: obviously a$[2] is wrong, as we're not accessing the array but DIMing it. So the pre-parser needs to check whether the array variable is used in a DIM statement.
Line 15: wrong, DEF FN is followed by a "name" rather than an array.
Line 20: wrong, it's a LET command without the LET (we don't need it in BASIC v2), so the = is an assignment and not a comparison.
Line 30: wow, correct.
Line 40: almost correct, but SIN is a function, not an array, and thus has to keep its ( and ) intact, and FN A is a function as well.
So 3 out of 5 lines needed further testing, and I hadn't even really tackled comparison yet.
First check for DIM and DEF FN / FN, then filter out the BASIC functions (SIN, ABS, ...), oh, and deal with nested things like a$(b(2)) or SIN(a((ABS(3)))) ...
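The checks listed above could look roughly like the following Python sketch. It is only an illustration under a few assumptions, not the real pre-parser: `BASIC_FUNCTIONS` is a small, incomplete sample, the helper names are invented for this example, keywords are assumed to be separated from names by spaces, the input is lowercase, and string literals containing ( or : are not handled.

```python
import re

ARRAY_START = re.compile(r"[a-z]+\d*[%$]*\s*\(")

# Small, incomplete sample of BASIC v2 functions (assumption for this sketch).
BASIC_FUNCTIONS = {"sin", "cos", "abs", "int", "len", "chr$", "mid$"}


def find_closing(line: str, open_pos: int) -> int:
    """Return the index of the ')' that matches the '(' at open_pos, or -1."""
    depth = 0
    for i in range(open_pos, len(line)):
        if line[i] == "(":
            depth += 1
        elif line[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1


def is_array_access(line: str, match: re.Match) -> bool:
    """Decide whether a 'name(' match is an array access that may be rewritten."""
    name = match.group(0)[:-1].strip()     # drop the '(' and any whitespace
    before = line[:match.start()]
    statement = before.rsplit(":", 1)[-1]  # current statement, up to the match
    if re.search(r"\bdim\b", statement):   # DIM declares the array
        return False
    if name.startswith("fn") or re.search(r"\bfn\s*$", before):
        return False                       # FN call or DEF FN name
    if name in BASIC_FUNCTIONS:            # built-in function, not an array
        return False
    return True


def to_g64_brackets(line: str) -> str:
    """Rewrite only real array accesses from ( ) to [ ], including nested ones."""
    chars = list(line)
    pos = 0
    while (match := ARRAY_START.search(line, pos)) is not None:
        pos = match.end()                  # keep scanning inside the parentheses
        if not is_array_access(line, match):
            continue
        open_pos = match.end() - 1
        close_pos = find_closing(line, open_pos)
        if close_pos != -1:
            chars[open_pos], chars[close_pos] = "[", "]"
    return "".join(chars)


print(to_g64_brackets("40 if a$(0)=a$(1)then print sin(1), fna(2)"))
print(to_g64_brackets("sin(a((abs(3))))"))
```

Even this short version already needs three special cases and a function table, and it still ignores = versus ==, crunched code without spaces, and an array access used as a dimension inside a DIM statement.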
Nonetheless, I kept using that idea throughout three (or four?) iterations of the parser, adding conditions for DEF FN, functions and whatnot, to the point where it perfectly converted C64 BASIC to "G64"-BASIC. It just took a few hundred lines of code and conditions to get there. With the current rush of motivation (I just released v. 2.6.7.3 of our e-learning suite WideBight), I re-read the code for the pre-parser and decided to drop that idea and just make sure the parser can deal with BASIC the way it is written, instead of jumping through hoops to convert it to something else first.