[bitc-dev] BitC surface syntax

Jonathan S. Shapiro shap at eros-os.org
Mon Aug 7 21:06:05 EDT 2006


Many of you have said, in various forms:

	LISP syntax really sucks.

In fact, it has been suggested that "LISP" originally stood for "Loony
Idiots Stick to Parenthesis". :-)

A recent conversation with Mark Miller has convinced me that we may be
able to drop this syntax.

  [Long pause to wait for applause. Remember that the Europeans won't
   see this until tomorrow AM, so we need a *very* long pause. :-)

   In fact, I have just heard the distinctive sound of a certain
   anonymous reader down at a certain anonymous Defense think tank
   falling out of his chair. ]

The original reason for using s-expression language was the need for
quasiliteral patterns. We had a very bad experience with quoting and
quasiliteral issues in HOL, which led me to say to myself "I don't want
to think about quoting issues while I'm wrestling with language design,
so let me put this issue off." Since it was a solved problem in the LISP
family of languages, I adopted a LISP syntax (over many objections).

To illustrate why S-expression are very nice for quasiliterals, consider
that we need a way to write things like:

     (implies (and (>= x 0) (>= y 0))
              (> (* x y) 0))

In fact, in ACL2 you can write exactly this. But ACL2 is using a pun,
because two languages are being mixed up here: the programming language
and the logic language. Let me introduce some new "quotes". I will use
square braces to indicate "stuff in the programming language" and curly
braces to indicate "stuff in the logic language". What is really being
said here is:

     {(implies [(and (>= {x} 0) (>= {y} 0))]
               [(> (* {x} {y}) 0))]}

When these decorations are inserted, it starts to jump out that the
expression [(and ...)] computes a value in the programming language, not
in the logic language. Because of this, we need a way to convert values
back and forth between the programming language (actually, the values
computed by the evaluator) and the logic. The boundaries are actually
very easy to see, because they occur at any place where you switch from
one to the other.

ACL2 gets away without the quoting because it guarantees that
*everything* you write in the programming language has a precise
conceptual equivalence in the logic. [It does this by restricting the
programming language.] When you write the ACL2 expression

   (and nil t)

this expression has two *simultaneous* meanings:

  1. It is an expression in the semantics of the programming language
  2. It is a logical term in the semantics of the logic.

The really cute part is that the result of the programmatic expression
always corresponds directly to the result of applying term substitution
to the term. There is a carefully maintained term<->value duality.
Because this duality exists universally in ACL2, it is not necessary to
visibly distinguish between the ACL2 logic language and the ACL2
programming language. This is VERY nice for the poor suffering
developer.

In my initial ignorance, I had hoped that we could come up with
something this simple and convenient for BitC. Now that I know a little
more, it is pretty clear that we can't. In simplest form, the problem
is:

  Most BitC programs are not terms, therfore no term<->expression
  duality exists in general.

Because of this, the visual program pun that works very nicely in ACL2
(because the duality is almost never violated in ACL2) will be very bad
in BitC (where the duality mostly doesn't exist). When proving stuff in
BitC, the author will need to be more aware of the fact that the proof
is a proof in one language (the BitC logic language) that is *about* a
program in a second language (the BitC programming language).

This means that we actually need to introduce some sort of quoting
syntax explicitly.

So what should it look like?

Look again at the example:

     {(implies [(and (>= {x} 0) (>= {y} 0))
               [(> (* {x} {y}) 0))]}

There are some things to notice. First, note that the outermost {} can
be removed. If we are writing a theorem, we certainly know that we are
operating in the logic language. We don't need any quotes to tell us
this.

Second, note that the *inner* {} quotes can also be removed. Every inner
occurrence is a single variable. It is useful if these variables are
visually distinctive, but we don't really need quotes around them. BitC
already reserves identifiers beginning with '$', because it has always
been my intention that we would omit the inner quotes by writing
something like:

     (implies [(and (>= $x 0) (>= $y 0))
              [(> (* $x $y) 0))]}

Third. Look again at the $x,$y things. Each of them (necessarily)
appears in the position of a grammatically well-formed expression. This
means that we can actually make them part of the grammar -- even in a
non s-expression language (though we also need to deal with '$x for
types). Actually, we don't even need the special '$' identifiers -- this
is purely a visual convenience for the developer.

What this means is that the following input would be equally clear (now
using '[[' and ']]' for quasiliteral patterns):

  implies [[ ($x >= 0) and ($y >= 0) ]]
          [[ ($x * $y) >= 0 ]]

And THIS means that we are free to adopt a more conventional surface
syntax.

I can hear a few of you laughing at me and saying "I told you so." Yes.
You told me so. I am tempted to respond "I am just a simple farmer from
Patektikva." [A former very talented head of Mossad used to say this a
lot.] There are only so many issues that I can juggle at once.

So: I will soon propose a more conventional surface syntax. Unless there
is violent objection, I will attempt to do something in the style of the
C/Java/C# family of languages.

Please! No syntax wars. Let's get a proposal in place, and then we can
fight about it over on the Haskell mailing list... :-)


shap




More information about the bitc-dev mailing list