[bitc-dev] Syntax meanderings

Jonathan S. Shapiro shap at eros-os.org
Fri Apr 30 13:09:02 PDT 2010


On Fri, Apr 30, 2010 at 2:47 AM, Elias Gabriel Amaral da Silva
<tolkiendili at gmail.com> wrote:

> Random comment: I like to have ?, ! (and _ too) counted as "letters",
> so that is_string? is a valid identifier. I like the scheme/ruby idiom
> of functions ending in ? being predicates, and functions ending in !
> being destructive on its parameters. (Except that ! is "not" on C..).

Oddly enough, I spent about 20 minutes thinking about just this
yesterday (amazing how expensive tokenization is in dollar terms :-) ),
because I share your inclination. Let me lay out what I've got so far.

Underscore ('_') isn't really a problem. There is no need to include
it in the legal characters for "mixidents". The only tricky bit is
that we use leading double underscore for reserved identifiers, so if
"++" is a legal mixident, we want "__++" to be a legal mixident. So:

  1. We make identifiers consisting entirely of underscores illegal
     (need to fix the specification here).
  2. Leading underscores are permitted in either idents or mixidents.
  3. The first non-underscore character fixes you into one space or
     the other.

So much for underscore. :-)
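
A minimal sketch of the three underscore rules, written in Python purely
for illustration. The name classify and the OPERATOR_CHARS set are
assumptions (the post does not say which characters a mixident may
contain), and only the first non-underscore character is inspected; the
rest of the token is not validated.

```python
# Hypothetical operator character set; the post does not define one.
OPERATOR_CHARS = set("+-*/<>=!?&|^~%")


def classify(token):
    """Return 'ident', 'mixident', or 'illegal' for a candidate token."""
    # Rule 2: leading underscores are permitted in either space.
    body = token.lstrip("_")
    # Rule 1: a token made only of underscores (or empty) is illegal.
    if not body:
        return "illegal"
    # Rule 3: the first non-underscore character picks the space.
    first = body[0]
    if first.isalpha():
        return "ident"
    if first in OPERATOR_CHARS:
        return "mixident"
    return "illegal"
```

Under these assumptions, "__++" classifies as a mixident, "__foo" as an
ident, and "___" as illegal.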

Concerning question mark ('?') and exclamation point ('!'), here are my
best ideas at the moment:

The main use of these seems to be at the end of alphanumeric
identifiers. We can handle that without ambiguity provided we adopt
one of the following rules:

  1) All identifiers beginning with '?' or '!' are (a) mixidents,
     having (b) length >= 2, OR
  2) When '?' or '!' appears at the end of an alpha-ident, it must be
     followed by an identifier-separating character, which is one of
     '(', ')', ',' or ' '. The first three cover use in application or
     in argument position. The last covers binding, and exists mainly
     to deal with "def a!=expr" vs. "def a! = expr".

Unfortunately:

- unary '!' is well-established among C programmers as the way to write "not"
- ternary "?:" is how C programmers write conditionals-as-expressions.

Let's take these in turn.


Concerning '!', my personal opinion is that the unary ! operator is
very rarely used, and invariably clearer when expressed as "not" (i.e.
the ASCII keyword). There is a mildly annoying asymmetry in the "a !=
b" vs "not a", but that does not trouble me excessively.


Concerning "?:", my opinion is that the "?:" in C is (and always has
been) an aesthetic abomination. It doesn't indent gracefully to save
your life, and it's not necessary at all in a language where
IF-THEN-ELSE is an expression rather than a statement. In "new BitC",
it's perfectly legal to write:

   LET x = IF <cond> THEN e-true ELSE e-false
   IN stmt

and in those places where this creates a precedence problem it can be
parenthesized or placed inside a block. So I'm fairly well willing to
give up the possibility of "?:" as a legal identifier, and I'm not
aware of any other compelling use case in which a singleton question
mark is desirable as an operator in any widely used convention.

** Does anyone know of a counter-example?


But with all that said, I think that the "trailing identifier
separator" requirement is the most sensible thing to do. The only
place where it seems to interfere lexically is in the binding form,
and anybody who writes something like

  def a!=expr

is pretty well begging for trouble in any case.

So my inclination is to permit '!' and '?' as the last character
(only), and go with the "trailing identifier separator" rule.
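
One way the "trailing identifier separator" rule might look in a
hand-written scanner, sketched in Python for illustration. The names
scan_ident and SEPARATORS are hypothetical. Two points are assumptions
that the post does not settle: end of input is treated as a separator,
and a '?' or '!' that is not followed by a separator is left for the
next token rather than reported as an error.

```python
import re

# The four identifier-separating characters named in the rule.
SEPARATORS = "(), "

# Alphanumeric identifier body, leading underscores allowed.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def scan_ident(text, pos=0):
    """Scan an alpha-ident at pos; return (lexeme, next_pos) or None."""
    m = _IDENT.match(text, pos)
    if m is None:
        return None
    end = m.end()
    # A single trailing '?' or '!' joins the identifier only when a
    # separator follows (assumption: end of input also counts).
    if end < len(text) and text[end] in "?!":
        after = end + 1
        if after == len(text) or text[after] in SEPARATORS:
            end = after
    return text[pos:end], end
```

With this sketch, scanning "a! = expr" yields the identifier "a!", while
scanning "a!=expr" yields "a" and leaves "!=" to be lexed as the next
token.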


Violent objections? Comments? Rotten fruit or vegetables?


shap

