No Identifiers in Asymptote

Caveat Lector! The purpose of what follows is to clarify and categorize the syntax and semantics of the language. It doesn't have much to do with the current implementation.

Most programming language grammars include a production for "ID" or the like. The idea is that an identifier is a user-defined symbol that unambiguously identifies something, like an int or a function.

In Asymptote, however, this doesn't really apply. Traditional identifiers are names; the name identifies the thing, period. In Asymptote, that's not enough. The language supports overloading of symbols; this in itself isn't so unusual, since many other languages, such as C++, do the same. But how far rebinding can go varies among languages. Scheme, for example, has no reserved words: virtually every identifier can be rebound, including standard procedures like '+' and even syntactic keywords like 'if'. (Numeric literals are the exception: '3' is a number, not an identifier, so it cannot be rebound.) You can define '+' to mean "foo" in Scheme, if you're twisted enough. C++ allows overloading of functions and of most operators, but it draws the line at keywords, literals, and operators applied to built-in types. Asymptote doesn't go as far as Scheme, but it goes farther than C++.

The key distinction between Asymptote and similar languages is that the former allows any function name to be overloaded, even against a non-function of the same name. Normally a function name is sufficient to identify a function. In Asymptote this isn't the case: a name alone is not enough; you also need a type. So you can legally say, for example,

int x, x();

This gives you the name 'x', but the name alone is ambiguous. Only by combining the name and the type (signature) can you remove the ambiguity.

Thus 'x' cannot be an identifier, in the strict sense of the term: it can't identify anything without help.

In Asymptote, every (user-defined?) symbol carries with it a signature; for non-functions, the signature is null.
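The idea that a name identifies something only together with its signature can be modeled as a table keyed by (name, signature) pairs. The following Python sketch is purely illustrative and, like the rest of this page, says nothing about the actual implementation. The class SymbolTable and its methods are hypothetical names; a signature is modeled as a tuple of parameter type names, with None standing in for the null signature of a non-function.

```python
from typing import Dict, List, Optional, Tuple

# A signature is a tuple of parameter type names; None models the
# "null" signature of a non-function symbol. (Modeling assumption.)
Signature = Optional[Tuple[str, ...]]


class SymbolTable:
    """Toy table in which only (name, signature) pairs identify a binding."""

    def __init__(self) -> None:
        self._bindings: Dict[Tuple[str, Signature], str] = {}

    def bind(self, name: str, signature: Signature, meaning: str) -> None:
        self._bindings[(name, signature)] = meaning

    def signatures(self, name: str) -> List[Signature]:
        """All signatures under which `name` is currently bound."""
        return [sig for (n, sig) in self._bindings if n == name]

    def lookup(self, name: str, signature: Signature) -> str:
        return self._bindings[(name, signature)]


# Model of the declaration `int x, x();`
table = SymbolTable()
table.bind("x", None, "int variable")
table.bind("x", (), "function taking no arguments, returning int")

# The bare name has two candidates, so it identifies nothing by itself.
assert len(table.signatures("x")) == 2
# Name plus signature picks out exactly one binding.
assert table.lookup("x", None) == "int variable"
```

In this model the question "what is x?" has no single answer, while "what is x with the null signature?" does.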

So in the syntax of Asymptote, "ID" would be misleading. But if 'x' is not an ID, what should we call it? My vote is "free symbol", in contrast with "bound symbol".

We can partition the symbol space of Asymptote in two ways. On the one hand, we have "constant symbols", such as the digits and keywords like "for" and "else", and "variable symbols", such as user-defined symbols and symbols whose meaning can be overridden, such as '+' and '-'. Constant symbols cannot be overridden, so they are effectively identifiers; but since "identifier" is usually used only to indicate the class of user-defined symbols, it doesn't seem appropriate for constant symbols.

On the other hand, we can divide the symbols in an Asymptote program into those whose denotation is predefined and those whose definition is specified by the user. For example, the symbol '+' is predefined, bound to the (arithmetic) addition operator; nonetheless it is not a constant symbol, since the user is free to rebind it to some other value.

So we have predefined constant symbols (e.g. the digits), predefined variable ("rebindable") symbols (e.g. '+' etc.), and user-defined variable symbols (e.g. 'foo' in "int foo;"). Syntactically, neither of the last two can be considered identifiers, since overriding can give them more than one binding. Only the combination of name and type can identify.

We can capture the semantics in the grammar as follows. First of all, everything is a symbol. Predefined constant symbols are terminal symbols; their productions are named accordingly, e.g. DIGIT, FOR, ELSE, etc. Their type is "fixedsym"; this tells us that they are permanently bound to specific values, "fixed" being synonymous with "constant". Predefined, overridable symbols, such as '+' and ???, we call by the appropriate name, e.g. "ADD", and their type is "boundsym". (NB: "preboundsym" would be more accurate.) This is an abuse of terminology, since any symbol can be bound; here we take "bound symbol" to mean "non-constant, but bound to a default value". A user-defined symbol, such as an int variable or a function name, is neither constant nor prebound, so we call it "FREESYM", and its type is "freesym". For example, an EBNF grammar for Asymptote might look something like:

[TODO: EBNF of lexer:
FOR := 'f' 'o' 'r' {makefixed(…)}
ADD := '+' {makebound(…)}   (or: makeprebound)
FREESYM := ALPHA …   (production for name) {makefree(…)}
]

(Note: the actual grammar in camp.l and camp.y doesn't look anything like this.)

I.e. "FREESYM" is what "ID" would be in most languages.
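The three symbol classes can be sketched as a small classifier. This is a toy Python illustration, not a lexer: the function name classify and the two sets are hypothetical, the sets are deliberately incomplete (they contain only the examples mentioned on this page), and anything not listed is simply assumed to be a user-defined name without checking that it is lexically valid.

```python
# Illustrative, deliberately incomplete symbol sets (assumptions).
FIXED_SYMBOLS = set("0123456789") | {"for", "else"}
PREBOUND_SYMBOLS = {"+", "-"}


def classify(symbol: str) -> str:
    """Return the symbol class named in the text for `symbol`."""
    if symbol in FIXED_SYMBOLS:
        return "fixedsym"   # permanently bound, cannot be overridden
    if symbol in PREBOUND_SYMBOLS:
        return "boundsym"   # has a default binding, but is rebindable
    return "freesym"        # user-defined; what most grammars call ID


for sym in ["for", "+", "foo"]:
    print(sym, "->", classify(sym))
```

The point of the sketch is only that the class of a symbol is decided before any binding is looked up; which binding a "boundsym" or "freesym" denotes still depends on its type.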

What makes this unique is that Asymptote treats function names like any other user-defined names: they are free to be bound, and rebound at will, to anything. Most other languages treat a function name as constant once it is declared; it cannot be rebound. It's true that languages like C++ allow overloading, so that one function name can carry several parameter lists, but they don't allow this generally: in C++ you cannot use the same name for both a variable and a function in the same scope. Asymptote allows the user to associate a single name like "foo" with any number of types, including function types (remember that a signature defines a function type). The only restriction is that ambiguity is disallowed: a single name cannot be bound to two different functions with the same signature, or to two different non-function values of the same type.
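The rule that one name may carry many types, but only one binding per type, can be sketched as follows. This is a hypothetical Python model, not Asymptote code: types are written as plain strings, the helper bind is an invented name, and the sketch assumes that binding a name again under a type it already has replaces the earlier binding.

```python
from typing import Dict, Tuple

# Key: (name, type written as a string). Value: a description of the binding.
bindings: Dict[Tuple[str, str], str] = {}


def bind(name: str, type_: str, meaning: str) -> None:
    """Bind or rebind `name` under `type_`; bindings under other types are untouched."""
    bindings[(name, type_)] = meaning


bind("foo", "int", "an int variable")
bind("foo", "int(int)", "first function from int to int")
bind("foo", "int(real)", "a function from real to int")

# Rebinding the same name under the same type replaces the earlier binding
# (an assumption of this model), so no (name, type) pair ever has two
# meanings at once.
bind("foo", "int(int)", "second function from int to int")

assert len(bindings) == 3
assert bindings[("foo", "int(int)")] == "second function from int to int"
```

The name "foo" ends up with three coexisting bindings, one per type, and none of them is ambiguous once the type is known.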

(For this to make sense it would help to understand the difference between name and thing named, lambda expressions, etc.)
(Obviously this needs cleanup)
