Saturday, June 15, 2013

Definitions, Declarations, and Constraints

If you know anything about programming languages, the title of this section probably strikes you as strange. It is common to group definitions and declarations together, but constraints are usually considered an entirely different topic. However, declarations in common programming languages are rather complex creatures and a part of what they do is very much like a constraint.

This section will analyze the three concepts and see how they are related. All of the concepts involve namespaces, so first lets talk about namespaces.

Namespaces

A namespace is any sort of mapping from names to denotations. It may be a package, module, class, function body or block among other things. Anything that maps names to denotations is a namespace. The mapping is also called a binding.

Here is an example from Java

class C {
 int j;
 int f(int x) {int y; y = j; return x+y;}
 static int g() {}
};

In this example there are many namespaces:
  1. The global namespace (or more properly the package namespace) maps C to a class. The phrase beginning with “class C” introduces that binding into the global namespace.
  2. The class C itself is a namespace that maps g to a static function. The phrase beginning with “static int g” introduces that name.
  3. When the class is instantiated into an object obj, then obj will be a namespace that maps j to a location. Note that the phrase “int j” introduces the name j into a runtime namespace when it gets executed, not a static namespace like the previous two definitions.
  4. When f is called, it will create a runtime namespace to hold local variables x.
  5. When control enters the body of f, it will create another runtime namespace in the block, and this namespace is where y is bound.
  6. When g is called it will create a runtime namespace with no members.

There are two ways to view f in the above example. In Java, C++ and many other languages one would say that f is a function that is a member of the class namespace C. When you call the function in an expression like obj.f(3), this is treated as a call to a function that looks like it has only one argument but really has two: obj and 3. You could rewrite the call to something like f(obj,3), and change the definition of f to something like

int f(C o, int x) {int y; y = o.j; return x+y;}

All I did was add o as an argument and replace the unbound variable j with o.j.

That is the C++ and Java way to view member functions. There is another way to view them and that is to say that when the class is created, it creates a closure for each member function and binds the function name to the closure in the object namespace. More on that when I get around to discussing SL5.

Definitions

A definition is a phrase that gives a name a denotation in a namespace. We say that the definition binds the name in the namespace. For example

const x=100;

binds the name x to the number 100 and the definition

function sum(x,y,z) {return x+y+z;}

finds the name sum to a function.

Variable declarations are also definitions because they bind a name to a location:

var x,y;
var z:integer;

Constraints

There is a class of programming language called constraint programming languages that use constraints as the primary programming paradigm (a logic programming language is a type of constraint programming language where the constraints look a bit like predicate logic). Constraints can be useful even if they don’t form the primary programming paradigm, but I don’t know of any languages that use general constraints in the more limited way that I will discuss in this subsection.

A constraint is a condition that is enforced by the language implementation. For example, in a language with constraints you might be able to write something like this:

constraint x < 0;

and this would cause the compiler to always ensure that x is a negative number. It might do this by checking the value every time x changes, or it might use some sort of logic to prove that the algorithm can never assign a non-negative number.

Constraints like the one above primarily apply to variables. After all, what’s the point of constraining the value of a constant?

But there are other constraints like this:

constraint y = 2*x;

where the compiler could implement the constraint by treating y as another name for 2*x. In this case y would not be a variable (a name that denotes a location) because it denotes twice the value of whatever x denotes.

As you can see, a constant definition looks a lot like a special case of a constraint. For example

constraint y = 2;

seems to have the same effect as

const y = 2;

The only difference is that the definition introduces a new name while a constraint does not. A constraint works on names that are already in scope. Otherwise, consider the problem of deciding what names the following constraint would introduce into scope:

constraint x=y and z=2 and y<3

Declarations

A declaration is a phrase that describes the features of a name that a language needs in order to use the name. In statically typed languages, the language cannot use a name unless the type is known, so a name declares the type of the name. In C, for example you have declarations like this:

extern int x;
extern void f(int);

These declarations do not define anything; all they do is tell the following code how to use the names. They tell the following code that x is an integer variable and that f is a function that takes one integer argument and returns no value.

In other cases, a declaration is combined with a definition:

var x:integer;

This phrase does four different things:

  1. Introduces a new name, x, into the namespace of the innermost block.
  2. Binds x to a new location.
  3. Attaches the signature, integer, to x.
  4. Constrains x so that it can only reference an integer.

The difference between 3 and 4 involves overloaded functions and operators. In a language with static types and overloading, the particular function or operator that is chosen for an expression depends on the signature of the expression. But there is nothing inherent in a signature that constrains the type of values put into a variable. There is nothing logically contradictory about telling the compiler “always use floating point computations on x” and then putting an integer value in x. It may be error-prone and machine-dependent, but it isn’t contradictory. In fact, C and C++ allow doing just that.

In C and C++, a declaration

int x;

assigns a signature to a variable and that signature means that an assignment to the variable such as

x = 4.0;

uses a branch that converts the value to an integer before assigning it to the location. This is one way that a language might enforce a constraint, but it isn’t reliable enforcement when the language has other ways to assign a value to a location. In the ultimate programming language, a constraint must be fully enforced. So in C and C++, a type declaration only attaches a signature to a name and does not impose a constraint.

In other languages such as Common Lisp, a declaration only imposes a constraint and does not attach a signature. In Common Lisp if you declare a variable:

(declare (type integer x))

Then try to assign a non-integer to the variable:

(setf x "foo")

it raises an error. This is the constraint-checking in action. However, such a declaration does not attach a signature to x. Signatures have no meaning in Common Lisp because there is no overloading of functions. When you add x to another number:

(+ x 1)

there is only one branch of + and it uses that branch regardless of any declarations (although + is defined to dynamically check the types of its arguments and do the right thing, and this type checking might be in-lined and then the in-lined tests might be solved at compile-time and then removed).

No comments:

Post a Comment