Saturday, April 20, 2013

Principles of Ultimate Programming Language Design

Here is a quick summary of the principles that should be followed in designing the ultimate unobtainable programming language. These principles may or may not apply to designing real programming languages.

Let's start with all of the mutually incompatible goals that the perfect language must satisfy.

Goals

  • Least Surprise. The language should not have surprising effects, interactions or limitations. Things should look normal and be predictable to even a casual user of the language. Syntactic constructs should do what they look like they do. Once a user understands how two features work, it should be obvious how to combine them.
  • Elegance. Rather than being a grab-bag of features, the language should be based on a relatively small number of powerful primitives that can be combined in novel ways to simulate the more common features. Since "novel" is a close relative of "surprising", this goal seems to conflict with Least Surprise.
  • Power. The language must have a full set of sophisticated and powerful features to let the programmer implement any sort of algorithm or paradigm that he has in mind. It also must provide all of the basic data structures such as expandable lists and sets and algorithms such as sorting.
  • Simplicity. The language must be easy to learn. This seems to contradict the requirement for power.
  • Standardization. The language should include a lot of standardized built-in functionality of the sort usually associated with external libraries so that a programmer can come into a new project and already understand how things work. For example, it ought to have standard support for dates and times, localization, internationalization, network services, 2D and 3D graphics, sound, user-interface widgets, database access, etc.
  • Flexibility. The language ought to support and encourage the creation of new libraries to improve on the standard libraries. This seems to conflict with the purpose of standardization.
  • Predictability. The programmer ought to be able to look at a piece of code and know roughly what it does based on generic rules of the language. For example, the user ought to be able to guess that a function call does not modify its arguments or that all of the arguments to a function are evaluated before the function is called.
  • Extensibility. The language should allow the use of any parameter-passing mechanism and the creation of new syntax, new control structures, and pretty much new versions of anything that is already in the language. This seems to conflict with predictability.
  • High-levelness. The language must be highly abstract and automatically handle the low-level busywork such as storage management and managing a persistent (on-disk) store of language values.
  • Low-levelness. The language must allow the programmer to get down to near assembly-language-like details on the operation of the machine and completely control storage management, disk usage, on-disk data structures, and the like. This seems to contradict the requirement for high-levelness.
  • Dynamicness. The language must allow a lot of decisions at runtime including creation of new functions at runtime, dynamic dispatch, late binding, reflection, and similar things.
  • Verifiability. The language must verify at compile time essential programmer assumptions such as the types of variables, the existence of function implementations and similar things in order to help ensure program correctness. This seems to be incompatible with dynamicness.
  • Safety. The language must dynamically ensure that if the programmer makes a mistake that would invalidate the program state, like dereferencing a NULL pointer or overflowing a number, then the program aborts immediately with a clear message to the programmer. Otherwise these sorts of errors can be very hard to reproduce and to track down.
  • Efficiency. The language must allow the programmer to skip checks such as testing for NULL pointers, checking for arithmetic overflow and similar issues when he knows that the check is not necessary. This seems to be incompatible with safety.
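
To make the last pair of goals concrete, here is a minimal sketch in Python of the same addition written both ways. The names checked_add and unchecked_add are hypothetical, and since Python integers never overflow, the 64-bit range is simulated; the unchecked version only imitates the wrap-around that hardware would produce and is not actually faster in Python.

```python
# Simulated 64-bit signed range (an assumption for illustration;
# Python's own integers are arbitrary precision).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def checked_add(a: int, b: int) -> int:
    """Safety: abort with a clear message as soon as the result overflows."""
    result = a + b
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError(f"64-bit overflow adding {a} and {b}")
    return result


def unchecked_add(a: int, b: int) -> int:
    """Efficiency: no check; the result silently wraps like two's complement."""
    return (a + b + 2**63) % 2**64 - 2**63
```

With these definitions, checked_add(INT64_MAX, 1) raises OverflowError at the point of the mistake, while unchecked_add(INT64_MAX, 1) quietly returns INT64_MIN and leaves the bug to surface somewhere else.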

So how do we satisfy all of those incompatible goals? With some basic principles of ambivalent design.

Principles

  • Least Surprise/Elegance. The language should be based on a small number of powerful features combined in novel ways, but the combinations should be part of the standard syntax of the language so that the user doesn't have to know how the "user-level" features are implemented at the lower level.
  • Power/Simplicity. The language should contain a lot of powerful features, but should be decomposable into smaller, simpler sublanguages that can be taught and used without reference to the full language. It should be possible to start with a small teaching sublanguage that just introduces basic control structures and variables, then gradually adds features such as widgets and graphics.
  • Standardization/Flexibility. There should be lots of powerful standard libraries, but they should all have decomposable, pluggable features that can be replaced in sections. This lets programmers replace minimal parts of standard libraries and allows the users of those libraries to use largely standard interfaces.
  • Predictability/Extensibility. Powerful extensibility features should be allowed, but those features should be syntactically distinct from standard language features. For example, it should be possible to define a special function that does not always evaluate its arguments, but the call should not look like a normal function call.
  • High-levelness/Low-levelness. The language should be fundamentally high-level but it should allow special escapes into a lower-level sublanguage. For example, the normal number type would be an arbitrary precision decimal number, but the user could declare some variables to be 64-bit machine integers, which would cause arithmetic on those variables to use machine instructions. There should also be ways for users to control their own storage management, on-disk structures and other low-level features, but these facilities would be part of the low-level sublanguage, not visible to most programmers.
  • Dynamicness/Verifiability. The language should be fully dynamic but allow declarations that are checked at compile time. This lets a programmer write a small quick program using dynamic logic and then gradually refine it for verification and performance by adding declarations and assertions.
  • Safety/Efficiency. The language should have no undefined behavior but should have a sophisticated system of assertions to allow the programmer to prove to the compiler that a particular construct is safe without checking. The language should aggressively use these assertions to optimize away checks. The language should also do some sophisticated reasoning of its own and should have a facility to let the user know what it knows about values at compile time. In addition, it should provide an optimization facility to tell the programmer which checks might potentially be eliminated if they could be proven unnecessary.
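
As a rough illustration of the Predictability/Extensibility principle, here is a sketch in Python, which is only a stand-in for the imagined language. The function names are hypothetical. The "special" function receives its arguments wrapped in lambda, so a reader can tell from the call site alone that evaluation may be deferred.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

calls: List[str] = []  # records which arguments were actually evaluated


def expensive(label: str) -> str:
    """Stand-in for a costly or side-effecting argument expression."""
    calls.append(label)
    return label


def eager_choose(condition: bool, if_true: T, if_false: T) -> T:
    """Ordinary function: both arguments are evaluated before the call."""
    return if_true if condition else if_false


def lazy_choose(
    condition: bool, if_true: Callable[[], T], if_false: Callable[[], T]
) -> T:
    """Special function: only the selected argument is ever evaluated."""
    return if_true() if condition else if_false()


# A normal-looking call evaluates everything up front.
eager_choose(True, expensive("a"), expensive("b"))

# The lambdas mark this call as one that may skip an argument.
lazy_choose(True, lambda: expensive("c"), lambda: expensive("d"))

print(calls)  # ['a', 'b', 'c']
```

The generic rule "all arguments are evaluated before the function is called" still holds for every call that looks ordinary; the exception is visible in the syntax instead of hidden in the definition of the function.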
