Sunday, February 2, 2014

GUI vs. SPUI --the Semantic Presentation User Interface

In the beginning IBM created the punch card and the PCUI. And the PCUI was ungainly and plagued by lost punch cards. The Punch-Card User Interface or PCUI (pronounced Puh-Koo-ee) was the first widespread commercial interface for computers. I am old enough to have used punch cards in college. They were not a great user interface. It was an enormous improvement when I first moved to a Command-Line Interface or CLUI (pronounced Kloo-ee as in “Did you find a clue?” “Not really, but I found something that’s sort of clue-y”).

In the command-line interface, the user has to learn many commands and options. He types in a command on a keyboard and the computer outputs a response to the screen. This sort of interface extends very easily and naturally to scripting where the user enters a sequence of commands in a file and executes them in a batch. CLUIs are also a high-bandwidth interface in the sense that an experienced user can type in commands and options very fast. But CLUIs require expertise on each application (or constant reading of documents) and for many applications they require even more expertise to read the outputs because it is not possible to output good graphs, charts, pictures, and other representations that make information understandable to humans.

So along came the Graphical User Interface or GUI (pronounced like “gooey”). This user interface required much less expertise on a given application because there were menus that let you look for the command that you want and dialogs that prompted you to enter your options, and beautiful, detailed, scrollable and clickable graphs, charts, and pictures to present data. For applications that they know, typical GUI users may not be able to interact with the computer as fast as CLUI users, but a GUI can be much faster for new and seldom-used applications because users don’t have to read documentation.

However the GUI also has some serious drawbacks:

Platform Dependence

Typical GUIs need to be modified for different platforms such as web browsers, Android, iPhone, MS Windows, X Windows, etc. This is an expensive process because different platforms have different APIs, and the APIs are generally so different as to require a very detailed rewriting. There are some portable APIs that work on different platforms but they have not become very influential, in large part --I presume-- because they can’t keep up with the constant proliferation of important new platforms.

Accessibility

Not everyone can read normal-sized type. Not everyone can read type at all. Device makers have become very good at supplying magnifiers, audio text readers and other accessibility tools, but these are necessarily after-the-fact hacks tacked on to the application. There are disadvantages to such generic tools vs. a real interface built into the application to provide accessibility. And no matter how good these tools get, the pace of innovation (or at least change) in user interfaces will always leave them with interfaces that they don’t know how to handle.

And there are different levels of accessibility needs. Some people who can’t use a standard application on a smartphone would be able to use it easily if it just had 25% larger fonts. Many applications let you expand the screen; some surprising ones don’t. Google, which works very hard at accessibility, doesn’t allow enlarging fonts in one of its flagship products --Google Maps. You can expand the screen, but all that does is zoom in on the map and keep the street names in a tiny font that few people over the age of 40 can read. This is the kind of design failure that happens when you leave accessibility design up to individual application writers.

Device and Display Dependence

Resolutions, pixel layout, aspect ratio, color support, frame rate, physical update characteristics of the screen, device performance, all can affect what kind of presentation looks best. On many platforms it is very difficult to write a GUI that looks good across all possible displays. It requires extensive testing on each display. Sometimes you may need to change fonts and not just font size to make text look good on a different display.

There are other issues as well. On some platforms users expect fancy animations and other graphical features that stress a processor but other platforms cannot keep up with those effects. Some devices have special displays that need an entirely different sort of user interface such as wrist computers or e-book readers.

Input Device Dependence

A mouse needs different rules of interaction than a touch screen. A touch screen that supports multi-touch has different rules of interaction than one that does not. A very small device like a watch may have only a few buttons. Speech-controlled devices have yet another form of interaction. Devices for people with physical disabilities may have entirely different forms of input.

Language and Culture

It is not that hard to internationalize a GUI so that it works well for 90% of all languages and cultures. And that is what many i18n projects do. This leaves 10% of computer users out in the cold, because their language just doesn’t fit the model or because they aren’t economically important enough to matter.

Individual Preferences

Some people like bright colors; other people do not. Some people like busy, crowded screens; other people do not. Some people like slick animations and transparency; other people do not. But if you are using a GUI, you are subject to the whims of whoever controls the GUI design. If you had a huge investment in Microsoft Office documents and you hated the change from menus to ribbons, for example, you were stuck. You and everyone else in your office were forced to waste a few days relearning software that you had been using for a decade. This is the kind of problem that you get into when companies start thinking that they need to drive fashion in GUIs like famous designers do in apparel. The difference, of course, is that you don’t have to spend days learning how to wear a new style of sport coat.

Functionality vs. Presentation

For most applications with a user interface there is an abstract line to be drawn between the functionality that the application provides and the way that it presents that functionality to the user. This division is not just an engineering distinction; it is also a product distinction. Some projects are primarily driven by a vision of functionality --a program that does something that users need done. Other projects are driven primarily by a vision of how to present things to the user. But the functionality-driven project often has to spend enormous time and resources on providing a GUI, even though they don’t think the exact form of GUI is very important. And the GUI-driven project has to implement an underlying functionality even though they don’t feel that they have anything particular to offer in that area. I believe that the large majority of projects fall into one of these two categories: functionality projects forced to provide a GUI or GUI projects forced to provide functionality.

Not only are projects forced to do a great deal of extra work on an area where they are not experts and not particularly inspired but the extra work often complicates the part of the program that they really want to concentrate on. Good programmers are aware of the line between functionality and user interface and they make an effort to keep these parts separate but there is usually a large fuzzy area that interleaves the two parts. If you have a widget that presents a table, sometimes it is just easier and more efficient to let the widget manage the data than to keep moving data back and forth between the widget and another data structure. Sometimes a callback function is so small that you may as well just put the functionality you need in the function itself rather than have it call into a separate functional part of the program. This mixing of parts makes both parts more complex, harder to write and harder to maintain, but many projects do it anyway. Sometimes for good reasons.

The Semantic Presentation User Interface

The Semantic Presentation User Interface or SPUI (pronounced spoo-ee) is a concept that completely separates the functionality of an application from its interface. This separation is not done at the application level but at the platform level. Each device provides a default presentation layer that interfaces with an application to present the published data structures of the application. Presentation layers vary dramatically from device to device, from big-screen TVs to watches. Not all will be graphical; an audio interface device would have an entirely audio output.

With this model, a company that has a vision to build some great new application does not have to build a GUI to go with it. They simply have to decide what data of the application users need to see and what operations they need to perform on this data. Then the programmers build an abstract interface for these features and they are done with the user interface. Their application can now be used on any device that has the necessary features in the presentation layer.

Another company that has a vision for how to present data and interact with computers does not need to create an application to show off their great new ideas in user interfaces. All they have to do is write presentation classes to load into the presentation layer. Every application on the device can then use this new user-interface technology with no changes --every application that publishes an object that can be displayed by the new classes.

The SPUI concept is related to the Model-View-Controller design pattern and more specifically to the Presentation Model pattern or the Model-View-ViewModel pattern. What is different is that the SPUI concept appears to an application writer (if not a GUI writer) like an entirely different kind of user interface. All the application writer has to do is specify the semantics of his application and the system takes care of the rest. This is made possible by an application interface based on an interface language much like the IDL of CORBA or the interface language of Protocol Buffers. The language used by a SPUI is the Semantic Application Programming Interface Language or SAPIL.

Interface languages come in many varieties, but most languages designed to provide an interface between applications tend to mimic the types of programming languages. They typically have numbers and strings and something like a struct or class. Some also have sets or lists or some other collection type and some of them have references or pointers that can connect objects in a network structure. It is up to the application writers to know the meaning of these structures. For example, if a particular member of an object is an unsigned integer that represents a phone number, then this is something that the receiving application just has to know so that it can do the right things with the number.

SAPIL, because it is intended to connect applications to users rather than just applications to applications, has a richer type system that provides more of the meaning of the data. In SAPIL, a phone number has a special type so that the presentation layer knows how to display it. The type “phone number” is not just a number with some formatting information. Rather, it is semantic information that tells the presentation layer what significance this number holds to the people using the application. The presentation layer might present it in any form that is most appropriate for the context in which the user is viewing it. It could put parentheses around the area code or put a dash in front. It could show the country code or not. It may even leave out the area code in some circumstances. All of this is possible because of the semantic content that tells the presentation layer what the integer means, not just which integer it is.
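To make the idea concrete, here is a minimal C++ sketch of how a presentation layer might render one phone number differently depending on context. The names PhoneNumber, PhoneContext and presentPhoneNumber are my own illustrations and are not part of any SAPIL definition; I also assume the number is stored as separate parts rather than as a single integer.

```cpp
#include <string>

// Hypothetical semantic type: the parts of a phone number are kept
// separately so that a presenter can choose how much of it to show.
struct PhoneNumber {
    std::string countryCode;
    std::string areaCode;
    std::string localNumber;
};

// Hypothetical presentation contexts chosen by the presentation layer.
enum class PhoneContext { International, National, Local };

// The application only says "this is a phone number"; the presenter
// decides on the form that suits the context.
std::string presentPhoneNumber(const PhoneNumber& p, PhoneContext context) {
    switch (context) {
    case PhoneContext::International:
        return "+" + p.countryCode + " " + p.areaCode + " " + p.localNumber;
    case PhoneContext::National:
        return "(" + p.areaCode + ") " + p.localNumber;
    case PhoneContext::Local:
        return p.localNumber;
    }
    return p.localNumber;  // not reached for valid enum values
}
```

The application never picks one of these forms; it only supplies the value and its type.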

For some other interface languages, the classes generated in the target language are highly specialized classes that serve only as interface elements. They are not as efficient as the class that would come from a similar declaration in the language itself. For example, if you declare a class using the interface language of Protocol Buffers, you will get a C++ class that doesn’t have the same members. Instead, you get getter and setter methods to access the elements. The typical usage of Protocol Buffers is to copy an internal data structure (or parts of it) to a protocol buffer, send a message, get a reply, and then copy the reply to an internal data structure for further processing.

SAPIL has a goal of minimizing the programmer and runtime work needed to deal with the interface, so SAPIL tries very hard to generate normal, efficient data structures in the host language that can be used for normal computation. The model of SAPIL is that the programmer creates the data structures that he needs to implement the functionality and then he publishes some of these data structures (or parts of them) to the user. It is up to the machinery of SAPIL to get out the parts that need to be displayed and send them to the presentation layer.

SAPIL Classes

As an example, let’s look at a simple file manager application. One thing a file manager needs to display is files. The presentation layer doesn’t know about files so the application must declare what a file is. It might look something like this:

class File {
published:
 string name;
 unsigned size;
 string owner;
 timestamp creationTime;
public:
 string path;
}

This defines a class with five members. Four of the members are published, meaning that when an object of this class is presented, those members will be displayed in some form. The member path is public, meaning that the generated class definition will declare it as public but it is not presented in the user interface. There can also be private members which would be declared as private in the generated class.

For C++ the generated class definition would be something like this:

class File : public Presentable {
public:
 published<string> name;
 published<unsigned> size;
 published<string> owner;
 published<timestamp> creationTime;
 string path;
};

The application programmer can use File just like any other class and it is nearly as small and efficient as the class that the programmer would have written in the source language. The template type published handles the special features of published variables. It may be necessary to define an assignment operator on published values that records that the value has been changed so that the presenter knows that it needs to update the value. I need to put more thought into this area.
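One possible shape for the published template is sketched below. This is only an assumption about how change tracking could work, not a settled design: the consumeChange() method and the dirty flag are names I made up for illustration, and the Presentable base class is left out to keep the sketch short.

```cpp
#include <string>
#include <utility>

// Hypothetical sketch of the published wrapper: it stores a T and
// remembers whether the value was assigned since the presenter last looked.
template <typename T>
class published {
public:
    published() = default;
    published(T value) : value_(std::move(value)) {}

    // Assignment stores the new value and records that it changed.
    published& operator=(T value) {
        value_ = std::move(value);
        dirty_ = true;
        return *this;
    }

    // Read access, so the member can be used much like a plain T.
    operator const T&() const { return value_; }
    const T& get() const { return value_; }

    // For the presenter: reports whether the value changed since the
    // previous call, then clears the flag.
    bool consumeChange() {
        bool changed = dirty_;
        dirty_ = false;
        return changed;
    }

private:
    T value_{};
    bool dirty_ = false;
};

// Cut-down version of the generated File class from above.
struct File {
    published<std::string> name;
    published<unsigned> size;
    std::string path;  // public but not published
};
```

With this sketch, an assignment such as f.size = 10400 marks the member as changed, and the presenter can poll consumeChange() to decide which values to redraw. The cost per member is one extra bool.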

When a file object is displayed, the presenter will use the member names as labels except that it may modify the letter case and it recognizes underscores and lowercase followed by uppercase as word separators. The presenter might present a dialog with the file information that looks like this:
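The word-splitting rule can be sketched in a few lines of C++. The function name labelFromMemberName is hypothetical, and capitalizing the first letter of each word is just one of the letter-case choices a presenter might make.

```cpp
#include <cctype>
#include <string>

// Turns a member name into a display label. Underscores and a lowercase
// letter followed by an uppercase letter are treated as word separators.
std::string labelFromMemberName(const std::string& name) {
    std::string label;
    bool startOfWord = true;
    for (std::size_t i = 0; i < name.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(name[i]);
        if (c == '_') {  // an underscore separates words and is dropped
            startOfWord = true;
            continue;
        }
        // lowercase followed by uppercase starts a new word
        bool camelBreak =
            i > 0 && std::isupper(c) &&
            std::islower(static_cast<unsigned char>(name[i - 1]));
        if (camelBreak) startOfWord = true;
        if (startOfWord) {
            if (!label.empty()) label += ' ';
            label += static_cast<char>(std::toupper(c));
            startOfWord = false;
        } else {
            label += static_cast<char>(c);
        }
    }
    return label;
}
```

Under these rules creationTime becomes "Creation Time" and file_name becomes "File Name".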

Name:           presenter.cc
Size:           10400
Owner:          bbunny
Creation Time:  2010-04-15 09:12:22.32

Lists

The presentation of a given class is not always the same. SAPIL supports not only classes but also other data structures such as lists. A list of Files would be declared like this:

Type FileList is List<File>

and the presenter might then display the files in tabular form and put the names in a table header like this:

Name         Size   Owner   Creation Time
presenter.c  10400  bbunny  2010-04-15 09:12:22.32
presenter.h  5212   bbunny  2010-04-15 08:32:12.89
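The point that the same published data can be presented in more than one way can be sketched in C++ as follows. Everything here is an assumption made for illustration: the publishedFields() method, the Field alias, and the two presenter functions are hypothetical, and the field list is written by hand where a SAPIL code generator would presumably derive it from the declaration.

```cpp
#include <string>
#include <utility>
#include <vector>

using Field = std::pair<std::string, std::string>;  // label, display value

// Hypothetical base class: an object that can list its published members.
struct Presentable {
    virtual ~Presentable() = default;
    virtual std::vector<Field> publishedFields() const = 0;
};

struct File : Presentable {
    std::string name, owner, creationTime;
    unsigned size;
    File(std::string n, unsigned s, std::string o, std::string c)
        : name(std::move(n)), owner(std::move(o)),
          creationTime(std::move(c)), size(s) {}
    std::vector<Field> publishedFields() const override {
        return {{"Name", name},
                {"Size", std::to_string(size)},
                {"Owner", owner},
                {"Creation Time", creationTime}};
    }
};

// A single object: one "Label: value" line per published member.
std::string presentAsDialog(const Presentable& object) {
    std::string out;
    for (const Field& f : object.publishedFields())
        out += f.first + ": " + f.second + "\n";
    return out;
}

// A list of objects: labels become the header row, values become rows.
std::string presentAsTable(const std::vector<const Presentable*>& rows) {
    if (rows.empty()) return "";
    std::string out;
    const std::vector<Field> header = rows.front()->publishedFields();
    for (std::size_t i = 0; i < header.size(); ++i)
        out += (i ? " | " : "") + header[i].first;
    out += "\n";
    for (const Presentable* row : rows) {
        const std::vector<Field> fields = row->publishedFields();
        for (std::size_t i = 0; i < fields.size(); ++i)
            out += (i ? " | " : "") + fields[i].second;
        out += "\n";
    }
    return out;
}
```

The File class is identical in both cases; only the presenter function differs, which is the separation that SPUI is after.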

Methods

A SAPIL class can also have method declarations. Published method declarations represent operations that can be executed on an object in the user interface. They might, for example, show up as buttons or as items in a context menu depending on the presenter:

class File {
published:
 string name;
 unsigned size;
 string owner;
 timestamp creationTime;
public:
 string path;
published:
 void delete();
 void rename(string newFileName);
 void SPUI_open();
 void SPUI_dragAndDrop();
 void SPUI_copy();
 void SPUI_cut();
 void SPUI_paste();
}

The methods beginning with “SPUI_” are special methods used by the presenter for certain user actions. For a method with arguments like rename() the presenter does something to ask the user for the argument values. It may, for example, open a dialog box with the message “Enter new file name”. The same rules are used to break up these parameter names into words as are used for class members.

These methods can be used by the application as well as by the presenter so the application writers do not need to write two functions to rename a file. They can just write one function and make it available to both the presentation layer and the application internal code.
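Here is a rough C++ sketch of how one rename() function could serve both callers. The PublishedMethod record, the Prompt callback and invokeFromUi() are hypothetical names; I am assuming the presenter receives parameter names and asks the user for each value as text.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record describing one published method to the presenter.
struct PublishedMethod {
    std::string name;
    std::vector<std::string> parameterNames;
    std::function<void(const std::vector<std::string>&)> invoke;
};

class File {
public:
    explicit File(std::string n) : name(std::move(n)) {}
    std::string name;

    // The one and only rename function; internal code calls it directly.
    void rename(const std::string& newFileName) { name = newFileName; }

    // What generated code might hand to the presentation layer.
    std::vector<PublishedMethod> publishedMethods() {
        std::vector<PublishedMethod> methods;
        methods.push_back(PublishedMethod{
            "rename",
            {"newFileName"},
            [this](const std::vector<std::string>& args) {
                rename(args.at(0));
            }});
        return methods;
    }
};

// Stand-in for a dialog box: given a parameter name, returns user input.
using Prompt = std::function<std::string(const std::string&)>;

// Presenter side: ask for each argument, then call the same method.
void invokeFromUi(const PublishedMethod& method, const Prompt& ask) {
    std::vector<std::string> args;
    for (const std::string& parameter : method.parameterNames)
        args.push_back(ask(parameter));
    method.invoke(args);
}
```

A real presenter would turn newFileName into a prompt such as “New File Name” before asking; here the Prompt callback simply receives the raw parameter name.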

Attributes

The presenter can infer some semantic information from the class structure, but classes often contain additional meaning that is just not apparent from the structure alone. This semantic information is provided by the use of attributes. A member or method declaration can be preceded by a label attribute to give that member a different name from the name that it has in the class. For example, suppose that the programmer used cTime rather than creationTime as the member name. Then it would be a good idea to communicate a better name to the presenter with a label attribute:

class File {
published:
 ...
 label "created" timestamp cTime;
 ...
};

The label attribute applies to an individual member, but other attributes can apply to groups of members. In particular there is the group attribute which forms a named group of members:

class File {
published:
 string name;
 unsigned size;
 string owner;
 group "timestamps" {
   label "created" timestamp cTime;
   label "modified" timestamp mTime;
   label "accessed" timestamp aTime;
 }
};

A group behaves similarly to a nested class, but it is only a feature of the user interface; it is transparent to the application programmer. The presenter may put a box around these members and give the box a label, or it may present them on a different page or a different tab. The point is that the application programmer just provided the semantic information that these three members form a conceptual group. What the presenter does with that information is not the application programmer’s concern.

Trees

Our file manager will also need to represent a directory structure. The presenter does not know about directories but it does know about trees. A tree is declared like this:

class Directory extends File, Tree {
published:
 children List<File> contents;
}

This declaration says that Directory is a subtype of both File and Tree. It extends File with a member named contents which is a list of Files. The children attribute says that this member is a list of children in the tree structure. Multiple children members are allowed and they may be of other aggregate types such as Set and Map in addition to List. Related to the children attribute is the child attribute: an attribute labeled child represents a single child in the tree rather than a list of children.

The child and children attributes may be used in any combination. They are only there to provide semantic information to the presentation layer and have no significance to the application programmer other than as internal documentation.
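As a sketch of what a tree-aware presenter could do with this information, the C++ below walks a Directory and produces an indented outline. The classes are simplified stand-ins for the generated ones, the outline() function is hypothetical, and I assume the children member maps to a vector of owned File pointers.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct File {
    std::string name;
    explicit File(std::string n) : name(std::move(n)) {}
    virtual ~File() = default;
};

struct Directory : File {
    // Corresponds to the member marked with the children attribute.
    std::vector<std::unique_ptr<File>> contents;
    explicit Directory(std::string n) : File(std::move(n)) {}
};

// One possible presentation of a tree: each level indented by two spaces.
// A graphical presenter might draw expandable nodes instead.
void outline(const File& file, std::size_t depth, std::string& out) {
    out += std::string(depth * 2, ' ') + file.name + "\n";
    if (const auto* dir = dynamic_cast<const Directory*>(&file)) {
        for (const auto& child : dir->contents)
            outline(*child, depth + 1, out);
    }
}
```

The application code only builds the Directory objects; whether they end up as an outline, a collapsible tree widget or spoken text is up to the presenter.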

We might want to add some methods to our directory structure:

class Directory extends File, Tree {
published:
 children List<File> contents;
 unsigned diskUsage();
 FileList find(
   label "file name pattern" string pattern="",
   label "other file systems" bool xdev=false,
   label "max search depth" int maxDepth=0);
}

Notice that attributes can also apply to the parameters of a method. The presenter will ask the user for the values of the arguments, possibly by showing a dialog with the default values already filled in. The find() and diskUsage() methods return a value which signals to the presentation layer that it has to present whatever the method returns.

Extension and Customization and Builtins

No strategy of this type can ever be complete. There are always going to be new sorts of things --meanings-- that have to be handled. And even for the old things, there are always going to be special ways to present things in special contexts that there is no attribute to specify. There has to be a way to extend SAPIL/SPUI and there has to be a way for application writers to customize their own applications, possibly through a local form of the extension method.

Builtin Types

However, to maximize the set of things that SAPIL can handle natively, we need a large set of meaningful types. Here are some of the types that SAPIL might support:

numbers

number (arbitrary precision)
phone number
IPv4 address
IPv6 address
MAC address
currencies

text

unformatted
HTML (or some other formatting standard)
URL
password

Date/Time

date
timestamp
time interval by some calendar unit

multimedia

raster image
vector graphics (2D, 3D, animation)
video (soundtrack)
sound sample
synthesized sound

Aggregates

class
list
set
tree
network (I prefer “network” to “graph” because the word is less overloaded)
table
array (multi-dimensional)
mapping

document

Text with logical formatting such as used in LaTeX along with hyperlinks and embedded objects of other types. Does not allow for author-defined fonts. The presentation layer chooses physical fonts. Items are laid out automatically using normal document rules.

geography

a set of raster and vector images for layers, GPS coordinates for the upper-left corner, and the length of the sides.
