Project White: Architectural Overview Document

The White Programming Language will closely adhere to the vision of Ambient Intelligence and the Ambient-oriented paradigm as set forth by the European Commision's Information Society Technologies Advisory Group and the Department of Computer Science at Vrije University in Brussels, Belgium.

The language will build upon these fundamental concepts, integrating ideas synthesized by ourselves, as well as those at the vanguard of today's language reasearch. The language will provide a robust and powerful system for implementing massively distributed entities.

In this document, I present the various aspects of system design as well as the primary areas of innovation in the implementation of this language. Those innovations stand to include:

Realtime communication: It is understood that a highly network-aware system that relies heavily upon messaging in a potentially dynamic or hostile environment requires communication facilities that explicitly expect and incorporate potential network failures at the very heart of its basic computational steps. I propose a set of constructs to supply this, namely a lightweight messaging by contract step at the point of communication initiation. Also, since all messaging is asyncronous, it becomes necessary to support futures for synchronization. White will innovate by providing time-aware futures, that promise to yield a computational value after some contractually-agreed upon time limit, or trow an exception should that limit have been exceeded.
Access-control and security: Existing implementations fail miserably when contrasted with today's requirements for secure systems and communication. CORBA's unencrypted traffic is subject to eavesdropping and man-in-the-middle attacks, and it requires a port to be opened in the firewall for each service. This conflicts with the reality of most sane network security policies. White will employ security conscientious design at all levels from the very start. All socket communication will default to tranmission across a secure transport such as OpenSSL. Furthermore, I intend to fully allow for a X.509 certificate-based infrastructure, to ensure trust among White peers. Lastly, a robust set of access control mechanisms will be put in place to guard against unwarranted communication amongst nodes.
Non-blocking communication primitives: As there is no shared state between any objects in the White system, a robust communication standard and protocol must be designed. Protocol efficiency is an incredibly important aspect, as there exists the very real potential for heavy messaging in large mesh networks simply to keep all nodes in sync. This overhead requires us to avoid design flaws such as those in CORBA's interoperability protocol, which makes it nearly impossible to build a high-performance event distribution service. The on-the-wire encoding of CORBA contains a large amount of redundancy, but the protocol does not support compression. This leads to poor performance over wide-area networks.
Reified resource and acquaintance management: I want to avoid the concomitant complexity that arises from naming facilities like CORBA's interoperable object references (IORs). IORs are opaque entities whose contents are supposed to remain hidden from developers. This is unfortunate for many reasons, not least of which because opaque references essentially force the use of a naming service because clients cannot create object references without the help of an external service. I propose the use of three reified objects accessible to the network subsystem and transparent to the programmer that present resource mappings.

As mentioned in the introduction, the goals in my implementation of White are:

Openness: No part of the system is hidden. All stages are visible, accessible to and modifiable by, the user. The transformations between adjacent stages are evident.
Simplicity: I seek the most simplex language that achieves the stated requirements, and the most compliant C code that implements this language. This necessitates a simple syntax with a small number of language constructs.
Efficiency: I seek fast compilation and fast execution of White programs. This implies a fast, smart, one-pass compiler and a fast virtual machine. Additionally, flexible and consise protocol messaging, compiled down to ASN.1 distinguished encoding rules for on-the-wire communication is implemented.
Portability: I require White to run on as many platforms as possible. I want to be able to compile the White core unmodified everywhere, and to run White programs unmodified on every platform that has a suitable Lua interpreter. This implies a clean, strict ANSI C implementation with special attention to portability issues, such as avoiding dark corners of C and its librarys, and ensuring that it compiles cleanly as C++. I seek warning-free compilations.
Practical: I seek to reign in the sea of research that is at the vanguard of the distributed programming field by identifying the most viable paradigms. I will encorporate these elements into a single language runtime, with an eyes towards practicaliy, style and elegance.

White will use a parser produced by yacc, which is a very valuable tool while the language's suntax is unstable. The White compiler will use no intermediate representation. It will emit instructions for the virtual machine "on the fly" as it parses a program. Nevertheless, it will perform some optimizations. For instance, it may delay the generation of code for base expressions like variables and constants. When it parses such expressions, it generates no code; instead, it will a simple structure to represent them. Therefore, it is very easy to check whether an operand for a given instruction is a constant or a local variable and use those values directly in the instruction, thereby avoiding unnecessary and costly moves.

To be portable across many different C compilers and platforms, White must avoid many tricks commonly used by interpreters, such as direct threaded code. Instead, well use a standard while-switch dispatch loop.

4.1. Prototype-Based

As a consequence of parameter passing in the context of remote messages, objects are copied back and forth between remote hosts. Since an object in a class-based programming language cannot exist without its class, this copying of objects implies that classes have to be copied as well. However, a class is by definition an entity that is conceptually shared by all its instances. From a conceptual point of view there is only one single version of the class on the network, containing the shared class variables and method implementations. Hence, copying classes over the network causes state consistency problems because objects residing on different machines can independently update a class variable of "their" copy of the class.

By definition, classes impose a sharing relation upon all their instances. This relation is established at object creation time and remains implicit throughout the lifetime of all its instances. However, because of independent class updates performed by autonomous disconnected devices, two instances of the same class can unexpectedly exhibit different behaviour.

A simple solution consists of getting rid of classes and the sharing relation they impose on objects altogether. This is the paradigm defined by prototype-based languages like Self. In these languages objects are conceptually idiosyncratic, such that the above problems do not arise. Sharing relations between different prototypes can still be established (such as traits), but the point is that these have to be explicitly encoded by the programmer at all times.

Object discovery is perhaps one of the most challenging aspects of a decentralized system. Any node in the context of a White system is capable of cloning N active objects, all of which must both notify the network of their presence and broadcast their public functionality as made available in the Provided Mailbox.

5.1. Discovery

TODO

5.2. Acquaintance Management

TODO

6.1. Non-blocking Primitives

The fact that every hardware device is an autonomous computational entity (inducing natural concurrency), combined with the fact that connections are volatile, implies the necessity for non-blocking communication primitives. Blocking communication is a source of deadlocks, in our case, distributed deadlocks. Such deadlocks in local networks are not considered to be that harmful, since the cause of the deadlock can be easily debugged with contemporary remote debugging environments. However, in a mobile network environment, not all parties are necessarily available for communication, making the resolution of deadlocks extremely hard. Another, more important consideration when designing a concurrency model for a language that is to run on mobile networks, is that the communication mechanism should minimize the duration resources are locked. This is very important, because the extremely high latency of communication (over volatile connections) in mobile networks would diminish the availability of resources.

Indeed, having blocking communication primitives would imply a program or device to block upon encountering unstable connections or temporary unavailability of another device. One of the foundamental tenents in ambient-oriented programming tells us that communication must be non-blocking, to avoid the issues previously discussed. An ambient-oriented concurrency model is a concurrency model without blocking communication primitives. Quite often, the issue of non-blocking communication is confused with asynchronous message sending. Asynchronous message sending implies that the send operation is nonblocking, but tells us nothing about the, potentially implicit, receive operation.

6.2. Realtime Facilities

Characteristics

server cannot statically decide timing constraints on the server
rejected or aborted notifications from server do not always arrive

Requirements

best effort / least suffering
impossible to assume global clock
communication time cannot be bounded
cannot make assumptions about time permitted for computation on server

Elements

time-constrained futures
object response timing threshold contracts (inter-messaging by contract)

6.3. Message Mailboxes

TODO

6.4. Communication Traces

TODO

White plans to implement asymmetric coroutines (also called semi-symmetric coroutines or semi-coroutines). These coroutines will be supported by three functions from the White standard library: create, resume, and yield. The create function receives a "main" function and creates a new coroutine with that function. It returns a value of type thread that represents that coroutine (like all values in White, coroutines will be first-class values). The resume function (re)starts the execution of a given coroutine, calling its main function. The yield function suspends the execution of the running coroutine and returns the control to the call that resumed that coroutine.

Conceptually, each coroutine has its own stack. Coroutines in White should be stackful, in the sense that one can suspend a coroutine from inside any number of nested calls. The interpreter simply puts aside the entire stack for later use and continues running on another stack. A program can restart any suspended coroutine at will. The garbage collector collects stacks whose coroutines are no longer accessible.

The combination of stackfulness and first-class status makes coroutines, as implemented in White, equivalent to one-shot continuations. They allow the programmer to implement several advanced control mechanisms, such as cooperative multithreading, generators, symmetric coroutines, backtracking, and so on.

White will run programs by first compiling them into instructions ("opcodes") for a virtual machine and then executing those instructions. For each function that White compiles, it will create a prototype, which contains an array with the opcodes for the function and an array of White values with all constants (literal strings and numerals) used by the function.

White will use a register-based virtual machine. This register-based machine also uses a stack, for allocating activation records, wherein the registers live. When White enters a function, it preallocates from the stack an activation record large enough to hold all the function registers. All local variables are allocated in registers. Consequently, access to local variables should be particularly efficient.

Register-based code avoids several "push" and "pop" instructions that stack-based code needs to move values around the stack. The register architecture both avoids excessive copying of values, and reduces the total number of instructions per function. There are two problems typically associated with register-based machines: code size and decoding overhead. An instruction in a register machine needs to specify its operands, and so it is typically larger than a corresponding instruction in a stack machine. On the other hand, register machines generate less opcodes than stack machines, so the total code size is not much larger.

Most instructions in a stack machine have implicit operands. The corresponding instructions in a register machine must decode their operands from the instruction. Such decoding adds overhead to the interpreter. There are several factors that ameliorate this overhead. First, stack machines spend some time manipulating implicit operands (e.g., to increment or decrement the stack top). Secondly, in a register machine all operands are inside the instruction and the instruction is a machine word, so the operand decoding involves only cheap operations, such as logical operations. Moreover, instructions in stack machines frequently need multi-byte operands. On a register machine, because the operands are inside the instruction, the interpreter does not have to fetch them independently.

There are 35 proposed instructions in White's virtual machine. Most instructions were chosen to correspond directly to language constructs: arithmetic, table creation and indexing, function and method calls, setting variables and getting values. There is also a set of conventional jump instruction to implement control structures. Below is the complete set, together with a brief summary of what each instruction does, using the following notation:

R(X) means the Xth register
K(X) means the Xth constant
RK(X) means either R(X) or K(X-k)
G[X] means the field X in the table of globals
U[X] means the Xth upvalue

Figure N: Tentative instructions in White's proposed virtual machine.

MOVE         A B      R(A) := R(B)
LOADK        A Bx     R(A) := K(Bx)
LOADBOOL     A B C    R(A) := (Bool)B; if (C) PC++
LOADNIL      A B      R(A) := ... := R(B) := nil
GETUPVAL     A B      R(A) := U[B]
GETGLOBAL    A Bx     R(A) := G[K(Bx)]
GETTABLE     A B C    R(A) := R(B)[RK(C)]
SETGLOBAL    A Bx     G[K(Bx)] := R(A)
SETUPVAL     A B      U[B] := R(A)
SETTABLE     A B C    R(A)[RK(B)] := RK(C)
NEWTABLE     A B C    R(A) := {} (size = B,C)
SELF         A B C    R(A+1) := R(B); R(A) := R(B)[RK(C)]
ADD          A B C    R(A) := RK(B) + RK(C)
SUB          A B C    R(A) := RK(B) - RK(C)
MUL          A B C    R(A) := RK(B) * RK(C)
DIV          A B C    R(A) := RK(B) / RK(C)
POW          A B C    R(A) := RK(B) ^ RK(C)
UNM          A B      R(A) := -R(B)
NOT          A B      R(A) := not R(B)
CONCAT       A B C    R(A) := R(B) .. ... .. R(C)
JMP          sBx      PC += sBx
EQ           A B C    if ((RK(B) == RK(C)) ~= A) then PC++
LT           A B C    if ((RK(B) < RK(C)) ~= A) then PC++
LE           A B C    if ((RK(B) <= RK(C)) ~= A) then PC++
TEST         A B C    if (R(B) <=> C) then R(A) := R(B) else PC++
CALL         A B C    R(A), ... ,R(A+C-2) := R(A)(R(A+1), ... ,R(A+B-1))
TAILCALL     A B C    return R(A)(R(A+1), ... ,R(A+B-1))
RETURN       A B      return R(A), ... ,R(A+B-2) (see note)
FORLOOP      A sBx    R(A)+=R(A+2); if R(A) <?= R(A+1) then PC+= sBx
TFORLOOP     A C      R(A+2), ... ,R(A+2+C) := R(A)(R(A+1), R(A+2));
TFORPREP     A sBx    if type(R(A)) == table then R(A+1):=R(A), R(A):=next;
SETLIST      A Bx     R(A)[Bx-Bx%FPF+i] := R(A+i), 1 <= i <= Bx%FPF+1
SETLISTO     A Bx
CLOSE        A        close stack variables up to R(A)
CLOSURE      A Bx     R(A) := closure(KPROTO[Bx], R(A), ... ,R(A+n))

9.1. Local

White will use the canonical Boehm-Demers-Weiser conservative garbage collector, which can be used as a garbage collecting replacement for C's malloc. It allows you to allocate memory basically as you normally would, without explicitly deallocating memory that is no longer useful. The collector automatically recycles memory when it determines that it can no longer be otherwise accessed.

The collector uses a mark-sweep algorithm. It provides incremental and generational collection under operating systems which provide the right kind of virtual memory support (currently this includes Sun's Solaris, IRIX, OSF/1, Linux, and Windows, with varying restrictions). It allows finalization code to be invoked when an object is collected. It can take advantage of type information to locate pointers if such information is provided, but it is usually used without such information.

Performance of the nonincremental collector is typically competitive with malloc/free implementations. Both space and time overhead are likely to be only slightly higher for programs written for malloc/free. For programs allocating primarily very small objects, the collector may be faster; for programs allocating primarily large objects it will be slower. If the collector is used in a multithreaded environment and configured for thread-local allocation, it may in some cases significantly outperform malloc/free allocation in time. In many cases any additional overhead will be more than compensated for by decreased copying etc. if programs are written and tuned for garbage collection.

9.2. Distributed

Mobile ad hoc networks represent a new kind of distributed system with distinguishing characteristics that pose new challenges in the field of distributed garbage collection. In particular, DGC must deal with a highly partial disconnected network topology where remote references may be inaccessible for unpredicted amount of time.

DGC mechanisms typically determine the reachability of the remote objects by means of communication between the nodes involved. However, in a mobile setting where disconnections are the rule rather the exception, the system cannot determine how long it should wait for a connection to be restored. Knowing when the communication will be restored depends on how the application reacts to disconnections. Some applications may wait for the connection to be restored to resume its computation, while some others may continue their computation with a substitute service instead. Sometimes the geographic location of the devices or identify information may also influence the behaviour of mobile applications.

DGC thus depends not only on the object graph but also on the context in which the objects are themselves, that is, the semantics of the application and the role of the reference in the network, but also implicit context such as the location of devices.

TODO

Discussion of Message-based Programming (MBP)
Versioning. Deployed commercial software requires systems that allows for gradual upgrades of the software in a backware-compatible way. Protocol considerations between differing versions of the softare, and how nodes of various versioning can gracefully coexist and commicate. It is infeasible that all deployed applications be replaced or upgraded at once.

2007 May 01	Network Subsyste implementation complete.
2007 May 12	White Programming Language Architectural Overview document complete
2007 July 01	White Programming Language Specification document complete

17.1. Project White

17.2. See Also

ASN.1: Abstract Syntax Notation One (ASN.1) is a formal language for abstractly describing messages to be exchanged among applications involving the Internet, intelligent network and others. Due to its streamlined encoding rules, ASN.1 is also reliable and ideal for wireless broadband and other resource-constrained environments. Its extensibility facilitates communications between newer and older versions of applications.
Concurrent programming: Concurrent programming encompasses the programming languages and algorithms used to implement concurrent systems. Concurrent programming is usually considered to be more general than parallel programming because it can involve arbitrary and dynamic patterns of communication and interaction, whereas parallel systems generally have a predefined and well-structured communications pattern. The base goals of concurrent programming include correctness, performance and robustness.
Provided mailbox: Actors that want to make themselves available for collaboration can advertise themselves by placing descriptive tags in this mailbox.

My sinceren thanks to the Illinois Insitute of Technology's Coordination in Open Dynamic Environment (CODE) group, specifically Shangping Ren and Mattox Beckman.

[1-dedecker] J. Dedecker, W. Van Belle. 2004. Actors for Mobile Ad-Hoc Networks. In Proceedings of Embedded and Ubiquitous Computing, International Conference EUC2004, Aizu-Wakamatsu City, Japan.
[2-dedecker] J. Dedecker, T. Van Cutsem, S. Mostinckx, T. DHondt, W. De Meuter. 2005. Ambient-Oriented Programming In Companion of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications, San Diego, CA, USA.
[3-demeuter] W. De Meuter, E. Tanter, S. Mostinckx, T. Van Cutsem, J. Dedecker. 2005. Flexible Object Encapsulation for Ambient-Oriented Programming. In Proceedings of the Dynamic Language Symposium - Companion of the 20th annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications. San Diego, CA, USA.
[4-dedecker] J. Dedecker, T. Van Cutsem, S. Mostinckx, T. DHondt, W. De Meuter. 2006. Ambient-Oriented Programming in AmbientTalk In Proceedings of the 20th European Conference on Object-Oriented Programming (ECOOP), Dave Thomas (Ed.), Lecture Notes in Computer Science Vol. 4067, pp. 230-254, Springer-Verlag., 2006
[5-boix] Semi-Automatic Garbage Collection for Mobile Networks. Elisa Gonzalez Boix, Tom Van Cutsem, Stijn Mostinckx, Jessie Dedecker, Wolfgang De Meuter, and Theo DHondt. In Proceedings of the workshop on Object Technology for Ambient Intelligence and Pervasive Computing (OT4AmI), collocated with ECOOP 2006, Nantes, France.

Project White: Architectural Overview Document

1. Introduction

2. Guiding Principles

3. Design and Implementation

4. The Object Model

4.1. Prototype-Based

5. Resources and Acquaintances

5.1. Discovery

5.2. Acquaintance Management

6. Communication

6.1. Non-blocking Primitives

6.2. Realtime Facilities

6.3. Message Mailboxes

6.4. Communication Traces

7. Threads and Coroutines

8. Virtual Machine

9. Garbage Collection

9.1. Local

9.2. Distributed

10. Data Structures

11. Functions and Closures

12. Exceptions

13. Security

14. Conclusion

15. To Do

16. Roadmap

17. Resources

17.1. Project White

17.2. See Also

18. Glossary

19. Acknowledgments

20. References