ap5 from a programmers point of view

AP5 from a programmers point of view

Dennis G. Allard
http://oceanpark.com
March 29, 1990
minor revisions: December 11, 1996, October 29, 2014, August 12, 2022

Introduction

This is a ruthlessly abridged presentation of AP5, a high level language, currently implemented as an extension to Common Lisp. I write this to communicate my discovery of a good language to others. Everything described is implemented.

Please refer to ap5.com for a more complete definition of AP5, including the AP5 reference manual and freely downloadable source code.

AP5 is a specification language in that it enables a programmer to specify programs at a level more abstract than with standard programming languages. The AP5 compiler converts the programmer's spec into LISP code, alleviating the programmer from the need to make many data structuring and coding decisions.

AP5 was designed and implemented by Don Cohen, inspired by the AP3 language of Neil Goldman. Don and Neil have worked together to meld AP5 into what it is today, supported by a cast of AP5 users and developers . Work on AP5 continues and descendants of AP5 have been developed, in the guise of extensions to C++, Ada, and a new version of AP5, known as rellisp.

We will present AP5 via an example application, which we specify in English as follows. We want a system which keeps a data base of books and other reference items in a library, keeps track of who has borrowed or reserved different reference items, and performs miscellaneous bookkeeping functions for a librarian. Let us assume that we have entities of type Reference, Person, and "primitive" types String and Integer. Reference items will have any number of authors and a title. A reference item may be borrowed by zero or one person, and be reserved by any number of people. We want to be able to know who reserved an item first, when more than one person has done so.

Notation

In what follows, the AP5 language is presented using a syntax which resembles lisp but is somewhat stylized for purposes of elucidating the constructs. Someone who understands Lisp syntax should have no problem understanding the syntax. For those less familiar with Lisp, let it be known that Lisp makes use of Cambridge Polish Notation in which a function or operator, F, and its arguments, x, y, and z are displayed as (F x y z) instead of via the more familiar F(x, y, z) used in FORTRAN, Visual Basic and Java.

Throughout this document, angle brackets are used to delimit an informal concept which, if we were to elucidate further, could and would be explained more fully. So, when you see <a piece of syntax like this in angle brackets>, you are expected to understand what is intended from context, with no further explanation required beyond what appears between the angle brackets.

AP5 makes use of expressions of the form:

<list of variables> s.t. <wff>

In these expressions, "s.t." is an abbreviation for "such that" and <wff> is a Well Formed Formula in the language of First Order Logic. We will see examples of such expressions shortly.

AP5

AP5 extends Common Lisp with relations, operations on relations, consistency rules, and automation rules. Operations and rules specify the semantics of programs. Efficiency details are declared via annotations.

A relation is a possibly mutable set of tuples of objects. By object, or Entity, we mean, roughly, a Lisp object. There are two kinds of relations: transition and derived. Transition relations are finite relations whose tuples must be explicitly asserted or retracted. Derived ones are either computed by Lisp code or defined by a wff (Well Formed Formula), in the language of first order logic. A derived relation can be declared to be constant, i.e., to have an unvarying set of tuples.

Transition relations

Examples of declaring transition relations for a library data base are:

(defrelation Reference :types (Entity))
(defrelation author :types (Reference Person)
(defrelation title :types (Reference String))
(defrelation borrower :types (Reference Person))
(defrelation reserver :types (Reference Person))
(defrelation reserve :types (Reference Person Date))

Here, six relations are declared, one unary, four binary, and one tertiary. For each relation, we have indicated the types of each of its domains. For example, the 2-ary relation borrower above is declared, and is required to be a subset of {<r,p> s.t. (and (Reference r) (Person p))}. Entity is a unary relation true of any object.

The types mentioned here are called type constraints. We are stating that objects in the various slots of the relations above will be of certain types. In fact, we are requiring that they be of the stated types. But what is a type, in AP5? A modelling type, or type for short, in AP5 is merely a unary relation. This notion of type is not the same as the notion of type in programming languages, where a type determines the representation or a valid set of operations on objects of that type. We will discuss type constraints in a later section. For now, simply view a type constraint as saying that an object in a tuple of the constrained relation must by of a certain type, i.e., must be in a certain unary relation. For example, that any object in the range of author must be in the unary relation Person.

The above relations are transition relations, meaning that a program must explicitly assert which tuples are in them. The term 'transition' comes from the fact that tuples are asserted (or retracted) in atomic transactions which cause a transition between states of the database.

Note that nothing has been said about how to represent any of these relations. From a conventional point of view, what is going on here is that the programmer wants to specify the existence of certain data structures. In some languages, the programmer would have to state that a binary relation is to be implemented as a hash table, or a tree mapping keys to values, or as something else. In AP5, a default representation will be automatically chosen. As we shall later see, the programmer may, in a declarative manner, influence representation choices.

Operations involving relations and wffs

The user is going to write AP5 code which operates on borrower and other relations. Such operations are also abstract in that they operate on the abstract relation and not on any underlying representation.

The primitive operations on relations are adding, deleting, testing and generating. Example AP5 forms for these operations are, respectively:

(++ (borrower <principia> <L.Wittgenstein>))
(-- (borrower <principia> <D.Knuth>))
(?? (borrower <principia> <D.Knuth>))
(loop for (X Y) s.t. (borrower X Y) do <operations on X and Y>)

The first three forms add, delete, and test specific tuples of the borrower relation. The last form iterates over all tuples <X,Y> in borrower, performing some operations on each tuple generated.

We refer to adding and deleting as updates, testing and generating as accessing. Updates operate on named relations, accessors on arbitrary wffs, stated in a reverse Cambridge variant of first order logic syntax. In other words, accessors are quite general. For example:

(?? (E (R P) (borrower R P)))

(loop for (R) s.t. (E (P) (and (last-name P "Wittgenstein")
                               (borrower R P)))
      do <operations on R>)

The symbol E is the existential quantifier symbol. We use A as the universal quantifier.

Every stored relation has an implementation to support these database operations. There is a user extensible implementation library. Hash tables, trees, or anything a programmer wants can be specified. Note that some operations need not or cannot be supported by an implementation. For example, a binary relation implemented as a hash table without back pointers might not support generating the domain elements related to a given range element. Or an encryption relation between a key and its encryption would surely be unable to support generating a key given an encryption.

The default implementation is a list of tuples where a tuple is a list of objects. The default is good for prototyping since it supports all operations.

Typically, implementations store data in main memory. Because of this, we often say that AP5 includes a Virtual Memory Data Base, VMDB. AP5 has a limited persistency mechanism for data which a user wants to keep around across virtual memory incarnations. That persistency mechanism is under development and is not documented here.

Transactions

During program execution, the VMDB is always in a state. The state is mutable. It is changed by updates to one or more relations. Updates to relations occur within an atomic transaction, or atomic for short. For example:

  (atomic
     (++ (borrower <principia> <L.Wittgenstein>))
     (-- (borrower <principia> <D.Knuth>)))

would take the data base from a state in which Knuth was the borrower of Principia to a state in which he no longer was the borrower but in which Wittgenstein is the borrower.

In other words, an atomic groups together one or more updates to one or more relations, and makes the database transition from its state prior to execution of the atomic to a new state in which all of the updates in the atomic have occurred. The concept of an atomic transaction will be discussed in more detail when we define consistency rules in a later section.

Derived relations

In addition to transition relations, there are two kinds of derived relations, defined and computed. Defined relations are specified in terms of a logical formula, computed ones in terms of Lisp code.

An example of declaring a defined relation is:

(defrelation first-reserver
   :definition
   ((r p) s.t. (and (reserver r p)
                    (A (z p1 z1) (implies (and (reserve r p z)
                                               (reserve r p1 z1))
                                          (<= z z1))))})

First-reserver is not a transition relation. Therefore, in general, update operations make no sense on it. But it can be accessed (queried or generated) just like stored relations are. For example:

(?? E (R) (first-reserver R <L.Wittgenstein>)

asks if there exists a reference item for which the person <L.Wittgenstein> is the first reserver.

At query or generation time, AP5 derives the answer. In compiling uses of first-reserver, AP5 searches the space of algorithms determined by the defining wff and the supporting relations and selects the algorithm deemed most efficient based on annotations.

The relations reserver and reserve are quite related in meaning. As originally declared, the program would need to explicitly update both of them in a coordinated manner. We would prefer to state the relationship by declaring:

(defrelation reserver
   :definition ((r p) s.t. (E (z) (reserve r p z))))

This would make reserver be a defined relation, and no longer an updatable transition relation. However, see the next section.

Maintained relations

It is possible for a relation to have both a definition and a representation. Suppose we define reserver as:

(defrelation reserver
   :definition ((r p) s.t. (E (z) (reserve r p z)))
   :representation <our favorite implementation>)

Such a relation is called a maintained relation. It cannot be explicitly updated by code. But AP5 will cache all the tuples which it is true of, and monitor transactions so that whenever a relation in the support of its definition is updated, the cache will be updated, if necessary.

For example, programs are not allowed to explicitly update reserver. But updates to reserve affect the data cached for reserver, based on the definition for reserver.

AP5 programs

We have given isolated examples of stored and derived relations, and operations on those relations. In general full blown programs can be written in a combination of Lisp and AP5. For example, here is a Lisp function, consisting largely of AP5 forms, which a librarian could use to print out a list of people who have reserved some item and when they reserved it. For convenience of expression, we also define a new defined relation, Reserved.

(defrelation Reserved
  :definition ((r) s.t. (E (p) (reserver r p))))


(defun report-reservers ()
    (loop for ref s.t. (and (Borrowed ref) (Reserved ref))
        do
        (let ((borrower (theonly person s.t.(borrower ref person)))
              (reserver (theonly person s.t. (first-reserver ref person)))
              (title (theonly x s.t. (title ref x))))
            (print title "is borrowed by" borrower)
            (print title "was reserved by" reserver "on"
                   (theonly date s.t. (reserve ref reserver date)))
    )
)

In the above code we use Lisp constructs such as LET which binds two local variables (borrower and reserver) to a result returned by the AP5 macro "theonly". We don't explain Lisp in this document beyond what we said in the introduction. Ap5 macros such as "theonly" are fully explained in the ap5 manual which contains a useful index of terms and concepts.

The above code snippet is meant convey that a programmer can use ap5 as a general purpose programming language that provides a beautiful and general way to declaratively specify and generate data, separating the semantics of a program from the details of how the data is represented.

Annotations

We have not yet stated how any of the declared relations are represented other than that the default representation is a list of tuples. However, the programmer may declare how a relation in represented via annotations. Suppose after writing a number of functions such as the above, we determine that the only operation we do with title is to generate the title of a given reference. This might prompt us to use some kind of fast indexing implementation to map references to titles. The name of such an implementation (for binary relations) is Structprop. We could redeclare title via:

(defrelation title
   :types (Reference String)
   :representation Structprop).

The details of what Structprop is are unimportant for this presentation. It suffices to know that it provides a method of hashing certain kinds of objects in Lisp.

In general, as stated earlier, a relation is represented by an implementation, selected from a library of implementations built into AP5. There is a way to invent new implementations so that you can represent your relations with your favorite data structures. An implementation packages together the data structures for storing a relation and the code to which update and access operations on the relation are converted by the compiler.

One of the beauties of AP5 is that we do not need to rewrite any of our previous code which uses title. We need merely recompile it! AP5 will take care of reinterfacing our new representation to its uses. Note than in a conventional programming language, such changes of representation can involve substantial recoding.

AP5 makes an effort to automatically recompile things when that is necessary. For example, uses of title from before had been compiled into algorithms using the default Lisp list implementation. AP5 knows that those uses are now invalid, and keeps enough information around at run time to be able to dynamically recompile them.

Here are all the kinds of annotations on relations:

representation

size

speed of operations

Size and speed are used by the AP5 compiler to help determine what of many possible programs a compound wff should be implemented as. For example, the following annotations tell the system that about one out of a hundred References will typically be borrowed and that there will usually be 100 items checked out.

(defrelation borrower
   :types (Reference Person)
   :size ((input output) .01 (output output) 100))

More semantics

So far, we have declared some relations and written a small amount of code. Some of the relations are explicitly updated, others are derived. We have stated some type constraints. All of these things have enabled us to specify the semantics of our program. We have also discussed annotations, which are a way to declaratively affect the efficiency of the program.

We will now go into other things we can do in AP5 to indicate more of the semantics of a program, which we would need to code in an ad hoc manner in a more conventional language. These things include:

type constraints

count constraints

repair idioms

derivation idioms

consistency rules

automation rules

Type constraints serve as a check on what information is asserted into a relation. For example, the first domain of title is the type Reference. This means that if the program asserts (++ title X Y), then X must be in the Reference relation, i.e. (?? Reference X) be true. If not, AP5 may abort the atomic in which the assertion is executed. We will return to the subject of when and how AP5 aborts an atomic when we discuss consistency rules.

There is a type lattice used to record subtype relationships among a special kind of type, called a base type. For a base type, J, AP5 knows that an object of type J is also of type K, if K is a supertype of J as defined by the type lattice. We will not illustrate this capability in this paper. Note, AP5 has no built in notion of inheritance other than this simple subtyping. And, AP5 has no notion of methods associated with a type lattice, as do object oriented languages.

Count constraints enable stating things such as a reference must have one and only one title. We could augment our program by giving title a unique count constraint on its range:

(defrelation title
   :types (Reference String)
   :count ((input output) :unique :replacing)).

The term :replacing here is an example of a repair idiom. It is a statement that if ever an attempt is made to assert a title for a reference which already has one, the new title should replace the old one. If we did not provide this repair, the unique count constraint would cause AP5 to abort any transaction which attempted to assert a title for a reference already having a title.

Derivation idioms

Derivation idioms are things like saying that a relation is the transitive closure of some other relation or that it is an aggregate value, e.g. maximum or sum, defined over some other relation. For example, we could have defined first-reserver as follows:

(defrelation first-reserver
   :derivation (extreme reserve (fix vary <=))))

Here, the extreme idiom defines a binary relation which is a projection onto the first two domains of reserve (reference and person) of the tuples of reserve whose third domain value (a date) is minimum, according to <=, over all tuples having a common first domain value. In other words, vary the second domain while holding the first domain fixed, and gather all tuples having a minimum third domain value. Before, we used a wff to specify an algorithm which produced the desired extreme tuples. The extreme annotation tells AP5 what that original wff meant and enables AP5 to select a better algorithm for computing tuples of first-reserver. In this case, AP5 would use an algorithm linear in the number of reservers, instead of a quadratic one, for generating the first reserver for a given reference. The linear algorithm is provided as part of the implementation of the extreme idiom.

Another useful derivation idiom is transitive closure. Suppose we augment our library data model with a binary relation citation where citation(ref1, ref2) iff ref1 cites ref2 (e.g., in it's bibliography):

(defrelation citation :types (Reference Reference)).

Then suppose we want to study chains of citations, where one reference refers indirectly to another via a chain of citations. One might be tempted to define the transitive closure of citation as follows:

(defrelation citation*
  :definition ((x y) s.t. (or (citation x y)
                              (E (z) (and (citation x z)
                                          (citation* z y)))))).

Unfortunately, AP5 does not allow recursive definitions. However, we can explicitly tell it to create a transitive closure:

(defrelation citation*
  :derivation (tclosure citation)).

In our experience, the transitive closure idiom enables us to conveniently specify everything which one might otherwise be tempted to obtain via some kind of recursive definition.

Consistency rules

Consistency rules state invariants on the database. The type constraints used earlier to restrict the domains of relations are implemented in AP5 via consistency rules. In general, a consistency rule has a trigger and an action. The trigger is a wff. The action is an AP5 program. The state of the database is changed by an atomic transaction which proposes one or more updates, which semantically all occur 'at once'. After an atomic proposes its updates, all triggers of all rules are examined. Each trigger sees the database plus the proposed updates. Wffs in rules use an extension to FOL, called two state FOL which includes the metapredicate start. (Start wff) iff the wff is true after the proposed updates and was false before the atomic. If the trigger fails, the action is invoked and may propose further updates in an attempt to repair the database so that the trigger will no longer be violated. Once all actions of all triggered rules run, a consistency cycle is complete. Cycles repeat, with proposed updates consisting of all proposals from previous cycles, until the database is consistent, an unrepairable contradictory set of updates is collected, or the program explicitly aborts the transaction. In general it is undecidable if a given set of rules will cycle forever. The notion of consistency here is operational in the sense that the database is 'consistent' if no triggers are violated. Consistency cycling is a powerful and novel feature of AP5. Example:

(defconsistency borrower-subsumes-reserver
  (not (E (r p) (and (reserver r p)
                     (borrower r p))))
  (LAMBDA (r p)
     (if (?? start (borrower r p))
         (-- reserve r p (the time s.t. (reserve r p time))))))

where the syntax here is:

(defconsistency <rule name> <trigger> <action>).

The above rule assures that all atomics will preserve the database in a state such that no reference is both borrowed and reserved by the same person. In the case that it is asserted that someone who had reserved something borrows it, a fix is possible. In the case where it is asserted that someone who had borrowed something reserves it, the atomic will be aborted.

Automation rules

Automation rules are for reacting to changes of state of the database. Their syntax is identical to consistency rules. Their semantics, however shoddy it may be, is that the rule fires when the trigger becomes true as the result of an atomic. The action is run after performing the updates of the atomic before returning from the atomic. There is no notion of repairing or cycling for automation rules. Here is an example automation rule:

(DEFAUTOMATION notify-reserver
  (E (R P) s.t. (start (and (firstreserver R P)
                            (not (E (P) (borrower R P))))))
  (LAMBDA (R P)
    (mail-notification <librarian> P "notified that" R "is being held")
    (mail-notification P R "is being held for you at the library")))

Annotations vs. Semantics

When we talked about annotations, we were talking about efficiency issues. We mentioned that annotations can be altered without recoding uses of the annotated relations. This is not the case for the constraints, idioms, and (in general) rules which we have outlined. Adding or changing rules affects the semantics of the program and may require us to recode parts of it. For example, changing a type constraint to be more restrictive would invalidate any code which relied on the previous more general constraint. Or if we added a count constraint to a relation, then some previously valid code may no longer be so.

Towards object based computation environments

At ISI, computation environments have been implemented in AP5. Mechanisms are being built into AP5 to support persistent data. AP5 can, does and will serve as a language for building a general purpose computing environment for maintaining a uniform data model to support personal data processing, including electronic mail, appointment scheduling, etc., as well as internet data, software development, and system prototyping.

end of document