JML Tutorial - Model Fields and Datagroups
When writing specifications for an abstract parent class or interface and a concrete derived class, the parent class needs to express its properties in terms of some abstract quantities. Model fields can be used to describe the properties of an abstract class and datagroups are used as an abstraction of frame conditions.
Using model fields
We first give some basic information about model fields and datagroups.
Model fields
A model field is a specification-only field that encapsulates some property of a class (typically, but not necessarily, an abstract class or interface).
- The model field can be used in specifications (e.g., for a class or interface and its methods).
- A model field is given an implementation by writing a
representsclause in the derived class; this connects the model field to the concrete implementation. - A model field can also be used within a concrete class to represent some characteristic that is important for clients to know about (whether or not that characteristic is directly implemented).
- A model field is typically an instance field. Although Java only allows static fields to be declared in an interface, in JML one can declare instance model fields in an interface (or class) using the modifier
instance, and such instance model fields declarations are inherited by classes that implement the interface. As the (Java) default isstaticfor field declarations in an interface, theinstancekeyword is required if the field is intended to be found in objects having that interface’s type.
Datagroups
A datagroup is a set of (concrete) memory locations in the program’s state, usually locations in the heap. However, within the body of a method, local stack locations might also comprise a datagroup (this could be useful for stating frame conditions of a loop, for example).
- Each model field is also a datagroup, as well as representing some abstract value. A standalone datagroup (not associated with a model field) can be declared using the type
\datagroup. - A datagroup is an abstraction of the set of memory locations used in a frame condition (in
assignsorassignableclauses, also known asmodifiesor writes clauses) and inaccessible(or reads clauses). Datagroups can be used as follows:- In an abstract class or interface a datagroup can be used in a frame condition, and
- In a concrete class, concrete fields can be declared to be
ina datagroup (and thus are included in frame conditions that include that datagroup).
Here is a working example that verifies, with commentary below.
// openjml --esc Polygon.java
interface Polygon {
//@ model instance public int sides;
//@ model instance public int longestSide;
//@ public invariant sides >= 3;
//@ assigns longestSide;
//@ ensures longestSide == \old(longestSide)/2;
public void half();
//@ ensures \result == sides; spec_pure
public int sides();
//@ ensures \result == longestSide; spec_pure
public int longestSide();
}
class Square implements Polygon {
public int side; //@ in longestSide;
//@ public represents sides = 4;
//@ public represents longestSide = side;
//@ ensures side == s && sides == 4;
public Square(int s) { side = s; }
// specification inherited
public void half() { side = side/2; }
// specification inherited; cf. the represents clause for sides
public int sides() { return 4; }
// specification inherited; cf the represents clause for longestSide
public int longestSide() { return side; }
}
class Test {
//@ requires polygon.longestSide < 10000;
public void test(Polygon polygon) {
int s = polygon.sides();
int p = polygon.longestSide();
polygon.half();
int ss = polygon.sides();
int pp = polygon.longestSide();
//@ assert s == ss;
//@ assert pp == p/2;
}
public void test2(Polygon polygon) {
//@ assert polygon.sides() == 4; // NOPE - could be any kind of polygon
}
public void test3(Square square) {
//@ assert square.sides() == 4; // OK
}
public void test4(Polygon polygon) {
if (polygon instanceof Square square) {
//@ assert square.sides() == 4; // OK as well
}
}
}
Polygonis an (abstract) interface with a concrete implementationSquarePolygondeclares two properties, as model fields found in instances of that type:sides(the number of sides) andlongestSide(the length of the longest side of the polygon)- There is an invariant saying that all polygons have at least 3 sides.
- There are two abstract methods that return the values of
sidesandlongestSide. These are declared to bespec_pure(deterministic and do not change anything, as described in the section about calling methods in specifications). - The method
halfhalves the size of the polygon:- It has an
assignsclause saying what datagroups can be modified. - It has a postcondition saying what happens to the memory locations that are modified.
- It has an
Square is an implementation of Polygon:
Squarehas just one data member,side(the length of a side of the square)- The constructor for
Squareinitializesside. - The clause
represents sides = 4gives a value to thesidesmodel field inPolygon - The clause
represents longestSide = sidegives a value to thelongestSidemodel field using concrete fields ofSquare - And the
sidefield is declared to be inlongestSide. Whenhalf(abstractly) assigns to the model fieldlongestField, then all the fields that are in the datagrouplongestFieldare considered assigned to. - Then all the methods that
Squareinherits fromPolygonare implemented as expected, but they can inherit all their specifications fromPolygon. No additional specifications are needed. For example, look at the methodsides(): the specification (inPolygon) says it returns the value ofsides, which is given a value by the represents clause, so the implementation ofSquare.sides()satisfies the abstract specification given forPolygon.sides().
Test just checks that the specifications given for Polygon work to verify some simple uses of the interface and its specifications.
- Note that the implementation and verification of
Test.test()only usesPolygonand its specifications. (That is, it does not use the specification ofSquare.) - The
assertintest2is incorrect (does not verify) because a polygon (as specified) can have a variety of numbers of sides. - But the
assertintest3succeeds, because aSquareknows how many sides it has. - In
test4()we do a type test to see if the inputpolygonis aSquare, and if so, we know how many sides it has.
The output when verifying the example above (though you may want to add the --progress option) is this:
Polygon.java:49: verify: The prover cannot establish an assertion (Assert) in method test2
//@ assert polygon.sides() == 4; // NOPE - could be any kind of polygon
^
1 verification failure
Using datagroups
In the above example, longestSide does not quite properly capture the effect of half(), which alters the lengths of all of the sides, not just
the longest one. The following code separates the datagroup aspect of longestSide into a more general, standalone, datagroup named allSides.
This code is only slightly different than the previous listing. In particular, Square and Test are precisely the same. But now an implementation of
another derived class, Triangle (with three different sides) is clearer: put all three sides in allSides and make the implementation of
longestSide be the maximum of the three sides.
// openjml --esc Polygon3.java
interface Polygon3 {
//@ model instance public int sides;
//@ model instance public \datagroup allSides;
//@ model instance public int longestSide; //@ in allSides;
//@ public invariant sides >= 3;
//@ assigns allSides;
//@ ensures longestSide == \old(longestSide)/2;
public void half();
//@ ensures \result == sides; spec_pure
public int sides();
//@ ensures \result == longestSide; spec_pure
public int longestSide();
}
class Square implements Polygon3 {
public int side; //@ in longestSide;
//@ public represents sides = 4;
//@ public represents longestSide = side;
//@ ensures side == s && sides == 4;
public Square(int s) { side = s; }
// specification inherited
public void half() { side = side/2; }
// specification inherited; cf the represents clause for sides
public int sides() { return 4; }
// specification inherited; cf the represents clause for longestSide
public int longestSide() { return side; }
}
class Triangle implements Polygon3 {
public int side1; //@ in longestSide;
public int side2; //@ in longestSide;
public int side3; //@ in longestSide;
//@ public invariant side1 <= longestSide && side2 <= longestSide && side3 <= longestSide;
//@ ensures this.side1 == s1 && this.side2 == s2 && this.side3 == s3;
public Triangle(int s1, int s2, int s3) { side1 = s1; side2 = s2; side3 = s3; }
//@ public represents sides = 3;
//@ public represents longestSide = Math.max(side1, Math.max(side2, side3));
public int longestSide() { return Math.max(side1, Math.max(side2, side3)); }
public int sides() { return 3; }
public void half() { side1 /= 2; side2 /= 2; side3 /= 2; }
}
class Test {
//@ requires polygon.longestSide < 10000;
public void test(Polygon3 polygon) {
int s = polygon.sides();
int p = polygon.longestSide();
polygon.half();
int ss = polygon.sides();
int pp = polygon.longestSide();
//@ assert s == ss;
//@ assert pp == p/2;
}
public void test2(Polygon3 polygon) {
//@ assert polygon.sides() == 4; // NOPE - could be any kind of polygon
}
public void test3(Square square) {
//@ assert square.sides() == 4; // OK
}
public void test4(Polygon3 polygon) {
if (polygon instanceof Square square) {
//@ assert square.sides() == 4; // OK as well
}
}
}
Singly-linked Lists
The following example specifies singly-linked lists (as in Lisp) with the null value representing the empty list.
// openjml --esc T_NullableList.java
interface T_NullableList {
/*@ model instance non_null Object elem; @*/
/*@ model instance nullable T_NullableList tail; @*/
//@ ensures \result <==> ls == null;
static boolean isEmpty(T_NullableList ls) {
return ls == null;
}
//@ ensures \result == this.elem;
/*@ non_null @*/ Object head();
/*@ ensures \result == this.tail; @*/
/*@ nullable @*/ T_NullableList tail();
}
Each node in such a list (i.e., each object of type T_NullableList)
is modeled as a non-null object (the field elem) together with a possibly null list (the field tail).
The two model instance fields are used to specify the methods in the interface.
The null value is used to represent the empty list, hence the static method isEmpty returns true if its argument is a null reference.
In the implementation of this interface, in the class T_NullableListImpl shown below, there is a constructor, which is needed to ensure that the model field elem (represented by the field car)
is initialized to a non-null value.
The methods head and tail inherit their specifications from the interface.
// openjml --esc T_NullableListImpl.java
public class T_NullableListImpl implements T_NullableList {
/*@ non_null @*/ Object car; //@ in elem;
//@ represents elem = car;
/*@ nullable @*/ T_NullableList cdr; //@ in tail;
//@ represents tail = cdr;
public T_NullableListImpl(/*@ non_null @*/ Object e,
/*@ nullable @*/ T_NullableList ls) {
this.car = e;
this.cdr = ls;
}
public /*@ non_null @*/ Object head() {
return this.car;
}
public /*@ nullable @*/ T_NullableList tail() {
return this.cdr;
}
}
The interface and the T_NullableListImpl class both verify without any errors.
Type annotations and fully-qualified type names
Java’s syntax for type annotations applied to fully-qualified type names is a bit unexpected. One writes
java.lang.@NonNull String (rather than the incorrect @NonNull java.lang.String).
Exercises
As an exercise, change the specification of Polygon above to enforce the invariant that the longest side should always have a strictly positive value. (Note that 1/2 is 0 in Java.) You may need to use assume statements, as the prover that openjml uses cannot prove some facts about division (due to fundamental limitations on logic). Check your work by using openjml to verify the correctness of the result.