Tuesday, June 17, 2014

[Concurrency - State Models & Java Programs] Ch 4: Shared Objects and Mutual Exclusion

In the last chapter, we discussed the execution of multiple processes on one or more processors, modeling concurrent execution by interleaving and executing multiple concurrent threads in a Java program. We explained how process interaction is modeled using shared atomic actions, but not how real processes or threads interact. In this chapter, we turn to the issues involved in constructing concurrent programs in which threads interact to communicate and cooperate.
The simplest way for two or more threads in a Java program to interact is via an object whose methods can be invoked by the set of threads. This shared object’s state can of course be observed and modified by its methods. Consequently, two threads can communicate by one thread writing the state of the shared object and the other thread reading that state. Similarly, a set of threads may cooperate to update some information encapsulated in a shared object. Unfortunately, as we will explain, this simple scheme of interaction does not work.

4.1 Interference

We have seen that the execution of the instructions from a set of threads can be interleaved in an arbitrary fashion. This interleaving can result in incorrect updates to the state of a shared object. The phenomenon is known as interference. The problem of interference, and how to deal with it, is the main topic of this chapter.

4.1.1 Ornamental Garden Problem

To focus on the issues of thread interaction, we use an example known as the problem of the Ornamental Garden, due to Alan Burns and Geoff Davies (1993). The problem is stated as follows. A large ornamental garden is open to members of the public who can enter through either of two turnstiles as depicted in Figure 4.1. The management wants to determine how many people there are in the garden at any one time. They require a computer system to provide this information.
To simplify the problem further, we consider a garden that people are allowed to enter but never leave! The concurrent program to implement the population count required by the management of the ornamental garden consists of two concurrent threads and a shared counter object. Each thread controls a turnstile and increments the shared counter when a person passes through the turnstile. The class diagram for the program is depicted in Figure 4.2.
The Counter object and Turnstile threads are created by the go() method of the Garden applet shown below in which eastD, westD and counterD are objects of the same NumberCanvas class that we used in Chapter 2.
private void go() {
  counter = new Counter(counterD);
  west = new Turnstile(westD,counter);
  east = new Turnstile(eastD,counter);
  west.start();
  east.start();
}
The Turnstile thread shown in Program 4.1 simulates the periodic arrival of a visitor to the garden by sleeping for half a second and then invoking the increment() method of the counter. After the arrival of Garden.MAX visitors, the run() method exits and consequently, the thread terminates.
Program 4.1: Turnstile class.
Image from book
class Turnstile extends Thread {
  NumberCanvas display;
  Counter people;

  Turnstile(NumberCanvas n,Counter c)
    { display = n; people = c; }

  public void run() {
    try{
      display.setvalue(0);
      for (int i=1;i<=Garden.MAX;i++){
        Thread.sleep(500); //0.5 second between arrivals
        display.setvalue(i);
        people.increment();
      }
    } catch (InterruptedException e) {}
  }
}
Image from book
The remaining class Counter is more complex than is strictly necessary. The additional complexity is to ensure that the program demonstrates the effects of interference independently of any particular implementation of Java. To ensure that the program demonstrates the desired effect, Program 4.2 ensures that arbitrary interleaving occurs.
Program 4.2: Counter class.
Image from book
class Counter {
  int value=0;
  NumberCanvas display;

  Counter(NumberCanvas n) {
    display=n;
    display.setvalue(value);
  }

  void increment() {
    int temp = value;    //read value
    Simulate.HWinterrupt();
    value=temp+1;       //write value
    display.setvalue(value);
  }
}
Image from book
It does this by using the class Simulate which provides the method HWinterrupt(). The method, when called, sometimes causes a thread switch by calling Thread.yield() and sometimes omits the call leaving the current thread running. The idea is to simulate a hardware interrupt which can occur at arbitrary times between reading and writing to the shared Counter when performing an increment. Thus thread switches can occur at arbitrary times as discussed at the beginning of the last chapter. The Simulate class is defined by the following code:
class Simulate {
    public static void HWinterrupt() {
        if (Math.random()< 0.5) Thread.yield();
    }
}
The problem with the Ornamental Garden program is illustrated by the screen shot of the running applet in Figure 4.3. When the Go button is pressed, the Garden.go() method is invoked to create a Counter object and the two Turnstile threads. Each thread then increments the counter exactly Garden.MAX times and then terminates. The value of the constant Garden.MAX has been set to 20, consequently, when both Turnstile threads terminate, the counter display should register that 40 people have entered the garden. In fact, as can be seen from Figure 4.3, the counter registers only 31. Where have the missing people gone? Why have nine increments to the counter been lost? To investigate why, we develop a model of the Ornamental Garden problem.

Figure 4.3: Garden display.

4.1.2 Ornamental Garden Model

In the remainder of the book, we generally model each object or set of objects as an FSP process. However, to find out why the Ornamental Garden program operates incorrectly, we must model it at the level of store accesses. Consequently, the model includes a VAR process that describes the read and write accesses to a store location. This store location is the value variable encapsulated by the people instance of the Counter class (Program 4.2). The complete model is described in Figure 4.4. The reader may be surprised that there is no explicit mention of an increment action. Instead, increment is modeled using read and write actions by the definition INCREMENT inside TURNSTILE. Each thread object, east and west, has its own copy of the read and write actions that make up the increment operation or procedure. This models what happens in the actual Java program since methods are re-entrant and thus the instructions which constitute a method may be interleaved on behalf of the threads executing the method concurrently. In other words, method activations are not atomic actions. The LTS for the TURNSTILE is given in Figure 4.5.

The alphabet of the process VAR has been declared explicitly as the set in Figure 4.4. We have not used set constants before. A set constant can be used wherever we previously declared sets of action labels explicitly. Sets are simply a way of abbreviating model descriptions. VarAlpha is declared as follows:
set VarAlpha = {value.{read[T],write[T]} }
The alphabet for the TURNSTILE process is extended with this set using the alphabet extension construct +{...}. This is to ensure that there are no unintended free actions. For example, if a VAR write of a particular value is not shared with another process then it can occur autonomously. A TURNSTILE process never engages in the action value.write[0] since it always increments the value it reads. However, since this action is included in the alphabet extension of TURNSTILE, although it is not used in the process definition, it is prevented from occurring autonomously. The TURNSTILE process is slightly different from its Java implementation in that it does not run for a fixed number of arrivals but may end at any point. However, it cannot end in the middle of updating the shared variable value. The end action is only accepted as an alternative to an arrive action. Furthermore, TURNSTILE is defined as recursive so that analysis (discussed below) will not report spurious deadlocks as would be the case if we had used STOP after the action end. Note that the shared variable VAR is not only shared by the turnstiles east and west, but also by display which is used for checking purposes.
Having developed a model of the Ornamental Garden program, in some detail, what can we do with it? Well, we can animate the model using the LTSA tool to produce action traces for particular input scenarios. For example, the trace in Figure 4.6 illustrates the case where there is an east arrival and a west arrival and then end occurs.

The trace is correct in that after two arrivals the counter has a value of two. However, we might try many input scenarios before finding out what is wrong with the program. To automate the search for the error, we combine a TEST process with the existing model that signals when an erroneous action trace occurs. The process is defined below:
TEST       = TEST[0],
TEST[v:T]  =
     (when (v<N){east.arrive,west.arrive}->TEST[v+1]
     |end->CHECK[v]
     ),
CHECK[v:T] =
    (display.value.read[u:T] ->
       (when (u==v) right -> TEST[v]
       |when (u!=v) wrong -> ERROR
       )
    )+{display.VarAlpha}.
The process counts the total number of east.arrive and west.arrive actions. When an end action occurs, and consequently the shared variable updates are complete, it checks that the value stored is the same as the total number of arrival events. If not, an error is declared by moving into the ERROR state. ERROR (like STOP) is a predefined FSP local process (or state). It is always numbered -1 in the equivalent LTS. Again, alphabet extension is used to ensure that no actions prefixed by display can occur autonomously.
The TEST process is combined with the existing model as follows:
||TESTGARDEN = (GARDEN || TEST).
We can now request the LTSA analysis tool to perform an exhaustive search to see if the ERROR state in TEST can be reached and if so to produce an example trace. The trace produced is:
Trace to property violation in TEST:
     go
     east.arrive
     east.value.read.0
     west.arrive
     west.value.read.0
     east.value.write.1
     west.value.write.1
     end
     display.value.read.1
     wrong
This trace clearly indicates the problem with the original Java program. Increments are lost because the shared variable is not updated atomically. Thus both east and west turnstiles read the value 0 and write 1. If the east increment finished before the west increment started or vice versa, then the result would be two (as in the previous trace).
Destructive update, caused by the arbitrary interleaving of read and write actions, is termed interference.
In real concurrent programs, interference bugs are extremely difficult to locate. They occur infrequently, perhaps due to some specific combination of device interrupts and application I/O requests. They may not be found even after extensive testing. We had to include a simulated interrupt in the example program to demonstrate the error. Without the simulated interrupt, the program is still incorrect, although the erroneous behavior may not manifest itself on all systems.
The general solution to the problem of interference is to give methods that access a shared object mutually exclusive access to that object. This ensures that an update is not interrupted by concurrent updates. As we see in the following sections, methods with mutually exclusive access can be modeled as atomic actions.

4.2 Mutual Exclusion in Java

Concurrent activations of a method in Java can be made mutually exclusive by prefixing the method with the keyword synchronized.
The Counter class from the Ornamental Garden program can be corrected by deriving a SynchronizedCounter class from Counter and making the increment method in the subclass synchronized as shown in Program 4.3.
Program 4.3: Corrected Counter class.
Image from book
class SynchronizedCounter extends Counter {

  SynchronizedCounter(NumberCanvas n)
     {super(n);}

  synchronized void increment() {
       super.increment();
  }
}
Image from book
Java associates a lock with every object. The Java compiler inserts code to acquire the lock before executing the body of a synchronized method and code to release the lock before the method returns. Concurrent threads are blocked until the lock is released. Since only one thread at a time may hold the lock, only one thread may be executing the synchronized method. If this is the only method, as in the example, mutual exclusion to the shared object is ensured. If an object has more than one method, to ensure mutually exclusive access to the state of the object, all the methods should be synchronized.
Access to an object may also be made mutually exclusive by using the synchronized statement:
synchronized (object) { statements }
This acquires the referenced object’s lock before executing the bracketed statement block and releases it on exiting the block. For example, an alternative (but less elegant) way to correct the example would be to modify the Turnstile.run() method to use:
synchronized(people) {people.increment();}
This is less elegant as the user of the shared object has the responsibility for imposing the lock, rather than embedding it in the shared object itself. Since not all users of the object may act responsibly, it may also be less secure against interference.
The output from the corrected Ornamental Garden program is depicted in Figure 4.7. The only change is to use the class defined in Program 4.3 rather than the original Counter class. This change is made by clicking the Fix It check box before pressing Go.
Once a thread has acquired the lock on an object by executing a synchronized method, that method may itself call another synchronized method from the same object (directly or indirectly) without having to wait to acquire the lock again. The lock counts how many times it has been acquired by the same thread and does not allow another thread to access the object until there has been an equivalent number of releases. This locking strategy is sometimes termed recursive locking since it permits recursive synchronized methods. For example:
public synchronized void increment(int n) {
   if (n>0) {
       ++value;
       increment(n-1);
   } else return;
}
This is a rather unlikely recursive version of a method which increments value by n. If locking in Java was not recursive, it would cause a calling thread to be blocked forever, waiting to acquire a lock which it already holds!

4.3 Modeling Mutual Exclusion

The simplest way to correct the model of the Ornamental Garden program listed in Figure 4.4 is to add locking in exactly the same way as it was added to the Java program. For simplicity, we ignore the detail that Java locks are recursive since whether or not the lock is recursive has no impact on this problem. A (non-recursive) lock can be modeled by the process:
LOCK = (acquire->release->LOCK).
The composition LOCKVAR associates a lock with a variable. It is substituted for VAR in the definition of GARDEN.
||LOCKVAR = (LOCK || VAR).
The alphabet VarAlpha is modified as follows to include the additional locking actions:
set VarAlpha = {value.{read[T],write[T],
                       acquire, release}}
Finally, the definition of TURNSTILE must be modified to acquire the lock before accessing the variable and to release it afterwards:
TURNSTILE = (go    -> RUN),
RUN       = (arrive-> INCREMENT
            |end   -> TURNSTILE),
INCREMENT = (value.acquire
             -> value.read[x:T]->value.write[x+1]
             -> value.release->RUN
            )+VarAlpha.
We can check this model in exactly the same way as before using TEST. An exhaustive search does not find any errors. Consequently, we have mechanically verified that this new version of the model satisfies the property that the count value is equal to the total number of arrivals when stop is pressed. A sample execution trace of the new model is shown below:
go
east.arrive
east.value.acquire
east.value.read.0
east.value.write.1
east.value.release
west.arrive
west.value.acquire
west.value.read.1
west.value.write.2
west.value.release
end
display.value.read.2
right
Now that we have shown that we can make shared actions indivisible or atomic using locks, we can abstract the details of variables and locks and model shared objects directly in terms of their synchronized methods. We can perform abstraction mechanically by hiding actions. For example, we can describe the behavior of the SynchronizedCounter class (over a finite integer range) by:
const N = 4
range T = 0..N

VAR = VAR[0],
VAR[u:T] = ( read[u]->VAR[u]
           | write[v:T]->VAR[v]).

LOCK = (acquire->release->LOCK).

INCREMENT = (acquire->read[x:T]
             -> (when (x<N) write[x+1]
                 ->release->increment->INCREMENT
                )
             )+{read[T],write[T]}.

||COUNTER = (INCREMENT||LOCK||VAR)@{increment}.
The definition of INCREMENT has been slightly modified from that used previously. The when clause ensures that the increment action can only occur when the value stored is less than N. In other words, increment is not allowed to overflow the range T. The alphabet declaration @{increment} means that read, write, acquire and release become internal actions (tau) of COUNTER. The LTS which results from minimizing COUNTER is depicted in Figure 4.8.
We can describe a single process that generates exactly the same LTS:
COUNTER = COUNTER[0],
COUNTER[v:T] = (when (v<N) increment->COUNTER[v+1]).
This is a much more abstract and consequently simpler model of the shared Counter object with its synchronized increment method. We have demonstrated above (by LTSA minimization) that it has exactly the same observable behavior as the more complex definition. A display action to read the value of the counter can be added as shown below:
DISPLAY_COUNTER = COUNTER[0],
COUNTER[v:T] = (when (v<N) increment->COUNTER[v+1]
               |display[v] -> COUNTER[v]).
The LTS which results from minimizing DISPLAY_COUNTER is depicted in Figure 4.9.
To implement this action in the Java class, we would simply add the synchronized method:
public synchronized int display() {
     return value;
}
In the following chapters, we usually model shared objects at this level of abstraction, ignoring the details of locks and mutual exclusion (as provided by the use of synchronized methods in Java). Each shared object is modeled as an FSP process, in addition to modeling each Java thread as an FSP process. The model of a program does not distinguish active entities (threads) from passive entities (shared objects). They are both modeled as finite state machines. This uniform treatment facilitates analysis.

Summary

In this chapter, we have discussed thread interaction via shared objects. The Ornamental Garden example served to demonstrate that uncontrolled interleaving of method instructions leads to destructive update of the state of the shared object. This is termed interference. Interference can be avoided by giving each concurrent method activation mutually exclusive access to the shared state. In Java, this is achieved by making such methods synchronized. Synchronized methods acquire a lock associated with the object before accessing the object state and release the lock after access. Since only one thread at a time can acquire the lock, synchronized methods obtain mutually exclusive access to the object state.
Interference bugs in real concurrent programs are notoriously difficult to find. They can be found by analyzing program models as we demonstrated. However, this requires detailed modeling at the level of store accesses to variables. Such models quickly become too large to analyze. The answer is to ensure systematically that all the methods of objects shared between threads are synchronized. They can then be treated as atomic actions for modeling purposes.

Notes and Further Reading

We have dealt with the problem of access to shared objects encapsulating variables. However, access to any resource must be made mutually exclusive if the resource cannot be concurrently shared. These resources are sometimes termed serially reusable since they can be used by many processes at different times but not shared by many processes at the same time. For example, a printer can only produce the output from one print job at a time. The solution of using locks to ensure mutual exclusion is a general one that can be applied to controlling access to a printer in the same way as to shared variables.
A solution to the mutual exclusion problem was first proposed by Dijkstra (1965). All operating systems and concurrent programming textbooks deal with mutual exclusion. Usually, great emphasis is placed on the concept of a critical section. A critical section is simply the section of code belonging to a thread or process which accesses the shared variables. To ensure correct behavior, this critical section must be given mutually exclusive access by acquiring a lock before the critical section and releasing it afterwards. We have not used the term critical section since, in an object-oriented language such as Java, shared variables are encapsulated in objects and accessed via methods. In other words, synchronized methods are critical sections. The synchronized statement is another way of making a section of code a critical section in Java. However, it is usually the case in well-designed concurrent object-oriented programs that critical sections are methods.

Exercises

  • 4.1 Modify the uncorrected version of the Ornamental Garden program such that Turnstile threads can sleep for different times. Is it possible to choose these sleep times such that interference does not occur?
  • 4.2 Given the following declarations:
    const N = 3
    range P = 1..2 //thread identities
    range C = 0..N //counter range for lock
    
  • Model a Java recursive lock as the FSP process RECURSIVE_LOCK with the alphabet {acquire[p:P],release[p:P]}. The action acquire[p] acquires the lock for thread p.
  • 4.3 A central computer connected to remote terminals via communication links is used to automate seat reservations for a concert hall. A booking clerk can display the current state of reservations on the terminal screen. To book a seat, a client chooses a free seat and the clerk enters the number of the chosen seat at the terminal and issues a ticket. A system is required which avoids double-booking of the same seat while allowing clients free choice of the available seats. Construct a model of the system and demonstrate that your model does not permit double-bookings. (Hint: It is only necessary to model a few terminals and a few seats. Remember, a seat can appear to be free although it is booked or being booked by another clerk.)
  • 4.4 Write a Java program that implements the seat reservation system of exercise 4.3.

No comments:

Post a Comment