Interrupting Long Operations

Copyright © 2002 by Desaware Inc. -- All rights reserved.

In our article "Introduction to State Machines" you read how StateCoder can be used to implement "long operations" - computational tasks that take a significant amount of time to complete, in a way that minimizes impact on the client application or component. In this article we will explore this subject in more depth, and address the problem of interrupting these operations.

Figure 1 illustrates the generic case of a series of long operations.


Figure 1 - Series of 3 Long Operations

Each long operation is represented by a "black box" - indicating that we can't interfere with the internal behavior of this code. Under StateCoder, each long operation will typically represent a function called during a state, where each state calls one long operation and the sequence of long operations is determined by the state machine.

What takes so long?

Before we explore the various ways of working with long operations, it's important to understand a little bit more about the types of long operations that can exist. Long operations actually divide into two categories:

  1. Long operations that are computationally intensive, i.e., you are performing a long series of calculations. Complex image processing, compiling a program, and running a database query on a local system are examples of potentially computationally intensive tasks.
  2. Long operations that are long because you are waiting for an outside event to occur. A call to a web service, database query to a server, or file download are examples of these kinds of long operations.

From the perspective of a client, both types of long operations are the same if you call them synchronously (i.e., you start the operation and cannot continue executing your code until the operation completes). 

There is actually only one way to avoid blocking the client with a long operation: place the operation on a different thread. There are two basic ways to do this. One is to create a new thread and place the long operation on that thread. The other is to use an asynchronous call to the long operation (either through delegates, or through the standard .NET asynchronous design pattern in cases where the long operation has native asynchronous support). Keep in mind that making an asynchronous call still places the long operation on a different thread - it's just that the thread is provided by the .NET framework.
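The two approaches can be sketched as follows. This is a minimal illustration rather than StateCoder code: LongOperation is a hypothetical stand-in for the "black box", and ThreadPool.QueueUserWorkItem is used here to show work running on a framework-provided thread, in place of the delegate-based asynchronous call mentioned above.

```vbnet
Imports System
Imports System.Threading

Module OffloadSketch
    ' Signaled when the thread-pool copy of the operation has finished.
    Private ReadOnly m_PoolDone As New ManualResetEvent(False)

    ' Hypothetical stand-in for a "black box" long operation.
    Private Sub LongOperation()
        Thread.Sleep(2000)
    End Sub

    ' QueueUserWorkItem expects a callback that takes a single Object argument.
    Private Sub LongOperationOnPool(ByVal state As Object)
        LongOperation()
        m_PoolDone.Set()
    End Sub

    Sub Main()
        ' Approach 1: create a dedicated thread for the long operation.
        Dim worker As New Thread(AddressOf LongOperation)
        worker.Start()

        ' Approach 2: let the framework supply the thread from its pool.
        ThreadPool.QueueUserWorkItem(AddressOf LongOperationOnPool)

        ' Neither call above blocks, so the client reaches this line immediately.
        Console.WriteLine("Client thread is free to continue")

        ' Wait for both copies only so the demo does not exit early.
        worker.Join()
        m_PoolDone.WaitOne()
    End Sub
End Module
```

In both cases the client thread continues right after starting the work; the only difference is who owns the thread that runs it.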

In the context of StateCoder, both approaches are easily supported. Placing the long operation in a state machine that is running in its own thread (or the StateCoder thread pool) is a way of explicitly moving the operation to another thread. Or you can use the AsyncResultMessageSource (or your own message source) to effectively turn the long operation into a message source. Like any asynchronous operation, it will run on the .NET thread pool.

Why is it important to understand the two types of long operations? Because they differ significantly in their impact on the system. A long operation that is waiting for an external event will typically suspend the thread holding the operation until the operation completes. This is a very efficient way of handling long operations, because a suspended thread has very little performance impact. Consider a data query to an external server. The client machine does not need to perform any processing on that query until the server responds, so the thread can be suspended until the server operation completes.

You can, in fact, design an application that uses one thread to manage many of these asynchronous requests, since it is much more efficient for a single thread to wait on a group of requests than to create a separate thread for each request. StateCoder uses this exact approach when managing multiple message sources on the thread pool.

The story is very different for computationally intensive operations. The background thread performing the long operation continues to work during the operation. In this case you will probably want to avoid creating too many background threads, since each one represents an additional load on the processor. In many cases computationally intensive long operations are best handled using a queuing system, where a limited number of threads are allocated to the task, each one performing queued operations in series. This is, by the way, easily handled by a state machine as well.
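A queuing system of this kind might look like the sketch below. WorkQueue is a hypothetical helper written for illustration, not a StateCoder class, and it assumes a .NET version that supports generic collections. A fixed number of worker threads pull jobs from a shared queue, so the number of computations running at once never exceeds the worker count.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical helper for illustration; not part of StateCoder.
Public Class WorkQueue
    Private ReadOnly m_Lock As New Object()
    Private ReadOnly m_Jobs As New Queue(Of ThreadStart)()

    ' Start a fixed number of background worker threads.
    Public Sub New(ByVal workerCount As Integer)
        For i As Integer = 1 To workerCount
            Dim t As New Thread(AddressOf WorkerLoop)
            t.IsBackground = True
            t.Start()
        Next
    End Sub

    ' Add a job and wake one waiting worker.
    Public Sub Enqueue(ByVal job As ThreadStart)
        SyncLock m_Lock
            m_Jobs.Enqueue(job)
            Monitor.Pulse(m_Lock)
        End SyncLock
    End Sub

    ' Each worker runs queued jobs one after another, in series.
    Private Sub WorkerLoop()
        Do
            Dim job As ThreadStart
            SyncLock m_Lock
                ' Monitor.Wait releases the lock while the worker is idle.
                Do While m_Jobs.Count = 0
                    Monitor.Wait(m_Lock)
                Loop
                job = m_Jobs.Dequeue()
            End SyncLock
            ' Run the job outside the lock so other workers can dequeue.
            job.Invoke()
        Loop
    End Sub
End Class
```

A client would create something like New WorkQueue(2) and call Enqueue(AddressOf SomeLongCalculation) for each task. The sketch omits shutdown and error handling; an exception thrown by a job would end that worker thread.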

Interrupting Long Operations

If you look at the StateCoder examples, you will often see us use the Sleep operation to represent a long operation. The Sleep operation is a synchronous call to the second type of long operation (non-computationally intensive). We could have also used a very long loop which would better demonstrate a computationally intensive long operation.

The important thing for you to keep in mind about a long operation is this: unless the long operation itself has a mechanism to "abort" or "interrupt" the task, there is no way to safely abort the operation from the outside!

This is obvious when you think about it: Imagine you are calculating a large prime number (a computationally intensive task). Remember, this operation is taking place in a "black box". Unless that "box" has a method you can call to abort the calculation, any attempt to interrupt the operation effectively means killing the thread and leaving the calculation in some undetermined state.

So if you wish to interrupt a long operation, you have the following choices:

  1. You can terminate the thread performing the operation (take your chances).
  2. You can redesign the long operation into a sequence of shorter operations.
  3. You can redesign the long operation to provide an "abort" mechanism.
  4. You can abandon the long operation - simply stop waiting for the result and allow it to complete at its own pace. For example, in the case of an asynchronous database query you might simply ignore the resulting data.

For our purposes we will assume that terminating the thread is not an option (since that reflects very poor design and may have a long term impact on system resources). The second option is obvious, and represents pretty close to the ideal solution. If you can break up the long operation into a sequence of shorter operations (or repetitive invocations of the same shorter operation), you have effectively turned the long operation into a state machine. Once you've converted a single long operation "black box" into its own state machine, you can invoke the state machine asynchronously (each state machine is also a message source), and you may interrupt the operation by using additional message sources in either state machine. 
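Options 2 and 3 often go together: once the work is divided into short steps, an abort check fits naturally between them. The sketch below applies this to the prime number example. PrimeSearch and its members are hypothetical names chosen for illustration, and the abort flag is a one-shot flag that is never reset.

```vbnet
Imports System
Imports System.Threading

' Hypothetical example class; names are illustrative only.
Public Class PrimeSearch
    ' 0 = keep going, 1 = abort requested.
    Private m_AbortRequested As Integer

    ' Safe to call from any thread while the search is running.
    Public Sub Abort()
        Interlocked.Exchange(m_AbortRequested, 1)
    End Sub

    ' Returns the first prime at or above start, or -1 if aborted first.
    Public Function FindPrimeAtOrAbove(ByVal start As Long) As Long
        Dim candidate As Long = Math.Max(start, 2L)
        Do
            ' One short step: test a single candidate.
            If IsPrime(candidate) Then Return candidate
            ' Between steps, check whether another thread asked us to stop.
            If Interlocked.CompareExchange(m_AbortRequested, 0, 0) <> 0 Then
                Return -1
            End If
            candidate += 1
        Loop
    End Function

    ' Simple trial division; each call is one bounded unit of work.
    Private Shared Function IsPrime(ByVal n As Long) As Boolean
        If n < 2 Then Return False
        Dim d As Long = 2
        ' d <= n \ d is the overflow-safe form of d * d <= n.
        Do While d <= n \ d
            If n Mod d = 0 Then Return False
            d += 1
        Loop
        Return True
    End Function
End Class
```

Because the search stops only between steps, it always exits in a known state. The cost is latency: an abort request is honored only after the current step finishes.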

Interrupting a long operation called from a StateCoder state.

Consider what happens when you call a function that performs a long operation from within a state (either the EnterState or MessageReceived method of a State class). How would you interrupt such a long operation? At first you might think that all you need to do is add a message source that is signaled when the interrupt occurs. However, upon further thought you will realize that this cannot work. Once you call the long operation function, you have made a synchronous call to a "black box", and that thread will not continue to run until the long operation concludes. All message sources for a state machine are synchronized to the state machine's thread, so none of them will be processed until that operation concludes.

You might think this is a bad thing - but it really isn't. If a message source were allowed to come in on any thread, you would lose all of the threading protection that StateCoder provides your state machine. You would be back with the default .NET approach, where everything is free threaded and you need to carefully synchronize your objects.

However, you certainly can design an interrupt capability into your long operation as shown in figure 2.

Figure 2 - Interrupting a long operation 

Remember, the main thread is busy performing the long operation, so the interrupt must come in on a different thread. 

Let us consider how you might interrupt the generic "Sleep" operation that is used in the StateCoder examples. The Sleep method can be interrupted using the Interrupt method of the Thread class.

The long operation function should be placed in your state machine class. Why? Because that is the only class that is directly accessible by the client. The state machine class might have a member, m_SleepingThread, that holds a reference to the thread currently performing the operation. When a state needs to perform a long operation, it calls the LongOp1 function on the state machine (accessed using the state's Machine property).

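' Member of the state machine class: the thread currently sleeping, if any.
Private m_SleepingThread As Threading.Thread

Public Sub LongOp1()
   Try
      m_SleepingThread = Threading.Thread.CurrentThread
      Threading.Thread.Sleep(1000)
   Catch ex As Threading.ThreadInterruptedException
      ' Interrupted by InterruptLongOp: nothing to do, just exit early.
   Finally
      m_SleepingThread = Nothing
   End Try
End Sub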

When a client wants to interrupt the long operation, it calls the InterruptLongOp function on the state machine object which might look something like this:

Public Sub InterruptLongOp()
   If Not m_SleepingThread Is Nothing Then
      If (m_SleepingThread.ThreadState And _
         Threading.ThreadState.WaitSleepJoin) <> 0 Then
         m_SleepingThread.Interrupt()
      End If
   End If
End Sub

You might think this solves the problem, but in fact this code, simple as it is, won't work. Well, actually it will work - most of the time. But every now and then it will cause problems. Consider the following scenarios:

  • What if m_SleepingThread is set in the LongOp1 function, and InterruptLongOp is called before the Sleep call begins? InterruptLongOp will see that m_SleepingThread is valid, but the thread is still running and is not in the WaitSleepJoin state, so it will not be interrupted. The thread then goes to sleep, and the interrupt is lost.
  • What if InterruptLongOp has checked the thread state and is about to call Interrupt, and the Sleep call returns before m_SleepingThread is set to Nothing? The Interrupt call then reaches a thread that is already running. That thread will be interrupted the next time it tries to block (whether for a sleep or wait operation) - but that might be somewhere else in the code where an interrupt will cause the code to fail!
  • What if InterruptLongOp is about to call Interrupt, and the Sleep call returns and the entire LongOp1 function exits? m_SleepingThread is now Nothing, so the m_SleepingThread.Interrupt call will result in a null object reference error.

These are the kinds of problems that often occur in multithreaded applications (and exactly the kinds of problems StateCoder is designed to avoid).

One solution to interrupting a sleep operation is shown here:

Public Sub LongOp1()
   Try
      SyncLock Me
         m_SleepComplete = False
         m_SleepingThread = Threading.Thread.CurrentThread
      End SyncLock
      Threading.Thread.Sleep(1000)
      m_SleepComplete = True
   Catch ex As Threading.ThreadInterruptedException
   Finally
      SyncLock Me
         m_SleepingThread = Nothing
      End SyncLock
   End Try
End Sub

Public Sub InterruptLongOp()
   Dim OkToExit As Boolean
   SyncLock Me
      If Not m_SleepingThread Is Nothing Then
         Do Until OkToExit
            If (m_SleepingThread.ThreadState And _
               Threading.ThreadState.WaitSleepJoin) <> 0 Then
               Try
                  m_SleepingThread.Interrupt()
               Catch
               End Try
               OkToExit = True
            Else
               If m_SleepComplete Then OkToExit = True
            End If
         Loop
      End If
   End SyncLock
End Sub

Looks nasty, doesn't it? We'll leave a complete analysis to the reader (and you are welcome to contact us if you have a better solution, or see a problem with this code as well). Two hints: 

  • The reason for the m_SleepComplete flag is to determine if the failure to be in the WaitSleepJoin state is because the Sleep operation has not started, or because it has completed.
  • The SyncLock calls protect access to the m_SleepingThread member, making sure that it cannot be accessed simultaneously by both threads.
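One point to watch when analyzing the code above: a thread that is blocked waiting to enter a SyncLock is also reported as WaitSleepJoin, and Thread.Interrupt can interrupt it there too. If you control the long operation, an alternative is to avoid Thread.Interrupt altogether and wait on an event that the client can signal. The following is a minimal sketch of that idea, not the StateCoder approach; SleepingMachine and m_Interrupt are hypothetical names.

```vbnet
Imports System
Imports System.Threading

' Hypothetical state machine stand-in; names are illustrative only.
Public Class SleepingMachine
    ' Signaled by the client to cut the wait short.
    Private ReadOnly m_Interrupt As New AutoResetEvent(False)

    ' Waits up to one second.
    ' Returns True if the full interval elapsed, False if interrupted.
    Public Function LongOp1() As Boolean
        ' WaitOne returns True when the event was signaled before the timeout.
        Dim signaled As Boolean = m_Interrupt.WaitOne(1000)
        Return Not signaled
    End Function

    ' Safe to call from any thread, whether or not LongOp1 is waiting.
    Public Sub InterruptLongOp()
        m_Interrupt.Set()
    End Sub
End Class
```

No thread reference or thread state test is needed, so the three race scenarios listed earlier do not arise. One behavior to be aware of: if InterruptLongOp is called while nothing is waiting, the event stays signaled and the next LongOp1 call returns immediately. Whether that is acceptable depends on your design.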

Moving Long Operations into Message Sources.

As you can see, calling a long operation from within a state object poses a number of problems. For one thing, it often requires that each state machine be in its own thread. For another, interrupting these long operations brings back synchronization problems.

However, by simply rethinking the nature of a long operation, you can avoid the synchronization problems and easily add the ability to interrupt the operation. How? Simply turn the long operation into a message source. This is illustrated in figure 3.

Figure 3 - Placing a long operation in a message source

Here you see a traditional state machine diagram. The left circle represents the entry state. In this state you have two message sources defined. One of them begins a long operation (internally it will either use an asynchronous call, or execute the operation on a thread that is created and managed by the message source itself). The other represents a timeout or external interrupt.

Either of these message sources will cause the state's MessageReceived method to be called, and you can handle the message in any way you choose. In this figure, the two messages cause transitions to different states, but that is just one of many options.

What happens to the long operation message source if the interrupt occurs? That's up to you. If you do nothing, that message source will typically remain active for later states. However, if you change the message sources (removing the long operation from the ActiveMessageSources array), the long operation will be "abandoned". It will continue to run until it completes, but the message will be ignored since no state machine is waiting for it. You can, if you wish, incorporate your own code to abort that message source as well (typically through a Dispose method call).
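The idea of waiting on two sources and abandoning the slower one can be sketched without StateCoder, using plain .NET wait handles. This is only an analogy for the diagram: the two events stand in for the two message sources, and all names are hypothetical.

```vbnet
Imports System
Imports System.Threading

Module TwoSourcesSketch
    ' Stand-ins for the two message sources in the entry state.
    Private ReadOnly m_OperationDone As New ManualResetEvent(False)
    Private ReadOnly m_Interrupted As New ManualResetEvent(False)

    ' Hypothetical long operation running on a thread-pool thread.
    Private Sub LongOperation(ByVal state As Object)
        Thread.Sleep(3000)
        m_OperationDone.Set()
    End Sub

    ' Hypothetical timeout or external interrupt.
    Private Sub RaiseInterrupt(ByVal state As Object)
        Thread.Sleep(1000)
        m_Interrupted.Set()
    End Sub

    Sub Main()
        ThreadPool.QueueUserWorkItem(AddressOf LongOperation)
        ThreadPool.QueueUserWorkItem(AddressOf RaiseInterrupt)

        ' Block until either source signals; WaitAny returns its array index.
        Dim sources As WaitHandle() = New WaitHandle() {m_OperationDone, m_Interrupted}
        Dim index As Integer = WaitHandle.WaitAny(sources)

        If index = 0 Then
            Console.WriteLine("Long operation completed")
        Else
            ' The operation keeps running, but nobody waits for its result.
            Console.WriteLine("Interrupted: abandoning the long operation")
        End If
    End Sub
End Module
```

With the delays chosen here the interrupt would normally arrive first. The long operation is not stopped; it simply finishes (or is discarded when the process exits) with nobody listening, which is the "abandoned" case described above.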

This is very close to the design pattern described earlier; that of splitting the long operation into a state machine that implements a sequence of shorter operations, and using that state machine as a message source.

Conclusion

We hope you will find this article helpful in architecting your state machines. Long operations pose challenges to any application and component, and no framework can provide the magical trick of safely interrupting an arbitrary long operation. However, with some thought and design, StateCoder does provide you with several approaches to choose from in addressing this particular problem. 

 
