Worker Threads

Pausing & shutting down threads

A thread may have to stop and wait for some reason. Perhaps the user has clicked a "pause" check box or pushed a "stop" button. Perhaps the thread has nothing to do, and is waiting for some information, such as a request packet, to process. The problem is that you need to shut down all the threads before a process exits. (Note: in Windows CE, shutting down the main thread shuts down the process and all threads owned by the process. This is not true in Win9x, Windows NT, or Windows 2000.)

A typical bug is that you shut down your program, recompile it, and get an error that it is unable to write the executable file. Why? Because the program is still running. But you don't see it on the taskbar. So you bring up the Task Manager (via Ctrl-Alt-Del) and it isn't there, either. But something has got your executable tied up! If you look in the NT Task Manager under the Processes tab, or use pview to look at processes, you will indeed find that your program is still running. This is usually because you failed to shut down one or more worker threads. As long as any one thread exists, even if it is blocked, your process is still alive. Of course, you've killed the GUI thread, so the main window is gone and nothing is visible. But the process is still lurking.

You can use the NT Task Manager, or pview, to kill the process by terminating all of its threads, but this is a little bit like using dynamite to lift your car to change the tire. Sure, it lifts the car, but there are a few other things that go wrong which are considered undesirable side effects.

Three functions immediately present themselves for purposes of pausing or shutting down a thread: the SuspendThread and ResumeThread methods (and their underlying API calls, ::SuspendThread and ::ResumeThread) and ::TerminateThread. Assume, for all practical purposes, except in some very limited contexts, these functions do not exist. Using them will almost always get you in trouble.

The limited contexts in which these can be used are:

  • A thread is generally free to call SuspendThread (or ::SuspendThread) on itself, providing it knows itself to be in a "safe" state (for example, it is not holding any synchronization primitive it is supposed to release). The risk here is that you don't know what your caller has locked, and you could be suspending a thread that the caller of your function was not expecting to be suspended. So be extremely careful even in this limited case!
  • A thread is always free to call ResumeThread (or ::ResumeThread) on another thread it knows is suspended. For example, in the case of creating/starting a thread in suspended mode, you have to call ResumeThread/::ResumeThread to get it to run at all!

Note that it is not a good idea to have a thread call TerminateThread on itself, because this will mean that it terminates instantly. The implication of this is that the DLL_THREAD_DETACH events for various DLLs will not be executed, which can lead to the misbehavior of a DLL you didn't even know you were using! (If you don't understand what this means, take it as meaning: bypassing DLL_THREAD_DETACH is a Very Bad Thing). Instead, if a thread wants to kill itself, it should call ExitThread, which guarantees the correct notification of all the DLLs.

Note that you should not substitute SuspendThread/ResumeThread for the proper use of synchronization objects such as Events and Mutexes.

To illustrate why it is a Bad Idea to let one thread suspend another, let's take a simple case: you have a worker thread off doing something, and the something involves memory allocation. You click the "Pause" button on your GUI, which immediately calls SuspendThread. What happens? The worker thread stops. Right now. Immediately. No matter what it is doing. If your worker thread happens to be in the storage allocator, you have just shut down your entire application. Well, not quite--it won't shut down until the next time you try to allocate memory from the GUI thread. But the MFC library is fairly liberal in using memory allocation, so there is an excellent chance that the next call to the MFC library from any thread will simply stop that thread dead in its tracks. If it is your GUI thread (the most likely one) your app appears to hang.

Why is this?

The storage allocator is designed to be thread-safe. This means that at most one thread at a time is permitted to be executing it. It is protected by a CRITICAL_SECTION, and each thread which attempts to enter the allocator blocks if there is already a thread in the allocator. So what happens if you do SuspendThread? The thread stops dead, in the middle of the allocator, with the critical section lock held. This lock will not be released until the thread resumes. Now, if it happens that your GUI requires an allocation as part of resuming the thread, an attempt to resume the thread will block, producing classic deadlock. And if you did a ::TerminateThread, then there is no way the lock will ever be released. And without SuspendThread, there is no need for ResumeThread.
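The effect of freezing a thread while it holds the allocator lock can be imitated in portable C++. The following is only a simulation under stated assumptions: allocator_lock is a hypothetical std::mutex standing in for the allocator's CRITICAL_SECTION, the "suspended" worker is modeled as a thread that makes no progress while holding the lock, and try_lock is used instead of a blocking lock so the demonstration can report the result rather than hang.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the lock that protects the storage allocator.
std::mutex allocator_lock;

// Returns true if the "GUI" side could take the allocator lock while the
// worker is frozen inside it. A real GUI thread would call a blocking lock
// here and simply hang; try_lock lets the demo return an answer instead.
bool gui_can_allocate_while_worker_is_frozen()
{
    std::atomic<bool> inside(false);
    std::atomic<bool> release(false);

    std::thread worker([&] {
        std::lock_guard<std::mutex> hold(allocator_lock); // "in the allocator"
        inside = true;
        // Models being suspended at this point: no progress, lock still held.
        while (!release)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
    });

    // Wait until the worker is definitely holding the lock.
    while (!inside)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

    bool acquired = allocator_lock.try_lock();
    if (acquired)
        allocator_lock.unlock();

    release = true;   // "resume" the worker so the demo can finish
    worker.join();
    return acquired;
}
```

Calling gui_can_allocate_while_worker_is_frozen() returns false: as long as the frozen worker owns the lock, no other thread can get into the protected code.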

Ah, you say, but I know I don't do any memory allocation either in the worker thread or the GUI. So I don't need to worry about this!

You're wrong.

Remember that the MFC library does allocations you don't see. And allocation is only one of many critical sections that exist inside the MFC library to make it thread-safe. And stopping in any of them will be fatal to your app.

Ignore the existence of these functions.

So how do you suspend or terminate a thread?

The problems of shutting down a thread and pausing a thread are closely related, and my solution is the same in both cases: I use a synchronization primitive to effect the pause by blocking the thread on it, and use timeouts on the primitive to allow me to poll for shutdown.

Sometimes the synchronization primitive is a simple event, such as in the example below where I wish to be able to pause the thread. In other cases, particularly where I'm using the thread to service a queue of events, I will use a synchronization primitive such as a semaphore. You can also read my essay on GUI Thread Techniques.
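For the queue-servicing case, the same idea can be sketched in portable C++. This is not the Win32 semaphore code itself: a std::condition_variable guarding a std::deque plays the role of the semaphore count, and the names RequestQueue, post, take, and service_requests, as well as the 100 ms timeout, are all assumptions made for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical request queue; the number of queued items acts like the
// semaphore count that the worker waits on.
class RequestQueue {
public:
    void post(int request) {
        {
            std::lock_guard<std::mutex> g(lock_);
            items_.push_back(request);
        }
        ready_.notify_one();   // comparable to releasing the semaphore by one
    }

    // Waits up to `timeout` for a request. Returns false on timeout.
    bool take(int& request, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> g(lock_);
        if (!ready_.wait_for(g, timeout, [this] { return !items_.empty(); }))
            return false;
        request = items_.front();
        items_.pop_front();
        return true;
    }

    bool empty() {
        std::lock_guard<std::mutex> g(lock_);
        return items_.empty();
    }

private:
    std::mutex lock_;
    std::condition_variable ready_;
    std::deque<int> items_;
};

// Worker body: handles requests until `running` is cleared and returns the
// sum of the requests it handled (a placeholder for real processing).
int service_requests(RequestQueue& queue, std::atomic<bool>& running)
{
    int total = 0;
    while (running) {
        int request = 0;
        if (!queue.take(request, std::chrono::milliseconds(100)))
            continue;          // timeout: loop back and re-test running
        total += request;
    }
    return total;
}
```

Because the wait always times out, an idle worker notices a cleared running flag within one timeout period, so shutdown never depends on another request arriving.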

Typically, I use a worker thread in the background, and it simply polls some state (such as a Boolean) to determine what it should be doing. For example, to pause a thread, it looks at the Boolean that says "Paused", which is set when (for example) a checkbox is set:

// thread body:
while(running)
   { /* loop */
    if(paused)
       switch(::WaitForSingleObject(event, time))
          {
           case WAIT_OBJECT_0:  // event signaled: go on with the work
              break;
           case WAIT_TIMEOUT:   // still paused: go back and re-test running
              continue;
          }
     // rest of thread
    } /* loop */

The trick of doing the continue for the timeout means that the thread will regularly poll for the running flag being clear, simplifying your shutdown of the application. I typically use 1000 (milliseconds) for the time, polling once a second for shutdown.

Why did I do the apparently redundant test of the paused Boolean variable before doing the ::WaitForSingleObject? Wouldn't waiting on the object be sufficient?

Yes, the paused Boolean is an optimization hack. Because I use ::WaitForSingleObject, each pass through this loop involves a moderately heavy-duty operation to test for continuance. In a high-performance loop this would introduce a completely unacceptable performance bottleneck. By using a simple Boolean I can avoid the heavy-duty kernel call when I don't need it. If performance is not an issue, you can eliminate this extra test.
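The pause-and-poll loop can also be tried outside Windows with a portable analogue. The sketch below is an assumption-laden stand-in, not the Win32 code: ManualResetEvent is a hypothetical class built on std::condition_variable to mimic an event handle, PausableWorker and its members are invented names, and the 100 ms poll interval and 1 ms of simulated work are arbitrary.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical portable stand-in for a Win32 event handle.
class ManualResetEvent {
public:
    void set() {
        {
            std::lock_guard<std::mutex> g(lock_);
            signaled_ = true;
        }
        changed_.notify_all();
    }
    void reset() {
        std::lock_guard<std::mutex> g(lock_);
        signaled_ = false;
    }
    // Returns true if signaled, false if the timeout expired first.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> g(lock_);
        return changed_.wait_for(g, timeout, [this] { return signaled_; });
    }
private:
    std::mutex lock_;
    std::condition_variable changed_;
    bool signaled_ = false;
};

struct PausableWorker {
    std::atomic<bool> running{true};
    std::atomic<bool> paused{false};
    std::atomic<long> iterations{0};
    ManualResetEvent resume_event;
    std::thread thread;

    void start()    { thread = std::thread([this] { body(); }); }
    void pause()    { resume_event.reset(); paused = true; }
    void resume()   { paused = false; resume_event.set(); }
    void shutdown() { running = false; thread.join(); }

    void body() {
        while (running) {
            // Cheap flag test first; block on the event only when paused.
            if (paused && !resume_event.wait_for(std::chrono::milliseconds(100)))
                continue;      // timeout: go back and re-test running
            ++iterations;      // stands in for "rest of thread"
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
};
```

A caller would invoke start(), then pause() and resume() from the GUI side, and finally shutdown(). Because the wait times out, shutdown() returns within roughly one poll interval even if the worker is paused at the time.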

The code in the main GUI thread that sets these variables is as shown below:

void CMyDialog::OnPause()
   {
    if(paused && c_Pause.GetCheck() != BST_CHECKED)
       { /* resume */
        paused = FALSE;
        SetEvent(event);
       } /* resume */
    else
    if(!paused && c_Pause.GetCheck() == BST_CHECKED)
       { /* pause */
        paused = TRUE;
        ResetEvent(event);
       } /* pause */
   }

where event is a handle from ::CreateEvent. Avoid CEvent--at least the last time I tried it, it was so unbelievably brain-dead buggy that it was unusable. I haven't checked the VC++ 6.0 implementation, so it may have been fixed, but the bottom line is that the MFC interface to synchronization primitives has been deeply suspect, and gains nothing over using the actual primitives.

There's another, slightly more complex, mechanism for doing a thread shutdown without polling, which I discuss in a later section.
