Threading

This document outlines the threading design in DamageControl.

h3 Event driven
Most components in DamageControl communicate using events.

h3 Hub
Many events don't have a specific target; instead they are sent to anyone in DamageControl that might be interested in them. You do this by sending the message to the Hub. There is a single hub in DamageControl, and most components are directly connected to it.

When a message is published on the hub using publish_message, the hub passes it on to each subscriber by calling receive_message on that subscriber.
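
The fan-out described above can be sketched in a few lines of Ruby. Only publish_message and receive_message come from this document; the internals of Hub, the add_subscriber method and the RecordingSubscriber class are assumptions made for illustration and may not match the actual DamageControl code.

```ruby
# Hypothetical minimal hub. Only publish_message and receive_message
# are names taken from the text; everything else is assumed.
class Hub
  def initialize
    @subscribers = []
  end

  # Assumed registration method.
  def add_subscriber(subscriber)
    @subscribers << subscriber
  end

  # Passes the message on to every subscriber, on the caller's thread.
  def publish_message(message)
    @subscribers.each { |subscriber| subscriber.receive_message(message) }
  end
end

# Hypothetical subscriber that simply remembers what it was sent.
class RecordingSubscriber
  attr_reader :received

  def initialize
    @received = []
  end

  def receive_message(message)
    @received << message
  end
end

hub = Hub.new
first = RecordingSubscriber.new
second = RecordingSubscriber.new
hub.add_subscriber(first)
hub.add_subscriber(second)
hub.publish_message(:build_requested)
```

Note that in this sketch every receive_message call runs on the thread that called publish_message, which is why components that do real work need the asynchronous hand-over described next.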

h3 AsyncComponent
An AsyncComponent (or asynchronous component) does not process its messages on the same thread as the hub. Instead it enqueues each message on an internal queue and processes the messages one at a time on a separate thread.
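
As a rough sketch of this hand-over, assuming a tick-based worker loop: receive_message only enqueues, and the component's own thread calls force_tick to process one message at a time. The start, stop and process_message methods and the ThreadRecorder class are hypothetical names, not taken from the DamageControl source.

```ruby
# Sketch only: apart from AsyncComponent, receive_message and force_tick,
# every name here is a hypothetical stand-in.
class AsyncComponent
  def initialize
    @inqueue = []
    @lock = Mutex.new
    @running = false
  end

  # Called on the publisher's thread; it only enqueues the message.
  def receive_message(message)
    @lock.synchronize { @inqueue << message }
  end

  # Takes at most one message off the queue and processes it on the
  # calling thread. Returns true if a message was processed.
  def force_tick
    message = @lock.synchronize { @inqueue.shift }
    return false if message.nil?
    process_message(message)
    true
  end

  # Starts the component's own thread, which ticks until stop is called.
  def start(idle_sleep = 0.01)
    @running = true
    @thread = Thread.new do
      while @running
        sleep(idle_sleep) unless force_tick
      end
    end
  end

  def stop
    @running = false
    @thread.join if @thread
  end

  # Subclasses override this with their real behaviour.
  def process_message(message)
    raise NotImplementedError, "subclasses must implement process_message"
  end
end

# Hypothetical component that records which thread processed each message.
class ThreadRecorder < AsyncComponent
  attr_reader :processed

  def initialize
    super
    @processed = Queue.new
  end

  def process_message(message)
    @processed.push([message, Thread.current])
  end
end

recorder = ThreadRecorder.new
recorder.start
recorder.receive_message(:build_complete)
message, worker_thread = recorder.processed.pop # blocks until the worker has handled it
recorder.stop
```

Here the message is enqueued on the main thread but processed on the component's worker thread, so worker_thread differs from Thread.current.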

h3 One thread per component
The AsyncComponent design effectively means that most components (all but the simplest ones) have their own thread, and an external thread never enters the component. External stimuli enter the component in a very well-isolated place. This makes each component easy to write, since each component is essentially single-threaded.

h3 Unit-testing and threading
Running several threads in a unit-test and handling all the race conditions that can occur is really hard. It usually leads to slow, intermittently failing tests that can sometimes even deadlock. Fortunately, in this design few unit-tests actually need to start new threads. Instead, the component can use the testing thread as its main thread: a message is published on the hub or put directly on the AsyncComponent's inqueue, the force_tick method is called to pick the message up from the queue and process it, and the result of the processing is then evaluated.
A similar but slightly different design has to be used for blocking queues (see below).
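
The single-threaded testing approach might look roughly like the following. It is shown without a test framework to stay self-contained; the Hub internals, add_subscriber, process_message and the BuildCounter component are hypothetical, and only publish_message, receive_message and force_tick are names used in this document.

```ruby
# Hypothetical minimal hub (same idea as described in the Hub section).
class Hub
  def initialize
    @subscribers = []
  end

  def add_subscriber(subscriber)
    @subscribers << subscriber
  end

  def publish_message(message)
    @subscribers.each { |subscriber| subscriber.receive_message(message) }
  end
end

# Simplified AsyncComponent: no worker thread is started in this sketch.
class AsyncComponent
  def initialize
    @inqueue = []
  end

  # Only enqueues; nothing is processed yet.
  def receive_message(message)
    @inqueue << message
  end

  # Processes one queued message on the calling (testing) thread.
  def force_tick
    message = @inqueue.shift
    process_message(message) unless message.nil?
  end
end

# Hypothetical component under test: counts the messages it processes.
class BuildCounter < AsyncComponent
  attr_reader :count

  def initialize
    super
    @count = 0
  end

  def process_message(_message)
    @count += 1
  end
end

hub = Hub.new
counter = BuildCounter.new
hub.add_subscriber(counter)

hub.publish_message(:build_complete)
count_before_tick = counter.count # message is queued but not yet processed
counter.force_tick
count_after_tick = counter.count  # message has now been processed
```

Because no second thread exists, the test is deterministic: the count can only change when the test itself calls force_tick.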

Unit-tests for the framework classes (like AsyncComponent, BlockingQueue and TimerMixin) usually have to start a separate thread for at least some of their test cases, but generally speaking, multiple threads in a unit-test should be avoided.

h3 Blocking queues
The blocking queue design is so far only a proposal and has not yet been implemented.
A blocking queue behaves exactly like an ordinary thread-safe queue (that is, you enqueue items at one end and dequeue them from the other, first-in-first-out) with one important difference: it blocks the currently running thread when the queue is empty and you try to dequeue. Additional operations can include dequeuing with a timeout and polling (a timeout of zero).
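
Since the design is only a proposal, the following is just one possible sketch of such a queue, built on Ruby's Mutex and ConditionVariable. The method names enqueue, dequeue and poll are assumptions. One limitation of this sketch is that a nil return from a timed dequeue cannot be told apart from a queued nil item.

```ruby
# Hypothetical blocking queue; not the proposed DamageControl class itself.
class BlockingQueue
  def initialize
    @items = []
    @lock = Mutex.new
    @not_empty = ConditionVariable.new
  end

  def enqueue(item)
    @lock.synchronize do
      @items << item
      @not_empty.signal # wake one thread blocked in dequeue
    end
  end

  # Blocks while the queue is empty. With a timeout (in seconds) it
  # returns nil if nothing arrives in time.
  def dequeue(timeout = nil)
    deadline = timeout && now + timeout
    @lock.synchronize do
      while @items.empty?
        remaining = deadline && deadline - now
        return nil if remaining && remaining <= 0
        @not_empty.wait(@lock, remaining) # a nil timeout waits indefinitely
      end
      @items.shift
    end
  end

  # Polling is a dequeue with a zero timeout: it never blocks.
  def poll
    dequeue(0)
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

queue = BlockingQueue.new
empty_poll = queue.poll                  # nothing queued, returns at once

consumer = Thread.new { queue.dequeue }  # blocks until an item arrives
queue.enqueue(:first)
handed_over = consumer.value             # joins the thread, returns its result

queue.enqueue(:a)
queue.enqueue(:b)
order = [queue.dequeue, queue.dequeue]   # items leave in arrival order
timed_out = queue.dequeue(0.05)          # empty again, gives up after 50 ms
```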

This quality makes it extremely useful for hand-over between different threads. It has been suggested that the current tick-based design for AsyncComponent be replaced by a blocking queue design.

The following strategy can be used to test an AsyncComponent that uses blocking queues: the AsyncComponent has a method that loops until the thread is killed, dequeuing a message from the queue on each iteration (the dequeue blocks until something is available).
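
The text does not spell out the rest of the strategy, so the sketch below is one guess at how it could work, using Ruby's built-in Queue (whose pop blocks while the queue is empty) in place of the proposed BlockingQueue. BlockingAsyncComponent, EchoComponent, run, process_next and the results queue are all hypothetical names.

```ruby
require "timeout"

# Hypothetical blocking-queue variant of AsyncComponent.
class BlockingAsyncComponent
  def initialize
    @inqueue = Queue.new
  end

  def receive_message(message)
    @inqueue.push(message)
  end

  # Dequeues one message (blocking until one is available) and processes it.
  def process_next
    process_message(@inqueue.pop)
  end

  # Loops until the thread running it is killed.
  def run
    loop { process_next }
  end
end

# Hypothetical component that reports its results on a second queue,
# so the test can wait for them without sleeping or polling.
class EchoComponent < BlockingAsyncComponent
  attr_reader :results

  def initialize
    super
    @results = Queue.new
  end

  def process_message(message)
    @results.push("processed #{message}")
  end
end

component = EchoComponent.new
worker = Thread.new { component.run }

component.receive_message(:build_requested)
# Guard the blocking pop so a bug cannot hang the test forever.
result = Timeout.timeout(2) { component.results.pop }

worker.kill
worker.join
```

Waiting on a results queue rather than sleeping keeps the test fast, and the timeout bounds how long a failure can take to show up.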

Time-sensitive components that react not only to external events but also to timing events will need a slightly different design. A good example is the BuildScheduler, which needs to wake up when the quiet period has elapsed for a requested build. There are several approaches to doing this: requeuing the request for a later time (which would require timed processing of events) or entering the dequeue block with a timeout.
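
The second approach, entering the dequeue block with a timeout, could be sketched as follows. This is not the real BuildScheduler: QuietPeriodScheduler, schedule_next and the BlockingQueue implementation are hypothetical, and the sketch assumes that a newer request arriving during the quiet period replaces the pending one and restarts the wait.

```ruby
# Hypothetical blocking queue with a timed dequeue (timeout in seconds).
class BlockingQueue
  def initialize
    @items = []
    @lock = Mutex.new
    @not_empty = ConditionVariable.new
  end

  def enqueue(item)
    @lock.synchronize do
      @items << item
      @not_empty.signal
    end
  end

  # Returns nil if the timeout elapses before an item arrives.
  def dequeue(timeout = nil)
    deadline = timeout && now + timeout
    @lock.synchronize do
      while @items.empty?
        remaining = deadline && deadline - now
        return nil if remaining && remaining <= 0
        @not_empty.wait(@lock, remaining)
      end
      @items.shift
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Hypothetical scheduler: starts a build once no new request has
# arrived for quiet_period seconds.
class QuietPeriodScheduler
  attr_reader :started_builds

  def initialize(queue, quiet_period)
    @queue = queue
    @quiet_period = quiet_period
    @started_builds = []
  end

  def schedule_next
    request = @queue.dequeue                # block until the first request
    loop do
      newer = @queue.dequeue(@quiet_period) # wake up on a request or a timeout
      break if newer.nil?                   # quiet period elapsed
      request = newer                       # newer request restarts the wait
    end
    @started_builds << request
    request
  end
end

queue = BlockingQueue.new
scheduler = QuietPeriodScheduler.new(queue, 0.05)
queue.enqueue(:revision_1)
queue.enqueue(:revision_2)
built = scheduler.schedule_next
```

The component still has a single blocking point, so it stays single-threaded; the timeout is what turns the passage of time into something the loop can react to.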

h3 Some contentious objects
Some objects mainly contain state that the real active components operate on. These can sometimes be read and modified by several threads at the same time. Whether these objects need to be passed by copy across thread boundaries has not yet been determined.
These are:

  • Build
  • The SCM classes

h3 This is not SEDA
This design is not SEDA (staged event-driven architecture), although the similarities are apparent. SEDA is designed for optimal throughput and scalability and contains several elements that you can't find here. DamageControl is event-driven because it is a simpler programming model, it allows us to easily test each component in isolation, and it makes it easy to add plugins at a later stage.
