diff options
Diffstat (limited to 'vendor/rayon-core/src/sleep/README.md')
-rw-r--r-- | vendor/rayon-core/src/sleep/README.md | 219 |
1 files changed, 219 insertions, 0 deletions
diff --git a/vendor/rayon-core/src/sleep/README.md b/vendor/rayon-core/src/sleep/README.md new file mode 100644 index 0000000..55426c8 --- /dev/null +++ b/vendor/rayon-core/src/sleep/README.md @@ -0,0 +1,219 @@ +# Introduction: the sleep module + +The code in this module governs when worker threads should go to +sleep. The system used in this code was introduced in [Rayon RFC #5]. +There is also a [video walkthrough] available. Both of those may be +valuable resources to understanding the code, though naturally they +will also grow stale over time. The comments in this file are +extracted from the RFC and meant to be kept up to date. + +[Rayon RFC #5]: https://github.com/rayon-rs/rfcs/pull/5 +[video walkthrough]: https://youtu.be/HvmQsE5M4cY + +# The `Sleep` struct + +The `Sleep` struct is embedded into each registry. It performs several functions: + +* It tracks when workers are awake or asleep. +* It decides how long a worker should look for work before it goes to sleep, + via a callback that is invoked periodically from the worker's search loop. +* It is notified when latches are set, jobs are published, or other + events occur, and it will go and wake the appropriate threads if + they are sleeping. + +# Thread states + +There are three main thread states: + +* An **active** thread is one that is actively executing a job. +* An **idle** thread is one that is searching for work to do. It will be + trying to steal work or pop work from the global injector queue. +* A **sleeping** thread is one that is blocked on a condition variable, + waiting to be awoken. + +We sometimes refer to the final two states collectively as **inactive**. +Threads begin as idle but transition to idle and finally sleeping when +they're unable to find work to do. + +## Sleepy threads + +There is one other special state worth mentioning. During the idle state, +threads can get **sleepy**. A sleepy thread is still idle, in that it is still +searching for work, but it is *about* to go to sleep after it does one more +search (or some other number, potentially). When a thread enters the sleepy +state, it signals (via the **jobs event counter**, described below) that it is +about to go to sleep. If new work is published, this will lead to the counter +being adjusted. When the thread actually goes to sleep, it will (hopefully, but +not guaranteed) see that the counter has changed and elect not to sleep, but +instead to search again. See the section on the **jobs event counter** for more +details. + +# The counters + +One of the key structs in the sleep module is `AtomicCounters`, found in +`counters.rs`. It packs three counters into one atomically managed value: + +* Two **thread counters**, which track the number of threads in a particular state. +* The **jobs event counter**, which is used to signal when new work is available. + It (sort of) tracks the number of jobs posted, but not quite, and it can rollover. + +## Thread counters + +There are two thread counters, one that tracks **inactive** threads and one that +tracks **sleeping** threads. From this, one can deduce the number of threads +that are idle by subtracting sleeping threads from inactive threads. We track +the counters in this way because it permits simpler atomic operations. One can +increment the number of sleeping threads (and thus decrease the number of idle +threads) simply by doing one atomic increment, for example. Similarly, one can +decrease the number of sleeping threads (and increase the number of idle +threads) through one atomic decrement. + +These counters are adjusted as follows: + +* When a thread enters the idle state: increment the inactive thread counter. +* When a thread enters the sleeping state: increment the sleeping thread counter. +* When a thread awakens a sleeping thread: decrement the sleeping thread counter. + * Subtle point: the thread that *awakens* the sleeping thread decrements the + counter, not the thread that is *sleeping*. This is because there is a delay + between signaling a thread to wake and the thread actually waking: + decrementing the counter when awakening the thread means that other threads + that may be posting work will see the up-to-date value that much faster. +* When a thread finds work, exiting the idle state: decrement the inactive + thread counter. + +## Jobs event counter + +The final counter is the **jobs event counter**. The role of this counter is to +help sleepy threads detect when new work is posted in a lightweight fashion. In +its simplest form, we would simply have a counter that gets incremented each +time a new job is posted. This way, when a thread gets sleepy, it could read the +counter, and then compare to see if the value has changed before it actually +goes to sleep. But this [turns out to be too expensive] in practice, so we use a +somewhat more complex scheme. + +[turns out to be too expensive]: https://github.com/rayon-rs/rayon/pull/746#issuecomment-624802747 + +The idea is that the counter toggles between two states, depending on whether +its value is even or odd (or, equivalently, on the value of its low bit): + +* Even -- If the low bit is zero, then it means that there has been no new work + since the last thread got sleepy. +* Odd -- If the low bit is one, then it means that new work was posted since + the last thread got sleepy. + +### New work is posted + +When new work is posted, we check the value of the counter: if it is even, +then we increment it by one, so that it becomes odd. + +### Worker thread gets sleepy + +When a worker thread gets sleepy, it will read the value of the counter. If the +counter is odd, it will increment the counter so that it is even. Either way, it +remembers the final value of the counter. The final value will be used later, +when the thread is going to sleep. If at that time the counter has not changed, +then we can assume no new jobs have been posted (though note the remote +possibility of rollover, discussed in detail below). + +# Protocol for a worker thread to post work + +The full protocol for a thread to post work is as follows + +* If the work is posted into the injection queue, then execute a seq-cst fence (see below). +* Load the counters, incrementing the JEC if it is even so that it is odd. +* Check if there are idle threads available to handle this new job. If not, + and there are sleeping threads, then wake one or more threads. + +# Protocol for a worker thread to fall asleep + +The full protocol for a thread to fall asleep is as follows: + +* After completing all its jobs, the worker goes idle and begins to + search for work. As it searches, it counts "rounds". In each round, + it searches all other work threads' queues, plus the 'injector queue' for + work injected from the outside. If work is found in this search, the thread + becomes active again and hence restarts this protocol from the top. +* After a certain number of rounds, the thread "gets sleepy" and executes `get_sleepy` + above, remembering the `final_value` of the JEC. It does one more search for work. +* If no work is found, the thread atomically: + * Checks the JEC to see that it has not changed from `final_value`. + * If it has, then the thread goes back to searching for work. We reset to + just before we got sleepy, so that we will do one more search + before attending to sleep again (rather than searching for many rounds). + * Increments the number of sleeping threads by 1. +* The thread then executes a seq-cst fence operation (see below). +* The thread then does one final check for injected jobs (see below). If any + are available, it returns to the 'pre-sleepy' state as if the JEC had changed. +* The thread waits to be signaled. Once signaled, it returns to the idle state. + +# The jobs event counter and deadlock + +As described in the section on the JEC, the main concern around going to sleep +is avoiding a race condition wherein: + +* Thread A looks for work, finds none. +* Thread B posts work but sees no sleeping threads. +* Thread A goes to sleep. + +The JEC protocol largely prevents this, but due to rollover, this prevention is +not complete. It is possible -- if unlikely -- that enough activity occurs for +Thread A to observe the same JEC value that it saw when getting sleepy. If the +new work being published came from *inside* the thread-pool, then this race +condition isn't too harmful. It means that we have fewer workers processing the +work then we should, but we won't deadlock. This seems like an acceptable risk +given that this is unlikely in practice. + +However, if the work was posted as an *external* job, that is a problem. In that +case, it's possible that all of our workers could go to sleep, and the external +job would never get processed. To prevent that, the sleeping protocol includes +one final check to see if the injector queue is empty before fully falling +asleep. Note that this final check occurs **after** the number of sleeping +threads has been incremented. We are not concerned therefore with races against +injections that occur after that increment, only before. + +Unfortunately, there is one rather subtle point concerning this final check: +we wish to avoid the possibility that: + +* work is pushed into the injection queue by an outside thread X, +* the sleepy thread S sees the JEC but it has rolled over and is equal +* the sleepy thread S reads the injection queue but does not see the work posted by X. + +This is possible because the C++ memory model typically offers guarantees of the +form "if you see the access A, then you must see those other accesses" -- but it +doesn't guarantee that you will see the access A (i.e., if you think of +processors with independent caches, you may be operating on very out of date +cache state). + +## Using seq-cst fences to prevent deadlock + +To overcome this problem, we have inserted two sequentially consistent fence +operations into the protocols above: + +* One fence occurs after work is posted into the injection queue, but before the + counters are read (including the number of sleeping threads). + * Note that no fence is needed for work posted to internal queues, since it is ok + to overlook work in that case. +* One fence occurs after the number of sleeping threads is incremented, but + before the injection queue is read. + +### Proof sketch + +What follows is a "proof sketch" that the protocol is deadlock free. We model +two relevant bits of memory, the job injector queue J and the atomic counters C. + +Consider the actions of the injecting thread: + +* PushJob: Job is injected, which can be modeled as an atomic write to J with release semantics. +* PushFence: A sequentially consistent fence is executed. +* ReadSleepers: The counters C are read (they may also be incremented, but we just consider the read that comes first). + +Meanwhile, the sleepy thread does the following: + +* IncSleepers: The number of sleeping threads is incremented, which is atomic exchange to C. +* SleepFence: A sequentially consistent fence is executed. +* ReadJob: We look to see if the queue is empty, which is a read of J with acquire semantics. + +Either PushFence or SleepFence must come first: + +* If PushFence comes first, then PushJob must be visible to ReadJob. +* If SleepFence comes first, then IncSleepers is visible to ReadSleepers.
\ No newline at end of file |