Blender V4.3
BLI_lazy_threading.hh File Reference

Go to the source code of this file.

Classes

class  blender::lazy_threading::HintReceiver
 
class  blender::lazy_threading::ReceiverIsolation
 

Namespaces

namespace  blender
 
namespace  blender::lazy_threading
 

Functions

void blender::lazy_threading::send_hint ()
 

Detailed Description

The goal of "lazy threading" is to avoid using threads unless one can reasonably assume that it is worth distributing work over multiple threads. Using threads can lead to worse overall performance by introducing inter-thread communication overhead. Keeping all work on a single thread reduces this overhead to zero and also makes better use of the CPU cache.

Functions like #parallel_for also solve this to some degree by using a "grain size". When the number of individual tasks is too small, no multi-threading is used. This works very well when there are many homogeneous tasks that can be expected to take approximately the same time.

The situation becomes more difficult when:

  • The individual tasks are not homogeneous, i.e. they take different amounts of time to compute.
  • It is practically impossible to guess how long each task will take in advance.

Given those constraints, a single grain size cannot be determined. One could just schedule all tasks individually but that would create a lot of overhead when the tasks happen to be very small. While TBB will keep all tasks on a single thread if the other threads are busy, if they are idle they will start stealing the work even if that's not beneficial for overall performance.

This file provides a simple API that allows a task scheduler to properly handle tasks whose size is not known in advance. The key idea is this:

‍By default, all work stays on a single thread. If an individual task notices that it is about start a computation that will take a while, it notifies the task scheduler further up on the stack. The scheduler then allows other threads to take over other tasks that were originally meant for the current thread.

This way, when all tasks are small, no threading overhead has to be paid for. Whenever there is a task that keeps the current thread busy for a while, the other tasks are moved to a separate thread so that they can be executed without waiting for the long computation to finish.

Consequently, the earlier a task knows during it execution that it will take a while, the better. That's because if it is blocking anyway, it's more efficient to move the other tasks to another thread earlier.

To make this work, three things have to be solved:

  1. The task scheduler has to be able to start single-threaded and become multi-threaded after tasks have started executing. This has to be solved in the specific task scheduler.
  2. There has to be a way for the currently running task to tell the task scheduler that it is about to perform a computation that will take a while and that it would be reasonable to move other tasks to other threads. This part is implemented in the API provided by this file.
  3. Individual tasks have to decide when a computation is long enough to justify talking to the scheduler. This is always based on heuristics that have to be fine tuned over time. One could assume that this means adding new work-size checks to many parts in Blender, but that's actually not necessary, because these checks exist already in the form of grain sizes passed to e.g. #parallel_for. The assumption here is that when the task thinks the current work load is big enough to justify using threads, it's also big enough to justify using another thread for waiting tasks on the current thread.

Definition in file BLI_lazy_threading.hh.