Scheduling is one of the factors that most directly affect performance in Thread-Level Speculation (TLS). Since loops may present dependences that cannot be predicted before runtime, finding a good chunk size is not a simple task. The most used mechanism, Fixed-Size Chunking (FSC), requires many “dry-runs” to set the optimal chunk size.
Our research in this field has led to several solutions. The most recent one is a scheduling mechanism called “Moody scheduling”, that estimates at runtime the best size of the next chunk to be scheduled, using information of the execution of the previous chunks. More details can be found at:
Moody Scheduling for Speculative Parallelization. Alvaro Estebanez, Diego R. Llanos, David Orden, Belen Palop. Euro-Par 2015, 24-28 August, Vienna, Austria. LNCS 9233, Springer, Berlin.
Our previous solutions rely on some knowledge about the expected pattern of dependence violations that may appear. Some algorithms, such as the Randomized Incremental Constructions, present many dependence violations at the beginning of the loop. For these algorithms, it is better to issue smaller chunks at the beginning, and slowly increase the chunk size as execution progresses. Our work with these algorithms is summarized in the following papers (in reversed chronological order):
Just-In-Time Scheduling for Loop-based Speculative Parallelization. Diego R. Llanos, David Orden, Belén Palop. PDP 2008 Conference, IEEE Press, Tolouse, February 13-15 2008.
New Scheduling Strategies for Randomized Incremental Algorithms in the Context of Speculative Parallelization. Diego R. Llanos, David Orden, Belén Palop. IEEE Transactions on Computers, 56(6), pp. 839-852, June 2007, ISSN 0018-9340, IEEE Press.