Difference Between std::execution::par and std::execution::par_unseq
How does std::execution::par_unseq differ from std::execution::par?
The <execution> header, introduced in C++17, provides execution policies in the std::execution namespace that specify how standard library algorithms may be executed.
Among these are std::execution::par and std::execution::par_unseq, which differ in their parallelization and vectorization capabilities.
std::execution::par:
- Enables parallel execution of tasks.
- May use multiple threads to perform operations concurrently.
- Calls made on the same thread are never interleaved with one another, so the policy grants no permission to vectorize across elements.
- Suitable for algorithms that benefit from multiple threads, including cases where the callable uses locks or other synchronization to avoid data races.
std::execution::par_unseq:
- Combines parallel and vectorized execution.
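- Allows tasks to run concurrently across multiple threads, and also allows calls within a single thread to be interleaved, for example through SIMD instructions.
- Requires that the callable avoid vectorization-unsafe operations such as acquiring a mutex, because interleaved calls on one thread could then deadlock.
- Gives the implementation more freedom to optimize, which can lead to higher performance on suitable hardware but is not guaranteed to.
- Suitable for algorithms that can benefit from both multi-threading and vectorization.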
Here's a simple example to illustrate the difference:
#include <algorithm>
#include <execution>
#include <format>
#include <iostream>
#include <vector>

// Printing is used here only to make the calls visible. Stream output
// may synchronize internally, so real par_unseq callables should avoid it.
void Log(int number) {
  std::cout << std::format("Number: {}\n", number);
}

int main() {
  std::vector<int> numbers{1, 2, 3, 4, 5};

  std::for_each(
    std::execution::par,
    numbers.begin(), numbers.end(), Log
  );

  std::cout << "\nUsing par_unseq:\n";
  std::for_each(
    std::execution::par_unseq,
    numbers.begin(), numbers.end(), Log
  );
}

Number: 1
Number: 2
Number: 3
Number: 5
Number: 4
Using par_unseq:
Number: 1
Number: 2
Number: 5
Number: 4
Number: 3

The order of the output lines is not guaranteed under either policy and may change from run to run. What differs is the permission each policy grants: std::execution::par allows the calls to run on multiple threads, while std::execution::par_unseq additionally allows the implementation to vectorize them.
Key points:
- std::execution::par is for parallel execution using multiple threads.
- std::execution::par_unseq combines parallel execution with SIMD vectorization.
- Use std::execution::par for tasks that benefit from parallelism without vectorization.
- Use std::execution::par_unseq for tasks that can leverage both parallelism and vectorization for maximum performance.
Choosing the right execution policy depends on the nature of the task and the hardware capabilities. Understanding these differences helps in writing efficient and optimized C++ programs.