...

Approach 2: Split the Algorithm Into Two Parts

Learn how splitting the algorithm into two parts can improve performance by copying elements conditionally in parallel and then merging them sequentially into a continuous range.

We'll cover the following...

Part one: Copy elements in parallel into the destination range
Part two: Move the sparse range sequentially into a continuous range
Performance evaluation

The second approach is to split the algorithm into two parts. First, the conditional copying is performed in parallel chunks, and then the resulting sparse range is squeezed to a continuous range.

Part one: Copy elements in parallel into the destination range

The first part copies the elements in chunks, resulting in the sparse destination array illustrated below. Each chunk is conditionally copied in parallel, and the resulting range iterators are stored in std::future objects for later retrieval:

Press + to interact

template <typename SrcIt, typename DstIt, typename Pred>
auto par_copy_if_split(SrcIt first, SrcIt last, DstIt dst, Pred pred,
                       size_t chunk_sz) -> DstIt {
  auto n = static_cast<size_t>(std::distance(first, last));
  auto futures = std::vector<std::future<std::pair<DstIt, DstIt>>>{};
  futures.reserve(n / chunk_sz);
  for (size_t i = 0; i < n; i += chunk_sz) {
    const auto stop_idx = std::min(i + chunk_sz, n);
    auto future = std::async([=, &pred] {
      auto dst_first = dst + i;
      auto dst_last =
          std::copy_if(first + i, first + stop_idx, dst_first, pred);
      return std::make_pair(dst_first, dst_last);
    });
    futures.emplace_back(std::move(future));
  }
 // To be continued ...

Getting Started

A Brief Introduction to C++

Essential C++ Techniques

Analyzing and Measuring Performance

Data Structures

Algorithms

Ranges and Views

Memory Management

C++ Essentials Exam

Compile-Time Programming

Essential Utilities

Proxy Objects and Lazy Evaluation

Concurrency

Coroutines and Lazy Generators

Asynchronous Programming with Coroutines

Parallel Algorithms

C++ Advanced Concepts Exam

Approach 2: Split the Algorithm Into Two Parts

Part one: Copy elements in parallel into the destination range