1.6 - Comparison with sequential execution

September 2025

Key Concept

Sequential execution is the most straightforward way to solve problems. It involves performing operations one after another, in the order they are written in the code. While simple to understand and implement, sequential execution often falls short when dealing with large datasets or computationally intensive tasks. This section compares sequential execution with parallel execution, highlighting the advantages and disadvantages of each approach.

Topics

Execution Order: Instructions are processed in a linear sequence.

Sequential execution follows a strict linear order. Each instruction must complete before the next one begins. This makes it easy to reason about the program's behavior, as the execution path is predictable. In contrast, parallel execution allows multiple instructions or tasks to run concurrently. The order in which these concurrent tasks complete can be unpredictable, but this unpredictability is often managed through synchronization mechanisms like locks and semaphores.
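To make the contrast concrete, the short C sketch below (POSIX threads are an assumed choice here; the section itself does not prescribe a library) starts four threads whose completion messages may appear in a different order on every run, whereas a plain loop would always print them in the order 0, 1, 2, 3.

    #include <pthread.h>
    #include <stdio.h>

    /* Each thread prints its ID; with no synchronization, the output order
       depends on how the scheduler interleaves the threads. */
    static void *worker(void *arg) {
        int id = *(int *)arg;
        printf("worker %d finished\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t threads[4];
        int ids[4];
        for (int i = 0; i < 4; i++) {
            ids[i] = i;
            pthread_create(&threads[i], NULL, worker, &ids[i]);
        }
        for (int i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);   /* wait for every thread to complete */
        return 0;
    }

Compiling with cc -pthread and running it a few times usually shows the reordering directly.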

Resource Utilization: A single processor or core is typically utilized.

Sequential execution typically utilizes only one processor core at a time. This can lead to significant underutilization of hardware resources, especially when dealing with tasks that can be broken down into independent subtasks. Parallel execution, on the other hand, can leverage multiple processor cores, leading to a more efficient use of available hardware. This can significantly reduce the overall execution time for computationally intensive tasks.
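As a quick way to see how much hardware a purely sequential program leaves idle, the following sketch (POSIX-specific; sysconf and _SC_NPROCESSORS_ONLN are assumed to be available, as they are on most Unix-like systems) reports the number of online cores, only one of which a sequential loop keeps busy at a time.

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Number of cores the operating system currently has online. */
        long cores = sysconf(_SC_NPROCESSORS_ONLN);
        printf("available cores: %ld\n", cores);
        /* A sequential program keeps only 1 of these cores busy at a time. */
        return 0;
    }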

Scalability: Limited by the speed of the single processor.

Sequential execution has limited scalability. As the size of the dataset or the complexity of the task increases, the execution time grows with the amount of work, and adding more processor cores does not help a program that only ever uses one of them; only a faster single core would. Parallel execution offers better scalability: with more cores, the parallelizable portion of the work can be completed faster, in the best case nearly in proportion to the number of cores, allowing larger datasets to be processed in acceptable time. In practice, the achievable speedup is bounded by whatever part of the program remains sequential.
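One standard way to quantify that bound is Amdahl's law, added here for context: if p is the fraction of the work that can run in parallel and N is the number of cores, the best achievable speedup is

    S(N) = \frac{1}{(1 - p) + p / N}

For example, even if 90% of a program parallelizes perfectly (p = 0.9), the speedup can never exceed 10x, no matter how many cores are added.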

Complexity: Simple to understand and implement for basic tasks.

Sequential execution is generally simpler to implement and debug than parallel execution. The linear execution flow makes it easier to trace the program's behavior and identify errors. Parallel execution, however, introduces additional complexities related to synchronization, data consistency, and error handling. These complexities can make parallel programs more difficult to write, debug, and maintain.

Introduction to Parallel Execution

Parallel execution is a technique that involves dividing a task into smaller subtasks and executing them concurrently on multiple processors or cores. This approach aims to reduce the overall execution time of a program by leveraging the power of parallel processing.
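As a minimal illustration of this idea (a sketch using POSIX threads; the array contents, size, and thread count are arbitrary placeholders), the program below sums a large array by giving each thread one contiguous chunk and combining the partial results at the end.

    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000
    #define NTHREADS 4

    static double data[N];

    typedef struct { int begin, end; double partial; } Chunk;

    /* Each thread sums its own contiguous slice of the shared array. */
    static void *sum_chunk(void *arg) {
        Chunk *c = arg;
        c->partial = 0.0;
        for (int i = c->begin; i < c->end; i++)
            c->partial += data[i];
        return NULL;
    }

    int main(void) {
        for (int i = 0; i < N; i++) data[i] = 1.0;   /* example data */

        pthread_t threads[NTHREADS];
        Chunk chunks[NTHREADS];
        int step = N / NTHREADS;

        /* Divide the index range into one subtask per thread. */
        for (int t = 0; t < NTHREADS; t++) {
            chunks[t].begin = t * step;
            chunks[t].end = (t == NTHREADS - 1) ? N : (t + 1) * step;
            pthread_create(&threads[t], NULL, sum_chunk, &chunks[t]);
        }

        /* Wait for the subtasks and combine their partial results. */
        double total = 0.0;
        for (int t = 0; t < NTHREADS; t++) {
            pthread_join(threads[t], NULL);
            total += chunks[t].partial;
        }
        printf("sum = %f\n", total);
        return 0;
    }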

Types of Parallel Execution

There are several types of parallel execution, each with its own characteristics and suitability for different types of tasks.

Shared Memory Parallelism

In shared memory parallelism, multiple processors or cores share a common memory space. This allows them to access and modify the same data directly. This approach is typically used for tasks that require frequent data sharing between processors. However, it can also introduce challenges related to data consistency and synchronization.

Distributed Memory Parallelism

In distributed memory parallelism, each processor or core has its own private memory space. Communication between processors is achieved through message passing. This approach is typically used for tasks that involve large datasets that cannot fit into the memory of a single processor.

Hybrid Parallelism

Hybrid parallelism combines shared memory and distributed memory parallelism, capturing the benefits of both and providing flexibility in designing parallel systems.
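A common concrete pattern, sketched below using the MPI and OpenMP models described under Parallel Programming Models (both an MPI library and an OpenMP-capable compiler are assumed), is to use message passing between nodes and shared-memory threads within each node: each MPI rank parallelizes its local loop across its own cores, and a single MPI call then combines the per-rank results.

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        /* Request a threading level that lets the main thread make MPI calls
           while OpenMP threads exist inside the process. */
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Shared-memory parallelism inside each process: each rank uses its cores. */
        double local = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < 1000000; i++)
            local += 1.0;

        /* Distributed-memory step: combine the per-rank results via message passing. */
        double global = 0.0;
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("global sum = %f\n", global);

        MPI_Finalize();
        return 0;
    }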

Parallel Programming Models

Parallel programming models provide a framework for writing parallel programs. These models define the structure and behavior of parallel programs, and provide tools for managing concurrency and synchronization.

Message Passing Interface (MPI)

MPI is a widely used standard for message passing, allowing processes to communicate with each other. It is particularly well-suited for distributed memory parallelism.
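A minimal point-to-point sketch is shown below; it assumes an MPI implementation such as MPICH or Open MPI and at least two processes launched with mpirun. Rank 0 sends an integer and rank 1 receives it.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            int value = 42;
            /* Send one int to rank 1 with message tag 0. */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int value;
            /* Receive one int from rank 0 with message tag 0. */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }

Run it with, for example, mpirun -np 2 ./a.out.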

OpenMP

OpenMP is a widely used API for shared memory parallelism. It provides directives that can be inserted into existing code to enable parallel execution.
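The sketch below assumes a compiler with OpenMP support (for example, gcc -fopenmp). A single directive parallelizes an ordinary loop, and the reduction clause gives each thread a private partial sum that is combined at the end.

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        const int n = 1000000;
        double sum = 0.0;

        /* The directive splits the iterations across threads; the reduction
           clause combines the per-thread partial sums safely. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += 1.0;

        printf("sum = %f (threads available: %d)\n", sum, omp_get_max_threads());
        return 0;
    }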

CUDA

CUDA is a parallel computing platform and programming model developed by NVIDIA. It is specifically designed for leveraging the power of GPUs (Graphics Processing Units) for parallel processing.

Challenges in Parallel Execution

While parallel execution offers significant advantages, it also presents several challenges.

Synchronization

Synchronization is essential to ensure that multiple processors or cores access and modify shared data in a consistent manner. Synchronization mechanisms, such as locks and semaphores, are used to prevent race conditions and data corruption.
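The sketch below uses POSIX threads and a mutex as one concrete example of such a mechanism. Without the lock, the two threads' read-modify-write updates of the shared counter could interleave and lose increments; with it, each update completes atomically with respect to the other thread.

    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *increment(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);     /* enter the critical section */
            counter++;                     /* shared data, updated safely */
            pthread_mutex_unlock(&lock);   /* leave the critical section */
        }
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, increment, NULL);
        pthread_create(&b, NULL, increment, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld\n", counter);   /* always 200000 with the lock */
        return 0;
    }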

Data Consistency

Maintaining data consistency across multiple processors or cores can be challenging, especially in distributed memory systems. Techniques such as data replication and distributed transactions are used to address this challenge.

Communication Overhead

Communication between processors or cores can introduce significant overhead, especially in distributed memory systems. Minimizing communication overhead is crucial for achieving good performance.

Load Balancing

Load balancing is the process of distributing work evenly across multiple processors or cores. Uneven load distribution can lead to some processors being idle while others are overloaded, reducing overall performance.
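One concrete mitigation is a dynamic schedule, sketched below with OpenMP and a hypothetical process_item routine whose cost varies from item to item: idle threads grab the next batch of iterations instead of being stuck with a fixed, possibly unlucky share.

    #include <omp.h>

    void process_item(int i);   /* hypothetical; cost varies per item */

    void process_all(int n) {
        /* schedule(dynamic, 8): threads take batches of 8 iterations on demand,
           so fast threads keep working instead of waiting for slow ones. */
        #pragma omp parallel for schedule(dynamic, 8)
        for (int i = 0; i < n; i++)
            process_item(i);
    }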

Exercise

Consider a simple calculation involving many independent operations. How could parallel execution potentially improve the time required? (5 min)

Answer: Think of applying a filter to thousands of small images. Sequentially, one core would process them one after another. With parallel execution, multiple cores could each handle a subset at the same time — so instead of waiting for 1,000 operations to complete one by one, you might finish in roughly 1/number_of_cores of the time (minus overhead). This illustrates how independence between operations makes parallelism worthwhile.
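As a sketch of that idea in code (OpenMP is an assumed choice, and Image and apply_filter are hypothetical placeholders), a single directive spreads the independent per-image work across the available cores.

    #include <omp.h>

    /* Hypothetical types and filter routine, used only for illustration. */
    typedef struct { unsigned char *pixels; int width, height; } Image;
    void apply_filter(Image *img);   /* assumed to exist elsewhere */

    void filter_all(Image *images, int n) {
        /* Each iteration is independent, so the loop can be split across cores. */
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++)
            apply_filter(&images[i]);
    }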

💡 Common Pitfalls

  • Assuming all tasks are equally amenable to parallelization.

💡 Best Practices

  • Identify independent tasks before attempting parallelization.