Parallel Computing

15 August 2023

1. Introduction

Parallel computing is a computing model in which multiple processors or processing units within a computer or computing system operate simultaneously. Traditionally, a computer had a single processor and operations were performed sequentially. With parallel computing, multiple processors work on a problem at the same time, processing large amounts of data faster and solving complex problems more efficiently.

Parallel computing provides the following advantages:

  1. Faster computing: Because the processors operate simultaneously, parallel computing can drastically reduce processing time by dividing a large dataset into chunks and assigning each chunk to a different processor (see the sketch after this list).
  2. High performance: Parallel computing delivers high performance for complex calculations, a significant advantage in processing-intensive applications such as scientific simulations, database queries, image processing, and artificial intelligence.
  3. Scalability: A parallel system can be scaled easily as the need for processing power grows; computing capacity is increased by adding more processors.
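
To make the chunking idea in item 1 concrete, the following minimal C sketch splits an array across several POSIX threads (threads are introduced in Section 2) and sums the chunks in parallel. The array size, thread count, and helper names (sum_chunk, struct chunk) are illustrative choices, not part of any particular library.

    /* Minimal sketch: sum an array by assigning one chunk to each thread. */
    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000
    #define NUM_THREADS 4

    static long data[N];
    static long partial[NUM_THREADS];

    struct chunk { int id; long start; long end; };

    static void *sum_chunk(void *arg) {
        struct chunk *c = (struct chunk *)arg;
        long s = 0;
        for (long i = c->start; i < c->end; i++)
            s += data[i];
        partial[c->id] = s;           /* each thread writes only its own slot */
        return NULL;
    }

    int main(void) {
        for (long i = 0; i < N; i++)
            data[i] = 1;              /* fill with known values */

        pthread_t threads[NUM_THREADS];
        struct chunk chunks[NUM_THREADS];
        long step = N / NUM_THREADS;

        for (int t = 0; t < NUM_THREADS; t++) {
            chunks[t].id = t;
            chunks[t].start = t * step;
            chunks[t].end = (t == NUM_THREADS - 1) ? N : (t + 1) * step;
            pthread_create(&threads[t], NULL, sum_chunk, &chunks[t]);
        }

        long total = 0;
        for (int t = 0; t < NUM_THREADS; t++) {
            pthread_join(threads[t], NULL);
            total += partial[t];      /* combine the per-chunk results */
        }
        printf("total = %ld\n", total);   /* expected: 1000000 */
        return 0;
    }

Compiled with a command such as gcc -pthread, each thread processes its own chunk independently, so the work is spread across the available cores.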

Parallel computing can be implemented at different levels: multiple cores on a single processor (multicore), multiprocessor systems in which several processors are connected to each other, or multiple computers distributed over a network (cluster).

However, parallel computing can sometimes become complex and difficult to program. Difficulties such as data dependency, synchronization, communication, and concurrency issues may be encountered. Therefore, parallel computing requires proper algorithm design and careful processor/core management.

2. Process and Thread

  • Definition: A process is a program running on a computer system, while a thread is an execution unit that runs inside a process.
  • Memory: Each process has its own memory space, while threads share the memory space of the process they belong to.
  • Communication: Because each process's memory space is separate, processes must use inter-process communication mechanisms to exchange data, whereas threads can communicate easily through the memory they share.
  • Creation cost: Creating a process is slower and more costly than creating a thread; creating and managing a thread is faster and cheaper.

Whether to use processes or threads for parallel programming depends on the problem to be solved and the systems to be used.

Advantages of using processes:

  • Isolation: Processes can be completely isolated from each other as they have their own address space. This means that if one process malfunctions or crashes, other processes will not be affected. At the same time, processes can run without interfering with each other using mechanisms provided by the operating system.
  • Security: Because processes have their own private memory space, they can increase information security. A process can protect itself from malicious attacks by limiting access to other processes' data and code.
  • Better Scalability: Processes scale better across different physical processors and cores, because additional processors or cores can be used to increase processing power.

For example, Google Chrome runs each tab in its own process. Thus, tabs are isolated from each other: if one tab becomes unresponsive or crashes, the other tabs are unaffected, and malicious code in one tab cannot access the others.
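
The following minimal C sketch uses the POSIX fork() call to illustrate the isolation described above: the child process gets its own copy of the memory, so its change to a variable is not visible to the parent. The variable name and printed messages are illustrative.

    /* Minimal sketch: process isolation with fork(). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int counter = 0;
        pid_t pid = fork();          /* create a new process (a copy of this one) */

        if (pid == 0) {
            /* child process: modifies its own copy of counter */
            counter = 100;
            printf("child:  counter = %d\n", counter);
            exit(0);
        } else if (pid > 0) {
            /* parent process: unaffected by the child's change */
            wait(NULL);
            printf("parent: counter = %d\n", counter);   /* still 0 */
        } else {
            perror("fork");
            return 1;
        }
        return 0;
    }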

Advantages of using threads:

  • Lighter and Faster Creation: Threads are lighter and faster to create and manage than processes. Creating a process requires more work from the operating system, such as setting up a separate memory area, whereas a thread runs inside an existing process and can therefore be started more quickly (see the sketch after this list).
  • Less Memory Consumption: Threads share the memory space of their process, while each process has its own private memory space. Threads therefore consume less memory and allow more efficient memory management than processes performing the same tasks.
  • More Effective Communication: Since threads run in the same process, sharing data between them is easy. Because they share memory, they can manipulate data directly and do not need extra communication mechanisms.
  • Operating System Resources: Creating processes places more load on the operating system, while threads impose a lighter load. This makes threads more efficient for managing a large number of parallel operations at the same time.
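
As a counterpart to the fork() sketch above, the following minimal sketch creates a POSIX thread with pthread_create(). Because the thread shares the process's memory, its change to the variable is visible in main() after pthread_join(). This only illustrates the shared-memory behaviour; it is not a complete parallel program.

    /* Minimal sketch: threads share the memory of their process. */
    #include <pthread.h>
    #include <stdio.h>

    static int counter = 0;

    static void *worker(void *arg) {
        (void)arg;
        counter = 100;               /* writes to memory shared with main() */
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, NULL);
        printf("counter = %d\n", counter);   /* prints 100 */
        return 0;
    }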

3. Challenges Encountered in Parallel Programming

3.1. Race Condition

A race condition is a situation in which unexpected and erroneous results occur when multiple threads or processes simultaneously read and write a shared resource (for example, a variable, memory region, file, or database).

A race condition arises when multiple threads try to access the same resource at the same time. Because the order in which the threads modify the resource is not guaranteed, unexpected results can occur: the program becomes non-deterministic and may produce different results each time it is run.

In a typical race condition, two threads read and modify the same variable: one thread reads the variable while the other is modifying it. The final value then depends on the exact interleaving of the two threads and may be unexpected.

Race conditions are an important issue to consider in parallel programming. To avoid them, a commonly preferred approach is to use synchronization mechanisms between threads or processes so that only one thread or process at a time can execute the code that accesses a shared resource (the critical section). In this way, simultaneous-access problems on shared resources are avoided and the program runs safely and correctly.
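
The following minimal C sketch reproduces such a race: two threads increment the same counter without synchronization, so the final value is typically not the expected sum. The iteration count is an arbitrary illustrative choice.

    /* Minimal sketch: a race condition on a shared counter. */
    #include <pthread.h>
    #include <stdio.h>

    #define ITERATIONS 1000000

    static long counter = 0;

    static void *increment(void *arg) {
        (void)arg;
        for (long i = 0; i < ITERATIONS; i++)
            counter++;               /* read-modify-write, not atomic */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* often prints a value smaller than 2000000 */
        printf("counter = %ld (expected %d)\n", counter, 2 * ITERATIONS);
        return 0;
    }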

3.1.1. Synchronization Methods for Race Conditions

  • Mutex Lock: A lock (mutex) is acquired before entering the critical section and released when the work is finished. Other threads wait for the lock to be released and thus access the shared resource in turn (see the sketch after this list).
  • Semaphores: A semaphore is a counter used to control simultaneous access to a resource; its value represents how many units of the resource are currently available. A thread decrements (acquires) the semaphore when it wants to use the resource and increments (releases) it when it is done. The semaphore's value therefore reflects the current state of the resource and bounds the maximum number of simultaneous accesses allowed.
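
To make the mutex approach concrete, the following sketch fixes the race from the previous example by wrapping the increment (the critical section) in pthread_mutex_lock()/pthread_mutex_unlock(). A POSIX semaphore initialized to 1 (sem_wait()/sem_post()) could be used in the same way; this is a minimal illustration, not the only possible design.

    /* Minimal sketch: protecting the critical section with a mutex. */
    #include <pthread.h>
    #include <stdio.h>

    #define ITERATIONS 1000000

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *increment(void *arg) {
        (void)arg;
        for (long i = 0; i < ITERATIONS; i++) {
            pthread_mutex_lock(&lock);    /* enter the critical section */
            counter++;
            pthread_mutex_unlock(&lock);  /* leave the critical section */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);   /* always 2000000 */
        return 0;
    }

Note that locking every increment serializes the critical section, so the two threads no longer execute that part in parallel; in real programs the critical section is kept as small as possible.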

3.2. Inter-process Communication Mechanisms

Since the memory area of each process is separate and isolated, data sharing between processes can be achieved with the following techniques:

  • Pipe: A pipe is a simple IPC mechanism that provides a unidirectional (or, on some systems, duplex) flow of data between two processes: one process writes data and the other reads it. Pipes are mostly used to pass data between parent and child processes. If more than one process writes to the same end, the data inside the pipe may become interleaved or corrupted (see the sketch after this list).
  • Message Queue: A message queue is an IPC method in which one process sends messages and receiving processes take them from the queue. Message queues allow data to be sent and received asynchronously.
  • Shared Memory: Shared memory is an IPC mechanism that lets different processes share a region of memory. One process can write to the shared region and other processes can read from it.
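
The following minimal C sketch illustrates the pipe mechanism from the list above: the parent writes a short message into the pipe and the child, created with fork(), reads it. The message text and buffer size are illustrative.

    /* Minimal sketch: parent-to-child communication through a pipe. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];                  /* fds[0]: read end, fds[1]: write end */
        if (pipe(fds) == -1) {
            perror("pipe");
            return 1;
        }

        pid_t pid = fork();
        if (pid == 0) {
            /* child: reads from the pipe */
            close(fds[1]);                        /* close unused write end */
            char buf[64];
            ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
            if (n > 0) {
                buf[n] = '\0';
                printf("child received: %s\n", buf);
            }
            close(fds[0]);
        } else {
            /* parent: writes into the pipe */
            close(fds[0]);                        /* close unused read end */
            const char *msg = "hello from parent";
            write(fds[1], msg, strlen(msg));
            close(fds[1]);
            wait(NULL);
        }
        return 0;
    }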