The Professor
Gita

Architectures

The most important architectural aspect of an SIMD machine is the organization of its processor array. One such architecture is the processing-element-to-processing-element organization. In this configuration, N processing elements (PEs) are connected via an interconnection network, and each PE is a processor with local memory. The array control unit (ACU) distributes instructions to the PEs over a broadcast bus, and each PE executes them on data stored in its own memory and on data broadcast by the ACU. Data is exchanged among PEs via a unidirectional interconnection network, and the I/O bus transfers data between the PEs and the I/O interface. The result bus transfers results from particular PEs to the ACU. Because each PE works out of local memory, the hardware of such a machine can be constructed efficiently. In many algorithms, communication is mostly local, e.g., among the nearest neighbors.
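The PE-to-PE organization can be sketched as a small simulation. This is only an illustration under stated assumptions: the names `PE`, `broadcast`, and `shift_right` are hypothetical, the network is modeled as a one-way ring, and a sequential loop stands in for lockstep parallel execution.

```python
from dataclasses import dataclass, field


@dataclass
class PE:
    """One processing element: a processor with its own local memory."""
    local: dict = field(default_factory=dict)


def broadcast(pes, instruction, operand=None):
    """ACU side: every PE executes the same instruction on its own data."""
    for pe in pes:  # sequential stand-in for lockstep parallel execution
        instruction(pe, operand)


def shift_right(pes, src, dst):
    """Assumed one-way ring: PE i sends local[src] to PE (i + 1) mod N."""
    sent = [pe.local[src] for pe in pes]  # all PEs send before any receives
    for i, pe in enumerate(pes):
        pe.local[dst] = sent[(i - 1) % len(pes)]


def load(pe, _):
    pe.local["x"] = pe.local["id"] * 10


def add_scalar(pe, k):
    pe.local["x"] += k  # k is data broadcast by the ACU


def add_neighbor(pe, _):
    pe.local["x"] += pe.local["left"]


pes = [PE({"id": i}) for i in range(4)]
broadcast(pes, load)            # x = [0, 10, 20, 30]
broadcast(pes, add_scalar, 1)   # x = [1, 11, 21, 31]
shift_right(pes, "x", "left")   # left = [31, 1, 11, 21]
broadcast(pes, add_neighbor)    # x = [32, 12, 32, 52]
result = [pe.local["x"] for pe in pes]
```

Each PE touches only its own `local` dictionary, and the only inter-PE traffic is the nearest-neighbor shift, which mirrors the mostly local communication described above.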

A second SIMD architecture is the processor-to-memory organization. In this configuration, a bidirectional interconnection network connects the N processors to M memory modules. The processors are controlled by the ACU via the broadcast bus. Data is exchanged between processors through the interconnection network and the memory modules. As in the first organization, the I/O bus handles data transfers between the memories and the I/O interface, and a result bus is used. Examples of this SIMD machine architecture are the Burroughs Scientific Processor (BSP) and the Texas Reconfigurable Array Computer (TRAC).
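A minimal sketch of data exchange in the processor-to-memory organization follows. The sizes, the `write` and `read` helpers, and the choice of which module each processor uses are all assumptions made for illustration; they are not taken from the BSP or TRAC designs.

```python
N, M = 4, 4  # processors and memory modules (sizes chosen for illustration)
modules = [dict() for _ in range(M)]


def write(module, addr, value):
    """Processor -> interconnection network -> memory module."""
    modules[module][addr] = value


def read(module, addr):
    """Memory module -> interconnection network -> processor."""
    return modules[module][addr]


# Step 1 (one broadcast instruction): processor p stores p * p in module (p + 1) % M.
for p in range(N):
    write((p + 1) % M, "msg", p * p)

# Step 2 (next broadcast instruction): processor p loads from module p % M.
received = [read(p % M, "msg") for p in range(N)]
```

No processor talks to another directly here: every value passes through a memory module, which is why the network has to carry traffic in both directions.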

A third SIMD architecture is the content-addressable memory (CAM). In contrast to a RAM, in which data is accessed serially by supplying addresses, a CAM is addressed by content: a data item is presented to the CAM, and every CAM cell sets a flag, in parallel, to indicate whether the item matches the value stored in that cell. Instead of employing separate processors in the processor array, an SIMD machine of this architecture has special compare-and-match logic in each CAM bit cell. Thus, each CAM cell acts like a separate PE. SIMD machines built around a CAM are also called associative processors.
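The difference from address-based access can be illustrated with a toy model. The `CAM` class and its methods are hypothetical names, and the list comprehension is a sequential stand-in for comparisons that real hardware performs in all cells at once.

```python
class CAM:
    """Toy content-addressable memory: one stored word and one match flag per cell."""

    def __init__(self, words):
        self.cells = list(words)
        self.flags = [False] * len(self.cells)

    def search(self, key):
        """Present key to every cell; each cell sets its own match flag."""
        self.flags = [word == key for word in self.cells]
        return self.flags

    def matches(self):
        """Indices of the cells whose flag is set."""
        return [i for i, flag in enumerate(self.flags) if flag]


cam = CAM([7, 3, 7, 9])
cam.search(7)
hits = cam.matches()
```

The caller supplies a value rather than an address, and the answer is the set of cells holding that value.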

Because a single ACU provides the instruction stream for all of the array processors in an SIMD machine, the system is under-utilized whenever programs are run that require only a few PEs. To alleviate this problem, multiple-SIMD (MSIMD) machines were designed. They contain multiple control units, each with its own program memory. The PEs are controlled by U control units that divide the machine into U independent virtual SIMD machines of various sizes. U is usually much smaller than N and determines the maximum number of SIMD programs that can run simultaneously. The assignment of PEs to control units can be either static or dynamic.
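A static assignment of PEs to control units might look like the following sketch. The functions `partition` and `run` are hypothetical, the contiguous block assignment is just one possible policy, and each "program" is reduced to a single function applied to one value per PE.

```python
def partition(n_pes, sizes):
    """Statically assign PEs 0..n_pes-1 to control units; unit u gets sizes[u] PEs."""
    if sum(sizes) > n_pes:
        raise ValueError("more PEs requested than available")
    assignment, start = {}, 0
    for unit, size in enumerate(sizes):
        assignment[unit] = list(range(start, start + size))
        start += size
    return assignment


def run(assignment, programs, data):
    """Each control unit broadcasts its own program to its own PEs only."""
    out = list(data)
    for unit, pes in assignment.items():
        for pe in pes:  # sequential stand-in for parallel execution
            out[pe] = programs[unit](data[pe])
    return out


# N = 8 PEs split into U = 2 virtual SIMD machines of sizes 5 and 3.
virtual = partition(8, [5, 3])
programs = {0: lambda x: x + 1, 1: lambda x: x * 2}
result = run(virtual, programs, list(range(8)))
```

Here two different SIMD programs run side by side, each confined to its own partition; a dynamic scheme would recompute `virtual` as programs start and finish.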

The MSIMD machine architecture has several advantages over normal SIMD machines, including:

  • Efficiency: If a program requires only a subset of the available PEs, the remaining PEs can be used for other programs.
  • Multiple users: Up to U different users can execute different SIMD programs on the machine simultaneously.
  • Fault detection: The same program runs on two independent machine partitions, and errors are detected by comparing the results.
  • Fault tolerance: A faulty PE only affects one of the multiple SIMD machines, and other machines can still operate correctly.
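Fault detection by result comparison can be sketched as follows. The helper names and the way a fault is injected (adding 1 to the result of a designated PE) are assumptions for illustration only.

```python
def run_partition(pes, program, data, faulty=frozenset()):
    """Run program with one PE per data element; a faulty PE corrupts its result."""
    results = []
    for pe, x in zip(pes, data):
        y = program(x)
        results.append(y + 1 if pe in faulty else y)  # injected fault (illustrative)
    return results


def detect_fault(a, b):
    """Return the element positions where the two partitions disagree."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def square(x):
    return x * x


data = [1, 2, 3, 4]
part_a = run_partition([0, 1, 2, 3], square, data)
part_b = run_partition([4, 5, 6, 7], square, data, faulty={6})
mismatches = detect_fault(part_a, part_b)
```

A mismatch shows that one of the two partitions is faulty but not which one; the partition containing PE 6 is affected while the other continues to produce correct results.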
