#### Collected works of Kenneth E. Batcher, emeritus professor, and works inspired by his research and scholarship.

#### Browse the *Kenneth E. Batcher Collection: Papers from the Parallel and Associative Computing Laboratory*

## Virtual parallelism by self simulation of the multiple instruction stream associative model (1996)

The ASC model for parallel computation supports a generalization of an associative style of computing that has been used since the introduction of associative SIMD computers in the early 1970s. In particular, this model supports data parallelism, constant-time maximum and minimum operations, one or more instruction streams (ISs) that are sent to an equal number of partition sets of processors (PEs), and assignment of tasks to the ISs using control parallelism. Since problems often need more processors than a machine has, it is useful to have virtual processors simulated by existing architectures. This paper shows how virtual parallelism is possible, where more PEs and ISs are simulated than the actual hardware possesses. The extra time needed for an ASC(n, j) machine to simulate an ASC(N, J) machine where …
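The core idea of the abstract can be sketched in a few lines: each of n physical PEs holds the state of ⌈N/n⌉ virtual PEs and replays every data-parallel instruction once per virtual PE it hosts, giving an O(N/n) slowdown. The function and variable names below are illustrative, not from the paper.

```python
# Hedged sketch of virtual parallelism: n physical PEs simulate N virtual PEs.
def simulate_virtual_pes(n_physical, virtual_states, instruction):
    """Apply one data-parallel instruction to N virtual PEs using n physical PEs."""
    N = len(virtual_states)
    per_pe = -(-N // n_physical)          # ceil(N / n): virtual PEs per physical PE
    for p in range(n_physical):           # physical PEs run in parallel (looped here)
        for k in range(per_pe):           # each replays its hosted virtual PEs serially
            v = p * per_pe + k
            if v < N:
                virtual_states[v] = instruction(virtual_states[v])
    return virtual_states
```

The serial inner loop is what costs the extra O(N/n) factor per simulated instruction.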

## The universality of various types of SIMD machine interconnection networks (03/1977)

SIMD machine architects must choose an interconnection network to provide interprocessor communication. The universality of a network is its ability to simulate arbitrary interconnections of the processing elements. We examine the universality of five particular networks which cover the types used in the Illiac IV, STARAN, Omen, SIMDA, and RAP machines. They also cover the types discussed by Feng, Lang, Lawrie, Orcutt, Siegel, and Stone. We give O((log₂N)²) algorithms, where N is the number of processing elements, for the Perfect Shuffle, PM2I, WPM2I, and Cube networks to simulate arbitrary interconnections (Orcutt has given an O(N^(1/2) log₂N) algorithm for the Illiac network). We analyze Batcher's bitonic sorting method and show how each network can implement it on an SIMD machine. We discuss how sorting destination tags is equivalent to simulating interconnections.
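Batcher's bitonic sorting method, which the paper analyzes, can be stated as a fixed schedule of compare-exchange stages; each stage exchanges elements at a fixed distance, which is exactly what a SIMD interconnection network implements in parallel. A minimal sequential sketch (not the paper's network mapping):

```python
def bitonic_sort(a):
    """Sort a list whose length is a power of two with Batcher's bitonic
    network: O((log2 N)^2) compare-exchange stages, each parallelizable."""
    n = len(a)
    assert n and (n & (n - 1)) == 0, "length must be a power of two"
    a = list(a)
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # compare-exchange distance within one merge
            for i in range(n):    # on a SIMD machine this loop is one parallel step
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

Sorting destination tags with this network, as the paper notes, routes each element to an arbitrary target PE, which is the sense in which the networks are universal.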

## The Architecture of Tomorrow's Massively Parallel Computer (07/01/1987)

Goodyear Aerospace delivered the Massively Parallel Processor (MPP) to NASA/Goddard in May 1983, over three years ago. Ever since then, Goodyear has tried to look in a forward direction. There is always some debate as to which way is forward when it comes to supercomputer architecture. Improvements to the MPP's massively parallel architecture are discussed in the areas of data I/O, memory capacity, connectivity, and indirect (or local) addressing. In I/O, transfer rates up to 640 megabytes per second can be achieved. There are devices that can supply the data and accept it at this rate. The memory capacity can be increased up to 128 megabytes in the ARU and over a gigabyte in the staging memory. For connectivity, there are several different kinds of multistage networks that should be considered.

## Solving a 2D knapsack problem on an associative computer augmented with a linear network (1996)

This paper describes a parallelization of the sequential dynamic programming method for solving a 2D knapsack problem where multiples of n rectangular objects are optimally packed into a knapsack of size …
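The 2D recurrence itself is not reproduced in the abstract, but the sequential dynamic program it parallelizes follows the same pattern as the classic one-dimensional unbounded knapsack, sketched below as a point of reference (the 2D version adds cuts of the rectangular knapsack; this sketch is illustrative, not the paper's algorithm):

```python
def unbounded_knapsack(capacity, sizes, values):
    """best[c] = maximum value packable into capacity c, with unlimited
    multiples of each object (matching the paper's 'multiples of n objects')."""
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for s, v in zip(sizes, values):
            if s <= c:
                best[c] = max(best[c], best[c - s] + v)
    return best[capacity]
```

Each entry best[c] depends only on smaller capacities, which is what makes a stage-by-stage parallelization over the table possible.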

## Simulation between enhanced meshes and the multiple associative computing (MASC) model (11/1999)

MASC (for Multiple Associative Computing) is a practical, highly scalable combined control-parallel, data-parallel model that naturally supports massive parallelism and a wide range of applications. In this paper, we propose efficient algorithms for the MASC model with a 2-D mesh to simulate enhanced meshes, e.g., meshes with multiple broadcasting (MMB) and basic reconfigurable meshes (BRM). The results not only show the power of the MASC model in terms of the enhanced mesh models but also provide an automatic conversion of numerous algorithms designed for enhanced meshes to the MASC model.

## Simulating PRAM with a MSIMD model (ASC) (08/14/1998)

The ASC (MSIMD) model for parallel computation supports a generalized version of an associative style of computing that has been used since the introduction of associative SIMD computers in the early 1970s. In particular, this model supports data parallelism, constant-time maximum and minimum operations, one or more instruction streams (ISs) which are sent to a unique set in a dynamic partition of the processors, and assignment of tasks to the ISs using control parallelism. ASC also allows a network to interconnect the processing elements (PEs). This paper shows how ASC can simulate synchronous PRAM, and the converse. These results provide an important step in defining the power of the associative model in terms of PRAM, which is the most studied parallel model. Also, these simulations provide numerous algorithms for ASC by giving an automatic method of converting algorithms from PRAM to ASC.

## Relating the power of the Multiple Associative Computing (MASC) model to that of reconfigurable bus-based models (05/2010)

The MASC (Multiple ASsociative Computing) model is a multi-SIMD model that uses control parallelism to coordinate the interaction of data parallel threads and supports associative SIMD computing on each of its threads. There have been a wide range of algorithms developed for this model. Research on using this model in real-time system applications and building a scalable MASC architecture is currently quite active. In this paper, we present simulations between MASC and reconfigurable bus-based models, e.g., various versions of the Reconfigurable Multiple Bus Machine (RMBM). Constant time simulations of the basic RMBM by MASC and vice versa are obtained. Simulations of the segmenting RMBM, fusing RMBM, and extended RMBM by MASC in non-constant time are also discussed. By taking advantage of previously established relationships between RMBM and two other popular parallel computational models, namely, the Reconfigurable Mesh (RM) and the Parallel Random Access Machine (PRAM), we extend our simulation results to further categorize the power of the MASC model in relation to RM and PRAM.

## On using the UML to describe the MASC model of parallel computation (06/2000)

A Unified Modeling Language (UML) description of the MASC model of parallel computation is presented. This UML description identifies MASC objects and specifies various object and inter-object relationships, dependencies, and behaviors. This was achieved by describing various views of the MASC model using many of the UML structural and behavioral diagrams. Using UML to describe MASC has been highly effective for further parallel modeling techniques, comparisons to other parallel models, MASC parallel system software research, and MASC algorithm development.

## Massively parallel processor (MPP): July 1979 phase I final report (07/23/1979)

Goodyear Aerospace Corporation technical report GER-16684

## Implementing Associative Search and Responder Resolution (04/2002)

In a paper presented last year at WMPP'01 [Walker01], we described the initial prototype of an associative processor implemented using field-programmable logic devices (FPLDs). That paper presented an overview of the design, and concentrated on the processor's instruction set and its implementation using FPLDs. This paper describes the implementation of the processor's associative operations -- associative searching and responder resolution -- in more detail.
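The two operations named in the title can be summarized in software: an associative search sets a responder bit in every PE whose record matches, and responder resolution ("pick one") selects a single responding PE, typically by priority. This is a hedged sketch of the concept only; the class and field names are illustrative and not the paper's FPLD design.

```python
class PE:
    """One processing element: local record plus a responder flag."""
    def __init__(self, record):
        self.record = record      # this PE's local memory (a dict here)
        self.responder = False    # set by an associative search

def associative_search(pes, field, value):
    """All PEs compare their local field against the broadcast value.
    Conceptually one parallel step; looped here in software."""
    for pe in pes:
        pe.responder = pe.record.get(field) == value

def resolve_first_responder(pes):
    """Priority responder resolution: return the lowest-indexed responder,
    or None if no PE responded."""
    for i, pe in enumerate(pes):
        if pe.responder:
            return i
    return None
```

In hardware the resolver is a priority tree rather than a scan, so resolution takes time logarithmic (or effectively constant) in the number of PEs rather than linear.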

## Implementing associative processing: Rethinking earlier architectural decisions (04/2001)

This paper describes an initial design of an associative processor for implementation using field-programmable logic devices (FPLDs). The processor is based loosely on earlier work on the STARAN computer, but updated to reflect modern design practices. We also draw on a large body of research at Kent State on the ASC and MASC models of associative processing, and take advantage of an existing compiler for the ASC model. The resulting design consists of an associative array of 8-bit RISC Processing Elements (PEs), operating in byte-serial fashion under the control of an Instruction Stream (IS) Control Unit that can execute assembly language code produced by a machine-specific back-end compiler.

## Functional description of ASPRO: The high speed associative processor (07/16/1984)

Goodyear Aerospace Corporation technical report GER-16868

## Efficient associative SIMD processing for non-tabular structured data (2004)

With recent advances in VLSI technology, it is easy to develop chips containing hundreds or thousands of processors - in effect, SIMD processing on a chip. Associative SIMD processing goes further, providing efficient parallel methods for searching tabular data. The techniques described here show that associative processing can also be extended to efficiently process non-tabular structured data (e.g., trees), as we demonstrate by supporting structure codes in our associative ASC processor.

## Comparisons of air traffic control implementations on an associative processor with a MIMD and consequences for parallel computing (02/2013)

This paper has two complementary focuses. The first is the system design and algorithmic development for air traffic control (ATC) using an associative SIMD processor (AP). The second is the comparison of this implementation with a multiprocessor implementation and the implications of these comparisons. This paper demonstrates how one application, ATC, can more easily, more simply, and more efficiently be implemented on an AP than is generally possible on other types of traditional hardware. The AP implementation of ATC takes advantage of its deterministic hardware to use static scheduling. The software is dramatically smaller and cheaper to create and maintain. Likewise, a large AP system will be considerably simpler and cheaper than the MIMD hardware currently used. While APs were used for ATC-type applications earlier, these are no longer available. We use a ClearSpeed CSX600 accelerator to emulate the AP solutions of ATC on an ATC prototype consisting of eight data-intensive ATC real-time tasks. Its performance is compared with an 8-core multiprocessor (MP) using OpenMP. Our extensive experiments show that the AP implementation meets all deadlines while the MP regularly misses a large number of deadlines. The AP code is similar in size to sequential code for the same tasks and avoids all of the additional support software needed with an MP to handle dynamic scheduling, load balancing, shared resource management, race conditions, false sharing, etc. At this point, essentially only MIMD systems are built. Many of the advantages of using an AP to solve an ATC problem would carry over to other applications. AP solutions for a wide variety of applications are cited in this paper.

Applications that involve a high degree of data parallelism, such as database management, text processing, image processing, graph processing, bioinformatics, weather modeling, and managing UAS (Unmanned Aircraft Systems, or drones), are good candidates for AP solutions. This raises the issue of whether we should routinely consider using non-multiprocessor hardware like the AP for applications where substantially simpler software solutions will normally exist. It also raises the question of whether the use of both AP and MIMD hardware in a single heterogeneous system could provide more versatility and efficiency. Either the AP or MIMD could serve as the primary system, but could hand off jobs it could not handle efficiently to the other system.

## ASPRO support software users guide (08/1982)

Goodyear Aerospace Corporation technical manual GER-16846

## ASPRO programming reference manual (10/1982)

Goodyear Aerospace Corporation technical manual GER-16809 Revision B

## ASPRO floating point programming manual (10/1985)

Goodyear Aerospace Corporation technical manual GER-16810

## ASPRO emulator (VAE) user's guide (02/1985)

Goodyear Aerospace Corporation technical manual GER-17267

## ASPRO debugger RSX version users guide (02/1987)

Goodyear Aerospace Corporation technical manual GER-16891

## ASC: An associative-computing paradigm (11/1994)

Today's increased computing speeds allow conventional sequential machines to effectively emulate associative computing techniques. We present a parallel programming paradigm called ASC (ASsociative Computing), designed for a wide range of computing engines. Our paradigm has an efficient associative-based, dynamic memory-allocation mechanism that does not use pointers. It incorporates data parallelism at the base level, so that programmers do not have to specify low-level sequential tasks such as sorting, looping, and parallelization. Our paradigm supports all of the standard data-parallel and massively parallel computing algorithms. It combines numerical computation (such as convolution, matrix multiplication, and graphics) with nonnumerical computing (such as compilation, graph algorithms, rule-based systems, and language interpreters). This article focuses on the nonnumerical aspects of ASC.
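A signature ASC primitive mentioned across these abstracts is the "constant time" maximum: scanning the w bits of a field from most to least significant and keeping only candidates that have a 1 wherever any candidate does. The time is O(w) in the field width, independent of the number of PEs, which is the sense in which it is constant time in the machine size. A minimal software sketch (the bit-serial search is the classic associative technique; the function name is ours):

```python
def associative_max(values, width=32):
    """Bit-serial associative maximum over unsigned integers of the given
    field width. Each loop iteration is one parallel search-and-mask step."""
    candidates = list(values)           # all PEs start as candidates (responders)
    for bit in reversed(range(width)):  # most significant bit first
        mask = 1 << bit
        if any(v & mask for v in candidates):        # "any responders?" test
            candidates = [v for v in candidates if v & mask]
    return candidates[0]                # all survivors hold the maximum value
```

Minimum works the same way with the roles of 0 and 1 swapped; on associative hardware, both the search and the "any responders?" test are single steps over all PEs at once.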

## A study of a byte oriented associative array processor for image processing (11/19/1975)

Goodyear Aerospace Corporation technical report AP-122364

## A software implementation of a cycle precision simulator of a multiple associative model (01/2010)

The Multiple Associative Computing (MASC) parallel model is a generalization of the Associative Computing (ASC) parallel model, designed to support multiple ASC data-parallel threads by using control parallelism. The MASC model is designed to combine the advantages of both Single Instruction Stream, Multiple Data Streams (SIMD) and Multiple Instruction Streams, Multiple Data Streams (MIMD) models. This is the first time that a complete description of the MASC model has been implemented (in software) true to its original description. A cycle-precision simulator is built to demonstrate the performance of MASC on various multithreaded algorithms. The simulator is a software prototype for the model with sufficient software detail to allow it to be converted into a hardware prototype of the model. If a reasonable limit on the number of threads simultaneously supported is assumed, the resulting hardware design is not only easy to implement, but can easily support a huge number of processing units and is an excellent candidate architecture for supporting large-scale (e.g., terascale and petascale) computing. Experimental results show that, when processing large-scale instances using multiple workers, the algorithm executed by the MASC model using a static task assignment scheme provides strong scaling with constant time overhead.

## A scalable pipelined associative SIMD array with reconfigurable PE interconnection network for embedded applications (11/2005)

This paper describes the FPGA implementation of a specialized SIMD processor array for embedded applications. An alternative to traditional SoC or MPSoC architectures, this array combines the massive parallelism inherent in SIMD architectures with the search capabilities of associative computing, producing a SIMD Processor Array System on a Chip (PASoC) well suited for applications such as data mining and bioinformatics. This paper first describes the architecture of the system, and then the pipelining of the processing elements and the addition of a reconfigurable interconnection network. Finally, the paper concludes with a brief description of a string-matching algorithm that can be used for the applications cited above.

## A scalable associative processor with applications in database and image processing (04/2004)

This paper describes the implementation and use of a dedicated associative SIMD co-processor ideally suited for many applications such as database processing, image processing, genome matching, or molecular similarity analysis. The concept of associative SIMD processing is introduced, and differentiated from other associative and SIMD techniques. Then our ASC (ASsociative Computing) processor is briefly described, along with its implementation of associative SIMD processing. Finally, we demonstrate the use of our ASC processor on relational database processing and on the image processing operation of edge detection.