
Burroughs
BURROUGHS SCIENTIFIC PROCESSOR




OVERVIEW, PERSPECTIVE, ARCHITECTURE




CONTENTS




Section

1  INTRODUCTION

2  ARCHITECTURAL PHILOSOPHY

3  PARALLELISM RATIONALE

4  PARALLELISM USEFULNESS

5  PARALLELISM IN SUPERCOMPUTERS

6  PARALLELISM IMPLEMENTATION IN THE BSP

7  SUMMARY








1. INTRODUCTION




One of the most exciting developments in large-scale scientific computing is the
announcement of the Burroughs Scientific Processor (BSP). This system, capable
of delivering up to 50 million answers per second, is intended to solve the very
largest problems in engineering and science.

The BSP is one of the so-called "supercomputers." As such, it is designed to
deliver at least one and in most instances several orders of magnitude more
processing power than the largest general-purpose computers.

Supercomputer design and utilization is a subject of much more than academic
interest. A number of application areas, addressable only by supercomputers,
can be linked directly to our progress and survival. These areas include
numerical weather prediction, structural analysis, linear programming,
natural resource exploration, and nuclear technology. Associated with each
application is at least one critical issue, as indicated below.

Application                      Critical Issues

Numerical weather prediction     Agricultural production and flood
                                 control

Structural analysis              More energy-efficient, safer
                                 automobiles

                                 Safer, more economical buildings,
                                 bridges, roads

Linear programming               Application of limited resources to
                                 maximize or minimize a specified
                                 objective

Nuclear technology               More cost-effective, safer sources
                                 of energy

Consider numerical weather prediction. At the present time, supercomputers are
being used extensively by atmospheric research institutions around the world as
key tools in understanding and predicting the weather. Assume it were possible to
compute regional forecasts accurately several months in advance. Imagine how
this would benefit food production. Given an accurate, long-range forecast, a
country could take a major step toward predicting its crop yields and could plan to
ensure that it had an adequate food supply. At the present time, it is conceivable
that only a "super" supercomputer could deliver the computing power necessary
to achieve this goal.

It has been argued in some quarters that all large computers (including the super-
scale systems discussed here) will soon be superseded by collections of mini-
computers or ensembles of thousands of microprocessors. The rationale behind
this argument is that the era of truly inexpensive hardware is at hand; and that it
ought to be possible to have (in some aggregate form, at least) several orders of
magnitude more processing power at a much lower cost in the ensemble of micro-
processors or the collection of minicomputers.

Unfortunately, no one has yet determined a method of controlling or utilizing the
power available in the ensemble, nor how to partition a large problem currently
soluble only by a supercomputer onto a collection of smaller machines in order
to obtain a timely solution. Unquestionably, inexpensive hardware
will be exploited in the future, but some fundamental problems in control will
have to be solved first.








2. ARCHITECTURAL PHILOSOPHY




Parallelism is the architectural philosophy underlying the design of the BSP. It
is synonymous with concurrency and simultaneity, namely, many things going on
at once. It can be defined as the employment of multiple computing resources to
increase the throughput of a system, and can be understood and utilized in terms
of the two basic parameters that characterize all computers: space and time.

Spatial parallelism is exploited by employing replicated units doing identical tasks
simultaneously. Temporal parallelism is exploited by equipping a single unit
with the capability to perform different tasks simultaneously.

Given these definitions, it is easy to see that parallelism is not a new idea
in computer design. It has been extensively employed in general-purpose data
processing systems via multiprocessing (replicated CPUs) and multiprogramming
(where the I/O requirements of one job are balanced against the processing
requirements of another job). The principal objective in the general-purpose system is
to maximize the throughput of a mix of jobs (Figure 1); but in the context of very
large-scale scientific processing, parallelism is defined with a different end in
mind. It is the application of multiple computing resources to the solution of a
single problem (Figure 2).








FIGURE 1. A job mix: each job (e.g., a FORTRAN job) is assigned its own computing resource.




FIGURE 2. A single problem (weather prediction, structural analysis, nuclear technology) applied across multiple computing resources.








3. PARALLELISM RATIONALE




The applications that require the power of a supercomputer are quite distinct from
one another in that they address different natural phenomena and use different
mathematical techniques. But they do have one common characteristic: massive
amounts of computation. In fact, the number of arithmetic operations needed to
solve some problems is now in the trillions.

This situation is not likely to change - for problem requirements continue to grow.
Computerized models of natural phenomena are quite simple by nature's
standards. Scientists are constantly striving to perfect their models by making
them more accurate and by exercising them with more and more data (Figure 3).




FIGURE 3. Problem space: greater resolution and more accurate models demand significantly more powerful computers than those presently available.








The amount of computation required by more sophisticated models places enormous
burdens on the computing systems which support them. The burden is especially
heavy if the computer is sequentially organized (Figure 4), that is, if all arithmetic
operations must be done one at a time. The reason is that sequential organizations
are now running into the limitations of the so-far immutable law of physics which
dictates that it is not possible to transfer information from one point to another
faster than the speed of light.




FIGURE 4. Sequential organization: a control unit and a single arithmetic unit attached to memory.



Traditionally, serial machines have demonstrated performance gains by little
more than a repackaging of the basic organization of Figure 4 in faster and faster
hardware. That is, computer technology has advanced from vacuum tubes to
transistors to integrated circuits, with corresponding increases in the number
of operations per second (tens of thousands, hundreds of thousands, and millions
of operations per second respectively).

While it is expected that "hardware only" based improvements will continue, they
cannot be expected to continue at the pace that has enabled computer designers to
see an order of magnitude increase in performance every three to five years.
Thus, to guarantee the levels of performance needed by superscale problems,
the conclusion is inescapable: some additional component is necessary in the
basic architecture of a computer system. That component is parallelism.








4. PARALLELISM USEFULNESS




It is natural to ask if parallelism is a sufficiently general concept to be useful in
computer design. Parallelism turns out to be extremely useful because our
descriptions of nature lend themselves readily to the types of parallelism that
can be built into a computer.

Our perception of natural phenomena begins typically with a description in terms
of continuous mathematics, which is then translated into a description in terms
of finite mathematics. The discretization process is suggested in Figure 5.






FIGURE 5. The discretization process: a continuous description reduced to a finite grid of points.

Suppose the quantity of interest is a function called φ(X, Y). It might be a measure
of temperature or charge distribution. It is to be computed over the surface of a
slab by means of solving a differential equation. If the equation were exactly
soluble, φ(X, Y) could be determined for any point on the slab. However, in many
instances, the equation is not exactly soluble. One must, therefore, use a finite
approximation to the differential equation and be content with computing the finite
equivalent Φ(I, J) at a finite number of points on the slab.








Two points should be understood about the computer solution of the Φ(I, J) on a
sequential computer. First, all Φ(I, J)s are computed one at a time. Second,
the total amount of computation time is proportional to the number of grid points
and to the solution time per grid point.

However, in many instances there is nothing in the mathematics which dictates
that the Φ(I, J) be computed one at a time. In fact, many models have the
property that Φ(I+1, J) depends only on Φ(I, J). This means that a number of
Φ(I+1, J)s can be computed simultaneously, implying a substantial increase in
performance (Figure 6).




FIGURE 6. Simultaneous computation of the grid points Φ(I+1, J).
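The independence of the Φ(I+1, J)s can be sketched in Python. The recurrence used here is hypothetical (it is not taken from the BSP literature); the point is only that when each element of row I+1 depends solely on the element below it, all elements of the new row are independent and could be evaluated simultaneously on a parallel machine.

```python
# Hypothetical recurrence: PHI(I+1, J) = 0.5 * PHI(I, J) + 1.0.
# Each grid point in row I+1 depends only on the point directly below it.

def next_row(row):
    # All elements are independent of one another: a parallel machine
    # could evaluate every J of this row at the same time.
    return [0.5 * x + 1.0 for x in row]

phi = [[1.0, 2.0, 3.0, 4.0]]   # row I = 0 of the grid
for _ in range(3):             # rows themselves are still produced in order
    phi.append(next_row(phi[-1]))
```

A sequential machine performs these element evaluations one at a time; the gain from parallelism comes from doing a whole row's worth at once.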


Simultaneous computation suggests parallelism. Parallel or simultaneous
computation in turn suggests that there may be an entity more suitable to an
architecture based on parallel technology than the single operand which is
associated (conceptually, at least) with a sequential or serial architecture.

The basic quantity susceptible to parallelism is the linear vector. In this context, a
vector is defined as a set of operands upon which some sequence of arithmetic
operations is to be performed. A linear vector is a vector whose elements are
mapped into the memory of a computer in a linear fashion, i.e., the addresses
of the elements differ by a constant (Figure 7).

Simple manipulations of linear vectors correspond to looping structures in
FORTRAN. For example, if A and B are defined as vectors with 100 elements
each, then the vector statement:

C = A + B

is equivalent to:

      DO 10 I = 1, 100
   10 C(I) = A(I) + B(I)
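The same equivalence can be sketched in Python (an illustration, not BSP code): the "vector" statement stands for 100 independent element-wise additions, exactly what the FORTRAN DO loop spells out one at a time.

```python
# Two operand vectors of 100 elements each (arbitrary illustrative values).
A = list(range(100))
B = [2 * x for x in range(100)]

# Loop form, as in the DO 10 statement: one addition per iteration.
C_loop = [0] * 100
for i in range(100):
    C_loop[i] = A[i] + B[i]

# "Vector" form: a single expression over the whole operand sets,
# corresponding to C = A + B. The additions are mutually independent.
C_vec = [a + b for a, b in zip(A, B)]
```

Because no iteration depends on any other, a parallel machine is free to perform many of the 100 additions simultaneously.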








LINEAR VECTORS

4 X 5 ARRAY (N = 4 ROWS)

    A11  A12  A13  A14  A15
    A21  A22  A23  A24  A25
    A31  A32  A33  A34  A35
    A41  A42  A43  A44  A45

STANDARD FORTRAN COLUMNWISE MAPPING

ARRAY      A11 A21 A31 A41 A12 A22 A32 A42 A13 A23 A33 A43 A14 A24 A34 A44 A15 A25 A35 A45
ELEMENTS

MEMORY     0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19
ADDRESS

LINEAR VECTOR: COMPONENTS SEPARATED BY A CONSTANT INCREMENT d

    COLUMNS              d = 1
    ROWS                 d = N
    FORWARD DIAGONALS    d = N + 1

FIGURE 7.
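As a sketch (my own illustration, not from the original document), the columnwise mapping and the three increments of Figure 7 can be checked in Python for the 4 x 5 case:

```python
# Columnwise ("standard FORTRAN") mapping of a 4 x 5 array, as in Figure 7.
# With 1-based subscripts, A(i, j) lives at address (i - 1) + (j - 1) * N.
N, M = 4, 5  # rows, columns

def address(i, j):
    # Linear memory address of element A(i, j) under column-major storage.
    return (i - 1) + (j - 1) * N

# Each index set below is a linear vector: addresses differ by a constant d.
col2 = [address(i, 2) for i in range(1, N + 1)]   # a column: d = 1
row3 = [address(3, j) for j in range(1, M + 1)]   # a row: d = N
diag = [address(k, k) for k in range(1, N + 1)]   # forward diagonal: d = N + 1

def increment(addrs):
    # Common difference between successive addresses.
    return addrs[1] - addrs[0]

print(increment(col2), increment(row3), increment(diag))  # 1, N, N + 1
```

The constant increment is what lets a parallel memory system fetch an entire column, row, or diagonal as a single linear vector.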