    System Configuration
    The ES is a highly parallel vector supercomputer system of the distributed-memory type, consisting of 640 processor nodes (PNs) connected by 640x640 single-stage crossbar switches. Each PN is a shared-memory system consisting of 8 vector-type arithmetic processors (APs), a 16-GB main memory system (MS), a remote access control unit (RCU), and an I/O processor. The peak performance of each AP is 8 Gflops. The ES as a whole thus consists of 5120 APs with 10 TB of main memory and a theoretical peak performance of 40 Tflops.
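
    The system-wide totals follow directly from the per-node figures. The short Python sketch below reproduces that arithmetic. The constant and function names are illustrative only, and the use of a binary prefix (a factor of 1024) for both conversions is an assumption chosen because it reproduces the rounded figures quoted above; with a decimal prefix the peak would come out as 40.96 Tflops.

```python
# Per-node figures as quoted in the text.
NUM_NODES = 640            # processor nodes (PNs)
APS_PER_NODE = 8           # arithmetic processors (APs) per PN
MEMORY_PER_NODE_GB = 16    # main memory per PN, in GB
PEAK_PER_AP_GFLOPS = 8     # peak performance per AP, in Gflops

# Assumption: binary prefixes (1024) for GB -> TB and Gflops -> Tflops.
PREFIX_FACTOR = 1024


def es_totals():
    """Return system-wide totals derived from the per-node figures."""
    total_aps = NUM_NODES * APS_PER_NODE
    total_memory_tb = NUM_NODES * MEMORY_PER_NODE_GB / PREFIX_FACTOR
    peak_tflops = total_aps * PEAK_PER_AP_GFLOPS / PREFIX_FACTOR
    return {
        "total_aps": total_aps,
        "total_memory_tb": total_memory_tb,
        "peak_tflops": peak_tflops,
    }


if __name__ == "__main__":
    for name, value in es_totals().items():
        print(f"{name}: {value}")
```

    Calling es_totals() gives 5120 APs, 10 TB of main memory, and a peak of 40 Tflops, matching the totals stated above.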

    Construction of Arithmetic Processor (AP)

    Each AP consists of a 4-way super-scalar unit (SU), a vector unit (VU), and a main memory access control unit on a single LSI chip. The AP operates at a clock frequency of 500 MHz, with some circuits operating at 1 GHz. Each SU is a super-scalar processor with a 64 KB instruction cache, a 64 KB data cache, and 128 general-purpose scalar registers.

