The ISA was used as the base for high-end microprocessors from IBM during the 1990s, which were used in many of IBM's servers, minicomputers, workstations, and supercomputers. These processors are called POWER1 (RIOS-1, RIOS.9, RSC, RAD6000) and POWER2 (POWER2, POWER2+ and P2SC).
The ISA evolved into the PowerPC instruction set architecture and was deprecated in 1998, when IBM introduced the POWER3 processor, which was mainly a 32/64-bit PowerPC processor but included the POWER ISA for backwards compatibility. The POWER ISA was then abandoned.
In 1974, IBM started a project with a design objective of creating a large telephone-switching network with a potential capacity to deal with at least 300 calls per second. It was projected that 20,000 machine instructions would be required to handle each call while maintaining a real-time response, so a processor with a performance of 12 MIPS was deemed necessary. This requirement was extremely ambitious for the time, but it was realised that much of the complexity of contemporary CPUs could be dispensed with, since this machine would need only to perform I/O, branch, add register to register, and move data between registers and memory, and would have no need for special instructions to perform heavy arithmetic.
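The arithmetic behind this estimate can be checked in a few lines. The sketch below multiplies the two quoted figures, which gives a sustained rate of 6 MIPS. The text does not say how the 12 MIPS target was derived from that, so reading the factor of two as headroom is an assumption, and the function name is purely illustrative.

```python
# Figures quoted in the text.
CALLS_PER_SECOND = 300
INSTRUCTIONS_PER_CALL = 20_000
TARGET_MIPS = 12


def required_mips(calls_per_second: int, instructions_per_call: int) -> float:
    """Sustained instruction rate in millions of instructions per second."""
    return calls_per_second * instructions_per_call / 1_000_000


baseline = required_mips(CALLS_PER_SECOND, INSTRUCTIONS_PER_CALL)

# Ratio of the stated target to the raw product of the two figures.
# Interpreting this as a safety margin is an assumption, not a sourced fact.
headroom = TARGET_MIPS / baseline

print(f"baseline: {baseline} MIPS, target: {TARGET_MIPS} MIPS, ratio: {headroom}x")
```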
This simple design philosophy, whereby each step of a complex operation is specified explicitly by one machine instruction, and all instructions are required to complete in the same constant time, would later come to be known as RISC.
By 1975 the telephone switch project had been canceled without a prototype. Estimates based on simulations from the project's first year, however, suggested that the processor being designed could be a very promising general-purpose processor, so work continued at Thomas J. Watson Research Center building #801, on the 801 project.
For two years at the Watson Research Center, the superscalar limits of the 801 design were explored. The work examined whether the design could be implemented with multiple functional units to improve performance, as had been done in the IBM System/360 Model 91 and the CDC 6600 (although the Model 91 was based on a CISC design). The aim was to determine whether a RISC machine could sustain multiple instructions per cycle, and what changes to the 801 design would be needed to support multiple execution units.
To increase performance, Cheetah had separate branch, fixed-point, and floating-point execution units. Many changes were made to the 801 design to allow for multiple execution units. Cheetah was originally planned to be manufactured using bipolar emitter-coupled logic (ECL) technology, but by 1984 complementary metal–oxide–semiconductor (CMOS) technology afforded an increase in the level of circuit integration while improving transistor-logic performance.
In 1985, research on a second-generation RISC architecture started at the IBM Thomas J. Watson Research Center, producing the "AMERICA architecture"; in 1986, IBM Austin started developing the RS/6000 series, based on that architecture.
The first IBM computers to incorporate the POWER instruction set, introduced in February 1990, were called the "RISC System/6000" or RS/6000. These RS/6000 computers were divided into two classes, workstations and servers, and hence introduced as the POWERstation and POWERserver. The RS/6000 CPU had two configurations, called the "RIOS-1" and "RIOS.9" (or more commonly the "POWER1" CPU). A RIOS-1 configuration had a total of 10 discrete chips: an instruction cache chip, a fixed-point chip, a floating-point chip, 4 data cache chips, a storage control chip, an input/output chip, and a clock chip. The lower-cost RIOS.9 configuration had 8 discrete chips: an instruction cache chip, a fixed-point chip, a floating-point chip, 2 data cache chips, a storage control chip, an input/output chip, and a clock chip.
A single-chip implementation of RIOS, RSC (for "RISC Single Chip"), was developed for lower-end RS/6000s; the first machines using RSC were released in 1992.
IBM started the POWER2 processor effort as a successor to the POWER1 two years before the creation of the 1991 Apple/IBM/Motorola alliance in Austin, Texas. Despite being impacted by the diversion of resources to jump-start the Apple/IBM/Motorola effort, the POWER2 took five years from start to system shipment. By adding a second fixed-point unit, a second floating-point unit, and other performance enhancements to the design, the POWER2 had leadership performance when it was announced in November 1993.
New instructions were also added to the instruction set:
To support the RS/6000 and RS/6000 SP2 product lines in 1996, IBM had its own design team implement a single-chip version of POWER2, the P2SC ("POWER2 Super Chip"), outside the Apple/IBM/Motorola alliance, in IBM's most advanced and dense CMOS-6S process. P2SC combined all of the separate POWER2 instruction cache, fixed-point, floating-point, storage control, and data cache chips onto one very large die. At the time of its introduction, P2SC was the largest processor in the industry and had the highest transistor count. Despite the challenge of its size, complexity, and advanced CMOS process, the first tape-out version of the processor could be shipped, and it had leadership floating-point performance at the time it was announced. P2SC was the processor used in the 1997 IBM Deep Blue chess-playing supercomputer, which beat chess grandmaster Garry Kasparov. With its twin MAF floating-point units and very wide, low-latency memory interfaces, P2SC was primarily targeted at engineering and scientific applications. P2SC was eventually succeeded by the POWER3, which added 64-bit support, SMP capability, and a full transition to PowerPC while retaining twin MAF floating-point units like those of the P2SC.
At about the same time the PC/RT was being released, IBM started the America Project, to design the most powerful CPU on the market. They were interested primarily in fixing two problems in the 801 design:
Floating point became a focus for the America Project, and IBM was able to use new algorithms developed in the early 1980s that could support 64-bit double-precision multiplies and divides in a single cycle. The FPU portion of the design was separate from the instruction decoder and integer parts, allowing the decoder to send instructions to both the FPU and ALU (integer) execution units at the same time. IBM complemented this with a complex instruction decoder which could be fetching one instruction, decoding another, and sending instructions to the ALU and FPU, all at the same time, resulting in one of the first superscalar CPU designs in use.
The system used 32 32-bit integer registers and another 32 64-bit floating-point registers, each set in its own unit. The branch unit also included a number of "private" registers for its own use, including the program counter.
Another feature of the architecture is a virtual address system which maps all addresses into a 52-bit space. In this way applications can share memory in a "flat" 32-bit space, while each program can have its own distinct 32-bit block within the larger space.
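The mapping from a 32-bit program address into the 52-bit space can be illustrated with a small sketch. The text gives only the two widths; the split used below (the top 4 address bits select one of 16 segment registers, each holding a 24-bit segment ID that is prefixed to the remaining 28 bits) is an assumption chosen to be consistent with the 52-bit total, and the function and variable names are hypothetical.

```python
SEGMENT_ID_BITS = 24  # assumed width of the ID held in each segment register
OFFSET_BITS = 28      # assumed low-order address bits passed through unchanged
NUM_SEGMENTS = 16     # assumed: selected by the top 4 bits of the 32-bit address


def to_virtual(effective_address: int, segment_registers: list[int]) -> int:
    """Map a 32-bit program address to a 52-bit virtual address."""
    if not 0 <= effective_address < 2**32:
        raise ValueError("effective address must fit in 32 bits")
    if len(segment_registers) != NUM_SEGMENTS:
        raise ValueError("expected one entry per segment register")
    index = effective_address >> OFFSET_BITS               # top 4 bits
    offset = effective_address & ((1 << OFFSET_BITS) - 1)  # low 28 bits
    segment_id = segment_registers[index] & ((1 << SEGMENT_ID_BITS) - 1)
    return (segment_id << OFFSET_BITS) | offset


# Two programs with mostly distinct segment IDs that share segment 3.
program_a = list(range(NUM_SEGMENTS))
program_b = [0x100 + i for i in range(NUM_SEGMENTS)]
program_b[3] = program_a[3]

shared_a = to_virtual(0x3000_1000, program_a)
shared_b = to_virtual(0x3000_1000, program_b)
private_a = to_virtual(0x1000_0000, program_a)
private_b = to_virtual(0x1000_0000, program_b)
```

In this sketch both programs see the same flat 32-bit address range. Addresses in the shared segment resolve to the same virtual address, while the same program address in a private segment resolves to different virtual addresses for each program.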
Appendix E of Book I: PowerPC User Instruction Set Architecture of the PowerPC Architecture Book, Version 2.02 describes the differences between the POWER and POWER2 instruction set architectures and the version of the PowerPC instruction set architecture implemented by the POWER5.