MICROPROCESSOR REPORT
www.MPRonline.com
THE INSIDER'S GUIDE TO MICROPROCESSOR HARDWARE
PENTIUM 4 (PARTIALLY) PREVIEWED
Intel Lifts Veil on Hyperpipelined CPU--But Not All the Way
By Peter N. Glaskowsky {8/28/00-01}
Intel has released a few more details of its next IA-32 processor, formerly code-named Willamette and now slated to be sold as the Pentium 4. At the Intel Developer Forum last week, Intel's CEO Craig Barrett and vice president Albert Yu provided interesting insights into the design goals and microarchitecture of the Pentium 4's new core. The announcement answered some questions but begged others. For example, the P4 will be announced at "at least" 1.4GHz, according to Intel, but the company has said nothing about the P4's performance relative to the Pentium 3, currently shipping at 1.13GHz.

The higher operating frequency of the new part is made possible by a hyperpipelined core with 20 stages. Intel calls this new architecture NetBurst rather than P7, or some other sequential code-name of the type used in the past.

As Figure 1 shows, the NetBurst pipeline is twice as deep as that of the P6, which in turn had twice the depth of the P5's. Increasing pipeline depth increases logic complexity and branch penalties, but it also allows clock speeds to increase. We expect the new core to reach 2GHz--a speed demonstrated at IDF--before it moves to a 0.13-micron process in 2001.

The two Drive stages shown in Figure 1 represent time required to move signals across the chip. No other work is done during these stages. As far as we know, NetBurst is the first pipeline with dedicated stages for wire delays.

P5 Microarchitecture: Prefetch, Decode, Decode, Execute, Write-back
P6 Microarchitecture: Fetch, Fetch, Decode, Decode, Decode, Rename, ROB Rd, Rdy/Sch, Dispatch, Execute
NetBurst Microarchitecture: TC Nxt IP, TC Fetch, Drive, Alloc, Rename, Queue, Schedule, Schedule, Schedule, Dispatch, Dispatch, Reg File, Reg File, Execute, Flags, Branch Ck, Drive

Figure 1. The new hyperpipelined NetBurst microarchitecture of the Pentium 4 allows its clock rate to be increased significantly over that of the Pentium 3. This figure, based on information provided by Intel, shows the portion of each pipeline involved in ALU operations under branch mispredictions.

Although the new pipeline has one execution stage, the P4's two ALUs execute many operations in one-half of a clock period. Shifts and some other operations still spend one full clock period in the ALU, and these operations must start at the beginning of the period. Since the rest of the pipeline can process only two ALU operations per clock, the faster ALUs don't increase peak throughput--but they do boost sustained throughput. When two ALU operations are ready for execution, one of which depends on the results of the other, the Pentium 4 can complete the first operation in the