### Appendix C: Pipelining: Basic and Intermediate Concepts

- Key ideas and simple pipeline (Section C.1)
- Hazards (Sections C.2 and C.3)
  - Structural hazards
  - Data hazards
  - Control hazards
- Exceptions (Section C.4)
- Multicycle operations (Section C.5)



Consider an instruction that requires $n$ stages $s_1, s_2, \ldots, s_n$, taking time $t_1, t_2, \ldots, t_n$. Let $T = \Sigma t_i$.

| Without pipelining | With an n-stage pipeline |
|--------------------|--------------------------|
| Throughput =       | Throughput =             |
| Latency =          | Latency =                |
| Speedup =          |                          |

Let $\Delta > 0$ be the extra delay per stage (e.g., latches). $\Delta$ limits the useful depth of a pipeline.

With an n-stage pipeline:

$Throughput = \frac{1}{\Delta + \max t_i} < \frac{n}{T}$

$Latency = n \times (\Delta + \max t_i) \ge n\Delta + T$

$Speedup = \frac{\Sigma t_i}{\Delta + \max t_i} < n$

### Example

Let $t_{1,2,3} = 8, 12, 10$ ns and $\Delta = 2$ ns

- Throughput =
- Latency =
- Speedup =
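One way to check the blanks in this example is to evaluate the pipelined formulas (throughput $1/(\Delta + \max t_i)$, latency $n(\Delta + \max t_i)$, speedup $T/(\Delta + \max t_i)$) directly. The sketch below does that; the function name `pipeline_metrics` and the dictionary keys are illustrative choices, not part of the slides.

```python
def pipeline_metrics(stage_times_ns, delta_ns):
    """Evaluate throughput, latency, and speedup with and without pipelining.

    stage_times_ns: per-stage delays t_i in ns
    delta_ns: extra per-stage delay (e.g., latches) in ns
    """
    total = sum(stage_times_ns)              # T = sum of t_i
    cycle = delta_ns + max(stage_times_ns)   # pipelined clock period
    n = len(stage_times_ns)
    return {
        "unpipelined_throughput": 1 / total,  # instructions per ns
        "unpipelined_latency": total,         # ns
        "pipelined_throughput": 1 / cycle,    # instructions per ns
        "pipelined_latency": n * cycle,       # ns
        "speedup": total / cycle,
    }


m = pipeline_metrics([8, 12, 10], delta_ns=2)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these numbers the pipelined cycle time is 14 ns, so the latency is 42 ns and the speedup is 30/14, roughly 2.14, which is below the stage count of 3 as the bound predicts.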

### Practical Limit 3 - Hazards

$\text{Pipeline Speedup} = \frac{\text{Time}_{sequential}}{\text{Time}_{pipeline}} = \frac{\text{CPI}_{sequential}}{\text{CPI}_{pipeline}} \times \frac{\text{Cycle Time}_{sequential}}{\text{Cycle Time}_{pipeline}}$

If we ignore cycle time differences:

$\text{CPI}_{ideal-pipeline} = \frac{\text{CPI}_{sequential}}{\text{Pipeline Depth}}$

$\text{Pipeline Speedup} = \frac{\text{CPI}_{ideal-pipeline} \times \text{Pipeline Depth}}{\text{CPI}_{ideal-pipeline} + \text{Pipeline stall cycles}}$
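The last formula is easy to explore numerically. The following is a minimal sketch, assuming an ideal pipelined CPI of 1 by default; the function name and the sample stall rates are hypothetical.

```python
def pipeline_speedup(depth, stall_cycles_per_instr, ideal_cpi=1.0):
    """Speedup over the sequential machine, ignoring cycle time differences.

    Implements (CPI_ideal * depth) / (CPI_ideal + stall cycles per instruction).
    """
    return (ideal_cpi * depth) / (ideal_cpi + stall_cycles_per_instr)


# A 5-stage pipeline with no stalls reaches the full depth as speedup;
# stalls pull it below that.
for stalls in (0.0, 0.25, 1.0):
    print(stalls, pipeline_speedup(5, stalls))
```

With no stalls the speedup equals the pipeline depth; an average of 0.25 stall cycles per instruction already reduces a depth-5 pipeline to a speedup of 4.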

## Pipelining a Basic RISC ISA

Assumptions:

- Only loads and stores affect memory
  - Base register + immediate offset = effective address
- ALU operations
  - Only access registers
  - Two sources: two registers, or register and immediate
- Branches and jumps
  - Address = PC + offset
  - Comparison between a register and zero

The last assumption is different from the 6<sup>th</sup> edition of the text and results in a slightly different pipeline. We will discuss reasons and implications in class.

### A Simple Five Stage RISC Pipeline

Pipeline Stages

- IF – Instruction Fetch
- ID – Instruction decode, register read, branch computation
- EX – Execution and Effective Address
- MEM – Memory Access
- WB – Writeback

|     | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8   | 9  |
|-----|----|----|----|-----|-----|-----|-----|-----|----|
| i   | IF | ID | EX | MEM | WB  |     |     |     |    |
| i+1 |    | IF | ID | EX  | MEM | WB  |     |     |    |
| i+2 |    |    | IF | ID  | EX  | MEM | WB  |     |    |
| i+3 |    |    |    | IF  | ID  | EX  | MEM | WB  |    |
| i+4 |    |    |    |     | IF  | ID  | EX  | MEM | WB |

Pipelining really isn't this simple
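The stall-free diagram can be generated mechanically: instruction k enters IF one cycle after instruction k-1 and then advances one stage per cycle. The sketch below reproduces the five-instruction diagram under that assumption; `ideal_schedule` and `total_cycles` are illustrative names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def ideal_schedule(num_instructions):
    """Return one {cycle: stage} map per instruction, assuming no hazards."""
    return [
        {start + k + 1: stage for k, stage in enumerate(STAGES)}
        for start in range(num_instructions)
    ]


def total_cycles(num_instructions):
    """Cycles needed to drain the pipeline: depth + (instructions - 1)."""
    return len(STAGES) + num_instructions - 1


sched = ideal_schedule(5)
for i, row in enumerate(sched):
    cells = [row.get(c, "").ljust(4) for c in range(1, total_cycles(5) + 1)]
    print(f"i+{i}".ljust(5) + "".join(cells))
```

Five instructions finish in 9 cycles here, compared with 25 cycles if each instruction had to complete all five stages before the next one started.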



### Hazards

- Structural Hazards
- Data Hazards
- Control Hazards

## Handling Hazards

- Pipeline interlock logic
  - Detects hazard and takes appropriate action
- Simplest solution: stall
  - Increases CPI
  - Decreases performance
- Other solutions are harder, but have better performance



### Structural Hazards

When two *different* instructions want to use the *same* hardware resource in the *same* cycle

Stall (cause bubble)

- (+) Low cost, simple
- (-) Increases CPI
- Use for rare events
- E.g., ??

Duplicate Resource

- (+) Good performance
- (-) Increases cost (and maybe cycle time for interconnect)
- Use for cheap resources
- E.g., ALU and PC adder

### Structural Hazards, cont.

Pipeline Resource

- (+) Good performance
- (-) Often complex to do
- Use when simple to do
- E.g., write & read registers every cycle

Structural hazards are avoided if each instruction uses a resource

- At most once
- Always in the same pipeline stage
- For one cycle

($\Rightarrow$ no cycle where two instructions use the same resource)

#### Structural Hazard Example

- Loads/stores (MEM) use the same memory port as instruction fetches (IF)
- 30% of all instructions are loads and stores
- Assume CPI<sub>old</sub> is 1.5

|     | 1  | 2  | 3  | 4     | 5   | 6   | 7   | 8   | 9   | 10 |           |
|-----|----|----|----|-------|-----|-----|-----|-----|-----|----|-----------|
| i   | IF | ID | EX | MEM   | WB  |     |     |     |     |    | <- a load |
| i+1 |    | IF | ID | EX    | MEM | WB  |     |     |     |    |           |
| i+2 |    |    | IF | ID    | EX  | MEM | WB  |     |     |    |           |
| i+3 |    |    |    | stall | IF  | ID  | EX  | MEM | WB  |    |           |
| i+4 |    |    |    |       |     | IF  | ID  | EX  | MEM | WB |           |

How much faster could a new machine with two memory ports be?
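A possible way to work this question is sketched below. It rests on assumptions the slide does not state outright: every load or store costs exactly one stall cycle on the shared port, CPI<sub>old</sub> = 1.5 already includes those stalls, and the second port does not change the cycle time. The function name is hypothetical.

```python
def speedup_from_second_port(cpi_old, mem_fraction, stall_per_conflict=1):
    """Estimate the speedup from removing the IF/MEM port conflict.

    Assumes cpi_old already includes one stall per load/store and that
    cycle time is unchanged (both are assumptions, not slide content).
    """
    cpi_new = cpi_old - mem_fraction * stall_per_conflict
    return cpi_old / cpi_new, cpi_new


speedup, cpi_new = speedup_from_second_port(cpi_old=1.5, mem_fraction=0.30)
print(f"CPI_new = {cpi_new:.2f}, speedup = {speedup:.2f}")
```

Under these assumptions the new CPI is 1.2 and the two-port machine is 1.25 times faster. If the extra port lengthened the cycle time, the gain would shrink accordingly.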

### Data Hazards

When two different instructions use the same location, it must appear as if instructions execute one at a time and in the specified order

- `ADD r1, r2, _`
- `SUB r2, _, r1`
- `OR r1, _, _`

Dependence types:

- Read-After-Write (RAW, data-dependence): MOST IMPORTANT
- Write-After-Read (WAR, anti-dependence)
- Write-After-Write (WAW, output-dependence)
- NOT: Read-After-Read (RAR)
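The three dependence types can be told apart by comparing the register sets that an earlier and a later instruction write and read. The sketch below applies that idea to the ADD/SUB/OR sequence, modelling only the registers that are named on the slide (the elided operands are left out); the tuple encoding and function name are illustrative.

```python
def classify_hazards(earlier, later):
    """Classify dependences between two instructions in program order.

    Each instruction is a (writes, reads) pair of register-name sets.
    """
    e_writes, e_reads = earlier
    l_writes, l_reads = later
    hazards = set()
    if e_writes & l_reads:
        hazards.add("RAW")  # later reads what earlier writes
    if e_reads & l_writes:
        hazards.add("WAR")  # later writes what earlier reads
    if e_writes & l_writes:
        hazards.add("WAW")  # both write the same register
    return hazards


# Only the operands visible on the slide are modelled.
add = ({"r1"}, {"r2"})   # ADD r1, r2, _
sub = ({"r2"}, {"r1"})   # SUB r2, _, r1
or_ = ({"r1"}, set())    # OR  r1, _, _

print(sorted(classify_hazards(add, sub)))  # RAW on r1, WAR on r2
print(sorted(classify_hazards(add, or_)))  # WAW on r1
print(sorted(classify_hazards(sub, or_)))  # WAR on r1
```

Read-after-read never appears in the result because two reads of the same register cannot change the outcome in either order.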

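ALU result used by the next instruction:

|              | 1  | 2  | 3  | 4   | 5   | 6  |                        |
|--------------|----|----|----|-----|-----|----|------------------------|
| `ADD r1,_,_` | IF | ID | EX | MEM | WB  |    | r1 written in WB       |
| `SUB _,r1,_` |    | IF | ID | EX  | MEM | WB | r1 read in ID: NOT OK! |

Load result used by the next instruction:

|              | 1  | 2  | 3  | 4   | 5   | 6  |                        |
|--------------|----|----|----|-----|-----|----|------------------------|
| `LW r1,_,_`  | IF | ID | EX | MEM | WB  |    | r1 written in WB       |
| `SUB _,r1,_` |    | IF | ID | EX  | MEM | WB | r1 read in ID: NOT OK! |

Store followed by a load from the same address:

|                 | 1  | 2  | 3  | 4   | 5   | 6  |                              |
|-----------------|----|----|----|-----|-----|----|------------------------------|
| `SW r1,100(r0)` | IF | ID | EX | MEM | WB  |    | memory written in MEM        |
| `LW r2,100(r0)` |    | IF | ID | EX  | MEM | WB | memory read in MEM: CORRECT! |

Unless LW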








| Before:    |     |           | After:     |     |          |
|------------|-----|-----------|------------|-----|----------|
| a = b + c; | LW  | Rb,b      | a = b + c; | LW  | Rb,b     |
|            | LW  | Rc,c      |            | LW  | Rc,c     |
|            |     | <- stall  |            | LW  | Re,e     |
|            | ADD | Ra,Rb,Rc  |            | ADD | Ra,Rb,Rc |
|            | SW  | a, Ra     |            |     |          |
| d = e - f; | LW  | Re,e      | d = e - f; | LW  | Rf,f     |
|            | LW  | Rf,f      |            | SW  | a, Ra    |
|            |     | <- stall  |            | SUB | Rd,Re,Rf |
|            | SUB | Rd,Re, Rf |            | SW  | d, Rd    |
|            | SW  | d, Rd     |            |     |          |
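The benefit of the reordering can be checked by counting load-use stalls in both sequences. The sketch below assumes a pipeline with forwarding in which the only remaining stall is one cycle when an instruction reads a register loaded by the instruction immediately before it; that model and the tuple encoding `(opcode, dest, sources)` are assumptions made for illustration.

```python
def count_load_use_stalls(program):
    """Count one stall for each instruction that reads the destination
    register of the LW immediately preceding it."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_sources = curr
        if prev_op == "LW" and prev_dest in curr_sources:
            stalls += 1
    return stalls


before = [
    ("LW", "Rb", ()), ("LW", "Rc", ()),
    ("ADD", "Ra", ("Rb", "Rc")), ("SW", None, ("Ra",)),
    ("LW", "Re", ()), ("LW", "Rf", ()),
    ("SUB", "Rd", ("Re", "Rf")), ("SW", None, ("Rd",)),
]
after = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("LW", "Re", ()),
    ("ADD", "Ra", ("Rb", "Rc")), ("LW", "Rf", ()),
    ("SW", None, ("Ra",)), ("SUB", "Rd", ("Re", "Rf")),
    ("SW", None, ("Rd",)),
]

print(count_load_use_stalls(before), count_load_use_stalls(after))
```

The original order incurs two stalls (after `LW Rc,c` and after `LW Rf,f`), while the scheduled order incurs none because an independent instruction now sits between each load and its first use.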

### Other Data Hazards

- `ADD r1, r2, _`
- `SUB r2, _, r1`
- `OR r1, _, _`

Write-After-Read (WAR, anti-dependence)

|     | Instruction        | Comment             |
|-----|--------------------|---------------------|
| i   | `MULT _, (r2), r1` | /* RX mult */       |
| i+1 | `LW _, (r1)+`      | /* autoincrement */ |

Write-After-Write (WAW, output-dependence)

|     | Instruction      | Comment    |
|-----|------------------|------------|
| i   | `DIVF fr1, _, _` | /* slow */ |
| i+1 | `ADDF fr1, _, _` | /* fast */ |

# **Control Hazards**

When an instruction affects which instructions are executed *next* -- branches, jumps, calls

|     | Instruction   |
|-----|---------------|
| i   | `BEQZ r1,#8`  |
| i+1 | `SUB _,_,_`   |
| i+8 | `OR _,_,_`    |
| i+9 | `ADD _,_,_`   |

|     | 1  | 2  | 3         | 4  | 5  | 6   | 7   | 8  |
|-----|----|----|-----------|----|----|-----|-----|----|
| i   | IF | ID | EX        | MEM | WB |    |     |    |
| i+1 |    | IF | (aborted) |    |    |     |     |    |
| i+8 |    |    | IF        | ID | EX | MEM | WB  |    |
| i+9 |    |    |           | IF | ID | EX  | MEM | WB |

Handling control hazards is very important

### Handling Control Hazards

- Branch Prediction
  - Guess the direction of the branch
  - Minimize penalty when right
  - May increase penalty when wrong
- Techniques
  - Static – At compile time
  - Dynamic – At run time
- Static Techniques
  - Predict NotTaken
  - Predict Taken
  - Delayed Branches
- Dynamic techniques and more powerful static techniques later

### Handling Control Hazards, cont.

Predict NOT-TAKEN Always

Not taken:

|     | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8  |
|-----|----|----|----|-----|-----|-----|-----|----|
| i   | IF | ID | EX | MEM | WB  |     |     |    |
| i+1 |    | IF | ID | EX  | MEM | WB  |     |    |
| i+2 |    |    | IF | ID  | EX  | MEM | WB  |    |
| i+3 |    |    |    | IF  | ID  | EX  | MEM | WB |

Taken:

|     | 1  | 2  | 3         | 4   | 5   | 6   | 7   | 8  |
|-----|----|----|-----------|-----|-----|-----|-----|----|
| i   | IF | ID | EX        | MEM | WB  |     |     |    |
| i+1 |    | IF | (aborted) |     |     |     |     |    |
| i+8 |    |    | IF        | ID  | EX  | MEM | WB  |    |
| i+9 |    |    |           | IF  | ID  | EX  | MEM | WB |

Predict TAKEN Always

|      | 1  | 2    | 3  | 4   | 5   | 6   | 7   | 8  |
|------|----|------|----|-----|-----|-----|-----|----|
| i    | IF | ID   | EX | MEM | WB  |     |     |    |
| i+8  |    | 'IF' | ID | EX  | MEM | WB  |     |    |
| i+9  |    |      | IF | ID  | EX  | MEM | WB  |    |
| i+10 |    |      |    | IF  | ID  | EX  | MEM | WB |

- Must know what address to fetch at BEFORE branch is decoded
  - Not practical for our basic pipeline

## Handling Control Hazards, cont.

Delayed branch

- Execute next instruction regardless (of whether branch is taken)
- What do we execute in the DELAY SLOT?

### Delay Slots

- Fill from before branch
  - When:
  - Helps:
- Fill from target
  - When:
  - Helps:
- Fill from fall through
  - When:
  - Helps:

## Delay Slots (Cont.)

Cancelling or nullifying branch

- Instruction includes direction of prediction
- Delay instruction squashed if wrong prediction
- Allows second and third case of previous slide to be more aggressive

### Comparison of Branch Schemes

- Suppose 14% of all instructions are branches
- Suppose 65% of all branches are taken
- Suppose 50% of delay slots usefully filled

CPI penalty = % branches × (% Taken × Taken-Penalty + % Not-Taken × Not-Taken penalty)

| Branch Scheme  | Taken Penalty | Not-Taken Penalty | CPI Penalty |
|----------------|---------------|-------------------|-------------|
| Basic Branch   | 1             | 1                 | .14         |
| Not-Taken      | 1             | 0                 | .09         |
| Taken0         | 0             | 1                 | .05         |
| Taken1         | 1             | 1                 | .14         |
| Delayed Branch | .5            | .5                | .07         |
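The CPI penalty column follows from the formula and the stated percentages. As a quick check, the sketch below recomputes it; the constant and function names are illustrative, and the delayed-branch row simply uses a penalty of 0.5 in both directions to model half of the delay slots being usefully filled.

```python
BRANCH_FRACTION = 0.14  # 14% of instructions are branches
TAKEN_FRACTION = 0.65   # 65% of branches are taken


def cpi_penalty(taken_penalty, not_taken_penalty,
                branch_fraction=BRANCH_FRACTION,
                taken_fraction=TAKEN_FRACTION):
    """CPI penalty = %branches * (%taken * taken penalty
    + %not-taken * not-taken penalty)."""
    return branch_fraction * (
        taken_fraction * taken_penalty
        + (1 - taken_fraction) * not_taken_penalty
    )


schemes = {
    "Basic Branch": (1, 1),
    "Not-Taken": (1, 0),
    "Taken0": (0, 1),
    "Taken1": (1, 1),
    "Delayed Branch": (0.5, 0.5),
}
penalties = {name: cpi_penalty(*p) for name, p in schemes.items()}
for name, value in penalties.items():
    print(f"{name:15s} {value:.2f}")
```

Rounded to two places, the results reproduce the CPI penalty column (.14, .09, .05, .14, .07). Predict-taken with a zero taken penalty does best here only because most branches are taken in this workload.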

### **Real Processors**

MIPS R4000: 3 cycle branch penalty

- First cycle: cancelling delayed branch (cancel if not taken)
- Next two cycles: Predict not taken

Recent architectures:

- Because of deeper pipelines, delayed branches not very useful
- Processors rely more on hardware prediction (will see later) or may include both delayed and nondelayed branches

### Interrupts

Interrupts (a.k.a. faults, exceptions, traps) often require

- Surprise jump
- Linking of return address
- Saving of PSW (including CCs)
- State change (e.g., to kernel mode)

#### Some examples

- Arithmetic overflow
- I/O device request
- O.S. call
- Page fault

Make pipelining hard

## **One Classification of Interrupts**

1a. Synchronous: function of program and memory state (e.g., arithmetic overflow, page fault)

1b. Asynchronous: external device or hardware malfunction (printer ready, bus error)

### Handling Interrupts

Precise Interrupts (Sequential Semantics)

- Complete instructions before offending one
- Squash (effects of) instructions after
- Save PC
- Force trap instruction into IF

Must handle simultaneous interrupts

- IF –
- ID –
- EX –
- MEM –
- WB –

Which interrupt should be handled first?

### Interrupts, cont.

Example: Data Page Fault

|                    | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11 |                     |
|--------------------|----|----|----|-----|-----|-----|-----|-----|-----|-----|----|---------------------|
| i                  | IF | ID | EX | MEM | WB  |     |     |     |     |     |    |                     |
| i+1                |    | IF | ID | EX  | MEM | WB  |     |     |     |     |    | <- page fault (MEM) |
| i+2                |    |    | IF | ID  | EX  | MEM | WB  |     |     |     |    | <- squash           |
| i+3                |    |    |    | IF  | ID  | EX  | MEM | WB  |     |     |    | <- squash           |
| i+4                |    |    |    |     | IF  | ID  | EX  | MEM | WB  |     |    | <- squash           |
| i+5 (trap)         |    |    |    |     |     | IF  | ID  | EX  | MEM | WB  |    |                     |
| i+6 (trap handler) |    |    |    |     |     |     | IF  | ID  | EX  | MEM | WB |                     |

- Preceding instruction already complete
- Squash succeeding instructions
  - Prevent from modifying state
- 'Trap' instruction jumps to trap handler
- Hardware saves PC in IAR
- Trap handler must save IAR

Example: Exception in EX

|                    | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11 |                   |
|--------------------|----|----|----|-----|-----|-----|-----|-----|-----|-----|----|-------------------|
| i                  | IF | ID | EX | MEM | WB  |     |     |     |     |     |    |                   |
| i+1                |    | IF | ID | EX  | MEM | WB  |     |     |     |     |    |                   |
| i+2                |    |    | IF | ID  | EX  | MEM | WB  |     |     |     |    | <- Exception (EX) |
| i+3                |    |    |    | IF  | ID  | EX  | MEM | WB  |     |     |    | <- squash         |
| i+4                |    |    |    |     | IF  | ID  | EX  | MEM | WB  |     |    | <- squash         |
| i+5 (trap)         |    |    |    |     |     | IF  | ID  | EX  | MEM | WB  |    |                   |
| i+6 (trap handler) |    |    |    |     |     |     | IF  | ID  | EX  | MEM | WB |                   |

- Let preceding instructions complete
- Squash succeeding instructions

Example: Illegal Opcode

|                    | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11 |                    |
|--------------------|----|----|----|-----|-----|-----|-----|-----|-----|-----|----|--------------------|
| i                  | IF | ID | EX | MEM | WB  |     |     |     |     |     |    |                    |
| i+1                |    | IF | ID | EX  | MEM | WB  |     |     |     |     |    |                    |
| i+2                |    |    | IF | ID  | EX  | MEM | WB  |     |     |     |    |                    |
| i+3                |    |    |    | IF  | ID  | EX  | MEM | WB  |     |     |    | <- ill. op (ID)    |
| i+4                |    |    |    |     | IF  | ID  | EX  | MEM | WB  |     |    | <- squash          |
| i+5 (trap)         |    |    |    |     |     | IF  | ID  | EX  | MEM | WB  |    |                    |
| i+6 (trap handler) |    |    |    |     |     |     | IF  | ID  | EX  | MEM | WB |                    |

- Let preceding instructions complete
- Squash succeeding instruction

Example: Out-of-order Interrupts

|     | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8  |                     |
|-----|----|----|----|-----|-----|-----|-----|----|---------------------|
| i   | IF | ID | EX | MEM | WB  |     |     |    | <- page fault (MEM) |
| i+1 |    | IF | ID | EX  | MEM | WB  |     |    | <- page fault (IF)  |
| i+2 |    |    | IF | ID  | EX  | MEM | WB  |    |                     |
| i+3 |    |    |    | IF  | ID  | EX  | MEM | WB |                     |

Which page fault should we take?

- For precise interrupts – Post interrupts on a status vector associated with instruction, disable later writes in pipeline
  - Check interrupt bit on entering WB
  - Longer latency
- For imprecise interrupts – Handle …
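The out-of-order case can be illustrated with a small model: a fault is detected in the cycle its stage executes, but with a per-instruction status vector checked on entering WB, faults are taken in program order. The sketch below assumes a stall-free five-stage pipeline with instruction k entering IF in cycle k+1; the names and the tuple encoding `(instruction index, stage)` are hypothetical.

```python
STAGE_OFFSET = {"IF": 0, "ID": 1, "EX": 2, "MEM": 3, "WB": 4}


def detection_cycle(instr_index, stage):
    """Cycle in which a fault in `stage` of instruction `instr_index` is
    detected, assuming no stalls and IF of instruction 0 in cycle 1."""
    return instr_index + 1 + STAGE_OFFSET[stage]


def first_precise_exception(posted):
    """Pick the fault to handle first under precise interrupts.

    Faults are only acted on when their instruction enters WB, and
    instructions enter WB in program order, so the lowest index wins.
    """
    return min(posted, key=lambda fault: fault[0])


# Instruction i faults in MEM, instruction i+1 faults in IF.
faults = [(0, "MEM"), (1, "IF")]
detected_first = min(faults, key=lambda f: detection_cycle(*f))
taken_first = first_precise_exception(faults)
print("detected first:", detected_first)
print("taken first:   ", taken_first)
```

The IF fault of i+1 is detected in cycle 2, two cycles before the MEM fault of i in cycle 4, yet the precise scheme handles the fault of i first, matching sequential semantics at the cost of the longer latency noted above.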

### Interrupts, cont.

Other complications

- Odd bits of state (e.g., CCs)
- Early writes (e.g., autoincrement)
- Out-of-order execution

Interrupts come at random times

- The frequent case isn't everything
- The rare case MUST work correctly

### Multicycle Operations

- Not all operations complete in one cycle
  - Floating point arithmetic is inherently slower than integer arithmetic
  - 2 to 4 cycles for multiply or add
  - 20 to 50 cycles for divide
- Extend basic 5-stage pipeline
  - EX stage may repeat multiple times
  - Multiple function units
  - Not pipelined for now

### Handling Multicycle Operations

Four Functional Units

- EX: Integer unit
- E\*: FP/integer multiplier
- E+: FP adder
- E/: FP/integer divider

Assume EX takes one cycle & all FP units take 4

Separate integer and FP registers

All FP arithmetic from FP registers

Worry about

- Structural hazards
- RAW hazards & forwarding
- WAR & WAW between integer & FP ops

|      | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8     | 9     | 10  | 11  |
|------|----|----|----|-----|-----|-----|-----|-------|-------|-----|-----|
| int  | IF | ID | EX | MEM | WB  |     |     |       |       |     |     |
| fp\* |    | IF | ID | E\* | E\* | E\* | E\* | MEM   | WB    |     |     |
| int  |    |    | IF | ID  | EX  | MEM | WB? | (1)   |       |     |     |
| fp/  |    |    |    | IF  | ID  | E/  | E/  | E/    | E/    | MEM | WB  |
| int  |    |    |    |     | IF  | ID  | EX  | stall | MEM   | WB  | (2) |
| fp/  |    |    |    |     | (3) | IF  | ID  | stall | stall | E/  | E/  |
| int  |    |    |    |     |     | (4) | IF  | stall | stall | ID  | EX  |

Notes

- (1) WAW possible only if?
- (2) Stall forced by?
- (3) Stall forced by?
- (4) Stall forced by?

### FP Instruction Issue

- Check for RAW data hazard (in ID)
  - Wait until source registers are not used as destinations by instructions in EX that will not be available when needed
- Check for forwarding
  - Bypass data from other stages, if necessary
- Check for structural hazard in function unit
  - Wait until function unit is free (in ID)
- Check for structural hazard in MEM / WB
  - Instructions stall in ID
  - Instructions stall before MEM
  - Static priority (e.g., FU with longest latency)

## FP Instruction Issue (Cont.)

Check for WAW hazards

- `DIVF F0, F2, F4`
- `SUBF F0, F8, F10`

SUBF completes first

- (1) Stall SUBF
- (2) Abort DIVF's WB

WAR hazards?
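The WAW problem can be made concrete by computing when each instruction would reach WB. The sketch below uses hypothetical EX latencies (8 cycles for DIVF, 4 for SUBF, chosen only so that the divide is the slower operation) and assumes one instruction is fetched per cycle with no stalls; none of these numbers come from the slides.

```python
# Hypothetical EX-stage latencies in cycles, for illustration only.
LATENCY = {"DIVF": 8, "SUBF": 4}


def writeback_cycle(fetch_cycle, opcode):
    """IF in fetch_cycle, ID next, EX for LATENCY cycles, then MEM and WB."""
    return fetch_cycle + 3 + LATENCY[opcode]


def waw_violations(program):
    """Return (earlier, later) index pairs where the later instruction
    would write the same destination no later than the earlier one.

    program: list of (opcode, dest) in program order, fetched one per cycle.
    """
    wb = [(writeback_cycle(i + 1, op), dest)
          for i, (op, dest) in enumerate(program)]
    bad = []
    for a in range(len(wb)):
        for b in range(a + 1, len(wb)):
            if wb[a][1] == wb[b][1] and wb[b][0] <= wb[a][0]:
                bad.append((a, b))
    return bad


program = [("DIVF", "F0"), ("SUBF", "F0")]
print(waw_violations(program))
```

With these assumed latencies SUBF would write F0 in cycle 9 and DIVF would overwrite it in cycle 12, leaving the wrong final value. Stalling SUBF or suppressing DIVF's write, the two options listed above, both restore the sequential result.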

### More Multicycle Operations

Problems with Interrupts

- `DIVF F0, F2, F4`
- `ADDF F2, F8, F10`
- `SUBF F6, F4, F10`

ADDF and SUBF complete before DIVF

Out-of-order completion

Possible imprecise interrupt

What happens if DIVF generates an exception after ADDF and SUBF complete??

We'll discuss solutions later