[Cst-1b] Computer design ARM Thumb

Wed, 31 May 2000 12:27:46 +0100

I think the basic ARM pipeline (which we did all those diagrams for in
lectures) does stall on loads and stores.  However, I didn't think the
stalls had anything to do with memory bus congestion - the load, for
example, takes one cycle to compute the address, one cycle to do the memory
access and one cycle to store the data in the register file (= 3 cycles
total).  There's no conflict with the instruction fetching, because we're
not trying to fetch an instruction on cycle 2, when we're fetching data
(since the pipeline's stalled).
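
A rough sketch of that cycle counting, as a toy model rather than anything from the lecture notes. The instruction names and the assumption that every non-load instruction takes a single cycle are mine:

```python
# Toy cycle count for the basic ARM pipeline as described above.
# Assumption: a load occupies the execute stage for 3 cycles
# (address calculation, memory access, register write-back);
# every other instruction is assumed to take 1 cycle.
CYCLES = {"load": 3}

def execute_cycles(program):
    """Return the total execute-stage cycles for a list of instruction names."""
    return sum(CYCLES.get(op, 1) for op in program)

print(execute_cycles(["add", "load", "sub"]))  # 1 + 3 + 1 = 5
```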

The "standard RISC pipeline" doesn't normally need to stall on loads or
stores (assuming no data dependencies or anything) because it separates out
the execute, memory access & write back stages.  As far as I can see,
however, that means that on every cycle the processor may be trying to do
both a memory access and an instruction fetch...

Presumably this is ok if everything's available in the first level cache
(assuming the cache can cope with two accesses per cycle), but if it's not
then we have to stall (and possibly restart the pipeline, if I remember
correctly).  And at that point there may be some advantage to be gained from
the fact that we're doing half as many instruction fetches as normal - e.g.
if the instruction's not in the cache & the data we're accessing isn't in the
cache, we need to do two reads from main memory; however, if we happen to
have loaded the instruction in question on the previous cycle, there's only
one thing left to read this cycle.
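
To make the "half as many fetches" point concrete, here is a minimal sketch. It assumes a single memory port that can serve one access per cycle, and it ignores the knock-on effect of stalls shifting later accesses. The function name and the data-access pattern are made up for illustration:

```python
def conflicts(data_access_cycles, total_cycles, fetch_period):
    """Count cycles in which a fetch and a data access both need memory.

    fetch_period=1: one 32-bit instruction is fetched every cycle.
    fetch_period=2: each 32-bit fetch brings in two 16-bit instructions,
    so a fetch is only needed every other cycle.
    """
    fetch_cycles = set(range(0, total_cycles, fetch_period))
    return len(fetch_cycles & set(data_access_cycles))

data = [1, 4, 6, 7]            # hypothetical cycles containing a load/store
print(conflicts(data, 8, 1))   # 4: every data access collides with a fetch
print(conflicts(data, 8, 2))   # 2: only cycles 4 and 6 collide
```

Whether a given data access collides depends on which cycle it lands on, which would fit with the "sometimes" in the notes.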

I don't know if that's the "correct" answer, but it seems at least
half-sensible to me.  The situation I describe seems unlikely to occur very
often, but he did say that efficient interleaving only occurs "sometimes"...
Anyone got any better ideas?

Matthew

-----Original Message-----
From: cst-1b-admin@srcf.ucam.org [mailto:cst-1b-admin@srcf.ucam.org]On
Behalf Of M.Y.W.Y.B.
Sent: Wednesday, May 31, 2000 12:00 PM
To: cst-1b@srcf.ucam.org
Subject: Re: [Cst-1b] Computer design ARM Thumb

--On Wednesday, 31 May 2000, 09:48 +0100 "Ewan Mellor" <eem21@cam.ac.uk>
wrote:

> On Wed, 31 May 2000, Shu Yan Chan wrote:
>
>> Hi all!
>> On page 35 of the Computer Design notes, Slide 9-15,
>> it says: Since IF's occur only 50% of the time, memory operations may
>> sometimes interleave efficiently
>>
>> What does it really mean?
>
> From memory: The memory bus is still 32 bits wide, so by having an
> instruction set which is 16 bits wide we can get two instructions in one
> go.  This means that every other cycle we are not doing an instruction
> fetch, so when we access the memory bus (loads/stores), we stall less,
> because the bus isn't being used.  When you do a normal store, the
> pipeline stalls because we cannot fetch the next instruction at the start
> of the pipeline and execute the store at the same time.

I am slightly confused now. AFAIK the "standard RISC pipeline" never stalls
for a store instruction (why should it?) but might stall for a load
instruction (load delay slot) when there is a data dependency.
???

Mo