Computer Organization - Ch5 Exploiting Memory Hierarchy | Mr. Opengate

2015年1月22日星期四

Computer Organization - Ch5 Exploiting Memory Hierarchy

於 1/22/2015 07:21:00 下午沒有留言:

標籤： Computer Science-Computer Organization

Chapter 5 Exploiting Memory Hierarchy

5.1 Direct Mapped Cache

基本Cache存取相關名詞

Block : unit of copying , may be multiple words
Hit : access satisfied by upper level
Miss : block copied from lower level
Hit ratio : hits/accesses
Miss ratio : misses/accesses = 1 – hit ratio

使用cache有效增加效率的原因

Temporal locality：instructions in a loop, induction variables
Spatial locality：sequential instruction access, array data

Direct mapped：只有hash和定址而得到的一個項，沒有搜尋

Address的內容

[tag]-[index]-[byte offset]
tag : 記錄現在在cache block中的是memory的哪一個block
index : 用來對每一個block定址(global)
offset : 用來對每一個byte/word定址(local)

Cache Memory的內容

[Index]-[Valid bit]-[Tag]-[Data]
Valid bit : 1 = present, 0 = not present

Direct mapped example

若cache要存放的資料量為16Kbyte, address為32 bits, 每一個block大小為4 words, 則Cache大小為多少?

首先求Entry數目，$Entry數目 = 1k = 2^{10}$，因此 $index = 10$
再求block內的word數目，$4 = 2^2$，因此 $word\ offset = 2$
再求word內的byte數目，很明顯是，$4 = 2^2$，因此 $byte\ offset = 2$
最後求tag，為 $address - ( index + offset )$
$tag = 32 – 10(index) - 2(word\ offset) - 2(byte\ offset)$
最終得Cache大小為
$2^{10} \times (1(valid) + 18(tag) + 4 \times 32(block size)) = 147000 bits$

當題目有提到word的字眼就用block offset，沒有的話，就只用byte offset即可

entry數 => Index bit
Block的word數 => block offset
N-bit Address => byte offset

5.2 Cache Miss

Block size issue

當block Size越來越大，會降低 miss rate，因為spatial locality。
block越大miss penalty越嚴重，因為要搬更多東西。
在固定的cache size下，太大的block Size :

會使block數目變少而miss rate上升
會讓沒有物盡其用的情況變嚴重，白白複製一堆東西到cache

Data Write issue

write-through : 同時更改cache和memory的的資料，效能差但可用write buffer使CPU繼續執行來改善，當發生write miss時，有兩種處理方式。

write around : 不存取cache直接去寫main memory
allocate on miss : 存取cache一起改

write-back : 先寫到cache並註記為dirty，當dirty block整個被替換時才寫回memory

Cache Performance：Average memory access time (AMAT) , AMAT = Hit time + Miss rate × Miss penalty

5.3 Cache Miss improvement

Associative Cache：增加空間使用的彈性，引入search的概念 (原本只有map和hash) ，使miss rate降低，硬體可平行做search。

direct mapped : hash 的概念 (block 位址直接mod n)
2-way set associative : 半hash、半search
fully associative : 全search

因為關係到search的查找模式，所以有 replacement policy

Least-recently used (LRU)
Random

multi-level caches

第一層 cache：希望將 hit time 減到最小
第二層 cache：希望將miss rate減到最小避免查找主記憶體，Hit time 影響較小

實務上的結果

L-1 cache 大小< single cache 大小
L-1 block 大小 < L-2 block 大小
example：考慮一個兩層架構的cache

Access time = 5 ns，Main memory access time = 100ns ，Clock cycle = 1/4GHz = 0.25ns (算次數) Global miss rate to main memory = 0.5 % Miss rate/instruction = 2% (算出現機率)
CPI = 1 + 0.02 × 20 + 0.005 × 400 = 3.4

memory hierarchy interaction with software

Radix Sort 複雜度較小，花的instruction少
Radix Sort 要用bocket額外暫存，miss rate 較高
所以，總和起來還是quick sort快

沒有留言:

張貼留言

訂閱：張貼留言 (Atom)

© 2025 Mr. Opengate