Kyushu University, Summer 2023: Computer Organization
Answer key by 偷偷
Update requests | tutoring | part-time private tutoring | contact 偷偷: LifeGoesOn_Rio
[Q2] Let us consider a microprocessor having a pipelined datapath. Program execution time ET can be represented by using the three parameters, instruction count required to complete the whole program execution IC, the number of executed instructions per clock cycle IPC, and clock frequency F, as shown in the following equation.
$$ET = \frac{IC}{IPC \times F}$$
Answer the following questions.
(1) Consider an in-order microprocessor whose instruction issue width is one. Explain the effects of pipelined datapath implementation on IC, IPC, and F, compared to a single-cycle datapath implementation where each instruction is executed in a single clock cycle, respectively. If there are no effects on each parameter, answer “no effects.”
IC: No effects. The instruction count is determined by the program being executed and is independent of whether the processor uses a pipelined or a single-cycle datapath. The same number of instructions will be executed in both cases.
IPC: The single-cycle datapath completes exactly one instruction per cycle, so its IPC is 1. A single-issue in-order pipeline can also complete at most one instruction per cycle, so the ideal IPC is still 1; in practice IPC falls slightly below 1 because data hazards, control hazards, and structural hazards force stalls and flushes. Pipelining therefore does not raise IPC; the performance gain comes from the clock frequency.
F: The clock frequency F can be made substantially higher. Each pipeline stage contains only a fraction of the logic of the single-cycle datapath, so the critical path per clock cycle shrinks to roughly the longest stage delay plus pipeline-register overhead, which allows a much faster clock than processing the entire instruction in one cycle.
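As a rough illustration of how these effects combine in the ET equation, the following minimal sketch plugs hypothetical numbers into $ET = \frac{IC}{IPC \times F}$; the instruction count, stage delays, and the stall-induced IPC below 1 are assumptions for illustration, not values from the exam.

```python
# Hypothetical comparison of single-cycle vs. pipelined datapaths
# using ET = IC / (IPC * F). All numbers below are illustrative assumptions.

IC = 1_000_000                 # assumed dynamic instruction count (same for both)

# Single-cycle: the whole instruction fits in one (long) cycle.
single_cycle_period_ns = 5.0   # assumed critical path of the full datapath
F_single = 1 / (single_cycle_period_ns * 1e-9)   # Hz
IPC_single = 1.0               # exactly one instruction completes per cycle

# Pipelined (single-issue, in-order): shorter cycle, but hazards cause stalls.
pipelined_period_ns = 1.2      # assumed longest stage delay + register overhead
F_pipe = 1 / (pipelined_period_ns * 1e-9)
IPC_pipe = 0.8                 # assumed: stalls and flushes push IPC below the ideal 1.0

ET_single = IC / (IPC_single * F_single)
ET_pipe = IC / (IPC_pipe * F_pipe)

print(f"single-cycle ET = {ET_single * 1e3:.2f} ms")
print(f"pipelined    ET = {ET_pipe * 1e3:.2f} ms")
print(f"speedup         = {ET_single / ET_pipe:.2f}x")
```

With these assumed numbers the pipelined machine is about 3.3 times faster even though its IPC is lower, because F rises by roughly a factor of four.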
(2) We extend the pipelined datapath by increasing the instruction issue width from one to two that forms an in-order superscalar microprocessor, i.e., at most two independent instructions can be executed in parallel. Explain the effects of this extension on IC, IPC, and F, respectively. If there are no effects on each parameter, answer “no effects.”
IC: No effects. The instruction count is still determined by the program being executed.
IPC: In general, increasing the issue width can increase IPC, provided the program contains enough independent instructions to fill both issue slots. Dependencies between instructions (data hazards and control hazards) limit how often both slots can be used, so the actual gain is usually less than a factor of two.
F: Essentially no effect, since the pipeline stage delays are unchanged; however, the clock frequency may decrease slightly because the wider issue logic, extra register-file ports, and hazard checks for two instructions per cycle add complexity to the critical path.
(3) Assume that the instruction issue width is four. Answer the upper limit of IPC that can be achieved by the pipelined datapath.
Ans: 4. With an issue width of four, at most four instructions can be issued (and therefore completed) per clock cycle, so IPC is bounded above by 4.
[Q3] Consider computer memory systems. Assume a direct-mapped cache memory implemented in a microprocessor chip. The microprocessor uses word addressing, the word size is 4 bytes, the cache size is 16 bytes, the block size is 4 bytes, and the address width is 4 bits. Suppose the cache was initially empty, and the memory access sequence for the following word addresses (represented in the binary numeral system) has occurred. 1101 ⇒ 1010 ⇒ 1111 ⇒ 1101
Then we have the following five memory accesses (memory accesses ① to ⑤) consecutively.
① 1010 ⇒ ② 1001 ⇒ ③ 1000 ⇒ ④ 0011 ⇒ ⑤ 1111
Answer the following questions.
(1) Find all of the memory accesses among ① -⑤ that cause a cache hit. If there is no corresponding memory access, answer “not applicable.”
Hint: word addressing, so each 4-bit address identifies one 4-byte word. The address field widths follow directly from the cache parameters (a worked sketch of this breakdown follows the list):
- Block offset: the block size is 4 bytes = 1 word, so each block holds a single addressable word and no offset bits are needed (0 bits).
- Index bits: the number of cache blocks is $\frac{16\ \text{bytes}}{4\ \text{bytes}} = 4 = 2^2$, so the index requires 2 bits.
- Tag bits: the remaining $4 - 2 - 0 = 2$ bits form the tag.
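A minimal sketch of this field breakdown; the constant names and the helper `split` are my own, but the parameter values come from the problem statement.

```python
import math

# Cache parameters from the problem statement (word addressing).
WORD_BYTES = 4
CACHE_BYTES = 16
BLOCK_BYTES = 4
ADDR_BITS = 4          # width of a word address

words_per_block = BLOCK_BYTES // WORD_BYTES          # 1
num_blocks = CACHE_BYTES // BLOCK_BYTES              # 4
offset_bits = int(math.log2(words_per_block))        # 0
index_bits = int(math.log2(num_blocks))              # 2
tag_bits = ADDR_BITS - index_bits - offset_bits      # 2

def split(addr: int):
    """Split a word address into (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

print(f"tag={tag_bits} bits, index={index_bits} bits, offset={offset_bits} bits")
print(split(0b1101))   # -> (3, 1, 0): tag 11, index 01
```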
Before (cache contents after the initial sequence 1101 ⇒ 1010 ⇒ 1111 ⇒ 1101):

| Cache block (index) | Stored word (tag) |
|---|---|
| 00 | empty |
| 01 | 1101 (tag 11) |
| 10 | 1010 (tag 10) |
| 11 | 1111 (tag 11) |
After (accesses ① to ⑤):
- ① 1010: tag 10, index 10; matches the stored tag → hit
- ② 1001: tag 10, index 01; stored tag is 11 → miss (first access to 1001); block 01 now holds 1001
- ③ 1000: tag 10, index 00; block 00 is empty → miss (first access); block 00 now holds 1000
- ④ 0011: tag 00, index 11; stored tag is 11 → miss (first access); block 11 now holds 0011
- ⑤ 1111: tag 11, index 11; stored tag is now 00 → miss (1111 was evicted by ④)

Answer: only memory access ① causes a cache hit.
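The whole sequence can be replayed with a small direct-mapped cache simulator. This is a sketch, not part of the exam answer; it reports hit or miss for each access and whether a miss is the first reference to its block.

```python
# Replay the access sequence on a 4-block direct-mapped cache (1 word per block).
NUM_BLOCKS = 4
INDEX_BITS = 2                     # low 2 bits of the word address
cache = [None] * NUM_BLOCKS        # tag stored in each block, None = empty
seen_blocks = set()                # blocks referenced so far (for compulsory misses)

warmup = [0b1101, 0b1010, 0b1111, 0b1101]
accesses = [0b1010, 0b1001, 0b1000, 0b0011, 0b1111]   # accesses ① to ⑤

def access(addr, label=""):
    index = addr & (NUM_BLOCKS - 1)
    tag = addr >> INDEX_BITS
    hit = cache[index] == tag
    compulsory = (not hit) and (addr not in seen_blocks)
    seen_blocks.add(addr)
    if not hit:
        cache[index] = tag         # fill (or replace) the block on a miss
    kind = "hit" if hit else ("compulsory miss" if compulsory else "miss")
    if label:
        print(f"{label} {addr:04b}: {kind}")

for a in warmup:
    access(a)
for label, a in zip("①②③④⑤", accesses):
    access(a, label)
```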
(2) Find all of the memory accesses among ① -⑤ that cause a compulsory miss. If there is no corresponding memory access, answer “not applicable.”
Compulsory misses: ②, ③, and ④. Each is the first access to its memory block, so the miss would occur even with an infinitely large cache. ⑤ is not compulsory because 1111 had already been brought into the cache and was later evicted.
(3) Explain how to modify this cache memory to reduce the conflict miss. Also, if there are any demerits to the modification, explain them.
Increase the associativity, for example by making the cache 2-way set associative (or, in the extreme, fully associative). Each index then selects a set of multiple cache lines, and a block can be placed in any line of its set, so two blocks that map to the same index no longer have to evict each other, which reduces conflict misses. Demerits: extra hardware is needed (one tag comparator per way, a wider data multiplexer, and replacement logic such as LRU), which increases area and power and can lengthen the hit time, possibly lowering the clock frequency.
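As an illustration of the extra bookkeeping this modification implies, here is a minimal 2-way set-associative lookup with LRU replacement. It is a sketch under the same 16-byte, 4-byte-block parameters; the structure and names are my own choices, not part of the exam answer.

```python
# 2-way set-associative version of the same 16-byte cache:
# 4 blocks total -> 2 sets of 2 ways, so the index shrinks to 1 bit.
NUM_SETS = 2
INDEX_BITS = 1

# Each set is a list of tags ordered from most to least recently used.
sets = [[] for _ in range(NUM_SETS)]

def access(addr):
    index = addr & (NUM_SETS - 1)
    tag = addr >> INDEX_BITS
    ways = sets[index]
    if tag in ways:                 # hit: move the tag to the MRU position
        ways.remove(tag)
        ways.insert(0, tag)
        return "hit"
    if len(ways) == 2:              # set full: evict the LRU way
        ways.pop()                  #   (this replacement logic is the extra hardware cost)
    ways.insert(0, tag)
    return "miss"

for a in [0b1101, 0b1010, 0b1111, 0b1101, 0b1010, 0b1001, 0b1000, 0b0011, 0b1111]:
    print(f"{a:04b}: {access(a)}")
```

The point of the sketch is the mechanism: every lookup now compares the tag against both ways of the selected set and must maintain LRU order, which is exactly the hardware overhead mentioned as a demerit above.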