Exam #3
Exam Format
- The exam will be held in the testing center.
- Material covered: Sections 4.9 - 4.11 and Sections 5.1 - 5.5
- 30 multiple choice questions
- Closed book
- You may use any calculator that does not have internet access and does not store materials from the book or class
- You may write on the exam
- No time limit on the exam (the average student takes about 1.5 hours to complete the exam)
The following tables and figures will be given on the exam:
- Figure 5.23
Exam Preparation Suggestions
- Review and practice the in-class quizzes. Several exam problems are variations of existing quiz questions.
- Review the homework. Some exam problems are variations of homework questions.
- Review the ‘Exam Review Questions’ listed below
- Review the textbook (the review questions below indicate which areas to focus on)
- Some exam questions may come from the textbook even if they were not explicitly covered in class
- Topics not listed in the review questions will not be covered on the exam
Chapter 4
- Section 4.9 - Control Hazards
- Are control hazards easier to address than data hazards? Which is more difficult to address effectively?
- Would it be just as simple for the pipelined RISC-V processor to predict that every conditional branch is taken (rather than not taken)?
- Would this require the calculation of a value that is not required when branches are predicted not taken?
- What actions must be taken to flush or discard instructions in the pipeline that follow a mispredicted branch?
- What are the benefits of executing branch instructions earlier in the pipeline?
- What is the earliest stage in which branch instructions could execute?
- What challenges must be addressed if branches execute in ID?
- What branch predictors exist that would give better performance than simply predicting all branch outcomes as “not taken”?
- How might the compiler indicate a prediction for a given branch instruction?
- What is required to implement dynamic branch prediction?
- How is a branch prediction buffer (or branch history table) accessed, and what information does it contain?
- How does the 1-bit branch predictor work? How does the 2-bit branch predictor work and how is it better than the 1-bit branch predictor?
- What advantage does a branch target buffer offer?
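The 1-bit vs. 2-bit predictor questions above can be explored with a small simulation. This is a minimal sketch of the 2-bit saturating-counter scheme, not hardware-accurate detail: counter states 0-1 predict "not taken" and states 2-3 predict "taken", and each actual outcome moves the counter one step, so a single surprise cannot flip a strongly held prediction.

```python
def simulate(outcomes, state=0):
    """Count correct predictions from a 2-bit saturating counter.

    outcomes: list of actual branch results (True = taken).
    state: 0-1 predict not taken, 2-3 predict taken.
    """
    correct = 0
    for taken in outcomes:
        prediction = state >= 2          # predict taken in states 2 and 3
        if prediction == taken:
            correct += 1
        # saturating update: step toward 3 on taken, toward 0 on not taken
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct

# A loop branch taken 9 times then not taken once, repeated 3 times:
outcomes = ([True] * 9 + [False]) * 3
print(simulate(outcomes), "of", len(outcomes))   # prints: 25 of 30
```

Note how after the first loop pass the predictor mispredicts only the loop exit, which is exactly the advantage the 2-bit counter has over a 1-bit predictor (a 1-bit predictor would also mispredict the first iteration of every later pass).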
- Section 4.10 - Exceptions
- What are interrupts and exceptions (and what is the difference between the two terms)? Make sure you understand the book definitions.
- What are ‘precise’ exceptions, and why are they important?
- What does a processor normally do when an exception occurs?
- How does the pipelined datapath in Fig. 4.66 respond to an exception?
- Section 4.11 - Parallelism via Instructions
- What is instruction-level parallelism (ILP)?
- What challenges must be addressed in designing a pipelined processor capable of fetching, decoding, and executing multiple instructions per cycle?
- How does static multiple issue differ from dynamic multiple issue?
- What does it mean to speculatively execute instructions, and how does this differ from simple branch prediction?
- What additional hardware is required if a pipelined processor is to be made capable of executing instructions out-of-order?
- Section 4.12 - Putting It All Together: The Intel Core i7 6700 and ARM Cortex-A53
- What major differences do you see between the pipelined RISC-V processor in the text, the ARM Cortex-A53 pipeline in Fig. 4.72, and the Intel Core i7 920 pipeline in Fig. 4.74?
- What advanced ILP techniques does the Intel Core i7 use to increase performance?
- Does your experience with the class labs confirm the observation by the authors that creating a correctly executing pipeline can be very subtle (hard to get right)?
- What are some examples of complications in other instruction sets (not RISC-V) that make a pipelined implementation more difficult?
Chapter 5
- Section 5.1 - Introduction
- What is the principle of locality?
- What is temporal locality?
- What is spatial locality?
- What is a memory hierarchy?
- If the processor is at the top, where in the hierarchy is the fastest memory?
- Where is the cheapest memory (lowest cost per bit) in the hierarchy?
- Where is the largest memory in the hierarchy?
- What is the goal of a memory hierarchy?
- Why do designers use memory hierarchies?
- In the context of a memory hierarchy, what do the following terms mean?
- Block, hit rate, miss rate, hit time, miss penalty
- What must programmers know about memory to get good performance?
- How do memory hierarchies exploit temporal locality?
- How do they exploit spatial locality?
- In general, if data are at level i of a hierarchy, will they also be found at level i+1?
- Is most of the cost of a memory hierarchy at the highest level (closest to the processor)?
- Section 5.2 - Memory Technologies
- What are the four primary technologies used today in memory hierarchies?
- What are the important characteristics of each?
- Which are volatile?
- Which technology is used to implement caches?
- Which technology is used to implement main memory?
- Which technology is used as secondary memory in Personal Mobile Devices?
- Which technology is commonly used as secondary memory in servers?
- Why does DRAM require refresh?
- What is buffered internally in DRAM?
- What is the advantage of having multiple banks within a DRAM chip?
- In the context of flash memory, what is wear leveling?
- What are the tradeoffs between flash memory and magnetic disk?
- What term is used to refer to any storage that is managed in such a way as to take advantage of locality of access?
- Section 5.3 - The Basics of Caches
- What computer systems today include memory hierarchies with caches?
- What is a direct-mapped cache?
- What determines the location(s) in a cache where a block can be placed?
- What is added to the cache to determine which block is in each cache location?
- Why is it essential to add a valid bit to every cache entry?
- How is a memory address interpreted by a cache?
- What fields are the address divided into?
- What role does each field play in a cache access?
- What do the addresses of all bytes within a block have in common?
- Given a memory address and a block size, can you determine the block number that location will be a part of?
- In general, what is the impact of reducing the block size? Of increasing the block size?
- How can the optimizations of early restart and critical word first reduce the effects of a large miss penalty?
- What happens in a pipelined processor when a cache miss occurs?
- Why are writes more complicated than reads in a memory hierarchy?
- What is the advantage of a write-through scheme?
- What role could a write-buffer play with a write-through cache?
- How are writes handled in a write-back cache?
- What options exist for handling write-misses?
- For the cache depicted in Fig. 5.12, what is the block size? Is this cache direct-mapped?
- Assuming there is a separate set in the cache for each possible value of the index field, how many sets are in this cache?
- Why does this design use separate instruction and data caches?
- Can you explain why instruction miss rates are generally lower than data miss rates?
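The address-interpretation and block-number questions above come down to integer division and remainders. This sketch shows the tag/index/offset split for a direct-mapped cache; the sizes are illustrative assumptions (16-byte blocks, 64 sets), not the Fig. 5.12 configuration.

```python
BLOCK_SIZE = 16        # bytes per block -> 4 offset bits
NUM_SETS   = 64        # blocks in the cache -> 6 index bits

def split_address(addr):
    """Split a byte address into (tag, index, offset) for a
    direct-mapped cache with the sizes assumed above."""
    offset = addr % BLOCK_SIZE          # byte within the block
    block_number = addr // BLOCK_SIZE   # all bytes of a block share this
    index = block_number % NUM_SETS     # which cache entry to look in
    tag = block_number // NUM_SETS      # compared against the stored tag
    return tag, index, offset

tag, index, offset = split_address(0x12345)
print(hex(tag), index, offset)   # prints: 0x48 52 5
```

Addresses that share the same block number differ only in the offset field, which is why all bytes within a block land in the same cache entry.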
- Section 5.4 - Measuring and Improving Cache Performance
- How does the memory hierarchy affect CPU time?
- What is the impact of hit time on processor performance?
- What changes in a processor pipeline if the hit time is increased by one cycle?
- What happens in a pipelined processor when there is a cache miss?
- Assuming reads and writes are combined, what equations tell us the number of memory stall cycles?
- What is the motivation for using average memory access time (AMAT) to compare different cache designs?
- How is AMAT calculated?
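The stall-cycle and AMAT questions above both reduce to the two equations from Section 5.4. The numbers below (accesses per instruction, miss rate, penalty) are made up purely for illustration.

```python
# Memory-stall cycles per instruction, reads and writes combined:
#   stalls = (memory accesses / instruction) * miss rate * miss penalty
accesses_per_instr = 1.33   # assumed: 1 instruction fetch + 0.33 data accesses
miss_rate          = 0.05
miss_penalty       = 100    # cycles

stalls_per_instr = accesses_per_instr * miss_rate * miss_penalty

# Average memory access time:
#   AMAT = hit time + miss rate * miss penalty
hit_time = 1                # cycles
amat = hit_time + miss_rate * miss_penalty

print(round(stalls_per_instr, 2), amat)   # prints: 6.65 6.0
```

AMAT is useful for comparing cache designs because it folds hit time, miss rate, and miss penalty into a single number, so a design that lowers the miss rate but lengthens the hit time can still come out worse.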
- What are set-associative and fully associative caches?
- How do they differ from direct-mapped caches?
- In a 4-way set-associative cache, how many tags are compared on each access? Why?
- In a fully associative cache, how many tags are compared on each access?
- In general, what is the advantage of increasing the degree of associativity?
- What is the main disadvantage of increasing the associativity?
- What are the added costs of an associative cache?
- How much of a reduction in miss rate is achieved by increasing associativity?
- In what kind of caches is there a choice of which block to replace?
- What is required to implement LRU replacement in a 2-way set-associative cache?
- What is required to implement LRU replacement in a 4-way set-associative cache?
- Why does it get harder to implement LRU as the associativity grows?
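The LRU questions above can be made concrete with a toy model of one set of a 4-way set-associative cache: keep the set's tags ordered from most- to least-recently used, so the victim is always the last entry. This is a behavioral sketch only; real hardware tracks the ordering with per-set bits (a single bit suffices for 2-way, while exact LRU grows costly as associativity increases, which is why it gets approximated).

```python
WAYS = 4

def access(lru_set, tag):
    """Access `tag` in one set; return True on a hit.
    The most recently used tag is kept at the front of the list."""
    if tag in lru_set:
        lru_set.remove(tag)       # hit: move the tag to the MRU position
        lru_set.insert(0, tag)
        return True
    if len(lru_set) == WAYS:
        lru_set.pop()             # miss with a full set: evict the LRU tag
    lru_set.insert(0, tag)
    return False

s = []
hits = [access(s, t) for t in [1, 2, 3, 4, 1, 5, 2]]
print(hits)   # only the fifth access (tag 1, still resident) hits
```

Note that tag 2 misses at the end: it had drifted to the LRU position and was evicted when tag 5 arrived.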
- What is the principal benefit of having more than one level of cache in a memory hierarchy?
- If we compare the performance of two programs, will the one with fewer instructions always be the fastest?
- What can be done in software to “use the memory hierarchy well”?
- What is the main idea behind a blocked algorithm?
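The blocked-algorithm idea can be sketched with a tiled matrix multiply: operate on sub-blocks small enough to stay in the cache so each element is reused before it is evicted. The tile size below is an illustrative assumption; a real implementation would choose it from the cache capacity.

```python
BLOCK = 2   # assumed tile size, chosen for illustration only

def blocked_matmul(A, B):
    """Multiply square matrices A and B one BLOCK x BLOCK tile
    at a time, improving temporal locality in the inner loops."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, BLOCK):
        for jj in range(0, n, BLOCK):
            for kk in range(0, n, BLOCK):
                # accumulate the contribution of one tile of A and B
                for i in range(ii, min(ii + BLOCK, n)):
                    for j in range(jj, min(jj + BLOCK, n)):
                        s = C[i][j]
                        for k in range(kk, min(kk + BLOCK, n)):
                            s += A[i][k] * B[k][j]
                        C[i][j] = s
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(blocked_matmul(A, B))   # same result as an ordinary matmul
```

The arithmetic is identical to the straightforward triple loop; only the order of the memory accesses changes, which is the entire point of blocking.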
- Section 5.5 - Dependable Memory Hierarchy
- What is an error detection code?
- How does Hamming’s error correction code identify which bit is in error?
- How many parity bits are required for each 64-bit block of data in a SEC/DED code?
- How do you compute the parity bits for an 8-bit data word?
- How do you decode and correct errors with a SEC/DED Hamming code?
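The Hamming questions above can be worked through with a small encoder/decoder for an 8-bit data word. This sketch assumes the standard layout in which bit positions are numbered from 1 and the parity bits sit at the power-of-2 positions (1, 2, 4, 8), so 4 check bits cover 8 data bits for 12 bits total; a SEC/DED code would add one more overall parity bit on top of this.

```python
def encode(data_bits):
    """Place 8 data bits into a 12-bit Hamming SEC codeword and
    fill in the 4 parity bits (even parity). Positions are 1-based."""
    code = [0] * 13               # index 0 unused for 1-based positions
    data_positions = [p for p in range(1, 13) if p & (p - 1)]  # not powers of 2
    for pos, bit in zip(data_positions, data_bits):
        code[pos] = bit
    for p in (1, 2, 4, 8):        # parity bit p covers positions with bit p set
        code[p] = sum(code[i] for i in range(1, 13) if i & p) % 2
    return code[1:]

def syndrome(code):
    """XOR together the 1-based positions holding a 1.
    Result 0 means no error; otherwise it names the flipped position."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s

word = [1, 0, 1, 1, 0, 0, 1, 0]
code = encode(word)
assert syndrome(code) == 0        # a clean codeword checks out
code[4] ^= 1                      # flip the bit at position 5
print(syndrome(code))             # prints: 5, naming the bad position
```

Decoding a single-bit error is then just flipping the bit at the position the syndrome names, which is exactly how Hamming's code identifies which bit is in error.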