Exam #3
Exam Format
- The exam will be held in the testing center.
- Material covered: Sections 4.9 - 4.11 and Sections 5.1 - 5.5
- 30 multiple choice questions
- Closed book
- You may use any calculator that does not have internet access and does not store materials from the book or class
- You may write on the exam
- No time limit on the exam (the average student takes about 1.5 hours to complete the exam)
The following tables and figures will be given on the exam:
- Figure 5.23
Exam Preparation Suggestions
- Review and practice the in-class quizzes. Several exam problems are variations of existing quiz questions.
- Review the homework. Some exam problems are variations of homework questions.
- Review the ‘Exam Review Questions’ listed below
- Review the textbook (the review questions below indicate which areas to focus on)
- Some exam questions may come from the textbook even if they were not explicitly covered in class
- Topics not listed in the review questions will not be covered on the exam
Chapter 4
- Section 4.9 - Control Hazards
- Are control hazards easier to address than data hazards? Which is more difficult to address effectively?
- Would it be just as simple for the pipelined RISC-V processor to predict that every conditional branch is taken (rather than not taken)?
- Would this require the calculation of a value that is not required when branches are predicted not taken?
- What actions must be taken to flush or discard instructions in the pipeline that follow a mispredicted branch?
- What are the benefits of executing branch instructions earlier in the pipeline?
- What is the earliest stage in which branch instructions could execute?
- What challenges must be addressed if branches execute in ID?
- What branch predictors exist that would give better performance than simply predicting all branch outcomes as “not taken”?
- How might the compiler indicate a prediction for a given branch instruction?
- What is required to implement dynamic branch prediction?
- How is a branch prediction buffer (or branch history table) accessed, and what information does it contain?
- How does the 1-bit branch predictor work? How does the 2-bit branch predictor work and how is it better than the 1-bit branch predictor?
- What advantage does a branch target buffer offer?
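The 1-bit vs. 2-bit predictor questions above can be explored with a small simulation. This is a minimal sketch of the 2-bit saturating-counter scheme, not hardware-accurate detail: counter states 0-1 predict "not taken" and states 2-3 predict "taken", and each actual outcome moves the counter one step, so a single surprise cannot flip a strongly held prediction.

```python
def simulate(outcomes, state=0):
    """Count correct predictions from a 2-bit saturating counter.

    outcomes: list of actual branch results (True = taken).
    state: 0-1 predict not taken, 2-3 predict taken.
    """
    correct = 0
    for taken in outcomes:
        prediction = state >= 2          # predict taken in states 2 and 3
        if prediction == taken:
            correct += 1
        # saturating update: step toward 3 on taken, toward 0 on not taken
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct

# A loop branch taken 9 times then not taken once, repeated 3 times:
outcomes = ([True] * 9 + [False]) * 3
print(simulate(outcomes), "of", len(outcomes))   # prints: 25 of 30
```

Note how after the first loop pass the predictor mispredicts only the loop exit, which is exactly the advantage the 2-bit counter has over a 1-bit predictor (a 1-bit predictor would also mispredict the first iteration of every later pass).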
- Section 4.10 - Exceptions
- What are interrupts and exceptions (and what is the difference between the two terms)? Make sure you understand the book definitions.
- What are ‘precise’ exceptions, and why are they important?
- What does a processor normally do when an exception occurs?
- How does the pipelined datapath in Fig. 4.66 respond to an exception?
- Section 4.11 - Parallelism via Instructions
- What is instruction-level parallelism (ILP)?
- What challenges must be addressed in designing a pipelined processor capable of fetching, decoding, and executing multiple instructions per cycle?
- How does static multiple issue differ from dynamic multiple issue?
- What does it mean to speculatively execute instructions, and how does this differ from simple branch prediction?
- What additional hardware is required if a pipelined processor is to be made capable of executing instructions out-of-order?
- Section 4.12 - Putting It All Together: The Intel Core i7 6700 and ARM Cortex-A53
- What major differences do you see between the pipelined RISC-V processor in the text, the ARM Cortex-A53 pipeline in Fig. 4.72, and the Intel Core i7 920 pipeline in Fig. 4.74?
- What advanced ILP techniques does the Intel Core i7 use to increase performance?
- Does your experience with the class labs confirm the observation by the authors that creating a correctly executing pipeline can be very subtle (hard to get right)?
- What are some examples of complications in other instruction sets (not RISC-V) that make a pipelined implementation more difficult?
Chapter 5
- Section 5.1 - Introduction
- What is the principle of locality?
- What is temporal locality?
- What is spatial locality?
- What is a memory hierarchy?
- If the processor is at the top, where in the hierarchy is the fastest memory?
- Where is the cheapest memory (lowest cost per bit) in the hierarchy?
- Where is the largest memory in the hierarchy?
- What is the goal of a memory hierarchy?
- Why do designers use memory hierarchies?
- In the context of a memory hierarchy, what do the following terms mean?
- Block, hit rate, miss rate, hit time, miss penalty
- What must programmers know about memory to get good performance?
- How do memory hierarchies exploit temporal locality?
- How do they exploit spatial locality?
- In general, if data are at level i of a hierarchy, will they also be found at level i+1?
- Is most of the cost of a memory hierarchy at the highest level (closest to the processor)?
- Section 5.2 - Memory Technologies
- What are the four primary technologies used today in memory hierarchies?
- What are the important characteristics of each?
- Which are volatile?
- Which technology is used to implement caches?
- Which technology is used to implement main memory?
- Which technology is used as secondary memory in Personal Mobile Devices?
- Which technology is commonly used as secondary memory in servers?
- Why does DRAM require refresh?
- What is buffered internally in DRAM?
- What is the advantage of having multiple banks within a DRAM chip?
- In the context of flash memory, what is wear leveling?
- What are the tradeoffs between flash memory and magnetic disk?
- What term is used to refer to any storage that is managed in such a way as to take advantage of locality of access?
- Section 5.3 - The Basics of Caches
- What computer systems today include memory hierarchies with caches?
- What is a direct-mapped cache?
- What determines the location(s) in a cache where a block can be placed?
- What is added to the cache to determine which block is in each cache location?
- Why is it essential to add a valid bit to every cache entry?
- How is a memory address interpreted by a cache?
- What fields are the address divided into?
- What role does each field play in a cache access?
- What do the addresses of all bytes within a block have in common?
- Given a memory address and a block size, can you determine the block number that location will be a part of?
- In general, what is the impact of reducing the block size? Of increasing the block size?
- How can the optimizations of early restart and critical word first reduce the effects of a large miss penalty?
- What happens in a pipelined processor when a cache miss occurs?
- Why are writes more complicated than reads in a memory hierarchy?
- What is the advantage of a write-through scheme?
- What role could a write-buffer play with a write-through cache?
- How are writes handled in a write-back cache?
- What options exist for handling write-misses?
- For the cache depicted in Fig. 5.12, what is the block size? Is this cache direct-mapped?
- Assuming there is a separate set in the cache for each possible value of the index field, how many sets are in this cache?
- Why does this design use separate instruction and data caches?
- Can you explain why instruction miss rates are generally lower than data miss rates?
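The address-interpretation and block-number questions above come down to integer division and remainders. This sketch shows the tag/index/offset split for a direct-mapped cache; the sizes are illustrative assumptions (16-byte blocks, 64 sets), not the Fig. 5.12 configuration.

```python
BLOCK_SIZE = 16        # bytes per block -> 4 offset bits
NUM_SETS   = 64        # blocks in the cache -> 6 index bits

def split_address(addr):
    """Split a byte address into (tag, index, offset) for a
    direct-mapped cache with the sizes assumed above."""
    offset = addr % BLOCK_SIZE          # byte within the block
    block_number = addr // BLOCK_SIZE   # all bytes of a block share this
    index = block_number % NUM_SETS     # which cache entry to look in
    tag = block_number // NUM_SETS      # compared against the stored tag
    return tag, index, offset

tag, index, offset = split_address(0x12345)
print(hex(tag), index, offset)   # prints: 0x48 52 5
```

Addresses that share the same block number differ only in the offset field, which is why all bytes within a block land in the same cache entry.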
- Section 5.4 - Measuring and Improving Cache Performance
- How does the memory hierarchy affect CPU time?
- What is the impact of hit time on processor performance?
- What changes in a processor pipeline if the hit time is increased by one cycle?
- What happens in a pipelined processor when there is a cache miss?
- Assuming reads and writes are combined, what equations tell us the number of memory stall cycles?
- What is the motivation for using average memory access time (AMAT) to compare different cache designs?
- How is AMAT calculated?
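The stall-cycle and AMAT questions above both reduce to the two equations from Section 5.4. The numbers below (accesses per instruction, miss rate, penalty) are made up purely for illustration.

```python
# Memory-stall cycles per instruction, reads and writes combined:
#   stalls = (memory accesses / instruction) * miss rate * miss penalty
accesses_per_instr = 1.33   # assumed: 1 instruction fetch + 0.33 data accesses
miss_rate          = 0.05
miss_penalty       = 100    # cycles

stalls_per_instr = accesses_per_instr * miss_rate * miss_penalty

# Average memory access time:
#   AMAT = hit time + miss rate * miss penalty
hit_time = 1                # cycles
amat = hit_time + miss_rate * miss_penalty

print(round(stalls_per_instr, 2), amat)   # prints: 6.65 6.0
```

AMAT is useful for comparing cache designs because it folds hit time, miss rate, and miss penalty into a single number, so a design that lowers the miss rate but lengthens the hit time can still come out worse.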
- What are set-associative and fully associative caches?
- How do they differ from direct-mapped caches?
- In a 4-way set-associative cache, how many tags are compared on each access? Why?
- In a fully associative cache, how many tags are compared on each access?
- In general, what is the advantage of increasing the degree of associativity?
- What is the main disadvantage of increasing the associativity?
- What are the added costs of an associative cache?
- How much of a reduction in miss rate is achieved by increasing associativity?
- In what kind of caches is there a choice of which block to replace?
- What is required to implement LRU replacement in a 2-way set-associative cache?
- What is required to implement LRU replacement in a 4-way set-associative cache?
- Why does it get harder to implement LRU as the associativity grows?
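The LRU questions above can be made concrete with a toy model of one set of a 4-way set-associative cache: keep the set's tags ordered from most- to least-recently used, so the victim is always the last entry. This is a behavioral sketch only; real hardware tracks the ordering with per-set bits (a single bit suffices for 2-way, while exact LRU grows costly as associativity increases, which is why it gets approximated).

```python
WAYS = 4

def access(lru_set, tag):
    """Access `tag` in one set; return True on a hit.
    The most recently used tag is kept at the front of the list."""
    if tag in lru_set:
        lru_set.remove(tag)       # hit: move the tag to the MRU position
        lru_set.insert(0, tag)
        return True
    if len(lru_set) == WAYS:
        lru_set.pop()             # miss with a full set: evict the LRU tag
    lru_set.insert(0, tag)
    return False

s = []
hits = [access(s, t) for t in [1, 2, 3, 4, 1, 5, 2]]
print(hits)   # only the fifth access (tag 1, still resident) hits
```

Note that tag 2 misses at the end: it had drifted to the LRU position and was evicted when tag 5 arrived.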
- What is the principal benefit of having more than one level of cache in a memory hierarchy?
- If we compare the performance of two programs, will the one with fewer instructions always be the fastest?
- What can be done in software to “use the memory hierarchy well”?
- What is the main idea behind a blocked algorithm?
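The blocked-algorithm idea can be sketched with a tiled matrix multiply: operate on sub-blocks small enough to stay in the cache so each element is reused before it is evicted. The tile size below is an illustrative assumption; a real implementation would choose it from the cache capacity.

```python
BLOCK = 2   # assumed tile size, chosen for illustration only

def blocked_matmul(A, B):
    """Multiply square matrices A and B one BLOCK x BLOCK tile
    at a time, improving temporal locality in the inner loops."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, BLOCK):
        for jj in range(0, n, BLOCK):
            for kk in range(0, n, BLOCK):
                # accumulate the contribution of one tile of A and B
                for i in range(ii, min(ii + BLOCK, n)):
                    for j in range(jj, min(jj + BLOCK, n)):
                        s = C[i][j]
                        for k in range(kk, min(kk + BLOCK, n)):
                            s += A[i][k] * B[k][j]
                        C[i][j] = s
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(blocked_matmul(A, B))   # same result as an ordinary matmul
```

The arithmetic is identical to the straightforward triple loop; only the order of the memory accesses changes, which is the entire point of blocking.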
- Section 5.5 - Dependable Memory Hierarchy
- What is an error detection code?
- How does Hamming’s error correction code identify which bit is in error?
- How many parity bits are required for each 64-bit block of data in a SEC/DED code?
- How do you compute the parity bits for an 8-bit data word?
- How do you decode and correct errors with a SEC/DED Hamming code?
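The Hamming questions above can be worked through with a small encoder/decoder for an 8-bit data word. This sketch assumes the standard layout in which bit positions are numbered from 1 and the parity bits sit at the power-of-2 positions (1, 2, 4, 8), so 4 check bits cover 8 data bits for 12 bits total; a SEC/DED code would add one more overall parity bit on top of this.

```python
def encode(data_bits):
    """Place 8 data bits into a 12-bit Hamming SEC codeword and
    fill in the 4 parity bits (even parity). Positions are 1-based."""
    code = [0] * 13               # index 0 unused for 1-based positions
    data_positions = [p for p in range(1, 13) if p & (p - 1)]  # not powers of 2
    for pos, bit in zip(data_positions, data_bits):
        code[pos] = bit
    for p in (1, 2, 4, 8):        # parity bit p covers positions with bit p set
        code[p] = sum(code[i] for i in range(1, 13) if i & p) % 2
    return code[1:]

def syndrome(code):
    """XOR together the 1-based positions holding a 1.
    Result 0 means no error; otherwise it names the flipped position."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s

word = [1, 0, 1, 1, 0, 0, 1, 0]
code = encode(word)
assert syndrome(code) == 0        # a clean codeword checks out
code[4] ^= 1                      # flip the bit at position 5
print(syndrome(code))             # prints: 5, naming the bad position
```

Decoding a single-bit error is then just flipping the bit at the position the syndrome names, which is exactly how Hamming's code identifies which bit is in error.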