
CPU simulator functions A2

 Introduction

Our CPU simulator functions, but most real CPUs have at least one level of cache memory in the memory hierarchy. Let’s add a cache memory to our simulator so that we can run some experiments on settings related to cache memory to see how performance might be affected.
Questions
Part 1
Add a fully associative cache to the simulator implemented in assignment 1. You must use the sample solution from assignment 1 as the basis for doing this assignment.
Consider the following in your implementation:
There is only a data cache; all code fetches must go straight to the code area.
The cache is empty at the beginning of a program’s execution. This means you have no valid cache directory entries. The actual contents of your cache memory can be filled with 0xFF, the same as main memory and code memory.
The cache uses an LRU replacement policy.
The cache uses a write back update policy. Make sure all data is written back to main memory at the end of execution (e.g., after the program you’re running crashes).
The cache uses a demand fetch policy.
Transfer all words of a block from memory to the cache (and vice versa) before continuing processing (i.e., the cache read/write is completed within a single phase of execution).
Upon completion, in addition to existing output, the simulator will print a report indicating the cache hits and misses and the hit rate achieved (prior to printing the memory contents).
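The policies listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the sample solution: the names (find_or_load, cache_read, cache_write, flush_cache, print_cache_report), the Word type, the 256-word memory size, the default BLOCK_SIZE, the report format, and the choice to load a block on a write miss (write-allocate) are all hypothetical. Filling memory with 0xFF at start-up is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

#ifndef CACHE_BLOCKS
#define CACHE_BLOCKS 4
#endif
#ifndef BLOCK_SIZE
#define BLOCK_SIZE 2  // assumed default; the assignment does not give one
#endif

using Word = std::uint8_t;              // assumed word type
constexpr unsigned kCacheBlocks = CACHE_BLOCKS;
constexpr unsigned kBlockSize = BLOCK_SIZE;
constexpr unsigned kMemoryWords = 256;  // hypothetical data memory size, in words
constexpr unsigned kMemoryBlocks = kMemoryWords / kBlockSize;

struct DirectoryEntry {
    bool valid = false;
    bool dirty = false;
    unsigned tag = 0;             // block number in main memory
    unsigned long last_used = 0;  // access_clock value at the most recent use
};

static Word main_memory[kMemoryBlocks][kBlockSize];
static Word cache_data[kCacheBlocks][kBlockSize];
static DirectoryEntry directory[kCacheBlocks];
static unsigned long access_clock = 0, hits = 0, misses = 0;

// Write-back policy: copy a block to main memory only if it is dirty.
static void write_back(unsigned slot) {
    if (!directory[slot].valid || !directory[slot].dirty) return;
    for (unsigned w = 0; w < kBlockSize; ++w)
        main_memory[directory[slot].tag][w] = cache_data[slot][w];
    directory[slot].dirty = false;
}

// Return the slot holding `address`, fetching its block on demand after a miss.
static unsigned find_or_load(unsigned address) {
    assert(address < kMemoryWords);
    const unsigned tag = address / kBlockSize;
    unsigned victim = 0;
    for (unsigned i = 0; i < kCacheBlocks; ++i) {
        if (directory[i].valid && directory[i].tag == tag) {
            ++hits;
            directory[i].last_used = ++access_clock;
            return i;
        }
        // Invalid entries still have last_used == 0, so they are chosen
        // before any valid entry; otherwise the oldest entry wins (LRU).
        if (directory[i].last_used < directory[victim].last_used) victim = i;
    }
    ++misses;
    write_back(victim);  // evicted data must not be lost
    for (unsigned w = 0; w < kBlockSize; ++w)
        cache_data[victim][w] = main_memory[tag][w];
    directory[victim].valid = true;
    directory[victim].dirty = false;
    directory[victim].tag = tag;
    directory[victim].last_used = ++access_clock;
    return victim;
}

Word cache_read(unsigned address) {
    const unsigned slot = find_or_load(address);
    return cache_data[slot][address % kBlockSize];
}

// Assumes write-allocate: a write miss loads the block before updating it.
void cache_write(unsigned address, Word value) {
    const unsigned slot = find_or_load(address);
    cache_data[slot][address % kBlockSize] = value;
    directory[slot].dirty = true;
}

// Call once at the end of execution so that no dirty data is lost.
void flush_cache() {
    for (unsigned i = 0; i < kCacheBlocks; ++i) write_back(i);
}

void print_cache_report() {
    const unsigned long total = hits + misses;
    const double rate = total ? 100.0 * hits / total : 0.0;
    std::printf("Cache hits: %lu, misses: %lu, hit rate: %.2f%%\n", hits, misses, rate);
}
```

In this sketch a single counter (access_clock) serves as the LRU tracking value: every access stamps the entry it touches, and the entry with the smallest stamp is replaced. A fixed-size array and a linear scan are enough because the cache is fully associative and small.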
Suggestions
Remember that data memory and cache memory should be structured in the same way. Remember further that each entry in your cache memory is a block, not an individual word or byte. That means that your data memory should also be organized as a set of blocks rather than individual words or bytes.
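One way to picture block-organized memory is to split every address into a block number and an offset within that block. The sketch below assumes a hypothetical 256-word data memory, a hypothetical Word type, and an assumed default BLOCK_SIZE; the names BlockAddress and split_address are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

#ifndef BLOCK_SIZE
#define BLOCK_SIZE 2  // assumed default; the assignment does not give one
#endif

using Word = std::uint8_t;              // assumed word type
constexpr unsigned kBlockSize = BLOCK_SIZE;
constexpr unsigned kMemoryWords = 256;  // hypothetical data memory size, in words
static_assert(kMemoryWords % kBlockSize == 0,
              "memory must hold a whole number of blocks");
constexpr unsigned kMemoryBlocks = kMemoryWords / kBlockSize;

// Data memory as an array of blocks, the same shape as cache memory.
static Word data_memory[kMemoryBlocks][kBlockSize];

struct BlockAddress {
    unsigned block;   // which block the word lives in (the tag in a fully associative cache)
    unsigned offset;  // position of the word inside that block
};

BlockAddress split_address(unsigned address) {
    assert(address < kMemoryWords);
    BlockAddress result;
    result.block = address / kBlockSize;
    result.offset = address % kBlockSize;
    return result;
}

Word read_word(unsigned address) {
    const BlockAddress a = split_address(address);
    return data_memory[a.block][a.offset];
}
```

With a block size of 2, for example, address 5 falls in block 2 at offset 1. Because a fully associative cache has no index field, the whole block number can serve as the tag.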
Our diagrams of cache memory and the cache directory show the cache directory immediately beside or mixed with cache memory. While implementing it that way would match the diagrams that we’ve been looking at, implementing cache memory that way would be painful. You are strongly encouraged to implement your cache directory and the corresponding cache memory as parallel arrays.
For each entry in your cache, your cache directory should store:
o A valid bit (whether or not this is a real cache entry that you have explicitly loaded from main memory).
o A dirty bit (whether or not this cache entry has been written to, but hasn’t been written back to main memory).
o A tag (the tag that you extract from the address).
o A value that you use for tracking which entry in the directory is the least recently used.
You are simulating hardware. Hardware does not have a dynamic memory allocator (memory size is fixed, uh, when the hardware is designed. It’s especially fixed in that it’s physically part of a CPU die). Use of malloc is explicitly forbidden for this assignment (i.e., you should not be using lists or queues).
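A minimal sketch of the parallel-array layout with fixed-size storage might look like this. The struct and array names, the Word type, and the default BLOCK_SIZE are assumptions made for illustration; only CACHE_BLOCKS and BLOCK_SIZE come from the assignment text.

```cpp
#include <cassert>
#include <cstdint>

#ifndef CACHE_BLOCKS
#define CACHE_BLOCKS 4
#endif
#ifndef BLOCK_SIZE
#define BLOCK_SIZE 2  // assumed default; the assignment does not give one
#endif

using Word = std::uint8_t;  // assumed word type
constexpr unsigned kCacheBlocks = CACHE_BLOCKS;
constexpr unsigned kBlockSize = BLOCK_SIZE;

// One directory entry per cache block (hypothetical field names).
struct DirectoryEntry {
    bool valid;               // true once the block was loaded from main memory
    bool dirty;               // true if written since it was loaded
    unsigned tag;             // tag extracted from the address
    unsigned long last_used;  // value used to find the least recently used entry
};

// Parallel arrays with sizes fixed at compile time: directory[i] describes
// cache_data[i]. No malloc, lists, or queues are involved.
static DirectoryEntry directory[kCacheBlocks];
static Word cache_data[kCacheBlocks][kBlockSize];

// Empty the cache: no valid entries, contents filled with 0xFF.
void reset_cache() {
    for (unsigned i = 0; i < kCacheBlocks; ++i) {
        directory[i] = DirectoryEntry{false, false, 0, 0};
        for (unsigned w = 0; w < kBlockSize; ++w) cache_data[i][w] = 0xFF;
    }
}

// Bounds-checked access to a directory entry.
DirectoryEntry& entry_for(unsigned slot) {
    assert(slot < kCacheBlocks);
    return directory[slot];
}
```

Keeping the directory and the data in separate arrays that share an index means a lookup only has to scan the small directory, and the data array keeps the same shape as block-organized main memory.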
Looking forward to part 2: you should use preprocessor variables to set the number of blocks and the block size of your cache memory, e.g.,
#ifndef CACHE_BLOCKS
#define CACHE_BLOCKS 4
#endif
so that you can easily vary the size of cache memory and block size at compile time, e.g.,
clang++ simulator.cpp -o simulator -DCACHE_BLOCKS=8 -DBLOCK_SIZE=1
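Since the compile command above sets both CACHE_BLOCKS and BLOCK_SIZE, both macros need a guarded default. The following sketch shows one way to do that; the default BLOCK_SIZE of 2 and the cache_words helper are assumptions, and the upper bound of 8 is taken from the limit stated in part 2.

```cpp
#ifndef CACHE_BLOCKS
#define CACHE_BLOCKS 4
#endif

#ifndef BLOCK_SIZE
#define BLOCK_SIZE 2  // assumed default; the assignment does not give one
#endif

// Reject nonsensical settings at compile time instead of at run time.
static_assert(CACHE_BLOCKS >= 1, "the cache needs at least one block");
static_assert(BLOCK_SIZE >= 1 && BLOCK_SIZE <= 8,
              "block size must be between 1 and 8");

// Total capacity of the cache in words (hypothetical helper).
constexpr int cache_words() { return CACHE_BLOCKS * BLOCK_SIZE; }
```

Any value passed with -D on the command line overrides the default because the #ifndef guard skips the #define when the macro already exists.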
Some programs and test data have been provided, along with pre-compiled versions of the simulator.
Grading for part 1
There are a total of 15 points for part 1:
5 points for code quality and design:
o 0 points: the code is very poor quality (e.g., no comments at all, no functions, poor naming convention for variables, etc.). Code structures to represent the cache directory are not used at all.
o 1–3 points: the code is low quality. While some coding standards are applied, their use is inconsistent (e.g., inconsistent use of comments, some functions exist but may do too much, code is repeated that should be in a function, etc.). Code structures to represent the cache directory may not be used.
o 4–5 points: the code is high quality; coding standards are applied consistently throughout the code base. Appropriate code structures are used to represent the cache directory (e.g., struct).
10 points for implementation (5 × 2):
o 0 points: no implementation is submitted, the implementation is vastly incomplete, or the code crashes when executed. This specifically means that your code crashes with, for example, a Segmentation fault or Bus error; the assembly code is expected to crash as an exit condition.
o 1–2 points: implementation is significantly incomplete; for example, only one part of cache memory might be implemented (e.g., only reading is implemented).
o 3–4 points: implementation is mostly complete; for example, the basic operations of cache memory are present (e.g., read and write support are present), but the policies may not be fully implemented (e.g., LRU is not implemented, write back is not implemented).
o 5 points: implementation is complete. Reading and writing to cache memory are fully supported, and all required policies are implemented.
Notes:
If your code does not compile on “…” with a single invocation of make (that is, the grader will change into the appropriate directory and issue the command make with no arguments), you will receive a score of 0 for this assignment. If your submission does not have a Makefile, that’s effectively the same as submitting code that does not compile.
o Forget how to make a Makefile? Never knew how to make a Makefile? Thankfully, you can go to https://makefiletutorial.com and find some very easy to use basic examples.
o You are welcome to use command-line arguments to make to vary your cache memory settings (e.g., make CACHE_SIZE=8), but you are explicitly not required to do that. If you do, please document this clearly in your README.
o You can use g++ or clang++; there are no restrictions on which compiler you use beyond the compiler being installed on rodents.
While we’re not checking for compiler warnings or design by contract (DbC), both are your friends.
Part 2
Some programs and test data have been provided to be used for experimenting on your implementation (these are the same as linked above).
Repeat all of the tests with the following settings:
Number of blocks    Block size
8                   1
4                   2
2                   4
1                   8
You should also experiment with other cache size settings. For each of the test programs, which setting results in the best hit ratio? Note that the optimal setting may not be listed above, but the block size can’t exceed 8. You must find the minimal values that result in the best hit ratio (making the number of blocks equal to main memory isn’t valid!).
In a Markdown-formatted file, you should write a report that includes the following:
A table with the cache settings that you experimented with (minimally the above table, but you should have more entries here), plus the hit ratio for each of the sample programs (test1.asm and test2.asm).
A clear statement of which setting is the best for each program, and 1–2 sentences explaining why you believe that to be true based on the memory access patterns of the program.
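As an aid for reasoning about access patterns, the sketch below computes the hit ratio for one idealized case: a program that reads a run of consecutive words exactly once, starting at a block boundary, with an empty cache. It says nothing about test1.asm or test2.asm specifically, and the function name is made up for illustration.

```cpp
#include <cassert>

// Hit ratio for reading `words` consecutive words once, starting at a block
// boundary with an empty cache. Each block touched costs exactly one miss;
// every other word in that block is a hit.
double sequential_hit_ratio(unsigned words, unsigned block_size) {
    assert(words > 0 && block_size > 0);
    const unsigned misses = (words + block_size - 1) / block_size;  // ceiling division
    return 1.0 - static_cast<double>(misses) / words;
}
```

For 16 words this gives 0.0 with a block size of 1, 0.75 with a block size of 4, and 0.875 with a block size of 8. In this pattern the number of blocks does not matter because no word is revisited; a program that keeps returning to a few scattered addresses behaves the opposite way and benefits from more blocks rather than larger ones.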
Suggestions
This section was added after the assignment was originally published. No other changes have been made to the assignment.
Working with tables in text (markdown) is sorta painful, but almost all popular text editors have extensions or plugins that help ease this pain:
o Text Tables by RomanPeshkov is one that seems to be popular for VS Code.
o VIM Table Mode by Dhruva Sagar is one that I use personally in vim.
Grading for part 2
There are a total of 5 points for part 2:
3 points for the table:
o 0 points: No report is submitted, no table is provided in the report, or the table in the report is identical to the table listed above with no additional columns.
o 1–2 points: The table above is reproduced with additional columns for cache hit ratio for the provided programs, but the table doesn’t include any additional experiments beyond what’s in the table above.
o 3 points: A table is provided that clearly demonstrates the point of diminishing returns for the hit ratio (i.e., it shows the maximal hit ratio, and the point at which the hit ratio gets smaller).
1 point each for a clear (and accurate) explanation of the memory access patterns that lead to these settings being optimal for the test program (i.e., 1 point for test1.asm and 1 point for test2.asm).
Notes:
If your file is not a Markdown-formatted file, your score will be reduced by 3 points down to a minimum of 0 points.
Appendices
Control unit state machine
Two new “states” have been added to this state machine: Cache read and Cache write. Cache reading happens when you want to read from memory to register. Cache writing happens when you want to write to memory.
These aren’t exactly “states”: cache reading and writing are part of the existing states (e.g., writing to cache is part of the Write Back state), but they have been visualized in the state machine to demonstrate when they happen.
You explicitly must not add these as states to your program. Do not alter enum PHASES.
 