Operating System代写:CS4118 Dynamic Storage Allocator

虽然是一个lab作业,但是工作量其实不小,需要根据描述实现一个Dynamic Storage Allocator.

Introduction

In this lab you will be writing a general purpose dynamic storage allocator for C programs; that is, your own version of the malloc, free, realloc, and calloc functions. You are encouraged to explore the design space creatively and implement an allocator that is correct, efficient, and fast.

In order to get the most out of this lab, we strongly encourage you to start early. The total time you spend designing and debugging can easily eclipse the time you spend coding.

Bugs can be especially pernicious and difficult to track down in an allocator, and you will probably spend a significant amount of time debugging your code. Buggy code will not get any credit. You cannot afford to get score of 0 for this assignment.

This lab has been heavily revised from previous versions. Do not rely on advice or information you may find on the Web or from people who have done this lab before. It will most likely be misleading or outright wrong. Be sure to read all of the documentation carefully and especially study the baseline implementation we have provided.

Hand Out Instructions

Start by downloading malloclabcheckpoint-handout.tar from Autolab to a protected directory in which you plan to do your work. Then give the command tar xvpf malloclabcheckpoint-handout.tar. This will cause a number of files to be unpacked into the directory.

The only file you will turn in is malloclab-handin.tar, which contains your solution (mm.c) and your key.txt.The provided code allows you to locally evaluate your implementations. Using the command make will generate two driver programs: mdriver and mdriver-emulate. Use these to evaluate the correctness and performance of your implementations. You can get some idea of your program performance and the resulting value you will get for the throughput portion of your grade. However, the Autolab servers will generate different throughput values, and these will determine your actual score. This is discussed in more detail in the evaluation section.

Required Functions and Support Routines

Your dynamic storage allocator will implement the following functions, declared in mm.h and defined in mm.c:

1
2
3
4
5
6
bool mm_init(void);
void *malloc(size_t size);
void free(void *ptr);
void *realloc(void *ptr, size_t size);
void *calloc(size_t nmemb, size_t size);
bool mm_checkheap(int);

We provide you three versions of memory allocators:

  • mm.c: A placeholder that compiles correctly, but does not run.
  • mm-naive.c: A functional implementation that runs fast but gets very poor utilization, because it never reuses any blocks of memory.
  • mm-baseline.c: A fully-functional implicit-list allocator. We recommend that you use this code as your starting point.

Your allocator must run correctly on a 64-bit machine. It must support a full 64-bit address space, even though current implementations of x86-64 machines support only a 48-bit address space. The driver mdriver-emulate will evaluate your program’s correctness using benchmark traces that require the use of a full 64-bit address space.

Your submitted mm.c must implement the following functions:

  • mm_init: Performs any necessary initializations, such as allocating the initial heap area. The return value should be false if there was a problem in performing the initialization, true otherwise. You must reinitialize all of your data structures in this function, because the drivers call your mm_init function every time they begin a new trace to reset to an empty heap.

  • malloc: The malloc routine returns a pointer to an allocated block payload of at least size bytes. The entire allocated block should lie within the heap region and should not overlap with any other allocated block.
    Your malloc implementation must always return 16-byte aligned pointers.

  • free: The free routine frees the block pointed to by ptr. It returns nothing. This routine is only guaranteed to work when the passed pointer was returned by an earlier call to malloc, calloc, or realloc and has not yet been freed. free(NULL) has no effect.
  • realloc:The realloc routine returns a pointer to an allocated region of at least size bytes with the following constraints:
    • if ptr is NULL, the call is equivalent to malloc(size);
    • if size is equal to zero, the call is equivalent to free(ptr) and should return NULL;
    • if ptr is not NULL, it must have been returned by an earlier call to malloc or realloc and not yet have been freed. The call to realloc takes an existing block of memory, pointed to by ptr - the old block. Upon return, the contents of the new block should be the same as those of the old block, up to the minimum of the old and new sizes. Everything else is uninitialized. Achieving this involves either copying the old bytes to a newly allocated region or reusing the existing region.
      For example, if the old block is 16 bytes and the new block is 24 bytes, then the first 16 bytes of the new block are identical to the first 16 bytes of the old block and the last 8 bytes are uninitialized. Similarly, if the old block is 16 bytes and the new block is 8 bytes, then the contents of the new block are identical to the first 8 bytes of the old block.
      The function returns a pointer to the resulting region. The return value might be the same as the old blockperhaps there is free space after the old block, or size is smaller than the old block sizeor it might be different. If the call to realloc does not fail and the returned address is different than the address passed in, the old block has been freed and should not be used, freed, or passed to realloc again.
  • Hint: Your realloc implementation will have only minimal impact on measured throughput or utilization. A correct, simple implementation will suffice.
  • calloc: Allocates memory for an array of nmemb elements of size bytes each and returns a pointer to the allocated memory. The memory is set to zero before returning.
    Hint: Your calloc will not be graded on throughput or performance. A correct, simple implementation will suffice.
  • mm_checkheap: The mm_checkheap function (the heap consistency checker, or simply heap checker) scans the heap and checks it for possible errors (e.g., by making sure the headers and footers of each block are identical). Your heap checker should run silently until it detects some error in the heap. Then, and only then, should it print a message and return false. If it finds no errors, it should return true. It is very important that your heap checker run silently; otherwise, it will produce too much output to be useful on the large traces.
    A quality heap checker is essential for debugging your malloc implementation. Many malloc bugs are too subtle to debug using conventional gdb techniques. The only effective technique for some of these bugs is to use a heap consistency checker. When you encounter a bug, you can isolate it with repeated calls to the consistency checker until you find the operation that corrupted your heap. Because of the importance of the consistency checker, it will be graded. If you ask members of the course staff for help, the first thing we will do is ask to see your checkheap function, so please write this function before coming to see us!
    The mm_checkheap function takes a single integer argument that you can use any way you want.
    If mm_checkheap detects a problem with the heap, it can print the line number where mm_checkheapwas called, which allows you to call mm_checkheapat numerous places in your code while you are debugging.

The semantics for malloc, realloc, calloc, and free match those of the corresponding libc routines. Type man malloc to the shell for complete documentation.

Support Routines

The memlib.c package simulates the memory system for your dynamic memory allocator. You can invoke the following functions in memlib.c:

  • void *mem_sbrk(intptr_t incr): Expands the heap by incr bytes, where incr is a non-negative integer, and returns a generic pointer to the first byte of the newly allocated heap area. The semantics are identical to the Unix sbrk function, except that mem_sbrk will fail for negative arguments. (Data typeintptr_t is defined to be a signed integer large enough to hold a pointer. On our machines it is 64-bits long.)
  • void *mem_heap_lo(void): Returns a generic pointer to the first byte in the heap.
  • void *mem_heap_hi(void): Returns a generic pointer to the last byte in the heap.
  • size_t mem_heapsize(void): Returns the current size of the heap in bytes.

You are also allowed to use the following libc library functions:memcpy, memset, printf, fprintf, and sprintf. Other than these functions and the support routines, yourmm.ccode may not call any externally-defined function.

The Trace-driven Driver Programs

The two driver programs generated when you run make test your mm.c code for correctness, space utilization, and throughput. These driver programs are controlled by a set of trace files that are included in the traces subdirectory of the malloclab-handout.tar distribution. Each trace file contains a sequence of commands that instruct the driver to call your malloc, realloc, and free routines in some sequence. The drivers and the trace files are the same ones we will use when we grade your handin mm.c file.

When the driver programs are run, they will run each trace file multiple times: once to make sure your implementation is correct, once to determine the space utilization, and between 3 and 20 times to determine the throughput.

he drivers accept the following command-line arguments. The normal operation is to run it with no arguments, but you may find it useful to use the arguments during development.

  • p: Apply the scoring standards for the checkpoint, rather than for the final submission.
  • t tracedir: Look for the default trace files in directory instead of the default directory traces
  • f tracefile: Use one particular instead of the default set of tracefiles for testing correctness, utilization, and throughput.
  • c tracefile: Run a particular exactly once, testing only for correctness. This option is extremely useful if you want to print out debugging messages.
  • h: Print a summary of the possible command line arguments.
  • l: Run and measure the libc version of malloc in addition to your malloc package. This is interesting if you want to see how fast a real malloc package runs, but it has no effect on your grade for this assignment.
  • v level: Set the verbosity level to the specified value. Levels 0-2 are supported, with a default level of 1. Raising the verbosity level causes additional diagnostic information to be printed as each trace file is processed. This can be useful during debugging for determining which trace file is causing your malloc package to fail.
  • V: Equivalent to -v 2.
  • d level: At debug level 0, very little validity checking is done. This is useful if you are mostly done but just tweaking performance.
    At debug level 1, every array the driver allocates is filled with random bytes. When the array is freed or reallocated, the driver checks to make sure the bytes have not been changed. This is the default.
    At debug level 2, every time any operation is done, all of the allocated arrays are checked to make sure they still hold their randomly assigned bytes, and your implementation of mm_checkheap is called. This mode is slow, but it can help identify the exact point at which an error gets injected.
  • D: Equivalent to -d 2.
  • s: Time out after s seconds. The default is to never time out.
  • T: Print the output in a tabular form. This can be useful if you want to convert the results into a spreadsheet for further analysis.

For most of your testing, you should use the mdriver program. It checks the correctness, utilization, and throughput of the standard benchmark traces. The mdriver-emulate program uses special compilation and memory-management techniques to allow testing your program with a heap making use of the full, 64-bit address space. In addition to the standard benchmark traces, it will run a set of giant traces with very large allocation requests. If your submission (for either the checkpoint of the final version) fails to pass any of its checks, you will be given a penalty of 30 points.

Programming Rules

  • You are writing a general purpose allocator. You may not solve specifically for any of the traces - we will be checking for this. Any allocator that attempts to explicitly determine which trace is running (e.g., by executing a sequence of tests at the beginning of the trace) and change its behavior in a way that is appropriate only for that specific trace will receive a penalty of 20 points. On the other hand, you should feel free to write an adaptive allocator - one that dynamically tunes itself according to the general characteristics of the different traces.
  • You should not change any of the interfaces in mm.h, and your program must compile with the provided Makefile. However, we strongly encourage you to use static helper functions inmm.c to break up your code into small, easy-to-understand segments.
  • You are not allowed to define any large global data structures such as large arrays, trees, or lists in your mm.c program. However, you are allowed to declare small global arrays, structs and scalar variables such as integers, floats, and pointers in mm.c. In total, your global data should sum to at most 128 bytes. Global values defined with the const qualifier are not counted in the 128 bytes.
    The reason for this restriction is that the driver cannot account for such global variables in its memory utilization measure. If you need space for large data structures, you can allocate space for them within the heap.
    The 128-byte limit will not be tested by automated means. The TAs will check whether your code is within this limit by inspecting your code. Provide documentation in your comments on your use of global variables. Ideally, they should be declared in one single part of your program, and you should provide comments giving a tally of the number of bytes used.
  • The use of macro definitions (using #define) in your code is restricted to the following:
    1. Macros with names beginning with the prefix “dbg_“ that are used for debugging purposes only. See, for example, the debugging macros defined in mm-baseline.c. You may create other ones, but they must be disabled in any version of your code submitted to Autolab.
    2. Definitions of constants. These definitions must not have any parameters.

Explanation

It is traditional in C programming to use macros instead of function definitions in an attempt to improve program performance. This practice is obsolete. Modern compilers (when optimization is enabled) perform inline substitution of small functions, eliminating any inefficiencies due to the use of functions rather than macros. In addition, functions provide better type checking and (when optimization is disabled) enable better debugging support.

  • When you run make, it will run a program that checks for disallowed macro definitions in your code. This checker is overly strict - it cannot determine when a macro definition is embedded in a comment or in some part of the code that has been disabled by conditional-compilation directives. Nonetheless, your code must pass this checker without any warning messages.
  • The code shown in the textbook (Section 9.9.12, and available from the CS:APP website) is a useful source of inspiration for the lab, but it does not meet the required coding standards. It does not handle 64-bit allocations, it makes extensive use of macros instead of functions, and it relies heavily on low-level pointer arithmetic. Similarly, the code shown in K&R does not satisfy the coding requirements. You should use the provided code mm-baseline.c as your starting point.
  • It is okay to look at any high-level descriptions of algorithms found in the textbook or elsewhere, but it is not acceptable to copy or look at any code of malloc implementations found online or in other sources, except for the allocators described in the textbook, in K&R, or as part of the provided code.
  • It is okay to copy code for useful generic data structures and algorithms (e.g., linked lists, hash tables, search trees, and priority queues) from Wikipedia and other repositories. (This code must not have already been customized as part of a memory allocator.) You must include (as a comment) an attribution of the origins of any borrowed code.
  • Your allocator must always return pointers that are aligned to 16-byte boundaries. The driver will check this requirement.
  • Your code must compile without warnings. Warnings often point to subtle errors in your code; whenever you get a warning, you should double-check the corresponding line to see if the code is really doing what you intended. If it is, then you should eliminate the warning by tweaking the code (for instance, one common type of warning can be eliminated by adding a type-cast where a value is being converted from one type of pointer to another). We have added flags in the Makefile to force your code to be error-free. You may remove those flags during development if you wish, but please realize that we will be grading you with those flags activated.
  • Your code will be compiled with the clang compiler, rather than gcc. This compiler is distributed by the LLVM Project (llvm.org). Their compiler infrastructure provides features that have enabled us to implement the 64-bit address emulation techniques of mdriver-emulate. For the most part,clangis compatible with gcc, except that it generates different error and warning messages.

Machine Dependencies

You will find that your program gets different throughput values depending on the model of CPU for the machine running the program. Our evaluation code compensates for these differences by comparing the performance of your program to implementations we have written for both the checkpoint and the final versions. It can determine the reference performance for the machine executing the program in two different ways. First, it looks in the file throughputs.txt to see if it has a record for the executing CPU. (Linux machines contain a file /proc/cpuinfo that includes information about the CPU model.) Second, if it does not find this information, it runs reference implementations that are included as part of the provided files.

Useful Tips

  • We have included thecontracts.h header file from 15-122 in the handout directory. You’ll find debugging macros defined in mm-baseline.c that provide aliases to the contracts functions. Please feel free to use these to check your invariants during development.
  • Use the drivers ‘-c’ and ‘-f’ options.During initial development, using short trace files will simplify debugging and testing.
  • Use the drivers ‘-V’ option. The ‘-V’ option will also indicate when each trace file is processed, which will help you isolate errors.
  • Use the drivers ‘-D’ option.This does a lot of checking to quickly find errors.
  • Use gdb’s watch commandto find out what changed some value you did not expect to have changed.
  • Use the supplied helper function for viewing the heap contents with gdb. You cannot usegdbto directly view the contents of the heap when running the version with 64-bit memory emulation. We have provided a helper function for this purpose. See Appendix [app:hprobe] for documentation on this function and how to view the heap contents.
  • Reduce obscure pointer arithmetic through the use of struct’s and union’s. Although your data structures will be implemented in compressed form within the heap, you should strive to make them look as conventional as possible using struct and union declarations to encode the different fields. Examples of this style are shown in the baseline implementation.
  • Encapsulate your pointer operations in functions.Pointer arithmetic in memory managers is confusing and error-prone because of all the casting that is necessary. You can significantly reduce the complexity by writing functions for your pointer operations. See the code in mm-baseline.c for examples. Remember: you are not allowed to define macrosthe code in the textbook does not meet our coding standards for this lab.
  • Use your heap consistency checker.We are assigning ten points to the mm_checkheap function in your final submission for a reason. A good heap consistency checker will save you hours and hours when debugging your malloc package. You can use your heap checker to find out where exactly things are going wrong in your implementation (hopefully not in too many places!). Make sure that your heap checker is detailed. To be useful, your heap checker should only produce output when it detects an error. Every time you change your implementation, one of the first things you should do is think about how your mm_checkheap will change, what sort of tests need to be performed, and so on. Although we will not examine the heap checker you implement for the checkpoint, you should have a good implementation of it right from the start. Do not even think about asking for debugging help from any of the course staff until you have implemented and tried using your heap checker.
  • Keep backups. Whenever you have a working allocator and are considering making changes to it, keep a backup copy of the last working version. It is very common to make changes that inadvertently break the code and then have trouble undoing them.
  • Use version management. MallocLab takes a while to complete and we expect a lot of commits to GitLab. Make sure to commit frequently, especially when you change key features of your code. You may also wish to submit to Autolab periodically, keeping in mind that only your final submission will be graded. Your commit history will be used in determining academic integrity cases.
  • Start early!It is possible to write an efficient malloc package with a few pages of code. However, we can guarantee that it will be some of the most difficult and sophisticated code you have written so far in your career. So start early, and good luck!

Strategic Advice

You must design algorithms and data structures for managing free blocks that achieve the right balance of space utilization and speed. This involves a trade-off - it is easy to write a fast allocator by allocating blocks and never freeing them, or a high-utilization allocator that carefully packs blocks as tightly as possible. You must seek to minimize wasted space while also making your program run fast.

As described in the textbook and the lectures, utilization is reduced below 100% due to fragmentation, taking two different forms:

  • External fragmentation: Unused space between allocated blocks or at the end of the heap
  • Internal fragmentation: Space within an allocated block that cannot be used for storing data, because it is required for some of the manager’s data structures (e.g., headers, footers, and free-list pointers), or because extra bytes must be allocated to meet alignment or minimum block size requirements

To reduce external fragmentation, you will want to implement good block placement heuristics. To reduce internal fragmentation, it helps to reduce the storage for your data structures as much as possible.

Maximizing throughput requires making sure your allocator finds a suitable block quickly. This involves constructing more elaborate data structures than is found in the provided code. However, your code need not use any exotic data structures, such as search trees. Our reference implementation only uses singly- and doubly-linked lists.

Here’s a strategy we suggest you follow in meeting the checkpoint and final version requirements:

  • Checkpoint. The provided code already meets the required utilization performance, but it has very low throughput. You can achieve the required throughput by converting to an explicit-list allocator.
    You will want to experiment with allocation policies. The provided code implements first-fit search. Some allocators attempt best-fit search, but this is difficult to do efficiently. You can find ways to introduce elements of best-fit search into a first-fit allocator, while keeping the amount of search bounded.
    Depending on whether you place newly freed blocks at the beginning or the end of a free list, you can implement either a last-in-first-out (LIFO) or a first-in-first-out (FIFO) queue discipline. You should experiment with both.
  • Final Version. Building on the checkpoint version, you must greatly increase the utilization and keep a high throughput. You must reduce both external and internal fragmentation. Reducing external fragmentation requires achieving something closer to best-fit allocation, e.g., by using segregated lists. Reducing internal fragmentation requires reducing data structure overhead. There are multiple ways to do this, each with its own challenges. Possible approaches and their associated challenges include:
    • Eliminate footers in allocated blocks. But, you still need to be able to implement coalescing. See the discussion about this optimization on page 852 of the textbook.
    • Decrease the minimum block size. But, you must then manage free blocks that are too small to hold the pointers for a doubly linked free list.
    • Reduce headers below 8 bytes. But, you must support all possible block sizes, and so you must then be able to handle blocks with sizes that are too large to encode in the header.
    • Set up special regions of memory for small, fixed-size blocks. But, you will need to manage these and be able to free a block when given only the starting address of its payload.

Some advice on how to implement and debug your packages will be covered in the lectures and recitations.
Good luck!