Operating System代写:COMP0019 Debuggers and Virtual Memory

回答OS相关的问题,包括Debuggers, Virtual Memory, Linux/Unit File I/O, Process Control, x86-64 Assembly, Concurrency等。

Debuggers

Alternative assessment

Answer ALL questions (total 100 marks). Unless you have been granted a SoRA specifically exempting you from typing, you should submit typed answers if at all possible. Your submission MUST be in PDF format. You may draw figures e lectronically. You m ay also draw figures b y h and i f y ou p refer, s o l ong a s y ou c an s can t hem a nd i nclude t hem i n your answers document. You must submit your answers to the entirety of this assessment as a single PDF document, via the 0019 Moodle web page.

You may not discuss any of the content of this assessment nor any aspect of your solution to any of the questions with any other person, with the exception of asking clarification questions of the instructors as private posts on the 0019 Moodle Alternative Assessment forum. (Please do not use Piazza for questions about this paperthis is a constraint UCL has dictated.) Communicating with anyone else, online or offline, about the content of this assessment or how to solve it will be considered academic dishonesty, and handled under UCL’s rules on examination irregularities.

As you work on this assessment, you may find it useful to refer to the CS:APP/3e textbook, the 2020 lecture note slides for COMP0019, your personal notes taken during lectures, the 2020 COMP0019 lecture videos, the 2020 CW1 - CW5 starter code and handouts, your own solutions to CW1 - CW5, the standard Linux man pages, and all material posted to the 0019 Piazza forum.

GOOD LUCK !

Marks for each part of each question are indicated in square brackets Calculators are permitted

Debuggers and Virtual Memory in WeensyOS

In 0019 CW1, you used gdb, the debugger, to step through the binary bomb executable, and to display the contents of the binary bomb’s memory while it executed. Most students in 0019 also used gdb to debug C code during CW2 - CW5, to step through their own solution code and observe the contents of memory.

In Linux and UNIX, a debugger is an application that runs as one process and facilitates stepping through the execution of a separate application (the one the user wants to debug) that runs as a second, separate process. This question concerns how operating systems provide support for debuggers, and asks you to think about how one might incorporate support for debuggers into WeensyOS, the operating system for which you implemented virtual memory support in 0019 CW4.

When you ask a debugger like gdb to print the value of a variable in the target application being debugged, the debugger computes the virtual address of that variable in the target application’s virtual address space, and issues a system call that asks the operating system to obtain the value at that virtual address in the process with the process ID specified by the debugger (which the debugger specifies as the process ID of the target process being debugged). The system call returns the requested value at the target address in the virtual address space of the target process. The debugger then prints out the requested value for the user.

Consider how you could design and implement a new system call for WeensyOS for this functionalityto let a debugger application running as a WeensyOS process display a value stored at a virtual memory address in the virtual address space of some other target WeensyOS process. Suppose the C function prototype for this new WeensyOS system call is as follows:

1
int debug_get_long(int processid, long *targetaddr, long *result);

where:

  • processid is the WeensyOS process ID of the target process within whose virtual address space an 8-byte value should be retrieved;
  • targetaddr is the virtual address within the target process’s virtual address space from which an 8-byte value should be retrieved;
  • result is a pointer to a long in the calling process’s virtual address space, where the system call will store the requested value retrieved from the virtual address space of the target process;
  • the int return value indicates whether the system call has completed successfully or with an error.

Design debug_get_long() for the WeensyOS kernel. Give your design in C code, and comment your code thoroughly, explaining each significant step taken in the code.

Assume that there is already a symbolic constant SYS DEBUG GET LONG that can be used in the relevant switch statement in the WeensyOS kernel source to dispatch to your code when an application invokes the debug get long() system call. Provide the code for the system call starting with the relevant case statement. (You may find it useful to recall how you added support for fork() to the WeensyOS kernel in CW4.)

Hints: You will need to refamiliarize yourself with how process and kernel page tables work in WeensyOS to solve this problem. For full marks, you will need to include error-handling code that returns an error and avoids crashing the kernel if the call to debug_get_long() provides an invalidtargetaddr and processid, or an invalid combination of the two.

We will mark your answer by evaluating the correctness and completeness of your design as expressed in code. This is a design exercise and not a programming task: we won’t try to compile and run your code, or test it. We will not deduct marks for minor syntax errors. Don’t try to compile and run your code as you work on your solution; we are only asking you to write out the code to make your design concrete, as a way of demonstrating your understanding of virtual memory and how WeensyOS supports it.

Linux/UNIX File I/O and Process Control

Recall the notion of a file pointer in the Linux/UNIX file input/output (I/O) system calls, which tracks the current byte offset within a currently open Linux/UNIX filewhere in the bytes of the file to read or write next, when the next read() or write() system call is issued. As discussed in 0019, when a process has a file descriptor for a currently open file, modern Linux and UNIX OSes track the file pointer for that file using state held in the kernel’s memory.

Note: throughout this question, we are referring to the UNIX file I/O system calls, such as read() and write(), and not the C library’s buffered standard I/O file I/O calls, such as fread() and fwrite().

By contrast, in an early version of UNIX, the various UNIX file I/O system calls maintained the file pointer for a currently open file in user-level memory within the process’s virtual address space. When the file pointer advanced (after a read() or write() system call on the file, using the exact same API for these calls as in modern UNIX), the kernel would update the file’s file pointer value in the user-level memory of the process that called read() or write().

Note that in all versions of UNIX (the early version discussed above and modern ones), output to a terminal always appends to the end of the terminal, regardless of the file pointer value for the open file for that terminal. Terminals were originally teletypes, one-character-at-time physical printers onto paper, and modern terminals are windows in a windowing system; in both cases, each successive write() always adds to the end of the terminal’s output. When a shell is interactive (when it accepts commands from the user interactively), its output goes to a terminal, and thus interactive shells always append their output to the end of the terminal, regardless of the file pointer value for the open file for that terminal.

Suppose you run a fully implemented sh0019 shell (that fully solves 0019 CW5) according to the following scenario:

  • You create a file /tmp/a that contains the string This is file A.
  • You create a file /tmp/b that contains the string This is File B.
  • In the current working directory, you create a shell script file named script containing the following two commands: cat /tmp/a cat /tmp/b
  • You start an interactive command-line sh0019 shell. At its command prompt, you run a second instance of your sh0019 shell and have it run the commands in the above shell script by typing the command: sh0019 -q script > outfile
    • a. Suppose you run sh0019 in the above scenario on the aforementioned early version of UNIX. Assume that sh0019 compiles and runs on this early version of UNIX, but experiences the early UNIX version’s different file pointer implementation described above. On this early version of UNIX that maintains the file pointer in memory within a process’s address space, what output results (and in what file)? Justify your answer by describing the sequence of process creation, program execution, and file I/O system calls that occurs, and how the file pointer influences the output.
    • b. What output does the same command give (and in what file) when run on a modern version of Linux, which maintains file pointers as described in the 0019 lectures and in CS:APP/3e? Justify your answer by explaining what is different about modern Linux’s implementation of UNIX file I/O operations, and how this implementation difference gives rise to the output.

C Compilation and x86-64 Assembly

Consider the below disassembly of the compiled x86-64 object code for the C function mystery(), as output by objdump -S:

Disassembly of section .text:
0000000000000000 <mystery>:
   0:   55                     pushq  %rbp
   1:   48 89 e5               movq   %rsp,%rbp
   4:   85 d2                  testl  %edx,%edx
   6:   7e 14                  jle    1c <mystery+0x1c>
   8:   89 d0                  movl   %edx,%eax
   a:   31 c9                  xorl   %ecx,%ecx
   c:   8b 14 8e               movl   (%rsi,%rcx,4),%edx
   f:   01 d2                  addl   %edx,%edx
  11:   01 14 8f               addl   %edx,(%rdi,%rcx,4)
  14:   48 ff c1               incq   %rcx
  17:   48 39 c8               cmpq   %rcx,%rax
  1a:   75 f0                  jne    c <mystery+0xc>
  1c:   5d                     popq   %rbp
  1d:   c3                     retq

a. How many arguments does the C function mystery() take, and what evidence in the disassembly supports this conclusion?

b. Below is a skeleton for the C code that when compiled yields the above assembly code. In your answers document, write out the skeleton and fill in the missing code (where there are blanks in the skeleton) that matches the above assembly code.

Concurrency with pthreads on x86-64

Consider the following C code, which starts four pthreads, each of which increments each element of an array of unsigneds ten million times, and uses C11 atomics for synchronization. Include files have been omitted in the interest of brevity.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
void *threadfunc(void *arg)
{

_Atomic unsigned **x = (_Atomic unsigned **) arg;
for (int i = 0; i != 10000000; i++) {
for (int j = 0; j < 10; j++) {
atomic_fetch_add(x[j], 1);
}
}
}

int main()
{

pthread_t th[4];
_Atomic unsigned n[10];
_Atomic unsigned *pn[10];

for (int i = 0; i < 10; i++) {
n[i] = ATOMIC_VAR_INIT(0);
pn[i] = &(n[i]);
}

for (int i = 0; i < 4; i++) {
pthread_create(&th[i], NULL, threadfunc, (void *) pn);
}

for (int i = 0; i != 4; i++) {
pthread_join(th[i], NULL);
}

for (int i = 0; i < 10; i++) {
printf("%u\n", n[i]);
}
}

Rewrite the above code to instead use spin locks for synchronization. You may not place a spin lock outside the outer for() loop in threadfunc(), as doing so reduces concurrency among threads too much. Subject to that prohibition, you must place spin lock(s) so as to ensure correct synchronization, and to minimize the overhead of locking.

Omit include files. Use the spin lock API discussed in 0019 lecture, summarized below for your reference. You need not provide code for the implementation of spin_lock() or spin_unlock().

1
2
3
4
5
/* allocate and initialize spin lock */
atomic_flag spinlock = ATOMIC_FLAG_INIT;

int spin_lock(atomic_flag *lock); // acquire spin lock
int spin_unlock(atomic_flag *lock); // release spin lock

How does the C compiler enforce atomicity on the x86-64 CPU instructions that implement C11 atomic operations?

Which version of the code discussed in this exam question requires more C11 atomic operations for correct synchronization: the one in the exam question paper, which directly invokes C11 atomics, or your answer that uses spin locks? Justify your answer by referring to the relevant aspects of these two versions of the code.

Designing with Concurrency and Virtual Memory under Linux/UNIX

Unlike the preceding questions, this question is open-ended.

Your answer is expected not to exceed 1,000 words in length. It may include any number of figures or tables (including zero!) to support the arguments you make in your answer.

We would expect every figure and table you include to be discussed in the text. The quantity of figures or tables you include does not in and of itself influence your mark.

You have joined the development team for a main-memory database software system implemented in C on Linux. This main-memory database, as the name suggests, accepts read and write requests to the database contents, and stores all data in the database in RAM only, in a single Linux process, within which only one thread runs. A single write request can require writes to multiple disjoint memory regions within the virtual address space of the database process. When you join the project, the system provides no persistencei.e., none of the data is stored on non-volatile storage, such that if the machine on which the database runs crashes, loses power, or reboots, the entire database contents are lost.

You are given the task of implementing a simple form of persistence for this main-memory database system. You are told to design and implement extensions to the main-memory database that take a complete, consistent, persistent snapshot of the current contents of the database in main memory. “Persistent” means that the contents of the database should be written to non-volatile storage: in this case, to a single file in the standard Linux filesystem stored on a sufficiently large magnetic spinning disk. “Consistent” means that the copy of the database written to disk must not reflect partially complete write operations; for any given write to the database requested by a client, a snapshot must contain all or none of the changes to the database requested in that write operation. “Complete” means that at the time a snapshot operation is initiated, all prior write operations that have completed in main memory must be included in the copy of the database written to disk.

Your design must comply with the following criteria and goals (beyond the “persistent,” “consistent,” and “complete” definitions above):

  • The main-memory database must continue to process subsequent read and write requests that arrive during the taking of a snapshot; that is, your design may not pause the handling of read and write requests to the database while a snapshot completes.
  • The main-memory database does not process multiple read or write requests (or a mix of the two) concurrently; it processes a single read or write request to completion in isolation before moving on to the next one. Your persistence extensions should not change this behavior.
  • Once a snapshot is initiated, no changes to memory caused by writes requested after the snapshot began should be included in that snapshot.
  • You may use multiple pthreads, multiple processes, or some combination of both in your design.
  • You may use any of the pthread synchronization primitives described in the 0019 lectures on synchronization primitives.
  • You may use any of the Linux system calls described in the 0019 lectures.
  • You should aim to make your design both physical-memory efficient (i.e., minimize the quantity of RAM your design needs) and compute-efficient (i.e., minimize the number of CPU instructions or other execution time overheads your design introduces).

Assume there are four CPU cores in the Intel Core i7 CPU in the server where the main-memory database runs. And assume that the mix of read and write requests that the database receives over time is not known in advance, and that an administrator’s request for a snapshot can arrive at any time.

Hint: you may find it useful to review CS:APP/3e Sections 9.8.1. and 9.8.2, which describe how Linux uses copy-on-write page mappings.

Describe your design in an essay (optionally including any figures or tables you deem necessary), taking care to specify the Linux constructs, system calls, synchronization primitives, etc., that you use, and how they fit together in your design. You must also explain how your design meets the above requirements, and why you believe your design meets the physical-memory and compute efficiency goals described above.

There is no unique correct answer to this question. We expect that even experts in Systems may produce different designs, and use different arguments for why these designs meet the requirements and goals set out in the question.

In marking your answers, we will focus on your understanding of systems design principles and constructs in C on Linux, and the tradeoffs inherent in the use of different concurrency and synchronization techniques. That is, your mark will reflect the examiners’ assessment of your ability to reason about the material covered in 0019, and how well the logical arguments you make in support of your design provide evidence of that understanding. Your answer to this question will thus be marked according to the following criteria:

  • the extent to which your answer demonstrates understanding of 0019 topics, including technical correctness of the written sentences and of the observations made;
  • the completeness of the answer, in terms of whether it addresses all parts of the question as stated;
  • the clarity and concision of the answer: we expect that your answer will be easily grasped by the 0019 examiners, and will not exceed 1,000 words.
  • the quality of the insight offered in your answer, including originality and depth of discussion, soundness of the arguments made, and ability to make correct and insightful connections between designs, techniques, and systems programming primitives.