C Assignment: CS444 bsh

Implement a simplified version of Bash, including I/O redirection, pipes, and related features.

Bash

Introduction

In this project, we develop a shell program called bsh (b-shell, Boston shell). It works like the classic sh and bash. It accepts commands in the following syntax:

command [arg1 ... argn] [< filename] [> filename]
//accepting command-line arguments and performing I/O redirection

command [arg1 ... argn] | command [arg1 ... argn]
//creating a pipe between two commands

First, create a directory in the VM.

minix# mkdir /root/proj4

Next, copy four files from the instructor’s directory to the VM.

mic$ scp -P 2213 /home/ming/444/proj4/* root@localhost:~/proj4

The files are the following.

-rw-r--r--. 1 ming 2908 Mar 29 09:40 bsh.c
-rw-r--r--. 1 ming  273 Mar 29 09:40 envDemo.c
-rw-r--r--. 1 ming   67 Mar 29 09:40 Makefile
-rw-r--r--. 1 ming 1580 Mar 29 09:40 pipeDemo.c

Built-In Commands

Six Commands

B-shell has six built-in commands. Write C code to implement them; do not fork a new process and then invoke existing Minix commands to accomplish the tasks. The commands are:

  1. exit exits bsh.
  2. env lists the environment variables and their values. The output format is as follows (one var=value per line):
    • HOME=/root
    • HOST=minix
    • PWD=/root/proj4
    • SHELL=/bin/sh
    • USER=root
  3. setenv sets the value of an environment variable (new or existing). The syntax is: setenv variable value
  4. unsetenv removes a variable from the environment. The syntax is: unsetenv variable
  5. cd changes the current working directory and updates the environment variable PWD.
  6. history lists the last 500 commands the user has entered.

As you can see in the file bsh.c, the command exit is already implemented.

Environment Variables

Environment variables are typically inherited from the parent shell. When an executable such as bsh starts running, it receives three parameters (see envDemo.c):

int main(int argc, char *argv[], char *envp[]) {
}

The first two parameters should be familiar to you: argc is the number of command-line arguments, and argv is an array of string pointers to the arguments. The third parameter is probably new to you. Just like argv, it is an array of string pointers where each string is formatted like this:

HOME=/root

The main difference between argv and envp is:

  • You know there are argc arguments in argv[]. They are stored in argv[0], argv[1], ..., argv[argc - 1].
  • You don’t know the number of environment variables right away. If there are k such variables, they are stored in envp[0], envp[1], ..., envp[k - 1]. The only way for you to know that you have reached k is by testing for envp[k] == NULL.
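The NULL test described above can be sketched as follows. This is a minimal illustration, not the code in envDemo.c; the function names count_env and print_env are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Count the entries of a NULL-terminated, envp-style array. */
size_t count_env(char *const envp[])
{
    size_t k = 0;

    while (envp[k] != NULL)   /* the only way to find the end */
        k++;
    return k;
}

/* Print each VAR=value string on its own line, as the env command would. */
void print_env(char *const envp[])
{
    size_t i;

    for (i = 0; envp[i] != NULL; i++)
        printf("%s\n", envp[i]);
}
```

Inside main(), calling print_env(envp) would list the variables inherited from the parent shell.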

strsep()

Throughout this project, you often need to separate a string into several tokens. The function strsep() is handy for this purpose. It separates a string by delimiters of your choice. For example,

char tmpStr[1024], *myPath, *justPATH;
strcpy(tmpStr, "PATH=/bin:/usr/bin:/usr/pkg/bin:/usr/local/bin");
myPath = tmpStr;
justPATH = strsep(&myPath, "="); //strsep() takes the address of the pointer

The following summarizes what happened above.

  • The variable tmpStr is unchanged; it still points at the same address.
  • justPATH has the same value as tmpStr; it points at the same address.
  • strsep() replaced the equals sign at tmpStr[4] with the end-of-string char '\0'. If you print the string at justPATH, you get PATH, literally.
  • myPath points at tmpStr[5], just beyond the original equals sign. If you print the string at myPath, you get /bin:/usr/bin:/usr/pkg/bin:/usr/local/bin.
  • Last but not least, strsep() is destructive: the copy in tmpStr is altered. This is why we used strcpy(). The copy is tokenized, but the original PATH value can be used again.

P.S. In Section 3, you loop through the paths by strsep(&myPath, ":").
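The loop over the paths might look like the sketch below. The function name split_path and its parameters are assumptions made for illustration. The _DEFAULT_SOURCE line is only there to expose strsep() on glibc-based systems and is assumed to be harmless on Minix.

```c
#define _DEFAULT_SOURCE   /* exposes strsep() on glibc; assumed harmless elsewhere */
#include <stdio.h>
#include <string.h>

/*
 * Split a colon-separated list into dirs[], storing at most max entries.
 * The list is first copied into buf, so the caller's string stays intact.
 * Returns the number of directories stored.
 */
int split_path(const char *list, char *buf, size_t buflen, char *dirs[], int max)
{
    char *cursor, *dir;
    int n = 0;

    strncpy(buf, list, buflen - 1);
    buf[buflen - 1] = '\0';
    cursor = buf;

    /* strsep() returns NULL once cursor itself has become NULL. */
    while (n < max && (dir = strsep(&cursor, ":")) != NULL) {
        if (*dir != '\0')      /* skip empty fields such as "a::b" */
            dirs[n++] = dir;
    }
    return n;
}
```

Because only buf is modified, the original PATH string can be split again for the next command.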

Implement Environment Variables

To implement env, setenv, and unsetenv, you need to do the following:

  • At the beginning of bsh, loop through envp and make a copy of the environment variables in the memory space of your code. If you don’t make a copy and later try to change the values directly in envp, disasters (segmentation faults) will strike. To store the environment variables, it is easier to use an array than a linked list. You can assume that there are no more than 64 variables (bsh.c has #define MAXENV 64).
  • When the users of bsh (you and the instructor) run setenv, search your copy of the variables. If it is an existing variable, you should malloc() a new string, save the new value, and free() the old string. If you don’t malloc() a new string, the existing space may not be long enough to hold the new value, and segfaults will strike. If the variable is new, you just malloc() new space to save it.
  • When the users run unsetenv, remove the variable from the list. Don’t forget to free() the memory, or else there will be a memory leak.
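One way to arrange these three steps is sketched below, assuming a fixed array of MAXENV pointers. The names env_tab, env_count, env_init, env_find, env_set, and env_unset are hypothetical and do not come from bsh.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXENV 64                    /* same limit as in bsh.c */

static char *env_tab[MAXENV + 1];    /* hypothetical table; NULL-terminated */
static int env_count = 0;

/* Copy every envp string into memory that bsh owns. */
void env_init(char *const envp[])
{
    for (env_count = 0; env_count < MAXENV && envp[env_count]; env_count++) {
        char *copy = malloc(strlen(envp[env_count]) + 1);
        if (copy == NULL)
            break;
        strcpy(copy, envp[env_count]);
        env_tab[env_count] = copy;
    }
    env_tab[env_count] = NULL;
}

/* Return the index of name in env_tab, or -1 if it is absent. */
int env_find(const char *name)
{
    size_t len = strlen(name);
    int i;

    for (i = 0; i < env_count; i++)
        if (strncmp(env_tab[i], name, len) == 0 && env_tab[i][len] == '=')
            return i;
    return -1;
}

/* setenv: always build a fresh "name=value" string, then free the old one. */
int env_set(const char *name, const char *value)
{
    int i = env_find(name);
    char *entry;

    if (i < 0 && env_count >= MAXENV)
        return -1;                               /* table is full */
    entry = malloc(strlen(name) + strlen(value) + 2);
    if (entry == NULL)
        return -1;
    sprintf(entry, "%s=%s", name, value);
    if (i >= 0) {
        free(env_tab[i]);
        env_tab[i] = entry;
    } else {
        env_tab[env_count++] = entry;
        env_tab[env_count] = NULL;
    }
    return 0;
}

/* unsetenv: free the entry and close the gap by shifting later entries. */
int env_unset(const char *name)
{
    int i = env_find(name);

    if (i < 0)
        return -1;
    free(env_tab[i]);
    for (; i < env_count; i++)
        env_tab[i] = env_tab[i + 1];   /* also moves the NULL terminator */
    env_count--;
    return 0;
}
```

Keeping the table NULL-terminated means it has the same shape as envp, so the env command can print it with the same kind of loop.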

Change Directory

You call the function chdir() to change the working directory of bsh. There are two cases. If the user enters just cd, you chdir() to the value of the environment variable HOME. If the user enters cd someDir, you chdir() to someDir. Don’t forget to set the value of PWD accordingly. When you save a new value to PWD, it is safer to malloc() a new string with enough space. Otherwise, if the new PWD is longer than the previous PWD, segfaults may strike. Don’t forget to free() the previous memory, or else there will be a memory leak.
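The two cases can be sketched as below. The function name do_cd is hypothetical. Using getcwd() to obtain the new absolute path is an assumption of this sketch, since the text above only requires chdir().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Change directory to dir, or to home when dir is NULL (plain "cd").
 * On success, returns a freshly malloc()ed "PWD=..." string; the caller
 * stores it in its environment copy and free()s the previous PWD entry.
 * Returns NULL on failure.
 */
char *do_cd(const char *dir, const char *home)
{
    char cwd[1024];
    char *entry;
    const char *target = (dir != NULL) ? dir : home;

    if (target == NULL) {
        fprintf(stderr, "cd: HOME is not set\n");
        return NULL;
    }
    if (chdir(target) != 0) {
        perror("cd");
        return NULL;
    }
    /* Ask for the absolute path, so that "cd .." yields a clean PWD. */
    if (getcwd(cwd, sizeof cwd) == NULL) {
        perror("getcwd");
        return NULL;
    }
    entry = malloc(strlen("PWD=") + strlen(cwd) + 1);
    if (entry != NULL)
        sprintf(entry, "PWD=%s", cwd);
    return entry;
}
```

Returning a new string instead of writing into the old PWD buffer avoids the overflow described above when the new path is longer.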

History

The Linux command history lists, in chronological order, the last 500 commands that a user has entered. Implement this feature. In bsh.c, there is #define MAXLINE 1024, which limits a command line to at most 1,024 bytes. Given this, it is easier to use an array of 500 pointers, each pointing to a string of 1,024 bytes, so that you don’t need to malloc() and free() all the time. However, before the user has entered 500 commands, you need a way to know which history slots are valid and which slots are yet to be filled.
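One possible way to track the valid slots is to keep a running total of commands and treat the storage as a circular buffer. The sketch below uses a two-dimensional array rather than 500 separate pointers; MAXHIST and the hist_* names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 1024   /* from bsh.c */
#define MAXHIST 500    /* hypothetical name for the 500-command limit */

static char hist[MAXHIST][MAXLINE];
static long hist_total = 0;   /* commands entered since bsh started */

/* Store a command, overwriting the oldest slot once the buffer is full. */
void hist_add(const char *line)
{
    char *slot = hist[hist_total % MAXHIST];

    strncpy(slot, line, MAXLINE - 1);
    slot[MAXLINE - 1] = '\0';
    hist_total++;
}

/* Number of valid slots: everything entered so far, capped at MAXHIST. */
long hist_count(void)
{
    return hist_total < MAXHIST ? hist_total : MAXHIST;
}

/* The i-th oldest stored command, for 0 <= i < hist_count(). */
const char *hist_get(long i)
{
    long first = hist_total - hist_count();

    return hist[(first + i) % MAXHIST];
}

/* Print the stored commands, oldest first, numbered from the start of bsh. */
void hist_print(void)
{
    long first = hist_total - hist_count();
    long i;

    for (i = 0; i < hist_count(); i++)
        printf("%5ld  %s\n", first + i + 1, hist_get(i));
}
```

With this layout, a slot is valid exactly when its position is below hist_count(), so no separate "filled" flag is needed.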

Minix Commands

An environment variable called PATH has a list of directories that contain Minix executables. When a command entered by the user is not one of the built-in commands, bsh should check to see if the command exists in one of the directories in PATH. You need to do the following:

  • Iterate through the directories listed in PATH
    • Use strsep() to separate the paths. However, you should make a copy of PATH=… and apply strsep() to the copy, because strsep() is destructive. You must keep the original copy of PATH=… intact so that you can use it again.
    • Append the user command to the end of a path to make an absolute path. For example, if the user command is ls, and you have extracted a path /bin from PATH, then you concatenate them to create an absolute path /bin/ls.
    • Use the function access() to see if the absolute path is a valid executable.
    • If access() says it is indeed an executable, run it immediately as follows (and ignore the rest of the directories in PATH):
      • Use fork() to generate a child process
      • Make the parent process wait for the child
      • In the child process, call execv() with the appropriate parameters to run the executable. See bsh.c for details.
    • If the command does not appear anywhere in the paths, an error message should be printed.

To implement the full functionality of bsh, you may need several system calls: fork(), wait(), waitpid(), execv(), chdir(), access(). You can find their manual pages with the man command in Minix (minix# man fork), or you can search for man fork in a browser.
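The search-and-run steps listed above can be sketched as follows. The names find_in_path and run_program are hypothetical, and the actual execv() arrangement in bsh.c may differ. Note that access() with X_OK also succeeds for directories, which this sketch does not filter out.

```c
#define _DEFAULT_SOURCE   /* exposes strsep() on glibc; assumed harmless elsewhere */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAXLINE 1024      /* from bsh.c */

/*
 * Search the colon-separated path_value (the text after "PATH=") for cmd.
 * On success, writes the absolute path into out and returns 0; else -1.
 */
int find_in_path(const char *path_value, const char *cmd, char *out, size_t outlen)
{
    char copy[MAXLINE], *cursor = copy, *dir;

    strncpy(copy, path_value, sizeof copy - 1);   /* strsep() is destructive */
    copy[sizeof copy - 1] = '\0';
    while ((dir = strsep(&cursor, ":")) != NULL) {
        if (*dir == '\0')
            continue;
        snprintf(out, outlen, "%s/%s", dir, cmd);
        if (access(out, X_OK) == 0)
            return 0;                             /* first match wins */
    }
    return -1;
}

/* Fork, exec in the child, wait in the parent. Returns exit status or -1. */
int run_program(const char *abs_path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                 /* child */
        execv(abs_path, argv);
        perror("execv");            /* reached only if execv() failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

When find_in_path() returns -1 for every directory, bsh would print its "command not found" style error instead of calling run_program().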

I/O Redirection

B-shell needs to be able to handle redirection of stdin and stdout. The following commands are examples of I/O redirection.

bsh> ls > tmp.txt          //Redirects stdout to the file tmp.txt
bsh> wc < tmp.txt          //Redirects stdin from the file tmp.txt
bsh> wc < in.txt > out.txt //Redirects stdin from in.txt, stdout to out.txt

When a child process is created using the system call fork(), it gets a copy of its parent’s file descriptor table. Included in this table are the file descriptors for stdin (fd 0) and stdout (fd 1). Each of these can be redirected by closing them and then creating a new file descriptor in their place using the system calls open(), close(), and dup(). For instance, to redirect output from stdout to the file tmp.txt, you do the following:

fid = open("tmp.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644); //O_CREAT requires a mode; O_TRUNC empties an existing file
close(1); //closes stdout
dup(fid); //fd 1 now refers to the same file as fid
close(fid); //fid is no longer needed; use fd 1 instead

The system call dup() duplicates fid to the first available entry in the file descriptor table, in this case, fd 1, because it was just closed at the previous line.
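The same open(), close(), dup() sequence can be wrapped in a helper that handles either stdin or stdout. This is a sketch under the assumption that it runs in the child after fork() and before execv(); the name redirect_fd and the 0644 permission bits are choices made for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Point target_fd (0 for stdin, 1 for stdout) at the named file.
 * Returns 0 on success, -1 on failure.
 */
int redirect_fd(int target_fd, const char *filename)
{
    int flags = (target_fd == 0) ? O_RDONLY : (O_WRONLY | O_CREAT | O_TRUNC);
    int fid = open(filename, flags, 0644);

    if (fid < 0) {
        perror(filename);
        return -1;
    }
    if (fid == target_fd)          /* target was closed; open() reused it */
        return 0;
    close(target_fd);
    /* dup() picks the lowest free descriptor, which should be target_fd. */
    if (dup(fid) != target_fd) {
        fprintf(stderr, "redirect: unexpected descriptor\n");
        close(fid);
        return -1;
    }
    close(fid);
    return 0;
}
```

For wc < in.txt > out.txt, the child would call redirect_fd(0, "in.txt") and redirect_fd(1, "out.txt") before execv().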

Parsing User Commands

If the user enters commands cleanly, with all parts separated by spaces (for example, cat in.txt | wc > out.txt), the function strsep() is all you need to tokenize the command line. Here, you implement a way to allow the instructor to enter a command like:

cat in.txt|wc>out.txt

You can write C code or use the Minix utility flex.
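If you take the hand-written C route, a character-by-character scanner is one option. The sketch below treats |, <, and > as single-character tokens even when no spaces surround them; the function name tokenize and its buffer convention are assumptions, and quoting is not handled.

```c
#include <ctype.h>
#include <string.h>

/*
 * Split line into tokens. The operators '|', '<' and '>' become tokens of
 * their own even without surrounding spaces. Tokens are copied into buf,
 * each NUL-terminated, and toks[] receives pointers into buf.
 * buf must hold at least 2 * strlen(line) + 1 bytes.
 * Returns the number of tokens stored (at most max).
 */
int tokenize(const char *line, char *buf, char *toks[], int max)
{
    const char *ops = "|<>";
    const char *p = line;
    int n = 0;

    while (*p != '\0' && n < max) {
        if (isspace((unsigned char)*p)) {
            p++;                       /* skip whitespace between tokens */
            continue;
        }
        toks[n++] = buf;
        if (strchr(ops, *p) != NULL) {
            *buf++ = *p++;             /* an operator is a one-char token */
        } else {
            while (*p != '\0' && !isspace((unsigned char)*p) &&
                   strchr(ops, *p) == NULL)
                *buf++ = *p++;         /* copy an ordinary word */
        }
        *buf++ = '\0';
    }
    return n;
}
```

For the line cat in.txt|wc>out.txt this yields the six tokens cat, in.txt, |, wc, >, and out.txt, the same result strsep() gives for the space-separated form.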

Pipe

A pipe is a one-way communication channel between two processes. One process writes to one end of the pipe, and the other process reads from the other end. The process that reads from the pipe should know it is time to exit when it reads an EOF on its input. Pipes are created using the system call pipe(). Adapt the code in pipeDemo.c to bsh.c. You are required to implement only one pipe, which connects two commands; don’t worry about multiple pipes connecting more than two commands.
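A single pipe between two commands could be wired up as in the sketch below. It is not the code from pipeDemo.c; the function name run_pipe is hypothetical, and dup2() is used here as a shorthand for the close() plus dup() sequence. Each argument vector is assumed to start with an absolute path.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run "left | right". Each argv is NULL-terminated and argv[0] is an
 * absolute path. Returns the exit status of the right-hand command, or -1.
 */
int run_pipe(char *const left[], char *const right[])
{
    int fd[2], status;
    pid_t lpid, rpid;

    if (pipe(fd) < 0) {
        perror("pipe");
        return -1;
    }
    lpid = fork();
    if (lpid == 0) {               /* left child: stdout goes to the pipe */
        close(fd[0]);
        dup2(fd[1], 1);
        close(fd[1]);
        execv(left[0], left);
        perror(left[0]);
        _exit(127);
    }
    rpid = (lpid > 0) ? fork() : -1;
    if (rpid == 0) {               /* right child: stdin comes from the pipe */
        close(fd[1]);
        dup2(fd[0], 0);
        close(fd[0]);
        execv(right[0], right);
        perror(right[0]);
        _exit(127);
    }
    /* The parent must close both ends, or the reader never sees EOF. */
    close(fd[0]);
    close(fd[1]);
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid < 0) {
        fprintf(stderr, "fork failed\n");
        return -1;
    }
    if (waitpid(rpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The reader gets EOF only after every copy of the write end is closed, which is why both children and the parent close the descriptors they do not use.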

Grading Rubric

Put all files in /root/proj4.

  • Write a readMe.txt that explains what you have done and what you haven’t done
  • env, setenv, unsetenv
  • cd
  • history
  • Minix commands
  • I/O redirection
  • parsing
  • pipe