mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-25 23:11:02 +08:00
Translated by qhwdw
This commit is contained in:
parent
d178ed8fef
commit
19515041cd
@ -1,346 +0,0 @@
|
||||
Translating by qhwdw
|
||||
Lab 5: File system, Spawn and Shell
|
||||
======
|
||||
|
||||
**Due Thursday, November 15, 2018
|
||||
**
|
||||
|
||||
### Introduction
|
||||
|
||||
In this lab, you will implement `spawn`, a library call that loads and runs on-disk executables. You will then flesh out your kernel and library operating system enough to run a shell on the console. These features need a file system, and this lab introduces a simple read/write file system.
|
||||
|
||||
#### Getting Started
|
||||
|
||||
Use Git to fetch the latest version of the course repository, and then create a local branch called `lab5` based on our lab5 branch, `origin/lab5`:
|
||||
|
||||
```
|
||||
athena% cd ~/6.828/lab
|
||||
athena% add git
|
||||
athena% git pull
|
||||
Already up-to-date.
|
||||
athena% git checkout -b lab5 origin/lab5
|
||||
Branch lab5 set up to track remote branch refs/remotes/origin/lab5.
|
||||
Switched to a new branch "lab5"
|
||||
athena% git merge lab4
|
||||
Merge made by recursive.
|
||||
.....
|
||||
athena%
|
||||
```
|
||||
|
||||
The main new component for this part of the lab is the file system environment, located in the new `fs` directory. Scan through all the files in this directory to get a feel for what all is new. Also, there are some new file system-related source files in the `user` and `lib` directories,
|
||||
|
||||
| fs/fs.c | Code that mainipulates the file system's on-disk structure. |
|
||||
| fs/bc.c | A simple block cache built on top of our user-level page fault handling facility. |
|
||||
| fs/ide.c | Minimal PIO-based (non-interrupt-driven) IDE driver code. |
|
||||
| fs/serv.c | The file system server that interacts with client environments using file system IPCs. |
|
||||
| lib/fd.c | Code that implements the general UNIX-like file descriptor interface. |
|
||||
| lib/file.c | The driver for on-disk file type, implemented as a file system IPC client. |
|
||||
| lib/console.c | The driver for console input/output file type. |
|
||||
| lib/spawn.c | Code skeleton of the spawn library call. |
|
||||
|
||||
You should run the pingpong, primes, and forktree test cases from lab 4 again after merging in the new lab 5 code. You will need to comment out the `ENV_CREATE(fs_fs)` line in `kern/init.c` because `fs/fs.c` tries to do some I/O, which JOS does not allow yet. Similarly, temporarily comment out the call to `close_all()` in `lib/exit.c`; this function calls subroutines that you will implement later in the lab, and therefore will panic if called. If your lab 4 code doesn't contain any bugs, the test cases should run fine. Don't proceed until they work. Don't forget to un-comment these lines when you start Exercise 1.
|
||||
|
||||
If they don't work, use git diff lab4 to review all the changes, making sure there isn't any code you wrote for lab4 (or before) missing from lab 5. Make sure that lab 4 still works.
|
||||
|
||||
#### Lab Requirements
|
||||
|
||||
As before, you will need to do all of the regular exercises described in the lab and _at least one_ challenge problem. Additionally, you will need to write up brief answers to the questions posed in the lab and a short (e.g., one or two paragraph) description of what you did to solve your chosen challenge problem. If you implement more than one challenge problem, you only need to describe one of them in the write-up, though of course you are welcome to do more. Place the write-up in a file called `answers-lab5.txt` in the top level of your `lab5` directory before handing in your work.
|
||||
|
||||
### File system preliminaries
|
||||
|
||||
The file system you will work with is much simpler than most "real" file systems including that of xv6 UNIX, but it is powerful enough to provide the basic features: creating, reading, writing, and deleting files organized in a hierarchical directory structure.
|
||||
|
||||
We are (for the moment anyway) developing only a single-user operating system, which provides protection sufficient to catch bugs but not to protect multiple mutually suspicious users from each other. Our file system therefore does not support the UNIX notions of file ownership or permissions. Our file system also currently does not support hard links, symbolic links, time stamps, or special device files like most UNIX file systems do.
|
||||
|
||||
### On-Disk File System Structure
|
||||
|
||||
Most UNIX file systems divide available disk space into two main types of regions: _inode_ regions and _data_ regions. UNIX file systems assign one _inode_ to each file in the file system; a file's inode holds critical meta-data about the file such as its `stat` attributes and pointers to its data blocks. The data regions are divided into much larger (typically 8KB or more) _data blocks_ , within which the file system stores file data and directory meta-data. Directory entries contain file names and pointers to inodes; a file is said to be _hard-linked_ if multiple directory entries in the file system refer to that file's inode. Since our file system will not support hard links, we do not need this level of indirection and therefore can make a convenient simplification: our file system will not use inodes at all and instead will simply store all of a file's (or sub-directory's) meta-data within the (one and only) directory entry describing that file.
|
||||
|
||||
Both files and directories logically consist of a series of data blocks, which may be scattered throughout the disk much like the pages of an environment's virtual address space can be scattered throughout physical memory. The file system environment hides the details of block layout, presenting interfaces for reading and writing sequences of bytes at arbitrary offsets within files. The file system environment handles all modifications to directories internally as a part of performing actions such as file creation and deletion. Our file system does allow user environments to _read_ directory meta-data directly (e.g., with `read`), which means that user environments can perform directory scanning operations themselves (e.g., to implement the `ls` program) rather than having to rely on additional special calls to the file system. The disadvantage of this approach to directory scanning, and the reason most modern UNIX variants discourage it, is that it makes application programs dependent on the format of directory meta-data, making it difficult to change the file system's internal layout without changing or at least recompiling application programs as well.
|
||||
|
||||
#### Sectors and Blocks
|
||||
|
||||
Most disks cannot perform reads and writes at byte granularity and instead perform reads and writes in units of _sectors_. In JOS, sectors are 512 bytes each. File systems actually allocate and use disk storage in units of _blocks_. Be wary of the distinction between the two terms: _sector size_ is a property of the disk hardware, whereas _block size_ is an aspect of the operating system using the disk. A file system's block size must be a multiple of the sector size of the underlying disk.
|
||||
|
||||
The UNIX xv6 file system uses a block size of 512 bytes, the same as the sector size of the underlying disk. Most modern file systems use a larger block size, however, because storage space has gotten much cheaper and it is more efficient to manage storage at larger granularities. Our file system will use a block size of 4096 bytes, conveniently matching the processor's page size.
|
||||
|
||||
#### Superblocks
|
||||
|
||||
![Disk layout][1]
|
||||
|
||||
File systems typically reserve certain disk blocks at "easy-to-find" locations on the disk (such as the very start or the very end) to hold meta-data describing properties of the file system as a whole, such as the block size, disk size, any meta-data required to find the root directory, the time the file system was last mounted, the time the file system was last checked for errors, and so on. These special blocks are called _superblocks_.
|
||||
|
||||
Our file system will have exactly one superblock, which will always be at block 1 on the disk. Its layout is defined by `struct Super` in `inc/fs.h`. Block 0 is typically reserved to hold boot loaders and partition tables, so file systems generally do not use the very first disk block. Many "real" file systems maintain multiple superblocks, replicated throughout several widely-spaced regions of the disk, so that if one of them is corrupted or the disk develops a media error in that region, the other superblocks can still be found and used to access the file system.
|
||||
|
||||
#### File Meta-data
|
||||
|
||||
![File structure][2]
|
||||
The layout of the meta-data describing a file in our file system is described by `struct File` in `inc/fs.h`. This meta-data includes the file's name, size, type (regular file or directory), and pointers to the blocks comprising the file. As mentioned above, we do not have inodes, so this meta-data is stored in a directory entry on disk. Unlike in most "real" file systems, for simplicity we will use this one `File` structure to represent file meta-data as it appears _both on disk and in memory_.
|
||||
|
||||
The `f_direct` array in `struct File` contains space to store the block numbers of the first 10 (`NDIRECT`) blocks of the file, which we call the file's _direct_ blocks. For small files up to 10*4096 = 40KB in size, this means that the block numbers of all of the file's blocks will fit directly within the `File` structure itself. For larger files, however, we need a place to hold the rest of the file's block numbers. For any file greater than 40KB in size, therefore, we allocate an additional disk block, called the file's _indirect block_ , to hold up to 4096/4 = 1024 additional block numbers. Our file system therefore allows files to be up to 1034 blocks, or just over four megabytes, in size. To support larger files, "real" file systems typically support _double-_ and _triple-indirect blocks_ as well.
|
||||
|
||||
#### Directories versus Regular Files
|
||||
|
||||
A `File` structure in our file system can represent either a _regular_ file or a directory; these two types of "files" are distinguished by the `type` field in the `File` structure. The file system manages regular files and directory-files in exactly the same way, except that it does not interpret the contents of the data blocks associated with regular files at all, whereas the file system interprets the contents of a directory-file as a series of `File` structures describing the files and subdirectories within the directory.
|
||||
|
||||
The superblock in our file system contains a `File` structure (the `root` field in `struct Super`) that holds the meta-data for the file system's root directory. The contents of this directory-file is a sequence of `File` structures describing the files and directories located within the root directory of the file system. Any subdirectories in the root directory may in turn contain more `File` structures representing sub-subdirectories, and so on.
|
||||
|
||||
### The File System
|
||||
|
||||
The goal for this lab is not to have you implement the entire file system, but for you to implement only certain key components. In particular, you will be responsible for reading blocks into the block cache and flushing them back to disk; allocating disk blocks; mapping file offsets to disk blocks; and implementing read, write, and open in the IPC interface. Because you will not be implementing all of the file system yourself, it is very important that you familiarize yourself with the provided code and the various file system interfaces.
|
||||
|
||||
### Disk Access
|
||||
|
||||
The file system environment in our operating system needs to be able to access the disk, but we have not yet implemented any disk access functionality in our kernel. Instead of taking the conventional "monolithic" operating system strategy of adding an IDE disk driver to the kernel along with the necessary system calls to allow the file system to access it, we instead implement the IDE disk driver as part of the user-level file system environment. We will still need to modify the kernel slightly, in order to set things up so that the file system environment has the privileges it needs to implement disk access itself.
|
||||
|
||||
It is easy to implement disk access in user space this way as long as we rely on polling, "programmed I/O" (PIO)-based disk access and do not use disk interrupts. It is possible to implement interrupt-driven device drivers in user mode as well (the L3 and L4 kernels do this, for example), but it is more difficult since the kernel must field device interrupts and dispatch them to the correct user-mode environment.
|
||||
|
||||
The x86 processor uses the IOPL bits in the EFLAGS register to determine whether protected-mode code is allowed to perform special device I/O instructions such as the IN and OUT instructions. Since all of the IDE disk registers we need to access are located in the x86's I/O space rather than being memory-mapped, giving "I/O privilege" to the file system environment is the only thing we need to do in order to allow the file system to access these registers. In effect, the IOPL bits in the EFLAGS register provides the kernel with a simple "all-or-nothing" method of controlling whether user-mode code can access I/O space. In our case, we want the file system environment to be able to access I/O space, but we do not want any other environments to be able to access I/O space at all.
|
||||
|
||||
```
|
||||
Exercise 1. `i386_init` identifies the file system environment by passing the type `ENV_TYPE_FS` to your environment creation function, `env_create`. Modify `env_create` in `env.c`, so that it gives the file system environment I/O privilege, but never gives that privilege to any other environment.
|
||||
|
||||
Make sure you can start the file environment without causing a General Protection fault. You should pass the "fs i/o" test in make grade.
|
||||
```
|
||||
|
||||
```
|
||||
Question
|
||||
|
||||
1. Do you have to do anything else to ensure that this I/O privilege setting is saved and restored properly when you subsequently switch from one environment to another? Why?
|
||||
```
|
||||
|
||||
|
||||
Note that the `GNUmakefile` file in this lab sets up QEMU to use the file `obj/kern/kernel.img` as the image for disk 0 (typically "Drive C" under DOS/Windows) as before, and to use the (new) file `obj/fs/fs.img` as the image for disk 1 ("Drive D"). In this lab our file system should only ever touch disk 1; disk 0 is used only to boot the kernel. If you manage to corrupt either disk image in some way, you can reset both of them to their original, "pristine" versions simply by typing:
|
||||
|
||||
```
|
||||
$ rm obj/kern/kernel.img obj/fs/fs.img
|
||||
$ make
|
||||
```
|
||||
|
||||
or by doing:
|
||||
|
||||
```
|
||||
$ make clean
|
||||
$ make
|
||||
```
|
||||
|
||||
Challenge! Implement interrupt-driven IDE disk access, with or without DMA. You can decide whether to move the device driver into the kernel, keep it in user space along with the file system, or even (if you really want to get into the micro-kernel spirit) move it into a separate environment of its own.
|
||||
|
||||
### The Block Cache
|
||||
|
||||
In our file system, we will implement a simple "buffer cache" (really just a block cache) with the help of the processor's virtual memory system. The code for the block cache is in `fs/bc.c`.
|
||||
|
||||
Our file system will be limited to handling disks of size 3GB or less. We reserve a large, fixed 3GB region of the file system environment's address space, from 0x10000000 (`DISKMAP`) up to 0xD0000000 (`DISKMAP+DISKMAX`), as a "memory mapped" version of the disk. For example, disk block 0 is mapped at virtual address 0x10000000, disk block 1 is mapped at virtual address 0x10001000, and so on. The `diskaddr` function in `fs/bc.c` implements this translation from disk block numbers to virtual addresses (along with some sanity checking).
|
||||
|
||||
Since our file system environment has its own virtual address space independent of the virtual address spaces of all other environments in the system, and the only thing the file system environment needs to do is to implement file access, it is reasonable to reserve most of the file system environment's address space in this way. It would be awkward for a real file system implementation on a 32-bit machine to do this since modern disks are larger than 3GB. Such a buffer cache management approach may still be reasonable on a machine with a 64-bit address space.
|
||||
|
||||
Of course, it would take a long time to read the entire disk into memory, so instead we'll implement a form of _demand paging_ , wherein we only allocate pages in the disk map region and read the corresponding block from the disk in response to a page fault in this region. This way, we can pretend that the entire disk is in memory.
|
||||
|
||||
```
|
||||
Exercise 2. Implement the `bc_pgfault` and `flush_block` functions in `fs/bc.c`. `bc_pgfault` is a page fault handler, just like the one your wrote in the previous lab for copy-on-write fork, except that its job is to load pages in from the disk in response to a page fault. When writing this, keep in mind that (1) `addr` may not be aligned to a block boundary and (2) `ide_read` operates in sectors, not blocks.
|
||||
|
||||
The `flush_block` function should write a block out to disk _if necessary_. `flush_block` shouldn't do anything if the block isn't even in the block cache (that is, the page isn't mapped) or if it's not dirty. We will use the VM hardware to keep track of whether a disk block has been modified since it was last read from or written to disk. To see whether a block needs writing, we can just look to see if the `PTE_D` "dirty" bit is set in the `uvpt` entry. (The `PTE_D` bit is set by the processor in response to a write to that page; see 5.2.4.3 in [chapter 5][3] of the 386 reference manual.) After writing the block to disk, `flush_block` should clear the `PTE_D` bit using `sys_page_map`.
|
||||
|
||||
Use make grade to test your code. Your code should pass "check_bc", "check_super", and "check_bitmap".
|
||||
```
|
||||
|
||||
`fs_init` function in `fs/fs.c` is a prime example of how to use the block cache. After initializing the block cache, it simply stores pointers into the disk map region in the `super` global variable. After this point, we can simply read from the `super` structure as if they were in memory and our page fault handler will read them from disk as necessary.
|
||||
|
||||
```
|
||||
Challenge! The block cache has no eviction policy. Once a block gets faulted in to it, it never gets removed and will remain in memory forevermore. Add eviction to the buffer cache. Using the `PTE_A` "accessed" bits in the page tables, which the hardware sets on any access to a page, you can track approximate usage of disk blocks without the need to modify every place in the code that accesses the disk map region. Be careful with dirty blocks.
|
||||
```
|
||||
|
||||
### The Block Bitmap
|
||||
|
||||
After `fs_init` sets the `bitmap` pointer, we can treat `bitmap` as a packed array of bits, one for each block on the disk. See, for example, `block_is_free`, which simply checks whether a given block is marked free in the bitmap.
|
||||
|
||||
```
|
||||
Exercise 3. Use `free_block` as a model to implement `alloc_block` in `fs/fs.c`, which should find a free disk block in the bitmap, mark it used, and return the number of that block. When you allocate a block, you should immediately flush the changed bitmap block to disk with `flush_block`, to help file system consistency.
|
||||
|
||||
Use make grade to test your code. Your code should now pass "alloc_block".
|
||||
```
|
||||
|
||||
### File Operations
|
||||
|
||||
We have provided a variety of functions in `fs/fs.c` to implement the basic facilities you will need to interpret and manage `File` structures, scan and manage the entries of directory-files, and walk the file system from the root to resolve an absolute pathname. Read through _all_ of the code in `fs/fs.c` and make sure you understand what each function does before proceeding.
|
||||
|
||||
```
|
||||
Exercise 4. Implement `file_block_walk` and `file_get_block`. `file_block_walk` maps from a block offset within a file to the pointer for that block in the `struct File` or the indirect block, very much like what `pgdir_walk` did for page tables. `file_get_block` goes one step further and maps to the actual disk block, allocating a new one if necessary.
|
||||
|
||||
Use make grade to test your code. Your code should pass "file_open", "file_get_block", and "file_flush/file_truncated/file rewrite", and "testfile".
|
||||
```
|
||||
|
||||
`file_block_walk` and `file_get_block` are the workhorses of the file system. For example, `file_read` and `file_write` are little more than the bookkeeping atop `file_get_block` necessary to copy bytes between scattered blocks and a sequential buffer.
|
||||
|
||||
```
|
||||
Challenge! The file system is likely to be corrupted if it gets interrupted in the middle of an operation (for example, by a crash or a reboot). Implement soft updates or journalling to make the file system crash-resilient and demonstrate some situation where the old file system would get corrupted, but yours doesn't.
|
||||
```
|
||||
|
||||
### The file system interface
|
||||
|
||||
Now that we have the necessary functionality within the file system environment itself, we must make it accessible to other environments that wish to use the file system. Since other environments can't directly call functions in the file system environment, we'll expose access to the file system environment via a _remote procedure call_ , or RPC, abstraction, built atop JOS's IPC mechanism. Graphically, here's what a call to the file system server (say, read) looks like
|
||||
|
||||
```
|
||||
Regular env FS env
|
||||
+---------------+ +---------------+
|
||||
| read | | file_read |
|
||||
| (lib/fd.c) | | (fs/fs.c) |
|
||||
...|.......|.......|...|.......^.......|...............
|
||||
| v | | | | RPC mechanism
|
||||
| devfile_read | | serve_read |
|
||||
| (lib/file.c) | | (fs/serv.c) |
|
||||
| | | | ^ |
|
||||
| v | | | |
|
||||
| fsipc | | serve |
|
||||
| (lib/file.c) | | (fs/serv.c) |
|
||||
| | | | ^ |
|
||||
| v | | | |
|
||||
| ipc_send | | ipc_recv |
|
||||
| | | | ^ |
|
||||
+-------|-------+ +-------|-------+
|
||||
| |
|
||||
+-------------------+
|
||||
|
||||
```
|
||||
|
||||
Everything below the dotted line is simply the mechanics of getting a read request from the regular environment to the file system environment. Starting at the beginning, `read` (which we provide) works on any file descriptor and simply dispatches to the appropriate device read function, in this case `devfile_read` (we can have more device types, like pipes). `devfile_read` implements `read` specifically for on-disk files. This and the other `devfile_*` functions in `lib/file.c` implement the client side of the FS operations and all work in roughly the same way, bundling up arguments in a request structure, calling `fsipc` to send the IPC request, and unpacking and returning the results. The `fsipc` function simply handles the common details of sending a request to the server and receiving the reply.
|
||||
|
||||
The file system server code can be found in `fs/serv.c`. It loops in the `serve` function, endlessly receiving a request over IPC, dispatching that request to the appropriate handler function, and sending the result back via IPC. In the read example, `serve` will dispatch to `serve_read`, which will take care of the IPC details specific to read requests such as unpacking the request structure and finally call `file_read` to actually perform the file read.
|
||||
|
||||
Recall that JOS's IPC mechanism lets an environment send a single 32-bit number and, optionally, share a page. To send a request from the client to the server, we use the 32-bit number for the request type (the file system server RPCs are numbered, just like how syscalls were numbered) and store the arguments to the request in a `union Fsipc` on the page shared via the IPC. On the client side, we always share the page at `fsipcbuf`; on the server side, we map the incoming request page at `fsreq` (`0x0ffff000`).
|
||||
|
||||
The server also sends the response back via IPC. We use the 32-bit number for the function's return code. For most RPCs, this is all they return. `FSREQ_READ` and `FSREQ_STAT` also return data, which they simply write to the page that the client sent its request on. There's no need to send this page in the response IPC, since the client shared it with the file system server in the first place. Also, in its response, `FSREQ_OPEN` shares with the client a new "Fd page". We'll return to the file descriptor page shortly.
|
||||
|
||||
```
|
||||
Exercise 5. Implement `serve_read` in `fs/serv.c`.
|
||||
|
||||
`serve_read`'s heavy lifting will be done by the already-implemented `file_read` in `fs/fs.c` (which, in turn, is just a bunch of calls to `file_get_block`). `serve_read` just has to provide the RPC interface for file reading. Look at the comments and code in `serve_set_size` to get a general idea of how the server functions should be structured.
|
||||
|
||||
Use make grade to test your code. Your code should pass "serve_open/file_stat/file_close" and "file_read" for a score of 70/150.
|
||||
```
|
||||
|
||||
```
|
||||
Exercise 6. Implement `serve_write` in `fs/serv.c` and `devfile_write` in `lib/file.c`.
|
||||
|
||||
Use make grade to test your code. Your code should pass "file_write", "file_read after file_write", "open", and "large file" for a score of 90/150.
|
||||
```
|
||||
|
||||
### Spawning Processes
|
||||
|
||||
We have given you the code for `spawn` (see `lib/spawn.c`) which creates a new environment, loads a program image from the file system into it, and then starts the child environment running this program. The parent process then continues running independently of the child. The `spawn` function effectively acts like a `fork` in UNIX followed by an immediate `exec` in the child process.
|
||||
|
||||
We implemented `spawn` rather than a UNIX-style `exec` because `spawn` is easier to implement from user space in "exokernel fashion", without special help from the kernel. Think about what you would have to do in order to implement `exec` in user space, and be sure you understand why it is harder.
|
||||
|
||||
```
|
||||
Exercise 7. `spawn` relies on the new syscall `sys_env_set_trapframe` to initialize the state of the newly created environment. Implement `sys_env_set_trapframe` in `kern/syscall.c` (don't forget to dispatch the new system call in `syscall()`).
|
||||
|
||||
Test your code by running the `user/spawnhello` program from `kern/init.c`, which will attempt to spawn `/hello` from the file system.
|
||||
|
||||
Use make grade to test your code.
|
||||
```
|
||||
|
||||
```
|
||||
Challenge! Implement Unix-style `exec`.
|
||||
```
|
||||
|
||||
```
|
||||
Challenge! Implement `mmap`-style memory-mapped files and modify `spawn` to map pages directly from the ELF image when possible.
|
||||
```
|
||||
|
||||
### Sharing library state across fork and spawn
|
||||
|
||||
The UNIX file descriptors are a general notion that also encompasses pipes, console I/O, etc. In JOS, each of these device types has a corresponding `struct Dev`, with pointers to the functions that implement read/write/etc. for that device type. `lib/fd.c` implements the general UNIX-like file descriptor interface on top of this. Each `struct Fd` indicates its device type, and most of the functions in `lib/fd.c` simply dispatch operations to functions in the appropriate `struct Dev`.
|
||||
|
||||
`lib/fd.c` also maintains the _file descriptor table_ region in each application environment's address space, starting at `FDTABLE`. This area reserves a page's worth (4KB) of address space for each of the up to `MAXFD` (currently 32) file descriptors the application can have open at once. At any given time, a particular file descriptor table page is mapped if and only if the corresponding file descriptor is in use. Each file descriptor also has an optional "data page" in the region starting at `FILEDATA`, which devices can use if they choose.
|
||||
|
||||
We would like to share file descriptor state across `fork` and `spawn`, but file descriptor state is kept in user-space memory. Right now, on `fork`, the memory will be marked copy-on-write, so the state will be duplicated rather than shared. (This means environments won't be able to seek in files they didn't open themselves and that pipes won't work across a fork.) On `spawn`, the memory will be left behind, not copied at all. (Effectively, the spawned environment starts with no open file descriptors.)
|
||||
|
||||
We will change `fork` to know that certain regions of memory are used by the "library operating system" and should always be shared. Rather than hard-code a list of regions somewhere, we will set an otherwise-unused bit in the page table entries (just like we did with the `PTE_COW` bit in `fork`).
|
||||
|
||||
We have defined a new `PTE_SHARE` bit in `inc/lib.h`. This bit is one of the three PTE bits that are marked "available for software use" in the Intel and AMD manuals. We will establish the convention that if a page table entry has this bit set, the PTE should be copied directly from parent to child in both `fork` and `spawn`. Note that this is different from marking it copy-on-write: as described in the first paragraph, we want to make sure to _share_ updates to the page.
|
||||
|
||||
```
|
||||
Exercise 8. Change `duppage` in `lib/fork.c` to follow the new convention. If the page table entry has the `PTE_SHARE` bit set, just copy the mapping directly. (You should use `PTE_SYSCALL`, not `0xfff`, to mask out the relevant bits from the page table entry. `0xfff` picks up the accessed and dirty bits as well.)
|
||||
|
||||
Likewise, implement `copy_shared_pages` in `lib/spawn.c`. It should loop through all page table entries in the current process (just like `fork` did), copying any page mappings that have the `PTE_SHARE` bit set into the child process.
|
||||
```
|
||||
|
||||
Use make run-testpteshare to check that your code is behaving properly. You should see lines that say "`fork handles PTE_SHARE right`" and "`spawn handles PTE_SHARE right`".
|
||||
|
||||
Use make run-testfdsharing to check that file descriptors are shared properly. You should see lines that say "`read in child succeeded`" and "`read in parent succeeded`".
|
||||
|
||||
### The keyboard interface
|
||||
|
||||
For the shell to work, we need a way to type at it. QEMU has been displaying output we write to the CGA display and the serial port, but so far we've only taken input while in the kernel monitor. In QEMU, input typed in the graphical window appear as input from the keyboard to JOS, while input typed to the console appear as characters on the serial port. `kern/console.c` already contains the keyboard and serial drivers that have been used by the kernel monitor since lab 1, but now you need to attach these to the rest of the system.
|
||||
|
||||
```
|
||||
Exercise 9. In your `kern/trap.c`, call `kbd_intr` to handle trap `IRQ_OFFSET+IRQ_KBD` and `serial_intr` to handle trap `IRQ_OFFSET+IRQ_SERIAL`.
|
||||
```
|
||||
|
||||
We implemented the console input/output file type for you, in `lib/console.c`. `kbd_intr` and `serial_intr` fill a buffer with the recently read input while the console file type drains the buffer (the console file type is used for stdin/stdout by default unless the user redirects them).
|
||||
|
||||
Test your code by running make run-testkbd and type a few lines. The system should echo your lines back to you as you finish them. Try typing in both the console and the graphical window, if you have both available.
|
||||
|
||||
### The Shell
|
||||
|
||||
Run make run-icode or make run-icode-nox. This will run your kernel and start `user/icode`. `icode` execs `init`, which will set up the console as file descriptors 0 and 1 (standard input and standard output). It will then spawn `sh`, the shell. You should be able to run the following commands:
|
||||
|
||||
```
|
||||
echo hello world | cat
|
||||
cat lorem |cat
|
||||
cat lorem |num
|
||||
cat lorem |num |num |num |num |num
|
||||
lsfd
|
||||
```
|
||||
|
||||
Note that the user library routine `cprintf` prints straight to the console, without using the file descriptor code. This is great for debugging but not great for piping into other programs. To print output to a particular file descriptor (for example, 1, standard output), use `fprintf(1, "...", ...)`. `printf("...", ...)` is a short-cut for printing to FD 1. See `user/lsfd.c` for examples.
|
||||
|
||||
```
|
||||
Exercise 10.
|
||||
|
||||
The shell doesn't support I/O redirection. It would be nice to run sh <script instead of having to type in all the commands in the script by hand, as you did above. Add I/O redirection for < to `user/sh.c`.
|
||||
|
||||
Test your implementation by typing sh <script into your shell
|
||||
|
||||
Run make run-testshell to test your shell. `testshell` simply feeds the above commands (also found in `fs/testshell.sh`) into the shell and then checks that the output matches `fs/testshell.key`.
|
||||
```
|
||||
|
||||
```
|
||||
Challenge! Add more features to the shell. Possibilities include (a few require changes to the file system too):
|
||||
|
||||
* backgrounding commands (`ls &`)
|
||||
* multiple commands per line (`ls; echo hi`)
|
||||
* command grouping (`(ls; echo hi) | cat > out`)
|
||||
* environment variable expansion (`echo $hello`)
|
||||
* quoting (`echo "a | b"`)
|
||||
* command-line history and/or editing
|
||||
* tab completion
|
||||
* directories, cd, and a PATH for command-lookup.
|
||||
* file creation
|
||||
* ctl-c to kill the running environment
|
||||
|
||||
|
||||
|
||||
but feel free to do something not on this list.
|
||||
```
|
||||
|
||||
Your code should pass all tests at this point. As usual, you can grade your submission with make grade and hand it in with make handin.
|
||||
|
||||
**This completes the lab.** As usual, don't forget to run make grade and to write up your answers and a description of your challenge exercise solution. Before handing in, use git status and git diff to examine your changes and don't forget to git add answers-lab5.txt. When you're ready, commit your changes with git commit -am 'my solutions to lab 5', then make handin to submit your solution.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://pdos.csail.mit.edu/6.828/2018/labs/lab5/
|
||||
|
||||
作者:[csail.mit][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[译者ID](https://github.com/译者ID)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://pdos.csail.mit.edu
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://pdos.csail.mit.edu/6.828/2018/labs/lab5/disk.png
|
||||
[2]: https://pdos.csail.mit.edu/6.828/2018/labs/lab5/file.png
|
||||
[3]: http://pdos.csail.mit.edu/6.828/2011/readings/i386/s05_02.htm
|
339
translated/tech/20181016 Lab 5- File system, Spawn and Shell.md
Normal file
339
translated/tech/20181016 Lab 5- File system, Spawn and Shell.md
Normal file
@ -0,0 +1,339 @@
|
||||
实验 5:文件系统、Spawn 和 Shell
|
||||
======
|
||||
|
||||
### 简介
|
||||
|
||||
在本实验中,你将要去实现 `spawn`,它是一个加载和运行磁盘上可运行文件的库调用。然后,你接着要去充实你的内核和库,以使操作系统能够在控制台上运行一个 shell。而这些特性需要一个文件系统,本实验将引入一个可读/写的简单文件系统。
|
||||
|
||||
#### 预备知识
|
||||
|
||||
使用 Git 去获取最新版的课程仓库,然后创建一个命名为 `lab5` 的本地分支,去跟踪远程的 `origin/lab5` 分支:
|
||||
|
||||
```
|
||||
athena% cd ~/6.828/lab
|
||||
athena% add git
|
||||
athena% git pull
|
||||
Already up-to-date.
|
||||
athena% git checkout -b lab5 origin/lab5
|
||||
Branch lab5 set up to track remote branch refs/remotes/origin/lab5.
|
||||
Switched to a new branch "lab5"
|
||||
athena% git merge lab4
|
||||
Merge made by recursive.
|
||||
.....
|
||||
athena%
|
||||
```
|
||||
|
||||
在实验中这一部分的主要新组件是文件系统环境,它位于新的 `fs` 目录下。通过检查这个目录中的所有文件,我们来看一下新的文件都有什么。另外,在 `user` 和 `lib` 目录下还有一些文件系统相关的源文件。
|
||||
|
||||
fs/fs.c 维护文件系统在磁盘上结构的代码
|
||||
fs/bc.c 构建在我们的用户级页故障处理功能之上的一个简单的块缓存
|
||||
fs/ide.c 极简的基于 PIO(非中断驱动的)IDE 驱动程序代码
|
||||
fs/serv.c 使用文件系统 IPC 与客户端环境交互的文件系统服务器
|
||||
lib/fd.c 实现一个常见的类 UNIX 的文件描述符接口的代码
|
||||
lib/file.c 磁盘上文件类型的驱动,实现为一个文件系统 IPC 客户端
|
||||
lib/console.c 控制台输入/输出文件类型的驱动
|
||||
lib/spawn.c spawn 库调用的框架代码
|
||||
|
||||
你应该再次去运行 `pingpong`、`primes`、和 `forktree`,测试实验 4 完成后合并到新的实验 5 中的代码能否正确运行。你还需要在 `kern/init.c` 中注释掉 `ENV_CREATE(fs_fs)` 行,因为 `fs/fs.c` 将尝试去做一些 I/O,而 JOS 到目前为止还不具备该功能。同样,在 `lib/exit.c` 中临时注释掉对 `close_all()` 的调用;这个函数将调用你在本实验后面部分去实现的子程序,如果现在去调用,它将导致 JOS 内核崩溃。如果你的实验 4 的代码没有任何 bug,将很完美地通过这个测试。在它们都能正常工作之前是不能继续后续实验的。在你开始做练习 1 时,不要忘记去取消这些行上的注释。
|
||||
|
||||
如果它们不能正常工作,使用 `git diff lab4` 去重新评估所有的变更,确保你在实验 4(及以前)所写的代码在本实验中没有丢失。确保实验 4 仍然能正常工作。
|
||||
|
||||
#### 实验要求
|
||||
|
||||
和以前一样,你需要做本实验中所描述的所有常规练习和至少一个挑战问题。另外,你需要写出你在本实验中问题的详细答案,和你是如何解决挑战问题的一个简短(即:用一到两个段落)的描述。如果你实现了多个挑战问题,你只需要写出其中一个即可,当然,我们欢迎你做的越多越好。在你动手实验之前,将你的问题答案写入到你的 `lab5` 根目录下的一个名为 `answers-lab5.txt` 的文件中。
|
||||
|
||||
### 文件系统的雏形
|
||||
|
||||
你将要使用的文件系统比起大多数“真正的”文件系统(包括 xv6 UNIX 的文件系统)要简单的多,但它也是很强大的,足够去提供基本的特性:创建、读取、写入、和删除组织在层次目录结构中的文件。
|
||||
|
||||
到目前为止,我们开发的是一个单用户操作系统,它提供足够的保护并能去捕获 bug,但它还不能在多个不可信用户之间提供保护。因此,我们的文件系统还不支持 UNIX 的所有者或权限的概念。我们的文件系统目前也不支持硬链接、时间戳、或像大多数 UNIX 文件系统实现的那些特殊的设备文件。
|
||||
|
||||
### 磁盘上文件系统结构
|
||||
|
||||
主流的 UNIX 文件系统将可用磁盘空间分为两种主要的区域类型:节点区域和数据区域。UNIX 文件系统在文件系统中为每个文件分配一个节点;一个文件的节点保存了这个文件重要的元数据,比如它的 `stat` 属性和指向数据块的指针。数据区域被分为更大的(一般是 8 KB 或更大)数据块,它在文件系统中存储文件数据和目录元数据。目录条目包含文件名字和指向到节点的指针;如果文件系统中的多个目录条目指向到那个文件的节点上,则称该文件是硬链接的。由于我们的文件系统不支持硬链接,所以我们不需要这种间接的级别,并且因此可以更方便简化:我们的文件系统将压根就不使用节点,而是简单地将文件的(或子目录的)所有元数据保存在描述那个文件的(唯一的)目录条目中。
|
||||
|
||||
文件和目录逻辑上都是由一系列的数据块组成,它或许是很稀疏地分散到磁盘上,就像一个环境的虚拟地址空间上的页,能够稀疏地分散在物理内存中一样。文件系统环境隐藏了块布局的细节,只提供文件中任意偏移位置读写字节序列的接口。作为像文件创建和删除操作的一部分,文件系统环境服务程序在目录内部完成所有的修改。我们的文件系统允许用户环境去直接读取目录元数据(即:使用 `read`),这意味着用户环境自己就能够执行目录扫描操作(即:实现 `ls` 程序),而不用另外依赖对文件系统的特定调用。用这种方法做目录扫描的缺点是,(也是大多数现代 UNIX 操作系统变体摒弃它的原因)使得应用程序依赖目录元数据的格式,如果不改变或至少要重编译应用程序的前提下,去改变文件系统的内部布局将变得很困难。
|
||||
|
||||
#### 扇区和块
|
||||
|
||||
大多数磁盘都不能执行以字节为粒度的读写操作,而是以扇区为单位执行读写。在 JOS 中,每个扇区是 512 字节。文件系统实际上是以块为单位来分配和使用磁盘存储的。要注意这两个术语之间的区别:扇区大小是硬盘硬件的属性,而块大小是使用这个磁盘的操作系统上的术语。一个文件系统的块大小必须是底层磁盘的扇区大小的倍数。
|
||||
|
||||
UNIX xv6 文件系统使用 512 字节大小的块,与它底层磁盘的扇区大小一样。而大多数现代文件系统使用更大尺寸的块,因为现在存储空间变得很廉价了,而使用更大的粒度在存储管理上更高效。我们的文件系统将使用 4096 字节的块,以更方便地去匹配处理器上页的大小。
|
||||
|
||||
#### 超级块
|
||||
|
||||
![Disk layout][1]
|
||||
|
||||
文件系统一般在磁盘上的“易于查找”的位置(比如磁盘开始或结束的位置)保留一些磁盘块,用于保存描述整个文件系统属性的元数据,比如块大小、磁盘大小、用于查找根目录的任何元数据、文件系统最后一次挂载的时间、文件系统最后一次错误检查的时间等等。这些特定的块被称为超级块。
|
||||
|
||||
我们的文件系统只有一个超级块,它固定为磁盘的 1 号块。它的布局定义在 `inc/fs.h` 文件里的 `struct Super` 中。而 0 号块一般是保留的,用于去保存引导加载程序和分区表,因此文件系统一般不会去使用磁盘上比较靠前的块。许多“真实的”文件系统都维护多个超级块,并将它们复制到间隔较大的几个区域中,这样即便其中一个超级块坏了或超级块所在的那个区域产生了介质错误,其它的超级块仍然能够被找到并用于去访问文件系统。
|
||||
|
||||
#### 文件元数据
|
||||
|
||||
![File structure][2]
|
||||
元数据的布局是描述在我们的文件系统中的一个文件中,这个文件就是 `inc/fs.h` 中的 `struct File`。元数据包含文件的名字、大小、类型(普通文件还是目录)、指向构成这个文件的块的指针。正如前面所提到的,我们的文件系统中并没有节点,因此元数据是保存在磁盘上的一个目录条目中,而不是像大多数“真正的”文件系统那样保存在节点中。为简单起见,我们将使用 `File` 这一个结构去表示文件元数据,因为它要同时出现在磁盘上和内存中。
|
||||
|
||||
在 `struct File` 中的数组 `f_direct` 包含一个保存文件的前 10 个块(`NDIRECT`)的块编号的空间,我们称之为文件的直接块。对于最大 `10*4096 = 40KB` 的小文件,这意味着这个文件的所有块的块编号将全部直接保存在结构 `File` 中,但是,对于超过 40 KB 大小的文件,我们需要一个地方去保存文件剩余的块编号。所以我们分配一个额外的磁盘块,我们称之为文件的间接块,由它去保存最多 4096/4 = 1024 个额外的块编号。因此,我们的文件系统最多允许有 1034 个块,或者说不能超过 4MB 大小。为支持大文件,“真正的”文件系统一般都支持两个或三个间接块。
|
||||
|
||||
#### 目录与普通文件
|
||||
|
||||
我们的文件系统中的结构 `File` 既能够表示一个普通文件,也能够表示一个目录;这两种“文件”类型是由 `File` 结构中的 `type` 字段来区分的。除了文件系统根本就不需要解释的、分配给普通文件的数据块的内容之外,它使用完全相同的方式来管理普通文件和目录“文件”,文件系统将目录“文件”的内容解释为包含在目录中的一系列的由 `File` 结构所描述的文件和子目录。
|
||||
|
||||
在我们文件系统中的超级块包含一个结构 `File`(在 `struct Super` 中的 `root` 字段中),它用于保存文件系统的根目录的元数据。这个目录“文件”的内容是一系列的 `File` 结构所描述的、位于文件系统根目录中的文件和目录。在根目录中的任何子目录转而可以包含更多的 `File` 结构所表示的子目录,依此类推。
|
||||
|
||||
### 文件系统
|
||||
|
||||
本实验的目标并不是让你去实现完整的文件系统,你只需要去实现几个重要的组件即可。实践中,你将负责把块读入到块缓存中,并且刷新脏块到磁盘上;分配磁盘块;映射文件偏移量到磁盘块;以及实现读取、写入、和在 IPC 接口中打开。因为你并不去实现完整的文件系统,熟悉提供给你的代码和各种文件系统接口是非常重要的。
|
||||
|
||||
### 磁盘访问
|
||||
|
||||
我们的操作系统的文件系统环境需要能访问磁盘,但是我们在内核中并没有实现任何磁盘访问的功能。与传统的在内核中添加了 IDE 磁盘驱动程序、以及允许文件系统去访问它所必需的系统调用的“大一统”策略不同,我们将 IDE 磁盘驱动实现为用户级文件系统环境的一部分。我们仍然需要对内核做稍微的修改,是为了能够设置一些东西,以便于文件系统环境拥有实现磁盘访问本身所需的权限。
|
||||
|
||||
只要我们依赖轮询、基于 “编程的 I/O”(PIO)的磁盘访问,并且不使用磁盘中断,那么在用户空间中实现磁盘访问还是很容易的。也可以去实现由中断驱动的设备驱动程序(比如像 L3 和 L4 内核就是这么做的),但这样做的话,内核必须接收设备中断并将它派发到相应的用户模式环境上,这样实现的难度会更大。
|
||||
|
||||
x86 处理器在 EFLAGS 寄存器中使用 IOPL 位去确定保护模式中的代码是否允许执行特定的设备 I/O 指令,比如 `IN` 和 `OUT` 指令。由于我们需要的所有 IDE 磁盘寄存器都位于 x86 的 I/O 空间中而不是映射在内存中,所以,为了允许文件系统去访问这些寄存器,我们需要做的唯一的事情便是授予文件系统环境“I/O 权限”。实际上,在 EFLAGS 寄存器的 IOPL 位上规定,内核使用一个简单的“要么全都能访问、要么全都不能访问”的方法来控制用户模式中的代码能否访问 I/O 空间。在我们的案例中,我们希望文件系统环境能够去访问 I/O 空间,但我们又希望任何其它的环境完全不能访问 I/O 空间。
|
||||
|
||||
```markdown
|
||||
练习 1、`i386_init` 通过将类型 `ENV_TYPE_FS` 传递给你的环境创建函数 `env_create` 来识别文件系统。修改 `env.c` 中的 `env_create` ,以便于它只授予文件系统环境 I/O 的权限,而不授予任何其它环境 I/O 的权限。
|
||||
|
||||
确保你能启动这个文件系统环境,而不会产生一般保护故障。你应该要通过在 `make grade` 中的 "fs i/o" 测试。
|
||||
```
|
||||
|
||||
```markdown
|
||||
问题
|
||||
|
||||
1、当你从一个环境切换到另一个环境时,你是否需要做一些操作来确保 I/O 权限设置能被保存和正确地恢复?为什么?
|
||||
```
|
||||
|
||||
|
||||
注意本实验中的 `GNUmakefile` 文件,它用于设置 QEMU 去使用文件 `obj/kern/kernel.img` 作为磁盘 0 的镜像(一般情况下表示 DOS 或 Windows 中的 “C 盘”),以及使用(新)文件 `obj/fs/fs.img` 作为磁盘 1 的镜像(”D 盘“)。在本实验中,我们的文件系统应该仅与磁盘 1 有交互;而磁盘 0 仅用于去引导内核。如果你想去恢复其中一个有某些错误的磁盘镜像,你可以通过输入如下的命令,去重置它们到最初的、”崭新的“版本:
|
||||
|
||||
```
|
||||
$ rm obj/kern/kernel.img obj/fs/fs.img
|
||||
$ make
|
||||
```
|
||||
|
||||
或者:
|
||||
|
||||
```
|
||||
$ make clean
|
||||
$ make
|
||||
```
|
||||
|
||||
小挑战!实现中断驱动的 IDE 磁盘访问,既可以使用也可以不使用 DMA 模式。由你来决定是否将设备驱动移植进内核中、还是与文件系统一样保留在用户空间中、甚至是将它移植到一个它自己的的单独的环境中(如果你真的想了解微内核的本质的话)。
|
||||
|
||||
### 块缓存
|
||||
|
||||
在我们的文件系统中,我们将在处理器虚拟内存系统的帮助下,实现一个简单的”缓冲区“(实际上就是一个块缓冲区)。块缓存的代码在 `fs/bc.c` 文件中。
|
||||
|
||||
我们的文件系统将被限制为仅能处理 3GB 或更小的磁盘。我们保留一个大的、尺寸固定为 3GB 的文件系统环境的地址空间区域,从 0x10000000(`DISKMAP`)到 0xD0000000(`DISKMAP+DISKMAX`)作为一个磁盘的”内存映射版“。比如,磁盘的 0 号块被映射到虚拟地址 0x10000000 处,磁盘的 1 号块被映射到虚拟地址 0x10001000 处,依此类推。在 `fs/bc.c` 中的 `diskaddr` 函数实现从磁盘块编号到虚拟地址的转换(以及一些完整性检查)。
|
||||
|
||||
由于我们的文件系统环境在系统中有独立于所有其它环境的虚拟地址空间之外的、它自己的虚拟地址空间,并且文件系统环境仅需要做的事情就是实现文件访问,以这种方式去保留大多数文件系统环境的地址空间是很明智的。如果在一台 32 位机器上的”真实的“文件系统上这么做是很不方便的,因为现在的磁盘都远大于 3 GB。而在一台有 64 位地址空间的机器上,这样的缓存管理方法仍然是明智的。
|
||||
|
||||
当然,将整个磁盘读入到内存中需要很长时间,因此,我们将它实现成”按需“分页的形式,那样我们只在磁盘映射区域中分配页,并且当在这个区域中产生页故障时,从磁盘读取相关的块去响应这个页故障。通过这种方式,我们能够假装将整个磁盘装进了内存中。
|
||||
|
||||
```markdown
|
||||
练习 2、在 `fs/bc.c` 中实现 `bc_pgfault` 和 `flush_block` 函数。`bc_pgfault` 函数是一个页故障服务程序,就像你在前一个实验中编写的写时复制 fork 一样,只不过它的任务是从磁盘中加载页去响应一个页故障。在你编写它时,记住: (1) `addr` 可能并不会做边界对齐,并且 (2) 在扇区中的 `ide_read` 操作并不是以块为单位的。
|
||||
|
||||
(如果需要的话)函数 `flush_block` 应该会将一个块写入到磁盘上。如果在块缓存中没有块(也就是说,页没有映射)或者它不是一个脏块,那么 `flush_block` 将什么都不做。我们将使用虚拟内存硬件去跟踪,磁盘块自最后一次从磁盘读取或写入到磁盘之后是否被修改过。查看一个块是否需要写入时,我们只需要去查看 `uvpt` 条目中的 `PTE_D` 的 ”dirty“ 位即可。(`PTE_D` 位由处理器设置,用于表示那个页被写入;具体细节可以查看 x386 参考手册的 [第 5 章][3] 的 5.2.4.3 节)块被写入到磁盘上之后,`flush_block` 函数将使用 `sys_page_map` 去清除 `PTE_D` 位。
|
||||
|
||||
使用 `make grade` 去测试你的代码。你的代码应该能够通过 "check_bc"、"check_super"、和 "check_bitmap" 的测试。
|
||||
```
|
||||
|
||||
在 `fs/fs.c` 中的函数 `fs_init` 是块缓存使用的一个很好的示例。在初始化块缓存之后,它简单地在全局变量 `super` 中保存指针到磁盘映射区。在这之后,如果块在内存中,或我们的页故障服务程序按需将它们从磁盘上读入后,我们就能够简单地从 `super` 结构中读取块了。
|
||||
|
||||
```markdown
|
||||
小挑战!到现在为止,块缓存还没有清除策略。一旦某个块因为页故障被读入到缓存中之后,它将一直不会被清除,并且永远保留在内存中。给块缓存增加一个清除策略。在页表中使用 `PTE_A` 的 "accessed" 位来实现,任何环境访问一个页时,硬件就会设置这个位,你可以通过它来跟踪磁盘块的大致使用情况,而不需要修改访问磁盘映射区域的任何代码。使用脏块要小心。
|
||||
```
|
||||
|
||||
### 块位图
|
||||
|
||||
在 `fs_init` 设置了 `bitmap` 指针之后,我们可以认为 `bitmap` 是一个装满比特位的数组,磁盘上的每个块就是数组中的其中一个比特位。比如 `block_is_free`,它只是简单地在位图中检查给定的块是否被标记为空闲。
|
||||
|
||||
```markdown
|
||||
练习 3、使用 `free_block` 作为实现 `fs/fs.c` 中的 `alloc_block` 的一个模型,它将在位图中去查找一个空闲的磁盘块,并将它标记为已使用,然后返回块编号。当你分配一个块时,你应该立即使用 `flush_block` 将已改变的位图块刷新到磁盘上,以确保文件系统的一致性。
|
||||
|
||||
使用 `make grade` 去测试你的代码。现在,你的代码应该要通过 "alloc_block" 的测试。
|
||||
```
|
||||
|
||||
### 文件操作
|
||||
|
||||
在 `fs/fs.c` 中,我们提供一系列的函数去实现基本的功能,比如,你将需要去理解和管理结构 `File`、扫描和管理目录”文件“的条目、 以及从根目录开始遍历文件系统以解析一个绝对路径名。阅读 `fs/fs.c` 中的所有代码,并在你开始实验之前,确保你理解了每个函数的功能。
|
||||
|
||||
```markdown
|
||||
练习 4、实现 `file_block_walk` 和 `file_get_block`。`file_block_walk` 从一个文件中的块偏移量映射到 `struct File` 中那个块的指针上或间接块上,它非常类似于 `pgdir_walk` 在页表上所做的事。`file_get_block` 将更进一步,将去映射一个真实的磁盘块,如果需要的话,去分配一个新的磁盘块。
|
||||
|
||||
使用 `make grade` 去测试你的代码。你的代码应该要通过 "file_open"、"file_get_block"、以及 "file_flush/file_truncated/file rewrite"、和 "testfile" 的测试。
|
||||
```
|
||||
|
||||
`file_block_walk` 和 `file_get_block` 是文件系统中的”劳动模范“。比如,`file_read` 和 `file_write` 或多或少都在 `file_get_block` 上做必需的登记工作,然后在分散的块和连续的缓存之间复制字节。
|
||||
|
||||
```
|
||||
小挑战!如果操作在中途实然被打断(比如,突然崩溃或重启),文件系统很可能会产生错误。实现软件更新或日志处理的方式让文件系统的”崩溃可靠性“更好,并且演示一下旧的文件系统可能会崩溃,而你的更新后的文件系统不会崩溃的情况。
|
||||
```
|
||||
|
||||
### 文件系统接口
|
||||
|
||||
现在,我们已经有了文件系统环境自身所需的功能了,我们必须让其它希望使用文件系统的环境能够访问它。由于其它环境并不能直接调用文件系统环境中的函数,我们必须通过一个远程过程调用或 RPC、构建在 JOS 的 IPC 机制之上的抽象化来暴露对文件系统的访问。如下图所示,下图是对文件系统服务调用(比如:读取)的样子:
|
||||
|
||||
```
|
||||
Regular env FS env
|
||||
+---------------+ +---------------+
|
||||
| read | | file_read |
|
||||
| (lib/fd.c) | | (fs/fs.c) |
|
||||
...|.......|.......|...|.......^.......|...............
|
||||
| v | | | | RPC mechanism
|
||||
| devfile_read | | serve_read |
|
||||
| (lib/file.c) | | (fs/serv.c) |
|
||||
| | | | ^ |
|
||||
| v | | | |
|
||||
| fsipc | | serve |
|
||||
| (lib/file.c) | | (fs/serv.c) |
|
||||
| | | | ^ |
|
||||
| v | | | |
|
||||
| ipc_send | | ipc_recv |
|
||||
| | | | ^ |
|
||||
+-------|-------+ +-------|-------+
|
||||
| |
|
||||
+-------------------+
|
||||
|
||||
```
|
||||
|
||||
圆点虚线下面的过程是一个普通的环境对文件系统环境请求进行读取的简单机制。从(我们提供的)在任何文件描述符上的 `read` 工作开始,并简单地派发到相关的设备读取函数上,在我们的案例中是 `devfile_read`(我们还有更多的设备类型,比如管道)。`devfile_read` 实现了对磁盘上文件指定的 `read`。它和 `lib/file.c` 中的其它的 `devfile_*` 函数实现了客户端侧的文件系统操作,并且所有的工作大致都是以相同的方式来完成的,把参数打包进一个请求结构中,调用 `fsipc` 去发送 IPC 请求以及解包并返回结果。`fsipc` 函数把发送请求到服务器和接收来自服务器的回复的普通细节做了简化处理。
|
||||
|
||||
在 `fs/serv.c` 中可以找到文件系统服务器代码。它是一个 `serve` 函数的循环,无休止地接收基于 IPC 的请求,并派发请求到相关的服务函数,并通过 IPC 来回送结果。在读取示例中,`serve` 将派发到 `serve_read` 函数上,它将去处理读取请求的 IPC 细节,比如,解包请求结构并最终调用 `file_read` 去执行实际的文件读取动作。
|
||||
|
||||
回顾一下 JOS 的 IPC 机制,它让一个环境发送一个单个的 32 位数字和可选的共享页。从一个客户端向服务器发送一个请求,我们为请求类型使用 32 位的数字(文件系统服务器 RPC 是有编号的,就像系统调用那样的编号),然后通过 IPC 在共享页上的一个 `union Fsipc` 中存储请求参数。在客户端侧,我们已经在 `fsipcbuf` 处共享了页;在服务端,我们在 `fsreq`(`0x0ffff000`)处映射入站请求页。
|
||||
|
||||
服务器也通过 IPC 来发送响应。我们为函数的返回代码使用 32 位的数字。对于大多数 RPC,这已经涵盖了它们全部的返回代码。`FSREQ_READ` 和 `FSREQ_STAT` 也返回数据,它们只是被简单地写入到客户端发送它的请求时的页上。在 IPC 的响应中并不需要去发送这个页,因为这个页是文件系统服务器和客户端从一开始就共享的页。另外,在它的响应中,`FSREQ_OPEN` 与客户端共享一个新的 "Fd page”。我们将快捷地返回到文件描述符页上。
|
||||
|
||||
```markdown
|
||||
练习 5、实现 `fs/serv.c` 中的 `serve_read`。
|
||||
|
||||
`serve_read` 的重任将由已经在 `fs/fs.c` 中实现的 `file_read` 来承担(它实际上不过是对 `file_get_block` 的一连串调用)。对于文件读取,`serve_read` 只能提供 RPC 接口。查看 `serve_set_size` 中的注释和代码,去大体上了解服务器函数的结构。
|
||||
|
||||
使用 `make grade` 去测试你的代码。你的代码通过 "serve_open/file_stat/file_close" 和 "file_read" 的测试后,你得分应该是 70(总分为 150)。
|
||||
```
|
||||
|
||||
```markdown
|
||||
练习 6、实现 `fs/serv.c` 中的 `serve_write` 和 `lib/file.c` 中的 `devfile_write`。
|
||||
|
||||
使用 `make grade` 去测试你的代码。你的代码通过 "file_write"、"file_read after file_write"、"open"、和 "large file" 的测试后,得分应该是 90(总分为150)。
|
||||
```
|
||||
|
||||
### 进程增殖
|
||||
|
||||
我们给你提供了 `spawn` 的代码(查看 `lib/spawn.c` 文件),它用于创建一个新环境、从文件系统中加载一个程序镜像并启动子环境来运行这个程序。然后这个父进程独立于子环境持续运行。`spawn` 函数的行为,在效果上类似于UNIX 中的 `fork`,然后同时紧跟着 `fork` 之后在子进程中立即启动执行一个 `exec`。
|
||||
|
||||
我们实现的是 `spawn`,而不是一个类 UNIX 的 `exec`,因为 `spawn` 是很容易从用户空间中、以”外内核式“ 实现的,它无需来自内核的特别帮助。为了在用户空间中实现 `exec`,想一想你应该做什么?确保你理解了它为什么很难。
|
||||
|
||||
```markdown
|
||||
练习 7、`spawn` 依赖新的系统调用 `sys_env_set_trapframe` 去初始化新创建的环境的状态。实现 `kern/syscall.c` 中的 `sys_env_set_trapframe`。(不要忘记在 `syscall()` 中派发新系统调用)
|
||||
|
||||
运行来自 `kern/init.c` 中的 `user/spawnhello` 程序来测试你的代码`kern/init.c` ,它将尝试从文件系统中增殖 `/hello`。
|
||||
|
||||
使用 `make grade` 去测试你的代码。
|
||||
```
|
||||
|
||||
```markdown
|
||||
小挑战!实现 Unix 式的 `exec`。
|
||||
```
|
||||
|
||||
```markdown
|
||||
小挑战!实现 `mmap` 式的文件内存映射,并如果可能的话,修改 `spawn` 从 ELF 中直接映射页。
|
||||
```
|
||||
|
||||
### 跨 fork 和 spawn 共享库状态
|
||||
|
||||
UNIX 文件描述符是一个通称的概念,它还包括管道、控制台 I/O 等等。在 JOS 中,每个这类设备都有一个相应的 `struct Dev`,使用指针去指向到实现读取/写入/等等的函数上。对于那个设备类型,`lib/fd.c` 在其上实现了类 UNIX 的文件描述符接口。每个 `struct Fd` 表示它的设备类型,并且大多数 `lib/fd.c` 中的函数只是简单地派发操作到 `struct Dev` 中相应函数上。
|
||||
|
||||
`lib/fd.c` 也在每个应用程序环境的地址空间中维护一个文件描述符表区域,开始位置在 `FDTABLE` 处。这个区域为应该程序能够一次最多打开 `MAXFD`(当前为 32)个文件描述符而保留页的地址空间值(4KB)。在任意给定的时刻,当且仅当相应的文件描述符处于使用中时,一个特定的文件描述符表才会被映射。在区域的 `FILEDATA` 处开始的位置,每个文件描述符表也有一个可选的”数据页“,如果它们被选定,相应的设备就能使用它。
|
||||
|
||||
我们想跨 `fork` 和 `spawn` 共享文件描述符状态,但是文件描述符状态是保存在用户空间的内存中。而现在,在 `fork` 中,内存是标记为写时复制的,因此状态将被复制而不是共享。(这意味着环境不能在它们自己无法打开的文件中去搜索,并且管道不能跨一个 `fork` 去工作)在 `spawn` 上,内存将被保留,压根不会去复制。(事实上,增殖的环境从使用一个不打开的文件描述符去开始。)
|
||||
|
||||
我们将要修改 `fork`,以让它知道某些被”库管理的系统“所使用的、和总是被共享的内存区域。而不是去”硬编码“一个某些区域的列表,我们将在页表条目中设置一个”这些不使用“的位(就像我们在 `fork` 中使用的 `PTE_COW` 位一样)。
|
||||
|
||||
我们在 `inc/lib.h` 中定义了一个新的 `PTE_SHARE` 位,在 Intel 和 AMD 的手册中,这个位是被标记为”软件可使用的“的三个 PTE 位之一。我们将创建一个约定,如果一个页表条目中这个位被设置,那么在 `fork` 和 `spawn` 中应该直接从父环境中复制 PTE 到子环境中。注意它与标记为写时复制的差别:正如在第一段中所描述的,我们希望确保能共享页更新。
|
||||
|
||||
```markdown
|
||||
练习 8、修改 `lib/fork.c` 中的 `duppage`,以遵循最新约定。如果页表条目设置了 `PTE_SHARE` 位,仅直接复制映射。(你应该去使用 `PTE_SYSCALL`,而不是 `0xfff`,去从页表条目中掩掉有关的位。`0xfff` 仅选出可访问的位和脏位。)
|
||||
|
||||
同样的,在 `lib/spawn.c` 中实现 `copy_shared_pages`。它应该循环遍历当前进程中所有的页表条目(就像 `fork` 那样),复制任何设置了 `PTE_SHARE` 位的页映射到子进程中。
|
||||
```
|
||||
|
||||
使用 `make run-testpteshare` 去检查你的代码行为是否正确。正确的情况下,你应该会看到像 "`fork handles PTE_SHARE right`" 和 "`spawn handles PTE_SHARE right`” 这样的输出行。
|
||||
|
||||
使用 `make run-testfdsharing` 去检查文件描述符是否正确共享。正确的情况下,你应该会看到 "`read in child succeeded`" 和 "`read in parent succeeded`” 这样的输出行。
|
||||
|
||||
### 键盘接口
|
||||
|
||||
为了能让 shell 工作,我们需要一些方式去输入。QEMU 可以显示输出,我们将它的输出写入到 CGA 显示器上和串行端口上,但到目前为止,我们仅能够在内核监视器中接收输入。在 QEMU 中,我们从图形化窗口中的输入作为从键盘到 JOS 的输入,虽然键入到控制台的输入作为出现在串行端口上的字符的方式显现。在 `kern/console.c` 中已经包含了由我们自实验 1 以来的内核监视器所使用的键盘和串行端口的驱动程序,但现在你需要去把这些增加到系统中。
|
||||
|
||||
```markdown
|
||||
练习 9、在你的 `kern/trap.c` 中,调用 `kbd_intr` 去处理捕获 `IRQ_OFFSET+IRQ_KBD` 和 `serial_intr`,用它们去处理捕获 `IRQ_OFFSET+IRQ_SERIAL`。
|
||||
```
|
||||
|
||||
在 `lib/console.c` 中,我们为你实现了文件的控制台输入/输出。`kbd_intr` 和 `serial_intr` 将使用从最新读取到的输入来填充缓冲区,而控制台文件类型去排空缓冲区(默认情况下,控制台文件类型为 stdin/stdout,除非用户重定向它们)。
|
||||
|
||||
运行 `make run-testkbd` 并输入几行来测试你的代码。在你输入完成之后,系统将回显你输入的行。如果控制台和窗口都可以使用的话,尝试在它们上面都做一下测试。
|
||||
|
||||
### Shell
|
||||
|
||||
运行 `make run-icode` 或 `make run-icode-nox` 将运行你的内核并启动 `user/icode`。`icode` 又运行 `init`,它将设置控制台作为文件描述符 0 和 1(即:标准输入 stdin 和标准输出 stdout),然后增殖出环境 `sh`,就是 shell。之后你应该能够运行如下的命令了:
|
||||
|
||||
```
|
||||
echo hello world | cat
|
||||
cat lorem |cat
|
||||
cat lorem |num
|
||||
cat lorem |num |num |num |num |num
|
||||
lsfd
|
||||
```
|
||||
|
||||
注意用户库常规程序 `cprintf` 将直接输出到控制台,而不会使用文件描述符代码。这对调试非常有用,但是对管道连接其它程序却很不利。为将输出打印到特定的文件描述符(比如 1,它是标准输出 stdout),需要使用 `fprintf(1, "...", ...)`。`printf("...", ...)` 是将输出打印到文件描述符 1(标准输出 stdout) 的快捷方式。查看 `user/lsfd.c` 了解更多示例。
|
||||
|
||||
```markdown
|
||||
练习 10、
|
||||
这个 shell 不支持 I/O 重定向。如果能够运行 `run sh <script` 就更完美了,就不用将所有的命令手动去放入一个脚本中,就像上面那样。为 `<` 在 `user/sh.c` 中添加重定向的功能。
|
||||
|
||||
通过在你的 shell 中输入 `sh <script` 来测试你实现的重定向功能。
|
||||
|
||||
运行 `make run-testshell` 去测试你的 shell。`testshell` 只是简单地给 shell ”喂“上面的命令(也可以在 `fs/testshell.sh` 中找到),然后检查它的输出是否与 `fs/testshell.key` 一样。
|
||||
```
|
||||
|
||||
```markdown
|
||||
小挑战!给你的 shell 添加更多的特性。最好包括以下的特性(其中一些可能会要求修改文件系统):
|
||||
|
||||
* 后台命令 (`ls &`)
|
||||
* 一行中运行多个命令 (`ls; echo hi`)
|
||||
* 命令组 (`(ls; echo hi) | cat > out`)
|
||||
* 扩展环境变量 (`echo $hello`)
|
||||
* 引号 (`echo "a | b"`)
|
||||
* 命令行历史和/或编辑功能
|
||||
* tab 命令补全
|
||||
* 为命令行查找目录、cd 和路径
|
||||
* 文件创建
|
||||
* 用快捷键 `ctl-c` 去杀死一个运行中的环境
|
||||
|
||||
可做的事情还有很多,并不仅限于以上列表。
|
||||
```
|
||||
|
||||
到目前为止,你的代码应该要通过所有的测试。和以前一样,你可以使用 `make grade` 去评级你的提交,并且使用 `make handin` 上交你的实验。
|
||||
|
||||
**本实验到此结束。** 和以前一样,不要忘了运行 `make grade` 去做评级测试,并将你的练习答案和挑战问题的解决方案写下来。在动手实验之前,使用 `git status` 和 `git diff` 去检查你的变更,并不要忘记使用 `git add answers-lab5.txt` 去提交你的答案。完成之后,使用 `git commit -am 'my solutions to lab 5’` 去提交你的变更,然后使用 `make handin` 去提交你的解决方案。
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://pdos.csail.mit.edu/6.828/2018/labs/lab5/
|
||||
|
||||
作者:[csail.mit][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[qhwdw](https://github.com/qhwdw)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://pdos.csail.mit.edu
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://pdos.csail.mit.edu/6.828/2018/labs/lab5/disk.png
|
||||
[2]: https://pdos.csail.mit.edu/6.828/2018/labs/lab5/file.png
|
||||
[3]: http://pdos.csail.mit.edu/6.828/2011/readings/i386/s05_02.htm
|
Loading…
Reference in New Issue
Block a user