Large files account for most of the transfers to/from the disk.
Disk management policies
We need the following data structures:
- a file header, one for each file, listing the disk sectors associated with that file.
- a bitmap representing the free space on disk, one bit per block (or sector).
We will add more data structures later.
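As a rough illustration of the free-space bitmap, the following C sketch packs one bit per block into a byte array. The disk size `NUM_BLOCKS`, the convention that a set bit means "in use", and the function names are assumptions made for the example, not something the notes specify.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NUM_BLOCKS 1024   /* hypothetical disk size, in blocks */

/* One bit per block; a set bit means the block is in use (assumed convention). */
static unsigned char free_map[(NUM_BLOCKS + CHAR_BIT - 1) / CHAR_BIT];

/* Mark block b as in use. */
void bitmap_set(size_t b)
{
    assert(b < NUM_BLOCKS);
    free_map[b / CHAR_BIT] |= (unsigned char)(1u << (b % CHAR_BIT));
}

/* Mark block b as free. */
void bitmap_clear(size_t b)
{
    assert(b < NUM_BLOCKS);
    free_map[b / CHAR_BIT] &= (unsigned char)~(1u << (b % CHAR_BIT));
}

/* Return 1 if block b is in use, 0 if it is free. */
int bitmap_test(size_t b)
{
    assert(b < NUM_BLOCKS);
    return (free_map[b / CHAR_BIT] >> (b % CHAR_BIT)) & 1;
}
```

With 8-bit bytes, tracking 1024 blocks this way costs 128 bytes, which is why a bitmap is cheap enough to keep in memory.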
Contiguous allocation:
- User says in advance how big the file will be.
- Search the bitmap (using first/best fit) to locate space for the file.
The file header contains:
- first sector in the file.
- file size (number of sectors).
Pros: fast sequential access; easy random access.
Cons: external fragmentation (free space gets broken into runs too small to hold a new file); files are hard to grow.
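The first-fit search over the free map might look like the sketch below. A plain `bool` array stands in for the bitmap to keep the code short; `NUM_SECTORS`, `alloc_contiguous`, and `free_contiguous` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SECTORS 64   /* hypothetical disk size, in sectors */

/* One entry per sector: true = in use, false = free.
   (A real bitmap would pack these into bits.) */
static bool sector_used[NUM_SECTORS];

/* First fit: find the first free run of `count` sectors, mark it used,
   and return its first sector. Returns -1 if no run is long enough. */
int alloc_contiguous(size_t count)
{
    assert(count > 0);
    size_t run = 0;   /* length of the free run ending at sector s */
    for (size_t s = 0; s < NUM_SECTORS; s++) {
        run = sector_used[s] ? 0 : run + 1;
        if (run == count) {
            size_t first = s + 1 - count;
            for (size_t k = first; k <= s; k++)
                sector_used[k] = true;
            return (int)first;
        }
    }
    return -1;
}

/* Release a run of sectors previously returned by alloc_contiguous. */
void free_contiguous(size_t first, size_t count)
{
    assert(first + count <= NUM_SECTORS);
    for (size_t k = first; k < first + count; k++)
        sector_used[k] = false;
}
```

If files of 10, 20, and 30 sectors are allocated in that order and the 20-sector file is then freed, 24 sectors are free in total (a run of 20 and a run of 4), yet a request for 22 contiguous sectors fails. That is the fragmentation cost of this policy.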
Linked files:
- The file header points to the first block.
- each block contains a pointer to the next block.
Pros: Files can grow dynamically; free list is managed the same as a file.
Cons: For sequential access, there is a seek between blocks; random access is horrible; unreliable (lose 1 pointer, lose the rest of the file).
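To see why random access is so costly with linked files, consider this sketch of locating the n-th block. The array `next_block` stands in for the pointer stored inside each disk block, so every array lookup represents one disk read; all names and sizes here are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BLOCKS  16     /* hypothetical disk size, in blocks */
#define END_OF_FILE (-1)   /* hypothetical "no next block" marker */

/* next_block[b] models the next-pointer stored inside disk block b. */
static int next_block[NUM_BLOCKS];

/* Follow the chain from first_block to the disk block holding logical
   block n. Each hop requires reading a block from disk; the number of
   hops is reported in *disk_reads. Returns END_OF_FILE if the file has
   fewer than n + 1 blocks. */
int linked_lookup(int first_block, size_t n, size_t *disk_reads)
{
    int b = first_block;
    *disk_reads = 0;
    while (b != END_OF_FILE && n > 0) {
        assert(b >= 0 && b < NUM_BLOCKS);
        b = next_block[b];   /* must read block b to learn its successor */
        (*disk_reads)++;
        n--;
    }
    return b;
}
```

Reaching logical block n takes n reads before the data block itself can be fetched, and a single corrupted entry in the chain makes everything after it unreachable.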
Indexed files (VMS):
- User declares max file size.
- System allocates a file header to hold an array of pointers big enough to point to that number of blocks.
Pros: Can easily grow up to space allowed for descriptor; random access is fast.
Cons: Clumsy to grow file bigger than table size; still lots of seeks.
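A minimal sketch of an indexed file header, assuming the pointer table is sized once from the declared maximum. The struct layout and function names are hypothetical and are not taken from VMS.

```c
#include <assert.h>
#include <stdlib.h>

struct indexed_header {
    size_t max_blocks;   /* maximum size declared by the user at creation */
    size_t num_blocks;   /* blocks currently in the file */
    int   *block;        /* block[i] = disk block holding logical block i */
};

/* Allocate a pointer table big enough for max_blocks entries.
   Returns 0 on success, -1 if memory is exhausted. */
int indexed_init(struct indexed_header *h, size_t max_blocks)
{
    assert(max_blocks > 0);
    h->block = malloc(max_blocks * sizeof *h->block);
    if (h->block == NULL)
        return -1;
    h->max_blocks = max_blocks;
    h->num_blocks = 0;
    return 0;
}

/* Grow the file by one block. Returns -1 once the table is full. */
int indexed_append(struct indexed_header *h, int disk_block)
{
    if (h->num_blocks == h->max_blocks)
        return -1;
    h->block[h->num_blocks++] = disk_block;
    return 0;
}

/* Random access is a single table lookup. Returns -1 past end of file. */
int indexed_lookup(const struct indexed_header *h, size_t n)
{
    return n < h->num_blocks ? h->block[n] : -1;
}
```

Lookup cost no longer depends on the position in the file, but an append fails as soon as the declared maximum is reached, which matches the "clumsy to grow" drawback.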
Multilevel indexed (Unix):
Key idea: Be efficient for small files, but allow large files.
- File header contains 13 pointers (fixed size table).
- The first 10 pointers point to data blocks.
- The eleventh pointer points to an indirect block of pointers to data blocks.
- The twelfth pointer points to a doubly indirect block.
- The last pointer points to a triply indirect block.
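The sketch below works out which of the 13 pointers covers a given logical block and which slot to follow in each indirect block. The notes do not give a block size, so `PTRS_PER_BLOCK = 256` is an assumption (for example, 1 KB blocks with 4-byte pointers); the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_DIRECT     10    /* from the notes: the first 10 pointers are direct */
#define PTRS_PER_BLOCK 256   /* assumption: pointers that fit in one indirect block */

/* Returns how many indirect blocks must be read to reach logical block n:
   0 (direct), 1, 2, or 3, or -1 if n is past the maximum file size.
   For levels 1 to 3, idx[0..level-1] receive the slot to follow in each
   indirect block, outermost first. For level 0, header pointer n is used. */
int indirection_path(unsigned long n, unsigned long idx[3])
{
    const unsigned long p = PTRS_PER_BLOCK;
    assert(idx != NULL);
    if (n < NUM_DIRECT)
        return 0;
    n -= NUM_DIRECT;
    if (n < p) {
        idx[0] = n;
        return 1;
    }
    n -= p;
    if (n < p * p) {
        idx[0] = n / p;
        idx[1] = n % p;
        return 2;
    }
    n -= p * p;
    if (n < p * p * p) {
        idx[0] = n / (p * p);
        idx[1] = (n / p) % p;
        idx[2] = n % p;
        return 3;
    }
    return -1;
}

/* Largest file, in blocks, that the 13-pointer header can describe. */
unsigned long max_file_blocks(void)
{
    const unsigned long p = PTRS_PER_BLOCK;
    return NUM_DIRECT + p + p * p + p * p * p;
}
```

Under these assumptions the first 10 blocks need no extra reads, blocks 10 through 265 need one, and the maximum file size is 10 + 256 + 256^2 + 256^3 = 16,843,018 blocks. Small files stay cheap while large files remain possible.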
The user interface to the file system
Data operations:
- Create()
- Delete()
- Open()
- Close()
- Read()
- Write()
- Seek()
Naming operations:
- HardLink()
- SoftLink()
- Unlink()
Attributes (owner, protection,...):
- SetAttribute()
- GetAttribute()
OS File Data Structures
- Open file table (shared by all processes)
  - open count
  - access times
  - protection information
  - location(s) of file on disk
  - pointer(s) to locations of file in file cache
- Per-process file table
  - pointer to entry in open file table
  - current position in file (offset)
  - pointers to file buffer
OS file operations:
fileDesc = Create(name):
- OS adds entries to system and per-process file tables.
- OS generates a unique identifier, fileDesc, often hidden from user.
- Usually the user sees a small integer index (0, 1, 2, ...), which the OS maps to an internal descriptor.
- OS associates name with fileDesc.
- OS finds or allocates a file header (sometimes called descriptor block) for the new file in memory. Also allocates disk space.
Delete(name):
- OS translates name to a fileDesc.
- OS frees all blocks and descriptors associated with fileDesc in the per-process and global tables.
fileID = Open(name,mode):
- OS translates name to fileDesc.
- OS allocates a "transaction" fileID to the file.
- OS updates the global table to include file.
- OS sets the mode (r, w, rw, x,...) to control concurrent access to the file.
- OS returns fileID to the user.
Close(fileID):
- OS removes file descriptor information from the per-process file table.
- OS also removes file descriptor information from the global table if this is the last reference to the file.
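The interplay between the two tables in Open and Close can be sketched as follows. This is a simplified model under stated assumptions: the global table is keyed by name, table sizes are fixed, and mode handling and disk access are omitted. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_FILES 8   /* hypothetical size of the global open file table */
#define MAX_OPEN  4   /* hypothetical size of each per-process table */

struct global_entry {              /* open file table, shared by all processes */
    char name[32];
    int  open_count;               /* 0 means the slot is unused */
};

struct proc_entry {                /* one slot in a per-process file table */
    struct global_entry *file;     /* NULL means the slot is unused */
    long offset;                   /* current position in the file */
};

static struct global_entry global_table[MAX_FILES];

/* Returns the small index the user sees as fileID, or -1 on failure. */
int file_open(struct proc_entry *proc, const char *name)
{
    struct global_entry *g = NULL;
    for (int i = 0; i < MAX_FILES && g == NULL; i++)   /* already open somewhere? */
        if (global_table[i].open_count > 0 &&
            strcmp(global_table[i].name, name) == 0)
            g = &global_table[i];
    if (g == NULL) {                                   /* no: claim a free global slot */
        for (int i = 0; i < MAX_FILES && g == NULL; i++)
            if (global_table[i].open_count == 0)
                g = &global_table[i];
        if (g == NULL || strlen(name) >= sizeof g->name)
            return -1;
        strcpy(g->name, name);
    }
    for (int id = 0; id < MAX_OPEN; id++) {            /* first free per-process slot */
        if (proc[id].file == NULL) {
            proc[id].file = g;
            proc[id].offset = 0;
            g->open_count++;
            return id;
        }
    }
    return -1;
}

/* Drop this process's reference; the global slot becomes reusable
   when the last reference goes away (open_count reaches 0). */
void file_close(struct proc_entry *proc, int id)
{
    assert(id >= 0 && id < MAX_OPEN && proc[id].file != NULL);
    proc[id].file->open_count--;
    proc[id].file = NULL;
}
```

Two processes that open the same name end up pointing at the same global entry with an open count of 2, while each keeps its own offset. The global entry is released only after both have called `file_close`.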
Read(fileID,from,size,bufAddress):
- This is a random access.
- OS reads "size" bytes from file position "from" into buffer "bufAddress".
    for (i = from; i < from + size; i++)
        bufAddress[i - from] = file[i];
Read(fileID,size,bufAddress):
- This is sequential access.
- OS reads "size" bytes from current file position fp into "bufAddress" and increments current file position.
    for (i = 0; i < size; i++)
        bufAddress[i] = file[fp + i];
    fp += size;
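The two Read variants are closely related: sequential read can be built on random-access read by keeping a file position and advancing it. The sketch below uses an in-memory byte array as a stand-in for the file and adds end-of-file clamping, which the pseudocode leaves out. `struct mem_file`, `read_at`, and `read_seq` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mem_file {        /* hypothetical in-memory stand-in for a file */
    const char *data;
    size_t      size;
};

/* Random access: copy up to `size` bytes starting at offset `from`.
   Returns the number of bytes actually copied (fewer near end of file). */
size_t read_at(const struct mem_file *f, size_t from, size_t size, char *buf)
{
    assert(f != NULL && buf != NULL);
    if (from >= f->size)
        return 0;
    if (size > f->size - from)
        size = f->size - from;   /* clamp to the bytes that remain */
    memcpy(buf, f->data + from, size);
    return size;
}

/* Sequential access: read at the current position *fp, then advance it
   by the number of bytes actually read. */
size_t read_seq(const struct mem_file *f, size_t *fp, size_t size, char *buf)
{
    size_t n = read_at(f, *fp, size, buf);
    *fp += n;
    return n;
}
```

In the OS, the position `fp` would live in the per-process file table entry, so two processes reading the same file sequentially do not disturb each other.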