CSAPP Walk Through: Chapter 1

This series of notes is based on the book Computer Systems: A Programmer's Perspective.

The homepage for the book is http://csapp.cs.cmu.edu/.

I put these notes here so that I can review the book conveniently. I hope they help you as well.


Information = Bits + Context

Files that consist exclusively of ASCII characters are known as text files. All other files are known as binary files.

The machine representations of numbers are not the same as the integers and real numbers of mathematics: they are finite approximations that can behave in unexpected ways.

Why is C so successful?

  • C was closely tied to the Unix operating system. Most of the Unix kernel and all of its supporting tools and libraries were written in C.
  • C is a small and simple language. The simplicity of C made it relatively easy to learn and to port to different computers.
  • C was designed for a practical purpose.

However, C also has weaknesses:

  • C pointers are a common source of confusion and programming errors.
  • C lacks explicit support for some useful abstractions, such as classes, objects, and exceptions.

The 4 phases of the Compilation System

  • Preprocessing phase: The preprocessor (cpp) modifies the source program, which is a text file, according to directives that begin with the # character. The result is another C program, typically stored with the .i suffix.
  • Compilation phase: The compiler (cc1) translates the .i file into a text file with the .s suffix, which contains an assembly-language program.
  • Assembly phase: The assembler (as) translates the assembly-language program into machine-language instructions and packages them into a relocatable object program, stored in a file with the .o suffix.
  • Linking phase: The linker (ld) merges our object file (.o) with precompiled object files from libraries, such as the one that provides printf in the standard C library. The result is an executable object file that is ready to be loaded into memory and executed by the system.

Understanding How Compilation Systems Work Is Useful

  • Optimizing program performance: With a basic understanding of machine-level code and of how the compiler translates different C statements into machine code, we can make good coding decisions in our C programs.
  • Understanding link-time errors: Some of the most perplexing programming errors are related to the operation of the linker.
  • Avoiding security holes: With a better understanding of how the compilation system works, we can reduce the risk of our programs being attacked.

Processors Read and Interpret Instructions

Hardware Organization of a System

  • Buses: Buses are a collection of electrical conduits running throughout the system. They are typically designed to transfer fixed-size chunks of bytes known as words. A 32-bit machine has a word size of 4 bytes, and a 64-bit machine has a word size of 8 bytes.
  • I/O Devices: Input/output devices, such as keyboards, mice, monitors, and hard disk drives, are the system's connection to the external world. Each I/O device is connected to the I/O bus by either a controller or an adapter.
  • Main Memory: The main memory holds both a program and the data it manipulates. Physically, it consists of dynamic random access memory (DRAM) chips; logically, it is organized as a linear array of bytes.
  • Processor: The central processing unit (CPU), or simply processor, is the engine that interprets (executes) instructions stored in main memory. At its core are the program counter (PC), the register file, and the arithmetic/logic unit (ALU). The CPU carries out a small set of simple operations, such as Load, Store, Operate, and Jump.

Running a Program

When we run an executable file (from a shell, for example), the shell loads the executable file from disk into main memory using DMA, and the processor then begins executing the machine-language instructions in the program's main routine.

Direct memory access (DMA) is a technique that transfers data directly from disk to main memory without passing through the processor.

Caches

A system spends a lot of time moving information from one place to another, so it is important to make these copy operations fast.

Cache memories, designed to deal with the processor-memory gap, are small, fast storage devices that temporarily hold information the processor is likely to need in the near future.

  • L1 cache is nearly as fast as the register file.
  • L2 cache is about 5 times slower than L1 cache, but is still 5 to 10 times faster than the main memory.
  • Both L1 cache and L2 cache are implemented with static random access memory(SRAM).

Storage Devices Form a Hierarchy

  • Storage at one level serves as a cache for storage at the next lower level.
  • The higher a storage device sits in the hierarchy, the faster, smaller, and more expensive per byte it is.
  • The storage device at the top of the hierarchy is the register file.
  • Local disks can serve as a cache for remote storage, such as data held on a web server.

The Operating System Manages the Hardware

The operating system has two primary purposes:

  1. To protect the hardware from misuse by runaway applications.
  2. To provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices.

Processes

  • A process is the operating system's abstraction for a running program.
  • When we run a program on a modern operating system, the process abstraction provides the illusion that the program is the only one running on the system and has exclusive use of the processor, main memory, and I/O devices.
  • Multiple processes can run concurrently on the same system. When running concurrently, the instructions of one process are interleaved with the instructions of another process.
  • The operating system performs this interleaving with context switching.
  • The context, also known as the state, is the information a process needs in order to run, such as the current values of the PC, the register file, and the contents of main memory.

Threads

  • In modern systems, a process can consist of multiple execution units called threads.
  • Each thread runs in the context of its process and shares the same code and global data with the other threads.
  • Sharing data between threads is easier than sharing data between processes.

Virtual Memory

  • Virtual memory is an abstraction that provides each process with the illusion that it has exclusive use of the main memory.
  • Each process has the same uniform view of memory, which is known as its virtual address space.

Files

  • A file is a sequence of bytes.
  • Every I/O device is modeled as a file.
  • Using a small set of system calls known as Unix I/O, all input/output in the system is performed by reading and writing files.

Systems Communicate with Each Other Using Networks

The network can also be viewed as an I/O device.

Concurrency and Parallelism

  • Concurrency refers to the general concept of a system with multiple, simultaneous activities.
  • Parallelism refers to the use of concurrency to make a system run faster.

Parallelism can be exploited at multiple levels of abstraction in a computer system. Examples include thread-level concurrency, instruction-level parallelism, and single-instruction, multiple-data (SIMD) parallelism.

About the author

Hey, I'm Peixuan, a software engineer. I write posts about things I learned.

dinever