Tuesday, September 11, 2012 CC-BY-NC
Computers and operating systems

Maintainer: admin

Lecture notes for COMP 310, lecture #2, taught by Xue Liu. These lecture notes are student-generated and any errors or omissions should be assumed to be the fault of the notetaker and not of the lecturer. To correct an error, you have to be registered and logged in; alternatively, you can contact @dellsystem directly.

The slides for this lecture are available through MyCourses. Slides covered: 1.27 (continued from last lecture), 1.29-1.50 (there is no slide 1.28).

1 Interrupts, continued

(continued from last class)

In an embedded system, you want to minimise the number of interrupts, because the CPU typically has limited processing power and probably has other stuff to do anyway. An example of such a system would be a communication device, for which time is critical.

1.1 Direct memory access

Normally, when an interrupt occurs, the CPU is responsible for transferring data to and from the local buffer of the I/O device to main memory. However, this is not usually desirable for throughput and performance reasons, especially if the CPU has many other tasks to complete. Direct memory access is one alternative. In this scheme, a special controller, known as a DMA controller, handles all the data transfer from local buffers to main memory. The CPU needs to initiate the transfer, but once the transfer is under way, the CPU can perform other tasks, and is notified via an interrupt from the DMA controller when the transfer is complete.

This method has several benefits:

  • Data throughput is increased
  • The CPU is free to do other work while the transfer is in progress
  • Data can be transferred much faster, which is especially important for certain specialised devices like graphics, sound, and network cards

So we have an I/O device in one corner, a CPU in another corner, and main memory (containing instructions and data) in a third. Between the CPU and memory, instructions and data go back and forth; between the CPU and the I/O device, interrupts are sent to the CPU, I/O requests are sent to the device, and data moves back and forth. Then, with DMA, data moves directly between the I/O device and main memory.
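
To make the sequence concrete, here is a hypothetical sketch of the driver's side of a DMA transfer in C. The register addresses, the START bit, and the handler name are all invented for illustration; real controllers differ, but the shape is the same: point the controller at a source, a destination, and a length, start it, and let an interrupt signal completion.

    #include <stdint.h>

    /* Hypothetical memory-mapped registers for a DMA controller. The
       addresses and the START bit below are invented for illustration. */
    #define DMA_SRC   ((volatile uint32_t *)0x40001000)
    #define DMA_DST   ((volatile uint32_t *)0x40001004)
    #define DMA_LEN   ((volatile uint32_t *)0x40001008)
    #define DMA_CTRL  ((volatile uint32_t *)0x4000100c)
    #define DMA_START 0x1u

    static volatile int transfer_done = 0;

    /* Invoked by the interrupt dispatcher when the DMA controller raises
       its completion interrupt; note that the CPU did none of the copying. */
    void dma_complete_isr(void)
    {
        transfer_done = 1;
    }

    /* The CPU's only job is to describe the transfer and start it. */
    void dma_copy(uint32_t device_buf, uint32_t mem_addr, uint32_t nbytes)
    {
        *DMA_SRC  = device_buf;   /* the I/O device's local buffer */
        *DMA_DST  = mem_addr;     /* the destination in main memory */
        *DMA_LEN  = nbytes;
        *DMA_CTRL = DMA_START;    /* kick off the transfer... */
        /* ...and the CPU is now free to run other tasks until the
           interrupt above fires. */
    }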

2 Storage structures

The only large storage medium that the CPU can access directly is main memory. This type of memory is volatile, quick to access, and limited in size. Below it we have secondary storage, which is non-volatile and typically much larger; the usual example is the magnetic disk, made of rigid metal or glass platters coated with magnetic recording material. Tertiary storage (optical disks and magnetic tapes, for instance) is slower still and is used mainly for backups and archival. Nearly all modern computers - smartphones, servers, desktops - use this architecture.

2.1 Storage hierarchy

Storage systems can be organised via a hierarchy of speed, cost (actual physical cost), and volatility. The closer you are to the CPU, the faster, more expensive, and more volatile it is.

  • Registers (within the CPU)
  • CPU cache
  • Main memory
  • Electronic disk
  • Magnetic disk
  • Optical disk
  • Magnetic tapes
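
To get a rough feel for the gap between levels, here is a minimal C sketch, assuming a POSIX system with clock_gettime (the file name is arbitrary). It times a pass over a buffer already in main memory against a write-and-read round trip through a file. Note that the operating system's buffer cache may absorb much of the disk latency here, which is itself a preview of the next section.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    static double elapsed_ms(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) * 1e3 + (b.tv_nsec - a.tv_nsec) / 1e6;
    }

    int main(void)
    {
        enum { N = 1 << 20 };             /* one mebibyte of data */
        char *buf = malloc(N);
        struct timespec t0, t1;
        long sum = 0;

        memset(buf, 1, N);                /* make sure the pages are resident */

        /* Main memory: sum a buffer that is already in RAM. */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++)
            sum += buf[i];
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("memory pass:     %8.3f ms (sum=%ld)\n", elapsed_ms(t0, t1), sum);

        /* Secondary storage: push the same data through a file. */
        FILE *f = fopen("hierarchy_test.bin", "w+b");
        clock_gettime(CLOCK_MONOTONIC, &t0);
        fwrite(buf, 1, N, f);
        fflush(f);
        rewind(f);
        fread(buf, 1, N, f);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("file round trip: %8.3f ms\n", elapsed_ms(t0, t1));

        fclose(f);
        remove("hierarchy_test.bin");
        free(buf);
        return 0;
    }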

2.2 Caching

Caching is a technique whereby frequently-accessed information (for some definition of "frequently-accessed") is copied to a faster storage location. It's performed at many levels in a computer: in the hardware, in the operating system, in software. For instance, main memory can be viewed as a cache for secondary storage, while the CPU has its own cache for main memory. Information is temporarily copied from the slower storage location to the faster storage location, which is always smaller in size, when the information is being used (exact implementation details differ). When information is needed, the cache (possibly more than one cache, at various levels) is checked first; if the information is not there, it is copied into the cache and then used.

How exactly caches are managed - how large they should be, what the replacement policy should be, etc. - is an important consideration when designing an operating system.
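
The check-the-cache-first pattern described above can be sketched in a few lines of C. Everything here is an invented toy (the slot count, the direct mapping, and slow_fetch as a stand-in for the slower storage level), but it shows a hit, a miss, and an eviction forced by two keys that map to the same slot - the replacement question mentioned above.

    #include <stdio.h>
    #include <stdbool.h>

    #define CACHE_SLOTS 8

    struct slot { bool valid; int key; int value; };
    static struct slot cache[CACHE_SLOTS];

    /* Stand-in for the slower storage level (main memory, disk, ...). */
    static int slow_fetch(int key) { return key * key; }

    static int cached_fetch(int key)
    {
        struct slot *s = &cache[key % CACHE_SLOTS];  /* direct mapping */
        if (s->valid && s->key == key)
            return s->value;          /* hit: served from the fast copy */
        s->valid = true;              /* miss: fetch, then keep a copy, */
        s->key = key;                 /* evicting whatever was here     */
        s->value = slow_fetch(key);
        return s->value;
    }

    int main(void)
    {
        int keys[] = {3, 11, 3, 3, 11};   /* 3 and 11 collide in slot 3 */
        for (int i = 0; i < 5; i++)
            printf("fetch(%d) = %d\n", keys[i], cached_fetch(keys[i]));
        return 0;
    }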

Incidentally, Google's indexes are stored in main memory (across many machines); this ensures that when you search for a query, the results are near-instantaneous. Google doubtless has other caching mechanisms as well, but this is one of them.

3 Computer architectures

Most systems have a single, general-purpose processor. Modern systems increasingly sport special-purpose processors as well, such as a graphics processing unit (GPU). Note that a chip with multiple cores still counts as a single processor. Incidentally, coordinating multiple cores is a very difficult problem, albeit one outside the scope of this course.

3.1 Multiprocessing and multi-core systems

Multiprocessor systems (also known as parallel or tightly-coupled systems) have become more common as well. These systems have several advantages: increased throughput (more operations can be performed); economy of scale (as different processors can use the same storage devices and power source); and greater robustness and fault tolerance, through redundancy (so if one processor dies, there is still at least one other to function as a backup).

Multiprocessor systems can be either asymmetric (master-slave) or symmetric (all peers). In a symmetric system, we have multiple CPUs, each with its own registers and cache, all connected through a shared bus to main memory. In an asymmetric system, a master processor schedules and allocates work to the slave processors (there is no diagram for this in the slides).

In a multi-core system, you have multiple cores on a single chip. Functionally there isn't much difference from a multiprocessor system, but communication between cores on the same chip is more efficient, and power consumption is lower.
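
The throughput benefit mentioned above comes from running independent work on several processors or cores at once. A minimal sketch using POSIX threads (the splitting of a sum into per-thread chunks is an arbitrary example workload; compile with cc -pthread):

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    struct task { long from, to, sum; };

    /* Each thread works on its own disjoint range, so no locking is
       needed; the OS can schedule the threads onto separate cores. */
    static void *worker(void *arg)
    {
        struct task *t = arg;
        for (long i = t->from; i < t->to; i++)
            t->sum += i;
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        struct task tasks[NTHREADS];
        long chunk = 1000000;

        for (int i = 0; i < NTHREADS; i++) {
            tasks[i] = (struct task){ i * chunk, (i + 1) * chunk, 0 };
            pthread_create(&tid[i], NULL, worker, &tasks[i]);
        }

        long total = 0;
        for (int i = 0; i < NTHREADS; i++) {
            pthread_join(tid[i], NULL);   /* wait for each thread */
            total += tasks[i].sum;
        }
        printf("total = %ld\n", total);
        return 0;
    }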

3.1.1 Limits to Moore's law

Although transistors have been steadily shrinking throughout the history of computing, this trend is not likely to last much longer. There are, in fact, physical limits to how many transistors you can pack onto a chip, due to energy density and the risk of overheating.

3.2 Clustered systems

Clustered systems are sort of like multiprocessor systems, but instead of one system with multiple processors, they consist of multiple independent systems working together. Storage is usually shared between the computers via a storage area network (SAN) or network-attached storage (NAS; the two are different), with each computer connected independently to the storage system. The result is a robust service with high availability, even when a machine fails. Clusters are often used for high-performance computing, with the applications and algorithms used developed specifically with parallelisation in mind.

Clustering can be symmetric or asymmetric. In asymmetric clustering, one machine runs in hot-standby mode, doing nothing but monitoring the active machine, and takes over in case of failure; in symmetric clustering, all the machines run applications and monitor each other. Hot standby is also known as hot backup, hot mother (not really).

4 Operating system structure

4.1 Multiprogramming

The concept of multiprogramming is a very important one in operating system design. It is necessary to ensure that a particular user or program does not keep the CPU and I/O devices perpetually busy at the expense of others. At the same time, it is also necessary to ensure that the CPU always has a job to execute (so that it is never kept idle). At any given time, a subset of all the jobs in the system is stored in memory, and one job is selected to run according to the scheduling policy of the system. If the running job ever has to wait (for an I/O device, for instance), the operating system switches to another job, and resumes the first one later, once its wait is over.

Timesharing is a logical extension of the above in which the CPU switches jobs so frequently that it appears that the CPU is performing multiple tasks at the same time. This gives the user the impression of interactive computing. In this case, response time should be very low - certainly less than a second. If not all the processes that need to be run can fit in main memory, the operating system engages in swapping to disk to move them in and out of memory as necessary. The use of virtual memory abstracts memory management away from the physical storage device, ensuring that a process never has to deal with the specific type of memory storage being used directly.
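
The switching described in the last two paragraphs can be sketched as a toy round-robin loop in C. The job states, the one-tick time slice, and the simulated I/O waits are all invented for illustration; real schedulers are far more involved.

    #include <stdio.h>

    enum state { READY, WAITING, DONE };

    struct job {
        const char *name;
        int work_left;   /* time slices of CPU work remaining */
        int io_wait;     /* ticks until a pending I/O completes */
        enum state st;
    };

    int main(void)
    {
        struct job jobs[] = {
            {"A", 3, 0, READY},
            {"B", 2, 2, WAITING},   /* blocked on I/O for 2 ticks */
            {"C", 4, 0, READY},
        };
        int n = 3, remaining = 3;

        while (remaining > 0) {
            for (int i = 0; i < n; i++) {
                struct job *j = &jobs[i];
                if (j->st == WAITING && --j->io_wait == 0)
                    j->st = READY;          /* I/O finished: runnable again */
                if (j->st != READY)
                    continue;               /* skip blocked or finished jobs */
                printf("running %s\n", j->name);
                if (--j->work_left == 0) {  /* give the job one time slice */
                    j->st = DONE;
                    remaining--;
                }
            }
        }
        return 0;
    }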

4.2 Dual-mode operations

Sometimes, during the normal course of operation of an operating system, one will encounter hardware-driven interrupts, traps, infinite loops, and the like.

Under construction