The Linux device driver library consists of various native Linux device drivers coupled with glue code to export the OSKit interfaces such as blkio, netio, and bufio (See Chapter 7). See the source file linux/dev/README for a list of devices and their status.
The header files oskit/dev/linux_ethernet.h and oskit/dev/linux_scsi.h determine which network and SCSI drivers are compiled into liboskit_linux_dev.a. Those files also influence driver probing; see the oskit_linux_init routines below.
There are several ways to initialize this library. One can initialize all the compiled-in drivers (oskit_linux_init_devs), initialize a specific class of drivers (e.g., oskit_linux_init_ethernet), or initialize specific drivers (e.g., oskit_linux_init_scsi_ncr53c8xx).
These initialization functions initialize various glue code and register the appropriate device(s) in the device tree, to be probed with oskit_dev_probe.
This function initializes and registers all known drivers. The known drivers are: the IDE disk driver, and all drivers listed in the <oskit/dev/linux_ethernet.h> and <oskit/dev/linux_scsi.h> files.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
This function initializes and registers all available network drivers. Currently this means Ethernet drivers only, but in the future there may be other network drivers supported such as Myrinet. The known Ethernet drivers are listed in the <oskit/dev/linux_ethernet.h> file.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
This function initializes and registers all available Ethernet network drivers. The known Ethernet drivers are listed in the <oskit/dev/linux_ethernet.h> file.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
This function initializes and registers all available SCSI disk drivers. The known SCSI drivers are listed in the <oskit/dev/linux_scsi.h> file.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
This function initializes and registers all available IDE disk drivers. There is currently only one IDE driver.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
This function initializes and registers a specific SCSI disk driver. The name must be one from the name field of the drivers listed in the <oskit/dev/linux_scsi.h> file.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
This function initializes and registers a specific Ethernet network driver. The name must be one from the name field of the drivers listed in the <oskit/dev/linux_ethernet.h> file.
Once drivers are registered, their devices may be probed via oskit_dev_probe.
Once the desired drivers are initialized, registered, and probed, one can obtain references to their blkio, netio, and other interfaces (see Chapter 7) in two ways.
The first way is to look them up via their Linux name, e.g., “sd0” for a SCSI disk, or “eth0” for an Ethernet device. This method is described here because it is specific to the Linux drivers.
The second, and preferred, way is to use osenv_device_lookup to find a detected device with the desired interface, such as oskit_etherdev_iid (See Chapter 17).
#include <oskit/dev/linux.h>
oskit_error_t oskit_linux_block_open(const char *name, unsigned flags, [out] oskit_blkio_t **out_io);
This function takes a Linux name of a disk, e.g., “sd0” or “wd1”, and returns an oskit_blkio_t that can be used to access the device.
The oskit_blkio interface is described in Chapter 7.
Returns 0 on success, or an error code specified in <oskit/error.h>, on error.
The rest of this chapter is very incomplete. Some of the internal details of the Linux driver emulation are described, but not the aspects relevant for typical use of the library.
XXX
Much of the data here on Linux device driver internals is out-of-date with respect to the newer device drivers that are now part of the OSKit. This section documents drivers from Linux 1.3.6.8 or earlier; the current OSKit drivers are from Linux 2.2.12, so parts of this section are likely no longer correct.
XXX Library can be used either as one component or can be used to produce many separate components, depending on how it is used.
There are a number of assumptions made by some drivers: if a given assumption is not met by the OS using the framework, then the drivers that make the assumption will not work, but other drivers may still be usable. The specific assumptions made by each partially-compliant driver are listed in a table in the appropriate section below; here is a summary of the assumptions some of the drivers make:
The following sections document all the variables and functions that Linux drivers can refer to. These variables and functions are provided by the glue code supplied as part of the library, so this information should not be needed for normal use of the library under the device driver framework. However, they are documented here for the benefit of those working on this library or upgrading it to new versions of the Linux drivers, or for those who wish to “short-cut” through the framework directly to the Linux device drivers in some situations, e.g., for performance reasons.
For an outline of our namespace management conventions, see Section 4.7.2 in our SOSP paper, http://www.cs.utah.edu/flux/papers/index.html#SOSKIT.
This is a global variable that points to the state for the current process. It is mostly used by drivers to set or clear the interruptible state of the process.
Many Linux device drivers depend on a global variable called jiffies, which in Linux contains a clock tick counter that is incremented by one at each 10-millisecond (100Hz) clock tick. The device drivers typically read this counter while polling a device during a (hopefully short) interrupt-enabled busy-wait loop. Although a few drivers take the global clock frequency symbol HZ into account when determining timeout values and such, most of the drivers simply use hard-coded values when using the jiffies counter for timeouts, and therefore assume that jiffies increments “about” 100 times per second.
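The difference between a hard-coded tick count and one derived from HZ can be sketched as follows. This is a minimal illustration rather than code from the library: the helper names ms_to_ticks and ticks_after are hypothetical, HZ is defined locally at the 100Hz rate described above, and the tick count is passed in explicitly instead of being read from the global jiffies.

```c
/* Assumed for this sketch: the 100 Hz tick rate described above. */
#define HZ 100

/* Hypothetical helper: convert milliseconds to clock ticks, rounding up
 * so that a non-zero delay never becomes a zero-tick timeout. */
static unsigned long ms_to_ticks(unsigned long ms)
{
    return (ms * HZ + 999) / 1000;
}

/* Hypothetical helper: has the deadline been reached?  Taking the signed
 * difference keeps the comparison correct when the counter wraps. */
static int ticks_after(unsigned long now, unsigned long deadline)
{
    return (long)(now - deadline) >= 0;
}
```

A driver written this way would compute its deadline as the current tick count plus ms_to_ticks(delay) and keep working if the tick rate were changed, whereas a driver that adds a literal constant silently assumes 100 ticks per second.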
This variable is an array of pointers to network device structures. The array is indexed by the interrupt request line (IRQ) number. Linux network drivers use it in interrupt handlers to find the interrupting network device given the IRQ number passed to them by the kernel.
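The lookup an interrupt handler performs on this array can be modeled in a few lines. The sketch below is a stand-alone model, not the library's definition: the structure, the array name irq_to_dev, the table size of 16, and the helper lookup_by_irq are all assumptions made for illustration.

```c
#include <stddef.h>

#define NR_IRQS 16   /* assumed: the 16 IRQ lines of a PC */

/* Stand-in for the Linux network device structure; only a name here. */
struct net_device_model {
    const char *name;
};

/* Model of the map: one slot per IRQ line, NULL if no device is attached. */
static struct net_device_model *irq_to_dev[NR_IRQS];

/* What an interrupt handler does first: find its device from the IRQ. */
static struct net_device_model *lookup_by_irq(int irq)
{
    if (irq < 0 || irq >= NR_IRQS)
        return NULL;
    return irq_to_dev[irq];
}
```

Because there is only one slot per IRQ line, a scheme like this can record only one network device per interrupt line.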
This variable is an array of “struct blk_dev_struct” structures. It is indexed by the major device number. Each element contains the I/O request queue and a pointer to the I/O request function in the driver. The kernel queues I/O requests on the request queue, and calls the request function to process the queue.
This variable is an array of pointers to integers. It is indexed by the major device number. The subarray is indexed by the minor device number. Each cell of the subarray contains the size of the device in 1024-byte units. The subarray pointer can be NULL, in which case the kernel does not check the size and range of an I/O request for the device.
This variable is an array of pointers to integers. It is indexed by the major device number. The subarray is indexed by the minor device number. Each cell of the subarray contains the block size of the device in bytes. The subarray can be NULL, in which case, the kernel uses the global definition BLOCK_SIZE (currently 1024 bytes) in its calculations.
This variable is an array of pointers to integers. It is indexed by the major device number. The subarray is indexed by the minor device number. Each cell of the subarray contains the hardware sector size of the device in bytes. If the subarray is NULL, the kernel uses 512 bytes in its calculations.
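The two-level layout shared by the three tables above, and the meaning of a NULL subarray in each, can be modeled as follows. This is a sketch under stated assumptions: the array and function names are hypothetical, the table is limited to four major numbers, and requests are assumed to be expressed in 512-byte sectors.

```c
#include <stddef.h>

#define MAX_MAJOR       4     /* assumed small table size for the sketch */
#define DEFAULT_BLOCK   1024  /* BLOCK_SIZE, as stated above */
#define DEFAULT_SECTOR  512   /* default hardware sector size, as stated above */

/* Models of the three tables: indexed by major number, each entry
 * pointing to a per-minor array, or NULL. */
static int *size_kb[MAX_MAJOR];       /* device size in 1024-byte units */
static int *block_bytes[MAX_MAJOR];   /* block size in bytes */
static int *sector_bytes[MAX_MAJOR];  /* hardware sector size in bytes */

static int block_size_of(int major, int minor)
{
    return block_bytes[major] ? block_bytes[major][minor] : DEFAULT_BLOCK;
}

static int sector_size_of(int major, int minor)
{
    return sector_bytes[major] ? sector_bytes[major][minor] : DEFAULT_SECTOR;
}

/* Returns 1 if a request of `count` 512-byte sectors starting at `sector`
 * fits on the device, or if no size is recorded (no check is made). */
static int request_in_range(int major, int minor,
                            unsigned long sector, unsigned long count)
{
    unsigned long limit;

    if (size_kb[major] == NULL)
        return 1;
    limit = (unsigned long)size_kb[major][minor] * 2; /* KB to sectors */
    return sector <= limit && count <= limit - sector;
}
```

A driver for major number 1 with a single 100KB device would point size_kb[1] at an array whose first element is 100; a request for two sectors starting at sector 199 would then be rejected, while the same request against a major number with a NULL entry would pass unchecked.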
This variable is an array of integers indexed by the major device number. It specifies how many sectors of read-ahead the kernel should perform on the device. The drivers only initialize the values in this array; the Linux kernel block buffer code is the actual user of these values.
The Linux kernel uses a static array of I/O request structures. When all I/O request structures are in use, a process sleeps on this variable. When a driver finishes an I/O request and frees the I/O request structure, it performs a wake up on this variable.
If this variable is non-zero, it indicates that the machine has an EISA bus. It is initialized by the Linux kernel prior to device configuration.
This variable contains the address of the last byte of physical memory plus one. It is initialized by the Linux kernel prior to device configuration.
This variable is incremented on entry to an interrupt handler, and decremented on exit. Its purpose is to let driver code determine whether it was called from an interrupt handler.
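The counting scheme can be illustrated with a small stand-alone model. The names intr_depth, run_handler, in_interrupt_context, and sample_handler are hypothetical; in particular, run_handler only stands in for whatever entry and exit code surrounds a real interrupt handler.

```c
/* Model of the counter: non-zero while any interrupt handler is running. */
static int intr_depth;

/* Hypothetical wrapper standing in for the interrupt entry/exit code. */
static void run_handler(void (*handler)(void))
{
    intr_depth++;
    handler();
    intr_depth--;
}

/* What driver code checks before doing something that may block. */
static int in_interrupt_context(void)
{
    return intr_depth != 0;
}

/* Example handler: records what the check reports while it runs. */
static int seen_in_handler;

static void sample_handler(void)
{
    seen_in_handler = in_interrupt_context();
}
```

Using a counter rather than a flag means the check remains correct when one interrupt handler is itself interrupted by another.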
This variable contains Linux kernel statistics counters. Linux drivers increment various fields in it when certain events occur.
Linux has a notion of “bottom half” handlers. These handlers have a higher priority than any user level process but lower priority than hardware interrupts. They are analogous to software interrupts in BSD. Linux checks if any “bottom half” handlers need to be run when it is returning to user mode. Linux provides a number of lists of such handlers that are scheduled on the occurrence of specific events. tq_timer is one such list. On every clock interrupt, Linux checks if any handlers are on this list, and if there are, immediately schedules the handlers to run.
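The deferral pattern described above, where interrupt-level code only queues a handler and the real work runs later, can be sketched as follows. This is a simplified model, not the Linux task queue implementation: the structure layout and the names tq_add, tq_run, timer_queue, and count_runs are assumptions made for the example, and no interrupt masking or locking is shown.

```c
#include <stddef.h>

/* One deferred handler, modeled loosely on a task-queue entry. */
struct deferred {
    struct deferred *next;
    void (*routine)(void *);
    void *data;
    int queued;            /* guards against queuing the same entry twice */
};

/* Model of a list such as tq_timer. */
static struct deferred *timer_queue;

/* Called from interrupt level: just link the entry in, do no real work.
 * Returns 1 if the entry was added, 0 if it was already queued. */
static int tq_add(struct deferred *d)
{
    if (d->queued)
        return 0;
    d->queued = 1;
    d->next = timer_queue;
    timer_queue = d;
    return 1;
}

/* Called on each clock tick: detach the whole list, then run every entry. */
static void tq_run(void)
{
    struct deferred *d = timer_queue;

    timer_queue = NULL;
    while (d != NULL) {
        struct deferred *next = d->next;

        d->queued = 0;
        d->routine(d->data);
        d = next;
    }
}

/* Example handler: counts how many times it has run. */
static void count_runs(void *data)
{
    ++*(int *)data;
}
```

Detaching the list before running it lets a handler queue itself again for the next tick without being run twice in the same pass.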
This integer variable indicates which of the timers in timer_table (described below) are active. A bit is set if the timer is active, otherwise it is clear.
This variable is an array of “struct timer_struct” elements. The array is indexed by global constants defined in <linux/timer.h>. Each element contains the duration of the timeout, and a pointer to a function that will be invoked when the timer expires.
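The relationship between the active-timer bit mask and the timer table can be modeled as below. This is a sketch, not the Linux definition: the names timer_slots, active_mask, timer_start, and timers_run are hypothetical, the table is assumed to hold 32 entries, and the timeout is assumed to be stored as an absolute tick count.

```c
#define NR_TIMERS 32   /* assumed: one bit per timer in a 32-bit mask */

struct timer_slot {
    unsigned long expires;   /* tick count at which the timer fires */
    void (*fn)(void);        /* function to call on expiry */
};

static struct timer_slot timer_slots[NR_TIMERS];
static unsigned long active_mask;   /* bit n set: timer_slots[n] is active */

/* Arm timer n and mark it active. */
static void timer_start(int n, unsigned long expires, void (*fn)(void))
{
    timer_slots[n].expires = expires;
    timer_slots[n].fn = fn;
    active_mask |= 1UL << n;
}

/* Run on each clock tick: fire and deactivate every expired timer.
 * Returns the number of timers that fired. */
static int timers_run(unsigned long now)
{
    int n, fired = 0;

    for (n = 0; n < NR_TIMERS; n++) {
        if (!(active_mask & (1UL << n)))
            continue;                     /* inactive slot */
        if ((long)(now - timer_slots[n].expires) < 0)
            continue;                     /* not yet due */
        active_mask &= ~(1UL << n);
        timer_slots[n].fn();
        fired++;
    }
    return fired;
}

/* Example expiry function: counts expirations. */
static int expiries;

static void on_expiry(void)
{
    expiries++;
}
```

Keeping the active state in a separate mask lets the per-tick scan skip unused slots with a single bit test each.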
This variable holds the Linux version number. Some drivers check the kernel version to account for feature differences between different kernel releases.
This function is called by a driver to set up for probing IRQs. The function attaches a handler on each available IRQ, waits for waittime ticks, and returns a bit mask of the available IRQs. The driver should then force the device to generate an interrupt.
This function is called by a driver after it has programmed the device to generate an interrupt. The function waits waittime ticks, and returns the IRQ number on which the device interrupted. If no interrupt occurred, 0 is returned.
This function registers a driver for the major number major. When an access is made to a device with the specified major number, the kernel accesses the driver through the operations vector fops. The function returns 0 on success, non-zero otherwise.
This function removes the association between a driver and the major number major, previously established by register_blkdev. The function returns 0 on success, non-zero otherwise.
This function is called by a driver to allocate a buffer size bytes in length and associate it with device dev, and block number block.
This function frees the buffer bh, previously allocated by getblk.
This function allocates a buffer size bytes in length, and fills it with data from device dev, starting at block number block.
This function is the default implementation of file write. It is used by most of the Linux block drivers. The function writes count bytes of data to the device specified by the i_rdev field of inode, starting at the byte offset specified by the f_pos field of file, from the buffer buf. The function returns 0 for success, non-zero otherwise.
This function is the default implementation of file read. It is used by most of the Linux block drivers. The function reads count bytes of data from the device specified by the i_rdev field of inode, starting at the byte offset specified by the f_pos field of file, into the buffer buf. The function returns 0 for success, non-zero otherwise.
This function checks if media has been removed or changed in a removable medium device specified by dev. It does so by invoking the check_media_change function in the driver’s file operations vector. If a change has occurred, it calls the driver’s revalidate function to validate the new media. The function returns 0 if no medium change has occurred, non-zero otherwise.
This function allocates the DMA request line drq for the calling driver. It returns 0 on success, non-zero otherwise.
This function frees the DMA request line drq previously allocated by request_dma.
This function masks the interrupt request line irq at the interrupt controller.
This function unmasks the interrupt request line irq at the interrupt controller.
This function allocates the interrupt request line irq, and attaches the interrupt handler handler to it. It returns 0 on success, non-zero otherwise.
This function frees the interrupt request line irq, previously allocated by request_irq.
This function allocates size bytes of memory. The priority argument is a set of bitfields defined as follows:
This function frees the memory p previously allocated by kmalloc.
This function allocates size bytes of memory in kernel virtual space that need not have underlying contiguous physical memory.
Check if the I/O address space region starting at port and size bytes in length, is available for use. Returns 0 if region is free, non-zero otherwise.
Allocate the I/O address space region starting at port and size bytes in length. It is the caller’s responsibility to make sure the region is free by calling check_region, prior to calling this routine.
Free the I/O address space region starting at port and size bytes in length, previously allocated by request_region.
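The check-then-allocate protocol for I/O regions can be illustrated with a small stand-alone model. The names below (region_check, region_request, region_release, claim_ports) are hypothetical stand-ins for the three routines described above, the fixed table of eight entries is an assumption of the sketch, and no real I/O ports are touched.

```c
#define MAX_REGIONS 8   /* assumed table size for the sketch */

struct io_region {
    unsigned int start, len;
    int used;
};

static struct io_region regions[MAX_REGIONS];

/* Returns 0 if [port, port + size) overlaps no reserved region,
 * non-zero otherwise. */
static int region_check(unsigned int port, unsigned int size)
{
    int i;

    for (i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].used &&
            port < regions[i].start + regions[i].len &&
            regions[i].start < port + size)
            return -1;
    }
    return 0;
}

/* Records the region in the first free slot.  As with the routine it
 * models, it does not check for overlap; that is the caller's job. */
static void region_request(unsigned int port, unsigned int size)
{
    int i;

    for (i = 0; i < MAX_REGIONS; i++) {
        if (!regions[i].used) {
            regions[i].start = port;
            regions[i].len = size;
            regions[i].used = 1;
            return;
        }
    }
}

/* Frees a region previously recorded with the same port and size. */
static void region_release(unsigned int port, unsigned int size)
{
    int i;

    for (i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].used &&
            regions[i].start == port && regions[i].len == size)
            regions[i].used = 0;
    }
}

/* The usual driver idiom: check first, then claim. */
static int claim_ports(unsigned int port, unsigned int size)
{
    if (region_check(port, size) != 0)
        return -1;
    region_request(port, size);
    return 0;
}
```

In this model a driver that claims 32 ports at 0x300 causes a later check of 0x310 to fail until the region is released, which is the behavior a probing driver relies on to avoid touching another device's ports.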
Add the wait element wait to the wait queue q.
Remove the wait element wait from the wait queue q.
Perform a down operation on the semaphore sem. The caller blocks if the value of the semaphore is less than or equal to 0.
Add the caller to the wait queue q, and block it. If the interruptible flag is non-zero, the caller can be woken up from its sleep by a signal.
Wake up anyone waiting on the wait queue q.
Put the caller to sleep, waiting on the buffer bh. Called by drivers to wait for I/O completion on the buffer.
Call the scheduler to pick the next task to run.
Schedule a time out. The length of the time out and function to be called on timer expiry are specified in timer.
Cancel the time out timer.
The linux subdirectory in the OSKit source tree is organized as follows. The top-level linux/dev directory contains all the glue code implemented by the Flux project to squash the Linux drivers into the OSKit driver framework. linux/fs contains our glue for Linux filesystems, and linux/shared contains glue used by both components. In general, everything except the code in the linux/src directory was written by us, whereas everything under linux/src comes verbatim from Linux. Each of the subdirectories of linux/src corresponds to the identically named subdirectory in the Linux kernel source tree.
Of course, there are a few necessary deviations from this rule: a few of the Linux header and source files are slightly modified, and a few of the Linux header files (but no source files) were completely replaced. The header files that were heavily modified include:
Things drivers may want to do that make emulation difficult:
The Linux SCSI driver set includes both the low-level SCSI host adapter drivers and the high-level SCSI drivers for generic SCSI disks, tapes, etc.