The OSKit’s minimal C library is a subset of a standard ANSI/POSIX C library designed specifically for use in kernels or other restricted environments in which a “full-blown” C library cannot be used. The minimal C library provides many simple standard functions such as string, memory, and formatted output functions: functions that are often useful in kernels as well as application programs, but because ordinary application-oriented C libraries are unusable in kernels, must usually be reimplemented or manually “pasted” into the kernel sources with appropriate modifications to make them usable in the kernel environment. The versions of these functions provided by the OSKit minimal C library, like the other components of the OSKit, are designed to be as generic and context-independent as possible, so that they can be used in arbitrary environments without the developer having to resort to the traditional manual cut-and-paste methods. This cleaner strategy brings with it the well-known advantages of careful code reuse: the kernel itself becomes smaller and simpler due to fewer extraneous “utility” functions hanging around in the sources; it is easier to maintain both the kernel, for the above reason, and the standard utility functions it uses, because there is only one copy of each to maintain; finally, the kernel can easily adopt new, improved implementations of common performance-critical functions as they become available, simply by linking against a new version of the minimal C library (e.g., new versions of memcpy or bzero optimized for particular architectures or newer family members of a given architecture).
In general, the minimal C library provides only functions specified in the ANSI C or POSIX.1 standards, and only a subset thereof. Furthermore, the provided implementations of these functions are designed to be as independent as possible from each other and from the environment in which they run, allowing arbitrary subsets of these functions to be used when needed without pulling in any more functionality than necessary and without requiring the OS developer to provide significant support infrastructure. For example, all of the “simple” functions which merely perform some computation on or manipulation of supplied data, such as the string functions, are guaranteed to be completely independent of each other.
The functions that are inherently environment-dependent in some way, such as printf, which assumes the existence of some kind of “standard output” or “console,” are implemented in terms of other clearly specified, environment-dependent functions. Thus, in order to use the minimal C library’s implementation of printf, the OS developer must provide appropriate console_putchar and console_putbytes routines to be used to write characters to whatever acts as the “standard output” in the current environment. All such dependencies between C library functions are explicitly stated in this document, so that it is always clear what additional functions the developer must supply in order to make use of a set of functions provided by the minimal C library.
Since almost all of the functions and definitions provided by the OSKit minimal C library implement well-known, well-defined ANSI and POSIX C library interfaces which are amply documented elsewhere, we do not attempt to describe the purpose and behavior of each function in this chapter. Instead, only the peculiarities relevant to the minimal C library, such as implementation interdependencies and side effects, are described here.
Note that many files and functions in the minimal C library are derived or taken directly from other source code bases, particularly Mach and BSD. Specific attributions are made in the source files themselves.
Some of the functions provided in the minimal C library depend on lower level I/O routines in the POSIX library (see Section 20) to provide mappings to the appropriate OSKit COM interfaces. For example, fopen in the C library will chain to open in the POSIX library, which in turn will chain to the appropriate oskit_dir and oskit_file COM operations.
The following features in many C libraries are deliberately unsupported by the minimal C library, for reasons described below, and will remain unsupported unless a compelling counterargument arises:
There is limited support for printing floating point numbers, however. If this feature is desired, doprnt.c can be compiled with -DDOPRNT_FLOAT to enable the use of the %f format specifier.
When the OSKit is installed using make install, a set of standard ANSI/POSIX-defined header files, containing definitions and function prototypes for the minimal C library, is installed in the selected include directory under the subdirectory oskit/c/. For example, the version of the ANSI C header file string.h provided with the minimal C library is installed as prefix/include/oskit/c/string.h. These header files are installed in a subdirectory rather than in the top level include directory so that if the OSKit is installed in a standard place shared by other packages and/or system files, such as /usr or /usr/local, the minimal C library’s header files will not conflict with header files provided by normal application-oriented C libraries, nor will applications “accidentally” use the minimal C library’s header files when they really want the normal C library’s header files.
There are two main ways a kernel or other program can explicitly use the OSKit minimal C library’s header files. The first is by including the oskit/c/ prefix directly in all relevant #include statements; e.g., ‘#include <oskit/c/string.h>’ instead of ‘#include <string.h>’. However, since this method effectively makes the client code somewhat specific to the OSKit minimal C library by hard-coding OSKit-specific pathnames into the #include statements, this method should generally only be used if for some reason the code in question is extremely dependent on the OSKit minimal C library in particular, and it would never make sense for it to include corresponding header files from a different C library.
For typical code using the minimal C library, which simply needs “a printf” or “a strcpy,” the preferred method of including the library’s header files is to code the #include lines without the oskit/c/ prefix, just as in application code using an ordinary C library, and then add an appropriate -I (include directory) directive to the compiler command line so that the oskit/c/ directory will be scanned automatically for these header files before the top-level include directory and other include directories in the system are searched. Typically this -I directive can be added to the CFLAGS variable in the Makefile used to build the program in question. In fact, the OSKit itself uses this method to allow code in other toolkit components and in the minimal C library itself to make use of definitions and functions provided by the minimal C library. (Of course, these dependencies are clearly documented, so that if you want to use other OSKit components but not the minimal C library, or only part of the minimal C library, it is possible to do so cleanly.)
Except when otherwise noted, all of the definitions and functions described in this section are very simple, have few dependencies, and behave as in ordinary C libraries. Functions that are not self-contained and interact with the surrounding environment in non-trivial ways (e.g., the memory allocation functions) are described in more detail in later sections.
This header file simply cross-includes the header file oskit/exec/a.out.h, which is part of the executable interpreter library (see Section 35.1.2) and provides a minimal set of definitions describing a.out-format executable and object files. Although this header file is not standard ANSI or POSIX (thank goodness!), it is a fairly strong Unix tradition, and is especially relevant to operating system code, and therefore is provided as part of the OSKit.
This header file defines the alloca pseudo-function, which allows C code to dynamically allocate memory on the calling function’s stack frame, which will be freed automatically when the function returns. This header is not ANSI or POSIX but is a fairly well-established tradition. The implementation of this function currently depends on being compiled with gcc.
This header file provides a standard assert macro as described in the C standard. All uses of the assert macro are compiled out (they generate no code) if the preprocessor symbol NDEBUG is defined before this header file is included.
This header file provides implementations of the following standard character handling functions:
The implementations of these functions provided by the minimal C library are directly-coded inline functions, and do not reference any global data structures such as character type arrays. They do not support locales (see Section 14.3), and only recognize the basic 7-bit ASCII character set (all characters above 126 are considered to be control characters).
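As a rough illustration of what “directly-coded” means here, the following sketch classifies characters with plain comparisons and no lookup table. The sketch_ names are hypothetical and are not the library’s own definitions; the only behavior taken from the text above is that everything above 126 counts as a control character.

```c
/* Hypothetical table-free classifiers for 7-bit ASCII (illustration only). */
static inline int sketch_isdigit(int c) { return c >= '0' && c <= '9'; }
static inline int sketch_isupper(int c) { return c >= 'A' && c <= 'Z'; }
static inline int sketch_islower(int c) { return c >= 'a' && c <= 'z'; }
static inline int sketch_isalpha(int c) { return sketch_isupper(c) || sketch_islower(c); }

/* Below the space character or above 126: treated as a control character. */
static inline int sketch_iscntrl(int c) { return c < 32 || c > 126; }

/* Case conversion by arithmetic on the ASCII code, no table needed. */
static inline int sketch_toupper(int c) { return sketch_islower(c) ? c - 'a' + 'A' : c; }
```

Because nothing here touches global data, each function can be used on its own without pulling in any other part of a library.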
This file declares the global errno variable, and defines symbolic constants for all the errno values defined in the ISO/ANSI C, POSIX.1, and UNIX standards. They are provided mainly for the convenience of clients that can benefit from standardized error codes and do not already have their own error handling scheme and error code namespace. The symbols defined in this header file have the same values as the corresponding symbols defined in oskit/error.h (see 4.6.2), which are the error codes used through the OSKit’s COM interfaces; this way, error codes from arbitrary OSKit components can be used directly as errno values at least by programs that use the minimal C library.
The main disadvantage of using COM error codes as errno values is that, since they don’t start from around 0 like typical Unix errno values, it’s impossible to provide a traditional Unix-style sys_errlist table for them. However, they are fully compatible with the POSIX-blessed strerror and perror routines, and in any case the minimal C library is not intended to support “legacy” applications directly - for that purpose, a “real” C library would be more appropriate, and such a C library would probably use more traditional errno values, doing appropriate translation when interacting with COM interfaces.
This header file defines prototypes for the low-level POSIX functions creat and open, and provides symbolic constants for the POSIX open mode flags (O_*). Neither creat nor open is defined in the minimal C library; both are instead defined in the POSIX library (see Section 20).
The open mode constants defined by this header are identical to and interchangeable with the corresponding constants defined in oskit/fs/file.h for the oskit_file COM interface (see 9.4). These definitions are provided so that clients may standardize on a single set of definitions, which are the same as those used by the COM components. For example, the FreeBSD C library includes this header file, thus providing compatibility between the two libraries and the disk-based file systems.
This header file provides the standard set of symbols required by the ISO C standard describing various characteristics of the float, double, and long double types. There is nothing special about the OSKit’s definition of these symbols; see the ANSI/ISO C or Single UNIX standard for detailed information about this header file.
This header file defines the following standard symbols describing architecture-specific limits of basic numeric types:
The minimal C library’s limits.h does not define any of the POSIX symbols describing operating system-specific limits, such as maximum number of open files, since the minimal C library has no way of knowing how it will be used and thus what these values should be.
This header file defines common types and functions used by the minimal C library’s default memory allocation functions. This header file is not a standard POSIX or X/Open CAE header file; instead its purpose is to expose the implementation of the malloc facility so that the client can fully control it and use it in arbitrary contexts.
The malloc package implements the following standard allocation routines (also defined in stdlib.h).
The base C library also provides additional routines that allocate chunks of memory that are naturally aligned. The user must keep track of the size of each allocated chunk and free the memory with sfree rather than the ordinary free.
The following are specific to the LMM implementation. They take an additional flag to allow requests for specific types of memory.
The following functions are frequently overridden by the client OS:
See Section 14.5 for details on these functions.
This header file provides function prototypes for the math functions conventionally found in libm, the standard C math library. Although these functions are not part of the minimal C library, an implementation of the math functions is available in the FreeBSD math library; see Chapter 22 for details. This header file also defines various floating-point constants, such as the value of π, as described in the Unix CAE specification. Since these functions and their implementations are fully standard, they are not described in further detail here; refer to the ISO C and Unix standards for more information.
This header file defines structures and prototypes for Internet domain name service (DNS) operations, such as finding the IP address for a host name and vice versa.
This header provides definitions for the minimal setjmp/longjmp facility provided in the minimal C library. This facility differs from standard ones in two ways:
In summary, this header file defines the following symbols:
The minimal C library has no support for signals, and thus does not implement any of the functions prototyped in this header file. The header file is here for client OSes that wish to support POSIX signal semantics.
This header provides definitions for accessing variable argument lists. It simply chains to x86-specific definitions.
This header file defines the symbol NULL and the type size_t if they haven’t been defined already. It also defines wchar_t and the offsetof macro.
This header provides definitions for the standard input and output facilities provided by the minimal C library. Many of these routines simply chain to the low-level I/O routines in the POSIX library, and do no buffering.
This header file defines the symbol NULL and the type size_t if they haven’t been defined already, and provides prototypes for the following functions in the minimal C library:
Prototypes for the following functions are also provided, but they are not implemented in the minimal C library. See the FreeBSD C library in Section 21.
This header file defines the symbol NULL if it hasn’t been defined already, and provides prototypes for the following functions in the minimal C library:
The following deprecated functions are provided for compatibility with existing code:
For compatibility with existing software, a header file called strings.h is provided which acts as a synonym for string.h (Section 14.4.18).
GNU profiling support definitions.
Format definitions for ‘ioctl’ commands. From BSD4.4.
This file includes constant definitions and function prototypes for memory management operations.
None of these routines are implemented in the minimal C library.
The defined constant values are the same as traditional BSD, though the values of PROT_READ and PROT_EXEC are reversed.
This file should be included by code that requires certain system- and machine-dependent parameters and functions.
Definitions of the arguments to the reboot system call.
This header simply includes the base C library signal.h.
This header includes constant definitions and function prototypes for file operations.
None of these routines are implemented in the minimal C library. Refer to the POSIX library in Section 20.
This header simply includes the base C library termio.h.
This header includes constant definitions and function prototypes for timing and related functions, none of which are implemented in the minimal C library. Refer to the POSIX library (Section 20) and the FreeBSD C library (Section 21) for implementation of these functions.
General POSIX types.
Note that the minimal C library has no support for processes, and thus doesn’t implement any of the functions prototyped in this header file. The header file is here in case client OSes wish to support POSIX wait semantics.
The minimal C library does not fully support termios. Some of the termio stuff is implemented elsewhere to support OSKit devices.
This file contains the required symbolic constants for a POSIX system. These include the symbolic access and seek constants:
This file defines no POSIX compile-time or execution-time constants. Additionally defined are the constants:
Also provided are prototypes for standard POSIX functions:
Of the above routines, only _exit is considered part of the minimal C library. The remaining functions are part of the extended POSIX environment. Refer to Section 20 for details.
This file defines the utimbuf structure, as well as the prototype for the POSIX function utime, which sets the access and modification times of a named file. This function is not implemented in the minimal C library. Refer to Section 20 for details.
This file defines the utsname structure, as well as the prototype for the POSIX function uname, which returns a series of null terminated strings of information identifying the current system. This function is not implemented in the minimal C library. Refer to Section 20 for details.
All of the default memory allocation functions in the minimal C library are built on top of the OSKit LMM, described in Chapter 25.
There are three families of memory allocation routines available in the minimal C library. First is the standard malloc, realloc, calloc, and free. These work as in any standard C library.
The second family, smalloc, smemalign, and sfree, assumes that the caller will keep track of the size of allocated memory blocks. Chunks allocated with smalloc-style functions must be freed with sfree rather than the normal free. These functions are not part of the POSIX standard, but are much more memory efficient when allocating many power-of-two-size chunks naturally aligned to their size (e.g., when allocating naturally-aligned pages or superpages). The normal memalign function attaches a prefix to each allocated block to keep track of the block’s size, and the presence of this prefix makes it impossible to allocate naturally-aligned, natural-sized blocks successively in memory; only every other block can be used, greatly increasing fragmentation and effectively halving usable memory. (Note that this fragmentation property is not peculiar to the OSKit’s implementation of memalign; most versions of memalign have this effect.)
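The “every other block” effect can be checked with a little address arithmetic. The sketch below is not taken from the OSKit sources: the function names, the 4096-byte block size, and the 8-byte prefix are all assumptions chosen for illustration.

```c
#include <stdint.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Given a naturally aligned block of `size` bytes at `start`, return the
 * lowest address at which the next such block can begin when every block
 * is preceded by a `prefix`-byte header. A prefix of 0 models the
 * smalloc-style case, where the caller remembers the size instead.
 */
static uintptr_t next_block_start(uintptr_t start, uintptr_t size,
                                  uintptr_t prefix)
{
    return align_up(start + size + prefix, size);
}
```

With a block at 0x1000, a size of 0x1000, and an 8-byte prefix, the next block cannot start before 0x3000, so the slot at 0x2000 is skipped; with no prefix the next block starts at 0x2000.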
The third family, mallocf, memalignf, smallocf, and smemalignf, allow LMM flags to be passed to the more common allocation routines. These are useful for allocating memory of a specific type (see 25.2). Memory allocated with these routines should be freed with free or sfree as appropriate.
All of the memory management functions, if they are unable to allocate a block out of the LMM pool, call the morecore function and then retry the allocation if morecore returns non-zero. The default behavior for this function is simply to return 0, signifying that no more memory is available. In environments in which a dynamically growable heap is available, you can override the morecore function to grow the heap as appropriate.
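The allocate, grow, and retry pattern described above might look roughly like the following toy model. Everything here is hypothetical: the toy_ names, the byte counters standing in for the LMM pool, and the int return value (a real allocator returns a pointer) are simplifications so that the sketch runs on an ordinary hosted system.

```c
#include <stddef.h>

static size_t toy_pool_free = 0;   /* bytes currently available in the toy pool */
static size_t toy_reserve   = 64;  /* bytes toy_morecore may still add */

/* Stand-in for the pool allocator: succeeds only if enough bytes remain. */
static int toy_pool_take(size_t size)
{
    if (size > toy_pool_free)
        return 0;
    toy_pool_free -= size;
    return 1;
}

/* Stand-in for an overridden morecore: nonzero means memory was added. */
static int toy_morecore(size_t size)
{
    if (size > toy_reserve)
        return 0;
    toy_reserve -= size;
    toy_pool_free += size;
    return 1;
}

/* Try the pool; on failure ask for more memory and retry until that fails. */
static int toy_alloc(size_t size)
{
    while (!toy_pool_take(size)) {
        if (!toy_morecore(size))
            return 0;
    }
    return 1;
}
```

A default morecore that always returns 0 makes the loop give up after the first failed attempt, which matches the default behavior described above.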
All of the memory allocation functions call mem_lock and mem_unlock to protect access to the LMM pool underlying all of these services. By default, these synchronization functions in the minimal C library do nothing. However, when the C library is initialized (see Section 14.7.1 or Section 21.8.1), it queries for the lock manager (see Section 6.3) to determine whether a default lock implementation is available and, if so, uses that implementation to guarantee thread/SMP safety. The absence of a lock manager implementation implies a single-threaded environment, in which locks are unnecessary. Alternatively, mem_lock and mem_unlock can be overridden with functions that acquire and release a lock of some kind appropriate to the environment in order to make the allocation functions thread- or SMP-safe. Also, note that if you link in liboskit_kern before liboskit_c, the kernel support library provides its own default implementation of mem_lock and mem_unlock, which call base_critical_enter and base_critical_leave respectively; this provides simple and robust, though probably far from optimal, memory allocation protection for kernel code running on the bare hardware.
The LMM pool used by all default memory allocation functions either directly or indirectly.
In the base environment, this LMM is initialized at boot time to contain all the physical memory available in the system (see Section 15.11). “Available memory” means all that is not used by base environment data structures or by the OS kernel image itself.
Standard issue malloc function. Calls mallocf with flags value zero to allocate the memory.
Returns a pointer to the allocated memory or zero if none.
Calls malloc to allocate memory, asserting that the return is non-zero; i.e., mustmalloc will panic if no memory is available.
Note that if NDEBUG is defined, assert will do nothing and this routine is identical to malloc.
Returns a pointer to the allocated memory if it returns at all.
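The behavior described above amounts to a thin wrapper around malloc. A minimal sketch, using the hypothetical name sketch_mustmalloc and the host system’s malloc and assert rather than the OSKit versions:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical equivalent of the behavior described above. */
static void *sketch_mustmalloc(size_t size)
{
    void *p = malloc(size);

    assert(p != 0);   /* compiled out when NDEBUG is defined */
    return p;
}
```

When NDEBUG is defined the assert disappears, so callers must again check the result for zero themselves.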
Allocate uninitialized memory with the specified byte alignment; e.g., an alignment value of 32 will return a block aligned on a 32-byte boundary. Calls memalignf with flags value zero to allocate the memory.
Note that the alignment is not the same as used by the underlying LMM routines. The alignment parameter in LMM calls is the number of low-order bits that should be zero in the returned pointer.
Returns a pointer to the allocated memory or zero if none.
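To make the difference between the two alignment conventions concrete, the following sketch converts a power-of-two byte alignment into the count of low-order zero bits that the text says the LMM expects. The function name is hypothetical, and how the library itself performs this conversion is not specified here.

```c
#include <stddef.h>

/*
 * Convert a power-of-two byte alignment into the number of low-order
 * zero bits it implies. Returns -1 if the value is not a power of two.
 */
static int sketch_align_bits(size_t alignment)
{
    int bits = 0;

    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return -1;
    while ((alignment >>= 1) != 0)
        bits++;
    return bits;
}
```

For example, a byte alignment of 32 corresponds to 5 zero bits, and a 4096-byte page alignment corresponds to 12.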
Standard issue calloc function. Calls malloc to allocate the memory and memset to clear it.
Returns a pointer to the allocated memory or zero if none.
Calls calloc to allocate memory, asserting that the return is non-zero; i.e., mustcalloc will panic if no memory is available.
Note that if NDEBUG is defined, assert will do nothing and this routine is identical to calloc.
Returns a pointer to the allocated memory if it returns at all.
Standard issue realloc function. Calls malloc if buf is zero, otherwise calls lmm_alloc to allocate an entirely new block of memory, uses memcpy to copy the old block, and lmm_frees that block when done.
May call morecore if the initial attempt to allocate memory fails.
Returns a pointer to the allocated memory or zero if none.
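The allocate, copy, and free sequence can be sketched as follows. This is an illustration only: it layers a hypothetical size prefix on the host system’s malloc instead of calling lmm_alloc and lmm_free, and the sketch_ names and header layout are assumptions, not the library’s actual data structures.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size prefix; the union keeps the payload suitably aligned. */
union sketch_hdr {
    size_t      size;
    max_align_t align;
};

static void *sketch_malloc(size_t size)
{
    union sketch_hdr *h = malloc(sizeof *h + size);

    if (h == 0)
        return 0;
    h->size = size;          /* remember the size for realloc */
    return h + 1;            /* payload starts right after the prefix */
}

static void sketch_free(void *buf)
{
    if (buf != 0)
        free((union sketch_hdr *)buf - 1);
}

/* Allocate a new block, copy the old contents, then release the old block. */
static void *sketch_realloc(void *buf, size_t new_size)
{
    union sketch_hdr *old;
    void *p;

    if (buf == 0)
        return sketch_malloc(new_size);
    old = (union sketch_hdr *)buf - 1;
    p = sketch_malloc(new_size);
    if (p == 0)
        return 0;            /* old block is left untouched on failure */
    memcpy(p, buf, old->size < new_size ? old->size : new_size);
    sketch_free(buf);
    return p;
}
```

The prefix is what lets realloc know how many bytes to copy; it is also the prefix that the smalloc family avoids.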
Standard issue free function. Calls lmm_free to release the memory.
Note that free must only be called with memory allocated by one of: malloc, realloc, calloc, mustmalloc, mustcalloc, mallocf, memalign, or memalignf.
Identical to malloc except that the user must keep track of the size of the allocated chunk and pass that size to sfree when releasing the chunk.
Calls smallocf with flags value zero to allocate the memory.
Returns a pointer to the allocated memory or zero if none.
Identical to memalign except that the user must keep track of the size of the allocated chunk and pass that size to sfree when releasing the chunk.
Allocates uninitialized memory with the specified byte alignment; e.g., an alignment value of 32 will return a block aligned on a 32-byte boundary. Calls smemalignf with flags value zero to allocate the memory.
Note that the alignment is not the same as used by the underlying LMM routines. The alignment parameter in LMM calls is the number of low-order bits that should be zero in the returned pointer.
Returns a pointer to the allocated memory or zero if none.
Frees a block of memory with the indicated size. Calls lmm_free to release the memory.
Note that sfree must only be called with memory allocated by one of: smalloc, smallocf, smemalign, or smemalignf and that the size given must match that used on allocation.
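One way for a caller to satisfy the size-matching rule is to store the size next to the pointer. In the sketch below, sketch_smalloc and sketch_sfree are hypothetical stand-ins built on the host malloc and free so that the example runs outside the OSKit; the page_buf structure is likewise invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins so the sketch builds on a hosted system; not the OSKit versions. */
static void *sketch_smalloc(size_t size) { return malloc(size); }
static void sketch_sfree(void *buf, size_t size) { (void)size; free(buf); }

/* The caller records the size next to the pointer, since no prefix is kept. */
struct page_buf {
    void  *data;
    size_t size;
};

static int page_buf_init(struct page_buf *pb, size_t size)
{
    pb->data = sketch_smalloc(size);
    pb->size = pb->data != 0 ? size : 0;
    return pb->data != 0;
}

static void page_buf_destroy(struct page_buf *pb)
{
    if (pb->data != 0)
        sketch_sfree(pb->data, pb->size);   /* same size as at allocation */
    pb->data = 0;
    pb->size = 0;
}
```

Keeping the pointer and size in one structure makes it hard to pass a mismatched size at release time.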
Allocates uninitialized memory from malloc_lmm. The interface is similar to malloc but with an additional flags parameter which is passed to lmm_alloc.
For kernels running in the base environment on an x86, meaningful values for flags are as described in Section 15.11.1.
Returns a pointer to the allocated memory or zero if none.
Allocate uninitialized memory with the specified byte alignment; e.g., an alignment value of 32 will return a block aligned on a 32-byte boundary. The interface is similar to malloc but with an additional flags parameter which is passed to lmm_alloc.
For kernels running in the base environment on an x86, meaningful values for flags are as described in Section 15.11.1.
Note that the alignment is not the same as used by the underlying LMM routines. The alignment parameter in LMM calls is the number of low-order bits that should be zero in the returned pointer.
Returns a pointer to the allocated memory or zero if none.
Allocates uninitialized memory from malloc_lmm. The interface is similar to smalloc but with an additional flags parameter which is passed to lmm_alloc. As with smalloc, the user must keep track of the size of the allocated chunk and pass that size to sfree when releasing the chunk.
For kernels running in the base environment on an x86, meaningful values for flags are as described in Section 15.11.1.
Returns a pointer to the allocated memory or zero if none.
Allocate uninitialized memory with the specified byte alignment; e.g., an alignment value of 32 will return a block aligned on a 32-byte boundary. The interface is similar to smemalign but with an additional flags parameter which is passed to lmm_alloc. As with smemalign, the user must keep track of the size of the allocated chunk and pass that size to sfree when releasing the chunk.
For kernels running in the base environment on an x86, meaningful values for flags are as described in Section 15.11.1.
Note that the alignment is not the same as used by the underlying LMM routines. The alignment parameter in LMM calls is the number of low-order bits that should be zero in the returned pointer.
Returns a pointer to the allocated memory or zero if none.
This routine is called directly or indirectly by any of the memory allocation routines in this section when a call to the underlying LMM allocation routine fails. This allows a kernel to add more memory to malloc_lmm as needed.
The default version of morecore in the minimal C library just returns zero indicating no more memory was available. Client OSes should override this routine as necessary.
Returns non-zero if the indicated amount of memory was added, zero otherwise.
This routine is called from any default memory allocation routine before it attempts to access malloc_lmm.
Coupled with mem_unlock, this provides a way to make memory allocation thread and MP safe. In a multithreaded client OS, these functions use the default lock implementation provided by the lock manager (see Section 6.3) to protect accesses to malloc_lmm. Alternatively, they may be overridden with versions built on a suitable synchronization primitive.
Note that the kernel support library provides defaults for mem_lock and mem_unlock that call base_critical_enter and base_critical_leave respectively. However, you’ll only get these versions if you use the kernel support library and link it in before the minimal C library.
This routine is called from any default memory allocation routine after all accesses to malloc_lmm are complete.
Coupled with mem_lock, this provides a way to make memory allocation thread and MP safe. In a multithreaded client OS, these functions use the default lock implementation provided by the lock manager (see Section 6.3) to protect accesses to malloc_lmm. Alternatively, they may be overridden with versions built on a suitable synchronization primitive.
Note that the kernel support library provides defaults for mem_lock and mem_unlock that call base_critical_enter and base_critical_leave respectively. However, you’ll only get these versions if you use the kernel support library and link it in before the minimal C library.
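As one possible shape for an override, the sketch below builds a lock and unlock pair on a C11 atomic flag. It is an assumption-laden illustration: the sketch_ names are hypothetical (real overrides would be named mem_lock and mem_unlock), the void(void) signatures are assumed, and a plain spinlock like this may be unsuitable if the allocator can be entered from interrupt context.

```c
#include <stdatomic.h>

static atomic_flag sketch_mem_flag = ATOMIC_FLAG_INIT;

/* Spin until the flag is acquired. */
static void sketch_mem_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&sketch_mem_flag,
                                             memory_order_acquire))
        ;   /* busy-wait */
}

/* Release the flag so another caller can proceed. */
static void sketch_mem_unlock(void)
{
    atomic_flag_clear_explicit(&sketch_mem_flag, memory_order_release);
}
```

The acquire and release orderings ensure that pool updates made while holding the lock are visible to the next holder.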
The versions of sprintf, vsprintf, sscanf, and vsscanf provided in the OSKit’s minimal C library are completely self-contained; they do not pull in the code for printf, fprintf, or other “file-oriented” standard I/O functions. Thus, they can be used in any environment, regardless of whether some kind of console or file I/O is available.
The routines printf, puts, putchar, getchar, etc., are all defined in terms of console_putchar, console_getchar, console_puts, and console_putbytes. This means that you can get working formatted “console” output merely by providing an appropriate implementation of the aforementioned console functions. In the base environment, these routines are defined in the kernel library (see Section 15.13).
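A client supplying its own console output might start from something like the following. This is a sketch under stated assumptions: the sketch_ prefix, the argument and return conventions, and the in-memory buffer standing in for a real device are all hypothetical; consult Section 15.13 for the actual interfaces.

```c
#include <stddef.h>

static char   sketch_console[256];   /* stand-in for a real output device */
static size_t sketch_console_len;

/* Emit one character; returns the character written, or -1 if full. */
static int sketch_console_putchar(int c)
{
    if (sketch_console_len >= sizeof sketch_console)
        return -1;
    sketch_console[sketch_console_len++] = (char)c;
    return (unsigned char)c;
}

/* Emit len bytes by repeated calls to the single-character routine. */
static int sketch_console_putbytes(const char *s, int len)
{
    int i;

    for (i = 0; i < len; i++)
        if (sketch_console_putchar(s[i]) < 0)
            return -1;
    return 0;
}
```

On real hardware the body of the single-character routine would write to a serial port or video memory instead of an array; the multi-byte routine can stay unchanged.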
The standard I/O functions that actually take a FILE* argument, such as fprintf and fwrite, and as such are fundamentally dependent on the notion of files, are implemented in terms of the low-level I/O functions in the POSIX library (see Section 20). However, unlike in “real” C libraries, the high-level file I/O functions provided by the minimal C library only implement the minimum of functionality to provide the basic API: in particular, they do no buffering, so for example an fwrite translates directly to a write. This design reduces code size and minimizes interdependencies between functions, while still providing familiar, useful services such as formatted file I/O.
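To illustrate what “no buffering” means in practice, here is a minimal sketch of an fwrite-like function that hands its data straight to the POSIX write call. The sketch_file structure and sketch_fwrite name are hypothetical and do not reflect the library’s actual FILE layout.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stream object: nothing but a file descriptor, no buffer. */
struct sketch_file {
    int fd;
};

/* Pass the data straight to write; returns the number of whole items written. */
static size_t sketch_fwrite(const void *ptr, size_t size, size_t nmemb,
                            struct sketch_file *stream)
{
    ssize_t n;

    if (size == 0 || nmemb == 0)
        return 0;
    n = write(stream->fd, ptr, size * nmemb);
    return n < 0 ? 0 : (size_t)n / size;
}
```

The cost of this simplicity is one low-level call per fwrite, so callers that emit many small items may want to assemble them into a larger buffer first.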
oskit_load_libc allows for internal initializations to be done. This routine must be called when the operating system is initialized, typically from the Client OS library. The services database is used to look up other interfaces required by the C library, and is maintained as internal state to the library.
oskit_init_libc allows for secondary initializations to be performed by the C library, in cases where lazy initialization is not appropriate. It must be called sometime after oskit_load_libc.
exit calls up to 32 functions installed via atexit in reverse order of installation before it calls _exit.
_exit, which terminates the calling process in Unix, calls oskit_libc_exit with the exit status code (see Section 20.5.1).
abort calls _exit(1).
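The reverse-order handler behavior of exit can be modeled with a fixed table of 32 entries. The sketch below is illustrative only: all sketch_ names are hypothetical, the two demo handlers exist just to record the call order, and the final call to _exit is left out so the example can run to completion.

```c
#define SKETCH_ATEXIT_MAX 32

static void (*sketch_handlers[SKETCH_ATEXIT_MAX])(void);
static int sketch_handler_count;

/* Register a handler; returns 0 on success, -1 when the table is full. */
static int sketch_atexit(void (*fn)(void))
{
    if (sketch_handler_count >= SKETCH_ATEXIT_MAX)
        return -1;
    sketch_handlers[sketch_handler_count++] = fn;
    return 0;
}

/* Run handlers last-registered-first; a real exit would then call _exit. */
static void sketch_run_handlers(void)
{
    while (sketch_handler_count > 0)
        sketch_handlers[--sketch_handler_count]();
}

/* Demo handlers that append a letter to a trace so the order is visible. */
static char sketch_trace[4];
static int sketch_trace_len;
static void sketch_first(void)  { sketch_trace[sketch_trace_len++] = 'A'; }
static void sketch_second(void) { sketch_trace[sketch_trace_len++] = 'B'; }
```

Registering sketch_first and then sketch_second and running the handlers records B before A, mirroring the reverse order of installation.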
#include <oskit/c/stdio.h>
void hexdumpb(void *base, void *buf, int nbytes);
void hexdumpw(void *base, void *buf, int nwords);
These functions print out a buffer as a hexdump. For example (the box is included):
.---------------------------------------------------------------------------.
| 00000000 837c240c 00741dc7 05007010 00000000 .|$..t....p..... |
| 00000010 008b4424 0ca30470 10008b04 24a30870 ..D$...p....$..p |
| 00000020 1000eb2c c7050070 10000100 0000833c ...,...p.......< |
| 00000030 2400740a c7050070 10000200 00008b44 $.t....p.......D |
`---------------------------------------------------------------------------'
The first form treats the buffer as an array of bytes whereas the second treats the buffer as an array of words. This distinction is only important on little-endian machines and only affects the appearance of the four middle columns of hex numbers--the last column of output is identical for both.
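The difference between the two views can be seen on a single four-byte group. The sketch below is not the library’s implementation; the function names are hypothetical, and the word view is assembled explicitly in little-endian order so the result does not depend on the machine running the example.

```c
#include <stdio.h>

/* Byte view: print four bytes in memory order. */
static void sketch_group_bytes(const unsigned char *p, char out[9])
{
    snprintf(out, 9, "%02x%02x%02x%02x", p[0], p[1], p[2], p[3]);
}

/* Word view: assemble the same four bytes as a little-endian 32-bit word. */
static void sketch_group_word_le(const unsigned char *p, char out[9])
{
    unsigned long w = (unsigned long)p[0]
                    | (unsigned long)p[1] << 8
                    | (unsigned long)p[2] << 16
                    | (unsigned long)p[3] << 24;

    snprintf(out, 9, "%08lx", w);
}
```

For the bytes 83 7c 24 0c, the byte view yields 837c240c while the little-endian word view yields 0c247c83; the ASCII column would be the same in both cases because it always follows memory order.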
Dump machine-specific state from a sigcontext structure to stdout. On the x86 this includes the processor registers and a stack backtrace originating at the saved EBP value.