KGI Display Hardware Driver Overview

Abstract

As a part of the GGI (General Graphics Interface) Project, a Kernel Graphics Interface (KGI) is being developed to provide the necessary hardware abstraction to allow efficient sharing and virtualization of graphics hardware in multi-user/multi-processing environments.

This article is intended to give a detailed overview of the KGI portability layer, display hardware abstraction and a basic overview of the modular display driver.

General Overview

The main design goals for KGI (Kernel Graphics Interface) display drivers can be summarized as:

Portability. KGI display drivers should easily be reused in different environments - such as in-kernel drivers or drivers that are part of a user-space application - without any modifications to the driver sources. Also, the display drivers and display hardware model should not prescribe a certain programming interface.
Flexibility. The KGI display driver model should be flexible enough to be used for any type of display hardware, as well as easily extensible to new developments.
Performance. The display driver design should allow for efficient use of acceleration features, especially in multi-user multi-process(or) environments. This includes means to share and virtualize graphics hardware.

In order to meet these goals, KGI-0.9 is divided into the following key components:

a portability layer that defines basic types and physical I/O services used by the drivers to access the hardware.
an abstract display hardware model that allows a hardware independent description of operation modes.
a modular display hardware driver design consisting of a 'low-level' part that may be run in kernel space and a 'high-level' part translating a given application programming interface request into hardware-specific low-level requests. These are handled either directly by the hardware or passed to the low-level driver for execution.
a KGI environment that provides the necessary environment and operating system services to share and virtualize the application views of the hardware.

Each of the key components except the environment services mentioned will be explained in more detail in the following sections.

Portability layer

The KGI portability layer defines some basic data types, some host-specific macros and definitions to handle endianness, and physical I/O services in a platform-independent manner.

Integral (integer) types

All signed integral types are defined to use 2's complement representation, the most significant bit being the sign bit.

`kgi_s8_t`	8bit signed
`kgi_u8_t`	8bit unsigned
`kgi_s16_t`	16bit signed
`kgi_u16_t`	16bit unsigned
`kgi_s32_t`	32bit signed
`kgi_u32_t`	32bit unsigned
`kgi_u_t`	system native unsigned integral, but at least 32bit wide
`kgi_s_t`	system native signed integral, but at least 32bit wide
`kgi_ascii_t`	8bit character code with 8bit ISO-latin1 encoding
`kgi_unicode_t`	16bit character code with 16bit UNICODE encoding
`kgi_isochar_t`	32bit character code with 32bit ISO 10646 encoding
`kgi_virt_addr_t`	virtual address type (byte-offset arithmetic)
`kgi_phys_addr_t`	physical address type (byte-offset arithmetic)
`kgi_bus_addr_t`	bus address type (byte offset arithmetic)
`kgi_size_t`	type to encode address range sizes.
`void`	indicates no associated type information
`void *`	same as kgi_virt_addr_t, but no arithmetic defined
`kgi_private_t`	data type to hold any of the above types.
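
As an illustration only, the integral types in the table above could be mapped onto the C99 `<stdint.h>` types as sketched below. The mapping is an assumption made for this example (in particular the choice of the `fast32` types for the "system native" integers); the actual definitions in the KGI headers may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed mapping of the fixed-width KGI types onto C99 <stdint.h>. */
typedef int8_t    kgi_s8_t;
typedef uint8_t   kgi_u8_t;
typedef int16_t   kgi_s16_t;
typedef uint16_t  kgi_u16_t;
typedef int32_t   kgi_s32_t;
typedef uint32_t  kgi_u32_t;

/* "System native, but at least 32bit wide": the fast32 types are one option. */
typedef uint_fast32_t kgi_u_t;
typedef int_fast32_t  kgi_s_t;

/* Character codes are unsigned integers of the stated width. */
typedef kgi_u8_t  kgi_ascii_t;
typedef kgi_u16_t kgi_unicode_t;
typedef kgi_u32_t kgi_isochar_t;

/* Aborts if the widths promised by the table do not hold on this platform. */
void kgi_types_selfcheck(void)
{
    assert(sizeof(kgi_s8_t)  * CHAR_BIT == 8);
    assert(sizeof(kgi_u16_t) * CHAR_BIT == 16);
    assert(sizeof(kgi_u32_t) * CHAR_BIT == 32);
    assert(sizeof(kgi_u_t)   * CHAR_BIT >= 32);
    assert(sizeof(kgi_s_t)   * CHAR_BIT >= 32);
    assert((kgi_s8_t)-1 < 0);   /* signed types really are signed */
}
```

A driver or environment could call `kgi_types_selfcheck()` once at start-up to catch a mismatched platform configuration early.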

The low-level KGI display hardware drivers have to run in different environments, e.g. as in-kernel drivers or as library extensions. The use of instructions that modify floating point registers is therefore not allowed for low-level drivers, and the corresponding floating point types are not defined in KGI. However, high-level drivers that translate a given API (e.g. OpenGL) to hardware-specific commands are defined to run as part of an application and may utilize the full register/instruction set available.

Endianness
KGI assumes that all data stored in driver-accessible virtual memory is encoded either in host-native or explicitly in big or little endian byte order. The KGI system layer defines a set of macros to convert between host-native endian (HE) and big endian (BE) or little endian (LE) encoded data. The macros are named sysencodingtype(arg), where encoding is either LE or BE, and type is one of the following: isochar, unicode, s16, u16, s32 or u32. If the argument is in HE encoding, the result will be in BE or LE encoding and vice versa. Note that these are macros, so the argument passed should be either a constant expression or a plain variable. Expressions that contain function calls or assignment operations must not be used as arguments for these macros.
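
The restriction on macro arguments can be made concrete with a small sketch. The `DEMO_` and `demo_` names below are hypothetical and are not the KGI macros; the sketch only assumes that a byte-swapping macro mentions its argument more than once, which is why an argument with side effects is unsafe.

```c
#include <stdint.h>

/* Hypothetical byte-swap macros; note that x appears several times. */
#define DEMO_SWAP16(x) \
    ((uint16_t)((((x) & 0x00FFu) << 8) | (((x) & 0xFF00u) >> 8)))
#define DEMO_SWAP32(x) \
    ((uint32_t)((((x) & 0x000000FFu) << 24) | (((x) & 0x0000FF00u) << 8) | \
                (((x) & 0x00FF0000u) >> 8)  | (((x) & 0xFF000000u) >> 24)))

/* Runtime probe of the host byte order (an assumption made for brevity;
 * a real portability layer would normally decide this at compile time). */
int demo_host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* HE <-> LE and HE <-> BE conversion; each is its own inverse. */
uint32_t demo_le_u32(uint32_t v)
{
    return demo_host_is_little_endian() ? v : DEMO_SWAP32(v);
}

uint32_t demo_be_u32(uint32_t v)
{
    return demo_host_is_little_endian() ? DEMO_SWAP32(v) : v;
}

/* A function with a side effect, used to show the macro hazard. */
int demo_calls;

uint16_t demo_next(void)
{
    demo_calls++;
    return 0x1234;
}
```

Writing `DEMO_SWAP16(demo_next())` expands to two calls of `demo_next()`, so the side effect happens twice. Storing the value in a variable first and passing that variable avoids the problem.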

Physical I/O

KGI low-level drivers are the primary instance that coordinates graphics hardware access. Some resources of the graphics hardware (texture buffers, frame buffer I/O memory, DMA buffers, FIFO registers etc.) may be exported to applications, but this is not done without approval by the low-level driver. The low-level driver therefore has to register _all_ resources (interrupts, I/O memory regions, etc.) required to operate the card with the Operating System environment.

KGI uses the concept of I/O regions to handle resources required by drivers. Basically, an I/O region is an address space and a set of operations defined on this address space. For a given I/O type io, the associated metalanguage is defined as follows:

io_paddr_t
physical address - needed to establish mapping to virtual addresses
io_iaddr_t
i/o address type - addresses the device will respond to when applied on the address select lines
io_baddr_t
bus address type - the address other devices have to access on their bus to access this device
io_vaddr_t
virtual addresses - only these may be used with the subsequent programmed I/O functions (kind of a handle)

struct _region_s io_region_t
a structure that is used to communicate information about a given region between the driver and the environment. The following fields are defined:

`device`	a handle that uniquely identifies the location in the device tree
`base_virt`	virtual address that maps to the device's base address
`base_io`	io base address of the region the device responds to
`base_bus`	bus address to be used to access this address
`base_phys`	physical address to be used to establish a virtual mapping
`size`	size (in bytes) of the region
`decode`	bitmask of address select lines the decoder evaluates
`name`	a string that identifies the region

int io_check_region(io_region_t *)
This environment function queries whether the given region is 'free', i.e. not served by another driver. Not all environments provide sufficient support for this to be implemented. If this function cannot be implemented properly, it should always indicate a region is 'free'. The device, base_io, size, decode and name fields of the region passed have to be properly initialized.
io_vaddr_t io_claim_region(io_region_t *)
This environment function registers the region - if possible - with a central resource management facility and establishes a virtual mapping of this region. Before claiming a region, the driver has to check whether the region is free. The region passed needs to have the same fields valid as for io_check_region(). After completion, all fields are initialized with valid values.
io_vaddr_t io_free_region(io_region_t *)
This environment function destroys a virtual mapping established by io_claim_region() and unregisters with a central resource management facility. This invalidates the base_ fields of the region passed, except for the base_io field. Note that the driver must not assume a valid virtual/bus mapping after freeing a region.
kgi_usize_t io_insize(const io_vaddr_t vaddr)
Returns the result of a read operation of size size bits at the device address mapped to vaddr (base_virt + offset corresponds to base_io + offset for a given region). vaddr has to be naturally aligned on a size bit boundary.
void io_outsize(const kgi_usize_t val, const io_vaddr_t vaddr)
Performs a write operation of size bits width at the device address mapped to vaddr. The same alignment restrictions as for io_insize() apply.
void io_inssize(const io_vaddr_t vaddr, void *buf, kgi_size_t count)
void io_outssize(const io_vaddr_t vaddr, const void *buf, kgi_size_t count)
Performs count read/write operations of size bit size at the device address mapped to vaddr reading/writing the data from the device to buf/from buf to the device. vaddr has to be properly aligned and buf must be valid.
void io_putsize(const io_vaddr_t vaddr, void *buf, kgi_size_t count)
void io_getsize(void *buf, const io_vaddr_t src, unsigned long count)
Performs count write/read operations of size bits width at the device address mapped to vaddr. The difference to io_inssize()/io_outssize() is that vaddr is incremented according to size after each operation.
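
The difference between the string-style and the incrementing transfer functions can be sketched with an 8-bit write variant operating on ordinary memory. The `demo_` names and the use of a plain `volatile` pointer as a stand-in for `io_vaddr_t` are assumptions for illustration, not the KGI implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for io_vaddr_t: a pointer into mapped device space. */
typedef volatile uint8_t *demo_vaddr_t;

/* String-style write: every access hits the same device address,
 * as one would use for a FIFO or data port. */
void demo_outs8(demo_vaddr_t vaddr, const void *buf, size_t count)
{
    const uint8_t *src = buf;
    while (count--)
        *vaddr = *src++;
}

/* Incrementing write: the device address advances by the access size
 * after each write, as one would use for a block of registers or memory. */
void demo_put8(demo_vaddr_t vaddr, const void *buf, size_t count)
{
    const uint8_t *src = buf;
    while (count--)
        *vaddr++ = *src++;
}
```

With a four-byte mock device, `demo_outs8()` of the bytes 1, 2, 3 leaves only the first location changed (it ends up holding 3), while `demo_put8()` fills the first three locations with 1, 2 and 3.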

Note that for a particular bus/io space binding any of the I/O operations that are not supported may be missing. Currently the following bindings are defined:

`pcicfg`	PCI32 Configuration Registers
`io`	ISA I/O-Ports
`mem`	Memory Mapped I/O

In summary, the KGI portability layer defines platform-independent data types and a means of establishing a communication channel between the hardware and the driver.

Detailed data type definitions can be found in file:kgi-0.9/kgi/include/kgi/io.h
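
To make the check/claim/free cycle for I/O regions more tangible, the following sketch mocks a tiny environment. Everything prefixed `demo_` or `DEMO_` is hypothetical: the field types, the fixed-size claim table, the fake backing memory and the identity bus/physical mapping are assumptions of this example and do not reflect the definitions in the KGI headers.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t demo_addr_t;

/* Region descriptor with the fields listed for io_region_t. */
typedef struct demo_region_s {
    const void *device;     /* location in the device tree */
    void       *base_virt;  /* valid only while the region is claimed */
    demo_addr_t base_io;
    demo_addr_t base_bus;
    demo_addr_t base_phys;
    size_t      size;       /* in bytes */
    demo_addr_t decode;
    const char *name;
} demo_region_t;

#define DEMO_MAX_CLAIMS 4
static const demo_region_t *demo_claims[DEMO_MAX_CLAIMS];
static unsigned char demo_backing[DEMO_MAX_CLAIMS][64]; /* fake mapped memory */

/* Returns 1 if no claimed region overlaps [base_io, base_io + size). */
int demo_check_region(const demo_region_t *r)
{
    for (int i = 0; i < DEMO_MAX_CLAIMS; i++) {
        const demo_region_t *c = demo_claims[i];
        if (c && r->base_io < c->base_io + c->size &&
                 c->base_io < r->base_io + r->size)
            return 0;
    }
    return 1;
}

/* Registers the region and "maps" it; returns the virtual base or NULL. */
void *demo_claim_region(demo_region_t *r)
{
    if (r->size > sizeof demo_backing[0] || !demo_check_region(r))
        return NULL;
    for (int i = 0; i < DEMO_MAX_CLAIMS; i++) {
        if (demo_claims[i] == NULL) {
            demo_claims[i] = r;
            r->base_virt = demo_backing[i];
            r->base_bus = r->base_phys = r->base_io; /* identity mapping assumed */
            return r->base_virt;
        }
    }
    return NULL;
}

/* Unregisters the region and invalidates all base_ fields except base_io. */
void demo_free_region(demo_region_t *r)
{
    for (int i = 0; i < DEMO_MAX_CLAIMS; i++)
        if (demo_claims[i] == r)
            demo_claims[i] = NULL;
    r->base_virt = NULL;
    r->base_bus = r->base_phys = 0;
}
```

A driver would fill in device, base_io, size, decode and name, call the check function, claim the region, use only the returned virtual address for programmed I/O, and free the region on shutdown.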

Display Hardware Model

KGI employs an operation mode description that is independent of the underlying hardware and the desired application programming interface. This is used to specify the operation mode of a given hardware without assumptions specific to a given API. The concept behind this description is to describe the data flow from a device-internal frame buffer representation to the final visible image.

Attributes
The KGI display hardware model assumes graphics hardware to be used to control a visible rectangular picture in certain attributes. The smallest units for which attributes can be controlled independently of each other are picture elements (pixels). However, a change of a pixel's attribute (e.g. the character displayed in this pixel) may result in a change of smaller units of the visible image called dots. Currently the following attributes are defined:

`private`	driver private data
`application`	store what you want here, the hardware doesn't care
`stencil`	stencil mask/window ID values
`z`	z-buffer value
`colorindex`	color (the final color is determined by a table lookup)
`color1`	direct control of color channel 1
`color2`	direct control of color channel 2
`color3`	direct control of color channel 3
`alpha`	alpha value
`foreground`	foreground color index for text modes
`texture index`	pixel texture (character shape) index for text modes
`blink`	blink bit/frequency

The particular meaning of color1, color2, and color3 depends on the viewing device and is specified by the color-space (YUV, RGB, ...) associated with it.

Some display hardware allows the attributes of two pictures (with identical resolution) to be controlled independently, so that stereo viewing is possible. To allow for smooth animation, several versions (frames) of a picture may be stored in the device so that the display can switch quickly between them.

KGI therefore further divides per-pixel attributes into attributes stored per frame and attributes stored common to all frames. If a display hardware is stereo-capable, all per-frame (e.g. color, alpha values) attributes can be controlled independently for the left and right image. Common-to-all-frames attributes (e.g. z-values, stencil values) are global to all frames, for both the left and right image (if applicable).

In order to represent precision requirements in the per-attribute control, a bitmask and a zero-terminated array of kgi_u8_t values specifying the number of bits required per attribute are used. This allows for a compact, sufficient and extensible representation of all frame buffer formats. For example, a typical 3D application would specify KGI_AM_ALPHA|KGI_AM_COLOR_INDEX and { 8, 8, 0 } for the per-frame attributes and KGI_AM_STENCIL | KGI_AM_Z and { 8, 24, 0 } for the common-to-all-frames attributes.
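
A minimal sketch of how such a mask/array pair might be evaluated follows. The `DEMO_AM_` bit values are hypothetical (they merely follow the order of the attribute list above and are not the real KGI_AM_ constants), and the sketch assumes one array entry per set mask bit, followed by the zero terminator.

```c
#include <stdint.h>

/* Hypothetical attribute mask bits, in the order of the attribute list. */
enum {
    DEMO_AM_PRIVATE     = 1 << 0,
    DEMO_AM_APPLICATION = 1 << 1,
    DEMO_AM_STENCIL     = 1 << 2,
    DEMO_AM_Z           = 1 << 3,
    DEMO_AM_COLOR_INDEX = 1 << 4,
    DEMO_AM_COLOR1      = 1 << 5,
    DEMO_AM_COLOR2      = 1 << 6,
    DEMO_AM_COLOR3      = 1 << 7,
    DEMO_AM_ALPHA       = 1 << 8
};

/* Returns the total number of bits per pixel described by mask and bits,
 * or -1 if the zero-terminated array does not contain exactly one
 * non-zero entry per attribute selected in the mask. */
int demo_attr_bits(uint32_t mask, const uint8_t *bits)
{
    int total = 0;

    /* mask &= mask - 1 clears the lowest set bit on each iteration */
    for (; mask != 0; mask &= mask - 1) {
        if (*bits == 0)
            return -1;          /* array ended before the mask did */
        total += *bits++;
    }
    return (*bits == 0) ? total : -1;   /* terminator must follow */
}
```

For the 3D example above this yields 16 bits of per-frame data and 32 bits of common data per pixel.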

Image Modes, Dot Ports, Dot Streams and Dot Stream Converters

The final, visible picture may be the result of (digital or analog) signal processing, e.g. blending, overlaying or chroma-keying of several independent images.
Given a display hardware internal 2D buffer of a particular size (the virtual image), only a rectangular subregion of that virtual image (the visible image) may be used for the overlay.
KGI-0.9 uses an abstract representation of the signal sources and signal processing devices to describe the hardware operation mode.
- Image Modes
  describe which attributes are stored per frame and common to all frames, at what precision attributes are stored and what size the virtual and visible image are (in pixels), as well as some global properties, e.g. if the virtual/visible image can be resized, if scaling/interpolation or table-lookup operations can be applied to per-pixel attribute values before being converted into dots and sent to a dot-port.
- Dot Ports
  describe what final screen size (in dots), color space, data format etc. the dot-data transferred from an image read-out unit to another signal processing device has. The signal processing device is assumed to process the data at a certain maximum rate, the dot clock. E.g. a video DAC may change its RGB outputs once per dot clock cycle. However, data may have to be transferred at a higher or lower rate, which is determined by the load clock ratio, defining the dot-data transferred per transfer cycle.
- Dot Stream Converters
  represent signal processing devices that read image data on one or more dot ports, optionally perform some operations (color space conversion, interpolation, dot-rate conversion, overlaying, etc.) and send the result to another dot-port.
This abstraction allows very complex hardware setups to be described in a kind of signal-flow-tree, with a dot-port as root representing the viewing device, dot-stream-converters as nodes, dot-ports as links and image modes as leaves.
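
The signal-flow tree can be pictured with a small data structure. This is only a sketch under simplifying assumptions: the `demo_` types are hypothetical, each converter has at most two inputs, and dot ports are represented implicitly by the pointers that link the nodes.

```c
#include <stddef.h>

typedef enum { DEMO_NODE_IMAGE, DEMO_NODE_CONVERTER } demo_kind_t;

/* A node is either an image mode (leaf) or a dot stream converter.
 * The input pointers play the role of the dot ports linking the nodes. */
typedef struct demo_node_s {
    demo_kind_t kind;
    const char *name;
    const struct demo_node_s *inputs[2]; /* unused (NULL) for image modes */
} demo_node_t;

/* Counts the image modes that contribute to the output of node n. */
int demo_count_images(const demo_node_t *n)
{
    if (n == NULL)
        return 0;
    if (n->kind == DEMO_NODE_IMAGE)
        return 1;
    return demo_count_images(n->inputs[0]) + demo_count_images(n->inputs[1]);
}
```

A setup in which a video image is overlaid onto a desktop image and the result is fed to a DAC would be two image nodes, an overlay converter with both as inputs, and a DAC converter with the overlay as its single input; `demo_count_images()` on the DAC node then returns 2.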

Resources

The abstraction described in the last section allows the (static) operation mode and frame/common buffer requirements to be described. However, it does not specify means to _alter_ (dynamic) properties of the operation mode (e.g. the look-up-tables) or the frame/common buffer contents.

This is done through resources, some of which are global and must be shared between processes (e.g. the frame/common buffers, look-up tables) and some of which can be virtualized (e.g. texture buffers, 2D or 3D graphics processor, etc.)

Basically resources are data structures used to communicate relevant data to an external mapper (a special device file driver), that utilizes the necessary protection/virtualization mechanisms of the environment. Depending on the environment some resources (e.g. accelerators - see below) may not be available to the high-level driver(s). Currently the following resources are defined:

Commands	This resource is used to perform specific requests, e.g. setting a look-up-table entry etc.
MMIO regions (memory mapped I/O regions)	This resource type is used to allow processes to get a virtual mapping of device-local memory, such as frame or local buffers, graphics processor control registers, etc.
Accelerators (DMA buffers)/Streams	This resource type is used to establish access to a circular list of process-local DMA buffers (only one at a time being writeable to the application). The buffers are allocated by the external mapper and are physically contiguous.
Shared (virtual) Memory (AGP texture memory)	This resource type is used to establish access to a memory object shared between the low-level driver, hardware and the application. This is not yet specified in detail.

Exact definitions of the various types can be found in file:kgi/include/kgi/kgi.h
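
The accelerator resource's circular buffer list, with exactly one buffer writeable at a time, could be tracked roughly as follows. This is a hypothetical sketch: the `demo_` names, the fixed buffer count and the submit/complete protocol are assumptions, and no real DMA memory is involved.

```c
#define DEMO_RING_BUFFERS 4

/* Bookkeeping for a circular list of DMA buffers. */
typedef struct {
    unsigned writable; /* index of the one buffer the application may fill */
    unsigned pending;  /* buffers handed to the hardware, not yet completed */
} demo_ring_t;

/* The application has filled the writable buffer: hand it to the hardware
 * and make the next buffer writable. Returns 0 on success, or -1 if all
 * other buffers are still pending and the application has to wait. */
int demo_ring_submit(demo_ring_t *r)
{
    if (r->pending == DEMO_RING_BUFFERS - 1)
        return -1;
    r->pending++;
    r->writable = (r->writable + 1) % DEMO_RING_BUFFERS;
    return 0;
}

/* The hardware has finished executing the oldest pending buffer. */
void demo_ring_complete(demo_ring_t *r)
{
    if (r->pending > 0)
        r->pending--;
}
```

In this model the external mapper would remap the application's view whenever `writable` changes, so that the process never writes to a buffer the hardware is still reading.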

Modular Display Driver Implementation

The most common graphics card architecture on the PC-market utilizes the following principal design:

KGI therefore defines a modular driver architecture that allows separate drivers to be written and distributed for each subsystem (except memory). A fully operational driver is then obtained by linking the sub-system drivers together.

Each driver provides some (specified) driver-global information, such as maximum resolution, vendor and model, AC limits etc.

A meta-language is defined for each subsystem that allows driver initialization, deinitialization, resource export and operation mode negotiation/checking. This way drivers can be passed a partially filled-in operation mode description and auto-negotiate the proper operation mode.
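
The auto-negotiation idea can be sketched as a chain of per-subsystem check functions that each fill in the fields left unspecified and reject what they cannot support. The mode structure, the two subsystem names, the defaults and the limits below are all hypothetical and chosen only for illustration; they are not taken from the KGI module interface.

```c
/* Partially filled-in mode description; 0 means "unspecified". */
typedef struct {
    unsigned width;
    unsigned height;
    unsigned dot_clock_khz;
} demo_mode_t;

typedef int (*demo_check_fn)(demo_mode_t *);

/* Hypothetical chipset subsystem: supplies a default size, limits the maximum. */
int demo_chipset_check(demo_mode_t *m)
{
    if (m->width == 0)
        m->width = 640;
    if (m->height == 0)
        m->height = 480;
    return (m->width <= 1600 && m->height <= 1200) ? 0 : -1;
}

/* Hypothetical clock subsystem: supplies a default clock, limits the maximum. */
int demo_clock_check(demo_mode_t *m)
{
    if (m->dot_clock_khz == 0)
        m->dot_clock_khz = 25175;
    return (m->dot_clock_khz <= 135000) ? 0 : -1;
}

/* Passes the mode through all subsystem drivers in turn.
 * Returns 0 if every subsystem accepts the (completed) mode, -1 otherwise. */
int demo_negotiate(demo_mode_t *m, const demo_check_fn *subsystems, int count)
{
    for (int i = 0; i < count; i++)
        if (subsystems[i](m) != 0)
            return -1;
    return 0;
}
```

An application requesting only an 800x600 image would leave the dot clock at 0 and receive a completed description back; a request beyond a subsystem's limits fails as a whole.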

This modular display driver internal interface is defined in kgi-0.9/drv/display/kgi/module.h, but it is still being adapted to allow an easier mapping to the UDI driver model and is not yet finalized.

Summary

This article was intended to give a more detailed view of the KGI display hardware abstraction model. It mainly covered (static) operation mode specification, as well as application/driver/hardware interaction.