KGI Display Hardware Driver Overview
Abstract
As a part of the GGI (General Graphics Interface) Project, a Kernel Graphics
Interface (KGI) is being developed to provide the necessary hardware
abstraction to allow efficient sharing and virtualization of graphics
hardware in multi-user/multi-processing environments.
This article is intended to give a detailed overview of the KGI
portability layer, display hardware abstraction and a basic overview
of the modular display driver.
General Overview
The main design goals for KGI (Kernel Graphics Interface) display drivers
can be summarized as:
- Portability. KGI display drivers should easily be reused in
different environments - such as in-kernel drivers or drivers that
are part of a user-space application - without any modifications
to the driver sources. Also, the display drivers and display
hardware model should not prescribe a certain programming interface.
- Flexibility. The KGI display driver model should be flexible
enough to be used for any type of display hardware, as well as
easily extensible to new developments.
- Performance. The display driver design should allow for
efficient use of acceleration features, especially in multi-user
multi-process(or) environments. This includes means to share
and virtualize graphics hardware.
In order to meet these goals, KGI-0.9 is divided into the following key
components:
- a portability layer that defines basic types and physical
I/O services used by the drivers to access the hardware.
- an abstract display hardware model that allows a hardware
independent description of operation modes.
- a modular display hardware driver design consisting of a
'low-level' part that may be run in kernel space and a 'high-level'
part translating a given application programming interface
request into hardware-specific low-level requests. These are
handled either directly by the hardware or passed to the
low-level driver for execution.
- a KGI environment that provides the necessary environment and
operating system services to share and virtualize the application
views of the hardware.
Each of the key components except the environment services mentioned
will be explained in more detail in the following sections.
Portability layer
The KGI portability layer defines some basic data types, some host-specific
macros, and definitions to handle endianness and physical I/O services
in a platform-independent manner.
Integral (integer) types
All signed integral types are defined to use 2's complement
representation, the most significant bit being the sign bit.
kgi_s8_t | 8bit signed |
kgi_u8_t | 8bit unsigned |
kgi_s16_t | 16bit signed |
kgi_u16_t | 16bit unsigned |
kgi_s32_t | 32bit signed |
kgi_u32_t | 32bit unsigned |
kgi_u_t | system native unsigned integral, but at least 32bit wide |
kgi_s_t | system native signed integral, but at least 32bit wide |
kgi_ascii_t | 8bit character code with 8bit ISO-latin1 encoding |
kgi_unicode_t | 16bit character code with 16bit UNICODE encoding |
kgi_isochar_t | 32bit character code with 32bit ISO 10646 encoding |
kgi_virt_addr_t | virtual address type (byte-offset arithmetic) |
kgi_phys_addr_t | physical address type (byte-offset arithmetic) |
kgi_bus_addr_t | bus address type (byte-offset arithmetic) |
kgi_size_t | type to encode address range sizes. |
void | indicates no associated type information |
void * | same as kgi_virt_addr_t, but no arithmetic defined |
kgi_private_t | data type to hold any of the above types. |
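As a rough illustration, the types in the table could be mapped onto C99 `<stdint.h>` on a hosted platform. This is only a sketch: the real definitions live in the KGI headers and are chosen per platform. The mapping below, the use of `long` for the native types, the representations of the address types, and the union layout for `kgi_private_t` are all assumptions.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width integral types. The <stdint.h> exact-width signed types
 * are required to use 2's complement, matching the rule stated above. */
typedef int8_t   kgi_s8_t;
typedef uint8_t  kgi_u8_t;
typedef int16_t  kgi_s16_t;
typedef uint16_t kgi_u16_t;
typedef int32_t  kgi_s32_t;
typedef uint32_t kgi_u32_t;

/* Native integral types: assumed to map to long, which the C standard
 * guarantees to be at least 32 bits wide. */
typedef unsigned long kgi_u_t;
typedef long          kgi_s_t;

/* Character codes. */
typedef uint8_t  kgi_ascii_t;
typedef uint16_t kgi_unicode_t;
typedef uint32_t kgi_isochar_t;

/* Address types: assumed representations that allow byte-offset arithmetic. */
typedef unsigned char *kgi_virt_addr_t;
typedef uintptr_t      kgi_phys_addr_t;
typedef uintptr_t      kgi_bus_addr_t;
typedef size_t         kgi_size_t;

/* Assumed layout: a union large enough to hold any of the types above. */
typedef union {
    kgi_u_t         u;
    kgi_s_t         s;
    kgi_isochar_t   isochar;
    kgi_virt_addr_t virt;
    kgi_phys_addr_t phys;
    kgi_bus_addr_t  bus;
    kgi_size_t      size;
    void           *ptr;
} kgi_private_t;

/* Compile-time check of the "at least 32bit wide" requirement. */
_Static_assert(sizeof(kgi_u_t) * CHAR_BIT >= 32,
               "kgi_u_t must be at least 32 bits wide");
```

Note that no floating point type appears in this sketch, in line with the restriction on low-level drivers.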
The low-level KGI display hardware drivers have to run in different
environments, e.g. as in-kernel drivers or as library extensions.
The use of instructions that modify floating point registers is
therefore not allowed for low-level drivers, and the corresponding
floating point data types are not defined in KGI. However, high-level
drivers that translate a given API (e.g. OpenGL) to hardware-specific
commands are defined to run as part of an application and may utilize
the full register/instruction set available.
Endianness
KGI assumes all data stored in driver-accessible virtual memory to be
encoded either in host-native byte order or explicitly in big or little
endian byte order. The KGI system layer defines a set of macros
to convert between host-native endian (HE) and big endian (BE) or little
endian (LE) encoded data. The macros are named
sysencodingtype(arg), where the placeholder
encoding is either LE or BE,
and the placeholder type is one of the following:
isochar, unicode, s16, u16,
s32 or u32.
If the argument is in HE encoding, the result will be in BE or LE
encoding and vice versa. Note that these are macros and therefore
the argument passed should either be a constant expression or a
direct variable. Expressions that contain function calls or
assignment operations must not be used as arguments for these macros.
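The restriction on macro arguments follows from how such conversions are typically written: a byte-swap macro mentions its argument several times, so an argument with side effects would be evaluated more than once. The sketch below illustrates this with hypothetical `DEMO_*` macros and `demo_*` helpers. These are not the KGI names, and the run-time endianness probe is an assumption made to keep the example portable; a real implementation would normally select the variant at compile time.

```c
#include <stdint.h>

/* Hypothetical byte-swap macros. Each one mentions (x) more than once,
 * so an argument such as i++ or f() would be evaluated repeatedly. */
#define DEMO_SWAP16(x) \
    ((uint16_t)((((uint16_t)(x) & 0x00FFu) << 8) | \
                (((uint16_t)(x) & 0xFF00u) >> 8)))

#define DEMO_SWAP32(x) \
    ((uint32_t)((((uint32_t)(x) & 0x000000FFu) << 24) | \
                (((uint32_t)(x) & 0x0000FF00u) <<  8) | \
                (((uint32_t)(x) & 0x00FF0000u) >>  8) | \
                (((uint32_t)(x) & 0xFF000000u) >> 24)))

/* Returns 1 on a little endian host and 0 on a big endian host. */
static int demo_host_is_le(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Converts between host-native (HE) and little endian (LE) encoding.
 * A swap is its own inverse, so the same call works in both directions. */
uint16_t demo_le_u16(uint16_t v)
{
    return demo_host_is_le() ? v : DEMO_SWAP16(v);
}

/* Converts between host-native (HE) and big endian (BE) encoding. */
uint16_t demo_be_u16(uint16_t v)
{
    return demo_host_is_le() ? DEMO_SWAP16(v) : v;
}
```

Applying either conversion twice returns the original value, which is why one macro per encoding and type is enough for both directions.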
Physical I/O
KGI low-level drivers are the primary instance that coordinates
graphics hardware access. Some resources of the graphics hardware
(texture buffers, frame buffer I/O memory, DMA buffers, FIFO registers
etc.) may be exported to applications, but this is not done without
approval by the low-level driver. The low-level driver therefore
has to register _all_ resources (interrupts, I/O memory regions, etc.)
required to operate the card with the Operating System environment.
KGI uses the concept of I/O regions to handle resources required by
drivers. Basically, an I/O region is an address space and a set
of operations defined on this address space.
For a given I/O type io, the associated metalanguage
is defined as follows:
- io_paddr_t
physical address - needed to establish mapping to
virtual addresses
- io_iaddr_t
i/o address type - addresses the device will respond to
when applied on the address select lines
- io_baddr_t
bus address type - the address other devices have
to access on their bus to access this device
- io_vaddr_t
virtual addresses - only these may be used with the
subsequent programmed I/O functions (kind of a handle)
- struct _region_s io_region_t
a structure that is used to communicate information
about a given region between the driver and the
environment. The following fields are defined:
device | a handle that uniquely identifies the location in the device tree |
base_virt | virtual address that maps to the device's base address |
base_io | io base address of the region the device responds to |
base_bus | bus address to be used to access this address |
base_phys | physical address to be used to establish a virtual mapping |
size | size (in bytes) of the region |
decode | bitmask of address select lines the decoder evaluates |
name | a string that identifies the region |
- int io_check_region(io_region_t *)
This environment function queries if the given region is
'free', i.e. not served by another driver. Not all environments
provide sufficient support for this to be implemented. If
this function cannot be implemented properly, it should
always indicate a region is 'free'. The device, base_io, size,
decode and name fields of the region passed have to be properly
initialized.
- io_vaddr_t io_claim_region(io_region_t *)
This environment function registers the region - if possible -
with a central resource management facility and establishes
a virtual mapping of this region. Before claiming a region,
the driver has to check whether the region is free.
The region passed needs to have the same fields valid
as for io_check_region(). After completion, all fields
are initialized with valid values.
- io_vaddr_t io_free_region(io_region_t *)
This environment function destroys a virtual mapping
established by io_claim_region() and unregisters with
a central resource management facility. This invalidates
the base_ fields of the region passed, except for
the base_io field. Note that the driver must not assume
a valid virtual/bus mapping after freeing a region.
- kgi_usize_t io_insize(const io_vaddr_t vaddr)
Returns the result of a read operation, size bits wide,
at the device address mapped to vaddr (base_virt + offset
corresponds to base_io + offset for a given region). Here size
is a placeholder for the access width in bits, both in the
function name and in the return type. vaddr has to be naturally
aligned on a size-bit boundary.
- void io_outsize(const kgi_usize_t val, const io_vaddr_t vaddr)
Performs a write operation, size bits wide, at the device
address mapped to vaddr. The same alignment restrictions as
for io_insize() apply.
- void io_inssize(const io_vaddr_t vaddr, void *buf, kgi_size_t count)
- void io_outssize(const io_vaddr_t vaddr, const void *buf, kgi_size_t count)
Performs count read/write operations, each size bits wide,
at the device address mapped to vaddr, reading the data from
the device into buf or writing it from buf to the device.
vaddr has to be properly aligned and buf must be valid.
- void io_putsize(const io_vaddr_t vaddr, void *buf, kgi_size_t count)
- void io_getsize(void *buf, const io_vaddr_t src, unsigned long count)
Performs count write/read operations, each size bits wide,
starting at the device address mapped to vaddr (src for the
read variant). The difference to io_inssize()/io_outssize()
is that the device address is incremented according
to size after each transfer.
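The difference between the two multi-transfer families described at the end of this list can be sketched for 8 bit accesses on a memory mapped binding. The `demo_*` names and the use of a plain array as a stand-in for device memory are assumptions made for illustration; they are not part of KGI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the virtual address type of a "mem" binding. */
typedef volatile uint8_t *demo_vaddr_t;

/* "ins" style: count reads, all from the SAME device address,
 * as one would do for a FIFO data register. */
void demo_mem_ins8(demo_vaddr_t vaddr, void *buf, size_t count)
{
    uint8_t *dst = buf;
    while (count--)
        *dst++ = *vaddr;
}

/* "get" style: count reads from consecutive device addresses; the
 * device address advances by the access size after each read. */
void demo_mem_get8(void *buf, demo_vaddr_t src, size_t count)
{
    uint8_t *dst = buf;
    while (count--)
        *dst++ = *src++;
}

/* Reads four bytes both ways from a fake device. Returns 1 if the
 * "ins" variant saw the first byte four times and the "get" variant
 * saw all four bytes in order. */
int demo_compare(void)
{
    uint8_t fake_device[4] = { 0x10, 0x20, 0x30, 0x40 };
    uint8_t a[4], b[4];

    demo_mem_ins8(fake_device, a, 4);   /* a = 10 10 10 10 */
    demo_mem_get8(b, fake_device, 4);   /* b = 10 20 30 40 */

    return a[0] == 0x10 && a[3] == 0x10 &&
           b[0] == 0x10 && b[1] == 0x20 && b[2] == 0x30 && b[3] == 0x40;
}
```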
Note that for a particular bus/io space binding any of the I/O
operations that are not supported may be missing. Currently the
following bindings are defined:
pcicfg | PCI32 Configuration Registers |
io | ISA I/O-Ports |
mem | Memory Mapped I/O |
In summary, the KGI portability layer defines platform-independent data
types and a means of establishing a communication channel between the
hardware and the driver.
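The intended sequence of check, claim, programmed I/O and free can be sketched with a mock environment. Everything below is hypothetical: the `demo_*` names, the reduced region structure, the convention that a check result of 0 means 'free', and the array that simulates device memory are assumptions chosen to make the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

typedef volatile uint8_t *demo_vaddr_t;

/* Reduced, hypothetical version of the region structure. */
typedef struct {
    unsigned long base_io;    /* address the device responds to */
    demo_vaddr_t  base_virt;  /* valid only while the region is claimed */
    size_t        size;       /* size in bytes */
    const char   *name;
} demo_region_t;

#define DEMO_MAX_CLAIMS 4
static const demo_region_t *demo_claims[DEMO_MAX_CLAIMS];
static uint8_t demo_device_memory[256];  /* simulated device address space */

/* Returns 0 if the region is free, nonzero if it overlaps a claimed one. */
int demo_check_region(const demo_region_t *r)
{
    for (size_t i = 0; i < DEMO_MAX_CLAIMS; i++) {
        const demo_region_t *c = demo_claims[i];
        if (c && r->base_io < c->base_io + c->size &&
                 c->base_io < r->base_io + r->size)
            return -1;
    }
    return 0;
}

/* Registers the region and establishes the (simulated) virtual mapping. */
demo_vaddr_t demo_claim_region(demo_region_t *r)
{
    if (r->base_io + r->size > sizeof demo_device_memory)
        return NULL;
    for (size_t i = 0; i < DEMO_MAX_CLAIMS; i++) {
        if (demo_claims[i] == NULL) {
            demo_claims[i] = r;
            r->base_virt = demo_device_memory + r->base_io;
            return r->base_virt;
        }
    }
    return NULL;
}

/* Unregisters the region and invalidates the mapping; base_io is kept. */
demo_vaddr_t demo_free_region(demo_region_t *r)
{
    for (size_t i = 0; i < DEMO_MAX_CLAIMS; i++)
        if (demo_claims[i] == r)
            demo_claims[i] = NULL;
    r->base_virt = NULL;
    return NULL;
}

uint8_t demo_in8(demo_vaddr_t vaddr)               { return *vaddr; }
void    demo_out8(uint8_t val, demo_vaddr_t vaddr) { *vaddr = val; }

/* Driver-side sequence: check, claim, access, free. Returns 0 on success. */
int demo_lifecycle(void)
{
    demo_region_t regs = { .base_io = 0x40, .size = 16, .name = "demo regs" };

    if (demo_check_region(&regs) != 0)
        return -1;                          /* served by another driver */
    demo_vaddr_t base = demo_claim_region(&regs);
    if (base == NULL)
        return -1;

    demo_out8(0xA5, base + 2);              /* only virtual addresses are used */
    int readback = demo_in8(base + 2);
    int busy = demo_check_region(&regs) != 0;

    demo_free_region(&regs);
    return (readback == 0xA5 && busy && regs.base_virt == NULL) ? 0 : -1;
}
```

The driver never touches `base_io` directly for data access; after the claim it works purely through the virtual address handed back by the environment.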
Detailed data type definitions can be found in
file:kgi-0.9/kgi/include/kgi/io.h
Display Hardware Model
KGI employs an operation mode description independent of the underlying
hardware and desired application programming interface. This is used to
specify the operation mode of a given hardware without assumptions
specific to a given API. The concept behind this description is to describe
the data flow from a device-internal frame buffer representation to
the final visible image.
- Attributes
The KGI display hardware model assumes graphics hardware to be used
to control a visible rectangular picture in certain attributes. The
smallest units for which attributes can be controlled independently
of each other are picture elements (pixels). However, a change of a
pixel's attribute (e.g. the character displayed in this pixel) may
result in a change of smaller units of the visible image called dots.
Currently the following attributes are defined:
private | driver private data |
application | store what you want here, the hardware doesn't care |
stencil | stencil mask/window ID values |
z | z-buffer value |
colorindex | color (the final color is determined by a table lookup) |
color1 | direct control of color channel 1 |
color2 | direct control of color channel 2 |
color3 | direct control of color channel 3 |
alpha | alpha value |
foreground | foreground color index for text modes |
texture index | pixel texture (character shape) index for text modes |
blink | blink bit/frequency |
The particular meaning of color1, color2,
and color3 depends on the viewing device and is specified
by the color-space (YUV, RGB, ...) associated with it.
Some display hardware allows the attributes of two pictures
(with identical resolution) to be controlled independently, so
that stereo viewing is possible. To allow for smooth animation,
several versions (frames) of a picture may be stored in the
device so that the display can switch quickly between them.
KGI therefore further divides per-pixel attributes into attributes
stored per frame and attributes stored common to all frames.
If the display hardware is stereo-capable, all per-frame (e.g.
color, alpha values) attributes can be controlled independently
for the left and right image.
Common-to-all-frames attributes (e.g. z-values, stencil values)
are global to all frames, for both the left and right image
(if applicable).
In order to represent precision requirements in the per-attribute
control, a bitmask and a zero-terminated array of kgi_u8_t
values specifying the number of bits required per attribute are used.
This allows for a compact, sufficient and extensible representation
of all frame buffer formats.
For example, a typical 3D application would specify
KGI_AM_ALPHA|KGI_AM_COLOR_INDEX
and { 8, 8, 0 } for the per-frame attributes
and KGI_AM_STENCIL | KGI_AM_Z
and { 8, 24, 0 } for the common-to-all-frames
attributes.
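A minimal sketch of how such a mask and bit-count array could be interpreted is shown below. The `DEMO_AM_*` bit values are hypothetical (the real KGI_AM_* constants are defined in the KGI headers), and the rule that the array holds exactly one entry per set mask bit is an assumption of this example.

```c
#include <stdint.h>

/* Hypothetical attribute mask bits, not the real KGI_AM_* values. */
enum {
    DEMO_AM_STENCIL     = 1 << 0,
    DEMO_AM_Z           = 1 << 1,
    DEMO_AM_COLOR_INDEX = 1 << 2,
    DEMO_AM_ALPHA       = 1 << 3
};

/* Counts the attributes selected in the mask. */
unsigned demo_attr_count(uint32_t mask)
{
    unsigned n = 0;
    while (mask) {
        n += mask & 1u;
        mask >>= 1;
    }
    return n;
}

/* Sums the bits per pixel described by a zero-terminated array.
 * Returns -1 if the number of entries does not match the number
 * of attributes set in the mask. */
int demo_bits_per_pixel(uint32_t mask, const uint8_t *bits)
{
    unsigned expected = demo_attr_count(mask);
    unsigned seen = 0;
    int total = 0;

    while (bits[seen] != 0) {
        total += bits[seen];
        seen++;
    }
    return seen == expected ? total : -1;
}
```

With the values from the example in the text, the per-frame attributes add up to 16 bits per pixel and the common-to-all-frames attributes to 32 bits per pixel.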
- Image Modes, Dot Ports, Dot Streams and Dot Stream Converters
The final, visible picture may be the result of (digital or analog)
signal processing, e.g. blending, overlaying or chroma-keying of several
independent images.
Given a display hardware internal 2D buffer of a particular size (the
virtual image), only a rectangular subregion of that virtual image
(the visible image) may be used for the overlay.
KGI-0.9 uses an abstract representation of the signal sources and
signal processing devices to describe the hardware operation mode.
- Image Modes
describe which attributes are stored per frame and common
to all frames, at what precision attributes are stored and
what size the virtual and visible image are (in pixels),
as well as some global properties, e.g. if the
virtual/visible image can be resized, if scaling/interpolation
or table-lookup operations can be applied to per-pixel
attribute values before being converted into dots
and sent to a dot-port.
- Dot Ports
describe what final screen size (in dots), color space,
data format etc. the dot-data transferred from an
image read-out unit to another signal processing device has.
The signal processing device is assumed to process the data
at a certain maximum rate, the dot clock.
E.g. a video DAC may change its RGB outputs once per dot
clock cycle.
However, data may have to be transferred at a higher or lower
rate, which is determined by the load clock ratio, defining
the dot-data transferred per transfer cycle.
- Dot Stream Converters
represent signal processing devices that read image data
on one or more dot ports, optionally perform some
operations (color space conversion, interpolation,
dot-rate conversion, overlaying, etc.)
and send the result to another dot-port.
This abstraction allows very complex hardware setups to be described
in a kind of signal-flow tree, with a dot-port as root representing the
viewing device, dot-stream-converters as nodes, dot-ports as links and
image modes as leaves.
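The signal-flow tree can be pictured as a small data structure. The following sketch uses hypothetical `demo_*` types that are not taken from the KGI headers: image modes are leaves, dot stream converters are inner nodes, and each non-NULL input pointer stands for a dot port linking two stages. The example setup (a desktop image and a video image combined by an overlay unit that feeds a DAC) is an assumption chosen for illustration.

```c
#include <stddef.h>

typedef enum { DEMO_NODE_IMAGE_MODE, DEMO_NODE_CONVERTER } demo_node_kind_t;

/* One node of the signal-flow tree. Each non-NULL entry in inputs[]
 * represents a dot port through which this node receives dot data. */
typedef struct demo_node {
    demo_node_kind_t        kind;
    const char             *name;
    const struct demo_node *inputs[2];
} demo_node_t;

/* Counts the image modes (leaves) that feed the given node. */
unsigned demo_count_image_modes(const demo_node_t *node)
{
    if (node == NULL)
        return 0;
    if (node->kind == DEMO_NODE_IMAGE_MODE)
        return 1;

    unsigned n = 0;
    for (size_t i = 0; i < 2; i++)
        n += demo_count_image_modes(node->inputs[i]);
    return n;
}

/* Builds a hypothetical overlay setup and returns the number of
 * image modes contributing to the picture on the viewing device. */
unsigned demo_overlay_setup(void)
{
    static const demo_node_t desktop = {
        DEMO_NODE_IMAGE_MODE, "desktop image", { NULL, NULL } };
    static const demo_node_t video = {
        DEMO_NODE_IMAGE_MODE, "video image", { NULL, NULL } };
    static const demo_node_t overlay = {
        DEMO_NODE_CONVERTER, "overlay unit", { &desktop, &video } };
    /* The output of the DAC is the root dot port seen by the monitor. */
    static const demo_node_t dac = {
        DEMO_NODE_CONVERTER, "video DAC", { &overlay, NULL } };

    return demo_count_image_modes(&dac);
}
```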
- Resources
The abstraction described in the last section makes it possible to
describe the (static) operation mode and frame/common buffer requirements.
However, it does not specify means to _alter_ (dynamic) properties
of the operation mode (e.g. the look-up-tables) or the frame/common
buffer contents.
This is done through resources, some of which are global and must be
shared between processes (e.g. the frame/common buffers, look-up
tables) and some of which can be virtualized (e.g. texture buffers,
2D or 3D graphics processor, etc.)
Basically resources are data structures used to communicate relevant
data to an external mapper (a special device file driver), that
utilizes the necessary protection/virtualization mechanisms of
the environment.
Depending on the environment some resources (e.g. accelerators -
see below) may not be available to the high-level driver(s).
Currently the following resources are defined:
Commands |
This resource is used to perform specific requests, e.g.
setting a look-up-table entry etc. |
MMIO regions (memory mapped I/O regions) |
This resource type is used to allow processes to get
a virtual mapping of device-local memory, such as frame or
local buffers, graphics processor control registers,
etc. |
Accelerators (DMA buffers)/Streams |
This resource type is used to establish access to a
circular list of process-local DMA buffers (only one
at a time being writeable to the application).
The buffers are allocated by the external mapper and
are physically contiguous. |
Shared (virtual) Memory (AGP texture memory) |
This resource type is used to establish access to
a memory object shared between the low-level driver,
hardware and the application. This is not yet specified
in detail. |
Exact definitions of the various types can be found in
file:kgi/include/kgi/kgi.h
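The accelerator resource, a circular list of buffers of which only one is writeable at a time, can be illustrated with a simple ring. This is a sketch under stated assumptions and not the KGI mapper: the `demo_*` names, the buffer count and size, and the explicit submit/complete calls are hypothetical, and an actual mapper would rely on the protection mechanisms of the environment and on physically contiguous DMA memory rather than an ordinary array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RING_BUFFERS 4
#define DEMO_BUFFER_BYTES 64

typedef struct {
    uint8_t data[DEMO_RING_BUFFERS][DEMO_BUFFER_BYTES];
    size_t  fill[DEMO_RING_BUFFERS];  /* bytes queued in each buffer */
    size_t  writable;  /* index of the one buffer the application may write */
    size_t  pending;   /* buffers handed to the hardware, not yet finished */
} demo_ring_t;

/* Appends data to the currently writable buffer.
 * Returns the number of bytes actually accepted. */
size_t demo_ring_write(demo_ring_t *r, const void *src, size_t len)
{
    size_t w = r->writable;
    size_t room = DEMO_BUFFER_BYTES - r->fill[w];

    if (len > room)
        len = room;
    memcpy(r->data[w] + r->fill[w], src, len);
    r->fill[w] += len;
    return len;
}

/* Hands the writable buffer to the hardware and makes the next buffer
 * in the ring writable. Returns -1 if all other buffers are still pending. */
int demo_ring_submit(demo_ring_t *r)
{
    if (r->pending == DEMO_RING_BUFFERS - 1)
        return -1;
    r->pending++;
    r->writable = (r->writable + 1) % DEMO_RING_BUFFERS;
    r->fill[r->writable] = 0;
    return 0;
}

/* Called when the hardware has finished the oldest pending buffer. */
void demo_ring_complete(demo_ring_t *r)
{
    if (r->pending > 0)
        r->pending--;
}
```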
Modular Display Driver Implementation
The most common graphics card architecture on the PC-market
utilizes the following principal design:
KGI therefore defines a modular driver architecture that allows
separate drivers to be written and distributed for each subsystem
(except memory).
A fully operational driver is then obtained by linking the
sub-system drivers together.
Each driver provides some (specified) driver-global information,
such as maximum resolution, vendor and model, AC limits etc.
A meta-language is defined for each subsystem that allows driver
initialization, deinitialization, resource export and operation
mode negotiation/checking. This way drivers can be passed a partially
filled-in operation mode description and auto-negotiate the proper
operation mode.
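The idea of passing a partially filled-in mode through the subsystem drivers can be sketched as follows. None of this is the actual module interface: the `demo_*` names, the three subsystems, the fields of the mode structure, all numeric limits, and the formula used to derive a dot clock (60 Hz refresh with 25% blanking overhead) are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical mode description; a field value of 0 means "not yet decided". */
typedef struct {
    unsigned width, height;   /* visible size in dots */
    unsigned dot_clock_khz;
} demo_mode_t;

/* Hypothetical per-subsystem entry point: fills in open fields and
 * returns 0 if the subsystem can handle the mode, -1 otherwise. */
typedef struct {
    const char *name;
    int (*mode_check)(demo_mode_t *mode);
} demo_subsystem_t;

static int demo_chipset_check(demo_mode_t *m)
{
    if (m->width == 0)  m->width = 640;    /* assumed default size */
    if (m->height == 0) m->height = 480;
    return (m->width <= 1600 && m->height <= 1200) ? 0 : -1;
}

static int demo_clock_check(demo_mode_t *m)
{
    if (m->dot_clock_khz == 0)             /* derive a clock if none was asked for */
        m->dot_clock_khz = (m->width * m->height / 1000u) * 60u * 5u / 4u;
    return m->dot_clock_khz <= 135000u ? 0 : -1;
}

static int demo_ramdac_check(demo_mode_t *m)
{
    return (m->dot_clock_khz != 0 && m->dot_clock_khz <= 110000u) ? 0 : -1;
}

/* Passes the mode through every subsystem in turn. Returns 0 if all
 * subsystems accept the (now completed) mode, -1 if any rejects it. */
int demo_negotiate(demo_mode_t *m)
{
    static const demo_subsystem_t chain[] = {
        { "chipset", demo_chipset_check },
        { "clock",   demo_clock_check   },
        { "ramdac",  demo_ramdac_check  },
    };

    for (size_t i = 0; i < sizeof chain / sizeof chain[0]; i++)
        if (chain[i].mode_check(m) != 0)
            return -1;
    return 0;
}
```

A caller that only cares about resolution would set width and height, leave the dot clock at 0, and read the negotiated clock back from the structure after a successful call.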
This modular display driver internal interface is defined in
kgi-0.9/drv/display/kgi/module.h,
but it is being adapted to allow an easier
mapping to the UDI driver model and is not yet finalized.
Summary
This article was intended to give a more detailed view of the
KGI display hardware abstraction model.
It mainly covered (static) operation mode specification, as well
as application/driver/hardware interaction.