OpenGL Texture Formats

Introduction

The Graphics SDK (with the associated drivers) for the PowerVR SGX graphics core on OMAP35x supports several methods for sharing 2D images between OpenGL ES 1.1, OpenGL ES 2.0, OpenVG and native OS windowing systems. The terminology and use of these various approaches on the SGX can be difficult to discern from the standard documents that define them, because these industry-standard APIs are quite complicated and many factors in their application can affect performance.

Those who are new to OpenGL ES might think that PixelBuffers, Texture Maps, Pixmaps and/or Frame Buffer Objects are interchangeable or compatible, but that is generally not true. These constructs have been defined over time as OpenGL has evolved, and their intended uses and level of support on the SGX differ significantly. It is highly recommended to read through the various specifications listed in the References section.

Implementations of OpenGL for desktop systems are typically more forgiving about how applications are implemented, but in the embedded world it is especially important to choose the right constructs for the intended application due to the limits inherent in any embedded system. In particular, converting an image stored in one format to another by copying its contents should be avoided, because that places an unnecessary burden on the limited bandwidth of the memory interface and on the host processor.

This paper attempts to introduce applications developers using OMAP35x to the various methods of sharing rendered images with OpenGL ES so that the best approach can be chosen for any given application. This paper is only an introduction and is not intended to be comprehensive. Particular focus is given to the methods of accomplishing Rendering to a Texture and Texture Streaming applications on the OMAP35x platform. Some familiarity with OpenGL ES is assumed and the appropriate Khronos standard documents should be used as the definitive references.

First, Some Terms

What makes a buffer a buffer? In the world of OpenGL ES, a lot more information is needed than just the address and size of a buffer. Here is a summary of other important attributes for buffers used to store images:

Dimensions

The size of the image in pixels in the horizontal and vertical directions. Images are always stored in rectangular shapes, so every horizontal line of pixels in a given image will have the same length. If an image will be used for texture mapping, it is best for its horizontal and vertical sizes to be powers of two. OpenGL ES 1.1 requires this, but OpenGL ES 2.0 does not. Even though 2.0 allows non-power-of-two sizes (often referred to as NPOT), in certain situations the SGX may perform better with textures that do have power-of-two dimensions; this is related to how the texture is stored and accessed (see the section on Twiddling below). Therefore, most applications round up texture sizes to the closest power of two and either stretch the image or allow the extra pixels to remain a background or border color. Note that it’s fine for the horizontal and vertical dimensions to be different powers of two (non-square), but then the texture may be rendered with better quality in the direction of the larger dimension. The maximum dimensions for any texture on the SGX on OMAP3 are 2048 x 2048 (this changes on certain OMAP4 models and beyond).
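
For illustration, a texture loader can round image dimensions up with a small helper like the one below (a sketch; NextPowerOfTwo is a hypothetical name, not an SDK function):

    // Round a texture dimension up to the next power of two.
    unsigned int NextPowerOfTwo(unsigned int x)
    {
        unsigned int p = 1;
        while (p < x)
            p <<= 1;    // keep doubling until p >= x
        return p;
    }

    // Example: a 600 x 400 image would be padded (or stretched) to 1024 x 512.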

Color Depth

The number of bits used to encode each pixel of the image. For OpenGL ES, this is usually 16, 24 or 32.

Color Format

The ordering and precisions of the Red, Green, Blue and Alpha components of the pixels of the image. On the PowerVR architecture, this is usually RGB565 (16 bits), RGB888 (24 bits) or ARGB8888 (32 bits).

Color Space

OpenGL ES and OpenVG only directly support the RGB (Red, Green, Blue) color space. However, video/camera systems typically produce images in the YCrCb color space, so sharing images between those systems and OpenGL ES requires converting the color space of every pixel in the image. This is an expensive operation, but there are extensions to OpenGL ES to allow color space conversion (YCrCb to RGB) to be performed on the SGX. The OMAP35x Display Subsystem also has dedicated logic for doing color space conversions.
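
For reference, one common flavor of this conversion (BT.601, 8-bit, full-range, with Cb/Cr centered at 128) is sketched below. It is shown only to illustrate the per-pixel cost that the SGX extensions and the Display Subsystem hardware keep off the ARM:

    // Illustrative BT.601 YCbCr-to-RGB conversion for a single pixel.
    // Results should be clamped to the [0, 255] range afterwards.
    void YCbCrToRgb(int y, int cb, int cr, int *r, int *g, int *b)
    {
        *r = y + (int)(1.402f * (cr - 128));
        *g = y - (int)(0.344f * (cb - 128) + 0.714f * (cr - 128));
        *b = y + (int)(1.772f * (cb - 128));
    }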

Stride

Images are often stored in buffers that were originally allocated to hold larger images. Therefore the lines of the current image may not be stored contiguously, with each line immediately following its predecessor. In other words, there may be unused words of memory at the end of each line which must be skipped when copying the image. The stride is defined as the count of bytes that must be added to the address of the start of a line to reach the address of the start of the next line in the same image. Sometimes strides are expressed in pixels instead of bytes, in which case the color depth of the image must be taken into account.
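
As a sketch, addressing pixels in a strided buffer works as follows (PixelAddress is a hypothetical helper; the stride here is assumed to be in bytes):

    // Address of pixel (x, y) in a strided image. strideBytes may be larger
    // than width * bytesPerPixel when the buffer was allocated for a larger
    // image; a stride given in pixels converts as stridePixels * bytesPerPixel.
    unsigned char *PixelAddress(unsigned char *base, int x, int y,
                                int strideBytes, int bytesPerPixel)
    {
        return base + (y * strideBytes) + (x * bytesPerPixel);
    }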

Display and rendering hardware often impose strict requirements on the stride of image buffers. For example, the SGX 1.0.3 core in OMAP35x ES2.x devices requires a minimum stride of 32 pixels and the stride must also be a multiple of 32 pixels. However, the newer SGX 1.2.1 core in the ES3.x devices only requires that the stride be a multiple of 8 pixels and have a minimum of 8 pixels.

Another example of a required stride is with the OMAP35x Display Subsystem. It has the ability to display images rotated on 90 degree increments, but only if the displayed image buffer has a stride of 2048 pixels.

Contiguous

An image which is stored contiguously in physical memory, without gaps. The SGX can only render into a contiguous image buffer, and the OMAP35x display subsystem can only display images from contiguous buffers. However, both handle strided images, provided the strides are in the required ranges.

Cached

An image which is stored in a region in DDR memory which is being cached by the ARM processor’s data cache. This is determined by the configuration of the ARM’s MMU which is controlled by the OS that allocated the buffer in question.

Buffers which will be shared between the ARM and the SGX are usually not cached. In some situations it may be possible to increase performance by using a cached buffer instead, but this requires the application to clean and invalidate the proper cache lines at very specific times that may be difficult to determine.

Twiddling

Accessing DDR memory sequentially is faster than accessing it randomly. Reading a stored image in the horizontal direction usually corresponds to sequential access, but reading the same image vertically usually requires random access (or large steps in the address). This would make the time required to read an image from DDR for texture mapping dependent upon the orientation of the viewport relative to the geometry. Therefore, the OpenGL ES drivers for the SGX typically prepare images that will be used for texture mapping by rearranging their contents so that horizontal and vertical accesses will have more consistent and deterministic performance. This process is called Twiddling, and it is applied by default to all OpenGL ES texture maps for the SGX.

However, since the SGX has a deferred rendering architecture, twiddling is typically deferred until textures are actually used. Beginning with the 1.3 DDK, twiddling is performed by the SGX using its transfer queue, rather than by the ARM. This is a performance enhancement that can be disabled by placing the string “DisableHWTextureUpload=1” in the powervr.ini file, which can be useful for debugging.
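
The exact twiddled layout is internal to the driver, but the concept resembles a Morton (Z-order) curve, sketched below for square power-of-two textures. This is purely an illustration of the idea, not the SGX’s actual addressing logic:

    // Illustrative Morton (Z-order) index: interleave the bits of x and y so
    // that pixels that are close both horizontally and vertically also end up
    // close together in memory.
    unsigned int MortonIndex(unsigned int x, unsigned int y)
    {
        unsigned int index = 0;
        for (unsigned int bit = 0; bit < 16; bit++)
        {
            index |= ((x >> bit) & 1u) << (2 * bit);      // even bits from x
            index |= ((y >> bit) & 1u) << (2 * bit + 1);  // odd bits from y
        }
        return index;
    }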

Compression

Images that will be used for texture mapping in OpenGL ES are usually compressed with an algorithm that is proprietary to the PowerVR architecture, named PVRTC. This is an asymmetrical compression algorithm, meaning that it is much more compute-intensive to compress an image than to decompress it. In fact, since the SGX has dedicated logic to decompress the PVRTC format, decompression is effectively free, and PVRTC should always be used for static images because it reduces the burden on the OMAP35x DDR memory system. Utilities are provided in the Graphics SDK to do the PVRTC compression for static images, on a workstation, when an application is compiled.

However, if the image for a texture map is to be dynamically updated for every frame, compression cannot be used because the cost of doing it on the target OMAP35x device would be too great.

MipMaps

When an image will be used for texture mapping, it is recommended that MipMaps be supplied as well. MipMaps are copies of the texture image which have been scaled down and filtered in advance. OpenGL ES will automatically select the best MipMap from this set of images for the target geometry whenever the texture map is applied. MipMaps enable OpenGL ES to maintain higher quality rendering results when the textures will be viewed across a wide range of down-scaling factors. MipMaps can be generated at compile time or run time, but they are intended for static images.

If the image for a texture map is to be dynamically updated for every frame, the cost of regenerating the MipMaps for every frame is probably too great and may have no value anyway if the texture image is not down-scaled across a range of factors. That is controlled by how the application animates the motion of the model and/or the viewport in 3D space.

Allocator

OpenGL ES and OpenVG rely upon the EGL driver to supply the framebuffer, pixelbuffers and pixmaps. Since any EGL driver is implemented for a specific OS, it depends upon that underlying OS to provide these allocations. Knowing the allocator of a buffer can help determine the other attributes of the buffer and ultimately its compatibility for a particular application.

The Framebuffer

The most familiar buffer is what the SGX renders into by default, for display, but it’s known by several names and it’s usually not a single buffer. Books on OpenGL call this the Color Buffer. The EGL document calls it an on-screen rendering surface. A more precise name is the window system provided framebuffer. This ambiguity in nomenclature comes from the fact that OpenGL was designed to be OS and platform independent, but displaying a graphics buffer is always a very OS and platform dependent activity.

On the OMAP35x platform, the framebuffer is always allocated by an EGL driver, either under Windows or Linux, and the EGL driver can be configured to use a single framebuffer or 2 or 3 (or more) framebuffers. Using 2 or more buffers improves the quality of the displayed graphics because one buffer can be written by the SGX while another buffer is being read by the DSS for display. The buffer that is currently being read by the DSS is called the Front Buffer. If a second buffer has been allocated, it is called a Back Buffer and the SGX will render to it, instead of the front buffer. Typically, the front and back buffer assignments are swapped at the completion of each frame to prevent the need to copy buffer contents and/or maintain synchronization with the DSS video system.

If the application chooses to swap the front and back buffers synchronously at the display frame rate (usually 60 Hz) and the application can always complete the rendering of each frame within that available time, then 2 buffers are sufficient. However, in most applications, the time to render frames varies greatly depending upon the complexity of the current scene and this can make it impossible to guarantee that rendering is completed for every frame in the available time. In this case, it is useful to allocate additional back buffers to allow the rendering to run asynchronous to the display. This is called a flip chain of back buffers.

Three framebuffer configurations supported under Linux for the Null window system

  • pvr2d front - The SGX renders directly to a single buffer which is always displayed.
  • pvr2d blit - The SGX renders to a back buffer and copies each frame to a displayed buffer.
  • pvr2d flip - Multiple render buffers are used and each is displayed successively.

Obviously, the term “framebuffer” is a vague misnomer, but it serves as a useful name for whatever set of buffers the SGX is using for the default rendering target in the current system configuration. It is important to understand that all of these buffers are for the express purpose of displaying images on a display by the OMAP35x Display Subsystem. If high performance is desired, these buffers should not be read by the host processor either directly or with OpenGL ES functions such as glReadPixels(), glCopyTexImage2D() or glCopyTexSubImage2D(). These functions are useful for testing and debugging purposes, but they are performance killers. This is because OpenGL ES has a long rendering pipeline design and any operation that requires reading from the final result (a framebuffer) stalls the entire pipeline whenever a read back is performed.

There are many applications that require rendering an intermediate image that will not be displayed directly, but read back and used for further rendering. This can be done without stalling the pipeline by rendering to either a pbuffer or Frame Buffer Object, instead of using the framebuffer. The pbuffer or FBO can then be used as a texture map for further rendering by OpenGL ES. This is called Render to Texture or RTT. It has many applications and is discussed in detail in the following sections.

Texture Mapping

This is the primary mechanism in OpenGL for using 2D images by mapping them onto 3D geometry. It is usually the most performance critical aspect of using OpenGL ES on OMAP35x because texture data is often large and must be stored in the DDR memory. PVRTC compression is typically used to increase the performance of texture mapping and MipMaps are used to improve the quality of the resultant image.

The image data for a texture map can originate from a number of possible sources. For static images, it is best to use the PVRTexTool utility provided in the Graphics SDK to compress and MipMap the images into the PVRTC format at compile time. Performing the compression at run-time is not supported. There are two levels of quality in PVRTC compression, 2 or 4 bits per pixel, with a choice of alpha or no alpha support. The texture image must be supplied in one of these supported formats:

Texture Map formats defined by the OpenGL ES standards

  • GL_RGBA (RGBA 8888)
  • GL_RGB (RGB 888)
  • GL_LUMINANCE (I 8)
  • GL_ALPHA (A 8)
  • GL_LUMINANCE_ALPHA (AI 88)

Additional Texture Map formats defined by extensions

  • GL_RGB565_OES (RGB 565)
  • GL_RGBA4_OES (RGBA 4444)
  • GL_RGB5_A1_OES (RGBA 5551)
  • GL_BGRA (BGRA 8888)

Compressed Texture Map formats defined by extensions (for static textures)

  • GL_ETC1_RGB8_OES
  • GL_COMPRESSED_RGB_PVRTC_4BPPV1_IMG
  • GL_COMPRESSED_RGB_PVRTC_2BPPV1_IMG
  • GL_COMPRESSED_RGBA_PVRTC_4BPPV1_IMG
  • GL_COMPRESSED_RGBA_PVRTC_2BPPV1_IMG

The OpenGL ES standards only specify texture formats in the RGB color space, but there are some extensions which are supported on the SGX for supplying textures in the YCrCb color space. In this case, the SGX can perform color space conversion to an RGB format.

The pixel dimensions of a texture affect the performance of using the texture. In particular, the horizontal and vertical dimensions should be powers of two. OpenGL ES 1.1 requires this, but OpenGL ES 2.0 does not. Even though 2.0 allows non-power of two sizes (NPOT), the SGX still performs much better (10 to 20 times faster) with textures that do have power of two dimensions. Therefore, most applications round up texture sizes to the closest power of two and either stretch the image or allow the extra pixels to remain a background/border color.

When dynamic images are used for texture mapping, they can be supplied from a pbuffer, an FBO, or a buffer allocated by the application. In texture streaming and render to texture applications, the texture image is typically updated for every frame. In these cases, PVRTC compression cannot be used, because it is not supported at run-time and there would probably not be enough time to do the compression while maintaining video frame rates anyway.

Mipmapping is also optional, but it can improve the quality of the resultant image in applications where the texture mapped geometry will be viewed across a range of decreasing sizes. OpenGL ES 1.1 has the ability to automatically generate mipmaps whenever a texture map is updated. This is enabled with glTexParameterf(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE). Automatic mipmap generation has been replaced in OpenGL ES 2.0 with the new function glGenerateMipmap(GL_TEXTURE_2D), which only generates the mipmap levels once per call. The performance of mipmap generation has been improved significantly beginning with the 1.4 DDK.
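
In code, the two approaches reduce to the following sketch (hTexture is a placeholder handle):

    // OpenGL ES 1.1: mipmaps are regenerated automatically on every update
    // of the bound texture.
    glBindTexture(GL_TEXTURE_2D, hTexture);
    glTexParameterf(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE);

    // OpenGL ES 2.0: mipmap levels are generated once, at the time of the call.
    glBindTexture(GL_TEXTURE_2D, hTexture);
    glGenerateMipmap(GL_TEXTURE_2D);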

Essential functions in OpenGL ES 1.1 and 2.0 to create and configure texture maps

  • glGenTextures - Generates handles for texture maps
  • glBindTexture - Binds a texture map for use
  • glTexImage2D - Loads the texture map image data
  • glTexParameterf - Configures texture map filtering parameters
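
Put together, a minimal texture setup using these functions looks like the sketch below (textureSize and textureData are placeholders):

    GLuint hTexture;
    glGenTextures(1, &hTexture);                    // generate a handle
    glBindTexture(GL_TEXTURE_2D, hTexture);         // bind it for configuration
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB,          // load mipmap level 0
                 textureSize, textureSize, 0,
                 GL_RGB, GL_UNSIGNED_BYTE, textureData);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);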

The OpenGL ES drivers normally perform twiddling on all textures before they are used. This is done to improve the performance of using the textures when they are applied to the geometry. However, the twiddling is not done immediately when a texture is created. The SGX architecture defers all rendering, including the twiddling of textures, until they are actually required. This can lead to some unexpected results when attempting to benchmark the performance of rendering textured geometry.

Render To Texture

There are many applications that require rendering an intermediate image that will not be displayed directly, but read back and used for further rendering. This can be done without stalling the OpenGL ES pipeline by rendering to either a pbuffer or Frame Buffer Object, instead of using the framebuffer. The pbuffer or FBO can then be used as a texture map for further rendering by OpenGL ES.

The most common applications for Rendering to a Texture

  • Repeating a rendered image multiple times in a scene
  • Simulating reflection effects, like a mirror or lake in a scene
  • Simulating shadow effects in a scene
  • Post-processing effects, such as motion blur or antialiasing
  • Compositing a rendered image into a 3D GUI

A sample application named RenderToTexture is provided here in source code form that demonstrates how to implement rendering to a texture on the OMAP35x platform using either pbuffers or FBOs.

File:RenderToTextureExamples.zip

Supplied versions of the Render to Texture demonstration program

  • RenderToTexture_pBuffer\OGLES\RenderToTexture.cpp - Uses PBuffers for OpenGL ES 1.1
  • RenderToTexture\OGLES\RenderToTexture.cpp - Uses FBO-OES extension for OpenGL ES 1.1
  • RenderToTexture2\OGLES2\RenderToTexture2.cpp - Uses FBOs for OpenGL ES 2.0

The use of Frame Buffer Objects is recommended over pbuffers for rendering to textures, because FBOs are more flexible and offer some performance advantages. Nevertheless, the pbuffer version of the demonstration program is also provided for comparison and for legacy use.

These programs use the PVRShell and PVRTools environment so they can be directly compiled and run under either embedded Linux, WindowsCE 6.0 or VFrame, without modification. These programs can also be used for benchmarking to compare the actual performance when run on the OMAP35x EVM, since they all display real-time performance measurements in frames per second (FPS). Use the up/down arrow keys to increase/decrease the size of the cube.

Note that to run the FBO-OES version under VFrame requires the PowerVR PC Emulation SDK version 2.4 or later and the pcviewer_es1.cfg file properly configured for the SGX530 core in OMAP35x.

The design and coding of these programs is described in more detail in the following sections on PixelBuffers and Frame Buffer Objects. See figure 1.

Figure 1: Render to Texture demonstration program screen capture

PixelBuffers

PixelBuffers (or pbuffers) are the original solution for how to render to an off-screen buffer with OpenGL ES or OpenVG. This approach continues to be supported on OMAP35x for rendering with OpenGL ES 1.1 and OpenVG, but it has been superseded by the newer Frame Buffer Objects approach in OpenGL ES 2.0. Since there is now an extension to OpenGL ES 1.1 to also support FBOs, and they provide better performance and flexibility, the only reasons to use pbuffers today are for backwards compatibility or for OpenVG. Pbuffers are still the best way to share images between OpenGL ES 1.1 and OpenVG.

The major difference between pbuffers and FBOs is that pbuffers are allocated by the EGL driver, whereas FBOs are controlled entirely through OpenGL ES and are therefore better integrated with it. Pbuffers require a separate rendering context from the framebuffer. This leads to some performance problems when rendering to a texture, because switching OpenGL ES to render to a pbuffer requires handling changes in the rendering context, and this cost is incurred every time the rendering target is changed. Also, when a pbuffer is used as a texture, it is not stored in a twiddled format, which reduces the performance when the texture is used.

Essential EGL functions to create and use a pixelbuffer

  • eglGetCurrentDisplay - Get a handle to the framebuffer
  • eglGetCurrentContext - Get a handle to the rendering context of the framebuffer
  • eglQueryContext - Query attributes (such as the config ID) of the rendering context
  • eglGetConfigAttrib - Get the configuration of the rendering context
  • eglChooseConfig - Find the closest matching rendering context available
  • eglCreatePbufferSurface - Create a pixelbuffer
  • eglMakeCurrent - Switch the rendering target to the pixelbuffer
  • eglBindTexImage - Bind a pixelbuffer to use it as a texture map
  • eglDestroySurface - Delete a pixelbuffer
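
Pieced together, the creation sequence resembles the following sketch (error checking omitted; the 256 x 256 size is an arbitrary example):

    EGLDisplay display = eglGetCurrentDisplay();
    EGLContext context = eglGetCurrentContext();

    // Find an EGLConfig matching the current context's configuration.
    EGLint configID, configCount;
    eglQueryContext(display, context, EGL_CONFIG_ID, &configID);
    EGLint configAttribs[] = { EGL_CONFIG_ID, configID, EGL_NONE };
    EGLConfig config;
    eglChooseConfig(display, configAttribs, &config, 1, &configCount);

    // Create a pbuffer that can later be bound as a 2D texture.
    EGLint pbufferAttribs[] = {
        EGL_WIDTH,          256,
        EGL_HEIGHT,         256,
        EGL_TEXTURE_FORMAT, EGL_TEXTURE_RGB,
        EGL_TEXTURE_TARGET, EGL_TEXTURE_2D,
        EGL_NONE
    };
    EGLSurface pbuffer = eglCreatePbufferSurface(display, config, pbufferAttribs);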

PixelBuffers are for rendering with the accelerated OpenGL ES 1.1 or OpenVG drivers only. They are not intended to be accessed directly by the application or any other software which may be running on the ARM. If such access is needed, either Pixmaps or FBOs should be used instead.

Note that support for pixelbuffers has recently been added to the OpenGL ES 2.0 driver for OMAP35x beginning with the 1.4 DDK, but its use is not recommended because this contradicts the Khronos standard for 2.0 and is probably not portable.

Pixelbuffer Demonstration Program

The pbuffer version of the RenderToTexture demonstration program creates and uses a single pbuffer which is bound and used as both a texture map applied to a geometric model of a rotating cube and for the target of the rendering. This forms a circular rendering loop where the image of a rotating cube is rendered to the pbuffer and then used as a texture map on the same cube for the next frame. See figures 1 and 2.

Figure 2


The design and coding of this program is now described in detail with an emphasis on the OpenGL ES calls involved with controlling the pbuffer and texture map. Please refer to the program source code (RenderToTexture.cpp, in the older versions of the SDK) to follow this description.

In InitView(), a call to SelectEGLConfig() queries the EGL driver to get the rendering context of the framebuffer so that eglCreatePbufferSurface() can create a pbuffer which matches that context as closely as possible. A handle is also created for a texture with glGenTextures() and it is bound with glBindTexture() so that it can be configured with glTexImage2D() and glTexParameterf(). The important parameters for the texture are its dimensions (gTextureSize) and the color format (GL_RGB).

Next, the texture is bound to the pbuffer by eglBindTexImage() so that the pbuffer can be used for texturing. This requires switching the rendering target from the framebuffer to the pbuffer (m_PBufferSurface) by calling eglMakeCurrent(), and then after the bind operation, eglMakeCurrent() is called again to switch the rendering target back to the framebuffer (m_CurrentSurface). This completes the initialization of the program.

RenderScene() constitutes the main loop of this program. It is called repeatedly by the PVRShell to render each frame of the 3D scene. With each call, RenderScene() calls RenderTexture() to render an updated texture map, then the next frame of the cube model is rendered with the updated texture applied.

RenderTexture() is the function that actually renders the cube image into a texture map. This function begins and ends with calls to eglMakeCurrent(). The first call switches the target of OpenGL ES rendering from the framebuffer to the pbuffer to be used for texturing (m_PBufferSurface).

Since texture maps usually have different dimensions than the framebuffer, glViewport() is called to configure OpenGL ES for the dimensions of the target texture and to erase it with glClear() in preparation for rendering a new image of the cube. A different model-view projection matrix is also required, so that is loaded as well. eglReleaseTexImage() and eglBindTexImage() are called to release the previously used pbuffer and bind the new one for texturing. In this case, these are really the same pbuffer (m_PBufferSurface), but the EGL driver still requires these calls; possibly to signal that the rendering of the previous texture is complete. The call to DrawCubeSmooth() actually renders the cube model into the pbuffer with the updated rotation angle.

Finally, the last call to eglMakeCurrent() switches the rendering target from the pbuffer back to the framebuffer (m_CurrentSurface).
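
The per-frame flow just described reduces to the following sketch (names are illustrative stand-ins for the program's member variables):

    // Render the texture image into the pbuffer.
    eglMakeCurrent(display, pbufferSurface, pbufferSurface, context);
    glViewport(0, 0, textureSize, textureSize);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    eglReleaseTexImage(display, pbufferSurface, EGL_BACK_BUFFER); // release last frame's binding
    eglBindTexImage(display, pbufferSurface, EGL_BACK_BUFFER);    // bind for texturing
    DrawCube();                                                   // hypothetical draw call

    // Switch back to rendering into the framebuffer.
    eglMakeCurrent(display, windowSurface, windowSurface, context);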

Note that this program never generates mipmaps for the texture, nor does it compress it. This is because the texture is updated so frequently that generating mipmaps or converting it to the PVRTC format would take too much time. Therefore, only the first mipmap level (0) of the texture is rendered and used, in the uncompressed GL_RGB format.

Frame Buffer Objects

An FBO is an off-screen rendering target. It is an alternative to the framebuffer or pbuffer that would otherwise serve as the target buffer for rendering from OpenGL ES. FBOs were introduced with OpenGL ES 2.0 to provide greater flexibility and performance for applications like rendering to a texture map and now OpenGL ES 1.1 also supports FBOs through extensions to that standard. Older applications which were developed before 2.0, typically use pbuffers for off-screen rendering, but that approach has been superseded by FBOs.

The major difference between pbuffers and FBOs is that pbuffers are allocated by the EGL driver and require disparate rendering contexts whereas FBOs can share the same context as the framebuffer so there is less overhead associated with switching the target of the rendering between the two. Since FBOs are controlled entirely through OpenGL ES, they are also more tightly integrated and flexible. For example, not only can the rendered color image be captured and used as a texture map, but the depth and stencil images can be captured and used too.

An FBO is essentially a data structure maintained by the OpenGL ES driver. FBOs do not store image data directly, but store handles to renderbuffers or to textures which have been attached to the FBOs to capture the rendered images. There are 3 defined attachment points for the color, depth and stencil images that OpenGL ES produces. The rendered color image is most often used and can either be captured directly into an attached texture or into a renderbuffer for the application to read. The depth image (Z buffer) data is sometimes used for advanced rendering techniques and can be captured into either a texture or renderbuffer as well. Finally, the stencil image can only be captured into a renderbuffer. Note that capturing the depth image into a texture requires the OES_depth_texture extension which is only available beginning with the 1.4 DDK.

Renderbuffers are defined by data structures called Render Buffer Objects. For an application to read the color, depth and/or stencil images, it must create corresponding renderbuffers for each, configure and attach them to the current FBO with glFramebufferRenderbuffer(). Renderbuffers are empty when they are created by glGenRenderbuffers(), so their dimensions and formats must be configured using glRenderbufferStorage(). Applications can query the maximum dimensions that are supported by calling glGetIntegerv(GL_MAX_RENDERBUFFER_SIZE), but for the SGX it is the same as the maximum texture size (2048 x 2048). Up to 3 renderbuffers can be attached to an FBO at the same time.
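
As a sketch, creating a color renderbuffer and attaching it to the currently bound FBO looks like this (width and height are placeholders):

    GLuint hRenderbuffer;
    glGenRenderbuffers(1, &hRenderbuffer);                // empty RBO
    glBindRenderbuffer(GL_RENDERBUFFER, hRenderbuffer);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_RGB565,     // format and size
                          width, height);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                              GL_RENDERBUFFER, hRenderbuffer);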

Alternatively, to capture a color image directly into a texture map (rendering to a texture), use glFramebufferTexture2D() to attach the texture, instead of renderbuffers. This has the advantage that the texture can then be used immediately for texturing, without copying or conversion.

When an FBO is bound as the current rendering target, the functions glCopyTexImage2D() and glCopyTexSubImage2D() will copy from the renderbuffer or texture currently attached to the color attachment of the currently bound FBO, rather than from the framebuffer. This can be used to copy portions of an attached renderbuffer or texture to another texture. Also, the function glReadPixels() can be used to copy renderbuffers or textures currently attached to the currently bound FBO to a buffer supplied by the application. Note however, that these features are only operational beginning with the 1.5 DDK.

Figure 3 shows the relationship between the framebuffer, FBOs, RBOs, textures and how they can be connected to capture rendered images.

Figure 3

There are two ways to capture rendered color images with FBOs: with an attached RBO, or directly into a texture map. This distinction is made because the OpenGL ES driver needs to know in advance if the image will be used for texturing. If so, the image will be twiddled as it is stored, to provide better performance when the texture is used. The application also has the choice to store the image into any mipmap level of the texture and to generate the remaining levels, if needed. If the rendered color image will not be used for texture mapping, then the image should be captured into an RBO to avoid the overhead of the twiddling.

Essential OpenGL ES functions to create and use FBOs and RBOs

  • glGenFramebuffers - Generates FBO handles
  • glBindFramebuffer - Bind or unbind an FBO for use
  • glFramebufferTexture2D - Attach a texture map to capture rendered color image
  • glDeleteFramebuffers - Deletes FBO handles
  • glCheckFramebufferStatus - Check that FBO attachments are complete
  • glGenRenderbuffers - Generates RBO handles
  • glBindRenderbuffer - Bind an RBO for use
  • glRenderbufferStorage - Define pixel color format and dimensions for RBO storage
  • glFramebufferRenderbuffer - Attach RBOs to capture rendered color, depth or stencil data
  • glDeleteRenderbuffers - Deletes RBO handles
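
A minimal usage flow with these functions is sketched below (hFBO and DrawScene are placeholders):

    glBindFramebuffer(GL_FRAMEBUFFER, hFBO);   // subsequent rendering goes to the FBO
    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE)
    {
        DrawScene();                           // hypothetical draw call
    }
    glBindFramebuffer(GL_FRAMEBUFFER, 0);      // 0 restores the framebuffer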

Note that since FBOs are not dependent on any underlying windowing system or EGL support, they are always available on the OMAP35x platform under Linux and Windows. The Qt environment supports render to texture applications via FBOs for its rendering of widgets.

FBO Demonstration Program

The FBO and FBO-OES versions of the RenderToTexture demonstration program create and use two FBOs, each of which has a corresponding texture map attachment. The texture from the first FBO is applied to a geometric model of a rotating cube while the second FBO is used as the target of the rendering. After each frame is rendered, the roles of these two FBOs are swapped, so that the next frame will be rendered with a texture image of the previous cube. This circular rendering algorithm produces a rotating cube which is texture mapped with an image of itself, which is also rotating. See figures 1 and 4.

Figure 4

The design and coding of this program is now described in detail with an emphasis on the OpenGL ES calls involved with controlling the FBOs and texture maps. Please refer to the program source code (RenderToTexture.cpp) to follow this description.

In InitView(), handles are created for 2 textures and 2 FBOs, with glGenTextures() and glGenFramebuffers(), respectively. Both textures are bound with glBindTexture() so that they can be configured with glTexImage2D() and glTexParameterf(). The important parameters for the textures are their dimensions (gTextureSize) and their color format (GL_RGB). Each FBO is then bound with glBindFramebuffer() so that their corresponding textures can then be attached by glFramebufferTexture2D(), which also configures how these attachments will be used.

Parameters to glFramebufferTexture2D()

  • GL_FRAMEBUFFER - These FBOs share the same rendering context as the framebuffer.
  • GL_COLOR_ATTACHMENT0 - The color buffer image will be captured, not the depth buffer.
  • GL_TEXTURE_2D - These are 2D textures, not 1D or 3D.
  • m_hTexture[Index] - The handles of the textures to attach to these FBOs.
  • 0 - The color buffer will be rendered into MipMap level 0 (the first one).
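
Assembled into a single call, these parameters look like the following sketch (Index is the program's loop variable):

    glFramebufferTexture2D(GL_FRAMEBUFFER,        // same context as the framebuffer
                           GL_COLOR_ATTACHMENT0,  // capture the color image
                           GL_TEXTURE_2D,         // a 2D texture attachment
                           m_hTexture[Index],     // handle of the texture to attach
                           0);                    // render into mipmap level 0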

Since this program only uses FBOs for the purpose of rendering to textures, no Render Buffer Objects are needed. This completes the initialization of the program.

RenderScene() constitutes the main loop of this program. It is called repeatedly by the PVRShell to render each frame of the 3D scene. With each call, RenderScene() calls RenderTexture() to render an updated texture map, then the next frame of the cube model is rendered with the updated texture applied. Finally, the FBO index (m_CurrentFBO) is toggled so that the FBOs are swapped in preparation for the next call to RenderScene().

RenderTexture() is the function that actually renders the cube image into a texture map. This function begins and ends with calls to glBindFramebuffer(). The first call switches the rendering target from the framebuffer to the texture attached to the FBO (m_hFBO), which is indexed by (m_CurrentFBO). This simply toggles between 0 and 1 to effect the swapping of the two textures each time RenderTexture() is called.

Next, the call to glCheckFramebufferStatus() is recommended to confirm that the FBO is ready for use. If an error is returned, it indicates that the FBO is not ready to be used for rendering, but that should never occur.

Since the texture maps usually have different dimensions than the framebuffer, glViewport() is called to configure OpenGL ES for the dimensions of the target texture and to erase it with glClear() in preparation for rendering a new image of the cube. A different model-view projection matrix is also required, so that is loaded as well. The call to DrawCubeSmooth() actually renders the cube model into the texture attached to the FBO indexed by (m_CurrentFBO) with the updated rotation angle. Note that glBindTexture() binds the opposite texture map to use for texturing this cube (m_hTexture[m_CurrentFBO ^ 1]).

Finally, the last call to glBindFramebuffer() switches the rendering target from the FBO back to the framebuffer. The 0 parameter for the FBO handle selects the framebuffer, rather than an FBO.

Note that this program never generates mipmaps for either of the textures, nor does it compress them. This is because the textures are updated so frequently that generating mipmaps or converting them to the PVRTC format would take too much time. Therefore, only the first mipmap level (0) of the textures is rendered and used, in the uncompressed GL_RGB format.

The FBO-OES version of the RenderToTexture demonstration program is identical to the FBO version in how it creates and uses textures and FBOs, except that the FBO-OES version uses extensions to access the FBO functions, since they are not directly supported by the OpenGL ES 1.1 standard. To use extensions supported through the PVRTools class CPVRTglesExt, InitView() calls CPVRTglesExt::LoadExtensions() to initialize the pointer (m_Extensions). This pointer is then used to call all extension functions for FBOs. To obtain a string which lists the names of all extensions supported by the OpenGL ES driver, use glGetString(GL_EXTENSIONS). It is good practice for applications that require extensions to perform this inquiry to confirm that the required extensions are available.
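
As an example, the availability check can be a simple substring search (a sketch; strstr comes from <string.h>):

    // Confirm the OES framebuffer object extension before loading its entry points.
    const char *extensions = (const char *)glGetString(GL_EXTENSIONS);
    if (extensions == NULL || strstr(extensions, "GL_OES_framebuffer_object") == NULL)
    {
        // The FBO-OES functions are not available on this driver.
    }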

Pixmaps

Pixmaps are another type of off-screen rendering surface that can be allocated by the EGL driver. The EGL specification (Khronos Native Platform Graphics Interface) defines two types of off-screen rendering surfaces, pixelbuffers and pixmaps (as well as the framebuffer used for the on-screen native platform windowing system). The difference between pixelbuffers and pixmaps is in the format and stride of the image data they store. PixelBuffers are created with a format and stride that the SGX can render into efficiently, but which is not necessarily compatible with any other graphics interface beyond OpenGL ES 1.1 and OpenVG. Pixmaps, however, are specifically defined to be compatible with the native windowing and/or graphics system for the OS platform that the EGL driver is implemented for. Also, since pixelbuffers are required by the EGL specification, their support is guaranteed on whatever OS platform the EGL driver is provided for. However, support for pixmaps is optional and should not be assumed, because a different implementation is required for each windowing system.

Windows has only a single windowing system to support, and an updated WSEGL and display driver could allow pixmap images from OpenGL ES and OpenVG to be shared with Windows as DirectDraw surfaces, which can be accessed by the Windows native graphics interface (GDI). A major obstacle in this development is the fact that Windows uses two types of image buffers, Device Independent Bitmaps (DIBs) and Device Dependent Bitmaps (DDBs), and allowing the SGX to access DDBs via pixmaps requires customizations to the Windows display driver in addition to the EGL driver.

Support for pixmaps for OMAP35x under Linux is in development. There are several popular windowing systems in use today with embedded Linux and each of these will require enhancements to their EGL driver implementations to support pixmaps. Potentially, the pixel format and stride requirements of each windowing system could be different. Today, pixmaps are only supported for the Null windowing system beginning with release 3.00.00.10 of the Linux Graphics SDK. This support is defined by the following LinuxNullPixmap structure. It allows the use of CMEM to allocate buffers for pixmaps which are physically contiguous, for example. The source code for this implementation of the EGL driver can be used as a guide for implementing pixmap support for other windowing systems.

    typedef struct
    {
        long ePixelFormat;
        long eRotation;
        long lWidth;
        long lHeight;
        long lStride;
        long lSizeInBytes;
        long pvAddress;
        long lAddress;
    } LinuxNullPixmap;

References

  • www.montgomery1.com/opengl - Montgomery One - Practical Solutions from the Visual Computing Frontier
  • www.khronos.org/opengles - The official standard documents for OpenGL ES 1.1 and 2.0
  • www.khronos.org/egl - The official standard document for the EGL
  • OpenGL ES 2.0 Programming Guide - By Aaftab Munshi, Dan Ginsburg and Dave Shreiner
  • www.imgtec.com/PowerVR/insider - Download the “PC Emulation” PowerVR SDKs for OpenGL ES
  • www.opengl.org/wiki/Main_Page - Wiki that covers desktop versions of OpenGL

Pixel Transfer

A Pixel Transfer operation is the act of taking pixel data from an unformatted memory buffer and copying it into OpenGL-owned storage governed by an image format, or vice versa: copying pixel data from image-format-based storage to unformatted memory. A number of functions affect how a pixel transfer operation is handled; many of these relate to how the information in the memory buffer is to be interpreted.

Terminology

Pixel transfers can either go from user memory to OpenGL memory, or from OpenGL memory to user memory (the user memory can be client memory or buffer objects). Pixel data in user memory is said to be packed. Therefore, transfers to OpenGL memory are called unpack operations, and transfers from OpenGL memory are called pack operations.

Pixel transfer initiation

There are a number of OpenGL functions that initiate a pixel transfer operation.

Transfers from OpenGL to the user (pack operations) include:

  • glReadPixels - Read a block of pixels from the framebuffer
  • glGetTexImage - Read back the contents of a texture image

Transfers from the user to OpenGL (unpack operations) include:

  • glTexImage1D/2D/3D - Specify a texture image
  • glTexSubImage1D/2D/3D - Update a region within a texture image

There are also special pixel transfer commands for compressed image formats. These are not technically pixel transfer operations, as they do nothing more than copy memory to/from compressed textures. But they are listed here because they can use pixel buffers for reading and writing.

The discussion below will ignore the compressed texture functions, since none of what is discussed pertains to them.

Pixel transfer arguments

With the exception of the compressed texture functions, all functions that initiate pixel transfers take 3 parameters:

GLenum format​, GLenum type​, void *data​

The data​ pointer is either a client memory pointer or an offset into a buffer object. The switch for this is based on whether a buffer object is bound to the GL_PIXEL_PACK/UNPACK_BUFFER binding, depending on whether the pixel transfer is a pack or unpack operation. For ease of discussion, let us call this "client memory" regardless of whether it refers to a buffer object or not.
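
For example, uploading from a bound buffer object instead of client memory looks like this sketch (pbo, width and height are placeholders):

    // While a buffer is bound to GL_PIXEL_UNPACK_BUFFER, the data parameter
    // is interpreted as a byte offset into that buffer, not a pointer.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);  // offset 0
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);                // back to client pointers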

The format and type parameters describe the format of a single pixel. These do not refer to the internal format of the texture object (in the case of glTexImage* calls). They also do not completely describe the pixel format in client memory (see the Pixel transfer parameters section below for more on that).

Pixel format

Pixels of the client data can be color values, depth values, combined depth/stencil values, or just stencil values. Color values can have up to four components: R, G, B and A. Depth and stencil values only have one component. Combined depth/stencil values have two components.

The format​ parameter of a pixel transfer function defines the following:

  • The basic type of data that is being read/written from/to: Color, depth, stencil, or depth/stencil. This must match the image format of the image being read/written from/to.
  • The order of the individual components within each pixel.
  • For color values, whether or not the data should be converted to/from floating-point values when being read/written. For more details, see below.

If only depth values are being transferred, then GL_DEPTH_COMPONENT is used. If only stencil values are being transferred, then GL_STENCIL_INDEX is used. If combined depth/stencil values are being transferred, then GL_DEPTH_STENCIL is used. The latter can only be used with Textures that explicitly use a depth/stencil format, or in Framebuffer pixel read operations for Framebuffer Objects that have an explicit depth/stencil format.

For color formats, there are more possibilities. GL_RED, GL_GREEN, and GL_BLUE represent transferring data for those specific components (GL_ALPHA cannot be used). GL_RG represents two components, R and G, in that order. GL_RGB and GL_BGR represent those three components, with GL_BGR being in reverse order. GL_RGBA and GL_BGRA represent those components; the latter reverses the order of the first three components. These are the only color formats supported (note that there are ways around that).

All of the above format specifiers implicitly mean that the pixels are floating-point values. For integer pixel types, using a floating-point format means that the pixels will be assumed to be normalized integers, and thus they will be interpreted as normalized values.

If you want to transfer integral data to integral image formats, you must suffix the pixel format with "_INTEGER". This states that the client-side pixel data is integer rather than floating-point. You should only use the "_INTEGER" format suffix with integral image formats.

Note: Values for the format​ parameter look a lot like image formats. They are not! Do not confuse the two. While in many cases they must match to some degree, they do completely different things. If you always use sized image formats for texture, then they will never match, since the format​ parameter cannot have a size.

Pixel type

The type​ parameter of a pixel transfer function defines how many bits each of the components defined by the format​ take up. There are two kinds of type​ values: values that specify each component as a separate byte value, or values that pack multiple components into a single value.

For example, a GL_RGBA format combined with a GL_UNSIGNED_BYTE type means that each pixel will take up 4 unsigned bytes. The first byte will be R, the second will be G, and so on. GL_UNSIGNED_BYTE as a type is a per-component type; each component has this size.

The possible per-component formats use the enumerators for OpenGL types:

  • GL_(UNSIGNED_)BYTE: 1 byte
  • GL_(UNSIGNED_)SHORT: 2 bytes
  • GL_(UNSIGNED_)INT: 4 bytes
  • GL_HALF_FLOAT: 2 bytes
  • GL_FLOAT: 4 bytes

However, there are packed arrangements of pixel data that are useful, where each component is packed into non-byte-length values. A common example is a 16-bit RGB color, where the red and blue components take up 5 bits and the green is 6.

To specify this kind of data in OpenGL, we use a packed type​ value. Packed type fields are specified as follows:

GL_[base type​]_[size1​]_[size2​]_[size3​]_[size4​](_REV​)

The parentheses represent an optional part of the name.

The base type​ is the OpenGL type enumerator name of the fully packed value. These values are always an unsigned integer type, of a size large enough to hold a whole color. So the 5-6-5 RGB colors would be stored into a 16-bit unsigned integer, which means using UNSIGNED_SHORT.

The size​ values represent the sizes of the components (in bits), in that order. We want the components to be 5 for the first, 6 for the second, and 5 for the third. Since there is no fourth component, there is no size4​ value.

Therefore, the type​ that represents 5-6-5 colors is GL_UNSIGNED_SHORT_5_6_5.

By default the components are laid out from msb (most-significant bit) to lsb (least-significant bit). However, if the type​ has a _REV​ at the end of it, the component order is reversed, and they are laid out from lsb-to-msb. In "REV" mode, the first component, the one that matches the first 5, would go into the last component specified by the format​.

So if we have a GL_RGB format combined with the GL_UNSIGNED_SHORT_5_6_5_REV type, the blue component goes into the first 5 bits of the color value. Note that this is functionally identical to GL_BGR with GL_UNSIGNED_SHORT_5_6_5.
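
A sketch of how one such pixel is packed (with GL_RGB and GL_UNSIGNED_SHORT_5_6_5, red occupies the most significant 5 bits; Pack565 is a hypothetical helper):

    // Pack 5-bit red, 6-bit green and 5-bit blue into one 16-bit value.
    unsigned short Pack565(unsigned int r5, unsigned int g6, unsigned int b5)
    {
        return (unsigned short)((r5 << 11) | (g6 << 5) | b5);
    }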

With the exception of 2 special cases, the number of components in the format​ must match the number of components provided by a packed type​. Some of the packed types even put restrictions on the component ordering, or the kinds of components the format​ can be used with.

OpenGL defines the possible sizes. They are:

  • 3_3_2 (2_3_3_REV): unsigned bytes. Only used with GL_RGB.
  • 5_6_5 (5_6_5_REV): unsigned shorts. Only used with GL_RGB.
  • 4_4_4_4 (4_4_4_4_REV): unsigned shorts.
  • 5_5_5_1 (1_5_5_5_REV): unsigned shorts.
  • 8_8_8_8 (8_8_8_8_REV): unsigned ints.
  • 10_10_10_2 (2_10_10_10_REV): unsigned ints.
  • 24_8 (no _REV): unsigned ints. Only used with GL_DEPTH_STENCIL.
  • 10F_11F_11F_REV (no non-REV): unsigned ints. These represent floats, and can only be used with GL_RGB. This should only be used with images that have the GL_R11F_G11F_B10F image format.
  • 5_9_9_9_REV (no non-REV): unsigned ints. Only used with GL_RGB; the last component (the 5; it comes last because this is a _REV type) does not directly map to a color value. It is a shared exponent. Only use this with images that have the GL_RGB9_E5 image format.

There is one very special packed type​ field. It is GL_FLOAT_32_UNSIGNED_INT_24_8_REV. This can only be used in tandem with images that use the GL_DEPTH32F_STENCIL8 image format. It represents two 32-bit values. The first value is a 32-bit floating-point depth value. The second breaks the 32-bit integer value into 24-bits of unused space, followed by 8 bits of stencil.

Pixel transfer parameters

The format and type parameters describe only the representation of a single pixel of data. The layout of the data in client memory is otherwise controlled by various global parameters, set by the glPixelStore[if] functions, in combination with the endianness of the client.

Pack and unpack operations (reads from OpenGL memory and writes to OpenGL memory, respectively) use different sets of parameters to control the layout of pixel data. All pack (read) parameters begin with GL_PACK, while all unpack (write) parameters begin with GL_UNPACK. Both kinds of operations have the same parameters which have the same meaning, but they only affect that particular kind of operation.

The GL_PACK_ALIGNMENT parameter will not affect any uploads (unpack) to OpenGL memory.

Pixel layout

The layout of pixel data is as follows.

The data is arranged in "rows". Each row represents a horizontal span in the pixel transfer, based on the width​ parameter in the transfer operation. Each pixel within a row is directly adjacent to the other pixels. So there is no space between pixels.

If the format​ is GL_RGB, and the type​ is GL_UNSIGNED_BYTE, then the size of a pixel is 3. A width​ of 16 pixels means that the total byte length of a single row of pixels is 48 bytes. If the format​ were GL_RGBA, then the pixel size would be 4 and the size of a row would be 64.

If the pixel transfer operation is two-dimensional or higher, then there will be height​ number of rows, where height​ is a parameter of the pixel transfer function. The first row is the bottom of the image in OpenGL space. The next row is the row above that, and so on.

Unlike pixels however, rows are not necessarily directly contiguous in memory. The bytes on each row must begin on a specific alignment. This alignment is user defined with the GL_PACK/UNPACK_ALIGNMENT parameter. This value can be 1, 2, 4, or 8.

For example, if the format​ is GL_RGB, and the type​ is GL_UNSIGNED_BYTE, and the width​ is 9, then each row is 27 bytes long. If the alignment is 8, this means that the second row begins 32 bytes after the first. If the alignment is 1, then the second row begins 27 bytes after the first row.
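
That row-to-row distance can be computed as in the sketch below (RowLength is a hypothetical helper):

    // Byte offset from the start of one row to the start of the next,
    // given the GL_PACK/UNPACK_ALIGNMENT value (1, 2, 4, or 8).
    int RowLength(int width, int bytesPerPixel, int alignment)
    {
        int row = width * bytesPerPixel;                         // tightly packed size
        return ((row + alignment - 1) / alignment) * alignment;  // round up
    }

    // RowLength(9, 3, 8) == 32 and RowLength(9, 3, 1) == 27, matching the
    // example above. glPixelStorei(GL_UNPACK_ALIGNMENT, 1) requests the
    // tightly packed case for uploads.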

If the pixel transfer operation is three-dimensional, then there is a depth​ as well as width​ and height​. This changes nothing about the layout; it only changes how many rows there are. Instead of height​ rows, there are height​ * depth​ rows.

Endian issues

Client pixel data, the "packed" data, is always in client byte ordering. So on a little-endian machine, unsigned integers are represented in client memory in little-endian order. On a big-endian machine, unsigned integers are represented in big-endian order.

If you wish to change this, you can use glPixelStore to set GL_PACK/UNPACK_SWAP_BYTES to GL_TRUE. This will cause OpenGL to perform byte swapping on the type​ values from the platform's native endian order to the order expected by OpenGL.

To clarify how client memory layout is determined based on format​, type​, GL_PACK/UNPACK_SWAP_BYTES, and client byte ordering, here are a few examples with RGBA 8888 pixel data:

Order (in client memory)   format   type               SWAP_BYTES   Client byte order
RGBA                       RGBA     UNSIGNED_BYTE      <any>        <any>
RGBA                       RGBA     UINT_8_8_8_8_REV   FALSE        Little-endian
BGRA                       BGRA     UNSIGNED_BYTE      <any>        <any>
BGRA                       BGRA     UINT_8_8_8_8_REV   FALSE        Little-endian
ABGR                       RGBA     UINT_8_8_8_8       FALSE        Little-endian
ARGB                       BGRA     UINT_8_8_8_8       FALSE        Little-endian
ABGR                       RGBA     UINT_8_8_8_8_REV   TRUE         Little-endian
ABGR                       RGBA     UINT_8_8_8_8_REV   FALSE        Big-endian

Sub-image selection

By default, a pixel transfer reads or writes a complete rectangle of pixels. The glPixelStore parameters GL_PACK/UNPACK_ROW_LENGTH, GL_PACK/UNPACK_SKIP_PIXELS and GL_PACK/UNPACK_SKIP_ROWS can be used to select a rectangular sub-region of a larger image in client memory: ROW_LENGTH overrides the transfer's width when computing the byte length of each row, while the SKIP parameters offset the start of the transfer by a number of pixels and rows, respectively.

Format conversion

Pixels specified by the user must be converted between the user-specified format (with format​ and type​) and the internal representation controlled by the image format of the image.

The pixels in the client memory are either in some form of floating-point representation or integral values. The floating-point forms include normalized integers, whether signed or unsigned. If the format​ parameter does not specify the "_INTEGER" suffix, then all integer values are assumed to be normalized integers. If the format​ parameter specifies "_INTEGER", but the type​ is of a floating-point type (GL_FLOAT, GL_HALF_FLOAT, or similar), then an error results and the pixel transfer fails. Also, if "_INTEGER" is specified but the image format is not integral, then the transfer fails.

When data is being transferred to an image, pixel values are converted to either floating-point or integer values. If the image format is normalized, the values that are written are clamped to [0, 1] and normalized. If the image format is integral, then the integral input values are copied verbatim.

This process is reversed for writing to client data.

Note: If the OpenGL implementation can get away with it, it will not do the conversion. So if you upload normalized integers to a normalized integer internal format, the implementation won't bother with the conversion since it would be a no-op. Always try to match the given format with the image format.
