Introduction to the C64

University of Washington Image Computing Library

 

Yongmin Kim, Ph.D.

Professor and Director

Image Computing Systems Laboratory

University of Washington
Seattle, WA 98195 U.S.A.

 

1.0 Scope

 

This document describes the TMS320C64x University of Washington Image Computing Library (C64 UWICL), which consists of a set of image computing functions for the TMS320C64x. The document describes the goals and features of the C64 UWICL, its architecture, the image computing functions supported, the quality assurance procedure, the documentation and technical support provided, the availability of the C64 UWICL, and finally a list of proposed functions for the library.

 

2.0 Goals and Features of the C64 UWICL

 

The major goal of the C64 UWICL is to help the C64x DSP family succeed technologically and commercially, and to help companies adopting C64x technology (1) reduce time-to-market, (2) control software development costs, and (3) make a successful transition to mediaprocessor-based products. It aims to accomplish this by (1) providing an efficient library of core low-level image computing algorithms to the C64 UWICL Consortium member companies, so that higher-level algorithms and applications can be developed and quickly integrated on C64x-based target systems with minimal custom coding, (2) expanding, upgrading, and supporting the library to facilitate the widespread use and the technical and commercial success of the C64x, and (3) training and educating students in C64x programming and developing educational materials for VLIW-based mediaprocessors.

 

Although compiler technology has advanced significantly and mediaprocessors have become much easier to program over the last 10 years, programmers must still fully understand both the underlying architecture and the algorithms and applications in order to map them onto the architecture in a highly optimized fashion. We at the Image Computing Systems Laboratory (ICSL) of the University of Washington have expertise in various imaging applications, low-level imaging algorithms, and mediaprocessor architectures. Using the C64 UWICL, companies should be able to implement their applications flexibly and in a timely fashion, resulting in reduced time-to-market and lower development costs. With the library, programmers do not need to understand every detail of the C64x architecture to make full use of its processing power. In our experience, this kind of infrastructure, which reduces the low-level programming burden on end users and application developers, is essential for the success of high-performance programmable mediaprocessors.

 

3.0 Architecture of the C64 UWICL

 

Most functions in the C64 UWICL will consist of image computing tight loops only. In typical imaging applications, data are first brought from external memory into on-chip memory, and, depending on the application, one or more tight loops operate on that set of data before the results are sent back to external memory. In our previous libraries for the TMS320C80 and MAP-CA, we provided all functions as hierarchical modules; in the MAP-CA UWICL, for example, each function had a function-level module that handles the data flow and a tight-loop-level module that processes the image data brought on-chip. Although this proved to be good practice for completeness and benchmarking, it turned out to be inefficient when an application needed tight, optimal integration of UWICL functions with other custom functions. In most cases, application developers benefited from the tight-loop modules but ended up developing their own data flow code, wrapped around multiple tight loops, to use the DMA bandwidth efficiently. In addition, the large L2 memory in the C64x offers more flexibility and makes the data flow more application-specific. Furthermore, tight loops that assume any knowledge of the data flow cannot be integrated easily with other data flow schemes. Hence, we propose to provide the imaging functions in this library as tight loops. Tight loops in the C64 UWICL assume no specific data flow scheme, so they can be integrated easily with any data flow scheme or even with cache-based implementations.
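As a rough illustration of a tight loop that assumes no data flow scheme, the sketch below transposes a block of 8-bit pixels. The name transpose8_tight, its parameters, and the use of pitch (row stride) arguments are assumptions made for this example, not the library's actual interface, and the code is plain portable C rather than optimized C64x code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tight loop (not the actual UWICL interface): transpose a
 * width x height block of 8-bit pixels. The loop sees only pointers,
 * pitches (bytes per row), and dimensions, so it does not care whether
 * the block is a tile in an on-chip buffer filled by DMA or a region of
 * a full image in cached external memory. */
void transpose8_tight(const uint8_t *src, size_t src_pitch,
                      uint8_t *dst, size_t dst_pitch,
                      size_t width, size_t height)
{
    assert(src != NULL && dst != NULL);
    assert(src_pitch >= width && dst_pitch >= height);

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            /* Source pixel (x, y) becomes destination pixel (y, x). */
            dst[x * dst_pitch + y] = src[y * src_pitch + x];
        }
    }
}
```

Because the data placement is entirely the caller's choice, the same routine could be called once on a whole image or repeatedly on tiles staged by application-specific data flow code.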

 

 

Tight loops in the C64 UWICL will not directly implement the interfaces of TI’s eXpressDSP Algorithm Standard (XDAIS), such as IALG or IDMA. However, none of the tight loops will violate the rules and guidelines of the eXpressDSP framework, and they can be integrated easily into higher-level algorithms that implement the algorithm standard without any performance or compliance trade-offs. Developers of XDAIS-compliant algorithms can use these tight loops directly in their code, which exposes the IALG and IDMA interfaces, while application developers can integrate these compliant algorithms into their framework and implement the data flow scheme themselves in the way that is most efficient for their particular application. Figure 1 illustrates this.

 

The data flow code provided with the previous libraries proved most useful as a template for application-level data flow code development. Thus, we will implement some of the C64 UWICL functions with data flow code incorporated via two hierarchical modules. In this case, the data-flow-level module will be provided as an example for application developers who wish to develop their own data flow scheme. The tight loops of these two-level functions can be fully decoupled from their data flow and used in an application framework just like the other C64 UWICL tight loops. So far, 11 functions have been selected from a variety of imaging functions to cover a broad range of data flow templates.
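The two-level structure described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the function names, the block size, and especially the use of memcpy() as a stand-in for the DMA transfers that real C64x data flow code would issue (typically with double buffering so that transfers overlap with computation).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_PIXELS 256  /* assumed on-chip block size, chosen arbitrarily */

/* Hypothetical tight-loop-level module: invert n 8-bit pixels. */
static void invert8_tight(const uint8_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(255u - src[i]);
}

/* Hypothetical data-flow-level module. The static arrays stand in for
 * on-chip buffers, and memcpy() stands in for DMA transfers. */
void invert8_dataflow(const uint8_t *ext_src, uint8_t *ext_dst, size_t n)
{
    static uint8_t onchip_in[BLOCK_PIXELS];
    static uint8_t onchip_out[BLOCK_PIXELS];

    assert(ext_src != NULL && ext_dst != NULL);

    for (size_t off = 0; off < n; off += BLOCK_PIXELS) {
        /* The last block may be shorter than BLOCK_PIXELS. */
        size_t len = (n - off < BLOCK_PIXELS) ? (n - off) : BLOCK_PIXELS;

        memcpy(onchip_in, ext_src + off, len);      /* external -> on-chip */
        invert8_tight(onchip_in, onchip_out, len);  /* process on-chip     */
        memcpy(ext_dst + off, onchip_out, len);     /* on-chip -> external */
    }
}
```

An application developer could discard invert8_dataflow entirely and call invert8_tight from a different block-staging scheme, which is the decoupling the text describes.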

 

Figure 1. Use of UWICL tight loops in an XDAIS-compliant framework.

 

 

4.0 Family of Functions

 

In the first stage, the C64 UWICL will consist of about 45 functions, which can be grouped into 10 categories. The following list shows different function families and the number of functions included in each family.

 

            Arithmetic functions:                  2
            Filtering functions:                   5
            Geometric manipulation functions:     10
            Transform functions:                  10
            Machine vision functions:              4
            Statistics functions:                  2
            Contrast enhancement functions:        2
            Pixel depth conversion functions:      2
            Color space conversion functions:      6
            Compression functions:                 2

 

5.0 Additional Functions To Be Developed

 

The initial 45 functions include only a few simple arithmetic, logical, or similar functions (e.g., add8, invert8). Application developers can develop such functions (e.g., image subtraction) without much effort. Hence, the C64 UWICL will initially focus on more complicated functions that require good knowledge of the algorithms and the underlying architecture. After release 1.0, which will contain the approximately 45 functions proposed in the Appendix, we plan to develop additional functions.
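To give a sense of how small such a function can be, here is a minimal sketch of 8-bit image subtraction. The name sub8_sat and the choice to clamp negative differences to zero are assumptions made for this example; an application might instead want wrap-around or absolute-difference behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the proposed function list:
 * dst[i] = max(src1[i] - src2[i], 0) for n 8-bit pixels. */
void sub8_sat(const uint8_t *src1, const uint8_t *src2,
              uint8_t *dst, size_t n)
{
    assert(src1 != NULL && src2 != NULL && dst != NULL);

    for (size_t i = 0; i < n; i++) {
        /* Clamp at zero instead of letting the result wrap around. */
        dst[i] = (uint8_t)(src1[i] > src2[i] ? src1[i] - src2[i] : 0);
    }
}
```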

 

During the lifetime of the consortium, new functions will be continuously developed and the library will be expanded. These new functions may include the following:

 

            More convolution functions
            More warping functions
            More wavelet functions
            More color space conversion functions
            More machine vision functions
            More compression functions
            Image segmentation functions
            3D volume processing/manipulation functions

           

Member companies of the C64 UWICL Consortium can also suggest new functions for their applications. The suggested functions will be collected, analyzed, prioritized, and selected by the C64 UWICL team in consultation with the Consortium Advisory Board. Every six months, development of five to ten new high-priority functions will be initiated for the ensuing 6 to 12 months, depending on the available funding and on the complexity and implementation difficulty of the selected functions. During development, each function is subject to multiple design reviews, modifications, and optimizations. After the functions are developed and tested under the established protocol, they will be included in the next release of the C64 UWICL.

 

6.0 Quality Assurance

 

All functions in the C64 UWICL are subject to detailed design review, white-box testing, and black-box testing. Prior to the implementation of any new C64 UWICL function, its high-level design is reviewed critically during regular design review meetings. Once the design is approved and the function is coded, the programmer tests the function (white-box testing) under various conditions on the simulator and hardware and validates its outputs using verification utilities written for that function. Finally, one UWICL team member is designated as the quality control person, who tests each function under various conditions (black-box testing) and reports any problems and suggestions to the function’s author for corrections and improvements. Details of the algorithm verification will be documented in the C64 UWICL Quality Assurance Guide.
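A verification utility of the kind mentioned above could, for example, compare an optimized routine against a straightforward reference implementation. The sketch below is only an assumption about what such a utility might look like: the names add8_ref, add8_opt, first_mismatch, and verify_add8 are hypothetical, and the branch-free loop merely stands in for hand-optimized C64x code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Straightforward reference: saturating 8-bit addition. */
static void add8_ref(const uint8_t *a, const uint8_t *b, uint8_t *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned s = (unsigned)a[i] + b[i];
        d[i] = (uint8_t)(s > 255u ? 255u : s);
    }
}

/* Stand-in for an optimized version: branch-free saturation. */
static void add8_opt(const uint8_t *a, const uint8_t *b, uint8_t *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned s = (unsigned)a[i] + b[i];  /* 0..510                     */
        unsigned mask = 0u - (s >> 8);       /* all ones on overflow, else 0 */
        d[i] = (uint8_t)((s | mask) & 0xFFu);
    }
}

/* Returns the index of the first differing element, or -1 if equal. */
long first_mismatch(const uint8_t *x, const uint8_t *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (x[i] != y[i])
            return (long)i;
    return -1;
}

/* Runs both versions over every pair of 8-bit input values.
 * Returns 1 if the outputs agree everywhere, 0 otherwise. */
int verify_add8(void)
{
    uint8_t a[256], b[256], ref[256], opt[256];

    for (unsigned v = 0; v < 256; v++) {
        for (unsigned i = 0; i < 256; i++) {
            a[i] = (uint8_t)i;
            b[i] = (uint8_t)v;
        }
        add8_ref(a, b, ref, 256);
        add8_opt(a, b, opt, 256);
        if (first_mismatch(ref, opt, 256) != -1)
            return 0;
    }
    return 1;
}
```

Exhaustive comparison is feasible here because the input space of a per-pixel 8-bit operation is small; functions with larger input spaces would need selected or randomized test images instead.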

 

7.0 Documentation

The following documents will be included with the release of the C64 UWICL:

 

Design Guide: details the C64 UWICL software architecture.

Style Guides: standards for both C and assembly language coding.

Man pages: HTML man pages for each C64 UWICL function, providing algorithm specifics, synopsis, restrictions, and function parameters.

README files: a text file provided with each function source, describing how to compile the function, its performance, and restrictions.

Quality Assurance (QA) Guide: documents the C64 UWICL QA and support procedures.

Installation Guide: describes the procedures for installing and using the C64 UWICL.

Changes document: lists the changes in the C64 UWICL from previous releases.

 

8.0 Support

 

The C64 UWICL team will provide on-line bug support and on-line general support to Consortium member companies. Member companies of the C64 UWICL Consortium can report bugs to uwicl-support@icsl.ee.washington.edu; each report should describe the nature of the problem. The C64 UWICL team will attempt to respond to bug reports and supply fixes within one to four weeks of receipt. Member companies can also send general support questions to uwicl-support@icsl.ee.washington.edu, and C64 UWICL team members will respond within five working days. Details of the support procedures will be documented in the Quality Assurance Guide.

 

9.0 Availability of UWICL

A sponsorship fee, anticipated at $32,000 per year per Consortium member company, supports the C64 UWICL Consortium starting in January 2002. Each member company immediately receives a site license to use the C64 UWICL source code in R&D under the terms of the Sponsorship Agreement. When a company is ready to introduce products using the C64 UWICL, it has two options for obtaining a commercial license: a paid-up non-exclusive license or a royalty-bearing non-exclusive license. The forms of both are attached to the Sponsorship Agreement and are available for inspection prior to joining the Consortium.

 

The sooner a company joins the Consortium, the more advantageous the licensing terms become. For every year a company has been a member of the Consortium before a commercial license agreement is executed, the paid-up license fee or royalty is reduced, up to a maximum reduction of 60%.

 

The Consortium Advisory Board (CAB) provides advice regarding the choice of C64 UWICL functions, recommends areas of interest, and gives feedback on the overall Consortium operation. Each member company will have one vote. The CAB meeting will be held twice a year at the University of Washington.

 

C64 UWICL Schedule:

Official C64 UWICL version 1.0 by March 2002.

First C64 UWICL Consortium Advisory Board Meeting by March 2002.

Biannual expansion and upgrade in March and August/September of every year.

 

Consortium Director and Contact Person:

Professor Yongmin Kim

Director of the C64 UWICL Consortium

Departments of Bioengineering and Electrical Engineering

University of Washington, Box 352500

Seattle, WA  98195-2500 U.S.A.

Tel: (206) 685-2271, Fax: (206) 543-3842, Email: ykim@u.washington.edu

WWW site: http://icsl.ee.washington.edu/projects/c64x-uwicl

10.0 Appendix: Proposed Function List of the C64 UWICL

 

Table 1 lists the functions proposed for the initial release of the C64 UWICL. If a function is also present in TI’s C64x ImageLib, the differences are noted in the Notes column. The table also specifies whether a function is provided with data flow code (in addition to the tight loop). Note that some functions may consist of more than one tight loop (e.g., C with intrinsics and assembly, or different tight loops for different wavelet tap sizes) although they are listed as one function.

Table 1. List of proposed functions in the TMS320C64x UWICL

Category | Function Name | Notes | Data flow?
Arithmetic | 8-bit image addition | | Yes
Arithmetic | 8-bit image invert | | Yes
Filtering | FIR filter (16-bit input, 16-bit output) | 16-bit coefficients | No
Filtering | FIR filter (8-bit input, 8-bit output) | 16-bit coefficients | No
Filtering | 8-bit 2D convolution | Generalized kernel size with 16-bit coeffs (TI has 3x3 with 8-bit coeffs) | Yes
Filtering | 16-bit 2D convolution | Generalized kernel size with 16-bit coeffs | No
Filtering | 3x3 median filter | Similar to TI's | No
Machine Vision | binary dilate | Generalized structuring element (TI has 3x3) | Yes
Machine Vision | binary erode | Generalized structuring element (TI has 3x3) | No
Machine Vision | distance transform | | No
Machine Vision | connected components | | No
Transform | 2D complex FFT | | Yes
Transform | 2D complex IFFT | | No
Transform | 2D real FFT | | No
Transform | 2D real IFFT | | No
Transform | D4 wavelet | Daubechies-4 wavelet | Yes
Transform | inverse D4 wavelet | | No
Transform | generalized wavelet | Four different versions (4-tap, 8-tap, 12-tap, 16-tap, any padding can be supported), TI's implementation assumes 8-tap only, and employs circular padding | No
Transform | generalized inverse wavelet | TI's implementation assumes 8-tap, and circular padding | No
Transform | 8x8 DCT | Similar to TI's | No
Transform | 8x8 IDCT | Similar to TI's | No
Contrast Enhancement | 8-bit histogram equalization | | No
Contrast Enhancement | window & level | | No
Statistics | 8-bit histogram generation | Similar to TI's | Yes
Statistics | template correlation | Uses a fast algorithm developed by ICSL, supports very large template sizes (TI's correlation supports 3x3, uses classical method) | No
Geometric Manipulation | 8-bit affine warp | | Yes
Geometric Manipulation | 16-bit affine warp | | No
Geometric Manipulation | 8-bit perspective warp | | No
Geometric Manipulation | rotate 90 degrees | | No
Geometric Manipulation | binary image rotate 90 degrees | | No
Geometric Manipulation | 8-bit transpose | | Yes
Geometric Manipulation | 16-bit transpose | | No
Geometric Manipulation | 32-bit transpose | | No
Geometric Manipulation | flip along x-axis | | Yes
Geometric Manipulation | flip along y-axis | | Yes
Pixel Depth Conversion | pack from 8-bit to 1-bit | | No
Pixel Depth Conversion | unpack from 1-bit to 8-bit | | No
Color Space Conversion | YCbCr 4:2:2 to HSI | TI has 422toRGB565, our RGB output is 8888 RGBX | No
Color Space Conversion | RGB to HSI | | No
Color Space Conversion | YCbCr 4:2:2 to 4:2:0 | | No
Color Space Conversion | YCbCr 4:2:0 to 4:2:2 | | No
Color Space Conversion | YCbCr 4:2:2 to RGB | | No
Color Space Conversion | YCbCr 4:2:0 to RGB | | No
Video Compression | motion estimation for a 16 x 16 block | Employs fast search algorithm (TI's version is full search) |