The Video and Image Processing group considers all aspects of video and image processing, including the development of and experimentation with novel sensors and alternative imaging technologies such as temperature and depth sensing devices.
This group effectively spans all of the school's major centres and labs: MDRI and FMSL focus on approaches that have a biomedical flavour and depend on mathematical analysis, while KIT focuses on brain imaging and multimodal communication.
Current projects conducted by the Video and Image Processing (VIP) research group include:
- Colour Filter Array Demosaicking
Most digital cameras on the market use a single image sensor with a Bayer colour filter array to capture colour images. Colour filter array demosaicking refers to the determination of missing colour values at each pixel to produce a full colour image. Our VIP research group has been active in this research area, and several algorithms have been developed and published in a number of international papers. The results have been shown to outperform many existing techniques.
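To illustrate what "determining missing colour values" means, here is a minimal sketch of the simplest baseline, bilinear interpolation of the green channel. It is not one of the group's published algorithms. It assumes an RGGB Bayer layout (green samples where row + column is odd) and a mosaic of at least 2x2 pixels stored as a list of rows; the function names are hypothetical.

```python
def is_green(row, col):
    """In an RGGB Bayer layout, green samples sit where row + col is odd."""
    return (row + col) % 2 == 1


def interpolate_green(mosaic):
    """Return a full-resolution green plane from a raw RGGB mosaic.

    Measured green samples are copied; at red and blue sites the green
    value is the average of the in-bounds 4-connected neighbours, which
    are all green sites in a Bayer pattern.
    """
    height, width = len(mosaic), len(mosaic[0])
    green = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if is_green(r, c):
                green[r][c] = float(mosaic[r][c])
                continue
            neighbours = [
                mosaic[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < height and 0 <= cc < width
            ]
            green[r][c] = sum(neighbours) / len(neighbours)
    return green
```

The red and blue planes can be filled in the same way from their own sample sites. Plain averaging like this tends to blur edges and produce colour fringes, which is the weakness that more advanced demosaicking methods aim to overcome.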
- Image Denoising
Because the size of an image sensor is limited, the area of each pixel shrinks as the resolution increases. As a result, the signal to noise ratio of the captured images decreases as the resolution increases. Image denoising within CFA demosaicking will become a crucial factor in the production of high quality, high resolution images. Our research group has been developing multi-dimensional image filtering techniques for denoising an image without blurring it.
Another problem with high resolution image sensors is that, with millions of pixels in a sensor, it is highly probable that a few will be defective due to errors in the fabrication process. While these bad pixels would normally be mapped out during manufacturing, more defective pixels, known as hot pixels, can appear over time with camera usage. Since some hot pixels still function at normal settings, they need not be permanently mapped out, because they only appear on a long exposure and/or at high ISO settings. A number of innovative techniques have been developed to tackle these problems.
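As a rough illustration of the hot pixel problem, the sketch below flags pixels that are much brighter than the median of their neighbours. This is a generic textbook-style detector, not the group's technique. It assumes a single-colour plane stored as a list of rows, and the function name and the threshold value of 50 are hypothetical.

```python
from statistics import median


def find_hot_pixels(plane, threshold=50):
    """Return (row, col) positions that exceed their neighbourhood median.

    A pixel is flagged when its value is more than `threshold` above the
    median of its in-bounds 8-connected neighbours.
    """
    height, width = len(plane), len(plane[0])
    hot = []
    for r in range(height):
        for c in range(width):
            neighbours = [
                plane[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            if neighbours and plane[r][c] - median(neighbours) > threshold:
                hot.append((r, c))
    return hot
```

Because detection runs per image, a pixel that only misbehaves on long exposures or at high ISO is corrected only in the frames where it actually shows up, rather than being mapped out permanently.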
- Super-Resolution within Colour Filter Array Demosaicking for underwater colour image sensors
Images captured underwater have poor signal to noise ratio due to poor lighting conditions. This will adversely affect the CFA demosaicking process that follows and result in colour artifacts. Moreover, underwater images have a blue/green cast and thus underwater colour correction is required for more natural-looking colour images. Another important capability to be developed for underwater applications is super-resolution image reconstruction for detailed fault detection such as for improved visualization of cracks in underwater pipeline inspection. Our proposed approach is a unified method of combining underwater colour correction, denoising, and super-resolution image reconstruction within CFA demosaicking. This unified single stage approach will significantly improve the efficiency and performance of colour image sensors for underwater applications.
- A High Dynamic Range Imaging video capture device using standard image sensors
Through High Dynamic Range Imaging (HDRI, e.g. 32 bits per colour per pixel), the full range of tones undersea can be captured in order to reproduce human perception down to the finest detail. This will provide a new dimension of image detail that has not been seen undersea before. Moreover, the extra detail will improve the accuracy of feature detection and other identification techniques for undersea applications, e.g. undersea pipeline inspection. As true HDR image sensors are still under development, it would be an important advancement in undersea imaging if an HDRI video capture device could be realized using standard off-the-shelf image sensors. Our proposed approach is to use high speed image sensors so that images captured at various apertures and exposures are combined at high speed to reconstruct high dynamic range video footage.
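The idea of combining differently exposed frames can be sketched as a weighted average of per-frame radiance estimates. The sketch below is illustrative only and is not the group's capture pipeline. It assumes a linear sensor response, 8-bit frames that differ only in exposure time (aperture is held fixed), and frames stored as lists of rows; the function names are hypothetical.

```python
def hat_weight(value, max_value=255):
    """Trust mid-tones most; fully black or saturated samples get weight 0."""
    return min(value, max_value - value)


def merge_exposures(frames, exposure_times, max_value=255):
    """Merge same-sized frames into one relative radiance map.

    Each sample contributes value / exposure_time as its radiance
    estimate, weighted by how far it is from under- or over-exposure.
    """
    height, width = len(frames[0]), len(frames[0][0])
    radiance = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            weighted_sum = 0.0
            weight_total = 0.0
            for frame, t in zip(frames, exposure_times):
                value = frame[r][c]
                weight = hat_weight(value, max_value)
                weighted_sum += weight * value / t
                weight_total += weight
            if weight_total > 0:
                radiance[r][c] = weighted_sum / weight_total
            else:
                # Every sample is black or saturated: fall back to a plain mean.
                radiance[r][c] = sum(
                    f[r][c] / t for f, t in zip(frames, exposure_times)
                ) / len(frames)
    return radiance
```

For video, the same merge would be applied to each burst of frames, which is why a sensor much faster than the output frame rate is needed.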
- Non-invasive Lizard Identification
The aim of the project is to identify individual Pygmy Bluetongue Lizards from their digital images using image identification techniques. Each lizard has a unique scale pattern, much like a fingerprint, and the development of an image recognition system to identify over 500 individual lizards is underway. This is a project in collaboration with the School of Biological Sciences.
- Darwin's Finch Beak Morphology and Photo Identification
The aim of this research is to determine the beak volume and curvature of the small, medium, and large tree finch (Darwin's finches from the Galapagos Islands) from the digital images, and to test some hypotheses from the data.
- An Expert System for Living Creature Research
An expert system has been successfully developed for our other project to correctly identify lizards based on their individual features. Other capabilities of the expert system under development include the symmetrical analysis of the lizard scale patterns and its relationship to genetic behaviour. Our proposed expert system could also be modified and enhanced to identify other species, such as the Malleefowl, for study and research within their habitat. It would be particularly useful for the study and tracking of smaller living creatures which cannot be tagged or are unable to carry a GPS device.
- Intraocular Pressure Monitoring System
Glaucoma is the second leading cause of blindness in the world and the leading cause of blindness amongst Africans. A major risk factor in glaucoma is elevated intraocular pressure (IOP). All current treatments, whether medical, laser or surgical, primarily act by lowering IOP. Thus, the accurate monitoring of IOP is an essential clinical facet of glaucoma care. One treatment option is a glaucoma drainage implant (GDI), which provides an alternative pathway for the drainage of aqueous humor to reduce IOP. A common problem encountered with GDI implantation is hypotony, which has led to the incorporation of valves into GDIs. The aim of this project is to investigate the problems in current glaucoma treatment, namely the measurement and regulation of IOP.
- Face and Facial Feature Tracking and Interpretation
A major focus of the KIT Artificial Intelligence and Language Technologies Program is the understanding and modeling of human faces and their expressions and gestures. This includes interpreting emotions from the visible face, reading lips, tracking eyes to determine what they are looking at, comparing audio and video information to determine who is talking, and modeling speech, emotions and faces accurately with our talking heads. For this work we use combinations of colour information, depth information, and conventional image processing techniques in multiple colour spaces, in particular using Red Inclusion to find the blood-hue dominated skin, and Red Exclusion to find the shadow-blue dominated features.
- Body, Body Part and Object Tracking and Interpretation
For our work in intelligent robotics, vehicles and prosthetics (e.g. wheelchairs and ride-on toys), and the KIT work in grounded language and teaching heads (that is, learning or teaching what words actually refer to in the real world), we need to be able to recognize objects and their functions. For this work we tend to use simple laboratory/building scenes and simplistic worlds with children's blocks and other toys as props, whilst much of our concentration is on understanding the human body, its body language, and its use of gesture. We use combinations of colour information and depth information, and we derive information about lighting, shadows and horizons, as well as using and fusing conventional image processing techniques in multiple colour spaces. We also make use of the Kinect for depth information.
- Point of Interest Tracking
A more general task still is to automatically determine and track points of interest in an unsupervised way, rather than requiring specific objects or features to be pointed out and labelled for supervised learning. This was originally developed as an alternative to Active Appearance Models (AAM) for body, face and feature tracking, but is also being explored for speeding up optical flow and stereopsis calculations.
- Estimating Camera Pose Using Vanishing Points
Real-time video processing is an essential aspect of augmented reality, as the virtual world or visualization needs to be overlaid accurately over the real world video. This requires accurate determination of camera pose, for which we use perspective information in the form of vanishing points, as explained in detail for the augmented reality project.
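As background, a vanishing point is the image point where the projections of parallel scene lines meet. The sketch below finds it for two image line segments using homogeneous coordinates, where both the line through two points and the intersection of two lines are cross products. It covers only this first step, not the pose estimation itself, and is not taken from the augmented reality project; the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(segment_a, segment_b, eps=1e-12):
    """Intersect the lines through two segments, each a pair of points.

    Returns (x, y), or None when the image lines are parallel and the
    vanishing point lies at infinity.
    """
    x, y, w = cross(line_through(*segment_a), line_through(*segment_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)
```

In practice many noisy segments would be combined with a robust estimator rather than intersecting a single pair, and the resulting vanishing points for orthogonal scene directions then constrain the camera rotation.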
Further information
We are looking for collaborators and postgraduate research students to join our research group. We would also be happy to provide more information about the School's research programs, the opportunities for higher degree study and scholarship information. For more information, please contact the research group leader Dr Jimmy Li.

