Wyner-Ziv Coding of Video

People

Goals

This research project investigates Wyner-Ziv coding of video sequences. The topics that are currently under study are:

Activities

Design of Practical Wyner-Ziv Codes

Codes for Lossless Asymmetric Distributed Source Coding


We have sought superior alternatives to punctured turbo codes for lossless distributed source coding. We constructed two classes of rate-adaptive codes based on low-density parity-check (LDPC) codes: LDPC accumulate (LDPCA) codes and sum LDPC accumulate (SLDPCA) codes. We also investigated the optimization of degree distributions and local stopping set structure for these codes. Further details and results have been published in [Varodayan_Asilomar2005] and [Varodayan_EURASIP2006]. Source code for the rate-adaptive LDPCA codes has been made public at http://www.stanford.edu/~divad/software.html. In [Varodayan_PCS2006], we modified the LDPCA decoder to handle not only independent, identically distributed sources but also sources modeled as one-dimensional Markov chains and two-dimensional Markov random fields.
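The rate-adaptive idea behind LDPCA codes can be illustrated with a minimal sketch. It assumes the usual description of the construction: source bits are summed modulo 2 at syndrome nodes, and the syndrome bits are then accumulated modulo 2 before being sent in increments. The toy random matrix `H`, the function names, and the "every `step`-th bit" schedule are illustrative assumptions, not the optimized codes or the released software.

```python
import numpy as np

def ldpca_encode(H, x):
    """Return accumulated syndrome bits for source bits x (toy sketch)."""
    syndrome = H.dot(x) % 2          # sum source bits at each syndrome node, mod 2
    return np.cumsum(syndrome) % 2   # accumulate the syndrome bits, mod 2

def syndromes_from_accumulated(a):
    """Undo the accumulator: consecutive differences mod 2."""
    return np.diff(np.concatenate(([0], a))) % 2

def transmitted_subset(a, step):
    """Hypothetical schedule: send only every `step`-th accumulated bit."""
    return a[step - 1::step]

rng = np.random.default_rng(0)
n = 16
# Toy sparse binary matrix; an assumption for illustration, not an optimized code.
H = (rng.random((n, n)) < 0.2).astype(np.int64)
x = rng.integers(0, 2, size=n)
a = ldpca_encode(H, x)
```

With all accumulated bits, the decoder recovers the full syndrome. With every second bit, differencing yields sums of adjacent syndrome pairs, which are the syndromes of a lower-rate code obtained by merging check nodes. The encoder can therefore raise the rate simply by sending more accumulated bits.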

Quantizer Design for Wyner-Ziv Coding

We extended the Lloyd algorithm to design optimal quantizers for the Wyner-Ziv problem, and applied it to scalar and vector data with Gaussian statistics and to video samples. We investigated, theoretically and experimentally, the behavior of Wyner-Ziv quantizers at high rates. Results have been published in [Rebollo_DCC2003] and [Rebollo_Asilomar2003].

In [Rebollo_Asilomar2004] and [Rebollo_ICASSP2005], we further extended the Lloyd algorithm and the high-rate analysis to quantization design for distributed coding of noisy observations of unseen sources with decoder side information. This study dramatically broadens the range of applications, including not only joint distributed coding and denoising, but also applications using rate-constrained statistical inference with side information, or even quantization of the side information itself. A generalization of the information-theoretic rate-distortion bounds for Wyner-Ziv coding of noisy sources in the quadratic-Gaussian case was shown in [Rebollo_DCC2005].

Recent improvements on our extension of the Lloyd algorithm [Rebollo_IT2006] enabled us to design quantizers for more complex coding settings potentially involving several encoders and decoders, such as network distributed coding of noisy sources and broadcast with side information. In addition, we discovered that the same unified theoretic framework of quantizer design includes, as special cases, well-known solutions to problems not directly related to source coding such as Gauss mixture modeling.

Finally, also in [Rebollo_IT2006], we developed techniques based on linear discriminant analysis and principal component analysis to reduce the dimensionality of the data involved in distributed coding and distributed classification. These techniques give us insight into the design of robust hash codes used in a Wyner-Ziv coder of video to perform motion estimation while keeping the complexity shifted towards the decoder.

Signal Transforms for Wyner-Ziv Coding

Using the high-rate analysis of Wyner-Ziv quantization, we investigated Wyner-Ziv coding with block transforms applied to both the source data and the side information. We considered optimal bit allocation of the transform coefficients, overall rate-distortion behavior and transform coding gain, for Gaussian and non-Gaussian statistics. This was documented in [Rebollo_Asilomar2003].

An extension to transform coding of noisy sources with decoder side information was presented in [Rebollo_Asilomar2004]. Improvements and extensions on the transformation of noisy sources for distributed coding were shown in [Rebollo_EURASIP2006].

Intraframe Video Coding with Interframe Decoding

Pixel-Domain Wyner-Ziv Coder

We incorporated spatial sub-sampling in our pixel-domain Wyner-Ziv coder to exploit the spatial dependencies within a frame and reduce the bit rate. We also used the pixel-domain Wyner-Ziv coder in a distributed compression scheme for large camera arrays. We compared the compression performance and encoding complexity of the proposed scheme to that of individually compressing the camera views using JPEG2000 or shape-adaptive DCT image coding. The findings have been reported in [Zhu_SSP2003].

Transform-Domain Wyner-Ziv Coder

We implemented a transform-domain Wyner-Ziv coder. To encode a video frame, we applied a blockwise discrete cosine transform (DCT) on the frame and performed Slepian-Wolf coding on the bitplanes of the coefficients. We investigated different transform block sizes and bit allocation strategies between coefficients. We compared the compression efficiency of the transform-domain intraframe encoder-interframe decoder video compression system to the pixel-domain scheme, to conventional DCT-based intraframe coding and to H.263 interframe coding. The proposed system and results are reported in detail in [Rebollo_Asilomar2003] and [Aaron_VCIP2004].
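The front end of such an encoder can be sketched as follows: a blockwise DCT, uniform quantization of one coefficient band, and extraction of its bitplanes. The 4x4 block size, the quantizer step, the offset-binary index mapping, and all function names are assumptions chosen for illustration. The Slepian-Wolf coder that would compress each bitplane is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def block_dct(frame, n=4):
    """Apply an n x n DCT to every non-overlapping block of the frame."""
    C = dct_matrix(n)
    h, w = frame.shape
    blocks = frame.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    return C @ blocks @ C.T   # shape (h/n, w/n, n, n)

def coefficient_bitplanes(coeffs, u, v, step, nbits):
    """Quantize band (u, v) over all blocks; return bitplanes (MSB first) and indices."""
    band = coeffs[:, :, u, v].ravel()
    offset = 2 ** (nbits - 1)   # offset-binary mapping (an assumption)
    q = np.clip(np.floor(band / step).astype(np.int64) + offset, 0, 2 ** nbits - 1)
    planes = [(q >> b) & 1 for b in range(nbits - 1, -1, -1)]
    return planes, q

frame = np.arange(64, dtype=float).reshape(8, 8)   # synthetic 8x8 test frame
coeffs = block_dct(frame)
planes, q = coefficient_bitplanes(coeffs, 0, 0, step=16.0, nbits=6)
```

Grouping the same coefficient band across all blocks gives each bitplane consistent statistics, which is what allows a separate number of bits to be allocated to each coefficient.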

We also used the transform-domain Wyner-Ziv coder for compression of light field images. We compared the compression efficiency and random access characteristics to those of conventional shape-adaptive image coding. The system and results are reported in [Aaron_MMSP2004].

Flexible Decoder Motion Compensation

We implemented different frame dependency schemes to generate the side information at the decoder. We investigated the trade-off between motion compensation complexity at the decoder and compression efficiency. We studied the effects of increasing the group of pictures (GOP) size on compression performance and error propagation. The results are reported in [Aaron_ICIP2003] and [Aaron_VCIP2004].

We applied hash-based motion compensation at the decoder for the Wyner-Ziv video codec. In the proposed scheme, the encoder sends hash information to help the decoder perform more accurate motion compensation. The system is described in [Aaron_ICIP2004].

In another implementation, we applied Wyner-Ziv coding only on the low frequency coefficients of the frame, which tend to have significant correlation with the corresponding coefficients from the previous frame. The high frequency coefficients are compressed by efficient run-length coding and are used at the decoder to perform motion-compensation. The proposed system and results are reported in detail in [Aaron_PCS2004].

Wyner-Ziv Residual Coder

We implemented a pixel-domain Wyner-Ziv residual video codec. In this scheme the encoder uses the previous frame as a simple reference frame and applies Wyner-Ziv coding on the residual pixels. The decoder generates better side information using compute-intensive motion estimation techniques such as motion-compensated interpolation or hash-based motion estimation. With this scheme, the encoder exploits some of the similarities between the current frame and the previous frame, while the decoder uses both the previous frame and the more sophisticated, motion-compensated side information for conditional decoding. The proposed system and results are reported in [Aaron_PCS2006].


Systematic Lossy Error Protection for Video

Pixel-Domain Wyner-Ziv Coder

We implemented a Wyner-Ziv coding scheme to protect a video waveform as follows. We generate a supplementary bitstream using pixel-domain Wyner-Ziv encoding of the video sequence. The Wyner-Ziv encoder consists of a coarse uniform quantizer followed by a turbo coder. The Wyner-Ziv decoder decodes the Wyner-Ziv bitstream using the received error-prone video frames as side information. We also implemented an embedded pixel-domain Wyner-Ziv codec, which distributes the available Wyner-Ziv bit rate among two or more Wyner-Ziv descriptions. The decoder attempts to recover the best description allowable at the given channel error probability. Our experiments and results are described in [Aaron_DCC2003] and [Aaron_ICIP2003].
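How a coarse quantizer limits the residual distortion can be shown with a small sketch. It assumes the turbo decoder has already recovered the quantization index exactly, and it uses a simple clamping rule for reconstruction. That rule, the step size, and the function names are illustrative assumptions, not necessarily the reconstruction used in the reported experiments.

```python
import numpy as np

def quantize(x, step):
    """Coarse uniform quantizer: index of the interval [q*step, (q+1)*step)."""
    return np.floor(np.asarray(x, dtype=float) / step).astype(np.int64)

def reconstruct(q, y, step):
    """Clamp the side information y to the interval selected by index q."""
    lo = q * step
    hi = lo + step
    return np.clip(y, lo, hi)

step = 16.0
x = np.array([100.0, 37.0, 200.0])   # original pixel values (encoder side)
y = np.array([103.0, 90.0, 150.0])   # received pixels; the last two are corrupted
x_hat = reconstruct(quantize(x, step), y, step)
```

When the received pixel already lies in the correct interval it is left unchanged. Otherwise it is pulled back to the nearest interval boundary, so the error after decoding never exceeds one quantizer step.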

Wyner-Ziv Coding for Error-Resilient Digital Video Broadcasting

We applied the systematic lossy source-channel coding framework to error-resilient MPEG-2 broadcasting. An MPEG bitstream is transmitted over an error-prone channel without error protection. In addition, we generate a supplementary bitstream using Wyner-Ziv encoding. Unlike in the pixel-domain case, the Wyner-Ziv encoding consists of generating a coarsely quantized video bitstream using a conventional hybrid video coder, applying Reed-Solomon codes, and transmitting only the parity symbols. In the event of channel errors, the Wyner-Ziv decoder decodes these parity symbols using the error-prone conventionally decoded MPEG video sequence as side information. The system is designed to be fully backward compatible with universally deployed MPEG-2 broadcasting systems, and can be implemented with negligible complexity overhead compared to conventional FEC systems. Our findings are reported in [Rane_VCIP2004] and [Rane_ICIP2004]. We also implemented an embedded Wyner-Ziv codec, and compared the performance of our scheme with that of traditional forward error correction and with layered coding schemes that employ unequal error protection. These experiments are described in [Rane_ICIP2005] and [Rane_IVCP2005].

Wyner-Ziv Coding for Video Transmission over Ad-Hoc Networks

We investigated the performance of the above Systematic Lossy Error Protection (SLEP) scheme for video transmission between mobile nodes of a wireless ad hoc network. In such a system, the packet loss rate and the allowable bit rate over a link change over time. Further, in conjunction with a model for the end-to-end rate-distortion performance of the SLEP scheme [Rane_PCS2004], we proposed a method to select the best path for transmitting video data from among a number of candidate paths with different rate budgets and packet loss rates. These experiments are documented in [Zhu_VCIP2005].

SLEP based on H.264/AVC Redundant Slices and Flexible Macroblock Ordering

We implemented the SLEP scheme using standard-compliant features of the state-of-the-art H.264/AVC video coding standard. The redundant slices feature can be used to generate redundant video descriptions to which Reed-Solomon Slepian-Wolf coding can be applied. Using Flexible Macroblock Ordering (FMO), a region of interest (ROI) can be selected in each video frame [Baccichet_PV2006]. The SLEP scheme can be applied preferentially to the ROI, while allowing the less important background to be protected by conventional decoder-based error concealment schemes. A specification of the Wyner-Ziv codec operations and a syntax for describing the Wyner-Ziv bit stream were proposed to the Joint Video Team of the International Telecommunication Union (ITU-T), for inclusion as an error-resilience tool under the H.264/AVC video coding standard [Rane_JVT2006]. The proposal was recommended for a Core Experiment, which is currently in progress.

A model was proposed to predict the average received picture quality delivered by the SLEP system as a function of the encoding bit rates of the primary and redundant descriptions, the bit rate of the parity symbols transmitted in the Wyner-Ziv bit stream, and the probability with which packets are lost during transmission [Rane_PCS2006].

Findings

Design of Practical Wyner-Ziv Codes

We have shown through simulation that our rate-adaptive LDPCA and SLDPCA codes have compression performance closer to the Slepian-Wolf bound than punctured turbo codes, over a wide range of rates. SLDPCA codes are the best at low and high source rates and LDPCA codes are the best at intermediate rates. Moreover, these codes are amenable to optimization in terms of degree distribution and local stopping set structure. When the LDPCA decoder is modified to take into account Markov source statistics in one or two dimensions, we demonstrate compression improvements up to a factor of 2 or 4, respectively.
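To make the Slepian-Wolf bound concrete, consider a simple correlation model that is assumed here for illustration: an i.i.d. equiprobable binary source whose side information differs from it with crossover probability p. In that case the bound equals the binary entropy of p, and the gap of a practical code is its rate minus that entropy. The function names and the example code rate are hypothetical.

```python
import math

def binary_entropy(p):
    """Binary entropy in bits; the Slepian-Wolf bound H(X|Y) under the assumed model."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def rate_gap(code_rate, p):
    """Excess rate (bits per source bit) of a code over the Slepian-Wolf bound."""
    return code_rate - binary_entropy(p)

# Hypothetical example: a code that needs 0.56 bit per source bit when p = 0.11.
gap = rate_gap(0.56, 0.11)
```

At p = 0.11 the bound is very close to 0.5 bit per source bit, so the hypothetical code above would operate about 0.06 bit per source bit away from it.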

Our theoretical analysis introduces the concept of 'rate measure', which allows the design of optimal Wyner-Ziv quantizers to be carried out independently from the specific implementation of the lossless coding of the quantization index, just by modeling it as an ideal Slepian-Wolf coder. The theory establishes that optimal quantizers for Wyner-Ziv coding at high rates have uniform interval length and do not require quantization indices to be reused across intervals, as long as they are followed by an ideal Slepian-Wolf coder. Furthermore, no loss in performance is incurred by not having access to the side information at the encoder. These findings have been confirmed experimentally for scalar and vector Gaussian statistics and video samples.
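As a rough worked illustration for the scalar case, the standard high-resolution approximations can be written out. They are stated here as an assumption and not as the exact theorems of the cited papers. The symbols are: source X, side information Y, quantizer step Δ, mean-squared distortion D, rate R in bits per sample after ideal Slepian-Wolf coding, and conditional differential entropy h(X|Y).

```latex
D \approx \frac{\Delta^2}{12}, \qquad
R \approx h(X \mid Y) - \log_2 \Delta
\quad\Longrightarrow\quad
D \approx \frac{1}{12}\, 2^{\,2 h(X \mid Y)}\, 2^{-2R}
```

The resulting expression has the same form as the high-rate approximation for conditional coding, in which the encoder also observes Y. This is consistent with the statement that the absence of side information at the encoder costs nothing at high rates.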

The extension of the Lloyd algorithm to quantization for distributed coding of noisy sources also generalizes the concept of rate measure. As a result, 'locally' optimal quantizers can be found not only for indirect observations, but also for a number of related applications, such as quantization of the side information itself, network distributed coding with several encoders and decoders, and broadcast with side information, within a unified framework. At this stage, the high-rate analysis permits the characterization of such quantizers under less general conditions, restricting the type of applications mainly to joint quantization and denoising. However, these conditions guarantee the absence of performance loss due to the unavailability of the side information at the encoder, and the resulting quantizers are also uniform without index repetition.

The theoretical study of linear transforms shows that, under certain conditions, the Karhunen-Loeve transform of the source vector is determined by its expected conditional covariance given the side information, which is approximated by the DCT for conditionally stationary processes. In the case of jointly Gaussian statistics, one can obtain a linear estimate of the source data given the side information and apply to this estimate the same transform used for the source data. Experimental results confirm that the use of the DCT may lead to important performance improvements. Conceptually similar results are proven in the noisy case. Finally, we establish that the side information can be replaced by a sufficient statistic without asymptotic loss of performance at high rates, regardless of the statistics of the side information.
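For jointly Gaussian statistics, the conditional covariance does not depend on the observed side information, so the transform described above can be computed directly from second-order statistics. The sketch below does this for a toy model in which the source is the side information plus independent noise. The model, the covariance values, and the function names are assumptions for illustration only.

```python
import numpy as np

def conditional_covariance(cov_x, cov_xy, cov_y):
    """Covariance of X given Y for jointly Gaussian vectors."""
    return cov_x - cov_xy @ np.linalg.solve(cov_y, cov_xy.T)

def klt_basis(cov):
    """Rows are eigenvectors of cov, sorted by decreasing eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T, eigvals[order]

def linear_estimate(y, cov_xy, cov_y):
    """Linear estimate of zero-mean X from zero-mean side information y."""
    return cov_xy @ np.linalg.solve(cov_y, y)

# Toy model (an assumption): X = Y + Z, with Y and Z independent and zero mean.
n = 4
idx = np.arange(n)
cov_y = 0.9 ** np.abs(idx[:, None] - idx[None, :])
cov_z = 0.1 * 0.5 ** np.abs(idx[:, None] - idx[None, :])
cov_x = cov_y + cov_z
cov_xy = cov_y   # cross-covariance of X and Y under the toy model

cond = conditional_covariance(cov_x, cov_xy, cov_y)
U, variances = klt_basis(cond)
```

In this toy model the conditional covariance reduces to the noise covariance, and `U` decorrelates the part of the source that the side information cannot predict. The same `U` would then be applied to the output of `linear_estimate` at the decoder.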

Our theoretical analysis of dimensionality reduction and feature extraction for distributed coding and classification concludes that the DCT is also a convenient candidate in the design of robust hash codes for Wyner-Ziv motion compensation.


Intraframe Video Coding with Interframe Decoding

When we applied the pixel-domain Wyner-Ziv coder to distributed compression for large camera arrays, experimental results show superior performance over JPEG2000 or shape-adaptive DCT image coding. This is achieved while requiring much lower encoder complexity compared to these conventional non-distributed compression schemes.

Sub-sampling the frame in the pixel-domain scheme for the intraframe encoder-interframe decoder video compression system proved to be ineffective in reducing the bit rate of the system. The transform-domain system was able to exploit the spatial dependencies within a frame and showed better compression performance than the pixel-domain scheme. Experimental results showed that the transform-domain Wyner-Ziv coder performed significantly better (up to 12 dB) than DCT-based intraframe coding and entailed comparable encoder complexity. There is still a performance gap compared to H.263 interframe coding.

When we applied the transform-domain Wyner-Ziv coder to compression of light field images, experimental results showed superior performance over shape-adaptive DCT image coding. This compression performance is achieved while maintaining good random access characteristics, which are important in light field streaming systems.

Applying motion estimation at the decoder generates better side information for decoding Wyner-Ziv frames but increases the decoder complexity. The frame dependency structure, GOP length and motion estimation complexity can be chosen based on the system requirements.

Hash-based motion compensation at the decoder is an effective way to achieve more accurate motion estimation and generate reliable side information from only a previous frame. This improvement in the system allows compression with longer GOPs and sequential decoding, while significantly outperforming DCT-based intraframe coding.

Applying conventional run-length coding on the high frequency coefficients of a frame and using these coefficients to perform motion estimation at the decoder is effective in generating reliable side information for the Wyner-Ziv encoded low frequencies. This system allows improved compression efficiency, especially for higher motion sequences.

With the addition of the encoder reference frame generation, frame store and frame subtraction, the pixel-domain Wyner-Ziv residual encoder has slightly higher complexity than the original pixel-domain Wyner-Ziv video coder. In terms of compression efficiency, the pixel-domain residual codec achieves better rate-distortion performance than simply Wyner-Ziv coding the pixels and similar performance compared to the transform-domain Wyner-Ziv video codec.

Systematic Lossy Error Protection for Video

Our experiments indicate that a supplementary bit stream generated using pixel-domain Wyner-Ziv coding of the source sequence can be used to correct transmission errors in the transmitted video waveform up to a certain residual distortion, determined by the quantizer coarseness. Since the above scheme allows for some distortion in case of channel errors, it can potentially achieve a much lower bit rate than conventional channel coders which directly protect the bits produced by the source coder.

When applied to error-resilient digital video broadcasting, our Systematic Lossy Error Protection (SLEP) scheme achieves acceptable decoded video quality at higher error probabilities than FEC, while operating at the same or lower bit rate. The lossy nature of the scheme ensures that video quality degrades gracefully with increasing channel error probability. This avoids the visually unpleasant 'cliff' effect of FEC, in which the PSNR degrades rapidly after FEC is overwhelmed by channel errors. Further, an embedded Wyner-Ziv coder achieves graceful degradation of video quality without requiring layered encoding of the original video sequence. Application of the SLEP scheme to a region-of-interest within a video frame results in a better exploitation of the resilience-quality tradeoff and therefore superior decoded picture quality, especially for sequences with low motion or a static background.

Using a model for the end-to-end video quality delivered by the lossy error protection system, we are able to choose the bit rates for encoding the redundant video descriptions as well as the parity bit rates in the Reed-Solomon Slepian-Wolf codec in such a way that the received video quality is maximized. The model also enables the encoder to accurately determine the tradeoff between the error resilience offered by a redundant description and the resultant quality loss from Wyner-Ziv decoding.

It is advantageous to use the SLEP scheme in scenarios with changing channel conditions (for instance, wireless ad hoc networks) because changes in the packet loss rate cause a graceful variation in the perceived video quality. Further, using the above end-to-end distortion model with estimates of the packet loss rates and rate budgets for a number of candidate paths in the ad hoc network, we can rank the paths in terms of the average decoded picture quality, and then select the best available path among them.


Publications

Journal Papers

  • D. Varodayan, A. Aaron and B. Girod, "Rate-adaptive codes for distributed source coding," EURASIP Signal Processing Journal, Special Issue on Distributed Source Coding. Invited Paper. To appear. [pdf]
  • D. Rebollo-Monedero, S. Rane, A. Aaron and B. Girod, "High-rate quantization and transform coding with side information at the decoder," EURASIP Signal Processing Journal, Special Issue on Distributed Source Coding. Invited Paper. To appear. [pdf]
  • B. Girod, A. Aaron, S. Rane and D. Rebollo-Monedero , "Distributed video coding,"  Proceedings of the IEEE, Special Issue on Video Coding and Delivery, vol. 93, no. 1, pp. 71-83, January 2005. Invited paper. [pdf]

In Preparation

  • D. Rebollo-Monedero and B. Girod, "Network Distributed Quantization," IEEE Transactions on Information Theory. In preparation.

Conference Publications

  • S. Rane, P. Baccichet and B. Girod, "Modeling and Optimization of a Systematic Lossy Error Protection System based on H.264/AVC Redundant Slices," Proc. Picture Coding Symposium, PCS-2006, Beijing, China, April 2006. [pdf]
  • A. Aaron, D. Varodayan and B. Girod, "Wyner-Ziv residual coding of video," Proc. Picture Coding Symposium, PCS-2006, Beijing, China, April 2006. [pdf] [presentation]
  • D. Varodayan, A. Aaron and B. Girod, "Exploiting spatial correlation in pixel-domain distributed image compression," Proc. Picture Coding Symposium, PCS-2006, Beijing, China, April 2006. [pdf]
  • P. Baccichet, S. Rane and B. Girod, "Systematic Lossy Error Protection based on H.264/AVC Redundant Slices and Flexible Macroblock Ordering," Proc. Packet Video Workshop, PV-2006, Hangzhou, China, April 2006. [pdf]
  • S. Rane and B. Girod, "Systematic Lossy Error Protection based on H.264/AVC Redundant Slices," Proc. SPIE Visual Communications and Image Processing, VCIP-2006, San Jose, CA. Jan. 2006. [pdf]
  • D. Varodayan, A. Aaron and B. Girod, "Rate-adaptive distributed source coding using Low-Density Parity-Check codes," Proc. Asilomar Conference on Signals and Systems Pacific Grove, CA, Nov. 2005. [pdf]
  • S. Rane, A. Aaron and B. Girod, "Error-resilient video transmission using multiple embedded Wyner-Ziv descriptions", Proc. IEEE International Conference on Image Processing, ICIP-2005, Genoa, Italy, Sept. 2005. [pdf]
  • X. Zhu, S. Rane, B. Girod, "Systematic lossy error protection for video transmission over wireless ad hoc networks", Proc. SPIE Visual Communications and Image Processing, VCIP-2005, Beijing, China, July 2005.  [pdf] [presentation]
  • D. Rebollo-Monedero and B. Girod, "A generalization of the rate-distortion function for Wyner-Ziv coding of noisy sources in the quadratic-Gaussian case," Proc. IEEE Data Compression Conference, DCC-2005, Snowbird, UT, Mar. 2005. [pdf] [presentation]
  • D. Rebollo-Monedero and B. Girod, "Design of optimal quantizers for distributed coding of noisy sources ", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP-2005, Philadelphia, PA, Mar. 2005.  Invited paper. [pdf] [presentation]
  • S. Rane and B. Girod, "Systematic lossy error protection versus layered coding with unequal error protection", Proc. SPIE Conference on Image and Video Communications and Processing, IVCP-2005 , San Jose, CA., January 2005.  [pdf]
  • A. Aaron and B. Girod, "Wyner-Ziv video coding with low-encoder complexity," Proc. Picture Coding Symposium, PCS-2004, San Francisco, CA, December 2004. Invited paper. [pdf]
  • S. Rane and B. Girod, "Analysis of error-resilient video transmission based on systematic source-channel coding", Proc. Picture Coding Symposium, PCS-2004, San Francisco, CA, December 2004. [pdf]
  • D. Rebollo-Monedero, S. Rane, and B. Girod, “Wyner-Ziv quantization and transform coding of noisy sources at high rates,” Proc. Asilomar Conference on Signals and Systems, Pacific Grove, CA, Nov. 2004. [pdf] [presentation]
  • S. Rane, A. Aaron and B. Girod, "Systematic lossy forward error protection for error-resilient digital video broadcasting - A Wyner-Ziv coding approach," Proc. IEEE International Conference on Image Processing, ICIP-2004, Singapore, Oct. 2004. [pdf]
  • A. Aaron, S. Rane and B. Girod, "Wyner-Ziv video coding with hash-based motion compensation at the receiver",  Proc. IEEE International Conference on Image Processing, ICIP-2004, Singapore, Oct. 2004. [pdf] [presentation]
  • A. Aaron, P. Ramanathan and B. Girod, "Wyner-Ziv coding of light fields for random access," Proc. IEEE International Workshop on Multimedia Signal Processing, MMSP-2004, Siena, Italy, Sept. 2004. [pdf] [presentation]
  • A. Aaron, S. Rane, E. Setton and B. Girod, "Transform-domain Wyner-Ziv codec for video", Proc. Visual Communications and Image Processing , VCIP-2004 , San Jose, CA, January 2004. [pdf]  [presentation]
  • S. Rane, A. Aaron and B. Girod, "Systematic lossy forward error protection for error resilient digital video broadcasting", Proc. SPIE Visual Communications and Image Processing, VCIP-2004, San Jose, CA, January 2004. [pdf]
  • D. Rebollo-Monedero, A. Aaron and B. Girod, "Transforms for high rate distributed source coding," Proc. Asilomar Conference on Signals and Systems , Pacific Grove, CA, Nov. 2003. Invited paper. [pdf] [presentation]
  • X. Zhu, A. Aaron and B. Girod, "Distributed compression for large camera arrays", Proc. IEEE Workshop on Statistical Signal Processing, SSP-2003, St Louis, Missouri, Sept. 2003. Invited Paper. [pdf] [poster]
  • A. Aaron, E. Setton and B. Girod, "Towards practical Wyner-Ziv coding of video", Proc. IEEE International Conference on Image Processing , ICIP-2003, Barcelona, Spain, Sept. 2003. [pdf] [poster]
  • A. Aaron, S. Rane, D. Rebollo-Monedero and B. Girod, "Systematic lossy forward error protection for video waveforms", Proc. IEEE International Conference on Image Processing , ICIP-2003 , Barcelona, Spain, Sept. 2003. Invited Paper. [pdf] [presentation]
  • D. Rebollo-Monedero, R. Zhang and B. Girod, "Design of optimal quantizers for distributed source coding," Proc. IEEE Data Compression Conference, DCC-2003, Snowbird, UT, March 2003. [pdf] [presentation]
  • A. Aaron, S. Rane, R. Zhang and B. Girod, "Wyner-Ziv coding for video: Applications to compression and error resilience," Proc. IEEE Data Compression Conference, DCC-2003 , Snowbird, UT, March 2003. [pdf] [presentation]
  • A. Aaron, R. Zhang and B. Girod, "Wyner-Ziv coding of motion video," Proc. Asilomar Conference on Signals and Systems, Pacific Grove, CA, Nov. 2002. Invited Paper. [pdf] [presentation]
  • A. Aaron and B. Girod, "Compression with side information using turbo codes," Proc. IEEE Data Compression Conference , DCC-2002 , Snowbird, UT, April 2002.  [pdf] [presentation]

Standardization Contributions

  • S. Rane, P. Baccichet and B. Girod, "Systematic Lossy Error Protection based on H.264/AVC Redundant Slices and Flexible Macroblock Ordering", Document No. JVT-S025, 19th JVT meeting, Geneva, March 2006.

Software

Rate-Adaptive LDPC Accumulate Codes for Distributed Source Coding
Source code and documentation for LDPCA codes:  http://www.stanford.edu/~divad/software.html



This work is supported by the National Science Foundation under Grant No. CCR-0310376. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Last modified: 29-May-2006