  • Research
  • Open Access

An energy-efficient low-memory image compression system for multimedia IoT products

EURASIP Journal on Image and Video Processing 2018, 2018:87

https://doi.org/10.1186/s13640-018-0333-3

  • Received: 29 December 2017
  • Accepted: 7 September 2018
  • Published:

Abstract

Emerging Internet of things (IoT) technologies have rapidly expanded to multimedia applications, including high-resolution image transmission. However, handling image data in IoT products with limited battery capacity requires low-complexity and small-size solutions such as low-memory compression techniques. This paper proposes a line-based compression system based on a four-level two-line discrete wavelet transform and adaptive line prediction. The bit stream is generated by multiplexing the frequency components using run-level coding combined with Huffman coding. The proposed system also includes a new bit rate control algorithm that can significantly improve the consistency of image quality within one frame. The proposed low-memory compression system can retain image quality that meets visually lossless compression criteria over the whole image frame, and it lowers the total system power consumption of multimedia IoT products more than other existing low-memory compression techniques.

Keywords

  • Multimedia Internet of things
  • Line-based low-memory compression
  • Image quality consistent bit rate control

1 Introduction

With exciting development of very large-scale integration (VLSI) technology, Internet of things (IoT) products have rapidly evolved into monitoring devices that can capture and stream multimedia data with high-resolution images. Promising multimedia IoT applications include smart glasses [1, 2], unmanned aerial vehicles [3], telepresence robots [4], and wireless capsule endoscopies [5] that can capture and transmit still images or video sequences at visually lossless levels. However, processing a high volume of image data and wirelessly transmitting data in a multimedia IoT device that typically has limited battery capacity require high-power consumption. This is a major obstacle to system operation time.

In multimedia IoT devices, the key contributors to system power consumption are the microprocessor for information processing, memory devices for data storage, and wireless transmitter-receivers for data transmission. The power consumed by the memory device is an important factor when transmitting and processing a high volume of data such as images. Therefore, many studies have sought to reduce power consumption by decreasing the size of image data with a simple compression method before transmission, using as little computation and memory as possible [6–8]. Accordingly, low-memory compression methods have been adopted in mobile devices, especially in those that require high power consumption and bandwidth for high-resolution display [9], and in vehicle safety systems that require very low delays for real-time image transmission [10].

In addition to image data, low-memory compression methods have been applied to body signal monitoring IoT devices such as two-dimensional (2D) electrocardiogram (ECG) recorders using high temporal correlations between ECG signals by converting 1D ECG signals into 2D ECG signals, thus reducing amounts of data transmitted [11].

Existing image compression standards such as JPEG2K [12], H.264 [13, 14], and HEVC [15, 16] can reduce the power required for transmission while maintaining a high compression ratio. However, these standards increase power consumption in the processor and memory because they require a vast number of processing operations and very large storage. Thus, the less complex JPEG standard [17] is often applied in conventional multimedia IoT products to compress a whole single frame. However, the JPEG standard does not easily satisfy the demand for high image quality due to its fixed-point implementation. To achieve a visually lossless condition, which means a reconstructed image quality of more than 40 dB, the JPEG standard requires a floating-point processing unit that consumes a lot of power. In addition, for a high-resolution, visually lossless image in which fine details are important, an error confined to a line is easier to recover from than the block-shaped error produced by conventional block-based compression standards. Another requirement for multimedia IoT devices is that a compact compression technique with small physical size and low production cost is often more important than high compatibility, because IoT devices are often used in dedicated applications that require very large numbers of small devices.

A one-dimensional (1D) set partitioning in hierarchical trees (SPIHT) technique based on zerotree wavelet coding has been presented in a low-memory compression study [18]. However, 1D SPIHT makes too many memory accesses while repeatedly rearranging key data through bit manipulations after the wavelet transform, resulting in excessive power consumption. Several studies have been conducted to reduce the memory accesses and improve the compression ratio of 1D SPIHT [19–22], including line-based backward coding of wavelet trees (L-BCWT) [23, 24] and zero memory set partitioned embedded block (ZM-SPECK) [25]. However, the energy efficiency of systems that adopt these modified SPIHT methods needs to be further improved to satisfy the energy requirements of low-power IoT devices.

Vo et al. [26] have conducted a study on low-memory compression method after applying non-uniform quantization techniques according to block types by considering error perception difference characteristics of human visual system (HVS). They achieved a compression ratio of 3:1. Kim et al. [27] have also presented a compression system that can perform horizontal predictive coding for low-frequency band with zero-zone scalar quantization techniques for higher frequencies after a line-based four-level discrete wavelet transform (DWT). After applying variable length coding method, they achieved a compression ratio of approximately 3:1 with a visually lossless condition.

The objective of this study was to propose a low-memory compression method that reduces the overall power consumption of multimedia IoT products. The proposed system includes a method for compressing a single frame image and a bit rate control method for maintaining the compression ratio and the consistency of image quality throughout the image. This method extends and improves the existing frequency adaptive line compression (FALC) technique of Kim et al. [27], achieving a higher compression ratio with the least possible amount of processing and memory. The proposed technique adopts 2D DWT and 2D directional adaptive differential pulse code modulation (DPCM) between lines. A combination of run-level coding and Huffman coding further improves the compression ratio of high-frequency data. The bit rate control method coupled with the proposed system maintains the consistency of image quality throughout the frame, mostly with information collected in the current frame. However, the method also checks whether the current frame is similar to the previous frame as part of an image sequence, because additional improvement of image quality can be easily achieved by using simple information such as the image type. It can significantly alleviate the image quality imbalance that might occur between highs and lows in a single image, an inevitable defect in low-memory compression methods. The proposed technique can achieve a compression ratio of at least 4:1 with an average peak signal-to-noise ratio (PSNR) of 40 dB.

In Section 2 of this paper, background information and conventional compression methods are introduced. The proposed compression system is described in Section 3. In Section 4, the method is evaluated and results are discussed. Section 5 concludes this paper.

2 Background research

2.1 Energy efficiency of compression methods for IoT system

With rapid development of IoT technology recently, securing sufficient operation time for IoT products with limited battery capacities has become an important performance index. Basic components of an applied IoT system include batteries as the main power source, sensors to acquire data, memory to store acquired data, microprocessors to process stored data, and transmission devices to transmit processed data wirelessly as shown in Fig. 1.
Fig. 1

Basic compositions of IoT applied systems (uProc, microprocessor)

Applications of recent IoT devices are being expanded to multimedia. Such devices can capture and stream large volumes of image data. However, storing, processing, and wirelessly transmitting huge amounts of image data will increase power consumption and further reduce operation time of the overall system.

Image compression standard techniques such as JPEG2K, H.264, and HEVC have been used to reduce the size of image data. Although transmission power can be significantly reduced by using high compression ratios, the high complexity of the compression system also increases power consumption due to excessive processing and memory demands in the compression process. Hence, many studies have been conducted on energy-efficient compression methods based on low memory to reduce the power consumption of multimedia IoT systems [6–8]. These methods can further reduce power in the overall system and extend battery operation time by lowering the processing operations and memory accesses of the compression system.

Figure 2 compares the estimated power consumption of several compression methods on a multimedia IoT platform (Intel Galileo gen 2 development board) [28]. The cases were assumed to involve an image data monitoring application. 4L H.264 is an intra compression mode based on the H.264 main profile [13] with joint model (JM) 19.0 reference software [14], where the coding unit size is fixed to 4 by 4 so that only four lines are used, with all functions enabled, including nine intra prediction modes and context-adaptive binary arithmetic coding (CABAC). Likewise, 4L HEVC is an intra compression mode based on the HEVC main profile [15] with HEVC test model (HM) 16.5 reference software [16], where the coding unit size is also fixed to 4 by 4 so that only four lines are used, with all functions enabled, including 35 intra prediction modes and CABAC. As shown in Fig. 2, the power consumption of the whole IoT system is higher for 4L H.264 and 4L HEVC than for 1D SPIHT.
Fig. 2

Energy comparison of 1D SPIHT, 4L H.264, and 4L HEVC methods

2.2 Conventional compression algorithms

The 1D SPIHT method [18] is based on zerotree characteristics of wavelet coefficients. To enhance compression efficiency, this method generates three dynamic lists by repeatedly traversing coefficients from the upper bit plane to the lower bit plane so that important bit data are transmitted first. This method is robust to transmission errors, and its bit rate is easy to control. However, the 1D SPIHT compression method incurs high power consumption due to the increased memory accesses made while repeatedly managing three dynamic lists based on bit-plane coding.

L-BCWT method [23] can lower the complexity of existing 1D SPIHT by using a lookup table method called a one-pass backward coding technique. Although L-BCWT can reduce the complexity of repeated tree-scanning, bit-plane coding, and dynamic lists management of conventional 1D SPIHT, its compression performance remains marginal.

ZM-SPECK [25] can remove state-maps and dynamic lists in the existing SPIHT algorithm using linear indexing property of wavelet tree and merged refinement technique. This method can reduce both computational complexity and memory accesses related to dynamic lists. However, it still has a lot of memory accesses due to its bit-plane coding and recursive set-partition technique. The compression ratio of ZM-SPECK is slightly higher than that of the existing SPIHT algorithm.

Vo et al. [26] have proposed a line compression method called visually lossless compression non-uniform quantization (VLC_NUQ) method. This method can perform median edge detection (MED) prediction of 2 × 4 blocks in areas where errors are relatively less recognizable. It can also perform MED prediction of 1 × 4 blocks in detailed and flat areas where errors are easily noticeable. Values of each block are transmitted to the decoder after data are further reduced through predefined non-uniform quantization process. Although the simplicity of VLC_NUQ method only requires small amounts of power consumption, its average compression ratio of 3:1 is insufficient for reducing transmission power in applied IoT systems.

Kim et al. [27] have proposed a frequency adaptive line compression (FALC) method with decreased complexity compared to existing compression methods. Instead of using bit planes of RGB data, the FALC method uses whole coefficients of discrete wavelet transforms of YCuCv color space data. After four levels of wavelet transforms, the original data are transformed into four high-frequency bands and one low-frequency band. Then, selective zero-zone quantization is performed in the four high-frequency bands. For the low-frequency band, predictive coding removes redundancy. Data from each frequency band are then compressed with a variable length coding (VLC) method based on Huffman coding. The compression ratio of FALC surpasses 3:1 with a visually lossless condition. However, further improvement is needed to reduce more power in IoT-applied environments.

2.3 Existing line-based bit rate controls

The image quality of line-based compression methods often fluctuates over a single frame because each line even in a single frame has different characteristics in terms of compression efficiency. In JPEG-LS environment, Edirisinghe’s line-based bit rate control (BRC) [29] and Jiang’s line-based BRC [30] have been proposed. However, these BRCs have insufficient image qualities in terms of consistency within one frame.

Another bit rate control method [31] has been proposed to improve the consistency of image quality within one frame based on the existing FALC method. It keeps the quantization level change slow. It also checks image quality and available bits for the rest of the frame when two thirds of the frame is processed to find an appropriate quantization level that will use up the remaining bits.

However, existing BRC methods still have insufficient image quality consistency, especially when there are image splits or scene changes. Therefore, BRC must be further improved to be applied to consumer products.

3 Proposed compression system

The proposed compression system includes a compression method based on two consecutive lines and a bit rate control system to ensure the compression ratio for a whole frame. Figure 3 shows the diagram of the proposed compression system. To achieve higher power efficiency of the compression method, we extended the existing FALC method to improve the efficiency of each processing stage. To improve consistent image quality through the entire frame, bit rate control system is used to check characteristic variations of the frame.
Fig. 3

Proposed compression system

3.1 The proposed selective 1L/2L compression method

The proposed selective 1L/2L compression method removes spatial redundancy between adjacent lines through spatial interline prediction and by extending the 1D DWT to a 2D DWT. To lower the power consumption of the proposed compression method, the number of operations and memory accesses was kept as small as possible. Lastly, run-length encoding is applied to repeated zero values in the high-frequency bands of the chroma channels to further enhance overall entropy coding performance. The overall structure of the proposed encoder is shown in Fig. 4. The following subsections describe the block diagram in Fig. 4 step by step.
Fig. 4

Proposed 1L/2L selective encoder system

3.1.1 Spatial interline prediction

To effectively remove spatial redundancy between lines, a vertical interline predictive method is used. The proposed interline prediction method calculates the sum of absolute differences (SAD) between two adjacent lines for several directions and then chooses the prediction mode that minimizes the SAD. The four prediction modes were determined by considering both compression performance and transmission cost.

Since luminance channels have relatively higher spatial correlations compared to chroma channels, four spatial modes are defined for luminance channels while there are only two prediction modes for chroma channels. Horizontal prediction for one-line compression mode (1L mode) is processed with a 1 × 8 block while vertical two-line compression mode (2L mode) is done with a 2 × 8 block. Figure 5 shows the relationship between reference samples of previous line and predicted samples in processing lines. Predicted candidate modes are shown in Table 1, where Rx,y is the reference sample, Px,y is the predicted sample, and x, y show the location of a pixel.
Fig. 5

Interline prediction process (Rx,y reference sample, Px,y predicted sample)

Table 1

Proposed interline prediction modes

Mode | Prediction equation (1L mode: 1 × 8 block; 2L mode: 2 × 8 block) | Color channel
Vertical | \( {P}_{x,y}={P}_{x,y+1}={R}_{x,y-1} \) | Luma, chroma
Left | \( {P}_{x,y+m}=\left({R}_{x-2-m,y-1}+{R}_{x-1-m,y-1}\right)\gg 1,\kern1em m=0\ \mathrm{or}\ 1 \) | Luma
DC | \( {P}_{x,y}={P}_{x,y+1}=\left(\sum \limits_{i\in {B}_x}{R}_{i,y-1}\right)\gg 3,\kern1em {B}_x=\left\{\mathrm{all}\ x\ \mathrm{of}\ R\right\} \) | Luma, chroma
Right | \( {P}_{x,y+m}=\left({R}_{x+1+m,y-1}+{R}_{x+2+m,y-1}\right)\gg 1,\kern1em m=0\ \mathrm{or}\ 1 \) | Luma
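The mode decision in Table 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It reads the two-sample averages in the Left and Right rows as a right shift by one, by analogy with the shift by three in the DC row. The function names, the string mode labels, and the clamping of out-of-range reference columns to the line edge are all assumptions made for illustration.

```python
def predict_block(ref, x0, mode, rows=1, width=8):
    """Predict `rows` rows of a block starting at column x0 from reference line `ref`."""
    last = len(ref) - 1

    def r(i):
        # Assumption: columns outside the line are clamped to the nearest edge.
        return ref[min(max(i, 0), last)]

    cols = range(x0, x0 + width)
    pred = []
    for m in range(rows):  # m = 0 for the first line, m = 1 for the second (2L mode)
        if mode == "vertical":
            row = [r(x) for x in cols]
        elif mode == "left":
            row = [(r(x - 2 - m) + r(x - 1 - m)) >> 1 for x in cols]
        elif mode == "right":
            row = [(r(x + 1 + m) + r(x + 2 + m)) >> 1 for x in cols]
        elif mode == "dc":
            row = [sum(r(x) for x in cols) >> 3] * width  # >> 3 assumes width == 8
        else:
            raise ValueError(f"unknown mode: {mode}")
        pred.append(row)
    return pred


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b) for brow, prow in zip(block, pred)
               for a, b in zip(brow, prow))


def best_mode(block, ref, x0, luma=True):
    """Pick the candidate mode with the smallest SAD (four for luma, two for chroma)."""
    modes = ("vertical", "left", "dc", "right") if luma else ("vertical", "dc")
    return min(modes, key=lambda mode: sad(
        block, predict_block(ref, x0, mode, rows=len(block))))
```

A block passed as one row corresponds to the 1L mode and a block of two rows to the 2L mode. Ties are resolved in favor of the mode listed first, which is a choice of this sketch and not something the text specifies.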

3.1.2 2D discrete wavelet transform

For 2D DWT in the 2L mode, a vertical DWT is applied, followed by the existing horizontal I42 integer DWT [27], to further remove vertical redundancy between two adjacent lines of an image. To keep system complexity to a minimum, Haar DWT [32] is used for the vertical DWT. After the proposed 2D DWT is applied, there are 10 sub-bands, since each of the five horizontal DWT bands is split into two vertical DWT bands. Figure 6 shows the sub-bands of the proposed 2D DWT.
Fig. 6

Each sub-band for the proposed 2D DWT (10 sub-bands)
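As an illustration of how a Haar transform across two lines can stay in integer arithmetic, the following Python sketch uses the reversible integer form often called the S-transform. The text only states that Haar DWT is used vertically, so the choice of this particular integer variant, the rounding, and the function names are assumptions.

```python
def haar_vertical_forward(line_a, line_b):
    """Split two adjacent lines into a vertical low band and a vertical high band."""
    low = [(a + b) >> 1 for a, b in zip(line_a, line_b)]  # floor of the average
    high = [a - b for a, b in zip(line_a, line_b)]        # difference between lines
    return low, high


def haar_vertical_inverse(low, high):
    """Recover the two original lines exactly from the low and high bands."""
    line_a = [l + ((h + 1) >> 1) for l, h in zip(low, high)]
    line_b = [a - h for a, h in zip(line_a, high)]
    return line_a, line_b
```

When the two lines are identical, the high band is all zeros, which is the situation in which the 2L mode is expected to compress well.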

3.1.3 Extended frequency selective zero-zone quantization

In the existing low-power line compression method, Ham et al. [31] expanded the 16 quantization levels of the existing single-line compression study [27] to 89 levels for more precise bit rate control. In the proposed method, the number of quantization levels is reduced to 80 to decrease transmission cost. An additional 80-level quantization table for 2D DWT coefficients is also included, and the quantization parameters of each level are defined so that the resulting PSNR is compatible with that of the same quantization level in the 1L mode quantization table.

3.1.4 2D predictive coding in frequency domain

To remove the data redundancy that remains after interline prediction in the spatial domain, adaptive predictive coding in the frequency domain is performed. Besides the existing horizontal predictive coding of FALC, the proposed method also tries vertical predictive coding in the low-frequency band. The proposed method predicts the bit stream size for both the horizontal and the vertical predictive coding mode based on the VLC table. It then selects the mode with the smaller predicted size to achieve a better compression ratio. We did not apply both prediction modes at the same time because, once horizontal predictive coding was applied, the differential values had less correlation in the vertical direction. The optimal predictive coding selection process is shown in Fig. 7.
Fig. 7

Optimal predictive coding selection process (horizontal coding or vertical coding)
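The selection between horizontal and vertical predictive coding can be sketched in Python as follows. The actual VLC table is not given in the text, so `estimated_bits` is a stand-in cost model. Treating the vertical reference as the low-frequency coefficients directly above is also an assumption, as are all function names.

```python
def estimated_bits(residuals):
    """Stand-in bit cost: 2 * bit_length(|r|) + 1 per residual (not the paper's VLC table)."""
    return sum(2 * abs(r).bit_length() + 1 for r in residuals)


def horizontal_residuals(cur):
    """Differences between horizontally adjacent coefficients; the first is sent as-is."""
    return [cur[0]] + [cur[i] - cur[i - 1] for i in range(1, len(cur))]


def vertical_residuals(cur, above):
    """Differences between each coefficient and the one directly above it."""
    return [c - a for c, a in zip(cur, above)]


def choose_predictive_coding(cur, above):
    """Return the mode name and residuals with the smaller estimated size."""
    h = horizontal_residuals(cur)
    v = vertical_residuals(cur, above)
    if estimated_bits(h) <= estimated_bits(v):
        return "horizontal", h
    return "vertical", v
```

Only one of the two residual sets is kept, matching the statement that the two prediction modes are not applied together.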

3.1.5 Frequency component entropy encoding

In the proposed 1L/2L compression method, spatial interline prediction and vertical DWT often produce consecutive zeros in high-frequency chroma channels. The proposed frequency component entropy encoding (FCEE) includes a run-length code for repeated zeros in the chroma channel’s H3/2/1 bands. The proposed zero run-length encoding (zRLE) is composed of a zRLE header and the length of the run of consecutive zeros. The length value is further compressed through independent Huffman coding.
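The zero run-length step can be illustrated with a small Python sketch. The token names `"zrun"` and `"level"` are hypothetical stand-ins for the zRLE header and an ordinary coefficient. The Huffman coding of the run lengths is left out.

```python
def zrle_encode(coeffs):
    """Replace each run of zeros with a single ("zrun", length) token."""
    tokens = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
            continue
        if run:
            tokens.append(("zrun", run))  # flush the pending run of zeros
            run = 0
        tokens.append(("level", c))
    if run:
        tokens.append(("zrun", run))      # trailing zeros at the end of the band
    return tokens


def zrle_decode(tokens):
    """Expand tokens back into the original coefficient sequence."""
    coeffs = []
    for kind, value in tokens:
        if kind == "zrun":
            coeffs.extend([0] * value)
        else:
            coeffs.append(value)
    return coeffs
```

In a full encoder, the run lengths collected here would be the symbols handed to the independent Huffman coder mentioned above.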

3.1.6 Output selection process of optimal compression mode

The proposed method outputs either two lines compressed in 1L mode or one two-line block compressed in 2L mode, depending on the characteristics of the input video lines. The 1L mode can have better compression performance than the 2L mode if the correlation between input lines is low, as in complex texture images. On the other hand, the 2L mode using the 2D DWT can have better performance than the 1L mode if the correlation between adjacent lines is high.

Figure 8 shows the proposed selection process of optimal compression mode. The proposed method can store two individually compressed image data through 1L and 2L compression modes in the temporary output buffer before transmitting data to the receiver. The comparator at the center predicts and compares bits generated by each compression mode to select a more efficient compression method as output value. The proposed method can guarantee a higher compression performance for images with more intense characteristic changes between image lines such as line-based scene changes or text images.
Fig. 8

Optimal line compression mode selection process (1L/2L mode)

3.2 Proposed BRC method for 1L/2L compression

In the existing BRC method for line compression [31], current image type is estimated based on current quantization level (QL) and the resulting line compression ratio (LR). A new QL is then computed for the next line. The relationship between QL and LR for each image type is summarized in the BRC table.

Since quantization levels in the 2L mode had compatible PSNRs to those in the 1L mode, but not compatible LRs, we extend the existing BRC for the proposed BRC to have both 1L BRC table and 2L BRC table. The proposed BRC method also includes a compensating method for characteristic differences of linearity between 1L BRC table and 2L BRC table. Figure 9 shows differences of compression performance between 1L and 2L modes along with quantization levels.
Fig. 9

Differences in BRC table characteristics according to 1L and 2L modes

The proposed method also predicts an overall compression outline for the current frame based on average energy of H1 bands of the previous frame. The previous frame is divided into several subsections, and two middle subsections are analyzed to check if there is a split mode. The proposed BRC method then adjusts target compression ratios (TCRs) for subsections of the current frame. Figure 10 shows an example of a split image mode where general videos and text images are combined.
Fig. 10

An example image output with normal split mode (video + text)

3.2.1 BRC algorithm for 1L/2L compression

The proposed BRC determines QL for the next one or two lines from the BRC table based on previously selected compression mode which is either 1L or 2L. The compression mode of the previous line(s) was selected because it had better compression ratio with relatively lower QL of the selected mode. In other words, the QL used for the other mode that was not selected did not achieve enough compression ratio. Therefore, if the BRC decides to increase compression ratio by one step more for the next line and alter the compression mode, the QL of the new mode for the next line(s) should be increased by two steps to meet the demand.

The proposed BRC method has two separate BRC tables for 1L and 2L modes. It maintains QL values of both tables for the next lines if either one of two modes is selected. Table 2 shows how quantization level determination (QLD) of the proposed BRC works. As shown in Table 2, the proposed BRC defines two states for the compression ratio variations: (1) stable state for 3.5 < LR < 4.5 and (2) scene change state for others. LR1 and LR2 represent LR for 1L mode and LR for 2L mode, respectively. QL1 and QL2 represent QL values for the next line compression of 1L mode and 2L mode, respectively. TCR is the target compression ratio for the whole frame.
Table 2

Adjustment cases of the QLD process for 1L/2L BRC algorithm

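Number | State | Mode | LR result | QL1 | QL2
1 | Stable state (3.5 < LR < 4.5) | 1L mode | LR1 < TCR | +1 | +2
2 | Stable state (3.5 < LR < 4.5) | 1L mode | LR1 > TCR | −1 | 0
3 | Stable state (3.5 < LR < 4.5) | 2L mode | LR2 < TCR | +2 | +1
4 | Stable state (3.5 < LR < 4.5) | 2L mode | LR2 > TCR | 0 | −1
5 | Scene change state | 1L mode | - | BRC table1 new QL1 | QL2 ≤ QL1
6 | Scene change state | 2L mode | - | QL1 ≤ QL2 | BRC table2 new QL2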

LR line compression ratio, TCR target compression ratio, QL quantization level

For example, case 1 in Table 2 shows that BRC is in a stable state while the current compression mode is 1L mode. Since the compression result LR1 is less than TCR in case 1, the next lines should be compressed more. Therefore, the quantization levels for the next lines, QL1 and QL2, are increased by 1 and 2, respectively. When the BRC compresses the next lines, it will compare the compression results of the 1L and 2L modes and then choose the one with the better result based on the trade-off between compression performance and image quality. Case 2 is the same as case 1 except that LR1 is higher than TCR. In this case, QL1 is decreased by 1 but QL2 remains the same as before.

In cases with scene change, it is difficult to define QL values for both compression modes. Therefore, a new single QL is selected in accordance with the BRC table for the current compression mode. The new QL is assigned to both QL values of 1L and 2L modes because the same QL value for both compression modes has similar image quality in terms of PSNR. Slight quality difference in the QL value can be adjusted through subsequent processing.
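The quantization level determination described above can be written as a small Python sketch. It is an illustration under assumptions and not the authors' code. `brc_lookup` is a hypothetical callable standing in for the BRC tables, the case LR = TCR (which the table does not cover) leaves both levels unchanged, and the result is clamped to 0..79 on the assumption that the 80 quantization levels are indexed from zero.

```python
# Step sizes (delta QL1, delta QL2) for the four stable-state cases of Table 2.
STABLE_STEPS = {
    ("1L", "below"): (+1, +2),  # case 1: LR1 < TCR
    ("1L", "above"): (-1, 0),   # case 2: LR1 > TCR
    ("2L", "below"): (+2, +1),  # case 3: LR2 < TCR
    ("2L", "above"): (0, -1),   # case 4: LR2 > TCR
}


def qld_update(mode, lr, tcr, ql1, ql2, brc_lookup, levels=80):
    """Return the (QL1, QL2) pair to use for the next line(s)."""
    if mode not in ("1L", "2L"):
        raise ValueError(f"unknown mode: {mode}")
    if 3.5 < lr < 4.5:  # stable state
        if lr < tcr:
            d1, d2 = STABLE_STEPS[(mode, "below")]
        elif lr > tcr:
            d1, d2 = STABLE_STEPS[(mode, "above")]
        else:
            d1, d2 = 0, 0  # assumption: no change when LR equals TCR
        ql1, ql2 = ql1 + d1, ql2 + d2
    else:  # scene change state: one new QL from the current mode's table, used for both
        ql1 = ql2 = brc_lookup(mode, lr)

    def clamp(q):
        return min(max(q, 0), levels - 1)

    return clamp(ql1), clamp(ql2)
```

A call such as `qld_update("1L", 3.8, 4.0, 20, 20, lookup)` follows case 1 and raises QL1 by one step and QL2 by two steps.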

If only a small number of lines remain to be compressed while the cumulative compression ratio (CR) does not meet TCR, the quantization level for the next lines should be increased rapidly. In the remaining line process (RLP) stage of the proposed 1L/2L BRC, if the CR is less than TCR after 70% of the lines are processed, the QL value for the current compression mode is increased by 3 while the QL value for the other compression mode is increased by 4. Table 3 shows how the adjustment works.
Table 3

Adjustment in RLP process for 1L/2L BRC algorithm

Number | Q-type | Mode | CR | QL1 | QL2
1 | Normal | 1L mode | CR < TCR | +3 | +4
2 | Normal | 2L mode | CR < TCR | +4 | +3

CR cumulative compression ratio, TCR target compression ratio
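The RLP adjustment in Table 3 can be sketched in Python as follows. The function and parameter names are hypothetical. The text does not say whether the adjustment is applied once or on every remaining line, so this sketch simply returns the adjusted pair for a single call.

```python
def rlp_update(mode, cr, tcr, lines_done, total_lines, ql1, ql2):
    """Raise both QLs quickly when the frame is 70% done and CR is still below TCR."""
    if mode not in ("1L", "2L"):
        raise ValueError(f"unknown mode: {mode}")
    # Integer comparison avoids floating-point rounding at the 70% boundary.
    past_70_percent = 10 * lines_done >= 7 * total_lines
    if not past_70_percent or cr >= tcr:
        return ql1, ql2
    if mode == "1L":
        return ql1 + 3, ql2 + 4  # current mode +3, other mode +4
    return ql1 + 4, ql2 + 3
```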

3.2.2 Split mode processing based on average H1 band energy

The proposed BRC method can improve the consistency of image quality over the whole frame based on temporal similarity even if the frame is composed of several different images in split mode. The proposed method uses average energy of H1 band to find image characteristics for low-complexity prediction in time domain, so that it does not require frame memory or motion estimation techniques. The H1 band contains the highest frequency characteristics among sub-bands, and the higher the average energy of the H1 band, the lower the compression ratio. Therefore, the energy of H1 band can be used for image classification for BRC control. For example, a complex texture image such as text lines has higher average energy in the H1 band than a low texture image such as simple figure does.

In the proposed BRC, a frame in split mode is divided into only two parts for simplicity. If the border line between these two parts is located near the top of the frame, there is plenty of room to adjust the compression ratio. On the other hand, if the border line is located near the bottom of the frame, there is little chance to enhance the image quality. Therefore, the proposed BRC checks for split mode only in the middle half of the frame. Figure 11 depicts the operating concept of the TCR re-adjustment method in split mode. BRC does not check for split mode in the first or the last quarter of the frame. If a large difference in the average energy of the H1 band is detected within the region of interest (ROI) (second and third quarters), the following are recorded: the fact that the frame is in split mode, the location of the border, the energy difference, the TCR and CR of the upper region (TCR1 and CR1), and the TCR and CR of the lower region (TCR2 and CR2).
Fig. 11

TCR re-adjustment in split mode (TCR target compression ratio)

For the next frame, recorded TCR1 and TCR2 are respectively used for the upper and lower regions because of temporal similarity of consecutive frames. If similar energy difference to the previous frame is detected around the location of the border line of the previous frame, it is regarded that the same split mode continues.

The location of the border line can move up or down due to scrolling. Thus, the lower region of the next frame starts as soon as the border line of the next frame is detected. Equation (1) is used to determine TCR2, where HEIGHT is the frame height and i is the length from the first line of the frame to the border line.

$$ \mathrm{TCR}2=\frac{\mathrm{HEIGHT}-i-1}{\left(\frac{\mathrm{HEIGHT}}{4.05}\right)-\left(\frac{i-1}{\mathrm{CR}1}\right)} $$
(1)
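Equation (1) can be evaluated directly, as in the Python sketch below. One way to read it, offered here as an interpretation, is that HEIGHT/4.05 is the output budget for the whole frame in line-equivalent units, (i − 1)/CR1 is the part already spent on the upper region, and the remaining lines must fit into what is left. The parameter name `frame_tcr` for the constant 4.05 is an assumption.

```python
def tcr2(height, i, cr1, frame_tcr=4.05):
    """Target compression ratio for the lower region, following Eq. (1)."""
    remaining_budget = height / frame_tcr - (i - 1) / cr1
    if remaining_budget <= 0:
        raise ValueError("upper region already used the whole frame budget")
    return (height - i - 1) / remaining_budget
```

For a 512-line frame with the border at i = 257, an upper region compressed at only 3:1 pushes the lower-region target to a little above 6:1, while an upper region that already reached 4.05:1 leaves the lower-region target close to 4.05:1.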

3.2.3 Frame unit screen transition process based on H1 band energy difference values (between frames)

The proposed BRC method can handle split mode in a single frame. It can also handle scene change between consecutive frames. The scene change is detected if the average energy of H1 band is changed more than a threshold value (TH) before the last quarter of the frame. If frame unit scene change occurs, compression for remaining regions is performed using default TCR value (4.00). This method can also prevent unnecessary image quality degradation due to scene changes between frames with split mode.

4 Results and discussion

To compare energy consumption of multimedia IoT devices with different compression algorithms, energy complexity method [33, 34] based on the number of operations in the microprocessor and the number of memory accesses can be used. This method can estimate the power consumption of an algorithm operating on a target embedded system while avoiding significant estimation deviation due to many complicated interactions among the embedded program implementation techniques, the optimization level, and the operating system (OS) programs [35]. The energy complexity method can estimate the trend of the energy cost change for multiple algorithms with high accuracy at the theoretical algorithm proposal stage.

To assess power reduction performance of existing IoT systems, we assumed that the evaluation system had specifications similar to those of Intel Galileo gen 2 development board [28] frequently used for commercial purposes. To estimate energy saving effects of the proposed method and existing compression methods, transmission power reduced by compression and processing power for compression based on power consumption characteristics of the evaluation system were considered simultaneously.

First, the number of computational operations and the memory access count of each compression method were evaluated to predict the power consumption required for compression by each algorithm. To predict the amount of transmission power saved due to compression, the average compression ratio of each compression method at a visually lossless condition of at least 40 dB was assessed. The computational power consumption of all algorithms was simulated on a desktop computer using the C programming language, and then the transmission power savings were calculated. Frame compression ratio (FCR), the compression ratio of the whole single frame, was used to evaluate the compression performance of each algorithm, as shown in Eq. (2). As test images, 24 Kodak still images [36], frequently used for image compression evaluations, were employed (see Fig. 12).
Fig. 12 Twenty-four Kodak still images (image resolution 768 × 512, 512 × 768)

$$ \mathrm{FCR}=\frac{\text{Number of bits for original image}}{\text{Number of bits for compressed image}} \tag{2} $$
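As a minimal sketch of Eq. (2), the FCR can be computed from the raw frame size and the size of the compressed bit stream. The function names and the compressed size used here are hypothetical and only illustrate the arithmetic; the frame dimensions are those of the Kodak test images.

```python
def uncompressed_bits(width, height, channels=3, bit_depth=8):
    """Number of bits in a raw frame (three 8-bit color channels by default)."""
    return width * height * channels * bit_depth


def frame_compression_ratio(original_bits, compressed_bits):
    """Eq. (2): bits of the original image divided by bits of the compressed image."""
    if compressed_bits <= 0:
        raise ValueError("compressed_bits must be positive")
    return original_bits / compressed_bits


original = uncompressed_bits(768, 512)   # 3 x 768 x 512 x 8 bits
compressed = 2_359_296                   # hypothetical compressed size in bits
print(original)                                         # 9437184
print(frame_compression_ratio(original, compressed))    # 4.0
```

An FCR of 4.0 corresponds to the target compression ratio used in the bit rate control tests of Section 4.2.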

4.1 Comparison of power consumption changes

The power consumption of the whole IoT platform [28] was estimated for the Intel Quark SoC X1000 processor and mobile DRAM based on energy-per-bit information [37, 38]. The radio frequency (RF) power consumption for data transmission was estimated using the Broadcom BCM43340 [39], which consumes approximately 1170 mW of average transmission power for 11 Mbps wireless transmission. Since the three color channels of a single uncompressed image occupy 9,437,184 bits (3 × 768 × 512 × 8), the total transmission energy for a single uncompressed image is 957 mJ. Table 4 lists the energy required for compression processing by each algorithm together with its average compression ratio under the visually lossless condition. Figure 13 depicts the frame compression ratio (FCR) of each compression method.

The 1D SPIHT method [18] had the lowest compression ratio of 2.04:1 and therefore the smallest reduction in transmission time. It consumed 469 mJ for RF transmission and also incurred a high energy cost due to iterative memory accesses. The L-BCWT method [23] reduced SPIHT’s memory access cost, but its compression ratio was the same as that of 1D SPIHT, so its RF transmission energy was still 469 mJ. The ZM-SPECK method [25] showed a higher FCR of 2.37:1 than L-BCWT, but this was still not high enough to reduce the RF energy substantially: its RF transmission energy was 404 mJ. The FALC method [27] further raised the compression ratio to 3.37:1, with an RF transmission energy of 284 mJ.

The 8L JPEG method is the same as the existing JPEG standard [40] except that it uses only eight lines at a time. Compared with the proposed method, the 8L JPEG method had a lower memory access cost (24.7 mJ for 37.4 × 3MN accesses) and a higher FCR of 11.47:1. However, to meet the high computational accuracy required for visually lossless image quality, the 8L JPEG method consumed 163.7 mJ for floating-point operations [41], which made its total energy consumption 13.5 mJ higher than that of the proposed method. Although the 4L HEVC method [15] had the highest compression ratio of 12.5:1, it required extremely high power consumption due to heavy memory accesses and computational processing. The 4L H.264 method [13] required less power than the 4L HEVC method but more than the SPIHT-based methods.
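The RF figures quoted above follow from the transmission time of the compressed frame multiplied by the average RF power. The sketch below reproduces that arithmetic under one assumption that is not stated in the text: 11 Mbps is interpreted as 11 × 2^20 bit/s, which yields the quoted 957 mJ (with 11 × 10^6 bit/s the uncompressed figure would be about 1004 mJ). The function names are illustrative.

```python
RF_POWER_W = 1.17              # average transmission power of the RF module (1170 mW)
RF_RATE_BPS = 11 * 2**20       # assumption: "11 Mbps" taken as 11 x 2^20 bit/s
FRAME_BITS = 3 * 768 * 512 * 8 # one uncompressed three-channel frame


def transmission_time_s(frame_bits, fcr):
    """Time needed to send a frame compressed by the given FCR."""
    return frame_bits / fcr / RF_RATE_BPS


def rf_energy_mj(frame_bits, fcr):
    """RF energy per frame in mJ: average power times transmission time."""
    return RF_POWER_W * transmission_time_s(frame_bits, fcr) * 1000.0


for name, fcr in [("uncompressed", 1.0), ("1D SPIHT", 2.04),
                  ("FALC", 3.37), ("proposed", 4.9)]:
    print(f"{name:13s} {transmission_time_s(FRAME_BITS, fcr):.3f} s "
          f"{rf_energy_mj(FRAME_BITS, fcr):6.1f} mJ")
```

Because the RF energy is inversely proportional to the FCR, raising the FCR from 2.04 to 4.9 cuts the transmission energy by more than half, which is why the compression ratio dominates the total energy of the low-complexity methods.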
Table 4 Energy complexity of each compression algorithm (sum of microprocessor operation and memory accesses)

| Type | SPIHT [18] | L-BCWT [23] | ZM-SPECK [25] | VLC_NUQ [26] | FALC [27] | Proposed method | 8L JPEG [40] | 4L H.264 [13] | 4L HEVC [15] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| #CPU operations (energy cost, mJ/frame) | 25 × 3MN (14.7) | 15 × 3MN (8.8) | 13.9 × 3MN (8.1) | 38 × 3MN (22.4) | 16 × 3MN (9.4) | 58 × 3MN (34.2) | 277.6 × 3MN (163.7) | 548 × 3MN (323) | 653 × 3MN (385) |
| #Memory accesses (energy cost, mJ/frame) | 15.5 × 3MN (10.2) | 9.5 × 3MN (6.2) | 12 × 3MN (7.9) | 9 × 3MN (5.9) | 6.5 × 3MN (4.2) | 43.6 × 3MN (28.8) | 37.4 × 3MN (24.7) | 194 × 3MN (128) | 334 × 3MN (221) |
| Average FCR (RF energy cost, mJ/frame) | 2.04:1 (469) | 2.04:1 (469) | 2.37:1 (404) | 3.25:1 (295) | 3.37:1 (284) | 4.9:1 (195) | 11.47:1 (83.4) | 10.49:1 (91) | 12.5:1 (76.5) |
| Transmission duration (s) | 0.401 | 0.401 | 0.345 | 0.252 | 0.243 | 0.167 | 0.071 | 0.078 | 0.065 |
| Total energy cost (mJ/frame) | 494.2 | 484.4 | 420 | 322.9 | 297.8 | 258.4 | 271.9 | 542.6 | 682.4 |

M: image width; N: image height. CPU operation energy cost: 62.5 pJ/bit; memory access energy cost: 70 pJ/bit; average RF power consumption: 1170 mW; RF bandwidth: 11 Mbps
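The processing energies in Table 4 can be reproduced from the operation and access coefficients and the per-bit costs in the table footnote. The sketch below assumes that each counted operation or access handles one 8-bit sample; this assumption is not stated explicitly, but it matches the tabulated values. The coefficients and RF energies are copied from Table 4, and the function names are illustrative.

```python
CPU_PJ_PER_BIT = 62.5   # CPU operation energy cost (Table 4 footnote)
MEM_PJ_PER_BIT = 70.0   # memory access energy cost (Table 4 footnote)
SAMPLES = 3 * 768 * 512 # the "3MN" factor for a 768 x 512 color frame
BITS_PER_SAMPLE = 8     # assumption: one 8-bit sample per operation or access


def cost_mj(count_per_sample, pj_per_bit):
    """Energy in mJ for count_per_sample x 3MN operations or accesses."""
    picojoules = count_per_sample * SAMPLES * BITS_PER_SAMPLE * pj_per_bit
    return picojoules * 1e-9  # 1 pJ = 1e-9 mJ


# (CPU coefficient, memory coefficient, RF energy in mJ) copied from Table 4.
METHODS = {
    "SPIHT": (25, 15.5, 469),
    "L-BCWT": (15, 9.5, 469),
    "ZM-SPECK": (13.9, 12, 404),
    "VLC_NUQ": (38, 9, 295),
    "FALC": (16, 6.5, 284),
    "Proposed": (58, 43.6, 195),
    "8L JPEG": (277.6, 37.4, 83.4),
    "4L H.264": (548, 194, 91),
    "4L HEVC": (653, 334, 76.5),
}


def total_energy_mj(name):
    """Sum of CPU, memory, and RF energy per frame for one method."""
    cpu, mem, rf = METHODS[name]
    return cost_mj(cpu, CPU_PJ_PER_BIT) + cost_mj(mem, MEM_PJ_PER_BIT) + rf


for name in METHODS:
    print(f"{name:9s} {total_energy_mj(name):6.1f} mJ/frame")
print("lowest total:", min(METHODS, key=total_energy_mj))
```

The totals agree with the last row of Table 4 to within rounding of the RF energies, and the minimum falls on the proposed method.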

Fig. 13 Average compression ratio results for each compression method (CR compression ratio)

Using only one additional line memory and higher operation counts than the FALC method, the proposed compression method achieved a better compression ratio of 4.9:1 than either the VLC_NUQ [26] or the FALC method, which reduced its RF transmission energy to 195 mJ. Furthermore, because of its simple implementation, the proposed method consumed far less processing energy than the 8L JPEG, 4L H.264, and 4L HEVC methods (Table 4). As a result, the proposed method had the lowest total energy cost among the tested methods, 258.4 mJ per frame, and thus showed the best energy efficiency.

4.2 BRC tests for split mode in a still image and scene changes in an image sequence

The proposed bit rate control method and existing techniques were compared in terms of image quality while holding the target compression ratio (TCR) at exactly 4.0. For a fair comparison, the existing bit rate control techniques were implemented on top of the proposed compression method. Still images and image sequences were fed to the algorithms under test to evaluate the consistency of image quality. The 24 Kodak still test images [36] were used for the BRC experiments (see Fig. 12). The experimental image sequences were composed of several merged images with different characteristics to evaluate image quality consistency in split mode with scene changes. These images were obtained from the HEVC common test sequences [42] and overlaid on the text image.

Table 5 shows the frame compression ratios (FCRs) and PSNRs of the bit rate control techniques for the Kodak still images [36]. The average PSNR of Edirisinghe’s BRC [29] was 39.77 dB with an average FCR of 4.03. The average PSNR of Jiang’s BRC [30] was 40.36 dB with an average FCR of 4.37, which overshoots the target compression ratio of 4.0. This excessive compression caused a severe imbalance of image quality within one frame. The average FCR of the proposed BRC was 4.02, the closest to the TCR of 4.0. The average PSNR of the proposed BRC was 41.39 dB, which was higher than those of Edirisinghe’s BRC [29] and Jiang’s BRC [30] by 1.62 dB and 1.03 dB, respectively.
Table 5 Frame compression ratio and PSNR of bit rate control techniques for Kodak still images [36] (TCR: 4.0)

| Algorithm | Best PSNR (dB) | Best FCR | Worst PSNR (dB) | Worst FCR | Average PSNR (dB) | Average FCR |
| --- | --- | --- | --- | --- | --- | --- |
| Proposed | 46.71 | 4.00 | 33.93 | 4.02 | 41.39 | 4.02 |
| Edirisinghe’s [29] | 46.54 | 4.01 | 29.11 | 4.06 | 39.77 | 4.03 |
| Jiang’s [30] | 45.83 | 4.43 | 33.91 | 4.05 | 40.36 | 4.37 |

FCR frame compression ratio

Figure 14 depicts the image sequence used to test image quality consistency. The sequence was composed of four consecutive images with a scene change in the middle. Here, t0–t3 denote the sequence numbers of the images, while R0–R3 denote four equally divided regions of an image from top to bottom. Edirisinghe’s BRC [29] failed to achieve the target compression ratio in t0, t2, and t3. When the scene changed in t2, its PSNRs in regions R2 and R3 were very low, especially in the text region (Fig. 14).
Fig. 14 Image sequences to test image quality consistency (Kimono 1920 × 1080, BasketballDrive + Text 1920 × 1080). t2 shown in Fig. 9
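The region-wise quality measure used in this test can be sketched as follows: the frame is split into four equal horizontal bands (R0–R3 from top to bottom) and the PSNR is evaluated on each band separately. The sketch assumes the standard PSNR definition with an 8-bit peak value of 255 and represents an image as a list of rows of sample values; the function names and the toy data are illustrative and are not taken from the authors' implementation.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized images given as lists of rows."""
    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for ref_value, dec_value in zip(ref_row, dec_row):
            squared_error += (ref_value - dec_value) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical regions have no measurable distortion
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)


def region_psnrs(reference, decoded, regions=4):
    """PSNR of each of `regions` equal horizontal bands, from top to bottom."""
    height = len(reference)
    bounds = [height * i // regions for i in range(regions + 1)]
    return [psnr(reference[top:bottom], decoded[top:bottom])
            for top, bottom in zip(bounds, bounds[1:])]


# Toy 8 x 4 frame: the distortion grows from R0 to R2, and R3 is lossless.
reference = [[100] * 4 for _ in range(8)]
decoded = [[100 + error] * 4 for error in (1, 1, 2, 2, 4, 4, 0, 0)]
for index, value in enumerate(region_psnrs(reference, decoded)):
    print(f"R{index}: {value:.2f} dB")
```

A large spread between the band values indicates the kind of quality imbalance within one frame that the bit rate control is intended to avoid.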

BRC results for the test image sequences are shown in Table 6. Jiang’s BRC [30] compressed the test images excessively because its compression ratio prediction failed. Consequently, image quality in t2 and t3 was severely degraded, with PSNRs falling to approximately 32 dB in region R3.
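Table 6 BRC results for testing image sequences (R0–R3 and Total: PSNR in dB)

| BRCs | Seq. | TCR1 | TCR2 | R0 | R1 | R2 | R3 | Total | FCR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Proposed | t0 | 4.00 | | 43.85 | 44.24 | 44.40 | 43.33 | 43.93 | 4.00 |
| Proposed | t1 | 4.05 | | 43.95 | 44.23 | 44.42 | 42.97 | 43.85 | 4.00 |
| Proposed | t2 | 4.05 | 4.08 | 42.35 | 42.56 | 38.38 | 36.21 | 39.04 | 4.08 |
| Proposed | t3 | 4.55 | 4.00 | 41.26 | 41.36 | 37.84 | 41.06 | 40.10 | 4.03 |
| Edirisinghe’s [29] | t0 | 4.00 | | 43.15 | 43.83 | 43.95 | 44.80 | 43.89 | 3.99 |
| Edirisinghe’s [29] | t1 | 4.00 | | 43.35 | 43.30 | 44.59 | 44.28 | 43.84 | 4.01 |
| Edirisinghe’s [29] | t2 | 4.00 | | 42.43 | 42.20 | 20.54 | 22.49 | 24.37 | 3.96 |
| Edirisinghe’s [29] | t3 | 4.00 | | 42.58 | 42.30 | 25.24 | 36.94 | 30.82 | 3.95 |
| Jiang’s [30] | t0 | 4.00 | | 41.84 | 40.47 | 41.82 | 43.62 | 41.79 | 5.16 |
| Jiang’s [30] | t1 | 4.00 | | 42.27 | 40.87 | 42.17 | 43.64 | 42.13 | 4.99 |
| Jiang’s [30] | t2 | 4.00 | | 41.80 | 37.64 | 35.82 | 31.87 | 35.41 | 5.24 |
| Jiang’s [30] | t3 | 4.00 | | 40.81 | 38.01 | 34.96 | 31.88 | 35.19 | 5.41 |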

In contrast, the proposed BRC maintained the PSNR of t2 even under the scene change. The PSNRs of the proposed BRC for regions R2 and R3 were 38.38 dB and 36.21 dB in t2 and 37.84 dB and 41.06 dB in t3, respectively. These results show that the proposed method outperformed the existing techniques in terms of PSNR. Moreover, the legibility of the text region with the proposed BRC was superior to that of the existing BRC techniques, as shown in Fig. 15.
Fig. 15 Results of text line images in split mode. a Input image. b Edirisinghe’s [29] result. c Proposed BRC’s result

5 Conclusions

In this paper, a compression method with bit rate control was proposed for power reduction in multimedia IoT environments. In consideration of the limited power supply of IoT devices, the proposed method uses a low-complexity algorithm with as few operations and memory accesses as possible.

Based on our test results, the proposed method achieved a higher compression ratio than SPIHT, L-BCWT, ZM-SPECK, VLC_NUQ, and FALC while requiring far less processing energy than the 8L JPEG, 4L H.264, and 4L HEVC methods. At the system level, it consumed less total energy than all of the compared methods, from SPIHT to 4L HEVC. Its bit rate control also maintained higher and more consistent image quality in situations that are problematic for existing line compression methods, namely quality imbalance within a frame, image split mode, and scene changes.

Future work includes optimizing the system implementation and developing transmission error recovery techniques to extend the proposed technique to various IoT applications.

Abbreviations

1L mode: One-line compression mode
1D: One-dimensional
2L mode: Two-line compression mode
2D: Two-dimensional
BRC: Bit rate control
CABAC: Context-adaptive binary arithmetic coding
CR: Compression ratio
DPCM: Differential pulse code modulation
DWT: Discrete wavelet transform
ECG: Electrocardiogram
FALC: Frequency adaptive line compression
FCEE: Frequency component entropy encoding
FCR: Frame compression ratio
HM: HEVC test model
HVS: Human visual system
IoT: Internet of things
JM: Joint model
L-BCWT: Line-based backward coding of wavelet trees
LR: Line compression ratio
MED: Median edge detection
OS: Operating system
PSNR: Peak signal-to-noise ratio
QL: Quantization level
QLD: Quantization level determination
RF: Radio frequency
RLP: Remaining line process
ROI: Region of interest
SAD: Sum of absolute difference
SPIHT: Set partitioning in hierarchical trees
TCRs: Target compression ratios
TH: Threshold value
uProc: Microprocessor
VLC: Variable length coding
VLC_NUQ: Visually lossless compression non-uniform quantization
VLSI: Very large-scale integration
ZM-SPECK: Zero memory set partitioned embedded block
zRLE: Zero run-length encoding

Declarations

Acknowledgements

SW Lee and HY Kim are co-first authors. They contributed equally to this study. The work reported in this paper was conducted during the sabbatical year of Kwangwoon University in 2018.

Funding

This work was partly supported by a grant (2016-0-00421) of Institute for Information & Communications Technology Promotion (IITP) funded by the Korea government (MSIP). It was also supported by a grant (2017R1D1A1B03036361) of the Basic Science Research Program through the NRF (National Research Foundation) of Korea funded by the Ministry of Education, a grant (10080649) funded by the MOTIE (Ministry of Trade, Industry & Energy), and KSRC (Korea Semiconductor Research Consortium) support program for the development of future semiconductor device.

Availability of data and materials

The conclusion and comparison data of this article are included within the article.

Authors’ contributions

HY implemented the proposed algorithm, carried out all experiments, and drafted the manuscript. SL conceived the study, designed the proposed algorithm and experiments, and helped draft the manuscript. All authors read and approved the final manuscript.

Authors’ information

Seong-Won Lee: He received B.S. and M.S. degrees in Control and Instrumentation Engineering from Seoul National University, Korea, in 1988 and 1990, respectively. He obtained Ph.D. degree in Electrical Engineering from University of Southern California, Los Angeles, CA, in 2003. From 1990 to 2004, he worked on VLSI/System-on-Chip (SoC) design at Samsung Electronics Co., Ltd., Korea. Since 2005, he has been a Professor in Computer Engineering Department, Kwangwoon University, Seoul, Korea. His research interests include image signal processing, signal processing SoC, and computer architecture.

Ho-Young Kim: He received B.S. and M.S. degrees in Computer Engineering from Kwangwoon University, Seoul, Korea, in 2011 and 2013, respectively. He is currently pursuing a Ph.D. degree in Computer Engineering at Kwangwoon University, Seoul, Korea. His research interests include image signal processing, signal processing SoC, and energy management system.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
Department of Computer Engineering, Kwangwoon University, 20, Gwangun-ro, Nowon-gu, Seoul, Republic of Korea

References

  1. M.C. Kang, S.H. Chae, J.Y. Sun, S.H. Lee, S.J. Ko, An enhanced obstacle avoidance method for the visually impaired using deformable grid. IEEE Trans. Consum. Electron. 63(2), 169–177 (2017)
  2. J. Ruminski, A. Bujnowski, T. Kocejko, J. Wtorek, A. Andrushevich, M. Biallas, R. Kistler, Performance analysis of interaction between smart glasses and smart objects using image-based object identification. Int. J. Distributed Sensor Networks 12(3), 1–14 (2016)
  3. H. Qian, J. Huang, L. Ma, The design challenges for unmanned vehicular video streaming. in Proc. Int. Conf. Vehicular Electronics and Safety (ICVES), 19–24 (2011)
  4. A. Tikanmaki, T. Bedrnik, R. Raveendran, J. Roning, The remote operation and environment reconstruction of outdoor mobile robots using virtual reality. in Proc. Int. Conf. Mechatronics and Automation (ICMA), 1526–1531 (2017)
  5. S.L. Chen, T.Y. Liu, C.W. Shen, M.C. Tuan, VLSI implementation of a cost-efficient near-lossless CFA image compressor for wireless capsule endoscopy. IEEE Access 4, 10235–10245 (2016)
  6. A. Mammeri, B. Hadjou, A. Khoumsi, A survey of image compression algorithms for visual sensor networks. J. ISRN Sensor Networks 2012(760320), 1–19 (2012)
  7. T. Ma, M. Hempel, D. Peng, H. Sharif, A survey of energy-efficient compression and communication techniques for multimedia in resource constrained systems. IEEE Commun. Surveys and Tutorials 15(3), 963–972 (2013)
  8. H. ZainEldin, M. Elhosseini, H. Ali, Image compression algorithms in wireless multimedia sensor networks: a survey. Int. J. Ain Shams Engineering 6(2), 481–490 (2015)
  9. F. Walls, A. MacInnis, VESA display stream compression for television and cinema applications. IEEE J. Emerging and Selected Topics in Circuits and Systems 6(4), 460–470 (2016)
  10. J.I. Odagiri, Y. Nakano, S. Yoshida, Video compression technology for in-vehicle image transmission: SmartCODEC. J. FUJITSU Scientific&Technical 43(4), 469–474 (2007)
  11. H.H. Chou, Y.J. Chen, Y.C. Shiau, T.S. Kuo, An effective and efficient compression algorithm for ECG signals with irregular periods. IEEE Trans. Biomedical Engineering 53(6), 1198–1205 (2006)
  12. D. Cruz, R. Grosbois, T. Ebrahimi, JPEG 2000 performance evaluation and assessment. Int. J. Signal Process. Image Commun. 17(1), 113–130 (2002)
  13. ITU-T and ISO/IEC JTC 1, Advanced video coding for generic audiovisual services. ISO/IEC 14496-10, ITU-T Rec. H.264) Version 1–25 April 2017
  14. Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, H.264/AVC reference software JM 19.0, June 2015. http://iphome.hhi.de/suehring/tml. Accessed 22 Dec 2017
  15. ITU-T and ISO/IEC, High Efficiency Video Coding, ITU-T Rec. H.265 and ISO/IEC 23008-2 (HEVC) Version 1-4 Dec. 2016
  16. C Rosewarne, B Bross, M Naccari, K Sharman, G Sullivan, High efficiency video coding (HEVC) test model 16 (HM 16) improved encoder description update 2. JCTVC-N15139 Feb 2015
  17. L. Lacassagne, D. Etiemble, S.A.O. Kablia, 16-bit floating point instructions for embedded multimedia applications. Paper presented at the 7th International Workshop on Computer Architecture for Machine Perception (CAMP'05), Palermo, Italy, 4–6 July 2005
  18. Z. Lu, D. Kim, W. Pearlman, Wavelet compression of ECG signals by the set partitioning in hierarchical trees algorithm. IEEE Trans. Biomed. Eng. 47(7), 849–856 (2000)
  19. W. Pearlman, A. Islam, N. Nagaraj, A. Said, Efficient, low-complexity image coding with a set-partitioning embedded block coder. IEEE Trans. Circuits and Systems for Video Technol. 14(11), 1219–1235 (2004)
  20. M. Akter, M. Reaz, F. Yasin, F. Choong, A modified set partitioning in hierarchical trees algorithm for real-time image compression. J. Commun. Technol. and Electron. 53(6), 642–650 (2008)
  21. R. Senapati, U. Pati, K. Mahapatra, Listless block-tree set partitioning algorithm for very low bit rate embedded image compression. Int. J. Electron. Commun. 66(12), 985–995 (2012)
  22. M. Tausif, N. Kidwai, E. Khan, M. Reisslein, FrWF-based LMBTC: memory-efficient image coding for visual sensors. IEEE J. Sensors 15(11), 6218–6228 (2015)
  23. L. Ye, J. Guo, B. Nutter, S. Mitra, Memory-efficient image codec using line-based backward coding of wavelet trees. in Proc. Data Compress. Conf. (DCC), 213–222 (2007)
  24. L. Ye, J. Guo, B. Nutter, S. Mitra, Low-memory-usage image coding with line-based wavelet transform. J. Opt. Eng. 50(2), 027005–027001 (2011)
  25. N. Kidwai, E. Khan, M. Reisslein, ZM-SPECK: a fast and memoryless image coder for multimedia sensor networks. IEEE J. Sensors 16(8), 2575–2587 (2016)
  26. D. Vo, S. Lertrattanapanich, Y.T. Kim, Low line memory visually lossless compression for color images using non-uniform quantizers. IEEE Trans. Consum. Electron. 57(1), 187–195 (2011)
  27. H.Y. Kim, J.H. Cho, J. Cho, S.W. Lee, A frequency adaptive line compression system for mobile display devices. J. IEICE Electronics Express 11(19), 20140746 (2014)
  28. Datasheet, Intel Galileo Gen 2 Development Board. (Intel Co., 2014), https://www.rutronik.com/fileadmin/Micropages/Intel/intelgalileogen2prodbrief_330736_003.pdf. Accessed 22 Dec 2017
  29. E. Edirisinghe, S. Bedi, Variation of JPEG-LS to low cost rate control and its application in region-of-interest based coding. Int. J. Signal Process. Image Commun. 18(5), 357–372 (2003)
  30. J. Jiang, A low-cost content-adaptive and rate-controllable near-lossless image codec in DPCM domain. IEEE Trans. Image Processing 9(4), 543–554 (2000)
  31. J.S. Ham, H.Y. Kim, S.W. Lee, A consistent quality bit rate control for the line-based compression. IEIE Trans. Smart Processing & Computing 5(5) (2016)
  32. Savithra Eratne, Mahinda Alahakoon, Fast predictive wavelet transform for lossless image compression, in Proc. International Conference on Industrial and Information Systems (ICIIS) (University of Peradeniya, Sri Lanka, 2009) https://ieeexplore.ieee.org/document/5429833/
  33. K Zotos, A Litke, A Chatzigeorgiou, S Nikolaidis, G Stephanides, Energy complexity of software in embedded systems. in Proc. IASTED Int. Conf. Autom. Control Appl. (ACIT-ACA) (Novosibirsk, 2005)
  34. V. Konstantakos, A. Chatzigeorgiou, S. Nikolaidis, T. Laopoulos, Energy consumption estimation in embedded systems. IEEE Trans. Instrum. Meas. 57(4), 797–804 (2008)
  35. Tao Li, Lizy Kurian John, Run-time modeling and estimation of operating system power consumption, in Proc. the 2003 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, (2003), 160–171
  36. Kodak Lossless True Color Image Suite. http://r0k.us/graphics/kodak. Accessed 22 Dec 2017
  37. Datasheet, Intel Quark SoC X1000 329676-005US (Intel Co., 2015), pp. 69–71, https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/quark-x1000-datasheet.pdf. Accessed 22 Dec 2017
  38. K Malladi, F Nothaft, K Periyathambi, B Lee, C Kozyrakis, M Horowitz, Towards energy-proportional datacenter memory with mobile DRAM. in Proc. IEEE 39th International Symposium on Computer Architecture (ISCA), (2012), pp. 37–48
  39. Datasheet, BCM43340. (Broadcom Co., 2015), pp. 135, http://www.mouser.com/ds/2/100/002-14943_0I_V-961661.pdf. Accessed 22 Dec 2017
  40. G.K. Wallace, The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 38(1), xviii–xxxiv (1992)
  41. Gokul Govindu, Ling Zhuo, Seonil Choi, Padma Gundala, Viktor K Prasanna, Area, and power performance analysis of a floating-point based application on FPGAs, in Proc. 7th Annual Workshop on High Performance Embedded Computing (HPEC 2003) (MIT Lincoln Laboratory, 2003)
  42. F Bossen, Common test conditions and software reference configurations. in document JCTVC-L1100 (Geneva, 2013) http://phenix.it-sudparis.eu/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1100-v1.zip. Accessed 22 Dec 2017

Copyright

© The Author(s). 2018
