Separable 4D structure (existing I)
Figure 4 shows the 9/7-type separable 4D integer WT. In the JPEG 2000 standard, the 1D processing shown in Fig. 2 is applied to a 4D signal along the x, y, z, and t dimensions, where x and y denote two spatial dimensions within a slice, z denotes the third spatial dimension within a volume, and t denotes the fourth, temporal, dimension. However, the separable 4D structure increases the number of rounding operators in the transform. This structure has 192 rounding operators.
For a 4D input signal X(z), the transform splits the input signal into 16 channels, X0000, X0001, X0010, X0011, X0100, X0101, X0110, X0111, X1000, X1001, X1010, X1011, X1100, X1101, X1110, and X1111 as shown in Fig. 3. It is denoted as
$$ \left[\begin{array}{c}\begin{array}{c}{X}_{0000}\left(\mathbf{z}\right)\\ {}{X}_{0010}\left(\mathbf{z}\right)\end{array}\\ {}\begin{array}{c}\begin{array}{c}{X}_{0100}\left(\mathbf{z}\right)\\ {}{X}_{0011}\left(\boldsymbol{z}\right)\end{array}\\ {}\begin{array}{c}\begin{array}{c}\vdots \\ {}{X}_{1110}\left(\mathbf{z}\right)\end{array}\\ {}{X}_{1111}\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right]=\left[\begin{array}{c}\downarrow {2}_D\left[\left[\begin{array}{c}1\\ {}{z}_D\end{array}\right]{W}_1\left(\mathbf{z}\right)\right]\\ {}\begin{array}{c}\downarrow {2}_D\left[\left[\begin{array}{c}1\\ {}{z}_D\end{array}\right]{W}_2\left(\mathbf{z}\right)\right]\\ {}\vdots \\ {}\downarrow {2}_D\left[\left[\begin{array}{c}1\\ {}{z}_D\end{array}\right]{W}_8\left(\mathbf{z}\right)\right]\end{array}\end{array}\right], $$
(15)
where
$$ \left[\begin{array}{c}{W}_1\left(\mathbf{z}\right)\\ {}\begin{array}{c}{W}_2\left(\mathbf{z}\right)\\ {}{W}_3\left(\mathbf{z}\right)\\ {}\begin{array}{c}{W}_4\left(\mathbf{z}\right)\\ {}{W}_5\left(\mathbf{z}\right)\\ {}\begin{array}{c}{W}_6\left(\mathbf{z}\right)\\ {}{W}_7\left(\mathbf{z}\right)\\ {}{W}_8\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\end{array}\right]=\left[\begin{array}{c}\downarrow {2}_C\left[\left[\begin{array}{c}1\\ {}{z}_C\end{array}\right]{V}_1\left(\mathbf{z}\right)\right]\\ {}\begin{array}{c}\downarrow {2}_C\left[\left[\begin{array}{c}1\\ {}{z}_C\end{array}\right]{V}_2\left(\mathbf{z}\right)\right]\\ {}\downarrow {2}_C\left[\left[\begin{array}{c}1\\ {}{z}_C\end{array}\right]{V}_3\left(\mathbf{z}\right)\right]\\ {}\downarrow {2}_C\left[\left[\begin{array}{c}1\\ {}{z}_C\end{array}\right]{V}_4\left(\mathbf{z}\right)\right]\end{array}\end{array}\right], $$
(16)
$$ \left[\begin{array}{c}\begin{array}{c}{V}_1\left(\mathbf{z}\right)\\ {}{V}_2\left(\mathbf{z}\right)\end{array}\\ {}{V}_3\left(\mathbf{z}\right)\\ {}{V}_4\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{c}\downarrow {2}_B\left[\left[\begin{array}{c}1\\ {}{z}_B\end{array}\right]{P}_1\left(\mathbf{z}\right)\right]\\ {}\downarrow {2}_B\left[\left[\begin{array}{c}1\\ {}{z}_B\end{array}\right]{P}_2\left(\mathbf{z}\right)\right]\end{array}\right] $$
$$ \left[\begin{array}{c}{P}_1\left(\mathbf{z}\right)\\ {}{P}_2\left(\mathbf{z}\right)\end{array}\right]=\downarrow {2}_A\left[\left[\begin{array}{c}1\\ {}{z}_A\end{array}\right]X\left(\mathbf{z}\right)\right] $$
and
$$ \left[\begin{array}{c}\downarrow {2}_A\left[X\left(\mathbf{z}\right)\right]\\ {}\downarrow {2}_B\left[X\left(\mathbf{z}\right)\right]\\ {}\begin{array}{c}\downarrow {2}_C\left[X\left(\mathbf{z}\right)\right]\\ {}\downarrow {2}_D\left[X\left(\mathbf{z}\right)\right]\end{array}\end{array}\right]=\left[\begin{array}{c}\frac{1}{Q}{\sum}_{p=0}^{Q-1}X\left({z}_A^{1/Q}\bullet {W}_Q^p,{z}_B,{z}_C,{z}_D\right)\\ {}\frac{1}{Q}{\sum}_{p=0}^{Q-1}X\left({z}_A,{z}_B^{1/Q}\bullet {W}_Q^p,{z}_C,{z}_D\right)\\ {}\begin{array}{c}\frac{1}{Q}{\sum}_{p=0}^{Q-1}X\left({z}_A,{z}_B,{z}_C^{1/Q}\bullet {W}_Q^p,{z}_D\right)\\ {}\frac{1}{Q}{\sum}_{p=0}^{Q-1}X\left({z}_A,{z}_B,{z}_C,{z}_D^{1/Q}\bullet {W}_Q^p\right)\end{array}\end{array}\right]\bullet {2}^F, $$
(17)
for
$$ X\left(\mathbf{z}\right)={\sum}_{n_1=0}^{N_1-1}{\sum}_{n_2=0}^{N_2-1}{\sum}_{n_3=0}^{N_3-1}{\sum}_{n_4=0}^{N_4-1}X\left(\mathbf{n}\right){z}_A^{-{n}_1}{z}_B^{-{n}_2}{z}_C^{-{n}_3}{z}_D^{-{n}_4}, $$
(18)
where z = (zA, zB, zC, zD) and n = (n1, n2, n3, n4).
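The 16-channel split in Eqs. (15) and (16) can be illustrated with a minimal NumPy sketch. It assumes the 4D signal is stored as an array with axes ordered (x, y, z, t), that each bit of a channel label gives the sample parity along one of those axes in that order, and that the gain factor in Eq. (17) is ignored. The function name `polyphase_split_4d` is hypothetical and is not taken from the paper.

```python
import itertools
import numpy as np

def polyphase_split_4d(x):
    """Split a 4D array into 16 channels keyed by a 4-bit parity label.

    Assumption: bit i of the label is the parity (0 = even, 1 = odd)
    of the sample index along axis i, with axes ordered (x, y, z, t).
    """
    channels = {}
    for bits in itertools.product((0, 1), repeat=4):
        # slice(b, None, 2) keeps every second sample starting at offset b
        index = tuple(slice(b, None, 2) for b in bits)
        channels["".join(str(b) for b in bits)] = x[index]
    return channels

x = np.arange(4 * 4 * 4 * 4).reshape(4, 4, 4, 4)
channels = polyphase_split_4d(x)
print(len(channels), channels["0000"].shape)  # 16 (2, 2, 2, 2)
```

Each channel holds one sixteenth of the samples, so the split itself involves no arithmetic and no rounding.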
In the JPEG 2000 standard, applying the 1st, 2nd, 3rd, and 4th lifting steps in the spatial dimension x with
$$ \left[\begin{array}{cc}{A}_1\left(\mathbf{z}\right)& {A}_3\left(\mathbf{z}\right)\\ {}{A}_2\left(\mathbf{z}\right)& {A}_4\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{cc}{h}_1\left(1+{z}_A^{+1}\right)& {h}_3\left(1+{z}_A^{+1}\right)\\ {}{h}_2\left(1+{z}_A^{-1}\right)& {h}_4\left(1+{z}_A^{-1}\right)\end{array}\right],\kern0.5em $$
(19)
and the 5th, 6th, 7th, and 8th lifting steps in the spatial dimension y with
$$ \left[\begin{array}{cc}{B}_1\left(\mathbf{z}\right)& {B}_3\left(\mathbf{z}\right)\\ {}{B}_2\left(\mathbf{z}\right)& {B}_4\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{cc}{h}_1\left(1+{z}_B^{+1}\right)& {h}_3\left(1+{z}_B^{+1}\right)\\ {}{h}_2\left(1+{z}_B^{-1}\right)& {h}_4\left(1+{z}_B^{-1}\right)\end{array}\right],\kern0.5em $$
(20)
and the 9th, 10th, 11th, and 12th lifting steps in the spatial dimension z with
$$ \left[\begin{array}{cc}{C}_1\left(\mathbf{z}\right)& {C}_3\left(\mathbf{z}\right)\\ {}{C}_2\left(\mathbf{z}\right)& {C}_4\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{cc}{h}_1\left(1+{z}_C^{+1}\right)& {h}_3\left(1+{z}_C^{+1}\right)\\ {}{h}_2\left(1+{z}_C^{-1}\right)& {h}_4\left(1+{z}_C^{-1}\right)\end{array}\right], $$
(21)
and the 13th, 14th, 15th, and 16th lifting steps in the temporal dimension t with
$$ \left[\begin{array}{cc}{D}_1\left(\mathbf{z}\right)& {D}_3\left(\mathbf{z}\right)\\ {}{D}_2\left(\mathbf{z}\right)& {D}_4\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{cc}{h}_1\left(1+{z}_D^{+1}\right)& {h}_3\left(1+{z}_D^{+1}\right)\\ {}{h}_2\left(1+{z}_D^{-1}\right)& {h}_4\left(1+{z}_D^{-1}\right)\end{array}\right],\kern0.5em $$
(22)
to the channel signals in (7), the transform outputs sixteen frequency band signals YLLLL(z), YLLLH(z), YLLHL(z), YLLHH(z), YLHLL(z), YLHLH(z), YLHHL(z), YLHHH(z), YHLLL(z), YHLLH(z), YHLHL(z), YHLHH(z), YHHLL(z), YHHLH(z), YHHHL(z), and YHHHH(z) as illustrated in Fig. 4. This is referred to as a separable structure. As it has a large number of rounding operators, there is a large volume of rounding noise in the transform. A non-separable 3D structure was thus proposed in [27]. However, when used for a 4D signal, the rounding noise in it significantly increases compared with that in a separable 4D structure. Thus, its coding performance is significantly affected by the rounding noise generated inside it.
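The separable structure described above can be sketched in a few lines of NumPy: the same four integer lifting steps, built from the filters in Eqs. (19) to (22), are applied along each of the four axes in turn, with a rounding after every step. This is an illustrative sketch under several assumptions rather than the paper's implementation: the values of h1 to h4 are taken to be the commonly quoted 9/7 lifting coefficients, the boundary is handled periodically with `np.roll` instead of symmetric extension, the scaling by k is omitted, and all function names are hypothetical.

```python
import numpy as np

# Assumed values for h1..h4 (commonly quoted 9/7 lifting coefficients).
H = (-1.586134342, -0.052980118, 0.882911075, 0.443506852)

def _r(v):
    """Rounding operator R[.]: round to the nearest integer."""
    return np.rint(v).astype(np.int64)

def lift_forward(x, axis):
    """Four integer lifting steps along one axis (periodic boundary assumed)."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2].copy(), x[1::2].copy()
    odd += _r(H[0] * (even + np.roll(even, -1, axis=0)))   # h1 (1 + z^{+1})
    even += _r(H[1] * (odd + np.roll(odd, 1, axis=0)))     # h2 (1 + z^{-1})
    odd += _r(H[2] * (even + np.roll(even, -1, axis=0)))   # h3 (1 + z^{+1})
    even += _r(H[3] * (odd + np.roll(odd, 1, axis=0)))     # h4 (1 + z^{-1})
    # Low band first, high band second, back on the original axis.
    return np.moveaxis(np.concatenate([even, odd], axis=0), 0, axis)

def lift_inverse(y, axis):
    """Undo lift_forward by reversing the steps and flipping the signs."""
    y = np.moveaxis(y, axis, 0)
    half = y.shape[0] // 2
    even, odd = y[:half].copy(), y[half:].copy()
    even -= _r(H[3] * (odd + np.roll(odd, 1, axis=0)))
    odd -= _r(H[2] * (even + np.roll(even, -1, axis=0)))
    even -= _r(H[1] * (odd + np.roll(odd, 1, axis=0)))
    odd -= _r(H[0] * (even + np.roll(even, -1, axis=0)))
    x = np.empty_like(y)
    x[0::2], x[1::2] = even, odd
    return np.moveaxis(x, 0, axis)

def separable_forward(x):
    """Apply the 1D lifting along x, y, z, and t in turn (16 lifting steps)."""
    y = np.asarray(x, dtype=np.int64)
    for axis in range(4):
        y = lift_forward(y, axis)
    return y

def separable_inverse(y):
    """Invert separable_forward by undoing the axes in reverse order."""
    x = np.asarray(y, dtype=np.int64)
    for axis in reversed(range(4)):
        x = lift_inverse(x, axis)
    return x

rng = np.random.default_rng(0)
signal = rng.integers(0, 256, size=(4, 4, 4, 4))
assert np.array_equal(separable_inverse(separable_forward(signal)), signal)
```

The round trip is exact because each step changes only one channel by a rounded function of the other, so the inverse can recompute and subtract the same rounded value. The rounding after every step keeps the transform reversible, but it is also where the rounding noise discussed above enters.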
Non-separable 3D structure (existing II)
Figure 5 shows the non-separable 3D structure of the integer WT for a 4D input signal, designed for the 9/7-type transform and based on the structure proposed in [28]. In the first to the fourth lifting steps, the 4D input signal, once it is decomposed into 16 channels, is filtered along the spatial dimension x as in Eq. (19), and each channel is then scaled and rounded as
$$ \left[\begin{array}{c}{X}_{0000}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0001}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}\vdots \\ {}{X}_{1111}^{(B)}\left(\mathbf{z}\right)\end{array}\end{array}\right]=\left[\begin{array}{c}R\left[{k}^{-1}{X}_{0000}^{(A)}\left(\mathbf{z}\right)\right]\\ {}R\left[{k}^{-1}{X}_{0001}^{(A)}\left(\mathbf{z}\right)\right]\\ {}\begin{array}{c}\vdots \\ {}R\left[{k}^{+1}{X}_{1111}^{(A)}\left(\mathbf{z}\right)\right]\end{array}\end{array}\right], $$
(23)
Then, from the fifth to the 12th lifting steps, the signals are transformed simultaneously in spatial dimensions y and z, and temporal dimension t using the non-separable 3D structure. For instance, the signal in YLHHH is produced as
$$ {X}_{0111}^{(D)}\left(\mathbf{z}\right)={X}_{0111}^{(B)}\left(\mathbf{z}\right)+R\left[{k}^{+3}{2}^{-F}{P}_{LHHH}^{(D)}\left(\mathbf{z}\right)\right] $$
(24)
for
$$ {P}_{LHHH}^{(D)}={\left[\begin{array}{c}{B}_1\left(\mathbf{z}\right){C}_1\left(\mathbf{z}\right){D}_1\left(\mathbf{z}\right)\\ {}{B}_1\left(\mathbf{z}\right){C}_1\left(\mathbf{z}\right)\\ {}\begin{array}{c}{B}_1\left(\mathbf{z}\right){D}_1\left(\mathbf{z}\right)\\ {}{B}_1\left(\mathbf{z}\right)\\ {}\begin{array}{c}{C}_1\left(\mathbf{z}\right){D}_1\left(\mathbf{z}\right)\\ {}{C}_1\left(\mathbf{z}\right)\\ {}{D}_1\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right]}^T\bullet \left[\begin{array}{c}{X}_{0000}^{(B)}\ \left(\mathbf{z}\right)\\ {}{X}_{0001}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0010}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0011}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0100}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0101}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0110}^{(B)}\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right]+{\left[\begin{array}{c}{B}_3\left(\mathbf{z}\right){C}_3\left(\mathbf{z}\right){D}_3\left(\mathbf{z}\right)\\ {}{B}_3\left(\mathbf{z}\right){C}_3\left(\mathbf{z}\right)\\ {}\begin{array}{c}{B}_3\left(\mathbf{z}\right){D}_3\left(\mathbf{z}\right)\\ {}{B}_3\left(\mathbf{z}\right)\\ {}\begin{array}{c}{C}_3\left(\mathbf{z}\right){D}_3\left(\mathbf{z}\right)\\ {}{C}_3\left(\mathbf{z}\right)\\ {}{D}_3\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right]}^T\bullet \left[\begin{array}{c}{X}_{0000}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0001}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0010}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0011}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0100}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0101}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0110}^{(B)}\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right], $$
(25)
in the fifth lifting step. In this step, 3D filtering with 3D memory access, B1(z)C1(z)D1(z), is used. In the sixth lifting step, YLHHL, YLHLH, and YLLHH are calculated as
$$ \left[\begin{array}{c}{X}_{0110}^{(D)}\left(\mathbf{z}\right)\\ {}{X}_{0101}^{(D)}\left(\mathbf{z}\right)\\ {}{X}_{0011}^{(D)}\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{c}{X}_{0110}^{(B)}\left(\mathbf{z}\right)+R\left[{k}^{+1}{2}^{-F}{P}_{LHHL}^{(D)}\left(\mathbf{z}\right)\right]\\ {}{X}_{0101}^{(B)}\left(\mathbf{z}\right)+R\left[{k}^{+1}{2}^{-F}{P}_{LHLH}^{(D)}\left(\mathbf{z}\right)\right]\\ {}{X}_{0011}^{(B)}\left(\mathbf{z}\right)+R\left[{k}^{+1}{2}^{-F}{P}_{LLHH}^{(D)}\left(\mathbf{z}\right)\right]\end{array}\right], $$
(26)
for
$$ \left[\begin{array}{c}{P}_{LHHL}^{(D)\prime}\left(\mathbf{z}\right)\\ {}{P}_{LHLH}^{(D)\prime}\left(\mathbf{z}\right)\\ {}{P}_{LLHH}^{(D)\prime}\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{ccccc}{B}_1\left(\mathbf{z}\right){C}_1\left(\mathbf{z}\right)& 0& {B}_1\left(\mathbf{z}\right)& {C}_1\left(\mathbf{z}\right)& {D}_2\left(\mathbf{z}\right)\\ {}{B}_1\left(\mathbf{z}\right){D}_1\left(\mathbf{z}\right)& {B}_1\left(\mathbf{z}\right)& 0& {D}_1\left(\mathbf{z}\right)& {C}_2\left(\mathbf{z}\right)\\ {}{C}_1\left(\mathbf{z}\right){D}_1\left(\mathbf{z}\right)& {C}_1\left(\mathbf{z}\right)& {D}_1\left(\mathbf{z}\right)& 0& {B}_2\left(\mathbf{z}\right)\end{array}\right]\left[\begin{array}{c}{X}_{0000}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0001}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0010}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0100}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0111}^{(B)}\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right], $$
(27)
$$ \left[\begin{array}{c}{P}_{LHHL}^{(D)\prime \prime}\left(\mathbf{z}\right)\\ {}{P}_{LHLH}^{(D)\prime \prime}\left(\mathbf{z}\right)\\ {}{P}_{LLHH}^{(D)\prime \prime}\left(\mathbf{z}\right)\end{array}\right]=\left[\begin{array}{ccccc}{B}_3\left(\mathbf{z}\right){C}_3\left(\mathbf{z}\right)& 0& {B}_3\left(\mathbf{z}\right)& {C}_3\left(\mathbf{z}\right)& {D}_4\left(\mathbf{z}\right)\\ {}{B}_3\left(\mathbf{z}\right){D}_3\left(\mathbf{z}\right)& {B}_3\left(\mathbf{z}\right)& 0& {D}_3\left(\mathbf{z}\right)& {C}_4\left(\mathbf{z}\right)\\ {}{C}_3\left(\mathbf{z}\right){D}_3\left(\mathbf{z}\right)& {C}_3\left(\mathbf{z}\right)& {D}_3\left(\mathbf{z}\right)& 0& {B}_4\left(\mathbf{z}\right)\end{array}\right]\left[\begin{array}{c}{X}_{0000}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0001}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0010}^{(B)}\left(\mathbf{z}\right)\\ {}\begin{array}{c}{X}_{0100}^{(B)}\left(\mathbf{z}\right)\\ {}{X}_{0111}^{(B)}\left(\mathbf{z}\right)\end{array}\end{array}\end{array}\right], $$
(28)
$$ \left\{\begin{array}{c}{P}_{LHHL}^{(D)}\left(\mathbf{z}\right)={P}_{LHHL}^{(D)\prime}\left(\mathbf{z}\right)+{P}_{LHHL}^{(D)\prime \prime}\left(\mathbf{z}\right)\\ {}{P}_{LHLH}^{(D)}\left(\mathbf{z}\right)={P}_{LHLH}^{(D)\prime}\left(\mathbf{z}\right)+{P}_{LHLH}^{(D)\prime \prime}\left(\mathbf{z}\right)\\ {}{P}_{LLHH}^{(D)}\left(\mathbf{z}\right)={P}_{LLHH}^{(D)\prime}\left(\mathbf{z}\right)+{P}_{LLHH}^{(D)\prime \prime}\left(\mathbf{z}\right)\end{array}\right., $$
(29)
where R[] denotes the rounding operation on a signal value. Similarly, the predictions of X1111, X1110, X1101, X1100, X1011, X1010, X1001, X1000, X0100, X0010, X0001, and X0000 are also independent. The total numbers of lifting steps and rounding operators in the non-separable structure were hence reduced from 16 to 12 and from 192 to 96, respectively, compared with the separable structure in Fig. 4. However, the quality of the decoded image was degraded by the rounding noise inside the transform in its integer implementation. The proposed methods solve this problem as explained below.
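The single-rounding idea behind Eqs. (24) and (25) can be illustrated with a short NumPy sketch of the prediction for the LHHH channel. It is a simplified illustration, not the structure of [28]: only the h1-based term of Eq. (25) is computed, the factors involving k and 2^{-F} are dropped, the value of h1 and the periodic boundary handling are assumptions, channel arrays are assumed to have axes ordered (x, y, z, t), and the function names are hypothetical.

```python
import numpy as np

H1 = -1.586134342  # assumed value of h1

def filt(c, axes):
    """Apply h1 (1 + z^{+1}) along each listed axis (periodic boundary assumed)."""
    c = c.astype(np.float64)
    for ax in axes:
        c = H1 * (c + np.roll(c, -1, axis=ax))
    return c

def lhhh_prediction(ch):
    """Sum the seven filtered channels, then round once.

    Axes 1, 2, 3 correspond to y (B), z (C), and t (D).
    """
    p = (filt(ch["0000"], (1, 2, 3))   # B1 C1 D1
         + filt(ch["0001"], (1, 2))    # B1 C1
         + filt(ch["0010"], (1, 3))    # B1 D1
         + filt(ch["0011"], (1,))      # B1
         + filt(ch["0100"], (2, 3))    # C1 D1
         + filt(ch["0101"], (2,))      # C1
         + filt(ch["0110"], (3,)))     # D1
    return np.rint(p).astype(np.int64)  # the only rounding in this step

def lift_lhhh(ch):
    """Forward step: add the rounded prediction to X0111."""
    return ch["0111"] + lhhh_prediction(ch)

def unlift_lhhh(ch, lifted):
    """Inverse step: the other channels are unchanged, so subtract the same value."""
    return lifted - lhhh_prediction(ch)

rng = np.random.default_rng(1)
ch = {format(i, "04b"): rng.integers(0, 256, size=(2, 2, 2, 2)) for i in range(8)}
lifted = lift_lhhh(ch)
assert np.array_equal(unlift_lhhh(ch, lifted), ch["0111"])
```

In this sketch the seven contributions share one rounding, whereas a cascade of 1D steps along y, z, and t would round after each step. This is the mechanism by which a non-separable step uses fewer rounding operators.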