1 Introduction

Reversible data hiding (RDH) is a technique to embed message bits into a cover image in a reversible manner. During the phase of extracting the hidden data, all pixels of the original cover image can be recovered without any distortion, which is very important for military and medical applications. In the past two decades, RDH has received much attention and there are many work presented to promote the development of this research.

The first and most significant research direction is the RDH in uncompressed cover image. The cover image considered in this direction is uncompressed grayscale image in bitmap format. The mainstream of the current methods to hide data into an uncompressed image in a reversible manner can be roughly grouped into three aspects: difference expansion (DE) methods, histogram shift (HS) methods, and prediction-error expansion (PEE) methods. The first DE method was proposed by Tian [1], where integer wavelet transform was first conducted on the cover image to generate difference values, and then, the difference values were expanded to create the space for embedding the secret data. Weng et al. [2] improved the DE method by using the invariability of the sum of pixel pairs and pairwise difference adjustment (PDA) technique. The local-prediction-based DE method was proposed by Dragoi and Coltuc [3], where a predictor of least square was computed on a sub-block that centered on the pixel and the corresponding prediction error is expanded based on the framework of DE. The first HS method was proposed by Ni et al. [4], where the bins less than a given integer are shifted toward the left by one to create a vacant bin for data embedding before the message bits were subtracted from the values at the bin of the given integer. Luo et al. [2 and Section 3 introduce the motivation and concrete steps of the proposed method, respectively. Section 4 presents the experimental results, and Section 5 concludes the paper.

2 Motivation of the proposed method

In this section, the related work of dual-image RDH with center-folding strategy [31] and our previous work [32] are briefly introduced, and meanwhile, its potential improvement, i.e., the motivation of the proposed method, is also presented.

In [31], Lu et al. proposed a dual-image RDH method based on center-folding strategy. Assume that x, x1, and x2 are the original cover pixel and its corresponding marked dual pixel, respectively. First, an integer parameter, l, was set to control the embedding capacity. Note that l was greater than or equal to 2 in [31], and then, l binary to-be-embedded secret bits, denoted by {m1, m2, …, m l }, were combined as a set and transferred into a decimal value, d. The value of d ranged from 0 to 2l − 1. Next, d was converted to a smaller value, d′, which was within the range of [−2l − 1, 2l − 1 − 1] by the center-folding strategy as

$$ {d}^{\prime }=d-{2}^{l-1}. $$
(1)

Then, dwas embedded into x to generate a pair of stego pixels, i.e., x1 and x2, by the following averaging operation:

$$ \left\{\begin{array}{c}{x}_1=x+\mathrm{floor}\left({d}^{\prime }/2\right)\\ {}{x}_2=x-\mathrm{ceil}\left({d}^{\prime }/2\right)\end{array}\right. $$
(2)

where functions floor and ceil indicate the rounding operation toward the directions of negative infinity and positive infinity, respectively. In the phase of message extraction and cover image recovery, x and d can be restored as

$$ \left\{\begin{array}{c}x=\mathrm{ceil}\left(\frac{x_1+{x}_2}{2}\right)\\ {}d={d}^{\prime }+{2}^{l-1}=\left({x}_1-{x}_2\right)+{2}^{l-1}\end{array}\right. $$
(3)

Finally, original message bits {m1, …, m k } can be restored through a decimal-to-binary conversion of d.

Recently, an improved work of [31] was proposed by us in [32]. Let us revisit the problem of restoring x from x1 and x2 as referred to (3). First, a coordinate system of x2 with respect to x1 was set up. In this way, for a designated cover pixel x, the shiftable coordinate for (x1, x2) can be derived as being located at the pair straight line of x2 = − x1 + 2x and x2 = − x1 + 2x − 1. Figure 1a shows a diagram of a coordinate system and the shiftable coordinates, and an enlarged view of its central region is also shown in Fig. 1b. It can be seen that the shiftable coordinates for embedding, in the case of k = 3, are (x, x), (x − 1, x), (x, x − 1), (x − 1, x + 1), (x + 1, x − 1), (x − 2, x + 1), (x + 1, x − 2), (x − 2, x + 2), and (x + 2, x − 2). To clarify the description, we allocated the position number, starting at 1 and increasing by increments of 1 to represent the shiftable coordinates. Note that, as the position number increases, the distortion of each pixel also increases. Due to the aim for all the RDH methods that generate less distortion with a fixed embedding rate, we set a parameter k in our previous work, which was an integer equal to or greater than 1 to make a trade-off between distortion and embedding rate. Following [31], the k-length message bits, {m1, m2,…, m k }, were converted into decimal number d, and different from [31], if d was equal to 2k − 1, an extra bit, mk + 1, was added to improve the capacity; d was updated as d = d + mk + 1 and then embedded into x through:

$$ \left\{\begin{array}{c}{x}_1=x+\mathrm{floor}\left(d/4\right),{x}_2={x}_1-\left(d/2\right),\mathrm{if}\ d\ \mathrm{is}\ \mathrm{an}\ \mathrm{even}\ \mathrm{number}\\ {}{x}_1=x-\mathrm{ceil}\left(d/4\right),{x}_2={x}_1+\mathrm{ceil}\left(d/2\right),\mathrm{otherwise}\kern5em \end{array}\right. $$
(4)
Fig. 1
figure 1

Diagram of a coordinate system and allowable coordinates for the shiftable position-based dual-image RDH method. a Entire coordinate. b An enlarged view of central region of a

Eventually, the decimal number d can be restored in a reverse manner in the data extraction procedure.

Now, we analyze the embedding strategy of our previous work from a different perspective. Actually, for each embedding with parameter k, we generated a one-to-one code table to map the specific combination of k-length or (k + 1)-length message bits, which is denoted by C, to its corresponding shiftable position number P. Note that the maximum position number for each k was 2k + 1. For the ease of intuitive comprehension, Table 1 lists the corresponding code table for k = 2 and k = 3. As seen in Table 1, the maximum position numbers for k = 2 and k = 3 are 5 and 9, respectively, and their average embedding bit lengths are 2.125 and 3.0625 bits, respectively. Indeed, according to our observations, the total number of allowable shift positions, denoted by n, is more significant and elaborate than parameter k in the practical embedding performance. Note that value of n can be set as any arbitrary odd number greater equal to or greater than 3, and when the values of k are low enough, such as k = 1 or 2, their corresponding n values are 3 or 5, respectively, and there is no gap between them. However, as the values of k increase, such as k = 3, 4, and 5, their corresponding n values are 9, 17, and 33. There are obvious gaps between 5 and 9, 9 and 17, and 17 and 33. For example, for k = 2 and 3, there is an optional setting of n = 7 between them, and similarly, for k = 3 and 4, there are triple optional settings of n = 11, 13, and 15 between them. Therefore, to improve the embedding efficiency, the code length parameter k is no longer of use in this paper; we now use the maximum position parameter n. The code table with respect to n can be generated in a similar manner as Table 1. Table 2 lists two examples, i.e., for n = 7 and 11. To better explain the key idea of the proposed method, an example of how to embed the message bits is shown in Fig. 2, where message bits {00}2 and {101}2 are embedded into the cover pixels 45 and 90, respectively. A more general framework is demonstrated in Section 3.

Table 1 One-to-one map** code table for parameter k
Table 2 One-to-one map** code table for parameter n
Fig. 2
figure 2

An example of how to embed message bits for a given parameter n

3 Proposed method

Based on the observation that the maximum allowable position number n is more significant than code length parameter k used in [32], in this paper, we propose a novel code-map** method to improve the embedding efficiency, especially when there is a requirement for a higher embedding rate. In addition, different from the previous methods [31, 32] that the parameters were selected manually, the optimal parameter n* for a fixed embedding rate can be selected automatically before data hiding and n* can be identified correctly in advance before data extraction. The proposed method consists of two procedures, i.e., embedding data and extracting data/image recovery. In addition, the issue of underflow/overflow also is addressed in this section. The detailed procedures are as follows.

3.1 Data embedding

For data embedding, the steps are as follows:

Step 1: Seek the optimal parameter n*

In almost all dual-image reversible data hiding methods, the research objective is to obtain lower distortion with a fixed embedding rate α or embedding capacity β. The relationship between α and β can be derived by

$$ \alpha =\frac{\beta }{M\times N\times 2} $$
(5)

where M and N are the row and column numbers of the cover image, respectively. According to (5), the optimization of embedding efficiency for a given β can be easily transferred to the optimization of that for a given α. Thus, in the remainder of this paper, we take the fixed embedding rate of α as an example. First, ignore the issue of underflow/overflow and set up the candidate parameter set n = 3, 5, 7, 9, …, i.e., n can be set as an arbitrary odd integer greater than 1. As observed from Tables 1 and 2, the shortest code length, t, in the code table for each candidate n can be determined by

$$ t=\mathrm{floor}\left({\log}_2(n)\right) $$
(6)

For instance, for n = 7, the shortest code is {00}2 with the length of 2, and while for n = 11, the shortest codes are {000}2, {001}2, {010}2, {011}2, and {100}2 with the length of 3. If t is given, the value of n belongs to the interval of (2t, 2t + 1). In other words, once the value of n is fixed, for each embedding on each pixel, we can embed t to t + 1 binary bits. It can also be observed from Tables 1 and 2, assuming the message bits are randomly permutated and the numbers of 0 and 1 are approximately equal. Thus, the probability of each t-length code is (2t + 1 − n)/2t, and the probability of each (t + 1)-length code is (n − 2t)/2t. Also, take n = 7 and n = 11 for example; for n = 7, the probabilities of occurrence of {00}2 and {010}2 are 25 and 12.5%, respectively, and for n = 11, the probabilities of occurrence of {000}2 and {1010}2 are 12.5 and 6.25%, respectively. Based on the above observations, the ideal embedding rate α n for any candidate n can be determined as

$$ {\alpha}_n=\left[t\times \frac{2^{t+1}-n}{2^t}+\left(t+1\right)\times \frac{n-{2}^t}{2^t}\right]\times \frac{1}{2} $$
(7)

Note that the reason for appearance of 1/2 in (7) is due to that the embedding rate is calculated according to the dual images rather than a single image.

Next, if we consider the problem of underflow/overflow, the number of pixels that cannot be modified should be identified. Here, we denote this number as K n and we discuss how to determine K n in Section 3.3. After determining K n , the amendatory embedding rate for a fixed n can be updated as:

$$ {\tilde{\alpha}}_n=\frac{\left(M\times N-{K}_n\right)\times {\alpha}_n}{M\times N} $$
(8)

For any candidate n, we calculate its limit-embedding rate\( {\tilde{\alpha}}_n \), and the optimal parameter n* can be determined by

$$ {n}^{\ast }=\min (n)\kern0.75em \forall {\tilde{\alpha}}_n>\alpha, \kern0.5em n=3,5,7,9,\dots $$
(9)

where min is the function to seek the minimum value.

Step 2: Generate the code table

The shiftable position number P is allocated from 1 to n*, and the one-to-one map** relationship between P and the binary code for embedding, i.e., C, should be established by the predefined regulations as described in Section 2. For map** from code C to position number P, the first t* bits are converted to decimal number d, where t* is the shortest code length with respect to n* and determined from (5). As can be observed from Tables 1 and 2, the number of codes is n*, including the t*-length codes and the (t* + 1)-length codes with the numbers of 2t* + 1 − n* and 2n* − 2t* + 1, respectively. Then, the map** relationship can be interpreted as:

$$ \left\{\begin{array}{c}P=d+1,\kern5.25em \mathrm{if}\ d\le {2}^{t\ast +1}-{n}^{\ast }-1\\ {}P=\left({2}^{t\ast +1}-{n}^{\ast}\right)+2\times \left(d-{2}^{t\ast +1}+{n}^{\ast}\right)+m+1,\mathrm{otherwise}\end{array}\right. $$
(10)

where m is the (t* + 1)th bit of the message for embedding. For example, in the case of n* = 11 and C = {001}2, it can be easily calculated that d = 1, t* = 3, and d ≤ (2t* + 1 − n* − 1); therefore, P can be directly determined by P = d + 1 = 2. If n* = 11 and C = {1011}2, in this case, d = 5, t* = 3, and d > (2t* + 1 − n* − 1); therefore P can be determined by P = (2t ∗  + 1 − n) + 2 × (d − 2t ∗  + 1 + n) + m + 1= 7.

Step 3: Modify the cover pixel pair

Each original cover pixel, x, is modified to its corresponding marked stego pixel pair, x1 and x2, according to the parity of P as

$$ \left\{\begin{array}{c}{x}_1=x+\mathrm{floor}\left(\frac{P-1}{4}\right),{x}_2=x-\mathrm{ceil}\left(\frac{P-1}{4}\right),\mathrm{if}\ P\ \mathrm{is}\ \mathrm{an}\ \mathrm{odd}\ \mathrm{number}\ \\ {}{x}_1=x-\mathrm{ceil}\left(\frac{P}{4}\right),{x}_2=x+\mathrm{floor}\left(\frac{P}{4}\right),\kern6em \mathrm{otherwise}\end{array}\right.\kern0.5em $$
(11)

For example, as can be verified in Fig. 1, for P = 2, its corresponding modification equations are x1 = x − 1, x2 = x, and for P = 7, its corresponding modification equations are x1 = x + 1, x2 = x − 2.

To this point, the message bits have been embedded into cover image with a dual-image mechanism.

3.2 Extraction of data and restoration of the cover image

To extract the message and restore the cover image, the steps are listed as follows:

Step 1: Restore the cover image and identify n*

For each pixel, the cover image can be restored through (3). Based on all groups of {x, x1 and x2}, n* can be identified as

$$ {n}^{\ast }=\max \left(\left|{x}_1-x\right|+\left|{x}_2-x\right|\right)\times 2+1,\forall \left\{x,{x}_1,{x}_2\right\} $$
(12)

where max is the function to seek the maximum value.

Step 2: Regenerate the code table

In the extraction procedure, the code table is established with the inverse order relative to the embedding procedure. First, after recognizing the unmodified pixels with the consideration of underflow/overflow, the position number PP for the remaining pixels can be identified as

$$ P=\left\{\begin{array}{c}\left(\left|{x}_1-x\right|+\left|{x}_2-x\right|\right)\times 2+1,\mathrm{if}\ {x}_2\le {x}_1\\ {}\left(\left|{x}_1-x\right|+\left|{x}_2-x\right|\right)\times 2,\mathrm{otherwise}\end{array}\right. $$
(13)

Next, establish the one-to-one map** relationship from P Pto code C through

$$ C=\left\{\begin{array}{c}{\left\{P-1\right\}}_{2\mid t\ast },\kern12.25em \mathrm{if}\ P\le {2}^{t\ast +1}-{n}^{\ast}\\ {}{\left\{\mathrm{ceil}\left(\frac{P-\left({2}^{t\ast +1}-{n}^{\ast}\right)}{2}\right)+{2}^{t\ast +1}-{n}^{\ast }-1\right\}}_{2\mid t\ast}\left\Vert {\left\{\operatorname{mod}\left(P,2\right)\right\}}_{2\mid 1},\mathrm{otherwise}\right.\end{array}\right. $$
(14)

where, {∙}2|t* is the conversion from decimal number to t*-length binary digits, and |⋅| is a cascading operation onto two binary sequences.

Step 3: Extract the message bits

For each pixel without any underflow/overflow protection, after identifying its position number P and its corresponding code C, the message bits from each pixel can be extracted separately according to the code table as regenerated in Step 2. Finally, concatenate all messages into the entire message.

3.3 Underflow/overflow issue in the proposed method

For all RDH methods, the issue of underflow/overflow is inevitable. To avoid already modified pixels out of the bound of image representation, some original pixels are protected without any modification in the embedding process. The protection scope is

$$ \left\{\begin{array}{c}\mathrm{underflow}:x-\mathrm{ceil}\left(\left(n-1\right)/4\right)<0\ \\ {}\mathrm{overflow}:x+\mathrm{floor}\left(\left(n-1\right)/4\right)>255\end{array}\right.\kern0.5em $$
(15)

Thus, pixels that belong to the intervals of [0, ceil((n−1)/4)⌈(n − 1)/4⌉) and (255 − floor((n − 1)/4)⌈(n − 1)/4⌉)⌊(n − 1)/4⌋, 255] are not embedded in any message; instead, all remaining pixels are modified following the procedures described in Section 3.1. Note that the total number of pixels that belong to the underflow/overflow protection ranges is counted and denoted by K n .

4 Results and discussion

To demonstrate the efficacy of the proposed method, we compared our method with some state-of-the-art dual-image RDH methods [28, 31, 32]. First, four standard test images, i.e., Lena, Baboon, Barbara, and Goldhill, were used in our evaluation as shown in Fig. 3. Figure 4 shows the comparative results of the average peak signal-to-noise ratio (PSNR) with respect to the embedding rate between our previous work [32], Lu et al.’s method [31], and the proposed method. Note that owing to being very close to the corresponding PSNR performances of [32] in the cases of k ≥ 4, for ease of comparison, the performance curves of [31] in the cases of l ≥ 4 are omitted in Fig. 4. Note that, for shiftable position-based, dual-image RDH methods, their corresponding embedding performances were essentially unaffected by the content of the cover image itself, except for small differences in underflow/overflow. Therefore, as shown in Fig. 4, there is no significant difference between each image by using a specified method, whether it is [32] or the proposed method. It also should be pointed out that the parameters k and l in [31, 32], respectively, were not selected automatically; thus, there are multiple curves in each performance diagram based on the different settings of k and l. For the proposed method, the optimal parameter n* can be sought adaptively, so the performance curve for the proposed method is integrated into a single curve. Due to the use of new parameter n*, the performance of our method provided an obvious improvement, especially for the higher embedding rates.

Fig. 3
figure 3

Four standard test images with sizes of 512 × 512

Fig. 4
figure 4

Comparison of embedding rate and average PSNR between our previous method [32], Lu et al.’s method [31], and the proposed method

To further demonstrate the superiority for the requirement of higher embedding rates, Table 3 lists the average PSNRs of the proposed method and some state-of-the-art methods, such as [28, 31, 32], with some fixed embedding rates. Due to the limited capacity for [28], we use “n/a” to stand for the occasion that is unavailable to fulfill the required capacity. Note that for [31, 32], based on the given embedding rates, we select the optimal parameter manually. It can be seen that all average PSNRs by the proposed method are higher than our previous work of [32] and Lu et al.’s method [31], and in the case of α = 1.2, average PSNRs in [28] is slightly higher than the proposed method. However, with the increase of α, the embedding rate is out of the maximum sustainable rate; in other words, the proposed method has more flexible capacity than [28].

Table 3 Comparison of the proposed method with other methods in terms of average PSNRs (dB) with fixed embedding rates (bpp). The best result for each image is presented in italics

Next, to demonstrate the universality of the proposed method, i.e., our method was not only effective on some specific standard test images but also arbitrary selected images, 300 images were randomly selected from UCID dataset to involve into the test after a conversion from RGB color channels to gray images. Figure 5 shows the average PSNRs of each dual image pair with two fixed embedding rates of 1.2 bpp (bits per pixel) and 1.6 bpp, respectively. To present the superiority of the proposed method, the corresponding average PSNRs of [32] with the same embedding rates are also shown in Fig. 5. It can be seen that the average PSNRs of the proposed method for the embedding rates of 1.2 bpp and 1.6 bpp were 48.71 and 45.07 dB, respectively, with the average gains of 1.19 and 3.38 dB, respectively, over our previous method [32].

Fig. 5
figure 5

Comparison of average PSNR for the specified embedding rates of 1.2 bpp and 1.6 bpp between the proposed method and previous method [32] using randomly selected 300 UCID images

Standard image Lena is adopted to demonstrate the imperceptibility of the proposed method. Figure 6 shows the comparison between the original cover image and the marked stego images, where the column (a) is the cover image and the zoom-in view of its red square region, the columns (b) and (c) are the marked dual images with the embedding rate of 1.2 bpp and their corresponding zoom-in versions, the columns (e) and (f) are the marked dual images with the embedding rate of 1.6 bpp and their corresponding zoom-in versions. As seen in Fig. 6, the marked dual images still exhibit acceptable visual quality even with a relative higher embedding rate.

Fig. 6
figure 6

Visual comparison of the cover image and the marked stego dual images. a Cover image. b, c Marked stego dual images with the embedding rate of 1.2. e, f Marked stego dual images with the embedding rate of 1.6. All bottom images are the partial enlarged view of the red square regions from their corresponding top images

To verify the security of the proposed algorithm, the attacks of RS steganalysis [33] is conducted on the proposed method. The RS steganalysis was designed to detect the trace of LSB steganography and estimate the embedding rate of hidden message. In our experiments, the RS test was conducted on the two test images, i.e., Lena and Barbara, respectively, with the different embedding rates. Figure 7 shows the estimated steganography rate with respect to the actual embedding rates varying from 0 to 2.0 with the incremental step of 0.1. Note that embedding rate = 0 means the original cover image without any distortion. As can be seen in Fig. 7, all estimated steganography rates of the marked images are below 10%, which is a relative safe threshold to avoid causing the alert by the attackers. Furthermore, the histogram analysis is also applied on the marked dual images. Figure 8 shows the histogram comparison of the cover images, Lena and Barbara, and their corresponding marked stego dual images with the embedding rate of 1.2. It can be seen in Fig. 8 that there is no obvious differences between the cover images and the marked images; in other words, the proposed method is with high security to resist the histogram analysis attack.

Fig. 7
figure 7

RS steganalysis results of the proposed method with different embedding rates

Fig. 8
figure 8

Histogram comparison of the cover images and their corresponding marked dual images with the embedding rate of 1.2

5 Conclusions

In this paper, we have proposed a general work for our previously proposed shiftable position-based, dual-image RDH method. Compared with the previous method, the proposed method used the total number of shiftable positions, rather than the code length, as the embedding parameter. The proposed method provides two main improvements, i.e., (1) the optimal parameter can be sought automatically with a designated embedding rate and (2) the efficiency in the case of high embedding rates outperforms our previous work. The experimental results have demonstrated the improvements provided by the proposed method.