1 Introduction

Image manipulation generation [22, 27, 37, 49] significantly enriches sample diversity and visual interest; however, it also creates a crisis of trust in image reliability. To address concerns about malicious tampering and illegal dissemination, image forgery forensic methods [2, 10, 38] have been continuously studied in recent years to automatically detect such manipulations. Various image manipulation detectors based on deep neural networks (DNNs) [1, 17, 31, 43] have demonstrated remarkable performance. Nevertheless, these detectors are susceptible to adversarial examples, where imperceptible perturbations artificially applied to clean inputs mislead the detectors into producing incorrect predictions. Furthermore, adversarial examples usually exhibit transferability across diverse models, even those with unknown structures and parameters.

Fig. 1 Workflows for adversarial examples in image manipulation detection models. The adversarial example of an authentic image spoofs detectors into predicting a false-positive mask, while the example of a fake image prevents models from detecting manipulations as usual

The possibility that a detector can be deceived simply by adding subtle noise presents a severe threat to information security. Therefore, it is important to investigate adversarial attacks specifically targeting existing manipulation detection methods from the perspective of reverse forensics. Meanwhile, exploring the transferability of adversarial examples helps reveal universal vulnerabilities within detectors, thereby improving their robustness to malicious adversaries in black-box attack scenarios.

Despite extensive efforts in developing adversarial attacks [50, 52, 56], most works primarily design adversaries tailored for image classifiers [46, 51, 53]. In such attacks, classifiers fail to correctly classify inputs in untargeted attacks or produce predictions consistent with preset labels in targeted attacks. However, these attack principles may not be optimal for effectively disrupting image manipulation detectors. This limitation arises from the fundamental differences between classification and manipulation detection: manipulation detectors not only perform image-level identification of authentic and forged images, but also predict pixel-wise probability distribution maps (known as binary masks) to localize manipulated regions.

A noteworthy development is that recently studied adversarial attacks have begun to address dense prediction tasks, where adversaries are designed for object detection or semantic segmentation models. Huang et al. [20] explore a transfer-based self-ensemble attack (T-SEA) on object detection, which ensembles the input, the attacked model, and the adversarial patch to boost adversarial transferability under black-box attacks. Cai et al. [3] propose ensemble-based black-box targeted and untargeted attacks on semantic segmentation and object detection. Li et al. [28] optimally search adversarial points to generate anti-forensic fake face images of high visual quality by exploring Style-GAN's manifold. Jia et al. [21] propose a hybrid adversarial attack against face forgery detectors based on a meta-learning strategy. However, there are few works on crafting imperceptible adversarial examples specific to image manipulation detection models. Moreover, the aforementioned attacks are mainly implemented in the spatial domain, which increases the perturbation intensity for better effects at the expense of image quality. Although Zhu et al. [55] propose adversarial manipulation generation (AMG), which incorporates both spatial and frequency features into a GAN architecture to attack manipulation detection, this generative attack requires a generator pre-trained to perform reasonably well on a dataset in advance, and acquiring dataset authorization for cross-dataset attacks may be difficult. In contrast, utilizing transferable adversaries crafted on a small number of models to disrupt black-box victims is more cost-effective in practical applications. This is a worthwhile topic, as adversarial examples that exploit detector vulnerabilities could easily increase security threats.

To address the aforementioned problems, we propose a novel adversarial attack called RevAggAL, which not only effectively performs iterative attacks against image manipulation detectors at both the image level and the pixel level, but also generates adversarial examples with good invisibility and high transferability. Specifically, we first design a new loss function for optimizing perturbations from the perspective of pixel-level segmentation prediction. Then, we extract low-frequency components of input samples to constrain the visual differences between clean and perturbed images. We further improve adversarial transferability across DNN-based detectors via gradient aggregation over mid-layer features in black-box attack scenarios. It is worth noting that our method is advantageous in cross-model and cross-dataset settings without additional pre-training of white-box adversarial generators in advance.

We summarize our main contributions as follows:

  • We propose an efficient loss function for optimally adding perturbations from the view of pixel-level segmentation decisions, which can also be extended to other classic iterative attack algorithms for adversarial example generation.

  • We introduce a low-frequency constraint for limiting subtle noise to finer details. This suppresses perturbations in regions that are sensitive to human observers. We also exploit aggregated gradients over mid-layer features from white-box surrogate detectors to improve the transferability of the designed adversarial examples to black-box victims.

  • We conduct extensive experiments on three DNN-based image manipulation detectors with five datasets under both white-box and black-box settings. Compared with traditional iterative attacks, our method achieves favorable attack performance in generating more imperceptible adversaries while reducing the degradation of image quality.

2 Related work

2.1 Digital image forensics

Digital image forensics technology plays a crucial role in identifying the authenticity, integrity, and source of images. Conventional approaches, such as digital watermarking [45] and digital signatures [33, 36], require encrypted information to be inserted before an image is transmitted, followed by feature extraction and consistency verification on the detector side. However, these active forensic methods are impractical because most imaging devices lack automatic key-embedding capabilities. Recently, blind forensics has emerged as an effective alternative, directly analyzing inherent characteristics and proprietary properties to determine image sources [12, 13, 42] and manipulation artifacts [14, 26, 41]. Our work falls within blind forensics against forged image manipulation.

2.2 Image manipulation detection and localization

In generic image manipulation blind forensics, it is necessary to distinguish authentic images from ones manipulated by a DNN-based generator, as well as to accurately localize modified regions. Consequently, this task is commonly regarded as an image manipulation detection and localization (IMDL) problem. Early methods extract image-block features (such as DCT, PCA, and SVD) or internal statistical characteristics (such as pixel mean and RGB correlation) to establish true-or-false classifications. However, these patch-based operations often provide imprecise manipulation localization. Building upon this foundation, Li et al. [29] implemented a fully convolutional network (FCN) [32] for precise localization. Salloum et al. [40] developed a multi-task FCN (MFCN) to segment a pixel-level fine-grained manipulated area and its boundary. Zhou et al. [54] applied the SRM kernel [11] to Faster R-CNN [39] to localize forgeries with bounding boxes. Chen et al. [6] proposed MVSSNet, which effectively captures subtle changes in suspicious boundaries and learns more general features through multi-level supervision. Wu et al. [48] considered the impact of image transmission on various online social networks (OSNs) and proposed a training scheme for improving the robustness of image manipulation detection by modeling predictable noises and intentionally introducing unseen noises. Our work treats manipulation localization as a pixel-wise semantic segmentation task, and we select a modified baseline method (ResFCN, an FCN combined with ResNet-50 [18]) and two state-of-the-art methods (MVSSNet and OSN) for detecting and localizing manipulations.

2.3 Adversarial attack

Adversarial examples with subtle perturbations, closely resembling their originals, have been demonstrated to effectively disrupt DNN-based models. Beyond image classification, the scope of adversarial attacks has expanded to dense prediction tasks such as object detection and semantic segmentation. Despite growing research on attack methodologies for face forgery detection [4, 9, 28], there are few studies specifically targeting common image manipulation detectors. We introduce classic attack algorithms that hold potential against manipulation detectors. The fast gradient sign method (FGSM) [15] generates adversarial examples by leveraging the sign of the gradient. Projected gradient descent (PGD) [34] is an iterative attack that takes a small step at each iteration while projecting the perturbation back into a specified range. Nevertheless, most existing works primarily generate adversaries with spatial constraints, easily resulting in a certain degree of image quality degradation with perturbations visible to human eyes. Inspired by recent research on attacking classifiers in the frequency domain [21, 55], we select low-frequency components from decomposed input samples to more precisely constrain the perturbations. Furthermore, we employ an Adam optimizer to optimize the loss, as in C&W [5].

3 Methodology

3.1 Problem formulation

Assume we have an image manipulation detector F which accepts a normalized image \(x \in [0,1]^{H \times W \times C}\) and predicts a pixel-wise probability distribution map \(y_{map} \in [0,1]^{H \times W}\), where each pixel is assigned a probability value indicating its likelihood of belonging to a specific category, to localize the manipulated region. Subsequently, a Global Max Pooling (GMP) function selects the most significant response across spatial dimensions, effectively compressing pixel-level probabilities into a single value for a binary image-level discrimination label \(y \in Y=\{0,1\}\) (i.e., 0 for Authentic and 1 for Fake). In this work, we define F as the full neural network including the GMP function, \(M(x)=y_{map}\) as the output of all layers excluding the GMP function, and

$$\begin{aligned} F(x) = GMP(M(x)) = y. \end{aligned}$$
(1)
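To make this formulation concrete, the following is a minimal PyTorch sketch of the detector interface assumed above; the function names (`global_max_pool`, `detect`) and the 0.5 threshold are illustrative assumptions, not the authors' released code.

```python
import torch

def global_max_pool(y_map: torch.Tensor) -> torch.Tensor:
    """Collapse a pixel-wise probability map (B, H, W) to an image-level score (B,)."""
    return y_map.flatten(start_dim=1).max(dim=1).values

def detect(M, x: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """F(x) = GMP(M(x)): 1 (Fake) if any pixel exceeds the threshold, else 0 (Authentic)."""
    y_map = M(x)                       # (B, H, W) probabilities in [0, 1]
    score = global_max_pool(y_map)     # (B,)
    return (score > threshold).long()  # image-level label y in {0, 1}
```
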
Fig. 2 Overview of our RevAggAL attack for loss optimization to generate effective adversarial examples for image manipulation detection models

The purpose of a targeted attack task is to craft an adversarial example \(x^{adv}\) with a small perturbation \(\delta \), which misleads the detector into predicting a result consistent with the preset target value. The generation of the adversarial example can be formalized as follows:

$$\begin{aligned} \begin{aligned}&\mathrm {\mathop {minimize}\limits _{\delta }} \ D(x, x+\delta ) \\&\mathrm {such \ that} \ F(x+\delta ) = y^{*} \end{aligned} \end{aligned}$$
(2)

where \(D(\cdot , \cdot )\) denotes a distance metric to quantify the difference between the benign image and its adversarial example, and \(y^{*}\) represents the desired target label corresponding to a pre-specified target image \(x^{*}\).

Instead of directly solving this difficult constrained minimization problem, we employ an optimizer and transform it into the following loss optimization problem:

$$\begin{aligned} x^{adv} = \mathrm {\mathop {argmin}\limits _{\delta }} \ \{D(x, x+\delta ) + loss_{F, y^{*}}(x+\delta )\}. \end{aligned}$$
(3)

The first term in Eq. (3) constrains the perturbation, and the second one enforces the prediction to be aligned with the target label \(y^{*}\).
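As an illustration of this reformulation, the sketch below optimizes the penalized objective of Eq. (3) with an Adam optimizer, in the spirit of C&W; `distance_fn` and `attack_loss_fn` are placeholders for the constraint and target-alignment terms specified in the following subsections, and the step count and learning rate are assumptions.

```python
import torch

def optimize_perturbation(x, distance_fn, attack_loss_fn, steps=100, lr=0.01):
    """Solve Eq. (3): minimize D(x, x+delta) + loss_{F,y*}(x+delta) with Adam.

    distance_fn(x, x_adv)  -> scalar tensor (perturbation constraint)
    attack_loss_fn(x_adv)  -> scalar tensor (target-alignment term)
    """
    delta = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = (x + delta).clamp(0.0, 1.0)        # keep the example a valid image
        loss = distance_fn(x, x_adv) + attack_loss_fn(x_adv)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return (x + delta.detach()).clamp(0.0, 1.0)
```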

Instead of directly using discrimination labels as reference values, we consider two target cases at the pixel level: 1) For an authentic image, the adversarial example is designed to mislead the detector into predicting some false-positive pixel regions via the target fake map \(y^{*}_{map0}\), in which parts of the pixel regions are labeled as fake. 2) For a fake image, the perturbation is added so that the detector fails to flag any pixels in the manipulated area, based on the target authentic map \(y^{*}_{map1}\) with all pixels labeled as authentic. Note that the binary image-level label determines which target map is chosen in this process.

3.2 Pixel reverse content decision-making loss

On the basis of manipulation detection definition, we can further express \(loss_{F, y^{*}}(x+\delta )\) in Eq. (3) as:

$$\begin{aligned} loss_{F, y^{*}}(x+\delta ) = {\left\{ \begin{array}{ll} J(M(x+\delta ), y^{*}_{map0}), y=1 \\ J(M(x+\delta ), y^{*}_{map1}), y=0 \end{array}\right. } \end{aligned}$$
(4)

where the loss function \(J(\cdot , \cdot )\) measures the distance between the predicted and target maps. Loss functions commonly used in conventional adversarial attacks are cross-entropy (CE) for image-level discrimination labels and mean squared error (MSE) for pixel-wise probability distribution maps. However, using an MSE loss to drive all pixel values toward zero is an overly strict objective.

Generally, all pixel values in a normalized pixel-level prediction probability map are continuously distributed in the range [0, 1]. The default decision threshold in image manipulation detection is commonly set to 0.5, meaning pixels with values lower than 0.5 are classified as authentic and the rest as fake. Similar to [7] and [48], we keep 0.5 as our decision-making boundary, and only force fake pixels below the boundary value when the target label is 0; otherwise, we focus on pushing authentic pixels beyond the boundary when the target label is 1.

To optimize losses more reasonably, we propose a simple yet efficient loss function, named pixel reverse content decision-making (PRevCDm) loss, to replace the MSE loss. Specifically, we first define an inverse function that produces the element-wise opposite of the original probability map,

$$\begin{aligned} Rev(M(x))^{i} = 1-M(x)^{i}, \quad i \in \{1, \ldots , H \times W\}. \end{aligned}$$
(5)

Next, we express a mapping function Y as follows to separate pixels belonging to different labels into two groups:

$$\begin{aligned} Y = {\left\{ \begin{array}{ll} M(x^{adv}) - Rev(M(x^{adv})), y=1 \\ Rev(M(x^{adv})) - M(x^{adv}), y=0 \end{array}\right. } \end{aligned}$$
(6)

where the sign of each element of Y indicates whether the corresponding pixel in the original probability map is predicted with the label y.

Considering the impact of pixels near the boundary on decision-making, we introduce a new constant kappa (abbreviated as k for convenience) to further widen the boundary range. With k as a tolerance bias, the swing pixels within this finite border width are moved toward a more favorable position with respect to the specific target, such that the binary label of a pixel at a position where \(Y>-k\) equals the target label. Eq. (6) can be adjusted as follows:

$$\begin{aligned} Y = {\left\{ \begin{array}{ll} M(x^{adv}) - Rev(M(x^{adv})) + k, y=1 \\ Rev(M(x^{adv})) - M(x^{adv}) + k, y=0. \end{array}\right. } \end{aligned}$$
(7)

Then, we utilize the \(ReLU(\cdot )\) function to filter and retain only the pixels in the original probability map that are relevant to the target label, and feed them into the subsequent loss calculation. More specifically, for an originally authentic image, only pixels whose predicted probabilities are lower than \((0.5+k/2)\) remain and contribute to the loss function, while for an originally fake image, pixels with predicted probabilities higher than \((0.5-k/2)\) need more attention and are counted in the loss function.

To encourage the optimizer to search along a descent direction in the initial stage, we use \((e^{x}-1)\) as a further mapping function with a larger slope when \(x>0\), where x is substituted by \(ReLU(Y)\). Consequently, we define the PRevCDm loss as the average over all pixel losses:

$$\begin{aligned} loss_{F, y^{*}}(x^{adv}) = \frac{1}{N} \sum \nolimits _{i}^{N} (e^{ReLU(Y^{i})}-1). \end{aligned}$$
(8)
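A minimal sketch of the PRevCDm loss following Eqs. (5)-(8) is given below, assuming the element-wise form of \(Rev\) and treating N as the number of pixels; the default kappa value is a placeholder, not the paper's setting.

```python
import torch

def prevcdm_loss(pred_map: torch.Tensor, y: int, kappa: float = 0.1) -> torch.Tensor:
    """Pixel Reverse Content Decision-making (PRevCDm) loss, Eqs. (5)-(8).

    pred_map: M(x_adv), pixel-wise probabilities in [0, 1], shape (B, H, W)
    y:        original image-level label (1 = Fake, 0 = Authentic)
    kappa:    tolerance bias widening the 0.5 decision boundary
    """
    rev = 1.0 - pred_map                     # Eq. (5): opposite probability map
    if y == 1:                               # fake original: push fake pixels below the boundary
        Y = pred_map - rev + kappa           # > 0 only for pixels above 0.5 - kappa/2
    else:                                    # authentic original: push pixels above the boundary
        Y = rev - pred_map + kappa           # > 0 only for pixels below 0.5 + kappa/2
    return (torch.exp(torch.relu(Y)) - 1.0).mean()   # Eq. (8)
```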

3.3 Low-frequency constraint

Previous works typically utilize the \(\ell _{p}\)-norm (\(p\in \{0, 2, \infty \}\)) to regularize perturbations in the representation space. However, it is still difficult to generate completely imperceptible perturbations under such traditional constraints. Existing spatial perturbations are commonly added at arbitrary positions on the original benign image; once noise and aliasing artifacts appear in a light blank background, a random distribution of perturbations acceptable to deep discriminators may be easily detected by the resolution-sensitive human visual system. Therefore, adopting a constraint other than the \(\ell _{p}\)-norm is the key to confining perturbations to imperceptible details.

For an image in the spatial domain, visually sensitive information mainly refers to colorful style content and the general object structure, while slender edges and complex textures are less noticeable. After converting the image to the frequency domain, the low-frequency component contains the basic structural content, while rich detailed features such as object edges and textures reside in the high-frequency components.

Motivated by the above principle, instead of directly using an \(\ell _{p}\)-norm constraint, we introduce a low-frequency constraint. Using the discrete wavelet transform (DWT), we first decompose the input image into one low-frequency component (\(x_{LL}\)) and three high-frequency components (\(x_{LH}\), \(x_{HL}\), \(x_{HH}\)). Then, we apply the inverse DWT (IDWT) to reconstruct a new image \(x^{new}=\varPhi (x)\) from only the low-frequency component, thereby retaining the main content information. The alternative constraint on the perturbation in the first term of Eq. (3) can be expressed as:

$$\begin{aligned} \quad D(x, x+\delta ) = D_{lf}(x, x^{adv}) = \left\| \varPhi (x) - \varPhi (x^{adv}) \right\| _2 \end{aligned}$$
(9)

Minimizing Eq. (9) reduces the loss of main content in the perturbed example and thus mitigates image quality degradation.
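A minimal differentiable sketch of \(\varPhi\) and \(D_{lf}\) is shown below, assuming a single-level Haar wavelet (the paper's exact wavelet basis is not specified here); for Haar, reconstructing from the LL sub-band alone reduces to replacing each 2x2 block by its average, which can be expressed with pooling and upsampling.

```python
import torch
import torch.nn.functional as F

def phi_low_frequency(x: torch.Tensor) -> torch.Tensor:
    """Haar approximation of Phi(x): keep only the LL sub-band and reconstruct.
    Input x has shape (N, C, H, W); each 2x2 block is replaced by its average."""
    ll = F.avg_pool2d(x, kernel_size=2)                        # low-frequency (LL) component
    return F.interpolate(ll, scale_factor=2, mode="nearest")   # IDWT with high frequencies zeroed

def low_frequency_distance(x: torch.Tensor, x_adv: torch.Tensor) -> torch.Tensor:
    """D_lf(x, x_adv) = || Phi(x) - Phi(x_adv) ||_2  (Eq. (9))."""
    return torch.norm(phi_low_frequency(x) - phi_low_frequency(x_adv), p=2)
```

Other wavelet bases would require a full DWT/IDWT implementation; the Haar case is used here only because its LL-only reconstruction has this simple closed form.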

3.4 Transferability improvement with aggregate gradient

We also enhance the transferability of adversarial examples to perform more generalized black-box attacks against different manipulation detectors. In the black-box attack scenario, a common operation is to use a surrogate source model to craft adversarial examples.

Intuitively, we hope adversarial examples generated from the surrogate model are generalizable to diverse victim models for high transferability. However, designing such adversarial examples is nontrivial. Current DNN-based manipulation detectors with various structures usually extract different proprietary features to better adapt themselves to cross-data domains, which comes with the appearance of model-specific feature representations. Therefore, we argue that the indiscriminate distortion of arbitrary extracted features is more likely to fall into a model-specific local optimum, significantly reducing the transferability of adversarial examples.

To avoid being trapped in a local optimum caused by overfitting to model-specific features, we propose to disturb the model-agnostic features of the source model as guidance to generate more transferable adversarial examples. Inspired by [44], we consider model-agnostic features specifically for manipulated fake images when generating such examples. In the forward propagation of the surrogate detector F, let \(F_{k}(x)\) denote the features of the k-th layer; the gradient w.r.t. \(F_{k}(x)\) can be written as:

$$\begin{aligned} \varDelta _{k}^{x} = \frac{\partial O(x,y)}{\partial F_{k}(x)} = \frac{\partial M(x)}{\partial F_{k}(x)} \end{aligned}$$
(10)

where \(O(\cdot , \cdot )\) denotes the logit output with respect to the ground-truth label y, and \(M(\cdot )\) denotes the logit output of the pixel-level distribution map defined in Sect. 3.1. Note that the raw gradient \(\varDelta _{k}^{x}\) calculated with global feature maps generally carries model-specific information, resulting in visual pulses and large gradient noise in non-object regions.

To distort model-specific details while preserving general structures, we use a binarization mask to randomly discard pixels within partial regions of the input sample x with probability \(p_{r}\), and then average the resulting gradients to obtain an aggregate gradient over such transformed inputs. These two steps can be expressed as:

$$\begin{aligned} \bar{\varDelta }_{k}^{x} = \frac{1}{N} \sum _{n=1}^{N} \varDelta _{k}^{x\odot M_{p_{r}}^{n}}, M_{p_{r}} \sim Bernoulli(1-p_{r}) \end{aligned}$$
(11)

where \(M_{p_{r}}\) is a binary mask with the same size as x, \(\odot \) denotes the element-wise product, and N indicates the number of random masks applied to x in the ensemble. For simplicity, we denote \(\bar{\varDelta }_{k}^{x}\) as \(\bar{\varDelta }\) in the rest of this paper.
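The sketch below computes the aggregate gradient of Eqs. (10)-(11) with a forward hook on an assumed mid-layer module; the default values of \(p_r\) and N are placeholders rather than the paper's settings.

```python
import torch

def aggregate_gradient(model, feature_module, x, p_r=0.3, n_masks=30):
    """Average gradient of the predicted map w.r.t. a mid-layer feature over
    randomly masked copies of x (Eq. (11)).

    model(x)        -> pixel-wise logit map M(x)
    feature_module  -> the k-th layer whose output F_k(x) we differentiate against
    """
    feats = []
    handle = feature_module.register_forward_hook(
        lambda module, inp, out: feats.append(out))
    agg = None
    for _ in range(n_masks):
        mask = torch.bernoulli(torch.full_like(x, 1.0 - p_r))   # M_{p_r} ~ Bernoulli(1 - p_r)
        masked = (x * mask).detach().requires_grad_(True)
        feats.clear()
        y_map = model(masked)                                   # forward pass on the masked input
        grad_k = torch.autograd.grad(y_map.sum(), feats[0])[0]  # d M(x) / d F_k(x), Eq. (10)
        agg = grad_k if agg is None else agg + grad_k
    handle.remove()
    return agg / n_masks                                        # aggregate gradient, Eq. (11)
```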

Given the particularity of targeted attacks against manipulation detection models, we aim to divert the detector's attention to both spurious regions and trivial backgrounds of a manipulated image. This strategy is designed to produce pixel-wise predictions that trend toward authenticity, effectively deceiving the detector. Specifically, we suppress model-important features using the aggregate gradients and design the loss function in Eq. (12) to guide the generation of transferable adversarial examples for manipulated fake images.

$$\begin{aligned} loss_{F, y^{*}}(x_{Fake}^{adv}) = \left\| \bar{\varDelta } \odot F_{k}(x_{Fake}^{adv}) \right\| _2 \end{aligned}$$
(12)

Here, we choose the \(\ell _2\) regularization norm to suppress all relatively high-intensity components of \(\bar{\varDelta }\), forcing the output prediction toward zero.
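As a minimal sketch, the term in Eq. (12) can be computed directly from the adversarial example's k-th-layer feature and the (detached) aggregate gradient; the function name and tensor shapes are assumptions.

```python
import torch

def agg_grad_loss(feat_adv: torch.Tensor, agg_grad: torch.Tensor) -> torch.Tensor:
    """Eq. (12): l2 norm of the feature F_k(x_adv) weighted by the aggregate gradient."""
    return torch.norm(agg_grad * feat_adv, p=2)
```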

3.5 The unified attack

Combining the PRevCDm loss for authentic images and the aggregate gradient for fake images in adversarial example generation, together with the low-frequency constraint, the overall attack problem of this work follows from Eq. (3) as:

$$\begin{aligned} \begin{aligned} x^{adv}&= \mathrm {\mathop {argmin}} \{\alpha D_{lf}(x, x^{adv}) \\&\quad + \beta _1 loss_{F, y^{*}}(x_{Au}^{adv})_{PRevCDm} \\&\quad + \beta _2 loss_{F, y^{*}}(x_{Fake}^{adv})_{AggGrad}\} \end{aligned} \end{aligned}$$
(13)

where \(\alpha \), \(\beta _1\), and \(\beta _2\) are hyper-parameters. For clarity, we present the pseudo-code in Algorithm 1 to outline the main procedures of our attack.

Algorithm 1 RevAggAL Attack
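The following sketch assembles the components above into an Algorithm 1-style optimization loop for Eq. (13); it relies on the helper functions from the earlier sketches, and the hyper-parameter defaults (\(\alpha\), \(\beta _1\), \(\beta _2\), step count, learning rate, kappa) are placeholders rather than the paper's settings.

```python
import torch

def revaggal_attack(model, feature_module, x, y, alpha=1.0, beta1=1.0, beta2=1.0,
                    steps=100, lr=0.01, kappa=0.1):
    """Sketch of the unified attack (Eq. (13)).

    y: original image-level label (0 = Authentic, 1 = Fake).
    Uses prevcdm_loss, low_frequency_distance, aggregate_gradient, and
    agg_grad_loss from the earlier sketches.
    """
    agg = aggregate_gradient(model, feature_module, x) if y == 1 else None

    feats = []
    handle = feature_module.register_forward_hook(lambda m, i, o: feats.append(o))
    delta = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        x_adv = (x + delta).clamp(0.0, 1.0)
        feats.clear()
        y_map = model(x_adv)
        loss = alpha * low_frequency_distance(x, x_adv)          # D_lf term
        if y == 0:                                               # authentic input: PRevCDm term
            loss = loss + beta1 * prevcdm_loss(y_map, y, kappa)
        else:                                                    # fake input: aggregate-gradient term
            loss = loss + beta2 * agg_grad_loss(feats[0], agg)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    handle.remove()
    return (x + delta.detach()).clamp(0.0, 1.0)
```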

Table 1 The performance evaluation of different white-box attacks on three manipulation detectors (ResFCN, MVSSNet, and OSN) with five datasets (COVERAGE, COLUMBIA, CASIA1, NIST 2016, and Realistic Tampering)

4 Experiments

4.1 Experimental setting

Table 2 Attack Success Rate (%) of adversarial attacks on three target detectors with five datasets

Datasets and Models. We evaluate our method with five image manipulation datasets, namely COVERAGE [47], COLUMBIA [35], CASIA1 [8], NIST 2016 [16], and Realistic Tampering [24, 25]. COVERAGE contains 100 negative images manipulated by copy-move and their originals with genuine objects. COLUMBIA provides 363 images, including 183 genuine images and 180 spliced images. CASIA1 derives from the Corel image dataset and consists of 800 authentic images, 459 copy-move images, and 461 spliced images. NIST 2016 contains 564 high-resolution images with copy-move, splicing, and removal. Realistic Tampering includes 220 realistic images captured by four cameras and their corresponding forgeries created by modern photo-editing software. We adopt ResFCN [18, 32], MVSSNet [7], and OSN [48] as our experimental image manipulation detectors. Previous studies have demonstrated their detection performance, and we directly use the officially released detectors with pre-trained models and optimized parameters. Each detector serves as the victim model in its white-box attack, while in each black-box attack scenario only a known surrogate model is assumed for generating adversarial examples transferable to the other models.

Evaluation metrics. We calculate the attack success rate (ASR) to evaluate the overall attack performance, where ASR denotes the ratio of successfully attacked images to the entire test dataset. We also evaluate detector performance changes before and after the attack with two manipulation detection metrics: image-level F1-score (imF1) and pixel-level F1-score (pF1). Here, authentic images are only considered for imF1 calculation, while fake images are used for both metrics. In addition, we adopt three typical metrics, namely Fréchet inception distance (FID) [19], peak signal-to-noise ratio (PSNR), and structural similarity (SSIM), to measure visual image quality.
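For reference, a minimal sketch of the ASR and image-quality computations is given below, assuming scikit-image (0.19 or later) for PSNR and SSIM; FID requires a pretrained Inception network and is omitted here.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def attack_success_rate(pred_labels, target_labels) -> float:
    """ASR: fraction of test images whose post-attack prediction matches the target."""
    pred_labels = np.asarray(pred_labels)
    target_labels = np.asarray(target_labels)
    return float((pred_labels == target_labels).mean())

def image_quality(clean: np.ndarray, adv: np.ndarray):
    """PSNR / SSIM between a clean image and its adversarial example (H, W, C in [0, 1])."""
    psnr = peak_signal_noise_ratio(clean, adv, data_range=1.0)
    ssim = structural_similarity(clean, adv, channel_axis=-1, data_range=1.0)
    return psnr, ssim
```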

Fig. 3 Visualization results of adversarial examples generated under several attacks. Compared to classic attacks, examples from our method can successfully fool the detector at the pixel level while preserving better image quality

Implementation details. We use an Adam [4

By comparing the first and second rows, we observe that the PRevCDm loss improves the attack performance. Despite a slight degradation in image quality with the PRevCDm loss, using the low-frequency constraint \(D_{lf}\) instead of spatial constraints helps mitigate this effect. Furthermore, comparing the third and last rows, we observe an improvement in ASR while maintaining image quality with the AggGrad. These ablation experiments validate the effectiveness of the proposed components and provide valuable inspiration for future work in this field.

Table 4 Ablation study on CASIA1 under the black-box attack setting

5 Conclusion

In this paper, we propose an efficient adversarial attack named RevAggAL to explore the vulnerability of current state-of-the-art image manipulation detectors. To address the challenge of transferable attacks with more imperceptible perturbation, we combine the PRevCDm loss with aggregated gradients for adversarial example generation under the low-frequency constraint. Experiments demonstrate that our proposed method can achieve good attack performance while ensuring better image quality.