# Black-box decision-based attacks on images

In the previous post we reviewed a series of black-box score-based adversarial attacks, where the adversary estimates the gradient by querying the target model and retrieving the labels’ confidence scores. In this post we are going to explore the third category of black-box attacks, namely black-box decision-based attacks. Under this setting, the only knowledge the attacker has about the model is its discrete hard-label predictions. Compared to score-based or transfer-based attacks, decision-based attacks are much more relevant in real-world scenarios, where confidence scores are usually not available. This is a very challenging problem, since a direct extension of state-of-the-art white-box attacks to the hard-label black-box setting would require minimizing a non-continuous step function, which is combinatorial and cannot be solved by a gradient-based optimizer. More specifically, we are going to learn about the following attacks:

• Boundary Attack;
• QEBA;
• Evolutionary Attack;
• HopSkipJumpAttack.

### Overview

This post is part of a collection including the following 6 posts:

1. Threat models;
2. White-box attacks;
3. Black-box transfer-based attacks;
4. Black-box score-based attacks;
5. Black-box decision-based attacks (current post);
6. Defence methods.

### Boundary Attack

Boundary Attack is one of the first methods developed for decision-based black-box attacks. The algorithm is initialized from a point that is already adversarial and then performs a random walk along the boundary between the adversarial and the non-adversarial region, such that it stays in the adversarial region while the distance towards the target image is reduced. When applied to images, where each pixel is constrained to the range [0, 255], each pixel of the initial image $x_{adv}^0$ is sampled from a uniform distribution U(0, 255). In the untargeted setting, non-adversarial initial samples are rejected, while in a targeted scenario $x_{adv}^0$ can be any sample that is classified by the model as belonging to the target class. Once $x_{adv}^0$ has been initialized, the algorithm iteratively samples a perturbation $\mu^t$ from a proposal distribution P subject to the constraints:

$x_{adv}^{t-1}+\mu^t \in [0, 255], \: \: \:(1)$

$||\mu^t||_2=\delta \cdot d(x, x_{adv}^{t-1}), \: \: \:(2)$

$d(x, x_{adv}^{t-1})-d(x, x_{adv}^{t-1}+\mu^t)=\epsilon \cdot d(x, x_{adv}^{t-1}), \: \: \:(3)$
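One iteration of this random walk can be sketched as below. This is a minimal sketch, not the authors' implementation: the orthogonal step is realized by projecting back onto the sphere of radius $d(x, x_{adv}^{t-1})$ around $x$, and the function and parameter names are assumptions.

```python
import numpy as np

def boundary_attack_step(x, x_adv, is_adversarial, delta=0.1, eps=0.05, rng=None):
    """One random-walk step of the Boundary Attack (sketch).

    x              : original image, flat float array with values in [0, 255]
    x_adv          : current adversarial point
    is_adversarial : callable returning True if the model misclassifies the input
    delta, eps     : relative sizes of the orthogonal and source-directed steps
    """
    rng = np.random.default_rng() if rng is None else rng
    d = np.linalg.norm(x - x_adv)                      # d(x, x_adv^{t-1})

    # Orthogonal step: draw mu with ||mu||_2 = delta * d (constraint (2)),
    # then project the candidate back onto the sphere of radius d around x.
    mu = rng.standard_normal(x.shape)
    mu *= delta * d / np.linalg.norm(mu)
    candidate = x_adv + mu
    candidate = x + (candidate - x) * d / np.linalg.norm(candidate - x)

    # Step towards the source image by eps * d (constraint (3)), then clip
    # to the valid pixel range (constraint (1)).
    candidate = candidate + eps * (x - candidate)
    candidate = np.clip(candidate, 0, 255)

    # Keep the candidate only if it is still adversarial; otherwise reject it.
    return candidate if is_adversarial(candidate) else x_adv
```

With a toy linear decision boundary, repeating this step shrinks the distance to the original image while the iterate stays adversarial.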

### QEBA

Boundary Attack, the first decision-based method introduced, can successfully find adversarial examples, but it suffers from an exponential search time as well as a lack of convergence guarantees. The Query-Efficient Boundary Attack (QEBA) method reformulates the boundary attack as the optimization of a continuous function g defined as

$g(\theta)=\min_{\nu>0} \nu \: \: \text{s.t.} \: \: f\left(x+\nu \frac{\theta}{||\theta||}\right)\neq y,$

where θ represents the search direction (note that in practice θ represents the adversarial noise), f is the hard-label black-box function (the classifier), and g(θ) is the distance from x to the nearest adversarial example along the direction θ. Thus, instead of searching for an adversarial example directly, the algorithm searches for the direction θ that minimizes the distortion g(θ). To compute g(θ) for a normalized θ, a fine-grained search is first performed by querying the points {x+αθ, x+2αθ, …} one by one until f(x+iαθ)≠y. In a second phase, a binary search is conducted to find the boundary point within that interval. This procedure is used to find the initial $\theta_0$, the corresponding $g(\theta_0)$, and to evaluate subsequent $g(\theta_t)$. The next search directions $\theta_t$ are instead optimized using the Randomized Gradient-Free (RGF) method, which estimates the gradient in each iteration as

$\nabla_{\theta}=\frac{g(\theta+\beta u)-g(\theta)}{\beta}\cdot u,$

where u is a random Gaussian vector and β>0 is a smoothing parameter, usually set to 0.005. The solution is then updated as $\theta_{t+1}=\theta_t-\eta \nabla_{\theta}$. Finally, this method achieves smaller or similar distortion using 3–4 times fewer queries compared with the Boundary Attack.
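The two ingredients described above, evaluating g(θ) via the fine-grained plus binary search and updating θ with an RGF step, can be sketched as follows. This is a hedged sketch: the step size, tolerance, and safety cap are assumptions, and the probe vector u is normalized here for numerical convenience.

```python
import numpy as np

def g_theta(f, x, y, theta, alpha=0.05, tol=1e-4):
    """Distance g(theta) from x to the decision boundary along direction
    theta, for an untargeted attack; f returns a hard label."""
    theta = theta / np.linalg.norm(theta)
    # Phase 1: fine-grained search, stepping outward until the label flips.
    i = 1
    while f(x + i * alpha * theta) == y:
        i += 1
        if i * alpha > 1e3:            # safety cap, assumed for this sketch
            return np.inf
    lo, hi = (i - 1) * alpha, i * alpha
    # Phase 2: binary search for the flip point inside [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(x + mid * theta) == y:
            lo = mid
        else:
            hi = mid
    return hi

def rgf_step(g, theta, beta=0.005, eta=0.01, rng=None):
    """One Randomized Gradient-Free update of the search direction:
    estimate the directional gradient of g with a random Gaussian probe u,
    then take a gradient-descent step."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(theta.shape)
    u /= np.linalg.norm(u)             # normalized probe direction
    grad = (g(theta + beta * u) - g(theta)) / beta * u
    return theta - eta * grad
```

In practice each call to `g_theta` costs several model queries, which is why the attack spends most of its query budget evaluating g rather than estimating gradients.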

### Evolutionary Attack

Originally applied to fool face recognition models under the decision-based black-box scenario, the Evolutionary Attack method is based on an efficient covariance matrix adaptation evolution strategy (CMA-ES), which models the local geometries of the search directions while reducing the dimension of the search space. At the beginning of the algorithm, $x_{adv}^0$ is initialized with a sample that already satisfies the adversarial criterion. The search is then performed in a lower-dimensional space $\mathbb{R}^m$ with m<n. In each iteration, a random vector $z \in \mathbb{R}^m$ is first sampled from $N(0,\sigma^2 C)$, where C is a diagonal covariance matrix that models the local geometries of the search directions and is updated as

$p_c=(1-c_c)p_c+\sqrt{c_c(2-c_c)}\frac{z}{\sigma},$

$c_{ii}=(1-c_{cov})c_{ii}+c_{cov}(p_c)_i^2,$

where $p_c \in \mathbb{R}^m$ is called the evolution path, as it stores the exponentially decayed successful search directions. For i=1,…,m, $c_{ii}$ is the i-th diagonal element of C and $(p_c)_i$ is the i-th element of $p_c$, while $c_c$ and $c_{cov}$ are two fixed hyper-parameters. Since the elements of the diagonal covariance matrix C indicate the most promising coordinates among the past successful trials, in each iteration a subset of k (k<m) coordinates is selected to generate the random vector z, with the probability of selecting the i-th coordinate proportional to $c_{ii}$ and the remaining coordinates set to 0. In this way the search-space dimensionality is reduced and the convergence of the black-box attack is accelerated. The vector z is then upscaled to the input space by bilinear interpolation to obtain z′. A bias is further added to z′ so as to sample from a distribution biased towards minimizing the distance of $x_{adv}^t$ from the original image x:

$z'=z'+\mu(x-x_{adv}^t),$
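A single sampling step of this procedure can be sketched as below. This is a minimal sketch under simplifying assumptions: simple repetition stands in for the bilinear upscaling used on images, and all names and default values are illustrative, not the paper's code.

```python
import numpy as np

def evolutionary_sample(x, x_adv, C_diag, sigma=0.1, k=None, mu=0.01, rng=None):
    """Draw one biased search vector for the Evolutionary Attack (sketch).

    C_diag holds the diagonal of C in the low-dimensional search space R^m;
    x and x_adv live in the input space R^n, with m dividing n here.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = C_diag.size
    n = x.size
    k = m // 2 if k is None else k

    # Select k of the m coordinates, with probability proportional to c_ii.
    probs = C_diag / C_diag.sum()
    idx = rng.choice(m, size=k, replace=False, p=probs)

    # Sample z ~ N(0, sigma^2 C) on the selected coordinates, 0 elsewhere.
    z = np.zeros(m)
    z[idx] = rng.standard_normal(k) * sigma * np.sqrt(C_diag[idx])

    # Upscale z from R^m to R^n (repetition stands in for bilinear
    # interpolation on 2-D images).
    z_up = np.repeat(z, n // m)

    # Bias the sample towards the original image x.
    return z_up + mu * (x - x_adv)
```

The candidate $x_{adv}^t + z'$ is then kept only if it remains adversarial, and the evolution path and covariance diagonal are updated on success as in the equations above.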