AdverTorch: A Toolbox for Adversarial Robustness Research
AdverTorch is a PyTorch-based toolbox for generating adversarial perturbations and defending machine learning models against adversarial attacks, built to support adversarial robustness research.
The toolbox is primarily used by AI security researchers and practitioners to evaluate and improve the adversarial robustness of machine learning models. It lets users craft adversarial examples, implement defenses, and perform adversarial training to harden models against attack.
AdverTorch is developed under Python 3.6 with PyTorch 1.0.0 and 0.4.1; compatibility with newer versions may vary. Some attacks, such as FastFeatureAttack and JacobianSaliencyMapAttack, currently fail their tests against the pinned CleverHans version and are marked as skipped. Refer to the documentation and bundled examples for best practices, and watch the repository for new features and fixes.
Running the test suite requires a pinned environment: TensorFlow GPU 1.11.0 (installed via conda), CleverHans installed from a specific git commit, Keras 2.2.2, and Foolbox 1.3.2.
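A sketch of the corresponding commands; the CleverHans commit hash is left as a placeholder, since the pinned value is recorded in the repository's test setup:
conda install tensorflow-gpu=1.11.0
pip install git+https://github.com/tensorflow/cleverhans.git@<pinned-commit>
pip install Keras==2.2.2
pip install foolbox==1.3.2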
pip install advertorch
Installs the AdverTorch package via pip.
python setup.py install
Installs AdverTorch from source after cloning the repository (git clone, then python setup.py install from the checkout).
pip install -e .
Installs AdverTorch in editable mode for development.
import torch.nn as nn
from advertorch.attacks import LinfPGDAttack

# model: a trained PyTorch classifier; cln_data, true_label: a clean batch.
adversary = LinfPGDAttack(
    model, loss_fn=nn.CrossEntropyLoss(reduction="sum"), eps=0.3,
    nb_iter=40, eps_iter=0.01, rand_init=True, clip_min=0.0, clip_max=1.0,
    targeted=False)
adv_untargeted = adversary.perturb(cln_data, true_label)
Creates an untargeted PGD adversarial attack on a PyTorch model.
import torch

# Reuse the adversary above in targeted mode, steering every input to class 3.
adversary.targeted = True
target = torch.ones_like(true_label) * 3
adv_targeted = adversary.perturb(cln_data, target)
Switches the attack to targeted mode and generates targeted adversarial examples.
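AdverTorch also ships preprocessing defenses. As a minimal sketch, assuming the BitSqueezing, MedianSmoothing2D, and JPEGFilter defenses from advertorch.defenses and the trained model from above, several input transformations can be chained into a defended model:
import torch.nn as nn
from advertorch.defenses import BitSqueezing, JPEGFilter, MedianSmoothing2D

# Preprocessing defenses: reduce bit depth, JPEG-compress, median-smooth.
bits_squeezing = BitSqueezing(bit_depth=5)
jpeg_filter = JPEGFilter(10)
median_filter = MedianSmoothing2D(kernel_size=3)

# Chain the defenses in front of the (assumed trained) model.
defense = nn.Sequential(jpeg_filter, bits_squeezing, median_filter)
defended_model = nn.Sequential(defense, model)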
See advertorch_examples/tutorial_attack_defense_bpda_mnist.ipynb
Example notebook demonstrating how to perform attacks and defenses using AdverTorch.
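The notebook's key technique is Backward Pass Differentiable Approximation (BPDA) for attacking non-differentiable defenses. A rough sketch, continuing from the defense chain above and assuming the BPDAWrapper from advertorch.bpda: the defense runs unchanged in the forward pass, while the identity function stands in for it in the backward pass so gradients can reach the input.
from advertorch.bpda import BPDAWrapper

# Forward: apply the real defense; backward: differentiate as if it were the identity.
defense_withbpda = BPDAWrapper(defense, forwardsub=lambda x: x)
defended_model = nn.Sequential(defense_withbpda, model)
# defended_model can now be handed to gradient-based attacks such as LinfPGDAttack.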
See advertorch_examples/tutorial_train_mnist.py
Example script showing how to adversarially train a robust model on MNIST.
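The core pattern in that script is to craft adversarial examples on the fly inside the training loop and fit the model on them. A minimal sketch, assuming a model, optimizer, train_loader, and the LinfPGDAttack adversary from earlier; the ctx_noparamgrad_and_eval helper from advertorch.context temporarily disables parameter gradients and switches the model to eval mode while perturbing:
import torch.nn.functional as F
from advertorch.context import ctx_noparamgrad_and_eval

model.train()
for data, target in train_loader:
    # Craft adversarial inputs without accumulating parameter gradients.
    with ctx_noparamgrad_and_eval(model):
        adv_data = adversary.perturb(data, target)

    optimizer.zero_grad()
    loss = F.cross_entropy(model(adv_data), target)
    loss.backward()
    optimizer.step()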