AdverTorch: A Toolbox for Adversarial Robustness Research
AdverTorch is a PyTorch-based toolbox for generating adversarial perturbations and defending machine learning models against adversarial attacks, built to support adversarial robustness research.
The toolbox is primarily used by AI security researchers and practitioners to evaluate and improve the adversarial robustness of machine learning models. It lets users craft adversarial examples, implement defenses, and perform adversarial training to harden models against attack.
AdverTorch is developed under Python 3.6 with PyTorch 1.0.0 and 0.4.1; compatibility with newer versions may vary. Some attacks, such as FastFeatureAttack and JacobianSaliencyMapAttack, currently fail their tests against the pinned CleverHans version and are marked as skipped. Refer to the documentation and bundled examples for best practices, and watch the repository for new features and fixes.
Running the test suite requires a pinned environment: TensorFlow GPU 1.11.0 (installed via conda), CleverHans installed from a specific git commit, Keras 2.2.2, and Foolbox 1.3.2.
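A sketch of the corresponding commands; the CleverHans commit hash is left as a placeholder, since the pinned value is recorded in the repository's test setup:
conda install tensorflow-gpu=1.11.0
pip install git+https://github.com/tensorflow/cleverhans.git@<pinned-commit>
pip install Keras==2.2.2
pip install foolbox==1.3.2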
pip install advertorch
Installs the AdverTorch package via pip.
python setup.py install
Installs AdverTorch from source after cloning the repository (git clone, then python setup.py install from the checkout).
pip install -e .
Installs AdverTorch in editable mode for development.
import torch.nn as nn
from advertorch.attacks import LinfPGDAttack

# model: a trained PyTorch classifier; cln_data, true_label: a clean batch.
adversary = LinfPGDAttack(
    model, loss_fn=nn.CrossEntropyLoss(reduction="sum"), eps=0.3,
    nb_iter=40, eps_iter=0.01, rand_init=True, clip_min=0.0, clip_max=1.0,
    targeted=False)
adv_untargeted = adversary.perturb(cln_data, true_label)
Creates an untargeted PGD adversarial attack on a PyTorch model.
import torch

# Reuse the adversary above in targeted mode, steering every input to class 3.
adversary.targeted = True
target = torch.ones_like(true_label) * 3
adv_targeted = adversary.perturb(cln_data, target)
Switches the attack to targeted mode and generates targeted adversarial examples.
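AdverTorch also ships preprocessing defenses. As a minimal sketch, assuming the BitSqueezing, MedianSmoothing2D, and JPEGFilter defenses from advertorch.defenses and the trained model from above, several input transformations can be chained into a defended model:
import torch.nn as nn
from advertorch.defenses import BitSqueezing, JPEGFilter, MedianSmoothing2D

# Preprocessing defenses: reduce bit depth, JPEG-compress, median-smooth.
bits_squeezing = BitSqueezing(bit_depth=5)
jpeg_filter = JPEGFilter(10)
median_filter = MedianSmoothing2D(kernel_size=3)

# Chain the defenses in front of the (assumed trained) model.
defense = nn.Sequential(jpeg_filter, bits_squeezing, median_filter)
defended_model = nn.Sequential(defense, model)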
See advertorch_examples/tutorial_attack_defense_bpda_mnist.ipynb
Example notebook demonstrating how to perform attacks and defenses using AdverTorch.
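The notebook's key technique is Backward Pass Differentiable Approximation (BPDA) for attacking non-differentiable defenses. A rough sketch, continuing from the defense chain above and assuming the BPDAWrapper from advertorch.bpda: the defense runs unchanged in the forward pass, while the identity function stands in for it in the backward pass so gradients can reach the input.
from advertorch.bpda import BPDAWrapper

# Forward: apply the real defense; backward: differentiate as if it were the identity.
defense_withbpda = BPDAWrapper(defense, forwardsub=lambda x: x)
defended_model = nn.Sequential(defense_withbpda, model)
# defended_model can now be handed to gradient-based attacks such as LinfPGDAttack.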
See advertorch_examples/tutorial_train_mnist.py
Example script showing how to adversarially train a robust model on MNIST.
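The core pattern in that script is to craft adversarial examples on the fly inside the training loop and fit the model on them. A minimal sketch, assuming a model, optimizer, train_loader, and the LinfPGDAttack adversary from earlier; the ctx_noparamgrad_and_eval helper from advertorch.context temporarily disables parameter gradients and switches the model to eval mode while perturbing:
import torch.nn.functional as F
from advertorch.context import ctx_noparamgrad_and_eval

model.train()
for data, target in train_loader:
    # Craft adversarial inputs without accumulating parameter gradients.
    with ctx_noparamgrad_and_eval(model):
        adv_data = adversary.perturb(data, target)

    optimizer.zero_grad()
    loss = F.cross_entropy(model(adv_data), target)
    loss.backward()
    optimizer.step()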