Note: The methodology behind the approach discussed in this post stems from a collaborative publication between myself and Irene Anthi.

INTRODUCTION

Spam SMS text messages often show up unexpectedly on our phone screens. That’s aggravating enough, but it gets worse. Whoever is sending you a spam text message is usually trying to defraud you. Most spam text messages don’t come from another phone. They often originate from a computer and are delivered to your phone via an email address or an instant messaging account.

Several security mechanisms exist for automatically detecting whether an email or an SMS message is spam. These approaches often rely on machine learning. However, such systems can themselves be subject to attacks.

Attacking machine learning based systems is known as Adversarial Machine Learning (AML). The aim is to exploit the weaknesses of a trained model, which may have “blind spots” between the data points it has seen during training. More specifically, by automatically introducing slight perturbations to unseen data points, an attacker can push them across the model’s decision boundary so that they are assigned to a different class. As a result, the model’s effectiveness can be significantly reduced.

In the context of SMS spam detection, AML can be used to perturb the text of spam messages so that they are classified as not spam, consequently bypassing the detector.

DATASET AND DATA PRE-PROCESSING

The SMS Spam Collection is a set of tagged SMS messages that have been collected for SMS spam research. It contains 5,574 English SMS text messages, each tagged as either spam (747 messages) or not spam (4,827 messages).

Let’s first cover the pre-processing we need to consider before we dive into applying any kind of machine learning technique. We’ll apply pre-processing steps that are standard for most Natural Language Processing (NLP) problems. These include:

Convert the text to lowercase.
Remove punctuation.
Remove additional white space.
Remove numbers.
Remove stop words such as “the”, “a”, “an”, “in”.
Lemmatisation.
Tokenisation.

Python’s Natural Language Toolkit (NLTK) can handle these pre-processing requirements. After these steps, each message is reduced to a list of lowercase, lemmatised tokens.
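
The steps listed above can be sketched in a few lines. This is only an illustration that uses the standard library, not the code from the notebook: the stop-word set is a tiny made-up subset (NLTK’s English list is much longer), and lemmatisation is left as a pluggable function so that something like the lemmatize method of NLTK’s WordNetLemmatizer could be passed in. The function name preprocess and the sample message are assumptions.

```python
import re
import string

# Tiny illustrative stop-word set (assumption); NLTK's English list is much longer.
STOP_WORDS = {"the", "a", "an", "in", "to", "is", "you", "your", "of", "and"}


def preprocess(text, lemmatise=lambda token: token):
    """Return a list of cleaned tokens for one SMS message."""
    text = text.lower()                                                # lowercase
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    text = re.sub(r"\d+", "", text)                                    # remove numbers
    tokens = text.split()                                              # drop extra white space and tokenise
    tokens = [t for t in tokens if t not in STOP_WORDS]                # remove stop words
    return [lemmatise(t) for t in tokens]                              # lemmatise (identity by default)


tokens = preprocess("WINNER!! You have won a   FREE prize, call 0800 123 now.")
print(tokens)
```

Tokenising before removing stop words is a convenience of this sketch; the order of the final steps can differ as long as each one is applied.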

WORD EMBEDDINGS

Word embedding is one of the most popular representations of text vocabulary. It is capable of capturing the context of a word in a document, its semantic and syntactic similarity to its surrounding words, and its relation with other words.

But how are word embeddings captured in context? Word2Vec is one of the most popular techniques for learning word embeddings using a two-layer Neural Network. The Neural Network takes in the corpus of text, analyses it, and for each word in the vocabulary, generates a vector of numbers that encodes important information about the meaning of the word in relation to the context in which it appears.

There are two main models: the Continuous Bag-of-Words model and the Skip-gram model. The Word2Vec Skip-gram model is a shallow Neural Network with a single hidden layer that takes in a word as input and tries to predict the words that surround it (its context) as output.
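
To make the Skip-gram idea concrete, the sketch below lists the (target, context) training pairs that a window of a given size would produce for one tokenised message. It only illustrates how the input and output words are paired up, not how the network is trained; the function name skipgram_pairs and the sample tokens are assumptions.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for every word within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


pairs = skipgram_pairs(["free", "prize", "call", "now"], window=1)
for target, context in pairs:
    print(target, "->", context)
```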

In this case, we will be using Gensim’s Word2Vec for creating the model. Some of the important parameters are as follows:

size: The number of dimensions of the embeddings (renamed vector_size in Gensim 4.0 and later). The default is 100.
window: The maximum distance between a target word and the words around it. The default is 5.
min_count: The minimum number of occurrences a word needs in order to be considered when training the model. Words that occur fewer times than this are ignored. The default is 5.
workers: The number of worker threads used during training. The default is 3.
sg: The training algorithm, either Continuous Bag-of-Words (0) or Skip-gram (1). The default is Continuous Bag-of-Words.

Next, we’ll see how to use the Word2Vec model to generate a vector for each document in the dataset. Word2Vec vectors are generated for each SMS message in the training data by traversing through the dataset. By simply using the model on each word of a text message, we retrieve the word embedding vectors for those words. We then represent the message by calculating the average over all of the vectors of the words in its text.
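
The averaging step can be sketched with NumPy as follows. The embeddings here are a hand-made dictionary standing in for a trained model, and two choices are assumptions of this sketch rather than something stated in the post: words missing from the vocabulary are skipped, and a message with no known words maps to a zero vector.

```python
import numpy as np


def message_vector(tokens, embeddings, dim):
    """Average the embeddings of the tokens that are in the vocabulary."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)  # assumption: no known words -> zero vector
    return np.mean(vectors, axis=0)


# Hypothetical 3-dimensional embeddings, for illustration only.
embeddings = {
    "free": np.array([1.0, 0.0, 2.0]),
    "prize": np.array([3.0, 2.0, 0.0]),
}

vec = message_vector(["free", "prize", "unknownword"], embeddings, dim=3)
print(vec)
```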

MODEL TRAINING AND CLASSIFICATION

Let’s first encode our target labels spam and not_spam. This involves converting the categorical values to numerical values. We’ll then assign the features to the variable X and the target labels to the variable y. Lastly, we’ll split the pre-processed data into two datasets:

Train dataset: For training the SMS text categorisation model.
Test dataset: For validating the performance of the model.

To split the data into two such datasets, we’ll use Scikit-learn’s train_test_split function from the model_selection module. In this case, we’ll split the data into 70% training and 30% testing.
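
A minimal sketch of the encoding and the 70/30 split is shown below. The feature matrix is random placeholder data rather than real message vectors, and stratify=y is an assumption added here to keep the spam ratio the same in both splits; the post does not say whether the original split was stratified.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # placeholder features, one row per message
labels = np.array(["spam"] * 20 + ["not_spam"] * 80)

encoder = LabelEncoder()
y = encoder.fit_transform(labels)  # classes are sorted: not_spam -> 0, spam -> 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```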

For the sake of this post, we’ll use a Decision Tree classifier. In reality, you’d want to evaluate a variety of classifiers using cross-validation to determine which performs best. The “no free lunch” theorem suggests that there is no universally best learning algorithm. In other words, the choice of an appropriate algorithm should be based on its performance for that particular problem and the properties of the data that characterise the problem.
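
One way such a comparison could look is sketched below, using Scikit-learn’s cross_val_score with 5-fold cross-validation and the weighted F1-score. The data is synthetic and the three candidate classifiers are arbitrary picks for illustration, not the ones evaluated for this post.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # synthetic features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic labels

candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbours": KNeighborsClassifier(),
}

# Mean weighted F1-score over 5 folds for each candidate.
results = {
    name: cross_val_score(clf, X, y, cv=5, scoring="f1_weighted").mean()
    for name, clf in candidates.items()
}
best = max(results, key=results.get)
print(results, best)
```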

Once the model is trained, we can evaluate how well it predicts the target labels of the test set. The classification report shows that the model predicts the test samples with a high weighted-average F1-score of 0.94.
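
The training and evaluation step might look like the sketch below. It runs on synthetic, imbalanced data, so the numbers it prints are not the 0.94 reported above; it only shows where the classification report and the weighted-average F1-score come from.

```python
import numpy as np
from sklearn.metrics import classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))    # synthetic features
y = (X[:, 0] > 0.8).astype(int)  # synthetic, imbalanced labels (1 = spam)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(classification_report(y_test, y_pred, target_names=["not_spam", "spam"]))
weighted_f1 = f1_score(y_test, y_pred, average="weighted")
```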

GENERATING ADVERSARIAL SAMPLES

A well-known use case of AML is image classification. This involves adding noise that is imperceptible to the human eye but still fools the classifier.

Figure: Adversarial machine learning in image classification.

There are various methods by which adversarial samples can be generated. Such methods vary in complexity, the speed of their generation, and their performance. An unsophisticated approach towards crafting such samples is to manually perturb the input data points. However, manual perturbations are slow to generate and evaluate by comparison with automatic approaches.

One of the most popular techniques for automatically generating perturbed samples is the Jacobian-based Saliency Map Attack (JSMA). The method relies on the idea that adding small perturbations to an original sample can produce a sample that exhibits adversarial characteristics, in that it is now classified differently by the targeted model.

The JSMA method generates perturbations using saliency maps. A saliency map identifies which features of the input data are most relevant to the model deciding on one class or another; these features, if altered, are the most likely to affect the classification. More specifically, an initial percentage of features (gamma) is chosen to be perturbed by a (theta) amount of noise. The attack then checks whether the added noise has caused the targeted model to misclassify the sample. If it has not, another set of features is selected and a new iteration occurs, until a saliency map appears which can be used to generate an adversarial sample.
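
The loop described above can be illustrated with a heavily simplified sketch. It is not the full JSMA: it perturbs one feature per iteration instead of pairs of features, it only handles a positive theta, and it attacks a hand-written logistic regression model rather than a neural network. The weights w, bias b and the function name saliency_attack are made up for illustration; here gamma caps the fraction of features that may be changed and theta is the amount added to each chosen feature.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def saliency_attack(x, w, b, theta=0.5, gamma=0.3):
    """Push a sample predicted as spam (1) towards not spam (0).

    Simplified single-feature saliency attack on p(spam | x) = sigmoid(w.x + b).
    Returns the perturbed sample and the indices of the modified features.
    """
    x_adv = x.astype(float).copy()
    max_features = int(np.floor(gamma * x.size))  # budget of features to perturb
    modified = set()
    while len(modified) < max_features:
        p_spam = sigmoid(w @ x_adv + b)
        if p_spam < 0.5:  # already classified as not spam: stop
            break
        grad_spam = p_spam * (1.0 - p_spam) * w  # d p(spam) / d x
        grad_ham = -grad_spam                    # d p(not spam) / d x
        # A feature is salient if increasing it raises p(not spam) and lowers p(spam).
        saliency = np.where(
            (grad_ham > 0) & (grad_spam < 0), grad_ham * np.abs(grad_spam), 0.0
        )
        for i in modified:  # each feature is perturbed at most once
            saliency[i] = 0.0
        best = int(np.argmax(saliency))
        if saliency[best] <= 0:  # nothing useful left to perturb
            break
        x_adv[best] += theta
        modified.add(best)
    return x_adv, sorted(modified)


w = np.array([2.0, -1.0, 0.5, -3.0])  # hypothetical model weights
b = 0.0
x = np.array([1.0, 0.0, 1.0, 0.0])    # sample classified as spam

x_adv, changed = saliency_attack(x, w, b, theta=1.0, gamma=0.5)
print(x_adv, changed, sigmoid(w @ x_adv + b))
```

In practice an existing implementation from an adversarial machine learning library would normally be preferred over hand-written code like this.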

A pre-trained multilayer perceptron (MLP) is used as the underlying model for generating the adversarial samples. Here, we explore how different combinations of the JSMA parameters affect the performance of the originally trained Decision Tree.

EVALUATION

To explore how different combinations of the JSMA parameters affect the performance of the trained Decision Tree, adversarial samples were generated from all spam data points present in the testing data using a range of gamma and theta combinations. The adversarial samples were then joined with the non-spam testing data points and presented to the trained model. The heat map reports the overall weighted-average F1-scores for all adversarial combinations of JSMA’s gamma and theta parameters.
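
The evaluation procedure can be sketched as a grid loop that produces the matrix a heat map would be drawn from. Everything below the evaluate_grid function is a toy stand-in: predict is a hand-written linear rule rather than the Decision Tree, and craft is a hypothetical perturbation that simply adds theta to the features with the most negative weights, not JSMA. The parameter values are likewise made up.

```python
import numpy as np
from sklearn.metrics import f1_score


def evaluate_grid(predict, craft, X_spam, X_ham, gammas, thetas):
    """Return a len(gammas) x len(thetas) array of weighted-average F1-scores."""
    y_true = np.concatenate(
        [np.ones(len(X_spam), dtype=int), np.zeros(len(X_ham), dtype=int)]
    )
    scores = np.zeros((len(gammas), len(thetas)))
    for i, gamma in enumerate(gammas):
        for j, theta in enumerate(thetas):
            # Perturb every spam sample, then join with the untouched non-spam samples.
            X_adv = np.array([craft(x, theta, gamma) for x in X_spam])
            X_eval = np.vstack([X_adv, X_ham])
            scores[i, j] = f1_score(
                y_true, predict(X_eval), average="weighted", zero_division=0
            )
    return scores


# Toy stand-ins (assumptions, for illustration only).
w = np.array([2.0, -1.0, 0.5, -3.0])


def predict(X):
    return (X @ w > 0).astype(int)  # 1 = spam, 0 = not spam


def craft(x, theta, gamma):
    k = int(gamma * x.size)  # number of features to perturb
    x_adv = x.astype(float).copy()
    x_adv[np.argsort(w)[:k]] += theta  # raise the features with the most negative weights
    return x_adv


X_spam = np.array([[1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]])
X_ham = np.array([[0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 0.0, 1.0]])

scores = evaluate_grid(predict, craft, X_spam, X_ham, gammas=[0.0, 0.5], thetas=[0.25, 1.0])
print(scores)
```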

The Decision Tree’s F1-score decreased across all of the gamma and theta combinations. When gamma = 0.3 and theta = 0.5, the model’s classification performance decreased by 18 percentage points (F1-score = 0.759). Based on this dataset, gamma = 0.3, theta = 0.5 would therefore be the most effective parameter combination for reducing the performance of a machine learning based SMS spam detector.

CONCLUSION

So, what have I learnt from this analysis?

Due to their effectiveness and flexibility, machine learning based detectors are now recognised as fundamental tools for detecting whether SMS text messages are spam or not. Nevertheless, such systems are vulnerable to attacks that may severely undermine or mislead their capabilities. Adversarial attacks may have severe consequences in such infrastructures, as SMS texts may be modified to bypass the detector.

The next steps would be to explore how such samples can support the robustness of supervised models using adversarial training. This entails including adversarial samples into the training dataset, re-training the model, and evaluating its performance on all adversarial combinations of JSMA’s gamma and theta parameters.

For the full notebook, check out my GitHub repo below: https://github.com/LowriWilliams/SMS_Adversarial_Machine_Learning
