arXiv Academic Digest Notes, 11.29

Table of Contents

I. Autonomous Driving / Object Detection
- Improving Lane Detection Generalization: A Novel Framework using HD Maps for Boosting Diversity
- Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving
- Panacea: Panoramic and Controllable Video Generation for Autonomous Driving
II. AI Security
- RetouchUAA: Unconstrained Adversarial Attack via Image Retouching
- Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective

I. Autonomous Driving / Object Detection

Improving Lane Detection Generalization: A Novel Framework using HD Maps for Boosting Diversity

Link: https://arxiv.org/abs/2311.16589
Authors: Daeun Lee, Minhyeok Heo, Jiwon Kim
Comments: 6 pages, 5 figures
Abstract: Lane detection is a vital task for vehicles to navigate and localize their position on the road. To ensure reliable results, lane detection algorithms must have robust generalization performance in various road environments. However, despite the significant performance improvement of deep learning-based lane detection algorithms, their generalization performance in response to changes in road environments still falls short of expectations. In this paper, we present a novel framework for single-source domain generalization (SSDG) in lane detection. By decomposing data into lane structures and surroundings, we enhance diversity using High-Definition (HD) maps and generative models. Rather than expanding data volume, we strategically select a core subset of data, maximizing diversity and optimizing performance. Our extensive experiments demonstrate that our framework enhances the generalization performance of lane detection, reaching results comparable to the domain adaptation-based method.
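The idea of picking a small, diverse core subset instead of growing the dataset can be pictured with greedy farthest-point (k-center) selection. The abstract does not say which selection criterion the authors use, so the algorithm, the function name `select_diverse_subset`, and the toy 2D feature vectors below are all assumptions made for illustration.

```python
import math


def select_diverse_subset(features, k):
    """Greedy farthest-point selection (hypothetical helper, not the paper's method).

    Returns k indices whose feature vectors are spread out in feature space.
    Assumes the data contains at least k distinct points.
    """
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and len(features)")
    selected = [0]  # seed with the first sample; the choice of seed is arbitrary
    # Distance from every sample to its nearest already-selected sample.
    nearest = [math.dist(f, features[0]) for f in features]
    while len(selected) < k:
        # Add the sample that is farthest from everything chosen so far.
        idx = max(range(len(features)), key=nearest.__getitem__)
        selected.append(idx)
        nearest = [min(d, math.dist(f, features[idx]))
                   for d, f in zip(nearest, features)]
    return selected


# Toy features: two tight clusters plus one isolated point.
toy_features = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.0, 0.1), (5.0, 5.0)]
core_indices = select_diverse_subset(toy_features, 3)
```

On the toy data the selection takes one point from each cluster plus the isolated point, rather than several near-duplicates from the same cluster. In practice the feature vectors would come from some embedding of lane structure or surroundings, which this sketch leaves unspecified.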

Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird’s Eye View Segmentation for Connected and Autonomous Driving

Link: https://arxiv.org/abs/2311.16754
Authors: Senkang Hu, Zhengru Fang, Xianhao Chen, Yuguang Fang, Sam Kwong
Abstract: Collaborative perception has recently gained significant attention in autonomous driving, improving perception quality by enabling the exchange of additional information among vehicles. However, deploying collaborative perception systems can lead to domain shifts due to diverse environmental conditions and data heterogeneity among connected and autonomous vehicles (CAVs). To address these challenges, we propose a unified domain generalization framework applicable in both training and inference stages of collaborative perception. In the training phase, we introduce an Amplitude Augmentation (AmpAug) method to augment low-frequency image variations, broadening the model’s ability to learn across various domains. We also employ a meta-consistency training scheme to simulate domain shifts, optimizing the model with a carefully designed consistency loss to encourage domain-invariant representations. In the inference phase, we introduce an intra-system domain alignment mechanism to reduce or potentially eliminate the domain discrepancy among CAVs prior to inference. Comprehensive experiments substantiate the effectiveness of our method in comparison with the existing state-of-the-art works. Code will be released at https://github.com/DG-CAVs/DG-CoPerception.git.
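To make "augmenting low-frequency image variations" concrete, here is a minimal sketch of a generic Fourier amplitude perturbation: the low-frequency amplitudes of a grayscale image are rescaled by a random factor while the phase is left alone. This is not the authors' AmpAug implementation, which the abstract does not specify. The helper names `dft2` and `amplitude_augment`, the square low-frequency window, and the `radius` and `strength` parameters are assumptions. The naive DFT is only practical for tiny grids; real code would use an FFT library.

```python
import cmath
import random


def dft2(grid, inverse=False):
    """Naive 2D DFT of a small H x W grid; O((H*W)^2), for illustration only."""
    h, w = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    norm = h * w if inverse else 1
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    phase = sign * 2 * cmath.pi * (u * y / h + v * x / w)
                    total += grid[y][x] * cmath.exp(1j * phase)
            row.append(total / norm)
        out.append(row)
    return out


def amplitude_augment(image, radius=1, strength=0.5, rng=None):
    """Rescale low-frequency amplitudes by one random factor (hypothetical sketch).

    Assumes 0 <= strength < 1 so the factor stays positive and phases are kept.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    # One shared factor keeps the spectrum conjugate-symmetric,
    # so the inverse transform stays real-valued.
    scale = 1.0 + strength * rng.uniform(-1.0, 1.0)
    spectrum = dft2(image)
    for u in range(h):
        for v in range(w):
            # Wrapped index distance from the zero-frequency (DC) term.
            if min(u, h - u) <= radius and min(v, w - v) <= radius:
                spectrum[u][v] *= scale
    restored = dft2(spectrum, inverse=True)
    return [[value.real for value in row] for row in restored]


toy_image = [
    [0.1, 0.5, 0.9, 0.3],
    [0.2, 0.6, 0.8, 0.4],
    [0.7, 0.3, 0.1, 0.9],
    [0.5, 0.5, 0.2, 0.6],
]
augmented = amplitude_augment(toy_image, radius=1, strength=0.5,
                              rng=random.Random(0))
```

Because only the low-frequency band is touched, coarse properties such as overall brightness and contrast shift while the high-frequency coefficients that carry edges are left unchanged.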

Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

Link: https://arxiv.org/abs/2311.16813
Authors: Yuqing Wen, Yucheng Zhao, Yingfei Liu, Fan Jia, Yanhui Wang, Chong Luo, Chi Zhang, Tiancai Wang, Xiaoyan Sun, Xiangyu Zhang
Comments: Project page: this https URL
Abstract: The field of autonomous driving increasingly demands high-quality annotated training data. In this paper, we propose Panacea, an innovative approach to generate panoramic and controllable videos in driving scenarios, capable of yielding an unlimited number of diverse, annotated samples pivotal for autonomous driving advancements. Panacea addresses two critical challenges: ‘Consistency’ and ‘Controllability.’ Consistency ensures temporal and cross-view coherence, while Controllability ensures the alignment of generated content with corresponding annotations. Our approach integrates a novel 4D attention and a two-stage generation pipeline to maintain coherence, supplemented by the ControlNet framework for meticulous control via Bird’s-Eye-View (BEV) layouts. Extensive qualitative and quantitative evaluations of Panacea on the nuScenes dataset prove its effectiveness in generating high-quality multi-view driving-scene videos. This work notably propels the field of autonomous driving by effectively augmenting the training dataset used for advanced BEV perception techniques.

II. AI Security

RetouchUAA: Unconstrained Adversarial Attack via Image Retouching

Link: https://arxiv.org/abs/2311.16478
Authors: Mengda Xie, Yiling He, Meie Fang
Abstract: Deep Neural Networks (DNNs) are susceptible to adversarial examples. Conventional attacks generate controlled noise-like perturbations that fail to reflect real-world scenarios and are hard to interpret. In contrast, recent unconstrained attacks mimic natural image transformations occurring in the real world for perceptible but inconspicuous attacks, yet compromise realism due to neglect of image post-processing and uncontrolled attack direction. In this paper, we propose RetouchUAA, an unconstrained attack that exploits a real-life perturbation: image retouching styles, highlighting its potential threat to DNNs. Compared to existing attacks, RetouchUAA offers several notable advantages. Firstly, RetouchUAA excels in generating interpretable and realistic perturbations through two key designs: the image retouching attack framework and the retouching style guidance module. The former, a custom-designed, human-interpretable retouching framework for adversarial attack that linearizes images while modelling the local processing and retouching decision-making in human retouching behaviour, provides an explicit and reasonable pipeline for understanding the robustness of DNNs against retouching. The latter guides the adversarial image towards standard retouching styles, thereby ensuring its realism. Secondly, owing to the design of the retouching decision regularization and the persistent attack strategy, RetouchUAA also exhibits outstanding attack capability and defense robustness, posing a serious threat to DNNs. Experiments on ImageNet and Place365 reveal that RetouchUAA achieves nearly 100% white-box attack success against three DNNs, while achieving a better trade-off between image naturalness, transferability and defense robustness than baseline attacks.

Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective

Link: https://arxiv.org/abs/2311.16646
Authors: Ming-Yu Chung, Sheng-Yen Chou, Chia-Mu Yu, Pin-Yu Chen, Sy-Yen Kuo, Tsung-Yi Ho
Comments: 19 pages, 4 figures
Abstract: Dataset distillation offers a potential means to enhance data efficiency in deep learning. Recent studies have shown its ability to counteract backdoor risks present in original training samples. In this study, we delve into the theoretical aspects of backdoor attacks and dataset distillation based on kernel methods. We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation. Following a comprehensive set of analyses and experiments, we show that our optimization-based trigger design framework informs effective backdoor attacks on dataset distillation. Notably, datasets poisoned by our designed trigger prove resilient against conventional backdoor attack detection and mitigation methods. Our empirical results validate that the triggers developed using our approaches are proficient at executing resilient backdoor attacks.