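Editor's Note
In this series, we summarize the basic information of the seven articles published in January 2024 in Operations Research, a leading journal in the field, to help readers quickly catch up on new developments.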
Recommended Article 1
● Title: Recovering Dantzig–Wolfe Bounds by Cutting Planes
● Original link: https://doi.org/10.1287/opre.2023.0048
● Authors: Rui Chen, Oktay Günlük, Andrea Lodi
● Published: 2024/01/03
● Abstract:
Dantzig–Wolfe (DW) decomposition is a well-known technique in mixed-integer programming (MIP) for decomposing and convexifying constraints to obtain potentially strong dual bounds. We investigate cutting planes that can be derived using the DW decomposition algorithm and show that these cuts can provide the same dual bounds as DW decomposition. More precisely, we generate one cut for each DW block, and when combined with the constraints in the original formulation, these cuts imply the objective function cut one can simply write using the DW bound. This approach typically leads to a formulation with lower dual degeneracy that consequently has a better computational performance when solved by standard MIP solvers in the original space. We also discuss how to strengthen these cuts to improve the computational performance further. We test our approach on the multiple knapsack assignment problem and the temporal knapsack problem, and we show that the proposed cuts are helpful in accelerating the solution time without the need to implement branch and price.
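To make the statement "one cut per block implies the objective function cut" concrete, the following is a sketch under the standard Lagrangian-duality view of DW decomposition. The notation (K blocks, linking constraints with multipliers \pi, block sets X^k) is our own assumption and not necessarily the paper's, and the equality between the DW bound and the Lagrangian dual is assumed to hold (for example, when each X^k is bounded).

```latex
% Assumed notation: block variables x^k, block costs c^k,
% linking constraints \sum_k A^k x^k \ge b, block feasible sets X^k.
\begin{align*}
z_{\mathrm{DW}}
  &= \min\Big\{ \sum_{k=1}^{K} (c^k)^\top x^k \;:\;
       \sum_{k=1}^{K} A^k x^k \ge b,\;
       x^k \in \operatorname{conv}(X^k)\ \forall k \Big\}
   = \max_{\pi \ge 0} \Big\{ \pi^\top b + \sum_{k=1}^{K} z^k(\pi) \Big\}, \\
z^k(\pi)
  &= \min\big\{ (c^k - (A^k)^\top \pi)^\top x^k \;:\; x^k \in X^k \big\}.
\end{align*}
% One valid cut per block, written for optimal multipliers \pi^*:
\[
  (c^k - (A^k)^\top \pi^*)^\top x^k \;\ge\; z^k(\pi^*), \qquad k = 1,\dots,K.
\]
% Summing the K block cuts and adding (\pi^*)^\top (\sum_k A^k x^k - b) \ge 0
% recovers the objective function cut at the DW bound:
\[
  \sum_{k=1}^{K} (c^k)^\top x^k
  \;\ge\; (\pi^*)^\top b + \sum_{k=1}^{K} z^k(\pi^*)
  \;=\; z_{\mathrm{DW}}.
\]
```

Each block cut is valid because z^k(\pi^*) is by definition the minimum of the left-hand side over X^k. The block cuts together with the original linking constraints therefore give an LP relaxation at least as strong as the single cut on the objective.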
Recommended Article 2
● Title: Data-Driven Minimax Optimization with Expectation Constraints
● Original link: https://doi.org/10.1287/opre.2022.0110
● Authors: Shuoguang Yang, Xudong Li, Guanghui Lan
● Published: 2024/01/05
● Abstract:
Attention to data-driven optimization approaches, including the well-known stochastic gradient descent method, has grown significantly over recent decades, but data-driven constraints have rarely been studied because of the computational challenges of projections onto the feasible set defined by these hard constraints. In this paper, we focus on the nonsmooth convex-concave stochastic minimax regime and formulate the data-driven constraints as expectation constraints. The minimax expectation constrained problem subsumes a broad class of real-world applications, including data-driven robust optimization, optimization with misspecification, and area under the receiver operating characteristic curve (AUC) maximization with fairness constraints. We propose a class of efficient primal-dual algorithms to tackle the minimax expectation constrained problem and show that our algorithms converge at the optimal rate of $O(1/\sqrt{N})$, where $N$ is the number of iterations. We demonstrate the practical efficiency of our algorithms by conducting numerical experiments on large-scale real-world applications.
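As a rough illustration of how a primal-dual scheme can handle an expectation constraint without projecting onto the constrained feasible set, here is a minimal sketch of a generic stochastic primal-dual subgradient method on a one-dimensional toy problem. It is not the paper's algorithm: the toy problem, the function name `primal_dual_sgd`, the box bounds, and the sampling distributions are all assumptions made for illustration. The step size is chosen proportional to 1/sqrt(N), the usual choice behind an O(1/sqrt(N)) guarantee for averaged iterates in convex problems.

```python
import random


def primal_dual_sgd(num_iters=20000, seed=0):
    """Generic stochastic primal-dual sketch (toy problem, assumed here):

        min_x  E[(x - xi)^2]   subject to   E[x - zeta] <= 0,

    with xi ~ N(2, 1) and zeta ~ N(1, 0.1^2). The constraint means x <= 1,
    so the optimum is x* = 1 with Lagrange multiplier lambda* = 2.
    Returns the averaged primal and dual iterates.
    """
    rng = random.Random(seed)
    step = 1.0 / num_iters ** 0.5  # step size proportional to 1/sqrt(N)
    x, lam = 0.0, 0.0
    x_sum, lam_sum = 0.0, 0.0
    for _ in range(num_iters):
        xi = rng.gauss(2.0, 1.0)    # sample for the objective
        zeta = rng.gauss(1.0, 0.1)  # sample for the constraint
        # Stochastic gradients of the Lagrangian L(x, lam) = f(x) + lam * g(x).
        grad_x = 2.0 * (x - xi) + lam
        grad_lam = x - zeta
        # Primal descent onto a simple box; dual ascent onto [0, 10].
        x_next = min(5.0, max(-5.0, x - step * grad_x))
        lam = min(10.0, max(0.0, lam + step * grad_lam))
        x = x_next
        x_sum += x
        lam_sum += lam
    return x_sum / num_iters, lam_sum / num_iters


if __name__ == "__main__":
    x_avg, lam_avg = primal_dual_sgd()
    print(f"averaged x = {x_avg:.3f}, averaged multiplier = {lam_avg:.3f}")
```

The only projections used are onto a box for x and an interval for the multiplier, both trivial to compute. The expectation constraint itself enters only through sampled values of x - zeta in the dual update.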
Recommended Article 3
● Title: Global Optimality Guarantees for Policy Gradient Methods
● Original link: https://doi.org/10.1287/opre.2021.0014
● Authors: Jalaj Bhandari, Daniel Russo
● Published: 2024/01/05
● Abstract:
Policy gradient methods apply to complex, poorly understood control problems by performing stochastic gradient descent over a parameterized class of policies. Unfortunately, even for simple control problems solvable by standard dynamic programming techniques, policy gradient algorithms face nonconvex optimization problems and are widely understood to converge only to a stationary point. This work identifies structural properties, shared by several classic control problems, that ensure the policy gradient objective function has no suboptimal stationary points despite being nonconvex. When these conditions are strengthened, this objective satisfies a Polyak-Łojasiewicz (gradient dominance) condition that yields convergence rates. We also provide bounds on the optimality gap of any stationary point when some of these conditions are relaxed.
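For readers unfamiliar with the Polyak-Łojasiewicz (PL) condition, the following textbook derivation shows why gradient dominance yields a convergence rate even without convexity. It is stated for unconstrained minimization of an L-smooth function f with minimum value f^*; the symbols \mu, L, and \theta_k are our notation, and the paper's exact setting and constants may differ.

```latex
% PL (gradient dominance) condition with constant \mu > 0:
\[
  \tfrac{1}{2}\,\|\nabla f(\theta)\|^2 \;\ge\; \mu\,\big(f(\theta) - f^*\big)
  \qquad \text{for all } \theta .
\]
% Gradient descent with step size 1/L on an L-smooth f:
\[
  \theta_{k+1} = \theta_k - \tfrac{1}{L}\,\nabla f(\theta_k)
  \;\;\Longrightarrow\;\;
  f(\theta_{k+1}) \;\le\; f(\theta_k) - \tfrac{1}{2L}\,\|\nabla f(\theta_k)\|^2
  \;\le\; f(\theta_k) - \tfrac{\mu}{L}\,\big(f(\theta_k) - f^*\big).
\]
% Subtracting f^* from both sides gives linear convergence:
\[
  f(\theta_{k+1}) - f^* \;\le\; \Big(1 - \tfrac{\mu}{L}\Big)\big(f(\theta_k) - f^*\big).
\]
```

The PL condition also makes the "no suboptimal stationary points" property immediate: if \nabla f(\theta) = 0, the inequality forces f(\theta) = f^*.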
Recommended Article 4
● Title: Quality Selection in Two-Sided Markets: A Constrained Price Discrimination Approach
● Original link: https://doi.org/10.1287/opre.2020.0754
● Authors: Bar Light, Ramesh Johari, Gabriel Weintraub
● Published: 2024/01/05
● Abstract:
Online platforms collect rich information about participants and then share some of this information back with them to improve market outcomes. In this paper, we study the following information disclosure problem in two-sided markets: if a platform wants to maximize revenue, which sellers should the platform allow to participate, and how much of its available information about participating sellers’ quality should the platform share with buyers? We study this information disclosure problem in the context of two distinct two-sided market models: one in which the platform chooses prices and the sellers choose quantities (similar to ride sharing), and one in which the sellers choose prices (similar to e-commerce). Our main results provide conditions under which simple information structures commonly observed in practice, such as banning certain sellers from the platform and not distinguishing between participating sellers, maximize the platform’s revenue. The platform’s information disclosure problem naturally transforms into a constrained price discrimination problem in which the constraints are determined by the equilibrium outcomes of the specific two-sided market model being studied. We analyze this constrained price discrimination problem to obtain our structural results.
Recommended Article 5
● Title: A Dynamic Model for Managing Volunteer Engagement
● Original link: https://doi.org/10.1287/opre.2021.0419
● Authors: Baris Ata, Mustafa H. Tongarlak, Deishin Lee, Joy Field
● Published: 2024/01/08
● Abstract:
Nonprofit organizations that provide food, shelter, and other services to people in need rely on volunteers to deliver their services. Unlike paid labor, nonprofit organizations have less control over unpaid volunteers’ schedules, efforts, and reliability. However, these organizations can invest in volunteer engagement activities to ensure a steady and adequate supply of volunteer labor. We study a key operational question of how a nonprofit organization can manage its volunteer workforce capacity to ensure consistent provision of services. In particular, we formulate a multiclass queueing network model to characterize the optimal engagement activities for the nonprofit organization to minimize the costs of enhancing volunteer engagement, while maximizing productive work done by volunteers. Because this problem appears intractable, we formulate an approximating Brownian control problem in the heavy traffic limit and study the dynamic control of that system. Our solution is a nested threshold policy with explicit congestion thresholds that indicate when the nonprofit should optimally pursue various types of volunteer engagement activities. A numerical example calibrated using data from a large food bank shows that our dynamic policy for deploying engagement activities can significantly reduce the food bank’s total annual cost of its volunteer operations while still maintaining almost the same level of social impact. This improvement in performance does not require any additional resources; it only requires that the food bank strategically deploy its engagement activities based on the number of volunteers signed up to work volunteer shifts.
Recommended Article 6
● Title: Selling Quality-Differentiated Products in a Markovian Market with Unknown Transition Probabilities
● Original link: https://doi.org/10.1287/opre.2022.0316
● Authors: N. Bora Keskin, Meng Li
● Published: 2024/01/12
● Abstract:
In this paper, we study a firm’s dynamic pricing problem in the presence of unknown and time-varying heterogeneity in customers’ preferences for quality. The firm offers a standard product as well as a premium product to deal with this heterogeneity. First, we consider a benchmark case in which the transition structure of customer heterogeneity is known. In this case, we analyze the firm’s optimal pricing policy and characterize its key structural properties. Thereafter, we investigate the case of unknown market transition structure and design a simple and practically implementable policy, called the bounded learning policy, which is a combination of two policies that perform poorly in isolation. Measuring performance by regret (i.e., the revenue loss relative to a clairvoyant who knows the underlying changes in the market), we prove that our bounded learning policy achieves the fastest possible convergence rate of regret in terms of the frequency of market shifts. Thus, our policy performs well without relying on precise knowledge of the market transition structure.
Recommended Article 7
● Title: A Pareto Dominance Principle for Data-Driven Optimization
● Original link: https://doi.org/10.1287/opre.2021.0609
● Authors: Tobias Sutter, Bart P. G. Van Parys, Daniel Kuhn
● Published: 2024/01/19
● Abstract:
We propose a statistically optimal approach to construct data-driven decisions for stochastic optimization problems. Fundamentally, a data-driven decision is simply a function that maps the available training data to a feasible action. It can always be expressed as the minimizer of a surrogate optimization model constructed from the data. The quality of a data-driven decision is measured by its out-of-sample risk. An additional quality measure is its out-of-sample disappointment, which we define as the probability that the out-of-sample risk exceeds the optimal value of the surrogate optimization model. The crux of data-driven optimization is that the data-generating probability measure is unknown. An ideal data-driven decision should therefore minimize the out-of-sample risk simultaneously with respect to every conceivable probability measure (and thus in particular with respect to the unknown true measure). Unfortunately, such ideal data-driven decisions are generally unavailable. This prompts us to seek data-driven decisions that minimize the in-sample risk subject to an upper bound on the out-of-sample disappointment, again simultaneously with respect to every conceivable probability measure. We prove that such Pareto dominant data-driven decisions exist under conditions that allow for interesting applications: The unknown data-generating probability measure must belong to a parametric ambiguity set, and the corresponding parameters must admit a sufficient statistic that satisfies a large deviation principle. If these conditions hold, we can further prove that the surrogate optimization model generating the optimal data-driven decision must be a distributionally robust optimization problem constructed from the sufficient statistic and the rate function of its large deviation principle. This shows that the optimal method for mapping data to decisions is, in a rigorous statistical sense, to solve a distributionally robust optimization model. Maybe surprisingly, this result holds irrespective of whether the original stochastic optimization problem is convex or not and holds even when the training data are not independent and identically distributed. As a byproduct, our analysis reveals how the structural properties of the data-generating stochastic process impact the shape of the ambiguity set underlying the optimal distributionally robust optimization model.
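The objects named in this abstract can be written down schematically as follows. This is a sketch in our own notation, inferred from the abstract's wording rather than copied from the paper: \theta is the unknown parameter of the data-generating measure, \hat{S}_T the sufficient statistic computed from T samples, I its large-deviation rate function, c(x, \theta) the risk of action x under \theta, and r > 0 the required decay rate of the disappointment.

```latex
% Distributionally robust surrogate model built from the rate function I:
\[
  \hat{c}_r(x, \theta') \;=\; \sup_{\theta \in \Theta}
     \big\{\, c(x, \theta) \;:\; I(\theta', \theta) \le r \,\big\},
  \qquad
  \hat{x}_r(\theta') \;\in\; \arg\min_{x \in X}\, \hat{c}_r(x, \theta').
\]
% Out-of-sample disappointment constraint, required for every \theta \in \Theta:
\[
  \limsup_{T \to \infty} \frac{1}{T}
  \log \mathbb{P}_{\theta}\!\Big[
     c\big(\hat{x}_r(\hat{S}_T), \theta\big)
       \;>\; \hat{c}_r\big(\hat{x}_r(\hat{S}_T), \hat{S}_T\big)
  \Big] \;\le\; -r .
\]
```

Read this way, the ambiguity set \{\theta : I(\hat{S}_T, \theta) \le r\} is shaped entirely by the rate function, which is how the structure of the data-generating process enters the optimal model.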