Solving Nash Equilibria with Python

Background

Consider the problem of computing the Nash equilibria of a two-player (bimatrix) game, with the following setup:

  • Player $0$: actions $\{0, \cdots, m-1\}$, payoff matrix $A \in \mathbb{R}_{+}^{m \times n}$, mixed strategy $p \in \mathbb{R}_{+}^{m \times 1}$
  • Player $1$: actions $\{m, \cdots, m+n-1\}$, payoff matrix $B \in \mathbb{R}_{+}^{m \times n}$, mixed strategy $q \in \mathbb{R}_{+}^{n \times 1}$

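Best-response conditions (every action played with positive probability must earn the maximal expected payoff):

  • $p_i > 0 \implies (Aq)_i = u$, where $u = \max_k (Aq)_k$
  • $q_j > 0 \implies (B^Tp)_j = v$, where $v = \max_k (B^Tp)_k$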
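The best-response conditions can be checked numerically for a candidate pair of strategies. The following is a minimal sketch, not the article's solver: the function name `is_nash` and the matching-pennies payoffs are illustrative assumptions, and exact fractions are used so that equality tests are safe.

```python
from fractions import Fraction


def is_nash(A, B, p, q):
    """Check the best-response conditions for the bimatrix game (A, B)."""
    m, n = len(A), len(A[0])
    # Expected payoff of each pure action against the opponent's mixed strategy.
    row_payoffs = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]  # A q
    col_payoffs = [sum(B[i][j] * p[i] for i in range(m)) for j in range(n)]  # B^T p
    u, v = max(row_payoffs), max(col_payoffs)
    # Every action in the support must attain the maximal payoff.
    rows_ok = all(p[i] == 0 or row_payoffs[i] == u for i in range(m))
    cols_ok = all(q[j] == 0 or col_payoffs[j] == v for j in range(n))
    return rows_ok and cols_ok


# Matching pennies (illustrative payoffs, not taken from the article).
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
half = Fraction(1, 2)

print(is_nash(A, B, [half, half], [half, half]))  # True: uniform mixing
print(is_nash(A, B, [1, 0], [1, 0]))              # False: player 1 would deviate
```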

Polyhedra:

  • $P^{\dagger}: \{ (p,v) \mid 1^Tp = 1,\ p \succcurlyeq 0,\ B^Tp \preccurlyeq v \}$
  • $Q^{\dagger}: \{ (q,u) \mid 1^Tq = 1,\ q \succcurlyeq 0,\ Aq \preccurlyeq u \}$

Polytopes:

  • $P: \{ p \mid p \succcurlyeq 0,\ B^Tp \preccurlyeq 1 \}$
  • $Q: \{ q \mid q \succcurlyeq 0,\ Aq \preccurlyeq 1 \}$
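The polytope is obtained from the polyhedron by a rescaling. A sketch of that step, assuming the payoffs have been shifted so that $B$ is entrywise positive (hence $v > 0$), and writing $x$ for a point of $P$ (a symbol introduced here only for clarity):

```latex
x = \frac{p}{v}
\quad\Longrightarrow\quad
x \succcurlyeq 0, \qquad
B^T x = \frac{B^T p}{v} \preccurlyeq 1, \qquad
1^T x = \frac{1}{v}.
\\[6pt]
\text{Conversely, for } x \in P,\ x \neq 0: \qquad
p = \frac{x}{1^T x}, \qquad
v = \frac{1}{1^T x}.
```

The same argument applies to $Q^{\dagger}$ and $Q$ with $A$, $q$ and $u$. The origin has no preimage under this map, and every other point is turned back into a probability vector by dividing by the sum of its components.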

Only the vertices other than the origin need to be computed.

Algorithm overview

Compute the vertices of $P$ and the vertices of $Q$ separately, then match the two sets pairwise. A pair $(p, q)$ is kept when:

  • $(Aq)_i < 1$ if and only if $p_i = 0$
  • $(B^Tp)_j < 1$ if and only if $q_j = 0$
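The matching step can be illustrated on a tiny game with a brute-force, exact-arithmetic sketch. This is not the article's implementation (which relies on SciPy): the helper names `solve_linear`, `vertices` and `equilibria`, the labelling scheme, and the 2 × 2 payoffs are all assumptions made for illustration. Each vertex carries the labels of its tight constraints, and a pair is an equilibrium when the two label sets together cover all $m + n$ actions.

```python
from fractions import Fraction
from itertools import combinations


def solve_linear(rows, rhs):
    """Solve a square system exactly by Gauss-Jordan elimination; None if singular."""
    size = len(rows)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][size] for r in range(size)]


def vertices(M):
    """Nonzero vertices of {x >= 0, M x <= 1}, mapped to their tight constraints.

    Constraint k < len(M) is row k of M x <= 1; constraint len(M) + k is x_k >= 0.
    """
    num, dim = len(M), len(M[0])
    rows = [list(row) for row in M] + [[int(i == k) for i in range(dim)] for k in range(dim)]
    rhs = [1] * num + [0] * dim
    found = {}
    for chosen in combinations(range(num + dim), dim):
        x = solve_linear([rows[c] for c in chosen], [rhs[c] for c in chosen])
        if x is None or not any(x):
            continue  # singular system, or the origin
        value = [sum(a * b for a, b in zip(rows[c], x)) for c in range(num + dim)]
        if all(s <= 1 for s in value[:num]) and all(s >= 0 for s in value[num:]):
            found[tuple(x)] = {c for c in range(num + dim) if value[c] == rhs[c]}
    return found


def equilibria(A, B):
    """Match vertices of P and Q whose labels cover all m + n actions."""
    m, n = len(A), len(A[0])
    Bt = [[B[i][j] for i in range(m)] for j in range(n)]
    result = []
    for p, p_tight in vertices(Bt).items():
        # Label i for "p_i = 0", label m + j for "(B^T p)_j = 1".
        p_labels = {m + c if c < n else c - n for c in p_tight}
        for q, q_tight in vertices(A).items():
            # In Q the constraint index already equals the label.
            if len(p_labels | set(q_tight)) == m + n:
                result.append(([x / sum(p) for x in p], [x / sum(q) for x in q]))
    return result


# Illustrative positive payoffs (a coordination-style game, not from the article).
A = [[3, 1], [1, 2]]
B = [[2, 1], [1, 3]]
for p, q in equilibria(A, B):
    print([str(x) for x in p], [str(x) for x in q])
```

For this game the sketch finds the two pure equilibria and the mixed one with $p = (2/3, 1/3)$, $q = (1/3, 2/3)$. Enumerating all constraint subsets is exponential and only meant to make the labelling rule concrete.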

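Further notes

A mixed strategy with $k$ nonzero components is said to have support size $k$ (a $k$-support strategy).
The game is called nondegenerate if every $k$-support strategy has at most $k$ pure best responses.
In a nondegenerate game, the two strategies of any Nash equilibrium have equal support sizes: if $p$ is $k$-support then $q$ is $k$-support, and if $q$ is $k$-support then $p$ is $k$-support.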
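To make the support and degeneracy notions concrete, here is a small sketch that compares the support size of a strategy with the number of pure best responses against it. The helper names and the payoff matrix are hypothetical, chosen so that one strategy exhibits a degeneracy.

```python
from fractions import Fraction


def support(strategy):
    """Indices of the actions played with nonzero probability."""
    return [k for k, prob in enumerate(strategy) if prob != 0]


def pure_best_responses(B, p):
    """Player 1's pure best responses (column indices) against player 0's mix p."""
    m, n = len(B), len(B[0])
    payoffs = [sum(B[i][j] * p[i] for i in range(m)) for j in range(n)]  # B^T p
    best = max(payoffs)
    return [j for j, value in enumerate(payoffs) if value == best]


# Hypothetical payoff matrix for player 1, chosen to exhibit a degeneracy.
B = [[1, 1],
     [0, 2]]

pure_row = [1, 0]                          # support size 1
mixed = [Fraction(1, 2), Fraction(1, 2)]   # support size 2

print(len(support(pure_row)), pure_best_responses(B, pure_row))  # 1 [0, 1]
print(len(support(mixed)), pure_best_responses(B, mixed))        # 2 [1]
```

The first line shows a 1-support strategy with two pure best responses, so this example game is degenerate; the second strategy respects the bound.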

Complexity:

  • support enumeration: $O\left(4^{\min\{m,n\}}\right)$
  • vertex enumeration: $O\left(2.6^{\min\{m,n\}}\right)$ (this follows from a property of high-dimensional polytopes)
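One way to see where the $4^{\min\{m,n\}}$ figure for support enumeration comes from is to count the candidate support pairs. A rough sketch, assuming a nondegenerate game so that only equal-size supports need to be tried; `count_support_pairs` is a name introduced here:

```python
from math import comb


def count_support_pairs(m, n):
    """Number of (row support, column support) pairs of equal size k >= 1."""
    return sum(comb(m, k) * comb(n, k) for k in range(1, min(m, n) + 1))


for size in (2, 5, 10):
    # For square games the count equals C(2n, n) - 1, which stays below 4^n.
    print(size, count_support_pairs(size, size), comb(2 * size, size) - 1, 4 ** size)
```

For $m = n = 10$ this gives 184755 candidate pairs against $4^{10} = 1048576$, so the bound is loose by a polynomial factor but captures the exponential growth.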

Code

Data format

http://www.gambit-project.org/gambit14/formats.html
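In the payoff-list variant of the `.nfg` format, outcomes are listed with the first player's action varying fastest, and each outcome contributes one payoff per player. This is the ordering that `get_bimatrix` relies on when it reads index `(i + j * m) * 2`. The sketch below shows the round trip for a two-player game; the helper names and the sample matrices are illustrative assumptions.

```python
def flatten_bimatrix(A, B):
    """Lay out a bimatrix game in .nfg payoff-list order."""
    m, n = len(A), len(A[0])
    payoff_list = []
    for j in range(n):          # player 1's action varies slowest
        for i in range(m):      # player 0's action varies fastest
            payoff_list += [A[i][j], B[i][j]]
    return payoff_list


def unflatten_bimatrix(m, n, payoff_list):
    """Inverse of flatten_bimatrix, using the same indexing as get_bimatrix."""
    A = [[payoff_list[(i + j * m) * 2 + 0] for j in range(n)] for i in range(m)]
    B = [[payoff_list[(i + j * m) * 2 + 1] for j in range(n)] for i in range(m)]
    return A, B


# Illustrative matrices with distinct entries so the ordering is visible.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
payoffs = flatten_bimatrix(A, B)

# A hypothetical file body built from these payoffs.
print('NFG 1 R "Example" { "Player 1" "Player 2" } { 2 2 }')
print()
print(' '.join(str(x) for x in payoffs))  # 1 5 3 7 2 6 4 8
```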

import re
import os
import sys
import time

import nash2
import nash3

# Enable ANSI colour codes in the Windows console.
if os.name == 'nt':
    os.system('color')


def load_nfg(nfg_path):
    '''Parse a Gambit .nfg file (payoff-list form) into players, actions and payoffs.'''
    nfg_comment_list = []
    nfg_player_list = []
    nfg_action_list = []
    nfg_payoff_list = []
    with open(nfg_path, 'rt') as nfg_file:
        first_line = nfg_file.readline()
        assert first_line.startswith('NFG 1 R')
        # Drop the header prefix (str.lstrip would strip a character set, not a prefix).
        first_line = first_line[len('NFG 1 R') :].lstrip()
        while not first_line.count('"') >= 1:
            first_line = nfg_file.readline().rstrip('\n')
        # Read the quoted title, which may span several lines.
        comment_line = first_line.lstrip('"')
        comment_line = comment_line.replace('\\"', '\\x22')
        while '"' not in comment_line:
            nfg_comment_list.append(comment_line)
            comment_line = nfg_file.readline().rstrip('\n')
            comment_line = comment_line.replace('\\"', '\\x22')
        # Read the two brace groups: player names and action counts.
        meta_line = comment_line[comment_line.find('"') + 1 :]
        while not (meta_line.count('{') >= 2 and meta_line.count('}') >= 2):
            meta_line += nfg_file.readline().rstrip('\n')
        string_pattern = re.compile(r'\{([^\}]+)\}')
        player_pattern = re.compile(r'"([^"]+)"')
        action_pattern = re.compile(r'([0-9]+)')
        meta_string = string_pattern.findall(meta_line)
        player_string = meta_string[0]
        action_string = meta_string[1]
        for player in player_pattern.findall(player_string):
            nfg_player_list.append(player)
        for action in action_pattern.findall(action_string):
            nfg_action_list.append(int(action))
        # The first non-empty line after the header holds the payoff list.
        payoff_line = None
        while not payoff_line:
            payoff_line = nfg_file.readline().rstrip('\n')
        for payoff in payoff_line.strip().split():
            nfg_payoff_list.append(int(payoff))
    return len(nfg_player_list), nfg_action_list, nfg_payoff_list


def dump_ne(ne_path, ne_list):
    '''Write one equilibrium per line as comma-separated numbers.'''
    with open(ne_path, 'wt') as ne_file:
        for ne_record in ne_list:
            ne_file.write(repr(tuple(ne_record)).lstrip('(').rstrip(')'))
            ne_file.write('\n')


def solve_ne(nfg_path, ne_path):
    player_number, action_list, payoff_list = load_nfg(nfg_path)
    if player_number == 2:
        ne_list = nash2.solve_ne_two_mixed(action_list, payoff_list)
    else:
        ne_list = nash3.solve_ne_many_pure(player_number, action_list, payoff_list)
    dump_ne(ne_path, ne_list)


def load_ne(ne_path):
    ne_list = []
    component_pattern = re.compile(r'[^,]+')
    with open(ne_path, 'rt') as ne_file:
        while (ne_line := ne_file.readline().rstrip('\n').lstrip('[').rstrip(']')):
            ne_record = []
            for component_string in component_pattern.findall(ne_line):
                # eval also accepts expressions such as 1/3; use only on trusted files.
                ne_component = eval(component_string)
                if abs(ne_component) < 1e-8:
                    ne_component = 0.0
                ne_record.append(ne_component)
            ne_list.append(tuple(ne_record))
    ne_list.sort()
    return ne_list


def compare_ne(ne_path, ne_tgt_path):
    '''Return True when both files hold the same equilibria up to 1e-8.'''
    ne_list = load_ne(ne_path)
    ne_tgt_list = load_ne(ne_tgt_path)
    if len(ne_list) != len(ne_tgt_list):
        return False
    for ne_record, ne_tgt_record in zip(ne_list, ne_tgt_list):
        for ne_component, ne_tgt_component in zip(ne_record, ne_tgt_record):
            if abs(ne_component - ne_tgt_component) > 1e-8:
                return False
    return True


def log_one(ansi, stem, comment):
    print('{} \033[{}m[{}]\033[0m {}'.format(time.strftime('%H:%M:%S'), ansi, stem, comment))


def deploy_all(nfg_dir, ne_dir, stem_list):
    for stem in stem_list:
        timestamp_before = time.time()
        solve_ne(
            os.path.join(nfg_dir, '{}.nfg'.format(stem)),
            os.path.join(ne_dir, '{}.ne'.format(stem)),
        )
        timestamp_after = time.time()
        ansi = 93
        log_one(
            ansi,
            stem,
            'solve_ne in {:.2f}s'.format(timestamp_after - timestamp_before),
        )


def difftest_all(ne_dir, ne_tgt_dir, stem_list):
    for stem in stem_list:
        is_same = compare_ne(
            os.path.join(ne_dir, '{}.ne'.format(stem)),
            os.path.join(ne_tgt_dir, '{}.ne'.format(stem)),
        )
        if is_same:
            ansi = 92
        else:
            ansi = 91
        log_one(
            ansi,
            stem,
            'compare_ne as {:s}'.format('good' if is_same else 'bad'),
        )


def deploy():
    curr_dir, _ = os.path.split(__file__)
    nfg_dir = os.path.join(curr_dir, 'input')
    ne_dir = os.path.join(curr_dir, 'output')
    stem_list = []
    for nfg_name in os.listdir(nfg_dir):
        if nfg_name.endswith('.nfg'):
            # Slice off the extension; rstrip('.nfg') would also eat trailing n/f/g letters.
            stem_list.append(nfg_name[: -len('.nfg')])
    deploy_all(nfg_dir, ne_dir, stem_list)


def difftest():
    curr_dir, _ = os.path.split(__file__)
    nfg_dir = os.path.join(curr_dir, 'example_input')
    ne_dir = os.path.join(curr_dir, 'example_output')
    ne_tgt_dir = os.path.join(curr_dir, 'example_input')
    stem_list = []
    for nfg_name in os.listdir(nfg_dir):
        if nfg_name.endswith('.nfg'):
            stem_list.append(nfg_name[: -len('.nfg')])
    stem_list.sort()
    deploy_all(nfg_dir, ne_dir, stem_list)
    difftest_all(ne_dir, ne_tgt_dir, stem_list)


if __name__ == '__main__':
    supported_routines = {'deploy': deploy, 'difftest': difftest}
    if len(sys.argv) > 1 and (routine := sys.argv[1]) in supported_routines:
        supported_routines[routine]()
    else:
        difftest()
        deploy()

Two-player mixed-strategy Nash equilibrium

# Module imported as `nash2` by the driver script.
import numpy as np
from scipy.optimize import linprog
from scipy.spatial import HalfspaceIntersection


def get_bimatrix(action_list, payoff_list):
    m, n = action_list
    A = [[None for j in range(n)] for i in range(m)]
    B = [[None for j in range(n)] for i in range(m)]
    # Payoff list order: player 0's action varies fastest, two payoffs per outcome.
    for j in range(n):
        for i in range(m):
            A[i][j] = payoff_list[(i + j * m) * 2 + 0]
            B[i][j] = payoff_list[(i + j * m) * 2 + 1]
    A = np.array(A)
    B = np.array(B)
    # Shift payoffs so that every entry is non-negative.
    A -= A.min()
    B -= B.min()
    return m, n, A, B


def get_polytope(mat):
    # Halfspaces in SciPy's [normal, offset] form: mat @ x - 1 <= 0 and -x <= 0.
    num, dim = mat.shape
    empathy_polyhedron = np.concatenate((mat, -np.ones((num, 1))), axis=1)
    sanity_polyhedron = np.concatenate((-np.identity(dim), np.zeros((dim, 1))), axis=1)
    return np.concatenate((empathy_polyhedron, sanity_polyhedron), axis=0)


def get_feasible(mat):
    # Find an interior point by maximising the radius of an inscribed ball.
    num, dim = mat.shape
    A_ub = np.concatenate(
        (
            np.concatenate((mat, -np.identity(dim)), axis=0),
            np.concatenate(
                (np.linalg.norm(mat, axis=1, keepdims=True), np.ones((dim, 1))),
                axis=0,
            ),
        ),
        axis=1,
    )
    b_ub = np.concatenate((np.ones((num, 1)), np.zeros((dim, 1))), axis=0)
    c = np.zeros((dim + 1,))
    c[dim] = -1
    result = linprog(c=c, A_ub=A_ub, b_ub=b_ub, A_eq=None, b_eq=None, bounds=(0, None))
    assert result.status == 0
    return result.x[:-1]


def get_hyperplanes(polytope, vertex):
    # Indices of the constraints that are tight at the vertex.
    A_slack = polytope[:, :-1]
    b_slack = polytope[:, -1:]
    (hyperplanes,) = np.nonzero(
        np.isclose(np.ravel(A_slack @ vertex.reshape(-1, 1) + b_slack), 0)
    )
    return hyperplanes.tolist()


def get_vertecies(mat):
    polytope = get_polytope(mat)
    feasible = get_feasible(mat)
    intersection = HalfspaceIntersection(polytope, feasible)
    vertecies = []
    for vertex in intersection.intersections:
        # Skip unbounded intersections and the origin.
        if vertex.max() < np.inf and not np.all(np.isclose(vertex, 0)):
            vertecies.append((vertex, get_hyperplanes(polytope, vertex)))
    return vertecies


def solve_ne_two_mixed(action_list, payoff_list):
    '''Algorithm: polytope vertex enumeration.'''
    m, n, A, B = get_bimatrix(action_list, payoff_list)
    q_vertecies = get_vertecies(A)
    p_vertecies = get_vertecies(B.T)
    mixed_list = []
    for p_vertex, m_hyperplanes in p_vertecies:
        for q_vertex, n_hyperplanes in q_vertecies:
            # Map both sets of tight constraints to a common labelling.
            m_labels = set(m_hyperplanes)
            n_labels = set(map(lambda i: (i + n) % (m + n), n_hyperplanes))
            # A fully labelled pair is an equilibrium.
            if len(m_labels | n_labels) == m + n:
                p = p_vertex / p_vertex.sum()
                q = q_vertex / q_vertex.sum()
                mixed = p.tolist() + q.tolist()
                mixed_list.append(mixed)
    return mixed_list
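`HalfspaceIntersection` requires a point strictly inside the polytope. Reading the constraint matrix built in `get_feasible`, the function appears to obtain one as a Chebyshev centre, that is, the centre of the largest ball inscribed in $\{x \succcurlyeq 0,\ Mx \preccurlyeq 1\}$. Written out as a linear program (with $M_k$ the $k$-th row of the input matrix and $r$ the ball radius, both symbols introduced here):

```latex
\begin{aligned}
\max_{x,\, r} \quad & r \\
\text{s.t.} \quad & M_k x + \lVert M_k \rVert_2 \, r \le 1, && k = 1, \dots, \mathrm{num}, \\
& -x_i + r \le 0, && i = 1, \dots, \mathrm{dim}, \\
& x \succcurlyeq 0, \quad r \ge 0 .
\end{aligned}
```

The objective vector `c` has a single $-1$ in the last position because `linprog` minimises, so minimising $-r$ maximises the radius. The returned `result.x[:-1]` drops $r$ and keeps the centre.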

Multi-player pure-strategy Nash equilibrium

# Module imported as `nash3` by the driver script.
def get_state_number(action_list):
    state_number = 1
    for action_number in action_list:
        state_number *= action_number
    return state_number


def get_state_list(player_number, action_list):
    # Enumerate all pure profiles, with the first player's action varying fastest.
    state_number = get_state_number(action_list)
    state_list = []
    state = [0 for _ in range(player_number)]
    for state_index in range(state_number):
        state_list.append([element for element in state])
        for player in range(player_number):
            state[player] += 1
            if state[player] < action_list[player]:
                break
            else:
                state[player] = 0
    return state_list


def get_player_state_list(player_number, action_list, player):
    # Enumerate the opponents' profiles; the player's own slot is left as None.
    player_state_number = get_state_number(action_list) // action_list[player]
    player_state_list = []
    player_state = [0 for _ in range(player_number)]
    player_state[player] = None
    for player_state_index in range(player_state_number):
        player_state_list.append([element for element in player_state])
        for enemy in range(player_number):
            if enemy != player:
                player_state[enemy] += 1
                if player_state[enemy] < action_list[enemy]:
                    break
                else:
                    player_state[enemy] = 0
            else:
                continue
    return player_state_list


def solve_ne_many_pure(player_number, action_list, payoff_list):
    '''Algorithm: mutual best responses.'''
    state_payoff = {}
    state_response = {}
    for state_index, state in enumerate(get_state_list(player_number, action_list)):
        state_payoff[tuple(state)] = payoff_list[
            player_number * state_index : player_number * (state_index + 1)
        ]
        state_response[tuple(state)] = [False for _ in range(player_number)]
    for player in range(player_number):
        for player_state in get_player_state_list(player_number, action_list, player):
            # First pass: best payoff the player can get against this opponent profile.
            player_state_best_payoff = -float('inf')
            for player_choice in range(action_list[player]):
                player_state[player] = player_choice
                if player_state_best_payoff < state_payoff[tuple(player_state)][player]:
                    player_state_best_payoff = state_payoff[tuple(player_state)][player]
            player_state[player] = None
            # Second pass: mark every choice that attains it as a best response.
            for player_choice in range(action_list[player]):
                player_state[player] = player_choice
                if player_state_best_payoff <= state_payoff[tuple(player_state)][player]:
                    state_response[tuple(player_state)][player] = True
            player_state[player] = None
    # A profile is an equilibrium when it is a best response for every player.
    pure_list = []
    for state in get_state_list(player_number, action_list):
        if all(state_response[tuple(state)]):
            pure = []
            for player in range(player_number):
                player_pure = [0 for _ in range(action_list[player])]
                player_pure[state[player]] = 1
                pure += player_pure
            pure_list.append(pure)
    return pure_list
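The mutual-best-response idea can also be written as a direct deviation check, which is handy for testing on small games. The following is a compact sketch rather than a replacement for the code above: `pure_nash` and the prisoner's dilemma payoff table are illustrative assumptions, and `itertools.product` varies the last player's action fastest, unlike `get_state_list`.

```python
from itertools import product


def pure_nash(action_list, payoff_table):
    """Return every pure profile from which no player gains by deviating alone."""
    equilibria = []
    for state in product(*(range(k) for k in action_list)):
        stable = True
        for player in range(len(action_list)):
            current = payoff_table[state][player]
            for action in range(action_list[player]):
                # Change only this player's action and compare payoffs.
                deviation = state[:player] + (action,) + state[player + 1:]
                if payoff_table[deviation][player] > current:
                    stable = False
        if stable:
            equilibria.append(state)
    return equilibria


# Prisoner's dilemma: action 0 = cooperate, action 1 = defect (illustrative payoffs).
prisoners_dilemma = {
    (0, 0): (-1, -1),
    (0, 1): (-3, 0),
    (1, 0): (0, -3),
    (1, 1): (-2, -2),
}
print(pure_nash([2, 2], prisoners_dilemma))  # [(1, 1)]
```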
