How does Python implement dictionaries?

I was wondering how Python dictionaries work under the hood, particularly the dynamic aspect.

When we create a dictionary, what is its initial size?

If we update it with a lot of elements, I suppose the hash table needs to be enlarged. I suppose the hash function then has to be recomputed for the size of the new, bigger hash table while staying consistent with the previous table?

As you can see, I do not fully understand the internals of this structure.

Solution

When we create a dictionary, what is its initial size?

As can be seen in the source code:

/* PyDict_MINSIZE is the starting size for any new dict.
 * 8 allows dicts with no more than 5 active entries; experiments suggested
 * this suffices for the majority of dicts (consisting mostly of usually-small
 * dicts created to pass keyword arguments).
 * Making this 8, rather than 4 reduces the number of resizes for most
 * dictionaries, without any significant extra memory use.
 */
#define PyDict_MINSIZE 8

Imagine we update it with a lot of key-value pairs. I suppose we need to extend the hash table. I suppose we need to recompute the hash function to adapt to the size of the new, bigger hash table while keeping some consistency with the previous hash table?

CPython checks the hash table size every time we add a key. If the table is two-thirds full, it resizes the hash table based on GROWTH_RATE (currently used*3, with the table size always kept a power of two) and reinserts all existing entries. The keys' hash values do not change; only the slot index derived from each hash is recomputed for the new table size:

/* GROWTH_RATE. Growth rate upon hitting maximum load.
 * Currently set to used*3.
 * This means that dicts double in size when growing without deletions,
 * but have more head room when the number of deletions is on a par with the
 * number of insertions.  See also bpo-17563 and bpo-33205.
 *
 * GROWTH_RATE was set to used*4 up to version 3.2.
 * GROWTH_RATE was set to used*2 in version 3.3.0
 * GROWTH_RATE was set to used*2 + capacity/2 in 3.4.0-3.6.0.
 */
#define GROWTH_RATE(d) ((d)->ma_used*3)
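One way to watch these resizes from Python itself is to track `sys.getsizeof` while inserting keys. This is only an observational sketch: the helper name `resize_points` is made up for illustration, and the reported sizes and the entry counts at which they jump depend on the CPython version and platform, so no particular output is claimed.

```python
import sys

def resize_points(n_keys):
    """Return (entry_count, size_in_bytes) each time the dict's reported size changes."""
    d = {}
    last = sys.getsizeof(d)
    points = [(0, last)]
    for k in range(n_keys):
        d[k] = None
        size = sys.getsizeof(d)
        if size != last:          # the dict's memory footprint just changed
            points.append((len(d), size))
            last = size
    return points

for count, size in resize_points(100):
    print(f"{count:3d} entries -> {size} bytes")
```

The jumps become rarer as the dict grows, which is what a table that roughly doubles on each resize would produce.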

The USABLE_FRACTION is the two thirds I mentioned above:

/* USABLE_FRACTION is the maximum dictionary load.
 * Increasing this ratio makes dictionaries more dense resulting in more
 * collisions.  Decreasing it improves sparseness at the expense of spreading
 * indices over more cache lines and at the cost of total memory consumed.
 *
 * USABLE_FRACTION must obey the following:
 *     (0 < USABLE_FRACTION(n) < n) for all n >= 2
 *
 * USABLE_FRACTION should be quick to calculate.
 * Fractions around 1/2 to 2/3 seem to work well in practice.
 */
#define USABLE_FRACTION(n) (((n) << 1)/3)
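The interaction of the two macros can be sketched in Python. This is a simplified model, not CPython's actual resize code: it assumes insertions only (no deletions) and assumes the new size is the smallest power of two that is at least `used * 3`. The names `usable_fraction`, `next_size` and `simulate_growth` are made up for illustration.

```python
PYDICT_MINSIZE = 8

def usable_fraction(n):
    # Mirrors the C macro (((n) << 1)/3), using integer division.
    return (n << 1) // 3

def next_size(used):
    # Assumption: smallest power-of-two size >= GROWTH_RATE, i.e. used * 3.
    target = used * 3
    size = PYDICT_MINSIZE
    while size < target:
        size <<= 1
    return size

def simulate_growth(n_inserts):
    """Return the table sizes visited while inserting n_inserts distinct keys."""
    size, used = PYDICT_MINSIZE, 0
    sizes = [size]
    for _ in range(n_inserts):
        if used >= usable_fraction(size):  # no usable slot left: grow first
            size = next_size(used)
            sizes.append(size)
        used += 1
    return sizes

print([usable_fraction(n) for n in (8, 16, 32)])  # [5, 10, 21]
print(simulate_growth(30))                        # [8, 16, 32, 64]
```

Under these assumptions the table doubles on every resize, matching the "dicts double in size when growing without deletions" remark in the GROWTH_RATE comment.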

Furthermore, the index calculation is:

i = (size_t)hash & mask;

where mask is HASH_TABLE_SIZE-1.
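As a minimal Python sketch of that masking step (the function name `slot_index` is hypothetical, and the example relies on small non-negative ints hashing to themselves in CPython):

```python
def slot_index(key, table_size):
    # table_size must be a power of two, so table_size - 1 is an all-ones bit mask.
    # Python's & with a positive mask also yields a non-negative index for
    # negative hashes, much like the (size_t) cast in the C code.
    mask = table_size - 1
    return hash(key) & mask

print([slot_index(k, 8) for k in (0, 1, 5, 8, 13)])  # [0, 1, 5, 0, 5]
print([slot_index(k, 16) for k in (5, 13)])          # [5, 13]
```

Keys 5 and 13 share slot 5 in a table of size 8 but land in different slots once the table has 16 entries: the hashes stay the same, and only the mask changes after a resize.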

Here's how hash collisions are dealt with:

perturb >>= PERTURB_SHIFT;
i = (i*5 + perturb + 1) & mask;

Explained in the source code:

The first half of collision resolution is to visit table indices via this recurrence:

    j = ((5*j) + 1) mod 2**i

For any initial j in range(2**i), repeating that 2**i times generates each int in range(2**i) exactly once (see any text on random-number generation for proof). By itself, this doesn't help much: like linear probing (setting j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed order. This would be bad, except that's not the only thing we do, and it's actually *good* in the common cases where hash keys are consecutive. In an example that's really too small to make this entirely clear, for a table of size 2**3 the order of indices is:

    0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]

If two things come in at index 5, the first place we look after is index 2, not 6, so if another comes in at index 6 the collision at 5 didn't hurt it. Linear probing is deadly in this case because there the fixed probe order is the *same* as the order consecutive keys are likely to arrive. But it's extremely unlikely hash codes will follow a 5*j+1 recurrence by accident, and certain that consecutive hash codes do not.
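The recurrence is easy to check in Python. The sketch below (with a made-up helper name, `probe_order`) reproduces the order quoted above for a table of size 2**3:

```python
def probe_order(start, exponent):
    """Indices visited by j = (5*j + 1) mod 2**exponent, beginning at start."""
    size = 2 ** exponent
    j = start
    order = []
    for _ in range(size):
        order.append(j)
        j = (5 * j + 1) % size
    return order

print(probe_order(0, 3))  # [0, 1, 6, 7, 4, 5, 2, 3]
```

Starting from any other index gives a rotation of the same cycle, so every slot is still visited exactly once.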

The other half of the strategy is to get the other bits of the hash code into play. This is done by initializing a (unsigned) vrbl "perturb" to the full hash code, and changing the recurrence to:

    perturb >>= PERTURB_SHIFT;
    j = (5*j) + 1 + perturb;
    use j % 2**i as the next table index;

Now the probe sequence depends (eventually) on every bit in the hash code, and the pseudo-scrambling property of recurring on 5*j+1 is more valuable, because it quickly magnifies small differences in the bits that didn't affect the initial index. Note that because perturb is unsigned, if the recurrence is executed often enough perturb eventually becomes and remains 0. At that point (very rarely reached) the recurrence is on (just) 5*j+1 again, and that's certain to find an empty slot eventually (since it generates every int in range(2**i), and we make sure there's always at least one empty slot).
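Putting both halves together, the full probe sequence can be modelled in Python. This is a sketch rather than CPython's code: `probe_sequence` is a hypothetical name, the hash is assumed to be a 64-bit unsigned value, and `PERTURB_SHIFT` is set to 5, the value defined in CPython's dictobject.c.

```python
from itertools import islice

PERTURB_SHIFT = 5  # value defined in CPython's dictobject.c

def probe_sequence(hash_value, table_size):
    """Yield the slot indices probed for hash_value (endless generator)."""
    mask = table_size - 1
    perturb = hash_value & (2**64 - 1)  # assumption: model a 64-bit unsigned hash
    i = perturb & mask
    while True:
        yield i
        perturb >>= PERTURB_SHIFT
        i = (i * 5 + perturb + 1) & mask

print(list(islice(probe_sequence(5, 8), 5)))   # [5, 2, 3, 0, 1]
print(list(islice(probe_sequence(37, 8), 5)))  # [5, 3, 0, 1, 6]
```

Hashes 5 and 37 collide on the initial index 5 in a table of size 8, but the higher bits of 37 enter through perturb, so the two probe sequences diverge from the second probe onward.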
