Python Basics - Data Structures - LeetCode Essentials - heapq --- Heap Queue Algorithm

Contents

    • Definition of a heap
    • Main heap operations
    • Building a heap
    • Heap sort
    • The heapq module
      • `heapq.heappush(heap, item)`
      • `heapq.heappop(heap)`
      • `heapq.heappushpop(heap, item)`
      • `heapq.heapreplace(heap, item)`
      • `heapq.merge(*iterables, key=None, reverse=False)`
      • `heapq.nlargest(n, iterable, key=None)`
      • `heapq.nsmallest(n, iterable, key=None)`
  • References

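Definition of a heap

The heap queue algorithm is also known as the priority queue algorithm.

  • A heap is a complete binary tree. This property guarantees that the array representation has no gaps: every slot in the array holds a value, never None.
  • If a parent node is at index i, its left and right children are at indices 2i+1 and 2i+2.
  • In a min-heap, every parent node has a value less than or equal to any of its children. We call this condition the heap invariant.

This implementation uses an array for which heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from 0. For the sake of comparison, non-existing elements are considered to be infinite. The most interesting property of a (min-)heap is that its smallest element is always the root, heap[0].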
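The invariant can be checked directly from its definition. The following is a minimal sketch; `is_min_heap` is a hypothetical helper name, not part of heapq.

```python
def is_min_heap(heap):
    """Return True if heap[k] <= both of its children for every index k."""
    n = len(heap)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            # A missing child counts as +infinity, so it never violates the invariant.
            if child < n and heap[k] > heap[child]:
                return False
    return True


print(is_min_heap([1, 3, 2, 7, 19, 17, 26]))  # True
print(is_min_heap([2, 1, 3]))                 # False: heap[0] > heap[1]
```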

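Main heap operations

A heap has two key operations: sift-up and sift-down.
Sift-up is mainly used when adding an element. In the max-heap used in the examples of this section, if a node's value is greater than its parent's, the node is swapped with its parent, and this repeats upward until the heap property holds again.
For example, to add a new element we first append it to the end of the array. If it is greater than its parent (at index (i-1)//2), it sifts up, swapping with the parent each time, until it reaches the root or meets a parent that is greater than or equal to it.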

def up_heapify(nums, i):
    while i > 0:
        parent = (i - 1) // 2
        if nums[i] > nums[parent]:
            nums[i], nums[parent] = nums[parent], nums[i]
            i = parent
        else:
            break
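As a usage sketch, a push for this max-heap is just "append, then sift up". `heap_push` is a hypothetical helper introduced here for illustration; `up_heapify` is repeated so the block runs on its own.

```python
def up_heapify(nums, i):
    """Sift nums[i] up while it is greater than its parent (max-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if nums[i] > nums[parent]:
            nums[i], nums[parent] = nums[parent], nums[i]
            i = parent
        else:
            break


def heap_push(nums, item):
    """Hypothetical helper: append item, then restore the max-heap property."""
    nums.append(item)
    up_heapify(nums, len(nums) - 1)


heap = []
for x in [2, 7, 26, 25, 19]:
    heap_push(heap, x)
print(heap)  # [26, 25, 7, 2, 19]
```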


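Sift-down is mainly used when removing an element (heap sort, for example, always pops the root). After removing the root, we move the last element to the root and sift it down: if it is smaller than either child, we swap it with the larger child, and repeat until it reaches a leaf or neither child is greater than it. More generally, whenever a node in the tree is smaller than one of its children, we sift that node down to restore the heap property.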

def down_heapify(nums, n, i):
    while i < n:
        left = i * 2 + 1
        right = i * 2 + 2
        largest = i
        if left < n and nums[left] > nums[largest]:
            largest = left
        if right < n and nums[right] > nums[largest]:
            largest = right
        if largest == i:
            break
        nums[largest], nums[i] = nums[i], nums[largest]
        i = largest
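The removal procedure described above can be sketched as follows. `heap_pop_max` is a hypothetical helper name; `down_heapify` is repeated so the block is self-contained.

```python
def down_heapify(nums, n, i):
    """Sift nums[i] down within nums[:n] (max-heap)."""
    while i < n:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and nums[left] > nums[largest]:
            largest = left
        if right < n and nums[right] > nums[largest]:
            largest = right
        if largest == i:
            break
        nums[largest], nums[i] = nums[i], nums[largest]
        i = largest


def heap_pop_max(nums):
    """Hypothetical helper: remove and return the root of a max-heap."""
    last = nums.pop()          # raises IndexError if the heap is empty
    if not nums:
        return last
    top = nums[0]
    nums[0] = last             # move the last element to the root
    down_heapify(nums, len(nums), 0)
    return top


heap = [26, 25, 7, 2, 19]
popped = heap_pop_max(heap)
print(popped, heap)  # 26 [25, 19, 7, 2]
```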


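Building a heap

There are two ways to build a heap: repeated sift-up, which takes $O(n \log n)$, and repeated sift-down, which takes $O(n)$. Below we build a max-heap from [2,7,26,25,19,17,1,90,3,36] as an example.
Sift-up construction: given the array [2,7,26,25,19,17,1,90,3,36], we scan from front to back.

  • First comes 2, the root. It has no parent, so there is nothing to sift up.
  • Next comes 7 at index 1. Its parent index is (1-1)//2 = 0. Since 7 is greater than 2, the max-heap property is violated, so 7 sifts up.
  • Next comes 26 at index 2. Its parent index is (2-1)//2 = 0. The root is now 7, and 26 is greater than 7, so 26 sifts up.
  • And so on until every element has been visited. Building the max-heap this way visits all $n$ elements, and each sift-up costs $O(\log n)$, so the total complexity is $O(n \log n)$.

def up_build(nums):
    for i in range(len(nums)):
        up_heapify(nums, i)
    return nums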

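Sift-down construction: the opposite of the sift-up build. We scan the array from back to front, and for each element check whether it is smaller than either of its children; if so, that element sifts down.

  • First, 36, 3, 90, 1 and 17 have no children, so they are skipped.
  • Next, 19 has the child 36. Since 19 < 36, it sifts down.
  • And so on until all nodes have been visited.

def down_build(nums):
    for i in range(len(nums) // 2 - 1, -1, -1):
        down_heapify(nums, len(nums), i)
    return nums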
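To see both constructions side by side, the sketch below runs them on the example array and checks the max-heap property. The four functions restate the ones in this article in compact form; `is_max_heap` is a hypothetical checker added for illustration. The two builds may produce different arrays, and both are valid heaps.

```python
def up_heapify(nums, i):
    """Sift nums[i] up while it is greater than its parent (max-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if nums[i] <= nums[parent]:
            break
        nums[i], nums[parent] = nums[parent], nums[i]
        i = parent


def down_heapify(nums, n, i):
    """Sift nums[i] down within nums[:n] (max-heap)."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and nums[child] > nums[largest]:
                largest = child
        if largest == i:
            return
        nums[i], nums[largest] = nums[largest], nums[i]
        i = largest


def up_build(nums):
    for i in range(len(nums)):
        up_heapify(nums, i)
    return nums


def down_build(nums):
    for i in range(len(nums) // 2 - 1, -1, -1):
        down_heapify(nums, len(nums), i)
    return nums


def is_max_heap(nums):
    """Hypothetical checker: every node is <= its parent."""
    return all(nums[(i - 1) // 2] >= nums[i] for i in range(1, len(nums)))


data = [2, 7, 26, 25, 19, 17, 1, 90, 3, 36]
heap_a = up_build(list(data))    # both builds work in place, so pass copies
heap_b = down_build(list(data))
print(heap_a, is_max_heap(heap_a))
print(heap_b, is_max_heap(heap_b))
```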


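Why is the complexity $O(n)$ rather than $O(n \log n)$? At first glance we visit $n$ nodes and each may sift down $\log n$ levels, which suggests $O(n \log n)$. That estimate is too coarse, however: most nodes sit near the bottom of the tree and can only sift down a few levels. The derivation below makes this precise.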

Assume the binary tree has height $h$.

  • Level $i$ of the tree has $2^i$ nodes, where $i$ runs from $0$ to $h$.
  • The total number of nodes $n$ is:
    $$n = 2^0 + 2^1 + 2^2 + \ldots + 2^h = \sum_{i=0}^{h} 2^i$$

This is a geometric series, so we can apply the geometric sum formula:

$$n = 2^{h+1} - 1$$

which gives the height $h$:

$$h = \log_2(n + 1) - 1$$

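Relation between the number of nodes per level and the number of heapify steps: we sum "number of nodes $\times$ node height" over all levels to get the total number of heapify iterations over all nodes.

  • The number of nodes with height $h-i$ (those at level $i$) is $2^i$.
  • Each of these nodes has height $h-i$.

The total number of heapify steps is therefore (note that a leaf has height $0$, so it contributes $0$ steps):

$$\sum_{i=0}^{h} 2^i \cdot (h - i)$$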

We split the sum and simplify:

$$\sum_{i=0}^{h} 2^i \cdot (h - i) = h \sum_{i=0}^{h} 2^i - \sum_{i=0}^{h} i \cdot 2^i$$

First, compute $\sum_{i=0}^{h} 2^i$:

$$\sum_{i=0}^{h} 2^i = 2^{h+1} - 1$$

Next, compute $\sum_{i=0}^{h} i \cdot 2^i$.

Define $S = \sum_{i=0}^{h} i \cdot 2^i$. Then:

$$2S = \sum_{i=0}^{h} i \cdot 2^{i+1} = \sum_{i=1}^{h+1} (i-1) \cdot 2^i = \sum_{i=1}^{h+1} i \cdot 2^i - \sum_{i=1}^{h+1} 2^i$$

$$2S = \sum_{i=1}^{h+1} i \cdot 2^i - (2^{h+2} - 2)$$

$$2S = \sum_{i=0}^{h+1} i \cdot 2^i - 0 \cdot 2^0 - (2^{h+2} - 2) = S + (h+1) \cdot 2^{h+1} - (2^{h+2} - 2)$$

$$S = (h+1) \cdot 2^{h+1} - 2^{h+2} + 2 = (h-1) \cdot 2^{h+1} + 2$$

Substituting $S$ into the original formula:

$$\sum_{i=0}^{h} 2^i \cdot (h - i) = h \cdot (2^{h+1} - 1) - ((h-1) \cdot 2^{h+1} + 2)$$

$$= h \cdot 2^{h+1} - h - (h-1) \cdot 2^{h+1} - 2 = 2^{h+1} - h - 2$$

Further simplification

Since we know that $n \approx 2^{h+1}$:

$$2^{h+1} \approx n$$

we obtain:

$$\sum_{i=0}^{h} 2^i \cdot (h - i) \approx n$$

Therefore, building the heap takes $O(n)$ time.

This derivation shows that building a heap with sift-down operations takes $O(n)$ time: the total number of heapify steps grows linearly with the number of nodes, which is very efficient.
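The result can be checked numerically. For a full tree of height $h$ the sum evaluates to $2^{h+1} - h - 2$, which equals $n - h - 1$ and is therefore always below $n$. The sketch below compares the direct sum with that closed form; `total_sift_cost` is a hypothetical name.

```python
def total_sift_cost(h):
    """Sum of (nodes at level i) * (height of those nodes) for a full tree of height h."""
    return sum(2**i * (h - i) for i in range(h + 1))


for h in range(6):
    n = 2 ** (h + 1) - 1                  # number of nodes in the full tree
    cost = total_sift_cost(h)
    assert cost == 2 ** (h + 1) - h - 2   # closed form
    assert cost == n - h - 1              # hence strictly less than n
    print(h, n, cost)
```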

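Heap sort

Once the heap operations and heap construction are clear, heap sort is simple:

  • First build the heap with the sift-down construction.
  • Repeatedly pop element 0 of the array, move the last element to position 0, and sift it down. (To save space, the popped element can be stored at the end of the array, at position i, so that positions i through len(nums)-1 are already sorted; the sift-down must then not go past i.)
  • Continue until all elements have been popped; the pop order is a sorted sequence. (When the root is moved to the end like this, a max-heap yields ascending order and a min-heap yields descending order, the reverse of the order in which elements are popped.)

def heap_sort(nums):
    nums = down_build(nums)
    for i in range(len(nums) - 1, 0, -1):
        nums[i], nums[0] = nums[0], nums[i]  # move the root to the end, position i
        down_heapify(nums, i, 0)  # do not sift past i
    return nums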
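For comparison, the same idea can be expressed with the standard heapq module (a min-heap), where popping repeatedly yields ascending order directly. This is a sketch that uses extra memory rather than sorting in place; `heapq_sort` is a hypothetical name.

```python
import heapq


def heapq_sort(iterable):
    """Return a new ascending list by heapifying a copy and popping every item."""
    heap = list(iterable)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


data = [2, 7, 26, 25, 19, 17, 1, 90, 3, 36]
print(heapq_sort(data))  # [1, 2, 3, 7, 17, 19, 25, 26, 36, 90]
```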


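The heapq module

This API differs from textbook heap algorithms in two respects: (a) We use zero-based indexing. This makes the relationship between the index of a node and the indexes of its children slightly less obvious, but it is more suitable since Python uses zero-based indexing. (b) Our pop method returns the smallest item, not the largest (called a "min heap" in textbooks; a "max heap" is more common in texts because of its suitability for in-place sorting).

These two make it possible to view the heap as a regular Python list without surprises: heap[0] is the smallest item, and heap.sort() maintains the heap invariant!

To create a heap, use a list initialized to [], or transform a populated list into a heap with the function heapify().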
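Because the functions documented in this article implement a min-heap only, a common idiom for getting max-heap behaviour with numeric data is to store negated values. The sketch below illustrates that idiom; it is a workaround, not a dedicated heapq feature.

```python
import heapq

nums = [2, 7, 26, 25, 19]

# Store negated values so that the smallest stored item is the largest original one.
max_heap = [-x for x in nums]
heapq.heapify(max_heap)

largest = -heapq.heappop(max_heap)   # 26
heapq.heappush(max_heap, -30)        # push 30
new_largest = -max_heap[0]           # peek without popping: 30
print(largest, new_largest)          # 26 30
```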

heapq.heappush(heap, item)

Push the value item onto the heap, maintaining the heap invariant. heapq builds a min-heap by default.

>>> import heapq
>>> nums = []
>>> heapq.heappush(nums, 2)
>>> heapq.heappush(nums, 7)
>>> heapq.heappush(nums, 26)
>>> heapq.heappush(nums, 19)
>>> heapq.heappush(nums, 17)
>>> heapq.heappush(nums, 1)
>>> heapq.heappush(nums, 90)
>>> heapq.heappush(nums, 3)
>>> heapq.heappush(nums, 36)
>>> nums
[1, 3, 2, 7, 17, 26, 90, 19, 36]

heapq.heappop(heap)

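Pop and return the smallest item from the heap, maintaining the heap invariant. If the heap is empty, IndexError is raised. To access the smallest item without popping it, use heap[0].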

>>> heapq.heappop(nums)
1
>>> nums
[2, 3, 26, 7, 17, 36, 90, 19]
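Peeking with heap[0] and handling the empty case can be sketched as follows. `pop_or_default` is a hypothetical convenience wrapper, not part of heapq.

```python
import heapq


def pop_or_default(heap, default=None):
    """Hypothetical helper: pop the smallest item, or return default if the heap is empty."""
    try:
        return heapq.heappop(heap)
    except IndexError:
        return default


heap = [5, 1, 8]
heapq.heapify(heap)

smallest = heap[0]               # peek: the heap still holds 3 items
first = pop_or_default(heap)     # pops 1
empty = pop_or_default([], -1)   # empty heap, so the default comes back
print(smallest, first, empty, len(heap))  # 1 1 -1 2
```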

heapq.heappushpop(heap, item)

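Push item onto the heap, then pop and return the smallest item from the heap. The combined action runs more efficiently than heappush() followed by a separate call to heappop().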

>>> heapq.heappushpop(nums, 5)
2
>>> nums
[3, 5, 26, 7, 17, 36, 90, 19]

heapq.heapify(x)
Transform list x into a heap, in-place, in linear time.

>>> nums =  [2,7,26,25,19,17,1,90,3,36]
>>> heapq.heapify(nums)
>>> nums
[1, 3, 2, 7, 19, 17, 26, 90, 25, 36]

heapq.heapreplace(heap, item)

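Pop and return the smallest item from the heap, and also push the new item. The heap size doesn't change. If the heap is empty, IndexError is raised.
This one-step operation is more efficient than a heappop() followed by heappush() and can be more appropriate when using a fixed-size heap. The pop/push combination always returns an element from the heap and replaces it with item.
The value returned may be larger than the item added. If that isn't desired, consider using heappushpop() instead. Its push/pop combination returns the smaller of the two values, leaving the larger value on the heap.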

>>> nums
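[1, 3, 2, 7, 19, 17, 26, 90, 25, 36]
>>> a = nums  # a is another name for the same list, not a copy
>>> heapq.heapreplace(a, 0) # pops the root regardless of how it compares with the new item
1
>>> a
[0, 3, 2, 7, 19, 17, 26, 90, 25, 36]
>>> b = nums  # still the same list, already modified by heapreplace above
>>> heapq.heappushpop(b,0) # the new item is not larger than the root, so it is returned directly and nothing is popped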
0
>>> b
[0, 3, 2, 7, 19, 17, 26, 90, 25, 36]
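The fixed-size use case mentioned above typically looks like a "keep the k largest items seen so far" loop, with the conditional replacement guarding against the returned value being larger than the new item. This is a minimal sketch; `top_k` is a hypothetical function name.

```python
import heapq


def top_k(iterable, k):
    """Return the k largest items, keeping at most k items in a min-heap."""
    if k <= 0:
        return []
    heap = []
    for x in iterable:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # Only replace when x beats the current smallest of the kept items.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)


print(top_k([2, 7, 26, 25, 19, 17, 1, 90, 3, 36], 3))  # [90, 36, 26]
```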

The module also offers three general purpose functions based on heaps.

heapq.merge(*iterables, key=None, reverse=False)

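Merge multiple sorted inputs into a single sorted output (for example, merge timestamped entries from multiple log files). Returns an iterator over the sorted values.
Similar to sorted(itertools.chain(*iterables)) but returns an iterable, does not pull the data into memory all at once, and assumes that each of the input streams is already sorted (smallest to largest).
Has two optional arguments which must be specified as keyword arguments.
key specifies a key function of one argument that is used to extract a comparison key from each input element. The default value is None (compare the elements directly).
reverse is a boolean value. If set to True, then the input elements are merged as if each comparison were reversed. To achieve behavior similar to sorted(itertools.chain(*iterables), reverse=True), all iterables must be sorted from largest to smallest.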

>>> heapq.merge(a,b)
<generator object merge at 0x00000204065C9678>
>>> list(heapq.merge(a,b))
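[0, 0, 3, 2, 3, 2, 7, 7, 19, 17, 19, 17, 26, 26, 90, 25, 36, 90, 25, 36]

Note that a and b here are heap-ordered lists rather than sorted lists, so the precondition of merge() is not met and the output above is not sorted.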
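With inputs that really are sorted, merge() produces a sorted result. The sketch below uses small made-up lists and also exercises the key and reverse parameters.

```python
import heapq

a = [1, 3, 5, 7]
b = [0, 2, 4, 8]

ascending = list(heapq.merge(a, b))
# With reverse=True, every input must already be sorted from largest to smallest.
descending = list(heapq.merge(a[::-1], b[::-1], reverse=True))
# key=len merges by string length; each input is sorted by length already.
by_length = list(heapq.merge(['dog', 'horse'], ['cat', 'fish', 'kangaroo'], key=len))

print(ascending)   # [0, 1, 2, 3, 4, 5, 7, 8]
print(descending)  # [8, 7, 5, 4, 3, 2, 1, 0]
print(by_length)   # ['dog', 'cat', 'fish', 'horse', 'kangaroo']
```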

Changed in version 3.5: Added the optional key and reverse parameters.

heapq.nlargest(n, iterable, key=None)

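Return a list with the n largest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key, reverse=True)[:n].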

>>> heapq.nlargest(3, nums)
[90, 36, 26]
>>> nums
[0, 3, 2, 7, 19, 17, 26, 90, 25, 36]

heapq.nsmallest(n, iterable, key=None)

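Return a list with the n smallest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key)[:n].
The latter two functions perform best for smaller values of n. For larger values, it is more efficient to use the sorted() function. Also, when n==1, it is more efficient to use the built-in min() and max() functions. If repeated usage of these functions is required, consider turning the iterable into an actual heap.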

>>> heapq.nsmallest(3, nums)
[0, 2, 3]
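The key parameter is useful when the items are records rather than plain numbers. The following sketch uses made-up (name, score) tuples to show both functions and their sorted() equivalents.

```python
import heapq

# Hypothetical sample data: (name, score) pairs.
scores = [("ann", 82), ("bob", 95), ("cat", 67), ("dan", 90)]


def by_score(record):
    return record[1]


top2 = heapq.nlargest(2, scores, key=by_score)
low2 = heapq.nsmallest(2, scores, key=by_score)

print(top2)  # [('bob', 95), ('dan', 90)]
print(low2)  # [('cat', 67), ('ann', 82)]
```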

The official source code of heapq, heapq.py

"""Heap queue algorithm (a.k.a. priority queue).

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
all k, counting elements from 0.  For the sake of comparison,
non-existing elements are considered to be infinite.  The interesting
property of a heap is that a[0] is always its smallest element.

Usage:

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
heapify(x)           # transforms list into a heap, in-place, in linear time
item = heappushpop(heap, item) # pushes a new item and then returns
                               # the smallest item; the heap size is unchanged
item = heapreplace(heap, item) # pops and returns smallest item, and adds
                               # new item; the heap size is unchanged

Our API differs from textbook heap algorithms as follows:

- We use 0-based indexing.  This makes the relationship between the
  index for a node and the indexes for its children slightly less
  obvious, but is more suitable since Python uses 0-based indexing.

- Our heappop() method returns the smallest item, not the largest.

These two make it possible to view the heap as a regular Python list
without surprises: heap[0] is the smallest item, and heap.sort()
maintains the heap invariant!
"""

# Original code by Kevin O'Connor, augmented by Tim Peters and Raymond Hettinger

__about__ = """Heap queues

[explanation by François Pinard]

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
all k, counting elements from 0.  For the sake of comparison,
non-existing elements are considered to be infinite.  The interesting
property of a heap is that a[0] is always its smallest element.

The strange invariant above is meant to be an efficient memory
representation for a tournament.  The numbers below are `k', not a[k]:

                                   0

                  1                                 2

          3               4                5               6

      7       8       9       10      11      12      13      14

    15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30


In the tree above, each cell `k' is topping `2*k+1' and `2*k+2'.  In
a usual binary tournament we see in sports, each cell is the winner
over the two cells it tops, and we can trace the winner down the tree
to see all opponents s/he had.  However, in many computer applications
of such tournaments, we do not need to trace the history of a winner.
To be more memory efficient, when a winner is promoted, we try to
replace it by something else at a lower level, and the rule becomes
that a cell and the two cells it tops contain three different items,
but the top cell "wins" over the two topped cells.

If this heap invariant is protected at all time, index 0 is clearly
the overall winner.  The simplest algorithmic way to remove it and
find the "next" winner is to move some loser (let's say cell 30 in the
diagram above) into the 0 position, and then percolate this new 0 down
the tree, exchanging values, until the invariant is re-established.
This is clearly logarithmic on the total number of items in the tree.
By iterating over all items, you get an O(n ln n) sort.

A nice feature of this sort is that you can efficiently insert new
items while the sort is going on, provided that the inserted items are
not "better" than the last 0'th element you extracted.  This is
especially useful in simulation contexts, where the tree holds all
incoming events, and the "win" condition means the smallest scheduled
time.  When an event schedule other events for execution, they are
scheduled into the future, so they can easily go into the heap.  So, a
heap is a good structure for implementing schedulers (this is what I
used for my MIDI sequencer :-).

Various structures for implementing schedulers have been extensively
studied, and heaps are good for this, as they are reasonably speedy,
the speed is almost constant, and the worst case is not much different
than the average case.  However, there are other representations which
are more efficient overall, yet the worst cases might be terrible.

Heaps are also very useful in big disk sorts.  You most probably all
know that a big sort implies producing "runs" (which are pre-sorted
sequences, which size is usually related to the amount of CPU memory),
followed by a merging passes for these runs, which merging is often
very cleverly organised[1].  It is very important that the initial
sort produces the longest runs possible.  Tournaments are a good way
to that.  If, using all the memory available to hold a tournament, you
replace and percolate items that happen to fit the current run, you'll
produce runs which are twice the size of the memory for random input,
and much better for input fuzzily ordered.

Moreover, if you output the 0'th item on disk and get an input which
may not fit in the current tournament (because the value "wins" over
the last output value), it cannot fit in the heap, so the size of the
heap decreases.  The freed memory could be cleverly reused immediately
for progressively building a second heap, which grows at exactly the
same rate the first heap is melting.  When the first heap completely
vanishes, you switch heaps and start a new run.  Clever and quite
effective!

In a word, heaps are useful memory structures to know.  I use them in
a few applications, and I think it is good to keep a `heap' module
around. :-)

--------------------
[1] The disk balancing algorithms which are current, nowadays, are
more annoying than clever, and this is a consequence of the seeking
capabilities of the disks.  On devices which cannot seek, like big
tape drives, the story was quite different, and one had to be very
clever to ensure (far in advance) that each tape movement will be the
most effective possible (that is, will best participate at
"progressing" the merge).  Some tapes were even able to read
backwards, and this was also used to avoid the rewinding time.
Believe me, real good tape sorts were quite spectacular to watch!
From all times, sorting has always been a Great Art! :-)
"""

__all__ = ['heappush', 'heappop', 'heapify', 'heapreplace', 'merge',
           'nlargest', 'nsmallest', 'heappushpop']

def heappush(heap, item):
    """Push item onto heap, maintaining the heap invariant."""
    heap.append(item)
    _siftdown(heap, 0, len(heap)-1)

def heappop(heap):
    """Pop the smallest item off the heap, maintaining the heap invariant."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup(heap, 0)
        return returnitem
    return lastelt

def heapreplace(heap, item):
    """Pop and return the current smallest value, and add the new item.

    This is more efficient than heappop() followed by heappush(), and can be
    more appropriate when using a fixed-size heap.  Note that the value
    returned may be larger than item!  That constrains reasonable uses of
    this routine unless written as part of a conditional replacement:

        if item > heap[0]:
            item = heapreplace(heap, item)
    """
    returnitem = heap[0]    # raises appropriate IndexError if heap is empty
    heap[0] = item
    _siftup(heap, 0)
    return returnitem

def heappushpop(heap, item):
    """Fast version of a heappush followed by a heappop."""
    if heap and heap[0] < item:
        item, heap[0] = heap[0], item
        _siftup(heap, 0)
    return item

def heapify(x):
    """Transform list into a heap, in-place, in O(len(x)) time."""
    n = len(x)
    # Transform bottom-up.  The largest index there's any point to looking at
    # is the largest with a child index in-range, so must have 2*i + 1 < n,
    # or i < (n-1)/2.  If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so
    # j-1 is the largest, which is n//2 - 1.  If n is odd = 2*j+1, this is
    # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
    for i in reversed(range(n//2)):
        _siftup(x, i)

def _heappop_max(heap):
    """Maxheap version of a heappop."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup_max(heap, 0)
        return returnitem
    return lastelt

def _heapreplace_max(heap, item):
    """Maxheap version of a heappop followed by a heappush."""
    returnitem = heap[0]    # raises appropriate IndexError if heap is empty
    heap[0] = item
    _siftup_max(heap, 0)
    return returnitem

def _heapify_max(x):
    """Transform list into a maxheap, in-place, in O(len(x)) time."""
    n = len(x)
    for i in reversed(range(n//2)):
        _siftup_max(x, i)

# 'heap' is a heap at all indices >= startpos, except possibly for pos.  pos
# is the index of a leaf with a possibly out-of-order value.  Restore the
# heap invariant.
def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem

# The child indices of heap index pos are already heaps, and we want to make
# a heap at index pos too.  We do this by bubbling the smaller child of
# pos up (and so on with that child's children, etc) until hitting a leaf,
# then using _siftdown to move the oddball originally at index pos into place.
#
# We *could* break out of the loop as soon as we find a pos where newitem <=
# both its children, but turns out that's not a good idea, and despite that
# many books write the algorithm that way.  During a heap pop, the last array
# element is sifted in, and that tends to be large, so that comparing it
# against values starting from the root usually doesn't pay (= usually doesn't
# get us out of the loop early).  See Knuth, Volume 3, where this is
# explained and quantified in an exercise.
#
# Cutting the # of comparisons is important, since these routines have no
# way to extract "the priority" from an array element, so that intelligence
# is likely to be hiding in custom comparison methods, or in array elements
# storing (priority, record) tuples.  Comparisons are thus potentially
# expensive.
#
# On random arrays of length 1000, making this change cut the number of
# comparisons made by heapify() a little, and those made by exhaustive
# heappop() a lot, in accord with theory.  Here are typical results from 3
# runs (3 just to demonstrate how small the variance is):
#
# Compares needed by heapify     Compares needed by 1000 heappops
# --------------------------     --------------------------------
# 1837 cut to 1663               14996 cut to 8680
# 1855 cut to 1659               14966 cut to 8678
# 1847 cut to 1660               15024 cut to 8703
#
# Building the heap by using heappush() 1000 times instead required
# 2198, 2148, and 2219 compares:  heapify() is more efficient, when
# you can use it.
#
# The total compares needed by list.sort() on the same lists were 8627,
# 8627, and 8632 (this should be compared to the sum of heapify() and
# heappop() compares):  list.sort() is (unsurprisingly!) more efficient
# for sorting.

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)

def _siftdown_max(heap, startpos, pos):
    'Maxheap variant of _siftdown'
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if parent < newitem:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem

def _siftup_max(heap, pos):
    'Maxheap variant of _siftup'
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the larger child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of larger child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[rightpos] < heap[childpos]:
            childpos = rightpos
        # Move the larger child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown_max(heap, startpos, pos)

def merge(*iterables, key=None, reverse=False):
    '''Merge multiple sorted inputs into a single sorted output.

    Similar to sorted(itertools.chain(*iterables)) but returns a generator,
    does not pull the data into memory all at once, and assumes that each of
    the input streams is already sorted (smallest to largest).

    >>> list(merge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
    [0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]

    If *key* is not None, applies a key function to each element to determine
    its sort order.

    >>> list(merge(['dog', 'horse'], ['cat', 'fish', 'kangaroo'], key=len))
    ['dog', 'cat', 'fish', 'horse', 'kangaroo']

    '''

    h = []
    h_append = h.append

    if reverse:
        _heapify = _heapify_max
        _heappop = _heappop_max
        _heapreplace = _heapreplace_max
        direction = -1
    else:
        _heapify = heapify
        _heappop = heappop
        _heapreplace = heapreplace
        direction = 1

    if key is None:
        for order, it in enumerate(map(iter, iterables)):
            try:
                next = it.__next__
                h_append([next(), order * direction, next])
            except StopIteration:
                pass
        _heapify(h)
        while len(h) > 1:
            try:
                while True:
                    value, order, next = s = h[0]
                    yield value
                    s[0] = next()           # raises StopIteration when exhausted
                    _heapreplace(h, s)      # restore heap condition
            except StopIteration:
                _heappop(h)                 # remove empty iterator
        if h:
            # fast case when only a single iterator remains
            value, order, next = h[0]
            yield value
            yield from next.__self__
        return

    for order, it in enumerate(map(iter, iterables)):
        try:
            next = it.__next__
            value = next()
            h_append([key(value), order * direction, value, next])
        except StopIteration:
            pass
    _heapify(h)
    while len(h) > 1:
        try:
            while True:
                key_value, order, value, next = s = h[0]
                yield value
                value = next()
                s[0] = key(value)
                s[2] = value
                _heapreplace(h, s)
        except StopIteration:
            _heappop(h)
    if h:
        key_value, order, value, next = h[0]
        yield value
        yield from next.__self__


# Algorithm notes for nlargest() and nsmallest()
# ==============================================
#
# Make a single pass over the data while keeping the k most extreme values
# in a heap.  Memory consumption is limited to keeping k values in a list.
#
# Measured performance for random inputs:
#
#                                   number of comparisons
#    n inputs     k-extreme values  (average of 5 trials)   % more than min()
# -------------   ----------------  ---------------------   -----------------
#      1,000           100                  3,317               231.7%
#     10,000           100                 14,046                40.5%
#    100,000           100                105,749                 5.7%
#  1,000,000           100              1,007,751                 0.8%
# 10,000,000           100             10,009,401                 0.1%
#
# Theoretical number of comparisons for k smallest of n random inputs:
#
# Step   Comparisons                  Action
# ----   --------------------------   ---------------------------
#  1     1.66 * k                     heapify the first k-inputs
#  2     n - k                        compare remaining elements to top of heap
#  3     k * (1 + lg2(k)) * ln(n/k)   replace the topmost value on the heap
#  4     k * lg2(k) - (k/2)           final sort of the k most extreme values
#
# Combining and simplifying for a rough estimate gives:
#
#        comparisons = n + k * (log(k, 2) * log(n/k) + log(k, 2) + log(n/k))
#
# Computing the number of comparisons for step 3:
# -----------------------------------------------
# * For the i-th new value from the iterable, the probability of being in the
#   k most extreme values is k/i.  For example, the probability of the 101st
#   value seen being in the 100 most extreme values is 100/101.
# * If the value is a new extreme value, the cost of inserting it into the
#   heap is 1 + log(k, 2).
# * The probability times the cost gives:
#            (k/i) * (1 + log(k, 2))
# * Summing across the remaining n-k elements gives:
#            sum((k/i) * (1 + log(k, 2)) for i in range(k+1, n+1))
# * This reduces to:
#            (H(n) - H(k)) * k * (1 + log(k, 2))
# * Where H(n) is the n-th harmonic number estimated by:
#            gamma = 0.5772156649
#            H(n) = log(n, e) + gamma + 1 / (2 * n)
#   http://en.wikipedia.org/wiki/Harmonic_series_(mathematics)#Rate_of_divergence
# * Substituting the H(n) formula:
#            comparisons = k * (1 + log(k, 2)) * (log(n/k, e) + (1/n - 1/k) / 2)
#
# Worst-case for step 3:
# ----------------------
# In the worst case, the input data is reversed sorted so that every new element
# must be inserted in the heap:
#
#             comparisons = 1.66 * k + log(k, 2) * (n - k)
#
# Alternative Algorithms
# ----------------------
# Other algorithms were not used because they:
# 1) Took much more auxiliary memory,
# 2) Made multiple passes over the data.
# 3) Made more comparisons in common cases (small k, large n, semi-random input).
# See the more detailed comparison of approach at:
# http://code.activestate.com/recipes/577573-compare-algorithms-for-heapqsmallest

def nsmallest(n, iterable, key=None):
    """Find the n smallest elements in a dataset.

    Equivalent to:  sorted(iterable, key=key)[:n]
    """

    # Short-cut for n==1 is to use min()
    if n == 1:
        it = iter(iterable)
        sentinel = object()
        result = min(it, default=sentinel, key=key)
        return [] if result is sentinel else [result]

    # When n>=size, it's faster to use sorted()
    try:
        size = len(iterable)
    except (TypeError, AttributeError):
        pass
    else:
        if n >= size:
            return sorted(iterable, key=key)[:n]

    # When key is none, use simpler decoration
    if key is None:
        it = iter(iterable)
        # put the range(n) first so that zip() doesn't
        # consume one too many elements from the iterator
        result = [(elem, i) for i, elem in zip(range(n), it)]
        if not result:
            return result
        _heapify_max(result)
        top = result[0][0]
        order = n
        _heapreplace = _heapreplace_max
        for elem in it:
            if elem < top:
                _heapreplace(result, (elem, order))
                top, _order = result[0]
                order += 1
        result.sort()
        return [elem for (elem, order) in result]

    # General case, slowest method
    it = iter(iterable)
    result = [(key(elem), i, elem) for i, elem in zip(range(n), it)]
    if not result:
        return result
    _heapify_max(result)
    top = result[0][0]
    order = n
    _heapreplace = _heapreplace_max
    for elem in it:
        k = key(elem)
        if k < top:
            _heapreplace(result, (k, order, elem))
            top, _order, _elem = result[0]
            order += 1
    result.sort()
    return [elem for (k, order, elem) in result]

def nlargest(n, iterable, key=None):
    """Find the n largest elements in a dataset.

    Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
    """

    # Short-cut for n==1 is to use max()
    if n == 1:
        it = iter(iterable)
        sentinel = object()
        result = max(it, default=sentinel, key=key)
        return [] if result is sentinel else [result]

    # When n>=size, it's faster to use sorted()
    try:
        size = len(iterable)
    except (TypeError, AttributeError):
        pass
    else:
        if n >= size:
            return sorted(iterable, key=key, reverse=True)[:n]

    # When key is none, use simpler decoration
    if key is None:
        it = iter(iterable)
        result = [(elem, i) for i, elem in zip(range(0, -n, -1), it)]
        if not result:
            return result
        heapify(result)
        top = result[0][0]
        order = -n
        _heapreplace = heapreplace
        for elem in it:
            if top < elem:
                _heapreplace(result, (elem, order))
                top, _order = result[0]
                order -= 1
        result.sort(reverse=True)
        return [elem for (elem, order) in result]

    # General case, slowest method
    it = iter(iterable)
    result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
    if not result:
        return result
    heapify(result)
    top = result[0][0]
    order = -n
    _heapreplace = heapreplace
    for elem in it:
        k = key(elem)
        if top < k:
            _heapreplace(result, (k, order, elem))
            top, _order, _elem = result[0]
            order -= 1
    result.sort(reverse=True)
    return [elem for (k, order, elem) in result]

# If available, use C implementation
try:
    from _heapq import *
except ImportError:
    pass
try:
    from _heapq import _heapreplace_max
except ImportError:
    pass
try:
    from _heapq import _heapify_max
except ImportError:
    pass
try:
    from _heapq import _heappop_max
except ImportError:
    pass


if __name__ == "__main__":

    import doctest # pragma: no cover
    print(doctest.testmod()) # pragma: no cover

References

heapq visualization: https://visualgo.net/zh/heap
