最长的递增子序列

一尘不染

最长的递增子序列

algorithm

给定一个输入序列，找到最长（不一定是连续的）非递减子序列的最佳方法是什么。

0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15 # sequence

1, 9, 13, 15 # non-decreasing subsequence

0, 2, 6, 9, 13, 15 # longest non-deceasing subsequence (not unique)

我正在寻找最佳算法。如果有代码，Python会很好，但是一切都很好。

阅读 199

2020-07-28

共1个答案

一尘不染

我只是偶然发现了这个问题，并提出了以下Python 3实现：

def subsequence(seq):
    if not seq:
        return seq

    M = [None] * len(seq)    # offset by 1 (j -> j-1)
    P = [None] * len(seq)

    # Since we have at least one element in our list, we can start by 
    # knowing that the there's at least an increasing subsequence of length one:
    # the first element.
    L = 1
    M[0] = 0

    # Looping over the sequence starting from the second element
    for i in range(1, len(seq)):
        # Binary search: we want the largest j <= L
        #  such that seq[M[j]] < seq[i] (default j = 0),
        #  hence we want the lower bound at the end of the search process.
        lower = 0
        upper = L

        # Since the binary search will not look at the upper bound value,
        # we'll have to check that manually
        if seq[M[upper-1]] < seq[i]:
            j = upper

        else:
            # actual binary search loop
            while upper - lower > 1:
                mid = (upper + lower) // 2
                if seq[M[mid-1]] < seq[i]:
                    lower = mid
                else:
                    upper = mid

            j = lower    # this will also set the default value to 0

        P[i] = M[j-1]

        if j == L or seq[i] < seq[M[j]]:
            M[j] = i
            L = max(L, j+1)

    # Building the result: [seq[M[L-1]], seq[P[M[L-1]]], seq[P[P[M[L-1]]]], ...]
    result = []
    pos = M[L-1]
    for _ in range(L):
        result.append(seq[pos])
        pos = P[pos]

    return result[::-1]    # reversing

由于花了一些时间来了解算法的工作原理，所以我在评论时有点冗长，并且我还将添加一个简短的解释：

seq 是输入序列。
L 是一个数字：它会在循环序列时进行更新，并标记到该时刻为止发现的最长递增子序列的长度。
M是一个列表。M[j-1]将指向的索引，seq该索引拥有可用于（最后）构建增加的length子序列的最小值j。
P是一个列表。P[i]将指向的索引M[j]在哪里。简而言之，它告诉您哪个是子序列的前一个元素。用于最后生成结果。i``seq``P

该算法如何工作：

处理空序列的特殊情况。
从1个元素的子序列开始。
用index循环输入序列i。
用二进制搜索发现j，让seq[M[j]有<比seq[i]。
更新P，M和L。
追溯结果并将其反向返回。

注意：
与Wikipedia算法的唯一区别是M列表中的偏移量1，在X这里称为seq。我还对Eric
Gustavson答案中显示的单元测试版本进行了稍微改进的单元测试版本，并通过了所有测试。

例：

seq = [30, 10, 20, 50, 40, 80, 60]

       0    1   2   3   4   5   6   <-- indexes

最后，我们将有：

M = [1, 2, 4, 6, None, None, None]
P = [None, None, 1, 2, 2, 4, 4]
result = [10, 20, 40, 60]

如您所见，P这非常简单。我们来看看它到底，所以它告诉之前60还有的40,前80有40，以前40有20，之前50有20和以前20有10，停止。

复杂的部分在继续M。在开始的时候M是[0, None, None, ...]因为长度为1的亚序列的最后一个元素（因此位置0
M）是索引0处：30。

此时，我们将开始循环seq查看10，因为10is <比30，M将被更新：

if j == L or seq[i] < seq[M[j]]:
    M[j] = i

所以，现在M的样子：[1, None, None, ...]。这是一件好事，因为10有更多的机会创造更长的增加的子序列。（新的1是10的索引）

现在轮到了20。通过10和20我们拥有长度为2的子序列（中的索引1 M），因此M将是：[1, 2, None, ...]。（新的2是20的索引）

现在轮到了50。50将不会是任何子序列的一部分，因此不会发生任何变化。

现在轮到了40。用10，20和40我们有长度为3（索引2的在子M，所以M将是：[1, 2, 4, None, ...]（新图4是40指数）

等等…

有关代码的完整介绍，您可以在此处复制并粘贴它：)

2020-07-28