P110: Run-length encoding of list

游程编码。游程编码是一种数据压缩方法，其将连续重复的元素编码为[N,E]项，E是元素值，N是元素E连续重复出现的次数。比如，[1, 1, 2, 3, 3, 4]按游程编码则为[[2, 1], [1, 2], [2, 3], [1, 4]]。首先，还是先写测试用例：

from python99.lists.p110 import encode def test_encode(): assert encode([]) == [] assert encode([1]) == [[1,1]] assert encode([1,1,2,3,3,4,4,4,4,5,6,6,6]) == [[2,1],[1,2],[2,3],[4,4],[1,5],[3,6]]

这个问题可以分两步解决，先将列表中的连续重复元素分组，再将分组逐个转换为[N, E]项。「将列表中的连续重复元素分组」已在P109中实现，可以直接重用。剩下的，就是把一组组「连续重复元素」转换为[N, E]项了。

Python提供了称为List Comprehension的方法，使用List comprehension可以方便地将列表中的元素逐个转换为另一种形式。List Comprehension的本意是提供一种简练的方式创建列表。常见的用法是将一些操作作用于列表中的每一个元素，然后把这些结果拼接成新的列表。List Comprehension的一般行式为：

[ f(x) for x in l]

f是某个函数，接受输入x，输出任意类型。函数f可以为空，也可以直接是表达式
x局部变量，用于引用l中的每一个元素
l列表

在本例中，x是包含一至多个相同元素的列表，而f则要返回一个包含两个元素的列表，第一个元素是列表x的长度，第二个元素是x中的任意一个元素「这里简单点就选头元素」。f可以直接用表逹式实现为：

[len(x), x[0]]

完整的代码实现：

# Run-length encoding of a list. # Use the result of problem 1.09 to implement the so-called run-length encoding data compression method. # Consecutive duplicates of elements are encoded as terms [N,E] # where N is the number of duplictes of the element E. from python99.lists.p109 import pack def encode(l): packed_list = pack(l) return [[len(group), group[0]] for group in packed_list]

P110

P110: Run-length encoding of list

results matching ""

No results matching ""