Making a flat list out of a list of lists in Python


Answers

You can use itertools.chain():

>>> import itertools
>>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain(*list2d))

Or, in Python >= 2.6, use itertools.chain.from_iterable(), which does not require unpacking the list:

>>> import itertools
>>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
>>> merged = list(itertools.chain.from_iterable(list2d))

This approach is arguably more readable than [item for sublist in l for item in sublist], and it appears to be faster too:

[me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
10000 loops, best of 3: 24.2 usec per loop
[me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 45.2 usec per loop
[me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 488 usec per loop
[me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 522 usec per loop
[me@home]$ python --version
Python 2.7.3
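
A likely reason for the gap in these timings: sum(l, []) and reduce with + build a brand-new list at every step, so earlier items get copied again and again, while chain.from_iterable and the list comprehension visit each item once. The sketch below (Python 3; the helper name count_copies_with_plus is made up for illustration) counts those copies instead of timing anything.

```python
from functools import reduce
from itertools import chain

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

def count_copies_with_plus(lists):
    """Count element copies made by repeated `acc = acc + sub`, starting from [].

    Illustrative helper only; it models the work, it does not measure time.
    """
    copies = 0
    acc_len = 0
    for sub in lists:
        # acc + sub allocates a new list holding every element of both operands
        acc_len += len(sub)
        copies += acc_len
    return copies

flat = list(chain.from_iterable(l))

# All four approaches from the timings produce the same list
assert flat == [item for sublist in l for item in sublist]
assert flat == sum(l, [])
assert flat == reduce(lambda x, y: x + y, l)

print(len(flat))                  # items visited once by chain.from_iterable
print(count_copies_with_plus(l))  # element copies made by the repeated + approach
```

The second number grows roughly with the square of the number of sublists, which matches the much slower sum and reduce timings.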
Question

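I wonder whether there is a shortcut to make a simple list out of a list of lists in Python.

I can do that in a for loop, but maybe there is some cool "one-liner"? I tried it with reduce, but I get an error.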

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
reduce(lambda x, y: x.extend(y), l)

Error message

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
AttributeError: 'NoneType' object has no attribute 'extend'



The following seems simplest to me:

>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print (np.concatenate(l))
[1 2 3 4 5 6 7 8 9]



Why do you use extend?

reduce(lambda x, y: x+y, l)

This should work fine.




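Note: the following applies to Python 3.3+, because it uses yield from.

For obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehensions and itertools.chain.from_iterable.

However, consider this slightly more complex case: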

>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]

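There are several problems here:

  • One element, 6, is just a scalar; it is not iterable, so the approaches above will fail here.
  • One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, we don't want to treat it as such; we want to treat it as a single element.
  • The last element, [8, [9, 10]], is itself a nested iterable. A basic list comprehension and chain.from_iterable only extract "1 level down."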

We can remedy this as follows:

>>> import sys
>>> from collections.abc import Iterable  # flatten() below tests against Iterable, not Iterator

>>> py2 = sys.version_info[0] == 2

>>> if py2:
...     str_types = basestring, unicode, str
... else:
...     str_types = str, bytes


>>> def flatten(obj):
...     for i in obj:
...         if isinstance(i, Iterable) and not isinstance(i, str_types):
...             yield from flatten(i)
...         else:
...             yield i


>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]

Here, we check that the sub-element (1) is iterable, using Iterable, an ABC from collections.abc, and also (2) that it is not "string-like." The str_types lines cover the string types of both Python 2 and 3, although yield from itself requires Python 3.3+.

This competes with or is a hair faster than matplotlib.cbook.flatten. Note that both calls below only create a generator; wrap them in list() to time the actual flattening:

>>> from matplotlib.cbook import flatten as m_flatten

%timeit flatten(obj*100)
1.39 µs ± 21.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit m_flatten(obj*100)
1.43 µs ± 17.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)



Consider installing the more_itertools package.

> pip install more_itertools

It ships with an implementation of flatten (source, from the itertools recipes):

import more_itertools

# Using flatten()
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).

# Using collapse()
lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]       # given example 
list(more_itertools.collapse(lst)) 
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9]   # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
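
If you would rather not add a dependency, here is a minimal standard-library sketch of the same idea. The name collapse_sketch and its levels parameter are made up for illustration, and this is not the library's implementation; more_itertools.collapse offers comparable options, so check its documentation for the exact behaviour.

```python
def collapse_sketch(iterable, levels=None):
    """Flatten nested iterables, optionally stopping after `levels` levels.

    Hypothetical helper for illustration; strings and bytes are kept whole.
    """
    def walk(node, depth):
        too_deep = levels is not None and depth > levels
        if too_deep or isinstance(node, (str, bytes)):
            yield node
            return
        try:
            children = iter(node)
        except TypeError:
            # Not iterable (e.g. an int): emit it as a leaf
            yield node
            return
        for child in children:
            yield from walk(child, depth + 1)

    yield from walk(iterable, 0)

lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9]   # complex nesting
print(list(collapse_sketch(lst)))            # fully flattened
print(list(collapse_sketch(lst, levels=1)))  # only one level of nesting removed
```

Limiting the depth is handy when the innermost lists are meaningful records that should stay intact.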



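If you are willing to give up some speed for a cleaner look (roughly 10x slower than the list comprehension in the timings below), then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():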

import numpy

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop

%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop

%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop

You can find out more in the docs for numpy.concatenate and numpy.ravel.




Recently I came across a situation where I had a mix of strings and numeric data in sublists, such as

test = ['591212948',
['special', 'assoc', 'of', 'Chicago', 'Jon', 'Doe'],
['Jon'],
['Doe'],
['fl'],
92001,
555555555,
'hello',
['hello2', 'a'],
'b',
['hello33', ['z', 'w'], 'b']]

flat_list = [item for sublist in test for item in sublist] did not work. So, I came up with the following solution for 1+ levels of sublists:

def concatList(data):
    results = []
    for rec in data:
        if type(rec) == list:
            results += rec
            results = concatList(results)
        else:
            results.append(rec)
    return results

Result

In [38]: concatList(test)
Out[38]:
['591212948',
'special',
'assoc',
'of',
'Chicago',
'Jon',
'Doe',
'Jon',
'Doe',
'fl',
92001,
555555555,
'hello',
'hello2',
'a',
'b',
'hello33',
'z',
'w',
'b']
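
concatList re-flattens the accumulated results every time it meets a sublist. A possibly simpler variant, assuming only list objects should be unpacked, recurses into each sublist once; the name flatten_lists and the shortened test data are just for illustration.

```python
def flatten_lists(data):
    """Recursively flatten nested lists; non-list items (str, int, ...) are kept as-is."""
    results = []
    for rec in data:
        if isinstance(rec, list):
            # Flatten the sublist once and append its items
            results.extend(flatten_lists(rec))
        else:
            results.append(rec)
    return results

test = ['591212948', ['special', 'assoc'], ['Jon'], 92001, 'hello',
        ['hello33', ['z', 'w'], 'b']]
print(flatten_lists(test))
```

Because strings are not lists, they are never split into characters here.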



from functools import reduce #python 3

>>> l = [[1,2,3],[4,5,6], [7], [8,9]]
>>> reduce(lambda x,y: x+y,l)
[1, 2, 3, 4, 5, 6, 7, 8, 9]

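The extend() method in your example modifies x in place instead of returning a useful value (which reduce() expects).

A faster way to write the reduce version would be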

>>> import operator
>>> l = [[1,2,3],[4,5,6], [7], [8,9]]
>>> reduce(operator.concat, l)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
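
To see why the extend() attempt from the question fails, and one way to keep extend() with reduce(), here is a small sketch. The helper name extend_and_return is made up; the empty list passed as the initial value is there so that the first sublist of l is not mutated.

```python
from functools import reduce

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# list.extend mutates the list in place and returns None,
# so the lambda in the question hands None to the next reduce step.
print([1].extend([2]))  # None

def extend_and_return(acc, sub):
    """Extend the accumulator in place, then return it so reduce can carry on."""
    acc.extend(sub)
    return acc

flat = reduce(extend_and_return, l, [])
print(flat)
```

Since the accumulator is extended in place, this avoids building a new list at every step the way x + y does.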



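You can try this:

weird_list=[[1, 2, 3], [4, 5, 6], [7], [8, 9]]
# Walks str(weird_list) character by character, so it only works for single-digit, non-negative integers
nice_list = [int(e) for e in str(weird_list) if e not in "[] ,"]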



You can avoid recursive calls onto the call stack quite simply by using an actual stack data structure.

alist = [1,[1,2],[1,2,[4,5,6],3, "33"]]
newlist = []

while len(alist) > 0 :
  templist = alist.pop()
  if type(templist) == type(list()) :
    while len(templist) > 0 :
      temp = templist.pop()
      if type(temp) == type(list()) :
        for x in temp :
          templist.append(x)
      else :
        newlist.append(temp)
  else :
    newlist.append(templist)
print(list(reversed(newlist)))
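
The loop above empties alist as it goes. If the input should stay untouched, one alternative sketch keeps a stack of iterators instead; the function name flatten_iterative is made up, and only list objects are treated as nested, as in the code above.

```python
def flatten_iterative(nested):
    """Yield the leaves of arbitrarily nested lists without recursion or mutating the input."""
    stack = [iter(nested)]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            # Current list is exhausted; go back to its parent
            stack.pop()
            continue
        if isinstance(item, list):
            # Descend into the sublist by pushing its iterator
            stack.append(iter(item))
        else:
            yield item

alist = [1, [1, 2], [1, 2, [4, 5, 6], 3, "33"]]
print(list(flatten_iterative(alist)))
print(alist)  # still intact
```

Since nothing recurses, nesting deeper than Python's recursion limit is not a problem for this version.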



def flatten(l, a):
    for i in l:
        if isinstance(i, list):
            flatten(i, a)
        else:
            a.append(i)
    return a

print(flatten([[[1, [1,1, [3, [4,5,]]]], 2, 3], [4, 5],6], []))

# [1, 1, 1, 3, 4, 5, 2, 3, 4, 5, 6]



Another unusual approach that works for hetero- and homogeneous lists of integers:

def unusual_flatten(some_list: list) -> list:
    cleaned_list = str(some_list).replace("[", "").replace("]", "").split(",")
    return [int(item) for item in cleaned_list]

Applying it to the example list...

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9], 10]

unusual_flatten(l)

Result:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
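
One caveat with the string trick: an empty sublist leaves an empty piece after the split, and int('') raises ValueError. A hedged variant that skips such pieces could look like this; the name flatten_ints_via_str is made up, and it still only suits lists that contain nothing but integers.

```python
def flatten_ints_via_str(some_list):
    """Flatten nested lists of ints by parsing str(some_list) (illustrative variant)."""
    text = str(some_list).replace("[", "").replace("]", "")
    # Skip blank pieces, which appear when a sublist is empty
    return [int(piece) for piece in text.split(",") if piece.strip()]

print(flatten_ints_via_str([[1, 2, 3], [], [10, -20], 30]))
```

Multi-digit and negative numbers survive because the text is split on commas rather than read character by character.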




If you want to flatten a data structure where you don't know how deeply it is nested, you could use iteration_utilities.deepflatten 1

>>> from iteration_utilities import deepflatten

>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

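It's a generator, so you need to cast the result to a list or iterate over it explicitly.

To flatten only one level, and if each of the items is itself iterable, you can also use iteration_utilities.flatten, which itself is just a thin wrapper around itertools.chain.from_iterable: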

>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

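Just to add some timings (based on Nico Schlömer's answer, which didn't include the function presented in this answer):

It's a log-log plot to accommodate the huge range of values spanned. For qualitative reasoning: lower is better.

The results show that if the iterable contains only a few inner iterables, then sum will be fastest. However, for long iterables only itertools.chain.from_iterable, iteration_utilities.deepflatten, or the nested comprehension have reasonable performance, with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).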

from itertools import chain
from functools import reduce
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10
import operator
from iteration_utilities import deepflatten

def nested_list_comprehension(lsts):
    return [item for sublist in lsts for item in sublist]

def itertools_chain_from_iterable(lsts):
    return list(chain.from_iterable(lsts))

def pythons_sum(lsts):
    return sum(lsts, [])

def reduce_add(lsts):
    return reduce(lambda x, y: x + y, lsts)

def pylangs_flatten(lsts):
    return list(flatten(lsts))

def flatten(items):
    """Yield items from any nested iterable; see REF."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x

def reduce_concat(lsts):
    return reduce(operator.concat, lsts)

def iteration_utilities_deepflatten(lsts):
    return list(deepflatten(lsts, depth=1))


from simple_benchmark import benchmark

b = benchmark(
    [nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
     pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
    arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
    argument_name='number of inner lists'
)

b.plot()

1 Disclaimer: I'm the author of that library.
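
If simple_benchmark is not available, a rough comparison can be sketched with the standard library's timeit alone. This is only an approximation of the benchmark above: it covers four of the functions, the sizes and repeat counts are arbitrary choices, and the absolute numbers depend on the machine.

```python
import operator
import timeit
from functools import reduce
from itertools import chain

def nested_list_comprehension(lsts):
    return [item for sublist in lsts for item in sublist]

def itertools_chain_from_iterable(lsts):
    return list(chain.from_iterable(lsts))

def pythons_sum(lsts):
    return sum(lsts, [])

def reduce_concat(lsts):
    return reduce(operator.concat, lsts)

candidates = [nested_list_comprehension, itertools_chain_from_iterable,
              pythons_sum, reduce_concat]

results = {}
for size in (2, 64, 1024):
    lsts = [[0] * 5] * size  # same input shape as in the benchmark above
    for func in candidates:
        # Best of 3 repeats of 20 calls each
        best = min(timeit.repeat(lambda: func(lsts), number=20, repeat=3))
        results[(func.__name__, size)] = best
        print(f"{func.__name__:32s} n={size:5d} {best / 20 * 1e6:10.2f} µs per call")
```

Printing a table instead of plotting keeps the sketch dependency-free; the trend across sizes is what matters, not the individual values.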




The fastest solution I have found (for large lists anyway):

import numpy as np
#turn list into an array and flatten()
np.array(l).flatten()

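Done! You can of course turn the result back into a list by calling list() on it, e.g. list(np.array(l).flatten()). Note that this only flattens the data when all sublists have the same length, so that NumPy can build a regular 2-D array.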




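Here is a general approach that applies to nested lists of lists, numbers, strings, and other mixed container types.

from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10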


def flatten(items):
    """Yield items from any nested iterable; see REF."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x

list(flatten(l))                               # list of lists
#[1, 2, 3, 4, 5, 6, 7, 8, 9]

items = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"]  # numbers & mixed containers
list(flatten(items))
#[1, 2, 3, 4, 5, 6, 7, 8, '9']

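This solution employs Python 3's yield from keyword, which extracts items from sub-generators. Note that this solution originally did not apply to strings. UPDATE: strings are now supported.

REF: solution modified from Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.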