python - Add spaces to a string based on a list





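I have a string s and a list arr. The string has the same total number of characters as the elements of arr combined. I need to split s into a list whose items have the same lengths as the corresponding elements of arr.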

s = 'Pythonisanprogramminglanguage'

arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']
expected = ['Python', 'is', 'an', 'programming', 'language']

One way to do it:

s = 'Pythonisanprogramminglanguage'

arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

expected = []
i = 0
for word in arr:
    expected.append(s[i:i+len(word)])
    i += len(word)

print(expected)
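
The loop above relies on the lengths in arr adding up to len(s). As a minimal sketch, the same loop can be wrapped in a function that makes that assumption explicit. The name split_by_lengths and the choice to raise ValueError on a mismatch are illustrative, not part of the original answer:

```python
def split_by_lengths(s, arr):
    """Split s into chunks whose lengths match the items of arr."""
    total = sum(len(word) for word in arr)
    if total != len(s):
        # Assumption: a length mismatch is treated as an error.
        raise ValueError(f"lengths sum to {total}, but s has {len(s)} characters")
    result = []
    i = 0
    for word in arr:
        result.append(s[i:i + len(word)])  # take the next len(word) characters
        i += len(word)
    return result


s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']
print(split_by_lengths(s, arr))  # -> ['Python', 'is', 'an', 'programming', 'language']
```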

You can collect slices off the front of s (note that s is shortened on each iteration):

output = []

for word in arr:
    i = len(word)
    chunk, s = s[:i], s[i:]
    output.append(chunk)

print(output)  # -> ['Python', 'is', 'an', 'programming', 'language']
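
If s needs to stay intact afterwards, one possible variation is to pull characters from a single shared iterator instead of rebinding s. This is a sketch using itertools.islice, which is a different technique from the slicing shown above; the name chars is illustrative:

```python
from itertools import islice

s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

# One shared iterator: each islice call continues where the previous one stopped.
chars = iter(s)
output = [''.join(islice(chars, len(word))) for word in arr]

print(output)  # -> ['Python', 'is', 'an', 'programming', 'language']
print(s)       # -> Pythonisanprogramminglanguage (unchanged)
```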

With a simple for loop, it can be done as follows:

s = 'Pythonisanprogramminglanguage'

arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

start_index = 0
expected = list()
for a in arr:
    expected.append(s[start_index:start_index+len(a)])
    start_index += len(a)

print(expected)

Create a simple loop and use the word lengths as indices:

s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

ctr = 0
words = []
for x in arr:
    words.append(s[ctr:ctr + len(x)])
    ctr += len(x)

print(words)

# ['Python', 'is', 'an', 'programming', 'language']

Going forward, another way is to use an assignment expression (new in Python 3.8):

s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

i = 0
expected = [s[i:(i := i+len(word))] for word in arr]
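
The comprehension works because the slice start is evaluated before the := expression rebinds i, so each slice begins where the previous one ended. A small sketch that makes this visible by also returning the final value of i (requires Python 3.8+; the function name split_walrus is illustrative):

```python
def split_walrus(s, arr):
    i = 0
    # s[i : (i := i + len(word))]: the start uses the old i,
    # then := advances i to the end of the current chunk.
    chunks = [s[i:(i := i + len(word))] for word in arr]
    return chunks, i  # i now holds the total number of characters consumed


s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

chunks, end = split_walrus(s, arr)
print(chunks)  # -> ['Python', 'is', 'an', 'programming', 'language']
print(end)     # -> 29
```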

You can use itertools.accumulate to get the positions at which to split the string:

>>> s = 'Pythonisanprogramminglanguage'
>>> arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']
>>> import itertools
>>> L = list(itertools.accumulate(map(len, arr)))
>>> L
[6, 8, 10, 21, 29]

Now if you zip the list with itself, offset by a leading 0, you get the intervals:

>>> list(zip([0]+L, L))
[(0, 6), (6, 8), (8, 10), (10, 21), (21, 29)]

Then you only need to use those intervals to slice the string:

>>> [s[i:j] for i,j in zip([0]+L, L)]
['Python', 'is', 'an', 'programming', 'language']

Here is another approach, using NumPy:

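import numpy as np

s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

# Cumulative offsets, with a leading 0 for the start of the first chunk
ar = np.cumsum([0] + [len(w) for w in arr]).tolist()
# Pair each offset with the next one (avoids ar.index, which breaks on repeated offsets)
output_ = [s[i:j] for i, j in zip(ar, ar[1:])]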

Output:

['Python', 'is', 'an', 'programming', 'language']

The itertools module has a function called accumulate() (added in Python 3.2) that makes this relatively easy:

from itertools import accumulate  # added in Py 3.2


s = 'Pythonisanprogramminglanguage'
arr = ['lkjhgf', 'zx', 'qw', 'ertyuiopakk', 'foacdhlc']

cuts = tuple(accumulate(len(item) for item in arr))
words = [s[i:j] for i, j in zip((0,)+cuts, cuts)]
print(words)  # -> ['Python', 'is', 'an', 'programming', 'language']



