For approaches to retrieving partial matches in a numeric list, go to:
But if you're looking for how to retrieve partial matches for a list of strings, you'll find the best approaches concisely explained in the answer below.
SO: Python list lookup with partial match shows how to return a bool if a list contains an element that partially matches (e.g. begins with, ends with, or contains) a certain string. But how can you return the element itself, instead of True or False?
Example:
l = ['ones', 'twos', 'threes']
wanted = 'three'
Here, the approach in the linked question will return True using:
any(s.startswith(wanted) for s in l)
So how can you return the element 'threes' instead?
startswith and in both return a Boolean.
- The in operator is a test of membership.
- Returning the matching elements themselves can be done with a list-comprehension or filter.
- Using a list-comprehension with in is the fastest implementation tested.
- If case is not an issue, consider mapping all the words to lowercase: l = list(map(str.lower, l)).
- Tested with Python 3.10.0
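The lowercase suggestion above can be sketched in two ways. This is a minimal illustration, assuming a mixed-case input list made up for the example; the names lowered, result_lowered and result_original are hypothetical and not part of the original answer.

```python
l = ['Ones', 'TWOS', 'Threes']
wanted = 'three'

# Lowercase a copy of the list, as suggested above; the original casing is lost.
lowered = list(map(str.lower, l))
result_lowered = [v for v in lowered if wanted in v]

# Alternative: lowercase only for the comparison, so the original elements are returned.
result_original = [v for v in l if wanted.lower() in v.lower()]

print(result_lowered)   # ['threes']
print(result_original)  # ['Threes']
```

The second form is probably preferable when the matched elements need to keep their original spelling.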
filter:
- Using filter creates a filter object, so list() is used to show all the matching values in a list.
l = ['ones', 'twos', 'threes']
wanted = 'three'
# using startswith
result = list(filter(lambda x: x.startswith(wanted), l))
# using in
result = list(filter(lambda x: wanted in x, l))
print(result)
[out]:
['threes']
list-comprehension:
l = ['ones', 'twos', 'threes']
wanted = 'three'
# using startswith
result = [v for v in l if v.startswith(wanted)]
# using in
result = [v for v in l if wanted in v]
print(result)
[out]:
['threes']
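Both approaches above return every matching element as a list. If only the first match is wanted, as with 'threes' in the question, one option not covered in the answer is to pass a generator expression to the built-in next() with a default value. A minimal sketch, where first_match and no_match are illustrative names:

```python
l = ['ones', 'twos', 'threes']
wanted = 'three'

# next() pulls the first item from the generator and stops scanning;
# the second argument (None) is returned when nothing matches.
first_match = next((s for s in l if s.startswith(wanted)), None)
print(first_match)  # threes

no_match = next((s for s in l if s.startswith('four')), None)
print(no_match)  # None
```

Without the default argument, next() raises StopIteration when no element matches, so the default is worth keeping unless a match is guaranteed.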
Which implementation is faster?
- Tested in Jupyter Lab using the words corpus from nltk v3.6.5, which has 236736 words
- Words with 'three':
['three', 'threefold', 'threefolded', 'threefoldedness', 'threefoldly', 'threefoldness', 'threeling', 'threeness', 'threepence', 'threepenny', 'threepennyworth', 'threescore', 'threesome']
from nltk.corpus import words
%timeit list(filter(lambda x: x.startswith(wanted), words.words()))
[out]:
64.8 ms ± 856 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit list(filter(lambda x: wanted in x, words.words()))
[out]:
54.8 ms ± 528 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [v for v in words.words() if v.startswith(wanted)]
[out]:
57.5 ms ± 634 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [v for v in words.words() if wanted in v]
[out]:
50.2 ms ± 791 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
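The %timeit magic is only available in IPython/Jupyter. A rough way to repeat the comparison in a plain script is the standard-library timeit module. The sketch below is an assumption-laden stand-in, not the original benchmark: word_list is a small synthetic list rather than the nltk corpus, and candidates and timings are hypothetical names, so the absolute numbers will not match those above.

```python
import timeit

# Hypothetical stand-in for the nltk corpus: a synthetic list of strings.
word_list = [f'word{i}' for i in range(20_000)] + ['three', 'threefold', 'threesome']
wanted = 'three'

# The four implementations compared in the answer, wrapped as callables.
candidates = {
    'filter + startswith': lambda: list(filter(lambda x: x.startswith(wanted), word_list)),
    'filter + in': lambda: list(filter(lambda x: wanted in x, word_list)),
    'comprehension + startswith': lambda: [v for v in word_list if v.startswith(wanted)],
    'comprehension + in': lambda: [v for v in word_list if wanted in v],
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 10 calls each, reported as seconds per call.
    timings[name] = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f'{name}: {timings[name] * 1000:.2f} ms per call')
```

Here the list is built once, outside the timed calls, so only the matching itself is measured.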