I have a list, say [a, b, c, a, a, b]. As output, I want a list of the indices of the matching elements. For this example, that would be [[0, 3, 4], [1, 5]].
The lists will be small, so performance and memory use are not important. Is there already a built-in function for lists, or one in NumPy or pandas, that can do this?
In Python, you can achieve this using a dictionary to store the indices of each element. Here’s a simple function that does what you want:
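    def find_matching_indices(lst):
        # Map each element to the list of positions where it occurs.
        index_dict = {}
        for i, item in enumerate(lst):
            if item in index_dict:
                index_dict[item].append(i)
            else:
                index_dict[item] = [i]
        # Keep only elements that occur more than once.
        return [indices for indices in index_dict.values() if len(indices) > 1]

    # Example usage
    input_list = ['a', 'b', 'c', 'a', 'a', 'b']
    output = find_matching_indices(input_list)
    print(output)

This prints [[0, 3, 4], [1, 5]], which is the desired result for your example. Without the length filter, the unmatched element c would also appear in the output, as [2].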
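A more compact variant of the dictionary approach, as a sketch: `collections.defaultdict` removes the if/else branch, and a final comprehension keeps only the values that occur often enough. The function name `group_duplicate_indices` and the `min_count` parameter are illustrative choices, not part of any library.

```python
from collections import defaultdict


def group_duplicate_indices(items, min_count=2):
    """Return index lists for values occurring at least min_count times."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    # Dicts preserve insertion order (Python 3.7+), so the groups appear
    # in the order of each value's first occurrence.
    return [idxs for idxs in positions.values() if len(idxs) >= min_count]


print(group_duplicate_indices(['a', 'b', 'c', 'a', 'a', 'b']))  # [[0, 3, 4], [1, 5]]
```

Passing `min_count=1` returns every group, including elements that appear only once.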
Alternatively, if you are using NumPy, you can achieve the same result using the numpy.where function:
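    import numpy as np

    input_array = np.array(['a', 'b', 'c', 'a', 'a', 'b'])
    # np.unique returns the sorted distinct values; return_counts=True adds how often each occurs.
    values, counts = np.unique(input_array, return_counts=True)
    # Look up positions only for values that occur more than once.
    output = [np.where(input_array == value)[0].tolist() for value in values[counts > 1]]
    print(output)

This also prints [[0, 3, 4], [1, 5]]. Because np.unique sorts the values, the groups come out in sorted order rather than in order of first appearance.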
Note: The NumPy approach is convenient if your data is already in an array or you need other array functionality, but it scans the array once per repeated value. For small lists, the first approach using a dictionary is straightforward and makes a single pass over the data.