I have two arrays of 1s and 0s:
    a = [1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
    b = [0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1]
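I want the 1s to alternate between the two arrays as I go from left to right: a 1 should never appear in the same array twice in a row without a 1 appearing in the other array in between. When that happens, the earlier 1 should be set to 0. Here the 1 at index 16 of b is followed by another 1 in b at index 19, so the expected result is: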
    a = [1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
    b = [0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
I can do it using pandas and iteration:
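    import pandas as pd

    df = pd.DataFrame({"A": a, "B": b})
    # Keep only the rows where one of the arrays holds a 1
    df2 = df[(df.A > 0) | (df.B > 0)]
    i = 0
    for idx in df2.index:
        try:
            # If the next 1 is in the same column, zero the current one
            if df2.at[idx, 'A'] == df2.at[df2.index[i + 1], 'A']:
                df.at[idx, 'A'] = 0
            if df2.at[idx, 'B'] == df2.at[df2.index[i + 1], 'B']:
                df.at[idx, 'B'] = 0
            i += 1
        except IndexError:
            # The last row has no successor to compare against
            pass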
But it is not efficient. How can I vectorize it to make it faster?
You can achieve this without iteration using NumPy. Here’s a vectorized solution:
    import numpy as np

    a = np.array([1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    b = np.array([0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1])

    # Find indices where either a or b is 1
    indices = np.where((a == 1) | (b == 1))[0]

    # For each of those indices, record whether the 1 is in a (True) or b (False)
    in_a = a[indices] == 1

    # A 1 is redundant when the next 1 is in the same array
    same_as_next = in_a[:-1] == in_a[1:]
    drop = indices[:-1][same_as_next]

    # Zero the redundant positions (the other array is already 0 there)
    a[drop] = 0
    b[drop] = 0

    print("Result A:", a)
    print("Result B:", b)

This code uses np.where to find the indices where either a or b is 1, then checks which of the two arrays holds the 1 at each of those indices. Comparing that boolean sequence with itself shifted by one marks every 1 whose successor is in the same array, and those positions are set to 0 in both arrays. It assumes a and b are never both 1 at the same index. This approach avoids an explicit Python loop and should be more efficient for large arrays.
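The rule from the question (zero any 1 whose next 1 lands in the same array) can also be wrapped in a function that leaves its inputs untouched. This is only a minimal sketch: the name alternate_ones is hypothetical, and it assumes both inputs have the same length and are never both 1 at the same index.

```python
import numpy as np

def alternate_ones(a, b):
    """Return copies of a and b in which the 1s alternate between arrays.

    Hypothetical helper. Assumes len(a) == len(b) and that a and b are
    never both 1 at the same index.
    """
    a = np.array(a, copy=True)
    b = np.array(b, copy=True)

    # Positions where either array holds a 1, in left-to-right order
    idx = np.flatnonzero((a == 1) | (b == 1))
    # True where the 1 is in a, False where it is in b
    in_a = a[idx] == 1
    # A 1 is dropped when the next 1 is in the same array
    same_as_next = in_a[:-1] == in_a[1:]
    drop = idx[:-1][same_as_next]

    a[drop] = 0
    b[drop] = 0
    return a, b

# A run of three 1s in a, then two in b: only the last 1 of each run survives
new_a, new_b = alternate_ones([1, 1, 0, 1, 0, 0], [0, 0, 0, 0, 1, 1])
print(new_a)  # [0 0 0 1 0 0]
print(new_b)  # [0 0 0 0 0 1]
```

Because every 1 is compared only with its immediate successor, a run of any length in one array collapses to its last 1, which matches what the pandas loop in the question does.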