I am trying to optimise this code using NumPy built-in functions:
result = [test[0]]
for x in range(1, len(test)):
    if test[x] <= result[-1]:
        result.append(result[-1] + 1)
    else:
        result.append(test[x])
print(result)
It loops through the array and checks whether the previously computed value is greater than or equal to the current value. If it is, the result is the previous value + 1; otherwise the current value is kept unchanged. It is a recurrence (each value depends on the previously calculated value). My input array is, by design, always sorted in ascending order.
With test = np.array([0, 0, 0, 1, 4, 15, 16, 16, 16, 17]), I expect to get [0, 1, 2, 3, 4, 15, 16, 17, 18, 19].
Is there a better way of doing this? I need to call this function many times on very large arrays (length >= 10M).
Certainly! You can optimize the given code using numpy’s built-in functions to eliminate the loop and make it more efficient. Here’s an optimized version:
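import numpy as np

def optimize_array(test):
    # Assumes an integer dtype, so the loop is equivalent to
    # result[i] = max(test[i], result[i-1] + 1).
    # Subtracting the index turns that into a running maximum.
    idx = np.arange(len(test))
    return np.maximum.accumulate(test - idx) + idx

test = np.array([0, 0, 0, 1, 4, 15, 16, 16, 16, 17])
result = optimize_array(test)
print(result)

This code eliminates the explicit loop and uses numpy’s vectorized operations for better performance. For integer input, the loop computes result[i] = max(test[i], result[i-1] + 1). Subtracting the index i from both sides turns this into a plain running maximum of test[i] - i, which np.maximum.accumulate computes in one pass; adding the index back gives the result. A single np.where comparing test[1:] against result[:-1] does not work here, because result[:-1] has not been filled in yet when the mask is evaluated, so the dependency on the previously computed value is lost. This should be more efficient for large arrays.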
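One vectorized formulation that reproduces the loop, assuming integer input, rewrites the recurrence as result[i] = max(test[i], result[i-1] + 1) and subtracts the index so that it becomes a running maximum. The sketch below prints the intermediate arrays for the example input and then compares this formulation with the original loop on random sorted data. The helper names `loop_reference` and `running_max_version` are made up for illustration and do not come from the original post.

```python
import numpy as np

def loop_reference(test):
    # The original element-by-element loop from the question.
    result = [test[0]]
    for x in range(1, len(test)):
        if test[x] <= result[-1]:
            result.append(result[-1] + 1)
        else:
            result.append(test[x])
    return np.array(result)

def running_max_version(test):
    # Assumption: integer dtype, so the loop equals
    # result[i] = max(test[i], result[i-1] + 1).
    idx = np.arange(len(test))
    return np.maximum.accumulate(test - idx) + idx

test = np.array([0, 0, 0, 1, 4, 15, 16, 16, 16, 17])
idx = np.arange(len(test))
shifted = test - idx                      # values: 0, -1, -2, -2, 0, 10, 10, 9, 8, 8
running = np.maximum.accumulate(shifted)  # values: 0, 0, 0, 0, 0, 10, 10, 10, 10, 10
print(running + idx)                      # values: 0, 1, 2, 3, 4, 15, 16, 17, 18, 19

# Randomised check on sorted integer arrays.
rng = np.random.default_rng(0)
for _ in range(100):
    data = np.sort(rng.integers(0, 50, size=200))
    assert np.array_equal(loop_reference(data), running_max_version(data))
```

With float input the two versions can differ (a value lying strictly between the previous result and the previous result + 1 is kept as-is by the loop), so the integer assumption matters.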
Keep in mind that the performance gain may vary depending on the size of your input array and the hardware you’re running the code on. Always test with your specific use case to ensure the best performance.
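A rough way to measure this on your own machine is sketched below with the standard `timeit` module. The array size `n`, the random test data, and the function names are placeholders chosen for illustration, and the vectorized version is a running-maximum formulation that assumes integer input. No timings are quoted here because they depend on hardware and array size.

```python
import timeit
import numpy as np

def loop_version(test):
    # Original Python loop from the question.
    result = [test[0]]
    for x in range(1, len(test)):
        if test[x] <= result[-1]:
            result.append(result[-1] + 1)
        else:
            result.append(test[x])
    return result

def vectorized_version(test):
    # Running-maximum formulation; assumes an integer dtype.
    idx = np.arange(len(test))
    return np.maximum.accumulate(test - idx) + idx

rng = np.random.default_rng(0)
n = 200_000  # placeholder size; raise it towards your real workload
data = np.sort(rng.integers(0, n, size=n))

# Check that both versions agree before timing them.
assert np.array_equal(loop_version(data), vectorized_version(data))

loop_time = timeit.timeit(lambda: loop_version(data), number=1)
vec_time = timeit.timeit(lambda: vectorized_version(data), number=10) / 10
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.6f} s per call")
```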