I have a massive 4D data set, spread across 4 variables: x_list, y_list, z_list, and i_list. Each is a list of n scalars, with x, y, and z representing a point's position in space, and i representing its intensity.
I have a function that picks through these and marks negligible points (those whose intensity is low) for deletion by setting their intensity to 0. However, when run on a 2-million-point set, the deletion process takes hours.
Currently, I'm using the .pop(index) command to remove data points, because it does so cleanly. Here is the code:
counter = 0
i = 0
for entry in i_list:
    if (i_list[i] == 0):
        x_list.pop(i)
        y_list.pop(i)
        z_list.pop(i)
        i_list.pop(i)
        counter += 1
        print(counter, "points removed")
    else:
        i += 1

How can I do this more efficiently?
I think it'll be faster to create a new empty list for each existing list, and append items to them if i_list[i] != 0. Look at the time complexity of the operations you're doing, and you'll see that deleting an item from a list is O(n), whereas appending is O(1). You're doing a lot of O(n) deletes with a pretty large n, which is slow.
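That complexity gap is easy to demonstrate with a quick timing sketch (illustrative only; pop(0) is the worst case of the O(n) delete, and the sizes here are much smaller than the question's 2 million points):

```python
import timeit

n = 50_000

# Popping from the front forces the list to shift every remaining
# element down by one slot: O(n) per pop, O(n^2) overall.
pop_time = timeit.timeit(
    "while data: data.pop(0)",
    setup=f"data = list(range({n}))",
    number=1,
)

# Appending to a list is amortized O(1): O(n) overall.
append_time = timeit.timeit(
    "for v in src:\n    out.append(v)",
    setup=f"src = list(range({n})); out = []",
    number=1,
)

print(f"pop(0) loop:  {pop_time:.4f}s")
print(f"append loop:  {append_time:.4f}s")
```

The gap widens rapidly as n grows, which is why the pop-based loop takes hours at 2 million points.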
So something like:
new_x = []
new_y = []
new_z = []
new_i = []
for index in range(len(i_list)):
    if i_list[index] != 0:
        new_x.append(x_list[index])
        new_y.append(y_list[index])
        # etc.

Going further, you should use numpy arrays, where subsetting to find the set of items where i_list != 0 is fast.
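A minimal sketch of the NumPy approach (the small sample values here are illustrative; the real lists would hold millions of points):

```python
import numpy as np

# Tiny stand-ins for the question's x_list, y_list, z_list, i_list.
x_list = [1.0, 2.0, 3.0, 4.0]
y_list = [5.0, 6.0, 7.0, 8.0]
z_list = [9.0, 10.0, 11.0, 12.0]
i_list = [0.5, 0.0, 2.5, 0.0]

# Convert the lists to arrays once.
x = np.asarray(x_list)
y = np.asarray(y_list)
z = np.asarray(z_list)
i = np.asarray(i_list)

# One vectorized pass builds a boolean mask of points to keep...
keep = i != 0

# ...and one indexing operation per array filters them out.
x, y, z, i = x[keep], y[keep], z[keep], i[keep]

print(len(i), "points kept")
```

All four arrays stay aligned because the same mask is applied to each, and the whole filter runs in C rather than in a Python-level loop.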