How to plot Kernel Density Estimation (KDE) and zero crossings for 3D data in Python?


I have a 3D dataset (x, y, z). I want to perform a KDE and plot the data together with the estimate. Then I want to find the zero crossings and plot them with the KDE. My attempt is below. I have the following questions:

  1. Do the lines x, y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] and positions = np.vstack([x.ravel(), y.ravel(), z.ravel()]), as used here (KDE documentation), have any effect on visualising the real estimate of the original data? I don't understand why I have to use the min and max to perform the KDE, and why I have to use ravel().
  2. Why do I have to transpose the data in f = np.reshape(kernel(positions).T, x.shape)?

  3. Is the code correct?

  4. I failed to plot the original data with the KDE estimate, and the KDE estimate/original data with the zero crossings.

  5. Should the zero crossings be a vector? In the code below it's a tuple.

    df = pd.read_csv(file, delimiter = ',')

    # convert data-frame series to arrays
    x = np.array(df['x'])
    y = np.array(df['y'])
    z = np.array(df['z'])

    data = np.vstack([x, y, z])

    # perform kde
    kernel = scipy.stats.kde.gaussian_kde(data)
    density = kernel(data)

    fig, ax = plt.subplots(subplot_kw=dict(projection='3d'))
    x, y, z = data
    scatter = ax.scatter(x, y, z, c=density)

    xmin = values[0].min()
    xmax = values[0].max()
    ymin = values[1].min()
    ymax = values[1].max()
    zmin = values[2].min()
    zmax = values[2].max()

    x, y, z = np.mgrid[xmin:xmax:100j, ymin:ymax:100j, zmin:zmax:100j]
    positions = np.vstack([x.ravel(), y.ravel(), z.ravel()])
    f = np.reshape(kernel(positions).T, x.shape)

    derivative = np.gradient(f)
    dz, dy, dx = derivative

    xdiff = np.sign(dx)   # along x-axis
    ydiff = np.sign(dy)   # along y-axis
    zdiff = np.sign(dz)   # along z-axis

    xcross = np.where(xdiff[:-1] != xdiff[1:])
    ycross = np.where([ydiff[:-1] != ydiff[1:]])
    zcross = np.where([zdiff[:-1] != zdiff[1:]])

    zerocross = xcross + ycross + zcross

Do the lines x, y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] and positions = np.vstack([x.ravel(), y.ravel(), z.ravel()]), as used here (KDE documentation), have any effect on visualising the real estimate of the original data? I don't understand why I have to use the min and max to perform the KDE, and why I have to use ravel().

Those two lines set up a grid of x, y, z locations at which the KDE is evaluated. In the code above, that grid is only used to estimate the derivative of the kernel density function. Since the grid values aren't used for anything related to plotting, they won't affect the visualisation.

xmin, xmax etc. are used to ensure that the grid covers the full range of x, y, z values in your data. The syntax xmin:xmax:100j is the equivalent of np.linspace(xmin, xmax, 100), i.e. np.mgrid returns 100 evenly spaced points between xmin and xmax.
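As a quick check of that equivalence, here is a minimal sketch. The range values are made up for illustration and are not taken from the question's data.

```python
import numpy as np

# Hypothetical data range, chosen only for illustration.
xmin, xmax = -2.0, 3.0

# A complex step (100j) means "100 points, endpoint included".
grid_axis = np.mgrid[xmin:xmax:100j]
linspace_axis = np.linspace(xmin, xmax, 100)

same = np.allclose(grid_axis, linspace_axis)
print(grid_axis.shape, same)
```

With a real (non-complex) step, np.mgrid behaves like np.arange instead and excludes the endpoint.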

The x, y and z arrays returned by np.mgrid each have shape (100, 100, 100), whereas the positions argument to kernel(positions) needs to have shape (n_dimensions, n_points). The line np.vstack([x.ravel(), y.ravel(), z.ravel()]) reshapes the output of np.mgrid into this form. .ravel() flattens each (100, 100, 100) array into a (1000000,) vector, and np.vstack concatenates them along the first dimension to make a (3, 1000000) array of points.
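The shape bookkeeping can be sketched on a smaller problem. The data here is synthetic (random normal samples standing in for the CSV columns), and the grid uses 20 points per axis instead of 100 purely to keep it fast. The names gx, gy, gz and n are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in for the real data, shape (n_dimensions, n_points).
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 200))
kernel = gaussian_kde(data)

n = 20  # grid points per axis (assumption: coarser than the question's 100)
mins, maxs = data.min(axis=1), data.max(axis=1)
gx, gy, gz = np.mgrid[mins[0]:maxs[0]:n * 1j,
                      mins[1]:maxs[1]:n * 1j,
                      mins[2]:maxs[2]:n * 1j]

# Flatten each (n, n, n) grid and stack into (3, n**3) for the KDE.
positions = np.vstack([gx.ravel(), gy.ravel(), gz.ravel()])

# The KDE returns one density per column; reshape back onto the grid.
f = kernel(positions).reshape(gx.shape)
print(gx.shape, positions.shape, f.shape)
```

Because ravel() and reshape() use the same element order, f[i, j, k] is the density at the point (gx[i, j, k], gy[i, j, k], gz[i, j, k]).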

Why do I have to transpose the data in f = np.reshape(kernel(positions).T, x.shape)?

You don't :-). The output of kernel(positions) is a 1D vector, so transposing it has no effect.
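A small sketch of why the .T is a no-op, using a plain 1D array as a stand-in for the output of kernel(positions):

```python
import numpy as np

# Stand-in for the 1D density vector returned by the KDE.
v = np.arange(6.0)

# Transposing a 1D array returns an array with the same shape and contents.
print(v.shape, v.T.shape)
unchanged = np.array_equal(v, v.T)

# So reshaping v.T gives exactly the same result as reshaping v.
reshaped = np.reshape(v.T, (1, 2, 3))
```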

I failed to plot the original data with the KDE estimate, and the KDE estimate/original data with the zero crossings.

What did you try? The code above seems to estimate the zero-crossings of the gradient of the kernel density function, but it doesn't include any code to plot them. What sort of plot do you want to make?

Should the zero crossings be a vector? In the code below it's a tuple.

When you call np.where(x) where x is a multidimensional array, you get back a tuple containing the indices where x is non-zero. Since xdiff[:-1] != xdiff[1:] is a 3D array, you get back a tuple containing three 1D arrays of indices, one per dimension.
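A toy example of that return value, with a small boolean array in place of the real sign-change mask. If a single array of coordinates is more convenient than a tuple, np.argwhere is one option. That is a suggestion, not something the original code uses.

```python
import numpy as np

# Small boolean array standing in for a 3D sign-change mask.
a = np.zeros((3, 4, 5), dtype=bool)
a[0, 1, 2] = True
a[2, 3, 4] = True

# Tuple of three 1D index arrays, one per axis.
idx = np.where(a)

# Alternative: one (n_hits, 3) array, each row is an (i, j, k) coordinate.
coords = np.argwhere(a)
print(len(idx), coords.shape)
```

The tuple form is handy for indexing (for example a[idx]), while the argwhere form is easier to pass around as a list of points.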

You don't want the extra set of square brackets in np.where([ydiff[:-1] != ydiff[1:]]), since in that case [ydiff[:-1] != ydiff[1:]] will be treated as a (1, 99, 100, 100) array rather than a (99, 100, 100) array (slicing with [:-1] drops one element along the first axis). You'll therefore get back a tuple containing 4 arrays of indices rather than 3 (the first one will be all zeros, since the size in the first dimension is 1).
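The effect of the extra brackets can be reproduced on a small random array. The array s below is synthetic and only stands in for one of the sign arrays.

```python
import numpy as np

# Synthetic sign array standing in for ydiff, shape (5, 5, 5).
s = np.sign(np.random.default_rng(1).normal(size=(5, 5, 5)))

# Sign changes between neighbours along the first axis, shape (4, 5, 5).
changed = s[:-1] != s[1:]

good = np.where(changed)    # three index arrays
bad = np.where([changed])   # the list adds a leading axis of length 1

print(len(good), len(bad))
```

Note that s[:-1] != s[1:] compares neighbours along the first axis only, so applying the same slicing to all three sign arrays does not by itself look along the second or third axis.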

