Guide: weighted median in Python with NumPy

If I understood your problem correctly, what we can do is sum up the observation counts; dividing that total by 2 gives us the observation number corresponding to the median. From there we need to figure out which bin that observation falls in.

One trick here is to calculate the observation sums with np.cumsum, which gives us a running cumulative sum.

Example:
np.cumsum([1,2,3,4]) -> [ 1, 3, 6, 10]
Each element is the sum of all previous elements and itself. We have 10 observations here, so the median would be the 5th observation (we get 5 by dividing the last element by 2).
Now looking at the cumsum result, we can easily see that it must be an observation between the second and third elements (cumulative counts 3 and 6).
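As a quick check of the arithmetic above, here is a minimal sketch. The variable names (counts, running, half) are my own and not part of the original answer.

```python
import numpy as np

counts = [1, 2, 3, 4]          # frequency count of each bin
running = np.cumsum(counts)    # running total of observations
half = running[-1] / 2.0       # position of the median observation

print(running)  # [ 1  3  6 10]
print(half)     # 5.0
```

The last element of the running sum is the total number of observations, so halving it gives the position we need to locate.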

So all we need to do is figure out the index where the median (5) will fit.
np.searchsorted does exactly what we need. It finds the index at which to insert an element into a sorted array so that the array stays sorted.
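To make the lookup concrete, here is a small sketch that applies np.searchsorted to the cumulative sums from the example. The variable names are assumptions for illustration only.

```python
import numpy as np

running = np.array([1, 3, 6, 10])   # cumulative counts from the example
half = running[-1] / 2.0            # 5.0, the median position

# With the default side="left", searchsorted returns the first index i
# such that running[i] >= half.
idx = np.searchsorted(running, half)
print(idx)  # 2
```

Index 2 is the third bin, which is the first bin whose cumulative count (6) reaches the halfway point.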

The code to do it looks like this:

import numpy as np

# my test data
freq_count = np.array([[30, 191, 9, 0], [10, 20, 300, 10], [10, 20, 30, 40], [100, 10, 10, 10], [1, 1, 1, 100]])

# running total of the counts along each row
c = np.cumsum(freq_count, axis=1)
# for each row, the index where half of the row total (row[-1] / 2) fits
indices = [np.searchsorted(row, row[-1] / 2.0) for row in c]
masses = np.array(indices) * 10  # Correct if the masses are indeed 0, 10, 20,...

# This is just for explanation.
print("median masses is:", masses)
print(freq_count)
# cumulative sums with the half-total appended as an extra column
print(np.hstack([c, c[:, -1, np.newaxis] / 2.0]))

Output will be:

median masses is: [10 20 20  0 30]  
[[ 30 191   9   0]

midpoint):
    w_median = (data[weights == np.max(weights)])[0]
else:
    cs_weights = np.cumsum(s_weights)
    idx = np.where(cs_weights
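Separately from the truncated fragment above, the cumsum and searchsorted idea can be applied to arbitrary values and weights. What follows is only a sketch, not a reconstruction of the cut-off code: the function name weighted_median_sketch is hypothetical, and it assumes the "lower" weighted median convention (no averaging when the cumulative weight lands exactly on the halfway point) and a positive total weight.

```python
import numpy as np

def weighted_median_sketch(values, weights):
    """Return the smallest value whose cumulative weight reaches
    half of the total weight (hypothetical helper, lower median)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)           # sort values, carry weights along
    cum = np.cumsum(weights[order])      # running total of the weights
    idx = np.searchsorted(cum, cum[-1] / 2.0)
    return values[order][idx]

print(weighted_median_sketch([0, 10, 20, 30], [30, 191, 9, 0]))  # 10.0
print(weighted_median_sketch([30, 10, 20], [1, 1, 5]))           # 20.0
```

The first call uses the first row of the test data with assumed masses 0, 10, 20, 30 and agrees with the median mass of 10 reported for that row.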
