Tuesday, June 9, 2009

Deciphering Nmix

So I finally have a working version of Nmix, the program that uses the Reversible Jump Markov Chain Monte Carlo method to estimate the number of mixture components in a data set.

As far as I can figure out, the k file contains information about the number of components, k.
The first block is the posterior probability of each k; the k with the highest probability is the most likely number of components.
The next block contains the Bayes factors, which I don't yet understand.
The next is the prior distribution of k, which defaults to 1 for every k (a flat prior).
The last block is the posterior distribution of kpos, the number of nonempty components. It is very similar to the distribution for k, and I'm not entirely sure how the two differ; presumably k also counts components that have no data points assigned to them.
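
For reference, the textbook definition of a Bayes factor between two values of k is the posterior odds divided by the prior odds. The sketch below only illustrates that general definition; whether Nmix prints exactly this quantity is an assumption, and the probabilities are made-up numbers, not Nmix output.

```python
def bayes_factor(posterior, prior, k1, k2):
    """Bayes factor of k1 against k2: posterior odds divided by prior odds."""
    posterior_odds = posterior[k1] / posterior[k2]
    prior_odds = prior[k1] / prior[k2]
    return posterior_odds / prior_odds

# Made-up posterior probabilities for k = 1..4 (illustrative only, not Nmix output).
posterior = {1: 0.05, 2: 0.60, 3: 0.25, 4: 0.10}
# Flat prior: the same value for every k, so the prior odds are always 1.
prior = {k: 1.0 for k in posterior}

bf_2_vs_3 = bayes_factor(posterior, prior, 2, 3)
print(bf_2_vs_3)
```

With a flat prior the prior odds cancel, so the Bayes factor is just the ratio of the two posterior probabilities.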

The pe file contains an even better summary of the data. It has a section for each k (the number of components) that the program visits.
The first number in each section appears to be the number of sweeps the sampler spent at that k. Dividing it by the total number of sweeps gives the posterior probability listed in the k file.
Then we get to the interesting stuff. The first column is the weight of each component (the fraction of the data belonging to it).
The next one is the mean of each component, and the last is the standard deviation.
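
Here is a minimal sketch of how one section of that summary could be used, assuming the components are normal distributions. The visit count, sweep total, and (weight, mean, standard deviation) rows are hypothetical values typed in by hand for illustration; this does not parse the actual pe file.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """Weighted sum of normal densities; components are (weight, mean, sd) rows."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)

# Hypothetical section for k = 2 (illustrative numbers, not real Nmix output).
visits_k2 = 6000        # sweeps spent at k = 2
total_sweeps = 10000    # total sweeps in the run
components_k2 = [
    (0.7, 0.0, 1.0),    # weight, mean, standard deviation
    (0.3, 5.0, 2.0),
]

# Fraction of sweeps spent at k = 2, i.e. the estimated posterior probability of k = 2.
posterior_k2 = visits_k2 / total_sweeps
# Mixture density implied by the three columns, evaluated at x = 0.
density_at_zero = mixture_pdf(0.0, components_k2)
print(posterior_k2, density_at_zero)
```

The weights in a section should sum to 1, which makes a quick sanity check when reading the columns.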

But that still doesn't include the error on each mean, so I'll need to delve a little deeper to find it.
