Monday, June 22, 2009

Jackknife has too many k's in a row for one word.

Since my last blog post, I have spent my time modifying my bootstrap code to do jackknifes instead.

Jackknifes are somewhat similar to bootstraps in that they test modified data sets. The jackknife alteration is simply to remove a single data point, compute the statistic in question, then put the point back and remove another, repeating until every data point has been left out once. This gives a good estimate of the variance of the statistic, because it shows how much any single data point affects the result.
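As a rough sketch of that leave-one-out loop (not the actual code from this project), here is a minimal pure-Python version. The function name `jackknife`, the toy `sample` values, and the choice of the mean as the statistic are all assumptions made for illustration; any function that takes a list and returns a number could be passed in instead. The variance uses the standard jackknife formula, (n - 1)/n times the sum of squared deviations of the leave-one-out values from their mean.

```python
import math
import statistics


def jackknife(data, statistic):
    """Leave-one-out jackknife of `statistic` over `data`.

    Returns (replicates, mean of replicates, jackknife standard error).
    """
    data = list(data)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")

    # Recompute the statistic n times, each time with one point left out.
    replicates = [statistic(data[:i] + data[i + 1:]) for i in range(n)]

    mean_rep = sum(replicates) / n

    # Standard jackknife variance: (n - 1) / n * sum of squared deviations.
    variance = (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)
    return replicates, mean_rep, math.sqrt(variance)


# Toy data, made up purely for illustration.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
replicates, estimate, std_error = jackknife(sample, statistics.mean)
```

For the mean, the jackknife standard error works out to the familiar s / sqrt(n), which makes a handy sanity check before swapping in a less friendly statistic.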

I'll find out more about jackknifing and bootstrapping once I actually get that book from the Astro library (probably tomorrow).

My jackknifing code made a bunch of pretty plots showing the Nmix probability of a given number of components for each test. It's pretty interesting that the single-component probability seems to fluctuate a lot across the different jackknifes, while the probabilities for 2 components and beyond quickly converge.

One alarming discovery, though, was that the simulated normal sample with added uncertainty sometimes appeared to be strongly bimodal. So I decided to test the power of Nmix with a bunch of simulated pure normal populations containing varying numbers of stars.

Theoretically, Nmix should work better with more stars (more data points make the normal curve much more visible), so I'll see if that pans out as expected.
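A minimal sketch of how such test populations might be generated, assuming Gaussian measurement uncertainty added to each star. The function name `simulate_normal_sample`, the mean, the spread, the uncertainty, and the list of sample sizes are all placeholder assumptions, not the values used in the actual test. Each sample would then be handed to Nmix, a step not shown here.

```python
import random


def simulate_normal_sample(n_stars, mean=0.0, sigma=1.0, uncertainty=0.2, rng=None):
    """Draw n_stars values from one normal population, then add
    Gaussian measurement noise to each value (all defaults are placeholders)."""
    rng = rng or random.Random()
    sample = []
    for _ in range(n_stars):
        true_value = rng.gauss(mean, sigma)                 # intrinsic value
        observed = true_value + rng.gauss(0.0, uncertainty)  # add uncertainty
        sample.append(observed)
    return sample


# Fixed seed so the simulated populations are reproducible.
rng = random.Random(42)

# Hypothetical sample sizes; one pure normal population per size.
sample_sizes = [25, 50, 100, 200, 400]
samples = {n: simulate_normal_sample(n, rng=rng) for n in sample_sizes}
```

Since every population here is unimodal by construction, any run that comes back strongly bimodal counts as a false positive, and the rate of those at each sample size is what the test measures.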

To Do:
1) Find the mean and standard deviation of the kurtosis of the jackknifed samples
3) According to Wikipedia, there is also a kick called a jackknife:

4) Figure out how to do that.

1 comment:

  1. Dude. Awesome. Don't fall and break yourself or anything while trying to learn to do that.
    And I agree... the double k looks odd... does in "bookkeeper" too. =P