Answer a question

I am having trouble computing a likelihood ratio test in Python 2.7.

I have two models and the corresponding likelihood values. I believe the rule for comparing whether model L2 is better than model L1 (if the models are closely related) is to look at -2 * log(L2/L1).

I then want to find the p-value corresponding to -2 * log(L2/L1) and relate this to the significance with which L2 is preferred over L1. Here is what I have so far:

import numpy as np
import scipy.special
from scipy.stats import chisqprob

L1 = 467400. # log(likelihood) of my 1st fit
L2 = 467414. # log(likelihood) of my 2nd fit

LR = -2. * np.log(L2 / L1) # LR = -5.9905e-05

p = chisqprob(LR, 1) # L2 has 1 DoF more than L1

print 'p: %.30f' % p # p = 1.000000000000000000000000000000

five_sigma = 1 - scipy.special.erf(5 / np.sqrt(2.))  # :-)
print '5 sigma: %.30f' % five_sigma

five_sigma_check = 1 - 0.999999426696856  # :-(
print 'Check  : %.30f' % five_sigma_check

However, I run into two issues:

  • My p-value is coming out as 1 when I'd have expected it to be close to 0.
  • When I use the formula on the line marked with the :-) to find five sigma, it differs from the value quoted in the literature, which appears on the line marked with the :-(. My value for five_sigma_check is taken from here.

Can anyone offer any advice, please? I'm relatively new to the world of Python and statistics.

Thanks.

Answers

The values you have are already log-likelihoods, so the likelihood-ratio statistic is twice their difference, not -2 times the log of their ratio:

from scipy.stats.distributions import chi2

def likelihood_ratio(llmin, llmax):
    # Twice the difference of the log-likelihoods
    return 2 * (llmax - llmin)

LR = likelihood_ratio(L1, L2)  # L1, L2 are the log-likelihoods from the question

p = chi2.sf(LR, 1) # L2 has 1 DoF more than L1

print 'p: %.30f' % p
# p: 0.000000121315450836607258011741
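Putting the pieces together, here is a minimal self-contained sketch. The conversion of the p-value to a Gaussian sigma via `norm.isf` (a one-sided convention) and the two-sided five-sigma tail via `erfc` are my additions for illustration, not part of the original answer:

```python
import numpy as np
from scipy.special import erfc
from scipy.stats import chi2, norm

# Log-likelihoods from the question
ll1 = 467400.0  # log-likelihood of the 1st fit
ll2 = 467414.0  # log-likelihood of the 2nd fit

# Likelihood-ratio statistic: twice the DIFFERENCE of the log-likelihoods
LR = 2.0 * (ll2 - ll1)  # 28.0

# p-value from the chi-squared survival function (1 extra degree of freedom)
p = chi2.sf(LR, 1)

# One-sided Gaussian-sigma equivalent of that p-value
sigma = norm.isf(p)

# Two-sided 5-sigma tail probability; this reproduces the literature value
# 1 - 0.999999426696856 quoted in the question
five_sigma = erfc(5.0 / np.sqrt(2.0))

print('p = %.3e, sigma = %.2f, 5-sigma tail = %.3e' % (p, sigma, five_sigma))
```

Note that sigma conventions differ: some fields quote the two-sided equivalent (`norm.isf(p / 2)`), which gives a slightly larger number for the same p-value.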