Title: | Multinomial Goodness-of-Fit Tests |
---|---|
Description: | Computes exact p-values for multinomial goodness-of-fit tests based on multiple test statistics, namely, Pearson's chi-square, the log-likelihood ratio and the probability mass statistic. Implements the algorithm detailed in Resin (2020) <arXiv:2008.12682>. Estimates based on the classical asymptotic chi-square approximation or Monte-Carlo simulation can also be computed. |
Authors: | Johannes Resin |
Maintainer: | Johannes Resin <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.2 |
Built: | 2024-10-22 03:28:47 UTC |
Source: | https://github.com/cran/ExactMultinom |
C++ function computing exact multinomial p-values. Does not perform any safety checks. Incorrect input may result in unwanted behavior.
Use only through multinom.test
with method = "exact" is recommended.
multinom_test_cpp(x, p, theta = 1e-04)
multinom_test_cpp(x, p, theta = 1e-04)
x |
Vector of non-negative integers - number of times each outcome was observed. |
p |
A vector of positive numbers - the hypothesized probabilities for each outcome. Need to sum to 1! |
theta |
Parameter - p-values less than theta will not be determined precisely. |
The outcomes should be ordered by the hypothesized probabilities from largest to smallest for optimal performance.
Returns a vector containing three values which are the p-values computed from the probability mass, Pearson's chi-square and the log-likelihood ratio statistic. Values below the threshold theta are upper bounds only and not exact p-values!
Computes exact p-values for multinomial goodness-of-fit tests based on multiple test statistics, namely, Pearson's chi-square, the log-likelihood ratio and the probability mass statistic. Implements the algorithm detailed in Resin (2020). Estimates based on the classical asymptotic chi-square approximation or Monte-Carlo simulation can also be computed.
multinom.test( x, p, stat = "Prob", method = "exact", theta = 1e-04, timelimit = 10, N = 10000 )
multinom.test( x, p, stat = "Prob", method = "exact", theta = 1e-04, timelimit = 10, N = 10000 )
x |
Vector of non-negative integers - number of times each outcome was observed. |
p |
A vector of positive numbers - the hypothesized probabilities for each outcome. Need not sum to 1, but only encode hypothesized proportions correctly. |
stat |
Test statistic to be used. Can be "Prob" for the probability mass, "Chisq" for Pearson's chi-square and "LLR" for the log-likelihood ratio. |
method |
Method used to compute the p-values. Can be "exact", "asymptotic" and "Monte-Carlo". |
theta |
Parameter used with the exact method. p-values less than theta will not be determined precisely. Values >= 10^-8 are recommended. Very small p-values (< 10^-8) may be falsified by rounding errors. |
timelimit |
Time limit in seconds, after which the exact method is interrupted to avoid long runtime. |
N |
Number of samples generated by the Monte-Carlo approach. |
The "exact" method implements the algorithm detailed in Resin (2020). The method improves on the full enumeration implemented in some other R packages.
It should work well if the number of categories is small. To avoid long runtimes the exact computation is interrupted after timelimit
seconds.
However, it may take longer than the specified time limit for the actual interrupt to occur. Only p-values greater theta
are determined precisely.
For p-values less than theta
, the algorithm will only determine that the p-value is smaller than theta
.
The "asymptotic" method returns classical chi-square approximations. The asymptotic approximation to the probability mass statistic is detailed in Section 2 of Resin (2020).
The "Monte-Carlo" method returns estimates based on N
random draws from the hypothesized distribution.
A list with class "mgof" containing
x |
As input by user. |
p |
As input by user. |
stat |
As input by user or default. |
method |
As input by user or default. |
theta |
As input by user or default. |
pvals_ex |
Exact p-values or NA. WARNING: Values less than theta are NOT exact p-values, but only imply that the actual p-value is less than that value. |
pvals_as |
Asymptotic approximation to p-values or NA. |
pvals_mc |
Monte-Carlo estimated p-values or NA. |
The first p-value (e.g., pvals_ex[1]
) is obtained from the probability mass, the second from Pearson's chi-square and the third from the log-likelihood ratio.
Each method computes p-values for all three test statistics simultaneously.
Resin, J. (2020), A Simple Algorithm for Exact Multinomial Tests, Preprint https://arxiv.org/abs/2008.12682
# Test fairness of a die (that is, whether each side has the same probability) p_fair = rep(1/6,6) # Hypothesized probabilities for each side x = c(16,17,12,15,15,25) # Observed number of times each side appeared on 100 throws # Exact multinomial test (using probability ordering by default): multinom.test(x,p_fair) # Exact multinomial test using log-likelihood ratio: multinom.test(x,p_fair,stat = "LLR") # Classical chi-square test (using asymptotics to estimate p-value and Pearson's chi-square): multinom.test(x,p_fair,stat = "Chisq",method = "asymptotic") # Using Monte-Carlo approach and probability ordering multinom.test(x,p_fair,stat = "Prob",method = "Monte-Carlo")
# Test fairness of a die (that is, whether each side has the same probability) p_fair = rep(1/6,6) # Hypothesized probabilities for each side x = c(16,17,12,15,15,25) # Observed number of times each side appeared on 100 throws # Exact multinomial test (using probability ordering by default): multinom.test(x,p_fair) # Exact multinomial test using log-likelihood ratio: multinom.test(x,p_fair,stat = "LLR") # Classical chi-square test (using asymptotics to estimate p-value and Pearson's chi-square): multinom.test(x,p_fair,stat = "Chisq",method = "asymptotic") # Using Monte-Carlo approach and probability ordering multinom.test(x,p_fair,stat = "Prob",method = "Monte-Carlo")