What happens when p > n? How do we estimate
models in this case?
What about statistical inference on the parameters and
their p-values?
When there are more parameters than observations
(say, in genomics), we need to be careful when fitting models.
We need to use the form of the pseudoinverse that is valid
for p > n, i.e.:
\hat{\beta} = X^T (X X^T)^{-1} y
We can once again solve this via Cholesky
factorization.
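As a minimal sketch of the idea above, assuming X has full row rank (so the n x n matrix X X^T is positive definite) and assuming the intended estimate is the minimum-norm solution beta = X^T (X X^T)^{-1} y: the function name min_norm_lstsq and the random test data are illustrative, not from the notes.

```python
import numpy as np

def min_norm_lstsq(X, y):
    """Minimum-norm least squares for p > n, via Cholesky of X X^T.

    Assumes X (n x p, with p > n) has full row rank.
    """
    G = X @ X.T                       # n x n Gram matrix, small when n << p
    L = np.linalg.cholesky(G)         # G = L L^T, L lower triangular
    # Two solves with the Cholesky factors: L z = y, then L^T alpha = z.
    # (np.linalg.solve is a general solver; a triangular solver such as
    # scipy.linalg.solve_triangular would exploit the structure.)
    z = np.linalg.solve(L, y)
    alpha = np.linalg.solve(L.T, z)
    return X.T @ alpha                # beta = X^T (X X^T)^{-1} y

# Illustrative data: 5 observations, 20 parameters.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
y = rng.standard_normal(5)
beta = min_norm_lstsq(X, y)
```

With p > n the fit interpolates the data (X @ beta reproduces y), and the result should agree with np.linalg.pinv(X) @ y up to rounding.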
We saw in the Linear Regression optimization notes
that using the modified LM algorithm is necessary for
ill-conditioned matrices.
|
Likewise, extending this to underdetermined systems:
|
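The following is a hedged sketch only. It assumes "LM" refers to Levenberg-Marquardt-style damping, i.e. adding a multiple of the identity to the matrix before factorizing, and it does not reproduce the exact modification used in these notes. The function name damped_min_norm, the parameters lam, growth and max_tries, and the retry schedule are all illustrative choices.

```python
import numpy as np

def damped_min_norm(X, y, lam=0.0, growth=10.0, max_tries=20):
    """Solve (X X^T + lam I) alpha = y, then return beta = X^T alpha.

    Assumed LM-style scheme: if Cholesky fails, increase lam and retry.
    Returns (beta, lam_used).
    """
    n = X.shape[0]
    G = X @ X.T
    scale = np.trace(G) / n            # typical diagonal size, sets first lam
    for _ in range(max_tries):
        try:
            L = np.linalg.cholesky(G + lam * np.eye(n))
            break                      # factorization succeeded
        except np.linalg.LinAlgError:
            # Not positive definite: start from a tiny damping, then grow it.
            lam = max(lam * growth, 1e-10 * scale)
    else:
        raise np.linalg.LinAlgError("damping did not make the matrix positive definite")
    z = np.linalg.solve(L, y)
    alpha = np.linalg.solve(L.T, z)
    return X.T @ alpha, lam

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
y = rng.standard_normal(5)

# Well-conditioned case: no damping should be needed.
beta_ok, lam_ok = damped_min_norm(X, y)

# Rank-deficient case: an all-zero row makes X X^T singular.
X_bad = X.copy()
X_bad[0] = 0.0
beta_bad, lam_bad = damped_min_norm(X_bad, y)
```

In the rank-deficient case the undamped Cholesky factorization fails, so the sketch falls back to a small positive lam; the zero row's observation cannot be fit, while the remaining rows are still matched closely.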
However, for GLMs, we used to have:
|
For underdetermined systems, we then have:
|
However, for underdetermined systems, we MUST use the modified LM
algorithm for GLMs;
otherwise non-convergence is seen, i.e.:
|
So in steps:
|
Copyright Daniel Han 2024. Check out Unsloth!