mst.mle package:sn R Documentation
Maximum likelihood estimation for a (multivariate) skew-t distribution
Description:
Fits a skew-t (ST) or multivariate skew-t (MST) distribution to
data, or fits a linear regression model with (multivariate) skew-t
errors, using maximum likelihood estimation.
Usage:
mst.mle(X, y, freq, start, fixed.df=NA, trace=FALSE,
algorithm = c("nlminb","Nelder-Mead", "BFGS", "CG", "SANN"), control=list())
st.mle(X, y, freq, start, fixed.df=NA, trace=FALSE,
algorithm = c("nlminb","Nelder-Mead", "BFGS", "CG", "SANN"), control=list())
Arguments:
y: a matrix (for ‘mst.mle’) or a vector (for ‘st.mle’). If ‘y’
is a matrix, rows refer to observations, and columns to
components of the multivariate distribution.
X: a matrix of covariate values. If missing, a one-column
matrix of 1's is created; otherwise, it must have the same
number of rows as ‘y’. If ‘X’ is supplied, then it must
include a column of 1's.
freq: a vector of weights. If missing, a vector of 1's is created;
otherwise it must have length equal to the number of rows of
‘y’.
start: for ‘mst.mle’, a list containing the components
‘beta’, ‘Omega’, ‘alpha’, ‘df’ of the type described below;
for ‘st.mle’, a vector containing the analogous quantities,
except that the scale parameter is the square root of
‘Omega’. In both cases, the ‘dp’ component of the list
returned by a previous call has the required format and can
be used as a new ‘start’. If ‘start’ is missing, initial
values are selected by the function.
fixed.df: a scalar value containing the degrees of freedom (df), if
these must be taken as fixed, or ‘NA’ (default value) if ‘df’
is a parameter to be estimated.
trace: logical value which controls printing of the algorithm
convergence. If ‘trace=TRUE’, details are printed. Default
value is ‘FALSE’.
algorithm: a character string which selects the numerical optimization
procedure used to maximize the log-likelihood function. If
this string is set equal to ‘"nlminb"’, then ‘nlminb’ is
called; in all other cases, ‘optim’ is called, with ‘method’
set equal to the given string. Default value is ‘"nlminb"’.
control: a list passed to the chosen optimizer, either
‘nlminb’ or ‘optim’; see the documentation of the relevant
function for its usage.
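The way ‘algorithm’ and ‘control’ select and configure the optimizer can be sketched in base R as follows. This is only an illustration of the dispatch rule described above, not the package's internal code: the helper ‘run_optimizer’ and the toy objective are hypothetical, and a simple quadratic stands in for the negative log-likelihood.

```r
# Hypothetical helper: call nlminb() for "nlminb", otherwise optim() with
# 'method' set to the given string; 'control' is passed through unchanged.
run_optimizer <- function(par, fn, gr = NULL,
                          algorithm = c("nlminb", "Nelder-Mead", "BFGS", "CG", "SANN"),
                          control = list()) {
  algorithm <- match.arg(algorithm)   # first entry is the default
  if (algorithm == "nlminb") {
    out <- nlminb(start = par, objective = fn, gradient = gr, control = control)
    list(par = out$par, value = out$objective, name = algorithm)
  } else {
    out <- optim(par = par, fn = fn, gr = gr, method = algorithm, control = control)
    list(par = out$par, value = out$value, name = algorithm)
  }
}

# Toy objective (stand-in for a negative log-likelihood) and its gradient.
f <- function(p) sum((p - c(1, 2))^2)
g <- function(p) 2 * (p - c(1, 2))

r1 <- run_optimizer(c(0, 0), f, g)                       # uses nlminb
r2 <- run_optimizer(c(0, 0), f, g, algorithm = "BFGS")   # uses optim(method = "BFGS")
```

Note that the valid entries of ‘control’ differ between the two optimizers (for instance, ‘x.tol’ is an ‘nlminb’ setting), so the list has to match the algorithm selected.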
Details:
If ‘y’ is a vector and it is supplied to ‘mst.mle’, then it is
converted to a one-column matrix, and a scalar skew-t distribution
is fitted. This is also the mechanism used by ‘st.mle’, which is
simply an interface to ‘mst.mle’.
The parameter ‘freq’ is intended for use with grouped data,
setting the values of ‘y’ equal to the central values of the
cells; in this case the resulting estimate is an approximation to
the exact maximum likelihood estimate. If ‘freq’ is not set, exact
maximum likelihood estimation is performed.
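The role of ‘freq’ with grouped data can be illustrated with a small base R sketch. The counts and midpoints are made up, and a normal density is used as a stand-in for the skew-t density purely to keep the example short; the point is only that each cell midpoint contributes its log-density weighted by the cell count.

```r
# Cell midpoints and counts for grouped data (made-up values).
mid  <- c(-2, -1, 0, 1, 2)
freq <- c(3, 12, 20, 11, 4)

# Negative weighted log-likelihood. Assumption: a normal density replaces
# the skew-t density; p[2] is the log of the standard deviation.
negloglik <- function(p) {
  -sum(freq * dnorm(mid, mean = p[1], sd = exp(p[2]), log = TRUE))
}

fit <- nlminb(c(0, 0), negloglik)
est <- c(mean = fit$par[1], sd = exp(fit$par[2]))

# Weighting by freq is equivalent to repeating each midpoint freq times.
y_expanded <- rep(mid, times = freq)
```

Because every observation in a cell is replaced by the cell midpoint, the result approximates the estimate that would be obtained from the ungrouped data.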
Numerical search for the maximum likelihood estimates is performed
in a suitable re-parameterization of the original parameters, with
the aid of the selected optimizer (‘nlminb’ or ‘optim’), which is
supplied with the derivatives of the log-likelihood function.
Notice that, when the optimizer is ‘optim’, the gradient may
or may not be used, depending on which specific method has been
selected. On exit from the optimizer, an inverse transformation
of the parameters is performed. For a detailed description of the
re-parameterization adopted, see Section 5.1 and Appendix B of
Azzalini & Capitanio (2003).
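The general idea of optimizing over working parameters and transforming back on exit can be sketched in base R. The example below is an assumption-laden illustration, not the re-parameterization of Azzalini & Capitanio (2003): it fits a symmetric location-scale Student's t to simulated data, using log-transformed scale and degrees of freedom so that the optimizer works on an unconstrained space. No gradient is supplied here, so ‘nlminb’ falls back on numerical derivatives.

```r
set.seed(1)
y <- 3 + 2 * rt(500, df = 5)   # simulated data: location 3, scale 2, 5 df

# Working parameters w = (mu, log(sigma), log(df)) are unconstrained.
negloglik <- function(w) {
  mu <- w[1]; sigma <- exp(w[2]); df <- exp(w[3])
  -sum(dt((y - mu) / sigma, df = df, log = TRUE) - log(sigma))
}

w0  <- c(median(y), log(mad(y)), log(10))   # rough starting values
fit <- nlminb(w0, negloglik)

# Inverse transformation back to the original parameters.
est <- c(mu = fit$par[1], sigma = exp(fit$par[2]), df = exp(fit$par[3]))
```

Working on the log scale guarantees that the back-transformed scale and degrees of freedom are positive without imposing explicit bounds on the optimizer.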
Value:
A list containing the following components:
call: a string containing the calling statement.
dp: for ‘mst.mle’, this is a list containing the direct
parameters ‘beta’, ‘Omega’, ‘alpha’. Here, ‘beta’ is a matrix
of regression coefficients with
‘dim(beta)=c(ncol(X),ncol(y))’, ‘Omega’ is a covariance
matrix of order ‘ncol(y)’, ‘alpha’ is a vector of shape
parameters of length ‘ncol(y)’. For ‘st.mle’, ‘dp’ is a
vector of length ‘ncol(X)+3’, containing ‘c(beta, omega,
alpha, df)’, where ‘omega’ is the square root of ‘Omega’.
se: a list containing the components ‘beta’, ‘alpha’, ‘info’.
Here, ‘beta’ and ‘alpha’ are the standard errors for the
corresponding point estimates; ‘info’ is the observed
information matrix for the working parameters, that is, the
re-parameterization described under Details.
algorithm: the list returned by the chosen optimizer, either ‘nlminb’ or
‘optim’, plus an item with the ‘name’ of the selected
algorithm; see the documentation of either ‘nlminb’ or
‘optim’ for explanation of the other components.
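For ‘st.mle’, the layout of the ‘dp’ vector described above, ‘c(beta, omega, alpha, df)’ with length ‘ncol(X)+3’, can be unpacked as in the following sketch. The helper ‘split_st_dp’ is hypothetical and the numbers are made up; the code only illustrates the indexing implied by the description.

```r
# Hypothetical helper: split a 'dp' vector of the form c(beta, omega, alpha, df),
# where p = ncol(X) is the number of regression coefficients.
split_st_dp <- function(dp, p) {
  stopifnot(length(dp) == p + 3)
  list(beta  = dp[seq_len(p)],
       omega = dp[p + 1],
       alpha = dp[p + 2],
       df    = dp[p + 3])
}

dp    <- c(20.1, 0.3, 2.5, 1.2, 6.0)   # made-up values, with p = 2
parts <- split_st_dp(dp, p = 2)

# 'omega' is the square root of 'Omega', so squaring recovers 'Omega'.
Omega <- parts$omega^2
```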
Background:
The family of multivariate skew-t distributions is an extension of
the multivariate Student's t family, via the introduction of a
‘shape’ parameter which regulates skewness; when ‘shape=0’, the
skew-t distribution reduces to the usual t distribution. When
‘df=Inf’ the distribution reduces to the multivariate skew-normal
one; see ‘dmsn’. See the reference below for additional
information.
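The two limiting cases can be checked numerically in the scalar case. The sketch below codes the standardized univariate skew-t density in the form given by Azzalini & Capitanio (2003), with location 0 and scale 1, using base R only; the function name ‘dst_sketch’ is hypothetical and is not part of the package.

```r
# Standardized univariate skew-t density (location 0, scale 1):
#   2 * t(x; df) * T(alpha * x * sqrt((df + 1) / (x^2 + df)); df + 1)
# where t and T are the Student's t density and distribution function.
dst_sketch <- function(x, alpha = 0, df = Inf) {
  if (is.infinite(df)) {
    return(2 * dnorm(x) * pnorm(alpha * x))   # skew-normal limit
  }
  2 * dt(x, df) * pt(alpha * x * sqrt((df + 1) / (x^2 + df)), df + 1)
}

x <- seq(-4, 4, by = 0.5)

# With alpha = 0 the density coincides with the usual t density.
same_as_t <- all.equal(dst_sketch(x, alpha = 0, df = 5), dt(x, 5))

# With very large df the density approaches the skew-normal one.
gap_to_sn <- max(abs(dst_sketch(x, alpha = 2, df = 1e6) -
                     dst_sketch(x, alpha = 2, df = Inf)))

# The density integrates to 1 for a nonzero shape parameter as well.
total <- integrate(dst_sketch, -Inf, Inf, alpha = 3, df = 5)$value
```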
References:
Azzalini, A. and Capitanio, A. (2003). Distributions generated by
perturbation of symmetry with emphasis on a multivariate skew _t_
distribution. The full version of the paper published in abridged
form in _J. Roy. Statist. Soc. B_ *65*, 367-389, is available at
See Also:
‘dmst’, ‘msn.mle’, ‘mst.fit’, ‘nlminb’, ‘optim’
Examples:
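data(ais, package="sn")
attach(ais)
# univariate regression of bmi on lbm and sex, with skew-t errors
X.mat <- model.matrix(~lbm+sex)
b <- st.mle(X.mat, bmi)
#
# bivariate skew-t distribution fitted to (Ht, Wt), no covariates
b2 <- mst.mle(y=cbind(Ht,Wt))
#
# a multiple regression case (two covariates, univariate response):
a <- mst.mle(X=cbind(1,Ht,Wt), y=bmi, control=list(x.tol=1e-6))
#
# refine the previous outcome, restarting from its estimates
a1 <- mst.mle(X=cbind(1,Ht,Wt), y=bmi, control=list(x.tol=1e-9), start=a$dp)
detach(ais)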