sadII_lab10_hmm


# Hidden Markov Models
# http://cran.r-project.org/web/packages/HMM/index.html
library(HMM)
       
# Initialise HMM
hmm = initHMM(c("A","B"), c("L","R"),
        transProbs=matrix(c(.8,.2,.2,.8),2),
        emissionProbs=matrix(c(.6,.4,.4,.6),2))
print(hmm)
       
$States
[1] "A" "B"

$Symbols
[1] "L" "R"

$startProbs
  A   B 
0.5 0.5 

$transProbs
    to
from   A   B
   A 0.8 0.2
   B 0.2 0.8

$emissionProbs
      symbols
states   L   R
     A 0.6 0.4
     B 0.4 0.6
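
As a quick sanity check of the matrices above, the same HMM package provides simHMM for drawing a random state/observation sequence from the model (a minimal sketch; the sequence length 10 is an arbitrary choice):

# Simulate 10 steps from the HMM defined above; simHMM returns a list
# with $states (the hidden path) and $observation (the emitted symbols)
sim = simHMM(hmm, 10)
print(sim$states)
print(sim$observation)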
# Sequence of observations
observations = c("L","L","R","R")
# Calculate forward probabilities
logForwardProbabilities = forward(hmm,observations)
print(exp(logForwardProbabilities))
?forward
       
      index
states   1     2      3        4
     A 0.3 0.168 0.0608 0.024448
     B 0.2 0.088 0.0624 0.037248
forward                  package:HMM                   R Documentation

Computes the forward probabilities

Description:

     The ‘forward’-function computes the forward probabilities. The
     forward probability for state X up to observation at time k is
     defined as the probability of observing the sequence of
     observations e_1, ..., e_k and that the state at time k is X.
     That is:
     ‘f[X,k] := Prob(E_1 = e_1, ..., E_k = e_k, X_k = X)’.
     Where ‘E_1...E_n = e_1...e_n’ is the sequence of observed
     emissions and ‘X_k’ is a random variable that represents the
     state at time ‘k’.

Usage:

     forward(hmm, observation)

Arguments:

hmm : A Hidden Markov Model.

observation : A sequence of observations.

Format:

     Dimension and Format of the Arguments.

     hmm A valid Hidden Markov Model, for example instantiated by
          ‘initHMM’.

     observation A vector of strings with the observations.

Value:

     Return Value:

forward : A matrix containing the forward probabilities. The
          probabilities are given on a logarithmic scale (natural
          logarithm). The first dimension refers to the state and
          the second dimension to time.

Author(s):

     Lin Himmelmann <hmm@linhi.com>, Scientific Software Development

References:

     Lawrence R. Rabiner: A Tutorial on Hidden Markov Models and
     Selected Applications in Speech Recognition. Proceedings of the
     IEEE 77(2) p.257-286, 1989.

See Also:

     See ‘backward’ for computing the backward probabilities.

Examples:

     # Initialise HMM
     hmm = initHMM(c("A","B"), c("L","R"),
             transProbs=matrix(c(.8,.2,.2,.8),2),
             emissionProbs=matrix(c(.6,.4,.4,.6),2))
     print(hmm)
     # Sequence of observations
     observations = c("L","L","R","R")
     # Calculate forward probabilities
     logForwardProbabilities = forward(hmm,observations)
     print(exp(logForwardProbabilities))
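
A side note on the forward output above: since ‘f[X,k] = Prob(E_1 = e_1, ..., E_k = e_k, X_k = X)’, summing the last column of the (exponentiated) forward matrix gives the likelihood of the whole observation sequence. A minimal sketch with the hmm and observations already defined:

# Likelihood of the observed sequence = sum of the forward
# probabilities at the final time step, back on the natural scale
f = exp(forward(hmm, observations))
print(sum(f[, length(observations)]))   # Prob(E_1 = e_1, ..., E_4 = e_4)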
     
 
       
posterior(hmm, observations)
?posterior
# This function computes the posterior probabilities of being in
# state X at time k for a given sequence of observations and a given
# Hidden Markov Model.
       
      index
states         1       2       3         4
     A 0.6037344 0.56639 0.43361 0.3962656
     B 0.3962656 0.43361 0.56639 0.6037344
posterior                 package:HMM                  R Documentation

Computes the posterior probabilities for the states

Description:

     This function computes the posterior probabilities of being in
     state X at time k for a given sequence of observations and a
     given Hidden Markov Model.

Usage:

     posterior(hmm, observation)

Arguments:

hmm : A Hidden Markov Model.

observation : A sequence of observations.

Format:

     Dimension and Format of the Arguments.

     hmm A valid Hidden Markov Model, for example instantiated by
          ‘initHMM’.

     observation A vector of observations.

Details:

     The posterior probability of being in a state X at time k can be
     computed from the ‘forward’ and ‘backward’ probabilities:
     ‘Prob(X_k = X | E_1 = e_1, ..., E_n = e_n) = f[X,k] * b[X,k] /
     Prob(E_1 = e_1, ..., E_n = e_n)’
     Where ‘E_1...E_n = e_1...e_n’ is the sequence of observed
     emissions and ‘X_k’ is a random variable that represents the
     state at time ‘k’.

Value:

     Return Values:

posterior : A matrix containing the posterior probabilities. The
          first dimension refers to the state and the second
          dimension to time.

Author(s):

     Lin Himmelmann <hmm@linhi.com>, Scientific Software Development

References:

     Lawrence R. Rabiner: A Tutorial on Hidden Markov Models and
     Selected Applications in Speech Recognition. Proceedings of the
     IEEE 77(2) p.257-286, 1989.

See Also:

     See ‘forward’ for computing the forward probabilities and
     ‘backward’ for computing the backward probabilities.

Examples:

     # Initialise HMM
     hmm = initHMM(c("A","B"), c("L","R"),
             transProbs=matrix(c(.8,.2,.2,.8),2),
             emissionProbs=matrix(c(.6,.4,.4,.6),2))
     print(hmm)
     # Sequence of observations
     observations = c("L","L","R","R")
     # Calculate posterior probabilities of the states
     posterior = posterior(hmm,observations)
     print(posterior)
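
The formula in the Details section can be checked directly: combining ‘forward’ with ‘backward’ (also from the HMM package) reproduces the posterior matrix. A sketch using the hmm and observations defined earlier:

# posterior[X,k] = f[X,k] * b[X,k] / Prob(E_1 = e_1, ..., E_n = e_n)
f = exp(forward(hmm, observations))
b = exp(backward(hmm, observations))
probSeq = sum(f[, ncol(f)])    # likelihood of the whole sequence
print(f * b / probSeq)         # should match posterior(hmm, observations)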
     
# Initialise HMM
hmm = initHMM(c("A","B"), c("L","R"),
        transProbs=matrix(c(.6,.4,.4,.6),2),
        emissionProbs=matrix(c(.6,.4,.4,.6),2))
print(hmm)
# Sequence of observations
observations = c("L","L","R","R")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
       
$States
[1] "A" "B"

$Symbols
[1] "L" "R"

$startProbs
  A   B 
0.5 0.5 

$transProbs
    to
from   A   B
   A 0.6 0.4
   B 0.4 0.6

$emissionProbs
      symbols
states   L   R
     A 0.6 0.4
     B 0.4 0.6
print(viterbi) 
       
[1] "A" "A" "B" "B"
?viterbi
# The Viterbi-algorithm computes the most probable path of states
# for a sequence of observations for a given Hidden Markov Model.
       
viterbi                  package:HMM                   R Documentation

Computes the most probable path of states

Description:

     The Viterbi-algorithm computes the most probable path of states
     for a sequence of observations for a given Hidden Markov Model.

Usage:

     viterbi(hmm, observation)

Arguments:

hmm : A Hidden Markov Model.

observation : A sequence of observations.

Format:

     Dimension and Format of the Arguments.

     hmm A valid Hidden Markov Model, for example instantiated by
          ‘initHMM’.

     observation A vector of observations.

Value:

     Return Value:

viterbiPath : A vector of strings, containing the most probable path
          of states.

Author(s):

     Lin Himmelmann <hmm@linhi.com>, Scientific Software Development

References:

     Lawrence R. Rabiner: A Tutorial on Hidden Markov Models and
     Selected Applications in Speech Recognition. Proceedings of the
     IEEE 77(2) p.257-286, 1989.

Examples:

     # Initialise HMM
     hmm = initHMM(c("A","B"), c("L","R"),
             transProbs=matrix(c(.6,.4,.4,.6),2),
             emissionProbs=matrix(c(.6,.4,.4,.6),2))
     print(hmm)
     # Sequence of observations
     observations = c("L","L","R","R")
     # Calculate Viterbi path
     viterbi = viterbi(hmm,observations)
     print(viterbi)
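
Note that the Viterbi path maximises the probability of the state path as a whole, while posterior decoding picks the single most probable state at each time step; the two answers can disagree. A short sketch comparing them for the model just defined:

# Posterior decoding: take the argmax over states in each column
post = posterior(hmm, observations)
postPath = rownames(post)[apply(post, 2, which.max)]
print(postPath)   # compare with the Viterbi path printed above
print(viterbi)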
     
%r
download.file("http://sage.mimuw.edu.pl/home/pub/64/cells/88/jap.nucl.Rdata",
        "data/jap.nucl.Rdata", "wget")
       
--2012-04-24 12:42:46--
http://sage.mimuw.edu.pl/home/pub/64/cells/88/jap.nucl.Rdata
Resolving sage.mimuw.edu.pl... 193.0.109.27
Connecting to sage.mimuw.edu.pl|193.0.109.27|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1770659 (1.7M) [text/plain]
Saving to: `data/jap.nucl.Rdata'

100%[======================================>] 1,770,659   --.-K/s
in 0.009s

2012-04-24 12:42:46 (185 MB/s) - `data/jap.nucl.Rdata' saved
[1770659/1770659]
load("data/jap.nucl.Rdata")
ls()
       
 [1] "fit1"                    "fit2"                    "fit3"     

 [4] "fit4"                    "fit5"                    "fit6"     

 [7] "fit7"                    "hmm"                    
"logForwardProbabilities"
[10] "nucl.df"                 "observations"            "viterbi"  
head(nucl.df) 
       
  akt nast nnast akt.pir nast.pir nnast.pir nnnast nnnast.pir
1   G    A     A   FALSE    FALSE     FALSE      C       TRUE
2   A    A     C   FALSE    FALSE      TRUE      C       TRUE
3   A    C     C   FALSE     TRUE      TRUE      C       TRUE
4   C    C     C    TRUE     TRUE      TRUE      T       TRUE
5   C    C     T    TRUE     TRUE      TRUE      A      FALSE
6   C    T     A    TRUE     TRUE     FALSE      A      FALSE
# Train a 1-, 2-, 3- and more-state HMM on a part of the nucl.df data
# (head... 1000?) - the more states, the smaller the portion of data;
# then try baumWelch(hmm, observation,...
# (a sketch follows after the help page below)
?baumWelch
       
baumWelch                 package:HMM                  R Documentation

Inferring the parameters of a Hidden Markov Model via the Baum-Welch algorithm

Description:

     For an initial Hidden Markov Model (HMM) and a given sequence of
     observations, the Baum-Welch algorithm infers optimal parameters
     to the HMM. Since the Baum-Welch algorithm is a variant of the
     Expectation-Maximisation algorithm, the algorithm converges to a
     local solution which might not be the global optimum.

Usage:

     baumWelch(hmm, observation, maxIterations=100, delta=1E-9, pseudoCount=0)

Arguments:

hmm : A Hidden Markov Model.

observation : A sequence of observations.

maxIterations : The maximum number of iterations in the Baum-Welch
          algorithm.

delta : Additional termination condition, if the transition and
          emission matrices converge, before reaching the maximum
          number of iterations (‘maxIterations’). The difference of
          transition and emission parameters in consecutive
          iterations must be smaller than ‘delta’ to terminate the
          algorithm.

pseudoCount : Adding this amount of pseudo counts in the
          estimation-step of the Baum-Welch algorithm.

Format:

     Dimension and Format of the Arguments.

     hmm A valid Hidden Markov Model, for example instantiated by
          ‘initHMM’.

     observation A vector of observations.

Value:

     Return Values:

hmm : The inferred HMM. The representation is equivalent to the
          representation in ‘initHMM’.

difference : Vector of differences calculated from consecutive
          transition and emission matrices in each iteration of the
          Baum-Welch procedure. The difference is the sum of the
          distances between consecutive transition and emission
          matrices in the L2-Norm.

Author(s):

     Lin Himmelmann <hmm@linhi.com>, Scientific Software Development

References:

     For details see: Lawrence R. Rabiner: A Tutorial on Hidden
     Markov Models and Selected Applications in Speech Recognition.
     Proceedings of the IEEE 77(2) p.257-286, 1989.

See Also:

     See ‘viterbiTraining’.

Examples:

     # Initial HMM
     hmm = initHMM(c("A","B"),c("L","R"),
             transProbs=matrix(c(.9,.1,.1,.9),2),
             emissionProbs=matrix(c(.5,.51,.5,.49),2))
     print(hmm)
     # Sequence of observation
     a = sample(c(rep("L",100),rep("R",300)))
     b = sample(c(rep("L",300),rep("R",100)))
     observation = c(a,b)
     # Baum-Welch
     bw = baumWelch(hmm,observation,10)
     print(bw$hmm)
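
A possible starting point for the exercise above, assuming the nucleotide symbols live in nucl.df$akt (as the head() output suggests). The state names "S1"/"S2", the slightly asymmetric starting matrices (Baum-Welch cannot move away from a perfectly symmetric initialisation), and the 1000-observation cutoff are illustrative choices, not part of the worksheet:

# Train a 2-state HMM on the first 1000 nucleotides
obs = as.character(head(nucl.df$akt, 1000))
hmm2 = initHMM(c("S1","S2"), c("A","C","G","T"),
        transProbs=matrix(c(.9,.1,.1,.9),2),
        emissionProbs=matrix(c(.3,.2,.3,.2,.2,.3,.2,.3),2))
bw2 = baumWelch(hmm2, obs, maxIterations=50)
print(bw2$hmm$transProbs)
print(bw2$hmm$emissionProbs)
plot(bw2$difference, type="l")   # watch convergence of the estimates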