sadII_lab13_enrichment

2215 days ago by ab277290

# Slow - already installed
#source("http://bioconductor.org/biocLite.R")
#biocLite(c("GOSim","DOSim"))
       
# Slow...
library(GOSim)
       
Loading required package: GO.db
Loading required package: AnnotationDbi
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'browseVignettes()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation("pkgname")'.

Loading required package: DBI

Loading required package: annotate
Loading required package: topGO
Loading required package: graph
Loading required package: SparseM
Package SparseM (0.96) loaded.
	   To cite, see citation("SparseM")


Attaching package: ‘SparseM’

The following object(s) are masked from ‘package:base’:

    backsolve


groupGOTerms: 	GOBPTerm, GOMFTerm, GOCCTerm environments built.
Loading required package: cluster
Loading required package: flexmix
Loading required package: lattice
Loading required package: modeltools
Loading required package: stats4
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
Loading required package: RBGL
Loading required package: Matrix

Attaching package: ‘Matrix’

The following object(s) are masked from ‘package:SparseM’:

    det

The following object(s) are masked from ‘package:base’:

    det

Loading required package: corpcor
Loading required package: org.Hs.eg.db

[1] "initializing GOSim package ..."
[1] "-> retrieving GO information for all available genes for organism 'human' in GO database"
[1] "-> filtering GO terms according to evidence levels 'all'"
[1] "-> loading files with information content for corresponding GO category (human)"
[1] "finished."
# Examples from
# http://cran.r-project.org/web/packages/GOSim/vignettes/GOSim.pdf
       
# We create a character vector of Entrez gene IDs, which we assume to be from human:
genes=c("207","208","596","901","780","3169","9518","2852","26353","8614","7494")
       
ginf<-getGOInfo(genes)
dim(ginf)
colnames(ginf)
rownames(ginf)
       
[1]  4 11
 [1] "207"   "208"   "596"   "901"   "780"   "3169"  "9518"  "2852"  "26353" "8614"  "7494"
[1] "go_id"      "Term"       "Definition" "IC"        
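
The dimensions and names above suggest that `ginf` is a list matrix: one row per field (`go_id`, `Term`, `Definition`, `IC`), one column per gene, with a whole vector stored in each cell. The following base-R sketch builds a toy object of that shape to show why column indexing followed by `$` works. The object `toy_ginf` and all values in it are illustrative assumptions, not real GOSim output.

```r
# Toy stand-in for the getGOInfo() result: a 4 x 2 list matrix.
# All values are illustrative placeholders, not real GOSim output.
cells <- list(
  c("GO:0006915", "GO:0007165"),           # go_id for gene "207"
  c("apoptosis", "signal transduction"),   # Term for gene "207"
  c("(definition 1)", "(definition 2)"),   # Definition for gene "207"
  c(0.31, 0.20),                           # IC for gene "207"
  "GO:0006468",                            # go_id for gene "208"
  "protein phosphorylation",               # Term for gene "208"
  "(definition 3)",                        # Definition for gene "208"
  0.34                                     # IC for gene "208"
)

# matrix() fills column by column, so each gene occupies one column
toy_ginf <- matrix(cells, nrow = 4,
                   dimnames = list(c("go_id", "Term", "Definition", "IC"),
                                   c("207", "208")))

dim(toy_ginf)

# Selecting one column drops to a named list, so `$` works on it
first_gene <- toy_ginf[, "207"]
first_gene$go_id
first_gene$Term

# Selecting one row gives the same field for every gene
ic_all <- unlist(toy_ginf["IC", ])
ic_all
```

Indexing by column name (`"207"`) instead of position (`1`) makes it explicit which gene is being inspected.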
ginf[,1]$go_id
ginf[,1]$Term
       
 [1] "GO:0000060" "GO:0001893" "GO:0001934" "GO:0007596" "GO:0006915" "GO:0007165" "GO:0035556"
 [8] "GO:0016567" "GO:0005975" "GO:0005979" "GO:0006417" "GO:0006464" "GO:0006468" "GO:0006469"
[15] "GO:0006809" "GO:0006810" "GO:0006916" "GO:0006924" "GO:0006954" "GO:0007186" "GO:0007275"
[22] "GO:0007281" "GO:0007399" "GO:0008286" "GO:0008543" "GO:0008629" "GO:0008633" "GO:0008637"
[29] "GO:0008643" "GO:0009408" "GO:0010748" "GO:0010765" "GO:0010907" "GO:0010975" "GO:0016070"
[36] "GO:0016071" "GO:0046326" "GO:0046889" "GO:0016310" "GO:0018105" "GO:0048015" "GO:0030030"
[43] "GO:0030163" "GO:0030168" "GO:0030307" "GO:0030334" "GO:0031018" "GO:0031295" "GO:0031659"
[50] "GO:0031999" "GO:0032094" "GO:0032270" "GO:0032436" "GO:0032869" "GO:0032880" "GO:0033138"
[57] "GO:0034405" "GO:0042593" "GO:0042640" "GO:0043066" "GO:0043536" "GO:0045429" "GO:0045600"
[64] "GO:0045725" "GO:0045792" "GO:0045884" "GO:0045944" "GO:0046209" "GO:0046329" "GO:0046777"
[71] "GO:0048009" "GO:0048011" "GO:0050999" "GO:0051000" "GO:0051091" "GO:0051146" "GO:0060709"
[78] "GO:0060716" "GO:0070141" "GO:0071363" "GO:0090004"
 [1] "protein import into nucleus, translocation"
 [2] "maternal placenta development"
 [3] "positive regulation of protein phosphorylation"
 [4] "blood coagulation"
 [5] "apoptosis"
 [6] "signal transduction"
 [7] "intracellular signal transduction"
 [8] "protein ubiquitination"
 [9] "carbohydrate metabolic process"
[10] "regulation of glycogen biosynthetic process"
[11] "regulation of translation"
[12] "protein modification process"
[13] "protein phosphorylation"
[14] "negative regulation of protein kinase activity"
[15] "nitric oxide biosynthetic process"
[16] "transport"
[17] "anti-apoptosis"
[18] "activation-induced cell death of T cells"
[19] "inflammatory response"
[20] "G-protein coupled receptor protein signaling pathway"
[21] "multicellular organismal development"
[22] "germ cell development"
[23] "nervous system development"
[24] "insulin receptor signaling pathway"
[25] "fibroblast growth factor receptor signaling pathway"
[26] "induction of apoptosis by intracellular signals"
[27] "activation of pro-apoptotic gene products"
[28] "apoptotic mitochondrial changes"
[29] "carbohydrate transport"
[30] "response to heat"
[31] "negative regulation of plasma membrane long-chain fatty acid transport"
[32] "positive regulation of sodium ion transport"
[33] "positive regulation of glucose metabolic process"
[34] "regulation of neuron projection development"
[35] "RNA metabolic process"
[36] "mRNA metabolic process"
[37] "positive regulation of glucose import"
[38] "positive regulation of lipid biosynthetic process"
[39] "phosphorylation"
[40] "peptidyl-serine phosphorylation"
[41] "phosphatidylinositol-mediated signaling"
[42] "cell projection organization"
[43] "protein catabolic process"
[44] "platelet activation"
[45] "positive regulation of cell growth"
[46] "regulation of cell migration"
[47] "endocrine pancreas development"
[48] "T cell costimulation"
[49] "positive regulation of cyclin-dependent protein kinase activity involved in G1/S"
[50] "negative regulation of fatty acid beta-oxidation"
[51] "response to food"
[52] "positive regulation of cellular protein metabolic process"
[53] "positive regulation of proteasomal ubiquitin-dependent protein catabolic process"
[54] "cellular response to insulin stimulus"
[55] "regulation of protein localization"
[56] "positive regulation of peptidyl-serine phosphorylation"
[57] "response to fluid shear stress"
[58] "glucose homeostasis"
[59] "anagen"
[60] "negative regulation of apoptosis"
[61] "positive regulation of blood vessel endothelial cell migration"
[62] "positive regulation of nitric oxide biosynthetic process"
[63] "positive regulation of fat cell differentiation"
[64] "positive regulation of glycogen biosynthetic process"
[65] "negative regulation of cell size"
[66] "regulation of survival gene product expression"
[67] "positive regulation of transcription from RNA polymerase II promoter"
[68] "nitric oxide metabolic process"
[69] "negative regulation of JNK cascade"
[70] "protein autophosphorylation"
[71] "insulin-like growth factor receptor signaling pathway"
[72] "nerve growth factor receptor signaling pathway"
[73] "regulation of nitric-oxide synthase activity"
[74] "positive regulation of nitric-oxide synthase activity"
[75] "positive regulation of sequence-specific DNA binding transcription factor activity"
[76] "striated muscle cell differentiation"
[77] "glycogen cell development involved in embryonic placenta development"
[78] "labyrinthine layer blood vessel development"
[79] "response to UV-A"
[80] "cellular response to growth factor stimulus"
[81] "positive regulation of establishment of protein localization in plasma membrane"
ginf[,1]$IC #Information content 
       
 [1] 0.7004567 0.7318690 0.5089264 0.4106285 0.3128597 0.2013538 0.2976529 0.4101925 0.3669168
[10] 0.7130596 0.5282457 0.2676484 0.3447466 0.5616993 0.6628117 0.2157696 0.5043756 0.8582569
[19] 0.4581291 0.4036968 0.1777320 0.5733726 0.2864055 0.5456422 0.5673427 0.5935465 0.6947728
[28] 0.6439286 0.5162263 0.6283881 0.8779091 0.8286238 0.6843996 0.5265994 0.2078508 0.3822056
[37] 0.7130596 0.6751205 0.3331353 0.6027288 0.6258352 0.3830803 0.4340194 0.5102710 0.6185925
[46] 0.4754696 0.5616993 0.6221394 0.8582569 0.9389546 0.7238571 0.4431719 0.7004567 0.5214523
[55] 0.4930920 0.6947728 0.7675783 0.6077058 0.8064905 0.4335929 0.7318690 0.7130596 0.7504790
[64] 0.7558182 0.8286238 0.7318690 0.4431719 0.6408914 0.7361660 0.5647897 0.6975689 0.5237906
[73] 0.6975689 0.7888175 0.5552594 0.5338189 1.0000000 0.7811544 0.8779091 0.5679926 0.7811544
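
The IC values above all lie between 0 and 1, with broad terms such as "signal transduction" scoring low and very specific terms scoring close to 1. A common definition that matches this pattern is IC(t) = -log p(t), where p(t) is the fraction of genes annotated with term t, rescaled by the largest IC. The sketch below uses that definition. Whether GOSim normalizes in exactly this way is an assumption here, and the term names and counts are made up.

```r
# Hypothetical number of genes annotated with each term (annotations of
# descendant terms are assumed to be counted for the ancestor as well).
term_counts <- c(root_term = 1000, broad_term = 200,
                 narrow_term = 50, rare_term = 1)

# Probability of observing each term among all annotated genes
p_term <- term_counts / term_counts[["root_term"]]

# Raw information content: rarer terms are more informative
ic_raw <- -log(p_term)

# Rescale to [0, 1] by dividing by the largest raw IC (assumed normalization)
ic_norm <- ic_raw / max(ic_raw)
round(ic_norm, 3)
```

Under this definition the root term always has IC 0 and the rarest term has IC 1, which is why IC is often used to weight how much a shared GO term says about two genes.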
# Exercise - download a track that interests you from UCSC, intersect a selected part of it (e.g. filtered by some score cutoff) with the gene track,
# select the gene IDs and check the enrichment using the function
?GOenrichment
# http://cran.r-project.org/web/packages/GOSim/vignettes/GOSim.pdf
       
GOenrichment               package:GOSim               R Documentation

GO enrichment analysis

Description:

     This function performs a GO enrichment analysis using topGO. It
     combines the two former functions "GOenrichment" and
     "analyzeCluster".

Usage:

     GOenrichment(genesOfInterest, allgenes, cutoff=0.01, method="elim")

Arguments:

genesOfInterest: character vector of Entrez gene IDs or vector of
          statistics (p-values, t-statistics, ...) named with entrez
          gene IDs

allgenes: character vector of Entrez gene IDs or vector of statistics
          named with entrez gene IDs

  cutoff: significance cutoff for GO enrichment analysis

  method: topGO method to use

Details:

     If the parameters 'genesOfInterest' and 'allgenes' are both
     character vectors of Entrez gene IDs, Fisher's exact test is used.
     The Kolmogorov-Smirnov test can be used, if a score (e.g. p-value)
     for each gene is provided. For more details please refer to the
     topGO vignette.

Value:

 GOTerms: list of significant GO terms and their description

p.values: vector of p-values for significant GO terms

   genes: list of genes associated to each GO term

Author(s):

     Holger Froehlich

References:

     Adrian Alexa, Jörg Rahnenführer, Thomas Lengauer: Improved
     scoring of functional groups from gene expression data by
     decorrelating GO graph structure, Bioinformatics, 2006,
     22(13):1600-1607

See Also:

     ‘evaluateClustering’

Examples:

     ## Not run:
            
             setOntology("BP")
             gomap <- get("gomap",env=GOSimEnv)
             allgenes = sample(names(gomap), 1000) # suppose these are all genes
             allpvalues = runif(1000) # an these are their pvalues
             names(allpvalues) = allgenes
             if(require(topGO) & require(annotate))
                     GOenrichment(allpvalues[allpvalues<0.05], allpvalues) # GO enrichment analysis using Kolmogorov-Smirnov test
     ## End(Not run)
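
The exercise above can be sketched in base R under some loud assumptions. The two data frames stand in for a UCSC track and a gene track, with made-up coordinates and scores, and the score cutoff of 500 is arbitrary. The Entrez IDs are borrowed from the `genes` vector only as labels. The final part shows a plain Fisher's exact test for one hypothetical GO term with invented counts. It illustrates the test named in the Details section, not the "elim" procedure that `GOenrichment` runs through topGO.

```r
# Hypothetical track intervals (e.g. exported from UCSC) with a score column
track <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2"),
  start = c(100, 5000, 200, 9000),
  end   = c(300, 5200, 400, 9500),
  score = c(900, 150, 700, 950),
  stringsAsFactors = FALSE
)

# Hypothetical gene track; coordinates are invented for illustration
gene_track <- data.frame(
  chrom  = c("chr1", "chr1", "chr2", "chr2"),
  start  = c(250, 4000, 1000, 9400),
  end    = c(800, 4500, 2000, 9900),
  entrez = c("207", "208", "596", "901"),
  stringsAsFactors = FALSE
)

# Step 1: keep only track intervals above an (arbitrary) score cutoff
hits <- track[track$score >= 500, ]

# Step 2: a gene is selected if it overlaps any remaining interval
overlaps_hit <- function(g_chrom, g_start, g_end) {
  any(hits$chrom == g_chrom & hits$start < g_end & hits$end > g_start)
}
is_selected <- mapply(overlaps_hit, gene_track$chrom,
                      gene_track$start, gene_track$end, USE.NAMES = FALSE)

# Step 3: character vectors in the form GOenrichment() expects
genesOfInterest <- gene_track$entrez[is_selected]
allgenes <- gene_track$entrez
genesOfInterest

# Fisher's exact test for one hypothetical GO term (invented counts):
# 1000 genes in total, 40 of interest, 50 annotated with the term,
# 10 of them among the genes of interest.
contingency <- matrix(c(10, 30, 40, 920), nrow = 2,
                      dimnames = list(in_term = c("yes", "no"),
                                      of_interest = c("yes", "no")))
p_value <- fisher.test(contingency, alternative = "greater")$p.value
p_value
```

With GOSim loaded, the two vectors would then be passed on as `GOenrichment(genesOfInterest, allgenes)`, following the Usage section above. For real tracks, a dedicated interval library is likely a better fit than the quadratic overlap check used here.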