#####################################################
# Introduction #
################

XSAnno is a pipeline to generate pairwise orthologous annotation for comparative transcriptomic analysis. The pipeline consists of three components: global alignment, local alignment, and a filter using simulated reads.

XSAnno is developed by Ying Zhu, with substantial technical input from Mingfeng Li.

For more information please refer to http://hbatlas.org/xsanno/

XSAnno is licensed under the [GNU General Public License v3](http://www.gnu.org/licenses/)

#####################################################
# Installation #
################

Installing LiftOver, BLAT, simNGS
----------------------------------
LiftOver: http://hgdownload.cse.ucsc.edu/admin/exe/
BLAT: http://genome.ucsc.edu/FAQ/FAQblat.html#blat3
simNGS: http://www.ebi.ac.uk/goldman-srv/simNGS/

Follow the standard installation instructions for each tool; see their respective manuals for details. Once the tools are installed, make sure that the directories containing their binaries are included in your PATH.

Installing XSAnno
------------------
1. Download XSAnno_v2.zip.
2. Make AnnoConvert and BlatFilter executable:
     chmod 755 AnnoConvert
     chmod 755 BlatFilter
3. Add the directory /path_to_XSAnno_vXXX/bin to PATH.

#####################################################
# Pipeline flow #
#################

The pipeline generates orthologous annotation between two species (sp1 and sp2), based on the annotation of sp1. The pipeline consists of three components: LiftOver Annotation, BLAT Annotation and SIM Annotation.

Step 1: LiftOver Annotation
----------------------------
1. Prepare all prerequisite files:
   1) Annotation of sp1 in BED format. The annotation field of each line should be arranged as geneID|transcriptID|geneName|transcriptName, separated by "|".
   2) Pairwise alignment chain files downloaded from the UCSC Genome Browser (sp1 to sp2 and sp2 to sp1).
2. Run AnnoConvert according to the manual. The "-minMatch" parameter can be chosen based on simulation, by running liftOverBlockSim.pl in perl_lib and analyzing the results in R (see example_scripts/plot_liftOverBlockSim.r).

Step 2: BLAT Annotation
----------------------------
1. Prepare all files:
   1) LiftOver Annotation from Step 1: Sp1.Sp1_Sp2.liftOver.bed and Sp2.Sp1_Sp2.liftOver.bed
   2) 2bit reference genomes of sp1 and sp2, downloaded from the UCSC Genome Browser
   3) 11.ooc files for sp1 and sp2, to speed up the BLAT step. The files can be generated by BLAT using the "-makeOoc=11.ooc" option.
2. Run BlatFilter to perform local alignment between species and within species, following the instructions in the manual.
   NOTE: Choose loose PID and PL thresholds in this step. Further filtering will be performed using R functions.
3. Filter the exons using the R functions provided in Functions_BlatFilter_byTranscript.r (see example_scripts/blatFilter_byTranscript.r).

Step 3: SIM Annotation
----------------------------
1. Generate simulated RNA-seq reads following the instructions of simNGS (see example_scripts/simNGS.hg19.1.sh for generating a single simulated raw data file; see example_scripts/simNGS.sh for generating multiple scripts like simNGS.hg19.1.sh for simulated data generation).
2. Map the simulated RNA-seq reads using TopHat.
3. Calculate differential expression using DESeq, and filter out the exons that are differentially expressed in the simulated RNA-seq reads (see example_scripts/simFilter_byDEX.r).
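
As a pre-flight check for Step 1, the name-field convention (geneID|transcriptID|geneName|transcriptName) can be verified before running AnnoConvert. The following is a minimal sketch and not part of XSAnno: the function name check_bed_names is hypothetical, and it assumes a tab-separated BED file whose annotation (name) field is column 4. It only counts the "|"-separated parts and does not check whether any part is empty.

```shell
# check_bed_names FILE
# Hypothetical helper (not shipped with XSAnno).
# Assumption: FILE is tab-separated BED with the annotation field in column 4.
# Prints one message per line whose column 4 does not split into exactly
# four "|"-separated parts, and returns a nonzero status if any were found.
check_bed_names() {
    awk -F'\t' '
        {
            # "[|]" is a regular expression matching a literal pipe character
            n = split($4, parts, "[|]")
            if (n != 4) {
                printf "line %d: expected 4 name parts, found %d: %s\n", NR, n, $4
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Usage: run `check_bed_names sp1.annotation.bed` (the file name is a placeholder). An exit status of 0 means every line follows the four-part convention.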