To align single-end reads:
star --runThreadN 12 --outFilterMultimapNmax 1 --outFilterMismatchNmax 10 --outSAMstrandField intronMotif --genomeLoad LoadAndKeep --outSAMattributes All --genomeDir $TMPDIR/star --readFilesCommand zcat --readFilesIn $TMPDIR/${fq_id}_1.fastq.gz --outFileNamePrefix $TMPDIR/$fq_id/ 2>$TMPDIR/$fq_id/$fq_id.log
To align paired-end reads:
star --runThreadN 12 --outFilterMultimapNmax 1 --outFilterMismatchNmax 10 --outSAMstrandField intronMotif --genomeLoad LoadAndKeep --outSAMattributes All --genomeDir $TMPDIR/star --readFilesCommand zcat --readFilesIn $TMPDIR/${fq_id}_1.fastq.gz $TMPDIR/${fq_id}_2.fastq.gz --outFileNamePrefix $TMPDIR/$fq_id/ 2>$TMPDIR/$fq_id/$fq_id.log
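The two commands above differ only in the --readFilesIn argument, so they can be folded into one helper. What follows is a minimal sketch, not a tested pipeline: run_star and STAR_BIN are hypothetical names introduced here, and the sketch assumes the same file layout as the commands above ($TMPDIR/${fq_id}_1.fastq.gz, plus an optional _2 mate file). It also creates the output directory first, since the 2> redirect fails if $TMPDIR/$fq_id/ does not exist yet.

```shell
# Sketch only: run_star and STAR_BIN are hypothetical names, not part of STAR.
# The post invokes the aligner as "star"; set STAR_BIN if your executable
# has a different name (upstream builds usually call it STAR).
run_star() {
    tmpdir=$1
    fq_id=$2

    # Start with mate 1; append mate 2 only when the file exists (paired-end).
    set -- "$tmpdir/${fq_id}_1.fastq.gz"
    if [ -f "$tmpdir/${fq_id}_2.fastq.gz" ]; then
        set -- "$@" "$tmpdir/${fq_id}_2.fastq.gz"
    fi

    # The output directory must exist before the shell can open the log file.
    mkdir -p "$tmpdir/$fq_id" || return 1

    "${STAR_BIN:-star}" --runThreadN 12 \
        --outFilterMultimapNmax 1 \
        --outFilterMismatchNmax 10 \
        --outSAMstrandField intronMotif \
        --genomeLoad LoadAndKeep \
        --outSAMattributes All \
        --genomeDir "$tmpdir/star" \
        --readFilesCommand zcat \
        --readFilesIn "$@" \
        --outFileNamePrefix "$tmpdir/$fq_id/" \
        2> "$tmpdir/$fq_id/$fq_id.log"
}
```

It would be called as run_star "$TMPDIR" "$fq_id" for either library type.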
--outFilterMismatchNmax 10
int: alignment will be output only if it has no more mismatches than this value.
If you have unstranded RNA-seq data and wish to run Cufflinks/Cuffdiff on STAR alignments, you will need to run STAR with the --outSAMstrandField intronMotif option, which generates the XS strand attribute for all alignments that contain splice junctions. Spliced alignments with an undefined strand (i.e. containing only non-canonical junctions) will be suppressed.
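A quick way to check that the option took effect is to count spliced alignments (CIGAR contains N) and see how many carry an XS:A tag. This is a rough sketch: count_xs is a hypothetical helper name, and it assumes uncompressed SAM input, such as the Aligned.out.sam file STAR writes under the output prefix.

```shell
# Sketch only: count_xs is a hypothetical helper, not part of STAR or samtools.
# Usage: count_xs alignments.sam
count_xs() {
    awk -F '\t' '
        /^@/ { next }                 # skip SAM header lines
        $6 ~ /N/ {                    # column 6 is CIGAR; N marks a splice junction
            spliced++
            # optional attributes start at column 12
            for (i = 12; i <= NF; i++)
                if ($i ~ /^XS:A:[+-]$/) { with_xs++; break }
        }
        END { printf "spliced=%d with_XS=%d\n", spliced, with_xs }
    ' "$1"
}
```

If the two numbers match, every spliced alignment in the file has a strand attribute that Cufflinks can use.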
With the default --outSAMattributes Standard option, the following SAM attributes will be generated:
Column 12: NH: number of loci a read (pair) maps to
Column 13: IH: alignment index for all alignments of a read
Column 14: aS: alignment score
Column 15: nM: number of mismatches (does not include indels)
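Since the commands above set --outFilterMultimapNmax 1, one would expect every reported alignment to carry NH:i:1. The sketch below tallies the NH values in a SAM file as a sanity check; nh_histogram is a hypothetical helper name, and the input is assumed to be uncompressed SAM.

```shell
# Sketch only: nh_histogram is a hypothetical helper, not part of STAR.
# Usage: nh_histogram alignments.sam
nh_histogram() {
    awk -F '\t' '
        /^@/ { next }                 # skip SAM header lines
        {
            # optional attributes start at column 12
            for (i = 12; i <= NF; i++)
                if ($i ~ /^NH:i:/) { count[substr($i, 6)]++; break }
        }
        END { for (nh in count) printf "NH=%s\t%d\n", nh, count[nh] }
    ' "$1" | sort
}
```

Each output line gives an NH value and the number of alignment records that have it.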
If the --outSAMattributes All option is used, the following additional attributes will be output:
Column 16: jM:B:c,M1,M2,… Intron motifs for all junctions (i.e. N in CIGAR): 0: non-canonical; 1: GT/AG; 2: CT/AC; 3: GC/AG; 4: CT/GC; 5: AT/AC; 6: GT/AT. If a splice junction database is used and a junction is annotated, 20 is added to its motif value.
Column 17: jI:B:I,Start1,End1,Start2,End2,… Start and End of introns for all junctions (1-based)
Note that samtools 0.1.18 or later must be used with these extra attributes.
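The jM codes are easier to read once translated back to motif names. Here is a small sketch that applies the table above, including the +20 offset for annotated junctions; decode_jm is a hypothetical helper name, and any value outside the listed codes is simply reported as "unknown".

```shell
# Sketch only: decode_jm is a hypothetical helper, not part of STAR.
# Usage: decode_jm 'jM:B:c,21,2'
decode_jm() {
    printf '%s\n' "$1" | awk -F ',' '
        BEGIN {
            # m[1] is code 0, m[2] is code 1, and so on up to code 6
            split("non-canonical GT/AG CT/AC GC/AG CT/GC AT/AC GT/AT", m, " ")
        }
        {
            # field 1 is the "jM:B:c" prefix; the codes start at field 2
            for (i = 2; i <= NF; i++) {
                code = $i + 0
                annotated = (code >= 20)
                if (annotated) code -= 20   # annotated junctions have 20 added
                name = (code >= 0 && code <= 6) ? m[code + 1] : "unknown"
                printf "%s%s%s", (i > 2 ? "," : ""), name, (annotated ? "(annotated)" : "")
            }
            printf "\n"
        }'
}
```

For example, decode_jm 'jM:B:c,21,2' prints GT/AG(annotated),CT/AC.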