"My understanding is that a read group means, roughly, 'a set of reads
that were all the product of a single sequencing run on one lane'. If
you have multiplexed samples in a single lane, you will get multiple
samples in a single read group. If you sequenced the same sample in
several lanes, you will have multiple read groups for the same sample."
Cited from https://www.biostars.org/p/43897/
The meaning of each standard read group field is explained at http://gatkforums.broadinstitute.org/discussion/1317/collected-faqs-about-bam-files
Compose the read group identifier in the following format:
@RG\tID:group1\tSM:sample1\tPL:illumina\tLB:lib1\tPU:unit1
where the \t stands for the tab character.
Cited from https://www.broadinstitute.org/gatk/guide/tagged?tag=bwa
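The format above can be assembled programmatically. What follows is a minimal sketch, not part of the cited guide: the function name read_group_line and its parameters are hypothetical, chosen only for illustration. By default it joins the fields with a literal backslash-t, which is the escaped form that aligners such as bwa mem accept through the -R option; with escaped=False it joins them with real tab characters, as they appear in a BAM header.

```python
def read_group_line(rg_id, sample, platform, library, unit, escaped=True):
    """Build an @RG line from the five fields shown in the format above.

    escaped=True  -> fields separated by the two characters '\\' and 't'
    escaped=False -> fields separated by a real tab character
    """
    separator = "\\t" if escaped else "\t"
    fields = [
        "@RG",
        f"ID:{rg_id}",     # read group identifier
        f"SM:{sample}",    # sample name
        f"PL:{platform}",  # sequencing platform
        f"LB:{library}",   # library name
        f"PU:{unit}",      # platform unit
    ]
    return separator.join(fields)


print(read_group_line("group1", "sample1", "illumina", "lib1", "unit1"))
```

The printed string is the same as the example format given above.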
Example of Read Group usage
Suppose we have a trio of samples: MOM, DAD, and KID. Each has two DNA libraries prepared, one with 400 bp inserts and another with 200 bp inserts. Each of these libraries is run on two lanes of an Illumina HiSeq, requiring 3 x 2 x 2 = 12 lanes of data. When the data come off the sequencer, we would create 12 BAM files, one per lane, with the following @RG fields in the header:
Dad's data:
@RG ID:FLOWCELL1.LANE1 PL:illumina LB:LIB-DAD-1 SM:DAD PI:200
@RG ID:FLOWCELL1.LANE2 PL:illumina LB:LIB-DAD-1 SM:DAD PI:200
@RG ID:FLOWCELL1.LANE3 PL:illumina LB:LIB-DAD-2 SM:DAD PI:400
@RG ID:FLOWCELL1.LANE4 PL:illumina LB:LIB-DAD-2 SM:DAD PI:400
Mom's data:
@RG ID:FLOWCELL1.LANE5 PL:illumina LB:LIB-MOM-1 SM:MOM PI:200
@RG ID:FLOWCELL1.LANE6 PL:illumina LB:LIB-MOM-1 SM:MOM PI:200
@RG ID:FLOWCELL1.LANE7 PL:illumina LB:LIB-MOM-2 SM:MOM PI:400
@RG ID:FLOWCELL1.LANE8 PL:illumina LB:LIB-MOM-2 SM:MOM PI:400
Kid's data:
@RG ID:FLOWCELL2.LANE1 PL:illumina LB:LIB-KID-1 SM:KID PI:200
@RG ID:FLOWCELL2.LANE2 PL:illumina LB:LIB-KID-1 SM:KID PI:200
@RG ID:FLOWCELL2.LANE3 PL:illumina LB:LIB-KID-2 SM:KID PI:400
@RG ID:FLOWCELL2.LANE4 PL:illumina LB:LIB-KID-2 SM:KID PI:400
Cited from http://toolshed.g2.bx.psu.edu/repository/display_tool?repository_id=c45d6c51a4fcfc6c&tool_config=database%2Fcommunity_files%2F000%2Frepo_259%2Fpicard_AddOrReplaceReadGroups.xml&changeset_revision=bf1c3f9f8282
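The 12 header lines of the trio example follow a regular pattern (3 samples x 2 libraries x 2 lanes), so they can be generated rather than typed by hand. The sketch below is only an illustration under stated assumptions: the names SAMPLES, LIBRARIES and trio_read_groups are hypothetical, and the flowcell and lane assignments are simply copied from the listing above. Fields are joined with real tab characters, as in a BAM header.

```python
# (sample, flowcell, first lane used), copied from the listing above
SAMPLES = [("DAD", "FLOWCELL1", 1), ("MOM", "FLOWCELL1", 5), ("KID", "FLOWCELL2", 1)]
# (library number, insert size in bp)
LIBRARIES = [(1, 200), (2, 400)]
LANES_PER_LIBRARY = 2


def trio_read_groups():
    """Return one @RG header line per lane: 3 x 2 x 2 = 12 lines."""
    lines = []
    for sample, flowcell, first_lane in SAMPLES:
        lane = first_lane
        for library_number, insert_size in LIBRARIES:
            for _ in range(LANES_PER_LIBRARY):
                fields = [
                    "@RG",
                    f"ID:{flowcell}.LANE{lane}",            # one read group per lane
                    "PL:illumina",
                    f"LB:LIB-{sample}-{library_number}",    # library shared by two lanes
                    f"SM:{sample}",                         # sample shared by four lanes
                    f"PI:{insert_size}",
                ]
                lines.append("\t".join(fields))
                lane += 1
    return lines


for line in trio_read_groups():
    print(line)
```

Each lane gets its own ID, while LB and SM repeat across lanes, which mirrors the point of the example: several read groups can belong to the same library and the same sample.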