Genotyping and SNP detection of Streptococcus pneumoniae isolates by resequencing arrays
The NIAID-sponsored Pathogen Functional Genomics Resource Center (PFGRC) (NIAID contract N01-AI-15447) has undertaken an evaluation of resequencing oligonucleotide array technology for detecting genotypic variation in microorganisms. The primary goal is to establish expertise in the technology and its methodologies while assessing its value to the scientific community for identifying and discovering genotypic variation in microorganisms.
The PFGRC, in collaboration with Dr. M. Catherine McEllistrem's laboratory at the University of Pittsburgh, has evaluated Affymetrix CustomSeq Resequencing Oligonucleotide Array technology for detecting genotypic variation in Streptococcus pneumoniae isolates. This pilot project is a step toward developing a novel genotyping platform that enables detection of SNPs across whole genomes of microorganisms.
Non-contiguous regions of Streptococcus pneumoniae isolates were resequenced in this collaborative project. The CustomSeq Resequencing Array consists of 231,688 probes covering 28,961 bases of non-contiguous sequences of the reference Streptococcus pneumoniae TIGR4 strain as shown below. The sequence details can be accessed through the Comprehensive Microbial Resource (CMR) database of JCVI (http://www.jcvi.org/).
Name/Locus | Gene/Sequence | Length (bp)
16S_rRNA | 16S rRNA | 1413
SP0117 | Pneumococcal surface protein A | 2232
SP0368 | Cell wall surface anchor family protein | 5301
SP0377_SP0378 | Choline binding proteins C, J & intergenic region | 2262
SP0390_SP0391 | Choline binding proteins G, F & intergenic region | 2078
SP0463_SP0466 | Surface anchor, sortase & hypothetical proteins with intergenic region | 4788
SP0667 | Pneumococcal surface protein - putative | 996
SP0834 | Hemolysin-related protein | 510
SP1204 | Hemolysin A - putative | 594
SP1466 | Hemolysin | 645
SP1833 | Cell wall surface anchor family protein | 2124
SP1961 | DNA-directed RNA polymerase, beta subunit | 3609
SP1992 | Cell wall surface anchor family protein | 663
SP2145 | Antigen, cell wall surface anchor family | 2082
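As a quick consistency check on the array design figures quoted above, dividing the probe count by the number of bases gives the number of probes per interrogated base. The sketch below uses only the two totals stated in the text. The interpretation in the comment (four candidate nucleotides on each of two strands) is an assumption about how resequencing arrays are typically tiled, not something the source states.

```python
# Totals quoted in the text for the CustomSeq Resequencing Array.
TOTAL_PROBES = 231_688
TOTAL_BASES = 28_961


def probes_per_base(total_probes: int, total_bases: int) -> int:
    """Return probes per base, requiring the division to be exact."""
    quotient, remainder = divmod(total_probes, total_bases)
    if remainder != 0:
        raise ValueError("probe count is not a whole multiple of base count")
    return quotient


if __name__ == "__main__":
    # Assumption: 8 probes = 4 candidate nucleotides x 2 strands per position.
    print(probes_per_base(TOTAL_PROBES, TOTAL_BASES))
```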
The data presented in the following pages show the SNPs that were detected in each of 85 distinct whole-genome samples after hybridization with the TIGR4-based resequencing array. Each sample was hybridized in duplicate. A set of bioinformatic filters was applied to the results from each experiment (see here for more information), and the results from the two experiments were then combined, keeping only those SNPs that were present in both results after filtration. Among the samples hybridized were the TIGR4 strain itself and three additional strains that are fully sequenced: G54, R6 and 670.
The use of GenomiPhi-amplified whole-genome samples, rather than PCR-amplified fragments, simplifies the experimental protocol and avoids the PCR failures that would inevitably occur with some clinical samples of unknown sequence composition. However, the higher complexity of the whole-genome sample also increases the frequency of certain artifacts. The bioinformatic filters that we have developed have proven successful in identifying and eliminating the majority of these artifacts.
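The replicate-combination step described above amounts to a set intersection: a SNP is retained only if it survives filtering in both hybridizations. The following is a minimal sketch under stated assumptions. The `SnpCall` record, its field names, and the example calls are hypothetical, and the sketch assumes that "present in both results" means the same position with the same called base. The bioinformatic filters themselves are not reproduced here.

```python
from typing import NamedTuple, Set


class SnpCall(NamedTuple):
    """Hypothetical record for one filtered SNP call (not the PFGRC format)."""
    locus: str      # fragment name, e.g. "SP0117"
    position: int   # 1-based position within the fragment
    ref_base: str   # base in the TIGR4 reference
    alt_base: str   # base called in the sample


def combine_replicates(rep_a: Set[SnpCall], rep_b: Set[SnpCall]) -> Set[SnpCall]:
    """Keep only the calls that appear in both filtered replicates."""
    return rep_a & rep_b


if __name__ == "__main__":
    # Made-up calls for illustration only; these are not real results.
    replicate_1 = {SnpCall("SP0117", 120, "A", "G"), SnpCall("SP0368", 45, "C", "T")}
    replicate_2 = {SnpCall("SP0117", 120, "A", "G"), SnpCall("SP1961", 900, "G", "A")}
    for call in sorted(combine_replicates(replicate_1, replicate_2)):
        print(call)
```

Requiring agreement between replicates trades some sensitivity for specificity: a true SNP missed in one hybridization is dropped, but an artifact must occur twice to be reported.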
The "SNP Report" provides:
- the nucleotide and its position in the reference strain and in the target fragment
- the annotation, context of the ORF, and amino acid sequence when in a coding region
This information may be sorted and organized by nucleotide position or ORF. A separate report provides, for each selected SNP position, an alignment between the reference sequence and the chosen target sequences.
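To make the report layout concrete, the fields listed above can be pictured as one record per SNP, sortable by nucleotide position or by ORF. This is an illustrative sketch only: the `SnpReportRow` class, its field names, and the example values are hypothetical and do not describe the actual SNP Report implementation or its data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SnpReportRow:
    """Hypothetical row mirroring the fields the SNP Report is said to provide."""
    orf: str                      # ORF or fragment name, e.g. "SP0117"
    ref_position: int             # position in the reference strain
    fragment_position: int        # position within the target fragment
    ref_base: str                 # nucleotide in the reference
    sample_base: str              # nucleotide detected in the sample
    annotation: str               # gene annotation
    amino_acid: Optional[str] = None  # set only when in a coding region


def sort_by_position(rows: List[SnpReportRow]) -> List[SnpReportRow]:
    """Order rows by position in the reference strain."""
    return sorted(rows, key=lambda r: r.ref_position)


def sort_by_orf(rows: List[SnpReportRow]) -> List[SnpReportRow]:
    """Group rows by ORF name, then order by position within the fragment."""
    return sorted(rows, key=lambda r: (r.orf, r.fragment_position))


if __name__ == "__main__":
    # Made-up rows for illustration only; positions and bases are not real data.
    rows = [
        SnpReportRow("SP1961", 5000, 30, "G", "A", "RNA polymerase, beta subunit", "V"),
        SnpReportRow("SP0117", 1000, 210, "A", "G", "Pneumococcal surface protein A", "K"),
        SnpReportRow("SP0117", 900, 110, "C", "T", "Pneumococcal surface protein A", "L"),
    ]
    for row in sort_by_orf(rows):
        print(row.orf, row.fragment_position, row.ref_base, row.sample_base)
```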
Users of this comparative sequence information may begin to compile a meaningful set of known SNPs that may be applied to their own research projects.
