King's College London

Global patterns of STR sequence variation: Sequencing the CEPH human genome diversity panel for 58 forensic STRs using the Illumina ForenSeq DNA Signature Prep Kit

Research output: Contribution to journal > Article > peer-review

Christopher Phillips, Laurence Devesse, David Ballard, Leanne van Weert, Maria de la Puente, Stefania Melis, Vanessa Álvarez Iglesias, Ana Freire-Aradas, Nicola Oldroyd, Cydne Holt, Denise Syndercombe Court, Ángel Carracedo, Maria Victoria Lareu

Original language: English
Early online date: 13 Aug 2018
Accepted/In press: 26 Jul 2018
E-pub ahead of print: 13 Aug 2018

The 944 individuals of the CEPH human genome diversity panel (HGDP–CEPH), a standard sample set of 51 globally distributed populations, were sequenced using the Illumina ForenSeq™ DNA Signature Prep Kit. The ForenSeq™ system is a single multiplex for the MiSeq/FGx™ massively parallel sequencing instrument, comprising: amelogenin, 27 autosomal STRs, 24 Y-STRs, 7 X-STRs, and 94 SNPforID+Kiddlab autosomal ID-SNPs (plus optionally detected ancestry and phenotyping SNP sets). We report in detail the patterns of sequence variation observed in the repeat regions of the 58 forensic STR loci typed by the ForenSeq™ system. Sequence alleles were characterized and repeat region structures annotated by aligning the ForenSeq™ sequence output to the latest GRCh38 human reference sequence. This required the reversal and re-alignment of the STR allele sequences reported by the ForenSeq™ system in 20 of the 58 STRs (plus the reverse alleles in two Y-STRs with duplicated-inverted repeat regions). Individual population sample sizes of the HGDP–CEPH panel do not allow reliable inferences to be made about levels of genetic variability in low-frequency STR alleles, where particular sequence variants are found in only a few individuals. However, we assessed the occurrence of both population-specific sequence variants and singleton observations, and found each of these in a sizeable proportion of HGDP–CEPH samples. This has consequences for planning the co-ordinated compilation of sequence variation, on a much larger scale than was previously required, by forensic laboratories now adopting massively parallel sequencing.
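
The strand reversal mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "reversal" means taking the reverse complement of a sequence reported on the opposite strand to the reference. The helper name `reverse_complement` and the example repeat string are hypothetical; they are not taken from the ForenSeq™ software or from the study's data.

```python
# Hypothetical helper: not part of the ForenSeq software or the authors' pipeline.
# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Illustrative repeat string only (an assumption, not an allele from the study):
# three TCTA repeats followed by one TCTG repeat.
reported = "TCTA" * 3 + "TCTG"

# The same sequence read from the opposite strand.
forward = reverse_complement(reported)
print(reported, "->", forward)
```

Applying the function twice returns the original string, which is a quick sanity check when converting between strand orientations.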
