Table 1 Summary description of the various datasets used in the study

From: The influence of secondary structure, selection and recombination on rubella virus nucleotide substitution rate estimates

| Dataset | Description | Acronym | Number of sequences | Temporal range | Alignment length |
| --- | --- | --- | --- | --- | --- |
| i | Full genome, representative sample containing 10 rubella virus lineages (extracted from dataset ii) | - | 10 | 1961 - 2008 | 9762 nt |
| ii | Full genome (not tested for recombination) | Full Genome | 34 | 1961 - 2009 | 9762 nt |
| iii | Full genome (without 2 detected recombinant isolates) | Full Genome rec.free | 32 | 1961 - 2009 | 9762 nt |
| iv | Capsid structural protein | CP | 52 | 1961 - 2009 | 900 nt |
| v | RNA-dependent RNA polymerase | RdRp | 56 | 1961 - 2009 | 672 nt |
| vi | Envelope structural glycoprotein 2 | E2 | 54 | 1961 - 2009 | 846 nt |
| vii | P150 non-structural protein | P150 | 34 | 1961 - 2009 | 3943 nt |
| viii | Envelope glycoprotein 1 | E1 | 640 | 1961 - 2012 | 739 nt |
| ix | Filtered envelope glycoprotein 1, extracted from dataset ii | Filtered E1 | 34 | 1961 - 2009 | 739 nt |
| x | Temporally balanced envelope glycoprotein 1 | Temporally Balanced E1 | 45 | 1961 - 2012 | 739 nt |
| xi | Envelope glycoprotein 1, without 2 detected recombinant isolates and 437 nt NASP predicted base-paired nucleotide sites | E1 rec.free UnPR | 638 | 1961 - 2012 | 302 nt |
| xii | Envelope glycoprotein 1, without 2 detected recombinant isolates, containing only 437 nt NASP predicted base-paired nucleotide sites | E1 rec.free PR | 638 | 1961 - 2012 | 437 nt |
| xiii | Full genome, without 2 detected recombinant isolates and 1960 nt NASP predicted base-paired nucleotide sites | Full Genome rec.free UnPR | 32 | 1961 - 2009 | 7802 nt |
| xiv | Full genome, without 2 detected recombinant isolates, containing only 1960 nt NASP predicted base-paired nucleotide sites | Full Genome rec.free PR | 32 | 1961 - 2009 | 1960 nt |