
Table 1 Number of nucleotide sequences and represented species in the developed plant ITS2 and rbcL databases at several key points of the DB4Q2 workflow

From: A detailed workflow to develop QIIME2-formatted reference databases for taxonomic analysis of DNA metabarcoding data

 

| Workflow step | ITS2, without dereplication | ITS2, with dereplication | rbcL, without dereplication | rbcL, with dereplication |
| --- | --- | --- | --- | --- |
| After download from NCBI | 238,018 (74,411) | 238,018 (74,411) | 201,740 (62,314) | 201,740 (62,314) |
| After culling (and dereplication) | 223,947 (70,339) | 173,597 (70,339) | 197,071 (60,769) | 135,473 (60,769) |
| After misidentification filtering | 221,954 (69,799) | 171,754 (69,785) | 195,946 (60,342) | 134,321 (60,315) |
| After amplicon-based restriction | 35,505 (15,425) | 29,545 (15,416) | 113,526 (44,269) | 81,415 (44,244) |

Note: Numbers in brackets give the number of species represented at each step.