It Is? It Is Not? Identifying Portulaca by Morphology and DNA Barcodes

Abstract

Portulaca species include weeds as well as medicinal herbs and ornamentals. Differentiating weeds from crops enables early removal of weeds, preventing retardation of crop growth. Authenticating medicinal herbs prevents the sale of substitutes that lower the effectiveness of the medicine and potentially endanger lives. Species identification thus helps attain Sustainable Development Goals 2, 3, 14, and 15 of the United Nations. Identification using DNA barcoding is recommended, but barcoding may fail depending on the plant group, sampling region, loci, and data analysis methods used. Thus, BLAST and tree topology were used to test DNA markers from nuclear loci (ITS1 and ITS2) and chloroplast loci (rbcL and trnL-F) for their ability to identify or differentiate Malaysian plant species morphologically characterised as Portulaca oleracea, Portulaca umbraticola, and Portulaca grandiflora. The ITS1 locus correctly identified two of the species using standard BLAST scores and discriminated all three species using tree topology. ITS2 identified all three species accurately, but only with a BLAST based on secondary structure alignment. The rbcL locus could neither identify nor discriminate the species because sequences of different species lacked variability. The trnL-F locus could not identify the species using BLAST because the database is not yet populated with sequences for all species; queries were therefore assigned to the most closely related species with sequences in the database, which was incorrect. Thus, we recommend the ITS1 and ITS2 loci for identifying Portulaca species until trnL-F sequences of more species populate the database for dependable species identification.

Keywords: Portulaca species, species identification, DNA barcodes, ITS, trnL-F, rbcL

Introduction

Morphological features have been the basis of species identification for the past 250 years (Hebert & Gregory, 2005), using features such as stems, fruits, and flowers (Waldchen et al., 2018). However, difficulties in species identification arise when different species share similar morphological characters (Jinbo et al., 2011), or members of the same species look different due to phenotypic plasticity (Sdouga et al., 2018).

Species misidentification has been reported among Portulaca species. For example, the 5th edition of the Sunset Western Garden Book (October 1988) mislabelled Portulaca umbraticola as Portulaca oleracea (Menkins, 2013). The genus Portulaca Linnaeus includes more than 100 taxa (Kokubugata et al., 2015; Mabberley, 2017; Ocampo & Columbus, 2012), though there is no consensus on this number. The three species P. umbraticola, P. oleracea, and P. grandiflora, for example, each have numerous synonyms (The Plant List, 2013), reflecting that even experienced taxonomists may misidentify species (Friedheim, 2016; Pečnikar & Buzan, 2013).

P. grandiflora and P. umbraticola are ornamentals, while P. oleracea may be considered either a weed that can infest crop fields and cause economic losses (Uddin et al., 2014) or a medicinal source (Zhou et al., 2015). Identifying weeds at the early stages of growth, so that they can be removed with species-specific methods, would improve crop growth (Armstrong & Ball, 2005). Authentication of medicinal species such as Portulaca would prevent the use of incorrect species, which may have adverse health effects (Ghorbani et al., 2017). A fast and efficient method of species identification is therefore needed, and traditional morphology-based taxonomy should be supported by DNA-based species assignment (Packer et al., 2009). DNA barcoding uses short DNA sequences to identify species by comparing a sequence with sequences of the same locus from other individuals of the same and different species (Hebert et al., 2003).

There are limited studies on barcoding of this genus. Sdouga et al. (2018) worked mainly on trnH-psbA and ITS of P. oleracea. Nyffeler (2007) worked on the DNA markers matK and ndhF in five Portulaca species, which included P. oleracea and P. grandiflora. Ocampo and Columbus (2012) worked on 59 Portulaca species, which included the species in this study, but except for ITS the loci involved were different (ndhF, trnT-psbD spacer, and ndhA intron). Machate et al. (n.d.) presented data on the relationship of P. oleracea and P. grandiflora to other members of the same family using matK and rbcL. Additional information on Portulaca species is available from large-scale environmental studies that sample across species. For example, Newmaster et al. (2008) contributed sequences of rbcL and ITS2 for P. oleracea. Sequences have also been uploaded to databases without accompanying publications, such as the P. oleracea trnH-psbA sequence from Choi and Park (2013) (GenBank: KF954535.1). This review of the work on Portulaca shows that, aside from the work of Ocampo and Columbus (2012) on 59 Portulaca species, only P. oleracea has been widely studied.

Problem Statement

Misidentification creates problems for conservationists, ecologists, and the various agencies that deal with food safety and invasive plants (Hebert & Gregory, 2005). DNA barcoding has been suggested as a supplement to morphological identification (Thompson & Newmaster, 2014). One reason for incorporating DNA-based methods in species identification is that molecular identification can be more efficient, as shown by a taxonomic survey in which 202 species were identified using molecular data but only 142 species were identified using morphology (Thompson & Newmaster, 2014). The molecular identification not only identified more species but was also 37% less expensive (Thompson & Newmaster, 2014). However, DNA barcoding sometimes fails in species identification (Percy et al., 2014; Stallman et al., 2019). Part of the reason for failure could be the influence of the geographical location of the sample. In P. oleracea, there was a correlation between the DNA and the geographic areas of the samples (Sdouga et al., 2018). Thus, species from specific regions may show a high level of deviation from the sequences held by the National Center for Biotechnology Information, NCBI (Yang et al., 2017). It is crucial to understand whether such sequence deviations would impact the utility of DNA barcoding. Moreover, as barcoding requires a universal DNA region that can be utilized across all plant taxonomic groups, it is important to test the DNA markers recommended for barcoding (CBOL Plant Working Group, 2009). These recommended barcodes were not used in the study on Portulaca that covered various species (Ocampo & Columbus, 2012). Additionally, since a variety of DNA data analysis methods exist (Sandionigi et al., 2012) and perform differently (Stallman et al., 2019), methods need to be compared to ensure the reliability of results.

Research Questions

The question, then, is whether the DNA sequences of ITS1, ITS2, rbcL, and trnL-F, analysed using BLAST or maximum likelihood tree topology, can identify and discriminate the three morphologically identified Portulaca species commonly found in Malaysia. We hypothesised that the identification of these three Portulaca species sampled in Nilai, Negeri Sembilan, Malaysia, using the DNA sequences of ITS1, ITS2, rbcL, and trnL-F would correspond to species assignment using leaf and flower morphology when using either BLAST or tree topology.

Purpose of the Study

The study aimed to:

  • characterise three Portulaca species found in Malaysia using leaf and flower morphology

  • determine whether the nuclear (ITS1, ITS2) and chloroplast (rbcL, trnL-F) loci can be amplified, sequenced, and analysed using the Basic Local Alignment Search Tool (BLAST) and maximum likelihood tree topology to corroborate the identity of, or discriminate between, the morphologically delimited Portulaca species.

Research Methods

Morphological characterisation, followed by molecular characterisation and a comparison of the two, was carried out as explained in the next sections.

Sample selection, morphological characterisation, and identification

Nine samples were collected in Negeri Sembilan, three from each of the morphologically different Portulaca species. The samples were morphologically characterised based on observations made with the naked eye and under a light microscope.

Molecular characterisation

DNA extraction, Amplification and Sequencing

DNA was extracted using the modified cetyltrimethylammonium bromide (CTAB) method (Doyle & Doyle, 1987). Polymerase chain reactions (PCR) using My Taq™ Mix (Bioline, USA) were performed according to the manufacturer’s protocols using primers and thermocycling conditions reported for rbcL (Kress et al., 2009), trnL-F (Taberlet et al., 1991), ITS1 (Cheng et al., 2015), and ITS2 (Chen et al., 2010). PCR products confirmed by agarose gel electrophoresis were purified and sequenced at MyTACG Bioscience Enterprise.

Sequence Analysis

The analysis used consensus sequences obtained using DNA Sequence Assembler version 5.15 (2018) or, if these were unavailable, unidirectional reads. Fragment length and GC content, obtained using MEGA X (Kumar et al., 2018), indicate the authenticity of the loci when they are consistent with reported values (Buckler & Holtsford, 1996). Authenticity is also indicated by a lack of stop codons in the reading frame of the coding locus rbcL, which was determined using the Barcode of Life Data System (BOLD) (Ratnasingham & Hebert, 2007). Sequence homogeneity in evolution and substitution saturation were determined using the Disparity Index (ID) in MEGA (Kumar & Gadagkar, 2001) and the Iss statistic in DAMBE (Xia & Xie, 2001).
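The two simplest authenticity checks described above, GC content and the absence of in-frame stop codons in a coding fragment, can be sketched in a few lines of Python. This is only an illustration: the study itself used MEGA X and BOLD, the function names are hypothetical, and the stop codons are those of the standard and plastid genetic codes (TAA, TAG, TGA).

```python
# Illustrative helpers, not the tools used in the study (MEGA X and BOLD).
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stops shared by the standard and plastid codes


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def has_stop_codon(seq: str, frame: int = 0) -> bool:
    """True if any complete codon in the given frame (0, 1, or 2) is a stop."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    fragment = "ATGTCACCACAAACAGAG"  # made-up fragment for demonstration only
    print(f"GC content: {gc_content(fragment):.1f}%")
    print("stop codon in frame 0:", has_stop_codon(fragment))
```

A stop codon inside a fragment of a protein-coding locus such as rbcL suggests a sequencing error, a wrong reading frame, or a pseudogene, which is why the check is run in the frame that matches the reference annotation.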

To identify and differentiate species, we used BLASTN (Altschul et al., 1990) and created a maximum likelihood tree using MEGA X (Kumar et al., 2018). We assigned identities based on the highest score obtained when the sequences were queried using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). In the case of ITS2, the ITS2 database was used to assign species names based on a BLAST that uses both structure and sequence (Merget et al., 2012).
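The highest-score assignment rule can be sketched as follows. The sketch assumes that hits have been exported in the common 12-column tabular BLAST layout (query ID, subject ID, ..., e-value, bit score); the study used the web interface, so this export step, the function name, the subject-to-species mapping, and all values in the example rows are hypothetical.

```python
from typing import Dict, Tuple


def best_hit_species(tabular: str, species_of: Dict[str, str]) -> Dict[str, Tuple[str, float]]:
    """For each query, keep the species of the hit with the highest bit score.

    `tabular` is 12-column tab-separated BLAST output; the bit score is the
    last column. `species_of` maps subject IDs to species names (assumed input).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tabular.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (species_of.get(subject, "unknown"), bitscore)
    return best


# Made-up rows for demonstration only; they are not results from the study.
rows = [
    ("Q1", "refA", "99.1", "230", "2", "0", "1", "230", "5", "234", "1e-110", "410"),
    ("Q1", "refB", "97.0", "230", "7", "0", "1", "230", "5", "234", "1e-100", "380"),
]
example = "\n".join("\t".join(r) for r in rows)
assignments = best_hit_species(example, {"refA": "species A", "refB": "species B"})
print(assignments)
```

The rule returns the closest sequence that exists in the database, so it yields a confident-looking name even when the true species has no reference sequence for that locus.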

The maximum likelihood tree was generated from sequences aligned with T-Coffee (Di Tommaso et al., 2011), using the best-fit nucleotide substitution model selected by the Akaike information criterion (AIC) calculated in MEGA X (Kumar et al., 2018). Sequences clustered in monophyletic clades in the tree enable species discrimination (Fazekas et al., 2008). Node support, evaluated with bootstrapping (BS) (Felsenstein, 1985), was interpreted as relatively reliable support for the relationship when BS was between 70 and 85 and as strong support when BS was more than 85 (Kress et al., 2002).
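The two tree-reading rules above, monophyly as the discrimination criterion and the bootstrap thresholds, can be written down as a minimal Python sketch. It assumes the tree is treated as rooted and is represented as nested tuples of sample labels; the function names, the toy tree, and the label for values below 70 are illustrative assumptions, not part of the study.

```python
from typing import FrozenSet, Iterable, Set, Union

# A rooted tree as nested tuples; leaves are sample labels such as "PO1".
Tree = Union[str, tuple]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Collect the labels of all leaves at or below this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    labels: Set[str] = set()
    for child in tree:
        labels |= leaf_set(child)
    return frozenset(labels)


def all_clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Leaf sets of every subtree, including single leaves and the root."""
    found = {leaf_set(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= all_clades(child)
    return found


def is_monophyletic(tree: Tree, members: Iterable[str]) -> bool:
    """True if some clade contains exactly the given samples and no others."""
    return frozenset(members) in all_clades(tree)


def support_label(bootstrap: float) -> str:
    """Thresholds as stated in the text; the label below 70 is an assumption."""
    if bootstrap > 85:
        return "strong"
    if bootstrap >= 70:
        return "relatively reliable"
    return "below threshold"


# Toy topology for demonstration only; it is not a tree from the study.
toy_tree = ((("PO1", "PO2"), "PO3"), (("PG1", "PG2"), ("PU1", ("PU2", "PG3"))))
print(is_monophyletic(toy_tree, ["PO1", "PO2", "PO3"]))  # PO samples share one clade
print(is_monophyletic(toy_tree, ["PG1", "PG2", "PG3"]))  # PG3 sits among PU samples
```

In the toy tree the PO samples form their own clade, whereas one PG sample nests among the PU samples, so PG would not count as discriminated under this criterion.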

Findings

Provisional morphological identification

The plant samples were assigned to the Portulaca genus, diagnosed by the contracted head-like inflorescence and the fruit’s dehiscent top portion shed intact as a lid (Nyffeler & Eggli, 2010). Morphology provided temporary species assignments to P. oleracea (PO), P. grandiflora (PG) and P. umbraticola (PU).

The provisional identification as P. oleracea was based on several characteristics. The zig-zag arrangement of the vascular bundles seen in the leaf cross-section (Figure 01 B) is a diagnostic character of the Oleracea clade (Ocampo et al., 2013; Voznesenskaya et al., 2010). Species assignment within the clade was based on the alternate but pseudo-opposite, spatulate leaves (Figure 01 C) and the absence of hairs in the leaf axils when observed with the naked eye, as described by Ocampo et al. (2013). The small yellow flower (Figure 01 A), referred to by Cudney et al. (2017), also differentiated it from the other two species.

Figure 1: P. oleracea A. Floral appearance B. Leaf vascular bundle arrangement C. Leaf appearance

The second species was first inferred to be in the Pilosa clade, based on conspicuous leaf axillary hairs (Ocampo et al., 2013) and vascular bundles arranged in a peripheral ring with the water storage cells located in the central part (Figure 02 B; Ocampo et al., 2013). The terete leaves (Figure 02 C) further narrowed identification to P. grandiflora or P. pilosa. The discrimination between P. pilosa and P. grandiflora is subjective; Sivarajan (1981) treated P. grandiflora as a subspecies of P. pilosa. However, differences between the two have been reported. P. grandiflora is recorded to have bigger flowers (Figure 02 A), 29 ± 0.2 mm in diameter (n = 10) in this study versus the 25 mm diameter reported for P. pilosa (Wu et al., 2003). P. grandiflora also has a hypodermal layer of cells, which is absent in P. pilosa (Ocampo et al., 2013). Additionally, the leaf axils are densely pilose in P. pilosa (Wu et al., 2003), which was not observed in the samples in this study. Thus, the provisional conclusion is that the species used in this study is P. grandiflora (see Figure 2).

Figure 2: P. grandiflora A. Floral appearance B. Leaf vascular bundle arrangement C. Leaf appearance

P. umbraticola was provisionally identified based on the diagnostic character of a wing around the dehiscence line of the capsule (Figure 03 A) (Ocampo & Columbus, 2012) and the arrangement of the vascular bundles in a nearly straight line (Figure 03 B; Voznesenskaya et al., 2010). Also, as reported by Legrand (1962), there was variation in flower colour in this species. The leaves of P. umbraticola (Figure 03 C) were not a defining characteristic for this species.

Figure 3: P. umbraticola. A. Floral appearance shows the wing around the dehiscence line of the capsule. B. Leaf vascular bundle arrangement C. Leaf appearance

Molecular Identification

PCR and sequencing success rate

Barcoding depends on successful amplification and sequencing. This study achieved 100% success in amplification, but sequencing was not always successful. Three rbcL amplicons failed to be sequenced, and further tests will be needed to determine the possible causes of this failure. All reverse sequencing of ITS1 failed. Such consistent failure could be due to the primer sequence not precisely matching the primer region (Hollingsworth et al., 2011; Schori & Showalter, 2011). The universal primer set used may not have been specific enough. Though we obtained good quality trnL-F sequences, in 14 out of 18 cases, these sequences could not always form consensus sequences. The length of the sequence (Table 01) may require the use of internal primers, as carried out for Rosoideae by Eriksson et al. (2003).

Detection of possible sources of noise by sequence characterisation

The length and percentage GC (Table 01) are within the reported ranges, increasing confidence that the correct loci were amplified, although the possibility of pseudogenes cannot be ruled out. There were no violations of the assumption of homogeneity of nucleotide substitution, and no substitution saturation was detected; thus, no adverse effect on the accuracy of phylogenetic inferences is expected (Kumar & Gadagkar, 2001; Xia et al., 2003).

Table 1 -

Species Identification

BLAST analysis of all loci supported the identification of P. oleracea (Table 02). However, the trnL-F and rbcL sequences obtained from P. umbraticola were also most similar to P. oleracea (Table 04), so these loci were not useful for identifying P. oleracea. This lack of utility arises from the absence of P. umbraticola sequences for these loci from the database and may be resolved as the database extends its coverage of species and loci.

Table 2 -

The species morphologically identified as P. grandiflora was assigned to this species only by trnL-F (Table 03) and by the ITS2 secondary structure-based BLAST. Standard BLAST using ITS1 and ITS2 identified P. grandiflora as P. pilosa. For ITS1, this misidentification could be due to inconsistent naming of species, leading to sequences being attributed to the wrong species. For ITS2, alternate naming is probably not the cause, because structure-dependent BLAST of ITS2 assigned the correct species. Structure-based BLAST is more accurate (Keller et al., 2010) and should be used in preference to standard BLAST.

Table 3 -

The species morphologically identified as P. umbraticola was assigned to the closely related P. oleracea and P. grandiflora based on BLAST using sequences from the loci trnL-F and rbcL (Table 04). The identification was incorrect because trnL-F and rbcL sequences of P. umbraticola are not present in the database. An incomplete database reduces the accuracy of species identification in Portulaca, and this needs to be addressed. Secondary structure-based BLAST of ITS2, as well as the standard BLAST of ITS1 (Table 04), supported the morphological identification of P. umbraticola.
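Because a top BLAST hit can only name a species that already has a reference sequence, it may help to check database coverage for each candidate species and locus before interpreting the hit. The sketch below shows one way to do this with plain Python sets; the function name and the toy coverage table are hypothetical placeholders and do not reflect the actual database contents.

```python
from typing import Dict, Iterable, List, Set


def missing_references(candidates: Iterable[str],
                       references: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each locus, list candidate species that have no reference sequence."""
    wanted = set(candidates)
    return {locus: sorted(wanted - present) for locus, present in references.items()}


# Placeholder coverage table for demonstration only.
toy_references = {
    "locus_1": {"species_A", "species_B", "species_C"},
    "locus_2": {"species_A", "species_B"},
}
gaps = missing_references(["species_A", "species_B", "species_C"], toy_references)
print(gaps)
```

A non-empty list for a locus means that a query from the missing species will necessarily be assigned to some other species, so a top hit for that locus should be read as "closest available" rather than as an identification.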

Table 4 -

Inferences from tree topology generally coincide with the deductions from BLAST. The trees confirm that the rbcL locus in Portulaca lacks enough variation to enable identification at lower taxonomic levels (Figure 04), as in other taxa (Ghahramanzadeh et al., 2013; Kress et al., 2005; Newmaster et al., 2008). Variation in the rbcL locus, as in other coding sequences, is functionally constrained (Chen et al., 2017).

Figure 4: Maximum likelihood tree based on rbcL sequences inferred using the Jukes-Cantor model (Jukes & Cantor, 1969) with discrete Gamma distribution. Bootstrap value is shown next to the branch. PO, PG, and PU represent P. oleracea, P. grandiflora, and P. umbraticola, respectively

The trnL-F sequences formed a monophyletic clade only for P. grandiflora (Figure 05). As there were no trnL-F sequences for P. grandiflora in the database, no comparison with samples from other geographical locations was possible. Additionally, the trnL-F locus could not resolve the relationship of the other two species. The absence of monophyly in P. oleracea has also been reported by Ocampo and Columbus (2012), who attributed it to the different origins (North American and possibly African) of the samples, which lead to wide variation. P. oleracea has also been considered an aggregate composed of many subspecies (Danin et al., 1978). In future, the sample size should be increased to help confirm whether this marker can identify the species. Zhang et al. (2010) have suggested that the sample sizes widely used in DNA barcoding are inadequate to assess the genetic diversity of species and hence will bias identification.

Figure 5: Maximum likelihood tree generated using trnL-F sequences inferred by using the Tamura 3-parameter model (Tamura, 1992), with a discrete Gamma distribution. Bootstrap values are shown next to the branches. PO, PG and PU represent P. oleracea, P. grandiflora, and P. umbraticola, respectively

The ITS1 tree (Figure 06 ) shows three monophyletic clades, one for each species. However, the sequences of each species from this study formed a separate strongly supported clade from the downloaded sequences. Different mutational forces may be acting, such as the Portulaca improvement carried out by gamma irradiation and chemicals (Abraham & Desai, 1977; Wongpiyasatid & Hormchan, 2000).

Figure 6: The Maximum likelihood tree based on ITS1 sequences inferred using the Tamura 3-parameter model (Tamura, 1992) with discrete Gamma distribution and some invariable sites. Bootstrap support is shown next to the branches. PO, PG, and PU represent P. oleracea, P. grandiflora, and P. umbraticola

The ITS2-based tree (Figure 07) discriminated all species with high bootstrap support. While the P. oleracea samples in this study formed a monophyletic group with sequences downloaded from NCBI, the other two species did not cluster with members of their species from the database. However, identification improved when secondary structure analysis was used with ITS2. Keller et al. (2010) and Müller et al. (2007) reported that secondary structure prediction is advantageous for species identification, and Coleman (2009) explained the benefit as arising from its ability to detect sequencing errors, pseudogenes, and genetic footprints indicative of past hybridisation events.

Figure 7: The Maximum likelihood tree based on ITS2 sequences inferred using Hasegawa-Kishino-Yano model (Hasegawa et al., 1985). Bootstrap support is shown next to the branches. PO, PG, and PU represent P. oleracea, P. grandiflora, and P. umbraticola, respectively

Conclusion

All loci, using either BLAST or tree topology, identified the genus correctly. Secondary structure-based BLAST analysis and tree topology of ITS2 confirmed the morphological identification of all Portulaca species. As the standard BLAST of the same sequences could not identify all of the species, the method of sequence analysis had a detectable influence on identification. Identification, the question of whether it is or is not a given species, currently cannot depend solely on DNA barcoding and needs the support of morphology. Morphology-based identification, however, can be simplified when DNA barcodes narrow the scope of the search from the estimated 369,000 species of flowering plants currently known to, say, a genus of a few hundred species. In this study, the lack of a complete database impacted the ability to identify species; the database needs to be populated further to cover all species and to reflect within-species diversity. Also, because taxonomists cannot agree on the characters that discriminate species, different names may be given to the same species, such as P. grandiflora and P. pilosa var. grandiflora, as encountered in this study. Sequences from the same species uploaded to the database under different names lead to confusion in species identification. DNA barcoding, however, may be seen as a way to flag species whose naming needs review. Thus, there is a need for continued effort in DNA barcoding.

Acknowledgments

INTI International University funded this study but had no role in study design, data collection, analysis, or preparation of the manuscript.

References

Copyright information

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

About this article

Publication Date

12 October 2020

eBook ISBN

978-1-80296-088-4

Publisher

European Publisher

Volume

89

Print ISBN (optional)

-

Edition Number

1st Edition

Pages

1-796

Subjects

Business, innovation, sustainability, environment, green business, environmental issues, urban planning, municipal planning, disasters, social impact of disasters

Cite this article as:

Yee, F. J., Ping, Y. H., Fathimath, S., & Selvarajah, G. (2020). It Is? It Is Not? Identifying Portulaca by Morphology and DNA Barcodes. In N. Samat, J. Sulong, M. Pourya Asl, P. Keikhosrokiani, Y. Azam, & S. T. K. Leng (Eds.), Innovation and Transformation in Humanities for a Sustainable Tomorrow, vol 89. European Proceedings of Social and Behavioural Sciences (pp. 339-352). European Publisher. https://doi.org/10.15405/epsbs.2020.10.02.31