In silico detection of polymorphic microsatellites in the endangered Isis tamarind, Alectryon ramiflorus (Sapindaceae)

Premise of the Study: Alectryon ramiflorus (Sapindaceae) is an endangered rainforest tree known from only two populations. In this study, we identified polymorphic microsatellites in silico, improving the effectiveness and efficiency of microsatellite development for nonmodel species. The development of genetic markers will support future conservation management of the species.
Methods and Results: We used next‐generation sequencing and bioinformatics to detect polymorphic microsatellites in silico, reducing both the time and cost of marker development. A panel of 15 microsatellites, 12 of which were polymorphic, was subsequently characterized in 64 adult trees representing the entire species range. Mean observed and expected heterozygosity were 0.471 and 0.425, respectively. The polymorphism information content across loci ranged from 0.152 to 0.875.
Conclusions: The microsatellite markers developed in this study will be useful for understanding the genetic diversity, level of inbreeding, and population structure of A. ramiflorus and for guiding future restoration and management efforts.


METHODS AND RESULTS
Genomic DNA from a single A. ramiflorus individual from each of the two known populations (Appendix 1) was extracted using the DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany) and used to construct two DNA libraries at the Australian Genome Research Facility (Brisbane, Australia; http://agrf.org.au/). Libraries were constructed using the Illumina TruSeq Nano DNA Library Prep Kit (Illumina, San Diego, California, USA). Briefly, 200 ng of DNA was sheared to ~550 bp by ultra-focused sonication on the Covaris E220 Focused-ultrasonicator (Covaris, Woburn, Massachusetts, USA). Fragmented DNA underwent end-repair and size selection using SPRI beads (Beckman Coulter, Indianapolis, Indiana, USA), followed by adenylation of the 3′ ends and adapter ligation. Adapter-ligated fragments were PCR amplified for eight cycles. The final libraries were assessed using electrophoresis and real-time quantitative PCR (qPCR) before being sequenced on a MiSeq (Illumina) according to the manufacturer's instructions with 2 × 250-bp paired-end reads.
FastQC (Andrews, 2010) was used to check for adapter contamination and to assess data quality. Paired-end reads were merged using PEAR version 0.9.5 with default parameters (Zhang et al., 2014), and the merged reads were analyzed using QDD version 3.1.2 (Meglécz et al., 2014) for microsatellite detection and Primer3 for primer design (Untergasser et al., 2012). The two A. ramiflorus samples were processed individually, yielding approximately 20,000 microsatellites. One sample was treated as the "target" and the other as the "subject" in a BLAST search to identify similar sequences, resulting in a list of ~300 polymorphic microsatellites. A detailed description of the commands and parameters used is provided in Appendix S1. Of the 300 polymorphic microsatellites, 48 were chosen for validation by prioritizing perfect dinucleotide repeats of ≥8 units, with ≥30 bp between primers and target microsatellites, and ~0.5°C difference in melting temperature between the primers of each pair.
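The selection logic described above can be illustrated with a minimal Python sketch. It is not the QDD or Primer3 implementation and does not reproduce the commands in Appendix S1. The function names, the regular expression, and the reading of "~0.5°C" as a maximum allowed difference are assumptions made for illustration only.

```python
import re

MIN_UNITS = 8  # minimum number of repeat units, as in the selection criteria

# A motif of two different bases, followed by at least MIN_UNITS - 1 more copies.
DINUC = re.compile(r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (MIN_UNITS - 1))


def find_dinucleotide_repeats(seq):
    """Return (start, motif, units) for each perfect dinucleotide run in seq."""
    return [(m.start(), m.group(1), len(m.group(0)) // 2)
            for m in DINUC.finditer(seq.upper())]


def differs_in_length(seq_a, seq_b):
    """Hypothetical in silico polymorphism check for two homologous reads:
    True if both carry a repeat of the same motif with different unit counts."""
    hits_a = find_dinucleotide_repeats(seq_a)
    hits_b = find_dinucleotide_repeats(seq_b)
    return any(ma == mb and ua != ub
               for _, ma, ua in hits_a for _, mb, ub in hits_b)


def passes_filters(units, flank_left, flank_right, tm_forward, tm_reverse,
                   min_flank=30, max_tm_diff=0.5):
    """Apply the three prioritization criteria to one candidate locus.
    max_tm_diff treats the ~0.5 degree figure as an upper bound (assumption)."""
    return (units >= MIN_UNITS
            and min(flank_left, flank_right) >= min_flank
            and abs(tm_forward - tm_reverse) <= max_tm_diff)


# Two made-up reads sharing flanks but differing in (AG)n repeat number.
read_a = "GATTACA" + "AG" * 10 + "CCGTTA"
read_b = "GATTACA" + "AG" * 12 + "CCGTTA"
print(find_dinucleotide_repeats(read_a))
print(differs_in_length(read_a, read_b))
```

In practice the pairing of homologous reads would come from the BLAST step, so the sketch only covers what happens once two matching sequences are in hand.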
Twenty-four loci showing the clearest and most consistent amplification were selected, and forward primers were labeled with either FAM (Sigma-Aldrich, St. Louis, Missouri, USA), VIC, NED, or PET (Applied Biosystems, Foster City, California, USA) fluorescent dyes. PCR was completed for 14 A. ramiflorus individuals using the same reaction components described above and the following cycling conditions: initial denaturation at 95°C for 5 min; followed by 35 cycles of 94°C for 30 s, 56°C for 90 s, and 72°C for 30 s; with a final extension at 68°C for 10 min. PCR products were separated by capillary electrophoresis on a 3500 Genetic Analyzer (Applied Biosystems). Nine of the 24 loci tested produced electrophoretic signatures that were acceptable but relatively difficult to interpret, and three were monomorphic (Table 1). The remaining 12 high-quality polymorphic microsatellite loci (Table 1) were selected and validated in 64 individuals (60 collected from the only two known populations, plus two individuals from each of two roadside patches) located near Childers, Queensland (Appendix 1). Congeneric species could not be collected for cross-amplification testing due to permit restrictions.
Fragment sizes were determined relative to a GeneScan 600 LIZ Size Standard (Applied Biosystems) using GeneMarker version 2.7.0 (SoftGenetics, State College, Pennsylvania, USA) and double-checked manually. MICRO-CHECKER version 2.2.3 (van Oosterhout et al., 2004) was used to check for scoring errors, homozygote excess, large allele dropout, and potential null alleles. Number of alleles, polymorphism information content, observed and expected heterozygosity, and null allele frequencies for each locus were calculated in CERVUS version 3.0.8 (Kalinowski et al., 2007). FSTAT version 2.9.3.2 (Goudet, 2001) was used to test for linkage disequilibrium. Tests for Hardy-Weinberg equilibrium at each locus were conducted in GenAlEx version 6.501 (Peakall and Smouse, 2006, 2012), with significance levels adjusted using Bonferroni correction.
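For readers unfamiliar with the per-locus statistics named above, the following Python sketch shows how they can be computed from diploid genotypes. It uses the uncorrected expected heterozygosity (1 minus the sum of squared allele frequencies) and the standard polymorphism information content formula; the exact estimators applied by CERVUS and GenAlEx may include sample-size corrections and are not reproduced here. The genotypes and the nominal significance level of 0.05 are illustrative assumptions, not data or settings from this study.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summarize one locus from a list of (allele1, allele2) diploid genotypes."""
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: proportion of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity (uncorrected): 1 - sum of squared allele frequencies.
    he = 1 - sum(p * p for p in freqs)
    # PIC: He minus the sum over allele pairs of 2 * p_i^2 * p_j^2.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"alleles": len(counts), "Ho": ho, "He": he, "PIC": pic}


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Made-up fragment sizes (bp) for four individuals at a single locus.
example = [(150, 150), (150, 152), (152, 152), (150, 152)]
stats = locus_stats(example)
print(stats)
# Threshold for 12 per-locus tests, assuming a nominal alpha of 0.05.
print(bonferroni_alpha(0.05, 12))
```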
Approximately 7.56 Gb (15,118,442 paired-end reads) of sequence data from two A. ramiflorus individuals were generated. A total of 72 alleles were resolved across the 12 loci in the 64 adult individuals tested. The number of alleles per locus, per population ranged from two to 17, with an average of 3.021 alleles per locus, per population. Mean levels of observed and expected heterozygosity per locus per population ranged from 0.375 to 0.583 and 0.313 to 0.526, respectively (Table 2). Polymorphism information content values ranged from 0.152 to 0.875 per locus, with a mean of 0.506. Six of the 12 loci exhibited an excess of homozygosity and significant deviation from conditions of Hardy-Weinberg equilibrium, which was expected given the small population sizes and extremely low number of individuals remaining in the species. Two loci (AR10 and AR30) displayed evidence of null alleles but at low frequency. No evidence of linkage disequilibrium between pairs of loci was detected. All microsatellite sequences were deposited in the National Center for Biotechnology Information's GenBank (Table 1), and the raw sequencing reads were deposited in the Sequence Read Archive (accession no. SUB4484831).
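As a quick consistency check on the reported sequencing yield, the read count and the nominal read length can be multiplied out. This short Python sketch assumes every read reaches the full 250 bp, so it gives the nominal yield rather than the yield after any trimming.

```python
READ_PAIRS = 15_118_442   # paired-end reads reported above
READS_PER_PAIR = 2
READ_LENGTH_BP = 250      # nominal read length for a 2 x 250-bp run

# Nominal yield assumes every read is full length (an assumption).
total_bp = READ_PAIRS * READS_PER_PAIR * READ_LENGTH_BP
total_gb = total_bp / 1e9
print(f"{total_bp:,} bp = {total_gb:.2f} Gb")
```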

CONCLUSIONS
Here we have demonstrated that partial genomic sequencing and in silico identification of polymorphic loci using bioinformatics can serve as an effective and efficient approach to isolate and design microsatellite markers for genetic analysis in non-model species. The increased efficiency and effectiveness are evident in the reduced time and cost of the laboratory phase of primer and locus validation and in the higher quality of the developed loci. The approach is potentially even more valuable for identifying polymorphic microsatellites in species with low genetic diversity, such as A. ramiflorus. The 12 markers validated in this study will provide important tools to inform the conservation and future management of this endangered species.

ACKNOWLEDGMENTS
This work was financed by the Burnett Mary Regional Group (BMRG; MNES 16/18) and received support from the
Table 1 notes: An annealing temperature of 56°C was used for all loci. c These loci were found to be monomorphic in Alectryon ramiflorus and are therefore not reported in Table 2.