Researchers have developed RaptGen, a novel computational model based on a variational autoencoder (VAE) that can be used for aptamer generation. The model is described in a paper published in Nature Computational Science.
The paper describes how RaptGen uses a VAE with a profile hidden Markov model (HMM) decoder to create a latent space in which sequences form clusters. Using this latent representation, RaptGen was able to generate aptamers that did not appear even in the original sequencing data or HT-SELEX dataset.
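To give a rough sense of what a profile HMM is, the sketch below samples a sequence from a toy one with match, insert, and delete states. It is an illustration only and not RaptGen's implementation: the function name `sample_sequence`, all probability values, and the choice to let insert states emit bases uniformly are assumptions made here. In a VAE with such a decoder, parameters like these would presumably be produced from a point in the latent space rather than written by hand.

```python
import random

BASES = "ACGT"

def sample_sequence(match_emissions, transitions, rng, max_len=100):
    """Sample one sequence from a toy profile HMM (hypothetical parameters).

    match_emissions: one dict per model position, mapping base -> probability.
    transitions: maps the current state type ("M", "I" or "D") to a dict of
        probabilities for the next state type.
    Insert states emit a uniformly random base; delete states emit nothing.
    """
    seq = []
    state = "M"  # the start state behaves like a match state
    pos = 0      # number of model positions visited so far
    while len(seq) < max_len:
        options = transitions[state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
        if state == "I":
            # Insertion: emit an extra base without advancing along the model.
            seq.append(rng.choice(BASES))
            continue
        pos += 1
        if pos > len(match_emissions):
            break  # moved past the last position: end of the sequence
        if state == "M":
            # Match: emit a base from this position's distribution.
            emit = match_emissions[pos - 1]
            seq.append(rng.choices(list(emit), weights=list(emit.values()))[0])
        # Delete ("D"): the position is skipped and nothing is emitted.
    return "".join(seq)

# Illustrative values only.
toy_emissions = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
toy_transitions = {
    "M": {"M": 0.90, "I": 0.05, "D": 0.05},
    "I": {"M": 0.80, "I": 0.20},
    "D": {"M": 0.80, "D": 0.20},
}

print(sample_sequence(toy_emissions, toy_transitions, random.Random(0)))
```

Because insertions and deletions are part of the model, sampled sequences can differ in length, which is one reason profile HMMs suit sequence families with variable-length motifs.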
Oligonucleotides are short, single strands of synthetic DNA or RNA. Albeit small, these molecules play an important role in molecular and synthetic biology applications. One type of oligonucleotide—aptamers—can selectively bind to specific targets such as proteins, peptides, carbohydrates, viruses, toxins, metal ions and even live cells. As they are similar to antibodies, they have a variety of uses in the fields of biosensors, therapeutics, and diagnostics. However, compared to antibodies, aptamers do not induce an immune reaction in our bodies, and are easy to synthesize and modify. Moreover, an aptamer’s three-dimensional folding structure allows it to bind to a wider range of targets.
Aptamers are usually generated by an in vitro selection and amplification technology called systematic evolution of ligands by exponential enrichment, or SELEX. Briefly, SELEX is based on repeated cycles of binding, separation, and amplification of nucleotides. This process results in an enriched pool of nucleotide sequences that is then analyzed for candidate selection. High-throughput SELEX (HT-SELEX) can generate a vast number of aptamer candidates, but current practically applicable sequencing only allows us to evaluate a limited number of these candidates (approximately 10⁶). Therefore, computational processes are essential to optimize the discovery of new aptamers.
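The enrichment effect of repeated SELEX cycles can be illustrated with a toy calculation. The sketch below is a deliberately simplified model, not a description of any laboratory protocol: the sequences, the affinity values, and the function `selex_round` are all hypothetical, and binding is reduced to a single survival probability per sequence.

```python
def selex_round(pool, affinity):
    """One toy SELEX cycle: bind, separate, amplify.

    pool: maps sequence -> fraction of the pool (fractions sum to 1).
    affinity: maps sequence -> hypothetical probability of binding the target.
    """
    # Binding and separation: each sequence is retained in proportion
    # to its assumed binding probability.
    bound = {seq: frac * affinity[seq] for seq, frac in pool.items()}
    total = sum(bound.values())
    # Amplification: the retained sequences are rescaled to refill the pool.
    return {seq: amount / total for seq, amount in bound.items()}

# Illustrative starting pool with equal fractions and made-up affinities.
pool = {"ACGT": 0.25, "GGCA": 0.25, "TTAG": 0.25, "CATG": 0.25}
affinity = {"ACGT": 0.8, "GGCA": 0.2, "TTAG": 0.1, "CATG": 0.1}

for _ in range(4):
    pool = selex_round(pool, affinity)

for seq, frac in sorted(pool.items(), key=lambda item: -item[1]):
    print(f"{seq}: {frac:.4f}")
```

In this toy setting the strongest binder dominates the pool after only a few rounds, which is the sense in which the pool becomes "enriched".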
The team also successfully evaluated RaptGen’s performance on real-world data by applying it to two independent HT-SELEX datasets. RaptGen could generate aptamer derivatives in an activity-guided manner and provide opportunities to optimize their activities. “This is important as it means that RaptGen can generate sequences having desired properties, such as the inhibition of certain enzymes or protein-protein interactions,” Professor Hamada explains. The application of these molecules could open many doors in the future.
Moving forward, the team plans to conduct extensive studies to evaluate whether alternative models can improve the performance of RaptGen, and whether RaptGen could advance RNA aptamer generation by using RNA sequences. The only drawbacks of using RaptGen are its high computational cost and increased training time, both of which could be addressed in further studies.