Machine learning–directed massively parallel programmable nucleic acid amplification

Zhi Weng & Ping Song et al. · 2026-03-25

Dynamic regulation of amplification efficiency is pivotal yet challenging in molecular diagnostics and DNA data storage. Here, we develop a thermodynamics-based approach to achieve continuous and precise modulation of nucleic acid amplification efficiency. By decoupling sequence specificity from hybridization energy regulation via a primer-tag compensation strategy, we demonstrate programmed amplification with high resolution (33 versus 81%). Using 2483 experimental data points, we constructed a machine learning model that improved prediction accuracy from R² = 0.62 to R² = 0.86. In DNA data storage, this amplification strategy increases the density for information preview by nearly one order of magnitude and enables robust file steganography via differential amplification. In clinical validation, our method outperformed uniform amplification in cervical cancer RNA variant analysis, detecting rare RNA fusions and improving detection sensitivity 100-fold at a simulated sequencing depth of 10⁴. This programmable technique is anticipated to extend to single-cell sequencing and spatial transcriptomics, offering a powerful tool for molecular diagnostics and synthetic biology.
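To illustrate why fine control over amplification efficiency matters, the sketch below uses the standard idealized exponential-phase model, in which each cycle multiplies the copy number by (1 + E) for a per-cycle efficiency E. It is not the model described in this work: the function names are hypothetical, and reading the 33% and 81% figures as per-cycle efficiencies is an assumption made only for this example.

```python
def fold_amplification(efficiency: float, cycles: int) -> float:
    """Idealized fold increase after `cycles` cycles.

    Assumes a constant per-cycle efficiency E in [0, 1], so copies grow
    by a factor of (1 + E) each cycle (no plateau, no reagent depletion).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return (1.0 + efficiency) ** cycles


def relative_enrichment(e_high: float, e_low: float, cycles: int) -> float:
    """Ratio of yields for two templates amplified at different efficiencies."""
    return fold_amplification(e_high, cycles) / fold_amplification(e_low, cycles)


if __name__ == "__main__":
    # Assumption: 0.81 and 0.33 are treated as per-cycle efficiencies.
    for n in (10, 20, 30):
        ratio = relative_enrichment(0.81, 0.33, n)
        print(f"{n} cycles: high/low yield ratio = {ratio:.3g}")
```

Because the yield ratio grows geometrically with cycle number, even modest programmed differences in per-cycle efficiency compound into large differences in final abundance, which is the effect that differential amplification relies on.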