SyNG-BTS Documentation
SyNG-BTS (Synthesis of Next Generation Bulk Transcriptomic Sequencing) is a Python package for data augmentation of bulk transcriptomic sequencing data using deep generative models.
Overview
SyNG-BTS synthesizes transcriptomics data with realistic distributions without relying on predefined formulas. It supports three types of deep generative models:
Variational Autoencoders (VAE/CVAE) - For general data augmentation
Generative Adversarial Networks (GAN/WGANGP) - Alternative generative approach
Flow-based Models (MAF) - For transfer learning scenarios
These models are trained on pilot datasets and can then generate any desired number of synthetic samples.
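The train-then-sample workflow described above can be illustrated with a deliberately simple stand-in. The sketch below does not use SyNG-BTS or any deep generative model: it fits an independent normal distribution to log2(count + 1) for each gene and samples from it, which is exactly the kind of predefined formula the package is designed to avoid. The function names and the toy counts are hypothetical and serve only to show the shape of the workflow (fit on a small pilot set, then draw as many synthetic samples as needed).

```python
import math
import random
import statistics


def fit_log_normal_per_gene(pilot):
    """Estimate mean and standard deviation of log2(count + 1) for each gene (column)."""
    n_genes = len(pilot[0])
    params = []
    for g in range(n_genes):
        logs = [math.log2(sample[g] + 1) for sample in pilot]
        params.append((statistics.mean(logs), statistics.stdev(logs)))
    return params


def sample_synthetic(params, n_samples, seed=0):
    """Draw n_samples synthetic profiles from the fitted per-gene distributions."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_samples):
        # Sample on the log scale, map back to the count scale, clip at zero.
        row = [max(0.0, 2 ** rng.gauss(mu, sigma) - 1) for mu, sigma in params]
        synthetic.append(row)
    return synthetic


# Toy pilot dataset: 4 samples x 3 genes (made-up counts, for illustration only).
pilot = [
    [120, 5, 980],
    [150, 8, 1010],
    [110, 3, 870],
    [135, 6, 940],
]

params = fit_log_normal_per_gene(pilot)
synthetic = sample_synthetic(params, n_samples=10)  # more samples than the pilot set
print(len(synthetic), len(synthetic[0]))
```

In SyNG-BTS itself, the fitting step is replaced by training a VAE, GAN, or flow-based model, so no distributional form has to be chosen by hand.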
Quick Start
Install SyNG-BTS:
pip install syng-bts
Generate synthetic data with generate():
from syng_bts import generate
result = generate(data="SKCMPositive_4", model="VAE1-10", epoch=5)
print(result.generated_data.shape)
figs = result.plot_loss()
Run a pilot study with pilot_study():
from syng_bts import pilot_study
result = pilot_study(
    data="SKCMPositive_4",
    pilot_size=[50, 100],
    model="VAE1-10",
    early_stop_patience=30,
)
print(result.summary())
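As a rough illustration of what the pilot_size argument might correspond to, the sketch below draws random subsets of sample indices at each requested size. This is an assumption about the general idea of a pilot study, not a description of how pilot_study() is implemented. The helper draw_pilot_subsets, its n_draws parameter, and the total of 200 samples are all hypothetical.

```python
import random


def draw_pilot_subsets(n_total, pilot_sizes, n_draws=2, seed=0):
    """Return {size: [index lists]} with n_draws random subsets per pilot size.

    Hypothetical helper: subsets are drawn without replacement from
    range(n_total), independently for each draw.
    """
    rng = random.Random(seed)
    subsets = {}
    for size in pilot_sizes:
        if size > n_total:
            raise ValueError(f"pilot size {size} exceeds available samples {n_total}")
        subsets[size] = [
            sorted(rng.sample(range(n_total), size)) for _ in range(n_draws)
        ]
    return subsets


# Assume a dataset with 200 samples and the pilot sizes from the example above.
subsets = draw_pilot_subsets(n_total=200, pilot_sizes=[50, 100])
for size, draws in subsets.items():
    print(size, len(draws), len(draws[0]))
```

Each index list could then be used to select rows of the full dataset before training a model on that pilot subset.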
Browse and load full TCGA cohorts with list_tcga_datasets() and load_tcga_dataset():
from syng_bts import list_tcga_datasets, load_tcga_dataset
list_tcga_datasets(short=True)
ds = load_tcga_dataset("BRCA")
real_df, real_groups = ds.real("TC")
print(real_df.shape)
For more details, see the Usage Guide, or the TCGA Datasets guide for full TCGA cohort access. If you are upgrading from v2.x, see the Migration Guide / Changelog.
Citation
If you use SyNG-BTS in your research, please cite:
Qi Y, Wang X, Qin LX. Optimizing sample size for supervised machine learning with bulk transcriptomic sequencing: a learning curve approach. Brief Bioinform. 2025 Mar 4;26(2):bbaf097. doi: 10.1093/bib/bbaf097. PMID: 40072846; PMCID: PMC11899567. https://doi.org/10.1093/bib/bbaf097
Links
GitHub Repository: https://github.com/Omics-Data-Synthesis/SyNG-BTS
Documentation: https://syng-bts.readthedocs.io/
PyPI Package: https://pypi.org/project/syng-bts/
Issue Tracker: https://github.com/Omics-Data-Synthesis/SyNG-BTS/issues