Overview
Software
Open-source R packages and AI tools for cancer genomics, biostatistics, and clinical trials, including glmmPen, PurIST, dlglm, NIMIWAE, and epigraHMM, with applications in RNA-seq analysis, machine learning, precision medicine, and clinical research.
Clinical tools, R packages, and legacy utilities are grouped below with links to papers and code.
Decision Support Tools & Patents
PurIST Pancreatic Classifier
Single-sample subtyping assay for pancreatic ductal adenocarcinoma (PDAC), used in cooperative trials.
- Patents:
  - US 11,053,550 (July 2021) – Gene-expression based subtyping of PDAC
  - US Patent App. 17/336,600 (April 2022) – Methods and compositions for prognostic/diagnostic subtyping
  - US 12,000,003 (June 2024) – Platform-independent single sample classifier
- Key papers: Rashid et al., Clin Cancer Res (2020); Li et al., J Mol Diagnostics (2024)
- Code: GitHub · Shiny GUI
R Packages
dlglm
Deeply learned generalized linear models (GLMs) that handle non-ignorable missingness.
- GitHub
- Lim et al., JCGS (2024)
NIMIWAE
Variational autoencoder for non-ignorable missing data.
- GitHub
- Lim et al., Stat Biopharm Res (2024)
glmmPen
Penalized variable selection in generalized linear mixed models (GLMMs) for biomarker discovery.
epigraHMM
Multi-sample enrichment detection for ChIP-seq and ATAC-seq data.
- Bioconductor · GitHub
- Baldoni et al., Biometrics (2022)
FSCseq
Model-based feature selection and clustering for RNA-seq data.
- GitHub
- Lim et al., Ann Appl Stat (2021)
CompDTUReg
Differential transcript usage with quantification uncertainty.
- GitHub
- Young et al., Biostatistics (2023)
Legacy & Specialized Tools
- mixNBHMM – Differential peak calling for multi-condition epigenomic data. (GitHub)
- ZIMHMM – Consensus enrichment calling across ChIP-seq replicates. (GitHub)
- ZINBA – Zero-inflated negative binomial algorithm for detecting enriched regions in next-generation sequencing (NGS) data. (Project)
- hmmcov – HMM and autoregressive HMM (AR-HMM) procedures with variable selection for epigenetic enrichment detection. (Project)
- BASeG – Bivariate association studies linking expression and epigenetic marks with shared genetics. (Rashid et al., Ann Appl Stat, 2016)
Code Repositories
All software is actively maintained on GitHub. Contributions, issues, and feature requests are welcome. The full repository list and contributor statistics are on the Repositories page.
Collaborative Development
Many packages are developed in close collaboration with lab members and trainees:
- Hillary Heiling: glmmPen lead developer
- David Lim: dlglm, NIMIWAE, FSCseq lead developer
- Pedro Baldoni: epigraHMM, mixNBHMM, ZIMHMM lead developer
- Scott Van Buren: CompDTUReg lead developer
- Amber Young: CompDTUReg co-developer; current PhD student working on semi-supervised matrix factorization for PDAC subtyping
Using these tools in your research? Please let us know by email or open an issue.
Support & Contact
For technical support, please open an issue on the relevant GitHub repository. For collaboration inquiries, contact naim@unc.edu.