Faculty Software


DLATK is a Python-based, end-to-end human text analysis package, particularly suited to social media and social scientific applications.


EpiVIA was developed for joint profiling of the epigenome and lentiviral integration sites at both population and single-cell resolution.


Grabseqs is a command-line tool that simplifies access to next-generation sequencing data and metadata stored in public repositories. Data from multiple repositories (NCBI’s Sequence Read Archive, MG-RAST, iMicrobe) are available in a standardized format through grabseqs.


HeatITup is an algorithm for efficient and robust identification, classification, and quantification of Internal Tandem Duplication (ITD) mutations.


IM-PET identifies target promoters of distal transcriptional enhancers by integrating multiple types of genomics data.


inteGREAT is a graph-based algorithm for robust and scalable differential integration of transcriptomic and proteomic data sets.


Manubot is a tool to write and publish papers using the GitHub platform.


Pollution-Associated Risk Geospatial Analysis SITE (PARGASITE) is a web application and R package for estimating pollutant levels in the U.S. for 1997 through 2019 at user-defined geographic locations and time ranges. Measures correspond to monthly and yearly raster files (Jan 2005 to Dec 2019) for PM2.5, ozone, NO2, SO2, and CO covering the U.S. and Puerto Rico, created from United States Environmental Protection Agency (EPA) regulatory monitor data. The R package lets users obtain more customized output and work with the raster layers directly.


scATAC-pro is a comprehensive workbench for single-cell chromatin accessibility sequencing data.


SPHARM-MAT is a MATLAB-based 3D shape modeling and analysis toolkit built on SPHARM, a 3D Fourier surface representation method that creates parametric surface models using spherical harmonics. It is designed to aid statistical shape analysis that relates morphometric changes in 3D structures of interest to different conditions.


Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. Its modular design lets users run only the analytical steps they need from the core pipeline. Sunbeam is also extensible: users can extend the pipeline through a simple extension framework and install extensions written by the development team or by other users.


TooManyCells is a suite of graph-based tools for efficient and unbiased clustering and visualization of single-cell RNA-seq and ATAC-seq data sets.


The Tree-based Pipeline Optimization Tool (TPOT) is an open-source, Python-based automated machine learning (AutoML) method and software package that uses genetic programming to discover and optimize machine learning pipelines built from algorithms in the scikit-learn library.
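
To give a rough feel for evolutionary pipeline search, the sketch below evolves a tiny two-part pipeline (an optional scaler plus a k-nearest-neighbors classifier) using only the Python standard library. It is not TPOT's API or implementation: TPOT applies genetic programming to tree-structured scikit-learn pipelines, whereas this is a simplified, mutation-only evolutionary loop. The data set, search space, and all function names here are hypothetical and exist only for illustration.

```python
import random

# Hypothetical search space: a pipeline is (scaler, k) for a k-NN classifier.
SCALERS = ["none", "minmax"]
KS = [1, 3, 5, 7, 9]  # odd values so that majority votes never tie


def make_data(rng, n=40):
    """Synthetic two-class data: x separates the classes, y is large-scale noise."""
    data = []
    for i in range(n):
        label = i % 2
        point = (rng.gauss(label * 2.0, 0.5), rng.gauss(0.0, 100.0))
        data.append((point, label))
    return data


def minmax_scale(points):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


def fitness(pipeline, data):
    """Leave-one-out accuracy of k-NN under the given (scaler, k) pipeline."""
    scaler, k = pipeline
    points = [p for p, _ in data]
    labels = [label for _, label in data]
    if scaler == "minmax":
        points = minmax_scale(points)
    correct = 0
    for i, p in enumerate(points):
        # Squared distances to every other point, nearest first.
        neighbors = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), labels[j])
            for j, q in enumerate(points) if j != i
        )
        votes = sum(label for _, label in neighbors[:k])
        predicted = 1 if votes * 2 > k else 0
        if predicted == labels[i]:
            correct += 1
    return correct / len(data)


def mutate(pipeline, rng):
    """Randomly resample one of the two pipeline components."""
    scaler, k = pipeline
    if rng.random() < 0.5:
        scaler = rng.choice(SCALERS)
    else:
        k = rng.choice(KS)
    return (scaler, k)


def evolve(data, rng, generations=5, population_size=6):
    """Keep the better half each generation and refill with mutated copies."""
    population = [(rng.choice(SCALERS), rng.choice(KS))
                  for _ in range(population_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: fitness(p, data), reverse=True)
        history.append(fitness(ranked[0], data))
        survivors = ranked[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    best = max(population, key=lambda p: fitness(p, data))
    return best, fitness(best, data), history


rng = random.Random(0)  # fixed seed so repeated runs behave the same
data = make_data(rng)
best, score, history = evolve(data, rng)
print("best pipeline:", best, "leave-one-out accuracy:", score)
```

Because the best candidate always survives into the next generation, the best score never decreases from one generation to the next. A real AutoML tool would replace the toy fitness function with cross-validated scoring of actual models.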


VisCello is a software platform for hosting, interactive analysis, and visualization of single-cell data.


Xmeta is an R package and online platform that facilitates comprehensive meta-analysis for users with or without programming skills. It offers two analytic paths: 1) R users can install the “xmeta” package and call its main functions directly; and 2) users who do not work in R can use the secure web-based meta-analysis pipeline to customize their own analysis. It supports a wide variety of analyses for univariate, multivariate, and network meta-analysis with continuous, binary, and time-to-event outcomes. It also includes a rich set of model diagnostic tools and data visualizations for different types of analyses.