CPAT (Coding-Potential Assessment Tool) uses a logistic regression model trained on pure sequence features to distinguish noncoding RNAs from protein-coding mRNAs.
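As a rough illustration of scoring a transcript from sequence alone, the sketch below derives two simple features (longest ORF length and ORF coverage) and passes them through a logistic function. The choice of features, the function names, and the weights passed in by the caller are all assumptions made for illustration; the tool's actual feature set and fitted coefficients are not described in this text.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq):
    """Length in nt of the longest ATG...stop ORF on the forward strand (stop codon included)."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None                   # look for the next ORF in this frame
    return best


def sequence_features(seq):
    """Return [ORF length, ORF coverage] (hypothetical feature vector)."""
    orf_len = longest_orf_length(seq)
    coverage = orf_len / len(seq) if seq else 0.0
    return [orf_len, coverage]


def coding_probability(features, weights, intercept):
    """Logistic model: P(coding) = 1 / (1 + exp(-(intercept + w . x)))."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)                            # numerically stable branch for negative z
    return e / (1.0 + e)
```

In practice the weights and intercept would come from fitting the model on labeled coding and noncoding transcripts rather than being chosen by hand.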
CrossMap is a program that converts (lifts over) genome coordinates between different genome assemblies. It supports multiple file formats, including BED, BAM/SAM/CRAM, VCF, and Wiggle/BigWig.
CircularLogo is an innovative web application that is specifically designed to visualize and explore intra-motif dependencies. It displays these dependencies and reveals biomolecular structure effectively.
TIN (Transcript Integrity Number) is an algorithm to measure RNA degradation at the transcript level. It can improve differential gene expression analyses by reducing false positives and false negatives.
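One way to picture a transcript-level integrity score is as a measure of how evenly reads cover a transcript. The sketch below assumes an entropy-based formulation, 100 × e^H / k, where H is the Shannon entropy of the per-position coverage distribution and k is the number of sampled positions. This formula and the function name are assumptions for illustration and are not stated in the description above.

```python
import math


def transcript_integrity_number(coverage):
    """Entropy-based uniformity score in [0, 100] for per-position read coverage.

    `coverage` is a list of non-negative read depths at k sampled positions.
    """
    total = sum(coverage)
    if total == 0:
        return 0.0                             # no reads: treat as no measurable integrity
    # Shannon entropy of the coverage distribution (zero-depth positions contribute 0)
    entropy = -sum((c / total) * math.log(c / total) for c in coverage if c > 0)
    # exp(entropy) is the "effective" number of evenly covered positions
    return 100.0 * math.exp(entropy) / len(coverage)
```

Under this formulation, perfectly uniform coverage scores 100, while coverage concentrated at a single position out of k scores 100 / k, which is the pattern expected when degradation leaves reads piled up at one end of a transcript.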
CpGtools is a Python package to perform DNA methylation analysis. This package consists of three types of modules: (i) ‘CpG position modules’ focus on analyzing the genomic positions of CpGs, including associating other genomic and epigenomic features with a given list of CpGs and generating the DNA motif logo enriched in the genomic contexts of a given list of CpGs; (ii) ‘CpG signal modules’ are designed to analyze DNA methylation values, such as performing PCA or t-SNE analyses, using Bayesian Gaussian mixture modeling to classify CpG sites into fully methylated, partially methylated and unmethylated groups, profiling the average DNA methylation level over user-specified genomic regions and generating bean/violin plots; and (iii) ‘differential CpG analysis modules’ focus on identifying differentially methylated CpGs between groups using different statistical methods, including Fisher’s Exact Test, Student’s t-test, ANOVA, non-parametric tests, linear regression, logistic regression, beta-binomial regression and Bayesian estimation.
Collocated genomic intervals indicate biological association. The conventional approach to evaluating the strength of collocation relies on arbitrary thresholds to decide the total number of overlapped genomic regions, which leads to biased, non-reproducible, and incomparable results. The cobind package provides six different metrics to measure the strength of overlap between two sets of genomic intervals. Using transcription factor ChIP-seq and bulk and single-cell ATAC-seq data, we demonstrated that the normalized pointwise mutual information (NPMI) and the collocation coefficient (C) are the best metrics to quantify genomic collocation: they successfully distinguished CTCF’s co-factors from over 1200 transcription factors and revealed potential master regulators from tissue- and cell-type-specific open chromatin regions.
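To make the two highlighted metrics concrete, the sketch below computes them from base-pair overlap on a single chromosome. It assumes the textbook definitions, C = |A∩B| / sqrt(|A|·|B|) and NPMI = log(p(A,B) / (p(A)·p(B))) / (−log p(A,B)), with probabilities taken as covered bases divided by a background size. The function names and the half-open `(start, end)` interval convention are illustrative choices, and the package's own definitions and interface may differ.

```python
import math


def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals):
    return sum(end - start for start, end in intervals)


def overlap_length(a, b):
    """Bases shared by two merged, sorted interval lists (two-pointer sweep)."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def collocation_metrics(a, b, genome_size):
    """Return (C, NPMI) for two interval sets against a background of genome_size bases."""
    a, b = merge(a), merge(b)
    len_a, len_b = total_length(a), total_length(b)
    shared = overlap_length(a, b)
    if len_a == 0 or len_b == 0 or shared >= genome_size:
        raise ValueError("interval sets must be non-empty and smaller than the background")
    c = shared / math.sqrt(len_a * len_b)
    if shared == 0:
        return c, -1.0                         # NPMI tends to -1 when sets never co-occur
    p_a, p_b, p_ab = len_a / genome_size, len_b / genome_size, shared / genome_size
    npmi = math.log(p_ab / (p_a * p_b)) / -math.log(p_ab)
    return c, npmi
```

Both values are threshold-free: C ranges from 0 (no shared bases) to 1 (identical sets), and NPMI ranges from -1 (never overlapping) through 0 (overlap expected by chance) to 1 (complete overlap).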