tools tagged “bioinformatics”
deepchem
deepchem/deepchem
DeepChem provides an open-source toolchain for applying deep learning to drug discovery, quantum chemistry, and biology. It supports molecular tasks such as property prediction and molecular generation, and it offers extensive tutorials for learning and applying these techniques.
MMseqs2
soedinglab/MMseqs2
MMseqs2 is an ultra-fast and sensitive software suite for searching and clustering large sets of protein and nucleotide sequences. It is designed to run sequence searches much faster than traditional tools such as BLAST while retaining comparable sensitivity, making it a valuable tool for molecular biology research.
alphagenome
google-deepmind/alphagenome
AlphaGenome is an API that offers access to a model for predicting various functional outputs from DNA sequences, including gene expression and variant effects. It is designed for analyzing genomic data and provides tools for visualization and variant scoring.
ProtTrans
agemagician/ProtTrans
ProtTrans is a repository that offers pre-trained language models specifically designed for proteins, enabling tasks such as feature extraction, prediction, and protein sequence generation. It supports the bioinformatics community by providing tools for analyzing protein sequences and structures.
graphein
a-r-j/graphein
Graphein is a protein and interactomic graph library that enables the creation of geometric representations of protein and RNA structures, as well as biological interaction networks. It supports various molecular types and provides functionalities for graph construction, visualization, and analysis, making it a valuable resource for molecular design and drug discovery.
scipipe
scipipe/scipipe
SciPipe is a library for creating robust and flexible scientific workflows using the Go programming language. It is particularly suited for bioinformatics and cheminformatics applications, allowing users to design and execute pipelines that can process molecular data and integrate various command-line tools.
biopandas
BioPandas/biopandas
BioPandas provides tools for handling molecular structures, particularly from PDB and MOL2 files, using pandas DataFrames. It facilitates the analysis and manipulation of protein structures, making it useful for tasks in drug discovery and computational biology.
poly
bebop/poly
Poly is a Go package designed for engineering organisms, providing tools for tasks such as codon optimization and primer design. It aims to be a comprehensive resource for computational synthetic biology, making it useful for both academic and industrial applications.
DnaFeaturesViewer
Edinburgh-Genome-Foundry/DnaFeaturesViewer
DNA Features Viewer is a Python library designed to visualize DNA sequence features from GenBank or GFF files. It allows users to create clear plots of DNA sequences, making it useful for synthetic biology applications and DNA design.
plip
pharmai/plip
PLIP (Protein-Ligand Interaction Profiler) is a tool designed to analyze and visualize non-covalent interactions between proteins and ligands in PDB files. It facilitates the understanding of molecular interactions, which is crucial for applications in drug discovery and molecular biology.
avogadrolibs
OpenChemistry/avogadrolibs
Avogadro libraries are designed for advanced molecular editing and visualization, supporting computational chemistry and molecular modeling. They offer a flexible plugin architecture and are suitable for a wide range of applications in bioinformatics and materials science.
atomworks
RosettaCommons/atomworks
AtomWorks is an open-source platform that accelerates biomolecular modeling tasks by providing a toolkit for parsing, cleaning, and manipulating biological data. It includes advanced features for dataset featurization and sampling, making it suitable for deep learning applications in molecular biology.
p2rank
rdk/p2rank
P2Rank is a stand-alone command-line tool that uses machine learning to predict ligand-binding sites from protein structures. It identifies potential binding pockets without relying on external databases, making it useful for drug discovery and virtual screening applications.
DeepFRI
flatironinstitute/DeepFRI
DeepFRI (Deep Functional Residue Identification) predicts protein function from sequences and contact maps using graph convolutional networks, providing insights into molecular functions and biological processes.
pypdb
williamgilpin/pypdb
PyPDB is a Python toolkit designed for performing searches and fetching data from the RCSB Protein Data Bank (PDB). It allows users to access information about protein structures, sequences, and related data programmatically, facilitating research in molecular biology.
provis
salesforce/provis
ProVis is an implementation for visualizing and analyzing attention in protein language models, aimed at interpreting how that attention relates to protein structure. It includes tools for generating visualizations and running attention analysis on various protein datasets.
InterPLM
ElanaPearl/InterPLM
InterPLM is a toolkit designed for extracting, analyzing, and visualizing interpretable features from protein language models using sparse autoencoders. It allows users to work with protein embeddings and provides pretrained models for feature analysis and visualization.
ProteinFlow
adaptyvbio/ProteinFlow
ProteinFlow is an open-source Python library that streamlines the pre-processing of protein structure data for deep learning applications. It enables users to filter, cluster, and generate datasets from protein structure databases, facilitating various protein design tasks.
ANARCI
oxpig/ANARCI
ANARCI is a tool for antibody numbering and antigen receptor classification, utilizing alignment to germline sequences for accurate numbering. It supports various numbering schemes and outputs detailed alignment statistics, making it valuable for researchers working with antibodies.
Awesome-Biomolecule-Language-Cross-Modeling
QizhiPei/Awesome-Biomolecule-Language-Cross-Modeling
Awesome-Biomolecule-Language-Cross-Modeling is a curated list of resources on jointly modeling biomolecules and natural language through multimodal learning. It includes models and datasets for tasks related to molecular properties and interactions.
Ankh
agemagician/Ankh
Ankh is an optimized protein language model that enhances general-purpose modeling for protein engineering. It offers pre-trained models and datasets for various protein-related tasks, including secondary structure prediction and solubility assessment.
MolTrans
kexinhuang12345/MolTrans
MolTrans is a tool for predicting drug-target interactions using a transformer-based model. It addresses challenges in molecular representation learning and provides datasets for training and evaluation.
ProTrek
westlake-repl/ProTrek
ProTrek is a trimodal protein language model designed to enhance protein searches by integrating sequence, structure, and function information. It utilizes contrastive learning to improve retrieval tasks and provides embeddings for various protein-related applications.
prodigy
haddocking/prodigy
PRODIGY is a tool designed to predict the binding affinity of protein-protein complexes based on structural data. It allows users to analyze single or multiple structures and provides detailed output on intermolecular contacts and predicted affinities.
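To make the idea of "intermolecular contacts" in the last entry concrete, here is a minimal standard-library sketch that counts residue pairs from two chains lying within a distance cutoff. It is not PRODIGY's code or API: the function name `count_interchain_contacts`, the toy coordinates in `ATOMS`, and the 5.5 Å default cutoff are all assumptions chosen for illustration.

```python
import math

# Hypothetical toy data: (chain, residue number, x, y, z), one atom per residue.
ATOMS = [
    ("A", 1, 0.0, 0.0, 0.0),
    ("A", 2, 3.0, 0.0, 0.0),
    ("B", 1, 0.0, 4.0, 0.0),
    ("B", 2, 20.0, 0.0, 0.0),
]


def count_interchain_contacts(atoms, chain_a, chain_b, cutoff=5.5):
    """Return sorted (res_a, res_b) pairs with atoms within `cutoff` angstroms.

    The 5.5 default is an illustrative assumption, not a documented setting.
    """
    contacts = set()
    for ca, ra, xa, ya, za in atoms:
        if ca != chain_a:
            continue
        for cb, rb, xb, yb, zb in atoms:
            if cb != chain_b:
                continue
            # Euclidean distance between the two atoms
            if math.dist((xa, ya, za), (xb, yb, zb)) <= cutoff:
                contacts.add((ra, rb))
    return sorted(contacts)


if __name__ == "__main__":
    for res_a, res_b in count_interchain_contacts(ATOMS, "A", "B"):
        print(f"A:{res_a} contacts B:{res_b}")
```

A real analysis would read coordinates from a structure file and consider every atom in each residue; the nested loop here is only meant to show what a distance-based contact is.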