tools tagged “bioinformatics”
libcifpp
PDB-REDO/libcifpp
libcifpp is a C++ library designed to manipulate mmCIF and PDB files, allowing users to access and query molecular data efficiently. It includes features for symmetry calculations and 3D manipulations, making it useful for researchers working with protein structures.
AlphaFold3-Conda-Install
Model3DBio/AlphaFold3-Conda-Install
AlphaFold3-Conda-Install is a step-by-step guide for installing and configuring AlphaFold 3, a tool for predicting protein structures. It covers setting up the environment and dependencies needed to run AlphaFold 3 on compatible hardware.
proteinclip
wukevin/proteinclip
ProteinCLIP is a tool that harmonizes protein language models with natural language models to enhance the prediction of protein-protein interactions. It provides pre-trained models and training scripts for generating protein embeddings and classifiers.
AutoPeptideML
IBM/AutoPeptideML
AutoPeptideML is an AutoML system designed to help researchers build trustworthy models for predicting peptide bioactivity. It provides tools for model building, prediction, and benchmarking, making it accessible for users without prior machine learning expertise.
USPNet
ml4bio/USPNet
USPNet is a tool designed to predict signal peptides in protein sequences using a deep protein language model. It provides a benchmark set for evaluation and allows users to process and predict using their own protein data.
CPDB
a-r-j/CPDB
CPDB is a tool that allows users to parse PDB files into structured DataFrames, facilitating the analysis of protein structures. It supports various input methods, including direct file access and retrieval via UniProt IDs, making it versatile for bioinformatics applications.
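The idea of turning PDB coordinate records into tabular rows can be sketched in plain Python. This is an illustration only and does not use CPDB's own API: the function name `parse_atom_records` and the chosen subset of columns are assumptions made for this example, and the slice offsets follow the fixed-width layout of PDB `ATOM`/`HETATM` records.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int, float]]


def parse_atom_records(pdb_text: str) -> List[Row]:
    """Parse ATOM/HETATM lines of PDB-format text into one dict per atom.

    Hypothetical helper for illustration; not part of CPDB.
    """
    rows: List[Row] = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip headers, remarks, TER/END records, etc.
        rows.append({
            "record": line[0:6].strip(),     # columns 1-6
            "serial": int(line[6:11]),       # columns 7-11
            "atom_name": line[12:16].strip(),  # columns 13-16
            "res_name": line[17:20].strip(),   # columns 18-20
            "chain_id": line[21],            # column 22
            "res_seq": int(line[22:26]),     # columns 23-26
            "x": float(line[30:38]),         # columns 31-38
            "y": float(line[38:46]),         # columns 39-46
            "z": float(line[46:54]),         # columns 47-54
        })
    return rows


# A single made-up ATOM line used only to exercise the parser.
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"
rows = parse_atom_records(SAMPLE)
```

Each dict corresponds to one table row, so the list could be handed to `pandas.DataFrame(rows)` when pandas is available.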
finetune-esm
naity/finetune-esm
Finetune-ESM is a tool for scalable fine-tuning of protein language models to predict protein function from sequence. It supports distributed training and reproducible runs, making it suitable for bioinformatics applications.
VESPA
Rostlab/VESPA
VESPA is a tool that predicts the effects of single amino acid variants (SAVs) using embeddings from the protein language model ProtT5. It provides a multistage pipeline that generates predictions from protein sequences, making it useful for understanding mutations and their impact on proteins.
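To make the term concrete, the full set of SAVs for a sequence can be enumerated by substituting each position with each of the other 19 standard amino acids. The sketch below is not VESPA code and makes no claim about VESPA's input format: the function name `enumerate_savs` is hypothetical, and the `M1A` style labels (wild-type residue, 1-based position, mutant residue) are a common convention assumed here for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def enumerate_savs(sequence: str) -> list:
    """List every single amino acid variant of a sequence as e.g. 'M1A'."""
    savs = []
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:  # a variant must change the residue
                savs.append(f"{wild_type}{position}{mutant}")
    return savs


# A made-up three-residue sequence yields 3 * 19 variants.
savs = enumerate_savs("MKT")
```

A sequence of length L has 19 × L possible SAVs, which is why per-variant effect prediction is usually run as a batch pipeline.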
LucaPCycle
LucaOne/LucaPCycle
LucaPCycle is a dual-channel model developed to predict whether a protein sequence has phosphate-solubilizing functionality and to classify it into one of 31 specific functional types. The tool employs large language models tailored for protein sequences, making it a valuable resource in protein analysis and molecular biology.
ESM-Ezy
westlake-repl/ESM-Ezy
ESM-Ezy is a tool designed for training and inference on protein sequences using a pre-trained model. It facilitates the retrieval and analysis of candidate sequences, making it relevant for applications in protein design and bioinformatics.
DeepCriticalLearning
XinshaoAmosWang/DeepCriticalLearning
DeepCriticalLearning implements various deep learning methods for protein function prediction and classification. It provides tools for training models that can be used in the context of protein design and analysis.
AlphaFold-MCP-Server
Augmented-Nature/AlphaFold-MCP-Server
The AlphaFold MCP Server is a Model Context Protocol (MCP) server that lets users access and analyze protein structure predictions from the AlphaFold Protein Structure Database. It offers structure retrieval, confidence score analysis, and batch processing for multiple proteins, making it valuable for researchers in molecular biology and bioinformatics.
vcmsa
clairemcwhite/vcmsa
vcmsa is a Python library that builds multiple sequence alignments (MSAs) by clustering vectors from protein language models. It can align sets of proteins that have conserved functions or structures but poorly conserved sequences, making it useful for protein design and analysis.
learnMSA
Gaius-Augustus/learnMSA
learnMSA is a tool for deep multiple alignment of protein sequences that combines large language models with hidden Markov models. It can align millions of protein sequences with high accuracy and uses GPU acceleration.
de-stress
wells-wood-research/de-stress
DE-STRESS is a web application that provides tools for evaluating protein designs, making the process more reliable and accessible. It allows users to select promising protein designs for laboratory testing and includes functionalities for batch processing of protein structures.
paccmann_datasets
PaccMann/paccmann_datasets
Pytoda, the Python package developed in the paccmann_datasets repository, simplifies the handling of biochemical data for deep learning applications using PyTorch. It is particularly useful for researchers working on molecular design and related tasks in computational chemistry.
mmCIF2BioLiP
kad-ecoli/mmCIF2BioLiP
The mmCIF2BioLiP repository provides a web interface and scripts for curating the BioLiP database, which contains biologically relevant ligand-protein interactions. It facilitates the download and organization of data from the PDB, including binding affinities and other molecular information, making it a valuable resource for researchers in molecular biology and drug discovery.
PSALM
Protein-Sequence-Annotation/PSALM
PSALM is a tool designed for protein sequence annotation using advanced language models. It allows users to scan protein sequences for functional domains and provides a framework for training and evaluating models on protein data.
antiberty-pytorch
dohlee/antiberty-pytorch
The antiberty-pytorch repository provides an unofficial re-implementation of the AntiBERTy model, which is designed to analyze and predict properties of antibody sequences using a language model approach. It includes a dataset preparation pipeline for working with observed antibody sequences, facilitating research in antibody affinity maturation.
DiffPALM
Bitbol-Lab/DiffPALM
DiffPALM is a tool for pairing interacting protein sequences using masked language modeling. It employs a differentiable pairing method to optimize multiple sequence alignments, which is crucial for understanding protein interactions and structures.
lightdock-python2.7
lightdock/lightdock-python2.7
LightDock is a docking framework that utilizes the Glowworm Swarm Optimization algorithm to facilitate protein-protein, protein-peptide, and protein-DNA docking. It allows users to define custom scoring functions and supports various simulation options, making it a versatile tool for molecular docking studies.
AlphaFind
Coda-Research-Group/AlphaFind
AlphaFind is a web-based search engine that enables users to discover structural similarities among proteins in the AlphaFold Protein Structure Database. It accepts various protein identifiers as input and provides similarity metrics and 3D visualizations of protein structures.
mpek
kotori-y/mpek
MPEK is a multi-task learning tool that predicts enzyme turnover number (kcat) and Michaelis-Menten constant (Km) using enzyme sequences and substrate SMILES. It aims to enhance the evaluation of enzymatic efficiency and supports applications in biocatalysis and drug discovery.
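For readers unfamiliar with the two predicted quantities, the standard Michaelis-Menten rate law shows how kcat and Km jointly determine reaction velocity, v = kcat·[E]·[S] / (Km + [S]), with kcat/Km commonly reported as catalytic efficiency. The sketch below is not part of MPEK: the function names are hypothetical and the numeric values are made up purely for illustration.

```python
def michaelis_menten_rate(kcat: float, km: float,
                          enzyme_conc: float, substrate_conc: float) -> float:
    """Reaction velocity v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat: float, km: float) -> float:
    """Catalytic efficiency kcat / Km."""
    return kcat / km


# Illustrative values only: kcat in 1/s, concentrations and Km in mM.
kcat, km = 100.0, 0.5
enzyme_conc = 0.001

# When [S] equals Km, the velocity is half of Vmax = kcat * [E].
v_at_km = michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc=km)
v_max = kcat * enzyme_conc
efficiency = catalytic_efficiency(kcat, km)
```

Because v depends on both parameters, predicting kcat and Km together gives a fuller picture of enzymatic efficiency than either value alone.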
MULAN
DFrolova/MULAN
MULAN is a multimodal protein language model that encodes both sequence and structural information of proteins. It utilizes pre-trained models to enhance protein representations, making it suitable for various downstream tasks in molecular biology.