tools tagged “bioinformatics”
Bacformer
macwiatrak/Bacformer
Bacformer is a foundation model that processes whole bacterial genomes as sequences of proteins, using protein embeddings for tasks such as essential gene prediction and operon identification.
arpeggio
harryjubb/arpeggio
Arpeggio calculates and visualizes interatomic interactions in molecular structures, particularly proteins. It applies the CREDO rules to analyze contacts and interactions.
manyfold
instadeepai/manyfold
ManyFold is a library that facilitates the training and validation of protein folding models, including AlphaFold and pLMFold. It allows users to generate and optimize protein structures using advanced machine learning techniques.
PXMeter
bytedance/PXMeter
PXMeter is a toolkit for assessing the structural quality of biomolecular predictions, including proteins and small molecules. It provides multi-metric evaluations and supports both command line and Python API interfaces for efficient analysis.
SeqVec
mheinzinger/SeqVec
SeqVec is a tool that creates embeddings for amino acid sequences using deep learning techniques. It enables the prediction of various protein properties and functions from single protein sequences, improving upon traditional methods that rely on evolutionary information.
protein-localization
HannesStark/protein-localization
This repository provides a method for predicting the subcellular location of proteins using transformer protein embeddings and a light attention mechanism. It includes tools for training models and making predictions from protein sequences.
S-PLM
duolinwang/S-PLM
S-PLM is a structure-aware protein language model that uses contrastive learning to integrate sequence and structural information when generating protein embeddings. It supports downstream tasks such as enzyme classification and protein structure prediction.
PrismNet
kuixu/PrismNet
PrismNet is a deep learning framework designed to predict dynamic cellular protein-RNA interactions by utilizing in vivo RNA structure. It includes scripts for training models, evaluating performance, and preparing datasets for research in molecular biology.
Revisiting-PLMs
elttaes/Revisiting-PLMs
This repository explores evolution-aware protein language models for predicting protein function. It provides datasets related to metal ion binding and antibiotic resistance.
protlearn
tadorfer/protlearn
protlearn is a Python package for extracting features from amino acid sequences. It is organized into preprocessing, feature computation, and dimensionality reduction stages.
pLM-BLAST
labstructbioinf/pLM-BLAST
pLM-BLAST detects remote homology between proteins by comparing embeddings generated by protein language models. It lets users search databases of protein sequences and visualize the results.
proteinsolver
ostrokach/proteinsolver
ProteinSolver is a deep neural network tool that generates amino acid sequences intended to fold into specific protein structures. It uses graph neural networks to address protein design problems.
CodonFM
NVIDIA-Digital-Bio/CodonFM
CodonFM is an open-source suite of foundation models trained on codon sequences to learn contextual representations for downstream tasks, including mutation prediction and evaluation of translation efficiency. It provides pre-trained models and tools for working with protein-coding sequences.
PLM-interact
liudan111/PLM-interact
PLM-interact is a tool that extends protein language models to predict protein-protein interactions. It utilizes a novel method to jointly encode protein pairs, enhancing the prediction of their interactions based on their sequences.
py-rcsb-api
rcsb/py-rcsb-api
py-rcsb-api is a Python toolkit that streamlines access to the RCSB Protein Data Bank's API services. It lets users run complex queries to retrieve structural data about proteins and other macromolecules.
Cryo-IEF
westlake-repl/Cryo-IEF
Cryo-IEF is a foundation model for cryo-electron microscopy (cryo-EM) image processing that enables classification and quality assessment of cryo-EM particle images. It supports the development of automated pipelines for analyzing biological macromolecules.
sirius-libs
sirius-ms/sirius-libs
SIRIUS is a framework designed for metabolomics mass spectrometry, enabling the identification of molecular formulas for small molecules. It includes various modules for isotope pattern analysis, fragmentation tree computation, and compound class prediction.
lemon
chopralab/lemon
Lemon is a framework that allows users to rapidly mine structural information from the Protein Data Bank. It enables the creation of standardized workflows for querying 3D features of macromolecules, enhancing the efficiency of structural biology research.
Porter5
mircare/Porter5
Porter5 is a tool for fast and accurate prediction of protein secondary structure in both 3 and 8 classes, using machine learning to improve prediction quality.
EasIFA
wangxr0526/EasIFA
EasIFA is a tool that utilizes multi-modal deep learning to efficiently annotate enzymatic active sites. It allows users to upload enzyme structures and provides predictions on active site positions and categories, enhancing the understanding of enzyme functionality.
PST
BorgwardtLab/PST
The Protein Structure Transformer (PST) enhances pretrained protein language models by incorporating structural knowledge, allowing for the extraction of protein structure representations. It can be used for various protein function prediction tasks and is trained on datasets like AlphaFold SwissProt.
SPROF-GO
biomed-AI/SPROF-GO
SPROF-GO is a tool for predicting protein functions from sequences using a pretrained language model and homology-based label diffusion. It offers fast and accurate predictions and includes datasets and models for users interested in reproducing the results.
TMbed
BernhoferM/TMbed
TMbed predicts transmembrane proteins and their segments using embeddings generated by a protein language model. It lets users generate both predictions and embeddings for protein sequences.
paccmann_proteomics
PaccMann/paccmann_proteomics
PaccMann Proteomics provides a framework for protein language modeling using transformer architectures to predict protein classification and binding interactions. It utilizes self-supervised learning techniques to handle unlabeled protein sequences and offers pre-trained models and datasets for various protein-related tasks.