tools tagged “representation”
obsidian-chem
Acylation/obsidian-chem
Obsidian Chem is a plugin for Obsidian.md that renders chemical structures from SMILES strings, so molecular structures can be visualized directly within chemistry notes.
ESM-GearNet
DeepGraphLearning/ESM-GearNet
ESM-GearNet is a codebase for joint representation learning on protein sequences and structures, combining sequence and structure encoders to enhance protein representation. It includes pre-training techniques and is designed for tasks related to protein structure analysis.
SELFormer
HUBioDataLab/SELFormer
SELFormer is a molecular representation learning tool that uses SELFIES language models to generate molecular embeddings. It is pre-trained on drug-like compounds and fine-tuned for molecular property prediction tasks in drug discovery and cheminformatics.
prose
tbepler/prose
The ProSE repository provides multi-task and masked language model-based protein sequence embedding models. It allows users to train models and embed protein sequences, facilitating research in protein structure and function analysis.
DECIMER-Image-to-SMILES
Kohulan/DECIMER-Image-to-SMILES
DECIMER-Image-to-SMILES is a tool that uses an encoder-decoder neural network to recognize chemical structures in images and convert them into SMILES notation, which can then be used for further cheminformatics analysis.
ProtFlash
ISYSLAB-HUST/ProtFlash
ProtFlash is a lightweight protein language model that generates embeddings for protein sequences. It provides pretrained models for protein representation learning in bioinformatics applications.
PINNACLE
mims-harvard/PINNACLE
PINNACLE is a geometric deep learning model designed to generate contextualized representations of proteins based on their interactions across different cell types and tissues. It aims to improve the understanding of protein functions and therapeutic potentials by incorporating biological context into its modeling approach.
AtomsBase.jl
JuliaMolSim/AtomsBase.jl
AtomsBase.jl is a Julia package that serves as an abstract interface for representing atomic geometries. It aims to enhance interoperability among molecular simulation engines and tools for computing chemical properties.
molencoder
cxhernandez/molencoder
MolEncoder is a molecular autoencoder implemented in PyTorch for molecular representation and generation. It includes utilities for downloading molecular datasets and training models on them, for use in computational chemistry and molecular biology.
pysmilesutils
MolecularAI/pysmilesutils
PySMILESutils is a package for handling SMILES encodings of molecules in deep learning applications with PyTorch. It includes features for data augmentation, dataset handling, and mini-batch creation for molecular machine learning tasks.
GraphLoG
DeepGraphLearning/GraphLoG
GraphLoG is a tool for self-supervised graph-level representation learning, with applications in the chemistry domain. It provides pre-training and fine-tuning capabilities for models that can be used to analyze molecular data.
S-PLM
duolinwang/S-PLM
S-PLM is a structure-aware protein language model that uses contrastive learning to integrate sequence and structural information when generating protein embeddings. It supports downstream tasks such as enzyme classification and protein structure prediction.
molml
crcollins/molml
MolML is a Python library designed to interface molecules with machine learning by converting molecular structures into vector representations. It supports various molecular descriptors and is aimed at facilitating the application of machine learning techniques in predicting molecular properties.
pLM-BLAST
labstructbioinf/pLM-BLAST
pLM-BLAST is a tool for detecting remote homology in proteins by comparing embeddings generated from protein language models. It allows users to search databases of protein sequences and visualize the results.
PST
BorgwardtLab/PST
The Protein Structure Transformer (PST) enhances pretrained protein language models by incorporating structural knowledge, allowing for the extraction of protein structure representations. It can be used for various protein function prediction tasks and is trained on datasets like AlphaFold SwissProt.
SiamDiff
DeepGraphLearning/SiamDiff
SiamDiff is a codebase for a diffusion-based pre-training algorithm that enhances protein structure encoders. It improves performance on tasks such as protein-protein interaction prediction and function annotation by learning effective representations from protein sequences and structures.
biomed-multi-view
BiomedSciAI/biomed-multi-view
The biomed-multi-view repository features the Multi-view Molecular Embedding with Late Fusion (MMELON) architecture, which aggregates molecular representations from images, graphs, and text to enhance predictions of molecular properties. It is applicable to various tasks including ligand-protein binding and molecular solubility, utilizing a large dataset of molecules for training and evaluation.
esm-s
DeepGraphLearning/esm-s
The ESM-S repository provides a structure-informed protein language model that enhances the learning of protein representations by integrating structural information without requiring explicit protein structures. It is designed for tasks such as remote homology detection and function prediction in proteins.
simg
gomesgroup/simg
The SIMG repository provides an approach to molecular representation that infuses stereoelectronic effects into molecular graphs. This enhances the performance of molecular machine learning models and enables the evaluation and design of complex molecular systems, including proteins.
kanzi
rdilip/kanzi
Kanzi is a tool for modeling biological structures through discrete tokenization of proteins. It utilizes flow autoencoders to efficiently encode and decode protein structures, facilitating further applications in protein design and molecular representation.
unimol_tools
deepmodeling/unimol_tools
Uni-Mol Tools is an easy-to-use auto-ML tool for predicting molecular properties and generating molecular representations. Users can train classification and regression models on molecular data supplied as SMILES or other formats.
neuraldecipher
bayer-science-for-a-better-life/neuraldecipher
Neuraldecipher implements a deep learning method for reverse-engineering extended-connectivity fingerprints (ECFPs) back to their corresponding molecular structures, for use in cheminformatics applications.
cuik-molmaker
NVIDIA-Digital-Bio/cuik-molmaker
cuik-molmaker is a specialized package for molecular featurization that transforms chemical structures into formats compatible with deep learning models, particularly graph neural networks. It combines C++ and Python for efficient processing and is designed to facilitate the training and inference workflows in molecular machine learning.
AlphaSeq_Antibody_Dataset
mit-ll/AlphaSeq_Antibody_Dataset
AlphaSeq_Antibody_Dataset contains two datasets with quantitative binding scores of scFv-format antibodies against a SARS-CoV-2 target peptide. It is designed to support protein representation learning and includes data for machine learning optimization of antibody candidates.