LoGoBERT-PPI (X-species Eukaryote)

LoGoBERT-PPI is a protein–protein interaction (PPI) prediction model built on top of the ESM-2 protein language model and a late-interaction MaxSim mechanism.

The model is trained on human protein–protein interaction data and evaluated on cross-species eukaryotic datasets including mouse, fly, yeast, and worm.

LoGoBERT-PPI is designed for scalable proteome-wide interaction inference while preserving localized interaction signals between protein sequences.


Model Overview

LoGoBERT-PPI combines:

  • Global sequence-level representations from ESM-2 using mean pooling
  • A residue-level late-interaction signal computed via a MaxSim operation inspired by ColBERT

This hybrid design captures localized binding patterns between protein sequences while remaining computationally efficient for proteome-scale inference.
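The two components above can be sketched in plain PyTorch. The snippet below is an illustrative reimplementation, not the released code: the tensor shapes, masking details, and cosine normalization are assumptions about how a mean-pooled global signal and a ColBERT-style MaxSim residue signal are typically computed.

```python
import torch

def mean_pool(hidden, mask):
    """Global representation: mask-aware mean over residue embeddings."""
    mask = mask.unsqueeze(-1).float()               # (B, L, 1)
    return (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

def maxsim(h_a, mask_a, h_b, mask_b):
    """ColBERT-style late interaction: for each residue of A, take its
    maximum cosine similarity against all residues of B, then average."""
    a = torch.nn.functional.normalize(h_a, dim=-1)
    b = torch.nn.functional.normalize(h_b, dim=-1)
    sim = a @ b.transpose(1, 2)                     # (B, La, Lb) residue-residue similarities
    sim = sim.masked_fill(~mask_b.bool().unsqueeze(1), float("-inf"))
    best = sim.max(dim=2).values                    # best match in B per residue of A
    best = best.masked_fill(~mask_a.bool(), 0.0)    # ignore padded residues of A
    return best.sum(1) / mask_a.float().sum(1).clamp(min=1e-9)

# Toy check with random "residue embeddings"
h_a, h_b = torch.randn(1, 5, 8), torch.randn(1, 7, 8)
m_a, m_b = torch.ones(1, 5), torch.ones(1, 7)
print(mean_pool(h_a, m_a).shape)          # torch.Size([1, 8])
print(maxsim(h_a, m_a, h_b, m_b).shape)   # torch.Size([1])
```

Because the residue embeddings of each protein can be computed once and reused, the expensive encoder runs scale with the number of proteins rather than the number of pairs; only the cheap MaxSim comparison is pairwise.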


Available Checkpoints

  • model.safetensors — trained on human PPIs and evaluated on cross-species eukaryotic datasets (Mouse, Fly, Yeast, Worm)

Requirements

  • Python >= 3.9
  • torch
  • transformers
  • huggingface_hub

Install dependencies:

pip install torch transformers huggingface_hub


Usage

import torch
from transformers import AutoTokenizer

# LoGo_BERT is defined in model.py, which ships with the model repository
from model import LoGo_BERT

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the trained checkpoint from the Hugging Face Hub
model = LoGo_BERT.from_pretrained("hbeen/LoGoBERT-PPI-Eukaryote")
model = model.to(device)
model.eval()

# Tokenizer of the underlying ESM-2 backbone
tok = AutoTokenizer.from_pretrained("facebook/esm2_t33_650M_UR50D")

# Example pair of protein sequences
seqA = "MDKKSARIRRATRARRKLQELGATRLVVHRTPRHIYAQVIAPNGSEVLVAASTVEKAIAEQLKYTGNKDAAAAVGKAVAERALEKGIKDVSFDRSGFQYHGRVQALADAAREAGLQF"
seqB = "MAVVKCKPTSPGRRHVVKVVNPELHKGKPFAPLLEKNSKSGGRNNNGRITTRHIGGGHKQAYRIVDFKRNKDGIPAVVERLEYDPNRSANIALVLYKDGERRYILAPKGLKAGDQIQSGVDAAIKPGNTLPMRNIPVGSTVHNVEMKPGKGGQLARSAGTYVQIVARDGAYVTLRLRSGEMRKVEADCRATLGEVGNAEHMLRVLGKAGAARWRGVRPTVRGTAMNPVDHPHGGGEGRNFGKHPVTPWGVQTKGKKTRSNKRTDKFIVRRRSK"

# Tokenize each sequence separately and move the tensors to the device
input_a = tok(seqA, return_tensors="pt")
input_b = tok(seqB, return_tensors="pt")

input_a = {k: v.to(device) for k, v in input_a.items()}
input_b = {k: v.to(device) for k, v in input_b.items()}

# Forward pass: the model scores the sequence pair for interaction
with torch.no_grad():
    prob = model(input_a, input_b)

print(prob)
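For screening many proteins, the same per-pair call can be wrapped in an all-vs-all loop. The helpers below (`score_pair`, `all_vs_all`) are hypothetical conveniences, not part of the released API; they simply repeat the tokenize-and-score steps from the example above for every unordered pair.

```python
import itertools
import torch

def score_pair(model, tok, seq_a, seq_b, device):
    """Hypothetical wrapper: tokenize two sequences and return the
    model's predicted interaction score for the pair."""
    a = {k: v.to(device) for k, v in tok(seq_a, return_tensors="pt").items()}
    b = {k: v.to(device) for k, v in tok(seq_b, return_tensors="pt").items()}
    with torch.no_grad():
        return model(a, b)

def all_vs_all(model, tok, seqs, device):
    """Score every unordered pair of sequences in `seqs`."""
    return {
        (i, j): score_pair(model, tok, seqs[i], seqs[j], device)
        for i, j in itertools.combinations(range(len(seqs)), 2)
    }
```

In practice you would cache tokenized inputs (or precomputed embeddings) per protein rather than re-tokenizing inside the loop.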