Help & Documentation

Learn how to use PLATE-VS for molecular similarity search

What is PLATE-VS?

PLATE-VS (Protein-Ligand Affinity-based Target Evaluation - Virtual Screening) is a resource for developing and rigorously evaluating modern virtual screening and protein–ligand ML models. It bridges the structure–activity gap: unlike structure-only resources (e.g., PDB) or structure-centric benchmarks (e.g., PDBbind), PLATE-VS links protein–ligand complexes with target–ligand affinity measurements mined from ChEMBL, enabling datasets that reflect both binding geometry and bioactivity.

Key Features:

  • Bridges the structure–activity gap: Links protein–ligand complexes with target–ligand affinity measurements from ChEMBL, enabling datasets that reflect both binding geometry and bioactivity.
  • Makes ChEMBL training-ready via data curation: Applies strict, metadata-aware filtering and harmonization to remove unreliable or inconsistent measurements, yielding internally consistent activity panels better suited for supervised learning and reproducible benchmarking.
  • Enables generalization testing beyond temporal splits: Clusters data by ligand chemical similarity and protein binding pocket similarity, supporting stratified train/validation/test splits and performance reporting by similarity bins—directly measuring how well models generalize to new scaffolds, new pockets, or both.
  • Supports realistic virtual screening with stronger decoys: Provides property-matched decoys for actives using DeepCoy-based generation, which better matches physicochemical properties while enforcing distinct shapes, improving over classical decoy strategies for fairer enrichment evaluation.

Affinity Database

https://www.drugbench.org/

Browse all binding affinity data in the Affinity Database on the home page. The database contains molecular information including protein IDs, molecule IDs, SMILES strings, and binding affinity measurements. You can filter, select specific records, and download selected or all data as CSV, CIF, and SDF files.

Search Molecule

https://www.drugbench.org/search-module

Navigate to the Search Molecule page to search for molecules similar to your query SMILES string. Lowering the similarity threshold returns more matches; raising it returns fewer, closer matches.
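
To illustrate how a similarity threshold changes the result set, here is a minimal Python sketch. It assumes similarity is a Tanimoto coefficient over fingerprint bits, which is a common choice but is not confirmed here as the metric PLATE-VS uses. The toy fingerprints, the molecule names, and the `search` helper are all hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def search(query: set, library: dict, threshold: float) -> list:
    """Return (name, score) pairs scoring at or above threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Toy fingerprints (assumption): each set holds the indices of the "on" bits.
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3, 9},
    "mol_c": {7, 8, 9},
}
query = {1, 2, 3, 4}

strict = search(query, library, threshold=0.9)  # only near-identical molecules
loose = search(query, library, threshold=0.5)   # also admits partial overlaps
```

In this toy library the strict threshold keeps only mol_a, while the looser one also admits mol_b, which shares three of five distinct bits with the query.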

Dataset Download

https://www.drugbench.org/dataset-download

The Dataset Download page provides downloadable 2D similarity matrices with a clustering visualization. You can explore protein and ligand similarity relationships and download data for specific similarity-threshold tranches.

Network View

https://www.drugbench.org/network-view

The Network View page visualizes molecular similarity as an interactive network graph:

  • First level: The top 5 most similar molecules to your query
  • Second level: The top 5 neighbors for each first-level molecule
  • Edges: Thickness represents similarity strength
  • Nodes: Click on any node to visualize the molecule structure
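
The two-level layout described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the site's implementation: the `top_k` and `build_network` helpers and the toy similarity scores are hypothetical, and `k` is set to 2 rather than 5 to keep the example small.

```python
def top_k(node, similarity, k=5):
    """Return the k nodes most similar to `node`, excluding the node itself."""
    scores = similarity[node]
    ranked = sorted((n for n in scores if n != node),
                    key=lambda n: scores[n], reverse=True)
    return ranked[:k]


def build_network(query, similarity, k=5):
    """Collect weighted edges for the query's neighbors and their neighbors."""
    edges = {}
    first_level = top_k(query, similarity, k)
    for node in first_level:
        edges[frozenset((query, node))] = similarity[query][node]
    for node in first_level:
        for neighbor in top_k(node, similarity, k):
            # setdefault keeps an edge that was already added from the other side
            edges.setdefault(frozenset((node, neighbor)),
                             similarity[node][neighbor])
    return first_level, edges


# Toy pairwise scores (assumption); mirrored below so lookups work both ways.
pairs = {
    ("q", "a"): 0.9, ("q", "b"): 0.8, ("q", "c"): 0.3, ("q", "d"): 0.1,
    ("a", "b"): 0.7, ("a", "c"): 0.95, ("a", "d"): 0.2,
    ("b", "c"): 0.4, ("b", "d"): 0.5, ("c", "d"): 0.35,
}
similarity = {name: {} for name in "qabcd"}
for (u, v), score in pairs.items():
    similarity[u][v] = score
    similarity[v][u] = score

first_level, edges = build_network("q", similarity, k=2)
```

Each edge weight is the similarity score, which a renderer could map to line thickness. Whether the real view excludes the query or first-level molecules from the second level is not stated, so this sketch leaves them in and simply avoids duplicate edges.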

Custom Download

https://www.drugbench.org/custom-download

The Custom Download page allows you to filter and download binding affinity data. You can filter by:

  • Protein ID: Select a specific protein from the dropdown
  • Molecule ID: Enter a ChEMBL molecule ID
  • SMILES: Enter a SMILES string (supports partial matching)
  • Type: Filter by standard type (e.g., IC50, Ki, EC50)
  • Value: Filter by standard value with operators (=, >, <, >=, <=)
  • Units: Filter by measurement units (e.g., nM, μM, M)
  • pChEMBL: Filter by pChEMBL value with operators
  • Assay ID: Enter a ChEMBL assay ID
  • Target ID: Enter a ChEMBL target ID
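
As a rough picture of how such filters combine, the Python sketch below applies a few criteria to in-memory records. It assumes that all criteria must hold at once (logical AND), which is not confirmed for the web interface. The field names, the sample records, and the `matches` helper are hypothetical.

```python
import operator

# Comparison operators offered by the Value and pChEMBL filters.
OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def matches(record, criteria):
    """Return True if the record satisfies every criterion (logical AND)."""
    for field, op, wanted in criteria:
        value = record.get(field)
        if value is None:
            return False
        if op == "contains":  # partial matching, as described for SMILES
            if wanted not in value:
                return False
        elif not OPERATORS[op](value, wanted):
            return False
    return True


# Hypothetical records; field names are illustrative only.
records = [
    {"molecule_id": "M1", "smiles": "CC(=O)Oc1ccccc1C(=O)O",
     "type": "IC50", "value": 50.0, "units": "nM", "pchembl": 7.3},
    {"molecule_id": "M2", "smiles": "c1ccccc1O",
     "type": "Ki", "value": 2000.0, "units": "nM", "pchembl": 5.7},
    {"molecule_id": "M3", "smiles": "CCOc1ccccc1",
     "type": "IC50", "value": 8.0, "units": "nM", "pchembl": 8.1},
]

criteria = [
    ("type", "=", "IC50"),
    ("pchembl", ">=", 7.0),
    ("smiles", "contains", "c1ccccc1"),
]
selected = [r["molecule_id"] for r in records if matches(r, criteria)]
```

Here M2 is dropped because its type is Ki and its pChEMBL is below 7.0, leaving M1 and M3.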

When you click the Custom Download button, the system will:

  • Filter the data based on your criteria
  • Automatically download the filtered results as CSV and SDF (ZIP) files
  • Display the filtered results in a paginated table

The table shows molecular information including protein IDs, molecule IDs, SMILES strings, binding affinity measurements, and ligand structure visualizations.

PLATE-VS API Client

https://github.com/AaronXu9/plate-vs-client

A Python client library is available for programmatic access to PLATE-VS web services. The client provides convenient methods to search and download data without using the web interface.

Installation:

Install directly from GitHub:

pip install git+https://github.com/AaronXu9/plate-vs-client.git

Quick Start:

from platevs_client import PlateVSClient

# Initialize the client
client = PlateVSClient(output_dir="./downloads")

# Check service status
status = client.check_service_status()

# Search by UniProt ID
df = client.get_protein_ligands("P00533")  # EGFR

# Search by SMILES (Similarity)
df = client.search_by_smiles("CC(=O)Oc1ccccc1C(=O)O", exact_match=False)

# Download similarity matrix data
csv_path = client.download_similarity_matrix_csv(0.9, qcov_level=100)
sdf_path = client.download_similarity_sdf(0.9)

Key Features:

  • Search by UniProt ID: Query affinity data for a specific protein
  • Search by SMILES: Query affinity data for a specific compound (exact or similarity search)
  • Download Similarity Matrix Data: Download CSV and SDF files for given similarity thresholds
  • Bulk Downloads: Download data for multiple thresholds at once

Available Methods:

  • search_by_uniprot(uniprot_id, page, limit) - Search affinity data by UniProt ID (JSON)
  • get_protein_ligands(uniprot_id) - Get ligands DataFrame for a protein (CSV download)
  • search_by_smiles(smiles, exact_match) - Search affinity data by SMILES
  • download_affinity_data(query, query_type) - Download affinity data to CSV
  • download_similarity_matrix_csv(threshold, qcov_level) - Download similarity CSV
  • download_similarity_sdf(threshold) - Download similarity SDF files (tar.gz)
  • download_all_similarity_data(thresholds, qcov_level) - Bulk download for multiple thresholds
  • check_service_status() - Check if services are accessible
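
When querying several proteins in a row, it can help to wrap get_protein_ligands in a small loop that records failures instead of stopping at the first one. The sketch below is one way to do that. The `collect_ligands` helper and the `FakeClient` stand-in are hypothetical, and whether the real client raises an exception or returns an empty result for an unknown ID is an assumption.

```python
def collect_ligands(client, uniprot_ids):
    """Call get_protein_ligands for each ID; return (results, failures)."""
    results, failures = {}, {}
    for uniprot_id in uniprot_ids:
        try:
            results[uniprot_id] = client.get_protein_ligands(uniprot_id)
        except Exception as exc:  # the client's exception types are not documented here
            failures[uniprot_id] = str(exc)
    return results, failures


class FakeClient:
    """Stand-in for PlateVSClient so the sketch runs without network access."""

    def get_protein_ligands(self, uniprot_id):
        if uniprot_id == "BAD_ID":
            raise ValueError("unknown UniProt ID")
        # The real method is documented as returning a DataFrame.
        return [f"ligand-of-{uniprot_id}"]


results, failures = collect_ligands(FakeClient(), ["P00533", "BAD_ID"])
```

With the real library, pass PlateVSClient(output_dir="./downloads") in place of FakeClient().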

For more information and examples, visit the plate-vs-client GitHub repository.

Need More Help?

If you need additional assistance, please reach out to the development team at katritchl@gmail.com.