API Endpoints

Complete reference for all ChemAudit API endpoints, organized by feature.

Health and Configuration

GET /health

Health check endpoint. Returns system status and RDKit version.

Response:

{
  "status": "healthy",
  "app_name": "ChemAudit",
  "app_version": "1.0.0",
  "rdkit_version": "2025.09.3"
}

GET /config

Get public application configuration including deployment limits.

Response:

{
  "app_name": "ChemAudit",
  "app_version": "1.0.0",
  "deployment_profile": "medium",
  "limits": {
    "max_batch_size": 10000,
    "max_file_size_mb": 500,
    "max_file_size_bytes": 524288000
  }
}
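
The limits in this response can be used to reject an oversized upload on the client before any bytes are sent. The following is a minimal sketch, not part of ChemAudit: `upload_problems` is a hypothetical helper, `CONFIG` is the sample response above, and treating the limits as inclusive is an assumption.

```python
# Sample GET /config response, copied from the reference above.
CONFIG = {
    "app_name": "ChemAudit",
    "app_version": "1.0.0",
    "deployment_profile": "medium",
    "limits": {
        "max_batch_size": 10000,
        "max_file_size_mb": 500,
        "max_file_size_bytes": 524288000,
    },
}


def upload_problems(config: dict, n_molecules: int, file_size_bytes: int) -> list[str]:
    """Return the reasons an upload would exceed the deployment limits.

    Hypothetical helper. Assumes a value equal to a limit is still allowed.
    """
    limits = config["limits"]
    problems = []
    if n_molecules > limits["max_batch_size"]:
        problems.append(
            f"{n_molecules} molecules exceeds max_batch_size={limits['max_batch_size']}"
        )
    if file_size_bytes > limits["max_file_size_bytes"]:
        problems.append(
            f"{file_size_bytes} bytes exceeds "
            f"max_file_size_bytes={limits['max_file_size_bytes']}"
        )
    return problems
```

An empty list means the upload fits. Note that the sample values are consistent with each other: 500 MB is 500 × 1024 × 1024 = 524288000 bytes.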

Validation

POST /validate

Validate a single molecule. Results are cached by InChIKey for 1 hour.

Request:

{
  "molecule": "CCO",
  "format": "auto",
  "checks": ["all"],
  "preserve_aromatic": false
}

| Field | Type | Required | Description |
|---|---|---|---|
| molecule | string | Yes | SMILES, InChI, or MOL block (max 10,000 chars) |
| format | string | No | auto, smiles, inchi, mol (default: auto) |
| checks | array | No | Specific checks to run (default: ["all"]) |
| preserve_aromatic | bool | No | Keep aromatic notation in output SMILES (default: false) |

Response:

{
  "status": "completed",
  "molecule_info": {
    "input_smiles": "CCO",
    "canonical_smiles": "CCO",
    "inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    "inchikey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
    "molecular_formula": "C2H6O",
    "molecular_weight": 46.07,
    "num_atoms": 3
  },
  "overall_score": 95,
  "issues": [],
  "all_checks": [
    {
      "check_name": "valence_check",
      "passed": true,
      "severity": "critical",
      "message": "All atoms have valid valence"
    }
  ],
  "execution_time_ms": 12,
  "cached": false
}
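
As an illustration of the request and response shapes above, the sketch below builds a POST /validate request with the standard library and extracts the names of failed checks from a response. `BASE_URL`, `build_validate_request`, and `failed_checks` are hypothetical names introduced here; the base URL in particular is an assumption and depends on your deployment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # assumption: replace with your deployment's URL


def build_validate_request(molecule, fmt="auto", checks=("all",), preserve_aromatic=False):
    """Build (but do not send) a POST /validate request."""
    if len(molecule) > 10_000:
        raise ValueError("molecule exceeds the documented 10,000-character limit")
    body = {
        "molecule": molecule,
        "format": fmt,
        "checks": list(checks),
        "preserve_aromatic": preserve_aromatic,
    }
    return urllib.request.Request(
        f"{BASE_URL}/validate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def failed_checks(response: dict) -> list[str]:
    """Return the check_name of every entry in all_checks that did not pass."""
    return [check["check_name"] for check in response["all_checks"] if not check["passed"]]
```

The request can then be sent with `urllib.request.urlopen(request)` and the body decoded with `json.load`.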

POST /validate/async

Validate using the Celery high-priority queue. Use when batch jobs are running.

Query Parameters:

| Param | Type | Description |
|---|---|---|
| timeout | int | Max wait time in seconds (1-60, default: 30) |

Request: Same as POST /validate

Response: Same as POST /validate, with an additional queue field.

GET /checks

List available validation checks grouped by category.

Response:

{
  "basic": ["valence_check", "kekulize_check", "sanitization_check"],
  "stereo": ["undefined_stereo_check", "stereo_consistency_check"],
  "representation": ["smiles_length_check", "inchi_generation_check"]
}

Structural Alerts

POST /alerts

Screen a molecule for structural alerts.

Request:

{
  "molecule": "c1ccc2c(c1)nc(n2)Sc3nnnn3C",
  "format": "smiles",
  "catalogs": ["PAINS"]
}

Response:

{
  "status": "completed",
  "alerts": [
    {
      "pattern_name": "thiazole_amine_A",
      "description": "Potential assay interference",
      "severity": "warning",
      "matched_atoms": [4, 5, 6, 7],
      "catalog_source": "PAINS_A",
      "smarts": "[#7]-c1nc2ccccc2s1"
    }
  ],
  "total_alerts": 1,
  "has_critical": false,
  "has_warning": true
}
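
A response like the one above can be condensed into per-severity and per-catalog counts, plus the set of atoms to highlight. This is a sketch only: `summarize_alerts` is a hypothetical helper, and `RESPONSE` is the sample response reduced to the fields it uses.

```python
from collections import Counter

# Sample POST /alerts response, reduced to the fields used below.
RESPONSE = {
    "status": "completed",
    "alerts": [
        {
            "pattern_name": "thiazole_amine_A",
            "severity": "warning",
            "matched_atoms": [4, 5, 6, 7],
            "catalog_source": "PAINS_A",
        }
    ],
    "total_alerts": 1,
}


def summarize_alerts(response: dict) -> dict:
    """Count alerts by severity and catalog, and collect all matched atom indices."""
    alerts = response["alerts"]
    return {
        "by_severity": dict(Counter(a["severity"] for a in alerts)),
        "by_catalog": dict(Counter(a["catalog_source"] for a in alerts)),
        # A set removes duplicates when several patterns hit the same atom.
        "flagged_atoms": sorted({i for a in alerts for i in a["matched_atoms"]}),
    }
```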

POST /alerts/quick-check

Fast check for presence of alerts (no detailed results).

Request: Same as POST /alerts

Response:

{
  "has_alerts": true,
  "checked_catalogs": ["PAINS"]
}

GET /alerts/catalogs

List available structural alert catalogs.

Response:

{
  "catalogs": {
    "PAINS": {
      "name": "PAINS",
      "description": "Pan-Assay Interference Compounds",
      "pattern_count": 480,
      "severity": "warning"
    }
  },
  "default_catalogs": ["PAINS"]
}

Scoring

POST /score

Calculate comprehensive molecular scores.

Request:

{
  "molecule": "CCO",
  "format": "smiles",
  "include": ["ml_readiness", "druglikeness", "admet"]
}

Available score types:

  • ml_readiness: ML-readiness score (0-100)
  • np_likeness: Natural product likeness
  • scaffold: Murcko scaffold extraction
  • druglikeness: Drug-likeness filters
  • safety_filters: Safety alerts summary
  • admet: ADMET predictions
  • aggregator: Aggregator likelihood

Response: See User Guide - Scoring for detailed response schemas.
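
Because `include` accepts only the score types listed above, a client may want to catch typos before sending the request. The sketch below does that; `build_score_body` is a hypothetical helper, and whether the server itself rejects unknown types is not stated in this reference.

```python
# Score types listed in this reference.
SCORE_TYPES = {
    "ml_readiness",
    "np_likeness",
    "scaffold",
    "druglikeness",
    "safety_filters",
    "admet",
    "aggregator",
}


def build_score_body(molecule: str, include: list[str], fmt: str = "smiles") -> dict:
    """Return a POST /score request body, rejecting unknown score types locally."""
    unknown = sorted(set(include) - SCORE_TYPES)
    if unknown:
        raise ValueError(f"unknown score types: {unknown}")
    return {"molecule": molecule, "format": fmt, "include": list(include)}
```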

Standardization

POST /standardize

Standardize a molecule using a ChEMBL-compatible pipeline.

Request:

{
  "molecule": "CC(=O)Oc1ccccc1C(=O)[O-].[Na+]",
  "format": "smiles",
  "options": {
    "include_tautomer": false,
    "preserve_stereo": true
  }
}

Response:

{
  "result": {
    "original_smiles": "CC(=O)Oc1ccccc1C(=O)[O-].[Na+]",
    "standardized_smiles": "CC(=O)OC1=CC=CC=C1C(O)=O",
    "success": true,
    "steps_applied": [...],
    "excluded_fragments": ["[Na+]"],
    "structure_comparison": {
      "mass_change_percent": -10.87
    }
  }
}

GET /standardize/options

Get available standardization options with descriptions.

Batch Processing

POST /batch/upload

Upload a file for batch processing.

Request: multipart/form-data

| Field | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | SDF, CSV, TSV, or TXT file |
| smiles_column | string | No | Column name for SMILES in CSV |
| name_column | string | No | Column name for molecule names/IDs |
| include_extended_safety | bool | No | Include NIH and ZINC filters |
| include_chembl_alerts | bool | No | Include ChEMBL pharma filters |
| include_standardization | bool | No | Run standardization pipeline |

Response:

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "total_molecules": 10000,
  "message": "Job submitted. Processing 10000 molecules."
}

POST /batch/detect-columns

Detect the columns in a delimited text file so that the SMILES and name columns can be selected.

GET /batch/{job_id}

Get batch job results with pagination and filtering.

Query Parameters:

| Param | Type | Description |
|---|---|---|
| page | int | Page number (default: 1) |
| page_size | int | Results per page (1-100, default: 50) |
| status_filter | string | Filter by success or error |
| min_score | int | Minimum validation score (0-100) |
| max_score | int | Maximum validation score (0-100) |
| sort_by | string | Sort field |
| sort_dir | string | Sort direction (asc, desc) |
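
The query parameters above can be assembled and range-checked on the client. The following sketch uses only the standard library; `batch_results_url` and the base URL passed to it are hypothetical, and the local checks simply mirror the documented ranges.

```python
from urllib.parse import urlencode

ALLOWED_FILTERS = {"status_filter", "min_score", "max_score", "sort_by", "sort_dir"}


def batch_results_url(base_url: str, job_id: str, page: int = 1, page_size: int = 50, **filters) -> str:
    """Build the GET /batch/{job_id} URL with pagination and optional filters."""
    if page < 1:
        raise ValueError("page must be at least 1")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    unknown = sorted(set(filters) - ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unknown query parameters: {unknown}")
    params = {"page": page, "page_size": page_size}
    # Skip filters explicitly set to None so they are left out of the query string.
    params.update({key: value for key, value in filters.items() if value is not None})
    return f"{base_url}/batch/{job_id}?{urlencode(params)}"
```

For example, `batch_results_url("http://localhost:8000", "abc", page=2, status_filter="error")` yields a URL requesting the second page of failed molecules.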

GET /batch/{job_id}/status

Lightweight status check for a batch job.

Response:

{
  "job_id": "550e8400-...",
  "status": "processing",
  "progress": 45.5,
  "processed": 455,
  "total": 1000,
  "eta_seconds": 68
}
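
A client typically polls this endpoint until the job finishes. The sketch below shows one way to structure that loop. `wait_for_job` is a hypothetical helper, `fetch_status` stands for any callable that returns the parsed status response, and the set of terminal status values is an assumption: this reference only shows "pending" and "processing".

```python
import time
from typing import Callable

# Assumption: these status names are not confirmed by the reference above.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 2.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job reaches a terminal state, then return that status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status["status"] in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The `eta_seconds` field could also be used to choose a longer interval for large jobs.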

GET /batch/{job_id}/stats

Get aggregate statistics for a batch job (without individual results).

DELETE /batch/{job_id}

Cancel a running batch job.

Export

GET /batch/{job_id}/export

Export batch results to a file.

Query Parameters:

| Param | Type | Required | Description |
|---|---|---|---|
| format | string | Yes | csv, excel, sdf, json, or pdf |
| score_min | int | No | Minimum validation score filter |
| score_max | int | No | Maximum validation score filter |
| status | string | No | Filter by success, error, or warning |
| indices | string | No | Comma-separated molecule indices |

Response: File download with appropriate Content-Disposition header

POST /batch/{job_id}/export

Export with molecule selection via request body (for large selections).

Request:

{
  "indices": [0, 1, 5, 23, 42]
}

The query parameters are the same as for the GET version.

Database Integrations

POST /integrations/pubchem/lookup

Search the PubChem database.

Request:

{
  "molecule": "CCO",
  "format": "smiles"
}

Response:

{
  "found": true,
  "cid": 702,
  "iupac_name": "ethanol",
  "synonyms": ["ethanol", "ethyl alcohol"],
  "url": "https://pubchem.ncbi.nlm.nih.gov/compound/702"
}

POST /integrations/chembl/bioactivity

Search ChEMBL for bioactivity data.

POST /integrations/coconut/lookup

Search the COCONUT natural products database.

API Key Management

All API key management endpoints require admin authentication via the X-Admin-Secret header.

POST /api-keys

Create a new API key.

Request:

{
  "name": "my-application",
  "description": "Key for production use",
  "expiry_days": 90
}

Response (201):

{
  "key": "the-full-api-key-shown-only-once",
  "name": "my-application",
  "created_at": "2026-02-01T00:00:00Z",
  "expires_at": "2026-05-02T00:00:00Z"
}
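
In the example above, `expires_at` is exactly `expiry_days` (90) days after `created_at`. The sketch below reproduces that arithmetic, assuming this relationship holds in general; `expected_expiry` is a hypothetical helper, not part of the API.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # format of the timestamps in the sample response


def expected_expiry(created_at: str, expiry_days: int) -> str:
    """Add expiry_days to a UTC timestamp and return it in the same format."""
    created = datetime.strptime(created_at, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)
    return (created + timedelta(days=expiry_days)).strftime(TIMESTAMP_FORMAT)


print(expected_expiry("2026-02-01T00:00:00Z", 90))  # prints 2026-05-02T00:00:00Z
```

Since the full key is shown only once, a client would normally store it securely together with this expiry date.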

GET /api-keys

List all API keys (metadata only, not the actual key values).

DELETE /api-keys/{key_id}

Revoke an API key. Returns 204 No Content on success.

Bookmarks

GET /bookmarks

List bookmarks with pagination and optional filters.

Query Parameters:

| Param | Type | Description |
|---|---|---|
| page | int | Page number (default: 1) |
| page_size | int | Results per page (1-200, default: 50) |
| tag | string | Filter by tag |
| search | string | SMILES substring search |
| source | string | Filter by source |

POST /bookmarks

Create a new bookmark. Returns 201.

Request:

{
  "smiles": "CCO",
  "name": "Ethanol",
  "tags": ["simple", "alcohol"],
  "notes": "Test molecule",
  "source": "single_validation"
}

PUT /bookmarks/{id}

Update bookmark tags and notes. Returns 404 if not found.

DELETE /bookmarks/{id}

Delete a single bookmark. Returns 204 on success.

DELETE /bookmarks/bulk

Bulk delete bookmarks by IDs. Query parameter: ids (list of integers). Returns 204.
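
A list-valued query parameter is commonly sent by repeating the key once per value. The sketch below builds such a URL, assuming the server accepts the repeated form `ids=1&ids=2`; this reference does not state the exact encoding, and `bulk_delete_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def bulk_delete_url(base_url: str, ids: list[int]) -> str:
    """Build the DELETE /bookmarks/bulk URL with one `ids` entry per bookmark.

    Assumption: the server expects the repeated-key form (ids=1&ids=2&...).
    """
    if not ids:
        raise ValueError("ids must contain at least one bookmark ID")
    # doseq=True expands the list into one key=value pair per element.
    return f"{base_url}/bookmarks/bulk?{urlencode({'ids': ids}, doseq=True)}"
```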

POST /bookmarks/batch-submit

Submit selected bookmarks as a new batch job.

Request:

{
  "bookmark_ids": [1, 2, 3]
}

History

GET /history

Paginated validation audit trail with filters.

Query Parameters:

| Param | Type | Description |
|---|---|---|
| page | int | Page number (default: 1) |
| page_size | int | Results per page (1-200, default: 50) |
| date_from | ISO8601 | Filter entries after this date |
| date_to | ISO8601 | Filter entries before this date |
| outcome | string | pass, warn, or fail |
| source | string | single or batch |
| smiles_search | string | SMILES substring match |

GET /history/stats

Get summary statistics: total validations, outcome distribution, source distribution.

Scoring Profiles

GET /profiles

List all scoring profiles (8 presets + user-created).

POST /profiles

Create a custom scoring profile. Returns 201.

GET /profiles/{id}

Get a single profile by ID.

PUT /profiles/{id}

Update a custom profile. Returns 400 if attempting to modify a preset.

DELETE /profiles/{id}

Soft-delete a custom profile. Returns 400 for presets. Returns 204 on success.

POST /profiles/{id}/duplicate

Duplicate any profile (including presets) as a new custom profile. Returns 201.

GET /profiles/{id}/export

Export a profile as JSON for sharing.

POST /profiles/import

Import a profile from JSON. Returns 201.

Batch Analytics

GET /batch/{job_id}/analytics

Get analytics status and cached results for a completed batch job.

POST /batch/{job_id}/analytics/{type}

Trigger an on-demand analytics computation. Types: scaffold, chemical_space, mmp, similarity_search, rgroup.

Optional body parameters:

| Param | Type | Used By |
|---|---|---|
| method | string | chemical_space (pca or tsne) |
| activity_column | string | mmp (property key for cliff detection) |
| query_smiles | string | similarity_search |
| query_index | int | similarity_search |
| top_k | int | similarity_search (default: 10) |
| core_smarts | string | rgroup (SMARTS pattern) |
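
Each body parameter applies to one analytics type only, so a client can check the combination before triggering a computation. The sketch below encodes the table above as a lookup. `analytics_request` is a hypothetical helper, and how the server treats parameters that do not belong to the chosen type is not documented here.

```python
# Body parameters accepted by each analytics type, taken from the table above.
ANALYTICS_PARAMS = {
    "scaffold": set(),
    "chemical_space": {"method"},
    "mmp": {"activity_column"},
    "similarity_search": {"query_smiles", "query_index", "top_k"},
    "rgroup": {"core_smarts"},
}


def analytics_request(job_id: str, analysis_type: str, **params):
    """Return (path, body) for POST /batch/{job_id}/analytics/{type}."""
    if analysis_type not in ANALYTICS_PARAMS:
        raise ValueError(f"unknown analytics type: {analysis_type}")
    unexpected = sorted(set(params) - ANALYTICS_PARAMS[analysis_type])
    if unexpected:
        raise ValueError(f"{analysis_type} does not use: {unexpected}")
    if "method" in params and params["method"] not in {"pca", "tsne"}:
        raise ValueError("method must be 'pca' or 'tsne'")
    return f"/batch/{job_id}/analytics/{analysis_type}", params
```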

Batch Subset Actions

POST /batch/{job_id}/subset/revalidate

Re-validate a subset as a new batch job. Body: {"indices": [0, 5, 12]}.

POST /batch/{job_id}/subset/rescore

Re-score a subset with a different profile. Body: {"indices": [0, 5], "profile_id": 3}.

POST /batch/{job_id}/subset/score-inline

Synchronous inline scoring. Body: {"indices": [0, 5], "profile_id": 4}.

POST /batch/{job_id}/subset/export

Export selected molecules only. Body: {"indices": [0, 5], "format": "csv"}.

Similarity

POST /validate/similarity

Calculate ECFP4 Tanimoto similarity between two molecules.

Request:

{
  "smiles_a": "CC(=O)Oc1ccccc1C(=O)O",
  "smiles_b": "CC(=O)Nc1ccc(O)cc1"
}
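
For reference, the Tanimoto coefficient of two fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below computes it on plain Python sets of on-bit indices. It illustrates the formula only: the fingerprinting itself (ECFP4, which corresponds to a Morgan fingerprint of radius 2) happens on the server, and the bit sets in the example are made up.

```python
def tanimoto(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen for this sketch; the ratio is undefined for two empty sets
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


# Made-up on-bit sets: 2 shared bits out of 6 distinct bits.
print(tanimoto({1, 2, 3, 4}, {3, 4, 5, 6}))
```

Identical fingerprints give 1.0 and fingerprints with no bits in common give 0.0.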

Permalinks

POST /permalinks

Create a shareable permalink for a batch report. Returns short_id, URL, and 30-day expiry.

GET /report/{short_id}

Resolve a permalink. Returns job data and snapshot. Returns 410 Gone if expired.

Session

DELETE /me/data

GDPR erasure: permanently deletes all bookmarks and history for the current session or API key.

Response:

{
  "status": "purged",
  "deleted": {"bookmarks": 42, "history": 150}
}

QSAR-Ready Pipeline

POST /qsar-ready/single

Process a single molecule through the QSAR-ready curation pipeline (standardize, strip salts, neutralize, canonicalize tautomers, check duplicates). Rate limit: 30/min.

Request:

{
  "smiles": "CC(=O)Oc1ccccc1C(=O)[O-].[Na+]",
  "config": {}
}

Response:

{
  "original_smiles": "CC(=O)Oc1ccccc1C(=O)[O-].[Na+]",
  "curated_smiles": "CC(=O)Oc1ccccc1C(=O)O",
  "status": "ok",
  "inchikey_changed": true,
  "steps": [...]
}

POST /qsar-ready/batch/upload

Upload a file for batch QSAR-ready processing (3/min). Accepts CSV, SDF, or pasted SMILES text.

GET /qsar-ready/batch/{job_id}/status

Check batch status (60/min). Returns job_id, status, progress, processed, total, eta_seconds.

GET /qsar-ready/batch/{job_id}/results

Get paginated results (30/min). Query params: page (default 1), per_page (1-500, default 50).

GET /qsar-ready/batch/{job_id}/download/{format}

Download curated results (10/min). Formats: csv, sdf, json.

Structure Filter

POST /structure-filter/filter

Filter molecules through a multi-stage property/substructure funnel (20/min). Returns results synchronously for ≤1,000 molecules; returns a job_id for larger sets.

Request:

{
  "smiles_list": ["CCO", "c1ccccc1"],
  "preset": "drug_like"
}

Response:

{
  "input_count": 2,
  "output_count": 2,
  "stages": [...],
  "molecules": [
    { "smiles": "CCO", "status": "passed", "failed_at": null }
  ]
}

Presets: drug_like, lead_like, fragment_like, permissive.
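
Because POST /structure-filter/filter answers either with results or with a job_id, client code has to branch on the response. The sketch below shows one way to do so. `interpret_filter_response` is a hypothetical helper, and it assumes that an asynchronous response can be recognized by the presence of a `job_id` key.

```python
SYNC_LIMIT = 1000  # documented threshold for synchronous responses


def expects_sync(smiles_list: list[str]) -> bool:
    """True if the documented threshold implies a synchronous response."""
    return len(smiles_list) <= SYNC_LIMIT


def interpret_filter_response(payload: dict) -> dict:
    """Normalize the two possible response shapes into one structure.

    Assumption: a queued job is signalled by a `job_id` key in the payload.
    """
    if "job_id" in payload:
        return {"mode": "async", "job_id": payload["job_id"]}
    passed = [m["smiles"] for m in payload["molecules"] if m["status"] == "passed"]
    return {
        "mode": "sync",
        "passed": passed,
        "rejected": payload["input_count"] - payload["output_count"],
    }
```

In the asynchronous case, the returned job_id would then be polled through the batch status endpoint described below.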

POST /structure-filter/score

Score molecules 0–1 against filter criteria (30/min). Returns {"scores": [0.85, 0.92]}.

POST /structure-filter/reinvent-score

REINVENT-compatible scoring endpoint (30/min). Accepts an array of {input_string, query_id} objects.

POST /structure-filter/batch/upload

Upload a file for batch filtering (3/min).

GET /structure-filter/batch/{job_id}/status

Check filter job status (60/min).

GET /structure-filter/batch/{job_id}/results

Get filter results (30/min).

GET /structure-filter/batch/{job_id}/download/{format}

Download results (10/min). Formats: passed_txt (passing SMILES only), full_csv (all with status).

Dataset Audit

POST /dataset/upload

Upload a dataset for health auditing (3/min).

Request: multipart/form-data with file, optional smiles_column, optional activity_column.

Response:

{
  "job_id": "550e8400-...",
  "filename": "dataset.csv",
  "file_type": "csv",
  "status": "pending"
}

GET /dataset/{job_id}/status

Check audit status (60/min). Returns status, progress, current_stage, eta_seconds.

GET /dataset/{job_id}/results

Get audit results (30/min). Returns 202 if still processing. Includes health_audit (overall_score, sub_scores, issues), contradictions, curation_report.

POST /dataset/{job_id}/diff

Compare with another dataset version (10/min). Upload comparison file as multipart/form-data. Returns added, removed, modified, unchanged counts.

GET /dataset/{job_id}/download/report

Download curation report as JSON (10/min).

GET /dataset/{job_id}/download/csv

Download curated CSV with appended audit columns (10/min).

Next Steps