Scaffold Analysis
Scaffold analysis extracts the core ring system from molecules using the Murcko framework method, enabling structure-activity relationship (SAR) analysis and scaffold hopping.
What Is a Murcko Scaffold?
A Murcko scaffold is the core ring system and linkers of a molecule with all side chains removed.
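Example (Ibuprofen):
- Full structure: CC(C)Cc1ccc(cc1)C(C)C(=O)O
- Murcko scaffold: c1ccccc1
- Generic scaffold: C1CCCCC1
Ibuprofen has a single ring and no linkers, so both the isobutyl and the propanoic acid substituents are removed as side chains, leaving benzene.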
Types of Scaffolds
Murcko Scaffold
Preserves atom types in the ring system and linkers:
- Aromatic vs. aliphatic rings distinguished
- Heteroatoms preserved
- Useful for detailed SAR analysis
Generic Scaffold
Replaces all atoms with carbon and all bonds with single bonds:
- Focuses on topology only
- Groups molecules by ring framework
- Useful for broad scaffold classification
Output
Scaffold analysis returns:
| Field | Description |
|---|---|
| scaffold_smiles | SMILES of Murcko scaffold |
| generic_scaffold_smiles | SMILES of generic scaffold |
| has_scaffold | Whether molecule contains ring systems |
| message | Explanation if no scaffold found |
API Usage
curl -X POST http://localhost:8001/api/v1/score \
  -H "Content-Type: application/json" \
  -d '{
    "molecule": "CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    "include": ["scaffold"]
  }'
Response (with scaffold):
{
  "scaffold": {
    "scaffold_smiles": "c1ccccc1",
    "generic_scaffold_smiles": "C1CCCCC1",
    "has_scaffold": true,
    "message": "Scaffold extracted successfully"
  }
}
Response (no scaffold):
{
  "scaffold": {
    "scaffold_smiles": null,
    "generic_scaffold_smiles": null,
    "has_scaffold": false,
    "message": "No ring system found"
  }
}
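The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl example: it builds the POST request and reads the scaffold from a response shaped like the two samples above. The helper names (`build_scaffold_request`, `murcko_smiles_or_none`, `fetch_scaffold`) are illustrative and not part of the API; `fetch_scaffold` assumes the service is running at the URL from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
API_URL = "http://localhost:8001/api/v1/score"


def build_scaffold_request(smiles: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"molecule": smiles, "include": ["scaffold"]}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def murcko_smiles_or_none(response: dict):
    """Return the Murcko scaffold SMILES, or None when has_scaffold is false."""
    scaffold = response.get("scaffold") or {}
    if not scaffold.get("has_scaffold"):
        return None
    return scaffold.get("scaffold_smiles")


def fetch_scaffold(smiles: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body (needs a running service)."""
    request = build_scaffold_request(smiles)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Checking `has_scaffold` before reading `scaffold_smiles` keeps acyclic molecules from propagating a null scaffold into later grouping steps.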
Use Cases
SAR Analysis
Group molecules by scaffold to analyze structure-activity relationships:
- Extract scaffolds from active compounds
- Group by scaffold SMILES
- Analyze R-group effects on activity
- Identify optimal substitution patterns
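The grouping step above can be sketched in a few lines of Python. This is a minimal illustration, assuming each record already carries the `scaffold_smiles` returned by the API plus a measured activity; the `smiles` and `activity` keys and the activity values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_by_scaffold(records):
    """Map each Murcko scaffold SMILES to the records that share it."""
    groups = defaultdict(list)
    for rec in records:
        if rec.get("scaffold_smiles"):  # skip molecules without a scaffold
            groups[rec["scaffold_smiles"]].append(rec)
    return dict(groups)


def summarize_series(groups):
    """Per scaffold: (scaffold, member count, mean activity), largest first."""
    rows = [
        (scaffold, len(members), mean(m["activity"] for m in members))
        for scaffold, members in groups.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical records; the activity values are invented for illustration.
records = [
    {"smiles": "Cc1ccccc1", "scaffold_smiles": "c1ccccc1", "activity": 5.0},
    {"smiles": "Oc1ccccc1", "scaffold_smiles": "c1ccccc1", "activity": 6.0},
    {"smiles": "Cc1ccncc1", "scaffold_smiles": "c1ccncc1", "activity": 7.0},
    {"smiles": "CCO", "scaffold_smiles": None, "activity": 4.0},
]
series = summarize_series(group_by_scaffold(records))
```

Within each group the molecules differ only in their side chains, so comparing activities inside a group isolates R-group effects.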
Scaffold Hopping
Find alternative scaffolds with similar properties:
- Extract scaffold from lead compound
- Search for molecules with different scaffolds
- Test for similar biological activity
- Expand chemical space explored
Library Diversity
Assess scaffold diversity in compound libraries:
- Extract generic scaffolds from all molecules
- Count unique scaffolds
- Calculate scaffold diversity metrics
- Identify over-represented scaffolds
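One way to find over-represented scaffolds is to count generic scaffold SMILES and flag any whose share exceeds a threshold. The sketch below assumes a plain list of `generic_scaffold_smiles` values, with None for molecules without rings; the function names and the 25% default cutoff are arbitrary illustrations, not values defined by the API.

```python
from collections import Counter


def scaffold_frequencies(generic_scaffolds):
    """Count how often each generic scaffold occurs, ignoring None."""
    return Counter(s for s in generic_scaffolds if s)


def over_represented(counts, max_share=0.25):
    """Scaffolds whose share of ring-containing molecules exceeds max_share."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [s for s, n in counts.most_common() if n / total > max_share]


# Hypothetical library: 10 ring-containing molecules and 2 acyclic ones.
library = (
    ["C1CCCCC1"] * 6
    + ["C1CCCC1"] * 3
    + ["C1CCC2CCCCC2C1"]
    + [None] * 2
)
counts = scaffold_frequencies(library)
dominant = over_represented(counts, max_share=0.5)
```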
Hit Clustering
Cluster screening hits by scaffold:
- Extract scaffolds from all hits
- Group by identical scaffolds
- Prioritize diverse scaffolds
- Avoid redundant follow-up
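A simple way to avoid redundant follow-up is to keep one representative hit per scaffold. The sketch below keeps the most potent hit in each scaffold group; the `id` and `potency` keys and their values are hypothetical, and choosing "most potent" as the representative is only one possible policy.

```python
def best_hit_per_scaffold(hits):
    """Keep the most potent hit for each scaffold, most potent first."""
    best = {}
    for hit in hits:
        key = hit.get("scaffold_smiles")
        if key is None:
            continue  # acyclic hits have no scaffold to cluster on
        if key not in best or hit["potency"] > best[key]["potency"]:
            best[key] = hit
    return sorted(best.values(), key=lambda h: h["potency"], reverse=True)


# Hypothetical screening hits; potency values are invented for illustration.
hits = [
    {"id": "H1", "scaffold_smiles": "c1ccccc1", "potency": 6.2},
    {"id": "H2", "scaffold_smiles": "c1ccccc1", "potency": 7.1},
    {"id": "H3", "scaffold_smiles": "c1ccncc1", "potency": 6.8},
    {"id": "H4", "scaffold_smiles": None, "potency": 8.0},
]
representatives = best_hit_per_scaffold(hits)
```

Hits without a scaffold are skipped here and would need a separate triage path.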
Molecules Without Scaffolds
Molecules without ring systems return has_scaffold: false.
Examples:
- Linear molecules (ethanol, hexane)
- Branched aliphatics
- Other acyclic compounds, such as simple acids, amines, and esters
These molecules have no Murcko scaffold by definition.
Scaffold Statistics in Batch Processing
In batch mode, scaffold diversity is reported:
{
  "scaffold_statistics": {
    "total_molecules": 1000,
    "molecules_with_scaffolds": 850,
    "unique_scaffolds": 45,
    "unique_generic_scaffolds": 38,
    "scaffold_diversity": 0.053
  }
}
Scaffold diversity = unique scaffolds / molecules with scaffolds (in the example above, 45 / 850 ≈ 0.053)
- Higher diversity = more varied scaffolds
- Lower diversity = repeated scaffold use
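The statistics block can be reproduced from per-molecule results. In the sample output, 45 unique scaffolds and a diversity of 0.053 match 45 / 850, so this sketch assumes the denominator is the number of molecules with scaffolds rather than the total. The function name and the flat per-molecule record layout are assumptions.

```python
def scaffold_statistics(results):
    """Aggregate per-molecule scaffold results into batch-level statistics."""
    with_scaffold = [r for r in results if r.get("has_scaffold")]
    unique = {r["scaffold_smiles"] for r in with_scaffold}
    unique_generic = {r["generic_scaffold_smiles"] for r in with_scaffold}
    n = len(with_scaffold)
    return {
        "total_molecules": len(results),
        "molecules_with_scaffolds": n,
        "unique_scaffolds": len(unique),
        "unique_generic_scaffolds": len(unique_generic),
        # Assumed denominator: molecules with scaffolds (45 / 850 ≈ 0.053).
        "scaffold_diversity": round(len(unique) / n, 3) if n else 0.0,
    }


# Hypothetical batch of four molecules, one of them acyclic.
batch = [
    {"has_scaffold": True, "scaffold_smiles": "c1ccccc1",
     "generic_scaffold_smiles": "C1CCCCC1"},
    {"has_scaffold": True, "scaffold_smiles": "c1ccncc1",
     "generic_scaffold_smiles": "C1CCCCC1"},
    {"has_scaffold": True, "scaffold_smiles": "c1ccccc1",
     "generic_scaffold_smiles": "C1CCCCC1"},
    {"has_scaffold": False, "scaffold_smiles": None,
     "generic_scaffold_smiles": None},
]
stats = scaffold_statistics(batch)
```

Benzene and pyridine share one generic scaffold here, which is why the generic count can be lower than the Murcko count.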
Combining with Other Scores
Scaffold + NP-likeness:
Natural products often have complex scaffolds. To select ring-containing, NP-like molecules:
has_scaffold = true AND
np_likeness_score > 1.0
Scaffold + Drug-likeness:
Drug-like scaffolds typically satisfy:
has_scaffold = true AND
lipinski_passed = true AND
num_rings <= 5
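The two filters above translate directly into Python predicates. This sketch assumes the scores for one molecule have been merged into a single flat dictionary; where `np_likeness_score`, `lipinski_passed`, and `num_rings` actually sit in the API response is not specified here, so the layout and the function names are assumptions. A molecule missing a required field fails the filter.

```python
def is_np_like_scaffold(mol, np_threshold=1.0):
    """has_scaffold = true AND np_likeness_score > threshold."""
    score = mol.get("np_likeness_score", float("-inf"))
    return bool(mol.get("has_scaffold")) and score > np_threshold


def is_drug_like_scaffold(mol, max_rings=5):
    """has_scaffold = true AND lipinski_passed = true AND num_rings <= max_rings."""
    return (
        bool(mol.get("has_scaffold"))
        and bool(mol.get("lipinski_passed"))
        and mol.get("num_rings", max_rings + 1) <= max_rings
    )


# Hypothetical merged record for one molecule.
candidate = {
    "has_scaffold": True,
    "np_likeness_score": 1.5,
    "lipinski_passed": True,
    "num_rings": 3,
}
passes_both = is_np_like_scaffold(candidate) and is_drug_like_scaffold(candidate)
```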
Best Practices
- Use for grouping: Scaffold is ideal for clustering molecules
- Generic for broad classes: Use generic scaffold for high-level classification
- Murcko for SAR: Use Murcko scaffold for detailed SAR
- Track diversity: Monitor scaffold diversity in libraries
- Combine with clustering: Use with fingerprint clustering for comprehensive analysis
Limitations
Scaffold definition:
- Only captures ring systems and the linkers between them; side-chain information is discarded
- Doesn't account for stereochemistry
- Linker atoms may vary between definitions
Edge cases:
- Spiro compounds: Complex scaffold extraction
- Macrocycles: Entire molecule may be the scaffold
- Fused systems: Multiple valid scaffold interpretations
For molecules with complex or unusual ring systems (spiro, bridged, cage), manually review scaffold extraction results.
Next Steps
- NP-Likeness - Natural product similarity
- Batch Processing - Analyze scaffold diversity
- Scoring Overview - All scoring systems