Skip to main content

Aggregator Likelihood

Aggregator likelihood scoring predicts whether a molecule is likely to form colloidal aggregates in biochemical assays, which can lead to false positive results.

What Are Aggregators?

Aggregators are small molecules that form colloidal particles in aqueous solution, leading to:

  • Non-specific enzyme inhibition
  • False positives in biochemical assays
  • Artifacts in high-throughput screening
  • Misleading structure-activity relationships

Aggregation is concentration-dependent and can be disrupted by detergents.

How It's Calculated

The aggregator risk score (0–1) is the sum of triggered risk increments, capped at 1.0:

Risk Indicators

IndicatorConditionRisk Increment
High lipophilicityLogP > 4.0+0.30
Moderate lipophilicityLogP 3.0–4.0+0.15
Low polar surfaceTPSA < 40 A²+0.20
Extended aromatics≥ 4 aromatic rings+0.20
Moderate aromatics3 aromatic rings+0.10
Large molecular weightMW > 500 Da+0.10
High conjugationAromatic fraction > 0.7 and > 20 heavy atoms+0.15
Known scaffold matchSMARTS pattern match+0.20 per match

Known Aggregator Scaffolds (SMARTS-based)

The following empirically known aggregator scaffolds are matched:

  • Rhodanines
  • Quinones
  • Catechols
  • Curcumin-like structures
  • Phenol-sulfonamides
  • Flavonoids
  • Acridines
  • Phenothiazines
  • Extended conjugated polyenes
  • Azo compounds

Confidence: triggered_indicators / total_indicators

Output

The score provides:

FieldDescription
likelihoodRisk category: Low, Medium, High
risk_scoreNumerical score (0.0-1.0)
risk_factorsList of contributing factors
interpretationHuman-readable explanation

Risk Categories

LikelihoodRisk ScoreInterpretation
Low0.0-0.3Unlikely to aggregate
Medium0.3-0.7May aggregate at high concentration
High0.7-1.0Likely aggregator

API Usage

curl -X POST http://localhost:8001/api/v1/score \
-H "Content-Type: application/json" \
-d '{
"molecule": "c1ccc2c(c1)ccc3c2ccc4c3cccc4",
"include": ["aggregator"]
}'

Response:

{
"aggregator": {
"likelihood": "High",
"risk_score": 0.85,
"logp": 6.2,
"tpsa": 0.0,
"mw": 228.29,
"aromatic_rings": 4,
"risk_factors": [
"High LogP (6.2)",
"Low TPSA (0.0)",
"Many aromatic rings (4)"
],
"interpretation": "High risk of aggregation - likely false positive in assays"
}
}

Interpreting Results

Low Risk

Molecule is unlikely to aggregate:

  • Proceed with assay screening
  • Standard assay conditions should work
  • False positive risk minimal

Medium Risk

Molecule may aggregate at high concentrations:

  • Use appropriate concentration range
  • Include detergent in assay (0.01% Triton X-100 or similar)
  • Verify hits with dose-response curves
  • Check for non-specific effects

High Risk

Molecule likely aggregates:

  • High risk of false positives
  • Always include detergent in assay
  • Perform counterscreening
  • Use light scattering or turbidity checks
  • Consider alternative screening methods
  • Verify hits with orthogonal assays

Common Risk Factors

High LogP (> 5):

Hydrophobic molecules partition poorly in water and tend to self-associate.

Low TPSA (< 40 Ų):

Poor polar surface area means low water solubility, promoting aggregation.

Multiple Aromatic Rings (> 3):

Planar aromatic systems stack via pi-pi interactions, forming aggregates.

Moderate Molecular Weight (200-500 Da):

The size range where aggregation is most common.

Use Cases

Hit Validation

Check aggregation risk before spending resources on hits:

aggregator_likelihood != "High"

Assay Design

Design assays with aggregation in mind:

  • Include detergent for high-risk compounds
  • Use dynamic light scattering to detect aggregates
  • Run turbidity checks

Library Curation

Filter aggregators from screening libraries:

aggregator_likelihood = "Low" AND
risk_score < 0.3

Dose-Response Analysis

Interpret dose-response curves:

  • Aggregators often show Hill slopes > 1
  • Non-specific, steep inhibition
  • Detergent disrupts activity

Experimental Validation

Test for aggregation:

Detergent sensitivity:

  • Run assay with and without detergent (0.01-0.1% Triton X-100)
  • Aggregators lose activity with detergent

Dynamic light scattering (DLS):

  • Measure particle size in solution
  • Aggregators show particles > 100 nm

Turbidity:

  • Visual inspection or spectrophotometry (320-340 nm)
  • Aggregators cause cloudiness

Concentration dependence:

  • Test multiple concentrations
  • Aggregators show non-linear behavior

False Positives vs True Hits

ObservationAggregatorTrue Hit
Detergent effectActivity lostActivity maintained
Hill slope> 1 (steep)~1 (normal)
Light scatteringPositiveNegative
Concentration curveNon-linearLinear/saturable
SARFlat (no clear SAR)Clear SAR

Limitations

Prediction accuracy:

  • Based on empirical patterns
  • Cannot predict all aggregators
  • Experimental validation required

Context dependent:

  • Aggregation depends on assay conditions
  • Buffer composition affects aggregation
  • Temperature and concentration matter

Not absolute:

  • Some high-risk molecules don't aggregate
  • Some low-risk molecules do aggregate
  • Use as a guide, not absolute truth
Experimental Confirmation

Always validate aggregation experimentally for critical compounds. Computational prediction guides prioritization but doesn't replace experimental testing.

Best Practices

  1. Check all screening hits: Especially those with steep dose-response
  2. Use detergent controls: Include detergent in assay for high-risk compounds
  3. Orthogonal assays: Confirm hits in different assay formats
  4. Document risk scores: Track aggregator scores for future reference
  5. Filter libraries: Remove high-risk aggregators before screening

Reference

Irwin, J. J. et al. (2015). An aggregation advisor for ligand discovery. Journal of Medicinal Chemistry, 58(17), 7076–7087.

Next Steps