ML-Readiness Scoring

The ML-readiness score rates how suitable a molecule is for machine learning workflows. It sums four dimensions (structural quality, physicochemical property profile, complexity and feasibility, and representation quality) for a maximum of 100 points.

Score Dimensions

| Dimension | Max Points | What It Measures |
| --- | --- | --- |
| Structural Quality | 20 | Structural soundness for ML pipelines |
| Property Profile | 35 | Physicochemical property desirability |
| Complexity & Feasibility | 25 | Synthetic tractability and complexity |
| Representation Quality | 20 | Numerical representability for ML models |

1. Structural Quality (20 points)

Binary pass/fail checks on fundamental structural soundness. Each item either passes (full points) or fails (0 points).

| Item | Points | Pass Condition |
| --- | --- | --- |
| Single component | 5 | Exactly 1 fragment (no mixtures, salts, or solvents) |
| Standard organic elements | 5 | No metal atoms present (no organometallics) |
| No radicals | 3 | No atoms with unpaired electrons |
| Reasonable charge | 3 | Net formal charge between −2 and +2 |
| No dummy atoms | 4 | No R-groups or attachment points (no atom with atomic number 0) |
| Total | 20 | Sum of passed items |
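
The checks in the table can be sketched as follows, assuming the relevant facts have already been extracted from the molecule (for example with RDKit). The `MolSummary` container, its field names, the `METALS` subset, and the inclusive charge bounds are illustrative assumptions, not the scorer's actual API.

```python
from dataclasses import dataclass
from typing import List

# Illustrative subset of metal atomic numbers (assumption; a real check
# would cover every metal in the periodic table).
METALS = {3, 11, 12, 13, 19, 20, 26, 27, 28, 29, 30, 46, 78, 79}


@dataclass
class MolSummary:
    """Hypothetical pre-extracted facts about one molecule."""
    n_fragments: int
    atomic_numbers: List[int]
    radical_electrons: List[int]  # unpaired electrons per atom
    net_charge: int


def structural_quality(m: MolSummary) -> float:
    """Sum the points of every binary check that passes (max 20)."""
    checks = [
        (5, m.n_fragments == 1),                              # single component
        (5, not any(z in METALS for z in m.atomic_numbers)),  # no metal atoms
        (3, all(e == 0 for e in m.radical_electrons)),        # no radicals
        (3, -2 <= m.net_charge <= 2),                         # bounds assumed inclusive
        (4, 0 not in m.atomic_numbers),                       # no dummy atoms
    ]
    return float(sum(points for points, passed in checks if passed))


# Sodium acetate drawn as a salt: two fragments and one metal atom.
salt = MolSummary(2, [6, 6, 8, 8, 11], [0, 0, 0, 0, 0], 0)
print(structural_quality(salt))  # 10.0
```

The salt loses the "Single component" and "Standard organic elements" points and keeps the other three checks.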

Non-scored caveats (reported as warnings):

  • Isotope labels detected (deuterium, ¹³C, tritium)
  • Trivial molecules (≤ 3 heavy atoms)

2. Property Profile (35 points)

Desirability-scored physicochemical properties measuring how well a molecule fits within typical ML training set distributions. Each property is scored using a trapezoidal desirability function.

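| Property | Ideal Range | Max Points |
| --- | --- | --- |
| Molecular Weight | 200–500 Da | 6 |
| LogP (Wildman-Crippen) | 0.5–4.0 | 6 |
| TPSA | 40–120 Å² | 5 |
| H-Bond Donors | 0–3 | 4 |
| H-Bond Acceptors | 2–8 | 4 |
| Rotatable Bonds | 1–8 | 5 |
| Aromatic Rings | 1–3 | 5 |
| Total | | 35 |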

Desirability Function

The function returns a desirability d between 0 and 1, which is then scaled by the property's maximum points:

If min ≤ value ≤ max:  d = 1.0  (ideal range → full score)
If value < min: d = max(0, 1.0 − (min − value) / range)
If value > max: d = max(0, 1.0 − (value − max) / range)

where range = max − min
points = d × max_points

Example: A molecule with MW = 600 Da:

  • range = 500 − 200 = 300
  • d = max(0, 1.0 − (600 − 500) / 300) = 0.667
  • points = 0.667 × 6 = 4.0
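
The piecewise definition and the MW example above translate directly into a short function. This is a minimal sketch; the names `desirability` and `property_points` are chosen here for illustration and are not confirmed to match the scorer's internals.

```python
def desirability(value: float, lo: float, hi: float) -> float:
    """Trapezoidal desirability: 1.0 inside [lo, hi], linear decay outside."""
    width = hi - lo  # the "range" term in the formula above
    if value < lo:
        return max(0.0, 1.0 - (lo - value) / width)
    if value > hi:
        return max(0.0, 1.0 - (value - hi) / width)
    return 1.0


def property_points(value: float, lo: float, hi: float, max_points: float) -> float:
    """Scale the desirability by the property's maximum points."""
    return desirability(value, lo, hi) * max_points


print(round(property_points(600, 200, 500, 6), 2))  # MW = 600 Da -> 4.0
print(round(property_points(0, 1, 8, 5), 2))        # 0 rotatable bonds -> 4.29
```

Because the decay distance equals the width of the ideal range, a property reaches zero points once it lies a full range-width outside the ideal window (for MW, below −100 Da, which cannot occur, or above 800 Da).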

Property Ranges

These ranges reflect the most common distributions in drug-like compound datasets used for ML training. Molecules outside these ranges are not necessarily bad; they just fall outside the typical training distribution.

3. Complexity & Feasibility (25 points)

Assesses synthetic tractability and structural complexity, which affect practical utility in ML-driven drug discovery campaigns.

| Component | Max Points | Calculation |
| --- | --- | --- |
| QED | 8 | QED.qed(mol) × 8 |
| SA Score | 8 | Inverse mapping (see below) |
| Fsp3 | 4 | desirability(Fsp3, 0.2, 0.6) × 4 |
| Stereocenters | 5 | Complexity-adjusted scoring (see below) |
| Total | 25 | |

SA Score → Points

Synthetic Accessibility Score (1–10) is inversely mapped to points:

| SA Score | Points | Interpretation |
| --- | --- | --- |
| ≤ 3 | 8.0 | Easy to synthesize |
| 3–5 | 8.0 − (SA − 3) × 2.0 | Moderate complexity |
| 5–7 | 4.0 − (SA − 5) × 2.0 | Difficult synthesis |
| > 7 | 0.0 | Very difficult |
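
The piecewise mapping in the table can be written as a single function. This is a sketch; the function name `sa_points` is an assumption. The two linear segments meet at SA = 5 (4.0 points) and reach 0.0 at SA = 7, so the mapping is continuous and it does not matter which row a boundary value is assigned to.

```python
def sa_points(sa: float) -> float:
    """Map a Synthetic Accessibility score (1-10) to 0-8 points."""
    if sa <= 3:
        return 8.0
    if sa <= 5:
        return 8.0 - (sa - 3) * 2.0
    if sa <= 7:
        return 4.0 - (sa - 5) * 2.0
    return 0.0


for sa in (2.1, 4.0, 6.5, 8.0):
    print(sa, sa_points(sa))
# 2.1 8.0
# 4.0 6.0
# 6.5 1.0
# 8.0 0.0
```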

Stereocenter Scoring

| Stereocenters | Base Score | Notes |
| --- | --- | --- |
| 0–4 | 5.0 | Manageable complexity |
| 5–8 | 5.0 − (n − 4) × 0.75 | Decreasing linearly |
| > 8 | 0.0 | Too complex |

Penalty: If more than 50% of stereocenters are undefined, the base score is halved (×0.5).
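
The base score and the undefined-stereocenter penalty combine as in the sketch below. The function name and argument names are assumptions; the thresholds come from the table and penalty rule above.

```python
def stereocenter_points(n_total: int, n_undefined: int = 0) -> float:
    """Stereocenter score (max 5) with the undefined-center penalty applied."""
    if n_total <= 4:
        base = 5.0
    elif n_total <= 8:
        base = 5.0 - (n_total - 4) * 0.75
    else:
        base = 0.0
    # Halve the base score when more than 50% of stereocenters are undefined.
    if n_total > 0 and n_undefined / n_total > 0.5:
        base *= 0.5
    return base


print(stereocenter_points(6))     # 3.5
print(stereocenter_points(6, 4))  # 1.75 (4 of 6 undefined)
print(stereocenter_points(9))     # 0.0
```

Note that exactly half undefined (for example 3 of 6) does not trigger the penalty, since the rule requires more than 50%.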

4. Representation Quality (20 points)

Measures how well the molecule can be numerically represented for ML models — the core requirement for any ML application.

| Component | Max Points | What It Tests |
| --- | --- | --- |
| Descriptor completeness | 5 | Fraction of 451 RDKit descriptors computed successfully |
| Fingerprint generation | 5 | Weighted success across 7 fingerprint types |
| Fingerprint informativeness | 5 | Ideal bit density between 1% and 30% |
| Conformer generation | 5 | 3D coordinate generation via ETKDGv3 |
| Total | 20 | |

Descriptors Tested (451 total)

| Descriptor Set | Count | Method |
| --- | --- | --- |
| Standard RDKit | 217 | Descriptors.CalcMolDescriptors() |
| AUTOCORR2D | 192 | rdMolDescriptors.CalcAUTOCORR2D() |
| MQN | 42 | rdMolDescriptors.MQNs_() |

Score: round(5.0 × (successful / 451), 2)
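
As a sketch of the score formula, the function below counts successful descriptors in a flat list of already-computed values. It assumes that "successful" means a finite numeric value (not `None`, NaN, or infinity); the source does not define success precisely, so treat that criterion as an assumption.

```python
import math

TOTAL_DESCRIPTORS = 451  # 217 standard + 192 AUTOCORR2D + 42 MQN


def descriptor_completeness(values) -> float:
    """Score 0-5 from the fraction of descriptors that computed successfully."""
    # Assumption: a descriptor succeeded if its value is a finite number.
    successful = sum(1 for v in values if v is not None and math.isfinite(v))
    return round(5.0 * (successful / TOTAL_DESCRIPTORS), 2)


# 440 usable values, 8 NaNs, and 3 outright failures.
values = [1.0] * 440 + [float("nan")] * 8 + [None] * 3
print(descriptor_completeness(values))  # 4.88
```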

Fingerprint Types & Weights

| Fingerprint | Bits | Weight | Description |
| --- | --- | --- | --- |
| Morgan (radius=2) | 2048 | 8 | Circular fingerprints (ECFP4-like) |
| Morgan Features | 2048 | 8 | Feature-based Morgan |
| MACCS Keys | 167 | 8 | 166-bit MACCS structural keys |
| Atom Pair | 2048 | 4 | Atom pair descriptors |
| Topological Torsion | 2048 | 4 | Topological torsion descriptors |
| RDKit FP | 2048 | 4 | Daylight-like path fingerprints |
| Avalon | 512 | 4 | Avalon toolkit fingerprints |
| Total weight | | 40 | |

Score: round(5.0 × (sum of successful weights / 40), 2)
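
The weighted score can be sketched as below, given the set of fingerprint types that generated without error. The dictionary keys are hypothetical identifiers chosen for this example; only the weights come from the table.

```python
# Weights from the table above; key names are illustrative assumptions.
FP_WEIGHTS = {
    "morgan": 8,
    "morgan_features": 8,
    "maccs": 8,
    "atom_pair": 4,
    "topological_torsion": 4,
    "rdkit": 4,
    "avalon": 4,
}


def fingerprint_points(succeeded) -> float:
    """Score 0-5 from the summed weights of fingerprints that succeeded."""
    total = sum(FP_WEIGHTS.values())  # 40
    earned = sum(w for name, w in FP_WEIGHTS.items() if name in succeeded)
    return round(5.0 * (earned / total), 2)


# Every fingerprint except Avalon was generated.
print(fingerprint_points(set(FP_WEIGHTS) - {"avalon"}))  # 4.5
```

With these weights, a failing Morgan or MACCS fingerprint costs 1.0 point, while a failing Atom Pair, Topological Torsion, RDKit, or Avalon fingerprint costs 0.5.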

Fingerprint Informativeness

Measures whether fingerprints have useful information content (not too sparse, not too dense):

| Avg Bit Density | Score | Interpretation |
| --- | --- | --- |
| 1–30% | 5.0 | Ideal information content |
| < 1% | Proportional | Too sparse (molecule too simple) |
| 30–45% | Decreasing | Too dense (losing discriminative power) |
| > 45% | 0.0 | Not informative |
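
The table gives exact values only for the ideal band and the upper cutoff. The sketch below fills the "Proportional" and "Decreasing" rows with linear ramps and averages density per fingerprint; both choices are assumptions made for illustration, and the actual scorer may use a different shape or averaging scheme.

```python
def average_bit_density(fingerprints) -> float:
    """Mean fraction of set bits over (on_bits, n_bits) pairs (assumed definition)."""
    densities = [on_bits / n_bits for on_bits, n_bits in fingerprints]
    return sum(densities) / len(densities)


def informativeness_points(density: float) -> float:
    """Score 0-5 from average bit density expressed as a fraction (0.0-1.0)."""
    if density < 0.01:
        return round(5.0 * density / 0.01, 2)            # assumed linear ramp up from 0
    if density <= 0.30:
        return 5.0                                       # ideal band
    if density <= 0.45:
        return round(5.0 * (0.45 - density) / 0.15, 2)   # assumed linear decay to 0
    return 0.0


# Hypothetical counts: 24 of 2048 Morgan bits and 30 of 167 MACCS bits set.
density = average_bit_density([(24, 2048), (30, 167)])
print(informativeness_points(density))  # 5.0
print(informativeness_points(0.005))    # 2.5
```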

Conformer Generation

| Method | Points |
| --- | --- |
| ETKDGv3 success (seed=42, maxIter=500) | 5 |
| Random coordinate fallback | 3 |
| Complete failure | 0 |

Overall Score & Tiers

Total = Structural Quality + Property Profile + Complexity & Feasibility + Representation Quality

| Score | Tier | Interpretation |
| --- | --- | --- |
| 85–100 | Excellent | Suitable for most ML workflows without modification |
| 70–84 | Good | Minor limitations; generally suitable with standard preprocessing |
| 50–69 | Moderate | Usable but may need careful feature selection or preprocessing |
| 30–49 | Limited | Significant challenges; consider alternatives or specialized models |
| 0–29 | Poor | Not recommended for standard ML pipelines |
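
The tier lookup can be sketched as a descending scan over lower bounds. The table lists integer ranges, so how a fractional total such as 84.5 is assigned is not specified; this sketch assumes each tier's lower bound is inclusive and that no rounding is applied. The dimension scores in the example are hypothetical.

```python
# (lower bound, tier name), highest tier first.
TIERS = [(85, "Excellent"), (70, "Good"), (50, "Moderate"), (30, "Limited"), (0, "Poor")]


def tier(score: float) -> str:
    """Return the tier whose lower bound the score meets (assumed inclusive)."""
    for floor, name in TIERS:
        if score >= floor:
            return name
    raise ValueError("score must be between 0 and 100")


# Hypothetical dimension scores, in the order of the Total formula above.
total = 20.0 + 28.0 + 19.0 + 15.0
print(total, tier(total))  # 82.0 Good
```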

The UI displays an interpretation banner with:

  • Overall score badge
  • Tier-specific guidance text
  • Per-dimension health tags showing completion percentage

API Usage

curl -X POST http://localhost:8001/api/v1/score \
  -H "Content-Type: application/json" \
  -d '{
    "molecule": "CC(=O)Oc1ccccc1C(=O)O",
    "include": ["ml_readiness"]
  }'

Response:

{
  "ml_readiness": {
    "score": 88,
    "dimensions": [
      {
        "name": "Structural Quality",
        "score": 20.0,
        "max_score": 20,
        "items": [
          {"name": "Single component", "score": 5.0, "max_score": 5, "passed": true},
          {"name": "Standard organic elements", "score": 5.0, "max_score": 5, "passed": true},
          {"name": "No radicals", "score": 3.0, "max_score": 3, "passed": true},
          {"name": "Reasonable charge", "score": 3.0, "max_score": 3, "passed": true},
          {"name": "No dummy atoms", "score": 4.0, "max_score": 4, "passed": true}
        ]
      },
      {
        "name": "Property Profile",
        "score": 30.5,
        "max_score": 35,
        "items": [
          {"name": "MW", "score": 6.0, "max_score": 6, "passed": true},
          {"name": "LogP", "score": 6.0, "max_score": 6, "passed": true}
        ]
      },
      {
        "name": "Complexity & Feasibility",
        "score": 21.0,
        "max_score": 25
      },
      {
        "name": "Representation Quality",
        "score": 16.5,
        "max_score": 20
      }
    ],
    "interpretation": "Good ML candidate with minor limitations...",
    "caveats": []
  }
}
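
The same request can be issued from Python with only the standard library. This is a sketch that assumes the endpoint and response shape shown above; `fetch_ml_readiness` and `dimension_health` are hypothetical helper names, and the percentage calculation is one plausible reading of the UI's "completion percentage" tags. The demonstration at the bottom runs offline on the dimension scores from the sample response.

```python
import json
import urllib.request

API_URL = "http://localhost:8001/api/v1/score"  # endpoint from the curl example


def fetch_ml_readiness(smiles: str) -> dict:
    """POST a SMILES string and return the "ml_readiness" object."""
    payload = json.dumps({"molecule": smiles, "include": ["ml_readiness"]}).encode()
    request = urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["ml_readiness"]


def dimension_health(ml: dict) -> dict:
    """Percentage of the maximum earned per dimension (assumed UI calculation)."""
    return {
        d["name"]: round(100 * d["score"] / d["max_score"], 1)
        for d in ml["dimensions"]
    }


# Offline demonstration with the dimension scores from the sample response.
sample = {
    "score": 88,
    "dimensions": [
        {"name": "Structural Quality", "score": 20.0, "max_score": 20},
        {"name": "Property Profile", "score": 30.5, "max_score": 35},
        {"name": "Complexity & Feasibility", "score": 21.0, "max_score": 25},
        {"name": "Representation Quality", "score": 16.5, "max_score": 20},
    ],
}
print(dimension_health(sample))
print(sum(d["score"] for d in sample["dimensions"]) == sample["score"])  # True
```

With a server running, `dimension_health(fetch_ml_readiness("CC(=O)Oc1ccccc1C(=O)O"))` would apply the same summary to a live response.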

Use Cases

Dataset Curation for ML

Filter molecules before creating training datasets:

ml_readiness_score >= 80 AND validation_score >= 90

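Molecules that pass this filter are typically:

  • Structurally sound for ML (no mixtures, metals, radicals)
  • Within typical training distributions for their properties
  • Able to generate all required descriptors and fingerprints
  • Able to generate a 3D conformer

Because the score is a sum, a high total does not guarantee full marks in every dimension; check the per-dimension scores when a specific property matters.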
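
The filter expression above can be applied to a list of scored records as in this sketch. The record layout, the identifiers, and the scores are invented for illustration; only the two thresholds come from the filter.

```python
# Hypothetical records; the ids and scores are made up for this example.
records = [
    {"id": "mol_a", "ml_readiness_score": 88, "validation_score": 95},
    {"id": "mol_b", "ml_readiness_score": 62, "validation_score": 91},
    {"id": "mol_c", "ml_readiness_score": 81, "validation_score": 85},
]


def curate(rows, min_ml=80, min_validation=90):
    """Keep rows meeting both thresholds (ml_readiness >= 80 AND validation >= 90)."""
    return [
        row for row in rows
        if row["ml_readiness_score"] >= min_ml
        and row["validation_score"] >= min_validation
    ]


print([row["id"] for row in curate(records)])  # ['mol_a']
```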

Model Applicability Domain

Use the ML-readiness score to define a model's applicability domain:

  • Train models only on molecules with score >= 80
  • Flag predictions on molecules with score < 80 as uncertain
  • Exclude molecules with score < 60 from predictions
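
These rules amount to a three-way classification at prediction time. A minimal sketch, with the function name and the status labels chosen here for illustration:

```python
def prediction_status(score: float) -> str:
    """Classify a query molecule by ML-readiness score at prediction time."""
    if score < 60:
        return "excluded"   # no prediction is made
    if score < 80:
        return "uncertain"  # predict, but flag the result
    return "trusted"        # same score range as the training set


for score in (92, 74, 41):
    print(score, prediction_status(score))
# 92 trusted
# 74 uncertain
# 41 excluded
```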

Dimension-Specific Analysis

Use individual dimension scores to diagnose issues:

| Low Dimension | Likely Issue | Action |
| --- | --- | --- |
| Structural Quality | Mixtures, metals, radicals | Clean up structure first |
| Property Profile | Unusual MW, LogP, TPSA | May be outside typical drug-like space |
| Complexity & Feasibility | Hard to synthesize, many stereocenters | Consider if practical for your use case |
| Representation Quality | Descriptor failures, no 3D | May need specialized featurization |

Limitations

Does not test:

  • Descriptor quality or relevance to specific models
  • Model-specific feature requirements
  • Chemical space coverage of your training set
  • Experimental measurability

Assumes:

  • Standard RDKit descriptors are sufficient
  • Common fingerprint types are appropriate
  • Property ranges derived from typical drug-like datasets are appropriate

Custom Requirements

ML-readiness tests standard descriptors and fingerprints. If your model uses custom features (e.g., graph neural network features), you'll need additional validation.

References

  1. Bickerton, G. R. et al. (2012). Quantifying the chemical beauty of drugs. Nature Chemistry, 4(2), 90–98.
  2. Ertl, P. & Schuffenhauer, A. (2009). Estimation of synthetic accessibility score. Journal of Cheminformatics, 1(1), 8.
  3. Lovering, F. et al. (2009). Escape from flatland. Journal of Medicinal Chemistry, 52(21), 6752–6756.
  4. Rogers, D. & Hahn, M. (2010). Extended-connectivity fingerprints. Journal of Chemical Information and Modeling, 50(5), 742–754.
