01 / 05
Diagnosis Distribution
COHORT BALANCE — n=1,200
02 / 05
Feature Discriminative Power
COHEN'S D EFFECT SIZE — MALIGNANT vs BENIGN
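As a rough illustration of the effect size named in this panel, the sketch below computes Cohen's d using the pooled standard deviation. The dashboard does not say which variant it uses, so the pooled form is an assumption, and the sample values are hypothetical rather than drawn from the cohort.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical feature values for illustration only (not dashboard data).
malignant = [20.0, 22.0, 24.0]
benign = [12.0, 14.0, 16.0]

d = cohens_d(malignant, benign)
print(f"Cohen's d = {d:.2f}")
```

A larger absolute d means the two diagnosis groups sit further apart on that feature relative to their spread, which is what the panel ranks features by.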
03 / 05
Radius Mean vs Concavity Mean
SCATTER PLOT — 400 PATIENTS SAMPLED · SIZED BY AREA
04 / 05
Avg Feature Values by Diagnosis
NORMALISED MEAN — 6 KEY FEATURES
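The panel does not state how the means are normalised. One common choice, assumed here purely for illustration, is to min-max scale each feature to the range 0 to 1 across all patients and then average within each diagnosis group. The function names and sample values below are hypothetical.

```python
def min_max_normalise(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalised_mean_by_diagnosis(values, labels):
    """Mean of the min-max scaled feature within each diagnosis label."""
    scaled = min_max_normalise(values)
    groups = {}
    for s, label in zip(scaled, labels):
        groups.setdefault(label, []).append(s)
    return {label: sum(g) / len(g) for label, g in groups.items()}


# Hypothetical values for a single feature (not dashboard data).
values = [10.0, 12.0, 18.0, 20.0]
labels = ["Benign", "Benign", "Malignant", "Malignant"]

means = normalised_mean_by_diagnosis(values, labels)
print(means)
```

Scaling first puts features with very different units on one axis, so their group means can share a single chart.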
05A
Radius Worst Distribution
FREQUENCY HISTOGRAM — WORST CASE RADIUS BY DIAGNOSIS
05B
Composite Malignancy Risk Score
WEIGHTED FORMULA: 0.4×radius_worst + 3.5×concavity_worst + 2.5×concave_pts_worst
The composite Risk Score combines the three most discriminating worst-case features into a single index, which separates the two classes more cleanly than any individual feature.
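The weighted formula above can be sketched as a small function. The weights and feature names come from the panel subtitle. The input values are hypothetical, and the panel does not say whether the features are scaled before weighting, so raw values are assumed.

```python
def composite_risk_score(radius_worst, concavity_worst, concave_pts_worst):
    """Weighted sum from the panel: 0.4*radius + 3.5*concavity + 2.5*concave points."""
    return (
        0.4 * radius_worst
        + 3.5 * concavity_worst
        + 2.5 * concave_pts_worst
    )


# Hypothetical patient measurements for illustration only.
score = composite_risk_score(
    radius_worst=20.0,
    concavity_worst=0.4,
    concave_pts_worst=0.2,
)
print(f"risk score = {score:.2f}")
```

Because the score is a plain linear combination, each weight can be read directly as the contribution of one unit of that feature.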
Finding 01
Size ≠ Malignancy
Benign tumors average a slightly larger area (652 mm²) than malignant ones (629 mm²). This counter-intuitive result shows that tumor size alone is clinically insufficient as a screening signal.
Finding 02
Worst-Case Values Win
Features with the _worst suffix consistently outperform their _mean counterparts in discriminative power. The most extreme cell measurements, not the averages, capture the irregular geometry of cancerous tissue.
Finding 03
No Single Feature is Enough
The scatter plot reveals complete class overlap between Malignant and Benign patients. Even radius and concavity combined cannot cleanly separate the groups, so multi-feature models and composite scoring are essential.