At low field, ¹H spectra blend into second-order multiplets. Scroll to sweep the spectrometer from 90 MHz to 600 MHz.
A 90 MHz benchtop ¹H NMR spectrum is rarely first-order: chemical-shift differences are comparable to coupling constants, so multiplets overlap and naive peak-picking fails. Spinhance trains a neural network to invert a normalized low-field spectrum back to the underlying spin system — chemical shifts δ (ppm), scalar couplings J (Hz), and proton degeneracies. Once you have those field-independent parameters, the spectrum can be re-simulated exactly at any field strength.
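A quick way to see why the field matters is to compare the shift difference in Hz with the coupling constant. The sketch below uses illustrative values that are assumptions (two protons 0.10 ppm apart, J = 7 Hz) and hypothetical function names. It only relies on the fact that 1 ppm corresponds to as many Hz as the spectrometer frequency in MHz.

```python
def shift_difference_hz(delta_a_ppm, delta_b_ppm, field_mhz):
    """Chemical-shift difference in Hz; 1 ppm equals field_mhz Hz."""
    return abs(delta_a_ppm - delta_b_ppm) * field_mhz


def order_ratio(delta_a_ppm, delta_b_ppm, j_hz, field_mhz):
    """Ratio of shift difference to |J|; values near 1 indicate strong coupling."""
    return shift_difference_hz(delta_a_ppm, delta_b_ppm, field_mhz) / abs(j_hz)


# Illustrative values only (not taken from the dataset).
for field in (90.0, 600.0):
    print(field, round(order_ratio(1.20, 1.30, 7.0, field), 2))
```

With these numbers the ratio is close to 1 at 90 MHz and several times larger at 600 MHz, which is the regime change the sweep is meant to show.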
We scope the problem to molecules with exactly eight hard-equivalent (chemically and magnetically equivalent) spin groups, and build a fully synthetic, physically plausible training set end to end.
Filter large public databases (ChEMBL / PubChem) with RDKit, assign chemically and magnetically equivalent proton groups, and keep molecules with exactly 8 spin groups.
Embed each SMILES in 3D, then estimate ²J/³J/⁴J couplings and shifts with Karplus and tabulated rules — a spin-system "shift+J" graph that could have come from a real molecule.
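As a rough illustration of the Karplus step, a vicinal coupling can be estimated from the H-C-C-H dihedral angle φ with the general form J(φ) = A cos²φ + B cos φ + C. The coefficients below are placeholder assumptions, since the page does not say which parameterisation the pipeline uses, and `karplus_3j` is a hypothetical name.

```python
import math


def karplus_3j(dihedral_deg, a=7.8, b=-1.0, c=1.8):
    """Estimate a vicinal 3J coupling (Hz) from a dihedral angle in degrees.

    a, b, c are illustrative Karplus coefficients (assumed, not the project's).
    """
    phi = math.radians(dihedral_deg)
    return a * math.cos(phi) ** 2 + b * math.cos(phi) + c


# Gauche, perpendicular, and anti arrangements.
for angle in (60, 90, 180):
    print(angle, round(karplus_3j(angle), 2))
```

The curve is largest near 0° and 180° and smallest near 90°, which is why the 3D embedding has to come before the coupling estimate.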
Diagonalize the spin Hamiltonian to produce accurate spectra at 90 and 600 MHz — a fast pure-Python engine, validated against MestreNova. This animation is that engine, live.
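For the smallest strongly coupled case, two spin-1/2 protons (an AB system), the Hamiltonian can be diagonalized in closed form, which makes the field dependence easy to inspect. The sketch below is not the project's engine. It is a standalone textbook example with a hypothetical function name and assumed input values, and it returns line positions in Hz with intensities normalized to sum to 1.

```python
import math


def ab_quartet(delta_a_ppm, delta_b_ppm, j_hz, field_mhz):
    """Lines of a two-spin-1/2 (AB) system as sorted (frequency_hz, intensity) pairs."""
    nu_a = delta_a_ppm * field_mhz  # resonance offsets in Hz
    nu_b = delta_b_ppm * field_mhz
    center = 0.5 * (nu_a + nu_b)
    # The mixed block [[d/2 - J/4, J/2], [J/2, -d/2 - J/4]], d = nu_a - nu_b,
    # has eigenvalues -J/4 +/- c/2 with c = sqrt(d^2 + J^2).
    c = math.hypot(nu_a - nu_b, j_hz)
    mixing = j_hz / c if c else 0.0  # sin(2*theta) of the mixing angle
    lines = []
    for sign_c in (-1, 1):
        for sign_j in (-1, 1):
            freq = center + 0.5 * (sign_c * c + sign_j * j_hz)
            # The inner pair is enhanced and the outer pair weakened (roof effect).
            weight = 1.0 - mixing if sign_c == sign_j else 1.0 + mixing
            lines.append((freq, weight / 4.0))
    return sorted(lines)


# Same assumed spin system at both fields: 0.10 ppm apart, J = 7 Hz.
for field in (90.0, 600.0):
    print(field, [(round(f, 2), round(w, 3)) for f, w in ab_quartet(1.20, 1.30, 7.0, field)])
```

At 90 MHz the outer lines are much weaker than the inner ones, while at 600 MHz the four lines approach two nearly symmetric doublets. Larger spin systems need numerical diagonalization of a matrix whose size grows exponentially with the number of spins.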
Train a network mapping a normalized 2¹⁴-point spectrum back to the 8×9 shift+J+degeneracy block — invariant to the arbitrary labeling of groups (S₈ permutation).
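One simple way to make an error measure independent of group labeling is to minimize it over all relabelings of the prediction. The page does not say how the actual model achieves S₈ invariance, so the brute-force search below is only an assumed illustration with hypothetical helper names. It also mixes ppm and Hz terms without weighting, which a real loss would probably not do.

```python
from itertools import permutations


def permuted(shifts, couplings, perm):
    """Relabel groups: apply the same permutation to shifts and to J rows/columns."""
    new_shifts = [shifts[i] for i in perm]
    new_j = [[couplings[i][k] for k in perm] for i in perm]
    return new_shifts, new_j


def sq_error(shifts_a, j_a, shifts_b, j_b):
    """Squared error over shifts plus the upper triangle of the coupling matrix."""
    n = len(shifts_a)
    err = sum((x - y) ** 2 for x, y in zip(shifts_a, shifts_b))
    err += sum((j_a[i][k] - j_b[i][k]) ** 2 for i in range(n) for k in range(i + 1, n))
    return err


def permutation_invariant_error(pred_shifts, pred_j, true_shifts, true_j):
    """Smallest squared error over every relabeling of the predicted groups."""
    n = len(true_shifts)
    return min(
        sq_error(*permuted(pred_shifts, pred_j, perm), true_shifts, true_j)
        for perm in permutations(range(n))
    )
```

For eight groups this enumerates 8! = 40,320 permutations, which is acceptable for evaluation but likely too slow inside a training loop.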
Each molecule has exactly eight magnetically distinct ¹H spin groups (A–H). Hover the table or the 3D structure to see which atoms belong to which group.
| Group | Equivalence | Protons | |
|---|---|---|---|
The spin graph is the model's learned intermediate representation: a labeled, undirected graph that sits between the raw spectrum and the output matrix. Nodes carry a chemical shift δ (ppm) and a proton degeneracy n. Edges carry a scalar coupling J (Hz) and a soft-equivalence flag marking protons that are chemically equivalent but magnetically inequivalent. The graph is stored as a symmetric 8×8 matrix with shifts on the diagonal and couplings off the diagonal, plus an 8×1 degeneracy vector, and is defined only up to permutation of the 8 labels (S₈). The structure on the left and the matrix on the right are two views of the same molecule now in the hero.
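The storage layout described above can be sketched as a plain nested list: an n×n symmetric part with shifts on the diagonal and couplings off the diagonal, followed by one extra column for degeneracy. The column order, the dictionary input format, and the name `build_block` are assumptions made for illustration, and the soft-equivalence flag is left out.

```python
def build_block(shifts, couplings, degeneracies):
    """Pack a spin graph into an n x (n + 1) block.

    shifts: list of chemical shifts in ppm, one per group
    couplings: dict mapping (i, k) index pairs to J in Hz
    degeneracies: list of proton counts, one per group
    """
    n = len(shifts)
    block = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        block[i][i] = shifts[i]               # diagonal: chemical shift
        block[i][n] = float(degeneracies[i])  # last column: degeneracy
    for (i, k), j_hz in couplings.items():
        if i == k:
            raise ValueError("a group has no coupling entry with itself")
        block[i][k] = block[k][i] = j_hz      # keep the J part symmetric
    return block


# Small three-group example with assumed values; eight groups give an 8x9 block.
for row in build_block([1.2, 3.4, 7.1], {(0, 1): 7.0}, [3, 2, 1]):
    print(row)
```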
Showing the molecule currently in the hero animation. Node labels: chemical shift δ (ppm) / degeneracy n. Edge thickness and labels: coupling J (Hz); only |J| > 0.5 Hz shown.
Each stage introduced a different innovation — first in architecture and loss, then in the dataset itself, and finally in scaling data and model capacity together. Metrics are on a leakage-controlled held-out test set the model never saw. Explore the learning curves, a full run comparison, and per-molecule test predictions ↓
Note (corrected dataset): the dataset was regenerated to give diastereotopic protons independent shifts (they were previously merged), and the whole recipe ladder is being retrained on this corrected data across all three tiers. The quantitative model metrics below are being regenerated and will update as the retrained checkpoints land — treat current numbers as pending.
Training curves and the model comparison report validation-split metrics (used for model selection); the held-out test split — never seen in training or selection — is reported separately below and in the molecule explorer. New to the recipes (025–030)? Read how each model works →
Pick a model to see its predictions on a shared set of held-out PubChem test molecules: the leakage-controlled global 10% split that no model trained on or selected against (the CNN baseline was trained on ChEMBL, so PubChem is also out-of-distribution for it). The rendered trace re-simulates the model’s predicted spin system through the exact quantum simulator, so a close overlay means the recovered parameters reproduce the input spectrum. Predicted values that differ from ground truth by more than 0.1 ppm (shifts) or 1.5 Hz (couplings) are shown in red.
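The red highlighting amounts to a per-value tolerance check. A minimal sketch, assuming predictions and ground truth are already aligned to the same group labels and using hypothetical names:

```python
SHIFT_TOL_PPM = 0.1    # tolerance for chemical shifts
COUPLING_TOL_HZ = 1.5  # tolerance for scalar couplings


def flag_outliers(predicted, truth, tolerance):
    """Return True for each value whose absolute error exceeds the tolerance."""
    return [abs(p - t) > tolerance for p, t in zip(predicted, truth)]


# Assumed example values, not taken from the test set.
print(flag_outliers([1.25, 3.40], [1.20, 3.55], SHIFT_TOL_PPM))
print(flag_outliers([7.0, 2.0], [6.0, 4.0], COUPLING_TOL_HZ))
```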
RDKit · NumPy / SciPy · PyTorch · spin-Hamiltonian simulation