Methodology Reference

How MRM Method Builder normalizes inputs and assembles GC-QQQ output

MRM Method Builder converts compound names or CAS numbers into workflow-aware GC-QQQ output by normalizing inputs, routing requests by family, and returning vendor-neutral transition rows with method metadata where available.

This page describes the current production logic. It explains which dataset path each workflow family (pesticide, environmental, odor/VOC) uses and why final lab validation remains necessary.


Accept list inputs

The workflow starts with a user-supplied list of compound names or CAS numbers.

Normalize entries

Each item is resolved to a canonical internal record when a confident match exists.

Route by family

Family selection controls which dataset and which method-logic path are applied.

Assemble output rows

Matched compounds are expanded into vendor-neutral transition rows with method context.

Return audit metadata

The response includes preview state, counts, and unsupported cases where relevant.

Input normalization

Input normalization removes empty lines, deduplicates repeated entries, and then attempts to match each remaining item to a canonical internal compound record.

For non-odor workflows, the current implementation searches the main GC dataset in data/database.csv using CAS numbers, compound names, and supported alternate naming fields. For odor workflows, the resolver uses the odor dataset and maps names or CAS values to a canonical odor CAS key.

If an input cannot be matched confidently, it remains unmatched. If multiple aliases resolve to the same canonical CAS, the system keeps one matched record and reports the rest as duplicates rather than inflating the result set.
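The rules above can be sketched as follows. This is a minimal illustration, not the production resolver: the `CompoundRecord` shape, the `normalizeInputs` function, and the two-entry `SAMPLE_LIBRARY` are hypothetical, and matching here is an exact case-insensitive lookup on CAS, name, or alias, which is an assumption because this page does not describe the real confidence rules.

```typescript
// Hypothetical record shape; the real columns of data/database.csv are not shown on this page.
interface CompoundRecord {
  cas: string;
  name: string;
  aliases: string[];
}

interface NormalizationResult {
  matched: CompoundRecord[]; // one record per canonical CAS
  duplicates: string[];      // inputs that resolved to an already-matched CAS
  unmatched: string[];       // inputs with no match, kept visible for review
}

// Illustrative sample library only; not an excerpt of the production dataset.
const SAMPLE_LIBRARY: CompoundRecord[] = [
  { cas: "1912-24-9", name: "Atrazine", aliases: [] },
  { cas: "2921-88-2", name: "Chlorpyrifos", aliases: ["Chlorpyrifos-ethyl"] },
];

function toKey(value: string): string {
  return value.trim().toLowerCase();
}

function normalizeInputs(rawLines: string[], library: CompoundRecord[]): NormalizationResult {
  // Index every identifier (CAS, name, alias) to its canonical record.
  const index: Record<string, CompoundRecord | undefined> = Object.create(null);
  for (const record of library) {
    for (const id of [record.cas, record.name].concat(record.aliases)) {
      index[toKey(id)] = record;
    }
  }

  const result: NormalizationResult = { matched: [], duplicates: [], unmatched: [] };
  const seenCas: Record<string, boolean | undefined> = Object.create(null);
  const seenUnmatched: Record<string, boolean | undefined> = Object.create(null);

  for (const line of rawLines) {
    const entry = line.trim();
    if (entry === "") continue; // skip empty lines

    const record = index[toKey(entry)];
    if (record === undefined) {
      // No silent fallback: report the entry as unmatched (once).
      if (!seenUnmatched[toKey(entry)]) {
        seenUnmatched[toKey(entry)] = true;
        result.unmatched.push(entry);
      }
    } else if (seenCas[record.cas]) {
      // A repeated entry or another alias of an already-matched CAS.
      result.duplicates.push(entry);
    } else {
      seenCas[record.cas] = true;
      result.matched.push(record);
    }
  }
  return result;
}
```

In this sketch, submitting both "Atrazine" and "1912-24-9" yields one matched record and one duplicate, so the result set is not inflated.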

Why this matters

Users can trust the build result only if unmatched and duplicate cases stay visible during review.

This is why the current logic favors explicit unmatched reporting over silent fallback matching.

Family routing and active data sources

Workflow family	Normalization source	Build source	Method behavior
Pesticides	Main GC dataset (data/database.csv)	Main GC dataset (data/database.csv)	Canonical GC method mapping selects RI fields and method fingerprints.
Environmental	Main GC dataset (data/database.csv)	Main GC dataset (data/database.csv)	Shares the current canonical GC method handling used by the main dataset path.
Odor / VOCs	Odor resolver (data/odor/odor-dataset.json)	Odor transitions + RI/RT metadata (data/odor/odor-dataset.json)	Method support differs between WAX and 5ms; unsupported compounds are reported explicitly.

Family selection changes the actual processing path. In the current codebase, pesticide and environmental workflows share the main GC dataset, while odor workflows use a separate odor-specific path generated from the odor source workbook.
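One way to picture the routing table is a lookup keyed by family. The sketch below is illustrative only: the family keys, the `FamilyRoute` shape, and the `routeFamily` function are hypothetical names, while the two file paths are the ones listed on this page.

```typescript
// Hypothetical family keys; the identifiers used in the real codebase are not shown on this page.
type WorkflowFamily = "pesticides" | "environmental" | "odor";

interface FamilyRoute {
  path: "main-gc" | "odor";    // which processing path handles the request
  normalizationSource: string; // dataset used to resolve names and CAS numbers
  buildSource: string;         // dataset used to assemble transition rows
}

const MAIN_GC_DATASET = "data/database.csv";
const ODOR_DATASET = "data/odor/odor-dataset.json";

const FAMILY_ROUTES: Record<WorkflowFamily, FamilyRoute> = {
  pesticides: { path: "main-gc", normalizationSource: MAIN_GC_DATASET, buildSource: MAIN_GC_DATASET },
  environmental: { path: "main-gc", normalizationSource: MAIN_GC_DATASET, buildSource: MAIN_GC_DATASET },
  odor: { path: "odor", normalizationSource: ODOR_DATASET, buildSource: ODOR_DATASET },
};

function routeFamily(family: WorkflowFamily): FamilyRoute {
  return FAMILY_ROUTES[family];
}
```

Keeping the routes in one table makes the shared pesticide/environmental path explicit and leaves a single place to change if a family later gets its own dataset.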

Method-aware build logic

After normalization, the build step assembles vendor-neutral output rows for the selected family and method context. On the main GC path, canonical method IDs are resolved through the method-mapping layer. That method context is then used to select the relevant RI field and attach a method fingerprint to the result.
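The main GC path described above might look roughly like this. Everything here is an assumption made for illustration: the `MethodConfig` shape, the `method-a` / `method-b` entries, the `ri_method_a`-style column names, and the label normalization rule are all hypothetical, since this page does not show the schema of data/methods.json or the contents of lib/utils/gcMethods.ts.

```typescript
// Hypothetical method configuration entry.
interface MethodConfig {
  id: string;          // canonical method ID
  labels: string[];    // label variants that should resolve to this ID
  riField: string;     // dataset column holding the RI for this method
  fingerprint: string; // opaque identifier attached to every output row
}

// Placeholder entries, not real methods.
const SAMPLE_METHODS: MethodConfig[] = [
  { id: "method-a", labels: ["Method A"], riField: "ri_method_a", fingerprint: "method-a:v1" },
  { id: "method-b", labels: ["Method B"], riField: "ri_method_b", fingerprint: "method-b:v1" },
];

// Assumed normalization rule: trim, lowercase, collapse spaces/underscores/hyphens to "-".
function canonicalizeLabel(label: string): string {
  return label.trim().toLowerCase().replace(/[\s_-]+/g, "-");
}

// Resolve a user-facing label to a canonical method, or undefined if unknown.
function resolveMethod(label: string, methods: MethodConfig[]): MethodConfig | undefined {
  const wanted = canonicalizeLabel(label);
  for (const method of methods) {
    for (const candidate of [method.id].concat(method.labels)) {
      if (canonicalizeLabel(candidate) === wanted) return method;
    }
  }
  return undefined;
}

// Pick the RI column for the resolved method and attach its fingerprint.
function attachMethodContext(
  datasetRow: Record<string, string | undefined>,
  method: MethodConfig
): { ri: number | null; methodFingerprint: string } {
  const raw = datasetRow[method.riField];
  const value = raw === undefined ? NaN : parseFloat(raw);
  return { ri: isFinite(value) ? value : null, methodFingerprint: method.fingerprint };
}
```

Returning `null` for a missing RI, rather than borrowing a value from another method's column, mirrors the page's preference for explicit reporting over silent fallback.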

On the odor path, the selected method controls compound availability, transitions, RI/RT metadata, RT windows, and unsupported-compound reporting. In the current project snapshot, WAX covers the full odor dataset while 5ms supports most, but not all, odor compounds.

Current odor method families

WAX (PEG, polar)
5ms (5% phenyl, low-polarity)
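As a rough sketch of method-aware odor support, the function below expands matched odor compounds into rows for the selected method and lists compounds with no data for that method as unsupported. The `OdorCompound` shape and `buildOdorRows` are hypothetical; the actual structure of data/odor/odor-dataset.json is not documented on this page.

```typescript
type OdorMethod = "WAX" | "5ms";

// Hypothetical per-method payload for one odor compound.
interface OdorMethodData {
  ri: number;
  transitions: { precursor: number; product: number }[];
}

interface OdorCompound {
  cas: string;
  name: string;
  methods: { WAX?: OdorMethodData; "5ms"?: OdorMethodData };
}

interface OdorRow {
  cas: string;
  name: string;
  precursor: number;
  product: number;
  ri: number;
}

interface OdorBuild {
  rows: OdorRow[];
  unsupported: OdorCompound[]; // matched, but no data for the selected method
}

function buildOdorRows(compounds: OdorCompound[], method: OdorMethod): OdorBuild {
  const build: OdorBuild = { rows: [], unsupported: [] };
  for (const compound of compounds) {
    const data = compound.methods[method];
    if (data === undefined) {
      // Keep the compound visible instead of dropping it silently.
      build.unsupported.push(compound);
      continue;
    }
    for (const transition of data.transitions) {
      build.rows.push({
        cas: compound.cas,
        name: compound.name,
        precursor: transition.precursor,
        product: transition.product,
        ri: data.ri,
      });
    }
  }
  return build;
}
```

With this shape, a compound that carries only a `WAX` entry produces rows under WAX and appears in `unsupported` when 5ms is selected.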

Returned row content

The build response returns export-oriented rows, not a final instrument method file.

compound identity and CAS
precursor and product ions
quantifier / qualifier role and relative intensity
RI references and RT window defaults when available
column phase, flow mode, oven program, and inlet context
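The row content listed above could be typed along these lines. The field names are hypothetical and the sample values are placeholders rather than real library data; optional fields reflect the "when available" wording.

```typescript
// Hypothetical shape of one export-oriented transition row.
interface TransitionRow {
  compoundName: string;
  cas: string;
  precursorMz: number;
  productMz: number;
  role: "quantifier" | "qualifier";
  relativeIntensity: number; // relative to the quantifier, in percent (assumed unit)
  ri?: number;               // retention index reference, when available
  rtWindowMin?: number;      // default RT window in minutes, when available
  method: {
    columnPhase: string;
    flowMode: string;
    ovenProgram: string;
    inlet: string;
  };
}

// True when the row carries at least one retention reference.
function hasRetentionReference(row: TransitionRow): boolean {
  return row.ri !== undefined || row.rtWindowMin !== undefined;
}

// Placeholder row for illustration only.
const SAMPLE_ROW: TransitionRow = {
  compoundName: "Placeholder compound",
  cas: "0000-00-0",
  precursorMz: 200,
  productMz: 100,
  role: "quantifier",
  relativeIntensity: 100,
  method: {
    columnPhase: "5ms",
    flowMode: "constant flow",
    ovenProgram: "placeholder program",
    inlet: "splitless",
  },
};
```

Keeping the method context on each row is what makes the output vendor-neutral: a downstream exporter can translate it into an instrument-specific format without another lookup.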

Preview and credits

Demo mode and limited-access sessions may receive restricted previews. Logged-out users and users without credits can also see preview-only output.

These limits change how much preview data is shown, but they do not change the normalization and family routing logic used by the build pipeline.
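A simple way to keep preview limits separate from pipeline logic is to apply them only after the full build has run. The sketch below assumes a hypothetical `SessionState` shape and a caller-supplied `previewLimit`; the real limits and session fields are not stated on this page, and the deterministic rule in `isPreviewOnly` simplifies the "may receive" wording above.

```typescript
// Hypothetical session flags.
interface SessionState {
  loggedIn: boolean;
  credits: number;
  demoMode: boolean;
  limitedAccess: boolean;
}

interface PreviewedResponse<Row> {
  preview: boolean;  // true when the output was truncated for preview
  totalRows: number; // size of the full build, regardless of preview state
  rows: Row[];
}

// Assumed rule: any of the four conditions named above triggers preview-only output.
function isPreviewOnly(session: SessionState): boolean {
  return session.demoMode || session.limitedAccess || !session.loggedIn || session.credits <= 0;
}

// Truncate already-built rows; normalization and routing have finished by this point.
function applyPreview<Row>(
  allRows: Row[],
  session: SessionState,
  previewLimit: number
): PreviewedResponse<Row> {
  const preview = isPreviewOnly(session);
  return {
    preview: preview,
    totalRows: allRows.length,
    rows: preview ? allRows.slice(0, previewLimit) : allRows,
  };
}
```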

Unsupported vs unmatched

Unmatched inputs are entries the system cannot map confidently to an internal record.

Unsupported compounds are matched compounds that exist in the broader library but are unavailable in the selected method context, especially in odor workflows.
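The distinction can be expressed as a small classifier that also produces the counts returned with the build. The `LookupOutcome` shape and both function names are hypothetical, and the order of the checks (duplicate before unsupported) is an assumption.

```typescript
type InputStatus = "matched" | "duplicate" | "unmatched" | "unsupported";

// Hypothetical summary of what the resolver found for one input.
interface LookupOutcome {
  inLibrary: boolean;         // resolved to an internal record
  alreadySeen: boolean;       // same canonical CAS appeared earlier in the list
  availableInMethod: boolean; // record has data for the selected method context
}

function classifyInput(outcome: LookupOutcome): InputStatus {
  if (!outcome.inLibrary) return "unmatched";           // no confident internal record
  if (outcome.alreadySeen) return "duplicate";          // same compound listed twice
  if (!outcome.availableInMethod) return "unsupported"; // known, but not in this method
  return "matched";
}

function countStatuses(statuses: InputStatus[]): Record<InputStatus, number> {
  const counts: Record<InputStatus, number> = { matched: 0, duplicate: 0, unmatched: 0, unsupported: 0 };
  for (const status of statuses) {
    counts[status] += 1;
  }
  return counts;
}
```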

Validation still required

Method Builder accelerates list preparation and method drafting. It does not replace laboratory validation.

Users still need to confirm RT behavior, ion ratios, matrix effects, acquisition settings, and acceptance criteria in their own environment.

Source files described by this page

Main GC dataset path

data/database.csv

Used by the current pesticide and environmental workflow path.

Odor dataset path

data/odor/odor-dataset.json

Generated from the odor source workbook and used by odor/VOC normalization and build logic.

Method configuration

data/methods.json + lib/utils/gcMethods.ts

Defines method fingerprints, canonical IDs, and method-label normalization rules.

Methodology FAQ

How does the platform match names and CAS numbers?

Inputs are trimmed, deduplicated, and matched to internal compound records. Non-odor workflows use the main GC dataset, while odor workflows use an odor-specific resolver that maps names and CAS values to canonical odor records.

Do pesticide and environmental workflows use the same dataset?

Yes, in the current implementation. Pesticide and environmental workflows share the main GC dataset in data/database.csv, while odor workflows use a separate odor dataset.

Why can an odor compound be supported in WAX but not in 5ms?

Odor support is method-aware. Some compounds are available in the WAX library but not in the current 5ms library, so they remain visible as unsupported when 5ms is selected.

Does the exported list replace lab validation?

No. The exported list is a drafting aid for workflow and method building, not a validated method. Users still need to verify RT behavior, ion ratios, matrix suitability, and acceptance criteria under their own laboratory conditions.

Use the methodology with the live workflow

Check current coverage, test your own compound list, and move into the workflow that matches your target class and method context.
