Methodology Reference

    How MRM Method Builder normalizes inputs and assembles GC-QQQ output

    MRM Method Builder converts compound names or CAS numbers into workflow-aware GC-QQQ output by normalizing inputs, routing requests by family, and returning vendor-neutral transition rows with method metadata where available.

    This page describes the current production logic. It explains how pesticide, environmental, and odor/VOC workflows use different dataset paths and why final lab validation remains necessary.

    1. Accept list inputs
       The workflow starts with a user-supplied list of compound names or CAS numbers.

    2. Normalize entries
       Each item is resolved to a canonical internal record when a confident match exists.

    3. Route by family
       Family selection controls which dataset and method logic path are applied.

    4. Assemble output rows
       Matched compounds are expanded into vendor-neutral transition rows with method context.

    5. Return audit metadata
       The response includes preview state, counts, and unsupported cases where relevant.

    Input normalization

    Input normalization starts by trimming each entry, dropping empty lines, and deduplicating repeated entries. It then attempts to match each remaining item to a canonical internal compound record.

    For non-odor workflows, the current implementation searches the main GC dataset in data/database.csv using CAS numbers, compound names, and supported alternate naming fields. For odor workflows, the resolver uses the odor dataset and maps names or CAS values to a canonical odor CAS key.

    If an input cannot be matched confidently, it remains unmatched. If multiple aliases resolve to the same canonical CAS, the system keeps one matched record and reports the rest as duplicates rather than inflating the result set.
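
    The normalization behavior described above can be sketched as follows. This is a minimal illustration, not the production resolver: the record shape, the function names (normalizeInputs, toLookupKey), and the exact-match rule are all assumptions made for the example, and repeated identical entries are reported as duplicates here purely by choice.

```typescript
// Hypothetical record shape; the real columns in data/database.csv may differ.
interface CompoundRecord {
  cas: string;       // canonical CAS number, used as the record key
  name: string;      // primary compound name
  aliases: string[]; // supported alternate names
}

interface NormalizationResult {
  matched: CompoundRecord[];
  unmatched: string[];  // entries with no confident match
  duplicates: string[]; // repeats, or aliases of an already matched CAS
}

// Assumption: a "confident match" is an exact, case-insensitive hit.
function toLookupKey(value: string): string {
  return value.trim().toLowerCase();
}

function normalizeInputs(inputs: string[], dataset: CompoundRecord[]): NormalizationResult {
  // Index every CAS, name, and alias to its canonical record.
  const lookup: Record<string, CompoundRecord> = Object.create(null);
  for (const record of dataset) {
    for (const label of [record.cas, record.name].concat(record.aliases)) {
      lookup[toLookupKey(label)] = record;
    }
  }

  const result: NormalizationResult = { matched: [], unmatched: [], duplicates: [] };
  const seenEntries: Record<string, boolean> = Object.create(null);
  const seenCas: Record<string, boolean> = Object.create(null);

  for (const raw of inputs) {
    const entry = raw.trim();
    if (entry === "") continue; // drop empty lines
    const key = toLookupKey(entry);
    const record = lookup[key];
    if (seenEntries[key] || (record !== undefined && seenCas[record.cas])) {
      result.duplicates.push(entry); // keep duplicates visible instead of inflating results
      continue;
    }
    seenEntries[key] = true;
    if (record === undefined) {
      result.unmatched.push(entry); // no silent fallback matching
    } else {
      seenCas[record.cas] = true;
      result.matched.push(record);
    }
  }
  return result;
}
```

    With a one-record dataset for lindane (CAS 58-89-9, alias gamma-HCH), the inputs "Lindane", "gamma-HCH", and "58-89-9" would yield one matched record and two duplicates, while an unknown name would be listed under unmatched.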

    Why this matters

    Users trust the build result only if unmatched and duplicate cases stay visible during review.

    This is why the current logic favors explicit unmatched reporting over silent fallback matching.

    Family routing and active data sources

    | Workflow family | Normalization source | Build source | Method behavior |
    | --- | --- | --- | --- |
    | Pesticides | Main GC dataset (data/database.csv) | Main GC dataset (data/database.csv) | Canonical GC method mapping selects RI fields and method fingerprints. |
    | Environmental | Main GC dataset (data/database.csv) | Main GC dataset (data/database.csv) | Shares the current canonical GC method handling used by the main dataset path. |
    | Odor / VOCs | Odor resolver (data/odor/odor-dataset.json) | Odor transitions + RI/RT metadata (data/odor/odor-dataset.json) | Support depends on the selected method (WAX or 5ms); unsupported compounds are reported explicitly. |

    Family selection changes the actual processing path. In the current codebase, pesticide and environmental workflows share the main GC dataset, while odor workflows use a separate odor-specific dataset generated from the odor source workbook.
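
    As a rough sketch, the routing can be thought of as a lookup from family to data sources. The two file paths come from this page; the type names, the family identifiers, and the routeFamily helper are hypothetical and only illustrate the idea.

```typescript
// Family identifiers are illustrative; the production values may be named differently.
type WorkflowFamily = "pesticides" | "environmental" | "odor";

interface FamilyRoute {
  pathKind: "main-gc" | "odor"; // which processing path handles the request
  normalizationSource: string;  // dataset used to resolve names and CAS numbers
  buildSource: string;          // dataset used to assemble output rows
}

const MAIN_GC_DATASET = "data/database.csv";
const ODOR_DATASET = "data/odor/odor-dataset.json";

// Pesticide and environmental workflows share one route; odor has its own.
const FAMILY_ROUTES: Record<WorkflowFamily, FamilyRoute> = {
  pesticides: { pathKind: "main-gc", normalizationSource: MAIN_GC_DATASET, buildSource: MAIN_GC_DATASET },
  environmental: { pathKind: "main-gc", normalizationSource: MAIN_GC_DATASET, buildSource: MAIN_GC_DATASET },
  odor: { pathKind: "odor", normalizationSource: ODOR_DATASET, buildSource: ODOR_DATASET },
};

function routeFamily(family: WorkflowFamily): FamilyRoute {
  return FAMILY_ROUTES[family];
}
```

    Keeping the mapping in a single table makes it easy to see that two families currently resolve to the same dataset, and it gives one place to change if that ever stops being true.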

    Method-aware build logic

    After normalization, the build step assembles vendor-neutral output rows for the selected family and method context. On the main GC path, canonical method IDs are resolved through the method-mapping layer. That method context is then used to select the relevant RI field and attach a method fingerprint to the result.
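
    The main GC path can be pictured roughly as below. Every method ID, label, RI field name, and fingerprint string in this sketch is a placeholder invented for illustration; the real values live in data/methods.json and lib/utils/gcMethods.ts and are not reproduced here.

```typescript
// Placeholder method configuration; not the contents of data/methods.json.
interface GcMethodConfig {
  id: string;          // canonical method ID
  labels: string[];    // user-facing labels that normalize to this ID
  riField: string;     // dataset column holding the RI for this method
  fingerprint: string; // summary of the method conditions
}

const GC_METHODS: GcMethodConfig[] = [
  { id: "method-a", labels: ["Method A"], riField: "ri_method_a", fingerprint: "placeholder-fingerprint-a" },
  { id: "method-b", labels: ["Method B"], riField: "ri_method_b", fingerprint: "placeholder-fingerprint-b" },
];

// Resolve a free-form label to its canonical method, ignoring case and padding.
function resolveMethod(label: string): GcMethodConfig | undefined {
  const wanted = label.trim().toLowerCase();
  for (const method of GC_METHODS) {
    for (const candidate of [method.id].concat(method.labels)) {
      if (candidate.toLowerCase() === wanted) return method;
    }
  }
  return undefined;
}

interface MethodContext {
  methodId: string;
  fingerprint: string;
  ri: number | null; // null when the dataset row has no usable RI for this method
}

// Pick the RI column for the resolved method and attach its fingerprint.
function selectMethodContext(
  row: Record<string, string | undefined>,
  methodLabel: string
): MethodContext | undefined {
  const method = resolveMethod(methodLabel);
  if (method === undefined) return undefined;
  const raw = row[method.riField];
  const parsed = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
  return { methodId: method.id, fingerprint: method.fingerprint, ri: isFinite(parsed) ? parsed : null };
}
```

    Returning null for a missing RI, rather than a default number, keeps gaps in the reference data visible in the output.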

    On the odor path, the selected method controls compound availability, transitions, RI/RT metadata, RT windows, and unsupported-compound reporting. In the current project snapshot, WAX covers the full odor dataset while 5ms supports most, but not all, odor compounds.

    Current odor method families
    • WAX (PEG, polar)
    • 5ms (5% phenyl, low-polarity)
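
    A minimal sketch of method-aware odor support follows. It assumes each odor record lists the methods that cover it; the supportedMethods field, the type names, and the splitByMethod helper are hypothetical and do not describe the actual structure of data/odor/odor-dataset.json.

```typescript
type OdorMethod = "WAX" | "5ms";

// Hypothetical record shape for a matched odor compound.
interface OdorCompound {
  cas: string;
  compoundName: string;
  supportedMethods: OdorMethod[]; // methods whose library includes this compound
}

interface OdorSupportSplit {
  supported: OdorCompound[];
  unsupported: OdorCompound[]; // matched, but unavailable in the selected method
}

// Partition matched compounds by availability in the selected method.
// Unsupported compounds are returned, not dropped, so they stay visible.
function splitByMethod(compounds: OdorCompound[], method: OdorMethod): OdorSupportSplit {
  const split: OdorSupportSplit = { supported: [], unsupported: [] };
  for (const compound of compounds) {
    if (compound.supportedMethods.indexOf(method) !== -1) {
      split.supported.push(compound);
    } else {
      split.unsupported.push(compound);
    }
  }
  return split;
}
```

    Under this model, a compound that lists only WAX would appear under supported when WAX is selected and under unsupported when 5ms is selected.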

    Returned row content

    The build response returns export-oriented rows, not a final instrument method file.

    • compound identity and CAS
    • precursor and product ions
    • quantifier / qualifier role and relative intensity
    • RI references and RT window defaults when available
    • column phase, flow mode, oven program, and inlet context
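
    One way to picture a returned row is as a flat record that repeats compound and method context for every transition. The field names and the expandToRows helper below are assumptions for illustration; the actual response schema may use different names and units.

```typescript
type IonRole = "quantifier" | "qualifier";

interface IonTransition {
  precursorMz: number;
  productMz: number;
  role: IonRole;
  relativeIntensity: number; // relative to the quantifier, in percent (assumed unit)
}

interface MethodInfo {
  columnPhase: string;
  flowMode: string;
  ovenProgram: string;
  inlet: string;
}

// Hypothetical flat row: one per transition, vendor-neutral.
interface TransitionRow extends MethodInfo {
  compound: string;
  cas: string;
  precursorMz: number;
  productMz: number;
  role: IonRole;
  relativeIntensity: number;
  ri: number | null;          // RI reference when available
  rtWindowMin: number | null; // default RT window when available
}

interface MatchedCompound {
  compound: string;
  cas: string;
  ri: number | null;
  rtWindowMin: number | null;
  transitions: IonTransition[];
}

// Expand one matched compound into one row per transition.
function expandToRows(item: MatchedCompound, method: MethodInfo): TransitionRow[] {
  return item.transitions.map((transition) => ({
    compound: item.compound,
    cas: item.cas,
    precursorMz: transition.precursorMz,
    productMz: transition.productMz,
    role: transition.role,
    relativeIntensity: transition.relativeIntensity,
    ri: item.ri,
    rtWindowMin: item.rtWindowMin,
    columnPhase: method.columnPhase,
    flowMode: method.flowMode,
    ovenProgram: method.ovenProgram,
    inlet: method.inlet,
  }));
}
```

    Repeating the method context on each row is redundant but makes every exported line self-describing, which suits a list that will be transferred into vendor software by hand or by script.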

    Preview and credits

    Sessions in demo mode or with limited access may receive restricted previews. Logged-out users and users without credits may also see preview-only output.

    These limits change how much preview data is shown, but they do not change the normalization and family routing logic used by the build pipeline.
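
    The separation between building and previewing might look like the sketch below, where the preview step only trims already assembled rows. The access fields, the row limit, and the applyPreview helper are hypothetical; the actual limits and access rules are not specified on this page.

```typescript
// Hypothetical access state for the current session.
interface PreviewAccess {
  loggedIn: boolean;
  credits: number;
  demoMode: boolean;
}

interface PreviewResult<T> {
  rows: T[];         // rows actually shown
  preview: boolean;  // true when output was restricted
  totalRows: number; // count before any restriction
}

const PREVIEW_ROW_LIMIT = 5; // placeholder value

// Runs after the build step; it never changes how rows were matched or routed.
function applyPreview<T>(rows: T[], access: PreviewAccess): PreviewResult<T> {
  const restricted = access.demoMode || !access.loggedIn || access.credits <= 0;
  return {
    rows: restricted ? rows.slice(0, PREVIEW_ROW_LIMIT) : rows,
    preview: restricted,
    totalRows: rows.length,
  };
}
```

    Reporting totalRows alongside the trimmed list lets a restricted user still see how many rows the full build produced.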

    Unsupported vs unmatched

    Unmatched inputs are entries the system cannot map confidently to an internal record.

    Unsupported compounds are matched compounds that exist in the broader library but are unavailable in the selected method context, especially in odor workflows.

    Validation still required

    Method Builder accelerates list preparation and method drafting. It does not replace laboratory validation.

    Users still need to confirm RT behavior, ion ratios, matrix effects, acquisition settings, and acceptance criteria in their own environment.

    Source files described by this page

    • Main GC dataset path: data/database.csv
      Used by the current pesticide and environmental workflow path.
    • Odor dataset path: data/odor/odor-dataset.json
      Generated from the odor source workbook and used by odor/VOC normalization and build logic.
    • Method configuration: data/methods.json + lib/utils/gcMethods.ts
      Defines method fingerprints, canonical IDs, and method-label normalization rules.

    Methodology FAQ

    How does the platform match names and CAS numbers?

    Inputs are trimmed, deduplicated, and matched to internal compound records. Non-odor workflows use the main GC dataset, while odor workflows use an odor-specific resolver that maps names and CAS values to canonical odor records.

    Do pesticide and environmental workflows use the same dataset?

    Yes, in the current implementation. Pesticide and environmental workflows share the main GC dataset in data/database.csv, while odor workflows use a separate odor dataset.

    Why can an odor compound be supported in WAX but not in 5ms?

    Odor support is method-aware. Some compounds are available in the WAX library but not in the current 5ms library, so they remain visible as unsupported when 5ms is selected.

    Does the exported list replace lab validation?

    No. The exported list is a workflow and method-building output. Users still need to verify RT behavior, ion ratios, matrix suitability, and acceptance criteria under their own laboratory conditions.
