NGS Bioinformatics Pipeline Change-Control File: How to Document, Revalidate, and Audit Every Pipeline Update
Practical guide to change control for NGS bioinformatics pipelines in IVD devices — covering variant caller updates, reference database changes, threshold modifications, wet-lab/software interface validation, IEC 62304 documentation, revalidation triggers, and postmarket records.
What This Article Covers / Does Not Cover
This article covers one specific operational artifact: the change-control file for an NGS bioinformatics pipeline that is part of an IVD device. It explains what to document when any component of the pipeline changes (variant caller, aligner, annotator, reference database, threshold parameter), how to determine the scope of revalidation, what records to keep in the design history file (DHF) and software file, and how to survive a Notified Body or FDA audit of pipeline changes.
This article does not cover how to design an NGS panel, how to validate a bioinformatics pipeline from scratch, how to write a 510(k) or IVDR technical documentation for an NGS device, or general NGS regulatory strategy. For the broad regulatory framework, see NGS Diagnostic Devices Regulatory Guide. For software lifecycle requirements, see IEC 62304 Edition 2 2026 Update. For the change-control framework, see Medical Device Change Control.
Why Pipeline Change Control Is Uniquely Difficult
NGS bioinformatics pipelines are not static artifacts. They are chains of interdependent software tools — base callers, aligners, variant callers, annotators, classifiers — each maintained by different teams, each on its own release cadence. A single pipeline may incorporate GATK, BWA-MEM2, Samtools, a custom annotation engine, ClinVar, COSMIC, and a proprietary classification rule set. When any one of these components is updated, the change can silently affect variant calls downstream.
FDA's 2018 final guidance "Considerations for Design, Development, and Analytical Validation of Next Generation Sequencing (NGS)-Based In Vitro Diagnostics (IVDs) Intended to Aid in the Diagnosis of Suspected Germline Diseases" and the 2024 draft guidance on analytical validation of NGS-based IVDs both require manufacturers to "describe and document all software to be used, including the source, versions, and any modifications" and to "document software versions and traceability, reference sequence assembly, and elements needed to compile, install, and run the bioinformatics pipeline." Under EU IVDR, bioinformatics pipelines are classified as medical device software subject to IEC 62304 lifecycle requirements, with risk management per ISO 14971.
The regulatory expectation is clear: every pipeline change must be controlled, evaluated for impact, revalidated to the extent necessary, documented, and approved before release. The practical challenge is that "revalidated to the extent necessary" is poorly defined, and Notified Bodies differ significantly in what they accept.
Pipeline Component Inventory: What You Are Controlling
Before you can control changes, you must know what is in the pipeline. The first artifact in the change-control file is a complete component inventory.
Component Inventory Table
| Component | Function | Source | Version (Current) | License | Config File | Risk Level | Classification |
|---|---|---|---|---|---|---|---|
| BWA-MEM2 | Read alignment to reference genome | Open source (GitHub) | v2.2.1 | MIT | align_config.yml | Medium | IEC 62304 Level II |
| GATK HaplotypeCaller | SNV/indel variant calling | Open source (Broad Institute) | v4.5.0.0 | BSD-3 | gatk_config.ini | High | IEC 62304 Level II |
| Manta | Structural variant calling | Open source (Illumina) | v1.6.0 | GPL v2 | manta_config.ini | High | IEC 62304 Level II |
| Custom annotator | Variant annotation and classification | In-house developed | v3.1.4 | Proprietary | annotator_config.json | High | IEC 62304 Level II |
| ClinVar database | Clinical significance database | NCBI public | 2025-10 release | Public domain | clinvar_release.txt | High | Reference data |
| COSMIC database | Somatic mutation database | Sanger Institute (licensed) | v99 | Commercial | cosmic_release.csv | Medium | Reference data |
| Reference genome | GRCh38/hg38 alignment target | UCSC/NCBI | GRCh38.p14 | Public domain | reference.fa | High | Reference data |
| Python environment | Runtime environment | Python Software Foundation | 3.11.6 | PSF | requirements.txt | Low | Infrastructure |
| Docker container | Deployment and packaging | Docker Inc. | N/A (custom image) | Apache 2.0 | Dockerfile | Low | Infrastructure |
What goes in the file: The complete inventory table, with version numbers, sources, license types, and configuration file references, is stored in the DHF as part of the Software Requirements Specification (SRS) per IEC 62304 Section 5.2. Every change request must reference this inventory.
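Because every change request must reference the inventory, it helps to be able to compare two inventory snapshots mechanically. The following is a minimal sketch, assuming the inventory is exported as a list of records; the `Component` class and `diff_inventory` function are hypothetical names for illustration, not part of any standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One row of the component inventory (only a subset of columns is modeled)."""
    name: str
    version: str
    risk_level: str  # "High", "Medium", or "Low", as in the inventory table


def diff_inventory(baseline, candidate):
    """Return {name: (old_version, new_version)} for added, removed, or changed components.

    A missing side is reported as None, so additions and removals are visible too.
    """
    old = {c.name: c.version for c in baseline}
    new = {c.name: c.version for c in candidate}
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Illustrative snapshots: only the variant caller version differs.
baseline = [
    Component("BWA-MEM2", "v2.2.1", "Medium"),
    Component("GATK HaplotypeCaller", "v4.4.0.0", "High"),
]
candidate = [
    Component("BWA-MEM2", "v2.2.1", "Medium"),
    Component("GATK HaplotypeCaller", "v4.5.0.0", "High"),
]
changes = diff_inventory(baseline, candidate)
print(changes)
```

Each key in the result is a component that needs its own change-control record before the candidate inventory can become the new baseline.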
Change Classification Decision Tree
Not all pipeline changes carry the same risk. Use this decision tree to classify the change and determine the required validation scope.
START: Pipeline component change identified
│
├─ Is the change to a variant caller, aligner, or classifier algorithm?
│ ├─ YES → HIGH-IMPACT CHANGE
│ │ ├─ Full revalidation required
│ │ ├─ Run complete reference dataset (≥30 samples across variant types)
│ │ ├─ Concordance analysis: new vs. previous pipeline version
│ │ ├─ Update DHF, SRS, SDD, SVR, and release notes
│ │ └─ QA approval + regulatory affairs review
│ │
│ └─ NO → Go to next question
│
├─ Is the change to a reference database (ClinVar, COSMIC, dbSNP)?
│ ├─ YES → MEDIUM-IMPACT CHANGE
│ │ ├─ Differential analysis: identify added/removed/modified entries
│ │ ├─ Re-run affected samples from validation set
│ │ ├─ Verify no regression in previously called variants
│ │ ├─ Update database version in SRS and configuration records
│ │ └─ QA approval
│ │
│ └─ NO → Go to next question
│
├─ Is the change to a threshold, filter, or reporting parameter?
│ ├─ YES → MEDIUM-IMPACT CHANGE
│ │ ├─ Sensitivity/specificity analysis on reference dataset
│ │ ├─ Edge case testing with known challenging variants
│ │ ├─ Update configuration documentation
│ │ └─ QA approval + regulatory affairs review
│ │
│ └─ NO → Go to next question
│
├─ Is the change to infrastructure (OS, runtime, container, CI/CD)?
│ ├─ YES → LOW-IMPACT CHANGE
│ │ ├─ Regression test: re-run subset of validation samples (≥5)
│ │ ├─ Verify identical variant calls vs. baseline
│ │ ├─ Update configuration management records
│ │ └─ Developer lead approval
│ │
│ └─ NO → Bug fix or cosmetic change
│ ├─ MINIMAL-IMPACT CHANGE
│ ├─ Unit test for fix
│ ├─ Regression test (≥3 samples)
│ └─ Developer lead approval
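The tree above can be encoded so that classification is applied the same way for every change request. This is a sketch under stated assumptions: the category labels (`"variant_caller"`, `"ci_cd"`, and so on) are hypothetical names chosen for illustration, and, unlike the tree's final fall-through branch, the sketch raises an error for an unrecognized category rather than silently treating it as minimal impact.

```python
from enum import Enum


class Impact(Enum):
    HIGH = "high-impact"
    MEDIUM = "medium-impact"
    LOW = "low-impact"
    MINIMAL = "minimal-impact"


# Rules are evaluated in the same order as the questions in the decision tree.
# Category labels are hypothetical; adapt them to your own change-request form.
_RULES = [
    ({"variant_caller", "aligner", "classifier"}, Impact.HIGH),
    ({"reference_database"}, Impact.MEDIUM),
    ({"threshold", "filter", "reporting_parameter"}, Impact.MEDIUM),
    ({"os", "runtime", "container", "ci_cd"}, Impact.LOW),
    ({"bug_fix", "cosmetic"}, Impact.MINIMAL),
]


def classify_change(category: str) -> Impact:
    """Map a change category to its impact class, following the tree top to bottom."""
    for categories, impact in _RULES:
        if category in categories:
            return impact
    # Design choice (an assumption of this sketch): refuse to guess for unknown
    # categories so that a person has to classify them explicitly.
    raise ValueError(f"Unrecognized change category: {category!r}")


print(classify_change("variant_caller").value)
print(classify_change("container").value)
```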
Change Classification Evidence Table
| Change Classification | Validation Scope | Sample Size | DHF Update Sections | Approval Authority | Timeline (Typical) |
|---|---|---|---|---|---|
| High-impact | Full revalidation | ≥30 samples, all variant types | SRS, SDD, SVR, RA, Release Notes | QA + RA | 4–8 weeks |
| Medium-impact (database) | Partial revalidation | ≥10 samples, affected variant types | SRS (database version), SVR | QA | 2–4 weeks |
| Medium-impact (threshold) | Sensitivity/specificity | ≥10 samples + edge cases | SRS, SDD, SVR | QA + RA | 2–4 weeks |
| Low-impact | Regression test | ≥5 samples | Configuration records | Dev Lead | 1–2 weeks |
| Minimal-impact | Unit test + regression | ≥3 samples | Bug fix record | Dev Lead | 1–5 days |
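The minimum sample sizes in the table lend themselves to a simple automated check on a draft validation plan. The sketch below assumes the classification labels are stored as lowercase strings; `MIN_SAMPLES` and `sample_shortfall` are hypothetical names.

```python
# Minimum sample counts taken from the Sample Size column of the table above.
MIN_SAMPLES = {
    "high-impact": 30,
    "medium-impact": 10,
    "low-impact": 5,
    "minimal-impact": 3,
}


def sample_shortfall(classification: str, planned_samples: int) -> int:
    """Return how many more samples the plan needs (0 if the minimum is met)."""
    return max(0, MIN_SAMPLES[classification] - planned_samples)


print(sample_shortfall("high-impact", 30))
print(sample_shortfall("medium-impact", 7))
```

Meeting the minimum count is necessary but not sufficient: the plan still has to cover the variant types named in the Validation Scope column.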
The Change-Control File: What Goes In
Each pipeline change generates a change-control record. Here is the complete list of fields and documents that must be in the file.
Change-Control Record Data Fields
| Field | Description | Example | Owner |
|---|---|---|---|
| Change ID | Unique identifier | CC-NGS-2026-042 | QA |
| Date initiated | When the change was requested | 2026-03-15 | Requestor |
| Component affected | Which pipeline component | GATK HaplotypeCaller | Dev Lead |
| Current version | Version being replaced | v4.4.0.0 | Dev Lead |
| Proposed version | New version | v4.5.0.0 | Dev Lead |
| Change classification | From decision tree | High-impact | RA + QA |
| Justification | Why the change is needed | "GATK 4.5 improves indel calling in low-complexity regions; addresses known FP in homopolymer runs (see CAPA-2026-018)" | Requestor |
| Risk assessment | ISO 14971 impact analysis | "Change affects SNV/indel calling accuracy; no impact on structural variant pipeline or reporting" | RA |
| Validation protocol | Reference to protocol document | VAL-NGS-2026-042 | Dev Lead |
| Validation dataset | Reference samples used | "GIAB HG001–HG005 (5 samples) + 25 clinical residual specimens spanning SNV, indel, SV, CNV" | Dev Lead |
| Concordance results | Agreement with previous version | "99.7% concordance (SNV), 99.2% (indel); 3 discordant calls reviewed and resolved (see Appendix B)" | Dev Lead |
| Regression test results | Baseline comparison | "All 5 regression samples show identical calls vs. v4.4.0.0 baseline" | Dev Lead |
| Updated documents | List of DHF documents updated | SRS §4.3.2, SDD §5.1, SVR §6.4, RA §7.2 | QA |
| RA review | Regulatory affairs assessment | "No change to intended use, no new submission required; document in annual report" | RA |
| QA approval | Final sign-off | Approved 2026-04-10 | QA Manager |
| Release date | When the change goes live | 2026-04-15 | Release Manager |
| Post-release monitoring | Monitoring plan | "Monitor discordant call rate for 30 days; trigger threshold: >0.5% unexpected discordance" | Dev Lead |
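A record with blank fields is a common audit finding, so a completeness check before QA sign-off can be useful. The following is a minimal sketch, assuming the record is held as a dictionary; the snake_case keys are hypothetical renderings of the field names in the table, and `missing_fields` is an illustrative helper.

```python
# One key per row of the data-fields table (key spellings are assumptions).
REQUIRED_FIELDS = [
    "change_id", "date_initiated", "component_affected", "current_version",
    "proposed_version", "change_classification", "justification",
    "risk_assessment", "validation_protocol", "validation_dataset",
    "concordance_results", "regression_test_results", "updated_documents",
    "ra_review", "qa_approval", "release_date", "post_release_monitoring",
]


def missing_fields(record: dict) -> list:
    """Return required fields that are absent, None, or whitespace-only."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


# A record that has been opened but not yet validated or approved.
record = {
    "change_id": "CC-NGS-2026-042",
    "date_initiated": "2026-03-15",
    "component_affected": "GATK HaplotypeCaller",
    "current_version": "v4.4.0.0",
    "proposed_version": "v4.5.0.0",
    "change_classification": "High-impact",
    "qa_approval": "  ",  # whitespace only, so it counts as missing
}
missing = missing_fields(record)
print(missing)
```

A release step could refuse to proceed while `missing` is non-empty.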
Validation Protocol Requirements
High-Impact Change: Full Revalidation Protocol
For variant caller or aligner changes, the validation protocol must include:
- Reference materials: Genome in a Bottle (GIAB) samples from NIST (HG001–HG005 at minimum), covering diverse ancestries and variant types
- Clinical residual specimens: ≥25 previously characterized samples spanning SNVs, indels, structural variants, and copy number variants
- Concordance analysis: Compare new pipeline calls vs. previous pipeline calls, categorize discordant calls, and resolve each
- Performance metrics: Sensitivity, specificity, positive predictive value, negative predictive value, for each variant type, stratified by variant allele frequency (VAF) range
- Edge cases: Samples with homopolymer runs, low-complexity regions, GC-rich regions, low tumor purity (<10%), high VAF (>90%)
- Wet-lab interface verification: Confirm that raw data output (FASTQ/BAM) from the sequencing instrument is correctly processed by the updated pipeline
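The concordance step in the list above can be sketched as a set comparison. The assumptions here are significant: each call is reduced to a `(chrom, pos, ref, alt)` tuple with variants already normalized, and concordance is defined as shared calls divided by the union of calls. The article does not prescribe a formula, so state whichever definition you use in the validation protocol. The coordinates are invented for illustration.

```python
def concordance(old_calls: set, new_calls: set) -> dict:
    """Compare two call sets keyed by (chrom, pos, ref, alt).

    Concordance is computed as |shared| / |union| (an assumption of this sketch).
    Discordant calls are returned so each one can be reviewed and resolved.
    """
    shared = old_calls & new_calls
    union = old_calls | new_calls
    rate = len(shared) / len(union) if union else 1.0
    return {
        "concordance": rate,
        "lost": sorted(old_calls - new_calls),    # called before, not called now
        "gained": sorted(new_calls - old_calls),  # newly called
    }


# Hypothetical call sets from the previous and updated pipeline versions.
old_calls = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "GA")}
new_calls = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr4", 400, "T", "C")}
result = concordance(old_calls, new_calls)
print(result)
```

In practice the comparison would be run per variant type and VAF stratum, with the `lost` and `gained` lists feeding the discordant-call review.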
Medium-Impact Change: Database Update Protocol
For reference database updates:
- Differential analysis: Generate a list of all entries added, removed, or modified between the old and new database versions
- Affected variant review: Identify any previously reported variants whose clinical classification changes due to the database update
- Re-run subset: Process ≥10 validation samples through the updated pipeline and compare annotation output
- Clinically significant changes: Flag any variant reclassification that could affect patient management; document the review
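The differential analysis and affected-variant review described above can be sketched as a comparison of two lookup tables. This assumes each database release has been reduced to a mapping from a variant identifier to its classification; the identifiers (`var_A` and so on) and function names are hypothetical.

```python
def diff_database(old: dict, new: dict):
    """Return (added, removed, reclassified) between two database releases."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    reclassified = {
        key: (old[key], new[key])
        for key in sorted(old.keys() & new.keys())
        if old[key] != new[key]
    }
    return added, removed, reclassified


def affected_reports(reclassified: dict, reported_variants: set) -> list:
    """List previously reported variants whose classification changed."""
    return sorted(v for v in reported_variants if v in reclassified)


# Hypothetical extracts of two releases.
old_release = {"var_A": "pathogenic", "var_B": "benign", "var_C": "VUS"}
new_release = {"var_A": "VUS", "var_B": "benign", "var_D": "likely pathogenic"}

added, removed, reclassified = diff_database(old_release, new_release)
to_review = affected_reports(reclassified, {"var_A", "var_B"})
print(added, removed, reclassified, to_review)
```

The `to_review` list is the input to the clinical review: each entry needs a documented decision on whether patient management could be affected.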
Performance Metrics Table Template
| Variant Type | VAF Range | Sensitivity (New) | Sensitivity (Old) | Δ Sensitivity | Specificity (New) | Specificity (Old) | Δ Specificity | Pass/Fail |
|---|---|---|---|---|---|---|---|---|
| SNV | >20% | 99.8% | 99.6% | +0.2% | 99.9% | 99.9% | 0% | Pass |
| SNV | 5–20% | 98.5% | 97.8% | +0.7% | 99.7% | 99.6% | +0.1% | Pass |
| SNV | 1–5% | 92.1% | 89.4% | +2.7% | 98.2% | 97.8% | +0.4% | Pass |
| Indel (1–5 bp) | >20% | 99.1% | 98.4% | +0.7% | 99.5% | 99.3% | +0.2% | Pass |
| Indel (1–5 bp) | 5–20% | 95.3% | 92.1% | +3.2% | 98.8% | 98.1% | +0.7% | Pass |
| SV | All | 94.7% | 94.2% | +0.5% | 97.3% | 97.1% | +0.2% | Pass |
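Each row of the template can be derived from confusion-matrix counts for one stratum. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), and reports the deltas in percentage points. The counts and the acceptance minimums (`min_sensitivity`, `min_specificity`) are hypothetical; real acceptance criteria belong in the approved validation protocol.

```python
def sensitivity(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    return tn / (tn + fp)


def compare_stratum(new: dict, old: dict, min_sensitivity: float, min_specificity: float) -> dict:
    """Build one row of the metrics table from TP/FN/TN/FP counts."""
    sens_new, sens_old = sensitivity(new["tp"], new["fn"]), sensitivity(old["tp"], old["fn"])
    spec_new, spec_old = specificity(new["tn"], new["fp"]), specificity(old["tn"], old["fp"])
    return {
        "sensitivity_new": sens_new,
        "sensitivity_old": sens_old,
        "delta_sensitivity_pp": round((sens_new - sens_old) * 100, 1),
        "specificity_new": spec_new,
        "specificity_old": spec_old,
        "delta_specificity_pp": round((spec_new - spec_old) * 100, 1),
        # Pass rule is an assumption: the new pipeline must meet both minimums.
        "passed": sens_new >= min_sensitivity and spec_new >= min_specificity,
    }


# Hypothetical counts for one stratum (for example, SNVs at 5-20% VAF).
new_counts = {"tp": 985, "fn": 15, "tn": 997, "fp": 3}
old_counts = {"tp": 978, "fn": 22, "tn": 996, "fp": 4}
row = compare_stratum(new_counts, old_counts, min_sensitivity=0.95, min_specificity=0.99)
print(row)
```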
Wet-Lab / Software Interface Validation
A pipeline change that affects how raw sequencing data is processed (alignment, base quality recalibration, duplicate marking) must verify the interface with the wet-lab process. This is where many submissions fail.
Interface Checkpoint Table
| Interface Point | What to Verify | Acceptance Criterion | Method | Record |
|---|---|---|---|---|
| FASTQ input | Pipeline accepts output from sequencer without error | 0 ingestion failures on 30 consecutive runs | Automated log review | DevOps report |
| Sample barcode demultiplexing | Index assignment matches sample sheet | 100% concordance with expected assignments | Comparison of demux output vs. sample sheet | QC report |
| Alignment quality | % mapped reads, % properly paired, mean coverage | Meets specifications established in original validation | Qualimap/Samtools stats | QC metrics file |
| Base quality scores | Quality score distribution comparable to baseline | ≥85% of bases at Q30 or above; no distribution shift >2 SD from baseline | FastQC + custom script | QC report |
| Contamination detection | Cross-sample contamination detection still functional | Verify with known mixture samples (1%, 3%, 5%) | VerifyBamID or equivalent | QC report |
| Output formatting | VCF/BAM output format unchanged or improved | Compatible with downstream reporting system | Automated schema validation | Integration test log |
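Two of the checkpoints in the table, demultiplexing concordance and base quality, reduce to simple comparisons once the data has been extracted. The following sketch assumes the sample sheet and demultiplexing output are available as index-to-sample mappings and that per-base Phred scores are available as a list; the function names, index sequences, and sample IDs are hypothetical.

```python
def demux_mismatches(sample_sheet: dict, demux_output: dict) -> dict:
    """Return {index: (expected_sample, observed_sample)} for every disagreement.

    An index missing from the demux output is reported with observed = None.
    """
    return {
        index: (expected, demux_output.get(index))
        for index, expected in sample_sheet.items()
        if demux_output.get(index) != expected
    }


def q30_fraction(phred_scores: list) -> float:
    """Fraction of bases with a Phred quality score of 30 or higher."""
    return sum(q >= 30 for q in phred_scores) / len(phred_scores)


# Hypothetical inputs.
sample_sheet = {"ATCACG": "S1", "CGATGT": "S2"}
demux_output = {"ATCACG": "S1", "CGATGT": "S3"}
mismatches = demux_mismatches(sample_sheet, demux_output)

scores = [36, 36, 32, 30, 25, 12, 38, 40, 31, 35]
fraction = q30_fraction(scores)
meets_q30 = fraction >= 0.85  # 85% threshold from the checkpoint table

print(mismatches, fraction, meets_q30)
```

The 100% demultiplexing criterion is met only when `mismatches` is empty.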
Regulatory Submission Impact Assessment
Not every pipeline change requires a new regulatory submission, but every change must be assessed for submission impact.
Submission Impact Decision Matrix
| Change Type | 510(k) Impact | IVDR Impact | Documentation Required |
|---|---|---|---|
| Variant caller version update (same algorithm family) | Document in annual report | Update technical file; notify NB per IVDR Article 80 if significant | Change-control record + validation report |
| Variant caller replacement (different algorithm) | New 510(k) or Special 510(k) likely required | Update technical file; NB assessment required | Full revalidation + submission |
| Reference database update (routine, e.g., quarterly ClinVar) | Document in annual report | Update technical file as part of periodic review | Change-control record + differential analysis |
| Reference genome build change (e.g., GRCh37 → GRCh38) | Likely new 510(k) | Update technical file; NB assessment required | Full revalidation + submission |
| Reporting threshold change (e.g., VAF cutoff 5% → 2%) | New 510(k) required | Update technical file; NB assessment required | Sensitivity/specificity at new threshold + submission |
| Bug fix with no clinical impact | Document in change log | Update technical file change log | Change-control record + regression test |
| Infrastructure change (OS, container) | Document in change log | Document in QMS records | Change-control record + regression test |
Common Failure Modes and How to Remediate
Failure Mode 1: Insufficient Revalidation Scope
What happens: A variant caller is updated from v4.4 to v4.5, but the revalidation only checks a handful of easy samples. During an audit, the NB requests the validation dataset and finds it does not cover the full range of variant types claimed in the intended use.
How to remediate: Establish a fixed validation dataset that covers every variant type in the intended use statement. Store this as a golden dataset in the DHF. For every high-impact change, the validation protocol must demonstrate performance across all claimed variant types, VAF ranges, and sample types (FFPE, blood, saliva, etc.).
Failure Mode 2: Reference Database Drift Without Documentation
What happens: ClinVar releases are loaded into the pipeline on an ad hoc basis without a formal change-control record. When a patient variant is reclassified from "pathogenic" to "VUS" between releases, there is no traceability.
How to remediate: Treat every reference database update as a medium-impact change. Document the exact release version, the date loaded, the differential analysis, and any variant reclassifications. Store this in the change-control file linked to the annotation component in the inventory.
Failure Mode 3: Uncontrolled Open-Source Dependencies
What happens: A developer updates a Python library dependency (e.g., numpy, pandas) as part of routine maintenance. The update subtly changes floating-point handling in a variant quality score calculation. The change is not caught because there is no regression test that covers this pathway.
How to remediate: Pin every dependency, including transitive ones, to an exact version in a lock file (poetry.lock, a fully pinned requirements.txt, or equivalent). Treat any dependency version change as at minimum a low-impact change requiring regression testing. Include edge-case samples in the regression set that exercise the computational pathways most sensitive to numerical precision.
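Pinning can be checked mechanically before a build. This is a minimal sketch that flags any line in a requirements-style file that is not pinned with `==`. It assumes simple `name==version` lines and does not attempt to cover the full requirements file grammar (URLs, environment markers, and similar); `unpinned_requirements` is a hypothetical helper, and the package versions shown are illustrative only.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==version".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^\s;]+")


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


example = """\
numpy==1.26.4
pandas>=2.0
pysam
# a comment line
scipy==1.11.4  # pinned
"""
flagged = unpinned_requirements(example)
print(flagged)
```

A non-empty result would block the build until the dependency is pinned and the change is recorded.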
Failure Mode 4: Failure to Assess Cybersecurity Impact
What happens: A pipeline update introduces a new open-source dependency with a known CVE. The change-control process does not include a cybersecurity review step, and the vulnerability persists in the production pipeline.
How to remediate: Include a cybersecurity checkpoint in every change-control record. Run a dependency vulnerability scan (e.g., pip-audit, safety, or Snyk) on the updated pipeline before release. Document the scan results in the change-control file. For the broader cybersecurity framework, see SBOM Software Bill of Materials for Medical Devices.
Failure Mode 5: Wet-Lab / Bioinformatics Disconnect After Change
What happens: A pipeline update changes the expected BAM header format, but the wet-lab team is not notified. Sequencing runs start generating unexpected QC failures that are misattributed to sample quality rather than a pipeline issue.
How to remediate: Include a wet-lab interface verification step (see Interface Checkpoint Table above) in every change-control record for high- and medium-impact changes. Notify the wet-lab team before deploying pipeline updates to production.
Postmarket Update Records
Under FDA's QMSR (21 CFR Part 820, aligned with ISO 13485:2016, effective February 2026) and IVDR Article 80, pipeline changes that affect device performance must be documented and, where applicable, reported.
Postmarket Record Checklist
| Record | Where to File | Trigger | Retention |
|---|---|---|---|
| Change-control record | DHF + QMS | Every pipeline change | Device lifetime + 2 years (minimum) |
| Validation report | DHF | Every high/medium-impact change | Device lifetime + 2 years |
| SBOM update | Technical file + SBOM archive | Every software dependency change | Device lifetime + 2 years |
| Annual report entry (FDA) | Annual report | Every change affecting performance or labeling | Per FDA records policy |
| NB notification (IVDR) | Technical file update + NB communication | Significant changes per IVDR Art. 80 | Per IVDR requirements |
| Complaint file cross-reference | Complaint handling system | Any complaint potentially linked to a pipeline change | Per complaint handling procedure |
| CAPA cross-reference | CAPA system | If change was initiated by CAPA | Per CAPA procedure |
Source-to-Evidence Traceability Table
| Claim in Technical Documentation | Supporting Record | Location |
|---|---|---|
| "Variant calling accuracy is maintained after pipeline updates" | Concordance analysis for each change-control record | DHF → SVR → CC-NGS-XXXX |
| "All software components are under configuration management" | Component inventory table + Git history | DHF → SRS §4.3 + CM records |
| "Reference database versions are tracked and controlled" | Database version log + differential analysis per update | DHF → SRS §4.5 + CC-NGS-XXXX |
| "Revalidation covers all claimed variant types" | Validation protocol + dataset composition table | DHF → VAL-NGS-XXXX |
| "Cybersecurity vulnerabilities are assessed for each change" | Dependency scan results per change-control record | DHF → Cybersecurity file + CC-NGS-XXXX |
| "Wet-lab interface is verified after each significant change" | Interface checkpoint table results | DHF → SVR §6 + CC-NGS-XXXX |
Pre-Deployment Checklist
Use this checklist before deploying any pipeline change to production:
- Change-control record initiated with complete data fields
- Change classified using the decision tree (high/medium/low/minimal)
- Component inventory table updated
- Risk assessment completed per ISO 14971
- Validation protocol approved by QA
- Validation dataset covers all claimed variant types and VAF ranges
- Concordance analysis completed (for high/medium changes)
- Regression test passed (all change classifications)
- Wet-lab interface verification completed (for high/medium changes)
- Cybersecurity dependency scan completed and documented
- SBOM updated
- All DHF documents updated (SRS, SDD, SVR, RA as applicable)
- Regulatory affairs review completed (submission impact assessed)
- QA approval obtained
- Post-release monitoring plan defined with trigger thresholds
- Wet-lab team notified (for changes affecting interface)
- Release notes prepared
- Configuration management records updated (Git tag, Docker image tag)
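Part of the checklist can be enforced as a deployment gate. The sketch below models only the five items whose applicability is stated or unambiguous in the list above, and treats the cybersecurity scan and SBOM update as applying to every classification, which is an assumption. The item keys and the `outstanding_items` function are hypothetical.

```python
ALL = {"high", "medium", "low", "minimal"}
HIGH_MEDIUM = {"high", "medium"}

# Subset of the checklist: item -> classifications it applies to.
CHECKLIST = {
    "concordance_analysis": HIGH_MEDIUM,
    "regression_test": ALL,
    "wet_lab_interface_verification": HIGH_MEDIUM,
    "cybersecurity_scan": ALL,
    "sbom_updated": ALL,
}


def outstanding_items(classification: str, completed: set) -> list:
    """Return applicable checklist items that have not been completed yet."""
    return [
        item
        for item, applies_to in CHECKLIST.items()
        if classification in applies_to and item not in completed
    ]


print(outstanding_items("low", {"regression_test"}))
print(outstanding_items("high", set()))
```

Deployment would proceed only when the returned list is empty and the remaining manual items on the checklist have been signed off.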
Key Regulatory Sources
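- FDA. "Considerations for Design, Development, and Analytical Validation of Next Generation Sequencing (NGS)-Based In Vitro Diagnostics (IVDs) Intended to Aid in the Diagnosis of Suspected Germline Diseases." Final guidance, 2018.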
- FDA. "Use of Public Human Genetic Variant Databases to Support Clinical Validity for Next Generation Sequencing (NGS)-Based In Vitro Diagnostics." Draft guidance, 2016.
- IEC 62304:2006+Amd1:2015 (Edition 2 draft expected August 2026). Medical device software — Software life cycle processes.
- EU IVDR Regulation (EU) 2017/746. Annexes I, II, III.
- AMP/CAP Joint Consensus Recommendation. "Guidelines for Validation of Next-Generation Sequencing–Based Oncology Panels." JMD, 2017.
- ACMG Technical Standard. "Next-generation sequencing for constitutional variants in the clinical laboratory, 2021 revision." Genetics in Medicine, 2021.
- NIST Genome in a Bottle Consortium. Reference materials and benchmarking tools for NGS performance assessment.