Cross-DC Filtering

Filter data in one collection and automatically update related visualizations—no pre-computed joins needed.

Overview

Links connect Data Collections (DCs) for interactive filtering at runtime. When you filter a metadata table, linked MultiQC plots and other visualizations update automatically.

┌─────────────────┐         ┌─────────────────┐
│  Metadata Table │  link   │  MultiQC Plots  │
│                 │────────▶│                 │
│  [filter here]  │         │  [auto-updates] │
└─────────────────┘         └─────────────────┘

How It Works

  1. Define a link between a source DC (e.g., a metadata table) and a target DC (e.g., MultiQC)
  2. Add a filter component to your dashboard
  3. When users filter the source DC, linked targets automatically show only matching data
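
The three steps above can be sketched in plain Python. This is only an illustration of the idea, not the project's implementation: the in-memory rows and the `apply_link` helper are hypothetical, and the values are matched directly without any resolver.

```python
# Hypothetical in-memory stand-ins for a source DC and a target DC.
source_rows = [
    {"sample_id": "S1", "condition": "control"},
    {"sample_id": "S2", "condition": "treated"},
    {"sample_id": "S3", "condition": "treated"},
]
target_rows = [
    {"sample_id": "S1", "reads": 100},
    {"sample_id": "S2", "reads": 250},
    {"sample_id": "S3", "reads": 175},
]


def apply_link(source_rows, target_rows, predicate, source_column, target_field):
    """Filter the source, then keep only target rows matching the selected values."""
    # Values of the linked column that survive the user's filter.
    selected = {row[source_column] for row in source_rows if predicate(row)}
    # Target rows whose matching field is among the selected values.
    return [row for row in target_rows if row[target_field] in selected]


filtered_target = apply_link(
    source_rows,
    target_rows,
    predicate=lambda row: row["condition"] == "treated",
    source_column="sample_id",
    target_field="sample_id",
)
print(filtered_target)
```

Filtering the source to treated samples leaves only the S2 and S3 rows in the target.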

Configuration

Add links to your project YAML:

links:
  - source_dc_id: sample_metadata
    source_column: sample_id
    target_dc_id: multiqc_fastqc
    target_type: multiqc
    link_config:
      resolver: sample_mapping

Field          Required  Description
source_dc_id   Yes       Data collection containing the filter
source_column  Yes       Column to filter on
target_dc_id   Yes       Data collection to receive filtered values
target_type    Yes       Type of target: table, multiqc, or map (see Supported Target Types)
link_config    Yes       Resolution configuration (see below)

link_config:
  resolver: sample_mapping    # Resolution strategy
  target_field: sample_name   # Field to match in target (optional)
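
As a rough aid when writing links by hand, the required fields listed above can be checked with a few lines of Python. The `missing_link_fields` helper is hypothetical and not part of the tool, and treating `resolver` as mandatory inside `link_config` is an assumption based on only `target_field` being marked optional.

```python
REQUIRED_FIELDS = (
    "source_dc_id",
    "source_column",
    "target_dc_id",
    "target_type",
    "link_config",
)


def missing_link_fields(link: dict) -> list:
    """Return the required fields absent from one `links` entry."""
    missing = [field for field in REQUIRED_FIELDS if field not in link]
    # Assumption: a link_config without a resolver is treated as incomplete.
    if "link_config" in link and "resolver" not in (link["link_config"] or {}):
        missing.append("link_config.resolver")
    return missing


# The link from the configuration example, as it would look once the YAML is parsed.
link = {
    "source_dc_id": "sample_metadata",
    "source_column": "sample_id",
    "target_dc_id": "multiqc_fastqc",
    "target_type": "multiqc",
    "link_config": {"resolver": "sample_mapping"},
}
incomplete = {"source_dc_id": "sample_metadata", "target_type": "table"}

print(missing_link_fields(link))
print(missing_link_fields(incomplete))
```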

Resolvers

Resolvers map source values to target identifiers:

Resolver        Use Case                         Example
direct          Same value in both DCs           sample_id → sample_id
sample_mapping  Canonical ID → MultiQC variants  S1 → [S1_R1, S1_R2]
pattern         Template substitution            {sample}.bam

When to Use Each Resolver

  • direct: Source and target use identical identifiers
  • sample_mapping: MultiQC sample names differ from your canonical IDs (most common for MultiQC)
  • pattern: Target uses a predictable naming convention
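
To make the three strategies concrete, here is a minimal Python sketch of what each one does to a list of selected source values. The function names and the `mapping` dictionary are hypothetical illustrations, not the tool's API, and the behavior for IDs with no mapping entry is an assumption.

```python
def resolve_direct(values):
    """direct: source values are used unchanged as target identifiers."""
    return list(values)


def resolve_sample_mapping(values, mapping):
    """sample_mapping: expand each canonical ID into its MultiQC sample names."""
    resolved = []
    for value in values:
        # Assumption: IDs without a mapping entry resolve to nothing.
        resolved.extend(mapping.get(value, []))
    return resolved


def resolve_pattern(values, template):
    """pattern: substitute each value into a template such as "{sample}.bam"."""
    return [template.format(sample=value) for value in values]


selected = ["S1", "S2"]
# Hypothetical mapping from canonical IDs to MultiQC sample names.
mapping = {"S1": ["S1_R1", "S1_R2"], "S2": ["S2_R1", "S2_R2"]}

print(resolve_direct(selected))
print(resolve_sample_mapping(selected, mapping))
print(resolve_pattern(selected, "{sample}.bam"))
```

In this sketch only `sample_mapping` can return more identifiers than it was given, which is why it suits MultiQC reports where one sample appears under several names.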

Supported Target Types

Type      Filter Action               Status
table     Filters rows with WHERE IN  Available
multiqc   Filters plot samples        Available
map       Filters map markers         Available
jbrowse2  Shows/hides tracks          Planned
images    Filters image gallery       Planned
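
For table targets, the WHERE IN filter amounts to the query pattern below. SQLite is used here only as a convenient stand-in, not as the project's storage backend, and the table contents are invented for illustration.

```python
import sqlite3

# In-memory table standing in for a `gene_expression` target DC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene_expression (sample_id TEXT, gene TEXT, tpm REAL)")
conn.executemany(
    "INSERT INTO gene_expression VALUES (?, ?, ?)",
    [("S1", "TP53", 12.5), ("S2", "TP53", 8.1), ("S3", "TP53", 20.3)],
)

# Values produced by the resolver for the current source filter.
resolved_values = ["S1", "S3"]

# One placeholder per value keeps the query parameterized.
placeholders = ", ".join("?" for _ in resolved_values)
query = (
    "SELECT sample_id, tpm FROM gene_expression "
    f"WHERE sample_id IN ({placeholders}) ORDER BY sample_id"
)
rows = conn.execute(query, resolved_values).fetchall()
print(rows)
```

Only the rows for S1 and S3 come back; the S2 row is excluded because it is not among the resolved values.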

Complete Example

name: "RNA-seq QC Analysis"
project_type: "advanced"

# Define links for cross-DC filtering
links:
  # Link metadata to MultiQC plots
  - source_dc_id: sample_metadata
    source_column: sample_id
    target_dc_id: multiqc_general_stats
    target_type: multiqc
    link_config:
      resolver: sample_mapping

  # Link metadata to expression table
  - source_dc_id: sample_metadata
    source_column: sample_id
    target_dc_id: gene_expression
    target_type: table
    link_config:
      resolver: direct
      target_field: sample_id

workflows:
  - name: "rnaseq_pipeline"
    # ... workflow config ...

    data_collections:
      - data_collection_tag: "sample_metadata"
        config:
          type: "table"
          metatype: "metadata"
          # ... scan config ...

      - data_collection_tag: "multiqc_general_stats"
        config:
          type: "MultiQC"

      - data_collection_tag: "gene_expression"
        config:
          type: "table"
          metatype: "aggregate"
          # ... scan config ...

Dashboard Usage

  1. Create a dashboard with your linked data collections
  2. Add a filter component (dropdown, multi-select) on the source DC
  3. Add visualizations for the target DCs
  4. Filter the source → targets update automatically

Links vs Joins

Feature       Links                      Joins
Execution     Runtime (on filter)        Pre-computed (CLI batch)
Storage       None                       Delta table in S3
Target types  Any (table, MultiQC, ...)  Tables only
Use case      Interactive filtering      Combined datasets

Use links for interactive cross-DC filtering. Use joins when you need a permanently combined dataset.