CCRR: Complex Chromosomal Rearrangements Resolver

0. Introduction

Complex Chromosomal Rearrangements Resolver (CCRR) is a self-contained toolkit that turns whole-genome sequencing data into an annotated catalog of complex tumor rearrangements. Packaged in a single Docker image, it installs all dependencies automatically, including six SV callers, five CNV callers, tools for purity-ploidy estimation, and a panel of complex event detectors (ShatterSeek, CTLPScanner, SeismicAmplification, AmpliconArchitect, Starfish, and gGnome via JaBba). A single command runs the full pipeline on tumor/normal BAM files, merges the results into high-confidence consensus SVs and copy number states, infers purity and ploidy, applies the complex event detection tools, and generates publication-ready Circos and track plots.

For users who already have SV and CNV results, a companion web server (https://www.ccrr.life/) provides an interactive interface that supports one-click execution of the same event detection suite. It accepts standard VCF and segment files as input, and also allows custom JSON-formatted SV/CNV data for flexible analysis and visualization. Users receive an interactive dashboard and downloadable result summaries and figures.

The source code is freely available at https://github.com/laslk/CCRR, and the workflow runs seamlessly on any Linux or Windows host with Docker support.

1. Run the Full Pipeline Locally

1.1 Installation
Download
wget -O ccrr1.2.zip https://www.ccrr.life/download_file/ccrr1.2.zip
unzip -q ccrr1.2.zip
Prepare for installation using install.py
python install.py -sequenza -manta -delly -svaba -gridss \
                    -lumpy -soreca -purple -sclust -cnvkit \
                    -ref 'hg19&hg38'

This script will automatically download the dependency data for the tools you selected and build the Dockerfile.

  -sequenza             use sequenza for cn, cellularity and ploidy

  -manta                use manta for sv
  -delly                use delly for sv and cn
  -svaba                use svaba for sv
  -gridss               use gridss for sv
  -lumpy                use lumpy for sv
  -soreca               use soreca for sv
  -purple               use purple for cn
  -sclust               use sclust for cn
  -cnvkit               use cnvkit for cn

  -ref                  hg19 or 'hg19&hg38'

You should select at least one SV tool and one CN tool. For the fastest run, you can use only Delly to obtain both SV and CN.

Obtain licenses for Mosek and Gurobi

Gurobi: Apply for a WLS Compute Server license and store it in the same directory as the Dockerfile, named gurobi.lic. For more information, visit www.gurobi.com
Mosek: Obtain a Mosek license and store it in the same directory as the Dockerfile, named mosek.lic. For more information, visit www.mosek.com

Build Docker image
docker build --pull --rm --build-arg GITHUB_PAT=[GITHUB_PAT] \
        --build-arg SCRIPT_DIR="/home/0.script" --build-arg TOOL_DIR="/home/1.tools" \
        --build-arg DATABASE_DIR="/home/2.share" --build-arg WORK_DIR="/home/3.wd" \
        -f Dockerfile -t ccrr:v1.2 .

To ensure the installation proceeds correctly, you need to provide a GitHub personal access token via GITHUB_PAT.
You can specify these four parameters: SCRIPT_DIR for the script directory (default /home/0.script); TOOL_DIR for the tool directory (default /home/1.tools); DATABASE_DIR for the database and mounted shared directory (default /home/2.share); and WORK_DIR for the work directory (default /home/3.wd).

Run a container
docker run -v $(pwd)/share:/home/2.share -v $(pwd)/wd:/home/3.wd -d -it --name ccrr ccrr:v1.2
docker exec -it ccrr /bin/bash

This command mounts the current directory's share and wd folders to DATABASE_DIR and WORK_DIR inside the container, creating shared paths between the host and the container.

1.2 Testing and Quick Start
Testing
ccrr -mode test

This tests the environment required by the pipeline using a small bundled sample dataset; it may take up to half an hour.

Quick Start
nohup ccrr -mode default \
        -normal [normal.bam] --normal-id [normal-id] \
        -tumor [tumor.bam] --tumor-id [tumor-id] \
        --genome-version hg38 -reference [hg38.fa] \
        -threads 30 -g 200 >log 2>&1 &

This runs the entire pipeline in default mode, allowing up to 30 threads for multi-threaded tasks and a memory cap of 200GB. Processing a matched tumor/normal pair of BAM files, each around 106GB, takes approximately 80 hours.

1.3 Usage

Display Help
ccrr --help
Required Parameters
Mode
  -mode {fast,custom,default,test,clear}        choose mode to run
Input and Information

Your input should be a matched pair of tumor/normal whole-genome sequencing BAM files and their reference genome. Supported reference genome versions are hg19 and hg38.

  -prefix                       task id  

  -normal NORMAL                normal bam
  --normal-id NORMAL_ID         Identifier for the normal sample, typically from the BAM header  

  -tumor TUMOR                  tumor bam
  --tumor-id TUMOR_ID           Identifier for the tumor sample, typically from the BAM header

  --genome-version {hg19,hg38}  Set the reference, hg19 or hg38
  -reference REFERENCE          reference FASTA
Optional Parameters
Configuring Multithreading and Available Memory

If not set, the default memory allocation is 8GB, which may not suffice for the memory demands of certain steps; we recommend setting it higher.
Please note that the default number of threads is 8. Some tools do not support multithreaded acceleration, and others have a soft cap on the number of threads they can use effectively, so setting a higher thread count may not yield the expected speed-up.

  -threads THREADS      Set the number of processes if possible
  -g G                  Set the amount of available RAM if possible
Selecting Required Tools

In custom mode, you can freely choose which software to use for generating SV and CN data. The built-in software includes:

  -sequenza             use sequenza for cellularity, ploidy and cn

  -delly                use delly for sv and cn 
  -manta                use manta for sv
  -svaba                use svaba for sv
  -gridss               use gridss for sv
  -lumpy                use lumpy for sv
  -soreca               use soreca for sv

  -sclust               use sclust for cn
  -purple               use purple for cn
  -cnvkit               use cnvkit for cn
Setting Quality Filtering

You can conveniently filter the results of each software based on quality before merging:

  --manta-filter MANTA_FILTER               Filter for manta
  --delly-filter DELLY_FILTER               Filter for delly sv
  --delly-cnvsize DELLY_CNVSIZE             min cnv size for delly
  --svaba-filter SVABA_FILTER               Filter for svaba
  --gridss-filter GRIDSS_FILTER             Filter for gridss
  --lumpy-filter LUMPY_FILTER               Filter for lumpy
Merging Method

When results from two different SV callers are adjacent in the genome and the distance between them is less than a specified threshold, they will be considered the same SV. The default threshold is 150bp.

  --sv-threshold SV_THRESHOLD

Select the method for merging results from different SV callers. If you wish to retain only the results supported by all the SV callers used, choose intersection; if you prefer to keep all results from any SV caller without duplicates, choose union; if you want to retain results supported by X or more tools, select x-or-more and specify the number in --sv-x. If X is not specified, the default is 3, meaning that results supported by three or more tools will be retained.

  --sv-merge-method {intersection,union,x-or-more}
                        Choose a sv merging method: 
                        1. 'intersection': Merges only the SVs that are identified by all SV callers. 
                        2. 'union': Merges all SVs identified by any of the SV callers. 
                        3. 'x-or-more': Merges SVs that are identified by at least x SV callers. If only one SV caller is used, this parameter is irrelevant.
  --sv-x {1,2,3,4,5,6}  
                        Specify the x. This argument is required when '--sv-merge-method' is set to 'x-or-more'. Must not exceed the number of SV callers provided. default=3
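As a sketch of how these merging policies behave (a simplified illustration, not CCRR's actual implementation), consider each consensus SV labeled with the set of callers that support it:

```python
def merge_decision(supporting_callers, n_callers, method, x=3):
    """Decide whether a consensus SV is kept, given the set of callers
    that reported it (simplified illustration of the three policies)."""
    k = len(supporting_callers)
    if method == "intersection":   # supported by every caller used
        return k == n_callers
    if method == "union":          # supported by at least one caller
        return k >= 1
    if method == "x-or-more":      # supported by at least x callers
        return k >= x
    raise ValueError(f"unknown merge method: {method}")

# With 6 callers and the default x=3:
print(merge_decision({"manta", "delly", "gridss"}, 6, "x-or-more"))  # True
print(merge_decision({"lumpy"}, 6, "intersection"))                  # False
print(merge_decision({"lumpy"}, 6, "union"))                         # True
```

Pairing SVs between callers additionally requires their breakpoints to lie within the --sv-threshold distance (150bp by default), which this sketch does not model.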

If you wish to prioritize a specific SV caller, setting --sv-primary-caller will retain all of its results.

  --sv-primary-caller {manta,delly,svaba,gridss,lumpy,soreca}
                        Specify the primary SV caller to keep all of its results.

Setting --cn-threshold adjusts the maximum allowable distance for determining overlap among copy-number change regions from different tools when merging copy number results. The default threshold is 5000bp.

  --cn-threshold CN_THRESHOLD
                        threshold for determining cn, defaults to 5000bp
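The overlap test can be pictured as follows (an illustrative sketch with hypothetical coordinates, not CCRR's actual code):

```python
def segments_overlap(seg_a, seg_b, threshold=5000):
    """Treat two copy-number segments (chrom, start, end) as the same
    region if they share a chromosome and the gap between them is at
    most `threshold` bp (simplified illustration)."""
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return False
    # Negative gap means the segments physically overlap.
    gap = max(start_a, start_b) - min(end_a, end_b)
    return gap <= threshold

print(segments_overlap(("chr1", 100000, 200000), ("chr1", 203000, 300000)))  # True (gap 3000)
print(segments_overlap(("chr1", 100000, 200000), ("chr1", 210000, 300000)))  # False (gap 10000)
```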
Complex Rearrangement Analysis
Complex rearrangement analysis is conducted by default. If you only wish to obtain merged results, you can use -complex False.
You can also specify the tool used for cellularity and ploidy estimation with --cellularity-ploidy-tool, choosing between sequenza (default) and purple. This setting influences tools like JaBba and gGnome.

  -complex COMPLEX      complex rearrangement analysis
  --cellularity-ploidy-tool {sequenza,purple}

Output, Rerunning, and History

${WORK_DIR}/[task id] will serve as the working directory, retaining the output results of each part. A summary of the complex rearrangement analysis can be found in ${WORK_DIR}/[task id]/complex/summary.
Once a module is completed, it will be recorded in the ${WORK_DIR}/[task id]/history file. If the process is unexpectedly interrupted, rerunning the entire process will skip the parts that have been successfully executed according to the records in the history file, resuming from the point of interruption.
Of course, you can manually modify this file to skip any steps you wish to bypass.
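The resume behavior can be sketched like this (a hypothetical illustration; the actual history file format may differ):

```python
def modules_to_run(all_modules, history_lines):
    """Return the pipeline modules that still need to run, skipping any
    recorded as completed in the history file (hypothetical sketch of
    CCRR's resume logic)."""
    done = {line.strip() for line in history_lines if line.strip()}
    return [m for m in all_modules if m not in done]

pipeline = ["sv_calling", "cn_calling", "svmerge", "cnmerge", "complex"]
# Suppose the run was interrupted after cnmerge finished:
history = ["sv_calling\n", "cn_calling\n", "svmerge\n", "cnmerge\n"]
print(modules_to_run(pipeline, history))  # ['complex']
```

Deleting a line from the history file would likewise force that module to rerun.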

1.4 Custom Execution

You can run each module step by step according to your analytical needs. For example:

Using svmerge.py to Merge SV Data

You can specify the output results from each SV caller as input files for merging.

  -manta MANTA          manta vcf result
  -delly DELLY          delly vcf result
  -svaba SVABA          svaba vcf result
  -gridss GRIDSS        GRIDSS vcf result
  -lumpy LUMPY          LUMPY vcf result
  -soreca SORECA        soreca result

Filter the output results of each SV caller based on quality.

  --manta-filter MANTA_FILTER
                        Filter threshold for manta
  --delly-filter DELLY_FILTER
                        Filter threshold for delly
  --svaba-filter SVABA_FILTER
                        Filter threshold for svaba
  --gridss-filter GRIDSS_FILTER
                        Filter threshold for GRIDSS
  --lumpy-filter LUMPY_FILTER
                        Filter threshold for LUMPY

Determine thresholds, merging methods, and specify a trusted SV caller as described previously.

  --threshold THRESHOLD
                        threshold for determination, defaults to 150bp

  --merge-method {intersection,union,x-or-more}
                        Choose a merging method: 1. 'intersection': Merges only the SVs that are identified by all SV callers. 
                        2. 'union': Merges all SVs identified by any of the SV callers. 
                        3. 'x-or-more': Merges SVs that are identified by at least x SV callers. If only one SV caller is used, this parameter is irrelevant.

  --primary-caller {None,manta,delly,svaba,gridss,lumpy,soreca}
                        Specify the primary SV caller to keep all of its result.

  -x {1,2,3,4,5,6}      Specify the x. This argument is required when '--merge-method' is set to 'x-or-more'.
                        Must not exceed the number of input files provided.

Set the output path and enable multi-process execution.

  -o O                  output path
  -t T                  Set the number of processes
Use consensus_cn.py to merge CN data.
python ${SCRIPT_DIR}/consensus_cn.py  \
    -sclust SCLUST -delly DELLY -purple PURPLE -cnvkit CNVKIT \
    -ref hg19 -gender male \
    -o OUT 

Parameters

  -sclust SCLUST        sclust cn result
  -delly DELLY          delly cn result
  -purple PURPLE        purple cn result
  -cnvkit CNVKIT        cnvkit cn result
  -sequenza SEQUENZA    sequenza cn result

  --threshold THRESHOLD
                        threshold for determination, defaults to 5000bp

  -o O                  output path
  -ref REF              hg19 or hg38
Use complex.py to analyze complex rearrangements.
python ${SCRIPT_DIR}/complex.py  -prefix task_id \
        --tumor-id example -sv SV -cn CN \
        --genome-version hg19 -gender male \
        -shatterseek -starfish -gGnome -SA  -ctlpscanner \
        -threads 30 -g 200

Required inputs; the SV and CN files should follow the format of the example files:

https://www.ccrr.life/static/examplefile/custom_sv.bed
https://www.ccrr.life/static/examplefile/custom_cn.bed
  -prefix               task id
  --tumor-id TUMOR_ID
  -sv SV                sv input
  -cn CN                cn input
  --genome-version GENOME_VERSION
                        Set the reference, hg19 or hg38

Select the tools for complex rearrangement analysis.

  -shatterseek          use shatterseek
  -starfish             use starfish
  -gGnome               use jabba and gGnome
  -SA                   use Seismic Amplification
  -ctlpscanner          use CTLPscanner

AmpliconArchitect requires BAM files as input.

  -AA                           use Amplicon Architect

  -normal NORMAL                normal bam
  --normal-id NORMAL_ID

  -tumor TUMOR                  tumor bam
  --tumor-id TUMOR_ID

Set the available memory and number of threads.

  -threads THREADS      Set the number of processes if possible
  -g G                  Set the amount of available RAM if possible
1.5 Output
Summary: {WORK_DIR}/{PREFIX}/complex/summary.png

A visual summary of CN, SV integration, and analysis results of various complex rearrangements generated by the CCRR workflow.
The tracks, from outer to inner, display:

  1. Chromosomes:
    Shows the start and end points of chromosomal regions and the centromeres.

  2. CN:
    Regional colors indicate copy number gains (red) or losses (green);
    a black solid line represents a smoothed curve showing actual copy numbers,
    with a straight black line representing the default normal copy number state (CN=2).

  3. Shatterseek:
    Highlights chromosomal shatter regions with high confidence (orange) and low confidence (yellow)
    (criteria do not include statistical validation).

  4. CTLPScanner:
    Marks Chromothripsis-like Pattern areas, with region colors representing the log likelihood ratio (lg(LR) ≥ 5).

  5. Seismic Amplification:
    Indicates seismic amplification event areas (green).

  6. Starfish:
    Highlights complex genomic rearrangement areas (cyan).

  7. gGnome:
    Shows various complex event areas (details available in gGnome results).

  8. AmpliconArchitect:
    Marks ecDNA (blue), linear amplification (green), and BFB (yellow) areas
    (not available on the web).

  9. SV:
    Indicates different types of structural variation:

    • deletions (DEL, blue)
    • inversions (INV, green)
    • duplications (DUP, red)
    • translocations (TRA, brown)
CN merge:

Merge result: {WORK_DIR}/{PREFIX}/cnmerge/consensus_cn.bed
Segment count plot: {WORK_DIR}/{PREFIX}/cnmerge/segment_count.pdf

A bar plot showing the count of copy number segments across different length intervals.
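The counting behind such a plot can be sketched as follows (the bin edges here are hypothetical, not necessarily those used by CCRR):

```python
from collections import Counter

def bin_segment_lengths(segments, edges=(10_000, 100_000, 1_000_000)):
    """Count copy-number segments per length interval.
    `segments` is a list of (start, end) pairs; `edges` are
    illustrative bin boundaries in bp."""
    labels = ["<10kb", "10kb-100kb", "100kb-1Mb", ">=1Mb"]
    counts = Counter()
    for start, end in segments:
        length = end - start
        # Index = number of edges the length meets or exceeds.
        idx = sum(length >= e for e in edges)
        counts[labels[idx]] += 1
    return counts

segs = [(0, 5_000), (0, 50_000), (0, 500_000), (0, 2_000_000)]
print(bin_segment_lengths(segs))  # one segment in each bin
```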

Bias and volatility plot: {WORK_DIR}/{PREFIX}/cnmerge/Bias_and_volatility_for_CN_all_ranges.pdf

This figure shows the distribution of bias and volatility for each tool across different region lengths.
Bias reflects systematic deviation from the consensus copy number, while volatility captures the magnitude of variation.
Both are length-weighted and log-scaled to allow fair comparison across tools.
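One plausible reading of length-weighted bias and volatility, sketched with made-up numbers (the exact formulas used by CCRR are not documented here, so treat these definitions as assumptions):

```python
import math

def length_weighted_bias(segments):
    """Length-weighted mean deviation of a tool's copy number from the
    consensus. Each segment is (length_bp, tool_cn, consensus_cn).
    Hypothetical formula for illustration only."""
    total = sum(length for length, _, _ in segments)
    return sum(length * (cn - ref) for length, cn, ref in segments) / total

def length_weighted_volatility(segments):
    """Length-weighted standard deviation of those deviations
    (again a hypothetical definition)."""
    total = sum(length for length, _, _ in segments)
    mean = length_weighted_bias(segments)
    var = sum(length * ((cn - ref) - mean) ** 2
              for length, cn, ref in segments) / total
    return math.sqrt(var)

segs = [(100_000, 2.2, 2.0), (50_000, 1.8, 2.0), (50_000, 3.0, 2.0)]
print(round(length_weighted_bias(segs), 3))  # 0.3
```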

SV Merge

Merge result: {WORK_DIR}/{PREFIX}/svmerge/sv_merge.bed
SV caller consensus (Upset plot): {WORK_DIR}/{PREFIX}/svmerge/sv_merged.pdf

An Upset plot illustrating the overlap and consensus of structural variant calls among different SV tools.

2. CCRR Web Services

2.1 Start

The web services support multiple input options. You may begin with any of the following:

2.2 SV Input

By clicking on "From tools," you can upload results from various structural variant analysis tools.
You have the option to directly input the corresponding files, with examples available for review by clicking on the respective 'example' links. These include:

Delly: delly_example.sv.somatic.pre.vcf
Manta: manta_example.somaticSV.vcf
Gridss: gridss_example.gripss.filtered.vcf (processed with GRIPSS)
Lumpy: lumpy_example.gt.vcf
SvABA: svaba_example.somatic.sv.vcf
Soreca: soreca_example_unsnarl.txt
These sample files, derived from the public dataset SRR2020636, serve only as format references and hold no analytical significance.

Upload Options:
You can upload results from one to six different structural variant analysis tools. If only one file is uploaded, we will convert its format and proceed with the complex structural variant analysis. If two or more files are uploaded, they will first be merged.
Custom Data:
If you wish to use your own structural variant data, you can click on "From Custom" to upload your customized data.

The format for custom structural variant data should be as follows:

Formatting Requirements:

The example data available via the Example link is sourced from the PCAWG consensus public structural variant data (source link), specifically from the dataset 0c0038ff-6cc4-b0b0-e050-11ac0d483d73, which can be used for demonstration analyses. You can click the "Load Example" button to load the sample file.

2.3 CN Input

By clicking on "From tools", you can upload copy number analysis results from various tools.

You have the option to directly input the corresponding files, with examples available for review by clicking on the respective "example" links. These include:

These sample files, derived from the public dataset SRR2020636, serve only as format references and hold no analytical significance.

Upload Options:
You can upload results from one to four different copy number variant analysis tools.

Custom Data:
If you wish to use your own copy number data, you can click on "From Custom" to upload your customized data.

The format for custom copy number data should be as follows:

Formatting Requirements:

The example data available via the Example link is sourced from the PCAWG consensus public copy number variant data (source link), specifically from the dataset 0c0038ff-6cc4-b0b0-e050-11ac0d483d73, which can be used for demonstration analyses.

2.4 Parameters

Click on "options" to expand the options card and customize parameters for the analysis. The parameters include:


2.5 Starting the Analysis

Ensure that you have:

Then, click "start". You will see a waiting page indicating that your analysis is either in queue or in progress. Once the analysis is complete, you will be redirected to the results page.

2.6 Interactive Result Visualization

This is an interactive web interface designed for exploring complex genomic rearrangement results through an intuitive Circos-based view.
The result page will automatically load the analysis results based on the files you uploaded.
If you wish to explore results interactively using your own data or view outputs from a local CCRR pipeline run, you can visit https://www.ccrr.life/customize-data, where you are allowed to upload your own .json result file.


Left Panel: Control Panel

The control panel on the left side of the interface provides key functionalities:

Center Panel: Circos Plot

The central Circos plot offers a dynamic visualization of genome-wide CN and SV integration, with multiple inner and outer tracks showing different types of variation and complex events.

Interactive features:

Nearly all elements in the Circos plot are interactive, including:

Right Panel: Detailed Information Popup

When an element in the plot is clicked, a detailed popup appears on the right, showing:

This allows users to inspect specific regions or events in depth and trace their origin or biological relevance.

Customization Options

Users can personalize the visualization via the control panel:

This flexible interface supports efficient exploration of complex SV and CNV landscapes in tumor genomes or other rearrangement-rich datasets.

3. Step-by-Step Example

3.1 Installation and Data Preparation

Create a Working Directory

mkdir ccrr1.2
cd ccrr1.2

Download CCRR

wget -O ccrr1.2.zip https://www.ccrr.life/download_file/ccrr1.2.zip
unzip -q ccrr1.2.zip

Download Dependencies and Create a Dockerfile

python install.py -sequenza -manta -delly -svaba -gridss -lumpy -soreca -purple -sclust -cnvkit -ref 'hg19&hg38'

Prepare licenses for Mosek and Gurobi

cp /path/to/gurobi.lic ./gurobi.lic
cp /path/to/mosek.lic ./mosek.lic

Prepare Input Data

We use data from the breast cancer cell line HCC1395/HCC1395BL, part of a multi-center study (DOI: 10.1186/s13059-022-02816-6). The BAM files were downloaded from: ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG/data/WGS/

Files are placed at:

./share/hcc1395/WGS_FD_T_1.bam       # Tumor
./share/hcc1395/WGS_FD_N_1.bam       # Normal
./share/hcc1395/WGS_FD_T_1.bam.bai   # Index
./share/hcc1395/WGS_FD_N_1.bam.bai   # Index

The corresponding reference genome used for alignment is in

./share/data/ref/GRCh38.d1

Build Docker image

docker build --pull --rm --build-arg GITHUB_PAT=[GITHUB_PAT] \
        --build-arg SCRIPT_DIR="/home/0.script" --build-arg TOOL_DIR="/home/1.tools" \
        --build-arg DATABASE_DIR="/home/2.share" --build-arg WORK_DIR="/home/3.wd" \
        -f Dockerfile -t ccrr:v1.2 .

Run a container

docker run -v $(pwd)/share:/home/2.share -v $(pwd)/wd:/home/3.wd -d -it --name ccrr ccrr:v1.2
docker exec -it ccrr /bin/bash
3.2 Run the Pipeline Locally

Here, we selected all tools except SoReCa for this analysis.

nohup ccrr -mode custom -prefix hcc1395 \
        -normal /home/2.share/hcc1395/WGS_FD_N_1.bam --normal-id WGS_FD_N_1 \
        -tumor /home/2.share/hcc1395/WGS_FD_T_1.bam --tumor-id WGS_FD_T_1 \
        --genome-version hg38 -reference /home/2.share/data/ref/GRCh38.d1/GRCh38.d1.vd1.fa \
        -cnvkit -delly -manta -lumpy -gridss -svaba -purple -sclust \
        --cellularity-ploidy-tool sequenza \
        -threads 30 -g 200 >log 2>&1 &

The analysis completes in about 100 hours.

In /home/3.wd/hcc1395/svmerge, you can find the SV results from individual tools, the merged SV calls sv_merged.bed, and an UpSet plot sv_merged.pdf illustrating the overlaps among different SV datasets.


In /home/3.wd/hcc1395/cnmerge, you will find the CN analysis results from individual tools, as well as the merged consensus result: consensus_cn.bed


segment_count.pdf: counts of CNV segments by length


Bias_and_volatility_for_CN_all_ranges.pdf: shows bias (deviation from consensus) and volatility (variation across tools) across segment sizes.

In /home/3.wd/hcc1395/complex, you will find the results from six complex rearrangement analysis tools, a summary figure summary.png,


and a JSON file hcc1395circos.json for web-based visualization.

3.3 Use the web service to explore the results in detail

To explore the results interactively, go to https://www.ccrr.life/ and click "Customize Data" in the top-right menu to access the custom upload page.


On the Customize Data page, click "Choose a file" to select hcc1395circos.json, then click "Upload & Render". The Circos plot will be rendered after a short loading period.


Clicking on any element in the plot reveals detailed annotations and associated information.


To focus on regions where multiple tools show consensus, enter the coordinates chr3:57825870-130091239;chr6:10307610-122158017 into the "Genome Region" field in the control panel, then click "Add". This will zoom in and display a more detailed and clearer view of the selected regions.
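The region syntax accepted by that field can be parsed as shown below (a small helper written purely for illustration; it is not part of CCRR):

```python
def parse_regions(text):
    """Parse semicolon-separated 'chrom:start-end' region strings,
    e.g. 'chr3:57825870-130091239;chr6:10307610-122158017',
    into (chrom, start, end) tuples."""
    regions = []
    for part in text.split(";"):
        chrom, span = part.split(":")
        start, end = (int(x) for x in span.split("-"))
        regions.append((chrom, start, end))
    return regions

print(parse_regions("chr3:57825870-130091239;chr6:10307610-122158017"))
```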


To save the current visualization, click "Export SVG" to export it as a scalable vector graphic. If needed, click "Reset" to clear custom regions and revert the view to its default state.

3.4 Upload Files for Analysis

We uploaded the locally generated results from Delly, Manta, Gridss, Lumpy, Purple, and CNVkit.
After uploading the files, select the reference genome as hg38, then click "Start" to begin the analysis.


The system will redirect to a waiting page. After approximately 20 minutes, it will automatically jump to the results page.

input