CNA_plot.Rd
This function creates a genome-wide copy number alteration (CNA) plot from a BigWig depth file and a VCF variant file. It downsamples the coverage data, computes delta values using one of three methods ("fit", "delta", or "loess"), and overlays variant calls as rectangles. A running median trend line is computed on the weighted delta values.
CNA_plot(
  depth_bigwig_file,
  variant_file,
  txdb,
  method = c("fit", "delta", "loess"),
  gene_delta_threshold = 2,
  downsample = 0.1,
  point_size = 0.01,
  line_size = 0.1,
  line_color = "red",
  colors = NULL,
  max_value = NULL,
  min_value = NULL,
  min_variant_distance = 10000,
  samplename = "",
  chr_filter = NULL,
  trend_window = 50,
  apply_weight = TRUE,
  outside_weight = 0.25,
  inside_weight = 1,
  variant_alpha = 0.1,
  trend_regions = "inside",
  exclude_xy = TRUE,
  return_data = FALSE
)
depth_bigwig_file: Character. File path to a BigWig file containing depth/coverage information.
variant_file: Character. File path to a variant call file (VCF) containing structural variant information.
txdb: A transcript database object (e.g., from the GenomicFeatures package) used for gene annotation.
method: Character. One of "fit", "delta", or "loess", determining the method used to compute delta values. Only the first element of the provided vector is used.
gene_delta_threshold: Numeric. Threshold applied to the delta values for gene annotation.
downsample: Numeric. Proportion of coverage data to retain (e.g., 0.01 for 1 percent).
point_size: Numeric. Size of the plotted data points.
line_size: Numeric. Size of the plotted trend line.
line_color: Character. Color of the trend line.
colors: Named vector of colors for chromosomes. If NULL, an alternating black/gray palette is used.
max_value: Numeric. Maximum allowed delta value; values above this are capped.
min_value: Numeric. Minimum allowed delta value; values below this are capped.
min_variant_distance: Numeric. Minimum distance (in bp) for a variant call to be retained.
samplename: Character. A label for the sample that is appended to the plot title.
chr_filter: Character. If specified, only data from the given chromosome are processed.
trend_window: Integer. The number of consecutive data points over which the running median trend line is computed.
apply_weight: Logical. If TRUE, the weight multipliers (inside_weight and outside_weight) are applied to the delta values before the trend line is computed.
outside_weight: Numeric. Multiplier applied to delta values outside CNA calls.
inside_weight: Numeric. Multiplier applied to delta values inside CNA calls.
variant_alpha: Numeric. Transparency level for variant rectangles.
trend_regions: Character. One of "both", "inside", or "outside", controlling where the trend line is plotted. Default is "inside".
exclude_xy: Logical. If TRUE, chromosomes X and Y are excluded from the analysis.
return_data: Logical. If TRUE, returns a list containing the plot, the coverage data, and the variant calls.
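The behavior described for method ("only the first element of the provided vector is used") can be illustrated with a small sketch. The helper pick_method() is hypothetical and is not part of this package; it only shows one common way to get that behavior with base R's match.arg(), and the actual implementation may differ.

```r
# Hypothetical helper, for illustration only: resolve the `method` argument
# to a single validated choice.
pick_method <- function(method = c("fit", "delta", "loess")) {
  choices <- c("fit", "delta", "loess")
  # Take the first element, then validate it against the allowed choices.
  match.arg(method[1], choices)
}

default_method <- pick_method()                      # no argument: first choice
first_of_two   <- pick_method(c("loess", "delta"))   # only "loess" is used
bad_method     <- try(pick_method("bogus"), silent = TRUE)  # invalid: error
```

Under this scheme, leaving method at its default selects "fit", and an unrecognized value fails early instead of silently falling through.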
Either a ggplot2 object representing the CNA plot or a list with additional processed data when return_data is TRUE.
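A minimal usage sketch follows. The file paths are hypothetical placeholders, and the TxDb package (an hg38 annotation from Bioconductor) is an assumption; use whichever annotation matches your reference genome. The calls only run when the inputs are actually available, and the element names of the returned list are not documented here, so the sketch inspects the structure instead of assuming them.

```r
# Hypothetical input paths; replace with real files.
bw_file  <- "sample.depth.bw"
vcf_file <- "sample.sv.vcf.gz"

# Run only if the inputs, the function, and the annotation package exist.
have_inputs <- file.exists(bw_file) && file.exists(vcf_file) &&
  exists("CNA_plot") &&
  requireNamespace("TxDb.Hsapiens.UCSC.hg38.knownGene", quietly = TRUE)

if (have_inputs) {
  # Assumption: the data were aligned to hg38.
  txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene::TxDb.Hsapiens.UCSC.hg38.knownGene

  # Default return value: a ggplot2 object.
  p <- CNA_plot(bw_file, vcf_file, txdb,
                method = "delta", samplename = "sample1")
  print(p)

  # With return_data = TRUE: a list holding the plot and the processed data.
  res <- CNA_plot(bw_file, vcf_file, txdb,
                  method = "delta", return_data = TRUE)
  str(res, max.level = 1)  # inspect the list rather than guessing its names
}
```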
The function performs several steps:
Imports coverage data from the BigWig file using import.bw() and keeps only standard chromosomes (with keepStandardChromosomes()).
Optionally loads external GC/repeat data when using the "fit" method.
Filters and downsamples the coverage data.
Computes delta values using the specified method:
"delta": subtracts the mean coverage.
"loess": fits a LOESS model and computes log2 ratios.
"fit": fits a linear model to predict coverage based on GC content and repeat fraction.
Reads and filters variant calls from the VCF file.
Computes genomic offsets via generate_offsets() (assumed to be defined elsewhere).
Applies weighting to delta values and computes a running median trend line using rollapply().
The value of trend_regions determines where the trend line is shown:
"inside": only within CNA call regions (the default),
"outside": only outside CNA call regions, or
"both": across all regions.
Builds the plot with ggplot2 incorporating points, trend line, variant rectangles, and chromosome boundaries.
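The numerical core of the steps above (downsampling, delta computation, weighting, and the running median) can be sketched on simulated data. This is an illustration under stated assumptions, not the function's actual code: the data frame bins, its columns, and compute_delta() are hypothetical; the LOESS model is assumed to regress coverage on GC content; and the "fit" delta is assumed to be the residual of the linear model. The weights (1 inside, 0.25 outside) and the window of 50 points mirror the documented defaults.

```r
library(zoo)  # provides rollapply()

set.seed(1)
n <- 2000
# Simulated 1 kb coverage bins; gc and rep_frac stand in for the external
# GC/repeat data. None of this is real data.
bins <- data.frame(
  pos      = seq_len(n) * 1000,
  gc       = runif(n, 0.3, 0.6),
  rep_frac = runif(n, 0, 0.5)
)
bins$coverage <- 30 + 20 * bins$gc - 5 * bins$rep_frac + rnorm(n)

# One hypothetical CNA call (a gain) covering bins 801 to 1200.
bins$inside <- bins$pos >= 801000 & bins$pos <= 1200000
bins$coverage[bins$inside] <- bins$coverage[bins$inside] + 10

# Downsample: keep a random half of the bins, preserving genomic order.
bins <- bins[sort(sample(n, size = round(0.5 * n))), ]

# Hypothetical helper: one delta definition per documented method.
compute_delta <- function(d, method) {
  switch(method,
    delta = d$coverage - mean(d$coverage),
    loess = {
      fit <- loess(coverage ~ gc, data = d)   # assumption: GC as predictor
      log2(d$coverage / predict(fit))
    },
    fit = {
      fit <- lm(coverage ~ gc + rep_frac, data = d)
      d$coverage - predict(fit)               # assumption: residual as delta
    },
    stop("unknown method: ", method)
  )
}

bins$delta <- compute_delta(bins, "delta")

# Down-weight points outside the call, then take a centered running median.
inside_weight  <- 1
outside_weight <- 0.25
bins$weighted <- bins$delta * ifelse(bins$inside, inside_weight, outside_weight)
bins$trend <- rollapply(bins$weighted, width = 50, FUN = median,
                        fill = NA, align = "center")

# trend_regions = "inside": blank the trend outside the call region.
bins$trend[!bins$inside] <- NA
```

Down-weighting the points outside calls pulls the trend toward zero there, so the median line stands out mainly where a call supports a copy number change.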
This function assumes that the helper function generate_offsets() is defined in your package.
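Since generate_offsets() is not documented here, the following sketch only illustrates the general idea of genomic offsets: each chromosome is shifted by the summed length of the chromosomes before it, so that all positions share one genome-wide x axis. The function sketch_offsets() and the toy chromosome lengths are hypothetical; the real helper's signature and return value may differ.

```r
# Hypothetical stand-in for illustration; not the package's generate_offsets().
sketch_offsets <- function(chr_lengths) {
  # Offset of a chromosome = total length of all chromosomes before it.
  offsets <- c(0, cumsum(as.numeric(chr_lengths))[-length(chr_lengths)])
  names(offsets) <- names(chr_lengths)
  offsets
}

chr_lengths <- c(chr1 = 1000, chr2 = 800, chr3 = 500)  # toy lengths, not real
offsets <- sketch_offsets(chr_lengths)

# Genome-wide coordinate of position 150 on chr2.
genome_pos <- offsets[["chr2"]] + 150
```

The same offsets can be used to place chromosome boundaries and variant rectangles on the shared axis.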