Annotates coverage intervals with the genes they overlap. The function converts a coverage data frame into a GRanges object, extracts gene ranges from a transcript database, finds overlaps between the coverage intervals and the genes, and maps gene IDs to gene symbols (or returns Entrez IDs when return_entrez = TRUE).

annotateCoverageWithGenes(
  coverage_df,
  species_annotation = org.Hs.eg.db,
  txdb = TxDb.Hsapiens.UCSC.hg38.knownGene,
  return_entrez = FALSE
)

Arguments

coverage_df

A data frame with at least the following columns: seqnames, start, and end representing coverage intervals.

species_annotation

An annotation database object used to map gene IDs to gene symbols with mapIds(). Defaults to org.Hs.eg.db.

txdb

A transcript database object from which gene information is extracted. Defaults to TxDb.Hsapiens.UCSC.hg38.knownGene.

return_entrez

Logical. If FALSE (default), gene symbols are returned; otherwise, the Entrez gene IDs are returned.
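
As an illustration of the input described above, the following is a minimal sketch of a data frame with the three required columns. The coordinates and the extra mean_coverage column are made up for the example and are not taken from the package. The seqnames use the "chr" prefix because UCSC-based TxDb packages name chromosomes that way.

```r
# Hypothetical input: two coverage intervals plus one extra column.
coverage_df <- data.frame(
  seqnames = c("chr17", "chr7"),
  start = c(7670000L, 55020000L),
  end = c(7680000L, 55030000L),
  mean_coverage = c(35.2, 48.9) # illustrative extra column, not required
)

# Check the documented requirements before annotating.
required <- c("seqnames", "start", "end")
stopifnot(all(required %in% names(coverage_df)))
stopifnot(all(coverage_df$start <= coverage_df$end))

# With the annotation packages installed, the call would then be:
# annotated <- annotateCoverageWithGenes(coverage_df)
# entrez_only <- annotateCoverageWithGenes(coverage_df, return_entrez = TRUE)
```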

Value

The original coverage_df with two additional columns: gene_id (gene IDs) and gene_symbol (gene symbols or Entrez IDs, depending on return_entrez).

Details

The function converts the input data frame to a GRanges object using GRanges() and IRanges(). It then obtains gene ranges via genes() from the transcript database and finds overlaps with findOverlaps(). Gene IDs are extracted from the metadata columns of the gene ranges and, unless return_entrez is TRUE, mapped to gene symbols using mapIds() and species_annotation.
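
The steps described above can be sketched roughly as follows. This is not the function's actual source. It assumes that the gene IDs are Entrez IDs (as they are for the UCSC knownGene TxDb packages), that symbols are looked up with the "SYMBOL" column, and that each overlap produces one output row. The real function may handle intervals with several or no overlapping genes differently. The input coordinates are made up for the example.

```r
library(GenomicRanges)                      # GRanges(), findOverlaps()
library(GenomicFeatures)                    # genes()
library(AnnotationDbi)                      # mapIds()
library(org.Hs.eg.db)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

# Hypothetical input with the three required columns.
coverage_df <- data.frame(
  seqnames = c("chr17", "chr7"),
  start = c(7670000L, 55020000L),
  end = c(7680000L, 55030000L)
)

# 1. Convert the data frame to a GRanges object.
gr <- GRanges(
  seqnames = coverage_df$seqnames,
  ranges = IRanges(start = coverage_df$start, end = coverage_df$end)
)

# 2. Extract gene ranges from the transcript database.
gene_ranges <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene)

# 3. Find overlaps between coverage intervals and genes.
hits <- findOverlaps(gr, gene_ranges)

# 4. Pull the gene IDs from the metadata columns of the gene ranges.
gene_ids <- mcols(gene_ranges)$gene_id[subjectHits(hits)]

# 5. Map IDs to symbols (assumed here to be ENTREZID -> SYMBOL).
symbols <- if (length(gene_ids) > 0) {
  unname(mapIds(org.Hs.eg.db, keys = gene_ids, column = "SYMBOL",
                keytype = "ENTREZID", multiVals = "first"))
} else {
  character(0)
}

# Assumption: one output row per interval/gene overlap.
annotated <- coverage_df[queryHits(hits), , drop = FALSE]
annotated$gene_id <- gene_ids
annotated$gene_symbol <- symbols
```

In this sketch, skipping step 5 and copying gene_ids into gene_symbol would correspond to return_entrez = TRUE.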