GRanges-findOverlaps {GenomicRanges}          R Documentation

GRanges and GRangesList Interval Overlaps

Description

Finds interval overlaps between two objects, each of which may be a GRanges or a GRangesList object.

Usage

  ## S4 method for signature 'GRanges,GRanges':
findOverlaps(query, subject, maxgap = 0L, minoverlap = 1L,
             type = c("any", "start", "end"), select = c("all", "first"))
  ## S4 method for signature 'GRanges,GRanges':
countOverlaps(query, subject, maxgap = 0L, minoverlap = 1L,
              type = c("any", "start", "end"))
  ## S4 method for signature 'GRanges,GRanges':
subsetByOverlaps(query, subject, maxgap = 0L, minoverlap = 1L,
                 type = c("any", "start", "end"))
  ## S4 method for signature 'GRanges,GRanges':
match(x, table, nomatch = NA_integer_, incomparables = NULL)
  # Also:  x %in% table

Arguments

query, subject, x, table
    A GRanges or GRangesList object. RangesList and RangedData are also accepted for one of query or subject (x or table for match).
maxgap
    A non-negative integer representing the maximum distance between a query interval and a subject interval.
minoverlap
    Ignored.
type
    The type of acceptable overlap: "any" - any overlap within maxgap; "start" - the start of the query overlaps the start of the subject within maxgap; "end" - the end of the query overlaps the end of the subject within maxgap.
select
    Overlaps to return: "all" - select all overlaps; "first" - select the first overlap.
nomatch
    The integer value to be returned in the case when no match is found.
incomparables
    This value is ignored.

Details

The findOverlaps methods involving GRanges and GRangesList objects use the triplet (sequence name, range, strand) to determine which features (see paragraph below for the definition of feature) from the query overlap which features in the subject, where a strand value of "*" is treated as occurring on both the "+" and "-" strand. An overlap is recorded when a feature in the query and a feature in the subject have the same sequence name, have a compatible pairing of strands (e.g. "+"/"+", "-"/"-", "*"/"+", "*"/"-", etc.), and satisfy the interval overlap requirements. Strand is taken as "*" for RangedData and RangesList.

In the context of findOverlaps, a feature is a collection of ranges that are treated as a single entity. For GRanges objects, a feature is a single range, while for GRangesList objects, a feature is a list element containing a set of ranges. In the results, features are referred to by index: query features are numbered from 1 to length(query) and subject features from 1 to length(subject).
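
The overlap rule described above can be restated in base R for the simplest case, a pair of single-range features. This is only a sketch: the helpers strands_compatible and feature_overlaps are hypothetical names, not part of GenomicRanges, and the sketch assumes maxgap = 0 and type = "any" with closed intervals [start, end].

```r
## Hypothetical helpers (not part of GenomicRanges): a base-R restatement of
## the overlap rule for two single-range features, assuming maxgap = 0 and
## type = "any".
strands_compatible <- function(s1, s2) {
  ## "*" is treated as occurring on both strands; otherwise strands must match
  s1 == "*" || s2 == "*" || s1 == s2
}

feature_overlaps <- function(q, s) {
  ## q and s are lists with elements seqname, start, end, strand
  same_seq   <- q$seqname == s$seqname
  strand_ok  <- strands_compatible(q$strand, s$strand)
  ## closed intervals [start, end] share at least one position
  ranges_hit <- q$start <= s$end && s$start <= q$end
  same_seq && strand_ok && ranges_hit
}

q <- list(seqname = "chr1", start = 5L, end = 10L, strand = "*")
feature_overlaps(q, list(seqname = "chr1", start = 8L, end = 12L, strand = "-"))  # TRUE
feature_overlaps(q, list(seqname = "chr2", start = 8L, end = 12L, strand = "-"))  # FALSE
q$strand <- "+"
feature_overlaps(q, list(seqname = "chr1", start = 8L, end = 12L, strand = "-"))  # FALSE
```

The three calls differ in one component of the triplet at a time: the first satisfies all three conditions, the second fails on sequence name, and the third fails on strand.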

Value

For findOverlaps, either a RangesMatching object (when select = "all") or an integer vector (when select = "first").

For countOverlaps, an integer vector containing the tabulated overlap hits for each feature in query.

For subsetByOverlaps, an object of the same class as query containing the subset that overlapped at least one entity in subject.

For match, the same as findOverlaps when select = "first".

For %in%, the logical vector produced by !is.na(match(x, table)).
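
How these return values relate to one another can be sketched in base R from a plain table of hit pairs. The data frame hits and the value n_query below are made-up illustrations, not output from GenomicRanges, and the sketch assumes that the "first" overlap means the lowest subject index for each query feature.

```r
## Hypothetical hit pairs for a query of length 4 and a subject of length 3;
## each row reads "query feature i overlaps subject feature j".
hits <- data.frame(query = c(1L, 1L, 3L), subject = c(2L, 3L, 1L))
n_query <- 4L

## Counterpart of countOverlaps: number of hits per query feature
counts <- tabulate(hits$query, nbins = n_query)

## Counterpart of select = "first" (and of match): lowest subject index per
## query feature (an assumption about what "first" means), NA where no hit
first <- rep(NA_integer_, n_query)
mins <- tapply(hits$subject, hits$query, min)
first[as.integer(names(mins))] <- as.integer(mins)

## Counterpart of %in%
is_in <- !is.na(first)

counts  # 2 0 1 0
first   # 2 NA 1 NA
is_in   # TRUE FALSE TRUE FALSE
```

All three vectors have one entry per query feature, which is why they line up with length(query) rather than with the number of hits.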

For RangedData and RangesList, with the exception of subsetByOverlaps, the results align to the unlisted form of the object. This turns out to be fairly convenient for RangedData (not so much for RangesList, but something has to give).

Author(s)

P. Aboyoun, S. Falcon, and M. Lawrence

See Also

findOverlaps, GRanges, GRangesList

Examples

  ## GRanges object
  gr <-
    GRanges(seqnames =
            Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
            ranges =
            IRanges(1:10, width = 10:1, names = head(letters,10)),
            strand =
            Rle(strand(c("-", "+", "*", "+", "-")),
                c(1, 2, 2, 3, 2)),
            score = 1:10,
            GC = seq(1, 0, length.out = 10))
  gr

  ## GRangesList object
  gr1 <-
    GRanges(seqnames = "chr2", ranges = IRanges(3, 6),
            strand = "+", score = 5L, GC = 0.45)
  gr2 <-
    GRanges(seqnames = c("chr1", "chr1"),
            ranges = IRanges(c(7,13), width = 3),
            strand = c("+", "-"), score = 3:4, GC = c(0.3, 0.5))
  gr3 <-
    GRanges(seqnames = c("chr1", "chr2"),
            ranges = IRanges(c(1, 4), c(3, 9)),
            strand = c("-", "-"), score = c(6L, 2L), GC = c(0.4, 0.1))
  grlist <- GRangesList("gr1" = gr1, "gr2" = gr2, "gr3" = gr3)

  ## Overlapping two GRanges objects
  table(gr %in% gr1)
  countOverlaps(gr, gr1)
  findOverlaps(gr, gr1)
  subsetByOverlaps(gr, gr1)
  countOverlaps(gr, gr1, type = "start")
  findOverlaps(gr, gr1, type = "start")
  subsetByOverlaps(gr, gr1, type = "start")
  findOverlaps(gr, gr1, select = "first")

  ## Overlapping a GRanges and a GRangesList object
  table(grlist %in% gr)
  countOverlaps(grlist, gr)
  findOverlaps(grlist, gr)
  subsetByOverlaps(grlist, gr)
  countOverlaps(grlist, gr, type = "start")
  findOverlaps(grlist, gr, type = "start")
  subsetByOverlaps(grlist, gr, type = "start")
  findOverlaps(grlist, gr, select = "first")

  ## Overlapping two GRangesList objects
  countOverlaps(grlist, rev(grlist))
  findOverlaps(grlist, rev(grlist))
  subsetByOverlaps(grlist, rev(grlist))

[Package GenomicRanges version 1.0.9 Index]