translate {Biostrings} | R Documentation |
Functions for transcription and/or translation of DNA or RNA sequences, and related utilities.
## Transcription:
transcribe(x)
cDNA(x)

## Translation:
codons(x)
translate(x)

## Related utilities:
dna2rna(x)
rna2dna(x)
x | A DNAString object for transcribe and dna2rna.
An RNAString object for cDNA and rna2dna.
A DNAString, RNAString, MaskedDNAString or MaskedRNAString object for codons.
A DNAString, RNAString, DNAStringSet, RNAStringSet, MaskedDNAString or MaskedRNAString object for translate.
transcribe reproduces the biological process of DNA transcription that occurs in the cell. It takes the naive approach of treating the whole sequence x as if it were a single exon. See extractTranscripts for a more powerful version that allows the user to extract a set of transcripts specified by the starts and ends of their exons as well as the strand from which each transcript comes.
cDNA reproduces the process of synthesizing complementary DNA from a mature mRNA template.
translate reproduces the biological process of RNA translation that occurs in the cell. The input of the function can be either RNA or coding DNA. The Standard Genetic Code (see ?GENETIC_CODE) is used to translate codons into amino acids. codons is a utility for extracting the codons involved in this translation without translating them.
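The relationship between codons and translate can be sketched in base R on plain character strings. This is only an illustration under stated assumptions: the helper names codons_sketch and translate_sketch and the code_subset table are hypothetical, the table lists just a few entries of the Standard Genetic Code, and the real functions operate on Biostrings objects rather than character vectors.

```r
# Base-R sketch on plain strings; all names here are hypothetical and are
# not part of Biostrings. Only a few Standard Genetic Code entries are listed.
code_subset <- c(ATG = "M", GCT = "A", AAA = "K", TGG = "W", TAA = "*")

# Split a coding DNA string into consecutive, non-overlapping triplets.
codons_sketch <- function(dna) {
  n <- nchar(dna) %/% 3                      # trailing 1-2 bases are dropped
  starts <- seq(1, by = 3, length.out = n)
  substring(dna, starts, starts + 2)
}

# Look each codon up in the table and concatenate the amino acid letters.
translate_sketch <- function(dna) {
  aa <- code_subset[codons_sketch(dna)]
  aa[is.na(aa)] <- "X"                       # codon missing from the subset
  paste(aa, collapse = "")
}

print(codons_sketch("ATGGCTAAATGGTAA"))      # "ATG" "GCT" "AAA" "TGG" "TAA"
print(translate_sketch("ATGGCTAAATGGTAA"))   # "MAKW*"
```

An RNA input could be handled in this sketch by first replacing every U with T before the lookup.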
dna2rna and rna2dna are low-level utilities for converting sequences from DNA to RNA and vice versa. All this conversion does is replace each occurrence of T with U (dna2rna) or each occurrence of U with T (rna2dna).
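The letter substitution described here can be mimicked in base R with chartr(). The sketch below works on plain character strings, not on DNAString or RNAString objects, and the helper names dna2rna_sketch and rna2dna_sketch are hypothetical.

```r
# Minimal base-R sketch; helper names are hypothetical, not Biostrings functions.
dna2rna_sketch <- function(x) chartr("T", "U", x)  # replace every T with U
rna2dna_sketch <- function(x) chartr("U", "T", x)  # replace every U with T

dna <- "ATGGCTTAA"
rna <- dna2rna_sketch(dna)
print(rna)                    # "AUGGCUUAA"
print(rna2dna_sketch(rna))    # "ATGGCTTAA"
```

The two helpers are inverses of each other on valid input, and neither changes the orientation of the sequence.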
An RNAString object for transcribe and dna2rna.
A DNAString object for cDNA and rna2dna.
Note that if the sequence passed to transcribe or cDNA is considered to be oriented 5'-3', then the returned sequence is oriented 3'-5'.
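The orientation note can be illustrated with a base-R sketch. It assumes that transcribe and cDNA amount to a base-wise complement with no reversal, which is one reading consistent with the 5'-3' to 3'-5' statement but is not confirmed by this page. The names transcribe_sketch and cdna_sketch are hypothetical, and the code works on plain character strings.

```r
# Base-R sketch on plain strings; function names are hypothetical.
# Assumption: transcription is modelled as a base-wise complement written
# in RNA letters, with no reversal, so a 5'-3' input yields a 3'-5' output.
transcribe_sketch <- function(dna) chartr("ACGT", "UGCA", dna)

# Assumption: cDNA synthesis is modelled as the base-wise complement of the
# mRNA written in DNA letters, again without reversal.
cdna_sketch <- function(rna) chartr("ACGU", "TGCA", rna)

template <- "TACCGAATT"          # read 5'-3'
mrna <- transcribe_sketch(template)
print(mrna)                      # "AUGGCUUAA", read 3'-5'
print(cdna_sketch(mrna))         # "TACCGAATT"
```

Under these assumptions, reversing the output string would give the sequence written in the conventional 5'-3' direction.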
An XStringViews object with 1 view per codon for codons. When x is a MaskedDNAString or MaskedRNAString object, its masked parts are interpreted as introns and filled with the + letter in the returned object. Therefore codons that span masked regions are represented by views that have a width > 3 and contain the + letter. Note that each view is guaranteed to contain exactly 3 base letters.
An AAString object for translate.
reverseComplement, GENETIC_CODE, DNAString-class, RNAString-class, AAString-class, XStringSet-class, XStringViews-class, MaskedXString-class
file <- system.file("extdata", "someORF.fa", package="Biostrings")
x <- readDNAStringSet(file)
x

## The first and last 1000 nucleotides are not part of the ORFs:
x <- DNAStringSet(x, start=1001, end=-1001)

## Before calling translate() on an ORF, we need to mask the introns
## if any. We can get this information from the SGD database
## (http://www.yeastgenome.org/).
## According to SGD, the 1st ORF (YAL001C) has an intron at 71..160
## (see http://db.yeastgenome.org/cgi-bin/locus.pl?locus=YAL001C)
y1 <- x[[1]]
mask1 <- Mask(length(y1), start=71, end=160)
masks(y1) <- mask1
y1
translate(y1)

## Codons
codons(y1)
which(width(codons(y1)) != 3)
codons(y1)[20:28]

## Translation on the '-' strand:
dna <- DNAStringSet(c("ATC", "GCTG", "CGACT"))
translate(reverseComplement(dna))

## Translate sequences on both '+' and '-' strand across all
## possible reading frames (i.e., codon position 1, 2 or 3):
## First create a DNAStringSet of '+' and '-' strand sequences,
## removing the nucleotides prior to the reading frame start position.
dnaSubseq <- lapply(1:3, function(pos, dna)
    subseq(c(dna, reverseComplement(dna)), start=pos), dna)
## Translation of 'dnaSubseq' produces a list of length 3, each with 6
## elements (3 '+' strand results followed by 3 '-' strand results).
lapply(dnaSubseq, translate)

## translate() throws a warning when the length of the sequence is not
## divisible by 3. To avoid this warning wrap the function in
## suppressWarnings().