Class: Google::Apis::GenomicsV1::Read
- Inherits:
- Object
  - Object
  - Google::Apis::GenomicsV1::Read
- Includes:
- Core::Hashable, Core::JsonObjectSupport
- Defined in:
- generated/google/apis/genomics_v1/classes.rb,
generated/google/apis/genomics_v1/representations.rb
Overview
A read alignment describes a linear alignment of a string of DNA to a reference sequence, in addition to metadata about the fragment (the molecule of DNA sequenced) and the read (the bases which were read by the sequencer). A read is equivalent to a line in a SAM file. A read belongs to exactly one read group and exactly one read group set.
Reverse-stranded reads
Mapped reads (reads having a non-null alignment) can be aligned to either the forward or the reverse strand of their associated reference. Strandedness of a mapped read is encoded by alignment.position.reverseStrand.
If we consider the reference to be a forward-stranded coordinate space of [0, reference.length) with 0 as the left-most position and reference.length as the right-most position, reads are always aligned left to right. That is, alignment.position.position always refers to the left-most reference coordinate and alignment.cigar describes the alignment of this read to the reference from left to right. All per-base fields such as alignedSequence and alignedQuality share this same left-to-right orientation; this is true of reads which are aligned to either strand. For reverse-stranded reads, this means that alignedSequence is the reverse complement of the bases that were originally reported by the sequencing machine.
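The orientation rule above implies that the bases as reported by the sequencer can be recovered from a reverse-stranded read by reverse-complementing aligned_sequence (and reversing aligned_quality). The following is a minimal sketch of that idea; original_bases and original_qualities are hypothetical helper names, not part of the generated client, and the strand flag is assumed to come from alignment.position.reverse_strand.

```ruby
# Hypothetical helpers (not part of the generated client).
# reverse_strand is assumed to be read.alignment.position.reverse_strand.

# Returns the bases in the orientation reported by the sequencer.
def original_bases(aligned_sequence, reverse_strand)
  return aligned_sequence unless reverse_strand

  # Reverse the string, then complement each base; symbols outside
  # ACGT/acgt (for example N) are left unchanged by String#tr.
  aligned_sequence.reverse.tr('ACGTacgt', 'TGCAtgca')
end

# Per-base qualities only need to be reversed, not complemented.
def original_qualities(aligned_quality, reverse_strand)
  reverse_strand ? aligned_quality.reverse : aligned_quality
end

puts original_bases('AACG', true)   # prints "CGTT"
puts original_bases('AACG', false)  # prints "AACG"
```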
Generating a reference-aligned sequence string
When interacting with mapped reads, it's often useful to produce a string
representing the local alignment of the read to reference. The following
pseudocode demonstrates one way of doing this:
    out = ""
    offset = 0
    for c in read.alignment.cigar {
      switch c.operation {
      case "ALIGNMENT_MATCH", "SEQUENCE_MATCH", "SEQUENCE_MISMATCH":
        out += read.alignedSequence[offset:offset+c.operationLength]
        offset += c.operationLength
        break
      case "CLIP_SOFT", "INSERT":
        offset += c.operationLength
        break
      case "PAD":
        out += repeat("*", c.operationLength)
        break
      case "DELETE":
        out += repeat("-", c.operationLength)
        break
      case "SKIP":
        out += repeat(" ", c.operationLength)
        break
      case "CLIP_HARD":
        break
      }
    }
    return out
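A Ruby rendering of the pseudocode above might look like the following sketch. CigarUnit here is a local Struct standing in for the client's CIGAR unit objects, and reference_aligned_string is a hypothetical helper name; the only assumption about the real objects is that each unit responds to operation and operation_length.

```ruby
# Stand-in for the client's CIGAR unit objects (assumption: each unit
# responds to #operation and #operation_length).
CigarUnit = Struct.new(:operation, :operation_length)

# Hypothetical helper: builds the reference-aligned string for a read.
def reference_aligned_string(aligned_sequence, cigar)
  out = +''
  offset = 0
  cigar.each do |c|
    len = c.operation_length.to_i # tolerate integer or string lengths
    case c.operation
    when 'ALIGNMENT_MATCH', 'SEQUENCE_MATCH', 'SEQUENCE_MISMATCH'
      out << aligned_sequence[offset, len] # copy aligned bases
      offset += len
    when 'CLIP_SOFT', 'INSERT'
      offset += len                        # read bases absent from the reference
    when 'PAD'
      out << '*' * len
    when 'DELETE'
      out << '-' * len
    when 'SKIP'
      out << ' ' * len
    when 'CLIP_HARD'
      # hard-clipped bases are not stored in aligned_sequence
    end
  end
  out
end

cigar = [
  CigarUnit.new('CLIP_SOFT', 2),
  CigarUnit.new('ALIGNMENT_MATCH', 3),
  CigarUnit.new('INSERT', 1),
  CigarUnit.new('DELETE', 2),
  CigarUnit.new('ALIGNMENT_MATCH', 2)
]
puts reference_aligned_string('ACGTACGT', cigar) # prints "GTA--GT"
```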
Converting to SAM's CIGAR string
The following pseudocode generates a SAM CIGAR string from the cigar field. Note that this is a lossy conversion (cigar.referenceSequence is lost).
    cigarMap = {
      "ALIGNMENT_MATCH": "M",
      "INSERT": "I",
      "DELETE": "D",
      "SKIP": "N",
      "CLIP_SOFT": "S",
      "CLIP_HARD": "H",
      "PAD": "P",
      "SEQUENCE_MATCH": "=",
      "SEQUENCE_MISMATCH": "X",
    }
    cigarStr = ""
    for c in read.alignment.cigar {
      cigarStr += c.operationLength + cigarMap[c.operation]
    }
    return cigarStr
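The same conversion can be sketched in Ruby as follows. SamOp is a local Struct standing in for the client's CIGAR unit objects and sam_cigar is a hypothetical helper name. Returning "*" for an empty CIGAR follows the SAM convention for an unavailable CIGAR and is an addition to the pseudocode above.

```ruby
# Stand-in for the client's CIGAR unit objects (assumption: each unit
# responds to #operation and #operation_length).
SamOp = Struct.new(:operation, :operation_length)

# Operation names from the cigar field mapped to SAM CIGAR codes.
CIGAR_CODES = {
  'ALIGNMENT_MATCH'   => 'M',
  'INSERT'            => 'I',
  'DELETE'            => 'D',
  'SKIP'              => 'N',
  'CLIP_SOFT'         => 'S',
  'CLIP_HARD'         => 'H',
  'PAD'               => 'P',
  'SEQUENCE_MATCH'    => '=',
  'SEQUENCE_MISMATCH' => 'X'
}.freeze

# Hypothetical helper: renders a SAM CIGAR string. The conversion is
# lossy because referenceSequence is dropped.
def sam_cigar(cigar)
  # SAM writes "*" when no CIGAR is available (addition to the pseudocode).
  return '*' if cigar.nil? || cigar.empty?

  # Hash#fetch raises KeyError on an unknown operation instead of
  # silently emitting a malformed string.
  cigar.map { |c| "#{c.operation_length}#{CIGAR_CODES.fetch(c.operation)}" }.join
end

ops = [SamOp.new('ALIGNMENT_MATCH', 3), SamOp.new('INSERT', 1), SamOp.new('DELETE', 2)]
puts sam_cigar(ops) # prints "3M1I2D"
```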
Instance Attribute Summary
-
#aligned_quality ⇒ Array<Fixnum>
The quality of the read sequence contained in this alignment record (equivalent to QUAL in SAM).
-
#aligned_sequence ⇒ String
The bases of the read sequence contained in this alignment record, without CIGAR operations applied (equivalent to SEQ in SAM).
-
#alignment ⇒ Google::Apis::GenomicsV1::LinearAlignment
A linear alignment can be represented by one CIGAR string.
-
#duplicate_fragment ⇒ Boolean
(also: #duplicate_fragment?)
The fragment is a PCR or optical duplicate (SAM flag 0x400).
-
#failed_vendor_quality_checks ⇒ Boolean
(also: #failed_vendor_quality_checks?)
Whether this read did not pass filters, such as platform or vendor quality controls (SAM flag 0x200).
-
#fragment_length ⇒ Fixnum
The observed length of the fragment, equivalent to TLEN in SAM.
-
#fragment_name ⇒ String
The fragment name.
-
#id ⇒ String
The server-generated read ID, unique across all reads.
-
#info ⇒ Hash<String,Array<Object>>
A map of additional read alignment information.
-
#next_mate_position ⇒ Google::Apis::GenomicsV1::Position
An abstraction for referring to a genomic position, in relation to some already known reference.
-
#number_reads ⇒ Fixnum
The number of reads in the fragment (extension to SAM flag 0x1).
-
#proper_placement ⇒ Boolean
(also: #proper_placement?)
The orientation and the distance between reads from the fragment are consistent with the sequencing protocol (SAM flag 0x2).
-
#read_group_id ⇒ String
The ID of the read group this read belongs to.
-
#read_group_set_id ⇒ String
The ID of the read group set this read belongs to.
-
#read_number ⇒ Fixnum
The read number in sequencing.
-
#secondary_alignment ⇒ Boolean
(also: #secondary_alignment?)
Whether this alignment is secondary.
-
#supplementary_alignment ⇒ Boolean
(also: #supplementary_alignment?)
Whether this alignment is supplementary.
Instance Method Summary
-
#initialize(**args) ⇒ Read
constructor
A new instance of Read.
-
#update!(**args) ⇒ Object
Update properties of this object.
Methods included from Core::JsonObjectSupport
Methods included from Core::Hashable
Constructor Details
#initialize(**args) ⇒ Read
Returns a new instance of Read
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2056
    def initialize(**args)
      update!(**args)
    end
Instance Attribute Details
#aligned_quality ⇒ Array<Fixnum>
The quality of the read sequence contained in this alignment record (equivalent to QUAL in SAM). alignedSequence and alignedQuality may be shorter than the full read sequence and quality. This will occur if the alignment is part of a chimeric alignment, or if the read was trimmed. When this occurs, the CIGAR for this read will begin/end with a hard clip operator that will indicate the length of the excised sequence.
Corresponds to the JSON property alignedQuality
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1935
    def aligned_quality
      @aligned_quality
    end
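The relationship between the stored per-base fields and hard clips can be checked from the CIGAR alone. The sketch below sums the operations that consume read bases (which should match the length of aligned_sequence and aligned_quality) and the hard-clipped lengths (the excised portion). ClipUnit is a local Struct standing in for the client's CIGAR unit objects and read_lengths is a hypothetical helper name.

```ruby
# Stand-in for the client's CIGAR unit objects (assumption: each unit
# responds to #operation and #operation_length).
ClipUnit = Struct.new(:operation, :operation_length)

# Operations that consume bases stored in aligned_sequence.
READ_CONSUMING = %w[
  ALIGNMENT_MATCH SEQUENCE_MATCH SEQUENCE_MISMATCH INSERT CLIP_SOFT
].freeze

# Hypothetical helper: returns the stored length, the hard-clipped
# length, and their sum (the full length of the original read).
def read_lengths(cigar)
  stored = cigar.select { |c| READ_CONSUMING.include?(c.operation) }
                .sum { |c| c.operation_length.to_i }
  clipped = cigar.select { |c| c.operation == 'CLIP_HARD' }
                 .sum { |c| c.operation_length.to_i }
  { stored: stored, hard_clipped: clipped, full: stored + clipped }
end

cigar = [
  ClipUnit.new('CLIP_HARD', 5),
  ClipUnit.new('ALIGNMENT_MATCH', 10),
  ClipUnit.new('INSERT', 2),
  ClipUnit.new('DELETE', 3),
  ClipUnit.new('CLIP_SOFT', 4)
]
lengths = read_lengths(cigar)
puts lengths[:stored]       # prints 16
puts lengths[:hard_clipped] # prints 5
puts lengths[:full]         # prints 21
```

Comparing the stored value with aligned_sequence.length is a cheap sanity check before indexing into the per-base fields.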
#aligned_sequence ⇒ String
The bases of the read sequence contained in this alignment record, without CIGAR operations applied (equivalent to SEQ in SAM). alignedSequence and alignedQuality may be shorter than the full read sequence and quality. This will occur if the alignment is part of a chimeric alignment, or if the read was trimmed. When this occurs, the CIGAR for this read will begin/end with a hard clip operator that will indicate the length of the excised sequence.
Corresponds to the JSON property alignedSequence
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1946
    def aligned_sequence
      @aligned_sequence
    end
#alignment ⇒ Google::Apis::GenomicsV1::LinearAlignment
A linear alignment can be represented by one CIGAR string. Describes the
mapped position and local alignment of the read to the reference.
Corresponds to the JSON property alignment
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1952
    def alignment
      @alignment
    end
#duplicate_fragment ⇒ Boolean Also known as: duplicate_fragment?
The fragment is a PCR or optical duplicate (SAM flag 0x400).
Corresponds to the JSON property duplicateFragment
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1957
    def duplicate_fragment
      @duplicate_fragment
    end
#failed_vendor_quality_checks ⇒ Boolean Also known as: failed_vendor_quality_checks?
Whether this read did not pass filters, such as platform or vendor quality
controls (SAM flag 0x200).
Corresponds to the JSON property failedVendorQualityChecks
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1964
    def failed_vendor_quality_checks
      @failed_vendor_quality_checks
    end
#fragment_length ⇒ Fixnum
The observed length of the fragment, equivalent to TLEN in SAM.
Corresponds to the JSON property fragmentLength
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1970
    def fragment_length
      @fragment_length
    end
#fragment_name ⇒ String
The fragment name. Equivalent to QNAME (query template name) in SAM.
Corresponds to the JSON property fragmentName
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1975
    def fragment_name
      @fragment_name
    end
#id ⇒ String
The server-generated read ID, unique across all reads. This is different from the fragmentName.
Corresponds to the JSON property id
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1981
    def id
      @id
    end
#info ⇒ Hash<String,Array<Object>>
A map of additional read alignment information. This must be of the form map<string, string[]> (string key mapping to a list of string values).
Corresponds to the JSON property info
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1987
    def info
      @info
    end
#next_mate_position ⇒ Google::Apis::GenomicsV1::Position
An abstraction for referring to a genomic position, in relation to some
already known reference. For now, represents a genomic position as a
reference name, a base number on that reference (0-based), and a
determination of forward or reverse strand.
Corresponds to the JSON property nextMatePosition
    # File 'generated/google/apis/genomics_v1/classes.rb', line 1995
    def next_mate_position
      @next_mate_position
    end
#number_reads ⇒ Fixnum
The number of reads in the fragment (extension to SAM flag 0x1).
Corresponds to the JSON property numberReads
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2000
    def number_reads
      @number_reads
    end
#proper_placement ⇒ Boolean Also known as: proper_placement?
The orientation and the distance between reads from the fragment are
consistent with the sequencing protocol (SAM flag 0x2).
Corresponds to the JSON property properPlacement
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2006
    def proper_placement
      @proper_placement
    end
#read_group_id ⇒ String
The ID of the read group this read belongs to. A read belongs to exactly one read group. This is a server-generated ID which is distinct from SAM's RG tag (for that value, see ReadGroup.name).
Corresponds to the JSON property readGroupId
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2015
    def read_group_id
      @read_group_id
    end
#read_group_set_id ⇒ String
The ID of the read group set this read belongs to. A read belongs to
exactly one read group set.
Corresponds to the JSON property readGroupSetId
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2021
    def read_group_set_id
      @read_group_set_id
    end
#read_number ⇒ Fixnum
The read number in sequencing. 0-based and less than numberReads. This
field replaces SAM flag 0x40 and 0x80.
Corresponds to the JSON property readNumber
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2027
    def read_number
      @read_number
    end
#secondary_alignment ⇒ Boolean Also known as: secondary_alignment?
Whether this alignment is secondary. Equivalent to SAM flag 0x100. A secondary alignment represents an alternative to the primary alignment for this read. Aligners may return secondary alignments if a read can map ambiguously to multiple coordinates in the genome. By convention, each read has one and only one alignment where both secondaryAlignment and supplementaryAlignment are false.
Corresponds to the JSON property secondaryAlignment
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2037
    def secondary_alignment
      @secondary_alignment
    end
#supplementary_alignment ⇒ Boolean Also known as: supplementary_alignment?
Whether this alignment is supplementary. Equivalent to SAM flag 0x800. Supplementary alignments are used in the representation of a chimeric alignment. In a chimeric alignment, a read is split into multiple linear alignments that map to different reference contigs. The first linear alignment in the read will be designated as the representative alignment; the remaining linear alignments will be designated as supplementary alignments. These alignments may have different mapping quality scores. In each linear alignment in a chimeric alignment, the read will be hard clipped. The alignedSequence and alignedQuality fields in the alignment record will only represent the bases for its respective linear alignment.
Corresponds to the JSON property supplementaryAlignment
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2053
    def supplementary_alignment
      @supplementary_alignment
    end
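Several attributes above are documented in terms of SAM flag bits. As an illustration, the sketch below folds them back into a partial SAM FLAG value. FlagFields is a local Struct standing in for a Read and sam_flag_bits is a hypothetical helper name. The handling of 0x1, 0x40 and 0x80 is an assumption based on the SAM flag definitions (first segment, last segment, both set for a middle segment), not something this page specifies, and the bits that depend on alignment and next_mate_position (0x4, 0x8, 0x10, 0x20) are deliberately left out.

```ruby
# Stand-in for a Read exposing only the flag-related attributes.
FlagFields = Struct.new(
  :number_reads, :read_number, :proper_placement, :duplicate_fragment,
  :failed_vendor_quality_checks, :secondary_alignment,
  :supplementary_alignment,
  keyword_init: true
)

# Hypothetical helper: partial SAM FLAG value for a read.
def sam_flag_bits(read)
  flags = 0
  multi = read.number_reads.to_i > 1
  if multi
    first = read.read_number == 0
    last = read.read_number == read.number_reads - 1
    flags |= 0x1              # template has multiple segments (assumption)
    flags |= 0x40 unless last  # first, or middle (both bits set)
    flags |= 0x80 unless first # last, or middle (both bits set)
  end
  flags |= 0x2   if read.proper_placement
  flags |= 0x100 if read.secondary_alignment
  flags |= 0x200 if read.failed_vendor_quality_checks
  flags |= 0x400 if read.duplicate_fragment
  flags |= 0x800 if read.supplementary_alignment
  flags
end

mate = FlagFields.new(number_reads: 2, read_number: 1,
                      proper_placement: true, duplicate_fragment: true)
puts sam_flag_bits(mate) # prints 1155 (0x1 | 0x2 | 0x80 | 0x400)
```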
Instance Method Details
#update!(**args) ⇒ Object
Update properties of this object
    # File 'generated/google/apis/genomics_v1/classes.rb', line 2061
    def update!(**args)
      @aligned_quality = args[:aligned_quality] if args.key?(:aligned_quality)
      @aligned_sequence = args[:aligned_sequence] if args.key?(:aligned_sequence)
      @alignment = args[:alignment] if args.key?(:alignment)
      @duplicate_fragment = args[:duplicate_fragment] if args.key?(:duplicate_fragment)
      @failed_vendor_quality_checks = args[:failed_vendor_quality_checks] if args.key?(:failed_vendor_quality_checks)
      @fragment_length = args[:fragment_length] if args.key?(:fragment_length)
      @fragment_name = args[:fragment_name] if args.key?(:fragment_name)
      @id = args[:id] if args.key?(:id)
      @info = args[:info] if args.key?(:info)
      @next_mate_position = args[:next_mate_position] if args.key?(:next_mate_position)
      @number_reads = args[:number_reads] if args.key?(:number_reads)
      @proper_placement = args[:proper_placement] if args.key?(:proper_placement)
      @read_group_id = args[:read_group_id] if args.key?(:read_group_id)
      @read_group_set_id = args[:read_group_set_id] if args.key?(:read_group_set_id)
      @read_number = args[:read_number] if args.key?(:read_number)
      @secondary_alignment = args[:secondary_alignment] if args.key?(:secondary_alignment)
      @supplementary_alignment = args[:supplementary_alignment] if args.key?(:supplementary_alignment)
    end
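Because update! guards every assignment with args.key?, a keyword that is omitted leaves the attribute untouched, while a keyword passed explicitly as nil clears it. The sketch below reproduces that pattern on MiniRead, a hypothetical two-attribute stand-in class, so it runs without the client library installed.

```ruby
# Hypothetical stand-in that mirrors the update! pattern shown above
# with only two attributes.
class MiniRead
  attr_accessor :id, :fragment_name

  def initialize(**args)
    update!(**args)
  end

  # Assigns only the attributes whose keys are present in args.
  def update!(**args)
    @id = args[:id] if args.key?(:id)
    @fragment_name = args[:fragment_name] if args.key?(:fragment_name)
  end
end

read = MiniRead.new(id: 'r1', fragment_name: 'frag-7')

read.update!(fragment_name: 'frag-8') # id is not mentioned, so it is kept
puts read.id            # prints "r1"
puts read.fragment_name # prints "frag-8"

read.update!(id: nil)   # key present with nil value, so id is cleared
puts read.id.inspect    # prints "nil"
```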