Class: Google::Apis::GenomicsV1::Read

Inherits: Object

Includes: Core::Hashable, Core::JsonObjectSupport

Defined in: generated/google/apis/genomics_v1/classes.rb,
generated/google/apis/genomics_v1/representations.rb

Overview

A read alignment describes a linear alignment of a string of DNA to a reference sequence, together with metadata about the fragment (the molecule of DNA sequenced) and the read (the bases which were read by the sequencer). A read is equivalent to a line in a SAM file. A read belongs to exactly one read group and exactly one read group set. For more genomics resource definitions, see Fundamentals of Google Genomics.

Reverse-stranded reads

Mapped reads (reads having a non-null alignment) can be aligned to either the forward or the reverse strand of their associated reference. Strandedness of a mapped read is encoded by alignment.position.reverseStrand. If we consider the reference to be a forward-stranded coordinate space of [0, reference.length) with 0 as the left-most position and reference.length as the right-most position, reads are always aligned left to right. That is, alignment.position.position always refers to the left-most reference coordinate and alignment.cigar describes the alignment of this read to the reference from left to right. All per-base fields such as alignedSequence and alignedQuality share this same left-to-right orientation; this is true of reads which are aligned to either strand. For reverse-stranded reads, this means that alignedSequence is the reverse complement of the bases that were originally reported by the sequencing machine.
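As a rough Ruby sketch of the last point, the bases as reported by the sequencer can be recovered by reverse-complementing the aligned sequence of a reverse-stranded read. The helper names reverse_complement and original_bases are hypothetical (they are not part of the generated client), and the strand flag is passed in as a plain argument instead of being read from a client object.

```ruby
# Hypothetical helpers, not part of the generated client.

# Reverse complement of a DNA string. Only A, C, G and T (either case) are
# translated; other characters such as N are left unchanged.
def reverse_complement(bases)
  bases.reverse.tr('ACGTacgt', 'TGCAtgca')
end

# Returns the bases as originally reported by the sequencer.
# aligned_sequence: String, oriented left to right relative to the reference.
# reverse_strand:   true if the read aligned to the reverse strand.
def original_bases(aligned_sequence, reverse_strand)
  reverse_strand ? reverse_complement(aligned_sequence) : aligned_sequence
end

puts original_bases('AACGT', true)   # prints ACGTT
```

Forward-stranded reads are returned unchanged, since their aligned sequence already matches what the sequencer reported.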

Generating a reference-aligned sequence string

When interacting with mapped reads, it's often useful to produce a string representing the local alignment of the read to the reference. The following pseudocode demonstrates one way of doing this:

out = ""
offset = 0
for c in read.alignment.cigar
  switch c.operation
  case "ALIGNMENT_MATCH", "SEQUENCE_MATCH", "SEQUENCE_MISMATCH":
    out += read.alignedSequence[offset:offset+c.operationLength]
    offset += c.operationLength
    break
  case "CLIP_SOFT", "INSERT":
    offset += c.operationLength
    break
  case "PAD":
    out += repeat("*", c.operationLength)
    break
  case "DELETE":
    out += repeat("-", c.operationLength)
    break
  case "SKIP":
    out += repeat(" ", c.operationLength)
    break
  case "CLIP_HARD":
    break
return out
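The pseudocode can be translated into Ruby roughly as follows. This is a sketch under assumptions: CigarUnit is a local Struct stand-in, its operation and operation_length fields are assumed to be the snake_case forms of the JSON properties, and reference_aligned_sequence is a hypothetical helper name.

```ruby
# Stand-in for a CIGAR unit; the field names are assumptions.
CigarUnit = Struct.new(:operation, :operation_length)

# Builds the reference-aligned string for a read's aligned sequence.
def reference_aligned_sequence(aligned_sequence, cigar)
  out = +''
  offset = 0
  cigar.each do |c|
    # to_i accepts the length as either an Integer or a numeric String.
    len = c.operation_length.to_i
    case c.operation
    when 'ALIGNMENT_MATCH', 'SEQUENCE_MATCH', 'SEQUENCE_MISMATCH'
      out << aligned_sequence[offset, len]
      offset += len
    when 'CLIP_SOFT', 'INSERT'
      offset += len            # bases in the read that are not on the reference
    when 'PAD'
      out << '*' * len
    when 'DELETE'
      out << '-' * len
    when 'SKIP'
      out << ' ' * len
    when 'CLIP_HARD'
      # Hard-clipped bases are absent from aligned_sequence; nothing to do.
    end
  end
  out
end

cigar = [
  CigarUnit.new('CLIP_SOFT', 2),
  CigarUnit.new('ALIGNMENT_MATCH', 3),
  CigarUnit.new('DELETE', 2),
  CigarUnit.new('INSERT', 1),
  CigarUnit.new('ALIGNMENT_MATCH', 2)
]
puts reference_aligned_sequence('TTACGTGA', cigar)   # prints ACG--GA
```

The sketch assumes a well-formed CIGAR whose read-consuming operations add up to the length of the aligned sequence; it does no validation.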

Converting to SAM's CIGAR string

The following pseudocode generates a SAM CIGAR string from the cigar field. Note that this is a lossy conversion (cigar.referenceSequence is lost).

cigarMap = {
  "ALIGNMENT_MATCH": "M",
  "INSERT": "I",
  "DELETE": "D",
  "SKIP": "N",
  "CLIP_SOFT": "S",
  "CLIP_HARD": "H",
  "PAD": "P",
  "SEQUENCE_MATCH": "=",
  "SEQUENCE_MISMATCH": "X"
}
cigarStr = ""
for c in read.alignment.cigar
  cigarStr += c.operationLength + cigarMap[c.operation]
return cigarStr
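A possible Ruby rendering of this conversion is sketched below. CigarOp is a local Struct stand-in whose field names are assumed, and CIGAR_MAP and sam_cigar_string are hypothetical names chosen for this example.

```ruby
# Stand-in for a CIGAR unit; the field names are assumptions.
CigarOp = Struct.new(:operation, :operation_length)

# Operation names from the pseudocode mapped to SAM CIGAR letters.
CIGAR_MAP = {
  'ALIGNMENT_MATCH'   => 'M',
  'INSERT'            => 'I',
  'DELETE'            => 'D',
  'SKIP'              => 'N',
  'CLIP_SOFT'         => 'S',
  'CLIP_HARD'         => 'H',
  'PAD'               => 'P',
  'SEQUENCE_MATCH'    => '=',
  'SEQUENCE_MISMATCH' => 'X'
}.freeze

# Concatenates "<length><letter>" for each unit. Hash#fetch raises KeyError
# on an unknown operation instead of silently emitting a malformed string.
def sam_cigar_string(cigar)
  cigar.map { |c| "#{c.operation_length}#{CIGAR_MAP.fetch(c.operation)}" }.join
end

ops = [
  CigarOp.new('CLIP_SOFT', 2),
  CigarOp.new('ALIGNMENT_MATCH', 30),
  CigarOp.new('DELETE', 1)
]
puts sam_cigar_string(ops)   # prints 2S30M1D
```

String interpolation sidesteps the number-plus-string concatenation that the pseudocode glosses over.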

Instance Attribute Summary

#aligned_quality ⇒ Array<Fixnum>
The quality of the read sequence contained in this alignment record (equivalent to QUAL in SAM).
#aligned_sequence ⇒ String
The bases of the read sequence contained in this alignment record, without CIGAR operations applied (equivalent to SEQ in SAM).
#alignment ⇒ Google::Apis::GenomicsV1::LinearAlignment
A linear alignment can be represented by one CIGAR string.
#duplicate_fragment ⇒ Boolean (also: #duplicate_fragment?)
The fragment is a PCR or optical duplicate (SAM flag 0x400).
#failed_vendor_quality_checks ⇒ Boolean (also: #failed_vendor_quality_checks?)
Whether this read did not pass filters, such as platform or vendor quality controls (SAM flag 0x200).
#fragment_length ⇒ Fixnum
The observed length of the fragment, equivalent to TLEN in SAM.
#fragment_name ⇒ String
The fragment name.
#id ⇒ String
The server-generated read ID, unique across all reads.
#info ⇒ Hash<String,Array<Object>>
A map of additional read alignment information.
#next_mate_position ⇒ Google::Apis::GenomicsV1::Position
An abstraction for referring to a genomic position, in relation to some already known reference.
#number_reads ⇒ Fixnum
The number of reads in the fragment (extension to SAM flag 0x1).
#proper_placement ⇒ Boolean (also: #proper_placement?)
The orientation and the distance between reads from the fragment are consistent with the sequencing protocol (SAM flag 0x2).
#read_group_id ⇒ String
The ID of the read group this read belongs to.
#read_group_set_id ⇒ String
The ID of the read group set this read belongs to.
#read_number ⇒ Fixnum
The read number in sequencing.
#secondary_alignment ⇒ Boolean (also: #secondary_alignment?)
Whether this alignment is secondary.
#supplementary_alignment ⇒ Boolean (also: #supplementary_alignment?)
Whether this alignment is supplementary.

Instance Method Summary

#initialize(**args) ⇒ Read constructor
A new instance of Read.
#update!(**args) ⇒ Object
Update properties of this object.

Methods included from Core::JsonObjectSupport

#to_json

Methods included from Core::Hashable

process_value, #to_h

Constructor Details

#initialize(**args) ⇒ `Read`

Returns a new instance of Read.

# File 'generated/google/apis/genomics_v1/classes.rb', line 2112

def initialize(**args)
   update!(**args)
end

Instance Attribute Details

#aligned_quality ⇒ `Array<Fixnum>`

The quality of the read sequence contained in this alignment record (equivalent to QUAL in SAM). alignedSequence and alignedQuality may be shorter than the full read sequence and quality. This will occur if the alignment is part of a chimeric alignment, or if the read was trimmed. When this occurs, the CIGAR for this read will begin/end with a hard clip operator that will indicate the length of the excised sequence. Corresponds to the JSON property alignedQuality.

Returns:

(Array<Fixnum>)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2071

def aligned_quality
  @aligned_quality
end
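Since this field is described as equivalent to QUAL in SAM, a quality array can be rendered as a SAM QUAL string roughly as follows. The sketch assumes the values are Phred-scaled integers; sam_qual_string is a hypothetical helper name, not part of the generated client.

```ruby
# Hypothetical helper: encodes quality scores as a SAM QUAL string.
# SAM stores each Phred score as the ASCII character with code (score + 33)
# and uses "*" when no qualities are stored.
def sam_qual_string(aligned_quality)
  return '*' if aligned_quality.nil? || aligned_quality.empty?

  aligned_quality.map { |q| (q + 33).chr }.join
end

puts sam_qual_string([30, 40, 2])   # prints ?I#
```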

#aligned_sequence ⇒ `String`

The bases of the read sequence contained in this alignment record, without CIGAR operations applied (equivalent to SEQ in SAM). alignedSequence and alignedQuality may be shorter than the full read sequence and quality. This will occur if the alignment is part of a chimeric alignment, or if the read was trimmed. When this occurs, the CIGAR for this read will begin/end with a hard clip operator that will indicate the length of the excised sequence. Corresponds to the JSON property alignedSequence.

Returns:

(String)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2011

def aligned_sequence
  @aligned_sequence
end
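The hard-clip rule described for this attribute means the length of the full read can be estimated from the stored bases plus any hard clips at the ends of the CIGAR. The following is only a sketch: ClipUnit is a local Struct stand-in with assumed field names, and hard_clip_length and full_read_length are hypothetical helpers.

```ruby
# Stand-in for a CIGAR unit; the field names are assumptions.
ClipUnit = Struct.new(:operation, :operation_length)

# Number of bases removed by a hard clip in the given unit (0 if the unit is
# nil or is not a hard clip).
def hard_clip_length(unit)
  unit && unit.operation == 'CLIP_HARD' ? unit.operation_length.to_i : 0
end

# Length of the full read: stored bases plus hard-clipped bases at both ends.
def full_read_length(aligned_sequence, cigar)
  leading  = hard_clip_length(cigar.first)
  # Only look at the last unit when it is distinct from the first one.
  trailing = cigar.size > 1 ? hard_clip_length(cigar.last) : 0
  aligned_sequence.length + leading + trailing
end

cigar = [ClipUnit.new('CLIP_HARD', 40), ClipUnit.new('ALIGNMENT_MATCH', 60)]
puts full_read_length('A' * 60, cigar)   # prints 100
```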

#alignment ⇒ `Google::Apis::GenomicsV1::LinearAlignment`

A linear alignment can be represented by one CIGAR string. Describes the mapped position and local alignment of the read to the reference. Corresponds to the JSON property alignment.

Returns:

(Google::Apis::GenomicsV1::LinearAlignment)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2077

def alignment
  @alignment
end

#duplicate_fragment ⇒ `Boolean` Also known as: duplicate_fragment?

The fragment is a PCR or optical duplicate (SAM flag 0x400). Corresponds to the JSON property duplicateFragment.

Returns:

(Boolean)

# File 'generated/google/apis/genomics_v1/classes.rb', line 1985

def duplicate_fragment
  @duplicate_fragment
end

#failed_vendor_quality_checks ⇒ `Boolean` Also known as: failed_vendor_quality_checks?

Whether this read did not pass filters, such as platform or vendor quality controls (SAM flag 0x200). Corresponds to the JSON property failedVendorQualityChecks.

Returns:

(Boolean)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2059

def failed_vendor_quality_checks
  @failed_vendor_quality_checks
end

#fragment_length ⇒ `Fixnum`

The observed length of the fragment, equivalent to TLEN in SAM. Corresponds to the JSON property fragmentLength.

Returns:

(Fixnum)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2053

def fragment_length
  @fragment_length
end

#fragment_name ⇒ `String`

The fragment name. Equivalent to QNAME (query template name) in SAM. Corresponds to the JSON property fragmentName.

Returns:

(String)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2104

def fragment_name
  @fragment_name
end

#id ⇒ `String`

The server-generated read ID, unique across all reads. This is different from the fragmentName. Corresponds to the JSON property id.

Returns:

(String)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2083

def id
  @id
end

#info ⇒ `Hash<String,Array<Object>>`

A map of additional read alignment information. This must be of the form map<string, string[]> (string key mapping to a list of string values). Corresponds to the JSON property info.

Returns:

(Hash<String,Array<Object>>)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2017

def info
  @info
end
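Because every key in this map points to a list, single-valued entries need unwrapping. A minimal sketch follows; first_info_value is a hypothetical helper, and the keys 'NM' and 'XA' are illustrative only, since the keys actually present depend on the data.

```ruby
# Hypothetical helper: first value stored under a key in an info map, or a
# default when the map is nil, the key is missing, or its list is empty.
def first_info_value(info, key, default = nil)
  values = (info || {})[key]
  values.nil? || values.empty? ? default : values.first
end

info = { 'NM' => ['2'], 'XA' => [] }
puts first_info_value(info, 'NM')          # prints 2
puts first_info_value(info, 'XA', 'n/a')   # prints n/a
```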

#next_mate_position ⇒ `Google::Apis::GenomicsV1::Position`

An abstraction for referring to a genomic position, in relation to some already known reference. For now, represents a genomic position as a reference name, a base number on that reference (0-based), and a determination of forward or reverse strand. Corresponds to the JSON property nextMatePosition.

Returns:

(Google::Apis::GenomicsV1::Position)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2025

def next_mate_position
  @next_mate_position
end

#number_reads ⇒ `Fixnum`

The number of reads in the fragment (extension to SAM flag 0x1). Corresponds to the JSON property numberReads.

Returns:

(Fixnum)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2088

def number_reads
  @number_reads
end

#proper_placement ⇒ `Boolean` Also known as: proper_placement?

The orientation and the distance between reads from the fragment are consistent with the sequencing protocol (SAM flag 0x2). Corresponds to the JSON property properPlacement.

Returns:

(Boolean)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2047

def proper_placement
  @proper_placement
end

#read_group_id ⇒ `String`

The ID of the read group this read belongs to. A read belongs to exactly one read group. This is a server-generated ID which is distinct from SAM's RG tag (for that value, see ReadGroup.name). Corresponds to the JSON property readGroupId.

Returns:

(String)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2000

def read_group_id
  @read_group_id
end

#read_group_set_id ⇒ `String`

The ID of the read group set this read belongs to. A read belongs to exactly one read group set. Corresponds to the JSON property readGroupSetId.

Returns:

(String)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2110

def read_group_set_id
  @read_group_set_id
end

#read_number ⇒ `Fixnum`

The read number in sequencing. 0-based and less than numberReads. This field replaces SAM flags 0x40 and 0x80. Corresponds to the JSON property readNumber.

Returns:

(Fixnum)

# File 'generated/google/apis/genomics_v1/classes.rb', line 1992

def read_number
  @read_number
end

#secondary_alignment ⇒ `Boolean` Also known as: secondary_alignment?

Whether this alignment is secondary. Equivalent to SAM flag 0x100. A secondary alignment represents an alternative to the primary alignment for this read. Aligners may return secondary alignments if a read can map ambiguously to multiple coordinates in the genome. By convention, each read has one and only one alignment where both secondaryAlignment and supplementaryAlignment are false. Corresponds to the JSON property secondaryAlignment.

Returns:

(Boolean)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2098

def secondary_alignment
  @secondary_alignment
end

#supplementary_alignment ⇒ `Boolean` Also known as: supplementary_alignment?

Whether this alignment is supplementary. Equivalent to SAM flag 0x800. Supplementary alignments are used in the representation of a chimeric alignment. In a chimeric alignment, a read is split into multiple linear alignments that map to different reference contigs. The first linear alignment in the read will be designated as the representative alignment; the remaining linear alignments will be designated as supplementary alignments. These alignments may have different mapping quality scores. In each linear alignment in a chimeric alignment, the read will be hard clipped. The alignedSequence and alignedQuality fields in the alignment record will only represent the bases for its respective linear alignment. Corresponds to the JSON property supplementaryAlignment.

Returns:

(Boolean)

# File 'generated/google/apis/genomics_v1/classes.rb', line 2040

def supplementary_alignment
  @supplementary_alignment
end
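Several attributes above name the SAM flag bit they correspond to. As a sketch of how those bits could be recombined, the example below builds a partial SAM FLAG value. ReadFlags is a local Struct stand-in and sam_flag_bits is a hypothetical helper. Bits that depend on alignment and next_mate_position (0x4, 0x8, 0x10, 0x20) are deliberately left out, and the handling of 0x40 and 0x80 (first and last read of a multi-read fragment) is an assumption based on the SAM flag definitions.

```ruby
# Stand-in carrying only the attributes used below.
ReadFlags = Struct.new(:number_reads, :read_number, :proper_placement,
                       :secondary_alignment, :failed_vendor_quality_checks,
                       :duplicate_fragment, :supplementary_alignment,
                       keyword_init: true)

# Partial SAM FLAG built from the bits named in the attribute descriptions.
def sam_flag_bits(read)
  multi = read.number_reads.to_i > 1
  flag = 0
  flag |= 0x1   if multi
  flag |= 0x2   if read.proper_placement
  flag |= 0x40  if multi && read.read_number == 0
  flag |= 0x80  if multi && read.read_number == read.number_reads - 1
  flag |= 0x100 if read.secondary_alignment
  flag |= 0x200 if read.failed_vendor_quality_checks
  flag |= 0x400 if read.duplicate_fragment
  flag |= 0x800 if read.supplementary_alignment
  flag
end

read = ReadFlags.new(number_reads: 2, read_number: 0,
                     proper_placement: true, duplicate_fragment: true)
puts format('0x%x', sam_flag_bits(read))   # prints 0x443
```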

Instance Method Details

#update!(**args) ⇒ `Object`

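Updates the properties of this object. Only the keys present in args are assigned; attributes whose keys are absent keep their current values.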

# File 'generated/google/apis/genomics_v1/classes.rb', line 2117

def update!(**args)
  @duplicate_fragment = args[:duplicate_fragment] if args.key?(:duplicate_fragment)
  @read_number = args[:read_number] if args.key?(:read_number)
  @read_group_id = args[:read_group_id] if args.key?(:read_group_id)
  @aligned_sequence = args[:aligned_sequence] if args.key?(:aligned_sequence)
  @info = args[:info] if args.key?(:info)
  @next_mate_position = args[:next_mate_position] if args.key?(:next_mate_position)
  @supplementary_alignment = args[:supplementary_alignment] if args.key?(:supplementary_alignment)
  @proper_placement = args[:proper_placement] if args.key?(:proper_placement)
  @fragment_length = args[:fragment_length] if args.key?(:fragment_length)
  @failed_vendor_quality_checks = args[:failed_vendor_quality_checks] if args.key?(:failed_vendor_quality_checks)
  @aligned_quality = args[:aligned_quality] if args.key?(:aligned_quality)
  @alignment = args[:alignment] if args.key?(:alignment)
  @id = args[:id] if args.key?(:id)
  @number_reads = args[:number_reads] if args.key?(:number_reads)
  @secondary_alignment = args[:secondary_alignment] if args.key?(:secondary_alignment)
  @fragment_name = args[:fragment_name] if args.key?(:fragment_name)
  @read_group_set_id = args[:read_group_set_id] if args.key?(:read_group_set_id)
end
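The args.key? guards mean that a key passed with an explicit nil still overwrites the attribute, while an omitted key leaves it alone. The sketch below reproduces that behavior with a hypothetical two-attribute class, MiniRead, so that it runs without the client library.

```ruby
# Hypothetical class following the same initialize/update! pattern.
class MiniRead
  attr_accessor :id, :fragment_name

  def initialize(**args)
    update!(**args)
  end

  # Assigns only the attributes whose keys were passed.
  def update!(**args)
    @id = args[:id] if args.key?(:id)
    @fragment_name = args[:fragment_name] if args.key?(:fragment_name)
  end
end

r = MiniRead.new(id: 'r1', fragment_name: 'frag-7')
r.update!(fragment_name: 'frag-8')   # id is left untouched
r.update!(id: nil)                   # an explicit nil still overwrites
p [r.id, r.fragment_name]            # prints [nil, "frag-8"]
```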
