Chemical composition

class pyqms.ChemicalComposition(sequence=None, aa_compositions=None, isotopic_distributions=None)

Chemical composition class. The actual sequence or formula can be reset using the use function.

Keyword Arguments:
 
  • sequence (str) – Peptide or chemical formula sequence
  • aa_compositions (Optional[dict]) – amino acid compositions
  • isotopic_distributions (Optional[dict]) – isotopic distributions

Keyword argument examples:

  • sequence

This can for example be:

molecules = [
    '+H2O2H2-OH',
    '+{0}'.format('H2O'),
    '{peptide}'.format(peptide='ELVISLIVES'),
    '{peptide}+{0}'.format('PO3', peptide='ELVISLIVES'),
    '{peptide}#{unimod}:{pos}'.format(
        peptide = 'ELVISLIVES',
        unimod = 'Oxidation',
        pos = 1
    )
]
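
The strings above combine up to three parts: an optional peptide, chemical formulas prefixed with + or -, and a modification with its position after the # sign. The following is a minimal sketch of how such a string could be taken apart, assuming the grammar is exactly what these examples show. split_sequence is a hypothetical helper for illustration and is not part of pyqms.

```python
import re

# Hypothetical helper for illustration only; not part of pyqms.
# Assumed grammar, inferred from the examples above:
#   [PEPTIDE][+FORMULA | -FORMULA]...[#MODIFICATION:POSITION]
CHUNK = re.compile(r'([+-])([A-Za-z0-9]+)')


def split_sequence(sequence):
    """Return (peptide, signed formula chunks, modification or None)."""
    core, _, mod_part = sequence.partition('#')
    # Everything before the first sign is treated as the peptide.
    first_sign = re.search(r'[+-]', core)
    peptide = core[:first_sign.start()] if first_sign else core
    chunks = CHUNK.findall(core[len(peptide):])
    modification = None
    if mod_part:
        name, _, pos = mod_part.rpartition(':')
        modification = (name, int(pos))
    return peptide, chunks, modification


print(split_sequence('ELVISLIVES#Oxidation:1'))
print(split_sequence('+H2O2H2-OH'))
```

The position is kept as a 1-based integer, in line with the numbering used by the per-position attributes of the class.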

Examples

>>> c = pyqms.ChemicalComposition()
>>> c.use("ELVISLIVES#Acetyl:1")
>>> c.hill_notation()
'C52H90N10O18'
>>> c.hill_notation_unimod()
'C(52)H(90)N(10)O(18)'
>>> c
{'O': 18, 'H': 90, 'C': 52, 'N': 10}
>>> c.composition_of_mod_at_pos[1]
defaultdict(<class 'int'>, {'O': 1, 'H': 2, 'C': 2})
>>> c.composition_of_aa_at_pos[1]
{'O': 3, 'H': 7, 'C': 5, 'N': 1}
>>> c.composition_at_pos[1]
defaultdict(<class 'int'>, {'O': 4, 'H': 9, 'C': 7, 'N': 1})
>>> c = pyqms.ChemicalComposition('+H2O2H2')
>>> c
{'O': 2, 'H': 4}
>>> c.subtract_chemical_formula('H3')
>>> c
{'O': 2, 'H': 1}

Note

We did not include mass calculation, since pyQms will do it much more accurately using unimod and other element enrichments.

add_chemical_formula(chemical_formula)

Adds chemical formula to the instance

Chemical formula can be a string or a dictionary with the element count.

For example:

chemical_formula = 'C18H36N9O18'
chemical_formula = {
    'C' : 18,
    'H' : 36,
    'N' : 9,
    'O' : 18
}
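
The two forms above describe the same composition. As a rough sketch of how a formula string can be turned into the dictionary form and then added to an existing composition, consider the following. parse_formula and add_formula are hypothetical stand-ins written for this example; pyqms uses its own parser, which may handle more cases (for instance isotope labels).

```python
import re
from collections import defaultdict

# Hypothetical helpers for illustration; pyqms has its own parser.
ELEMENT_COUNT = re.compile(r'([A-Z][a-z]?)(\d*)')


def parse_formula(formula):
    """Turn 'C18H36N9O18' into {'C': 18, 'H': 36, 'N': 9, 'O': 18}."""
    counts = defaultdict(int)
    for element, number in ELEMENT_COUNT.findall(formula):
        # A missing number means a single atom, e.g. the 'O' in 'H2O'.
        counts[element] += int(number) if number else 1
    return dict(counts)


def add_formula(composition, chemical_formula):
    """Add a formula (str or dict) to a composition dict in place."""
    if isinstance(chemical_formula, str):
        chemical_formula = parse_formula(chemical_formula)
    for element, count in chemical_formula.items():
        composition[element] = composition.get(element, 0) + count
    return composition


from_string = add_formula({}, 'C18H36N9O18')
from_dict = add_formula({}, {'C': 18, 'H': 36, 'N': 9, 'O': 18})
print(from_string == from_dict)
```
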
add_peptide(peptide)

Adds peptide sequence to the instance.

Note

Only standard amino acids can be processed. Special amino acids (such as U or F) have to be added to knowledge_base.py before they can be used.

clear()

Resets all lookup dictionaries and self

One class instance can be used to analyse a series of sequences, thereby avoiding class instantiation overhead.

Warning

Make sure to reset the instance for each new sequence when reusing it in a loop. Chemical formulas (elemental compositions) will accumulate if the instance is not reset.
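
The effect described in the warning can be shown with a toy stand-in. ToyComposition below is not the pyqms class; it is a hypothetical dict subclass that only mimics the add-then-accumulate behaviour, so the example runs without pyqms installed.

```python
# Toy stand-in for illustration only; it mimics the accumulate/clear
# behaviour described above and is NOT the pyqms implementation.
class ToyComposition(dict):
    def add_chemical_formula(self, chemical_formula):
        # Only the dict form is handled to keep the sketch short.
        for element, count in chemical_formula.items():
            self[element] = self.get(element, 0) + count


water = {'H': 2, 'O': 1}
ammonia = {'N': 1, 'H': 3}

# Without a reset, the second formula is added on top of the first.
accumulated = ToyComposition()
for formula in (water, ammonia):
    accumulated.add_chemical_formula(formula)

# With clear() at the start of each iteration, every formula is
# analysed on its own.
results = []
reused = ToyComposition()
for formula in (water, ammonia):
    reused.clear()
    reused.add_chemical_formula(formula)
    results.append(dict(reused))

print(accumulated)
print(results)
```
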

composition_at_pos = None

chemical composition at given peptide position, including modifications (available if a peptide sequence was used as input or set via the use function)

Note

Numbering starts at position 1, since all PSM search engines use this nomenclature.

Type:dict
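
The three per-position attributes are related: judging from the doctest for 'ELVISLIVES#Acetyl:1' in the class examples, the entry in composition_at_pos is the element-wise sum of the amino acid and modification entries at the same position. The sketch below reproduces that sum with plain dictionaries. combine_at_pos is a hypothetical helper and the values are copied from that doctest.

```python
from collections import defaultdict

# Values copied from the doctest for 'ELVISLIVES#Acetyl:1'.
composition_of_aa_at_pos = {1: {'O': 3, 'H': 7, 'C': 5, 'N': 1}}
composition_of_mod_at_pos = {1: {'O': 1, 'H': 2, 'C': 2}}


def combine_at_pos(pos):
    """Sum amino acid and modification counts at a 1-based position."""
    total = defaultdict(int)
    for source in (composition_of_aa_at_pos, composition_of_mod_at_pos):
        # Positions without an entry simply contribute nothing.
        for element, count in source.get(pos, {}).items():
            total[element] += count
    return dict(total)


print(combine_at_pos(1))
```
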
composition_of_aa_at_pos = None

chemical composition of amino acid at given peptide position (available if a peptide sequence was used as input or set via the use function)

Note

Numbering starts at position 1, since all PSM search engines use this nomenclature.

Type:dict
composition_of_mod_at_pos = None

chemical composition of unimod modifications at given position (available if a peptide sequence was used as input or set via the use function)

Note

Numbering starts at position 1, since all PSM search engines use this nomenclature.

Examples:

c.composition_of_mod_at_pos[1] = {
    '15N': 2, '13C': 6, 'N': -2, 'C': -6
}
Type:dict
hill_notation(include_ones=False, cc=None)

Formats chemical composition into Hill notation string.

Parameters:cc (dict, optional) – elemental composition dict
Returns:Hill notation format of self.
For example:
'C50H88N10O17'
Return type:str
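
For readers unfamiliar with the convention, the following is a minimal sketch of the standard Hill ordering: carbon first, hydrogen second, all other elements alphabetically, and a purely alphabetical order when no carbon is present. format_hill is a hypothetical function, not the pyqms method, and the handling of include_ones (writing out counts of 1) is an assumption based on the parameter name.

```python
def format_hill(composition, include_ones=False):
    """Format an element-count dict as a Hill notation string."""
    # Carbon first, hydrogen second, everything else alphabetically.
    # Without carbon, the Hill system sorts all elements (including H)
    # alphabetically.
    if 'C' in composition:
        rest = sorted(e for e in composition if e not in ('C', 'H'))
        order = [e for e in ('C', 'H') if e in composition] + rest
    else:
        order = sorted(composition)
    parts = []
    for element in order:
        count = composition[element]
        # Assumption: a count of 1 is only written when include_ones is set.
        if count == 1 and not include_ones:
            parts.append(element)
        else:
            parts.append('{0}{1}'.format(element, count))
    return ''.join(parts)


print(format_hill({'O': 17, 'H': 88, 'C': 50, 'N': 10}))
```
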
hill_notation_unimod(cc=None)

Formats chemical composition into Hill notation string adding unimod features.

Parameters:cc (dict, optional) – elemental composition dict
Returns:Hill notation format including unimod format rules of self.
For example:
'C(50)H(88)N(10)O(17)'
'C(50)H(88)14N(1)N(9)O(17)'
Return type:str
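
The bracketed format can be sketched as follows. format_unimod_hill is a hypothetical function written for this example, not the pyqms method. It assumes that C and H come first and that the remaining keys are sorted as plain strings, which places an isotope label such as '14N' before 'N' as in the second example above.

```python
def format_unimod_hill(composition):
    """Format counts as 'C(50)H(88)...' with every count in brackets."""
    majors = [e for e in ('C', 'H') if e in composition]
    # Plain string sorting puts digits before letters, so '14N' sorts
    # ahead of 'N'. This ordering rule is an assumption.
    rest = sorted(e for e in composition if e not in ('C', 'H'))
    return ''.join(
        '{0}({1})'.format(element, composition[element])
        for element in majors + rest
    )


print(format_unimod_hill({'C': 50, 'H': 88, 'N': 10, 'O': 17}))
print(format_unimod_hill({'C': 50, 'H': 88, '14N': 1, 'N': 9, 'O': 17}))
```
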
subtract_chemical_formula(chemical_formula)

Subtracts chemical formula from instance.
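
Subtraction is the element-wise counterpart of addition. The sketch below works on plain dictionaries and reproduces the '+H2O2H2' minus 'H3' doctest from the class examples. subtract_formula is a hypothetical helper; whether pyqms keeps zero or negative counts in the same way is not stated here, so treat that part as an assumption.

```python
def subtract_formula(composition, chemical_formula):
    """Subtract element counts (given as a dict) from a composition."""
    result = dict(composition)
    for element, count in chemical_formula.items():
        # Elements missing from the composition start from zero, so the
        # result can go negative (as in isotope-label modifications).
        result[element] = result.get(element, 0) - count
    return result


print(subtract_formula({'O': 2, 'H': 4}, {'H': 3}))
```
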

subtract_peptide(peptide)

Subtracts peptide (chemical formula) from instance.

use(sequence)

Re-initialize the class with a new sequence

This is helpful if one wants to use the same class instance for multiple sequences, since it removes class instantiation overhead.

Parameters:sequence (str) – Peptide or chemical formula sequence

Note

Will clear the current chemical composition dict!