Update data model, regex, grammar, and JSON schema and add graphic generation code for isotopic fine structure #18
@douweschulte this implements the annotation data model and parsing. If you want to go into how you think the calculation should be done, go ahead. Otherwise this should be ready to go.
The figure looks excellent! Are you able to compute and annotate the difference in ppm between the centroid and 13C for the resolving power 50000 peak?
edeutsch left a comment
I think I'd suggest allowing any amino acid letter A-Z, but otherwise great, thanks!
@@ -38,6 +38,14 @@ NUMBER : DIGIT ("." (DIGIT)+)?
AMINO_ACID : "A" | "R" | "N" | "D" | "C" | "E" | "Q" | "G" | "H" | "K" | "M" | "F" | "P" | "S" | "T" | "W" | "Y" | "V" | "I" | "L"
Is there a strong reason to limit ourselves to these? The next most common would be selenocysteine (U). Plenty of human proteins have selenocysteine, e.g.:
https://www.uniprot.org/uniprotkb/P07203/entry#sequences
UWPR also defines O:
https://proteomicsresource.washington.edu/protocols06/masses.php
J is defined as I or L:
https://www.bioinformatics.org/sms2/iupac.html
for which you can calculate an exact mass.
One could argue that you can't calculate a mass for the rest because they're ambiguous.
But still, why not allow them?
Oh, I think ProForma formally encourages X to have a 0 mass and you can do X[+123.0222] to specify some other artificial amino acid (of which there are plenty?) that might be common in synthetic peptides.
Why not just allow all A-Z?
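To make the A-Z suggestion concrete, here is a minimal Python sketch of what "accept any letter, then decide what to do about mass" could look like. It is not the grammar or regex from this PR: `AMINO_ACID_PATTERN`, `classify_residue`, and the three category names are hypothetical, and the grouping simply follows the comment above (U, O, and J have a single well-defined mass; B, Z, and X do not).

```python
import re

# The 20 letters currently listed in the AMINO_ACID grammar rule.
STANDARD = frozenset("ARNDCEQGHKMFPSTWYVIL")

# Extra letters with a single well-defined mass:
# U (selenocysteine), O (pyrrolysine in the IUPAC one-letter code),
# J (I or L, which are isomers and therefore share one mass).
EXACT_MASS_EXTRAS = frozenset("UOJ")

# Letters without a single mass: B (D or N), Z (E or Q), X (any residue).
AMBIGUOUS = frozenset("BZX")

# Hypothetical relaxed pattern: any single uppercase ASCII letter.
AMINO_ACID_PATTERN = re.compile(r"[A-Z]")


def classify_residue(letter: str) -> str:
    """Return 'standard', 'exact_mass', or 'ambiguous' for one residue letter."""
    if AMINO_ACID_PATTERN.fullmatch(letter) is None:
        raise ValueError(f"not a residue letter: {letter!r}")
    if letter in STANDARD:
        return "standard"
    if letter in EXACT_MASS_EXTRAS:
        return "exact_mass"
    return "ambiguous"
```

With this split the parser stays permissive, and whether an ambiguous letter is an error becomes a decision for the mass calculation step rather than for the grammar.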
I missed the first comment and the question in it. In this example, the For instance, if you were to use a different composition like C23H14Cl2N6O8S2 instead, you'd get: which has a So the difference is vanishingly small, but in specific scenarios at high resolution there is information available that would be lost otherwise. Computing the Of course, if you're using a text encoding of a floating point number, assuming everything beyond the fourth or fifth significant figure is noise is pragmatic and is unlikely to make a substantial difference in any case. |
Sleep deprivation, request for clarification: did you want the change in m/z between the average peak and the isotopic peak to be included in the figure directly? It'll be ugly, I'd rather show that as a table. Too much line noise in small text disturbing the smooth beauty of the spectrum itself.
Hi @mobiusklein, it might be stacking the deck a bit with extra sulfurs and nitrogens, but how about something like y4{CMNR}, which I think is something like C18H35N8O6S2? I wonder if that delta may be as much as 1 ppm?
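For a rough sense of scale, the delta for that composition can be estimated with a short Python sketch. This is not the notebook code from this PR, and it rests on several assumptions: the composition is taken exactly as written and treated as a neutral mass (charge carrier ignored), the M+1 fine structure is assumed to be completely unresolved so the observed centroid is the abundance-weighted mean of the single-substitution isotopologues, and the isotope masses and natural abundances are standard reference values typed in by hand. All function names are hypothetical.

```python
# element: (light isotope mass, M+1 isotope mass, light abundance, M+1 abundance)
ISOTOPES = {
    "C": (12.0, 13.00335483507, 0.9893, 0.0107),
    "H": (1.00782503223, 2.01410177812, 0.999885, 0.000115),
    "N": (14.00307400443, 15.00010889888, 0.99636, 0.00364),
    "O": (15.99491461957, 16.99913175650, 0.99757, 0.00038),
    "S": (31.9720711744, 32.9714589098, 0.9499, 0.0075),
}

# Composition suggested in the comment above (assumed, not verified here).
COMPOSITION = {"C": 18, "H": 35, "N": 8, "O": 6, "S": 2}


def monoisotopic_mass(comp):
    """Sum of the lightest-isotope masses for the composition."""
    return sum(n * ISOTOPES[el][0] for el, n in comp.items())


def m_plus_1_centroid_shift(comp):
    """Abundance-weighted mean mass shift of the M+1 isotopologues."""
    weighted_shift = 0.0
    total_weight = 0.0
    for el, n in comp.items():
        light, heavy, a_light, a_heavy = ISOTOPES[el]
        # Intensity of "one heavy atom of this element" relative to monoisotopic.
        weight = n * a_heavy / a_light
        weighted_shift += weight * (heavy - light)
        total_weight += weight
    return weighted_shift / total_weight


def ppm_offset_from_13c(comp):
    """ppm difference between the unresolved M+1 centroid and the pure 13C peak."""
    mono = monoisotopic_mass(comp)
    c13_peak = mono + (ISOTOPES["C"][1] - ISOTOPES["C"][0])
    centroid = mono + m_plus_1_centroid_shift(comp)
    return (centroid - c13_peak) / c13_peak * 1e6


if __name__ == "__main__":
    print(f"{ppm_offset_from_13c(COMPOSITION):.2f} ppm")
```

The sign of the result shows which way the centroid is pulled: 15N and 33S sit below the 13C peak, while 2H and 17O sit above it, so a nitrogen- and sulfur-rich composition shifts the centroid downward.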
great, thanks. I'd favor the more extreme so that it's clearer that it can make a substantial difference, but whatever you prefer is fine. thanks!








This covers the discussion from the previous two sessions on isotopic fine structure support. It also adds notebook code for generating a figure to demonstrate the effect of resolution on peak shape and isotopic fine structure decomposition.
Current TODOs:
+xiA