
Commit 701cd83

Book 6: ModFinder Module
1 parent 671109c commit 701cd83

File tree

7 files changed

+331
-0
lines changed


README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,8 @@ Book 4: [The Genomics Module](genomics/README.md), working with genomic data.
2424

2525
Book 5: [The Protein-Disorder Module](protein-disorder/README.md), predicting protein-disorder.
2626

27+
Book 6: [The ModFinder Module](modfinder/README.md), identifying protein modifications in 3D structures.
28+
2729
## License
2830

2931
The content of this tutorial is available under the [CC-BY](http://creativecommons.org/licenses/by/3.0/) license.

modfinder/README.md

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
The ModFinder Module of BioJava
2+
=====================================================
3+
4+
A tutorial for the modfinder module of [BioJava](http://www.biojava.org)
5+
6+
## About
7+
<table>
8+
<tr>
9+
<td>
10+
<img src='https://cloud.githubusercontent.com/assets/840895/22190971/fe5cd304-e0f4-11e6-9eb5-c1b071312081.png'>
11+
</td>
12+
<td>
13+
The <i>modfinder</i> module of BioJava provides an API for identifying protein pre-, co-, and post-translational modifications in structures.
14+
15+
</td>
16+
</tr>
17+
</table>
18+
19+
## Index
20+
21+
This tutorial is split into several chapters.
22+
23+
Chapter 1 - Quick [Installation](installation.md)
24+
25+
Chapter 2 - [How to get the list of supported protein modifications](supported-protein-modifications.md)
26+
27+
Chapter 3 - [How to identify protein modifications in a structure](identify-protein-modifications.md)
28+
29+
Chapter 4 - [How to define a new protein modification](add-protein-modification.md)
30+
31+
## Please cite
32+
33+
**BioJava: an open-source framework for bioinformatics in 2012**<br/>
34+
*Andreas Prlic; Andrew Yates; Spencer E. Bliven; Peter W. Rose; Julius Jacobsen; Peter V. Troshin; Mark Chapman; Jianjiong Gao; Chuan Hock Koh; Sylvain Foisy; Richard Holland; Gediminas Rimsa; Michael L. Heuer; H. Brandstatter-Muller; Philip E. Bourne; Scooter Willis* <br/>
35+
[Bioinformatics (2012) 28 (20): 2693-2695.](http://bioinformatics.oxfordjournals.org/content/28/20/2693.abstract) <br/>
36+
[![doi](http://img.shields.io/badge/doi-10.1093%2Fbioinformatics%2Fbts494-blue.svg?style=flat)](http://bioinformatics.oxfordjournals.org/content/28/20/2693.abstract) [![pubmed](http://img.shields.io/badge/pubmed-22877863-blue.svg?style=flat)](http://www.ncbi.nlm.nih.gov/pubmed/22877863)
37+
38+
39+
## License
40+
41+
The content of this tutorial is available under the [CC-BY](http://creativecommons.org/licenses/by/3.0/) license.
42+
43+
[view license](../license.md)
44+
45+
46+
47+
<!--automatically generated footer-->
48+
49+
---
50+
51+
Navigation:
52+
[Home](../README.md)
53+
| Book 6: The ModFinder Module
54+
55+
Prev: [Book 5: The Protein-Disorder Module](../protein-disorder/README.md)

modfinder/add-protein-modification.md

Lines changed: 90 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,90 @@
1+
How to define a new protein modification?
2+
===
3+
4+
The protmod module automatically loads [a list of protein modifications](supported-protein-modifications.md) into the protein modification registry. If a modification you need is not preloaded, you can define it yourself and add it to the registry.
5+
6+
## Example: define and register a disulfide bond in Java code
7+
8+
```java
9+
// define the involved components, in this case two cysteines (CYS)
10+
List<Component> components = new ArrayList<Component>(2);
11+
components.add(Component.of("CYS"));
12+
components.add(Component.of("CYS"));
13+
14+
// define the atom linkages between the components, in this case the SG atoms on both CYS groups
15+
ModificationLinkage linkage = new ModificationLinkage(components, 0, "SG", 1, "SG");
16+
17+
// define the modification condition, i.e. what components are involved and what atoms are linked between them
18+
ModificationCondition condition = new ModificationConditionImpl(components, Collections.singletonList(linkage));
19+
20+
// build a modification
21+
ProteinModification mod =
22+
new ProteinModificationImpl.Builder("0018_test",
23+
ModificationCategory.CROSS_LINK_2,
24+
ModificationOccurrenceType.NATURAL,
25+
condition)
26+
.setDescription("A protein modification that effectively cross-links two L-cysteine residues to form L-cystine.")
27+
.setFormula("C 6 H 8 N 2 O 2 S 2")
28+
.setResidId("AA0025")
29+
.setResidName("L-cystine")
30+
.setPsimodId("MOD:00034")
31+
.setPsimodName("L-cystine (cross-link)")
32+
.setSystematicName("(R,R)-3,3'-disulfane-1,2-diylbis(2-aminopropanoic acid)")
33+
.addKeyword("disulfide bond")
34+
.addKeyword("redox-active center")
35+
.build();
36+
37+
// register the modification
38+
ProteinModificationRegistry.register(mod);
39+
```
40+
41+
## Example: define a disulfide bond in an XML file and register it from Java code
42+
```xml
43+
<ProteinModifications>
44+
<Entry>
45+
<Id>0018</Id>
46+
<Description>A protein modification that effectively cross-links two L-cysteine residues to form L-cystine.</Description>
47+
<SystematicName>(R,R)-3,3'-disulfane-1,2-diylbis(2-aminopropanoic acid)</SystematicName>
48+
<CrossReference>
49+
<Source>RESID</Source>
50+
<Id>AA0025</Id>
51+
<Name>L-cystine</Name>
52+
</CrossReference>
53+
<CrossReference>
54+
<Source>PSI-MOD</Source>
55+
<Id>MOD:00034</Id>
56+
<Name>L-cystine (cross-link)</Name>
57+
</CrossReference>
58+
<Condition>
59+
<Component component="1">
60+
<Id source="PDBCC">CYS</Id>
61+
</Component>
62+
<Component component="2">
63+
<Id source="PDBCC">CYS</Id>
64+
</Component>
65+
<Bond>
66+
<Atom component="1">SG</Atom>
67+
<Atom component="2">SG</Atom>
68+
</Bond>
69+
</Condition>
70+
<Occurrence>natural</Occurrence>
71+
<Category>crosslink2</Category>
72+
<Keyword>redox-active center</Keyword>
73+
<Keyword>disulfide bond</Keyword>
74+
</Entry>
75+
</ProteinModifications>
76+
```
77+
78+
```java
79+
// open the XML definition file and register every modification it contains
FileInputStream fis = new FileInputStream("path/to/file");
80+
ProteinModificationXmlReader.registerProteinModificationFromXml(fis);
81+
```
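
To make the structure of such an entry more concrete, the following sketch reads a trimmed-down copy of the XML above with the JDK's built-in DOM parser and pulls out the ID, category, keywords, and bonded atoms. It is only an illustration of the file layout: the class `XmlEntrySketch` and its helper methods are hypothetical and say nothing about how `ProteinModificationXmlReader` is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of BioJava) that inspects one <Entry> element.
class XmlEntrySketch {

    // Shortened version of the disulfide bond entry shown above.
    static final String SAMPLE =
        "<ProteinModifications><Entry>"
      + "<Id>0018</Id>"
      + "<Condition>"
      + "<Component component=\"1\"><Id source=\"PDBCC\">CYS</Id></Component>"
      + "<Component component=\"2\"><Id source=\"PDBCC\">CYS</Id></Component>"
      + "<Bond><Atom component=\"1\">SG</Atom><Atom component=\"2\">SG</Atom></Bond>"
      + "</Condition>"
      + "<Occurrence>natural</Occurrence>"
      + "<Category>crosslink2</Category>"
      + "<Keyword>redox-active center</Keyword>"
      + "<Keyword>disulfide bond</Keyword>"
      + "</Entry></ProteinModifications>";

    // Parses the XML string and returns its first <Entry> element.
    static Element firstEntry(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return (Element) doc.getElementsByTagName("Entry").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects the trimmed text of every descendant of parent with the given
    // tag name, in document order.
    static List<String> texts(Element parent, String tag) {
        List<String> out = new ArrayList<>();
        NodeList nodes = parent.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) {
        Element entry = firstEntry(SAMPLE);
        // The entry's own <Id> comes first in document order; later <Id>
        // elements belong to the components.
        System.out.println("id       = " + texts(entry, "Id").get(0));
        System.out.println("category = " + texts(entry, "Category").get(0));
        System.out.println("keywords = " + texts(entry, "Keyword"));
        Element bond = (Element) entry.getElementsByTagName("Bond").item(0);
        System.out.println("bond     = " + texts(bond, "Atom"));
    }
}
```

Note that `<Id>` appears at several nesting levels (entry, cross-reference, component), so any real reader has to distinguish them by their parent element rather than by tag name alone.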
82+
83+
84+
Navigation:
85+
[Home](../README.md)
86+
| [Book 6: The ModFinder Module](README.md)
87+
| Chapter 4 - How to define a new protein modification
88+
89+
Prev: [Chapter 3 : How to identify protein modifications in a structure](identify-protein-modifications.md)
90+

modfinder/identify-protein-modifications.md

Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
How to identify protein modifications in a structure?
2+
===
3+
4+
## Example: Identify and print all preloaded modifications from a structure
5+
6+
```java
7+
Set<ModifiedCompound> identifyAllModfications(Structure struc) {
8+
ProteinModificationIdentifier parser = new ProteinModificationIdentifier();
9+
parser.identify(struc);
10+
Set<ModifiedCompound> mcs = parser.getIdentifiedModifiedCompound();
11+
return mcs;
12+
}
13+
```
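
To identify a cross-link in a 3D structure, some test is needed that decides from atomic coordinates whether two atoms are bonded. One simple possibility is a distance check against an expected bond length with a tolerance. The sketch below shows that idea only; it is not BioJava's algorithm, the class `LinkageDistanceSketch` is hypothetical, and the 0.4 Å tolerance is an arbitrary assumption. The expected S–S distance of about 2.05 Å is the usual textbook value for a disulfide bond.

```java
// Hypothetical geometric test for a linkage between two atoms.
// Illustrative only; this is not how ProteinModificationIdentifier is implemented.
class LinkageDistanceSketch {

    // Euclidean distance between two points given as {x, y, z} in angstroms.
    static double distance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        double dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // True if the distance between the atoms is within expectedLength +/- tolerance.
    static boolean isLinked(double[] a, double[] b, double expectedLength, double tolerance) {
        return Math.abs(distance(a, b) - expectedLength) <= tolerance;
    }

    public static void main(String[] args) {
        double[] sg1 = {0.0, 0.0, 0.0};
        double[] sg2 = {2.0, 0.0, 0.0}; // close enough to form an S-S bond
        double[] sg3 = {5.0, 0.0, 0.0}; // too far away

        System.out.println(isLinked(sg1, sg2, 2.05, 0.4)); // prints "true"
        System.out.println(isLinked(sg1, sg3, 2.05, 0.4)); // prints "false"
    }
}
```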
14+
15+
## Example: Identify phosphorylation sites in a structure
16+
17+
```java
18+
List<ResidueNumber> identifyPhosphosites(Structure struc) {
19+
List<ResidueNumber> phosphosites = new ArrayList<ResidueNumber>();
20+
ProteinModificationIdentifier parser = new ProteinModificationIdentifier();
21+
parser.identify(struc, ProteinModificationRegistry.getByKeyword("phosphoprotein"));
22+
Set<ModifiedCompound> mcs = parser.getIdentifiedModifiedCompound();
23+
for (ModifiedCompound mc : mcs) {
24+
Set<StructureGroup> groups = mc.getGroups(true);
25+
for (StructureGroup group : groups) {
26+
phosphosites.add(group.getPDBResidueNumber());
27+
}
28+
}
29+
return phosphosites;
30+
}
31+
```
32+
33+
## Demo code to run the above methods
34+
35+
```java
36+
import org.biojava.nbio.structure.ResidueNumber;
37+
import org.biojava.nbio.structure.Structure;
38+
import org.biojava.nbio.structure.io.PDBFileReader;
39+
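import org.biojava.nbio.protmod.ProteinModificationRegistry;
import org.biojava.nbio.protmod.structure.ModifiedCompound;
import org.biojava.nbio.protmod.structure.ProteinModificationIdentifier;
import org.biojava.nbio.protmod.structure.StructureGroup;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;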
40+
41+
public static void main(String[] args) {
42+
try {
43+
PDBFileReader reader = new PDBFileReader();
44+
reader.setAutoFetch(true);
45+
46+
// identify all modifications from PDB:1CAD and print them
47+
String pdbId = "1CAD";
48+
Structure struc = reader.getStructureById(pdbId);
49+
Set<ModifiedCompound> mcs = identifyAllModfications(struc);
50+
for (ModifiedCompound mc : mcs) {
51+
System.out.println(mc.toString());
52+
}
53+
54+
// identify all phosphosites from PDB:3MVJ and print them
55+
pdbId = "3MVJ";
56+
struc = reader.getStructureById(pdbId);
57+
List<ResidueNumber> psites = identifyPhosphosites(struc);
58+
for (ResidueNumber psite : psites) {
59+
System.out.println(psite.toString());
60+
}
61+
} catch(Exception e) {
62+
e.printStackTrace();
63+
}
64+
}
65+
```
66+
67+
68+
Navigation:
69+
[Home](../README.md)
70+
| [Book 6: The ModFinder Module](README.md)
71+
| Chapter 3 - How to identify protein modifications in a structure
72+
73+
Prev: [Chapter 2 : How to get a list of supported protein modifications](supported-protein-modifications.md)
74+
75+
Next: [Chapter 4 : How to define a new protein modification](add-protein-modification.md)

modfinder/installation.md

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
## Quick Installation
2+
3+
To begin, here is a quick paragraph on how to get access to BioJava.
4+
5+
BioJava is open source, and you can get the code from [GitHub](https://github.com/biojava/biojava). However, it might be easier this way:
6+
7+
BioJava uses [Maven](http://maven.apache.org/) as a build and distribution system. If you are new to Maven, take a look at the [Getting Started with Maven](http://maven.apache.org/guides/getting-started/index.html) guide.
8+
9+
As of version 4, BioJava is available in Maven Central. This is all you need to add a BioJava dependency to your project:
10+
11+
```xml
12+
<dependencies>
13+
...
14+
<dependency>
15+
<!-- This imports release 4.2.0 of the protein structure module of BioJava.
16+
-->
17+
<groupId>org.biojava</groupId>
18+
<artifactId>biojava-structure</artifactId>
19+
<version>4.2.0</version>
20+
</dependency>
21+
<dependency>
22+
<!-- This imports release 4.2.0 of the modfinder module of BioJava.
23+
-->
24+
<groupId>org.biojava</groupId>
25+
<artifactId>biojava-modfinder</artifactId>
26+
<version>4.2.0</version>
27+
</dependency>
28+
<!-- other biojava jars as needed -->
29+
</dependencies>
30+
```
31+
32+
If you run
33+
34+
<pre>
35+
mvn package
36+
</pre>
37+
38+
on your project, the BioJava dependencies will be automatically downloaded and installed for you.
39+
40+
41+
<!--automatically generated footer-->
42+
43+
---
44+
45+
Navigation:
46+
[Home](../README.md)
47+
| [Book 6: The ModFinder Module](README.md)
48+
| Chapter 1 : Installation
49+
50+
Next: [Chapter 2 : How to get the list of supported protein modifications](supported-protein-modifications.md)

modfinder/supported-protein-modifications.md

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
How to get a list of supported protein modifications?
2+
===
3+
4+
The protmod module contains [an XML file](https://github.com/biojava/biojava/blob/master/biojava-modfinder/src/main/resources/org/biojava/nbio/protmod/ptm_list.xml) that defines a list of protein modifications retrieved from the [Protein Data Bank Chemical Component Dictionary](http://www.wwpdb.org/ccd.html), [RESID](http://pir.georgetown.edu/resid/), and [PSI-MOD](http://www.psidev.info/MOD). It contains many common modifications such as glycosylation, phosphorylation, acetylation, and methylation. Crosslinks such as disulfide bonds and iso-peptide bonds are also included.
5+
6+
The protmod module maintains a registry of supported protein modifications. The modifications listed in the XML file are loaded into it automatically. You can [define and register a new protein modification](add-protein-modification.md) if it is not defined in the XML file. From the protein modification registry, a user can retrieve:
7+
- all protein modifications,
8+
- a protein modification by ID,
9+
- a set of protein modifications by RESID ID,
10+
- a set of protein modifications by PSI-MOD ID,
11+
- a set of protein modifications by PDBCC ID,
12+
- a set of protein modifications by category (attachment, modified residue, crosslink1, crosslink2, …, crosslink7),
13+
- a set of protein modifications by occurrence type (natural or hypothetical),
14+
- a set of protein modifications by a keyword (glycoprotein, phosphoprotein, sulfoprotein, …),
15+
- a set of protein modifications by involved components.
16+
17+
## Examples
18+
19+
```java
20+
// a protein modification by ID
21+
ProteinModification mod = ProteinModificationRegistry.getById("0001");
22+
23+
Set<ProteinModification> mods;
24+
25+
// all protein modifications
26+
mods = ProteinModificationRegistry.allModifications();
27+
28+
// a set of protein modifications by RESID ID
29+
mods = ProteinModificationRegistry.getByResidId("AA0151");
30+
31+
// a set of protein modifications by PSI-MOD ID
32+
mods = ProteinModificationRegistry.getByPsimodId("MOD:00305");
33+
34+
// a set of protein modifications by PDBCC ID
35+
mods = ProteinModificationRegistry.getByPdbccId("SEP");
36+
37+
// a set of protein modifications by category
38+
mods = ProteinModificationRegistry.getByCategory(ModificationCategory.ATTACHMENT);
39+
40+
// a set of protein modifications by occurrence type
41+
mods = ProteinModificationRegistry.getByOccurrenceType(ModificationOccurrenceType.NATURAL);
42+
43+
// a set of protein modifications by a keyword
44+
mods = ProteinModificationRegistry.getByKeyword("phosphoprotein");
45+
46+
// a set of protein modifications by involved components.
47+
mods = ProteinModificationRegistry.getByComponent(Component.of("FAD"));
48+
49+
```
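
A registry that answers lookups by ID as well as by keyword can be thought of as one collection of entries with a separate index per lookup key. The following miniature version illustrates that idea with plain JDK collections. It is a sketch under that assumption only: `RegistrySketch`, `Mod`, and the IDs used in `main` are hypothetical and do not reflect the internals of `ProteinModificationRegistry`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical miniature registry with two indexes: by ID and by keyword.
class RegistrySketch {

    // Minimal stand-in for a modification: an ID plus a set of keywords.
    static final class Mod {
        final String id;
        final Set<String> keywords;

        Mod(String id, String... keywords) {
            this.id = id;
            this.keywords = new LinkedHashSet<>(Arrays.asList(keywords));
        }
    }

    private final Map<String, Mod> byId = new HashMap<>();
    private final Map<String, Set<Mod>> byKeyword = new HashMap<>();

    // Adds the modification to every index; IDs must be unique.
    void register(Mod mod) {
        if (byId.containsKey(mod.id)) {
            throw new IllegalArgumentException("Duplicate ID: " + mod.id);
        }
        byId.put(mod.id, mod);
        for (String keyword : mod.keywords) {
            byKeyword.computeIfAbsent(keyword, k -> new LinkedHashSet<>()).add(mod);
        }
    }

    // Returns the modification with this ID, or null if none is registered.
    Mod getById(String id) {
        return byId.get(id);
    }

    // Returns a read-only set of modifications tagged with this keyword
    // (empty if the keyword is unknown).
    Set<Mod> getByKeyword(String keyword) {
        return Collections.unmodifiableSet(
                byKeyword.getOrDefault(keyword, Collections.emptySet()));
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register(new Mod("demo-1", "phosphoprotein"));
        registry.register(new Mod("demo-2", "phosphoprotein", "glycoprotein"));

        System.out.println(registry.getById("demo-2").keywords);
        System.out.println(registry.getByKeyword("phosphoprotein").size());
    }
}
```

Keeping one map per lookup key makes each query a single hash lookup, at the cost of updating every index whenever a modification is registered.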
50+
51+
Navigation:
52+
[Home](../README.md)
53+
| [Book 6: The ModFinder Module](README.md)
54+
| Chapter 2 - How to get a list of supported protein modifications
55+
56+
Prev: [Chapter 1 : Installation](installation.md)
57+
58+
Next: [Chapter 3 : How to identify protein modifications in a structure](identify-protein-modifications.md)

protein-disorder/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -116,3 +116,4 @@ Navigation:
116116
| Book 3: The Protein Structure modules
117117

118118
Prev: [Book 4: The Genomics Module](../genomics/README.md)
119+
| Next: [Book 6: The ModFinder Module](../modfinder/README.md)
