p3d @EuroSciPy2010 by C. Fufezan

Post on 14-Feb-2017

High-throughput structural bioinformatics using Python & p3d

Dr. C. Fufezan, Institute for Biochemistry and Biotechnology of Plants (IBBP)

Overview

Background

p3d overview

example ATP binding site

Fufezan, C. and Specht M. (2009) BMC Bioinformatics 10, 258

http://p3d.fufezan.net

http://github.com/fu/p3d clone us - fork us!

Background

chain(s) of amino acids ...N D R P A I M K

... form proteins

Legend: oxygen, nitrogen, carbon

... and some bind cofactors, e.g. ATP (adenosine triphosphate)

Knowledge-based approaches to elucidate structural factors that are essential for cofactor binding:
- protein engineering
- protein folding
- cofactor tuning

[Figure: journal cover of Proteins: Structure, Function, Bioinformatics (ISSN 0887-3585), Volume 73, Number 3, November 15, 2008, pages 527-794, with the cover article "A PDB Survey of Heme Ligands in Proteins"]

Morozov et al. (2004) PNAS, 101, 6946-
Huang et al. (2004) PNAS, 101, 5536-
Fufezan et al. (2008) Proteins, 73, 690-
Negron et al. (2009) Proteins, 74, 400-
Fufezan (2010) Proteins, in press

p3d

p3d overview

A Python module that allows you to
- access and manipulate protein structure files
- rapidly develop new screening tools
- easily incorporate complex queries

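[Figure: an amino acid residue with its atoms labelled N, CA, CB, CG1, CG2, C and O. Each atom is annotated with x, y, z, idx, atom type, AA, chain, resid, beta (temperature factor) and user.]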

ATOM object

Attributes: idx, atype, aa, chain, x, y, z, user, beta, protein-object
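The attributes of an atom can be pictured as a small record type. The following is a minimal sketch and not p3d's actual ATOM class: the field types, the `distance_to` helper and the coordinates are assumptions made for illustration. Only the attribute names and the two atom labels (O1 of FME, CA of PHE) are taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Atom:
    """Hypothetical stand-in for an atom record (not p3d's own class)."""
    idx: int            # running atom index
    atype: str          # atom type, e.g. 'CA'
    aa: str             # residue name, e.g. 'PHE'
    chain: str          # chain identifier, e.g. 'A'
    resid: int          # residue number
    x: float
    y: float
    z: float
    beta: float = 0.0   # beta / temperature factor
    user: float = 0.0   # free slot for user-defined values

    def distance_to(self, other: "Atom") -> float:
        """Euclidean distance between two atoms treated as 3D vectors."""
        return math.dist((self.x, self.y, self.z), (other.x, other.y, other.z))


# Coordinates are invented for the example.
o1 = Atom(1, "O1", "FME", "A", 1, 0.0, 0.0, 0.0)
ca = Atom(2, "CA", "PHE", "A", 2, 3.0, 4.0, 0.0)
print(o1.distance_to(ca))  # 5.0
```

Treating each atom as a vector with metadata is what makes distance-based selections straightforward.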

Protein object

hash of atom sets, e.g.: protein, not-proteinogenic, chain['A'], oxygen, nitrogen, backbone, atype['CA'], ...

Sets shown: residues, backbone, alpha, oxygen, protein, not protein

Tree object: query( Vector1, radius )

Vectors do not have to be atoms!
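A radius query of this kind can be sketched with a small space-partitioning tree. This is an illustration under assumptions, not p3d's BSP tree: the names `Node`, `build` and `query` and the k-d style axis splitting are choices made here. The query centre is a plain coordinate tuple, which illustrates that it does not need to be an atom.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


@dataclass
class Node:
    point: Point
    axis: int                 # 0 = x, 1 = y, 2 = z
    left: Optional["Node"]    # points with coordinate <= split value
    right: Optional["Node"]   # points with coordinate >= split value


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split the points at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % 3
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build(ordered[:mid], depth + 1),
                build(ordered[mid + 1:], depth + 1))


def query(node: Optional[Node], center: Point, radius: float,
          found: Optional[List[Point]] = None) -> List[Point]:
    """Collect all points no further than `radius` from `center`."""
    if found is None:
        found = []
    if node is None:
        return found
    if math.dist(node.point, center) <= radius:
        found.append(node.point)
    delta = center[node.axis] - node.point[node.axis]
    if delta <= radius:       # sphere reaches into the lower half-space
        query(node.left, center, radius, found)
    if delta >= -radius:      # sphere reaches into the upper half-space
        query(node.right, center, radius, found)
    return found


# Invented coordinates; the centre is an arbitrary point, not an atom.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (0.0, 2.0, 0.0)]
tree = build(points)
hits = sorted(query(tree, (0.5, 0.0, 0.0), 1.0))
print(hits)  # [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
```

Subtrees that the query sphere cannot reach are skipped, so a lookup usually inspects only a fraction of the atoms.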

Protein class

List of atom objects (vectors), e.g.:
O1 FME A 1
CA PHE A 2

BSP tree

Sets (hashes): residues, backbone, alpha, oxygen, protein, not protein
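The idea of keeping atoms in named sets can be sketched with a dictionary of Python sets. This is an assumption-laden illustration, not p3d's internal layout: the key names, the tuple layout, the truncated amino acid list, the classification of FME as "not protein" and all records beyond the two shown on the slide are invented for the example.

```python
from collections import defaultdict

# (idx, atype, aa, chain, resid); only the first two records come from the slide.
atoms = [
    (1, "O1", "FME", "A", 1),
    (2, "CA", "PHE", "A", 2),
    (3, "O", "PHE", "A", 2),
    (4, "O", "HOH", "A", 101),
]

AMINO_ACIDS = {"PHE", "ASP"}  # truncated list, just enough for this example

index = defaultdict(set)
for idx, atype, aa, chain, resid in atoms:
    index[("atype", atype)].add(idx)
    index[("resname", aa)].add(idx)
    index[("chain", chain)].add(idx)
    index["protein" if aa in AMINO_ACIDS else "not protein"].add(idx)
    if atype.startswith("O"):   # simplification: atom names starting with O
        index["oxygen"].add(idx)
    if atype == "CA":
        index["alpha"].add(idx)

print(sorted(index["oxygen"]))       # [1, 3, 4]
print(sorted(index["not protein"]))  # [1, 4]
```

Building the sets once when the structure is loaded makes each later membership lookup a constant-time operation.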

Query function using human-readable syntax, e.g.:

pdb.query('backbone and resid 5..12 and within 5 of resname FME')

pdb.query('resname HOH and within 4 of resname ASP')

pdb.query('oxygen and not protein')

pdb.query(' protein and within 4 of ', p3d.vector.Vector(x, y, z))

for residueName in pdb.hash['non-aa-resname']:
    targets = pdb.query('protein and within 4 of (resname ' + residueName + ' and oxygen)')
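How such queries could map onto set operations is sketched below in plain Python. This is not p3d's query parser: the helpers `select` and `within`, the atom records and the coordinates are all hypothetical, and `within` uses a brute-force distance check where a tree lookup would normally be used.

```python
import math

# idx -> (resname, element, (x, y, z)); all records and coordinates are invented.
atoms = {
    1: ("ASP", "O", (0.0, 0.0, 0.0)),
    2: ("ASP", "C", (1.2, 0.0, 0.0)),
    3: ("HOH", "O", (0.0, 3.0, 0.0)),
    4: ("HOH", "O", (9.0, 9.0, 9.0)),
}
PROTEIN_RESNAMES = {"ASP"}  # truncated list for this example


def select(predicate):
    """Indices of all atoms for which the predicate holds."""
    return {i for i, atom in atoms.items() if predicate(atom)}


def within(radius, targets):
    """Indices of all atoms no further than `radius` from any target atom."""
    return {i for i, atom in atoms.items()
            if any(math.dist(atom[2], atoms[t][2]) <= radius for t in targets)}


oxygen = select(lambda a: a[1] == "O")
protein = select(lambda a: a[0] in PROTEIN_RESNAMES)
water = select(lambda a: a[0] == "HOH")
asp = select(lambda a: a[0] == "ASP")

# 'oxygen and not protein' becomes a set difference
print(sorted(oxygen - protein))          # [3, 4]

# 'resname HOH and within 4 of resname ASP' becomes an intersection
print(sorted(water & within(4.0, asp)))  # [3]
```

In this reading, `and` is an intersection, `not` is a difference, and `within` is the only part that needs coordinates.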

Example ATP binding

The ATP binding sites

Adenosine triphosphate (ATP)

ΔG°' = -30 kJ mol⁻¹

40 kg / day
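To put the two numbers in relation, here is a rough back-of-the-envelope calculation. It assumes that the 40 kg / day refers to a daily ATP turnover and that ΔG°' is the standard free energy of ATP hydrolysis; the slide states neither explicitly. The molar mass of ATP (about 507.18 g/mol for the free acid) is not from the slides.

```python
ATP_MOLAR_MASS_G_PER_MOL = 507.18   # free acid, C10H16N5O13P3 (not from the slides)
DELTA_G_KJ_PER_MOL = -30.0          # value from the slide
DAILY_TURNOVER_KG = 40.0            # value from the slide

# Convert the daily mass to moles, then multiply by the energy per mole.
moles_per_day = DAILY_TURNOVER_KG * 1000.0 / ATP_MOLAR_MASS_G_PER_MOL
energy_kj_per_day = moles_per_day * abs(DELTA_G_KJ_PER_MOL)

print(f"{moles_per_day:.0f} mol ATP per day")           # 79 mol ATP per day
print(f"{energy_kj_per_day:.0f} kJ released per day")   # 2366 kJ released per day
```

Under these assumptions, the turnover corresponds to roughly 79 mol of ATP and about 2400 kJ per day at standard conditions.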

non-redundant set of proteins, 24 binding sites

The ATP binding site

[Figure: the ATP binding site, with a hydropathy index scale from +4.5 to -4.5 and a second scale labelled 0, 1, 10]
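A hydropathy index can be averaged over a set of residues as sketched below. The slides do not name the scale; the example assumes the Kyte-Doolittle scale, whose range of -4.5 to +4.5 matches the figure. The function name `mean_hydropathy` is hypothetical, and the sequence fragment is the one shown on the Background slide, used here only as sample input.

```python
# Kyte-Doolittle hydropathy values per one-letter amino acid code
# (assumed scale; the slides only show the range -4.5 to +4.5).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(residues: str) -> float:
    """Average hydropathy of a string of one-letter residue codes."""
    if not residues:
        raise ValueError("need at least one residue")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


# Sequence fragment from the Background slide, used as sample input.
print(round(mean_hydropathy("NDRPAIMK"), 2))  # -1.1
```

A negative mean indicates a predominantly hydrophilic set of residues, a positive mean a hydrophobic one.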

Observations

non-redundant set of proteins, 24 binding sites

Summary

p3d allows rapid development of Python scripts to screen protein structures

p3d combines vectors, sets and a BSP tree

p3d allows flexible and complex queries using human-readable language

Acknowledgements

M. Specht
Prof. Dr. M. Hippler

Funding by the DFG and the Alexander von Humboldt Stiftung