Genome to Vaccinome: Reverse vaccinology workbench for viruses

Genome to Vaccinome: Immunoinformatics & Vaccine design case studies
Urmila Kulkarni-Kale
Bioinformatics Centre, University of Pune
[email protected]

Outline
  Immunology basics
  What is reverse vaccinology?
  Immunoinformatics
    Databases (Knowledgebases)
    Algorithms (B- and T-cell epitope predictions)
    Predictions
  Case studies
    Mumps virus
    Japanese encephalitis virus

The Immune System
  The body's defense against infectious organisms.
  Innate immunity: the first line of defense
    Rapid, nonspecific responses
    Recognition of conserved structures present in many microorganisms, such as lipopolysaccharides in bacterial cell walls or proteins in flagella
  Adaptive immune response: the second line of defense
    Tailored to an individual threat; specific to an infectious agent
    Memory cells persist, enabling a more rapid and potent response on re-infection

April 5, 2K9 UKK, Bioinformatics Centre, University of Pune

The adaptive immune response
  Stimulated by receptor recognition of a specific small part of an antigen, known as an epitope.
  Two major arms:
    The humoral immune response of antibody-secreting B lymphocytes (B cell epitopes)
    The cellular immune response of T lymphocytes (T cell, Th epitopes)

Antigen presentation and recognition: molecular and cellular processes.

Host-Pathogen interactions: surface proteins
  In the case of viruses: capsid, envelope, membrane
  Source: D. Serruto, R. Rappuoli / FEBS Letters 580 (2006) 2985-2992

Antigen-Antibody (Ag-Ab) complexes
  Non-obligatory heterocomplexes that are made and broken according to the environment.
  Involve proteins (Ag & Ab) that must also exist independently.
  Remarkable feature: high affinity and strict specificity of antibodies for their antigens.
  Ab recognize the unique conformations and spatial locations on the surface of Ag.
  Epitopes & paratopes are relational entities.

January 2K7 Bioinformatics Centre, UoP

Methods to identify epitopes
  1. Immunochemical methods
       ELISA: enzyme-linked immunosorbent assay
       Immunofluorescence
       Radioimmunoassay
  2. X-ray crystallography: the Ag-Ab complex is crystallized and the structure is scanned for contact residues between Ag and Ab. The contact residues on the Ag are considered as the epitope.
  3. Prediction methods: based on the X-ray crystal data available for Ag-Ab complexes, the propensity of an amino acid to lie in an epitope is calculated.

Antigen-Antibody complex
  Number of Ab-binding sites on an antigen

  Number of antibodies that could be raised against an antigen
  A few antibodies may have overlapping binding sites on the same antigen

Ab-binding sites: sequential & conformational epitopes
  [Figure labels: paratope; sequential and conformational Ab-binding sites]

Properties of Epitopes
  They occur on the surface of the protein and are more flexible than the rest of the protein.
  They have a high degree of exposure to the solvent.
  The amino acids making up the epitope are usually charged and hydrophilic.

B cell epitope prediction algorithms (sequence based and structure based):
  Hopp and Woods, 1981
  Welling et al., 1985
  Parker & Hodges, 1986
  Kolaskar & Tongaonkar, 1990
  Kolaskar & Urmila Kulkarni, 1999, 2005
  Haste et al., 2006

T cell epitope prediction algorithms:
  Margalit, Spouge et al., 1987
  Rothbard & Taylor, 1988
  Stille et al., 1987
  Tepitope, 1999

Hopp & Woods method
  Pioneering work.
  Based on the premise that only the hydrophilic nature of amino acids is essential for a sequence to be an antigenic determinant.
  Local hydrophilicity values are assigned along the sequence by repetitive averaging over a window of six residues.
  Accuracy: 45-55%

Welling's method
  Based on the percentage of each amino acid present in known epitopes compared with its percentage in the average composition of a protein.
  Assigns an antigenicity value to each amino acid from its relative occurrence in antigenic determinant sites.
  Regions of 7 aa with relatively high antigenicity are extended to 11-13 aa, depending on the antigenicity values of neighboring residues.
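As a rough illustration of the relative-occurrence idea behind Welling's method, the sketch below divides the percentage of each amino acid in a set of epitope sequences by its percentage in a background set. The function names and the toy sequences are assumptions made for this example only; the published scale, and any further transformation applied to the ratio, should be taken from Welling et al. (1985).

```python
from collections import Counter


def percent_composition(sequences):
    """Percentage of each residue type across all given sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def relative_occurrence(epitopes, background):
    """Ratio of epitope percentage to background percentage per residue.

    Residues absent from the epitope set get a ratio of 0.0.
    Only residues present in the background are scored.
    """
    epi = percent_composition(epitopes)
    bg = percent_composition(background)
    return {aa: epi.get(aa, 0.0) / bg[aa] for aa in bg}


# Toy data (hypothetical, not real epitopes).
toy_scale = relative_occurrence(["KKDE"], ["KDEAAAAA"])
for residue, ratio in sorted(toy_scale.items()):
    print(residue, ratio)
```

A ratio above 1 means the residue is over-represented in the epitope set relative to the background; a real scale would need a large, curated epitope collection rather than toy strings.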

Parker & Hodges method
  Utilizes 3 parameters:
    Hydrophilicity: HPLC
    Accessibility: Janin's scale
    Flexibility: Karplus & Schulz
  The hydrophilicity parameter was calculated from HPLC retention coefficients of model synthetic peptides.
  The surface profile was determined by summing the parameters for each residue of a seven-residue segment and assigning the sum to the fourth residue.
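The windowed profiles used by Hopp & Woods (averaging over six residues) and by Parker & Hodges (summing over seven residues) can be sketched with one small function. This is a minimal sketch, not either published implementation: the function names are hypothetical, the hydrophilicity values are the Hopp & Woods values as commonly tabulated and should be checked against the original paper, and a single scale stands in for the three Parker & Hodges parameters.

```python
# Hopp & Woods hydrophilicity values as commonly tabulated (an assumption
# here; verify against the 1981 publication before any real use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def window_profile(sequence, scale, window, average=True):
    """Return one value per window start position along the sequence."""
    values = [scale[aa] for aa in sequence.upper()]
    profile = []
    for start in range(len(values) - window + 1):
        total = sum(values[start:start + window])
        profile.append(total / window if average else total)
    return profile


def best_window(sequence, scale, window, average=True):
    """Return (start_index, peptide, score) of the highest-scoring window."""
    profile = window_profile(sequence, scale, window, average)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


seq = "AVLKDEKRSGAVL"  # hypothetical peptide
print(best_window(seq, HOPP_WOODS, window=6))              # six-residue average
print(window_profile(seq, HOPP_WOODS, 7, average=False))   # seven-residue sums
```

For the seven-residue sums, the value at window start `i` corresponds to the fourth residue of that segment, that is, sequence index `i + 3`.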

  One of the most useful prediction algorithms.

Kolaskar & Tongaonkar's method
  Semi-empirical method which uses physicochemical properties of amino acid residues and the frequencies of occurrence of amino acids in experimentally known epitopes.
  Data on 169 epitopes from 34 different proteins were collected, of which the 156 with fewer than 20 aa per determinant were used.
  Antigen: EMBOSS

CEP Server
  Predicts conformational epitopes from X-ray crystal structures of Ag-Ab complexes.
  Uses percent accessible surface area and distance as criteria.

An algorithm to map sequential and conformational epitopes of protein antigens of known structure
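To make the two criteria named for CEP (percent accessible surface area and distance) concrete, the sketch below keeps residues above an accessibility cutoff and groups them into patches when they lie within a distance cutoff of one another. It is an illustration only and not the CEP algorithm: the function name, the input layout, and both cutoff values are assumptions made for this example.

```python
import math


def accessible_patches(residues, min_asa=25.0, max_dist=6.0):
    """Group exposed residues into spatial patches.

    residues: list of (residue_id, percent_asa, (x, y, z)) tuples.
    min_asa:  hypothetical accessibility cutoff in percent.
    max_dist: hypothetical distance cutoff in angstroms.
    A residue joins a patch if it is within max_dist of any member.
    """
    unassigned = [r for r in residues if r[1] >= min_asa]
    patches = []
    while unassigned:
        patch = [unassigned.pop(0)]
        grew = True
        while grew:
            grew = False
            for res in unassigned[:]:
                if any(math.dist(res[2], m[2]) <= max_dist for m in patch):
                    patch.append(res)
                    unassigned.remove(res)
                    grew = True
        patches.append([r[0] for r in patch])
    return patches


# Toy coordinates (hypothetical, not taken from any PDB entry).
toy = [
    ("A1", 40.0, (0.0, 0.0, 0.0)),
    ("A2", 10.0, (1.0, 0.0, 0.0)),   # buried, filtered out
    ("A3", 30.0, (5.0, 0.0, 0.0)),
    ("A4", 50.0, (20.0, 0.0, 0.0)),
]
print(accessible_patches(toy))
```

Residues that are far apart in sequence but close in space end up in the same patch, which is the property that distinguishes a conformational epitope from a sequential one.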

CE: Features
  The first algorithm for the prediction of conformational epitopes or antibody-binding sites of protein antigens
  Maps both sequential & conformational epitopes
  Prerequisite: 3D structure of an antigen

CEP: Conformational Epitope Prediction Server

T-cell epitope prediction algorithms
  Consider amphipathic helix segments, and tetramer and pentamer motifs (charged residues or glycine, followed by 2-3 hydrophobic residues and then a polar residue).
  Sequence motifs of immunodominant secondary structure capable of binding to MHC with high affinity.
  Virtual matrices are used for predicting MHC polymorphism and anchor residues.

MHC-Peptide complex

Epitome database

CED database

BciPep Database

AgAbDB: Home page

Rational Vaccine design: Challenges & opportunities
  Genomic data of viruses
  Relatively very few
  Modeling is the only solution
  Antigen: 3D structure(s)
  Variations/conservations
  Annotations
  Organisations
  Data mining
  Rules for predictions
  Accuracy related issues
  Experimental validations
  Epitope prediction software

Reverse Vaccinology workbench: list of parts
The components are:
  A curated genomic resource (VirGen), 2004
  A server for prediction of epitopes (CEP), 1999; 2005
  A knowledge-base to study Ag-Ab interactions (AgAbDb), 2007
  A server for variability analyses (PVIS), 2009
  A derived database of 3D structures of viral proteins
    Compilation of experimental structures of viral proteins from PDB
    Predicted structures using homology modeling approach, 1999; 2007
  Study of sequence-structure-function (antigenicity) to identify & prioritize vaccine candidates

VirGen home
  Menu to browse viral families
  Search using keywords & motifs
  Genome analysis & comparative genomics resources
  Navigation bar
  Guided tour & help

Sample genome record in VirGen
  Tabular display of genome annotation
  Retrieve sequence in FASTA format
  Alternate names of proteins

Graphical view of genome organization
  Viral polyprotein along with the UTRs
  Graphical view generated dynamically using Scalable Vector Graphics technology

Multiple Sequence Alignment (MSA)
  Link for batch retrieval of sequences
  Dendrogram

Browsing the module of whole genome phylogenetic trees
  Most parsimonious tree of genus Flavivirus
  Input data: whole genome
  Method: DNA parsimony
  Bootstrapping: 1000

AgAbDB: Home page

AgAbDB: summary of interacting residues

  PDB: interactions mapped on structure

Study of variations at different levels of biocomplexity
  Strains/isolates of a virus
  Serotypes of a virus
  Viruses that belong to the same genus
  Viruses that belong to the same family
  How similar is similar? How different is different?
  Implications of variations in designing vaccines

Protein Variability Index Server (PVIS), beta test version
  PVIS takes an MSA as input and calculates the variability of amino acids at each position of the consensus sequence using the Wu-Kabat coefficient.
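The Wu-Kabat coefficient at an alignment position is usually written as N x k / n, where N is the number of sequences, k the number of different residue types observed at that position, and n the count of the most common residue. The sketch below computes it per column of an MSA. It is a minimal illustration and not the PVIS code: the function name is hypothetical, and ignoring gap characters (so that N counts only non-gap residues in the column) is an assumption of this example.

```python
from collections import Counter


def wu_kabat_profile(alignment, gap="-"):
    """Return a list of (consensus_residue, variability), one per column.

    alignment: list of equal-length aligned sequences.
    Gaps are ignored (an assumption of this sketch); an all-gap column
    is reported as (gap, 0.0).
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            profile.append((gap, 0.0))
            continue
        n_total = sum(counts.values())                    # N
        consensus, n_common = counts.most_common(1)[0]    # n
        variability = n_total * len(counts) / n_common    # N * k / n
        profile.append((consensus, variability))
    return profile


# Toy alignment (hypothetical sequences).
msa = ["ACD", "ACE", "AGF", "ACG"]
for position, (residue, value) in enumerate(wu_kabat_profile(msa), start=1):
    print(position, residue, value)
```

A fully conserved column scores 1, and the score grows as more residue types appear and the most common one becomes rarer, so peaks in the profile mark candidate hypervariable positions.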

  Features:
    Interactive, GUI-based alignment output format
    No limit on the input length of the MSA
    At each position of the alignment, the user can view the consensus residue and its corresponding variability
    Generates a CSV (comma-separated values) file of variability values against their positions in the consensus sequence
    Various output formats

Antigenic diversity of mumps virus: an insight from predicted 3D structure of HN protein

Mumps Virus: at a glance
  Source: VirGen database
  Order: Mononegavirales
  Family: Paramyxoviridae
  Subfamily: Paramyxovirinae
  Genus: Rubulavirus
  Species: Mumps virus
  Genome: -ve sense ssRNA
  Genotypes: 10: A-J (SH gene)
  Known antigenic proteins: F & HN

SBL-1 HN: Predicted structure
  Fold: β-propeller
  Monomer: 6-bladed propeller with 4-stranded β-sheet & 4 helices
  Helices: red, Strands: yellow, Turns: blue, Coils: green

A new site for neutralisation: mapping antigenicity using parts list approach
  Total variations: 47
  Hypervariable region of HN identified using MSA of vaccine strains (majority marked with yellow screen)
  Residues 462, 464, 468, 470, 473, 474 are present on the surface; known escape mutants are in proximity

Mapping mutations on 3D structure of Mumps virus: a case study
  Colour: according to majority

Case study: Design & development of peptide vaccine against Japanese encephalitis virus

We Have Chosen JE Virus, Because
  JE virus is endemic in South-east Asia, including India.
  JE virus causes encephalitis in children between 5-15 years of age, with fatality rates between 21-44%.
  Man is a "DEAD END" host.
  The killed virus vaccine, purified from mouse brain, is used at present; it requires storage at specific temperatures and hence is not cost effective in tropical countries.
  Protective prophylactic immunity is induced only after administration of 2-3 doses.
  Cost of vaccination, storage and transportation is high.

Predicted structure of JEVS

Mutations: JEVN/JEVS

CE of JEVN Egp

Species and Strain specific properties: TBEV/JEVN/JEVS
  Loop1 in TBEV: LA EEH QGGT
  Loop1 in JEVN: HN EKR ADSS
  Loop1 in JEVS: HN KKR ADSS

  Antibodies recognising TBEV and JEVN would require exactly opposite patterns of charges in their CDR regions. Further, modification in the CDR is required to recognise the strain-specific region of JEVS.

Multiple alignment of predicted Th-cell epitope in the JE_Egp with corresponding epitopes in Egps of other Flaviviruses
  426
  QENWNTDIKTLKFDALSGSQEVEFI
  TBE VAANETHSGRKTASFTISSEKTILTMG

Peptide Modeling
  Initial random conformation
  Force field: Amber
  Distance-dependent dielectric constant: 4r_ij
  Geometry optimization: steepest descents & conjugate gradients
  Molecular dynamics at 400 K for 1 ns
  Peptides are:


Publications

Urmila Kulkarni-Kale, Janaki Ojha, G. Sunitha Manjari, Deepti D. Deobagkar, Asha D. Mallya, Rajeev M. Dhere & Subhash V. Kapre (2007). Mapping antigenic diversity & strain-specificity of mumps virus: a bioinformatics approach. Virology.

A. D. Ghate, B. U. Bhagwat, S. G. Bhosle, S. M. Gadepalli & U. D. Kulkarni-Kale (2007). Characterization of Antibody-Binding Sites on Proteins: Development of a Knowledgebase and Its Applications in Improving Epitope Prediction. Protein & Peptide Letters, 14(6), 531-535.

Urmila Kulkarni-Kale, Shriram Bhosle & A. S. Kolaskar (2005). CEP: a conformational epitope prediction server. Nucleic Acids Research, 33, W168-W171.

Urmila Kulkarni-Kale, Shriram Bhosale, G. Sunitha Manjari & Ashok Kolaskar (2004). VirGen: A comprehensive viral genome resource. Nucleic Acids Research, 32:289-292.

Urmila Kulkarni-Kale & A. S. Kolaskar (2003). Prediction of 3D structure of envelope glycoprotein of Sri Lanka strain of Japanese encephalitis virus. In Yi-Ping Phoebe Chen (ed.), Conferences in research and practice in information technology, 19:87-96.

A. S. Kolaskar & Urmila Kulkarni-Kale (1999). Prediction of three-dimensional structure and mapping of conformational antigenic determinants of envelope glycoprotein of Japanese encephalitis virus. Virology, 261:31-42.

Acknowledgements
  Prof. A. S. Kolaskar
  Ms. G. Sunitha Manjari, Bhakti Bhawat, Surabhi Agrawal & Shriram Bhosle
  M.Sc. / ADB [email protected]
  Ms. Sangeeta Sawant & Dr. M. M. Gore
  Ms. Janaki Oza, Prof. Deepti Deobagkar, Dr. Mallya, Dr. Dhere & Dr. Kapre
  Financial support:
    Center of Excellence (CoE) by both MCIT & DBT, Govt. of India
    M.Sc. Bioinformatics programme from DBT, Govt. of India
  Molecular modeling facility at Bioinformatics Centre, University of Pune
  Serum Institute of India

Thank you all!

Bioinformatics Centre @ University of Pune
HRD Activities in Bioinformatics and Biotechnology
  Short Term Courses
  Long Term Courses
    M.Sc. Bioinformatics
    Advanced Diploma in Bioinformatics (on hold)
    CRCDM (PPP model)
    Credit exchange program: M.Sc. Zoology & Biotechnology
    Contributory teaching: M.Sc./M.Tech. Biotechnology (Integrated), M.B.A. Biotechnology

M.Sc. Bioinformatics
  Started in 2002
  Masters level, 2 years (4 semesters), full time
  25 credits/semester + project (16 credits)
  Intake through entrance test
  No. of students: 30+1+2
  ICMS
  Syllabus

Integrated Course Management System (ICMS)

BINC: Bioinformatics National Certification examination
  ~850 registrations
  13700 hits

Thank You
