Crash course in Omics

EuPathDB.org
University of Pennsylvania

  • Putative Function (GO)
  • Expression profiling arrays (many platforms)
  • BLAST Similarities
  • Subcellular Location
  • Genome and Annotation
  • Protein Expression
  • RNA Sequencing
  • Phenotype
  • Pathways
  • Isolates
  • SNPs
  • ChIP-Chip
  • Phylogenetics
  • Orthologs
  • Structure
  • Comparative Genomics
  • Interactions

Genome assembly
  • 5X random genome shotgun
  • Library insert size
  • Paired-end? (Mated end pairs)
  • Contigs
  • Scaffolds
  • Pairs give order and orientation

Scaffold
  • Gaps in scaffolds are traditionally indicated by 100 “N”s
[Figure labels: 2-pair; mean and std. dev. is known; end reads (mates); primer; plasmid; fosmid; consensus; 550 bp; reads; sequence; SNPs; NGS; distance?]

Anatomy of a WGS Assembly
[Figure labels: STS; chromosome; STS-mapped scaffolds; contig; gap (mean and std. dev. known); read pair (mates); consensus; reads; SNPs]

Six Frame Translation
ORF-finding
  • ORFs ≠ Genes

>Translation Frame 1
MQKPVCLVVAMTPKRGIGINNGLPWPHLTTDFKHFSRVTKTTPEEASRLN
GWLPRKFAKTGDSGLPSPSVGKRFNAVVMGRKTWESMPRKFRPLVDRLNI
VVSSSLKEEDIAAEKPQAEGQQRVRVCASLPAALSLLEEEYKDSVDQIFV
VGGAGLYEAALSLGVASHLYITRVAREFPCDVFFPAFPGDDILSNKSTAA
QAAAPAESVFVPFCPELGREKDNEATYRPIFISKTFSDNGVPYDFVVLEK
RRKTDDAATAEPSNAMSSLTSTRETTPVHGLQAPSSAAAIAPVLAWMDEE
DRKKREQKELIRAVPHVHFRGHEEFQYLDLIADIINNGRTMDDRT

Terminology

Eukaryotic Relationships ca. 2005
[Figure: Keeling, 2005]

Homology
[Figure: an early globin gene undergoes gene duplication into an α-chain gene and a β-chain gene, giving α frog, α chick, α mouse and β mouse, β chick, β frog. The α and β genes are PARALOGS; the α genes (or the β genes) of different species are ORTHOLOGS; all of them are HOMOLOGS.]

Evolutionary relationships
  • Homology - related by evolutionary descent; not equivalent to similarity
  • Orthology - same gene in different organisms, e.g. alpha hemoglobin in humans and chimps
  • Paralogy - genes within an organism related by gene duplication, e.g. alpha and beta hemoglobin in humans
  • Xenology - genes related by gene transfer
  • Synteny = large regions of chromosomes containing the same genes

[Figure: Synteny among Plasmodia]

Expression Profiles
  • The pattern of expression of one or more genes over time or a set of experimental conditions, e.g. during development or a drug treatment or in a genetic mutant such as a gene knock-out.
  • Always has a time and space component
  • Microarrays
  • cDNA microarrays
  • “GeneChip” in situ synthesized oligonucleotide arrays
  • Oligomer (~70mer) arrays
  • Experiments are almost always competitions between conditions or stages

The RNA samples from the test and the control are labeled with different colors in a reverse-transcription reaction and then hybridized together, competitively, to a slide or chip containing gene sequences in multiple copies.

Ratios of experimental to control expression are often expressed as colors rather than numbers.

Clustered Microarray Data
  • Genes with similar expression profiles are grouped together

Other RNA expression

Expressed Sequence Tags, ESTs
  • Usually represent partial cDNA
  • Often clustered
  • Come from libraries that may or may not be normalized
  • Often used to identify genes in genomes and locations of introns

SAGE-tags (Serial Analysis of Gene Expression)
  • Primary purpose is relative levels of gene expression

RNA-Seq (NGS)
  • Little sequence bias
  • Quantitative
  • Can be strand-specific

Genes can be located on either DNA strand
[Figure 8-4: Overview of transcription: either strand can serve as a template for a gene]
[Figure: Sequences of DNA and transcribed RNA]

Convention
  • Gene location = non-template strand, i.e. same as the mRNA

Complex patterns of eukaryotic mRNA splicing: What is a Gene?
[Figure 8-14]

Bioinformatics uses algorithms
  • Algorithms are sets of rules for solving problems or identifying patterns
  • Algorithms can be general or case specific and often need to be trained
  • Computational analyses, like wet-bench analyses, are only as good as the tools, techniques and materials allow, and all interpretations come with caveats (such as the experimental conditions, often called parameters in bioinformatics).
  • How to find an intron
  • Usually begins with GT and ends with AG
  • Must be longer than 19 nucleotides
  • Must contain a branchpoint “A”
  • Donor GT often followed by a sequence pattern. This pattern is species-specific
  • Acceptor AG often preceded by a pyrimidine stretch
  • Has a mean length of “X” as is observed in this species
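
The rules above can be turned into a simple filter. The following is a minimal sketch, not a real gene-prediction method: it scans a DNA string for GT...AG spans and keeps those that pass the length, branchpoint and pyrimidine checks. The function name, the 10-nucleotide window and the 60% pyrimidine threshold are illustrative assumptions, not values from the slides. The species-specific donor pattern and the mean-length rule are left out.

```python
def find_candidate_introns(seq, min_len=20, ppt_window=10, min_ppt_fraction=0.6):
    """Return (start, end) pairs, 0-based and end-exclusive, of candidate introns.

    Hypothetical helper for illustration only. Checks applied:
      - begins with GT and ends with AG
      - longer than 19 nucleotides (min_len=20)
      - contains at least one interior "A" (a stand-in for the branchpoint)
      - the stretch just before the acceptor AG is pyrimidine-rich (assumed threshold)
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptor_ends = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]

    candidates = []
    for start in donors:
        for end in acceptor_ends:
            if end - start < min_len:
                continue
            # Interior excludes the donor GT and the acceptor AG themselves.
            interior = seq[start + 2:end - 2]
            if "A" not in interior:
                continue
            # Fraction of C/T in the window immediately upstream of the AG.
            tract = interior[-ppt_window:]
            ppt_fraction = sum(base in "CT" for base in tract) / len(tract)
            if ppt_fraction < min_ppt_fraction:
                continue
            candidates.append((start, end))
    return candidates


# Toy sequence: exon, then a 23-nt intron (GT ... A ... TTTTTTTTTT AG), then exon.
toy = "CCCCCC" + "GT" + "CCCCACCCC" + "TTTTTTTTTT" + "AG" + "CCCCCC"
print(find_candidate_introns(toy))
```

On real genomic sequence a filter like this returns many overlapping candidates, so choosing among them depends on the scoring model and its parameters.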
  • Different prediction methods often generate different results
[Figure: Prediction 1 and Prediction 2]

Protein Expression/Sequence

Data                    Technology
MW-Isoelectric point    2D gel electrophoresis
MW                      Mass spectrometry
Sequence/spans          Tandem MS (MS-MS, LC MS-MS etc.)

[Figure: Typical 2D gel]

High throughput mass spectrometry
  • Direct identification of proteins from biological sample.
  • Capillary liquid chromatography apparatus (LC) coupled with...
  • Electrospray tandem mass spectrometry (MS/MS)
  • “Sequest”, Mascot, or other software links mass spectra to a genomic sequence database.
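
As a rough illustration of linking a measured mass to a sequence database, the sketch below digests protein sequences in silico and reports peptides whose theoretical mass matches an observed value. It is a deliberate simplification: it compares whole-peptide masses, whereas Sequest and Mascot score fragment-ion (MS/MS) spectra. The cleavage rule (trypsin-like: after K or R, but not before P), the monoisotopic residue masses, the tolerance and all function names are assumptions added for this example, not taken from the slides.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_peptides(protein):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, current = [], ""
    for i, aa in enumerate(protein):
        current += aa
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def peptide_mass(peptide):
    """Theoretical monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def match_mass(observed, proteins, tolerance=0.01):
    """Return (protein name, peptide) pairs within `tolerance` Da of `observed`."""
    hits = []
    for name, seq in proteins.items():
        for pep in tryptic_peptides(seq):
            if abs(peptide_mass(pep) - observed) <= tolerance:
                hits.append((name, pep))
    return hits


database = {"toy_protein": "AAKGGRPLK"}  # hypothetical one-entry database
print(tryptic_peptides(database["toy_protein"]))
print(match_mass(peptide_mass("AAK"), database))
```

The database here could equally hold translated ORFs rather than only predicted genes; the matching step does not change.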
How Tandem MS Works
[Figure labels: complex mixture; protein; liquid chromatography; peptides; ionized peptides; mass spectrometer: isolation, fragmentation by Collision-Induced Dissociation (CID), measurement]

Sequest Database Search
[Figure labels: protein database; nucleic acid database; EST database; tandem mass spectrum; theoretical mass spectrum; correlation analysis; ranked score of matched peptides]

Peptide database
ENNPCKLQYDYNTNVTHGFGQEYPCETDIVERFSDTEGAQCDKKKIKDNSEGACAPYRRLHVCVRNLENINDYSKINNKHNLLVEVCLAAKYEGESITGRYPQHQETNPDTKSQLCTVLARSFADIGDIIRGKDLYRGGNTKEKKKRKKLEENLKTIFGHIYDELKNGKTNGEEELQKRYRGDKDNDFYQLREDWWDANRETVWKAITCNAGSYQYSQPTCGRGEIPYVTLSKCQCIAGEVPTYFDYVPQYLRWFEEWAEDFCRKKKKKIPNVKTNCRQVQRGKEKYCDRDGYNCDGTIRKQYIYRLDTDCTKCSLACKTFAEWIDNQKEQFDKQKQKYQNEISGGGGRRQKRSTHSTKEYEGYEKHFNEELRNEGKDVRSFLQLLSKEKICKERIQVGEETANYGNFENESNTFSHTEYCDRCPLCGVDCSSDNCRKKPDKSCDEQITDKEYPPENTTKIPKLTAEKRKTGILKKYEKFCKNSDGNNGGQIKKWECHYEKNDKDDGNGDINNCIQGDWKTSKNVYYPISYYSFFYGSIIDMLNESIEWRERLKSCINDAKLGKCRKGCKNPCECYKRWVEKKKDEWDKIKEFFRKQKDLLKDIAGMDAGELLEFYLENIFLEDMKNANGDPKVIEKFKEILGKENEEVQDPLKTKKTIDDFLEKELNEAKNCVEKNPDNECPKQKAPGDGAAPSDPPREDITHHDGEHSSDEDEEEEEEEEQQPPAEGTEQGEEKSESKEVVEQQETPQKDTEKTVPTTTPTVDVCDTVKTALADTGSLNAACSLKYVTGKNYGWRCIAPSGTTSGKDGAICVPPRTQELCLYYLKELSDTTQKGLREAFIKTAAQETYLLWQKYKEDKQNETASTELDIDDPQTQLNGGEIPEDFKRQMFYTFGDYRDLFLGRYIGNDLDKVNNNITAVFQNGDHIPNGQKTDRQRQEFWGTYGKDIWKGMLCALQEAGGKKTLTETYNYSNVTFNGHLTGTKLNEFASRPSFLRWMTEWGDQFCRERITQLQILKERCMVYQYNGDKGKDDKKEKCTEACTYYKEWLTNWQDNYKKQNQRYTEVKGTSPYKEDSDVKESKYAHGYLRKILKNIICTSGTDIAYCNCMEGTSTTDSSNNDNIPESLKYPPIEIEEGCTCKDPSPGEVIPEKKVPEPKVLPKPPKLPKRQPKERDFPTPALKNAMLSSTIMWSIGIGFATFTYFYLKKKTKSTIDLLRVINIPKSDYDIPTKLSPNRYIPYTSGKYRGKRYIYLEGDSGTDSGYTDHYSDITSSSESEYEELDINDIYAPRAPKYKTLIEVVLEPSGNNTTASGNNTPSDTQNDIQNDGIPSSKITDNEWNTLKDEFISQYLQSEQPNDVPNDYSSGDIPLNTQPNTLYFDNPDEKPFITSIHDRDLYSGEEYSYNVNMVNTNNDIPISGKNGTYSGIDLINDSLNSNN

Note: ORFs in addition to predicted genes must be searched.

Mass spectrometry can be used to measure metabolic and other chemical compounds.

Homologous chromosomes (in a diploid)
Loci, alleles and SNPs in a population

AAGCCTCATCA
ACGCCTCATCa

SNP = Single Nucleotide Polymorphism

Alleles and Phenotype
  • Some phenotypes are caused by a single locus in the genome and a single allele at that locus (e.g. some flower colors, or Drosophila eye color)
  • Other phenotypes (Type-I diabetes, heart disease) are multi-locus or “complex” (i.e. many genes are involved, each potentially with many alleles)
Population data
Technology: Chip-Seq, NGS
Data:
  • Single Nucleotide Polymorphisms, SNPs
  • Alleles
  • Allele frequency
  • Haplotypes
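
The items in this list can be made concrete with a small sketch. Assuming the sequences are already aligned and of equal length, it finds SNP positions, computes allele frequencies at one position, and reads off a haplotype as the alleles a sequence carries at the SNP positions. The function names and the four-sequence population are hypothetical; the first two sequences are the pair shown on the SNP slide, compared case-insensitively.

```python
from collections import Counter


def snp_positions(sequences):
    """0-based columns where aligned, equal-length sequences differ."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    upper = [s.upper() for s in sequences]
    return [i for i in range(length) if len({s[i] for s in upper}) > 1]


def allele_frequencies(sequences, position):
    """Fraction of sequences carrying each base at one position."""
    counts = Counter(s[position].upper() for s in sequences)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def haplotype(sequence, positions):
    """Alleles carried by one sequence at the given SNP positions."""
    return "".join(sequence[p].upper() for p in positions)


# Hypothetical population: three copies of one allele, one copy of the other.
population = ["AAGCCTCATCA", "ACGCCTCATCa", "AAGCCTCATCA", "AAGCCTCATCA"]

sites = snp_positions(population)
print(sites)
print(allele_frequencies(population, sites[0]))
print([haplotype(s, sites) for s in population])
```

With real isolates the input would come from sequencing or SNP-chip calls, and frequencies would be computed separately for each population.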

Alleles have frequencies in different populations.
Populations and alleles have geographic boundaries.
A parasite isolate comes from a particular population and a particular location, and will have a specific haplotype (e.g. representation of alleles), often characterized via SNPs.

Parasite Isolates
Technology: PCR-RFLP, Sequencing, SNP chip, GPS
Data:
  • Species, Strain, Isolate
  • Location, Date
  • SNP
  • Sequence
  • Allele
  • Phenotype
  • Experimental systems

Infectious Disease Paradigm
[Figure labels: host; vector; pathogen]