MB620 Bioinformatics
University of New Haven
Instructor: Joel S. Bader
Class 12: By Popular Demand


Suggested topics


Master Outline

Genetics
   Traits/Genes to Location: Genetic and physical maps; Research Genetics mapping panel; Stanford mapping server
   Traits/Genes to Experimental Organisms: Jackson Laboratories
   Trait/Gene Location Database: OMIM, On-line Mendelian Inheritance in Man
Genomic DNA Analysis
   Sequences to Contigs: CuraTools; CAP, PHRAP
   Contigs to mRNA: Genscan; Grail; tblastn (protein query, genomic database)
mRNA Analysis
   DNA to Homolog: blastn, blastx, fasta
   DNA to Protein: ORF finders; NCBI ORF Finder
Protein Analysis
   Protein homologs: blastp
   Conserved residues: multiple sequence alignment, clustal-w
   Evolutionary history: Phylip, Paup
   blastp for linguistics: AltaVista Babelfish
   Prokaryote homologs: Clusters of Orthologous Groups
   Domains: Pfam; Prosite; Prodom
   Cellular localization: Psort
   Secondary structure prediction: Consensus prediction
   Tertiary structure prediction: Swiss-Model
   Known folds: SCOP; CATH, DALI
   Structure similarity search: VAST
Bioinformatics Resources
   GenBank, OMIM, Blast, Entrez: NCBI
   Swiss-Prot, TrEMBL, Prosite: ExPASy

Primer design

Why design primers?
   Amplifying DNA: Polymerase Chain Reaction (PCR)
   (How was DNA amplified before PCR?)
   Nobel Prize, Kary Mullis
Requirements
Visit CuraTools and use Primer3.
Steve Rozen, Helen J. Skaletsky (1996,1997)
   Primer3. Code available at
   http://www-genome.wi.mit.edu/genome_software/other/primer3.html

Codon usage

Remember ORF prediction: have mRNA, predict protein coding region
What does the protein coding region look like?
What is the proper reading frame?
Good hint: codon usage
Total number of codons: 64
Number of protein-coding codons: 61 (64 minus the 3 stop codons of the standard genetic code)
Number of protein-coding codons effectively used: fewer than 61, depends on species.
The codon usage database in Japan has codon usage information.
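As a rough illustration of the hint above, the sketch below counts the codons in each of the three forward reading frames of a sequence and reports how many stop codons each frame contains. The function names and the example sequence are made up for illustration, and the stop codons are those of the standard genetic code. A real ORF finder would also compare the counts against a species-specific codon usage table.

```python
from collections import Counter

# Stop codons of the standard genetic code (written as DNA).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codon_counts(seq, frame=0):
    """Count codons in one forward reading frame (0, 1, or 2)."""
    seq = seq.upper().replace("U", "T")   # accept mRNA or DNA letters
    usable = len(seq) - frame
    end = frame + usable - usable % 3     # drop a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(frame, end, 3))

def stop_count(seq, frame=0):
    """Number of stop codons seen in the given reading frame."""
    counts = codon_counts(seq, frame)
    return sum(counts[c] for c in STOP_CODONS)

# Hypothetical example sequence, not taken from any real gene.
example = "ATGGCTGCTAAAGCTTGA"
for frame in range(3):
    print(frame, stop_count(example, frame), dict(codon_counts(example, frame)))
```

A frame whose codon frequencies resemble the organism's known usage, and which is free of internal stop codons, is a better candidate for the true coding frame.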

A little math

the entropy of a code S = - sum(words) prob(word) ln[prob(word)]
the effective number of words = exp(S)
if all N words have the same probability 1/N then
   the entropy = - sum(words) (1/N) ln(1/N) = ln(N)
   the effective number of words = exp[ln(N)] = N
if only one word is used then
   the entropy = - sum(words) prob(word) ln[prob(word)] = - (1) ln(1) = 0
   (unused words contribute nothing, taking 0 ln(0) = 0)
   the effective number of words = exp[0] = 1
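The two limiting cases above can be checked numerically. The sketch below is a minimal Python version of the formulas, with function names chosen here for illustration. Words with zero probability are skipped, following the convention 0 ln(0) = 0.

```python
import math

def entropy(probs):
    """S = - sum(words) prob(word) ln[prob(word)], skipping unused words."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def effective_number(probs):
    """Effective number of words = exp(S)."""
    return math.exp(entropy(probs))

def probs_from_counts(counts):
    """Turn raw word (codon) counts into probabilities."""
    total = sum(counts)
    return [c / total for c in counts]

# All N words equally likely: the effective number is N.
uniform = [1.0 / 61] * 61
print(effective_number(uniform))

# Only one word used: the effective number is 1.
print(effective_number([1.0, 0.0, 0.0]))
```

Feeding `probs_from_counts` the observed codon counts for a species gives an effective number between these two extremes.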
How do organisms stack up?
Organism                              Effective number of codons
E. coli                               51.8
S. cerevisiae                         52.2
Chloroplast, Arabidopsis thaliana     54.2
Arabidopsis thaliana                  55.1
Mitochondrion, Homo sapiens           46.8
Homo sapiens                          54.7
Actually, it's not just codon usage that depends on organism.
The genetic code also varies.
See the genetic code page at NCBI.

Restriction enzyme sites

Visit CuraTools.
Use TACG to do simulated digests
TACG is Copyright (c) 1996,1997 by Harry Mangalam at Univ. of California, Irvine.
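To show what a simulated digest computes, here is a minimal Python sketch. It is not TACG's algorithm or interface. It cuts a linear sequence at every occurrence of one recognition site on the given strand. The defaults model EcoRI (site GAATTC, cut after the first G); the function name, parameters, and example sequence are assumptions made for illustration.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at each occurrence of `site`.

    `cut_offset` is the number of bases into the site where the cut falls
    (1 for EcoRI, which cuts G^AATTC). Returns the list of fragments.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)   # continue past this match
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical sequence containing two EcoRI sites.
fragments = digest("AAGAATTCTTTTGAATTCGG")
print([len(f) for f in fragments], fragments)
```

The fragment lengths are what a gel would show. A fuller tool would also handle circular sequences, degenerate recognition sites, and multiple enzymes at once.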

Medline

Journal database: PubMed (Medline) at NCBI
By now you're familiar with database searches.

New trends: Functional Genomics

From sequence to function.

Gene expression analysis

Proteomics


Copyright 1999 Joel S. Bader jsbader@curagen.com