BCH5425 Molecular Biology and Biotechnology
Spring 1998
Dr. Michael Blaber

Lecture 28

Genomic Libraries

Period of time between the first powered flight and landing on the moon (1903-1969):

66 years

Period of time between discovery of structure of DNA and determination of the sequence of the entire human genome (1953-2010?)

57 years (?)

Genomic DNA libraries

Size of some genomes and chromosomes:

Comparative Sequence Sizes (bases)

Sequence                                  Size
Yeast chromosome 3                        350 thousand
Escherichia coli (bacterium) genome       4.6 million
Largest yeast chromosome now mapped       5.8 million
Entire yeast genome (completed 5/96)      15 million
Smallest human chromosome (Y)             50 million
Largest human chromosome (1)              250 million
Entire human genome                       3 billion

Fragmentation of genomic DNA for library construction

Restriction endonuclease digestion

The exact probability of having any given DNA sequence in the library can be calculated from the equation

N = ln(1 - P)/ln(1 - f)

P is the desired probability

f is the fractional proportion of the genome in a single recombinant

N is the necessary number of recombinants

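For example, how large a library (i.e. how many clones) would you need in order to have a 99% probability of finding a desired sequence represented in a human genomic library created by digestion with a 6-cutter? A 6-cutter cuts on average once every 4^6 = 4096 bp, and the human genome is about 3 x 10^9 bp, so f = 4096/(3 x 10^9):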

N = ln(1 - 0.99)/ln(1 - (4096/(3 x 10^9)))

N = 3.37 x 10^6 clones
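The calculation above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function names clones_needed and mean_fragment_bp are illustrative, and the 4^n fragment size assumes a random sequence with equal base frequencies.

```python
import math


def mean_fragment_bp(site_length):
    """Average fragment size for a restriction site of the given length.

    Assumption: random sequence with all four bases equally frequent,
    so a specific n-base site occurs on average once every 4**n bases.
    """
    return 4 ** site_length


def clones_needed(p, insert_bp, genome_bp):
    """N = ln(1 - P) / ln(1 - f), with f = insert size / genome size."""
    f = insert_bp / genome_bp
    # log1p(-f) evaluates ln(1 - f) accurately when f is very small.
    return math.log(1.0 - p) / math.log1p(-f)


# 99% probability, 6-cutter fragments, 3 x 10^9 bp genome
n_six_cutter = clones_needed(0.99, mean_fragment_bp(6), 3e9)
print(f"Clones needed: about {n_six_cutter:,.0f}")
```

The result agrees with the value of roughly 3.37 million clones worked out by hand.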

Thus, from this type of analysis we can see that we need a technology which will allow us to achieve the following:

  1. Stable insertion of relatively large DNA fragments into our cloning vector
  2. High efficiency of insertion and the ability to handle large numbers of clones

Bacteriophage lambda vectors are commonly used for construction of genomic libraries

Bacteriophage lambda is an E. coli phage whose icosahedral phage particle contains the viral genome.

The advantages of this type of system vs plasmids like pBR322 are:

  1. The phage genome is able to package efficiently with DNA inserts as large as 20 kb.
  2. The packaged phage are highly infectious and infect E. coli at a much higher efficiency than plasmid transformation methods.
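To see what the larger insert size buys, the library-size equation N = ln(1 - P)/ln(1 - f) can be evaluated for both fragment sizes mentioned in these notes. This is an illustrative Python sketch; the function name and the choice of a 99% probability and a 3 x 10^9 bp genome simply reuse the earlier worked example.

```python
import math

GENOME_BP = 3e9  # human genome size used in the worked example


def clones_needed(p, insert_bp, genome_bp=GENOME_BP):
    """N = ln(1 - P) / ln(1 - f), with f = insert size / genome size."""
    return math.log(1.0 - p) / math.log1p(-insert_bp / genome_bp)


inserts = [
    ("average 6-cutter fragment (4096 bp)", 4096),
    ("lambda insert (20 kb)", 20_000),
]

for label, insert_bp in inserts:
    n = math.ceil(clones_needed(0.99, insert_bp))
    print(f"{label}: about {n:,} clones")
```

With 20 kb inserts the required library is several-fold smaller than with average 6-cutter fragments, which is the practical benefit of a vector that accepts large inserts.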

Incomplete Digestion of Genomic DNA will allow identification of sequence overlaps

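Complete digestion with an endonuclease will result in a library containing no overlapping fragments, so the order of the fragments in the genome cannot be determined from the clones themselves. Incomplete (partial) digestion leaves some sites uncut, producing overlapping fragments whose shared sequences reveal which clones are adjacent.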
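The difference between complete and incomplete digestion can be illustrated with a toy sequence. The sketch below is not from the original notes: the 26-base sequence is made up, the function names are illustrative, and the cut is placed one base into the EcoRI site (G^AATTC). It lists fragments as (start, end) intervals and reports which pairs overlap.

```python
def cut_positions(seq, site, offset=1):
    """Coordinates of every cut: `offset` bases into each occurrence of `site`."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    return cuts


def complete_digest(seq, site, offset=1):
    """Fragments (start, end) obtained when every site is cut."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def partial_digest(seq, site, offset=1):
    """All fragments that can arise when any subset of the sites is cut."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [(a, b) for i, a in enumerate(bounds) for b in bounds[i + 1:] if b > a]


def overlapping_pairs(fragments):
    """Pairs of fragments that share at least one base."""
    return [
        (x, y)
        for i, x in enumerate(fragments)
        for y in fragments[i + 1:]
        if x[0] < y[1] and y[0] < x[1]
    ]


toy = "AAGAATTCCCGAATTCTTGAATTCGG"  # hypothetical sequence with three EcoRI sites
complete = complete_digest(toy, "GAATTC")
partial = partial_digest(toy, "GAATTC")

print("complete:", complete, "overlaps:", len(overlapping_pairs(complete)))
print("partial: ", len(partial), "fragments, overlaps:", len(overlapping_pairs(partial)))
```

In the complete digest no two fragments share sequence, whereas the partial digest contains fragments such as (0, 11) and (3, 19) that overlap and can therefore be ordered relative to one another.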

Probing libraries

Once a library (cDNA or genomic) has been constructed we want to be able to identify clones which contain DNA of interest.

In standard methodologies the oligonucleotide probe is phosphorylated at the 5' end using radiolabeled gamma-32P-ATP and T4 polynucleotide kinase.

Note that it is important to keep track of the orientation of the nitrocellulose relative to the x-ray film (usually radioactive ink is used to mark the nitrocellulose orientation).

False positives

If we are designing DNA probes from protein sequence information, the deduced DNA sequence used for the design of the probe will be ambiguous, because most amino acids are encoded by more than one codon.

During oligonucleotide synthesis multiple bases will be incorporated at ambiguous positions.
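The size of the resulting probe mixture can be estimated by multiplying the number of codons for each amino acid in the peptide. The following Python sketch is illustrative only: it uses the standard genetic code, the function names are assumptions, and the two five-residue peptides are hypothetical examples chosen to contrast low and high degeneracy.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = {}
all_codons = (a + b + c for a in BASES for b in BASES for c in BASES)
for codon, aa in zip(all_codons, AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(codon)


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    n = 1
    for aa in peptide:
        n *= len(CODONS_FOR[aa])
    return n


def probe_pool(peptide):
    """Every DNA sequence in the degenerate mixture (use for short peptides only)."""
    return ["".join(codons) for codons in product(*(CODONS_FOR[aa] for aa in peptide))]


# Hypothetical peptides in one-letter code
for peptide in ("MWDKH", "LSRAG"):
    print(peptide, "->", degeneracy(peptide), "possible coding sequences")
```

A peptide rich in Met and Trp (one codon each) gives a small mixture, while one rich in Leu, Ser and Arg (six codons each) gives a mixture of thousands of sequences, which is why the region of protein sequence chosen for probe design matters.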

Antibodies (Immunoglobulins)

If the particular vector, or phage, used to construct a cDNA library contains a promoter region upstream of the insertion site we may be able to screen for desired clones by looking for expression of the protein of interest.

Antigen, antibody, epitope

One of the defense mechanisms of vertebrates is the ability to distinguish between self and non-self molecules.

Antibodies are 'Y' shaped molecules which contain two identical heavy chains, and two identical light chains.

Antibodies are synthesized by B lymphocytes. Each B lymphocyte is capable of producing a single type of antibody directed against a specific structural determinant, or epitope, on an antigen.

If the protein of interest has been purified it can be used to induce an immune response in a host animal.

An antibody isolated from a single B lymphocyte cell population is termed monoclonal.

Sometimes immunizing with the protein of interest is problematic: appropriate amounts of purified material cannot be produced, or the protein is itself toxic at the dosage level necessary to produce an immune response.

As with radiolabeled oligonucleotides, antibodies can be used to identify library clones which contain a cDNA of interest. This method relies upon a host vector or phage which contains a promoter upstream from the site of insertion of the cDNA.
