# Notes for Part II: Information and Heredity

BIOL 112 (Winter 2011) notes for Part II: Information and Heredity (Professor Dent's section). From February 16, 2011 to whenever the semester ends.

## 1 Introduction

• Prof does not have office hours - email him to schedule an appointment, or just email him
• 3 sets of practice problems - do them
• A nice story about a central Asian (Mongol) leader named Temujin around 1220 AD and unknown paternity
• (aka Genghis Khan)
• Had to give his land to his son, but paternity issues, some sort of controversy
• Sort of illustrates the concept of heredity, which is what we're learning now
• The mystery of heredity
• All organisms come from other organisms (except for the base case, which ostensibly was abiogenesis)
• All organisms resemble their parents
• Siblings are not identical
• The magic that is conception
• An egg becomes fertilised and undergoes cleavage (cell division) until eventually you get an organism

## 2 Cell cycle

• All organisms consist of cells, which divide to produce new cells
• Higher organisms fuse their cells (sperm and ova) to produce a new organism
• Cell division results in more cells, etc
• Dividing cells devote a lot of resources to segregating chromosomes
• i.e. organising them in a line, pulling them apart so that each cell gets an equal number of chromosomes
• Chromosome: single string of DNA
• Circular: bacteria
• Linear: most other organisms (including us)
• When a cell is ready to divide, chromosomes condense, associate with proteins (e.g. histones)
• Combination of DNA + proteins = chromatin which is dark and easy to see
• Normal DNA is difficult to see under a light microscope, but when it's ready to divide, easier to see
• The DNA double helix wraps around histones, which wraps around other proteins, etc ... all bunched together
• Karyotype: organising and identifying chromosomes
• Take a cell that is getting ready to divide
• Flatten it with glass (mitotic squash?)
• Stain it with a dye, gives them stripes; take a picture and cut them up etc
• Shows that chromosomes come in pairs - homologs (except for the sex chromosomes, which don't look like each other)
• The number of chromosomes is characteristic of a species (with some exceptions)
• Humans typically have 46, roundworms have 2, pigeons 80
• Doesn't really correspond to complexity or anything
• Just before cell division, each chromosome has been replicated once to produce two chromatids (at the top of the chromosome)
• Two chromatids for each chromosome in a pair, held together by a centromere
• Normally a chromosome is a single piece of DNA
• But when we look at a karyotype, we are seeing two pieces of DNA - the two chromatids, bound by a centromere
• We call these mitotic chromosomes to show that they are a special case
• Segregating chromosomes is exacting: a cell needs to give each daughter cell the right number of chromosomes
• With random segregation, only half the time does each daughter get one chromatid (the right result)
• So there needs to be some sort of mechanism that ensures that the right result is enforced every time
• Organisms need at least one of each chromosome, and typically exactly one
• more can cause problems - ex, Down syndrome - an extra chromosome 21 (which is incidentally the smallest chromosome)
• So the chromosomes must obviously be duplicated BEFORE the cell divides
• NOW THE ACTUAL CELL CYCLE (steps of cell division)
• Chromosome (DNA) replication: S phase
• Mitosis (M) somatic cells divide into daughter cells, each of which inherits one copy of each chromosome
• OR Meiosis (M) for germ lines (gametes) - non-identical copies, creating daughter cells that have one of each homolog
• Cytokinesis: dividing the cytoplasm in two (optional)
• Starts with Gap 1 (G1) - cell is growing, accumulating resources it needs to divide
• Then it duplicates its chromosomes - DNA synthesis (S phase)
• Gap 2 (G2) - another rest phase, preparing to divide
• Then it undergoes mitosis (M), and the daughter cells go through the cell cycle again
• To ensure that this occurs in order, there is a checkpoint between each phase
• From S to G2, cell checks that each chromosome has been replicated
• What evidence of this checkpoint do we have?
• Drug: hydroxyurea, blocks chromosome replication
• So cell division is put on hiatus indefinitely
• Another: caffeine, which disables the checkpoint
• This is usually not a problem, as chromosome replication occurs without problems
• However, if you both block chromosome replication AND disable the checkpoint, you have problems
• Because the cell will attempt to continue to the mitosis stage, but will find that it's not able to divide properly
• Cell cycle tightly regulated to ensure that, in particular, each cell gets the right number of chromosomes
• Diagram later maybe
• More on the G1 to S transition:
• There is a protein called Cdk4, which is always present during the cell cycle but doesn't really do anything
• But then it associates with another protein called Cyclin D, which is only produced during G1 (ahead of S phase)
• This complex interacts with other proteins to say that the cell is ready to go into S
• Cyclin is degraded, Cdk4 is released (no longer functional)
• So this activity of proteins tells the cell what cycle it's in, when to start transition
• Although cyclins and cdks are involved in all parts of the cell cycle, important in different transitions, etc
• How do cells know when to divide?
• Most somatic cells not dividing - arrested in the G1 phase of the cycle
• Often waiting for signals from other cells to tell them to divide
• For example, cells in the immune system: not usually dividing, but when there is an infection, they start dividing
• A cell called a macrophage identifies a virus, sends a chemical signal to T-cells telling them to start dividing, and about the infection
• So T-cells don't start dividing until they get a signal from a macrophage
• Unregulated division of cells → cancer
• If the G1 to S checkpoint is defective, a cell can divide in an unregulated manner (i.e. dividing when it shouldn't be)
• For example, if cyclin E is always active or overabundant, a cell will repeatedly divide → cancer
• If we can understand how signals regulate the cell cycle, we might be able to design drugs to interfere with these signals
• (And thus fight cancer)
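
The hydroxyurea/caffeine evidence for the checkpoint after S phase can be sketched as a toy model. This is only an illustration of the logic in the notes, not a real simulation: the function name, the two flags and the outcome strings are all made up for this sketch.

```python
def run_cell_cycle(replication_blocked=False, checkpoint_disabled=False):
    """Toy model of one pass through G1 -> S -> G2 -> M.

    replication_blocked: stands in for a drug like hydroxyurea (assumption of this sketch).
    checkpoint_disabled: stands in for a drug like caffeine (assumption of this sketch).
    Returns a short string describing the outcome.
    """
    # S phase: chromosomes are replicated unless replication is blocked.
    replicated = not replication_blocked

    # Checkpoint after S: an intact checkpoint holds the cell until replication is done.
    if not replicated and not checkpoint_disabled:
        return "arrested before mitosis"

    # M phase: segregation only works if the chromosomes were replicated.
    if replicated:
        return "normal division"
    return "failed mitosis"
```

Only the combination of both drugs reaches "failed mitosis", which matches the argument above: blocking replication alone just stalls the cell, and disabling the checkpoint alone is harmless when replication goes fine.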

### 2.1 Mitosis

• Although the process of dividing cells randomly and hoping that each half gets the same stuff works for things like mitochondria and ribosomes etc, doesn't work for chromosomes
• Cell can regulate chromosome duplication, to ensure proper segregation (unlikely to happen randomly)
• Mitotic chromosome: the kind we're familiar with, two sister chromatids united by a centromere (only happens during mitosis, usually a messy ball - not yet condensed)
• Mitotic spindle: mechanism for this
• Prophase: centrosomes duplicate, migrate to opposite ends of nucleus, chromosomes start to condense
• Prometaphase: nuclear envelope breaks down, microtubules from spindles (formed from centrosomes, at poles) can interact with chromosomes, bind to each other
• Microtubules are growing and trying to interact with chromosomes, unstably and randomly sampling space
• Because centrosomes don't know where chromosomes are, just looking for them with microtubules
• Each kinetochore is given a geometry so that two kinetochore microtubules from the same spindle pole can't attach to both chromatids
• Basically, kinetochores are the things that the microtubules can attach to, to latch onto the chromatids
• Metaphase: all the chromosomes are lined up in the middle between the poles (equatorially), attached chromatid to centrosome (metaphase plate)
• Chromatids must be paired and kept together until it is time to segregate
• Checkpoint to make sure that all chromosomes are attached to microtubules and on opposite centrosomes, before continuing
• Anaphase: centromeres separate, so sister chromatids are no longer attached; microtubules pull the chromatids apart
• Telophase: shortly before the cell actually divides, equal number of chromosomes in each half
• Cytokinesis: dividing the cell; the mechanism differs between plants and animals
• In plants, vesicles are added between the two new cells and fuse to become the cell plate (cell wall)
• In animals, purse string method - actin and myosin constrict, to pinch and divide the cell
• Note that cytokinesis does not always happen - some cells have multiple nuclei per cell
• Example, cardiac tissue is multinucleated (syncytial) - mitosis without cytokinesis
• Errors: if one chromosome, say, goes to the wrong side, such that one daughter cell has 2 copies of one chromosome and the other has 0, both will die

### 2.2 Meiosis

• Sex: mixing the genetic material of two organisms, for organisms that are different from you and thus may be better adapted
• First, need to reduce the number of chromosomes by half
• Ploidy - the number of sets of chromosomes containing exactly one of each homolog
• Haploid, diploid, triploid, tetraploid, pentaploid etc
• Somatic cells are diploid, so gametes must be haploid
• Meiosis: process by which haploid cells are made, much like mitosis
• Meiosis I: reduction division
• Early prophase I, chromosomes condense, move apart
• Mid-prophase I: chromosomes are condensed, have been duplicated (so 2 of each chromosome, 4 of each chromatid)
• Late prophase I - prometaphase, little pieces of homologous chromatids start exchanging DNA (crossing over at chiasmata, i.e. recombination)
• Metaphase I: all lined up, except two columns, sort of, in the middle (so pairs of homologues lined up)
• Checkpoint before anaphase
• Anaphase I: homologues separate (chromatids stay stuck together), half on each side
• Telophase I: same as above (half on each side)
• However, the process is not complete - still too many chromatids (diploid)
• So then we have meiosis II: equational division, similar to mitosis
• Form a spindle, microtubules grab chromatids, so you end up with four haploid cells
• Prophase II, metaphase II, anaphase II, telophase II etc
• While mitosis can be quick, cells can arrest in meiosis for a long time
• E.g. ova, produced by human females; arrest, wait until they're fertilised
• However, problems with this: Down's syndrome, happens more often with older women
• A problem with meiosis I; two homologues in one cell, so daughter cells (from meiosis II) still have 2 of each chromosome (21 in this case)
• The older the mother, the longer the egg has been "sitting around", in an arrested stage
• And the longer the egg has been sitting around the greater the chance of imperfect segregation (nondisjunction)
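
The meiosis I error described above can be followed by just counting copies of one chromosome (e.g. 21) through both divisions. This is a minimal sketch for a single homologous pair; the function name and the "maternal"/"paternal" labels are made up for illustration.

```python
def meiosis_copies(nondisjunction_in_meiosis_one=False):
    """Return the number of copies of one chromosome in each of the four gametes.

    Start: one homologous pair, each homolog already replicated into two chromatids.
    """
    homologs = ["maternal", "paternal"]

    # Meiosis I: homologs normally go to opposite cells;
    # with nondisjunction both end up in the same cell.
    if nondisjunction_in_meiosis_one:
        cells_after_first_division = [homologs, []]
    else:
        cells_after_first_division = [[homologs[0]], [homologs[1]]]

    gametes = []
    for cell in cells_after_first_division:
        # Meiosis II: sister chromatids separate, so each of the two daughters
        # gets one chromatid from every chromosome present in the cell.
        gametes.append(len(cell))
        gametes.append(len(cell))
    return gametes
```

A normal run gives `[1, 1, 1, 1]`; with nondisjunction in meiosis I it gives `[2, 2, 0, 0]`. A 2-copy gamete fused with a normal 1-copy gamete ends up with three copies.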

### 2.3 Ploidy

• We're diplontic organisms - mostly diploid, small haploid portion (fertilisation etc)
• But there are also haplontic organisms, e.g. algae - almost completely haploid, gametes fuse, undergo meiosis immediately (so briefly diploid I guess)
• It's not the absolute number of chromosomes that's important, but rather the ratio that matters
• As long as the ratio is constant (e.g. 3 copies of chromosome 1, 3 copies of chromosome 3)
• However, odd ploidies tend to be sterile (one daughter cell will have more chromosomes than the other etc)
• Trisomy: genetic anomaly, three copies (instead of two) of a particular chromosome (type of aneuploidy)
• You can also have triploids (sterile though) and tetraploids or even more
• For example, store-bought strawberries ... can be octoploids (makes them bigger, still viable and fertile)
• Another example: frog in South Africa, at one point, evolved to be tetraploid rather than diploid (whole new species); similar but larger than original

## 3 Genetics: Mendel

• Gregor Mendel: Austrian monk
• Pea plant experiment
• Tried to explain why for a given character, offspring share traits with their parents, but siblings' traits not necessarily identical
• Continuous variation: height, skin colour, etc
• Usually leads to blended inheritance
• Discrete variation: only a few possible traits for a given character
• Example: pea flower colour or mouse fur colour
• What Mendel focused on (pea plant colours)
• Used true-breeding strains (e.g. round pea, only gives round peas when fertilising itself)
• True-breeding round + true-breeding wrinkled = only round in F1
• But F2 progeny, 1/4 wrinkled, omg genetics
• Same sort of thing happens with many other traits that exhibit complete dominance
• Vocab terms: genes, alleles (different possibilities for a gene), hetero/homozygous
• Punnett squares etc
• Stochastic process, follows laws of probability, pretty straightforward
• Genotype: set of alleles; phenotype: set of traits
• Genotype uniquely determines phenotype, but phenotypes can be the result of many different genotypes
• Alleles of the same gene separate from each other into different gametes (law of segregation); alleles of different genes assort independently of each other (law of independent assortment)
• Example: SSYY x ssyy
• F1: All SsYy
• F2: 9:3:3:1 (S-Y-:ssY-:S-yy:ssyy)
• Note on Punnett squares: if a parent is heterozygous for n genes, it makes $2^n$ gamete genotypes, so the square has $2^n \times 2^n$ boxes
• Some gametes may have the same genotype; still have to include them
• If you have more than 3 genes, just use probability, not Punnett squares
• Conclusion: heredity inherited through discrete units (alleles)
• Mendel's discoveries were ignored until the discovery of chromosomes (discrete structures that could be responsible for this phenomenon)
• If genes are on different chromosomes, then they can segregate independently during meiosis (arrangement across metaphase plate)
• However, they don't HAVE to be on different chromosomes - chiasma (genetic recombination) during meiosis (?)
• But the law apparently only applies to genes on different chromosomes
• Usually can't see alleles, but in the case of sex chromosomes, you sort of can (XX vs. XY)
• Y is kind of dominant - XXY is male, XO (a single X, no Y) is female
• Red-green colour blindness: sex-linked (X), recessive
• Colour-blind females have affected fathers
• Colour-blind males usually have unaffected parents (colour-blind dad = red herring; colour-blind mother = unlikely)
• There is no colour-blind gene on the Y chromosome - too small I guess
• Wild-type (predominant, >99%) vs. mutant alleles (sometimes purposely-induced mutations)
• Polymorphic alleles - > 1% of population
• Autosomes (non-sex chromosomes) and allosomes (sex chromosomes)
• For genes on the same chromosome:
• Experiment with fruit flies, either two genes are on the same chromosome or not
• Results: the genes are on the same chromosome
• Draw the Punnett squares for both possibilities, results matched the same-chromosome situation
• Also tells you how the genes are arranged on the chromosomes (e.g. both dominants on the same chromosome)
• HOWEVER, you also get some evidence that the genes are NOT linked to the same chromosome
• Of course this is due to genetic recombination (crossover etc)
• Recombination occurs at random points on the chromosome
• So the probability or rate of recombination occurring between two genes depends on their distance from each other
• The further apart two genes are, the more likely that they will be separated
• So measuring recombination rate can tell you about the distance between two genes on a chromosome
• This distance is measured in centiMorgans (cM) = 100 × (number of recombinant progeny / number of total progeny)
• Example: if black is 16.6 cM from vestigial and vestigial is 12 cM from curly ...
• Then you don't know how far black is from curly, could be either 28.6 or 4.6
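
The map-distance arithmetic above can be written out as a short sketch. The function names are made up for illustration, and the progeny counts in the usage note are hypothetical numbers chosen to reproduce the 16.6 cM figure; they are not data from the lecture.

```python
def map_distance_cm(recombinant_progeny, total_progeny):
    """Recombination frequency expressed in centiMorgans (percent recombinants)."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    return 100.0 * recombinant_progeny / total_progeny


def possible_distances(first_to_middle_cm, middle_to_last_cm):
    """Candidate distances between the two genes that were NOT measured directly.

    If the shared gene lies between the other two, the distances add;
    if it lies to one side of both, they subtract.
    """
    return (first_to_middle_cm + middle_to_last_cm,
            abs(first_to_middle_cm - middle_to_last_cm))
```

For example, 166 recombinants out of 1000 progeny (hypothetical counts) give about 16.6 cM, and `possible_distances(16.6, 12.0)` gives roughly 28.6 and 4.6, the two options for black to curly. A third cross measuring black against curly directly would tell you which one it is.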

### 3.1 Continuous variation

• Results from having many genes determine each trait, and many alleles for each gene
• Example, snapdragons and colour (red + white = pink, intermediate phenotype), incomplete dominance
• Reason: white does not have gene for producing red pigment
• Pink is heterozygous, so produces only half the amount of red pigment, so pink
• But if you had many alleles and many genes you can get a true range of phenotypes
• Although not exactly continuous variation, it does illustrate intermediate phenotypes (although there is only one)
• Example in humans: allelic series and Alzheimer's disease
• Different allelic combinations result in different risks of developing the disease
• So the alleles are working additively
• You can also have multiple genes affecting a phenotypic trait (multigenic or polygenic)
• Example: heart disease, many genes working additively
• Each allele acts as a single Mendelian trait, but their sum gives the actual result
• Epistasis: interactions between alleles of different genes
• Example: fur-colour in mice, not a single locus - two genes determine, discrete trait
• If you have two recessive alleles for the albino locus, you are albino
• But if you don't have two recessive alleles for the albino locus, you produce pigment
• But the pigment colour is determined by another gene, Agouti - either brown if you have bb or black if you have B-
• So mice can only be either brown or black or white
• In other words, the genes assort independently, but the ratio is not 9:3:3:1 because of interactions (epistasis); here it is 9:3:4 (black:brown:white)
• Epistasis often results from genes involved in different steps along the same pathway or process
• Environmental contribution to phenotype:
• Penetrance: percentage of individuals with a given genotype showing a certain phenotype
• Expressivity: the degree to which a phenotype is expressed
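
The mouse fur-colour example earlier in this section can be checked by brute force: enumerate the 4 x 4 Punnett square for a cross between two double heterozygotes and apply the phenotype rule from the notes. This is a minimal sketch; the allele letters A/a for the albino locus are an assumption (the notes only name B/b), and the function names are made up.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes from a two-gene genotype such as 'AaBb' (one allele per gene)."""
    first_gene, second_gene = genotype[:2], genotype[2:]
    return [a + b for a, b in product(first_gene, second_gene)]


def coat_colour(albino_alleles, pigment_alleles):
    """Rule from the notes: aa is albino; otherwise bb is brown and B- is black."""
    if albino_alleles == "aa":
        return "white"
    return "brown" if pigment_alleles == "bb" else "black"


def f2_phenotypes(parent="AaBb"):
    """Count phenotypes over the Punnett square of parent x parent."""
    counts = Counter()
    for egg, sperm in product(gametes(parent), repeat=2):
        albino = "".join(sorted(egg[0] + sperm[0]))    # e.g. 'Aa'
        pigment = "".join(sorted(egg[1] + sperm[1]))   # e.g. 'Bb'
        counts[coat_colour(albino, pigment)] += 1
    return counts
```

`f2_phenotypes()` counts 9 black, 3 brown and 4 white out of 16 boxes: the two genes still assort independently, but the aa boxes all collapse into one phenotype regardless of the B/b genotype.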

## 4 DNA and heredity

• Chromosomes: physical things that carry heredity, and they are made of DNA OMG
• How to get the stuff that chromosomes are made from: purify them, assay the components
• Assay: just a way of measuring something
• For example, an allergy skin test assay, to find out what substance is causing allergic reactions
• Separate the components, test them one at a time
• For DNA: grind up any living organism
• Use an organic solvent to extract lipids and proteins (e.g. phenol)
• Precipitate with ethanol, and you get just DNA
• Now do the assay test
• Hypothetically, get some porcupine DNA, dip a cat into the DNA, should result in a pet porcupine RIGHT?
• No
• But you can actually do something similar, with bacteria (discovered by Frederick Griffith)
• Can an extract from dead bacterial cells genetically transform living bacterial cells?
• S strains and R strains of bacteria, S is deadly, R is harmless (when injected into a mouse)
• So take the S strain, heat it up so that the bacteria are dead, inject in mouse - harmless
• However, if you kill the S strain, mix it with the R strain, and inject it into the mouse, it will die
• Essentially you have transformed the R strain into the deadly S strain, by mixing them (something transferred)
• Transforming principle - hereditary
• Oswald Avery, simple experiment to identify the transforming principle - turned out to be DNA (not lipids carbs etc)
• Another experiment on this done on viruses, which can also have DNA
• Specifically, bacteriophages which infect bacteria, inject something into the bacteria
• Which causes more bacteriophage to be made within the bacterium
• Bacteriophage have only DNA and proteins, so the Hershey-Chase experiment set out to show that it was DNA that was being injected
• Done by labelling protein with radioactive sulfur and DNA with radioactive phosphorus (DNA has a phosphate backbone)
• Then blend the mixture of phage and bacteria, so that the phage gets stripped off the surface of the bacteria
• Locate the DNA after centrifuging - is it in the bacteria (pellet I think)? Or is it in the supernatant (liquid at top)
• Shows you clearly that DNA is what is being injected

### 4.1 Structure of DNA

• Nucleotides: base (4 different kinds) + sugar (deoxyribose) + phosphate, polymer backbone (sugar-phosphate-sugar)
• 5' phosphate at one end, 3' hydroxyl at the other end (so directionality)
• Chargaff's rule: ratio of bases (A:T:G:C) specific to organisms (humans have a different ratio than corn etc)
• Chargaff's analysis also showed that the ratio of A:T is 1:1, same for C:G (so those are always paired)
• X-ray crystallography (diffraction etc, cool stuff)
• Showed that DNA consists of two strands, double helix, antiparallel (so each end has a 5' and a 3' end)
• Phosphates probably on the outside
• Watson and Crick put it all together for a good model of DNA
• Minor groove: when backbones are close together; major groove: when they're far apart
• DNA strands are the reverse complement to each other (opposite directions, and A-T, C-G)
• The reason A-T and C-G are paired: natural affinity of bases
• A-T, form two hydrogen bonds when they lie together in a plane with the backbone on the outside
• C-G, form three hydrogen bonds, so these hydrogen bonds determine base pairing
• Note: A, G are purines (large); T, C are pyrimidines (small bases); a purine always pairs with a pyrimidine, so no bulges; distance between two strands always the same
• RNA, another nucleic acid, can also form complementary strands, with U instead of T
• Key to DNA: sequence of nucleotides not constrained by the structure; can accommodate any sequence
• Each strand has the same information as the other strand - so in a sense it's already replicated
• But how does it actually replicate (to make children etc)
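
The "reverse complement" idea and Chargaff's 1:1 ratios can be seen in a few lines. A minimal sketch, with made-up function names; sequences are written 5' to 3'.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the partner strand, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def base_counts(strand):
    """Count A, T, G and C across both strands of the duplex."""
    duplex = strand + reverse_complement(strand)
    return {base: duplex.count(base) for base in "ATGC"}
```

For example, `reverse_complement("CAT")` is `"ATG"`, and for any input the counts from `base_counts` have A equal to T and G equal to C, because every base on one strand is matched by its partner on the other. The overall A:T:G:C mix still depends on the sequence, which is the organism-specific part of Chargaff's rule.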

### 4.2 DNA replication

• Original possibilities considered:
• Semi-conservative, two strands separate, new strand for each
• Conservative, two strands separate, make copies, those copied strands join
• Experiment to determine what actually occurs (Meselson and Stahl):
• To distinguish between new and old DNA - N14, light DNA; N15, heavy DNA
• Use a centrifuge to separate them
• Procedure: grew bacteria in heavy nitrogen (N15) so their DNA was heavy, then let them replicate in light nitrogen (N14), so the DNA became lighter etc
• After one generation, all the DNA migrated to the middle, suggesting that all the DNA is intermediately heavy
• Also, adding a new light generation, etc
• So this supports the semi-conservative model - each pair of strands is half new, half old
• If replication were conservative, this experiment after two rounds would result in 1/4 heavy, 3/4 light
• Reason: the original would remain heavy, would make a new all-light copy after one replication
• After the second round, the original would still be heavy, but the new ones would all be light
• Making DNA in a test tube:
• Triphosphate nucleotides
• DNA polymerase (an enzyme)
• DNA template (an old strand that can be duplicated), which must have ragged ends
• Ragged ends: one of the strands sticks out, which acts as a template to stick a new base on
• For example, if you have a G at a ragged end, you would add a C to it (formed from the materials)
• This ragged end must be at the 3' end (so that you attach new bases to the 3' base)
• Indicates that the direction of polymerisation is 5' to 3' (so the 5' end of the new strand is made first)
• When in the lab, you create ragged ends by breaking up the DNA with phenol etc
• But how does the cell make ragged ends (which it hates) and how does it pull them apart?
• Solution: enzyme called helicase, unwinds the DNA (pries them apart, like unzipping)
• Enzyme called primase, creates fake ragged ends - creates a short RNA primer
• Then the polymerase uses the primer as a 3' end
• Leading strand (can follow the helicase) vs. lagging strand (has Okazaki fragments)
• How exactly does this work? Ascertain this
• This bubble formation is not random, occurs at specific locations in DNA
• For example, bacteria, circular chromosomes; creates two linked rings (which are then broken apart)
• For our linear chromosomes, many different origins of replication, eventually the whole thing is replicated
• Error correction mechanisms:
• DNA proofreading by DNA polymerase III
• Sometimes it can see when it made mistakes, go back, replace it with the right base
• Mismatch repair
• If the new strand does not match the old, an enzyme will fix it
• Excision repair: bases are damaged, will replace them
• Example: UV light produces thymine dimers, where thymine bases are bonded to each other
• An enzyme can recognise this, will rip out the bases and replace them with good ones
• Can repair single bases, multiple bases, etc
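
The Meselson and Stahl predictions from earlier in this section (semi-conservative vs. conservative) can be worked through by tracking each strand as heavy or light. A minimal sketch under the two models described in the notes; the function names and the 'H'/'L' labels are made up for illustration.

```python
from collections import Counter


def replicate(duplexes, model):
    """One round of replication in light (N14) medium.

    Each duplex is a pair of strands, 'H' (N15, heavy) or 'L' (N14, light).
    """
    next_generation = []
    for old_a, old_b in duplexes:
        if model == "semi-conservative":
            # Each old strand pairs with a newly made light strand.
            next_generation += [(old_a, "L"), (old_b, "L")]
        elif model == "conservative":
            # The old duplex stays intact; the copy is entirely new (light).
            next_generation += [(old_a, old_b), ("L", "L")]
        else:
            raise ValueError(f"unknown model: {model}")
    return next_generation


def band_counts(duplexes):
    """Classify duplexes by density: heavy, intermediate or light."""
    names = {0: "light", 1: "intermediate", 2: "heavy"}
    return Counter(names[pair.count("H")] for pair in duplexes)


def run(model, rounds):
    """Start from one fully heavy duplex and replicate it `rounds` times."""
    duplexes = [("H", "H")]
    for _ in range(rounds):
        duplexes = replicate(duplexes, model)
    return band_counts(duplexes)
```

`run("semi-conservative", 1)` gives only intermediate duplexes, which is the single middle band seen after one generation. `run("conservative", 2)` gives 1 heavy and 3 light (the 1/4 heavy, 3/4 light prediction), with no intermediate band at all.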

## 5 The genetic code

• Key to DNA:
• DNA can accommodate any arbitrary sequence, which can encode information
• Two strands encode the same information in complementary format - method for replication
• Alkaptonuria: metabolic disease, urine turns black when exposed to air
• Garrod figured out that it is a recessive hereditary trait (ran in the family etc)
• Then deduced that it results from the absence of a specific enzyme
• So the enzyme would normally convert a certain compound into another, but as that enzyme is missing ...
• The compound remains in that format, and when oxidised, that compound turns black
• Thus genes correspond to enzymes, or something
• Another example of a gene in a pathway (first example: epistasis)
• Show that genes determine enzymes in a biochemical pathway
• Established: one gene, one enzyme, for a series of genes in a pathway
• We now know: one gene, one protein (so each gene encodes for a protein)
• In this pathway, we have compound conversion: ornithine --> citrulline --> arginine (an essential amino acid)
• So this pathway must be intact for organisms to grow etc
• Method: put spores of each arg (mold) mutant strain in a medium with and without nutritional supplements
• Normally if you add the minimal number of nutrients, the wild-type can grow
• But sometimes there are mutants that need additional supplements to grow
• The wild-type: you can add ornithine, or citrulline, or arginine, and it will grow
• One mutant type - will only grow if you add citrulline or arginine
• Another mutant type - only on arginine, clearly missing an enzyme to convert citrulline into arginine
• So it was possible to identify a number of mutants, each one corresponding to a missing enzyme in the pathway
• RNA: ribonucleic acid, how information goes from the nucleus into the cytoplasm (where proteins are actually made)
• So DNA is converted into RNA, which carries information into cytoplasm
• Ribosomes use this information to make proteins
• So DNA information flows to DNA (replication), and also to RNA, then to proteins
• Difference: uracil instead of thymine; ribose keeps the 2' hydroxyl group that deoxyribose lacks
• So you can have these double-stranded hybrids of RNA and DNA
• Transcription: enzyme = RNA polymerase
• Enzyme unwinds the two strands of DNA
• Starts making an RNA strand (5' to 3') complementary to ONE of the strands
• Only goes a short distance - corresponding to how much information it needs
• Then the RNA goes out of the nucleus, ribosomes grab onto it to make proteins
• Fred Sanger - sequenced the first protein (insulin)
• Showed that a sequence of amino acids is characteristic of a protein
• 64 possible triplet sequences (4 × 4 × 4), only 20 amino acids, so yeah some redundancy
• Experiment to figure out which sequences encode for which amino acids:
• RNA with only one of: UUU, AAA, or CCC (repeated)
• Conclusion: each triplet is mRNA for a different amino acid
• Gobind Khorana: AAGAAGAAG
• Some proteins were made of LysLysLys (amino acids)
• Others, ArgArgArg ... others, GluGluGlu
• Due to the three possible reading frames for any sequence of DNA
• If you do enough experiments like that, you can figure out which amino acid each codon corresponds to
• Methionine: AUG/ATG, start codon
• Stop codon: TAG/TAA/TGA (or with U)
• Code is degenerate: while DNA uniquely determines the amino acid sequence, the reverse is not true
• How does the ribosome know how to interpret the RNA sequence?
• Unlike in DNA, where one strand has natural hydrogen bond affinity to the other strand, this is not true for ribosomes
• No affinity of amino acids to specific codons in RNA
• So there is another type of RNA - transfer RNA, has a 3' and 5' end, and has an anticodon
• The reverse complement of the codon that encodes for the amino acid
• There is 1+ tRNA for each amino acid
• So for the codon CAU the anticodon would be GUA read 3' to 5' (AUG written 5' to 3')
• Each tRNA has a unique shape
• So the tRNA binds to the mRNA, and the correct amino acid is on the tRNA (other side)
• Which is how you get the amino acid, the amino acids join up etc
• How does the cell know which amino acid to attach to a specific tRNA?
• Answer lies in the structure of the tRNA itself - shape, also aminoacyl-tRNA synthetases
• Enzymes recognise specific tRNA and amino acids, attach them to each other
• If you, say, converted an amino acid on tRNA to something else, the ribosome wouldn't know, would just blindly insert it
• This code, shared by organisms ... evidence for a single common ancestor, presumably billions of years ago
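
Khorana's AAGAAGAAG result earlier in this section falls out directly from reading the same repeat in three frames. A minimal sketch: the codon table holds only the three codons this example needs (a real table has 64 entries), and the function names are made up.

```python
# Only the codons needed for this example; a full table has 64 entries.
CODON_TABLE = {"AAG": "Lys", "AGA": "Arg", "GAA": "Glu"}


def translate_frame(rna, frame):
    """Translate rna starting at offset `frame` (0, 1 or 2), ignoring start/stop signals."""
    codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
    return [CODON_TABLE[codon] for codon in codons]


def all_frames(rna):
    """Return the peptide read from each of the three reading frames."""
    return {frame: translate_frame(rna, frame) for frame in range(3)}
```

With `all_frames("AAG" * 4)`, frame 0 reads AAG AAG ... (all Lys), frame 1 reads AGA AGA ... (all Arg) and frame 2 reads GAA GAA ... (all Glu), which is why the synthetic RNA gave three different homopolymers.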

### 5.1 Transcription and translation

• Clicker question: if the enzymes for converting a precursor to ornithine and ornithine to citrulline were missing ...
• What would the neurospora grow on?
• Answer: arginine or citrulline. Since citrulline can still be converted to arginine.
• One DNA strand has the same sequence as the mRNA (sense/coding strand)
• This is the strand we look at if we want to see what amino acids are being produced, etc (just change T to U)
• The other one acts as the template - the antisense strand (reverse complement of mRNA)
• So the DNA sequence that is actually transcribed is the reverse complement of the mRNA
• Some genes are transcribed in the other direction - from the other strand of DNA
• It's more or less random which strand of DNA actually gets transcribed
• How does the RNA polymerase know where to start?
• Binds to a specific DNA sequence upstream of the place to transcribe, called a promoter
• So just in front of the portion of the sequence that needs to be transcribed
• So the RNA polymerase recognises this stretch of DNA (the promoter) by its sequence (e.g. TATA box lol)
• All genes have some kind of promoter - not necessarily the same promoter of course
• So these promoters are encoding information, but not for amino acids
• The three-frame problem: how does the cell know which reading frame to choose?
• Experiment: confirmed that coding regions always begin with methionine (ATG)
• The first methionine codon that the ribosome can find signals the start of the reading frame
• Ribosome consists of two subunits, one large one small
• Small one recruits a methionine tRNA
• Large subunit then joins, with the methionine tRNA sitting in the P site
• Will start going along, grabbing tRNA as it goes along
• Will replace each tRNA by another one as needed
• N/amino-terminus: start of sequence
• Ends at a stop codon (release factor - protein that causes the polypeptide to be released when the ribosome encounters a stop codon)
• Wobble pairing:
• Although there's a tRNA for every amino acid, there isn't a tRNA for every codon
• Wobble happens during the interaction of the anticodon with the codon: pairing at one position is less strict
• For example: CAU should have AUG as its anticodon (5' to 3') but instead, it's GUG
• Rules determine which wobbles work, depending on the anticodon position (5' end or 3' end)
• tRNA also has a different kind of base called Inosine (I), can pair with A/U/C in the anticodon position
• Hence, the degeneracy of the genetic code, and why you don't need one tRNA for every codon
• (Because some tRNAs can bind with more than one type of codon etc)
• Know how to determine possible anticodons based on codon sequences for amino acids and the wobble pairing rules
• Transcription and translation can be simultaneous
• In prokaryotes, ribosomes can start translating mRNA before transcription is complete
• Not possible in eukaryotes, because ribosomes can't get into the nucleus, where transcription is occurring
• Proteins can begin to function before translation is finished (before they've been completely synthesised)
• Polysomes - many ribosomes for a single strand of mRNA
• Like a train of ribosomes, all producing the same peptide simultaneously
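
The coding-strand shortcut (change T to U) and the "first AUG sets the reading frame" rule from this section can be put together in a small sketch. The codon table is deliberately partial and only covers the made-up example sequence in the usage note; the function names are also made up.

```python
# Partial codon table: just enough for the example sequence in the note below.
CODON_TABLE = {
    "AUG": "Met", "CAU": "His", "GAA": "Glu", "GUG": "Val",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(coding_strand):
    """mRNA has the coding (sense) strand's sequence with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon (or the end of the mRNA)."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide
```

For the hypothetical coding strand `"GCATGCATGAAGTGTAAGC"`, `translate(transcribe(...))` gives `['Met', 'His', 'Glu', 'Val']`: the leading GC is skipped, the first AUG fixes the frame, and UAA ends the peptide. A codon missing from the partial table would raise a `KeyError`, so extend the table before trying other sequences.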

### 5.2 Mutations

• Demonstration of link between mutation and phenotype - sickle-cell anemia (autosomal recessive trait)
• Unusually high prevalence among Africans - being heterozygous for it is linked to malaria resistance
• Main protein in blood: hemoglobin, binds to oxygen and carries it around the body
• But in sickle-cell anemia, hemoglobin will form fibres, under certain conditions, instead of being soluble in the blood
• Cause red blood cells to stretch, break and eventually die
• Cause: change in the hemoglobin amino acid sequence
• Experiment done by Linus Pauling - cut up hemoglobin into small pieces, using enzymes
• 2-D gel to separate the peptide fragments based on their charge, in one direction (different amino acids have different charges)
• And in another direction based on their size
• So where peptides end up depends on their amino acid sequence
• And it turns out there is one peptide in a different location for sickle-cell patients
• Which is likely due to a different amino acid sequence
• Later, it was sequenced, confirmed - sickle-cell anemia caused by a single base pair change
• Results in a single different amino acid - valine instead of glutamic acid - which is enough to cause sickle-cell anemia
• Classifying mutations:
• Point mutations - small change in DNA (e.g. one base pair changed), usually affect one gene
• Missense mutation: single nucleotide change --> different amino acid produced, so the protein is different (e.g. sickle cell)
• Nonsense mutation: when something becomes a STOP codon, protein truncated early
• Deletion (frame shift): individual nucleotides deleted, which changes the amino acid and also causes a frame shift ... subsequent amino acids are different
• Insertion (frame shift): same as above except adding in a random nucleotide
• Deletion (no frame shift): if you delete 3, you get rid of one amino acid, but there is no frame shift (skipping an amino acid)
• Insertion (no frame shift): same as above but adding 3 (or a multiple thereof)
• Silent mutation: no effect, due to degeneracy in the code (for example, TAG becoming TGA, both stop codons)
• Chromosomal mutations - large changes in chromosomes, usually affect many genes
• Deletion: delete a large chunk of a chromosome, removing many genes in the process (ex: bands missing in chromosome)
• Duplication and deletion: unequal crossing-over (during recombination) - one chromosome has extra genes, the other, missing
• Inversion: a piece of the chromosome is oriented backwards, often has no detectable effect if the break is between genes
• Reciprocal translocation: recombination occurring between non-homologues (which shouldn't be recombining)
• Mutagen: agents that cause changes in the DNA sequence
• Of course, even replicating DNA itself can cause mutations
• Mutations can occur in two different cell populations - somatic cells and germ line cells
• If a somatic cell gets mutated, then the somatic cells descended from it may carry the mutation too, but it can never escape the body
• Can kill a cell, or make it cancerous, or harm it or do nothing or whatever
• Transmitted to daughter cells, but never to progeny
• If a germ line cell gets mutated, then progeny may be mutated too ... that allele can be a polymorphism in the gene pool
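
A minimal sketch of the point-mutation categories above, assuming the standard genetic code and a coding-strand DNA sequence that starts in frame. The function names and the toy sequences are made up for illustration; `ATGGAGAAG` is not the real hemoglobin gene, it just mimics the glutamic acid to valine change.

```python
# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate coding-strand DNA, stopping at the first stop codon."""
    protein = ""
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein


def classify(original, mutant):
    """Classify a mutation using the categories from the notes."""
    diff = len(mutant) - len(original)
    if diff != 0:
        # Length changed: only a multiple of 3 keeps the reading frame.
        kind = "insertion" if diff > 0 else "deletion"
        shift = "no frame shift" if diff % 3 == 0 else "frame shift"
        return f"{kind} ({shift})"
    before, after = translate(original), translate(mutant)
    if before == after:
        return "silent"
    if len(after) < len(before):
        return "nonsense"  # a new STOP codon truncated the protein
    return "missense"


normal = "ATGGAGAAG"                  # Met-Glu-Lys (toy sequence)
print(classify(normal, "ATGGTGAAG"))  # Glu -> Val, like sickle cell
print(classify(normal, "ATGTAGAAG"))  # GAG -> TAG (stop)
print(classify(normal, "ATGAGAAG"))   # one nucleotide deleted
```

The length-based branch is a shortcut: it only looks at how many nucleotides were added or removed, not where.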

## 6 Molecular genetics of prokaryotes

### 6.1 Viruses

• Viruses first discovered by Dmitri Ivanovsky - Tobacco mosaic virus
• Many tobacco plants dying of a disease
• The only infectious agent known at the time = bacteria
• But when he ground up the leaves, looked under a microscope, couldn't see them
• And it had to be an infectious agent, not a toxic agent - even when diluted and transferred many times, still infectious
• If it were bacterial, using a Chamberland filter would get rid of the infections
• (Filter whose holes were large enough for water to pass through but too small for bacteria)
• However, when these were used, the infection was still transmitted ... so the infectious agent must be smaller than bacteria
• So viruses were defined based on their disease-causing abilities and size
• Initially, viruses defined based on their host specificity
• Then later, based on their genetic material (some have only RNA, no DNA)
• Several forms of genetic material:
• Single-stranded DNA
• Double-stranded DNA (e.g. chicken pox virus)
• Single-stranded RNA (e.g. tobacco mosaic, influenza)
• Double-stranded RNA
• The above can all be linear, like us, or circular, like bacteria
• Bacteriophage - viruses that infect prokaryotes
• Infects a bacterium, injects its DNA into it, uses the bacterium's ribosomes to replicate
• Eventually all the bacterial genes have been digested, and lots of phage proteins have been made etc
• With T4, there is a lytic cycle - burst out of the bacterium, infect many others
• Lysogenic phage - phage injects chromosome into the bacterium's DNA (now called prophage, continuous with bacterial chromosome)
• Might be done if conditions are not good - don't want to infect others just yet
• And the bacterium does not know that its DNA has phage DNA, so it keeps replicating
• In the process, the virus is replicated in all the host's offspring lol
• Some phage have both cycles, others are only one or the other
• When lysing, it's important to assemble all your phage parts before you lyse
• Done using promoters
• Elements in front of genes, indicate when they should be transcribed
• Early genes need to be transcribed first, then they can turn on the "late" genes, which are responsible for lysis
• Phage are capable of exchanging DNA
• In a high multiplicity infection (when you can have more than one phage infecting a bacterium at the same time)
• Phage have only one single chromosome
• If they have different alleles, they can recombine, like in prophase I of meiosis ... cis to trans and vice versa
• Can influence what the plaques look like (dark/light/small/large etc)
• Eukaryotic viruses often have an additional layer of complexity
• Glycoprotein envelope on outside (coat)
• Lipid bilayer membrane, with the proteins above embedded in them (sometimes, not always)
• Nucleocapsid
• Viral RNA/DNA genome
• Binds to surface of the cell due to proteins on their membrane binding with proteins on surface of cell
• Responsible for tropisms of viruses - the certain types of cells that they like to infect
• Allows them to get inside the cell through a vesicle; the two membranes fuse, virus released into cytoplasm
• Virus genetic material is transcribed; proteins made
• These proteins go through the Golgi, then bud off, forming new virus particles (using the cell machinery - co-opting)
• Eukaryotic viruses can exchange genetic material
• H1N1 - mixture of various influenza strains
• HIV retrovirus - have RNA as genetic material when they enter the cell
• Also carries a protein called reverse transcriptase
• Known to bind to proteins (CD4, used for finding infectious agents) on the surface of immune system cells
• The RNA enters the cell, then makes a DNA copy of its RNA genetic material
• Then inserts DNA copy of genome into the genome of the host, damn
• And once the cell divides, the host will make RNA copies of the retrovirus' DNA, which is just the original RNA
• So it will make new viruses just by replication ... the virus is now part of the host cell's genome
• Chicken pox - caused by a double-stranded DNA virus (varicella)
• Once you get over chicken pox, the virus travels through nerve cells, infects the spinal cord, lies there dormant
• So you can get recurring infections - e.g. shingles (from zoster), virus becomes active again
• Member of the herpes virus family - characterised by reemergent infections, difficult to get rid of
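
The retrovirus bullets above say that the host's RNA copies of the inserted DNA are "just the original RNA". A minimal sketch of that round trip, assuming simple base-pairing only (no integration step, no mutation); the function names and the toy genome are made up for illustration.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """DNA strand (5'->3') complementary to an RNA strand (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def transcribe(template_dna):
    """RNA (5'->3') made from a DNA template strand (5'->3')."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna))


viral_rna = "AUGGCUUAA"                 # toy genome, not a real virus
dna_copy = reverse_transcribe(viral_rna)
print(dna_copy)
print(transcribe(dna_copy) == viral_rna)  # the host remakes the viral RNA
```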

### 6.2 Bacteria

• Bacterial replication - usually happens asexually (binary fission)
• Experiment with phenotypes - ability to grow on different types of media
• One strain of E. coli, needs methionine and biotin for growth
• Another strain, threonine and leucine required for growth
• When you combine them, some bacteria don't need either ... due to genetic material being exchanged
• Conjugation: sex pilus, tube formed between two bacteria, one feeds plasmids into the other, which has none
• F-plasmid: necessary for forming conjugations; an F+ bacterium can form a conjugation tube
• Then the bacterium that receives the plasmids becomes F+, can also form conjugation tubes
• This process also allows chromosomal DNA to be moved
• The plasmid gets inserted into the chromosome - Hfr (high frequency of recombination) strains
• Bacteria have circular DNA, so they just insert themselves somewhere in there
• Or, they can just recombine partly (like in prophase of meiosis I), the non-used part degrades; the rest divides
• Experiment: allow bacteria to start to conjugate, then interrupt the process
• Frequency of a gene being transferred through a tube depends on how close it is to the origin of transfer (where the inserted plasmid starts feeding DNA through)
• Allowing you to map the genes
• Transduction: "hitchhiking on phage"
• Phage accidentally carrying around a piece of bacterial chromosome
• They can't replicate, but they can insert it into other bacteria, spreading those genes around, lol
• Transformation: taking up free DNA (e.g. plasmids) from the environment
• In some bacteria, enzyme levels increase to take advantage of increased lactose in the environment
• In E. coli - three genes responsible for metabolising lactose (including the one for beta-galactosidase), sharing one promoter
• The three genes form an operon - transcribed onto same mRNA; ribosomes make them into separate proteins
• Lac operator - between the promoter and genes; the lac repressor binds to it and prevents polymerase from making RNA from these genes
• Basically represses these genes during specific conditions (repressor molecule, binds to DNA)
• But when lactose gets into the cell, lactose binds to repressor molecule ... allosteric regulation
• The repressor molecule no longer binds to DNA, so the gene gets turned on basically
• The reverse is also possible - the regulation of tryptophan formation
• If tryptophan is present, tryptophan binds to repressor, the repressor blocks the polymerase from making more enzymes
• So, a negative feedback mechanism to create an appropriate amount of tryptophan (homeostasis)
• What if you replaced the lac operon with the trp operon?
• Cells can build sort of complex logic circuits using these repressors
• Sophisticated regulation based on what's present in the cell at the time
• There are also activator proteins - cyclic AMP receptor protein
• Cyclic AMP is usually present in high quantities in the cell in the absence of glucose
• When cyclic AMP binds to this receptor protein, the complex encourages polymerase to come and transcribe this gene
• The bacterium weighs the relative advantage of glucose vs. lactose
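
The repressor and activator bullets above amount to a small logic circuit. A rough sketch, assuming each signal is simply present or absent and that the output comes in three levels ("off", "low", "high"); those levels and the function names are a simplification made up for illustration, not something stated in lecture.

```python
def lac_operon_transcription(lactose, glucose):
    """Rough transcription level of the lac operon."""
    # The repressor sits on the operator unless lactose binds to it.
    repressor_bound = not lactose
    # Cyclic AMP is high when glucose is absent; with its receptor
    # protein it acts as an activator that helps polymerase bind.
    activator_bound = not glucose
    if repressor_bound:
        return "off"
    return "high" if activator_bound else "low"


def trp_operon_transcription(tryptophan):
    """The reverse case: the repressor only binds DNA WITH tryptophan."""
    return "off" if tryptophan else "on"


for lactose in (False, True):
    for glucose in (False, True):
        print(lactose, glucose, lac_operon_transcription(lactose, glucose))
```

The operon is only fully on when lactose is around and glucose is not, which is the "weighing" of the two sugars.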

## 7 Molecular genetics of eukaryotes

• Back to ploidy - greater ploidy, greater size
• One possibility: just has more cells (not true)
• Other possibility: just has bigger cells (this is true, generally)
• Cell size roughly correlated to genome size
• So, possible to find impressions in bones of dinosaur fossils, which give the size of cells
• From that, you can get the size of the dinosaur DNA
• Genome size usually given in number of base pairs - the haploid number (so number of base pairs PER set of chromosomes)
• So humans have 6 billion bp per (diploid) cell but 3 billion base pairs per haploid genome
• Although you'd expect humans to have many genes based on numbers in other organisms (e.g. roundworms)
• Humans actually only have about 30,000 genes
• Turns out humans have much more DNA than they actually need for those genes
• Large stretches of non-coding DNA between genes that actually code for things
• Exons: things that code for proteins; introns: don't encode for proteins
• When the RNA gets transcribed from the DNA, all of it gets transcribed into a primary RNA transcript
• Then the introns get cut out, and the exons get spliced together, so you have a continuous RNA strand, all of it coding for amino acids
• Experiment that showed this: heat up a sequence with an intron between exons
• Hybridisation: forms loops, so the exons join together
• Through heating up and cooling down ?
• There are enzymes that do this
• Small nuclear ribonucleoprotein (snRNP) - enzyme that consists of protein and RNA (RNA strand, essential component of enzymatic activity)
• Recognises specific sequences at a 5' splice site
• Then loops an intron out, sort of ... splicing
• Example of RNA processing (doing things to the RNA before it's translated by ribosomes)
• Others: getting a 5' G cap, and polyadenylation (addition of a poly A tail)
• G-cap: guanine added to the 5' end through a 5'-to-5' link; poly A's added to the 3' end
• Tells the cell that this is mRNA that needs to be translated
• Purpose of junk DNA?
• Telomeres: protects the ends of DNA, because there's an inherent flaw in the way DNA polymerisation occurs
• When ready to replicate DNA, separate the strands, add an RNA primer (3' hydroxyl to start the process) - helicase + primase
• So DNA polymerase III can then come along, use the primer to start synthesising DNA
• But then DNA polymerase I rips out the RNA, and there's a gap at the end, since it's a linear chromosome; that end not filled in
• If you do this once, you get recessed ends on both ends of the chromosome
• Every time you replicate the DNA, it gets shorter ... until eventually you have no chromosomes
• Cell deals with this by creating telomeres. Short stretches of extra DNA, that are useless, on the ends of chromosomes
• Acts as a buffer so that when it's lost, it doesn't matter
• Telomerase comes along, adds a lot of repetitive DNA to the end of a sequence
• Telomerase is very important:
• Most human cells don't express telomerase; as DNA replicates, chromosomes keep getting shorter
• Until they get too short, triggers a checkpoint, cells stop dividing
• Part of what happens in human aging - your telomeres start getting shorter, eventually you run out of telomeres
• Also, cancer cells often express telomerase, so they can divide indefinitely
• Clicker question: 100 six-base-pair telomeric repeats on the end of chromosomes
• And primers are 25 base pairs long
• You get maximum 24 divisions before you lose all your telomeres
• Some non-coding DNA actually has a function
• Centromeric regions tend to have highly repetitive DNA, which is how cells know where to find a centromere
• However, the vast majority of DNA that doesn't code for cellular proteins = transposable elements
• Pieces of DNA that don't really have a purpose
• Transposons: encode proteins called transposase in their middle
• Retrotransposons: look like retroviruses, only they aren't infectious
• Non-LTR retrotransposon: have reverse transcriptase
• Alu sequence: something
• Transposons: have inverted repeats on either end, they can excise themselves, insert themselves somewhere else
• Sometimes they land in a cellular gene, disrupting the gene and causing a mutation
• Reason there are so many transposons:
• When one moves somewhere else, it leaves a gap
• Cell fills in the gap with whatever is on the homologous chromosome
• So you get another transposon where it used to be
• Retrotransposons have an RNA intermediate, require reverse transcriptase
• Reverse transcriptase gene gets made into a protein by the cell
• Makes DNA from the RNA, that DNA goes back into the genome, inserts itself somewhere
• Non-autonomous non-LTR: get transcribed, piggyback on reverse transcriptase made by other genes in the genome, same thing
• Pseudogenes: things that look like cellular genes, but without the introns
• Made when reverse transcriptase acts on mRNA, making DNA that then gets inserted back into the genome
• But the copy doesn't have a promoter or anything, so it can't actually do anything
• Conclusion: genome is a complex ecosystem, transposable elements compete for survival
• Which came first? polymerase gene (the DNA) or the polymerase enzyme (the protein)?
• Chicken and egg problem lol
• Answer came from the study of self-splicing introns
• Experiment: put RNA in test tube without any enzymes, as a control
• Supposed to be a negative control ... but the RNA still got spliced
• Turns out the RNA can fold into a structure that has catalytic properties, which then splices itself out
• Ribozyme - RNA enzyme; can catalyse a variety of reactions just like proteins
• Can hybridise to itself
• Example: ribosomes are actually ribozymes, just RNA with a few unimportant proteins associated with it
• So the first enzyme could have had just RNA, and this RNA resulted in ribozymes
• Maybe there was RNA polymerase that could replicate itself
• So primitive organisms could have been made of RNA that also acted as an enzyme, enclosed in a lipid membrane
• Housekeeping genes ..?
• Cells that do not have a particular protein usually don't have the mRNA for it
• So regulating the transcription of a particular gene - main method of controlling which proteins are present
• In both eukaryotes and prokaryotes, transcription is regulated at the promoter element where RNA polymerase binds (e.g. the TATA box)
• Which is usually about 25 bp upstream of the actual gene
• Eukaryotes have three polymerases
• I: rRNA
• II: mRNA (most important)
• III: tRNA and other small RNAs
• Polymerase doesn't just bind to the DNA and transcribe; proteins of the TFII family prepare the way
• TFIID - complex of proteins including TBP, binds to the TATA box
• TFIID recruits other TFIIs that eventually recruit the polymerase II
• Enhancers and silencers - unique to eukaryotes
• Can act from a great distance away
• Can be inverted and will still work
• Transcription factors bind to specific sequences; protein domains probe major groove of DNA
• These factors determine what proteins are synthesised in different tissues
• Example: heat shock genes
• Genes can mix and match silencers and enhancers to get a unique pattern of proteins expressed
• How histones affect DNA transcription:
• Heterochromatin, stains darkly; not transcribed in contrast to the lightly-stained euchromatin
• For example, in females, one X chromosome is inactivated by condensing it, so it becomes unreadable
• This happens early during embryogenesis; one chromosome chosen at random for inactivation in the cell
• Then every daughter cell will also have the same chromosome inactivated (in the form of a Barr body)
• Different cells will have different X-chromosomes expressed, so different parts of a female can have different phenotypes
• Example, Calico cat - one yellow and one black allele
• Some parts of the body, yellow allele activated; others, the black allele activated, resulting in patches
• Rate of transcription can also be controlled by having multiple copies of the same gene
• Example: humans have like 280 rRNA genes, because they need them (to make ribosomes)
• However, frogs have millions, because they need to be able to make tadpoles quickly
• Called gene amplification - some cancer cells do this to become resistant to anti-cancer drugs
• Alternative splicing
• Sometimes, when introns are spliced out, so are exons by mistake
• Results in proteins that are missing chunks
• For example, you can mutate the doublesex gene in Drosophila to create transgendered flies
• So you suppress neither femaleness nor maleness
• RNA stability - must be unstable otherwise you only need to make RNA once and it would be around forever lol
• Example: excess tubulin binds to tubulin RNA, decreasing stability ... negative feedback loop
• Control of translation
• Changing the amount of capping
• Factors binding to the RNA to prevent ribosome attachment
• RNA interference: cell makes small RNA, does not encode protein, but complementary to mRNA that does
• This RNA (siRNA) then binds to that mRNA with the help of the protein Dicer, blocking its translation
• Post-translational controls:
• Kinases: enzymes that phosphorylate other proteins (important way proteins control each other's activity)
• Selective protein degradation - proteins present in excess are marked for degradation
• Trade-off between efficiency and speed
• Transcriptional regulation is efficient because unnecessary proteins are not made
• So no energy is wasted, but it takes longer to respond
• Post-translational regulation is faster but not very efficient, you make proteins that do nothing until they're activated
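
The telomere clicker question earlier in this section is just a division. A minimal sketch, assuming each round of replication shortens the end by exactly one primer length and nothing adds repeats back (no telomerase); the function name is made up for illustration.

```python
def max_divisions(repeats, repeat_length_bp, primer_length_bp):
    """Divisions possible before the telomere is used up."""
    telomere_bp = repeats * repeat_length_bp
    # Each division costs one primer's worth of DNA at the end.
    return telomere_bp // primer_length_bp


# 100 six-base-pair repeats, 25 bp primers
print(max_divisions(100, 6, 25))  # 600 bp / 25 bp = 24
```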

## 8 Recombinant DNA technology

### 8.1 Cloning

• Genetic engineering: taking a single gene from one organism, putting it in another, perhaps altered
• Diabetes type I:
• Normally, when blood glucose is high, pancreas releases insulin, causes glucose to be absorbed
• When low, pancreas releases glucagon, liver releases glucose into bloodstream for other cells to use
• In diabetes, insulin not produced; you die, bathed in glucose
• 1921, Banting and Best purified insulin from dogs, could treat diabetes
• Pigs were later used to produce massive amounts of insulin
• But pig insulin isn't exactly the same - humans often had an immune reaction to it
• Turns out there is one amino acid difference between pig and human insulin
• 1970s, idea - take the human gene for insulin, put it in a plasmid, transform into bacteria
• First problem: bacteria can't splice, so you have to put the gene in without introns
• Do this by making a DNA copy of the mRNA that the introns have already been spliced out of
• To do this, use the reverse transcriptase enzyme from a retrovirus --> produces cDNA
• Second problem: how do you find the right mRNA for making insulin?
• Answer - you don't. Just take all of the mRNA in the cell
• Put each mRNA's cDNA into a different bacterium, then try to figure out which one produces insulin
• Plasmid used to carry the DNA = vector
• Cut the plasmid so that it is linear
• Then attach each end of a cDNA to the plasmid, recreating a circle
• DNA is cut using a class of enzymes - restriction endonucleases/enzymes
• Proteins that identify a specific 6-base-pair (usually) stretch of DNA, cut it at that point for both strands
• This specific stretch called the restriction site
• Then, pasting the two ends together requires ligase to form phosphodiester bonds between cDNA and plasmid
• Usually, put a gene for antibiotic resistance on the plasmid (e.g. ampicillin resistance)
• So if we put in some ampicillin, only the bacteria that have the resistance gene will grow
• This phenotype (ampicillin resistance) called the transformation marker
• Creates a cDNA library - every member of a colony will have identical plasmids, all colonies, different cDNA
• How do you find the colony that has the insulin cDNA? Use a DNA probe (piece of DNA used to find complementary DNA)
• Basically you heat a stretch of DNA, which pulls the strands apart
• If you let them cool, the strands will find each other again due to hydrogen bonds between the base pairs (hybridisation)
• The longer the stretch of the complementary sequence, the greater the affinity to the other strand
• So you synthesise a short stretch of radioactive DNA that is complementary to the sequence you're looking for
• Then you expose the cDNA library to the probe - should stick to the right colony
• You can then find the radioactive colony - that's the one you want
• For insulin, deduce a sequence that codes for insulin and synthesise the corresponding sequence
• Due to the degeneracy of the code, you have to use a mixture of all the possible corresponding sequences
• Now you have found the human gene for insulin and put it in bacteria
• Next step, make the bacteria express the gene - give it a promoter and terminator so RNA polymerase will recognise it as a gene
• Back to restriction enzymes - each one leaves a different, characteristic overhang when it cuts
• And the sticky ends will try to find each other
• So to get the insulin gene expressed: cut it out of the library vector with a restriction enzyme
• Cut the expression vector that has the promoter with the same enzyme
• Then you mix the two, let the ends find each other; again, add ligase to make sure it stays
• Problem: sticky ends usually just go back to each other, instead of combining the way we want them to
• To make sure the genes combined the way we want them to, we can look at the plasmid DNA using gel electrophoresis
• Because DNA is negatively charged, will move in an electric field towards the anode (positively charged end)
• If you force the DNA to move through a thick tangle of agarose fibres (strings of sugars), will move slowly
• How slowly, depends on the size, so larger DNA moves more slowly
• If we know which size insulin should be, then we can find the right DNA
• Now, the bacterial RNA polymerase will recognise the insulin-making gene, will transcribe and translate it
• So it will make insulin. You can also turn this on and off with lactose
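
The DNA probe bullets above mention using a mixture of all possible sequences because the code is degenerate. A minimal sketch of how big that mixture gets, assuming the standard genetic code; the function name and the toy peptides are made up for illustration (they are not pieces of insulin).

```python
from itertools import product

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def possible_coding_sequences(peptide):
    """All DNA sequences that could code for a short peptide."""
    # Group codons by the amino acid they encode.
    codons_for = {}
    for codon, amino_acid in CODON_TABLE.items():
        codons_for.setdefault(amino_acid, []).append(codon)
    # One choice of codon per amino acid, in every combination.
    choices = [codons_for[amino_acid] for amino_acid in peptide]
    return ["".join(combo) for combo in product(*choices)]


print(possible_coding_sequences("MFW"))       # Met-Phe-Trp
print(len(possible_coding_sequences("LSR")))  # three 6-codon amino acids
```

Stretches rich in methionine and tryptophan (one codon each) keep the mixture small, while leucine, serine and arginine (six codons each) blow it up quickly.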

### 8.2 Gene mapping

• Positional cloning - cystic fibrosis
• Inherited, autosomal recessive, respiratory infections - can't move mucus out of lungs
• Purpose: find the specific chromosome and then the specific region where the gene for cystic fibrosis lies
• We could then identify and clone the gene
• Problems:
• Not that many human traits that show simple Mendelian inheritance against which you can map the disease trait
• Solution: restriction fragment length polymorphisms
• The positions of restriction sites vary from one individual to the next, so the lengths of the fragments you get do too
• So, cut DNA with BamHI; you can then test for a particular RFLP using a PCR test (polymerase chain reaction)
• PCR = method of producing large quantities of a short stretch of DNA using a DNA template
• Idea: two DNA primers that flank the stretch, one complementary to each strand
• Mix the primers with the template DNA, heat to separate the strands, then cool to allow hybridisation (annealing)
• Then use DNA polymerase to synthesise complementary strands of each piece
• Primers bind to the 3' stretch of the DNA to be amplified on each complementary strand
• Initiate polymerisation in different directions; you now have two double strands
• Repeat the process; you get a geometric increase in the amount of DNA between the primers
• So basically this amplifies a small stretch of DNA enough that you can run it on the gel
• Problem with PCR - heating the DNA destroys the polymerase in the process
• Rather than add new polymerase each round, able to isolate polymerase from Thermus aquaticus
• Bug that lives at 95 C in Yellowstone national park, obviously able to withstand high heats
• So then you amplify the region of the chromosome that has the RFLP of interest
• Then cut the amplified DNA with restriction enzymes, see if it cuts
• If it cuts you have two shorter bands; if not, one large band
• As a result, you know what alleles of the particular RFLP are present in that individual
• Behave in a Mendelian fashion - you can distinguish homozygotes from heterozygotes (both long and short bands)
• Where do you get RFLPs from in the first place? trial and error
• Sequence the DNA of individuals, look for common polymorphisms that change the sequence of restriction sites
• Look at people who have family histories of CF, look at their RFLPs, you can find RFLPs linked to the CF gene
• You can then look at the human genome and figure out where on the chromosome the CF gene must be, roughly
• But you still need to figure out which of the genes in that region is the CF gene
• One way is through a cDNA library
• Other way, computers - bioinformatics; look through the genome, find similar things etc
• Find the difference between the CF gene in people with CF and people without
• Sequencing DNA:
• Synthesise a new strand of DNA using the DNA of interest as a template
• But make it inefficient at one nucleotide, so that it stops every time that nucleotide is added
• Then you run a gel, see how long the synthesised DNA is, tells you how far it travelled before it stopped
• Which tells you where that nucleotide is
• Do this for all four nucleotides ... you can figure out their sequence
• To make it inefficient, mix in a dideoxyNTP, which lacks the 3' OH necessary for polymerase to continue
• Do four reactions, each with a different dideoxyNTP; you get four mixtures of DNA with different lengths
• Turns out, affected individuals have a small 3-base-pair deletion in an exon; removes single phenylalanine
• The CF gene encodes a chloride channel - regulates chloride flow and thus controls salt balance
• When it is defective, then the mucus that coats epithelial cells is too dry
• Possible treatment: put correct CF gene on an adenovirus, infect CF patients with it, will replicate (not very successful)
• CF has led to increased resistance against typhoid fever in the past, probably why it is still around
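
The dideoxy sequencing bullets above can be sketched as follows. This is an idealised sketch, assuming every possible stopping point shows up as a band (in a real reaction you get a random mixture of them) and that lengths are counted from the start of the new strand; the function names and toy template are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template):
    """For each dideoxy reaction, list the lengths at which synthesis stops."""
    # The new strand is built 5'->3' along the template read 3'->5'.
    new_strand = "".join(COMPLEMENT[base] for base in reversed(template))
    lengths = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # The ddNTP for this base stops some strands at this length.
        lengths[base].append(position)
    return lengths


def read_gel(lengths):
    """Read the four lanes together, shortest fragment first."""
    bands = sorted((n, base) for base, ns in lengths.items() for n in ns)
    return "".join(base for _, base in bands)


lanes = sanger_fragments("ATGC")  # toy template, written 5'->3'
print(lanes)
print(read_gel(lanes))            # sequence of the newly made strand
```

Reading the gel gives the sequence of the new strand, so the template is its reverse complement.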