Wikinotes

Maintainer: admin

BIOL 112 (Winter 2011) notes for Part II: Information and Heredity (Professor Dent's section). From February 16, 2011 to whenever the semester ends.

Prof does not have office hours - email him to schedule an appointment, or just email him
3 sets of practice problems - do them
A nice story about a central Asian ruler named Temujin around 1220 AD and unknown paternity
- (aka Genghis Khan)
- Had to give his land to his son, but paternity issues, some sort of controversy
- Sort of illustrates the concept of heredity, which is what we're learning now
The mystery of heredity
- All organisms come from other organisms (except for the base case, which ostensibly was abiogenesis)
- All organisms resemble their parents
- Siblings are not identical
The magic that is conception
- An egg becomes fertilised and undergoes cleavage (cell division) until eventually you get an organism

All organisms consist of cells, which divide to produce new cells
Higher organisms fuse their cells (sperm and ova) to produce a new organism
Cell division results in more cells, etc
Dividing cells devote a lot of resources to segregating chromosomes
- i.e. organising them in a line, pulling them apart so that each cell gets an equal number of chromosomes
Chromosome: single string of DNA
- Circular: bacteria
- Linear: most other organisms (including us)
- When a cell is ready to divide, chromosomes condense, associate with proteins (e.g. histones)
- Combination of DNA + proteins = chromatin which is dark and easy to see
- Normal DNA is difficult to see under a light microscope, but when it's ready to divide, easier to see
- The DNA double helix wraps around histones, which wraps around other proteins, etc ... all bunched together
Karyotype: organising and identifying chromosomes
- Take a cell that is getting ready to divide
- Flatten it with glass (mitotic squash?)
- Stain it with a dye, gives them stripes; take a picture and cut them up etc
- Shows that chromosomes come in pairs - homologs (except for the sex chromosomes, which don't look like each other)
The number of chromosomes is characteristic of a species (with some exceptions)
- Humans typically have 46, roundworms have 2, pigeons 80
- Doesn't really correspond to complexity or anything
Just before cell division, each chromosome has been replicated once to produce two chromatids (at the top of the chromosome)
- Two chromatids for each chromosome in a pair, held together by a centromere
Normally a chromosome is a single piece of DNA
- But when we look at a karyotype, we are seeing two pieces of DNA - the two chromatids, bound by a centromere
- We call these mitotic chromosomes to show that they are a special case
Segregating chromosomes is exacting: a cell needs to give each daughter cell the right number of chromosomes
- With random segregation, only half the time does each daughter get one chromatid (the right result)
- So there needs to be some sort of mechanism that ensures that the right result is enforced every time
- Organisms need at least one of each chromosome, and typically exactly one
  - more can cause problems - ex, Down syndrome - an extra chromosome 21 (which is incidentally the smallest chromosome)
So the chromosomes must obviously be duplicated BEFORE the cell divides
NOW THE ACTUAL CELL CYCLE (steps of cell division)
- Chromosome (DNA) replication: S phase
- Mitosis (M) somatic cells divide into daughter cells, each of which inherits one copy of each chromosome
- OR Meiosis (M) for germ lines (gametes) - non-identical copies, creating daughter cells that have one of each homolog
- Cytokinesis: dividing the cytoplasm in two (optional)
- Starts with Gap 1 (G1) - cell is growing, accumulating resources it needs to divide
- Then it duplicates its chromosomes - DNA synthesis (S phase)
- Gap 2 (G2) - another rest phase, preparing to divide
- Then it undergoes mitosis (M), and the daughter cells go through the cell cycle again
- To ensure that this occurs in order, there is a checkpoint between each phase
- From S to G2, cell checks that each chromosome has been duplicated
- What evidence of this checkpoint do we have?
  - Drug: hydroxyurea, blocks chromosome replication
  - So cell division is put on hiatus indefinitely
  - Another: caffeine, which disables the checkpoint
  - This is usually not a problem, as chromosome replication occurs without problems
  - However, if you both block chromosome replication AND disable the checkpoint, you have problems
  - Because the cell will attempt to continue to the mitosis stage, but will find that it's not able to divide properly
Cell cycle tightly regulated to ensure that, in particular, each cell gets the right number of chromosomes
- Diagram later maybe
More on the G1 to S transition:
- There is a protein called Cdk4, which is always present during the cell cycle but doesn't really do anything
- But then it associates with another protein called Cyclin D, which is only produced during G1 (ahead of the transition into S)
- This complex interacts with other proteins to say that the cell is ready to go into S
- Cyclin is degraded, Cdk4 is released (no longer functional)
- So this activity of proteins tells the cell what cycle it's in, when to start transition
- Although cyclins and cdks are involved in all parts of the cell cycle, important in different transitions, etc
How do cells know when to divide?
- Most somatic cells not dividing - arrested in the G1 phase of the cycle
- Often waiting for signals from other cells to tell them to divide
- For example, cells in the immune system: not usually dividing, but when there is an infection, they start dividing
  - A cell called a macrophage identifies a virus, sends a chemical signal to T-cells telling them to start dividing, and about the infection
  - So T-cells don't start dividing until they get a signal from a macrophage
- Unregulated division of cells → cancer
- If the G1 to S checkpoint is defective, a cell can divide in an unregulated manner (i.e. dividing when it shouldn't be)
  - For example, if cyclin E is always active or overabundant, a cell will repeatedly divide → cancer
  - If we can understand how signals regulate the cell cycle, we might be able to design drugs to interfere with these signals
  - (And thus fight cancer)

Although the process of dividing cells randomly and hoping that each half gets the same stuff works for things like mitochondria and ribosomes etc, doesn't work for chromosomes
Cell can regulate chromosome duplication, to ensure proper segregation (unlikely to happen randomly)
- Mitotic chromosome: the kind we're familiar with, two sister chromatids united by a centromere (only happens during mitosis, usually a messy ball - not yet condensed)
Mitotic spindle: mechanism for this
Prophase: centrosomes duplicate, migrate to opposite ends of nucleus, chromosomes start to condense
Prometaphase: nuclear envelope breaks down, microtubules from spindles (formed from centrosomes, at poles) can interact with chromosomes, bind to each other
- Microtubules are growing and trying to interact with chromosomes, unstably and randomly sampling space
- Because centrosomes don't know where chromosomes are, just looking for them with microtubules
- Each kinetochore is given a geometry so that two kinetochore microtubules from the same spindle pole can't get both chromatids
- Basically, kinetochores are the things that the microtubules can attach to, to latch onto the chromatids
Metaphase: all the chromosomes are lined up in the middle between the poles (equatorially), attached chromatid to centrosome (metaphase plate)
- Chromatids must be paired and kept together until it is time to segregate
- Checkpoint to make sure that all chromosomes are attached to microtubules and on opposite centrosomes, before continuing
Anaphase: the link at the centromeres breaks down, so sister chromatids no longer attached, microtubules pull chromatids apart
Telophase: shortly before the cell actually divides, equal number of chromosomes in each half
Cytokinesis: dividing the cell itself (the cytoplasm) in two
- In plants, vesicles are added between the two new cells and fuse to become the cell plate (cell wall)
- In animals, purse string method - actin and myosin constrict, to pinch and divide the cell
- Note that cytokinesis does not always happen - some cells have multiple nuclei per cell
- Example, cardiac tissue is multinucleated (syncytial) - mitosis without cytokinesis
Errors: if one chromosome, say, goes to the wrong side, such that one daughter cell has 2 copies of one chromosome and the other has 0, both will die

Sex: mixing the genetic material of two organisms, to produce offspring that are different from you and thus may be better adapted
First, need to reduce the number of chromosomes by half
Ploidy - the number of sets of chromosomes containing exactly one of each homolog
- Haploid, diploid, triploid, tetraploid, pentaploid etc
- Somatic cells are diploid, so gametes must be haploid
Meiosis: process by which haploid cells are made, much like mitosis
- Meiosis I: reduction division
  - Early prophase I, chromosomes condense, move apart
  - Mid-prophase I: chromosomes are condensed, have been duplicated (so 2 of each chromosome, 4 of each chromatid)
  - Late prophase I - prometaphase, little pieces of homologous chromatids start exchanging DNA (chiasma or recombination)
  - Metaphase I: all lined up, except two columns, sort of, in the middle (so pairs of homologues lined up)
  - Checkpoint before anaphase
  - Anaphase I: homologues separate (chromatids stay stuck together), half on each side
  - Telophase I, ^
- However, the process is not complete - still too many chromatids (diploid)
- So then we have meiosis II: equational division, similar to mitosis
  - Form a spindle, microtubules grab chromatids, so you end up with four haploid cells
  - Prophase II, metaphase II, anaphase II, telophase II etc
While mitosis can be quick, cells can arrest in meiosis for a long time
- E.g. ova, produced by human females; arrest, wait until they're fertilised
- However, problems with this: Down syndrome, happens more often with older women
  - A problem with meiosis I; two homologues in one cell, so daughter cells (from meiosis II) still have 2 of each chromosome (21 in this case)
  - The older the mother, the longer the egg has been "sitting around", in an arrested stage
  - And the longer the egg has been sitting around the greater the chance of imperfect segregation (nondisjunction)

We're diplontic organisms - mostly diploid, small haploid portion (fertilisation etc)
But there are also haplontic organisms, e.g. algae - almost completely haploid, gametes fuse, undergo meiosis immediately (so briefly diploid I guess)
It's not the absolute number of chromosomes that's important, but rather the ratio that matters
As long as the ratio is constant (e.g. 3 chromosome 1s, 3 chromosome 3s)
However, odd ploidies tend to be sterile (one daughter cell will have more chromosomes than the other etc)
Trisomy: genetic anomaly, three copies (instead of two) of a particular chromosome (type of aneuploidy)
You can also have triploids (sterile though) and tetraploids or even more
- For example, store-bought strawberries ... can be octoploids (makes them bigger, still viable and fertile)
- Another example: frog in South Africa, at one point, evolved to be tetraploid rather than diploid (whole new species); similar but larger than original

Gregor Mendel: Austrian monk
- Pea plant experiment
- Tried to explain why for a given character, offspring share traits with their parents, but siblings' traits not necessarily identical
Continuous variation: height, skin colour, etc
- Usually looks like blended inheritance
Discrete variation: only a few possible traits for a given character
- Example: pea flower colour or mouse fur colour
- What Mendel focused on (pea plant colours)
Used true-breeding strains (e.g. round pea, only gives round peas when fertilising itself)
True-breeding round + true-breeding wrinkled = only round in F1
- But F2 progeny, 1/4 wrinkled, omg genetics
- Same sort of thing happens with many other traits that exhibit complete dominance
Vocab terms: genes, alleles (different possibilities for a gene), hetero/homozygous
Punnett squares etc
Stochastic process, follows laws of probability, pretty straightforward
Genotype: set of alleles; phenotype: set of traits
- Genotype uniquely determines phenotype, but phenotypes can be the result of many different genotypes
Alleles of the same gene segregate from each other into different gametes (law of segregation); alleles of different genes assort independently (law of independent assortment)
- Example: SSYY x ssyy
- F1: All SsYy
- F2: 9:3:3:1 (S-Y-:ssY-:S-yy:ssyy)
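The cross above can be checked by brute force. This is a minimal Python sketch, assuming complete dominance and independent assortment; the function names (`gametes`, `phenotype`, `cross`) are made up for illustration and are not from the lecture.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """All gametes of a genotype such as 'SsYy': one allele per gene."""
    genes = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    return ["".join(alleles) for alleles in product(*genes)]

def phenotype(gamete_a, gamete_b):
    """Phenotype class under complete dominance, e.g. 'S-Y-' or 'ssyy'."""
    classes = []
    for a, b in zip(gamete_a, gamete_b):
        if a.isupper() or b.isupper():
            classes.append(a.upper() + "-")  # at least one dominant allele
        else:
            classes.append(a + b)            # homozygous recessive
    return "".join(classes)

def cross(parent_a, parent_b):
    """Count phenotype classes over every cell of the Punnett square."""
    return Counter(phenotype(g, h)
                   for g in gametes(parent_a)
                   for h in gametes(parent_b))

f1 = cross("SSYY", "ssyy")   # every cell is S-Y-
f2 = cross("SsYy", "SsYy")   # 16 cells split 9:3:3:1
print(dict(f1))
print(dict(f2))
```

Note that `gametes("SSYY")` lists the same gamete four times; the duplicates are kept on purpose so every cell of the square is counted equally.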
Note on Punnett squares: if you have n heterozygous genes, $2^n$ possible gamete genotypes per parent (so a $2^n \times 2^n$ square)
- Some gametes may have the same genotype; still have to include them
- If you have more than 3 genes, just use probability, not Punnett squares
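A small sketch of why probability beats Punnett squares for many genes. It assumes both parents are heterozygous at every gene, complete dominance, and independent assortment; the function names are hypothetical.

```python
from fractions import Fraction

def gamete_count(n_genes):
    """Distinct gametes from a parent heterozygous at n independent genes."""
    return 2 ** n_genes

def punnett_cells(n_genes):
    """Number of cells in the full Punnett square for such a cross."""
    return gamete_count(n_genes) ** 2

def class_probability(n_dominant, n_recessive):
    """Chance that an offspring shows the dominant trait at n_dominant
    specific genes and the recessive trait at n_recessive specific genes."""
    return Fraction(3, 4) ** n_dominant * Fraction(1, 4) ** n_recessive

print(punnett_cells(4))          # size of the square for 4 genes
print(class_probability(2, 0))   # S-Y- class of a dihybrid cross
print(class_probability(0, 2))   # ssyy class of a dihybrid cross
```

The product rule gives the same 9/16 and 1/16 as the dihybrid square, without having to draw it.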
Conclusion: heredity inherited through discrete units (alleles)
Mendel's discoveries were ignored until the discovery of chromosomes (discrete structures that could be responsible for this phenomenon)
- If genes are on different chromosomes, then they can segregate independently during meiosis (arrangement across metaphase plate)
- However, genes on the same chromosome can still be separated - chiasma (genetic recombination) during meiosis
- But the law strictly applies only to genes on different chromosomes
Usually can't see alleles, but in the case of sex chromosomes, you sort of can (XX vs. XY)
- Y is kind of dominant - XXY is male, X is female
Red-green colour blindness: sex-linked (X), recessive
- Colour-blind females have affected fathers
- Colour-blind males usually have unaffected parents (colour-blind dad = red herring; colour-blind mother = unlikely)
- There is no colour-blind gene on the Y chromosome - too small I guess
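The inheritance pattern above can be enumerated directly. This is a sketch only: the allele symbols `B` (normal) and `b` (colour-blind) and the function name are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

def x_linked_cross(mother, father):
    """mother: her two X alleles, e.g. ("B", "b"); father: his one X allele.
    Returns counts over the four equally likely egg/sperm combinations."""
    outcomes = Counter()
    for egg, sperm in product(mother, [father, "Y"]):
        if sperm == "Y":
            # Sons get their only X from the mother
            sex, affected = "son", egg == "b"
        else:
            # Daughters need the recessive allele from both parents
            sex, affected = "daughter", egg == "b" and sperm == "b"
        outcomes[(sex, "colour-blind" if affected else "unaffected")] += 1
    return outcomes

# Carrier mother, unaffected father: only sons can be colour-blind
print(dict(x_linked_cross(("B", "b"), "B")))
# Carrier mother, colour-blind father: daughters can be colour-blind too
print(dict(x_linked_cross(("B", "b"), "b")))
```

In this model a colour-blind daughter only shows up when the father carries `b`, which is the first bullet above.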
Wild-type (predominant, >99%) vs. mutant alleles (sometimes purposely-induced mutations)
- Polymorphic alleles - > 1% of population
Autosomes (non-sex chromosomes) and allosomes (sex chromosomes)
For genes on the same chromosome:
- Experiment with fruit flies, either two genes are on the same chromosome or not
- Results: the genes are on the same chromosome
- Draw the Punnett squares for both possibilities, results matched the same-chromosome situation
- Also tells you how the genes are arranged on the chromosomes (e.g. both dominants on the same chromosome)
- HOWEVER, you also get some evidence that the genes are NOT linked to the same chromosome
- Of course this is due to genetic recombination (crossover etc)
  - Recombination occurs at random points on the chromosome
  - So the probability or rate of recombination occurring between two genes depends on their distance from each other
  - The further apart two genes are, the more likely that they will be separated
  - So measuring recombination rate can tell you about the distance between two genes on a chromosome
  - This distance is measured in centiMorgans (cM) = (number of recombinant progeny / total number of progeny) × 100
  - Example: if black is 16.6 cM from vestigial and vestigial is 12 cM from curly ...
  - Then you don't know how far black is from curly, could be either 28.6 or 4.6
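A quick sketch of the map-distance arithmetic. The progeny counts in the usage line are hypothetical, chosen only to reproduce the 16.6 cM figure; the function names are also made up.

```python
def map_distance_cm(recombinants, total_progeny):
    """Map distance in centiMorgans: percentage of progeny that are recombinant."""
    return 100.0 * recombinants / total_progeny

def possible_distances(d_ab, d_bc):
    """Gene B is d_ab from A and d_bc from C. Without knowing the gene order,
    A to C is either the difference (A and C on the same side of B) or the
    sum (B in the middle)."""
    return (round(abs(d_ab - d_bc), 1), round(d_ab + d_bc, 1))

print(map_distance_cm(166, 1000))      # hypothetical counts
print(possible_distances(16.6, 12.0))  # black-vestigial, vestigial-curly
```

A third cross measuring black against curly directly would settle which of the two orders is right.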

Continuous variation results from having many genes determine each trait, and many alleles for each gene
Example, snapdragons and colour (red + white = pink, intermediate phenotype), incomplete dominance
- Reason: white does not have gene for producing red pigment
- Pink is heterozygous, so produces only half the amount of red pigment, so pink
- But if you had many alleles and many genes you can get a true range of phenotypes
Although not exactly continuous variation, it does illustrate intermediate phenotypes (although there is only one)
Example in humans: allelic series and Alzheimer's disease
- Different allelic combinations result in different risks of developing the disease
- So the alleles are working additively
You can also have multiple genes affecting a phenotypic trait (multigenic or polygenic)
- Example: heart disease, many genes working additively
- Each allele acts as a single Mendelian trait, but their sum gives the actual result
Epistasis: interactions between alleles of different genes
- Example: fur-colour in mice, not a single locus - two genes determine, discrete trait
- If you have two recessive alleles for the albino locus, you are albino
- But if you don't have two recessive alleles for the albino locus, you produce pigment
- But the pigment colour is determined by another gene, Agouti - either brown if you have bb or black if you have B-
- So mice can only be either brown or black or white
- In other words, the genes assort independently, but the ratio is not 9:3:3:1 because of interactions (epistasis)
- Instead, 9:3:4 (black:brown:white)
- Epistasis often results from genes involved in different steps along the same pathway or process
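The modified ratio can be recovered by enumerating a dihybrid F2 and applying the rules above. A minimal sketch, assuming the albino locus is written `A`/`a` (a symbol chosen here, not given in the notes) and the pigment locus `B`/`b` as above:

```python
from collections import Counter
from itertools import product

def coat_colour(genotype):
    """genotype: four characters, albino locus first, e.g. 'AaBb'."""
    albino_locus, pigment_locus = genotype[:2], genotype[2:]
    if albino_locus == "aa":
        return "white"                      # no pigment, whatever B/b says
    return "brown" if pigment_locus == "bb" else "black"

def f2_counts():
    """Phenotype counts over the 16 cells of an AaBb x AaBb Punnett square."""
    gametes = ["".join(g) for g in product("Aa", "Bb")]
    counts = Counter()
    for g, h in product(gametes, repeat=2):
        genotype = g[0] + h[0] + g[1] + h[1]
        counts[coat_colour(genotype)] += 1
    return counts

print(dict(f2_counts()))
```

The two 3/16 classes that would differ in a 9:3:3:1 cross both end up white once `aa` masks the pigment gene, so 3 + 1 collapse into 4.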
Environmental contribution to phenotype:
- Penetrance: percentage of individuals with a given genotype showing a certain phenotype
- Expressivity: the degree to which a phenotype is expressed

Chromosomes: physical things that carry heredity, and they are made of DNA OMG
How to get the stuff that chromosomes are made from: purify them, assay the components
- Assay: just a way of measuring something
- For example, an allergy skin test assay, to find out what substance is causing allergic reactions
- Separate the components, test them one at a time
- For DNA: grind up any living organism
  - Use an organic solvent to extract lipids and proteins (e.g. phenol)
  - Precipitate with ethanol, and you get just DNA
- Now do the assay test
  - Hypothetically, get some porcupine DNA, dip a cat into the DNA, should result in a pet porcupine RIGHT?
  - No
- But you can actually do something similar, with bacteria (discovered by Frederick Griffith)
- Can an extract from dead bacterial cells genetically transform living bacterial cells?
- S strains and R strains of bacteria, S is deadly, R is harmless (when injected into a mouse)
  - So take the S strain, heat it up so that the bacteria are dead, inject in mouse - harmless
  - However, if you kill the S strain, mix it with the R strain, and inject it into the mouse, it will die
  - Essentially you have transformed the R strain into the deadly S strain, by mixing them (something transferred)
  - Transforming principle - hereditary
- Oswald Avery, simple experiment to identify the transforming principle - turned out to be DNA (not lipids carbs etc)
- Another experiment on this done on viruses, which can also have DNA
  - Specifically, bacteriophages which infect bacteria, inject something into the bacteria
  - Which causes more bacteriophage to be made within the bacterium
  - Bacteriophage have only DNA and proteins, so the Hershey-Chase experiment set out to show that it was DNA that was being injected
  - Done by labelling protein with radioactive sulfur and DNA with radioactive phosphorus (DNA has a phosphate backbone; proteins contain sulfur but DNA does not)
  - Then blend the mixture of phage and bacteria, so that the phage gets stripped off the surface of the bacteria
  - Locate the radioactivity after centrifuging - is it in the bacteria (the pellet)? Or is it in the supernatant (liquid at top)
  - Shows you clearly that DNA is what is being injected

Nucleotides: base (4 different kinds) + sugar (deoxyribose) + phosphate, polymer backbone (sugar-phosphate-sugar)
5' phosphate at one end, 3' hydroxyl at the other end (so directionality)
Chargaff's rule: ratio of bases (A:T:G:C) specific to organisms (humans have a different ratio than corn etc)
- Chargaff's analysis also showed that the ratio of A:T is 1:1, same for C:G (so those are always paired)
X-ray crystallography (diffraction etc, cool stuff)
- Showed that DNA consists of two strands, double helix, antiparallel (so each end has a 5' and a 3' end)
- Phosphates probably on the outside
- Watson and Crick put it all together for a good model of DNA
- Minor groove: when backbones are close together; major groove: when they're far apart
- DNA strands are the reverse complement to each other (opposite directions, and A-T, C-G)
- The reason A-T and C-G are paired: natural affinity of bases
  - A-T, form two hydrogen bonds when they lie together in a plane with the backbone on the outside
  - C-G, form three hydrogen bonds, so these hydrogen bonds determine base pairing
  - Note: A, G are purines (large); T, C are pyrimidines (small bases), so no bulges; distance between two strands always the same
- RNA, another nucleic acid, can also form complementary strands, with U instead of T
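Reverse complements and the 1:1 ratios from Chargaff's analysis are easy to check on a toy sequence. A minimal sketch; the sequence and function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Partner strand, written 5' to 3' (so read the input backwards)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def duplex_base_counts(strand):
    """Base counts over both strands of the double helix."""
    duplex = strand + reverse_complement(strand)
    return {base: duplex.count(base) for base in "ATGC"}

strand = "ATGCCA"
print(reverse_complement(strand))
print(duplex_base_counts(strand))
```

Whatever sequence goes in, the duplex always has as many A as T and as many G as C, while the A:G ratio depends on the sequence.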
Key to DNA: sequence of nucleotides not constrained by the structure; can accommodate any sequence
- Each strand has the same information as the other strand - so in a sense it's already replicated
- But how does it actually replicate (to make children etc)

Original possibilities considered:
- Semi-conservative, two strands separate, new strand for each
- Conservative, two strands separate, make copies, those copied strands join
Experiment to determine what actually occurs (Meselson and Stahl):
- To distinguish between new and old DNA - N14, light DNA; N15, heavy DNA
- Use a centrifuge to separate them
- Procedure: grew bacteria in heavy nitrogen (N15) so all their DNA was heavy, then moved them to light nitrogen (N14) and allowed the DNA to replicate, became lighter etc
- After one generation, all the DNA migrated to the middle, suggesting that all the DNA is intermediately heavy
- Also, adding a new light generation, etc
- So this supports the semi-conservative model - each pair of strands is half new, half old
- If replication were conservative, this experiment after two rounds would result in 1/4 heavy, 3/4 light
  - Reason: the original would remain heavy, would make a new all-light double helix after one replication
  - After the second round, the original would still be heavy, but the new ones would all be light
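The two models make different predictions for the density bands, which can be simulated in a few lines. This is a sketch under simple assumptions: each double helix is a pair of strands labelled `H` (N15) or `L` (N14), and all new strands are light. The function names are hypothetical.

```python
from collections import Counter

def semi_conservative(molecules):
    """Each helix separates; every old strand pairs with a new light strand."""
    return [(strand, "L") for molecule in molecules for strand in molecule]

def conservative(molecules):
    """Each helix stays intact and a brand-new all-light helix is made."""
    return [m for molecule in molecules for m in (molecule, ("L", "L"))]

def density_bands(molecules):
    """Classify each helix by how many heavy strands it contains."""
    names = {0: "light", 1: "intermediate", 2: "heavy"}
    return Counter(names[m.count("H")] for m in molecules)

def run(model, generations):
    molecules = [("H", "H")]          # start with fully heavy DNA
    for _ in range(generations):
        molecules = model(molecules)
    return density_bands(molecules)

for generation in (1, 2):
    print(generation, dict(run(semi_conservative, generation)),
          dict(run(conservative, generation)))
```

Only the semi-conservative model puts all the DNA in the intermediate band after one generation, which is what was observed.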
Making DNA in a test tube:
- Triphosphate nucleotides
- DNA polymerase (an enzyme)
- DNA template (an old strand that can be duplicated), which must have ragged ends
- Ragged ends: one of the strands sticks out, which acts as a template to stick a new base on
- For example, if you have a G at a ragged end, you would add a C to it (formed from the materials)
- The shorter strand at a ragged end must finish with its 3' end (so that you attach new nucleotides to the 3' base)
- Indicates that the direction of polymerisation is 5' to 3' (so the 5' end of the new strand is made first)
- When in the lab, you create ragged ends by breaking up the DNA with phenol etc
- But how does the cell make ragged ends (which it hates) and how does it pull them apart?
  - Solution: enzyme called helicase, unwinds the DNA (pries them apart, like unzipping)
  - Enzyme called primase, creates fake ragged ends - creates a short RNA primer
  - Then the polymerase uses the primer as a 3' end
  - Leading strand (can follow the helicase) vs. lagging strand (has Okazaki fragments)
  - How exactly does this work? Ascertain this
- This bubble formation is not random, occurs at specific locations in DNA
- For example, bacteria, circular chromosomes; creates two linked rings (which are then broken apart)
- For our linear chromosomes, many different origins of replication, eventually the whole thing is replicated
Error correction mechanisms:
- DNA proofreading by DNA polymerase III
  - Sometimes it can see when it made mistakes, go back, replace it with the right base
- Mismatch repair
  - If the new strand does not match the old, an enzyme will fix it
Excision repair: bases are damaged, will replace them
- Example: UV light produces thymine dimers, where thymine bases are bonded to each other
- An enzyme can recognise this, will rip out the bases and replace them with good ones
- Can repair single bases, multiple bases, etc

Key to DNA:
- DNA can accommodate any arbitrary sequence, which can encode information
- Two strands encode the same information in complementary format - method for replication
Alkaptonuria: metabolic disease, urine turns black when exposed to air
- Garrod figured out that it is a recessive hereditary trait (ran in the family etc)
- Then deduced that it results from the absence of a specific enzyme
- So the enzyme would normally convert a certain compound into another, but as that enzyme is missing ...
- The compound remains in that format, and when oxidised, that compound turns black
- Thus genes correspond to enzymes, or something
- Another example of a gene in a pathway (first example: epistasis)
Experiment: Beadle and Tatum
- Show that genes determine enzymes in a biochemical pathway
- Established: one enzyme, one gene, for a series of genes in a pathway
  - We now know: one gene, one protein (so each gene encodes for a protein)
- In this pathway, we have compound conversion: ornithine --> citrulline --> arginine (an essential amino acid)
- So this pathway must be intact for organisms to grow etc
- Method: put spores of each arg mutant strain (of the mold Neurospora) in a medium with and without nutritional supplements
  - Normally if you add the minimal number of nutrients, the wild-type can grow
  - But sometimes there are mutants that need additional supplements to grow
- The wild-type: you can add ornithine, or citrulline, or arginine, and it will grow
- One mutant type - will only grow if you add citrulline or arginine (missing the enzyme to convert ornithine into citrulline)
- Another mutant type - only on arginine, clearly missing the enzyme to convert citrulline into arginine
- So it was possible to identify a number of mutants, each one corresponding to a missing enzyme in the pathway
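The logic of the growth tests can be written as a tiny pathway model. A minimal sketch: it assumes a strictly linear pathway (precursor, ornithine, citrulline, arginine) where a strain grows if it can reach arginine; the function name and data layout are made up for illustration.

```python
PATHWAY = ["precursor", "ornithine", "citrulline", "arginine"]

def grows_on(missing_steps, supplement=None):
    """missing_steps: set of (substrate, product) conversions the mutant lacks.
    Returns True if the strain can make or obtain arginine on minimal
    medium plus the given supplement."""
    available = {"precursor"}            # minimal medium supplies the precursor
    if supplement:
        available.add(supplement)
    for substrate, product in zip(PATHWAY, PATHWAY[1:]):
        if substrate in available and (substrate, product) not in missing_steps:
            available.add(product)
    return "arginine" in available

mutant = {("ornithine", "citrulline")}   # lacks the ornithine -> citrulline enzyme
for supplement in (None, "ornithine", "citrulline", "arginine"):
    print(supplement, grows_on(mutant, supplement))
```

A mutant grows only when the supplement enters the pathway after its block, so the set of supplements that rescue it points to the missing enzyme.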
RNA: ribonucleic acid, how information goes from the nucleus into the cytoplasm (where proteins are actually made)
- So DNA is converted into RNA, which carries information into cytoplasm
- Ribosomes use this information to make proteins
- So DNA information flows to DNA (replication), and also to RNA, then to proteins
- Difference: uracil instead of thymine; the sugar is ribose, which keeps the 2' hydroxyl group that deoxyribose lacks
- So you can have these double-stranded hybrids of RNA and DNA
Transcription: enzyme = RNA polymerase
- Enzyme unwinds the two strands of DNA
- Starts making an RNA strand (5' to 3') complementary to ONE of the strands
- Only goes a short distance - corresponding to how much information it needs
- Then the RNA goes out of the nucleus, ribosomes grab onto it to make proteins
Fred Sanger - sequenced the first protein (insulin)
- Showed that a sequence of amino acids is characteristic of a protein
64 possible triplet sequences ($4^3$), only 20 amino acids, so yeah some redundancy
Experiment to figure out which sequences encode for which amino acids:
- RNA with only one of: UUU, AAA, or CCC (repeated)
- Conclusion: each triplet is mRNA for a different amino acid
Gobind Khorana: AAGAAGAAG
- Some proteins were made of LysLysLys (amino acids)
- Others, ArgArgArg ... others, GluGluGlu
- Due to the three possible reading frames for any sequence of DNA
- If you do enough experiments like that, you can figure out which amino acid each codon corresponds to
- Methionine: AUG/ATG, start codon
- Stop codon: TAG/TAA/TGA (or with U)
- Code is degenerate: while DNA uniquely determines the amino acid sequence, the reverse is not true
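The three reading frames of Khorana's repeating message can be read off mechanically. A minimal sketch; the lookup table contains only the three codons needed for this example, and the function name is made up.

```python
# Only the codons needed for this example, not the full genetic code
CODONS = {"AAG": "Lys", "AGA": "Arg", "GAA": "Glu"}

def read_frame(rna, offset):
    """Split rna into triplets starting at offset (0, 1 or 2) and translate."""
    codons = [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]
    return [CODONS.get(codon, "?") for codon in codons]

message = "AAG" * 4   # AAGAAGAAGAAG
for offset in range(3):
    print(offset, read_frame(message, offset))
```

Frame 0 reads only AAG (Lys), frame 1 only AGA (Arg), and frame 2 only GAA (Glu), which is why the same RNA gave three different homopolymers.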
How does the ribosome know how to interpret the RNA sequence?
- Unlike in DNA, where one strand has natural hydrogen bond affinity to the other strand, this is not true for ribosomes
- No affinity of amino acids to specific codons in RNA
- So there is another type of RNA - transfer RNA, has a 3' and 5' end, and has an anticodon
  - The reverse complement of the codon that encodes for the amino acid
  - There is 1+ tRNA for each amino acid
  - So for the codon CAU the anticodon would be 3'-GUA-5' (AUG when written 5' to 3')
  - Each tRNA has a unique shape
  - So the tRNA binds to the mRNA, and the correct amino acid is on the tRNA (other side)
  - Which is how you get the amino acid, the amino acids join up etc
- How does the cell know which amino acid to attach to a specific tRNA?
  - Answer lies in the structure of the tRNA itself - shape, also aminoacyl-tRNA synthetases
  - Enzymes recognise specific tRNA and amino acids, attach them to each other
  - If you, say, converted an amino acid on tRNA to something else, the ribosome wouldn't know, would just blindly insert it
- This code, shared by organisms ... evidence for a single common ancestor

Clicker question: if the enzymes for converting a precursor to ornithine and ornithine to citrulline were missing ...
- What would the neurospora grow on?
- Answer: arginine or citrulline. Since citrulline can still be converted to arginine.
One DNA strand has the same sequence as the mRNA (sense/coding strand)
- This is the strand we look at if we want to see what amino acids are being produced, etc (just change T to U)
- The other one acts as the template - the antisense strand (reverse complement of mRNA)
- So the DNA sequence that is actually transcribed is the reverse complement of the mRNA
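The relationship between coding strand, template strand, and mRNA can be checked on a toy sequence. A minimal sketch in which every strand is written 5' to 3'; the example sequence and function names are made up.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "C": "G", "G": "C"}

def template_from_coding(coding):
    """Template (antisense) strand: reverse complement of the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(coding))

def transcribe(template):
    """mRNA built complementary to the template, with U in place of T."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template))

coding = "ATGGCC"
template = template_from_coding(coding)
mrna = transcribe(template)
print(template, mrna)
```

The mRNA comes out identical to the coding strand apart from T becoming U, which is why the coding strand is the one to read when looking for amino acids.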
Some genes are transcribed in the other direction - from the other strand of DNA
- It's more or less random which strand of DNA actually gets transcribed
How does the RNA polymerase know where to start?
- Binds to a specific DNA sequence upstream of the place to transcribe, called a promoter
- So just in front of the portion of the sequence that needs to be transcribed
- So the RNA polymerase recognises this stretch of DNA (the promoter) by its sequence (e.g. TATA box)
- All genes have some kind of promoter - not necessarily the same promoter of course
- So these promoters are encoding information, but not for amino acids
The three-frame problem: how does the cell know which reading frame to choose?
- Experiment: confirmed that coding regions always begin with methionine (ATG)
- The first methionine codon that the ribosome can find signals the start of the reading frame
- Ribosome consists of two subunits, one large one small
- Small one recruits a methionine tRNA
- Large subunit then joins, with the methionine tRNA sitting in the P site
- Will start going along, grabbing tRNA as it goes along
- Will replace each tRNA by another one as needed
- N/amino-terminus: start of sequence
- Ends at a stop codon (release factor - protein that causes the finished polypeptide to be released when the ribosome encounters a stop codon)
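The start and stop rules above amount to a short algorithm: scan for the first AUG, then read triplets until a stop codon. A simplified sketch (it ignores everything else the ribosome needs); the codon table is the standard genetic code in one-letter amino acid symbols, and the example mRNA is made up.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Peptide (N-terminus first) read from the first AUG up to a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is made
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":               # stop codon: release the peptide
            break
        peptide.append(amino)
    return "".join(peptide)

print(translate("GGAUGAAGGAAUAGCC"))   # hypothetical mRNA: GG AUG AAG GAA UAG CC
```

The first AUG fixes the reading frame, so the leading `GG` is skipped and the trailing `CC` after the stop codon is never read.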
Wobble pairing:
- Although there's a tRNA for every amino acid, there isn't tRNA for every codon
- The wobble happens during the interaction of the anticodon with the codon: pairing is less strict at the third codon position (5' end of the anticodon)
- For example: CAU should have AUG as its anticodon (5' to 3') but instead, it's GUG
- Rules determine which wobbles work, depending on the anticodon position (5' end or 3' end)
- tRNA also has a different kind of base called Inosine (I), can pair with A/U/C when it sits in the wobble position of the anticodon
- Hence, the degeneracy of the genetic code, and why you don't need one tRNA for every codon
- (Because some tRNAs can bind with more than one type of codon etc)
- Know how to determine possible anticodons based on codon sequences for amino acids and the wobble pairing rules
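Working out which codons an anticodon can read is mechanical once the wobble rules are written down. A sketch only: the `WOBBLE` table below is the commonly taught set of rules for the 5' base of the anticodon and is an assumption here, since the notes do not list them; check it against the course's own table.

```python
# Assumed wobble rules: 5' anticodon base -> allowed 3' codon bases
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}
# Strict Watson-Crick pairing for the other two positions
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read_by(anticodon):
    """anticodon written 5' to 3'; returns the codons (5' to 3') it can read."""
    wobble_base, middle, last = anticodon
    # Codon base 1 pairs with the anticodon's 3' base, codon base 2 with the middle
    first_two = PAIR[last] + PAIR[middle]
    return [first_two + third for third in WOBBLE[wobble_base]]

print(codons_read_by("AUG"))   # strict pairing only
print(codons_read_by("GUG"))   # G in the wobble position
print(codons_read_by("IGC"))   # inosine in the wobble position
```

With G in the wobble position, the GUG anticodon reads both CAC and CAU, so one tRNA covers two codons for the same amino acid.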
Transcription and translation can be simultaneous
- In prokaryotes, ribosomes can start translating mRNA before transcription is complete
  - Not possible in eukaryotes, because ribosomes can't get into the nucleus, where transcription is occurring
- Proteins can begin to function before translation is finished (before they've been completely synthesised)
Polysomes - many ribosomes for a single strand of mRNA
- Like a train of ribosomes, all producing the same peptide simultaneously

Demonstration of link between mutation and phenotype - sickle-cell anemia (autosomal recessive trait)
- Unusually high prevalence among Africans (heterozygous for it) - link to malaria (heterozygotes are more resistant)
- Main protein in blood: hemoglobin, binds to oxygen and carries it around the body
- But in sickle-cell anemia, hemoglobin will form fibres, under certain conditions, instead of being soluble in the blood
- Cause red blood cells to stretch, break and eventually die
- Cause: change in the hemoglobin amino acid sequence
- Experiment done by Linus Pauling - cut up hemoglobin into small pieces, using enzymes
  - 2-D gel to separate the peptide fragments based on their charge, in one direction (different amino acids have different charges)
  - And in another direction based on their size
  - So where peptides end up depends on their amino acid sequence
  - And it turns out there is one peptide in a different location for sickle-cell patients
  - Which is likely due to a different amino acid sequence
  - Later, it was sequenced, confirmed - sickle-cell anemia caused by a single base pair change
  - Results in a single different amino acid - valine instead of glutamic acid - which is enough to cause sickle-cell anemia
Classifying mutations:
- Point mutations - small change in DNA (e.g. one base pair changed), usually affect one gene
  - Missense mutation: single nucleotide change --> different amino acid produced, so the protein is different (e.g. sickle cell)
  - Nonsense mutation: when something becomes a STOP codon, protein truncated early
  - Deletion (frame shift): individual nucleotides deleted, which both changes the amino acid and causes a frame shift ... all subsequent amino acids are different
  - Insertion (frame shift): same as above except adding in a random nucleotide
  - Deletion (no frame shift): if you delete 3, you get rid of one amino acid, but there is no frame shift (skipping an amino acid)
  - Insertion (no frame shift): same as above but adding 3 (or a multiple thereof)
  - Silent mutation: no effect, due to degeneracy in the code (for example, TAG becoming TGA, both are STOP codons)
- Chromosomal mutations - large changes in chromosomes, usually affect many genes
  - Deletion; delete a large chunk of a chromosome, removing many genes in the process (ex: bands missing in chromosome)
  - Duplication and deletion: unequal crossing-over (during recombination) - one chromosome has extra genes, the other, missing
  - Inversion: a piece of the chromosome is oriented backwards, often has no detectable effect if the break is between genes
  - Reciprocal translocation: recombination occurring between non-homologues (which shouldn't be recombining)
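The point-mutation categories above can be told apart mechanically by comparing a normal and a mutant coding sequence. The following is only a sketch under some assumptions: it uses the standard genetic code, works on the coding strand of DNA, and the function names `translate` and `classify` are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding-strand DNA sequence until the first STOP codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # STOP codon: protein ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(normal, mutant):
    """Name the kind of point mutation that turns `normal` into `mutant`."""
    length_change = abs(len(normal) - len(mutant))
    if length_change % 3 != 0:
        return "frame shift"
    if length_change != 0:
        return "insertion/deletion (no frame shift)"
    before, after = translate(normal), translate(mutant)
    if before == after:
        return "silent"
    if len(after) < len(before):
        return "nonsense"  # a new STOP codon truncated the protein
    return "missense"


# Sickle-cell style change: GAG (glutamic acid) becomes GTG (valine).
print(classify("ATGGAGTAA", "ATGGTGTAA"))
```

The sketch ignores rarer cases such as a mutation that destroys the STOP codon, which it would simply report as missense.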
Mutagens: agents that cause changes in the DNA sequence
- chemicals, radiation, viruses etc
- Of course, even replicating DNA itself can cause mutations
Mutations can occur in two different cell populations - somatic cells and germ line cells
- If a somatic cell gets mutated, then the somatic cells descended from it may be mutated, but the mutation can never escape the body
  - Can kill a cell, or make it cancerous, or harm it or do nothing or whatever
  - Transmitted to daughter cells, but never to progeny
- If a germ line cell gets mutated, then progeny may be mutated too ... that allele can be a polymorphism in the gene pool

Viruses first discovered by Dmitri Ivanovsky - Tobacco mosaic virus
- Many tobacco plants dying of a disease
- The only infectious agent known at the time = bacteria
- But when he ground up the leaves, looked under a microscope, couldn't see them
- And it had to be an infectious agent, not a toxic agent - even when diluted and transferred many times, still infectious
- If it were bacterial, using a Chamberland filter would get rid of the infections
- (Filter whose holes were large enough for water to pass through but too small for bacteria)
- However, when these were used, the infection was still transmitted ... so the infectious agent must be smaller than bacteria
- So viruses were defined based on their disease-causing abilities and size
- Initially, viruses defined based on their host specificity
- Then later, based on their genetic material (some have only RNA, no DNA)
- Several forms of genetic material:
  - Single-stranded DNA
  - Double-stranded DNA (e.g. chicken pox virus)
  - Single-stranded RNA (e.g. tobacco mosaic, influenza)
  - Double-stranded RNA
  - The above can all be linear, like us, or circular, like bacteria
Bacteriophage - viruses that infect prokaryotes
- Infects a bacterium, injects its DNA into it, uses the bacterium's ribosomes to replicate
- Eventually all the bacterial genes have been digested, and the cell is full of phage proteins etc
- With T4, there is a lytic cycle - burst out of the bacterium, infect many others
- Lysogenic phage - phage inserts its chromosome into the bacterium's DNA (now called prophage, continuous with bacterial chromosome)
  - Might be done if conditions are not good - don't want to infect others just yet
  - And the bacterium does not know that its DNA has phage DNA, so it keeps replicating
  - In the process, the virus is replicated in all the host's offspring lol
- Some phage have both cycles, others are only one or the other
- When lysing, it's important to assemble all your phage parts before you lyse
  - Done using promoters
  - Elements in front of genes, indicate when they should be transcribed
  - Early genes need to be transcribed first, then they can turn on the "late" genes, which are responsible for lysis
- Phage are capable of exchanging DNA
  - In a high multiplicity infection (when you can have more than one phage infecting a bacterium at the same time)
  - Phage have only one single chromosome
  - If they have different alleles, they can recombine, like in prophase I of meiosis ... cis to trans and vice versa
  - Can influence what the plaques look like (dark/light/small/large etc)
Eukaryotic viruses often have an additional layer of complexity
- Glycoprotein envelope on outside (coat)
- Lipid bilayer membrane, with the proteins above embedded in them (sometimes, not always)
- Nucleocapsid
- Viral RNA/DNA genome
- Binds to surface of the cell due to proteins on their membrane binding with proteins on surface of cell
- Responsible for tropisms of viruses - the certain types of cells that they like to infect
- Allows them to get inside the cell through a vesicle; the two membranes fuse, virus released into cytoplasm
- Virus genetic material is transcribed; proteins made
- These proteins go through the Golgi, then bud off, forming new virus particles (using the cell machinery - co-opting)
Eukaryotic viruses can exchange genetic material
- H1N1 - mixture of various influenza strains
HIV retrovirus - have RNA as genetic material when they enter the cell
- Also carries a protein called reverse transcriptase
- Known to bind to proteins (CD4, used for finding infectious agents) on the surface of immune system cells
- The RNA enters the cell, then makes a DNA copy of its RNA genetic material
- Then inserts DNA copy of genome into the genome of the host, damn
- And once the cell divides, the host will make RNA copies of the retrovirus' DNA, which is just the original RNA
- So it will make new viruses just by replication ... the virus is now part of the host cell's genome
Chicken pox - caused by a double-stranded DNA virus (varicella)
- Once you get over chicken pox, the virus travels through nerve cells, infects the spinal cord, lies there dormant
- So you can get recurring infections - e.g. shingles (from zoster), virus becomes active again
- Member of the herpes virus family - characterised by reemergent infections, difficult to get rid of

Bacterial replication - usually happens asexually (binary fission)
- Experiment with phenotypes - ability to grow on different types of media
- One strain of E. coli, needs methionine and biotin for growth
- Another strain, threonine and leucine required for growth
- When you combine them, some bacteria don't need either ... due to genetic material being exchanged
- Conjugation: sex pilus, tube formed between two bacteria, one feeds plasmids into the other, which has none
- F-plasmid: necessary for forming conjugations; an F+ bacterium can form a conjugation tube
- Then the bacterium that receives the plasmids becomes F+, can also form conjugation tubes
- This process also allows chromosomal DNA to be moved
- The plasmid gets inserted into the chromosome - Hfr (high frequency of recombination) strains
- Bacteria have circular DNA, so they just insert themselves somewhere in there
- Or, they can just recombine partly (like in prophase of meiosis I), the non-used part degrades; the rest divides
- Experiment: allow bacteria to start to conjugate, then interrupt the process
  - Frequency of a gene being transferred through a tube depends on how close it is to the origin of transfer (genes near the origin go through first)
  - Allowing you to map the genes
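The interrupted-conjugation mapping idea can be sketched as a sort: genes that show up in more recipients are assumed to sit closer to the origin of transfer. The gene names and frequencies below are made up purely for illustration, and `order_genes` is a hypothetical helper, not a standard tool.

```python
def order_genes(transfer_frequency):
    """Order genes from closest to farthest from the origin of transfer.

    transfer_frequency maps a gene name to the fraction of recipient
    bacteria that received it before conjugation was interrupted.
    """
    ranked = sorted(transfer_frequency.items(), key=lambda item: item[1], reverse=True)
    return [gene for gene, _ in ranked]


# HYPOTHETICAL data: not from any real experiment.
made_up_frequencies = {"geneC": 0.05, "geneA": 0.60, "geneD": 0.01, "geneB": 0.25}

print(order_genes(made_up_frequencies))  # geneA is closest to the origin
```

A real experiment would interrupt at several time points and use the time of first appearance of each gene, but the ordering logic is the same.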
Transduction: "hitchhiking on phage"
- Phage accidentally carrying around a piece of bacterial chromosome
- They can't replicate, but they can insert it into other bacteria, spreading those genes around, lol
Transformation: taking up free DNA (e.g. plasmids) from the environment
Regulation of gene expression in bacteria - the lac operon
- In some bacteria, enzyme levels increase to take advantage of increased lactose in the environment
- In E. coli - three genes responsible for metabolising lactose (including beta-galactosidase) share a single promoter
- The three genes form an operon - transcribed onto same mRNA; ribosomes make them into separate proteins
- Lac operator - stretch of DNA between the promoter and the genes; when the lac repressor binds to it, polymerase is prevented from binding and making RNA from these genes
- Basically the repressor (a protein that binds to DNA) represses these genes during specific conditions (no lactose around)
- But when lactose gets into the cell, lactose binds to repressor molecule ... allosteric regulation
- The repressor molecule no longer binds to DNA, so the gene gets turned on basically
- The reverse is also possible - the regulation of tryptophan formation
  - If tryptophan is present, tryptophan binds to repressor, the repressor blocks the polymerase from making more enzymes
  - So, a negative feedback mechanism to create an appropriate amount of tryptophan (homeostasis)
- What if you replaced the lac operon with the trp operon?
  - Cells can build sort of complex logic circuits using these repressors
  - Sophisticated regulation based on what's present in the cell at the time
- There are also activator proteins - cyclic AMP receptor protein
  - Cyclic AMP is usually present in high quantities in the cell in the absence of glucose
  - When cyclic AMP binds to this receptor protein, the complex encourages polymerase to come and transcribe the lac genes
  - The bacterium weighs the relative advantage of glucose vs. lactose
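The repressor and activator logic above behaves like a small logic circuit, which can be sketched as follows. This is a simplified model with made-up function names; the "low" level for lactose plus glucose (repressor released but no activator) is an assumption about basal transcription that the notes do not state.

```python
def lac_operon_transcription(lactose, glucose):
    """Rough transcription level of the lac genes: 'off', 'low' or 'high'."""
    repressor_on_dna = not lactose   # lactose pulls the repressor off the operator
    cyclic_amp_high = not glucose    # cyclic AMP is high when glucose is absent
    if repressor_on_dna:
        return "off"                 # polymerase is blocked
    # ASSUMPTION: without the activator there is only weak transcription.
    return "high" if cyclic_amp_high else "low"


def trp_operon_on(tryptophan):
    """The trp repressor only blocks polymerase when tryptophan is bound to it."""
    return not tryptophan


for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_transcription(lactose, glucose)
        print(f"lactose={lactose}, glucose={glucose}: {level}")
```

The two operons use the same kind of repressor with opposite wiring: lactose switches its genes on, tryptophan switches its genes off.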

Back to ploidy - greater ploidy, greater size
- One possibility: just has more cells (not true)
- Other possibility: just has bigger cells (this is true, generally)
- Cell size roughly correlated to genome size
- So, possible to find impressions in bones of dinosaur fossils, which give the size of cells
- From that, you can get the size of the dinosaur DNA
- Genome size usually given in number of base pairs for the haploid genome (so number of base pairs PER set of chromosomes)
  - So humans have 6 billion bp per (diploid) cell but 3 billion base pairs per haploid genome
- Although you'd expect humans to have many genes based on numbers in other organisms (e.g. roundworms)
- Humans actually only have about 30,000 genes
- Turns out humans have much more DNA than their genes actually need
- Large stretches of non-coding DNA between genes that actually code for things
- Exons: things that code for proteins; introns: don't encode for proteins
- When the RNA gets transcribed from the DNA, all of it gets transcribed into a primary RNA transcript
- Then the introns get cut out, and the exons get spliced together, so you have a continuous RNA strand, all of it coding for amino acids
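- Experiment that showed this: hybridise the DNA of a gene (with an intron between exons) to its mature mRNA
  - Hybridisation: the intron forms a loop, because the mRNA can only pair with the exons
  - Done through heating up (to separate the strands) and cooling down (to let them pair up again)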
- There are enzymes that do this
- Small nuclear ribonucleoprotein (snRNP) - enzyme that consists of protein and RNA (RNA strand, essential component of enzymatic activity)
  - Recognises specific sequences at a 5' splice site
  - Then loops an intron out, sort of ... splicing
  - Example of RNA processing (doing things to the RNA before it's translated by ribosomes)
  - Others: getting a 5' G cap, and polyadenylation (addition of a poly A tail)
  - G-cap: a guanine is attached to the 5' end through its own 5' end (5'-to-5' link); poly A's are added to the 3' end
  - Tells the cell that this is mRNA that needs to be translated
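RNA processing as described above can be sketched with plain string operations. This is a toy model: the sequence, the intron coordinates and the function names are made up, the cap is represented by a single "G" character, and the tail length is arbitrary.

```python
def splice(primary_transcript, introns):
    """Remove introns given as (start, end) index pairs, end not included."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(primary_transcript[position:start])  # keep the exon before it
        position = end                                    # skip over the intron
    exons.append(primary_transcript[position:])           # last exon
    return "".join(exons)


def process(primary_transcript, introns, tail_length=10):
    """Splice, then add a toy 5' cap ('G') and a poly A tail."""
    return "G" + splice(primary_transcript, introns) + "A" * tail_length


# Toy transcript: the intron is written in lower case so it is easy to spot.
primary = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
print(splice(primary, [(6, 18)]))   # AUGGCCUUUUAA
print(process(primary, [(6, 18)]))
```

In the cell the snRNPs find the splice sites from the sequence itself; here the coordinates are simply handed in.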
Purpose of junk DNA?
- Telomeres: protects the ends of DNA, because there's an inherent flaw in the way DNA polymerisation occurs
- When ready to replicate DNA, separate the strands, add an RNA primer (3' hydroxyl to start the process) - helicase + primase
- So DNA polymerase III can then come along, use the primer to start synthesising DNA
- But then DNA polymerase I rips out the RNA, and there's a gap at the end, since it's a linear chromosome; that end not filled in
- If you do this once, you get recessed ends on both ends of the chromosome
- Every time you replicate the DNA, it gets shorter ... until eventually you have no chromosomes
- Cell deals with this by creating telomeres. Short stretches of extra DNA, that are useless, on the ends of chromosomes
- Acts as a buffer so that when it's lost, it doesn't matter
- Telomerase comes along, adds a lot of repetitive DNA to the end of a sequence
- Telomerase is very important:
  - Most human cells don't express telomerase; as DNA replicates, chromosomes keep getting shorter
  - Until they get too short, triggers a checkpoint, cells stop dividing
  - Part of what happens in human aging - your telomeres start getting shorter, eventually you run out of telomeres
  - Also, cancer cells often express telomerase, so they can divide indefinitely
- Clicker question: 100 six-base-pair telomeric repeats on the end of chromosomes
  - And primers are 25 base pairs long
  - You get maximum 24 divisions before you lose all your telomeres
- Some non-coding DNA actually has a function
  - Centromeric regions tend to have highly repetitive DNA, which is how cells know where to find a centromere
- However, the vast majority of DNA that doesn't code for cellular proteins = transposable elements
  - Pieces of DNA that don't really have a purpose
  - Transposons: encode proteins called transposase in their middle
  - Retrotransposons: look like retroviruses, only they aren't infectious
  - Non-LTR retrotransposon: have reverse transcriptase
  - Alu sequence: example of a non-autonomous non-LTR retrotransposon
  - Transposons: have inverted repeats on either end, they can excise themselves, insert themselves somewhere else
  - Sometimes they land in a cellular gene, disrupting the gene and causing a mutation
- Reason there are so many transposons:
  - When one moves somewhere else, it leaves a gap
  - Cell fills in the gap with whatever is on the homologous chromosome
  - So you get another transposon where it used to be
- Retrotransposons have an RNA intermediate, require reverse transcriptase
  - Reverse transcriptase gene gets made into a protein by the cell
  - Makes DNA from the RNA, that DNA goes back into the genome, inserts itself somewhere
- Non-autonomous non-LTR: get transcribed, piggyback on reverse transcriptase made by other genes in the genome, same thing
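Going back to the telomere clicker question earlier in this section, the arithmetic can be checked with a short sketch. It assumes that each division removes exactly one primer length from a given chromosome end and that nothing else is lost; `max_divisions` is a made-up name.

```python
def max_divisions(repeats, repeat_length, primer_length):
    """Number of divisions before the telomere at one end is used up."""
    telomere_length = repeats * repeat_length   # 100 * 6 = 600 bp
    # ASSUMPTION: each division shortens the end by one primer length.
    return telomere_length // primer_length


def simulate(repeats, repeat_length, primer_length):
    """Same answer by counting divisions one at a time."""
    remaining = repeats * repeat_length
    divisions = 0
    while remaining >= primer_length:
        remaining -= primer_length
        divisions += 1
    return divisions


print(max_divisions(100, 6, 25))  # 24
```

With 600 bp of telomere and 25 bp lost per division, the buffer lasts 600 / 25 = 24 divisions, matching the answer given above.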
Pseudogenes: things that look like cellular genes, but without the introns
- Made when reverse transcriptase acts on an mRNA, making DNA that then gets inserted back into the genome
- But, doesn't have a promoter or anything, so they can't actually do anything
Conclusion: genome is a complex ecosystem, transposable elements compete for survival
Which came first? polymerase gene (the DNA) or the polymerase enzyme (the protein)?
- Chicken and egg problem lol
- Answer came from the study of self-splicing introns
- Experiment: put RNA in test tube without any enzymes, as a control
- Supposed to be a negative control ... but the RNA still got spliced
- Turns out the RNA can fold into a structure that has catalytic properties, which then splices itself out
- Ribozyme - RNA enzyme; can catalyse a variety of reactions just like proteins
- Can hybridise to itself
- Example: ribosomes are actually ribozymes, just RNA with a few unimportant proteins associated with it
- So the first enzyme could have had just RNA, and this RNA resulted in ribozymes
- Maybe there was RNA polymerase that could replicate itself
- So primitive organisms could have been made of self-replicating RNA enclosed in a lipid membrane
Housekeeping genes ..?
- Cells that do not have a particular protein usually don't have the mRNA for it
- So regulating the transcription of a particular gene - main method of controlling which proteins are present
- In both eukaryotes and prokaryotes, RNA polymerase can regulate transcription due to promoter element (TATA box)
- Which is usually about 25 bp upstream of the actual gene
- Eukaryotes have three polymerases
  - I transcribes rRNA
  - II: mRNA (most important)
  - III: tRNA and other small RNAs
- Polymerase doesn't just bind to the DNA and transcribe; proteins of the TFII family prepare the way
- TFIID - complex of proteins including TBP, binds to the TATA box
- TFIID recruits other TFIIs that eventually recruit the polymerase II
- Enhancers and silencers - unique to eukaryotes
  - Can act from a great distance away
  - Can be inverted and will still work
  - Transcription factors bind to specific sequences; protein domains probe major groove of DNA
  - These factors determine what proteins are synthesised in different tissues
  - Example: heat shock genes
  - Genes can mix and match silencers and enhancers to get a unique pattern of proteins expressed
- How histones affect DNA transcription:
  - Heterochromatin, stains darkly; not transcribed in contrast to the lightly-stained euchromatin
  - For example, in females, one X chromosome is inactivated by condensing it, so it becomes unreadable
  - This happens early during embryogenesis; one chromosome chosen at random for inactivation in the cell
  - Then every daughter cell will also have the same chromosome inactivated (in the form of a Barr body)
  - Different cells will have different X-chromosomes expressed, so different parts of a female can have different phenotypes
  - Example, Calico cat - one yellow and one black allele
  - Some parts of the body, yellow allele activated; others, the black allele activated, resulting in patches
- Rate of transcription can also be controlled by having multiple copies of the same gene
  - Example: humans have like 280 rRNA genes, because they need them (to make ribosomes)
  - However, frogs have millions, because they need to be able to make tadpoles quickly
  - Called gene amplification - some cancer cells do this to become resistant to anti-cancer drugs
- Alternative splicing
  - Sometimes, when introns are spliced out, so are exons by mistake
  - Results in proteins that are missing chunks
  - For example, you can mutate the doublesex gene in Drosophila to create transgendered flies
  - So you suppress neither femaleness nor maleness
- RNA stability - must be unstable otherwise you only need to make RNA once and it would be around forever lol
  - Example: excess tubulin binds to tubulin RNA, decreasing stability ... negative feedback loop
- Control of translation
  - Changing the amount of capping
  - Factors binding to the RNA to prevent ribosome attachment
  - RNA interference: cell makes small RNA, does not encode protein, but complementary to mRNA that does
  - This RNA (siRNA) then binds to that mRNA with the help of the protein Dicer, blocking its translation
- Post-translational controls:
  - Kinases: enzymes that phosphorylate other proteins (important way proteins control each other's activity)
  - Selective protein degradation - proteins that are present in excess are marked for degradation
- Trade-off between efficiency and speed
  - Transcriptional regulation is efficient because unnecessary proteins are not made
  - So no energy is wasted, but it takes longer to respond (the protein has to be made from scratch)
  - Post-translational regulation is faster but not very efficient, you make proteins that do nothing until they're activated

Genetic engineering: taking a single gene from one organism, putting it in another, perhaps altered
Diabetes type I:
- Normally, when blood glucose is high, pancreas releases insulin, causes glucose to be absorbed
- When low, pancreas releases glucagon, liver releases glucose into bloodstream for other cells to use
- In diabetes, insulin not produced; you die, bathed in glucose
- 1921, Banting and Best purified insulin from dogs, could treat diabetes
- Pigs were later used to produce massive amounts of insulin
- But pig insulin isn't exactly the same - humans often had an immune reaction to it
- Turns out there is one amino acid difference between pig and human insulin
- 1970s, idea - take the human gene for insulin, put it in a plasmid, transform into bacteria
  - First problem: bacteria can't splice, so you have to put the gene in without introns
  - Do this by making a DNA copy of the mRNA that the introns have already been spliced out of
  - To do this, use the reverse transcriptase enzyme from a retrovirus --> produced cDNA
  - Second problem: how do you find the right mRNA for making insulin?
  - Answer - you don't. Just take all of the mRNA in the cell
  - Put each mRNA's cDNA into a different bacterium, then try to figure out which one produces insulin
  - Plasmid used to carry the DNA = vector
  - Cut the plasmid so that it is linear
  - Then attach each end of a cDNA to the plasmid, recreating a circle
  - DNA is cut using a class of enzymes - restriction endonucleases/enzymes
  - Proteins that identify a specific 6-base-pair (usually) stretch of DNA, cut it at that point for both strands
  - This specific stretch called the restriction site
  - Then, pasting the two ends together requires ligase to form phosphodiester bonds between cDNA and plasmid
  - Usually, put a gene for antibiotic resistance on the plasmid (e.g. ampicillin resistance)
  - So if we put in some ampicillin, only the bacteria that have the resistance gene will grow
  - This phenotype (ampicillin resistance) called the transformation marker
  - Creates a cDNA library - every member of a colony will have identical plasmids, all colonies, different cDNA
  - How do you find the colony that has the insulin cDNA? Use a DNA probe (piece of DNA used to find complementary DNA)
  - Basically you heat a stretch of DNA, which pulls the strands apart
  - If you let them cool, the strands will find each other again due to hydrogen bonds between the base pairs (hybridisation)
  - The longer the stretch of the complementary sequence, the greater the affinity to the other strand
  - So you synthesise a short stretch of radioactive DNA that is complementary to the sequence you're looking for
  - Then you expose the cDNA library to the probe - should stick to the right colony
  - You can then find the radioactive colony - that's the one you want
  - For insulin, deduce a sequence that codes for insulin and synthesise the corresponding sequence
  - Due to the degeneracy of the code, you have to use a mixture of all the possible corresponding sequences
  - Now you have found the human gene for insulin and put it in bacteria
  - Next step, make the bacteria express the gene - give it a promoter and terminator so RNA polymerase will recognise it as a gene
  - Back to restriction enzymes - each one leaves a different, characteristic overhang when it cuts
  - And the sticky ends will try to find each other
  - So to get the insulin gene expressed: cut it out of the library vector with a restriction enzyme
  - Cut the expression vector that has the promoter with the same enzyme
  - Then you mix the two, let the ends find each other; again, add ligase to make sure it stays
  - Problem: sticky ends usually just go back to each other, instead of combining the way we want them to
  - To make sure the genes combined the way we want them to, we can look at the plasmid DNA using gel electrophoresis
  - Because DNA is negatively charged, will move in an electric field towards the anode (positively charged end)
  - If you force the DNA to move through a thick tangle of agarose fibres (strings of sugars), will move slowly
  - How slowly, depends on the size, so larger DNA moves more slowly
  - If we know which size insulin should be, then we can find the right DNA
  - Now, the bacterial RNA polymerase will recognise the insulin-making gene, will transcribe and translate it
  - So it will make insulin. You can also turn this on and off with lactose
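The cutting and gel steps above can be sketched in a few lines. This is a toy model under stated assumptions: the DNA is linear and made up, only one strand is tracked (so sticky ends are not shown), and the enzyme is EcoRI, which recognises GAATTC and cuts after the G (my choice of enzyme, not one named in the notes). The function names are hypothetical.

```python
def digest(dna, site, cut_offset):
    """Cut a linear DNA strand at every occurrence of a restriction site.

    cut_offset is how many bases into the site the cut falls
    (1 for EcoRI: G^AATTC).
    """
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])   # whatever is left after the last cut
    return fragments


def gel_order(fragments):
    """Fragment sizes from slowest (largest, near the wells) to fastest (smallest)."""
    return sorted((len(fragment) for fragment in fragments), reverse=True)


toy_dna = "AAAAGAATTCTTTTTTTTGAATTCCC"   # made-up sequence with two sites
pieces = digest(toy_dna, "GAATTC", 1)
print(pieces)             # ['AAAAG', 'AATTCTTTTTTTTG', 'AATTCCC']
print(gel_order(pieces))  # [14, 7, 5]
```

Comparing the band sizes on the gel with the sizes expected for vector plus insert is how you check that the pieces joined the way you wanted.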

Positional cloning - cystic fibrosis
- Inherited, autosomal recessive, respiratory infections - can't move mucus out of lungs
- Purpose: find the specific chromosome and then the specific region where the gene for cystic fibrosis lies
- We could then identify and clone the gene
Problems:
- Not that many human traits that show simple Mendelian inheritance against which you can map the disease trait
- Solution: restriction fragment length polymorphisms
- Restriction sites vary from one individual to the next, so the lengths of the fragments you get after cutting vary too
- So, cut DNA with BamHI; you can then test for a particular RFLP using a PCR test (polymerase chain reaction)
- PCR = method of producing large quantities of a short stretch of DNA using a DNA template
  - Idea: two DNA primers that flank the stretch, one complementary to each strand
  - Mix the primers with the template DNA, heat to separate the strands, then cool to allow hybridisation (annealing)
  - Then use DNA polymerase to synthesise complementary strands of each piece
  - Primers bind to the 3' stretch of the DNA to be amplified on each complementary strand
  - Initiate polymerisation in different directions; you now have two double strands
  - Repeat the process; you get a geometric increase in the amount of DNA between the primers
  - So basically this amplifies a small stretch of DNA enough that you can run it on the gel
- Problem with PCR - heating the DNA destroys the polymerase in the process
  - Rather than add new polymerase each round, able to isolate polymerase from Thermus aquaticus
  - Bug that lives at 95 C in Yellowstone national park, obviously able to withstand high heats
- So then you amplify the region of the chromosome that has the RFLP of interest
  - Then cut the amplified DNA with restriction enzymes, see if it cuts
  - If it cuts you have two shorter bands; if not, one large band
  - As a result, you know what alleles of the particular RFLP are present in that individual
  - Behave in a Mendelian fashion - you can distinguish homozygotes from heterozygotes (both long and short bands)
- Where do you get RFLPs from in the first place? trial and error
  - Sequence the DNA of individuals, look for common polymorphisms that change the sequence of restriction sites
  - Look at people who have family histories of CF, look at their RFLPs, you can find RFLPs linked to the CF gene
  - You can then look at the human genome and figure out where on the chromosome the CF gene must be, roughly
  - But you still need to figure out which of the genes in that region is the CF gene
  - One way is through a cDNA library
  - Other way, computers - bioinformatics; look through the genome, find similar things etc
  - Find the difference between the CF gene in people with CF and people without
- Sequencing DNA:
  - Synthesise a new strand of DNA using the DNA of interest as a template
  - But make it inefficient at one nucleotide, so that it stops every time that nucleotide is added
  - Then you run a gel, see how long the synthesised DNA is, tells you how far it travelled before it stopped
  - Which tells you where that nucleotide is
  - Do this for all four nucleotides ... you can figure out their sequence
  - To make it inefficient, mix in a dideoxyNTP, which lacks the 3' OH necessary for polymerase to continue
  - Do four reactions, each with a different dideoxyNTP, giving four mixtures of DNA with different lengths
- Turns out, affected individuals have a small 3-base-pair deletion in an exon; removes single phenylalanine
- The CF gene encodes a chloride channel - regulates chloride flow and thus controls salt balance
- When it is defective, then the mucus that coats epithelial cells is too dry
- Possible treatment: put correct CF gene on an adenovirus, infect CF patients with it, will replicate (not very successful)
- CF has led to increased resistance against typhoid fever in the past, probably why it is still around
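The dideoxy sequencing procedure described above can be sketched as two steps: each of the four reactions yields fragments that stop at one kind of nucleotide, and reading all bands from shortest to longest gives back the sequence. This is an idealised sketch (every possible stop is observed exactly once) and the function names are made up.

```python
def run_reactions(new_strand):
    """Return, for each dideoxy nucleotide, the lengths of the stopped fragments.

    A fragment of length n in lane X means synthesis stopped after
    adding X as the nth nucleotide of the new strand.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence off the gel, from the shortest band to the longest."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)


lanes = run_reactions("GATTACA")
print(lanes)            # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))  # GATTACA
```

The sequence read this way is the newly synthesised strand, so the template is its complement.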

Notes for Part II: Information and Heredity

1 Introduction
2 Cell cycle
2.1 Mitosis
2.2 Meiosis
2.3 Ploidy
3 Genetics: Mendel
3.1 Continuous variation
4 DNA and heredity
4.1 Structure of DNA
4.2 DNA replication
5 The genetic code
5.1 Transcription and translation
5.2 Mutations
6 Molecular genetics of prokaryotes
6.1 Viruses
6.2 Bacteria
7 Molecular genetics of eukaryotes
8 Recombinant DNA technology
8.1 Cloning
8.2 Gene mapping