Nucleic Acids & DNA Replication

Lesson 6 of 12

Introduction to Nucleic Acids

Nucleic acids — DNA and RNA — are the information molecules of life. DNA stores genetic information; RNA carries, interprets, and executes that information. The chemical structure of these polymers is precisely suited to their function: a stable, information-rich molecule for storage (DNA) and versatile, reactive molecules for expression (various RNA classes).

Nucleotide Structure

A nucleotide consists of three components:

A nitrogenous base: either a purine (adenine [A], guanine [G]) or a pyrimidine (cytosine [C], thymine [T] in DNA; uracil [U] in RNA)
A pentose sugar: deoxyribose (DNA) or ribose (RNA) — the 2' hydroxyl group of ribose makes RNA more reactive and less stable than DNA
One to three phosphate groups: nucleoside triphosphates (dNTPs for DNA polymerases, NTPs for RNA polymerases) are the substrates for polymerisation; the α-phosphate is incorporated into the growing chain, and the released pyrophosphate (PPi) is hydrolysed, which drives polymerisation thermodynamically forward

Nucleotides are linked by 3',5'-phosphodiester bonds: the 3' OH of one sugar is joined to the 5' phosphate of the next. This creates a backbone with directionality — conventionally written 5' → 3'.

The DNA Double Helix

The Watson-Crick double helix model (1953) describes B-form DNA, the predominant physiological form:

Antiparallel: two strands run in opposite directions (one 5'→3', the other 3'→5')
Complementary base pairing: A pairs with T (two hydrogen bonds), G pairs with C (three hydrogen bonds) — Chargaff's rules: [A]=[T], [G]=[C]
Right-handed helix: ~10.5 base pairs per turn, with a rise of 0.34 nm per base pair (~3.6 nm per turn)
Major groove and minor groove: arise because the two glycosidic bonds of each base pair are not diametrically opposite each other; most DNA-binding proteins read sequence information in the major groove
Stabilisation: base stacking (van der Waals + hydrophobic interactions between stacked aromatic bases) and hydrogen bonding
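
The helix dimensions above lend themselves to quick arithmetic. The following is a minimal Python sketch that converts a base-pair count into an idealised contour length and a number of helical turns. The function names and the genome sizes in the comments are illustrative assumptions, not values from these notes.

```python
# Idealised B-form geometry, using the approximate values from the list above.
RISE_PER_BP_NM = 0.34   # axial rise per base pair
BP_PER_TURN = 10.5      # base pairs per helical turn

def helix_length_nm(n_bp: float) -> float:
    """Contour length (nm) of an idealised B-form duplex of n_bp base pairs."""
    return n_bp * RISE_PER_BP_NM

def helical_turns(n_bp: float) -> float:
    """Number of helical turns in n_bp base pairs."""
    return n_bp / BP_PER_TURN

if __name__ == "__main__":
    # Assumed approximate genome sizes (not from the notes):
    # E. coli ~4.6 million bp, human haploid genome ~3.2 billion bp.
    print(helix_length_nm(4.6e6) / 1e6, "mm of DNA in one E. coli chromosome")
    print(helix_length_nm(3.2e9) / 1e9, "m of DNA in one haploid human genome")
```

Real DNA is supercoiled and packaged, so these figures describe the stretched-out duplex only.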

The G+C content of DNA affects stability: GC base pairs (3 H-bonds) are stronger than AT (2 H-bonds). Higher GC content → higher melting temperature (T_m).
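
To make the link between GC content and stability concrete, here is a small Python sketch. It computes the GC fraction of a sequence and applies the Wallace rule, a rough rule of thumb that is not part of these notes and is only reasonable for short oligonucleotides (about 14 to 20 nt). Both function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm_celsius(seq: str) -> int:
    """Rough T_m estimate for a short oligo: 2 °C per A/T plus 4 °C per G/C.

    This is the Wallace rule of thumb, an approximation brought in for
    illustration; it ignores salt concentration and sequence context.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

if __name__ == "__main__":
    for oligo in ("ATATATATATATATAT", "ATGCATGCATGCATGC", "GCGCGCGCGCGCGCGC"):
        print(oligo, gc_fraction(oligo), wallace_tm_celsius(oligo))
```

For three 16-mers of increasing GC content the estimate rises with the GC fraction, which is the qualitative trend described above.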

Chargaff's Rules

Erwin Chargaff (1950) found that in any DNA sample:

[A] = [T] and [G] = [C] (molar equivalence)
But the ratio (A+T)/(G+C) varies between species

These rules were key experimental evidence for complementary base pairing and the double helix structure. They also mean that knowing the sequence of one strand allows deduction of the other (complementary strand).
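
Deducing one strand from the other can be sketched in a few lines of Python. The helper names below are hypothetical; the code simply applies A–T and G–C pairing plus antiparallel orientation, then checks that the resulting duplex obeys Chargaff's rules.

```python
from collections import Counter

# Watson-Crick pairing for DNA: A<->T, G<->C.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'→3'."""
    return strand.upper().translate(_COMPLEMENT)[::-1]

def duplex_base_counts(strand: str) -> Counter:
    """Count bases across both strands of the duplex formed by `strand`."""
    return Counter(strand.upper() + reverse_complement(strand))

if __name__ == "__main__":
    top = "ATGCCG"                      # written 5'→3'
    print(reverse_complement(top))      # the partner strand, also 5'→3'
    counts = duplex_base_counts(top)
    print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

A single strand on its own need not satisfy [A]=[T]; the equalities hold once both strands are counted, which is why the rules pointed to a paired structure.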

Semi-Conservative Replication: The Meselson-Stahl Experiment

Semi-conservative replication means each new double helix consists of one parental strand and one newly synthesised strand. This was proved definitively by Meselson and Stahl (1958):

E. coli were grown in ¹⁵N (heavy nitrogen) medium until all DNA was ¹⁵N-labelled
Transferred to ¹⁴N medium (normal) for exactly one or two generations
DNA was centrifuged in a CsCl density gradient
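After one generation: only intermediate-density DNA (one ¹⁵N + one ¹⁴N strand); this rules out conservative replication, which predicts separate heavy and light bands, but is consistent with both semi-conservative and dispersive replication
After two generations: equal amounts of intermediate and light (¹⁴N/¹⁴N) DNA; this rules out dispersive replication, which predicts a single band of progressively lighter density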

Alternative hypotheses (conservative: parental intact + fully new; dispersive: mixed throughout) were ruled out.
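
The banding pattern predicted by the semi-conservative model can be reproduced with a short simulation. This is a minimal Python sketch under idealised assumptions (every duplex replicates exactly once per generation, and all new strands are ¹⁴N); the function names are hypothetical.

```python
from collections import Counter

HEAVY, LIGHT = "15N", "14N"

def replicate_semiconservative(duplexes):
    """Each duplex yields two daughters: one parental strand + one new light strand."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, LIGHT))
        daughters.append((strand_b, LIGHT))
    return daughters

def band(duplex):
    """Classify a duplex by how many of its two strands are heavy."""
    return {2: "heavy", 1: "intermediate", 0: "light"}[duplex.count(HEAVY)]

def band_fractions(generations: int) -> dict:
    """Fraction of duplexes in each CsCl band after growth in 14N medium."""
    population = [(HEAVY, HEAVY)]          # fully 15N-labelled starting DNA
    for _ in range(generations):
        population = replicate_semiconservative(population)
    counts = Counter(band(d) for d in population)
    return {name: counts[name] / len(population)
            for name in ("heavy", "intermediate", "light")}

if __name__ == "__main__":
    for g in range(4):
        print(g, band_fractions(g))
```

Generation 1 gives only intermediate duplexes and generation 2 gives a 1:1 mix of intermediate and light, matching the observed bands.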

The Replication Fork: Key Enzymes

DNA replication initiates at specific sequences called origins of replication (ori). In bacteria (E. coli), there is one origin (oriC); in eukaryotes, there are thousands of origins distributed throughout the genome (enabling completion within hours despite genome size).
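
A rough calculation shows why large genomes need many origins. The Python sketch below assumes evenly spaced origins that all fire at once, each launching two forks. The genome sizes, fork speeds, and origin count are approximate, illustrative assumptions rather than figures from these notes.

```python
def replication_time_s(genome_bp: float,
                       fork_speed_nt_per_s: float,
                       n_origins: int = 1) -> float:
    """Idealised time (s) to copy a genome with bidirectional forks.

    Assumes every origin fires at t = 0, origins are evenly spaced,
    and each origin launches two forks moving at a constant speed.
    """
    if n_origins < 1:
        raise ValueError("need at least one origin")
    forks = 2 * n_origins
    return genome_bp / (forks * fork_speed_nt_per_s)

if __name__ == "__main__":
    # Assumed values: E. coli ~4.6 Mb at ~1,000 nt/s per fork;
    # human haploid genome ~3.2 Gb at ~50 nt/s per fork.
    print(replication_time_s(4_600_000, 1000) / 60, "min (E. coli, oriC only)")
    print(replication_time_s(3.2e9, 50) / 86_400, "days (human, one hypothetical origin)")
    print(replication_time_s(3.2e9, 50, n_origins=10_000) / 60, "min (human, 10,000 origins)")
```

The multi-origin figure is a lower bound: in real cells origins fire at different times during S phase, so replication takes longer than this idealised estimate.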

Key proteins at the replication fork:

Helicase (DnaB in E. coli; MCM complex in eukaryotes): unwinds the double helix by breaking the hydrogen bonds between base pairs, creating a replication fork. Requires ATP. Unwinding generates torsional stress (positive supercoiling) ahead of the fork, which is relieved by topoisomerase I (transiently cuts one strand) and by type II topoisomerases such as DNA gyrase in bacteria (transiently cut both strands, pass a duplex segment through the break, and re-ligate).

Single-strand DNA-binding proteins (SSBs): stabilise the unwound single-stranded template until it can be replicated.

Primase: synthesises a short RNA primer (~10 nucleotides) complementary to the template strand. Primers are necessary because DNA polymerases can only extend existing strands — they cannot initiate de novo synthesis.

DNA polymerase III (E. coli; equivalent: Pol δ/ε in eukaryotes): the main replicative polymerase. Key properties:

Synthesises DNA only in the 5'→3' direction
Requires a primer and template
Has 3'→5' exonuclease activity (proofreading): excises misincorporated nucleotides and corrects errors; reduces error rate from ~1/10⁵ to ~1/10⁷

Leading strand synthesis: continuous synthesis in the 5'→3' direction toward the fork, requiring only a single primer.

Lagging strand synthesis: discontinuous. Because synthesis must proceed 5'→3', the lagging strand is made away from the fork in short Okazaki fragments (1,000–2,000 nt in prokaryotes; 100–200 nt in eukaryotes), each initiated with a new RNA primer. Primers are then removed and replaced with DNA: in E. coli, DNA polymerase I excises the primer with its 5'→3' exonuclease activity while filling the gap; in eukaryotes, RNase H and FEN1 remove the primer and Pol δ fills the gap. DNA ligase seals the remaining nicks between fragments, using NAD⁺ (E. coli and most bacteria) or ATP (eukaryotes) as cofactor.
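
The fragment sizes quoted above imply a large number of priming events per round of replication. As a rough Python sketch: at every position in the genome one template strand is copied as lagging strand, so the total lagging-strand length equals the genome length. The E. coli genome size used here (~4.6 Mb) is an assumed approximate value, and the function name is hypothetical.

```python
import math

def okazaki_fragment_count(genome_bp: int, fragment_nt: int) -> int:
    """Approximate number of Okazaki fragments (and lagging-strand primers)
    needed to replicate a genome once, assuming a uniform fragment length."""
    if fragment_nt <= 0:
        raise ValueError("fragment length must be positive")
    return math.ceil(genome_bp / fragment_nt)

if __name__ == "__main__":
    ecoli_bp = 4_600_000  # assumed approximate genome size
    for fragment_nt in (1000, 2000):
        print(fragment_nt, "nt fragments:", okazaki_fragment_count(ecoli_bp, fragment_nt))
```

Each of these fragments needs its own primer, primer removal, gap filling, and ligation step.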

DNA Polymerase Fidelity

The overall replication fidelity of ~1 error per 10⁹–10¹⁰ base pairs results from three complementary mechanisms:

Watson-Crick base pairing selectivity: ~1 error per 10⁵ from base pair geometry alone
3'→5' proofreading: corrects ~99% of remaining errors → error rate drops to ~1/10⁷
Mismatch repair (MMR): post-replication system (MutS/MutL/MutH in E. coli; MSH/MLH families in eukaryotes) identifies and repairs mismatches; further reduces rate to ~1/10⁹–10¹⁰
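
The three mechanisms multiply, which the following Python sketch makes explicit. The base-selection and proofreading figures come from the list above; the MMR escape fractions (1 in 100 to 1 in 1,000) are inferred from the quoted overall range, and the diploid human genome size (~6.4 billion bp) is an assumed approximate value.

```python
SELECTION_ERROR = 1e-5        # errors per bp from base-pair geometry alone
PROOFREADING_ESCAPE = 1e-2    # ~99% of those errors are corrected
MMR_ESCAPE_RANGE = (1e-2, 1e-3)  # inferred: fraction of mismatches MMR misses

def overall_error_rate(mmr_escape: float) -> float:
    """Errors per bp after selection, proofreading, and mismatch repair."""
    return SELECTION_ERROR * PROOFREADING_ESCAPE * mmr_escape

def expected_errors(rate_per_bp: float, genome_bp: float) -> float:
    """Expected number of uncorrected errors per genome replication."""
    return rate_per_bp * genome_bp

if __name__ == "__main__":
    diploid_human_bp = 6.4e9  # assumed approximate size
    for escape in MMR_ESCAPE_RANGE:
        rate = overall_error_rate(escape)
        print(f"rate {rate:.1e} per bp -> "
              f"{expected_errors(rate, diploid_human_bp):.2f} errors per division")
```

Even at the best-case rate, a genome of this size still acquires roughly one new error per division, which is why losing any single layer has such large consequences.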

Defects in MMR are found in hereditary non-polyposis colorectal cancer (Lynch syndrome — MSH2, MLH1, MSH6, PMS2 mutations). Loss of proofreading in cancer drives mutator phenotypes.

Telomeres and Telomerase

Linear eukaryotic chromosomes face the end-replication problem: the 5' end of each newly synthesised lagging strand cannot be completed. Once the terminal RNA primer at the 3' end of the template is removed, there is no upstream 3' OH for DNA polymerase to extend, so the gap cannot be filled and the chromosome shortens with each replication cycle.

Telomeres are repetitive sequences (TTAGGG in humans, thousands of repeats) at chromosome ends that serve as a buffer, protecting coding sequences from erosion. Together with telomere-associated proteins (the shelterin complex), they form a T-loop structure that prevents chromosome ends from being recognised as double-strand breaks by the DNA damage response.

Telomerase is a ribonucleoprotein reverse transcriptase: its catalytic subunit (TERT) uses the enzyme's own RNA template (TERC) to extend the 3' overhang of the telomere, allowing subsequent lagging strand synthesis to partially refill the complementary strand. Telomerase activity varies by cell type:

Germ cells and stem cells (maintains telomere length)
~85–90% of human cancers (reactivated — enables replicative immortality, one of the hallmarks of cancer)
Most somatic cells: low/absent telomerase → progressive telomere shortening → replicative senescence after ~50 divisions (the Hayflick limit)
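
Progressive shortening towards a division limit can be sketched as a simple countdown. In the Python example below, all numbers (starting length, loss per division, critical length) are illustrative assumptions chosen to land near the ~50 divisions quoted above; they are not measurements from these notes, and the function name is hypothetical.

```python
from typing import Optional

def divisions_until_senescence(initial_bp: int,
                               loss_per_division_bp: int,
                               critical_bp: int,
                               telomerase_gain_bp: int = 0) -> Optional[int]:
    """Count divisions a cell can complete before telomeres fall below critical_bp.

    Returns None when telomerase fully offsets the loss, i.e. length is
    maintained and this simple model predicts no division limit.
    """
    net_loss = loss_per_division_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None
    length, divisions = initial_bp, 0
    while length - net_loss >= critical_bp:
        length -= net_loss
        divisions += 1
    return divisions

if __name__ == "__main__":
    # Illustrative values only: 10 kb telomere, 100 bp lost per division,
    # senescence triggered below 5 kb.
    print(divisions_until_senescence(10_000, 100, 5_000))
    # Telomerase adding back as much as is lost: no limit in this model.
    print(divisions_until_senescence(10_000, 100, 5_000, telomerase_gain_bp=100))
```

The second call mirrors the situation in germ cells, stem cells, and telomerase-positive cancers, where telomere length is maintained.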

Replication Fidelity and Cancer

Replication errors that escape all repair become heritable mutations. In cancer, several mechanisms accelerate mutation:

MMR deficiency (Lynch syndrome, sporadic colorectal cancer): hypermutator phenotype with microsatellite instability
Nucleotide excision repair (NER) deficiency (xeroderma pigmentosum): UV-induced mutations
Telomere dysfunction: very short telomeres cause chromosomal instability (breakage-fusion-bridge cycles)
Polymerase proofreading mutations (POLE, POLD1 exonuclease domain hotspots): ultramutator phenotype, often more than 100 mutations per Mb (vs. on the order of 1–10 per Mb in typical tumours)

Understanding replication fidelity mechanisms is directly relevant to cancer biology and to the mechanism of action of many chemotherapy agents that target DNA replication (e.g., topoisomerase inhibitors, nucleoside analogues).
