What is your genome? You have heard the term on the news.
Humans have 23 pairs of chromosomes (i.e., 2n) in each of their somatic (non-reproductive) cells that have nuclei. (Note: Sperm and ova have 23 unpaired chromosomes because the number has been halved, i.e., from “2n” to “n,” in order to produce the gamete; this is the process of meiosis.) A picture of the 23 pairs of chromosomes is called a karyotype.

Each pair is said to be a “homologous pair” because the two chromosomes contain the same genes at precisely the same locations (i.e., loci) along their length. But they might, and probably do, contain different versions of a gene. The technical name for the different versions of a gene is “alleles.” Of course, the reason you have two of each homologous chromosome is that one came from your father’s sperm and one from your mother’s ovum (again, a sentence that made my college students squirm).
The shapes of the chromosomes depicted in the photo of the karyotype are how they appear when cells are dividing. Normally, when a cell is not dividing from one cell into two (the process of mitosis), the chromosomes are much (many orders of magnitude) thinner and longer. The DNA, which is continuous and linear along the length of each chromosome, is highly coiled and wrapped around proteins called histones.
The DNA is a double helix, meaning it is like a twisted and tilted ladder. I will use a ladder to describe DNA’s structure. The left and right side rails of the ladder are made of repeating building blocks. Each building block is a phosphate and a 5-carbon sugar (deoxyribose). Each sugar-phosphate building block also has one of four “bases” attached more or less perpendicular to the side rail, and that base serves as half of one step. The bases are Adenine, Thymine, Cytosine and Guanine (A, T, C and G). Every base (one half of a step) is bonded to a base on the other side of the step, so a step consists of two bases. Adenine always bonds to Thymine (A-T or T-A) and Cytosine always bonds to Guanine (C-G or G-C); see the illustration below.
In the discovery of DNA’s structure (last post), Watson and Crick were familiar with the research of a scientist named Chargaff, who lectured at Cambridge in 1952 with Watson and Crick in attendance. He showed that human DNA had equal quantities of Adenine and Thymine and equal quantities of Cytosine and Guanine. Watson and Crick used this fact to build a model in which A bonded to T and C bonded to G.
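The pairing rule and Chargaff's observation can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the example sequence are made up for this post, and the partner strand is written in the same position order as the original (real strands run in opposite directions, so geneticists usually write the partner reversed).

```python
# Pairing rules from the text: A bonds with T, and C bonds with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def partner_strand(strand: str) -> str:
    """Return the base on the opposite rail for every step of the ladder."""
    return "".join(PAIR[base] for base in strand)

def base_counts(strand: str) -> dict:
    """Count each base across both rails of the double helix."""
    both_rails = strand + partner_strand(strand)
    return {base: both_rails.count(base) for base in "ATCG"}

left_rail = "AATGGCTATTCCCGATAGCCGA"   # arbitrary example sequence
right_rail = partner_strand(left_rail)
counts = base_counts(left_rail)

print(left_rail)
print(right_rail)
print(counts)  # A equals T and C equals G, as Chargaff measured
```

Because every A on one rail forces a T on the other (and likewise for C and G), the equal totals fall out of the pairing rule automatically, whatever sequence you start with.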

If you could uncoil and untwist a chromosome and then split it down the middle of the steps into its left and right sides, you would be left with two extraordinarily long series of bases. The letters of one side might read AATGGCTATTCCCGATAGCCGA… and, written all the way out, would be perhaps 150 million characters long for a typical chromosome.
By the way, “Gattaca” is a 1997 American dystopian science fiction film starring Ethan Hawke, Uma Thurman and Jude Law. The film’s name is a reference to genomes: it is spelled entirely with the letters G, A, T and C.
A gene consists of a sequence of thousands of bases, so any single chromosome contains anywhere from hundreds to more than a thousand genes. The 23 chromosomes differ in their gene density. Chromosome number 1 is not only the longest but also contains almost 3000 genes, more than any other. Guys, the Y chromosome has the fewest genes, about 100. About 30,000 genes have been identified in humans and there are 23 chromosomes, so you can do the math: an average of roughly 1,300 genes per chromosome. Genes have “start” sequences and “end” sequences, which are necessary for genes to be expressed at some times and not at others, depending on the physiological state of the tissue and the developmental state of the creature. For example, the enzymes that digest a meal are needed according to the timing of food intake. The digestive enzymes are proteins whose composition is coded by the sequence of the four bases at the place in your genome where the gene for that particular enzyme is located. Physiological signal molecules act to enhance the expression of the gene.
Your genome is the combined list of all the bases as they appear in sequence across all 23 chromosomes. It consists of about 3.2 billion characters drawn from just four letters: A, T, G and C. Of course there is meaning to all the letters, because a gene typically consists of thousands of them, and in sets of three they define the order in which amino acids are assembled into proteins. That assembly happens in tiny sub-cellular structures called “organelles,” inside the cell but outside the cell’s nucleus. Turning a gene into a protein takes two steps. The letters of the DNA are first “transcribed” into complementary letters in messenger RNA (mRNA), which travels out of the cell’s nucleus. The mRNA is then “translated” into a protein by joining amino acids in the sequence defined by its letters (the bases A, U, C and G, because RNA uses uracil, U, in place of thymine), read three letters at a time.
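To make the two steps concrete, here is a minimal Python sketch of transcription and translation. Several things in it are assumptions for illustration only: the toy gene is invented, the codon table lists just a handful of the 64 real codons, and the input is taken to be the coding strand (the rail whose letters match the mRNA, apart from T becoming U).

```python
# A small slice of the genetic code (mRNA codon -> amino acid).
# The real table has 64 entries; only a few are listed here.
CODON_TABLE = {
    "AUG": "Met",  # also the usual "start" signal
    "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def transcribe(coding_strand: str) -> str:
    """Write the mRNA letters for a coding strand: same letters, U replaces T."""
    return coding_strand.replace("T", "U")

def translate(mrna: str) -> list:
    """Read three letters at a time from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start signal found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # ??? = not in our small table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

toy_gene = "CCATGTTTGGCGCTAAATAAGG"  # hypothetical sequence, not a real gene
mrna = transcribe(toy_gene)
print(mrna)
print(translate(mrna))
```

The letters before the start signal and after the stop signal are ignored, which is the same idea as the “start” and “end” sequences of a real gene.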
The technology to determine those sequences of millions and millions of bases (A, T, C and G) is amazing. It starts with amplifying a small amount of the genome into millions of copies in a process called “PCR,” which stands for Polymerase Chain Reaction. The “chain reaction” part alludes to the amplification of the amount of DNA, not the use of radioactive materials! Then the chromosomes are literally blasted into millions of fragments, sequenced and then re-assembled mathematically. (For Genetics Students: both strands are read in the 5’ to 3’ direction.) Since genes have start and stop sequences defined by three bases, the software on the supercomputers that reassemble all the data recognizes which sides of the ladder, so to speak, go with one another. The PCR process owes its ability to an enzyme discovered in heat-loving bacteria in very hot thermal springs. The bacterium is named Thermus aquaticus and was first isolated from Mushroom Spring in the Lower Geyser Basin of Yellowstone National Park. The heat-stable enzyme is named Taq polymerase, where “Taq” is a reference to the bacterium’s Latin name (the T from Thermus and the aq from aquaticus). Nearly every DNA test in the world starts with PCR using this enzyme, followed by DNA sequencing, which in itself is an amazing technological and mathematical achievement.
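The “chain reaction” is easy to appreciate with a little arithmetic. In the idealized case each PCR cycle doubles the amount of DNA, so the number of copies grows as 2 raised to the number of cycles. The Python sketch below shows that growth; the function name and the efficiency parameter are my own illustrative assumptions (real reactions fall short of perfect doubling and eventually level off).

```python
def pcr_copies(starting_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy count: each cycle multiplies the DNA by (1 + efficiency).

    efficiency=1.0 means perfect doubling every cycle (an assumption;
    real reactions are somewhat less efficient).
    """
    return starting_copies * (1 + efficiency) ** cycles

# Starting from a single DNA molecule:
for cycles in (10, 20, 30):
    print(cycles, "cycles:", f"{pcr_copies(1, cycles):,.0f}", "copies")
```

Ten cycles give about a thousand copies, twenty give about a million and thirty give about a billion, which is why a tiny sample is enough to start with.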
Finally, DNA is being used for data storage. DNA is synthesized and the order of the bases is used to encode information, just like the 1’s and 0’s in conventional memory chips. But the data density is spectacular. It is estimated that all published materials could be encoded into DNA and fit into the trunk of a medium-sized automobile. DNA is good for dense, long-term data storage, but it can’t be read as fast as solid-state chips.
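Since there are four bases, each base can stand for two binary digits. The Python sketch below shows one way text could be turned into a string of bases and back again. The particular mapping (00 = A, 01 = C, 10 = G, 11 = T) is an arbitrary choice for illustration, and practical DNA storage schemes are more elaborate, for example adding error correction and avoiding long runs of the same letter.

```python
# Hypothetical mapping: two bits per base (chosen arbitrarily for illustration).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = "genome".encode("utf-8")
strand = encode(message)
print(strand)                          # the message written in A, C, G and T
print(decode(strand).decode("utf-8"))  # back to the original text
```

Each character of ordinary text takes only four bases in this scheme, which hints at why the storage density of DNA is so high.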
