Genomics

Genomics is the study of an organism's entire genome, in contrast to genetics, which often focuses on single genes. Genomic research based around small groups of individuals often attempts to discover or characterise what quirks of inheritance and mutation have made them the way they are. Genomic research focussing on populations attempts to identify the characteristics of the population in terms of allele frequencies and/or structural novelties.

Comparative genomics
Differences between subpopulations are easily found. One way to see them is in the differing allele frequencies of Single Nucleotide Polymorphisms (SNPs). These SNPs almost always have little to no significance for health or appearance. Unfortunately for racists (or "HBD proponents", as they prefer), the differences don't even follow political or geographic boundaries very well. Even if they did, a difference in allele frequency doesn't add up to a hill of beans.

Say you had two populations - A and B. For a particular SNP, population A only ever has the allele T. Population B can have the allele T or the allele G, in the ratio 2:1. The allele frequencies of T are therefore 100% and 67% respectively. Unless this SNP is on the X or Y chromosomes, each person will have two alleles. In population A, this is only ever TT. For population B there are 3 possibilities - TT, GT and GG. Assuming random mating, these are expected in the ratio 4:4:1. So almost half (4 in 9) of population B have the same genotype as population A, despite a significant difference in allele frequency between the groups.
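The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: it assumes random mating (Hardy-Weinberg proportions), and the function name hardy_weinberg is made up for illustration.

```python
from fractions import Fraction

def hardy_weinberg(p_t):
    """Expected genotype frequencies for a two-allele SNP (T and G),
    assuming random mating. p_t is the frequency of allele T."""
    p_g = 1 - p_t
    return {
        "TT": p_t * p_t,      # both alleles are T
        "GT": 2 * p_t * p_g,  # one of each, in either order
        "GG": p_g * p_g,      # both alleles are G
    }

# Population A carries only T; population B carries T and G in the ratio 2:1.
pop_a = hardy_weinberg(Fraction(1))
pop_b = hardy_weinberg(Fraction(2, 3))

for genotype, freq in pop_b.items():
    print(genotype, freq)

# Fraction of population B sharing population A's only genotype (TT).
print("Shared with A:", float(pop_b["TT"]))
```

Population B comes out at 4/9 TT, 4/9 GT and 1/9 GG, which is the 4:4:1 ratio, and the share with the same genotype as population A is about 44%.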

Population genomics is most useful behind the scenes. The computational aspect of genomics uses a 'reference genome' as the basis for building up its sequence of alleles for the samples it processes. If the reference genome is unreliable, then the process falters. So, the more information that is gathered on the human genome, the better the reference can be. In particular, a reference genome built up using one population is less useful for other populations. There are active projects that try to move beyond references dominated by European ancestry in order to better serve other populations. The 1000 Genomes Project included three main populations - European, African and East Asian - and picked what was hoped to be a genetically isolated subpopulation for each. The UK10K project intends to sequence 10,000 British individuals, which it is hoped will cover minority populations in the United Kingdom adequately.

The comparison of genomes between species is informative in a different way - the genes that change the least are the genes that are most important in the functioning of the organism. This helps identify which variant alleles out of the millions are likely to be deleterious. Comparisons between species also provide a clearer picture of the evolutionary relationships between organisms.

Personalised genomics
A person's entire genome can be analysed to look for potential diseases, or to find the cause of a disease they are known to suffer from. The latter is much, much easier than the former - even well-known causative mutations can have low penetrance, and hence be poor predictors of future illness. Even apart from that problem, there are two practical obstacles to using genomics for diagnosis.

The first is the need for a reliable body of information as to which variants genuinely cause diseases. Simple genetic disorders manifest in childhood, and hence it is often relatively easy to isolate the cause. A family with an inherited disease is invaluable, as the familial structure helps narrow down the search considerably. For adult-onset conditions, particularly those where environmental exposure or general health are important factors, it is a much more difficult prospect. Existing research databases, such as the Human Gene Mutation Database, are currently barely fit for the purpose except for a few, very well studied mutations. The body of knowledge is forever being added to, but the complexity of cell biology, and the trillions of potential interactions between genetic features, are a huge obstacle to finding reliable predictors for disease.

The second is the reliability of the methods used to extract genomic information. Some parts of the human genome are more amenable to sequencing than others, and even the well-behaved portions of the genome don't give good results without thorough techniques. Most genome-wide sequencing technology takes a probabilistic approach, sampling from the genome at random. This not only means that some areas will be covered less well than others, but that a certain minimum amount of coverage is needed to be reasonably confident that alleles from both chromosomes have been collected. If the true genotype at a position is AT, for example, each 'read' will record only A or only T. With 5 attempts to read the allele, you'll still get AAAAA 3% of the time, and TTTTT 3% of the time. Even if you see one of each, you can't be sure that the true genotype is AT because of the possibility of sequencing errors. Once you start considering more complicated forms of variation, such as Copy Number Variation (a section of genetic code appearing multiple times or being deleted, giving rise to genotypes that look like they have 0, 1, 3 or more alleles instead of 2), it's a wonder it's possible to get any usable results at all.
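The 3% figure, and the idea of a minimum coverage, can be illustrated with a short Python sketch. It assumes an idealised model in which each read independently samples either allele with probability 1/2 and there are no sequencing errors; real data are messier. The function names are hypothetical.

```python
def prob_single_allele_run(depth):
    """Probability that every one of `depth` reads at a heterozygous
    site shows one particular allele (e.g. AAAAA at an AT site)."""
    return 0.5 ** depth

def prob_missing_an_allele(depth):
    """Probability that the reads show only one allele, whichever it is
    (all A or all T), so the site looks homozygous."""
    return 2 * prob_single_allele_run(depth)

def min_depth(max_miss):
    """Smallest read depth at which the chance of missing an allele
    is no greater than `max_miss`, under the idealised model."""
    depth = 1
    while prob_missing_an_allele(depth) > max_miss:
        depth += 1
    return depth

print(prob_single_allele_run(5))   # chance of AAAAA at an AT site
print(prob_missing_an_allele(5))   # chance of AAAAA or TTTTT
print(min_depth(0.01))             # depth needed to miss under 1% of the time
```

At a depth of 5, each single-allele run has a probability of 0.03125 (the 3% above), so about 6% of heterozygous sites would look homozygous. Under the same assumptions, 8 reads are needed to push that below 1%.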

All of which has dire implications for the various commercial genetic testing kits that companies are trying to sell, which are currently basically scams.

Relationship with genetics
Perhaps the easiest way to contrast the two is via a fishing analogy.

In genetics, you want to catch a specific fish. You choose an appropriate bait and throw back anything else that takes your line. You have a great interest in all aspects of that particular fish.

In genomics, you scour the river clean with a backhoe. Now you have to pick through a massive pile of mud and fish. You don't know what you're looking for, but you hope you'll know it when you see it.

The two approaches are not mutually exclusive. Once genomics has determined that a gene (or other feature) has a particular relationship to a condition, it can be examined in greater, targeted detail. The existing body of knowledge with respect to genes (and other features) can be cross-referenced with genomic findings to find novel associations. Often a sample will be sequenced across the entire exome or genome, despite there being a candidate gene in mind. If that gene doesn't show anything important, then the rest of the sequence is there to be examined.