Phenotype-Genotype Relationships in Psychiatric Disorders
Assen Jablensky
School of Psychiatry & Clinical Neurosciences
University of Western Australia
My topic is phenotype-genotype relationships in psychiatric disorders,
and in talking about phenotypes I will use that term in a very broad
sense. Height, weight, colour of skin, features like cognitive test
performance, or MRI brain volumetry are examples of phenotypes, ranging
from simple elementary entities which we can pinpoint and measure to
more complex levels of phenomena to which we attach diagnoses and disease
labels. I will first mention some important developments emerging in
the aftermath of the human genome project, now that the human genome
is fully sequenced and annotated. I shall then discuss briefly the problem
of phenotype - genotype linkages, or as they are also called, phenogenetic
relationships, in trying to persuade you that in psychiatric genetics,
the phenotype is likely to be the major "rate limiting factor"
for using the novel developments stemming from the human genome project.
I will then present data from our family studies of schizophrenia
in Western Australia as a "proof of principle" that by paying
greater attention to detail in phenotype characterisation we gain power
for genetic studies. Finally, I shall refer to an initiative which is
emerging on the horizon and has been recently floated in the literature:
that after the human genome project, we need something like a global
human phenome project.
Following the successful completion of the human genome project,
the major question that arises is about the function of the identified
genes - supposedly numbering between 30 and 40 thousand (some still
believe there are about 50 000 functional genes in the human genome).
What do they do, and how do they relate to human diseases? Several
hundred mendelian, relatively rare single-gene diseases have already
been dissected genetically with great success and efficiency, but
we are now faced with the worldwide prevalence of the so-called common
or complex diseases, ranging from asthma, cancers, cardiovascular
disease, osteoporosis, arthritis, to schizophrenia and bipolar disorder.
There are proposals that, to tackle their genetic complexity, we should
collect enormously large samples of people affected with those diseases,
as well as healthy controls, and systematically - like in the human
genome project - sequence all the possible genes to find in their
structure variants associated with morbidity. That would be a mammoth
enterprise never tried on such a scale in any human endeavour. Another
idea is to first identify all of the so-called single nucleotide polymorphisms
(SNPs). These are genetic markers, very simple ones, which are spaced
almost evenly across the genome. Their rate of individual variation
is so high that a combination of such SNPs can identify the genetic
uniqueness of any individual on the planet. It has been estimated
that about 500,000 such SNPs are required to provide the information
necessary for constructing a so-called SNP haplotype map of the human
species. Haplotypes are combinations or patterns formed by such SNPs
and the map would enable us to identify how such combinations relate
to common diseases. There are also major technical developments, such
as the DNA microarrays, which allow the identification of the expression
of something like 11 or 12 thousand genes in a single sweep on
a miniaturised component, a microchip
which is smaller than a fingernail. One of the fascinating future
applications of this technology is the capacity to combine in a single
study conventional positional cloning and association with microarrays
to map genes as well as to measure their expression. However, all
these novel developments bordering on science fiction are at present
facing a major problem, which is that the amount of information generated
(and which potentially can be generated) is orders of magnitude larger
than whatever has been handled so far by our statistical models. This
problem may not be unsurmountable but is creating difficulties, so
that the statistical treatment of complexity remains an extremely
important challenge.
In talking about genotype-phenotype relationships, and the collateral
problem of gene-environment interactions, I think one should realise
that there exists an implicit tendency (also referred to as the
"modern dogma") to interpret the results of the human genome
project in terms of a simplistic view that the causation of disease
is ultimately gene-based. We have had in the past such models as "one
gene - one disease". Now, with the common diseases, we tend to
acknowledge that the situation is not as simple as one-to-one, but probably
is something like "many genes - one disease". I think that
this is very likely to be wrong. First, some basic considerations about
phenotypes and genotypes that have recently been stated with persuasive
clarity by Weiss and Buchanan (2003). Evolution works by screening the
phenotype and not the genotype, so that the genetic history of the species,
as well as the formation of any specific traits that we are trying to
analyse in our studies, results from either evolutionary pressure on
the phenotype, or from lack of such pressure on the phenotype. Secondly,
there are multiple non-genetic or epigenetic influences and processes,
which means that after transcription, factors with stochastic properties
and, therefore, behaving randomly contribute significantly to phenotype
variation. These include - to mention just a few - alternative splicing;
incomplete folding of the proteins produced by the genes which renders
them dysfunctional (this is happening in Alzheimer's disease and in
other neurodegenerative diseases); the concentrations of transcription
factors in the cell may vary, depending on a number of parameters that
are difficult to control; and finally there may be varying amounts of
gene products which have inhibitory action on function.
Charging ahead, I should like to refer now to the results of a major
epidemiological study, conducted by the World Health Organization,
in which I had the privilege to play a role (Jablensky et al., 1992).
It is known as the WHO Ten-country Study on Schizophrenia, in which
identical methodologies were applied in geographically defined areas
on four continents to identify all incident cases of schizophrenia,
assess their clinical characteristics and re-examine them on annual
follow-ups. This enabled us to calculate the incidence of schizophrenia
per 10,000 people in the age range 15-54 in quite diverse populations
and cultures. The incidence rates, of course, were found to vary but
the band of this variation was rather narrow and the overall conclusion
was that schizophrenia has a similar incidence in different populations.
In fact, the similarity of both incidence rates and symptoms of the
disease across countries such as Denmark, India, Nigeria, Russia,
Japan etc. was so striking, that one could feel tempted to say "well,
if such striking similarities exist, and we know that schizophrenia
has a strong genetic component in its causation, so probably there
are common genes causing schizophrenia across those populations".
While I was initially also inclined to take such a point of view, I
now think that this kind of interpretation of the findings of the
schizophrenia study, as well as of studies involving other complex
diseases, is likely to be wrong for the following reasons. In complex
diseases, the phenotype-genotype relationships are not just "many
genotypes to one phenotype" but rather "many genotypes to
many phenotypes". A given genotype can be associated with a range
of phenotypes, and the same phenotype can be associated with many
different genotypes. Complex diseases illustrate well the phenomenon
called phenogenetic drift, which means that in the course of evolution
the association of a disease with a particular genotype becomes looser
and looser if that disease tends to be non-fatal and to have a later
onset. Diseases like dementias, diabetes, myocardial infarction and
schizophrenia rarely, if at all, have their onset at birth. Schizophrenia
typically manifests itself in late adolescence or early adulthood,
but many other complex diseases have onset in middle age. This means
that the selection pressure to eliminate or alter the disease-predisposing
genotype becomes weaker and weaker in the course of evolution, so
that the result is the uncoupling of the specific relationship between
a particular set of primordial genotypes and a given phenotype. Moreover,
what tends to occur increasingly in the course of time is the so-called
phenotype conservation or phenotypic convergence. Being only weakly
exposed to selection pressure, a phenotype like schizophrenia (described
by its symptoms) may remain relatively unchanged against a background
of an increasing genotype divergence associated with population migration.
In the long run, the result would be that schizophrenia becomes associated
with a number of different genotypes, some of them rare, some relatively
common. If this hypothesis can be supported by adequate evidence and
withstand critical testing, then genetic heterogeneity (both locus
and allelic) of schizophrenia will indeed pose great challenges to
our search for susceptibility genes.
At this point I should like to address a misconception which is revealed
by the use of terms like "genes for schizophrenia". Weiss
and Buchanan (2003) refer to this as the "Lamarckian illusion",
i.e. naming genes as if the function through which they are discovered
is their evolutionary reason for being. What we actually do in research
into diseases like schizophrenia and their genetics is screening the
population using our health services by clinical phenotypes. However,
the clinical phenotypes we identify are very likely to represent just
the tail end of the phenotype's true distribution which extends beyond
the clinic and into the general population. At present we know very
little about the nature of the "non-clinical" dimension of
the phenotype that, in its severe form, manifests itself with the clinical
features of schizophrenia. Is that dimension predominantly expressed
as personality traits, as patterns of cognitive functioning, or as something
else? It is the tip of the iceberg that we see in our health services
and in our studies.
The "hopeful" scenario is that there are common genetic
variants underlying the common diseases, but however difficult that
is to prove, the alternative is also very likely. The accumulation
of rare alleles will result in significant genetic heterogeneity:
a disease may be common in the population precisely because multiple,
different and rare genetic variants are involved. At present we do
not have a critical test that could conclusively arbitrate between
these two alternative models of complex diseases. Schizophrenia is
one of the genetically complex diseases for reasons which are well
known - its inheritance does not follow mendelian rules; it is very
likely associated with multiple genes, each of a small effect; a contribution
of environmental factors is likely but so far we have not been able
to identify any specific such factor; and significant gene - environment,
as well as gene - gene, interactions are probably an important
part of the picture. The search for susceptibility genes in schizophrenia,
as well as in other similar disorders, follows a classical scheme.
First we use genetic markers in a linkage analysis to find out which
markers travel together with the disease across generations
within families and how they map on the genome. Then we try to refine
the linkage region by genotyping in it additional microsatellite markers
(now also SNPs). Following that we conduct linkage disequilibrium
analysis to estimate the statistical association between polymorphisms
within the region and the disease in a sample of cases and to compare
that to a sample of controls without the disease. Hopefully, we then
find our biologically plausible candidate genes, using the rapidly
accumulating information on the annotated genomic databases and conduct
further association studies until mutations or variants in the genes
are found. The next, most complex and difficult step, is to understand
their role (what they do, and whether they are expressed in the relevant
tissue) and to conduct functional studies, including animal models.
The study designs include multiply affected families, as well as designs
based on nuclear families - affected sib pairs where two siblings share
the same phenotype, or triads where we have two unaffected parents and
an affected child. At present, close to thirty studies have carried
out complete genome scans of families with schizophrenia and tentative
findings of linkage have been reported on more than half of the human
chromosomes. The findings are suggestive, but hardly any such finding
has been definitively replicated, although there is a tendency in recent
years for successful replication of at least some. As regards the case-control
association studies using SNPs or SNP haplotypes in candidate genes,
the problem they face is the difficulty in ruling out false positive
results, due to the lack of strong prior hypotheses about the probability
that a "true" association exists.
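The role of the prior can be illustrated with a short calculation based on Bayes' rule. This is only a sketch under stated assumptions: the function name and all the numbers (prior probability, power, significance level) are hypothetical, chosen to show the effect rather than taken from any study.

```python
def prob_true_association(prior, power, alpha):
    """Probability that a nominally significant association is real (Bayes' rule).

    prior: assumed probability, before the study, that the association exists
    power: probability of a significant result when the association is real
    alpha: probability of a significant result when it is not (false positive rate)
    """
    true_pos = prior * power
    false_pos = (1.0 - prior) * alpha
    return true_pos / (true_pos + false_pos)

# Hypothetical scenarios: a weakly motivated candidate versus a strongly motivated one.
weak_prior = prob_true_association(prior=0.001, power=0.8, alpha=0.05)
strong_prior = prob_true_association(prior=0.10, power=0.8, alpha=0.05)
print(f"weak prior: {weak_prior:.3f}, strong prior: {strong_prior:.3f}")
```

With a weak prior, most nominally significant results are expected to be false positives even when the test itself is well powered.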
Another problem is that the majority of the studies have so far used
as the phenotype the clinical diagnosis of schizophrenia. Although
the reliability of clinical diagnosis is now higher than it used to
be prior to the introduction of the explicit diagnostic criteria in
classifications such as ICD-10 and DSM-IV, it remains questionable
whether such diagnoses are the best phenotype for genetic research.
Increasing doubts and questions are emerging. To quote just one, "genes
do not code for hallucinations and delusions or thought disorganisation
per se ... the biological effects of genes are likely to be more
predictable in terms of the underlying abnormalities in brain function
rather than in terms of a highly variable and subtle experiences of
subjective experience of hallucinations or delusions" (Weinberger,
2002). What we need are phenotypes capturing structural abnormalities
in the brain, particular brain dysfunction, and behavioural traits,
rather than only the clinical diagnosis which is probably too general.
We can postulate that the more precisely or narrowly a phenotype is
defined, the more likely it is that the phenotype would be closer
to the causal physiological pathways involving a limited number of
genes. It is possible that our current diagnostic categories of schizophrenia,
bipolar disorder, anxiety disorder etc. actually represent conflations
of several interacting narrow phenotypes which operate at a deeper
level, below the surface of clinical presentation. How can we increase
the genetic informativeness of phenotypes for psychiatric research?
First, we can try to reduce the amount of diagnostic misclassification
in our sample (some estimates suggest that diagnostic error may be as
high as 30%). Secondly, we may divide our sample into clinically
more homogeneous groups using approaches like candidate symptoms (e.g.
primary or idiopathic negative symptoms, target features like anhedonia,
or cognitive dysmetria, a concept which has been proposed by Andreasen,
1999). However, a more general pointer in the right direction is
to maximise the risk ratio (or prevalence ratio, λs), which is
the ratio between risk of disease in first-degree relatives (e.g.
siblings) and risk of disease in the general population (Risch, 1990).
For example, a sibling of a person with schizophrenia has about a
10-fold increase in risk of schizophrenia compared to a randomly selected
person in the general population where the risk is about 1%. This
gives a risk ratio of 10, and we will be looking for alternative phenotypes,
correlated with schizophrenia, that would have at least that or even
higher risk ratio. There are studies indicating that the lambda for
the Continuous Performance Task / Identical Pairs (CPT-IP) is in the
order of 30. Using such cognitive tasks, we may be able to define
component traits, often referred to as endophenotypes (Gottesman and
Gould, 2003) or schizophrenia-related variants (SRV), based on the
hypothesis that currently known cognitive neuroanatomical and biobehavioural
markers might account for more of the genetic variation than does
the clinical diagnosis of schizophrenia (Cromwell, 1986). Such schizophrenia-related
variants are associated with the presence of schizophrenia but play
no part in its clinical diagnosis. They tend to emerge earlier than
the onset of clinical symptoms, and we are likely to also find them
among the mentally healthy relatives of the patients. For the time
being, we can only hope, without having definitive evidence, that
such SRVs involve the same biological pathways as the disease but
are less remote from the relevant gene action than are our diagnostic
categories. Since the early work of Kraepelin (1919), who not only
identified schizophrenia as a disease but also provided the first
evidence that it is basically a cognitive disorder, we have good reasons
to look at cognitive dysfunction in schizophrenia as the source of
possibly useful phenotypes.
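The risk-ratio arithmetic mentioned above can be written out in a few lines. The sketch below is illustrative only: the function name is hypothetical, and the 10% sibling risk is simply what a 10-fold increase over a population risk of about 1% implies.

```python
def recurrence_risk_ratio(risk_in_relatives, risk_in_population):
    """Risk ratio (lambda): risk of disease in relatives divided by population risk."""
    if risk_in_population <= 0:
        raise ValueError("population risk must be positive")
    return risk_in_relatives / risk_in_population

# Illustrative values: about 1% population risk, about 10% risk in siblings.
lambda_sib = recurrence_risk_ratio(0.10, 0.01)
print(round(lambda_sib, 1))
```

A candidate phenotype is of interest under this criterion when its ratio, computed the same way, is at least as high as that of the clinical diagnosis.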
I shall now try to reinforce some of these rather theoretical points
by providing empirical data from the Western Australian Family Study
of Schizophrenia. In this study, we have three aims: to explore alternative
ways of refining the phenotype; to carry out a detailed, multi-level
assessment of patients, relatives and controls; and to conduct molecular
genetics studies using a combination of clinical diagnosis and neurocognitive
phenotypes. We have by now completed a genome scan of 116 families,
including a total of 412 individuals, each having a full assessment
involving a standardised clinical diagnostic assessment, personal and
family history, neurological and physical examination, two sets of measures
of temperament and personality traits, a neurocognitive test battery,
and in a subset of those families, also brain potentials studies, saccadic
eye movements, and structural MRI. We have DNA samples from all the
members of those families. The neurocognitive assessment includes measures
of prior and current intelligence, executive attention, verbal fluency
and verbal memory, speed of neural processing, and measures of handedness
and laterality. On most of these measures, we find - expectedly - that
the probands with schizophrenia are very different from the normal controls,
while the first-degree relatives of patients are somewhere in between
the probands and the controls. That is to say that roughly 50% of the
relatives are very similar to the patients in their neurocognitive performance
without having any of the clinical features of the disorder. The problem
we faced was how to analyse this large volume of multi-domain data -
as multiple single tests, or as a multivariate composite picture combining
in some way all these measurements? Analysing them as a composite neurocognitive
mosaic could be expected to produce a picture that reflects neurobiology
better than individual variables. Measures such as CPT-IP have a high
prevalence ratio but their effect size is small to modest, which makes
individual tests less hopeful candidates for genetic analysis. In order
to combine all the variables in ways that result in complex patterns, we
decided to use a form of latent class analysis, known as grade of membership
analysis (GoM), which generates a set of so-called pure types - latent
classes described in terms of probabilities for the variables constituting
the neurocognitive set of measurements. The advantage of the method
is that it generates the pure types and also simultaneously estimates
(by maximum likelihood) the extent to which each subject in the sample
fits each of the pure type profiles. Every individual then gets a quantifiable
grade of membership in more than one of those latent classes. By using
this method, we obtained two major neurocognitive pure types, each including
a subset of the probands with schizophrenia and a proportion of their
biological relatives. The first type is characterised by very high probabilities
that its members will be significantly impaired on general ability,
the continuous performance tasks, verbal memory, verbal fluency, neural
processing speed, and also have high scores on soft neurological signs.
The second, non-deficit type shows little impairment on the majority
of neurocognitive measures but, interestingly, exhibits almost 100%
probability of being characterised by high schizotypy scores (on Raine's
Schizotypal Personality Questionnaire), and harm avoidance and self-transcendence
(on Cloninger's Temperament and Character Inventory). Heritability,
in terms of familial aggregation, was highly significant for the deficit
type and only marginally significant for the non-deficit type.
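The idea of a graded membership can be made concrete with a toy calculation. The sketch below is a simplification and not the procedure used in the study: a full grade of membership analysis estimates the pure-type profiles and the membership grades jointly, whereas here the two profiles are fixed in advance and only one person's grade is estimated by maximum likelihood over a grid. The profiles, the four binary measures and the function names are all hypothetical.

```python
import math

# Hypothetical pure-type profiles: probability of being impaired on each of
# four binary measures (illustrative numbers, not the study's estimates).
PURE_TYPES = [
    [0.9, 0.8, 0.9, 0.7],  # "deficit" type
    [0.1, 0.2, 0.1, 0.2],  # "non-deficit" type
]

def log_likelihood(g, responses):
    """Log-likelihood of binary responses given grade g in type 0 and 1 - g in type 1."""
    total = 0.0
    for j, x in enumerate(responses):
        # Under the model, the probability of impairment is a membership-weighted
        # mixture of the two pure-type probabilities.
        p = g * PURE_TYPES[0][j] + (1.0 - g) * PURE_TYPES[1][j]
        total += math.log(p if x == 1 else 1.0 - p)
    return total

def estimate_membership(responses, steps=100):
    """Grid-search maximum-likelihood estimate of the grade of membership in type 0."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda g: log_likelihood(g, responses))

# 1 = impaired, 0 = not impaired on each of the four measures.
print(estimate_membership([1, 1, 1, 1]))  # impaired on every measure
print(estimate_membership([1, 1, 0, 0]))  # mixed pattern, partial membership
```

A person impaired on every measure is assigned fully to the deficit profile, while a mixed pattern yields a grade strictly between 0 and 1, which is what allows unaffected relatives to contribute quantitative information.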
The clinical features of the patients who were classified into those
two major types showed some striking differences. The deficit type had a high
proportion of non-paranoid clinical subtypes (such as undifferentiated,
hebephrenic and simple schizophrenia), while the non-deficit type
included predominantly cases of paranoid schizophrenia. Our initial
hypothesis was that the non-deficit type may represent an earlier
stage of the disorder, with milder deficits which in the course of
time become more severe. This hypothesis was rejected when we examined
the length of illness in the two clusters. Nearly 50% of the non-deficit
cases had over thirteen years length of illness compared to 23% of
the deficit type. It is likely, therefore, that these two neurocognitive
types arise differently from one another early in the course of the
disorder, and that the deficit type is not a late stage of the non-deficit
type. Furthermore, there was a tendency for the deficit type to require
higher doses of both typical and atypical antipsychotic medications.
The "proof of principle", that we may be dealing with two
genetically distinct forms of schizophrenia, comes from the genetic
analysis. We conducted a whole-genome scan using 400 microsatellite
markers and used a combined phenotype including clinical diagnosis (of
those affected) and the composite neurocognitive profiles of all probands
and family members (affected and unaffected) to stratify the sample
by liability classes. The main finding was a linkage peak with a lod
score close to 4 within a relatively narrow region on chromosome 6p,
which is genome-wide significant. Importantly, this significant linkage
was almost entirely explained by the cognitive deficit phenotype; the
non-deficit phenotype did not show any linkage in this region. The other
finding was on chromosome 10q where we obtained a suggestive (bordering
on significant) finding of linkage with a lod score of 3.5. Again,
it was mainly the cognitive deficit phenotype that was linked to that
region.
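For readers less familiar with lod scores, the quantity is the base-10 logarithm of the odds in favour of linkage, so a score of 3 corresponds to odds of 1000 to 1. The sketch below computes the textbook two-point lod score for fully informative, phase-known meioses. It is not the analysis used in our genome scan, and the function name and counts are hypothetical.

```python
import math

def lod_score(n_meioses, n_recombinants, theta):
    """Two-point lod score for phase-known meioses at recombination fraction theta.

    Compares the likelihood of the data at theta with the likelihood under
    no linkage (theta = 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    k = n_recombinants
    likelihood = (theta ** k) * ((1.0 - theta) ** (n_meioses - k))
    if likelihood == 0.0:
        return float("-inf")  # data impossible at this theta
    return math.log10(likelihood) - n_meioses * math.log10(0.5)

# Hypothetical data: 10 informative meioses, none of them recombinant.
print(round(lod_score(10, 0, 0.0), 2))
# Same number of meioses with one recombinant, evaluated at theta = 0.1.
print(round(lod_score(10, 1, 0.1), 2))
```

Each additional non-recombinant meiosis adds about 0.3 to the score, which is why large or numerous families are needed to reach the values reported above.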
There is a growing list of potential candidate genes. To mention
just a few, close to our area of linkage on chromosome 6 is the dysbindin
gene, coding for a protein with a function in synapse formation and
several groups, including ours, are now investigating this gene for
association with schizophrenia. There is another gene (neuritin 1)
located more distally in the same region which is also involved in
synaptogenesis and neural plasticity. We are now sequencing the entire
gene and will have results in the near future. However, I think that
the main "proof of principle" comes from the partitioning
of cognitive dysfunction in schizophrenia into subtypes, which results
in correlated phenotypes useful for genetic studies. If we manage
to integrate multiple correlated measurements into a composite cognitive
trait, we obtain increased power to detect genetic linkage, since
it is not only the individuals affected with the disease but also
their clinically unaffected relatives that become genetically informative.
My conclusion is about a recent publication (Freimer and Sabatti, 2003)
which signals something important and related to what I have been trying
to convey up to this point. It is a proposal for a Human Phenome Project.
We need new strategies for a systematic study of phenotypes in order
to identify variants associated with complex traits like schizophrenia.
Such a project would enable us to invert the mapping strategy from
the traditional search for shared genotypes to a search for shared phenotypes,
by analysing large, comprehensively phenotyped samples from the general
population as well as samples ascertained for disease-related phenotypes.
Such research could be modelled on the existing network of NIH clinical
research centres. I have no doubt that these ideas will eventually get
strong support in the United States. The question on which I will end
my talk today is: will there be a prospect for an Australian Brain and
Mind phenome research network?
© 2004, Brain and Mind Australia Inc.