Software
I have reorganized my programs so that they can now all be downloaded in
a single
jar file.
If you download and add this jar file
to your CLASSPATH, you should be able to run all the programs.
Note that I use the LINKAGE data format for most input files, and for
some output files. Go here for
more information on this format.
Two new programs, McLink and McLinkLD, have recently been
added (25/08/06). McLink implements an MCMC scheme for linkage
analysis with an option to assume a specified model for
linkage disequilibrium between
the markers. McLinkLD samples from the joint distribution of
LD models and linkage lod scores given the observed pedigree
and genotypes. These programs are provided for users to
experiment with, and results from them should be interpreted with
caution. It is not clear under what conditions, if any, the MCMC
methods implemented by these programs mix well and give
reliable results.
General pedigree analysis utilities
-
CheckFormat
a program that checks the format of LINKAGE parameter and pedigree
input files.
-
CheckErrors (previously called GMCheck)
a program that uses graphical modelling or Bayesian
network methods to calculate the posterior probability of genotype
or phenotype errors in pedigrees.
-
DownCodeAlleles
a program that removes alleles unobserved in genotype data from
the specified model for the locus.
-
GeneCountAlleles
a program that implements gene counting, or the EM algorithm,
to obtain maximum likelihood estimates for allele frequencies
from genotypes of related individuals.
-
SelectLoci
a program for selecting subsets of loci from LINKAGE input files.
-
SelectKindreds
a program for selecting subsets of kindreds from LINKAGE input files.
You can probably do the same thing with a grep command.
-
TrimPed
a program to remove individuals from a pedigree if they have insufficient
observed data.
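As a rough illustration of the gene counting idea behind GeneCountAlleles, the sketch below runs the EM iteration for a much simpler case than the program handles: unrelated individuals scored at a single two-allele locus where one allele is dominant. The class name, method name, counts and starting value are all assumptions made for this example and are not taken from the jar file.

```java
// Gene counting (EM) sketch for ONE two-allele locus in UNRELATED individuals.
// Assumption: allele A is dominant, so AA and Aa are indistinguishable. This is
// not the pedigree-based calculation that GeneCountAlleles performs.
class GeneCountSketch {

    /** Returns the estimated frequency of the recessive allele a. */
    static double estimateRecessiveFreq(int nDominant, int nRecessive, int iterations) {
        int n = nDominant + nRecessive;
        double q = 0.3; // arbitrary starting value strictly between 0 and 1
        for (int i = 0; i < iterations; i++) {
            // E-step: P(Aa | dominant phenotype) = 2pq / (p^2 + 2pq) = 2q / (1 + q)
            double expectedCarriers = nDominant * (2.0 * q / (1.0 + q));
            // Expected number of a alleles: one per carrier, two per aa individual.
            double expectedA = expectedCarriers + 2.0 * nRecessive;
            // M-step: the new estimate is the expected allele count over all alleles.
            q = expectedA / (2.0 * n);
        }
        return q;
    }

    public static void main(String[] args) {
        double q = estimateRecessiveFreq(75, 25, 200);
        System.out.printf("estimated frequency of a: %.4f%n", q);
    }
}
```

With 75 dominant and 25 recessive individuals the iteration settles at the analytic maximum likelihood value sqrt(25/100) = 0.5.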
Linkage analysis programs
-
TwoPointLods
a program for calculating simple two-point lod scores on
a grid of values for the recombination fraction.
-
MaxTwoPointLods
a program for finding the maximum lod score. Note that the search
includes values of the recombination fraction between 0.5 and 1.
-
McLink
a program for calculating multilocus linkage statistics in
extended pedigrees using Markov chain Monte Carlo integration.
There is an option to run assuming linkage disequilibrium between
the markers, which
can be specified as a model output from HapGraph.
As this is a Markov chain Monte Carlo implementation with
unknown mixing properties, it may not give reliable results in
all cases. This program is provided primarily for those
who want to experiment with MCMC pedigree analysis.
-
McLinkLD
a program that combines McLink and HapGraph. It iteratively
updates inheritance states in a pedigree and the graphical model for linkage
disequilibrium, giving, in effect, linkage statistics
averaged over the estimated linkage disequilibrium models.
This is very computationally intensive. If you can estimate
a linkage disequilibrium model using HapGraph and input it
to McLink, that is probably a more tractable solution.
As this is a Markov chain Monte Carlo implementation with
unknown mixing properties, it may not give reliable results in
all cases. This program is provided primarily for those
who want to experiment with MCMC pedigree analysis.
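To make the idea of a lod score grid concrete, here is a minimal sketch for the textbook case of n phase-known, fully informative meioses of which r are recombinant. TwoPointLods and MaxTwoPointLods work on general pedigrees, which this sketch does not attempt. The class and method names are hypothetical.

```java
// Two-point lod scores on a grid of recombination fractions.
// Assumption: n phase-known, fully informative meioses with r recombinants.
// This is a textbook simplification, not the pedigree calculation in the jar.
class LodGridSketch {

    // k * log10(x), with the convention that a zero count contributes nothing.
    static double term(int k, double x) {
        return k == 0 ? 0.0 : k * Math.log10(x);
    }

    /** lod(theta) = log10[ theta^r (1 - theta)^(n - r) / 0.5^n ]. */
    static double lod(int recombinants, int meioses, double theta) {
        return term(recombinants, theta)
             + term(meioses - recombinants, 1.0 - theta)
             - meioses * Math.log10(0.5);
    }

    /** Grid point in [0, 1], in steps of 1/steps, with the largest lod score. */
    static double argMaxTheta(int recombinants, int meioses, int steps) {
        double best = Double.NEGATIVE_INFINITY;
        double bestTheta = 0.0;
        for (int i = 0; i <= steps; i++) {
            double theta = (double) i / steps;
            double value = lod(recombinants, meioses, theta);
            if (value > best) {
                best = value;
                bestTheta = theta;
            }
        }
        return bestTheta;
    }

    public static void main(String[] args) {
        int r = 2, n = 10;
        for (int i = 0; i <= 10; i++) {
            double theta = i / 10.0;
            System.out.printf("theta=%.1f lod=%.3f%n", theta, lod(r, n, theta));
        }
        System.out.println("grid maximum at theta = " + argMaxTheta(r, n, 100));
    }
}
```

Because the grid runs over the whole interval from 0 to 1, data with more recombinants than non-recombinants (for example r = 8, n = 10) give a maximum above 0.5, which mirrors the note on the search range of MaxTwoPointLods.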
Haplotyping programs
- HapGraph
a program for fitting a graphical model for linkage disequilibrium
to haplotype data, and a general graphical model fitting program.
HapGraph now estimates graphical models from genotype data. It also
estimates haplotype frequencies and reconstructs phase.
- HaploFreqs
a program for listing the haplotypes and their frequencies according
to a graphical model for linkage disequilibrium.
- GCHap
a program for calculating maximum likelihood estimates of
haplotype frequencies from a sample of genotyped individuals.
This uses a staged gene counting, or EM, method starting
with a small number of loci and adding one at each stage.
- ApproxGCHap
a program for calculating rough maximum likelihood estimates of
haplotype frequencies from a sample of genotyped individuals.
It is the same as GCHap except that to save time and space,
haplotypes with low frequency are eliminated
at each stage.
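The gene counting step used for haplotype frequencies can be sketched for the smallest interesting case: two biallelic loci typed in unrelated individuals, where only the double heterozygotes have ambiguous phase. This is plain EM on a fixed pair of loci. It does not show the staged addition of loci used by GCHap or the pruning used by ApproxGCHap, and every name in it is an assumption made for the example.

```java
// EM (gene counting) for haplotype frequencies at TWO biallelic loci.
// Assumption: unrelated individuals, alleles coded 0/1, and counts[g1][g2]
// holds the number of individuals carrying g1 copies of allele 1 at the first
// locus and g2 copies at the second. Not the staged method used by GCHap.
class HapEmSketch {

    /** Returns frequencies of haplotypes 00, 01, 10, 11 (index = 2*a + b). */
    static double[] estimate(int[][] counts, int iterations) {
        int total = 0;
        for (int[] row : counts) {
            for (int c : row) {
                total += c;
            }
        }
        double[] f = {0.25, 0.25, 0.25, 0.25}; // uniform starting point
        for (int it = 0; it < iterations; it++) {
            double[] expected = new double[4];
            for (int g1 = 0; g1 <= 2; g1++) {
                for (int g2 = 0; g2 <= 2; g2++) {
                    int n = counts[g1][g2];
                    if (n == 0) {
                        continue;
                    }
                    if (g1 == 1 && g2 == 1) {
                        // E-step: double heterozygotes are either 00/11 or 01/10.
                        double cis = f[0] * f[3];
                        double trans = f[1] * f[2];
                        double w = cis / (cis + trans);
                        expected[0] += n * w;
                        expected[3] += n * w;
                        expected[1] += n * (1.0 - w);
                        expected[2] += n * (1.0 - w);
                    } else {
                        // Phase is unambiguous: read the two haplotypes directly.
                        int a1 = g1 / 2, a2 = (g1 + 1) / 2;
                        int b1 = g2 / 2, b2 = (g2 + 1) / 2;
                        expected[2 * a1 + b1] += n;
                        expected[2 * a2 + b2] += n;
                    }
                }
            }
            // M-step: expected haplotype counts divided by the number of haplotypes.
            for (int h = 0; h < 4; h++) {
                f[h] = expected[h] / (2.0 * total);
            }
        }
        return f;
    }

    public static void main(String[] args) {
        int[][] counts = new int[3][3];
        counts[0][0] = 25;
        counts[1][1] = 50;
        counts[2][2] = 25;
        double[] f = estimate(counts, 100);
        System.out.printf("00=%.3f 01=%.3f 10=%.3f 11=%.3f%n", f[0], f[1], f[2], f[3]);
    }
}
```

In the example data the unambiguous homozygotes all carry haplotypes 00 or 11, so the iteration pushes the double heterozygotes towards the 00/11 phase and the estimates approach 0.5 for 00 and 11 and 0 for the other two. The sketch assumes at least one individual is supplied.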
Viewing programs
-
ViewGraph
a general program for viewing and editing graphs.
-
ViewPed
a program for viewing pedigrees when the input is in
the form of a standard triplet file.
-
ViewLinkPed
a program for viewing pedigrees when the input is in
the form of a LINKAGE pedigree file.
The top level programs, which can all be run by typing something like
% java ClassName input1 input2 ...
are described fully in their class descriptions in the "Unnamed" package of the
Javadocs web pages.
The new programs are written in Java version 1.5,
so you need an appropriate Java virtual machine to run them.
Note that several of the programs are computationally demanding and
may take considerable time to run. If they throw an error indicating
that there was insufficient memory, increase the initial and maximum
heap sizes with the -Xms and -Xmx options to java.
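A small helper along the following lines can confirm what heap the virtual machine was actually given before a long run is started. It is not part of the jar file. The class name is hypothetical, and it uses only the standard Runtime methods maxMemory and totalMemory.

```java
// Prints the heap limits of the running JVM so that the effect of the
// -Xms and -Xmx options can be checked. Hypothetical helper, not in the jar.
class HeapCheck {

    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory: the most heap the JVM will try to use (governed by -Xmx).
        System.out.println("max heap (MB):     " + toMegabytes(rt.maxMemory()));
        // totalMemory: the heap currently reserved (starts near the -Xms value).
        System.out.println("current heap (MB): " + toMegabytes(rt.totalMemory()));
    }
}
```

It could be run as, for example, % java -Xms256m -Xmx1024m HeapCheck, where the two sizes are illustrative values rather than recommendations for any particular program.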