The VIBES program, developed by John Winn (2004), is a general-purpose tool for modelling Bayesian networks using variational Bayes approximations. For an introduction to VIBES, see the SourceForge page. VIBES has been frozen at version 2.0.1 since 2005. The objective of this development project is to extend VIBES to make it usable for some types of problem that arise especially in biomedical research.
The program has been fixed to compile with the latest Sun JDK.
The ConstantNode class has been fixed to implement Observable, so that constants can now be specified as observed data.
Other extensions in progress:
allow observed data to be specified as ragged arrays of discrete nodes;
develop a script-based interface for specifying the model.
For an introduction to variational Bayes methods, see chapter 33 of David MacKay's book on inference and learning algorithms. Variational Bayes is an approximate method for fitting a Bayesian probability model to observed data. The usual approximation is a mean field approximation, in which the posterior distribution is approximated by a separable distribution that is adjusted so as to minimize the Kullback-Leibler divergence (variational free energy) between the approximating distribution and the posterior.
This yields a lower bound on the marginal likelihood (evidence) for the model given the data. This lower bound can be used for model choice.
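The relationship between the Kullback-Leibler divergence and the lower bound on the evidence can be written out explicitly. The notation below is an assumption, not taken from the VIBES documentation: D stands for the observed data, \theta for all unobserved variables, and Q for the separable approximating distribution.

```latex
% Assumed notation (not from the original text):
%   D = observed data, \theta = unobserved variables,
%   Q = separable (mean field) approximating distribution.
\begin{align}
Q(\theta) &= \prod_i Q_i(\theta_i) \\
\ln P(D) &= \mathcal{L}(Q) + \mathrm{KL}\bigl(Q \,\|\, P(\cdot \mid D)\bigr) \\
\mathcal{L}(Q) &= \int Q(\theta) \ln \frac{P(D, \theta)}{Q(\theta)} \, d\theta \\
\mathrm{KL}\bigl(Q \,\|\, P(\cdot \mid D)\bigr) &= \int Q(\theta) \ln \frac{Q(\theta)}{P(\theta \mid D)} \, d\theta \;\ge\; 0
\end{align}
```

Because the divergence is non-negative, \mathcal{L}(Q) can never exceed \ln P(D). Since \ln P(D) does not depend on Q, minimizing the divergence is the same as maximizing \mathcal{L}(Q).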
The advantages of the variational Bayes approach are:
in comparison with other approximate methods for evaluating the evidence, such as BIC or Laplace approximations, the approximation obtained by variational Bayes is more accurate.
in comparison with methods that evaluate the evidence directly (such as MCMC with thermodynamic integration), the computational burden of evaluating a lower bound on the evidence with variational Bayes is far lower.
by specifying appropriate priors, model fitting can be combined with automatic relevance determination, so that the iteration prunes away those features of the model that are not supported by the data.
Variational Bayes methods are useful when you want to learn the structure of a graphical model. In contrast, Markov chain Monte Carlo methods (implemented in programs such as BUGS) are most useful when you are able to specify the model structure, and just want to learn the model parameters.
VIBES is written in Java. The code for VIBES is in directory jmw39, and uses the libraries com and VisualNumerics. The main function is in the file jmw39/app/vibes/VibesCore.java.
Your working directory should contain the VIBES source in subdirectory jmw39, the libraries com and VisualNumerics as subdirectories, and the manifest (for creating a jar archive) in subdirectory META-INF. An archived directory containing these subdirectories, together with a doxygen config file (for generating code documentation), can be downloaded from here. The instructions below assume that you are working in Linux.
Before building, you have to set the CLASSPATH environment variable to include the working directory and the full paths to the files javaws.jar and tools.jar. To do this, type something like:
export CLASSPATH=.:/usr/lib/jvm/ia32-java-6-sun-1.6.0.03/jre/lib/javaws.jar:/usr/lib/java-6-sun-1.6.0.03/lib/tools.jar
Edit this line as required to specify the correct paths to the
files javaws.jar and tools.jar on your system.
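To avoid editing one long line by hand, the CLASSPATH can be assembled from the two directories that hold the jar files. The sketch below is only an illustration: the function name make_classpath and the variables JRE_LIB and JDK_LIB are hypothetical names, not part of VIBES or the JDK.

```sh
#!/bin/sh
# Sketch: assemble a CLASSPATH for building VIBES.
# make_classpath, JRE_LIB and JDK_LIB are hypothetical names used for illustration.

# $1: directory containing javaws.jar
# $2: directory containing tools.jar
# Prints a classpath made of the working directory plus the two jar files.
make_classpath() {
    printf '.:%s/javaws.jar:%s/tools.jar\n' "$1" "$2"
}

# Example use (set JRE_LIB and JDK_LIB to the directories on your system):
#   export CLASSPATH="$(make_classpath "$JRE_LIB" "$JDK_LIB")"
```

Set the two variables once, for example in your shell start-up file, and the export line then needs no further editing.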
To force a rebuild of all source files, delete all .class files in the jmw39 subdirectory by typing
find ./jmw39 -type f -name "*.class" -exec rm {} \;
To build the .class files, type
javac jmw39/app/vibes/VibesCore.java
To run the program from the .class files, type
java jmw39/app/vibes/VibesCore
You can then build a jar archive from the .class files, by typing
jar cvfm vibesdev.jar META-INF/MANIFEST.MF jmw39 com VisualNumerics
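The clean, compile and package steps above can be collected into a single helper. This is a minimal sketch, assuming it is run from the working directory described earlier and that CLASSPATH is already set; the function names clean_classes and rebuild_vibes are hypothetical.

```sh
#!/bin/sh
# Sketch of a rebuild helper for VIBES.
# clean_classes and rebuild_vibes are hypothetical names used for illustration.
# Assumes the current directory contains jmw39, com, VisualNumerics and META-INF,
# and that CLASSPATH has already been exported.

# Delete all compiled .class files under the directory given as $1.
clean_classes() {
    find "$1" -type f -name '*.class' -exec rm -f {} +
}

# Clean, compile and package; each step runs only if the previous one succeeded.
rebuild_vibes() {
    clean_classes ./jmw39 &&
    javac jmw39/app/vibes/VibesCore.java &&
    jar cvfm vibesdev.jar META-INF/MANIFEST.MF jmw39 com VisualNumerics
}
```

After sourcing the file, call rebuild_vibes to perform a full rebuild. Nothing is compiled until that function is called.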
VIBES reads data in Matlab binary format. Data in this format can be generated from Octave (a free program with a scripting language similar to Matlab's), or from R using the writeMat function, which is included in the R library R.matlab.
All data objects are specified as matrices of type "numeric". Scalars and vectors must be specified as
column matrices. Data objects of type "integer" must be converted to type "numeric"; in R you can use the function as.numeric().
For example, this R
script simulates a logistic regression model and saves the data in
logisticr.mat.
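library(R.matlab)
# Inverse logit (logistic) function
invlogit <- function(x) {
  1 / (1 + exp(-x))
}
N <- 1000      # number of observations
alpha <- 0.5   # intercept
beta <- 3      # slope
# Binary covariate; rbinom() returns integers, so convert to type numeric
x <- as.numeric(rbinom(N, 1, 0.5))
xbeta <- alpha + x * beta
invlogit.xbeta <- invlogit(xbeta)
# Binary outcome: one Bernoulli draw per observation (rbinom is vectorized)
y <- as.numeric(rbinom(N, 1, invlogit.xbeta))
# Save N, x and y as column matrices in Matlab binary format
writeMat(con="logisticr.mat", N=matrix(N, ncol=1), x=matrix(x, ncol=1), y=matrix(y, ncol=1), verbose=TRUE)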
Set a file association in your file browser for the .jar file extension to open with the command java -jar. Then you can start VIBES by right-clicking on the jar archive (or on a shortcut to it) in your file browser and opening it.
The model is specified in an XML file. This XML file is created by drawing the model with the graphical drawing tool and then saving it. The file logisticr.xml specifies a model for the simple logistic regression example above. To load this model, click on File-Open. You can then load the data file logisticr.mat by clicking on File-Load.
If you specify the size of a plate in the data file (as in this example where N is specified as 1000), this will overwrite any value previously specified as part of the model.
Discrete variables with M categories should be coded as integers between 0 and M-1 (and converted to type numeric before saving in Matlab format). VIBES stores each discrete value as a binary array of length M in which all elements except one are zero.
The prior on a Dirichlet node (probability vector) is specified as a scalar constant U. This is 1 plus the value of a single element in the Dirichlet parameter vector – thus to specify a Beta(0.5, 0.5) prior, specify U as 1.5.
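The binary-array coding of discrete variables described above can be illustrated with a small sketch. It is not VIBES code: the function name one_hot is hypothetical, and the choice that category k (counting from 0) sets element k of the array is an assumption, since the text only says that exactly one element is non-zero.

```sh
#!/bin/sh
# Sketch of the binary-array coding of a discrete value.
# one_hot is a hypothetical name; which element is set is an assumption.

# $1: category value, an integer between 0 and M-1
# $2: number of categories M
# Prints M space-separated values, with a 1 at position $1 (counting from 0).
one_hot() {
    i=0
    out=""
    while [ "$i" -lt "$2" ]; do
        if [ "$i" -eq "$1" ]; then bit=1; else bit=0; fi
        out="${out}${out:+ }${bit}"   # add a separating space after the first element
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}
```

For example, under this assumption a variable with three categories would have the value 1 represented by calling one_hot 1 3.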
To start iteration, click on the Init button, then the Start button.