VIBES development project

The VIBES program, developed by John Winn (2004), is a general-purpose tool for modelling Bayesian networks using variational Bayes approximations. For an introduction to VIBES, see the SourceForge page. VIBES has been frozen at version 2.0.1 since 2005. The objective of this development project is to extend VIBES to make it usable for some types of problem that arise especially in biomedical research.

Current status of development version

The program has been fixed to compile with the latest Sun JDK.

The ConstantNode class has been fixed to implement Observable, so that constants can now be specified as observed data.

Other extensions in progress:

allow observed data to be specified as ragged arrays of discrete nodes;

develop a script-based interface for specifying the model.

What is variational Bayes?

For an introduction to variational Bayes methods, see chapter 33 of David MacKay's book Information Theory, Inference, and Learning Algorithms. Variational Bayes is an approximate method for fitting a Bayesian probability model to observed data. The usual approximation is a mean field approximation, in which the posterior distribution is approximated by a separable distribution that is adjusted so as to minimize the Kullback-Leibler divergence between the approximating distribution and the true posterior (equivalently, to minimize the variational free energy). This yields a lower bound on the marginal likelihood (evidence) for the model given the data. This lower bound can be used for model choice.
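
As a sketch of why minimizing the divergence yields a lower bound, the standard decomposition of the log evidence can be written as follows. The symbols are assumptions introduced here for illustration and are not notation from the original page: D denotes the observed data, \theta the unobserved variables and parameters, q(\theta) the approximating distribution, and \mathcal{L}(q) the lower bound.

```latex
\ln p(D) = \mathcal{L}(q) + \mathrm{KL}\bigl(q \,\|\, p(\theta \mid D)\bigr),
\qquad
\mathcal{L}(q) = \int q(\theta) \ln \frac{p(D, \theta)}{q(\theta)} \, d\theta,
\qquad
\mathrm{KL}\bigl(q \,\|\, p(\theta \mid D)\bigr)
  = \int q(\theta) \ln \frac{q(\theta)}{p(\theta \mid D)} \, d\theta \;\ge\; 0 .
% Mean field (separable) form of the approximating distribution:
q(\theta) = \prod_i q_i(\theta_i)
```

Because the divergence is non-negative, \mathcal{L}(q) can never exceed \ln p(D). Since \ln p(D) does not depend on q, reducing the divergence is the same as raising \mathcal{L}(q), that is, lowering the free energy -\mathcal{L}(q).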

The advantages of the variational Bayes approach are:

  1. in comparison with other approximate methods for evaluating the evidence, such as BIC or Laplace approximations, the approximation obtained by variational Bayes is more accurate.

  2. in comparison with methods that evaluate the evidence directly (such as MCMC with thermodynamic integration), the computational burden of evaluating a lower bound on the evidence with variational Bayes is far lower.

  3. by specifying appropriate priors, model fitting can be combined with automatic relevance determination, so that the iteration prunes away those features of the model that are not supported by the data.

When are variational Bayes methods useful?

Variational Bayes methods are useful when you want to learn the structure of a graphical model. In contrast, Markov chain Monte Carlo methods (implemented in programs such as BUGS) are most useful when you are able to specify the model structure, and just want to learn the model parameters.

Building the development version from source code

VIBES is written in Java. The code for VIBES is in directory jmw39, and uses the libraries com and VisualNumerics. The main method is in the file jmw39/app/vibes/VibesCore.java.

Your working directory should contain the VIBES source in subdirectory jmw39, the libraries com and VisualNumerics as subdirectories, and the manifest (for creating a jar archive) in subdirectory META-INF. An archived directory containing these subdirectories, together with a doxygen config file (for generating code documentation), can be downloaded from here. The instructions below assume that you are working in Linux.

Before building, you have to set the CLASSPATH environment variable to include the working directory and the paths to the files javaws.jar and tools.jar. To do this, type something like

export CLASSPATH=.:/usr/lib/jvm/ia32-java-6-sun-1.6.0.03/jre/lib/javaws.jar:/usr/lib/java-6-sun-1.6.0.03/lib/tools.jar

Edit this line as required to specify the correct paths to the files javaws.jar and tools.jar on your system.

To force a rebuild of all source files, delete all .class files in the jmw39 subdirectory by typing

find ./jmw39 -type f -name "*.class" -exec rm {} \;

To build the .class files, type

javac jmw39/app/vibes/VibesCore.java

To run the program from the .class files, type

java jmw39.app.vibes.VibesCore

You can then build a jar archive from the .class files, by typing

jar cvfm vibesdev.jar META-INF/MANIFEST.MF jmw39 com VisualNumerics

Downloading the development version

The development version is available as a jar file (Java archive) from here.

Using VIBES

John Winn's original instructions are available here.
The additional notes below may be helpful, especially if you are not working in Matlab.

Data format

VIBES reads data in Matlab binary format. Data in this format can be generated from Octave (a free program with a scripting language similar to Matlab), or from R using the writeMat function, which is included in the R package R.matlab. All data objects are specified as matrices of type "numeric". Scalars and vectors must be specified as column matrices. Data objects of type "integer" must be converted to type "numeric"; in R you can use the function as.numeric().
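
The formatting rules above can be wrapped in a small helper. This is a minimal sketch in base R; the function name as_column_matrix is hypothetical and is not part of VIBES or R.matlab.

```r
# Hypothetical helper (not part of VIBES or R.matlab): coerce a scalar or
# vector to a column matrix of type "numeric" (double).
as_column_matrix <- function(v) {
  matrix(as.numeric(v), ncol = 1)
}

n_obs  <- as_column_matrix(5L)                     # integer scalar -> 1 x 1 matrix
counts <- as_column_matrix(c(2L, 0L, 1L, 1L, 2L))  # integer vector -> 5 x 1 matrix

storage.mode(counts)  # "double"
dim(counts)           # 5 1
```

Objects prepared this way can then be passed as named arguments to writeMat.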

For example, this R script simulates a logistic regression model and saves the data in logisticr.mat.

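library(R.matlab)

# inverse logit (logistic) function
invlogit <- function(x) {
  return(1 / (1 + exp(-x)))
}

N <- 1000 # number of observations
alpha <- 0.5 # intercept
beta <- 3 # regression coefficient
# rbinom returns type "integer", so convert to "numeric"
x <- as.numeric(rbinom(N, 1, 0.5))
xbeta <- alpha + x * beta # linear predictor
invlogit.xbeta <- invlogit(xbeta) # success probabilities
y <- numeric(N)
for (i in seq_along(xbeta)) {
  y[i] <- rbinom(1, 1, invlogit.xbeta[i])
}

# save all objects as numeric column matrices
writeMat(con = "logisticr.mat", N = matrix(N, ncol = 1), x = matrix(x, ncol = 1), y = matrix(y, ncol = 1), verbose = TRUE)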

Starting VIBES

Set a file association in your file browser so that files with the .jar extension open with the command java -jar. You can then start VIBES by right-clicking on the jar archive (or on a shortcut to it) in your file browser.

Model specification and loading data

The model is specified in an XML file. This XML file is created by drawing the model with the graphical drawing tool and then saving it. The file logisticr.xml specifies a model for the simple logistic regression example above. To load this model, click on File-Open. You can then load the data file logisticr.mat by clicking on File-Load.

If you specify the size of a plate in the data file (as in this example where N is specified as 1000), this will overwrite any value previously specified as part of the model.

Other quirks

The only way to specify the dimension of a node is to specify the name of a plate. You may have to create a dummy plate to allow the dimension of a node to be specified.

Discrete variables with M categories should be coded as integers between 0 and M-1 (and converted to type numeric before saving in Matlab format). VIBES stores each discrete value as a binary array of length M in which all elements except one are zero. 
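
The coding described above can be sketched in base R as follows. The variable names are hypothetical, and the indicator matrix at the end only illustrates the binary-array representation; it is not VIBES code and does not need to be saved, since the data file should contain the integer codes.

```r
# A categorical variable with M = 3 categories (hypothetical example data)
group <- factor(c("low", "high", "medium", "low"),
                levels = c("low", "medium", "high"))
M <- nlevels(group)

# Factor levels are numbered from 1 in R, so subtract 1 to get codes 0..M-1.
# as.numeric() also gives the required type "numeric".
codes <- as.numeric(group) - 1
codes_matrix <- matrix(codes, ncol = 1)  # column matrix, ready for writeMat

# Illustration only: one row per observation, one column per category,
# with a single 1 in the column of the observed category.
indicator <- outer(codes, 0:(M - 1), FUN = "==") * 1
```

Here the second observation ("high") is coded as 2, and its row of the indicator matrix is 0 0 1.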

The prior on a Dirichlet node (probability vector) is specified as a scalar constant U. This is 1 plus the value of a single element in the Dirichlet parameter vector – thus to specify a Beta(0.5, 0.5) prior, specify U as 1.5.
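
The rule above amounts to a one-line calculation. The sketch below assumes a symmetric prior in which every element of the Dirichlet parameter vector has the same value alpha; the function name dirichlet_u is hypothetical and is not part of VIBES.

```r
# Hypothetical helper: value of the scalar constant U for a symmetric
# Dirichlet prior whose parameter vector has every element equal to alpha.
dirichlet_u <- function(alpha) {
  stopifnot(length(alpha) == 1, alpha > 0)
  1 + alpha
}

u_beta_half <- dirichlet_u(0.5)  # Beta(0.5, 0.5) prior
u_beta_half                      # 1.5
```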

Running the program

To start iteration, click on the Init button, then the Start button.