Department of Mathematics, University of Bristol, Bristol BS8 1TW, UK
In this talk I will review current progress and challenges in the area of adaptive Monte Carlo methods. These are simple and natural extensions of existing Monte Carlo methods which have the potential to further help users of generic MCMC software to design efficient algorithms. I will briefly discuss some theoretical aspects of these methods and illustrate their simplicity of implementation on several examples.
Use of WinBUGS for Screening Process Improvement pdf
Benoit H.L. Beck
Lilly Services S.A., Rue Granbonpré, 11, B-1348 Mont-Saint-Guibert, BELGIUM.
Non-clinical drug development investigations essentially consist of screening large numbers of compounds in order to locate one or a few candidates that deserve further clinical consideration. During the lead optimization phase, chemical variations of a lead compound are pushed through a directed network of screening assays with the goal of discovering, as quickly as possible, promising candidates to send to clinical investigation. Such a lead optimization network mixes in vitro, ex vivo and in vivo experiments, following a structure dictated by ethical considerations and throughput constraints. For each assay in the process map, the level of information generated on a new, unknown compound, and used to decide whether or not to push it to the next stage, depends on a pre-fixed parameter often characterized by a sample size, e.g., the number of subjects used to estimate a 50 percent efficacy dose in an in vivo model. The challenge is then to optimize the selection of these parameters in such a way that the average screening trial is as inexpensive, and thus as short, as possible, while remaining accurate enough. As feedback loops exist and are used to bias compound generation following a QSAR (Quantitative Structure Activity Relationship), it is important to revisit the selection of these parameters during the screen.
In this talk, after a short introduction to the drug screening process, we will present some general considerations about sample size selection for screening experiments and will show how WinBUGS can be used to revisit the selection on the fly.
Cutting 'Feedback' in Bayesian Full Probability Models pdf
Department of Epidemiology and Public Health, Imperial College Faculty of Medicine, St Mary's Campus, Norfolk Place, London W2 1PG, UK
In many regression applications, the covariates of interest are themselves the output of a modelling exercise. Bayesian full probability models, estimated using MCMC, provide a natural framework for simultaneously modelling both the covariate distribution and the covariate-response relationship through shared parameters.
However, while theoretically sound, such an approach may have consequences that do not appear intuitively attractive. For example, there is feedback from the response data when estimating the covariate distribution, and this may dominate any information coming from the covariate data itself. A standard alternative is to carry out sequential analysis of the two models, conditioning on fixed covariate estimates in the covariate-response model. This prevents feedback but ignores uncertainty in the covariate estimates.
In this talk, I will discuss how the sequential approach can be implemented within a full MCMC analysis to cut feedback while still acknowledging uncertainty. This is achieved through a simple adjustment to the sampling algorithm, and, for example, can be implemented in WinBUGS using the 'cut' function. In hierarchical models, feedback can be cut at various levels, which will be illustrated using an application in population pharmacokinetic-pharmacodynamic (PKPD) modelling. Comparisons will be drawn with multiple imputation methods.
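The "cut" idea can be sketched outside WinBUGS as well. The following toy Gibbs sampler is purely illustrative (the model, priors, and numbers are ours, not the PKPD application): a latent covariate xi is measured noisily by x, and a response y depends on xi through a slope beta. With cut=True, the update for xi ignores the response likelihood, which is the "simple adjustment" to the sampling algorithm; the update for beta still conditions on the current draw of xi.

```python
import random

random.seed(1)

# Toy model (invented for illustration, all residual variances fixed at 1):
#   xi   ~ N(0, 10^2)               prior on the true covariate
#   x_j  ~ N(xi, 1), j = 1..m       covariate measurements
#   y_i  ~ N(beta * xi, 1), i=1..n  responses
#   beta ~ N(0, 10^2)               prior on the slope
m, n = 5, 20
xi_true, beta_true = 2.0, 1.5
x = [random.gauss(xi_true, 1.0) for _ in range(m)]
y = [random.gauss(beta_true * xi_true, 1.0) for _ in range(n)]

def gibbs(n_iter=2000, cut=False):
    xi, beta = 0.0, 0.0
    draws = []
    for _ in range(n_iter):
        # Conjugate normal update for xi.
        prec = 1 / 100 + m            # prior precision + covariate data
        mean_num = sum(x)
        if not cut:                   # full Bayes: response feeds back into xi
            prec += n * beta ** 2
            mean_num += beta * sum(y)
        xi = random.gauss(mean_num / prec, (1 / prec) ** 0.5)
        # Update beta given the current xi (this feedback is kept).
        bprec = 1 / 100 + n * xi ** 2
        beta = random.gauss(xi * sum(y) / bprec, (1 / bprec) ** 0.5)
        draws.append((xi, beta))
    return draws

cut_draws = gibbs(cut=True)
full_draws = gibbs(cut=False)
# Under the cut, the posterior for xi is driven by the x data alone,
# yet its uncertainty still propagates into the draws of beta.
xi_cut = sum(d[0] for d in cut_draws) / len(cut_draws)
```

The difference from multiple imputation is visible in the code: the covariate is redrawn at every iteration rather than fixed at a point estimate, so uncertainty is acknowledged even though feedback is cut.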
WinBUGS for Comparative Genomics Using Bayesian Calibration and Prediction Models pdf
Department of Mathematics and Statistics, Lancaster University, UK
In order to address biological questions more fully and to extract more information from this wealth of data, researchers require tools that will allow them to integrate different datasets in a dynamic, hypothesis-driven fashion and to analyze them within a biologically meaningful framework.
For model organisms, sequence-based physical maps form the basis for further in-depth study and understanding of the respective genome. For the much larger number of other organisms being studied, such a well-characterised map usually does not exist. Instead, several partial maps of various types may be available, e.g. genetic maps, landmark-based physical maps and clone-based maps. Typically these have been created by various groups over a period of time, for a wide range of purposes and based on diverse mapping populations. Each may contain valuable information for that particular species, but possibly none is complete enough to form a comprehensive and reliable basis for genome study of that species.
We are utilising WinBUGS to implement a Bayesian graphical network representation of the problem. The idea behind applying such a model to the genomic data is that there is a true physical map and all the partial maps provide a view of this true map. The integrated map, which is far richer than any single constituent map, can be used to search for conserved regions across species. Our initial explorations show promise for carrying out refined comparative genomics queries.
Unified Analytic and Computational Methods for Population Pharmacokinetic Analysis of Single Dose Data pdf
M. Suzette Blanchard(1), Vincent J. Carey(2), Edmund V. Capparelli(3)
(1) Dept. of Biostatistics City of Hope National Medical Center and Beckman Research Institute 1500 East Duarte Road, Duarte, CA 91010-3000
(2) Harvard Medical School Channing Laboratory 181 Longwood Ave Boston MA 02115 USA
(3) Division of Clinical Pharmacology and Developmental Therapeutics Schools of Medicine and Pharmacy & Pharmaceutical Sciences University of California, San Diego 92103 USA
For optimal interpretability of pharmacokinetic data, pharmacologists have access to extensive research in nonlinear models based on Bayesian and frequentist frameworks. The frequentist paradigm has evolved trustworthy approaches to model selection. Bayesian methods provide full distributions for all parameters, where estimates using Markov chain Monte Carlo methods are exact and confidence limits achieve the defined coverage. Currently, implementation of multiple estimation methods using the available software is unwieldy, often requiring separate datasets for each software program. The PKtools library of functions accepts a unified set of input variables to NLME, NONMEM, and WinBUGS. Statistical and graphical data summaries, tables of comparative results, and output datasets are returned in a standard format. PKtools thus facilitates comparative and hybrid analyses and allows uniform application of diagnostic graphical assessments for each of the major methods.
Variable Smoothing in Bayesian Spatial Modelling pdf
Mark J Brewer
Biomathematics & Statistics Scotland, Macaulay Institute, Craigiebuckler, Aberdeen, AB15 8QH, U.K.
We introduce an adapted form of the Markov random field (MRF) for spatial smoothing with small area data first introduced in Besag et al. (1991) - the BYM scheme. We relax the restriction of having a single smoothing parameter controlling smoothness over the entire map, a constraint which may or may not be appropriate. The BYM MRF is defined in terms of differences of random effects, and hence we define the variance on each pairwise difference to be the sum of "contributions" from each area - regarding these contributions as variable smoothing parameters of the scheme. We can define conditional moments of the spatial random effects for this new scheme, as with the BYM original. We briefly outline conditions for ensuring propriety of the resulting posterior distributions, and illustrate the methodology via simulations and applications using WinBUGS and OpenBUGS on a Linux Beowulf cluster.
Reference: Besag J., York J. and Mollié A. (1991) Bayesian Image Restoration, with Two Applications in Spatial Statistics, Annals of the Institute of Statistical Mathematics, 43, 1-59, with discussion.
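In our notation (which need not match the authors'), the construction described above can be sketched as follows: the single smoothing parameter of the BYM pairwise-difference prior is replaced by per-area contributions whose sum controls each difference, which then determines the conditional moments.

```latex
% Standard BYM prior: one smoothing parameter \sigma^2 for the whole map
p(u \mid \sigma^2) \;\propto\;
  \exp\Big\{ -\tfrac{1}{2\sigma^2} \sum_{i \sim j} (u_i - u_j)^2 \Big\}

% Variable-smoothing variant: the variance of each pairwise difference is
% the sum of "contributions" from the two areas involved
p(u \mid \sigma_1^2,\dots,\sigma_n^2) \;\propto\;
  \exp\Big\{ -\tfrac{1}{2} \sum_{i \sim j}
             \frac{(u_i - u_j)^2}{\sigma_i^2 + \sigma_j^2} \Big\}

% Implied conditional moments, with weights w_{ij} = 1/(\sigma_i^2+\sigma_j^2):
E(u_i \mid u_{-i}) = \frac{\sum_{j \sim i} w_{ij}\, u_j}{\sum_{j \sim i} w_{ij}},
\qquad
\operatorname{Var}(u_i \mid u_{-i}) = \frac{1}{\sum_{j \sim i} w_{ij}}
```

Setting all the contributions equal recovers the usual single-parameter BYM conditionals.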
Using R and BRugs in Bayesian Clinical Trial Design and Analysis pdf
Bradley P. Carlin
Division of Biostatistics, School of Public Health, University of Minnesota, MN, USA
Thanks in large part to the rapid development of Markov chain Monte Carlo (MCMC) methods and software for their implementation, Bayesian methods have become ubiquitous in modern biostatistical analysis. In submissions to the U.S. FDA Center for Devices and Radiological Health, where data on new devices are often scanty but researchers typically have access to large historical databases, Bayesian methods have been in common use for over a decade. However, statisticians and regulators on the drug side of FDA are also now coming to appreciate the value of these methods, especially their ability to combine information from separate but related sources, reduce sample size, and directly measure the effects of interest while protecting overall error rates.
This talk will review how a variety of Bayesian clinical trial design and analysis methods can be implemented in R and BRugs, the version of the OpenBUGS package callable from within R. In particular, we will illustrate how a Bayesian might think about "power" when designing a trial, and how a Bayesian procedure may be calibrated to guarantee good long-run frequentist performance (i.e., low Type I and II error rates), a subject of keen interest to the FDA. The presentation is intended to be accessible to a broad audience, and to generate discussion regarding areas requiring further development before Bayesian clinical trial design and analysis can be realistically considered for routine adoption by practitioners.
Interventions for ADHD pdf
E. Michael Foster, Berkant Camdereli, Serhan Ziya
Department of Operations Research & School of Public Health, University of North Carolina, Chapel Hill, NC, U.S.A.
This paper attempts to determine an optimal policy for treating children with Attention-Deficit/Hyperactivity Disorder (ADHD). The policy will be used to decide when to intervene and what type of intervention should be used for the treatment of a child with ADHD. Since we are only considering children, the interventions will happen between the ages of 4 and 18. There are three types of intervention: medical treatment, behavior therapy, or both together; the decision maker may also decide not to intervene at all.
We formulate this problem as a Markov Decision Process (MDP) model in which the state of the child is described by the level of ADHD symptoms. First, we need to estimate the parameters of the MDP model. Afterwards, we will analyze the effects of different objectives on the optimal policy. One objective is minimizing the amount of money spent on the treatment of the child until s/he reaches age 18, while ensuring that the child's disorder is reduced to some acceptable level. Another is maximizing the time that the child lives a normal life, given a limited amount of money to spend on treatments. Finally, we will try to derive some structural properties of these different models.
Estimating Gene Expression Intensity Using Multiple Scanned Microarrays pdf
Dept. of Mathematics and Statistics, P.O. Box 68, FIN-00014, University of Helsinki, Finland
We propose a method to improve the expression data from cDNA microarrays by making multiple scans at varying scanner sensitivities. A Bayesian latent intensity model is introduced for such data to estimate the true expression of genes. The method both improves, across all ranges, the accuracy with which signals can be measured and extends the dynamic range of measured gene expression at the high end. Our method is generic and can be applied to data from any organism, for imaging with any scanner, and for extraction with any image analysis software. Results from various real data sets illustrate a more precise estimation of the true expression of genes than can be achieved by standard methods using only a single scan.
A Time Effect in a Social Network from a Bayesian Perspective pdf
Susan Adams, Nathan Carter, Charles Hadlock, Dominique Haughton, George Sirbu
Bentley College, Waltham, Massachusetts, U.S.A
In this paper, we propose a methodology for examining differences between statistics of a social network at two distinct points in time from a Bayesian point of view. The problem has been of interest for some time in the social networks community, because it is quite difficult to test whether differences over time between, for instance, overall network connectivities (tendencies to make links) are significant. Several issues make this problem challenging: links in a social network tend to be dependent, and the networks at the two different points in time are likely to be dependent as well. This implies that bootstrapping a social network to address this problem, for example, may be impractical.
By expanding on a previously published Bayesian version of the so-called p1 model for social networks with random effects (which allows for dependence between the edges of the networks), we are able to obtain from WinBUGS posterior distributions for the difference in connectivity over time, and for the correlation (assumed to be the same for all actors) between each actor's connectivities in the network at both points in time.
We illustrate our methods with the case of a social network of collaborations (joint publications) between departments of a business university where trans-disciplinary work was at some point in time actively promoted. Our methods allow us to compare the tendency to make collaborative links across departments before and after the promotion activities of the university.
Utilizing WinBUGS to Analyze Extremes with Application to Climatology pdf
Thomas H. Jagger
Department of Geography, Florida State University, Tallahassee, FL, U.S.A.
This talk demonstrates the use of two new stochastic nodes, dGEV and dGPD. We illustrate the use of dGPD in a regression model for analyzing extreme tropical storm winds along the US coastline. We show that the Bayesian posterior means and variances of the parameters, obtained using either the new stochastic node or the ones trick within OpenBUGS, are consistent with those from maximum likelihood estimation.
- New stochastic nodes dGEV and dGPD
  - Sample code and uses
  - Restrictions on the GPD
- Examining extreme hurricane winds
  - Problem and initial data analysis
  - Process model, POT and full model
  - Climate covariates
  - Prior and likelihood specifications
  - Maximum likelihood methods (Stuart Coles)
  - OpenBUGS model using the ones trick
  - OpenBUGS model using dGPD stochastic nodes
  - Results: posterior densities, return level estimation
- Open issues (may be expanded before talk)
  - Measurement error and fixed-threshold issue with the POT model
- Future work
  - Bivariate extreme value distributions
  - Mixture distributions of GPD
  - Alternative parameterizations of GEV and GPD
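As background for the comparison between the dGPD node and the ones trick, the following pure-Python sketch (function names and numbers are ours, for illustration only) shows the GPD log-density and the Bernoulli-ones scaling that BUGS users employ when a distribution is not built in.

```python
import math

def gpd_logpdf(y, xi, sigma, mu=0.0):
    """Log-density of the generalised Pareto distribution GPD(mu, sigma, xi).

    xi is the shape, sigma > 0 the scale, mu the threshold/location;
    xi = 0 gives the exponential special case.
    """
    z = (y - mu) / sigma
    if z < 0 or (xi < 0 and z > -1.0 / xi):
        return float("-inf")          # outside the support
    if xi == 0.0:
        return -math.log(sigma) - z
    return -math.log(sigma) - (1.0 / xi + 1.0) * math.log1p(xi * z)

# The "ones trick": when BUGS lacks a distribution, one supplies a
# pseudo-observation  ones[i] ~ dbern(p[i])  with ones[i] = 1 and
# p[i] = L_i / C, where L_i is the desired likelihood contribution and
# C is any constant large enough to keep p[i] in (0, 1).  In Python terms:
def ones_trick_p(y, xi, sigma, mu=0.0, log_C=10.0):
    return math.exp(gpd_logpdf(y, xi, sigma, mu) - log_C)

p = ones_trick_p(35.0, 0.1, 10.0, mu=33.0)   # an exceedance over threshold 33
assert 0.0 < p < 1.0
```

Since the scaling constant C cancels in the posterior, both routes target the same distribution, which is why agreement with maximum likelihood estimates is the expected check.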
Eliciting Expert Priors for Generalised Linear Models Inside the WinBUGS Framework pdf
Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YF, United Kingdom
Arising from a need to quantify expert knowledge about fauna and flora species distributions in Queensland, ELICITOR was developed to elicit prior distributions for a logistic regression model graphically and rapidly inside WinBUGS, so that with (independent) expert knowledge and data on hand it is possible to go from prior to posterior with relative ease. The software gives immediate feedback to the expert about their assessments, and offers tools for considering probabilities. From the graphs created with the expert, the corresponding WinBUGS model code is automatically generated. Constructed as an add-on to WinBUGS, the code files simply need to be copied to the WinBUGS folder, and ELICITOR then appears as additional menu items the next time WinBUGS is started. It is in the process of being extended to other Generalised Linear Models, with public (and extensible) link and prior specifications for the knowledgeable WinBUGS user. Comments and suggestions on further improvements and applications are very welcome.
Using BRugs in R to Combine Multiple Microarray Scans pdf
Tanzy Love (1), Alicia Carriquiry (2)
(1) Dept. of Statistics, Carnegie Mellon University, Pittsburgh, PA, U.S.A.
(2) Department of Statistics, Iowa State University, Ames, IA, U.S.A.
Microarray data are subject to multiple sources of measurement error. One source of potentially significant error is the settings of the instruments (laser and sensor) that are used to obtain the measurements of gene expression. Because `optimal' settings may vary from slide to slide, operators typically scan each slide multiple times and then choose the reading with the fewest over-exposed and under-exposed spots. We have proposed in previous work a hierarchical modeling approach to estimating gene expression that combines all available readings on each spot. The basic premise is that all readings contribute some information about gene expression and that, after appropriate re-scaling, it is possible to combine all readings into a single estimate. However, the estimation of this model is complicated by censoring (both above and below) in the data; in a few cases out of 12160 spots, all of the data for a spot are censored. In order to analyze the whole microarray experiment, this modeling must be carried out on each dye channel of each slide measured. In the example used, we have 60 data sets to be preprocessed, modeled, normalized, and analyzed. BRugs is used within R to fit the modeling step into the general framework.
Reversible Jump: Application to Genetic Association Studies pdf
Division of Epidemiology, Public Health and Primary Care, Medicine, Imperial College, London
In this talk I will discuss several new features of the BUGS language designed to facilitate the analysis of genetic association studies, but which also have general applicability. In particular, I will discuss reversible jump methods for variable selection and automatic curve fitting as well as the use of degenerate distributions for implementing latent variable approaches to probit regression.
BUGS in Fish Stock Assessment: Challenges and Potential Remedies pdf
Department of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
The complexity of fisheries creates a challenge not only for the Bayesian statistician but especially for BUGS as the tool for doing the actual computation. In this talk I will describe some of the problems that can be encountered when using BUGS for fish stock assessment and suggest potential procedures for avoiding or reducing those problems. These procedures include sequential analysis, approximation of the likelihood function, reparameterisation of the model and the use of BUGS modules.
Analysis of Histogram Data and Remarks on Rounding Errors in WinBUGS. pdf
Bird Migration Research Station, University of Gdansk, Poland
Two kinds of bird migration and bird orientation data come as counts: the number of birds caught per day, and the number of scratches in each sector of an orientation cage experiment.
A similar model was applied in both cases: each "wave" of migration or each preferred direction was modelled as Gaussian or wrapped Gaussian, respectively.
The model-expected probability in each histogram bin was calculated, and a multinomial distribution was used for the likelihood.
Some remarks on the effects of rounding errors in hierarchical Bayesian models are also presented.
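The bin-probability construction described above can be sketched in a few lines (pure Python; the bin edges and counts below are invented for illustration): bin probabilities come from differences of the Gaussian CDF, and those probabilities feed a multinomial log-likelihood.

```python
import math

def norm_cdf(x, mu, sd):
    """Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def bin_probs(edges, mu, sd):
    """Model-expected probability of each histogram bin under N(mu, sd^2),
    renormalised so the probabilities sum to one over the observed range."""
    raw = [norm_cdf(b, mu, sd) - norm_cdf(a, mu, sd)
           for a, b in zip(edges[:-1], edges[1:])]
    total = sum(raw)
    return [r / total for r in raw]

def multinomial_loglik(counts, probs):
    """Multinomial log-likelihood, up to the constant multinomial coefficient."""
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)

# Daily catch counts in five consecutive bins (invented numbers):
edges = [0, 2, 4, 6, 8, 10]
counts = [3, 12, 30, 11, 4]
probs = bin_probs(edges, mu=5.0, sd=1.5)
ll = multinomial_loglik(counts, probs)
```

The wrapped-Gaussian case for the orientation cage differs only in that the density is wrapped around the circle before the sector probabilities are computed.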
JAGS: Just Another Gibbs Sampler pdf
IARC, 150 Cours Albert Thomas, 69372 Lyon CEDEX 08, France
JAGS is an alternative "engine" for the BUGS language. It has a command line interface that emulates most of the features of "classic" BUGS. JAGS has a completely independent code base from OpenBUGS (it is written in C++) and is portable across Microsoft Windows, Linux, Mac OS X and UNIX.
JAGS uses a dialect of the BUGS modelling language. Most of the minor language changes are designed for compatibility with OpenBUGS, but there are also some major innovations:
- Data transformations are declared in a separate "data" block. I will show how the data block can also be used to simulate data from one model and then analyse them in another.
- Censoring and truncation, which have long been a source of confusion, are handled in a different way.
- Some deterministic distributions have been introduced. These distributions bridge the gap between functions and distributions in BUGS, and allow inference on deterministic functions of the data, such as interval censored failure times and aggregated data from ecological studies.
Future plans for JAGS will also be described, including a new modular structure for the library that paves the way for an interface with R and, hopefully, user-contributed extensions.
Two Brief Topics on Modelling With WinBUGS pdf
MRC Biostatistics Unit, Institute of Public Health, Forvie Site, Robinson Way, Cambridge CB2 2SR, UK
Improving DIC. The Deviance Information Criterion is widely used but has not had great academic acclaim. Questions that will be addressed include: (i) can half the variance of the deviance be used as the effective number of parameters? (ii) could better plug-in estimates be used for Dhat? (iii) in what circumstances are AIC, DIC or BIC appropriate?
Using mixtures in WinBUGS: there is plenty of potential for confusion when using mixture models in WinBUGS, whether (i) all observations come from one of a set of distributions (eg is this coin biased or not?), (ii) each observation comes from one of a set of distributions (eg the 'eyes' example), (iii) each random effect comes from one of a set of distributions (eg microarrays). Graphical representation can help communication, although care is needed in analysing the output for situation (iii).
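A minimal sketch of case (ii), assuming the component parameters are fixed (in WinBUGS this is typically coded with a latent indicator given a dcat prior; the weights, means and data value below are invented, loosely in the spirit of the 'eyes' example):

```python
import math

def norm_pdf(y, mu, sd):
    """Gaussian density."""
    return math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def membership_probs(y, weights, mus, sds):
    """Posterior probability that observation y came from component k:
    proportional to weight_k * N(y; mu_k, sd_k)."""
    terms = [w * norm_pdf(y, m, s) for w, m, s in zip(weights, mus, sds)]
    total = sum(terms)
    return [t / total for t in terms]

# Two-component example (invented numbers):
probs = membership_probs(540.0, weights=[0.6, 0.4],
                         mus=[536.0, 548.0], sds=[3.5, 3.5])
```

In case (iii) the same calculation applies to each random effect rather than each observation, and summarising the output requires care because the component labels can switch between MCMC iterations.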
Opening BUGS pdf
Dept. of Mathematics and Statistics, P.O. Box 68, FIN-00014, University of Helsinki, Finland
BUGS is a long-running software project aiming to make modern MCMC techniques available to applied statisticians in an easy-to-use package. With the growing popularity of Open Source software, it has been decided to release the BUGS source code on the web. The BUGS user community is encouraged to improve and extend the software. This talk will give an overview of the structure of the BUGS software and the tools used in its creation and maintenance. Interfacing BUGS to other statistical software, such as R, will also be discussed.
Bayesian analysis of fluctuating asymmetry as a measure of developmental instability pdf
Stefan Van Dongen
Department of Biology - Group of Evolutionary Biology, University of Antwerp, Groenenborgerlaan 171, B-2020 Antwerp, Belgium,
Fluctuating asymmetry (FA) refers to small random deviations from perfect symmetry in otherwise bilaterally symmetric traits like arms, wings and legs. FA is assumed to be the joint outcome of two developmental processes which operate during ontogeny. On the one hand, random developmental noise causes any developing trait to deviate from its pre-determined trajectory. On the other hand, stabilizing mechanisms exist which attempt to minimize the effects of noise. Because development is never perfect, a realized phenotype will always deviate from its expected value conditional on the current environmental and genetic conditions. Unfortunately, each trait develops only once, such that it is impossible to calculate the deviations from its expectation. However, in bilaterally symmetric traits, the left and right side develop under exactly the same conditions. Therefore, the expected phenotypes are identical on both sides and the expectation of left minus right trait value equals zero. Any deviation from perfect symmetry reflects the joint outcome of developmental noise and/or stability, i.e., of developmental instability (DI).
The analysis of FA as a measure of DI in biology has received a lot of attention because it has been hypothesized that it may reflect the effects of stress. Therefore, FA may be an easy-to-use surrogate for the outcomes of stress on living organisms. In an ever-changing world where the impact of human activities increasingly threatens natural ecosystems, FA may represent a useful biomonitoring tool. While the concept of FA is relatively simple, its statistical analysis is not. FA is a reflection of variation in development; in other words, at the individual level we attempt to estimate a variance with two data points, which is bound to be prone to high sampling variation. I will provide a short overview of recent developments in analyzing patterns in FA through latent variable models in a Bayesian framework. Alongside the development of the statistical models, choices have to be made with respect to the assumed underlying distributions of developmental noise. Theoretical models of morphological development predict either log-normal or gamma distributions, while traditional analysis tools of FA and DI have assumed normality. We present the analysis of simulated datasets to explore the robustness of results under the different alternative distributions. Results from normal and gamma models appeared comparable, while data generated assuming log-normality of developmental noise yielded biased results if analysed assuming a normal or gamma distribution.
Generalized Evidence Synthesis for Diagnostic Test Data pdf
Pablo E. Verde
Coordination Centre for Clinical Trials, University of Duesseldorf, Moorenstr. 5, D-40225 Duesseldorf, Germany.
This work concerns the following question: how to perform a meta-analysis of diagnostic test data when the included studies are of different types (e.g. different designs). A hierarchical Bayes analytical plan is developed: test results are modeled as multinomial random variables, study variability as multivariate normal, and a third stage is used to include variability due to the different study types. Statistical computations are performed with Markov chain Monte Carlo (MCMC) methods based on a Gibbs sampler implemented in the WinBUGS 1.4.1 software. The data of a new systematic review, which investigates the potential diagnostic benefits of computer tomography (CT) scans in the diagnosis of appendicitis, are used to illustrate this Bayesian approach. The analysis shows that pooling results with classical techniques may over-estimate diagnostic performance.
Key words: Meta-analysis, SROC, diagnostic test, Bayesian statistics, hierarchical models, MCMC.
References Ohmann C., Verde P. E., Gilbers T., Franke C., Fürst G., Sauerland, Böhner H. (2005) Systematic Review of CT-Investigation in Suspected Acute Appendicitis. Unpublished manuscript, Coordination Centre for Clinical Trials, The Heinrich-Heine University of Duesseldorf.
Evidence Synthesis Where the Data Provide Information on Very Complex Functions of the Basic Parameters: Using WBDev pdf
Nicky J Welton & AE Ades
MRC Health Services Research Collaboration, Department of Social Medicine, Canynge Hall, Whiteladies Road, Bristol, BS8 2PR, U.K.
It has been argued that "all available evidence" be used in the appraisal of health technologies, and in the UK this approach is being encouraged by the National Institute for Clinical Excellence (NICE), the body providing clinical guidelines. A Bayesian decision analytic perspective allows the incorporation of disparate sources of evidence in a single statistical analysis, and an MCMC simulation framework provides a unified and straightforward propagation of all uncertainties and correlations into a cost-effectiveness model in a single step. Here we outline two examples where the available data give evidence on very complex functions of the underlying basic parameters. In the first example we estimate a Markov model with 1-week transition probabilities between asthma health states from data on weekly transitions, but find that WinBUGS becomes slow to compile and run when we incorporate additional data on transitions over a longer period. The second example provides data on Early Onset Group B Streptococcus (EOGBS) infection in infants. Because this is a rare condition, much of the data available to populate a natural history model comes from retrospective studies of maternal risk factors in infants with EOGBS. Once again, because of the complexity of the relation between the basic parameters and these data, WinBUGS models become slow to run. We investigate the use of the WBDev add-on to improve speed, and explore the limits of complexity that can realistically be modelled in this way.
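The difficulty in the first example can be seen in miniature: data observed over k weeks inform the 1-week transition matrix only through its k-th matrix power, so every observed-data probability is a high-order polynomial in the basic weekly parameters. A sketch (the transition matrix below is invented, not the authors' asthma model):

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, k):
    """k-step transition matrix as the k-th matrix power of P."""
    n = len(P)
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = mat_mul(R, P)
    return R

# Invented 1-week transition matrix between three health states:
P1 = [[0.80, 0.15, 0.05],
      [0.10, 0.80, 0.10],
      [0.05, 0.20, 0.75]]

# Data observed over 4 weeks inform P1 only through P1^4; each entry of
# P1^4 is a degree-4 polynomial in the nine basic weekly probabilities,
# which is what makes the likelihood expensive for WinBUGS to evaluate.
P4 = mat_pow(P1, 4)
```

Hand-coding such a function once in compiled form, which is what WBDev allows, replaces the large expression graph that the BUGS compiler would otherwise build for every matrix entry.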