Analytics for a Changing World: From Data to Decisions

The New Zealand Statistical Association and the Operations Research Society of New Zealand are holding a joint conference covering both of their fields.

Some of the abstracts that have already been advertised look very exciting:

Assessing Change with Longitudinal Data

John M. Neuhaus (with Charles E. McCulloch), University of California, San Francisco

Investigators often gather longitudinal data to assess changes in responses over time within subjects and to relate these changes to within-subject changes in predictors. Missing data are common in such studies, and predictors can be correlated with subject-specific effects. Maximum likelihood methods for generalized linear mixed models provide consistent estimates when the data are “missing at random” (MAR) but can produce inconsistent estimates in settings where the random effects are correlated with one of the predictors. On the other hand, conditional maximum likelihood methods (and closely related maximum likelihood methods that partition covariates into between- and within-cluster components) provide consistent estimation when random effects are correlated with predictors but can produce inconsistent covariate effect estimates when data are MAR. Using theory, simulation studies, and fits to example data, this talk shows that decomposition methods using complete covariate information produce consistent estimates. In some practical cases these methods, which ostensibly require complete covariate information, actually involve only the observed covariates. These results offer an easy-to-use approach that simultaneously protects against bias from either cluster-level confounding or MAR missingness in assessments of change.
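
To give a rough flavour of the between/within decomposition the abstract mentions, here is a minimal sketch in Python using statsmodels' mixed-model routines. It is not the speakers' code; the simulated data and the column names y, x and subject are made up purely for illustration.

```python
# Sketch: decompose a time-varying covariate into its between-subject mean and
# within-subject deviation before fitting a linear mixed model. Simulated data;
# column names are illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# 100 subjects with 5 observations each; the subject-level mean of x is
# correlated with the subject-specific effect u -- the situation the talk
# warns can bias naive mixed-model estimates.
n_subj, n_per = 100, 5
subject = np.repeat(np.arange(n_subj), n_per)
u = rng.normal(size=n_subj)                        # subject-specific effects
x = u[subject] + rng.normal(size=n_subj * n_per)   # predictor correlated with u
y = 1.0 + 0.5 * x + u[subject] + rng.normal(size=n_subj * n_per)
df = pd.DataFrame({"y": y, "x": x, "subject": subject})

# Between-subject component (the subject mean of x) and within-subject deviation.
df["x_between"] = df.groupby("subject")["x"].transform("mean")
df["x_within"] = df["x"] - df["x_between"]

# Naive mixed model: the slope on x is pulled away from the true
# within-subject effect (0.5) by the cluster-level confounding.
naive = smf.mixedlm("y ~ x", df, groups=df["subject"]).fit()

# Decomposition model: the within-subject slope targets the effect of change.
decomp = smf.mixedlm("y ~ x_between + x_within", df, groups=df["subject"]).fit()

print(naive.params["x"], decomp.params["x_within"])
```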

A Unified Approach to Shrinkage

Ken Rice, University of Washington

As data have become "Big", shrinkage estimators of various forms have become standard tools in statistical analysis. Common justifications for them include penalized maximum likelihood, empirical Bayes posterior means, and full Bayes posterior modes. None of these, however, addresses the question of why one might want a shrunken estimate in the first place. In this talk we outline a general approach to shrinkage that results from balancing veracity (getting close to the truth) and simplicity (getting close to zero, typically). While yielding "simple" shrunken estimates, the approach does not require any assumption that the truth is actually full of zeros - an assumption that is often unreasonable. Several well-known shrinkage estimates will be derived as special cases, illustrating close connections between them.
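
Purely as an illustration of that trade-off (and not the unified framework of the talk itself), here is a toy Python sketch; the weight lam below is a hypothetical tuning parameter balancing closeness to the data against closeness to zero.

```python
# Sketch: a shrunken estimate as the minimiser of
#   (theta - ybar)^2  +  lam * theta^2
# i.e. "veracity" (stay near the sample mean) traded off against
# "simplicity" (stay near zero). Toy data; lam is a made-up weight.
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=25)   # sample with true mean 2
ybar = y.mean()

def shrunken_estimate(ybar, lam):
    # The quadratic trade-off has the closed form ybar / (1 + lam):
    # linear shrinkage of the sample mean toward zero.
    return ybar / (1.0 + lam)

for lam in (0.0, 0.1, 1.0, 10.0):
    print(f"lam = {lam:5.1f} -> estimate = {shrunken_estimate(ybar, lam):.3f}")

# lam = 0 recovers the sample mean (pure veracity); large lam forces the
# estimate toward zero (pure simplicity), mirroring ridge-type and
# posterior-mean shrinkage formulas.
```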

The Need for Speed in the Path of the Deluge

Chris Wild, University of Auckland

There is a rapidly increasing awareness of the so-called "data deluge": the explosion in quantities of data being collected, the explosion of settings in which it is being collected, and expansions in the conceptions and scope of what constitutes data. This is accompanied by advances in ways of visualising data and accessible data-visualisation tools. Consequently, it is imperative to find ways to get students much further, much faster, and with better comprehension - a quantum leap in ambition. What can make this possible are some of the same things that gave rise to the deluge: computational power and clever software. We will advance some strategies that envisage first maximising awareness and excitement about data and what it can do for you, and only later backfilling details. We also provide glimpses of two software projects beginning to enable such a future - a fleeting encounter with some data-analysis software followed by a more in-depth look at visualisation approaches to bootstrap and randomisation inference.
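
The talk's own software is not shown here, but as a generic sketch of the kind of resampling those visualisation tools animate, here is a percentile bootstrap for a sample mean in Python (with made-up data):

```python
# Sketch: a plain percentile bootstrap for a sample mean -- the sort of
# resampling that bootstrap-visualisation tools replay draw by draw.
# The data are simulated purely for illustration.
import numpy as np

rng = np.random.default_rng(2)
data = rng.exponential(scale=3.0, size=40)    # a small, skewed sample

n_boot = 5000
boot_means = np.empty(n_boot)
for b in range(n_boot):
    resample = rng.choice(data, size=data.size, replace=True)
    boot_means[b] = resample.mean()

lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"sample mean = {data.mean():.2f}, "
      f"95% bootstrap interval = ({lo:.2f}, {hi:.2f})")
```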

Some Recent Advances in Network Tomography

Martin Hazelton, Massey University

Volume network tomography is concerned with inference about traffic flow characteristics based on traffic measurements at fixed locations on the network. The quintessential example is estimation of the traffic volume between any pair of origin and destination nodes using traffic counts obtained from a subset of the links of the network. The data provide only indirect information about the target variables, generating a challenging type of statistical linear inverse problem. For much of the past 40 years, work on network tomography appeared primarily in the engineering literature, but burgeoning interest from statisticians has seen it dubbed a 'classic' problem in a recent issue of JASA. In this talk I will discuss network tomography for a rather general class of traffic models. I will describe some recent progress on model identifiability, and the development of effective MCMC samplers for simulation-based inference.
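
To make the linear inverse problem concrete, here is a toy Python sketch of the structure the abstract describes: link counts y related to origin-destination (OD) flows x by y = Ax, with A a routing matrix. The network and numbers are made up, and the non-identifiability it shows is what motivates the simulation-based inference discussed in the talk.

```python
# Sketch: volume network tomography on a toy network. Observed link counts
# y relate to unobserved origin-destination (OD) flows x via y = A x, where
# A is a link-path (routing) incidence matrix. Everything here is made up.
import numpy as np
from scipy.linalg import null_space

# Two monitored links, three OD routes:
#   link 1 carries OD flows 1 and 2; link 2 carries OD flows 2 and 3.
A = np.array([[1, 1, 0],
              [0, 1, 1]])

x_true = np.array([30.0, 20.0, 10.0])   # hypothetical OD volumes
y = A @ x_true                          # observed link counts: [50, 30]

# The system is underdetermined: adding any multiple of a null-space
# direction of A to x_true reproduces exactly the same link counts, so the
# counts alone cannot identify the OD flows.
n = null_space(A)[:, 0]
for t in (-5.0, 0.0, 5.0):
    x_alt = x_true + t * n
    print(x_alt.round(2), "-> link counts", (A @ x_alt).round(2))
```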