TITLE: Evidence Weight for Matching DNA Profiles
in the Case of Mixed Stains.
SPEAKER: Anders Stockmarr
Department of Forensic Genetics
University of Copenhagen
DATE: October 8, 1997
LOCATION: 321 Funger Hall
TIME: 11:00 a.m.
------------------------------------------------------------
When observing a mixed DNA profile, i.e., a profile with DNA from more
than one person, it is often not obvious how much weight should be given
to a profile "match" between parts of the profile and a given suspect.
It depends on a number of factors, among them how the authorities became
aware of the suspect and the number of donors to the stain, if this
number is known at all. The talk will present current methodology in
DNA profiling analysis and the special kinds of problems that arise in
the case of mixed stains, with an application to an actual case.
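As a rough illustration of how the number of donors enters the weight of
evidence, the sketch below computes a single-locus likelihood ratio for
a mixed stain. It is a textbook-style calculation, not the speaker's
method: it assumes Hardy-Weinberg proportions and unrelated
contributors, and the allele names, frequencies, and the helper
prob_unknowns_carry are all invented for the example.

```python
from itertools import combinations

def prob_unknowns_carry(required, allowed, freqs, n_unknown):
    """P(n_unknown random people jointly carry every allele in `required`
    and no allele outside `allowed`), by inclusion-exclusion.
    Assumes Hardy-Weinberg proportions, unrelated people, and
    required being a subset of allowed."""
    required = list(required)
    total = 0.0
    for r in range(len(required) + 1):
        for dropped in combinations(required, r):
            # Probability mass of the allowed alleles that remain.
            mass = sum(freqs[a] for a in allowed if a not in dropped)
            total += (-1) ** r * mass ** (2 * n_unknown)
    return total

# Hypothetical locus: allele names and frequencies are made up.
freqs = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}
stain = {"A", "B", "C"}   # alleles observed in the mixed stain

# Victim is AB and suspect is AC, so only allele C is left unexplained
# by the victim.
p_prosecution = 1.0       # victim + suspect reproduce the stain exactly
p_defence = prob_unknowns_carry({"C"}, stain, freqs, n_unknown=1)
likelihood_ratio = p_prosecution / p_defence
```

Raising n_unknown under the defence hypothesis changes p_defence, which
is one way the assumed number of donors affects the reported weight.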
------------------------------------------------------------
--------------------------------------------------------------------------------
TITLE: Probability Models and Limit Theorems for Random Interval
Graphs with Applications to Cluster Analysis.
SPEAKER: Bernard Harris
University of Wisconsin-Madison
DATE: October 20, 1997
LOCATION: 321 Funger Hall
TIME: 11:00 a.m.
-------------------------------------------------------------
Assume that n k-dimensional data points have been obtained and subjected
to a cluster analysis algorithm. A potential concern is whether the
resulting clusters have a "causal" interpretation or whether they are
merely consequences of "random" fluctuation. In this report, the
asymptotic properties of a number of potentially useful combinatorial
tests based on the theory of random interval graphs are described. Some
preliminary numerical results illustrating their possible application as
a method of resolving the above question are provided.
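For readers unfamiliar with the object, an interval graph joins two
vertices whenever their intervals overlap. The sketch below builds one
and simulates the edge count under an assumed null model (endpoints
drawn independently from Uniform(0, 1)). The abstract does not specify
the models or test statistics of the talk, so the null model, the edge
count, and the function names here are illustrative assumptions only.

```python
import random

def interval_graph_edges(intervals):
    """Return the edges (i, j), i < j, of the interval graph:
    i and j are joined when their closed intervals overlap."""
    edges = []
    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            (a1, b1), (a2, b2) = intervals[i], intervals[j]
            if a1 <= b2 and a2 <= b1:
                edges.append((i, j))
    return edges

def random_intervals(n, rng):
    """n intervals with i.i.d. Uniform(0, 1) endpoints (assumed null model)."""
    return [tuple(sorted((rng.random(), rng.random()))) for _ in range(n)]

# Simulated reference distribution of the edge count for n = 10 intervals.
rng = random.Random(0)
null_edge_counts = [
    len(interval_graph_edges(random_intervals(10, rng))) for _ in range(200)
]
```

An observed edge count could then be compared with null_edge_counts to
judge whether it looks like "random" fluctuation under this null model.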
-------------------------------------------------------------
--------------------------------------------------------------------------------
TITLE: Some Current Issues with the Minimum Description Length
Principle
SPEAKER: Peter G. Bryant
College of Business and Administration and Graduate
School of Business Administration
University of Colorado at Denver
DATE: October 31, 1997
LOCATION: 308 Funger Hall
TIME: 11:00 a.m.
---------------------------------------------------------------
Commonly used methods of data analysis fall into two general categories:
  - The Analyse des Donnees approach, which develops data summaries
    without reference to sampling ideas; or
  - Sampling-based approaches, which assume the data are a random sample
    from some population. Typical approaches in this category include
    Fisher's significance testing, Neyman-Pearson hypothesis testing,
    Bayesian approaches, and penalized maximum likelihood with penalties
    such as AIC or Bozdogan's ICOMP.
Until recently, it seemed these were the only choices available.
Rissanen's Minimum Description Length (MDL) Principle provides a
possible alternative. The principle is consistent with probabilistic
ideas, but does not require them, and thus avoids some of the logical
problems of the formal methods, while giving model-selection guidance
in an objective, numerical manner.
The MDL principle is data-analytic in that it assumes that our goal is
to describe the data we have as efficiently as possible, as opposed to
describing the parameters of some (often hypothetical) population. It is
similar to more classical approaches in that it applies most naturally
to parametric classes of statistical models, and provides a mechanism
for us to select which model class fits the data better, considering
both the fit to the data and the complexity of the description. It is
based on a one-to-one correspondence between statistical models and the
lengths (in bits, say) of any description of the data. The model that
comes closest to the data minimizes the description length. The net
result is that for a given model class, we can derive a model fitting
criterion, MDL, showing how well the best-fitting model of that class
fits our given data. The model class with the smallest MDL is the better
one.
In this talk, I
  - introduce the MDL principle;
  - illustrate MDL criteria for two classes of model: linear regression
    with Gaussian errors, and multinomials;
  - compare the model selection rules from MDL with classical rules; and
  - discuss some open problems (in particular, the question of how to
    evaluate the performance of MDL procedures).
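The comparison of model classes by description length can be sketched
for Gaussian regression as follows. The criterion used is the crude
two-part approximation (n/2) log(RSS/n) + (k/2) log n, in nats, which
equals half of BIC; it is an assumption made for illustration, and the
refined MDL criteria discussed in the talk may differ. The data and
function names are invented.

```python
import math

def rss_constant(ys):
    """Residual sum of squares for the constant-mean model."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def rss_line(xs, ys):
    """Residual sum of squares for the least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

def description_length(rss, n, k):
    """Crude two-part code length in nats (assumed form, half of BIC):
    (n/2) log(RSS/n) for the fit plus (k/2) log n for k coefficients."""
    return 0.5 * n * math.log(rss / n) + 0.5 * k * math.log(n)

# Made-up data: a straight line plus small fixed perturbations.
xs = list(range(10))
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.2]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]

dl_constant = description_length(rss_constant(ys), len(ys), k=1)
dl_line = description_length(rss_line(xs, ys), len(ys), k=2)
better_class = "line" if dl_line < dl_constant else "constant"
```

Here the extra coefficient of the line costs (1/2) log n nats, which is
far outweighed by the shorter description of the residuals.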
------------------------------------------------------------------
--------------------------------------------------------------------------------
TITLE: Robust Estimating Functions with Nuisance Parameters.
SPEAKER: Mingxiu Hu
Department of Statistics
The George Washington University
DATE: November 7, 1997
LOCATION: 308 Funger Hall
TIME: 11:00 a.m.
---------------------------------------------------------------
The most important applications of estimating functions with nuisance
parameters under a semi-parametric model are Quasi-likelihood functions
(Wedderburn, 1974) and GEE (Liang & Zeger, 1986). To some extent, these
methods try to minimize a quadratic form of the residuals, and therefore
are not robust in the sense that they, like least squares estimates, are
sensitive to heavy-tailed distributions, contaminated distributions, and
outliers. In this thesis, a so-called robust estimating function family
is presented for the analysis of longitudinal data, which, to some
extent, is a multi-dimensional extension of a robust regression
approach, M-regression (Huber, 1981), for one-dimensional data. Besides
the property of robustness, the robust estimating function family enjoys
three important properties: unbiasedness, orthogonality, and optimality.
We also explored robust estimates for the nuisance parameters since
their robustness may have substantial impact on the robustness of the
estimating functions.
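To make the contrast with least squares concrete, the sketch below fits
a straight line by Huber-type M-regression using iteratively reweighted
least squares. It covers only the one-dimensional case mentioned above,
not the robust estimating function family of the talk. The tuning
constant 1.345, the scale estimate (median absolute residual divided by
0.6745), the data, and the function names are assumptions chosen for
the example.

```python
import statistics

def weighted_line(xs, ys, ws):
    """Weighted least-squares line; returns (intercept, slope)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def huber_line(xs, ys, c=1.345, iterations=50):
    """Huber M-regression by iteratively reweighted least squares."""
    ws = [1.0] * len(xs)
    for _ in range(iterations):
        intercept, slope = weighted_line(xs, ys, ws)
        residuals = [y - intercept - slope * x for x, y in zip(xs, ys)]
        # Robust scale: median absolute residual, rescaled for normal data.
        scale = statistics.median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break
        # Huber weights: 1 inside the threshold, shrinking outside it.
        ws = [1.0 if abs(r) <= c * scale else c * scale / abs(r)
              for r in residuals]
    return intercept, slope

# Made-up data on the line y = 2x + 1 with one gross outlier.
xs = list(range(10))
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.2]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]
ys[-1] += 30.0

ols_intercept, ols_slope = weighted_line(xs, ys, [1.0] * len(xs))
huber_intercept, huber_slope = huber_line(xs, ys)
```

The single outlier pulls the least-squares slope well away from 2,
while the Huber fit bounds its influence.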
-----------------------------------------------------------------
--------------------------------------------------------------------------------
TITLE: Bayesian Inference and Design for Two-Phase Prevalence Studies.
SPEAKER: Refik Soyer
Management Science Department
The George Washington University
DATE: November 14, 1997
LOCATION: 310 Funger Hall
TIME: 11:00 a.m.
--------------------------------------------------------------
We discuss Bayesian methods for the assessment of the prevalence of a
disorder based on data from a two-phase design. In calculating the
posterior distributions of the quantities of interest, e.g., the
prevalence, sensitivity, and specificity, we use a Gibbs sampler. We
illustrate our approach by assessing the prevalence of depression in
adolescents using data obtained from a two-phase design. We also address
the design of two-phase experiments and present a Bayesian decision
theoretic approach to the design problem. In so doing, we adopt a Monte
Carlo-based approach to develop optimal Bayesian designs for two-phase
screening tests. The MC approach facilitates the preposterior analysis
by replacing it with a sequence of scatter plot smoothing/regression
techniques and optimization of the corresponding fitted surfaces. The
method is illustrated with an example.
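The role of the Gibbs sampler can be illustrated on a much simpler
problem than the two-phase design of the talk: a single imperfect
screening test applied once, with Beta priors on prevalence,
sensitivity, and specificity and the true disease status treated as
latent. This simplified model, the prior parameters, the counts, and
the function names are all assumptions made for the sketch.

```python
import random

def binomial(rng, n, p):
    """Draw a Binomial(n, p) variate as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def gibbs_prevalence(n_pos, n_neg, prior_prev, prior_sens, prior_spec,
                     draws=2000, burn_in=500, seed=0):
    """Gibbs sampler for (prevalence, sensitivity, specificity) given the
    numbers of positive and negative results of one imperfect test.
    Each prior is a (alpha, beta) pair of a Beta distribution."""
    rng = random.Random(seed)
    prev, sens, spec = 0.5, 0.8, 0.8   # arbitrary starting values
    samples = []
    for it in range(draws + burn_in):
        # Latent step: how many positives / negatives are true cases.
        p_tp = prev * sens / (prev * sens + (1 - prev) * (1 - spec))
        p_fn = prev * (1 - sens) / (prev * (1 - sens) + (1 - prev) * spec)
        tp = binomial(rng, n_pos, p_tp)
        fn = binomial(rng, n_neg, p_fn)
        fp, tn = n_pos - tp, n_neg - fn
        # Conjugate Beta updates given the latent counts.
        prev = rng.betavariate(prior_prev[0] + tp + fn,
                               prior_prev[1] + fp + tn)
        sens = rng.betavariate(prior_sens[0] + tp, prior_sens[1] + fn)
        spec = rng.betavariate(prior_spec[0] + tn, prior_spec[1] + fp)
        if it >= burn_in:
            samples.append((prev, sens, spec))
    return samples

# Hypothetical data: 30 positive and 70 negative screening results.
samples = gibbs_prevalence(30, 70, prior_prev=(1, 1),
                           prior_sens=(45, 5), prior_spec=(45, 5))
mean_prevalence = sum(s[0] for s in samples) / len(samples)
```

Posterior summaries of sensitivity and specificity follow in the same
way from the second and third components of each sample.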
--------------------------------------------------------------
--------------------------------------------------------------------------------
TITLE: George Snedecor, Henry Wallace and the Birth of Computing.
SPEAKER: David Grier
Department of Statistics
The George Washington University
DATE: November 21, 1997
LOCATION: 310 Funger Hall
TIME: 11:00 a.m.
----------------------------------------------------------------
When John Atanasoff built his early electronic computer at Iowa State
University, he proposed nine different problems that it might solve.
Eight of the nine are clearly rooted in agricultural statistics and show
his connection to George Snedecor's computing laboratory. For a few
years in the second decade of this century, Snedecor's lab was probably
the largest and most sophisticated computing organization in this
country. Funded by Pioneer Seed scion Henry Wallace, it developed a host
of sophisticated computing techniques that challenged many notions about
numerical analysis. This work included combinatorial algorithms (for
breed books), sorting procedures (for non-parametric statistics), and
pattern matching algorithms. This seminar will present the basic outline
of the work of this lab, including the contributions by A. E. Brandt and
the other computers at Iowa State.
-----------------------------------------------------------------
--------------------------------------------------------------------------------
TITLE: Sequential Monitoring of Informatively Censored Longitudinal Data
SPEAKER: Ravinder Anand
Department of Statistics
The George Washington University
DATE: December 5, 1997
LOCATION: TBA
TIME: TBA
--------------------------------------------------------------------
In longitudinal studies, subjects often drop out prematurely, leading
to right-censored data. When right censoring is related to the primary
outcome variable, it is termed informative censoring. Wu and Lan (1992)
proposed a group sequential procedure based on Wu and Bailey's (1989)
conditional linear model for informatively censored data. Under weak
assumptions about the arrival process, follow-up times, and intermittent
missing data, we extend Wu and Lan's result to non-normal settings and
show that the asymptotic distribution of the sequential test statistics
is multivariate normal.
When assumptions for Wu and Bailey's conditional linear model are
suspect, the unweighted least squares estimate (UWLE) of the slope can
be used because it provides consistent estimation both in the presence
and absence of informative censoring. Under the linear random effects
model (Laird and Ware, 1982) for the non-normal primary outcome
variable, we show that the sequential test statistic based on the UWLE
of the slope also converges in distribution to a multivariate normal
distribution. The two group sequential procedures are illustrated and
compared using data from an AIDS clinical trial.
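The abstract does not define the UWLE. The sketch below assumes it is
the simple (unweighted) average of each subject's own least-squares
slope, so that subjects who drop out early count as much as those
followed longer. That reading, the toy data, and the function names are
assumptions for illustration.

```python
def ols_slope(times, values):
    """Least-squares slope of one subject's outcome against time."""
    n = len(times)
    mt = sum(times) / n
    mv = sum(values) / n
    stt = sum((t - mt) ** 2 for t in times)
    return sum((t - mt) * (v - mv) for t, v in zip(times, values)) / stt

def unweighted_slope_estimate(subjects):
    """Simple average of per-subject slopes (assumed meaning of UWLE).
    Subjects with fewer than two distinct visit times are skipped."""
    slopes = [ols_slope(t, y) for t, y in subjects if len(set(t)) >= 2]
    return sum(slopes) / len(slopes)

# Toy data: (visit times, outcomes) per subject; the second subject
# drops out after two visits.
subjects = [
    ([0, 1, 2, 3], [10, 9, 8, 7]),   # slope -1
    ([0, 1], [10, 8]),               # slope -2, early dropout
    ([0, 1, 2], [5, 5, 5]),          # slope 0
]
uwle = unweighted_slope_estimate(subjects)
```

A precision-weighted average would instead give the early dropout less
weight, which is where informative censoring can introduce bias.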