Title: Mixed Distributions in an Algorithm for Cyclic
Permutations via Randomization and Derandomization
Speaker: Dr. Hosam M. Mahmoud
Department of Statistics, George Washington University
Date: January 31, 2003
Location: Funger Hall 307
Time: 11:00-12:00 noon.
------------------------------------------------------------
Randomization as a transform, followed by derandomization
as an inverse transform, is useful in solving certain
classes of problems with fixed elements. We investigate
the limit distributions associated with cost measures
in an algorithm for generating random cyclic permutations.
The number of moves made by an element turns out to
be a mixture of 1 and 1 plus a geometric distribution
with parameter $1/2$, where the mixing probability is
the limiting proportion of the rank of the element being
moved to the size of the permutation. On the other hand,
the raw distance traveled by an element to its final
destination does not converge in distribution without
norming. Linearly scaled, the distance converges to
a mixture of a uniform and a shifted product of a pair
of independent uniforms.
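For readers unfamiliar with the setting, here is a minimal sketch of one
standard way to generate a uniformly random cyclic permutation. The
abstract does not name the algorithm studied in the talk; Sattolo's
algorithm is used here as an assumption, and the function names
sattolo_cycle and cycle_length are illustrative only. The cost measures
analyzed in the talk (number of moves, distance traveled) are not
reproduced.

```python
import random


def sattolo_cycle(n, rng=random):
    """Return a random permutation of range(n) that is a single n-cycle.

    Sattolo's algorithm (assumed here, not confirmed by the abstract):
    like a Fisher-Yates shuffle, but position i is always swapped with a
    strictly smaller position j < i.
    """
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never j == i
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def cycle_length(perm, start=0):
    """Length of the cycle of `perm` that contains `start`."""
    length, k = 1, perm[start]
    while k != start:
        k = perm[k]
        length += 1
    return length
```

Calling cycle_length(sattolo_cycle(n)) for any n >= 1 should return n,
since the output consists of one cycle through all n elements.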
------------------------------------------------------------
Title: Regression Models for Time Series Analysis
Speaker: Dr. Benjamin Kedem
University of Maryland, College Park
Date: February 14, 2003
Location: Funger Hall 307
Time: 11:00-12:00 noon.
------------------------------------------------------------
A relatively recent statistical development is the
important class of models known as generalized linear
models (GLM) that was introduced by Nelder and Wedderburn
(1972), and which provides under some conditions a unified
regression theory suitable for continuous, binary, categorical,
and count data. The theory of GLM was originally intended
for independent data, but it can be extended to dependent
data under various assumptions. The extension to time
series will be presented, accompanied by some real data
examples.
-----------------------------------------------------------
Title: After 9/11 – What Has Changed, What Hasn’t?:
A Science and Technology Person’s Perspective
Speaker: Dr. Charles M. Herzfeld
Senior Fellow, Potomac Institute for Policy Studies,
and Senior Adjunct Fellow, Center for Strategic and
International Studies
Date: February 28, 2003
Location: Funger Hall 207
Time: 5:00-6:00 pm
------------------------------------------------------------
The world has many problems, some old and some new.
The well-hidden blessing of 9/11 was that it made a
lot of us think again about the basics of life. Here
we attempt a high level perspective from the viewpoint
of a scientist/technologist. We will examine long term
trends of human advance that have not as yet been affected
by 9/11, and try to see where they are going: population
growth, food availability, shortages in resources. We
will examine some challenges for the technical community,
the managers and the teachers. For example, the public
is hugely ignorant about "real risks", versus
the "perceived risks" of modern life, yet
constantly forces decisions about how to deal with these
risks. Examples are airline security, nuclear power,
and many more. Most organizations have become excessively
bureaucratic, so that making changes has become practically
impossible. And there is the war on terrorism, which
brings many problems and opportunities to a head.
-----------------------------------------------------------
Title: An Analysis of Box-Cox Transformed Data
Speaker: Ms. Jade Lee Freeman
Department of Statistics, The George Washington University
Date: March 7, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
------------------------------------------------------------
We present a method for estimating the mean vector
from a multivariate skew distribution that includes
some unobserved data below the detection limits. To
estimate the mean vector and the covariance matrix we
develop an EM algorithm solution and use it to maximize
the likelihood. We obtain expressions for the mean vector,
covariance matrix, and the asymptotic covariance of
the vector of means in the original scale. The performance
of the MLE method in selecting the correct power transformation
and the coverage rate of the confidence region under
several conditions are investigated with Monte Carlo
simulation.
The Box-Cox transformation system produces the power normal
(PN) family, whose members include normal and log-normal
distributions. We study the moments of PN and obtain
expressions for its mean and variance. The quantile
functions and a quantile measure of skewness are discussed
to show that the PN family is ordered with respect to
the transformation parameter. The conditional distributions
are studied and shown to belong to the PN family. We
obtain expressions for the mean, median and modal regressions.
Chebyshev-Hermite polynomials are used to obtain an
expression for the correlation coefficient and to prove
that correlation is smaller in the PN scale than in the
original scale. Fréchet bounds are used to obtain expressions
for the lower and upper bounds of the correlation coefficient.
An algorithm is given to compute the bounds.
We also investigate the efficiency of tests after a
power transformation. In particular, we consider the
one-sample test of location and study the gains in efficiency
for the one-sample t-test following a Box-Cox transformation.
We prove that the asymptotic relative efficiency of the
transformed univariate t-test and Hotelling's test of
multivariate location with respect to the same statistics
based on untransformed data is at least one. We also
study the efficiency of the correlation coefficient
following a Box-Cox transformation. We prove that much
stronger conclusions can be reached about the independence
of the margins of bivariate normal variates once they
have been transformed with a Box-Cox transformation.
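For reference, the Box-Cox transformation referred to throughout this
abstract is usually written as follows. This is the standard textbook
form, given here as an assumption about the notation; the symbols y and
\lambda do not appear in the abstract itself.

```latex
\[
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
  \log y, & \lambda = 0,
\end{cases}
\qquad y > 0 .
\]
```

With the transformed value treated as normal, \lambda = 1 corresponds to
the normal case and \lambda = 0 to the log-normal case, which matches
the two family members named above.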
------------------------------------------------------------
Title: Data Mining---Three and a Half Issues
Speaker: Dr. David Banks
Center for Biologics Evaluation, U.S. FDA
Date: March 28, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
-----------------------------------------------------------
Data mining subsumes many problem areas. This talk
describes (1) issues that arise in preanalysis, which
largely concerns data quality, (2) the problem of selecting
an appropriate data mining tool (the focus is upon choosing
methods for nonparametric multivariate regression),
and (3) estimation of local dimensionality. Additionally,
there is a brief discussion of the indexing problem,
which concerns the creation of a useful summary of complex
multivariate data sets.
-----------------------------------------------------------
Title: Maximin Efficiency Robust Tests for the Focused
Clustering of Disease
Speaker: Mr. Pablo Bonangleino
Department of Statistics, The George Washington University
Date: April 4, 2003
Location: Funger Hall 308
Time: 11:00-12:00 Noon
------------------------------------------------------------
Score tests are among the most powerful of tests for
focused clustering. However, their dependency on a specific
stochastic model and parameterization of the exposure
indicates that they will be inefficient if these assumptions
are far from the truth. To address this issue, we develop
maximin efficiency robust tests (MERTs) for candidate
sets of possible exposures. The power of several MERTs
and single tests is compared through simulation. In
addition, the power of MERTs and single tests is examined
for a departure from the standard Poisson assumption,
namely the zero-inflated or Poisson with added zeros
model. Variations of the tests, adequate for this model,
are proposed. Finally, we present an application of
the methods to leukemia incidence data from upstate
New York.
-------------------------------------------------------------
Title: Issues in Measurement and Analysis of Health
Related Quality of Life
Speaker: Professor Mounir Mesbah
Université de Bretagne-Sud, Vannes, France
Date: April 4, 2003
Location: Funger Hall 321
Time: 5:00-6:00 pm
------------------------------------------------------------
Health Related Quality of Life surveys generally deal
with two kinds of data: data recorded during an exploratory
or validation step, in order to help with the construction
(definition) of variables and indicators, and
data recorded during an analysis step in order to investigate
the evolution of the distribution of the previously constructed
variables between various populations, times and areas.
These are generally two well separated steps during
the research process of a scientist in the field of
Health Related Quality of Life, Environment or any other.
The first step generally deals with the measurement, calibration,
and metrology of variables, and the most used statistical methods
are multivariate exploratory analysis and structural
models, like factor analysis models or item response
theory models. The second step is certainly better known
to inferential statisticians. Linear, generalized linear,
time series and survival methods (and models) are very
useful in this step. The variables constructed in the
first step are incorporated in this second step and
their joint distribution with
the other analysis variables (treatment group, time,
duration of life, etc.) is investigated. In this
talk, I will compare the simple strategy of separating
the two steps with the one defining and analysing a
global model including both the measurement and the
analysis step. I will illustrate the issue with a real
example in oncology, where the main goal is the analysis
of the joint distribution of Survival and Quality of
Life of cancer patients randomized in two treatment
groups during a clinical trial.
-------------------------------------------------------------
Title: The Effect of Statistical Dependence on Inferences
from Binomial Data
Speaker: Professor Weiwen Miao
Department of Mathematics and Computer Science, Macalester
College, Minnesota
Date: April 11, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
------------------------------------------------------------
The talk describes the effect of statistical dependence
on tests and confidence intervals for the parameter
p, the success probability of a binomial random variable.
The problem was motivated by a jury discrimination case,
Moultrie v. Martin, in which half the grand jurors served
a second year. Hence, the racial compositions of the
grand juries in consecutive years were no longer statistically
independent. The first part of the talk concentrates
on the effect of the dependence on hypothesis testing.
It will be shown that ignoring dependence not only made
the statistical evidence of discrimination appear stronger
than it truly was but also exaggerated the power of
the test used to determine the possible discrimination.
Both the exact distribution of the number of "successes"
and its normal approximation are compared in order to
provide a practical condition for the use of the approximation.
The second part of the talk focuses on the effect of
dependence on confidence intervals for a population
proportion. When observations are dependent, even slightly,
the coverage probability of virtually all the confidence
intervals in the literature can deviate noticeably from
their nominal level. We proposed and examined several
modified confidence intervals. Our results showed that
the modified Wilson interval performs well and can be
recommended for general use.
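As background, the standard Wilson score interval for a binomial
proportion can be sketched as below. This is the textbook interval for
independent trials, not the modified interval proposed in the talk,
whose adjustment for dependence is not given in this abstract. The
function name wilson_interval and the default z value (an approximate
95% normal quantile) are assumptions made for illustration.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Standard Wilson score interval for a binomial proportion.

    Assumes n independent trials; z is the two-sided normal quantile
    (the default corresponds to roughly 95% confidence).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p_hat = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (p_hat + z2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z2 / (4.0 * n * n)) / denom
    return center - half_width, center + half_width
```

For example, wilson_interval(5, 10) gives an interval centered at 0.5,
roughly (0.237, 0.763).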
------------------------------------------------------------------
Title: Semiparametric AFT for survival data: inference,
implementation and theory
Speaker: Professor Zhiliang Ying
Department of Statistics, Columbia University
Date: April 18, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
------------------------------------------------------------
The accelerated failure time (AFT) model is an equivalent
version of the linear regression model, especially formulated
for survival data. Because of its many attractive features
and the link to the classical linear regression, there
have been many efforts made in the past two and a half
decades to develop inference procedures as well as related
theory. The first part of this talk will be an overview
of these efforts, including the difficulties encountered
therein. In the second part, some significant recent
developments will be described. Special attention will
be given to the implementational aspect.
------------------------------------------------------------------
Title: Convergence analysis of the least-squares estimates
for infinite AR models
Speaker: Professor Yulia R. Gel
Department of Statistics, University of Washington
Date: April 29, 2003
Location: Funger Hall 321
Time: 11:00-12:00 Noon
------------------------------------------------------------
The standard parameter estimation methods usually assume
that the true underlying model of the observed process
is a finite AR, MA or a mixed ARMA equation. However,
this assumption can rarely be justified in practice.
A common approach in system identification is to approximate
the true model by a finite AR model. An extensive literature
is devoted to the optimal selection of the approximating
model order and the analysis of selection criteria such
as AIC, BIC, PLS and others. Our approach is the reverse:
we first consider the parameter estimation
problem for an infinite AR model and then return to
the analysis of optimal order selection. This talk
focuses on the convergence analysis of the regularized
Least Squares method for the infinite case. A result is established
on the degree of a.s. convergence of the
infinite LS estimates. In addition, we present
a complementary result on the convergence of semimartingales,
which is a cornerstone of the proofs of all theorems here
but is also of interest in its own right. The proposed identification
procedure is evaluated by simulations.
------------------------------------------------------------
Title: Scaled Boolean Algebras in the Foundations of
Probability
Speaker: Dr. Michael Hardy
Department of Statistics, University of Toledo, Ohio
Date: May 6, 2003
Location: Funger Hall 321
Time: 10:00-11:00 am
------------------------------------------------------------
This topic could be considered a part of the theory
of comparative probability orderings. Typically in that
theory one assumes a sort of weak additivity that says
that if A is less probable than B, and B and C are disjoint,
then [A or C] is less probable than [B or C]. But we
avoid additivity axioms because this work arose from an examination
of how to justify the conventional probability axioms when probabilities
are construed as degrees of belief rather than as frequencies
or proportions or the like. To a Boolean algebra of
uncertain propositions we assign degrees of belief that
are members of a partially ordered "scale".
We ask under what conditions the scale should consist
of real numbers and the sum and product rules of probability
should hold.
------------------------------------------------------------
Title: Infinitely Divisible Time Series Models
Speaker: Dr. Xuefeng Li
Department of Statistics, University of Pennsylvania
Date: May 12, 2003
Location: Funger Hall 321
Time: 11:00-12:00 Noon
------------------------------------------------------------
Motivated by a project analyzing call center data,
we study time series models with infinitely divisible
marginal distributions. Existing models, though similar
in form to the classical ARMA model, are highly
restrictive. In this work we propose two new constructions.
The first one comes from the construction of multivariate
random variables with infinitely divisible margins and
gives a more flexible moving average structure. The second
one is based on the integration of Gamma random fields
and gives continuous stationary stochastic processes
with Gamma margins. Most of the properties of these
new constructions carry over to the family of infinitely
divisible distributions. Estimation procedures as well
as their asymptotic properties are investigated. Open
questions and future research directions are discussed.
This is joint work with Dr. Lawrence Brown and Dr.
Robert Wolpert.
The series hosts a seminar about twice a month on current
research topics. The seminar often features an invited
guest speaker and occasionally local faculty members,
students or others affiliated with the department. The
usual time of the seminar is 11:00 a.m. on Fridays. Professor
Reza Modarres (e-mail: reza@gwu.edu) is the Seminar Series
Coordinator.
--------------------------------------------------------------------------------
The contact person is Reza Modarres at Reza@gwu.edu
or 202-994-6359.