Seminar Announcements for Spring 2003
Title: Mixed Distributions in an Algorithm for Cyclic Permutations via Randomization and Derandomization

Speaker: Dr. Hosam M. Mahmoud
Department of Statistics, George Washington University
Date: January 31, 2003
Location: Funger Hall 307
Time: 11:00-12:00 noon.

------------------------------------------------------------

Randomization as a transform, followed by derandomization as an inverse transform, is useful in solving certain classes of problems with fixed elements. We investigate the limit distributions associated with cost measures in an algorithm for generating random cyclic permutations. The number of moves made by an element turns out to be a mixture of 1 and 1 plus a geometric distribution with parameter $1/2$, where the mixing probability is the limiting proportion of the rank of the element being moved to the size of the permutation. On the other hand, the raw distance traveled by an element to its final destination does not converge in distribution without norming. Linearly scaled, the distance converges to a mixture of a uniform and a shifted product of a pair of independent uniforms.
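The abstract does not name the algorithm under study. As a point of reference only, the sketch below shows one standard way to generate a uniformly random cyclic permutation, commonly known as Sattolo's algorithm; treating it as the algorithm analyzed in the talk would be an assumption. The function names `sattolo_cycle` and `is_single_cycle` are hypothetical.

```python
import random


def sattolo_cycle(n, rng=None):
    """Return a random permutation of range(n) that is a single n-cycle.

    Sattolo's algorithm: like the Fisher-Yates shuffle, except that the
    swap partner j is drawn strictly below i, so no element stays fixed.
    """
    rng = rng or random.Random()
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never j == i
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def is_single_cycle(perm):
    """Check that following perm from 0 visits every index before returning."""
    n = len(perm)
    if n == 0:
        return False
    k, steps = 0, 0
    while True:
        k = perm[k]
        steps += 1
        if k == 0:
            break
    return steps == n


if __name__ == "__main__":
    p = sattolo_cycle(8, random.Random(2003))
    print(p, is_single_cycle(p))
```

Cost measures such as the number of moves of a given element could be recorded by adding counters inside the loop.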

------------------------------------------------------------

Title: Regression Models for Time Series Analysis

Speaker: Dr. Benjamin Kedem
University of Maryland, College Park
Date: February 14, 2003
Location: Funger Hall 307
Time: 11:00-12:00 noon.

------------------------------------------------------------

A relatively recent statistical development is the important class of models known as generalized linear models (GLM) that was introduced by Nelder and Wedderburn (1972), and which provides under some conditions a unified regression theory suitable for continuous, binary, categorical, and count data. The theory of GLM was originally intended for independent data, but it can be extended to dependent data under various assumptions. The extension to time series will be presented accompanied by some real data examples.

-----------------------------------------------------------

Title: After 9/11 – What Has Changed, What Hasn’t?: A Science and Technology Person’s Perspective

Speaker: Dr. Charles M. Herzfeld
Senior Fellow, Potomac Institute for Policy Studies, and Senior Adjunct Fellow, Center for Strategic and International Studies
Date: February 28, 2003
Location: Funger Hall 207
Time: 5:00-6:00 pm

------------------------------------------------------------

The world has many problems, some old and some new. The well-hidden blessing of 9/11 was that it made a lot of us think again about the basics of life. Here we attempt a high-level perspective from the viewpoint of a scientist/technologist. We will examine long-term trends of human advance that have not as yet been affected by 9/11, and try to see where they are going: population growth, food availability, shortages in resources. We will examine some challenges for the technical community, the managers and the teachers. For example, the public is hugely ignorant about "real risks" versus the "perceived risks" of modern life, yet constantly forces decisions about how to deal with these risks. Examples are airline security, nuclear power, and many more. Most organizations have become excessively bureaucratic, so that making changes has become practically impossible. And there is the war on terrorism, which brings many problems and opportunities to a head.

-----------------------------------------------------------

Title: An Analysis of Box-Cox Transformed Data

Speaker: Ms. Jade Lee Freeman
Department of Statistics, The George Washington University
Date: March 7, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
------------------------------------------------------------

We present a method for estimating the mean vector from a multivariate skew distribution that includes some unobserved data below the detection limits. To estimate the mean vector and the covariance matrix we develop an EM algorithm solution and use it to maximize the likelihood. We obtain expressions for the mean vector, covariance matrix, and the asymptotic covariance of the vector of means in the original scale. The performance of the MLE method in selecting the correct power transformation and the coverage rate of the confidence region under several conditions are investigated with Monte Carlo simulation.

The Box-Cox transformation system produces the power normal (PN) family, whose members include normal and log-normal distributions. We study the moments of PN and obtain expressions for its mean and variance. The quantile functions and a quantile measure of skewness are discussed to show that the PN family is ordered with respect to the transformation parameter. The conditional distributions are studied and shown to belong to the PN family. We obtain expressions for the mean, median and modal regressions. Chebyshev-Hermite polynomials are used to obtain an expression for the correlation coefficient and to prove that correlation is smaller in the PN scale than the original scale. Fréchet bounds are used to obtain expressions for the lower and upper bounds of the correlation coefficient. An algorithm is given to compute the bounds.

We also investigate the efficiency of tests after a power transformation. In particular, we consider the one-sample test of location and study the gains in efficiency for the one-sample t-test following a Box-Cox transformation. We prove that the asymptotic relative efficiency of the transformed univariate t-test and Hotelling's test of multivariate location with respect to the same statistics based on untransformed data is at least one. We also study the efficiency of the correlation coefficient following a Box-Cox transformation. We prove that much stronger conclusions can be reached about the independence of the margins of bivariate normal variates once they have been transformed with a Box-Cox transformation.
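For readers unfamiliar with the transformation discussed above, the following minimal sketch implements the usual one-parameter Box-Cox transform for positive data, (y^λ - 1)/λ for λ ≠ 0 and log y for λ = 0, together with its inverse. It is an illustration only and not the speaker's estimation procedure; the function names `box_cox` and `inverse_box_cox` are hypothetical.

```python
import math


def box_cox(y, lam):
    """One-parameter Box-Cox transform of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox transform requires y > 0")
    if lam == 0:
        return math.log(y)  # limiting case as lam -> 0
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("z is outside the range of the transform")
    return base ** (1.0 / lam)


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        z = box_cox(3.0, lam)
        print(lam, z, inverse_box_cox(z, lam))
```

With λ = 1 the transform is a shift of the data, and with λ = 0 it is the logarithm, which is why the resulting family includes both normal and log-normal distributions.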

------------------------------------------------------------

Title: Data Mining---Three and a Half Issues

Speaker: Dr. David Banks
Center for Biologics Evaluation, U.S. FDA
Date: March 28, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
-----------------------------------------------------------

Data mining subsumes many problem areas. This talk describes (1) issues that arise in preanalysis, which largely concerns data quality, (2) the problem of selecting an appropriate data mining tool (the focus is upon choosing methods for nonparametric multivariate regression), and (3) estimation of local dimensionality. Additionally, there is a brief discussion of the indexing problem, which concerns the creation of a useful summary of complex multivariate data sets.

-----------------------------------------------------------

Title: Maximin Efficiency Robust Tests for the Focused Clustering of Disease

Speaker: Mr. Pablo Bonangleino
Department of Statistics, The George Washington University
Date: April 4, 2003
Location: Funger Hall 308
Time: 11:00-12:00 Noon
------------------------------------------------------------

Score tests are among the most powerful of tests for focused clustering. However, their dependency on a specific stochastic model and parameterization of the exposure indicates that they will be inefficient if these assumptions are far from the truth. To address this issue, we develop maximin efficiency robust tests (MERTs) for candidate sets of possible exposures. The power of several MERTs and single tests is compared through simulation. In addition, the power of MERTs and single tests is examined for a departure from the standard Poisson assumption, namely the zero-inflated or Poisson with added zeros model. Variations of the tests, adequate for this model, are proposed. Finally, we present an application of the methods to leukemia incidence data from upstate New York.

-------------------------------------------------------------

Title: Issues in Measurement and Analysis of Health Related Quality of Life

Speaker: Professor Mounir Mesbah
Université de Bretagne-Sud, Vannes, France
Date: April 4, 2003
Location: Funger Hall 321
Time: 5:00-6:00 pm
------------------------------------------------------------

Health Related Quality of Life surveys generally deal with two kinds of data: data recorded during an exploratory or validation step, in order to help with the construction (definition) of variables and indicators, and data recorded during an analysis step in order to investigate the evolution of the distribution of the previously constructed variables between various populations, times and areas.

These are generally two well-separated steps in the research process of a scientist in the field of Health Related Quality of Life, Environment or any other. The first step generally deals with measurement, calibration and metrology of variables, and the most commonly used statistical methods are multivariate exploratory analysis and structural models, such as factor analysis models or item response theory models. The second step is certainly better known to inferential statisticians. Linear, generalized linear, time series and survival methods (and models) are very useful in this step. The variables constructed in the first step are incorporated in this second step, and their joint distribution with the other analysis variables (treatment group, time, duration of life, etc.) is investigated. In this talk, I will compare the simple strategy of separating the two steps with the one defining and analysing a global model including both the measurement and the analysis step. I will illustrate the issue with a real example in oncology, where the main goal is the analysis of the joint distribution of Survival and Quality of Life of cancer patients randomized in two treatment groups during a clinical trial.

-------------------------------------------------------------

Title: The Effect of Statistical Dependence on Inferences from Binomial Data

Speaker: Professor Weiwen Miao
Department of Mathematics and Computer Science, Macalester College, Minnesota
Date: April 11, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
------------------------------------------------------------

The talk describes the effect of statistical dependence on tests and confidence intervals for the parameter p, the success probability of a binomial random variable. The problem was motivated by a jury discrimination case, Moultrie v. Martin, in which half the grand jurors served a second year. Hence, the racial compositions of the grand juries in consecutive years were no longer statistically independent. The first part of the talk concentrates on the effect of the dependence on hypothesis testing. It will be shown that ignoring dependence not only made the statistical evidence of discrimination appear stronger than it truly was but also exaggerated the power of the test used to determine the possible discrimination. Both the exact distribution of the number of "successes" and its normal approximation are compared in order to provide a practical condition for the use of the approximation. The second part of the talk focuses on the effect of dependence on confidence intervals for a population proportion. When observations are dependent, even slightly, the coverage probability of virtually all the confidence intervals in the literature can deviate noticeably from their nominal level. We proposed and examined several modified confidence intervals. Our results showed that the modified Wilson interval performs well and can be recommended for general use.
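As background, the sketch below computes the standard Wilson score interval for a binomial proportion under independent observations. The abstract does not specify the modification that accounts for dependence, so this is the unmodified textbook interval and not the speaker's proposal; the function name `wilson_interval` and the default normal quantile for a 95% level are illustrative choices.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Standard Wilson score interval for a binomial proportion.

    Assumes independent trials. z is the standard normal quantile for
    the desired level (the default corresponds to a two-sided 95% level).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    print(wilson_interval(5, 10))
```

Unlike the simple Wald interval, the Wilson interval always stays within [0, 1] and is centered on a value shrunk toward 1/2.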

------------------------------------------------------------------

Title: Semiparametric AFT for survival data: inference, implementation and theory

Speaker: Professor Zhiliang Ying
Department of Statistics, Columbia University
Date: April 18, 2003
Location: Funger Hall 307
Time: 11:00-12:00 Noon
------------------------------------------------------------

The accelerated failure time (AFT) model is an equivalent version of the linear regression model, especially formulated for survival data. Because of its many attractive features and the link to classical linear regression, many efforts have been made in the past two and a half decades to develop inference procedures as well as related theory. The first part of this talk will be an overview of these efforts, including the difficulties encountered. In the second part, some significant recent developments will be described. Special attention will be given to implementation.

------------------------------------------------------------------

Title: Convergence analysis of the least-squares estimates for infinite AR models

Speaker: Professor Yulia R. Gel
Department of Statistics, University of Washington
Date: April 29, 2003
Location: Funger Hall 321
Time: 11:00-12:00 Noon
------------------------------------------------------------

The standard parameter estimation methods usually assume that the true underlying model of the observed process is a finite AR, MA or mixed ARMA equation. However, this assumption can rarely be justified in practice. A common approach in system identification is to approximate the true model by a finite AR model. A large literature is devoted to the optimal selection of the order of the approximating model and to the analysis of selection criteria such as AIC, BIC, PLS and others. Our approach is the reverse: we first consider the parameter estimation problem for an infinite AR model and then return to the analysis of optimal order selection. This talk focuses on the convergence analysis of the regularized least-squares method for the infinite case. A result on the rate of a.s. convergence of the infinite LS estimates is established. In addition, a complementary result on the convergence of semi-martingales is presented, which is a cornerstone of the proofs of all theorems here and is also of independent interest. The proposed identification procedure is evaluated by simulations.

------------------------------------------------------------

Title: Scaled Boolean Algebras in the Foundations of Probability

Speaker: Dr. Michael Hardy
Department of Statistics, University of Toledo, Ohio

Date: May 6, 2003
Location: Funger Hall 321
Time: 10:00-11:00 am
------------------------------------------------------------

This topic could be considered a part of the theory of comparative probability orderings. Typically in that theory one assumes a sort of weak additivity that says that if A is less probable than B, and B and C are disjoint, then [A or C] is less probable than [B or C]. But we avoid additivity axioms because this work arose from an examination of how to justify conventional probability axioms when probabilities are construed as degrees of belief rather than as frequencies or proportions or the like. To a Boolean algebra of uncertain propositions we assign degrees of belief that are members of a partially ordered "scale". We ask under what conditions the scale should consist of real numbers and the sum and product rules of probability should hold.

------------------------------------------------------------

Title: Infinitely Divisible Time Series Models

Speaker: Dr. Xuefeng Li
Department of Statistics, University of Pennsylvania

Date: May 12, 2003
Location: Funger Hall 321
Time: 11:00-12:00 Noon
------------------------------------------------------------

Motivated by a project analyzing call center data, we study time series models with infinitely divisible marginal distributions. Existing models, though similar in form to the classical ARMA model, are highly restrictive. In this work we propose two new constructions. The first one comes from the construction of multivariate random variables with infinitely divisible margins and gives a more flexible moving average structure. The second one is based on the integration of Gamma random fields and gives continuous stationary stochastic processes with Gamma margins. Most of the properties of these new constructions carry over to the family of infinitely divisible distributions. Estimation procedures as well as their asymptotic properties are investigated. Open questions and future research directions are discussed.

This is joint work with Dr. Lawrence Brown and Dr. Robert Wolpert.

The series hosts a seminar about twice a month on current research topics. The seminar often features an invited guest speaker and occasionally local faculty members, students or others affiliated with the department. The usual time of the seminar is 11:00 a.m. on Fridays. Professor Reza Modarres (e-mail: reza@gwu.edu) is the Seminar Series Coordinator.


--------------------------------------------------------------------------------
The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.

 
 
 
   