Seminar Announcements for Spring 2001
------------------------------------------------------------
Title: Model Selection via Out-of-Sample Forecast Error Comparisons: Practice and Theory
SPEAKER: Dr. David F. Findley
Bureau of the Census
Statistical Research Division
DATE: January 26, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.

------------------------------------------------------------

Many monthly economic time series can be effectively modeled with autoregressive moving average models that include regressors for the effects of moving holidays and/or other calendar effects or outliers. Such models are fit to some Box-Cox transformation of the observed data. The comparison of two competing models with different transformations, different holiday interval lengths, different outliers, or with autoregressive versus moving average components, is a non-nested comparison: neither model is a special case of the other. There are no practical hypothesis tests for such comparisons. In this situation, a natural approach is to compare the models' abilities to forecast the most recent data, after excluding this data from the data span used to estimate model parameters. We will show empirical results demonstrating the versatility of a forecast comparison diagnostic that implements this idea and is available in the Census Bureau's X-12-ARIMA and X-12-Graph software. Then we will present a new theoretical result that suggests some of the diagnostic's observed behavior and does not require the assumption that any model considered is correct.
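The idea of comparing models by their forecasts of held-out recent data can be sketched as follows. This is only an illustration under stated assumptions, not the X-12-ARIMA diagnostic: the two forecasters (a constant-mean model and a least-squares AR(1)) and the function names are hypothetical stand-ins, and parameters are re-estimated at each forecast origin from the data before it.

```python
import numpy as np

def mean_forecast(history):
    """One-step forecast from a constant-mean model (illustrative stand-in)."""
    return float(np.mean(history))

def ar1_forecast(history):
    """One-step forecast from an AR(1) with intercept, fit by least squares."""
    x, y = history[:-1], history[1:]
    design = np.column_stack([np.ones_like(x), x])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    return float(coef[0] + coef[1] * history[-1])

def holdout_squared_errors(series, n_holdout, forecaster):
    """Squared one-step forecast errors over the last n_holdout points.

    Each forecast uses only observations before the point being forecast,
    so the held-out data never enter parameter estimation for that forecast.
    """
    series = np.asarray(series, dtype=float)
    errors = []
    for origin in range(len(series) - n_holdout, len(series)):
        forecast = forecaster(series[:origin])
        errors.append((series[origin] - forecast) ** 2)
    return np.array(errors)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=200)
    data = np.empty(200)
    data[0] = noise[0]
    for t in range(1, 200):
        data[t] = 0.7 * data[t - 1] + noise[t]
    # The model with the smaller accumulated error is preferred by this criterion.
    print(holdout_squared_errors(data, 24, mean_forecast).sum())
    print(holdout_squared_errors(data, 24, ar1_forecast).sum())
```

Neither forecaster needs to be nested in the other, or correct, for the comparison to be computed, which is the appeal of the approach for non-nested model pairs.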

------------------------------------------------------------
Title: Recursive Estimation for Misspecified MA(1) Models
SPEAKER: Mr. Jim Cantor
Department of Statistics
DATE: February 9, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.
------------------------------------------------------------

In this talk, new results for recursive parameter estimation for misspecified models are presented. "Recursive" means that the parameter estimate for the series at time (and length) t is obtained as a function of the parameter estimate at t-1 and the data value at time t. "Misspecified" means that the model type fitted does not match the system generating the data. Specifically, we investigate the situation in which a first order moving average model, MA(1), is fit to data from a stationary first order autoregression, AR(1), using two standard recursive estimation methods: pseudolinear regression (PLR) and (monitored) recursive maximum likelihood (RML). Using a minimum mean squared one-step-ahead forecast error criterion, we show that PLR converges almost surely but to a non-optimal parameter value. For monitored RML, we show that the optimal parameter value is a cluster point for the recursion almost surely. These results represent the first rigorous analysis of PLR and monitored RML in the misspecified model situation.
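To make the word "recursive" concrete, here is a minimal sketch of a pseudolinear regression recursion for the MA(1) model y_t = e_t + theta e_{t-1}. It is a textbook-style scalar form, not necessarily the exact variant analyzed in the talk; the function name, the starting values theta0 and p0, and the use of the a posteriori residual are assumptions made for illustration.

```python
import numpy as np

def plr_ma1(y, theta0=0.0, p0=100.0):
    """Return the path of recursive PLR estimates of the MA(1) coefficient.

    At each time t the new estimate depends only on the previous estimate,
    the previous residual, and the new data value y[t].
    """
    theta = theta0
    p = p0                 # scalar analogue of the recursive least-squares "P" quantity
    prev_resid = 0.0       # residual standing in for the unobserved e_{t-1}
    path = np.empty(len(y))
    for t, y_t in enumerate(y):
        phi = prev_resid                     # pseudo-regressor
        pred_err = y_t - theta * phi         # one-step-ahead prediction error
        p = p / (1.0 + p * phi * phi)        # gain update
        theta = theta + p * phi * pred_err   # parameter update
        prev_resid = y_t - theta * phi       # a posteriori residual (a design choice)
        path[t] = theta
    return path

if __name__ == "__main__":
    # Misspecified setting from the abstract: data from an AR(1), model is MA(1).
    rng = np.random.default_rng(1)
    noise = rng.normal(size=5000)
    y = np.empty(5000)
    y[0] = noise[0]
    for t in range(1, 5000):
        y[t] = 0.5 * y[t - 1] + noise[t]
    print(plr_ma1(y)[-1])
```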

------------------------------------------------------------
Title: Random Walks on Wreath Products of Groups
Speaker: Professor Clyde Schoolfield
Harvard University
DATE: February 16, 2001
LOCATION: Funger Hall 321
TIME: 11:00 a.m.
------------------------------------------------------------

For a certain random walk on the symmetric group S_n that is generated by random transpositions, Diaconis and Shahshahani (1981) obtained bounds on the rate of convergence to uniformity using group representation theory. Similarly, we bound the rate of convergence to uniformity for a random walk on the hyperoctahedral group Z_2 | S_n that is generated by random signed transpositions. Specifically, we determine that, to first order in n, (1/2) n log n steps are both necessary and sufficient for total variation distance to become small. Moreover, we show that our walk exhibits the so-called "cutoff phenomenon." We extend our results on this random walk to the generalized symmetric groups Z_m | S_n and further to the complete monomial groups G | S_n for any finite group G. As an example, we will describe an application of our results to mathematical biology.
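For very small n, the total variation distance to uniformity can be computed exactly by brute force, which gives a feel for the quantity being bounded. The sketch below covers only the random transposition walk on S_n (two positions chosen independently and uniformly, then swapped), not the signed walk of the talk; the function name is hypothetical, and the enumeration of all n! permutations is feasible only for tiny n.

```python
from itertools import permutations

def tv_random_transpositions(n, steps):
    """Exact total variation distance to uniform after 0, 1, ..., steps moves."""
    perms = list(permutations(range(n)))
    index = {p: k for k, p in enumerate(perms)}
    # One move: pick positions i and j independently and uniformly, then swap.
    # i == j leaves the arrangement unchanged (probability 1/n).
    moves = [(i, j) for i in range(n) for j in range(n)]
    uniform = 1.0 / len(perms)

    dist = [0.0] * len(perms)
    dist[index[tuple(range(n))]] = 1.0       # start at the identity
    tv = [0.5 * sum(abs(q - uniform) for q in dist)]

    for _ in range(steps):
        new = [0.0] * len(perms)
        for k, p in enumerate(perms):
            if dist[k] == 0.0:
                continue
            share = dist[k] / len(moves)
            for i, j in moves:
                q = list(p)
                q[i], q[j] = q[j], q[i]
                new[index[tuple(q)]] += share
        dist = new
        tv.append(0.5 * sum(abs(q - uniform) for q in dist))
    return tv

if __name__ == "__main__":
    for step, distance in enumerate(tv_random_transpositions(4, 10)):
        print(step, round(distance, 6))
```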

------------------------------------------------------------
Title: Options and Discontinuity: An Asymptotic Decomposition for Trading Algorithm

Speaker: Seongjoo Song
Department of Statistics, University of Chicago

Date: February 23, 2001
Location: Funger Hall 321
Time: 11:00 a.m.
------------------------------------------------------------

The problem of hedging contingent claims is well understood in a complete financial market. In such a market, any contingent claim can be replicated exactly by trading available securities with large enough initial capital. On the other hand, the risk of any option cannot be hedged away completely when the market is incomplete. There are many different causes of incompleteness. Among them, discontinuity of the underlying asset price process is a very important cause.

This is because the discontinuous model fits the data better than any continuous model, and in particular because it incorporates such very real phenomena as crashes and devaluations, which can upset any trading strategy. This paper studies the problem of option pricing and hedging in the presence of such discontinuities by adopting an asymptotic approach, letting securities prices converge to continuous processes. We then study the first order error in this convergence. The first order error term after we hedge an option with the classical Black-Scholes strategy is decomposed into a part which can be traded away and a part which is purely unreplicable. First, I modify the Black-Scholes hedging strategy by adding the replicable part of the first order error and secondly, I adopt the mean-variance hedging method of Duffie and Richardson (1991) and Schweizer (1992) for the nonreplicable part. Under some regularity conditions, a closed form solution is obtained for the hedging strategy which minimizes the mean square of the hedging error. I also propose several approaches to pricing a contingent claim and compare their performance. In addition to assuming continuous time hedging, in this setting, I also study the properties of hedging at intervals, as the length of such intervals goes to zero. Some results of simulation and real market data application are also provided. In simulation, we see that the new hedging strategy improves on the classical Black-Scholes hedging strategy by up to 30% in terms of the mean square of hedging error, when the distribution of log stock price is skewed.
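For reference, the classical Black-Scholes strategy that serves as the baseline holds "delta" units of the underlying asset at each instant. A minimal sketch of the standard formulas for a European call on a non-dividend-paying asset follows; the function and parameter names are chosen here for illustration, and none of the modified strategy described above is implemented.

```python
from math import erf, exp, log, sqrt

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, maturity):
    """Return (price, delta) of a European call under Black-Scholes.

    spot: current asset price; strike: exercise price; rate: continuously
    compounded risk-free rate; vol: volatility; maturity: time to expiry in years.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    price = spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)
    delta = norm_cdf(d1)   # units of the asset held in the replicating portfolio
    return price, delta

if __name__ == "__main__":
    print(black_scholes_call(100.0, 100.0, 0.0, 0.2, 1.0))
```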

------------------------------------------------------------
Title: A method of moments for random recursive structures
Speaker: Professor Hsien-Kuei Hwang
Academia Sinica
DATE: March 2, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.
------------------------------------------------------------

I will present a method of moments that is very useful for random variables defined in some recursive manner. Many examples including m-ary search trees and quickselect will be used to describe the method. (The method of moments is a "traditional" way of deriving limit laws; its application, although primitive by modern probability standards, has several advantages, especially when applied to recursive random variables.)

------------------------------------------------------------
Title: Recent Advances in Ranked Set Sampling
Speaker: Professor Ram Tiwari
Department of Mathematics
DATE: March 9, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.
------------------------------------------------------------

The ranked set sampling procedure is a two-step sampling scheme in which a subgroup of independently sampled items is collected and ranked, but only one item from the subgroup is chosen for complete measurement. The item’s rank within the subgroup is noted, so the final sample consists of independent order statistics. If the subgroup sizes are identical (say n) and each of the n different order statistics is sampled in equal proportion, the ranked set sample is said to be balanced. The talk consists of two parts. In the first part, we consider the underlying population to be a member of the location-scale families of symmetric distributions, and derive unbiased estimators of the population mean and variance. In the second part, we assume that the underlying distribution is unknown and modeled nonparametrically, and derive its Bayes estimator with respect to an ordered Dirichlet distribution as prior.
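The balanced scheme described above can be simulated in a few lines. This is a sketch under simplifying assumptions: ranking is done on the true values (perfect ranking), the population is supplied as a hypothetical `draw(rng, size)` function, and the plain sample mean is shown rather than the estimators derived in the talk.

```python
import numpy as np

def balanced_rss(draw, set_size, cycles, rng):
    """Draw a balanced ranked set sample.

    For each cycle and each rank r = 1..set_size, a fresh subgroup of
    set_size items is drawn and ranked, and only the r-th smallest item
    is kept. Returns the measured values and their within-subgroup ranks.
    """
    values, ranks = [], []
    for _ in range(cycles):
        for r in range(set_size):
            subgroup = np.sort(draw(rng, set_size))  # perfect ranking assumed
            values.append(subgroup[r])
            ranks.append(r + 1)
    return np.array(values), np.array(ranks)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal_draw = lambda g, k: g.normal(loc=10.0, scale=2.0, size=k)
    values, ranks = balanced_rss(normal_draw, set_size=3, cycles=50, rng=rng)
    print(values.mean())   # simple estimate of the population mean
```

Each rank appears exactly `cycles` times, which is what makes the sample balanced.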

------------------------------------------------------------
Title: Classifying Tumors and Assessing the Survival of Tumor Patients using Microarray Gene Expression Data

Speaker: Danh Nguyen
University of California at Davis


DATE: March 14, 2001
LOCATION: Funger Hall 321
TIME: 11:00 a.m.
------------------------------------------------------------

The introduction of DNA microarray technology is a technical advance in biomedical research. Specifically, the use of microarray technology, such as complementary DNA (cDNA) and oligonucleotide arrays, allows simultaneous monitoring of thousands of gene expressions per sample. Data from microarray experiments present a data analytical or methodological challenge, since the number of variables (genes) far exceeds the number of samples. In this talk, we explore the use of dimension reduction methods in conjunction with classification methods for classifying tumor types based on array gene expression data. The primary dimension reduction methods considered are partial least squares (PLS) and principal components analysis (PCA). When survival times of patients are tracked, it is also of interest to estimate the survival probabilities of patients following certain gene expression patterns (profiles). We illustrate the methods on various microarray gene expression data sets: (1) ovarian, (2) acute leukemia, (3) B-cell lymphoma, and (4) colon data sets.

------------------------------------------------------------
Title: Computational Sequence Analysis: Genome and Statistical Controversies

Speaker: Professor Pranab K. Sen
Department of Biostatistics and Statistics
DATE: March 16, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.
------------------------------------------------------------

Computational biology is an interdisciplinary field; principles of molecular genetics govern computational sequence analysis. For human genome sequences, we encounter some nonstandard statistical models where high-dimensional categorical data models crop up, often, without perceptible quantitative undercurrents. As such, conventional (continuous or discrete) multivariate analysis may encounter computational as well as conceptual difficulties. Limitations of (conditional-, partial-, profile-, pseudo-, and quasi-) likelihoods are appraised in this context; without an acceptable topology that defines neighborhoods, for statistical modeling and analysis, there might not be enough incentive to pursue a parametric (likelihood) approach. Bayesian perspectives fare better, though there may be some concern from validity and robustness considerations. Alternatives that take into account underlying biological implications to a greater (and parametrics to a lesser) extent are appraised and advocated on a case by case basis.

------------------------------------------------------------
Title: The Statistics Department and The Biostatistics Center: Past Collaborations and Discussion of Future Possibilities

Speaker: Dr. Sarah Fowler
Department of Statistics and Biostatistics Center of the George Washington University

DATE: March 30, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.

-----------------------------------------------------------

The purpose of the presentation is to give a "social history" of collaborations between Biostatistics Center (BSC) faculty and regular faculty of the Department of Statistics (DOS) and to stimulate discussion about how to foster future collaborations. Topics to be covered include: a description of the current BSC, its research projects and activities; the grant development and review process; and a chronology of BSC administration, teaching and collaboration with DOS faculty from 1972 to the present. In particular, the presentation will describe the nature and productivity of 7 statistical methods grants, and how the involvement of DOS regular faculty and doctoral students in the BSC research projects has led to collaborations on statistical theory and methods applicable to clinical trials.

------------------------------------------------------------
Title: Survey Sampling Methodology and Analysis of Complex Survey Data

Speaker: Dr. Leyla Mohadjer, Westat Corporation

DATE: April 20, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.
------------------------------------------------------------

This talk will provide a general overview of recent developments in survey research and methodology with more emphasis on the specific areas I have worked on recently. Most of the research and methodology in survey sampling was developed in the twentieth century. By the 1970’s, major surveys were undertaken to meet the needs of statistical agencies and researchers. The field has expanded at a very rapid pace in the past thirty years, especially as researchers and government agencies have learned about the value of surveys in achieving their goals. The main methods of sample design (stratification, clustering, multistage sampling, etc.) are described in textbooks published in the 1950’s. The recent developments include refinements and extensions of these methods. The focus of the research is to derive efficient sample designs and data collection procedures. The talk will include a number of examples of recent developments in survey sampling methodology.

One of the challenges facing survey practitioners is survey nonresponse. There has been increasing concern that nonresponse rates are rising; that is, over time it has become more difficult to obtain cooperation. As a result, greater efforts are made in the field to increase response rates, and a sizable number of experiments have been conducted to test various approaches to improving them. The talk will include descriptions of the results of a couple of these experiments. For data analysis, most standard techniques used in statistical packages assume that observations are independent and drawn using simple random sampling, and that all sampled cases have participated in the survey. From these assumptions, classical statistical theory has developed a wide variety of estimators that are valid under these conditions. These requirements are often not met in sample surveys, since it is usually cost effective to select samples through a complex multi-stage design (e.g., involving stratification, clustering of units, and the use of several stages of selection) rather than through simple random sampling. Once a sample departs from simple random sampling, and in the presence of nonresponse, new computational procedures are required to take into account the impact of survey design and nonresponse on statistical estimation. The talk will include descriptions of a number of statistical software packages currently available for analysis of data from complex surveys.

-------------------------------------------------------------

Title: Middle-censoring and Applications

Speaker: Professor S. Rao Jammalamadaka
Department of Statistics, University of California at Santa Barbara

DATE: April 18, 2001
LOCATION: Funger Hall 308
TIME: 11:00 a.m.
------------------------------------------------------------

In connection with survival analysis, there is considerable literature which treats data that is censored from the left, right or both. In this talk, we consider situations where the data becomes unobservable if it falls inside a random interval in the middle. This happens in clinical trials and lifetime studies where a subject is temporarily absent or withdrawn from the study and the event of interest occurs during this period, so that the exact time of occurrence cannot be observed. Both left and right censoring are special cases of such "middle-censoring." The nonparametric maximum likelihood estimator of the survival function is derived in this context as a solution to the self-consistency equation and its large-sample properties discussed.

-------------------------------------------------------------

Title: Symmetry and the Covariance Structure of Ordered Dependent Observations

Speaker: Dr. Marlos Viana
Eye Research Institute, The University of Illinois at Chicago

DATE: April 26, 2001
LOCATION: Funger Hall 307
TIME: 3:00 p.m.
------------------------------------------------------------

In the analysis of data from bilateral biological processes (e.g., vision, hearing) it is often required to model the vector of ordered joint observations and its relation to one or more covariates. In this talk we will discuss the covariance structure of ordered dependent observations under a class of permutation and block-permutation symmetric covariance structures. Applications include the analysis of joint extreme (best, worst) observations from dependent bilateral measurements. The covariance structure of ordered, cyclically-symmetric dependent observations will also be discussed. Applications include the analysis of extreme observations from corneal curvature topographic maps. Related reading can be found at http://www.uic.edu/~viana/

-------------------------------------------------------------

Title: Analyzing gene expression data from microarrays: a mixture-based approach

Speaker: Professor Francesca Chiaromonte
Department of Statistics, Pennsylvania State University

DATE: May 4, 2001
LOCATION: Funger Hall 321
TIME: 11:00 a.m.
------------------------------------------------------------

The analysis of global gene expression data from microarrays is breaking new ground in genetics research, while confronting modelers and statisticians with critical issues related to size, exploration, modeling and error management. Clustering of expression profiles, as a means of identifying functionally related and possibly co-regulated genes, has been the focus of much literature to date. We use a clustering scheme based on multivariate normal mixtures that allows us to (i) robustify the analysis through the introduction of a contamination term, (ii) blend exploration and modeling through the use of free and constrained means, and (iii) provide cluster membership probabilities, as opposed to simple memberships, for the genes. Maximum likelihood estimation of the parameters is performed via the EM algorithm. We present some preliminary results on published data comparing k-means clustering to mixture-based clustering whose likelihood maximization was initialized through k-means memberships.
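The combination of k-means initialization, EM iterations, and soft membership probabilities can be illustrated with a deliberately simplified sketch. It assumes univariate data and an ordinary normal mixture; the contamination term, the constrained means, and the multivariate structure of the scheme described above are not implemented, and all function names are hypothetical.

```python
import numpy as np

def kmeans_1d(x, k, iters=20):
    """Plain k-means on the line; returns hard cluster labels."""
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return labels

def em_gaussian_mixture_1d(x, k, iters=50):
    """EM for a k-component normal mixture, started from k-means memberships.

    Returns mixing proportions, means, variances, and an (n, k) array of
    membership probabilities (one row per observation).
    """
    x = np.asarray(x, dtype=float)
    resp = np.eye(k)[kmeans_1d(x, k)]      # hard memberships as starting point
    for _ in range(iters):
        # M-step: weighted proportions, means, and variances.
        weight = resp.sum(axis=0) + 1e-12
        mix = weight / len(x)
        mean = (resp * x[:, None]).sum(axis=0) / weight
        var = (resp * (x[:, None] - mean) ** 2).sum(axis=0) / weight + 1e-6
        # E-step: posterior membership probabilities, computed in log space.
        log_post = np.log(mix) - 0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mean) ** 2 / var)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
    return mix, mean, var, resp

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 1.0, 100)])
    mix, mean, var, resp = em_gaussian_mixture_1d(data, 2)
    print(mix, mean, var)
    print(resp[:5])   # soft memberships for the first five observations
```

Unlike the k-means labels used to start it, the fitted mixture reports a probability for each observation and cluster, so ambiguous observations are visible as rows that are not close to 0 or 1.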


--------------------------------------------------------------------------------
The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.

 

 
 
 
   