September 11, 2009
Title: Gaussian phases in generalized coupon collection
Speaker:
Hosam Mahmoud,
Department of Statistics, George Washington University
Abstract:
A generalized coupon collection problem is considered in which a random number
of purchases occurs at each stage of collecting n, a large number of coupons.
We address the question of how many different coupons have been collected after
k=k(n) draws, as n tends to infinity. We identify three phases of k(n): the
sublinear, the linear, and the superlinear. In the sublinear phase we see o(n)
different coupons, and with true randomness in the number of purchases, under
the appropriate centering and scaling, a Gaussian distribution is obtained
across the entire phase. However, if the number of purchases is fixed, a
degeneracy comes into the picture and normality holds only at the higher end of
this phase. In the case of a number of purchases with a fixed range, the small
number of different coupons collected in the sublinear phase is upgraded to a
number in need of centering and scaling to become normally distributed in the
linear phase with a different normal distribution of the type that appears in
the usual central limit theorems. The Gaussian results are obtained via
martingale theory. We say a few words in passing about the high probability of
collecting nearly all the coupons in the superlinear phase. It is our aim to
present the results in a way to explore the critical transition at the "seamlines"
between different Gaussian phases, and between these phases and other nonnormal
phases. A similar development in the Ehrenfest model (an urn model for gas mixing)
will be mentioned.
Date: Friday, September 11, 2009
Time: 11:00am-12:00pm
Location: MPA Building, Room 309 (805 21st Street, NW, Washington, DC 20052)
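As a rough illustration of the three phases named in the abstract above, the following sketch covers only the classical special case in which exactly one coupon is drawn uniformly at random per purchase. It is not the speaker's generalized model. The function names and the choices of k (about sqrt(n), n, and 10n) are assumptions made for this example. It compares the exact mean number of distinct coupons, n(1 - (1 - 1/n)^k), with a simulated average.

```python
import random


def expected_distinct(n, k):
    """Exact mean number of distinct coupons after k uniform draws from n types."""
    return n * (1.0 - (1.0 - 1.0 / n) ** k)


def simulate_distinct(n, k, rng):
    """Draw k coupons uniformly at random and count the distinct types seen."""
    return len({rng.randrange(n) for _ in range(k)})


def mean_distinct(n, k, trials=2000, seed=0):
    """Average the simulated number of distinct coupons over several trials."""
    rng = random.Random(seed)
    return sum(simulate_distinct(n, k, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 1000
    # Illustrative choices of k for each phase (assumptions, not from the talk).
    for label, k in [("sublinear", 30), ("linear", 1000), ("superlinear", 10000)]:
        exact = expected_distinct(n, k)
        simulated = mean_distinct(n, k)
        print(f"{label:12s} k={k:6d} exact={exact:8.2f} simulated={simulated:8.2f}")
```

In this special case the sublinear phase collects almost k distinct coupons, the linear phase collects a constant fraction of n, and the superlinear phase collects nearly all n.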
September 25, 2009
Title: Filling the Gap: Introducing the Conway-Maxwell-Poisson regression for count data
Speaker:
Kimberly Sellers,
Department of Mathematics, Georgetown University
Abstract:
Poisson regression is a popular tool for modeling count data and is applied in a vast
array of applications from the social to the physical sciences. Real data, however,
are often over- or under-dispersed and, thus, are not conducive to Poisson regression.
Further, the dispersion present in the data may vary depending on other explanatory
components. We propose a generalized regression model based on the Conway-Maxwell-Poisson
(CMP) distribution to address this problem. The CMP regression generalizes the well-known
Poisson and logistic regression models, and is suitable for fitting count data with a wide
range of dispersion levels. Further, we extend this model approach to consider a model
structure for the dispersion parameter as well to better understand the dispersion
component. With a GLM approach that takes advantage of exponential family properties, we
will discuss model estimation, inference, diagnostics, and interpretation. We will also
present hypothesis tests for the implementation of the CMP regression, and a variable
selection technique. We will compare the CMP to several alternatives and illustrate its
advantages and usefulness using datasets with varying types and levels of dispersion.
This talk is based on joint work with Galit Shmueli (University of Maryland College Park).
Date: Friday, September 25, 2009
Time: 11:00am-12:00pm
Location: MPA Building, Room 309 (805 21st Street, NW, Washington, DC 20052)
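To make the dispersion behavior described in the abstract above concrete, here is a minimal sketch of the Conway-Maxwell-Poisson probability mass function in its standard form, P(Y = y) proportional to lambda^y / (y!)^nu. This is only the distribution, not the regression model from the talk. The function names, the truncation of the normalizing sum at `max_terms`, and the parameter values are assumptions for illustration.

```python
import math


def cmp_probs(lam, nu, max_terms=200):
    """CMP probabilities for y = 0..max_terms-1, with the normalizing sum truncated.

    Assumes lam > 0 and nu > 0, and that max_terms is large enough for the
    tail beyond it to be negligible.
    """
    # Work with log-weights y*log(lam) - nu*log(y!) for numerical stability.
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_terms)]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    z = sum(w)
    return [x / z for x in w]


def mean_and_variance(probs):
    """Mean and variance of a distribution given as probabilities on 0, 1, 2, ..."""
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(cmp_probs(3.0, nu))
        print(f"nu={nu}: mean={mean:.3f} variance={var:.3f}")
```

With nu = 1 the distribution reduces to the Poisson, so the mean and variance agree. Values of nu below 1 give over-dispersion, and values above 1 give under-dispersion.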
October 9, 2009
(Joint with the Institute for Integrating Statistics and Decision Sciences)
Title: Combining Simulations and Physical Observations to
Estimate Cosmological Parameters
Speaker:
Dave Higdon,
Group Leader, Statistical Sciences, Los Alamos National Laboratory
Abstract:
The Lambda-Cold Dark Matter (LCDM) model of cosmology is perhaps
the simplest model that best describes the makeup and evolution
of the universe in accordance with physical observations. This model
contains up to 20 different cosmological parameters from space and
ground based surveys. These cosmological measurements have reached
a remarkable level of accuracy over the last decade. Future sky
surveys promise to give even more numerous and more accurate data.
However, such data does not inform directly about the cosmological
parameters of interest. Detailed physical simulation models are
typically required to relate information from these surveys to
cosmological parameters of interest.
A Bayesian formulation adapted from Kennedy and O'Hagan (2001) and
Higdon et al. (2008) is used to give parameter constraints from
physical observations and a limited number of simulations. The
framework is based on the idea of replacing the simulator by an emulator
which can then be used to facilitate computations required for the
analysis. In this talk I'll describe an application that uses large
scale structure and Cosmic Microwave Background (CMB) data to inform
about a subset of the parameters controlling the LCDM model.
Date: Friday, October 9, 2009
Time: 4:30-5:30pm
Location: Duques Hall, Room 652 (2201 G Street, NW, Washington, DC 20052)
October 19, 2009
(Joint with the Institute for Integrating Statistics and Decision Sciences)
Title:
Attitudes Toward Firm and Competition: How Do They Matter for CRM Activities?
Speaker:
Nalini Ravishanker,
Department of Statistics, University of Connecticut
Abstract:
Easy availability of information on a customer's transactions with the firm and
the pressure to establish financial returns from marketing investments has led
to a dominance of models that directly connect marketing investments to sales at
the customer level. Customers' attitudes, on the other hand, have always been
assumed to influence customers' reactions to a firm's marketing communications,
but are rarely included in models that determine customer value. We empirically
assess (a) the role of customers' attitudes in determining their value to the
firm, and (b) how knowledge of customer attitudes can influence a firm's customer
management strategy. Specifically, we evaluate which aspects of attitudes, i.e.,
attitudes toward firm or competition, have a bigger effect on customer behavior,
and whether customer attitudes are more important for managing some customers
than others. We use monthly sales call, sales, and survey-based attitude
information collected over three years from the same customers of a multinational
pharmaceutical firm for this study. We develop a hierarchical generalized dynamic
linear model (HGDLM) framework that combines the sales call and sales data that
are available at regular time intervals, with customer attitudes that are not
available at regular intervals, and carry out inference in the Bayesian framework.
Date: Monday, October 19, 2009
Time: 4:00-5:00pm
Location: Funger Hall, Room 520 (2201 G Street, NW, Washington, DC 20052)
October 23, 2009
Title:
Flexible Stepwise Regression: An Adaptive Partition Approach to the Detection of Multiple Change-Points
Speaker:
Yinglei Lai,
Department of Statistics, George Washington University
Abstract:
We present the flexible stepwise regression: an adaptive partition
approach to the detection of multiple change-points. It partitions
a "time course" into consecutive non-overlapping intervals such that
the population means/proportions of the observations in two adjacent
intervals are significantly different at a given level. This is achieved
through a modified dynamic programming algorithm. This method can provide
consistent estimation results. It has a wide range of applications.
A special case of our method is the reduced isotonic regression. Both
simulation and experimental data will be used to illustrate our method.
Date: Friday, October 23, 2009
Time: 3:00-4:00pm
Location: Phillips Hall, Room 108 (801 22nd Street, NW, Washington, DC 20052)
November 13, 2009
Title: Moment Determinacy of Distributions: Some Recent Results
Speaker:
Jordan Stoyanov,
Department of Mathematics and Statistics, Newcastle University
Abstract:
The main discussion will be on distributions and their properties
expressed in terms of the moments which are assumed to be finite.
We describe distributions which are unique (M-determinate) and
others which are non-unique (M-indeterminate). We also show the
practical importance of these properties in areas such as Financial
modelling and Reliability analysis.
We start briefly with classical criteria and turn to very recent
developments based on the so-called Krein-Lin techniques. Thus we
will be able to analyze Box-Cox functional transformations of random
data and characterize the moment determinacy of their distributions.
Distributions of stochastic processes such as the Geometric BM and
the solutions of SDEs will also be considered.
All statements and criteria will be well illustrated by examples
involving popular distributions such as Normal, Skew-Normal, Log-normal,
Skew-Log-Normal, Exponential, Gamma, Poisson, IG, etc. Several facts
will be reported; some of them seem not to be well known and are
a little surprising, even shocking.
The material will be addressed to professionals in Statistics/Probability,
Stochastic modeling and also to Doctoral and Master students in these
areas. If time permits, some open questions will be outlined.
Date: Friday, November 13, 2009
Time: 3:00-4:00pm
Location: Phillips Hall, Room 108 (801 22nd Street, NW, Washington, DC 20052)
November 16, 2009
Title: Combined State and Parameter Estimation in General State-Space Models
Speaker:
Jonathan Stroud, Department of Statistics, George Washington University
Abstract:
This talk considers the problem of combined state and parameter estimation in
general state-space models. Working within the Bayesian framework, we derive
simulation-based (MCMC and sequential Monte Carlo) strategies for filtering, smoothing
and parameter estimation. The approaches are quite general and can be applied to a
wide class of models, including nonlinear, non-Gaussian and continuous-time models.
We illustrate the methods using a stochastic volatility jump-diffusion model and a
dynamic spatio-temporal model.
Date: Monday, November 16, 2009
Time: 4:00-5:00pm
Location: Phillips Hall, Room 111 (801 22nd Street, NW, Washington, DC 20052)
November 19, 2009
(Joint with the Institute for Integrating Statistics and Decision Sciences)
Title: What Data Mining Teaches Me About Teaching Statistics
Speaker:
Dick De Veaux,
Department of Mathematics and Statistics, Williams College
Abstract:
Data mining has been defined as a process that uses a variety of data analysis and modeling
techniques to discover patterns and relationships in data that may be used to make accurate
predictions and decisions. Statistical inference concerns the same problems. Are the two
really different? Through a series of case studies, we will try to illuminate some of the
challenges and characteristics of data mining. Each case study reminds us that the important
issues are often the ones that transcend the methodological choice one faces when solving
real world problems. What lessons can these teach us about teaching the introductory course?
Date: Thursday, November 19, 2009
Time: 4:00-5:00pm
Location: Funger Hall, Room 620 (2201 G Street, NW, Washington, DC 20052)
December 4, 2009
Title: Pseudo-Likelihood: Theory and Applications to Population-Based Genetic Association Studies
Speaker:
Kung-Yee Liang,
Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University
Abstract:
This talk is divided into two parts. In Part I, we review the motivation behind and theoretical
developments of the pseudo-likelihood method, which was originally advocated by Gong and Samaniego
(1981, Annals of Statistics). A series of examples drawn from biomedical studies (Liang and
Self, 1996, JRSSB) are presented illustrating the wealth of its applications. In Part II, we
turn our attention to its application to population-based genetic association studies. One
concern about conducting population-based association studies is population stratification (PS),
i.e., the possibility that a positive association observed between risk of disease and genetic markers is due to
heterogeneity in ethnic makeup between cases and controls. One approach to alleviate this concern
is to use the so-called genomic controls to detect and correct PS. We propose a pseudo-likelihood
approach, which eases several concerns of the conventional likelihood approach including, among
others, computational burden and tangling between parameters of interest and nuisance parameters.
This approach is illustrated through a population-based case-control study on asthma.
Date: Friday, December 4, 2009
Time: 11:00am-12:00pm
Location: MPA Building, Room 309 (805 21st Street, NW, Washington, DC 20052)
December 11, 2009
Title: The High Cost of Weak Forensic Science Practices in the United States
Speaker:
Clifford H. Spiegelman, Department of Statistics, Texas A&M University
Abstract:
At least three recent NRC panels have issued reports that seek to
improve the state of forensic sciences. The FBI stopped the use of
compositional bullet lead at trial completely in 2007, and has recently
sent letters to defense lawyers advising them that faulty evidence was
used in trials where their clients were convicted. This is after
challenges from a former FBI Crime Lab chief metallurgist and colleagues
started, and an NRC report was issued. The NRC report suggested many
important changes that should have been made. These included changing
to proper statistical techniques, and collection of additional data.
Another NRC report dealing with ballistic imaging stated that there is
no statistical foundation for firearm toolmark testimony. That statement
together with longstanding challenges led by Professor Adina Schwartz at
John Jay College of Criminal Justice and colleagues, has led to heavy
restrictions on the use of firearm toolmark evidence in many
jurisdictions. The most recent NRC report urges Congress to create a
National Institute of Forensic Science that is independent
of the DOJ.
This talk will give several examples of the costs of poor forensics from
the assassination of President Kennedy to wrongly convicted
individuals. In addition, it highlights the heavy cost to law
enforcement and hence the public due to properly excluded forensic
evidence. Had the forensic science community, through the DOJ, invested in proper
studies, including proper statistical studies, its evidence would be
able to satisfy legal standards for scientific evidence.
Date: Friday, December 11, 2009
Time: 11:00am-12:00pm
Location: MPA Building, Room 309 (805 21st Street, NW, Washington, DC 20052)
The series hosts a seminar about twice a month on current research
topics. The seminar often features an invited guest speaker and
occasionally local faculty members, students or others affiliated with
the department. The usual time of the seminar is 11:00am on Fridays.
Professors Hosam Mahmoud (hosam@gwu.edu) and
Jonathan Stroud (stroud@gwu.edu)
are the Seminar Series Coordinators.