December 12, 2008
Title: Bayesian Methods in Project Management
Speaker: Fabrizio Ruggeri, CNR IMATI, Italy
Abstract:
Different aspects of project management are illustrated. They are the results
of research projects, still ongoing, and consulting activities which involved
CNR-IMATI, Politecnico di Milano and Universidad Rey Juan Carlos de
Madrid, and a leading Italian company. Major emphasis will be placed on
the bidding process, in which a company needs to estimate the costs and
benefits of taking part in a bid for the construction of an
industrial plant. Three aspects will be considered: forecasts of
construction costs, forecasts of losses due to rare but disruptive events,
and modelling of competitors' behaviour. Finally, we address the issue of
executing activities on time, focusing on forecasts of subcontractors'
deliveries and on critical chain and buffer management.
Date: Friday, December 12, 2008
Time: 3:30-4:30 pm
Location: Duques Hall, Room 453 (2201 G Street, NW, Washington, DC 20052)
December 5, 2008
Title: Generalized Confidence Intervals: Methodology and Applications
Speaker: Thomas Mathew, Department of Mathematics and
Statistics, University of Maryland Baltimore County
Abstract:
The concept of generalized confidence intervals is somewhat recent, and is
useful to obtain confidence intervals for certain "complicated" parametric
functions. The usual confidence intervals are derived using the percentiles
of a pivotal quantity. Generalized confidence intervals are derived based
on a generalized pivotal quantity (GPQ), which is a function of a random
variable, its observed value, and also the parameters. In the talk, I will
explain the construction of a GPQ and will describe the conditions that it
must satisfy. I will then discuss several applications of the generalized
confidence interval methodology for obtaining confidence intervals for a
number of somewhat complicated problems: confidence intervals for (i) the
lognormal mean, (ii) a bioassay problem, and (iii) a problem involving the
bivariate normal distribution. In each case, I will motivate the problem
with specific applications and will also illustrate the results using the
relevant data analysis. Some attractive features of the generalized
confidence intervals are that they are easy to compute and they exhibit
excellent performance even for small sample sizes.
Date: Friday, December 5, 2008
Time: 11:00-12:00 noon
Location: Funger Hall, Room 220 (2201 G Street, NW, Washington, DC 20052)
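To illustrate the GPQ idea in the abstract above for case (i), the lognormal
mean, the following is a minimal Python sketch of the standard construction
from the generalized-inference literature, not necessarily the exact
formulation presented in the talk. It assumes positive i.i.d. lognormal data.
The function name lognormal_mean_gci, the Monte Carlo size draws, and the
sample data are hypothetical choices made here for illustration.

```python
import math
import random
import statistics


def lognormal_mean_gci(data, level=0.95, draws=20000, seed=0):
    """Monte Carlo generalized confidence interval for a lognormal mean.

    With Y = log(X) ~ N(mu, sigma^2), the lognormal mean is
    exp(mu + sigma^2 / 2). The GPQ for mu + sigma^2 / 2 is simulated
    from the observed mean and variance of the logged data.
    """
    rng = random.Random(seed)
    logs = [math.log(x) for x in data]
    n = len(logs)
    ybar = statistics.fmean(logs)
    s2 = statistics.variance(logs)  # sample variance, divisor n - 1
    df = n - 1
    sims = []
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        u = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
        g_sigma2 = df * s2 / u                      # GPQ for sigma^2
        g_mu = ybar - z * math.sqrt(g_sigma2 / n)   # GPQ for mu
        sims.append(g_mu + g_sigma2 / 2.0)          # GPQ for mu + sigma^2 / 2
    sims.sort()
    alpha = 1.0 - level
    lower = sims[int(math.floor(alpha / 2.0 * draws))]
    upper = sims[int(math.ceil((1.0 - alpha / 2.0) * draws)) - 1]
    # Exponentiate to move from the log-mean scale to the lognormal mean.
    return math.exp(lower), math.exp(upper)


# Hypothetical sample, for illustration only.
sample = [1.2, 0.8, 2.5, 1.9, 3.1, 0.6, 1.4, 2.2]
interval = lognormal_mean_gci(sample)
print(interval)
```

The interval endpoints are empirical percentiles of the simulated GPQ values,
so the result depends on the seed and on the number of draws.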
November 21, 2008
Title: Analysis of Multi-Factor Affine Yield Curve Models
Speaker: Siddhartha Chib, Harry C. Hartkopf Professor of Econometrics and
Statistics, Olin Business School, Washington University in St. Louis
Abstract:
In finance and economics, there is a great deal of work on the theoretical
modeling and statistical estimation of the yield curve (defined as the
relation between -log(p_t(tau))/tau and tau, where p_t(tau) is the time t
price of the zero-coupon bond with payoff 1 at maturity date t + tau).
Of much current interest are models in which the bond prices are derived
from a stochastic discount factor (SDF) approach that enforces an important
no-arbitrage condition. The log of the SDF is assumed to be an affine
function of latent and observed factors, where these factors are assumed to
follow a stationary Markov process. In this paper we revisit the question
of how such multi-factor affine models of the yield curve should be fit.
Our discussion is from the Bayesian MCMC viewpoint, but our implementation
of this viewpoint is different and novel. Key aspects of the inferential
framework include (i) a prior on the parameters of the model that is
motivated by economic considerations, in particular, those involving the
slope of the implied yield curve; (ii) posterior simulation of the
parameters in ways to improve the efficiency of the MCMC output, for
example, through sampling of the parameters marginalized over the factors,
and through tailoring of the proposal densities in the Metropolis-Hastings
steps using information about the mode and curvature of the current target
based on the output of a simulated annealing algorithm; and (iii) measures
to mitigate numerical instabilities in the fitting through
reparameterizations and square root filtering recursions. We apply the
techniques to explain the monthly yields on nine US Treasuries (with
maturities ranging from 1 to 120 months) over the period January 1986 to
December 2005. The model contains three factors, one latent and two
observed. We also consider the problem of predicting the nine yields for
each month of 2006. We show that the (multi-step ahead) prediction regions
properly bracket the actual yields in those months, thus highlighting the
practical value of the fitted model.
Date: Friday, November 21, 2008
Time: 10:45-11:45am
Location: Duques Hall, Room 552 (2201 G Street, NW, Washington, DC 20052)
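The yield definition in the abstract above, -log(p_t(tau))/tau, can be made
concrete with a short Python sketch. The function name zero_coupon_yield and
the bond prices below are hypothetical illustration values, not data from the
paper. Measuring tau in years is also an assumption made here (the abstract
quotes maturities in months).

```python
import math


def zero_coupon_yield(price, tau):
    """Continuously compounded yield -log(p)/tau of a zero-coupon bond paying 1."""
    if price <= 0.0 or tau <= 0.0:
        raise ValueError("price and tau must be positive")
    return -math.log(price) / tau


# Hypothetical prices at a few maturities (in years), for illustration only.
prices = {0.5: 0.985, 1.0: 0.965, 5.0: 0.80, 10.0: 0.60}

# The yield curve is the map from maturity tau to yield.
curve = {tau: zero_coupon_yield(p, tau) for tau, p in prices.items()}
for tau, y in sorted(curve.items()):
    print(f"tau={tau:5.1f}  yield={y:.4f}")
```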
November 14, 2008
Title: Statistics, Genetics, Partitions and Urn Models
Speaker: Warren Ewens, Department of Biology, University of Pennsylvania
Abstract:
The massive amounts of genetic data now becoming available have led to
the need for statistical analyses attempting to assess, among other
things, the evolutionary forces that have led to these data. Often the
data are in the form of partitions of the integers {1,2,..., n}. This
has led to an increase in interest in the probability theory for
partitions. These include in particular the Kingman theory of
partition structures. Also, some probability structures arise from
previously unanalyzed urn models. These will be described.
Date: Friday, November 14, 2008
Time: 11:00-12:00 noon
Location: Funger Hall, Room 220 (2201 G Street, NW, Washington, DC 20052)
October 31, 2008
Title: Statistics in Forensic Science
Speaker: Walter Rowe, Department of Forensic Sciences, George Washington University
Abstract:
Forensic scientists make frequent use of statistical methods. Like other scientists they may have to
concern themselves with obtaining representative samples from large (possibly inhomogeneous)
collections of evidence; they may also be concerned about the precision of their measurements.
However, in many criminal and civil cases forensic scientists have two fundamental questions to
answer when confronted with a piece of evidence. What is it? And where did it come from?
Sometimes it is only necessary to answer the first question. Is that white powder cocaine?
A positive or negative answer to that question suffices in most drug possession and drug trafficking
cases. The more intriguing forensic question is where the piece of evidence came from. Some
types of evidence (fingerprints, shoe and tire impressions, tool marks and fired bullets and cartridge
cases) present what appear to be unique features (fingerprint ridge characteristics or patterns of
striations). Probability models have been developed for some types of pattern evidence (e.g.
fingerprints, tool marks and striation patterns on fired bullets) to support the argument that their
features are unique. With other types of evidence, forensic scientists may determine a set of
features, no one of which is unique but which when aggregated specify a unique source. In DNA
profiling the alleles present at a large number of gene loci are determined. For each gene locus the
combination of alleles found usually is present in a large fraction of the human population. However,
if enough gene loci are examined the DNA recovered from a blood or semen stain can be linked to
one and only one member of the human population. This association is possible because geneticists
and forensic molecular biologists have accumulated frequency data for the gene loci in which they
are interested. Relevant frequency data is usually lacking for other types of evidence. In dealing
with these types of evidence the forensic scientist may only be able to say that the evidence came
from a particular geographical area or in the case of manufactured items belongs to a particular
product formulation.
Principal component analysis (PCA) and discriminant analysis (DA) have been applied to a variety
of forensic problems in recent years. These range from the comparison of soil samples and the
classification of ignitable liquids used as arson accelerants to the comparison of writing inks such
as ball pen inks, gel pen inks, permanent markers and dry erase markers. PCA and DA allow
analysts to examine the similarities and differences between similar materials and to assign unknown samples to
groups having similar formulations. In the field of forensic document examination identifying the
formulation of the ink used to prepare a document can be useful because the dates at which a
particular formulation came on the market will generally be known. A document which has been
prepared with an ink that was not available at the time it was supposedly created cannot be authentic.
PCA also allows forensic scientists to compare different methods of analysis and select those that
have the greatest discriminating power.
This presentation will conclude with a brief survey of the attitudes of United States courts toward
statistical inference and the prevailing rules for the admissibility of scientific and technical evidence
(which includes statistics).
Date: Friday, October 31, 2008
Time: 11:00-12:00 noon
Location: Funger Hall, Room 220 (2201 G Street, NW, Washington, DC 20052)
October 17, 2008
Title: Imbalance in Digital Trees and Similarity of Digital Strings
Speaker: Hosam Mahmoud, Department of Statistics, George Washington University
Abstract:
The imbalance factor of the nodes containing keys in a random digital
tree is investigated. Accurate asymptotics for the mean are derived
for a randomly chosen key in the tree via poissonization and the
Mellin transform, and the inverse of the two operations. It is also
shown from a singularity analysis of the moving poles of the Mellin
transform of the poissonized moment generating function that the
imbalance factor (under appropriate centering and scaling) follows a
Gaussian limit law.
The methods are amenable to the investigation of the average
similarity of random strings as captured by the average number of
"cousins" in the underlying tree structures. Certain analytic issues
arise in the digital tree underlying DNA that do not have an analog in
the binary case.
Date: Friday, October 17, 2008
Time: 11:00-12:00 noon
Location: Funger Hall, Room 220 (2201 G Street, NW, Washington, DC 20052)
October 15, 2008
Title: Quantifying the Fraction of Missing Information for Hypothesis Testing
in Statistical and Genetic Studies
Speaker: Xiao-Li Meng, Department of Statistics, Harvard University
Abstract:
This talk is based on a forthcoming discussion paper in Statistical
Science (jointly with Nicolae and Kong, and preprint available at
http://www.imstat.org/sts/future_papers.html ) with the following
abstract:
Many practical studies rely on hypothesis testing procedures applied
to datasets with missing information. An important part of the
analysis is to determine the impact of the missing data on the
performance of the test, and this can be done by properly quantifying
the relative (to complete data) amount of available information. The
problem is directly motivated by applications to studies, such as
linkage analyses and haplotype-based association projects, designed to
identify genetic contributions to complex diseases. In the genetic
studies the relative information measures are needed for the
experimental design, technology comparison, interpretation of the
data, and for understanding the behavior of some of the inference
tools. The central difficulties in constructing such information
measures arise from the multiple, and sometimes conflicting, aims in
practice. For large samples, we show that a satisfactory,
likelihood-based general solution exists by using appropriate forms of
the relative Kullback-Leibler information, and that the proposed
measures are computationally inexpensive given the maximized
likelihoods with the observed data. Two measures are introduced, under
the null and alternative hypotheses, respectively. We exemplify the
measures on data coming from mapping studies on inflammatory bowel
disease and diabetes. For small-sample problems, which appear rather
frequently in practice and sometimes in disguised forms (e.g.,
measuring individual contributions to a large study), the robust
Bayesian approach holds great promise, though the choice of a
general-purpose "default prior" is a very challenging problem. We also
report several intriguing connections encountered in our
investigation, such as the connection with the fundamental identity
for the EM algorithm, the connection with the second CR
(Chapman-Robbins) lower information bound, the connection with
entropy, and connections between likelihood ratios and Bayes
factors. We hope that these seemingly unrelated connections, as well
as our specific proposals, will stimulate a general discussion and
research in this theoretically fascinating and practically needed
area.
Date: Wednesday, October 15, 2008
Time: 3:00-4:00pm
Location: 1957 E Street, Room 212 (1957 E Street, NW, Washington, DC 20052)
October 3, 2008
Title: Information-Theoretic and Entropy Methods of Estimation
Speaker: Amos Golan, Department of Economics, American University
Abstract:
In this talk I will review the state of Information Theoretic and
Entropy Methods in Econometrics. I will discuss the connecting theme
among these methods and will provide a more detailed discussion of the
sub-class of methods that treat the observed sample moments as
stochastic. The resulting method uses minimal distributional
assumptions, performs well (relative to current methods of estimation)
and makes efficient use of all the available information (hard and soft
data). This method is computationally efficient. I will present the
basic ideas using a number of empirical examples taken from economics,
physics, image reconstruction and operations research. Studying these
examples will provide a way to synthesize that class of models
and to connect it to the more traditional methods of data analysis. I
will conclude with some thoughts on potential future developments.
Date: Friday, October 3, 2008
Time: 11:00-12:00 noon
Location: Duques Hall, Room 652 (2201 G Street, NW, Washington, DC 20052)
September 19, 2008
Title: Efficient Parameterization of PDE-Based Dynamics for Spatio-Temporal Processes
Speaker: Ali Arab, Department of Mathematics, Georgetown University
Abstract:
Spatio-temporal dynamical processes in the physical and environmental
sciences are often described by partial differential equations (PDEs).
The inherent complexity of such processes due to high-dimensionality
and multiple scales of spatial and temporal variability is often
intensified by characteristics such as sparsity of data, complicated
boundaries and irregular geometrical spatial domains, among others.
In addition, uncertainties in the appropriateness of any given PDE for
a real-world process, as well as uncertainties in the parameters
associated with the PDEs are typically present. These issues
necessitate the incorporation of efficient parameterizations of
spatio-temporal models that are capable of addressing such
characteristics. A hierarchical Bayesian model characterized by the
PDE-based dynamics for spatio-temporal processes based on their
Galerkin finite element method (FEM) representations is developed and
discussed. As an example, spatio-temporal models based on
advection-diffusion processes are considered. Finally, an application
of the hierarchical Bayesian modeling approach is presented which
considers the analysis of tracking data obtained from DST (data
storage tag) sensors to mimic the pre-spawning upstream migration
process of the declining shovelnose sturgeon.
Date: Friday, September 19, 2008
Time: 11:00-12:00 noon
Location: Funger Hall, Room 220 (2201 G Street, NW, Washington, DC 20052)
September 10, 2008
Title: Interactions are important
Speaker: Andrew Gelman, Departments of Statistics and
Political Science, Columbia University
Abstract:
As statisticians and practitioners, we all know about interactions but
we tend to think of them as an afterthought. We argue here that
interactions are fundamental to statistical models. We first consider
treatment interactions in before-after studies, then more general
interactions in regressions and multilevel models. Using several
examples from our own applied research, we demonstrate the effectiveness
of routinely including interactions in regression models. We also
discuss some of the challenges and open problems involved in setting up
models for interactions.
Date: Wednesday, September 10, 2008
Time: 3:00-4:00pm
Location: 1957 E Street, Room 212 (1957 E Street, NW, Washington, DC 20052)
September 5, 2008
Title: On Effect-Measure Modification: Relationships Among Changes in the
Relative Risk, Odds Ratio, and Risk Difference
Speaker: Dr. Babette Brumback, University of Florida
Abstract:
This is based on joint work with Arthur Berg, also at the University of Florida.
It is well known that presence or absence of effect-measure
modification depends upon the chosen measure. What is perhaps more
disconcerting is that a positive change in one measure may be
accompanied by a negative change in another. Therefore, research
demonstrating that an effect is 'stronger' in one population when
compared to another, but based on only one measure, for example the
odds ratio, may be difficult to interpret for researchers interested
in another measure. This talk reports on an investigation of
relationships among changes in the relative risk, odds ratio, and risk
difference from one stratum to another. Analytic and simulated results
are presented concerning conditions under which the measures can and
cannot change in opposite directions. For example, when all risks are
less than 0.5, it is impossible for the relative risk and risk
difference to change in the same direction but opposite to that of the
odds ratio. Data-analytic and hypothetical examples are used for
demonstration, including an examination of how the relationship
between physical quality of life and body mass index differs across
women and men, based on data from the 2005 Behavioral Risk Factor
Surveillance System survey.
Date: Friday, September 5, 2008
Time: 11:00-12:00 noon
Location: Funger Hall, Room 220 (2201 G Street, NW, Washington, DC 20052)
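The point made in the abstract above, that a positive change in one effect
measure may accompany a negative change in another, can be checked with a
small Python sketch. The function name effect_measures and the stratum risks
below are hypothetical values chosen for illustration, not figures from the
talk or from the survey data.

```python
def effect_measures(risk_unexposed, risk_exposed):
    """Return (relative risk, odds ratio, risk difference) for one stratum."""
    if not (0.0 < risk_unexposed < 1.0 and 0.0 < risk_exposed < 1.0):
        raise ValueError("risks must lie strictly between 0 and 1")
    rr = risk_exposed / risk_unexposed
    odds_exposed = risk_exposed / (1.0 - risk_exposed)
    odds_unexposed = risk_unexposed / (1.0 - risk_unexposed)
    rd = risk_exposed - risk_unexposed
    return rr, odds_exposed / odds_unexposed, rd


# Hypothetical strata; all four risks are below 0.5.
stratum_a = effect_measures(0.10, 0.20)
stratum_b = effect_measures(0.30, 0.45)

# Report the direction of change of each measure from stratum A to stratum B.
for name, a, b in zip(("RR", "OR", "RD"), stratum_a, stratum_b):
    direction = "up" if b > a else "down"
    print(f"{name}: {a:.3f} -> {b:.3f} ({direction})")
```

In this example the relative risk and odds ratio decrease from stratum A to
stratum B while the risk difference increases, so a claim that the effect is
"stronger" in one stratum depends on the measure chosen.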
The series hosts a seminar about twice a month on current research
topics. The seminar often features an invited guest speaker and
occasionally local faculty members, students or others affiliated with
the department. The usual time of the seminar is 11:00am on Fridays.
Professors Hosam Mahmoud (hosam@gwu.edu) and
Jonathan Stroud (stroud@gwu.edu)
are the Seminar Series Coordinators.