Centre for Mathematics, University of Coimbra   Laboratory for Computational Mathematics  
   


Approximating from Sparse Discrete Observations

 

 
 

Problem Description

Models for categorical data often have to be fitted from few observations on distributions with large supports. Asymptotic characterizations of the estimators are of little use in such situations. We are therefore interested in developing estimation procedures adapted to extracting as much information as possible from the sparse observations available.

A related problem appears when the underlying model is high-dimensional: one may know one of the marginal distributions, and the estimators should be able to incorporate this knowledge.
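To make the idea of using a known marginal concrete, here is a minimal sketch in Python. It is not the estimator studied in this project: it simply rescales the rows of a two-way table of relative frequencies so that the row sums match a marginal assumed to be known. The function name `impose_row_marginal`, the counts, and the marginal `known_rows` are all hypothetical and chosen only for illustration.

```python
def impose_row_marginal(table, row_marginal):
    """Rescale each row of a nonnegative 2-D probability table so that
    its row sums match a known marginal distribution (illustrative only)."""
    adjusted = []
    for row, target in zip(table, row_marginal):
        total = sum(row)
        if total > 0:
            # Keep the within-row shape, change only the row total.
            adjusted.append([target * p / total for p in row])
        else:
            # No observation fell in this row: spread the known mass uniformly.
            adjusted.append([target / len(row)] * len(row))
    return adjusted


# Hypothetical 3 x 4 table built from only 6 observations.
counts = [
    [2, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 2, 0],
]
n = sum(sum(row) for row in counts)
empirical = [[c / n for c in row] for row in counts]

known_rows = [0.3, 0.2, 0.5]  # assumed known marginal (hypothetical values)
adjusted = impose_row_marginal(empirical, known_rows)

for row in adjusted:
    print([round(p, 3) for p in row])
```

In this sketch the second row, which received no observations, still gets its known mass of 0.2. Spreading that mass uniformly is an arbitrary choice made for the example; a smoother that borrows information from neighbouring cells would be a more natural option in the setting described above.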

 

 
Modelling & Computational Challenges
Having to estimate from few observations over a (comparatively) large discrete support rules out simple histogram-based approximations, even though these perform well asymptotically. Smoothing over adjacent cells can mitigate this problem. Many categorical models assume some contiguity or adjacency between the cells, which makes the idea of smoothing natural: a few observations concentrated in some region suggest that a somewhat larger region carries significant probability.
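The contrast between the histogram and smoothing over adjacent cells can be sketched as follows. This is only an illustration with a fixed three-point kernel of nonnegative weights, not one of the smoothers developed in the project. The function names, the weights, and the data are assumptions made for the example.

```python
def histogram_estimate(observations, support_size):
    """Relative frequencies of each cell of the support {0, ..., support_size - 1}."""
    counts = [0] * support_size
    for x in observations:
        counts[x] += 1
    n = len(observations)
    return [c / n for c in counts]


def neighbour_smooth(probs, weights=(0.25, 0.5, 0.25)):
    """Spread the mass of each cell over its adjacent cells using nonnegative
    weights, renormalised at the boundaries so that total mass is preserved."""
    k = len(probs)
    half = len(weights) // 2
    smoothed = [0.0] * k
    for j, mass in enumerate(probs):
        # Cells that can receive mass from cell j, with their weights.
        targets = [
            (j + d, weights[d + half])
            for d in range(-half, half + 1)
            if 0 <= j + d < k
        ]
        norm = sum(w for _, w in targets)
        for i, w in targets:
            smoothed[i] += mass * w / norm
    return smoothed


# Hypothetical data: 5 observations on a support of 20 cells.
observations = [7, 8, 8, 9, 12]
support_size = 20

hist = histogram_estimate(observations, support_size)
smooth = neighbour_smooth(hist)

print("cells with positive mass (histogram):", sum(p > 0 for p in hist))
print("cells with positive mass (smoothed): ", sum(p > 0 for p in smooth))
```

Because the weights are nonnegative and each cell's mass is only redistributed, the smoothed estimate remains a probability distribution, and cells next to the observed ones receive positive probability.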

Polynomial smoothers have serious drawbacks, particularly with sparse observations: they are known to perform very well asymptotically but can produce negative approximations for the probabilities. We therefore need to develop another class of smoothers that preserve nonnegativity and perform well, if not asymptotically, then at least in the presence of sparse observations.
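The negativity problem is easy to reproduce. The sketch below uses the classical five-point local quadratic least-squares weights (-3, 12, 17, 12, -3)/35 as a stand-in for a polynomial smoother. Which polynomial smoothers the text refers to is not specified here, so this choice is an assumption, as are the function name and the data.

```python
# Five-point local quadratic least-squares weights; they sum to 1,
# but the outer two are negative.
LOCAL_QUADRATIC_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def local_quadratic_smooth(freqs):
    """Apply the five-point local quadratic fit to interior cells.
    Boundary cells are left unchanged to keep the example short."""
    k = len(freqs)
    out = list(freqs)
    for i in range(2, k - 2):
        out[i] = sum(
            w * freqs[i + d]
            for d, w in zip(range(-2, 3), LOCAL_QUADRATIC_WEIGHTS)
        )
    return out


# Hypothetical sparse sample: 4 observations on a support of 12 cells.
counts = [0, 0, 0, 0, 3, 0, 0, 0, 0, 1, 0, 0]
n = sum(counts)
freqs = [c / n for c in counts]

smoothed = local_quadratic_smooth(freqs)
negative_cells = [i for i, p in enumerate(smoothed) if p < 0]

print("smoothed:", [round(p, 4) for p in smoothed])
print("cells with negative estimates:", negative_cells)
```

With dense data the negative weights are outweighed by the positive frequencies of the neighbouring cells. With sparse data an empty cell two steps away from an occupied one picks up only the negative weight, so its estimated probability falls below zero.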

 

 
Research at LCM
We have introduced families of penalized polynomial smoothers. Computational work remains to be done to gain insight into the optimization. The simulations produced so far hint at good performance, especially in two-dimensional problems. The influence of the weighting functions still needs to be studied and optimized.

The theoretical aspects of the estimators remain to be addressed. We are interested in finite-sample properties rather than asymptotic ones. The estimators have rather complex expressions, so several difficulties in manipulating them need to be overcome.

 


 

 
Papers & Reports
[1] P. Jacob, P.E. Oliveira, Penalized smoothing of discrete distributions with sparse observations, Preprint, Pré-Publicações do Departamento de Matemática da Universidade de Coimbra, 06-28, 2006
[2] P. Jacob, P.E. Oliveira, Penalized smoothing of sparse tables, Preprint, Pré-Publicações do Departamento de Matemática da Universidade de Coimbra, 07-02, 2007

 

 
  Software

[1] An R package is under preparation.

     
 

Project Team

 

Pierre Jacob, Institut de Mathématiques et Modélisation de Montpellier, Université de Montpellier II, France
Paulo Eduardo Oliveira, LCM-CMUC