On Adaptation to Sparse Design in Bivariate Local Linear Regression

Journal article published in 1999 by Peter Hall, Burkhardt Seifert, Berwin A. Turlach
This paper is available in a repository.


Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

... this paper is to introduce imputation and interpolation rules in two dimensions, and describe their numerical performance. A more detailed theoretical study would show that they have all the hoped-for properties in connection with common methods for choosing bandwidth. For example, in the setting of standard asymptotic analysis (where sample size increases and other parameter settings remain fixed) they do not interact adversely with cross-validation or plug-in rules for bandwidth choice, no matter whether these are global or local in character. Bearing in mind that data sparseness problems usually only affect a small proportion of the region over which we aim to estimate a regression mean, we see that imputation and interpolation rules generally have negligible impact on a wide variety of global bandwidth choice methods.

It is sometimes proposed that problems of data sparseness, be they in one or higher dimensions, might be overcome by an adequate local bandwidth choice method, such as that based on near neighbours in the design data. There are, however, alternative viewpoints. Any local bandwidth selector that has good performance for estimating a regression mean should take into account information about that function, as well as about the design sequence. By way of contrast, nearest neighbour methods address only variation in the design sequence. Moreover, since local bandwidth choice techniques are based on only a relatively small fraction of the information in a sample, in particular that in the close vicinity of the point at which inference is being conducted, they are particularly susceptible to problems of data sparseness. As a result, they can exhibit very high variability in places where the design sequence is sparse. Using an imputation-and-interpolation r...
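
To make the sparse-design issue concrete, the following is a minimal Python sketch of a generic bivariate local linear estimator, together with a nearest-neighbour bandwidth. It is not the authors' imputation or interpolation rule, which this excerpt does not specify. The function names `knn_bandwidth` and `local_linear`, the spherical Epanechnikov-type kernel, and the simulated data are all illustrative assumptions. The sketch shows two points from the text: the nearest-neighbour bandwidth is computed from the design points alone, and the local fit breaks down where too few design points fall inside the kernel window.

```python
import numpy as np

def knn_bandwidth(X, x0, k):
    """Distance from x0 to its k-th nearest design point.

    Hypothetical helper: it uses the design X only, never the responses y.
    """
    d = np.linalg.norm(X - x0, axis=1)
    return np.sort(d)[k - 1]

def local_linear(X, y, x0, h):
    """Bivariate local linear estimate of the regression mean at x0.

    Assumed kernel: spherical Epanechnikov-type weights max(1 - (d/h)^2, 0).
    Returns nan when the weighted design is rank-deficient, which is what
    happens when fewer than three non-collinear points lie within h of x0.
    """
    d = np.linalg.norm(X - x0, axis=1)
    w = np.clip(1.0 - (d / h) ** 2, 0.0, None)
    # Local design matrix: intercept plus the two centred coordinates.
    Z = np.column_stack([np.ones(len(X)), X - x0])
    A = Z.T @ (w[:, None] * Z)
    b = Z.T @ (w * y)
    if np.linalg.matrix_rank(A) < 3:
        return np.nan
    # The intercept of the local fit is the estimate at x0.
    return np.linalg.solve(A, b)[0]

# Simulated design on the unit square with a noiseless planar mean.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

dense_point = np.array([0.5, 0.5])   # many design points nearby
sparse_point = np.array([5.0, 5.0])  # no design points within h

fit_dense = local_linear(X, y, dense_point, h=0.3)
fit_sparse = local_linear(X, y, sparse_point, h=0.3)
h_knn = knn_bandwidth(X, dense_point, k=10)
```

In this sketch the estimate at the dense point recovers the planar mean, while the estimate at the sparse point is undefined because no design point receives positive weight. With real data the failure is usually less abrupt: a near-singular weighted design produces a highly variable estimate rather than a missing one.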