Using one-class classifiers and multiple kernel learning for defining imprecise geographic regions

Eduardo Cunha, Bruno Martins: Using one-class classifiers and multiple kernel learning for defining imprecise geographic regions. In: International Journal of Geographical Information Science, 28 (11), pp. 2220–2241, 2014.

Abstract

This article presents an automated method for defining the boundaries of imprecise geographic regions, based on publicly available data. The method uses one-class support vector machines (SVMs) to interpolate from a set of point locations, which are assumed to lie in the region whose boundaries are to be defined, and also leverages a combination of multiple Gaussian kernels, within the formalism of SVMs, to improve accuracy. The points used for model training correspond to geospatial coordinates associated with Flickr photos that are tagged with the name of the vague region to be defined. Besides considering latitude and longitude coordinates from Flickr photos, as done in previous related work, each point location is also associated with a set of descriptive features, obtained from textual annotations and from publicly available raster datasets encoding population counts, terrain elevation, and/or land coverage information. The overall approach is evaluated by means of statistical classification measures, using regions whose boundaries are well defined (i.e., the official boundaries for several European countries). Besides this formal evaluation, we also illustrate our results for several vague regions. Results show that our method performs better than a previous state-of-the-art approach (i.e., we measured an improvement of 5.5% in terms of the F1 metric), which was based solely on interpolating from the geospatial coordinates of known points.
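The combination of multiple Gaussian kernels mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature-group names (`coords`, `raster`), the bandwidths, and the fixed weights are hypothetical values chosen for illustration, whereas multiple kernel learning would normally fit the weights from data. The sketch only shows how one Gaussian kernel per feature group can be summed into a single similarity score, which a one-class SVM could then use in place of a single Gaussian kernel.

```python
import math


def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(p, q, gammas, weights):
    """Weighted sum of Gaussian kernels, one per feature group.

    p and q are dicts mapping a feature-group name to a numeric vector.
    With non-negative weights the sum is itself a valid kernel.
    """
    return sum(
        weights[group] * gaussian_kernel(p[group], q[group], gammas[group])
        for group in weights
    )


# Hypothetical settings (assumptions, not values from the article).
GAMMAS = {"coords": 0.5, "raster": 0.1}
WEIGHTS = {"coords": 0.7, "raster": 0.3}  # non-negative, summing to 1

# Hypothetical photo locations: latitude/longitude plus two made-up raster features.
photo_a = {"coords": [38.72, -9.14], "raster": [0.8, 0.1]}
photo_b = {"coords": [38.74, -9.15], "raster": [0.7, 0.2]}
photo_far = {"coords": [48.85, 2.35], "raster": [0.9, 0.0]}

near = combined_kernel(photo_a, photo_b, GAMMAS, WEIGHTS)
far = combined_kernel(photo_a, photo_far, GAMMAS, WEIGHTS)
print(f"similarity to nearby photo:  {near:.4f}")
print(f"similarity to distant photo: {far:.4f}")
```

Because the weights sum to 1, a point compared with itself scores 1, and the score falls as points move apart in either feature group. Keeping a separate bandwidth per group lets coordinates and descriptive features contribute on their own scales.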

BibTeX

@article{Cunha2014,
title = {Using one-class classifiers and multiple kernel learning for defining imprecise geographic regions},
author = {Eduardo Cunha and Bruno Martins},
url = {http://dx.doi.org/10.1080/13658816.2014.916040},
year = {2014},
date = {2014-01-01},
journal = {International Journal of Geographical Information Science},
volume = {28},
number = {11},
pages = {2220--2241},
publisher = {Taylor \& Francis},
abstract = {This article presents an automated method for defining the boundaries of imprecise geographic regions, based on publicly available data. The method uses one-class support vector machines (SVMs) to interpolate from a set of point locations, which are assumed to lie in the region whose boundaries are to be defined, and also leverages a combination of multiple Gaussian kernels, within the formalism of SVMs, to improve accuracy. The points used for model training correspond to geospatial coordinates associated with Flickr photos that are tagged with the name of the vague region to be defined. Besides considering latitude and longitude coordinates from Flickr photos, as done in previous related work, each point location is also associated with a set of descriptive features, obtained from textual annotations and from publicly available raster datasets encoding population counts, terrain elevation, and/or land coverage information. The overall approach is evaluated by means of statistical classification measures, using regions whose boundaries are well defined (i.e., the official boundaries for several European countries). Besides this formal evaluation, we also illustrate our results for several vague regions. Results show that our method performs better than a previous state-of-the-art approach (i.e., we measured an improvement of 5.5\% in terms of the F1 metric), which was based solely on interpolating from the geospatial coordinates of known points.},
keywords = {GIS Applications of Georeferenced Multimedia, Multiple Kernel Learning, One-class Classification, Vague Geographic Regions},
pubstate = {published},
tppubtype = {article}
}