João Galvão, December 2014

Segmentação de Vastos Volumes de Dados com o SNNagg (Segmentation of Vast Volumes of Data with SNNagg)

Maribel Yasmina Santos (superv.), Universidade do Minho, December 2014.

Driven by recent advances in information technologies and the massive use of electronic devices, the amount of generated data has increased at a very high rate. Data mining algorithms are used to handle these large amounts of data. This work focuses on clustering and, namely, on the SNN (Shared Nearest Neighbour) algorithm, a density-based clustering algorithm. Clustering algorithms usually present high runtimes due to their quadratic complexity. In this work, the SNN algorithm is used to analyse spatial data.

João Ricardo Oliveira, December 2013

Spatio-temporal SNN: Integrating Time and Space in the Clustering Process

Maribel Yasmina Santos (superv.), Universidade do Minho, December 2013.

Spatio-temporal clustering is a new subfield of data mining that is gaining increasing scientific attention due to technical advances in location-based or environmental devices that register position, time and, in some cases, other semantic attributes. One of the main challenges in this area is to integrate several dimensions into the clustering process with a general-purpose approach.

José Guilherme Moreira, December 2013

Input parameters self-tuning on the SNN algorithm

Maribel Yasmina Santos (superv.), Universidade do Minho, December 2013.

Recent technological developments have led to an ever-increasing rate of data collection. Organisations face several challenges when they try to analyse this vast amount of data with the aim of extracting useful information. This analytical capacity needs to be enhanced with tools capable of dealing with big data sets without making the analytical process a difficult task. Clustering is commonly used, as this technique does not require any a priori knowledge about the data. However, clustering algorithms usually require one or more input parameters that influence the clustering process and the results that can be obtained.