The data sets made available by data publishers are becoming richer, but also more complex to analyze. One example is the census data made available by the INE at http://censos.ine.pt/. Despite the richer information we can derive from those data sets, it is often necessary to cross-reference data at different granularities, some of which are represented differently. For example, the information in the two following representations is equivalent:
        Value
_____   _____
A       10
B       20
A+B     30

        A      B      Total
_____   ____   ____   _____
Value   10     20     30
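As an illustration only, the equivalence of the two representations can be sketched in Python. The dictionary layout and the function names `long_to_wide` and `wide_to_long` are assumptions made for this example; they are not part of the proposal or of any existing tool.

```python
# Representation 1: one row per category; the aggregate is stored as the row "A+B".
long_form = {"A": 10, "B": 20, "A+B": 30}

# Representation 2: a single "Value" row; the aggregate is stored as the column "Total".
wide_form = {"Value": {"A": 10, "B": 20, "Total": 30}}


def long_to_wide(rows, measure="Value", aggregate="A+B", total="Total"):
    """Turn category rows into columns, renaming the aggregate row to the total column."""
    return {measure: {(total if key == aggregate else key): value
                      for key, value in rows.items()}}


def wide_to_long(table, measure="Value", aggregate="A+B", total="Total"):
    """Turn category columns back into rows, renaming the total column to the aggregate row."""
    return {(aggregate if key == total else key): value
            for key, value in table[measure].items()}


# Converting in either direction loses no information.
print(long_to_wide(long_form) == wide_form)
print(wide_to_long(wide_form) == long_form)
```

Note that the conversion only works because the caller states that "A+B" and "Total" denote the same aggregate; that knowledge is not present in the data itself.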
How data sets are combined depends on their representation. To date, the analyst performs this procedure manually, with the support of general-purpose data transformation tools.
The aim of this work is to develop a tool that helps the analyst to: (i) describe the semantics of the data, and (ii) apply a set of data transformations that derive, from complete data sets, new data sets containing a single granularity. The transformations should have theoretical support.
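A minimal sketch of these two steps, under stated assumptions: the semantics are described by a hypothetical mapping from each aggregate label to its parts, and the only transformation shown is dropping the aggregates to keep the finest granularity. The names `semantics`, `inconsistent_aggregates` and `finest_granularity` are invented for this example and do not describe the tool to be developed.

```python
# A complete data set mixing two granularities: the parts (A, B) and their aggregate (A+B).
data = {"A": 10, "B": 20, "A+B": 30}

# Hypothetical semantic description: each aggregate label lists the parts it sums.
semantics = {"A+B": ["A", "B"]}


def inconsistent_aggregates(data, semantics):
    """Return the aggregates whose stored value differs from the sum of their parts."""
    bad = []
    for aggregate, parts in semantics.items():
        if aggregate not in data or any(part not in data for part in parts):
            continue  # nothing to compare when a value is missing
        if data[aggregate] != sum(data[part] for part in parts):
            bad.append(aggregate)
    return bad


def finest_granularity(data, semantics):
    """Keep only the entries that are not declared as aggregates."""
    return {key: value for key, value in data.items() if key not in semantics}


print(inconsistent_aggregates(data, semantics))
print(finest_granularity(data, semantics))
```

Checking the aggregates before dropping them is one simple way to make the transformation safe: the derived data set then carries the same information as the complete one.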
Requirements: A student should consider applying to this M.Sc. project if he/she has a strong background in developing GUI applications.
Supervision: The work will be supervised by Prof. João Moura Pires, from FCT/UNL and co-supervised at 50% by Prof. Nuno Datia, from IPL/ISEL.
Hosting Institution: FCT/UNL
Additional information: This thesis is available for 2015/16. A more exhaustive description of the problem can be found here.