The widespread use of spatial data sets from many different sources causes many data processing problems that are costly, time-consuming and prone to inaccuracy. An organisation may spend large amounts of time and money on a spatial information project, only to find that the end result is not what was required, or even that the problem cannot be solved because the available data is insufficient. It would be desirable to know the capabilities of the available data in advance, and to analyse this knowledge so that inappropriate use of data, and the wasted effort that follows from it, is minimised. Unfortunately, because data from different sources is so widely disseminated and used, the people who have this knowledge are not usually the people who are asked to do the data processing. Constraints of time and resources often lead managers to require that a data set simply be used so that a job can be completed, without appropriate consideration of the data lineage. This problem is more pronounced today, given the increasing use of the Internet as a data source and the fact that so much of the data available through it is undocumented and unsupported. A system that allows the layman to understand the consequences of using a data set would solve this problem. This paper discusses the development of such a system.