This post addresses the use of database-centric architectures in large-scale enterprise GIS environments, in contrast to the geo-centric architecture described in the previous post. In the geo-centric architecture, the GIS vendor’s software is tightly coupled to the GIS database and is directly involved in managing the flow of information in and out of the database. The database-centric architecture shifts responsibility for the GIS data to the database management system.
Key elements of a database-centric architecture are as follows:
- The database is the source of truth for spatial information in the organization, rather than the GIS vendor’s application database
- The database is typically a spatially enabled relational database, e.g., Oracle Spatial / Locator, SQL Server Spatial, or PostGIS
- The spatial database environment may take on an added role in managing the spatial data to address capabilities such as:
  - Long transactions
  - Conflict resolution
  - Metadata management
  - Synchronization services
  - Additional spatial capabilities, such as network modeling
- The interface between the spatial database and the GIS application database can be a specific integration or, more commonly, a spatially enabled ETL (Extract-Transform-Load) product
The diagram above shows the context of the database-centric architecture.
The Role of Spatial ETL
Spatial ETL products can perform a critical role in supporting a database-centric GIS architecture and should be considered a central design component. The reasons for using Spatial ETL include:
Data model transformation support – This is particularly important when reconciling the differences between vendor formats. It is also important when the spatial information must be transformed into a materially different data model, such as an enterprise data model or an industry-standard model. Common transformations include attribute modifications, feature merging and splitting, relationship generation, and topology generation.
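To make the transformation types above concrete, here is a minimal sketch of two of them, attribute modification and feature splitting, written in plain Python. It is not how any particular ETL product works. The GeoJSON-style feature dictionaries, the attribute names in `ATTRIBUTE_MAP`, and both function names are assumptions made up for illustration.

```python
# Hypothetical mapping from a vendor schema to an enterprise schema.
ATTRIBUTE_MAP = {"PIPE_DIA": "diameter_mm", "MATL": "material"}


def rename_attributes(feature, mapping):
    """Return a copy of the feature with attribute names mapped to the target model."""
    return {
        "type": "Feature",
        "geometry": feature["geometry"],
        "properties": {
            mapping.get(name, name): value
            for name, value in feature["properties"].items()
        },
    }


def split_multilinestring(feature):
    """Split a MultiLineString feature into one LineString feature per part."""
    geometry = feature["geometry"]
    if geometry["type"] != "MultiLineString":
        return [feature]
    return [
        {
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": list(coordinates)},
            # Record which part of the source feature this piece came from.
            "properties": dict(feature["properties"], part_index=index),
        }
        for index, coordinates in enumerate(geometry["coordinates"])
    ]


# Illustrative source feature with two line parts (made-up data).
source = {
    "type": "Feature",
    "geometry": {
        "type": "MultiLineString",
        "coordinates": [[[0, 0], [1, 1]], [[2, 2], [3, 3]]],
    },
    "properties": {"PIPE_DIA": 150, "MATL": "PVC"},
}

target = [rename_attributes(f, ATTRIBUTE_MAP) for f in split_multilinestring(source)]
```

In this sketch one source feature becomes two target features that carry the enterprise attribute names. A real transformation would also need to handle relationships and topology, which are out of scope here.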
Validation – The ETL product can play a central role in performing validation tasks between the source GIS environments and the operational data store. At the simplest level, it can support feature counts and other simple validations. During more complex transformations, validation becomes critical to ensure that the appropriate information is retained when the model is transformed into the target environment.
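The simplest validation mentioned above, a feature count comparison, might look like the following sketch. It assumes each feature is a dictionary with a hypothetical `feature_class` key, and it only makes sense for one-to-one transformations, since merging or splitting features changes the counts by design.

```python
from collections import Counter


def count_by_class(features):
    """Count features per feature class (hypothetical 'feature_class' key)."""
    return Counter(f["feature_class"] for f in features)


def validate_counts(source_features, target_features):
    """Return a list of count mismatches; an empty list means the counts agree."""
    source_counts = count_by_class(source_features)
    target_counts = count_by_class(target_features)
    problems = []
    for feature_class in sorted(set(source_counts) | set(target_counts)):
        expected = source_counts[feature_class]  # Counter returns 0 if missing
        actual = target_counts[feature_class]
        if expected != actual:
            problems.append(f"{feature_class}: expected {expected}, found {actual}")
    return problems


# Made-up data: one valve was lost on the way to the target.
source_features = [
    {"feature_class": "valve"},
    {"feature_class": "valve"},
    {"feature_class": "pipe"},
]
target_features = [{"feature_class": "valve"}, {"feature_class": "pipe"}]

problems = validate_counts(source_features, target_features)
```

Here `problems` holds a single entry reporting the missing valve. Validating a more complex transformation would need checks on attributes and relationships as well as counts.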
Commercial-off-the-shelf (COTS) product capabilities – ETL products offer a number of commercial advantages that often transcend the data capabilities offered by GIS vendor-specific application programming interfaces or custom-developed interfaces. These include:
- Rich error handling, including notifications ranging from service logs to emails and tweets
- Scalability via the use of server-based architectures
- Maintainability via the use of standard application programming interfaces
- In the case of FME, web-based initiation, interaction and resolution of synchronization processes
- Flexibility to support multiple target formats
Another key reason to utilize spatial ETL is the ability to deal with difference management, which is critical for supporting the synchronization process.
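As a rough illustration of what difference management involves, the sketch below compares two snapshots of a feature set and classifies the changes as inserts, updates, or deletes. It assumes each snapshot is a dictionary keyed by a stable feature ID, which is a simplification. The function name and the sample data are hypothetical, and commercial ETL products provide their own change detection tools.

```python
def diff_snapshots(previous, current):
    """Classify changes between two snapshots keyed by feature ID."""
    previous_ids = set(previous)
    current_ids = set(current)
    return {
        "inserted": sorted(current_ids - previous_ids),
        "deleted": sorted(previous_ids - current_ids),
        # A feature counts as updated when its ID survives but its content differs.
        "updated": sorted(
            fid for fid in previous_ids & current_ids
            if previous[fid] != current[fid]
        ),
    }


# Made-up snapshots of the same feature class at two points in time.
previous = {
    "p1": {"material": "PVC", "geometry": [[0, 0], [1, 1]]},
    "p2": {"material": "PVC", "geometry": [[1, 1], [2, 2]]},
    "p3": {"material": "steel", "geometry": [[2, 2], [3, 3]]},
}
current = {
    "p1": {"material": "PVC", "geometry": [[0, 0], [1, 1]]},
    "p2": {"material": "HDPE", "geometry": [[1, 1], [2, 2]]},
    "p4": {"material": "PVC", "geometry": [[3, 3], [4, 4]]},
}

changes = diff_snapshots(previous, current)
```

Only the features listed in `changes` would then need to be pushed to the target during synchronization, rather than reloading the whole dataset.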
The use of an enterprise spatial database enables a multi-vendor GIS architecture. This can offer significant advantages to large-scale users of geospatial technologies, including elimination of redundant data entry, a reduction in custom applications, and a long-term reduction in support and development costs. More details on the benefits will be forthcoming in a future post.