As a consultant I usually make recommendations to customers on processes and products. In a recent discussion with a client (a major bank) I was essentially working out an enterprise architecture to support their reporting requirements. In a move to consolidate its operations and personnel, the bank has created a central body to take charge of ALL reporting requirements of the organization, locally as well as globally.
The bank has no data warehouse or data mart. They do not have a proper reporting server in place, or even a separate schema to support their reporting. They simply take a snapshot of the enterprise system, restore it onto a reporting server, and run their reporting queries from there. They do need some extra calculations, for which they have developed several scripts.
The main issue was that the window left over (after the whole backup/restore procedure just to get the data onto the reporting server) wasn't big enough for complex calculations. The volume of data prohibited that complexity, and with it their ability to report on those measures. As if that weren't bad enough, all previous data was being wiped clean every day – EVERY DAY (you read that right).
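To make the cost of that daily wipe concrete, here is a minimal sketch of the alternative: appending each day's snapshot to a staging table with a load date, so history accumulates instead of being destroyed. The table and column names (`staging_txn`, `load_date`) are hypothetical, purely for illustration.

```python
import sqlite3

# Hypothetical staging table: each day's snapshot is appended with a
# load date instead of replacing yesterday's data outright.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging_txn (txn_id INTEGER, amount REAL, load_date TEXT)"
)

def load_snapshot(conn, rows, load_date):
    """Append a day's snapshot; nothing from prior days is wiped."""
    conn.executemany(
        "INSERT INTO staging_txn VALUES (?, ?, ?)",
        [(txn_id, amount, load_date) for txn_id, amount in rows],
    )

load_snapshot(conn, [(1, 100.0), (2, 250.0)], "2009-01-01")
load_snapshot(conn, [(1, 100.0), (2, 250.0), (3, 75.0)], "2009-01-02")

# Both days remain queryable, so trend calculations become possible.
days = conn.execute(
    "SELECT COUNT(DISTINCT load_date) FROM staging_txn"
).fetchone()[0]
print(days)
```

With the wipe-and-reload approach, that count can never exceed one.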
The bank obviously doesn't want to spend too much money, and so the reporting department was trying to make do with legacy methods. The default reaction would be to equip them with modern reporting tools, and yes, that did cross my mind. However, the reports required complex data manipulation, and writing that code within the available window simply took too long.
The best solution would be a new, optimized logical model for their staging area/reporting server; that way they could keep their existing report development front end without much effort in retraining the workforce. So the immediate solution was ETL, and in the longer run a proper reporting/BI solution would be the way to go.
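The idea above can be sketched in a few lines: do the heavy calculation once, during the load, so the existing front end only ever reads a pre-aggregated reporting table. The schema here (`source_txn`, `rpt_branch_totals`) is invented for the example, not the bank's actual model.

```python
import sqlite3

# Extract side (hypothetical source table) and the optimized reporting
# model the front end would query directly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_txn (branch TEXT, amount REAL);
CREATE TABLE rpt_branch_totals (branch TEXT, total REAL, txn_count INTEGER);
""")
conn.executemany(
    "INSERT INTO source_txn VALUES (?, ?)",
    [("North", 100.0), ("North", 50.0), ("South", 75.0)],
)

# Transform + load: the expensive aggregation runs once in the ETL
# window, not every time a report is opened.
conn.execute("""
INSERT INTO rpt_branch_totals
SELECT branch, SUM(amount), COUNT(*) FROM source_txn GROUP BY branch
""")

for row in conn.execute("SELECT * FROM rpt_branch_totals ORDER BY branch"):
    print(row)
```

The report query then degenerates to a plain SELECT against `rpt_branch_totals`, which is exactly why no retraining of the report developers is needed.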
The client, although familiar with ETL, had never imagined that it could be used to enhance their reporting (people usually associate ETL with data integration and manipulation, not with presenting data in any meaningful form). It took a good deal of convincing to show them that they didn't really need a reporting solution but a properly done ETL.
As our discussions wore on, the client was eventually convinced and asked if I could come in for a presentation. One of the other client representatives (apparently the most technical of the lot) asked me if I could name some ETL tools. The only names that came to mind were the handful of well-known products.
I decided to compile a list (it may not be 100% comprehensive, but I hope to keep it updated and relevant), because a lot of people genuinely have no idea of the sheer number of options available to them.
Open-source ETL frameworks
- Pentaho Data Integration
- Talend Open Studio
- Enhydra Octopus
- Mortgage Connectivity Hub
Freeware ETL frameworks
Proprietary ETL frameworks
- Ab Initio
- IBM DataStage
- IBM Information Server
- IBM DB2 Warehouse Edition
- IBM Cognos Data Manager
- Oracle Data Integrator
- Oracle Warehouse Builder
- SAP BusinessObjects Data Integrator
- SAS Data Integration Studio
- Informatica PowerCenter
- Altova MapForce
- Djuggler Enterprise
- Embarcadero Technologies DT/Studio
- ETL Solutions Transformation Manager
- Group 1 Software DataFlow
- IKAN – ETL4ALL
- IKAN – MetaSuite
- Information Builders – Data Migrator
- Microsoft SQL Server Integration Services
- Pervasive Data Integrator
- Safe Software
Technorati Tags: Ab Initio, Business, Business Objects, Concepts, data warehouse, DataStage, dw, EDW, Enterprise Data Warehouse, ETL, Extract Transform Load, IBM, IBM DataStage, Jasper, Oracle, Oracle Data Integrator, Pentaho, Pentaho Data Integration, SAP, SAP Business Objects, Talend, Talend Open Studio, Technology, Tools