Mimno, Myers & Holum
COLUMN: Should You Use EAI Tools, ETL Tools, or Both?
FlashPoint Column - March 26, 2003
by Pieter Mimno
Principal, Mimno, Myers & Holum
SUMMARY: Data integration and application integration are among the most important issues facing business managers and IT managers. Many organizations have built or purchased multiple applications that support specific business areas, such as finance, sales, ERP, CRM, HR, SCM, campaign management, and strategic planning. These point solutions provide value for each of the business areas, but it is often difficult to access clean, consistent information across these business areas. Both Enterprise Application Integration (EAI) vendors and Extraction/Transformation/Load (ETL) vendors claim to support data and application integration across multiple islands of information. This article examines the claims of vendors and provides guidelines on how to use the best features of both EAI and ETL tools.
ENTERPRISE APPLICATION INTEGRATION (EAI) TOOLS
EAI tools, such as IBM WebSphere MQ, Tibco, webMethods, or SeeBeyond, are used to provide real-time, bi-directional messaging interfaces between packaged applications. They act as the hub of a hub-and-spoke messaging architecture that links enterprise software packages, custom applications, legacy systems, databases, workflows, and Web services. EAI tools support transactional systems and integrate at the application level. They can provide a complete end-to-end integration solution by automating entire business processes, including human participation and workflow management.
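The hub-and-spoke, publish-and-subscribe pattern described above can be illustrated with a minimal in-memory sketch. It is not the API of any product named in this article; the `MessageHub` class, the `order.created` topic, and the two application inboxes are hypothetical names chosen for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Message = Dict[str, object]
Handler = Callable[[Message], None]


class MessageHub:
    """Hypothetical in-memory hub: spokes subscribe to topics, the hub routes messages."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a spoke (any callable) to receive messages for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        """Deliver a copy of the message to every subscriber; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(dict(message))  # each spoke receives its own copy
        return len(handlers)


# Two hypothetical spokes: a CRM application and a finance application.
crm_inbox: List[Message] = []
finance_inbox: List[Message] = []

hub = MessageHub()
hub.subscribe("order.created", crm_inbox.append)
hub.subscribe("order.created", finance_inbox.append)

delivered = hub.publish("order.created", {"order_id": 1001, "amount": 250.0})
print(delivered, crm_inbox, finance_inbox)
```

The publishing application knows only the hub and the topic name, not the receiving applications, which is what allows new spokes to be added without changing existing ones. A production EAI tool would add persistence, guaranteed delivery, and adapters, none of which this sketch attempts.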
An important objective of EAI tools is to support Web services applications. Web services are defined as software components that adhere to a specific set of Internet standards and enable the unrestricted movement and sharing of data across connected applications and data sources. Due to their ability to seamlessly integrate applications, Web services have the potential to transform the way organizations share and communicate information. The technology supports component-based application interoperability that is independent of platform and implementation languages. It can be used to implement complex, bi-directional, business-to-business (B2B) integration.
Web services are typically based on EAI tool functionality. EAI tools utilize J2EE or .NET platforms to extract business information embedded in legacy applications, transform the information, and generate components and Web services. XML, executed as Java or COM objects, is used to locate and invoke specified components or services. Legacy applications are leveraged, but are not modified.
EAI tools incorporate graphical user interfaces that simplify access to pre-defined libraries of application system adapters, transformation logic, business process modeling, security, and workflow management. Standards supported by EAI tools include XML, Java, COM, SOAP (Simple Object Access Protocol), and JMS (Java Message Service). The tools are evolving to support an emerging set of standards that provide a common language for application integration and component interaction. These standards include WSDL (Web Services Description Language) and UDDI (Universal Description, Discovery, and Integration).
Although EAI tools support real-time integration between multiple business processes, they have significant limitations, including lack of native interfaces to data sources, requirement to code data transformations in a procedural language, lack of bulk data movement capability, lack of data modeling functionality, and lack of metadata integration capability. Libraries of pre-defined application system adapters and transformation functions are available, but components of these libraries have to be coded using a procedural language.
A primary limitation of EAI tools is the lack of a central metadata repository containing a "single version of the truth" for business rules and definitions. EAI tools capture internal routing and mapping metadata, but they provide no mechanism to ensure that business rules and definitions are consistent across data sources and applications. Another limitation is that EAI message queues are relatively short. EAI tools are appropriate for movement of short transactional messages between application packages; however, they should not be used for bulk extraction of data from source systems. Finally, EAI tools do not include a GUI-driven Business Intelligence tool for query, reporting, and OLAP analysis.
In summary, EAI tools have the following strengths and limitations:
Strengths of EAI tools:
• Real-time transactional message movement
• Publish and subscribe architecture
• Process modeling capability
• Pre-built libraries of application system adapters and transformation logic
• Support of standards: XML, Java, COM, J2EE, .NET, SOAP, JMS
Limitations of EAI tools:
• Short messages - lack of bulk data access capability
• Procedural coding is necessary to specify libraries of application system adapters and transformations
• No support for generation and maintenance of a central metadata repository
• No interface with data modeling tools used for logical target data modeling
• Not focused on business analytics or data integration at the metadata level
• No interface with BI tools for query, reporting, and OLAP analysis
EXTRACTION/TRANSFORMATION/LOAD (ETL) TOOLS
ETL tools operate at the heart of a data warehousing application. They function in a code-less, high performance environment, using an intuitive, graphical design interface to define extraction, transformation, and load functions. The primary purpose of the ETL tool is to generate clean, consistent data, which is required for decision-support functions, as well as to generate a central metadata repository that provides a single version of the truth for business rules and entity definitions.
Functions supported by best-of-breed ETL tools include extraction of both real-time and static data from a wide variety of data sources using native interfaces, bulk data transfer over secure communications lines, real-time data access, data integrity testing, data cleansing, data transformations, data aggregation, loading of multiple target databases, analytic applications, generation of metadata in a central metadata repository, and synchronization of metadata with local metadata repositories maintained by Business-Intelligence (BI) tools. Native interfaces to data sources include legacy files, flat files, spreadsheets, relational databases, ERP, CRM, Web log files, real-time message queues, XML, and external data sources. Real-time access of data from packaged applications is supported through a direct on-line link to an EAI tool. This native, real-time link treats the EAI tool as simply another data source for the ETL tool.
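The extraction, cleansing, aggregation, and load functions listed above can be sketched as a small batch pipeline. This is an illustration using only the Python standard library, not the design of any ETL product; the sample data, the column names, and the `sales_by_region` table are invented for the example, and a commercial ETL tool would define these steps graphically rather than in code.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract; columns and values are invented for illustration.
RAW_CSV = """region,amount
east, 100.50
West,200
east,
EAST,50
"""


def extract(text):
    """Extract: read rows from a flat-file source."""
    return list(csv.DictReader(io.StringIO(text)))


def cleanse(rows):
    """Cleanse: reject rows with a missing amount and normalize region names."""
    clean = []
    for row in rows:
        amount = (row["amount"] or "").strip()
        if not amount:
            continue  # integrity test failed: no amount supplied
        clean.append({"region": row["region"].strip().lower(), "amount": float(amount)})
    return clean


def aggregate(rows):
    """Transform: total the amounts per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


def load(totals, conn):
    """Load: write the aggregated rows into the target database."""
    conn.execute("CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", sorted(totals.items()))
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a target warehouse database
load(aggregate(cleanse(extract(RAW_CSV))), conn)
result = conn.execute("SELECT region, total FROM sales_by_region ORDER BY region").fetchall()
print(result)
```

Each stage takes the output of the previous one, so a cleansing or transformation rule can be changed without touching extraction or loading.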
Best-of-breed ETL tools integrate with both data modeling tools and Business Intelligence (BI) tools at the metadata level. Target data models designed using the data modeling tool are imported into the metadata repository of the ETL tool. A logical data model, maintained by the data modeling tool, ensures consistency of all physical database schemas across the data warehousing application. Similarly, the central metadata repository generated by the ETL tool is synchronized at the metadata level with local metadata repositories maintained by the BI tools. This ensures that all data stored in target databases is logically consistent across the data warehousing application.
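As a rough illustration of what synchronization at the metadata level guards against, the sketch below compares business definitions held in a central repository with the copies held by a BI tool and reports any that have drifted. The dictionaries, the definition names, and the `find_drift` function are hypothetical; real repositories hold far richer metadata than name-to-text pairs.

```python
def find_drift(central, local):
    """Return the names whose local definition is missing or differs from the central one."""
    return sorted(
        name for name, definition in central.items() if local.get(name) != definition
    )


# Hypothetical central repository maintained by the ETL tool.
central_repo = {
    "net_revenue": "gross_revenue - returns - discounts",
    "active_customer": "purchase within 12 months",
}

# Hypothetical local repository maintained by a BI tool; one definition is out of date.
bi_tool_repo = {
    "net_revenue": "gross_revenue - returns",
    "active_customer": "purchase within 12 months",
}

drift = find_drift(central_repo, bi_tool_repo)
print(drift)
```

Any name reported here marks a definition on which two reports could legitimately disagree, which is the situation a single version of the truth is meant to prevent.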
In summary, best-of-breed ETL tools have the following strengths and limitations:
Strengths of ETL tools:
• Native interfaces for access to legacy, relational, ERP, CRM, Web logs, real-time data feeds, and XML-compliant data
• Codeless environment based on object-oriented design techniques, leading to ease of deployment and modification
• Synchronization of target data models via integration at the metadata level with data modeling tools
• Maintenance of a "single version of the truth" for target data in a distributed, enterprise environment via transparent integration with the metadata repository of the ETL tool
• Bulk data movement over high-speed, secure channels
• End-to-end, packaged analytic applications
• High performance using concurrent data streams and parallel transformation pipelining
• Support of standards: XML, Java, COM, J2EE, .NET, JMS, CWM (Common Warehouse Metamodel), LDAP, and UML
Limitations of ETL tools:
• Orientation toward historical data movement, rather than transactional (operational) message movement
• Orientation toward data modeling, rather than process modeling
• Lack of publish and subscribe architecture
• Lack of guaranteed message delivery
• Inability to load transactional systems through an API layer
These limitations of ETL tools are eliminated through native, bi-directional links between best-of-breed ETL tools and EAI tools. Native links to EAI tools enable ETL tools to access transactional messages, operate within a publish/subscribe architecture, and support guaranteed message delivery. Similarly, the limitations of EAI tools listed above are overcome by direct, native interfaces to ETL tools.
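One way to picture an ETL tool treating an EAI tool as simply another data source is a reader that drains pending messages from a queue in small batches and hands them to the same transformation logic used for bulk data. The sketch below uses Python's standard `queue.Queue` as a stand-in for an EAI message queue; the `drain` function, the batch size, and the message fields are assumptions made for illustration.

```python
import queue


def drain(message_queue, max_messages):
    """Pull up to max_messages pending messages without blocking."""
    batch = []
    while len(batch) < max_messages:
        try:
            batch.append(message_queue.get_nowait())
        except queue.Empty:
            break  # nothing more is waiting; return what has been collected
    return batch


# Stand-in for an EAI message queue carrying short transactional messages.
eai_queue = queue.Queue()
for order_id in (1, 2, 3):
    eai_queue.put({"order_id": order_id, "status": "shipped"})

first_batch = drain(eai_queue, 2)   # takes the two oldest messages
second_batch = drain(eai_queue, 2)  # takes the one remaining message
print(first_batch, second_batch)
```

Delivery guarantees and publish/subscribe routing remain the job of the EAI tool; the ETL side only consumes what the queue presents.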
CONCLUSION
It is clear that neither EAI tools nor ETL tools alone support the complete range of functionality required for application integration. However, the two sets of tools complement each other.
EAI tools provide real-time transactional message movement between application packages. However, they require expensive, manual techniques to code data transformations and do not create a "single version of the truth" for business rules and definitions. They should not be used for decision-support functions, due to their requirement for extensive hand-generated procedural code, their lack of metadata integration capability, and their inability to support intuitive, end-user analysis functions.
ETL tools support end-to-end data warehousing applications, including native interfaces to data sources, bulk data movement, data cleansing, data transformations, loading of target databases, analytic applications, high performance, generation of central metadata, and integration at the metadata level with end-user tools for query, reporting, and OLAP analysis. The lack of support for bi-directional access to real-time transactional messages is overcome through native interfaces with EAI tools.
It is important to evaluate the strengths and limitations of both EAI and ETL technology. The recommendation is to use EAI tools in combination with ETL tools to solve the application integration problem.
For further information about the issues discussed in this report, please contact Pieter Mimno, Principal, Mimno, Myers & Holum, at pmimno@mimno.com, or visit his Web site at www.mimno.com. Mr. Mimno specializes in the selection of system components and support for all phases of development of data warehousing applications.