ETL tools comparison criteria
The criteria used for the ETL tools comparison are grouped into 12 categories:
- Infrastructure
- Platforms supported
- Performance
- Scalability
- Functionality
- Debugging facilities
- Future prospects
- Batch vs Real-time
- Usability
- Data Quality / profiling
- Reusability
- Native connectivity
The ETL criteria underlying the categories have a direct relationship with the business and technical requirements for selecting ETL tools. The ETL matrix provided in the ETL & Data Integration Guide 2024 allows for easy comparison between ETL tools.
Commercial ETL tools comparison
In total, twenty ETL & Data Integration tools are sold commercially, meaning they aren’t open source. The best-known commercial ETL tools are Informatica PowerCenter, SAP Data Services, Microsoft SSIS, and IBM’s Information Server.
Open source ETL tools comparison
We have researched three ETL & Data Integration tools that have an open source license structure. Note that an open source license doesn’t necessarily mean the software and/or service is free of charge, although some of these tools have interesting licensing structures. All three vendors also have a paid enterprise product offering. The open source ETL tools are: CloverETL, Pentaho Data Integration (PDI), and Talend Open Studio.
Our guide covers at least 90% of the market
Twenty of the major ETL vendors in this area have contributed to the guide. Get immediate insight into which ETL tools & data integration platforms have scored well on specific categories and key selection criteria. This helps your organization select the best ETL tool for the best price. The guide (started in 2004) is updated on a quarterly basis.
Future prospects of ETL tools
Originally we looked at the number of features a product boasted and compared that with the number of years the product had been on the market to calculate a growth rate. We stopped doing that for two reasons. Firstly, there were a number of relatively new products on the market with broad functionality that would score unreasonably high if we calculated this way. Secondly, many products were acquired by other vendors and given a new name, allowing the vendors to claim that they had only recently entered the market.
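The weakness of the original calculation can be illustrated with a small sketch. It assumes the growth rate was a simple ratio of feature count to years on the market, which is our reading of the description above rather than a documented formula. The function name and all product figures are hypothetical and invented purely for illustration.

```python
def growth_rate(feature_count: int, years_on_market: float) -> float:
    """Features per year on the market (assumed form of the abandoned metric)."""
    if years_on_market <= 0:
        raise ValueError("years_on_market must be positive")
    return feature_count / years_on_market


# Hypothetical products; the numbers are made up for illustration only.
established = growth_rate(feature_count=200, years_on_market=20)

# A young product with broad functionality scores far higher.
newcomer = growth_rate(feature_count=120, years_on_market=2)

# The same established product, acquired and renamed a year ago,
# if the vendor counts only the time since the rename.
rebranded = growth_rate(feature_count=200, years_on_market=1)

print(established, newcomer, rebranded)
```

Under these assumptions the renamed product outscores its own earlier self by a wide margin, even though nothing about the software has changed, which is the distortion described above.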
A new method, better results
The growth-rate calculation was replaced by a second method. The growth potential, as defined within the confines of this ETL study, looked at how frequently a vendor released new, valuable features, an important indicator of innovation and of the strategic importance of the product to the vendor. With a maturing market, and “softer” aspects of the software becoming more important, we once again decided to change our method.
Four questions
To determine the future prospects of the ETL vendors we now base our research on four questions, which we score with pluses and minuses:
- How large is the current install base? The size of the current install base gives an indication of the financial strength of the vendor and the vested interests of the customers because of lock-in effects. Both factors contribute to a higher chance of survival in the long run.
- Have there been major developments and improvements in the past three years? By researching the new features of the software (among other things, by analyzing all release notes) we can form a clear idea of how actively the vendor is improving the software. Where improvement is lacking, chances are the product is slipping.
- What is the current maturity level? By means of our own survey results we can rank the software from least to most mature. We believe that an already proven and mature product has a better chance of prospering in the market than a less developed product.
- Is there a clear roadmap and will there be significant developments and improvements? A clear roadmap is an indicator of two things. First, the vendor has defined a clear point on the horizon, which allows it to direct all resources efficiently toward that point and to take leadership in the market. Second, it usually means that the vendor has considerable “marketing muscle”. Both factors contribute to a higher chance of dominating the market.
By answering these questions, we get a clearer picture of each vendor’s future. The scoring was done by awarding pluses and minuses (ranging from – to ++) on all four criteria.
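As a rough sketch of how such plus/minus ratings could be combined into a ranking: the numeric value of each symbol, the equal weighting of the four criteria, the criterion keys, and the two vendors below are all assumptions made for illustration. The guide does not state how the symbols are aggregated.

```python
# Assumed numeric values for the rating symbols (not specified by the guide).
SYMBOL_VALUES = {"-": -1, "+": 1, "++": 2}

# Hypothetical keys for the four questions listed above.
CRITERIA = ("install_base", "recent_development", "maturity", "roadmap")


def future_prospects_score(ratings: dict[str, str]) -> int:
    """Sum the symbol values over the four criteria, weighting each equally."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    return sum(SYMBOL_VALUES[ratings[c]] for c in CRITERIA)


# Invented vendors and ratings, for illustration only.
vendors = {
    "Vendor A": {"install_base": "++", "recent_development": "++",
                 "maturity": "+", "roadmap": "++"},
    "Vendor B": {"install_base": "+", "recent_development": "-",
                 "maturity": "++", "roadmap": "-"},
}

# Rank vendors from strongest to weakest future prospects.
ranking = sorted(vendors,
                 key=lambda name: future_prospects_score(vendors[name]),
                 reverse=True)
print(ranking)
```

Equal weighting is the simplest choice here. An organization that cares more about, say, roadmap clarity than install base could multiply each criterion by its own weight before summing.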
Which tools will dominate the market in the coming years?
Can we look ahead and measure which of the products will still be strong and which will not? We have made an attempt to do just that.
The top contenders
The top contenders are IBM InfoSphere Information Server, SAP Data Services, and SAS Data Management. IBM has three solutions in the graph, but is reducing this number to one (Information Server). IBM InfoSphere Warehouse Edition and IBM Cognos Data Manager will reach end-of-life, and current installations will be replaced by Information Server in due time.
All three vendors have a large install base, which gives them the financial means for significant development. They have also shown a serious commitment to the development of the software and the accompanying conceptual framework, and they are really pushing the envelope of ETL software. By redefining the field of Data Integration, they probably have the best future prospects.
Open source tool Talend follows
Talend follows closely behind these three vendors and is growing rapidly. It has strong financial backing and a strong focus on Big Data. Also, because it is a relatively young company (founded in 2006), it is unhindered by legacy design decisions, allowing it to fully focus on all current concepts.
ODI and PowerCenter are trailing behind
Informatica PowerCenter and Oracle Data Integrator are trailing a little behind in terms of scoring. Both have large user bases and are very mature products, but they seem to lack the strong vision and roadmap of the top contenders. Since both vendors have always been able to make a strong case in the past, we expect them to present a good vision of the future of ETL shortly. The same goes for the next three: Pervasive Data Integrator, SQL Server Integration Services, and Information Builders’ DataMigrator. Oracle is also reducing its number of products: Oracle Warehouse Builder will be integrated into Oracle Data Integrator and will be end-of-life by 2021.
Other notable vendors
Other notable vendors are CloverETL and Syncsort DMX. Syncsort has been offering a very mature product for a long time, and CloverETL now also has a well-rounded product. Pentaho deserves a special mention: it is making big steps in terms of product maturity and also has a respectable user base.