Talend Architecture - The Functional Architecture of Talend Open Studio

Swatee Chand
Published in Edureka
Mar 22, 2018

The architecture of a piece of software serves as a skeleton that enables the structured flow of data and processes. To get the most out of your software, you need a clear understanding of its architecture. In this blog on Talend architecture, I am going to give you a complete insight into the internal as well as the functional architecture of Talend.

The following are the topics I will be discussing in this Talend architecture blog:

  • Talend Introduction
  • Talend Products
  • Talend Open Studio Architecture
  • Talend Functional Architecture

Before moving ahead, let me introduce you to Talend.

Talend Introduction

Talend is an open source software integration vendor. It provides various software and services for:

  • Big Data
  • Data Integration
  • Cloud Storage
  • Data Management
  • Master Data Management
  • Data Quality
  • Data Preparation
  • Enterprise Application Integration.

According to the 2017 Gartner Magic Quadrant, Talend is recognized as a global leader in big data and cloud integration solutions. The following are a few of the most intriguing features offered by Talend:

  1. The current version of Talend is 7 times faster than the previous ones.
  2. Reduces the cost by 1/5th of the original expense.
  3. All Talend tools act as code generators.
  4. Talend tools are considered to be future proof.

Taking all of the above points into consideration, you can easily conclude that Talend is highly unlikely to leave the market anytime soon. As a result, more and more companies are using Talend, which has strengthened its hold on the market. At the time of writing, Talend holds 19.3% of the total market share.

Talend provides software that helps companies become data driven by making data more accessible, improving its quality and quickly moving it to where it is needed for real-time decision making. Talend is also known as the Swiss Army knife of the non-programmer for Big Data. It makes the user's interaction with Big Data technologies like Hadoop, Hive, Spark and Pig really simple, as there is no need to write even a single line of code.

Since its launch in 2005, Talend has released a wide range of products and services. In the next section of this article, let's take a look at a few of its major products.

Talend Products

The list of products includes licensed versions, open source versions, and platforms. Let's now look at these products one by one.

Talend Enterprise Products

Talend Open Studio

Talend Platforms

Among all these products, the Talend Open Studio editions are the most commonly used. The reason is that Open Studio is open source, which makes it free to download and use. It is the best tool to get you started and comes with almost all the functions you need to process your data. If you want to increase your productivity, collaboration and return on investment, you can go for the enterprise versions. As the name suggests, the enterprise products are best suited for commercial purposes. However, the enterprise versions are not on our discussion list for today, so let's focus on Open Studio and move ahead with this blog on Talend architecture.

In the next section, I will try to explain the internal architecture of Talend Open Studio, which makes Talend so powerful yet user-friendly.

Talend Open Studio Architecture

But, before I explain the internal working of TOS, let me quickly brief you about it.

Talend Open Studio (TOS) is based on Eclipse RCP and supports ETL-oriented implementations. It is generally used for on-premises deployment and is extensively used for integration between operational systems, ETL processes and much more. Through its GUI, you can access the metadata repository, which contains the definitions and configurations for each process performed in Talend. As you might know, Talend's GUI is extremely interactive and user-friendly: all you need to do is drag, drop and link components to build a task. To execute a task, just click the 'Run' button in the Run tab and the rest is handled by TOS itself.

But have you ever wondered what happens at the back end? The diagram below represents the basic Talend architecture and shows how Jobs are handled by TOS internally.

Well, at the back end, the Jobs and the Business Models which we create in the GUI are stored by TOS in an XML format. Whenever you execute these Jobs, the code generator converts them into Java code, and the Business Models are converted into Perl code.
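To make the idea of "stored as XML, then turned into Java by a code generator" more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the XML layout, the component types and the `generate_java` function are hypothetical and do not reflect the actual file format or generator used by TOS.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified Job definition (not the real TOS schema).
JOB_XML = """
<job name="CopyCustomers">
  <component id="in" type="FileInput"/>
  <component id="out" type="LogOutput"/>
  <link from="in" to="out"/>
</job>
"""


def generate_java(job_xml: str) -> str:
    """Turn the simplified Job XML into the source text of a small Java class."""
    job = ET.fromstring(job_xml.strip())
    # Index the components by id so that links can be resolved.
    components = {c.get("id"): c for c in job.findall("component")}
    lines = [
        f"public class {job.get('name')} {{",
        "    public static void main(String[] args) {",
    ]
    # Emit one Java statement per link, in the order the links appear.
    for link in job.findall("link"):
        src = components[link.get("from")].get("type")
        dst = components[link.get("to")].get("type")
        lines.append(f'        System.out.println("{src} -> {dst}");')
    lines += ["    }", "}"]
    return "\n".join(lines)


java_source = generate_java(JOB_XML)
print(java_source)
```

The point of the sketch is only the direction of the flow: the design you draw is data (XML), and executable code is derived from it on demand rather than written by hand.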

Now that you have a basic understanding of how Talend Open Studio works, let's take a look at the functional architecture of Talend.

Functional Talend Architecture

The functional architecture of Talend is a model that identifies the platform's functions, how they interact with one another, and which needs of the IT market they respond to. Below is the functional architecture of Talend Open Studio:

As you can see, the entire Talend architecture is divided into 3 functional blocks which are color coded (blue, green and orange). Let’s see the functioning of each of these blocks one by one:

Administration & Monitor

This block is responsible for administering and monitoring Jobs. Here, you will find at least one Studio, which is used to carry out data integration processes irrespective of data volume and process complexity. Note that you need proper authorization to work on any project in Talend Studio.

Administration and Management

This block contains a web-based Administration Center (i.e., an application server) and two shared repositories: one based on an SVN server and the other on a database server. The Administration Center is responsible for the management and administration of all projects. The database server stores administration metadata such as user accounts, access rights and project authorizations, whereas the SVN server stores project metadata such as Jobs, Business Models, Routines, Routes and Services. This makes it easier for end users to share project data.
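The split between the two repositories can be sketched as follows. This is a toy model under stated assumptions, not Talend's implementation: the class names `AdminDatabase`, `ProjectRepository` and `AdministrationCenter`, and the user and project names, are all hypothetical, and in-memory dictionaries stand in for the database and SVN servers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AdminDatabase:
    """Stands in for the database server: who may work on which project."""
    authorizations: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, user: str, project: str) -> None:
        self.authorizations.setdefault(user, set()).add(project)

    def is_authorized(self, user: str, project: str) -> bool:
        return project in self.authorizations.get(user, set())


@dataclass
class ProjectRepository:
    """Stands in for the SVN server: project items such as Jobs and Routines."""
    items: Dict[str, List[str]] = field(default_factory=dict)

    def commit(self, project: str, item: str) -> None:
        self.items.setdefault(project, []).append(item)


class AdministrationCenter:
    """Checks authorization in one store before serving items from the other."""

    def __init__(self, admin_db: AdminDatabase, repo: ProjectRepository) -> None:
        self.admin_db = admin_db
        self.repo = repo

    def open_project(self, user: str, project: str) -> List[str]:
        if not self.admin_db.is_authorized(user, project):
            raise PermissionError(f"{user} is not authorized for {project}")
        return list(self.repo.items.get(project, []))


admin_db = AdminDatabase()
repo = ProjectRepository()
admin_db.grant("alice", "sales_dwh")
repo.commit("sales_dwh", "Job: load_orders")
repo.commit("sales_dwh", "Routine: clean_phone_number")

tac = AdministrationCenter(admin_db, repo)
print(tac.open_project("alice", "sales_dwh"))
```

Keeping the two kinds of metadata apart means that access rights can change without touching the shared project items, and vice versa.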

Execution & Deployment

This block is responsible for the execution and deployment of Jobs. You can deploy one or more Job Servers inside your information system. These servers run the Jobs, or technical processes, according to the scheduled time, date or event set in the Talend Administration Center web application. An end user can also transfer any Job to a remote execution server directly from a Studio, which is called a 'distant run' in Talend.
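As a rough illustration of time-based scheduling across several Job Servers, consider the sketch below. It is an assumption-laden toy: the `ScheduledJob` class, the `dispatch_due_jobs` function, the server names and the round-robin assignment policy are all hypothetical, and the article does not say how Talend actually chooses a server.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import cycle
from typing import List, Tuple


@dataclass
class ScheduledJob:
    name: str
    run_at: datetime  # time-based trigger, as it might be set by an administrator


def dispatch_due_jobs(
    jobs: List[ScheduledJob], servers: List[str], now: datetime
) -> List[Tuple[str, str]]:
    """Assign every Job whose trigger time has passed to a server, round-robin."""
    if not servers:
        raise ValueError("at least one Job Server is required")
    next_server = cycle(servers)
    # Oldest trigger first, so overdue Jobs are not starved.
    due = sorted((j for j in jobs if j.run_at <= now), key=lambda j: j.run_at)
    return [(job.name, next(next_server)) for job in due]


jobs = [
    ScheduledJob("load_orders", datetime(2018, 3, 22, 2, 0)),
    ScheduledJob("load_customers", datetime(2018, 3, 22, 1, 0)),
    ScheduledJob("weekly_report", datetime(2018, 3, 25, 6, 0)),
]
assignments = dispatch_due_jobs(
    jobs, ["jobserver-1", "jobserver-2"], now=datetime(2018, 3, 22, 3, 0)
)
print(assignments)
```

Here only the two Jobs whose trigger time lies before `now` are dispatched; `weekly_report` stays pending until its own trigger time is reached.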

With this, we come to the end of this blog on Talend architecture. I hope it was informative and that you enjoyed reading it.

If you wish to check out more articles on the market’s most trending technologies like Artificial Intelligence, DevOps, Ethical Hacking, then you can refer to Edureka’s official site.

Do look out for other articles in this series which will explain the various other aspects of Talend.

1. What is Talend?

2. Talend Tutorial

3. Talend ETL Tutorial

4. Talend Big Data Tutorial

Originally published at www.edureka.co on March 22, 2018.
