Informatica Part - I



 What is BI?
-----------------
Business Intelligence (BI) is the process of collecting raw business data and turning it into useful, meaningful information.  The raw data consists of records of an organization's daily transactions, such as interactions with customers, administration of finance, and management of employees.  This data is then used for reporting, analysis, data mining, data quality and interpretation, and predictive analysis.


What is Data Warehouse?
----------------------------------
A data warehouse is a database designed for query and analysis rather than for transaction processing.  It is constructed by integrating data from multiple heterogeneous sources.  It enables a company or organization to consolidate data from several sources, and it separates the analysis workload from the transaction workload.  Data is turned into high-quality information to meet enterprise reporting requirements for all levels of users.

ETL Concepts
----------------------
ETL stands for extraction, transformation, and loading.  It refers to the methods involved in accessing and manipulating source data and loading it into a target database.

The first step in the ETL process is mapping the data between the source systems and the target database (data warehouse or data mart).  The second step is cleansing the source data in the staging area.  The third step is transforming the cleansed source data and then loading it into the target system.
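
The three steps can be sketched in plain Python. This is only an illustration under stated assumptions, not how Informatica implements them: the source rows, the field names in FIELD_MAP, and the `sales` target table are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import sqlite3

# Hypothetical source rows, standing in for an extract from a source system.
SOURCE_ROWS = [
    {"cust_id": "1", "name": " alice ", "amount": "120.50"},
    {"cust_id": "2", "name": "BOB", "amount": "80"},
    {"cust_id": "2", "name": "BOB", "amount": "80"},   # duplicate record
    {"cust_id": "3", "name": "carol", "amount": ""},   # missing amount
]

# Step 1: mapping between source fields and target columns (illustrative names).
FIELD_MAP = {"cust_id": "customer_id", "name": "customer_name", "amount": "sale_amount"}


def cleanse(rows):
    """Step 2: drop exact duplicates and rows with a missing amount."""
    seen, clean = set(), []
    for row in rows:
        key = tuple(row.items())
        if key in seen or not row["amount"].strip():
            continue
        seen.add(key)
        clean.append(row)
    return clean


def transform(rows):
    """Step 3a: rename fields per the mapping and convert types."""
    out = []
    for row in rows:
        mapped = {FIELD_MAP[k]: v for k, v in row.items()}
        mapped["customer_id"] = int(mapped["customer_id"])
        mapped["customer_name"] = mapped["customer_name"].strip().title()
        mapped["sale_amount"] = float(mapped["sale_amount"])
        out.append(mapped)
    return out


def load(rows, conn):
    """Step 3b: load the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, customer_name TEXT, sale_amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (:customer_id, :customer_name, :sale_amount)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(cleanse(SOURCE_ROWS)), conn)
loaded = conn.execute("SELECT * FROM sales ORDER BY customer_id").fetchall()
print(loaded)
```

Of the four source rows, only two reach the target: the duplicate and the row with no amount are removed during cleansing.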

Note that ETT (extraction, transformation, transportation) and ETM (extraction, transformation, move) are sometimes used instead of ETL.

Put another way, Extract, Transform & Load (ETL) is a data warehousing process with three stages:
1. Extract: pull data from different applications, which are developed and supported by different vendors, managed and operated by different people, and hosted on different technologies, into staging tables.
2. Transform: apply a series of rules or functions to the data in the staging tables to derive the data for loading into the destination system.  These may include joining and deduplicating data, filtering and sorting on specific attributes, transposing data, and making business calculations.
3. Load: load the data into the destination system, usually the data warehouse, where it can be used for business intelligence and reporting.


Source System
--------------------
A database, application, file, or other storage facility from which the data warehouse is derived.

Mapping
----------------
The definition of the relationship and data flow between source and target objects.
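
As a rough illustration, a mapping can be thought of as a list of (source field, target column, expression) entries. The sketch below is plain Python, not Informatica's own mapping format; the field names and the apply_mapping helper are hypothetical.

```python
# Each entry: (source field, target column, expression applied on the way).
# All names here are made up for illustration.
MAPPING = [
    ("emp_no", "employee_id", int),
    ("emp_name", "employee_name", str.strip),
    ("sal", "monthly_salary", float),
]


def apply_mapping(source_row, mapping):
    """Build one target row by following the source-to-target definitions."""
    return {target: expr(source_row[source]) for source, target, expr in mapping}


row = apply_mapping({"emp_no": "101", "emp_name": " Ravi ", "sal": "2500"}, MAPPING)
print(row)
```

Keeping the mapping as data, separate from the code that applies it, mirrors the idea that a mapping is a definition rather than a program.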

Metadata
--------------
Data that describes data and other structures, such as objects, business rules, and processes.  For example, the schema design of a data warehouse is typically stored in a repository as metadata, which is used to generate scripts used to build and populate the data warehouse.  A repository contains metadata.
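
To make the idea of generating scripts from metadata concrete, here is a minimal sketch. It assumes the metadata is held as a Python dictionary and that the target accepts SQLite-style DDL; the table name, the column names, and the build_create_table helper are hypothetical.

```python
import sqlite3

# Metadata: data that describes the structure of a warehouse table.
TABLE_METADATA = {
    "table": "dim_customer",
    "columns": [
        {"name": "customer_id", "type": "INTEGER", "nullable": False},
        {"name": "customer_name", "type": "TEXT", "nullable": False},
        {"name": "city", "type": "TEXT", "nullable": True},
    ],
}


def build_create_table(meta):
    """Generate a CREATE TABLE script from the metadata."""
    cols = []
    for col in meta["columns"]:
        null_clause = "" if col["nullable"] else " NOT NULL"
        cols.append(f'{col["name"]} {col["type"]}{null_clause}')
    return f'CREATE TABLE {meta["table"]} ({", ".join(cols)})'


ddl = build_create_table(TABLE_METADATA)
print(ddl)

# Run the generated script against an in-memory database to build the table.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [r[1] for r in conn.execute("PRAGMA table_info(dim_customer)")]
```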

Staging Area
--------------------
A place where data is processed before entering the warehouse.

Cleansing
--------------
The process of resolving inconsistencies and fixing anomalies in source data, typically as part of the ETL process.
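
Two common kinds of inconsistency are different spellings of the same value and different date formats. The sketch below shows one possible way to resolve both; the alias table, the list of accepted date formats, and the function names are assumptions made for this example.

```python
from datetime import datetime

# Hypothetical lookup of inconsistent spellings to one standard code.
COUNTRY_ALIASES = {"usa": "US", "u.s.": "US", "united states": "US", "india": "IN", "ind": "IN"}

# Date formats assumed to occur in the source data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def clean_country(value):
    """Map known aliases to a standard code; otherwise trim and upper-case."""
    key = value.strip().lower()
    return COUNTRY_ALIASES.get(key, value.strip().upper())


def clean_date(value):
    """Return the date in ISO format, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(clean_country(" USA "), clean_date("25/12/2020"), clean_date("not a date"))
```

Returning None for an unparseable date, rather than raising an error, lets the caller decide whether to reject the row or route it for review.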

Transformation
----------------------
The process of manipulating data.  Any manipulation beyond copying is a transformation.  Examples include cleansing, aggregating, and integrating data from multiple sources.
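
For instance, integrating and aggregating can happen in a single step. The sketch below combines order amounts from two hypothetical sources (a store system and a web system) into one total per customer; the data and the total_by_customer helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (customer_id, amount) records from two source systems.
ORDERS_STORE = [("C1", 100.0), ("C2", 40.0)]
ORDERS_WEB = [("C1", 25.5), ("C3", 10.0)]


def total_by_customer(*sources):
    """Integrate several sources and aggregate the amount per customer."""
    totals = defaultdict(float)
    for source in sources:
        for customer_id, amount in source:
            totals[customer_id] += amount
    return dict(totals)


totals = total_by_customer(ORDERS_STORE, ORDERS_WEB)
print(totals)
```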

Transportation
---------------------
The process of moving copied or transformed data from a source to a data warehouse.

Target System
---------------------
A database, application, file, or other storage facility into which the transformed source data is loaded, typically a data warehouse.


INFORMATICA 9.0:
======================

-- Informatica is an ETL tool
-- Informatica is a product of Informatica Corporation
-- It is a GUI-based tool
-- The base language used to design this tool is Java
-- The first version of Informatica was released in 1993
-- Informatica versions: 4.7, 5.0, 6.0, 7.1.1, 8.1.1, 8.5, 8.6 and 9.0

Informatica is released in two flavors:
1. Informatica PowerCenter
2. Informatica PowerMart

INFORMATICA :
------------------------

Informatica is an integrated tool set
to DESIGN
to RUN
to MONITOR
to ADMINISTER
the plan of data acquisition, known as a mapping.

Repository Manager
-----------------------------
The Repository Manager allows for easy administration, searching, and reporting of one or more repositories.

Designer
----------------
The Designer helps you create source definitions, target definitions, and transformations to build your mappings.

Workflow Manager
-----------------------------
In the Workflow Manager, you define a set of instructions called a workflow to execute mappings you build in the Designer.

Workflow Monitor
-----------------------------
The Workflow Monitor is a tool that allows you to monitor workflows and tasks.  You can view details about a workflow or task in either Gantt Chart view or Task View.


















