Quantcast
Channel: Teradata Developer Exchange - Database
Viewing all articles
Browse latest Browse all 174

Unity Source Link Scenario

$
0
0
Short teaser: 
Use Unity Source Link to import and query foreign data from Teradata Database.
Cover Image: 

Overview of Unity Source Link

Unity Source Link (USL) allows reads and joins from an external or foreign database to Teradata Database without requiring data replication in the Teradata data warehouse. USL can be used to:
  • create heterogeneous joins for reporting
  • insert or select from a foreign database into Teradata Database
  • migrate data to Teradata Database
Note: USL currently supports only importing data from Oracle databases. Complex data types, such as large object (LOB), user-defined type (UDT), period, and interval are not supported.
More...
The figure below illustrates the system hardware and software components.
  • The USL portal server is a dedicated Teradata Managed Server (TMS) with Linux SLES 11. Your system can have more than one USL portal server, with a single portal on each server.
  • USL portal software connects Teradata Database to one or more foreign databases.

The USL software components run on the USL portal server and on Teradata Database nodes:
  • The USL portal software on the TMS server acts as a gateway between Teradata Database and the foreign database. The USL portal software uses ODBC to access the foreign database. It handles data conversion between the foreign data type and the Teradata data type.
  • External Access Handler (EAH) software resides in Teradata Database in a dedicated database named DBCForeign. EAH has three sub-components: stored procedures, the import table operator, and tables
    • Stored procedures are the interface for administrative and end users to Teradata Database.
    • The tables track foreign database connections in DBCForeign database.
    • The import table operator runs in a java virtual machine on each node in Teradata Database and communicates through a socket connection with the USL portal server.

Unity Source Link Workflow

Unity Source Link is implemented and used by two distinct roles.
  1. The Unity Source Link administrative user:
    1. Calls the CreatePortal stored procedure to define a portal in DBCForeign, including the name, IP address, port, and connection timeout information.

      The portal entry is stored in DBCForeign.Portals table on Teradata Database and views are created for accessing statistics, requests, and session information for the portal.

    2. Calls the CreateServer stored procedure to define servers in DBCForeign, including the server name, user name, password, portal name, server data source name (DSN), and data source.

      The server row is stored in the DBCForeign.Servers table on Teradata Database.

    3. Configures access in DBCForeign.ServerUserMapping table on Teradata Database to specify which users or roles have access to which server and informs users which server names are mapped for their access.
    4. Grants privileges for EAH stored procedures:
      • USL administrators need all privileges on DBCForeign
      • USL users need EXECUTE PROCEDURE privileges for CreateLink and DeleteLink stored procedures
  2. The Unity Source Link end user:
    1. Calls the CreateLink stored procedure to define views to foreign tables so they can be queried from any SQL client, such as SQLA and BTEQ.

      An entry for tracking each view is created in the DBCForeign.Links table on Teradata Database with "LinkName@ServerName" as the view name.

      The view is defined as a query from the import table operator. Table operators accept a table or table expression as input and generate a table as output. They are a type of user-defined function (UDF). Table operators provide a simplified in-database MapReduce-style programming model. Table operators allow output column definitions to be dynamically determined at runtime. In USL, the output column definitions are determined based on the column definitions of the foreign table or the result table of the foreign table expression at query time.

    2. Queries the views created by the stored procedure.
      • When querying the views, any select, filtering, ordering, grouping, operations, or functions that are part of the view definition are executed by the foreign SQL database engine.
      • When querying the views, any select, filtering, ordering, grouping, operations, functions, or joins that are outside of the view definition are executed by the Teradata Database engine.

USL Performance Considerations

Because USL links heterogenous systems, overall performance is affected by processes taking place on the USL Portal Server, on Teradata Database, and on the foreign database.

USL Portal Server and Performance

In addition to connecting Teradata Database to a foreign database, one of the main functions of the USL Portal Server is to process any required data conversions between the two systems. This means Teradata Database is relieved of the burden of data conversions, aiding in overall system performance when processing data from the foreign database.

Database Execution

USL uses two different SQL database engines, making it important to understand which portions of a query are executed by which database engine.

The CreateLink stored procedure creates a view named "LinkName@ServerName", and subsequent query execution depends on the view definition.
  • When querying the views, any select, filtering, ordering, grouping, operations, or functions that are part of the view definition are executed by the foreign SQL database engine.
  • When querying the views, any select, filtering, ordering, grouping, operations, functions, or joins that are outside of the view definition are executed by the Teradata Database engine.

Identifying High-Value Customers Across Multiple Sites

This scenario illustrates a heterogeneous join.

A company owns multiple casino properties. One casino (C1) is in Las Vegas and the other (C2) is in Atlantic City. A customer walks into the Las Vegas property and uses their loyalty card. The management wants to know the customer's status but they only have activity data on their customers. They query the activity data from other casino properties as well as their own. The problem is that the different casino properties are somewhat independent and use different database vendors. Las Vegas uses Teradata Database and Atlantic City uses a foreign database. In this simple scenario they are looking for the customer's last activity date and total winnings in all past visits.

For More Information about Unity Source Link

For more information about Unity Source Link and about the SQL used in these examples, see:

Document Description
Unity Source Link User Guide, B035-2601 Describes setup, administration, and use of Unity Source Link.
SQL Functions, Operators, Expressions, and Predicates, B035-1145 Describes the UNION operator and other functions, operators, expressions and predicates.
SQL Data Manipulation Language, B035-1146 Describes the CALL statement, used for invoking stored procedures, and other DML statements.
SQL Data Definition Language - Syntax and Examples, B035-1144 Describes the SHOW VIEW statement, and other DDL statements.

Scenario Assumptions

This scenario assumes the following tasks have already been performed by the system and Unity Source Link administrators:

  • The External Access Handler software has been installed on Teradata Database.
  • Appropriate access rights to DBCForeign have been granted to the Unity Source Link Administrator, and the USL administrator has granted appropriate access rights to users who require access to USL.
  • The Unity Source Link Portal Server service has been started and configured.
  • A portal that defines the Teradata Database connection to the foreign database has been created. The portal must use the same port number as that used by the service (the default is port 5678).
  • ODBC data sources have been created on the Unity Source Link Portal Server (a Teradata Managed Server) to access the foreign database. In our scenario, the foreign database is Oracle.
  • Teradata Wallet has been configured to store and protect the Oracle Database passwords in an encrypted form.

Our scenario will show how a Teradata Database user with appropriate privileges can access the data in both the Teradata and Oracle databases to answer questions about casino guests.

Assume the following customer data exists in the Teradata database table c1 for the Las Vegas casino:

Cust_ID Activity_DT Tot_Dollars
1 2012-10-15 1000.00
1 2012-10-16 - 500.00
1 2012-11-01 2000.00
3 2012-12-30 -1000.00

Assume the following customer data exists in the Oracle database table Datatypes.c2 for the Atlantic City casino:

Cust_ID Activity_DT Tot_Dollars
2 2012-10-15 1000.00
2 2012-11-15 -500.00
3 2013-01-01 5000.00
3 2013-01-02 -2000.00
3 2012-01-01 -500.00

Links to Foreign Databases

USL allows you to create links to foreign databases. These links are views in DBCForeign that you can query the same way that you query tables and views in Teradata Database.

Creating a Link

When a link is created, a row is added to the DBCForeign.Links table. This row contains the characteristics of the link. During the execution of the CreateLink stored procedure, a view is created in the form of "LinkName@ServerName".
  1. Use the CreateLink command to define a link in Teradata Database to a table in a foreign database to create a view for this table in the DBCForeign database. Example:
    CALL DBCForeign.CreateLink('LinkName', 'TableName', 'ServerName', 'SelectList', 'Clause', outStatus);
    Option Description
    LinkName

    Name of the link. The view that gets created takes the form of "LinkName@ServerName'".

    TableName

    Name of the foreign table to be accessed, up to 128 characters.

    ServerName

    Name of a server that was previously created using the CreateServer stored procedure. A list of the created servers can be obtained by querying the DBCForeign.Servers table.

    SelectList
    [Optional] List of foreign table columns to be selected. If no select list is specified, all foreign table columns are selected. The list is not limited to a simple list of columns. SQL must be well-formed. It can contain the following operations and functions:
    • scalar operations, such as c1 + 10
    • aggregate functions
    • non-aggregate columns, must appear in a GROUP BY clause
    • user defined functions
    Clause
    [Optional] SQL clause to specify filtering, grouping, or ordering operations that are performed on the table. If no clause is specified, no filtering, grouping, or ordering operations are performed on the table. The WHERE keyword is not needed if the clause only contains a WHERE clause.
    • c3 > 10
    • c3 > 10 GROUP BY c1
    • c3 > 10 GROUP BY c1
    • GROUP BY c1
    outStatus

    Returns a message indicating the stored procedure executed successfully or an error occurred.

Example: Creating a Link

  1. Login to the USL portal server and use a SQL client to create the link.
    CALL DBCForeign.CreateLink('c2Link2', 'Datatypes.c2', 
    'Connect2Oracle', 'Cust_ID, MAX(Activity_DT) dt, 
    SUM(Tot_Dollars) td', 'GROUP BY Cust_ID', st);
    
    *** Procedure has been executed. 
    *** Total elapsed time was 3 seconds.
    
    
    outStatus
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    [USL7109] OK! User =  DATATYPES_USER1, Select List = "Cust_ID, MAX(Activity_DT) dt, SUM(Tot_Dollars) td", Clause = "GROUP BY Cust_ID", View = "c2Link2@Connect2Oracle"
    
    SHOW VIEW DBCForeign."C2Link2@Connect2Oracle";
    
    *** Text of DDL statement returned. 
     *** Total elapsed time was 1 second.
    
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    CREATE VIEW "DBCForeign"."c2Link2@Connect2Oracle" AS (SELECT * FROM "DBCForeign"."Import" (
    USING 
    I_TABLE('Datatypes.c2') 
    I_SQL('SELECT Cust_ID, MAX(Activity_DT) dt, SUM(Tot_Dollars) td 
    FROM Datatypes.c2 
    GROUP BY Cust_ID')
    I_PORTAL_IP('153.64.25.217')
    I_PORT( 5005) 
    I_DATASOURCE('oracle') 
    I_DSN('OracleWP') 
    I_TIMEOUT(    20) 
    I_UID('datatypes_user1') 
    I_PWD('$tdwallet')) AS I1)
    

Example: Querying a View

Query the activity data from other casino properties together with local data for the customers' last activity date and their total past winnings.
  1. Run a query.
    SELECT Cust_ID (TITLE 'Customer'),
           MAX(dt) (TITLE 'Last Visit'),
           SUM(td) (TITLE 'Winnings')
    FROM (SELECT Cust_ID, MAX(Activity_DT) dt, SUM(Tot_Dollars) td
          FROM c1 
          GROUP BY 1
          UNION
          SELECT Cust_ID, dt, td
          FROM DBCforeign."c2Link2@Connect2Oracle" 
         ) c
    GROUP BY 1
    WHERE Cust_ID = 3;
    
    The new query returns one row:
      Customer  Last Visit  Winnings
    -----------  ----------  --------
              3    13/01/02   1500.00
    

Discussion of the USL Example Query

Notice that the query selects information from both the Teradata database table holding the Las Vegas customer information and the view of the Oracle database table data holding the Atlantic City customer information. The necessary data transformation is taken care of automatically by USL so the information can be combined and manipulated seamlessly by Teradata Database.

The SELECT statement immediately above the UNION collects the Las Vegas data in a table c1. That SELECT statement aggregates the table data per customer to show the most recent casino visit date and the sum of all the money won or lost by the customer up to and including that date. By itself, that SELECT would return this data:

Cust_ID dt td
3 12/12/30 -1000.00
1 12/11/01 2500.00

The SELECT statement immediately after the UNION queries the view of the Oracle data. This view was created in Teradata Database. Because the link statement specified similar aggregations on the Oracle data, this SELECT statement would return data organized in similar columns to the data above:

Cust_ID dt td
2 12/11/15 5000.00
3 13/01/02 2500.00

That data in the original Oracle table might have looked like this before USL transformed it to match the data formats used in Teradata Database:

CUST_ID DT 
2. 2012-11-15 00:00:00 5.000
3. 2013-01-02 00:00:00 2.500

The complete query that combines data from both the Las Vegas and Atlantic City casinos uses a WHERE clause to limit the data to the customer of interest, and mathematically sums the customer's winnings from both casinos to present the overall grand total amount won or lost by that customer at both casino locations:

Customer Last Visit Winnings
3 13/01/02 1500.00
Supersede
Ignore ancestor settings: 
0
Apply supersede status to children: 
0
Channel: 

Viewing all articles
Browse latest Browse all 174

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>