IBM BI Tookit Datastage V1 0

Date post: 06-Jul-2018
Upload: umesh-kumar

  • 8/18/2019 IBM BI Tookit Datastage V1 0

    1/141

    © Copyright IBM Corporation 2006

    IBM Global Business Services

    Course Title

    Business Intelligence (BI) Development

    Toolkit for Datastage 

    Duration of course: X hours

    Presentation Title | IBM Internal Use Document

    Course Objective

    At the completion of this course you should be able to understand:

    Overview of the processes followed in a standard development project.

    Various phases and related work products associated with the development process.

    Importance of generating various work products.

    Standards / Best practices / Tips & tricks specific to the tool.

    Insight into different types of projects.

    Different types of testing.

    Course Content

    Module 1: DataStage Low Level Design

    Module 2: DataStage Coding Standards

    Module 3: DataStage Best Practices - Tips & Tricks

    Module 4: Version Control

    Module 1: DataStage Low Level Design

    BI Development Toolkit for Datastage

    Module Objectives

    At the completion of this chapter you should be able to:

    - Understand the concept of the Low Level Design process.

    - Know what a Low Level Design Document looks like.

    Low Level Design - Agenda

    Key points described in the Low Level Design

    Topic 1: Introduction.

    Topic 2: Objectives/Purpose.

    Topic 3: Scope.

    Topic 4: Core Aspects of Design.

    Topic 5: Low Level Technical Overview.

    Topic 6: Low Level Technical Design.

    What is a Low Level Design?

    The Low Level Design details all the technical aspects involved in the DataStage ETL process with respect to the following:

    Source/Target Names and Locations

      This section contains the names of the source/target tables or files, the schema details for tables, and the server details for files.

    Source/Target Structures, i.e. table structure or file structure

      This describes the field names in a table along with their data types or, for a flat file, whether it is delimited or fixed width.

    Source to Target mapping

      Explains how data flows from source to target.

     

    QA to find any data quality issues.

    Jobs/Sequences/Master Sequencer Details.

      This section shows the names of the jobs, sequences and master sequencers, along with the transformation details.

    Partitioning information, if any.

    Scheduling information, etc.

     

    Sample Low Level Design

      LLD_Template

    Key Points

    Step Overview

    This shows the key elements, e.g. the inputs, outputs and key activities involved, along with the artifacts.

    Key Activities

    Analysis of the High Level Design.

    Identify the key elements to be included in the Low Level Design.

    Understanding of the entire flow from source to target, including the mapping rules.

    Outputs

    Technical Specification

    Inputs

    High Level Design

    Roles

    Developer

    Templates and Sample Artifacts

    Sample Artifact

    Module Objectives

    At the completion of this chapter you should be able to:

    - Know the job-level naming conventions used in DataStage.

    - Know the parameter naming conventions used in DataStage.

    - Know the proper documentation standards/commenting within the job.

    - Know the proper use of environmental/generic parameters as a standard practice.

    - Identify the key coding standard principles.

    Datastage Coding Standards - Agenda

    Topic 1: Coding standards

    - Repository structure in DataStage.

    - ETL coding standard guidelines.

    Topic 2: Job Naming Conventions

    - Stage Naming Conventions.

    - Link Naming Conventions.

    - Container Naming Convention.

    - Parameter Naming Convention.

    Topic 3: Job Naming Conventions

    - Stage Naming Conventions.

    - Link Naming Conventions.

    - Container Naming Convention.

    - Parameter Naming Convention.

    Coding standard

    What is a Coding standard?

    The set of rules or guidelines that tells developers how they must write their code. Instead of each developer coding in their own preferred style, they will write all code aligning to ETL standards, ensuring the consistency of the designed ETL application throughout the project.

    Benefits

    Reducing development time.

    Enabling new members of the team to quickly pick up development.

    Allowing for flexibility in exchanging team members between the Data Conversion and the Data Warehouse / Reporting teams.

    Providing a template to follow.

    Enabling multiple teams/team members to work on multiple phases.

    Serving as a basis (after the completion of the pilot project) for the development of jobs for all other countries.

    Making use of the GUI and self-documenting nature of the tool.

    Maintainability.

    Repository structure

    The repository is the central storing place for "Build" related components. It is a key component of the software whilst developing jobs in DataStage Designer.

    Data Elements - A specification that describes the type of data in a column and how the data is converted. (Server jobs only.)

    Jobs - Folder for jobs that are built, compiled and run.

    Routines - The BASIC language can be used to write custom routines that can be called upon within server jobs. Routines can be re-used by several server jobs.

    Shared Containers - A shared container is a re-usable item stored in the repository and available to any job in the project.

    Stage Types - Any stage used in a project; this can be a data source, data transformation, or data warehouse.

    Table Definitions - A definition describing the data you want, including information about the data table and the columns associated with it. Also referred to as metadata.

    Transforms - Similar to routines, these take one value and compute another value from it.

    ETL Coding standard guidelines

    By using a simple repository structure, it is easier to navigate and find the components that are needed to build a job and, if a number of complicated schedules are used, the structure can also show the flow of jobs.

    It is a good idea to set up a folder structure based on a common feature, notably the architectural area.

    For each of these groups a Jobs and a Sequences folder is created. Thus, for each group two separate folders are created under the Jobs folder. These groups in turn can be divided into subgroups (and thus subfolders).

    Templates are stored in a separate Templates folder directly under the Jobs folder. It is expected that a small number of templates will suffice to create jobs at all levels, so that there is no need to create specific folders for templates at every level.

    Thoughtful naming of jobs and categories will help the developer in understanding the structure.

    If multiple versions of a source system are supported, it is a good idea to reflect the version number in the folder name, so that it is clear which version the corresponding jobs, sequences and templates were written for.

    Job Templates

      Each project should contain job templates in order to ensure that jobs are created with the proper number of job parameters and the correct job parameter names. These job templates are stored in a separate Templates folder directly under the Jobs folder.

    Jobs and Sequences

      Jobs can be grouped into folders based on a common feature, notably the architectural area they belong to. Thus, for each group a separate folder is created under the Jobs folder. These groups in turn can be divided into subgroups (and thus subfolders).

    Table Definitions

      The Table Definitions section contains metadata which can be imported from a number of sources, e.g. Oracle tables or flat files. The folders that this metadata is stored in must represent the physical origin or destination of a table or file. The recommended naming standard (and the default for ODBC) is:

      1st subfolder: database type (ODBC, Universe, DSDB2, ORAOCI9)

      2nd subfolder: database name.

    Hash Files

    Hash files can be stored either in Universe or in the file system of the operating system.

    Sequential Files

      A DataStage project will potentially use source, target, and intermediate files. These can be placed in separate directories. This will:

    Simplify maintenance.

    Allow data volumes to be spread evenly across multiple disks.

    Allow for closer monitoring of the file system.

    Allow for closer monitoring of data flow.

    Aid housekeeping processes.

    Naming Conventions

    What is a Naming Convention?

    This is an industry-accepted way to name various objects.

    A variety of factors are considered when assessing the success of a project. Naming standards are an important, but often overlooked, component. An appropriate naming convention:

    Establishes consistency in the repository.

    Provides a developer-friendly environment.

    Benefits

    Facilitates smooth migrations and improves readability for anyone reviewing or carrying out maintenance on the repository objects.

    It helps to understand the processes being affected, thereby saving significant time.

      The following pages suggest naming conventions for various repository components. Whatever convention is chosen, it is important to make the selection very early in the development cycle and communicate the convention to project staff working on the repository. The policy can be enforced by peer review and at test phases by adding processes to check conventions both to test plans and to test execution documents.

    Project Naming Conventions

    Component/Parameter: Project Name

    Suggested Naming Conventions:

      Typically a project contains a set of sequences / jobs / routines / table definitions / etc. This may be a particular release or version and is very much dependent on the project circumstances. The project name cannot contain spaces and punctuation.

      Distinction will be made according to the project stages: Development, Test, Acceptance, and Production, which will be appended to the project name in abbreviated (three character) format.
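
The project naming rule above can be sketched in a few lines of Python. This is only an illustration: the slides do not list the actual three-character abbreviations, so `DEV`, `TST`, `ACC` and `PRD` below are hypothetical, as are the function and variable names.

```python
# Hypothetical three-character abbreviations; the source does not specify them.
STAGE_ABBREVIATIONS = {
    "Development": "DEV",
    "Test": "TST",
    "Acceptance": "ACC",
    "Production": "PRD",
}


def project_name(base_name: str, project_stage: str) -> str:
    """Append the abbreviated project stage to a base project name."""
    # The slides state that project names cannot contain spaces or punctuation.
    if not base_name.isalnum():
        raise ValueError("project name cannot contain spaces or punctuation")
    return base_name + STAGE_ABBREVIATIONS[project_stage]
```

For example, `project_name("Sales", "Test")` would give `SalesTST` under these assumed abbreviations.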

    Job Naming Conventions

    Component/Parameter: Job

    Suggested Naming Conventions:

      The job names used are very much dependent on the project. Usually job names contain a subject area (the target table), and possibly a job function (load, transform, clear, update, etc.).

      Job names have to be unique across all folders.

      For projects, the standard chosen is:

      <job function>_<target table>
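
As a minimal sketch of the job naming rule, the helpers below build a name of the form `<job function>_<target table>` and report names that occur in more than one folder. The underscore separator is reconstructed from the damaged slide text, and the function names and folder layout are assumptions made for illustration; they are not part of DataStage.

```python
from typing import Dict, List


def job_name(job_function: str, target_table: str) -> str:
    """Build '<job function>_<target table>' (separator is an assumption)."""
    parts = [job_function.strip(), target_table.strip()]
    if any(not part or " " in part for part in parts):
        raise ValueError("job function and target table must be non-empty, without spaces")
    return "_".join(parts)


def duplicate_job_names(jobs_by_folder: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return every job name found in more than one folder, with its folders."""
    folders_by_job: Dict[str, List[str]] = {}
    for folder, names in jobs_by_folder.items():
        for name in names:
            folders_by_job.setdefault(name, []).append(folder)
    return {name: folders for name, folders in folders_by_job.items() if len(folders) > 1}
```

A check like `duplicate_job_names` could be run during peer review to confirm that job names stay unique across all folders.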

    Stage Naming Conventions

      Passive stages: A passive stage indicates a data component, such as a sequential file, an Oracle table, or an ODBC source. In active stages some kind of processing occurs, such as sorting, transforming, aggregating, etc.

    Generic Convention: <data source type>_<data source name>

      where <data source type> is a two- to four-character (preferably three) abbreviation which is as clear and unambiguous as possible.

    Component/Parameter      Suggested Naming Convention
    Sequential File          Seq_<data source name>
    Complex Flat File        Cff_<data source name>
    Hash file                Hsh_<data source name>

    XML file                 Xml_<data source name>
    Oracle database          Ora_<data source name>
    DB2 database             DB2_<data source name>

    ODBC source              Odbc_<data source name>
    File transferred via FTP Ftp_<data source name>
    Siebel DA                Sbl_<data source name>
    Dataset                  Ds_<data source name>

      Active stages: In active stages some kind of processing occurs, such as sorting, transforming, aggregating, etc.

    Generic Convention: <stage_type>_<functional_name>

    In the case of a transformation, the <functional_name> typically consists of a verb (indicating the action that is performed) and a noun (the object of the action).

    Component/Parameter      Suggested Naming Convention
    Command                  Cmd_<functional_name>
    Aggregator               Agg_<functional_name>
    Folder                   Fld_<functional_name>

    Filter                   Fltr_<functional_name>
    Inter Process            Ipc_<functional_name>
    Link Partitioner         Lpr_<functional_name>
    Lookup                   Lkp_<functional_name>

    Merge                    Mrg_<functional_name>
    Sort                     Srt_<functional_name>
    Transformer              Xfm_<functional_name>

    Change Data Capture      Cdc_<functional_name>
    Funnel                   Fnl/Club_<functional_name>
    Join                     Join_<functional_name>

    Surrogate Key Generator  Key_<functional_name>
    Remove Duplicates        Ddup_<functional_name>
    Copy                     Cpy_<functional_name>
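
The stage prefixes listed above can be collected into a lookup table so that stage names are generated rather than typed by hand. The sketch below is a hedged illustration: the prefixes come from the slides, but the underscore separator is reconstructed from damaged text, only `Fnl` is used for the Funnel stage (the slides list "Fnl/Club"), and the names `STAGE_PREFIXES` and `stage_name` are hypothetical.

```python
# Prefixes taken from the naming tables; the dictionary itself is illustrative.
STAGE_PREFIXES = {
    # Passive stages
    "Sequential File": "Seq", "Complex Flat File": "Cff", "Hash file": "Hsh",
    "XML file": "Xml", "Oracle database": "Ora", "DB2 database": "DB2",
    "ODBC source": "Odbc", "File transferred via FTP": "Ftp",
    "Siebel DA": "Sbl", "Dataset": "Ds",
    # Active stages
    "Command": "Cmd", "Aggregator": "Agg", "Folder": "Fld", "Filter": "Fltr",
    "Inter Process": "Ipc", "Link Partitioner": "Lpr", "Lookup": "Lkp",
    "Merge": "Mrg", "Sort": "Srt", "Transformer": "Xfm",
    "Change Data Capture": "Cdc", "Funnel": "Fnl", "Join": "Join",
    "Surrogate Key Generator": "Key", "Remove Duplicates": "Ddup", "Copy": "Cpy",
}


def stage_name(stage_type: str, name: str) -> str:
    """Return '<prefix>_<name>' for a known stage type; raise KeyError otherwise."""
    if not name or " " in name:
        raise ValueError("the data source or functional name must be non-empty, without spaces")
    return f"{STAGE_PREFIXES[stage_type]}_{name}"
```

With this table, a Transformer that enriches customers would be named `stage_name("Transformer", "EnrichCustomer")`, following the verb-plus-noun guideline for functional names.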

    Link Naming Conventions

    Links must have a descriptive name. Unlike the stages, they start with a non-capital. If possible, let the name resemble the preceding stage name, but without the stage type, and using the past participle of the verb used in the preceding stage name.

    Examples:

    enrichedCustomer

    sortedOrders
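
The link naming rule can be sketched as a small function that derives a link name from the preceding stage name. It assumes stage names of the form `<stage_type>_<VerbNoun>` in CamelCase, and it forms the past participle with a naive "add -ed" rule, which is an assumption that fails for irregular verbs and for verbs such as "copy". The function name is hypothetical.

```python
import re


def link_name_from_stage(stage_name: str) -> str:
    """Derive a link name such as 'sortedOrders' from 'Srt_SortOrders'."""
    # Drop the stage type prefix, keeping only the functional name.
    _, _, functional_name = stage_name.partition("_")
    words = re.findall(r"[A-Z][a-z0-9]*", functional_name)
    if len(words) < 2:
        raise ValueError("expected a CamelCase verb followed by a noun")
    verb, noun = words[0].lower(), "".join(words[1:])
    # Naive past participle: regular verbs only.
    participle = verb + "d" if verb.endswith("e") else verb + "ed"
    return participle + noun
```

The result starts with a non-capital and omits the stage type, matching the two examples on the slide.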

    Container Naming Conventions

    Shared Containers

      The names of Shared Containers start with Scn_, followed by a meaningful name describing its function.

    Local Containers

      The names of Local Containers start with Lcn_, followed by a meaningful name describing its function.

    Stage Variable

    A Stage Variable is an intermediate processing variable that retains its value during read but does not pass its value to a target column.

    Stage variable names start with stg_ and reflect their usage.

    A standard must be set so that common stage variables are named consistently.

    Parameter Naming Conventions

    Parameters

    A parameter name should clearly reflect its usage.

    General

    The general naming convention is: P_<name>

    Database Parameter              Suggested Naming Convention
    Data Source Name                P_DB_<logical db name>_DSN
    User Identification             P_DB_<logical db name>_USERID
    User authentication Password    P_DB_<logical db name>_PASSWORD

    For directory (path) parameters the convention is: P_DIR_<usage>. The following directory parameters have been identified:

    Directory (path) parameter                             Suggested Naming Convention
    Source data for the job                                P_DIR_INPUT
    Destination directory                                  P_DIR_OUTPUT
    Directory for temp DS files                            P_DIR_TEMP
    Directory for error-reporting files                    P_DIR_ERROR
    Directory where csv and other reference data is held   P_DIR_REF
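
A short sketch of how the parameter naming convention might be applied in a script. The slide text is damaged, so the exact placement of underscores (`P_DB_..._DSN`, `P_DIR_...`) is an assumption, as is the upper-casing of the logical database name; both helper functions are hypothetical.

```python
from typing import Dict


def db_parameter_names(logical_db_name: str) -> Dict[str, str]:
    """Return the DSN, user id and password parameter names for one database."""
    # Underscore placement and upper-casing are assumptions, not confirmed by the source.
    base = f"P_DB_{logical_db_name.upper()}"
    return {
        "dsn": f"{base}_DSN",
        "userid": f"{base}_USERID",
        "password": f"{base}_PASSWORD",
    }


def dir_parameter_name(usage: str) -> str:
    """Return the directory (path) parameter name for a usage such as 'input'."""
    return f"P_DIR_{usage.upper()}"
```

Generating the names from one place keeps the three database parameters for each logical database consistent with each other.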

    Datastage Coding Principles and Standards

    Suggested Methods of Working

    Before editing a job, verify that the job in development is identical to the one in production. If not, request a copy from the production system.

    Create a backup copy of the job you are going to edit, so that you are able to return it to its original state if needed.

    After development has finished, clean up any backup copies of jobs you have created, so that there will be no misunderstandings as to what the correct job is.

    Documentation practices in a job

    Incorporating Comments

    One challenge of internal software documentation is ensuring that the comments are maintained and updated in parallel with the source code. Although properly commenting source code serves no purpose at run time, it is invaluable to a developer who must maintain a particularly intricate or cumbersome piece of software.

    Jobs Commenting

    Document all jobs in their Job Properties:

    Provide a short description containing a short, meaningful description.

    Provide a long description containing a history of version, date, changes made and by whom.

    Include a reference to the design, including its version.

    Document any special file references.

    When modifying jobs, always keep the short and long descriptions in the Job Properties up to date.

    Routines and Functions

    Routines and functions are documented in the short and long description fields (as are jobs), and in the code via comments.

    The comments in the short and long description fields (on the General tab) are similar to job comments.

    Provide a short description containing a short, meaningful description.

    Provide a long description containing a history of version, date, changes made and by whom.

    Include a reference to the design, including its version.

    Document any special file references.

    When modifying jobs, always keep the short and long descriptions in the Job Properties up to date.

    Use of parameters

    Definition

    Job parameters allow you to design flexible, reusable jobs, making a job independent from its source and target environments.

    If, for example, we want to process data using a certain user id and password, we can include these settings as part of the job design. However, when we want to use the job again for a different environment, we will most likely have to edit the design and recompile the job.

    Instead of entering constants as part of the job design, you can set up parameters which represent processing variables.

    Creating Project Specific Environment Variables

    Here are the standard steps to follow:

    Step 1 -> Start up DataStage Administrator.

    Step 2 -> Choose the project and click the "Properties" button.

    Step 3 -> On the General tab click the "Environment..." button.

    Step 4 -> Click on the "User Defined" folder to see the list of job specific environment variables.

    Step 5 -> Type in all the required job parameters that are going to be shared between jobs.

    Using Environment Variables as Job Parameters:

    Step 1 -> Open up a job.

    Step 2 -> Go to Job Properties and move to the Parameters tab.

    Step 3 -> Click on the "Add Environment Variables..." button (which doesn't add an environment variable but rather brings an existing environment variable into your job as a job parameter).

    Step 4 -> Add these job parameters just like normal parameters to stages in your job, enclosed by the # symbol, for example:

     - Database=#$DW_DB_NAME#

     - Password=#$DW_DB_PASSWORD#

     - File=#$PROJECT_PATH#/#SOURCE_DIR#/Customers_#PROCESS_DATE#.csv
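
To make the `#NAME#` notation concrete, the sketch below expands parameter references in a stage property string from a dictionary of values. It only illustrates the substitution idea and is not how DataStage resolves parameters internally; the function name, the regular expression and the sample values are all assumptions.

```python
import re
from typing import Dict

# A reference is a parameter name enclosed in '#'; environment variables start with '$'.
_REFERENCE = re.compile(r"#([$A-Za-z_][A-Za-z0-9_]*)#")


def expand_parameters(template: str, values: Dict[str, str]) -> str:
    """Replace each #NAME# reference in the template with its value."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for parameter {name}")
        return values[name]

    return _REFERENCE.sub(substitute, template)


# Hypothetical run-time values for the file example shown on the slide.
example = expand_parameters(
    "#$PROJECT_PATH#/#SOURCE_DIR#/Customers_#PROCESS_DATE#.csv",
    {"$PROJECT_PATH": "/data/project", "SOURCE_DIR": "source", "PROCESS_DATE": "20060701"},
)
```

Here `example` becomes `/data/project/source/Customers_20060701.csv`, showing how one file property can combine an environment variable with ordinary job parameters.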

      Points to Note

      We set the default value of the new parameter to "$PROJDEF" to ensure it is dynamically set each time the job is run.

      When the job parameter is first created, it has a default value that is the same as the value entered in the Administrator. By changing this value to $PROJDEF you instruct DataStage to retrieve the latest value for this variable at job run time.

      Set the value of these encrypted job parameters to $PROJDEF. We need to type it in twice in the password entry box.

      The "View Data" button will not work in server or parallel jobs that use environment variables set to $PROJDEF or $ENV. This is a defect in DataStage. It may be preferable to use environment variables in Sequence jobs and pass them to child jobs as normal job parameters, e.g. in a sequence job $DW_DB_PASSWORD is passed to a parallel job with the parameter DW_DB_PASSWORD.
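
The effect of the $PROJDEF default can be pictured with a small conceptual model: a parameter whose default is `$PROJDEF` takes whatever value the project currently holds, so changing the project value changes the next run without touching the job. This is a sketch of the idea only, not DataStage's implementation; the function and the dictionaries are hypothetical.

```python
from typing import Dict, Optional

PROJDEF = "$PROJDEF"


def resolve_parameters(
    job_defaults: Dict[str, str],
    project_values: Dict[str, str],
    run_overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Resolve job parameters at run time, following $PROJDEF to the project value."""
    overrides = run_overrides or {}
    resolved = {}
    for name, default in job_defaults.items():
        value = overrides.get(name, default)
        if value == PROJDEF:
            # Look up the value currently defined at project level.
            value = project_values[name]
        resolved[name] = value
    return resolved
```

A default that was copied from the Administrator when the parameter was created would stay fixed in this model, which is why the slides recommend replacing it with `$PROJDEF`.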


    Application examples - Environment

      Database name, username, password:

    Database names or access details can vary between environments or can change over time. By parameterising these at Project level, any change can be quickly applied without updating or recompiling all jobs.

      File names and Location:

    All file names and locations were specific to each run; thus the file names themselves were hard coded, but the file batch and run reference and related location were parameterised.

    Application examples - Process Flow

    Parameters can be manually entered at run time; however, to avoid data entry errors and speed up turnaround, parameter files were pre-generated and loaded within DataStage with minimal manual input. For example:

    MIGRATION_DATE - [set to the date the extract was taken]

    TARGET_SYSTEM - [set to the test environment name due to be loaded with data from this run]
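
Pre-generating a parameter file could look like the sketch below. The slides do not describe the file format, so the `NAME=value` layout, the ISO date format and the function names are all assumptions made for illustration.

```python
from datetime import date
from typing import List


def parameter_file_lines(migration_date: date, target_system: str) -> List[str]:
    """Build the lines of a parameter file (hypothetical NAME=value format)."""
    if not target_system.strip():
        raise ValueError("target system must not be empty")
    return [
        f"MIGRATION_DATE={migration_date.isoformat()}",
        f"TARGET_SYSTEM={target_system.strip()}",
    ]


def write_parameter_file(path: str, migration_date: date, target_system: str) -> None:
    """Write the parameter file so values are never typed in by hand at run time."""
    lines = parameter_file_lines(migration_date, target_system)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

Generating the file from the extract metadata removes the manual data entry step that the slide identifies as a source of errors.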

     

    Module 3: Datastage Best Practices / Tips and Tricks

    BI Development Toolkit for Datastage

    Module Objectives

    At the completion of this chapter you should be able to:

    - Describe Datastage Best Practices and Tips

    - Define Datastage Best Practices and Tips

    - Demonstrate Datastage Best Practices and Tips

    - Etc.

    Datastage Best Practices, Tips and Tricks - Agenda

    Getting started: Prerequisites

    Overview of the Data Migration Environment implemented

    Estimating a Conversion

    Preparing the DS environment

    Creating Project Level Parameters

    1. Designing Jobs
    6.1. General Design Guidelines
    6.2. Ensuring Restartability
    6.3. Sample Job Template
    6.4. Extracting Data
    6.5. Transforming the Extracted Data
    6.5.1. Performing Lookups
    6.5.2. Lookup stage Problem
    6.5.3. Using Transformer
    6.5.4. Transformer compared to Dedicated stages
    6.5.5. Tips: Sorting
    6.5.6. Tips: Removing Duplicates
    6.5.7. Null Handling
    6.5.8. When to configure nodes and partitioning

    6.6. Capturing Rejects
    6.7. Loading Valid Data
    6.8. Sequencing the jobs
    6.9. Job sequence vs Batch Scripts
    6.10 Tips: Releasing locked Jobs
    6.11. Mapping multiple stand

    6. Unit Testing of the modules

    Datastage Best Practices Tips and Tricks Agenda

    7. Maintenance Activity
    7.1 Backup and version control Activity
    7.2 Version Control in ClearCase
    7.3 DS Auditing Activity
    7.4 Retrieving Job Statistics
    Assuring Naming Conventions of components, jobs and categories
    7.5 Performance Tuning of DS Jobs

    8. Preparing UTP guidelines

    Datastage Best Practices Tips and Tricks Agenda

    9.1 Backup and version control Activities
    – Taking whole project backup
    – Taking Job level Export
    – Taking folder level Export
    – Version Control in ClearCase

    9.2 DS Auditing Activity
    – Tracking the list of modified jobs during a period
    – Retrieving Job Statistics
    – Getting the row counts of different jobs

    1. Getting Started

    For a typical Data Migration Environment, we have defined a roadmap for implementing the design using WebSphere DataStage, along with some tips and tricks acquired through experience.

    – Designing the architecture
    – Preparing the DS environment
    – Job Development Phase: creating the estimation model
    – Job Development Phase: designing the job template
    – Job Development Phase: Delivering Code Modules
    – Job Enhancement Phase: Version Control
    – DataStage Auditing Activity
    – DataStage Maintenance Activity

    3. Overview of Data Migration environment

    DataStage Requirement: Cleansed data is populated into staging area 0 from stage Legacy (which holds the cleansed records from legacy systems).

    Client specific business rules are validated primarily during the stage 0 to stage 1 load.

    Staging 2 is the final target of the DataStage load. Remaining validations can be applied here. Staging 2 records can be used by other applications for the final load to the target ERP.

    Staging area 0: here we have tables for loading Master records, transactional records and configuration data.

    Staging area 1: here we have the same tables as in Stage 0, but the data model can have small differences. Apart from that, there are tables for storing error records and the status of each run. We call them CNV_LOG and CNV_RUN respectively. The job repository tables (discussed in the auditing section) are also stored here.

    Staging area 2: this is similar to the Oracle ERP tables, which are loaded with stage 1 records.

    4. Estimating a conversion

    An overview of the load job designs needs to be chalked out.

    1. The number of lookups to be performed in the load job. The design of the lookup jobs should be explored (scope for a join stage, or whether the lookup can be performed using custom SQL in the source Oracle stage).

    2. The complexity of the transformer in the load job needs to be determined. In case of multiple lookups or a large number of validations, the complexity should be rated high and the contingency factor in the estimation model can be increased.

    3. The existence of mandatory fields (must be loaded in target) should be examined. Records failing a mandatory field can be rejected at the first opportunity (after the source DB stage) and sent to the log without any further validation. For non mandatory fields, the records can not be rejected and all the validations on the other columns need to be performed.
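The early rejection on mandatory fields described above can be sketched in Python. This is only an illustration of the rule, not DataStage code: the field names in `MANDATORY_FIELDS` and the `REJECT_REASON` column are hypothetical and would come from the project's own mapping document.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]

# Hypothetical mandatory target columns (assumption, not from the course).
MANDATORY_FIELDS = ["CUSTOMER_ID", "ORG_ID"]


def split_on_mandatory(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Return (valid, rejected).

    A record is rejected as soon as any mandatory field is null or empty,
    so no further validation is spent on it.
    """
    valid: List[Record] = []
    rejected: List[Record] = []
    for rec in records:
        missing = [f for f in MANDATORY_FIELDS if not rec.get(f)]
        if missing:
            # Keep the reason so the reject can be written to the error log.
            rejected.append({**rec, "REJECT_REASON": "missing " + ", ".join(missing)})
        else:
            valid.append(rec)
    return valid, rejected
```

Only the `valid` list would go on to the remaining column validations; the `rejected` list corresponds to the records sent straight to the log.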

    5. Preparing a DS environment

    The DataStage installation should be in place along with the other database installations.

    Project Level Environment variables have to be created to hold the connectivity values of the staging databases and the file locations for input, output and temporary storage.

    6.1 General Guidelines

    Templates have to be created to enhance reusability and enforce the coding standard. Jobs should be created using templates.

    The template should contain the standard job flow along with proper naming conventions of components, proper job level annotation and short/long description. A change record section should be kept in the long description to keep track of changes.

    Don't copy the job design only. Copy using the Save As or Create Copy option at job level.

    The DataStage connection should be logged off after completion of work to avoid locked jobs.

    6.4 Extracting Data

    1. Use the table method for selecting records from the source. Provide a select list and where clause for better performance.

    2. Pull the metadata into the appropriate staging folders in Table Definitions > Oracle. Always use the Orchdb utility to import metadata. It also imports the description part, which is helpful to keep track of the original metadata in case it is modified in the job flow.

    3. Avoid using the table name in the form of a parameter in Oracle stages.

    4. In case of some access restricted apps tables, the open command section of the Oracle stage should be used with the relevant query to access the data.

    5. Native API stages always perform better compared to the ODBC stage. So the Oracle stage should be used.

    6.5 Transforming extracted data

    • 6.5.1. Performing Lookups
    • 6.5.2. Lookup stage Problem
    • 6.5.3. Using Transformer
    • 6.5.4. Transformer compared to Dedicated stages
    • 6.5.5. Tips: Sorting
    • 6.5.6. Tips: Removing Duplicates
    • 6.5.7. Null Handling
    • 6.5.

    Using a Lookup stage:

    1. The number of datasets referenced in one lookup stage should be limited depending on the reference table data volume.

    2. To capture the failed records and store them in a definite format in an error table, the lookup failure condition and the condition not met option are set to CONTINUE, and hence the metadata of all the concerned columns in the output of the lookup stage should be made NULLABLE. It performs a left-outer join in this case (source is assumed as the left link).
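The effect of the CONTINUE setting can be pictured as a left outer join whose unmatched rows carry a null in the looked-up column. The Python sketch below only models that behaviour; the function and parameter names are illustrative and do not correspond to any DataStage API.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def lookup_continue(source_rows: List[Row], reference: Dict[Any, Any],
                    key: str, ref_column: str) -> List[Row]:
    """Left outer join: every source row is kept.

    When the key is not found in the reference data, the looked-up column
    is None, which is why that output column has to be nullable.
    """
    return [{**row, ref_column: reference.get(row[key])} for row in source_rows]


def split_failures(rows: List[Row], ref_column: str) -> Tuple[List[Row], List[Row]]:
    """Separate matched rows from lookup failures (null in the looked-up column)."""
    matched = [r for r in rows if r[ref_column] is not None]
    failed = [r for r in rows if r[ref_column] is None]
    return matched, failed
```

The `failed` list plays the role of the records written to the error table, while `matched` continues down the main flow.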

    Using parameters in Transformer:

    While passing job parameters to a target column in a transformer stage, project defaulted parameters can not be directly mapped to a target column. A job level parameter will not cause any problem. Possible solutions are:

    1. Create a job level parameter and map it to the actual project level parameter at sequence level.

    2. Use GetEnvironment("envvar"), like GetEnvironment(KT-O-COL)

    A parameter can not be used directly inside a stage variable in a Transformer (it will give a compilation error). The alternate strategy is to use a transformer/column generator stage prior to the validation transformer and insert the parameter value into a dummy field of the output dataset of that first stage. Further calculations can be carried out using that dummy column.

    6.5.2 Transformer compared to dedicated stages

    A PX Transformer is compiled into a C++ component separately and thus slows down the performance. It is a kind of all-rounder stage, and dedicated stages are available for many tasks:

    – Transformer constraints can be implemented using a Filter stage
    – For metadata conversion, we have the Modify stage
    – For dropping columns or to get multiple outputs, we can use the Copy stage
    – Counters can be implemented using a Surrogate Key stage

    These specialized stages are faster as they do not carry much overhead and should be used when no derivations are present.

    But these dedicated stages have problems too. In the Filter stage and the Modify stage, no syntax check is provided, and thus there is no easy way to ensure correct code unless we compile and analyze the error message. So, in many cases using a transformer enhances the maintainability of the code later on and is suggested if performance is not an issue.

    6.5.6 NULL Handling while concatenating error messages

    Suppose we are generating a key message from more than one field coming from the source. We need to be very careful here: if a field concatenated into the key message contains a null, the record may get dropped, especially if more fields are concatenated after it. Suppose this is our code to generate a key message:

    Here the field BANK_NUM is a nullable field

    – If len(VarFndBnkNum) > 0 Then "Customer ID: " : validateCustSiteUses.ID : ", BANK_ACCOUNT_NUM: " : validateCustSiteUses.BANK_ACCOUNT_NUM : ", BANK_NUM: " : validateCustSiteUses.BANK_NUM : ", ORG_ID " : validateCustSiteUses.ORG_ID_LK Else

    – In this case the record containing BANK_NUM = NULL will get dropped. But if we use a NullToEmpty conversion for the field, the code will work correctly, as below:

    If len(VarFndBnkNum) > 0 Then "Customer ID: " : validateCustSiteUses.ID : ", BANK_ACCOUNT_NUM: " : validateCustSiteUses.BANK_ACCOUNT_NUM : ", BANK_NUM: " : NullToEmpty(validateCustSiteUses.BANK_NUM) : ", ORG_ID " : validateCustSiteUses.ORG_ID_LK Else
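Why one null field can spoil the whole key message is easier to see in a small model. The Python sketch below assumes null-propagating concatenation, as the slide describes; `concat`, `null_to_empty` and `key_message` are illustrative helpers, not DataStage functions.

```python
from typing import Any, Dict, Optional


def concat(*parts: Any) -> Optional[str]:
    """Model of a null-propagating concatenation: one None makes the result None."""
    if any(p is None for p in parts):
        return None
    return "".join(str(p) for p in parts)


def null_to_empty(value: Any) -> Any:
    """Replace None with an empty string, leaving other values untouched."""
    return "" if value is None else value


def key_message(row: Dict[str, Any], safe: bool) -> Optional[str]:
    """Build the key message, optionally protecting the nullable BANK_NUM field."""
    bank_num = null_to_empty(row["BANK_NUM"]) if safe else row["BANK_NUM"]
    return concat("Customer ID: ", row["ID"],
                  ", BANK_NUM: ", bank_num,
                  ", ORG_ID: ", row["ORG_ID"])
```

With `safe=False` a row whose `BANK_NUM` is null yields no message at all, which corresponds to the dropped record; with `safe=True` the message is built with an empty value in that position.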

    – In most cases, the task of node configuration and partitioning has been left to DataStage (default Auto), and it partitions the input dataset based on the number of nodes (two in our case, so two partitions).

    – Customization is required when a join is performed (presort the data before the join) or when a sort stage is used (typical cases found till date).

    – In some cases the stage may need to be restricted to one node so that it creates only one process, which will work on the entire dataset, e.g. if we need to know the number of rows and write a stage variable as below:

    svRowCount = svRowCount + 1

    – Here, if the stage runs on two nodes, it will create two processes which will run on two partitions. So the final count would be half of the entire dataset.

    – This is also applicable for the logic of vertical pivoting in a Transformer using stage variables.
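The row-count problem can be reproduced outside DataStage. The Python sketch below assumes round-robin partitioning and gives each partition its own counter, which is how a stage variable behaves when the stage runs once per node; all names are illustrative.

```python
from typing import Any, List


def round_robin(rows: List[Any], nodes: int) -> List[List[Any]]:
    """Split rows across `nodes` partitions in round-robin order."""
    return [rows[i::nodes] for i in range(nodes)]


def running_counts(partition: List[Any]) -> List[int]:
    """Emulate svRowCount = svRowCount + 1 inside one process."""
    count = 0
    out = []
    for _ in partition:
        count += 1
        out.append(count)
    return out


def max_count_seen(rows: List[Any], nodes: int) -> int:
    """Highest counter value any single process reaches."""
    partitions = round_robin(rows, nodes)
    return max((max(running_counts(p)) for p in partitions if p), default=0)
```

On a single node the counter reaches the full row count; on two nodes each process only sees its own partition, so the highest value is roughly half of it.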


    7.1 Troubleshooting - Debugging techniques

    Using APT_DUMP_SCORE parameter

    This environment variable is available in the DataStage Administrator under the Parallel > Reporting branch. It configures DataStage to print a report showing the operators, processes, and data sets in a running job.

    Using APT_DISABLE_COMBINATION parameter

    Set the parameter APT_DISABLE_COMBINATION. This environment variable is available in the DataStage Administrator under the Parallel branch. It globally disables operator combining (default behavior: two or more operators within a step are combined into one process where possible). Note that disabling combining generates more UNIX processes, and hence requires more system resources and memory.

    7.1 Troubleshooting - Debugging techniques

    Enable the following environment variables in DataStage Administrator:

    – APT_PM_PLAYER_TIMING – shows how much CPU time each stage uses
    – APT_PM_SHOW_PIDS – shows the process ID of each stage
    – APT_RECORD_COUNTS – shows record counts in the log
    – APT_CONFIG_FILE – switches the configuration file (one node, multiple nodes)
    – OSH_DUMP – shows the OSH code for your job. Shows if any unexpected settings were set by the GUI.

    Use a Copy stage to dump out data to intermediate peek stages or sequential debug files. Copy stages get removed during compile time, so they do not increase overhead.

    Use a Row Generator stage to generate sample data.

    Look at the phantom files for additional error messages: c:\datastage\project_folder\&PH&

    7.2 Oracle error codes in Datastage

    Some common error codes have been listed for ready reference, along with possible remedies to resolve the issues faster.

    ORACLE ERROR CODES IN DS

    7.3 Common errors and resolution

    3) When checking operator: Operator of type "APT_LUTCreateOp": will partition despite the preserve-partitioning flag on the data set on input port 0.

    RESOLUTION: This tells that the job will repartition the data even though the code is telling the job to preserve the partitioning from upstream. Where this is happening, open up the stage and set the input link properties to Clear partitioning.

    4) When binding input interface field "FIELD1" to field "FIELD2": Converting a nullable source to a non-nullable result; a fatal runtime error could occur; use a modify operator to specify the value to which the null should be converted.

    RESOLUTION: As the failure condition is set to CONTINUE, the metadata of all the concerned columns in the output of the lookup stage should be made NULLABLE.

    7.4 Message Handler

    Local Message Handler:

    To suppress unwanted warnings, the following method can be followed:

    Right click on the warning which you want to handle > click on Add to Message Handler > click on Add Rule. In the next run, the messages will be handled and a consolidated message will be shown.

    While taking exports, the executables must also be promoted to use these handlers.

    Local runtime message handlers (Local.msh) are stored in the RT_SCnnnn folder under the specific project folder (the path can be found in the Project Pathname in Administrator), where nnnn is the job number generated from DS_JOBS.

    7.5 Local Runtime Message Handling In Director (screenshots 1 to 4)

    7.6 Tips: Job Level and Project level Message Handling

    Job Level Message Handler

    Allows for a job source only promotion of code, allows messages to be handled for a single job exclusively, and puts the message handling in a central location.

    There is a folder named MsgHandler in the DataStage directory. When a new message handler is saved, a new .msh file will be created.

    To take one project from the DEV server to another environment, these message handlers can not be exported directly along with the .dsx file; rather, the relevant .msh files need to be copied and saved to the same MsgHandler folder there. Then the exported job will compile and the message handler works fine.

    Project Level Message Handler

    Can be defined from Administrator. Applies to all the jobs in that project.

    APT_ERROR_CONFIGURATION is a parameter that can be configured to customize the error log.

    7.7 Using Job Level Message Handler (screenshots 1 to 3)

    8 Preparing UTP Guidelines

    9.3 Performance Tuning of DS Jobs
    – Analysing a flow
    – Measuring Performance
    – Designing for good performance
    – Improving performance

    9.4 Assuring Naming Conventions of components, jobs and categories

    9.5 Scheduled Maintenance

    9.1 Back Up and Recovery activity

    Back up activity

    Taking Job level Export: A Job Repository table has been created in stage 1. A sequence job runs to refresh this repository. This sequence calls a routine which extracts the job names and the associated category path into a sequential file. The subsequent load job loads the data into the repository.

    If some specific categories/jobs have to be exported, then the relevant sql file has to be modified with the required query in the where clause to select the required jobs to be exported.

    If the requirement is version control, then the repository of modified jobs has to be refreshed, and then the main batch can be run directly to perform the export. It will create job level dsx files. One report file will be generated.

    If a job is locked by any user, the utility will not proceed further unless the option to skip/abort is provided by the user. So, it is better to restart the server before the export is started. The job level dsx files will be created with the same folder structure as in the server.

    Back up activity

    Taking folder level Export: Once the job level backup is complete, those files can be concatenated to create folder level dsx files.

    If some specific categories have to be exported, then the relevant sql file has to be modified with the required query in the where clause to select the required jobs to be exported.

    If the requirement is version control, then the repository of modified jobs has to be refreshed, and then the main batch can be run directly to perform the export. It will concatenate the job level dsx files created earlier to create folder wise dsx files.

    If there exists a log file, the batch will abort. Unlock the job in the server and perform the export batch again to take the export of that job. If the export program was successful, folder level dsx files will be generated along with a report file.

    Version Control

    To upload the dsx into the respective folder in CC:

    – Connect to the ClearCase web client and go to the proper path
    – Create the activity indicating the reason for the change (defect number)
    – Check out the respective folder (folder > basic > check out)
    – Put the .dsx file into the CCRC path on your local machine
    – Check in the folder and click Tools > update resources with the selected activity. Add the .dsx file to source control (right click on the file in the right hand pane > basic > add to source control). A blue background will come up
    – Uncheck the option for checking out after adding to source control
    – Right click on the file in the right hand pane > Tools > show version tree. The version tree will be shown.

    To further apply any change to the code:

    – Import the .dsx file to the local machine and make modifications as per requirement
    – Compile and run the job and upload the new dsx as discussed

    9.2 DS Auditing activities

    Assuring naming convention of component and jobs

    – A PL/SQL procedure can be used to ensure the naming conventions of jobs, stages, links and categories. It can generate a report of the components not matching the specified convention.

    – If MetaStage can be used to export the DataStage system tables to an RDBMS (e.g. Oracle) via a metabroker, then the procedure can be run on those tables to validate the standards.
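The kind of check such a procedure performs can be sketched as follows. This is a Python stand-in for the PL/SQL procedure, not the procedure itself, and the prefixes in `CONVENTIONS` are hypothetical; the real patterns would come from the project's coding standard.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical naming conventions (assumption): one pattern per component kind.
CONVENTIONS = {
    "job": re.compile(r"^jb_[A-Za-z0-9_]+$"),
    "stage": re.compile(r"^(src|tgt|xfm|lkp)_[A-Za-z0-9_]+$"),
    "link": re.compile(r"^lnk_[A-Za-z0-9_]+$"),
}


def non_conforming(components: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (kind, name) pairs that do not match their convention.

    Components of an unknown kind are reported as well, so nothing
    slips through unchecked.
    """
    report = []
    for kind, name in components:
        pattern = CONVENTIONS.get(kind)
        if pattern is None or not pattern.match(name):
            report.append((kind, name))
    return report
```

Run against a list of component names exported from the repository, the returned list is the report of components that break the convention.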

    Retrieving Job Statistics

    This is a very important aspect of the auditing activity in case of data migration. It is ensured in two phases.

    • The first is to retrieve the record counts for the source, the records inserted or updated into the target table, the records that failed business rule validation and the records rejected by Oracle. This is done using a routine written in DS BASIC which retrieves record counts by searching for links with some specific keywords. These keywords refer to the links from the source, to the target, or the failure links in the load job. This information is stored in the CNV_RUN table.

    • A second approach retrieves those job names for which the number of source records does not match the combined value of inserted records and failed records (hence some records have been dropped somewhere in the flow).
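The second check is a simple reconciliation and can be sketched in Python. The column names used below (`JOB_NAME`, `SOURCE_COUNT`, `INSERTED_COUNT`, `FAILED_COUNT`) are assumptions for illustration; the course does not give the actual layout of the run statistics table.

```python
from typing import Dict, List, Union

RunStat = Dict[str, Union[str, int]]


def unreconciled_jobs(run_stats: List[RunStat]) -> List[str]:
    """Return the names of jobs whose source count is not fully accounted for.

    A job reconciles when source = inserted + failed; anything else means
    records were dropped (or duplicated) somewhere in the flow.
    """
    return [
        str(stat["JOB_NAME"])
        for stat in run_stats
        if stat["SOURCE_COUNT"] != stat["INSERTED_COUNT"] + stat["FAILED_COUNT"]
    ]
```

In practice the same comparison could equally be written as a query over the run statistics table; the point is only that every source record must end up either inserted or logged as failed.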

    9.3 Performance tuning of DS jobs

    – Analysing a flow
    – Measuring Performance
    – Designing for good performance
    – Improving performance

    9.3 Performance tuning of DS jobs: Purpose

    • This document describes the process of analysing a job flow and measuring its performance against certain project benchmarks. Further, it suggests steps to improve the performance of the identified jobs. It is important to mention that performance tuning is not a subject that too much time should be spent on during the initial design. Unless it is clear that performance will be an issue, it may well be that the performance is adequate without carrying out any of these tuning options, and you will therefore save yourself the time of implementing these changes.

    Performance tuning of DS jobs: Analysing the flow

    1. A score dump of the job helps to understand the flow. We can do this by setting the APT_DUMP_SCORE environment variable to true and running the job (APT_DUMP_SCORE can be set in the Administrator client, under the Parallel > Reporting branch). This causes a report to be produced which shows the operators, processes and data sets in the job.

    The report includes information about:

    – Where and how data is repartitioned.
    – Whether DataStage had inserted extra operators in the flow.
    – The degree of parallelism each operator runs with, and on which nodes.
    – Where data is buffered.

    Performance tuning of DS jobs: Measuring Performance

    We measure performance in the following ways:

    – If the target is a database (Oracle in our case), replace the database stage with a sequential file and see whether the job takes the same time. This tells us whether the database connection to the target (as it is a remote connection) is slow, or whether the volume of data is simply huge and hence takes time.

    – In the transformations section, replace all transformations with default values. This helps us know whether the job is running slow because of the transformations.

    – If the source is a database (Oracle in our case), then the query should be run using hints/partition/index. This gives an insight into whether the source query is a bottleneck.

    Performance tuning of DS jobs: Measuring Performance

    Check for any aggregator stage in your jobs. This is part of the transformation bottleneck but needs to be given special attention. An aggregator stage in the middle of a big job makes the entire job slow, since all the records need to pass through the aggregator (they cannot be processed in parallel).

    To catch partitioning problems, run your job with a single node configuration file and compare the output with your multi-node run. You can just look at the file size, or sort the data for a more detailed comparison.
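The detailed comparison can be done without relying on row order, since a multi-node run usually writes rows in a different sequence. The Python sketch below assumes both runs were written out as text, one record per line; the function name is illustrative.

```python
from collections import Counter
from typing import Iterable, Tuple


def compare_outputs(single_node: Iterable[str],
                    multi_node: Iterable[str]) -> Tuple[Counter, Counter]:
    """Compare two outputs as multisets of lines, ignoring row order.

    Returns (only_in_single, only_in_multi); both are empty when the
    two runs produced exactly the same records.
    """
    a = Counter(line.rstrip("\n") for line in single_node)
    b = Counter(line.rstrip("\n") for line in multi_node)
    return a - b, b - a
```

Open file objects can be passed directly, for example `compare_outputs(open("one_node.txt"), open("two_node.txt"))` with hypothetical file names; any non-empty result points at records lost or duplicated by the partitioning.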

    Performance tuning of DS jobs: Improving Performance

    Basic steps:

    – Remove unwanted columns at the first opportunity.

    – Reduce the number of rows processed as early as possible. This can be done by placing the transformer constraint or filter as a where clause in the source Oracle stage.

    – Replace Transformers with Modify stages where the transformations are simple. Modify, due to internal implementation details, is a particularly efficient operator. Any transformation which can be implemented in the Modify stage will be more efficient than implementing the same operation in a Transformer stage. Transformations that touch a single column (for example, keep/drop, type conversions, some string manipulations, null handling) should be implemented in a Modify stage rather than a Transformer.

    Performance tuning of DS jobs: Improving Performance

    – Consider using the Oracle bulk loader instead of the upsert method wherever applicable.

    – Instead of creating multiple standalone flows in a single job, creating separate jobs and calling them in parallel using a sequencer stage can improve the performance.

    – If data is going to be read back in, in parallel, it should never be written as a sequential file. A data set or file set stage is a much more appropriate format.

    Performance tuning of DS jobs: Improving Performance

    Advanced steps:

    Run the jobs which handle a small volume of data on a single node instead of multiple nodes. This avoids spawning multiple processes and partitions when there is no need. This can be done by adding the environment variable $APT_CONFIG_FILE to the job and setting it to use a single node configuration.

    When writing intermediate results that will only be shared between parallel jobs, always write to persistent data sets (using Data Set stages). Ensure that the data is partitioned, and that the partitions and sort order are retained at every stage. Avoid format conversion or serial I/O.

    IBM lobal B!siness er*ices

    A"9 ched!led Maintenance

  • 8/18/2019 IBM BI Tookit Datastage V1 0

    121/141

    12 © Copyright IBM Corporation 2006

    - Regular cleanup of log files.

    - Periodic cleanup of the &PH& folder. If the time between when a job says it is finishing and when it actually ends increases, this may be a symptom of a too-full &PH& folder. One way to clear it is in DataStage Administrator: select the Projects tab, click your project, press the Command button, enter the command CLEAR.FILE &PH&, and press the Execute button. Another way is to create a job with the command EXECUTE "CLEAR.FILE &PH&" on the job control tab of the job properties window. It may be scheduled to run weekly, but at a point in your production cycle where it will not delete data critical to debugging a problem. &PH& is a project-level folder, so this job should be created and scheduled in each project.

    - Periodic cleanup of persistent datasets. Datasets should not be used for long-term storage, so the temporary datasets can be cleaned up. A script can be scheduled to automate the process.
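    The scheduled dataset cleanup mentioned above could be sketched along these lines. This is a hedged Python example, not the toolkit's own script: the directory /data/ds/datasets, the 30-day threshold and the function names are assumptions. It only lists descriptor files (*.ds) that have not been modified recently and builds the matching orchadmin rm commands without running them, since a data set descriptor refers to data files stored elsewhere and should be removed through orchadmin rather than by deleting the descriptor directly.

```python
import os
import time


def find_stale_datasets(root, max_age_days, now=None):
    """Return sorted paths of *.ds descriptor files not modified for max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.endswith(".ds") and os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)


def cleanup_commands(paths):
    """Build (but do not run) one 'orchadmin rm' command per descriptor file."""
    return [["orchadmin", "rm", path] for path in paths]


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before scheduling.
    for cmd in cleanup_commands(find_stale_datasets("/data/ds/datasets", 30)):
        print(" ".join(cmd))
```

    Printing the commands first gives a chance to confirm that nothing needed for debugging is in the list before the job is scheduled.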

    9. Customised Code

    Options:

    - Create a BASIC routine and use it as a before/after job subroutine or through a routine activity stage.

    - Create a C++ routine and use it inside a PX transformer.

    - Create custom operators and use them as a stage. This allows knowledgeable Orchestrate users to specify an Orchestrate operator as a DataStage stage, which is then available for use in DataStage parallel jobs.

    Module 4: Version Control

    Version Control Agenda

    Topic 1: Versioning Methodology

    - Discipline

    - Basic Principle/Approach

    - Different Projects

    Topic 2: Initializing Components

    - Version Control Numbering

    - Filtering Components

    Topic 3: Promoting Components

    - Component selection for promotion

    - Different Methods

    Topic 4: Best Practices

    - Using Custom Folders in Version Control

    - Starting Version Control from DS Designer

    Versioning Methodology

    In a typical enterprise environment, there may be many developers working on jobs, all at different stages of their development cycle. Without version control, effective management of these jobs could become very time consuming and the jobs could be difficult to maintain.

    This module gives an overview of the methodology used in Version Control and highlights some of its benefits. It is not intended as a comprehensive guide to version control management theory.

    Benefits

    Version tracking: archiving and versioning (i.e. release-level tracking) of DataStage related components, which can be retrieved for bug tracking and other purposes.

    Central code repository: all coding changes are contained in one central managed repository, regardless of project or server locations.

    DataStage integration: components are stored within the "VERSION" project, which can be opened directly in DataStage from Version Control. Alternatively, Version Control can be opened directly from within any DataStage client.

    Team coordination: components are marked as read-only as they are processed through Version Control, ensuring that they cannot be modified in any way after being released.


    Discipline

    To gain the maximum benefit from using Version Control we must exercise a disciplined approach. If we build in that discipline from the start, we will quickly realize the benefits as the project grows.

    Always ensure that components pass through Version Control before they are sent to their next stage of development. This makes project development far easier to track, especially for complex projects containing a large number of jobs.

    Basic Principle/Approach

    Most DataStage job developers adopt a three-stage approach to developing their DataStage jobs, which has become the de facto standard.

    These stages are:

    - The Development stage

    - The Test stage

    - The Production stage

    Scenario without Version Control

    - In this model, jobs are coded in the development environment, sent for test, redeveloped until testing is completed, and then passed to production.

    - There is no central management system to control the flow between the development, test and production environments.

    - We need to think of Version Control as a central hub through which all DataStage projects pass.

    - With a staged approach to project development, projects can pass from one stage into Version Control before being passed to the next stage.

    Scenario with Version Control

    - Whilst in Version Control, projects will have the appropriate versioning information added.

    - This information will include version number, history, and notes.

    - Consistency of the code across different environments is maintained.

    Different Projects

    The Version Control Project: Version Control uses a special DataStage project as a repository to store all projects and their associated components. This project is usually called "VERSION", although we may create a project with any name. Whatever name we choose for the version project, the principle remains the same. The Version Control repository contains the archive of all components initialized into it. It therefore stores every level of each code release for each component.

    Other Projects: If we adopt the three-stage approach, we would typically have three other projects:

    - Development: where DataStage jobs and associated components are developed.

    Initializing Components

    Version Control Numbering

    The full version number of a DataStage component is broken down as follows:

    Release Number.Minor Number

    where:

    - The Release Number is allocated when we initialize components in Version Control. If required, we can specify a release number in the Initialize Options dialog box. By default, Version Control sets this to the highest release number currently used by objects in its repository.

    - The Minor Number is allocated automatically by Version Control when we initialize a component. It will increment by one each time we initialize a particular component until we increase the release number.
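    The numbering rule above can be modelled with a small Python sketch. This is a toy model for reasoning about version numbers, not Version Control's actual implementation: the class name VersionRegistry is hypothetical, and it assumes that the minor number starts at 1 and restarts for each new release number, which the slide does not state explicitly.

```python
class VersionRegistry:
    """Toy model of 'Release Number.Minor Number' allocation."""

    def __init__(self, release=1):
        self.release = release   # highest release number currently in use
        self._minor = {}         # (component, release) -> last minor number issued

    def increase_release(self):
        """Move to the next release number."""
        self.release += 1

    def initialize(self, component, release=None):
        """Record one initialization and return the resulting version string."""
        if release is None:
            release = self.release       # default: highest release in use
        elif release > self.release:
            self.release = release       # an explicit higher release becomes the default
        key = (component, release)
        self._minor[key] = self._minor.get(key, 0) + 1   # assumed to start at 1
        return f"{release}.{self._minor[key]}"
```

    Under these assumptions, initializing the same component twice in release 1 gives 1.1 and then 1.2, and the first initialization after the release number is increased gives 2.1.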

    Filtering Components

    We can filter a long list of components to show only those that we are interested in for promotion.

    For example, we may want to select components associated with "Sales" or "Accounting". Rather than search through the entire list, we can filter the list and select the subset for promotion.


    To filter components:

    1. Click the Filter button in the Display toolbar so that a text entry field appears.

    2. In the text entry field, type in the text we want to filter by. We can type letters or whole words; separating letters or words with a comma results in an "OR" operation. For example, typing in "accounting, sales" results in a list showing components that have "accounting" or "sales" in their names. Click the arrow next to the Filter button to specify whether the filter is case sensitive or not.

    3. When we are happy with our filter text, click the Filter execute button, press Return, or click in the tree view of the Version Control window.

    4. To return to the default view, click the Filter button again.
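    The comma-separated "OR" behaviour in step 2 can be illustrated with a short Python sketch. It is only an approximation of what the dialog does: the function name filter_components is hypothetical, and matching on substrings of the component name is an assumption based on the "accounting, sales" example.

```python
def filter_components(names, filter_text, case_sensitive=False):
    """Return the names containing at least one of the comma-separated terms."""
    terms = [t.strip() for t in filter_text.split(",") if t.strip()]
    if not terms:
        return list(names)          # empty filter: keep the default view
    if not case_sensitive:
        terms = [t.lower() for t in terms]
    matches = []
    for name in names:
        haystack = name if case_sensitive else name.lower()
        if any(term in haystack for term in terms):   # comma means OR
            matches.append(name)
    return matches


if __name__ == "__main__":
    jobs = ["LoadSalesFact", "AccountingExtract", "HRDim"]   # made-up job names
    print(filter_components(jobs, "accounting, sales"))
```

    With the case-sensitive option switched on, the same filter text would match neither "LoadSalesFact" nor "AccountingExtract", because both names use capital letters.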

    Promoting Components

    We can promote components after they have been initialized into Version Control.

    In a typical environment, components are initialized from a development project and promoted to a test or production project.

    Component selection for promotion

    We can select components for promotion in the following ways:

    - By individual selection

    - By batch

    - By user

    - By server

    - By project

    - By release

    - By date

     


    The different ways of selecting components for promotion are as follows:

    By individual selection: We can select components for promotion in the tree view from any view mode. Individual component selection is suitable when we are promoting a small number of components. The more usual scenario is to use Release/Batch Selection.

    By batch: When we initialize a group of components into Version Control, the selected group is known as a "batch". By default, batches are identified by the date and time they were initialized, but it is preferable always to specify a name for a batch. Version Control allows us to select components for promotion by initialization batch, promote batch, or named batch. Selecting components by batch automatically highlights all the components of that batch and so selects them for promotion.

    By date: We can select components that were initialized on a particular date. All the components that were initialized on that date are selected ready for promotion.
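    Selection by batch and by date amounts to grouping components on metadata recorded at initialization. The Python sketch below shows the idea on an in-memory list; the Component record, its field names and the sample batch names are assumptions made for the example and do not reflect how Version Control stores this information.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Component:
    name: str
    batch: str             # batch name given at initialization (hypothetical field)
    initialized: datetime  # date and time of initialization


def select_by_batch(components, batch):
    """Return every component that belongs to the named batch."""
    return [c for c in components if c.batch == batch]


def select_by_date(components, day):
    """Return every component initialized on the given calendar date."""
    return [c for c in components if c.initialized.date() == day]


if __name__ == "__main__":
    comps = [
        Component("JobA", "REL1_SALES", datetime(2006, 3, 1, 9, 0)),
        Component("JobB", "REL1_SALES", datetime(2006, 3, 2, 10, 0)),
        Component("JobC", "REL1_ACCOUNTS", datetime(2006, 3, 2, 11, 0)),
    ]
    print([c.name for c in select_by_batch(comps, "REL1_SALES")])
    print([c.name for c in select_by_date(comps, date(2006, 3, 2))])
```

    Naming batches explicitly, as recommended above, makes the batch selection far easier to read than a default date-and-time identifier.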

    Best Practice

    Using Custom Folders in Version Control

    Many development projects which use DataStage for extraction, transformation and loading (ETL) also incorporate other project-related files which are not part of the DataStage repository.

    These files may contain DDL scripts or other resource data. Version Control can process these ASCII files in the same way as it processes DataStage components.

    If we choose to add custom folders, they are automatically created by Version Control; there is no need to create them manually.

    Every time Version Control subsequently connects to a project, either for initialization or for promotion, it checks whether the custom folder exists. If it does not exist, Version Control creates it.

    After Version Control has created a custom folder, it can be populated with the relevant items.

    The only requirement for using custom folders in Version Control is that the components must be stored within a folder in the project itself.


    Starting Version Control from DS Designer

    We can run Version Control directly from within DataStage Designer, Director or Manager by adding a link to the DataStage client Tools menu. We can also add options which allow Version Control to start without displaying the login dialog.

    If we want Version Control to start with login details already filled in and without displaying the login dialog, we can enter appropriate command line arguments. These are entered in the Arguments field and have the following syntax:

    /H=hostname /U=username /P=password

    where:

    - hostname is the DataStage server hosting the project

    - username is the DataStage username

    - password is the DataStage password

    For example, if we have a hostname of "ds_server", a username of "vc_user", and a password of "control", then we would type in: /H=ds_server /U=vc_user /P=control

    Version Control can now be started from the DataStage client.
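    Where the Arguments string has to be prepared for several developers or servers, it can be generated rather than typed. The Python sketch below builds the string in the /H= /U= /P= syntax given above; the function name build_vc_arguments and the rule that values may not be empty or contain whitespace are assumptions added for the example.

```python
def build_vc_arguments(hostname, username, password):
    """Return the Arguments-field string in the form /H=... /U=... /P=..."""
    fields = (("hostname", hostname), ("username", username), ("password", password))
    for label, value in fields:
        # Assumed rule: a blank or space-containing value would break the syntax.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{label} must be non-empty and contain no whitespace")
    return f"/H={hostname} /U={username} /P={password}"


if __name__ == "__main__":
    print(build_vc_arguments("ds_server", "vc_user", "control"))
```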

    Questions and Answers