Using SQL with 2E

From 2EWiki

Jump to: navigation, search

Ray Weekes

Version 2, May 2007

This paper explains how SQL can be used in CA’s 2E. It discusses some design & programming considerations when using SQL compared to a traditional implementation using DDS & HLL I/O operations. It is based upon 2E R8.1. It not intended as a replacement for the standard 2E user manuals.

Contents


Why Use SQL?

  • In today’s world of internet/J2EE systems then SQL is considered the normal way of designing databases and manipulating data.
  • IBM has stated that SQL is the strategic way forward for enhancements. New database features will not be offered in the DDS implementation.
  • SQL, as a cross-platform industry standard, allows some portability of generated code to other platforms. For example, i5/OS COBOL containing embedded SQL might be a starting point for converting 2E generated batch code to run on other DB2 platforms, or on other database systems such as SQL/Server or Oracle.
  • SQL offers performance advantages in certain areas and you can utilize the power of SQL ’set at a time’ operations by coding SQL DML as user source inside a 2E generated SQL program.
  • It’s easier to do certain data calculations, e.g. Mathematical Functions, in SQL rather then directly in 2E or in other languages like RPG or Cobol. SQL statements can be added as user source inside a 2E generated SQL program.
  • It is possible to create a SQL database using 25 character long model field names for data definition. This makes it easier for the users working at the presentation level to understand the data structure & content. For example, creating a data warehouse requiring easy access using front-end SQL query tools and ODBC gateways.
  • HLL programs using SQL do not perform file level checks. This allows some flexibility in making database changes without the need to rebuild existing programs.
  • RPG programs using SQL do not have a traditional 50 file limit
  • SQL is the only means of remote database access using IBM’s Distributed Relational Database Architecture (DRDA). (Note that this document is not a guide to using DRDA).

SQL DDL Vs SQL DML

When discussing SQL, particularly in the context of 2E, then it is important to understand exactly what is meant. On the iSeries SQL can be used as a Data Definition Language (DDL) to define a database. This is an alternative to DDS. SQL can also be used as a Data Manipulation Language (DML) embedded within generated RPG or COBOL programs to replace traditional READS & CHAINS etc.

iSeries allows SQL DDL & DML to be intermixed with few restrictions. But 2E is more restrictive. There are two options to consider:

  • Pure SQL route whereby a database is generated using SQL DDL and all programs are generated to use SQL DML
  • A SQL programming route whereby the database is built using DDL but programs may be a mixture either traditional non-SQL or SQL DML.

2E generated non-SQL programs may not be used over an SQL database. i5/OS allows traditional HLL I/O to access SQL Tables & Views, but 2E programs will require a keyed access open and all SQL Tables & Views are non-keyed. 2E programs will also expect format names as defined in the model, which SQL files will not have.

iSeries DB2 UDB SQL

i5/OS has only one database, DB2 UDB, but there are two distinct database interfaces, SQL and DDS. You can define, access, and manipulate data through either the SQL interface or the DDS interface. Regardless of how the database is defined the data is still held in the same physical files in libraries and can be read using SQL statements or traditional HLL I/O statements. The difference between SQL & DDS is more one of terminology.

  • An SQL SCHEMA is implemented as a library
  • An SQL TABLE is almost identical to a DDS physical file.
  • An SQL VIEW is the same as a DDS non-keyed logical file and may contain joins
  • An SQL INDEX is the same as a DDS keyed logical file

SQL Table v DDS Physical File

An SQL physical file differs from a DDS defined physical file in a few ways:

  • can only have one member, MAXMBRS(1)
  • by default has no limit on the number of records, SIZE(*NOMAX)
  • by default will reuse deleted records, REUSEDLT (*YES)
  • format name will be same as file name (but file may be renamed)

Prerequisites

i5/OS contains the run-time objects necessary to execute programs compiled with SQL. But to generate and compile programs that use SQL you must have the SQL Development Kit Licensed Program installed on your iSeries.

2E SQL Database Implementation

The table below summarizes how 2E model objects will relate to the generated source code and to the final implementation objects:

2E Object SQL Generated DB2 Object Created
Physical File CREATE TABLE Physical File
Update/Retrieval/Resequence ACP CREATE INDEX Keyed Logical File
CREATE VIEW Non-keyed Logical File
Span ACP Ditto x 2 Ditto x 2
QRY, or ACP with DELAY or REBUILD CREATE VIEW Non-keyed Logical File

SQL does not allow multiple formats in a view. Therefore a 2E Span type access path will be implemented as 2 Views & 2 Index.

A QRY ACP, or an ACP set as DELAY or REBUILD, does not require a permanent index.

Within the same model it is possible to mix SQL & DDS ACPs. It is possible to use DDS logical ACPs over SQL defined Tables. But you cannot build an SQL VIEW over a DDS physical file. But note that within 2E the SQL ACPs will be created in the SQL SCHEMA library whilst DDS ACPs will be created in the GENLIB.

Database Design Considerations using SQL

Joins

All 2E generated DDS logicals with joins also generate the DDS keyword JDFTVAL. This is equivalent to a relational ’Left Outer Join’. That is, if a matching record is not present from the secondary file than the primary record will still be processed and the joined fields will contain blank/zeros.

2E generated SQL VIEWS only support basic inner joins. Records from the primary table will not be processed if there is no matching record in the secondary table. (iSeries DB2 UDB SQL supports left outer joins, but 2E does not currently support them for historical reasons)

You will thus need to review your application design where 2E virtuals are implemented as joins and assess the possible impact. If joins are significant in the application design then you must adopt a procedural approach. That is, read without joins and use a RTVOBJ to read data from secondary files.

SQL & Duplicate Keys

SQL does not allow for specific sequencing of duplicate key values. i.e. LIFO/FIFO etc. is not supported. If this is significant then the access path key must be extended by additional fields. Access paths should be set to UNDEFINED to ensure function compatibility with DDS generated access paths.

SQL & ACP SELECT/OMIT

A DDS logical file with STATIC selection is effectively preselecting records which meet a given criteria and building a separate index to identify them. This can give enormous performance advantages. But in SQL there is no support for indexes built over a subset of records. The SQL index will contain an entry for all records. Functionally the program will still only process the selected records because any select/omit criteria will be contained in the WHERE clause of the SQL SELECT statement. But the performance advantage is lost. There is effectively no difference between static or dynamic select/omit for an SQL ACP. Thus a unique key must be unique over the entire file, and not just within the subset defined by the select or omit criteria. It is probably best not to use STATIC selection as part of the 2E database design when using SQL.

Any selection criteria for an access path is implemented in the WHERE clause of the SQL SELECT statement, either in the VIEW if static, or in the program if dynamic. But if using base tables for SQL operations the selection will always be in the program.

ACP Settings YDBFACC=T(TABLE) YDBFACC=G(DBFGEN)
Static S/O in Program Code S/O in View
Dynamic S/O in Program Code S/O in Program Code

Select/Omit Criteria and Index

The decision to generate an index is determined only by the setting for index maintenance, not by whether it’s STATIC or DYNAMIC. The index only refers to the processing sequence not to the selection criteria. But the ACP STATIC selection criteria and selection criteria arising from RST/POS key values are combined into the SQL SELECT WHERE clause. Thus any ACP STATIC selection criteria may not be associated with an index. Try to build selection criteria into ACP keys or ensure that ACP selection is based over an indexed field.

SQL & Collating Sequences

SQL does not support ALTERNATE collating sequencing as defined on 2E ACPs. Typically this is used to ensure that lowercase characters are collated in true alphabetic order with upper case characters. In i5/OS DDS the key sequence is compiled into the logical file index and the records are permanently held in the specified key sequence. But an SQL View has no intrinsic order. The sort sequence that a program will use is set at compile time. One option is use a ’shared weight’ sort table by specifying SRTSWQ(*LANGIDSHR). Another option is for the compiler to determine the sort sequence at execution time from the users job description, SRTSEQ(*JOBD). Any specified sort sequence will hold for all SELECTs issued by the program. A different sort sequence may have performance implications since any INDEX may not be in the same sort sequence. No sort sequence can be used for DRDA.

2E SQL DDL

Creating SQL Database

All SQL database objects exist in a SQL SCHEMA. Within i5/OS a SCHEMA is implemented as an i5/OS library. Inside the library are some special system files that provide a data-dictionary known as an SQL CATALOG, as well as those objects needed for journaling.

The integrity of the SQL CATALOG is maintained at all times by i5/OS data management, regardless of whether SQL objects are changed by SQL statements, HLL I/O statements, or i5/OS commands.

A 2E model generating SQL must have a separate SQL SCHEMA library in which to create Tables & Views. This will be separate from the Generation Library. The library name is held in model value YSQLLIB. A 2E command YCRTSQLLIB can be used to create the schema.

2E Generated DDL

Generation of SQL is controlled by model vale YDBFGEN(*SQL). The generated SQL DDL source is written into source file QSQLSRC in the generation library. The generated source member is really a command level source which has to be retrieved and executed. There are no CREATE TABLE or CREATE VIEW shipped execution messages in the model. Furthermore an SQL ACP can’t have any overrides.

The 2E command YEXCSQL reads the SQL statements from the source member as data and executes them as dynamic SQL, a somewhat slow process. It also means that the generated source can only be processed on a machine with 2E installed.

An alternative approach might be to use the i5/OS command RUNSQLSTM against the generated source member. However, this is not possible without modification to the source to remove the 2E comments and EXEC SQL & END-EXEC lines, and to insert semicolon delimiters at the end of each SQL statement.

2E SQL Naming Conventions and Security

2E uses SQL naming conventions (*SQL), as opposed to i5/OS system (*SYS) naming conventions for its generated DDL. This can not be changed and there is no shipped execution message for YEXCSQL to control the naming option.

*SQL conventions are closer to ANSI naming standards than i5/OS naming. In general qualified names must be used and therefore the generated SQL DDL source contains the YSQLLIB model value as a hard coded library name. This makes it somewhat difficult to move the source to another environment.

With *SQL naming SQL security conventions will be used. The owner of the program must have authority to the objects referred to in the program SQL statements, as opposed to the user running the program. SQL tables and views are created with the *PUBLIC authority set to *EXCLUDE. This can not be changed through 2E, but can be amended afterwards using GRTOBJAUT commands, or by GRANT SQL statements. (DDS files are generally created with *PUBLIC authority set to *CHANGE).

Column Types

2E generates SQL data types thus:

Character   - CHAR
Packed Decimal - DECIMAL
Zoned Decimal - NUMERIC
Binary   - INTEGER
DT#/TM#/TS# - DATE/TIME/TIMESTAMP

The NOT NULL WITH DEFAULT keyword is always generated in SQL column definitions. Column default values may be defined within the model

Table Vs View Access

A 2E logical ACP will be implemented as an SQL View and a separate SQL Index. An SQL View is only really a definition of what will be passed to the program. A View has no intrinsic sequence because it can not have an ORDER BY clause. It may have a WHERE clause to subset the rows returned. If all the following are true:

  • no model relations have been dropped
  • the ACP does not have any virtual fields
  • there is no ACP selection criteria

then the generated View will be identical to the underlying base Table as generated by the 2E physical ACP.

In this situation it makes more sense to generate programs to access base SQL Tables rather than SQL Views. A View need only be used when an ACP contains virtual fields resulting in a join to another table, or if you wish to read a subset of columns. If the ACP contains select/omit criteria then the table may still be used because the record selection will be added to the WHERE clause of the SQL SELECT statements within the generated program code.

The 2E model option YDBFACC is used to determine whether a TABLE or VIEW is used in the generated program. The model value can be overridden at the individual access path level. The value that YDBFACC resolves to for each logical ACP will affect both what is generated for the SQL DDL and what is generated inside functions built over the ACP as SQL DML. (The PHY will always generate just the DDL for the Table).

*TABLE - only the index is generated as DDL, unless index name set to *NONE
    - all programs will access the base table directly in DML


*DBFGEN - both a view and an index are generated as DDL
    - all programs will access data using a view in DML

Note that you cannot use table access for DRDA.

SQL Index

When using the DDS interface to define a keyed file then the system automatically creates an index as appropriate. For example, the physical or logical file may be keyed, or a logical may have joins to another physical.

With SQL an index is not created automatically. It must be explicitly defined. An SQL VIEW is only a means to format & select data but not to sequence it. The existence of a suitable index may improve performance for an SQL query but it is not a prerequisite for it. An SQL VIEW with joins will find the access path at run time. Therefore to improve performance we generally create an SQL INDEX corresponding to the sequence of data that we are likely to query. An SQL INDEX is implemented as an i5/OS keyed logical file.

A 2E ACP will create both an SQL VIEW and an SQL INDEX. If 2 ACPs give rise to identical index, for example the UPD & RTV ACPs, then 2 keyed logical files will be created. But at the database level the 2 keyed logicals will share the index; exactly the same way that DDS defined logicals will share an index. It may therefore by advantageous to suppress the build of an SQL INDEX if it’s identical to an existing one.

You may suppress an individual ACP index build by changing the index name to *NONE on the ACP auxiliaries screen. It may be beneficial to suppress the primary index for small system files containing only one or a few records. For small files it’s generally considered faster to read the whole table. But note that if you suppress the unique index for the primary key you must ensure within your CRTOBJ that you do not accidentally create duplicate keys. (see programming considerations).

2E will not generate any index for ACPs defined with index maintenance other than immediate. That is, no index is generated for DELAY or REBUILD options. A QRY ACP will not require an index, and neither will it require any CLP driver program to run an OPNQRYF command. All the necessary record sequencing & selection will be built into the SQL SELECT statement generated inside the program.

A 2E SQL index name is formed by appending an I to the end of the source name.

Span ACP

SQL does not support multi-format files. A SPN access path is implemented in SQL as 2 views. These views may contain joins, unlike a DDS multi-format logical. The view names will be different to those used in a DDS SPN access path. If a SPN ACP is created over the primary keys of 2 connected files it is not necessary to generate any additional index.

Which SQL to Generate?

It is generally considered best to use base tables for all SQL activity, except where joins may be needed or where a subset of columns is required. To do this set model value YDBFACC=*TABLE. On individual ACPs set data access method option to M=Model value. But where an ACP has joins then override the data access method value to G=DBFGEN. To suppress index generation change the index source name to *NONE.

The best generation or implementation options will be thus:

ACP Type     Generate
PHY       TABLE
UPD       INDEX
RTV       nothing
RTV + virtuals   VIEW
RSQ       INDEX
RSQ + virtuals   VIEW + INDEX
SPN       nothing (assuming primary key on header & detail)
SPN + virtuals   VIEW
QRY       nothing
QRY + virtuals   VIEW

No index will be built if ACP maintenance is not IMMEDIATE

Extended SQL Naming

Extended naming allows the use of the full 25 character model object name for SQL Table and Column definitions. The source name is still used for any SQL View and Index since these do not have a unique model object name.

Within each model file name and model field name the individual words are concatenated together with an underscore (_), trailing blanks are removed, and lowercase is converted to uppercase. Any special characters (i.e. other than A-Z, 0-9) are also converted to underscore.

To use this option set model value YSQLVNM to *SQL.

How Extended Naming works

2E can use extended naming because an i5/OS API assigns an internally generated 10 character system name for tables & column names. The system name is used as the standard database object name & file field name. The long name is saved in the SQL Catalog. The generated RPG or COBOL programs will use the long names for SQL statement variables. The SQL precompiler can convert long names back to system names. Note that if your model object name is a valid system name, i.e. less than or equal to 10 characters, e.g. WORK, then an additional internal system name is not required and both names will be identical.

The long SQL name is only seen & recognized by the SQL interface to the DB2 database. For example, Query for iSeries will only see 10 character system names, but Query Manager and interactive SQL can see the long names.

Considerations for Extended Naming:

  • the rules for forming the name mean that its possible to get duplicate table names or duplicate column names in the same table
  • you must avoid SQL reserved words such as ’Table’, ’View’, ’Order’, etc.
  • if you change a model field or file name then the table and any programs using the table must also be changed. For example you might use ’Spare Field’ in a model file and then assign it as necessary
  • the system generated table names will not be very meaningful and will be hidden at the model level. This is not very user friendly and makes writing routine housekeeping functions difficult
  • to create an SQL table you must have authority to i5/OS API QDBRTVSN.

Future Enhancements

SQL allows table and column definitions to use the FOR COLUMN keyword to explicitly use your own name rather than a system generated name. This feature could be used to allow both model DDS names and model long names. But currently 2E does not support this feature.

SQL DML Considerations

2E will generate programs with embedded SQL dependent upon model value YDBFGEN and individual function options.

Generated RPG and COBOL SQL source contains embedded static SQL statements in place of conventional database operations like READ or CHAIN. This embedded SQL is always delimited by source directives which enable the SQL precompilers to translate the SQL into standard HLL code calling special API program routines. The converted source code is then passed automatically to the normal RPG or COBOL compilers.

SQL DML Naming Conventions

2E generated SQL DML always uses unqualified names in the FROM clause. To ensure that library list is used at run time then the RPG or COBOL program must be compiled to use *SYS naming standards. This is the default. Note that this is the opposite of 2E generated DDL which uses *SQL naming standards. With *SQL naming the database uses the current user profile as the implied library name.

Mixing DDS & Procedural Code with SQL DDL & DML

If a database has been created with SQL then only 2E functions generated to use SQL can be used. This is because 2E RPG or COBOL programs with standard I/O operations will always expect to open files for keyed access. But SQL Tables & Views will always be non-keyed files.

But i5/OS allows native HLL I/O to access SQL table & views. You can use SQL to access externally described files created using DDS.

The following rules apply when a 2E Model & ACP options are set thus:

        -------- Data Manipulation--------
Data Definition   Procedural     SQL


  DDS     Yes       Yes (1)


  SQL     No (2)     Yes


(1) DSPTRN & EDTTRN type functions can only be used if SQL DML is generated to use base tables rather than views.


(2) Programs with procedural code require their database file opened for keyed access but SQL Views are non-keyed. SQL view format names will be different to DDS format names.

NULL Values

2E SQL programs will not expect NULL values. This is because the NOT NULL keyword is always generated in the SQL column definitions and because SQL does not support views with left outer joins which might return NULL values.

An RPG program will, by default, not be able to receive null values. If it reads a record containing null values then this causes a data mapping error and an error message will be issued. If null values could arise, because say the SQL DDL has been modified to remove the NOT NULL keyword, then the CRTRPGPGM command must be overridden to allow null values. The program must be compiled with the ALWNULL keyword set to *YES. This only applies to input operations where the default value, usually blank or zero, will be returned instead.

Database Type Functions

2E database functions have always performed an existence check before adding, changing or deleting a target record. If the check fails an error message is issued and the operation is not attempted. A CHGOBJ function also locks the current record before amending & replacing existing field values.

In SQL DML this existence check is considered unnecessary and inefficient. Furthermore an SQL UPDATE statement can update direct by key and does not need to replace all field values in the record. Therefore 2E generated database functions will only perform a prior SELECT when necessary. It is considered necessary under the following situations:

CRTOBJ

  • is called from the default user point in an EDT type function. There has to be a check for a possible duplicate key and an error message issued if the key already exists.
  • the function contains any added action diagram coding, and it may only be a comment, in the ’Record Already Exists’ user point. This allows the correct logic to be executed.

CHGOBJ

  • is called from the default user point in an EDT type function. The function needs to perform a prior image comparison to detect concurrent updates by other users.
  • the null update suppression option is turned on for the function. A prior SELECT is needed to compare before & after images
  • there are any output parameters declared over the first parameter block which will automatically generate DB1 to PAR context moves. The prior SELECT is necessary to get the current DB1 context field values
  • the function contains any added action diagram coding, and it may be only a comment, in any user point other than ’Before Data Read’. Fields in the current DB1 context may be referenced.

DLTOBJ

  • is called from the default user point in an EDT type function. The function needs to perform an image comparison to detect concurrent updates by other users.
  • the function contains any added action diagram coding, and it may be only a comment, in the user point ’After Data Read’. Fields from the DB1 context of the record deleted may be referenced.

Database Functions Programming Considerations

If no prior SELECT is generated in a database function and the function fails to carry out as expected then there are some important differences in the way the function behaves:

  • no 2E error message will be sent to the program message queue if the target record exists (CRTOBJ) or record is not found (CHGOBJ or DLTOBJ).
  • if an error exists then the PGM.*Return Code will always be set to Y2U0004 ’*Database operation error’. The return code does not differentiate between record exists/not exists and a more serious database update error.
  • a CRTOBJ can only detect a duplicate key if the unique index for the table has been implemented. If there is no unique index then it is possible to create records with duplicate primary keys.
  • if the target record does not exist for a CHGOBJ SQL UPDATE or a DLTOBJ SQL DELETE then SQL regards this as a warning (SQLCODE is 100).
  • if a CRTOBJ tries to perform an SQL INSERT for an existing key then SQL regards this as an exception/error (SQLCODE is -803) and will send a unique key constraint violation message to the joblog regardless of what message the program itself sends.

For a CRTOBJ function you need to consider from the application design what the likelihood of there being a record already in existence with that key. If this is likely then you can force a prior SQL SELECT statement to be generated by just adding a comment inside the function.

When considering whether to implement null update suppression at the model level you need to consider the trade off from having a prior SQL SELECT inside each CHGOBJ. Null update suppression will always force a prior SQL SELECT. But where you know that most of the time the record will be different it will be more efficient to do an immediate UPDATE without the overhead of a prior SELECT taking place. To do this then force the null update suppression check off by changing the CHGOBJ NULUPD function option value to *No.

CHGOBJ Locking

If a prior SELECT is generated inside a CHGOBJ then it does not, by default, lock the target record before the UPDATE. This is in contrast to HLL I/O operations which will lock a record from a file opened for update. In SQL there is a danger that a record may be changed by another user between the SELECT and the UPDATE statements.

There is a model option YSQLLCK which can be used to activate locking. When locking is activated the SQL simple SELECT is converted into an SQL CURSOR SELECT with the FOR UPDATE clause. This is followed by a FETCH. This has the effect of locking the record until the UPDATE is performed or the cursor is closed. However there is a penalty in added performance overhead.

The locking option is activated by setting YSQLLCK model value to either *FET or *IMG.

*UPD - a CHGOBJ SELECT will never lock the target record

*FET - will lock all CHGOBJ functions where an existence check takes place

*IMG - will only lock a CHGOBJ called from the default user point in a standard EDT type function.

Unfortunately there is no facility to set this option on a function by function basis.

Note that the YSQLLCK value will not affect a CHGOBJ which does an immediate UPDATE without any need for a prior SELECT statement.

Using a CHGOBJ to amend the Primary Key

A CHGOBJ function may be used to amend the primary key of a record. By default the SQL UPDATE statement generated inside a CHGOBJ will not contain the primary key fields in the SET clause. But it is possible to force the SET clause to contain primary key entries by defining them as INPUT MAPPED. The programmer must then modify the DB1 context within the action diagram at the ’After Read’ or ’Before Update’ user points. But note that this only works when using a cursor fetch for update of the current record. This is because without a cursor fetch the same DB1 field name is used for the primary key SET clause as would be used in the WHERE clause. An UPDATE by cursor fetch does not require a WHERE clause.

DML Programming Considerations

Error Handling

With traditional HLL I/O operations any database exceptions cause an operating system interrupt to the user. For example, if a physical file member becomes full then the user is given the option to extend the maximum number of records that the member can hold. However, SQL file exceptions are handled differently.

The SQL processor sits over the i5/OS database. It intercepts machine and operating system escape messages and converts them into SQL error code, SQLCODE. The program then continues execution and it is up to the program logic to determine what course of action to follow.

2E generated code will handle exceptions on all SELECT or FETCH operations. It sends the SQL error message from QSQLMSG message file to the program message queue. The program then exits with the SQLCODE value moved to the PGM.*Return code. Exception messages will still be logged to the joblog (e.g. duplicate key on INSERT), but it is the responsibility of the programmer to take appropriate action. If no special error handling is added then you must check the joblog for hidden errors.

SQL & QRY ACP

When a query (QRY) ACP is implemented in DDS there are several objects are generated. These comprise a separate physical file without a data member, a non-keyed logical file, and a CLP program to execute an OPNQRYF command.

When a function built over a QRY ACP is executed then the OPNQRYF command is run over the physical file prior to opening the logical file inside the program. Even though the program may be interested in only a small subset of the data, e.g. Function has RST parameter, the OPNQRYF runs over the entire file.

In an SQL implementation there are no extra run time objects unless virtuals are being used requiring a separate view with joins. Furthermore the query selection is completely expressed within the function SQL query statement. This query expression will reflect any required subsetting of the data though RST parameters, rather the querying the complete file.

Thus an SQL implementation of a function built over a QRY ACP will be simpler to implement since there are no extra run time objects. It may also be more efficient since it does not need to query the whole file. Any performance benefits will depend on the SQL optimizer strategy which in turn depends on the type of query selection and the underlying database index structure.

The benefits of an SQL implementation for QRY type functions may be achieved even if the database is implemented using DDS.

Cursor Sensitivity

A 2E function reading multiple records, e.g. RTVOBJ, will open a cursor and then FETCH individual records within a loop. If the data in the underlying result set defined by the cursor declaration changes between opening the cursor and fetching the row then there is no guarantee that the latest details will be read. This is known as cursor sensitivity. By default a cursor is not sensitive to underlying database changes. If the cursor was sensitive then changes made to the database after the cursor is opened are visible.

JOB Context

The file I/O feedback area is not supported for SQL programs. There will be no i5/OS File Information Data Structure for an SQL SELECT over the primary file. This means that 2E JOB context fields for Library/File/Member names will always be blank.

Commitment Control & Journaling

The SQL ANSI standard is use commitment control. However, under DB2 UDB for iSeries it is optional. To use commitment control the SQL program must be compiled with the COMMIT option set. This also affects the locking or isolation level used. The default is *CHG, which is equivalent to READ UNCOMMITTED (i.e. allow dirty reads). It’s more normal to use *CS which is equivalent to READ COMMITTED.

To disable Commitment Control then set CRTSQLCBL or CRTSQLRPG to COMMIT(*NONE). i5/OS automatically journals physical access path changes when created by SQL. The ENDJRNPF command may be used to turn it off.

Because in SQL files are not required to be opened for Commitment Control then the 2E function options *MASTER and *SLAVE have less significance. A 2E SQL program set as *MASTER will have a final ROLLBACK at end of program for any uncommitted changes. A *SLAVE program has no special coding.

2E generates RPG and COBOL native COMMIT and ROLLBACK statements rather than the SQL versions. These hold locks (e.g. equivalent to COMMIT HOLD)

Execute User programs

2E does not allow an EXCUSRPGM function to be identified as being SQLRPG or SQLCBL source. If no compiler directives are added to the source code then the YSBMMDLCRT command will submit these programs using either the SQL compiler or the normal complier based upon the YDBFGEN model setting. The best option is use Y* complier directives within the program source in order to use the correct complier and correct parameter settings.

(A bug, still present in R8.1, will add an AUT parameter to the create command parameter list. But neither CRTSQLRPG nor CRTSQLCBL accept an AUT parameter and the compile will fail. In this scenario you must compile SQL user programs outside the model, such as from PDM.)

Debugging SQL Programs

The debug option can not be added to the SQL COBOL precompiler. You will have to save the intermediate source code held in QTEMP and then compile using debug.

File Level Checks

SQL programs will not provide any i5/OS file level checking. This provides some flexibility in dealing with database changes. For example if new columns are added and it is not yet required to sue them or insert other then defaults then existing programs need not be regenerated and compiled.

Performance Considerations

A common question is whether using SQL will perform better than non-SQL. There is no simple answer to that question. It depends on various factors and results will vary from site to site. All that is possible is to explain the various features which will either help to improve performance, or will hinder performance.

Things that might help Performance

Static v Dynamic SQL

The SQL generated by 2E is always static SQL rather than dynamic. The query optimizer determines at compile time what files the program will access and how to access them. The optimizer is able to convert SQL requests into optimally efficient database access methods. Details of the query access plan are permanently stored as part of the program object rather than being built at run-time. They can be printed with the PRTSQLINF command.

SQL Optimiser

Much has been written about how to optimize an SQL RDBMS database access for performance. But in practice the 2E model design, by definition, ensures that SQL access will perform reasonably efficient. All access is record orientated, using SELECT or UPDATE/INSERT by primary key, or by using a cursor FETCH. All such access will be supported by an underlying index corresponding to the key fields used. Thus unless the index creation has been dropped then the SQL optimizer will always retrieve by index. Joins on views will also be supported by the primary key index.

To determine exactly how an SQL program is executing its access plan you can run the program in debug mode. Before calling the program run the STRDBG command. This forces the SQL database optimizer to log its execution processes into the joblog. The joblog can then be displayed or printed.

SQL Index Strategy

SQL Indexes may be suppressed by blanking out their implementation name, which then defaults to *NONE. Index suppression may be desirable in certain circumstance. The primary index may be dropped on small system files with just one or only a few records. Indexes which are very similar may be dropped without experiencing much performance degradation

Processing a table in ascending and descending sequence requires two 2E ACPs. But only one may be needed to be implemented because the SQL optimiser realizes that the same index may be used either forwards or backwards.

Elimination of Prior Selects

Ensuring that the prior existence check is eliminated where necessary in CRTOBJ/CHGOBJ/DLTOBJ will improve performance. Also ensure that CHGOBJ locks are not imposed unless necessary. A simple UPDATE will perform better than a cursor update.

SQL WHERE Clause

The style of SQL coding used for multiple positioners on a RTVOBJ may affect performance. This is determined by model value YSQLWHR. DB2 UDB performs best when the model value is set at the shipped value of *OR. But other SQL databases, such as Oracle, are known to perform better when set to *NOT.

Execute User Source

The power of SQL arises from the ability to use set-at-a-time operations. 2E, by definition, will always use record level operations. However, it is relatively simple to write a piece of SQL as EXCUSRSRC and to include it inside a generated function. For example, a simple RTVOBJ across a file in order to delete certain records will generate a cursor FETCH by key followed by a single record DELETE by key. But the same operation can by written more efficiently in just one SQL statement, for example in RPG:

C/EXEC SQL
C+ DELETE FROM CUSTMAST WHERE STATUS = ’D’
C/END-EXEC

But you may need to include your own error handling as part of the EXCUSRSRC. This is best done by calling the standard 2E error handling subroutine ZWEXPG:

C           SQLCOD    IFLT *ZERO     
C                     EXSR ZWEXPG    
C                     END            

If the 2E error handling routine is not present then you can force the generator to create it by including some other database function in the action diagram, although it need not be executed.

Things that might hinder Performance

SQL SELECT *

2E generated SQL will always select all columns from the database into a data structure using the SQL statement SELECT *. Selecting all columns is not generally considered very efficient within SQL programming standards. It will affect performance. The only real solution is to create a cut down ACP and implement it as a SQL VIEW. The generated code will then SELECT all columns from the trimmed down VIEW.

Open Data Path (ODP)

An important consideration for performance in iSeries SQL is the number of Open Data Paths (ODP) and the ability to reuse ODPs. With SQL ODPs are opened & closed automatically and therefore the scope for programmer influence is limited. The SQL physical & logical files cannot be changed to share their ODP, neither can OVRDBF command be used to share ODPs. (OVRDBF command can only be used to direct a reference to a different Table or View).

A SQL program can only reuse an ODP opened by the same statement number. An identical statement coded later in the program does not reuse an ODP from any other statement. Thus if an identical RTVOBJ is coded twice in a function then this will create 2 open ODPs. With conventional HLL I/O operations there would only be one ODP per file. Thus you must expect a 2E generated SQL program to use many more ODPs than the equivalent non-SQL version, and therefore it will use more memory and take longer to load.

The creation and deletion of ODPs has a significant performance impact due to the amount of time required to open & close a data path. However, once an ODP has been created it has little impact on performance. ODPs are only bought into main storage when they are addressed. Sharing 2E subroutines will mitigate the effect of the number of ODPs opened.

The program complier options to close cursors (CLOSQLCSR) is believed to have some effect on when ODPs are deleted for re-use. But this is a complicated subject that needs some more research. The IBM manuals and help text is ambiguous.

Performance Conclusions

Simple record level access using a key or multiple record access driving down a keyed access path is probably being done as efficiently as it could be using traditional non-SQL access. Therefore switching to SQL is unlikely to give major performance improvements.

The power of SQL is when processing multiple rows in one operation. 2E’s design methodology means that it’s predominantly record level access. But using a QRY ACP or using ACPs with Select/Omit criteria may help with SQL.

Major performance benefits may be achieved by replacing procedural processing with SQL set operations. Such opportunities are more likely to be found in batch type processing rather than online. Online SQL may suffer from the overhead caused by increased number of ODPs needed. Although once opened the code should perform ok.

2E SQL does give you the opportunity to fine tune the database indexing strategy. But if you suppress an index arising from a 2E ACP then that means that any program using that ACP must be generated to use SQL.

Additional advantages may arise from using SQL tables instead of DDS defined physical files. A compromise may be to create all physical files from SQL and then to define DDS logicals over the physicals. Programs can then be a mixture of non-SQL using DDS logicals or SQL using base SQL tables.

Using SQL to do non-database activities

SQL is a powerful language in itself and need not be confined to just database activity. It can often be easier to write certain data calculations or data manipulations in SQL rather then code them in 2E or in RPG/COBOL. These SQL functions can be coded as EXCUSRSRC and embedded in 2E SQL generated programs. The SQL reference manual has approx 150 built-in functions. Here are some examples:

Mathematical Functions

  • Random Number (RAND)
  • Sin/Cosin etc (SIN/COS)
  • Log (EXP/LOG/LN)

String Functions

  • Trim Left or Right (TRIM/LTRIM/RTRIM)
  • Modify String Data (INSERT/REPLACE/CONCAT)
  • Case Conversion (UPPER/LOWER)

Date & Time

  • Return current Date or Time (CURDATE/CURTIME)
  • Analyze Dates (DAYNAME/DAYOFWEEK/DAYOFMONTH)
  • Date Arithmetic (SUM(DAYS(date1)-DAYS(date2)))

User Defined Functions

  • Create your own bespoke procedural functions in SQL or HLL, (CREATE FUNCTION)

Appendix - SQL Model Values Summary

YSQLLIB Library Name used for SQL SCHEMA


YDBFGEN Generate external functions to use *DDS or *SQL
    May be overridden on individual functions.
    Also controls which complier for EXCUSRPGM with no Y* complier directive.


YDBFACC When generating SQL use either Tables (*TABLE) or Views (*DBFGEN)
    May be overridden on individual access paths
     
YSQLLCK Type of SQL CHGOBJ record locking (see earlier explanation)


YSQLVNM Use either *DDS names or use *SQL Extended Naming


YSQLLEN Length of model names used for SQL Extended Naming, best left at 25


YSQLWHR Style of SQL WHERE clause coding generated. Use *OR for iSeries

Appendix – Example of RPG SQL to Generate Random Number

C* Generate 3 digit RANDOM number between 0 and 100           
C                     Z-ADD*ZERO     #OD7NB           2E VARIABLE
C                     Z-ADD*ZERO     NBR30   30       SQL VARIABLE
C* If seed not set then generate using system time          
C           SEED      IFEQ *ZERO                            
C                     TIME           SEED    60             
C/EXEC SQL                                                  
C+     SET :NBR30 = DECIMAL(                                
C+        (SELECT RAND(INT(:SEED)) * 100                       
C+          FROM  SYSIBM/SYSDUMMY1)  ,3,0   )                  
C/END-EXEC                                                  
C                     ELSE                                  
C* Generate using existing seed                             
C/EXEC SQL                                                  
C+     SET :NBR30 = DECIMAL(                                
C+        (SELECT RAND() * 100                                 
C+          FROM  SYSIBM/SYSDUMMY1)  ,3,0   )
C/END-EXEC                                                  
C                     ENDIF                                 
C* Trap for SQL error                                       
C           SQLCOD    IFLT 0                                
C                     EXSR ZWEXPG                           
C                     SETON                       90  *     
C                     ELSE                                  
C                     SETOF                       90  *     
C                     END                                   
C* Pass back generated value                                
C                     Z-ADDNBR30     #OD7NB
Personal tools