Interbase Stored Procedure Grammar - Felix John COLIBRI. |
- abstract : The BNF Grammar of the Interbase Stored Procedure. This grammar can be used to build stored procedure utilities, like pretty printers, renaming tools, Sql Engine conversion or ports
- key words : Stored procedure - Interbase - BNF grammar - Parser
- software used : Windows XP, Delphi 6, Interbase 6
- hardware used : Pentium 2.800Mhz, 512 M memory, 140 G hard disc
- scope : Delphi 1 to 2006, Turbo Delphi for Windows, Kylix, Interbase
- level : Delphi developer, Interbase developer
- plan :
1 - Interbase Sql Grammar
One of our customer recently wanted to perform some heavy duty computations on his Interbase Stored Procedures (analysis and transformations). Instead of plunging head on in an ad-hoc tool, we started to build a general
parser. So we had to start-off with a BNF grammar. We already presented a simplified Sql grammar for the main INSERT, UPDATE,
DELETE and SELECT requests. So we extended this grammar to handle the stored procedures as well.
Before the grammar presentation, here is a simple example of a stored procedure
contained in the Interbase Sql Reference page: CREATE PROCEDURE ORG_CHART
RETURNS (HEAD_DEPT CHAR(25), DEPARTMENT CHAR(25), MNGR_NAME CHAR(20), TITLE CHAR(5), EMP_CNT INTEGER) AS DECLARE VARIABLE mngr_no INTEGER; DECLARE VARIABLE dno CHAR(3); BEGIN
FOR SELECT H.DEPARTMENT, D.DEPARTMENT, D.MNGR_NO, D.DEPT_NO FROM DEPARTMENT D LEFT OUTER JOIN DEPARTMENT H ON D.HEAD_DEPT = H.DEPT_NO ORDER BY D.DEPT_NO INTO :head_dept, :department, :mngr_no, :dno DO BEGIN
IF (:mngr_no IS NULL) THEN BEGIN MNGR_NAME = '--TBH--'; TITLE = ''; END ELSE SELECT FULL_NAME, JOB_CODE FROM EMPLOYEE WHERE EMP_NO = :mngr_no INTO :mngr_name, :title; SELECT COUNT(EMP_NO) FROM EMPLOYEE
WHERE DEPT_NO = :dno INTO :emp_cnt; SUSPEND; END | If this does'nt look too inviting, here is another version of the same procedure, reformatted using our grammar:
create procedure org_chart
returns (pv_head_dept char(25), pv_department char(25),
pv_mngr_name char(20), pv_title char(5),
pv_emp_cnt integer) as
declare variable l_mngr_no integer;
declare variable l_dno char(3);
begin for
select h.department, d.department, d.mngr_no, d.dept_no
from department d
left outer join department h
on d.head_dept= h.dept_no
order by d.dept_no
into : pv_head_dept, : pv_department, : l_mngr_no,
: l_dno do
begin
if (: l_mngr_no is null)
then begin
pv_mngr_name= '--TBH--';
pv_title= '';
end else
select full_name, job_code
from employee
where emp_no= : l_mngr_no
into : pv_mngr_name, : pv_title;
select count(emp_no)
from employee
where dept_no= : l_dno
into : pv_emp_cnt;
suspend; end end
| The grammar was used to indent the text (and add prefixes), but the bold and coloring was performed using the Sql To HTML utility.
2 - Stored procedure grammar 2.1 - The Full Interbase grammar We started with our full Interbase Grammar. This grammar was created by
extracting from the Interbase documentation all BNF fragment, and putting then together in the same file. This is a HUGE 530 line grammar. Our previous INSERT, UPDATE,
DELETE and SELECT grammar was 78 line long.
The stored procedure grammar integrates back - the expressions (arithmetic, comparisons etc)
- the stored procedure productions proper
2.2 - The Stored Procedure BNF Grammar The basic structure is (partial):
stored_procedure= CREATE PROCEDURE procedure_name
[ '(' parameter_name_and_type_list ')' ]
[ RETURNS '(' parameter_name_and_type_list ')' ]
AS procedure_body .
procedure_body= [ variable_declaration_list ] compound_statement .
variable_declaration_list = declaration { declaration } .
declaration= DECLARE VARIABLE NAME data_type ';' .
compound_statement= BEGIN { compound_statement | statement [ ';' ] } END .
statement= assignment | exit_control | error_handling
| other | sql_request | control . |
And when we display all the details, we get:
stored_procedure= CREATE PROCEDURE procedure_name
[ '(' parameter_name_and_type_list ')' ]
[ RETURNS '(' parameter_name_and_type_list ')' ]
AS procedure_body .
procedure_name= NAME .
data_type= SMALLINT | INTEGER | FLOAT | DOUBLE PRECISION
| ( DECIMAL | NUMERIC )
[ '(' integer [ ',' integer ] ')' ]
| DATE | TIME | TIMESTAMP
| ( CHAR | CHARACTER [ VARYING ] | VARCHAR )
[ '(' integer ')' ] [ CHARACTER SET CHARACTER_SET_NAME ]
| ( NCHAR | NATIONAL ( CHARACTER | CHAR) )
[ VARYING ] [ '(' integer ')' ]
| BLOB . integer= NUMBER .
parameter_name_and_type_list= name_and_type { ',' name_and_type } .
name_and_type= NAME data_type .
procedure_body= [ variable_declaration_list ] compound_statement .
variable_declaration_list = declaration { declaration } .
declaration= DECLARE VARIABLE NAME data_type ';' .
compound_statement= BEGIN { compound_statement | statement [ ';' ] } END .
statement= | assignment
| exit_control | error_handling
| other | sql_request
| control .
simple_or_compound_statement= compound_statement | statement [ ';' ] .
value_litteral= litteral | [ ':' ] NAME [ '.' NAME ] .
litteral= INTEGER_LITTERAL | STRING_LITTERAL
| DOUBLE_LITTERAL | NULL .
integer_litteral= INTEGER_LITTERAL .
table_or_view_name= NAME .
name_view_procedure= NAME .
column_name= NAME [ '.' NAME ] .
collation_name= NAME .
alias_name= NAME .
select= F .
select_expression= select .
functions= average | count | max | min | sum | upper . // | cast .
average= AVG '(' [ ALL | DISTINCT ] value_litteral ')' .
count= COUNT '(' ( '*' | [ ALL | DISTINCT ] value_litteral ) ')' .
max= MAX '(' [ ALL | DISTINCT ] value_litteral ')' .
min= MIN '(' [ ALL | DISTINCT ] value_litteral ')' .
sum= SUM '(' [ ALL | DISTINCT ] value_litteral ')' .
upper= UPPER '(' value_litteral ')' .
cast= CAST '(' value_litteral AS data_type ')' .
function_or_value= [ '+' | '-'] ( functions | value_litteral ) .
search_condition= search_value { ( OR | AND ) search_condition } .
search_value= simple_value { arithmetic_operator simple_value } .
arithmetic_operator= '+' | '-' | '*' | '/' | '||' .
simple_value=
function_or_value
[ [ NOT ] ( between | like | in | compare | containing | starting )
| IS [ NOT ] NULL ]
| ( ALL | SOME | ANY ) '(' select_column_list ')'
| EXISTS '(' select_expression ')'
| SINGULAR '(' select_expression ')'
| '(' search_condition ')'
| NOT search_condition .
select_one_column= select .
select_column_list= select .
between= BETWEEN value_litteral AND value_litteral .
like= LIKE value_litteral [ ESCAPE value_litteral ].
in= IN '(' value_litteral { ',' value_litteral } | select_column_list ')' .
compare= compare_operator ( search_value | '(' select_one_column ')' ) .
compare_operator= '=' | '<' | '>' | '<=' | '>=' | '<>' .
containing= CONTAINING value_litteral .
starting= STARTING [ WITH ] value_litteral .
select= SELECT
column_clause
FROM from_table_reference { ',' from_table_reference }
[ WHERE search_condition ]
[ GROUP BY column_name
[ COLLATE collation_name ] { ',' column_name [ COLLATE collation_name ] } ]
[ HAVING search_condition ]
[ UNION select_expression [ ALL ] ]
[ ORDER BY order_list ]
.
column_clause= [ DISTINCT | ALL ] ( '*' | column_detail { ',' column_detail } ) .
column_detail= functions | value_litteral [ element_index ] .
element_index= '[' ':' NAME ']' .
from_table_reference= NAME ( procedure_end | join_end ) | direct_joined_table .
join_type= ( [ INNER | { LEFT | RIGHT | FULL } [OUTER] ] ) JOIN .
procedure_end= [ '(' value_litteral { ',' value_litteral } ')' ] [ alias_name ]
[ join_type name_view_procedure [ alias_name ] ON search_condition ] .
join_on= F .
joined_table= ( name_view_procedure join_on | '(' joined_table ')' ) { join_on } .
join_on= join_type ( joined_table | name_view_procedure ) [ alias_name ]
ON search_condition .
join_end= join_on { join_on } .
direct_joined_table= '(' joined_table ')' { join_on } .
order_list= ( column_name | integer_litteral ) [ COLLATE collation_name ]
[ ascending_or_descending ] { ',' order_list } .
ascending_or_descending= ASC | ASCENDING | DESC | DESCENDING .
assignment= variable_name '=' search_condition .
variable_name= NAME .
exit_control= EXIT | SUSPEND .
error_handling= | proc_whenever
| proc_when | proc_exception
.
proc_whenever= WHENEVER ( NOT FOUND | SQLERROR | SQLWARNING )
( GOTO label_name | CONTINUE ) .
label_name= NAME .
exception_name= NAME.
// -- must be before END: some kind of TRY ... EXCEPT concept
proc_when= WHEN ( error_code { ',' error_code } | ANY ) DO
simple_or_compound_statement.
error_code= EXCEPTION exception_name | ( SQLCODE | GDSCODE )
error_litteral .
error_litteral= [ '+' | '-' ] integer_litteral .
proc_exception= EXCEPTION exception_name .
other= | proc_post_event
| execute_procedure .
proc_post_event= POST_EVENT ( event_string | column_name ) .
event_string= EVENT_STRING .
execute_procedure= EXECUTE PROCEDURE NAME [ input_parameters ]
[ RETURNING_VALUES output_parameters ] .
parameter_name= ':' NAME .
input_parameters= parameter_name { ',' parameter_name } .
output_parameters= parameter_name { ',' parameter_name } .
sp_select= select [ into_clause ] .
into_clause= INTO output_parameter { ',' output_parameter } .
output_parameter= ':' NAME .
sql_request= insert | update | delete | sp_select .
update= UPDATE table_or_view_name SET set_value { ',' set_value }
[ WHERE search_condition ] .
set_value= column_name '=' search_condition .
insert= INSERT INTO table_or_view_name
[ '(' column_name { ',' column_name } ')' ]
( VALUES '(' value_litteral { ',' value_litteral } ')'
| select_expression ) .
delete= DELETE [ TRANSACTION transaction_name ] FROM table_name
( [ WHERE search_condition ] | WHERE CURRENT OF cursor_name ) .
transaction_name= NAME .
cursor_name= NAME .
table_name= NAME .
control= proc_if
| proc_while
| for_select .
condition_name= search_condition .
proc_if= IF '(' condition_name ')' THEN simple_or_compound_statement
[ ELSE simple_or_compound_statement ] .
proc_while= WHILE '(' condition_name ')' DO simple_or_compound_statement .
for_select= FOR sp_select DO simple_or_compound_statement .
|
Just a couple of comments - like all our grammars, the Stored Procedure Grammar is not a "validating grammar" but an "operational grammar": our purpose was to analyze and
manipulate Stored Procedure declarations, not to run them. We did not want to generate Interbase BLR pseudo code or check scripts for syntactic correctness.
It is easy to check that our grammar allows more general texts than allowed
by the Interbase engine. For instance, in value_litteral we allow both the colon : and the dot .:
value_litteral= litteral | [ ':' ] NAME [ '.' NAME ] . |
and this would be correct in any assignment to an output parameter, but not in the column clause between SELECT and FROM
- on the other hand, some production were rearranged to handle our test suite. For instance, in the SHOW_LANGS stored procedure, we found the [:index] element:
create procedure show_langs
(p_code varchar(5), p_grade smallint, p_cty varchar(15) )
returns (pv_languages varchar(15) ) as
declare variable l_i integer;
begin l_i= 1;
while (l_i<= 5) do begin
select language_req[: i]
from job
where ( (job_code= : p_code)
and (job_grade= : p_grade) and (job_country= : p_cty)
and (language_req is not null) )
into : pv_languages;
/* -- Prints 'NULL' instead of blanks */
if (pv_languages= ' ')
then pv_languages= 'NULL';
l_i= l_i+ 1; suspend;
end end | therefore we added this indexing possibility in the column_clause of
SELECT : select= SELECT
column_clause ...ooo...
column_clause= [ DISTINCT | ALL ] ( '*' | column_detail { ',' column_detail } ) .
column_detail= functions | value_litteral [ element_index ] .
element_index= '[' ':' NAME ']' .
| But should this index sometime show up in the WHERE or VALUES parts, then we would move the indexing element up in a more general rule.
Certainly carefully reading the manual would bring back the answer. But this is not a sure thing: the textual description is not as accurate as BNF, and their BNF is approximate and illustrative. So we adopted this more pragmatic approach.
Did Jim STARKEY use a grammar ?. The answer is a resounding YES. In the Interbase 6 sources, you find some YACC bits and pieces. First, reverse engineering YACC grammars into LL1 BNF is not that simple, and we did it on
occasion for languages like Java or C++. However Interbase contains MANY languages (DDL, DML, BLR, embedded Sql...), and my feeling is that those YACC files were used as a starting point for some of the parsers. So there
is a possibility (possibility ? well...) that they were not kept up to date with the language evolutions. So we are back to square one, which means adjusting the grammar to whatever the Sql Engine accepts.
- our overall structure of the grammar is arbitrary:
- the local variables could be placed in the procedure header instead of the body. Since procedures cannot be nested, this is not an issue
- we tried to isolate the different concepts of the statement stored procedure:
- assignment
- exit_control
- error_handling
- pure sql_requests
- control and other
All the groups could have been merged into a single statement alternative. - the grammar might still contains some LL1 duplicates (rule where several alternate elements start with the same terminal token)
- the main problems was the : semi colon. The rule tells that all statements are terminated by a semi-colon BUT NOT the BEGIN .. END. For instance:
if (pv_languages= ' ')
then pv_languages= 'NULL'; /* <== ; */
l_i= l_i+ 1; | but:
if (pv_languages= ' ') then
begin pv_languages= 'NULL';
end /* <== no ; */
l_i= l_i+ 1; | and this rule wreaks total havoc in the statement rules. You CANNOT use:
compound_statement= BEGIN { general_statement } END .
general_statement= compound_statement | statement ';' .
statement= assignment | if_statement | ...ooo... .
if_statement= IF condition then general_statement .
...ooo... |
because if_statement is in the iteration, and there statement is always followed by the semi-colon. We tried to place the semicolon after the rules that cannot use compound statement, but to no avail.
The only solution would be to move the semi-colon issue up at the body level, but then all the arithmetic would also have to be brought out of the body. So we accepted to make the semi-colon an option after each statement, which
does parse our stored procedures, but is much more general that what the Sql Engine accepts (semi colon are optional everywhere !). Once again, this grammar is not a validating grammar, but an operational one
- finally we presented a "stored procedure only" grammar: all the arithmetic as well as the general INSERT, UPDATE, DELETE and SELECT Sql
requests are nested within the body. In a full Sql grammar, they would be outside of the stored procedure. In addition, a full Sql grammar would include all the other rules (ALTER, COMMIT, GRANT etc)
2.3 - Testing the grammar To test the grammar, we used a test set including the procedures extracted from:
3 - Download the Grammar Text Here are the source code files:
As usual:
- please tell us at fcolibri@felix-colibri.com if you found some errors, mistakes, bugs, broken links or had some problem downloading the file. Resulting corrections will
be helpful for other readers
- we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to fcolibri@felix-colibri.com.
- or more simply, enter your (anonymous or with your e-mail if you want an answer) comments below and clic the "send" button
- and if you liked this article, talk about this site to your fellow developpers, add a link to your links page ou mention our articles in
your blog or newsgroup posts when relevant. That's the way we operate: the more traffic and Google references we get, the more articles we will write.
4 - References Here are some Stored Procedure links:
5 - The author Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object
Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software
development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delph training. His web site features tutorials, technical papers about programming with full downloadable source
code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP / UML, Design Patterns, Unit Testing training sessions.
|