Wednesday, March 9, 2011

Celebrating LIST() and STRING()

Here's a situation that pops up from time to time when working with MobiLink: The consolidated and remote databases are ready to go and all the MobiLink scripts have been coded, and it's time to start testing... but there's no data in any of the tables.

What's the quickest way to insert a single row into each table just to get started running synch tests?

To be more specific, how can you execute an INSERT statement that looks like this template, for each and every table in the database, so every column has at least one value?


INSERT owner.table VALUES (
value -- column description
, value -- column description
, value -- column description
);

Here's a SQL script that will do just that, using the following features of SQL Anywhere:
  • the LIST() aggregate function is called twice, once to build each INSERT statement, and again to build each VALUES entry within an INSERT,

  • the STRING() function is also used twice, to convert expressions into strings and concatenate them into longer strings, once for each INSERT and once for each VALUES entry,

  • the EXECUTE IMMEDIATE statement is used to apply all the inserts to the database, and

  • the WAIT_FOR_COMMIT option is used so it doesn't matter what order the inserts are executed in (referential integrity isn't checked until after all the inserts are finished).
If you're like me, aggregate functions (especially those OLAP WINDOW things) are mysterious to behold... but LIST() is worth the effort to understand because it's so flexible and powerful:

BEGIN
DECLARE @insert_statements LONG VARCHAR;

SELECT LIST (
STRING (
'INSERT ', USER_NAME ( SYSTABLE.creator ), '.', SYSTABLE.table_name, ' VALUES ( \x0D\x0A',
' ',
( SELECT LIST (
STRING (
CASE SYSDOMAIN.domain_name
WHEN 'bigint' THEN '99 -- '
WHEN 'binary' THEN '99 -- '
WHEN 'bit' THEN '1 -- '
WHEN 'char' THEN '''x'' -- '
WHEN 'date' THEN 'CURRENT DATE -- '
WHEN 'decimal' THEN '99 -- '
WHEN 'double' THEN '99 -- '
WHEN 'float' THEN '99 -- '
WHEN 'integer' THEN '99 -- '
WHEN 'long binary' THEN '''x'' -- '
WHEN 'long nvarchar' THEN '''x'' -- '
WHEN 'long varbit' THEN '1 -- '
WHEN 'long varchar' THEN '''x'' -- '
WHEN 'nchar' THEN '''x'' -- '
WHEN 'numeric' THEN '99 -- '
WHEN 'nvarchar' THEN '''x'' -- '
WHEN 'smallint' THEN '99 -- '
WHEN 'st_geometry' THEN 'NULL -- '
WHEN 'time' THEN 'CURRENT TIME -- '
WHEN 'timestamp' THEN 'CURRENT TIMESTAMP -- '
WHEN 'timestamp with time zone' THEN 'CURRENT UTC TIMSTAMP -- '
WHEN 'tinyint' THEN '99 -- '
WHEN 'uniqueidentifier' THEN 'NEWID() -- '
WHEN 'unsigned bigint' THEN '99 -- '
WHEN 'unsigned int' THEN '99 -- '
WHEN 'unsigned smallint' THEN '99 -- '
WHEN 'varbinary' THEN '''x'' -- '
WHEN 'varbit' THEN '1 -- '
WHEN 'varchar' THEN '''x'' -- '
WHEN 'xml' THEN ''''' -- '
END CASE,
SYSCOLUMN.column_name,
' ',
SYSDOMAIN.domain_name,
CASE SYSDOMAIN.domain_name
WHEN 'bigint' THEN ''
WHEN 'binary' THEN ''
WHEN 'bit' THEN ''
WHEN 'char' THEN STRING ( '( ', SYSCOLUMN.width, ')' )
WHEN 'date' THEN ''
WHEN 'decimal' THEN STRING ( '( ', SYSCOLUMN.width, ', ',
SYSCOLUMN.scale, ')' )
WHEN 'double' THEN ''
WHEN 'float' THEN ''
WHEN 'integer' THEN ''
WHEN 'long binary' THEN ''
WHEN 'long nvarchar' THEN ''
WHEN 'long varbit' THEN ''
WHEN 'long varchar' THEN ''
WHEN 'nchar' THEN STRING ( '( ', SYSCOLUMN.width, ')' )
WHEN 'numeric' THEN STRING ( '( ', SYSCOLUMN.width, ', ',
SYSCOLUMN.scale, ')' )
WHEN 'nvarchar' THEN STRING ( '( ', SYSCOLUMN.width, ')' )
WHEN 'smallint' THEN ''
WHEN 'st_geometry' THEN ''
WHEN 'time' THEN ''
WHEN 'timestamp' THEN ''
WHEN 'timestamp with time zone' THEN ''
WHEN 'tinyint' THEN ''
WHEN 'uniqueidentifier' THEN ''
WHEN 'unsigned bigint' THEN ''
WHEN 'unsigned int' THEN ''
WHEN 'unsigned smallint' THEN ''
WHEN 'varbinary' THEN STRING ( '( ', SYSCOLUMN.width, ')' )
WHEN 'varbit' THEN STRING ( '( ', SYSCOLUMN.width, ')' )
WHEN 'varchar' THEN STRING ( '( ', SYSCOLUMN.width, ')' )
WHEN 'xml' THEN ''
END CASE ),
'\x0D\x0A , '
ORDER BY SYSCOLUMN.column_id )
FROM SYSCOLUMN
INNER JOIN SYSDOMAIN
ON SYSDOMAIN.domain_id = SYSCOLUMN.domain_id
WHERE SYSCOLUMN.table_id = SYSTABLE.table_id ),
'\x0D\x0A); ' ),
'\x0D\x0A\x0D\x0A'
ORDER BY SYSTABLE.table_name )
INTO @insert_statements
FROM SYSTABLE
WHERE USER_NAME ( SYSTABLE.creator ) = 'GROUPO'
AND SYSTABLE.table_type = 'BASE';

SET TEMPORARY OPTION WAIT_FOR_COMMIT = 'ON';

EXECUTE IMMEDIATE @insert_statements;

COMMIT;

END;
  • The outer SELECT statement on lines 4 through 91 retrieves each base table belonging to GROUPO in the SQL Anywhere 12 demo database, using data from the SYSTABLE catalog view.

  • The outer LIST() function call on lines 4 through 87 gathers up all the SYSTABLE rows into a single string, and that string is placed in a variable on line 88.

  • The STRING() function call on lines 5 through 85 builds the first parameter passed to the outer LIST() function: the generated INSERT.

  • The two carriage return - line feed pairs on line 86 form the second parameter for the outer LIST() function: the separator between each INSERT in the list.

  • The ORDER BY clause on line 87 instructs the outer LIST() to sort the generated INSERT statements in table name order.

  • The inner SELECT on lines 8 through 84 retrieves each column for the table in the outer SELECT, using data from the SYSCOLUMN and SYSDOMAIN catalog views.

  • The inner LIST() function call on lines 8 through 80 gather all the SYSCOLUMN rows into a single string.

  • The STRING() function call on lines 9 through 78 is the first parameter passed to the inner LIST(), the separator is specified on line 79 and the list's ORDER BY clause is on line 80.

  • The CASE expression on lines 9 through 41 determines what literal to use for this column in the VALUES list.

  • The expressions on lines 42 through 78 build a --comment that's very handy if the output has to be customized.
Here's what the first few generated INSERT statements look like in the string:

INSERT GROUPO.Contacts VALUES (
99 -- ID integer
, 'x' -- Surname char( 20)
, 'x' -- GivenName char( 20)
, 'x' -- Title varchar( 34)
, 'x' -- Street char( 30)
, 'x' -- City char( 20)
, 'x' -- State char( 16)
, 'x' -- Country char( 16)
, 'x' -- PostalCode char( 10)
, 'x' -- Phone char( 13)
, 'x' -- Fax char( 13)
, 99 -- CustomerID integer
);

INSERT GROUPO.Customers VALUES (
99 -- ID integer
, 'x' -- Surname char( 20)
, 'x' -- GivenName char( 20)
, 'x' -- Street char( 30)
, 'x' -- City char( 20)
, 'x' -- State char( 16)
, 'x' -- Country char( 16)
, 'x' -- PostalCode char( 10)
, 'x' -- Phone char( 13)
, 'x' -- CompanyName char( 32)
);

INSERT GROUPO.Departments VALUES (
99 -- DepartmentID integer
, 'x' -- DepartmentName char( 40)
, 99 -- DepartmentHeadID integer
);

This method doesn't just work for the demo database, it works for a lot of real-world databases... but not all.

When it doesn't work perfectly, however, it's a good start: Change the code to save the generated inserts in a file instead of executing them directly, and then edit the file to fix whatever's wrong.

No comments: