Using Script Files

Using script files to store and execute task-specific SQL statements has a number of advantages over using views, functions, or procedures that are stored within the database itself, particularly for one-off or infrequent tasks, or actions that must be applied to multiple databases. These advantages are:

  • When database operations are only part of an overall task, maintenance and management of all components of the task is easier and more reliable when SQL scripts are kept together in the file system with input files, database output, output processing scripts, and final task products. Because all of the SQL needed for a specific data summarization task is kept together, there is little or no risk that one of a set of separate database objects—views or stored procedures—that are needed to complete a specific task will be either deleted or altered. The clutter of queries, functions, and procedures that would otherwise accumulate in a heavily used database can be reduced or eliminated.
  • When multiple databases with the same data model are used (e.g., for different projects), only one copy of the scripts tailored for that data model need be maintained, rather than having duplicate procedures or views in every database. This reduces maintenance and ensures consistency.
  • Creation of the SQL script for a new task can be simplified by copying and editing a previously existing script. The user’s preferred editor can be used to carry out search and replace operations to easily and reliably make changes throughout the entire set of SQL statements and scripts that are needed for a particular task.
  • Complete documentation can (and should!) be included in the script files, so that the purpose, assumptions, limitations, and history of changes can be easily reviewed by anybody who might consider using or modifying the query script. This documentation is easily accessible to scanning and searching tools like grep.
  • The script can be easily preserved to document the way in which data were selected or summarized. Scripts can be easily archived, backed up, and put under version control independently of the database. Script files can be made read-only so that they cannot easily or accidentally be modified after the script for a particular task has been finalized.
  • Data management processes can be more easily automated by integrating a script-processing tool like execsql with other system tools than by using interactive database interfaces. The ability of execsql to export data in CSV, TSV, OpenDocument spreadsheet, readable Markdown-compatible text, HTML, JSON, and LaTeΧ formats reduces the amount of time that might otherwise be required to interactively open the database, run the appropriate query (not to mention verifying that the query, or any queries that it depends on, have not been altered), export the result, and reformat the result. If the query output will be further processed or used in another scriptable application (e.g., to produce graphics or statistics using R), execsql can be combined with other programs in a system script file to further automate the data summarization and analysis process.
  • If a database must be maintained in two different formats (e.g., in PostgreSQL for ordinary use, but downloaded to SQLite for use when a network connection is not available), one script file can potentially be used to carry out exactly the same data selection and summarization operations on both formats of the database.
  • The capabilities provided by some of execsql’s metacommands surpass the features available in views or stored procedures in most DBMSs, and this additional functionality is only available when script files are used.

The capabilities provided by metacommands may, in some cases, allow a script designed for execsql to take the place of a custom database client program.