Automated database change management

This article was originally published in VSJ, which is now part of Developer Fusion.
Most approaches to database change management involve tedious, manual and costly processes that are error prone. In many cases there is no version control of the objects that reside within the database.

The result? Database chaos.

Typically, a database development project will suffer problems such as overwritten changes, code that won’t compile, lost data and confusion over what has been applied to which database. Release into production becomes a chaotic and costly phase, with everybody “fire-fighting” and in “panic mode”.

Database systems are becoming more complex, and the demands placed upon them are ever increasing. Businesses must continually innovate and accept new challenges, and IT departments cannot afford to cling complacently to processes simply because it’s “always been done this way” or because they are the prevalent paradigm.

The Need for Database Change Management

One of the certainties in life is that things will change. This is especially true of database systems, which must evolve in response to new user requirements and the continuous drive for competitive advantage. Changes to a database may be required for bug fixes, defect resolution or performance tuning (e.g. de-normalisation). The complexity of some systems can be overwhelming when you consider the interactions and inter-dependencies between technologies and protocols such as COM+, MDAC, Web Services, XML, HTML, ASP and SOAP, and the fact that many environments are heterogeneous.

When you think about database change management, the following questions may spring to mind:

  1. How are changes to database structures and static data managed in such complex and often mission-critical environments?
  2. How can you know with certainty what version a database is in a particular environment?
  3. How can you deploy new versions with accuracy and reliability?
  4. Can you reliably determine who made changes to your database objects?
  5. Do you have an audit trail recording what changes have occurred and at what point in time?
  6. If severe errors occur, can you back out the corrupt version and roll back to a known state of integrity?
This is the essence of database change management. Every database development project must have a process or methodology in place to propagate and deploy upgrades to database schema and data. After all, the database is an extremely valuable asset, containing the transactional, financial and customer information that is the lifeblood of an organisation.

Problems with Current Approaches

Now let’s discuss some of the current methods of database change management, outlining their flaws, and the degree of risk that they contribute to the application development lifecycle.

Manual Scripting

Probably the most common method of managing database change is via a manual process using script files. This involves hand-crafted upgrade scripts for changes to database code, schema and static data. There may also be the need for a corresponding downgrade script to be used as a contingency to rollback a failed upgrade.
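
For illustration, a minimal hand-crafted pair of such scripts in Transact-SQL might look like the following (the table, column and file names are hypothetical):

    -- upgrade_v1_2.sql: add a Status column with a default
    ALTER TABLE dbo.Customer
        ADD Status char(1) NOT NULL
        CONSTRAINT DF_Customer_Status DEFAULT 'A'
    GO

    -- downgrade_v1_2.sql: contingency to roll back the upgrade
    ALTER TABLE dbo.Customer
        DROP CONSTRAINT DF_Customer_Status
    GO
    ALTER TABLE dbo.Customer
        DROP COLUMN Status
    GO

Every such pair must be written, reviewed and sequenced by hand, which is where the cost and risk accumulate.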

A typical scenario is that a DBA will receive an upgrade script from a developer that differs from the copy in the version control system (VCS), which in turn differs from what is in the production environment. The DBA(s), while performing a vital quality assurance role, become a bottleneck, as they must manually resolve such discrepancies and review every script for correct syntax, functionality (often requiring an understanding of the business rules) and performance. They must also understand how the changed objects will affect their dependent objects (e.g. through referential integrity) and the order in which changes should be propagated.

After the scripts have been executed against the target environment, any further issues require yet another upgrade script to be coded, and so on. You very often end up with many incremental scripts for a single piece of new functionality. Multiply that across many developers and many pieces of new functionality, and the number of scripts can run into the hundreds.

With manual processes, there is always the risk that the development and testing environments do not match the production database in some subtle way. This can lead to production outages that could have been avoided if there were a guarantee that what was tested matches exactly what is placed on the production system.

Reliance on hand-crafted scripts to manage code baselines and propagate change within your database development project cannot guarantee complete control. Many IT departments are still using cumbersome manual scripting, which ties up highly skilled staff in tedious, time-consuming and error-prone tasks.

Summary of drawbacks

  • manual process, therefore time consuming (expensive) and error prone
  • requires upgrade and possibly downgrade scripts
  • may require additional upgrade scripts if errors are found (may run into hundreds)
  • increasing pressure upon DBA(s), resulting in bottlenecks
  • quality assurance may be impeded if DBA(s) come under pressure
  • production and test environments may still not match
  • real risk of production outages
  • complete database management control is not a reality

Create Brand New Database and Migrate all Data

This approach involves the development of scripts to create the database schema and code such as stored procedures. These scripts encapsulate the changes required for the next version of the database. A brand new database is created and the scripts are executed against it, giving an empty schema. A migration process is then run against the new database, using routines that extract data from the original database and load it into the new, upgraded database. The advantage of this approach is that it is less a change process, with all its inherent complexities, and more a creation and initial load process.
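
As a sketch, the loading routines often amount to little more than INSERT…SELECT statements from the original database into the newly built one (the database and table names below are hypothetical, both databases are assumed to be on the same server, and the table is assumed to have an identity column):

    -- Preserve identity values while copying rows across
    SET IDENTITY_INSERT NewDB.dbo.Customer ON
    INSERT INTO NewDB.dbo.Customer (CustomerID, Name, Email)
    SELECT CustomerID, Name, Email
    FROM OldDB.dbo.Customer
    SET IDENTITY_INSERT NewDB.dbo.Customer OFF

Multiply this by every table, every foreign key ordering constraint and the data volumes involved, and the drawbacks below become apparent.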

Summary of drawbacks

  • a much longer time to upgrade due to the migration of potentially huge amounts of data (with an increased risk of data loss and corruption)
  • additional requirements for disk space
  • the creation, testing and maintenance of extraction and loading routines
  • inefficient manual process to generate scripts to build the new database with the upgraded schema and code
  • complete database management control is not a reality

Team Development using Owner Qualified Objects

This method provides each developer with their own objects or schema within a common development database. Each developer is given exclusive security access via a login whose name also prefixes each object to identify ownership. They are then free to code and unit test in complete isolation. Only the DBA has security access to the baseline database objects; in Microsoft SQL Server, for example, these would be the dbo (database owner) prefixed objects.

When development of an object is complete, the developer checks it into the VCS and informs the DBA it is ready for promotion. The DBA replaces the baseline object in the database with the new version and updates an overall upgrade script.
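
A hypothetical sketch of the pattern in SQL Server: the developer works on an owner-qualified copy of a procedure, and the DBA later promotes it to the dbo baseline (all object names are illustrative):

    -- Developer 'jsmith' codes and unit tests in isolation
    CREATE PROCEDURE jsmith.usp_GetOrders @CustomerID int
    AS
        SELECT OrderID, OrderDate
        FROM dbo.Orders
        WHERE CustomerID = @CustomerID
    GO

    -- After check-in, the DBA replaces the baseline version
    DROP PROCEDURE dbo.usp_GetOrders
    GO
    CREATE PROCEDURE dbo.usp_GetOrders @CustomerID int
    AS
        SELECT OrderID, OrderDate
        FROM dbo.Orders
        WHERE CustomerID = @CustomerID
    GO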

Summary of drawbacks

  • tends to work only for code objects such as stored procedures and functions; schema changes (Data Definition Language, DDL) must still be maintained by the DBA(s)
  • results in many manual steps, such as promoting objects and modifying the final upgrade script, which are time consuming and error prone
  • the DBA(s) must ensure integrity of the process, but in so doing, becomes a bottleneck
  • far too much reliance on diligent communication between individuals
  • complete database management control is not a reality

Ascertain Changes with SQL Queries

In order for the developer or DBA(s) to apply upgrades, they must ascertain what changes need to be propagated and in what order. One way to achieve this is to develop SQL queries and stored procedures that compare schemas and extract object properties; in SQL Server, for example, you can interrogate the information schema views with a FULL JOIN and a WHERE clause that looks for NULL values.
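
As a minimal sketch, assuming two databases named DevDB and ProdDB on the same server, such a comparison query might look like this:

    -- Report columns that exist in only one of the two databases
    SELECT d.TABLE_NAME, d.COLUMN_NAME,
           p.TABLE_NAME, p.COLUMN_NAME
    FROM DevDB.INFORMATION_SCHEMA.COLUMNS AS d
    FULL JOIN ProdDB.INFORMATION_SCHEMA.COLUMNS AS p
        ON  d.TABLE_NAME  = p.TABLE_NAME
        AND d.COLUMN_NAME = p.COLUMN_NAME
    WHERE d.COLUMN_NAME IS NULL   -- column exists only in production
       OR p.COLUMN_NAME IS NULL   -- column exists only in development

A NULL on one side indicates an object present in only one environment; similar queries are needed for data types, constraints, indexes and so on, which is where this approach becomes laborious.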

Summary of drawbacks

  • this could entail a large amount of visual comparison of schemas, DDL files and system catalogues with the aid of a file comparison tool
  • although such queries and stored procedures can tell you what is different, they rarely have the ability to actually deploy changes to upgrade or synchronise
  • an ad hoc and manual approach that is tedious and error prone
  • requires an excellent understanding of the database schema and system catalogues
  • need in-depth knowledge of object dependencies and business rules
  • complete database management control is not a reality

What is an Automated Methodology?

If you participate in a project that involves database development, a crucial element of the project’s success is the accurate propagation and deployment of database code and schema changes. Often, an unnecessarily large proportion of highly skilled resource is consumed by this aspect of the project. Once the analysis and design have been completed, the deployment of the changes, while critical, is laborious and repetitive. It is therefore highly recommended to systematise, or automate, these tasks.

An automated methodology is a complete end-to-end solution for database change management and version control, which will minimise human intervention for tasks that are repetitive, time-consuming and error-prone. This process is enabled by an application called the “change control tool”.

Elements of an Automated Approach

The following outlines the basic elements required to achieve an automated approach.

Local Development Database

There are two choices here:
  • a single development database that all developers share, or
  • a copy of the development database for each developer
The latter is recommended as it provides each developer with an isolated workspace that cannot be changed by another developer. Typically, each developer would have a cut-down extract of the production system on their desktop. How do the developers keep their local databases up to date? An automated methodology provides the ability for each developer to incorporate all other developers’ changes into their local database and then unit test their own changes.

Integration with Version Control System (VCS)

Most IT departments utilise the critical functionality of a VCS to maintain and track changes to source code. It is quite a common practice for developers to store new and amended code and components for languages such as VB, C++, C#, ASP, HTML, Java and XML. But what about the SQL source code and objects which are resident in the database system? It is recommended to store all SQL objects within a VCS, thus enabling automated versioning of the database.

Building the Source Structure

To implement an automated methodology, all SQL code and schema definitions must first be reverse-engineered and lodged into a VCS (a one-off task per system) usually with a separate folder for each object. An automated approach facilitates the extraction of DDL and SQL code and the creation of the appropriate source structure.

Figure 1: Example Directory Structure for SQL Objects in VCS

Static Data

The commands to insert static data (reference table rows) are also lodged in the VCS. These commands can be generated by a stored procedure or by the “change control tool”, creating an insert statement for each row of your static data tables. Automation simplifies the generation of these commands.
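
As a minimal sketch (the reference table and its columns are hypothetical), such a generator can be a simple SELECT that concatenates each row into an INSERT statement, with embedded quotes doubled up:

    -- Emit one INSERT statement per row of a static data table
    SELECT 'INSERT INTO dbo.OrderStatus (StatusCode, Description) '
         + 'VALUES (' + CAST(StatusCode AS varchar(10)) + ', '''
         + REPLACE(Description, '''', '''''') + ''')'
    FROM dbo.OrderStatus
    ORDER BY StatusCode

The resulting script is then lodged in the VCS alongside the schema scripts.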

Build Environment

A “build” environment is configured containing a database which is a replica of the production database. An hourly (or daily) compile and build of all SQL objects is run as a scheduled task. This facilitates a proactive approach to verification.

Custom Scripts

As with any complex database system and the applications it supports, not every situation can be covered automatically; some will need intervention. On rare occasions, a scenario arises that is ambiguous, and the change control process has to guess at what is required. For example, if a new table schema renames a column, the tool cannot distinguish a rename from a drop-and-add, so the decision might be made to drop the column and all its existing data – something you may not want to happen.

There are other situations in which the process simply cannot know what is required, e.g. where data is needed for a newly created column. In these cases, the automated approach should provide the ability to execute ad hoc custom scripts. For the new column data, for example, you would create a script to insert the data, and have the automated process call this script after the schema changes have been applied to your database.
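
A hypothetical example: if an upgrade adds a Region column to a Customer table, a custom script to back-fill it from an existing mapping table might look like this, run by the change control tool once the schema change has been applied (all names are illustrative):

    -- custom_001_populate_region.sql
    -- Back-fill the new Region column after the automated upgrade
    UPDATE c
    SET    c.Region = r.RegionName
    FROM   dbo.Customer AS c
    JOIN   dbo.PostcodeRegion AS r
        ON c.Postcode = r.Postcode
    WHERE  c.Region IS NULL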

Change Control Tool

At the centre of an automated methodology is the “change control tool.” This is the software that automates database propagation and deployment by performing such tasks as generation of the SQL commands necessary to move a database to the desired version. Key features of such a tool would include:
  • ability to run in user mode (wizard-based) or auto mode (command-line execution) for automated scheduling
  • integration with a version control system e.g. Visual SourceSafe
  • schema and data comparison and synchronisation
  • build a new database from a set of object creation scripts, thereby verifying that the code compiles
  • compare a database to a set of object creation scripts and perform an upgrade of the database
  • object dependency resolution – algorithms to ensure objects are built in the correct order and relationships are maintained e.g. defaults bound to columns, DRI and circular relationships
  • identify and report all differences and errors
  • automatic generation of SQL change scripts for schema and data
  • capability to reverse-engineer existing databases
  • session-save facility
  • ad hoc custom script execution
  • ability to compare all database objects and properties

Implementing an Automated Approach

There are probably numerous ways to implement the specific tasks of an automated approach, owing to the complexities and uniqueness of each environment. A typical implementation of this methodology, however, would involve the following steps (see Figure 2):
  1. As a one-time process, script out your entire production database’s schema and static data into individual scripts and lodge these in your version control system (VCS). Instruct the development team to use them in the same manner as they would VB, ASP, C++ or Java code.
  2. Decide whether each developer will use a local development database, which is a subset of production, or use a central, shared development repository.
  3. Provide each developer with the ability to perform an ad hoc or scheduled update of their local database with the change control tool, or set up a similar scheduled update if using a shared environment.
  4. From that point on, in order to make a change, a developer would:
    • check out the relevant scripts
    • make the required amendments
    • execute them against their local (or shared) development database
    • perform unit testing and bug fixing
    • use the change control tool to extract the latest version of the scripts from the VCS (which incorporates all other developers’ changes)
    • run the change control tool to build a database locally, using the scripts from the VCS and the developer’s own changes (to verify that nothing else has been broken)
    • then check the relevant scripts back into the VCS
  5. Use the change control tool to execute a scheduled (hourly or daily) build of the entire schema and code to a “production-sized” build database to verify successful compilation. Label (or snapshot) the source code at these regular intervals to have a named, point-in-time view of your entire database.
  6. To propagate and deploy changes to your various environments, you connect the change control tool to the labelled scripts within the VCS and to your target database. The change control tool will then verify that all the scripts compile, with no dependency errors, and make all necessary changes to the target database to ensure that it matches the scripts.

Figure 2: An Automated Database Change Management Solution – Logical Diagram

Benefits of an Automated Approach

1. Improved Quality and Faster Software Delivery

Local Configuration
The use of a local development database means that no developer can be prevented from delivering their changes just because someone else has broken a shared database halfway through unit testing. A regular, automated update of this database from the base scripts in the VCS enables a thorough and swift examination of the developer’s changes. This configuration encourages the developer to deliver better-quality code and ensures that what ends up in the VCS is of a high standard.

Scheduled Build Verification
The continual testing and verification of deployment via a scheduled build of the code base helps detect bugs, such as syntax, regression and dependency errors, much earlier, while the changes are still fresh in the developer’s mind. The end result is faster software delivery and enhanced quality and integrity of code, along with the avoidance of costly and stressful “fire-fighting” at release time.

Elimination of Manual Tasks
One of the main aims of an automated methodology is to systematise change propagation and deployment. Why perform mundane tasks, such as table modifications, that lend themselves so well to automation? Why, in effect, re-invent the wheel every time a deployment is due? The point is even sharper if you need to deploy to several environments, possibly comprising hundreds of databases, on the way to a production release.

An automated methodology can deploy in a totally controlled, repeatable and reliable manner. Ultimately, elimination of manual tasks will:

  • minimise the cost of development projects
  • eradicate errors
  • reduce the length of the development life cycle
  • reduce anxiety within the project team

2. Audit Trail and Version Control

It is crucial to have the ability to label (snapshot) database code and thereby provide an audit trail and versioning of your database. This provides reliable information on how, when and by whom the database code has changed. It also paves the way for the painless maintenance of static data, which can also break a database when data is lost, overwritten or out of synch. An automated approach with version control enables you to leverage the benefits of your VCS for your database repositories.

3. Sarbanes-Oxley Compliance

In the wake of recent high-profile financial scandals, companies have become more accountable for the integrity of their financial records and practices. New legislation, such as the Sarbanes-Oxley Act (SOX) in the United States, affects the IT departments whose job it is to store a corporation’s electronic records. To satisfy the requirements of SOX, an IT department needs the ability to create and maintain a corporate record archive in a cost-effective manner. An automated methodology for database change will help make your systems auditable and assist efforts to meet corporate governance requirements.

4. Controlled Rollback

A critical attribute of an automated process is the ability to back out an upgrade in the event of failure. Rolling back deployments manually can become extremely complex: in effect, you have to craft a change script for the failed change script. Things can get nasty if you need to re-instate data that was deleted by the upgrade, whether validly or invalidly. If a critical error occurs in your deployment, automation will ensure a straightforward and controlled rollback.

5. Benefits for ISVs

If you are an ISV (Independent Software Vendor), you may distribute a product to your customers that relies on an underlying database. Releasing a new version of your software may also require an update to this database. How do you confidently distribute these changes? How do you know with certainty what version of the database each customer has, and in what condition? What if the customer has made unknown changes or customisations? Creating an install package for your latest release can be daunting.

An alternative approach is to include with your release package the required version of the base scripts, together with an application that interrogates the customer’s current database and determines how it differs from the new version. Changes to the database can then be deployed automatically, “on-the-fly”, without having to manually track the version and state of each customer’s database.

6. Parallel Code Development

With the pressure of tight deadlines on development projects, one strategy is to “split” the code base and assign teams to work in parallel on separate pieces of functionality. If developers are amending two or more code bases, re-integrating them can be tricky at the best of times. With an automated approach and integration with a version control system (VCS), you label your code hourly (or daily) and can test against that baseline. With baselines in place, it is easier to analyse the code and perform a merge, knowing beforehand that all the code on either side is syntactically correct.

7. Improved Productivity

The individuals responsible for managing database change control, whether they are the DBA(s), developer(s) or change controller(s), have a crucial role as gatekeepers and custodians of a valuable asset. Throughout the database development project, the DBA(s) must liaise between development and operations to ensure the smooth deployment of upgrades. This is a delicate balance between deploying functionality swiftly (to keep a project on schedule and minimise hold-ups) and guaranteeing the integrity of the database (by verifying all proposed changes).

Introducing efficiencies to the process of database change will empower these people with greater flexibility and improved productivity. The DBA(s) would then have the ability to propagate a change far more quickly and with a greater standard of quality compared to using manual methods. Consequently, they would have more time to focus on proactive planning, integrity and performance issues. In fact, the effect of an improved change control approach can be felt throughout the organisation and even extend to the customer.

8. Accurate Project Estimation & Planning

It is unfortunately far too common for IT projects to run severely over budget, with completion deadlines continually postponed. Attempts to hit deadlines by allocating additional resources often exacerbate the problem of escalating project costs. One of the main reasons for this is the myriad of complex technical issues and unforeseen critical errors that can occur. A degree of contingency can be allocated to combat these uncertainties, but reducing development errors is a topic that remains high on the project manager’s agenda.

So another key strength of an automated approach is that it eradicates many of the unforeseen technical issues that manifest throughout the build and deploy phases of the application development lifecycle. The task of estimating project length and allocating appropriate resources becomes far easier and more accurate, as many of the “unknowns” are removed: the tasks of generating, building, verifying and deploying changes are automated, thereby minimising errors and avoiding unexpected consequences of changing database objects.

Implementation Considerations

For the vast majority of organisations, the implementation of an automated database change management approach would require very little effort. This is because most of the pre-requisite components for an automated approach are already available within the organisation itself. It is merely a case of using what is there in a different manner. In addition, such an approach should provide the following attributes resulting in a smooth implementation:
  • low overhead to install and configure
  • quick to learn and productive within a few minutes
  • requires little or no staff training
  • no additional hardware requirements
However, depending on the organisation’s culture towards change acceptance, the following will need to occur in varying degrees of depth (dependent upon the requirements of the organisation). This will facilitate internal recognition that an investment of time and money should be made in the adoption of a more efficient and effective process:
  • undertake a detailed technical evaluation of the enabling software via a pilot study/trial run in order to equip the evaluator with sufficient evidence to eradicate misgivings concerning acceptance of the new methodology
  • demonstrate that the current processes are outdated and inefficient in comparison
  • produce evidence that a speedy and notable return on the initial investment will be achieved
  • establish an effective communication strategy targeted towards those that are likely to benefit from and utilise the “new way of doing things” so as to ease implementation and gain acceptance
As in any business, in order to justify the implementation of change, an evaluation must analyse the return on investment alongside the cost of continuing to practise an obsolete methodology. Only then can the “opportunity cost” be calculated, which can in itself be the deciding factor in moving forward with a new process. If a decision is made to “leave it for now”, that decision incurs an opportunity cost. What is this cost? Is it more than a monetary value?

By undertaking these exercises, evidence can be communicated and confidence generated amongst the users (and investors), paving the way for a seamless implementation of a new approach to database change management.

Conclusion

The complexity of database systems is on the rise, and IT departments face increasing pressure to deliver functionality and timely, accurate information to secure an organisation’s competitive advantage. Additional responsibilities arise from legislation such as the Sarbanes-Oxley Act. Companies must assess their current approach to database development and be receptive to a better way of doing things. Failure to embrace improved methodologies can only waste valuable technical resources and prove extremely costly. Your IT department cannot continue to endure database chaos, as it directly affects a corporation’s bottom line. The ramifications of an inefficient methodology extend to lost business opportunities, higher operating costs, second-rate product and service quality, poor public relations, high staff turnover and legal liabilities.

This series of articles has outlined a new approach to database change management, one that is increasingly viewed as best practice, and there is no reason why it should not become a de facto industry standard. Organisations are strongly encouraged to adopt it, as it provides an audit trail and is both reversible and repeatable. This is achieved through integration with a version control system, hourly compilation (and hence verification) of database objects, automated generation and propagation of changes, and controlled deployment of upgrades. Such an approach can reduce database development project phases from weeks to days, free up expensive technical resources from mundane and time-consuming tasks (so they can concentrate on deploying their knowledge and skills more effectively and efficiently) and virtually guarantee the integrity of your database code.


Darren Fuller began his IT career in 1990. Much of his experience has been gained by providing database consultancy on mission-critical systems and development projects within large organisations, often within heterogeneous environments using multiple technologies. Darren has specialised in database technologies with a particular interest in troubleshooting, optimisation and tuning, especially with Microsoft SQL Server (since 1994). He has a passion for ensuring a database system is finely tuned and the processes within a development project help to achieve this aim in an efficient and cost-effective manner.
 
Darren holds a Bachelor of Business in Computing and a Microsoft MCSE certification. He welcomes any feedback and can be contacted at [email protected].
