Here, we’ll detail the final five pitfalls…
(6) Lack of Platform Precision

Modern information systems nearly always rely on a fairly sophisticated “platform” of hardware and software to provide standard services such as a runtime environment, data storage and application security. These platforms have become much more complex in recent years: rather than two or three products (say, a compiler, an operating system and a database), today’s platforms often comprise ten or more (operating systems, databases, application servers, application frameworks, virtual machines, cluster services, security services and so on).
This increased complexity inevitably brings a much greater chance of incompatibility between the various components: there is more to go wrong, and more chance that you are using a particular combination for the first time. It’s therefore no longer sufficient to say that you “need Unix and Oracle” when specifying your platform. You need to be precise about the specific versions and configurations of each component in order to ensure that you get what you need. This lets you avoid the situation where you can’t deploy your system because someone has helpfully upgraded a library for one part of the platform without realising that something else will no longer work.
Work out what you need from your runtime platform as early as possible and make sure that you’re precise about the versions and configurations of its components.
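One way to make that precision concrete is to record the required platform as pinned versions and check a target environment against it before deploying. The sketch below is purely illustrative: the component names and version strings are invented, and a real check would query the environment rather than take a hand-written dictionary.

```python
# Hypothetical sketch: the platform a system needs, as exact pinned
# versions, plus a check of a target environment against it.
# All component names and versions here are invented for illustration.

REQUIRED_PLATFORM = {
    "os": "RHEL 8.6",
    "jvm": "OpenJDK 11.0.19",
    "database": "PostgreSQL 14.8",
    "app_server": "Tomcat 9.0.76",
}

def check_platform(actual):
    """Return a list of mismatches between required and actual platform."""
    problems = []
    for component, required in REQUIRED_PLATFORM.items():
        found = actual.get(component)
        if found is None:
            problems.append(f"{component}: missing (need {required})")
        elif found != required:
            problems.append(f"{component}: found {found}, need {required}")
    return problems

# A 'helpful' upgrade of one component is caught before deployment:
environment = {
    "os": "RHEL 8.6",
    "jvm": "OpenJDK 17.0.2",   # upgraded without checking compatibility
    "database": "PostgreSQL 14.8",
    "app_server": "Tomcat 9.0.76",
}
issues = check_platform(environment)
```

Even a simple pinned list like this, kept under version control alongside the system, gives you something concrete to review whenever any part of the platform changes.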
(7) Making Performance and Scalability Assumptions

An experience that many software architects will relate to is that of surprises arising during system build and test. This is probably never more true than for performance and scalability – if nothing else because there are so many things that can go wrong!
Particularly when using new technology, it is hard to get a good feel for performance and scalability characteristics without a great deal of experience, and it’s easy to assume that everything will be fine, only to find out rather late in the day that performance or scalability is sensitive to some factor you haven’t considered.
The only real solution to these problems is constant vigilance and assuming nothing! A little performance and scalability paranoia will help you to keep considering these factors throughout the project and to keep challenging your assumptions. Start considering performance and scalability early: create performance models to predict key performance metrics and spot bottlenecks, and get stuck into practical proof-of-concept work as your design ideas are forming. This will all help to increase confidence that there aren’t any performance and scalability demons lurking in your design.
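As an illustration of the kind of simple performance model suggested above, here is a back-of-envelope sketch. All of the workload and service-time figures are assumed, and the M/M/1 queueing formula is a deliberate simplification; the point is that even crude arithmetic can reveal a bottleneck long before build and test.

```python
# Back-of-envelope performance model (illustrative figures): estimate
# the utilisation of each tier from an expected request rate and a
# per-request service time, then identify the bottleneck tier.

ARRIVAL_RATE = 50.0  # requests per second (assumed workload)

# Mean service time per request, in seconds, per tier (assumed figures).
SERVICE_TIME = {
    "web": 0.004,
    "app": 0.012,
    "db": 0.018,
}

def utilisation(arrival_rate, service_time):
    """Fraction of time a single-server tier is busy: rho = lambda * s."""
    return arrival_rate * service_time

def mm1_response_time(arrival_rate, service_time):
    """M/M/1 mean response time: s / (1 - rho); infinite once saturated."""
    rho = utilisation(arrival_rate, service_time)
    if rho >= 1.0:
        return float("inf")
    return service_time / (1.0 - rho)

# The tier with the highest utilisation limits overall throughput.
bottleneck = max(SERVICE_TIME, key=lambda t: utilisation(ARRIVAL_RATE, SERVICE_TIME[t]))
```

Here the database tier runs at 90% utilisation, so its queueing delay dominates, and doubling the load would saturate it entirely: exactly the sort of insight worth having while the design is still on paper.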
(8) DIY Security

Another system quality that has kept many software architects awake at night is security. It is becoming ever more important as systems expose interfaces outside the organisation while simultaneously coming under increasing audit scrutiny due to regulatory requirements and corporate governance initiatives.
A mistake made in many systems over the years has been to try to add security into the system using “home brew” security technology. Be it custom encryption algorithms, a developer’s own auditing system or an entire DIY access control system, locally developed security solutions are rarely a good idea. While most of us think we could probably whip up a clever piece of security technology in no time, we’re usually wrong.
Like many complex things, security technology is considerably harder to build than it appears at first glance and mainstream security products are inevitably built by specialists with many years of experience in the field. Trying to create your own security mechanisms is likely to be time consuming and may well introduce subtle security vulnerabilities into your system that can be exploited by attackers if there is anything in your system that they really want.
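By contrast, a minimal sketch of leaning on vetted, standard primitives rather than home-brew ones: salted password hashing with PBKDF2 and a constant-time comparison, both from Python's standard library. The iteration count shown is an assumption you would tune for your own hardware and threat model, and for a real system you would still want expert review of the overall design.

```python
# Relying on standard, well-reviewed security primitives instead of a
# home-brew scheme: salted password hashing (PBKDF2-HMAC-SHA256) with a
# constant-time comparison. Iteration count is an assumption to tune.
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # cost factor; raise it as hardware gets faster

def hash_password(password):
    """Return (salt, derived_key) for storage; never store the password."""
    salt = secrets.token_bytes(16)  # unpredictable per-user salt
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, stored_key):
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # compare_digest avoids the timing side-channel that a naive '==' leaks
    return hmac.compare_digest(key, stored_key)

salt, key = hash_password("correct horse battery staple")
```

Every line of this relies on algorithms and implementations built and scrutinised by specialists, which is precisely what a DIY equivalent would lack.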
Of all of the system qualities, security is probably the one where it’s well worth getting some expert help to assess the security you need, the vulnerabilities that you may have and the mechanisms you should use to address them.
(9) No Disaster Recovery

Many systems reach production without major mishap and manage to run successfully there for years, without any significant interruption to service. Others aren’t so lucky, suffering major failures that require entire-system recovery a number of times during their operational life.
The problem when implementing disaster recovery (DR) during a project is often getting funding and attention for something that may never happen, at a time when there is already too much to do with the resources available. However, the obvious problem with not having DR is that serious infrastructure failures can happen at any time, and if your system is important to the organisation, its loss will have a serious impact on the business.
The key to getting resources to implement a DR mechanism for your system is to be specific and quantify the cost of system unavailability in a number of realistic scenarios. If you can also estimate the probability of the scenarios occurring then you can use these two figures to convince people that DR is important and to justify a certain level of budget to implement it.
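That calculation can be sketched very simply. Every probability and cost below is invented for illustration; the value of the exercise is in forcing the business to put real numbers against each scenario.

```python
# Sketch of the DR business case described above: for each realistic
# failure scenario, multiply its annual probability by the cost of the
# resulting outage to get an expected annual loss. Figures are invented.

SCENARIOS = [
    # (name, annual probability, outage hours, cost per hour of downtime)
    ("disk array failure",   0.10,  24, 20_000),
    ("data centre fire",     0.01, 120, 20_000),
    ("major network outage", 0.05,   8, 20_000),
]

def expected_annual_loss(scenarios):
    """Sum of probability * outage duration * hourly cost over scenarios."""
    return sum(prob * hours * cost_per_hour
               for _name, prob, hours, cost_per_hour in scenarios)

loss = expected_annual_loss(SCENARIOS)

# Compare the expected loss against the annual cost of running DR:
DR_ANNUAL_COST = 30_000
dr_justified = loss > DR_ANNUAL_COST
```

With these (invented) figures the expected annual loss is about 80,000, comfortably above the assumed DR cost, which is exactly the comparison that turns DR from a vague worry into a budget line.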
Finally, also remember to test your DR processes and mechanisms regularly. Experience shows that the DR exercises rarely work the first time and you don’t want to find the weaknesses in your design when you have a real failure to deal with!
(10) No Backout Plan

With the best will in the world, things go wrong. While the previous tip is a reminder to ensure production resilience, you also need to allow for disasters during deployment.
In an ideal world, deployment is always a smooth process, resulting in a system running as expected. Things do go wrong, though: configuration difficulties, unexpected environmental factors, undiscovered faults and, of course, simple human error. Make sure that, whatever happens during the deployment of your system or an upgrade, you have a documented, reviewed and agreed backout plan that allows you to restore the environment to its state before deployment began. At least then you can undo the damage and avoid total disaster if the worst happens!
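The backout principle can be sketched in a few lines. This is a hypothetical outline only: the snapshot and restore functions stand in for real mechanisms such as database dumps, configuration archives and versioned deployment artefacts.

```python
# Minimal sketch of snapshot-then-deploy-then-restore-on-failure.
# All names and steps are invented; real snapshot/restore mechanisms
# would be database dumps, config archives, artefact rollbacks, etc.

def deploy_with_backout(steps, snapshot, restore):
    """Run each deployment step in order; restore the snapshot on failure."""
    saved = snapshot()          # state captured *before* we touch anything
    try:
        for step in steps:
            step()
    except Exception:
        restore(saved)          # backout: return environment to saved state
        raise

# Illustrative usage with an in-memory 'environment':
env = {"app_version": "1.0"}

def snapshot():
    return dict(env)

def restore(saved):
    env.clear()
    env.update(saved)

def upgrade():
    env["app_version"] = "2.0"

def faulty_migration():
    raise RuntimeError("migration failed")

try:
    deploy_with_backout([upgrade, faulty_migration], snapshot, restore)
except RuntimeError:
    pass  # the deployment failed, but the environment was backed out
```

The essential design point is that the snapshot is taken before the first step runs, so even a failure in the very first action leaves you with a known-good state to return to.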
Conclusion

In these two short articles, I’ve shared some of the pitfalls that have caused many software architects to come a cropper in the past. Hopefully this will allow you to steer your projects clear of these potential icebergs, which could otherwise leave them holed below the waterline and doomed to join the wrecks that others can only learn from.
Further Reading

Some useful references on describing software architectures, and on the system qualities discussed in this series, include:
- The 4+1 View Model of Architecture, Philippe Kruchten, originally in IEEE Software November 1995.
- Software Systems Architecture: Working with Stakeholders using Viewpoints and Perspectives, Nick Rozanski and Eoin Woods.
- Large Scale Software Architecture, Jeff Garland and Richard Anthony.
- Applied Software Architecture, Christine Hofmeister, Robert Nord and Dilip Soni.
- Documenting Software Architectures: Views and Beyond, Paul Clements, Felix Bachmann, Len Bass, David Garlan, James Ivers, Reed Little, Robert Nord and Judith Stafford.
- Architecting Enterprise Solutions: Patterns for High-Capability Internet-based Systems, Paul Dyson and Andy Longshaw.
- Blueprints for High Availability, Evan Marcus and Hal Stern.
- Security Engineering: A Guide to Building Dependable Distributed Systems, Ross Anderson.
Eoin Woods is a software and enterprise architect at UBS Investment Bank. He is a member and Fellow of the International Association of Software Architects, and will be speaking at IASA’s IT Architects Regional Conference in San Diego, Oct. 15–16 2007. He is also co-author of the book Software Systems Architecture: Working With Stakeholders Using Viewpoints and Perspectives (published by Addison-Wesley).