A Software Engineering Process for the Solo Developer

The solo developer is faced with all the problems encountered by other development organizations but cannot bring the same resources to bear on these problems. Unlike other developers, the solo developer is responsible for the entire development cycle and cannot enjoy the luxury of specialization.

Everybody has strengths and weaknesses. Some things are fun to do. Others are very hard. The solo developer's process must take advantage of the activities that come easily and of those that are an enjoyable challenge, while ensuring that the other tasks get done without too much pain. What is easy and what is hard varies from individual to individual.

The solo developer needs a toolkit and a set of skills that help to control the development process. The toolkit will include off-the-shelf tools, such as compilers, editors, and database systems. The developer must also develop his or her own tools, such as a time and task recording program and an application framework. The skills include analysis, database and program design, and project management, as well as programming.

The aim of the process is to bring the tools and skills to bear on the project’s challenge, and to bring the project to a successful conclusion. The process aims to improve itself with time.

At any time, the projects that can be tackled by the solo developer are limited by the toolkit, the skills and the experience of the developer, as well as by limitations in the process. Each completed project must be reviewed with a view to improving the three resources of the developer. The project will often have required new skills, which may have to be developed further. New tools may have been used, which may be included in the basic toolkit. Other tools may be needed for compatibility with the existing tools. Every project I have worked on has required some innovation. Decide which innovations can be included in the framework. Look at the things that did not go as planned, and think of improved procedures that could have anticipated the problems. Review the estimates, break down complex tasks, and adjust the time estimates. Time recording is essential for accurate estimation.

Up Front

A project starts with a customer requirement. It is essential that the requirement is well understood. Talk to all the interested parties. Your customer may be meeting a requirement of his own customer. If so, arrange a meeting, and understand those requirements. Understand the customer’s additional requirements. Talk to the customer’s staff involved with this area of the business. Ask for war stories. Ask for things that have gone wrong before. Ask what happens over an extended period. Are there legal requirements? Are there financial requirements? Are there personnel requirements? Are there weekly, monthly or annual requirements? What is the scope of the application? Who will be using it? Is it a desktop, a client-server, or a distributed application? If distributed, will it have local servers, or will it be online? What is the client platform: Windows or Web?

If the customer does not know, make suggestions. Lay down requirements. Be prepared to increase or reduce the scope.

Put this in writing, and confirm your understanding with the interested parties. Be conscious of security issues. The parties may tell you things that they do not want others to know. Be trustworthy. If need be, prepare separate reports.

Once the requirement wish list is in place, get your customer to select and prioritize the most valuable items. Emphasize that you are a small operation, and want to focus on features that can be delivered in a few weeks. Also emphasize that you can reasonably estimate the cost of small tasks, and that both of you have an interest in controlling the risk. While you may not be able to estimate the total cost of the project, the cost will be under your customer’s control. You get paid as you deliver, and your customer has control over the rate of expenditure, and can stop development if cash is tight. This is stressful for you, but that is the life of the solo developer.

With a small list of requirements, list the tasks needed to meet these requirements. Break down the tasks until they are specific enough for you to know the effort involved. I find two tools useful here: a checklist of tasks built up from previous projects, and time estimates for each of these tasks, also derived from previous projects. If the time is more than an hour, break the task down into subtasks. I like to prepare the list as nested bullet points, but use whatever tool works for you.

I find that I always have to do things that I did not anticipate, so I use the experience of previous projects to adjust the time upwards. At first it is a good idea to use a factor of three or four. Without experience in the form of a checklist, it is easy to overlook important tasks, such as installing the application, setting up the database at the client, and designing your menu hierarchy and look and feel. The task estimates themselves also tend to be overoptimistic. If the cost of the project is a concern, remember that it is better to start with a reasonable estimate and negotiate down, possibly cutting features, than to start with an underestimate and try to negotiate up, which puts you in the weaker position.

The task list becomes your most important working document. It shows your progress, and it is easy to record the time taken next to each task. Add any unanticipated tasks to the list. At the end of the project, you will be able to adjust your checklist and estimates from this information. I like to print out the list, and write on it. This makes any discrepancies obvious.
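A minimal sketch of such a task list in Python follows. The `Task` class, the sample tasks, the one-hour limit and the contingency factor of three are illustrative assumptions taken from the description above, not the time and task recording program mentioned earlier.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    """One line of the task list; only leaf tasks carry an estimate."""
    name: str
    estimate_hours: float = 0.0            # meaningful on leaf tasks only
    actual_hours: Optional[float] = None   # recorded once the task is done
    subtasks: List["Task"] = field(default_factory=list)

    def leaves(self):
        """Yield the leaf tasks, which are the ones that get estimated."""
        if not self.subtasks:
            yield self
        else:
            for sub in self.subtasks:
                yield from sub.leaves()

    def raw_estimate(self) -> float:
        """Sum of the leaf estimates, before any contingency."""
        return sum(t.estimate_hours for t in self.leaves())

    def too_big(self, limit_hours: float = 1.0):
        """Names of leaf tasks that should be broken down further."""
        return [t.name for t in self.leaves() if t.estimate_hours > limit_hours]


def quoted_hours(task: Task, contingency: float = 3.0) -> float:
    """Raw estimate scaled up to allow for unanticipated work."""
    return task.raw_estimate() * contingency


def observed_factor(task: Task) -> Optional[float]:
    """Recorded hours divided by estimated hours, over completed tasks."""
    done = [t for t in task.leaves() if t.actual_hours is not None]
    estimated = sum(t.estimate_hours for t in done)
    if estimated == 0:
        return None
    return sum(t.actual_hours for t in done) / estimated


# Hypothetical project used only to exercise the functions above.
project = Task("Customer maintenance", subtasks=[
    Task("Design customer table", 1.0, actual_hours=1.5),
    Task("Capture screen", subtasks=[
        Task("Lay out form", 0.5, actual_hours=1.0),
        Task("Validation", 2.0),           # over an hour: needs splitting
    ]),
    Task("Install at client", 1.0),
])

print(project.raw_estimate())    # 4.5
print(quoted_hours(project))     # 13.5
print(project.too_big())         # ['Validation']
print(observed_factor(project))  # ratio to feed back into the next estimate
```

The observed factor is the figure to carry into the next project's checklist: it replaces the initial guess of three or four with a number derived from recorded time.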


Programming is what it is all about. The solo developer with a professional attitude will tackle the programming task in a systematic way. Planning is essential. There are two aspects of planning: the what and the how. The how is part of the software engineering process, and I will discuss it in a moment. The what requires some design.

As a business system developer, I like to start with a database design. The database must support all of the tasks that I have identified. Normalization requires that I factor out code tables. I need to add the maintenance of this data to the task list. This is one reason I need to adjust the time estimates upward.

If there is legacy data, I want to migrate it as soon as possible, in fact as soon as the database has been designed. Legacy data is usually dirty, and requires some effort to clean up. It can highlight unanticipated requirements, which are best dealt with early. It can also conceal some nasty surprises. I don’t know about you, but I want to know about these as soon as possible. I also prefer testing my code with real data, because the developer is always the worst person to prepare test data. A few megabytes of real data will also make performance problems painfully obvious to the person best placed to deal with them: you, the developer.

The solo developer is forced to be a low-price producer. It helps to be a low-cost producer. This can be achieved in a number of ways. The first is to have the basic architecture in place. You can then focus on the project’s specific requirements without re-inventing the wheel every time. I use the time between projects to revamp my standard architecture. I fix most problems as I go along, but some require care and cannot be dealt with when under pressure to deliver. I usually deal with scaling problems in this way. Scaling problems are those that are already present in ‘toy’ programs with a small number of functions, but become apparent only when the program has grown. I like to design my framework to work in a way that avoids these problems. The solution may be overkill in small programs, but I live with that.

Here are some examples of scaling problems. RAD programming encourages the use of tabbed notebook controls to cram as much functionality as possible onto the same form. The form works like a wizard applet, with a subset of the controls visible at any time. Putting all these controls onto the same form leads to longer names for each control to avoid name clashes. Switching from one page to another at development time is a big nuisance. The code associated with the form becomes huge and difficult to maintain. There is little opportunity for code reuse, and I find that the user interface code and the processing code become hopelessly entangled. Things become worse if database tables and queries are also included on the form.

Data-aware controls are a useful way to build applications very quickly. With the appropriate tables and controls, the system does most of the work. I have a number of peeves with data-aware controls. The first is that they don’t always behave as I want them to. For example, I like to allow the user to capture all the data for a transaction, and then to either commit or cancel it. Keeping a database transaction open while the user captures the data is fine for a desktop application, but is not a good idea for a client-server application, and is a disaster for a distributed application.
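The capture-then-commit behaviour can be sketched as follows, assuming Python's standard `sqlite3` module. The `PendingInvoice` class and the `invoice` and `invoice_line` tables are hypothetical. The captured data lives in memory, and the database transaction is opened only for the moment it takes to write it.

```python
import sqlite3


class PendingInvoice:
    """Holds captured data in memory; nothing touches the database yet."""

    def __init__(self, customer):
        self.customer = customer
        self.lines = []                      # (description, amount) tuples

    def add_line(self, description, amount):
        self.lines.append((description, amount))

    def cancel(self):
        """Cancelling only discards memory, so no rollback is needed."""
        self.lines.clear()

    def commit(self, conn):
        """Write everything in one short transaction; return the new id."""
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute(
                "INSERT INTO invoice (customer) VALUES (?)", (self.customer,))
            invoice_id = cur.lastrowid
            conn.executemany(
                "INSERT INTO invoice_line (invoice_id, seq, description, amount)"
                " VALUES (?, ?, ?, ?)",
                [(invoice_id, seq, d, a)
                 for seq, (d, a) in enumerate(self.lines, start=1)])
        return invoice_id


# Hypothetical schema and data, used only to exercise the class.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    CREATE TABLE invoice_line (
        invoice_id  INTEGER NOT NULL REFERENCES invoice(id),
        seq         INTEGER NOT NULL,
        description TEXT NOT NULL,
        amount      REAL NOT NULL,
        PRIMARY KEY (invoice_id, seq));
""")

pending = PendingInvoice("Acme")
pending.add_line("Consulting", 100.0)
pending.add_line("Travel", 25.0)
invoice_id = pending.commit(conn)            # the only database round trip

abandoned = PendingInvoice("Other")
abandoned.add_line("Mistake", 1.0)
abandoned.cancel()                           # database never sees this one
```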

The framework enforces a design that avoids these problems. My current framework makes use of child forms that are placed in a parent form, which can in turn be a child on another form. This allows me to put the fields for each table on their own form, together with the validation and other user interface code. The processing takes place in another program file. This clearly separates the user interface code from the database access code, and also limits the functionality implemented by each program file. The whole design is much cleaner than the RAD code that I wrote a few years ago. I gain in performance because the forms and their controls are created only when they are needed.

My current project has highlighted a number of rough edges in my framework, but those can be filed down after this job is done.

On any project I am always faced with a lot of repetitive coding. Doing data capture screens is tedious and error-prone. I have built a code generator that inspects my database, looking at things like primary keys, referential integrity, and unique constraints, and generates all the data capture screens, with all the controls, display and validation code. The appearance is acceptable, but I still have to make some adjustments. I must still do the navigation between forms by hand, but I plan to generate this too in the future. In most team projects, this work is given to the most junior programmers. I don’t have that luxury, so I use the generator to gain economies of scale, and to drive down my costs.
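The idea behind such a generator can be sketched with SQLite's schema pragmas, assuming Python's standard `sqlite3` module. This is not the generator described above. The function names, the sample tables and the plain-text form definition it emits are all hypothetical stand-ins for real screen code.

```python
import sqlite3


def describe_form(conn, table):
    """Return one field spec per column, derived from the schema alone.

    Table names are interpolated directly, so they must come from your
    own schema and never from user input.
    """
    # Foreign key columns become lookups into the referenced table.
    lookups = {row[3]: row[2]
               for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
    # Columns with their own unique index need a duplicate check.
    unique = set()
    for idx in conn.execute(f"PRAGMA index_list({table})").fetchall():
        if idx[2]:  # third column of index_list is the "unique" flag
            cols = [r[2] for r in
                    conn.execute(f"PRAGMA index_info({idx[1]})")]
            if len(cols) == 1:
                unique.add(cols[0])
    fields = []
    for _cid, name, _type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({table})"):
        if pk:
            control = "hidden"               # the key is never edited
        elif name in lookups:
            control = "lookup:" + lookups[name]
        else:
            control = "edit"
        fields.append({"name": name, "control": control,
                       "required": bool(notnull) and not pk,
                       "unique": name in unique})
    return fields


def generate_form_source(table, fields):
    """Emit a plain-text form definition; a real generator would emit code."""
    lines = [f"form {table}"]
    for f in fields:
        flags = [k for k in ("required", "unique") if f[k]]
        lines.append(f"  {f['control']} {f['name']} {' '.join(flags)}".rstrip())
    return "\n".join(lines)


# Hypothetical schema used only to exercise the generator.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL UNIQUE);
    CREATE TABLE customer (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL UNIQUE,
        country_id  INTEGER NOT NULL REFERENCES country(id),
        phone       TEXT);
""")

form_source = generate_form_source("customer", describe_form(conn, "customer"))
print(form_source)
```

Everything in the output comes from the schema: the required flag from NOT NULL, the duplicate check from the unique constraint, and the lookup from the foreign key.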

For this scheme to work, I must be able to identify each database record with an integer field, as well as a description field. This constrains my database design. I use surrogate keys to identify my principal objects, even when a natural key exists. Minor objects are identified by the owner object’s surrogate key, together with a sequence number. Join records identify all the master objects in this way. I have ongoing arguments with database designers on this issue, but I have a number of reasons for liking this approach. From the database design point of view, each object has a unique identifier that will never change. I consider any user-specified data to be subject to change over time. I also like to keep my relational keys as compact as possible. Most databases work much faster with compact keys. From the programming point of view, I benefit from a uniform database design where I can identify a record by means of a single integer together with its owners, which can be identified in the same way.
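A minimal sketch of this key scheme, assuming SQLite through Python's standard `sqlite3` module. The table and column names, including `tax_number` as the natural key, and the `next_seq` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Principal objects: an integer surrogate key plus a description.
    -- The natural key (tax_number) stays unique but is not the identifier.
    CREATE TABLE customer (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        tax_number  TEXT NOT NULL UNIQUE);
    CREATE TABLE product (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL);
    -- Minor object: the owner's surrogate key plus a sequence number.
    CREATE TABLE customer_address (
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        seq         INTEGER NOT NULL,
        description TEXT NOT NULL,
        PRIMARY KEY (customer_id, seq));
    -- Join record: identified by the surrogate keys of all its masters.
    CREATE TABLE customer_product (
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        product_id  INTEGER NOT NULL REFERENCES product(id),
        PRIMARY KEY (customer_id, product_id));
""")


def next_seq(conn, customer_id):
    """Next sequence number for a minor object under the given owner."""
    row = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM customer_address"
        " WHERE customer_id = ?", (customer_id,)).fetchone()
    return row[0]


cur = conn.execute(
    "INSERT INTO customer (description, tax_number) VALUES (?, ?)",
    ("Acme Ltd", "TAX-001"))
customer_id = cur.lastrowid

for place in ("Head office", "Warehouse"):
    conn.execute("INSERT INTO customer_address VALUES (?, ?, ?)",
                 (customer_id, next_seq(conn, customer_id), place))

# The user-specified natural key changes; no child row has to be touched.
conn.execute("UPDATE customer SET tax_number = ? WHERE id = ?",
             ("TAX-999", customer_id))

addresses = conn.execute(
    "SELECT seq, description FROM customer_address"
    " WHERE customer_id = ? ORDER BY seq", (customer_id,)).fetchall()
print(addresses)   # [(1, 'Head office'), (2, 'Warehouse')]
```

Had `tax_number` been the primary key, the update would have had to cascade into every child and join table that referenced it.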

I also like to specify the data constraints in my database schema as an additional safeguard against programming errors.
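As a small illustration, assuming SQLite through Python's standard `sqlite3` module, the schema below rejects rows that a buggy program might try to write. The `invoice_line` table, its rules and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoice_line (
        invoice_id INTEGER NOT NULL,
        seq        INTEGER NOT NULL CHECK (seq > 0),
        quantity   INTEGER NOT NULL CHECK (quantity > 0),
        unit_price REAL    NOT NULL CHECK (unit_price >= 0),
        PRIMARY KEY (invoice_id, seq)
    )
""")


def try_insert(row):
    """Attempt one insert and report whether the schema accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO invoice_line VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


print(try_insert((1, 1, 2, 9.50)))    # valid row
print(try_insert((1, 2, 0, 9.50)))    # zero quantity violates a CHECK
print(try_insert((1, 1, 5, 9.50)))    # duplicate key violates PRIMARY KEY
print(try_insert((1, 3, 1, None)))    # missing price violates NOT NULL
```

The validation code in the capture screens should catch all of these first. The constraints are there for the day it does not.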

I have recently adopted the Extreme Programming unit testing regime. I often develop new code in my test bench, and then fit it into the main program. I am slowly building tests into my older projects. I usually undertake to correct, at my expense, any errors found in my new code during the first month after going live. I cannot afford to do this too often. I find that I spend at least a day after implementation fixing things. I want to get to the point where it works first time, every time. I expect to get there. Now that I am gaining experience with this way of working, I am beginning to believe the hype around code refactoring with frequent tests.
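For readers new to this way of working, here is a minimal sketch using Python's standard `unittest` module. The `normalize_code` function and its rules are hypothetical, standing in for any small piece of new code developed in a test bench before it is fitted into the main program.

```python
import unittest


def normalize_code(raw):
    """Trim and upper-case a user-entered code; reject empty input."""
    code = raw.strip().upper()
    if not code:
        raise ValueError("code must not be empty")
    return code


class NormalizeCodeTest(unittest.TestCase):
    def test_trims_and_uppercases(self):
        self.assertEqual(normalize_code("  za "), "ZA")

    def test_leaves_clean_code_alone(self):
        self.assertEqual(normalize_code("ZA"), "ZA")

    def test_rejects_blank(self):
        with self.assertRaises(ValueError):
            normalize_code("   ")


# Run the suite explicitly so it can be called from a test bench
# without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeCodeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once the tests exist, the function can be restructured freely: rerunning the suite after each change shows at once whether the behaviour has been preserved.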