Dashed Utopia Pt2
Here I want to give a synopsis of what my team at this San Diego startup did to create such an effective software development system. This happened years ago, using many of my own ideas coupled with a lot of best practices in software development, such as following Scrum and a lot of DevOps practices.
There’s a lot we didn’t do in terms of best practices as well. While I would have liked to have us use test-driven development, that simply wasn’t feasible with the code base we were working with, which had not been created with TDD in mind to begin with.
We did perform a lightweight version of continuous integration and development, and the quality assurance team followed well-documented test protocols based on solid specifications. Likewise, release notes were semi-automated, and it was the responsibility of the devs to provide and verify against their own basic test scenarios, shared with QA.
What had the largest impact, however, was a major refactoring project that I enacted. This project involved three major portions of the software:
- The backend administration portal.
- The content management portal.
- The database.
The first two were all about improving the user experience of the interfaces, in particular with the content management portal, since this was where the bulk of the work was being done by non-technical staff.
What it explicitly didn’t involve was the mobile client software, which was our main product; several different versions for each of our clients were already out in the wild. We had to make all of the backend changes while minimizing the risk that the changes would break our service to the client software.
I am very much a follower of the idea that the bulk of software development should actually be in the planning and not outright execution. A lot of my practices were heavily influenced by Steve McConnell’s software development books Code Complete and Rapid Development. It was from his writing and research that I saw extensive planning as offering the most value for the effort.
There were already variations of these tools being used; they had been written by a single backend developer who made most of the decisions about how the services were to look, act, and perform. He wrote the database schema and determined, in his best estimation, what the user experience should be like. However, he had no background in user interface and user experience design, so the interfaces turned out to be a hindrance to the non-technical people who had to make use of them (especially on the content management side of things).
His database skills were also apparently limited, or he was doing too many things for one person to manage, as the database was neither normalized nor indexed, both critical factors in ensuring it performs effectively and smoothly. The non-normalized state caused some very basic queries to be extremely slow, while the lack of indexes caused lookups in general to be slow.
[In case you’re wondering, normalization is the process of balancing out the tables of data so that repeated information is minimized and connections between tables are logical, consistent, and don’t overlap.]
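As a rough illustration of those two ideas, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are hypothetical and are not the schema we actually had; SQLite is only a convenient stand-in for whatever database engine you use. Client details live in one table, videos reference them by id, and an index on that reference lets the database look rows up instead of scanning the whole table.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny normalized schema in memory (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Each client's name is stored exactly once.
        CREATE TABLE clients (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Videos point at a client by id instead of repeating its details.
        CREATE TABLE videos (
            id        INTEGER PRIMARY KEY,
            client_id INTEGER NOT NULL REFERENCES clients(id),
            title     TEXT NOT NULL
        );
        -- Index the column used for lookups and joins.
        CREATE INDEX idx_videos_client ON videos(client_id);
        """
    )
    conn.execute("INSERT INTO clients (id, name) VALUES (1, 'Acme')")
    conn.executemany(
        "INSERT INTO videos (client_id, title) VALUES (?, ?)",
        [(1, "Intro"), (1, "Promo")],
    )
    return conn


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


def videos_for_client(conn: sqlite3.Connection, client_id: int) -> list:
    """Join the two tables back together when the full picture is needed."""
    return conn.execute(
        "SELECT c.name, v.title FROM videos AS v "
        "JOIN clients AS c ON c.id = v.client_id "
        "WHERE v.client_id = ? ORDER BY v.id",
        (client_id,),
    ).fetchall()
```

Calling `query_plan(conn, "SELECT title FROM videos WHERE client_id = ?", (1,))` is a quick way to check whether a lookup uses the index; dropping the index and running it again shows the difference.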
For the database portion, I tried a DBA contractor who turned out to be fantastic at assessing what needed to be done; I retained her for the length of the project to ensure that as we made changes, the database integrity was maintained or adapted appropriately.
For the user experience portion of the project, working on the two portals, I was the primary designer; at that point, I had worked with two UX design groups (one internal to a previous company and one external) and was the primary UI developer for several mobile products, so my experience was the most applicable.
I interviewed the content production staff multiple times as the project evolved. Their job was to produce videos and images that were to be shared across the mobile product. Their CMS portal, before the project was started, was one of the biggest sticking points to their productivity. The user experience was serial; for each upload of content or change in a text file, they had to wait for a lot of backend processing, which took time and meant one person was limited in what they could upload. As a result, we had hired many interns in order to parallel process the content management. Each intern had an account and was responsible for a certain batch of the content to be uploaded and text files to be created or changed.
My design focused on batch uploads and processing; the producers could start the upload process (canceling as needed) and move on to do other things while they waited — such as doing their main job of producing video and image content.
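The general shape of that design can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code we shipped (our backend was C#/.NET): `process_file`, `start_batch`, and the file names are hypothetical, and a thread pool stands in for the real upload and processing pipeline. The point is that queuing a batch returns right away, and a file that has not started processing can still be cancelled.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def process_file(name: str, backend_ready: threading.Event) -> str:
    """Stand-in for slow backend processing of one uploaded file."""
    backend_ready.wait(timeout=5)  # simulate waiting on the server
    return f"{name}: processed"


def start_batch(
    pool: ThreadPoolExecutor, names: list, backend_ready: threading.Event
) -> dict:
    """Queue every file at once and return immediately with its handle."""
    jobs: dict = {}
    for name in names:
        job: Future = pool.submit(process_file, name, backend_ready)
        jobs[name] = job
    return jobs


def run_demo() -> dict:
    backend_ready = threading.Event()
    # One worker keeps the demo predictable: later files stay queued
    # until the earlier ones finish.
    with ThreadPoolExecutor(max_workers=1) as pool:
        jobs = start_batch(
            pool, ["intro.mp4", "promo.mp4", "banner.png"], backend_ready
        )
        # The producer is free to do other work here. A queued file that
        # has not started processing can still be cancelled.
        jobs["banner.png"].cancel()
        backend_ready.set()  # let the simulated backend proceed
    # Leaving the "with" block waits for the remaining jobs to finish.
    return {
        name: "cancelled" if job.cancelled() else job.result()
        for name, job in jobs.items()
    }
```

In a real system the pool would have more workers and the status would be shown in the portal rather than collected at the end, but the contrast with the old serial flow is the same: nobody sits and waits on each file.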
A similar, less intensive, evaluation was done for the server administration, where permissions for file access and more granular content management were handled, in particular the authorization to push the content to the staging servers and eventually into production.
The first four months of what was planned to be a six-month project were almost entirely “planning.” I put that in quotes because a lot of people think that planning is a short-term activity. The reality is that a good project is always in a state of planning; it’s just the intensity of that planning that may vary over time.
In our case, the first four months were about interviewing, testing the existing UI, creating new test cases and UIs, making changes to the (test) database, fine-tuning, re-testing the proposed UIs, taking polls on user experience, getting more feedback from any and all users, and then repeating that over and over until we were confident that we had a system that would perform well. We wanted to improve the network performance and database queries just as much as we wanted to improve the user experience for the production team.
Along the way, we all made changes to our overall software development lifecycle, automating as many places as possible and documenting everything of note that we’d need later to sustain the level of productivity we were beginning to achieve.
That was the last major factor: measuring our throughput and output. We took baseline metrics of how long it currently took to make DB queries or a network connection, and how long it took for a producer to upload a batch of large video files, confirm on staging, and push to production. We also measured our ability to estimate how long our sprints would be, how long it took to release a full suite of applications, how many manual steps we had, how often bugs occurred after any code changes, and so on.
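For the timing side of those baselines, something as small as the following sketch is enough to get started. The helper names are my own for illustration, and the operation being timed is whatever you care about: a database query, a network call, or a scripted upload. It takes the median of several runs so one slow outlier does not skew the baseline, then reports the change against that baseline as a percentage.

```python
import time
from statistics import median
from typing import Callable


def time_operation(operation: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time, in seconds, over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return median(samples)


def percent_change(baseline: float, current: float) -> float:
    """Negative values mean the operation got faster than the baseline."""
    return (current - baseline) / baseline * 100.0
```

For example, a query that took 2.0 seconds before the refactoring and 0.5 seconds afterwards gives `percent_change(2.0, 0.5)`, which is -75.0, or 75 percent faster.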
And code was indeed being generated during this four-month planning phase. Don’t forget, we still had existing clients to support, new clients to onboard, new product updates to release, and an existing content management system that needed help. On a staff of eight people, only one was fully committed to the main project: our newest employee at that time, our second backend engineer, whose background in C#/.NET was perfect for what we needed. The rest of us had to balance our contributions to the project against our immediate obligations, and managing that balance was my job as both the director of software and the product owner.
As for the project itself, we still generated plenty of code during the planning phase. Usually it took the form of rapid prototyping, based on a lot of wireframes and conversations. As the UI/UX came together in prototype form, we came to better understand what the underlying code architecture was going to look like, giving us further confidence that we were on the right track. We ran weekly sprints and had to have a demonstration of what had been accomplished at the end of the week; usually this was to show where we were in the user interface so the production team could provide more refined feedback. Like the development team, they also had their regular work to do, which is one reason we had to spread the planning out for so long.
My software team and I were continually updating and tweaking the architecture: the big parts of the code, like frameworks and libraries. Asking whether a particular library was worth implementing and how it would be implemented helped deepen the overall planning, so that when we were finally confident that all of our performance targets could be met, we could concentrate on writing the code to actually meet those targets on the production servers. This took about two months in which we were all involved in a focused manner: backend, DevOps, client, and production all together.
All of this took time, management, and a lot of continuous planning.
At the start of the project, I looked at several project management software suites and landed on FogBugz as our PM tool of choice. FogBugz is made by the same folks who made Trello (before Trello got bought out by Atlassian) and was an ideal choice for us to keep track of all the tasks, features, and bugs that we had to complete, build, and fix throughout the lifetime of the project.
FogBugz was arguably the reason we became as effective as we did. Using a dedicated project management tool put us all on the same page with our development efforts. Everyone (developers, quality assurance, and the producers) made use of the software. We almost entirely eliminated email as a communication tool, and we used the software’s evidence-based scheduling to good effect in planning releases of not just the project but also our normal work. The software also helped us track our meeting agenda points and generated metrics for us to keep track of as we progressed further along. It also helped with onboarding new staff; because of all of the links and up-to-date documentation, anyone with basic training could come up to speed quickly. As a consequence, our staff in general could keep moving at speed; if they forgot something, it was as easy as checking back with the relevant tasks and linked documentation. I can say with certainty FogBugz was a key element leading to the success of the project.
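The general idea behind evidence-based scheduling is to compare past estimates with what the work actually took, then replay that history many times over the remaining estimates to get a range of likely totals instead of a single date. The sketch below is my own simplified illustration of that idea, not FogBugz's implementation; the function names, the fixed random seed, and the sample numbers in the usage note are all assumptions. Here a velocity is a past estimate divided by the actual time, so 0.5 means a task took twice as long as estimated.

```python
import random


def simulate_totals(
    estimates: list, past_velocities: list, trials: int = 1000, seed: int = 0
) -> list:
    """Simulate total hours for the remaining tasks, sorted ascending.

    Each trial divides every estimate by a velocity drawn at random from
    history, which turns an estimate into a plausible actual duration.
    """
    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    totals = []
    for _ in range(trials):
        total = sum(est / rng.choice(past_velocities) for est in estimates)
        totals.append(total)
    totals.sort()
    return totals


def percentile(sorted_totals: list, fraction: float) -> float:
    """Pick the value below which roughly `fraction` of trials fall."""
    index = min(len(sorted_totals) - 1, int(fraction * len(sorted_totals)))
    return sorted_totals[index]
```

With hypothetical estimates of 4 and 6 hours and a history of velocities like `[0.5, 0.8, 1.0]`, `percentile(totals, 0.5)` gives a middle-of-the-road total and `percentile(totals, 0.95)` gives one to plan around if you want to be fairly safe.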
The end result, after six months of relatively intense work (I forbade working on weekends or more than ten hours in a day, hours which we needed less and less of as the project progressed), was not just a far superior product; we had also saved everyone a ton of time, time that the company could use to create more interesting products, bring on more clients, and grow to profitability.
During the project and our normal work, and obviously continuing afterwards, we had a lot of fun. We took breaks before we thought we might burn out, we took lunches, we went on vacations (well, the rest of the team did; I was content with watching the progress we were making), and we played a lot of foosball.
Once the project was done, though, we took two huge burdens off of our backs. One was the refactoring/updating project itself, of course. But the whole purpose of the project was to eliminate the other: the burden that the old system had been imposing on all of us. Our maintenance costs dropped because we had a cleaner implementation and a better-tested system. Our database, network, and client software connections all improved considerably, which in turn improved the end user experience; the major consequence was that users didn’t have to wait as long for content to load.
But the biggest relief came for the production team. At our peak during the old system, we had a half dozen or more interns doing parallel work to upload content. Afterwards, because content could be batched, the effort was split among our two executive producers and a single intern whose main job was to help with production tasks and do video updates, not babysit uploads.
The results of all of our efforts led us to a more relaxed and more productive system, something I had wished for but never could see in previous companies full of frantic, fire-fighting, just-in-time, incomplete, over-complicating, opaque mindsets.
And so ends this part of the story. Part 3, and the final part of this story, but not the end of “the book” will come soon. More details (as they become available) here: https://kecskes.net/projectshare.
This story is part of a book I attempted to write in regards to my experiences in project operations and software development. Instead of just sitting on the 41K words I’ve written (not including the probably 100K words of just brainstorming), I thought I’d share fragments instead. #projectshare.
Part 1 of this particular story is here: https://medium.com/@adam.kecskes/a-dashed-utopia-of-project-operations-3581ec60b4bc and part 3 is here: https://medium.com/@adam.kecskes/dashed-utopia-pt3-of-3-cf9ec8454dae