We need to allow different levels of agility in the same organization

Recently I heard a keynote at the ESE Congress where a famous and experienced speaker claimed that Agile is not appropriate for embedded systems and that CMMI should be preferred instead. I remembered a very big company I once worked for. We successively raised our CMMI level – but nothing, really not one single thing, changed in the way we developers programmed, designed or wrote the specs. CMMI can be just a fake – like everything that's based on paper.

I'm tired of the discussion about whether Agile should be used or not used at all. This is too dogmatic; it is not a yes-or-no topic. For me the key concept of Agile has historically been iterative development and the absence of an a-priori fixed feature set – no big-bang releases and no BDUF (big design up front). Robert C. Martin describes in his famous book 'Agile Software Development' why the formerly popular analogy that writing source code is comparable to building a bridge – and would therefore need BDUF – does not hold. Instead he introduces the model of the compiler being the one that 'builds the bridge', which is more appropriate, and I totally agree with him. In order to understand where Agile comes from, one has to understand this change in the point of view.

Based on this model, where building software is different from building a bridge because of the vastly different effort for the build step, I want to define two opposing poles: On one pole are projects where the process of building is tremendously expensive and not even one additional iteration is possible, which leads to an a-priori fixed feature set. On the other pole many iterations are possible, and an a-priori feature freeze, which would make the project clumsy and resistant to change, is counterproductive. Embedded projects combine components that lie between both poles. PCB, mechanics, software … every discipline has a different cost/time/effort for the process of 'building', which makes it necessary to adjust the level of agility individually for each discipline.

Examples:

  • Bridge: Not iterative at all; features are fixed from the very beginning.
  • Formed parts: A few changes are acceptable; some adjustments can be made from one prototype to another. Not all, but most 'features' have to be defined a-priori.
  • Rapidly prototyped parts (3D printer): Several iterations are possible; features tend to be changeable. Only the subset of features related to the interaction with other parts has to be frozen at the project start.
  • Software: Very iterative; the compiler can build the software in a few minutes' time. Only the basic plot has to be known at the project's beginning. Changing the feature set is easy at every project stage.

Therefore the discussion about using 'Agile or traditional' in the whole organization leads in the wrong direction. Instead: Be as agile as possible and as traditional as necessary in every particular discipline. This doesn't make things less complicated at the interfaces between the particular departments/disciplines, of course, and it demands more communication. E.g. some software features will be mandatory because of mandatory hardware features. I suggest that in this case the department that is more agile (usually software) accepts that it has to sacrifice some of its own agility in favor of the less agile department (usually hardware). And vice versa, that the hardware people take into account that they have to split up the software features they require into mandatory and optional ones.

Embedded Software Architecture Trends in 2015

I wish everyone a happy new year, and I wonder what might be the biggest change in embedded software development next year. Of course Linux will continue its triumphal march on modern ARM-driven systems. It's really time to move to Linux if you haven't done so yet. A key moment for me in realizing the impact of embedded Linux was when a Lauterbach senior sales representative looked at me with pity when I told him that we're not using Linux. He assumed that everyone uses Linux (on the ARM-based uC of this project), and for him it was out of the question that something else could be used. He underlined that every other customer in his area is using Linux on that sort of controller.

I also wonder about the trends in offshore software development. In the 90s I saw a lot of development being shifted to Russia. But when Russian developers became about as expensive as European developers this stopped – at least that was my perception. Between 2000 and 2010 I saw a lot of development in India, and for BSP and driver development this seems to be a perfect place.

I, however, would bet on China to become the next center of offshore software – and also hardware – development. The high economic growth is an ideal breeding ground for enthusiastic programmers. The spirit there reminds me a lot of the electrifying times during the dot-com bubble at the end of the 90s – but I don't fear that this bubble will burst, not at all. And the clever import restrictions of China force companies to have facilities there anyway. Furthermore, Chinese programmers are highly trained, and I appreciate the pragmatic and fast approaches I saw in the projects I heard of.

Another question I ask myself is whether a backwards movement away from TDD, MDD, SOLID etc. will happen. I don't see these technologies disappearing. But maybe only the best of them will remain (e.g. TDD without test-first, MDD without class-diagram round trips – only state machines etc., SOLID without giving every class several interfaces, 'if' will not have a smell anymore …). For a more efficient usage of SOLID, I hope a mock framework without the need for interfaces will come out (like voodoo-mock, which seems to be a somewhat abandoned tool to me). Dependency injection is a great technique – but injecting everything everywhere just for the unit tests makes the code heavier than necessary. Coupling and complexity are two poles of an optimization problem – decoupling too much is even worse than decoupling too little – but that's a topic for another blog …
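
To sketch what such interface-free mocking could look like – a minimal, hypothetical example (all class names are made up): the dependency is injected as a template parameter at compile time, so no virtual interface has to exist just for the unit tests.

class TemperatureSensor
{
public:
    int ReadCelsius();  // talks to the real hardware
};

template <typename Sensor>
class FanController
{
public:
    explicit FanController(Sensor& sensor) : m_sensor(sensor) {}

    bool FanNeeded()
    {
        return m_sensor.ReadCelsius() > 60;
    }

private:
    Sensor& m_sensor;
};

// In the unit test a mock is injected without any interface class:
class MockSensor
{
public:
    int ReadCelsius() { return 75; }
};

void TestFanNeeded()
{
    MockSensor mock;
    FanController<MockSensor> controller(mock);
    // expect controller.FanNeeded() == true
}

The production code pays nothing for this testability; the price is merely that FanController becomes a template.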

So long, happy new year, have fun, and consider buying Chinese software company stocks this year 😉

Pushing through a software rewrite

Every experienced software developer has at least once come across code that needed to be rewritten. Code that was originally meant for a smaller set of functionality, then grew over time without any cleanups and finally eroded into total unmaintainability. Code that degraded like a piece of rotten meat, as Uncle Bob would call it (www.cleancoders.com).

It is easy to get budget for a rewrite when a software module has finally reached the point where it can be proven that a rewrite will pay off in the short term. When it is at the top of all bug statistics, when each bugfix comes with new regressions, when all developers cry out 'oh no' when the module's name is mentioned, when most unit tests remain deactivated (if there were any to begin with).

But isn't it too late when it has come this far? Shouldn't a reasonable software architect act long before this singularity of irreversible erosion? Before it is easy to get the budget? Of course, and of course there are two basic options: refactor or rewrite. Refactoring along the principles of Fowler's book 'Refactoring' and Martin's SOLID and Clean Code principles can be a total cure. This limits the domain for rewrites to code that is so rotten, so far away from the rest of the application, that a rewrite will be cheaper than a refactoring.

An example: In my experience a lot of embedded projects reuse code modules from a time when principles like OO and SOLID weren't yet popular among embedded programmers. A typical sign of this style is code that is written in C++ but reads like C. It is typical of C programmers who were forced to use C++ but resisted thinking in an object-oriented way and used classes merely as code modules. Not to mention that such modules usually won't have class-based unit tests. And I also observed a high degree of unnecessary inner redundancy in such code.
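
To make this concrete, here is a condensed, made-up illustration (the names are hypothetical) of 'C++ that reads like C':

// A class used as a plain C code module: everything static, module-global
// state, no encapsulation, no constructor, no class-based unit tests.
class MotorModule
{
public:
    static int  s_state;                  // global state instead of instance state
    static void Motor_Init(void);         // C-style init instead of a constructor
    static void Motor_Process(int cmd);   // one giant switch over 'cmd' inside
};

Such a 'class' has exactly one implicit instance, cannot be injected or mocked, and its state outlives every test run.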

When the rest of the system's code is written in an object-oriented way, or even SOLID-based with lots of instance injection and comprehensive class-based unit testing, a C-style module really doesn't fit into the code base, and often (not always) such modules are candidates for becoming the next piece of rotten meat.

Back to our choice: refactor or rewrite. Can a C-style module reasonably be refactored into SOLID OO code? In my experience this migration would be a stretch too far. Of course one would copy'n'paste some of the beef code when doing a rewrite. But a step-by-step refactoring would in most cases take longer and lead to inferior results when the technological distance is this huge.

However, if you decide to rewrite such modules to keep the overall code base at a comparable level and to prevent total erosion before it happens, you need to get the budget, or at least the commitment of the stakeholders. And usually they won't be amused when you propose a rewrite in advance, because there is no short-term benefit, and your stakeholders are assessed and also incentivized mainly by short-term goals. They want new features, and when you suggest invisible cleanups you're suspected of behaving uneconomically, right? An ivory-tower architect who decides based on the TNTWIWHDI criterion ("That's not the way I would have done it", see objectmentor.com). Just ignore that kind of argument; in the long term you won't be able to do anything against it anyway – see it as a sign that you're doing your job right 😉

Things become difficult when you are confronted with the simple economic approach: estimate the lifetime of the module and the maintenance cost over this lifetime, then compare this against the cost of a refactoring (plus the lower maintenance cost of the refactored module) and against the cost of a rewrite (plus its maintenance cost), including the cost of making the rewrite as mature as the module it replaces. In my experience this approach will almost always lead to the decision not to start a rewrite. But as we all know, software modules always live longer than expected, have more complexity than originally thought, are reused in more products than planned in the beginning, etc.

Therefore my recommendation is to avoid this kind of short-term, naive economic discussion and to answer it by showing the bigger picture. A rewrite has the following benefits (in cases like the above, where the rewrite is faster than a refactoring):

  • Fighting code erosion is a constant effort. Like a machinist cleans up her/his workbench every day or so, code also has to be cleaned up regularly. The pity is that a messy workbench is easily visible and the need for cleaning it up is easily explainable. Explaining why non-SOLID code is rotten to a superior who wrote his last line of code back in the C era is just impossible. So don't go into the details of the current piece of work; this will only be used against you. Stay abstract, explain that a constant budget is necessary for the continuous fight against code erosion and that you – the software architect – are the only one who decides where to spend it. Clarify that if this budget is missing, in a few years a complete rewrite of everything will become necessary, which can ruin everything (I like to mention the company 'Nokia' with their Symbian OS in this context).
  • Avoid calling it a rewrite. In fact it is not the case that everything will be invented from scratch. The knowledge behind the code (which usually has more value than the code itself) remains in the new module, and the beef parts of the code will also remain – just in another structure. You will find it a lot easier when you call the rewrite a reorganization, when you emphasize the aspect that existing functionality and existing code snippets are just moved into a structure that is more compatible with the rest of the application. And in fact this is usually not a lie.
  • Don't brag about the great innovations you're going to apply in the newer, greater version. This makes you suspected of using self-serving technology (or TNTWIWHDI). It is best to avoid unwanted attention and not to initiate conversations about this topic. And by the way, this will also protect you from subconsciously ordering (or even implementing) rewrites that might be somewhat related to your own ego, because you had such a great idea of how this piece of code could be made better 😉
  • If your organization allows it: Don't ask for permission. If you ask questions, you might get answers that won't help you. If you have a budget for continuous refactoring, great, then just do it. However, I personally would talk to some programmers about the sense of a rewrite, because this is (depending on the module) a decision with significant impact, and during the first months the module's reduced maturity will cause extra effort. You'd better establish a firm commitment from the developers before starting to fight for budget.
  • The last argument is: developer fun. In a scientific context I once read "The brain runs on fun", and I agree with this. Creating an atmosphere of creativity and fun is an important task for a software architect, and having a relatively clean code base and state-of-the-art coding methodology is a key element of developer fun. However, this argument is only applicable in companies that value software developers as an important resource. Traditionally, in the field of embedded software the value of software developers is not very high, because the company primarily sees the visible machine development. The thinking goes like this: "Great, now we built the machine; we also need a little bit of software to run it, and we need a manual and a transport package of course, that shouldn't be forgotten. Just copy the software image into SAP right beside the manual …". Ok, this way of thinking has changed over the past years, because companies realized that software can provide USPs and that software nowadays consumes a big part of a machine's budget. Maybe it is a good idea, before talking about developer fun, to draw attention to the importance of software in today's devices. However, this argument is dangerous and can be used against you (the ivory-tower, fun-loving architect …).

What is the right definition of "Software Architecture"?

Recently a colleague gave me an interesting definition of Software Architecture that he had heard at a conference. It was something like:

Software Architecture is the sum of all decisions made for a software product.

Maybe this is true in some way, but I'd prefer a definition that emphasizes the overview aspect of architecture. In the IEEE standard "1471-2000 – IEEE Recommended Practice for Architectural Description of Software-Intensive Systems" a definition can be found that emphasizes this aspect:

The fundamental organization of a system embodied in its components, their relationships to each other and to the environment, and the principles guiding its design and evolution.

Sounds good; however, I miss the aspect that a software architecture doesn't suddenly appear out of nowhere. It is more a process than an entity. A process that never ends: the architecture constantly changes and erodes during a product's lifetime. It is the outcome of a tremendous amount of communication and is also used in communication with a lot of stakeholders. I found a definition with more regard to this aspect in the "Microsoft Application Architecture Guide, 2nd Edition":

Software application architecture is the process of defining a structured solution that meets all of the technical and operational requirements, while optimizing common quality attributes such as performance, security, and manageability. It involves a series of decisions based on a wide range of factors, and each of these decisions can have considerable impact on the quality, performance, maintainability, and overall success of the application.

Well, after all I'm still not 100% happy with these definitions. What would be your definition?

Typedef or template?

Every once in a while I see C++ templates being used in code that has only one single type to pass into the template. This type might change in future application versions, but it is foreseeable that it will always be one and the same for the whole application. In this case I always ask myself why the developer didn't choose a typedef instead.

Example: A class processes the float data type and shall, for future versions, also be prepared to process double – but a mix of float and double is not a use case. In this case a typedef offers the same degree of versatility – without the drawbacks of template programming. As drawbacks I see higher code complexity and a tendency to write code in .h files that belongs in .cpp files (although it is possible to split up .h and .cpp files for templates too – if one chooses a fixed set of types to be compiled in – but that's a topic for another blog …).
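
A minimal sketch of the two variants (class and type names are made up):

// Template variant: fully generic, although the application only ever
// instantiates it with one single type.
template <typename T>
class FilterT
{
public:
    T Process(T input);  // implementation tends to end up in the header
};

// Typedef variant: the type can still be swapped in one single place for
// a future product version, but the class stays a plain class.
typedef float sample_t;  // switch to 'double' in a future version

class Filter
{
public:
    sample_t Process(sample_t input);  // implementation stays in the .cpp
};

In both variants the future switch from float to double is a one-line change; only the template variant pays for a generality that is never used.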

Don't get me wrong, there are many situations where templates are justified: if several types have to be processed by the same class in the same application, or if an independent binary library isn't supposed to be recompiled for an application-specific set of types. But, again, a template should not be used if it will be compiled for only one type in an application (and if it is foreseeable that it will always be only one). If the type might change, but the number of supported types will remain one: use a typedef as long as you can.

Up-front performance optimization / bit fiddling

Have you ever seen code like this in your embedded project?

// Counts the bits set in the 32-bit value someValue, processed 12 bits at
// a time (the last chunk covers the top 8 bits), via multiply/mask/modulo:
uint32_t numberOfHighBits = ((someValue & 0xfff) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;
numberOfHighBits += (((someValue & 0xfff000) >> 12) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;
numberOfHighBits += ((someValue >> 24) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;

In high-performance code? Or in time-critical device drivers? I mostly saw this kind of code in places that were not time-critical at all. Even on slow processors (like FPGA soft cores, for example) the runtime savings of this code style are negligible in most places of the code. Not negligible are the higher costs of software maintenance, which become obvious when we look at a more readable alternative:

// Small helper, shown for completeness:
static inline bool isBitSet(uint32_t value, uint32_t bitPos)
{
  return ((value >> bitPos) & 1u) != 0u;
}

uint32_t numberOfHighBits = 0;
for (uint32_t bitPos = 0; bitPos < sizeof(someValue) * 8; bitPos++)
{
  if (isBitSet(someValue, bitPos))
  {
    numberOfHighBits++;
  }
}

The basic problem here is that quite a few embedded developers – especially the ones who programmed extremely resource-restricted systems for years and years in the past – do up-front performance optimization. I found some to be very proud of this behavior; they feel it makes them better programmers.

Up-front performance optimization can also be found in other areas. Some embedded programmers don't use object orientation because of the performance and memory impact of a v-table. Of course there are systems that are so resource-restricted that a v-table is not possible. (Object orientation still is, by the way …) But on today's embedded systems this kind of restriction usually doesn't exist anymore.
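
A small sketch of that parenthesis (the class is made up): as long as no method is declared virtual, a C++ class carries no v-table pointer and costs no more than the equivalent C struct plus free functions.

#include <cstdint>

// No 'virtual' anywhere, hence no v-table and no per-object v-table
// pointer – encapsulation and instances come essentially for free.
class RingBuffer
{
public:
    RingBuffer(uint8_t* storage, uint32_t capacity)
        : m_storage(storage), m_capacity(capacity), m_head(0), m_tail(0) {}

    bool Push(uint8_t byte)
    {
        const uint32_t next = (m_head + 1) % m_capacity;
        if (next == m_tail)
        {
            return false;  // buffer full
        }
        m_storage[m_head] = byte;
        m_head = next;
        return true;
    }

private:
    uint8_t* m_storage;
    uint32_t m_capacity;
    uint32_t m_head;
    uint32_t m_tail;
};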

Also, your recent gigahertz ARM controller will be brought to its limits in a few years, when the product manager has you add more and more features. It always was this way and it always will be. But then it will not be related to v-tables or bit hacks. Nor would bad programming style have saved you from it.

So, how can you ensure that your developers don't write useless bit hacks for your brand-new gigahertz ARM controller? I would suggest that you encourage the development team not to do any up-front performance optimizations in the code at all – at least when it is not 100% clear that a performance optimization is necessary, and especially if the topic is not architecture-related at all and only the way the code is written is affected, like in the example above.

Well, will everyone stick to this rule? Of course not. You will still be arguing about why clean code is sometimes worth more than a piece of code that is hard to read but might execute faster. Let me make a suggestion for this case:

Set up a performance profiler on every single workstation/target that can be started as easily as possible. Set up a wiki page that gives clear information, with screenshots, on how to profile. The cost of learning how to profile and of executing a profiling session must be as low as possible.

Then, when a developer has applied an unnecessary optimization, it can easily be shown whether it is beneficial for the runtime behavior at all. It might happen that the code will not be rewritten even when the developer notices that her/his performance optimization was useless. But the developer will learn from this profiling event, and the probability of the next useless up-front optimization will decrease …
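
Where no full profiler is at hand, even a tiny timing helper can settle such a discussion. A minimal sketch, assuming a hosted toolchain with <chrono> (on a bare-metal target you would read a hardware cycle counter instead):

#include <chrono>
#include <cstdio>

// Runs 'function' once and prints the elapsed wall-clock time.
template <typename F>
void Measure(const char* label, F function)
{
    const auto start = std::chrono::steady_clock::now();
    function();
    const auto stop = std::chrono::steady_clock::now();
    const long long us =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    std::printf("%s: %lld us\n", label, us);
}

// Usage (run each variant in a loop to get measurable numbers):
// Measure("bit hack",      [] { /* optimized version */ });
// Measure("readable loop", [] { /* readable version */ });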

In some circumstances, however – in an interrupt handler, in the nested loops of high-performance code, or when the profiler proved that it is necessary in a particular place – you will still have to, and want to, use bit hacks. In that case have a look at this awesome website, which lists lots of examples: Bit Twiddling Hacks

Generated State Machines – A Lean Way

Have you ever used a state machine generator? Maybe you've heard of complicated products like Rational Rose Technical (formerly known as Rational Rose RealTime), where the whole software is turned into active objects passing messages between state machines. There you write code into the model that will be executed when a state transition occurs; code and model form a unit. Using such a powerful approach can be really beneficial.

But …

often this is overkill for a project. In that case it is still not necessary to relinquish the concept of a generated state machine.

Let's look at an example state machine. Most embedded devices that are safety-critical and have some actuators and operation modes can be modeled this way. The rectangular boxes are states, of course. The lines between the states are state transitions. The text near a state transition is the name of the event that moves the state machine from one state to another.

[Figure: Example State Machine]

In a project some years ago, when I was a freelancer, I had a good experience with a simple state machine generator without injecting state-transition code. Instead we used the generated state machine as a decoupled helper instance, and we used only two methods (a sketch of the generated interface follows the list):

  • stateMachine.SendEvent( <event-enum> );
  • stateMachine.IsInState( <state-enum> );
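
What such a generated class might look like – a hand-written sketch only, the actual code emitted by a generator will differ:

// Enums and class are a simplified sketch; a real generator derives them
// from the UML model (hierarchical states make the internals more involved).
enum Event { SomeAction_Called, AnotherAction_Called, Shutdown_Called,
             SA_Precondition_OK, SA_Move_Done, Failure_Detected };

enum State { Waiting_For_Command, Checking_Precondition,
             Moving_Something, Failure };

class StateMachine
{
public:
    StateMachine() : m_currentState(Waiting_For_Command) {}

    // Performs a state transition if 'event' is valid in the current state.
    void SendEvent(Event event)
    {
        (void)event;  // the generated transition table is omitted in this sketch
    }

    // Pure query; never changes the state.
    bool IsInState(State state) const { return m_currentState == state; }

private:
    State m_currentState;
};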

Code examples:

void UI::UserPressedButton()
{
	if(!stateMachine.IsInState(Waiting_For_Command))
	{
		piezo.PlayErrorSound();
		return;
	}

	stateMachine.SendEvent(SomeAction_Called);

	// process the button press
	someAction.Execute();
}

That's all. We only call IsInState() and SendEvent() on an object instance called stateMachine – nothing more, that's the whole trick.

In other words, we ask the state machine about its state before doing something that is only allowed in certain states. And furthermore we send events to trigger state transitions, i.e. to change the current state. (I left out the error handling of SendEvent here for better readability.)

Consequently, methods like SomeAction::Execute() are also secured by a state machine sanity check:

void SomeAction::Execute()
{
    if(!stateMachine.IsInState(Checking_Precondition))
    {
        // this is a serious internal error, handle it ...
    }

    checkPreconditions();
    stateMachine.SendEvent(SA_Precondition_OK);

    moveActuator();
    stateMachine.SendEvent(SA_Move_Done);
}

As you see, this is a very straightforward and lean method that adds a tremendous amount of safety and robustness to your system. It can no longer happen that the system executes something it isn't allowed to do in the current state, because just typing if (stateMachine.IsInState()) is sufficient protection.

The good news is that the code of the stateMachine object offering the SendEvent() and IsInState() methods can easily be generated. We used SinelaboreRT to generate it directly from Enterprise Architect (it also has its own editor if you have no UML tool). It's inexpensive (only $99 some years ago) and even has a state machine simulator, where you can send events to the machine and see what it does (which actually saved me from one conceptual bug). It also offers sanity checking of your state machine.

The concept of using this kind of decoupled, lean, generated state machine code really saved our ass in an application where the user interaction came in over network (SOAP) commands. The SOAP commands were issued by a Windows PC, and the Windows programmers, unaware of embedded or safety-critical programming style, were continuously firing SOAP commands at the wrong times. We used code like the following to perfectly shield the safety-critical embedded system from the misplaced SOAP calls of the Windows UI programmers:

void Soap::Process_Cmd_SomeAction()
{
    if(!stateMachine.IsInState(Waiting_For_Command))
    {
        Respond_Invalid_Soap_Cmd();
        return;
    }

    stateMachine.SendEvent(SomeAction_Called);

    someAction.Execute();
}

void Soap::Process_Cmd_AnotherAction()
{
    if(!stateMachine.IsInState(Waiting_For_Command))
    {
        Respond_Invalid_Soap_Cmd();
        return;
    }

    stateMachine.SendEvent(AnotherAction_Called);

    anotherAction.Execute();
}

void Soap::Process_Cmd_PowerOffSystem()
{
    if(!stateMachine.IsInState(Waiting_For_Command))
    {
        Respond_Invalid_Soap_Cmd();
        return;
    }

    stateMachine.SendEvent(Shutdown_Called);

    systemController.PowerOff();
}

SOAP commands can also be processed in states other than Waiting_For_Command. For example, a FailureStop command could be allowed in all states below the 'Running' state. You might notice that this means a hierarchical state machine – have a look at where the transition named Failure_Detected is located in the state machine diagram …

void Soap::Process_Cmd_FailureStop()
{
    if(stateMachine.IsInState(Failure))
    {
        // Ignore the command if we're already in the Failure state
        return;
    }

    stateMachine.SendEvent(Failure_Detected);

    failureHandler.Execute();
}

Another example could be a SOAP command that is only allowed in the Moving_Something state, e.g. a command to abort the movement.

void Soap::Process_Cmd_AbortMovement()
{
    if(!stateMachine.IsInState(Moving_Something))
    {
        Respond_Invalid_Soap_Cmd();
        return;
    }

    actuator.stopMoving();

    stateMachine.SendEvent(SA_Move_Done);
}

You get the point, right? This is really straightforward, easy and lightweight, and because the state machine itself is generated from UML you have to write very little code (initially, and also when the state machine changes – and in an agile project it will change often 😉).

Have fun 😉
Roelof Berg
Embedded Software Architecture Blog
www.embeddedsoftwarearchitecture.com

How the finance office turned me into a blogger

Why this blog?

This was once the business homepage for my work as a freelancer. Since November 2013 I'm no longer a freelancer; instead I'm employed as a software architect at a major German vendor of medical devices.

Last week the German finance office demanded about 23,000 euros from me because they thought my former business was still running. Oh dear! So I had to act, deleted all the old business-related content of my web space and turned it into a purely private homepage.

So, what to do now with a private homepage? No idea! Well, I could blog some thoughts and insights from my job as a software architect. During the 12 years I worked as a freelance embedded software guy, I saw a lot of different software architectures, company cultures, methods, tools, patterns … which gave me a broad perspective.

The topic of this blog will be: Embedded Software Architecture.