Friday, 22 June 2018

Understanding decomposed


I previously wrote about simplicity and how it helps us understand code. In that post I already mentioned using concepts like set theory. I'd like to expand on this and show how to boost the readability of code.

Despite what people might think, our main purpose as developers is not writing software, but solving problems. Code is just a means, not the goal. Nobody demands software from a developer just for the sake of it. Applications need to solve whatever problems our customers need solved. This point of view on why we are here also changes what software is. It ceases to be merely a different formulation of a solution specified somewhere else; it is the solution. But this requires us to think differently about what we are doing, and how we're doing it.

Code is the ultimate truth, as they say. After all, it is the code that is executed in production, not some written specification, diagrams, JIRA tickets, or the ideas we keep in our heads. The problem is that if the code is the solution, then there are a few groups of people who need to understand it. The first one is developers; after all, we're closest to it, we write it, it's our bread and butter, our life. The second group is the stakeholders or customers, who define how the systems should behave. Other groups, like operations, may also be interested in understanding how the system will behave, or with what other systems it interacts.

Let’s do a small experiment. Below is a piece of code. Please take a look at this and try to name the algorithm that is implemented here.

int n = arr.length;
int temp = 0;
for (int i = 0; i < n; i++) {
    for (int j = 1; j < (n - i); j++) {
        if (arr[j - 1] > arr[j]) {
            temp = arr[j - 1];
            arr[j - 1] = arr[j];
            arr[j] = temp;
        }
    }
}

It’s bubble sort, but it probably took you a few seconds to find out. If I had just written `bubbleSort(arr)`, you’d know immediately what it does, how it's done, and a few other properties of the algorithm, like its terrible performance for bigger arrays. The only trick I used was to name this piece of code, and this name refers to something you already know. If you don’t, it’s very easy to find out.

My point is that by using known concepts we can make the code shorter and more understandable. This piece of code is 11 lines long, while calling a function takes just one line. Of course, those 11 lines need to be written somewhere, but we can hide them deeper in our code, outside of the part where the “business logic” lives.
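For illustration, here is a minimal sketch of that idea (the class and method names are my own, not any standard API): the same 11 lines, hidden behind the well-known name, so that calling code only needs the one line.

final class Sorting {

    // The same algorithm as above; callers just see the well-known name.
    static void bubbleSort(int[] arr) {
        int n = arr.length;
        for (int i = 0; i < n; i++) {
            for (int j = 1; j < (n - i); j++) {
                if (arr[j - 1] > arr[j]) {
                    int temp = arr[j - 1];
                    arr[j - 1] = arr[j];
                    arr[j] = temp;
                }
            }
        }
    }
}

// Business-level code now reads as intent, not mechanics:
// Sorting.bubbleSort(arr);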

Those concepts can be anything. I mentioned a pretty obvious one: making the elements of a collection ordered in some way is called sorting, and everybody knows it. Another example could be set theory, with operations like intersection, set difference or Cartesian product. After all, our applications are about data processing, and it's never about a single piece of data. Yet another example of such a concept could be design patterns, which are re-usable solutions to well-understood problems. Once a developer hears about, for instance, a decorator pattern, he or she should immediately know what to expect.
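As a small illustration (a sketch; the intersect wrapper and its name are mine, not a standard API): Java's collections can already express set intersection via retainAll, and a thin, well-named method brings the familiar concept to the surface.

import java.util.HashSet;
import java.util.Set;

final class Sets {

    // "Intersection" from set theory: the elements present in both sets.
    // retainAll does the work; the wrapper only gives it the name everyone knows.
    static <T> Set<T> intersect(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }
}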

There's a really nice tweet from Mario Fusco contrasting an imperative and a functional implementation of the same logic.
Although it's a bit biased towards the functional way, and the imperative code can definitely be improved (see the discussion below the tweet), it clearly shows some benefits of using pre-existing ideas, monadic function composition in this case. Knowledge of such a technique, coming from the functional world and now more and more adopted in the OOP world, enables not only a better separation of concerns, but also more descriptive code. I'd argue that the functional code says more about what it does, while the imperative code is more about how to achieve it. Because of that, the functional code is more readable and understandable; it conveys intention. And it's just an example; I'm not trying to say that functional programming is always or usually better than the object-oriented or procedural approach. But because certain mathematical abstractions are much more present and natural in the functional world, the code can also be expressed far more easily in terms of those abstractions.
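Since the tweet itself isn't reproduced here, a rough impression of the contrast it makes (my own example, not Mario Fusco's code; records need Java 16+): the same lookup written imperatively and as a composition of functions over Optional.

import java.util.Optional;

record Address(String city) {}
record Customer(Address address) {}

class CityLookup {

    // Imperative: every step guards against null; the "how" dominates the code.
    static String cityImperative(Customer customer) {
        if (customer != null) {
            Address address = customer.address();
            if (address != null && address.city() != null) {
                return address.city().toUpperCase();
            }
        }
        return "UNKNOWN";
    }

    // Monadic composition: the same logic reads as a pipeline of intentions.
    static String cityFunctional(Customer customer) {
        return Optional.ofNullable(customer)
                .map(Customer::address)
                .map(Address::city)
                .map(String::toUpperCase)
                .orElse("UNKNOWN");
    }
}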

Why is it so important to go for known things? Because they tend to "disappear" from sight and leave more space for other, usually more important matters. Once we can refer to some external knowledge, our brains don't need to worry about that part any more. If I know what "intersection" means in set theory, then understanding `a.intersect(b);` is trivial. I don't even see the dot, the semicolon and the brackets any more. If I wrote `a.commonElements(b);`, I'd probably force myself and others to think about, or find out from the code, what is meant by `commonElements`. It would be one more thing to keep in my head, in my working memory.

This memory is, however, very small. It can hold approximately 7 "things" at the same time. Once we exceed our own limit, we lose the whole picture; we cannot comprehend the whole problem as one any more (by the way, this is also the moment when bugs start to creep into our code). If we refer to something we already know, then those concepts don't need to "take up space" in our working memory, so it's easier and faster to reason about the code.

A different aspect of understanding has to do with the language we're using. The software we're writing is a kind of mental model. Such models try to reflect real life as much as possible or necessary. They can also have theories built around them, helping us deal with real life or decide what to do. These models and our software should be as close as possible to real-life models to reduce friction between them. Any differences are bug breeding grounds, and very time-consuming to work with. Every time we see parts of the code that show inconsistencies, we need to think about whether the code indeed reflects reality, or whether it looks the way it looks because we were unable to express reality in a better way.

This whole idea plays very well with ubiquitous language, an integral part of Domain-Driven Design. It states that there should be only one common language used to talk about the problem we're trying to solve, no matter who's talking to whom (developers, testers, business people), and also in the code. It makes sense; why would we want to "translate" between code, documentation and spoken language when we're talking about the same things? In such translations we'd lose important details, and it also takes time and effort.

If such a language doesn't exist yet, we should create it. Or rather, as agile or extreme programming proposes, use metaphors. That way, although there's no language specific to the kind of problem we're trying to solve, there is a similar concept we can borrow from. This will also help in communication between people; instead of describing terms every time we need them, or giving them artificial names, we can use terms from this other domain. Their power is that they already have a meaning in the context we're interested in. It will also greatly speed up bringing new people into the project or talking about the project outside.

Code written using ubiquitous language or metaphors will also reveal its intent much quicker. That's a very important property of a simple design. Such a design leads to quicker software development and to implementing only what is actually needed. If we can split our big problem into smaller ones and properly separate the concerns, we have a higher chance of being able to use the ubiquitous language directly in the code, thus making it more understandable. Our code should talk more about the actual problem it's trying to solve than about any technical solution it uses, like libraries, frameworks or databases.

This might also have the additional benefit of not having to write code at all. Chances are that for any problem that is not core and unique to our business, someone has already solved it, and there is a library available. If we use this library, we'll save ourselves the time of writing and maintaining it, and we'll be able to focus more on important matters. It will, again, make the code more understandable, because there will be less code to understand. We should also remember that our task is to solve problems, not to write code, and we should only work on what differentiates our companies from the competition. This is where profit is generated; commodity code has already been written, all companies can get it, so there's no real advantage there.

As you can clearly see from the sheer number of topics and concepts I mentioned in this post, it's not so easy to keep our code understandable. And I didn't even mention all of them. It also means that we can make our code difficult to read in so many ways. But I believe that it is our common goal to make our code more readable and understandable. This not only makes working with such code easier, but helps us to deliver value faster.

Happy coding!

Note, this article was originally posted on INNOQ blog.

Friday, 15 June 2018

Supporting understanding with simplicity

Some time ago, my colleague Joy Clark wrote about simplicity. I've also approached this topic once before. I'd like to follow up on both articles and show how simplicity can boost our understanding of software. I will also propose that we look at other disciplines to see if and how simplicity affects them. Maybe we can benefit from their experience.

What's all the fuss about?


Why do we even talk about simplicity? Why is it such a hot topic? Why is nobody praising complexity? Why do things like the KISS principle exist?

As it turns out, keeping things simple has some profound effects. In science, the principle of Occam's razor has been in common use for a few hundred years. In short, it proposes that if there are two or more hypotheses to choose from, the one with the fewest assumptions should be selected. Obviously, the "razor" is not a law, it cannot be proven, it's just a heuristic. The most interesting thing about it is the reasoning behind it: the simplest hypothesis should be selected because it's the easiest to test or falsify. If we apply this principle to software engineering, we could say that we should prefer simpler solutions because it's easier to find defects in them or prove them correct. Indeed, it feels intuitive. We could also reverse it and say that if we're having trouble writing tests, it's probably because our solution is not as simple as it could be. But remember, it's only a heuristic, it does not necessarily have to be true. Maybe the problem you're trying to solve is indeed complex and your solution cannot be simplified. Einstein is credited with saying "Everything should be kept as simple as possible, but not simpler"; maybe that's your case.

What does "simple" look like?


But what is "simple"? How can we tell if something is simple, or select the simpler of two solutions? By definition, simple means "having or composed of only one thing, element, or part" (Wordnik), "free of secondary complications, unmixed" (Merriam-Webster) or "having few parts or features; not complicated or elaborate" (The Free Dictionary). The meaning is similar to what Rich Hickey proposes in his talk "Simple Made Easy" (one of those I strongly recommend watching). So "simple" is about cardinality: the less, the simpler. It also plays very well with the idea of cognitive load. The human brain is said to be able to process only a few items at the same time. The more different concepts, like variables or state, influence any given piece of code, the less simple it is. As the number of moving parts increases, we need to use more energy and concentrate harder to reason about and change the code, so it takes more time and it's easier to make a mistake.

How "simple" works


No wonder, then, that there exist so many principles suggesting we reduce the number of concepts present in or affecting a particular piece of code, starting with high cohesion and low coupling, through the single responsibility principle and worse is better, all the way to TDD. TDD is actually a very interesting example of how simplicity can help us on many different levels. It was developed as a technique that helps us focus on a single thing: the feature to implement. Not only does it keep us from violating the YAGNI principle, it also has an additional psychological effect. Every single step focuses on a different aspect, so it frees us from thinking about too many things at the same time. When writing a test that fails, we don't need to think about the implementation. The only important thing is: what am I expecting the system to do? When making a test pass, we focus on just that, doing the simplest thing that will make the test green (and not make any other red). At this point, we don't need to worry about, for example, code quality, coding conventions, etc. We can worry about that in the refactoring step, where the tests we wrote so far free us from thinking about functionality.
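A tiny sketch of the cycle (a hypothetical Cart example of mine, using JUnit 5): first the failing test that states only what we expect, then the simplest code that makes it green; refactoring comes afterwards, protected by the test.

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;

import org.junit.jupiter.api.Test;

class CartTest {

    // Red: written first, it states only WHAT we expect, nothing about HOW.
    @Test
    void totalIsTheSumOfItemPrices() {
        Cart cart = new Cart();
        cart.add(new Item("book", 10));
        cart.add(new Item("pen", 2));
        assertEquals(12, cart.total());
    }
}

record Item(String name, int price) {}

// Green: the simplest implementation that makes the test pass.
class Cart {
    private final List<Item> items = new ArrayList<>();

    void add(Item item) { items.add(item); }

    int total() { return items.stream().mapToInt(Item::price).sum(); }
}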

Is it easy?


And again, it feels intuitive: fewer things at the same time are easier to handle, and not only in software development. There's a reason why it's forbidden to drive a car and use a mobile phone at the same time. Of course, we can get better and be able to "juggle" more items at once, but there will always be a toll to pay, and the limits can't be pushed indefinitely. You can surely recall a time when coding something was really difficult because you knew neither the API nor the syntax nor the common pitfalls; your programs were buggy and crashed often. Because everything was new, it took an enormous amount of energy and focus to put a few things together. Now that you're familiar with your tools, standard library, etc., all those basic problems seem to disappear and leave you to focus on your actual problems. But that is only because it's getting easier for you to write code; it has nothing to do with how simple your language or tools are. They may be complicated, but after some practice, they become easy. Indeed, Wordnik defines easy as "Capable of being accomplished or acquired with ease; posing no difficulty: an easy victory; an easy problem".

It's a trap!


Making things easy might seem like a good idea. If you know the tool that can help you solve the problem at hand, you'll be done fast and probably without any major issues. The catch is: not everyone knows the tool. If a new developer joins the team, they might have a hard time understanding what you did. They would need to familiarize themselves with the tool first before they can dive deep into your solution. Easy is subjective, simple is objective. Making things easier will always make them easier for you or your current team, not for everybody. Making things simpler, on the other hand, makes them simpler for everyone. Of course, simple doesn't automatically make things easy; they might still be complicated (though not complex). For example, "simple" could mean that the code implements some quite sophisticated algorithm, which is difficult to understand by itself, but is free of anything that is not necessary to solve the problem, such as the use of Spring or Hibernate.

Understanding


But even if all those multiple things coupled together are easy for you, working with such code will not be. You'd constantly need to switch contexts in your head while reading the code, each time focusing on a different concept. This not only takes time, but also leads to all kinds of problems. The most important one is that you cannot easily understand the logic. There's no way to see only the code you want to see, for example the business logic. There is constant noise: code that does something else, not related to what interests you at the moment. Once you cannot understand what your application is doing and why, you will introduce bugs and deliver features that do not fully match the requirements. Only if you can fully understand what is currently happening in the code will you be able to change it in a predictable way and be aware of all the consequences of your changes. Such understanding cannot be replaced by, for example, the presence of a test suite. That can only verify that you're not breaking current behaviour (the behaviour covered by tests, that is). Simple code, containing just one concept, makes changes much easier. The scope is limited, and even if you do introduce issues, they'll be easy to find.
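To make the "noise" tangible, a hypothetical before/after (all names are mine): the same business rule, first buried among technical concerns, then standing on its own.

import java.math.BigDecimal;
import java.util.logging.Logger;

class DiscountService {

    interface CustomerRepository { Customer findById(long id); }
    record Customer(boolean loyal) {}
    record Order(long id, long customerId, BigDecimal total) {}

    private static final Logger log = Logger.getLogger("shop");
    private final CustomerRepository customers;

    DiscountService(CustomerRepository customers) { this.customers = customers; }

    // The rule buried in noise: logging, lookup and validation all compete
    // with the actual logic for the reader's working memory.
    BigDecimal discountFor(Order order) {
        log.fine(() -> "calculating discount for order " + order.id());
        Customer customer = customers.findById(order.customerId());
        if (customer == null) {
            throw new IllegalStateException("no customer for order " + order.id());
        }
        return discount(customer, order.total());
    }

    // The rule on its own: loyal customers get 5 percent off.
    // One concept, one method, nothing else to keep in your head.
    BigDecimal discount(Customer customer, BigDecimal total) {
        return customer.loyal() ? total.multiply(new BigDecimal("0.05")) : BigDecimal.ZERO;
    }
}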

The power of simplicity


Simplicity also has a different effect that we might want to pursue: elegance. It's this feeling, when you're looking at your solution, that you've done the job right. The code is exactly to the point, does what it needs to do and nothing more, it's efficient, succinct; you just look at it and know what it does, and you cannot really improve it. Antoine de Saint-Exupéry once said that "It seems that perfection is reached not when there is nothing left to add, but when there is nothing left to take away". Mathematicians and physicists, too, are looking for "elegant" solutions and theories. They've learned over the centuries that if their solution is not elegant enough, it probably can be improved, as it is most likely not complete yet. That is not to say that it's wrong. It's just not touching the very core of the problem yet; it might be just a step towards "the" solution. This can be seen very well in the pursuit of a Grand Unified Theory in particle physics, where scientists know which parts need to be improved just by looking at the pieces of the Standard Model that are not elegant enough. It goes very well with the already mentioned Occam's razor: the more elegant, the simpler, the easier to falsify. So maybe it should also be a sign for us that, whenever we see not-so-elegant code, we're probably not solving the problem in the simplest way possible.

Easy is not evil


Having said all that, I want to clarify that "easy" is not a bad thing; it's not contradicting "simple" by definition, and they can support each other very well. If "simple" is "easy" for you, even better, strive for it! Just make sure that when you're trying to improve things, you first make them simple. Once concerns are separated from one another, you can learn and familiarize yourself with those that are not easy for you yet; the same will be possible for all other people working with your code. It will also be possible for you and your team to announce that your code assumes knowledge of, for example, set theory. When a new developer joins the team and doesn't know some pieces yet, they can start working on the concerns they already know and, in parallel, learn the pieces they are not familiar with just yet. And even if they needed to go directly for the difficult parts of the system, they would at least be struggling with only one problem.

So please, keep it simple.

Note, this article was originally posted on INNOQ blog.

Wednesday, 2 November 2016

What do I believe in

Below are my guiding principles, the basis of what I do, how I do it, and why, with regards to software development.

Long term development speed


First and foremost I understand my main responsibility as delivering value as fast as possible in the long term.
By delivering value I mean adding, changing or removing features, that "business" requested, but also things like:
  • improving processes around me, both technical, like application deployment, and business, like simplifying stock management,
  • proposing and introducing new techniques and technologies whenever I feel that they will improve things,
  • sharing knowledge within and outside of the team (blog, mentoring, pair/crowd programming, etc.).
"As fast as possible" is self explanatory, I want to deliver as much as possible using as little time as possible.

"Long term" part needs a bit of clarification, because it might seem like contradicting with previous part. As studies show (see below), when it comes to TCO for software, most of the money is spent on maintenance, not on development. That means, that thinking in "short term" and "long term" might be different. Sometimes a "quick and dirty hack" today, just to have something done faster in short term, would mean, that in long term we would be slower. We would either have to anyway solve the problem correctly or be constantly hit by our bad solution. That's why I take as my obligation to think long term, and to try to find solutions better in long term, even at a cost of spending more time in short term.

One of the main factors, if not THE factor, leading to increased long term development speed is understanding.

Understanding and readability


You can only do things, do them right and do them fast if you understand your environment, for example the code. Readability, not only of the code, but also of things like test scenarios, greatly eases reasoning about what the application is doing, so it's far easier to add or change functionality or to investigate "why the thing that happened happened". Readability cannot be replaced by anything, especially not by TDD/BDD techniques or any other kinds of tests. Those are only safety nets; they can only protect against issues in currently implemented features, and they provide no help whatsoever when changing the code. You need to know for sure how the things you're doing now, to the code, documentation, test scenarios or whatever else, will affect the system. But even more importantly, you need to make sure that your changes don't hurt the understanding of everybody else who will be reading or changing it. As every line of code (and not only code) is written once but read multiple times, it's worth spending more time on writing it in such a way that it's easier to read later on.

How to achieve code readability has a short answer: code quality.

Code quality


Although quality code is much more pleasant to work with than all those "big balls of mud", it's not a goal on its own; it's a way of reaching the goals mentioned above. Luckily, there are already standards and rules for code quality.

Industry standards


Some time ago our industry finally started creating standards. Those standards are all those rules like SOLID, DRY, KISS or YAGNI, techniques like TDD or BDD, and "templates" like design patterns or hexagonal architecture / clean architecture. I believe that, although nobody has proven them correct (yet), they're referred to by many developers as something that enables them to develop faster and with better quality. Also, up until now, I haven't found any other set of rules or guides that would be as complete as the things mentioned before. People report that other approaches or principles work for them, but usually no one else supports such claims, or supports them only partially; they're more exceptions than rules. They also don't form a system, that is, they can't be put together and applied all at once; they would probably contradict one another in a few places.

I admit, it's homework for me to find out how good those standards are, whether they really provide what they promise and whether they really are the only way. For now they work for me, but I might be subject to confirmation bias and be missing other, important points of view or approaches. But do expect more detailed, in-depth posts, filled with hard data and fewer beliefs, soon.

Mindset


The right mindset is required to produce code of good quality. But code quality is not something that can be achieved; it can only be asymptotically approached. What can be achieved is code of good enough quality. Moreover, as ideas, languages and tools evolve, untouched code, even if of good quality yesterday, will be of worse quality tomorrow.

Quality degradation always starts the same way, as explained by the broken windows theory. In short: once someone consciously ignores or violates quality, an exception is created that others are likely to follow. Eventually, exceptions become regular behaviour, because they're "easier".

All this makes code quality something we need to fight for all the time, every single time we touch the code. By splitting off separate tasks for improving quality, we risk never getting the time for them. There is actually a place for this work in the regular development process: the refactoring step in the TDD/BDD cycle. It is as important as the other two steps, but quite often forgotten. One can also apply the boy scout rule and always commit code with even slightly better quality than was checked out, even when that particular piece of code wasn't touched during feature implementation.

The ideas mentioned here are also mentioned in Software Craftsmanship Manifesto. They're also part of a forming software professionalism movement or definition.

It's nothing new


All or most of the things mentioned above are not new; most of them can be found in the agile principles formulated in 2001. Even the ideas expressed in the Agile Manifesto are not new, see for example the NATO report below. A. J. Perlis said in 1968: "A software system can best be designed if the testing is interlaced with the design instead of being used after the design". So TDD wasn't discovered in 2003 by Kent Beck, it was rather "rediscovered", see the discussion here.

Exceptions


Unfortunately, the world is not perfect, and from time to time we're asked to do something fast in the short term. There are cases where such requests are valid: for example, the law has changed and we need to adapt our systems before an externally set deadline, or a problem occurred that we couldn't foresee, the company is losing money and we need to react immediately. In such cases we should cut some corners to deal with the situation, but we should also take care to bring the quality back after the issue has been dealt with.

A few additional notes from me


As mentioned above, I'm not fighting for clean code just for its own sake; for me it's a way of ultimately being able to deliver fast, consistently. It's one of the Agile Manifesto's principles: "Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely." Out of all the rules of clean code, the ones currently most discussed and stressed are, in my humble opinion, separation of concerns (a.k.a. the single responsibility principle) and simplicity.

This post is far from complete; I tried to focus on the most basic problems. But remember this: always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

BTW, this post was simultaneously posted on two blogs: Zooplus Tech Blog and my private one. Please feel free to go to the other one and check it out!

Software engineering / professionalism

NATO Software Engineering Conference report - although the conference was held in 1968, it contains things like:
  • a description of the iterative process of writing software,
  • the statement that the best systems are written by small groups,
  • starting development from a skeleton, which is used to explore the problem domain and technical solutions.


Total cost of ownership


Simplicity

  • Out of the tar pit - a scientific paper focused on complexity and how to handle the accidental kind


Videos / presentations to watch


Books to read

Friday, 21 October 2016

Code quality as a winning factor

Software development seems to me like the only discipline where practitioners can get away with producing crap. If you're a doctor and you screw up a surgery or a treatment, in the worst case you will kill your patient, in the best you will not make their situation worse. If you're a bridge builder and you make a mistake during construction, in the worst case the bridge will collapse, killing people; in the best case the bridge will be usable only after adding some additional support to it. If you're a lawyer and you fail to properly handle your case, in the worst scenario you'll lose it, along with millions of €/$ in compensation or lost profits, or you'll cause unjustified damage to someone; in the best, well, you can only hope the fallout will be low.

In all those cases the outside world will know that you did something wrong. Not in software development. If you produce crappy software that actually works and fulfils customers' needs, then the outside world will be happy, ...for now. What they won't realize is that you have also produced a problem for everybody. The next time business asks for a feature, they'll be unhappy because it will take more time than necessary, and developers will have to work with hard-to-read, hard-to-maintain code.

I admit, we are still, as an industry, struggling to define how to measure code quality. Code coverage is a very bad measure. Counting the places where the DRY principle is violated would be a nice one, easy to automate (it is in fact already automated, see PMD for example), but still far from enough. Something that measures how well the SOLID principles are obeyed may even be impossible to implement; for the Open/Closed principle, for example, it would be necessary to know if and where such openness is desired or necessary. Unfortunately, this lack of a generally accepted way of assessing software quality leads to crappy code.

But we will have to define industry standards, and soon. The reason is that more and more things depend on software. Robert C. Martin writes about this on his blog. We have also reached a point in time where software bugs cost more and more, including people's lives, for example the issues with the braking system in Toyota cars or the Therac-25 radiation therapy machine. If not killing, then causing serious damage, from sending companies bankrupt, as in the Knight Capital Group case, to losing the Mars Climate Orbiter. For more, just search the web for "famous software bugs".

Of course, for most developers, making a mistake will not have such dramatic consequences. Maybe the application will just crash every now and then, maybe a customer will not be able to place an order for half a day, or maybe a single link on a page will not work properly. But, unfortunately, most of the time the errors we make will be completely invisible to the outside world; the systems will be working correctly, probably with good enough performance, maybe even with good security. The problem will manifest only when someone wants to change the code. So-called technical debt will be killing productivity: someone will need to understand unreadable code, adapt the code in all those places that seem completely unrelated yet apparently need to be changed, rework half of the tests that now started failing for no obvious reason (the better(?) case) or write the tests from the ground up (the worse(?) case). You get the picture.

What I really don't understand is: why are we doing this to ourselves? It's us, developers, who need to work with crappy code on a daily basis. Business gets what they need; maybe after delays, maybe it's a bit cumbersome to use the system, but it's working for them, they're happy. What about us? Are we happy working with our "big balls of mud", "spaghetti code", "pieces of crap" or whatever you call your code? Wouldn't it be better if we could work with clean code, simple and elegant, highly cohesive, with concepts clearly separated from one another? Apparently not; I've yet to find such a system. Maybe I really need to start reading code that someone else wrote, just to see how others are doing things. I know this is a good learning technique.

I know, software development is far from easy, and sometimes code gets messy because of a lack of skill. In my career, I learned about the ideas of clean code and code quality only a few years after I finished university. What's more, I remember attending a meetup where the software craftsmanship idea was presented and, at that time, I didn't even understand it. Recently I looked through the courses my university offers to see if there's something related to code quality, but found nothing. Honestly, there actually should be nothing: quality should be built into every course that has something to do with code! I was happy to find a few days ago, by accident, that there's an online course on Coursera about crafting quality code. I haven't checked it, so I cannot tell how good it is, but it's there. Apparently it's not only people or companies that make money out of it who offer such courses. And I certainly don't mean they're doing a bad thing by earning money on this!

It's all in our, developers', hands though; we have the power to change all that. Not only will we help our businesses move faster in an ever-changing world, but we will also make our lives much more pleasant and fun. All we need to do is finally take responsibility and ownership of what we're doing. Every time we're asked to do something, let's do it properly, the best we can, not necessarily the fastest, even under pressure. We're hired as professionals and experts (and are paid quite a lot of money), so we need to start behaving accordingly. Business expects us to bring good solutions, even if they cannot recognize their quality immediately. This requires a bit of work on our side, but it's a win-win scenario. Don't you want that?

BTW, this post was simultaneously posted on two blogs: Zooplus Tech Blog and my private one. Please feel free to go to the other one and check it out!

Tuesday, 16 August 2016

Writing good test scenarios

Some time ago, I posted part one of "writing good tests" on my current company's blog. Now it's time to continue.

Last time I barely scratched the surface of writing test scenarios. I briefly gave only a few hints like "avoid unnecessary information", "don't include implementation details" or "don't repeat yourself in different scenarios". This time I'd like to dig deeper and present the results of discussions we had in the team.

Note: As I mentioned last time, we develop our applications the Behaviour Driven Development way, using Cucumber. I sometimes refer to how scenarios are written for Cucumber in the Gherkin format, so please make sure you have at least a rough idea about it.

Purpose


Scenarios serve two purposes:
  • feature specification that help clarify what exactly needs to be done and
  • base for executing automatic acceptance tests.
To fulfil those purposes, they need to be understandable by both business and developers.

Features orientation


Acceptance tests are about features and functionality, so scenarios should abstract away all details of how those features are implemented. In particular, they shouldn't mention anything about applications or components, or use any other technical terms. Scenarios should be written in business language, using terms from the business domain. The reason is: restructuring or refactoring the code, or moving functionality between modules or applications, has no influence on the features themselves, so the scenarios shouldn't change.

Example: something like "when application is executed" should be replaced with the desired functionality like "when mails are sent".

The implementation of the test steps, on the other hand, has to be and is implementation-dependent. The implementation of each step should do exactly what the step actually says. In cases where this is not possible, a step can be implemented in some other way that still ensures the proper behaviour of the system.

Example: for legacy code it might be difficult to just call some service, and the only way to verify the results is to do something else, like simply running the application. That's OK (for now), as long as that is the closest place to verify the desired behaviour.

How to write scenarios


The idea is pretty simple. Scenarios should be written in a top-down manner: they should start with very high-level features and later on add more and more details, defining higher-level concepts or terms using lower-level ones. The scope of each scenario should be as small as possible; it should refer to only one feature or requirement and mention only the things important to or influencing that particular functionality. Each scenario should define exactly one term and can use other terms to do so. If those other terms are not defined yet, each of them will later require a scenario or a dictionary entry. It's important to make sure that each scenario contains exactly one definition.

Example: Given a customer
And the customer selects an article
And the customer enters a desired quantity of that article
When the customer adds the article to the cart
Then the cart contains the article with given quantity

This scenario is about adding articles to the cart in some e-shop. It defines the term "customer adds the article to the cart" by specifying what needs to happen before and the desired outcome. To do that, the scenario uses other terms like "customer selects an article" or "cart contains the article with given quantity". Those terms require scenarios of their own.

Dictionary


As mentioned previously, some terms might be defined in a dictionary rather than in a scenario. A dictionary is just a list of terms and their textual definitions or descriptions (they're not meant for test automation, so they don't need to conform to any rules like the Gherkin format; they just clarify things). A decision needs to be made for each term on a case-by-case basis. By default, terms should have their defining scenarios. But if:
  • a term refers to the outside world (for example something defined in another part of the system) or
  • it's common knowledge (volume = length * width * height),
then the term can be described in the dictionary.


Implementation details abstraction


As mentioned before, all implementation details need to be abstracted away from scenarios. It's not part of a business requirement that, for example, something should be saved to the database. If the implementation team decided to use different technical means of achieving the same goal, like persistence, then the test based on the scenario would turn "red", although the business goal would actually be achieved. Abstracting such details gives the team much more freedom to solve technical problems as they see fit, and to change the implementation if necessary, while still making sure that the features are working.

Of course, sometimes the requirement is to save something to the database. Such requirements should not come from "business", though; they might come from other, "technical" teams, like operations. For purely technical modules, those DB details will actually be THE thing that needs to be verified. In all those cases, such technical details can and must be mentioned in the scenarios themselves.

Exhaustiveness


Scenarios should be MECE (mutually exclusive, collectively exhaustive): they need to cover all business requirements, including seemingly obvious ones. There should always be test scenarios for business error cases, meaning cases where something is expected to go wrong; such scenarios then need to specify what should happen.

Simplicity


Scenarios should be as simple as possible. Any unnecessary information or data makes tests more difficult to understand and obscures their purpose. Test scenarios also serve as a specification, so if something doesn't influence the feature, don't mention it.

Completeness


A single term should be defined by exactly one scenario. All exceptional cases should be covered together with the "happy path" in one scenario. Use the "Examples" clause to lay out all the cases there are for a given feature, as in the sketch below.
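For instance (a hypothetical Gherkin sketch of mine, continuing the cart example from above), a scenario outline can cover the happy path and the error case of one feature together:

Scenario Outline: adding an article to the cart respects stock availability
  Given a customer
  And an article with <stock> items in stock
  When the customer adds <quantity> of the article to the cart
  Then the cart contains <in cart> of the article

  Examples:
    | stock | quantity | in cart |
    | 10    | 2        | 2       |
    | 1     | 5        | 0       |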

Precision


Scenarios should use exactly the same wording and phrases for describing the same conditions or situations. This reduces confusion of the kind "are we talking about the same thing, or is it something different?" and also makes the test implementation easier. Be consistent about names and abbreviations, and always write them the same way (upper/lower case, written together or separately). The scenario summary (the "Scenario" / "Scenario Outline" lines) should use the full form; steps can use abbreviations.

Other useful hints


Test scenarios should be completely separated from any names or identifiers used in production. So "country 1" is preferred over "Germany". This gives the additional benefit of making sure that no such name or id is hardcoded in the code. It also ensures that the feature is generic and can be applied to another entity (so no fixed feature for, say, the country Germany, but a generic feature that can be enabled for Germany).

Similarly to the previous advice, hardcoded numbers or values should be avoided, because they usually obscure the real purpose or condition. It's better to replace something like "given operation takes more than 30 seconds" with "given operation times out". This makes test scenarios independent of configuration parameters; repeating the concrete value everywhere would also violate the DRY principle (don't repeat yourself).

All test scenarios for one feature should be kept together, in one file if feasible. Otherwise it's a violation of the single responsibility principle; things related to each other should be kept together.

Scenarios should be written in such a way that they can be read "in English" (or whatever language you're using); technical forms should be avoided if possible. For example, "user adds article to cart" is better than "user clicks 'add' button".

The scenario summary should be both precise and concise. One not-too-long line should be the target size.

Use passive voice in scenario summaries. "User is logged in after providing valid credentials" is probably easier to read than something like "when user provides valid credentials then he/she is logged in".

The scenario summary should describe the general property; steps can be more concrete. So a summary could say "cheapest option is selected out of all possibilities", because this is exactly how the feature should work, no matter how many possibilities there are. In the steps, mentioning two options and showing the behaviour is perfectly OK. Steps are there to clarify and give examples, but the summary should be 100% precise.

Summary


As mentioned in the beginning, the hints above are the results of many discussions we had in the team. We have found it difficult to write those scenarios; sometimes it took a similar amount of time to actually implementing the features. That's why I'm sharing this: maybe someone else can use it and avoid lengthy (and heated) discussions.

I do not pretend that this is the only way to do it. It works for us™, YMMV. I'd be extremely happy to also get some hints from others, so if you'd like to add something here, or you disagree with what I've written above, feel free to leave a comment.

BTW, this post was simultaneously posted on two blogs: Zooplus Tech Blog and my private one. Please feel free to go to the other one and check it out!

Monday, 4 July 2016

Microservices - a solution or a problem?

Microservices seem to be everywhere: everybody is talking about them, writing and consuming them. For a good reason, as they can solve a lot of problems, like scaling, high availability, independent developability, etc. But are they really THE solution to our problems? Do we always remember their costs?

Integration

One of the good things about microservices is that they're simple. Each one should be easy to understand, even by a new person joining a team. I like the idea, coming from DDD, that one microservice should match one bounded context (any other definition, based on lines of code or the time required to rewrite the service, I find artificial and probably leading to too much fragmentation, and thus increased complexity). But if our systems were complex before and are simple now, where did the complexity go?

Into the integration of all the services into a bigger system, of course. The problem is that, whereas we know quite well how to integrate different parts of a system within one executable, integrating myriads of small services is something we're still learning. One important topic is: how do you follow an execution path? How do you find out who might be affected if you change this one tiny class? Of course there are tools for managing complex integration scenarios, but does the cost of introducing them justify going with microservices?

Internal complexity

Previously, calling another part of the system was easy: you just did it, after importing the other JAR, DLL or gem. Now it's not that simple; you need to know the endpoint, queue or other communication gateway of the other service. Then you probably need to take care of any issues with that other service: what to do in case of timeouts or other connectivity problems. We didn't have such problems before, because it was the same executable. Also, until recently we were trying to get rid of all the code that wasn't business-related. Now every microservice requires its own configuration, wrapping into REST services, and mappers between external and internal data structures.
I'm not saying that we didn't have those problems before, we did, but I have a feeling that more and more of what we do now is not about business features, but just about keeping our applications running.
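To see the difference in weight, here is a sketch of such a call using Java 11's built-in HttpClient (the endpoint, timeouts and fallback values are invented): what used to be a plain method call now carries an endpoint, two timeouts and a failure strategy.

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class StockClient {

    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))   // connectivity is now our problem
            .build();

    // In the monolith this was: inventory.stockOf(articleId);
    // As a remote call it needs an endpoint, timeouts and a fallback.
    int stockOf(long articleId) {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://inventory-service/articles/" + articleId + "/stock"))
                .timeout(Duration.ofSeconds(1))      // per-request timeout
                .GET()
                .build();
        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() != 200) {
                return 0;                            // fallback: assume nothing in stock
            }
            return Integer.parseInt(response.body().trim());
        } catch (IOException e) {
            return 0;                                // degrade instead of failing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();      // restore the interrupt flag
            return 0;
        }
    }
}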

Testing

Testing a single microservice is a dream come true. But how do you test your service's integration with the others? If your service depends on 3 other microservices, how do you set up your test environment? Using the real services might be difficult if they also have dependencies of their own. Are other teams providing "mock" implementations that behave like the real ones, but have no dependencies or requirements themselves? That takes effort, and it also creates the risk that those "mocks" will not behave the same way in some corner cases.
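One common way to get such a dependency-free stand-in (a sketch using WireMock; the endpoint and payload are invented) is to stub the neighbouring service's HTTP interface in the test setup:

import static com.github.tomakehurst.wiremock.client.WireMock.aResponse;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.urlEqualTo;

import com.github.tomakehurst.wiremock.WireMockServer;

class InventoryStub {

    public static void main(String[] args) {
        // Stands in for the real inventory service: same HTTP interface,
        // no dependencies of its own. The risk from above remains: it may
        // not behave like the real service in corner cases.
        WireMockServer inventory = new WireMockServer(8089);
        inventory.start();
        inventory.stubFor(get(urlEqualTo("/articles/42/stock"))
                .willReturn(aResponse()
                        .withStatus(200)
                        .withHeader("Content-Type", "text/plain")
                        .withBody("3")));
        // Point the service under test at http://localhost:8089 and run the tests.
    }
}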

Middle ground

My team is currently trying a different approach. For our product we have created quite a few small modules with clearly defined dependencies (we follow the clean architecture approach) that we later combine into a larger "monolith". Each module serves a single purpose and might be viewed as a "microservice", but all of them are still packaged into a bigger executable and deployed as such. Each module has its own release cycle, but only the final executable module, whose sole responsibility is putting together all the modules in specific versions into a working application, is deployed when desired. I admit, we have only just started with this approach, but it has already proved very flexible when we were refactoring and moving code around. Of course, this does not solve all the issues you might run into with microservices, and it also brings back some problems from the past, but currently it works for us.
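The final executable module can then stay as dull as possible. A hypothetical sketch of such a composition root (the module and interface names are mine, not our actual code):

// Hypothetical composition root: the deployable module's only job is to
// assemble independently released modules into one working application.
class ShopApplication {

    interface Catalog { String describe(long articleId); }
    interface Checkout { void order(long articleId, int quantity); }

    public static void main(String[] args) {
        Catalog catalog = id -> "article " + id;   // would come from, say, module catalog-1.4.2
        Checkout checkout = (id, quantity) -> { }; // would come from, say, module checkout-2.0.1
        // ... wire the web layer, persistence, etc., then start the application
        System.out.println(catalog.describe(42L));
        checkout.order(42L, 1);
    }
}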

Thursday, 23 July 2009

More on MyFaces and ContentType

I've dug some more, debugged some more, and know a little more. The whole problem with Apache MyFaces and the ContentType comes from the awful idea that there is more than one moment when, for a single request, the resulting ContentType is evaluated.

MyFaces looks at the Accept HTTP header, which, in my case, states text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8. MyFaces goes through that list and picks the first entry it understands, in my case text/html. It doesn't really matter that the next entry, application/xhtml+xml, is what I would really want; text/html came first.
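The behaviour described above boils down to roughly this (my own reconstruction for illustration, not the actual MyFaces source): walk the Accept list in order and take the first supported entry, ignoring what the client would actually prefer.

import java.util.List;

class NaiveContentNegotiation {

    static final List<String> SUPPORTED = List.of("text/html", "application/xhtml+xml");

    // First supported entry wins; ";q=..." preferences are simply dropped.
    static String pick(String acceptHeader) {
        for (String entry : acceptHeader.split(",")) {
            String mediaType = entry.split(";")[0].trim();
            if (SUPPORTED.contains(mediaType)) {
                return mediaType;
            }
        }
        return "text/html"; // fallback
    }

    public static void main(String[] args) {
        // With the header from this post, text/html wins, although
        // application/xhtml+xml would be just as acceptable to the client.
        System.out.println(pick("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"));
    }
}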

But the JSP mechanism, when it sees something like
<jsp:directive.page contentType="application/xhtml+xml; charset=UTF-8"/>
does as it is instructed and sets the ContentType to application/xhtml+xml.

I don't remember now which one comes first, MyFaces or JSP, but something is definitely wrong here. I would like to assume that when one of them chooses the ContentType, the other respects that, or, if the other one changes it, the first one won't be stubborn and force its choice. The second solution could be hard to implement, so the first one is the way to go. I would expect MyFaces, before it selects the ContentType, to try to get that information from the response it was given. And that is what I will probably file as a bug.