No Data or Ambiguous Requirements? How Can BDD Save Your Project and Your Sanity

Piotr Gayczak is a Senior Software Engineer at Godel who recently took part in this year’s DevConf in Lodz. As a passionate Data Scientist with experience in the implementation of Machine Learning, AI algorithms, and Big Data Ecosystems, Piotr’s talk expanded on the topic “No Data or Ambiguous Requirements? – BDD Can Save Your Project and Your Sanity”. He also ran a discussion, where he answered some tough questions about specific areas of BDD.

What is BDD?

Firstly, it would be a good idea to define what BDD is. Behaviour Driven Development is an extension of Test-Driven Development that makes use of a simple domain-specific language and structure. It means that we use natural language to explain WHAT our code should do, not how it actually does it.

The whole point of this approach is to keep the focus on behaviour. The tests we write using this approach should be easily understandable for everyone in the project: agile coordinators, business analysts, clients, sponsors, and developers. While it is easy for us developers to quickly jump into architecting the solution in our minds, our end-users or consumers might struggle to understand us. Their technical knowledge may be limited, and they are neither responsible for nor expected to know all that. That’s why they hire us, the experts, the mighty tech wizards.

The process of collaborative BDD helps both sides understand what our code is doing. A trivial example: our client wants an app that does multiplication for their business. They do not need to know that, for us developers, it is the easiest thing in the world.

Can you go more in-depth into the world of BDD?

BDD has a simple core structure to follow. Features are the top level; these are the actual features to be implemented in the code. For each Feature, we can have a set of Scenarios, which are our behaviours. Each Scenario follows the same simple path: Given some context, When we act, Then we expect some results.
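The Feature, Scenario and Given/When/Then structure can be sketched in plain Python, using the multiplication example mentioned earlier. This is only an illustration, not the code from the project: the scenario wording, the multiply function and the numbers are all assumptions, and a real project would usually keep the scenario text in a feature file read by a BDD tool rather than in comments.

```python
# Feature: Multiplication for the client's business (hypothetical example)
#   Scenario: Multiplying two whole numbers
#     Given the numbers 6 and 7
#     When the client multiplies them
#     Then the result is 42


def multiply(a, b):
    """Hypothetical feature under test."""
    return a * b


def test_multiplying_two_whole_numbers():
    # Given some context: the two numbers the client provides
    first, second = 6, 7

    # When we act: the client multiplies them
    result = multiply(first, second)

    # Then we expect some result
    assert result == 42


# Run the scenario directly; a test runner would normally discover it instead.
test_multiplying_two_whole_numbers()
```

The comments at the top are what a business analyst or client would read and agree on; the function below them is the developer's translation of the same three steps.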

What was the project you were working on?

I was part of the team building an ELT process for an international company providing analytics and process support, mainly for big retail clients. Our project was part of a bigger product that revolved around marketing campaigns and analytics in that field. We focused on the data backend: Load and Transform. The goal was to get data from multiple data sources, internal and external, and calculate metrics (KPIs) that are later used by businesses to make data-driven decisions.

The system needed to be resilient and scalable as more users would be able to join in the future, but we didn’t have any control over external data quality. We worked with the following technologies: Cloudera Hadoop Distribution, Apache Spark, Apache Hive, Apache Airflow, Python, PostgreSQL and GitLab. Everything was done on-premises. Hadoop and Hive were used to create the Data Mart, Spark was used for data processing and transformation, and the end of the data pipeline was in PostgreSQL.

Did you face any business challenges along the way?

We did face a few issues and challenges. But like most challenges, they helped us make decisions about our cooperation and testing strategy, so that we could avoid tedious rework and constant changes. Here are the main challenges we faced:

  • Very high-level requirements
  • Lack of meaningful data examples
  • Ambiguous metrics definitions
  • Constant inflow of changes and new requirements
  • Dependency on other teams
  • Legacy code

To expand on the points mentioned above with an analogy: the client wanted a car, but could not answer any questions about specifics and had no money. This was a challenging situation to work with, as there was no real example of available data, and the requirements were very general (e.g., I would like a car) without any business specifics or logic explained (e.g., we don’t know what we need). On top of that, there was no data to work with (I have no money to buy a car). The ultimate challenge was technical, and it led us to teach our clients about data-driven projects.
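One way to picture how a scenario with hand-written examples can pin down an ambiguous metric when no real data exists is the sketch below. Everything in it is an assumption made for illustration: the conversion rate metric, its formula, the rule for zero visits and the example rows are invented here and are not the project's actual KPIs or data.

```python
from typing import Optional

# Scenario: Conversion rate for a campaign (hypothetical metric)
#   Given a campaign with a number of visits and purchases
#   When the conversion rate is calculated
#   Then it equals purchases divided by visits
#   And it is undefined when there were no visits


def conversion_rate(visits: int, purchases: int) -> Optional[float]:
    """Hypothetical KPI: share of visits that ended in a purchase."""
    if visits == 0:
        # Assumed business rule: no visits means the metric is undefined.
        return None
    return purchases / visits


# Hand-written example rows standing in for the missing real data.
EXAMPLES = [
    {"campaign": "A", "visits": 200, "purchases": 10, "expected": 0.05},
    {"campaign": "B", "visits": 0, "purchases": 0, "expected": None},
]


def check_examples():
    """Return the campaigns whose calculated value differs from the agreed one."""
    failures = []
    for row in EXAMPLES:
        actual = conversion_rate(row["visits"], row["purchases"])
        if actual != row["expected"]:
            failures.append(row["campaign"])
    return failures
```

The example rows are the part worth discussing with the client: each one forces a decision (here, what happens with zero visits) that a high-level requirement leaves open.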

What improvements did you achieve once you implemented BDD?

The core improvements BDD helped us gain are as follows:

  • Clear goals and focus for each sprint
  • Delivery expectations clear to us and the client
  • Fewer surprises and changes
  • Working on confirmed, clarified, and prioritised tasks
  • Easy documentation of behaviours and inconsistencies

What are your key takeaways and lessons learnt from the project?

As with any kind of approach, there are pros and cons to following BDD. Here are some thoughts and lessons learned from our story:

  1. It is a time-consuming and communication-focused approach. There will be a lot of meetings, discussions, planning and writing down behaviours before developers can jump into actual coding. Make sure that you take this into consideration when planning your delivery and sprint backlog.
  2. Getting used to domain-specific language might be difficult at the beginning, especially in a completely unknown industry. As time goes by it gets easier and brings a lot of benefits to the general understanding of business needs and communication.
  3. It was not easy to switch from developer thinking (code recipes) to a more business-oriented approach to writing behaviours. After getting more familiar with BDD, it was much easier to write tests that are readable and understandable by everyone. Developers, too, can struggle to understand what the business wants.
  4. BDD is great for solving many non-code-related issues, but it was also a great solution for mitigating the impact of bugs and making sure newly added features do not break anything down the line. The uncertainties from the beginning of the project were soon gone, helping us to increase development speed and accuracy.

Conclusion

Firstly, while Behaviour Driven Development (BDD) might seem an unnecessary complication that adds a lot of development overhead, it bears fruit in the later stages of the software lifecycle. BDD promotes transparent software with fewer bugs, a mindset necessary to build scalable, performant solutions.

Secondly, the key to implementing this framework well is to explain the reasons and benefits properly and to agree on the pros and cons of the approach. The most important point is understanding the trade-off between initial development speed and the quality of the software produced.

Finally, you can see the growing benefits of following this path every day of your work. More time dedicated to planning, organising and making your code and solution more transparent makes it more robust, thought-through and secure. BDD gives you a framework to optimise spending the most precious resource we have, time, which is the real prize.