Knowledge Distribution

Let’s face it: software development is a social activity. We work in teams, sometimes distributed, where we need to communicate and coordinate in order to solve our tasks. Building an organization responsible for creating and evolving a system is a necessity as soon as your codebase has grown beyond a certain size. It’s our way to scale and take on larger problems than we could as individuals.

But moving from individual developers to teams does not come free. No matter how efficient we are as an organization, we’ll always pay a price. The cost of teamwork is known as process loss. Process loss is the theory that a team, just like a mechanical machine, cannot operate at 100 percent efficiency. In the mechanical world we have inefficiencies like friction and heat loss. Our software equivalents are the overhead of coordination and communication. The main challenge in most software projects is to minimize the process loss. Failures to do so often come across as technical issues, when in reality those issues have social roots.

The software industry has been aware of these issues. But until now, we’ve never had a way to measure them. This is about to change. In this guide you’ll learn how CodeScene helps you uncover knowledge distribution and identify team productivity bottlenecks in your system. With the following suite of analyses you’re now able to make organizational decisions based on data from how you’ve actually worked so far.

How Do We Measure Knowledge?

The knowledge metrics are based on the amount of code each developer has contributed. CodeScene looks at the deep history of each file to calculate contributions. This makes sense for two different reasons:

  1. The last snapshot of a source code file wouldn’t be good enough, since such shallow ownership is sensitive to superficial changes (e.g. reformatting, automated renaming of variables, etc).
  2. Even if one developer completely rewrites a piece of code, its original author will still retain some knowledge in that area since they’re familiar with the problem domain. The metrics in CodeScene acknowledge that and will retain some knowledge for the original developer as well.
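
The two reasons above can be illustrated with a minimal sketch. It is not CodeScene's actual algorithm, which is not described in this guide. It assumes a simplified model where knowledge is the sum of added lines per author over a file's whole history, and the function names and the (author, path, added lines) tuples are hypothetical. In practice such tuples could be derived from something like the output of git log --numstat.

```python
from collections import defaultdict


def contributions_by_file(changes):
    """Sum added lines per author for each file across the full history.

    `changes` is an iterable of (author, path, added_lines) tuples.
    This input shape is an assumption made for the example.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for author, path, added in changes:
        totals[path][author] += added
    return totals


def main_developer(author_totals):
    """Return (author, share) for the author with the most added lines.

    Returns None when there are no added lines to attribute.
    """
    total = sum(author_totals.values())
    if total == 0:
        return None
    author = max(author_totals, key=author_totals.get)
    return author, author_totals[author] / total


# Hypothetical history: bob rewrites most of core.py after alice wrote it.
history = [
    ("alice", "src/core.py", 120),
    ("bob", "src/core.py", 200),
    ("alice", "src/core.py", 30),
    ("bob", "src/util.py", 40),
]

totals = contributions_by_file(history)
author, share = main_developer(totals["src/core.py"])
```

In this example bob becomes the main developer of src/core.py, yet alice still holds a share of the knowledge because her earlier contributions stay in the totals. A snapshot-based measure would not retain that.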

CodeScene uses the name of each committer to calculate knowledge metrics. So please make sure you understand the possible biases discussed in the guide Know the possible Biases in the Data.

Prepare the Analyses by Assigning Colors to Developers and Teams

Your knowledge maps are based on colors to give you an accessible high-level overview. Please refer to the guide in Configure Developers and Teams to prepare for the knowledge analyses.

Tip: Use the Code Churn by Author analysis results to identify the top contributors so that you can assign them colors that are as distinct as possible.

Explore the Individual Knowledge Map

The first knowledge analysis measures the knowledge distribution for individual developers in your codebase.

Each developer is assigned a color in the following visualization. The color of each file represents its main developer (that is, the developer who has contributed most of the code). You see the resulting visualization in Fig. 52.

Fig. 52 An example of a knowledge map, hovering over a circle to get more information.

All knowledge maps are interactive:

  • Click on a circle to zoom in on the corresponding package.
  • Click outside the circle to zoom out.
  • Hover the mouse over a circle to see information about the package or file it represents.

Explore your Team Knowledge Maps

CodeScene also measures knowledge distribution at the team level, and this information is usually even more valuable than the individual metrics.

As soon as you’ve assigned developers to a team, as described in Configure Developers and Teams, CodeScene will accumulate their individual knowledge into their teams. The analysis results are presented using the same principles as for the Individual Knowledge Map. Only now, each color represents a team.
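
As a rough illustration of this accumulation, the sketch below sums per-author contributions into per-team totals for a single file. The data model (added lines per author), the function names, and the "(unassigned)" bucket for developers without a team are assumptions made for the example, not CodeScene's implementation.

```python
from collections import defaultdict


def team_totals(author_totals, team_of, default="(unassigned)"):
    """Accumulate per-author contributions into per-team totals.

    `author_totals` maps author -> added lines for one file.
    `team_of` maps author -> team name. Authors without a team
    go into a hypothetical `default` bucket.
    """
    totals = defaultdict(int)
    for author, lines in author_totals.items():
        totals[team_of.get(author, default)] += lines
    return dict(totals)


def main_team(author_totals, team_of):
    """Return the team with the largest accumulated contribution."""
    totals = team_totals(author_totals, team_of)
    return max(totals, key=totals.get)


# Hypothetical contributions to a single file.
file_contributions = {"alice": 150, "bob": 200, "carol": 100}
teams = {"alice": "Platform", "bob": "Payments", "carol": "Platform"}

per_team = team_totals(file_contributions, teams)
owner = main_team(file_contributions, teams)
```

Here bob is the largest individual contributor, but the Platform team holds more of the accumulated knowledge (250 lines against 200). This is one reason the team view can tell a different story than the individual view.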

The Team Knowledge Map lets you reason about the responsibilities of the different teams. In general, you want to ensure that your team organization is reflected in the software architecture of your system. Please refer to the discussions in the guide Social Networks for more information on the organizational theories and how they correlate to the quality and efficiency of your organization.

Uncover the Knowledge Loss in your Codebase

Knowledge loss represents code that was written by a developer who is no longer part of your organization or project. You use this information to reason about the knowledge distribution in your codebase and as part of your risk management, since modifying code that nobody on the team understands any longer carries an increased risk. In addition, you can use the analysis proactively to simulate the consequences, in terms of knowledge loss, of planned organizational changes.

The Knowledge Loss analysis accumulates the contributions of all developers that you have marked as Ex-Developers in your configuration (see Configure Developers and Teams). Those parts of the codebase that are dominated by Ex-Developers are marked in red in the knowledge loss visualization. Fig. 53 shows an example from an organization where some core developers have left.

Fig. 53 An example of a knowledge loss analysis.
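
The idea behind the knowledge loss analysis, including its use for simulating planned organizational changes, can be sketched as follows. The sketch assumes that knowledge is measured as added lines per author and reports the fraction of a file's contributions that belongs to Ex-Developers. The function name and the sample data are hypothetical, and the point at which CodeScene considers a file "dominated" by Ex-Developers is not specified here.

```python
def knowledge_loss(author_totals, ex_developers):
    """Return the fraction of contributions made by ex-developers.

    `author_totals` maps author -> added lines for one file.
    `ex_developers` is a set of author names marked as having left.
    """
    total = sum(author_totals.values())
    if total == 0:
        return 0.0
    lost = sum(
        lines
        for author, lines in author_totals.items()
        if author in ex_developers
    )
    return lost / total


# Hypothetical contributions to a single file.
file_contributions = {"alice": 150, "bob": 200, "carol": 50}

# Current state: bob has left the organization.
current_loss = knowledge_loss(file_contributions, {"bob"})

# Simulation: what if carol were to leave as well?
simulated_loss = knowledge_loss(file_contributions, {"bob", "carol"})
```

Comparing the current value with the simulated one gives a simple, file-level picture of how much additional knowledge a planned change would put at risk.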