Hotspots

Hotspots are the workhorse of software analyses and our recommended starting point as you explore your codebase.

What is a Hotspot?

Most development activity tends to be located in relatively few modules. A Hotspot analysis helps you identify those modules where you spend most of your development time. Use this information to improve the parts that really matter: the parts where improvements are likely to give you a return on investment.

Explore the Hotspot Activity

CodeScene lets you explore the overall Hotspot activity in your code. These Hotspots are calculated from two different data sources:

  1. We use the lines of code in each file as a proxy for complexity.
  2. We use the change frequency of each file as a proxy for the effort you’ve spent on that code.

You want to look for an overlap between the two metrics. That’s why CodeScene presents an easy-to-explore, interactive visualization of your hotspots. Fig. 49 shows an example from the Visual Studio Code codebase.

Hotspots explained

Fig. 49 Hotspots in the Visual Studio Code codebase.
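
The overlap between the two data sources can be illustrated with a small Python sketch. It is not CodeScene’s implementation: the function names, the sample data, and the choice to multiply the two normalized metrics into one score are assumptions made only to show the idea of combining size with change frequency.

```python
from collections import Counter


def change_frequencies(commits):
    """Count how many commits touched each file.

    `commits` is a list of commits, each given as a list of file paths.
    """
    freq = Counter()
    for files in commits:
        freq.update(set(files))  # count a file at most once per commit
    return freq


def rank_hotspots(commits, lines_of_code):
    """Rank files by the overlap of change frequency and size.

    Multiplying the two normalized metrics is an illustrative choice,
    not CodeScene's actual formula.
    """
    freq = change_frequencies(commits)
    max_freq = max(freq.values(), default=1) or 1
    max_loc = max(lines_of_code.values(), default=1) or 1
    scores = {
        path: (freq.get(path, 0) / max_freq) * (loc / max_loc)
        for path, loc in lines_of_code.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: files touched per commit, and file sizes.
commits = [
    ["src/editor.ts", "src/util.ts"],
    ["src/editor.ts"],
    ["src/editor.ts", "README.md"],
    ["src/util.ts"],
]
lines_of_code = {"src/editor.ts": 2400, "src/util.ts": 300, "README.md": 80}

ranking = rank_hotspots(commits, lines_of_code)
for path, score in ranking:
    print(f"{path}: {score:.3f}")
```

In this sample, src/editor.ts ranks first because it is both the largest file and the one changed most often, which is exactly the overlap you look for in the visualization.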

The Hotspot visualization makes it easy to identify the parts of your code where most development effort is spent. In a larger codebase you want to let CodeScene identify your refactoring targets. Let’s see how that’s done.

Focus on your Refactoring Targets

To prioritize your hotspots, CodeScene employs algorithms that look at deeper change patterns in the analysis data. The rationale is that complicated code that changes often is more of a problem if:

  1. The hotspot has to be changed together with several other modules.
  2. The hotspot affects many different developers on different teams.
  3. The hotspot is likely to be a coordination bottleneck for multiple developers.

This algorithm allows CodeScene to rank and prioritize the hotspots in your codebase as illustrated in Fig. 50.

Hotspots main suspects

Fig. 50 CodeScene prioritizes the Hotspots in your code.
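
The three criteria listed above can be approximated from version-control data. The sketch below is only an assumption about how such signals might be gathered; CodeScene’s actual prioritization algorithm is not described here, and the function name and sample commits are hypothetical.

```python
from collections import Counter


def coordination_signals(commits, hotspot):
    """Collect simple coordination signals for one hotspot.

    `commits` is a list of (author, files) pairs. The result holds the
    number of distinct authors who changed the hotspot and a count of how
    often each other file changed in the same commit as the hotspot.
    """
    authors = set()
    co_changed = Counter()
    for author, files in commits:
        unique_files = set(files)
        if hotspot in unique_files:
            authors.add(author)
            co_changed.update(unique_files - {hotspot})
    return {"distinct_authors": len(authors), "co_changed": dict(co_changed)}


# Hypothetical commit history.
commits = [
    ("alice", ["api.py", "db.py"]),
    ("bob", ["api.py", "db.py", "ui.py"]),
    ("carol", ["api.py"]),
    ("alice", ["ui.py"]),
]

signals = coordination_signals(commits, "api.py")
print(signals)
```

A hotspot with many distinct authors and many co-changing files is a stronger candidate for attention than one that a single developer changes in isolation.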

The red hotspots are the ones you want to focus your attention on; if there are any technical debt or code quality issues in the red hotspots, then improvements to those parts are likely to give you a real return on your investment.

Use the hotspot’s biomarker score to get a quick assessment of potential technical debt or maintenance problems as shown in Fig. 51. We talk more about biomarkers in the next section.

Hotspot with biomarker

Fig. 51 CodeScene prioritizes the Hotspots in your code.

Once you’ve addressed those hotspots, the yellow hotspots become interesting as well. A yellow hotspot is a secondary investigative target, not as severe as the red category. Now, let’s explore the biomarkers calculated for each hotspot.

Shrink the Problem Space with Main Suspects

The ranked hotspots presented as Refactoring Targets are based on probabilities; we cannot guarantee that the code represents a true problem, but it’s likely to be one. And, best of all, that data is based on how your developers have worked with the system so far.

Hence, CodeScene includes a virtual code reviewer for any file. The virtual code review will aggregate the most significant metrics for your chosen file, as seen in Fig. 52.

The virtual code reviewer

Fig. 52 Get a holistic overview of your hotspot.

Using CodeScene’s Code Health (see Code Health – How easy is your code to maintain and evolve?), you’re also able to get a quick classification of possible maintenance issues:

Code Health of hotspots.

Fig. 53 View the Code Health trends for your hotspots.

The main advantage of using hotspots to guide improvements is that you’re able to narrow down refactorings to a small part of the system. That in turn will give you more time to tackle larger issues once you’ve made these initial improvements.

Use Defects to put Costs on Hotspots

When you come across hotspots with severe maintenance issues, there’s always going to be a trade-off: do we pay off the worst technical debt or should we continue to shoehorn yet another feature into the hotspot? Ideally, we would like to know for sure that if we invest, say, two weeks into refactoring the code, then that effort will pay off immediately. At the time of writing, there’s unfortunately no way of looking into the future. What we can do instead is to look at the existing costs and consequences of not doing any preventive and proactive code improvements.

For this purpose, CodeScene comes with a Defect Density view. Since most organizations have a known or estimated figure for how much a defect costs, let’s use defects to predict the costs of any sub-optimal code we might find in our hotspots. Fig. 54 shows an example from CodeScene’s dashboard.

Hotspot defect statistics

Fig. 54 CodeScene’s dashboard displays the bug density of the prioritized hotspots.

The statistics on the dashboard tell us the following things about the development costs in our codebase:

  • The prioritized hotspots only make up 5.5% of the total codebase, yet
  • we spend 17.6% of our development efforts in those hotspots, and
  • 23% of all bugs that we detect and fix are in that small part of the code.

Before we move on, let’s point out that 23% is actually on the lower end; in most codebases, the top hotspots will be responsible for an even larger percentage of all fixed defects. This has direct implications for the costs of the whole system.
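
To make the dashboard figures concrete, the following Python sketch relates the share of effort and defects to the share of code. The three percentages come from the example above; the total number of defects and the cost per defect are hypothetical placeholders that you would replace with your organization’s own figures.

```python
def concentration(share_of_total, share_of_code):
    """How over-represented a measure is relative to the code's size share."""
    return share_of_total / share_of_code


# Percentages from the dashboard example above.
code_share = 5.5      # % of the codebase made up by prioritized hotspots
effort_share = 17.6   # % of development effort spent in those hotspots
defect_share = 23.0   # % of fixed bugs located in those hotspots

effort_ratio = concentration(effort_share, code_share)
defect_ratio = concentration(defect_share, code_share)

# Hypothetical inputs: replace with your own organization's numbers.
total_defects = 200       # defects fixed during the analysis period
cost_per_defect = 1000    # estimated cost of one defect, in any currency

hotspot_defect_cost = defect_share / 100 * total_defects * cost_per_defect

print(f"Effort concentration: {effort_ratio:.2f}x")
print(f"Defect concentration: {defect_ratio:.2f}x")
print(f"Estimated defect cost in hotspots: {hotspot_defect_cost:.0f}")
```

With these inputs, the hotspots attract roughly 3.2 times the effort and roughly 4.2 times the defects you would expect from their size alone.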

CodeScene’s Defect Density view shows how distributed our bug fixes are, which lets you correlate defects with hotspots as shown in Fig. 55.

Hotspot defect density

Fig. 55 Correlate prioritized hotspots with the distribution of defects in the codebase.

CodeScene also lets you inspect the distribution of those defects over time, as shown in Fig. 56.

You use this information to detect if the hotspot seems to stabilize in terms of defects or if it’s likely to be a growing problem.

Specify the Data Source for Defect Statistics

CodeScene needs a data source for its defect mining, and provides two different options depending on what data you have:

  1. Use Jira issues to identify defects: This requires that you integrate Jira with CodeScene. CodeScene will then use all issues identified as defects for its statistics. Specify this option in CodeScene’s Jira configuration as described in /integrations/integrate-jira.
  2. Use commit message patterns to estimate defects: If you have specific tags in your commit messages that can be used to identify defects, then this is a good option. As a fallback, CodeScene can use a heuristic, which you can override with more specific patterns for higher precision, as shown in Fig. 57.

Hotspot defect pattern configuration

Fig. 57 Configure a pattern to match defect information in your commit messages.
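
As an illustration of the commit message approach, the Python sketch below matches commit messages against a regular expression and counts defect-related commits per file. The pattern shown is a hypothetical example, not CodeScene’s built-in heuristic or configuration syntax; adapt it to the tags your team actually uses.

```python
import re
from collections import Counter

# Hypothetical pattern: matches whole words such as "fix", "fixes",
# "fixed", "bug", or "defect", regardless of case.
DEFECT_PATTERN = re.compile(r"\b(fix(es|ed)?|bug|defect)\b", re.IGNORECASE)


def is_defect_commit(message):
    """Return True if the commit message looks like a defect fix."""
    return DEFECT_PATTERN.search(message) is not None


def defects_per_file(commits):
    """Count defect-related commits per file.

    `commits` is a list of (message, files) pairs.
    """
    counts = Counter()
    for message, files in commits:
        if is_defect_commit(message):
            counts.update(set(files))
    return counts


# Hypothetical commit history.
commits = [
    ("Fix null pointer in parser", ["parser.py"]),
    ("Add export feature", ["export.py", "parser.py"]),
    ("BUG-123: handle empty input", ["parser.py", "cli.py"]),
    ("Refactor prefix handling", ["cli.py"]),
]

defect_counts = defects_per_file(commits)
print(defect_counts.most_common())
```

The word boundaries in the pattern keep a message like "Refactor prefix handling" from matching on the "fix" inside "prefix", which shows why a more specific pattern gives higher precision than a loose one.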

Dive into your Hotspots

A large codebase may contain many different hotspots. You will also notice clusters of hotspots, which may indicate that a whole component or package is undergoing heavy changes.

The Hotspots Activity map in CodeScene lets you explore your whole codebase interactively as illustrated in Fig. 58.

The hotspots map

Fig. 58 Hotspots show you the activity in your codebase.

The hotspots map is interactive and hierarchical; each large blue circle represents a folder in your codebase. That means you can zoom in and out to the level of detail you’re interested in:

  • Click on one of the large, blue circles representing a directory to zoom in on its content.
  • Click on a Hotspot to view information about it and to access its context menu to run detailed analyses.
  • Click outside the circle representing a zoomed in folder to zoom out again.
  • Hover the mouse over a circle to see information about the module it represents.

The most common interaction is to click on a Hotspot to get more details about it as illustrated in Fig. 59.

The hotspots context menu

Fig. 59 Click on a Hotspot to access the context menu.

Use the context menu to access the code for inspection, run CodeScene’s X-Ray (see X-Ray), investigate trends (see Complexity Trends) and contributors (see Parallel Development and Code Fragmentation).

CodeScene’s hotspot view also lets you view different aspects of your system, as illustrated in Fig. 60.

The hotspot aspects

Fig. 60 Switch between different aspects in the hotspot view.

Just click on an aspect to view its data. For example, Fig. 61 shows the distribution of programming languages used in the implementation of a system.

The programming language aspects

Fig. 61 The programming language aspect shows the technical sprawl in your codebase.

Use Code Churn as an Alternative Hotspot Metric

Another interesting aspect is Code Churn. By default, CodeScene uses the commit frequency of each file as the Hotspot criterion; the more changes you’ve made to a file, the higher its change frequency. This default criterion is supported by several findings from academic research; change alone is the single most important metric when it comes to quality issues in code. However, there are some rare cases when this metric becomes biased. One reason is large individual differences in commit style.

Relative Code Churn is an alternative hotspot metric that calculates the amount of change in each file in terms of Lines of Code. It’s a relative metric since the churn is weighted against the total size of the code in each file.
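
A minimal Python sketch of the idea follows. It assumes that churn is the sum of added and deleted lines and that the weighting is a simple division by the file’s current size; the exact formula CodeScene uses is not stated here, and the file names and numbers are hypothetical.

```python
def relative_churn(added, deleted, total_lines):
    """Churned lines divided by file size (an assumed, simplified formula)."""
    if total_lines <= 0:
        return 0.0  # avoid division by zero for empty or deleted files
    return (added + deleted) / total_lines


# Hypothetical per-file totals over the analysis period.
files = {
    "core/engine.py": {"added": 900, "deleted": 600, "lines": 3000},
    "core/flags.py": {"added": 150, "deleted": 90, "lines": 120},
}

churn = {
    path: relative_churn(stats["added"], stats["deleted"], stats["lines"])
    for path, stats in files.items()
}

for path, value in sorted(churn.items(), key=lambda item: item[1], reverse=True):
    print(f"{path}: {value:.2f}")
```

In this sample, core/flags.py has far less absolute churn than core/engine.py, yet it ranks higher on the relative metric because its changes are large compared to its size.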

Let’s look at some use cases now that you know how the Hotspots analysis works.

Know how to use Hotspots

A Hotspot Map has several use cases and also serves multiple audiences like developers and testers:

  • Developers use hotspots to identify maintenance problems. Complicated code that we have to work with often is no fun. The hotspots give you information on where those parts are. Use that information to prioritize re-designs.
  • Hotspots point to code review candidates. At Empear we’re big fans of code reviews. Code reviews are also an expensive and manual process, so we want to make sure it’s time well invested. In this case, use the hotspots map to identify your code review candidates.
  • Hotspots are input to exploratory tests. A Hotspot Map is an excellent way for a skilled tester to identify parts of the codebase that seem unstable with lots of development activity. Use that information to select your starting points and focus areas for exploratory tests.

Use Hotspots in your Daily Work

How well do Hotspots work in practice? Well, it turns out there’s strong scientific support behind the metric. The research has often focused on bug predictions, which is relevant since bugs are one of the main issues behind expensive software maintenance.

The book “Your Code as a Crime Scene” (Tornhill, 2015) dives deeper into those research findings to explain why and how Hotspots work. But let’s just summarize the conclusions in one line: There’s a strong correlation between Hotspots, maintenance costs and software defects. Hotspots are an excellent starting point if you want to find your productivity bottlenecks in code.

That means you want to take your Hotspots seriously. Our recommendation is to run a Hotspot analysis at least once a week. It’s also a good idea to share your findings with your team. Why not gather everyone around a Hotspot Map every now and then?