A volcano plot is a type of scatterplot that is used to quickly identify changes in large datasets composed of replicate data. In life sciences, volcano plots are often used to identify differentially expressed genes. Especially in the exploratory phase, volcano plots are a powerful tool to discuss your findings with fellow researchers and prioritize follow-up analyses.

If you have used volcano plots to present your results, you are probably very familiar with the typical situation in which someone points to a minuscule, unlabelled dot and asks: “… and this one, which gene is it?”. A short, embarrassing moment follows while you try to come up with a good answer. Depending on the geekiness level of the room, you might get out of the impasse with gene-sounding abbreviations like T-1000, R2-D2 or C-3PO. Science-fiction puns aside, this is quite emblematic of a big problem with these plots: they present an extremely large number of data points in a single, condensed view. Labelling each data point would just create a confusing cloud of overlapping names. I have found that the best way to communicate effectively is usually to make the plots interactive, exposing the relevant data only when needed (that is, when triggered by the user).

In this post, I will show you how to build an interactive volcano plot using the awesome JavaScript libraries D3 and Opentip. I will guide you step by step, all the way to the plot shown above:

  1. The data
  2. The anatomy of a volcano plot
    1. Add the SVG container
    2. Add scales and axes
    3. Add the data points
    4. Add events and interactions
  3. Enhanced UX using a Voronoi overlay
    1. Add a Voronoi overlay
    2. Optimize the Voronoi overlay
    3. Forward mouse event

The data

Before starting to draw the plot, let’s have a look at the data. The data I use in this example was published in Nature by the Netherlands Cancer Institute and is part of a study investigating metastasis in breast cancer. The raw data was generated using RNA-Seq and captures gene expression in neutrophils from wild-type (WT) mice and K14cre;Cdh1F/F;Trp53F/F (KEP) mice.

After computing the differential expression (between neutrophils in WT and KEP) using the limma package in R, I exported the results in JSON format using the package jsonlite. Each data point in our dataset is a JSON object with four fields.

The field Symbol contains the gene name, logFC the log2 fold change, adj.P.Val the adjusted p-value, and _row the Ensembl ID of the gene. In order to easily access these fields, let’s start our JavaScript with a few accessor functions. They will make our code more readable.
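As a minimal sketch, a record and its accessors could look like the following. The field names are the ones described above and the accessor names are the ones used in the tooltip code later in this post; the values in sampleRecord are placeholders invented for illustration, not real results.

```javascript
// Placeholder record: field names match the dataset, values are invented.
const sampleRecord = {
  "Symbol": "GeneA",             // gene name (hypothetical)
  "logFC": 1.5,                  // log2 fold change (hypothetical)
  "adj.P.Val": 0.001,            // adjusted p-value (hypothetical)
  "_row": "ENSMUSG00000000000"   // Ensembl ID (placeholder)
};

// Accessors: the rest of the code never touches field names directly.
const getId = d => d._row;
const getSymbol = d => d.Symbol;
const getX = d => d.logFC;
// Bracket notation is required because the key contains dots.
const getY = d => d["adj.P.Val"];
```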

In order to access our JSON data I will use the convenient d3.json function:
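How the call looks depends on the D3 version. The sketch below assumes D3 v5 or later, where d3.json returns a Promise (earlier versions take a callback instead). The file name volcano.json and the keepPlottable helper are hypothetical additions of mine: the helper drops rows that a logarithmic axis could not place.

```javascript
// Hypothetical helper: keep rows with a finite fold change and a
// strictly positive adjusted p-value (log10 of 0 is -Infinity).
function keepPlottable(rows) {
  return rows.filter(d =>
    Number.isFinite(d.logFC) &&
    Number.isFinite(d["adj.P.Val"]) &&
    d["adj.P.Val"] > 0
  );
}

// Browser-only part, skipped when D3 is not loaded.
if (typeof d3 !== "undefined") {
  d3.json("volcano.json").then(rows => {   // assumed file name
    const data = keepPlottable(rows);
    console.log("Loaded " + data.length + " genes");
  });
}
```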

Now that we know how the data is structured and how to access it, let’s start drawing our volcano plot using D3.

The anatomy of a volcano plot

The anatomy of a volcano plot is quite simple. The y-axis shows the statistical significance as a p-value on a negative logarithmic scale. In other words, the higher the dot, the lower the p-value and the higher the significance. The x-axis shows the fold change between two conditions on a linear scale (although the x values are usually log2-transformed). This means that positive values indicate an increase and negative values a decrease in expression.

Let’s start by defining the dimensions of the SVG container and appending it to the document. Notice that I add a small margin at the top and right sides so that data points do not overflow the SVG container. Larger margins are applied to the bottom and left sides to reserve the necessary space for the axis labels and ticks. I also include an axesMargin of 10 pixels to keep the x and y axes from overlapping at the origin.
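A minimal sketch of this setup follows. Only the axesMargin of 10 pixels comes from the description above; the other margin values, the overall size and the addSvg helper are assumptions chosen for illustration.

```javascript
// Illustrative values: only axesMargin = 10 is taken from the text.
const margin = { top: 10, right: 10, bottom: 50, left: 60 };
const axesMargin = 10;
const svgWidth = 800;    // hypothetical overall SVG size
const svgHeight = 600;

// Size of the plotting area once the margins are subtracted.
const plotWidth = svgWidth - margin.left - margin.right;
const plotHeight = svgHeight - margin.top - margin.bottom;

// container: a D3 selection, for example d3.select("body").
function addSvg(container) {
  return container.append("svg")
    .attr("width", svgWidth)
    .attr("height", svgHeight);
}
```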

Then, let’s define the x and y scales, and the two axes.
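To make the mapping concrete, here is a dependency-free sketch of what the two scales compute. In the actual plot, D3's own scales would do this job; the function names, domains and pixel sizes below are assumptions for illustration.

```javascript
// Linear scale: maps the domain [d0, d1] onto the range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return v => r0 + (v - d0) / (d1 - d0) * (r1 - r0);
}

// Negative log10 scale for p-values: p = 1 sits at the bottom of the
// plot and the smallest p-value (pMin) at the top (pixel 0).
function negLog10Scale(pMin, height) {
  const toPixel = linearScale([0, -Math.log10(pMin)], [height, 0]);
  return p => toPixel(-Math.log10(p));
}

const x = linearScale([-4, 4], [0, 400]);  // log2 fold changes
const y = negLog10Scale(1e-6, 300);        // adjusted p-values
```

With D3 v4 or later, the rough equivalents would be d3.scaleLinear().domain(...).range(...) for x and d3.scaleLog().domain([1, pMin]).range([height, 0]) for y, with d3.axisBottom and d3.axisLeft generating the axes from them.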

Before appending the axes to the document, I add an SVG group that translates the whole plot by the defined margins.
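This step could be sketched as follows. The addAxes and translate helpers and the class names are hypothetical; xAxis and yAxis are assumed to be D3 axis generators (for example d3.axisBottom(x) and d3.axisLeft(y) in D3 v4 and later).

```javascript
// Build the transform string that shifts an element by (dx, dy).
function translate(dx, dy) {
  return "translate(" + dx + "," + dy + ")";
}

// svg: D3 selection of the <svg>; height: height of the plotting area.
function addAxes(svg, margin, height, xAxis, yAxis) {
  // Group that offsets the whole plot by the left and top margins.
  const plot = svg.append("g")
    .attr("transform", translate(margin.left, margin.top));

  plot.append("g")
    .attr("class", "x axis")
    .attr("transform", translate(0, height))  // x axis sits at the bottom
    .call(xAxis);

  plot.append("g")
    .attr("class", "y axis")
    .call(yAxis);

  return plot;  // data points are appended to this group later
}
```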

At this point, the plot should look as follows.

Volcano plot axes

I will plot the data points with two types of visual styles, depending on whether they are significantly differentially expressed or not. In order to establish which data point is significant, I define the following predicate:
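A predicate of this kind might look like the sketch below. Both cut-offs are hypothetical and should be adapted to your own analysis; whether to threshold on the fold change at all is also an assumption on my side.

```javascript
// Hypothetical cut-offs: adjust to your own analysis.
const P_VALUE_CUTOFF = 0.05;  // adjusted p-value
const LOG_FC_CUTOFF = 1;      // |log2 fold change|, i.e. a 2-fold change

// True when a gene passes both the significance and the effect-size cut-off.
function isSignificant(d) {
  return d["adj.P.Val"] <= P_VALUE_CUTOFF &&
         Math.abs(d.logFC) >= LOG_FC_CUTOFF;
}
```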

Now let’s plot the actual data points.
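A sketch of the drawing step, assuming plot is the translated SVG group, x and y are the scales, and isSignificant is a predicate like the one described above. The class names and the radius of 3 pixels are placeholders of mine.

```javascript
// Pick the CSS classes for a data point (class names are hypothetical).
function pointClass(d, isSignificant) {
  return isSignificant(d) ? "dot significant" : "dot";
}

// plot: D3 selection of the plot group; x, y: scale functions.
function drawPoints(plot, data, x, y, isSignificant) {
  return plot.selectAll("circle.dot")
    .data(data)
    .enter().append("circle")
      .attr("class", d => pointClass(d, isSignificant))
      .attr("r", 3)                          // assumed marker radius
      .attr("cx", d => x(d.logFC))           // fold change on the x-axis
      .attr("cy", d => y(d["adj.P.Val"]));   // adjusted p-value on the y-axis
}
```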

The resulting plot should look like the following. For the corresponding CSS you can consult our GitHub repository.

Volcano plot

The interaction with the user will involve hovering over a data point. When the user hovers over a data point with the mouse pointer, the volcano plot will respond by:

  1. Bringing the data point to the foreground.
  2. Changing the appearance of the data point.
  3. Displaying a tooltip with relevant information.

Point 1 can be achieved simply by using the solution proposed by Christopher Chiche on StackOverflow.
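The idea behind that approach, paraphrased here rather than quoted, is that SVG paints elements in document order, so re-appending a node to its parent puts it on top. The moveToFront name is my own choice for this sketch.

```javascript
// Re-appending a node makes it the last child of its parent,
// so it is painted last and therefore appears in the foreground.
function moveToFront(node) {
  node.parentNode.appendChild(node);
  return node;
}
```

Inside a D3 listener this would be called as moveToFront(this). In D3 v4 and later, selection.raise() does the same for a whole selection.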

We can change the appearance of our data points by responding to mouseenter and mouseleave events.
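One possible way to do this is to toggle a CSS class, as in the sketch below. The class name hovered is hypothetical and needs a matching CSS rule; circles is assumed to be the D3 selection of the data points.

```javascript
const HOVER_CLASS = "hovered";  // hypothetical class name

// D3 invokes listeners with `this` bound to the DOM element.
function onEnter() { this.classList.add(HOVER_CLASS); }
function onLeave() { this.classList.remove(HOVER_CLASS); }

// Register both listeners on the selection of circles.
function addHover(circles) {
  return circles
    .on("mouseenter", onEnter)
    .on("mouseleave", onLeave);
}
```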

Finally, to add the tooltip I will use a version of Opentip that we modified to support basic SVG elements. You can find it on GitHub.

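    // The opening of this snippet was lost: the `circles.each(...)` wrapper,
    // the name `circles` and the HTML tags in `content` are a reconstruction.
    circles.each(function (d) {
      var content = "<p>ID: " + getId(d) + "<br>" +
        "Log2 Fold Changes: " + getX(d) + "<br>" +
        "Adjusted P-Value: " + getY(d) + "</p>";

      // Attach a tooltip to this circle; it shows on hover.
      new Opentip(this, content, {
        title: getSymbol(d),
        background: "white",
        borderColor: "darkgray",
        borderWidth: 2,
        delay: 0.25,
        hideDelay: 0,
        shadow: false,
        stem: "center bottom",
        target: true,
        targetJoint: "center top",
        tipJoint: "center bottom",
        showOn: "mouseenter",
        hideOn: "mouseleave"
      });
    });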

If you managed to hack your way to this point, you are probably looking at a nice interactive volcano plot. It should be very similar to the one that I showed you at the beginning. However, if you start playing with it, you will notice a major difference. While the interaction in the plot at the beginning can be triggered from the space surrounding each data point, in the version we have built so far you need to hover exactly over each minuscule circle. This gets really painful when two data points are extremely close to each other. In order to enhance the user experience of our interactive volcano plot, we can use a Voronoi overlay.

Enhanced UX using a Voronoi overlay

A Voronoi diagram is a partition of the plane (i.e. our 2D plot space) into regions based on the distance from a specified set of points. In other words, by building a Voronoi diagram we can identify the region surrounding each data point. No need to worry if this seems overly difficult: D3 makes it extremely easy to build and draw a Voronoi diagram.
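The sketch below first spells out the defining property in plain JavaScript and then builds the cells with D3. It assumes D3 v6 or later, which bundles d3-delaunay (older versions expose d3.voronoi or d3.geom.voronoi instead); the function names are mine, and points is an array of [x, y] pixel positions.

```javascript
// Defining property of a Voronoi diagram: a location belongs to the
// cell of whichever data point is nearest to it.
function nearestIndex(points, px, py) {
  let best = -1;
  let bestDist = Infinity;
  points.forEach(([x, y], i) => {
    const dist = (x - px) ** 2 + (y - py) ** 2;
    if (dist < bestDist) { bestDist = dist; best = i; }
  });
  return best;
}

// Assumes D3 v6+ is passed in as `d3`. Returns one polygon (array of
// [x, y]) per point, or null where D3 cannot produce a cell.
function voronoiCells(d3, points, width, height) {
  const voronoi = d3.Delaunay.from(points).voronoi([0, 0, width, height]);
  return points.map((_, i) => voronoi.cellPolygon(i));
}
```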

Then we can simply plot our Voronoi diagram using the polygon element.
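A sketch of this step, assuming cells is an array of polygons, each an array of [x, y] vertices, and plot is the translated SVG group. The helper names and the voronoi class are hypothetical.

```javascript
// Turn a cell into the value of a <polygon> "points" attribute,
// rounding every coordinate to a whole pixel.
function toPointsAttr(cell) {
  return cell
    .map(([x, y]) => Math.round(x) + "," + Math.round(y))
    .join(" ");
}

// Append one <polygon> per (non-empty) Voronoi cell.
function drawCells(plot, cells) {
  return plot.selectAll("polygon.voronoi")
    .data(cells.filter(Boolean))  // skip missing cells
    .enter().append("polygon")
      .attr("class", "voronoi")
      .attr("points", d => toPointsAttr(d));
}
```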

Notice that we always round the coordinates to whole pixels. This helps to keep the document small and improves rendering performance. If we decided to visualize our Voronoi overlay right now, it would look like this:

Volcano plot with Voronoi overlay

As you can see, each data point belongs to its own partition. You can immediately spot two problems:

  1. When there are many points close to each other, the space partition of the Voronoi diagram is often smaller than the circle used as a marker. In this case, the Voronoi overlay doesn’t improve the UX, but rather decreases it significantly.
  2. When a region is less dense with points, the space partition is particularly large. In this case, it is counterintuitive for the user to trigger events when the cursor position is very far away from the point of interest.

To overcome the first problem, I can filter out all the Voronoi partitions that are smaller than the circle using the following predicate.
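One possible criterion, shown here as a sketch, compares the area of the partition with the area of the circular marker. The choice of area as the measure and the function names are my assumptions; D3 v4 and later also ship d3.polygonArea for the first part.

```javascript
// Area of a polygon given as an array of [x, y] vertices (shoelace formula).
function polygonArea(cell) {
  let sum = 0;
  for (let i = 0; i < cell.length; i++) {
    const [x1, y1] = cell[i];
    const [x2, y2] = cell[(i + 1) % cell.length];
    sum += x1 * y2 - x2 * y1;
  }
  return Math.abs(sum) / 2;
}

// Keep a partition only if it covers more area than the marker circle.
function isLargerThanMarker(cell, radius) {
  return polygonArea(cell) > Math.PI * radius * radius;
}
```

Filtering then becomes cells.filter(cell => isLargerThanMarker(cell, r)), where r is the marker radius.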

Then we can redraw our Voronoi polygons and obtain the following image. Notice that the small partition in the middle has disappeared.

Volcano plot with reduced Voronoi overlay

To fix the second problem, I will use the SVG clipPath element. The idea is simple. Instead of using the whole partition generated by the Voronoi diagram, we will clip each partition using a larger circle centered on each data point. In order to achieve this, we need to start by defining all the clipping paths.
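A sketch of how the clipping paths could be defined. The clip- prefix, the helper names and the clipRadius parameter are hypothetical; the Ensembl ID in _row is used to build a unique ID per data point.

```javascript
// Unique clipPath ID per data point, and the matching reference.
const clipId = d => "clip-" + d._row;
const clipUrl = d => "url(#" + clipId(d) + ")";

// svg: D3 selection of the <svg>; x, y: scale functions.
function addClipPaths(svg, data, x, y, clipRadius) {
  return svg.append("defs").selectAll("clipPath")
    .data(data)
    .enter().append("clipPath")
      .attr("id", clipId)
    .append("circle")  // inherits the datum of its clipPath
      .attr("cx", d => x(d.logFC))
      .attr("cy", d => y(d["adj.P.Val"]))
      .attr("r", clipRadius);
}
```

Each Voronoi polygon bound to the same datum can then reference its circle with .attr("clip-path", clipUrl).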

I create a clipPath for each data point using a circle with a larger radius. Importantly, we need to set the ID attribute so that we know how to associate each clipPath with the corresponding Voronoi partition. The final version of our Voronoi overlay looks as follows.

Volcano plot with clipped Voronoi overlay

Now that I have drawn the Voronoi overlay, the last step is to forward the mouse events received by the Voronoi polygons to the circles of the data points. In this way, when the cursor is over a Voronoi partition, the listeners that we already attached to the circles will be triggered.
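One way to forward events is sketched below. It assumes the listener signature of D3 v6 and later, (event, d); in earlier versions the event is available as d3.event instead. The forwardTo helper and the way the target circle is looked up are hypothetical.

```javascript
// Returns a listener that re-dispatches the received event on the
// element found by findTarget(d), typically the circle of that datum.
function forwardTo(findTarget) {
  return function (event, d) {
    const target = findTarget(d);
    if (target) {
      // Re-create the event with the same type and properties,
      // then fire it on the circle.
      target.dispatchEvent(new event.constructor(event.type, event));
    }
  };
}
```

Assuming each circle was given an ID such as "point-" + d._row, the polygons could be wired up with polygons.on("mouseenter", forwardTo(d => document.getElementById("point-" + d._row))), and likewise for mouseleave.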

The final step is to hide the Voronoi diagram. Remember to set the CSS property pointer-events to let the hidden Voronoi become the target of mouse events.
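The same effect can be obtained from JavaScript instead of a stylesheet, as in this sketch. The helper and constant names are mine; polygons is assumed to be the D3 selection of Voronoi polygons.

```javascript
// Invisible, but still able to receive mouse events.
const HIDDEN_OVERLAY_STYLE = {
  "fill": "none",
  "stroke": "none",
  "pointer-events": "all"
};

// Apply every style property to the selection of polygons.
function hideOverlay(polygons) {
  Object.entries(HIDDEN_OVERLAY_STYLE)
    .forEach(([name, value]) => polygons.style(name, value));
  return polygons;
}
```

Without pointer-events: all, an unfilled and unstroked polygon would not be hit by the mouse at all.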

Conclusion

With this in your toolset, the next time you present your volcano plot you will have a powerful way for your audience to interact with the data. More importantly, you won’t have to come up with lame jokes about Star Wars-sounding fictional gene names 🙂

Note that I did not discuss the styling used in the reference plot. The CSS that I used is short and easy to read: you can find it, together with the complete code used for the tutorial, on GitHub.
