A volcano plot is a type of scatterplot that is used to quickly identify changes in large datasets composed of replicate data. In life sciences, volcano plots are often used to identify differentially expressed genes. Especially in the exploratory phase, volcano plots are a powerful tool to discuss your findings with fellow researchers and prioritize follow-up analyses.

If you have used volcano plots to present your results, you are probably very familiar with the typical situation in which someone points to a minuscule, unlabelled dot and asks: “… and this one, which gene is it?”. A short, embarrassing moment follows, in which you try to come up with a good answer. Depending on the geekiness level of the room, you might get out of the impasse with gene-sounding abbreviations like T-1000, R2-D2 or C-3PO. Science-fiction puns aside, this is emblematic of a big problem with these plots: they present an extremely large number of data points in a single, condensed view. Labelling each data point would just create a confusing cloud of overlapping names. I have found that the best way to communicate effectively is usually to make the plot interactive, exposing the relevant data only when needed (that is, when triggered by the user).

In this post, I will show you how to build an interactive volcano plot using the awesome JavaScript libraries D3 and Opentip. I will guide you step by step, all the way to the plot shown above.

### The data

Before starting to draw the plot, let’s have a look at the data. The data I will use in this example was published in Nature by the Netherlands Cancer Institute, and is part of a study investigating metastasis in breast cancer. The raw data was generated using RNA-Seq and captures the gene expression in neutrophils from wild-type (WT) mice and *K14cre;Cdh1^{F/F};Trp53^{F/F}* (KEP) mice.

After computing the differential expression (between neutrophils in WT and KEP) using the limma package in R, I exported the results in JSON format using the package jsonlite. A single data point in our dataset has the following JSON structure:

```json
{
  "Symbol": "Mag",
  "logFC": 0.0293,
  "adj.P.Val": 0.9867,
  "_row": "ENSMUSG00000036634"
}
```

The field *Symbol* contains the gene name, *logFC* the log2 fold change, *adj.P.Val* the adjusted p-value, and *_row* the Ensembl ID of the gene. To access these fields easily, let’s start our JavaScript with a few accessor functions. They will make our code more readable.

```javascript
// The following are accessor functions to extract the relevant information
// from each data point.
var getId = function(d) {
  return d["_row"];
};

var getSymbol = function(d) {
  return d["Symbol"];
};

var getX = function(d) {
  return d["logFC"];
};

var getY = function(d) {
  return d["adj.P.Val"];
};
```

In order to access our JSON data I will use the convenient d3.json function:

```javascript
d3.json("data.json", function (error, root) {
  if (error) throw error;
  // The rest of the code will go here
});
```

Now that we know how the data is structured and how to access it, let’s start drawing our volcano plot using D3.

### The anatomy of a volcano plot

The anatomy of a volcano plot is quite simple. The y-axis shows the statistical significance as the p-value on a negative logarithmic scale. In other words, the higher the dot, the lower the p-value and the higher the significance. The x-axis shows the fold change between two conditions on a linear scale (although the x values are usually log2-transformed). This means that positive values show an increase and negative values a decrease; a log2 fold change of 1 corresponds to a doubling.
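To make the negative logarithmic scale concrete, here is a minimal sketch in plain JavaScript, independent of D3. The helper name `negLog10` and the p-values are made up for illustration; they are not part of the plot's code.

```javascript
// Height of a data point on a -log10 axis: the smaller the p-value,
// the larger the returned value.
function negLog10(p) {
  if (!(p > 0 && p <= 1)) {
    throw new RangeError("p-value must be in (0, 1], got " + p);
  }
  return -Math.log10(p);
}

// Hypothetical p-values, chosen only for illustration.
var examples = [0.5, 0.05, 0.001];
var heights = examples.map(negLog10);

// Roughly 0.30, 1.30 and 3: each tenfold drop in the p-value
// moves the dot one unit higher.
console.log(heights);
```

Note that a p-value of exactly 0 has no position on such an axis, which is why the helper rejects it.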

Let’s start by defining the dimensions of the SVG container and appending it to the document. Notice that I add a small margin on the top and right sides so that data points on the edges are not cut off by the SVG container. Larger margins are applied to the bottom and left sides, to reserve the necessary space for the axis labels and ticks. I also include an *axesMargin* of 10 pixels to keep the x and y axes from overlapping at the origin.

```javascript
var margin = {top: 20, right: 20, bottom: 40, left: 40},
    // This margin makes sure the two axes will not overlap,
    // making the plot more readable.
    axesMargin = 10,
    width = 960 - margin.left - margin.right - axesMargin,
    height = 500 - margin.top - margin.bottom - axesMargin;

// Create the SVG container in which we will draw the volcano plot,
// making sure to reserve enough space on the left and on the bottom
// to show the axes labels. We also reserve a bit of space on the top
// and on the right so that data points on the edges are not cut off.
// The size is set with attributes: a unitless number is not a valid
// CSS length, so setting it as a style may be ignored by the browser.
var svg = d3.select("body").append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom);
```

Then, let’s define the x and y scale, and the two axes.

```javascript
// Build a linear scale for the x-axis.
var xScale = d3.scale.linear().range([0, width]).domain(d3.extent(root, getX)),
    // Build a log10 scale for the y-axis
    yScale = d3.scale.log().range([0, height]).domain(d3.extent(root, getY));

// Create the D3 axes objects.
var xAxis = d3.svg.axis().scale(xScale).orient("bottom"),
    yAxis = d3.svg.axis().scale(yScale).orient("left");
```
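One caveat with a logarithmic scale is that its domain must be strictly positive, so a p-value of exactly 0 cannot be placed on it. I did not need it for this dataset, but if your data might contain such values, a defensive check along the following lines could be applied before building the scales. This is only a sketch in plain JavaScript: the helpers `plottable` and `extent` and the sample records are hypothetical.

```javascript
// Hypothetical records shaped like the data points of this tutorial.
var sample = [
  { "Symbol": "GeneA", "logFC": 2.1,  "adj.P.Val": 0.0004 },
  { "Symbol": "GeneB", "logFC": -0.3, "adj.P.Val": 0.82 },
  { "Symbol": "GeneC", "logFC": 1.7,  "adj.P.Val": 0 }
];

// Keep only the points that a logarithmic scale can place.
function plottable(points) {
  return points.filter(function (d) {
    var p = d["adj.P.Val"];
    return typeof p === "number" && isFinite(p) && p > 0;
  });
}

// Plain-JavaScript stand-in for d3.extent with an accessor.
function extent(points, accessor) {
  var values = points.map(accessor);
  return [Math.min.apply(null, values), Math.max.apply(null, values)];
}

var clean = plottable(sample);
var yDomain = extent(clean, function (d) { return d["adj.P.Val"]; });

console.log(clean.length); // 2: GeneC is dropped
console.log(yDomain);      // smallest and largest remaining p-value
```

Dropping such points is only one option; clamping them to a small positive value would keep them visible at the top of the plot.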

Before appending the axes to the document, I add an SVG group which will translate the whole plot using the defined margins.

```javascript
// We create the SVG group that will contain the volcano plot and
// we translate all the contained objects by the margins.
var plot = svg.append("g")
    .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

// It is time to draw the y-axis...
plot.append("g")
    .attr("class", "y axis")
    .attr("transform", "translate(0," + -axesMargin + ")")
    .call(yAxis)
  // ...and its label.
  .append("text")
    .attr("class", "label")
    // We want to rotate the label so that it follows the axis orientation
    .attr("transform", "rotate(-90)")
    .attr("y", 6)
    .attr("dy", ".71em")
    .style("text-anchor", "end")
    .text("-Log10 P-Value");

// Now the x-axis...
plot.append("g")
    .attr("class", "x axis")
    .attr("transform", "translate(" + axesMargin + "," + height + ")")
    .call(xAxis)
  // ...and its label.
  .append("text")
    .attr("class", "label")
    .attr("x", width)
    .attr("y", -6)
    .style("text-anchor", "end")
    .text("Log2 Fold Changes");
```

At this point, the plot should look as follows.

I will plot the data points with two types of visual styles, depending on whether they are significantly differentially expressed or not. In order to establish which data point is significant, I define the following predicate:

```javascript
// This is a predicate (returns true or false) depending on whether a data point
// is statistically significant.
var isSignificant = function(d) {
  return !!(Math.abs(getX(d)) > 1 && getY(d) < 0.05);
};
```
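To see how the two thresholds work together, here is a small self-contained sketch that applies the same predicate to a few records. The gene names and values are made up for illustration.

```javascript
var getX = function (d) { return d["logFC"]; };
var getY = function (d) { return d["adj.P.Val"]; };

// Same rule as above: at least a twofold change (|log2 FC| > 1)
// and an adjusted p-value below 0.05.
var isSignificant = function (d) {
  return Math.abs(getX(d)) > 1 && getY(d) < 0.05;
};

// Hypothetical data points.
var sample = [
  { "Symbol": "GeneA", "logFC": 2.1,  "adj.P.Val": 0.0004 }, // large increase, low p-value
  { "Symbol": "GeneB", "logFC": -1.6, "adj.P.Val": 0.03 },   // large decrease, low p-value
  { "Symbol": "GeneC", "logFC": 0.4,  "adj.P.Val": 0.001 },  // low p-value, small change
  { "Symbol": "GeneD", "logFC": 3.0,  "adj.P.Val": 0.2 }     // large change, high p-value
];

var hits = sample.filter(isSignificant).map(function (d) {
  return d["Symbol"];
});

console.log(hits); // GeneA and GeneB
```

Only points that pass both tests are highlighted, which corresponds to the upper-left and upper-right corners of the volcano plot.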

Now let’s plot the actual data points.

```javascript
// Let's create a group to contain all the data points, correctly spaced from the axes.
var scatter = plot.append("g")
    .attr("class", "scatter")
    .attr("transform", "translate(" + axesMargin + "," + -axesMargin + ")");

// The radius and border size of each data point.
var radius = 3,
    border = 2;

// This is a classical example of function composition. It will return
// the value returned by the specified accessor, scaled, and rounded.
var getScaled = function(scale, accessor) {
  return function (d) {
    return Math.round(scale(accessor(d)));
  };
};

// Finally let's draw the single data points
var circles = scatter.selectAll("circle").data(root, getId)
  .enter().append("circle")
    .attr("cx", getScaled(xScale, getX))
    .attr("cy", getScaled(yScale, getY))
    .attr("r", radius)
    .classed("significant", isSignificant);
```
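If the composition in `getScaled` looks a bit abstract, the following sketch shows it at work outside of D3. The `linearScale` helper is a hypothetical stand-in for a D3 linear scale (it is not D3's implementation), and the domain of -5 to 5 is an assumption; 890 is the plot width computed earlier (960 - 40 - 20 - 10).

```javascript
// Hypothetical stand-in for a D3 linear scale: maps domain values
// to range values along a straight line.
function linearScale(domain, range) {
  var slope = (range[1] - range[0]) / (domain[1] - domain[0]);
  return function (value) {
    return range[0] + (value - domain[0]) * slope;
  };
}

var getX = function (d) { return d["logFC"]; };

// Accessor first, then scale, then rounding to full pixels.
var getScaled = function (scale, accessor) {
  return function (d) {
    return Math.round(scale(accessor(d)));
  };
};

var xScale = linearScale([-5, 5], [0, 890]);
var cx = getScaled(xScale, getX);

console.log(cx({ "logFC": 0 }));      // 445, the middle of the plot
console.log(cx({ "logFC": 0.0293 })); // 448
```

The function returned by `getScaled` takes a data point and returns a pixel position, which is exactly the shape D3 expects for attributes such as `cx` and `cy`.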

The resulting plot should look like the following. For the corresponding CSS you can consult our GitHub repository.

The interaction with the user will involve hovering over a data point. When the user hovers over a data point with the mouse pointer, the volcano plot will respond by:

- Bringing the data point to the foreground.
- Changing the appearance of the data point.
- Displaying a tooltip with relevant information.

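The first behaviour, bringing the data point to the foreground, can be achieved with the solution proposed by Christopher Chiche on Stack Overflow: the element is re-appended to its parent, so that it is drawn last and therefore on top.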

```javascript
d3.selection.prototype.moveToFront = function() {
  return this.each(function(){
    this.parentNode.appendChild(this);
  });
};
```

We can change the appearance of our data points by responding to mouseenter and mouseleave events.

```javascript
circles.on("mouseenter", function() {
  d3.select(this).classed("hover", true).moveToFront();
});
circles.on("mouseleave", function() {
  d3.select(this).classed("hover", false);
});
```

Finally, to add the tooltip I will use a version of Opentip that we modified to support basic SVG elements. You can find it on GitHub.

```javascript
circles.each(function(d) {
  // Tooltip body as HTML, one paragraph per field.
  var content = "<p>ID: " + getId(d) + "</p>" +
                "<p>Log2 Fold Changes: " + getX(d) + "</p>" +
                "<p>Adjusted P-Value: " + getY(d) + "</p>";
  new Opentip(this, content, {
    title: getSymbol(d),
    background: "white",
    borderColor: "darkgray",
    borderWidth: 2,
    delay: 0.25,
    hideDelay: 0,
    shadow: false,
    stem: "center bottom",
    target: true,
    targetJoint: "center top",
    tipJoint: "center bottom",
    showOn: "mouseenter",
    hideOn: "mouseleave"
  });
});
```

If you managed to hack your way to this point, you are probably looking at a nice interactive volcano plot. It should be very similar to the one that I showed you at the beginning. However, if you start playing with it, you will notice a major difference. While the interaction in the plot at the beginning can be triggered from the space surrounding each data point, in the version we have built so far you need to hover exactly over each minuscule circle. This gets really painful when two data points are extremely close to each other. To enhance the user experience of our interactive volcano plot, we can use a Voronoi overlay.

### Enhanced UX using a Voronoi overlay

A Voronoi diagram is a partition of the plane (i.e. our 2D plot space) into regions based on the distance from a specified set of points. In other words, by building a Voronoi diagram we can identify the region surrounding each data point. No need to worry if this seems overly difficult: D3 makes it extremely easy to build and draw a Voronoi diagram.
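Put differently, the region of a data point contains every position that is closer to that point than to any other. The sketch below expresses this definition as a brute-force lookup in plain JavaScript. It is only meant to illustrate the idea: the helper names and coordinates are made up, and D3 computes the actual regions as polygons in a much more efficient way.

```javascript
// Squared Euclidean distance between two [x, y] positions.
function dist2(a, b) {
  var dx = a[0] - b[0],
      dy = a[1] - b[1];
  return dx * dx + dy * dy;
}

// Index of the site whose Voronoi region contains `position`,
// that is, the index of the closest site.
function nearestSite(sites, position) {
  var best = -1,
      bestDist = Infinity;
  sites.forEach(function (site, i) {
    var d = dist2(site, position);
    if (d < bestDist) {
      best = i;
      bestDist = d;
    }
  });
  return best;
}

// Hypothetical data points in pixel coordinates.
var sites = [[100, 100], [300, 120], [200, 300]];

console.log(nearestSite(sites, [120, 130])); // 0
console.log(nearestSite(sites, [260, 150])); // 1
```

With a Voronoi overlay, the browser performs this lookup for us: whichever polygon is under the cursor identifies the closest data point.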

```javascript
// Let's create a Voronoi diagram using the accessors defined before.
// We also clip the Voronoi to the plotting area dimensions.
var voronoi = d3.geom.voronoi()
    .x(getScaled(xScale, getX))
    .y(getScaled(yScale, getY))
    .clipExtent([[0, 0], [width, height]]);

// For convenience we create a new SVG group that contains all the
// Voronoi elements.
var voronoiG = plot.append("g")
    .attr("class", "voronoi")
    .attr("transform", "translate(" + axesMargin + "," + -axesMargin + ")");
```

Then we can simply plot our Voronoi diagram using the polygon element.

```javascript
var paths = voronoiG.selectAll("polygon")
    .data(voronoi(root).map(d3.geom.polygon))
  .enter().append("polygon")
    .attr("points", function(d) {
      return d.map(function(x){
        return [Math.round(x[0]), Math.round(x[1])];
      }).join(",");
    });
```

Notice that we always round the coordinates to full pixels. This helps to keep the document size small and increases the rendering performance. If we decided to visualize our Voronoi overlay right now, it would look like this:

As you can see, each data point belongs to its own partition. You can immediately spot two problems:

- When there are many points close to each other, the space partition of the Voronoi diagram is often smaller than the circle used as a marker. In this case, the Voronoi overlay doesn’t improve the UX, but rather decreases it significantly.
- When a region is less dense with points, the space partition is particularly large. In this case, it is counterintuitive for the user to trigger events when the cursor position is very far away from the point of interest.

To overcome the first problem, I can filter out all the Voronoi partitions that are smaller than the circle, using the following predicate.

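```javascript
// Predicate that returns true if and only if the area of the polygon
// is larger than minArea. In D3 v3, area() is signed (its sign depends
// on the vertex order), so we compare the absolute value.
var hasLargerArea = function(minArea) {
  return function(polygon) {
    return Math.abs(polygon.area()) > minArea;
  };
};
```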

Then we can redraw our Voronoi polygons,

```javascript
var paths = voronoiG.selectAll("polygon")
    .data(voronoi(root).map(d3.geom.polygon)
        .filter(hasLargerArea(Math.pow(radius + border/2, 2) * Math.PI)))
  .enter().append("polygon")
    .attr("points", function(d) {
      return d.map(function(x){
        return [Math.round(x[0]), Math.round(x[1])];
      }).join(",");
    });
```

and obtain the following image. Notice that the small partition in the middle disappeared.
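The threshold used in the filter is the area of the marker itself: a circle whose radius is the marker radius plus half of the border width, since an SVG stroke is centered on the outline. The following sketch reproduces the comparison in plain JavaScript with two made-up cells. The `polygonArea` helper uses the shoelace formula and is my own stand-in for the area method of the D3 polygon.

```javascript
// Absolute area of a polygon given as [[x, y], ...] (shoelace formula).
function polygonArea(points) {
  var sum = 0;
  for (var i = 0, n = points.length; i < n; i++) {
    var a = points[i],
        b = points[(i + 1) % n];
    sum += a[0] * b[1] - b[0] * a[1];
  }
  return Math.abs(sum) / 2;
}

var radius = 3,
    border = 2;

// Area covered by one marker: a circle of radius 3 + 2/2 = 4 pixels,
// which is about 50.3 square pixels.
var minArea = Math.pow(radius + border / 2, 2) * Math.PI;

// Hypothetical Voronoi cells.
var tinyCell = [[0, 0], [6, 0], [6, 6], [0, 6]];       // 36 square pixels
var roomyCell = [[0, 0], [20, 0], [20, 15], [0, 15]];  // 300 square pixels

console.log(polygonArea(tinyCell) > minArea);  // false: filtered out
console.log(polygonArea(roomyCell) > minArea); // true: kept
```

For the filtered-out cells nothing is lost: the user can still hover over the circle itself, which keeps its own listeners.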

To fix the second problem, I will use the SVG clipPath element. The idea is simple. Instead of using the whole partition generated by the Voronoi diagram, we will clip each partition with a larger circle centered on its data point. To achieve this, we start by defining all the clipping paths.

```javascript
var clips = svg.append("defs")
    .selectAll("clipPath").data(root).enter()
  .append("clipPath")
    .attr("id", getId)
  .append("circle")
    .attr("cx", getScaled(xScale, getX))
    .attr("cy", getScaled(yScale, getY))
    .attr("r", 20);
```

I create a clipPath for each data point using a circle with a larger radius. Importantly, we need to set the ID attribute so that we can associate each clipPath with the corresponding Voronoi partition. The final version of our Voronoi overlay looks as follows.

```javascript
var paths = voronoiG.selectAll("polygon")
    .data(voronoi(root).map(d3.geom.polygon)
        .filter(hasLargerArea(Math.pow(radius + border/2, 2) * Math.PI)))
  .enter().append("polygon")
    .attr("points", function(d) {
      return d.map(function(x){
        return [Math.round(x[0]), Math.round(x[1])];
      }).join(",");
    })
    .attr("clip-path", function(d) { return "url(#" + getId(d.point) + ")"; });
```

Now that the Voronoi overlay is drawn, the last step is to forward the mouse events received by the Voronoi polygons to the circles of the data points. In this way, when the cursor is over a Voronoi partition, the listeners that we already attached to the corresponding circle will be triggered.

```javascript
// This function gets an event from the Voronoi and triggers the same
// event type on the circle element associated with the partition.
var forwardEvent = function(d) {
  var event = document.createEvent("SVGEvents");
  event.initEvent(d3.event.type, true, true);
  circles.data([d.point], getId).node().dispatchEvent(event);
};

paths.on("mouseenter", forwardEvent);
paths.on("mouseleave", forwardEvent);
```
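The forwarding pattern itself does not depend on D3 or SVG. The sketch below isolates it using plain `EventTarget` objects as stand-ins for one Voronoi polygon and its circle, so the names `overlay` and `circle` are hypothetical. It creates events with the `Event` constructor, which is the current alternative to the older `createEvent` and `initEvent` pair used above.

```javascript
// Stand-ins for a Voronoi polygon and the circle it belongs to.
// In the browser these would be the actual SVG elements.
var overlay = new EventTarget();
var circle = new EventTarget();
var log = [];

// Listeners that already exist on the circle.
circle.addEventListener("mouseenter", function () {
  log.push("circle: mouseenter");
});
circle.addEventListener("mouseleave", function () {
  log.push("circle: mouseleave");
});

// Re-dispatch on the circle every event that the overlay receives.
function forward(event) {
  circle.dispatchEvent(new Event(event.type));
}
overlay.addEventListener("mouseenter", forward);
overlay.addEventListener("mouseleave", forward);

// Simulate the cursor entering and leaving the overlay.
overlay.dispatchEvent(new Event("mouseenter"));
overlay.dispatchEvent(new Event("mouseleave"));

console.log(log); // the circle received both events
```

The advantage of forwarding, compared to attaching a second set of listeners to the polygons, is that the hover styling and the tooltip stay defined in a single place.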

The final step is to hide the Voronoi diagram. Remember to set the CSS property pointer-events to let the hidden Voronoi become the target of mouse events.

```css
polygon {
  fill: none;
  stroke: none;
  pointer-events: all;
}
```

### Conclusion

With this in your toolset, the next time you present your volcano plot you will have a powerful way for your audience to interact with the data. More importantly, you won’t have to come up with lame jokes about Star Wars-sounding fictional gene names 🙂

Note that I did not discuss the styling used in the reference plot. The CSS that I used is short and easy to read: you can find it, together with the complete code used for the tutorial, on GitHub.

If you found this tutorial useful and would like to see more, do follow us on Twitter and LinkedIn.