### Spacing circles in a 2-item Venn Diagram

Venn diagrams display the overlap of two sets. The radii of the circles can be chosen so that each circle's area is proportional to the size of its set, and the circles can be placed so that the area of their overlap is proportional to the size of the intersection. I show how to do this calculation.

Define X to be the size of the first set, Y to be the size of the second set and Z to be the size of their intersection. I will perform the calculations in terms of the original units and then rescale all the parameters so they can be displayed.

We determine the radii of the two circles, $r_1$ and $r_2$, by setting the area of each circle equal to the size of its set:

$$X = \pi r_1^2$$

and

$$Y = \pi r_2^2.$$
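Inverting the area formula gives each radius directly. A minimal sketch, where `radius_from_area` is a name I made up for illustration and the set sizes are arbitrary example values:

```python
import math

def radius_from_area(area):
    """Radius of a circle whose area equals `area` (inverse of A = pi * r**2)."""
    if area < 0:
        raise ValueError("area must be non-negative")
    return math.sqrt(area / math.pi)

# Hypothetical set sizes, chosen only for illustration.
X, Y = 10.0, 4.0
r1 = radius_from_area(X)
r2 = radius_from_area(Y)
```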

If Z = 0, then there is no overlap and the sets can be displayed separately.

If Z = X, then X is contained within Y, and similarly if Z = Y, then Y is contained within X.

In all other cases, the size of the overlap is determined by the distance between the centers of the circles. Assuming r1 > r2, the circles can be separated by distances between r1 – r2 and r1 + r2.
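The three cases can be separated before any geometry is done. The sketch below is only an illustration: the function name and the case labels are hypothetical, and it assumes the inputs satisfy 0 ≤ Z ≤ min(X, Y). For the general case it returns the range of admissible center distances.

```python
import math

def classify_overlap(X, Y, Z):
    """Return (case, bounds); bounds is (lo, hi) for a partial overlap, else None."""
    if not 0 <= Z <= min(X, Y):
        raise ValueError("Z must lie between 0 and min(X, Y)")
    if Z == 0:
        return "disjoint", None              # draw the circles apart
    if Z == X:
        return "first_inside_second", None   # X is contained within Y
    if Z == Y:
        return "second_inside_first", None   # Y is contained within X
    r1 = math.sqrt(X / math.pi)
    r2 = math.sqrt(Y / math.pi)
    # Partial overlap: the center distance lies between |r1 - r2| and r1 + r2.
    return "partial", (abs(r1 - r2), r1 + r2)

case, bounds = classify_overlap(10.0, 4.0, 2.0)
```

The exact comparisons are reasonable for integer counts; with floating-point sizes a small tolerance would be safer.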

For any distance in the range, the overlap shown in green below is made up of two sections, one bordered by an arc of the first circle (centered at C1) and the other bordered by an arc of the second circle (centered at C2). The overlap area is the sum of these two sections.

My previous post on the geometry of a colorful Venn plot showed how to calculate the angles θ1 and θ2 and distances d1 and d2 for circles separated by a given distance. These parameters will be used in the following discussion.

The lined section in the graph is the contribution of the arc centered at C1. Its area is the area of the circular sector (the arc cone) extending from P1 to P2 minus the area of the triangle bounded by C1, P1 and P2. The angle of the arc is 2θ1, where θ1 is expressed in radians.

The area of the arc cone (sector) is

$$\pi r_1^2 \cdot \frac{2\theta_1}{2\pi} = r_1^2 \theta_1.$$

The triangle is made up of two congruent right triangles sharing point P0, so its area is twice the area of one of them:

$$2 \cdot \frac{r_1 \cos(\theta_1) \cdot r_1 \sin(\theta_1)}{2} = \frac{r_1^2 \sin(2\theta_1)}{2}.$$

The area of the lined section is the difference of these two areas and is therefore

$$r_1^2 \left(\theta_1 - \frac{\sin(2\theta_1)}{2}\right).$$
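The sector-minus-triangle formula is easy to sanity-check numerically. In this small sketch (the name `segment_area` is mine, not the package's), a half-angle of 90 degrees should give half a disc and 180 degrees the full disc:

```python
import math

def segment_area(r, theta):
    """Area between a chord and its arc, for half-angle theta in radians (0..pi)."""
    sector = r * r * theta                      # pi r^2 * (2 theta / 2 pi)
    triangle = r * r * math.sin(2 * theta) / 2  # two right triangles sharing P0
    # For theta > pi/2 the sine is negative, so the triangle is added
    # rather than subtracted, which yields the larger (major) segment.
    return sector - triangle

half_disc = segment_area(1.0, math.pi / 2)  # chord is a diameter
full_disc = segment_area(1.0, math.pi)      # chord collapses to a point
```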

The calculation is repeated for the arc centered at C2, so the entire overlap area in green is

$$r_1^2 \left(\theta_1 - \frac{\sin(2\theta_1)}{2}\right) + r_2^2 \left(\theta_2 - \frac{\sin(2\theta_2)}{2}\right).$$
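To evaluate this expression for a given distance, the two half-angles are needed. The previous post is not reproduced here, so the sketch below assumes they can be obtained from the law of cosines applied to the triangle C1, C2, P1; the function names are hypothetical. It requires d > 0 and a distance between |r1 − r2| and r1 + r2.

```python
import math

def half_angles(r1, r2, d):
    """Half-angles (radians) at C1 and C2 for centers a distance d apart."""
    cos1 = (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)
    cos2 = (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)
    # Clamp to [-1, 1] to absorb rounding error at the tangent distances.
    theta1 = math.acos(max(-1.0, min(1.0, cos1)))
    theta2 = math.acos(max(-1.0, min(1.0, cos2)))
    return theta1, theta2

def overlap_area(r1, r2, d):
    """Sum of the two circular segments that make up the overlap."""
    theta1, theta2 = half_angles(r1, r2, d)
    seg1 = r1 * r1 * (theta1 - math.sin(2 * theta1) / 2)
    seg2 = r2 * r2 * (theta2 - math.sin(2 * theta2) / 2)
    return seg1 + seg2

# Two unit circles one unit apart (illustrative values only).
area = overlap_area(1.0, 1.0, 1.0)
```

For two unit circles one unit apart, both half-angles are 60 degrees and the result matches the familiar lens area 2π/3 − √3/2.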

The following graphs show how the overlap area decreases as the distance between the circles increases and how the angles θ1 and θ2 change with distance.

In the right graph, as the distance between the circles increases, the overlap area decreases, with most of the contribution at all distances coming from the smaller yellow circle. If the target overlap were 1.7, the distance between the circles should be set at 0.8. At the smallest distance, r1 – r2, the entire yellow circle is in the overlap, as indicated by θ2 equaling 180 degrees. As the distance increases, θ2 decreases until, at the point where there is no overlap, the arc collapses to zero. By contrast, θ1 for the red circle is zero at the smallest distance, since its arc makes no contribution to the overlap area at that point. It increases to a maximum when the angle subtends the radius of the yellow circle and then recedes to zero as the distance moves to r1 + r2.

Unfortunately, there is no closed-form solution for the distance that produces a given target overlap. Fortunately, the difference between the overlap area and the target is a strictly decreasing function of the distance, so the distance can be found with a simple numerical root-finding method.
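Because the overlap shrinks monotonically as the circles move apart, plain bisection is enough. This is a sketch of one possible approach, not necessarily the routine the package uses. The names `overlap_area` and `solve_distance` are hypothetical, and the angles are again assumed to come from the law of cosines. The target must lie strictly between 0 and the area of the smaller circle.

```python
import math

def overlap_area(r1, r2, d):
    """Overlap of two circles with centers d apart (|r1 - r2| <= d <= r1 + r2, d > 0)."""
    c1 = max(-1.0, min(1.0, (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    c2 = max(-1.0, min(1.0, (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    t1, t2 = math.acos(c1), math.acos(c2)
    return (r1 * r1 * (t1 - math.sin(2 * t1) / 2)
            + r2 * r2 * (t2 - math.sin(2 * t2) / 2))

def solve_distance(r1, r2, target, tol=1e-10, max_iter=200):
    """Bisection on the center distance until the overlap matches `target`."""
    lo, hi = abs(r1 - r2), r1 + r2
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if overlap_area(r1, r2, mid) > target:
            lo = mid   # too much overlap: move the circles apart
        else:
            hi = mid   # too little overlap: move them closer
        if hi - lo < tol:
            break
    return (lo + hi) / 2

# Unit circles whose overlap should equal the lens area at distance 1.
d = solve_distance(1.0, 1.0, 2 * math.pi / 3 - math.sqrt(3) / 2)
```

Any bracketing root finder would work equally well here; bisection is used only because it is short and cannot leave the valid range.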

The two best-fitting angles, θ1 and θ2, can also be used to calculate the two component distances making up the total distance between the circles:

$$d_1 = r_1 \cos(\theta_1), \qquad d_2 = r_2 \cos(\theta_2),$$

and the distance between the circles is

$$d = d_1 + d_2.$$
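A short numeric check of this decomposition, using illustrative radii and a hypothetical helper name. Note that when θ2 exceeds 90 degrees, d2 comes out negative, meaning the center of the smaller circle lies on the same side of the chord as C1; the sum still equals the center distance.

```python
import math

def component_distances(r1, r2, theta1, theta2):
    """Signed distances from each center to the chord, and their sum."""
    d1 = r1 * math.cos(theta1)
    d2 = r2 * math.cos(theta2)
    return d1, d2, d1 + d2

# Illustrative values: radii 2 and 1 with centers 1.5 apart.
# The half-angles are assumed to come from the law of cosines.
r1, r2, d = 2.0, 1.0, 1.5
theta1 = math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
theta2 = math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
d1, d2, total = component_distances(r1, r2, theta1, theta2)
```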

In summary, if the sizes of sets X and Y are known, as well as the size of their intersection, two circles can be sized and positioned so that their areas and overlap show the sizes of the sets and their intersection. Finally, the sizes of the circles and the distance between them are rescaled so that they can be plotted within the limits of the plot.
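The post does not spell out the rescaling, so the following is only one way it might be done: put both centers on a horizontal line, then apply a single scale factor so the pair fits inside a unit square. The function name, the unit-square target and the `margin` parameter are all my assumptions.

```python
def rescale_to_unit(r1, r2, d, margin=0.05):
    """Scale two circles (centers at x=0 and x=d) to fit inside [0, 1] x [0, 1]."""
    x_min = min(-r1, d - r2)
    x_max = max(r1, d + r2)
    width = x_max - x_min
    height = 2 * max(r1, r2)
    # One common factor keeps the radii and the distance in proportion.
    scale = (1 - 2 * margin) / max(width, height)
    # Shift so the midpoint of the layout lands at x = 0.5.
    offset = 0.5 - scale * (x_min + x_max) / 2
    return {
        "c1": (offset, 0.5),
        "c2": (offset + d * scale, 0.5),
        "r1": r1 * scale,
        "r2": r2 * scale,
    }

layout = rescale_to_unit(2.0, 1.0, 1.5)
```

Because every length is multiplied by the same factor, the ratios between the two circle areas and the overlap area are unchanged.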

The routine to calculate the spacing is part of the colorfulVennPlot package in R:

http://cran.r-project.org/web/packages/colorfulVennPlot/index.html