You own a van and you drive around places. This fine day you’re out looking for broken telephone poles in a suburb.

You drive around and find twenty telephone poles. Out of these twenty, you reckon three of them are dodgy, but it’s 3pm and you don’t have time to fix them today.

It’s important for you to be able to find the dodgy poles later, so while you’re at each one you take out your GPS unit and take a reading. Before you clock off for today, you check the history of the GPS unit and write down the three pairs of numbers there:

(151.2092, -33.8684) (151.2010, -33.8700) (151.2103, -33.8649)

You sleep easy because you know that with these coordinates written down, you’ll be able to find the poles whenever you like. You know that the coordinates represent longitude and latitude, and that if you plot them on a map you’ll be able to see where the poles are with no dramas.

What you might not realise is that you used a coordinate reference system to find the location of those poles. Coordinate reference systems are used to represent the locations of things on the Earth, and your GPS receiver works by utilising one.

By unwittingly utilising the coordinate reference system underpinning your GPS receiver, you can easily locate which telephone poles need fixing.

Nifty.

A Coordinate Reference System (CRS) is used to uniquely identify the location of things relative to the Earth. It also goes by the name Spatial Reference System (SRS).

Think of the Cartesian plane you studied at school: when you saw a set of coordinates of a point, you instantly knew the exact location of that point. It’s the same thing in the geographical world, just more complicated.

The spatial data you find will be created by using some CRS. Points, lines, polygons, raster sheets – all of this data has to refer to a CRS. It would be meaningless to call it spatial data otherwise.

CRS’s usually come in two categories – geographic and projected – and the two are closely related to one another.

In a nutshell: geographic Coordinate Reference Systems define locations using a 3D surface, and measure location with latitude and longitude.

There seems to be a lot of ambiguity surrounding what a geographic CRS is and what it contains. Many sources disagree on the topic and terminology used is often unclear. To keep things consistent I’ll be going by the definitions found on the EPSG website.

With that being said, a geographic CRS is made up of two things:

- A coordinate system
- A datum

The terms “coordinate system” and “coordinate reference system” are often treated as the same thing, but in this context they are different. Here, a coordinate system is a set of axes and their properties (axis names, order, abbreviations etc), and is contained within a CRS.

Coordinate systems are classified according to the geometric properties of their coordinate space and of the shape of their axes. They come in a few different forms. Here are a few of them:

- **Cartesian coordinate system:** position is given assuming the axes are orthogonal and straight. Each axis is measured in the same units.
- **Ellipsoidal coordinate system:** position is given by latitude, longitude, and optionally height. This is the coordinate system used in geographic CRS’s.
- **Vertical coordinate system:** a one-dimensional coordinate system that records heights of points above the Earth’s surface.

A datum is used to define a few things: the position of the origin, the scale and the axis orientation of a coordinate system. All these things are defined with respect to an object, which is typically the Earth.

Datums come in different forms. Here are two of them:

- **Geodetic datum:** defines the model of the Earth to use when calculating coordinates. The model is usually an ellipsoid or a sphere. The datum also contains the location and orientation of the model.
- **Vertical datum:** describes a reference level surface, also known as the “zero-height” surface, and the position of that surface with respect to the Earth.

Coordinates only make sense when considered in conjunction with a model of the Earth. The same location on Earth will be represented by two different sets of coordinates under two different Earth models.

One example of a geographic CRS is WGS 84. WGS 84 goes by the EPSG code 4326 and is one of the most important geographic CRS’s. It is the geographic CRS of choice for the omnipresent GPS system.

(Don’t know how EPSG codes work? Click here.)

Like other geographic CRS’s, WGS 84 is made up of a coordinate system and a datum.

Here is a diagram of the EPSG codes for WGS 84:

The coordinate system of WGS 84 is an ellipsoidal coordinate system. In this coordinate system position is given by latitude and longitude. The north direction on the latitude axis is taken as positive, and likewise the east direction on the longitude axis is taken as positive.

The datum of WGS 84 is a geodetic datum, going under the EPSG code 6326. This datum models the Earth as an ellipsoid with a semi-major axis radius of 6,378,137m, and an inverse flattening of 298.257223563 (EPSG 7030). The Prime Meridian defines the location and origin of the model. The Prime Meridian is defined as running through Greenwich (EPSG 8901).
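To see how the ellipsoid parameters fit together, here’s a small Python sketch that derives the semi-minor (polar) axis of the WGS 84 ellipsoid from the two parameters the datum actually stores:

```python
# Derive the WGS 84 ellipsoid's semi-minor axis from the two numbers
# the datum stores: the semi-major axis and the inverse flattening.
a = 6378137.0          # semi-major axis in metres (ellipsoid EPSG 7030)
inv_f = 298.257223563  # inverse flattening

f = 1 / inv_f          # flattening
b = a * (1 - f)        # semi-minor (polar) axis

print(round(b, 3))  # roughly 6356752.314 m
```

The ellipsoid is about 21 km flatter pole-to-pole than it is across the equator, which is why treating the Earth as a perfect sphere only gets you so far.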

This CRS is defined as a 2D CRS, since the ellipsoidal height is not provided in the datum. If it were given, WGS 84 would be a 3D CRS instead.

The EPSG database also defines the area of use for a geographic CRS. While WGS 84 is suitable to be used across the world (EPSG 1262), other geographic CRS’s are only suitable in certain areas.

Projected CRS’s can be thought of as the two-dimensional cousin of the three-dimensional geographic CRS. The geographic CRS represents data using a three-dimensional construct; the projected CRS uses projections to transform points from the three-dimensional construct to a two-dimensional map.

Simply put, a projection is a series of transformations that convert the locations of points on a three dimensional surface (defined in the geographic CRS) to locations on a flat surface (defined in the projected CRS).

A projected CRS is made up of three things:

- A geographic CRS
- A coordinate system
- A map projection

A geographic CRS uses an ellipsoidal coordinate system; a projected CRS uses a Cartesian coordinate system. A geographic CRS measures position in latitude and longitude, in degrees; a projected CRS measures position in northings and eastings, in metres (or feet, kilometres etc).

It is difficult to represent a three-dimensional Earth as a two-dimensional map. To mitigate these difficulties a vast number of map projections exist, each with their own strengths and weaknesses.

One way of thinking about a map projection is as a way of converting geographic coordinates (latitude and longitude) into Cartesian coordinates (and vice versa). You could also say that a map projection converts coordinates referenced in a geographic CRS to coordinates referenced in a projected CRS.
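As a rough illustration of what that conversion looks like, here’s a Python sketch of the spherical Web Mercator formulas (the simplified maths behind EPSG 3857). Real projected CRS’s use full ellipsoidal formulas, so treat this as a toy conversion rather than a production one:

```python
import math

R = 6378137.0  # Earth radius used by spherical Web Mercator, in metres

def mercator_forward(lon_deg, lat_deg):
    """Convert geographic coordinates (degrees) to projected metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def mercator_inverse(x, y):
    """Convert projected metres back to geographic degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat

# One of the telephone poles from earlier:
x, y = mercator_forward(151.2092, -33.8684)
lon, lat = mercator_inverse(x, y)
print(round(lon, 4), round(lat, 4))  # round-trips back to the input
```

The forward function is the projection (geographic CRS to projected CRS); the inverse function takes you back the other way.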

For a successful projection you need more than just the name of the map projection. You need projection parameters. These parameters answer questions like:

- Where is the centre of the projection?
- What is the scale factor at each point?
- Where are the standard parallels?
- What are the values of the false easting and false northing?

Let’s look at an example.

The NAD83 / Canada Atlas Lambert (EPSG 3978) projection is a projected CRS. This projected CRS uses the geographic CRS known as NAD83 and the Lambert Conformal Conic map projection. The parameters are also provided, including specifying that this conical projection uses two standard parallels.

Here’s the diagram of the EPSG codes for this projected CRS. It’s quite complicated.

Under the gold-standard EPSG system, a map projection is a subcategory of a “Coordinate Conversion”, or a Conversion for short. Included within the Conversion is a category that details the area of the projection, the conversion method of the projection and the projection parameters.

It’s evident that the projected CRS uses a Cartesian coordinate system, a geographic CRS and a map projection. You can see that there are two coordinate systems referenced here: the ellipsoidal coordinate system used in the geographic CRS, and the Cartesian coordinate system used in the projected CRS.

For many applications of spatial data you won’t need to think about which CRS it’s referenced to. You’ll put together the layers of data and things will just work.

It’s when things go wrong that you’ll need to dig deeper. Perhaps layers aren’t aligning properly, lakes don’t look right, or the positions of your data points look suspicious – maybe the error can be traced back to a CRS mismatch somewhere.

Or maybe you can’t find those broken telephone poles again. Probably should have taken a photo.

EPSG stands for the European Petroleum Survey Group – a scientific organisation with ties to the oil industry. Or at least, it used to be: EPSG was absorbed by IOGP (International Association of Oil & Gas Producers) in 2005 and ceased operation as an independent body.

We don’t remember EPSG for its history as an entity, but rather for the database that it compiled. EPSG created the EPSG Geodetic Parameter Set – a comprehensive database of coordinate reference systems, datums, ellipsoids, and other such geodetic parameters. Although EPSG as an organisation is no more, its database lives on and is still updated and maintained.

Each entry in the EPSG database has a unique code associated with it. These codes are known as EPSG codes and they seem to be found everywhere that spatial data lurks.

An EPSG code might refer to a:

- Coordinate Reference System (CRS) – like EPSG 4326, which refers to the coordinate reference system WGS 84.
- Datum – like EPSG 6326, which refers to the datum used in the coordinate reference system WGS 84.
- Area of use – like EPSG 1262, which refers to the entire world.
- Prime Meridian – like EPSG 8901, which refers to the meridian passing through Greenwich.

This is not a comprehensive list. EPSG codes also refer to ellipsoids, spheroids and miscellaneous other things.

For further illustration here is the EPSG structure for WGS 84, a commonly used geographic Coordinate Reference System:

Evidently, many EPSG codes together make up the WGS 84 coordinate reference system. The name WGS 84 appears three times in this hierarchy – under the entries for Geodetic CRS, Geodetic Datum and Ellipsoid – so the name alone is ambiguous. The code EPSG 4326, by contrast, is unambiguous: it refers only to the coordinate reference system, not the ellipsoid or the geodetic datum.

Next time you meet an EPSG code in the wild, you’ll be prepared.

Ideally we wouldn’t have to use maps because we’d use globes for everything. But globes aren’t convenient. Flat maps can be viewed on computer screens, printed out, displayed on a wall, rolled up into a scroll, and are simply much more useful and convenient than globes are.

There’s another problem with globes: they’re only convenient to use at small scales. If you wanted to illustrate spatial data where you need suburb-level detail, you’d need a massive globe!

Two dimensional maps it is then. How do we make one? Through something called a *projection*.

Suppose you had a transparent globe with a light bulb in its centre, and you also had a big sheet of paper. What you could do is wrap the paper around the globe, so that the countries and features of the globe are projected onto the paper. Then you could trace around the countries with a pencil to give yourself a two dimensional map of the world.

How you wrap the paper around the globe makes a big difference. There are three families of map projection – cylindrical, conical, and planar – each referring to the way that you surround the globe with the paper:

Unfortunately, every projection is distorted somehow – there’s no such thing as a perfect projection. Projections distort area, shape, distance or direction. You can keep one of these accurate, but only by compromising on the others.

There exist many, many projections – most of which very few people have heard of. This tool is an interesting way to explore some of these projections.

Here are some of the more common ones:

The Mercator projection is a cylindrical projection instantly recognisable as the “default” map projection, for better or for worse. While it is highly accurate near the equator, it shows the areas of countries towards the north and south poles as much larger than they should be.

A side effect of this projection being so prevalent is that people are often surprised when they see how big Africa really is. Because it sits in the centre of the projection, Africa is less affected by the significant vertical distortions affecting the likes of Russia and Greenland. Other projections represent Africa’s relative size a lot better than the Mercator projection does.

Many online maps use the Mercator projection, including those made by Google, Bing, OpenStreetMap and Yahoo. Since these services mostly provide local maps rather than global ones, it doesn’t matter to their users that Greenland looks as big as Africa. Users are more interested in a conformal map: the scale is the same in all directions, angles are depicted correctly, and real-life circles are depicted as circles on the map. In addition, the Mercator projection has a constant north direction wherever you are in the world.

Having said all this, the implementation by Google, Bing etc has some problems. This article explains these in some detail.

A compromise projection, the Robinson projection was created to “look right” rather than to measure distances exactly. It manages to fit the entire globe onto a flat surface with much less area distortion than the Mercator projection. Unlike the Mercator, the Robinson projection is pseudocylindrical: its meridians are curved rather than straight vertical lines.

The meridians on this projection curve gently towards the poles, which avoids an extreme distortion of shape. The flipside is that the poles become a line rather than a point – just look at Antarctica dominating the southern hemisphere.

The Robinson projection has a unique history – unlike other projections, it wasn’t derived from a mathematical model. Rather, it was constructed through computer simulations and a trial-and-error approach, with a mathematical model devised afterwards to reproduce the end result.

A variation of the Mercator projection, the transverse Mercator projection has a variable central meridian – you can place it along a number of meridians. Typically you’d place it at the location you’re interested in mapping, since this projection is highly accurate within about 5 degrees of its central meridian.

Both the Mercator projection and the transverse Mercator projection are very accurate in the middle of their mapping region. This makes them good choices for local mapping.

Like the regular Mercator projection, the transverse Mercator projection shouldn’t be used for showing the whole globe at once. The two have similar difficulties in mapping areas far away from the central meridian accurately. Using the transverse Mercator projection won’t fix the large distortion in areas at the boundaries of the map.

One difference between the two is where the distortion occurs. The regular Mercator projection has its area distortion on the extremes of the y-axis. The transverse Mercator projection, by contrast, is distorted heavily on the extremes of the x-axis.

The transverse Mercator projection is the basis for the Universal Transverse Mercator (UTM) mapping system. This system divides up the globe into many narrow longitude bands, and then applies the transverse Mercator projection with the central meridian located at the centre of each band. In this way the UTM mapping system isn’t a single map projection, but rather a number of them, and the location of features has to be specified relative to which longitude band it’s in.

One thing about the UTM system is that each longitude band uses a two-dimensional Cartesian coordinate system, rather than the more familiar degrees system. This can be confusing to newcomers.
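For illustration, the zone-from-longitude rule is simple enough to sketch in a few lines of Python (this ignores the Norway and Svalbard exceptions to the regular grid):

```python
import math

def utm_zone(lon_deg):
    """Return the UTM zone number for a longitude in degrees.
    The globe is split into 60 bands of 6 degrees, numbered from 180°W."""
    return int(math.floor((lon_deg + 180) / 6)) % 60 + 1

print(utm_zone(151.2092))  # Sydney falls in zone 56
```

Each zone then gets its own transverse Mercator projection, with the central meridian running down the middle of the band.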

Place a cone over the Earth, and then project its surface onto the cone so that the angles of landmasses are preserved (this is also called a *conformal* projection). This is the basis of the Lambert Conformal Conic projection.

The circles of latitude (or *parallels*) on the Earth that touch the cone are known as the *standard parallels*, and along these parallels the scale factor is 1. The Lambert Conformal Conic projection comes in either *tangent* or *secant* form, meaning the cone touches either one or two circles of latitude respectively.

The diagram below refers to the secant form of the projection.

If the secant form is used, we have two standard parallels. Between the two parallels the scale factor drops below 1, and outside them it rises above 1.

The Lambert Conformal Conic is a conformal projection, meaning that the angles of countries are kept constant. The result of this is that area distortion is minimal near the two standard parallels, but increases as you move further away from them.

The Lambert Conformal Conic projection is used primarily by pilots and others who want maps that are accurate over large east–west distances. Its drawbacks mean that it isn’t used for world maps, but rather for specific mapping applications.

One of the core problems in reinforcement learning is the multi-armed bandit problem. This problem has been well studied and is commonly used to explore the tradeoff between exploration and exploitation integral to reinforcement learning.

To illustrate this tradeoff and to visualise different ways of solving the multi-armed bandit problem, I created a simulation using the JavaScript library D3.js.

Click on the below image to view it!

Given a number of options to choose between, the multi-armed bandit problem describes how to choose the best option when you don’t know much about any of them.

You are faced repeatedly with *n* choices, of which you must choose one. After your choice, you are again faced with *n* choices of which you must choose one, and so on.

After each choice, you receive a numerical reward chosen from a probability distribution that corresponds to your choice. You don’t know what the probability distribution is for that choice before choosing it, but after you have picked it a few times you will start to get an idea of its underlying probability distribution (unless it follows an extreme value distribution, I guess).

The aim is to maximise your total reward over a given number of selections.

One analogy for this problem is this: you are placed in a room with a number of slot machines, and each slot machine when played will spit out a reward sampled from its probability distribution. Your aim is to maximise your total reward.

If you like, here are three more explanations of the multi-armed bandit problem:

- This article comes in two parts. The first part describes the problem and the second part describes a Bayesian solution.
- The Wikipedia explanation
- A more mathematical introduction

There are many strategies for solving the multi-armed bandit problem.

One class of strategies is known as semi-uniform strategies. These strategies always choose the best slot machine except for a set percentage of the time, where they choose a random slot machine.

Three of these strategies can be easily explored with the aid of the simulation:

**Epsilon-greedy strategy:** The best slot machine is chosen with probability 1−ε, and a random slot machine with probability ε. Implement this by leaving the epsilon slider in one place during a simulation run.

**Epsilon-first strategy:** Choose randomly for the first *k* trials, and then after that choose only the best slot machine. To implement this start the simulation with the epsilon slider at 1, then drag it to 0 at some point during the simulation.

**Epsilon-decreasing strategy:** The chance ε of choosing a slot machine at random decreases over the course of the simulation. Implement this by slowly dragging the epsilon slider towards 0 while the simulation is running. You can use the arrow keys to decrement it by constant amounts.
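The epsilon-greedy strategy can be sketched in a few lines of Python. The arm reward distributions and parameter values here are made up for illustration; the simulation itself is written in D3.js, not Python:

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=10000, seed=42):
    """Play `steps` rounds against arms with the given true mean rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms       # how many times each arm was pulled
    estimates = [0.0] * n_arms  # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)            # explore: random arm
        else:
            arm = estimates.index(max(estimates))  # exploit: best arm so far
        reward = rng.gauss(true_means[arm], 1.0)   # sample the arm's payout
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, total

estimates, total = epsilon_greedy([1.0, 2.0, 1.5])
print([round(e, 2) for e in estimates])  # estimates converge on the true means
```

Swapping the fixed `epsilon` for a schedule that shrinks it each step would give you the epsilon-decreasing strategy instead.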

There also exist other classes of strategy for solving the multi-armed bandit problem, such as probability matching strategies, pricing strategies and strategies specific to particular application domains. There are also different variants of the multi-armed bandit problem, including non-stationary and contextual variants.

I hope you find the simulation useful. Happy banditing!

I found that when I read about these forces, text-based explanations never really solidified the concepts for me. To really get a sense of how the forces worked, I had to build lots of force graphs and tweak the parameters manually to see what each of them did.

So I got really excited when I discovered this testing ground for force directed graphs, built especially for version 4 of D3:

Made by Steve Haroz, it’s a brilliantly simple way to allow you to experiment quickly and easily with all the different settings available for force graphs. Use it to develop an intuitive sense of what each force does and how each of the force parameters affects the final result.

Enjoy!

Hope you found that useful! Click here to view the rest of the force directed graph series.

Spatial data is data that has a spatial component – the data took place *somewhere*, and that *somewhere* is important. The data is linked intricately to a place on Earth, and that place is relevant and important and you should care about its welfare.

Spatial data is called many different things. It can also be referred to as geospatial data, or as geographic information, or sometimes as spatial information.

Spatial data usually comes in two formats: raster and vector.

Raster data stores values in a grid, consisting of rows and columns of cells. Think pixels on a computer screen, or art made out of Lego. That’s what raster data looks like.

Every cell has a value. There are no empty cells. Even if a cell has the value 0, it still has a value.

Many image formats can be used to store raster data, like GIF, TIF and JPEG files. But to be usable, they’ve got to have reference information associated with the image to specify where on the Earth the image sits. This process of taking an image and associating location information with it is called georeferencing.
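A common way to store that reference information is an affine transform – this is what a GIS “world file” holds. A Python sketch with hypothetical parameters:

```python
def pixel_to_map(col, row, x_origin, y_origin, pixel_width, pixel_height):
    """Map the centre of pixel (col, row) to map coordinates.
    pixel_height is negative because row numbers increase downwards
    while map y-coordinates increase upwards."""
    x = x_origin + (col + 0.5) * pixel_width
    y = y_origin + (row + 0.5) * pixel_height
    return x, y

# Hypothetical raster: top-left corner at (150.0, -33.0),
# each pixel 0.01 degrees across.
print(pixel_to_map(0, 0, 150.0, -33.0, 0.01, -0.01))  # centre of top-left pixel
```

With those six numbers attached, every cell in the grid knows exactly where it sits on the Earth.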

Raster data is typically useful for things like weather cover, vegetation growth, or other things where satellite imagery comes in handy.

Vector data is made of geometrical shapes. These shapes are also known as vector objects and are used to represent the location of features on the Earth.

Three common types of geometrical shapes are points, lines and polygons. Points are used for things like mountain peaks and wells – things that you can represent really well by just a dot on a map. Lines are used for things like rivers, roads, train tracks and property boundaries. Polygons are used for lakes, buildings, property areas, forests, and other things that you’d want to represent the area of.

You might also encounter polylines on your vectorial adventures. Polylines sound scary, but they’re just a collection of straight lines joined end to end. No more.

Attribute data can be associated with each geometrical shape. Cities could have as attributes their name, their population, or the number of buildings they contain. A lake could have as attributes water colour, depth and salinity.
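One widely used vector format is GeoJSON, which stores shapes and their attributes together. A Python sketch with made-up coordinates and attributes:

```python
# GeoJSON-style features: geometry plus attribute data ("properties").
point = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [151.2092, -33.8684]},
    "properties": {"name": "Mountain peak", "elevation_m": 1024},
}

line = {
    "type": "Feature",
    "geometry": {"type": "LineString",
                 "coordinates": [[151.20, -33.86], [151.21, -33.87]]},
    "properties": {"name": "Main Street"},
}

polygon = {
    "type": "Feature",
    "geometry": {"type": "Polygon",  # one ring; first point repeats as last
                 "coordinates": [[[151.20, -33.86], [151.21, -33.86],
                                  [151.21, -33.87], [151.20, -33.86]]]},
    "properties": {"name": "Lake", "salinity": "fresh", "depth_m": 12},
}

print(point["geometry"]["type"], line["geometry"]["type"],
      polygon["geometry"]["type"])
```

The pattern is the same for all three types: a geometry saying where the feature is, and a properties dictionary saying what it is.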

Unlike raster data, empty spaces are allowed in vector data. The vectors show where features are present and the space around the features is empty.

Vector and raster data are very different. But you knew that already.

One difference between the two is in speed. Raster data is quicker to process than vector data. But it’s likely that your computer is fast enough to work with either type quickly, so maybe you don’t care about this.

Vector data is more compact than raster data – the files will be smaller. Your hard drive is big enough that you probably don’t care about this either.

But then you find that your raster data resolution isn’t enough for what you want to do with it, so you go and double it. Then you notice that your data file has just quadrupled in size. Maybe you do care about file size after all.

Vector data is more intuitive than raster data, and it can support topological relationships between features. Because of this it’s probably a friendlier format for spatial analysis. As a bonus it’s easy to identify similar areas on your map, like areas with the same temperature or areas with the same elevation.

Ultimately often you just won’t get a choice. Data you want might only be available as one of the two formats, so it’s important to know how to work with each.

You are given an array of numbers: something like 3,5,-2,-1,5,3,1,-4,-6,5. Your job is to come up with the subarray with the highest sum.

For this example it would be the subarray [3,5,-2,-1,5,3,1], which has a sum of 14. No other subarray gives a higher sum.

Let’s look at a few more:

- The sequence -3,-3,-2,-4,-2,-5,-4 has a highest sum of -2, given by the subarray [-2].
- The sequence 1,2,-1,-2,1 has a highest sum of 3, given by the subarray [1,2].
- The sequence 3,2,1,4 has a highest sum of 10, given by the subarray [3,2,1,4].

It’s an interesting problem with many applications. How do we solve it?

While there are many ways to solve this problem, the best solution is Kadane’s Algorithm. It’s a remarkably simple algorithm, yet deceptively difficult to understand.

Here’s the Python algorithm given on Wikipedia:

```python
def max_subarray(A):
    max_ending_here = max_so_far = A[0]
    for x in A[1:]:
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far
```

Here’s an R implementation, with a trace added to help figure out what’s going on.

```r
find_max_subarray = function(arr) {
  max_ending_here = arr[1]
  max_so_far = arr[1]
  for (i in 2:length(arr)) {
    max_ending_here = max(arr[i], max_ending_here + arr[i])
    max_so_far = max(max_so_far, max_ending_here)
    print(paste(max_ending_here, max_so_far, arr[i]))  # for tracing
  }
  return(max_so_far)
}
```

Try running the algorithm yourself. Feed it different subsequences and look at the trace to see if you can figure out how it works.

Understanding this algorithm is not a simple task. Here’s how I understand it.

For each number in the sequence, we do this:

Each number “decides” if it wants to use the previous subarray sum, or if it wants to start a new subarray from scratch. At every comparison, it also checks to see if its subarray sum is higher than any previous one.

Let’s look at an example – the sequence 1,-2,3,-1,2.

We iterate through the numbers, starting at the second number in the sequence (-2).

At this point, the maximum subarray sum seen so far is 1 – the sum of the first element in the sequence. This is also the previous subarray sum.

The new subarray sum of -1 is passed to the next number. This is the sum of the current subarray [1,-2].

The next number (3) gets the previous subarray sum of -1:

We start a new subarray at this point. The current subarray is [3] and the current subarray sum is 3. The maximum subarray sum is also updated, since 3 is higher than anything we’d seen before.

The next number (-1) is passed the previous subarray sum of 3:

Now the subarray is [3,-1] and the subarray sum is 2. Do you see how this goes now?

The last number is passed the subarray sum of 2:

The subarray at this point is [3,-1,2] and the subarray sum is 4. The max_so_far variable is updated to read 4.

At this point the sequence has ended and max_so_far is returned – which is 4. This is the highest subarray sum of the sequence, and the subarray was [3,-1,2].

And that’s it!
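If you also want the subarray itself, not just its sum, a small extension of the algorithm tracks where the best subarray starts and ends. This is my own variation, not the Wikipedia version:

```python
def max_subarray_with_indices(A):
    """Kadane's algorithm, extended to return the best subarray too."""
    best_sum = current_sum = A[0]
    best_start = best_end = current_start = 0
    for i in range(1, len(A)):
        if A[i] > current_sum + A[i]:  # start a new subarray here
            current_sum = A[i]
            current_start = i
        else:                          # extend the current subarray
            current_sum += A[i]
        if current_sum > best_sum:
            best_sum = current_sum
            best_start, best_end = current_start, i
    return best_sum, A[best_start:best_end + 1]

print(max_subarray_with_indices([3, 5, -2, -1, 5, 3, 1, -4, -6, 5]))
# (14, [3, 5, -2, -1, 5, 3, 1])
```

The comparison `A[i] > current_sum + A[i]` is just the “decide to start fresh” step from the walkthrough above – it’s true exactly when the running sum has gone negative.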

Hope you enjoyed this post! Let me know by leaving a comment if you need clarification on anything.

You might argue that the mode is a better choice than the median and the mean, mostly based on the below graphic.

But the mode suffers from other problems. It will fail you on an exponential distribution, or on a log distribution. The mean and median will stand tall on the centrally-tended morally-high ground and you will rue your foolishness and cry to the heavens.

So should we use the median then? It’s well known that for a skewed dataset, the median is a much better choice than the mean – the picture above proves that point. Even for symmetric distributions, it’ll perform as well as the mean will. It seems like the best choice for every situation.

But – there’s this great quote:

“When a value is garbage for us, we call it an ‘outlier’ and we want to be robust against it – so we use the median. When the same value is attractive for us, we call it ‘extreme’ and we want to be sensitive to it – so we use the mean.”

Maybe things are more complicated.

The strength of the mean is that it’s quick to calculate – all you have to do is sum up all observations, and divide by the total number of observations, and you’re done. In R:

```r
calc_mean = function(vec) {
  return(sum(vec) / length(vec))
}
```

The median has to first order all the observations and then find the middle value. It’s a more complicated process:

```r
calc_median = function(vec) {
  vec = sort(vec)
  n = length(vec)
  if (n %% 2 == 0) {
    # even number of elements: average the two middle values
    return((vec[n / 2] + vec[n / 2 + 1]) / 2)
  } else {
    return(vec[ceiling(n / 2)])
  }
}
```

Finding the mean of a group of numbers is an O(n) operation, whereas finding the median by sorting is an O(n log n) operation.
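Worth noting: the O(n log n) figure is for the sort-based approach. Selection algorithms like quickselect can find a median in expected O(n) time, at the cost of a bigger constant factor and an unlucky worst case. A Python sketch:

```python
import random

def quickselect(vec, k, rng=random.Random(0)):
    """Return the k-th smallest element (0-indexed) in expected O(n) time."""
    vec = list(vec)
    while True:
        pivot = vec[rng.randrange(len(vec))]
        lows = [x for x in vec if x < pivot]
        highs = [x for x in vec if x > pivot]
        n_pivots = len(vec) - len(lows) - len(highs)
        if k < len(lows):
            vec = lows                 # answer is among the smaller values
        elif k < len(lows) + n_pivots:
            return pivot               # answer is the pivot itself
        else:
            k -= len(lows) + n_pivots  # answer is among the larger values
            vec = highs

def fast_median(vec):
    n = len(vec)
    if n % 2 == 1:
        return quickselect(vec, n // 2)
    return (quickselect(vec, n // 2 - 1) + quickselect(vec, n // 2)) / 2

print(fast_median([5, 1, 4, 2, 3]))  # 3
```

In practice, R’s built-in `median` still sorts, which is why the gap shows up in the timings below.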

This may not seem like a big difference, and isn’t for small datasets. But what happens if you have a lot of data?

Let’s run some quick simulations in R. We’ll calculate the time taken to find the mean and median for a range of data sizes, going from 10,000,000 elements to 250,000,000 elements. We’ll do this 50 times and then plot the average calculation times on a graph.

Here’s the code for this:

```r
trials = 50
vec_lengths = c(10000000, 50000000, 100000000, 150000000, 200000000, 250000000)
mean_time = matrix(nrow = length(vec_lengths), ncol = trials)
median_time = matrix(nrow = length(vec_lengths), ncol = trials)

# generate vectors and record runtimes for mean and median
for (j in 1:trials) {
  set.seed(j)
  count = 1
  for (i in vec_lengths) {
    x = rnorm(i)

    time_start = proc.time()
    mean(x)
    mean_time[count, j] = (proc.time() - time_start)[3]

    time_start = proc.time()
    median(x)
    median_time[count, j] = (proc.time() - time_start)[3]

    count = count + 1
  }
}

mean_time_rowmeans = rowMeans(mean_time)
median_time_rowmeans = rowMeans(median_time)

library(ggplot2)
ggplot(data = data.frame(mean_time = mean_time_rowmeans,
                         median_time = median_time_rowmeans,
                         size = vec_lengths),
       aes(x = vec_lengths)) +
  geom_point(aes(y = mean_time, colour = "mean")) +
  geom_point(aes(y = median_time, colour = "median")) +
  geom_line(aes(y = mean_time, colour = "mean")) +
  geom_line(aes(y = median_time, colour = "median")) +
  xlab("Number of Data Points") +
  ylab("Calculation Time (s)")
```

Here are the results:

The median clearly scales worse than the mean does. Imagine you had a few billion data points, and you can start to see why these questions matter.

The median and mean are fundamentally different concepts, and it can be an oversimplification to classify them as just different measures of the same thing. Sometimes they simply answer different questions.

In broad terms, if you’ve got symmetric data and a lot of it, use the mean. If you’ve got skewed data and lots of it, you should use the median where possible, but you might have to use the mean if your data is too large. If you don’t have a lot of data, then just use the median for everything.

Tread carefully…and remember that the median isn’t always king.

This example builds off the d3-zoom articles to add zoom functionality onto our force directed graph. If you’ve got questions on how zoom works or what this code is really doing, check those articles out!

Once you know what's going on, adding zoom to a force directed graph is really simple.

I've covered most of the code in the above example (click on the picture to see it) in other articles, so I won't go over it again.

The key things to note are as follows:

What I refer to as the “zoom handler” is simply a variable to store the function that is applied on SVG elements to allow them to have zoom functionality.

In this example we only need one zoom handler, and we apply it onto the backing SVG element so that it will trigger whenever and wherever we mouse-wheel over the graph.

//add zoom capabilities
var zoom_handler = d3.zoom()
    .on("zoom", zoom_actions);

zoom_handler(svg);

function zoom_actions(){
  g.attr("transform", d3.event.transform);
}

The key point to note is that we create a “g” element that contains both the nodes and the lines.

This means that any SVG coordinate transforms we apply will apply to the entire force directed graph, and not to subsets of it. You can imagine that if we applied the zoom handler to a "g" element containing only the nodes, we'd get strange behaviour where the nodes zoomed in but the links stayed put. No thanks.

This also prevents unwanted behaviour from occurring when combining zoom and drag together. Read this article for more on the subject.

Anyway, this bit of the code is like this:

//add encompassing group for the zoom
var g = svg.append("g")
    .attr("class", "everything");

//draw lines for the links
var link = g.append("g")
    .attr("class", "links")
  .selectAll("line")
  .data(links_data)
  .enter().append("line")
    .attr("stroke-width", 2)
    .style("stroke", linkColour);

//draw circles for the nodes
var node = g.append("g")
    .attr("class", "nodes")
  .selectAll("circle")
  .data(nodes_data)
  .enter()
  .append("circle")
    .attr("r", radius)
    .attr("fill", circleColour);

That’s it – it’s really not that hard at all. Happy zooming!

Hope you found that useful! Click here to view the rest of the force directed graph series.

In this article I’ll present two ways to apply drag and zoom to your d3 visualisations. One way is easy and simple, and one way is complicated and difficult, yet both examples hold key insights.

Mike Bostock has written a simple example where he combines drag and zoom. Click on the above picture to see his code.

Mike's code is fairly self-explanatory, so I'll paraphrase it. Roughly speaking, it goes like this:

- Create an SVG element to house the visualisation. Create some points and lay them out in a phyllotaxis pattern.

var svg = d3.select("svg"),
    width = +svg.attr("width"),
    height = +svg.attr("height");

var points = d3.range(2000).map(phyllotaxis(10));

function phyllotaxis(radius) {
  var theta = Math.PI * (3 - Math.sqrt(5));
  return function(i) {
    var r = radius * Math.sqrt(i), a = theta * i;
    return {
      x: width / 2 + r * Math.cos(a),
      y: height / 2 + r * Math.sin(a)
    };
  };
}

- Append a “g” element to the SVG element. This “g” element will contain all the circles.

var g = svg.append("g");

- Add circles to the “g” element defined in Step 2. Allow these circles to be dragged by defining a drag handler in the selection.call function. Also define a function called “dragged”, which contains the code that runs whenever the “drag” event is detected.

g.selectAll("circle")
  .data(points)
  .enter().append("circle")
    .attr("cx", function(d) { return d.x; })
    .attr("cy", function(d) { return d.y; })
    .attr("r", 2.5)
    .call(d3.drag()
        .on("drag", dragged));

function dragged(d) {
  d3.select(this).attr("cx", d.x = d3.event.x).attr("cy", d.y = d3.event.y);
}

- Add a zoom handler to the SVG element and define the minimum and maximum zoom levels. Also define a function called "zoomed" that applies a transform whenever a zoom event is detected.
**The transform is applied to the g element defined in Step 2**. This is important.

svg.call(d3.zoom()
    .scaleExtent([1 / 2, 8])
    .on("zoom", zoomed));

function zoomed() {
  g.attr("transform", d3.event.transform);
}

And we’re done. Quick and painless.

When we zoom, we apply a transform on the “g” element that contains the circles. Every circle inherits this transform from its parent element.

This means that when we drag a circle, the mouse event takes into account the transformation of the SVG element.

We don’t run into problems when we drag circles around. Life is good.

Now the second way.

This code follows much of the same logic as above.

We create an SVG element and append a "g" element to it. We create a bunch of random circles and append them to the "g" element. We even create a stylish black background rectangle, also appended to the SVG element.

var svg = d3.select("svg"),
    width = +svg.attr("width"),
    height = +svg.attr("height");

//create some circles at random points on the screen
//create 50 circles of radius 20
//specify centre points randomly through the map function
var radius = 20;
var circle_data = d3.range(50).map(function() {
  return {
    x: Math.round(Math.random() * (width - radius * 2) + radius),
    y: Math.round(Math.random() * (height - radius * 2) + radius)
  };
});

//stylish black rectangle for sexy looks
var rect = svg.append("g")
    .attr("class", "rect")
  .append("rect")
    .attr("width", width)
    .attr("height", height)
    .style("fill", "black");

//funky yellow circles
var circles = d3.select("svg")
  .append("g")
    .attr("class", "circles")
  .selectAll("circle")
  .data(circle_data)
  .enter()
  .append("circle")
    .attr("cx", function(d) { return d.x; })
    .attr("cy", function(d) { return d.y; })
    .attr("r", radius)
    .attr("fill", "yellow");

Then we create a zoom handler and apply it to the SVG element.

//create zoom handler
var zoom_handler = d3.zoom()
    .on("zoom", zoom_actions);

//specify what to do when the zoom event listener is triggered
function zoom_actions(){
  circles.attr("transform", d3.event.transform);
}

//add zoom behaviour to the svg element backing our graph
//same thing as svg.call(zoom_handler);
zoom_handler(svg);

And things start to go wrong.

Instead of defining the zoom transform on the “g” element containing the circles, like we did in the above example, we define the zoom transform on the circles themselves.

Every circle has the same transform applied to it. The “g” element has no transform applied onto itself.

A subtle difference, but a crucial one, because now mouse events won't do what we think they will: they still reference the original, untransformed coordinate system of the encompassing SVG element.

This means that when we add drag capabilities into the mix, things go haywire.

Try zooming in or out, and then dragging the circles around. They won't follow the cursor! Luckily there's a workaround.

We need to change our drag function to take into account the current transform.

The first step is to find out where the circle we're dragging starts out. We can record this in the "start" event of the drag handler.

//used to capture the drag position
var start_x, start_y;

//create drag handler with d3.drag()
var drag_handler = d3.drag()
    .on("start", drag_start)
    .on("drag", drag_drag);

function drag_start(){
  //get the starting location of the drag,
  //used to offset the circle
  start_x = +d3.event.x;
  start_y = +d3.event.y;
}

The drag function is a bit more involved. In the code below, we:

- Check whether we've zoomed in or out at all. If we haven't, the "transform" attribute won't exist, so we can't read it – we're operating at the default scale factor of 1. If we have zoomed, we extract the current scale factor from the "transform" attribute of whatever we're dragging, using some string manipulation.

- Adjust the location of the circle to follow the mouse pointer. Because d3.event.x and d3.event.y use the original, non-transformed coordinate system of the SVG background, the circle drags too far when we're zoomed in and not far enough when we're zoomed out. We fix this by taking the starting point and dividing the mouse movement by the current scale factor.

function drag_drag(d) {
  //get the current scale of the circle
  var current_scale;

  //case where we haven't scaled the circle yet
  if (this.getAttribute("transform") === null) {
    current_scale = 1;
  }
  //case where we have transformed the circle
  else {
    var current_scale_string = this.getAttribute("transform").split(' ')[1];
    current_scale = +current_scale_string.substring(6, current_scale_string.length - 1);
  }

  d3.select(this)
    .attr("cx", d.x = start_x + ((d3.event.x - start_x) / current_scale))
    .attr("cy", d.y = start_y + ((d3.event.y - start_y) / current_scale));
}
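To see the two pieces of arithmetic in isolation, here's the same logic pulled out into plain functions (the names getScale and correctedPosition are mine, for illustration only). The transform attribute written by the zoom handler looks like "translate(30,40) scale(2.5)", and we divide the mouse movement by the scale factor it contains.

```javascript
//extract the scale factor from a transform string such as
//"translate(30,40) scale(2.5)" - the same string manipulation
//used in the drag function
function getScale(transform) {
  //no transform attribute yet means we're at the default scale of 1
  if (transform === null) {
    return 1;
  }
  var scale_string = transform.split(' ')[1];                //"scale(2.5)"
  return +scale_string.substring(6, scale_string.length - 1); //2.5
}

//adjust a dragged coordinate for the current scale factor:
//divide the raw mouse movement by the scale so the circle
//tracks the cursor instead of overshooting
function correctedPosition(start, eventPos, scale) {
  return start + (eventPos - start) / scale;
}

getScale(null);                           //1
getScale("translate(30,40) scale(2.5)"); //2.5
correctedPosition(100, 140, 2);          //120 - half the raw 40px movement
```

Note that this parsing is fragile – it assumes the attribute has exactly the "translate(…) scale(…)" shape that d3's zoom behaviour writes.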

Last step is to apply the drag handler to the circles and we’re done.

drag_handler(circles);

It works – but it’s complicated and ugly.

The first takeaway is that if you want to add zoom functionality to a whole visualisation, it's important to apply the zoom handler to an encompassing element wherever possible. So if you have a "g" element holding lots of circles, apply the zoom behaviour to the "g" element, not to the individual circles. If you don't, the mouse events won't adjust to the current transform.

The second takeaway is that if something seems hard and convoluted that should be easy, odds are that you’ve missed something. Take a step back and figure out what’s the root cause of the problem.

Thanks for reading!

This post is part of a series of articles on how zoom works in version 4 of d3. Hope you enjoyed it!
