### Disclaimer

These wafflings are submitted for public review of my own accord. I don’t have any official endorsement from any academic group. My only qualification is that I am a programmer working on the back end at the BCCVL.

### Goal

The goal of this page is to explore and understand ROC curves in the context of species distribution models (SDMs).

### Set up

So I already set up an experiment where I trained a model in lower NSW and tested it in upper NSW. Just to be clear, slicing things up geographically is a flawed approach to training and testing (randomly subdividing your data into training and testing components is better), but for this page I’m going to use the exact same experiment as before to generate figures in the context of the training and testing data, respectively. This is purely for illustrative purposes.
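For comparison, here is a minimal sketch of the random subdivision mentioned above. It is not what this experiment does; the function name `random_split`, the 30% test fraction, the seed and the `site_*` records are all made up for illustration.

```python
import random


def random_split(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and split it into (train, test).

    test_fraction and seed are illustrative defaults, not recommendations.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(records)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (site id, 1 = occurrence / 0 = absence)
records = [(f"site_{i}", i % 2) for i in range(10)]
train, test = random_split(records)
print(len(train), "training records,", len(test), "testing records")
```

Every record lands in exactly one of the two sets, and where a site sits on the map plays no part in which set it ends up in.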

So I built a random forest, training in lower NSW and testing in upper NSW. Let’s look at the plots from the training data.

### The training plots

So I get these 4 plots for the training data. First the ROC/AUC:

The ROC alone is a fairly terse way to think about the goodness of your model. Each point corresponds to the pair of values you get if you apply a particular binary threshold: the x axis is the false positive rate and the y axis is the true positive rate (note the R package has a different way to label them). In the plot you don’t know what threshold was used for each point, and you can’t see the actual points (which can be quite stretched out from one another). Anyway, let’s step back and actually look at how the false and true positive rates relate to threshold:

As you can see, if we used the smallest threshold, which is 10 in this plot, we’d get a great true positive rate (perfect, in fact), and something like a 40% false positive rate. As we increase the threshold, our true and false positive rates go down. There is probably some sweet spot (i.e. an ‘optimal’ threshold) that the operator (i.e. you and I) would have to decide on, and it would reflect a trade-off between getting things right and getting things wrong. Note too that as we go from left to right in this plot, we actually traverse from right to left in the ROC plot above!
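To make the threshold business concrete, here is a small Python sketch that computes the false and true positive rates at each threshold, then picks a ‘sweet spot’ using Youden’s J (the threshold that maximises TPR minus FPR). That is just one common rule I’ve swapped in for illustration; it is not necessarily how the plots on this page were made, and the scores and labels are invented toy data.

```python
def rates(scores, labels, threshold):
    """Return (fpr, tpr) when sites scoring >= threshold are called 'present'.

    labels: 1 for occurrence, 0 for absence.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fp / n_neg, tp / n_pos


def youden_threshold(scores, labels, thresholds):
    """Return (threshold, J) maximising J = TPR - FPR; first one wins on ties."""
    best_t, best_j = None, float("-inf")
    for t in thresholds:
        fpr, tpr = rates(scores, labels, t)
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return best_t, best_j


# Toy model scores and observed labels (assumed values, not real data)
scores = [50, 120, 200, 300, 450, 600, 700, 800, 900, 950]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
thresholds = sorted(set(scores))

# Each row here is one point on the ROC curve: (FPR, TPR) at that threshold
for t in thresholds:
    fpr, tpr = rates(scores, labels, t)
    print(f"threshold={t:4d}  FPR={fpr:.2f}  TPR={tpr:.2f}")

best_t, best_j = youden_threshold(scores, labels, thresholds)
print("Youden's J picks threshold", best_t)
```

Reading the printed rows top to bottom, both rates only ever go down as the threshold goes up, which is the same as walking the ROC curve from its upper right corner to its lower left. Youden’s J treats a missed occurrence and a false alarm as equally bad; if one matters more to you than the other, you would want a different rule.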

So, just to get a feel for what the data look like, we can plot histograms that depict the relationship of absence/occurrence to classification threshold:

So it’s interesting and perhaps as we expect. The absences are to the left, the occurrences are to the right. Great, this is why our ROC plot looks so good. It’s probably more illustrative to look at probability density functions (i.e. to normalise our perspective) instead, so I did that:

And again, it just confirms that we’d probably expect a binary decision threshold to work quite well on this data.
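If you want to see the difference between the raw histograms and the normalised view, here is a rough sketch in Python. The bin range (0 to 1000), the number of bins and the scores themselves are all assumptions for illustration. The density is simply each bin count divided by (number of samples × bin width), so each class integrates to 1 no matter how many records it has.

```python
def histogram(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins over [lo, hi].

    Assumes every value lies within [lo, hi]; a value equal to hi
    goes into the last bin.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts, width


def density(counts, width):
    """Normalise bin counts so that sum(density) * width == 1."""
    n = sum(counts)
    return [c / (n * width) for c in counts]


# Toy scores: many absences, few occurrences (assumed values)
absences = [20, 40, 60, 90, 110, 150, 180, 220, 260, 310, 400, 520]
occurrences = [480, 650, 720, 810]

abs_counts, width = histogram(absences, 0, 1000, 5)
occ_counts, _ = histogram(occurrences, 0, 1000, 5)
abs_density = density(abs_counts, width)
occ_density = density(occ_counts, width)

print("absence counts:    ", abs_counts)
print("occurrence counts: ", occ_counts)
print("absence density:   ", abs_density)
print("occurrence density:", occ_density)
```

In the raw counts the occurrences are swamped by the absences simply because there are fewer of them; after normalising, the two shapes can be compared directly.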

Let’s go to the (far more interesting) testing side of things.

### The testing plots

So I did the exact same plots as above by projecting the trained model into the independent region of upper NSW. Here they are:

So the ROC looks a bit ordinary. At first glance I see an AUC of 0.689 and a significant deviation away from the line y=x in the upper right corner, so I’m encouraged. But my perspective is enhanced if I look at the false and true positive rates again:

So with the true and false positive rates on hand, I can see that I’ll be forced to use a low threshold if I expect to get any efficacy out of this model. In doing so I would achieve a “high” true positive rate, but alas my false positive rate would be in the vicinity of 40%. As I start using higher and higher thresholds, the true and false positive rates become increasingly similar (and very low). As before, the threshold in this plot goes from left to right, but in the ROC plot above it goes right to left. I can see now that as my binary threshold gets higher and higher I am more or less doing a coin toss! I am less encouraged by my model now that I know this. So again we can take a look at the histograms:

It’s a bit hard to tell what’s going on here; this time the probability density functions are essential to understanding it:

We can see now that we get a significant hump in the occurrence distribution at around 100.
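One way to put a number like the AUC of 0.689 in perspective: the AUC equals the probability that a randomly chosen occurrence gets a higher score than a randomly chosen absence (ties count as half). Here is a minimal Python sketch of that pairwise definition. It is not necessarily how the AUC above was computed, and the scores and labels are toy values.

```python
def auc(scores, labels):
    """AUC as the fraction of (occurrence, absence) pairs ranked correctly.

    labels: 1 for occurrence, 0 for absence. Ties count as half a win.
    Compares every pair, so it is only sensible for small data sets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores where occurrences mostly outrank absences (assumed values)
scores = [50, 120, 200, 300, 450, 600, 700, 800, 900, 950]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
print("AUC (toy data):", round(auc(scores, labels), 3))

# A model that gives every site the same score cannot rank anything
print("AUC (no information):", auc([100] * 10, labels))
```

An AUC of 0.5 is the coin toss, and 1.0 is a perfect ranking. A single AUC figure also hides where along the threshold range the model does its useful work, which is why the rate and density plots are worth looking at as well.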

That’s all

-the wooly mammoth