Recent news from the world of data and decisions.

Thursday
Jul 08 2010

Surveying for risk not ecology

There is an inherent difference between surveys conducted to increase understanding of the behaviour or ecology of a specific species and surveys conducted to assess risk or compliance.  The difference is subtle but crucial in sustainable development.

Image: © Janet Hastings | Dreamstime.com

Getting it wrong (which happens if the objective of the survey/census is not well articulated) can produce results that greatly misrepresent risk.  The two types can be thought of as subject focussed and hazard focussed surveys.

Let me give you a couple of examples.

Survey type 1 – Subject focussed

Objective:  to test a hypothesis related to kangaroo diet. An appropriate survey design here would be to survey (or census) all the kangaroos in the region of interest.

Survey type 2 – Hazard focussed

Objective: to determine the number of kangaroo collisions during summer on some stretch of road.  Here, a census of kangaroos would use huge resources, and may produce biased results (not all the kangaroos surveyed go near the road in question, let alone get into an altercation with a car).

Image: © Dirkr | Dreamstime.com

Here we require a hazard focussed survey frame, i.e. we would survey (or census) the stretch of road(s).
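To make the contrast concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the region size, the road position, the 200 m exposure buffer and the uniform spread of animals are assumptions chosen purely for illustration, not field data.

```python
import random


def simulate_region(n_animals, size_km=10.0, seed=0):
    """Scatter hypothetical animals uniformly over a square region."""
    rng = random.Random(seed)
    return [(rng.uniform(0, size_km), rng.uniform(0, size_km))
            for _ in range(n_animals)]


def subject_frame_count(animals):
    """Subject focussed: census every animal in the region."""
    return len(animals)


def hazard_frame_count(animals, road_y_km=5.0, buffer_km=0.2):
    """Hazard focussed: count only animals within a buffer of the road.

    The road is assumed to run east-west across the region at y = road_y_km.
    """
    return sum(1 for _, y in animals if abs(y - road_y_km) <= buffer_km)


animals = simulate_region(1000)
print("animals in region:", subject_frame_count(animals))
print("animals near road:", hazard_frame_count(animals))
```

Under these assumptions only about 4% of the region lies inside the buffer, so the census total says very little about how many animals are actually exposed to the road.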

These examples may be trivial, but a confusion between subject and hazard focussed surveys is a real and overlooked issue in many environmental surveys, particularly those involving rare or cryptic species.

The traditional approach in ecology is to undertake behavioural/biological surveys – to determine features of the species, or their interaction with their environment.  These surveys are usually designed to maximise the observer’s likelihood of actually observing the species.  The more individuals observed, the more data you can collect on morphology, behaviour and so on.

However, in sustainable planning we are interested in determining

  • The likelihood of an impact (how many critters will be hit by cars, lose their breeding ground, fly into a turbine, etc?)
  • The consequence of an impact (this is done through population viability analysis if the population is geographically isolated, or through various harvest rates for species that range further). 

Determining the likelihood of the impact is akin to determining the impact of a new road on the local kangaroo population.  We must focus our surveys on the hazard, not the subject.  This involves stepping away from traditional subject based surveys.

If this is not done, then you may have a wonderful data set filled with observations of species behaviour down in that valley over there, but no insight at all into the risk posed by the road up on the ridge here.  Without that insight, management is impossible and compliance is reliant on subjective assessment.



Thursday
Jul 08 2010

Give a man a fish....

Unfortunately I missed David Snowden’s recent ‘Making sense of complexity’ workshop, but I was impressed with this response by the guys at the River Restoration Centre (http://bit.ly/98LbhP), which speaks about the difference between a cook, who follows a recipe, and a chef, who understands their ingredients and mixes them with style and flair to create a unique result.

I particularly agree that there is a tendency towards cooks in natural resource management in Australia - driven by a need to be transparent and tick the boxes.  However, this is a fallacy, as true transparency and repeatability only come if the chef (or practitioner) can adequately communicate and defend why they used a certain ingredient in the mix and why it was added in just that way.

I am often frustrated by being asked to provide a cookbook, or a toolkit for analysis and decision making in NRM.  It is so much more important to understand your ‘ingredients’ - complexity, logic, multiple-criteria techniques - than to have a black box recipe handed to you.

And just to labour the food idea, I suppose, for me it comes down to ‘give a man a fish and he’ll eat for a day, teach him to fish and he’ll eat forever’.

As consultants, it is imperative to not just give a recipe for a single decision, but to enable our clients to understand and value the decision making process, so that better decisions are made in the future in lots of different circumstances.

Tuesday
May 18 2010

Cognitive Biases and decision making

I’m sure anyone who works in the field of complex decision making is aware of the impact of personal bias.

I still remember a professor who spent six months and an excessive amount of grant money building more elaborate experimental rigs.  He was testing a theory about the settling patterns of small particles in a fluid flow, and did not get the answer he had expected.  For those keeping score, the laws of physics remained unimpressed with the new, expensive rigs and kept doing what they’d always done.

But decision making biases can be more insidious than that, especially in complex systems with high levels of uncertainty, like ecology or economic modelling/budget forecasting.

It’s true that good physical and stochastic models can go a long way to lower uncertainty, and multicriteria decision analysis can help in sorting and ranking a wide range of factors, but simple cognitive bias can trump the most well planned decision framework.

The first step in eliminating these biases is to recognise that they exist.  The link below is a beautifully rendered study guide to a range of cognitive biases.  A few of my “favourites”:

  • The Texas sharpshooter fallacy: The fallacy of selecting or adjusting a hypothesis after the data has been collected, which makes it impossible to fairly test the hypothesis.  The name refers to the analogy of shooting a bunch of bullets into a wall, drawing a circle around a closely clustered set and declaring that was your target.
  • The zero-risk bias: The tendency to prefer reducing a small risk to zero over achieving a greater reduction in a larger risk.
  • Planning fallacy: The tendency to underestimate task completion times (honestly, who hasn’t committed this one?)

Hope you find it useful.

 

Cognitive Biases - A Visual Study Guide by the Royal Society of Account Planning

Wednesday
Apr 07 2010

Of Drunkards and Lamp-posts

It has been said of statistics that they “are often used as a drunkard uses a lamp post, more for support than illumination.”

It frequently falls to us to make an argument, based on a survey. To do this, we must first live with the data for a bit, getting a feel for its quirks and shortcomings along with its strengths. Unfortunately for all applied statisticians, rarely do we get the pleasure of translating data collected from our own design, where we have attempted to control all the confounding factors.

There is a growing school, driven by computer scientists and engineers, of non-parametric studies of datasets that are generated without control, often called “data dredging.” It is the sometimes harrowing and risky process of looking for patterns post hoc, and then attaching a p value to their strength.

There are lots of discussions floating about out there regarding the relevance of a p value on a dredged pattern, usually along the lines of “well, given ‘something’ has to happen…” And they are valid. But I came across another concern that has left me perplexed.
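The “something has to happen” objection can be put in numbers. As a rough sketch, assuming a number of independent tests run on pure noise, each at significance level alpha, the chance that at least one of them looks significant is 1 - (1 - alpha)^k. The function name is mine, and independence is a simplifying assumption that dredged patterns will rarely satisfy exactly.

```python
def chance_of_spurious_hit(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# How quickly the chance of a spurious "finding" grows with the number of looks.
for k in (1, 5, 20, 100):
    print(k, round(chance_of_spurious_hit(0.05, k), 3))
```

At the usual 5% level, twenty looks at the data already give roughly a 64% chance of at least one spurious pattern.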

We all know about the difficulty in interpreting cross tables, particularly 2X2, the staple of demographics.

But dredging might make this even harder. Given a survey design, the interpretation of a contingency table, or cross table, is relatively easy. You know which factor was controlled, and how the subjects were chosen. With a dredged set, how was the table populated?

Let us presume that we have a 3X2 table (three rows, two columns). If we selected the data using a query that fixed the row totals, we could analyse it as three binomials. Easy. If we fixed the column totals, we have two multinomials. A different test but easy enough. If we extracted a fixed number of records (the grand total) and populated the table, then the cells form a single multinomial, and if not even the grand total was fixed, each response cell becomes a Poisson variable. That’s more complicated, but doable. What about if there is another factor we didn’t think of? Something that might inflate the variance… What I now have is a list of at least three different ways to analyse the same table, and in the case of a dredged set, no meaningful way to choose between them.
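Whichever sampling story applies, one common starting point is the Pearson chi-square statistic, built from the counts you would expect if rows and columns were unrelated. The sketch below uses made-up counts for a table with three rows and two columns; the function name and the data are illustrative only, and deciding how to interpret the statistic still depends on how the table was populated.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column factors were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


counts = [[10, 20], [20, 20], [30, 10]]  # hypothetical 3 x 2 table
stat, dof = pearson_chi_square(counts)
print(f"chi-square = {stat:.2f} on {dof} degrees of freedom")
```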

If anyone has any ideas, I’d love to hear them. Better still, if anyone feels like writing some theory for us foot soldiers that will guide us when there is no data model, please do.

And bemoaning dredging as a practice isn’t helpful…it is here to stay. We just need a rudder to help steer it.

Wednesday
Mar 10 2010

It might just be worse than that

I rarely get to blog here, so I am taking the opportunity to point out something beyond the general decline in mathematics students - the loss of knowledge.

Take a look at these articles from The Australian:

Mathematics Students in Serious Decline

Equation for maths warns of disaster

I was teaching at university when we as a society ripped these students off, replacing core problem solving with vapid histories and philosophies. Such a change had a lot to do with the academia of mathematics becoming disconnected from the application of its theory. And this is where the problem lies.

I’ll give you an example. I sought to take a Graduate Certificate in Applied Statistics recently. Not because I wanted another piece of paper, but because I am looking for some more knowledge to protect myself and my clients from stupid misapplications of theory. I couldn’t find one that was more than a simple training course in certain statistical software packages. Needless to say, I haven’t enrolled. I need, as do all analysts, more than the mindless application of a software package.

The need arose when the tried and proven ANOVA test failed on me. ANOVA is the draught horse of multiple testing applications. I had a test returning p values that experience told me were way too small. The underlying data was violating a number of assumptions of the ANOVA test, and I couldn’t find a transform that would fix it. I still haven’t found a transform, and have had to move on without that test, making my story that much longer as I now have to justify the use of “unorthodox” testing procedures.
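One assumption that is cheap to check before trusting an ANOVA p value is that the groups have similar variances. Here is a minimal sketch, using only the Python standard library and made-up measurements; the function name is hypothetical, and how large a ratio is too large is a judgement call rather than a fixed rule.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


groups = [
    [4.1, 3.9, 4.3, 4.0, 4.2],   # hypothetical group A
    [5.0, 5.4, 4.7, 5.2, 4.9],   # hypothetical group B
    [2.0, 9.5, 4.4, 12.1, 0.7],  # hypothetical group C, far more spread out
]
print(f"largest / smallest variance: {variance_ratio(groups):.1f}")
```

A ratio far above 1 is a warning that the equal-variance assumption may not hold, and that the p values deserve a second look.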

This is not the first time that I’ve come across underlying shortcomings in standard procedures. My argument comes from the fact that I often have to go right back to 1930s papers by deities like Fisher, or early works by Tukey, to find underlying mechanics and discussions. Far too many papers simply state the software package they used, and the outcomes, never addressing whether the package should have been applied in the first place.

I wonder how often it is that an assumption has been made regarding validity, and never checked.