What is the source for ArcMap's Jenks Optimization classification?
December 15 2008
Categories: ArcGIS Methods, Symbology
Does ArcMap use the Jenks-Caspall or the Fisher-Jenks algorithm for classifying data into natural breaks? I did some research on support.esri.com and found that ArcView 3.x appears to have used Fisher-Jenks, but the ArcGIS Desktop documentation only talks generically about Jenks Optimization without alluding to which algorithm it uses.
If someone could enlighten me, I would appreciate it.
Mapping Center Answer:
Neither of the algorithms you referenced is robust in the sense of handling a wide variety of data distributions in a failsafe fashion. The ArcView 3.x algorithm was based on Fisher-Jenks, and for normally distributed data with a rich variety of values, our algorithm produced results that barely varied (differences appeared only in the fifth decimal place and beyond). In fact, everyone who has implemented the algorithm to operate on more than an ideal dataset produces similarly small variations.
Less-than-ideal data in this case means, for example, a dataset where 70% of the values equal 4.0 and the user asks for 6 classes. That is nothing Jenks, Fisher, or Caspall ever intended their work to be used on, but it is a common occurrence today. ArcMap’s implementation is a bit more robust and efficient, and produces no more variance in output class break values than the ArcView implementation did for ideal (expected) data distributions.
So, to answer your question, it’s ESRI’s ArcGIS implementation of the Fisher-Jenks algorithm. Our only mistake in naming was calling it Natural Breaks, a term that means one of two things. The first and most common meaning is a manual process of placing class breaks on a histogram so that breaks fall in the troughs and classes represent natural clumps in the data’s distribution. The second, looser meaning is a class of algorithms that are variance-based class break generators. These can produce considerably wider variations in class breaks than any of the specific implementations of the Fisher-Jenks algorithm.
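To make "variance-based class break generator" concrete, here is a minimal sketch in Python. It is not ESRI's code and makes no claim to match ArcMap's or ArcView's output. It only illustrates the general idea behind the Fisher-Jenks approach: sort the values, then use dynamic programming to find the split into contiguous classes that minimizes the total within-class sum of squared deviations. The function name `variance_breaks` and the sample datasets are hypothetical, chosen for illustration.

```python
from typing import List, Sequence


def variance_breaks(values: Sequence[float], n_classes: int) -> List[float]:
    """Return the upper bound of each class for the partition of the sorted
    values that minimizes the total within-class sum of squared deviations.

    Illustrative sketch only (hypothetical helper, not ESRI's implementation).
    """
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums let us compute the squared deviation of any slice in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(data):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def ssd(i: int, j: int) -> float:
        """Sum of squared deviations from the mean for data[i:j]."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[k][j]: best total SSD when data[:j] is split into k classes.
    # split[k][j]: start index of the last class in that best split.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    split[k][j] = i

    # Walk back through the split table to recover each class's upper bound.
    breaks: List[float] = []
    j = n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = split[k][j]
    return breaks[::-1]


if __name__ == "__main__":
    # Well-separated clumps: the breaks fall in the gaps between them.
    print(variance_breaks([1, 2, 3, 10, 11, 12, 50, 51, 52], 3))  # [3, 12, 52]

    # Less-than-ideal data: 70% of the values equal 4.0, 6 classes requested.
    print(variance_breaks([4.0] * 7 + [1.0, 2.0, 9.0], 6))
```

On the second dataset there are only four distinct values, so a naive optimizer like this one is forced to split the identical 4.0 values across several classes and returns repeated break values. That is the kind of input a production implementation has to handle deliberately. This sketch also runs in roughly O(k·n²) time, so it suits small examples rather than large layers.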
In fact, I had written a variance-based classification as an Avenue sample for ArcView 2.1. In demonstrating it, I would always show that a simple arbitrary class breaks method often produced a misleading picture for general audiences. So, from the very beginning, we knew that the Natural Breaks family of classifications was much less likely to produce a misleading depiction.
http://resources.arcgis.com/content/kbase?fa=articleShow&d=26442