Optimizing land use classification using decision tree approaches
Vaibhav Walia+, under the guidance of Dr. Sameer Saran
Indian Institute of Remote Sensing (NRSA), Dehradun (UA)-248 001
+ Manipal Institute of Technology, Manipal (KA)
Abstract: Supervised classification is one of the important tasks in remote sensing image interpretation, in which image pixels are assigned to predefined land use/land cover classes based on their spectral reflectance values in different bands. In practice, some classes have very close spectral reflectance values that overlap in feature space. This produces spectral confusion among the classes and results in inaccurately classified images. Removing such spectral confusion requires additional spectral and spatial knowledge. This report presents a decision tree classifier approach that extracts knowledge from spatial data in the form of classification rules, using the Gini Index and Shannon entropy (Shannon and Weaver, 1949) to evaluate splits. The report also describes how to calculate the optimal dataset size required for rule generation, in order to avoid redundant input/output and processing.
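The two split-evaluation measures named above can be sketched as follows. This is a minimal illustration of the standard formulas (Gini impurity = 1 - Σ p_i², Shannon entropy = -Σ p_i log₂ p_i) over a node's class labels, not the report's actual implementation; the function names are illustrative.

```cpp
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Gini impurity of a set of class labels: 1 - sum(p_i^2).
// A pure node scores 0; a candidate split is scored by the
// weighted impurity of the child nodes it produces (lower is better).
double giniImpurity(const std::vector<std::string>& labels) {
    std::map<std::string, int> counts;
    for (const auto& l : labels) ++counts[l];
    double impurity = 1.0;
    const double n = static_cast<double>(labels.size());
    for (const auto& [cls, c] : counts) {
        const double p = c / n;
        impurity -= p * p;
    }
    return impurity;
}

// Shannon entropy: -sum(p_i * log2(p_i)).  Information gain is the
// parent's entropy minus the weighted entropy of the children.
double shannonEntropy(const std::vector<std::string>& labels) {
    std::map<std::string, int> counts;
    for (const auto& l : labels) ++counts[l];
    double entropy = 0.0;
    const double n = static_cast<double>(labels.size());
    for (const auto& [cls, c] : counts) {
        const double p = c / n;
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

For an evenly mixed two-class node the Gini impurity is 0.5 and the entropy is 1 bit; both drop to 0 for a pure node.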
*Challenges:
· Improving land use classification methods to achieve better classification accuracy
· Optimising the size of the training dataset needed to generate classification rules
· Developing an application to generate classification rules, given a particular dataset and information about its attributes
*Solutions:
- Better classification was achieved by:
I. using a decision tree algorithm instead of classical approaches such as the maximum likelihood classifier (MLC)
II. using the Gini Index as the attribute selection criterion when information gain fails
- The optimum dataset size was found by extracting and comparing decision rules over increasing subsets of the same dataset. Example: 'X' tuples are read and converted to rules in the first pass; 'X + jump' tuples are read and converted to rules in the second pass; the resulting rule sets are compared, and the procedure is repeated for at least another 'width' tuples, where 'jump' and 'width' are user-defined variables. If the resulting rules remain the same throughout the 'width' window, then 'number of tuples read minus width' is the optimum dataset size required for rule generation.
- The decision tree algorithm was implemented in C++, using Nokia/Trolltech's Qt framework for the GUI and QCustomPlot, an open-source library, for plotting graphs.
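The dataset-size search described in the solutions above can be sketched as the loop below. This is an assumed reconstruction of the procedure, not the report's code: `rulesFor(n)` is a hypothetical stand-in for "read n tuples and induce decision rules", and rule sets are compared as sets of strings for simplicity.

```cpp
#include <functional>
#include <set>
#include <string>

// Grow the training set by `jump` tuples per pass and stop once the
// extracted rule set has stayed unchanged across `width` additional
// tuples; the size at which the rules last changed ("tuples read
// minus width") is reported as the optimum.
int optimumDatasetSize(int start, int jump, int width, int maxTuples,
                       const std::function<std::set<std::string>(int)>& rulesFor) {
    std::set<std::string> previous = rulesFor(start);
    int stableSince = start;            // first size at which rules stopped changing
    for (int n = start + jump; n <= maxTuples; n += jump) {
        std::set<std::string> current = rulesFor(n);
        if (current != previous) {
            stableSince = n;            // rules changed; restart the stability window
            previous = std::move(current);
        } else if (n - stableSince >= width) {
            return stableSince;         // rules unchanged for at least `width` tuples
        }
    }
    return -1;                          // no stable size found within maxTuples
}
```

The loop avoids re-reading the full dataset on every pass being the only option: once the induced rules stop changing, further tuples add input/output and processing cost but no new knowledge.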