Development of a Building Detection System from an Aerial Image Based in Watershed Transformation and Linear Support Vector Machine

Michael Anthony Jay B. Regis, Concepcion L. Khan, Jaime M. Samaniego, Serlie B. Jamias, Vladimir Y. Mariano


Object detection in aerial images has long been a fundamental problem in remote sensing, and its importance grows as populations increase. With advances in sensor technology and falling prices of imaging hardware, aerial images are now cheaper to acquire than they were a decade ago. As the quality and quantity of the gathered images increase, an automated object detection system is needed to replace tedious manual building detection. In this study, a two-stage approach was used to automate building detection. First, we segmented each image into meaningful regions using a marker-controlled watershed transform. Discrete Fourier transform (DFT) coefficients were then derived from the grayscale histogram of each region to serve as the feature vector for the next stage. Second, we trained a linear support vector machine (SVM) on these feature vectors to classify the regions of the test images as building or non-building. We evaluated the performance of the proposed method using detection percentage, branching factor, and the receiver operating characteristic (ROC) curve. We trained the linear SVM classifier with 872 building and 616 non-building images from 31 training images of the Calumpang aerial survey. Experimental results from 31 test images (of the same aerial survey) show a detection percentage of 69.50% and a branching factor of 22.70%. Moreover, the area under the ROC curve (AUC) is 0.887, which strongly suggests that the proposed method is highly effective.
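As a rough illustration of the feature-extraction step described above, the following sketch computes DFT coefficients from the grayscale histogram of one segmented region. It is not the authors' implementation: the function name `histogram_dft_features`, the use of 256 histogram bins, the normalization of the histogram, the choice of coefficient magnitudes, and the number of retained coefficients (`n_coeffs=16`) are all assumptions made for the example.

```python
import numpy as np

def histogram_dft_features(region_pixels, n_coeffs=16):
    """Return DFT magnitudes of a region's grayscale histogram.

    region_pixels: 8-bit gray values of one segmented region (any shape).
    n_coeffs: number of low-frequency coefficients to keep (assumed value).
    """
    pixels = np.asarray(region_pixels).ravel()

    # 256-bin histogram over the 8-bit gray range (assumed bin count).
    hist, _ = np.histogram(pixels, bins=256, range=(0, 256))
    hist = hist.astype(np.float64)

    # Normalize so regions of different sizes are comparable (assumption).
    total = hist.sum()
    if total > 0:
        hist /= total

    # Real-input DFT of the histogram; keep magnitudes of the first terms.
    spectrum = np.fft.rfft(hist)
    return np.abs(spectrum[:n_coeffs])

# Hypothetical region: 500 random gray values standing in for real pixels.
rng = np.random.default_rng(0)
region = rng.integers(0, 256, size=500)
features = histogram_dft_features(region)
```

Feature vectors of this kind, one per region, could then be passed with building and non-building labels to any linear SVM trainer.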


Object detection; watershed transform; discrete Fourier transform (DFT); linear support vector machine (SVM); receiver operating characteristic (ROC)






This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.