Representation learning for mammography mass lesion classification with convolutional neural networks
John Arevalo, Fabio A. Gonzalez, Raúl Ramos-Pollan², Jose L. Oliveira³, and Miguel Angel Guevara Lopez
December 17, 2015
(fagonzalezo jearevaloo)-@ .co
²Universidad Industrial de Santander, Bucaramanga .co
³DETI-IEETA, Universidade de Aveiro, Portugal, jlo@ua.pt
⁴CCG Computer Graphics Center, Portugal
Keywords: Breast cancer; feature learning; convolutional neural networks; computer-aided diagnosis; mammography
Abstract
Background and Objective: The automatic classification of breast imaging lesions is currently an unsolved problem. This paper describes an innovative representation learning framework for breast cancer diagnosis in mammography that integrates deep learning techniques to automatically learn discriminative features, avoiding the design of specific hand-crafted image-based feature detectors. Methods: A new biopsy-proven benchmarking dataset was built from 344 breast cancer patients' cases containing a total of 736 film mammography (mediolateral oblique and craniocaudal) views, representative of manually segmented lesions associated with masses: 426 benign lesions and 310 malignant lesions. The developed method comprises two main stages: (i) preprocessing to enhance image details and (ii) supervised training for learning both the features and the breast imaging lesions classifier. In contrast to previous works, we adopt a hybrid approach where convolutional neural networks are used to learn the representation in a supervised way instead of designing particular descriptors to explain the content of mammography images. Results: Experimental results using the developed benchmarking breast cancer dataset demonstrated that our method exhibits significantly improved performance when compared to state-of-the-art image descriptors, such as histogram of oriented gradients (HOG) and histogram of the gradient divergence (HGD), increasing the performance from 0.787 to 0.822 in terms of the area under the ROC curve (AUC). Interestingly, this model also outperforms a set of hand-crafted features that take advantage of additional information from segmentation by the radiologist. Finally, the combination of both representations, learned and hand-crafted, resulted in the best descriptor for mass lesion classification, obtaining 0.826 in the AUC score. Conclusions: A novel deep learning based framework to automatically address classification of breast mass lesions in mammography was developed.
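The results above are reported as the area under the ROC curve (AUC). As a reading aid, the following minimal sketch shows one common way to compute AUC from classifier scores, using the rank-sum (Mann-Whitney U) identity. It is not the authors' evaluation code; the function name `roc_auc` and the label convention (1 for malignant, 0 for benign) are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum identity (illustrative, not the paper's code).

    labels: sequence of 0 (assumed benign) / 1 (assumed malignant)
    scores: classifier outputs, higher meaning more likely malignant
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under this definition, an AUC of 0.5 corresponds to a classifier that ranks malignant and benign cases no better than chance, and 1.0 to one that ranks every malignant case above every benign one.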
1 Introduction
Breast cancer is the most common cancer in women worldwide, with nearly 1.7 million new cases diagnosed in 2012 (second most common cancer overall); this represents about 12% of all new cancer cases and 25% of all cancers in women¹. Breast cancer has a known asymptomatic phase that can be detected with mammography, and therefore mammography is the primary imaging modality for screening. Double-reading (two radiologists independently read the same mammograms) has been advocated to reduce the proportion of missed cancers, and it is currently included in most screening programs [1]. However, double-reading incurs additional workload and costs. Alternatively, computer-aided diagnosis (CADx) systems can assist a single radiologist when reading mammograms, providing support for their decisions. These systems can be used as second-opinion criteria by radiologists, playing a key role in the early detection of breast cancer and helping to reduce the death rate among women with breast cancer in a cost-effective manner [2].
¹World Cancer Research Fund International. Accessed May 20, 2015.
A successful approach to building CADx systems is to use machine learning classifiers (MLC). MLC are learned from a set of labeled data samples, capturing complex relationships in the data [3, 4, 5]. In order to train an MLC for breast cancer diagnosis, a set of features describing the image is required. Ideally, these features should allow the classifier to determine whether a given image is from a malignant finding or not. This is, however, a challenging topic that has gathered the focus of research in several sciences, from medicine to computer vision. Thus, several types of features may be used to infer the diagnosis. Many CADx systems use hand-crafted features based on prior knowledge and expert guidance. In particular, strategies based on feature selection [6] and hand-crafted features that characterize geometry and textures [7] have been proposed for mass classification. As an alternative, the use of machine learning strategies to learn good features directly from the data is a new paradigm that has shown successful results in different computer vision tasks. One such paradigm is deep learning.
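To make the idea of a hand-crafted texture feature concrete, the sketch below computes a magnitude-weighted histogram of gradient orientations over a small grayscale patch. This is a deliberately simplified, single-cell relative of gradient-histogram descriptors, without the cell and block normalization of a full descriptor, and it is not the feature extraction used in this paper. The function name `orientation_histogram` and the default of 9 bins are assumptions made for illustration.

```python
import math


def orientation_histogram(image, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    image: list of rows of grayscale intensities (at least 3x3).
    Returns n_bins values that sum to 1, or all zeros for a flat image.
    Illustrative only: a single-cell simplification, not a full descriptor.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the horizontal/vertical gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), n_bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


if __name__ == "__main__":
    # Toy patch with a vertical edge, not data from the paper.
    patch = [[0, 0, 1, 1] for _ in range(4)]
    print(orientation_histogram(patch))
```

Every choice here (bin count, unsigned orientations, central differences) is fixed by the designer in advance, which is exactly the kind of manual design that learned representations aim to avoid.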
Deep learning methods have been widely applied in recent years to address several computer perception tasks [8]. Their main advantage lies in avoiding the design of specific feature detectors. In turn, deep learning models look for a set of transformations directly from the data. This approach has had remarkable results, particularly in computer vision problems such as natural scene