Kurt Rossmann Laboratories

for Radiologic Image Research

 

Mammography

 

1.  Automated Detection of Clustered Microcalcifications

 

We developed a computer program that can automatically locate clustered microcalcifications on mammograms.  With our method, a digital mammogram was processed by a linear filter to improve the signal-to-noise ratio of microcalcifications on the image.  Gray-level thresholding techniques, which combined a global gray-level thresholding procedure and a locally adaptive gray-level thresholding procedure, were then employed to extract potential signal sites from the noise background.  Subsequently, feature-extraction criteria were imposed on the potential signals to distinguish true signals from noise or artifacts.  The computer then indicated locations that may contain clusters of microcalcifications on the image.  Initially, for 60 mammograms used in the study, the true positive cluster detection accuracy of our automated detection program reached 87% at an FP detection rate of 4 clusters per image.  An ROC study was performed to determine whether this performance level could result in an improvement in radiologists’ performance when the CAD results were displayed on images.  The results of the ROC study showed that CAD, as implemented by the computer code in the current state of development, did significantly improve radiologists’ accuracy in detecting clustered microcalcifications under conditions that simulated the rapid interpretation of screening mammograms.  Further improvements, including the use of a shift-invariant artificial neural network and edge-gradient analysis, reduced the number of FP detections by our program to 0.5 per image.  (214, 228, 268, 345, 355, 359, 394, 422, 424, 438, 440, 451, 466, 523, 534, 542, 571, 586, 614, 625, 693, 705, 753)
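
The thresholding stage described above can be sketched roughly as follows. This is a minimal illustration, not the laboratory's code: the function name `detect_candidates`, the percentile used for the global cut, the window size, and the multiplier `k` are all assumptions, and the preceding linear filter and the later feature-extraction criteria are omitted.

```python
from statistics import mean, pstdev

def detect_candidates(image, global_percentile=0.98, window=2, k=3.0):
    """Return (row, col) pixels that pass a global and a local gray-level threshold."""
    rows, cols = len(image), len(image[0])
    # Global threshold: keep only pixels brighter than a high percentile.
    flat = sorted(v for row in image for v in row)
    global_cut = flat[int(global_percentile * (len(flat) - 1))]
    candidates = []
    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            if value <= global_cut:
                continue
            # Local adaptive threshold: compare with neighbourhood statistics.
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - window), min(rows, r + window + 1))
                for cc in range(max(0, c - window), min(cols, c + window + 1))
                if (rr, cc) != (r, c)
            ]
            if value > mean(neighbours) + k * pstdev(neighbours):
                candidates.append((r, c))
    return candidates

# Synthetic 9x9 background with a gentle gray-level ramp and one bright spot.
image = [[100 + r + c for c in range(9)] for r in range(9)]
image[4][4] = 160
candidates = detect_candidates(image)
```

In this toy image the brightest corner of the ramp passes the global cut but fails the local test, which is the reason for combining the two thresholds.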

 

2. Computerized Classification of Clustered Microcalcifications

 

We developed an automated scheme to help radiologists classify clustered microcalcifications.  The classification scheme first extracted, from the image, features of the individual microcalcifications (thickness, volume, area, and shape) and of the cluster itself (number of calcifications in the cluster, and area and shape of the cluster).  Second, these features were used as input to an artificial neural network, whose output was the probability that the cluster was malignant.  The performance of our classification scheme was tested against the average accuracy of 5 radiologists.  While both the computerized method and the radiologists were able to correctly identify all the malignant cases, the radiologists misclassified 75% of the benign cases (called them malignant), while the computer scheme misclassified only 25% of the benign cases.  This result indicated the potential of our classification scheme for reducing the number of unnecessary biopsies.  (506, 507, 606, 667, 715, 763, 817, 868)

 

3. Effect of CAD on Radiologists' Diagnosis of Clustered Microcalcifications

 

We tested whether CAD can improve radiologists' diagnostic performance in breast cancer diagnosis.  Our computer classification scheme estimated the likelihood of malignancy for clustered microcalcifications based on eight computer-extracted features from standard-view mammograms.  One hundred and four histologically verified microcalcification cases (46 malignant, 58 benign) in a near-consecutive biopsy series were used in this study.  Observer performance was measured on ten radiologists who read the original standard and magnification-view mammograms.  The computer aid provided a percentage estimate of likelihood of malignancy.  Comparison was made between computer-aided performance and unaided (routine clinical) performance using ROC analysis and by comparing biopsy recommendations.  The results showed that the Az value increased from 0.61 (unaided) to 0.75 (CAD; P<0.0001).  On average, each observer recommended 6.4 additional biopsies for malignant cases (P=0.0006) and 6.0 fewer biopsies for benign cases (P=0.003) with the computer aid.  This corresponded to increases in sensitivity (73.5% to 87.4%), specificity (31.6% to 41.9%), and hypothetical positive biopsy yield (46% to 55%).  We conclude that computer-aided diagnosis can be used to improve radiologists' performance in breast cancer diagnosis.  (607, 667, 668, 669)
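
The biopsy figures quoted above follow from the case counts and the reported sensitivities and specificities. The short check below is only an arithmetic sketch; the helper name `biopsy_counts` is hypothetical, and because it starts from the rounded percentages the aided yield comes out near 54%, close to the reported 55%.

```python
MALIGNANT, BENIGN = 46, 58  # case counts given in the text

def biopsy_counts(sensitivity, specificity):
    """Average biopsies per reader: (on malignant cases, on benign cases)."""
    return MALIGNANT * sensitivity, BENIGN * (1.0 - specificity)

unaided_tp, unaided_fp = biopsy_counts(0.735, 0.316)
aided_tp, aided_fp = biopsy_counts(0.874, 0.419)

extra_malignant = aided_tp - unaided_tp   # additional biopsies of cancers
fewer_benign = unaided_fp - aided_fp      # biopsies of benign cases avoided
yield_unaided = unaided_tp / (unaided_tp + unaided_fp)
yield_aided = aided_tp / (aided_tp + aided_fp)

print(round(extra_malignant, 1), round(fewer_benign, 1))
print(round(yield_unaided, 2), round(yield_aided, 2))
```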

 

4. Potential of CAD to Reduce Variability in Radiologists' Interpretation

 

We evaluated whether CAD can reduce radiologists' inter-observer variability in the interpretation of mammograms.  We compared ten radiologists' decision making on mammograms from 104 patients with clustered microcalcifications with and without a computer aid.  The computer estimated the likelihood that a microcalcification cluster was due to a malignancy.  We then analyzed variability in the radiologists' recommendations of biopsy versus follow-up.  The results showed that variation in the radiologists' accuracy as measured by the standard deviation of Az value was reduced 47% by the computer aid.  In addition, access to the computer aid increased the agreement by all observers from 13% to 32% of total cases (P=0.0002) while kappa increased from 0.19 to 0.41 (P<0.05).  Finally, use of the computer aid eliminated two thirds of substantial disagreements where biopsy and routine screening were recommended for the same patient by two different radiologists (P<0.05).  We conclude that computer-aided diagnosis holds the potential to reduce the variability in radiologists' interpretation of mammograms in addition to its demonstrated potential to improve diagnostic accuracy.  (765)
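
For readers unfamiliar with kappa, the sketch below computes Cohen's kappa for two readers, that is, the observed agreement corrected for the agreement expected by chance. It is an illustration only: the text does not say which multi-reader form of kappa was used for the ten radiologists, and the recommendations listed here are made-up example data.

```python
def cohen_kappa(reader_a, reader_b):
    """Chance-corrected agreement between two readers' recommendations."""
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    labels = set(reader_a) | set(reader_b)
    # Agreement expected if the readers decided independently.
    expected = sum(
        (reader_a.count(label) / n) * (reader_b.count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1.0 - expected)

# Hypothetical recommendations for eight cases (illustrative data only).
reader_a = ["biopsy", "biopsy", "follow-up", "follow-up",
            "biopsy", "follow-up", "biopsy", "follow-up"]
reader_b = ["biopsy", "follow-up", "follow-up", "follow-up",
            "biopsy", "biopsy", "biopsy", "follow-up"]
kappa = cohen_kappa(reader_a, reader_b)
```

Here the readers agree on 6 of 8 cases, chance agreement is 0.5, and kappa is therefore 0.5.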

5. Effect of Correct Detection of Microcalcifications on Computer Classification

 

We studied the effects of computer-detected true-positive microcalcifications and computer-detected false-positive microcalcifications on performance of computer classification.  Using a database of 100 mammograms, we compared computer classification performance obtained from computer-detected microcalcifications to computer classification performance obtained from manually-identified microcalcifications.  The computer classification performance was comparable to or better than radiologists' performance as the number of computer-detected true-positive microcalcifications decreased to 40% and as the number of computer-detected false-positive microcalcifications increased to 50%.  Further loss in computer-detected true-positive microcalcifications degraded classification performance substantially.  These results showed that computer performance in classifying clustered microcalcifications as malignant or benign was insensitive to moderate decreases in computer-detected true-positive microcalcifications and moderate increases in computer-detected false-positive microcalcifications.  (763)

 

6. Automated Detection of Mammographic Masses

 

We developed methods for the computerized detection of masses in digital mammograms.  One method was based on the deviation from the normal architectural symmetry of the right and left breasts; a bilateral-subtraction technique was used to enhance the conspicuity of possible masses.  The scheme employed two pairs of conventional screen-film mammograms (the right and left MLO views and CC views), which were digitized.  After the right and left breast images in each pair were aligned, a nonlinear bilateral-subtraction technique was employed that involved linking multiple subtracted images to locate initial candidate masses.  Various features were extracted and merged using an artificial neural network in order to reduce false-positive detections resulting from the bilateral subtraction.  In an evaluation study using 154 pairs of clinical mammograms, the scheme yielded a sensitivity of 95% for detection at an average of 2.5 false-positive detections per image.  (275, 330, 400, 436, 437, 443, 477, 608, 610)

 

7. Computerized Classification of Mammographic Masses

 

Malignant masses often can be distinguished from benign masses due to their more spiculated appearance in the mammographic image.  Thus, in the classification of masses, our computerized scheme was based on the degree of spiculation exhibited by the mass in question.  We developed a method for the automated extraction of the lesion from the parenchymal background in order to facilitate the extraction of various features.  The features extracted were obtained from cumulative edge gradient histogram analysis in which the gradient was analyzed relative to the radial angle.  Other features included gray-level measures and geometric measures.  From the cumulative edge-gradient-orientation histogram, various measures were calculated including FWHM (full-width at half-max), standard deviation of the cumulative edge gradient and average gradient in the radial direction.  With a pathologically-confirmed database, the computer classification scheme (Az=0.94) performed at a level similar to that of an experienced mammographer (Az=0.90) in distinguishing malignant from benign masses.  The average performance of general radiologists yielded an Az value of 0.81.  The biopsy rate at 100% sensitivity of the computer scheme was about 30% higher than that of the experienced mammographer and was over 60% higher than that of the average of the five general radiologists.  (454, 602, 663, 713, 740, 762, 811, 812)

 

8. Robustness of Computerized Classification of Masses

 

We evaluated the robustness of our computerized method developed for the classification of benign and malignant masses with respect to variations in both case mix and film digitization.  The method was evaluated independently with a 110-case database consisting of 50 malignant and 60 benign cases.  Mammograms were digitized twice with two different digitizers (Konica and Lumisys).  Effects of variations in both case mix and film digitization on performance of the method also were assessed.  Categorization of lesions as malignant or benign with an ANN (or a hybrid) classifier achieved an Az value of 0.90 (0.94 for the hybrid) on the previous training database in a round-robin evaluation, and Az values of 0.82 (0.81) and 0.81 (0.82) on the independent database for the Konica and Lumisys formats, respectively.  These differences, however, were not statistically significant (P>0.10).  Therefore, the computerized method for the classification of lesions on mammograms was robust with respect to variations in case mix and film digitization.  (713)

 

9. Effect of Dominant Features on Neural Network Performance

 

Two different classifiers, an ANN and a hybrid system (one step rule-based method followed by an ANN) were investigated to merge computer-extracted features in the task of differentiating between malignant and benign masses.  A total of four computer-extracted features – spiculation, margin sharpness and two density-related measures – were used to characterize these masses.  We investigated their learning and decision-making processes by studying the relationships between the input features and the outputs.  A correlation study showed that the outputs from the ANN-alone method were correlated strongly with one of the input features (spiculation), yielding a correlation coefficient of 0.91, whereas the correlation coefficients (absolute value) for the other features ranged from 0.19 to 0.40.  This strong correlation between the ANN output and the spiculation measure indicated that the learning and decision-making processes of the ANN-alone method were dominated by the spiculation measure.  Three-dimensional plots of the computer output as functions of the input features demonstrated that the ANN-alone method did not learn as effectively as the hybrid system in differentiating non-spiculated malignant masses from benign masses, thus resulting in an inferior performance at the high sensitivity levels.  We found that with a limited database it was detrimental for an ANN to learn the significance of other features in the presence of a dominant feature.  The hybrid system, which initially applied a rule concerning the value of the spiculation measure prior to employing an ANN, prevented overlearning from the dominant feature and performed better than the ANN-alone method in merging the computer-extracted features into a correct diagnosis regarding the malignancy of the masses.  (663)

 

10. Potential Usefulness of Special View Mammograms in Computer-Aided Diagnosis

 

The performance of our computerized classification method was evaluated on an independent database consisting of 70 cases (33 malignant and 37 benign cases), each having CC, MLO and special view mammograms (spot compression or spot compression magnification views).  The mass lesion identified in each of the three mammographic views was analyzed using our previously developed and trained computerized classification method.  On this independent database, we compared the performance of individual computer-extracted mammographic features, as well as the computer-estimated likelihood of malignancy, for the standard and special views.  Computerized analysis of special view mammograms alone in the task of distinguishing between malignant and benign lesions yielded an Az of 0.95, which is significantly higher (P<0.005) than that obtained from the MLO and CC views (Az values of 0.78 and 0.75, respectively).  Use of only the special views correctly classified 19 of 33 benign cases (a specificity of 58%) at 100% sensitivity, whereas use of the CC and MLO views alone correctly classified 4 and 8 of 33 benign cases (specificities of 12% and 24%, respectively). In addition, we found that the average computer output of the three views (Az of 0.95) yielded a significantly better performance than did the maximum computer output from the mammographic views.  Our results show that computerized analysis of special view mammograms yielded a better performance in differentiating between benign and malignant masses than did standard view mammograms of the same breast.  (762)

 

11. Observer Study for Effectiveness of CAD in the Diagnosis of Breast Cancer

 

We evaluated the effectiveness of our computerized classification method as an aid to radiologists reviewing clinical mammograms for which the diagnoses were unknown to both the radiologists and the computer.  Six mammographers and 6 community radiologists participated in an observer study.  These 12 radiologists interpreted, without and with the computer aid, 110 cases that were unknown to both the 12 radiologist observers and the trained computer classification scheme.  When the computer aid was used, the average performance of the 12 radiologists improved, as indicated by an increase in Az from 0.93 to 0.96 (P=0.0002), and by an increase in sensitivity from 94% to 98% (P=0.022).  No statistically significant difference in specificity was found between readings with and without computer aid (difference = -0.014; P=0.46; 95% CI = (-0.054, 0.026)).  When we analyzed results from the mammographers and community radiologists as separate groups, a larger improvement was demonstrated for the community radiologists.  Computer-aided diagnosis can potentially help radiologists improve their diagnostic accuracy in the task of differentiating between benign and malignant masses seen on mammograms.  (811)

 

12. Observer Study with an Intelligent CAD Workstation for Breast Imaging

 

We incorporated our computerized mass classification method into an intelligent workstation interface that displays known malignant and benign cases similar to lesions in question using a color-coding scheme that allows instant visual feedback to the radiologist.  The probability distributions of the malignant and benign cases in the known database were also graphically displayed along with the graphical “location” of the unknown case relative to these two distributions.  We investigated the usefulness of the intelligent search workstation for computer-aided diagnosis as an aid to radiologists in the classification of lesions in mammography.  Upon presentation of an unknown mammographic case, the workstation shows the computer output in terms of (a) computer-estimated likelihoods of malignancy, (b) images of lesions with known diagnoses from an on-line lesion atlas, and (c) graphics illustrating the characteristics of the unknown lesion relative to characteristics of lesions in the known reference atlas.  These images were retrieved automatically from a similarity search of lesions in the known mammographic atlas.  In an observer study, five radiologists interpreted 100 cases before and after presentation of the computer output.  On average, the radiologists’ performance was improved in terms of Az value from 0.86 to 0.90 with a P<0.02, in the task of distinguishing between malignant and benign mammographic mass lesion cases.  (707)

 

13. Automated Seeded Lesion Segmentation on Mammograms

 

Segmenting lesions is a vital step in many computerized mass-detection schemes for digital (or digitized) mammograms.  We developed two novel lesion segmentation techniques – one based on a single feature called the radial gradient index (RGI) and one based on simple probabilistic models to segment mass lesions, or other similar nodular structures, from surrounding background.  In both methods, a series of image partitions was created using gray-level information as well as prior knowledge of the shape of typical mass lesions.  With the former method, the partition that maximizes the RGI was selected.  In the latter method, probability distributions for gray-levels inside and outside the partitions were estimated, and subsequently used to determine the probability that the image occurred for each given partition.  The partition that maximizes this probability was selected as the final lesion partition (contour).  We tested these methods against a conventional region growing algorithm using a database of biopsy-proven, malignant lesions and found that the new lesion segmentation algorithms more closely matched radiologists’ outlines of these lesions.  At an overlap threshold of 0.30, gray level region growing correctly delineated 62% of the lesions in our database while the RGI and probabilistic segmentation algorithms correctly segmented 92% of the lesions. (609)
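
The RGI-based selection can be illustrated with a small sketch. The text does not give the RGI formula, so the definition used here is an assumption: the sum, over contour points, of the gray-level gradient projected on the unit vector pointing toward the seed point, divided by the sum of the gradient magnitudes. The sign convention, the circular candidate contours, and the synthetic Gaussian "lesion" are likewise illustrative choices.

```python
import math

def radial_gradient_index(image, center, contour):
    """Assumed RGI: fraction of gradient strength directed toward the center."""
    cy, cx = center
    radial_sum = 0.0
    magnitude_sum = 0.0
    for (y, x) in contour:
        # Central-difference estimate of the gray-level gradient.
        gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
        gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
        # Unit vector from the contour point toward the center.
        dy, dx = cy - y, cx - x
        dist = math.hypot(dy, dx)
        if dist == 0:
            continue
        radial_sum += (gy * dy + gx * dx) / dist
        magnitude_sum += math.hypot(gy, gx)
    return radial_sum / magnitude_sum if magnitude_sum else 0.0

def circle_contour(center, radius, n_points=64):
    """Integer pixel positions approximating a circle (a stand-in partition)."""
    cy, cx = center
    points = {
        (round(cy + radius * math.sin(2 * math.pi * i / n_points)),
         round(cx + radius * math.cos(2 * math.pi * i / n_points)))
        for i in range(n_points)
    }
    return sorted(points)

# Synthetic bright blob centered at (20, 20) in a 41x41 image.
image = [[100.0 * math.exp(-((y - 20) ** 2 + (x - 20) ** 2) / 50.0)
          for x in range(41)] for y in range(41)]

candidates = {"on lesion": (20, 20), "off lesion": (20, 28)}
rgi = {name: radial_gradient_index(image, c, circle_contour(c, 6))
       for name, c in candidates.items()}
best = max(rgi, key=rgi.get)  # keep the partition with the largest RGI
```

A contour centered on the blob has nearly all of its gradient pointing toward the center, so its RGI is close to 1, while the shifted contour scores much lower.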

 

14. Feature Selection with Limited Datasets

 

In many computerized schemes, numerous features can be extracted to describe suspect image regions.  A subset of these features is then employed in a classifier to determine whether the suspect region is abnormal or normal.  Different subsets of features, in general, result in different classification performances.  A feature selection method is often used to determine an “optimal” subset of features to use with a particular classifier.  A classifier performance measure such as Az value must be incorporated into this feature selection process.  With limited datasets, however, there is a distribution in the classifier performance measure for a given classifier and subset of features.  We investigated the variation in the selected subset of “optimal” features as compared with the true optimal subset of features caused by this distribution of classifier performance.  We considered examples in which the probability that the optimal subset of features was selected can be analytically computed.  We showed the dependence of this probability on the dataset sample size, the total number of features from which to select, the number of features selected, and the performance of the true optimal subset.  Once a subset of features has been selected, the parameters of the data classifier must be determined.  We showed that, with limited datasets and/or large number of features from which to choose, bias was introduced if the classifier parameters were determined using the same data that were employed to select the “optimal” subset of features. (675)
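
A small simulation gives a feel for the optimism that arises when the same limited dataset is used both to pick features and to judge them. This is only a sketch of a related effect, not the analytical treatment described above: every feature is pure noise, the decision rule is fixed, and all sizes and names are assumptions chosen for illustration.

```python
import random

def accuracy(feature_values, labels):
    """Accuracy of the fixed rule 'call abnormal when the feature is positive'."""
    hits = sum((v > 0) == bool(y) for v, y in zip(feature_values, labels))
    return hits / len(labels)

def simulate(n_features=200, n_select=20, n_fresh=2000, seed=0):
    rng = random.Random(seed)
    # Every feature is noise, so the true accuracy of any feature is 0.5.
    labels = [rng.randint(0, 1) for _ in range(n_select)]
    data = [[rng.gauss(0, 1) for _ in range(n_select)]
            for _ in range(n_features)]
    # Score of the "optimal" feature on the data used to select it.
    best_score = max(accuracy(values, labels) for values in data)
    # A noise feature carries no information, so a fresh draw stands in
    # for the selected feature measured on new cases.
    fresh_labels = [rng.randint(0, 1) for _ in range(n_fresh)]
    fresh_values = [rng.gauss(0, 1) for _ in range(n_fresh)]
    return best_score, accuracy(fresh_values, fresh_labels)

biased, unbiased = simulate()
print(biased, unbiased)
```

The selected feature looks clearly better than chance on the selection data even though its performance on new cases stays near 0.5.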

 

15. Ideal Observer Approximation using Bayesian Classification Neural Networks

 

It is well understood that the optimal classification decision variable is the likelihood ratio or any monotonic transformation of the likelihood ratio.  An automated classifier which maps from an input space to one of the likelihood ratio family of decision variables is an optimal classifier or “ideal observer.”  ANNs are frequently used as classifiers for many problems.  In the limit of large training sample sizes, an ANN approximates a mapping function which is a monotonic transformation of the likelihood ratio, i.e., it estimates an ideal observer decision variable.  A principal disadvantage of conventional ANNs is the potential over-parameterization of the mapping function which results in a poor approximation of an optimal mapping function for smaller training samples.  Recently, Bayesian methods have been applied to ANNs in order to regularize training to improve the robustness of the classifier.  The goal of training a Bayesian ANN with finite sample sizes is, as with unlimited data, to approximate the ideal observer.  We evaluated the accuracy of Bayesian ANN models of ideal observer decision variables as a function of the number of hidden units used, the signal-to-noise ratio of the data and the number of features or dimensionality of the data.  We showed that when enough training data were present, excess hidden units did not substantially degrade the accuracy of Bayesian ANNs.  However, the minimum number of hidden units required to best model the optimal mapping function varied with the complexity of the data.  (767)
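
The statement that any monotonic transformation of the likelihood ratio is an equally good decision variable can be checked numerically. The sketch below assumes two univariate normal classes and an arbitrary prevalence (both illustrative choices) and shows that the posterior probability ranks observations in the same order as the likelihood ratio, so both yield the same ROC curve.

```python
from statistics import NormalDist

# Assumed class-conditional densities (illustrative parameters).
signal = NormalDist(mu=1.5, sigma=1.0)
noise = NormalDist(mu=0.0, sigma=1.0)

def likelihood_ratio(x):
    return signal.pdf(x) / noise.pdf(x)

def posterior(x, prevalence=0.1):
    """Posterior probability of 'signal': a monotonic function of the LR."""
    lr = likelihood_ratio(x)
    return lr * prevalence / (lr * prevalence + 1.0 - prevalence)

xs = [-2.0, 0.3, 1.1, -0.7, 2.5, 0.0]
by_lr = sorted(xs, key=likelihood_ratio)
by_posterior = sorted(xs, key=posterior)
print(by_lr == by_posterior)
```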

 

16. Evaluation of Computerized Detection Techniques with a Missed Lesion Database

 

Over the past 7 years, we have been collecting cases in which a lesion was missed in a mammogram.  To date, 69 cases with a lesion that went undetected by a radiologist were analyzed by the computerized detection schemes -- clustered microcalcifications and masses.  In all cases the lesions were rated retrospectively as being subtle to extremely subtle by an experienced mammographer.  The computer methods correctly identified approximately 50% of the missed lesions -- 54% of the malignant lesions and 45% of the benign lesions.  The false-positive rate was 1.3 per image.  This result showed that our computer methods were capable of identifying cancers that were overlooked by radiologists.  (528, 735, 777, 778)

 

17. Initial Clinical Testing of an "Intelligent" Mammography Workstation

 

We implemented our computerized detection schemes for masses and clustered microcalcifications on a prototype "intelligent" mammography workstation in an ongoing clinical study.  The workstation consisted of a film digitizer, a high-speed computer, a magneto-optical jukebox, and hard and soft copy displays.  The system was installed in the clinical mammography reading area at the University of Chicago on November 8, 1994.  As of October 1997, more than 12,000 cases were analyzed.  We analyzed the sensitivity and false-positive rate of the intelligent workstation for the first two years of implementation, which included 8,035 mammographic screening cases.  Thirty-five cancers were confirmed to date within this 2-year period, with one case yielding a negative mammogram but with a palpable lesion.  Twenty-three of the other 34 cancers were detected by the computer (16 of 23 cases containing masses and 7 of 13 cases containing clustered microcalcifications).  Nine of the patients with cancer had 2 screening exams during the two-year period.  In three of the nine cases, the computer indicated the region in the first exam where the cancer was subsequently diagnosed by the radiologist in the second exam.  The computer output contained, on average, 0.9 false-positive microcalcification clusters and 1.4 false-positive masses.  In order to determine the effect of false-positive detections on mammographic interpretation, we calculated the call-back rate in one-year periods before and after implementation of the workstation in the clinical area.  Before introduction of CAD, 13.2% of screeners were called back for further workup and after the introduction of CAD, 12.6% of screeners were called back for further workup.  Thus, the false-positive output from the computer did not increase the number of women called back.  (465, 467, 615, 683)

 

18. Artificial Neural Networks for Decision Making in the Diagnosis of Breast Cancer

 

The interpretation of mammograms for the diagnosis of breast cancer is a difficult task.  We investigated the potential utility of artificial neural networks as a decision-making aid to radiologists in the analysis of mammographic data. Three-layer, feed-forward neural networks with a back-propagation algorithm were trained for the interpretation of mammograms on the basis of features extracted from mammograms by experienced radiologists.  Our database consisted of features extracted from 133 textbook cases and 60 clinical cases.  Performance of the neural networks was evaluated by ROC analysis.  A network that used 43 image features performed well in distinguishing between benign and malignant lesions, yielding an Az value of 0.95 for textbook cases in a test by the round-robin method.  With clinical cases, the performance of a neural network in merging 14 radiologist-extracted features of lesions to distinguish between benign and malignant lesions was found to be higher than the average performance of attending and resident radiologists alone (without the aid of a neural network).  Therefore, the networks may provide a potentially useful tool in the mammographic decision-making task of distinguishing between benign and malignant lesions.  (358, 398)

 

19. A Method for Producing Simulated Mammograms

 

We developed a method for producing computerized simulated mammograms.  It is now possible to model image formation in many different types of x-ray detectors.  That is, given an x-ray distribution incident on a detector, it is possible to predict how the final image will appear.  Therefore, we collected high fidelity images of biopsy specimens and mastectomy samples.  We are able to use these images to produce multiple simulated mammograms with different types of pathology.  The technique was tested on phantom images.

 

20. Computerized Analysis of Mammographic Parenchymal Patterns for Breast Cancer Risk Assessment

 

With the increasing awareness of breast cancer risk and the benefit of screening mammography, more women in all risk categories are seeking information regarding their individual risk of developing breast cancer.  Identification and close surveillance of women who are at high risk of developing breast cancer may provide an opportunity for early cancer detection.  The purpose of this study was to identify computer-extracted, mammographic parenchymal patterns that are associated with breast cancer risk.  We extracted fourteen features from the central breast region on digitized mammograms to characterize the mammographic parenchymal patterns of women at different risk levels.  Two different approaches were employed to relate these mammographic features to breast cancer risk.  In one approach, the features were used to distinguish mammographic patterns seen in low-risk women from those who inherited a mutated form of the BRCA1/BRCA2 gene, which confers a very high risk of developing breast cancer.  In another approach, the features were related to risk as determined from existing clinical models (Gail and Claus models), which use well-known epidemiological factors such as a woman’s age, her family history of breast cancer, reproductive history, etc.  Stepwise linear discriminant analysis was employed to identify features that were useful in differentiating between "low-risk" women and BRCA1/BRCA2-mutation carriers.  Stepwise linear regression analysis was employed to identify useful features in predicting the risk as estimated from the Gail and Claus models.  Similar computer-extracted mammographic features were identified in the two approaches.  Results show that women at high risk tend to have dense breasts and their mammographic patterns tend to be coarse and low in contrast.  (709, 710, 761)

 

21. Computerized Analysis of Digitized Mammograms of BRCA1/BRCA2 Gene Mutation Carriers

 

In this study, we aimed to evaluate, using computer image analysis, the mammographic density patterns of women with germ-line mutations in BRCA1 or BRCA2 genes in comparison with those of women at low risk of developing breast cancer.  Mammograms from 30 carriers of BRCA1 or BRCA2 mutations and 142 low-risk women were collected retrospectively and digitized.  In addition, sixty of the 142 low-risk women were randomly selected and age-matched at 5-year intervals to the 30 mutation carriers.  Mammographic features were extracted from the central regions of the breast image to characterize the mammographic density and the heterogeneity of dense portions of the breast.  These features were then merged by linear discriminant analysis (LDA) into a single value related to the risk of breast cancer.  Quantitative analysis of mammograms demonstrated that the carriers of BRCA1 or BRCA2 mutations tend to have dense breast tissue and their mammographic patterns tend to be low in contrast with coarse texture.  The LDA achieved Az values of 0.91 and 0.92 in distinguishing between the BRCA1/BRCA2-mutation carriers and the low-risk women in the entire database and the age-matched group, respectively.  The computerized analysis of mammograms suggests that mammographic patterns of carriers of BRCA1 or BRCA2 mutations were different from those of women at low risk for breast cancer.  Our computer-extracted features may potentially be useful as radiographic markers for identifying women at high risk for breast cancer.  (603, 813)

 

 

22. Elimination of false-positive microcalcification detections in a CAD scheme using a Bayesian neural network

 

We compared the performance of a Bayesian neural network (BNN) for feature classification with a rule-based classifier and a conventional artificial neural network (ANN) in a computer-aided diagnosis (CAD) scheme for the detection of clustered microcalcifications.  Five features were extracted from the images at each signal location.  A BNN, which can approximate the behavior of the ideal observer, was trained on a database of 39 mammograms containing clustered microcalcifications.  The performance of the trained BNN on an independent database of 50 mammograms was compared to the performance of a combined rule-based and conventional-ANN method.  For both methods, detected signals were clustered to yield detected cluster FROC curves.  At a true-positive fraction of detected clusters of 0.83, the number of false-positive clusters per image was 0.8 for the combined method, and 1.16 for the BNN.  The BNN does not require subjective selection of thresholds as in the rule-based and combined methods, its performance is robust to the properties of the testing dataset, and it is able in theory to approximate the performance of the ideal observer (705, 753).

 

23. Estimation of three-class ideal observer decision functions with a Bayesian artificial neural network

 

We are using Bayesian artificial neural networks (BANNs) to eliminate false-positive detections in our computer-aided diagnosis schemes. In the present work, we investigated whether BANNs can be used to estimate likelihood ratio, or ideal observer, decision functions for distinguishing observations which are drawn from three classes. Three univariate normal distributions were chosen representing three classes. We sampled 3,000 values of x for each of 10 training datasets, and 3,000 values of x for a single testing dataset. A BANN was trained on each training dataset, and the two outputs from each trained BANN, which estimate p(class 1|x) and p(class 2|x), were recorded for each value of x in the testing dataset. The mean BANN output and its standard error were calculated using the ten sets of BANN output. We repeated the above procedure to estimate the means and standard errors of the two likelihood ratio decision functions p(x|class 1)/p(x|class 3) and p(x|class 2)/p(x|class 3). We found that the BANN can estimate the a posteriori class probabilities quite accurately, except in regions of data space where outcomes are unlikely. Estimation of the likelihood ratios is more problematic, which we attribute to error amplification caused by taking the ratio of two imprecise estimates. We hope to improve these estimates by constraining the BANN training procedure (801, 863).
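
The error amplification mentioned above can be illustrated without any neural network. The sketch below assumes three unit-variance normal classes with equal priors (the actual distributions used in the study are not given here), computes the exact posteriors with Bayes' rule, and then shows how a tiny absolute error in a small posterior produces a large relative error in the likelihood ratio recovered from it.

```python
from statistics import NormalDist

# Assumed class-conditional densities and priors (illustrative only).
classes = [NormalDist(-2.0, 1.0), NormalDist(0.0, 1.0), NormalDist(2.0, 1.0)]
priors = [1 / 3, 1 / 3, 1 / 3]

def posteriors(x):
    """Exact p(class i | x) from Bayes' rule."""
    joint = [p * d.pdf(x) for p, d in zip(priors, classes)]
    total = sum(joint)
    return [j / total for j in joint]

def likelihood_ratios(post):
    """Recover p(x|1)/p(x|3) and p(x|2)/p(x|3) from the posteriors."""
    p1, p2, p3 = post
    return ((p1 / p3) * (priors[2] / priors[0]),
            (p2 / p3) * (priors[2] / priors[1]))

x = -3.0                      # a point where class 3 is very unlikely
exact = posteriors(x)
eps = 1e-5                    # small absolute error in the estimate
noisy = [exact[0] - eps, exact[1], exact[2] + eps]

lr_exact = likelihood_ratios(exact)[0]
lr_noisy = likelihood_ratios(noisy)[0]
relative_error = abs(lr_noisy - lr_exact) / lr_exact
```

An error of 0.00001 in the posteriors, negligible for p(class 1|x), changes the recovered ratio p(x|class 1)/p(x|class 3) by more than half its value at this point.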

 

24. Maximum likelihood fitting of FROC curves under an initial-detection-and-candidate-analysis model

 

We have developed a model for FROC curve fitting that relates the observer's FROC performance not to the ROC performance that would be obtained if the observer's responses were scored on a per-image basis, but rather to a hypothesized ROC performance that the observer would obtain in the task of classifying a set of "candidate detections" as positive or negative. We adopt the assumptions of the Bunch FROC model, namely that the observer's detections are all mutually independent, as well as assumptions qualitatively similar to, but different in nature from, those made by Chakraborty in his AFROC scoring methodology. Under the assumptions of our model, we show that the observer's FROC performance is a linearly scaled version of the candidate analysis ROC curve, where the scaling factors are given by the FROC operating point coordinates for detecting initial candidates. Further, we show that the likelihood function of the model parameters given observational data takes on a simple form, and we develop a maximum likelihood method for fitting an FROC curve to these data. FROC and AFROC curves are produced for computer vision observer datasets and compared with the results of the AFROC scoring method. Although developed primarily with computer vision schemes in mind, we hope that the methodology presented here will prove worthy of further study in other applications as well (802).
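The linear scaling relationship can be illustrated with a short sketch. It assumes, purely for illustration, that the candidate-analysis ROC curve follows the conventional binormal form with parameters a and b; the function names and the numerical values are hypothetical and do not come from the fitted model.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its CDF and inverse CDF


def binormal_roc(fpf, a, b):
    """TPF of a binormal ROC curve at false-positive fraction fpf (0 < fpf < 1)."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


def froc_from_candidate_roc(fpf, a, b, init_tpf, init_fp_per_image):
    """Map one candidate-analysis ROC point to FROC coordinates.

    init_tpf:          fraction of true lesions found at initial detection
    init_fp_per_image: false-positive candidates per image at initial detection
    The ROC point is scaled by these two initial-detection coordinates.
    """
    fp_per_image = init_fp_per_image * fpf
    tpf = init_tpf * binormal_roc(fpf, a, b)
    return fp_per_image, tpf


# Hypothetical example: initial detection finds 90% of lesions at 2 FPs per image.
x, y = froc_from_candidate_roc(0.5, a=1.0, b=1.0, init_tpf=0.9, init_fp_per_image=2.0)
```

Under this sketch the FROC curve can never exceed the initial-detection operating point, since the candidate-analysis stage only removes candidates.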
 

25. The use of a priori information in the detection of mammographic microcalcifications to improve their classification

 

In this work, we present a calcification-detection scheme that automatically localizes calcifications in a previously detected cluster in order to generate the input for a cluster-classification scheme that we developed previously. The calcification-detection scheme makes use of three pieces of a priori information: the location of the center of the cluster, the size of the cluster, and the approximate number of calcifications in the cluster. This information can be obtained either automatically from a cluster-detection scheme or manually by a radiologist. It is used to analyze only the portion of the mammogram that contains a cluster and to identify the individual calcifications more accurately, after enhancing them by means of a Difference-of-Gaussians filter. Classification performances (patient-based Az = 0.92; cluster-based Az = 0.72) comparable to those obtained by using manually identified calcifications (patient-based Az = 0.92; cluster-based Az = 0.82) can be achieved (884).
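A Difference-of-Gaussians filter subtracts a heavily blurred copy of the image from a lightly blurred copy, which suppresses the slowly varying background and enhances small bright spots. The following is a minimal pure-Python sketch applied to a toy region of interest; the kernel widths, the edge handling, and the 15 x 15 test image are assumptions for illustration and are not the parameters of the scheme described above.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_1d(values, kernel):
    """Convolve a 1-D sequence with a symmetric kernel, replicating edge values."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter every row, then every column."""
    kernel = gaussian_kernel(sigma)
    rows = [convolve_1d(row, kernel) for row in image]
    cols = [convolve_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def difference_of_gaussians(image, sigma_narrow, sigma_wide):
    """Narrow blur minus wide blur, computed pixel by pixel."""
    narrow = gaussian_blur(image, sigma_narrow)
    wide = gaussian_blur(image, sigma_wide)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(narrow, wide)]


# Toy region of interest: flat background with one small bright spot at (7, 7).
roi = [[10.0] * 15 for _ in range(15)]
roi[7][7] = 60.0
dog = difference_of_gaussians(roi, sigma_narrow=1.0, sigma_wide=3.0)
peak = max((v, r, c) for r, row in enumerate(dog) for c, v in enumerate(row))
```

Restricting the filter to the region around a known cluster center, as the a priori information allows, keeps the computation small and limits responses from structures elsewhere in the mammogram.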

 

26. Investigation of Psychophysical Measure for Evaluation of Similar Images for Mammographic Masses: Preliminary Results
 

We investigated a psychophysical similarity measure for selection of images similar to those of unknown masses on mammograms, which may assist radiologists in the distinction between benign and malignant masses. Sixty pairs of masses were selected from 1445 mass images prepared for this study, which were obtained from the Digital Database for Screening Mammography by the University of South Florida. Five radiologists provided subjective similarity ratings for these 60 pairs of masses based on the overall impression for diagnosis. Radiologists’ subjective ratings were marked on a continuous rating scale and quantified between 0 and 1, which correspond to pairs not similar at all and pairs almost identical, respectively. With the subjective ratings as the “gold standard”, two similarity measures were determined: one based on the Euclidean distance between pairs in feature space, and the psychophysical measure. For determination of the psychophysical similarity measure, an artificial neural network (ANN) was employed to learn the relationship between radiologists’ average subjective similarity ratings and computer-extracted image features. To evaluate the usefulness of the similarity measures, the agreement with the radiologists’ subjective similarity ratings was assessed in terms of correlation coefficients between the average subjective ratings and the similarity measures. A commonly used similarity measure based on the Euclidean distance was moderately correlated (r=0.644) with the radiologists’ average subjective ratings, whereas the psychophysical measure by use of the ANN was highly correlated (r=0.798). The preliminary result indicates that a psychophysical similarity measure would be useful in the selection of images similar to those of unknown masses on mammograms. (967)
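The evaluation step can be sketched as follows: a distance-based similarity is computed for each pair of masses and then correlated with the average subjective rating. The feature vectors, the ratings, and the particular mapping from distance to similarity (one minus the distance divided by the largest distance) are hypothetical choices for illustration, not the data or the formula used in the study.

```python
import math


def euclidean_distance(f1, f2):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pairs: (features of mass A, features of mass B, average rating in [0, 1]).
pairs = [
    ((0.1, 0.2), (0.1, 0.3), 0.9),
    ((0.5, 0.5), (0.9, 0.1), 0.3),
    ((0.2, 0.8), (0.3, 0.6), 0.7),
    ((0.0, 0.0), (1.0, 1.0), 0.1),
]

distances = [euclidean_distance(a, b) for a, b, _rating in pairs]
# Assumed mapping: small distance -> similarity near 1, largest distance -> 0.
similarities = [1.0 - d / max(distances) for d in distances]
ratings = [rating for _a, _b, rating in pairs]
r_value = pearson_r(ratings, similarities)
```

The same correlation function can be applied to the output of a trained network in place of the distance-based similarity, which is how the two measures would be compared on equal terms.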
 

Figure: Relationship between radiologists’ average subjective ratings and psychophysical measure by use of five features


This site was last updated 11/24/04