









Mammography
1. Automated Detection of
Clustered Microcalcifications
We
developed a computer program that can automatically locate clustered
microcalcifications on mammograms. With our method, a digital mammogram
was processed by a linear filter to improve the signal-to-noise ratio of
microcalcifications on the image. Gray-level thresholding techniques,
which combined a global gray-level thresholding procedure and a locally
adaptive gray-level thresholding procedure, were then employed to extract
potential signal sites from the noise background. Subsequently,
feature-extraction criteria were imposed on the potential signals to
distinguish true signals from noise or artifacts. The computer then
indicated locations that may contain clusters of microcalcifications on
the image. Initially, for 60 mammograms used in the study, the
true-positive cluster detection rate of our automated detection program
reached 87% at a false-positive (FP) rate of 4 clusters per image. An ROC study
was performed to determine whether this performance level could result in
an improvement in radiologists’ performance when the CAD results were
displayed on images. The results of the ROC study showed that CAD, as
implemented by the computer code in the current state of development, did
significantly improve radiologists’ accuracy in detecting clustered
microcalcifications under conditions that simulated the rapid
interpretation of screening mammograms. Further improvements, including
the use of a shift-invariant artificial neural network and edge-gradient
analysis, reduced the number of FP detections by our program to 0.5 per
image. (214, 228, 268, 345, 355, 359, 394, 422, 424, 438, 440, 451,
466, 523, 534, 542, 571, 586, 614, 625, 693, 705, 753)
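The filtering and two-stage thresholding steps described above can be sketched as follows. This is a minimal illustration, not the program used in the study: the difference-of-box-means filter, the percentile-based global threshold, the mean-plus-k-standard-deviations local rule, and all function names and parameter values are assumptions chosen for clarity.

```python
import numpy as np

def box_mean(img, size):
    """Mean over a size x size window (odd size); edges handled by reflection."""
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = (integral[size:size + h, size:size + w]
             - integral[:h, size:size + w]
             - integral[size:size + h, :w]
             + integral[:h, :w])
    return total / (size * size)

def detect_candidates(img, small=3, large=9, global_pct=98.0, k=3.0, local=15):
    """Return a boolean mask of candidate signal pixels (all parameters hypothetical)."""
    img = img.astype(float)
    # Linear filter: small-scale mean minus a larger-scale background estimate.
    filtered = box_mean(img, small) - box_mean(img, large)
    # Global gray-level threshold: keep only the brightest filtered pixels.
    global_mask = filtered > np.percentile(filtered, global_pct)
    # Locally adaptive threshold: local mean plus k local standard deviations.
    mu = box_mean(filtered, local)
    var = np.clip(box_mean(filtered ** 2, local) - mu ** 2, 0.0, None)
    local_mask = filtered > mu + k * np.sqrt(var)
    return global_mask & local_mask

# Synthetic test image: unit-variance noise plus one small bright spot.
rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=(64, 64))
image[30:32, 30:32] += 20.0
mask = detect_candidates(image)
print("candidate pixels:", int(mask.sum()))
```

A real scheme would follow this with the feature-extraction and clustering criteria mentioned in the text, which are not modeled here.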
|
 |
2. Computerized Classification
of Clustered Microcalcifications
We
developed an automated scheme to help radiologists classify clustered
microcalcifications. The classification scheme first extracted, from the
image, features of the individual microcalcifications (thickness, volume,
area, and shape) and of the cluster itself (number of calcifications in
the cluster, and area and shape of the cluster). Second, these features
were used as input to an artificial neural network, whose output was the
probability that the cluster was malignant. The performance of our
classification scheme was tested against the average accuracy of 5
radiologists. While both the computerized method and the radiologists
were able to correctly identify all the malignant cases, the radiologists
misclassified 75% of the benign cases (called them malignant), while the
computer scheme misclassified only 25% of the benign cases. This result
indicated the potential of our classification scheme for reducing the
number of unnecessary biopsies. (506, 507, 606, 667, 715, 763, 817, 868)
|
 |
3. Effect of CAD on Radiologists'
Diagnosis of Clustered Microcalcifications
We
tested whether CAD can improve radiologists' diagnostic performance in
breast cancer diagnosis. Our computer classification scheme estimated the
likelihood of malignancy for clustered microcalcifications based on eight
computer-extracted features from standard-view mammograms. One hundred
and four histologically verified microcalcification cases (46 malignant,
58 benign) in a near-consecutive biopsy series were used in this study.
Observer performance was measured on ten radiologists who read the
original standard and magnification-view mammograms. The computer aid
provided a percentage estimate of likelihood of malignancy. Comparison
was made between computer-aided performance and unaided (routine clinical)
performance using ROC analysis and by comparing biopsy recommendations.
The results showed that the Az value increased from 0.61 (unaided) to 0.75
(CAD; P<0.0001). On average, each observer recommended 6.4 additional
biopsies for malignant cases (P=0.0006) and 6.0 fewer biopsies for benign
cases (P=0.003) with the computer aid. This corresponded to increases in
sensitivity (73.5% to 87.4%), specificity (31.6% to 41.9%), and
hypothetical positive biopsy yield (46% to 55%). We conclude that
computer-aided diagnosis can be used to improve radiologists' performance
in breast cancer diagnosis. (607, 667, 668, 669)
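The reported sensitivity, specificity, biopsy counts, and positive biopsy yield are linked by simple arithmetic. The sketch below assumes that a biopsy recommendation counts as a positive call and applies the observer-averaged rates to the 46 malignant and 58 benign cases; the function name is illustrative only. Because the published figures are averages over ten observers, the aided yield computed this way (about 54%) need not match the reported 55% exactly.

```python
def biopsy_summary(sensitivity, specificity, n_malignant=46, n_benign=58):
    """Expected biopsy counts and positive biopsy yield for one reading condition."""
    malignant_biopsied = sensitivity * n_malignant      # malignant cases sent to biopsy
    benign_biopsied = (1.0 - specificity) * n_benign    # benign cases sent to biopsy
    biopsy_yield = malignant_biopsied / (malignant_biopsied + benign_biopsied)
    return malignant_biopsied, benign_biopsied, biopsy_yield

unaided = biopsy_summary(0.735, 0.316)
aided = biopsy_summary(0.874, 0.419)

print(f"additional malignant biopsies: {aided[0] - unaided[0]:.1f}")
print(f"fewer benign biopsies:         {unaided[1] - aided[1]:.1f}")
print(f"positive biopsy yield:         {unaided[2]:.1%} -> {aided[2]:.1%}")
```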
|
 |
4. Potential of CAD to Reduce
Variability in Radiologists' Interpretation
We
evaluated whether CAD can reduce radiologists' inter-observer variability in
the interpretation of mammograms. We compared ten radiologists' decision
making on mammograms from 104 patients with clustered microcalcifications
with and without a computer aid. The computer estimated the likelihood
that a microcalcification cluster was due to a malignancy. We then
analyzed variability in the radiologists' recommendations of biopsy versus
follow-up. The results showed that variation in the radiologists'
accuracy as measured by the standard deviation of Az value was reduced 47%
by the computer aid. In addition, access to the computer aid increased
the agreement by all observers from 13% to 32% of total cases (P=0.0002)
while kappa increased from 0.19 to 0.41 (P<0.05). Finally, use of the
computer aid eliminated two thirds of substantial disagreements where
biopsy and routine screening were recommended for the same patient by two
different radiologists (P<0.05). We conclude that computer-aided
diagnosis holds the potential to reduce the variability in radiologists'
interpretation of mammograms in addition to its demonstrated potential to
improve diagnostic accuracy. (765)
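Kappa measures agreement beyond what chance alone would produce. The summary does not say which form of kappa was used for the ten observers, so the sketch below shows only the two-reader (Cohen) form as an illustration; the recommendations in the example are hypothetical, and a multi-reader study would need a generalization such as averaging over reader pairs.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two readers rating the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: fraction of cases with identical recommendations.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each reader's marginal category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical recommendations for four cases.
reader_1 = ["biopsy", "biopsy", "follow-up", "follow-up"]
reader_2 = ["biopsy", "follow-up", "follow-up", "follow-up"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```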
|
5. Effect of Correct Detection of
Microcalcifications on Computer Classification
We studied the effects of
computer-detected true-positive microcalcifications and computer-detected
false-positive microcalcifications on performance of computer
classification. Using a database of 100 mammograms, we compared computer
classification performance obtained from computer-detected
microcalcifications to computer classification performance obtained from
manually identified microcalcifications. The computer classification
performance was comparable to or better than radiologists' performance as
the number of computer-detected true-positive microcalcifications
decreased to 40% and as the number of computer-detected false-positive
microcalcifications increased to 50%. Further loss in computer-detected
true-positive microcalcifications degraded classification performance
substantially. These results showed that computer performance in
classifying clustered microcalcifications as malignant or benign was
insensitive to moderate decreases in computer-detected true-positive
microcalcifications and moderate increases in computer-detected
false-positive microcalcifications. (763)
|
6. Automated Detection of
Mammographic Masses
We
developed methods for the computerized detection of masses in digital
mammograms. One method was based on the deviation from the normal
architectural symmetry of the right and left breasts; a
bilateral-subtraction technique was used to enhance the conspicuity of
possible masses. The scheme employed two pairs of conventional
screen-film mammograms (the right and left MLO views and CC views), which
were digitized. After the right and left breast images in each pair were
aligned, a nonlinear bilateral-subtraction technique was employed that
involved linking multiple subtracted images to locate initial candidate
masses. Various features were extracted and merged using an artificial
neural network in order to reduce false-positive detections resulting from
the bilateral subtraction. In an evaluation study using 154 pairs of
clinical mammograms, the scheme yielded a sensitivity of 95% for detection
at an average of 2.5 false-positive detections per image. (275, 330,
400, 436, 437, 443, 477, 608, 610)
|
7. Computerized Classification of
Mammographic Masses
Malignant masses often can be distinguished from benign masses due to
their more spiculated appearance in the mammographic image. Thus, in the
classification of masses, our computerized scheme was based on the degree
of spiculation exhibited by the mass in question. We developed a method
for the automated extraction of the lesion from the parenchymal background
in order to facilitate the extraction of various features. The features
extracted were obtained from cumulative edge gradient histogram analysis
in which the gradient was analyzed relative to the radial angle. Other
features included gray-level measures and geometric measures. From the
cumulative edge-gradient-orientation histogram, various measures were
calculated, including the full width at half maximum (FWHM), the standard
deviation of the cumulative edge gradient, and the average gradient in the
radial direction. With a pathologically confirmed database, the computer
classification scheme (Az=0.94) performed at a level similar to that of an
experienced mammographer (Az=0.90) in distinguishing malignant from benign
masses. The average performance of general radiologists yielded an Az
value of 0.81. The biopsy rate at 100% sensitivity of the computer scheme
was about 30% higher than that of the experienced mammographer and was
over 60% higher than that of the average of the five general
radiologists. (454, 602, 663, 713, 740, 762, 811, 812)
|
 |
8. Robustness of Computerized
Classification of Masses
We
evaluated the robustness of our computerized method developed for the
classification of benign and malignant masses with respect to variations
in both case mix and film digitization. The method was evaluated
independently with a 110-case database consisting of 50 malignant and 60
benign cases. Mammograms were digitized twice with two different
digitizers (Konica and Lumisys). Effects of variations in both case mix
and film digitization on performance of the method also were assessed.
Categorization of lesions as malignant or benign with an ANN (or a hybrid)
classifier achieved an Az value of 0.90 (0.94 for the hybrid) on the previous
training database in a round-robin evaluation, and Az values of 0.82
(0.81) and 0.81 (0.82) on the independent database for the Konica and
Lumisys formats, respectively. These differences, however, were not
statistically significant (P>.10). Therefore, the computerized method for
the classification of lesions on mammograms was robust with respect to
variations in case mix and film digitization. (713)
|
9. Effect of Dominant Features on
Neural Network Performance
Two
different classifiers, an ANN and a hybrid system (a one-step rule-based
method followed by an ANN), were investigated to merge computer-extracted
features in the task of differentiating between malignant and benign
masses. A total of four computer-extracted features – spiculation, margin
sharpness and two density-related measures – were used to characterize
these masses. We investigated their learning and decision-making
processes by studying the relationships between the input features and the
outputs. A correlation study showed that the outputs from the ANN-alone
method were correlated strongly with one of the input features (spiculation),
yielding a correlation coefficient of 0.91, whereas the correlation
coefficients (absolute value) for the other features ranged from 0.19 to
0.40. This strong correlation between the ANN output and the spiculation
measure indicated that the learning and decision-making processes of the
ANN-alone method were dominated by the spiculation measure.
Three-dimensional plots of the computer output as functions of the input
features demonstrated that the ANN-alone method did not learn as
effectively as the hybrid system in differentiating non-spiculated
malignant masses from benign masses, thus resulting in an inferior
performance at the high sensitivity levels. We found that with a limited
database it was detrimental for an ANN to learn the significance of other
features in the presence of a dominant feature. The hybrid system, which
initially applied a rule concerning the value of the spiculation measure
prior to employing an ANN, prevented overlearning from the dominant
feature and performed better than the ANN-alone method in merging the
computer-extracted features into a correct diagnosis regarding the
malignancy of the masses. (663)
|
10. Potential Usefulness of Special
View Mammograms in Computer-Aided Diagnosis
The
performance of our computerized classification method was evaluated on an
independent database consisting of 70 cases (33 malignant and 37 benign
cases), each having CC, MLO and special view mammograms (spot compression
or spot compression magnification views). The mass lesion identified in
each of the three mammographic views was analyzed using our previously
developed and trained computerized classification method. On this
independent database, we compared the performance of individual
computer-extracted mammographic features, as well as the
computer-estimated likelihood of malignancy, for the standard and special
views. Computerized analysis of special view mammograms alone in the task
of distinguishing between malignant and benign lesions yielded an Az of
0.95, which is significantly higher (P<0.005) than that obtained from the
MLO and CC views (Az values of 0.78 and 0.75, respectively). Use of only
the special views correctly classified 19 of 33 benign cases (a
specificity of 58%) at 100% sensitivity, whereas use of the CC and MLO
views alone correctly classified 4 and 8 of 33 benign cases (specificities
of 12% and 24%, respectively). In addition, we found that the average
computer output of the three views (Az of 0.95) yielded a significantly
better performance than did the maximum computer output from the
mammographic views. Our results show that computerized analysis of
special view mammograms yielded a better performance in differentiating
between benign and malignant masses than did standard view mammograms of
the same breast. (762)
|
11. Observer Study for Effectiveness
of CAD in the Diagnosis of Breast Cancer
We
evaluated the effectiveness of our computerized classification method as
an aid to radiologists reviewing clinical mammograms for which the
diagnoses were unknown to both the radiologists and the computer. Six
mammographers and 6 community radiologists participated in an observer
study. These 12 radiologists interpreted, without and with the computer
aid, 110 cases that were unknown to both the 12 radiologist observers and
the trained computer classification scheme. When the computer aid was
used, the average performance of the 12 radiologists improved, as
indicated by an increase in Az from 0.93 to 0.96 (P=0.0002), and by an
increase in sensitivity from 94% to 98% (P=0.022). No statistically
significant difference in specificity was found between readings with and
without computer aid (difference = -0.014; P=0.46; 95% CI = (-0.054, 0.026)). When we
analyzed results from the mammographers and community radiologists as
separate groups, a larger improvement was demonstrated for the community
radiologists. Computer-aided diagnosis can potentially help radiologists
improve their diagnostic accuracy in the task of differentiating between
benign and malignant masses seen on mammograms. (811)
|
 |
12. Observer Study with an
Intelligent CAD Workstation for Breast Imaging
We
incorporated our computerized mass classification method into an
intelligent workstation interface that displays known malignant and benign
cases similar to lesions in question using a color-coding scheme that
allows instant visual feedback to the radiologist. The probability
distributions of the malignant and benign cases in the known database were
also graphically displayed along with the graphical “location” of the
unknown case relative to these two distributions. We investigated the
usefulness of the intelligent search workstation for computer-aided
diagnosis as an aid to radiologists in the classification of lesions in
mammography. Upon presentation of an unknown mammographic case, the
workstation shows the computer output in terms of (a) computer-estimated
likelihoods of malignancy, (b) images of lesions with known diagnoses from
an on-line lesion atlas, and (c) graphics illustrating the characteristics
of the unknown lesion relative to characteristics of lesions in the known
reference atlas. These images were retrieved automatically from a
similarity search of lesions in the known mammographic atlas. In an
observer study, five radiologists interpreted 100 cases before and after
presentation of the computer output. On average, the radiologists’
performance was improved in terms of Az value from 0.86 to 0.90 with a
P<0.02, in the task of distinguishing between malignant and benign
mammographic mass lesion cases. (707)
|
13. Automated Seeded Lesion
Segmentation on Mammograms
Segmenting lesions is a vital step in many computerized mass-detection
schemes for digital (or digitized) mammograms. We developed two novel
lesion segmentation techniques – one based on a single feature called the
radial gradient index (RGI) and one based on simple probabilistic models
to segment mass lesions, or other similar nodular structures, from
surrounding background. In both methods, a series of image partitions was
created using gray-level information as well as prior knowledge of the
shape of typical mass lesions. With the former method, the partition that
maximizes the RGI was selected. In the latter method, probability
distributions for gray-levels inside and outside the partitions were
estimated, and subsequently used to determine the probability that the
image occurred for each given partition. The partition that maximizes
this probability was selected as the final lesion partition (contour). We
tested these methods against a conventional region growing algorithm using
a database of biopsy-proven, malignant lesions and found that the new
lesion segmentation algorithms more closely matched radiologists’ outlines
of these lesions. At an overlap threshold of 0.30, gray level region
growing correctly delineated 62% of the lesions in our database while the
RGI and probabilistic segmentation algorithms correctly segmented 92% of
the lesions. (609)
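The summary names the radial gradient index but does not define it. The sketch below assumes a common formulation: the sum, over contour points, of the image gradient projected onto a unit radial vector, divided by the sum of gradient magnitudes. The sign convention (radial vectors pointing toward the seed point, so a bright blob scores near +1), the function name, and the synthetic test blob are all assumptions for illustration.

```python
import numpy as np

def radial_gradient_index(image, contour_rc, center_rc):
    """RGI over contour points; +1 when every gradient points at the center."""
    grad_r, grad_c = np.gradient(image.astype(float))
    pts = np.asarray(contour_rc, dtype=int)
    g = np.stack([grad_r[pts[:, 0], pts[:, 1]],
                  grad_c[pts[:, 0], pts[:, 1]]], axis=1)
    # Unit vectors from each contour point toward the center (assumed convention).
    radial = np.asarray(center_rc, dtype=float) - pts
    radial /= np.linalg.norm(radial, axis=1, keepdims=True)
    total_magnitude = np.linalg.norm(g, axis=1).sum()
    if total_magnitude == 0:
        return 0.0
    return float((g * radial).sum() / total_magnitude)

# Synthetic bright Gaussian blob with a circular candidate contour around it.
rows, cols = np.mgrid[0:65, 0:65]
blob = np.exp(-((rows - 32) ** 2 + (cols - 32) ** 2) / (2 * 8.0 ** 2))
angles = np.linspace(0.0, 2.0 * np.pi, 72, endpoint=False)
circle = np.stack([32 + 10 * np.sin(angles),
                   32 + 10 * np.cos(angles)], axis=1).round().astype(int)
rgi = radial_gradient_index(blob, circle, (32, 32))
print(f"RGI = {rgi:.3f}")
```

In a segmentation setting, this index would be evaluated for each candidate partition and the partition with the largest value kept, as the text describes.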
|
14. Feature Selection with Limited
Datasets
In
many computerized schemes, numerous features can be extracted to describe
suspect image regions. A subset of these features is then employed in a
classifier to determine whether the suspect region is abnormal or normal.
Different subsets of features, in general, result in different
classification performances. A feature selection method is often used to
determine an “optimal” subset of features to use with a particular
classifier. A classifier performance measure such as Az value must be
incorporated into this feature selection process. With limited datasets,
however, there is a distribution in the classifier performance measure for
a given classifier and subset of features. We investigated the variation
in the selected subset of “optimal” features as compared with the true
optimal subset of features caused by this distribution of classifier
performance. We considered examples in which the probability that the
optimal subset of features was selected can be analytically computed. We
showed the dependence of this probability on the dataset sample size, the
total number of features from which to select, the number of features
selected, and the performance of the true optimal subset. Once a subset
of features has been selected, the parameters of the data classifier must
be determined. We showed that, with limited datasets and/or a large number
of features from which to choose, bias was introduced if the classifier
parameters were determined using the same data that were employed to
select the “optimal” subset of features. (675)
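A related and simpler effect can be shown by simulation: when many candidate features are screened on a small dataset, the performance of the selected feature measured on that same dataset is optimistic. The sketch below is an illustration under stated assumptions and not the analysis from the study. Every feature is pure noise, the nonparametric area under the ROC curve stands in for Az, and the sample sizes are arbitrary.

```python
import numpy as np

def auc(pos, neg):
    """Nonparametric area under the ROC curve (Mann-Whitney statistic)."""
    pos = np.asarray(pos)[:, None]
    neg = np.asarray(neg)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())

rng = np.random.default_rng(0)
n_features, n_train, n_test = 200, 15, 1000

# Every feature is pure noise, so the true AUC of each one is 0.5.
train_pos = rng.normal(size=(n_train, n_features))
train_neg = rng.normal(size=(n_train, n_features))
test_pos = rng.normal(size=(n_test, n_features))
test_neg = rng.normal(size=(n_test, n_features))

# Select the single "best" feature using the small training set only.
train_auc = [auc(train_pos[:, j], train_neg[:, j]) for j in range(n_features)]
best = int(np.argmax(train_auc))

resubstitution = train_auc[best]                          # same data as selection
independent = auc(test_pos[:, best], test_neg[:, best])   # fresh data
print(f"selected feature {best}: resubstitution AUC {resubstitution:.2f}, "
      f"independent AUC {independent:.2f}")
```

The gap between the two estimates grows with the number of candidate features and shrinks with the sample size, which matches the dependence described in the text.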
|
15. Ideal Observer Approximation
using Bayesian Classification Neural Networks
It is
well understood that the optimal classification decision variable is the
likelihood ratio or any monotonic transformation of the likelihood ratio.
An automated classifier which maps from an input space to one of the
likelihood ratio family of decision variables is an optimal classifier or
“ideal observer.” ANNs are frequently used as classifiers for many
problems. In the limit of large training sample sizes, an ANN
approximates a mapping function which is a monotonic transformation of the
likelihood ratio, i.e., it estimates an ideal observer decision variable.
A principal disadvantage of conventional ANNs is the potential
over-parameterization of the mapping function which results in a poor
approximation of an optimal mapping function for smaller training
samples. Recently, Bayesian methods have been applied to ANNs in order to
regularize training to improve the robustness of the classifier. The goal
of training a Bayesian ANN with finite sample sizes is, as with unlimited
data, to approximate the ideal observer. We evaluated the accuracy of
Bayesian ANN models of ideal observer decision variables as a function of
the number of hidden units used, the signal-to-noise ratio of the data and
the number of features or dimensionality of the data. We showed that when
enough training data were present, excess hidden units did not
substantially degrade the accuracy of Bayesian ANNs. However, the minimum
number of hidden units required to best model the optimal mapping function
varied with the complexity of the data. (767)
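The statement that any monotonic transformation of the likelihood ratio is an equally good decision variable can be checked numerically. The sketch below assumes two univariate normal classes with unequal variances and an arbitrary prior (all values hypothetical), and compares the ranking of a few observations by likelihood ratio with their ranking by posterior probability, which is a monotonic function of the likelihood ratio.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def likelihood_ratio(x, signal=(1.5, 2.0), noise=(0.0, 1.0)):
    """p(x | signal) / p(x | noise) for hypothetical (mean, sd) pairs."""
    return normal_pdf(x, *signal) / normal_pdf(x, *noise)

def posterior_signal(x, prior_signal=0.3):
    """Posterior probability of the signal class; increases with the likelihood ratio."""
    lr = likelihood_ratio(x)
    return lr * prior_signal / (lr * prior_signal + (1.0 - prior_signal))

xs = [-2.0, -0.5, 0.0, 0.7, 1.5, 3.0]
rank_by_lr = sorted(range(len(xs)), key=lambda i: likelihood_ratio(xs[i]))
rank_by_posterior = sorted(range(len(xs)), key=lambda i: posterior_signal(xs[i]))
rank_by_x = sorted(range(len(xs)), key=lambda i: xs[i])

print("ranking by likelihood ratio:", rank_by_lr)
print("ranking by posterior:       ", rank_by_posterior)
print("ranking by raw x:           ", rank_by_x)
```

The first two rankings agree, so they trace out the same ROC curve. The raw observation x ranks the cases differently because, with unequal variances, the likelihood ratio is not monotonic in x.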
|
16. Evaluation of Computerized
Detection Techniques with a Missed Lesion Database
Over
the past 7 years, we have been collecting cases in which a lesion was
missed in a mammogram. To date, 69 cases with a lesion that went
undetected by a radiologist were analyzed by the computerized detection
schemes -- clustered microcalcifications and masses. In all cases the
lesions were rated retrospectively as being subtle to extremely subtle by
an experienced mammographer. The computer methods correctly identified
approximately 50% of the missed lesions -- 54% of the malignant lesions
and 45% of the benign lesions. The false-positive rate was 1.3 per
image. This result showed that our computer methods were capable of
identifying cancers that were overlooked by radiologists. (528, 735, 777,
778)
|
17. Initial Clinical Testing of an
"Intelligent" Mammography Workstation
We
implemented our computerized detection schemes for masses and clustered
microcalcifications on a prototype "intelligent" mammography workstation
in an ongoing clinical study. The workstation consisted of a film
digitizer, a high-speed computer, a magneto-optical jukebox, and hard and
soft copy displays. The system was installed in the clinical mammography
reading area at the University of Chicago on November 8, 1994. As of
October 1997, more than 12,000 cases were analyzed. We analyzed the
sensitivity and false-positive rate of the intelligent workstation for the
first two years of implementation, which included 8,035 mammographic
screening cases. Thirty-five cancers were confirmed to date within this
2-year period, with one case yielding a negative mammogram but with a
palpable lesion. Twenty-three of the remaining 34 cancers were detected by the
computer (16 of 23 cases containing masses and 7 of 13 cases containing
clustered microcalcifications). Nine of the patients with cancer had 2
screening exams during the two-year period. In three of the nine cases,
the computer indicated the region in the first exam where the cancer was
subsequently diagnosed by the radiologist in the second exam. The
computer output contained, on average, 0.9 false-positive
microcalcification clusters and 1.4 false-positive masses. In order to
determine the effect of false-positive detections on mammographic
interpretation, we calculated the call-back rate in one-year periods
before and after implementation of the workstation in the clinical area.
Before introduction of CAD, 13.2% of screeners were called back for
further workup and after the introduction of CAD, 12.6% of screeners were
called back for further workup. Thus, the false-positive output from the
computer did not increase the number of women called back. (465, 467,
615, 683)
|
 |
18. Artificial Neural Networks for
Decision Making in the Diagnosis of Breast Cancer
The
interpretation of mammograms for the diagnosis of breast cancer is a
difficult task. We investigated the potential utility of artificial
neural networks as a decision-making aid to radiologists in the analysis
of mammographic data. Three-layer, feed-forward neural networks with a
back-propagation algorithm were trained for the interpretation of
mammograms on the basis of features extracted from mammograms by
experienced radiologists. Our database consisted of features extracted
from 133 textbook cases and 60 clinical cases. Performance of the neural
networks was evaluated by ROC analysis. A network that used 43 image
features performed well in distinguishing between benign and malignant
lesions, yielding an Az value of 0.95 for textbook cases in a test by the
round-robin method. With clinical cases, the performance of a neural
network in merging 14 radiologist-extracted features of lesions to
distinguish between benign and malignant lesions was found to be higher
than the average performance of attending and resident radiologists alone
(without the aid of a neural network). Therefore, the networks may
provide a potentially useful tool in the mammographic decision-making task
of distinguishing between benign and malignant lesions. (358, 398)
|
19. A Method for Producing Simulated
Mammograms
We
developed a method for producing computerized simulated mammograms. It is
now possible to model image formation in many different types of x-ray
detectors. That is, given an x-ray distribution incident on a detector,
it is possible to predict how the final image will appear. Therefore, we
collected high fidelity images of biopsy specimens and mastectomy
samples. We are able to use these images to produce multiple simulated
mammograms with different types of pathology. The technique was tested on
phantom images.
|
20. Computerized Analysis of
Mammographic Parenchymal Patterns for Breast Cancer Risk Assessment
With
the increasing awareness of breast cancer risk and the benefit of
screening mammography, more women in all risk categories are seeking
information regarding their individual risk of developing breast cancer.
Identification and close surveillance of women who are at high risk of
developing breast cancer may provide an opportunity for early cancer
detection. The purpose of this study was to identify computer-extracted,
mammographic parenchymal patterns that are associated with breast cancer
risk. We extracted fourteen features from the central breast region on
digitized mammograms to characterize the mammographic parenchymal patterns
of women at different risk levels. Two different approaches were employed
to relate these mammographic features to breast cancer risk. In one
approach, the features were used to distinguish mammographic patterns seen
in low-risk women from those who inherited a mutated form of the
BRCA1/BRCA2 gene, which confers a very high risk of developing breast
cancer. In another approach, the features were related to risk as
determined from existing clinical models (Gail and Claus models), which
use well-known epidemiological factors such as a woman’s age, her family
history of breast cancer, reproductive history, etc. Stepwise linear
discriminant analysis was employed to identify features that were useful
in differentiating between "low-risk" women and BRCA1/BRCA2-mutation
carriers. Stepwise linear regression analysis was employed to identify
useful features in predicting the risk as estimated from the Gail and
Claus models. Similar computer-extracted mammographic features were
identified in the two approaches. Results show that women at high risk
tend to have dense breasts and their mammographic patterns tend to be
coarse and low in contrast. (709, 710, 761)
|
21. Computerized Analysis of
Digitized Mammograms of BRCA1/BRCA2 Gene Mutation Carriers
In
this study, we aimed to evaluate, using computer image analysis, the
mammographic density patterns of women with germ-line mutations in BRCA1
or BRCA2 genes in comparison with those of women at low risk of developing
breast cancer. Mammograms from 30 carriers of BRCA1 or BRCA2 mutations
and 142 low-risk women were collected retrospectively and digitized. In
addition, sixty of the 142 low-risk women were randomly selected and
age-matched at 5-year intervals to the 30 mutation carriers. Mammographic
features were extracted from the central regions of the breast image to
characterize the mammographic density and the heterogeneity of dense
portions of the breast. These features were then merged by linear
discriminant analysis (LDA) into a single value related to the risk of
breast cancer. Quantitative analysis of mammograms demonstrated that the
carriers of BRCA1 or BRCA2 mutations tend to have dense breast tissue and
their mammographic patterns tend to be low in contrast with coarse
texture. The LDA achieved Az values of 0.91 and 0.92 in distinguishing
between the BRCA1/BRCA2-mutation carriers and the low-risk women in the
entire database and the age-matched group, respectively. The computerized
analysis of mammograms suggests that mammographic patterns of carriers of
BRCA1 or BRCA2 mutations were different from those of women at low risk
for breast cancer. Our computer-extracted features may potentially be
useful as radiographic markers for identifying women at high risk for
breast cancer. (603, 813)
|
22. Elimination of false-positive microcalcification detections in a CAD scheme using a Bayesian neural network
We compared the performance of a Bayesian neural network (BNN) for feature classification with a rule-based classifier and a conventional artificial neural network (ANN) in a computer-aided diagnosis (CAD) scheme for the detection of clustered microcalcifications. Five features were extracted from the images at each signal location. A BNN, which can approximate the behavior of the ideal observer, was trained on a database of 39 mammograms containing clustered microcalcifications. The performance
of the trained BNN on an independent database of 50 mammograms was compared to the performance of a combined rule-based and conventional-ANN method. For both methods, detected signals were clustered to yield detected cluster FROC curves. At a true-positive fraction of detected clusters of 0.83, the number of false-positive clusters per image was 0.8 for the combined method, and 1.16 for the BNN. The BNN does not require subjective selection of thresholds as in the rule-based and combined methods, its performance is robust to the properties of the testing dataset, and it is able in theory to approximate the performance of
the ideal observer (705, 753).
|
23. Estimation of three-class ideal observer decision functions with a Bayesian artificial neural network
We are using Bayesian artificial neural networks (BANNs) to eliminate false-positive detections in our computer-aided diagnosis schemes. In the present work, we investigated whether BANNs can be used to estimate likelihood ratio, or ideal observer, decision functions for distinguishing observations which are drawn from three classes. Three univariate normal distributions were chosen representing three classes. We sampled 3,000 values of x for each of 10 training datasets, and 3,000 values of x for a single testing dataset. A BANN was
trained on each training dataset, and the two outputs from each trained BANN, which estimate p(class 1|x) and p(class 2|x), were recorded for each value of x in the testing dataset. The mean BANN output and its standard error were calculated using the ten sets of BANN output. We repeated the above procedure to estimate the means and standard errors of the two likelihood ratio decision functions p(x|class 1)/p(x|class 3) and p(x|class 2)/p(x|class 3). We found that the BANN can estimate the a posteriori class probabilities quite accurately, except in regions of data space where outcomes are unlikely. Estimation of the likelihood ratios is
more problematic, which we attribute to error amplification caused by taking the ratio of two imprecise estimates. We hope to improve these estimates by constraining the BANN training procedure (801,
863).
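
The relationship between the two posterior outputs and the two likelihood ratios can be sketched as follows. This is an illustrative calculation with exact posteriors rather than BANN estimates; the class means, standard deviations, and equal priors are assumptions chosen for the example, not the values used in the study.

```python
import math

# Hypothetical (mean, standard deviation) for classes 1, 2, 3 and equal priors.
CLASSES = [(-1.5, 1.0), (0.0, 1.0), (1.5, 1.0)]
PRIORS = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posteriors(x):
    """Exact p(class i | x) for the three classes, by Bayes' rule."""
    joint = [p * normal_pdf(x, m, s) for p, (m, s) in zip(PRIORS, CLASSES)]
    total = sum(joint)
    return [j / total for j in joint]

def likelihood_ratios_from_posteriors(p1, p2):
    """Convert p(class 1|x) and p(class 2|x) into the two likelihood ratios
    p(x|class 1)/p(x|class 3) and p(x|class 2)/p(x|class 3)."""
    p3 = 1.0 - p1 - p2  # the third posterior is implied by the other two
    lr1 = (p1 / p3) * (PRIORS[2] / PRIORS[0])
    lr2 = (p2 / p3) * (PRIORS[2] / PRIORS[1])
    return lr1, lr2

def true_likelihood_ratios(x):
    """Likelihood ratios computed directly from the class densities."""
    f = [normal_pdf(x, m, s) for m, s in CLASSES]
    return f[0] / f[2], f[1] / f[2]
```

Because p(class 3|x) appears in the denominator, a small absolute error in an estimated posterior becomes a large relative error in the ratio wherever class 3 is unlikely, which is consistent with the error amplification described above.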
|
24. Maximum likelihood fitting of FROC curves under an initial-detection-and-candidate-analysis model
We have developed a model for FROC curve fitting that relates the observer's FROC performance not to the ROC performance that would be obtained if the observer's responses were scored on a per image basis, but rather to a hypothesized ROC performance that the observer would obtain in the task of classifying a set of "candidate detections" as positive or negative. We adopt the assumptions of the Bunch FROC model, namely that the observer's detections are all mutually independent, as well as assumptions qualitatively similar to, but different in nature from, those made
by Chakraborty in his AFROC scoring methodology. Under the assumptions of our model, we show that the observer's FROC performance is a linearly scaled version of the candidate analysis ROC curve, where the scaling factors are just given by the FROC operating point coordinates for detecting initial candidates. Further, we show that the likelihood function of the model parameters given observational data takes on a simple form, and we develop a maximum likelihood method for fitting a FROC curve to this data. FROC and AFROC curves are produced for computer vision observer datasets and compared with the results of the AFROC scoring method.
Although developed primarily with computer vision schemes in mind, we hope that the methodology presented here will prove worthy of further study in other applications as well (802). |
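
The linear scaling between the candidate-analysis ROC curve and the FROC curve can be sketched as follows. The summary does not state a parametric form for the ROC curve, so the conventional binormal form with parameters `a` and `b` is an assumption here, as are the names `lam` (false-positive candidates per image at initial detection) and `nu` (fraction of lesions found at initial detection).

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def candidate_roc_point(t, a, b):
    """(FPF, TPF) of an assumed binormal ROC curve at decision threshold t."""
    return phi(-t), phi(a - b * t)

def froc_point(t, a, b, lam, nu):
    """Scale the ROC point by the initial-detection operating point (lam, nu).

    Returns (false positives per image, fraction of lesions detected).
    """
    fpf, tpf = candidate_roc_point(t, a, b)
    return lam * fpf, nu * tpf

def froc_curve(a, b, lam, nu, n_points=41, t_min=-4.0, t_max=4.0):
    """Sample the FROC curve from a strict threshold to a lax one."""
    step = (t_max - t_min) / (n_points - 1)
    thresholds = [t_max - i * step for i in range(n_points)]
    return [froc_point(t, a, b, lam, nu) for t in thresholds]
```

Under these assumptions the curve ends at (`lam`, `nu`), the point at which every initial candidate is accepted.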
25. The use of a priori information in the detection of mammographic microcalcifications to improve their classification
In this work, we present a calcification-detection scheme that automatically localizes calcifications in a previously detected cluster in order to generate the input for a cluster-classification scheme developed in the past. The calcification-detection scheme makes use of three pieces of a priori information: the location of the center of the cluster, the size of the cluster, and the approximate number of calcifications in the cluster. This information can be
obtained either automatically from a cluster-detection scheme or manually by a radiologist. It is used to analyze only the portion of the mammogram that contains a cluster and to identify the individual calcifications more accurately, after enhancing them by means of a Difference-of-Gaussians filter. The scheme can achieve classification performance (patient-based Az = 0.92; cluster-based Az = 0.72) comparable to that obtained by using manually identified
calcifications (patient-based Az = 0.92; cluster-based Az = 0.82). (884)
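
A minimal sketch of how the three pieces of a priori information might be used with a Difference-of-Gaussians filter is given below. It is not the published scheme: the filter widths, the edge handling, and the helper names `crop_region`, `difference_of_gaussians`, and `top_candidates` are all assumptions made for illustration, and the image is a plain list of pixel rows.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    k = gaussian_kernel(sigma)
    return transpose(convolve_rows(transpose(convolve_rows(image, k)), k))

def difference_of_gaussians(image, sigma_small, sigma_large):
    """Narrow blur minus wide blur: enhances small bright spots."""
    a = gaussian_blur(image, sigma_small)
    b = gaussian_blur(image, sigma_large)
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def crop_region(image, center_row, center_col, half_size):
    """Cut out the square around the known cluster center and size."""
    r0, c0 = max(center_row - half_size, 0), max(center_col - half_size, 0)
    r1, c1 = center_row + half_size + 1, center_col + half_size + 1
    return [row[c0:c1] for row in image[r0:r1]], (r0, c0)

def top_candidates(response, count):
    """Keep the `count` strongest pixels, using the approximate number of
    calcifications as the cutoff (a simplification: no peak merging)."""
    scored = [(v, r, c) for r, row in enumerate(response) for c, v in enumerate(row)]
    scored.sort(reverse=True)
    return [(r, c) for _v, r, c in scored[:count]]
```

In use, the region returned by `crop_region` would be filtered with `difference_of_gaussians`, and `top_candidates` would then select roughly as many sites as the expected number of calcifications.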
|
|
26. Investigation of Psychophysical Measure for Evaluation of Similar Images for Mammographic Masses: Preliminary Results
We investigated a psychophysical similarity measure for selection of
images similar to those of unknown masses on mammograms, which may
assist radiologists in the distinction between benign and malignant
masses. Sixty pairs of masses were selected from 1445 mass images
prepared for this study, which were obtained from the Digital Database
for Screening Mammography by the University of South Florida. Five
radiologists provided subjective similarity ratings for these 60 pairs
of masses based on the overall impression for diagnosis. Radiologists’
subjective ratings were marked on a continuous rating scale and
quantified between 0 and 1, which correspond to pairs not similar at all
and pairs almost identical, respectively. By use of the subjective
ratings as “gold standard”, similarity measures based on the Euclidean
distance between pairs in feature space and the psychophysical measure
were determined. For determination of the psychophysical similarity
measure, an artificial neural network (ANN) was employed to learn the
relationship between radiologists’ average subjective similarity ratings
and computer-extracted image features. To evaluate the usefulness of the
similarity measures, the agreement with the radiologists’ subjective
similarity ratings was assessed in terms of correlation coefficients
between the average subjective ratings and the similarity measures. A
commonly used similarity measure based on the Euclidean distance was
moderately correlated (r=0.644) with the radiologists’ average
subjective ratings, whereas the psychophysical measure by use of the ANN
was highly correlated (r=0.798). The preliminary result indicates that a
psychophysical similarity measure would be useful in the selection of
images similar to those of unknown masses on mammograms. (967)
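
The evaluation step, in which a distance-based similarity measure is compared with the average subjective ratings by a correlation coefficient, can be sketched as follows. The mapping from distance to similarity, 1/(1 + d), is an assumption made for illustration, since the summary does not state how the Euclidean distance was converted; the ANN-based psychophysical measure is not reproduced here.

```python
import math

def euclidean_distance(features_a, features_b):
    """Distance between two masses in feature space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))

def distance_similarity(features_a, features_b):
    """Map distance to a similarity in (0, 1]; 1 means identical features.
    The 1 / (1 + d) form is a hypothetical choice for this sketch."""
    return 1.0 / (1.0 + euclidean_distance(features_a, features_b))

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Given one feature-vector pair and one average rating per pair of masses, `pearson_r` applied to the list of `distance_similarity` values and the list of ratings gives the kind of agreement figure reported above.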
|

Relationship between radiologists’
average subjective ratings and psychophysical measure by use of five
features |
|