Summary
This is a description of a
workshop held from 8:00 to 10:00pm, Wednesday, February 14, 1996 at SPIE's International Symposium, Medical Imaging 1996,
in the Pacific Ballroom of the Newport Beach Marriott Hotel in Newport Beach,
California. Approximately fifty researchers interested in medical image
registration attended. The meeting was chaired by J. Michael Fitzpatrick of Vanderbilt
University.
Purpose
The purpose of the
Workshop was to promote the discussion of methods for the objective evaluation
of algorithms for image registration. Many of those in attendance were already
taking part in such an evaluation. That evaluation is sponsored by the National
Institute of Neurological
Disorders and Stroke (NINDS) and is
entitled, “Evaluation
of Retrospective Image Registration” (1 R01 NS33926-01, Fitzpatrick,
principal investigator).
The NINDS Project
The NINDS project, which
began in early 1995, takes advantage of Internet communication to distribute
images from 10 patients from Vanderbilt
University to each of
several sites outside Vanderbilt (eleven groups at ten sites as of the time of
this workshop). Investigators at each site apply registration techniques to
pairs of images acquired from each of these patients to determine a rigid
transformation to bring them into alignment. The transformations are then
converted into a standard format developed
for this project and are emailed by the investigators to Vanderbilt where they
are compared with gold standards. The gold standard for each image pair is the
transformation determined by a prospective system for registration developed
at Vanderbilt for use in image-guided neurosurgical navigation and based on
bone-implanted fiducial markers. Registration pairs
include both CT/MR and PET/MR. The evaluation is called blinded because (1) the
investigators outside Vanderbilt are not provided with the gold standard
transformations until after their registrations are submitted and evaluated and
(2) all traces of the fiducial markers are removed
from all images before they are distributed. Statistics on the accuracies of
the registration methods are calculated and tabulated. By the time of this
workshop eleven groups at ten sites had been evaluated.
Structure of the discussion
The Workshop began with
the chair thanking the co-chairs of the Image Processing track, Dr. Murray Loew of George Washington University and Dr. Ken Harrison
of Los Alamos National Lab, Ms. Donna Rode, SPIE Technical Programs
Coordinator, and Mr. Randy Cross, SPIE Conference Manager, for their assistance
in setting up the session. He then proposed a format for the session:
he posed some questions relevant to this and future registration evaluation
projects and suggested that the ensuing discussion be
directed towards answering them. Before the open session began the
chair gave a brief overview of the history of the evaluation project and
defined some terms commonly used with regard to image registration and its
evaluation. What follows is that history, those terms, and a list of comments
and observations made during the open discussion organized according to the
relevant questions.
History of Acustar and the NINDS project
While stereotactic
frames have been in use in brain surgery for almost fifty years to relate
points in an image of the brain to the physical head itself, only a decade has
elapsed since the earliest efforts were made to bring two volume images of the
same head into registration with each other. In the intervening years a variety
of promising techniques have been described and tested, many of them involving
CT-to-MR or PET-to-MR registration, and, unlike stereotaxy
or other so-called “prospective” registration techniques, requiring no special
patient preparation prior to imaging. These “retrospective” registration
techniques are useful whenever information from the two modalities is needed in
the same anatomical region.
In July, 1988, two years
into this decade of retrospective registration, Johnson & Johnson, through
their subsidiary, Codman and Shurtleff, Inc., funded
at Vanderbilt a one-year project on prospective registration entitled “Three
Dimensional Image Volume Registration and Reorientation Using Implantable Fiducial Markers”, with Fitzpatrick and Robert J. Maciunas of the Department of Neurological Surgery as
co-principal investigators. That project was renewed yearly through 1996 with
Robert L. Galloway of the Department of Biomedical Engineering added early on
as co-PI. The first paper on this project was presented in the Image Processing
track at this Symposium four years ago [V. R. Mandava,
J. M. Fitzpatrick, C. R. Maurer, R. J. Maciunas, and
G. S. Allen, “Registration of multimodal volume head images via attached
markers”, Proc. SPIE Medical Imaging VI, Vol. 1652, pp. 271-282
(February, 1992, Newport Beach, CA)]. The project itself led to the development
of the system that was in 1993 named “Acustar”.
Clinical trials of the Acustar system began in 1994. It
was approved by the Food and Drug Administration for commercial use in December
1995.
In February 1993
Fitzpatrick and Maciunas attended the “Workshop on
Computer-Assisted Surgery” in Washington,
D.C., which was sponsored by
the National Science Foundation and organized by Russell Taylor and George Bekey. At that meeting every one of the five working groups
concluded independently that some form of image registration was a critical
problem. After Maciunas presented preliminary results
(four patients) on marker-based registration at Vanderbilt indicating a submillimetric level
of accuracy, it was suggested during informal discussions that the growing set
of registered images at Vanderbilt might serve as a standard for evaluating the
accuracy of less invasive image registration methods. Later that year after
receiving further encouragement from attendees and from others who had not
attended the meeting, Maciunas and Fitzpatrick
prepared a new patient consent form and obtained permission to go forward with
a proposal to the NIH to allow investigators at remote sites to access the
Vanderbilt database. In early 1994 Fitzpatrick sent letters to
many prominent researchers in the image registration field inquiring as to
their interest in taking part as investigators. To facilitate the communication
of images and other data between Vanderbilt and remote sites, it was proposed
to make use of the Internet. In May 1994 Fitzpatrick submitted the proposal with Maciunas, Benoit Dawant of the
Department of Electrical and Computer Engineering, Robert M. Kessler of the
Department of Radiology and representatives from each of the external sites as
co-investigators.
On December 1, 1994, five
days before the NIH review was received, the project began
informally (and optimistically!) with the posting of a set of “practice”
images on the Internet. Funding was in fact awarded in early 1995 with an
official start date of March 20, 1995. Communication and formatting problems
occupied much of the first few months of the project. By mid-summer few
problems remained, and by early December most sites had provided a complete set
of registrations. During the period from March to December 1995 three sites
dropped out of the project, but by January 1996 eleven remaining sites had
completed all registrations. After all registrations had been received at
Vanderbilt, the registrations submitted by each site were
compared with the marker-based registrations, and statistics on their
differences were reported in a paper presented at this same symposium (earlier
in the day of this workshop) [J. West, J. M. Fitzpatrick, M. Y. Wang, B. M. Dawant, C. R. Maurer, Jr., R. M. Kessler, R. J. Maciunas, et al., “Comparison and evaluation of
retrospective intermodality image registration techniques”, Proc. SPIE
Medical Imaging 1996, Vol. 2710, pp. 332-347 (Newport Beach, CA)].
Definitions of terms
1.
Blindedness: This term
is used here to denote the ideal that all retrospective registrations be
submitted without any knowledge of the gold standard transformations.
2.
Estimated
TRE: the distance between the image of an anatomical point under
the registration transform to be evaluated and its image under the gold
standard transformation. Note that error in the gold standard increases the
expected value of estimated TRE of each retrospective technique over its true
TRE.
3.
Fiducials: devices implanted in or attached to the patient during scanning, designed to be
easily visible in the final image. These are the basis of prospective
registration methods.
4.
Image-to-image
registration: the determination of a one-to-one mapping between
images such that identical anatomical points are mapped together.
5.
Image-to-physical
registration: the determination of a one-to-one mapping between an
object (typically a patient in the operating room) and an image of that object
such that identical anatomical points are mapped together.
6.
Investigators:
used herein to mean the participants in the registration evaluation project.
7.
Procrustes Method: named after a mythological Greek innkeeper who
mutilated his guests in order that they might better fit his beds, this is a
method of mapping one set of points onto another of the same size (i.e., same
number of points) so that the mean square distance between corresponding points
is minimized. The orthogonal
Procrustes method refers to the special case in which
the mapping is required to be rigid. This latter method is used to effect the
registration in Acustar and many other registration
techniques that rely on the approximate alignment of corresponding points.
8.
Prospective registration: image-to-image or image-to-physical registration
that requires preparation of the patient before imaging, typically the
attachment of some sort of fiducial system, such as
the stereotactic frame or fiducial
markers, as an aid in determining the mapping. Prospective techniques are
generally held to be more accurate than retrospective ones, but they also tend
to be uncomfortable and/or invasive for the patient and are applicable only for
images acquired after special preparation of the patient. The gold standard
technique (i.e., Acustar) used in the NINDS
evaluation project is prospective.
9.
Retrospective
registration: image-to-image or image-to-physical registration that
does not require preparation of the patient before
imaging or the use of fiducials, i.e., that will operate on standard images obtained without
any special preparative steps. All the techniques tested in the evaluation
project were retrospective.
10. Rigid-body transformation, or rigid-body mapping: a mapping
that preserves the distance between any pair of points, pre- and
post-transformation. For the case of neurological images, registration is
usually assumed to be a rigid body problem. The retrospective mappings were
evaluated in the NINDS project as rigid transformations.
11. Stereotactic frame: a structure, usually cage-like, which
is rigidly attached to and surrounds the patient's head. It is designed to be
clearly visible in the scanning modalities used, providing many landmark
positions for use by prospective registration techniques.
12. Target Registration Error (TRE): the distance between anatomically
identical points after a registration has been performed.
13. Volumes of Interest (VOIs): locations at
which the TRE was measured in the evaluation project in order to evaluate a
registration technique.
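As an illustration of definitions 2, 7, 10, and 12, the following Python sketch fits a rigid-body transformation to corresponding marker positions and then computes an estimated TRE at a single point. It is not the Acustar implementation, whose details are not given in this report. The least-squares rigid fit is solved here with Horn's closed-form quaternion method and a shifted power iteration, chosen only to keep the sketch free of external libraries; a singular value decomposition is the more common way to solve the orthogonal Procrustes problem. All function names, the marker coordinates, and the 1 mm offset are hypothetical.

```python
import math


def centroid(points):
    """Mean position of a list of 3-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]


def apply_rigid(rotation, translation, point):
    """Map a point through x -> R x + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def fit_rigid(source, target, iterations=500):
    """Least-squares rigid fit (rotation, translation) of source onto target.

    Assumes at least three non-collinear corresponding points.
    """
    ca, cb = centroid(source), centroid(target)
    # s[i][j] = sum over pairs of (centered source)_i * (centered target)_j
    s = [[0.0] * 3 for _ in range(3)]
    for a, b in zip(source, target):
        for i in range(3):
            for j in range(3):
                s[i][j] += (a[i] - ca[i]) * (b[j] - cb[j])
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its dominant eigenvector is the unit
    # quaternion (w, x, y, z) of the optimal rotation.
    n = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift the spectrum to be nonnegative so that power iteration
    # converges to the eigenvector of the largest eigenvalue.
    shift = sum(abs(v) for row in n for v in row)
    q = [1.0, 0.1, 0.1, 0.1]
    for _ in range(iterations):
        q = [sum(n[i][j] * q[j] for j in range(4)) + shift * q[i]
             for i in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    w, x, y, z = q
    rotation = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    translation = [
        cb[i] - sum(rotation[i][j] * ca[j] for j in range(3)) for i in range(3)
    ]
    return rotation, translation


def estimated_tre(evaluated, gold, point):
    """Distance between a point's images under two (rotation, translation) pairs."""
    p = apply_rigid(evaluated[0], evaluated[1], point)
    g = apply_rigid(gold[0], gold[1], point)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, g)))


def rotation_z(degrees):
    """Rotation matrix for a rotation about the z axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical marker positions (mm) in one image and their positions in another.
markers = [[0.0, 0.0, 0.0], [100.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 100.0]]
true_rotation, true_translation = rotation_z(10.0), [5.0, -3.0, 2.0]
moved = [apply_rigid(true_rotation, true_translation, m) for m in markers]

# Marker-based fit, standing in for the gold standard.
gold = fit_rigid(markers, moved)

# Hypothetical transformation under evaluation: off by 1 mm along x.
evaluated = (gold[0], [gold[1][0] + 1.0, gold[1][1], gold[1][2]])
voi = [40.0, 60.0, 30.0]
tre = estimated_tre(evaluated, gold, voi)
print(round(tre, 6))  # about 1.0 (mm), since only the translation differs
```

Because the two transformations in this toy case share the same rotation, the estimated TRE is the same at every point; when the rotations differ, it grows with distance from the axis of the residual rotation, which is why the error is measured at specific locations rather than quoted as a single number for the whole image.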
The open discussion
This has been arranged so that comments on the
same subject are grouped together, regardless of their chronological order
within the Workshop.
Is blindedness necessary? Was it achieved?
1.
It was generally felt that blindedness
was important in order for the presented results to be trustworthy and meaningful.
One of the participants in the project who could not be present at the workshop
had sent this comment to the chair by electronic mail: “I am much more
confident about the validity of the results knowing that the study was blinded.
With registration techniques often involving a certain degree of manual
intervention, it is too easy with such a small data set to ‘tune’ the
registration procedure based upon knowledge of the end result.”
2.
A crucial step in this study, which occupied the Vanderbilt team early on, was
the removal from the images of all traces of the fiducial
markers, which were necessary for the determination of the gold standard
transformations. There was general agreement that this step had been
successful.
3.
Serious concern was expressed, on the other hand, that some sites were allowed to
reregister images after their “final” transformations had been submitted. These
sites had uncovered errors in the conversion of their transformations to the
standard transformation tables used in the study. The chair gave a detailed
account of these exceptions. One in particular was described by the
investigator involved. In this case rotation about one axis was consistently
reversed in the conversion to the standard format. This claim was checked
independently at Vanderbilt, found to be true, and corrected. It was requested
that this process be described in more detail in the paper (referenced above).
The chair agreed to add this detail. [The co-chairmen of the program committee
for the Image Processing track of this symposium were approached after the workshop. They agreed to allow a resubmission with this
added detail. The details are present in the published paper.]
What other gold standards are there?
1.
One of the investigators referred to some recent work using
cadavers with implanted tubes as a gold standard for evaluation. This study was
also blind, as the tubes were removed from the images before application of the
techniques being evaluated.
2.
One investigator suggested that stereotactic frames might provide such a standard. Another
investigator voiced concern about registration using stereotactic
frames: evaluations of the accuracy of some of these systems suggest that
they are unsuitable for use as a gold standard.
What makes a good set of standard images?
1.
It was pointed out that all the patient datasets used
in the project were very similar. The general consensus was that it would be an
improvement if a wider range of image resolutions, acquisition parameters, etc.,
were available.
2.
A suggestion was made that a collection of images of
varied resolution could be made by acquiring high resolution images for each
patient, and subsampling these to produce lower
resolution data. It was also pointed out, however, that it would be difficult in
practice to obtain a high resolution MR T2-weighted image.
3.
A concern was raised about the difficulty of presenting
comprehensive statistics regarding registration results in the case when a
large spectrum of image types and resolutions is used. The problem is that the
total number of image registrations would be too large.
4.
The subject of evaluating the sensitivity of
registration techniques to the particular acquisition protocol was mentioned.
This led to a lengthy discussion of the ethics of “tuning” algorithms to work
better on the particular type of data present in an evaluation study. (N.B.: This concern is not to be confused with the concern
raised above about tuning based on knowledge of the end result.) It was pointed
out that several of the investigators had modified or developed
techniques during the course of the project, and opinions were sharply divided
over whether this should be permissible in a blind study.
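The subsampling suggestion in item 2 above can be sketched as follows. This Python fragment lowers the resolution of a volume by averaging non-overlapping blocks of voxels. Block averaging is only one possible choice (the workshop did not specify a method), and the function name, the volume[z][y][x] indexing, and the toy data are assumptions made for illustration.

```python
def block_average(volume, factor):
    """Reduce resolution by averaging non-overlapping factor**3 voxel blocks.

    volume is a nested list indexed as volume[z][y][x]; each dimension
    must be divisible by factor.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if nz % factor or ny % factor or nx % factor:
        raise ValueError("each dimension must be divisible by factor")
    reduced = []
    for z in range(0, nz, factor):
        plane = []
        for y in range(0, ny, factor):
            row = []
            for x in range(0, nx, factor):
                total = sum(
                    volume[z + dz][y + dy][x + dx]
                    for dz in range(factor)
                    for dy in range(factor)
                    for dx in range(factor)
                )
                row.append(total / factor ** 3)
            plane.append(row)
        reduced.append(plane)
    return reduced


# Toy 2 x 4 x 4 "high resolution" volume with voxel value x + 10*y + 100*z.
high_res = [
    [[float(x + 10 * y + 100 * z) for x in range(4)] for y in range(4)]
    for z in range(2)
]
low_res = block_average(high_res, 2)
print(low_res)  # a 1 x 2 x 2 volume
```

A volume reduced in this way is not necessarily equivalent to one acquired at the lower resolution, so results on such data would still need to be interpreted with care.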
Was patient confidentiality maintained?
1.
In this project all patient information was stripped
from the image volumes before they were placed into the Internet database.
Access to these images was then password-protected. It was generally agreed
that patient confidentiality was maintained by this procedure.
How can user dependence and mistakes be reduced?
1.
The suggestion was made that, in future, the evaluating
site might provide computer source code that would allow investigators to
reformat an image volume according to the registration transform specified in
their result submissions. Comparison of this reformatted image to that produced
by the registration algorithm itself would be a step towards ensuring that the
correct transform was in fact specified by the submission. It was noted that
there were some unexpectedly large estimated TRE values calculated for some of
the registration techniques in the project, and it was possible that these
values were due to a problem in conversion to the evaluation format rather than
to poor registration. This conversion step would not ordinarily be part of a
clinical protocol.
2.
In order to reduce user dependence, the possibility was
raised that the evaluating site do all registrations using source code provided
by the investigators.
3.
It was argued that the in-house implementation of all
registration by the evaluating site would destroy a vital advantage of the
project, the parallelism produced by many sites performing registration tasks
simultaneously.
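The axis-reversal error described earlier, and the value of the proposed reformatting check, can be illustrated with a small Python sketch. A rotation whose sign about one axis has been reversed is still a perfectly valid rigid transformation, so a validity test alone cannot detect it, whereas comparing where the two transformations send a test point exposes the discrepancy at once. The 5-degree angle, the test point, and all function names are hypothetical; the standard format used in the project is not reproduced here.

```python
import math


def rotation_x(degrees):
    """Rotation matrix for a rotation about the x axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def apply_rotation(rotation, point):
    """Map a point through x -> R x."""
    return [sum(rotation[i][j] * point[j] for j in range(3)) for i in range(3)]


def determinant(m):
    """Determinant of a 3x3 matrix."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def is_proper_rotation(m, tol=1e-9):
    """True if m is orthogonal (m m^T = I) with determinant +1."""
    for i in range(3):
        for j in range(3):
            dot = sum(m[i][k] * m[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return abs(determinant(m) - 1.0) <= tol


intended = rotation_x(5.0)     # what the registration algorithm produced
submitted = rotation_x(-5.0)   # same rotation with its sign reversed
test_point = [0.0, 80.0, 0.0]  # 80 mm from the x axis

p = apply_rotation(intended, test_point)
q = apply_rotation(submitted, test_point)
discrepancy = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Both matrices pass the validity check ...
print(is_proper_rotation(intended), is_proper_rotation(submitted))
# ... yet they disagree by about 13.94 mm at the test point.
print(round(discrepancy, 2))
```

A discrepancy of this size at a point only 80 mm from the axis is far larger than the submillimetric accuracy reported for the marker-based method, which suggests why a conversion error of this kind could account for an unexpectedly large estimated TRE.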
What's next?
1.
The NINDS project is still underway at Vanderbilt. The
chair announced that persons interested in learning of the state of the project
and/or participating in the evaluation could contact him via email at jmf@vuse.vanderbilt.edu.
2.
One of the co-chairs of the Symposium's Image
Processing track, Dr. Murray Loew, plans to organize
a second workshop to be scheduled in conjunction with the 1997 Medical Imaging
Symposium.