Figure 4. Portion of TRA_000828_2495 with a) no annotations,
b) Manual Results, c) N-M Method results (yellow = high, purple = low),
d) MBARS Results. This area exemplifies the anomalous results from the
N-M method, which was more sensitive to boulder shadows and not
boulders. An example of adjacent boulders that were correctly identified
as separate boulders by MBARS is visible in the lower left of the
image.
3.3. Measuring Boulder
Dimensions
Once a dark area is identified and defined as an individual shadow,
MBARS constructs a boulder object (an internally defined Python data
structure) and begins the morphometric analysis portion. The G-H shadow
delineation method (Golombek et al., 2008) uses an area-preserving fit,
i.e., the resultant shadow ellipse had the same area as the initial
shadow. We find that the area-preserving constraint did not improve fits
and often led to less convergent solutions in our method. Instead of
using the area preserving fit, we developed a new method based on the
expected shape of the shadow according to our boulder model (Fig. 1). As
described in section 2.3, a model boulder shadow is a portion of an
ellipse extending from the base of the boulder, but this ellipse portion
can be reflected to create a full ellipse, as shown in Fig. 1. This can
be done on shadow observations in the same way, reflecting the
anti-sunward shadow boundary across the sunward end of the shadow
re-creating the full ellipse. The algorithm then, for each shadow, fits
an ellipse to the combined original and reflected shadow boundaries
(Figs. 1,4) using Orthogonal Distance Regression (ODR) based on the
Levenberg-Marquardt procedure (Boggs et al., 1992). In practice this
technique was more effective than fitting an ellipse to a shadow
directly or attempting to fit a half-ellipse to the shadow. The fit is
not fixed to align with the sunward and sun-perpendicular axes, as tests
undertaken enforcing these constraints caused poorer fits. However,
without constraining the orientation, ellipses are closely aligned
(<1° difference) with those axes. With the shadow ellipse
described, the boulder diameter and boulder height are measured from the
sun-perpendicular and sun-parallel axes of the shadow ellipse (Figs.
1,4). Fits that return diameters or heights that are too large to likely
be a boulder (i.e., >30 m) are not discarded but are marked
as bad results. Bad results typically come from shadow casting
topography (e.g., craters, knobs) or very small objects that produce
shadows that are not well described by an ellipse. These boulder objects
are retained in order to help distinguish when an object was not
detected and when it was mis-measured. At this stage, a list of shadow
objects, each containing information about the shadow, shadow ellipse,
and boulder, is recorded for each image panel.
After an image panel is complete, the measured boulders are re-visited
by MBARS, looking for overlapping shadows. First, all boulder objects
with shadows that touch (i.e., have adjacent pixels) are considered as
possible candidates for merging and are compiled into sets of boulder
objects with touching shadows. These groups are typically chains or
clusters of adjacent boulders or single boulders that have been
incorrectly split by the watershed algorithm. MBARS first merges shadows
of all objects in the group and recalculates the boulder object,
following all the steps described above. If the shadow was fragmented
incorrectly, this new, merged shadow should provide the best solution.
Then, the algorithm uses k-means clustering to break the merged shadow
into smaller shadows (k=2,3…n) up to n , the number of
originally detected boulder objects in the group. If the group of
shadows represents some number of clustered boulders, we expect the
k-means solution where k is equal to the number of boulders to provide
the best solution. For each potential solution, the fit error (provided
by the ODR) for each boulder in the group is calculated and summed. We
use the sum, as opposed to the average, of the fit error to bias the
method towards fewer boulders. Without this bias, over splitting via
k-means would likely lead to overfit shadow solutions and most shadow
objects being broken down into the smallest possible components. The
solution with the lowest total error is taken as final and saved.
Lists of all boulder objects in an image panel are stored as.shad files, a custom filetype created using the CPicklemodule of Python that serve as the initial record of boulders
identified in the image. A second program, MBARS_ANALYSIS, has several
built-in tools to analyze the boulder morphometry results from these
original files. At the end of each run, MBARS automatically compiles the
boulders detected in each image panel into a single table, containing
the essential boulder data and coordinates.
3.4 MBARS Output and
Calibration
MBARS outputs GIS-ready data tables which describe each boulder object
in the analyzed image. To save space, only the most important parameters
of each boulder object are included in this table. The imagevalue records the partition of the original image in which the boulder
is located. The flag value (as discussed in 2.4.2) is a unique ID
to the boulder within its partition. The xloc and ylocvalues give the x- and y- location of the boulder in meters Easting and
Northing in the projection of the HiRISE image. The bouldwid ,bouldheight , and shadlen values give the boulder diameter
and boulder height in meters and the shadow length in pixels. The last
four parameter are related to the results of the ellipse-fitting
process: measured, fitgood, and fiterr record
whether the boulder was measured by the algorithm, the quality of the
fit of the ellipse to the shadow boundary, and the error on the ODR fit
respectively.
By design, MBARS does not strongly filter the results except where
choosing not to filter causes computational concerns (e.g., large
boulders crashing the fitting algorithm). As a result, many
shadow-casting objects in the final results need to be removed for
accurate statistical assessment of boulder populations. Objects withfitgood =0 are potentially problematic objects, andfitgood =1 suggests a confident boulder detection. Thefitgood parameter can be flagged by several different conditions
within MBARS, such as exceeding the maximum defined boulder width or
height (30 m), exceeding maximum shadow area (3000 pixels), or having a
non-convergent shadow fit. Boulders are exported into two files:FILENAME_Clean_boulderdata andFILENAME_All_boulderdata . The All file contains all
boulder records, but the Clean data table only has the boulders
for which fitgood=1. The problematic objects are preserved to help
the user check for false negatives and adjust running parameters as
needed, as non-convergent fits can indicate poorly set running
parameters. After importing the GIS-ready table to ArcGIS, the boulder
results can be projected onto the image (using the “Display XY Data”
function in ArcGIS) and analyzed further.
For each test area (Section 2.2) we calculate the CFA and best fit RA
for both the manual data and the MBARS results, and the MBARS result
which best matches the manually derived RA is chosen as final. Previous
work on similar algorithms found that they operated best at boulder
diameters between 1.5 m and 2.25 m (Golombek et al., 2012). Following
these findings, we calculate the best-fit RA for boulders in the test
areas only between 1.5 and 2.5 m (Fig 5). Fig. 5a. demonstrates the
comparison of several MBARS outputs and a set of manual results within
test areas in HiRISE image TRA_000828_2495. In general, as the
boundary parameter is decreased, shadows become smaller due to more
restrictive filtering, and interpreted RA decreases. Fig. 5b. shows how
this change in running parameters is expressed on individual boulders,
showing that the most conservative and most generous interpretations of
a boulder are often within 1-2 pixels of one another. In some cases, in
order to match manual observations, boulders were further filtered
according to their height-to-diameter ratio, removing objects with
ratios above 0.75 (tall objects) or less than 0.25 (flat objects) in
accordance with the standard deviation of h/D observed on the moon
(Demidov & Basilevsky, 2014).