Figure 4. Portion of TRA_000828_2495 with a) no annotations, b) Manual Results, c) N-M Method results (yellow = high, purple = low), d) MBARS Results. This area exemplifies the anomalous results from the N-M method, which was more sensitive to boulder shadows and not boulders. An example of adjacent boulders that were correctly identified as separate boulders by MBARS is visible in the lower left of the image.

3.3. Measuring Boulder Dimensions

Once a dark area is identified and defined as an individual shadow, MBARS constructs a boulder object (an internally defined Python data structure) and begins the morphometric analysis portion. The G-H shadow delineation method (Golombek et al., 2008) uses an area-preserving fit, i.e., the resultant shadow ellipse had the same area as the initial shadow. We find that the area-preserving constraint did not improve fits and often led to less convergent solutions in our method. Instead of using the area preserving fit, we developed a new method based on the expected shape of the shadow according to our boulder model (Fig. 1). As described in section 2.3, a model boulder shadow is a portion of an ellipse extending from the base of the boulder, but this ellipse portion can be reflected to create a full ellipse, as shown in Fig. 1. This can be done on shadow observations in the same way, reflecting the anti-sunward shadow boundary across the sunward end of the shadow re-creating the full ellipse. The algorithm then, for each shadow, fits an ellipse to the combined original and reflected shadow boundaries (Figs. 1,4) using Orthogonal Distance Regression (ODR) based on the Levenberg-Marquardt procedure (Boggs et al., 1992). In practice this technique was more effective than fitting an ellipse to a shadow directly or attempting to fit a half-ellipse to the shadow. The fit is not fixed to align with the sunward and sun-perpendicular axes, as tests undertaken enforcing these constraints caused poorer fits. However, without constraining the orientation, ellipses are closely aligned (<1° difference) with those axes. With the shadow ellipse described, the boulder diameter and boulder height are measured from the sun-perpendicular and sun-parallel axes of the shadow ellipse (Figs. 1,4). Fits that return diameters or heights that are too large to likely be a boulder (i.e., >30 m) are not discarded but are marked as bad results. Bad results typically come from shadow casting topography (e.g., craters, knobs) or very small objects that produce shadows that are not well described by an ellipse. These boulder objects are retained in order to help distinguish when an object was not detected and when it was mis-measured. At this stage, a list of shadow objects, each containing information about the shadow, shadow ellipse, and boulder, is recorded for each image panel.
After an image panel is complete, the measured boulders are re-visited by MBARS, looking for overlapping shadows. First, all boulder objects with shadows that touch (i.e., have adjacent pixels) are considered as possible candidates for merging and are compiled into sets of boulder objects with touching shadows. These groups are typically chains or clusters of adjacent boulders or single boulders that have been incorrectly split by the watershed algorithm. MBARS first merges shadows of all objects in the group and recalculates the boulder object, following all the steps described above. If the shadow was fragmented incorrectly, this new, merged shadow should provide the best solution. Then, the algorithm uses k-means clustering to break the merged shadow into smaller shadows (k=2,3…n) up to n , the number of originally detected boulder objects in the group. If the group of shadows represents some number of clustered boulders, we expect the k-means solution where k is equal to the number of boulders to provide the best solution. For each potential solution, the fit error (provided by the ODR) for each boulder in the group is calculated and summed. We use the sum, as opposed to the average, of the fit error to bias the method towards fewer boulders. Without this bias, over splitting via k-means would likely lead to overfit shadow solutions and most shadow objects being broken down into the smallest possible components. The solution with the lowest total error is taken as final and saved.
Lists of all boulder objects in an image panel are stored as.shad files, a custom filetype created using the CPicklemodule of Python that serve as the initial record of boulders identified in the image. A second program, MBARS_ANALYSIS, has several built-in tools to analyze the boulder morphometry results from these original files. At the end of each run, MBARS automatically compiles the boulders detected in each image panel into a single table, containing the essential boulder data and coordinates.

3.4 MBARS Output and Calibration

MBARS outputs GIS-ready data tables which describe each boulder object in the analyzed image. To save space, only the most important parameters of each boulder object are included in this table. The imagevalue records the partition of the original image in which the boulder is located. The flag value (as discussed in 2.4.2) is a unique ID to the boulder within its partition. The xloc and ylocvalues give the x- and y- location of the boulder in meters Easting and Northing in the projection of the HiRISE image. The bouldwid ,bouldheight , and shadlen values give the boulder diameter and boulder height in meters and the shadow length in pixels. The last four parameter are related to the results of the ellipse-fitting process: measured, fitgood, and fiterr record whether the boulder was measured by the algorithm, the quality of the fit of the ellipse to the shadow boundary, and the error on the ODR fit respectively.
By design, MBARS does not strongly filter the results except where choosing not to filter causes computational concerns (e.g., large boulders crashing the fitting algorithm). As a result, many shadow-casting objects in the final results need to be removed for accurate statistical assessment of boulder populations. Objects withfitgood =0 are potentially problematic objects, andfitgood =1 suggests a confident boulder detection. Thefitgood parameter can be flagged by several different conditions within MBARS, such as exceeding the maximum defined boulder width or height (30 m), exceeding maximum shadow area (3000 pixels), or having a non-convergent shadow fit. Boulders are exported into two files:FILENAME_Clean_boulderdata andFILENAME_All_boulderdata . The All file contains all boulder records, but the Clean data table only has the boulders for which fitgood=1. The problematic objects are preserved to help the user check for false negatives and adjust running parameters as needed, as non-convergent fits can indicate poorly set running parameters. After importing the GIS-ready table to ArcGIS, the boulder results can be projected onto the image (using the “Display XY Data” function in ArcGIS) and analyzed further.
For each test area (Section 2.2) we calculate the CFA and best fit RA for both the manual data and the MBARS results, and the MBARS result which best matches the manually derived RA is chosen as final. Previous work on similar algorithms found that they operated best at boulder diameters between 1.5 m and 2.25 m (Golombek et al., 2012). Following these findings, we calculate the best-fit RA for boulders in the test areas only between 1.5 and 2.5 m (Fig 5). Fig. 5a. demonstrates the comparison of several MBARS outputs and a set of manual results within test areas in HiRISE image TRA_000828_2495. In general, as the boundary parameter is decreased, shadows become smaller due to more restrictive filtering, and interpreted RA decreases. Fig. 5b. shows how this change in running parameters is expressed on individual boulders, showing that the most conservative and most generous interpretations of a boulder are often within 1-2 pixels of one another. In some cases, in order to match manual observations, boulders were further filtered according to their height-to-diameter ratio, removing objects with ratios above 0.75 (tall objects) or less than 0.25 (flat objects) in accordance with the standard deviation of h/D observed on the moon (Demidov & Basilevsky, 2014).