Structure Tensor Image Filtering using Riemannian L_1 and L_ ∞ Center-of-Mass

Structure tensor images are obtained by a Gaussian smoothing of the dyadic product of gradient image. These images give at each pixel a n × n symmetric positive deﬁnite matrix SPD ( n ) , representing the local orientation and the edge information. Processing such images requires appropriate algorithms working on the Riemannian manifold on the SPD ( n ) matrices. This contribution deals with structure tensor image ﬁltering based on L p geometric averaging. In particular, L 1 center-of-mass (Riemannian median or Fermat-Weber point) and L ∞ center-of-mass (Riemannian circumcenter) can be obtained for structure tensors using recently proposed algorithms. Our contribution in this paper is to study the interest of L 1 and L ∞ Riemannian estimators for structure tensor image processing. In particular, we compare both for two image analysis tasks: (i) structure tensor image denoising; (ii) anomaly detection in structure tensor images.


INTRODUCTION
Given a 2D scalar image u : Ω ⊂ Z 2 → R, the associated structure tensor image represents the local orientation and the edge information of u(x, y) (Förstner and Gülch, 1987;Knutsson, 1989). More precisely, structure tensor image f (x, y) is just a regularization of the first fundamental form of image u(x, y). Hence, it involves a simple computation based on first derivatives of u(x, y), followed by a Gaussian smoothing of the dyadic product ∇u∇u T : u(x, y) → f (x, y) = ω σ * ∇u(x, y)∇u(x, y) T = is the 2D spatial intensity gradient and ω σ stands for a Gaussian smoothing with a standard deviation σ .
We note that f (x, y) can be understood as the local covariance matrix of the set of gradient vector around point (x, y). We should remark also that if σ is very small, f (x, y) is a rank-1 tensor at any point.
For the sake of simplicity, we focuss here on the case of 2D gray-level images; however, structure tensor can be easily extended to 3D images, including color and multispectral-valued ones. By its robustness against illumination changes as well as invariance to some geometric image transformations, structure tensor is a versatile method used frequently in computer vision for corner detection, optical flow estimation, segmentation, stereo matching, etc.
Structure tensor images are just an example of the so called tensor-valued images; namely a spatial structured matrix field f : Ω −→ SPD (n) where the support space is Ω ⊂ Z 2 , Z 3 and SPD(n) is the space of (real) n × n symmetric positive definite matrices. Besides structure tensor, SPD(n)-valued images appear nowadays in various image processing fields and applications, for instance in diffusion tensor magnetic resonance imaging (DT-MRI) (Basser et al., 1994) or in radar imaging based on covariance matrix estimation (Barbaresco, 2011). Fig. 1 gives an example of structure tensor image, where the SPD(2) element at each pixel is depicted by the corresponding ellipse of semi-axis 1/ √ λ 1 and 1/ √ λ 2 , where λ 1 and λ 2 , λ 1 ≥ λ 2 , are the eigenvalues of the matrix and the ellipse orientation represents their corresponding eigenvectors e 1 and e 2 . As it is shown, the tensor structure information can be also visualized by the image of tensor energy (i.e., sum of eigenvalues) and the image of anisotropy (i.e., ratio of eigenvalues). Motivation: Riemannian representation of structure tensor images The value at each pixel of a structure tensor image can be viewed as a point belonging to the non-Euclidean space underlying SPD(n) (Bhatia, 2007), which is just the interior of a convex cone, with apex 0 and boundary consists of all rank-deficient symmetric semi-positive definite matrices. In order to visualize this cone, let us consider the case of a tensor M ∈ SPD(2), i.e., such that a, b ≥ 0 and ab − c 2 > 0. The cone of SPD(n) is a differentiable manifold endowed with a Riemannian structure, where the base point-dependent inner product is defined by A, B P = tr P −1 AP −1 B . This inner product leads to a natural Riemannian metric on SPD(n) whose line element is Note that this metric it invariant under congruent transformations, i.e., P → LPL T , and inversion, i.e., P → P −1 . From the metric, the unique geodesic parameterized by the length, t → γ(t), joining two elements P, Q ∈ SPD(n), is defined as where γ(0) = P and γ(1) = Q. Similarly, the geodesic (metric length) distance between P, Q ∈ SPD(n) is given by The tangent space of a manifold M at a point p ∈ M is denoted by T p M . Let (M , g) be a Riemannian manifold, where the Riemannian metric g on M is a family of (positive definite) inner products, ·, · p : T p M ×T p M → R, which varies smoothly with respect to p ∈ M . The notion of exponential and logarithmic maps are extremely powerful notions in Riemannian manifolds, see diagram in Fig. 2(a). The exponential operator Exp p maps a point of T p M into a point in M . The exponential map is injective on a zero-centered ball B in T p M of some non-zero (possibly infinity) radius. Thus for a point q in the image of B under Exp p there exists a unique vector v ∈ T p M corresponding to a minimal length path under the exponential map from p to q. Exponential maps may be associated to a manifold by the help of geodesic curves. The exponential map Exp p : T p M → M associated to any geodesic γ v emanating from p with tangent at the origin v ∈ T p M is defined as Exp p (v) = γ v (1). The geodesic has constant speed equal to dγ v /dt (t) = v , and thus the exponential map preserves distances for the initial point: d(p, Exp p (v)) = v . The inverse operator, named logarithm map, Exp −1 p = Log p maps a point of q ∈ M into to their associated tangent vectors v ∈ T p M . Thus for a point q in the domain of Log p the geodesic distance between p and q is given by d(p, q) = Log p (q) .
Exponential map and its inverse map from the cone of SPD(n) onto the vector tangent space T P SPD(n) at a given matrix P are respectively defined in a closed form as (Moakher, 2005;Fletcher et al., 2009;Fiori and Toshihisa, 2009) Log P : Note that it is assumed that the tangent space to SPD(n) at P is identified to the linear vector space associated to Sym(n) (n × n symmetric matrices), i.e., T P SPD(n) ∼ = Sym(n).

Aim: Riemannian averaging of structure tensor images
In this context, the goal of this work is to show how to process tensor-valued images using algorithms based on the Riemannian nature of SPD(n). Filtering tensor images f ∈ F (Ω, SPD(n)) involves in our case the need of a method to compute the averaging a set of samples in SPD(n). Or more formally, let {A i } N i=1 be a finite set of N SPD(n) matrices, our aim is to compute their Riemannian L p center-of-mass.
This setting is a particular case of the problem of L p averaging a discrete set of sample points in a Riemannian manifold, see diagram in Fig. 2(b). Let M be a Riemannian manifold and let d(x, y) be the Riemannian distance function on M . Given N points x 1 , x 2 , · · · , x N ∈ M and the corresponding positive real weights α 1 , α 2 , · · · , α N , with ∑ 1≤i≤N α i = 1, the Riemannian L p center of mass, with p ∈ [1, +∞), is defined as the minimizer of the sum of p powered distances function General definition Eq. 6 includes two cases of well known Riemannian statistics. The classical geometric mean (Karcher-Fréchet barycenter) is the minimizer of the sum-of-squared distances function: and the geometric median (Fermat-Weber point) is the minimizer of sum-of-distances function: Additionally, the particular case p = +∞, known as Riemannian circumcenter (or 1-center or minimax center), corresponds to the minimizer of max-ofdistances function: To have an appropriate definition of Riemannian center-of-mass, it should be assumed that the points x i ∈ M lie in a convex set U ∈ M , i.e., any two points in U are connected by a unique shortest geodesic lying entirely in U. The diameter of U, denoted diam(U), is the maximal distance between any two points in U. We notice that the squared geodesic distance function and the geodesic distance function in U are convex. Existence and uniqueness of geometric mean Eq. 7 and geometric median Eq. 8 have been widely considered: both exist and are unique if the sectional curvatures of M are nonpositive, or if the sectional curvatures of M are bounded above by ∆ > 0 and diam(U) < π/(2 √ ∆) (Karcher, 1977;Kendall, 1984;Fletcher et al., 2009). More recently, the existence and uniqueness for the Riemannian L p center of mass, 1 ≤ p ≤ ∞ have been studied in (Afsari, 2010). We can also find more recent results on existence and uniqueness, including also practical algorithms for L 2 (Bhattacharya and Patrangenaru, 2003;Le, 2004), for L 1 (Fletcher et al., 2009;Yang, 2010), for general L p (Afsari, 2010;Afsari et al., 2011) and for L ∞ (Arnaudon and Nielsen, 2013). We can mention also some results on stochastic algorithms (avoiding to compute the gradient to minimize) (Arnaudon et al., 2011;Bonnabel, 2012).

Related work
Our contribution here is to study the interest of L 1 and L ∞ Riemannian center-of-mass for structure tensor image processing. In particular, we compare both estimators for two image analysis tasks: (i) structure tensor image denoising; (ii) anomaly detection in structure tensor images.
There are several proposals in the literature that intend to process the tensor images obtained from structure tensor computation. Tensor filtering can be achieved by PDE's approaches (Tschumperlé and Deriche, 2002) or by frequency filtering techniques (Larrey-Ruiz et al., 2006). Extension of diffusion filtering for matrix-valued images has been also widely studied in the literature (Burgeth et al., 2007). The latter approach is also related to the discrete counterpart which involves the computation of local adaptive neighborhood filters for matrix fields (Pizarro et al., 2008).
From an alternative viewpoint, regularization is done intrinsically in the structure tensor computation. The underlying idea consist in locally adapting the Gaussian kernel (Nagel and Gehrke, 1998), defining another adaptive shaped filter (Köthe, 2003), or other adaptive tensor computation (Brox et al., 2005). The notion of nonlinear structure tensor involves to replace the Gaussian smoothing of the classical structure tensor by a discontinuity preserving nonlinear diffusion (Brox et al., 2006).
Up to best of our knowledge, Riemannian L p center-of-mass has not been previously used as a filtering approach for structure tensor images.

Paper organization
The rest of the paper is organized as follows. In Methods Section are presented the algorithms for computing L 1 and L ∞ Riemannian center-ofmass of a set of tensors, which are the basic ingredients for regularization and enhancement of structure tensor images. All details for implementing those algorithms are given. Results Section discusses the performance L 1 and L ∞ Riemannian structure tensor image processing for the problems of denoising and anomaly detection. In the case of denoising, the comparison includes quantitative assessment. The case of anomaly detection deals exclusively with a qualitative comparison. Finally, Section on conclusions and perspectives closes the paper.

METHODS
We discuss in this section the algorithms for computing L 1 and L ∞ Riemannian center-of-mass of a set of tensors which are the basic ingredients for filtering the structure tensor images. We start by a remind on the L 2 case, which is used as a baseline for comparison with the other estimators.

MEAN OF SPD(n) MATRICES
Given a manifold M , the Fréchet-Karcher flow (Fréchet, 1948) (Karcher, 1977) is an intrinsic gradient flow on M that converges to the L 2 centerof-mass, called Fréchet-Karcher barycenter. In the discrete case, the L 2 center of mass for a finite set of N points on M is given by the iterative algorithm where Exp µ (·) is the exponential map and Log µ (a) ∈ T µ M is the tangent vector at µ ∈ M of the geodesic from µ to a; and where β > 0 is the step parameter of the gradient descent.
Using the expressions of exponential Eq. 4 and logarithmic Eq. 5 maps of tensors, the geometric mean of a set , can be computed by the following Fréchet-Karcher gradient flow where β > 0 is the step parameter of the gradient descent. We can typically use a constant step-size, i.e., it is fixed to β = 1 N for any k. In other to guarantee a fast convergence of the algorithm Eq. 10 to the unique minimum, it is useful to have an initialization close to the final average. Hence, we propose the initialization to the arithmetic mean tensor.
The Fermat-Weber point, or geometric median Eq. 8, can be also particularized to tensors. Indeed, for any Riemannian manifold M , the gradient of the Riemannian sum-of-distances function is given by With this result, the classical Weiszfeld-Ostresh algorithm (Weiszfeld, 1937;Ostresh, 1978) for iteratively computing the median was extended in (Fletcher et al., 2009) to Riemannian manifolds as: N] : m k = x i . Now, by straightforward substitution of expressions Eq. 4 and Eq. 5, one obtains the geometric median of a finite set {A i } N i=1 of N SPD(n) matrices, and weights {w i } N i=1 , using the Riemannian Weiszfeld-Ostresh algorithm as follows (11) where N k = {n ∈ [1, N] : A k = A n } and 0 ≤ β ≤ 2. The step size is fixed to β = 1.
It was proven in (Fletcher et al., 2009) that the Riemannian Weiszfeld-Ostresh algorithm converges to the geometric median lim k→∞ m k = m in the case of a negatively curved manifold as SPD(n) if 0 ≤ β ≤ 2 and if the set of points is not too dispersed. Even in the case of very spread data, we have observed as suggested in Fletcher et al. (2009) that for β = 1 the convergence is always obtained in SPD(n). Similarly to the case of Euclidean L p center-ofmass, the Riemannian median is theoretically a more robust estimator than the Riemannian mean. More formally, the robustness is related to the notion of breakdown point (Lopuhaä and Rousseeuw, 1991). The finite sample breakdown point of an estimator is the fraction of data that can be given arbitrary values without making the estimator arbitrary bad: minimal proportion of data that can be corrupted before the statistic becomes unbounded. Let X = (x 1 , x 2 , · · · , x N ), x i ∈ R d , the breakdown point of an estimator φ is defined as where D is a metric on the estimator space, the set Y k contains (n − k) points for the set X and k arbitrary points from R d . Typically this is some function of the sample size N. Let U be a convex subset of M with diam(U) < +∞, and let X = x 1 , x 2 , · · · , x N be a collection of points in U. Then, the Riemannian median has a breakdown point (Fletcher et al., 2009): which means that half of the data needs to be corrupted in order to corrupt this estimator. It should be compared with the breakdown point of the Riemannian mean ε * (µ, X) = 1/N.
Lets us give a first illustration of the notion of robustness. Fig. 3 depicts two examples of the comparison of L p Riemannian center-of-mass in SPD(2). In Fig. 3a, the Riemannian mean (in green) and the Riemannian median (in red) are computed from two tensors. Compare now the results of both with those obtained with the set of four tensors given in Fig. 3b. One of the two new tensors is an outlier with respect to the three others. We observe how the Riemannian median is less deformed by the outliers than Riemannian mean.

L ∞ RIEMANNIAN CENTER-OF-MASS OF SPD(n) MATRICES
Given a discrete set of N samples x 1 , x 2 , · · · , x N , with each x i ∈ R n , the circumcenter (Sylvester point or 1-center or minimax center) is defined as and corresponds to find the unique smallest enclosing ball in R n that contains all the given points.
Computing the smallest enclosing ball in Euclidean spaces is intractable in high dimension, but efficient approximation algorithms have been proposed. The Bȃdoiu and Clarkson (2003) algorithm leads to a fast and simple approximation (of known precision ε after a given number of iterations 1 ε 2 using the notion of core-set, but independent of dimensionality n): Initialize the minimax center c 1 ∞ with an arbitrary point of {x i } 1≤i≤N , then iteratively update the center where f k is the farthest point of set {x i } 1≤i≤N to c k ∞ . For the case L ∞ Riemannian center-of-mass (minimum enclosing ball) there is no canonical algorithms which generalizes the gradient descent algorithms considered for p ∈ [1, ∞) In a recent work by Arnaudon and Nielsen (2013), it has been introduced an extended version of the Euclidean algorithm (Bȃdoiu and Clarkson, 2003) for circumcenter in Riemannian manifolds. Let us consider a discrete set {x i } N i=1 ⊂ M on a manifold M . -Initialize the centerx ∞ with a point of set, i.e., x 1 ∞ = x 1 ; -Iteratively update the current minimax center as where f i denotes the farthest point of the set tō x k ∞ , and Geodesic(p, q,t) denotes the intermediate point m on the geodesic passing through p and q such that dist(p, m) = tdist(p, q).
The convergence of this algorithm in nonpositive sectional curvature manifolds as SPD(n) is guaranteed (Arnaudon and Nielsen, 2013). The geometric circumcenter of a finite set {A i } N i=1 of N SPD(n) matrices can be computed using the closed expression of the geodesic Eq. 2 and the distance Eq. 3 as the following instantiation of Arnaudon-Nielsen algorithm: -Initialization:Ā 1 = A 1 ; -Iteratively update 1. Obtain the farthest SPD(n) matrix to the current estimate: 2. Compute geodesic distance from current center estimation to farthest point: 3. Find the cut of the geodesic at a value t = 1 1+k , which gives the SPD(n) matrixĀ k+1 , so that We note that the complexity of this iterative algorithm is a little bigger than for the gradient descent algorithms of the Riemannian mean and median. Nevertheless, all these algorithms only require a few number of SPD(n) matrix operations: product, inversion, power to a real number, matrix exponential and matrix logarithm, which are available in many scientific computing languages.
One can compare in Fig. 3 the Riemannian circumcenter (in black) with respect to the Riemannian mean or median for both sets of tensors. We should remark that the L ∞ corresponds geometrically to the center of the minimal enclosing geodesic ball which contains the tensors and hence, by definition, is very sensitive to outliers. In fact, it can be seen as an average between the extreme and distant tensors.

RESULTS
Having an algorithm to compute the L p center-ofmass of a set of SPD(n) matrices, it can be naturally used for filtering structure tensor images f (x, y) ∈ F (Ω, SPD(2)) by simply computing the average in local neighborhood associated to each pixel (x, y) of the image, i.e., where W (x, y) is the set of pixels belonging to the window W centered at pixel (x, y), such that W is typically a square window. Each pixel neighborhood is processed independently of the others and consequently that can be done in parallel. Figure 5. From (a) to (d), corresponding images of tensor energy from images of Fig. 4; from (e) to (h), corresponding images of tensor anisotropy from images of Fig. 4.
A comparison of L 2 , L 1 and L ∞ Riemannian structure tensor image filtering is shown in Fig. 4, using the same window W = 3 × 3 pixels for the three cases. Obviously, by computing the center-ofmass is a such small neighborhood (i.e., only nine tensors are averaged), the obtained structure tensor images are quite similar. However, if we compare the corresponding tensor energy and tensor anisotropy images, depicted in Fig. 5, we observe that the three estimators have a different behavior in terms of image filtering. As classically in image processing, the median-based filter produces less blurring effect than the mean-based one. That involves a better edge preserving regularization. The result of the circumcenter can be compared with the effect of morphological filters, in the sense that there is neither blurring nor edge deformation but a suppression of structures smaller than the filtering window and an enhancement of those bigger than W .

STRUCTURE TENSOR IMAGE DENOISING
We consider now the performance of Riemannian filtering for structure tensor denoising. Given a tensor image f (x, y) ∈ F (Ω, SPD(2)), we first simulate a new tensor image f (x, y) by adding noise. We have considered two sets of experiments according to the type of simulated noise.
-"Gaussian" noise: It is obtained by simulating a decoupled componentwise i.i.d. Gaussian noise for the eigenvalues of each SPD(2) pixel value, the corresponding σ being a percentage of the dynamic range of eigenvalues and µ equals the empirical mean of the eigenvalues. In addition, for each SPD(2) pixel, a random rotation according to Gaussian distribution µ = 0 and σ is also included.
-Impulse noise: The simulation mechanism involves replacing the pixel values by an outlier tensor with a given probability Pr.
Then, simulated image f is denoised by Riemannian L p averaging Aver W,L p ( f ).
In order to quantitatively assess the denoising effect of the different estimators, we introduce the notion of Mean Riemannian Error (MRE), defined as which is basically the Riemannian tensor distance between the pixel of original unnoisy image and the pixel of denoised image, averaged for all the image pixels. Table 1 summarizes the denoising performance by Aver W,L p ( f ) (W = 3 × 3) for the Riemannian mean, median and circumcenter. We have also included the results for the simple arithmetic mean of SPD(2) matrices, denoted as Euclidean L 2 . For the "Gaussian" noise we have consider four values of σ : 1%, 5%, 10% and 50% and also four probabilities Pr for the impulse noise: 0.01, 0.05, 0.1 and 0.5. The values of MRE correspond to the average of ten realizations from the image test.
(a) f (x, y) ∈ F (Ω, SPD (2)) The results for the "Gaussian" noise are a bit surprising. We have, as expected, that the Riemannian mean performs better than the Riemannian median (expect for very low noise). However, the Riemannian circumcenter performs better than both, and the difference is particularly significant for hight levels of noise. We explain this effect by considering the fact that in kind of noise, the corrupted tensors are evenly distributed around the original tensors and consequently, an estimate based on the center of minimal enclosing geodesic ball is rather steady with respect to the level of noise. Obviously, in case of uneven distribute noise, we can expect a bad behavior of the L ∞ estimator. For the case of the impulse noise, besides the quantitative results given in Table 1, we have also included some images in Fig. 6. As we observe, this kind of outlier-based impulse noise is appropriate to state the robustness of L 1 against the other L p estimators. Before the breakdown point, which corresponds here to Pr ≥ 0.5, the Riemannian median filter yields a significant better performance. Then, for extremely noise situations, the Riemannian circumcenter produces a better estimate.
Before closing this study of structure tensor denoising, we should point out that the noise has been added to the structure tensor images. Such a problem is different from the case where the noise is present in the initial gray-level image. In fact, smoothing effect during the computation of the structure tensor helps to deal with this scalar noise.

ANOMALY DETECTION IN STRUCTURE TENSOR IMAGES
Let us consider the original images (a) in Fig. 7 and Fig. 8. Both cases correspond to a regular texture including a zone of irregularity. Defects in such regular textures can be detected as anomaly areas according to the structure tensor. The goal of these experiments is to show how an image f (x, y) ∈ F (Ω, SPD(2)) associated to a regular texture can be processed by Riemannian center-of-mass averaging Aver W,L p ( f ) in order to enhance the potential defect areas.
The first case study given in Fig. 7 corresponds to anisotropy anomaly detection. First of all, one needs to choose the parameter σ for the Gaussian smoothing of gradient required in the structure tensor computation. It typically depends on the scale of the regular pattern of the texture; here we fix σ = 15. Then, one should select the size of the window for averaging; here W = 7 × 7. The latter parameter is related to the scale of the irregular zone. In the Figure are compared the result of Aver W,L p ( f ) for the Riemannian median and Riemannian circumcenter, together with the corresponding tensor anisotropy images. As we can observe, the L 1 estimator produces a strong enhancement of the defect by "rounding" the corresponding area. As the window is a square, the L ∞ estimator regularizes according to this geometry with a better adjustment to the underlying zone of irregularity.
(a) u(x, y) Fig. 8 provides a second case study. Anisotropy is not significantly degraded in this example, but the gradient magnitude is lowered; which consequently involves a scenario for tensor energy anomaly detection. We have considered the same scale parameters than in the previous example. As we can observe from the results, the behavior of the Riemannian median and Riemannian circumcenter are similar to the previous case.
Finally, we note that a boundary effect appears in Figs. 7 and 8 (e.g., left in Fig. 7(g), bottom in Fig. 8(f) or top in Fig. 8(g)) due to the fact that the regular texture is cropped in a bounded window. As usually in these cases, one needs to remove from the analysis an image border of size equal to the filter size.

CONCLUSIONS
Riemannian averaging is a mathematically sound and useful tool for processing structure tensor images. Geometric median for tensor image filtering inherits properties of scalar median image filtering: robustness against impulse noise and rounding structure effect. The latter is related to the mean curvature motion (Guichard and Morel, 2003). Riemannian circumcenter is potentially relevant for very noisy images and it produces a limited blurring effect (boundaries are not shifted). For the particular problem of anomaly enhancement/detection, instead of processing spectral information (tensor energy and tensor anisotropy) from original structure tensor image, it seems more useful to first L 1 /L ∞ processing the structure tensor image and then to use spectral information of processed images.
From a computational viewpoint, both iterative algorithms have a complexity which depends linearly on the number of pixels as well as the size of the averaging window. In order to have fast algorithms, an efficient implementation of the exponential and logarithm of symmetric positive definite matrices is required. More precisely, only an efficient implementation of classical linear algebra tools is needed. Faster algorithms can be based on an approximated version of the definition, founded on the fact that when the averaging image window moves from one pixel to one of its neighbors only a limited number of new tensors are involved, with respect the set of tensors averaged in the previous pixel.
In this study we have only considered tensor filtering by a fixed kernel averaging for all the image neighborhoods. Obviously, the use of adaptive kernels, such as it is done in (Burgeth et al., 2007) for Euclidean tensor averaging, would improve the results in terms of object edge preserving. The algorithm for L 1 estimator can be used for this purpose by considering weights which represent the adaptivity, associated typically to bilateral kernels. That has been done for quaternionvalued images in (Angulo, 2013).
Formulation of morphological operators for structure tensor images has been considered in (Angulo, 2012) under different frameworks. However, work on structure tensor morphology should be pursued in order to fully exploit the Riemannian structure of such images.