Hole Filling in a 3D Scan using PGM

Avnath
18 min read · Oct 11, 2020


Abstract: This article outlines ways to fill voids in a partial 3D scan using Probabilistic Graphical Model principles and Markov network potentials. It also outlines possible applications of this technique to missing data in IoT applications, forensic anthropology, prenatal diagnosis and decision making.

Keywords: Partial hole filling, Markov network, Correlated Correspondence algorithm, Anthropology, Forensic science, Occlusion

1. Background

Around two years ago I tried to get myself 3D-fied by creating a model of myself using photogrammetry. It's a technique that merges photographs to generate a continuous image, typically used in space photography to generate large maps.

The pictures were taken by a camera placed on a tripod, and in the process it missed the top of my head, leaving a hole in the scan. This was probably due to a lack of camera coverage or poor placement, but holes can be caused by many other things, such as occlusion (recesses too deep to be observed), low reflectance and missing views.

There is a hole visible in my 3D scan

From that day onwards I was on a quest to fill that hole, and it eventually led me to a way of doing so using probabilistic graphical models. It took me to a thesis, Learning Models of Shape from 3D Range Data by Dragomir Anguelov, which gives a detailed description of how this was achieved, among other things. I have tried to explain it in simple steps, with a focus on hole filling.

The answer lies in Bayesian learning techniques, which let us reconstruct the missing data and translate it into tangible information. Bayesian learning and inference can further be used with Markov random fields to compute potentials over adjacent groups of variables, called cliques. With some added parameters, the same concepts can be leveraged to fill the holes in a 3D scan. Let me start from the basics.

2. Non-Rigid Iterative Closest Point Algorithm

Dragomir Anguelov presented a non-rigid Iterative Closest Point (ICP) algorithm to address the problem of non-rigid 3D registration: finding point-to-point correspondences between two deforming surfaces in an unsupervised manner. Non-rigid registration is an essential capability for example-based learning of deformable models. The examples are usually meshes or point clouds produced by a 3D scanner that captures the deformable shapes from which we derive a model. These meshes are often missing parts of the surface, usually have different topologies, and capture a variety of shapes in different configurations. The models are demonstrated to give reasonable results even in the absence of prior knowledge about object deformation and initial surface alignment.

Here, we consider a model mesh MX, which is assumed to be a complete model of the surface of the object. We also have a scan mesh MZ, acquired using a 3D range scanner. The scan mesh can be a complete or partial model of the surface, and may have defects due to occlusion or low reflectance.

The model mesh is first deformed using a non-rigid transformation Θ. This transformation places the points and polygons of the mesh in a new configuration, producing the transformed mesh

MY = Θ(MX). The points VY of the transformed mesh are resampled to generate the scan mesh points VZ. The resampling process is guided by a set of correspondence variables C. Each point zk in the scan mesh MZ is associated with a correspondence variable ck, which has a discrete domain containing all model point indices {1, 2, …, NX}. Setting ck = i picks point yi to generate scan point zk.

This generative model of registration is fully specified when we define probability distributions over Θ and the correspondences C. First, we need to assign a penalty for deformation, in the form of a conditional distribution P(Θ | MX). This distribution assigns high likelihood to small deformations and low likelihood otherwise; many different choices are possible. Second, each scan point is generated from its corresponding point in the transformed model with Gaussian noise. In particular, we define the conditional probability:

P(zk | ck = i, yi) = N(zk; yi, ΣC)

where ΣC is a diagonal 3 × 3 matrix specifying the Gaussian variance. Finally, we assume uniform prior distribution P(C) over the correspondence variables, reflecting the lack of prior information about the correspondences.
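To make this concrete, here is a minimal NumPy sketch of the per-point Gaussian likelihood above; the function name and numeric values are illustrative, not from the thesis:

```python
import numpy as np

def point_likelihood(z_k, y_i, sigma_diag):
    """Evaluate N(z_k; y_i, Sigma_C) for a diagonal 3x3 covariance,
    given as the vector of its three diagonal entries."""
    diff = z_k - y_i
    norm = 1.0 / np.sqrt(np.prod(2.0 * np.pi * sigma_diag))
    return norm * np.exp(-0.5 * np.sum(diff ** 2 / sigma_diag))

# A scan point near its generating model point is far more likely
# than one that drifted away from it.
y_i = np.array([0.0, 0.0, 0.0])
sigma = np.array([0.01, 0.01, 0.01])
p_near = point_likelihood(np.array([0.01, 0.0, 0.0]), y_i, sigma)
p_far = point_likelihood(np.array([1.0, 0.0, 0.0]), y_i, sigma)
assert p_near > p_far
```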

Given the probabilistic model described above, the registration problem can be cast as likelihood maximization: finding the most likely set of values for Θ and C given the original meshes MX and MZ. The algorithm starts with a reasonable initial estimate of the deformation Θ, and tries to maximize the objective

argmaxΘ,C P(Θ, C | MX, MZ).

Unfortunately, this is difficult to do in closed form, because of the complex correlations between the variables Θ and C. Instead, all non-rigid ICP algorithms are based on the insight that it is much easier to optimize the log-likelihood by iteratively optimizing either C or Θ, while keeping the other set of variables fixed.

The algorithm can be viewed as an instance of hard Expectation-Maximization. It aims to maximize the log-likelihood

log P(C, Θ | MX, MZ).

This objective is optimized by iterating between two steps. The hard E-step solves for the most likely assignment of the correspondences C, assuming Θ is fixed; in other words, for

argmaxC log P(C | Θ, MX, MZ).

Given the transformation Θ, the transformed mesh MY is uniquely determined.

Non-rigid Iterative Closest Point. Input: meshes MX and MZ, initial alignment Θ∗.

1: while MY = Θ(MX) and MZ are not sufficiently close do

2: Hard E-step: given Θ, compute the set of correspondences C: for each scan point zk, find the nearest point yi in MY and set ck = i.

3: M-step: given C, compute a new transformation estimate Θ = argmaxΘ log P(Θ | C, MX, MZ).

4: end while

5: return deformed mesh MY = Θ(MX) and correspondences C.

With this in mind, it is easy to infer from the probabilistic model that the correspondence variable assignments are conditionally independent of each other given a known deformation Θ and the scan mesh MZ. Thus, the E-step objective can be optimized independently for each correspondence variable. In the M-step, the correspondences computed in the E-step are used to update the deformation Θ = argmaxΘ log P(Θ | C, MX, MZ), and the process is iterated until convergence.
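As an illustration of this E/M alternation, here is a toy, self-contained sketch in which the deformation Θ is restricted to a global translation. This is a drastic simplification of the deformation model in the thesis, made so that the M-step has a one-line closed form:

```python
import numpy as np

def nonrigid_icp(model_pts, scan_pts, n_iters=20):
    """Toy hard-EM ICP where Theta is a single global translation.

    Hard E-step: each scan point z_k picks its nearest transformed
    model point y_i. M-step: re-estimate the translation from the
    matched pairs (the least-squares solution is the mean residual).
    """
    theta = np.zeros(3)  # initial alignment Theta*
    for _ in range(n_iters):
        moved = model_pts + theta                      # M_Y = Theta(M_X)
        # Hard E-step: c_k = index of nearest y_i for each z_k
        d = np.linalg.norm(scan_pts[:, None, :] - moved[None, :, :], axis=2)
        corr = np.argmin(d, axis=1)
        # M-step: translation maximizing the Gaussian likelihood
        theta = theta + (scan_pts - moved[corr]).mean(axis=0)
    return theta, corr

model = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0]])
scan = model + np.array([0.5, 0.2, 0.0])  # a translated "scan"
theta, corr = nonrigid_icp(model, scan)
assert np.allclose(theta, [0.5, 0.2, 0.0])
```

With a richer deformation class, only the M-step changes; the E-step stays a nearest-neighbor search.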

It was observed that with the directed graphical model the results were not consistent and varied across iterations. Hence, we explore undirected graphical models, also known as Markov random fields.

4. Introduction to Markov Networks

We introduce undirected graphical models known as Markov networks or Markov random fields (MRFs). MRFs offer an alternative approach to encoding independence structure in joint probability distributions. We introduce them using a simple example and then generalize it into a formal MRF definition.

We define a distribution over the possible segmentations of a surface into K regions. We want this distribution to prefer segmentations in which adjacent points pi and pj are assigned to the same region. The surface is represented as a discrete sampling of points, in which adjacent points are connected by links. Each point pi is associated with a discrete variable Xi which can take K values, assigning the point to one of the respective regions. To represent our preference that adjacent points are assigned to the same region formally in the model, we will associate a measure, called a potential, with pairs of variables (Xi,Xj). Each potential is associated with a fully-connected group of variables, called a clique. The set of connections between the variables of all cliques induces an undirected graph U. This graph can be used to encode a set of conditional independence assumptions in the underlying distribution.
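A minimal sketch of this segmentation example, with hand-picked Potts-style potentials (the preference value 2.0 is an arbitrary illustrative choice, not from the source):

```python
def segmentation_score(labels, edges, same_region_pref=2.0):
    """Unnormalized probability of a segmentation under a pairwise MRF.

    Each edge (i, j) carries a potential over the clique (X_i, X_j)
    that rewards assigning adjacent points to the same region.
    """
    score = 1.0
    for i, j in edges:
        score *= same_region_pref if labels[i] == labels[j] else 1.0
    return score

# Four surface points on a line; K = 2 regions.
edges = [(0, 1), (1, 2), (2, 3)]
smooth = segmentation_score([0, 0, 1, 1], edges)  # one region boundary
noisy = segmentation_score([0, 1, 0, 1], edges)   # three boundaries
assert smooth > noisy
```

Normalizing these scores over all label assignments would give the joint distribution the potentials define.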

6. Correlated Correspondence Algorithm

In order to find a consistent embedding, we need to define a model that captures the correlations between correspondence variables. For this task, we use a pairwise Markov network. The network contains single potentials ψ(ck), which prefer embeddings that match similar-looking areas in the two surfaces. The network also contains pairwise potentials ψ(ck, cl) associated with pairs of correspondence variables (ck, cl). These potentials model the variable correlations that enforce a preference for embeddings that minimize surface deformation and conform to geodesic distance constraints. The resulting Markov network is a model of a joint probability distribution of the form

P(C) = (1/Z) ∏k ψ(ck) ∏k,l ψ(ck, cl),

which contains only single and pairwise potentials. A sketch of the Correlated Correspondence model is displayed in the figure below.

The task of registration is thus reduced to performing probabilistic inference in the Markov network, in order to find the most likely joint assignment to the entire set of correspondence variables C. The inference problem for a general Markov network is NP-hard, but there is extensive literature on approximate Markov network inference. We apply an algorithm called loopy belief propagation (LBP), which performs an efficient search in the exponential space of correlated variable assignments. In contrast, the non-rigid ICP algorithm requires that an initial alignment hypothesis for the entire surface be given to the algorithm. Generally, there are exponentially many such hypotheses, and non-rigid ICP lacks a mechanism to search that space efficiently.

7. Probabilistic Model

Here, we describe in detail the probabilistic model of registration, which takes the form of a Markov network over the correspondence variables. This network contains the following kinds of pairwise and single potentials:

1. Single local surface signature potentials ψs(ck), which prefer to match similar-looking parts of the surface.

2. Pairwise deformation potentials ψd(ck,cl), which encode a preference for small deformation during the embedding.

3. Pairwise geodesic distance potentials ψn(ck,cl) and ψf(ck,cl), which enforce approximate preservation of geodesic distance during the embedding.

Below we describe how to define and compute each of these potentials in detail.

7.1 Local Signature Potential

We encode a set of potentials that correspond to the preservation of local surface properties between the model mesh and the scan mesh. The use of local surface signatures is important, because it helps guide the optimization in the exponential space of assignments. We mainly experimented with spin-image features, although other features can be used as well. A spin-image is an oriented histogram associated with a point p on the surface; an example is displayed in the figure below. The point normal n defines the plane Tp, tangent to the surface at p. Each point q in the neighborhood of p is associated with two statistics: the signed distance β between q and the plane Tp, and the distance α from p to q's projection onto the plane Tp. The spin-image is a 2D histogram, which partitions the space of possible α and β values into bins and counts how many neighboring surface points fall in each bin. When the surfaces around scan and model points are similar, we expect their spin-images to be similar as well.

Spin images are two-dimensional histograms computed at an oriented point p on the surface mesh of an object.

The spin-image signatures are invariant to surface rotations around the point normal n (since the statistics α and β are invariant to such rotations). As a result, spin-images offer an efficient way of comparing the surfaces around two points without requiring their rotational alignment, which is usually unknown. We compress the spin-images using principal component analysis (PCA) to produce a low-dimensional signature sx of the local surface geometry around a point x. Two low-dimensional signatures sxi and szk can be compared simply by their L2 distance: di,k = ‖sxi − szk‖. Our surface similarity potentials are defined as a Gaussian distribution over these distances:

ψs(ck = i) = N(di,k; 0, ΣS),

where ΣS is a diagonal covariance matrix.

Spin-images proved to be highly efficient features, which performed well for the registration of articulated objects. In that case, the surface around the joints undergoes significant deformation, but the rest of the surface is usually deformed only slightly, comparable to the spin-image resolution. Spin-images were enough to obtain high-quality registrations in this case.

There are limits as to how well features, such as spin-images, can perform in a completely unsupervised setting. For example, when we register scans of human bodies, we need to deal with large deformations in some parts of the shape, and small deformations in others. Whenever the resolution of the spin-image bins was small relative to the deformation, we got poor results. Increasing the spin-image resolution impaired the feature accuracy in areas which don’t deform as much (human head and fists). This suggests that in the presence of additional knowledge, we can adapt the size and scale of the local surface signatures for different parts of the scan and achieve even better registration results.
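The α/β construction above can be sketched as follows; the bin counts and support radius are illustrative choices, not the values used in the thesis:

```python
import numpy as np

def spin_image(p, n, neighbors, n_alpha=4, n_beta=4, r_max=1.0):
    """2D histogram over (alpha, beta) for neighbors of oriented point (p, n).

    beta:  signed distance of q to the tangent plane T_p (along normal n).
    alpha: distance from p to q's projection onto T_p.
    Both statistics are invariant to rotations about n.
    """
    n = n / np.linalg.norm(n)
    hist = np.zeros((n_alpha, n_beta))
    for q in neighbors:
        d = q - p
        beta = np.dot(d, n)                    # signed height above T_p
        alpha = np.linalg.norm(d - beta * n)   # radial distance within T_p
        ia = min(int(alpha / r_max * n_alpha), n_alpha - 1)
        ib = int((beta + r_max) / (2 * r_max) * n_beta)
        if 0 <= ib < n_beta:                   # drop points outside support
            hist[ia, ib] += 1
    return hist

p = np.zeros(3)
n = np.array([0.0, 0.0, 1.0])
pts = np.array([[0.1, 0.0, 0.05], [0.0, 0.1, -0.05], [0.5, 0.5, 0.2]])
h = spin_image(p, n, pts)
assert h.sum() == 3  # every neighbor lands in some bin
```

In the full pipeline these histograms would then be flattened and PCA-compressed into the signatures sx compared by L2 distance.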

7.2 Deformation Potential

This model needs to encode a preference for embeddings of mesh MZ into mesh MX that minimize the amount of deformation Θ induced by the embedding. We model deformation using pairwise potentials

ψd (ck,cl) between the correspondence variables associated with adjacent scan mesh points (points connected by an edge in MZ).

We introduce a separate potential associated with each edge in the scan mesh. The deformation potential is a table, assigning a value to each possible joint assignment of its correspondence variables. Because of this, there is a distinct benefit to using pairs of variables (instead of triples or larger groups). A particular value ψd(ck = i, cl = j) in the potential table denotes the preference given to the assignment ck = i, cl = j. Intuitively, the value corresponds to the amount of deformation that model edge (i, j) incurs to transform into scan edge (k, l).

Importantly, the set of possible matches for scan edge (k,l) is not limited only to the set of model mesh edges EX. That set of edges is sparse and local, and therefore insufficient to cover the space of deformations we want. Instead, we will allow any two points in the model mesh MX to implicitly define a matching link for edge (k,l).

To quantify the amount of deformation, we treat the model mesh links as springs, which resist stretching and twisting at their endpoints. Consider a particular model link (i, j). Its stretching is easily defined by looking at changes in the link length li,j. Link twisting, however, is ill-specified by the Cartesian coordinates of the points alone. We therefore attach an imaginary local coordinate system to each point on the model. This local coordinate system allows us to quantify the amount of link twisting: no twisting occurs if the orientation of the link endpoints in their neighbors' coordinate systems is preserved. This orientation is captured by defining the unit vector dij, which describes the orientation of point xj in the local coordinate system of point xi (and similarly, we can define dji):

dij = Ri⁻¹ (xj − xi) / li,j.

Above, Ri is the matrix describing the rotation of the coordinate system centered on point xi. For simplicity, we assume that in the original model mesh this rotation is simply the identity matrix I.

We point out that the Correlated Correspondence algorithm can easily incorporate different deformation models. Most implementations of non-rigid ICP use a carefully chosen definition of deformation, which can be easily linearized and results in a least-squares optimization objective. The probabilistic inference employed by the Correlated Correspondence algorithm does not rely on the continuity or differentiability of the deformation scoring function. Different models of deformation can be introduced without changing the optimization algorithm, simply by replacing the values in the deformation potentials.
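As a sketch, here is a stretch-only version of one deformation potential entry. The full model also penalizes twisting through the local coordinate frames, and σ here is an illustrative noise scale, not a value from the thesis:

```python
import numpy as np

def deformation_potential(x_i, x_j, z_k, z_l, sigma=0.1):
    """Score psi_d(c_k = i, c_l = j) for matching scan edge (k, l)
    to the model pair (i, j).

    A toy spring model that penalizes stretching only: the potential
    is highest when the model link length matches the scan link length.
    """
    stretch = np.linalg.norm(x_j - x_i) - np.linalg.norm(z_l - z_k)
    return np.exp(-0.5 * stretch ** 2 / sigma ** 2)

x_i, x_j = np.zeros(3), np.array([1.0, 0.0, 0.0])
z_k, z_l = np.zeros(3), np.array([1.0, 0.0, 0.0])
same = deformation_potential(x_i, x_j, z_k, z_l)        # no stretch
stretched = deformation_potential(x_i, x_j, z_k, 2 * z_l)  # doubled length
assert same > stretched
```

Because the table entries are just numbers, swapping in a different deformation score requires no change to the inference procedure, as noted above.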

7.3 Potentials Associated with Geodesic Distance

We want to introduce constraints that preserve distance along the mesh surface (geodesic distance). The probabilistic framework can treat such constraints as correlations between pairs of correspondence variables. We encode a nearness preservation constraint, which prevents adjacent points in mesh MZ from being mapped to geodesically distant points in MX. For adjacent points zk, zl in the scan mesh, we define the following potential:

ψn(ck = i, cl = j) = 0 if dgeo(xi, xj) > α·ρ, and 1 otherwise,

where ρ is the scan mesh resolution, dgeo denotes geodesic distance on the model mesh, and α is a constant, chosen to be 3.5.

The farness preservation potentials encode the complementary constraint. For every pair of points zk, zl whose geodesic distance on the scan mesh is more than 5ρ, we add a potential

ψf(ck = i, cl = j) = 0 if dgeo(xi, xj) < β·ρ, and 1 otherwise,

where β is also a constant, chosen to be 2 in our implementation. The intuition behind this constraint is fairly clear: if zk and zl are far apart on the scan mesh, then their corresponding points must be far apart on the model mesh.
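Under this thresholded reading, the two potentials can be sketched as simple indicator functions. The hard 0/1 form is my simplification of the constraints as described; a softer version would use small nonzero values in place of 0:

```python
def nearness_potential(d_model, rho, alpha=3.5):
    """psi_n: adjacent scan points must map to model points
    within geodesic distance alpha * rho of each other."""
    return 0.0 if d_model > alpha * rho else 1.0

def farness_potential(d_model, rho, beta=2.0):
    """psi_f: scan points more than 5 * rho apart must not map
    to model points closer than beta * rho."""
    return 0.0 if d_model < beta * rho else 1.0

rho = 0.1  # scan mesh resolution
assert nearness_potential(0.2, rho) == 1.0  # within 3.5 * rho: allowed
assert nearness_potential(0.5, rho) == 0.0  # torn apart: forbidden
assert farness_potential(0.1, rho) == 0.0   # collapsed together: forbidden
assert farness_potential(0.5, rho) == 1.0   # stays far apart: allowed
```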

In the previous section, we defined a Markov network, which encodes a joint probability distribution over the correspondence variables as a product of single and pairwise potentials. The goal is to find a joint assignment to these variables that maximizes this probability. This problem is one of standard probabilistic inference over the Markov network. However, the Markov network is quite large, and contains a large number of loops, so that exact inference is computationally infeasible. We therefore apply Sum-product loopy belief propagation (LBP), which is an approximate inference method that has been shown to work in a wide variety of applications. Running LBP until convergence results in a set of probabilistic assignments to the different correspondence variables, which are locally consistent. We then simply extract the most likely assignment for each variable to obtain a correspondence.
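For intuition, here is a compact max-product LBP sketch on a pairwise network of the same form. This is a simplification: the thesis applies sum-product LBP to a much larger network, and message scheduling and damping details are omitted here:

```python
import numpy as np

def loopy_max_product(unary, edges, pairwise, n_iters=30):
    """Max-product loopy belief propagation on a pairwise MRF.

    unary:    (n_vars, n_states) array of single potentials psi(c_k).
    edges:    list of (k, l) variable pairs.
    pairwise: dict mapping (k, l) -> (n_states, n_states) potential table.
    Returns the locally most likely state for each variable.
    """
    n, s = unary.shape
    # msgs[(k, l)][j] = message from k to l about l taking state j
    msgs = {(k, l): np.ones(s) for k, l in edges}
    msgs.update({(l, k): np.ones(s) for k, l in edges})
    psi = dict(pairwise)
    psi.update({(l, k): t.T for (k, l), t in pairwise.items()})
    for _ in range(n_iters):
        new = {}
        for (k, l) in msgs:
            # product of k's unary and all incoming messages except l's
            belief = unary[k].copy()
            for (a, b), m in msgs.items():
                if b == k and a != l:
                    belief = belief * m
            new[(k, l)] = np.max(psi[(k, l)] * belief[:, None], axis=0)
            new[(k, l)] /= new[(k, l)].sum()  # normalize for stability
        msgs = new
    states = []
    for k in range(n):
        b = unary[k].copy()
        for (a, c), m in msgs.items():
            if c == k:
                b = b * m
        states.append(int(np.argmax(b)))
    return states

# Two correspondence variables that prefer to agree; the first has a
# strong unary preference for state 0, which propagates to the second.
unary = np.array([[0.9, 0.1], [0.5, 0.5]])
pairwise = {(0, 1): np.array([[0.9, 0.1], [0.1, 0.9]])}
assert loopy_max_product(unary, [(0, 1)], pairwise) == [0, 0]
```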

7.4 Dealing with Farness Preservation Potentials

One complication arises from the form of the farness preservation constraints. In general, most pairs of points in the mesh are not close, so the total number of such potentials grows as O(NZ²), where NZ is the number of points in the scan mesh. This can easily become the bottleneck of the algorithm, because the number of all other potentials scales linearly with the number of scan mesh points. However, rather than introducing all these potentials into the Markov network from the start, we can introduce them as needed. First, we run LBP without any farness preservation potentials. We can then easily check whether the solution violates a set of farness preservation constraints; this is done efficiently by checking whether points that end up nearby on the model mesh are indeed nearby on the scan mesh as well. If not, we add the geodesic potentials associated with the violated constraints and rerun LBP. In practice, this approach adds only a small number of farness preservation constraints.
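The lazy constraint-addition loop can be sketched as the control flow below; the three callbacks stand in for the real LBP run, the violation check, and the network update:

```python
def register_with_lazy_farness(run_lbp, find_violations, add_potentials,
                               max_rounds=5):
    """Add the O(N^2) farness potentials only when the current LBP
    solution actually violates them, instead of instantiating all of
    them up front."""
    solution = run_lbp()
    for _ in range(max_rounds):
        violated = find_violations(solution)
        if not violated:
            break
        add_potentials(violated)
        solution = run_lbp()
    return solution

# A mock "inference engine" that starts violating one constraint and
# satisfies it once the corresponding potential has been added.
state = {"potentials": set(), "runs": 0}
def run_lbp():
    state["runs"] += 1
    return state["potentials"].copy()
def find_violations(sol):
    return set() if (0, 3) in sol else {(0, 3)}
def add_potentials(v):
    state["potentials"] |= v

sol = register_with_lazy_farness(run_lbp, find_violations, add_potentials)
assert (0, 3) in sol and state["runs"] == 2
```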

7.5 Dealing with Local Minima of Loopy Belief Propagation

In some cases, LBP inference may converge to a local minimum of the energy defined by the Markov network: there may be another solution with a higher log-likelihood than the one found by the algorithm. In practice, we observed that this can happen when the object shape contains symmetries. These cases most frequently involve several identical object parts, such as chair legs or car tires. Another common case is planar (mirror) symmetry, which is frequently observed in objects such as people, cars and chairs. To deal with the local-minima problem, we need to run LBP with different starting conditions, which causes the algorithm to explore different parts of the energy landscape.

The CC algorithm provides the capability to embed the scan mesh, which can contain any subset of the surface, into the model mesh. Thus, it also provides the capability to embed any subset of the scan mesh into the model mesh. We consider breaking the scan mesh into parts and using CC to find several different embeddings for each part. In the extreme case where a part contains a single point, the embedding algorithm need only consider its local surface signature, so there is a lot of ambiguity and hence many possible matches in the model. On the other hand, we expect that object parts of non-trivial size will have only a few good candidate matches. Our algorithm thus relies on computing the embeddings of a few medium-sized scan mesh parts. From these, we obtain several high-scoring hypotheses that embed all parts into the model in a non-overlapping way. Each such hypothesis can then be used to initialize a different run of LBP inference. The algorithm is sketched below.

Each embedding hypothesis Hi, obtained with the algorithm below, contains a candidate match point in the model for each extremum point. We initialize LBP for each hypothesis by requiring that the extrema points be embedded in the vicinity of the matches found by that algorithm (this is accomplished by allowing each extremum point to match only points near its candidate match in the model mesh). We compare the LBP results obtained using the different initialization hypotheses and pick the solution with the highest log-likelihood.

Algorithm MultipleStartingHypotheses

Require: Part size R, K embeddings per part.

Output: A set of N starting hypotheses for LBP.

1: Find a set of scan mesh parts PZ = (p1,…,pS).

2: Find the point zC on the scan mesh surface, which is nearest to the mesh centroid.

3: Compute the geodesic distance map DG from zC to all mesh points.

4: Get a set of extrema points V = (v1,…,vS) that correspond to local maxima of DG with non-trivial support.

5: Each part ps then contains the subset of the surface within a radius R around extremum point vs.

6: Use the CC algorithm to find K different embeddings of each part ps into the model mesh, and their likelihood.

7: Find the set of N highest-likelihood hypotheses which embed all parts into the model in a non-overlapping way. If no such hypotheses exist return ∅.

8. Experimental Results for Partial View Completion

The Correlated Correspondence algorithm allows us to register a partial scan of an object to a known complete surface model of the object. We can then use non-rigid ICP to morph the template mesh onto the partial scan. The result is a mesh that matches the scan surface, while completing the unknown portion of the surface using the template geometry.

The CC algorithm was used to hole-fill Cyberware human body scans. Here, four additional markers were used to resolve the problem of body symmetries. Several shape completions, together with the template model used to obtain them, are displayed in the figure below. The figure demonstrates that the shape-completion process produces reasonable results for a very rich set of poses using a single template model. Cyberware scans are acquired from four different viewpoints simultaneously and therefore contain only fairly small holes. The extensive partial-view surface data minimized the effect of the original template shape prior and constrained it to fit the local surface in practically all cases.

This completed my quest for filling the holes in any 3D scan.

This capability has large-scale impact on fields like anthropology and forensic science. Some applications of probabilistic graphical models are outlined in the next section.

9. Applications of Probabilistic Graphical model in other fields

The framework of probabilistic graphical models provides support for natural representation, effective inference and feasible model acquisition. Thus, it naturally leads to an integrated methodology for tackling a new application domain.

First, we define a class of models that encodes the key properties of the domain that are critical to the task. Then we use learning to fill in the missing details of the model. The learned model can be used as a basis for knowledge discovery, with the learned structure and parameters providing important insights about the properties of the domain. It can also be used for a variety of reasoning tasks: diagnosis, prediction and decision making. Here are some of the major applications in real-life settings.

9.1. Forensic Anthropology

The techniques outlined in this article can help forensic scientists remodel deformed bodies and missing parts based on descriptive morphological traits specific to a population. A missing part of a bone or skull can be regenerated, giving an important clue for identifying a person of a certain morphology.

9.2. Generation of Missing Data in Sensor-Based Industrial IoT Networks

As explained at the beginning of this article, Bayesian learning and expectation-maximization techniques can help us reconstruct data from sensors located in remote locations. This can help us predict sensor behavior in the long run and minimize service disruptions.

9.3. Drug Discovery and Trials

Probabilistic models are applied to drug trials and subsequent discoveries by comparing the results of actual drug administration with those of placebo samples. Scientists are able to draw inferences about drug effectiveness based on the sample profiles. Causal reasoning through PGMs helps us do this using limited resources and sample data.

9.4. Prenatal Diagnosis

It is important to determine chromosomal abnormalities in a fetus in the early stages of pregnancy. A patient's risk of having a child with a serious disease depends on multiple factors, including the child's sex and race, the mother's age and family history.

A decision-making system called PANDA was developed based on multiple attributes; it helps parents decide on a course of action for prenatal testing.

9.5. Decision Making

The most common application of Bayesian network technology is fault diagnosis and repair. We construct a model of the device in question, in which random variables correspond to different faults and different types of observations of the device state. Actions in this setting correspond both to diagnostic tests that can help indicate where the problem lies and to actions that repair or replace a broken component.

This helps create an influence diagram, which provides a factored representation of the agent's actions and utility function.

10. References:

10.1. https://www.coursera.org/specializations/probabilistic-graphical-models

10.2. Probabilistic Graphical Models — Daphne Koller and Nir Friedman

10.3. Learning Models of Shape from 3D Range Data — Dragomir Anguelov
