Carl Dehlin
Jan 27, 2025

RANSAC for multiple item estimation

This articles dives deep into some ideas presented in an earlier post about RANSAC, and extend the mathematical formalism to multiple items in a scene, wrapping up with a concrete formula for calculating the minimum required number of iterations to guarantee a certain probability of success and an upper bound on the number of items in a scene.

I talked about RANSAC and robust estimation for finding a plane in a point cloud in one of our earlier blog posts: https://www.rooniq.com/blog/robust-estimation-in-point-clouds . But what if there are two planes? Or in general, $m$ planes, present in the scene? The object plane can be replaced by other things as well, such as lines or homographies. What is the probability of finding all objects?

Let $L$ be the minimum number of points for estimating an object ($L = 3$ for planes), $Q = 1 - P^L$ be the probability of failure in one iteration, and $q = Q^n$ after $n$ iterations. Permitting one point to belong to several objects, the chance of finding all is simply (ignoring replacement effects)

$$
   p = (1 - q)^m = \sum_{k=0}^m (-1)^k {m \choose k} q^k
$$

First order expansion

Assuming $mq \ll 1$ we can expand to first order in $q$

$$\begin{align}
   p^\prime & \approx 1 - mq \newline
   q^\prime & \approx mq
\end{align}$$

So, let us say that $m = 2$ and $q = 1\%$ such that the expansion is valid, then the risk of failure linearly increase with the number of objects giving $q^\prime = 2\%$. This is all intuitive, but how does the that affect the number of required iterations $n^\prime$ for maintaining constant chance of success (risk of failure)?

$$\begin{align}
   q^\prime(n^\prime) &= mQ^{n^\prime} = Q^n \newline
   n^\prime = \frac{n \log{Q} - \log{m}}{\log{Q}} &= n - \frac{\log{m}}{\log{Q}} = n + \frac{\log{m}}{\log{1/Q}} \geq n
\end{align}$$

Quadratic approximation

Let us investigate the solution by incorporating second order terms as well

$$\begin{align}
   p^\prime & \approx 1 - mq + \frac{m(m-1)}{2}q^2 \newline
   q^\prime & \approx mq - \frac{m(m-1)}{2}q^2 \newline
\end{align}$$

Rearranging gives

$$
\begin{align}
mQ^{n^\prime} - \frac{m(m-1)}{2}Q^{2n^\prime} &= Q^n \newline
Q^{2n^\prime} - \frac{2}{m-1}Q^{n^\prime} &= - \frac{2}{m(m-1)}Q^n \newline
\Big( Q^{n^\prime} - \frac{1}{m-1} \Big)^2 - \frac{1}{(m-1)^2} &= - \frac{2}{m(m-1)}Q^n \newline
Q^{n^\prime} = \frac{1}{m-1} \pm \sqrt{ \frac{1}{(m-1)^2} - \frac{2}{m(m-1)}Q^n} &= \frac{1}{m-1} \Bigg( 1  \pm \sqrt{1 - \frac{2(m-1)}{m} Q^n} \Bigg) \newline
\end{align}
$$

This does look a bit complex. Let us start by analyzing the roots. Since we looks for the smallest possible $n^\prime$ satisfying the equation we can discard the positive root. For $m \gg 1$, we can approximate $(m-1)/m \approx 1$ such that

$$
Q^{n^\prime} \approx \frac{1}{m} \Bigg( 1 - \sqrt{1 - 2 Q^n} \Bigg)
$$

Expand the square root $\sqrt{1-x} \approx 1 - x/2$ and simplify

$$
Q^{n^\prime} \approx \frac{1}{m} \big(1 - (1 - Q^n)\big) = \frac{Q^n}{m}
$$

which is identical to the result using linear expansion!

Many item limit: Exponential series approximation

Let us consider the many item limit $m \rightarrow \infty$. The expansion of $q^\prime$ then becomes

$$
1 - q^\prime = \sum_{k=0}^\infty (-1)^k {m \choose k} q^k
= \sum_{k=0}^\infty (-1)^k \frac{m!}{(m-k)!k!} q^k
\approx \sum_{k=0}^\infty (-1)^k \frac{m^k}{k!} q^k = \exp(-mq)
$$

Solving for $n^\prime$ gives

$$\begin{align}
\exp(-mQ^{n^\prime}) &= 1 - Q^n \newline
-m Q^{n^\prime} &= \log(1 - Q^n) \newline
n^\prime &= \frac{\log(- \log(1 - Q^n)/m)}{\log Q}
\end{align}$$

In the limit of high single item probability of success (low risk of failure, $Q^n \approx 0$), we can expand the inner logarithm to first order as $\log(1+x) \approx x$ such that

$$
n^\prime \approx \frac{\log(Q^n/m)}{\log Q}
= \frac{n\log Q - \log m}{\log Q}
= n + \frac{\log m}{\log 1/Q} \geq n
$$

which is also exactly the same as the first order expansion!

A general formula

All the three expansions inspected above gives us confidence that the relation between the number of iterations that needs to be added in response to more items in the scene, to be on the following form

$$
\Delta n = \frac{\log m}{\log 1/Q}
$$

Upon first inspection, it looks like the number of required iterations to keep constant chance of success grows logarithmically with the number of items in the scene. However, there is a hidden dependence between $m$ and $Q$. Imagine that a certain fraction $\beta$ of all points belongs to items of interest. Further assume that the number of points per object is uniformly distributed between each item. Then, the chance of picking (and risk of not picking) $L$ points belonging to one item (ignoring replacement) are

$$\begin{align}
   P &= \bigg(\frac{\beta}{m} \bigg)^L \newline
   Q &= 1 - \bigg( \frac{\beta}{m} \bigg)^L
\end{align}
$$

Plugging back into the equation for $\Delta n$ gives

$$
   \Delta n = \frac{\log m}{\log 1/Q}
   = \frac{\log m}{\log \frac{1}{1 - (\beta/m)^L}}
   = \frac{\log m}{\log \frac{m^L}{m^L - \beta^L}}
   = \frac{\log m}{L \log m - \log(m^L - \beta^L)}
$$

Rewriting the logarithm

$$
\log (m^L - \beta^L) = \log ( m^L \frac{m^L - \beta^L}{m^L} ) = L \log m + \log (1 - (\beta/m)^L)
$$

and expanding the second term to first order in $(\beta/m)^L$

$$
\log(1 - (\beta/m)^L) \approx - \bigg( \frac{\beta}{m} \bigg)^L
$$

which plugged back in to the expression for $\Delta n$ gives

$$
\Delta n = \frac{\log m}{L \log m - L \log m + (\beta/m)^L}
= \frac{m^L\log m}{\beta^L}
$$

so the change in iteration count is polynomial in the number of items. For the special case $L = 3$ for planes, it has a "supercubic" complexity. That is slightly problematic, and without modifications to the original RANSAC algorithm it can become infeasible in practice.

We can use the same reparameterization to derive an expression for the single item iteration count

$$
n = \frac{\log(1 - S)}{\log(1 - P^L)} = \frac{\log(1 - S)}{\log(1 - (\beta/m)^L)}
\approx - \frac{m^L \log(1 - S)}{\beta^L} = \frac{m^L \log(1/Q^\ast)}{\beta^L}
$$

where we have defined $Q^\ast = 1 - S$ as the target risk of failure. Note that the risk term can not be expanded because it is close to zero. This gives the total many item iteration count

$$
n^\prime = n + \Delta n \propto \frac{m^L \log m}{\beta^L} + \frac{m^L \log(1/Q^\ast)}{\beta^L}
= \bigg( \frac{m}{\beta} \bigg)^L \bigg( \log m + \log\frac{1}{Q^\ast} \bigg)
$$

This means that the iteration count is exponential in the feature count $L$, and polynomial in the number of items $m$.

Let us evaluate this for different sets of objects. Fix the chance of success to $S = 99\%$. We can quickly verify the equation for the single plane ($m=1$, $L=3$, $\beta=50\%$) scenario used earlier, and the formula predicts 37 iterations, which is very close to the exact solution of 35 predicted earlier. However, in case we want to find up to $m = 10$ planes that gives $n^\prime \geq 55.000$. That is a lot more than the few 35 iterations for single plane! We can also apply the formula to line detection ($L = 2$) or homography estimation ($L=4$). Let us restrict ourselves to finding $m=400$ lines. The formula predicts that this requires about 7 million iterations. The numbers are starting to get big, and whether or not this is considered too high depends on the computational requirements in each iteration.

Conclusion

This article derived a formula for calculating the minimum number of required iterations to guarantee a certain probability of success when estimating multiple items in a scene using the RANSAC algorithm. Even though many approximations are applied in order to handle the problem analytically, almost exactly the same expression is derived in three different ways, giving a high degree of confidence in the formula. It is worth emphasizing that all calculations so far has been done for all items in the scene. In practice, when estimating multiple items there is an initial quick population of items, but becomes slower as fewer remain undetected in the scene. This is due to combinatorial sub-selection which drastically increase the change of picking a subset of items from all of them. Hence, if it is sufficient to find almost all items in the scene, the number of iterations can be drastically reduced. I leave it as an exercise for the future to derive the formula for that scenario, either for myself, or maybe for you?

Let's discuss how our 3D software can speed up your operations.
Contact me