Finding the two closest factors of an integer

I was browsing the latest C# questions on StackOverflow, and came across one that caught my eye. Sadly, it was quickly closed by people who, in my opinion, didn’t read it properly. The question was…

> I am trying to write a program to find the factor of a number with minimum difference. For Ex: for 20 it should be 4×5 or vice versa. How can this be achieved for a really BIG number. lets say 989287498274928743928174192847219347123984723498. Please guide

Ignoring the usual SO rules about showing some effort or research, the question is not simply “How do I find the factors of n?”, which is what all the linked answers were about. This question was how to find the two closest factors. Sure you can do that by finding them all, then sorting them by their difference, but for a big number, that’s hugely inefficient. There are ways of solving this specific question much more efficiently, which is what I decided to do.

Just for once, I did learn from my previous mistakes (see Project Euler – Problem 8 Largest product in a series, or “What I learned in school today”, Project Euler – Problem 14 Longest Collatz sequence, or “What I didn’t learn from problem 8” and more recently, Using Linq to debunk the Feng Shui about the number of days in December), and thought about the problem before diving into code. Before anyone thinks I’ve become smart, I should in all honesty confess that the reason I didn’t jump into coding was because I was using the wife’s lap top while I had my breakfast, and that doesn’t have any coding tools installed.

OK, confession over, let’s take a quick look at the problem first.

Suppose you have a number n, which has two factors a and b. Note that this is always possible, as even if n is a prime, the two factors will just be 1 and n.

A moment’s thought should convince you that there are two possibilities here…

  1. n is a square number, and a and b are the square root (ie a==b).
  2. n is not a square number, in which case one of the two factors must be less than the square root of n, and the other must be greater than it.

With that in mind, a simple algorithm springs to mind:

  1. Start with the square root of n (which we’ll denote by a), and check if this is a factor of n. This is easily done in C# by checking if n % a is zero. If so, we are done.
  2. If we want to exclude the above case, or if n % a is not zero, meaning that a is not a factor of n, then try a + 1, then a + 2 and so on, until we either find two factors, or reach n, meaning that it’s a prime number.

This will find all pairs of factors of n, albeit in a fairly inefficient way.

Now, as we started from the square root of n and are working our way away from it, any pairs of factors that we find will be increasingly distant form each other. To put it another way, the very first pair that we find must be the closest. Therefore, we only need to run this algorithm until we find a pair. This is why this is so much more efficient than finding all pairs first, and then looking for the closest.

OK, enough thinking, it’s time to open LinqPad and try this. As my regular reader will know, I enjoy trying to find elegant Linq expressions to solve these sorts of problems.

I decided to ignore the fact that the original question was about big integers, and work with plain integers first. It was pretty simple to find all pairs of factors, starting with the closest…

Enumerable.Range((int)Math.Sqrt(n), n - (int)Math.Sqrt(n) + 1)
  .Where(a => n % a == 0)
  .Select(a => (a, n / a, Math.Abs(n / a - a)))

The first line just gives us a sequence of integers that starts with the biggest integer less than or equal to the square root of n, and containing numbers up to and including n. Enumerable.Range uses yield under the covers, it only generates the numbers you need, so doesn’t use up any more CPU cycles than necessary. I’m not sure how much difference this would make, but worth noting.

The next line checks for factors. As mentioned above, if n % a is zero, then a is a factor, then n % (n / a), must also be.

Whilst testing, I used Select to dump them all out, and the third line just shows both factors.

In order to find the closest pair, we can change Select to First, and drop the Select (which only works on sequences, not individual items). This gives us the following expression which will find one of the two factors…

Enumerable.Range((int)Math.Sqrt(n), n - (int)Math.Sqrt(n) + 1)
  .First(a => n % a == 0)

Finding the other factor is just a matter of diving n by this. I didn’t bother fiddling with the Linq to do this, as it wasn’t particularly interesting.

The only thing left is to do this for big integers. We have two problems here. First, Enumerable.Range only works with regular integers, and second, Math.Sqrt doesn’t handle big integers.

The first problem is easy to solve. I linked to the source for Enumerable.Range above, so we can just rewrite it to use big integers…

static IEnumerable<BigInteger> RangeIterator(BigInteger start, BigInteger count) {
  for (BigInteger i = 0; i < count; i++) {
    yield return start + i;
  }
}

I called it RangeIterator, as that is the name of the private method that does the actual work, Enumerable.Range is just a wrapper that includes bounds checking.

The second problem was solved with a quick search, which turned up a simple method for getting the square root of a big integer

static double BigSqrt(BigInteger bi) =>
  Math.Pow(Math.E, BigInteger.Log(bi) / 2);

There are some important caveats to the code (read the comments there), but none of them are relevant to us.

With these two methods defined, the original problem can now be solved as follows…

RangeIterator(new BigInteger(BigSqrt(n)), n - new BigInteger(BigSqrt(n)) + 1)
  .First(a => n % a == 0)

As noted before, this is very efficient when the two factors are close to the square root, but can be horribly slow for big integers where they are far apart. As is well known, Linq is not always the most efficient way of writing code, even if it is (usually) the most elegant. However, the original question didn’t put any bounds on efficiency, and I’m doing this for the fun, so I didn’t bother trying to optimise it.

Be First to Comment

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.