Project Euler - Problem 8 Largest product in a series, or "What I learned in school today"

The problem

As I briefly mentioned in my rant about the F# fanboy lies, I have been using Project Euler to help me learn F#. I have got as far as problem 8, which was to find the largest product in a series of digits. To save you the bother of clicking the link, here is the description:

The four adjacent digits in the 1000-digit number that have the greatest product are 9 × 9 × 8 × 9 = 5832.
73167176531330624919225119674426574742355349194934 96983520312774506326239578318016984801869478851843 85861560789112949495459501737958331952853208805511 12540698747158523863050715693290963295227443043557 66896648950445244523161731856403098711121722383113 62229893423380308135336276614282806444486645238749 30358907296290491560440772390713810515859307960866 70172427121883998797908792274921901699720888093776 65727333001053367881220235421809751254540594752243 52584907711670556013604839586446706324415722155397 53697817977846174064955149290862569321978468622482 83972241375657056057490261407972968652414535100474 82166370484403199890008895243450658541227588666881 16427171479924442928230863465674813919123162824586 17866458359124566529476545682848912883142607690042 24219022671055626321111109370544217506941658960408 07198403850962455444362981230987879927244284909188 84580156166097919133875499200524063689912560717606 05886116467109405077541002256983155200055935729725 71636269561882670428252483600823257530420752963450

Find the thirteen adjacent digits in the 1000-digit number that have the greatest product. What is the value of this product?

Apart from the fact that this was an interesting problem, I learnt a very important lesson from this one, and thought it worth sharing.

Solving the problem – or not!

My initial stab at this looked like this:

let chop n (s : string) =
  [ for i in [0..(s.Length - n)] do yield s.[i..(i + n - 1)]]
let product (s : string) =
  s |> Seq.fold (fun p c -> p * (int (string c))) 1
let lgstProduct n (s : string) =
  s |> chop n |> Seq.map product |> Seq.max

The chop function chops the string into chunks of length n, the product function calculates the product of the digits (assuming that the string passed in only contains numbers of course), and the lgstProduct function sticks these together to find the maximum product.

I tried this with the 1000 digit number passed as a string, and using 4 for the chunk size, and it produced the right answer, 5832. Wanting to make the code shorter and neater, I included the two helper functions in the main one, and managed to come up with this…

let largestProductInt64 (n : int64) (s : string) =
  [ for i in [0L..((int64 s.Length) - n)] do yield s.[(int i)..int(i + n - 1L)]]
  |> Seq.map (fun s ->
                  s, s |> Seq.fold (fun p c -> p * (int64 (int (string c)))) 1L)
  |> Seq.maxBy snd

Note that I changed the code to give me a tuple, containing both the highest product, and the n-character chunk that produced it. Chuffed to bits, I threw the number 13 at it, and got the answer (“9781797784617”, 2091059712) , which I duly plugged into the answer box on the Project Euler site, only to be told that it was wrong! What a chutzpah! Of course it’s right, my code works!

Or does it?

So what went wrong?

Having spent quite a bit of time testing my code, and convincing myself that it was right, I resorted to searching for other people’s answers to the same problem. Along the way, I came across someone who had had exactly the same problem as me, albeit in C++, and had come up with the same (wrong) answer.

It turns out that the issue was simple. When multiplying 13 digits together, you could potentially end up with 9^13, ie 2,541,865,828,329. Given that the maximum number that can be stored in the .NET int type is 2,147,483,647 the problem becomes apparent.

I changed my code to use int64, which is the F# equivalent of the .NET long type, and can hold numbers up to 9,223,372,036,854,775,807. Lo and behold, project Euler acquiesced, and accepted my answer.

In order to make my code even more general, I actually changed it to use bigint, which can hold any size of integer, but the point I want to take away from this remains the same.

What I learnt in school today

I think there is a very important lesson here. Like many of us, I piled in and started coding without really thinking about the problem. What I should have done is take a look at the problem domain, and think it through. It should have been obvious that the eventual product was going to be too large to fit into a 32-bit integer, which is probably why the Project Euler people chose such a large number in the first place. Had I done that, I would probably have got the right answer first time.

Now, I don’t know about you, but I almost never get these sorts of interesting problems in my day job. I usually get “Pull the data from the database, display it on a window, wait for the user to make changes and then save it,” which is significantly less interesting. However, I think the basic point remains valid. Without thinking through the scope of the problem, and the bounds of the domain, it’s very easy to pile and and get coding, whilst introducing all sorts of subtle bugs. My tests worked absolutely fine, simply because I was testing on small numbers. How many times do we developers test our code against a Noddy database, mainly to save development time? No need to put your hands up, we’re all guilty.

Had my largest product function been production code, I would have released a bug that would only have been spotted some time down the line. Depending on how easy/hard it would be to predict the right numbers, it’s possible that it might not have been spotted for a long time. People would just assume that the number produced was correct.

So, am I going to contemplate my problem domain before diving in and coding next time? Probably not, but at least if I don’t, I might have some idea where to look when the bug reports come in!

Improving the code

Having sorted all that out, I asked for a code review, and came across a really useful F# function that I hadn’t seen before. My chop function, included as the first line of my slimline largestProduct function split the input string into a sequence of chunks of length n. It turns out that F# has the Seq.windowed function that does exactly the same thing, but is more readable.

I also got a slightly better understanding of function composition, and saw how to reduce the number of brackets needed to convert the character to a bigint. I ended up with:

let largestProduct n (s : string) =
  Seq.windowed n s
  |> Seq.map (fun s -> s, s |> Seq.fold (fun p c ->
                                               p * (string > bigint.Parse) c) 1I)
  |> Seq.maxBy snd

I was quite pleased with this. A lot of functionality in four lines.

Solving the problem in C#

I was interested to see if I could solve the problem in C# as well, so I fired up LinqPad and jumped in. My initial version (including the extra bits need to run it in LinqPad, and the line to write out the result) looked like this:

void Main() {
  string s = "7316...3450" // NOTE: Snipped for brevity!!
  int n = 13;

  var maxProduct = MaxProduct(s, n);
  Console.WriteLine ("1) Max product is " + maxProduct.Item2
                               + " from " + maxProduct.Item1);
}

public Tuple<string, long> MaxProduct(string s, int n) {
  return Chop (s, n)
    .Select (s1 => new Tuple<string, long> (s1, Product (s1)))
    .OrderByDescending (t => t.Item2)
    .First();
}

public long Product (string s) {
  long res = 1;
  for (int i = 0; i < s.Length; i++) {
    res *= Convert.ToInt32 (s [i].ToString());
  }
  return res;
}

public IEnumerable<string> Chop (string s, int n) {
  for (int i = 0; i < s.Length - n + 1; i++) {
    yield return s.Substring (i, n);
  }
}

Hmm, quite a lot of code there. Looks like F# really is shorter and cleaner!

There must be a way to improve this. A few moments’ thought made me realise that the Product() method is really doing what the Linq Aggregate() extension method does. Also, the Chop() method could easily be done with Linq if I fed in a range of numbers for the starting positions of the substring (like I did in my original F# code).

After a short bit of fiddling, I came up with this rather improved C# version:

public long MaxProduct (string s, int n) {
  return Enumerable.Range (0, s.Length - n + 1)
      .Select (i => s.Substring (i, n))
      .Max (s1 => s1.ToCharArray()
                  .Aggregate (1, (long a, char c) =>
                                 a * Convert.ToInt64 (c.ToString())));
}

That’s much better! Once you ignore the extraneous bits, the body of the actual method is only three lines, a mere one line longer than the F# version. The F# is definitely cleaner, but as I’ve mentioned before, that’s not always an advantage.

After passing this problem around the team, one of the brighter sparks came up with an even shorter version that runs faster:

public long MaxProductEC (string s, int n) {
  return Enumerable.Range (0, s.Length - n + 1)
    .Max (i => s.Substring (i, n).Aggregate ((long)1, (a, c) => a * (c - '0')));
}

I defy anyone to tell me that C# is verbose! Don’t get me wrong, I’m really enjoying F#, but the lies are getting on my nerves!

All in all, an interesting exercise.

One Comment

Finding the two closest factors of an integer - Dot Net What Not said:

[…] for once, I did learn from my previous mistakes (see Project Euler – Problem 8 Largest product in a series, or “What I learned in school today”, Project Euler – Problem 14 Longest Collatz sequence, or “What I didn’t learn from problem […]

January 31, 2021

Project Euler – Problem 8 Largest product in a series, or “What I learned in school today”