## Monday, 7 October 2013

### Collatz - more thoughts

Edit: nb Everything that this posts tries to explain is given a cleaner exposition in this post http://barkerhugh.blogspot.co.uk/2013/10/a-binary-representation-of-collatz.html

Here's a bit more on this, following on from some earlier thoughts (in fact this post and the next one contains a compressed version of the three previous Collatz posts, focusing on the bits I am currently most interested in).

I mentioned before this kind of series:

A+1
B+1+2^p
C+1+2^p+2^q
D+1+2^p+2^q+2^r (where r > q > p)

To get from A to B etc, you add the first term to 2x the full equation. For instance:

A + 2(A) = B
B + 2(B+1) = C
C + 2(C+1+2^p) = D

Etc. Here is a worked example.

7
21+1=22 (=11*2)
21+(2*22)=65
65+1+2 = 68 (=17*4)
65+(2*68)=201
201+1+2+4=208 (=13*16)
201+(2*208)= 617
617+1+2+4+16 = 640 (=5*128)
617+(2*640)=1897
1897+1+2+4+16+128 = 2048

Now, the point of this is that this method gives us a series which can be shown to be equivalent to a Collatz sequence. The sequence above is equivalent to 7, 11, 17, 13, 5, 1, ignoring the halving.

7 = 7
22 = 2 x 11
68 = 4 x 17
208 = 16 x 13
640 = 5 x 128
2048 = 1 x 2048

What the sequence above is doing is starting from this way of expressing a Collatz sequence (without halving):

(7 x3)+1 = 22
(22 x 3) + 2 = 68
(68 x 3) + 4 = 208 etc

Which we can also express as:

3^1 x 7 + 1 = 22
3^2 x 7 + 3 + 2 = 68
3^3 x 7 + 9 + 6 + 4

In this form, a number X's chain will reach 1 if it satisfies an equation of this sort:

(3^n x X) + 3^(n-1) + [2^a x 3^(n-2)] + [2^b x 3^(n-3)] …. + [2^c x 3]+ 2^d = 2^e

Where e>d>c>b>a

For instance, for 7 we end up with:

(243x7) + 81 + (27x2) + (9x4) + (3x16) + 128
1701 + 81 + 54+ 36 + 48 + 128 = 2048

So… the process I started with is a way of finding the number that reaches the same sum as this equation, but without the powers of 3.

21+1 = 22
63 + 3 + 2 is equivalent to 65 + 1 + 2
189 + 9 + 6 + 4 is equivalent to 201 + 1 + 2 + 4
567 + 27 + 18 + 12 + 16 is equivalent to 617 + 1 + 2 + 4 + 16
1701 + 81 + 54+ 36 + 48 + 128  is equivalent to 1897+1+2+4+16+128

Now, the important thing about this is that we can explore the pattern of how to reach a power of 2 in steps of powers of two relatively easily, since it reduces this problem to a binary number problem - and to a problem where we know every chain reaches the "target" of a power of 2.

(Because, from any number we can reach the next power of 2 in steps of increasing powers of 2 for instance

89+1+2+4+32 = 128
67+1+4+8+16+32 = 128
119+1+8 =128)

Where the next power of 2 is 2^n then in the region between 2^(n-1) and 2^n the maximum number of steps this can take is n-1 (for instance between 16 and 32 the maximum number of steps is four, from 17 + 1 + 2 + 4 + 8. The average number of steps in the same region is (n-1)/2 (this is easy to show, I'll skip it for brevity).

Now, as we go up through the A, B, C, D sequence etc we can make several  observations.

At each step we are increasing by a factor of just over 3. So the maximum number of steps in the region increases by about 1.5-1.6, the average number of steps increases by about 0.75-0.8 (again, this can be made a bit more rigorous, I’m summarising to keep this brief).

At the same time, each time we take a step through the sequence, the minimum number of powers of 2 we need to be able to add increases by 1. For instance in each of the sums below, we add one more term.

21+1
65+1+2
201+1+2+4
617+1+2+4+16

The consequence of these two things put together is that for any starting number we eventually reach a point where (for an equivalent Collatz sequence that doesn’t reach 1) each number we hit has to be more than the average number of steps from a power of 2 – approximately 1.25 x the average.

As an example of how this works - for the Collatz sequence for 27, after about 18 steps we are in region where the average number of steps to a power of 2 is 18 - thereafter it takes until about the 40th step to reach a power of 2.

Also, each time we skip a power of 2, we have to reduce the maximum number of steps available by 1 – for instance as we extend the sequence above, we will never be able to reach a number that adds 8, as each time we will have to start by adding 1+2+4+16. So while we are in the region 2^8-2^9, the maximum number of steps available is reduced to 8-1 = 7.

Now, while it is impossible to have an infinite sequence of steps that don’t skip a power (again, this can be proven but I’m skimming), quite long sequences of this sort are possible (for instance starting from 255 there are 7 upwards steps before we skip a power).

So the challenge facing this approach is to try to prove that the sequence A, B, C, D, starting from any number can’t be infinitely extended without hitting a number where the number of steps we need to reach 2^n is equal to the minimum number of steps we need.

(Below is very rough, as it's a bit fiddly to explain...)

I’ve noticed something else which takes us a tiny bit closer to this.

When we reach a number like 201 in the sequence above, after which we are going to skip a power, the number of subsequent upwards steps without skipping a power is partly defined by how many powers we skip this time. This sounds woolly, so let me give a couple of examples.

201+1+2+4+16+32 = 256

We know at this stage that we will skip 8 (because the next number will be 201 + 208 + 208 – this means we are adding 0 mod 16, and will land on another number that is 9 mod 16, meaning that adding 1, 2, 4 will take us to a multiple of 16).

Also, we don’t skip the subsequent power, 32. This means we won’t have an upward step without skipping next time, which is correct – 617 + 1+2 +4+16+128+256 = 1024 – we are now going to skip 32 and 64.

Let’s look at a different sequence:

55
165+1
497+1+2
1497+1+2+4
4505+1+2+4+32

The turning point (where you skip a power) comes after 1497

The continued equation here would be 1497+1+2+4+32+512.
1497 = 8n+1 and 16n+1 and we are going to add 0 mod 8, 8 mod 16 to get to the next number, 4505. So we have to skip 8. We actually skip 16 too (I’ll explain the logic of this later ) so the next number we add will be 32. Thereafter, 1497 is 64n-39, 128n-39, 256n-39, 512n-39, so we will subsequently have upwards steps of 64, 128, 256, before skipping a power.

So to get to an upwards sequence of 3 steps without skipping, we need to hit a number that skips 3 numbers.This means we can't hit a sequence of n steps where n is more than the difference between the maximum number of steps in the region and the minimum number of steps we are obliged to take.

Putting this stuff together we get 1) for all numbers we will eventually reach a region where we are obliged to take about 1.25 x the average number of steps in that region at every stage. 2) In order to get uninterrupted upwards runs of numbers (eg not skipping a power) in the next step, we need to skip that many powers this step. 3) Whenever we hit the latter stages of a region there will be some powers that we are forced to skip. For instance, from 1897, it is impossible for us to add 256 or 512 as part of the path to 2048. This makes it hard to see how we could avoid reaching a power of 2, but falls short of being a proof unless it can be refined some more.