A QUEST FOR FASTER FERMAT INTEGER FACTORING USING ANTI-KEYS

JB Johnson
13 min readJun 24, 2023

Part Eight: Basics of Integer Factoring

Count to three and add three dozen.
Wave the magic key over the sum.

If the path remains visible,
There you will find the solution you seek.

Excerpt from Factoring Spreadsheet with Path Modification using Anti-Keys.
Factoring Path Modification Using Anti-Keys: N#5 = 20740961

Foreword

Anti-Keys? I know it looks like click bait, but this is actually a serious essay on Fermat Integer Factoring. (Or at least as serious as I can be…)

What is the purpose of Anti-Keys? In simplest terms, the purpose of Integer Factoring (in general) is to reduce the computational time required to determine the integer factors of (very) large integers. In this context, the purpose of Integer Factoring Families is to reveal mathematical relationships that facilitate computation shortcuts. Anti-Keys are a part of those shortcuts and can increase the increment values used in the iterations, i.e. they make them faster, but they also increase the number of pathways that must be examined, i.e. they make more overhead. This is a very real dilemma, but as one might expect, there is still a minor increase in overall speed, thus keeping us (mostly me) on this relentless treadmill.

Just a note to the readers who are interested in reading the spreadsheet images. I know that some of them are tiny in this format. Please use the Open Image in a New Tab function in your browser. In Firefox this is accessed by moving the cursor to the image and clicking the right mouse button. The image can be magnified by holding the Ctrl Key and the + Key down at the same time.

This article is part of the series called the Basics of Integer Factoring. If you view my Lists and About data, all the other articles can be seen.

Chapter 1 — Background

The EFC (Elimination for Cause) Keys allow us to eliminate various values of (An) as we iterate toward the solution, however, these keys can also be turned into Anti-Keys. As an Anti-Key, an entire pathway can be removed, with of course the complication of more overhead. My goal here is to determine whether the overall increase in speed outweighs the burden of that overhead.

This is Part Eight in this series. If you have been following this discussion up to this point, then you are ready to go. If not, I have a good summary in Part Seven, so I am happy to give you a free link back to that article. See the article: A DISCUSSION OF INTEGER FACTORING FAMILIES.

See Appendix 1 — Standard Definitions and Equations for variable definitions and related equations and also see Appendix 2Spreadsheet Labels and Highlights for my spreadsheet documentation notes.

Our primary sample problems for the duration of this essay will be N#7 = 19850791 = 223 × 89017, i.e. the good one, and N#5 = 20740961 = 233 × 89017, i.e. the (really) difficult one. In addition, due to some irregularities that I have run into, I am also using a few similar values. See Appendix 3 — Data Table for a complete list.

Chapter 1 — Introduction to the Anti-Keys

A conventional Integer Family iteration sequence for N#7 = 19850791 has two pathways: Group:1 and Group:2. After synchronization our starting values of (An) are An = 4480 and An = 4484. The increment value for each trial is a table lookup value of 36 for Series 7 with An:Even. See Appendix 4 — Family Integer Factoring Constants for more information on Series and Group factoring values. The iteration sets for each are An = 4480 + 36 × Z = {4480, 4516, 4552, 4588, … 44620} and An = 4484 + 36 × Z = {4484, 4520, 4556, 4592, … }. The first path eventually produces An = 44620, which is the desired solution. We verify this by noting that the full integer solution An = 4480 + 36 × Z = 4480 + 36 × 1115 = 44620, which is precisely what we would expect with Trials = 2231, i.e. (1131–1)/2 = 1115 iterations for each path. The second path will overshoot by the value of (look surprised) four. Here is what it looks like in the spreadsheet. Note that the pathways are intertwined but are readily identifiable via the Group number.

Spreadsheet showing Pathways for Group 1 and Group 2 Factoring
Basic Fermat Factoring using Families for N = 19850791

Note again, that if we expect Path:4480 to require 1115 trial iterations, that both combined paths should require twice that or 2230 trial iterations and since we start at Trial:1 or Trial:2 and not Trial:0, this works out to either 2231 or 2232 total iterations, which is what we have.

If we create a sequence of (An) values, starting with the initial pathway value, each incremented by 36, we can then apply the EFT Key = 3 and we see the following effect.

Spreadsheet showing how some values of An are blocked by Anti-Key:3
EFC Anti-Key: 3 for N=19850791

From the illustration above it can be seen that the An:4480 pathway, i.e. An = 4480 + 3 × 36 × Z = 4480 + 108 × Z = {4480, 4588, 4696, … }, will never generate a valid candidate because every third value of the initial sequence is not found in the Ideal Drop Table. There is a downside though, An:4516 and An:4552 become the new pathways. Thus we trade one pathway for two, however, the upside is that our increment is tripled from 36 to 108! Note that since Fermat Factoring with Families requires two pathways to begin with, it will now require four pathways as this process is repeated for the associated An:Group 2. Here is what this looks like.

Spreadsheet shows four starting values with their subsequent iterations
Fermat Factoring with EFC Anti-Key: 3 for N=19850791

Even with the extra pathways, we cut the number of trials by nearly a third. Examining the iteration set for the desired pathway An = 4552 + 108 × Z = {4552, 4660, 4768, 4876, … 44620}, we can calculate backwards and see that An = 4552 + 108 × 371 = 44620 and we can surmise that the other three pathways will overshoot. Note again, that the expected iterations match the trials, i.e. 371 × 4 + 3 = 1487.

This procedure can not be used on an unmodified N#5, as this value is immune to the EFT Key = 3, potentially requiring more pathways, so we use a trick we discovered in the last essay. When the value for (An) for Trial 1 is the same as An:3rd:Group 0, we set the value for (An) for Trial 2 equal to An:4th:Group 0 + 6 (or 12 as required) avoiding an unnecessary duplication of using the same starting value and sidestepping the issue of needing more pathways. Do not be surprised if sometimes we get a duplicate solution, since this is a patch, not a final solution.

If we new create a sequence of (An) values, starting with the each new pathway value, each incremented by 3 × 36 = 108, we can then apply the EFT Key = 5 as an overlay to the EFT Key = 3, and we see the following effect.

Spreadsheet showing pathways being eliminated with Anti-Keys:3 and 5
EFC Anti-Key: 3 & 5 for N=19850791

Note that the new sequence is completely immune to the previous EFT Key = 3. This is expected because all the original values in the solution set that did not match the ideal Mod 3 have been skipped over. In addition we would expect the new generation of pathways will now be immune to both Key = 3 and Key = 5.

From the illustration above it can be deduced that the An:4552 pathway, i.e. An = 4552 + 3 × 5 × 36 × Z = 4552 + 540 × Z = {4552, 5092, 5632, … }, will never generate a valid candidate. This applies to An = {4732, 4768, 4948} as well. There is a downside though, An = {4624, 4660, 4840, 4876, 4984} along with An = 4516 become the new pathways. Thus we go from two pathways to six pathways, however, the upside is that our increment is now 540! Note that since Fermat Factoring with Families requires two pathways to begin with, it will now require 12 pathways as this process is repeated for the associated An:Group 2. Here is what this looks like.

Spreadsheet with twelve pathways being iterated and factored
Fermat Factoring with EFC Anti-Key: 3 & 5 for N=19850791

Our spreadsheet is beginning to look more like a race car track than a factoring engine with so many starting values. Even with the extra pathways, we cut the number of trials by nearly 40%. Examining the iteration set for the winning… er… desired pathway An = 4660 + 540 × Z = {4660, 5200, 5740, 6280, … 44620}, we can calculate backwards and see that An = 4660 + 540 × 74 = 44620 and we can surmise that the other 11 pathways will overshoot.

Just to caution anyone playing with this method. First, N:Odd:Group = 1 and N:Odd:Group = 2, do not fill out the 12 pathways, but instead function with only 8. Second, this procedure can not be used on an unmodified N#5, see the discussion above. Then after this is fixed, some values, N#5:Even:Group = 0, do not fill out the 12 pathways and function with only 8 pathways. Although this complexity can be dealt with, it seems more logical to approach the problem differently.

Chapter 2 — The Next Step

We could fix our problems and continue this process and add even more Anti-Keys, but as good as that may be, I think that it would be selling the method short. My idea is basically shoot for the moon. It involves a continuous process of progressively running every Anti-Key until a solution is found. This works under the assumption that we will eventually land on the exact solution in one of the comparison sets. I know some of you understand where I am going, but for everyone else let me explain.

Spreadsheet with blocks showing a continuous set of iterations
Fermat Factoring with Progressive EFC Anti-Keys: N=19850791 — Series 7

I know that my first reaction to this process was “Wow!” (Not so much due to the novel spreadsheet layout, but mostly due to the fact that it calculated a solution!) Regardless, a few notes are needed. First, the actual spreadsheet is nowhere this compact. I have deleted 266 lines for this illustration. The little note box above each block tells how many blocks and lines are required to make the calculation. In addition, I have implanted several switches so that I can route the calculation though only the blocks that are absolutely needed. Second, this illustration only takes care of Pathway:A, showing Pathway:B requires a duplicate effort, i.e. the size of the spreadsheet is doubled. Third, I have tried to make the trial values represent the actual trials as they would logically be processed. Pathway:A uses odd numbers and Pathway:B uses even numbers, so the trial values represent all iterations involved. (Or at least I hope that is true.) Fourth, the peculiarities of the various series, are no longer an issue. Whether the blocks are minimally occupied or not, each pathway is accounted for separately.

So, this new process can find a solution in 157 trials. If I turn on the EFC, this drops to about 16 trials. This may not seem very fast, but let me drop your jaw. The Brute Force routine, renown for crushing Fermat, requires 177 trials. (For the mathematically pure, using a sieve with 500 values will resolve in 111 trials though.)

At the point where we finish we have approximately 2 × 3 × 4 × 6 = 144 individual variables. Doubled for two pathways, this yields 288 total branches to keep track of. That makes the data overhead about comparable with Sieve Factoring. Knowing how quickly the sieve transforms from friend to foe, we should anticipate some problems ahead. Thus managing the overhead process will be a priority.

We do have a huge consolation though. Our increments between trials is currently 3780 and increasing dramatically with each step! Furthermore, any branch can be transported to a secondary processor, as prior information can be deleted after making each step up, thus making the hypothetical calculation speed really fast. Think of a breeding pair of rabbits… er… rabbits overrunning Australia. The current population size does not include any of the original progenitors, but continues to grow anyway and occupy any available environment.

If we exhaust our data storage, the process can be continued by taking the existing population and using the racetrack model where we just continue to increment each pathway with the last increment value until we find a solution or until all branches exceed the Top An. In our example above, had it failed, we would take the six values in the last block {10600, 14380, 25720, 29500, 40840, 44620} along with all the parallel blocks and continue to increment them with 3780.

Top maximum effective value of (An) has never been anything that I was concerned about, but in this case with the potential of thousands of pathways I want to reduce the calculations as much as possible. The Top An or An:Top = (Xn:Top × Xn:Top + N)/(2 × Xn:Top) where Xn:Top = {1, 2, 3, 4, 5, … }. In our context we will limit the Xn:Top to prime numbers that we might use in preforming a pre-check, i.e. Xn:Top = {5, 7, 11, 13, 17, … }. Trivia time. If Xn:Top = 1 then An:Top = An:Last, the Family Name of the last integer family where (N) will appear, which in this case is An:Top = 9925396. We can also get this value with An:Last = (N +1)/2 = 9925396. This value tells us the parity of (An), whether odd or even. When performing Fermat Factoring, Xn:Top should never be set below 5.

I recommend running a pre-check using a Brute Force factoring routine and pushing the Xn:Top as high as possible. Just moving Xn:Top from 1 to 5 will reduce the An:Top from 9925396 to 1985082. Taking Xn:Top up to 19 reduces An:Top to 522399. If we were following a single pathway with increments of 36, we would have 275706, 55142, and 14512 trials respectively. Fermat Factoring really spins its wheels on the lower factors, so we need to remove this burden as much as possible.

Spreadsheet showing Progressive Factoring of a Series 5
Fermat Factoring with Progressive EFC Anti-Keys: N=20740961 — Series 5

Good news, Series 5 can be solved using either Path A or Path B with this method, so total trials should be approximately half of expected values. Also, no transforms are required, but the factoring is not impressive for all values. If we use the ×5 transform, the number will be immune to the Anti-Key:5, so we will have to live with it.

With the new process we find a solution in 380 trials. Our best effort with untranformed factoring is around 6679 trials. If I turn on the EFC, this drops to about 8 trials.

Does the increase in speed outweigh the burden of the increased overhead? Sadly, it appears so.

Chapter 3 — Final Notes

When I have a chance, I will try writing a JavaScript engine for this factoring routine to see exactly what problems the overhead create.

Here, again, I finish another article about Integer Families and their effect on Fermat Integer Factoring. As I continue to write these articles, I feel more and more like I am writing a wizard’s spell book than making observations in a mathematical discussion. Regardless, I have pretty much resolved myself to this fate, even though I continue to hope that some brilliant soul will some day look at it and show me the fatal flaw that makes it all completely unravel.

In the mean time, I continue to think it is only a matter of time before someone stumbles onto a very fast factoring routine, if not a direct solution, or perhaps the quantum computers will fire up and all this will become irrelevant.

Appendix 1 — Standard Definitions and Equations

  • […] — Square Root Function: Bn = √[An² − N]
  • ^… — Raise … to the Power of …: Bn² = An² − N
  • : … — Attributes: … is …: Z:Last:Even = Za:1st:Odd + Zb:1st:Odd
  • An — Fermat Factor of Base Value: An = (Yn + Xn)/2 where An ≥ √[N]
  • Bn — Fermat Factor of Separation Value: Bn = (Yn − Xn)/2 = √[An² − N]
  • Gi — Group Identifier: Gi = An mod 3 = {0, 1, 2}
  • N’ — Integer Prime Number: N’ = N’×1 ≠ N
  • N — Composite Integer & Factors: N = Xn×Yn ≠ N’
  • N — Fermat’s Factorization: N = An² − Bn² = (An − Bn)(An + Bn)
  • ParityAn:Parity is Odd or Even: An:Parity = (N + 1)/2:Parity
  • Si — Series Identifier: Si = N mod 6 = {#2, #3, #5, #7≡#1}
  • Top An — Maximum An Based on Xn Trials: An:Top = (Xn:Top² + N)/(2×Xn:Top)
  • Top Xn — Last Xn value used in Pre-Checks: If Xn = {5, 7, 11, 13}; Then Xn:Top = 13
  • …#5 — Series 5: … is Series 5: N#5 = X#5× Y#7 -or- X#7× Y#5
  • …#7 — Series 7: … is Series 7: N#7 = X#7× Y#7 -or- X#5× Y#5
  • Xn — Smaller Integer Factor of N: Xn = An − Bn where 1 < Xn ≤ Yn
  • Yn — Larger Integer Factor of N: Yn = An + Bn where Yn ≥ Xn > 1
  • Z — Temporary Variable: Z = {Za, Zb, Zc, … }

Appendix 2 — Spreadsheet Labels and Highlights

Table of Labels and Highlights used in the Spreadsheet Illustrations
Spreadsheet Labels and Highlights (that I generally follow)

Appendix 3 — Data Table

Table of Sample Values used in this Article
Spreadsheet Sample Data Values in Use

Appendix 4 — Family Integer Factoring Constants

Table of Family Factoring Traits used in this Article
Family Factoring Traits

--

--

JB Johnson

I am a science and technology junky and this is my place where I can share my ideas.