FINDING PRIME NUMBERS

James Constant

[email protected]

Unknown Prime Numbers are predicted using 
best nonlinear regression curve fits of known primes

See also Find Prime Numbers Using Euler's Formulahttp://www.coolissues.com/mathematics/Bprimes/bprimes.htm

Introduction

The number of primes in an interval x is = yx where y is the fraction of primes in that interval. There are many ways of finding or estimating fraction y, some accurate some only approximately accurate. The Prime Number Theorem (PNT) is an approximately accurate way of estimating fraction y.

The PNT states that the xth prime px is of the order (~) of xlogx or that the number of primes ~ x/logx.. The PNT, therefore, theoretically implies y~1/logx. A consequence of the PNT is that px~x2 which says that we can find knowing px, or find px knowing , within some order of magnitude. The PNT was proved by Hadamard and de la Vallee Pousson (independently) using Riemann's Hypothesis, after showing that the zeros of Riemann's zeta function cannot lie too far off the critical line. It is now known that Riemann's Hypothesis produces the result =Li(x)+O(xlogx) where Li(x)= is Gauss's integral and the O term is the order of the error.1 There are a number of more accurate ways of finding fraction y but, while the PNT applies to all primes, these are less general.

In a first example, one can use Eratosthene's sieve to compute the actual number of primes =[.]x in an interval x, in which [.] is the formula, program or algorithm used to compute fraction y. While the Eratosthenes sieve is accurate it, like most computers, is limited by computational difficulties and thus is confined to small primes. Eratosthenes computers are most efficient to find small primes less than about x=108 (8 digits). Programs also exist to find small primes other than sieves less than about x=10200 (200 digits).2

In a second example, one can use Mersenne's formula 2x-1=Mx to compute prime Mx when prime x is known. While Mersenne's formula is accurate it does not compute fraction y and is limited to a small number of primes. Mersenne computers are most efficient to find large primes. The largest known prime is a Mersenne prime with 2,098,960 digits and, indeed, seven of the eight largest primes are Mersenne primes. However, Mersenne primes are few in number--only 38 are known today.

It is well known, therefore, that the PNT is not an accurate predictor of the number of primes in any interval x.3 Equally well known is that while a number of other ways of finding the fraction y, or primes themselves, are available they are limited to small primes or only to few large primes. In what follows, I will describe a method of finding primes in any interval x.

Best Nonlinear Regression Curve Fittings

The fraction of primes in an interval x is

(1) ..........y = /x

in which the data points (y,x) are those of known primes px.. Since <x, y is a descending function of x. A best fit nonlinear regression curve of points (y,x) is easily found, using available curve fitting programs4, of the form of the reciprocal logarithmic curve

(2).......... y = ........... a = 0.062095933 ......... b = 0.86509468

in which ln is the natural logarithm. An even better fit is the Harris curve

(3).......... y = ..... a = -28.91805 ..... b = 29.175022 ..... c = 0.025644389

Curves (2) and (3) were obtained using 9 values of x and available in the literature. 5 The actual data points (y,x) used are shown in the table below.6 It is emphasized that curves (2) and (3) were obtained using actual prime data points. Errors in x therefore are not present; only errors in y due to best fittings are present. Clearly, more data points can be used and other best fit curves might be possible. Here, the idea is that a best fit descending curve can be obtained in the y-x plane on any set (y,x) of actual primes. In what follows, I use equation (2) not because it is the most accurate fit but because it includes the PNT form y~1/lnx (when a=0 and b=1).

In such descending best fit y-x curves, the product yx represents the number of primes in the interval x, where x=px, is the abscissa side of rectangle yx. Consider now the statement

(4).......... (y - y)(x + x) - yx = 1

in which (y - y)(x + x) represents the number of primes in the interval (x + x) and yx represents the number of primes in the interval x and their difference represents 1 prime in the interval x. More generally, the right hand side of (4) can be any number n=1,2, 3, .... representing the number of primes in the interval x.

Equation (4) can be further simplified as

(5).......... (y - y) x = 1 + xy

and, if y = k x, equation (5) can be written as a second order equation

(6).......... k(x)2 + (kx - y)x + 1 = 0

with solution

(7) .......... x =

in which k,x,y are known. For example, the derivative of (2) is

(8) ............. = -- = k

which, when used in (7), produces the end result

(9).......... x =

Equation (9) is the interval between primes. Thus, if we know the prime at (y,x) we can predict the location of the next prime at (y--y, x+ x).

In this method the sources of error are, errors due to arithmetic accuracy; errors in y and k due to the best fit curve; and errors due in linearizing the square root term in (7).

Equation (9) says that the next prime after prime px is located approximately at the end of interval x + x. Some examples are shown in the next table

x=px ..................................... ............. y= /px ........ y2................... by2 ................ y + by2

--------------------- ......-------- ....------------- ....------------ ...------------....------------

1. ............................7 ....................4 ......0.57142850..... 0.2955913..... 0.2557144.....0.8271429

2. .........................97 ..................25...... 0.25773190 .....0.0664257..... 0.0574645 .....0.3151964

3. .......................997 ................168 ......0.16850550..... 0.0283941 .....0.0245864 .....0.1930919

4. ....................7,919 .............1,000 ......0.12627850..... 0.0159462 .....0.0137949 .....0.1400734

5. ...............104,729 ............10,000..... 0.09548450..... 0.0091172..... 0.0078945 .....0.1033790

6. ............1,299,827 ..........100,008..... 0.07693940 .....0.0059196 .....0.0051257..... 0.0820651

7. .......100,711,433 .......5,800,000 .....0.05759028..... 0.0033166 .....0.0028718 .....0.0604620

8. .......104,917,093 .......6,000,000 .....0.05718801 .....0.0032704 .....0.0028318..... 0.0600198

9. ....2,000,000,011 .....98,000,000 .....0.04900000 .....0.0024010 .....0.0020770......0.0510770

------------------------------------------------------------------------------------------------------------------------------------------

x = 1/(y + by2) ...........px+ (computed) ............................. px+(actual) .....numbers missed -------------------........-------------------------------- ....------------------.. ---------------------

1. .......1.2089809 ..........7 + 1 = 8 ..............................................................................11 ...................-3 .................

2. .......3.1726250 ..........97 + 3 = 100 .....................................................................101 ...................-1 ................

3. ......5.1788811 ...........997 + 5 = 1,002 ............................................................1,009 ...................-7 ................

4. ......7.1391142 ..........7919 + 7 = 7926 ...........................................................7,927 ....................-1 ...............

5. ......9.6731444 ..........104,729 + 10 = 104,739 .........................................104,743 ...................-4 ...............

6. ....12.1854470 ..........1,299,827 + 12 = 1,299,839 ..............................1,299,833 ..................+6 ..............

7. ....16.5393130 ..........100,711,433 + 17 = 100,711,450 ................100,711,463 .................-13 ..............

8. ....16.6611680 ..........104,917,093 + 17 = 104,917,110 ................104,917,097 ...............+13 ..............

9. ....19.5782830 ..........2,000,000,011 + 20 = 2,000,000,031 ......2,000,000,033 ................. -2 ..............

In the foregoing table, the x column shows that the separation between primes increases with x. I compute px+ by adding x to px rounded to the nearest integer. I then compare px+ (computed) with px+ (actual) and note the numbers missed as the difference. The numbers missed column shows that after a prime px+ (computed) is predicted, relatively few odd numbers only need be tested for primality. Of course, numbers missed will decrease as better curve fits are used and as arithmetic accuracy is improved. I have made no checks of numbers in the table and have made no attempt for accuracy but my purpose is to illustrate the method and its predictive value. Nor have I ventured beyond x=2x109 (10 digits). At this point I note that while the PNT theoretically implies an approximate value y~1/logx it is not a best fit of the actual (y,x) data. This, of course, is necessary if one asks what px+ might be accurately when px is known.

As to equation (2), I note that as x increases beyond some point by2<<y, and equation (9) simplifies as

(10) .......... x = = blnx

in which only one multiplication and a logarithm are needed to obtain x. Multiplication, of course, is the most difficult aspect of finding large primes. As to better fit equation (3),

(11) .......... x = = bxc

in which only one multiplication and an exponential are needed to obtain x.

The foregoing table shows that the separation between primes increases with x. However, a cursory look at tables of primes shows that one cannot assume that consecutive primes increase with x. A recent study of the separation gaps between primes concludes that while primes increase their separations with x, they do so not consecutively but in groups of primes. Thus, we have

6 =2x3 gaps between primes through x~1.7427x1035

30 =2x3x5 gaps between primes thereafter through x~10425

210 =2x3x5x7 gaps between primes thereafter, etc.

in which the separation gaps occur as factorials of primes.7 While this study is interesting and true for the intervals covered, its usefulness is not apparent. Basically, in the present method, finding the next prime is done by computing x at point x=px not by taking some number (product of primes) which occurs most often in a wide interval in which x=px is located.

The most interesting application of the present method will occur in the systematic finding of primes. Unattended computers might be put to this task. It is only necessary that a best fit curve, not necessarily (2) or (3), is established. Since y is a descending curve which appears to approach zero in the limit, it is conceivable that highly accurate linear fits may be possible at large x, at least over some portion of y. Finally, since RSA cryptography is based on factoring large integers, products of primes, and since the present non-factoring method filters most, if not all, non-primes, the method may considerably reduce the computation amount and time it takes to find RSA primes.

______________________________________________________

1 See pages 5-7 in Chris Caldwell's "How Many Primes Are There" at http://www.utm.edu/research/primes/howmany.shtml

2 See Chris Caldwell's "Prime Links" at http://primes.utm.edu/links/programs

3 Gauss's integral Li(x) and Legendre's 1/(logx - 1) equations for estimating y are only slightly different from the PNT.

4 CurveExpert 1.3 from http://www2.msstate.edu/~dgh2/cvxpt.htm This program can handle numbers x>1020 but can only handle numbers y with 10 decimal digits. Its accuracy, therefore is limited to numbers x<1010. In tables above, I limit numbers y to an accuracy of 8 decimal digits due to arithmetic accuracy.

5 See Chris Caldwell's "Prime Links" at http://primes.utm.edu/links/lists_of_small_primes/first_n_primes

6 The reader can generate curves (2) and (3) by plugging the (y,x) data points into CurveExpert 1.3.

7 See Ian Stewart, Jumping Champions, pages 106-107, Scientific American, December, 2000. http://www.sciam.com

Copyright 2000 by James Constant

By the same author: http://www.coolissues.com/mathematics/sameauthor.htm