FINDING PRIME NUMBERS
James Constant
Unknown
Prime Numbers
are predicted using
best nonlinear regression curve fits of known
primes
See also Find Prime Numbers Using Euler's Formulahttp://www.coolissues.com/mathematics/Bprimes/bprimes.htm
Introduction
The number of primes
in an interval x
is
=
yx
where y
is the fraction of primes in that interval. There are many ways
of finding or estimating fraction y,
some accurate some only approximately accurate. The Prime Number
Theorem (PNT) is an approximately accurate way of estimating
fraction y.
The PNT states that
the
xth prime px
is of the order (~) of
xlogx or that the number of
primes
~ x/logx..
The PNT, therefore,
theoretically implies y~1/logx.
A consequence of the PNT is that
px~x2
which says that we can find
knowing
px,
or
find px
knowing
, within some order of
magnitude. The PNT was
proved by Hadamard and de la Vallee Pousson (independently) using
Riemann's Hypothesis, after showing that the zeros of Riemann's
zeta function
cannot lie too far off the
critical line. It is
now known that Riemann's Hypothesis produces the result
=Li(x)+O(x
logx)
where Li(x)=
is
Gauss's integral and the O
term is the order of the error.1
There are a number of
more accurate ways of finding fraction y
but, while the PNT applies to all primes, these are less general.
In
a first
example, one can use Eratosthene's sieve to compute the actual
number of primes
=[.]x
in
an interval x, in
which
[.] is the formula, program or
algorithm
used to compute fraction y.
While the Eratosthenes sieve is accurate it, like most computers,
is limited by computational difficulties and thus is confined to
small primes. Eratosthenes computers are most efficient to find
small primes less than about x=108
(8 digits). Programs also exist
to find
small primes other than sieves less than about x=10200
(200 digits).2
It
is well known,
therefore, that the PNT is not an accurate predictor of the
number of primes
in any interval
x.3
Equally
well known is that while a number of other ways of finding the
fraction y,
or primes
themselves, are available they are limited to small primes or
only to few large primes. In what follows, I will describe a
method of finding primes in any interval x.
Best Nonlinear Regression Curve Fittings
The
fraction of
primes
in an interval x
is
in
which the data
points (y,x)
are those
of known primes px..
Since
<x,
y is a descending function of
x. A best fit nonlinear
regression curve of
points (y,x)
is easily
found, using
available
curve fitting programs4,
of
the form of the reciprocal logarithmic curve
(2)..........
y =
........... a
= 0.062095933 .........
b
= 0.86509468
in which ln is the natural logarithm. An even better fit is the Harris curve
(3)..........
y =
..... a = -28.91805 ..... b =
29.175022 ..... c = 0.025644389
Curves
(2) and (3)
were obtained using 9 values of x
and
available in the
literature. 5
The actual data points (y,x)
used are shown in the table below.6
It is emphasized that
curves (2) and (3) were obtained using actual prime data points.
Errors in x
therefore
are not present; only errors in y
due to best fittings are present. Clearly, more data points can
be used and other best fit curves might be possible. Here, the
idea is that a best fit descending curve can be obtained in the y-x
plane on any set (y,x)
of actual primes. In what follows, I use equation (2) not because
it is the most accurate fit but because it includes the PNT form y~1/lnx
(when a=0
and b=1).
(4)..........
(y -
y)(x +
x) - yx = 1
Equation (4) can be further simplified as
(5)..........
(y -
y)
x = 1 + x
y
and,
if
y = k
x,
equation (5) can be written as a second order equation
(6)..........
k(
x)2
+ (kx - y)
x + 1 = 0
in which k,x,y are known. For example, the derivative of (2) is
which, when used in (7), produces the end result
7. .......100,711,433 .......5,800,000 .....0.05759028..... 0.0033166 .....0.0028718 .....0.0604620
8. .......104,917,093 .......6,000,000 .....0.05718801 .....0.0032704 .....0.0028318..... 0.0600198
9. ....2,000,000,011 .....98,000,000 .....0.04900000 .....0.0024010 .....0.0020770......0.0510770
------------------------------------------------------------------------------------------------------------------------------------------
1. .......1.2089809 ..........7 + 1 = 8 ..............................................................................11 ...................-3 .................
2. .......3.1726250 ..........97 + 3 = 100 .....................................................................101 ...................-1 ................
3. ......5.1788811 ...........997 + 5 = 1,002 ............................................................1,009 ...................-7 ................
4. ......7.1391142 ..........7919 + 7 = 7926 ...........................................................7,927 ....................-1 ...............
5. ......9.6731444 ..........104,729 + 10 = 104,739 .........................................104,743 ...................-4 ...............
6. ....12.1854470 ..........1,299,827 + 12 = 1,299,839 ..............................1,299,833 ..................+6 ..............
7. ....16.5393130 ..........100,711,433 + 17 = 100,711,450 ................100,711,463 .................-13 ..............
8. ....16.6611680 ..........104,917,093 + 17 = 104,917,110 ................104,917,097 ...............+13 ..............
9. ....19.5782830 ..........2,000,000,011 + 20 = 2,000,000,031 ......2,000,000,033 ................. -2 ..............
In
the foregoing
table, the
x column
shows
that the separation between primes increases with x.
I compute px+
by adding
x
to px
rounded to the nearest integer. I then compare px+
(computed)
with px+
(actual)
and note the numbers missed as the difference. The numbers missed
column shows that after a prime px+
(computed)
is predicted, relatively few odd numbers only need be tested for
primality. Of course, numbers missed will decrease as better
curve fits are used and as arithmetic accuracy is improved.
I have made no checks of numbers in the table and have made no
attempt for accuracy but my purpose is to illustrate the method
and its predictive value. Nor have I ventured beyond x=2x109
(10 digits). At this point I
note that while
the PNT theoretically implies an approximate value y~1/logx
it is not a best fit of the
actual (y,x)
data. This, of course, is necessary if one asks what px+
might
be accurately when px
is known.
in
which only one
multiplication and an exponential are needed to obtain
x.
6 =2x3 gaps between primes through x~1.7427x1035
30 =2x3x5 gaps between primes thereafter through x~10425
210 =2x3x5x7 gaps between primes thereafter, etc.
in
which the
separation gaps occur as factorials of primes.7
While this study is
interesting and true for the intervals covered, its usefulness is
not apparent. Basically, in the present method, finding the next
prime is done by computing
x
at point x=px
not by taking some number (product of primes) which occurs most
often in a wide interval in which x=px
is located.
______________________________________________________
1 See pages 5-7 in Chris Caldwell's "How Many Primes Are There" at http://www.utm.edu/research/primes/howmany.shtml
2 See Chris Caldwell's "Prime Links" at http://primes.utm.edu/links/programs
3 Gauss's integral
Li(x) and
Legendre's 1/(logx - 1)
equations for estimating y
are only slightly different from the PNT.
4 CurveExpert 1.3 from http://www2.msstate.edu/~dgh2/cvxpt.htm This program can handle numbers x>1020 but can only handle numbers y with 10 decimal digits. Its accuracy, therefore is limited to numbers x<1010. In tables above, I limit numbers y to an accuracy of 8 decimal digits due to arithmetic accuracy.
5 See Chris Caldwell's "Prime Links" at http://primes.utm.edu/links/lists_of_small_primes/first_n_primes
6 The reader can generate curves (2) and (3) by plugging the (y,x) data points into CurveExpert 1.3.
7 See Ian Stewart, Jumping Champions, pages 106-107, Scientific American, December, 2000. http://www.sciam.com
Copyright
2000 by James Constant
By the same author: http://www.coolissues.com/mathematics/sameauthor.htm