I’m reading this fantastic book Shape by Joshua Ellenberg, and in one of the chapters it talks about rational approximations of irrational numbers. That is, like how 335/113 is very close to pi. He shows a technique for figuring out rational approximations that proves that it is possible to find an approximation that is better than dividing by a power of 10 (like 31415/10000). In other words, you can find a denominator that is smaller than 10000 that will lead to an error that is less than 1/10000. I don’t know, he explains it better in the book, but there are a few exercises that he leaves unshown that would be fun to create in R.
The first one is how he shows that the remainders of the rational approximations of pi cluster around 1/113. That’s super cool!
The other one is how he calculates a rational approximation of the golden ratio as 987/610.
Pi Is Kind of Rational
First let me try to recreate the clustering of the rational approximations of pi.
To do this, let’s create a helper function that takes an irrational number, multiplies it by an integer n, then returns an integer that contains the first few digits of the remainder. For example, if you multiply 8*pi, you get 25.13274… and the function returns 133.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
multiple_remainder <- function(n, irrat, digits=3) {
(((n*irrat) %% 1)*10^digits) %/% 1 + 1
}
multiple_remainder(8, pi, 3)
## [1] 133
Now we will get all the remainders for n from 1 to 1000.
n = 1000
boxes <- data.frame(n_multiple = 1:n, box_no = rep(NA, n))
boxes$box_no <- sapply(boxes$n_multiple, multiple_remainder, irrat = pi, log10(n))
boxes <- tibble(n_multiple = 1:n)
boxes <- boxes %>% mutate(box_no =
multiple_remainder(n_multiple, pi, log10(n)))
head(boxes)
## # A tibble: 6 × 2
## n_multiple box_no
## <int> <dbl>
## 1 1 142
## 2 2 284
## 3 3 425
## 4 4 567
## 5 5 708
## 6 6 850
table(boxes$box_no)
##
## 9 18 27 36 44 45 53 54 62 71 80 89 98 106 107 115
## 8 8 9 9 1 8 6 3 9 9 9 9 9 2 7 7
## 116 124 133 142 151 160 168 169 177 186 195 204 213 221 222 230
## 2 9 9 9 8 8 5 4 9 9 9 9 9 1 8 6
## 231 239 248 257 266 275 283 284 292 293 301 310 319 328 337 345
## 3 9 9 9 9 9 2 7 7 1 8 9 9 9 9 5
## 346 354 363 372 381 390 398 399 407 408 416 425 434 443 452 460
## 4 9 9 9 9 9 1 8 6 3 9 9 8 8 9 3
## 461 469 470 478 487 496 505 514 522 523 531 540 549 558 567 575
## 6 8 1 9 9 9 9 9 5 4 9 9 9 9 9 1
## 576 584 585 593 602 611 620 629 637 638 646 647 655 664 673 682
## 7 6 2 9 9 9 9 9 3 6 8 1 9 9 9 9
## 691 699 700 708 717 726 735 744 752 753 761 762 770 779 788 797
## 9 5 4 9 8 8 8 9 2 7 7 2 9 9 9 9
## 806 814 815 823 824 832 841 850 859 868 876 877 885 894 903 912
## 9 4 5 8 1 9 9 9 8 8 5 3 9 9 9 9
## 921 929 930 938 939 947 956 965 974 983 991 992 1000
## 9 2 7 7 2 9 9 9 9 9 4 5 8
We see they are not evenly distributed between 0 and 1000. There are more about every 9 (i.e., 9, 18, 27,..,).
Which ones create these remainders? Ones that are 113 apart as the author suggests!
boxes %>% filter(box_no == 9)
## # A tibble: 8 × 2
## n_multiple box_no
## <int> <dbl>
## 1 106 9
## 2 219 9
## 3 332 9
## 4 445 9
## 5 558 9
## 6 671 9
## 7 784 9
## 8 897 9
Here’s another example:
boxes %>% filter(box_no == 850)
## # A tibble: 9 × 2
## n_multiple box_no
## <int> <dbl>
## 1 6 850
## 2 119 850
## 3 232 850
## 4 345 850
## 5 458 850
## 6 571 850
## 7 684 850
## 8 797 850
## 9 910 850
This cyclic distribution of the remainders is reminiscent of a rational number, but not quite. I think the author says that the golden ratio is a much more irrational irrational number than pi because it does not have this cyclical nature to its remainders (see below).
Calculating a Rational Approximation
The idea behind deriving the rational approximation is: if you have n x pi remainder about equal to b and m x pi remainder also about equal to b, then the difference between the two (m pi - n pi) or (m-n) x pi is going to be pretty close to an integer. And (m-n) x pi / (m - n) is going to be the rational approximation of pi.
What’s the first pair of numbers we get in pi multiple remainders?
boxes %>% group_by(box_no) %>%
summarize(n = n())
## # A tibble: 141 × 2
## box_no n
## <dbl> <int>
## 1 9 8
## 2 18 8
## 3 27 9
## 4 36 9
## 5 44 1
## 6 45 8
## 7 53 6
## 8 54 3
## 9 62 9
## 10 71 9
## # … with 131 more rows
It’s box number 9.
boxes %>% group_by(box_no) %>%
summarize(n = n()) %>% filter(n > 8)
## # A tibble: 72 × 2
## box_no n
## <dbl> <int>
## 1 27 9
## 2 36 9
## 3 62 9
## 4 71 9
## 5 80 9
## 6 89 9
## 7 98 9
## 8 124 9
## 9 133 9
## 10 142 9
## # … with 62 more rows
So in theory, all of these should result in reasonable approximations of pi. Let’s try.
m1 <- 106
m2 <- 219
(m2 - m1)* pi # numerator
## [1] 355
(m2 - m1) # denominator
## [1] 113
OMG. The very first pair of numbers we tried got us to the famous 355/113 rational approximation of pi! Wait a minute, does that mean that any pair gets us there?
m1 <- 332
m2 <- 784
(m2 - m1)* pi # numerator
## [1] 1420
(m2 - m1) # denominator
## [1] 452
But 1420 over 452 simplifies to uhhh 355/113!
Ok, let’s try a different box. It looks like a lot of the boxes are multiples of 9, so let’s try one that’s not. Let’s use box 71.
boxes %>% filter(box_no == 71)
## # A tibble: 9 × 2
## n_multiple box_no
## <int> <dbl>
## 1 57 71
## 2 170 71
## 3 283 71
## 4 396 71
## 5 509 71
## 6 622 71
## 7 735 71
## 8 848 71
## 9 961 71
m1 <- 735
m2 <- 961
(m2 - m1)* pi # numerator
## [1] 709.9999
(m2 - m1) # denominator
## [1] 226
WTF! It’s the same milu! That’s just nuts.
Ok, let’s try a different irrational number, maybe the golden ratio…
The golden ratio is (1 + square root of 5) / 2.
phi <- (1 + sqrt(5))/2
boxes <- tibble(n_multiple = 1:n)
boxes <- boxes %>% mutate(box_no =
multiple_remainder(n_multiple, phi, log10(n)))
boxes %>% group_by(box_no) %>%
summarize(n = n()) %>% arrange(-n)
## # A tibble: 891 × 2
## box_no n
## <dbl> <int>
## 1 6 2
## 2 11 2
## 3 14 2
## 4 16 2
## 5 19 2
## 6 21 2
## 7 24 2
## 8 29 2
## 9 34 2
## 10 83 2
## # … with 881 more rows
boxes %>% filter(box_no == 6)
## # A tibble: 2 × 2
## n_multiple box_no
## <int> <dbl>
## 1 89 6
## 2 699 6
m1 <- 89
m2 <- 699
(m2 - m1)* phi # numerator
## [1] 987.0007
(m2 - m1) # denominator
## [1] 610
987/610
## [1] 1.618033
That’s the ratio given in the book! 987/610! It turns out that 610 and 987 are two Fibonacci sequence numbers anyway and actually the last two before 1000. What if we did this for 10000? Would it be the two greatest Fibonacci numbers that are less than 10000? I suspect yes.
n <- 10000
boxes <- tibble(n_multiple = 1:n)
boxes <- boxes %>% mutate(box_no =
multiple_remainder(n_multiple, phi, log10(n)))
boxes %>% group_by(box_no) %>%
summarize(n = n()) %>% arrange(-n)
## # A tibble: 8,901 × 2
## box_no n
## <dbl> <int>
## 1 3 2
## 2 15 2
## 3 22 2
## 4 34 2
## 5 46 2
## 6 65 2
## 7 68 2
## 8 77 2
## 9 87 2
## 10 96 2
## # … with 8,891 more rows
boxes %>% filter(box_no == 3)
## # A tibble: 2 × 2
## n_multiple box_no
## <int> <dbl>
## 1 1597 3
## 2 8362 3
m1 <- 1597
m2 <- 8362
(m2 - m1)* phi # numerator
## [1] 10946
(m2 - m1) # denominator
## [1] 6765
10946/6765
## [1] 1.618034
Yeah, that sort of worked but it got the Fibonacci number right after 10000! Why did that happen? That’s the 20th one. The 18th one is 4181 which is still a pretty good approximation. I mean, the definition of the Fibonacci sequence is that the ratio of any two consecutive numbers will approach the golden ratio as the numbers get larger, so it’s not that shocking. In fact, any two series of numbers in the sequence should get better at approximating the golden ratio.
6765/4181
## [1] 1.618034
I think we could do this for any irrational number. How about e?
n <- 1000
boxes <- tibble(n_multiple = 1:n)
boxes <- boxes %>% mutate(box_no =
multiple_remainder(n_multiple, exp(1), log10(n)))
boxes %>% group_by(box_no) %>%
summarize(n = n()) %>% arrange(-n)
## # A tibble: 973 × 2
## box_no n
## <dbl> <int>
## 1 13 2
## 2 24 2
## 3 26 2
## 4 28 2
## 5 35 2
## 6 37 2
## 7 39 2
## 8 41 2
## 9 46 2
## 10 48 2
## # … with 963 more rows
boxes %>% filter(box_no == 13)
## # A tibble: 2 × 2
## n_multiple box_no
## <int> <dbl>
## 1 39 13
## 2 575 13
m1 <- 39
m2 <- 575
(m2 - m1)* exp(1) # numerator
## [1] 1456.999
(m2 - m1) # denominator
## [1] 536
1457
## [1] 1457
536
## [1] 536
1457/536
## [1] 2.718284
(1457/536 - exp(1))/exp(1)
## [1] 6.451246e-07
Not too shabby. This is a less than 1 in a million error.
What we need is to look not just at a random approximation but all the approximations that are calculated by this method… There are 973 pairs after all that fall into the same box.
Oh well, maybe an exercise for another day…
One more before I give up… let’s do square root of 2.
n <- 1000
boxes <- tibble(n_multiple = 1:n)
boxes <- boxes %>% mutate(box_no =
multiple_remainder(n_multiple, sqrt(2), log10(n)))
boxes %>% group_by(box_no) %>%
summarize(n = n()) %>% arrange(-n)
## # A tibble: 912 × 2
## box_no n
## <dbl> <int>
## 1 48 2
## 2 53 2
## 3 56 2
## 4 61 2
## 5 72 2
## 6 117 2
## 7 119 2
## 8 122 2
## 9 125 2
## 10 127 2
## # … with 902 more rows
Lots of boxes! I guess that means square root of 2 is kind of like the golden ratio in that there are not a lot of collisions of remainders. There were 912 rows in square root of 2 table but 973 in the e table and 891 in the phi table. This compares to 72 in the pi table…
box_to_check <- 48
m1 <- boxes %>% filter(box_no == box_to_check) %>% select(n_multiple) %>% min
m2 <- boxes %>% filter(box_no == box_to_check) %>% select(n_multiple) %>% max
m1
## [1] 524
m2
## [1] 932
(m2 - m1)* sqrt(2) # numerator
## [1] 576.9991
(m2 - m1) # denominator
## [1] 408
So here we have 577/408 = 1.4142157. Pretty close to sqrt(2) = 1.4142136!
Discussion
Again, it’s getting late and I’m getting sleepy so I’m giving up before we take this too much farther. Some ideas would be:
- Create a function that could automatically return the best rational approximation of 3 digit numbers or 2 digit numbers for any irrational number.
- Look at why the distribution of remainders is broader for certain irrational numbers than for others? Why is pi < phi < sqrt(2) < e as far as boxes? Doesn’t that imply that e is “more irrational” than phi? But the author said that phi is the most irrational?
- Why does my function result in getting numbers that are bigger than 1000 when I’m only asking for numbers up to 1000?