jlm-blog
~jlm

9-Jul-2019

Quine, Hofstadter, Gödel again

Filed under: math, philosophy — jlm @ 20:25

During some recent traveling, I re-read some sections of Douglas Hofstadter’s Gödel, Escher, Bach. Of particular interest was a step in the proof of Kurt Gödel’s first incompleteness theorem which involved substituting a variable of a formula with the formula itself. Hofstadter calls this peculiar operation “quining” after the logician W. V. Quine, who wrote extensively about self-reference. As with the previous times I read through that part, I noticed that the operation didn’t specify a variable for substitution like substitutions generally do, but instead performed substitution on all free variables, which is something I haven’t encountered anywhere else. This oddity wasn’t even mentioned, much less justified. Unlike after my previous reads, this time I decided to figure out why it was performed that way.

Now, the more interesting parts of Gödel’s proof involve what we now call Gödel coding to demonstrate that classes of logical systems can represent quining their own formulas and verifying their own proofs, the latter of which was very surprising at the time. But those parts turn out to be unnecessary for dealing with this small facet of Gödel’s proof, so let’s just deal with the proof’s final proposition, which is comparatively simple and straightforward: given that a logical system supports the operations of quining and proof verification (and logical conjunction and negation, and an existential quantifier), that logical system is incomplete (or inconsistent).
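Incidentally, quining is easy to play with in a programming language, where the analogue is a quine: a program that prints its own source by substituting a representation of its own text into itself. Here’s the classic two-line construction in Python (comments omitted so the output matches the source exactly):

    s = 's = %r\nprint(s %% s)'
    print(s % s)

The string s is a template with one free slot (%r), and s % s fills that slot with the template itself, much as Hofstadter’s operation substitutes a formula in for its own free variable.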


2-Aug-2016

Solving the quartic using “obvious” steps

Filed under: math — jlm @ 21:40

Looking over the derivations of solutions to quartics (fourth-degree polynomials), I was a little disturbed at how, except for Descartes’s, the derivations all have steps where “magic” occurs, by which I mean that a step does something mathematically valid but very unintuitive, which just happens to make stuff later on work out just right.

So I wondered whether one could take an approach like Ferrari’s, but do it without his “magic”. The basic idea of Ferrari’s approach is to convert the depressed* quartic equation f(x) = x⁴ + αx² + βx + γ = 0 into (x²+p)² = m(x+q)², because taking the square root of both sides produces the quadratic equation x²+p = ±(x+q)√m.

(x²+p)² = m(x+q)² expands into x⁴ + 2px² + p² = m(x²+2xq+q²), which is x⁴ + (2p−m)x² − 2mqx + p² − mq² = 0. We can name that polynomial g(x). From above, we know that the roots of g are the same as the roots of x² ± (x+q)√m + p, and the quadratic formula easily gives us those roots in terms of m, p, & q.

g ≡ f iff α = 2p−m and β = −2mq and γ = p²−mq². Thus our task is to get values for p, q, & m which satisfy these three equations.
m = −β/(2q) and p = ½(α+m), so p = ½(α − β/(2q)).
γ = p² − mq² = p² + βq²/(2q) = p² + ½βq = [½(α − β/(2q))]² + ½βq = ¼[α² − αβ/q + β²/(4q²) + 2βq].
4γq² = α²q² − αβq + ¼β² + 2βq³.
2βq³ + (α² − 4γ)q² − αβq + ¼β² = 0, which is just a cubic in q, and cubics we’re presumed to know how to solve. Using q, we get m and p from m = −β/(2q) and p = ½(α+m).

With these values of m, p, & q, f ≡ g. So the roots of g(x) from above are the roots of f(x), et voilà, we’re done, no magic needed.
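For anyone who wants to check the recipe numerically, here’s a sketch in Python with numpy (the function and variable names are mine; it assumes β ≠ 0, since with β = 0 the resolvent cubic degenerates, but then the quartic is biquadratic and easy anyway):

    import numpy as np

    def solve_depressed_quartic(alpha, beta, gamma):
        """Roots of x^4 + alpha*x^2 + beta*x + gamma, assuming beta != 0."""
        # The resolvent cubic in q derived above:
        # 2*beta*q^3 + (alpha^2 - 4*gamma)*q^2 - alpha*beta*q + beta^2/4 = 0
        cubic = [2*beta, alpha**2 - 4*gamma, -alpha*beta, beta**2/4]
        q = next(r.real for r in np.roots(cubic)
                 if abs(r.imag) < 1e-9 and abs(r) > 1e-9)
        m = -beta / (2*q)
        p = (alpha + m) / 2
        sm = np.sqrt(complex(m))           # sqrt(m), complex in case m < 0
        roots = []
        for s in (+1, -1):                 # x^2 + p = +/- sqrt(m)*(x + q)
            b, c = -s*sm, p - s*sm*q       # i.e. x^2 + b*x + c = 0
            d = np.sqrt(b*b - 4*c)
            roots += [(-b + d)/2, (-b - d)/2]
        return roots

    # x^4 - 25x^2 + 60x - 36 = (x-1)(x-2)(x-3)(x+6):
    print(sorted(r.real for r in solve_depressed_quartic(-25, 60, -36)))

which prints (up to floating-point fuzz) the roots −6, 1, 2, 3.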

 

*A depressed nth degree polynomial is one in which the (n−1)th term has a coefficient of zero, so a depressed quartic is a quartic equation with no cubic term. For any nth degree polynomial F(x) = xⁿ + aₙ₋₁xⁿ⁻¹ + aₙ₋₂xⁿ⁻² + …, there is a depressed nth degree polynomial f(x) = xⁿ + bₙ₋₂xⁿ⁻² + … such that F(x+s) = f(x), because the coefficient of the (n−1)th term of F(x+s) is ns + aₙ₋₁, which can always be made zero by a suitable choice of s.
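A quick sympy check of that claim in the quartic case (symbol names mine):

    import sympy as sp

    x, a3, a2, a1, a0 = sp.symbols('x a3 a2 a1 a0')
    F = x**4 + a3*x**3 + a2*x**2 + a1*x + a0
    # Shift by s = -a3/4 (that's -a_{n-1}/n for n = 4):
    f = sp.expand(F.subs(x, x - a3/4))
    print(f.coeff(x, 3))   # prints 0: the cubic term is gone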

21-Oct-2014

Why was algebra so difficult to discover?

Filed under: math — jlm @ 18:48

(tl;dr: Asking a bunch of questions, and not making any guesses.)
Consider integer arithmetic prior to the discovery of algebra. How do you go about a really foundational task such as defining division? You give everything involved names, like so: A number called the dividend divided by another number called the divisor is a number called the quotient with a non-negative number called the remainder if the divisor times the quotient plus the remainder is the dividend and the remainder is smaller than the divisor. How do you get anything done with that kind of word salad? No, when you want to actually do some number theory, you go with: n÷d = q R r is equivalent to n = q∙d + r ∧ 0 ≤ r < d. This terminology is taught because you teach division well before you teach algebra, but it’s so useless afterwards that I may well never have used this sense of the word “dividend” in all the years since I learned algebra. Anyone skeptical that basic grade school arithmetic is far harder to grasp without algebra is invited to explain how come long division gets the right answer without using it.
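(The compact definition even doubles as a runnable check. In Python, for instance:

    # n = q*d + r  and  0 <= r < d, spot-checked with the built-in divmod:
    for n, d in [(17, 5), (42, 7), (99, 10)]:
        q, r = divmod(n, d)
        assert n == q*d + r and 0 <= r < d

Try writing that assertion in the dividend/divisor/quotient/remainder word salad.)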

So fine, algebra is incredibly useful, but utility has no relevance to ease of understanding. I remember having some confusion when I was introduced to it. It was along the lines of: “What number is x?” “That can be any number.” “*boggle*”. Yet, those named terms above, they’re just placeholders too: “What number is the dividend?” “That can be any number.” “Oh, okay.” — How come that’s different?

The Greeks were using letters to represent arbitrary points in geometry centuries before the Arabs did the same for arbitrary numbers. In teaching geometry we don’t get the problem “What point is A?” “That can be any point.” “*boggle*” — What makes “x can be any number” so much harder than “A can be any point”?

21-Apr-2012

Fun with metafunctions

Filed under: math — jlm @ 23:14

In mathematics, a function is a mapping from one set (the domain) to another (not necessarily distinct) set (the range). Now, you can cause all kinds of weirdness if you let elements of these sets be functions themselves. Like, you can define a function k from integers to functions from integers to integers, mapping each integer to the “constant” function of that integer: if j = k(x) for some x in Z, then j(y) = x for all y in Z. j is a function which carries the x inside it, waiting to be applied to an argument, which it ignores, whereupon the hidden x emerges. Written more densely, k(x)(y) = x for all x, y in Z.
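In programming terms, k is just a function that returns a closure; a minimal sketch in Python:

    def k(x):
        def j(y):       # j carries x inside it, waiting for an argument...
            return x    # ...which it ignores; the hidden x emerges
        return j

    print(k(7)(100))    # prints 7: k(x)(y) = x for all x, y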

But mathematics doesn’t allow you to do what’d be the most fun: calling a function on itself. This is because you’d have to include the function in its own domain, which leads to an infinite regress: the formal definition of the function includes specifying its domain, which contains the function, whose formal definition includes its domain … and you can’t build such a thing in set theory.

Okay, suppose I try to get around this problem by introducing a set Σ containing non-function elements which maps 1:1 to the functions from Σ to the integers Z. This can’t work because Σ has to be “bigger” than itself. So, suppose Σ only maps to a few of these functions. This seems like it should do, and I can just declare that any several functions I’m interested in have mappings to Σ, and functions obtained by Cantor diagonalization simply won’t. Let a be the function that maps (some of the functions from Σ to Z) onto Σ and b be its inverse.

Neither a nor b is a function from Σ to Z, so I can’t apply a(a) or b(b), so my attempts at self-applying need to be more subtle.
Let’s define a function from Σ to Z, and declare it to be an element of a’s domain:
    f(ξ) = b(ξ)(ξ) + 1
here ξ is an element of Σ, so b(ξ) is a function from Σ to Z, and b(ξ)(ξ) is an integer, so it seems to work out.

I declared f to be in a’s domain, so a(f) exists and is an element of Σ, let’s call it θ.
Now I can finally achieve something close to application-on-itself with f(a(f)) = f(θ).
What can we say about this? Well, its value depends on b(θ)(θ).
But b is the inverse of a, so b(θ) = f, and so b(θ)(θ) = f(θ). But we defined f so that f(ξ) and b(ξ)(ξ) could never have the same value!
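You can watch the contradiction happen in code, too. In this rough Python model (a dict stands in for b, and θ is an opaque token), asking for f(θ) sends the interpreter into exactly the regress the algebra forbids:

    theta = object()           # an opaque element of Sigma
    b = {}                     # b : Sigma -> (functions from Sigma to Z)

    def f(xi):
        return b[xi](xi) + 1   # f(xi) = b(xi)(xi) + 1

    b[theta] = f               # declare a(f) = theta, i.e. b(theta) = f

    # f(theta) = b(theta)(theta) + 1 = f(theta) + 1, so no value can work:
    try:
        f(theta)
    except RecursionError:
        print("f(theta) has no consistent value")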

See what fun functions of functions returning functions are.

30-Mar-2012

Haiku proof

Filed under: math — jlm @ 12:01

Product of all primes
Incremented won’t factor
Thus a larger prime.

 
I wonder if it’ll catch on…

26-Oct-2010

Kinked pipes

Filed under: math — jlm @ 12:04

Our friend Keith Devlin poses an interesting question about expanding pipes: A pipe expands by one foot over its one-mile length; how high does it arch?

Keith assumes a single sharp kink in the center and uses Pythagoras to calculate that the pipe arches slightly over 50 feet! Yet if the pipe had its sharp kink near one end, it’d arch slightly under 1 foot. This suggests a calculus of variations problem: Of all the ways a pipe can bend, which way generates the highest arch? The dual problem would be: Of all the ways to arch to a given height (let’s say 50 feet), which way uses the least length of pipe?
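Both numbers are quick to check in Python (5280 feet to the mile, so a center kink makes each half 2640.5 feet of pipe over a 2640-foot run, while a kink directly over one end makes a vertical leg of height h plus a slanted leg covering the full 5280 feet):

    from math import sqrt

    # Kink in the center:
    print(sqrt(2640.5**2 - 2640**2))         # ~51.4 ft: "slightly over 50 feet"

    # Kink at one end: h + sqrt(5280**2 + h**2) = 5281, which solves to
    print((5281**2 - 5280**2) / (2*5281))    # ~0.9999 ft: "slightly under 1 foot"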

Our pipe path will go from one pipe endpoint up to some point with height 50’, thence to the other pipe endpoint. If either of these paths weren’t a line segment, we could shorten the path by using a line segment. The angles the segments make with the horizontal line at 50’ will be equal; otherwise we could shorten the path by equalizing them. So the shortest path is a straight segment up to a point 50’ above the center, a kink, then a straight segment back to the ground at the other end.

Because this isosceles path is the shortest path to reach a given height, it’s also the highest-reaching path of its length. Keith’s assumption that the pipe sharply kinks in the center gives the highest arch of any possible path!

Minimizing the height isn’t so interesting: By making arbitrarily fine accordion folds, we can keep the maximum height arbitrarily small.

What if the pipe makes a parabolic arch? Depth below the arch height at distance x from the center is kx². For our case of a mile-long pipe, we have arch height k(2640)². The curve length is ∫₀²⁶⁴⁰ √(1+(2kx)²) dx. If you’re like me, you can’t integrate this without a reference, but it gives [2kx √(4k²x²+1) + sinh⁻¹(2kx)]/4k, evaluated at x = 2640. Numerically solving this for length 2640.5 gives k = 6.385×10⁻⁶, which gives an arch height of 44.5 feet.
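Here’s that numeric solve in Python, using scipy’s brentq root-finder (the setup is mine):

    from math import sqrt, asinh
    from scipy.optimize import brentq

    def half_arc_length(k, a=2640.0):
        # integral of sqrt(1 + (2kx)^2) dx from 0 to a
        return (2*k*a*sqrt(4*k*k*a*a + 1) + asinh(2*k*a)) / (4*k)

    k = brentq(lambda k: half_arc_length(k) - 2640.5, 1e-9, 1e-3)
    print(k, k * 2640**2)    # k ~ 6.385e-6, arch height ~ 44.5 ft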

That was messier than I expected. I expect a catenary arch, which IIR my physics correctly is what a pipe should assume under gravity, would be much the same.

Why are the arches in the center so much higher than arches at the ends? Going back to the triangle case, if we increase our pipe length by an amount δ over a flat length x, the height is related by x²+h² = (x+δ)² from good old Pythagoras, which gives h = √(2xδ+δ²). δ varies from near 1 at x=0 to near 0 at x=5280, roughly as δ ≈ 1−(x/5280), so it varies only slowly with x. And since the square-root curve is steep at small values, at small x a change in x has a big impact on h! Toward the other end, δ’s decrease overpowers x’s increase, and as δ gets small, h is pulled steeply down.

23-Aug-2010

Permutation puzzle in Futurama

Filed under: math — jlm @ 12:26

I finally got around to watching last week’s Futurama late last night, and there’s an interesting permutations puzzle involved. The puzzle itself is a minor SPOILER, so stop now if 5 days isn’t enough time for you.

Puzzle: There’s a machine that can swap minds between bodies, but can’t swap two people if it’s already swapped that pair of bodies. A group of people use the machine to perform an arbitrary sequence of swaps among themselves. How can this group and two outsiders use the machine to restore everyone’s mind and body?

My solution is here.

I wonder if this is a recasting of a classic swapping puzzle I’m not familiar with.

19-Aug-2010

Finger Multiplication

Filed under: math — jlm @ 12:39

“Finger multiplication” is an old trick for using the bottom-left (easier to memorize) quadrant of the times table to do multiplication in the top-right quadrant. It’s easy to work out why it works, but that’s unsatisfying; I found rederiving the technique gave more insight.

How might we do this? To go from one quadrant to the other, we subtract from 10, or generally from the base b: (b-x)(b-y) = b²-bx-by+xy
Well, there’s our xy term, to get it alone we want to subtract off b²-bx-by = b(b-x-y), which we can do by adding -b(b-x-y) = b(x+y-b). The factor of b is trivial to deal with, just use place value, so that leaves us with x+y-b. That’s easy enough to calculate on its own, but we can distribute the -b half to each number as (x-b/2)+(y-b/2). We have the numbers b-x and x-b/2 to do arithmetic on, these sum to b/2, the number of fingers on a hand, so if you hold up x-b/2 (x-5 for us pentadactyl folk) fingers on a hand, then the down fingers number b-x. And that’s how you do finger multiplication, hold up x-5 on one hand, y-5 on the other, put the sum of the up fingers in the tens place and the product of the down fingers in the units place et voilà.

Now that we’ve reverse engineered it for the existing use case, can we come up with something like it for others? How about the case where only one term is >5?
We want to move this from the top-left to the bottom-left of the times table, so x∙(b-y) = bx-xy, i.e. xy = bx - x∙(b-y). So, put the smaller number in the tens place, and subtract off the product of the smaller number and the larger number’s “down fingers”. 3×7? 7 has 3 down fingers, so 30-9 = 21. 4×8? 8 has 2 down fingers, so 40-8 = 32.
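In Python, likewise (again, my naming):

    def finger_mult_mixed(small, large):
        # For small <= 5 < large: x*y = b*x - x*(b - y)
        return 10*small - small*(10 - large)

    print(finger_mult_mixed(3, 7))   # 21
    print(finger_mult_mixed(4, 8))   # 32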

18-Aug-2010

Simpson’s paradox

Filed under: math — jlm @ 08:56

I’ve never seen Simpson’s paradox explained by showing the distributions, which I found to be very helpful in understanding what’s going on. So, take a look at these two distributions I made from calls to random():
[figure: “Simpson’s distribution” — the cyan and magenta histograms, overlaid]
The cyan distribution has a larger mean than the magenta one, but anyone worth their salt should recognize them as mixtures of unimodal distributions (indeed, that’s how I made them) and look for a binary factor to separate the subpopulations. Upon separating them, in both cases the magentas’ means are higher than the cyans’: Simpson’s paradox.
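If you’d like to recreate the effect, here’s a sketch of one such pair of mixtures in numpy (the weights and means are mine, not the original random() recipe):

    import numpy as np

    rng = np.random.default_rng(0)
    # Magenta beats cyan within each subpopulation (1 > 0 and 5 > 4),
    # but the mixture weights are lopsided the other way:
    cyan    = np.concatenate([rng.normal(0, 1, 2000), rng.normal(4, 1, 8000)])
    magenta = np.concatenate([rng.normal(1, 1, 8000), rng.normal(5, 1, 2000)])

    print(cyan.mean(), magenta.mean())   # ~3.2 vs ~1.8: cyan wins overall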

15-Aug-2010

P vs. NP

Filed under: humor, math — jlm @ 14:37

So, Vinay Deolalikar has a purported proof that P ≠ NP here. But there must be some error, because I just came up with this simple proof that P = NP !
(If you think “joke proof” sounds like an oxymoron, you can stop reading here.)

Let W(m) ➙ s be a member of NP.
Let X(m; n) ➙ s be the first n symbols of W(m). X is obviously in NP.
So, X has a verifier V(m; n; s) in P.
We can find the first symbol t₁ of an output of W(m) by trying V(m; 1; c₁), V(m; 1; c₂), … for all the symbols cᵢ in the alphabet.
We can find the second symbol t₂ by trying V(m; 2; t₁c₁), V(m; 2; t₁c₂), etc.
And so on for all the symbols t₃, t₄, … in the output of W(m).
Finding each symbol takes c calls to V, where c is the alphabet size, a constant. Each call of V takes polynomial time, so we can calculate each individual symbol in polynomial time.
The length of the output of W(m) is necessarily polynomial in ∥m∥, so we have a polynomial number of symbols, each calculable in polynomial time, so we can calculate the entirety of W(m) in polynomial time, so W ∈ P.
P=NP. QED
