Ridiculous bug with std::random_shuffle

Filed under: programming — jlm @ 17:49

Right now I’m very irate at C++’s STL because its random_shuffle algorithm has a landmine associated with it. I was using it with a vector, which I’ll refer to as v. The call which shuffles v’s contents is random_shuffle(v.begin(), v.end(), rfunc), where rfunc is a function object which, in the case of int-indexable vectors, accepts a single int n and returns a random int in the range [0, n). What’s the obvious way to get a random int in some range? uniform_int_distribution, of course. And boom, there goes the dynamite because uniform_int_distribution(0, n) generates a random int in the range [0, n], not [0, n).

Besides the need to use uniform_int_distribution(0, n-1) being counter-intuitive, some other attributes make this bug frustrating. While random_shuffle requires rfunc’s return value to be in [0, n), it doesn’t check that it is, even when debugging is enabled — instead, it’s our old friend undefined behavior, no compiler diagnostic provided. vector bears some blame too, as it’s happy to silently munge data pointed to by an out-of-range iterator even with debugging enabled, optimization off, warnings enabled — and again, undefined behavior, no compiler diagnostic provided. If 0 is a valid value of a vector element (naturally, for me it was), then the effect of the bug when v.size() is less than v.capacity() is to get a shuffled vector in which one of its elements has been replaced by 0, and the downstream effects of that can easily be too subtle to notice (naturally, for me it was). So, you only get crashes when v.size() == v.capacity() and something in v.data corrupts and gets corrupted by whatever’s just past it in memory — which is a freakin’ rare occurrence!

How do you debug this kind of thing? Since it’s an intermittent bug which any given run of the program is highly unlikely to trigger, turn on core dumps so that when it eventually does occur, you have the dump of the program state available to debug right there. Turn on debugging so you can get the most out of your corefiles. Stick in code which lets you track what you need to track. Here, this shim function is instructive: int f() { int rand_val = uid(rand_eng); return rand_val; }. You need to remove -O to prevent calls to f() from being replaced by calls to uid(rand_eng) and use -O0 to prevent the transient variable rand_val from disappearing. (volatile probably would work also.)

Don’t forget that this is terrible API design, though. We have a library which has a generator of random ints and a consumer of random ints, and their conventions are different enough that the obvious way to combine them is buggy, yet similar enough that they’ll never trigger compiler warnings (much less errors). That’s not a good API choice, that’s a trap for API users! It’s good to have obvious ways to perform the operations API users will want to perform provided they work correctly. Since API design is hard, often you can’t do that. In those cases where you can’t make an obvious use that works, don’t introduce the obvious use in the first place — it’s far worse than having no obvious use at all, because if you make an obvious way to perform an operation, people will perform the operation that way, simply because it’s the obvious way. It should never be wrong. That there are so many ways to perform unsafe operations that are not obvious, not caught by C++ compilers, and not caught at runtime is the thing I dislike most about this programming language. Today I learned one more way to do it, and I should be well beyond that point.


Use of unconditional probability

Filed under: math — jlm @ 21:06

Recently I had a discussion with some friends about the probability that a certain card was among the first n cards dealt from a shuffled deck of m cards. Here I’m going to only cover the specific case of a deck consisting of the 13 spades, and the chance that the ace is one of the first two cards dealt — n=2 and m=13. We came up with four ways to perform this elementary probability calculation.

Combinatorics: There are 12C1 ways to have the ace and one other spade, without regard to order. There are 13C2 ways to deal two spades. So the probability is 12C1/13C2 = 12/78 = 2/13.

Trickery: The chance the first card isn’t the ace is 12/13. The chance that the second card, dealt from the remaining 12 cards, also isn’t the ace is 11/12. So the chance that neither is the ace is (12/13)⋅(11/12) = 11/13. Thus the chance that one of the first two cards is the ace is 1 − 11/13 = 2/13.

Conditional probability: The chance that the first card is the ace is 1/13 and the chance it isn’t is 12/13. If we represent the event of the first card being the ace as 1 and the event the second card is the ace as 2, then that is P(1) = 1/13 and P(~1) = 12/13. Now, P(2|~1) = 1/12. ∴ P(1 or 2) = P(1) + P(2|~1)⋅P(~1) = 1/13 + (1/12)⋅(12/13) = 1/13 + 1/13 = 2/13.

Unconditional probability: P(1 or 2) = P(1) + P(2) − P(1 and 2) = 1/13 + 1/13 − 0 = 2/13.

They all generalize into dealing n cards out of m, but the one using unconditional probability not only is the simplest, it generalizes the most cleanly. But most of us found it the least satisfying solution. Maybe because it was too simple?


Ouroboros in Std.Logic

Filed under: math — jlm @ 18:14

This post dissects a very short yet fascinating and enigmatic proof that can be found in the Logic module of the standard library of the Lean proof verifier:

theorem iff_not_self : ¬(a ↔ ¬a)
  | H => let f h := H.1 h h
         f (H.2 f)

OK, the fact being proved isn’t enigmatic — it’s stating that a proposition’s truth value is never the same as its own negation’s — but the proof thereof sure is. If that looks to you like some incomprehensible formal logic circling to eat its own tail, that’s because it kinda is.

However, it’s written extremely densely, which inhibits comprehension — which is sad, because the way it works is very interesting. It’s quite the rabbit hole, though, so let’s take our trusty spade and start digging. The very first thing we encounter is that Lean implicitly converts the ¬(a ↔ ¬a) into (a ↔ ¬a) → False. This is because Lean uses dependent type theory to represent logical statements and their proofs, so we need to discuss how these kinds of theories are used with formal logic.



The Shockwave Flash that was

Filed under: web — jlm @ 09:55

swf is dead, a closed-source project shut down by its owner (Adobe) a few years ago.
Kazerad, best known as the artist-author of the web comic Prequel Adventure, wrote a touching eulogy for it on twitter here.

[Read it “unrolled” (all 15 parts together) here.]


Calculating limits with hyperreals

Filed under: math — jlm @ 18:37

Here’s a simple (first-year-calculus level) math puzzle:
What’s the limit as n → ∞ of nx⋅(nxx – 1) for x > 0 ?

The right insight makes it work out rather elegantly, but suppose your math muse is AWOL cavorting in Cancun today — is there a more systematic way to approach these kinds of problems? It turns out that the tools of Robinson’s hyperreal analysis are very good at attacking this stuff.

Since we’re looking at a limit as n approaches infinity, the first thing to do is to simply set n to infinity — or rather, to an arbitrary infinite hyperreal (all the positive infinite hyperreals are indistinguishable in a deep sense). The second step is to write everything in terms of infinitesimals, so we introduce the infinitesimal m = 1/n and our expression becomes x⋅((xx)m – 1)/m.

Now we want to get everything expressed as polynomials of infinitesimals (here just m) of as low an order as lets things work out, so we expand the Maclaurin series of all the not-ploynomial-in-m stuff. Here that’s only f (m) ≝ (xx)m, which expands to f (0) + f ’ (0)⋅m + o(m²) = 1 + (ln xx)⋅m + o(m²) because f ’ (m) = (ln xx)⋅(xx)m.

Plugging this back into x⋅((xx)m – 1)/m gives x⋅((ln xx)⋅m + o(m²))/m = x⋅((ln xx) + o(m)) = x⋅(ln xx) + xo(m). Now, x⋅(ln xx) = ln (xx)x = ln x. Also, xo(m) = o(m) since x is finite. So nx⋅(nxx – 1) simplifies into ln x + o(m).

Since m is infinitesimal, o(m) disappears when we transfer back into ordinary reals and make n a limit index variable once again, and we have limn→∞ nx⋅(nxx – 1) = ln x. Now, this is certainly more work than substituting in w ≝ 1/nx and applying l’Hôpital’s rule, but the general technique works on all kinds of limit calculations where a-ha! moments become few and far between.


How is this glibc deadlock still unaddressed?

Filed under: programming — jlm @ 13:29

Why Programming is Hard, Volume CXX: the maintainers of critical pthread implementations will ignore multiple bugfixes from parallel programming experts for a longstanding deadlock bug for three years and counting.


Do not meddle with the Application Binary Interface

Filed under: programming — jlm @ 15:18

Why Programming is Hard, Volume CXIX: a comprehensive explanation by JeanHeyd Meneide of how ABI stability stops us from having nice things despite enormous amounts of high-quality work on ABI-consistency and related issues. Naturally, it’s a dispiriting read. Fortunately, half a year later there’s this follow-up on how to fix it.


twitcode #5: Extract HyperRogue achievements from log file

Filed under: programming — jlm @ 17:28

The game HyperRogue is very interesting. Its main gimmick is that it’s played on a hyperbolic plane (surface with negative curvature) instead of an Euclidean plane (surface with zero curvature) like 99.99% of 2-D games are. Playing in this kind of geometry is compellingly mind-bending. Other than the geometry involved, the main way the game differs from typical roguelikes is that the player character doesn’t gain levels or skills in the course of play — instead, the game opens up new “lands” with new environmental challenges and/or monsters with new attributes as the player demonstrates their skill by meeting the unlock requirements for each land.

The game supports “achievements”, where whenever your gameplay meets the condition of some challenge (some very easy, some nightmarishly hard, most of medium difficulty), it records this in a log file. Of course, met achievements aren’t all that the log file contains, but it’s nice enough to start each “met achievement conditions” record with “ACHIEVEMENT” and a marginally-descriptive achievement name, followed by other relevant information. Unfortunately, you can’t get a list of your fulfilled achievements merely by running  grep '^ACHIEVEMENT ' hyperrogue.log  because the game logs when you fulfill an achievement’s conditions separately for every run, not just the first run to fulfill it. So, to extract the records of each achievement’s first fulfillment, we ignore any record with an achievement name we’ve seen before and print out the line otherwise (ie, if we haven’t seen it before), which is nigh-trivial in awk [link]:

    if (!($2 in ach)) {
        ach[$2] = 1;

At 86 characters, it’s the fifth example from me of “twitcode” (programs under 140 bytes). And being in awk, it’s the fifth language I’ve twitcoded as well.


Wacky security questions

Filed under: travel — jlm @ 13:18

While I privately celebrate Independence Day, let me reminisce on a small, but odd, part of my recent trip to the mother country we independenced from. For a round-trip international flight, you’re questioned by security agents four times: by the TSA before leaving the country, by ICE upon your return, and by their equivalents in the foreign country upon arrival and departure there. The TSA screening was very routine, but the British immigration/customs officer had this weird thing where he’d repeat all my answers back with a really skeptical tone.

“Where are you going from here?”
“I’ll be staying here in London for the next five days.”
“You’re lodging in London?”
“Why are you coming here?”
“Primarily to visit my family.”
“You’re visiting family, who live in London?”
“And when are you leaving?”
“I fly back home on July 1st.”
“You’re flying back to the US on the 1st of July?”
“And how are you getting to London?”
“I bought a ticket for a National Express bus.” [The “Tube” was shut down due to a labor strike.]
“You’re taking a National Express bus to London?”
[Dropping the skeptical act] “Very good, welcome to the UK.”

Maybe the idea is somebody fabricating their answers will feel a need to elaborate instead of simply confirming? Whatever. The odder bit was the trip back, where each security officer asked me a strange question.

“Where is your journey starting from?”
“Uh, what?”
“Where is your flight leaving?”
[Borrowing the tone of his countryman] “You want to know the airport my flight departs from?”
“From this airport, London Heathrow.”
“Very good.”

And then, from ICE:

“What’s in London?”
“Uh, what?”
“You said you flew in from London, what’s there?”
“Well, there are a lot of things in London: Parliament, and a stretch of the River Thames, and several bridges over the river, and millions of people, one of whom is a first cousin, and his wife, and —” [at this point, the agent cuts me off, and I don’t even get a “very good”. I guess that’s a British thing.]


Why cubie orientations are preserved

Filed under: math — jlm @ 10:59

A while ago in “Rotating only one cubie”, I said there was “extra structure” to the Rubik’s Cube’s moves that preserves the sum-total of all corner cubie orientations, and of all edge cubie orientations, but didn’t preserve the sum-total of center cubie orientations. I didn’t get into that then, because it’s hard to describe, and why they’re preserved wasn’t the point of that post. But I did make a note to get to it later, as I wasn’t aware of any good explanations of it that I could link to, and it’s high time for that “later” to become “now”.

The dynamics of the corner and edge cubies are pretty different in this regard, which is probably expected by everybody who’s played around with the Rubik’s Cube that much — edge and corner cubies have very distinct characters in general. As is usual with the Rubik’s Cube, the corner cubies are easier to think about. (I think this is mostly because there are only 8 of them, while there are 12 edge cubies.) So, let’s do them first.


Powered by WordPress