twitcode #4: reverse diff — using the right tool for the job

Filed under: programming — jlm @ 20:29

Mercurial’s hg diff command supports a --reverse option which shows the regular diff output except it reverses the sense of the comparison, i.e., it goes from the “destination” to the “source” (git-diff supports this option too, but as the flag “-R”). Most of the time you want the ordinary “forward” sense, but occasionally the reverse sense comes in handy, and that’s why that option’s there. On rare occasion, I’ll even want to do this to files not under version control, but the regular system diff doesn’t support this feature.

So, after hitting this deficiency again recently, I decided to write up my own reverse-diff command which would swap its last two arguments and call diff. I started with the shell, as dealing with command arguments and calling programs is its forte. But it turned out to be surprisingly difficult to do stuff like copy the argument list or mess around with the end and near-end of the argument list, which I thought would be dead-simple operations. After futzing around with shell variables and parameters and the various options for variable/parameter expansion for something like 25 minutes, I came to my senses and did it in something like two minutes using C, where nothing’s going to interpret any kind of data as anything unless you explicitly request it to, and array manipulation is built in with clean syntax. All I had to do was swap argv[argc-1] and argv[argc-2] then execvp("diff", argv), easy peasy.

And if I golf argc and argv into c and v, then it fits in 130 chars [source]:
#include <unistd.h>
int main(int c, char**v) {
 if (c>2) { char*t=v[c-2]; v[c-2]=v[c-1]; v[c-1]=t; }
 return execvp("diff", v);
}

I could probably omit the check for ≥2 arguments, as the system diff doesn’t support the convention that a missing file argument means to use stdin (instead, it treats a filename of - (single hyphen) as representing stdin), but perhaps it’ll be used by somebody who’s installed an enhanced diff program.
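For comparison, here is a minimal sketch of the same swap-then-exec idea in Python. The function names swap_last_two and reverse_diff are made up for illustration; the only real library call is os.execvp, which, like the C execvp, replaces the current process and does not return on success.

```python
import os

def swap_last_two(argv):
    """Return a copy of argv with its final two entries exchanged.

    argv[0] is the program name, so there must be at least two real
    arguments (len(argv) > 2) before there is anything to swap.
    """
    args = list(argv)
    if len(args) > 2:
        args[-2], args[-1] = args[-1], args[-2]
    return args

def reverse_diff(argv):
    """Run the system diff with the last two operands reversed."""
    # Like the C version, the whole argument vector (including argv[0])
    # is handed to diff unchanged apart from the swap.
    os.execvp("diff", swap_last_two(argv))
```

Saved as a script, it would be driven with something like reverse_diff(sys.argv) after an import sys; that call is left out here so the block can be imported without launching diff.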

I’m also amused that each of my “twitcodes” has been in a different language: shell, perl, python, and now C.


A puzzle about C’s stdio

Filed under: programming — jlm @ 18:06

I found the C puzzles webpage by Gowri Kumar to be a very interesting collection of oddities of the C language and some of its basic libraries. If you work with C for fun or profit, I encourage you to go and give them a try. I found very few of them to produce behavior I hadn’t expected, which could be a symptom of overfamiliarity with C. I did find a few surprises though, which I felt warranted further investigation.


twitcode #3: New mail in mbox

Filed under: programming — jlm @ 09:12

Once upon a time, people’s interactions with computers (those few people who got to interact with computers directly) were mostly through a teletype: a combination of a keyboard where they could type instructions to the computer and a printer where it gave the responses back to them. This model is tenaciously clung to by a handful of still-active projects such as gdb, but the bulk of its use nowadays is from command shells (bash, zsh, cmd.exe) because command-response interaction is much easier to specify, record, automate, examine, modify, and perform remotely in a teletype style than in a GUI style.


twitcode #2: decoding MIME

Filed under: programming — jlm @ 12:12

Messing around with some mail handling scripts, I was surprised I didn’t find any good ways to decode MIME as a stream filter. Ten minutes later, I have 13 lines of Perl which do it in 201 characters in my normal non-terse style. It’s great for normal use, but a tiny bit of golfing fits it in a tweet’s 140-character limit:

$ cat mime_decode.pl
#!/usr/bin/perl -w
use strict; use utf8; use MIME::WordDecoder; 
binmode(STDOUT, ":utf8");
while (<>) { print mime_to_perl_string($_); }
$ wc mime_decode.pl
  4  16 137 mime_decode.pl

Good thing there was already a method which does all the real work…
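The same “let the library do the real work” approach can be sketched in Python with the standard library’s email.header module. This is an illustrative equivalent, not a port of the Perl above: decode_mime_words and decode_stream are hypothetical helper names, and only decode_header and make_header are real library calls.

```python
from email.header import decode_header, make_header

def decode_mime_words(text):
    """Decode RFC 2047 encoded-words (=?charset?B-or-Q?...?=) in one line."""
    return str(make_header(decode_header(text)))

def decode_stream(lines):
    """Yield each input line with its encoded-words decoded."""
    for line in lines:
        # Decode the line body and put back the newline it came with, if any.
        end = "\n" if line.endswith("\n") else ""
        yield decode_mime_words(line.rstrip("\n")) + end
```

As a stream filter it would be used along the lines of sys.stdout.writelines(decode_stream(sys.stdin)).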


Faces are hard

Filed under: web — jlm @ 22:10

So, there’s this webcomic Prequel Adventure set in the Elder Scrolls: Oblivion universe. It’s an excellent comic, with a compelling plot and great humor. (It is also paced extremely slowly, with a real-time:in-universe-time ratio that handily exceeds even that of Freefall.) The panels are drawn simply, supporting the story’s elements and the comic’s jokes, and clearly indicating that the focus is on telling a good story in a funny manner, and not on having beautiful art (another similarity with Freefall). But despite the comic’s simple drawing style, there’s one bit which Kazerad absolutely nails: Facial expressions. I mean, look at the final panel from this page.


… and also the wrong things with the wrong people

Filed under: web — jlm @ 18:21

G+ advertises user control over sharing on post complaining of G+ unauthorized oversharing


"That's what we call Irony!"


Addressing the fragile base class problem

Filed under: programming — jlm @ 21:47

I’ve been thinking about the fragile base class problem lately. (Yes, I know it’s almost Christmas. My mind works mysteriously.) I started thinking by analogy to APIs, which the interface a superclass gives a subclass in fact is, even if it’s not called that. So, the superclass’s API changes, breaking the subclass, just like a regular API’s change can break a client. How do we deal with this with regular APIs? If we are to make a compatibility-breaking change (which introducing any member into a superclass potentially is), we version the API so that a client requesting version 1 semantics gets them while only clients written against the newer semantics will request version 2. We could do the same kind of thing with class inheritance if we mark everything with revision numbers, which we reference when inheriting.

class base@2 {
    void start@1();
    void stop@1();
    void idle@2();
}

class child@1 extends base@1 {
    void idle@1();
    void park@1();
}

Here’s our classic case of a fragile base class. child subclassed base and defined the new method idle(), then later base was extended with its own method idle(). Normally, this would cause a problem — the new stop() implementation might call idle() perhaps, and child’s idle() won’t be written with overriding a then-nonexistent base::idle() in mind. But with these revision markings, we say that child only overrides methods marked as being in revision 1 of base. So, when stop() calls idle(), it gets base::idle, not child::idle, and when park() calls idle(), the call resolution goes the other way.
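To make that resolution rule concrete, here is a small Python model of it. It is only a sketch of the idea for a single parent/child pair, not a proposal for real language semantics; ClassDef, overrides, and resolve are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class ClassDef:
    name: str
    methods: Dict[str, int]            # method name -> revision that introduced it
    parent: Optional["ClassDef"] = None
    parent_rev: Optional[int] = None   # parent revision this class was written against

def overrides(cls, method):
    """True if cls's method overrides its parent's method of the same name."""
    parent = cls.parent
    if parent is None or method not in cls.methods or method not in parent.methods:
        return False
    # Only members that already existed in the extended revision can be overridden.
    return parent.methods[method] <= cls.parent_rev

def resolve(receiver, caller, method):
    """Name of the class whose implementation runs when code written in
    `caller` calls `method` on an instance of `receiver`."""
    if caller is receiver:
        # The subclass's own code sees its own definition first.
        if method in receiver.methods:
            return receiver.name
        return receiver.parent.name
    # Code in the parent reaches the subclass only through a real override.
    return receiver.name if overrides(receiver, method) else caller.name

base = ClassDef("base", {"start": 1, "stop": 1, "idle": 2})
child = ClassDef("child", {"idle": 1, "park": 1}, parent=base, parent_rev=1)
```

With these definitions, resolve(child, base, "idle") picks base’s idle (the call made from stop()), while resolve(child, child, "idle") picks child’s (the call made from park()).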

The problem I see with this though, is that when going to an indirect superclass, it can be unclear which revision that should be.

class grandparent@3 {
    void method@2();
}

class parent@2 extends grandparent@2;

class child@1 extends parent@1 {
    void method@1();
}

Uh-oh. Should child’s method() override grandparent’s? If parent@1 extended grandparent@2, then yes. But if it extended grandparent@1, then no. So do we need to list the parent class revisions of every revision of the child class? I’d hope there’d be a better way. Perhaps we’d be relying on an IDE to handle the revision numbers for us, since keeping them updated is just a dumb task; in that case the IDE could maintain the manifest of parent revisions too.


How not to do automatic updates

Filed under: linux — jlm @ 10:47

Today’s attempt at upgrading packages produced this:

Reading package lists... Error!
E: Encountered a section with no Package: header
E: Problem with MergeList /var/lib/apt/lists/extras.ubuntu.com_ubuntu_dists_precise_main_i18n_Translation-en
E: The package lists or status file could not be parsed or opened.

The contents of that file?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
   <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">  
   <META HTTP-EQUIV="Pragma" CONTENT="no-cache">
   <META HTTP-EQUIV="refresh" CONTENT="0;url=https://login.wifiportal.co.nz">
   <TITLE>Welcome to FIVO Hotspot, Product of Natcom LTD NZ</TITLE>


I don’t have unattended upgrades enabled on my Ubuntu laptop. Nevertheless, there’s something which goes around and replaces files in /var/lib/apt with whatever junk it gets from whatever network it happens to be connected to at random times. Can I be the only person who thinks this is a Really Bad Idea?
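A downloader could at least sanity-check what it fetched before overwriting a good file. The following Python sketch shows one such check; it is a heuristic of my own for illustration, not how apt actually validates its lists, and FIELD_LINE and looks_like_package_list are hypothetical names.

```python
import re

# A Debian control-style header line looks like "Field-Name: value".
FIELD_LINE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*:")

def looks_like_package_list(text):
    """Heuristic: the first non-blank line must be a 'Field: value' header.

    A captive-portal page starts with '<!DOCTYPE' or another tag instead,
    so it fails this check and the previous file could be kept.
    """
    for line in text.splitlines():
        if line.strip():
            return bool(FIELD_LINE.match(line))
    return True  # an empty list has nothing to reject
```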


Stopping laptop suspend on Ubuntu Linux

Filed under: linux — jlm @ 21:15

The default configuration for Ubuntu is to suspend the laptop any and every time the lid is closed, regardless of whether it’s on AC power or using an external display and keyboard, which is pretty annoying but up until Oneiric the setting wasn’t too difficult to discover and override. With Oneiric, Ubuntu reset the suspend options during the upgrade and removed it from GUI accessibility (Options are bad! Everyone uses computers the same way! No one uses an external KVM with a laptop! Gag.), but 5 seconds of web search reveals how to set it with the command line:

gsettings set org.gnome.settings-daemon.plugins.power lid-close-ac-action nothing

Except one problem. This only affects the PM settings if you’re logged in to the primary X console. I also want to be able to use my laptop headless. What to do when not logged in was set under “system” (as opposed to “user”) options in the no-longer-available GUI, but no references I found told how to set it from the command line. Unsurprisingly, this is done by running gsettings set as a system user, but unfortunately gsettings doesn’t work from sudo or su, because gsettings wants to start up dbus because … I’m not sure why. And dbus, started that way, won’t run without X, and gsettings won’t run without dbus. So, how do you run something non-graphical with dbus access? That turns out to be with the dbus-launch command, which, unlike whatever gsettings is doing to start dbus, can figure out it’s not in X. So what we want is:

sudo dbus-launch gsettings set org.gnome.settings-daemon.plugins.power lid-close-ac-action nothing

Ha ha, no. That changes the setting for the root user, but the laptop still suspends. This turns out to be because the settings used when awaiting login aren’t root’s, but gdm’s. So that means to stop the suspending, we do:

sudo -u gdm dbus-launch gsettings set org.gnome.settings-daemon.plugins.power lid-close-ac-action nothing

Ha ha, no. See, the documentation on whose settings are used is wrong. It’s not gdm’s settings that are used at all. (Those were the ones used up to Natty.) If you look at your passwd file, you’ll see there’s a new display-manager user in addition to gdm now: lightdm. What the docs don’t say is that it’s that user whose settings are used now. So, no more teasing, this is the command which keeps your laptop powered on:

sudo -u lightdm dbus-launch gsettings set org.gnome.settings-daemon.plugins.power lid-close-ac-action nothing
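Since the attempts above differ only in which user’s settings get changed, the pattern can be captured in a small helper. This Python sketch just builds the argument list; lid_close_command is a hypothetical name, and actually running the result (for example with subprocess.run) is left to the caller.

```python
SCHEMA = "org.gnome.settings-daemon.plugins.power"
KEY = "lid-close-ac-action"

def lid_close_command(user=None, action="nothing"):
    """Build the gsettings command line as a list of arguments.

    With user=None this is the plain per-user command; with a user name it
    is wrapped in sudo and dbus-launch so it can run outside an X session.
    """
    cmd = ["gsettings", "set", SCHEMA, KEY, action]
    if user is not None:
        cmd = ["sudo", "-u", user, "dbus-launch"] + cmd
    return cmd
```

Calling lid_close_command("lightdm") yields the final command from this post, and lid_close_command() yields the first one.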


RSS death, the Javascript trap, and SaaS

Filed under: web — jlm @ 12:34

I read this recent post by Vambenepe on the campaign to kill RSS, and it bothered me. RSS/Atom is what makes a dynamically updating web usable, and here it was, an open, decentralized protocol, being replaced by closed SaaS offerings under central control. For a reason I wasn’t sure of, it reminded me of Stallman’s anti-SaaS essay “The JavaScript Trap” from some time ago. That essay didn’t sit right with me: Because your software comes from a webserver on demand, instead of being pre-installed locally, doesn’t make it or what it does any more or less free, and Stallman’s solution of blocking JavaScript not tagged as being under a free-software license is impractical. And indeed, in the years since, we’ve seen plenty of open source JavaScript code written and published, coexisting alongside a vibrant ecosystem of proprietary JavaScript code, just like we have with client application software.

But it finally gelled: The problem with SaaS is that it welds the data to the code.

Let me explain using “traditional” software applications as an example. You have documents you edit in Microsoft Word. These documents are .doc files which are on your disk drive and you can do anything to them that you can do with any other file: Copy them, delete them, encrypt them, archive them to tape, attach them to an email, etc. All outside of Word. If Microsoft does something to annoy you, you can even edit the documents in WordPerfect or AbiWord or OpenOffice or anything else which understands the .doc file format, which there are plenty of because file formats aren’t protectable as intellectual property.

Contrast this with the SaaS situation: You can’t give a WebDAV address to Google Docs for a document you want to edit in that webapp, and have it open and save to that file. You can’t manipulate your Docs files at all, except through the webapp. The only way you can (eg) attach it to an email is to be using Google’s email webapp, and hope that Google’s programmers have provided integration between them (at time of writing, they haven’t).

In short, if you want to use Google’s word processor, you have to use Google for its data store. You can’t say “I love Google Docs’ UI, but I prefer to use Amazon for data storage.” SaaS leverages control or preference of one aspect (the code) into use of another aspect (the data). Why should the SaaS provider have custody of your files? You can store your data with any number of hosts, and the “cloud” lets you access that data from any client machine. But not if you want to access a webapp. Then it’s only if your data is hosted with the SaaS provider. And that’s the real Javascript trap.

Update: Seems Steve Wozniak has some concerns along these lines, about you not controlling data you upload to a cloud store.

Update 2: Now Wired is sounding this alarm, with the focus on data security.
