Power consumption and mobile
June 25, 2012 | Comments
I enjoyed reading this little piece that did the rounds last week, about the costs of power consumption for the iPad ($1.61 annually, if you can't be bothered to click). One thing I've always liked about working in mobile software is the way that environmental concerns line up rather nicely with end-user benefits, when it comes to power: you work hard to minimise your use of it. It's good to see this sort of thing quantified, though I have a feeling the disposability and upgrade cycle for devices pushes the overall environmental footprint of these things back in the wrong direction...
An update on the dissertation
June 19, 2012 | Comments
It's been a few weeks since I reported on my dissertation project (an examination of the suitability of superoptimisation for virtual machines, by writing a superoptimiser for the JVM), so time for a little update.
One of the two big problems with superoptimisation is dealing with the combinatorial explosion of possibilities. I've restricted myself to using just 39 of the JVM opcodes, those needed for integer arithmetic. With the benefit of a little hindsight I realise I need another 12 (those for comparison operations and branching) too.
Even with this reduced set, there are 5,728,366 possible programs which are 5 opcodes long; and when you fill in all the arguments for those opcodes which take them, it gets way worse. 5 opcodes isn't long enough to do much, either; the really simple (and quite elegant-looking) implementation of the signum() call in java.lang.Math uses 9. So I've been focusing on the problem of pruning these possibilities, and leaving aside the other issue for now (how you definitively test whether a sequence of bytecodes performs as you wish).
I started out by using the Clojure math.combinatorics library to generate a cartesian product of all possibilities, and then filtering through them. This worked for small programs (I was able to find an optimal 2-opcode sequence for the identity function this way), but quickly became unworkable (which wasn't a surprise) - filtering through that many possibilities is slow.
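To make the shape of that first approach concrete, here's a toy sketch of generate-then-filter - in Java rather than the project's Clojure, with a hand-picked handful of the opcodes standing in for the full restricted set, and a deliberately trivial filter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: build the full cartesian product of opcode choices first,
// and only filter afterwards. The real code is Clojure (math.combinatorics).
public class GenerateThenFilter {
    public static void main(String[] args) {
        List<String> opcodes = Arrays.asList(
                "iconst_1", "iload_0", "iadd", "isub", "ineg", "dup", "ireturn");

        // Every sequence of exactly `length` opcodes: |opcodes|^length of them.
        int length = 4;
        List<List<String>> sequences = new ArrayList<>();
        sequences.add(new ArrayList<>());            // start from the empty sequence
        for (int i = 0; i < length; i++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : sequences) {
                for (String op : opcodes) {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(op);
                    next.add(extended);
                }
            }
            sequences = next;
        }

        // Filtering happens only after everything has been generated - which is
        // the problem: with the real opcode set you generate 39^n sequences of
        // length n before rejecting a single one.
        long plausible = sequences.stream()
                .filter(seq -> "ireturn".equals(seq.get(seq.size() - 1)))
                .count();
        System.out.println(sequences.size() + " generated, " + plausible + " pass a trivial filter");
    }
}
```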
So I've switched to considering the set of possibilities as a tree. At any given node of the tree, I apply two tests. Firstly, is this node fertile - that is, could any of its children be an optimal bytecode sequence? If a sequence contains any redundancy, for instance, it's infertile. And secondly, is this node a valid program itself? There's an overlap between invalidity and infertility, but they differ in a few places too.
Finding infertile sequences early is really important: I don't need to explore the children of an infertile node, so it lets me cut down the possibility space, and saves time. It's far quicker to never have to consider a candidate sequence, than to consider it... no matter how fast your testing is.
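A minimal sketch of that tree-shaped search, again in Java rather than the project's Clojure, with toy fertile/valid predicates standing in for the real filters:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Each node of the tree is a (possibly partial) opcode sequence; its children
// extend it by one opcode. Infertile nodes are abandoned without their
// children ever being generated, which is where the savings come from.
public class PrunedSearch {

    enum Op { ICONST_1, ILOAD_0, IADD, ISUB, INEG, DUP, IRETURN }

    static void search(List<Op> node, int maxLength,
                       Predicate<List<Op>> fertile,
                       Predicate<List<Op>> valid,
                       List<List<Op>> candidates) {
        if (!node.isEmpty() && valid.test(node)) {
            candidates.add(new ArrayList<>(node));   // a complete program worth testing
        }
        if (node.size() == maxLength || !fertile.test(node)) {
            return;                                  // prune: none of its children can be optimal
        }
        for (Op op : Op.values()) {
            node.add(op);
            search(node, maxLength, fertile, valid, candidates);
            node.remove(node.size() - 1);
        }
    }

    public static void main(String[] args) {
        // Toy predicates: "fertile" rejects the obvious ineg-ineg redundancy
        // (negating twice is a no-op), "valid" just insists on a final ireturn.
        Predicate<List<Op>> fertile = seq -> seq.size() < 2
                || !(seq.get(seq.size() - 1) == Op.INEG && seq.get(seq.size() - 2) == Op.INEG);
        Predicate<List<Op>> valid = seq -> seq.get(seq.size() - 1) == Op.IRETURN;

        List<List<Op>> candidates = new ArrayList<>();
        search(new ArrayList<>(), 4, fertile, valid, candidates);
        System.out.println(candidates.size() + " candidate programs survive pruning");
    }
}
```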
So what I'm doing amounts to a large amount of static analysis on sequences of JVM opcodes. I have a growing set of filters which look for invalid or suboptimal use of local variables, catch underflows in the operand stack, spot obvious redundancies (an optimal sequence contains none), and check that the output of a program depends in some way on its input, by tracking the influence of its inputs across operand stack entries and local variables.
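As an example of the kind of filter involved, here's a sketch (in Java, with hand-written stack deltas for a few opcodes only) of a check that rejects any sequence which would pop from an empty operand stack - and since no extension can rescue an underflowing prefix, it doubles as an infertility test:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Reject any opcode sequence that pops more values than the operand stack holds.
// The stack-effect table below covers only a handful of opcodes, for illustration.
public class StackUnderflowFilter {

    // For each opcode: {values popped, values pushed}.
    private static final Map<String, int[]> STACK_EFFECT = new HashMap<>();
    static {
        STACK_EFFECT.put("iconst_1", new int[] {0, 1});
        STACK_EFFECT.put("iload_0",  new int[] {0, 1});
        STACK_EFFECT.put("iadd",     new int[] {2, 1});
        STACK_EFFECT.put("isub",     new int[] {2, 1});
        STACK_EFFECT.put("ineg",     new int[] {1, 1});
        STACK_EFFECT.put("dup",      new int[] {1, 2});
        STACK_EFFECT.put("ireturn",  new int[] {1, 0});
    }

    // True if the sequence never pops from an empty stack.
    static boolean stackSafe(List<String> sequence) {
        int depth = 0;
        for (String op : sequence) {
            int[] effect = STACK_EFFECT.get(op);
            depth -= effect[0];
            if (depth < 0) {
                return false;        // underflow: this prefix and all its children are out
            }
            depth += effect[1];
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(stackSafe(Arrays.asList("iload_0", "ineg", "ireturn"))); // true
        System.out.println(stackSafe(Arrays.asList("iadd", "ireturn")));            // false: iadd pops an empty stack
    }
}
```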
The upshot is that so far I'm able to cut the possible number of sequences for a 5-opcode program from 90,224,199 down to 10,927. This sounds great, but I then have to fill in all the possible arguments for opcodes which take them, which bumps the number of classes I have to build and test up to 276,616,752. This takes just over a day and a half to run, on my little laptop; and a 5-opcode program doesn't do much. That said, I'm making progress: a week ago I was taking 7.5 hours to run a search across 4-opcode sequences, and now this takes just under 4 minutes.
That's where I am; right now, each major step forward I make (and I can't see many obvious ones left) seems to buy me one additional opcode. I think that parallelising across oodles of machines (which should be straightforward) will buy me one more; running the process for a few days should get me another. So right now it looks like I'll be able to do a 7-opcode search before the project ends - presuming no more inspiration strikes as to how to speed this up.
Possible areas for improvement right now are speeding up the generation and loading of classes for testing (quick benchmarks suggest this is where most of my running time goes), and getting more sophisticated about marking obviously redundant sequences as infertile. Waiting in the wings and looking to cause trouble are those branching operations, which I need, and which complicate (possibly fatally) some of the static analysis I've been doing to date. So I expect things to get slightly worse when I bring them in...
Clojure is pretty good, I'm finding. There's a repeated pattern I'm noticing in my use of it: spend a day and a half beating my head against a wall trying to do something, then find that there's a library function that does it for me. I'm feeling a bit more expressive in it, though - whilst what I'm writing definitely isn't optimal or idiomatic, it's concise and occasionally readable after-the-fact...
Quantified Self, Self-Hacking Day
June 16, 2012 | Comments
A little coterie of Brighton folks wandered up to the Self-hacking day in London, run by some of the Quantified Self crowd. I've not been to the monthly London meet-ups, but have been following this kind of thing for a few years now, on a no doubt hackneyed path that started with Nike+ and took in Mappiness, 23andme, and recently Runkeeper.
The morning saw a few talks:
- Alec Muffet and Eric King of Privacy International, talking about security concerns: in particular, the terrifying-sounding kit the Metropolitan Police have acquired to take copies of data from mobile phones (and the outdated laws which allow them to do it), broken assumptions in our attitudes to the security of devices we physically control, privacy policies, and the need to think more carefully about the possibility of any data one records being published;
- John Fass on data visualisation principles, calling for visualisation to be considered from the start and not tacked onto the end of QS products. John gave a good talk I failed to completely follow (and will try to watch again), but I liked his idea that the rise of infographics reflects a panic about the quantity of data we're faced with;
- Ian Clements told the by turns inspirational and horrifying story of his diagnosis with terminal cancer, when he was given a few weeks to live… 5 years ago. Ian's been gathering data on himself since 1974, and has a stronger imperative than many QS folk. I was horrified both by the attitude of medical professionals (who aren't interested in his data, or in one case will take it only on condition that they don't reveal their findings to him), and by the fact that there's software out there he'd like to use to analyse his multivariate logs, but just can't afford;
- Ken Snyder gave an overview of a few trends: convergence of the wellness, healthcare and social Internet industries; multi-sensor devices, medical devices becoming more consumer-oriented, and the rise of smartphones and wearables; and consumer trends around simplicity, more mainstream use of QS (shifting the focus away from power users, where it is today), taking the isolation out of self-empowerment, and data ownership.
Then after lunch, a couple of break-out sessions followed. For the first one, I took part in a group talking about behaviour change: where, after all, is the value in all this measurement if it can't be used to effect improvements? We rambled through BJ Fogg, talked about the need to consider closely which habits to create, what "kicks" would contribute to setting them up, and what environmental cues might be used to trigger "kicks"; the role of social support and accountability (touching briefly on buddy systems - tiny 2-person social networks, and something I don't think we've seen enough of, digitally); and models of willpower as a muscle which can be exercised, vs being a depletable resource. Personality styles also emerged as important - how would a service adapt to usage by loss-averse individuals, vs gain/reward types? And is the act of recording data itself useful - or would a completely automated system which required no participation from its user miss an important trick? Finally, someone brought up Nicorette as a good metaphor: a system which encourages good habits, and is then designed to fall aside in time, leaving the good habits in place. You don't see many digital QS services which encourage you to leave after a while…
The second session I found a bit less focused: we talked about archiving of data and wobbled between the value of deliberately constrained networks (which I'm sceptical of), Wi-fi vs LTE, and the lack of nuance in privacy policies (we give organisations permission to use our data, but rarely constrain what they can do with it beyond sharing it onwards).
A fab day, overall. I've been to a couple of these events now, and every time they feel really exciting: QS feels like a movement that is just going mainstream. Kudos to Adriana and team for organising and running the day, and thanks to Hub Westminster for hosting.
Self-publishing through Amazon
June 09, 2012 | Comments
It's no secret that Amazon is shaking up the publishing industry. I've been watching a friend go through some interesting experiences with self-publishing recently, and he's given me some figures and permission to write about it.
Paul's been writing (mainly crime) fiction for a little while now, drawing on his experiences as a police officer in Brighton, and his slightly diseased imagination. After quite a long process of talking to traditional publishers about The Follow, a novel about heroin dealers in Brighton, he decided in November last year to start publishing digitally. There's still a bit of a stigma associated with this sort of thing - I think that an important part of a traditional book deal is the validation of a large publisher saying "we think this is good" - but with the novel sitting unpublished for a little while, why not?
He hooked up with Trestle Press and put it onto Amazon at a quite low price point of £3.69. By mid-February, he was selling a couple of copies a day.
In March, he dropped the price to £1.99 after taking The Follow back from Trestle, and publishing himself. Sales hovered at between 10 and 18 a week.
In May, Paul decided to stop promoting The Follow and see what he could publish next, but noticed that he could make it free for 5 days using KDP Select; he flicked the switch to see what would happen, and saw 3000 downloads in the first day - both a boost for the ego and not a bad promotional device. "Now thousands of people were reading my book, and it hit number 2 in the free Kindle chart."
Sales crept upwards after that. The next day the book was ranked at around 400 on Kindle, and had sold 30 copies; the day after that it was ranked at 250; and a week after the promotion he'd sold 400 copies and was sitting outside the Kindle top 100 (and in the top 10 of the "procedurals" section, which I think is police fiction); sales at this point were about ten a day.
Reviews seem to have a direct impact on sales: after a negative review sales dip slightly, then increase again once a positive one comes through (perhaps the "most recent review" holds a lot of power).
Things I've noticed, watching Paul go through this journey:
- Most people I know who write do it for love, and to reach an audience. Digital publishing seems to offer both this, and the promise of some sort of return.
- I'm sure that a publisher knows how to market and distribute books better than the average author; but when the alternative is to not be published at all, digital self-publishing looks attractive.
- A promotion can build sales traffic, but it drops off shortly afterwards.
- There seems to be a direct relationship between the quality of reviews and sales.
OTA 2012: Hack Day
June 02, 2012 | Comments
I'm camped out at OverTheAir putting the finishing touches to my hack and its presentation. This year I've done a solo effort: Facebook have been here (doing a couple of excellent talks, one about the Open Graph API and another about their internal processes), and I wanted to play with some of this. And I've been thinking about Bob Hoskins.
More specifically, the "It's Good To Talk" adverts he starred in during the 1990s - back in the days when telecommunications companies ran adverts that said "go on, make a phone call" instead of trying to sell insubstantial and vaguely aspirational lifestyles. Bob's point was sound: there's more meaning to a telephone call than what's said, the act of calling is itself an expression of care. I'm pretty comfortable with digital communication, but I'm certain that if I emailed my mum a weekly update instead of calling her, we'd both feel something was missing.
And look at Facebook, a history of my social contact: events I've been to, what I'm doing and with whom, things I like, where I've worked, groups I'm joining. Such an exhaustive social record, but with a phone-call shaped gap.
So my hack was simple: I want Facebook to record when I've made a phone call to a Facebook friend. I imagine seeing my sister "like" my contact with my niece and nephew; or seeing clustered outpourings of telephoned support when friends talk about strife they're going through. I chose to implement this as an Android app because Android gives me access to the calling information I need, and I'm currently using a Galaxy S2 myself.
The app is extremely simple from a user's perspective: a single screen in which you log in and give permission to post to Facebook, and can deactivate the posting of calls if you want to stop. This simplicity is quite deliberate: I wanted all the posting of calls to happen silently, behind the scenes. The process of doing it is a little bit more involved than you might think:
- A simple state machine that watches for changes in the telephony state (between "idle", "off the hook", "ringing", etc.) to notice when an outgoing call has completed (there's a rough sketch of this step after the list);
- The app then grabs the phone number of the last call, and looks it up in the on-phone address book to find the name(s) of the person you were calling. If your phone is like mine, many people have multiple entries in your address book: the app tries to deduplicate this list;
- It then connects to Facebook and looks up these names using an FQL query, to see if you were calling a Facebook friend;
- Finally, it creates a "call" action referencing this friend, which will appear in the Activity list of your timeline, and potentially elsewhere. At this point, the call can be referenced by other Facebook users: liked, commented upon, and so forth.
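Here's a rough sketch of that first step - spotting the end of an outgoing call from the telephony state changes. It's illustrative rather than lifted from the hack: the class name and the onOutgoingCallFinished stub are mine, and the app needs the READ_PHONE_STATE permission for any of this to work.

```java
import android.content.Context;
import android.telephony.PhoneStateListener;
import android.telephony.TelephonyManager;
import android.util.Log;

// Watches the telephony state machine and fires once an outgoing call ends.
// The rest of the pipeline (call log lookup, contact matching, FQL query,
// Open Graph "call" action) would hang off onOutgoingCallFinished().
public class CallWatcher extends PhoneStateListener {

    private int lastState = TelephonyManager.CALL_STATE_IDLE;
    private boolean outgoingCallInProgress = false;

    public static void register(Context context) {
        TelephonyManager telephony =
                (TelephonyManager) context.getSystemService(Context.TELEPHONY_SERVICE);
        telephony.listen(new CallWatcher(), PhoneStateListener.LISTEN_CALL_STATE);
    }

    @Override
    public void onCallStateChanged(int state, String incomingNumber) {
        if (lastState == TelephonyManager.CALL_STATE_IDLE
                && state == TelephonyManager.CALL_STATE_OFFHOOK) {
            // Straight from idle to off-hook, with no ringing first: we dialled out.
            outgoingCallInProgress = true;
        } else if (state == TelephonyManager.CALL_STATE_IDLE && outgoingCallInProgress) {
            // Back to idle after an outgoing call: time to look up the number we
            // just called and, if it matches a Facebook friend, post about it.
            outgoingCallInProgress = false;
            onOutgoingCallFinished();
        }
        lastState = state;
    }

    private void onOutgoingCallFinished() {
        Log.d("CallWatcher", "Outgoing call finished; would now look up the contact and post to Facebook");
    }
}
```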
Obviously this is going to make some people uncomfortable: who you call is private, right? But I can't help looking at how far we've come over the last 10 years in our journey towards sharing and away from privacy, and feeling that this is only a short step forward (or backward, depending on how you view Facebook, and social sharing in general).
I'm going to run it for a while, see what happens, and see how it all feels. And if you'd like to have a play, you can find the source on github. There's one big improvement that's needed, and that's handling the case where you find more than one match on Facebook for a friend's name.