Power consumption and mobile
June 25, 2012 | Comments
I enjoyed reading this little piece that did the rounds last week, about the costs of power consumption for the iPad ($1.61 annually, if you can't be bothered to click). One thing I've always liked about working in mobile software is the way that environmental concerns line up rather nicely with end-user benefits, when it comes to power: you work hard to minimise your use of it. It's good to see this sort of thing quantified, though I have a feeling the disposability and upgrade cycle for devices pushes the overall environmental footprint of these things back in the wrong direction...
An update on the dissertation
June 19, 2012 | Comments
It's been a few weeks since I reported on my dissertation project (an examination of the suitability of superoptimisation for virtual machines, by writing a superoptimiser for the JVM), so time for a little update.
One of the two big problems with superoptimisation is dealing with the combinatorial explosion of possibilities. I've restricted myself to using just 39 of the JVM opcodes, those needed for integer arithmetic. With the benefit of a little hindsight I realise I need another 12 (those for comparison operations and branching) too.
Even with this reduced set, there are 5,728,366 possible programs which are 5 opcodes long; and when you fill in all the arguments for those opcodes which take them, it gets way worse. 5 opcodes isn't long enough to do much, either; the really simple (and quite elegant-looking) implementation of the signum() call in java.lang.Math uses 9. So I've been focusing on the problem of pruning these possibilities, and leaving aside the other issue for now (how you definitively test whether a sequence of bytecodes performs as you wish).
I started out by using the Clojure math.combinatorics library to generate a cartesian product of all possibilities, and then filtering through them. This worked for small programs (I was able to find an optimal 2-opcode sequence for the identity function this way), but quickly became unworkable (which wasn't a surprise) - filtering through that many possibilities is slow.
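To make the shape of that first approach concrete, here's a toy sketch of generate-then-filter - in Java rather than the project's Clojure, with a hand-picked handful of the opcodes standing in for the full restricted set, and a deliberately trivial filter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: build the full cartesian product of opcode choices first,
// and only filter afterwards. The real code is Clojure (math.combinatorics).
public class GenerateThenFilter {
    public static void main(String[] args) {
        List<String> opcodes = Arrays.asList(
                "iconst_1", "iload_0", "iadd", "isub", "ineg", "dup", "ireturn");

        // Every sequence of exactly `length` opcodes: |opcodes|^length of them.
        int length = 4;
        List<List<String>> sequences = new ArrayList<>();
        sequences.add(new ArrayList<>());            // start from the empty sequence
        for (int i = 0; i < length; i++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : sequences) {
                for (String op : opcodes) {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(op);
                    next.add(extended);
                }
            }
            sequences = next;
        }

        // Filtering happens only after everything has been generated - which is
        // the problem: with the real opcode set you generate 39^n sequences of
        // length n before rejecting a single one.
        long plausible = sequences.stream()
                .filter(seq -> "ireturn".equals(seq.get(seq.size() - 1)))
                .count();
        System.out.println(sequences.size() + " generated, " + plausible + " pass a trivial filter");
    }
}
```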
So I've switched to considering the set of possibilities as a tree. At any given node of the tree, I apply two tests. Firstly, is this node fertile - that is, could any of its children be an optimal bytecode sequence? If a sequence contains any redundancy, for instance, it's infertile. And secondly, is this node a valid program itself? There's an overlap between invalidity and infertility, but they differ in a few places too.
Finding infertile sequences early is really important: I don't need to explore the children of an infertile node, so it lets me cut down the possibility space, and saves time. It's far quicker to never have to consider a candidate sequence, than to consider it... no matter how fast your testing is.
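A minimal sketch of that tree-shaped search, again in Java rather than the project's Clojure, with toy fertile/valid predicates standing in for the real filters:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Each node of the tree is a (possibly partial) opcode sequence; its children
// extend it by one opcode. Infertile nodes are abandoned without their
// children ever being generated, which is where the savings come from.
public class PrunedSearch {

    enum Op { ICONST_1, ILOAD_0, IADD, ISUB, INEG, DUP, IRETURN }

    static void search(List<Op> node, int maxLength,
                       Predicate<List<Op>> fertile,
                       Predicate<List<Op>> valid,
                       List<List<Op>> candidates) {
        if (!node.isEmpty() && valid.test(node)) {
            candidates.add(new ArrayList<>(node));   // a complete program worth testing
        }
        if (node.size() == maxLength || !fertile.test(node)) {
            return;                                  // prune: none of its children can be optimal
        }
        for (Op op : Op.values()) {
            node.add(op);
            search(node, maxLength, fertile, valid, candidates);
            node.remove(node.size() - 1);
        }
    }

    public static void main(String[] args) {
        // Toy predicates: "fertile" rejects the obvious ineg-ineg redundancy
        // (negating twice is a no-op), "valid" just insists on a final ireturn.
        Predicate<List<Op>> fertile = seq -> seq.size() < 2
                || !(seq.get(seq.size() - 1) == Op.INEG && seq.get(seq.size() - 2) == Op.INEG);
        Predicate<List<Op>> valid = seq -> seq.get(seq.size() - 1) == Op.IRETURN;

        List<List<Op>> candidates = new ArrayList<>();
        search(new ArrayList<>(), 4, fertile, valid, candidates);
        System.out.println(candidates.size() + " candidate programs survive pruning");
    }
}
```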
So what I'm doing amounts to a large amount of static analysis on sequences of JVM opcodes. I have a growing set of filters which look for invalid or suboptimal use of local variables, catch underflows in the operand stack, spot obvious redundancies (an optimal sequence contains none), and check that the output of a program depends in some way on its input, by tracking the influence of its inputs across operand stack entries and local variables.
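As an example of the kind of filter involved, here's a sketch (in Java, with hand-written stack deltas for a few opcodes only) of a check that rejects any sequence which would pop from an empty operand stack - and since no extension can rescue an underflowing prefix, it doubles as an infertility test:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Reject any opcode sequence that pops more values than the operand stack holds.
// The stack-effect table below covers only a handful of opcodes, for illustration.
public class StackUnderflowFilter {

    // For each opcode: {values popped, values pushed}.
    private static final Map<String, int[]> STACK_EFFECT = new HashMap<>();
    static {
        STACK_EFFECT.put("iconst_1", new int[] {0, 1});
        STACK_EFFECT.put("iload_0",  new int[] {0, 1});
        STACK_EFFECT.put("iadd",     new int[] {2, 1});
        STACK_EFFECT.put("isub",     new int[] {2, 1});
        STACK_EFFECT.put("ineg",     new int[] {1, 1});
        STACK_EFFECT.put("dup",      new int[] {1, 2});
        STACK_EFFECT.put("ireturn",  new int[] {1, 0});
    }

    // True if the sequence never pops from an empty stack.
    static boolean stackSafe(List<String> sequence) {
        int depth = 0;
        for (String op : sequence) {
            int[] effect = STACK_EFFECT.get(op);
            depth -= effect[0];
            if (depth < 0) {
                return false;        // underflow: this prefix and all its children are out
            }
            depth += effect[1];
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(stackSafe(Arrays.asList("iload_0", "ineg", "ireturn"))); // true
        System.out.println(stackSafe(Arrays.asList("iadd", "ireturn")));            // false: iadd pops an empty stack
    }
}
```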
The upshot is that so far I'm able to cut the possible number of sequences for a 5-opcode program from 90,224,199 down to 10,927. This sounds great, but I then have to fill in all the possible arguments for opcodes which take them, which bumps the number of classes I have to build and test up to 276,616,752. This takes just over a day and a half to run, on my little laptop; and a 5-opcode program doesn't do much. That said, I'm making progress: a week ago I was taking 7.5 hours to run a search across 4-opcode sequences, and now this takes just under 4 minutes.
That's where I am; right now, each major step forward I make (and I can't see many obvious ones left) seems to buy me one additional opcode. I think that parallelising across oodles of machines (which should be straightforward) will buy me one more; running the process for a few days should get me another. So right now it looks like I'll be able to do a 7-opcode search before the project ends - presuming no more inspiration strikes as to how to speed this up.
Possible areas for improvement right now are speeding up the generation and loading of classes for testing (quick benchmarks suggest this is where most of my running time goes), and getting more sophisticated about marking obviously redundant sequences as infertile. Waiting in the wings and looking to cause trouble are those branching operations, which I need, and which complicate (possibly fatally) some of the static analysis I've been doing to date. So I expect things to get slightly worse when I bring them in...
Clojure is pretty good, I'm finding. There's a repeated pattern I'm noticing in my use of it: spend a day and a half beating my head against a wall trying to do something, then find that there's a library function that does it for me. I'm feeling a bit more expressive in it, though - whilst what I'm writing definitely isn't optimal or idiomatic, it's concise and occasionally readable after-the-fact...
Quantified Self, Self-Hacking Day
June 16, 2012 | Comments
A little coterie of Brighton folks wandered up to the Self-hacking day in London, run by some of the Quantified Self crowd. I've not been to the monthly London meet-ups, but have been following this kind of thing for a few years now, on a no doubt hackneyed path that started with Nike+ and took in Mappiness, 23andme, and recently Runkeeper.
The morning saw a few talks:
- Alec Muffet and Eric King of Privacy International, talking about security concerns: in particular, the terrifying-sounding kit the Metropolitan Police have acquired to take copies of data from mobile phones (and the outdated laws which allow them to do it), broken assumptions in our attitudes to the security of devices we physically control, privacy policies, and the need to think more carefully about the possibility of any data one records being published;
- John Fass on data visualisation principles, calling for visualisation to be considered from the start and not tacked onto the end of QS products. John gave a good talk I failed to completely follow (and will try to watch again), but I liked his idea that the rise of infographics reflects a panic about the quantity of data we're faced with;
- Ian Clements told the by turns inspirational and horrifying story of his diagnosis with terminal cancer, when he was given a few weeks to live… 5 years ago. Ian's been gathering data on himself since 1974, and has a stronger imperative than many QS folk. I was horrified both by the attitude of medical professionals (who aren't interested in his data, or in one case will take it only on condition that they don't reveal their findings to him), and by the fact that there's software out there he'd like to use to analyse his multivariate logs, but just can't afford;
- Ken Snyder gave an overview of a few trends: convergence of the wellness, healthcare and social Internet industries; multi-sensor devices, medical devices becoming more consumer-oriented, and the rise of smartphones and wearables; and consumer trends around simplicity, more mainstream use of QS (shifting the focus away from power users, where it is today), taking the isolation out of self-empowerment, and data ownership.
Then after lunch, a couple of break-out sessions followed. For the first one, I took part in a group talking about behaviour change: where, after all, is the value in all this measurement if it can't be used to effect improvements? We rambled through BJ Fogg, talked about the need to consider closely which habits to create, what "kicks" would contribute to setting them up, and what environmental cues might be used to trigger "kicks"; the role of social support and accountability (touching briefly on buddy systems - tiny 2-person social networks, and something I don't think we've seen enough of, digitally); and models of willpower as a muscle which can be exercised, vs being a depletable resource. Personality styles also emerged as important - how would a service adapt to usage by loss-averse individuals, vs gain/reward types? And is the act of recording data itself useful - or would a completely automated system which required no participation from its user miss an important trick? Finally, someone brought up Nicorette as a good metaphor: a system which encourages good habits, and is then designed to fall aside in time, leaving the good habits in place. You don't see many digital QS services which encourage you to leave after a while…
The second session I found a bit less focused: we talked about archiving of data and wobbled between the value of deliberately constrained networks (which I'm sceptical of), Wi-fi vs LTE, and the lack of nuance in privacy policies (we give organisations permission to use our data, but rarely constrain what they can do with it beyond sharing it onwards).
A fab day, overall. I've been to a couple of these events now, and every time they feel really exciting: QS feels like a movement that is just going mainstream. Kudos to Adriana and team for organising and running the day, and thanks to Hub Westminster for hosting.
Self-publishing through Amazon
June 09, 2012 | Comments
It's no secret that Amazon is shaking up the publishing industry. I've been watching a friend go through some interesting experiences with self-publishing recently, and he's given me some figures and permission to write about it.
Paul's been writing (mainly crime) fiction for a little while now, drawing on his experiences as a police officer in Brighton, and his slightly diseased imagination. After quite a long process of talking to traditional publishers about The Follow, a novel about heroin dealers in Brighton, he decided in November last year to start publishing digitally. There's still a bit of a stigma associated with this sort of thing - I think that an important part of a traditional book deal is the validation of a large publisher saying "we think this is good" - but with the novel sitting unpublished for a little while, why not?
He hooked up with Trestle Press and put it onto Amazon at a quite low price point of £3.69. By mid-February, he was selling a couple of copies a day.
In March, he dropped the price to £1.99 after taking The Follow back from Trestle, and publishing himself. Sales hovered at between 10 and 18 a week.
In May, Paul decided to stop promoting The Follow and see what he could publish next, but noticed that he could make it free for 5 days using KDP Select; he flicked the switch to see what would happen, and saw 3000 downloads in the first day - both a boost for the ego and not a bad promotional device. "Now thousands of people were reading my book, and it hit number 2 in the free Kindle chart."
Sales crept upwards after that. The next day the book was ranked at around 400 on Kindle, and had sold 30 copies; the day after that it was ranked at 250; and a week after the promotion he'd sold 400 copies and was sitting outside the Kindle top 100 (and in the top 10 of the "procedurals" section, which I think is police fiction); sales at this point were about ten a day.
Reviews seem to have a direct impact on sales: after a negative review sales dip slightly, then increase again once a positive one comes through (perhaps the "most recent review" holds a lot of power).
Things I've noticed, watching Paul go through this journey:
- Most people I know who write do it for love, and to reach an audience. Digital publishing seems to offer both this, and the promise of some sort of return.
- I'm sure that a publisher knows how to market and distribute books better than the average author; but when the alternative is to not be published at all, digital self-publishing looks attractive.
- A promotion can build sales traffic, but it drops off shortly afterwards.
- There seems to be a direct relationship between the quality of reviews and sales.
OTA 2012: Hack Day
June 02, 2012 | Comments
I'm camped out at OverTheAir putting the finishing touches to my hack and its presentation. This year I've done a solo effort: Facebook have been here (doing a couple of excellent talks, one about the Open Graph API and another about their internal processes), and I wanted to play with some of this. And I've been thinking about Bob Hoskins.
More specifically, the "It's Good To Talk" adverts he starred in during the 1990s - back in the days when telecommunications companies ran adverts that said "go on, make a phone call" instead of trying to sell insubstantial and vaguely aspirational lifestyles. Bob's point was sound: there's more meaning to a telephone call than what's said, the act of calling is itself an expression of care. I'm pretty comfortable with digital communication, but I'm certain that if I emailed my mum a weekly update instead of calling her, we'd both feel something was missing.
And look at Facebook, a history of my social contact: events I've been to, what I'm doing and with whom, things I like, where I've worked, groups I'm joining. Such an exhaustive social record, but with a phone-call shaped gap.
So my hack was simple: I want Facebook to record when I've made a phone call to a Facebook friend. I imagine seeing my sister "like" my contact with my niece and nephew; or seeing clustered outpourings of telephoned support when friends talk about strife they're going through. I chose to implement this as an Android app because Android gives me access to the calling information I need, and I'm currently using a Galaxy S2 myself.
The app is extremely simple from a user's perspective: a single screen in which you log in and give permission to post to Facebook, and can deactivate the posting of calls if you want to stop. This simplicity is quite deliberate: I wanted all the posting of calls to happen silently, behind the scenes. The process of doing it is a little bit more involved than you might think:
- A simple state machine that watches for changes in the telephony state (between "idle", "off the hook", "ringing", etc.) to notice when an outgoing call has completed (there's a rough sketch of this step after the list);
- The app then grabs the phone number of the last call, and looks it up in the on-phone address book to find the name(s) of the person you were calling. If your phone is like mine, many people have multiple entries in your address book: the app tries to deduplicate this list;
- It then connects to Facebook and looks up these names using an FQL query, to see if you were calling a Facebook friend;
- Finally, it creates a "call" action referencing this friend, which will appear in the Activity list of your timeline, and potentially elsewhere. At this point, the call can be referenced by other Facebook users: liked, commented upon, and so forth.
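Here's a rough sketch of that first step - spotting the end of an outgoing call from the telephony state changes. It's illustrative rather than lifted from the hack: the class name and the onOutgoingCallFinished stub are mine, and the app needs the READ_PHONE_STATE permission for any of this to work.

```java
import android.content.Context;
import android.telephony.PhoneStateListener;
import android.telephony.TelephonyManager;
import android.util.Log;

// Watches the telephony state machine and fires once an outgoing call ends.
// The rest of the pipeline (call log lookup, contact matching, FQL query,
// Open Graph "call" action) would hang off onOutgoingCallFinished().
public class CallWatcher extends PhoneStateListener {

    private int lastState = TelephonyManager.CALL_STATE_IDLE;
    private boolean outgoingCallInProgress = false;

    public static void register(Context context) {
        TelephonyManager telephony =
                (TelephonyManager) context.getSystemService(Context.TELEPHONY_SERVICE);
        telephony.listen(new CallWatcher(), PhoneStateListener.LISTEN_CALL_STATE);
    }

    @Override
    public void onCallStateChanged(int state, String incomingNumber) {
        if (lastState == TelephonyManager.CALL_STATE_IDLE
                && state == TelephonyManager.CALL_STATE_OFFHOOK) {
            // Straight from idle to off-hook, with no ringing first: we dialled out.
            outgoingCallInProgress = true;
        } else if (state == TelephonyManager.CALL_STATE_IDLE && outgoingCallInProgress) {
            // Back to idle after an outgoing call: time to look up the number we
            // just called and, if it matches a Facebook friend, post about it.
            outgoingCallInProgress = false;
            onOutgoingCallFinished();
        }
        lastState = state;
    }

    private void onOutgoingCallFinished() {
        Log.d("CallWatcher", "Outgoing call finished; would now look up the contact and post to Facebook");
    }
}
```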
Obviously this is going to make some people uncomfortable: who you call is private, right? But I can't help looking at how far we've come over the last 10 years in our journey towards sharing and away from privacy, and feeling that this is only a short step forward (or backward, depending on how you view Facebook, and social sharing in general).
I'm going to run it for a while, see what happens, and see how it all feels. And if you'd like to have a play, you can find the source on github. There's one big improvement that's needed, and that's handling the case where you find more than one match on Facebook for a friend's name.