Stanford Artificial Intelligence MOOC

I’m proud to have completed the first ever offering of the Artificial Intelligence MOOC, run by Sebastian Thrun and Peter Norvig through Stanford University.

It was intriguing, challenging, and ultimately fun to get a first bit of working knowledge of things like spam filters, robot localization, and computer vision.

I’ve written a little Bayes filter based on the model introduced in that course.  I’ve hooked it up to my IRC client to alert me about the most interesting messages.  As they say though, the hardest part is training and data collection - it’s time-consuming to come up with enough good data to form a workable model.
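
For the curious, here’s a minimal sketch of the kind of scoring such a filter performs - purely illustrative, with made-up word counts, and not the actual code behind my IRC filter:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Illustrative naive Bayes scoring, along the lines of the course's spam-filter unit.
    // The counts below are invented; a real filter learns them from labelled messages.
    public class NaiveBayesSketch {

        static double logLikelihood(List<String> words, Map<String, Integer> counts,
                                    int totalWords, int vocabularySize) {
            double sum = 0.0;
            for (String w : words) {
                int c = counts.getOrDefault(w, 0);
                // Laplace smoothing so unseen words don't zero out the probability
                sum += Math.log((c + 1.0) / (totalWords + vocabularySize));
            }
            return sum;
        }

        public static void main(String[] args) {
            Map<String, Integer> interesting = new HashMap<>();
            interesting.put("deploy", 4);
            interesting.put("broken", 3);

            Map<String, Integer> boring = new HashMap<>();
            boring.put("lunch", 5);
            boring.put("weather", 2);

            List<String> message = Arrays.asList("deploy", "broken", "again");
            int vocabulary = 4; // distinct words seen across the training data

            // Equal priors of 0.5; compare log P(class) + sum of log P(word | class)
            double scoreInteresting = Math.log(0.5) + logLikelihood(message, interesting, 7, vocabulary);
            double scoreBoring      = Math.log(0.5) + logLikelihood(message, boring, 7, vocabulary);

            System.out.println(scoreInteresting > scoreBoring ? "alert" : "ignore");
        }
    }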

The main things I’ve gained from the course are an appreciation of the kinds of problems AI can solve, and an idea of which tool to use in a given situation.

Testing on Autopilot

I was reminded of the power of automated testing by this talk by Rod Johnson, the original creator of the Spring framework. It is a little dated (2007), but what he says is still highly relevant.  The content mainly covers things we should already be practicing as developers, but it’s worth a reminder every now and then. Following are the main points I took away from the presentation.

First, there are several key concepts to bear in mind.  These came up again and again in the talk:

  • Test Early, Test Often
  • Test at Multiple Levels
  • Automate Everything

Unit Testing

As developers, we know we should do lots of unit testing.  We do this by targeting classes in isolation, and mocking out collaborators.

To be clear, unit testing is looking at your class “in the lab”, not in the real world.  A unit test should not interact with Spring, or the database, or any infrastructure concerns.  Therefore, unit tests should run extremely fast: of the order of tens of thousands of tests per minute.  It shouldn’t be painful to run a suite of unit tests.
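
As a rough illustration - the OrderService and PriceRepository below are invented names, using JUnit 4 and Mockito - a test in this style might look like:

    import static org.junit.Assert.assertEquals;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.when;

    import org.junit.Test;

    // Hypothetical example: OrderService depends on a PriceRepository collaborator,
    // which is mocked so no database or other infrastructure is touched.
    public class OrderServiceTest {

        interface PriceRepository {
            double priceOf(String sku);
        }

        static class OrderService {
            private final PriceRepository prices;
            OrderService(PriceRepository prices) { this.prices = prices; }
            double totalFor(String sku, int quantity) { return prices.priceOf(sku) * quantity; }
        }

        @Test
        public void totalIsPriceTimesQuantity() {
            // "In the lab": the real repository (and its database) never comes near this test
            PriceRepository repository = mock(PriceRepository.class);
            when(repository.priceOf("WIDGET")).thenReturn(2.50);

            OrderService service = new OrderService(repository);

            assertEquals(5.00, service.totalFor("WIDGET", 2), 0.0001);
        }
    }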

Do Test-Driven Development. Not only does this help you discover APIs organically, but it’s a way of relieving stress.  Once a defect is detected, you can write a failing test for it, then come back to fix it later on.  The failing test is a big red beacon reminding you to finish the job.

Use tools such as Clover to measure the code-coverage of your tests.  80% is a useful rule of thumb.  Any more than this, and the benefits are not worth the cost.  Any less than 70%, and the risk of defects becomes significant.

Integration Testing

We should also do integration testing - for example to ensure our application is wired up correctly, and SQL statements are correct.  

But how many of us are still clicking through flows in a browser?  Ad-hoc testing by deploying to an application server and clicking around is very time-consuming and error-prone.  If it’s not automated, chances are it won’t happen.  If it doesn’t happen or occurs late in the project cycle, defects will be expensive to fix.

So instead, maintain a suite of integration tests.  It should be possible to run hundreds or thousands of these per minute, and again, they should be automated so they just happen.

Use Spring’s Integration Testing support.  Among other things, this provides superclasses which can perform each test in a transaction, and roll it back upon completion to avoid side-effects across tests.  This avoids the need to re-seed the database upon each test.
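
For example - assuming the annotation-driven TestContext support available from Spring 2.5, and a hypothetical customer table in the test schema - such a test might look like this:

    import static org.junit.Assert.assertEquals;

    import javax.sql.DataSource;

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.test.context.ContextConfiguration;
    import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
    import org.springframework.transaction.annotation.Transactional;

    // Each @Test method runs inside a transaction that is rolled back afterwards,
    // leaving the database exactly as it was found.
    @RunWith(SpringJUnit4ClassRunner.class)
    @ContextConfiguration(locations = "/applicationContext-test.xml") // hypothetical context file
    @Transactional
    public class CustomerSchemaIntegrationTest {

        @Autowired
        private DataSource dataSource;

        @Test
        public void insertedRowIsVisibleWithinTheTestTransaction() {
            JdbcTemplate jdbc = new JdbcTemplate(dataSource);
            int before = jdbc.queryForInt("select count(*) from customer");

            jdbc.update("insert into customer (name) values (?)", "Alice");

            assertEquals(before + 1, jdbc.queryForInt("select count(*) from customer"));
            // The surrounding transaction is rolled back after this method, so the row never persists.
        }
    }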

Another benefit of Spring Integration Testing is that the Spring context is cached between tests.  This means that the highly expensive construction of the Hibernate SessionFactory (if you use one) only happens once.  Without this support, such caching would normally be impossible, because JUnit constructs a new instance of the test class for each test method.

Remember to test behaviour in the database.  Stored procedures, triggers, views - regressions at the schema level should be caught early, in an automated fashion.

Integration tests should be deterministic - that is, they should not rely on the time of day, or on random side-effects from previous tests.  This should be obvious, but when testing concerns such as scheduling, it can become difficult.  One strategy is to abstract out the concept of the current time: replace a direct call to System.currentTimeMillis() with a call to a private method that checks for an override property set only during testing, the presence of which causes a fixed Date to be returned to your application code.
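
A minimal sketch of that idea - the class and property names here are mine, not from the talk:

    import java.util.Date;

    // Production code asks this class for "now" rather than calling
    // System.currentTimeMillis() directly.
    public class TimeSource {

        // Set e.g. -Dtest.fixed.time=1300000000000 only when running tests
        private static final String OVERRIDE_PROPERTY = "test.fixed.time";

        public Date now() {
            String fixed = System.getProperty(OVERRIDE_PROPERTY);
            if (fixed != null) {
                // Deterministic: every test run sees exactly the same instant
                return new Date(Long.parseLong(fixed));
            }
            return new Date(System.currentTimeMillis());
        }
    }

Injecting a clock-like interface and substituting a fixed implementation in tests achieves the same thing without relying on system properties.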

Performance Testing

This should begin as early as possible.  Use scriptable frameworks such as The Grinder, so performance testing is cheap to execute early and often.  This means performance regressions will be caught immediately, for example if somebody drops an index.

Many performance problems are due to a lack of understanding of ORM frameworks.  Learn to use your framework - fetch strategies, for example.  A common idiom is to eagerly fetch a collection of child entities up-front, rather than triggering the “N+1 Selects” problem by lazily loading each child record in a loop.  Additionally, consider evicting objects from the Session at appropriate points, to avoid memory overhead and to prevent the need for dirty-checking when the Session is flushed.
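
As a sketch - assuming a hypothetical Order entity with a lazily-loaded lines collection - the contrast in Hibernate looks something like this:

    import java.util.List;

    import org.hibernate.Session;

    // Not a drop-in class; it exists only to contrast the two query styles.
    public class FetchStrategyExample {

        static List<?> nPlusOne(Session session) {
            // One select for the orders, then one extra select per order the first time
            // each order's lines collection is touched in a loop.
            return session.createQuery("from Order o").list();
        }

        static List<?> eager(Session session) {
            // A single select that fetches each order together with its lines, up-front.
            return session.createQuery("select distinct o from Order o join fetch o.lines").list();
        }

        static void release(Session session, Object processedOrder) {
            // Once an order has been dealt with, evict it so the Session no longer has
            // to dirty-check it at flush time.
            session.evict(processedOrder);
        }
    }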

One strategy for diving deeply into database performance concerns is to enable SQL logging in your persistence framework.  An excessive number of SELECT statements per use case will quickly become apparent.
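
With Hibernate, for instance, this is a one-line switch - shown here programmatically, though the same properties can live in hibernate.cfg.xml:

    import org.hibernate.SessionFactory;
    import org.hibernate.cfg.Configuration;

    public class ShowSqlExample {
        public static void main(String[] args) {
            SessionFactory sessionFactory = new Configuration()
                    .configure()                                 // reads hibernate.cfg.xml from the classpath
                    .setProperty("hibernate.show_sql", "true")   // echo every SQL statement to stdout
                    .setProperty("hibernate.format_sql", "true") // pretty-print for readability
                    .buildSessionFactory();

            // Exercise a use case here and count the SELECTs that scroll past...

            sessionFactory.close();
        }
    }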

Conclusion

Developers should invest time in writing automated tests at multiple levels.  Even with a dedicated QA team in place, defects will only be caught early and fixed cheaply through an intelligent approach to automation.  Along with the adoption of best practices such as Dependency Injection and separation of concerns, the industry has many tools on offer to make comprehensive testing cheap and easy.

The Curse of the Splinternet

This week’s New Scientist featured a particularly fear-mongering article about internet security. Entitled “Age of the Splinternet”, it at first appears to be a cheer for the importance of net neutrality. But the subtext quickly becomes clear: the internet is a great place to be, but anything can and will go wrong, even fantastical, sci-fi doomsday scenarios …

The author begins with an illuminating history lesson on the structure of the internet. Dating back to the 1960s, the underlying system of routers was initially designed by the military as a fault-tolerant network, able to withstand a nuclear blast. The lack of central command, and the presence of autonomous nodes resulted in a decentralised, self-sustaining mesh, able to route traffic around a fault without human input.

The article goes on to praise the open, anonymous nature of the internet, making it difficult (though not impossible) for repressive regimes to censor the information their citizens can access. Then the author shifts gear, and a dark side to this openness is revealed. We are warned that companies such as Apple, Google and Amazon are starting to - can you imagine it - “fragment” the web to support their own products and interests.

Perhaps it should not be such a shock that business and commerce continues to operate through self-interest, even on the internet. There is no problem with the way Apple restrict the apps users can install: if they didn’t do it, someone else would. The motivation is that a large proportion of users want things to “just work”, and are happier on the “less choice, more reliability” side of the equation.

The existence of the iPhone caused its flip-side to come into existence - Android - with its comparatively open policy. As long as business happens on the internet, it will want to manipulate things to make a profit; nothing new there. The “internet” as a whole is unblemished by that fact.

The cloud is described as a single point of failure. For example, when Amazon’s EC2 service croaked, businesses that relied on it were offline for the duration. Never mind that the definition of “the cloud” is precisely the reverse of a “single point” - it is distributed and ought to be redundant (the EC2 outage was a bit of a freak occurrence, due to a network engineer running the wrong command and taking out the whole farm). There is, again, no threat here. The internet - like any social system - will grow and evolve based on the demands of its users. EC2 makes up a tiny part of the internet and, like any other service, has advantages and disadvantages; nobody has to use it.

The very next paragraph does an about-turn by admitting that the cloud actually is distributed, spreading your data among many locations, and that this too is a problem. An example of this “threat” is the hack of RSA, which led to the intrusion into Lockheed-Martin’s computers. It’s not clear exactly what the problem is here, other than the fact that specialised groups of people depend on each other to get things done, and sometimes they mess up.

Beyond this point, we are taken on a ghost train ride into sheer speculation and technical folly. “Imagine being a heart patient and having your pacemaker hacked,” the author warns. Even with the “evils” of the internet, we still know how to put up a firewall on our home PCs - surely we can do the same for medical devices. We may pick up a virus by randomly browsing the web, but pacemakers will never be so general-purpose, and will always be highly limited in functionality, through common sense and good engineering. If a pacemaker were somehow monitored over the internet, it would not be hard to isolate this function from its critical control system.

If we’re not already quaking in our boots and reaching for the “off” switch on our broadband modems, a similar medical doomsday scenario is presented. What if the glucose levels of a diabetic patient are monitored and controlled over the internet? Wouldn’t that get hacked? Again, this is pure speculation, outside the context of real constraints. The fault-tolerance requirements of medical systems are far more stringent than those of general consumer devices. Not to mention the fact that nobody would design a system which gambles a human life on network availability.

Anonymity is blamed for the ability of online criminals to operate with impunity. I guess the humble balaclava, or simply “hiding from the police” aren’t sophisticated enough tactics to make it into New Scientist.

Are we offered any solutions to these terrors of the modern age? Actually, yes. The first is a good old “internet licence”, along with some kind of hardware identification system. Although the author recognises the technical challenges in such an idea, we must also consider the 100% likelihood of the system being circumvented by those it intends to control. DRM comes to mind as an example of an identification and control system which simply doesn’t work, and is prohibitively expensive to fix after the fact.

In summary, it would be fair to say that the article tries to strike a balance between encouraging the openness of the internet and preventing misuse. But a prominent message prevails: that the internet is nonetheless a dangerous place, and therefore must be controlled. The solution really lies in basic common sense and best practices.

Armed with fully patched software and treading wisely, we have no cause for concern. Businesses will continue to bend the internet to their own ends (as they do in the real world). Criminals will continue to attempt to exploit it (as they do in the real world). We can’t prevent these things from happening, but we can use our heads to drastically reduce our chances of going splat on the information superhighway.

Runtime Dependency Analysis

I was wondering: if I change class Foo, how do I determine with 100% certainty which use-cases to include in my regression tests? It would be useful to know for sure that I must consider the Acme login process, as well as the WidgetCo webservice authentication - and nothing else. Can my IDE help me with this? Well, in some cases it’s straightforward to analyse backward dependencies. If I change class Foo, then static analysis tells me that webservice WSFoo and controller Bar are the only upstream entry points to my application affected by this change.

Smart Trax

It seems I’m obsessed with finding new applications for GPS data. The latest is an idea called Smart Trax: a hypothetical social application for discovering and sharing cycle routes. Imagine if you could upload a route (recorded via GPS), and find “similar” routes. These similar routes can then be compared to your own. It turns out there are a number of applications for this. Operation Duck Pond: it’s the weekend, you’re a keen cyclist, and your bike is getting lonely.

Congestion Maps for Cyclists

I had an idea to create a road map for cyclists, colour-coded by the likelihood of congestion, using GPS data. Here’s some background. I’ve been plotting a cycle commuting route from the West End to the Kingston area. On the map, Fulham Road looked like the most direct route: a relatively straight diagonal to the SW. But when I jumped on the bike to try it out, I found that Fulham Road is narrow, and so there is no way to filter past heavy traffic.

Time Management (Computer Metaphors) Part 2 – Polling and Interrupts

Good time management is a bit like computer programming, in some ways at least … How do you keep track of tasks which can’t be carried forward, until some outside event has taken place? Perhaps you’re waiting for a response from a vendor, or a decision from your manager on which option to take. ‘Hanging’ tasks like this can compete for your attention: you can’t do anything for the time being, but you don’t want to forget about them.

Time Management (Computer Metaphors) Part 1 – Streams

Good time management is a bit like computer programming, in some ways at least … How do you handle large tasks, without being overwhelmed by their size? Streams: computer programs generally read data from some location, process it, then output it somehow. For example, a video player will read data from the disk, decode it, then draw frames on the screen. Now, what is the best policy for this? How much data should be read from the disk, before it’s processed and flung out at the screen?

Natural Language Processing of Integer Values

I just pushed my most recent changes to NaturalNum - a Python library for natural language representation of integer values. Example usage:

    $ python example.py 123456 en_GB
    ['one', 'hundred', 'and', 'twenty', 'three', 'thousand', 'four', 'hundred', 'and', 'fifty', 'six']
    $ python example.py 123456 fr_FR
    ['cent', 'vingt', 'trois', 'mille', 'quatre', 'cent', 'cinquante', 'six']

Currently, only English and French are supported, for values up to hundreds of thousands. More languages will be added as inspiration strikes.

Getting Started with Drools Expert

I’m trialling expert systems in order to abstract away some tricky internationalization logic in an IVR application. Drools Expert might be what I need, and will hopefully save time compared with devil-in-the-details DSLs. The idea of a Rules Engine is that business rules are abstracted out of your application. Business rules are likely to change, so ideally they should not live in the source tree. Additionally, rules may be consulted or modified by business users, so ideally they would be free from syntactic mess and should be self-documenting.
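
As a first experiment, loading rules from a .drl file on the classpath and firing them looks roughly like this (Drools 5-era API; the file name and the fact being inserted are placeholders):

    import org.drools.KnowledgeBase;
    import org.drools.KnowledgeBaseFactory;
    import org.drools.builder.KnowledgeBuilder;
    import org.drools.builder.KnowledgeBuilderFactory;
    import org.drools.builder.ResourceType;
    import org.drools.io.ResourceFactory;
    import org.drools.runtime.StatefulKnowledgeSession;

    public class RulesBootstrap {
        public static void main(String[] args) {
            // Compile the rules; they live in a .drl file, outside the main source tree
            KnowledgeBuilder builder = KnowledgeBuilderFactory.newKnowledgeBuilder();
            builder.add(ResourceFactory.newClassPathResource("i18n-rules.drl"), ResourceType.DRL);
            if (builder.hasErrors()) {
                throw new IllegalStateException(builder.getErrors().toString());
            }

            KnowledgeBase knowledgeBase = KnowledgeBaseFactory.newKnowledgeBase();
            knowledgeBase.addKnowledgePackages(builder.getKnowledgePackages());

            // Insert domain objects as facts and let the engine decide which rules apply
            StatefulKnowledgeSession session = knowledgeBase.newStatefulKnowledgeSession();
            session.insert(new Object()); // a real fact object would go here
            session.fireAllRules();
            session.dispose();
        }
    }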