Bootstrap your node.js project in the cloud

So you have a great website idea and you want to build that first version and bring it online as fast as you can. You figured that node.js is the way to go. You kick off development and after a couple of hours of hacking you realize that, although you’re progressing at breakneck speed, you’re missing a few important bits:

  • How do I better structure my project?
  • I want to test this thing. I want unit tests, UI (headless browser) tests and public API tests (I want that API offering out too, of course)
  • I want proper CSS and HTML templating
  • Looks like I need non-trivial request routing; I need more than the default provided

Oh, and after you have all of this, you want to be able to deploy it to a node-ready cloud environment like Heroku without hassle.

Enter bootstrap.js.

Continue reading “Bootstrap your node.js project in the cloud”

A Comparison of Places APIs

Location Based Services are all the rage these days. The space is still being defined and the players are trying to differentiate their service offerings in order to attract a critical mass of developers. In this post I’ll draw a side-by-side comparison of the main features provided by the major Places API providers today. While I have no hard numbers to back up the “major provider” claim, I’ll simply go for the web companies I would look to when building an application around Location services.

Here are my candidates ordered by their first API release date:

Provider   | Name                | API Link                      | First Released
Yahoo      | Yahoo GeoPlanet API | Yahoo! GeoPlanet™             | May 2009
Foursquare | Foursquare API      | Foursquare APIv2              | Nov. 2009
Twitter    | Twitter Places API  | Geo methods in Twitter API    | Jun. 2010
Facebook   | Facebook Places API | Scattered under the Graph API | Aug. 2010
Google     | Google Places API   | Google Places API             | Nov. 2010

The features of all these APIs are designed primarily to support (and promote) the business use cases of each respective competitor. One notable exception is Yahoo’s GeoPlanet API which advertises itself as being a general purpose API for referencing places.

I won’t try to identify any “best” API in the end. This post is meant to allow the reader to make an informed decision on which API(s) to use.

Continue reading “A Comparison of Places APIs”

Towards An Open Database of Places: Location Autodiscovery

A short while back I read a challenging article titled It’s Time For An Open Database Of Places. There, Erick Schonfeld notes:

A long list of companies including Twitter, Google, Foursquare, Gowalla, SimpleGeo, Loopt, and Citysearch are far along in creating separate databases of places mapped to their geo-coordinates. These efforts at creating an underlying database of places are duplicative, and any competitive advantage any single company gets from being more comprehensive than the rest will be short-lived at best. It is time for an open database of places which all companies and developers can both contribute to and borrow from.

I agree that there is duplication of effort but this is what happens with many competitive technologies (look at how many online maps are available today). Each company tries to add a competitive advantage to its offering while providing the same core functionality as the competition.
Update: I started this post back in April and a lot of recent developments only reinforce this point. (Check Facebook Places and Google Places for more info).

I like the idea of an open database of places. Any company could build value-added services on top of it and sell them without worrying about the issues that come with building and maintaining such a database, like geo-location/address accuracy and duplicate place resolution, to name just a few. TechCrunch’s Schonfeld adds another issue: who can update a place and who should be in control of it, suggesting that anybody can update the database and “the best data should prevail”. This is hard and suggests a wiki-like approach, for better or worse.
I’m not a fan of centralizing such a database. Since there are such great market forces at play, it may become a playground for fights (my data is better than yours). A committee will then attempt to regulate it, only to push it into oblivion, while everybody takes their toys and goes off to build their own database.

I have a different idea (and it’s not new either).

Businesses have a great deal of interest in such a database. It puts them on the map. They don’t particularly care who is using their place data as long as the data about their business is correct and their customers can easily reach their venue. Using mobile routing software to get to a place in the real world is the equivalent of not waiting more than four seconds for a webpage to load: it just has to route the customer precisely to a location.

Why not let businesses own their own geo data? All it takes is for them to have a website and add a bit of information to it to allow for auto-discovery; it’s called geotagging. It’s the same idea Matt Griffith had back in 2002 for RSS feed autodiscovery, applied to geo. The real win is for small businesses that adopt geotagging. All they need to do is add a small bit of metadata to their homepage and let web indexers do the job of collecting this data. Oh, and it’s free.
This brings a double win: companies in the mapping business get access to accurate geo information about businesses, and the businesses themselves are happy that their customers can precisely find their physical location by means of address and/or geo-coordinates. Moreover, the accuracy of the data is maintained by the businesses, since they want their customers to find them even when they move. A Places database that aggregates this type of data can mark these places as “verified” since they come directly from merchants. It even provides a more accurate basis for building forward and reverse geocoding tools.
Going forward with this model, competitors will shift their efforts from building a database of places to adding value to a (more or less) common Places database, such as local promotions, and to building great mapping products that allow us, the customers, to find them.
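
To make the autodiscovery idea concrete, here is a minimal sketch of the indexer side. It assumes the homepage carries the common geotagging meta tag of the form <meta name="geo.position" content="latitude;longitude">, and that the name attribute precedes content. The class and method names (GeoTagExtractor, extract) and the sample homepage are hypothetical, and a real crawler would use an HTML parser rather than a regular expression.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of an indexer step that reads a geo.position meta tag from a homepage. */
class GeoTagExtractor {

    /** Latitude/longitude pair in decimal degrees. */
    static final class GeoPoint {
        final double latitude;
        final double longitude;

        GeoPoint(double latitude, double longitude) {
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    // Matches <meta name="geo.position" content="lat;lon">. Assumption: the
    // name attribute comes before content, which keeps the pattern short.
    private static final Pattern GEO_POSITION = Pattern.compile(
            "<meta\\s+name=[\"']geo\\.position[\"']\\s+content=[\"']\\s*"
            + "(-?\\d+(?:\\.\\d+)?)\\s*;\\s*(-?\\d+(?:\\.\\d+)?)\\s*[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the tagged position, or empty if the page is not geotagged. */
    static Optional<GeoPoint> extract(String html) {
        Matcher m = GEO_POSITION.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new GeoPoint(
                Double.parseDouble(m.group(1)),
                Double.parseDouble(m.group(2))));
    }

    public static void main(String[] args) {
        // Hypothetical homepage of a small business.
        String homepage = "<html><head>"
                + "<meta name=\"geo.position\" content=\"52.5200;13.4050\">"
                + "</head><body>Hypothetical bakery</body></html>";
        extract(homepage).ifPresent(p ->
                System.out.println(p.latitude + ", " + p.longitude));
    }
}
```

A Places database fed this way could store each extracted point next to the URL it came from, which is what would let it flag the entry as coming directly from the merchant.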

The hard part is promoting this model. If, say, half of the brick-and-mortar small businesses with a web presence embed geo metadata on their website, then the big players will take notice. How to get there is the real challenge.

Image via Flickr/bryankennedy

RESTful error handling with Tomcat and SpringMVC 3.x

Handling errors in a REST way is seemingly simple enough: when an error occurs upon requesting a resource, the response should carry a proper status code and a body that contains a parseable message, using the content type of the request.
The default error pages in Tomcat are ugly. Not only do they expose too much of the server internals, they are also formatted only as HTML, which makes them a poor choice if a RESTful web service is deployed in that Tomcat container. Replacing them with simple static pages is still not enough, since I want a dynamic response containing error information.
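
As an illustration of the kind of response described above (and not of the steps in the full post), here is a minimal, framework-free sketch that renders an error as a status code plus a parseable body in the format the client asked for. The names RestErrorRenderer, ErrorResponse and render are hypothetical, the JSON and XML shapes are assumptions, and the escaping is deliberately minimal.

```java
import java.util.Locale;

/** Sketch: render an error as a status code plus a parseable body in the client's format. */
class RestErrorRenderer {

    /** Status, content type and body of an error response. */
    static final class ErrorResponse {
        final int status;
        final String contentType;
        final String body;

        ErrorResponse(int status, String contentType, String body) {
            this.status = status;
            this.contentType = contentType;
            this.body = body;
        }
    }

    /** Builds the error response; requestedType is the content type taken from the request. */
    static ErrorResponse render(int status, String message, String requestedType) {
        String type = requestedType == null ? "" : requestedType.toLowerCase(Locale.ROOT);
        if (type.contains("application/xml")) {
            String body = "<error><status>" + status + "</status><message>"
                    + escapeXml(message) + "</message></error>";
            return new ErrorResponse(status, "application/xml", body);
        }
        // Assumption: fall back to JSON for any other (or missing) content type.
        String body = "{\"status\":" + status + ",\"message\":\"" + escapeJson(message) + "\"}";
        return new ErrorResponse(status, "application/json", body);
    }

    // Minimal escaping only: enough for the sketch, not a full XML/JSON encoder.
    private static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    private static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        ErrorResponse r = render(404, "No such user", "application/json");
        System.out.println(r.status + " " + r.contentType + " " + r.body);
    }
}
```

In a servlet container the same function would be called from whatever component handles the error dispatch, so the client never sees the HTML default page.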

Here’s how to do it in 3 simple steps:

Continue reading “RESTful error handling with Tomcat and SpringMVC 3.x”

Building a content aggregation service with node.js

Fetching, aggregating and transforming data for delivery is a seemingly complex task. Imagine a service that serves aggregated search results from Twitter, Google and Bing where the response has to be tailored for mobile and web. One has to fetch data from different sources, parse and compose the results then transform them into the right markup for delivery to a specific client platform.
To cook this I’ll need:
– a web server
– a nice way to aggregate web service responses (pipelining would be nice)
– a component to transform the raw aggregated representation into a tailored client response.

I could take a stab at it and use Apache/Tomcat, Java (using Apache HttpClient 4.0), a servlet dispatcher (Spring WebMVC) and Velocity templating but it sounds too complex.

Enter Node.js. It’s an event-based web server built on Google’s V8 engine. It’s fast, it’s scalable and you develop on it using the familiar JavaScript.
While Node.js is still new, the community has built a rich ecosystem of extensions (modules) that greatly ease the pain of using it. If you’re unfamiliar with the technology, check out the Hello World example; it should get you started.
Back to the task at hand, here are the modules I’ll need:
– Restler to fetch the data
– async to parallelize requests for efficient data fetching
– Haml-js for view generation
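
The fetch, aggregate and transform pipeline itself is independent of Node.js. The sketch below shows its shape in Java, to match the other code samples on this page: the sources are fetched in parallel, collected by name, then rendered per client. Everything here is an assumption made for illustration: the class name SearchAggregator, the stubbed sources and the two output formats. In the Node.js version, the async module would take the place of CompletableFuture.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Sketch of a fetch-in-parallel, aggregate, then transform pipeline. */
class SearchAggregator {

    /** Starts every source in parallel and returns the results keyed by source name. */
    static Map<String, String> fetchAll(Map<String, Supplier<String>> sources) {
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        // Kick off all requests first so they run concurrently.
        sources.forEach((name, source) ->
                pending.put(name, CompletableFuture.supplyAsync(source)));
        // Then wait for each one, preserving the order of the sources.
        Map<String, String> results = new LinkedHashMap<>();
        pending.forEach((name, future) -> results.put(name, future.join()));
        return results;
    }

    /** Transforms the aggregate into a client-specific representation. */
    static String render(Map<String, String> results, String client) {
        // Assumption: mobile clients get a compact one-line form, web clients one line per source.
        String separator = "mobile".equals(client) ? " | " : "\n";
        return results.entrySet().stream()
                .map(e -> e.getKey() + ": " + e.getValue())
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        // Stubbed sources; real ones would call the remote search services.
        Map<String, Supplier<String>> sources = new LinkedHashMap<>();
        sources.put("twitter", () -> "stubbed tweets");
        sources.put("google", () -> "stubbed results");
        sources.put("bing", () -> "stubbed results");
        System.out.println(render(fetchAll(sources), "mobile"));
    }
}
```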

Continue reading “Building a content aggregation service with node.js”

Using Spring 3.0 MVC for RESTful web services (rebuttal)

Update Mar. 04: Thanks to @ewolff, some of the points described below are now official feature requests. One (SPR-6928) is actually scheduled for Spring 3.1 (cool!). I’ve updated the post and added all open tickets. Please vote!

This post is somewhat of a response to InfoQ’s Comparison of Spring MVC and JAX-RS.
I recently completed a migration from a JAX-RS implementation of a web service to Spring 3.0 MVC annotation-based @Controllers. The aforementioned post on InfoQ was published a few days after my migration, so below I’m listing the problems I had, along with solutions.

Full list of issues:

Same relative paths in multiple @Controllers not supported
Consider two Controllers where I use a versioned URL and a web.xml file that uses two URL mappings:

public class AdminController {
   public SomeResponse showUserDetails(String userId) { /* body omitted in this excerpt */ }
}

public class UserController {
   public SomeOtherResponse showUserStream(String userId) { /* body omitted in this excerpt */ }
}
In web.xml:

Continue reading “Using Spring 3.0 MVC for RESTful web services (rebuttal)”

Unit testing with Commons HttpClient library

I want to write testable code and occasionally I bump into frameworks that make it challenging to unit test. Ideally I want to inject a service stub into my code then control the stub’s behavior based on my testing needs.
Commons HttpClient from Jakarta facilitates integration with HTTP services, but how do you easily unit test code that depends on the HttpClient library? It turns out it’s not that hard.
I’ll cover both 1.3 and the newer 1.4 versions of the library since the older v1.3 is still widely used.
Here’s some typical service (HttpClient v1.3) we want to test. It returns the remote HTML page title:

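import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;

public class RemoteHttpService {
   private HttpClient client;

   // Returns the contents of the remote page's <title> element, or null if there is none
   public String getPageTitle(String uri) throws IOException {
     String contentHtml = fetchContent(uri);
     Pattern p = Pattern.compile("<title>(.*)</title>");
     Matcher m = p.matcher(contentHtml);
     if (m.find()) {
       return m.group(1);
     }
     return null;
   }

   // Issues a GET and returns the body as text; any status other than 200 is an error
   private String fetchContent(String uri) throws IOException {
      HttpMethod method = new GetMethod("" + uri);
      int responseStatus = client.executeMethod(method);
      if (responseStatus != 200) {
        throw new IllegalStateException("Expected HTTP response status 200 " +
            "but instead got [" + responseStatus + "]");
      }
      byte[] responseBody = method.getResponseBody();
      return new String(responseBody, "UTF-8");
   }

   // Injection point for the real client or a test stub
   public void setHttpClient(HttpClient client) {
      this.client = client;
   }
}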

The HttpClient is injected at runtime (via some IoC container or explicitly).
To be able to unit test this code we have to come up with a stubbed version of the HttpClient and emulate the GET method.
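
The general technique, inject a stub and control what it returns, can be shown without the HttpClient library at all. The sketch below is a self-contained illustration and not the stub developed in the full post: PageFetcher is a hypothetical interface standing in for the injected HttpClient, and PageTitleService repeats the title-extraction logic of the service above against that interface.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical seam standing in for the injected HttpClient. */
interface PageFetcher {
    String fetch(String uri) throws IOException;
}

/** Same title-extraction logic as the service above, written against the seam. */
class PageTitleService {
    private static final Pattern TITLE = Pattern.compile("<title>(.*)</title>");
    private final PageFetcher fetcher;

    PageTitleService(PageFetcher fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the page title, or null when the page has no title element. */
    String getPageTitle(String uri) throws IOException {
        Matcher m = TITLE.matcher(fetcher.fetch(uri));
        return m.find() ? m.group(1) : null;
    }
}

class PageTitleServiceDemo {
    public static void main(String[] args) throws IOException {
        // Stub: returns canned HTML instead of going to the network.
        PageFetcher stub = uri -> "<html><head><title>Hello</title></head></html>";
        System.out.println(new PageTitleService(stub).getPageTitle("example.org"));
    }
}
```

With the real library the stub has more work to do, because the service creates the GET method itself and only the client is injected.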

Continue reading “Unit testing with Commons HttpClient library”

Spincloud, now with worldwide forecast

In my constant search for free weather data for Spincloud, a short while ago I found a gem: free forecast data offered by the progressive Norwegian Meteorological Institute. The long-range forecast coverage is fairly thorough and covers more than 2700 locations worldwide. I am happy to announce that I have extended Spincloud to include it.


The data is refreshed every hour and the forecast range is available for the next seven days. You can bookmark any location or subscribe to weather reports via RSS.
On a different but related note, I can only be thankful for the free data offered by various meteorological organizations that allows Spincloud to exist. I believe in freeing public data (weather-related and otherwise), as it belongs to the public that finances government and inter-government agencies in the first place. The Norwegian Met Institute is a great example of freeing its data and I salute them for making the right decision.
Now if only all such progressive Meteorological institutes around the world would agree on a common format for disseminating their data, it would make developers like me a tad happier…

Continuous everything?

I admit that I regard automation as a dull but vital part of the success of a project. Automation has evolved into Continuous Integration, a powerful toolset allowing frequent and regular building and testing of the code. I won’t get into what CI is (check the internets). Instead, I am going to explore a couple of aspects of CI that can be added to the artifacts of the development process and note some others that cannot.

Continuous performance
You wrote performance tests. You can run them by firing a battery of tests from a client machine, targeted at an arbitrary environment where the application lives. The test results are collected on the client and you can publish them on a web server. Why not automate this completely, then? To accomplish this, automate the execution, gathering and publishing of the performance results. Daily performance indicators not only increase the visibility of the application’s progress, they also make it much easier to trace a performance degradation to a daily changeset than to the difference between two releases. There are a couple of factors that may add complexity to establishing performance tests:
Dealing with dependencies
The obvious rule of thumb is to minimize dependencies. However, if there still are dependencies on other (perhaps external) systems, use mocks to isolate the system you’re testing for performance. We’re talking about nightly performance tests, so don’t put unnecessary stress where you shouldn’t.
Finally, the main artifact of the final integration (done once per iteration) is a running environment where all components run together, in a production-like setup. Use this environment to run your system performance test where you measure the current performance against the baseline.
Measuring relative performance
The environment you’re using for the nightly PT cycle will most likely not be a perfect mirror of production (especially true when dealing with geographically distributed systems). Use common sense to establish the ratio between the two environments, then use it to derive rough production performance numbers (assuming a linear CPU/throughput relationship).
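
As a worked example of that derivation, here is a minimal sketch. The numbers are made up: a test environment with 4 cores that sustains 250 requests per second, and a production environment with 16 cores. The class and method names are hypothetical, and the result is only as good as the linear-scaling assumption.

```java
/** Sketch: scale a throughput measured on the test environment up to production. */
class RelativePerformance {

    /**
     * Estimates production throughput from a test measurement, assuming
     * throughput grows linearly with CPU capacity.
     */
    static double estimateProductionThroughput(double measuredThroughput,
                                               double testCapacity,
                                               double productionCapacity) {
        if (testCapacity <= 0) {
            throw new IllegalArgumentException("testCapacity must be positive");
        }
        double ratio = productionCapacity / testCapacity;
        return measuredThroughput * ratio;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 250 req/s on 4 cores, production has 16 cores.
        double estimate = estimateProductionThroughput(250.0, 4.0, 16.0);
        System.out.println(estimate); // prints 1000.0
    }
}
```

Tracking the estimate nightly matters more than its absolute value: a drop from one night to the next points at the changeset that caused it.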

Continuous deployment
This is as simple as it sounds: automate the install. Make it dead easy to deploy the application in any environment by providing installation scripts. Simplify the configuration down to a single file that is self-documented and easily understood by non-programmers (read Operations Teams). The goal here is to unlock a powerful tool: making the application installable and upgradable with the click of a button. If all the other pieces of the continuum are in place then you could confidently deploy your application in production on a much tighter release cycle, even on a daily basis. Deployment and integration become tasks in the background rather than first-class events.

More continuous
I am fresh off the Agile Testing Days conference, where I learned a few more Cs from the distinguished speakers. I call them Soft Cs since they involve constant human engagement:
– Continuous learning (Declan Whelan)
– Continuous process improvement (Stuart Reid)
– Continuous acceptance testing (i.e. stakeholder signoff at the end of every sprint) – Anko Tijman
– Continuous customer involvement (Anko Tijman)

This blog is one year old!

Sep. 15 came and went and I didn’t realize that this blog is one year old. I haven’t posted in the last few months because I have been busy moving across the world from Toronto back to Europe (I’m in Berlin now), becoming a father (“the best job in the world”) and taking a new job (LBS, yes!).
Things are happening, wheels are in motion and there’s still a lot to write about.

Stay tuned!