On Engineers and Technicians

[Images: E_Bridge, Log_bridge]

The software engineering field is a liberal one with a relatively low barrier to entry. Formal education is valued less than hands-on experience, but experience tends to be difficult to measure accurately. The industry has come up with various ways to gauge a candidate’s skills, ranging from coding tests and whiteboard interviews to reviewing their public work or asking for a work portfolio. Programming is also an art, isn’t it?

Having been on both sides of the interview table, I’ve long been preoccupied with finding the right fit for the job at stake. Here’s a simple method that helps me qualify a candidate’s skills by using the distinction that traditional technical industries make between technician and engineer.
Wikipedia defines a technician as follows:

A technician is a worker in a field of technology who is proficient in the relevant skills and techniques, with a relatively practical understanding of the theoretical principles.

and for engineer:

Engineers design materials, structures, and systems while considering the limitations imposed by practicality, regulation, safety, and cost.
A professional engineer is competent by virtue of his/her fundamental education and training to apply the scientific method and outlook to the analysis and solution of engineering problems. He/she is able to assume personal responsibility for the development and application of engineering science and knowledge, notably in research, design, construction, manufacturing, superintending, managing and in the education of the engineer.

There is a major difference between the skills used by an electrician mounting or fixing your broken wall plug (who cares if it’s using AC and not DC?) and the skills needed by the engineer who designed it. Consequently, there is a similar difference between someone hacking together a website with Nginx, someone who designs the browser that runs that website, and someone devising the HTTP specification used by both the browser and the Nginx-driven website.

Hints on how to distinguish a software technician from a software engineer:
– A technician will tend to use a restricted set of tools. An engineer knows what tool to use to get the job done: programming language choice, databases, frameworks, third party libraries, etc.
– An engineer’s work is articulate, symmetric and consistent while striving to address all other concerns like regulation, safety and cost (sic).
– An engineer follows the idioms of the language or framework she’s using. This goes beyond personal coding style and into leveraging language constructs and recognizable patterns that best fit the solution.
– Ask the five whys when inquiring about a technical detail. Engineers know how to explain their architectural, design as well as implementation choices. A technician will get a certain job done but will not be able to explain the design decisions behind the building blocks he’s using.
– An engineer’s work is long lasting with the architectural intent surviving refactorings and indeed guiding the evolution of a system.

The seniority scale can be applied to both trades. A Sr. Engineer masters the discipline by making the right architecture, design and implementation choices given both small and large scale systems. A senior technician will be able to quickly solve a difficult local problem with the tools they have but they recognize they need the help of an engineer to correctly design complex systems.

Lastly, both trades have virtues and technology companies need both skill sets, just as a hospital needs both nurses and doctors to function.

Images courtesy cogdog jimmywayne via Flickr.

Posted in software

Five years of weather in a timelapse

This has been a long time coming, five years actually. Spincloud went live in January 2009 and I added the temperature heatmap overlay a few months later.
The heatmap and the corresponding map overlay are generated once an hour, and the job has been faithfully doing so for six years. It first generates a global temperature heatmap, then cuts tiles (up to zoom level 6). Every hour.
Back then I also started saving one heat map image a day. The idea was to generate a timelapse at some point, showing a visualization of the global temperatures over a longer span of time.
Now it’s time to show the result: below are 5 years of global temperatures in a one minute timelapse (make sure to switch to the HD version if not enabled by default):


* I started collecting this data in Aug. 2009 but the lapse above starts in July 2010 as I have a data gap between Dec. 2009 and Jul. 2010. Frankly I can’t remember why.

The timelapse neatly shows the SYNOP and METAR coverage globally. The weather stations behind these data sources report at various times of day and I’ve figured that 19:00 GMT is the hour in the day with the most reporting stations. It looks like this changed about a year ago: Russia’s remote locations are not covered well anymore at this time of day. The rest of the landmass is reasonably well covered.

I think it’s interesting to explain how I have actually generated the temperature heat map. The code you’ll see is pre Java 8 since I haven’t upgraded the code much in the past years.

Spincloud uses several sources of weather data and it’s quite remarkable how little they have changed over the past 6 years. I witnessed some data sources going offline but the core of the data still comes from the same sources. For the heatmap I use the METAR and SYNOP global data, mostly coming from the NWS servers, which have been providing clean and reliable data for many years. The data is stored in the local database and it essentially contains a map point, a temperature and a timestamp.
With these sources at hand, the logic to generate a heatmap image is as follows:
1. Iterate over each pixel on the global map and get the respective temperature. Only include land masses.
2. Interpolate that temperature with the temperatures of the nearest locations where there are temperature readings
3. Generate a 2×2 pixel rectangular area with a color that corresponds to the interpolated temperature. Only fill land masses in order to make the overlay look realistic.
4. Append that image in memory to the global heat map image in the correct location
5. Repeat until complete then dump the generated image to disk

Turns out that at any given time there are fewer than 20,000 temperature data points globally, so I can keep all of them in a list in memory. In step 1, when iterating over each map pixel, this list is searched for temperature points in the vicinity.
The map mask referenced in the code below, worldmap-mask.png, looks like this:

Here’s the code (warning: not executable):

public void generateHeatMap() {
  //Collect all temperature points
  List metaPoints = getMetaPoints();
  BufferedImage heatMap;

  //Load the world map mask in memory; all black pixels belong to landmasses.
  //We'll use it to filter-in only land masses.
  try {
    heatMap = ImageIO.read(this.getClass().getClassLoader().getResourceAsStream("worldmap-mask.png"));
  } catch(Exception e) {
    logger.error("Exception: " + e.getMessage(), e);
    throw new RuntimeException("worldmap-mask.png not found");
  }
  Graphics2D g = (Graphics2D) heatMap.getGraphics();
  int blackRGB = Color.black.getRGB();
  //1. take every second pixel then scan the entire map surface
  int step = 2;
  for (int x = 0; x < MapUtil.MAP_WIDTH; x += step) {
    for (int y = 0; y < MapUtil.MAP_HEIGHT; y += step) {
      //only draw over land mass (masked with black).
      //seas/oceans will be left white.
      if(heatMap.getRGB(x, y) == blackRGB) {
        //Get the nearest temperature points along with their relative distance to the current map point
        SortedMap nearestPts = findNearest(metaPoints, x, y);
        //2. Compute the interpolated temperature using an inverse distance algorithm:
        // https://en.wikipedia.org/wiki/Inverse_distance_weighting
        int interpolatedT = (int)getInterpolatedTemperature(x, y, nearestPts);
        try {
          //identify the pixel color for this temperature
          g.setColor(getTempColor(interpolatedT));
        } catch(Exception e) { //skip exception, at worst there'll be gaps in the heatmap
          logger.info("Exception: " + e.getMessage(), e);
          logger.info("The temperature was: " + interpolatedT);
          continue;
        }
        //3,4. Generate a step x step rectangle at current coordinates
        g.fillRect(x, y, step, step);
      }
    }
  }

  //5. Save image to disk. Done every day at 19:00 GMT only.
  //For Google maps overlay
}
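
The interpolation in step 2 is not shown above. As a rough illustration of inverse distance weighting, here is a minimal sketch. It is not Spincloud's actual getInterpolatedTemperature: the class name, the Reading type and the power parameter are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; a sketch of inverse distance weighting only.
final class IdwSketch {

    /** A temperature reading located at a map pixel position (assumed shape). */
    static final class Reading {
        final double x, y, temp;
        Reading(double x, double y, double temp) {
            this.x = x;
            this.y = y;
            this.temp = temp;
        }
    }

    /** Weighted average of the readings, each weighted by 1 / distance^power. */
    static double interpolate(double x, double y, List<Reading> nearest, double power) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (Reading r : nearest) {
            double distance = Math.hypot(r.x - x, r.y - y);
            if (distance == 0.0) {
                return r.temp; // the pixel sits exactly on a station
            }
            double weight = 1.0 / Math.pow(distance, power);
            weightedSum += weight * r.temp;
            weightTotal += weight;
        }
        // No readings at all: signal "no value" rather than dividing by zero
        return weightTotal == 0.0 ? Double.NaN : weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        List<Reading> nearest = Arrays.asList(new Reading(0, 0, 10.0), new Reading(2, 0, 20.0));
        // Two equidistant stations contribute equally to the result
        System.out.println(interpolate(1, 0, nearest, 2.0));
    }
}
```

A power of 2 is a common choice; larger values make the closest station dominate the result.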


The second resource used and buried behind the call to getTempColor is the color temperature scale.
The code that gets a temperature and figures its color is this:

//init code:
  private int[] colorPixels = null;
  colorPixels = new int[colorMapW * colorMapH];
  PixelGrabber pg = new PixelGrabber(heatColorImage, 0, 0, colorMapW, colorMapH, colorPixels, 0, colorMapW);
  pg.grabPixels(); //fills colorPixels; throws InterruptedException

private Color getTempColor(int temp) {
  if (temp == -9999) {
    return Color.white;
  }
  // 273 px, 39 gradations each 7 px. starting -60 ends +60
  // note: 273 / 120 is an integer division and evaluates to 2
  int pos = (273 / 120) * (temp + 60);

  //read the color from the middle row of the scale image
  int c = colorPixels[(colorMapH / 2) * colorMapW + pos];
  int red = (c & 0x00ff0000) >> 16;
  int green = (c & 0x0000ff00) >> 8;
  int blue = c & 0x000000ff;
  // and the Java Color is ...
  Color cl = new Color(red, green, blue);
  return cl;
}
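
A note on the position lookup: in Java, 273 / 120 is an integer division and evaluates to 2, so the computed position only spans the first 240 pixels of the scale, and temperatures outside -60..+60 fall outside it. The variant below is a sketch, not the code Spincloud runs. It assumes a 273 px wide scale covering -60 to +60, clamps the input and multiplies before dividing; the class and constant names are made up for the example.

```java
// Hypothetical helper; assumes a 273 px wide color scale covering -60..+60.
final class TempScaleSketch {
    static final int SCALE_WIDTH = 273;
    static final int MIN_TEMP = -60;
    static final int MAX_TEMP = 60;

    /** Maps a temperature to a column index in 0..SCALE_WIDTH-1. */
    static int position(int temp) {
        // Clamp so out-of-range readings land on the ends of the scale
        int clamped = Math.max(MIN_TEMP, Math.min(MAX_TEMP, temp));
        // Multiply first, divide last, so integer division does not truncate the ratio
        return (clamped - MIN_TEMP) * (SCALE_WIDTH - 1) / (MAX_TEMP - MIN_TEMP);
    }

    public static void main(String[] args) {
        System.out.println(position(-60)); // left edge of the scale
        System.out.println(position(0));   // middle of the scale
        System.out.println(position(60));  // right edge of the scale
    }
}
```

Switching to this mapping would shift the colors slightly compared to the heatmaps generated so far, since it uses the full width of the scale.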

So far I have collected 1774 useful images, and counting. Two years back I moved Spincloud to DigitalOcean (note: referral link) and kept collecting this historical data without a hitch.

To generate the timelapse I used ffmpeg and this tutorial. You’ll notice the month-year embedded in the video; I used this howto to get them in and this to figure out the subtitle format. There’s some code I wrote to generate the subtitles file to be in sync with the timelapse but it’s too boring to include.
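
For reference, the general shape of such a subtitle generator fits in a few lines. The sketch below is not the original code. It assumes the SRT subtitle format, one image per day with no gaps, and a fixed frame rate; the class and method names are invented for the example.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch: one SRT cue per calendar month of daily frames.
final class TimelapseSubtitles {

    /** Formats milliseconds as an SRT timestamp, HH:MM:SS,mmm. */
    static String timestamp(long millis) {
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d",
                millis / 3_600_000, (millis / 60_000) % 60, (millis / 1000) % 60, millis % 1000);
    }

    /** Builds the SRT text for `frames` daily images starting at `firstDay`, played at `fps`. */
    static String srt(LocalDate firstDay, int frames, int fps) {
        DateTimeFormatter label = DateTimeFormatter.ofPattern("MMM yyyy", Locale.ENGLISH);
        StringBuilder out = new StringBuilder();
        int cue = 1;
        int start = 0;
        while (start < frames) {
            LocalDate day = firstDay.plusDays(start);
            LocalDate monthEnd = YearMonth.from(day).atEndOfMonth();
            // Index of the first frame that belongs to the next month
            int end = start + (int) (monthEnd.toEpochDay() - day.toEpochDay()) + 1;
            end = Math.min(end, frames);
            out.append(cue++).append('\n')
               .append(timestamp(start * 1000L / fps)).append(" --> ")
               .append(timestamp(end * 1000L / fps)).append('\n')
               .append(label.format(day)).append("\n\n");
            start = end;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two months of daily frames at 31 frames per second
        System.out.print(srt(LocalDate.of(2010, 7, 1), 62, 31));
    }
}
```

With gaps in the image archive, the frame-to-date mapping would have to come from the image file names instead of simple day arithmetic.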

As a side note, Spincloud runs a total of 8 background jobs collecting temperature and forecast data, generating temperature and radar tiles, and weather warning data, all on the cheapest DigitalOcean plan.

Posted in software

A TinyGPS upgrade adding NMEA v3.0 and GLONASS support

TL;DR: Check out my fork at https://github.com/florind/TinyGPS. It adds GLONASS and NMEA 3.x support and is fully compatible with the original TinyGPS and Arduino.

It’s been too long a dry streak on my blog (four years!) but in the meantime I’ve been actively working on several projects. About 3.5 years ago I reconnected with hardware engineering, a hobby of mine ever since I was a kid.
I am integrating GPS in one of my Arduino-compatible hardware projects and I’m using a Maestro Wireless part called A5100 that notably comes with GLONASS support (the Russian satellite navigation constellation).
There’s some pretty nice GPS support in the TinyGPS project but it lacks any of the advancements in the field such as NMEA v3.0 or support for additional constellations. This means that some good data is ignored when parsing NMEA sentences from devices such as the aforementioned A5100.

I have therefore worked on an update to TinyGPS to incorporate some missing support that I published here https://github.com/florind/TinyGPS. This is a drop-in replacement for TinyGPS and backwards compatible with any NMEA compatible GPS receiver integrated with your Arduino.

The full complement of details is in the GitHub documentation. Notably, this update adds GLONASS support and exposes some interesting data when not tracking:
– Date and time
– Satellites in view

With this data we can build GPS user interfaces containing more advanced data. Here’s one I’ve built on a small Sharp ePaper LCD, model LS013B4DN04:

[Images: gps-notrack (left), gps-track (right)]

The left screen capture shows the GPS device searching for a fix. There is a valid date and time shown on the top as well as the satellites in view: 4 from the GPS constellation and 5 from the GLONASS constellation (having PRNs 65 and above). Each satellite specifies its signal/noise ratio in dB (abbreviated to ‘d’ here).
NMEA IDs in depth here: https://github.com/mvglasow/satstat/wiki/NMEA-IDs.

The screen capture on the right shows the GPS device tracking. Satellites in view are still shown although only three are participating in the solution. Also shown is the GNS mode indicator, the “AN” string, indicating that only the GPS constellation is used in the solution at this moment.

If you wonder what the e210 is, it’s the horizontal dilution of precision, HDOP. Divide by 100 to get to the ranges specified on Wikipedia.
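
To make the two conventions above concrete (satellite IDs 65 and above belonging to GLONASS, and HDOP reported in hundredths), here is a small sketch. It is written in Java for illustration rather than as Arduino code, it is not part of TinyGPS, and the names are made up. The ranges 1 to 32 for GPS and 65 to 96 for GLONASS follow the usual NMEA numbering as I understand it; everything else is lumped under "OTHER".

```java
// Illustrative helper only; not part of TinyGPS or the fork.
final class NmeaSketch {

    /** Classifies an NMEA satellite ID by constellation (assumed ID ranges). */
    static String constellation(int satId) {
        if (satId >= 1 && satId <= 32) {
            return "GPS";
        }
        if (satId >= 65 && satId <= 96) {
            return "GLONASS";
        }
        return "OTHER"; // e.g. SBAS or IDs this sketch does not know about
    }

    /** Converts an HDOP value given in hundredths (such as 210) to its real value. */
    static double hdop(long hundredths) {
        return hundredths / 100.0;
    }

    public static void main(String[] args) {
        System.out.println(constellation(12) + " " + constellation(70));
        System.out.println(hdop(210));
    }
}
```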

A5100 and other modern GPS parts promise out of the box support for Galileo, which will become operational in 2016, and BEIDOU, already operational in Asia.
As this update only adds GLONASS support, feel free to add support for others if they’re available in your area (BEIDOU for now).

Posted in Arduino, hardware, IoT, software

Bootstrap your node.js project in the cloud

So you have a great website idea and you want to build and bring that first version online as fast as you can. You figured that node.js is the way to go. You kick-off the development and after a couple of hours of hacking you realize that although you’re progressing at breakneck speed you’re missing a few important bits:

– How do I better structure my project?
– I want to test this thing. I want unit tests, UI (headless browser) tests and public API tests (I want to offer that API too, of course)
– I want proper CSS and HTML templating
– Looks like I need non-trivial request routing, more than what is provided by default

Oh, and after you have all of this, you want to be able to deploy it to a node-ready cloud environment like Heroku without hassle.

Enter bootstrap.js.

Read more ›

Posted in software

A Comparison of Places APIs

Location Based Services are all the rage these days. The space is still being defined and the players are trying to differentiate their service offerings in order to attract a critical mass of developers. In this post I’ll draw a side-by-side comparison of the main features provided by the major Places API providers today. While I have no hard numbers to back up the “major provider” claim, I’ll simply go for the web companies I would look to when building an application around Location services.

Here are my candidates ordered by their first API release date:

Provider   | Name                | API Link                      | First Released
Yahoo      | Yahoo GeoPlanet API | Yahoo! GeoPlanet™             | May 2009
Foursquare | Foursquare API      | Foursquare APIv2              | Nov. 2009
Twitter    | Twitter Places API  | Geo methods in Twitter API    | Jun. 2010
Facebook   | Facebook Places API | Scattered under the Graph API | Aug. 2010
Google     | Google Places API   | Google Places API             | Nov. 2010

The features of all these APIs are designed primarily to support (and promote) the business use cases of each respective competitor. One notable exception is Yahoo’s GeoPlanet API which advertises itself as being a general purpose API for referencing places.

I won’t try to identify any “best” API in the end. This post is meant to allow the reader to make an informed decision on which API(s) to use.

Read more ›

Posted in software

Towards An Open Database of Places: Location Autodiscovery

A short while back I read a challenging article titled It’s Time For An Open Database Of Places. There, Erick Schonfeld notes:

A long list of companies including Twitter, Google, Foursquare, Gowalla, SimpleGeo, Loopt, and Citysearch are far along in creating separate databases of places mapped to their geo-coordinates. These efforts at creating an underlying database of places are duplicative, and any competitive advantage any single company gets from being more comprehensive than the rest will be short-lived at best. It is time for an open database of places which all companies and developers can both contribute to and borrow from.

I agree that there is duplication of effort but this is what happens with many competitive technologies (look at how many online maps are available today). Each company tries to add a competitive advantage to its offering while providing the same core functionality as the competition.
Update: I started this post back in April and a lot of recent developments only reinforce this point. (Check Facebook Places and Google Places for more info).

I like the idea of an open database of places. Any company could build value-added services on top of it and sell them without being concerned about the issues that come with building and maintaining such a database, like geo-location/address accuracy and duplicate place resolution, to name just a few. TechCrunch’s Schonfeld adds another issue: who can update a place and who should be in control of it, suggesting that anybody can update the database and “the best data should prevail”. This is hard and suggests a wiki-like approach, for better or worse.
I’m not a fan of centralizing such a database. With such great market forces at play, it may become a playground for fights (my data is better than yours); a committee will attempt to regulate it, only to push it into oblivion, while everybody takes their toys and goes off to build their own database.

I have a different idea (and it’s not new either).

Businesses have a great deal of interest in such a database. It puts them on the map. They don’t particularly care who is using their place as long as the data about their business is correct and their customers easily reach their venue. Using mobile routing software to get to a place in the real world is the equivalent of not waiting more than four seconds for a webpage to load. It just has to route the customer precisely to a location.

Why not let businesses own their own geo data? All it takes is for them to have a website and add a bit of information to it to allow for auto-discovery; it’s called geotagging. It’s the same idea that Matt Griffith had back in 2002 for RSS feed autodiscovery, applied to geo. The real win is for small businesses that adopt geotagging. All they need to do is add a small bit of metadata on their homepage and let web indexers do the job of collecting this data. Oh, and it’s free.
This brings a double win: companies in the mapping business get access to accurate geo information about businesses. The businesses themselves are happy that their customers can precisely find their physical location by means of address and/or geo-coordinates. Moreover, the accuracy of the data is maintained by the businesses since they want their customers to find them even when they move. A Places database that aggregates this type of data can mark these places as “verified” since they come directly from merchants. It even provides more accurate means of building forward and reverse geocoding tools.
Going forward with this model, the competition will shift their efforts from building a database of places to adding value to a (more or less) common Places database, like local promotions, and to building great mapping products that allow us, the customers, to find them.
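
To give an idea of how little an indexer would need, here is a minimal sketch that pulls coordinates out of a page. It assumes the business uses the geo.position meta tag convention (content holding latitude;longitude) with the name attribute written before content; other geotagging schemes exist and a real crawler would use a proper HTML parser. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical indexer helper; assumes <meta name="geo.position" content="lat;lon">.
final class GeoTagSketch {

    private static final Pattern GEO_POSITION = Pattern.compile(
            "<meta\\s+name=\"geo\\.position\"\\s+content=\"\\s*([-+]?\\d+(?:\\.\\d+)?)\\s*;"
                    + "\\s*([-+]?\\d+(?:\\.\\d+)?)\\s*\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns {latitude, longitude}, or null when the page carries no usable tag. */
    static double[] findPosition(String html) {
        Matcher m = GEO_POSITION.matcher(html);
        if (!m.find()) {
            return null;
        }
        return new double[] { Double.parseDouble(m.group(1)), Double.parseDouble(m.group(2)) };
    }

    public static void main(String[] args) {
        String page = "<html><head><meta name=\"geo.position\" content=\"48.2082;16.3738\"></head></html>";
        double[] position = findPosition(page);
        System.out.println(position[0] + ", " + position[1]);
    }
}
```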

The hard part is promoting this model. If, say, half of the brick and mortar small businesses with a web presence embed geo metadata on their website, then the big players will take notice. How to get there is the real challenge.

Image via Flickr/bryankennedy

Posted in software

RESTful error handling with Tomcat and SpringMVC 3.x

Handling errors in a REST way is seemingly simple enough: when an error occurs while requesting a resource, the service should return a proper status code and a body containing a parseable message, using the content type of the request.
The default error pages in Tomcat are ugly. Not only do they expose too much of the server internals, they are also HTML-only, which makes them a poor choice if a RESTful web service is deployed in that Tomcat container. Substituting them with simple static pages is still not enough since I want a dynamic response containing error information.

Here’s how to do it in 3 simple steps:

Read more ›

Posted in java, software, spring

Building a content aggregation service with node.js

Fetching, aggregating and transforming data for delivery is a seemingly complex task. Imagine a service that serves aggregated search results from Twitter, Google and Bing where the response has to be tailored for mobile and web. One has to fetch data from different sources, parse and compose the results then transform them into the right markup for delivery to a specific client platform.
To cook this I’ll need:
– a web server
– a nice way to aggregate web service responses (pipelining would be nice)
– a component to transform the raw aggregated representation into a tailored client response.

I could take a stab at it and use Apache/Tomcat, Java (using Apache HttpClient 4.0), a servlet dispatcher (Spring WebMVC) and Velocity templating but it sounds too complex.

Enter Node.js. It’s an event-based web server built on Google’s V8 engine. It’s fast and it’s scalable and you develop on it using the familiar JavaScript.
While Node.js is still new, the community has built a rich ecosystem of extensions (modules) that greatly ease the pain of using it. If you’re unfamiliar with the technology, check out the Hello World example; it should get you started.
Back to the task at hand, here are the modules I’ll need:
Restler to get me data.
async to allow parallelizing requests for effective data fetching.
Haml-js for view generation

Read more ›

Posted in software

Using Spring 3.0 MVC for RESTful web services (rebuttal)

Update Mar.04 Thanks to @ewolff some of the points described below are now official feature requests. One (SPR-6928) is actually scheduled in Spring 3.1 (cool!). I’ve updated the post and added all open tickets. Please vote!

This post is somewhat a response to InfoQ’s Comparison of Spring MVC and JAX-RS.
Recently I have completed a migration from a JAX-RS implementation of a web service to Spring 3.0 MVC annotation-based @Controllers. The aforementioned post on InfoQ was published a few days after my migration so I’m dumping below the list of problems I had, along with solutions.

Full list of issues:

Same relative paths in multiple @Controllers not supported
Consider two Controllers where I use a versioned URL and a web.xml file that uses two URL mappings:

public class AdminController {
   public SomeResponse showUserDetails(String userId) {
      // ...
   }
}

public class UserController {
   public SomeOtherResponse showUserStream(String userId) {
      // ...
   }
}
In web.xml:

Read more ›

Posted in java, software, spring

Unit testing with Commons HttpClient library

I want to write testable code and occasionally I bump into frameworks that make it challenging to unit test. Ideally I want to inject a service stub into my code then control the stub’s behavior based on my testing needs.
Commons Http Client from Jakarta facilitates integration with HTTP services but how to easily unit test code that depends on the HttpClient library? Turns out it’s not that hard.
I’ll cover both 1.3 and the newer 1.4 versions of the library since the older v1.3 is still widely used.
Here’s some typical service (HttpClient v1.3) we want to test. It returns the remote HTML page title:

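import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;

public class RemoteHttpService {
   private HttpClient client;

   //Returns the content of the first <title> element, or null if there is none
   public String getPageTitle(String uri)  throws IOException {
     String contentHtml = fetchContent(uri);
     Pattern p = Pattern.compile("<title>(.*)</title>");
     Matcher m = p.matcher(contentHtml);
     if(m.find()) {
        return m.group(1);
     }
     return null;
   }

   //Fetches the page body; any status other than 200 is treated as a failure
   private String fetchContent(String uri)  throws IOException {
      HttpMethod method = new GetMethod("http://blog.newsplore.com/" + uri);
      int responseStatus = client.executeMethod(method);
      if(responseStatus != 200) {
        throw new IllegalStateException("Expected HTTP response status 200 " +
"but instead got [" + responseStatus + "]");
      }
      byte[] responseBody = method.getResponseBody();
      return new String(responseBody, "UTF-8");
   }

   public void setHttpClient(HttpClient client) {
      this.client = client;
   }
}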

with the HttpClient injected at runtime (via some IoC container or explicitly).
To be able to unit-test this code we have to come up with a stubbed version of the HttpClient and emulate the GET method.

Read more ›

Posted in java, software