Unit testing with Commons HttpClient library

I want to write testable code and occasionally I bump into frameworks that make it challenging to unit test. Ideally I want to inject a service stub into my code then control the stub’s behavior based on my testing needs.
Commons HttpClient from Jakarta facilitates integration with HTTP services, but how do you easily unit test code that depends on it? It turns out it’s not that hard.
I’ll cover both the 3.x and the newer 4.x versions of the library since the older 3.x line is still widely used.
Here’s a typical service (HttpClient 3.x) we want to test. It returns the title of a remote HTML page:

import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;

public class RemoteHttpService {
   private HttpClient client;

   public String getPageTitle(String uri) throws IOException {
      String contentHtml = fetchContent(uri);
      Pattern p = Pattern.compile("<title>(.*)</title>");
      Matcher m = p.matcher(contentHtml);
      if (m.find()) {
         return m.group(1);
      }
      return null;
   }

   private String fetchContent(String uri) throws IOException {
      HttpMethod method = new GetMethod("http://blog.newsplore.com/" + uri);
      try {
         int responseStatus = client.executeMethod(method);
         if (responseStatus != 200) {
            throw new IllegalStateException("Expected HTTP response status 200 " +
                  "but instead got [" + responseStatus + "]");
         }
         byte[] responseBody = method.getResponseBody();
         return new String(responseBody, "UTF-8");
      } finally {
         method.releaseConnection();  // always return the connection to the pool
      }
   }

   public void setHttpClient(HttpClient client) {
      this.client = client;
   }
}

The HttpClient instance is injected at runtime (via an IoC container or explicitly).
To be able to unit-test this code we have to come up with a stubbed version of the HttpClient and emulate the GET method.
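The stubbing idea can be sketched with simplified stand-ins. Note that `HttpClientLike` and `StubHttpClient` below are hypothetical names for illustration only; with the real library you would subclass `org.apache.commons.httpclient.HttpClient` and override `executeMethod` in the same spirit:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StubHttpClientSketch {

    // Stand-in for the slice of the HttpClient API the service depends on.
    interface HttpClientLike {
        int executeMethod(String uri);
        byte[] getResponseBody();
    }

    // Canned-response stub: fixed status and body, no network access.
    static class StubHttpClient implements HttpClientLike {
        private final int status;
        private final byte[] body;

        StubHttpClient(int status, String body) {
            this.status = status;
            this.body = body.getBytes();
        }

        public int executeMethod(String uri) { return status; }
        public byte[] getResponseBody() { return body; }
    }

    // Mirrors RemoteHttpService's title extraction, run against the stub.
    static String getPageTitle(HttpClientLike client, String uri) {
        int responseStatus = client.executeMethod(uri);
        if (responseStatus != 200) {
            throw new IllegalStateException(
                "Expected HTTP response status 200 but instead got [" + responseStatus + "]");
        }
        String html = new String(client.getResponseBody());
        Matcher m = Pattern.compile("<title>(.*)</title>").matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        HttpClientLike stub =
            new StubHttpClient(200, "<html><head><title>Hello</title></head></html>");
        System.out.println(getPageTitle(stub, "/any-uri"));  // prints Hello
    }
}
```

With the stub in place, a test can drive the service through both the happy path (status 200, canned body) and the failure path (non-200 status) without any network access.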


Spincloud, now with worldwide forecast

In my constant search for free weather data for Spincloud, a short while ago I found a gem: free forecast data offered by the progressive Norwegian Meteorological Institute. The long-range forecast coverage is fairly thorough, covering more than 2700 locations worldwide. I am happy to announce that I have extended spincloud.com to include it.

[Image: worldwide forecast map]

The data is refreshed every hour and the forecast covers the next seven days. You can bookmark any location or subscribe to weather reports via RSS.
On a different but related note, I can only be thankful for the free data offered by various meteorological organizations; it is what allows Spincloud to exist. I believe in freeing public data (weather-related and otherwise), as it belongs to the public that finances government and inter-governmental agencies in the first place. The Norwegian Met Institute sets a great example by freeing its data, and I salute them for making the right decision.
Now if only all such progressive meteorological institutes around the world would agree on a common format for disseminating their data, developers like me would be a tad happier…

Continuous everything?

I admit that I regard automation as a dull but vital ingredient in the success of a project. Automation has evolved into Continuous Integration, a powerful toolset allowing frequent, regular building and testing of the code. I won’t get into what CI is (check the internets). Instead, I am going to explore a couple of aspects of CI that can be added to the artifacts of the development process, and note some others that cannot.

Continuous performance
You wrote performance tests. You can run them by firing a battery of tests from a client machine at an arbitrary environment where the application lives. The test results are collected on the client and you can publish them on a web server. Why not automate this completely, then? To accomplish this, automate the execution, gathering and publishing of the performance results. Daily performance indicators not only increase visibility into the progress of the application; it is also much easier to fix a performance degradation in a daily changeset than between two releases. There are a couple of factors that may add complexity to establishing performance tests:
Dealing with dependencies
The obvious rule of thumb is to minimize dependencies. If there still are dependencies on other (perhaps external) systems, use mocks to isolate the system you’re testing for performance. We’re talking about nightly performance tests, so don’t put unnecessary stress where you shouldn’t.
Finally, the main artifact of the final integration (done once per iteration) is a running environment where all components run together, in a production-like setup. Use this environment to run your system performance test where you measure the current performance against the baseline.
Measuring relative performance
The environment you’re using for the nightly PT cycle will most likely not be a perfect mirror of production (especially true when dealing with geographically distributed systems). Use common sense to establish the ratio between the two environments, then derive rough production performance numbers from it (assuming a linear CPU/throughput relationship).

Continuous deployment
This is as simple as it sounds: automate the install. Make it dead easy to deploy the application in any environment by providing installation scripts. Simplify the configuration down to a single file that is self-documented and easily understood by non-programmers (read: operations teams). The goal here is to unlock a powerful tool: making the application installable and upgradable with the click of a button. If all the other pieces of the continuum are in place, you could confidently deploy your application to production on a much tighter release cycle, even daily. Deployment and integration become background tasks rather than first-class events.

More continuous
I am just fresh off the Agile Testing Days conference, where I learned a few more Cs from the distinguished speakers. I term these the Soft Cs, since they involve constant human engagement:
– Continuous learning (Declan Whelan)
– Continuous process improvement (Stuart Reid)
– Continuous acceptance testing, i.e. stakeholder signoff at the end of every sprint (Anko Tijman)
– Continuous customer involvement (Anko Tijman)

blog.newsplore.com is one year old!

Sep. 15 came and went and I didn’t realize that this blog turned one year old. I haven’t posted in the last few months, but that’s because I have been busy moving across the world from Toronto back to Europe (I’m in Berlin now), becoming a father (“the best job in the world”) and taking a new job (LBS, yes!).
Things are happening, wheels are in motion and there’s still a lot to write about.

Stay tuned!

New Spincloud feature: heat map overlay

[Image: heat map overlay]

It took a while since the previous feature update to Spincloud. I have done a number of upgrades to the underlying tech and some intensive code refactoring, but nothing visible. The time has come for more eye candy: heat maps. This is a map overlay that shows a color-translated temperature layer based on interpolated values of the current weather conditions. It gives a quick indication of the average temperature across all land masses where data is available.
Naturally, where data is more dense, the visual representation is better. The toughest job was figuring out the interpolation math. I still have to refine the algorithm, which is currently based on inverse distance weighting, a method for interpolating scattered points on a surface. Apparently a better one is kriging, but I have yet to implement it.
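For the curious, the core of inverse distance weighting is quite compact. Here is a minimal sketch; the station coordinates and temperatures are made-up sample data, and the real overlay of course works on projected map coordinates rather than a flat plane:

```java
public class IdwSketch {

    // Estimate the value at (x, y) as the distance-weighted average of the
    // known samples; each sample is {x, y, value}. Closer samples weigh more,
    // and the "power" parameter controls how fast influence falls off.
    static double idw(double x, double y, double[][] samples, double power) {
        double weightedSum = 0, weightTotal = 0;
        for (double[] s : samples) {
            double dx = x - s[0], dy = y - s[1];
            double dist = Math.sqrt(dx * dx + dy * dy);
            if (dist == 0) {
                return s[2];  // exactly on a sample point: use its value
            }
            double w = 1.0 / Math.pow(dist, power);
            weightedSum += w * s[2];
            weightTotal += w;
        }
        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        // two stations reporting 10 and 20 degrees
        double[][] temps = { { 0, 0, 10.0 }, { 10, 0, 20.0 } };
        // halfway between them the weights are equal
        System.out.println(idw(5, 0, temps, 2.0));  // prints 15.0
    }
}
```

The heat map then samples this function on a grid over the visible map area and maps each interpolated temperature to a color.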
The temperature map is generated from current temperatures and is updated once every hour. You can toggle the overlay by clicking the “Temp” button found on the top of the active map area.

Reviewing Google AppEngine for Java (Part 2)

In the first part I left off with some good news: a successful deployment in the local GAE container. In this second part I’ll cover loading data into the local datastore and browsing it.


Loading data and browsing
After finishing off the first successful deployment, next on the agenda was testing the persistence tier, but for this I needed some data to work with in the GAE datastore.
Along with the local container, Google provides a local datastore, but so far the only way to populate it is described here and it uses Python. Likewise, there is currently one documented way to browse the local datastore, using the local development console, and this again requires the Python environment. There’s a hint from Google, though, that there will be a data viewer for the Java SDK. In the meantime I voted up the feature request.

Back to bulk loading: after voting on the feature request for a bulk uploader, I decided not to use the Python solution but to handcraft a loader that fills in two tables, COUNTRY and COUNTRY_STATE (for federated countries like the US and Canada). Since there’s no way to schedule worker threads (but it’s on its way), the only mechanism to trigger the data loading is via URLs. I’m using Spring 3.0, so it wasn’t hard to create a new @Controller, write some methods to handle data loading, then map them to URLs:

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.servlet.ModelAndView;

@Controller
public class BulkLoadController {

   @RequestMapping("/importCountries")
   public ModelAndView importCountries() {
      ...
   }

   @RequestMapping("/importProvinces")
   public ModelAndView importProvinces() {
      ...
   }
}


Upgrading to Spring 3.0.0.M3 and Spring Security 3.0.0.M1

A short two months back I posted an article describing how to upgrade to Spring 3.0 M2. The Spring folks are releasing at breakneck speed, so I got busy again, upgrading spincloud.com to Spring 3.0 M3, released at the beginning of May. Just yesterday (June 3rd) the team released Spring Security 3.0 M1, and I decided to roll that into Spincloud as well.

Upgrading Spring Security from 2.0.4 to 3.0.0 M1
For Spring Security 3.0.0 M1 I’m doing a “soft” upgrade, since I had done my homework when I migrated from Acegi. I won’t use any of the new 3.0 features yet; I’m just getting ready to use them.

To digress a bit, Spring Security is a technology that is harder to swallow due to its breadth. To simplify the picture, Spring Security (formerly Acegi) addresses three major security concerns of enterprise applications:
– Authorization of method execution (either through standard JSR-250 JEE security annotations or via specific annotation-based method security).
– HTTP Request authorization (mapping URLs to accessible roles using Ant style or regexp’ed paths, dubbed channel security).
– Website authentication (integrating Single Sign On (SSO) services by supporting major SSO providers, HTTP BASIC authentication, OpenId authentication and other types).
For Spincloud I’m using OpenId for authentication and Channel Security to secure website pages.
Continue reading “Upgrading to Spring 3.0.0.M3 and Spring Security 3.0.0.M1”

Proposal to standardize the Level 2 query cache configuration in JPA 2.0

The Level 2 cache is one of the most powerful features of the JPA spec. It’s a transparent layer that manages out-of-session data access and cranks up the performance of the data access tier. To my knowledge it first appeared in Hibernate and was later adopted by the then-emerging JPA spec (driven mostly by the Hibernate folks back in the day).
As annotations gained strength and adoption, L2 caching, initially configured through XML or properties files, was brought closer to the source code, alas in different forms and shapes. This becomes apparent if you ever have to deploy your webapp across a multitude of containers: you have to painfully recode the cache configuration (or worse, the hand-coded cache access). Why not standardize cache control in JPA? It seems simple enough to achieve, and yet it isn’t there. Now that JPA 2.0 is standardizing Level 2 cache access (see JSR 317, section 6.10), it is the natural thing to do.
Every JPA provider has its own way of specifying cache access (both Entity and query cache).
To grasp the extent of the various ways cache is configured, here are some examples:

Hibernate:
Cache control for entities

@Entity
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Employee {...

Cache control for named queries:

@javax.persistence.NamedQuery(name="findEmployeesInDept",
    query="select emp from Employee emp where emp.department = ?1",
    hints={@QueryHint(name="org.hibernate.cacheable", value="true")})

or

@org.hibernate.annotations.NamedQuery(cacheable=true, cacheRegion="employeeRegion")

OpenJPA
Cache control for entities

@Entity
@org.apache.openjpa.persistence.DataCache(timeout=10000)
public class Employee {...

Query cache requires hand coded cache access:

OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
QueryResultCache qcache = oemf.getQueryResultCache();
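To make the gap concrete, here is a sketch of what a single provider-neutral annotation could look like. The `@CacheControl` annotation below is entirely hypothetical (it exists in no spec or provider); the point is simply that one portable annotation could replace all of the vendor variants above:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class CacheControlSketch {

    // Hypothetical provider-neutral cache annotation -- not part of JPA.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface CacheControl {
        boolean enabled() default true;
        String region() default "";
        long timeoutMillis() default -1;  // -1 = provider default
    }

    // Usage would mirror the vendor-specific examples above.
    @CacheControl(region = "employeeRegion", timeoutMillis = 10000)
    static class Employee { }

    public static void main(String[] args) {
        // A provider could read the settings reflectively at startup.
        CacheControl cc = Employee.class.getAnnotation(CacheControl.class);
        System.out.println(cc.region() + ", " + cc.timeoutMillis());  // employeeRegion, 10000
    }
}
```

Each provider would still map these settings onto its own cache implementation, but application code and entity mappings would stay portable.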


Reviewing Google AppEngine for Java (Part 1)

[Image: Google App Engine logo]

When Google announced that Java would be the second language supported by the AppEngine, I almost didn’t believe it, given the surge of new languages and the perception that Java has entered legacy territory. But the JVM is a powerful, tried-and-true environment, and Google saw the potential of using it for what it is bound to become: a runtime environment for the new and exciting languages (see JRuby and Groovy). The JVM is the new gateway drug in the world of languages.

Note: I’ll break down this review into two posts as it’s too extensive to cover everything at once. This first part is about initial setup, JPA and some local webapp deployment issues. In the second part I’ll describe how to load data into the datastore, data indexing, how I reached some of the limitations of the AppEngine and how to get around some of them.

I managed to snatch an evaluation account, and for the last few days I have been playing with it (as of today, Google has opened registration to all). My goal was (again) simple: to port Spincloud to the AppEngine. The prize: free hosting, which amounts to $20/month in my case (see my post about Java hosting providers), some cool monitoring tools that I can access on the web and, of course, the transparent super-scalability that Google is touting, in case I get techcrunched, slashdotted or whatever (slim chances, but still…).
I am a couple of weeks into the migration effort, and I can already conclude that the current state of the AppEngine is not quite ready for Spincloud, as I’ll detail below, but it comes quite close. The glaring omission was scheduling, but Google announced today that the AppEngine will support it (awesome!) and I plan to use it from day one.

My stated goal is running Spincloud on AppEngine. Here are the technologies I use in Spincloud:
– Spring 3.0 (details here)
– SpringAOP
– SpringSecurity with OpenId
– Webapp management with JMX (I know, it won’t fly)
– Job scheduling using Spring’s Timer Task abstraction (not optimistic about this one either)
– SiteMesh (upgraded to 2.4.2 as indicated)
– Spatial database support including spatial indexing
– JPA 2.0 (back story here)
– Level 2 JPA cache, query cache.
– Native SQL query support including native SQL spatial queries.
– Image processing (Spincloud scrapes images and processes pixel-level information to extract some weather data)

Selecting location data from a spatial database

I have been meaning to write about this subject since project Spincloud was still under development; I even considered making it the first post on my blog.
The idea is simple: you have location-based data (POIs, for instance) stored in some database (preferably a spatial DB) and you want to issue a select statement bounded to the area containing the points you’re interested in. In the case of Spincloud’s weather map, we want the weather reported by the stations located within the area determined by the Google Maps viewport the user is currently browsing.
In all my examples I’ll use SQL spatial extensions, specifically the MySQL spatial extensions.
Here’s a visual representation of the spatial select (the red grid is the area where we want to fetch the data):

[Image: spatial select sample]

This is quite easy to accomplish by issuing a spatial select statement on the database:

select * from POI where Contains(GeomFromText(
    'POLYGON((-30 32, 30 -8, -89 -8, -89 32, -30 32))'), LOCATION)

But what about selecting an area that crosses the 180-degree meridian? Let’s say we want to select data in an area around New Zealand that starts at 170 degrees longitude and ends at -160 degrees longitude, going east. The selected area will look like this:
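One common workaround, sketched here in Java rather than SQL (and not necessarily how the full post resolves it): when the bounding box crosses the antimeridian, split it into two longitude ranges, issue one spatial select per range, and union the results.

```java
public class AntimeridianSplit {

    // Returns one or two [west, east] longitude ranges. A box whose western
    // edge is numerically greater than its eastern edge crosses the
    // 180-degree meridian and must be split in two.
    static double[][] splitLonRange(double west, double east) {
        if (west <= east) {
            return new double[][] { { west, east } };
        }
        return new double[][] { { west, 180.0 }, { -180.0, east } };
    }

    public static void main(String[] args) {
        // the New Zealand example: from 170 degrees east, going east to -160
        for (double[] r : splitLonRange(170.0, -160.0)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
        // prints:
        // 170.0 .. 180.0
        // -180.0 .. -160.0
    }
}
```

Each resulting range becomes its own POLYGON in the WHERE clause, and the two result sets are merged with a UNION.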