New Spincloud feature: heat map overlay

It has been a while since the previous feature update to Spincloud. I have done a number of upgrades to the underlying tech and some intensive code refactoring, but nothing visible. The time has come for another piece of eye candy: heat maps. The heat map is a map overlay that shows a color-translated temperature layer based on interpolated values of the current weather conditions. It gives a quick indication of the average temperature across all land masses where data is available.
Naturally, in areas where data is denser, the visual representation is better. The toughest job was to figure out the interpolation math. I still have to refine the algorithm, which is currently based on inverse distance weighting, a method that interpolates scattered points on a surface. Apparently kriging is a better choice, but I have yet to implement it.
The temperature map is generated from current temperatures and is updated once every hour. You can toggle the overlay by clicking the “Temp” button found at the top of the active map area.
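For readers unfamiliar with inverse distance weighting, here is a minimal sketch of the idea. It is not Spincloud's actual code: the class `IdwSketch`, the `Sample` type, the flat x/y coordinates and the power parameter are all assumptions made for illustration, and a real map would measure geographic rather than planar distances.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative inverse distance weighting (IDW); all names here are hypothetical. */
final class IdwSketch {

    /** One temperature reading at planar coordinates (x, y). */
    static final class Sample {
        final double x;
        final double y;
        final double temperature;

        Sample(double x, double y, double temperature) {
            this.x = x;
            this.y = y;
            this.temperature = temperature;
        }
    }

    /**
     * Estimates the temperature at (x, y) as a weighted average of all samples,
     * where each weight is 1 / distance^power, so nearby samples dominate.
     */
    static double interpolate(List<Sample> samples, double x, double y, double power) {
        if (samples.isEmpty()) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (Sample s : samples) {
            double dx = x - s.x;
            double dy = y - s.y;
            double distance = Math.sqrt(dx * dx + dy * dy);
            if (distance == 0.0) {
                // The query point sits exactly on a sample: return its value.
                return s.temperature;
            }
            double weight = 1.0 / Math.pow(distance, power);
            weightedSum += weight * s.temperature;
            weightTotal += weight;
        }
        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        List<Sample> stations = Arrays.asList(
                new Sample(0, 0, 10.0),
                new Sample(10, 0, 20.0));
        // A point close to the first station is pulled towards 10 degrees.
        System.out.println(interpolate(stations, 1, 0, 2.0));
    }
}
```

A power of 2 is a common default; larger values make the surface follow the nearest station more closely, which is one reason sparse areas look coarse.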

A project triumvirate

There are three forces that shape a project: domain, process and technology. Add the “-driven” suffix to any of them and you’ll perhaps recognize some of the methods used in projects you’ve been involved in. Yet as soon as one takes too much of a lead over the other two, failure follows almost inevitably. At the intersection of these three forces we find familiar terms and concepts, but first a word or two about each of them:

Technology
Back in the days of the tech bubble, tech was almighty. Buzzwords like Java, EJBs and PHP defined entire projects. It was the time when software became accessible to a much larger audience. The new wave of enthusiastic geeks embraced everything from new languages to professional certifications to the then-nascent open source. I admit to having had my share of technology-driven projects back in the day…

Process
Process brings structure and pace to a project. The two complementary components of a project process are methodology and integration. We are all too familiar with methodology: waterfall, RUP or agile methods of software development are widely documented and practiced. Integration is largely defect management, testing, build, deployment and documentation. Today they all come together in what is called continuous integration, where all these concepts become interrelated in a repetitive process that produces accountability and visibility into the progress of a project. Process is also the one force that tends to disappear first after a project is finished.

Domain
The domain captures entities and business logic, is driven by the business requirements and is in no way influenced by the two other forces. The most successful way to employ the domain is through Domain Driven Design where the main focus of software development is neither technology nor process but the business requirements.

Continue reading “A project triumvirate”

Process Journalism really is Agile Journalism

There’s an interesting story rippling through the news stream these days: The New York Times is questioning the way Techcrunch reports the news. The crux of the issue is product journalism v. process journalism: producing news only after all facts have been verified versus writing the story based on what the author best knows about an event at that moment, then letting the story evolve and providing rigorous updates and corrections once it does.
As I was reading, I had an epiphany. The “old media” is following the old methods, much like the waterfall process where the story (the requirements) has to be fully known before proceeding with actually breaking the story (the implementation). The new media follows a radical approach: it starts with a lead (or rumor) that is initially published in a somewhat crude form, then the story evolves based on field feedback and facts drawn from reality, until it becomes accurate. Even the language of the aforementioned references uses words like “iterative process”, “transparent” and “standards” the same way I’d use them when talking about developing software.

This method of early release followed by incremental improvements so closely resembles the agile methods of software development that I’m calling it Agile Journalism.

A revolution is happening in the media. Old methods are swiftly being pushed aside; the emergence of the Internet as the news dissemination medium is fundamentally changing the way reality is reported. Who wants to read a story that is already obsolete or wrong altogether, and impossible to correct or evolve until a new paper edition is off the press? The new media not only breaks news faster (release early) but gets the stories right eventually, simply because the new rules (standards) clearly state that until confirmed, the product (otherwise useful) is still a beta representation of reality, and the consumers already know how to consume such a product.

From blogs to real time news streams, the new media’s aspiration is to bring the most accurate information yet to the consumers. It can and will do it by revolutionizing the way the stories are iteratively discovered based on constant feedback, the same way agile methods are revolutionizing the software industry.

Image courtesy hirefrank

Reviewing Google AppEngine for Java (Part 2)

In the first part I left off with some good news: successful deployment in the local GAE container. In this second part I’ll talk about the following:


Loading data and browsing
After finishing off the first successful deployment, next on the agenda was testing out the persistence tier, but for this I needed some data to work with in the GAE datastore.
Along with the local container, Google makes available a local datastore, but so far the only way to populate the datastore is described here and it uses Python. Furthermore, there is currently one documented way to browse the local datastore, using the local development console, but this again requires the Python environment. There’s a hint from Google, though, that there will be a data viewer for the Java SDK. In the meantime I voted up the feature request.

Back to bulk loading: after voting on the feature request to have a bulk uploader, I decided not to use the Python solution but to handcraft a loader that will fill in two tables: COUNTRY and COUNTRY_STATE (for federated countries like US and CAN). Since there’s no way to schedule worker threads (but it is on its way), the only mechanism to trigger the data loading is via URLs. I’m using Spring 3.0, so it wasn’t hard to create a new @Controller with some methods to handle data loading, then map them to URLs:

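import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.servlet.ModelAndView;

// Each handler is triggered by requesting its URL, since no scheduler is available.
@Controller
public class BulkLoadController {
    @RequestMapping("/importCountries")
    public ModelAndView importCountries() {
        ...
    }

    @RequestMapping("/importProvinces")
    public ModelAndView importProvinces() {
        ...
    }
}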

Continue reading “Reviewing Google AppEngine for Java (Part 2)”

Upgrading to Spring 3.0.0.M3 and Spring Security 3.0.0.M1

A short two months back I posted an article describing how to upgrade to Spring 3.0 M2. The Spring folks are releasing at breakneck speed, so I got busy again upgrading spincloud.com to Spring 3.0 M3, released at the beginning of May. Just yesterday (June 3rd) the team released Spring Security 3.0 M1 and I decided to roll this into Spincloud as well.

Upgrading Spring Security from 2.0.4 to 3.0.0 M1
For Spring Security 3.0.0 M1 I’m doing a “soft” upgrade since I had done my homework when I migrated from Acegi. I won’t use any of the new 3.0 features, just getting ready to use them.

To digress a bit, Spring Security is a technology that is harder to swallow due to its breadth. To simplify the picture, Spring Security (formerly Acegi) addresses three major security concerns of enterprise applications:
– Authorization of method execution (either through standard JSR-250 JEE security annotations or via specific annotation-based method security).
– HTTP Request authorization (mapping URLs to accessible roles using Ant style or regexp’ed paths, dubbed channel security).
– Website authentication (integrating Single Sign On (SSO) services by supporting major SSO providers, HTTP BASIC authentication, OpenId authentication and other types).
For Spincloud I’m using OpenId for authentication and Channel Security to secure website pages.
Continue reading “Upgrading to Spring 3.0.0.M3 and Spring Security 3.0.0.M1”

Proposal to standardize the Level 2 query cache configuration in JPA 2.0

Level 2 cache is one of the most powerful features of the JPA spec. It’s a transparent layer that manages out-of-session data access and cranks up the performance of the data access tier. To my knowledge it was first seen in Hibernate and was later adopted by the then-emerging JPA spec (driven mostly by the Hibernate guys back in the day).
As annotations gained strength and adoption, L2 caching, which was initially configured through XML or properties files, was brought closer to the source code, alas in different forms and shapes. This becomes apparent if you ever have to deploy your webapp across a multitude of containers, as you have to painfully recode the cache configuration (or worse, the hand-coded cache access). Why not standardize the cache control in JPA? This seems simple enough to achieve and yet it isn’t there. Now that JPA 2.0 is standardizing on Level 2 cache access (see JSR 317 section 6.10) it is the natural thing to do.
Every JPA provider has its own way of specifying cache access (both Entity and query cache).
To grasp the extent of the various ways cache is configured, here are some examples:

Hibernate:
Cache control for entities

@Entity
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Employee {...

Cache control for named queries:

@javax.persistence.NamedQuery(name="findEmployeesInDept",
query="select emp from Employee emp where emp.department = ?1",
hints={@QueryHint(name="org.hibernate.cacheable",value="true")})

or

@org.hibernate.annotations.NamedQuery(cacheable=true, cacheRegion="employeeRegion")

OpenJPA
Cache control for entities

@Entity
@org.apache.openjpa.persistence.DataCache(timeout=10000)
public class Employee {...

Query cache requires hand coded cache access:

OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
QueryResultCache qcache = oemf.getQueryResultCache();

Continue reading “Proposal to standardize the Level 2 query cache configuration in JPA 2.0”

Reviewing Google AppEngine for Java (Part 1)

When Google announced that Java would be the second language supported by the AppEngine, I almost didn’t believe it, given the surge of new languages and the perception that Java had entered legacy status. But the JVM is a powerful, tried-and-true environment, and Google saw the potential of using it for what it is bound to become: a runtime environment for the new and exciting languages (see JRuby and Grails). The JVM is the new gateway drug in the world of languages.

Note: I’ll break down this review into two posts as it’s too extensive to cover everything at once. This first part is about initial setup, JPA and some local webapp deployment issues. In the second part I’ll describe how to load data into the datastore, data indexing, how I reached some of the limitations of the AppEngine and how to get around some of them.

I managed to snatch an evaluation account and for the last few days I have been playing with it (as of today, Google has opened registration to all). My goal was (again) simple: to port Spincloud to the AppEngine. The prize: free hosting, which amounts to $20/month in my case (see my post about Java hosting providers), some cool monitoring tools that I can access on the web and of course the transparent super-scalability that Google is touting, in case I get techcrunched, slashdotted or whatever (slim chances but still…).
I am a couple of weeks into the migration effort and I can conclude that the current state of the AppEngine is not quite ready for Spincloud as I’ll detail below but it comes quite close. The glaring omission was scheduling but Google announced today that the AppEngine will support it (awesome!) and I plan to use it from day one.

My stated goal is running Spincloud on AppEngine. Here are the technologies I use in Spincloud:
– Spring 3.0 (details here)
– SpringAOP
– SpringSecurity with OpenId
– Webapp management with JMX (I know, it won’t fly)
– Job scheduling using Spring’s Timer Task abstraction (not optimistic about this one either)
– SiteMesh (upgraded to 2.4.2 as indicated)
– Spatial database support including spatial indexing
– JPA 2.0 (back story here)
– Level 2 JPA cache, query cache.
– Native SQL query support including native SQL spatial queries.
– Image processing (Spincloud scrapes images and processes pixel-level information to get some weather data)
Continue reading “Reviewing Google AppEngine for Java (Part 1)”

“Thinking in” what?

“Thinking in…” anything has been a marketing quirk for a while now, used and abused everywhere from language learning to computer science. Thinking in Java is the title of a well-known book written by Bruce Eckel. I pass a “Think in Spanish” course flyer every time I stroll along Bloor Street West here in Toronto.
Funny, thinking within the rigors of a discipline is the very definition of thinking inside the box. With respect to programming languages, it encourages one to think within the limits of a single language, which is the exact opposite of what one should do when developing software.
The hegemony of agile methodologies inflicted a disruptive change on the face of the software industry and amongst the established roles within a team. The legions of “coders” bred by the tech bubble of 2000 are facing extinction; the roles of the software designer and Technical Architect are fuzzier than ever. Thinking inside of the language box makes a better coder but not a better software engineer.

Modern software design methodologies shatter that box. Domain Driven Design disconnects the developer from the technology and places domain rules before the intricacies of a language. Test Driven Development brings the domain into the software realm by forcing you to write the tests first, then the actual code. Test-first forces you to think about behavior and APIs first, then perform the act of coding, which is nothing more than implementing that behavior.
Eric Evans’s DDD: Strategic Design is an eye opener. And while you’re at it, check his other presentation about domain modeling.
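To make the test-first flow concrete, here is a small sketch in plain Java. Everything in it is hypothetical and chosen only for illustration: the `TemperatureConverter` class, its method name and the hand-rolled check (a real project would typically use a test framework). The point is the order: the test fixes the API and the expected behavior before any implementation exists.

```java
/** Illustrative test-first sketch; all names are hypothetical. */
final class TestFirstSketch {

    // Step 1: written first. It decides what the API looks like
    // and what behavior is expected from it.
    static void freezingPointConvertsTo32Fahrenheit() {
        double fahrenheit = TemperatureConverter.celsiusToFahrenheit(0.0);
        if (fahrenheit != 32.0) {
            throw new AssertionError("expected 32.0 but got " + fahrenheit);
        }
    }

    // Step 2: written second. The implementation only has to satisfy
    // the behavior the test already described.
    static final class TemperatureConverter {
        static double celsiusToFahrenheit(double celsius) {
            return celsius * 9.0 / 5.0 + 32.0;
        }
    }

    public static void main(String[] args) {
        freezingPointConvertsTo32Fahrenheit();
        System.out.println("test passed");
    }
}
```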

So instead of “Thinking in Java”, think Domain and APIs first then act. In Java.

Mobile internet is here to stay

Since I came back from vacation (Easter with family back home in Romania) I have become quite interested in the mobile internet. This was the first time I wasn’t going to internet cafes or asking buddies to let me use their internet (and their PCs).
I have a beaten-up Nokia 3100 that works both in RO and here in Toronto. I was playing with it in my first vacation days back home, exploring the features of my Romanian mobile carrier (I love the *…# service commands, they remind me of dumb terminals) and I noticed that they had a new offering, mobile internet. Configuration was a breeze and I found myself browsing the internet a few minutes later. Email, news, even checking if my websites are still up and running; they were all there where I left them.
Needless to say that from that day I didn’t step into an internet cafe anymore to get my fix.

I also got a lovely gift from Corina around the same time, an iPod Touch. Since I’m a fan of Apple’s products I couldn’t be happier, and since I started using it I have realized that I use my regular laptop less and less. I’m using the iPod Touch more to check email, news, weather, live TV (gotta love France24) and play the occasional game than to listen to music. The experience is more condensed and focused, so I spend less time finding what I’m looking for. The mobile internet experience achieved the goal that its parent couldn’t: ease of use.

Funny that I had to go back home to find it out.
Continue reading “Mobile internet is here to stay”

Selecting location data from a spatial database

I had been thinking of writing about this subject a while back, when project Spincloud was still under development. I was even thinking about making this the first post on my blog.
The idea is simple: you have location-based data (POIs for instance) stored in some database (preferably a spatial DB) and you want a select statement that returns only the points inside a given area. In the case of Spincloud’s weather map, we want the weather reported by the stations located within the area determined by the Google Map viewport that the user is currently browsing.
In all my examples I’ll use SQL Spatial Extensions support, specifically MySQL spatial extensions.
Here’s a visual representation of the spatial select (the red grid is the area where we want to fetch the data):

[Image: select_smpl]

This is quite easy to accomplish by issuing a spatial select statement on the database:

select * from POI where Contains(GeomFromText
    ('POLYGON((-30 32, -30 -8, -89 -8, -89 32, -30 32))'), LOCATION)
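Since the polygon comes from the map viewport, the text passed to GeomFromText has to be built from the viewport bounds. The following is a minimal sketch of that step, not Spincloud's actual code: the class `ViewportPolygon`, the method `toWkt` and the west/south/east/north parameter order are assumptions for illustration. It only handles the simple case where the western edge lies to the left of the eastern edge.

```java
/** Illustrative helper that turns viewport bounds into a WKT polygon; names are hypothetical. */
final class ViewportPolygon {

    /**
     * Builds a closed rectangular polygon in WKT, with longitude as x and latitude as y.
     * The ring starts and ends at the north-east corner.
     */
    static String toWkt(double west, double south, double east, double north) {
        if (west > east || south > north) {
            throw new IllegalArgumentException("expected west <= east and south <= north");
        }
        return "POLYGON(("
                + point(east, north) + ", "
                + point(east, south) + ", "
                + point(west, south) + ", "
                + point(west, north) + ", "
                + point(east, north) + "))";
    }

    private static String point(double longitude, double latitude) {
        return longitude + " " + latitude;
    }

    public static void main(String[] args) {
        // Illustrative bounds only.
        System.out.println(toWkt(-89, -8, -30, 32));
        // prints POLYGON((-30.0 32.0, -30.0 -8.0, -89.0 -8.0, -89.0 32.0, -30.0 32.0))
    }
}
```

In a real application the resulting string would be bound as a query parameter rather than concatenated into the SQL text.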

But what about selecting an area that crosses the 180 degrees longitude? Let’s say we want to select data in an area around New Zealand that starts at 170 degrees longitude and ends at -160 degrees longitude going East. The selected area will look like this:
Continue reading “Selecting location data from a spatial database”
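For the case where the viewport crosses the 180 degrees longitude, one common approach is to split the area into two ranges, one on each side of the line, and query both. The sketch below shows only that splitting step. It is an assumption about how the problem can be handled, not necessarily the solution the full post arrives at, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative split of a longitude range that may cross the 180th meridian; names are hypothetical. */
final class AntimeridianSplit {

    /**
     * Returns one or two {west, east} longitude ranges that together cover
     * the viewport going east from "west" to "east".
     */
    static List<double[]> splitLongitudeRange(double west, double east) {
        List<double[]> ranges = new ArrayList<>();
        if (west <= east) {
            // Ordinary viewport: a single range is enough.
            ranges.add(new double[] {west, east});
        } else {
            // The viewport wraps around: cover up to 180, then continue from -180.
            ranges.add(new double[] {west, 180.0});
            ranges.add(new double[] {-180.0, east});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (double[] range : splitLongitudeRange(170, -160)) {
            System.out.println(range[0] + " to " + range[1]);
        }
    }
}
```

Each returned range would then become its own polygon, with the two spatial conditions combined using OR in the select statement.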