Reviewing Google AppEngine for Java (Part 1)

When Google announced that Java would be the second language supported by the AppEngine, I almost didn’t believe it, given the surge of new languages and the perception that Java has entered legacy status. But the JVM is a powerful, tried-and-true environment, and Google saw the potential of using it for what it is bound to become: a runtime environment for the new and exciting languages (see JRuby and Grails). The JVM is the new gateway drug in the world of languages.

Note: I’ll break down this review into two posts as it’s too extensive to cover everything at once. This first part is about initial setup, JPA and some local webapp deployment issues. In the second part I’ll describe how to load data into the datastore, data indexing, how I reached some of the limitations of the AppEngine and how to get around some of them.

I managed to snatch an evaluation account and have been playing with it for the last few days (as of today, Google has opened registration to everyone). My goal was (again) simple: to port Spincloud to the AppEngine. The prize: free hosting, which is worth $20/month in my case (see my post about Java hosting providers), some cool monitoring tools that I can access on the web and, of course, the transparent super-scalability that Google is touting, in case I get techcrunched, slashdotted or whatever (slim chances but still…).
I am a couple of weeks into the migration effort and I can conclude that the current state of the AppEngine is not quite ready for Spincloud, as I’ll detail below, but it comes quite close. The glaring omission was scheduling, but Google announced today that the AppEngine will support it (awesome!) and I plan to use it from day one.

My stated goal is running Spincloud on AppEngine. Here are the technologies I use in Spincloud:
– Spring 3.0 (details here)
– SpringAOP
– SpringSecurity with OpenId
– Webapp management with JMX (I know, it won’t fly)
– Job scheduling using Spring’s Timer Task abstraction (not optimistic about this one either)
– SiteMesh (upgraded to 2.4.2 as indicated)
– Spatial database support including spatial indexing
– JPA 2.0 (back story here)
– Level 2 JPA cache, query cache.
– Native SQL query support including native SQL spatial queries.
– Image processing (Spincloud scrapes images and processes pixel-level information to get some weather data)

That’s a lot of stuff but first things first. I recommend creating a new repository branch (I use Subversion) dedicated to the new environment as there may be many things to change and you still want to continue the development unaffected on your mainline.

I downloaded the latest SDK and exploded it under the tools/ folder. Make sure you install the Eclipse plugin as it’s pretty useful.

After the initial setup I started out with the JPA part, since this seemed to be the most challenging piece of technology that I had to change. AppEngine uses the DataNucleus JPA provider, a poor choice in my opinion, since it is a JDO provider turned JPA provider (much like KODO/OpenJPA) and so it drags a lot of JDO baggage along. I hadn’t heard of it before and it doesn’t seem to be that popular (why not Hibernate or the new RI, EclipseLink? Update: because JPA is RDBMS-oriented while JDO is not, and the backing store of GAE is BigTable, which is not an RDBMS). DataNucleus JPA requires a pre-compilation step that performs bytecode enhancement, which I thoroughly dislike as it’s archaic (modern JPA providers no longer require it), but I had no choice. So I followed the reference documentation. The only thing not described there is the ant integration.
Here is what I added to my build.xml:

<project name="weather"  default="build"  basedir=".">
...
  <property name="sdk.dir" value="/Users/florin/tools/appengine-java-sdk-1.2.1"/>
  <import file="${sdk.dir}/config/user/ant-macros.xml" />

  <path id="enhancer.classpath">
    <pathelement path="${bin.dir}"/>
    <fileset dir="${sdk.dir}/lib">
      <include name="appengine-tools-api.jar"/>
    </fileset>
    <fileset dir="${lib.dir}">
      <include name="gisjts/jts*.jar"/>
      <include name="appengine/appengine-api-1.0-sdk-1.2.1.jar"/>
    </fileset>
  </path>

  <target name="enhance" depends="compile">
    <enhance verbose="true" classpathref="enhancer.classpath">
      <fileset dir="${bin.dir}">
        <include name="com/newsplore/weather/bo/*.class"/>
      </fileset>
    </enhance>
  </target>

  <target name="build-appengine" depends="compile, enhance, jar"/>
...

Essentially this is to accommodate the enhancement step (notice how the “enhance” target is interposed between compile and jar in the build-appengine target).
I ran the “build-appengine” target and noticed the first exception:

  [enhance] Please see the logs [/tmp/enhance64409.log] for further information.

That’s a convoluted way of reporting errors (why not show the exception in the console?). Here’s the actual error found in that file:

java.lang.RuntimeException: Unexpected exception
...
   Caused by: java.lang.reflect.InvocationTargetException
...
Caused by: java.lang.NoClassDefFoundError: Lorg/apache/log4j/Logger;
  at java.lang.Class.getDeclaredFields0(Native Method)

This was because one of my business objects used log4j. Adding the log4j jar file didn’t help and the solution was to remove the logging from the class altogether.

After cleanup I got this error:

  [enhance] SEVERE: Class "com.newsplore.weather.bo.GeometryUserType" was not found in the
CLASSPATH. Please check your specification and your CLASSPATH.
  [enhance] org.datanucleus.exceptions.ClassNotResolvedException: Class
"com.newsplore.weather.bo.GeometryUserType" was not found in the CLASSPATH.
    Please check your specification and your CLASSPATH.

That was because GeometryUserType (a class I used to handle Geometry types in Hibernate) was in the com.newsplore.weather.bo package and the enhancer went through all the classes in this package, assuming they were all @Entities, and failed on this one. Weird, I thought the enhancer was smart enough to filter out all non-@Entities… The solution is either to specify all the classes to be enhanced in the “enhance” ant target or to move the non-entities to other packages (a good practice). In my case I deleted the class since I wasn’t using it anyway (I used this utility class before Hibernate became a JPA provider).
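As a rough sketch of the first option, the fileset nested in the “enhance” target could name the entity classes one by one instead of using a package-wide wildcard. This assumes the same enhance macro, enhancer.classpath and ${bin.dir} as in the build.xml excerpt above; UsForecast is the only entity named in this post, so any other entity would need its own include line.

```xml
<target name="enhance" depends="compile">
  <enhance verbose="true" classpathref="enhancer.classpath">
    <fileset dir="${bin.dir}">
      <!-- List only real entities so helper classes in the same package are never loaded -->
      <include name="com/newsplore/weather/bo/UsForecast.class"/>
      <!-- add one include per remaining entity class here -->
    </fileset>
  </enhance>
</target>
```

An equivalent variation would be to keep the wildcard include and add an exclude element for each known non-entity, which is less typing when entities far outnumber helpers.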
After this cleanup, the enhance target spat yet another error:

[enhance]Class com.newsplore.weather.bo.UsForecast has property weatherIconByCode declared
    in MetaData, but its getter method doesnt exist in the class!
  [enhance] May 27, 2009 4:03:50 PM org.datanucleus.enhancer.DataNucleusEnhancer main
[enhance] SEVERE: DataNucleus Enhancer completed with an error. Please review the enhancer
log for full details. Some classes may have been enhanced but some caused errors
  [enhance] Class com.newsplore.weather.bo.UsForecast has property weatherIconByCode
declared in MetaData, but its getter method doesnt exist in the class!
  [enhance] org.datanucleus.metadata.InvalidMetaDataException: Class
com.newsplore.weather.bo.UsForecast has property weatherIconByCode declared in MetaData,
 but its getter method doesnt exist in the class!
  [enhance]     at org.datanucleus.metadata.ClassMetaData
.populateMemberMetaData(ClassMetaData.java:497)

So a domain object (UsForecast) supposedly has a property called weatherIconByCode. Hmm, I didn’t remember any such property, but looking at the source I saw that the offender was a private utility method called getWeatherIconByCode(String code), which the enhancer was processing as if it were a property (I’m using annotations bound to getters, not to fields). Since adding @Transient didn’t fix it, the solution is to rename the method so that it doesn’t start with get/set (or to move it into a utility class, which you shouldn’t do if you follow DDD).
To my relief, after this fix the build didn’t yield any more errors so I had a nice webapp folder ready to be deployed. AppEngine has a local sandbox (based on Jetty) that should be used before deploying into the cloud. You can start the appserver using an ant target but I chose to get more control and create a script that accomplishes the same thing. Here it is:

cd build/webapp
java -classpath <appengine-sdk-home>/lib/appengine-tools-api.jar \
  com.google.appengine.tools.development.DevAppServerMain --port=8080 --address=localhost

Save this in a file called startgae.sh (mind the path, you should be able to cd to the webapp folder from it) then make it executable.
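For comparison, the ant route mentioned earlier might look roughly like the target below. It is only a sketch: it assumes the dev_appserver macro that the SDK’s ant-macros.xml provides (that file is already imported in the build.xml above) and that the exploded webapp lives in build/webapp. The target name runserver is an illustrative choice, not something from my build.

```xml
<!-- Assumes ant-macros.xml from the SDK is imported, as in the build.xml above -->
<target name="runserver" depends="build-appengine"
        description="Starts the local AppEngine development server">
  <!-- war points at the exploded webapp folder produced by the build -->
  <dev_appserver war="build/webapp"/>
</target>
```

The rest of this post sticks with the shell script.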
Using this script I fired up the container and got no startup errors. This was weird, since I knew I was using JMX (via Spring and annotations), and the task scheduler (using Quartz) was seemingly up and running as well.
I hit the home page to see what had been deployed and got a 404 error. My web.xml has this mapping:

  <servlet>
    <servlet-name>weather</servlet-name>
    <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
  </servlet>

  <servlet-mapping>
    <servlet-name>weather</servlet-name>
    <url-pattern>/</url-pattern>
  </servlet-mapping>

but this doesn’t work in the local servlet container. This was quite annoying: not only does the home page not work, but the static folders (css/, js/, img/) are routed through the weather servlet too. Surprisingly, as it turned out, this mapping worked perfectly in the AppEngine environment. To be able to continue working in the local environment, I had to define fine-grained mappings for all the URLs.
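A minimal sketch of what such fine-grained mappings might look like, reusing the weather servlet declared above. Only /faq is a URL actually named in this post; the second pattern is a hypothetical placeholder for whatever other URLs the application serves.

```xml
<!-- One explicit mapping per application URL instead of the catch-all "/" -->
<servlet-mapping>
  <servlet-name>weather</servlet-name>
  <url-pattern>/faq</url-pattern>
</servlet-mapping>

<servlet-mapping>
  <servlet-name>weather</servlet-name>
  <!-- hypothetical path-prefix pattern; replace with a real application URL -->
  <url-pattern>/forecast/*</url-pattern>
</servlet-mapping>
```

With mappings like these, requests for css/, js/ and img/ no longer match the weather servlet, so they should fall through to the container’s default static file handling.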
After this change, the first URL that started working was the /faq which provided the first breakthrough. Although only the presentation tier is in use on the /faq page, the fact that the page worked provided a bunch of good news: Spring 3.0 DI container, SiteMesh, Spring Security and SpringAOP were all correctly running.

Stay tuned for the second part detailing more JPA, datastore indexing, a bunch of the AppEngine limitations, a bit of local environment hacking and how to use “request welding” to accomplish long running tasks.