Friday, December 28, 2012

Use Quartz Manager to monitor Quartz jobs inside Mule Studio

Now that we can successfully schedule jobs inside Mule Studio using Quartz, the next logical question is how to monitor these jobs: did they run, and did they succeed or fail? Mule does not provide monitoring at such a fine-grained level. Luckily, Terracotta offers Quartz Manager as part of its enterprise offering.

One of the key challenges we need to solve is providing in-depth monitoring capability for our next generation platform. Currently we use cron to schedule some of our recurring jobs. While cron is easy to set up, it provides no visibility into whether and when a job was executed, or into the status of the job execution (success, failure, or still running).

In this blog, we will show how to configure Mule Studio and the Quartz connector so jobs can be monitored by Quartz Manager.

From Terracotta's web site,

"Quartz Manager provides real-time monitoring and management for Quartz Scheduler. Use its rich graphical user interface to:
  • Gain immediate visibility into job schedules, status and activity
  • Readily add or modify scheduling information
  • Manage multiple instances of Quartz Scheduler through a single interface
  • Simplify ongoing management of job scheduling and execution
Quartz Manager is an enterprise-grade addition to Quartz Scheduler that comes with a commercial license and support."
We really like the fact that Quartz monitoring is JMX based and that Quartz Manager works with an existing Quartz scheduler without any configuration changes other than enabling JMX support.

Here are the necessary steps to integrate a Quartz endpoint with Quartz Manager.

Download and Install Quartz Manager

Follow the link to download and install Quartz Manager.

Enable JMX for Quartz Connector inside Mule Studio

Next we must enable JMX both for the Quartz connector and for the JVM that runs the Mule flow.

Enable JMX for Quartz Connector

Enabling JMX for the Quartz connector is accomplished by setting the following "quartz:factory-property" entries,
 <quartz:factory-property key="org.quartz.scheduler.jmx.export"  
                value="true" />  
 <quartz:factory-property key="org.quartz.scheduler.jmx.objectName"  
                value="quartz:type=QuartzScheduler,name=JmxScheduler,instanceId=NONE_CLUSTERED" />  

Replace the name and instanceId values in org.quartz.scheduler.jmx.objectName with your own appropriate values.

Enable JMX Remoting when Running Mule Flow

Now we need to enable JMX remoting when running the Mule flow. Go to Eclipse -> Run -> Run Configurations. Find the run configuration for the existing flow under "Mule Application". Add the following entries to the VM arguments,
 -XX:PermSize=128M -XX:MaxPermSize=256M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.ssl=false  
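Before wiring up Quartz Manager, it can help to confirm that the JMX endpoint is actually reachable. Below is a minimal, hypothetical sketch using the standard javax.management API; the host, port, and object name pattern are assumptions that must match the VM arguments and the quartz:factory-property values above.

 import java.util.Set;
 import javax.management.MBeanServerConnection;
 import javax.management.ObjectName;
 import javax.management.remote.JMXConnector;
 import javax.management.remote.JMXConnectorFactory;
 import javax.management.remote.JMXServiceURL;

 public class QuartzJmxCheck
 {
   public static void main(final String[] args) throws Exception
   {
     // assumes the flow runs locally with -Dcom.sun.management.jmxremote.port=1099
     final JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
     final JMXConnector connector = JMXConnectorFactory.connect(url);
     try
     {
       final MBeanServerConnection connection = connector.getMBeanServerConnection();
       // the pattern matches the objectName configured on the Quartz connector above
       final Set<ObjectName> names = connection.queryNames(new ObjectName("quartz:type=QuartzScheduler,*"), null);
       for (final ObjectName name : names)
       {
         System.out.println("Found Quartz scheduler MBean: " + name);
       }
     }
     finally
     {
       connector.close();
     }
   }
 }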

Upgrade Quartz JAR File in Mule Studio

Next we need to upgrade the quartz-all.jar file shipped with Mule Enterprise Studio. Currently Mule ships quartz-all-1.6.6.jar, which is very old and won't work with Quartz Manager. We can't use the most recent release of Quartz either, as it breaks backward compatibility and Mule won't support it. Download quartz-all-1.8.6.jar from Terracotta's web site.

Go to the Mule Studio installation directory and replace quartz-all-1.6.6.jar with quartz-all-1.8.6.jar.

We also need to update the built-in Mule runtime library reference from quartz-all-1.6.6.jar to quartz-all-1.8.6.jar. The Mule runtime dependencies are defined in a MANIFEST.MF file under the Mule Studio installation directory. Locate this file and replace quartz-all-1.6.6.jar with quartz-all-1.8.6.jar.

Start Mule Flow and Quartz Manager

Now we can start the Mule flow and Quartz Manager.

When Quartz Manager first starts up, it asks for the JMX host and port number,

Enter the correct host name and port number.

After successfully connecting to the remote Quartz scheduler, Quartz Manager displays the following job details page,

Click on "Triggers" to show trigger details.

Click on "Job Monitor" to see the detailed status for each executed job.



Thursday, December 27, 2012

Schedule Jobs using Mule Studio and Quartz, part II

In the previous blog, we discussed how to use an inbound Quartz endpoint to trigger jobs. In this blog, we discuss how to use an outbound Quartz endpoint to schedule jobs when an external event happens.

Below is the sample Mule flow. When someone accesses the HTTP endpoint, a message is put on a VM endpoint, which triggers the Quartz scheduler to run a job that puts a message on a different VM endpoint, which in turn gets picked up by a Java transformer that eventually processes it.


Pay attention to the key difference between this flow and the inbound endpoint flow. In the inbound endpoint flow, there is no message flowing into the inbound endpoint. For the outbound endpoint, a message (VM or JMS) is necessary to trigger the job.

Double-click on the outbound endpoint and the following window shows up,


Fill in the necessary scheduling information and click on the plus sign to add a new job.



Make sure to select "quartz:schedule-dispatch-job".


Make sure "Group Name:", "Job Group Name:", and "Address:" are not filled in at all. If they are filled in, one will get some weird error that can't be made any sense of. I wish Mule Studio can be more helpful but it doesn't. I finally resolved this issue after spending hours searching and googling.

That's it. Now execute the flow and access the HTTP endpoint. You will see that our Java transformer gets executed twice after an initial delay of 15 seconds.

Schedule Jobs using Mule Studio and Quartz, Part I

There is a lot of confusion about how to use Quartz to schedule jobs inside Mule Studio. The documentation is slim to none.
In this series of blogs, we attempt to explain and show how to use Quartz to schedule both inbound and outbound jobs, and how to enable the Mule Quartz connector so we can use Quartz Manager to monitor jobs.

From Mule's reference documentation, "The Quartz transport provides support for scheduling events and for triggering new events. An inbound quartz endpoint can be used to trigger inbound events that can be repeated, such as every second. Outbound quartz endpoints can be used to schedule an existing event to fire at a later date. Users can create schedules using cron expressions, and events can be persisted in a database."

An inbound quartz endpoint is used when we need to trigger inbound events periodically, like generating a predefined message or loading a file, and then passing the generated message down the flow, either directly or through a VM or JMS endpoint.

An outbound quartz endpoint is used when we need to trigger an event based on incoming events. For example, in our world, we want to send a 25% off coupon every day at 9am to all users who clicked on the welcome link embedded in the previous day's email campaign.

The key difference between the inbound and outbound endpoints is that the inbound message is scheduled entirely by Quartz, while the outbound endpoint is scheduled by Quartz only after an incoming event has triggered it. We will use some examples to illustrate the key differences.

Quartz Connector

But before we dive into the nitty-gritty details, we need to create a Mule Quartz connector first. A Mule Quartz connector tells Mule how the underlying Quartz scheduler should be created.

Here is a sample scheduler using RAMJobStore. Enter the following block of XML in the Mule flow's configuration XML section.
   <quartz:connector name="quartzConnector_vm" validateConnections="true" doc:name="Quartz">  
     <quartz:factory-property key="org.quartz.scheduler.instanceName" value="MuleScheduler1"/>  
     <quartz:factory-property key="org.quartz.threadPool.class" value="org.quartz.simpl.SimpleThreadPool"/>  
     <quartz:factory-property key="org.quartz.threadPool.threadCount" value="3"/>  
     <quartz:factory-property key="org.quartz.scheduler.rmi.proxy" value="false"/>  
     <quartz:factory-property key="org.quartz.scheduler.rmi.export" value="false"/>  
     <quartz:factory-property key="org.quartz.jobStore.class" value="org.quartz.simpl.RAMJobStore"/>  
   </quartz:connector>  

Mule uses "quartz:factory-property" to specify all Quartz related properties.

Here is how to configure a clustered scheduler using a JDBC job store backed by MySQL. Enter the following block of XML in the Mule flow's configuration XML section.

      <quartz:connector name="quartzConnector_vm"  
           validateConnections="true" doc:name="Quartz">  
           <quartz:factory-property key="org.quartz.scheduler.instanceName"  
                value="JmxScheduler" />  
           <quartz:factory-property key="org.quartz.scheduler.instanceId"  
                value="_CLUSTERED" />  
           <quartz:factory-property key="org.quartz.jobStore.isClustered" value="true" />  
           <quartz:factory-property key="org.quartz.scheduler.jobFactory.class"  
                value="org.quartz.simpl.SimpleJobFactory" />  
           <quartz:factory-property key="org.quartz.threadPool.class"  
                value="org.quartz.simpl.SimpleThreadPool" />  
           <quartz:factory-property key="org.quartz.threadPool.threadCount"  
                value="3" />  
           <quartz:factory-property key="org.quartz.scheduler.rmi.proxy"  
                value="false" />  
           <quartz:factory-property key="org.quartz.scheduler.rmi.export"  
                value="false" />  
           <!-- JDBC JOB STORE -->  
           <quartz:factory-property key="org.quartz.jobStore.class"  
                value="org.quartz.impl.jdbcjobstore.JobStoreTX" />  
           <quartz:factory-property key="org.quartz.jobStore.driverDelegateClass"  
                value="org.quartz.impl.jdbcjobstore.StdJDBCDelegate" />  
           <quartz:factory-property key="org.quartz.jobStore.dataSource"  
                value="quartzDataSource" />  
           <quartz:factory-property key="org.quartz.jobStore.tablePrefix"  
                value="QRTZ_" />  
           <!-- MYSQL Data Source -->  
           <quartz:factory-property  
                key="org.quartz.dataSource.quartzDataSource.driver" value="com.mysql.jdbc.Driver" />  
           <quartz:factory-property key="org.quartz.dataSource.quartzDataSource.URL"  
                value="jdbc:mysql://localhost:3306/quartz2" />  
           <quartz:factory-property key="org.quartz.dataSource.quartzDataSource.user"  
                value="root" />  
           <quartz:factory-property  
                key="org.quartz.dataSource.quartzDataSource.password" value="root" />  
           <quartz:factory-property  
                key="org.quartz.dataSource.quartzDataSource.maxConnections" value="8" />  
           <!-- JMX Enable -->  
           <quartz:factory-property key="org.quartz.scheduler.jmx.export"  
                value="true" />  
           <quartz:factory-property key="org.quartz.scheduler.jmx.objectName"  
                value="quartz:type=QuartzScheduler,name=JmxScheduler,instanceId=NONE_CLUSTERED" />  
      </quartz:connector>  

To verify that the Quartz connector has been configured successfully, go to "Global Elements" in the Mule flow.


Click on Quartz.


Click on "Properties" tab,

Verify all Quartz properties have been properly configured.

Use Inbound Quartz End Point

An inbound quartz endpoint is self-triggered and will generate one or more messages, whose payload can be preconfigured or loaded from a file.

Here is a picture of the sample Mule flow. Every 5 seconds, the Quartz endpoint will generate a VM message containing "HELLO!" as its body. We have a Java transformer that listens to the same VM endpoint and processes the generated "HELLO!" message.



Double click on "Quartz_Event_Generator" and it brings up the Quartz Mule component,



Fill in the desired information so that our job will fire every five seconds. Now we need to add a new job. Click on the plus sign.



Select "quartz:event-generator-job" and click on "Next".

Fill in "Group Name", "Job Group Name", and enter "HELLO!" for Text. We can also click on "..." next to File to load the file as the message payload.

Now we need to select our already defined Quartz connector by clicking on the "References" tab of the Quartz component and selecting the appropriate pre-defined Quartz connector from the drop-down list for the "Connector Reference:" field.


Now we can run our Mule flow to verify that Quartz triggers a message every five seconds and that the message is processed by our Java transformer.
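For reference, the Java transformer at the end of the flow can be as simple as the sketch below, which logs the generated payload and passes it through. This is a hypothetical example based on Mule 3.x's AbstractMessageTransformer; the class name and logging are assumptions, not the exact transformer used in our flow.

 import org.mule.api.MuleMessage;
 import org.mule.api.transformer.TransformerException;
 import org.mule.transformer.AbstractMessageTransformer;

 public class LoggingTransformer extends AbstractMessageTransformer
 {
   @Override
   public Object transformMessage(final MuleMessage message, final String outputEncoding) throws TransformerException
   {
     // log the payload generated by the Quartz endpoint and pass it through unchanged
     System.out.println("Received payload: " + message.getPayload());
     return message.getPayload();
   }
 }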

We will discuss Quartz outbound events in the next blog.


Monday, September 24, 2012

Riak Cluster, Sanity Check, Part Two

In this blog, we continue with our sanity check on the Riak cluster.

Setting up Riak Cluster

I have used HBase, Cassandra, and MongoDB in the past, and I was preparing for a long, laborious effort to set up a five-node Riak cluster, but I was pleasantly surprised by how easy it was to set up. I also found the command line tools extremely helpful for diagnosing cluster status.

Sanity Check

Our sanity check consists of loading 20 million objects into the cluster and executing a series of GET, link walking, and free text search operations over 10 concurrent threads, while physical nodes are brought down and back up in the middle of the performance testing.

Use HA Proxy for load balancing and failover

We couldn't get the Java client's cluster client to work for us, so we switched to HA proxy for failover and load balancing, and it has worked out really well for us.

When we brought down a Riak node during the performance testing, some of the in-flight queries failed but succeeded immediately after we retried with the same method call.

Test Result

Our sanity check performed flawlessly when physical nodes were brought down during the testing. In-flight queries failed but recovered right after we retried the same query, and there was little degradation when losing a physical node.

However, a big surprise came when we brought a physical node back up during the performance testing: it increased the overall execution time by 6 times. Baffled, I reached out to Brian at Basho, and here is his explanation,
"The reason I ask is the http/pb API listener will start up before Riak KV has finished starting. So, while these nodes will be available for requests they will be slow and build up queues as they start up needed resources. You can use the riak-admin wait-for-service riak_kv <nodename> command from any other node to check if riak_kv is up."

My follow-on question, 
"Yes, I have my performance test running, through a HA proxy that maps to 5 nodes. I understand that the initial response from the starting up node will be slow, but it is impacting the entire performance, which consists of 10 testing threads. I would rather have the node not accepting request until it is ready than accepting requests but queuing them up to drag down the cluster performance."

And Brian's response,
"I completely agree, there is currently work being done on separating the startup into two separate processes. One which will reply with 503 to all requests and then dies when KV is started and ready to take requests. Currently, even if a request is not sent to a node directly, in the case of a PUT, the data will be forwarded to the node even though KV is not ready to accept it. Depending on your W quorum this could result in a long response time for requests as the coordinating node waits for a response from the starting up node.  Currently there is no method to make a node 'invisible' to other nodes in the cluster without bringing it down or changing the cookie in vm.args."

Summary

We are pleased with our overall sanity check results, but there are definitely kinks that need to be worked out, and I am glad the Riak team is on top of things.

Riak Sanity Check, Take One

Before we could officially recommend Riak to the management team, we executed a series of sanity checks to make sure that we make a sound recommendation and that it won't come back to bite us in the rear.

Testing Scenario

Our sanity testing environment consists of 20 million objects running on a single physical node. After the 20 million objects were loaded into Riak 1.1, we executed a series of GET, link walking, and free text search operations with 10 concurrent threads.

We are not interested in the actual performance number, but are looking for obvious bottlenecks and abnormal behaviors.

Test Result

We were surprised at the poor results exhibited by our sanity checks: Riak consumed 100% of the CPU and 75% of the memory, and queries didn't return in any reasonable time.

Needless to say, we were somewhat concerned. This is where Riak's excellent support came into play. Riak's Developer Advocate Brian worked with us and came up with an excellent diagnosis:

"LevelDB holds data in a "young level" before compacting this data into numbered levels (Level0, Level1, Level2….) who's maximum space grow as their number iterates. In pre Riak 1.2 levelDB, the compaction operation from the young level to sorted string table's was a blocking operation. Compaction and read operations used the same scheduler thread to do their work so if a compaction operation was occurring that partition would be effectively locked. Riak 1.2 moves compaction off onto it's own scheduler thread so the vnode_worker (who is responsible for GET/PUT operations) will not be waiting for compaction to complete before read requested can be serviced by the levelDB backend (write requests are dropped in the write buffer, independent of compaction).

In your scenario, the bulk load operation caused a massive amount of compaction. Most likely, because of the large amount of objects you loaded, there was compaction occurring on all 64 partitions for a while after your write operation completed. The input data sat in the write buffer and young level and eventually compaction moved them to their appropriate levels but during this time read operations will timeout.  "

So our bulk loading led to contention on a single thread, which was responsible for both compaction and queries.

We can't upgrade to 1.2 just yet because of a bug in search, so we will have to wait for the next release.

Summary

Overall we are not pleased with the results of our sanity checks, but we are satisfied with the explanation.

Thursday, August 23, 2012

Riak HTTP API Cheat Sheet


Listing Buckets and Keys

To list all buckets,
 http://localhost:8098/buckets?buckets=true  

To list all keys within a particular bucket,
 http://localhost:8098/buckets/bucket-name/keys?keys=true  

To fetch a single object by its key,
 http://localhost:8098/buckets/bucket-name/keys/key-value  

Secondary Index

To find an object based on a secondary index,
 http://localhost:8098/buckets/bucket-name/index/index-name_bin/index-value  

Link Walking 

To walk a link from a source key,
 curl -v http://localhost:8098/riak/bucket-name/key-value/_,link-name,1/_,link-name,1  

The walk result provides interesting details, including the secondary indexes and links on the returned objects.
 --AZBlFqksfFqU1it69r5Ygf99o0p  
 X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fvrf7r6bwZTInMfKMOXIypN8WQA=  
 Location: /riak/schema_schema2/campaigns-efg  
 Content-Type: application/json; charset=UTF-8  
 Link: </riak/schema_schema2/folder%3Acampaigns-folder1>; riaktag="PARENT", </riak/schema_schema2>; rel="up"  
 Etag: 4qk1erJBcxtRZxJBDFgxK3  
 Last-Modified: Wed, 22 Aug 2012 21:01:08 GMT  
 x-riak-index-uri_bin: _schemas_schema2_campaigns_campaigns-efg  
 {"bucket":"schema_schema2","contextType":"CAMPAIGN","json":{"name":"campaigns-efg","id":"campaigns-efg"},"name":"campaigns-efg","relationshipMap":{"PARENT":["folder:campaigns-folder1"]},"schema":"schema2","type":"CAMPAIGN","uri":"/schemas/schema2/campaigns/campaigns-efg","uriIndex":"_schemas_schema2_campaigns_campaigns-efg","version":"1.0"}  

The Link and x-riak-index-uri_bin entries show the Riak links and secondary index on the current object. This is a valuable debugging tool to ensure the object has the correct relationships.

Search

Use the search-cmd command line tool to execute a free text search,
 search-cmd search bucket-name query  
where the query string is something like "type:Folder".

To use the HTTP API,
 curl http://localhost:8098/solr/bucket-name/select?q=name:foo  

Thursday, July 12, 2012

Riak Java Client Distilled

In this blog, we will show how to use Riak Java client to,
  • Create/update objects
  • Enable and search by secondary index
  • Add links and walk links
  • Enable and search by free text through MapReduce

Riak Configuration

There are a few configuration changes we need to make to app.config to enable secondary indexes, free text search, and listening on all network interfaces:
  1. Change the backend to eLevelDB, the only storage engine that supports secondary indexes
  2. Change localhost or 127.0.0.1 to 0.0.0.0 for all IP addresses so Riak will listen on all network interfaces
  3. Enable Riak search by setting the following in app.config:
     {riak_search, [
                     %% To enable Search functionality set this 'true'.
                     {enabled, true}
                    ]}
    

Riak Java Client

All Riak server access is done through a Riak client.

Which Riak Client to Use
The Riak Java library offers two types of Riak clients, which is very confusing. We found that most tasks can be accomplished using the pbc (low level protocol buffer) client, except for the following cases, where one must use the HTTP client,
  • Enable free text search for buckets
How to Obtain a Riak Client

 RiakClient riakClient = RiakFactory.pbcClient(host, port);  

Shut Down the Riak Client at the End

One must shut down all active Riak clients before shutting down the application/Tomcat server itself.
 riakClient.shutdown();  

Create, Update, and Lookup Object

The Riak client API offers a few annotations to mark particular fields, and we highly recommend using them rather than manipulating the metadata ourselves,

  • The Riak Key field (through @RiakKey annotation)
  • A Riak secondary index field (through @RiakIndex annotation)
  • A Riak links collection field (through @RiakLinks annotation)
When we persist an annotated object through the Riak client, the client processes the key, secondary indices, and links first, before handing the object to Jackson for serialization into a JSON string and storing that JSON string in Riak. If we choose to manage the object serialization/deserialization through Jackson ourselves, we must also handle metadata changes, such as a secondary index being added or removed, or links being added or removed. If not handled carefully, we could easily lose the existing secondary indices/links when an object is updated.

Here is an example highlighting the usage of above annotations,

 public class JsonObject  
 {  
   @JsonProperty  
   String bucket;  
   
   @RiakKey  
   String key;  
   
   @JsonProperty  
   String name;  
   
     
   @RiakLinks  
   @JsonIgnore  
   Collection<RiakLink> riakLinks = new ArrayList<RiakLink>();  
   
     
   @RiakIndex(name = "uri")  
   @JsonProperty  
   String uriIndex;  
   
  }  

To save/update an object,

 this.riakClient.createBucket(bucket).execute().store(object).execute();  
   

To lookup an object by key,
   
 @Override  
   public <T> T get(final String bucket, final String key, final Class<T> kclass)  
   {  
     try  
     {  
       return this.riakClient.fetchBucket(bucket).execute().fetch(key, kclass).execute();  
     }  
     catch (final RiakRetryFailedException e)  
     {  
       throw new RuntimeException(e);  
     }  
   }  

Secondary Index Creation and Retrieval
When an object's field is annotated with @RiakIndex, the secondary index is automatically created/updated when the object is stored or updated.

To look up an object based on secondary index,

 public List<String> fetchIndex(final String bucket, final String indexName, final String indexValue)  
   {  
     try  
     {  
   
       return this.riakClient.fetchBucket(bucket).execute().fetchIndex(BinIndex.named(indexName))  
           .withValue(indexValue).execute();  
     }  
     catch (final RiakException e)  
     {  
       throw new RuntimeException(e);  
     }  
   
     // Collection<String> collection = results.getResult(String.class);  
   }  

Riak Search
Riak search must be enabled at the bucket level before Riak will index properties on all objects in the bucket. To enable search on a bucket,

 bin/search-cmd install my_bucket_name  
To execute a Riak search on a given bucket,
 @Override  
   public Collection<JsonObject> search(final String bucket, final String criteria)  
   {  
     try  
     {  
       final MapReduceResult mapReduceResult = this.riakClient.  
           mapReduce(bucket, criteria)  
           .addMapPhase(new NamedJSFunction("Riak.mapValuesJson")).execute();  
       return mapReduceResult.getResult(JsonObject.class);  
     }  
     catch (final Exception e)  
     {  
       throw new RuntimeException(e);  
     }  
   }  
where the bucket parameter is the bucket name and criteria is the search criteria, like "type=Folder" or "(type=Folder AND name=Hello)".

Riak Link Walking
Riak link walking apparently only works with the HTTP client, not the pbc client, for some reason.
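The riakHttpClient used in the walk code below can be obtained from the same RiakFactory. Here is a minimal sketch, assuming the riak-client version in use exposes RiakFactory.httpClient and that the node's HTTP interface is at the default URL.

 import com.basho.riak.client.IRiakClient;
 import com.basho.riak.client.RiakFactory;

 public class RiakHttpClientFactory
 {
   // the URL is an assumption; point it at your node's HTTP interface
   public static IRiakClient createHttpClient() throws Exception
   {
     return RiakFactory.httpClient("http://localhost:8098/riak");
   }
 }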

Here is a sample code to link walk a specific number of steps from the current object, identified by key.
  @Override  
   public List<List<String>> walk(  
                   final String bucket, // bucket name  
                   final String key,  // originating object key  
                   final String linkName,  // link name  
                   final int steps   // number of steps to walk. Riak will stop if it can't walk further  
                   )  
   {  
     final List<List<String>> walkResults = new ArrayList<List<String>>();  
   
     try  
     {  
       final LinkWalk linkWalk = this.riakHttpClient.walk(this.riakHttpClient.createBucket(bucket).execute().fetch(key)  
           .execute());  
   
       for (int i = 0; i < steps; i++)  
       {  
         linkWalk.addStep(bucket, linkName, true);  
       }  
       final WalkResult walkResult = linkWalk.execute();  
   
       final Iterator<Collection<IRiakObject>> it = walkResult.iterator();  
       while (it.hasNext())  
       {  
         final List<String> list = new ArrayList<String>();  
         final Collection<IRiakObject> collections = it.next();  
   
         for (final IRiakObject riakObject : collections)  
         {  
           list.add(riakObject.getKey());  
         }  
         if (list.size() > 0)  
         {  
           walkResults.add(list);  
         }  
       }  
     }  
     catch (final Exception e)  
     {  
       throw new RuntimeException(e);  
     }  
   
     return walkResults;  
   }  

Tuesday, July 10, 2012

Why We Chose Riak


In this blog, we will discuss why we chose Riak as one of the persistence storage engines for our next generation platform. In the next blog, we will show how to use the Riak Java client library to create and update objects, create secondary indexes and links, and perform free text search.

Object Model
Just to recap our dual object model: one object model for the external-facing RESTful Web Services and one internal persistence object model, shown below.

We store only JSONWrapper objects in Riak, along with the appropriate relationships and links. We also need to search for objects based on their name, type, etc.

Why We Chose Riak

I have been playing around with Riak for the past month and came to the conclusion that Riak is a good option for our next generation platform, for the following reasons:

Ever-Evolving Object Model
The highly adaptive nature of our object model is not a good fit for a traditional ORM on top of an RDBMS, as the object model is highly customizable from customer to customer and may evolve from version to version. A traditional ORM would require the RDBMS schema to continuously keep up with our ever-evolving object model, requiring enormous effort from Engineering, Testing, and Operations.

The platform really does not care about the customized and highly evolved properties of object types. In other words, the platform only needs to know a pre-defined set of object properties for persistence and relationship resolution purposes and does not need to know all the other properties.

Riak, on the other hand, gives us the flexibility to store opaque objects, and we decided to store objects as JSON rather than serialized Java objects or XML because JSON serialization is much more flexible and compact and needs far less storage than Java or XML.

High Availability and Multi-Data Center Support
Riak is built as a distributed data store, with a tunable read and write replication strategy.
Riak Enterprise offers multi-data center replication.

Free Text Search 
Riak comes with built-in free text search support, built on top of Lucene.

Adjacency Link Walking
Our object model relies on adjacency links between objects, and it is critical to be able to follow the object graph through these links. Riak offers MapReduce-based link walking functionality, so we can easily retrieve all objects that are linked to a particular object through any number of link levels.

Secondary Index Support
Like an RDBMS, Riak offers secondary index support in addition to primary key lookup.

Multi-Tenant Support
Our platform must support multi-tenancy for security, partitioning, and performance reasons, which is not trivial to accomplish in an RDBMS environment.

Riak, on the other hand, partitions data naturally into buckets, and buckets are distributed across different nodes. Tenants can be mapped to buckets, and data level security can be accomplished by securing access to buckets. If we store a tenant's data in the same bucket, a user can only access the data if he has access to the bucket, and he can't access any objects that don't belong to his accessible buckets.

Ad Hoc Query Support Through MapReduce
Riak provides us the ability to run ad hoc queries over the entire data set through a series of Map and Reduce phases. The only limitation is that MapReduce is executed in memory and must complete within a timeout limit. This is not a major concern given the size of our data set.

Performance
Riak is based on a distributed data model, which should perform better than a master-slave type of model.

Operation and Monitoring Support
Riak ships with a UI monitoring tool and a set of commands for other administrative tasks like backup/restore, etc.

Concerns about Riak
We do have concerns regarding Riak from a business perspective. Even though Riak is an open source solution, its commercial backer Basho is still relatively young, and the user community is not as big as those of Hadoop, Cassandra, or MongoDB.

To mitigate the risk, we built a persistence abstraction layer that allows us to swap Riak for a different NoSQL technology in the future if necessary.

Monday, July 9, 2012

Building an Adaptive Object Model

In this series of blogs, we will discuss how we built our next generation platform using Jersey, Jackson, JSON, and Riak. But first, we will show how to build an adaptive object model that supports multiple versions of object types simultaneously.

Object Model Requirement

For our platform, we are storing various types of configuration objects, with parent/child relationships linking objects together. Each object type has a set of fixed pre-defined properties and a set of custom properties, which can vary from customer to customer.

Support Versioned Objects

As our platform evolves, our object model will need to adapt and evolve, which means we need to support and store different versions of the same object type. Properties can be added or removed between versions.

Building an Adaptive Object Model

After some exploring, we decided to go with two sets of object models: an explicit object model for Web Services and a generic persistence object model.

Here is a picture of the two object models,

The generic object graph approach consists of two layers of abstraction, a wrapper object and an inner JSON object. The wrapper layer contains the following static information that does not change from one version to another, like,

  • Object name
  • Object key
  • Object type
  • Object uri
  • Object version
  • A relationship map
  • Parent id
and a map representation of the inner JSON object.

The inner JSON object is the JSON object created by the user or returned to the user. It can be different from version to version and from type to type. The platform does not need to know what is actually stored in the inner JSON object, other than a set of standard fields, like,
  • Object name
  • Object type
  • Object version
The platform just stores the wrapper object, with the inner object as an opaque map. Since the wrapper object structure does not change based on version or type, and the inner object is stored as a generic Map type, the platform does not need to change every time we add a new object type or change an existing object type.

Object Validation

However, we still need to validate the user-supplied JSON object to make sure it matches the correct version of the object type. We accomplish this by creating a set of validation classes and registering them by type and version. When we receive an object creation request, we first deserialize the input JSON to a raw Map type and pull out the type and version from the map. Then we look up the validation class based on type and version, and deserialize the input JSON again using the validation class.
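Here is a minimal sketch of that lookup-and-revalidate step using Jackson. The registry, its key format, and the field names are hypothetical; the real validation classes are registered per type and version as described above.

 import java.util.HashMap;
 import java.util.Map;

 import org.codehaus.jackson.map.ObjectMapper;

 public class ObjectValidator
 {
   private final ObjectMapper mapper = new ObjectMapper();

   // hypothetical registry: "type:version" -> validation class
   private final Map<String, Class<?>> validationClasses = new HashMap<String, Class<?>>();

   public Object validate(final String json) throws Exception
   {
     // first pass: deserialize into a raw map just to read the type and version fields
     @SuppressWarnings("unchecked")
     final Map<String, Object> raw = this.mapper.readValue(json, Map.class);
     final String type = (String) raw.get("type");
     final String version = (String) raw.get("version");

     final Class<?> validationClass = this.validationClasses.get(type + ":" + version);
     if (validationClass == null)
     {
       throw new IllegalArgumentException("Unsupported type/version: " + type + "/" + version);
     }
     // second pass: deserialize against the explicit class, which rejects unknown fields by default
     return this.mapper.readValue(json, validationClass);
   }
 }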

When a new type or a new version of an existing type is introduced, we just update the validation map with the new configuration and deploy the new classes; nothing else needs to change.

Sequence Diagram for Web Services

Create a New Object through Web Services.

Get an Object through Web Services.


Tuesday, June 19, 2012

First Impression on Riak vs mongoDB vs Cassandra/HBase

As part of building out the next generation technology platform, we would like to explore NoSQL solutions to complement our existing platform, which is built on an RDBMS, primarily Oracle.

I have used MongoDB, Cassandra, and HBase in a past life, and I am eager to learn what Riak could offer as an alternative NoSQL solution.

In the next few weeks, I will be publishing our findings and lessons learned along the way. In this post, I will give my first unbiased impression of Riak vs. other NoSQL technologies.

First Impression on Riak

Riak is an open source, distributed database solution, written in Erlang and supported by Basho. At first glance, it offers the following nice features,
  • Objects are stored by buckets and keys
  • Really nice HTTP API (for developing/debugging purpose)
  • Horizontal and linear scalability
  • Masterless replication with tunable read/write consistency level
  • Consistent key hashing and even load distribution
  • Automatic rebalancing when new nodes are introduced or removed
  • Support for links between objects, a natural way to build a hierarchical object model
  • Complex query support including secondary index, free text search, and MapReduce support
  • Excellent client library support
  • Thousands of name branded customers
At first glance, Riak is very much like Cassandra, but with automatic cluster management, a nicer API, and much less complexity in terms of cluster management and learning curve (no more Thrift API).

On the other hand, Riak is a strict name/value pair model and does not offer column family or super column family support, as found in Cassandra/HBase. I guess we could simulate column family support by turning the bucket into the row key and keys into column family columns.

Overall, Riak looks like a good candidate for our prototyping, and I will share our experience in the next few blogs.

Performance Tuning Hibernate Transaction Flush Mode

I have been using Hibernate on and off for the past ten years, and here are some tips and tools that I have used to help me identify, tune, and improve Hibernate performance.

Tip 1 Use a good JDBC Profiler

My personal favorite is Elvyx, which is easy to install, configure, and use. While the Hibernate SQL log is useful, it is not easy to read and it won't show the actual parameters sent to the database. Elvyx, on the other hand, has a UI that shows both unbound (similar to Hibernate) and bound SQL, i.e., with the actual parameter values. The Elvyx UI also allows us to do the following,
  • Sort the queries
  • View a total elapsed time summary graph
  • Drill down to a single query and view its execution status
  • Export data into Excel and other formats

A JDBC profiler should be used as part of the development and QA process to catch potential performance issues, and in production to help troubleshoot live performance issues.

In Development and QA

In development, a JDBC profiler should be used to profile every Web Services call or every single page turn for web applications, to verify the following:
  • Hibernate is generating the correct SQL (from HQL)
  • Hibernate is loading just the right amount of data (use lazy loading whenever possible)
  • Hibernate is generating the correct number of SQL calls. An abnormal number of SQL calls per web service call or per web page turn indicates poor design and a potential performance issue
  • Look for SQL that takes a long time to execute. Examine the explain plan and make sure the plan makes sense. If the generated SQL does not meet requirements, consider rewriting the query or using native SQL, a function, or a stored procedure for better performance

In Production

Since Elvyx is not intrusive and does not require recompiling the application or any other type of special treatment, it is ideal for troubleshooting live production performance issues. Simply deploy, configure, restart, and start troubleshooting.

Tip 2 Understand Transaction Flush Mode

Most people don't understand Hibernate's transaction flush mode or which flush mode is the most appropriate to use. The wrong flush mode can lead to huge performance issues.

What is Transaction Flush Mode

Hibernate does not flush every add or update to the database immediately. Rather, Hibernate collects them and waits for the right time to flush them all to the database, and the right time is defined by the transaction flush mode. There are four flush modes,

  • ALWAYS: the session is flushed before every query
  • COMMIT: the session is flushed when the transaction is committed
  • MANUAL: the session is flushed manually, i.e., Hibernate will NOT flush the session to the database at query or commit time
  • AUTO: the default flush mode and yet the most confusing one. The session is flushed before a query is executed and when the transaction is committed

Why do we need a transaction flush mode?

Database transactions are expensive and do not perform well, so Hibernate turns auto-commit off. Hibernate defers database writes until the end of the transaction, when all necessary database updates have been made.

For example, in a transaction, we can do the following,
  1. Begin transaction
  2. Create employee A
  3. Create employee B
  4. Associate A with its manager C
  5. Associate B with its manager D
  6. Commit transaction
Instead of 4 separate transactions, we only need a single transaction. Very efficient.

Now, if we change the flow a little,
  1. Begin transaction
  2. Create employee A
  3. Create employee B
  4. Associate A with its manager C
  5. Look up all employees reporting to C
  6. Associate B with its manager D
  7. Look up all employees reporting to D
  8. Commit transaction
If we don't call a transaction flush before steps 5 and 7, we will get incorrect results, because the query results won't include the newly created employees A and B. If we want newly created objects to be included in query results before committing them to the database, we must flush the pending changes (the creation of employees A and B) to the database before they can be picked up by a later query.

Hibernate's default flush mode, AUTO, is designed to be overly cautious and does a database flush every time before executing a query. It is designed to protect novice users, but it comes with a hefty performance penalty.

What is the Performance Penalty associated with Database Transaction Flushing

Hibernate does not keep track of which objects have been modified in the session. In order to do a proper flush, it must first determine which objects have changed by going through ALL the objects in the session and comparing each one with its state as loaded from the database, one object at a time. This process is extremely CPU intensive and only gets worse when a lot of objects are loaded in the session, which is typical in bulk load/update types of transactions.

Default Flush Mode introduces Performance Problem during Bulk Operations

We had a page that creates a new campaign based on an existing campaign template via a deep copy. A campaign object could contain hundreds of other objects. A typical flow looks like the following,

  • Begin Transaction
  • Retrieve the template campaign
  • Shallow copy and save the top level campaign objects
  • For each top level campaign object 
    • Retrieve next level campaign objects
    • Shallow copy and save the secondary level campaign objects
  • Iterate through all nested objects
  • Commit Transaction
A typical copy operation took 30 minutes. This clearly indicates a performance issue. After further investigation, we traced the problem back to the hefty performance cost introduced by database transaction flushing.

For each select statement, like retrieving the next level of campaign objects, Hibernate does a database flush, and as the number of objects loaded in the session increases, the time to determine the "dirty" objects increases dramatically. And there is absolutely NO need to flush, since we are NOT making any changes to existing objects, only creating new ones.

The solution is to switch the default flush mode to COMMIT. This cuts the execution time from 30 minutes to 3 seconds.
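Here is a rough sketch of what that change looks like in code, assuming a Hibernate Session obtained from a SessionFactory; the method and parameter names are hypothetical, and the copy logic itself is elided.

 import org.hibernate.FlushMode;
 import org.hibernate.Session;
 import org.hibernate.SessionFactory;
 import org.hibernate.Transaction;

 public class CampaignCopyService
 {
   // hypothetical deep-copy entry point; the copy logic itself is elided
   public void copyCampaign(final SessionFactory sessionFactory, final Long templateId)
   {
     final Session session = sessionFactory.openSession();
     try
     {
       // no existing objects are modified, so skip the dirty check that AUTO performs before every query
       session.setFlushMode(FlushMode.COMMIT);

       final Transaction tx = session.beginTransaction();
       // ... load the template, shallow-copy each level, and session.save() the copies ...
       tx.commit();
     }
     finally
     {
       session.close();
     }
   }
 }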

So the next time an operation takes an abnormally long time to execute and it is not being held up by the database itself, check the Hibernate transaction flush mode carefully. Typically I use either MANUAL or COMMIT for any type of bulk operation or read-only operation.

Tip 3 Use Batch Operations

As we have shown before, Hibernate carries a huge performance penalty if we execute one query at a time, because of the overhead related to database transaction management. However, we can reduce this cost dramatically if we batch a set of operations together and carry them out in a single transaction or a single query.
We had a page displaying a grid, which can be sorted or filtered by a set of criteria. The original implementation performed poorly because it was implemented like the following,
  • Select a set of user ids based on the selection criteria
  • Get each user for each returned user id
A much better performing implementation is,
  • Select a set of user ids based on selection criteria
  • For every 300 user id
    • Select users where user id in (the set of 300 users)
The second implementation is typically 10 to 20 times faster than the first one.
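Here is a hedged sketch of the batched lookup, assuming a Hibernate session and a hypothetical User entity; the batch size of 300 matches the description above.

 import java.util.ArrayList;
 import java.util.List;

 import org.hibernate.Session;

 public class UserBatchLoader
 {
   private static final int BATCH_SIZE = 300;

   // fetches users in batches of 300 ids instead of issuing one query per id
   @SuppressWarnings("unchecked")
   public List<Object> loadUsers(final Session session, final List<Long> userIds)
   {
     final List<Object> users = new ArrayList<Object>();
     for (int i = 0; i < userIds.size(); i += BATCH_SIZE)
     {
       final List<Long> batch = userIds.subList(i, Math.min(i + BATCH_SIZE, userIds.size()));
       // "User" is a hypothetical mapped entity
       users.addAll(session.createQuery("from User u where u.id in (:ids)")
           .setParameterList("ids", batch)
           .list());
     }
     return users;
   }
 }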

Friday, June 15, 2012

How to Update Facebook Status on user's behalf

In the previous blog entry, "How to Tweet on user's behalf", we talked about how to tweet on a user's behalf through a Twitter application. In this blog, we will discuss the similar function on Facebook, i.e., how to update status on a user's behalf. The necessary steps are very similar,

  1. Create a Facebook campaign specific application
  2. Obtain an access token for a particular Facebook account. This process does require the user to log into his/her Facebook account and explicitly grant permission to the Facebook app, which will post on the user's behalf. Once the access token is retrieved, we can post on the user's behalf through the app. Unlike Twitter, the access token does expire, after 60 days, at which point we must go through the same process to obtain a new access token. Not very user friendly, in our opinion.

Setting up a Facebook Application

Step 1: Go to https://developers.facebook.com/
Step 2: Fill in the necessary information.
The App ID/App Key and App Secret will be used when obtaining the access token.

Obtaining Access Token

As noted above, obtaining an access token requires the user to log into his/her Facebook account and explicitly grant permission to the Facebook app that will post on the user's behalf. Once the access token is retrieved, we can post through the app; since the token expires after 60 days, the process must be repeated to obtain a new one.

Here is the high level flow for obtaining a Facebook access token,

Here are the detailed sequence diagrams for obtaining an access token.
Once an access token is granted, we can start updating the user's status as follows,

HTTP POST "https://graph.facebook.com/" + facebookUser + "/feed?access_token=" + token + "&message=" + URLEncoder.encode(message, "utf-8");
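A minimal sketch of issuing that POST with Apache HttpClient 4 might look like the following; the class and method names are hypothetical, and error handling is omitted.

 import java.net.URLEncoder;

 import org.apache.http.HttpResponse;
 import org.apache.http.client.HttpClient;
 import org.apache.http.client.methods.HttpPost;
 import org.apache.http.impl.client.DefaultHttpClient;

 public class FacebookStatusUpdater
 {
   // posts a message to the given user's feed using the stored access token
   public int updateStatus(final String facebookUser, final String token, final String message) throws Exception
   {
     final String url = "https://graph.facebook.com/" + facebookUser + "/feed?access_token=" + token
         + "&message=" + URLEncoder.encode(message, "utf-8");
     final HttpClient httpClient = new DefaultHttpClient();
     final HttpPost post = new HttpPost(url);
     final HttpResponse response = httpClient.execute(post);
     return response.getStatusLine().getStatusCode();
   }
 }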

How to Tweet on user's behalf

As part of a multichannel campaign, we would like to tweet on the customer's behalf. In order to accomplish that, we need the following,

  • Create a Twitter application, which will tweet on the customer's behalf. The application should also contain the appropriate campaign information and URL
  • Obtain an access token for a particular Twitter handle or handles. This process does require the user to log into his/her Twitter account and explicitly grant permission to the Twitter app, which will tweet on the user's behalf. Once the access token is retrieved, we can tweet on the user's behalf through the app. The access token currently does not expire, except when the user changes his password or explicitly revokes the access permission
In this blog, we walk through the necessary steps to obtain an access token.

Setting up a Twitter Application

Step 1: Log into https://dev.twitter.com/apps/new
Step 2: Fill in the necessary information for the app, including the appropriate information for the campaign.
The Consumer Key, Consumer Secret, and Callback URL will be used when obtaining the access token.

Obtaining Twitter Access Token

As noted above, obtaining a Twitter access token requires the user to log into his/her Twitter account and explicitly grant permission to the Twitter app that will tweet on the user's behalf. Once the access token is retrieved, we can tweet through the app, and the token does not expire unless the user changes his password or explicitly revokes the access permission.

Here is a high level interaction diagram on this process,

Here is the corresponding sequence diagram,

Once an access token and the associated token secret are obtained, we can store them in persistent storage and tweet on behalf of the user.
Twitter's native Web Services API is very hard to use, and we highly recommend going with Twitter4J.
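For example, with Twitter4J, tweeting on the user's behalf using the stored access token and token secret can be as simple as the sketch below; the class and method names here are hypothetical.

 import twitter4j.Status;
 import twitter4j.Twitter;
 import twitter4j.TwitterFactory;
 import twitter4j.auth.AccessToken;

 public class TweetOnBehalf
 {
   public void tweet(final String consumerKey, final String consumerSecret,
       final String accessToken, final String accessTokenSecret, final String message) throws Exception
   {
     final Twitter twitter = new TwitterFactory().getInstance();
     // the consumer key/secret come from the Twitter application, the token/secret from the OAuth flow above
     twitter.setOAuthConsumer(consumerKey, consumerSecret);
     twitter.setOAuthAccessToken(new AccessToken(accessToken, accessTokenSecret));
     final Status status = twitter.updateStatus(message);
     System.out.println("Tweeted: " + status.getText());
   }
 }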

Wednesday, June 6, 2012

How to use the Apache HttpClient 4 library to securely connect to RESTful Web Services

In this blog, we will show how to securely connect to RESTful web services using Apache HttpClient 4, HTTPS, and basic authentication with a username and password.

Self-signed Certificate Support

Apache HttpClient 4 has changed quite a bit from version 3 and has much better self-signed certificate support. In order to support self-signed certificates, we need to create a new class called TrustSelfSignedStrategy that implements TrustStrategy; it trusts any certificate and just returns true. This class is for illustration purposes only and shouldn't be used in production.

   protected static class TrustSelfSignedStrategy implements TrustStrategy  
   {  
     @Override  
     public boolean isTrusted(X509Certificate[] arg0, String arg1) throws CertificateException  
     {  
       return true;  
     }  
   }  

The following code shows how to create a ClientConnectionManager object using the above TrustSelfSignedStrategy.
   protected ClientConnectionManager enableSelfSignedCerts() throws Exception  
   {  
     TrustStrategy trustStrategy = new TrustSelfSignedStrategy();  
     X509HostnameVerifier hostnameVerifier = new AllowAllHostnameVerifier();  
     SSLSocketFactory sslSf = new SSLSocketFactory(trustStrategy, hostnameVerifier);  
     Scheme https = new Scheme("https", 443, sslSf);  
     SchemeRegistry schemeRegistry = new SchemeRegistry();  
     schemeRegistry.register(https);  
     ClientConnectionManager connection = new PoolingClientConnectionManager(schemeRegistry);  
     return connection;  
   }  

Preemptive Basic Authentication with Username and Password

Next we will show how to use preemptive basic authentication with a username and password. In the web services world we must use preemptive basic authentication, since there is no web client to go back to and prompt the user for authentication credentials.

       String urlString = PING_IDENTITY_SERVER_URL + TOKEN_AUTH + URLEncoder.encode(TEST_TOKEN, "UTF-8");  
       URL url = new URL(urlString);  
       // support self-signed certificates  
       DefaultHttpClient httpClient = new DefaultHttpClient(enableSelfSignedCerts());  
       // add username/password for BASIC authentication  
       httpClient.getCredentialsProvider().setCredentials(new AuthScope(url.getHost(), url.getPort()),  
           new UsernamePasswordCredentials("user", "secret"));  
       // Create AuthCache instance  
       // Add AuthCache to the execution context  
       AuthCache authCache = new BasicAuthCache();  
       BasicScheme basicAuth = new BasicScheme();  
       authCache.put(new HttpHost(url.getHost(), url.getPort(), url.getProtocol()), basicAuth);  
       BasicHttpContext localcontext = new BasicHttpContext();  
       localcontext.setAttribute(ClientContext.AUTH_CACHE, authCache);  
       //  
       HttpGet getRequest = new HttpGet(urlString);  
       getRequest.setHeader("Content-Type", "application/json");  
       // call HTTP GET with authentication information  
       HttpResponse response = httpClient.execute(getRequest, localcontext);  
       if (response.getStatusLine().getStatusCode() != 200)  
       {  
         throw new RuntimeException("Failed : HTTP error code : " + response.getStatusLine().getStatusCode());  
       }  
       BufferedReader br = new BufferedReader(new InputStreamReader((response.getEntity().getContent()))); 

Troubleshooting Tips

If you get a 401 error, make sure that preemptive authentication is used and that the username and password are correct.

Saturday, June 2, 2012

Secure Jersey with OAuth2, Open Authentication Framework

Overview

Our platform must be secure. After some initial investigation, we decided to go with OAuth2, the next generation of the OAuth protocol. The OAuth protocol enables websites or applications (Consumers) to access Protected Resources from a web service (Service Provider) via an API, without requiring Users to disclose their Service Provider credentials to the Consumers. More generally, OAuth creates a freely-implementable and generic methodology for API authentication.


Securing Jersey with OAuth2

We looked at implementing OAuth2 support as a Tomcat security realm, a servlet filter, or a Tomcat valve. In the end, we decided to go with a Tomcat valve for the following reasons,
  • Since we are implementing web services and not web applications, there is no standard way of caching a Tomcat session between requests. This makes a session-based security realm irrelevant.
  • A servlet filter can do almost exactly the same thing as a Tomcat valve, except that a servlet filter is deployed at the web application level. This means we would have to deploy the filter for every web application we deploy; not as convenient as a Tomcat valve
  • On the other hand, a Tomcat valve is Tomcat specific. If we want a portable solution, we will have to stick with a servlet filter. Luckily we are sticking with Tomcat for now, and there would be very little effort involved if we needed to switch to a servlet filter based implementation

OAuth Tomcat Valve Class

Here is the example Valve class,
 package jersey.oauth;  
 import java.io.IOException;  
 import javax.servlet.ServletException;  
 import javax.servlet.http.HttpServletResponse;  
 import org.apache.catalina.connector.Request;  
 import org.apache.catalina.connector.Response;  
 import org.apache.catalina.valves.ValveBase;  
 import com.sun.security.auth.UserPrincipal;  
 public class OAuthValve extends ValveBase  
 {  
     protected String identityServerURL;  
     public String getIdentityServerURL() {  
         return identityServerURL;  
     }  
     public void setIdentityServerURL(String identityServerURL) {  
         this.identityServerURL = identityServerURL;  
     }  
     @Override  
     public void invoke(Request request, Response response) throws IOException,  
             ServletException {  
         if (request.getMethod().equals("OPTIONS"))  
             getNext().invoke(request, response);  
         else  
         {  
 //            response.sendError(HttpServletResponse.SC_FORBIDDEN);  
             String authentication = request.getHeader("authentication");  
             if (authentication == null)  
             {  
                 authentication = request.getParameter("access_token");  
             }  
             else  
             {  
                     String[] tokens = authentication.split(" ");  
                 if (tokens.length >= 2 && tokens[0].equalsIgnoreCase("Bearer"))  
                 {  
                     authentication = tokens[1];  
                 }  
                 else  
                 {  
                     authentication = null;  
                 }  
             }  
             if (authentication == null)  
                 response.sendError(HttpServletResponse.SC_UNAUTHORIZED);  
             else  
             {  
                 // TODO call identity server, passing on the access token
                 // Set return principal 
                 request.setUserPrincipal(new UserPrincipal("name"));  
                 getNext().invoke(request, response);  
             }  
         }  
     }  
 }  
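The TODO in the valve is where the access token would be validated against the identity server. Below is a hedged sketch of what that call might look like using Apache HttpClient; the /validate path, the access_token query parameter, and the 200-means-valid convention are assumptions, not a real identity server API.

 import org.apache.http.HttpResponse;
 import org.apache.http.client.HttpClient;
 import org.apache.http.client.methods.HttpGet;
 import org.apache.http.impl.client.DefaultHttpClient;

 public class TokenValidator
 {
   // hypothetical validation call: the /validate endpoint and query parameter are assumptions
   public boolean isValid(final String identityServerURL, final String accessToken) throws Exception
   {
     final HttpClient httpClient = new DefaultHttpClient();
     final HttpGet get = new HttpGet(identityServerURL + "/validate?access_token=" + accessToken);
     final HttpResponse response = httpClient.execute(get);
     return response.getStatusLine().getStatusCode() == 200;
   }
 }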


Here is the sample Host block of Tomcat server.xml file,
 <Host appBase="webapps" autoDeploy="true" name="localhost" unpackWARs="true" xmlNamespaceAware="false" xmlValidation="false">  
     <!-- SingleSignOn valve, share authentication between web applications  
        Documentation at: /docs/config/valve.html -->  
     <!--  
     <Valve className="org.apache.catalina.authenticator.SingleSignOn" />  
     -->  
     <!-- Access log processes all example.  
        Documentation at: /docs/config/valve.html -->  
     <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" pattern="common" prefix="localhost_access_log." resolveHosts="false" suffix=".txt"/>  
         <Valve className="jersey.oauth.OAuthValve" identityServerURL="localhost"/>  
    <Context docBase="JerseyCors" path="/jersey" reloadable="true" source="org.eclipse.jst.jee.server:JerseyCors"/></Host>  
Deploy, start Tomcat and test.

Friday, June 1, 2012

Enable Cross Origin Resource Sharing for Jersey

Overview

In this blog, we will talk about how to enable and configure CORS support for Jersey, and more importantly, how to troubleshoot when CORS is not working properly.

As mentioned in the previous blog, we were disappointed to find out that Apache CXF's CORS support did not work, and were pleasantly surprised by how easy the CORS Filter has been to set up and configure. We have tested the CORS Filter against Jersey, RESTEasy, and Apache CXF, and it worked for every single one of them.

Enable CORS Support

The CORS Filter is implemented as a servlet filter that must be enabled and configured at the web app level, inside the web.xml file.
Here is a sample web.xml file,
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">
  <servlet>
    <servlet-name>Jersey Root REST Service</servlet-name>
    <servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
    <init-param>
      <param-name>com.sun.jersey.config.property.packages</param-name>
      <param-value>jersey.cors</param-value>
    </init-param>
 <init-param>
  <param-name>com.sun.jersey.api.json.POJOMappingFeature</param-name>
  <param-value>true</param-value>
 </init-param>
 <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>Jersey Root REST Service</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
  <filter>
  <filter-name>CORS</filter-name>
  <filter-class>com.thetransactioncompany.cors.CORSFilter</filter-class>
  
  <!-- Note: All parameters are options, if ommitted CORS Filter
       will fall back to the respective default values.
    -->
  <init-param>
   <param-name>cors.allowGenericHttpRequests</param-name>
   <param-value>true</param-value>
  </init-param>
  
  <init-param>
   <param-name>cors.allowOrigin</param-name>
   <param-value>*</param-value>
  </init-param>
  
  <init-param>
   <param-name>cors.supportedMethods</param-name>
   <param-value>GET, HEAD, POST, OPTIONS, PUT, DELETE</param-value>
  </init-param>
  
  <init-param>
   <param-name>cors.supportedHeaders</param-name>
   <param-value>Content-Type, X-Requested-With, Accept, Authentication</param-value>
  </init-param>
  
  <init-param>
   <param-name>cors.exposedHeaders</param-name>
   <param-value>X-Test-1, X-Test-2</param-value>
  </init-param>
  
  <init-param>
   <param-name>cors.supportsCredentials</param-name>
   <param-value>true</param-value>
  </init-param>
  
  <init-param>
   <param-name>cors.maxAge</param-name>
   <param-value>3600</param-value>
  </init-param>

 </filter>

 <filter-mapping>
  <!-- CORS Filter mapping -->
  <filter-name>CORS</filter-name>
  <url-pattern>/*</url-pattern>
 </filter-mapping>
  
</web-app>

Configuring CORS Filter

The default configuration values are good for everything except for the following two fields, 
  • cors.supportedMethods
  • cors.supportedHeaders
These two fields must be checked if the CORS Filter is not working as expected.

cors.supportedMethods specifies a list of supported methods and the default value is GET, HEAD, and POST only. We recommend listing all HTTP methods as supported methods.

cors.supportedHeaders lists the set of supported header fields. This set must be expanded if additional headers are passed in unexpectedly. We recommend listing as many headers as possible.

Testing and Troubleshooting

CORS support can be tested through JavaScript, and here is an example,
<html> 
<head> 
<title>Cors Example</title> 
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script type="text/javascript" src="log4javascript.js"></script>
<script> 
var hello = JSON.stringify({"greeting":"Hello","name":"jersey"});
//alert(hello);
$(document).ready(function() {

 //alert('before ajax call');
 $.ajax({
  headers: {
   Authentication : 'Bearer access_token'
  },
  
  //this is the php file that processes the data and send mail
  //url: "http://localhost:8080/cxf-hello-cors/rest/annotatedGet/hello", 
  //url: "http://localhost:8080/cxf-hello-cors/service1/time", 
  // url: "http://localhost:8080/resteasy/tutorial/helloworld",
  url: "http://localhost:8080/jersey/hello",
  
  contentType: "application/json",

  //GET method is used
  type: 'DELETE',
  
  //pass the data         
  dataType: 'json',   
   
  //data: JSON.stringify(hello), 
  data: hello,
  
  //Do not cache the page
  cache: false,
   
  //success
  success: function (html) {  
   //alert(html); 
   document.getElementById("cors").innerHTML = "Echo: " + html.greeting + "," + html.name; 
           
  } ,
  error:function (data, status) {
   alert(data);
   alert(status);
     }      
 });
     
 });
</script>
</head> 
<body> 

<h1>This is the CORS test page</h1>

<p>Hello, <div id="cors"/>

</body> 
</html>

Troubleshooting CORS

We use a combination of the Tomcat access log, Firefox Firebug, and the Jersey client to troubleshoot CORS support.
CORS relies on headers to relay cross-origin resource sharing information back to the browser, and a CORS-capable browser will enforce CORS based on these header fields. When CORS is not working as expected, the majority of errors happen because the web service does not pass back the appropriate headers due to permission-related issues, like supported headers or supported methods. The best place to look for this type of information is Tomcat's access log.
Here are some sample entries from the access log,

127.0.0.1 - - [31/May/2012:15:40:42 -0400] "GET /jersey/hello HTTP/1.1" 401 -
127.0.0.1 - name [31/May/2012:15:42:27 -0400] "GET /jersey/hello HTTP/1.1" 200 36
127.0.0.1 - - [31/May/2012:15:42:39 -0400] "GET /jersey/hello HTTP/1.1" 401 -
127.0.0.1 - - [31/May/2012:15:44:18 -0400] "GET /jersey/hello HTTP/1.1" 401 -
127.0.0.1 - - [31/May/2012:15:45:45 -0400] "GET /jersey/hello HTTP/1.1" 401 -
127.0.0.1 - - [31/May/2012:15:46:38 -0400] "GET / HTTP/1.1" 401 -
127.0.0.1 - - [31/May/2012:15:46:52 -0400] "GET /jersey/hello HTTP/1.1" 401 -
0:0:0:0:0:0:0:1%0 - - [31/May/2012:15:47:02 -0400] "OPTIONS /jersey/hello HTTP/1.1" 403 94
127.0.0.1 - - [31/May/2012:15:48:06 -0400] "GET / HTTP/1.1" 401 -
0:0:0:0:0:0:0:1%0 - - [31/May/2012:15:51:23 -0400] "OPTIONS /jersey/hello HTTP/1.1" 200 -
0:0:0:0:0:0:0:1%0 - name [31/May/2012:15:51:23 -0400] "DELETE /jersey/hello HTTP/1.1" 200 36
127.0.0.1 - name [31/May/2012:16:01:12 -0400] "GET /jersey/hello HTTP/1.1" 200 36 
Each entry represents an access from the client. The last three fields of each entry are the request line, the HTTP status code, and the returned content length. If CORS is not working as expected, check the following,
  • Make sure there is an entry in the access log that corresponds to the request
  • Make sure the HTTP status code is correct. If the HTTP status code is 403, check the CORS filter's supported methods and supported headers to make sure that both settings are configured properly
If the HTTP code is 200 but CORS is still not working, turn Firebug on and examine the request, paying special attention to the response headers,
Debugging CORS response with Firebug


Make sure the full set of Access-Control-* headers is present in the response.
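A quick way to verify this outside the browser is to replay the preflight OPTIONS request with the Jersey 1.x client and print the returned headers. This is a minimal sketch; the URL and Origin value are assumptions based on the JavaScript test page above.

 import com.sun.jersey.api.client.Client;
 import com.sun.jersey.api.client.ClientResponse;
 import com.sun.jersey.api.client.WebResource;

 public class CorsPreflightCheck
 {
   public static void main(final String[] args)
   {
     final Client client = Client.create();
     // the URL matches the Jersey hello resource used in the JavaScript test page
     final WebResource resource = client.resource("http://localhost:8080/jersey/hello");

     // simulate the browser's preflight request
     final ClientResponse response = resource
         .header("Origin", "http://localhost:8000")
         .header("Access-Control-Request-Method", "DELETE")
         .header("Access-Control-Request-Headers", "Content-Type, Authentication")
         .options(ClientResponse.class);

     System.out.println("Status: " + response.getStatus());
     System.out.println("Access-Control-Allow-Origin: "
         + response.getHeaders().getFirst("Access-Control-Allow-Origin"));
     System.out.println("Access-Control-Allow-Methods: "
         + response.getHeaders().getFirst("Access-Control-Allow-Methods"));
   }
 }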