Jan Amoyo

on software development (and stuff)

MongoDB Transaction Across Multiple Documents using Async and Mongoose

Unlike traditional relational databases, MongoDB does not support transactions. So if you have multiple documents and want a "save all or nothing" behavior, you have to simulate a transaction from within your application.

Implementing such a transaction in NodeJS can be tricky because IO operations are generally asynchronous. This can lead to nested callbacks as demonstrated by the code below:
var MyModel = require('mongoose').model('MyModel');

var docs = [];
MyModel.create({ field: 'value1' }, function (err, doc) {
  if (err) { console.log(err); }
  else {
    docs.push(doc);
    MyModel.create({ field: 'value2' }, function (err, doc) {
      if (err) { rollback(docs); }
      else {
        docs.push(doc);
        MyModel.create({ field: 'value3' }, function (err, doc) {
          if (err) { rollback(docs); }
          else {
            console.log('Done.');
          }
        });
      }
    });
  }
});
Even though the documents have no dependency on each other, each insert must be performed in series, because we have no way of knowing when all the other documents have finished saving. This approach is problematic because it leads to deeper nesting as the number of documents increases. Also, in the above example, the callback functions modify the docs variable - this is a side effect and breaks functional principles. With the help of Async, both issues can be addressed.

The parallel() function of Async allows you to run multiple functions in parallel. Each function signals completion by invoking a callback, passing either the result of the operation or an error. Once all functions have completed, an optional final callback is invoked to handle the results or errors.

The above code can be improved by implementing the document insertions as functions of parallel(). If an insert succeeds, the inserted document is passed to the callback; otherwise, the error is passed (these can later be used when a rollback is required). Once all the parallel functions complete, the final callback can perform a rollback if any of the functions failed to save.

Below is the improved version using Async:
var async    = require('async'),
    mongoose = require('mongoose');

var MyModel = mongoose.model('MyModel');

async.parallel([
    function (callback) {
      MyModel.create({ field: 'value1' }, callback);
    },
    function (callback) {
      MyModel.create({ field: 'value2' }, callback);
    },
    function (callback) {
      MyModel.create({ field: 'value3' }, callback);
    }
  ],
  function (err, results) {
    if (err) {
      async.each(results, rollback, function () {
        console.log('Rollback done.');
      });
    } else {
      console.log('Done.');
    }
  });

function rollback (doc, callback) {
  if (!doc) { callback(); }
  else {
    MyModel.findByIdAndRemove(doc._id, function (err, removedDoc) {
      // note: err is ignored here - see the edit below on retrying failed rollbacks
      console.log('Rolled-back document: ' + removedDoc);
      callback();
    });
  }
}
Each of the three parallel functions inserts a new document to MongoDB, then passes either the saved document or an error to its callback.

The final callback checks whether any of the parallel functions returned an error and, if so, calls the rollback() function on each document passed to the callback of parallel().

The rollback() function performs the rollback by deleting the inserted documents using their IDs.

Edit: As pointed out in the comments, the rollback itself can fail. In such scenarios, it is best to retry the rollback until it succeeds. This means the rollback logic must be an idempotent operation.
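For example, a retrying rollback might look like the sketch below. It assumes async.retry(), available in newer versions of Async; unlike the rollback() above, it passes the error from findByIdAndRemove() to the retry callback so that failed deletes are attempted again (deleting by ID is naturally idempotent):
function retryRollback (doc, callback) {
  if (!doc) { return callback(); }
  // retry the delete up to 5 times before giving up
  async.retry(5, function (retryCallback) {
    MyModel.findByIdAndRemove(doc._id, retryCallback);
  }, callback);
}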

Adding version to JavaScript and CSS Resources via JSP/JSTL

Browsers usually cache static resources like JavaScript and CSS so that downloads are reduced the next time the same website is visited.

However, if the JavaScript or CSS gets updated in the server, the browser will still be using an outdated copy of the resource. This could lead to unexpected behavior.

This can be addressed by including a version in the resource name and incrementing it every time an update is released (ex: /my-styles-v2.css). An easier alternative is to use a query parameter to indicate the resource version (ex: /my-styles.css?version=2). With either solution, HTML pages that link to the resources need to be updated whenever a resource version changes -- this can be difficult to maintain, especially if there are many resources that change constantly.

There are tools that automate this and are usually incorporated into the build process. The solution I'm going to demonstrate addresses the issue without adding an extra build step.

Note: This example is specific to Java web applications using JSTL.

The first step is to add an application-level configuration that holds the current version of the static resources. This comes as a context parameter in web.xml:
<web-app>
  ...

  <!-- Indicates the CSS and JS versions -->
  <context-param>
    <param-name>resourceVersion</param-name>
    <param-value>1</param-value>
  </context-param>

  ...

</web-app>
The resourceVersion context parameter holds the current version of the static resources and should be incremented before every release.

The next step is to update the JSPs that link to the resources and format the resource URLs to include the version number.
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core"%>
<c:url value="/resources/css/my-styles.css" var="myCss">
  <c:param name="version" value="${initParam.resourceVersion}" />
</c:url>
<c:url value="/resources/js/my-scripts.js" var="myJavaScript">
  <c:param name="version" value="${initParam.resourceVersion}" />
</c:url>
...
<!DOCTYPE html>
<html lang="en">
<head>
...
<link rel="stylesheet" href="${myCss}" type="text/css" />
...
</head>
<body>
...
<script src="${myJavaScript}"></script>
</body>
</html>
The <c:url> tags declare URL variables that include "version" as a query parameter. Note that context parameters are accessible from JSTL via ${initParam}.

The <link> and <script> tags then use these variables to render the URLs.
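Assuming resourceVersion is set to 1 and the application is deployed at the root context path, the rendered HTML would look something like this:
<link rel="stylesheet" href="/resources/css/my-styles.css?version=1" type="text/css" />
...
<script src="/resources/js/my-scripts.js?version=1"></script>
Incrementing resourceVersion changes the URLs, so browsers fetch fresh copies instead of using their cached ones.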

Generating Unique and Readable IDs in Node.js Using MongoDB

I had a requirement from an upcoming project to generate unique human-readable IDs. This project is written in Node.js and uses MongoDB for its database.

Ideally, I could use an auto-incrementing sequence to achieve this. However, unlike most relational databases, MongoDB does not support sequences. Fortunately, it is not difficult to implement this behavior in MongoDB.

We will use a collection to store our sequences. The sequences will then be incremented using the findAndModify() function. To keep the sequences from growing to unmanageable values, the counter must be restarted after a certain period. In my case, I will restart the counter every day. To achieve this, I will identify each sequence using a YYMMDD prefix.

Below is the raw MongoDB statement:
db.ids.findAndModify({
  query:  { prefix: 140625 },
  update: { $inc: { count: 1 } },
  upsert: true,
  new:    true
});
It is important to set the upsert and new options: setting upsert to true will insert a new document if the query cannot find a match, while setting new to true will return the updated version of the document.

Testing on the console yields the expected result.
> db.ids.findAndModify({ query: { prefix: 140625 }, update: { $inc: { count: 1 } }, upsert: true, new: true });
{
    "_id" : ObjectId("53aae1d126d57c198d861cfd"),
    "count" : 1,
    "prefix" : 140625
}
> db.ids.findAndModify({ query: { prefix: 140625 }, update: { $inc: { count: 1 } }, upsert: true, new: true });
{
    "_id" : ObjectId("53aae1d126d57c198d861cfd"),
    "count" : 2,
    "prefix" : 140625
}
> db.ids.findAndModify({ query: { prefix: 140625 }, update: { $inc: { count: 1 } }, upsert: true, new: true });
{
    "_id" : ObjectId("53aae1d126d57c198d861cfd"),
    "count" : 3,
    "prefix" : 140625
}
> db.ids.findAndModify({ query: { prefix: 140625 }, update: { $inc: { count: 1 } }, upsert: true, new: true });
{
    "_id" : ObjectId("53aae1d126d57c198d861cfd"),
    "count" : 4,
    "prefix" : 140625
}

Node.js + Mongoose

I use Mongoose in Node.js as a MongoDB object document mapper (ODM). Mongoose offers an intuitive API to access MongoDB from within Node.js.

To translate the above implementation, we first need to declare a Mongoose schema.
var mongoose = require('mongoose'),
    Schema   = mongoose.Schema;

var IdSchema = new Schema({
  prefix: { type: Number, required: true, index: { unique: true } },
  count:  { type: Number, required: true }
});

mongoose.model('Id', IdSchema);
The schema defines the structure of the document as well as its validation rules.

Once the schema is declared, the translation becomes pretty straightforward. Note that Mongoose does not have a function called findAndModify(); instead, it offers two forms: findByIdAndUpdate() and findOneAndUpdate(). In our case, we will use findOneAndUpdate().
var moment = require('moment'),
    Id     = mongoose.model('Id');

var nextId = function (callback) {
  function prefix (date) {
    return parseInt(moment(date).format('YYMMDD'), 10);
  }

  Id.findOneAndUpdate(
    { prefix: prefix(new Date()) },
    { $inc:   { count: 1 } },
    { upsert: true, new: true },
    function (err, idDoc) {
      callback(err, idDoc);
    });
};
The prefix() function generates the prefix with the help of Moment.js.
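The prefix and count can then be combined into the final readable ID. The format below (a zero-padded count appended to the date prefix) is just one possibility:
nextId(function (err, idDoc) {
  if (err) { return console.log(err); }
  // zero-pad the count, e.g. '140625-0001' for the first ID of June 25, 2014
  var count = String(idDoc.count);
  while (count.length < 4) { count = '0' + count; }
  console.log(idDoc.prefix + '-' + count);
});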

Note that while findAndModify() and its Mongoose equivalents are atomic operations, there is still a chance that multiple clients try to upsert the same document at the same time, causing one of them to fail due to a unique constraint violation - in such scenarios, the call to nextId() must be retried.
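A simple retry wrapper might look like the sketch below; checking for error code 11000 (MongoDB's duplicate key error, as surfaced by the driver) is an assumption about how the failure is reported:
function nextIdWithRetry (callback) {
  nextId(function (err, idDoc) {
    // retry only on duplicate key errors caused by concurrent upserts
    if (err && err.code === 11000) {
      return nextIdWithRetry(callback);
    }
    callback(err, idDoc);
  });
}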

Housekeeping

Because a new document is inserted every time the counter resets, the documents will accumulate over time. Fortunately, because the prefixes are stored as numbers, removing old documents is very easy.

For example, if we want to remove documents older than 2015, we just issue the below statement.
db.ids.remove({
  prefix: { $lte: 150000 }
});
The $lte operator stands for "less than or equal to". The above statement roughly translates to: delete from ids where prefix <= 150000.
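For completeness, the same clean-up can be done from Mongoose, which offers remove() on the model:
Id.remove({ prefix: { $lte: 150000 } }, function (err) {
  if (err) { console.log(err); }
});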

Sublime Text Packages for Node.js/JavaScript

Since I started working on Node.js for an upcoming project, I've become a fan of Sublime Text. It is fast; extensible; has great community support; and most importantly, runs on Linux.

So far, I use Sublime Text exclusively for Node.js and web development (HTML5, CSS, JavaScript). Below, I've compiled a list of packages I found most useful.

As a prerequisite, Package Control needs to be installed. Installing Package Control is a matter of copy-pasting a code snippet into your Sublime Text console. Installation instructions can be found here.

Once Package Control is installed, you can start installing other packages by opening the Command Palette (ctrl+shift+p) and searching for "Install Package".

Without further ado, here are the packages (search for them by name):

Alignment
* Aligns various texts
* Use via ctrl+alt+a
Figure 1a: Alignment (before)
Figure 1b: Alignment (after)
BracketHighlighter
* Highlights brackets, braces, and parentheses
Figure 2: BracketHighlighter
Emmet
* Easily write HTML
* Use via ctrl+alt+enter
* For more information on Emmet, check out this interactive guide.
Figure 3: Emmet
SidebarEnhancements
* Adds useful menu items to your sidebar
* Only available for Sublime Text 3
Figure 4: SidebarEnhancements
HTML-CSS-JS Prettify
* Formats HTML, CSS, and JavaScript files
* Use via ctrl+shift+h
* Requires Node.js to be installed
Figure 5a: Prettify (before)
Figure 5b: Prettify (after)
TrailingSpaces
* Highlights trailing spaces
Figure 6: TrailingSpaces
SublimeLinter
* Highlights lint errors for various file formats
* For Sublime Text 3, each linter needs to be installed separately:
  * SublimeLinter-jshint (JavaScript): requires jshint; install via "sudo npm install -g jshint"
  * SublimeLinter-html-tidy (HTML): requires tidy; install via "sudo apt-get install tidy"
Figure 7: SublimeLinter

Setting up Tomcat SSL with StartSSL Certificates

Part of an effort to improve the security of CheckTheCrowd.com is to enable SSL on my web server. Enabling SSL allows it to support HTTPS connections.

The CheckTheCrowd web application is hosted on Apache Tomcat, which provides pretty good, albeit generic, documentation on how to achieve this setup.

In summary, enabling SSL on Tomcat requires three things:
  • Creating a Java keystore which contains the private key that Tomcat would use to start SSL handshakes
  • Ensuring that you or your website owns the private key by having it signed by a trusted authority which in turn, issues a digital certificate verifying your ownership of the key
  • Configuring a Tomcat connector to listen on HTTPS from a specified port
Creating a keystore and configuring a Tomcat connector are simple enough. However, acquiring an SSL certificate from a trusted provider can be expensive.

Thankfully, I heard about StartSSL, which provides free SSL certificates with one-year validity (a new one can be generated upon expiry).

Below are the steps I took to set up Tomcat SSL using StartSSL certificates.

1. Creating the Java Keystore File (.jks)

As per the Tomcat documentation, the first thing I needed to do was generate a Java keystore to hold my private key. This was done using the keytool command that comes with the JDK.
keytool -genkey -keysize 2048 -keyalg RSA -sigalg SHA1withRSA \
  -alias [name of server] -keystore [name of keystore].jks \
  -keypass [password] -storepass [password] \
  -dname "CN=[domain name], OU=Unknown, O=[website], L=[city], ST=[state], C=[country]"
Note that due to a Tomcat limitation, the keypass and storepass must be the same. The dname entry is optional; if not provided, keytool will prompt for these details during the process.

Example:
keytool -genkey -keysize 2048 -keyalg RSA -sigalg SHA1withRSA \
  -alias webserver -keystore checkthecrowd.jks \
  -keypass ****** -storepass ****** \
  -dname "CN=checkthecrowd.com, OU=Unknown, O=CheckTheCrowd, L=Singapore, ST=Unknown, C=SG"
At this point, my keystore already contains the private key required by Tomcat to start an SSL connection.

I can already start using this keystore to enable SSL in Tomcat, but a rogue entity could hijack the connection and pretend that its private key was issued for CheckTheCrowd. This rogue entity could then trick my users into thinking they are securely connected to CheckTheCrowd when in fact they are connected to something else.

To solve this, I need to acquire a signed certificate to prove that my private key is associated to my domain (checkthecrowd.com).

2. Creating a Certificate Request File (.csr)

A certificate request is submitted to a certificate provider and an SSL certificate is generated based on this file.
keytool -certreq -alias [name of server] -file [name of request].csr \
  -keystore [name of keystore].jks
Note that this command will prompt for the password previously set on the keystore.

Example:
keytool -certreq -alias webserver -file checkthecrowd.csr \
  -keystore checkthecrowd.jks

3. Submitting the Certificate Request to StartSSL

I needed to sign up for an account in order to use StartSSL. Signing up involves generating a signed private key which proves my identity. From here onwards, the key is used by StartSSL to authenticate my access to their website.

Note that it is important to keep a backup copy of this private key for future use. This file needs to be imported on every computer used to access StartSSL.
Figure 1: StartSSL
Once I have an account, I can use the Control Panel to generate my certificate. The first step is to validate that I own the domain checkthecrowd.com. The aptly named Validation Wizard takes care of this.

Once my domain was validated, I used the Certificates Wizard to submit my certificate request (.csr file):
  • I selected Web Server SSL/TLS Certificate
  • Because I already had a private key and a certificate request, I skipped the next screen
  • I pasted the contents of my certificate request (.csr file) into the text area provided
  • When finished, the generated certificate was displayed in another text area -- I copied this and saved it to a file called ssl.crt
4. Importing the Generated Certificate and StartSSL Certificate Chain

The next step is to import the generated certificate into my keystore. The StartSSL certificate chain also needs to be imported.

The StartSSL certificate chain (the root CA certificate and the Class 1 intermediate certificate) can be downloaded from the StartSSL website. The free SSL certificate from StartSSL is only a Class 1 level certificate. With an upgraded package (Class 2 and higher), all applicable class certificates must be downloaded.

I again used keytool to import these certificates:
keytool -import -alias [ca alias] -file [ca file].cer \
  -keystore [keystore name].jks -trustcacerts
keytool -import -alias [class1 alias] -file [class1 file].pem \
  -keystore [keystore name].jks -trustcacerts
keytool -import -alias [name of server] -file ssl.crt \
  -keystore [keystore name].jks
The first two commands imported the certificate chain as trusted certificates; the last command imported the signed certificate.

Example:
keytool -import -alias startsslca -file ca.cer \
  -keystore checkthecrowd.jks -trustcacerts
keytool -import -alias startsslca1 -file sub.class1.server.ca.pem \
  -keystore checkthecrowd.jks -trustcacerts
keytool -import -alias webserver -file ssl.crt \
  -keystore checkthecrowd.jks
Listing the contents of my keystore verified that I have 3 certificates:
$ keytool -list -keystore checkthecrowd.jks
webserver, Aug 5, 2013, PrivateKeyEntry,
Certificate fingerprint (SHA1): [...]
startsslca, Aug 5, 2013, trustedCertEntry,
Certificate fingerprint (SHA1): [...]
startsslca1, Aug 5, 2013, trustedCertEntry,
Certificate fingerprint (SHA1): [...]

5. Configuring Tomcat with SSL

Enabling SSL on Tomcat involves creating a new connector which listens for HTTPS connections. This connector needs to know the location of the keystore file as well as the password to access it.

For convenience, I placed my keystore under $TOMCAT_HOME.
<!-- 
Define a SSL HTTP/1.1 Connector on port 8443
This connector uses the JSSE configuration, when using APR, the
connector should be using the OpenSSL style configuration
described in the APR documentation 
-->
<Connector
  protocol="HTTP/1.1"
  port="8443" maxThreads="200"
  scheme="https" secure="true" SSLEnabled="true"
  keystoreFile="checkthecrowd.jks" keystorePass="******"
  clientAuth="false" sslProtocol="TLS"/>
Note that by default, the Tomcat HTTPS port is 8443.

That's all there is to it! After bouncing Tomcat, I am now able to access CheckTheCrowd via HTTPS from port 8443: https://checkthecrowd.com:8443/.

The next step is to configure Apache httpd to forward HTTPS requests to port 8443. I still haven't figured out how to do this yet, so if you have an idea, let me know!

Flowee: Sample Application

My previous post introduced Flowee as a framework for building Java services backed by one or more workflows. Through a sample application, this post will demonstrate how easy it is to build workflow-based services using Flowee.

The sample application is a service which authenticates two types of accounts: an admin and a user. The service will display a greeting then authenticate each type of account using different authentication methods.
Figure 1: Sample Workflow Service
Note that this example is by no means a realistic use case for production; it is used here for illustration purposes only.

Implementing the Service

(1) We start by defining the request:
public class LoginRequest {
  private String username;
  private String password;
  private String type;
  // Accessors omitted
}

(2) We also create an application-specific WorkflowContext (this is an optional step; the default workflow context is probably sufficient for many applications):
public class LoginContext extends WorkflowContext {
  private static final String KEY_IS_AUTHENTICATED = "isAuthenticated";

  public void setIsAuthenticated(Boolean isAuthenticated) {
    put(KEY_IS_AUTHENTICATED, isAuthenticated);
  }

  public Boolean getIsAuthenticated() {
    return (Boolean) get(KEY_IS_AUTHENTICATED);
  }
}
Here, we extend the default context with convenience methods to access the value mapped to the "isAuthenticated" key (return values are normally stored in the context).

(3) Then we define an application-specific Task interface to remove the generic types:
public interface LoginTask
  extends Task<LoginRequest, LoginContext> {
}

(4) Next, we define an application-specific abstract Task. We make it BeanNameAware so that each task assumes its bean name when declared in Spring:
public abstract class AbstractLoginTask
    extends AbstractTask<LoginRequest, LoginContext>
    implements LoginTask, BeanNameAware {
  @Override
  public void setBeanName(String name) {
    setName(name);
  }
}

(5) We can now create an application-specific Workflow:
public class LoginWorkflow
    extends AbstractWorkflow<LoginTask, LoginRequest, LoginContext> {
  public LoginWorkflow(String name) {
    super(name);
  }
}
The abstract implementation provides all the functionality we need; this step simply declares the generic parameters.

(6) Next, we define a WorkflowFactory. For this example, we will make our workflow factory configurable from properties files. To achieve this, we need to inherit from AbstractPropertiesWorkflowFactory:
public class LoginWorkflowFactory
    extends AbstractPropertiesWorkflowFactory<LoginWorkflow, LoginTask, LoginRequest, LoginContext> {
  @Override
  protected LoginWorkflow createWorkflow(String name) {
    return new LoginWorkflow(name);
  }
}
This requires us to override createWorkflow(), an Abstract Factory method.

(7) We then define the Filter that will be used by our configurable workflow factory. Recall from my previous post that configurable factories use filters to evaluate conditions that determine which workflows get created.

Flowee comes with an abstract implementation that evaluates conditions as JEXL expressions. Using JEXL allows us to define JavaScript-like conditions for our workflow configuration:
public class LoginFilter
    extends AbstractJexlFilter<LoginRequest, LoginContext> {
  @Override
  protected ReadonlyContext populateJexlContext(LoginRequest request,
      LoginContext context) {
    JexlContext jexlContext = new MapContext();
    jexlContext.set("request", request);
    return new ReadonlyContext(jexlContext);
  }
}
The populateJexlContext() method populates a JEXL context with the LoginRequest. This allows us to access fields and methods of the request using JEXL expressions (ex: request.type == 'admin').

(8) We now have everything we need to define the WorkflowService:
public class LoginService
    extends AbstractWorkflowService<LoginWorkflow, LoginTask, LoginRequest, LoginContext> {
  @Override
  protected LoginContext createContext() {
    return new LoginContext();
  }
}
Here, we override an Abstract Factory method for creating an instance of the LoginContext.

Implementing the Tasks

Now that we have the infrastructure for our workflow service, the next stage is to define the actual tasks that comprise the workflows.

(1) We create a simple task that greets the user being authenticated:
public class GreetUserTask extends AbstractLoginTask {
  @Override
  protected TaskStatus attemptExecute(LoginRequest request,
      LoginContext context) throws WorkflowException {
    System.out.println(String.format("Welcome '%s'!",
        request.getUsername()));
    return TaskStatus.CONTINUE;
  }
}

(2) We then define the task which authenticates admin accounts:
public class AuthenticateAdmin extends AbstractLoginTask {
  @Override
  protected TaskStatus attemptExecute(LoginRequest request, LoginContext context) throws WorkflowException {
    if ("admin".equals(request.getUsername()) && "p@ssw0rd".equals(request.getPassword())) {
      System.out.println(String.format("User '%s' has been authenticated as Administrator",
          request.getUsername()));
      context.setIsAuthenticated(Boolean.TRUE);
      return TaskStatus.CONTINUE;
    } else {
      System.err.println(String.format("Cannot authenticate user '%s'!",
          request.getUsername()));
      context.setIsAuthenticated(Boolean.FALSE);
      return TaskStatus.BREAK;
    }
  }
}
Normally, this task should perform authentication against a data source. For this example, we are only trying to simulate a scenario where admin and user accounts are authenticated differently.

(3) We then define the task which authenticates user accounts:
public class AuthenticateUser extends AbstractLoginTask {
  @Override
  protected TaskStatus attemptExecute(LoginRequest request, LoginContext context)
      throws WorkflowException {
    if ("user".equals(request.getUsername())
        && "p@ssw0rd".equals(request.getPassword())) {
      System.out.println(String.format("User '%s' has been authenticated",
          request.getUsername()));
      context.setIsAuthenticated(Boolean.TRUE);
      return TaskStatus.CONTINUE;
    } else {
      System.err.println(String.format("Cannot authenticate user '%s'!",
          request.getUsername()));
      context.setIsAuthenticated(Boolean.FALSE);
      return TaskStatus.BREAK;
    }
  }
}

Spring Integration

We now have all the components we need to build the application. It's time to wire them all in Spring.

(1) We start with the tasks. It is good practice to provide a separate configuration file for tasks; this keeps our configuration manageable in case the number of tasks grows.
<beans>
  <bean id="greet" class="com.jramoyo.flowee.sample.login.task.GreetUserTask" />
  <bean id="authenticate_user" class="com.jramoyo.flowee.sample.login.task.AuthenticateUser" />
  <bean id="authenticate_admin" class="com.jramoyo.flowee.sample.login.task.AuthenticateAdmin" />
</beans>
Note that we are not wiring these tasks to any of our components. The Flowee Spring module (flowee-spring) comes with an implementation of TaskRegistry that looks up task instances from the Spring context.

(2) We then define our main Spring configuration file.
<beans>
  <import resource="classpath:spring-tasks.xml" />
  <bean id="workflowFactory" class="com.jramoyo.flowee.sample.login.LoginWorkflowFactory">
    <property name="filter">
      <bean class="com.jramoyo.flowee.sample.login.LoginFilter" />
    </property>
    <property name="taskRegistry">
      <bean class="com.jramoyo.flowee.spring.ContextAwareTaskRegistry" />
    </property>
  </bean>
  <bean id="workflowService" class="com.jramoyo.flowee.sample.login.LoginService">
    <property name="factory" ref="workflowFactory" />
  </bean>
</beans>
Here, we declare our LoginWorkflowFactory as a dependency of LoginService. The factory is then wired with LoginFilter and ContextAwareTaskRegistry. As mentioned in the previous step, ContextAwareTaskRegistry allows our factory to look up task instances from the Spring context.

Workflow Configuration

The last stage is to configure the WorkflowFactory to assemble the workflows required for an account type.

Recall from the previous steps that we are using an instance of AbstractPropertiesWorkflowFactory. This loads its configuration from two properties files: one for the workflow conditions (workflow.properties) and another for the workflow tasks (task.properties).

(1) We create the workflow.properties file with the following content:
admin=request.type == 'admin'
user=request.type == 'user'
This configuration means that if the value of LoginRequest.getType() is 'admin', the workflow named 'admin' is executed; if the value is 'user', the workflow named 'user' is executed.

(2) Then, we create the task.properties file:
admin=greet,authenticate_admin
user=greet,authenticate_user
This configures the sequence of tasks that make up the workflow.

By default, both workflow.properties and task.properties are loaded from the classpath. This can be overridden via setWorkflowConfigFile() and setTaskConfigFile() respectively.
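Since the factory is declared in Spring, these locations can presumably be wired as bean properties; the file paths below are hypothetical:
<bean id="workflowFactory" class="com.jramoyo.flowee.sample.login.LoginWorkflowFactory">
  <!-- by default, both files are loaded from the classpath -->
  <property name="workflowConfigFile" value="/etc/myapp/workflow.properties" />
  <property name="taskConfigFile" value="/etc/myapp/task.properties" />
  ...
</bean>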

That's all there is to it! From Spring, we can load LoginService and inject it anywhere within the application.

Notice that while we created several components to build our service, most of them were simple subclasses and required very few lines of code. Also, we only need to build these components once per service.

You'll find that the value of Flowee becomes more evident as the number of workflows and tasks increases.

Testing

We create a simple JUnit test case:
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("classpath:/spring.xml")
public class LoginServiceTest {

  @Resource(name = "workflowService")
  private LoginService service;

  @Test
  public void testAdminLogin() throws WorkflowException {
    LoginRequest request = new LoginRequest();
    request.setUsername("admin");
    request.setPassword("p@ssw0rd");
    request.setType("admin");

    LoginContext context = service.process(request);
    assertTrue("Incorrect result",
      context.getIsAuthenticated());

    request.setPassword("wrong");
    context = service.process(request);
    assertFalse("Incorrect result",
      context.getIsAuthenticated());
  }

  @Test
  public void testUserLogin() throws WorkflowException {
    LoginRequest request = new LoginRequest();
    request.setUsername("user");
    request.setPassword("p@ssw0rd");
    request.setType("user");

    LoginContext context = service.process(request);
    assertTrue("Incorrect result",
      context.getIsAuthenticated());

    request.setPassword("wrong");
    context = service.process(request);
    assertFalse("Incorrect result",
      context.getIsAuthenticated());
  }
}
Notice how we injected LoginService into our unit test.

Our test yields the following output:
Welcome 'admin'!
User 'admin' has been authenticated as Administrator

Welcome 'admin'!
Cannot authenticate user 'admin'!

Welcome 'user'!
User 'user' has been authenticated

Welcome 'user'!
Cannot authenticate user 'user'!

The code used in this example is available as a Maven module from Flowee's Git repository.

So far, I've only demonstrated simple workflows with linear task sequences. My next post will introduce special tasks which allow for more complex task sequences.

Introducing Flowee, a Framework for Building Workflow-based Services in Java

Overview

My past roles required me to write three different applications having surprisingly similar requirements. These requirements were:
  1. The application must run as a service which receives some form of request
  2. Once received, several stages of processing need to be performed on the request (i.e. validation, persistence, state management, etc.)
  3. These stages of processing may change over time as new requirements come in
The third requirement is arguably the most important. While the first two can be met without special techniques or design, the last requires a bit of planning.

To achieve this required flexibility, adding new processing logic or modifying existing logic should not affect the code of other tasks. For example, if I add new validation logic, my existing code for state management should not be affected.

I needed to encapsulate each piece of logic as an individual unit - a "task". This way, changes to one task are independent of the others.

The behavior of my service is then defined by the sequence of tasks to be performed. This behavior can easily be changed by adding, removing, or modifying tasks in the sequence.

Essentially, I needed to build a workflow.

Design

My design had several iterations throughout the years. My initial design was pretty straightforward: I implemented a workflow as a container of one or more tasks arranged in a sequence. When executed, the workflow iterates through each task and executes it. The workflow has no knowledge of what each task does; all it knows is that it is executing some task. This is made possible because each task shares a common interface, which exposes an execute() method that accepts the request as a parameter.
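In essence, the initial design boiled down to something like the following sketch (the names are illustrative, not the actual Flowee API):
import java.util.List;

// Request is a placeholder for the application-specific request type
public interface Task {
  void execute(Request request) throws Exception;
}

public class Workflow {
  private final List<Task> tasks;

  public Workflow(List<Task> tasks) {
    this.tasks = tasks;
  }

  // Executes each task in sequence, without knowing what each task does
  public void execute(Request request) throws Exception {
    for (Task task : tasks) {
      task.execute(request);
    }
  }
}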

Through dependency injection, the behavior of the workflow becomes very easy to change. I can add, remove and rearrange tasks via configuration.

While proven effective, my initial design was application-specific -- the workflow could only accept a specific type of request. This made it ineffective as a framework because it only worked for that particular application. There was also the problem of tasks not being able to share information; as a result, temporary values had to be stored within the request itself.

I had the chance to improve on this design in a later project. By applying generics, I was able to redesign my workflow so that it could accept any type of request.

In order for information to be shared across tasks, a data structure serving as the workflow "context" is also passed as a parameter to each task.
Figure 1: Workflow and Tasks

More often than not, different workflows need to be executed for different types of request. For example, one workflow needs to be executed to open an account, another workflow to close an account, etc.

I came up with a "workflow factory" as a solution to this requirement. Depending on the type of request, the factory will assemble the workflows required to be executed against the request. The factory is exposed as an interface so that I can have different implementations depending on the requirement.
Figure 2: Workflow Factory

My service now becomes very simple: as soon as a request comes in, I call the factory to assemble the required workflows, then execute my request against each of them one by one.
Figure 3: Workflow Service

Flowee Framework

Flowee is an open source implementation of the above design. Having gone through several iterations, I feel that this design has matured enough to warrant a common framework that can be useful to others.

The project is hosted via GitHub at https://github.com/jramoyo/flowee.

Workflow and Tasks

The core framework revolves around the Workflow interface and its relationship with the Task interface.
Figure 4: Flowee Workflow

The execute() method of Workflow accepts both the request and an instance of WorkflowContext. Errors encountered while processing the request are thrown as WorkflowExceptions.

AbstractWorkflow provides a generic implementation of Workflow. It iterates through a list of associated Tasks to process a request. Application specific workflows will inherit from this class.

Most workflow Tasks will inherit from AbstractTask. It provides useful features for building application-specific tasks, namely:
  • Retry task execution when an exception is encountered
  • Silently skip task execution instead of throwing an exception
  • Skip task execution depending on the type of request

WorkflowFactory

The WorkflowFactory is another important piece of the framework. The way it abstracts the selection and creation of workflows simplifies the rest of the core components.
Figure 5: Flowee WorkflowFactory

AbstractConfigurableWorkflowFactory is used to build configuration-driven workflow factories. It defines an abstract fetchConfiguration() method and an abstract fetchTaskNames() method that sub-classes need to implement. These methods are used to fetch configuration from various sources, such as the file system or a remote server.

The configuration is represented as a Map whose key is the name of a workflow and whose value is the condition which activates that workflow.

AbstractConfigurableWorkflowFactory uses a Filter instance to evaluate the conditions configured to activate the workflows.

AbstractPropertiesWorkflowFactory is a sub-class of AbstractConfigurableWorkflowFactory that fetches configuration from a properties file.

WorkflowService

WorkflowService and AbstractWorkflowService act as a facade linking the core components together.
Figure 6: Flowee WorkflowService
With all the complexities taken care of by both the Workflow and the WorkflowFactory, our WorkflowService implementation becomes very simple.

While most applications will interact with workflows through the WorkflowService, those requiring a different behavior can interact with the underlying components directly.

Conclusion

The primary purpose of Flowee is to provide groundwork for rule-driven workflow selection and execution. Developers can focus the majority of their efforts on building the tasks which hold the actual business requirements.

Workflows built on Flowee will run without the need for "containers" or "engines". The framework is lightweight and integrates seamlessly with any application.

This post discussed the design considerations which led to the implementation of Flowee. It also described the structure of the core framework. My next post will demonstrate how easy it is to build workflow-based services with Flowee by going through a sample application.