Declarative and Immutable Pipeline of Transformations

  • Moscow, Russia

A few months ago I made a small Java library, which is worth explaining since the design of its classes and interfaces is pretty unusual. It’s very much object-oriented for a pretty imperative task: building a pipeline of document transformations. The goal was to do this in a declarative and immutable way, and in Java. Well, as much as it’s possible.

Barfuss (2005) by Til Schweiger

Let’s say you have a document, and you have a collection of transformations, each of which will do something with the document. Each transformation, for example, is a small piece of Java code. You want to build a list of transformations and then pass a document through this list.

First, I made an interface Shift (instead of the frequently used and boring “transformation”):

interface Shift {
  Document apply(Document doc);
}

Then I made an interface Train (this is the name I made up for the collection of transformations) and its default implementation:

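import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedList;

interface Train {
  Train with(Shift shift);
  Iterator<Shift> iterator();
}
class TrDefault implements Train {
  private final Iterable<Shift> list;
  // Starts with an empty train.
  TrDefault() {
    this(Collections.emptyList());
  }
  TrDefault(final Iterable<Shift> shifts) {
    this.list = shifts;
  }
  @Override
  public Train with(Shift shift) {
    // Copy the existing shifts, so this object stays unchanged.
    final Collection<Shift> items = new LinkedList<>();
    for (final Shift item : this.list) {
      items.add(item);
    }
    items.add(shift);
    return new TrDefault(items);
  }
  @Override
  public Iterator<Shift> iterator() {
    return this.list.iterator();
  }
}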

Ah, I forgot to tell you. I’m a big fan of immutable objects. That’s why the Train doesn’t have a method add, but instead has with. The difference is that add modifies the object, while with makes a new one.

Now, I can build a train of shifts with TrDefault, a simple default implementation of Train, assuming ShiftA and ShiftB are already implemented:

Train train = new TrDefault()
  .with(new ShiftA())
  .with(new ShiftB());
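
The difference between add and with can be checked with a minimal self-contained sketch. It assumes a few things the snippets above don't show: the two constructors of TrDefault, shifts written as lambdas instead of ShiftA and ShiftB, an empty stand-in Document type, and a hypothetical helper class WithDemo that counts the shifts in a train.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Stand-in for a real XML document type; never instantiated in this sketch.
interface Document {}

interface Shift {
  Document apply(Document doc);
}

interface Train {
  Train with(Shift shift);
  Iterator<Shift> iterator();
}

final class TrDefault implements Train {
  private final List<Shift> list;

  // Assumed constructors: an empty train and a train over given shifts.
  TrDefault() {
    this(Collections.emptyList());
  }

  TrDefault(final List<Shift> shifts) {
    this.list = shifts;
  }

  @Override
  public Train with(final Shift shift) {
    // Copy the existing shifts, so this object stays untouched.
    final List<Shift> items = new ArrayList<>(this.list);
    items.add(shift);
    return new TrDefault(items);
  }

  @Override
  public Iterator<Shift> iterator() {
    return this.list.iterator();
  }
}

// Hypothetical helper, only here to observe the trains.
final class WithDemo {
  // Counts the shifts in a train by walking its iterator.
  static int size(final Train train) {
    int total = 0;
    final Iterator<Shift> shifts = train.iterator();
    while (shifts.hasNext()) {
      shifts.next();
      ++total;
    }
    return total;
  }

  public static void main(final String[] args) {
    final Train empty = new TrDefault();
    final Train one = empty.with(doc -> doc);
    final Train two = one.with(doc -> doc);
    // Earlier trains keep their size after with() is called on them.
    System.out.println(size(empty) + " " + size(one) + " " + size(two)); // prints "0 1 2"
  }
}
```

Every call to with() returns a longer train, while the trains created earlier keep the size they had.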

Then I created an Xsline class (it’s “XSL” + “pipeline”, since in my case I’m managing XML documents and transforming them using XSL stylesheets). An instance of this class encapsulates an instance of Train and then passes a document through all its transformations:

Document input = ...;
Document output = new Xsline(train).pass(input);
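
The body of Xsline isn't shown here, so the following is only a guess at how pass() could work: walk the train and feed the output of each shift into the next one. The Document class is a hypothetical stand-in that carries plain text instead of XML, the shifts are lambdas, and XslineDemo is an illustrative name.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an XML document: it only carries text.
final class Document {
  final String text;
  Document(final String txt) { this.text = txt; }
}

interface Shift { Document apply(Document doc); }

interface Train {
  Train with(Shift shift);
  Iterator<Shift> iterator();
}

final class TrDefault implements Train {
  private final List<Shift> list;
  TrDefault() { this(new ArrayList<>()); }
  TrDefault(final List<Shift> shifts) { this.list = shifts; }
  @Override
  public Train with(final Shift shift) {
    final List<Shift> items = new ArrayList<>(this.list);
    items.add(shift);
    return new TrDefault(items);
  }
  @Override
  public Iterator<Shift> iterator() { return this.list.iterator(); }
}

// Assumed implementation: the article only shows how Xsline is called.
final class Xsline {
  private final Train train;
  Xsline(final Train trn) { this.train = trn; }

  Document pass(final Document input) {
    Document doc = input;
    final Iterator<Shift> shifts = this.train.iterator();
    while (shifts.hasNext()) {
      // The output of one shift becomes the input of the next one.
      doc = shifts.next().apply(doc);
    }
    return doc;
  }
}

final class XslineDemo {
  public static void main(final String[] args) {
    final Train train = new TrDefault()
      .with(doc -> new Document(doc.text + "A"))
      .with(doc -> new Document(doc.text + "B"));
    final Document output = new Xsline(train).pass(new Document("x"));
    System.out.println(output.text); // prints "xAB"
  }
}
```

With an empty train, pass() in this sketch returns the input document as is.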

So far so good.

Now, I want all my transformations to log themselves. I created StLogged, a decorator of Shift, which encapsulates the original Shift, decorates its method apply, and prints a message to the console when the transformation is completed:

class StLogged implements Shift {
  private final Shift origin;
  StLogged(final Shift shift) {
    this.origin = shift;
  }
  @Override
  public Document apply(Document before) {
    final Document after = this.origin.apply(before);
    System.out.println("Transformation completed!");
    return after;
  }
}

Now, I have to do this:

Train train = new TrDefault()
  .with(new StLogged(new ShiftA()))
  .with(new StLogged(new ShiftB()));

Now new StLogged( is repeated on every line, which gets painful with a collection of a few dozen shifts. To get rid of this duplication I created TrLogged, a decorator for Train, which on the fly wraps every shift it receives in StLogged:

Train train = new TrLogged(new TrDefault())
  .with(new ShiftA())
  .with(new ShiftB());
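
One way such a train decorator could look is sketched below. This is an assumption, not the library's code: with() wraps the incoming shift in StLogged, hands it to the encapsulated train, and wraps the result again so later calls keep decorating. Document is a hypothetical text-only stand-in and TrLoggedDemo is an illustrative name.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an XML document: it only carries text.
final class Document {
  final String text;
  Document(final String txt) { this.text = txt; }
}

interface Shift { Document apply(Document doc); }

interface Train {
  Train with(Shift shift);
  Iterator<Shift> iterator();
}

final class TrDefault implements Train {
  private final List<Shift> list;
  TrDefault() { this(new ArrayList<>()); }
  TrDefault(final List<Shift> shifts) { this.list = shifts; }
  @Override
  public Train with(final Shift shift) {
    final List<Shift> items = new ArrayList<>(this.list);
    items.add(shift);
    return new TrDefault(items);
  }
  @Override
  public Iterator<Shift> iterator() { return this.list.iterator(); }
}

final class StLogged implements Shift {
  private final Shift origin;
  StLogged(final Shift shift) { this.origin = shift; }
  @Override
  public Document apply(final Document before) {
    final Document after = this.origin.apply(before);
    System.out.println("Transformation completed!");
    return after;
  }
}

// Assumed implementation: decorates every shift on its way into the train.
final class TrLogged implements Train {
  private final Train origin;
  TrLogged(final Train train) { this.origin = train; }
  @Override
  public Train with(final Shift shift) {
    // Re-wrap the result, so the next with() call is decorated too.
    return new TrLogged(this.origin.with(new StLogged(shift)));
  }
  @Override
  public Iterator<Shift> iterator() { return this.origin.iterator(); }
}

final class TrLoggedDemo {
  public static void main(final String[] args) {
    final Train train = new TrLogged(new TrDefault())
      .with(doc -> new Document(doc.text + "A"))
      .with(doc -> new Document(doc.text + "B"));
    Document doc = new Document("x");
    final Iterator<Shift> shifts = train.iterator();
    while (shifts.hasNext()) {
      // Each apply() prints the log line from StLogged.
      doc = shifts.next().apply(doc);
    }
    System.out.println(doc.text);
  }
}
```

The client code never mentions StLogged, yet every shift coming out of the iterator is already wrapped in it.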

In my case, all shifts do XSL transformations, taking XSL stylesheets from files available on the classpath. That’s why the code looks like this:

Train train = new TrLogged(new TrDefault())
  .with(new StXSL("stylesheet-a.xsl"))
  .with(new StXSL("stylesheet-b.xsl"));

There is an obvious duplication of new StXSL(...), but I can’t simply get rid of it, since the method with expects an instance of Shift, not a String. To solve this, I made the Train generic and created the TrClasspath decorator:

Train<String> train = new TrClasspath<>(new TrDefault<>())
  .with("stylesheet-a.xsl")
  .with("stylesheet-b.xsl");

TrClasspath.with() accepts a String, turns it into an StXSL, and passes it to TrDefault.with().

Pay attention to the snippet above: the train is now of type Train<String>, not Train<Shift>, as would be required by Xsline. The question now is: how do we get back to Train<Shift>?

Ah, I forgot to mention. I wanted to design this library with one important principle in mind, suggested in 2014: all objects may only implement methods from their interfaces. That’s why I couldn’t just add a method getEncapsulatedTrain() to TrClasspath.

I introduced a new interface Train.Temporary<T> with a single method back() returning Train<T>. The class TrClasspath implements it and I can do this:

Train<Shift> train = new TrClasspath<>(new TrDefault<>())
  .with("stylesheet-a.xsl")
  .with("stylesheet-b.xsl")
  .back();
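
How the generic Train, Train.Temporary, and TrClasspath could fit together is sketched below, under several assumptions. TrClasspath is simplified to a non-generic class, so it is created without the diamond. Its with() uses a covariant return type to keep back() reachable in a chain, and its iterator() simply refuses to work. StXSL is a stub that appends the stylesheet name instead of running XSLT, and Document is a hypothetical text-only stand-in.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an XML document: it only carries text.
final class Document {
  final String text;
  Document(final String txt) { this.text = txt; }
}

interface Shift { Document apply(Document doc); }

interface Train<T> {
  Train<T> with(T element);
  Iterator<T> iterator();

  // A train that can hand back the train it encapsulates.
  interface Temporary<T> {
    Train<T> back();
  }
}

final class TrDefault<T> implements Train<T> {
  private final List<T> list;
  TrDefault() { this(new ArrayList<>()); }
  TrDefault(final List<T> items) { this.list = items; }
  @Override
  public Train<T> with(final T element) {
    final List<T> items = new ArrayList<>(this.list);
    items.add(element);
    return new TrDefault<>(items);
  }
  @Override
  public Iterator<T> iterator() { return this.list.iterator(); }
}

// Stub shift: appends the stylesheet name instead of running XSLT.
final class StXSL implements Shift {
  private final String name;
  StXSL(final String path) { this.name = path; }
  @Override
  public Document apply(final Document doc) {
    return new Document(doc.text + "|" + this.name);
  }
}

// Simplified, non-generic sketch of the decorator.
final class TrClasspath implements Train<String>, Train.Temporary<Shift> {
  private final Train<Shift> origin;
  TrClasspath(final Train<Shift> train) { this.origin = train; }
  @Override
  public TrClasspath with(final String path) {
    // Covariant return type: chained calls can still reach back().
    return new TrClasspath(this.origin.with(new StXSL(path)));
  }
  @Override
  public Iterator<String> iterator() {
    throw new UnsupportedOperationException("Call back() first");
  }
  @Override
  public Train<Shift> back() { return this.origin; }
}

final class TrClasspathDemo {
  public static void main(final String[] args) {
    final Train<Shift> train = new TrClasspath(new TrDefault<>())
      .with("stylesheet-a.xsl")
      .with("stylesheet-b.xsl")
      .back();
    Document doc = new Document("x");
    final Iterator<Shift> shifts = train.iterator();
    while (shifts.hasNext()) {
      doc = shifts.next().apply(doc);
    }
    System.out.println(doc.text); // prints "x|stylesheet-a.xsl|stylesheet-b.xsl"
  }
}
```

In this sketch back() is declared in an interface, so TrClasspath still implements only methods that come from its interfaces.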

Next I decided to get rid of the duplication of .with() calls. Obviously, it would be easier to provide the file names as a single list of String and build the train from it. I created a new class TrBulk, which does exactly that:

Iterable<String> names = Arrays.asList(
  "stylesheet-a.xsl",
  "stylesheet-b.xsl"
);
Train<Shift> train = new TrBulk<>(
  new TrClasspath<>(
    new TrDefault<>()
  )
).with(names).back();
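
A possible shape for TrBulk is sketched below. To stay type-safe without casts, this version is deliberately less general than the snippet above suggests: it is non-generic and accepts only a TrClasspath, so it is created without the diamond. StXSL is again a stub, Document is a hypothetical text-only stand-in, and TrBulkDemo is an illustrative name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an XML document: it only carries text.
final class Document {
  final String text;
  Document(final String txt) { this.text = txt; }
}

interface Shift { Document apply(Document doc); }

interface Train<T> {
  Train<T> with(T element);
  Iterator<T> iterator();
  interface Temporary<T> { Train<T> back(); }
}

final class TrDefault<T> implements Train<T> {
  private final List<T> list;
  TrDefault() { this(new ArrayList<>()); }
  TrDefault(final List<T> items) { this.list = items; }
  @Override
  public Train<T> with(final T element) {
    final List<T> items = new ArrayList<>(this.list);
    items.add(element);
    return new TrDefault<>(items);
  }
  @Override
  public Iterator<T> iterator() { return this.list.iterator(); }
}

// Stub shift: appends the stylesheet name instead of running XSLT.
final class StXSL implements Shift {
  private final String name;
  StXSL(final String path) { this.name = path; }
  @Override
  public Document apply(final Document doc) {
    return new Document(doc.text + "|" + this.name);
  }
}

// Simplified, non-generic sketch of the classpath decorator.
final class TrClasspath implements Train<String>, Train.Temporary<Shift> {
  private final Train<Shift> origin;
  TrClasspath(final Train<Shift> train) { this.origin = train; }
  @Override
  public TrClasspath with(final String path) {
    return new TrClasspath(this.origin.with(new StXSL(path)));
  }
  @Override
  public Iterator<String> iterator() {
    throw new UnsupportedOperationException("Call back() first");
  }
  @Override
  public Train<Shift> back() { return this.origin; }
}

// Assumed implementation: takes many names in a single with() call.
final class TrBulk implements Train<Iterable<String>>, Train.Temporary<Shift> {
  private final TrClasspath origin;
  TrBulk(final TrClasspath train) { this.origin = train; }
  @Override
  public TrBulk with(final Iterable<String> names) {
    TrClasspath train = this.origin;
    for (final String name : names) {
      // Each name produces a new TrClasspath; nothing is modified in place.
      train = train.with(name);
    }
    return new TrBulk(train);
  }
  @Override
  public Iterator<Iterable<String>> iterator() {
    throw new UnsupportedOperationException("Call back() first");
  }
  @Override
  public Train<Shift> back() { return this.origin.back(); }
}

final class TrBulkDemo {
  public static void main(final String[] args) {
    final Iterable<String> names = Arrays.asList(
      "stylesheet-a.xsl",
      "stylesheet-b.xsl"
    );
    final Train<Shift> train = new TrBulk(new TrClasspath(new TrDefault<>()))
      .with(names)
      .back();
    Document doc = new Document("x");
    final Iterator<Shift> shifts = train.iterator();
    while (shifts.hasNext()) {
      doc = shifts.next().apply(doc);
    }
    System.out.println(doc.text); // prints "x|stylesheet-a.xsl|stylesheet-b.xsl"
  }
}
```

Here back() on TrBulk delegates to back() on the TrClasspath it wraps, which is how a single call unwinds both decorators.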

With this design I can construct the train in almost any possible way.

See, for example, how we use it here and here.
