Ricardo Rocha's Website

Musings on Programming and Programming Languages

YamlTag: Fluent Configuration with SnakeYAML

YamlTag is a SnakeYAML add-on used to:

  • Assign yaml tags to class constructors
  • Specify generic types for list and map properties
  • Enter class literals

Code is available at https://github.com/xrrocha/yamltag

What? Gimme the TL;DR

Ok. Imagine we need to use SnakeYAML to express the following data:

Name:      Lomic Trianón
Birthdate: 14/11/1987

Phones
  • Cell: (754)321-9876
  • Home: (561)123-4567

Languages
  • English: Advanced. Native speaker
  • Italian: Intermediate. 4 years of study

Because our bean properties have generic types, we need to annotate their values with class names so that SnakeYAML can build the object graph properly:

--- !!net.xrrocha.example.Person
name: Lomic Trianón
birthdate: 14/11/1987
photoUrl: /img/lomic-trianon.jpg
phoneNumbers:
  - !!net.xrrocha.example.Phone
    number: (754)321-9876
    usage: CELL
  - !!net.xrrocha.example.Phone
    number: (561)123-4567
    usage: HOME
languages:
  English: !!net.xrrocha.example.LanguageSkill
    level: ADVANCED
    comments: Native speaker
  Italian: !!net.xrrocha.example.LanguageSkill
    level: INTERMEDIATE
    comments: 4 years of study

What YamlTag gives you, in a nutshell, is the ability to declaratively rephrase the above as:

--- !person # Look ma: a tag instead of a class name
name: Lomic Trianón
birthdate: 14/11/1987
photoUrl: /img/lomic-trianon.jpg
phoneNumbers: [ # Look ma: no list class name
    { number: (754)321-9876,  usage: CELL },
    { number: (561)123-4567,  usage: HOME }
]
languages: # Look ma: no map key/value class names
  English: { level: ADVANCED, comments: Native speaker }
  Italian: { level: INTERMEDIATE, comments: 4 years of study }

Why YamlTag?

SnakeYAML is often used to express Java framework configurations as well as data object graphs. In these contexts, developers need to edit complex Yaml-encoded object literals and annotate them with Java type information.

For this, SnakeYAML provides the !! type tag which introduces a class name to use for object instantiation. Using this tag, however, quickly becomes unwieldy for properties of interface, collection or map types.

YamlTag alleviates this verbosity by providing:

  • User-defined type annotations via tag names that map to Java class names
  • Type declarations for collection and map properties
  • Java class literals

How does it Work?

YamlTag works in a purely declarative fashion. All that is required is placing a file named yamltag.yaml (exemplified below) at the top level of your classpath.

Multiple yamltag.yaml resources can exist in the classpath and all are honored as long as no duplicate definitions occur.

Once yamltag.yaml resource(s) are on the classpath, use the YamlTag-supplied factory:

YamlFactory factory = new DefaultYamlFactory();
Yaml yaml = factory.newYaml();
// ... business as usual ...
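
The rule that several yamltag.yaml resources are honored as long as no definition is duplicated can be sketched in plain Java. This is only an illustration, not YamlTag's actual code: the class TagDefinitionMerger and its merge method are hypothetical names, and each resource is assumed to have already been reduced to a map from tag name to class name. On a real classpath the resources themselves could be enumerated with the JDK's ClassLoader.getResources("yamltag.yaml"); whether YamlTag does it that way is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of YamlTag: merges tag definitions and rejects duplicates. */
final class TagDefinitionMerger {

    /**
     * Merges several tag-name to class-name maps (one per yamltag.yaml resource)
     * into a single map, failing if the same tag name is defined more than once.
     */
    static Map<String, String> merge(List<Map<String, String>> resources) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> resource : resources) {
            for (Map.Entry<String, String> entry : resource.entrySet()) {
                // putIfAbsent returns the earlier value when the tag is already taken
                String previous = merged.putIfAbsent(entry.getKey(), entry.getValue());
                if (previous != null) {
                    throw new IllegalStateException(
                        "Duplicate tag '" + entry.getKey() + "': "
                            + previous + " vs " + entry.getValue());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> first = Map.of("person", "net.xrrocha.example.Person");
        Map<String, String> second = Map.of("phone", "net.xrrocha.example.Phone");
        System.out.println(merge(List.of(first, second)));
    }
}
```

Failing fast on a duplicate keeps a tag from silently resolving to different classes depending on classpath order.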

Example yamltag.yaml

The example above is based on the following Java classes:

// Accessors elided for brevity
public enum PhoneUsage { CELL, HOME, WORK }
public class Phone {
    private String number;
    private PhoneUsage usage;
}
public enum SkillLevel { BASIC, INTERMEDIATE, ADVANCED }
public class LanguageSkill {
    private String language;
    private SkillLevel level;
    private String comments;
}
public class Person {
    private String name;
    private Date birthdate;
    private String photoUrl;
    private List<Phone> phoneNumbers;
    private Map<String, LanguageSkill> languages;
}

The following yamltag.yaml resource file contains the annotations required to educate SnakeYAML on how to instantiate the class and its properties:

net.xrrocha.example.Person: # The class we want to simplify
  tagName: person # The tag name to associate with the class
  listProperties: # Properties of type java.util.List
    phoneNumbers: net.xrrocha.example.Phone # phoneNumbers has type Phone
  mapProperties: # Properties of type java.util.Map
    languages:
      keyClass: java.lang.String # The language name
      valueClass: net.xrrocha.example.LanguageSkill # The language skill

Given this definition we can populate unpolluted object graphs like:

--- !person
name: Alexio Flako
birthdate: 24/11/1999
photoUrl: /img/alexio-flako.jpg
phoneNumbers: [ { number: (561)9876-543,  usage: CELL } ]
languages:
  English: { level: ADVANCED, comments: Native speaker }
  Japanese: { level: INTERMEDIATE, comments: 中間体 }

Class Literals

Class literals can be used for bean properties of type Class<?>. Thus, for a property like:

public class Example {
    private Class<? extends Format> formatClass;
    // getter/setter omitted
}

the corresponding Yaml property value would be:

formatClass: !class net.xrrocha.example.BigDecimalFormat
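
To make the idea concrete, here is a minimal sketch of how a class-literal string could be turned into a typed Class value in plain Java. It is not YamlTag's implementation: ClassLiterals and resolve are hypothetical names, and java.text.DecimalFormat stands in for the article's BigDecimalFormat so that the snippet runs with the JDK alone. Only Class.forName and Class.asSubclass, both standard JDK methods, do the real work.

```java
import java.text.Format;

/** Hypothetical helper, not part of YamlTag: resolves a class-literal string to a typed Class. */
final class ClassLiterals {

    /**
     * Loads the named class and checks that it is a subtype of the expected bound.
     * Throws IllegalArgumentException if the class is missing or of the wrong type.
     */
    static <T> Class<? extends T> resolve(String className, Class<T> bound) {
        try {
            return Class.forName(className).asSubclass(bound);
        } catch (ClassNotFoundException | ClassCastException e) {
            throw new IllegalArgumentException("Invalid class literal: " + className, e);
        }
    }

    public static void main(String[] args) {
        // java.text.DecimalFormat stands in for the article's BigDecimalFormat
        Class<? extends Format> formatClass = resolve("java.text.DecimalFormat", Format.class);
        System.out.println(formatClass.getName());
    }
}
```

Checking the bound at load time means a mistyped class name fails when the configuration is read rather than when the property is first used.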

Ok, I get it, but when would I need this?

Like most utilities, YamlTag was born from an itch to scratch.

Throughout the years I’ve developed a number of Java frameworks whose instantiation required wiring complex object graphs and configurations.

This is best illustrated with the following example drawn from an old version of a CLI utility to convert between tabular file formats:

source: !delimitedSource
    input:  !location [people.txt]
    delimiter: '\t'
    fields: &fields
        - { name: id }
        - { name: first_name }
        - { name: last_name, type: STRING } # STRING is the default type
        - { name: salary,   type: NUMBER, format: '$###,###.##' }
        - { name: hiredate, type: DATE, format: MM/dd/yyyy }

filter: !condition [salary > 75000]

destination: !databaseDestination
  tableName: person
  columns: *fields
  dataSource: !!org.postgresql.ds.PGSimpleDataSource
    user: test
    password: test
    serverName: localhost
    databaseName: hr

The above Yaml script is used to read a delimited file, select only records with a salary above $75,000 and store them in a Postgres database table.
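
For comparison, the read-and-filter part of that script could be written imperatively as below. This is a simplified sketch and not the utility's code: DelimitedFilterSketch and selectAboveSalary are hypothetical names, the sample rows are made up, and the salary column is assumed to hold a plain number rather than the '$###,###.##' format declared in the script.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the script's read-and-filter step; not the utility's code. */
final class DelimitedFilterSketch {

    /** Returns the tab-delimited lines whose salary column (index 3) exceeds the threshold. */
    static List<String> selectAboveSalary(List<String> lines, double threshold) {
        List<String> selected = new ArrayList<>();
        for (String line : lines) {
            // Fields follow the script's order: id, first_name, last_name, salary, hiredate
            String[] fields = line.split("\t", -1);
            double salary = Double.parseDouble(fields[3]);
            if (salary > threshold) {
                selected.add(line);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only
        List<String> lines = List.of(
            "1\tAna\tLopez\t90000\t01/15/2010",
            "2\tLuis\tMora\t60000\t06/23/2012");
        System.out.println(selectAboveSalary(lines, 75000));
    }
}
```

The Yaml version states the same intent without any of this plumbing, which is the point of the declarative style.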

The framework underlying this utility is built upon abstractions such as record source and destination as well as filter, transformer and parser.

Each of these abstractions has multiple, alternative implementations:

  • Interface Source has concrete incarnations such as CSVSource, FixedLengthSource, JDBCSource or XBaseSource
  • Correspondingly, Destination has implementations such as XMLDestination, SQLInsertDestination or JDBCDestination
  • Filter and Transformer are typically application-specific so they’re best served by ScriptFilter and ScriptTransformer. These implementations enable users to inline JavaScript logic right inside the Yaml script

Each alternative component implementation has an associated tag. Thus, for instance, FixedLengthSource is mapped to !fixedLengthSource, JDBCDestination is mapped to !jdbcDestination, and the tag !location corresponds to URLInputStreamProvider.
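
The tag-to-implementation mapping can be pictured as a small registry. The sketch below is an assumption about the general shape of such a mapping, not YamlTag's internals: TagRegistry, register and newInstance are hypothetical names, and the nested Source, Destination, FixedLengthSource and JDBCDestination types are empty stand-ins for the framework classes mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not YamlTag's implementation: a tag-name-to-class registry. */
final class TagRegistry {

    // Empty stand-ins for the framework abstractions named in the text
    interface Source {}
    interface Destination {}
    static final class FixedLengthSource implements Source {}
    static final class JDBCDestination implements Destination {}

    private final Map<String, Class<?>> classesByTag = new HashMap<>();

    /** Associates a tag such as "!fixedLengthSource" with an implementation class. */
    void register(String tag, Class<?> implementation) {
        if (classesByTag.putIfAbsent(tag, implementation) != null) {
            throw new IllegalStateException("Tag already registered: " + tag);
        }
    }

    /** Instantiates the class mapped to the tag through its no-argument constructor. */
    Object newInstance(String tag) {
        Class<?> implementation = classesByTag.get(tag);
        if (implementation == null) {
            throw new IllegalArgumentException("Unknown tag: " + tag);
        }
        try {
            return implementation.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate class for tag " + tag, e);
        }
    }

    public static void main(String[] args) {
        TagRegistry registry = new TagRegistry();
        registry.register("!fixedLengthSource", FixedLengthSource.class);
        registry.register("!jdbcDestination", JDBCDestination.class);
        System.out.println(registry.newInstance("!fixedLengthSource").getClass().getSimpleName());
    }
}
```

Because the tag is an arbitrary name rather than something derived from the class name, a tag like !location can point at a class as differently named as URLInputStreamProvider.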

YamlTag enables a simple (but powerful) form of domain-specific language for instantiating blackbox frameworks in a declarative fashion.
