Tuesday, July 29, 2014

Fun with Scala Dynamic, macros and Yaml

SDynamic is a small utility to write untyped object literals in Scala and then treat dynamic results as if they were regular Scala objects.

// Look ma: no intervening case classes!
val naftaCountries = dyaml"""
  |- { name: USA,  currency: USD, population: 313.9,
  |    motto: In God We Trust, languages: [ English ] }
  |- { name: Canada, currency: CAD, population: 34.9,
  |    motto: A Mari Usque ad Mare, languages: [ English, French ] }
  |- { name: Mexico, currency: MXN, population: 116.1,
  |    motto: 'Patria, Libertad, Trabajo y Cultura', languages: [ Spanish ] }
"""
assert(naftaCountries.length == 3)
assert(naftaCountries(0).name == "USA")
assert(naftaCountries(1).population == 34.9)
assert(naftaCountries(2).motto == "Patria, Libertad, Trabajo y Cultura")
assert(naftaCountries(1).languages.toList == Seq("English", "French"))

The serialization language used to express object graphs is Yaml. Object-like property access is based on Scala's Dynamic trait.
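For readers new to the Dynamic trait, here is a minimal sketch of the mechanism. The `Dyn` class and its `as` method are made up for illustration and are not SDynamic's actual implementation; the sketch only assumes the standard rewriting of `d.name` into `d.selectDynamic("name")`.

```scala
import scala.language.dynamics

object DynamicSketch {
  // Hypothetical wrapper, not SDynamic's actual class: exposes the keys of
  // nested Maps as if they were fields, using Scala's Dynamic trait.
  final class Dyn(private val underlying: Any) extends Dynamic {
    // `d.name` is rewritten by the compiler to `d.selectDynamic("name")`
    def selectDynamic(name: String): Dyn = underlying match {
      case m: Map[_, _] =>
        m.asInstanceOf[Map[String, Any]].get(name) match {
          case Some(v) => new Dyn(v)
          case None    => throw new NoSuchElementException(s"No such property: $name")
        }
      case other =>
        throw new UnsupportedOperationException(s"Not an object: $other")
    }

    // `d(0)` indexes into a list-valued node
    def apply(index: Int): Dyn = underlying match {
      case s: Seq[_] => new Dyn(s(index))
      case other     => throw new UnsupportedOperationException(s"Not a list: $other")
    }

    // Explicit, unchecked cast back to a static type
    def as[T]: T = underlying.asInstanceOf[T]
  }

  // Stand-in for a parsed Yaml document: a list of maps
  val countries = new Dyn(Seq(
    Map("name" -> "USA", "population" -> 313.9, "languages" -> Seq("English")),
    Map("name" -> "Canada", "population" -> 34.9, "languages" -> Seq("English", "French"))
  ))
}
```

With this in place, `DynamicSketch.countries(1).name.as[String]` reads much like a regular field access, at the price of failing at run time rather than compile time when a property is missing.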

The dyaml and syaml string interpolators provide a convenient notation while ensuring Yaml well-formedness at compile time via a simple macro. (syaml uses the SnakeYAML parser.)
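To give a feel for how a custom string interpolator is wired up, here is a rough, runtime-only sketch. The `kv` interpolator is hypothetical: it parses flat "key: value" lines rather than real Yaml, and it leaves out the compile-time macro check that dyaml and syaml perform.

```scala
object InterpolatorSketch {
  // Hypothetical `kv` interpolator: a runtime-only stand-in for dyaml/syaml.
  // It parses flat "key: value" lines instead of real Yaml and performs no
  // compile-time validation.
  implicit class KvString(val sc: StringContext) extends AnyVal {
    def kv(args: Any*): Map[String, String] =
      sc.s(args: _*)        // splice ${...} arguments, as the s interpolator does
        .stripMargin        // drop the leading `|` margins
        .linesIterator
        .map(_.trim)
        .filter(_.nonEmpty)
        .map { line =>
          val idx = line.indexOf(':')
          require(idx > 0, s"Malformed line: $line")
          line.take(idx).trim -> line.drop(idx + 1).trim
        }
        .toMap
  }

  val currency = "USD"

  // The interpolator is in scope here because it is defined in this object
  val usa = kv"""
    |name: USA
    |currency: $currency
    |motto: In God We Trust
  """
}
```

The pattern is an implicit class over `StringContext` with a method named after the interpolator prefix, the same one the `Html` helper in the larger example uses.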

IntelliJ IDEA users get the added bonus of Yaml literal syntax highlighting and edit-time validation.

In a small way, SDynamic relates to an old request first formulated in 2008: SI-993: Add support for YAML (like XML).

Code is available at https://github.com/xrrocha/sdynamic.

Why on Earth?

Yeah, why? And what about type-safety? ;-)

Like many such small utilities, SDynamic was born of a personal itch to scratch: I've needed to write numerous unit tests requiring lots of structured (but otherwise volatile) data.

Creating case classes nesting other case classes and then writing looong object literal expressions for them quickly grows tedious and cumbersome:

case class Country(name: String, currency: String, population: Double,
                   motto: String, languages: Seq[String])
// Wrappers, parens, quotes, commas. Oh my!
val naftaCountries = Seq(
    Country(
      name = "USA",
      currency = "USD",
      population = 313.9,
      motto = "In God We Trust",
      languages = Seq("English")),
    Country(
      name = "Canada",
      currency = "CAD",
      population = 34.9,
      motto = "A Mari Usque ad Mare",
      languages = Seq("English", "French")),
    Country(
      name = "Mexico",
      currency = "MXN",
      population = 116.1,
      motto = "Patria, Libertad, Trabajo y Cultura",
      languages = Seq("Spanish")))

The astute reader will notice the above could be written sans named parameters. For nested structures with more than just a few fields, however, positional parameters in object literals quickly become a liability as they obscure value-to-field attribution.
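A small side-by-side may help here. The `ParamsSketch` wrapper object is only there to keep the snippet self-contained; the case class mirrors the `Country` definition above.

```scala
object ParamsSketch {
  case class Country(name: String, currency: String, population: Double,
                     motto: String, languages: Seq[String])

  // Positional: shorter, but the reader must recall which string is which
  val positional =
    Country("Canada", "CAD", 34.9, "A Mari Usque ad Mare", Seq("English", "French"))

  // Named: self-describing, and arguments may appear in any order
  val named = Country(
    motto = "A Mari Usque ad Mare",
    name = "Canada",
    languages = Seq("English", "French"),
    population = 34.9,
    currency = "CAD")
}
```

Both values are equal. With two adjacent String fields (name and currency), however, swapping positional arguments by mistake would still compile.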

When dealing with one-off object literals we want:

  • Minimal verbosity
  • Maximal readability

Why Yaml?

Yeah! Why not JSON? Or XML?

Let's see:

Yaml:

- name: USA
  currency: USD
  population: 313.9
  motto: In God We Trust
  languages: [ English ]
- name: Canada
  currency: CAD
  population: 34.9
  motto: A Mari Usque ad Mare
  languages: [ English, French ]
- name: Mexico
  currency: MXN
  population: 116.1
  motto: Patria, Libertad, Trabajo y Cultura
  languages: [ Spanish ]

JSON:

[{"name": "USA",
  "currency": "USD",
  "population": 313.9,
  "motto": "In God We Trust",
  "languages": [ "English" ] },
 {"name": "Canada",
  "currency": "CAD",
  "population": 34.9,
  "motto": "A Mari Usque ad Mare",
  "languages": [ "English", "French" ] },
 {"name": "Mexico",
  "currency": "MXN",
  "population": 116.1,
  "motto": "Patria, Libertad, Trabajo y Cultura",
  "languages": [ "Spanish" ] }]

XML (excerpt):

    <motto>In God We Trust</motto>
    <motto>A Mari Usque ad Mare</motto>
    <motto>Patria, Libertad, Trabajo y Cultura</motto>

Yaml minimizes punctuation while enhancing readability:

  • No need to enclose property values or (the horror!) property names in quotation marks
  • No need to separate list elements with commas or enclose lists in brackets when using multi-line mode
  • No need to verbosely mark the beginning and end of each property


The example below builds an HTML page with a table of the NAFTA countries (flag, name, motto and languages):

object Example extends App {
  import DYaml._

  val countries = dyaml"""
    |- name: USA
    |  currency: USD
    |  population: 313.9
    |  motto: In God We Trust
    |  languages:
    |    - { name: English, comment: Unofficially official }
    |    - { name: Spanish, comment: Widely spoken all over }
    |  flag: http://upload.wikimedia.org/wikipedia/en/thumb/a/a4/Flag_of_the_United_States.svg/30px-Flag_of_the_United_States.svg.png
    |- name: Canada
    |  currency: CAD
    |  population: 34.9
    |  motto: |
    |    A Mari Usque ad Mare<br>
    |    (<i>From sea to sea, D'un océan à l'autre</i>)
    |  languages:
    |    - { name: English, comment: 'Official, yes' }
    |    - { name: French, comment: 'Officiel, oui' }
    |  flag: http://upload.wikimedia.org/wikipedia/en/thumb/c/cf/Flag_of_Canada.svg/30px-Flag_of_Canada.svg.png
    |- name: Mexico
    |  currency: MXN
    |  population: 116.1
    |  motto: |
    |    Patria, Libertad, Trabajo y Cultura<br>
    |    (<i>Homeland, Freedom, Work and Culture</i>)
    |  languages:
    |    - { name: Spanish, comment: 'Oficial, sí' }
    |    - { name: Zapoteco, comment: Dxandi' anja }
    |  flag: http://upload.wikimedia.org/wikipedia/commons/thumb/f/fc/Flag_of_Mexico.svg/30px-Flag_of_Mexico.svg.png
  """

  import Html._
  // Renders one table row per country
  def country2Html(country: SDynamic) = html"""
          |<tr>
          |  <td><img src="${country.flag}"></td>
          |  <td>${country.name}</td>
          |  <td>${country.motto}</td>
          |  <td>
          |    <ul>
          |      ${
                    // One list item per nested language object
                    country.languages.toList.map { lang =>
                      s"<li>${lang.name}: ${lang.comment}</li>"
                    }.mkString
                  }
          |    </ul>
          |  </td>
          |</tr>
      """

  val pageHtml = html"""
          |<head><title>NAFTA Countries</title><meta charset="UTF-8"></head>
          |<table border='1'>
          |<tr>
          |  <th>Flag</th>
          |  <th>Name</th>
          |  <th>Motto</th>
          |  <th>Languages</th>
          |</tr>
          |${(countries map country2Html).mkString}
          |</table>
      """

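  // Write the page as UTF-8, matching the charset declared in <head>
  val out = new java.io.FileOutputStream("src/test/resources/countries.html")
  try out.write(pageHtml.getBytes("UTF-8"))
  finally out.close()
}

object Html {
  // html"..." interpolates like s"...", then strips `|` margins and trims
  implicit class HtmlString(val sc: StringContext) extends AnyVal {
    def html(args: Any*) = sc.s(args: _*).stripMargin.trim
  }
}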

  1. Interesting stuff, indeed. What would it take to make this type safe?

    1. For this to be type-safe the Yaml parser would have to return a fully assembled object (which, by the way, would render Dynamic unnecessary).

      SnakeYAML does assemble complete objects, of course, but it requires classes to follow the Java Beans property convention (getters/setters) as well as to use Java collection types (List and Map in java.util).

      Most Scala classes (most notably, case classes) are not Java beans and use Scala's own collection classes. SnakeYAML doesn't support this yet and there's currently no Scala-aware Yaml parser in sight.
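To make the contrast concrete, here is a sketch of what a bean-style class looks like next to a case class. `CountryBean` is hypothetical and hand-written for illustration (Scala's `scala.beans.BeanProperty` annotation can generate such accessors); no Yaml parser is involved.

```scala
object BeanSketch {
  // Hypothetical bean-style counterpart of a Country case class: a no-arg
  // constructor, mutable state behind getX/setX accessors, and java.util
  // collections instead of Scala ones.
  class CountryBean {
    private var name: String = null
    private var languages: java.util.List[String] = null

    def getName(): String = name
    def setName(value: String): Unit = { name = value }

    def getLanguages(): java.util.List[String] = languages
    def setLanguages(value: java.util.List[String]): Unit = { languages = value }
  }

  // A bean is populated step by step through its setters
  val canada = new CountryBean
  canada.setName("Canada")
  canada.setLanguages(java.util.Arrays.asList("English", "French"))
}
```

A case class has none of this: its fields are immutable constructor parameters and its collections are Scala's, which is why a bean-oriented parser cannot populate it directly.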

  2. Very interesting and useful. The difference between Yaml and XML is telling. A few more examples of how to handle one more level of inner classes / elements would have been helpful, IMHO. Thanks anyway for a readable post.

    1. Thanks for your feedback Nirmalya!

      I've modified the last example to include a nested "language" object that provides one more level of depth (and renders as an unordered list).

  3. You may find these relevant too (Scala Dynamic for JSON): https://github.com/pathikrit/dijon