Wednesday, September 3, 2014

Building Object Oriented Frameworks (2)


Building Object Oriented Frameworks

Object oriented frameworks are a mainstay of modern software development. Whether you develop in Java, C#, Objective-C, Python, Ruby or Javascript, chances are you're basing your development on some sort of application development framework. Yet, not many of us are familiar with building application frameworks to fulfill business needs in our organizations. This series of posts illustrates object oriented framework development around a simple (though not trivial) application domain. The code for these documents, written in the Xtend programming language, is available at https://github.com/xrrocha/xrecords. Xtend is a modern JVM language whose syntax is highly readable to developers acquainted with the various mainstream object oriented languages.
  1. In our first post we outlined what a framework is, how it is implemented and how it can be used to build concrete applications.
  2. In this second post we zoom into the details of framework design around the xrecords example. We identify the portions of the application domain that do not change from application to application and capture them as framework frozen spots. In doing so, we also identify the portions that do change between applications and capture them as abstract interfaces to be concretized by application developers: the hot spots.

2. Frozen Spots, Hot Spots

For any given application domain, a framework separates what is fixed from what is variable.
The intent is to capture, once and for all, the application aspects that are fixed so that future applications do not have to repeat them over and over. Because such application aspects are captured into the framework kernel in an unchanging form, they're referred to as frozen spots.
Application aspects that can change from application to application, on the other hand, are referred to as hot spots.
Obviously, hot spot actual functionality cannot be captured by the framework beforehand, but its relationship with the frozen spots can: hot spots are seen by the framework as interfaces (or abstract classes) for which an open-ended number of pluggable implementations may exist.

An important consequence of this architecture is that framework customization occurs mostly by supplying concrete hot spot implementations, seldom -if ever- by altering the framework kernel's source code!

xrecord Frozen Spot

Our example record copying framework exhibits one obvious frozen spot: that of reading records from an input source, possibly filtering and transforming them, and then writing them to their output destination:

A first-draft implementation for the above may look like:
// The framework data
class Record extends HashMap<String, Object> {} // Simplified, for now

// A framework-defined contract: setting up and wrapping up
interface Lifecycle {
    def void open()
    def void close()
}

// Hot spot: interface for record producer
interface Source extends Iterator<Record>, Lifecycle {}

// Hot spot: interface for record consumer
interface Destination extends Lifecycle { def void put(Record record) }

// Hot spot: interface for record filtering
interface Filter {
    def boolean matches(Record record)
    val nullFilter = new Filter { override matches(Record record) { true } }
}

// Hot spot: interface for record transformation
interface Transformer {
    def Record transform(Record record)
    val nullTransformer = new Transformer { override transform(Record record) { record } } 
}

// Framework frozen spot (hence a concrete, possibly final, class)
class Copier {
    @Accessors Source source
    @Accessors Filter filter =  Filter.nullFilter
    @Accessors Transformer transformer = Transformer.nullTransformer
    @Accessors Destination destination

    def copy() {
        source.open()
        destination.open()

        source.forEach [ inputRecord |
            if (filter.matches(inputRecord)) {
                val outputRecord = transformer.transform(inputRecord)
                destination.put(outputRecord)
            }
        ]

        destination.close()
        source.close()
    }
}

xrecord Hot Spots

As seen above, our framework leaves four components unimplemented for developers to supply. When concrete instances of these hot spots are supplied to our framework, it is instantiated into a running application:
  • Source: an iterator returning Records
  • Filter: an optional sieve to select only applicable records
  • Transformer: an optional modifier to refine selected records
  • Destination: a sink where to put resulting records
Note how our framework supplies ready-made, no-op implementations for Filter and Transformer (nullFilter and nullTransformerrespectively). This reflects the fact that such steps are optional as many applications just need to convert between formats without further elaboration.
Since Source extends Iterator we can use Xtend's powerful forEach lambda construct to traverse it:
source.forEach [ inputRecord |
    if (filter.matches(inputRecord)) {
        val outputRecord = transformer.transform(inputRecord)
        destination.put(outputRecord)
    }
]
Both Source and Destination extend the Lifecycle contract and thus must provide implementations for the open() and close() operations. This makes sense as record sources typically need to perform houskeeping actions such as:
  • opening and closing files
  • connecting and disconnecting from databases
  • connecting and disconnecting from remote FTP servers

Time for an Example

Let's see how we can turn our framework into an executable application by adding simple implementations for these hot spots.
Consider the case where need to process a mainframe-supplied, fixed-length record containing the following fields:
FieldOffsetLength
code03
desc324
qty274
price316
where price has 2 (implicit) decimal positions. This file would look like:
123Bolts x 10               0012000245
234Eau de Perrier           2000000075
345Acqua Pellegrino         0520000055
456Caturro Coffee           0032015024
and we want to produce a spreadsheet-friendly, CSV file for totals above $1,000. The resulting file would look like:
"Code","Desc","Total"
123,"Bolts x 10",2940
234,"Eau de Perrier",1500
456,"Caturro Coffee",4807.68
The following is a (somewhat verbose) approximation to the application code:
class MyXRecordsApplication {
  static def main(String[] args) {
    val copier = new Copier => [
        // Hot spot implementation: Source
        source = new Source {
            var String line
            var BufferedReader in

            override open() {
               in = new BufferedReader(new FileReader('mainframe-file.dat'))
            }
            override hasNext() {
              line = in.readLine()
              line != null
            }
            override Record next() {
              val record = new Record
              record.put('code', line.substring(0, 3))
              record.put('desc', line.substring(3, 27))
              record.put('qty', Integer.parseInt(line.substring(27, 31)))
              record.put('price', Double.parseDouble(line.substring(31, 37)) / 100)
              record
            }
            override close() {
                in.close()
            }
            override remove() {}
       }
       // Hot spot implementation: Filter
       filter = new Filter {
           override matches(Record record) {
             val qty = record.get('qty') as Integer
             val price = record.get('price') as Double
             qty * price > 1000
           }
       }
       // Hot spot implementation: Transformer
       transformer = new Transformer {
           override transform(Record record) {
             val qty = record.get('qty') as Integer
             val price = record.get('price') as Double
             record.put('total', qty * price)
             record
           }
       }
       // Hot spot implementation: Destination
       destination = new Destination {
           var PrintWriter out

           override open() {
               out = new PrintWriter(new FileWriter('spreadsheet-file.csv'), true)
               out.println('"Code","Desc","Total"')
           }
           override put(Record record) {
               val line = #['code', 'desc', 'total'].
                   map['''"«record.get(it).toString.trim»"'''].
                   join(',')
               out.println(line)
           }
           override close() {
               out.close()
           }
       } 
    ]

    copier.copy()
  }
}

Cool! A bit involved, though...

Yes, a bit... 
This is so because (so far) our framework only captures the essence of record copying. Our hot spot implementations are still burdened with ancillary responsibilities such as:
  • Formatting fields
  • Decoding input lines into records
  • Assembling output lines from records
We can do better.
In the next post we'll see how to augment the framework with new frozen spots dealing with these aspects. We'll also write ready-made, configurable hot spot implementations developers can reuse to minimize repetition and verbosity.
Stay tuned! 

Wednesday, August 20, 2014

Building Object Oriented Frameworks (1)

Object oriented frameworks are a mainstay of modern software development. Whether you develop in Java, C#, Objective-C, Python, Ruby or Javascript, chances are you're basing your development on some sort of application development framework.
Yet, not many of us are familiar with building application frameworks to fulfill business needs in our organizations. This series of posts illustrates object oriented framework development around a simple (though not trivial) application domain.

  • This first post outlines what a framework is, how it is implemented and how it can be used to build concrete applications.
  • The second post zooms into the core of framework design: identifying unchanging (frozen) and changing (hot) spots of the application domain
  • The code for these documents, written in the Xtend programming language, is available at https://github.com/xrrocha/xrecords. Eclipse's Xtend is a modern JVM language whose syntax is highly readable to developers acquainted with the various mainstream object oriented languages.

    What is a framework anyway?

    At is essence, a framework is a foundation for developing a particular type of application.
    A framework captures the expertise needed to solve a particular class of problems. In doing so, it provides pre-written code that you can add to in order to build a concrete application.
    This concept is beautifully illustrated by Apple's OSX documentation in the following allegory:
    Integrate Your Code with the Frameworks
    Unlike an application, a framework is not directly executable. This is so because, for its given application domain, a framework captures what doesn't change and deliberately leaves out what can change. You, the application developer, must provide the bits that change for the framework to become an executable application.
    Because of this frameworks exhibit a property dubbed inversion of control, where it is the framework that calls into your code, not the other way around.

    What a framework is not

    As follows from the above, a framework is not a library. When you make use of a library you decide when and how to call it. In a framework setup, though, the framework is in control; you supply it with your code for it to execute at a time of its choosing.
    As it's often the case in software development, the term framework is somewhat overloaded and is sometimes used with too narrow a meaning. Among web developers, in particular, "framework" has become synonymous with "web application development framework" or "model-view-controller framework". While these development tools are indeed frameworks, the notion of framework as the foundation for a class of applications is much more general.

    Too abstract! Show me an example

    Sure! Mind you, though, frameworks are abstract

    Consider a utility to convert between tabular record formats such as:
    • Relational database tables
    • CSV and delimited files
    • Fixed-length files
    • XBase (DBF) files
    • Flat JSON, XML or Yaml files
    This utility uses Yaml as its configuration format. Thus, for example, the following Yaml script populates a database table from a CSV file:
    source: !csvSource
        input: !fromLocation [data/acme-form4269.csv]
        fields: &myFields [
            { name: tariff, format: !integer },
            { name: desc,   format: !string  },
            { name: qty,    format: !integer },
            { name: price,  format: !double ['#,###.##'] },
            { name: origin, format: !string },
            { name: eta,    format: !date [dd/MM/yyyy] }
        ]
    
    filter: !condition [tariff != 0] # javascript
    
    destination: !databaseDestination
        tableName:  form4269
        columns: *myFields # CSV field names match column names
        dataSource: !!org.postgresql.ds.PGSimpleDataSource
            user: load
            password: load123
            serverName: customs.feudalia.gov
            databaseName: forms
    
    The Yaml parser used here is SnakeYAML. Type tags are based on YamlTag

    What's our application domain?

    Every framework addresses a particular application domain for which it supports building concrete applications.
    Our utility's application domain is that of tabular file format conversion: it transcribes data from one tabular file format to another.
    From: CSVTo: Fixed-length
    We use the term tabular to state our records are flat in nature: we support scalar fields only, with no provisions for arrays or nested records.
    Incidentally, we treat relational database tables as just another format, on par with its flat file cousins.

    What's the domain theory?

    A framework embodies a theory about its problem domain. This is always the case, even when the theory is implied only by the implementation.
    Because frameworks strive to provide the foundation for any application within their stated domain, their theories need to be comprehensive.
    A comprehensive theory about an application domain can only stem from repeated experiences in automating the domain. This is referred to as the Three Examples pattern of framework development.
    Our tabular format conversion domain is simple enough to have a small theory. Within this domain the notion of concrete application corresponds to a program or script performing a specific conversion. In regard to the 3 examples rule, this framework is backed by numerous hand-written, ad-hoc conversions.

    Lingua Franca

    Translating among many formats calls for an intermediate representation to be chosen such that all formats have translations to and from the intermediate representation rather than translations to every other format.
    This reduces the number of translations from n2 - n to a more manageable 2n. In absence of such lingua franca, we'd be in a Babel Tower predicament.
    A suitable intermediate representation for tabular records is the format-agnostic, in-memory map whose keys are field names and whose values are field contents:
    class Record {
        val fields = new HashMap<String, Object>
        . . .
    }
    

    General Theory

    • Every conversion has a source and a destination.
    • The Source reads zero or more records encoded in the input tabular format. As each record is read, it is converted to a Recordrepresentation.
    • The Destination accepts zero or more Records. As each record is accepted, it is converted to the output tabular format and subsequently written.
    • As each record is read it can be filtered to determine whether it should be included in the output or not. The optional component responsible for this selection is referred to as the Filter.
    • Finally, selected records can be transformed so as to conform with the requirements of the given Destination. The optional component responsible for this is referred to as the Transformer.

    The Framework Model

    The above theory is captured in the following class diagram:
    We had previously stated that a framework captures what doesn't change in its domain. In our case, what doesn't change is the general algorithm followed to copy records from a source to a destination:
    // Copier.xtend
    source.open()
    destination.open()
    
    source.forEach [ in |
      if (filter.matches(in)) {
        val out = transformer.transform(in)
        destination.put(out)
      }
    ]
    
    source.close()
    destination.close()
    
    Because this logic doesn't ever change it is referred to as a frozen spot.
    The portions of the aplication that can change are called, correspondingly, hot spots. In our framework they are:
    • Source. Responsible for reading data items and translating them to the intermediate representation
    • Destination. Responsible for translating from the intermediate representation to the output format and writing the result
    • Filter. Responsible for determining whether a given instance of the intermediate representation should be further processed
    • Transformer. Responsible for converting an intermediate representation instance provided by the Source to one suitable for theDestination
    The following class diagram depicts the framework implementation for the database (JDBC) and CSV tabular formats:



    Note how, despite their seemingly opposite natures, Source and Destination share a common superclass for each tabular format.
    Thus, for example, CSVSource (which reads comma-separated files) and CSVDestination (which writes comma-separated files) share common attributes such the separator character, the field-enclosing quote character and whether the file has a header record or not.
    Likewise, the database components JDBCSource and JDBCDestination share a SQL DataSource attribute.
    Note that both Filter and Transformer above have scripting (rather than framework-supplied) implementations. This reflects the fact that logic for record selection and modification is application-specific and, thus, hard to capture in a general, reusable way. Scripting provides a mechanism for developers to pass simple filtering and transformation expressions without having to write framework-aware code. TheFieldRenamingTransformer component, on the other hand, satisfies the commonly occurring need to map input field names to different output field names.

    Framework Instantiation

    The process of extending the framework to turn it into an executable application is called framework instantiation.
    Instantiation for most frameworks require developers to write code extending framework-provided classes and interfaces. Such frameworks are referred to as whitebox frameworks because their internal structure (in terms of classes and interfaces) is visible to application developers.
    Other frameworks (ours included!) provide a repertoire of ready-made components such that framework instantiation no longer requires application code but only framework component configuration. Such frameworks are referred to as blackbox frameworks because their internal implementation is opaque to application developers who are only concerned configuring component instances.
    Blackbox frameworks make it possible to write new applications by wiring pre-existing components like thus:
    source: !databaseSource
         # Column labels are used as field names
        sqlText: |
            SELECT *
            FROM   emp
            ORDER BY deptno, empno
        dataSource: !!org.hsqldb.jdbc.JDBCDataSource
            url:    jdbc:hsqldb:file:hsqldb/example;hsqldb;shutdown=true
            user:   sa
    
    filter: !scriptFilter [sal > 1000] # javascript
    
    transformer: !renameFields # Only named fields are included in output record
        - empno: id
        - ename: name
        - sal: salary
    
    destination: !xbaseDestination
        output:  !outputDestination [data/well-paid-emps.dbf]
        fields: [
            { id, format: !integer },
            { ename, format: !string },
            { sal, format: !double },
        ]
    
    For our framework instantiation we've chosen Yaml to enunciate the application's object graph. For scripting we default to Javascript.
    In addition to Yaml, other forms in blackbox framework instantiation are used.
    In the Java world, in particular, the ever-popular Spring IoC container is frequently used as means of expressing framework instantiation. See the Heritrix settings guide for an example of Spring IoC framework configuration.
    Of course, good ole' source code can be used to enact framework instantiation:
    val copier = new Copier => [
        source = new JDBCRecordSource => [
            sqlText = "SELECT * FROM emp ORDER BY deptno, empno"
            dataSource = new JDBCDataSource => [
                user = "sa"
                url = "jdbc:hsqldb:file:hsqldb/example;hsqldb;shutdown=true"
            ]
        ]
    
        filter = new ScriptingCopierComponent => [
            script = "sal > 1000"
        ]
    
        transformer = new FieldRenamingTransformer => [
            renames = #{
                "empno" -> "id",
                "ename" -> "name",
                "sal"   -> "salary"
            }
        ]
    
        destination = new XBaseRecordDestination => [
            output = new FileLocationOutputStreamProvider => [
                location = "data/well-paid-emps.dbf"
            ]
            fields = #[
                new FormattedField<Integer> => [ parser = new IntegerParser ],
                new FormattedField<String> => [ parser = new StringParser ],
                new FormattedField<Double> => [ parser = new DoubleParser ]
            ]
        ]  
    ]
    

    Conclusion

    The essence of framework design lies in capturing the application logic that doesn't change. Such immutable logic (or frozen spot) expresses the business behavior in terms of one or more abstract components (or hot spots) that model what does change from application to application.
    For each hot spot multiple, alternative implementations may exist. Framework instantiation generally involves selecting what specific hot spot implementations to use as dictated by the application's requirements.
    As we'll see later on, each concrete hot spot implementation can be itself modeled as a framework. This recursive design process pervades framework design and implementation.
    Ideally, frameworks evolve towards a blackbox style where new applications can be created with a much simpler, declarative API. Thus, new applications are built by mixing, matching, wiring and configuring pre-existing components.
    This is the approach illustrated by our tabular format conversion utility. By capturing all the moving parts in its domain, this framework can be fully instantiated as an object graph wiring together existing components to assemble a fully working application capable of performing a specific format-to-format conversion.

    Continue to the next entry: Frozen spots, Hot Spots