Ricardo Rocha's Website

Musings on Programming and Programming Languages

URL Shortener #2: Java Implementation

How do we go about implementing the URL shortener in Java? This series of articles contrasts the Java and Xtend languages around a very simple URL shortening REST service. Xtend is a JVM language that compiles into readable Java and is fully compatible with all Java frameworks, libraries and tools. If you know Java you already know most of Xtend! A presentation is also available.

Implementing the Domain Model

Our basic domain model is:

This translates readily into Java like this:

public interface Shortener {
  String shorten(String str);
}

public interface StorageProvider {
  Map<String, String> openStorage();
}

class UrlShortener {
  private Shortener shortener;
  private StorageProvider storageProvider;
  
  public void start() {
    // Start the server here
  }
  
  public void stop() {
    // Stop the server here
  }
}
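
To see how the two interfaces cooperate before any server code exists, here is a minimal, JDK-only sketch. The `toyShortener` and `toyProvider` lambdas and the `register` helper are hypothetical stand-ins invented for illustration; they are not the implementations described later in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Same shapes as the interfaces above, repeated so the sketch compiles on its own
interface Shortener {
  String shorten(String str);
}

interface StorageProvider {
  Map<String, String> openStorage();
}

class DomainModelSketch {
  // Hypothetical helper: shorten a long URL and remember the hash -> URL mapping
  static String register(Shortener shortener,
                         Map<String, String> storage,
                         String longUrl) {
    String hash = shortener.shorten(longUrl);
    storage.put(hash, longUrl);
    return hash;
  }

  public static void main(String[] args) {
    // Toy stand-ins (assumptions): both interfaces have a single method,
    // so a lambda or a constructor reference is enough
    Shortener toyShortener = str -> Integer.toHexString(str.hashCode());
    StorageProvider toyProvider = HashMap::new;

    Map<String, String> storage = toyProvider.openStorage();
    String hash = register(toyShortener, storage,
                           "http://example.com/some/long/path");
    System.out.println(hash + " -> " + storage.get(hash));
  }
}
```

Because both interfaces declare a single method, tests can supply throwaway lambdas like these without writing full classes.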

SparkJava and Server Instantiation

One of the more compelling features of the SparkJava micro-framework is that configuration can be omitted and the framework will take care of most of the server details with sensible defaults.

In order to implicitly start the server to listen on its default 4567 port it’s enough to invoke one or more of the framework-provided REST methods (post, get, put, delete, head, patch and options).

Thus, the UrlShortener.start() method will look like:

import static spark.Spark.*;

class UrlShortener {
  private Shortener shortener;
  private StorageProvider storageProvider;
  
  private Map<String, String> storage;
  
  public void start() {
    // Get our key/value store on starting
    storage = storageProvider.openStorage();
    
    post("/api/shorten", (req, res) -> {
      // shorten stuff goes here
    });
  
    get("/:hash", (req, res) -> {
      // get/redirect stuff goes here
    });
  
    delete("/:hash", (req, res) -> {
      // delete stuff goes here
    });
  }
  
  public void stop() {
    spark.Spark.stop();
    // Wrap-up stuff goes here
  }
  
  // Peripheral stuff follows...
}

It is important to note that in invoking these methods we’re actually configuring the server routes. Thus, when UrlShortener.start() is called it returns almost immediately after the server has been started on a separate thread and configured with routes specified through the above HTTP methods.

By the way, the req and res objects shown above are not the same as those in the Servlet API; they’re provided by SparkJava and are simpler to use.

Posting a Long URL to Obtain a Short One

Given the shortener and storageProvider dependencies we can implement URL shortening as a POST request:

post("/api/shorten", (req, res) -> {
  // The long URL comes in the POST's body
  String longUrl = req.body();
  
  // Try to build URI from long URL
  final URI uri;
  try {
    uri = new URI(longUrl);
  } catch (Exception e) {
    // If invalid, return 400 and
    // an appropriate message
    res.status(SC_BAD_REQUEST);
    return "Malformed URL: " + longUrl;
  }

  // We only deal with HTTP/HTTPS requests
  if (uri.getScheme() != null &&
      uri.getScheme().startsWith("http")) {
    // Compute the short hash
    String hash =
      shortener.shorten(longUrl);
    // Make the hash point to the long URL in our key/value store
    storage.put(hash, longUrl);
    // Build the redirect short URL
    String redirectUrl =
      new URI(req.url()).
        resolve("/" + hash).toString();
    res.status(SC_CREATED);
    return redirectUrl;
  } else {
    // Not an HTTP URI, bad boy!
    res.status(SC_BAD_REQUEST);
    return "Not an HTTP URL: " + longUrl;
  }
});
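
The two pure pieces of this handler, validating the long URL and building the short one, can be tried in isolation with nothing but the JDK. The sketch below uses hypothetical helper names (`isHttpUrl`, `shortUrlFor`) and makes one deliberate variation: it accepts only the exact schemes `http` and `https`, whereas the `startsWith("http")` test in the listing would also let through a made-up scheme such as `httpx`.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ShortenRequestSketch {
  // True only for syntactically valid URIs whose scheme is exactly http or https
  static boolean isHttpUrl(String candidate) {
    if (candidate == null) {
      return false;
    }
    try {
      String scheme = new URI(candidate).getScheme();
      return "http".equalsIgnoreCase(scheme)
          || "https".equalsIgnoreCase(scheme);
    } catch (URISyntaxException e) {
      // Not parseable as a URI at all
      return false;
    }
  }

  // Replaces the request path with "/<hash>", keeping scheme, host and port
  static String shortUrlFor(String requestUrl, String hash) {
    return URI.create(requestUrl).resolve("/" + hash).toString();
  }

  public static void main(String[] args) {
    System.out.println(isHttpUrl("https://www.eclipse.org/xtend/")); // true
    System.out.println(isHttpUrl("ftp://example.com/file"));         // false
    System.out.println(isHttpUrl("not a url"));                      // false
    // prints http://localhost:4567/3e3ae7a9
    System.out.println(
      shortUrlFor("http://localhost:4567/api/shorten", "3e3ae7a9"));
  }
}
```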

Requesting a Short URL that Redirects to the Long One

When we request a short URL (built by post() above) the response redirects to the original long URL.

get(redirectPath + ":hash", (req, res) -> {
  // Extract the hash from the path
  String hash = req.params(":hash");
  // Do we have it in our key/value store?
  if (storage.containsKey(hash)) {
    // Get the associated long URL...
    String longUrl = storage.get(hash);
    // ..and redirect the client there
    res.redirect(longUrl);
    // Return no content
    return "";
  } else {
    // No such short URL: 404
    res.status(SC_NOT_FOUND);
    return "No such short URL: " + req.url();
  }
});

Deleting a Short URL

Deleting a short URL is as simple as:

delete(redirectPath + ":hash", (req, res) -> {
  // Extract the hash from the path
  String hash = req.params(":hash");
  // Do we have it in our key/value store?
  if (storage.containsKey(hash)) {
    // Remove hash/longUrl entry
    storage.remove(hash);
    // Declare success w/no content
    res.status(SC_NO_CONTENT);
  } else {
    // No such short URL: 404
    res.status(SC_NOT_FOUND);
  }
  // Return no content
  return "";
});

Stopping the URL Shortener

To stop our service we need to stop the Spark server itself and also close the storage map:

public void stop() {
  // Stop the Spark server
  spark.Spark.stop();
  // Close persistent key/value store
  if (storage instanceof Closeable) {
    try {
      ((Closeable) storage).close();
    } catch (IOException ioe) {
      logger.warn("Ignoring closing error in storage");
    }
  }
}
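
The `instanceof Closeable` check matters because `StorageProvider` only promises a `Map`: some maps need closing and some do not. A small JDK-only sketch of that behaviour follows; `TrackedMap` and `closeQuietly` are hypothetical names introduced here, and `System.err` stands in for the logger.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class StorageCloseSketch {
  // Hypothetical map that records whether close() was called
  static class TrackedMap extends HashMap<String, String>
    implements Closeable {
    boolean closed = false;

    @Override
    public void close() throws IOException {
      closed = true;
    }
  }

  // Closes the storage only when it is Closeable;
  // returns true if close() was invoked and succeeded
  static boolean closeQuietly(Map<String, String> storage) {
    if (storage instanceof Closeable) {
      try {
        ((Closeable) storage).close();
        return true;
      } catch (IOException ioe) {
        System.err.println("Ignoring closing error in storage: " + ioe);
      }
    }
    return false;
  }

  public static void main(String[] args) {
    TrackedMap persistentLike = new TrackedMap();
    System.out.println(closeQuietly(persistentLike));  // true
    System.out.println(persistentLike.closed);         // true
    System.out.println(closeQuietly(new HashMap<>())); // false: nothing to close
  }
}
```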

Implementing the Shortener Interface

We’ve chosen Guava’s Hashing utility class to provide us with hashing algorithms suitable for string shortening due to their very low probability of collision.

Two appropriate algorithms are murmur3 and sipHash, both of which Guava’s Hashing class implements.

Implementing our Shortener interface based on Guava’s Hashing is a breeze:

public class Murmur3Shortener
  implements Shortener {
  
  @Override
  public String shorten(String string) {
    checkNotNull(string,
      "String to be shortened cannot be null");
    
    return Hashing.murmur3_32().
      hashString(string, UTF_8).
      toString();
  }
}

public class SipHashShortener
  implements Shortener {
  
  @Override
  public String shorten(String string) {
    checkNotNull(string,
      "String to be shortened cannot be null");
    
    return Hashing.sipHash24().
      hashString(string, UTF_8).
      toString();
  }
}
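
When Guava is not on the classpath (in a quick experiment, say), the same interface can be satisfied with the JDK alone. The sketch below is a swapped-in technique, not one of the implementations above: it uses `java.util.zip.CRC32`, which is a checksum rather than a hash designed for even distribution, so treat it as a stand-in for trying things out. The class name `Crc32Shortener` is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.zip.CRC32;

// Same shape as the Shortener interface above, repeated for self-containment
interface Shortener {
  String shorten(String str);
}

// Hypothetical JDK-only shortener: CRC-32 of the UTF-8 bytes as 8 hex digits
class Crc32Shortener implements Shortener {
  @Override
  public String shorten(String string) {
    Objects.requireNonNull(string,
      "String to be shortened cannot be null");

    CRC32 crc = new CRC32();
    crc.update(string.getBytes(StandardCharsets.UTF_8));
    // getValue() fits in 32 bits, so zero-padding to 8 digits gives a fixed width
    return String.format("%08x", crc.getValue());
  }

  public static void main(String[] args) {
    Shortener shortener = new Crc32Shortener();
    System.out.println(shortener.shorten("http://www.eclipse.org/xtend/"));
  }
}
```

Like the Guava-based classes, it returns a fixed-width hexadecimal string and the same input always yields the same output.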

Implementing the StorageProvider Interface

For key/value persistent stores we’ve selected ChronicleMap. This embeddable key/value store is very fast and easy to use.

The StorageProvider implementation is straightforward but requires a little bit of configuration:

public class ChronicleMapStorageProvider
  implements StorageProvider {
  
  private final String name;
  private final String filename;
  private final int entries;
  private final double averageKeySize;
  private final double averageValueSize;
  
  // All-property constructor
  // and associate builder elided

  @Override
  public Map<String, String> openStorage() {
    try {
      return ChronicleMapBuilder.
        of(String.class, String.class).
        name(this.name).
        entries(this.entries).
        averageKeySize(this.averageKeySize).
        averageValueSize(this.averageValueSize).
        createPersistedTo(new File(filename));
    } catch (IOException e) {
      throw
        new RuntimeException(e.getMessage(), e);
    }
  }
}

Since the returned store implements Map<String, String>, usage is extremely simple. The only caveat: it’s important to ensure the returned map is properly closed (yes, the returned store implements both Map and Closeable). This is achieved by means of a shutdown hook implemented in our Main class (discussed next).

We also provide an InMemoryStorageProvider implementation useful for testing:

public class InMemoryStorageProvider
  implements StorageProvider {
  
  @Override
  public Map<String, String> openStorage() {
    return new ConcurrentHashMap<>();
  }
}

Putting It All Together: the Main Application

The Main application is responsible for instantiating, starting and stopping a properly configured instance of UrlShortener:

public static void main(String... args) {
  // Extract configuration filename
  // from CLI arguments
  String configFilename =
    configFromArgs(args);
  
  // Open filesystem or resource file
  InputStream is =
    openConfigFile(configFilename);
  
  // Build a ready-made,
  // fully-configured UrlShortener instance
  UrlShortener urlShortener =
    buildUrlShortener(is, configFilename);
  
  // Start the URL shortener adding a
  // shutdown hook for graceful termination
  startUrlShortener(urlShortener);
}

Extracting the configuration filename from the command-line arguments is very simple: in the absence of arguments, the filename defaults to "url-shortener.yaml"; otherwise the first argument is used as the filename.

Opening the configuration file tries to open the file on the filesystem and, failing that, opens it as a top-level classpath resource.
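
The two helpers just described are not listed in this post, so here is one way they might look. This is a sketch under assumptions: the method names match the calls in main(), but the bodies are a guess, and a missing file is reported with an exception here rather than by exiting the JVM.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;

class ConfigLocationSketch {
  static final String DEFAULT_CONFIG = "url-shortener.yaml";

  // No arguments: use the default name; otherwise the first argument wins
  static String configFromArgs(String... args) {
    return args.length == 0 ? DEFAULT_CONFIG : args[0];
  }

  // Try the filesystem first, then fall back to a top-level classpath resource
  static InputStream openConfigFile(String filename) {
    try {
      return new FileInputStream(filename);
    } catch (FileNotFoundException e) {
      // The leading "/" makes the lookup relative to the classpath root
      InputStream is =
        ConfigLocationSketch.class.getResourceAsStream("/" + filename);
      if (is == null) {
        throw new IllegalArgumentException(
          "Configuration file not found: " + filename);
      }
      return is;
    }
  }

  public static void main(String[] args) {
    System.out.println(configFromArgs());               // url-shortener.yaml
    System.out.println(configFromArgs("my-conf.yaml")); // my-conf.yaml
  }
}
```

Falling back to the classpath lets the über-jar carry a default configuration while still allowing an external file to override it.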

The truly interesting bit is the instantiation of a ready-made UrlShortener instance without having to consume and analyze the configuration.

This feat is achieved by means of SnakeYAML, a widely used YAML processor providing directives to instantiate arbitrary classes and set their properties even when they’re private final fields!

Thus, if we want to configure a UrlShortener instance with our dependency interface implementations, our Yaml file will look like:

shortener: !!net.xrrocha.urlshortener.shortening.Murmur3Shortener []

storageProvider: !!net.xrrocha.urlshortener.storage.ChronicleMapStorageProvider
  name: url-shortener
  filename: /var/url-shortener/url-shortener.dat
  entries: 1048576
  averageKeySize: 32
  averageValueSize: 64

This simple approach to instantiation centralizes all configuration and dependency injection in a single, readable Yaml file.

The logic required to create our UrlShortener instance from a Yaml file is:

static UrlShortener
  buildUrlShortener(InputStream is,
                    String filename) {
  Yaml yaml = new Yaml();
  yaml.setBeanAccess(BeanAccess.FIELD);
  UrlShortener urlShortener = null;

  try {
    urlShortener =
      yaml.loadAs(is, UrlShortener.class);
  } catch (Exception e) {
    exit("Error in configuration file '" + filename + "': " + e);
  }

  if (urlShortener == null) {
    exit("Invalid yaml content in configuration file: " + filename);
  }

  return urlShortener;
}

static void exit(String message) {
  System.err.println(message);
  System.exit(1);
}

We make all our configurable classes immutable by implementing all-property constructors and builders. As a consequence, our classes don’t feature getter/setter accessors. Because of this, we configure SnakeYAML to reflectively set private final fields. This way we have all the advantages of immutability while still leveraging Yaml’s intuitive property setting syntax.
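
To make "reflectively set private final fields" concrete, the following JDK-only sketch shows the general reflection mechanism involved. It is not SnakeYAML's actual code; `StorageSettings` and `setField` are hypothetical names, and the no-argument constructor is an assumption made so that an instance exists before its fields are populated.

```java
import java.lang.reflect.Field;

class FieldInjectionSketch {
  // Hypothetical immutable-style class: no setters, private final state
  static class StorageSettings {
    private final String name;
    private final int entries;

    StorageSettings() {
      this.name = null;
      this.entries = 0;
    }

    String name() { return name; }
    int entries() { return entries; }
  }

  // Sets a (possibly private final) instance field by name
  static void setField(Object target, String fieldName, Object value) {
    try {
      Field field = target.getClass().getDeclaredField(fieldName);
      // Lifts the access check so private and final instance fields can be written
      field.setAccessible(true);
      field.set(target, value);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(
        "Cannot set field " + fieldName, e);
    }
  }

  public static void main(String[] args) {
    StorageSettings settings = new StorageSettings();
    setField(settings, "name", "url-shortener");
    setField(settings, "entries", 1048576);
    System.out.println(settings.name() + " / " + settings.entries());
  }
}
```

The class still exposes no setters to ordinary callers; only code that deliberately reaches for reflection can populate it.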

Lastly, the logic required to start the URL shortener while ensuring graceful shutdown is:

static void
  startUrlShortener(UrlShortener urlShortener) {
  
  try {
    urlShortener.start();
  } catch (Exception e) {
    // Failure to start is associated with
    // incorrect configuration or
    // misbehaving dependencies
    exit("Error starting URL shortener: " + e.getMessage());
  }
  
  // Ensure closeable dependencies
  // are gracefully stopped on shutdown
  Runtime.getRuntime().
    addShutdownHook(new Thread(urlShortener::stop));
}

Testing the Service with curl

Our implementation also creates an über-jar usable from the command line like so:

$ java -jar java-url-shortener-1.0.0.jar my-conf.yaml
DEBUG [main] (Routes.java:177) - Adds route: post, /api/shorten, spark.RouteImpl$1@71248c21
DEBUG [main] (Routes.java:177) - Adds route: get, /:hash, spark.RouteImpl$1@6166e06f
DEBUG [main] (Routes.java:177) - Adds route: delete, /:hash, spark.RouteImpl$1@3d3fcdb0
 INFO [Thread-2] (Log.java:192) - Logging initialized @408ms to org.eclipse.jetty.util.log.Slf4jLog
 INFO [Thread-2] (EmbeddedJettyServer.java:127) - == Spark has ignited ...
 INFO [Thread-2] (EmbeddedJettyServer.java:128) - >> Listening on 0.0.0.0:4567
 INFO [Thread-2] (Server.java:372) - jetty-9.4.z-SNAPSHOT
 INFO [Thread-2] (DefaultSessionIdManager.java:364) - DefaultSessionIdManager workerName=node0
 INFO [Thread-2] (DefaultSessionIdManager.java:369) - No SessionScavenger set, using defaults
 INFO [Thread-2] (HouseKeeper.java:149) - Scavenging every 660000ms
 INFO [Thread-2] (AbstractConnector.java:280) - Started ServerConnector@1b649e{HTTP/1.1,[http/1.1]}{0.0.0.0:4567}
 INFO [Thread-2] (Server.java:444) - Started @460ms

Once our server is running we can test it with curl:

$ curl \
  -v \
  -d'http://www.eclipse.org/xtend/documentation/201_types.html#local-type-inference' \
  http://localhost:4567/api/shorten
     
  *   Trying 127.0.0.1...
  * TCP_NODELAY set
  * Connected to localhost (127.0.0.1) port 4567 (#0)
  > POST /api/shorten HTTP/1.1
  > Host: localhost:4567
  > User-Agent: curl/7.52.1
  > Accept: */*
  > Content-Length: 78
  > Content-Type: application/x-www-form-urlencoded
  > 
  * upload completely sent off: 78 out of 78 bytes
  < HTTP/1.1 201 Created
  < Date: Thu, 08 Jun 2017 01:37:48 GMT
  < Content-Type: text/html;charset=utf-8
  < Transfer-Encoding: chunked
  < Server: Jetty(9.4.z-SNAPSHOT)
  < 
  * Curl_http_done: called premature == 0
  * Connection #0 to host localhost left intact
  http://localhost:4567/3e3ae7a9    

Requesting the returned short URL would look like:

$ curl -s -L http://localhost:4567/3e3ae7a9 | head -10
<!DOCTYPE html>
<html>

  <head>
	<meta charset="UTF-8">
	<title>Xtend - Java Interoperability</title>
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<meta name="description"
		content="Xtend is a statically typed programming language sitting on top of Java.">
	<meta name="author" content="Sven Efftinge">

On to the Xtend Implementation…

In our next post we study the Xtend implementation of our Main method, contrasting it side-by-side with its Java sibling.

You already know Java, so you may be surprised by how much Xtend you didn’t know you knew.
