Monday, September 19, 2011

Mixing Ruby: Dynamic Properties

"The very things I find ugly in Ruby are what make amazing Ruby software like RSpec possible, and that Python could never have (given the current implementation)." Gary Bernhardt [1]

Reading Metaprogramming Ruby, I was inspired to create my own Ruby mixin for a pattern I've seen implemented quite often: "Dynamic Properties".  This pattern is definitely not new; a number of popular libraries very effectively use the pattern to deliver amazing functionality (and expressiveness).

What is a Dynamic Property?  A dynamic property (or method) is a method that was never defined on the class, but that callers use anyway to represent a model 'construct' of that class.  Let me demonstrate, starting with what happens in a statically typed language:

package com.berico;

import java.util.Date;

public class WeatherObservation {

  private Date observationTime = null;

  public Date getObservationTime() {
    return observationTime;
  }

  public void setObservationTime(Date observationTime) {
     this.observationTime = observationTime;
  }	
}

Ok, we have a simple Java class.  Hell, in three edits, this could be a C# class as well.  Think about what happens when you do something like this:

public static void main(String... args){
		
  WeatherObservation wxOb = new WeatherObservation();
  wxOb.setObservationTime(new Date());

  // WTF?  This doesn't even compile!!!!		
  wxOb.setTemperature(42);
		
}

We know that this is virtually impossible in statically typed languages.  The "setTemperature" method does not exist, so the class cannot be compiled.  In some instances (in Java), it is possible to call a method on a class defined in an external archive (JAR) that doesn't exist at runtime; this might occur if the project is compiled against a different version of a dependency than the one on the classpath when the application is running.  Calling a method that doesn't exist in this way raises a NoSuchMethodError (reflective lookups throw NoSuchMethodException instead), and your program breaks unless the call is wrapped in some try-catch block (which is unlikely because you probably aren't accounting for methods "disappearing" on you).

Ruby handles things a little bit differently.  Instead of immediately raising an error, the runtime offers the class an option to handle the event of an undefined method being called on a class/instance.  If the class (or a parent class) has a function cleverly called "method_missing" defined, that method will be called with the name of the missing method and an array of arguments supplied to that method.  How you handle the unavailability of the method is completely up to you.
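To make the hook concrete, here is a minimal sketch.  The Ghost class and its "say_" naming rule are made up for illustration; the point is only that method_missing receives the method name as a Symbol plus the original arguments, and that calling super restores the normal NoMethodError for anything you don't want to handle.

```ruby
# Hypothetical class, used only to illustrate the hook.
class Ghost
  # Ruby calls this when no method matching the call is found.
  # name is a Symbol; args holds the arguments of the original call.
  def method_missing(name, *args)
    if name.to_s.start_with?("say_")
      "#{name.to_s.sub('say_', '')}: #{args.join(', ')}"
    else
      super # fall back to the default behaviour (NoMethodError)
    end
  end
end

ghost = Ghost.new
puts ghost.say_boo("one", "two") # prints "boo: one, two"

begin
  ghost.vanish
rescue NoMethodError => e
  puts "Not handled: #{e.name}" # prints "Not handled: vanish"
end
```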

Many frameworks use "method_missing" to do some pretty awesome things.  Think about it: in Ruby, you can catch any method that doesn't exist as it is called on that class!  ActiveRecord, an extremely popular ORM in Ruby, uses method_missing to create "dynamic finders" [2].  The following is an example from the ActiveRecord documentation:

Person.where(:user_name => user_name, :password => password).first
Person.find_by_user_name_and_password(user_name, password)

The method "find_by_user_name_and_password" does not exist on the Person class, and certainly doesn't exist on ActiveRecord::Base, the class that model classes extend when using the ORM.  Instead, ActiveRecord catches the method when it goes "missing", and parses the name to determine the "intent" of the call (in this case "find" by two parameters: user_name and password).
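The name-parsing idea can be sketched without Rails.  This is not ActiveRecord's implementation; TinyModel, its RECORDS constant, and the in-memory lookup are all assumptions made up for this example.  It only shows how a "find_by_x_and_y" name can be split into attribute names and paired with the call's arguments.

```ruby
# Hypothetical in-memory "model"; not how ActiveRecord is implemented.
class TinyModel
  RECORDS = [
    { "user_name" => "alice", "password" => "secret" },
    { "user_name" => "bob",   "password" => "hunter2" }
  ]

  def self.method_missing(name, *args)
    match = /\Afind_by_(.+)\z/.match(name.to_s)
    return super unless match

    # "user_name_and_password" -> ["user_name", "password"]
    attributes = match[1].split("_and_")
    # Pair each attribute name with the argument in the same position
    criteria = Hash[attributes.zip(args)]
    RECORDS.find do |record|
      criteria.all? { |key, value| record[key] == value }
    end
  end

  # Keep respond_to? consistent with the names handled above
  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

p TinyModel.find_by_user_name_and_password("bob", "hunter2")
p TinyModel.find_by_user_name("carol") # no match, so nil
```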

Having liked the pattern, I've used it on a couple of my personal programming projects.  After my second implementation, I decided that it was time to figure out some way to abstract the code into something more "reusable".  In Java and C#, we would naturally create some base class (gross).  Thankfully, Ruby supports mixins, so I've created a very tiny implementation of the pattern as a mixin you can include in your classes.  I also took the pattern one step further and borrowed a technique from ActiveRecord, in which the class is "monkey patched" to permanently add the functionality (in my case, a "getter" and a "setter").  This avoids the overhead of lookups on ancestor classes, as well as any processing within method_missing, on every later call.
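The core of that trick (handle the first call in method_missing, then define real methods so later calls bypass it) can be sketched in a few lines.  This is a stripped-down illustration using a hypothetical Bag class, not the mixin itself, and it accepts any method name without validation:

```ruby
# Hypothetical class; a condensed illustration of "patch on first miss".
class Bag
  def initialize
    @values = {}
  end

  def method_missing(name, *args)
    key = name.to_s.chomp("=")
    # Define a real getter and setter on the class...
    self.class.send(:define_method, key) { @values[key] }
    self.class.send(:define_method, "#{key}=") { |value| @values[key] = value }
    # ...then replay the original call against the new methods.
    send(name, *args)
  end
end

bag = Bag.new
bag.color = "red"
puts bag.color                                    # prints "red"
puts Bag.instance_methods(false).include?(:color) # prints "true"
```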

Here is the source code for the DynamicProperties module/mixin:

module Berico

  # Enable the use of properties on a class
  # that are not explicitly defined within
  # the class.  These properties exist as a
  # hash on the object, which can be accessed
  # via the instance property "properties" or
  # by calling the property's name (key) on
  # the object.  Properties can also be
  # dynamically set and added to the hash by
  # invoking the "{property_name}=" method.
  module DynamicProperties

    attr_reader :properties, :configured

    # Called on the first method_missing invocation.
    # Configure the mixin using configuration
    # properties from the class including the mixin,
    # or use default configuration if those properties
    # are missing.
    # @param configuration [Object] (optional) config hash
    def configure(configuration = {})
      # Initialize the Property Bag
      @properties = {}

      # default configuration
      @_configuration = {}

      # has the config been checked?
      # since we rely on state from an initialized
      # object or class, and can't guarantee that
      # the info has been applied before including
      # this module, we need to lazy-load the functionality
      # on the first method_missing call.
      # This is the flag that will tell us whether
      # that was performed.
      @configured = true

      # Merge existing properties if the configuration
      # hash has a :properties key
      if configuration.has_key? :properties
        configuration[:properties].each do |k, value|
          key = (k.instance_of? Symbol)? k.to_s : k
          @properties.store(key, value)
        end
      end

      # If the class we are mixing
      # has supplied configuration
      # details for the mixin
      unless configuration == {}
        configuration = {} unless self.config_valid?(configuration)
      end
      # Create a parser for the property name
      @name_parser = create_name_parser(configuration)
      true
    end

    # Create a lambda that will parse the correct
    # property name based on the supplied naming strategy
    # (found in the config hash).
    # @param config [Hash] configuration of the property parser;
    #  options include a prefix, suffix, regex (or identity)
    # @return [Lambda] property name parser (default is identity)
    def create_name_parser(config)
      # Regex Matcher
      return lambda do |method_name|
          return $1 if method_name =~ config[:matcher]
        end if config.has_key? :matcher
      # Prefix Parser
      return lambda do |method_name|
          return method_name.sub(config[:prefix], "") if method_name.start_with? config[:prefix]
        end if config.has_key? :prefix
      # Suffix Parser
      return lambda do |method_name|
          return method_name.chomp(config[:suffix]) if method_name.end_with? config[:suffix]
        end if config.has_key? :suffix
      # Identity Parser
      lambda { |method_name| return method_name }
    end

    # Is the supplied configuration valid
    # for the DynamicProperties mixin?
    # @param config [Hash] Hash of config properties
    def config_valid?(config)
      valid_for_key? config, :prefix, String or
          valid_for_key? config, :suffix, String or
          valid_for_key? config, :matcher, Regexp
    end

    # Is the configuration valid for the given key
    # @param config_hash [Hash] Configuration
    # @param key [String or Symbol] Key to look up
    # @param class_type [Class] class the value should be
    # @return [TrueClass or FalseClass] whether the key is valid
    def valid_for_key?(config_hash, key, class_type)
      # Hash has the key
      if config_hash.has_key? key
        # Value is the right type
        config_hash[key].instance_of? class_type
      end
    end

    # Here's the magic! Every time a method
    # goes missing, we will test the method name
    # to see if it matches our requirements.
    # If the requirements are a match, we read or
    # write the property; otherwise we call super.
    def method_missing(name, *args)
      if not @configured
        @configured = configure
      end
      # By default, we are getting properties
      mode = :getter
      # if the method name is a symbol,
      # convert it to a string,
      # otherwise, clone the name string
      # (we're going to modify it)
      method = (name.instance_of? Symbol) ? name.to_s : name.clone
      # If this is a setter
      if method.end_with? "="
        # remove the "=" sign
        method.chomp! "="
        # change the mode to set
        mode = :setter
      end
      # get the property name
      property_name = @name_parser.call(method)
      # if the property name is null, call the
      # base object's method_missing
      if property_name.nil?
        super
      else
        # Monkey Patch the property so the next
        # call doesn't go "missing"
        self.patch_property property_name
        # if we are dealing with a getter
        if mode == :getter
          if @properties.has_key? property_name
            return @properties[property_name]
          else
            super
          end
        # else, this is a setter!
        else
          # create the property
          @properties[property_name] = args[0]
        end
      end
    end

    # Monkey patch the existing class to have
    # the property (thereby not incurring the
    # overhead of a method_missing call)
    # @param method_name [String] name of the method
    # to add to the class.
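    def patch_property(method_name)
      # class_eval already evaluates the string in the
      # context of the class, so no "class ... end"
      # wrapper is needed (a wrapper would define a
      # nested class instead of patching this one).
      self.class.class_eval %Q{
        def #{method_name}
          @properties['#{method_name}']
        end
        def #{method_name}=(value)
          @properties['#{method_name}'] = value
        end }
    end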

  end
end

Now you can use properties dynamically by including the mixin.  The following is a very simple example of the pattern in action:

require_relative 'dynamic_properties'
require "date"

class WeatherObservation
  include Berico::DynamicProperties

  def initialize
    @date_time = ::DateTime.now
  end

  def to_s
    output = "Weather Observation \n"
    output << "  Time: #{@date_time}\n"
    @properties.each do |key, value|
       output << "  #{key}: #{value}\n"
    end
    output
  end
end

observation = WeatherObservation.new

observation.temperature = 75
observation.dew_point = 25
observation.wind_speed = 10
observation.wind_dir = 270
observation.visibility = 10
observation.sky_con = :clear
observation.altimeter = 29.92

puts observation

Like any good Rubyist, I have a battery of RSpec tests verifying the mixin performs as advertised.  I'm hosting the project as a part of a Commons Repository for Ruby:

https://github.com/berico-rclayton/berico-ruby-common

Please let me know what you think!  I'm also interested if someone has already done the same thing (please let me know!).

Good Luck and Happy Coding.

Richard

Footnotes:
  1. Cited in: http://blog.peepcode.com/tutorials/2010/what-pythonistas-think-of-ruby
  2. http://api.rubyonrails.org/classes/ActiveRecord/Base.html

Saturday, August 27, 2011

Ruby for you "Static Types"

Over the last month, I decided to make an honest attempt to "really" learn a new language.  For a while I had been flirting with Scala, Clojure, and Python as potential alternatives to C# and Java, but none of the languages really fit my personality and goals as a programmer.  Luckily,  a serendipitous stream of events led me to exploring the Ruby ecosystem.

I suggested the language to a couple of "static types", particularly my friend John (whom I am always referring to), and many of them scoffed.  Most had preconceptions of what Ruby is and isn't without actually knowing much of the language and its capabilities.  John even suggested that Ruby was a language for people who program with their "pinky in the air" (for those who don't get the slight, he's implying that Rubyists are pretentious).

I want to take the opportunity to show developers heavily ingrained in the Java and C#/VB communities some of the advanced capabilities of Ruby.  Keep in mind that I still haven't even achieved Padawan status as of this post (so if you see a better way to do something, please let me know).

INSTALLING RUBY

The best way to install Ruby on Linux or OS X is with RVM (and not by downloading it from the main site):
  1. Install Curl (if it's not already installed).
  2. Install Git (http://git-scm.com/).
  3. Install RVM (Ruby Version Manager)
    bash < <(curl -s https://rvm.beginrescueend.com/install/rvm)
    
  4. Use RVM to install Ruby for you!
    # Show available distros and versions (there are a lot!)
    rvm list known
    # Select the latest version of the core distro
    rvm install 1.9.2
    # Create a Gem set (think namespace for the package manager)
    rvm --create use 1.9.2@roundhouse
    
If you are using Windows, use the installer from the Ruby website. You will also want to download RubyGems, the de facto package manager for Ruby.

When your install is finished, check that it works from the shell/prompt:
ruby --version
# You should see something like this (I'm on OSX):
# ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin10.8.0]

Running a Ruby script is extremely simple.  Ruby scripts are simply text files with the "rb" extension (by convention). To run a script, use the "ruby" command followed by the file you want to run:
ruby myscript.rb

Now you should know everything you need to know to run the examples.

LANGUAGE DEMONSTRATION

I'm not going to show you basic Ruby syntax.  You will figure it out on your own.  The goal of this guide is simply to show you some of the more advanced language features.


Everything is an object (absolutely freaking everything!).

There are no primitives.

Consider this code:
puts 1.class
puts 3.14.class
puts true.class
puts nil.class
puts "String".class

class Clazz
end

puts Clazz.class

module MyModule
end

puts MyModule.class

Output (on Ruby 1.9.2; Ruby 2.4 and later print Integer instead of Fixnum):
Fixnum
Float
TrueClass
NilClass
String
Class
Module
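Because these values are real objects, you can call methods directly on literals.  A small sketch using only core methods:

```ruby
# Operators are ordinary method calls on the receiver
puts 1.+(2)                       # prints "3"

# Literals answer reflection questions like any other object
puts 3.14.respond_to?(:round)     # prints "true"

# Even nil has methods
puts nil.to_a.inspect             # prints "[]"

# Numbers take blocks, too
3.times { |i| puts "pass #{i}" }  # prints "pass 0", "pass 1", "pass 2"
```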

Extension Methods?

Since everything is an object, and because Ruby is awesome, we can extend these basic language constructs:
class String

  def word_count
    words = self.split(" ").length
  end
end

word_count = "here is my string".word_count

# "<<" is the concatenation operator for strings.
# It treats an Integer argument as a character code,
# so convert the count with to_s first (pretty gross)
puts "The word count was " << word_count.to_s
And the output is:
The word count was 4
But we are not limited to extending Strings; we can extend the actual "Class" class, the "Module" class, etc. This ability is similar to "extension methods" in C#, except that Ruby lets us change far more: we can add, replace, or remove methods on the original class itself.
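As a sketch of what extending "Class" itself looks like, the example below reopens Class and adds a helper that every class then inherits.  The lineage method and the Animal/Dog classes are hypothetical names invented for this example.

```ruby
class Class
  # Hypothetical helper: names of this class and its superclasses
  # (modules such as Kernel are filtered out).
  def lineage
    ancestors.select { |ancestor| ancestor.is_a?(Class) }.map(&:name)
  end
end

class Animal; end
class Dog < Animal; end

puts Dog.lineage.first(2).inspect # prints ["Dog", "Animal"]
```

Every class object is an instance of Class, so the new method is available on Dog, String, and any class defined later.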

Operator Overloading

Just like C#, but very unlike Java, Ruby enables operator overloading:

class String

  def word_count
    words = self.split(" ").length
  end

  def -(value)
    case value
      when String
        return self.delete value
      when Integer
        return self.slice(value..self.length)
    end
    nil
  end

end

word_count = "here is my string".word_count
puts "The word count was " << word_count.to_s

puts "Mississippi" - "i"
puts ("Hello World" - 6)
And we see on the console:
The word count was 4
Msssspp
World
Unlike C#, Ruby has a lot more available operators that can be overloaded, including: |, ^, &, <=>, ==, ===, =~, >, >=, <, <=, +, -, *, /, %, **, <<, >>, ~, +@, -@, [], []=, !=, !~ (thanks to: http://www.zenspider.com/Languages/Ruby/QuickRef.html)

In many cases, the meanings of these operators in the context of your class are completely up to you (I mean, what the hell does this "!~" mean to you? 'not approximately'?). Of course, Ruby is a language for "big boys"; so define these at your own risk.
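One operator that pays off quickly is "<=>".  If a class defines it and includes the core Comparable module, Ruby derives <, <=, ==, >=, > and between? for free.  The Version class below is a hypothetical example, not part of any library:

```ruby
# Hypothetical value class for illustration.
class Version
  include Comparable
  attr_reader :parts

  def initialize(text)
    @parts = text.split(".").map(&:to_i)
  end

  # Comparable builds the other comparison operators on top of <=>.
  # Arrays compare element by element, so [1, 9, 2] sorts before [1, 10, 0].
  def <=>(other)
    parts <=> other.parts
  end

  def to_s
    parts.join(".")
  end
end

older = Version.new("1.9.2")
newer = Version.new("1.10.0")

puts older < newer       # prints "true"
puts [newer, older].min  # prints "1.9.2"
```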

Powerful Strings

String literals can be defined in a number of ways in Ruby. Typically, the way you choose is dependent on the characters you want to use.
# I need double quotes in my string
puts '"Hello World"'
# I need single quotes
puts "'Hello World'"
# I need both
puts %q{"Hello World" and 'Hello World'}
# Another alternative
puts %Q{"Hello World" and 'Hello World'}
# I need a multiline, any character string
puts <<EOF
Here is a multiline string with " and ' escaped!
I can have more content on this line
EOF
Console output:
"Hello World"
'Hello World'
"Hello World" and 'Hello World'
"Hello World" and 'Hello World'
Here is a multiline string with " and ' escaped!
I can have more content on this line
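The two percent-literals above print the same thing only because the text contains nothing to interpolate.  The difference shows up as soon as it does: %q behaves like a single-quoted string, and %Q like a double-quoted one.

```ruby
name = "World"

# %q is like single quotes: no interpolation
puts %q{Hello #{name}} # prints: Hello #{name}

# %Q is like double quotes: interpolation happens
puts %Q{Hello #{name}} # prints: Hello World
```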
C# and Java both have their own special conventions for "formatted strings".  Some libraries in Java even handle complex expression-based logic (MVEL and SPEL).  Ruby natively supports this out of the box:
# Wow, native expressions...
answer = 42
puts "The answer to life, the universe, and everything is #{answer}"

# Expressions are not simply variable placeholders
puts "Strings are more powerful in Ruby = #{ Time.now }"

# What?!
puts "#{ int1 = gets.to_i } + #{ int2 = gets.to_i } = #{int1 + int2}"
Output (for the last expression, I'm going to input the integers "1" and "2" in the console):
The answer to life, the universe, and everything is 42
Strings are more powerful in Ruby = 2011-08-27 21:40:15 -0400
1 + 2 = 3

Flexible Typing

Probably the most unnerving thing to the "Static Types" is the lack of interfaces.  This might lead people to believe that Ruby is full of complex class hierarchies.  This is simply not the truth!   Ruby follows the "Duck Typing" philosophy: "if it looks like a duck and quacks like a duck, it's a duck!".

Here is an example of how it works:
class Duck
  def quack
    puts "Duck goes 'Quack'!"
  end
end

class Goose
  def quack
    puts "Goose goes 'Quack'!"
  end
end

# Array of Waterfowl
waterfowl = [ Duck.new, Goose.new ]

# "quack" each fowl
waterfowl.each {|fowl| fowl.quack }
And our output:
Duck goes 'Quack'!
Goose goes 'Quack'!
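When you can't be sure an object "quacks", the usual check is respond_to? rather than a type test.  A minimal sketch, with hypothetical Mallard and Robot classes and a made-up make_noise helper:

```ruby
# Hypothetical classes for illustration.
class Mallard
  def quack
    "Quack"
  end
end

class Robot
  def beep
    "Beep"
  end
end

# Ask what the object can do, not what it is.
def make_noise(thing)
  if thing.respond_to?(:quack)
    thing.quack
  else
    "#{thing.class} cannot quack"
  end
end

puts make_noise(Mallard.new) # prints "Quack"
puts make_noise(Robot.new)   # prints "Robot cannot quack"
```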

Metaprogramming (Reflection on Steroids)

Ruby is a phenomenally reflective language.  The language provides "hooks" you can exploit within the language to do all sorts of neat things that don't really have an equivalent in Java or .NET.  These hooks allow Ruby to implement patterns that also have no equivalent in most static languages.

Variants of the Delegation Pattern

In this pattern, we are going to delegate an action to the appropriate class capable of handling the provided "context".  This is similar to the handler pattern, with one caveat.  Instead of adding the handler implementations manually to some container/composite class, we are going to use inheritance (in a very strange way).  In this case, a super class is going to register all derived classes as they are declared!
require 'forwardable'

# Weapon
class Weapon

  # Class-level instance variable to store
  # instances of the known weapons
  @known_weapons = []

  # Open the singleton class to expose a
  # reader for that variable on Weapon itself
  class << self
    attr_reader :known_weapons
  end

  # Capture inheriting classes
  # and add them to the list of
  # known weapons
  def self.inherited(klass)
    puts "I've been inherited by #{klass.to_s}"
    # Creating a new instance
    Weapon.known_weapons << klass.new
  end

  # Attack our enemy
  def attack(enemy)
    Weapon.known_weapons.each do |weapon|
      weapon.dispatch enemy if weapon.appropriate? enemy
    end
  end

  # Is the weapon appropriate for the given enemy
  # Note: Subclasses must implement
  def appropriate?(enemy)
    raise NotImplementedError, "Subclasses must implement"
  end

  # Dispatch the enemy
  # Note: Subclasses must implement
  def dispatch(enemy)
    raise NotImplementedError, "Subclasses must implement"
  end

end

# Katana Blade
class KatanaBlade < Weapon

  def appropriate?(enemy)
    enemy == 'Ninja'
  end

  def dispatch(enemy)
    puts "#{enemy} has been sliced by the Katana Blade"
  end

end

# Ninja Stars
class NinjaStars < Weapon

  def appropriate?(enemy)
    enemy == 'Dragon'
  end

  def dispatch(enemy)
    puts "#{enemy} has been slain by my stars"
  end

end

weapon = Weapon.new

weapon.attack('Ninja')
weapon.attack('Dragon')
As we call the attack method on the weapon object, the calls are delegated to the class that can handle a particular "enemy".
I've been inherited by KatanaBlade
I've been inherited by NinjaStars
Ninja has been sliced by the Katana Blade
Dragon has been slain by my stars
The magic is found in the "self.inherited" method.  Every time a class inherits from a class with this method defined, the method is called with a reference to the derived class.  Please note that a method defined with the "self." prefix is known as a "class method" (similar to static methods in C# and Java, except that it is actually a singleton method of that particular "Class" object).
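A smaller variation on the same hook, stripped of the weapon logic, makes both points visible.  The Plugin, CsvPlugin, and XmlPlugin names are hypothetical, and this version registers the subclasses themselves rather than instances of them:

```ruby
# Hypothetical base class that records its subclasses.
class Plugin
  @registry = []

  class << self
    attr_reader :registry

    # Called by Ruby each time a class inherits from Plugin
    def inherited(subclass)
      super
      Plugin.registry << subclass
    end
  end
end

class CsvPlugin < Plugin; end
class XmlPlugin < Plugin; end

puts Plugin.registry.inspect                      # prints "[CsvPlugin, XmlPlugin]"
# Class methods live on the Plugin object itself
puts Plugin.singleton_methods.include?(:registry) # prints "true"
```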


Pure Awesome Pattern

OK, that's not its real name.  To be honest, I don't even know what you would call this, but it is "pure awesome".  This is why C# and Java are 'Legos', and Ruby is 'Play-Doh'.

Imagine writing in a language where your model is a part of the language!  I'd like to think that LINQ gets you to the 30 yard-line, but Ruby will score the touchdown:
require 'forwardable'

module Vehicles

  class Make

    attr_reader :company

    def initialize(company)
      @company = company
    end

    def method_missing(name, *args)
      Model.new(name, args[0], self)
    end

  end

  class Model
    extend Forwardable

    attr_reader :model, :year, :make

    def_delegators :@make, :company

    def initialize(model, year, make)
      @model = model
      @year = year
      @make = make
    end

    def find
      Car.new(self)
    end
  end

  class Car
    extend Forwardable
    def_delegators :@model_container, :company, :model, :year, :make

    def initialize(model)
      @model_container = model
    end

    def to_s
      "#{year} #{company} #{model}"
    end
  end

  class CarCatalog
    def method_missing(name, *args)
      Make.new(name)
    end
  end
end

catalog = Vehicles::CarCatalog.new
car = catalog.Nissan.Frontier(2007).find
puts car
And we get:
2007 Nissan Frontier
Right now you are probably thinking, WTF!? The magic is in the "method_missing" function. Whenever a method is called on an object and it does not exist, Ruby will look to see if the object's class defines the "method_missing" function. If it does, Ruby will send the name of the "method that is missing" along with its parameters to that method.

From there, we can do whatever we want. In this case, we use the name of the method that was missing as the name of our car "make" and "model" (here, "Nissan" and "Frontier"). If you would like a better (perhaps a bit more practical) example, look at the MongoDB driver for Ruby, which is quite elegant (and threw me for a complete loop when I first saw it).
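One caveat worth sketching: a catch-all method_missing like the ones above accepts any name and leaves respond_to? reporting false for the methods it handles.  A common remedy is to accept only the names you expect, call super for the rest, and define respond_to_missing? alongside it.  The Catalog class and its KNOWN_MAKES list are assumptions for this example only:

```ruby
# Hypothetical catalog that only answers to a fixed list of makes.
class Catalog
  KNOWN_MAKES = [:Nissan, :Toyota]

  def method_missing(name, *args)
    return "Make: #{name}" if KNOWN_MAKES.include?(name)
    super # anything else is still a NoMethodError
  end

  # Keeps respond_to? in step with method_missing
  def respond_to_missing?(name, include_private = false)
    KNOWN_MAKES.include?(name) || super
  end
end

catalog = Catalog.new
puts catalog.Nissan                # prints "Make: Nissan"
puts catalog.respond_to?(:Nissan)  # prints "true"
puts catalog.respond_to?(:Yugo)    # prints "false"
```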

Finalize, Disposed...Meet "at_exit"

The final "hook" I would like to demonstrate is the "at_exit" global method, which runs a "code block" when the program is terminating.  A "code block" is a Ruby construct similar to a "lambda" in C#.  Like lambdas, code blocks close over the variables around them.

at_exit do
  puts "Program has ended"
end

class Apocalypse

  def initialize(message)
    @message = message
    at_exit { sound_horn }
  end

  def sound_horn
    puts @message
  end
end

Apocalypse.new("The end is nigh!")

puts "Right before termination"
Here's the output; pay special attention to the order (at_exit handlers run in reverse order of registration):
Right before termination
The end is nigh!
Program has ended
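The remark that blocks keep the scope of the variables around them is what lets the at_exit block inside Apocalypse call sound_horn long after initialize has returned.  The same behaviour can be sketched on its own; make_counter is a hypothetical helper written for this example:

```ruby
# Returns two lambdas that share the same local variable.
def make_counter
  count = 0
  increment = lambda { count += 1 }
  read = lambda { count }
  [increment, read]
end

increment, read = make_counter
3.times { increment.call }

# "count" is out of scope here, yet both lambdas still see it
puts read.call # prints "3"
```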

CONCLUSION

So that's a tidbit of Ruby.  I must mention that I only showed you a fraction of the language, and I am only in my first month of study.  I hope that if you are wandering like I was (in search of a language), that you might find a home in the Ruby community.  Ruby is an extremely powerful language, and despite common misconceptions, can be incredibly fast depending on the Ruby runtime (and there are several).  More importantly, the Ruby community is phenomenal.  Many of the best Java and C# libraries being fielded today are ports of Ruby frameworks.

If you are an experienced programmer and looking to get a jump start into the Ruby ecosystem, I highly recommend that you skip the beginners books and jump directly into (I should add that Russ Olsen is a pretty funny read in addition to providing excellent content):

Sunday, July 3, 2011

Introducing Microsoft's Reactive Extensions (Rx) Framework

In case you are wondering what all of the "hoopla" is about, Microsoft's newest framework Reactive Extensions (Rx) is a set of libraries simplifying the construction of event-based applications in an asynchronous environment.  Rx quite literally unifies the .NET event model with a Complex Event Processor (CEP), where the query and filter language is LINQ!

Released a few months ago (though it's been around for a while in the "experimental" state), Reactive Extensions has been ridiculously underpromoted.  I think the problem is that many of us software engineers have been unclear as to what it is, or how the framework fits into our architectures.  Perhaps because of the confusion, the Rx team produced a series of tutorials on Channel 9 explaining the "how and why" of the framework: http://channel9.msdn.com/Series/Rx-Workshop.

I took the time this weekend to review the 2+ hours' worth of video content and distill a couple of high-level examples for your viewing pleasure.
Before we get started, please ensure you have downloaded the Reactive Extensions framework (directly or via NuGet) and added a reference to the assembly in your project.

Let's begin with the "Hello World" equivalent of Rx:
//Any IEnumerable can be converted into an IObservable stream.
//For instance, consider the string "Hello World".
IObservable<char> obs = "Hello World".ToObservable();

//We will subscribe onto the observable a lambda expression that will
//print each character (as it is observed) to the console.
obs.Subscribe((ob) => { Console.WriteLine("Observing '{0}'", ob); });
Console Output
Observing 'H'
Observing 'e'
Observing 'l'
Observing 'l'
Observing 'o'
Observing ' '
Observing 'W'
Observing 'o'
Observing 'r'
Observing 'l'
Observing 'd'
In the previous example, we demonstrated how an IEnumerable can be converted into a stream of observable objects. On the next line of code, we subscribe to this stream using a lambda (Action<char>) that simply prints each character to the console.

Reactive Extensions, in one way, is a formalization by Microsoft of the familiar Pub/Sub (Eventing) model. But to describe the framework as some fancy "Pub/Sub" system would be insulting. In the next example, we will demonstrate how Rx also provides an abstraction to asynchronous programming.

//Invoke an action asynchronously
var o = Observable.Start(
  //Here is our Action (returns void); Start exposes it as an IObservable<Unit>
  () => {
    Console.WriteLine("Starting");
    Thread.Sleep(3000);
    Console.WriteLine("Done");
  }
);

//Block until the asynchronous action
//is done executing (or more appropriately,
//block until the first element of the IObservable<Unit>
//is returned).
o.First();
Console Output
Starting
Done
In the next example, we'll demonstrate how to use Pub/Sub in Rx without relying on an IEnumerable. Rx provides an interface, ISubject, and an implementation, Subject, that is both an IObservable and an IObserver. This means that Subject objects can both produce events and be subscribed to. We will use a subject to register handlers and then feed objects into the Subject's data stream.

//Subjects are observable sequences as well as observers.
//You can subscribe lambdas onto the subject and add data
//to the stream.
ISubject<int> subject = new Subject<int>();

//Subscribe onto our subject with three lambdas:
// OnNext, OnError, and OnCompleted.
subject.Subscribe<int>(
    //On Next
    (item) => {
        Console.WriteLine("Next item: {0}", item);
    }, 
    //On Error
    (ex) => {
        Console.WriteLine("An exception occurred: {0}", ex.Message);
    }, 
    //On Completed
    () => {
        Console.WriteLine("Completed.");
    });

//Now let's manually add data to the subject stream
for (int i = 0; i < 10; i++)
{
    //Add data by calling OnNext
    subject.OnNext(i);
}
//Signal completion by calling OnCompleted
subject.OnCompleted();
Console Output
Next item: 0
Next item: 1
Next item: 2
Next item: 3
Next item: 4
Next item: 5
Next item: 6
Next item: 7
Next item: 8
Next item: 9
Completed.
So far the demos have been quite trivial. We'll conclude with a much more practical demo using Rx and WPF.

In this example, we are going to capture the total CPU utilization % every second and plot that utilization on a line chart (powered by Visiblox).

The XAML is pretty standard:
<Window x:Class="ReactiveUI.MainWindow"
  xmlns=
     "http://schemas.microsoft.com/winfx/2006/xaml/presentation"
  xmlns:x=
     "http://schemas.microsoft.com/winfx/2006/xaml"
  xmlns:charts=
     "clr-namespace:Visiblox.Charts;assembly=Visiblox.Charts"
  Title="Processor Utilization" Height="300" Width="600">
    <Grid>
  <charts:Chart Name="chart" Width="550" 
                   Title="Total CPU Utilization" 
          Background="Transparent" Margin="0" 
          LegendVisibility="Collapsed" >
      <!-- Add zooming and a trackball -->
      <charts:Chart.Behaviour>
    <charts:BehaviourManager x:Name="behaviourManager" 
                             AllowMultipleEnabled="True">
        <charts:TrackballBehaviour x:Name="track" />
        <charts:ZoomBehaviour />
    </charts:BehaviourManager>
      </charts:Chart.Behaviour>
      <!-- Define x and y axes. -->
      <charts:Chart.XAxis>
    <charts:DateTimeAxis ShowMinorTicks="False" 
                            ShowGridlines="False">
        <charts:DateTimeAxis.Range>
      <charts:DateTimeRange />
        </charts:DateTimeAxis.Range>
    </charts:DateTimeAxis>
      </charts:Chart.XAxis>
      <charts:Chart.YAxis>
    <charts:LinearAxis LabelFormatString="0'%" 
                          ShowMinorTicks="True" 
           ShowGridlines="True" Title="Utilization">
        <charts:LinearAxis.Range>
      <charts:DoubleRange Minimum="0" Maximum="110"/>
        </charts:LinearAxis.Range>
    </charts:LinearAxis>
      </charts:Chart.YAxis>
  </charts:Chart>
    </Grid>
</Window>
Here is the "code-behind". Follow the inline comments for a description of what's happening.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Data;
using System.Windows.Documents;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using System.Windows.Navigation;
using System.Windows.Shapes;
using System.Diagnostics;
using System.Reactive.Linq;
using System.Reactive.Concurrency;
using Visiblox.Charts;

namespace ReactiveUI
{
  /// <summary>
  /// Interaction logic for MainWindow.xaml
  /// </summary>
  public partial class MainWindow : Window
  {
    /// <summary>
    /// The CPU Counter we will be using to monitor CPU utilization.
    /// </summary>
    private PerformanceCounter cpuCounter 
      = new PerformanceCounter(
              "Processor", "% Processor Time", "_Total");

    /// <summary>
    /// Default constructor for the Window
    /// </summary>
    public MainWindow()
    {
      //Initialize the GUI components
      InitializeComponent();

      //Set the min and max range for the chart.
      //Unfortunately, if we don't do this, the Visiblox
      //chart won't autoscale nicely (we need an interval
      //of at minimum a second!).
      IRange range2 = chart.XAxis.CreateRange();
      range2.Minimum = DateTime.Now;
      range2.Maximum = DateTime.Now.AddSeconds(60);
      chart.XAxis.Range = range2;

      //Now we are going to set up the observer.
      //We will keep a reference to the 
      //IObservable<CpuUtilizationInstant>
      //as "performanceObserver".
      var performanceObserver = 
        //Collect the "long" timespan offset
        //produced by the interval in the variable
        //"counter"
        from counter 
        //The "Interval" method will create
        //a collection of observable events,
        //with each event occurring at the defined
        //'interval'.
        in Observable.Interval(
          //We will define the interval
          //to be every second
          TimeSpan.FromSeconds(1), 
          //And since we want our subscribers
          //to be able to modify the state of the
          //UI, we need to be able to synchronize
          //between the UI thread and the event thread.
          //The Rx extension for WPF includes the
          //DispatcherScheduler for this exact
          //situation.
          new DispatcherScheduler(this.Dispatcher))
        //Finally, we want to create and return a new
        //reading of the CPU utilization (storing it
        //in our custom-made structure).
        select new CpuUtilizationInstant(
          DateTime.Now, cpuCounter.NextValue());

      /*
       // Without comments, it's simply:
       
       var performanceObserver = 
        from counter 
        in Observable.Interval(
          TimeSpan.FromSeconds(1), 
          new DispatcherScheduler(this.Dispatcher))
        select new CpuUtilizationInstant(
          DateTime.Now, cpuCounter.NextValue());
       */

      //Create the dataset for a new series 
      //to display on the chart
      DataSeries<DateTime, float> dataForSeries 
        = new DataSeries<DateTime, float>("Total CPU Utilization");

      //Create a line series
      LineSeries lineSeries = new LineSeries();
      //Bind the dataset to the line series
      lineSeries.DataSeries = dataForSeries;
      //Set the thickness of the line
      lineSeries.LineStrokeThickness = 1.5;
      //Add the line to the chart
      chart.Series.Add(lineSeries);

      //Now we will subscribe to the interval event
      //through our observer dubbed "performanceObserver"
      performanceObserver.Subscribe(
        //Handle the event using a lambda expression
        (cpuUtilizationInstant) => { 

          //Let's add a new data point on the chart
          dataForSeries.Add(
            new DataPoint<DateTime, float>(
              cpuUtilizationInstant.Instant, 
              cpuUtilizationInstant.PercentageUtilized));

          //And for debug purposes, send the instant
          //to the Debug output
          Debug.WriteLine(cpuUtilizationInstant);
      });

      
    }


    /// <summary>
    /// A simple container to hold the instantaneous reading of 
    /// CPU utilization.
    /// </summary>
    public struct CpuUtilizationInstant
    {
      /// <summary>
      /// The moment the CPU Utilization was recorded
      /// </summary>
      private DateTime instant;

      /// <summary>
      /// The utilization percentage at this recorded moment
      /// </summary>
      private float cpuUtilizationPercentage;

      /// <summary>
      /// Instantiate the struct with the time the instance was recorded
      /// and the percentage of CPU utilization at that moment.
      /// </summary>
      /// <param name="timeTaken">Time the utilization was recorded</param>
      /// <param name="cpuUtilizationPercentage">CPU Utilization (%)</param>
      public CpuUtilizationInstant(
                 DateTime timeTaken, float cpuUtilizationPercentage)
      {
        this.instant = timeTaken;
        this.cpuUtilizationPercentage = cpuUtilizationPercentage;

      }

      /// <summary>
      /// Get the instant the utilization was recorded
      /// </summary>
      public DateTime Instant { 
        get { return this.instant; } 
      }


      /// <summary>
      /// Get the percentage the CPU was utilized at
      /// this moment.
      /// </summary>
      public float PercentageUtilized { 
        get { return this.cpuUtilizationPercentage; } 
      }

      /// <summary>
      /// String representation of the current state of this
      /// struct instance.
      /// Ex Output: "At 11:06:04 AM, the Cpu was 5.086277% utilized." 
      /// </summary>
      /// <returns>String representation of the struct</returns>
      public override string ToString()
      {
        return string.Format("At {0}, the Cpu was {1}% utilized.", 
          this.instant.ToLongTimeString(), this.cpuUtilizationPercentage);
      }
    }
  }
}
The following is a video of the CPU Utilization demo in action!

Sample of the Debug Output
At 10:28:48 PM, the Cpu was 0% utilized.
At 10:28:49 PM, the Cpu was 23.15563% utilized.
At 10:28:50 PM, the Cpu was 22.47902% utilized.
At 10:28:51 PM, the Cpu was 33.51285% utilized.
At 10:28:52 PM, the Cpu was 42.08793% utilized.
At 10:28:53 PM, the Cpu was 44.70722% utilized.
At 10:28:54 PM, the Cpu was 33.75841% utilized.
At 10:28:55 PM, the Cpu was 30.23645% utilized.
At 10:28:56 PM, the Cpu was 31.96174% utilized.
At 10:28:57 PM, the Cpu was 25.83793% utilized.
At 10:28:58 PM, the Cpu was 29.00658% utilized.
Well, that's it! Reactive Extensions is an extremely powerful framework, and certainly a welcome feature on the .NET platform. In future posts, we'll discuss some of Rx's more powerful features, ending in a look at the new Reactive Extensions for JavaScript, a port of Rx to the browser.

Happy Coding!
Richard

Saturday, June 25, 2011

Calculating Similarity (Part 3): Damerau-Levenshtein Distance

I promised this post a while ago and have unfortunately been too busy to complete it.  I noticed a couple of people had searched Google explicitly for this post, so that encouraged me to complete it!

According to Wikipedia:
"Damerau–Levenshtein distance (named after Frederick J. Damerau and Vladimir I. Levenshtein) is a "distance" (string metric) between two strings, i.e., finite sequence of symbols, given by counting the minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters." [1]
In Computer Science, we commonly call algorithms like Damerau-Levenshtein Distance "Edit Distance", since the distance and resultant matrix tell us the number of transpositions, insertions, deletions, etc. necessary to make two strings identical.

The algorithm to calculate Damerau-Levenshtein is remarkably simple. Please follow the inline comments for a better understanding of how the algorithm works.

DamerauLevenshteinDistance.java

package com.berico.similarity;

/**
 * Damerau-Levenshtein Distance
 * Based on the algorithms provided at the following websites:
 * 
 * http://snippets.dzone.com/posts/show/6942
 * http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
 * 
 * @author Richard Clayton (Berico Technologies)
 * @date June 25, 2011
 */
public class DamerauLevenshteinDistance 
                   implements IDistanceCalculator {

  /**
   * Calculate the Damerau-Levenshtein Distance (edit distance)
   * between two strings.
   * @param source Source input string
   * @param target Target input string
   * @return The number of edits (insertions, deletions,
   *         substitutions and adjacent transpositions) it would
   *         take to make the source string identical to the target
   */
  public int calculate(String source, String target){
    //If both strings are empty, I'm of the opinion that
    //this is an error (technically the distance is zero).
    assert( !(source.isEmpty() && target.isEmpty()));
    
    //If the source string is empty, the distance is the
    //length of the target string.
    if(source.isEmpty()){
      return target.length();
    }
    
    //If the target string is empty, the distance is the
    //length of the source string.
    if(target.isEmpty()){
      return source.length();
    }
    
    //Delegate the calculation to the method that produces the matrix
    //and distance, but then only return the distance
    return calculateAndReturnFullResult(source, target).getDistance();
  }
  
  /**
   * Perform the distance calculation, but also return the
   * resulting matrix and distance.
   * @param source Source input string
   * @param target Target input string
   * @return A simple object with the matrix and distance
   */
  public DameauLevenshteinDistanceResult 
           calculateAndReturnFullResult(String source, String target){

    //If both strings are empty, I'm of the opinion that
    //this is an error (technically the distance is zero).
    assert( !(source.isEmpty() && target.isEmpty()));
    
    //We are going to construct a matrix of distances
    int[][] distanceMatrix 
      = new int[source.length() + 1][target.length() + 1];
    
    //We need indexers from 0 to the length of the source string.
    //This sequential set of numbers will be the row "headers"
    //in the matrix.
    for(int sourceIndex = 0; 
      sourceIndex <= source.length(); 
      sourceIndex++){
      
      //Set the value of the first cell in the row
      //equivalent to the current value of the iterator
      distanceMatrix[sourceIndex][0] = sourceIndex;  
    }
    
    //We need indexers from 0 to the length of the target string.
    //This sequential set of numbers will be the 
    //column "headers" in the matrix.
    for(int targetIndex = 0;
      targetIndex <= target.length();
      targetIndex++){
      
      //Set the value of the first cell in the column
      //equivalent to the current value of the iterator
      distanceMatrix[0][targetIndex] = targetIndex;
    }
    
    //We'll use this to add a penalty
    //to some operations.
    int cost = 0;
    
    //Iterate over all characters in the source
    //string.
    for(int sourceIndex = 1; 
    sourceIndex <= source.length(); 
    sourceIndex++){
      
      //Iterate over all characters in the target
      //string.
      for(int targetIndex = 1;
      targetIndex <= target.length();
      targetIndex++){
        
        //If the current characters in both strings are equal
        if(source.charAt(sourceIndex - 1)
               == target.charAt(targetIndex - 1))
        {
          //There is no penalty.
          cost = 0;
        }
        else 
        {
          //Not equal, there is a penalty.
          cost = 1;
        }
        
        //We want to find the current distance by determining
        //the shortest path to a match (hence the 'minimum'
        //calculation on distances).
        distanceMatrix[sourceIndex][targetIndex] 
          = minimum(
           //Deletion: drop the current character of the source
           distanceMatrix[sourceIndex - 1][targetIndex] + 1, 
           //Insertion: add the current character of the target
           distanceMatrix[sourceIndex][targetIndex - 1] + 1,
           //Substitution (or a match, when the cost is zero)
           distanceMatrix[sourceIndex - 1][targetIndex - 1] + cost);
        
        //We don't want to do the next series of calculations on
        //the first pass because we would get an index out of bounds
        //exception.
        if(sourceIndex == 1 || targetIndex == 1){
          continue;
        }
        
        //transposition check (if the current and previous 
        //character are switched around (e.g.: t[se]t and t[es]t)...
        if(source.charAt(sourceIndex - 1) 
              == target.charAt(targetIndex - 2)
          && source.charAt(sourceIndex - 2) 
              == target.charAt(targetIndex - 1)){
          
          //What's the minimum cost between the current distance
          //and a transposition.
          distanceMatrix[sourceIndex][targetIndex] 
            = minimum(
               //Current cost
             distanceMatrix[sourceIndex][targetIndex],
             //Transposition
             distanceMatrix[sourceIndex - 2][targetIndex - 2] + cost);
        }
      }
    }
    
    //Return the matrix and distance as the result
    return new DameauLevenshteinDistanceResult(distanceMatrix);
  }
  
  /**
   * Calculate the minimum value from an array of values.
   * @param values Array of values.
   * @return minimum value of the provided set.
   */
  private static int minimum(int... values){
    
    //Hopefully, everything should be smaller
    //than the max int value!
    int currentMinimum = Integer.MAX_VALUE;
    
    //Iterate over all provided values
    for(int value : values){
      
      //Take the minimum value between the current
      //minimum and the current value of the
      //iteration
      currentMinimum = Math.min(value, currentMinimum);
    }
    
    //return the minimum value.
    return currentMinimum;
  }
  
  /**
   * Simple container for the result of the Damerau-Levenshtein
   * Distance calculation
   * @author Richard Clayton (Berico Technologies)
   * @date June 25, 2011
   */
  public class DameauLevenshteinDistanceResult {
    
    //Distance matrix
    private int[][] distanceMatrix;
    
    /**
     * Instantiate the object with the resulting distance matrix
     * @param distanceMatrix Matrix of distances between edits
     */
    public DameauLevenshteinDistanceResult(int[][] distanceMatrix){
      this.distanceMatrix = distanceMatrix;
    }

    /**
     * Get the Distance Matrix
     * @return Matrix of edit distances
     */
    public int[][] getDistanceMatrix() {
      return distanceMatrix;
    }
    
    /**
     * Get the Edit Distance
     * @return number of changes to make before
     *         both strings are identical
     */
    public int getDistance(){
      return 
        distanceMatrix
          [distanceMatrix.length - 1][distanceMatrix[0].length - 1];
    }

    /**
     * Get a string representation of this class
     * @return A friendly display of the distance and matrix
     */
    @Override
    public String toString() {
      
      StringBuilder sb = new StringBuilder();
      
      sb.append(
         String.format(
           "Distance: %s \n", this.getDistance()));
      sb.append("Matrix: \n\n");
      
      for(int i = 0; i < this.distanceMatrix.length; i++){
        
        sb.append("| ");
        
        for(int j = 0; j < this.distanceMatrix[0].length; j++){
        
          sb.append(String.format("\t%s", this.distanceMatrix[i][j]));
        }
        
        sb.append(" |\n");
      }
      
      return sb.toString();
    }
  }
}

Here are some examples of using the distance calculator.

IDistanceCalculator distanceCalc = new DamerauLevenshteinDistance();
      
String distOne = "snapple";
String distTwo = "apple";
      
int editDistance = distanceCalc.calculate(distOne, distTwo);
      
System.out.println(
  String.format("The distance between %s and %s is %s",
    distOne, distTwo, editDistance));

And the output from the console:

The distance between snapple and apple is 2
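
The "snapple"/"apple" pair only exercises deletions, so here is a small, self-contained sketch that isolates the transposition step. It is not the class above: the name TranspositionSketch and the allowTransposition flag are made up for illustration. Like the class above, it only considers swaps of two adjacent characters, a variant often called the "optimal string alignment" distance.

```java
public class TranspositionSketch {

  // Edit distance with an optional adjacent-transposition step.
  // With allowTransposition == false this is plain Levenshtein distance.
  public static int distance(String source, String target,
                             boolean allowTransposition) {
    int[][] d = new int[source.length() + 1][target.length() + 1];

    // First column and first row: distance to or from the empty string.
    for (int i = 0; i <= source.length(); i++) d[i][0] = i;
    for (int j = 0; j <= target.length(); j++) d[0][j] = j;

    for (int i = 1; i <= source.length(); i++) {
      for (int j = 1; j <= target.length(); j++) {
        int cost = source.charAt(i - 1) == target.charAt(j - 1) ? 0 : 1;

        // Cheapest of deletion, insertion and substitution/match.
        d[i][j] = Math.min(
            Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
            d[i - 1][j - 1] + cost);

        // Two adjacent characters swapped: count it as a single edit.
        if (allowTransposition && i > 1 && j > 1
            && source.charAt(i - 1) == target.charAt(j - 2)
            && source.charAt(i - 2) == target.charAt(j - 1)) {
          d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
        }
      }
    }
    return d[source.length()][target.length()];
  }

  public static void main(String[] args) {
    System.out.println(distance("test", "tset", false));
    System.out.println(distance("test", "tset", true));
    System.out.println(distance("snapple", "apple", true));
  }
}
```

Without the transposition step, "test" and "tset" are two substitutions apart; with it, they are a single swap apart. The "snapple"/"apple" result stays at 2 either way, because no swap is involved.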

I've also added a method for getting the full result back from the calculator (matrix and distance). There is an example below, but remember, you will not be able to access the method if you are using the interface's type to reference the calculator.

String distOne = "snapple";
String distTwo = "apple";

DamerauLevenshteinDistance distanceCalc2 
     = new DamerauLevenshteinDistance();

DameauLevenshteinDistanceResult result 
     = distanceCalc2.calculateAndReturnFullResult(distOne, distTwo);

System.out.println(result);

And the output from the console:

Distance: 2 
Matrix: 

|  0 1 2 3 4 5 |
|  1 1 2 3 4 5 |
|  2 2 2 3 4 5 |
|  3 2 3 3 4 5 |
|  4 3 2 3 4 5 |
|  5 4 3 2 3 4 |
|  6 5 4 3 2 3 |
|  7 6 5 4 3 2 |

Once again, you can access the Eclipse project at the following link: http://dl.dropbox.com/u/12311372/StringSimilarity.zip.

If you have any questions or comments, I would love to hear them.

Richard

[1]. Wikipedia contributors. "Damerau–Levenshtein distance." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 20 Jun. 2011. Web. 25 Jun. 2011.

Friday, June 24, 2011

JavaScript Map-Reduce Jobs in Hadoop using Mozilla Rhino

If you read my blog, you know that I am a huge fan of scripting, especially in areas of enterprise architecture that don’t classically see that kind of dynamism in their component logic.  One application of a script engine that recently piqued my interest was writing Map-Reduce jobs in JavaScript within Hadoop.
After playing with technologies like Pig and Hive, I realized that these frameworks don’t exist so much to provide a SQL-like interface to Hadoop, but rather to allow analysts and developers to quickly explore their datasets without needing to compile an implementation in Java.  During my Cloudera Developer training, I noticed that the biggest barrier to entry into the Hadoop ecosystem was the need for a solid Java background.  Many of the students had C++ or .NET backgrounds.  More importantly, I personally found it tedious to write and compile a Hadoop job every time I wanted to create some variation of an existing Map-Reduce capability. 
But what about Hadoop Streaming?  Sure, this is certainly a viable answer, and you could even write your application in Python or JavaScript (using Node.js).  I think my biggest problem with Hadoop Streaming is that it is very brittle.  The separation between the key and value is a tab character!  If you are using structured data formats native to Hadoop, you lose the ability to automatically marshal them back into a rich format (think Avro and getting real objects as values!).  If you are writing more complex applications using the Streaming API and need to reference external libraries, you may find yourself in a sticky situation.  Every “Task Node” (a DataNode running a TaskTracker) may need to have its environment preconfigured manually, rather than simply having scripts transferred to it through the distributed cache (for instance, what if you wanted to use a Python library like NLTK?).
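
To make the tab problem concrete, here is a minimal sketch of the default Streaming convention as I understand it: everything up to the first tab is the key, and the rest of the line is the value. The class and method names are made up for illustration and are not part of Hadoop; returning an empty string when there is no tab is my own simplification.

```java
public class StreamingLineSketch {

  // Split a streaming-style line at the first tab: the text before it
  // becomes the key, the text after it becomes the value.
  public static String[] splitKeyValue(String line) {
    int tab = line.indexOf('\t');
    if (tab < 0) {
      // No tab at all: treat the whole line as the key
      // (empty value is a simplification made for this sketch).
      return new String[] { line, "" };
    }
    return new String[] { line.substring(0, tab), line.substring(tab + 1) };
  }

  public static void main(String[] args) {
    // Suppose the key we meant to emit was "KDCA<tab>Washington"
    // and the value was "38.85".
    String[] parts = splitKeyValue("KDCA\tWashington\t38.85");
    System.out.println("key=[" + parts[0] + "] value=[" + parts[1] + "]");
  }
}
```

The key comes back as just "KDCA", and the rest of the intended key leaks into the value. Nothing in the line itself tells the consumer that the split was wrong.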
What if instead of using a SQL-like language or an imperative language that has to be compiled, we use a scripting language?  Believe it or not, this is extremely simple to set up since Java already has a specification for hosting scripting languages (JSR 223).  In this post, I’m going to demonstrate how you can use JavaScript to write MapReduce jobs in Hadoop using the Mozilla Rhino script engine.
Before we can start crunching data in JavaScript, however, we need to set up a “hosting environment” that will allow us to use Mozilla Rhino within the context of Hadoop.  This will simply be a generic Map-Reduce job in which we delegate both the map and reduce functions to a JavaScript function in Rhino.
JsMapReduceBase.java

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

/**
 * Provides the scaffolding necessary for hosting
 * a script engine in Hadoop.
 */
public abstract class JsMapReduceBase extends MapReduceBase {

  protected static final ScriptEngineManager mgr 
                            = new ScriptEngineManager();

  protected ScriptEngine jsEngine;
  protected static final Text outputKey = new Text();
  protected static final Text outputValue = new Text();
  
 /**
  * Configure the Mapper/Reducer using the Job Context.
  * In our case, we want to pull the Mapper and Reducer
  * JavaScript functions from the Distributed Cache.
  * @param job Map-Reduce Job Context
  */
  @Override
  public void configure(JobConf job) {
    super.configure(job);
    
    //Create a new Script Engine using the JSR 223 API.
    //Under the hood, Java will locate any registered 
    //implementation of the Script Language "JavaScript", 
    //which will be the Rhino engine.
    jsEngine = mgr.getEngineByName("JavaScript");
      
    try {
      //Using the Script out of the Cache, load the script
      //body into the engine (parsing and evaluating).
      jsEngine.eval(getScript(job));
      //We are going to register an instance of a Text key
      //which we will reuse in the Hadoop job
      jsEngine.put("oKey", outputKey);
      //As well as a Text value.
      jsEngine.put("oValue", outputValue);
      
    } catch (Exception e) {
      
      System.err.print(e.getMessage());
    }
  }
  
 /**
  * Clean up any resources used during the Map-Reduce task
  */
  @Override
  public void close() throws IOException {
    super.close();
  }
  
 /**
  * Pull the JavaScript script from the distributed
  * cache.
  * @param job Map-Reduce Job Context
  */
  private String getScript(JobConf job) throws Exception {
    
    StringBuilder sb = new StringBuilder();
    
    Path[] cacheFiles;
    
    cacheFiles = DistributedCache.getLocalCacheFiles(job);
    
    for(Path p : cacheFiles){
    
      String file = readFile(p, job);
      
      System.err.println(file);
      
      sb.append(file).append(" \n");
    }
    
    return sb.toString();
  }
  
 /**
  * Read the contents of a file to a string.
  * @param path Path to the file
  * @param job Map-Reduce Job Context
  * @return Body of the file as a string
  */
  private String readFile(Path path, JobConf job) 
     throws Exception {

    FSDataInputStream in = null;
    BufferedReader br = null;
    FileSystem fs = FileSystem.get(job);
    in = fs.open(path);
    br  = new BufferedReader(new InputStreamReader(in));
    
    StringBuilder sb = new StringBuilder();
    
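    String line = "";
    while ( (line = br.readLine() )!= null) {
      //readLine() strips the line terminator, so put it back;
      //otherwise a "//" comment in the script would swallow
      //all of the code that follows it.
      sb.append(line).append("\n");
    }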
    in.close();
    return sb.toString();
  }
  
 /**
  * Call a function on the Script Engine.
  * @param functionName Name of the function to call
  * @param args An array of arguments to pass to the function
  *        representing the function argument signature.
  * @return The result (if any) from the script function
  */ 
  protected Object callFunction(
    String functionName, Object... args)
    throws Exception {

    return ((Invocable)jsEngine)
               .invokeFunction(functionName, args);
  }
  
 /**
  * Call a method on a Script object within the Script Engine.
  * @param objectName The reference name of the object with
  *        the method that will be called (e.g.: in foo->bar(),
  *        we want 'foo').
  * @param methodName Name of the method to call (bar())
  * @param args An array of arguments to pass to the function
  *        representing the function argument signature.
  * @return The result (if any) from the script method
  */ 
  protected Object callMethod(
    String objectName, String methodName, Object... args) 
    throws Exception {

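    //invokeMethod expects the script object itself, not its name,
    //so look the object up in the engine first.
    return ((Invocable)jsEngine)
               .invokeMethod(jsEngine.get(objectName), methodName, args);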
  }
}
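
If you want to try the JSR 223 calls on their own before wiring them into Hadoop, here is a minimal standalone sketch. The class name Jsr223Sketch and the add() script function are made up for illustration. Note the null check: getEngineByName returns null when no engine is registered under that name, which is the case on newer JDKs that no longer bundle a JavaScript engine.

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class Jsr223Sketch {

  // Calls a JavaScript function add(a, b) through the JSR 223 API.
  // Returns null when this JVM has no "JavaScript" engine registered.
  public static Object addInJavaScript(int a, int b) throws Exception {
    ScriptEngine engine =
        new ScriptEngineManager().getEngineByName("JavaScript");
    if (engine == null) {
      return null;
    }

    // Evaluating the script only defines the function; nothing runs yet.
    engine.eval("function add(a, b) { return a + b; }");

    // Engines that support calling functions by name implement Invocable.
    return ((Invocable) engine).invokeFunction("add", a, b);
  }

  public static void main(String[] args) throws Exception {
    Object result = addInJavaScript(40, 2);
    System.out.println(result == null
        ? "No JavaScript engine available on this JVM"
        : "add(40, 2) = " + result);
  }
}
```

The numeric type that comes back (Integer or Double) depends on the engine, so treat the result as a java.lang.Number rather than casting to a specific type.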

We will also need to implement the Mapper and Reducer interfaces:

JsMapper.java

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

/**
 * A very simple implementation of the Mapper interface
 * that delegates the 'map' task to a Script Engine.
 * Keep in mind that this implementation currently uses
 * the Text class for input and output Key-Value pairs,
 * but could easily be changed to use anything.
 */
public class JsMapper extends JsMapReduceBase 
   implements Mapper<Object, Text, Text, Text> {
  
/**
 * Perform a map on the given key-value pair, delegating the
 * call to JavaScript.
 * @param key The Key
 * @param value The Value
 * @param output The output collector we use to send key-value
 *        pairs to the reducer.
 * @param reporter A mechanism we can use for 
 *        'reporting' our progress.
 */
  @Override
  public void map(
      Object key, 
      Text value, 
      OutputCollector<Text, Text> output,
      Reporter reporter) throws IOException {
    
    try {
      //Delegate the call to the "map()" function in JavaScript
      //Note: this was hard-coded to keep the demo simple.
      callFunction("map", key, value, output, reporter);
    } catch (Exception e) {
      //Handle Error
    }
  }
  
}

JsReducer.java

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

/**
 * A very simple implementation of the Reducer interface
 * that delegates the 'reduce' task to a Script Engine.
 * Keep in mind that this implementation currently uses
 * the Text class for input and output Key-Value pairs,
 * but could easily be changed to use anything.
 */
public class JsReducer extends JsMapReduceBase 
    implements Reducer<Text, Text, Text, Text> {

/**
 * Perform a reduce on the given key and its values, delegating
 * the call to JavaScript.
 * @param key The Key
 * @param values A Value Collection for the corresponding key
 * @param output The output collector we use to save key-value
 *        pairs to HDFS.
 * @param reporter A mechanism we can use for 
 *        'reporting' our progress.
 */
  @Override
  public void reduce(
           Text key, 
           Iterator<Text> values,
           OutputCollector<Text, Text> output, 
           Reporter reporter) throws IOException {
    
    try {
      //Delegate the call to the "reduce()" function in JavaScript
      //Note: this was hard-coded to keep the demo simple.
      callFunction("reduce", key, values, output, reporter);
    } catch (Exception e){
      //Handle Error
    }
  }
}

Finally, we need to create a Generic Job Runner for executing the Map-Reduce job.

JsJobDriver.java

import java.net.URI;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

/**
 * Here is a pretty simple Job Driver for loading
 * the correct scripts and input directories for the
 * Map-Reduce job.  You will probably notice
 * that I was running this on the Cloudera VM
 * (the user directory is 'training').
 * The dataset I was processing is a list of METAR
 * and TAF reporting weather stations that you can
 * find at this url: 
 * http://aviationweather.gov/adds/metars/stations.txt
 *
 * Ideally, you will want to pass in the location of 
 * the input directory and map/reduce JavaScript files.
 */
public class JsJobDriver {
  
  //Default paths for my application
  public static String inputPaths =
    "hdfs://localhost/stationsInput.txt";
  public static String outputPath = 
    "hdfs://localhost/stationsOutput";
  public static String mapJsFile = 
    "hdfs://localhost/user/training/map.js";
  public static String reduceJsFile = 
    "hdfs://localhost/user/training/reduce.js";

 /**
  * Start the Job
  * @param args Console input arguments
  */
  public static void main(String... args) 
     throws Exception {
    
    //If we have two inputs, they are the map-reduce
    //scripts.
    if(args.length == 2){
      mapJsFile = args[0];
      reduceJsFile = args[1];
    }
    
    //If we have four inputs, we are getting the
    //input and output paths, and javascript map and reduce
    //scripts.
    if(args.length == 4){
      inputPaths = args[0];
      outputPath = args[1];
      mapJsFile = args[2];
      reduceJsFile = args[3];
    }
    
    JobConf conf = new JobConf(JsMapper.class);
    conf.setJobName("Js Test.");
    
    FileInputFormat.setInputPaths(conf, new Path(inputPaths));
    FileOutputFormat.setOutputPath(conf, new Path(outputPath));
    
    //Associate the correct Mappers and Reducers
    conf.setMapperClass(JsMapper.class);
    conf.setMapOutputKeyClass(Text.class);
    conf.setMapOutputValueClass(Text.class);
    
    conf.setReducerClass(JsReducer.class);
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);
    
    //Store our scripts in the cache.
    DistributedCache.addCacheFile(new URI(mapJsFile), conf);
    DistributedCache.addCacheFile(new URI(reduceJsFile), conf);
    
    //Run the job
    JobClient.runJob(conf);
  }
}

That's all it takes to get Scripting support in Hadoop! In my opinion, it is actually pretty trivial, and a lot cleaner than using some of the other frameworks.

Let me show you some JavaScript MapReduce now:

Identity Mapper
map.js

function map(key, value, output, reporter){
  oKey.set(key);
  oValue.set(value);
  output.collect(oKey, oValue);
}

Identity Reducer
reduce.js

function reduce(key, values, output, reporter){
  oKey.set(key); 
  while(values.hasNext()){
    output.collect(oKey, values.next()); 
  } 
}

I realize that these JavaScript examples are trivial. In another post, I will get into more complex examples. My goal is simply to emphasize that there are easy ways to get greater extensibility with Hadoop without having to use the Streaming API or Hive and Pig. In my next post, I'm going to take this example one step further and create a web-based interface for writing Map-Reduce jobs on Hadoop, much like many of us have enjoyed with CouchDB.

In the meantime, good luck, and have a great time coding.

Richard

Thursday, May 5, 2011

Micro Effects in 5 Minutes

The Micro Effects .NET Framework is very close to open release. In my last post, I gave a sneak peek of some of the more advanced features of the framework. A couple of my fellow engineers mentioned that I probably should have shown some of the more basic functionality of the API. I hope this post will give you a better idea of the possibilities a scripting framework can bring to your applications.

The following is the Micro Effects 5 Minute Tutorial I've scraped from the upcoming Micro Effects website (http://microfx.bericotechnologies.com [Not Available Yet!]). Please let me know if it took you more than 5 minutes to read, or if it sparked any other general questions about the framework.

Richard

Micro Effects in 5 Minutes

In this demonstration, we will instantiate the Engine, add an "ad hoc" script, and then instruct the engine to run.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using com.berico.mfx;

namespace com.berico.mfx.examples {
  
  public class MicroEffects2MinuteExample {

    public static void Main(string[] args) {

      new MfxEngine()
         .Load(
          new Script() { 
            Info = ScriptInfo.Bldr().ID("My Script").Build(),
            Body = "console.log('Hello World.');"
          })
         .Run();
         
      Console.ReadLine();
    }
  }
}

   
By default, the Console logs to the Debug output (System.Diagnostics.Debug). You will not see anything on your console screen. If you look at the Debugger output, you will find the following line intermixed with other debugging information.
2011-04-24T17:38:15.6270000-04:00 [My Script] INFO - Hello World.
We will continue to operate in the Main method of the class, but for the sake of brevity, we will not include the surrounding code.
Let's make this demonstration even smaller by importing the "loaders" namespace, which adds extension methods onto the Micro Effects engine (MfxEngine).
using com.berico.mfx.loaders;

/* ... */

new MfxEngine()
  .Load("console.log('Hello World.');")
  .Run();

   
Once again, in the Debugger output, you should see an identical line (timestamp and script name being the only things that have changed).

2011-04-24T17:59:02.8810000-04:00 [AdHoc] INFO - Hello World.

Ok, we admit that looking for the output in the Debug window is annoying. In the next example, we will redirect that output to the Console Window.
This brings us to our next concept. The MfxEngine does not exist on its own, but actually operates within a Micro Effects Environment (MfxEnvironment). MfxEngine and MfxEnvironment are bidirectionally associated (in that they are aware of each other's existence). It's more appropriate to say that MfxEnvironment is an aggregate of MfxEngines, in that one environment may have many engines. The environment provides the engine with a number of services, including logging support ("console"), global variables, and a service registry.
When you instantiate an engine, the engine is associated with the default environment. If you need to configure the environment, it's best to start by retrieving and configuring the default environment and grabbing an engine from it.

using com.berico.mfx.loaders;
using com.berico.mfx.services;

/* ... */

MfxEnvironment
  .Default
  .RegisterConsole(typeof(WindowsConsoleService))
  .GetEngine()
  .Load("console.log('Hello World.');")
  .Run();

You should now finally see the message in the Console window.




Note, we had to reference the com.berico.mfx.services namespace, where all Console implementations are stored. Implementing your own IConsoleService is really quite simple. Keep in mind, we already provide implementations that use the Windows Debugger (DebugConsoleService), Windows Console (WindowsConsoleService) and Log4NET (Log4netConsoleService).
If you register more than one console service with the environment, the environment will wrap all console services in the CompositeConsoleService, routing the logging output to each registered console sequentially.
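
To make the "composite" idea concrete, here is a minimal sketch in plain JavaScript of a console that forwards every call to each registered sink in order. This is not the CompositeConsoleService source; createCompositeConsole, createMemorySink and the "LEVEL - message" line format are names and choices I made up for illustration.

```javascript
// Hypothetical sketch: none of these names come from the Micro Effects API.
// A composite console forwards each log call to every registered sink, in order.
function createCompositeConsole(sinks) {
  function route(level, message) {
    for (const sink of sinks) {
      sink.write(level, message);
    }
  }
  return {
    log: (message) => route("INFO", message),
    error: (message) => route("ERROR", message),
    fatal: (message) => route("FATAL", message),
  };
}

// In-memory sinks standing in for, say, a debug sink and a console-window sink.
function createMemorySink() {
  const lines = [];
  return { lines, write: (level, message) => lines.push(level + " - " + message) };
}

const debugSink = createMemorySink();
const windowSink = createMemorySink();
const compositeConsole = createCompositeConsole([debugSink, windowSink]);

compositeConsole.log("Hello World.");
compositeConsole.error("Ooops!");
```

The caller only ever sees one console object; adding a third destination means adding a sink to the list, not touching the scripts.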

Now that you better understand how Micro Effects is instantiated, let us conclude with a couple of "rapid fire" examples of using Micro Effects.

Register a Variable with the Engine.

DateTime now = DateTime.Now;

MfxEnvironment
  .Default
  .RegisterConsole(typeof(WindowsConsoleService))
  .GetEngine()
  .SetVar("currentTime", now)
  .Load(
   "console.log('The current time is {0}.', currentTime.ToString());")
  .Run();


2011-04-24T20:08:53.2340000-04:00 [AdHoc] INFO - The current time is 4/24/2011 8:08:52 PM.

Call a C# function from JavaScript.

public static string StringToLower(string input){
  return input.ToLower();
}

public static void Main(string[] args) {

  MfxEnvironment
    .Default
    .RegisterConsole(typeof(WindowsConsoleService))
    .GetEngine()
    //Expose the static C# method to scripts under the name "StringToLower"
    .SetFunction("StringToLower", 
       new Func<string, string>(MicroEffects2MinuteExample.StringToLower))
    .Load("console.log('MICRO EFFECTS to lower: {0}.'," 
        + "StringToLower('MICRO EFFECTS'));")
    .Run();

  Console.ReadLine();
}

2011-04-24T20:16:37.9640000-04:00 [AdHoc] INFO - MICRO EFFECTS to lower: micro effects.

Call a JavaScript function from C#.

*Note: the JINT engine uses System.Double as the return type of mathematical operations in JavaScript, so if you ask for an integer return value, you will get a cast error.

MfxEngine engine = MfxEnvironment.Default
  .RegisterConsole(typeof(WindowsConsoleService))
  .GetEngine()
  .Load("function add(var1, var2){ return var1 + var2; }")
  .Prepare();

double sum = engine.CallFunction<double>("add", 1.1234, 2.1234132);

Console.WriteLine("The sum of 1.1234 and 2.1234132 is {0}.", sum);

The sum of 1.1234 and 2.1234132 is 3.2468132.
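
The Double return type follows from the language itself: JavaScript has a single Number type (an IEEE 754 double), so even whole-number arithmetic produces a double as far as the script is concerned. The snippet below is plain JavaScript with nothing specific to Micro Effects or JINT; the variable names are mine.

```javascript
// The same add function used in the example above.
function add(var1, var2) { return var1 + var2; }

const wholeSum = add(1, 2);
const fractionalSum = add(0.1, 0.2);

// Whole or not, the result has the one numeric type JavaScript offers.
console.log(typeof wholeSum);            // "number"
console.log(Number.isInteger(wholeSum)); // true: the value is whole, the type is still Number

// The usual double-precision rounding applies to fractional values.
console.log(fractionalSum === 0.3);      // false
```

So on the C# side it is safest to ask for a double and convert explicitly if you need an integer.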

Load a Script from the File System.

C:\Scripts\test.mfx.js
console.error("Ooops!");
console.fatal("Huge error");
console.log("New Console Message");

function iterate(){
  for(var i = 0; i < 10; i++){
    console.log("Iteration {0}", i);
  }
}

iterate();

var anonObj = {
  name: "Anonymous object"
};

console.log("Name: {0}", anonObj.name);

(function(){ console.log("Anonymous function called."); })();

Load the file by submitting a FileInfo object to the Load function (extension method).

MfxEnvironment.Default
  .RegisterConsole(typeof(WindowsConsoleService))
  .GetEngine()
  .Load(new FileInfo("C:\\Scripts\\test.mfx.js"))
  .Run();

2011-04-24T20:47:54.7370000-04:00 [C:\Scripts\test.mfx.js] ERROR - Ooops!
2011-04-24T20:47:54.7620000-04:00 [C:\Scripts\test.mfx.js] FATAL - Huge error
2011-04-24T20:47:54.7650000-04:00 [C:\Scripts\test.mfx.js] INFO - New Console Message
2011-04-24T20:47:54.7810000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 0
2011-04-24T20:47:54.7900000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 1
2011-04-24T20:47:54.7920000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 2
2011-04-24T20:47:54.7930000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 3
2011-04-24T20:47:54.7940000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 4
2011-04-24T20:47:54.7950000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 5
2011-04-24T20:47:54.7960000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 6
2011-04-24T20:47:54.7980000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 7
2011-04-24T20:47:54.7990000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 8
2011-04-24T20:47:54.8000000-04:00 [C:\Scripts\test.mfx.js] INFO - Iteration 9
2011-04-24T20:47:54.8040000-04:00 [C:\Scripts\test.mfx.js] INFO - Name: Anonymous object
2011-04-24T20:47:54.8050000-04:00 [C:\Scripts\test.mfx.js] INFO - Anonymous function called.

Load a Script from a Website.

http://microfx.bericotechnologies.com/content/fivemindemo/external.js
console.log("Hello from the WWW!");

Load the file by submitting a Uri object to the Load function (extension method).

MfxEnvironment.Default
  .RegisterConsole(typeof(WindowsConsoleService))
  .GetEngine()
  .Load(new Uri(
"http://microfx.bericotechnologies.com/content/fivemindemo/external.js"))
  .Run();

2011-04-24T20:54:38.9880000-04:00 [/content/fivemindemo/external.js] INFO - Hello from the WWW!

You have survived the Micro Effects 5 Minute Demo, and hopefully it only took you 5 minutes!
Micro Effects has much more to offer, and we hope you walk away with a sense of the possibilities a Scripting framework can bring to your architecture.

Last updated: April 24, 2011 @ 9:00pm.

Monday, April 18, 2011

Berico Technologies “Micro Effects .NET” Preview

//@Inject("eventPublisher")
  var eventer;

(function () { 
   eventer.publish("Engineers", "Micro Effects has arrived.");
})();
I’m proud to introduce a preview of a very special project I’ve been working on for Berico called “Micro Effects,” an Enterprise Scripting framework for .NET. The goal of Micro Effects is to “dynamicize” business logic in your applications by loading and hosting scripts, replacing classically hardcoded domain logic. Our first supported scripting language is JavaScript, powered by the awesome JINT Engine (JavaScript Interpreter for .NET) [http://jint.codeplex.com/]. Micro Effects, however, is not just a wrapper for a 3rd party engine. At its core, Micro Effects brings a number of capabilities needed to make Enterprise Scripting possible:

  • Opinionated framework for managing and reusing scripts.
  • Support for configuring the Script Engine via an IOC/DI container.
  • Dependency and Script management.
  • Script loading and repository functionality.
  • Script “Hot Swapping” support.
  • Numerous examples demonstrating how JavaScript can fit into your architecture.
  • Clean abstraction, allowing different scripting languages and/or engines to exist on the same platform.
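
The teaser script at the top of this post marks a variable with an //@Inject comment. I don't show how Micro Effects reads that metadata here, so treat the following as one plausible way to pull such annotations out of a script with a regular expression. parseInjectAnnotations and the shape of the objects it returns are assumptions of mine, not the framework's parser.

```javascript
// Hypothetical sketch: finds  //@Inject("serviceName")  followed by  var variableName;
function parseInjectAnnotations(source) {
  const pattern = /\/\/\s*@Inject\("([^"]+)"\)\s*\r?\n\s*var\s+([A-Za-z_$][\w$]*)/g;
  const dependencies = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    dependencies.push({ service: match[1], variable: match[2] });
  }
  return dependencies;
}

// The teaser script from the top of this post, line by line.
const teaserScript = [
  '//@Inject("eventPublisher")',
  '  var eventer;',
  '',
  '(function () {',
  '   eventer.publish("Engineers", "Micro Effects has arrived.");',
  '})();',
].join("\n");

const injectAnnotations = parseInjectAnnotations(teaserScript);
console.log(injectAnnotations);
```

The point is only that the dependency information lives in the script itself, where a loader can discover it before the script ever runs.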

Don’t worry, Java Engineers: a port is in the works and will be released with Berico’s forthcoming Next Generation Enterprise Stack (more to come later). We are considering porting the JINT Engine to Java, which will allow the features to remain relatively consistent (half the work is done for us since JINT uses ANTLR). We may also utilize an existing JavaScript parser/hosting environment like Mozilla Rhino.

Why Micro Effects?


Scripting languages typically get a bad rap. They are considered security risks, typically have poor tooling, and often require a large commitment in learning with little gain. In many cases this may be true. I would argue, however, that you probably already use a scripting language of some sort in many different areas of development. Consider web frameworks, like JSP, ASP.NET, and PHP. Perhaps you’re a Ruby or Python engineer. I doubt many would argue these platforms lack usefulness.

Consider the following problems or needs a scripting language could address:

  • Need to recompile code or restart a server every time application logic changes.
  • Requirement to add extension points (plugin support) into an application.
  • Governance/Repository for application logic; this could allow a change to application logic in a central location to propagate across an entire architecture.
  • The ability to send more than just data across the wire. Consider the possibility of sending a “predicate” that can be evaluated at an Authorization service or Data Source.

For these reasons alone, we have decided to build a scripting platform which can be used across our platform for handling all sorts of things: responding to events, predicate-based security, making decisions within a domain, routing messages in an ESB.
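
To illustrate the "predicate across the wire" idea from the list above, here is a small sketch in plain JavaScript. The message shape, the field names and evaluatePredicate are all hypothetical; the only point is that the rule travels as text and is evaluated where the decision is made.

```javascript
// Hypothetical sketch: the message shape and function names are illustrative only.
// Sender side: the predicate travels as source text alongside ordinary data.
const wireMessage = JSON.stringify({
  resource: "weather-report",
  predicate: "return user.clearance >= 3 && user.department === 'Engineering';",
});

// Receiver side: compile the predicate text into a function of `user` and evaluate it.
// new Function runs arbitrary code, so a real service would need to sandbox this.
function evaluatePredicate(message, user) {
  const parsed = JSON.parse(message);
  const predicate = new Function("user", parsed.predicate);
  return predicate(user) === true;
}

const allowed = evaluatePredicate(wireMessage, { clearance: 4, department: "Engineering" });
const denied = evaluatePredicate(wireMessage, { clearance: 1, department: "Engineering" });
console.log(allowed, denied);
```

Changing the authorization rule now means changing a string in a message, not redeploying the service that enforces it.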

Ok, no more talking; here’s the demo.

Before I start, let me mention that there are only two objects that exist by default in the Micro Effects environment, the “console” object and the “context” object. The Console is simply a Firebug-like logger exposed to the Scripting API. The Console has multiple implementations in Micro Effects, including a DebugConsoleService which writes all information to the Debug output (using the System.Diagnostics.Debug class), a Log4netConsoleService which adapts the Log4NET framework to the console API, and finally, a CompositeConsoleService which sends the console output to a chain of handlers implementing the IConsoleService interface.

The “context” object, on the other hand, is a .NET Dictionary<string, object> containing contextual information about the environment that may enter the API from various points (metadata within scripts, or submitted by a user). The context, however, is not a generic bag of variables in which users should store information. In fact, the Micro Effects engine allows users to set variables, functions and parameters on the engine, and even react to when those properties change.
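
The "react to when those properties change" part is easiest to picture as a variable store with change listeners. The sketch below is plain JavaScript with invented names (createVariableStore, onChanged); it shows the pattern, not the engine's actual events.

```javascript
// Hypothetical sketch; these names are not the Micro Effects API.
function createVariableStore() {
  const values = new Map();
  const listeners = [];
  return {
    // Register a callback invoked as (name, previousValue, newValue).
    onChanged(listener) { listeners.push(listener); },
    setVar(name, value) {
      const previous = values.get(name);
      values.set(name, value);
      for (const listener of listeners) listener(name, previous, value);
    },
    getVar(name) { return values.get(name); },
  };
}

const variableStore = createVariableStore();
const observedChanges = [];
variableStore.onChanged((name, previous, current) =>
  observedChanges.push({ name, previous, current }));

variableStore.setVar("currentTime", "8:08 PM");
variableStore.setVar("currentTime", "8:09 PM");
console.log(observedChanges);
```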

I will start by demonstrating a number of key pieces of functionality in order of ascending complexity.

Basic Micro Effects

Create an engine with the default environment, register the Log4NET console service, and execute an adhoc script.

//Start with the default Environment
MfxEngine engine = MfxEnvironment.Default
  //Register the Log4Net console service
  .RegisterConsole(typeof(Log4netConsoleService))
  //Get a new (or pooled) engine from the environment
  .GetEngine()
  //Create an adhoc script and load it into the engine
  .Load(new Script()
  {
    //create a ScriptInfo object 
    //(metadata for the script)
    Info = ScriptInfo.Bldr().ID("TestScript").Build(),
    //define the script body
    Body = "console.log('Hello World.');"
  })
  //Run the script
  .Run();

Console Output

2011-04-18T21:22:48.0210000-04:00 [AdHoc] INFO - Hello World.

Now, let’s load a Script from the file system and execute it.

//Get the default environment
MfxEngine engine = MfxEnvironment.Default
 //Register the Log4Net console service
 .RegisterConsole(typeof(Log4netConsoleService))
 //Get a new (or pooled) engine from the environment
 .GetEngine()
 //Load a new script from the file system
 .Load(new FileInfo("JsCapabilitiesTest.js"))
 //Execute the script
 .Run();

External Script [JsCapabilitiesTest.js]

//Simple for loop
for (var i = 0; i < 10; i++) {
    console.log("Hello iteration {0}", i);
}

//Anonymous function
(function () { 
    console.error("Defined an anonymous function and called it!");
})();

//Define a named JavaScript function.
//By the way, this can be called from C#!
function myfunction(){
    console.log("In function");
}

//Call the previously defined function
myfunction();

//Create an anonymous object
var someObject = {
    name: "Bob",
    phone: "1234567890",
    age: 25
};

//Call properties of the anonymous object
console.log("{0}, {1}, {2}", 
    someObject.name, someObject.phone, someObject.age);

Console Output

2011-04-18T21:32:07.2780000-04:00 [AdHoc] INFO - Hello iteration 1
2011-04-18T21:32:07.2790000-04:00 [AdHoc] INFO - Hello iteration 2
2011-04-18T21:32:07.2800000-04:00 [AdHoc] INFO - Hello iteration 3
2011-04-18T21:32:07.2810000-04:00 [AdHoc] INFO - Hello iteration 4
2011-04-18T21:32:07.2820000-04:00 [AdHoc] INFO - Hello iteration 5
2011-04-18T21:32:07.2830000-04:00 [AdHoc] INFO - Hello iteration 6
2011-04-18T21:32:07.2840000-04:00 [AdHoc] INFO - Hello iteration 7
2011-04-18T21:32:07.2840000-04:00 [AdHoc] INFO - Hello iteration 8
2011-04-18T21:32:07.2850000-04:00 [AdHoc] INFO - Hello iteration 9
2011-04-18T21:32:07.2870000-04:00 [AdHoc] ERROR - Defined an anonymous function and called it!
2011-04-18T21:32:07.2900000-04:00 [AdHoc] INFO - In function
2011-04-18T21:32:07.2940000-04:00 [AdHoc] INFO - Bob, 1234567890, 25

Now, let’s load a directory of scripts (in this case there is only one) and watch the directory for changes. Every time the script changes on the file system, we re-run the engine by subscribing to the Engine’s StateDirtied event.

//Start with the default Environment
MfxEngine engine = MfxEnvironment.Default
 //Register a new console
 .RegisterConsole(typeof(Log4netConsoleService))
 //Get a new engine instance
 .GetEngine()
 //Load all scripts in this directory and watch for changes
 .LoadAndWatch(new DirectoryInfo("C:\\Scripts"))
 .Run();

//Every time the state is dirtied, let's rerun the example.
engine.StateDirtied += new EventHandler<MfxEngineEventArgs>(
 (sender, eventArgs) => { eventArgs.Engine.Run(); });

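//Run forever; block the main thread instead of spinning the CPU
//(Thread and Timeout live in System.Threading)
Thread.Sleep(Timeout.Infinite);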

External Script [from C:\Scripts]

console.log("Wow haas!");

console.log("Change");

console.error("Ooops!");

console.fatal("Huge error");

Video Demonstrating dynamic reloading of the script.
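
If you can't watch the video, the reload loop boils down to this: something replaces the script source, the engine raises a "dirtied" notification, and the handler runs the engine again. Here is an in-memory sketch in plain JavaScript. createEngine, onStateDirtied and replaceSource are stand-ins I invented; there is no file watcher here, so the change is triggered by hand.

```javascript
// Hypothetical sketch: an in-memory stand-in for a watched script and its engine.
function createEngine(initialSource) {
  let source = initialSource;
  const dirtiedHandlers = [];
  const engine = {
    output: [],
    onStateDirtied(handler) { dirtiedHandlers.push(handler); },
    // Called by whatever watches the script; here we call it by hand.
    replaceSource(newSource) {
      source = newSource;
      for (const handler of dirtiedHandlers) handler(engine);
    },
    run() {
      // Give the script a console that records into engine.output.
      const scriptConsole = { log: (message) => engine.output.push(message) };
      new Function("console", source)(scriptConsole);
      return engine;
    },
  };
  return engine;
}

const hotEngine = createEngine("console.log('Wow haas!');").run();

// Every time the state is dirtied, rerun the engine.
hotEngine.onStateDirtied((dirtyEngine) => dirtyEngine.run());

// Simulate an edit to the script on disk.
hotEngine.replaceSource("console.log('Change');");
console.log(hotEngine.output);
```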



Using an IOC Container to define the Environment

//Grab the Spring.NET Application Context
IApplicationContext context = 
  new XmlApplicationContext("Config/MfxEnvironmentConfig.xml");
//Pull the Micro Effects Environment from the context
MfxEnvironment environment = context.GetObject<MfxEnvironment>();
//Nothing new here.
environment.GetEngine()
  .Load(new FileInfo("JsCapabilitiesTest.js"))
  .Run();

Spring Application Context

The Spring Context is rather verbose.  Most of this configuration is handled programmatically in the static property MfxEnvironment.Default, but if you want to extend the environment, you will probably want to use an IoC container to configure those capabilities.  This example config is a pretty good start if you choose to extend the Micro Effects API.

<?xml version="1.0" encoding="UTF-8"?>
<objects xmlns="http://www.springframework.net"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.springframework.net 
                      http://www.springframework.net/xsd/spring-objects.xsd">

  <!-- Define a MicroEffects Environment -->
  <object id="mfxEnvironment" type="com.berico.mfx.MfxEnvironment">
    <!-- This is the object responsible for loading external script dependencies -->
    <property name="ScriptLoader" ref="scriptLoader" />
    <!-- This is the object responsible for loading external service dependencies -->
    <property name="ServiceRegistry">
      <object type="com.berico.mfx.ext.spring.SpringServiceRegistry, com.berico.mfx.ext.spring" />
    </property>
    <!-- Here is a list of classes that implement the IConsoleService (where console
         output from within the script will be directed). -->
    <property name="ConsoleClassNames">
      <list element-type="System.String">
        <!-- This is a facade for System.Diagnostics.Debug -->
        <value>com.berico.mfx.services.DebugConsoleService</value>
        <!-- Log4net Facade -->
        <value>com.berico.mfx.services.Log4netConsoleService</value>
      </list>
    </property>
  </object>

  <!-- Configure the ScriptParserFactory (the guy who delegates parsing to the correct Parser) -->
  <object id="scriptParserFactory" type="com.berico.mfx.parsers.ScriptParserFactory">
    <!-- Set the object responsible for detecting the Script type -->
    <property name="ScriptDetector">
      <object type="com.berico.mfx.parsers.DelegatingScriptTypeDetector">
        <property name="Detectors">
          <list>
            <!-- We only support JavaScript at the moment, so we will register the
                 JavaScriptDetector.  Keep in mind, you don't need to use the 
                 DelegatingScriptTypeDetector; you could simply use this guy. -->
            <object type="com.berico.mfx.parsers.js.JavaScriptDetector" />
          </list>
        </property>
      </object>
    </property>
    <!-- Register Script metadata parsers -->
    <property name="Parsers">
      <list>
        <object type="com.berico.mfx.parsers.js.JavaScriptMetadataParser" />
      </list>
    </property>
  </object>

  <!-- Configure Script Loaders to use the ScriptParserFactory -->
  <object id="stringLoader" type="com.berico.mfx.loaders.StringLoader">
    <property name="SetScriptParserFactory" ref="scriptParserFactory" />
  </object>

  <object id="httpLoader" type="com.berico.mfx.loaders.HttpLoader">
    <property name="SetScriptParserFactory" ref="scriptParserFactory" />
  </object>

  <object id="fileLoader" type="com.berico.mfx.loaders.FileLoader">
    <property name="SetScriptParserFactory" ref="scriptParserFactory" />
  </object>
  
  <!-- Create a Composite Script Loader for the Environment.  These are the actual
       mechanisms used to retrieve external files.                                 -->
  <object id="scriptLoader" 
          type="com.berico.mfx.loaders.CompositeScriptLoader">
    <property name="Loaders">
      <dictionary key-type="System.String" 
                  value-type="com.berico.mfx.IScriptDependencyLoader">
        <entry key="http" value-ref="httpLoader" />
        <entry key="https" value-ref="httpLoader" />
        <entry key="file" value-ref="fileLoader" />
      </dictionary>
    </property>
  </object>
  
</objects>
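
The dictionary at the bottom of that config keys each loader by URI scheme. In spirit, the composite loader is a lookup table: take the scheme of the requested location and delegate to whichever loader is registered for it. The sketch below shows that dispatch in plain JavaScript; createCompositeLoader and the placeholder loaders are mine, and the real loaders obviously fetch content rather than return a label.

```javascript
// Hypothetical sketch: loader names and return values are placeholders.
function createCompositeLoader(loadersByScheme) {
  return {
    load(location) {
      // URL.protocol includes the trailing colon, e.g. "https:".
      const scheme = new URL(location).protocol.replace(":", "");
      const loader = loadersByScheme[scheme];
      if (!loader) throw new Error("No loader registered for scheme '" + scheme + "'");
      return loader(location);
    },
  };
}

const httpLoader = (location) => "http-loader:" + location;
const fileLoader = (location) => "file-loader:" + location;

// Same registration as the config: http and https share one loader.
const compositeLoader = createCompositeLoader({
  http: httpLoader,
  https: httpLoader,
  file: fileLoader,
});

const fromWeb = compositeLoader.load("https://example.com/external.js");
const fromDisk = compositeLoader.load("file:///C:/Scripts/test.mfx.js");
console.log(fromWeb, fromDisk);
```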

Using the Spring.NET Factory Method to create new engines

This is probably the best way to use an IOC with Micro Effects.

IApplicationContext context = 
  new XmlApplicationContext("Config/MfxEngineFactory.xml");
//Pull the engine from the context and run it.
context.GetObject<MfxEngine>().Run();

Spring Application Context

We do a couple of fancier things in this context file. First, we reuse the environment from the previous example. Second, and probably my favorite thing about this demo, we define a script within the Spring context! We also create a Service Dependency within the script. A Service Dependency is a requirement the script imposes on the engine that enforces the existence of a variable in the script environment meeting a specific contract (think interface). The engine looks to the environment, specifically the environment's IServiceRegistry, to fulfill that requirement. In this context file, we utilize Micro Effects' Spring.NET support to adapt the IApplicationContext to the IServiceRegistry interface (to be honest, Spring was the inspiration for IServiceRegistry). The engine pulls the dependency from the environment (via Spring) and injects it into the script.

You also see the use of the context object. Variables defined in a Script's ScriptInfo object are placed into the context for use by the Script. In this case, we iterate over the context printing the key-value-pair to the console.

<?xml version="1.0" encoding="UTF-8"?>
<objects xmlns="http://www.springframework.net"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.springframework.net 
                      http://www.springframework.net/xsd/spring-objects.xsd">

  <import resource="MfxEnvironmentConfig.xml"/>

  <object id="specificEngine" type="com.berico.mfx.MfxEngine"
          factory-object="mfxEnvironment" factory-method="GetEngine">

    <property name="RegisteredScripts">
      <list element-type="com.berico.mfx.Script">
        <object type="com.berico.mfx.Script">
          <property name="Info">
            <object type="com.berico.mfx.ScriptInfo">
              <property name="ID" 
              value="Example of using a factory to create an engine and register a script" />
              <property name="Application" value="MicroEffects Examples" />
              <property name="Component" value="Factory Demo" />
              <property name="Version" value="1.0.0.0" />
              <property name="Properties">
                <dictionary key-type="System.String" value-type="System.Object">
                  <entry key="Property 1" value="Value 1" />
                  <entry key="Property 2" value="Value 2" />
                  <entry key="Property 3">
                    <value>12</value>
                  </entry>
                </dictionary>
              </property>
              <property name="ServiceDependencies">
                <list element-type="com.berico.mfx.ServiceDependency">
                  <object type="com.berico.mfx.ServiceDependency">
                    <property name="Variable" value="calculator" />
                    <property name="Contract" 
            value="com.berico.mfx.examples.services.CalculatorService, com.berico.mfx.examples" />
                  </object>
                </list>
              </property>
            </object>
          </property>
          <property name="Body">
            <!-- Using a CDATA element, we can write scripts in the Spring Config without
                 worrying about escaping characters. -->
            <value>
              <![CDATA[
              
              //Look, we are actually defining a script right here!
              for(var item in context){
                console.log("{0} => {1}", item.Key, item.Value);
              }
              
              console.log("2 + 3 = {0}", calculator.Add(2, 3));
              
              ]]>
            </value>
          </property>
        </object>
      </list>
    </property>
  </object>

  <object id="calculatorService" 
    type="com.berico.mfx.examples.services.CalculatorService" />
  
  
</objects>

Console Output

2011-04-18T22:26:41.4205000-04:00 [AdHoc] INFO - Property 3 => 12
2011-04-18T22:26:41.5335000-04:00 [AdHoc] INFO - Property 1 => Value 1
2011-04-18T22:26:41.5345000-04:00 [AdHoc] INFO - Property 2 => Value 2
2011-04-18T22:26:41.5395000-04:00 [AdHoc] INFO - 2 + 3 = 5
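
The Service Dependency mechanism used above can be pictured without Spring at all: look the contract up in a registry, check that what comes back honors the contract, and hand it to the script under the requested variable name. Here is a sketch in plain JavaScript. resolveDependencies is a made-up name, and a "contract" is reduced to a list of required method names, which is my simplification of the .NET type check.

```javascript
// Hypothetical sketch: a "contract" here is just a list of required method names.
function resolveDependencies(dependencies, registry) {
  const injected = {};
  for (const dependency of dependencies) {
    const service = registry[dependency.contract];
    if (!service) throw new Error("No service registered for " + dependency.contract);
    for (const method of dependency.requiredMethods) {
      if (typeof service[method] !== "function") {
        throw new Error(dependency.contract + " is missing " + method);
      }
    }
    injected[dependency.variable] = service;
  }
  return injected;
}

// Stand-in for the service registry the environment would expose.
const serviceRegistry = { CalculatorService: { Add: (a, b) => a + b } };

const scriptVariables = resolveDependencies(
  [{ variable: "calculator", contract: "CalculatorService", requiredMethods: ["Add"] }],
  serviceRegistry
);

// The script body sees `calculator` as an ordinary variable.
const scriptBody = "return '2 + 3 = ' + calculator.Add(2, 3);";
const calculatorOutput = new Function("calculator", scriptBody)(scriptVariables.calculator);
console.log(calculatorOutput);
```

If the registry can't satisfy the contract, the failure happens before the script runs, which is exactly what you want from a declared dependency.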

Aspect-Oriented Micro Effects

In our final demonstration, I wanted to show a more practical application of using Micro Effects. Here we have defined a pointcut on all methods of the IClassBeingAdvised interface. We apply some "Before Advice" on each method, passing the context of the method invocation to the Micro Effects engine. This is performed from within C#, by calling a specific JavaScript method (method name defined in the Spring context).

IApplicationContext context = 
  new XmlApplicationContext("Config/AopExample.xml");

IClassBeingAdvised obj = 
  context.GetObject<IClassBeingAdvised>("proxiedTargetObject");

//Call some methods on the object
obj.Initialize();
obj.PrintTheFollowingStatement("Hello Dynamic AOP Handling!");

The class being advised

public class ClassBeingAdvised : IClassBeingAdvised {

  public void Initialize() {
    Debug.WriteLine("Doing some initialization.");
  }

  public void PrintTheFollowingStatement(string statement) {
    Debug.WriteLine(statement);
  }
}

Spring Application Context

We will reuse the Environment configuration and define the script within the body of the application context.

<?xml version="1.0" encoding="UTF-8"?>
<objects xmlns="http://www.springframework.net"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:aop="http://www.springframework.net/aop"
 xsi:schemaLocation="http://www.springframework.net 
                      http://www.springframework.net/xsd/spring-objects.xsd
                      http://www.springframework.net/aop
                      http://www.springframework.net/xsd/spring-aop.xsd">

  <import resource="MfxEnvironmentConfig.xml"/>

  <!-- The object that will be Proxied with Advice -->
  <object id="targetObject" 
          type="com.berico.mfx.examples.ClassBeingAdvised" />

  <!-- The proxy of the target object -->
  <object id="proxiedTargetObject" 
          type="Spring.Aop.Framework.ProxyFactoryObject, Spring.Aop">
    <property name="proxyInterfaces" 
              value="com.berico.mfx.examples.IClassBeingAdvised"/>
    <property name="target" ref="targetObject" />
    <property name="interceptorNames">
      <list>
        <value>beforeAdvisor</value>
      </list>
    </property>
  </object>
  
  <!-- The advising class (we inject the MfxEngine into the advisor) -->
  <object id="beforeAdvisor" 
          type="com.berico.mfx.ext.spring.aop.MfxEngineBeforeAdvice">
    <!-- This is not necessary; the default 
         method name is "runBeforeAdvice" -->
    <property name="AdvisingMethod" value="runBeforeAdvice" />
    <property name="Engine" ref="aopEngine" />
  </object>

  <!-- The Micro Effects Engine pulled from 
       our previously configured Environment -->
  <object id="aopEngine" type="com.berico.mfx.MfxEngine"
        factory-object="mfxEnvironment" factory-method="GetEngine"
        init-method="Prepare">
    <property name="RegisteredScripts">
      <list element-type="com.berico.mfx.Script">
        <!-- Create a Script to handle the Advice -->
        <object type="com.berico.mfx.Script">
          <property name="Info">
            <object type="com.berico.mfx.ScriptInfo">
              <property name="ID" value="Before Advisor" />
              <property name="Application" value="MicroEffects Examples" />
              <property name="Component" value="AOP Demo" />
              <property name="Version" value="1.0.0.0" />
            </object>
          </property>
          <property name="Body">
            <value>
              <![CDATA[
              
              console.log("Script Initialized");
              
              // This is called when the method is intercepted
              function runBeforeAdvice(method, argsArray, target){
              
                console.log("Target: {0}", target);
                
                //If we have arguments, lets inspect the first
                if(argsArray.Length > 0){
                  console.log("First Argument = '{0}'", argsArray[0]);
                }
                
              }
              
              ]]>
            </value>
          </property>
        </object>
      </list>
    </property>
  </object>
</objects>

Console Output

2011-04-18T22:42:02.7245000-04:00 [AdHoc] INFO - Target: com.berico.mfx.examples.ClassBeingAdvised
Doing some initialization.
2011-04-18T22:42:02.7435000-04:00 [AdHoc] INFO - Target: com.berico.mfx.examples.ClassBeingAdvised
2011-04-18T22:42:02.7495000-04:00 [AdHoc] INFO - First Argument = 'Hello Dynamic AOP Handling!'
Hello Dynamic AOP Handling!
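
If Spring.NET AOP is new to you, the same "before advice" shape can be sketched in plain JavaScript with the standard Proxy object. This is an analogy, not how ProxyFactoryObject works internally: withBeforeAdvice and adviseeTarget are names I made up, and the advice function mirrors the runBeforeAdvice(method, argsArray, target) signature from the script above.

```javascript
// Hypothetical sketch using the standard Proxy object; not the Spring.NET mechanism.
function withBeforeAdvice(target, advice) {
  return new Proxy(target, {
    get(obj, property) {
      const member = obj[property];
      if (typeof member !== "function") return member;
      // Wrap each method so the advice runs first, then the real method.
      return (...args) => {
        advice(String(property), args, obj);
        return member.apply(obj, args);
      };
    },
  });
}

const trace = [];
const adviseeTarget = {
  initialize() { trace.push("Doing some initialization."); },
  printTheFollowingStatement(statement) { trace.push(statement); },
};

// Same parameters as the advice function in the script above.
function runBeforeAdvice(method, argsArray, target) {
  trace.push("advice: " + method);
  if (argsArray.length > 0) trace.push("First Argument = '" + argsArray[0] + "'");
}

const proxiedTarget = withBeforeAdvice(adviseeTarget, runBeforeAdvice);
proxiedTarget.initialize();
proxiedTarget.printTheFollowingStatement("Hello Dynamic AOP Handling!");
console.log(trace);
```

As in the console output above, the advice lines always appear before the output of the method they advise.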

I hope this preview gives you an idea of the power scripting offers in Enterprise Components and what Berico Micro Effects has to offer. If you are interested in Micro Effects, please leave me a message and I will get in touch with you soon. We plan on releasing Micro Effects as an open source project, but there are still a couple of features I want to implement before a release.

Richard.