Onyx: A new data bridge

tl;dr: Onyx can do imports and exports between SQL, Kafka, HornetQ, core.async, and Datomic. Code is here to route from MySQL to Datomic and back.

Ahem. Now for the more patient crowd…

If you haven’t heard yet, Onyx is the new kid on the data processing block. Onyx is a scalable, fault tolerant, distributed data processing framework for Clojure. It was open sourced at my StrangeLoop talk last week. Within 6 days of launching, it’s seen industry interest and adoption as a direct replacement for Apache Storm.

Interestingly enough, when I began this project, I set out to replace Cascading - which is a batch processing framework on top of Hadoop. Replacing Storm became a goal later in the project when the design firmed up into a batch/streaming hybrid. I’m pleased with the broad variety of use cases that Onyx can serve.

In this post, I’d like to focus on one particular area where I think Onyx particularly excels - storage medium transfer. Storage medium transfer refers to an activity that moves data from one type of storage (say, MySQL) to another type of storage (perhaps Datomic). It can be somewhat tricky to coordinate this activity at scale, as well as to express the semantics for what data “looks like” in its destination storage.

I recall a moment from the very start of my career with distributed systems. My task for the day was to use Apache Sqoop to migrate some data from an existing MySQL database into HDFS. I found it peculiar that there was a separate set of distributed computation tools for data movement, apart from the ones I was using for distributed analysis at the time - namely map-reduce. More curiously still, I wondered why this toolchain was kept isolated, even though it was built directly on top of map-reduce. After some thought, I found this situation to be very telling about the state of distributed application design.

Onyx’s super power lies in its computational expressiveness. That is, a distributed computation can be described as a plain data structure. Data structures transcend language, locality, and time. Onyx can reach all the way to the browser - something nearly all other distributed computation frameworks have trouble doing. Techniques like map-reduce invariably lock you into language-level constructs (functions, macros, objects) and particular modes of expression (function composition, object chaining). No wonder framework authors abandon this level of abstraction when it comes to storage medium transfer.

Another exciting advantage of Onyx is its IO composability. Often, distributed computation frameworks need to read and write data across novel storage mediums. To be successful in this area, a competitor framework must make it easy to adapt to new storage. Onyx accomplishes this through plugins, and you can get started with IO for Datomic, Kafka, SQL, HornetQ, and core.async today. Plugins are fully configurable through data. That means changing plugins requires virtually no change to your application. This is gorgeous for debugging production workflows locally.
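To give a flavor of what "fully configurable through data" means, here's a sketch of a catalog entry for an input task. The plugin-specific keys below are illustrative - each plugin documents its own - but the point stands: the task is just a map.

(def read-rows
  {:onyx/name :read-rows
   :onyx/medium :sql          ;; swap the medium, e.g. to :core.async, to debug locally
   :onyx/batch-size 1000
   :sql/table :people})       ;; plugin-specific keys (illustrative) sit alongside the :onyx/* ones

Changing storage means editing a map like this, not rewriting application code.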

In Onyx, storage medium transfer is just a special case of distributed computation. There are three parts to performing storage medium transfer:

  1. Read data from the target data source
  2. Perform semantic transformations on the ingested data
  3. Write to the destination data source

Notably, steps 1 and 3 can be satisfied by existing plugins - meaning there’s no burden on the developer. If a plugin doesn’t already exist, it’s reasonably straightforward to write one.

I built Onyx to give me simple materials at scale. One of those materials is plain functions. Point 2 can be satisfied by writing a single Clojure function. This function receives a segment, compacted with data from the target data source, and emits one or more segments. The new segments can be semantically transformed for the new data source.
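As a quick sketch, such a function might look like this - the segment shapes mirror the example output later in this post, and the real project's function lives in the linked repository:

(defn mysql-row->datomic-user
  "Takes a segment read from MySQL and emits a segment shaped for Datomic."
  [{:keys [name age]}]
  [{:user/name name :user/age age}])

;; (mysql-row->datomic-user {:name "Mike" :age 23})
;; => [{:user/name "Mike", :user/age 23}]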

Enough talk, time for an example. Today, we’re going to load up a MySQL database with a table of data. We’ll use Onyx to transfer the data into Datomic. Finally, we’ll transfer the data from Datomic back into MySQL - a full round trip.

I’ll let the code do the talking from here; it’s heavily commented. You can see that we start with a MySQL table, route all of its data to Datomic, and then go in the reverse direction. You can clone and run the repository with a MySQL instance - ZooKeeper and HornetQ run in memory.

Output should look like the following:

"Datomic…"

({:user/age 23, :user/name "Mike"}
 {:user/age 32, :user/name "Bridget"}
 {:user/age 24, :user/name "Dorrene"}
 {:user/age 25, :user/name "Amanda"}
 {:user/age 70, :user/name "Joe"}
 {:user/age 30, :user/name "Steven"})

"MySQL…"

({:age 23, :name "Mike"}
 {:age 24, :name "Dorrene"}
 {:age 32, :name "Bridget"}
 {:age 70, :name "Joe"}
 {:age 25, :name "Amanda"}
 {:age 30, :name "Steven"})

I hope this short example demonstrates how Onyx can act as a powerful data bridge. In under 350 lines, we move data from MySQL to Datomic and back. Try it on your own data set. Write a new plugin. Explore, and fall back on the coordination offered by Onyx where necessary.

Like my work? Let me know @MichaelDrogalis on Twitter.

Onyx: Distributed Data Processing for Clojure

I’m pleased to announce the release of Onyx. Onyx is a new cloud scale, fault tolerant, distributed data processing framework for Clojure. It’s been in development since November of 2013, and I’m elated to open source it to you!

[Image: the Onyx logo]

Fancy logo, huh?

Onyx is a batch and stream processing hybrid, and it offers transparent code reuse between both modes. This makes Onyx applicable in situations where you’d want to do data extraction, information ingestion, map/reduce operations, or event stream processing. It’s also killer as a low-ceremony tool for replicating data across multiple storage mediums.

[Image: the Onyx plugin ecosystem]

Use the existing Onyx plugins to leverage your storage. Adapt to your needs by writing your own plugin.

The Big Deal about Onyx is rooted in its treatment of computations as more than code. There’s a fundamental difference between the specification of a distributed computation and its mechanism of execution. The former describes the structure and flow of what we’re trying to accomplish, whereas the latter is concerned with the concrete building blocks that we use to construct our programs. I contend that these two concerns are deeply complected in even the most mature frameworks.

Consider map/reduce for a moment. Map/reduce is a long-held, proven technique for processing data at enormous scales. But we notice that even map/reduce falls prey to this subtle complexity. By using programming language constructs to chain map and reduce operations together, we’ve baked the flow of our program into its mechanism - namely map/reduce.

I care a lot about my ability to grow the structure of a distributed computation independently of its mechanism, and you should too. As our field matures, we’re frequently seeing customer requirements for distributed systems involve user control of the workflow running against large data sets. You’ll experience the frustration the moment that a programmer typing into an editor and compiling a JAR file isn’t the primary controller of the structure of the workflow. We’re starting to tug hard on a fundamental complexity in the way contemporary computation frameworks are designed, and it’s time for a serious change to alleviate the pain.

Onyx cuts a hard line down the concerns of computation specification and mechanism. It aggressively uses data structures to enable specification construction at a distance - in a web form, on another box in another data center, by another program, etc.
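As a sketch of what "specification as data" looks like - the shapes here are illustrative, and the documentation has the exact formats:

;; The structure of the computation, captured as plain data:
(def workflow
  [[:read-rows :transform-rows]
   [:transform-rows :write-rows]])

;; Each task is also described by data, keyed by name:
(def catalog
  [{:onyx/name :transform-rows
    :onyx/fn :my.app/transform-rows
    :onyx/batch-size 1000}])

Because these are just values, they can be assembled in a web form, stored in a database, shipped across the network, or generated by another program.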

You can have your first Onyx program up and running in just a few minutes using the Starter repository. It doesn’t require any external dependencies other than Leiningen and Java. There’s also a walkthrough to guide you through each piece of the program.

Deploy to the cloud, you say? There’s virtually transparent code reuse between the development environment and the constructs necessary to run in a fully fault-tolerant setting. A production Onyx environment depends on ZooKeeper and HornetQ - two very mature technologies. You’re closer than you think to your first deployment of Onyx into your data center or cloud provider.

So, if you’re ready to give it a shot, I’d recommend browsing the Examples repository to get your feet wet, and referencing the full documentation as you go. If you run into any trouble, I’m more than happy to help on the mailing list.

I hope Onyx helps you to build genuinely simpler systems and avoid much of the pain I experienced trying to build highly dynamic systems with modern tooling. If you’re interested in taking Onyx to production, get in touch. I will personally help you.

 My name is Michael Drogalis. I’m an independent software engineering consultant. Want to work together? Send a tweet to @MichaelDrogalis or an email to mjd3089-at-rit-dot-edu.

Getting a Grip on the Small Things

One of the most effective ways to bring a software project to success is to aggressively pursue simplicity throughout the entire design. Simplicity, as Rich Hickey and the dictionary define it, means not intertwined. It’s a criterion that can be objectively assessed. The best designers have a multitude of different techniques at their disposal to achieve this goal at every layer of a system.

At the highest level, usage of queues, transactions, commutativity, and immutable data structures helps us to architect beautiful systems composed of multiple programs. Further down the stack, pure functions, communicating sequential processes, and software transactional memory guide us to the promised land of the decomplected. Perhaps even a little below this level lies the challenge of effectively designing for error handling, logging, monitoring, metrics, and conditions. What constructs does the masterful designer turn to for fending off this potential complexity?

I think the answer to this question is that many of us have simply given up. I’ve seen too many software projects littered with functions like the following:
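The original snippet isn't reproduced here; a reconstruction of the sort of function I mean looks like this (the log, metrics, and jdbc namespaces are stand-ins):

(defn find-user [db user-id]
  {:pre [(number? user-id)]}                      ;; precondition validation
  (log/info "Looking up user" user-id)            ;; logging
  (metrics/increment! :user-lookups)              ;; metrics
  (try
    (jdbc/query db ["SELECT * FROM users WHERE id = ?" user-id])
    (catch Exception e
      (log/error e "User lookup failed for" user-id)
      (metrics/increment! :user-lookup-errors)
      {:error :lookup-failed})))                  ;; error values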

It’s my contention that the small things often end up drowning us in complexity. This hypothetical function, while only 10 lines, is hopelessly complected in at least 7 ways. Aside from actually executing a query and returning its results, it also performs logging, precondition validation, records metrics, logs error states, handles error conditions, and determines error values.

In my opinion, most developers choose to ignore the littering that ensues from this approach because aspects like logging and metrics usually don’t affect the output of the function. I’m not saying this is actually the case, because excessive logging can fill up the local disk, or metric calls can raise socket errors. But the vast majority of the time, less skilled developers operate in the closed mode and care only about the fact that the output is typically as expected.

We know better.

"But no matter what technology you use …, the complexity will eventually kill you. It will kill you in a way that will make every sprint accomplish less - most sprints be about completely redoing things you’ve already done. And the net effect is that you’re not moving forward in any significant way." - Rich Hickey

For these challenging aspects, I devised a solution that dramatically reduces complexity. About a year and a half ago, I invented and open sourced Dire. Dire provides decomplected, ad-hoc error handling, conditions, and a few other helpful constructs. Dire was, in my opinion, a step in the right direction.

Perhaps the most interesting thing one can do with Dire is conditional loading of function modifiers. This technique pleasantly lets you compose loggers, exception handlers, conditions, and other common aspects. The recent API addition for removing these function modifiers means that runtime decisions can be made to add or drop aspects.
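A sketch of that composition, based on Dire's documented API (see the README for the exact signatures):

(require '[dire.core :refer [with-pre-hook! with-handler!]])

(defn add-numbers [a b] (+ a b))

;; a logging aspect, attached from the outside
(with-pre-hook! #'add-numbers
  (fn [a b] (println "Adding" a "and" b)))

;; an error-handling aspect, also attached from the outside
(with-handler! #'add-numbers
  java.lang.ClassCastException
  (fn [e a b] (println "Can't add non-numbers:" a b)))

The function itself stays pure; the aspects can be attached, and with the newer removal API, dropped again at runtime.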

This technique drew criticism, though. With the flexibility of runtime composition, we lost temporal control over who was adding or removing these aspects, and when. I spent a couple of months in search of a remedy to this incidental complexity. I never came up with an answer.

Sometime later, I was happy to hear that Stuart Sierra open sourced a library called Component. Stuart’s library provides just enough structure to control the stateful parts of a program in a way that’s particularly easy to understand. Conceptual components can be composed, dependencies made explicit, and set up/tear down made convenient from the REPL. Stuart unknowingly finished the puzzle that tortured me.

The following Gist shows how we can capture each aspect as a Component record (it’s on GitHub too). This allows aspects to be composed and returns the temporal control that we previously lost. I’ve used this technique of combining Dire with Component many times over the last few months with great success. In particular, this makes a stellar way to slot Riemann metrics reporting into your program without mucking up your application logic.
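Here's a minimal sketch of the idea (not the original Gist): wrap the attachment of a Dire aspect in a Component, so starting and stopping the system controls when the aspect exists. The function and handler are illustrative.

(require '[com.stuartsierra.component :as component]
         '[dire.core :as dire])

(defn fetch-user [db id] (get db id))

(defrecord FetchUserErrorHandling []
  component/Lifecycle
  (start [this]
    (dire/with-handler! #'fetch-user
      java.lang.Exception
      (fn [e & args] (println "fetch-user failed:" (.getMessage e))))
    this)
  (stop [this]
    ;; detach the handler here via Dire's removal API so stopping the
    ;; component restores the unadorned function
    this))

(def system
  (component/system-map :fetch-user-errors (->FetchUserErrorHandling)))

;; (component/start system) attaches the aspect; (component/stop system) tears it down.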

I hope this helps you build genuinely simpler programs.

My name is Michael Drogalis. I’m an independent software engineering consultant and contractor. Interested in working with me? Send a tweet over to @MichaelDrogalis or an email to mjd3089 at rit dot edu. I’d be delighted to hear from you.

Pipelining with core.async

There are many techniques for parallelizing the execution of programs. We’ll be looking at just one in this post - pipelining. I’ve constructed a little idiom you can follow when this design technique is appropriate.

Pipelining is a method that can be applied to tasks that meet two criteria:

1. The task can be broken into subtasks.

2. No single subtask may be executing more than once at the same time.

The Wikipedia article about pipelining as a concept is pretty good. Or you can just keep reading - it’ll probably make more sense by the end if it doesn’t now.

Here are three functions that we chain together. Very straightforward:
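The original Gist isn't shown here; these stand-ins will do for the walkthrough:

(defn m [x] (inc x))
(defn n [x] (* 2 x))
(defn o [x] (str x))

(o (n (m 5))) ;; => "12"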

There’s no need to parallelize that. Let’s imagine that those functions perform side effects that take a while:
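The sleep durations below are illustrative, chosen to add up to the 3.5 seconds mentioned next:

(defn m [x] (Thread/sleep 1000) (inc x))
(defn n [x] (Thread/sleep 1000) (* 2 x))
(defn o [x] (Thread/sleep 1500) (str x))

(time (o (n (m 5)))) ;; => "12", after roughly 3500 msecs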

It takes 3.5 seconds to execute this chain of functions.

What if function ‘m’ could be executing in parallel while ‘n’ and ‘o’ are too? More work can be accomplished faster. The only thing we want to avoid is ‘m’ running more than once at a single point in time. The same constraint applies to ‘n’ and ‘o’.

And we see the output here, showing how function ‘m’ can do a lot more work, with function ‘o’ catching up at the end:

It works by making a channel for each function. We then spin up infinitely looping go-blocks that wait for something on the channel. The function is applied to the channel contents, and placed on the next channel. We return the head channel from the pipeline function so we have a reference to the channel to feed values into.
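A sketch of that idiom (not the original Gist - the name pipeline-chain and the usage below are my own choices):

(require '[clojure.core.async :refer [chan go-loop <! >! >!!]])

(defn pipeline-chain
  "Wires each f to its own channel and go-loop; returns the head channel."
  [fs out-ch]
  (let [in-chs (vec (repeatedly (count fs) chan))
        outs   (conj (vec (rest in-chs)) out-ch)]
    (doseq [[f in out] (map vector fs in-chs outs)]
      (go-loop []
        (when-some [v (<! in)]
          (>! out (f v))
          (recur))))
    (first in-chs)))

;; (def head (pipeline-chain [m n o] (chan 10)))
;; (>!! head 5) ;; each stage handles one value at a time, but all three
;;              ;; stages can be busy with different values simultaneously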

Pull out this template as needed. It’s on GitHub. Tweet at @MichaelDrogalis.

Dire 0.5.0: Slingshot integration

Few things are more satisfying than composing two solid libraries to bring applications to greater heights. I’m elated to announce that, thanks to the work of Dylan Paris, Dire now integrates with Slingshot.

The 0.5.0 release readily allows applications to catch maps thrown by Slingshot throw+ calls. Gone are the days when one was restricted to catching based only on exception types. More interesting is the resolution of the following situation.

In a typical try/catch scenario, application code must decide, in the case of failure, which exception type to raise up the stack. This complects application logic with error handling strategies, and often conflates recovery logic too.

In previous versions of Dire, the best one could do in Clojure was to throw an ExceptionInfo map up the stack, use a Dire hook to catch the exception type, and dispatch on a multimethod inside the handler. Sort of okay, but not great.

With 0.5.0, we can now dispatch based on predicates. This is tremendously powerful. It pulls error-handling decisions back in line with the Open/Closed principle. That is: application logic, the decision of how to react to failure, and the actual reaction to failure are decomplected. Applications can behave the Erlang way. Let it fail, they say! With enough context being raised up the stack, we can delay the decision about what the error actually is by using predicates, ultimately deciding much later in the process of call-chain unwinding.
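A sketch of the style this enables - the selector and handler shapes follow the Dire README, and the domain here is made up:

(require '[dire.core :refer [with-handler!]]
         '[slingshot.slingshot :refer [throw+]])

(defn save-user! [user]
  (when-not (:name user)
    (throw+ {:type :validation :missing :name}))  ;; raise context, not a decision
  user)

(with-handler! #'save-user!
  ;; dispatch on a predicate over the thrown map, far from the throw site
  (fn [e & _] (= (:type e) :validation))
  (fn [e & args] (println "Validation failed:" e)))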

Onward we work, constructing ever more simple programs.

Clojure Understood: the Rush Hour Platform

Software development is fundamentally about taking things apart. I’ve been taught this concept, along with a handful of other core principles, by Rich Hickey and the Clojure community over the last few years. I wanted to put it all together in a medium for my own learning and as an example to others (especially non-Clojurians) about how to build genuinely simple systems.

It’s my pleasure to unveil the Rush Hour platform. Rush Hour exposes the facilities to create highly accurate vehicle traffic simulations. It ships with a rules language to describe chains of arbitrarily complex intersections with traffic lights and stop signs. Additionally, it visualizes the simulation using a dynamic heat map drawn on Google Maps.

[Image: dynamic heat map of simulated traffic drawn on Google Maps]

It’s architected in a way that pays homage to the way I’ve been taught to build systems. I chose to build a traffic simulation because it’s a domain that’s highly familiar to most people, which minimizes the amount of learning you have to do to understand what it *is*, and maximizes the amount of learning you can do about its underlying principles. I’ll now describe Rush Hour’s architecture.

The big idea

Rush Hour is composed of three major components: the simulation (Sim), Asphalt, and a web service called the Triangulation service. It’s quite simple.

The Sim is a loop that transitions from values to values. This transition is computed in parallel using Clojure Reducers. A storage abstraction of a few small protocols sits in front of an in-memory data structure that acts as a datastore for the Sim to use. The schema for city streets, rules about traffic, and timing of traffic lights are housed in this data store. The Sim exposes the current state outside the loop via an agent. One observer of the agent watches and serves up changes over a websocket streaming API.

[Image: architecture diagram of the Sim]

Asphalt is a client of the Sim streaming API. It receives values emitted by the Sim. Asphalt analyzes the values one at a time and uses the Triangulation web service to determine a set of coordinates that describe the location of all the cars on the road. Asphalt itself exposes a streaming API too. A ClojureScript program listens to Asphalt’s streaming API and draws a dynamic heat map of the cars on Google Maps.

[Image: architecture diagram of Asphalt and the visualization]

And that’s it. Here’s a live demo of a few blocks in Philadelphia. Caution: the rendering of the heat map is somewhat intense. It’s not mobile friendly.

Let’s zoom in on why this is interesting.

The value of values

As usual, a system based around immutable values is the way to go. The Sim is a process that’s about transforming one snapshot of the world into another. That’s all it does. It accomplishes this by applying a pure function to a snapshot to produce another snapshot. It’s a plain Jane infinite loop.

Since it’s all immutable, we can parallelize the computation with Clojure Reducers. That means the Sim makes good use of being hosted on beefier machines. All the logic to transition states is pure, and hence easy to reason about, easy to test, and so on. No concurrent semantics, locks, promises, deals with the devil, etc. Recreating state between components is a breeze since it’s just data - no objects, custom types, connections, or any of that yuckiness.
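A sketch of that parallel transition with reducers - the entity shape and step function are illustrative, not Rush Hour's actual code:

(require '[clojure.core.reducers :as r])

(defn step-car [car]
  (update-in car [:position] + (:speed car)))

(defn step-all [cars]
  ;; r/fold splits the vector of entities and recombines results in parallel
  (r/foldcat (r/map step-car cars)))

;; (step-all [{:position 0 :speed 3} {:position 10 :speed 1}])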

Just like Datomic and Simulant, we move time out of the equation. There’s no notion of time in the code that computes the next state. Time passage is simulated at the tail end of the simulation loop with a single sleep. That value can be adjusted to control the rate at which the simulation runs as compared to ‘real time’. Each time the clock ‘ticks’, every entity in the system gets an opportunity to change state. One clock tick per second runs the simulation in ‘normal time’. Making the clock tick faster results in time moving faster within the Sim. Clocks with nonconstant properties can be used - I chose not to do this.
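In code, the shape of that loop is roughly this (names are illustrative, not Rush Hour's actual source):

(def current-state (agent nil))

(defn tick [snapshot]
  ;; a pure function from one snapshot of the world to the next
  (update-in snapshot [:tick] inc))

(defn run-sim! [initial ms-per-tick]
  (loop [snapshot initial]
    (send current-state (constantly snapshot)) ;; expose the value via the agent
    (Thread/sleep ms-per-tick)                 ;; time lives only here
    (recur (tick snapshot))))

;; (future (run-sim! {:tick 0} 1000)) ;; one tick per second ~= 'normal time'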

The Sim also pushes out another form of complexity. There is no notion of addressing or coordinates. We get much better reach by describing streets as lines of a certain length, and car position as a point on the line. This enables one to describe fictitious streets. No need to talk to Google Directions API in this component.

In order to be useful, we need a tiny amount of mutable state. The agent that holds the current snapshot takes care of this. There is a little more mutability though. I made the design choice that 3rd parties should be able to “inject” traffic into the Sim at runtime. To accommodate this, each street has a j.u.c. blocking queue associated with it. Anything sitting on the queue gets pulled into the street just before the end of the transition function. I consider this very controlled mutability, though. It’s uniformly operated on and has tightly isolated scope.
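A sketch of that injection point (illustrative, not the actual Rush Hour API):

(import 'java.util.concurrent.LinkedBlockingQueue)

(def ingress (LinkedBlockingQueue.)) ;; one of these per street

(defn inject-car! [car] (.offer ingress car))

(defn drain-ingress! []
  ;; called just before the end of the transition function
  (let [sink (java.util.ArrayList.)]
    (.drainTo ingress sink)
    (vec sink)))

;; (inject-car! {:id 42 :speed 30})
;; (drain-ingress!) ;; => [{:id 42, :speed 30}]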

Data all the way down

"Data - please! We’re programmers! We’re supposed to write data processing programs. There’s all these programs, and they don’t have any data in them. They have all these constructs you put around it, globbed on top of data. Data is actually really simple. There’s not a tremendous number of variations in the essential nature of data." - Rich Hickey

Data is king. Rush Hour has a rather small code base for how much it accomplishes. This can mostly be credited to the aggressive use of data.

Rules are data

Descriptions of laws of traffic for each intersection are maps. They can be created in any language, by a human or program. Rules use unification for a declarative style of conveying the laws of traffic. This is extremely powerful, as it obviates conditionals that would otherwise run rampant throughout the program.

Schema is data

The lanes themselves and how they connect are also maps. This makes them amenable to static analysis by a tool in any language. Additionally, integrating with Rhizome to create graphical pictures of roads is a cinch.

Duration is data

Time is pushed out of the equation by transitioning purely from values to values. But - traffic lights don’t update uniformly across the city at each clock tick. This fact can be conveyed as data to keep its meaning evident.

Navigation is data

Descriptions of individual intersections are isolated to facts about themselves. To connect the streets of one intersection to another, it’s all data all the way.

Directions are data

We want to give a realistic depiction of how people drive around the city - not choosing streets at random. This gets represented as a weighted map to bias choices about where to go next.
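For example - the street names and the sampling function here are illustrative:

(def next-street-weights {:walnut-st 0.6 :chestnut-st 0.3 :market-st 0.1})

(defn choose-next [weights]
  (let [total (reduce + (vals weights))
        r     (rand total)]
    (loop [[[k w] & more] (seq weights) acc 0.0]
      (let [acc (+ acc w)]
        (if (or (nil? more) (< r acc)) k (recur more acc))))))

;; (choose-next next-street-weights) ;; => :walnut-st, more often than not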

From another angle, since it’s all data, we get terrific reuse out of all the constructs mentioned above.

Another more subtle point is the style of testing that this sort of thing allows. While I didn’t employ it, the door is wide open to generative testing because it’s all well-specified data.

Taking things apart

"I think one of the most interesting things about design is that people think it’s about generating this intricate plan - but designing is not that. Designing is fundamentally about taking things apart. It’s about taking things apart in such a way that they can be put back together." - RH

This project was an exercise in taking things apart. I sincerely believe that this is the most important concept to understand as a designer.

Just at the edge of the Sim’s boundary is the mutable agent that contains the latest snapshot. It’s here that we can add a component that does one thing well. The streaming API watches for changes to the agent and pushes values to consumers. Its involvement with the Sim ends there.

Similarly, other components can watch the agent without getting in the way. Implementing a HornetQ queueing communication protocol is an open operation. The same goes for other communication protocols - they all get grouped together in one spot. Communication protocols are another thing you can take apart.

There are more things that we make simples out of. From the domain, we take apart streets, lanes, lights, light sequences, traffic rules, directions, and intersection connections. From the solution side, we take apart storage choices with protocols, clock implementations, navigation algorithms, communication mechanisms, and visual maps. All decomplected.

Relentless caching

To the extent that just about everything is immutable, we can cache relentlessly. Rush Hour uses Elasticache and Clojure memoization (in-memory caching). This dramatically decreases network traffic, to the point where when all caches are hot, very little data crosses the wire at all.
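The in-memory half is as plain as it sounds - for example (the Triangulation call here is stubbed out):

(defn triangulate* [street-id position]
  ;; imagine an HTTP call to the Triangulation service here
  {:lat 39.9526 :lng -75.1652})

(def triangulate (memoize triangulate*))

;; repeated calls with the same arguments never leave the process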

Queues rule

"The other great thing about conveyor belts and queues is that… What do they do? What’s their job? They move stuff. What’s their other job? There’s no other job. That’s all they do." - RH

Rich has been saying this one for years. Put queues between the major components of your architecture. The architectural agility gained from having independence in the identity and availability of communicating parties is huge. Rush Hour’s simulation walls off consumers by exposing a single agent, and opening up its streaming and (future) HornetQ API off of that. This lets developers make use of the Sim from outside Clojure. Language boundaries are transcended.

Queues clear the way for an open, pluggable system. Adding durability of snapshots via Datomic and monitoring via CloudWatch or Riemann (or both!) are tasks that require no modification to existing code. The power of queues lets us add more consumers (at runtime!) to react to change. It’s really great - Rush Hour has tons of connection points to build on, giving it a large surface area. If you want an open system, you do it like this.

Human vs. machine interfaces

"But of course, we also start to see the levels, right? If you look at the back of one of these modules, there’s another piece of design there. And these … are analogue circuits that determine what the module does. The other thing that’s interesting … is that each of these knobs has a corresponding jack. In other words - there’s a human interface, and a machine interface. … And the machine interfaces were there all the time. In fact, they were first. And then the human interfaces come. … You can always build a human interface on top of a machine interface, but the other is often disgusting." - RH

The machine interfaces for Rush Hour are in place. Those multi-thousand-line EDN files serve as a great machine interface, and not a bad human interface. I wrote them by hand. I didn’t particularly enjoy it, but it was easy to reason about. The pieces are in place for a human interface to be built on top of the machine interface.

Good testing

"If you have designs and they specify things well, and you have some automated way to go from that specification to a test - that’s good testing. Everything else is backwards." - RH

"Test systems - not functions." - Timothy Baldridge

The Sim uses scenario-based testing, and does very little unit-level work. It’s the same idea that core.async uses for testing. It has hundreds of lines of code that don’t have unit tests, and instead uses overarching tests to verify behavior. We don’t care to test implementation details. They’re a brittle thing to maintain tests for. If you can write overarching tests that make it easy to figure out what’s broken when tests fail, you are golden.

Coltrane couldn’t build a website in a day

"There are people who can make music by waving their hands through the air." - RH

I think one of the more interesting anecdotes about this project is that in its 5 month development effort, the entire first month was spent on the hammock. I spent a lot of time in June lying in Rittenhouse park working out the complexity of the problem space. Coding didn’t proceed until I was finished taking things apart, and ready with a solution to put the pieces back together.

I was sort of forced into this period of hammock time. I moved to Philadelphia at the beginning of June and didn’t have internet for a few weeks. With my evenings completely free of distractions after work, I had time to disassemble the problem space. The hammock period was by far the most difficult phase, and bordered on excruciating at some points. It’s hugely frustrating to remain patient and work past the feeling of producing no code. That is, of course, merely the voice of insecurity tempting you to act early. Take the time to hammock. It sets the stage for an incredible show.

Fin

For a few months, I was a man possessed building this system. It’s not perfect (or perhaps even useful!), but I hope it serves as an example for you to learn from and teach others.

Thanks

A huge thank you to a few people that helped me create this. Timothy Baldridge gave me plenty of guidance in building the Sim. James Drogalis did the math for me to compute coordinates based off of simulation data.

Contact

Tweet at @MichaelDrogalis or email at mjd3089 at rit dot edu, or mdrogalis on #clojure IRC. There’s not much in the way of documentation, because I honestly don’t expect anyone to use this project so much as use it as a guide for learning. If you want to build on it, I’d be happy to hook you up with some docs.

Handy, super lightweight debugging utility

When I’m debugging, I really dislike having to type println’s or prn’s to see the incoming arguments to functions and their return values as data flows through my program. So I wrote a very lightweight library with a function to be employed at the REPL that does it for me. I call it Night Vision:
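Night Vision's own snippet isn't reproduced here; a minimal sketch of the idea - wrapping a var so its arguments and return value print - might look like this:

(defn spy-var! [v]
  (alter-var-root v
    (fn [f]
      (fn [& args]
        (println "args:  " args)
        (let [ret (apply f args)]
          (println "return:" ret)
          ret)))))

;; (spy-var! #'my.ns/my-fn) ;; subsequent calls print their inputs and output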

See the GitHub page for getting set up. It’s a 20-line library, so forks to make it better are easy and welcome.

Follow @MichaelDrogalis

A puzzle (or a bug)

Consider the following behavior (function ‘make-key’ not shown):

Can you explain this behavior?

Answer.

Debugged in about 3 minutes. First 2 minutes were making sure my eyes weren’t playing tricks on me.