Kickstarting

After a lot of work over the past few years, I have teamed up with a great mate of mine and gone into the hardware business! Well, the business of kickstarting hardware!

It’s going to be one interesting journey. Remember that hardware is hard!

You can keep track of what we are up to on our site or our blog. It’s going to be very interesting, and we expect to learn a great deal along the way, but we can’t wait to get out there and move towards Kickstarter!

 

Installing Leiningen behind a proxy

Leiningen is the de facto build tool for Clojure projects. Installing it in a corporate environment probably means installing behind a proxy, which I found wasn’t that well documented even though it’s very simple and takes just a few steps:

  1. Set the environment variables http_proxy and https_proxy; Leiningen will honor these settings.
  2. Download the lein installer script from http://leiningen.org/#install
  3. Make sure the script is executable and on your path.
  4. Run it with the argument self-install; for example, on Windows run lein.bat self-install

After that you will have it installed. From anywhere on your machine, run lein new my-project and get to work!

Clojure by example: Basic Application

This weekend I decided to knock up an example of some of the great features of Clojure, such as Java inter-op, multi-threading, its data structures and its functional style of coding.

Clojure is a fantastic language that runs on the JVM. It’s a truly modern language that takes into account the growing impact of Moore’s Law on programming, and the fact that object orientation comes with a number of downsides that slow the progress of creating reliable, quality software quickly. If you’re new to Clojure, it’s well worth checking out this video

I created this project to demonstrate how a simple use case can be answered using the various features in Clojure. The source code is up on GitHub. The features demonstrated are:

Functional Code

The key to creating concise, reliable and quality code in Clojure is that it’s a Lisp. Taking the step from Java to Clojure means getting used to working in a Lisp. If this is your first step into programming Clojure, I would suggest you check out Try Clojure and the Clojure Koans before looking into this code; it will make a huge difference to your understanding.

To understand how this works in practice, look at the following line. It takes a list of tweets, keeps only the tweets that have more than 5 retweets, sorts the list by retweet count (ascending by default) and then reverses that list.

(reverse (sort-by :retweets (filter (fn [x] (< 5 (x :retweets ))) @tweets)))

For anyone coming from a Java world, being able to do that in a concise, clear way is a revelation.
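The same pipeline can also be written with the ->> threading macro, which some people find easier to read top to bottom. This is only a sketch: the sample tweets and the popular-tweets function name are made up for illustration and are not taken from the project.

```clojure
;; Hypothetical sample data standing in for the real timeline.
(def tweets
  (ref [{:content "a" :retweets 2}
        {:content "b" :retweets 12}
        {:content "c" :retweets 7}
        {:content "d" :retweets 5}]))

(defn popular-tweets
  "Tweets with more than min-retweets retweets, most retweeted first."
  [all-tweets min-retweets]
  (->> all-tweets
       (filter (fn [t] (< min-retweets (:retweets t)))) ; keep the popular ones
       (sort-by :retweets)                               ; ascending by default
       reverse))                                         ; most retweeted first

(popular-tweets @tweets 5)
;; => ({:content "b", :retweets 12} {:content "c", :retweets 7})
```

Note that the tweet with exactly 5 retweets is dropped, because the test is strictly "more than".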

Java Inter-Op

As Clojure is built on the JVM, it provides access to any libraries and code that would be available to an application written in Java. This is a killer feature of Clojure: it means that it already has a huge base of libraries that it can access and use. There are a number of special forms for accessing the underlying methods which make use of the dot operator. For example, you can call a method on a Java object as such:

(.getText x)
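To try the dot forms without a Twitter client on hand, here is a small sketch that uses only classes from the JDK. The examples are my own and are not from the project.

```clojure
;; Instance method call: (.methodName object args...)
(.toUpperCase "hello")
;; => "HELLO"

;; Constructor call: (ClassName. args...)
(def sb (StringBuilder. "tweet"))
(.append sb "s")
(.toString sb)
;; => "tweets"

;; Static method call: (ClassName/methodName args...)
(Math/max 3 7)
;; => 7
```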

Multi-Threading

Clojure has first class support for working in multiple threads, and its software transactional memory model ensures safe and consistent access to data. In this example the data is accessed using refs. It is read by dereferencing (using the @ symbol for shorthand):

@tweets

Writes occur in a dosync block, very much like transactional writing to a database.

(dosync
  (ref-set tweets
           (map (fn [x] {:content (.getText x) :retweets (.getRetweetCount x)})
                (get-timeline))))
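Here is a self-contained sketch of the same read and write pattern that runs without the Twitter library. The fake-timeline function is a hypothetical stand-in for get-timeline and just returns plain maps.

```clojure
(def tweets (ref []))

;; Hypothetical stand-in for the real get-timeline call.
(defn fake-timeline []
  [{:content "hello" :retweets 3}
   {:content "world" :retweets 9}])

;; ref-set replaces the whole value of the ref inside a transaction.
(dosync
  (ref-set tweets (fake-timeline)))

;; alter applies a function to the current value inside a transaction.
(dosync
  (alter tweets conj {:content "again" :retweets 1}))

(count @tweets)
;; => 3

;; Writing to a ref outside of a transaction throws.
(try
  (ref-set tweets [])
  (catch IllegalStateException e :no-transaction))
;; => :no-transaction
```

Transactions can be retried, so it is generally safer to do slow or side-effecting work, such as a network call, before entering dosync and only set the result inside it.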

Data Structures

Clojure provides 4 basic data structures: lists, vectors, hash-maps and sets. Their rough mapping to Java objects can be seen as such:

  • List – LinkedList
  • Vector – ArrayList
  • Hash-map – HashMap
  • Set – HashSet

The difference between the Java collections and the Clojure ones is that the Clojure ones are immutable and persistent. These qualities mean that accessing them in the multithreaded manner above is safe, fast and reliable. Again, if these ideas are new to you, it’s worth watching the video above.
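A quick sketch of the four literals and of what immutable means in practice. The names and values are made up for illustration.

```clojure
(def a-list   '(1 2 3))
(def a-vector [1 2 3])
(def a-map    {:name "chris" :posts 9})
(def a-set    #{:clojure :java})

;; "Changing" a collection returns a new one; the original is untouched.
(def bigger (conj a-vector 4))
bigger
;; => [1 2 3 4]
a-vector
;; => [1 2 3]

;; conj adds to the front of a list and to the end of a vector.
(conj a-list 0)
;; => (0 1 2 3)

;; assoc returns an updated map and leaves a-map as it was.
(assoc a-map :posts 10)
;; => {:name "chris", :posts 10}
(:posts a-map)
;; => 9

(contains? a-set :java)
;; => true
```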

Summary

I wrote this code to demonstrate some of the key features of Clojure and as a stepping stone from learning the basics into seeing how these tools could be used to create a useful program in very few lines of code. Clone the repo, play around and if you think it could be improved please chuck me a pull request.

Installing Tomcat on CentOS 6

I’ve been doing some web work lately using Tomcat. If anyone needs to get an environment up and running quickly, just use this gist:

The cost of bureaucracy in software companies

A lot of software companies suffer from the same problem: as they grow, they fall into the trap of adding unnecessary bureaucracy. Now, I’m not talking about making sure a work environment is safe, or that people are looked after, or that no bullying takes place. Maintaining a healthy workplace needs a certain level of procedure and policy.

However, as organizations grow it becomes so easy to create unnecessary policies, ones that are so tight they kill any sense of autonomy.

Let’s take an extreme example of this. Let’s say one day someone gets caught torrenting files at work. Apart from firing them, what else could happen? A policy could be set, firewalls could be turned up to max, teams could be brought in to police things, and the software users are allowed could be locked down to a standard set. There are a lot of ways to stop this, but now that one person has ruined things for everyone in the future.

Let’s look at this mathematically. Let’s say you bring in a firewall team, a security team and a compliance team to keep people in check, and let’s say some talented developer needs to run a new service on port 8080 (really, that isn’t even an odd port!).

Now a change that would have taken no time suddenly involves 3 new people, each one bringing their own level of efficiency. If each person is 80% efficient, here’s what happens:

Before

Overall Efficiency = 80%

After

Overall Efficiency = 80% * 80% * 80% * 80% = 40.96%

It gets worse!

That’s right, by adding just 3 people to a decision, working at 80% we’ve managed to reduce overall efficiency by a half!

Now think about decisions in an organization, any modern organization. Think what it would take to change some text on a production website, or to feed back on requirements. As companies grow they bring in various departments, and every single point requires more people to discuss.

Even the best of people, working at 90%, can get stumped by this. Even at that pace, a decision that needs 6 other people means you run at about 48%.

With people working at 50%, the four-person decision above falls to about 6% overall, essentially grinding progress to a halt.
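The arithmetic is just one person’s efficiency raised to the power of the number of people involved. Here is a small sketch in Clojure for playing with the numbers. The function names are my own, and the model assumes everyone works at the same efficiency and that efficiencies simply multiply.

```clojure
(defn overall-efficiency
  "Efficiency of a decision that passes through `people` people,
   each working at `individual` efficiency (between 0.0 and 1.0)."
  [individual people]
  (Math/pow individual people))

(defn as-percent
  "Convert a fraction to a percentage rounded to 2 decimal places."
  [x]
  (/ (Math/round (* 10000.0 x)) 100.0))

(as-percent (overall-efficiency 0.8 1)) ; one developer at 80%
;; => 80.0
(as-percent (overall-efficiency 0.8 4)) ; developer plus 3 others at 80%
;; => 40.96
(as-percent (overall-efficiency 0.9 7)) ; developer plus 6 others at 90%
;; => 47.83
(as-percent (overall-efficiency 0.5 4)) ; developer plus 3 others at 50%
;; => 6.25
```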

Summary

There are levels of bureaucracy that are necessary, but I fail to see how so many organizations can miss something so basic: how important autonomy really is, both to individuals and to the organizations themselves.

Debugging a Cloudera Hadoop install in the cloud

From the outset I will be honest: Hadoop is a nightmare to set up. Its versions are all over the place, mismatches lead to random failures and it’s just not a fun thing to be doing. However, the nice people at Cloudera have a much easier solution to all of this: they provide a nice management interface to install your cluster. Though this is almost seamless, there are a few gotchas that can catch you out. So my hints are below:

Firewalls

The nodes in a Hadoop cluster talk to each other in a lot of different ways. The number of ports you need open, depending on your configuration, is mind-blowing, and from the way things are with Hadoop it’s also ever changing! The shortcut is to shut off your firewall using a command such as:

service iptables stop

Or the equivalent for your Linux version. Now, I’m well aware this isn’t best practice, but if you’re just getting something up and running to test out, or are hitting a brick wall and want to make sure it’s not a firewall problem, then it’s a good test. Later on I’ll cover a long term fix for this.

DNS

Hadoop expects a fully working DNS setup; however, this isn’t always in line with how cloud providers set up their servers. For instance, my host of choice is RackSpace, who are awesome by the way, but when you set up new nodes they all get names so you can do things like:

ping datanode1

However, if you have 3 data nodes there is no way for nodes 2 or 3 to know about data node 1, or each other. If you end up in this state your Hadoop cluster gets into all sorts of a mess: some systems use DNS, some use IPs and it’s impossible to know what’s going on.

The fix for this is easy: you need a working DNS system. This can be achieved either by setting up a fully working DNS server (various cloud providers support this, or you can roll your own on a Linux box) or, if you have a small cluster, by doing it manually. The /etc/hosts file contains a list of IP to name mappings separated by tabs, such as:

127.0.0.1    localhost
198.0.0.1    node1

Now all you need to do is add IP to name mappings for any other servers in your cluster and you’re sorted.

Long Term

Longer term, the suggested fixes above just aren’t feasible. Shutting off the firewall is not smart, and manually setting up DNS is a long process. This advice is just to help you over that first hurdle and get things working. If you plan to invest in a production Hadoop cluster, I suggest going with a tool such as Puppet to set up your servers so they are ready for Cloudera but also secure.

Simple Complexity

As computer engineers we aim to make things simple. Recently I’ve started to notice something which I have been calling simple complexity, which manifests over and over again in new frameworks, approaches and designs. I’m not sure what drives this. It seems to be driven by the desire to produce fewer lines of code or to embrace the latest ideas without thinking, but it manifests like this:

AbstractSingletonProxyFactoryBean

Really? I mean, take a look at that class name. Maybe it’s just me, but in the effort to make things simple, has it not become something hugely complex? Simple complexity at its finest.

Maybe this sums it up. Maybe it’s not the wrong design, the wrong decision, but I just wonder…

(image: fault-tolerance)

Setting up OS X for Clojure Development

Developing in Clojure is an absolute pleasure; however, setting it up is rarely as easy. In this guide I will take you through the steps to get OS X up and running with Clojure and Leiningen (the Clojure build system).

Install MacPorts

MacPorts is a great package manager for OS X, and it has all you need to get up and running. Install MacPorts from here: http://www.macports.org/install.php

Install Clojure

To install Clojure you need to run this as root:

sudo port -R install clojure

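When you run that you will be asked for your password, and after that it will install automatically. The -R is there to keep things tidy, though as far as I can tell MacPorts documents it as upgrading dependents during an upgrade, so it may have no effect on a fresh install.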

Install Leiningen

Leiningen is the build tool for Clojure, like Maven is for Java. In fact it even plays nicely with Maven! Install it as such:

sudo port -R install leiningen

Once it’s installed, there is a second step you need to run:

lein self-install

Test your work

After that you are done. We’re going to test this by creating a new project using Leiningen and proving that the template tests fail, as they are designed to do.

lein new testing
cd testing
lein test

If you see the line:

FAIL in (replace-me) (core.clj:6)

Then you have a working setup. Welcome to the world of Clojure!

Commitment to blogging

Over the past few years I’ve had a blog in one form or another, but I’ve never really committed to it. Yes, I wrote the odd article; yes, I jotted some ideas down; but I never really gave it enough effort to do so consistently.

Well, that’s about to change. It’s time I started working on this more seriously and giving a lot more back.

More to come.

Chris
