I’m a Data Engineer at Metail, where I’ve worked for 6 years. For the last 4 of those I’ve been part of the team that first built, and now maintains, Metail’s data analytics pipeline, keeping it up to date and able to meet our changing demands. This has meant deciding where to keep up with a rapidly changing field and where to enjoy some stability. I came to Metail after graduating with a PhD in high energy physics on the LHCb experiment at CERN. There I spent too much time working on the control system and monitoring software, but I still managed to code up and version control my analysis. I haven’t really seen a hill since leaving Geneva, and I’m hoping to have some time to attempt a run up from the river to the Clifton suspension bridge.

Can you tell us about your background in software?

I started programming in C++ at 20, when I spent a year working at the Rutherford Appleton Laboratory in Oxfordshire as part of my undergrad degree. There I learned how to set control registers and think in hex, as well as working at the much grander scale of object-oriented programming and large software architectures. During my PhD I put my C++ skills to good use and picked up a few more languages along the way. These included Python, where I learned the joys of the REPL and dynamic typing. Since joining Metail I have dived into more languages still, and in the last few years I've started scratching the surface of functional programming through Clojure and Metail’s data processing and analytics stack.

During your PhD you spent 18 months at CERN in Geneva. Were there any eye-popping moments you can share?

I went down to see the four main detectors at their different points on the 27km accelerator. Their scale varies from really big to really very big, and the number of technologies, both bespoke and off the shelf, that have been connected together is mind-blowing. Another was foolishly climbing the Fort l’Ecluse via ferrata, which starts 430m above the valley floor; it’s largely vertical (or at least felt like it) and I don’t really like heights!

Tell us about the inspiration for your talk.

Towards the end of last year and the start of this year we put some effort into modernising our batch pipeline. I saw a lot of analogies between Spark’s and Clojure’s ways of expressing problems. For example, the map/reduce model of Hadoop lends itself naturally to functional programming, and Spark’s execution model is lazy, much like Clojure’s sequences. Plus Spark’s extensive library set has given us a natural path to streaming as well as tools for data science.
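To illustrate the analogy, here is a minimal sketch in plain Clojure (no Spark dependency, and not code from the talk): the `map`/`mapcat` transformations build lazy sequences, and nothing is computed until `reduce` forces them, which mirrors how Spark's RDD transformations stay lazy until an action triggers evaluation.

```clojure
(require '[clojure.string :as str])

(def lines ["the quick fox" "the lazy dog"])

;; Transformations: build lazy sequences, nothing is evaluated yet.
;; In Spark these would be RDD transformations like flatMap and map.
(def words (mapcat #(str/split % #"\s+") lines))
(def pairs (map (fn [w] [w 1]) words))

;; The "action": reduce forces the lazy sequence and aggregates the
;; counts, analogous to Spark's reduceByKey followed by collect.
(def counts
  (reduce (fn [acc [w n]] (update acc w (fnil + 0) n))
          {}
          pairs))

;; counts => {"the" 2, "quick" 1, "fox" 1, "lazy" 1, "dog" 1}
```

The same word-count shape translates almost line for line onto Spark's API, which is part of what makes Clojure a comfortable front end for it.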

Who should attend and what do you expect them to take away from it?

Those who would like to see how you can combine Clojure and Spark to do data processing. They will hopefully be inspired to give it a try!

What’s the best bit of professional advice you’ve ever been given?

Be thoughtful about what you add to your identity and how you allow it to change. People get very defensive when they feel their identity is being questioned. I don’t consider Clojure part of my identity, but I do try to express the traits of a good Clojure programmer. When I need to give in and learn Scala, I can do so without it challenging my identity as a Clojure programmer.


Twitter: @gareth625
