Conversations about Software Engineering

Conversations about Software Engineering (CaSE) is an interview podcast for software developers and architects about Software Engineering and related topics. We release a new episode every three weeks.

Transcript

Joy Clark: Hi, everyone. Welcome to the CaSE Podcast. This is Joy Clark, and today I am at EuroClojure and I have a guest, Alex Miller, who is working on Clojure at Cognitect, and is also the author of Clojure Applied. Thank you for taking the time to talk to me.

Alex Miller: You're welcome.

Joy Clark: So we're at the EuroClojure this week, and it's a Clojure conference. Can you tell us a little bit about what Clojure is?

Alex Miller: Sure. It's a functional dynamic language that runs on the Java Virtual Machine, and then there's also a version called ClojureScript that transpiles to JavaScript and runs on JavaScript engines instead.

Joy Clark: What does it roughly look like?

Alex Miller: It is a Lisp dialect, so it looks like things like Lisp or Scheme or Racket. Most people's familiarity with that is that it's a lot of parentheses. Most Clojure developers say they stop seeing those after a while, and you start thinking instead in the structure of your code, because it makes the structure of your code -- it sort of lays that bare. I find that's a very natural way for me to think about it. In Clojure, everything is an expression, and every expression is surrounded by parentheses, so it's a very regular structure.
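To make the "regular structure" concrete, here is a minimal sketch of what a few Clojure forms look like. The function name `square` is hypothetical and chosen only for illustration; it is not something from the interview.

```clojure
;; Every compound form has the same shape: (operator operand ...)
(+ 1 2 3)                       ; a function call, evaluates to 6

;; Conditionals are expressions too, so they return a value.
(if (even? 10) "even" "odd")    ; evaluates to "even"

;; Defining a function is itself just another parenthesized form.
;; `square` is a hypothetical name used only for this sketch.
(defn square [x]
  (* x x))

(square 7)                      ; evaluates to 49
```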

Joy Clark: Do you still see the parentheses, or do you kind of filter them out now?

Alex Miller: I see them, but I'm really thinking in terms of expressions. I think that's what people mean when they say that - you stop thinking in terms of syntax, and you're thinking more in the structure of the code instead.

Joy Clark: How does it differ from more mainstream programming languages?

Alex Miller: In some sense it's similar to other mainstream programming languages; it has all the same kinds of things that you have in other languages, like the ability to do conditionals and looping constructs and things like that, and function invocation - all those things exist in Clojure, as well. It's a dynamic language, so it has some similarities to other dynamic languages like Ruby or Python, or even JavaScript, in that sense.

Alex Miller: It's a functional language, so it has definitely an orientation around functions, and that's similar to other functional languages, things like OCaml or Haskell or Scala or something like that. So it has some similarities to a number of different languages.

Alex Miller: One of the identifying characteristics of Clojure is definitely that from the very beginning it's had a focus on working with immutable values as data, and many people describe that as the key thing that they enjoy in Clojure. It underlies a lot of the other things that you have in Clojure, like the concurrency constructs, state constructs and things like that. They all really rely on this way of dealing with immutable data.
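As a rough illustration of immutable values, the sketch below "changes" a map and shows that the original is untouched. The data and the names `original`, `updated` and `visits` are made up for this example; the atom at the end is one of the state constructs that builds on immutable values.

```clojure
;; Hypothetical example data.
(def original {:name "Ada" :langs ["Clojure"]})

;; assoc and update return NEW maps; `original` is never modified.
(def updated
  (-> original
      (assoc :city "Berlin")
      (update :langs conj "Java")))

original   ; still {:name "Ada", :langs ["Clojure"]}
updated    ; {:name "Ada", :langs ["Clojure" "Java"], :city "Berlin"}

;; State is modeled as a reference that points at successive immutable values.
(def visits (atom 0))
(swap! visits inc)   ; applies inc to the current value and stores the result
@visits              ; 1
```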

Joy Clark: What's the strength of Clojure? What types of problems is Clojure particularly good at solving?

Alex Miller: Clojure was designed by Rich Hickey, and he had done a lot of work in C++ and C# and Java and Common Lisp, and part of the rationale for Clojure was really to be good at the kinds of applications that Java is good for. Java, of course, is used in a very wide variety of scenarios, and I think Clojure can also be used in a very wide set of places. Not all of those are the perfect fit for Clojure... There are definitely places, like embedded code or other more specialized things, where it's less of a fit, but generic business applications and things like that are where I think Clojure really hits a sweet spot. And that's what people are using it for, mostly.

Joy Clark: Mostly web applications? Or what other kinds of applications?

Alex Miller: Most applications are web applications these days, so that's inevitable, and when we do surveys we see 80% of the people are doing web applications. Those tend to be in areas like e-commerce, travel apps, financial apps, and just all sort of typical business/enterprise-type applications. There are a lot of different companies using it: insurance, travel, and all those kinds of domains. There are a lot of big companies that people have heard of that are using Clojure for those kinds of things.

Joy Clark: Can you use it for mobile development as well?

Alex Miller: You can. Probably the leading contender right now for that sort of thing is to use React Native and ClojureScript - that's probably the easiest path right now to cross-platform mobile development and things like that. Because Clojure runs on the JVM, and Android runs Java classes, or rather Java classes compiled for the Android runtime, it is possible to do that. There are various technical reasons why that's not the fastest thing to do, so that hasn't been too well supported in the last few years, so most people that are doing mobile development are using React Native as far as I know.

Joy Clark: Clojure is personally my favorite language, and I get a lot of comments (I think it's interesting) from the Haskell and Scala communities. You would think that they would be like "Oh, another functional programming language... Welcome to the club!", but then they're always like "Where's the types?" and I always get a bit apologetic about it maybe, but I kind of never know what to say to someone who's like "I really want a strongly-typed language!" I can see where they come from - the compiler helps me program something that's better - but I never really know what to say, because I never miss types. Can you maybe elaborate on the benefits of dynamically-typed languages?

Alex Miller: Yes, and I've spent more of my career working with statically-typed languages than working with dynamically-typed languages, so I have a lot of experience in both. I think it's pretty easy to talk about the benefits of statically-typed languages in terms of enforcement and things like that. People don't talk very often about the cost of statically-typed languages in terms of the things that they put in your way in terms of flexibility of growth of a code base over time, and things like that.

Alex Miller: You'll find that most large statically-typed applications have portions of the application that are actually effectively dynamically-typed. You might be shoving that stuff in a map, or something like that to get around that; you'll find that there are often parts of a statically-typed language that end up needing to be a little bit more flexible, and that you expect to change more frequently over time, and things like that. So you can sort of turn those things around and say "Instead, why don't we start from that perspective and have that flexibility all the time, but then allow us to add additional validation or checks or things like that at the places where it's important, to get those benefits of statically-typed languages."

Alex Miller: Clojure has found a path that makes it really easy to rapidly build up applications, build functionality and do those kinds of things, and we're starting to add some additional features in the latest version of Clojure with the clojure.spec that give you additional things that you would want, maybe coming from a statically-typed language or something like that.

Joy Clark: Can you explain a little bit about what spec is?

Alex Miller: Sure. Like I was saying, in Clojure you really have a small number of core data structures that compose together to let you build all of your data together. So you have lists, vectors, maps and sets - those are the primary data structures. In practice, the majority of your data ends up in maps and vectors usually. So you're really using primarily two data structures to model all of your data. The big benefit for that is that you can do generic programming. We have this large library of functions that allow us to generically operate over different collection types. This is all just standard functional programming stuff, being able to map and filter and replace and all these sorts of things. So Clojure has this really well developed library that works on generic data structures.

Alex Miller: [00:10:06.15] The benefit that you get from that is tremendous reuse. You're always using this small number of data structures everywhere, you use it to represent everything: your configuration data, your data coming from your database, your data coming from the user over the wire - all those things end up being just a small number of data structures.
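A small sketch of the generic style described here: plain maps in a vector, processed only with core library functions. The customer data is invented for illustration.

```clojure
;; Hypothetical data: customers are plain maps inside a vector.
(def customers
  [{:name "Ada"   :country "UK" :orders 3}
   {:name "Grace" :country "US" :orders 0}
   {:name "Rich"  :country "US" :orders 7}])

;; Keywords act as functions of maps, so the generic sequence
;; functions need no customer-specific code.
(map :name customers)                     ; ("Ada" "Grace" "Rich")
(filter #(pos? (:orders %)) customers)    ; the maps for Ada and Rich
(reduce + (map :orders customers))        ; 10
(group-by :country customers)             ; map from country to its customers
```

The same four functions would work unchanged on configuration data or database rows, which is the reuse being described.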

Alex Miller: The thing that I always felt like I was promised in object-oriented languages like Java was "You're going to be able to make these objects and then reuse them", but in practice what I always found was that you're making these very bespoke custom objects that I couldn't reuse at all. You rarely got any reuse out of them. I feel like I get a tremendous amount of reuse out of the small number of data structures that Clojure provides and the library provides to work on them.

Alex Miller: The flipside of that though is that when you look at a function that operates on your customer data, your customer data is just represented as a map, whereas in an object-oriented or statically-typed language you might actually have a customer object or a customer type of some kind... So you actually have lost some of the concrete details about what's happening; they are no longer as visible in your code. Instead of being annotations on your functions, they are sort of implicit in the structure of the data. clojure.spec gives us tools to continue programming in the same generic way that we have been, but additionally gives us the ability to annotate that data and say "Oh, this map is actually not just a map, it actually conforms to a customer specification, and it has these attributes, and these attributes have their own nested format." Once we have that, we have the ability to talk precisely about data structures in concrete terms that are useful to the programmer as they're communicating things, and then also be able to do things like validate that data, check if it's invalid, tell us things about the way that it's invalid... You can get explanations of invalid data, and then it also has built into it the ability to generate example data that conforms to the specification.

Alex Miller: You can actually do generative testing, where you generate random customers and you can then inject them into a function and verify that it actually does things you say it does. That actually goes way beyond what most people get out of statically-typed languages. It's really an exciting addition to Clojure to have that new functionality in there.
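A minimal sketch of a "customer specification", assuming Clojure 1.9 or later, where spec lives in the `clojure.spec.alpha` namespace (it was still pre-release at the time of the interview). The attribute names and the deliberately naive email check are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical attribute specs; the email rule is deliberately naive.
(s/def ::name string?)
(s/def ::email (s/and string? #(re-find #"@" %)))

;; "This map is not just a map, it conforms to a customer specification."
(s/def ::customer (s/keys :req-un [::name ::email]))

(s/valid? ::customer {:name "Ada" :email "ada@example.com"})  ; true
(s/valid? ::customer {:name "Ada"})                           ; false, :email is missing
(s/valid? ::customer {:name "Ada" :email "not-an-email"})     ; false

;; Returns a human-readable explanation of why the value is invalid.
(s/explain-str ::customer {:name "Ada"})

;; Generating example data with s/gen or s/exercise additionally needs the
;; org.clojure/test.check library on the classpath, so it is not shown here.
```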

Joy Clark: How easy is it to get started with spec?

Alex Miller: I think it's very easy. There are definitely some things you need to learn in terms of the different spec forms, ways that you can compose specs together. Specs are inherently made of predicates; predicates are just functions that have the ability to take a value and tell you something logically true or false about it. "Is this number odd? Is this string empty?", things like that. Those are examples of predicates. Then we have a family of composites that can combine those things together in terms of logically “and”ing and “or”ing them, or talking about collections of predicates, and things like that.

Alex Miller: You often already have the predicates there, they exist in your program because you need them for the actual program code, and then it's just a matter of learning this set of forms that you can use to combine them together.
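The sketch below shows predicates being combined with "and", "or" and a collection spec, again assuming `clojure.spec.alpha` from Clojure 1.9 or later. The spec names `::small-odd`, `::id` and `::ids` are invented for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Any predicate function can be used directly as a spec.
(s/valid? odd? 3)                                       ; true

;; "and": every predicate must hold.
(s/def ::small-odd (s/and int? odd? #(< % 100)))

;; "or": each alternative carries a tag describing which branch matched.
(s/def ::id (s/or :numeric pos-int? :named keyword?))

;; A collection whose elements must all satisfy another spec.
(s/def ::ids (s/coll-of ::id :kind vector?))

(s/valid? ::small-odd 7)       ; true
(s/valid? ::small-odd 8)       ; false, 8 is not odd
(s/conform ::id :abc)          ; [:named :abc]
(s/valid? ::ids [1 :two 3])    ; true
```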

Joy Clark: In my experience, it's very easy to write a Clojure program, but how easy is it to maintain the code? It's like "You just write it down and then it works!" but then with the dynamicness of the language - maybe spec helps a bit - you might forget what a function does, if it's not well-documented.

Alex Miller: Yes. I think spec is really stepping in there to have the ability to put language and more precise specifications for what data is flowing in and out of your system. That's a big new tool that we have, and it also includes the ability to include instrumentation, so you can instrument a function and then get immediate feedback if you invoke it with an invalid value.

Joy Clark: What does "instrumenting" mean?

Alex Miller: Instrumenting means that we basically take a function that exists and we wrap a little function layer around it that checks the input values to verify that they conform with the spec for that function. During development, you will turn on these things - and they have some costs, because you're basically wrapping every function in another function, but every time you invoke a function, you're getting these extra checks. At development time, I find that to be really a transformative thing; it really gives you much better feedback, much faster than you got before.
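As a hedged sketch of instrumentation, the example below specs the arguments of a made-up function `discount` and turns checking on with `clojure.spec.test.alpha/instrument` (namespace names assume Clojure 1.9 or later).

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; `discount` is a hypothetical function used only for this sketch.
(defn discount [price percent]
  (* price (- 1 (/ percent 100.0))))

;; Describe the arguments: a number and an integer from 0 to 100.
(s/fdef discount
  :args (s/cat :price number? :percent (s/int-in 0 101)))

;; Wrap the function so every call checks its arguments against the spec.
(stest/instrument `discount)

(discount 200 25)    ; 150.0

;; An invalid call now fails immediately with a spec error.
(try
  (discount 200 "25")
  (catch clojure.lang.ExceptionInfo e
    (println "Rejected:" (.getMessage e))))

;; Remove the wrapper again, for example outside of development.
(stest/unstrument `discount)
```

Because the wrapper adds a check to every call, this is typically something to enable during development and testing rather than in production.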

Alex Miller: The other thing that I would say is that the way that most Clojure developers work is with very close integration with a REPL (Read–Eval–Print Loop). A lot of different languages have some sort of an interpreter that you can run, and run commands or expressions in the language and see the results. For any Lisp, that's really not an interpreter, that's really actually the heart of what running a Lisp-like program is. It's really this notion of just reading a form, evaluating it, and then printing the result, or reading the next form and doing more work. It's really what I call "the beating heart of Lisp", and that's just given to you directly and you can work with it.

Alex Miller: Most people work in an editor, but then they're able to send a particular form, a parenthesized expression in their editor, over to the REPL, and evaluate it, and you sort of build up this state. You're sort of working with the REPL back and forth in terms of you're building up a function, then you're evaluating that function, and then you're creating data and inserting that data and invoking the function with the data, and that gives you this sort of very rapid feedback cycle to understand what your program is doing. The question is really "How does that work over time?" and some of it is just doing the typical due diligence of documenting what your code does, and providing enough hooks so that when the next person comes along it's easy for them to pick up the strands and build up a state that puts them in that REPL workflow.

Alex Miller: You do need to take some care there, and there are a lot of best practices around the way that you manage your code. Typically, code is in functions, and the functions are collected in namespaces; namespaces are often organized around data structures, so a lot of times you'll see at the bottom of a namespace a comment block that has example data structures in it, and example code, and things that you can go directly there and start evaluating some things and give yourself some state to work with the functions in that namespace.
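A sketch of the comment-block convention mentioned here. The namespace `example.customer`, the function `full-name` and the sample data are all hypothetical.

```clojure
(ns example.customer)   ; hypothetical namespace

(defn full-name
  "Joins a customer's first and last name."
  [{:keys [first-name last-name]}]
  (str first-name " " last-name))

;; Nothing inside `comment` runs when the file is loaded. The forms are
;; meant to be sent to the REPL one at a time from the editor.
(comment
  (def sample {:first-name "Ada" :last-name "Lovelace"})
  (full-name sample)   ; "Ada Lovelace"
  )
```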

Joy Clark: [00:17:58.27] What's the documentation like?

Alex Miller: It's terse. The actual Clojure API docs themselves are intentionally terse, partly because that documentation gets compiled into the final classes typically, and that means that you can, at a REPL, ask for the documentation for a function, and it can show you that. If it's available, it can also show you the actual source of that function itself. I'd say that in Clojure probably more so than in most other languages I find myself looking at the source of the functions I'm invoking - or the functions in the core library - more frequently, and that's just the nature of Clojure and Lisp... You tend to just do more of that investigation at the REPL, naturally.
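For illustration, this is roughly what asking for documentation and source at the REPL looks like, using the `doc` and `source` macros from `clojure.repl` (a standard REPL usually makes them available already; the `require` is included so the snippet is self-contained).

```clojure
(require '[clojure.repl :refer [doc source]])

;; Prints the argument lists and docstring of a core function.
(doc map)

;; Prints the source of a var, if the source file is on the classpath.
(source when)

;; The docstring is ordinary metadata on the var, so it is also available as data.
(:doc (meta #'map))
```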

Joy Clark: Yes, I actually read Clojure source code. I never read Java source code. So would you say, as a rule of thumb, is Clojure easier to maintain than a language like Java, or less easy to maintain?

Alex Miller: I think you have the same exact problems you have in any other language. The problems that are problems in other languages are problems in Clojure, too. It's this notion of - you're building a large set of abstractions, and concrete implementation details and things like that, and you need to be able to walk up to the system and understand what it's doing. The bulk of my career has been working in Java; when you walk up to a Java system, you're necessarily gonna find a lot of classes, those classes will have JavaDoc, and you'll be able to start looking at those things... And it's the exact same thing with Clojure.

Joy Clark: ...they'll hopefully have JavaDoc.

Alex Miller: Yes, hopefully they'll have JavaDoc, or at least the things that you care about. And the same is true of Clojure - you want to look for docstrings, and things like that. The big difference, from what I see, is that with Clojure you typically are looking at anywhere from 10 to 100 times more concise code, and that typically means that you have to read a much smaller number of lines to actually get the same conceptual unit that you do in Java. That's something that's easy to explain away as not important; in my experience it is important. Being able to find that the implementation of something all exists in one file, and that file is 50 lines long - that gives me a much narrower focus on what to look at, and what I can see at a time on a screen. You can just fit an enormous amount of code, of value onto one screen, so I find that to be really valuable.

Joy Clark: So you have a little program and you can see it, but for a bigger application you probably need more than one file, right?

Alex Miller: Yes, absolutely.

Joy Clark: Do you have any tips, are there architectural patterns that have become standard, or an opinion, or something to help structure your application?

Alex Miller: This is a really common question, especially for new Clojure developers. I generally don't find it that hard to do that sort of stuff myself; it's a thing that I've thought about a lot, trying to just mentally piece through "When do I make the decision of how to split things apart?" and it's really the same kinds of decisions I make in other languages...

Alex Miller: A lot of times in Java you have a class - the way you orient a file is really about a class, and that class is about representing something. Similarly, in Clojure you're going to have a file which is a namespace, and it's typically going to be things that are about one data structure or one small collection of data structures that are important. You might have a collection of functions that are about a customer, or something like that, or you might have a set of functions that provide some functionality, some particular handler in a web app. It's really one of those things that as you work on enough Clojure code you sort of get to that point where you get that uneasy feeling like "There's going to be too much stuff in this file. I need to impose some more structure on it."

Alex Miller: I've done some analyses on larger codebases to look at that, and generally it seems like that limit (at least for me personally) is somewhere in the 200-500 lines of code range. That's where I start feeling uneasy, and that's actually pretty small compared to other languages.

Joy Clark: You mean as the whole program, or just one file?

Alex Miller: In a file. I find that when stuff gets to be over about 500 lines of Clojure, then typically there's some logical divisions in there that will then lead me to break that out into multiple namespaces or things like that.

Joy Clark: So do you usually put everything in one namespace and then later extract it into other namespaces?

Alex Miller: Mostly. There are definitely times when I start working on something where I know that there's independent things that I want to make that I know are going to evolve at different rates or they're about different things, and I'll just naturally start by putting those in different files. But if I'm not sure, I'm really just sort of starting and working through it, I'll just put everything in one file and at some point the structure will start to emerge and I'll find that in the file itself I've segmented and written a comment line that says "This is now stuff about this." Those eventually become really distinct units, and at some point I just take that chunk of code and move it into another namespace and expand out from there.

Joy Clark: What's the largest codebase you've worked on in Clojure?

Alex Miller: Well, Clojure itself; Clojure core itself is one gigantic file...

Joy Clark: So it's written in Clojure...

Alex Miller: It's a mixture of Java and Clojure. A lot of the data structures and some of the concurrency primitives and things like that are written in Java, and the compiler, of course; the compiler is one giant file. But the Clojure core library itself is actually one namespace split across a number of files. The main core one is pretty large; it's unwieldy. It's actually painfully large, but there are reasons why it is the way it is.

Joy Clark: I've never thought about how big that file must be...

Alex Miller: It's thousands of lines, it's a lot of stuff to look through. It's definitely one of those things where you can load it in a typical Clojure editor to test how well it handles large files.

Joy Clark: So those are all of the normal functions that you would wanna use - map, reduce...

Alex Miller: Right. Most of the core library is in one namespace.

Joy Clark: How many functions are there in that core library, roughly?

Alex Miller: There are about 700-800, something like that. Most of those are two lines; most of them are tiny functions. It is actually split across multiple files; there are ways to split a namespace across multiple files, so there are chunks of it for things like the parts that define protocols and data types - that's pushed off in another file. The pretty printing parts - some of those are pushed off in separate files, as well.

Joy Clark: How popular is Clojure?

Alex Miller: It's a little hard to get hard numbers on that thing, but I can pretty safely say that there are tens of thousands of Clojure developers out there in the world, and it's being used at hundreds or thousands of companies, many of which are Fortune 500-type companies. We at Cognitect work with people at some of these big companies. Some of them are still a little circumspect about saying publicly that they use Clojure, so there are definitely more companies using it than are talking about it.

Joy Clark: Is it something to be ashamed of?

Alex Miller: Well, some people use it as sort of a strategic advantage, and they don't want to talk about it too much. Some companies are just naturally more stealthy about what they're doing internally and they don't want to be known for their tech as much as they want to be known for their product, so they just don't talk about it quite as much. Sometimes we're able to convince them to go do a talk at a conference, and sometimes we're not. But it's being used in all sorts of big companies all over the place.

Joy Clark: What's tooling like?

Alex Miller: There's a lot of tooling. It used to be, back when I started, that tooling was like one of the number one complaints. People still complain about it because people always want better tools, right? But there are fantastic tools available. As far as editors go, probably the two leading ones that most people use are Emacs, which is the programmers' editor of old, and that's still probably the most common one that you'll see in the Clojure community, and IntelliJ - there's an IntelliJ plugin called Cursive (a commercial plugin) developed by Colin Fleming and that's also really good. I actually use both Emacs and Cursive for different situations.

Alex Miller: Then there are Eclipse plugins if you want more of an IDE experience. There's Vim plugins if you're a Vim user, there's Sublime Text, and TextMate, and other more Clojure-focused things like Nightcode and Light Table... There are lots and lots of editors. Atom actually has really good support these days for more web-based editing workflow.

Alex Miller: When you're in Clojure you can use JVM tools as well. People use tools like YourKit to do performance and memory debugging. There are now good debuggers available in both Cursive and Emacs that you can use.

Alex Miller: Then there are some more interesting tools that people have been starting to build out lately that really leverage a lot of Clojure's unique strengths, things like ProtoREPL and Sayid. There's new things coming out all the time, it's hard to keep on top of all of it.

Joy Clark: Is there support in the editors for spec? Because I would think that finally -- you know, before it was all dynamic, we didn't know what was coming in, but now maybe we could...

Alex Miller: There's a new version of Cider, which is the Emacs tooling for Clojure that came out yesterday, which includes a vastly expanded support for spec and a spec visualizer of some kind. I have not had a chance to actually even see it yet, but that just came out. I know that Colin is also working on some spec-related features for Cursive as well. That stuff is not actually out yet, spec is not actually part of the stable Clojure release yet, so I think tool authors are waiting for a little bit more finality on some of that.

Joy Clark: If someone's just getting started with Clojure, what editor tooling do you recommend usually?

Alex Miller: I would say my first question is "Do you already have a programming editor that you like?" If you do, you should probably use that one, so that you're focusing on learning Clojure and not learning another tool to use Clojure with. I would not say "Go download Emacs and learn Emacs to learn Clojure." I don't think that's a good idea. And you don't need Emacs to learn Clojure, for sure.

Alex Miller: When I first started using Clojure I used TextMate, and I didn't use any tooling of any kind, I used just a raw text editor and a REPL for a few months, until I felt comfortable that I understood what was happening. Then I dove into Emacs and did more with Emacs at that point.

Joy Clark: You mentioned Cider - what is Cider actually?

Alex Miller: Cider is a set of tooling that runs inside of Emacs, and it provides integration with a REPL and debugging tools, code completion and all those sorts of things... So all the things you want in a modern development environment. Cursive also provides most of those same things. They provide them in a very different way, it's a GUI way versus Emacs, so different people tend to gravitate towards one or the other. I tend to use Cursive most of the time, that fits my workflow. It's not quite as sort of text command-friendly. You tend to do a little bit more pointing, clicking with the mouse and things like that. That doesn't bother me; I find that allows me to structure things a little bit better. In Clojure I find it's not really about the typing as much, it's really about the thinking, so hopefully Clojure and your tools get out of your way as much as possible, and just let you focus on solving your problems.

Joy Clark: How long does it take for a new person to get familiar with Clojure?

Alex Miller: It really depends a lot on the background. I did some Scheme in college, and when I picked up Clojure I was productive within a few days... But for whatever reason, Clojure has a very natural match to my brain and the way that I think about things, so for me it was just like "Oh yeah, this makes total sense, and I totally understand how to do things." It was just a very natural match.

Alex Miller: In general, you can learn the basics within a couple weeks. Clojure is nicely made up of a bunch of independent things, and you can choose which parts of it to use. You can defer learning about a lot of different topics for a long time. Macros are something that Lisp is typically known for. Macros are the ability of thinking about your source code itself as a data structure, and then applying functions to take input data structures - which happen to be code - transform them, and produce output data structures which also happen to be code. So macros are just functions that take code and return code, that happen to run a little earlier in the process.
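To illustrate "functions that take code and return code", here is a minimal sketch of a macro. The macro `unless` is hypothetical and written only for this example (Clojure's built-in `when-not` already covers this use).

```clojure
;; `unless` is a hypothetical macro: it receives the test and body as
;; unevaluated code (data) and returns a new piece of code, an `if` form.
(defmacro unless [test & body]
  `(if ~test nil (do ~@body)))

;; macroexpand-1 shows the code the macro produces, without running it.
(macroexpand-1 '(unless false (println "ran")))
;; => (if false nil (do (println "ran")))

;; Used like any other form once defined.
(unless (> 1 2) :ran)   ; :ran
(unless (< 1 2) :ran)   ; nil
```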

Alex Miller: Macros are often thought about as a tricky subject, and certainly there are a bunch of different tricky things that come up in macros, and they can get very, very complicated if they're macros that generate macros, and all sorts of crazy stuff. But when I learned Clojure, I actually did not write a macro probably for the first year that I wrote Clojure. You don't need macros, you can do almost everything you want to do without macros, or using existing things that happen to be written as macros. That's a good example of a topic that you can defer learning about for a really long time. You can also defer learning about the software transactional memory parts of it, or different concurrency aspects of it, or spec. All those things are sort of more a-la-carte, where you just pick which ones you wanna use.

Alex Miller: The core things you really need to understand are "How do I define vars?" which really hold functions, and things like that; "How do I define a function? How do I invoke a function? How do I use the core Clojure data structures?" Those sorts of things - I've taught a bunch of classes in Clojure, and we cover all of those things in the first day. It's pretty easy to teach the basics of Clojure within a few days and be able to write code that does things.

Joy Clark: What would be your advice if someone wants to get started?

Alex Miller: If you can find somebody to learn from, I think it's great to have a resource, to ask somebody about. There are a lot of good online communities to look at; there's an excellent Clojure Slack community, which is very helpful. There's a beginners room on there and I answer questions there all the time, a lot of other people do too, and it's a very friendly, open place to ask. There's also a good IRC channel, and people are on there all the time, and mailing lists, and Reddit, and all those things.

Alex Miller: As far as books go, there are a lot of different Clojure books. Some of the more recent ones that are really good intro texts are "Clojure for the Brave and True", which is actually available as a free online book and also as a printed version from No Starch. That one's by Daniel Higginbotham. A lot of people like Carin Meier's "Living Clojure", which is an O'Reilly book.

Alex Miller: I'm actually working on the third edition of Programming Clojure, which was the very first Clojure book, and actually the one that I learned Clojure from; it was originally written by Stu Halloway, and then the second edition was written by Aaron Bedra, and I'm working on the third edition... That's out in beta right now, from Pragmatic.

Joy Clark: And you wrote a book.

Alex Miller: I also have a book that I co-authored with Ben Vandgrift called "Clojure Applied." That's really intended to be a second book. It doesn't cover syntax at all, it assumes that you've read one of these other books and that you've spent some time working with Clojure, and it answers a lot of questions about things like "How do I structure my namespaces? How do I use state? If I have a concurrency problem, how do I approach that and what are the different tools available, and which one should I choose?", things like that. It's meant to act as sort of a really good journeyman book, sort of "Okay, I think I understand how the language works, kind of, but how do I actually apply it to problems?" That's what it's trying to fill in the gaps on.

Joy Clark: Great. Well, thank you so much for taking the time to answer all my questions.

Alex Miller: No problem, happy to talk to you. Always happy to talk about Clojure.

Joy Clark: Me too.