Conversations about Software Engineering

Conversations about Software Engineering (CaSE) is an interview podcast for software developers and architects about Software Engineering and related topics. We release a new episode every three weeks.

Transcript

Stefan Tilkov: Welcome, listeners, to a new episode of the CaSE Podcast, another conversation about software engineering. This is Stefan Tilkov, and my guest today is Felienne Hermans. Hi, Felienne.

Felienne Hermans: Hi, Stefan.

Stefan Tilkov: Welcome to the show! Why don't you just start us off by talking a bit about yourself and introducing yourself to our listeners?

Felienne Hermans: My name is Felienne Hermans, I am Associate Professor at Leiden University in the Netherlands, where I head the PERL research group - Programming Education Research Lab.

Stefan Tilkov: Awesome. That is going to be our topic today, and maybe we should clarify for our listeners first whether we're talking about training before the job, or training and education on the job... So what is it that we should be talking about, in your opinion, first?

Felienne Hermans: My research expertise is mainly in how to teach children, high school level children and university-age students how to program... So I don't really know that much about learning on the job. However, I do know that within lots of companies learning and teaching is going on. Senior developers that mentor junior developers, or people that teach themselves by looking at Google, or reading API documentation... So there's definitely value also in the learnings from research into teaching and learning; however, my expertise is mainly classroom teaching.

Stefan Tilkov: We'll start with that, and maybe we'll loop back to the on-the-job thing at the end of the episode. For an older person like me - because it's been literally quite a few decades since I last had some sort of university education - what has changed in the past years?

Felienne Hermans: To be honest, I don't think that much has changed. I know not a lot has changed since I was an undergrad, and that's like 20 years ago. And of course, that isn't necessarily a problem. Often, I hear people say "Oh, we should disrupt education because people are teaching the same way as 20 years ago." I don't think that nothing having changed is necessarily an issue. But if your question is "Has a lot changed?", I don't think a lot has changed.

Stefan Tilkov: What does a typical programming language education look like?

Felienne Hermans: There's really a focus on algorithms and algorithmic thinking. A typical course, how I received it - and probably how you received it, and I think how many university students still receive the course today - is "Well, here's some syntax elements. This is Java (for example), or Python. Here's a variable, here's a loop, and this week's assignment is to reverse a linked list", for example. That would be a typical thing to do. There's not a lot of explanation about how you go about such a problem, what are the strategies that you can use; how do you use a mind map, or a diagram, how you will approach that problem. So there's not a lot of strategies, and there's also not a lot of focus on practicing syntax. It's basically just "Hey, this is a loop in Python. Here's a semicolon, and there go some spaces here. This is what you do, and now you know it because I've explained it to you." That's more or less the basics, how I see it.

Stefan Tilkov: That sounds pretty straightforward, and as you said, exactly the same thing. What is there to research?

Felienne Hermans: Ha! That's a great question, and I could go on about that for a long time.

Stefan Tilkov: Oh, please do.

Felienne Hermans: I think if you compare this form of teaching to other things people learn in life - and an example I really like to use is if you're training for a marathon, which I happen to be doing at this time - the worst thing you can do if you practice for a marathon is do a marathon every day. That will totally demotivate you, and it probably will destroy your body. So what do you do if you're training for a marathon? You do different things. You do weight-lifting, maybe you are also taking care of your food intake, and then you do slow long-distance runs, and short, very speedy runs. So there's all these other things that you do, but still in the end you can run a marathon, probably without you even having done one; maybe the most you've done is like 30k.

Felienne Hermans: That makes total sense, and it makes the same sense if you're learning to play violin. You don't say "Hey, here's Bach. Play this." You do lots of scales and you do deliberate practice with your fingers... Those are the things you do. And by practicing those, ultimately you can make a bigger song. However, in programming we very much think "Well, if children need to be able to reverse a linked list, the best thing we can do is tell them to reverse a linked list. Then they practice it once, and then they know how to do it." If you compare these two things to each other, you see how little deliberate practice there is in programming education. Why don't we say, like we say with the violin, you play the note A one hundred times, or maybe a thousand times? Why are we satisfied in programming with "Oh, here's a loop. Now you get it." Why don't we have children or students write a loop one hundred times, without it being necessarily a part of an exercise where you have to print the Fibonacci numbers, or the prime numbers, or do a linked list example?

Felienne Hermans: We don't do enough deliberate practice, whereas we know from all sorts of other fields that deliberate practice really works. So that's the interesting thing to think about in programming education. Firstly, I want to understand why do people teach like this, and then can we change that, and how should we change it, and what are the things that deliberate practice should consist of for programming?

Stefan Tilkov: What are some of the skills that a programmer needs to have, in your opinion? Because those skills are obviously what need to be practiced towards, right?

Felienne Hermans: Yes, that's a very good question, and that question can be answered at a lot of different levels. First, one of the skills that programmers need is to absolutely memorize the syntax of their programming language by heart. And I know this is not typically a popular position, because people will be like "Yeah, but you can just google that." But the short-term memory of people is very limited. Maybe you know this research by Miller - it's from the '50s, from the previous century... He said that human memory is limited to between five and nine things. So you can only have between five and nine things in your head at the same time. And actually, newer research suggests that it might be between two and six things in your brain. So our memory is very limited.

Felienne Hermans: So if you're a learner, if you're a novice programmer and your brain is still filled with "Oh, for i in range, was it a colon or was it a semicolon? Or was it the brackets?", then there's no room to think about strategies like using recursion or using a tree. So I think rote memorization of syntax is the basic skill you need. High-schoolers I teach, for example - I really tell them "If I wake you up in the middle of the night, I shake you awake and I'm like "Good morning! What is a for loop in Python?", you should totally be able to say 'for i in range(4):' You need to memorize that." That's the basic level.
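
As a minimal sketch of the loop being quoted here, assuming Python 3: the function name `count_up` is made up for illustration and is not from the conversation. It simply collects the values that `for i in range(4):` visits, with the pieces of syntax a novice has to memorize pointed out in comments.

```python
def count_up(n):
    """Return the values that `for i in range(n)` visits, in order."""
    visited = []
    for i in range(n):      # keyword `for`, variable `i`, keyword `in`, call `range(n)`, then a colon
        visited.append(i)   # the loop body is indented under the header
    return visited


print(count_up(4))  # prints [0, 1, 2, 3]
```

Note that `range(4)` starts at 0 and stops before 4, which is one more detail that has to be memorized along with the colon.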

Felienne Hermans: At the next level, once you've mastered the syntax, you need to have a number of strategies. Lots of professional programmers know these strategies, but that's not because we've practiced them, and we have names for them, and we really use them a lot; it's just because we've taught ourselves or we've seen other people do it. For example, strategies are -- divide and conquer is a typical strategy that is actually talked about a little bit... But that's still very abstract.

Felienne Hermans: Also, strategies like in a piece of source code you have in front of you, draw the execution trace. If you have a for loop, you take your pen and you put an arrow back to the beginning of the loop. This is typically something we do with novices once they have acquired the basic syntax level, to help them understand how is the program executed.
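
The pen-and-paper trace described above can be approximated in code. The following is only a sketch, assuming Python 3; `trace_sum` is a hypothetical helper that records one row per pass through the loop, which is roughly what each arrow back to the top of the loop stands for on paper.

```python
def trace_sum(limit):
    """Sum 0..limit-1 and record the state after every pass through the loop."""
    trace = []
    total = 0
    for i in range(limit):
        total += i
        trace.append((i, total))  # one row per iteration: (loop variable, running total)
    return total, trace


total, rows = trace_sum(4)
for i, running in rows:
    print(f"i={i} total={running}")
# prints:
# i=0 total=0
# i=1 total=1
# i=2 total=3
# i=3 total=6
```

A learner could be asked to fill in such a table by hand first and then compare it with the printed rows.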

Felienne Hermans: So that's the next level - understanding how the program is executed, and then you get these bigger-level strategies of learning how a data structure works, learning how a database works... And I think actually at that level programming education is doing quite well. So once you've broken through the syntax barrier, after that, the way we teach algorithms - I think that's what we know about quite well. And the fact that we don't have that much syntax practice and tracing practice - tracing is executing code in your brain - that might be one of the contributing factors to programming having a ridiculously high dropout rate.

Felienne Hermans: I actually checked the Dutch Bureau of Statistics numbers yesterday, and computer science has a higher attrition rate (so more people drop out) than programs that are typically seen as very hard, like physics. So why is that? Probably because some of the students just never reach the level where that algorithmic explanation makes sense to them, because they're still struggling with "How does the syntax work? How will the compiler interpret this program?"

Stefan Tilkov: So is your theory then that we're actually hurting our industry by not doing a great job there and losing a lot of people who might have become awesome programmers if we just spent more time with them?

Felienne Hermans: Yes.

Stefan Tilkov: Okay, very interesting.

Felienne Hermans: And especially, the children or students that we're losing are (let's say) the non-traditional computer science students. Because if you come into a programming class and you only see a loop once, and then you immediately have to apply it, if it happens to be the case that you've done some programming in high school, or at home with your parents, or with friends, it's way more likely that you can actually follow the course, that you're able to do it, because you already have some prior knowledge. And we know about learning - if you already have some prior knowledge, it's easy to connect new knowledge to knowledge that you already have.

Felienne Hermans: So traditional people that learn programming might be boys. We just did a survey, actually, among 100 code clubs, so out-of-school programs for children, and we've found that an overwhelming majority of the code clubs have a majority of male participants. So the traditional people that come in with some knowledge of programming might be boys, and the same of course is true for children from lower socio-economic families - they might have less access to computers, they might have less time to play with computers... So I definitely think it's hurting the number of people we get into programming, and it also is another contributing factor why many programmers are overwhelmingly middle-class white male, because those are typically the students that might have some education before university, and then it's easier if the education isn't very crisp and clear, and isn't very tailored towards a total novice learner.

Stefan Tilkov: That's a fascinating thought. I'm sort of aware of a lot of privileges, but I never really thought that knowing the syntax earlier than some of my peers might have been a strong factor. Everybody likes to think of themselves that "Well, this is just some innate ability that I have. I'm just naturally smarter at grasping how a data structure works than others", whereas the reason might be that I just wasted more time writing stupid "10 PRINT "Hello"; GOTO 10" BASIC programs when I was ten years old... That's a fascinating thought.

Felienne Hermans: Yes, it's very likely that that helped you a little bit.

Stefan Tilkov: It makes sense. Do you think that it's just a matter of spending enough time, and then everybody can learn the more advanced concepts, or is there a natural inclination to either get it or not get it, given that you've just spent enough time?

Felienne Hermans: Yes, those might not be necessarily mutually exclusive things. It could be the case that everyone can learn it, and still, some people might have naturally a little bit better ability to learn. Compare this to learning a natural language. We all want all children to learn how to read and write. You need that as like an entry pass to society. You need to be able to read and write. We don't accept that some people cannot do it. Of course, there are some people, and if you really have mental disabilities... But otherwise, we say every child, even if they later go on to be a car mechanic, or something that doesn't necessarily require that much reading and writing - everyone, to be able to participate in society, needs to be able to read one language. But still, some children go to kindergarten, and they can already read, and some children take until maybe they're 10 or 11 to really reach a level of proficiency.

Felienne Hermans: So I do think everyone can learn a natural language, and in the same sense, everyone can learn the basics of a computer programming language, but still, there will be lots of differences in how quickly people pick it up due to, for example, innate ability, but also, of course, as we've talked about before, because of prior exposure, because that will make learning a lot easier.

Stefan Tilkov: What about some of the things that I would consider sort of litmus tests for programming ability, like recursion, for example? Is that a fundamentally hard concept to teach, or is it just the order in which it is taught? Why do I have this feeling?

Felienne Hermans: I love this tweet that was going viral a few days ago...

Stefan Tilkov: I remember it, yes...

Felienne Hermans: Grady Booch was one of the perpetrators of this tweet. If you get a Ph.D. in computer science, they take you into a room and then they will tell you that recursion is not really that important, it's just a way to make programming education hard. I think I should just print that tweet out and put it on my door, because I think it's very true.

Felienne Hermans: I've never been a professional programmer, so correct me if I'm wrong, but I do think for many websites and current applications and difficult software that people will build, you don't really need recursion that bad if you're not building a parser or a compiler.

Felienne Hermans: It is one of the things that gets -- in terms of how often you use it, it gets ridiculous amounts of exposure in a computer science undergrad degree, I think. What do you think? You're a professional programmer; do you think this is a valid assessment?

Stefan Tilkov: Well, I don't know. I've used recursion in actual real-world production code a number of times, and depending on the programming language, a lot of times, because different programming languages have different concepts of doing that. If you're programming in a functional language, then you're much more likely to use recursion than if you're doing that in an imperative one.

Stefan Tilkov: I do admit that part of it might be that -- it may be some sort of arrogance, but it's one of those things that signal that you understood things at a certain level, so that's I think why it's often used in stupid programming interview questions, and things like that.

Stefan Tilkov: It's one of the things where you're very happy if you got it, and where you're sort of proud of yourself that you've understood it, and maybe it's being used like that because you wanna identify others who, just like you, like this particular thing. I think it's a very elegant concept, and it's a useful thing, and it's interesting. It's a positive thing to understand how recursion can be replaced by iteration, or the other way around. It's not something that you probably do every day unless you're writing a programming language interpreter, or you're building low-level data structures, or you're into functional programming... Which many people are not. I completely can see that.
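
To illustrate the remark that recursion can be replaced by iteration, here is a small sketch, assuming Python 3. The task (summing the numbers in a nested list) and both function names are chosen for illustration only and were not discussed in the episode. The iterative version keeps its own explicit stack in place of the call stack.

```python
def total_recursive(items):
    """Sum all numbers in an arbitrarily nested list using recursion."""
    result = 0
    for item in items:
        if isinstance(item, list):
            result += total_recursive(item)  # descend into the sub-list
        else:
            result += item
    return result


def total_iterative(items):
    """Compute the same sum with an explicit stack instead of the call stack."""
    result = 0
    stack = [items]
    while stack:
        current = stack.pop()
        for item in current:
            if isinstance(item, list):
                stack.append(item)  # remember the sub-list for later
            else:
                result += item
    return result


nested = [1, [2, 3], [4, [5, 6]]]
print(total_recursive(nested), total_iterative(nested))  # prints 21 21
```

Both versions visit the same elements; they differ only in who keeps track of the unfinished sub-lists.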

Felienne Hermans: And it's of course interesting because there's also this feedback loop, because we teach people recursion and we emphasize how important it is, therefore people know it and therefore people will use it, and they will also use it in a way that you said, a little bit like virtue-signaling, like "Look at me being smart. I'm using recursion." So of course, the emphasis and the value we put on education will also pour over into how people use concepts.

Felienne Hermans: But I don't know if it's a litmus test. I would want to run the study where we test students early in their career how well they understand recursion, and then we bring them back into the lab 20 years later and we see what they've so far created and how well they're doing.

Stefan Tilkov: That would be really interesting, yes.

Felienne Hermans: There are also other things that get less emphasis in computer science programs, for example debugging skills, how to properly use a debugger, how to choose great variable names, how to write documentation - those are things that are under-appreciated in undergrad programs, that could also be great predictors of programming success. How well you are able to understand the role of a certain variable, and how well you're then able to pick an unambiguous name for a class or a method. I would say that's maybe as good a predictor. I would want to run a study where we compare that type of abstraction skill to using recursion, and then see which is the better predictor.

Stefan Tilkov: That reminds me of the other tweet that made the rounds, which said that "Of course you need a Ph.D. to be a data scientist", he said as he manually renamed 70 CSV files. I like that a lot as well. So yes, I completely agree that there is a ton of skills that are very important, that are on a different level, and the examples that you gave are extremely important in real-life programming... I still don't think I'm prepared to let go of the idea that it's quite unlikely that you're a really good programmer unless you are able to understand the concept of recursion. But you're right, maybe this is a completely conscious or unconscious bias on my side, because I believe I understand the concept...

Felienne Hermans: I totally feel you. For me also it would feel really weird, for example, to give students a computer science degree without recursion. If I think about that, I'm like "No! That cannot be right!" But then if I consciously think about it, "What are all the things that students could build without knowing that concept?" and there are really many, many things that have use and value in the world, all sorts of web apps and databases where you don't really need recursion. And then if I compare that, again, to debugging skills, and identifier naming skills, I'm like "Okay, but what program can you build if you're very sloppy with variable names? Everything you touch will turn to lead, and not gold." So I feel you. Really, I feel you. It would be weird. However, I think it gets just too much attention. I'm not saying drop it, but...

Stefan Tilkov: I think I can probably agree with that, yes. So you were talking about these different levels, like the memorized syntax thing and the strategies part. Is there another level of things that you need to add on top of that, or is it the one we just talked about?

Felienne Hermans: I was actually making notes on my notepad, like "Okay, what levels was I talking about?" I think it would totally be first syntax and then tracing, so being able to predict how a program will execute, and then the next level would be strategies. And maybe a higher level even would be tactics, so the architecture level - what type of data structures do we need, what type of classes do we need, how does this system interact with the world? That's maybe even a higher level than just "How am I gonna approach this problem?" So architecture design would be the highest level.

Stefan Tilkov: What are some of the exercises that you would do, comparable to running at full speed, and weight-lifting, and whatever it is? What are some of the things that you can do to flex your muscles in these different levels?

Felienne Hermans: That's a great question, and actually we should turn our attention to learning from running, or learning from music... But what we're actually trying to do here in Leiden is learn from how people learned their first natural language. One of the things that people do when they learn their first natural language - you know this if you know or have a child that's five or six years old - they read letters, and then words, and then sentences aloud. It's a very natural thing to do; if you're a novice reader, you cannot read without making sounds. And if you're five and you're learning to read, you have the word "cats", you do not read "cats" at once. It's "c, a, t", or if you're a kid, it will be "C-ats. Cats." And you do that for a very long time. Only if you're in grade six or grade seven you can comfortably read natural language in silence.

Felienne Hermans: One of the things we've been doing with kids - and it has been a very exciting experience - is to have them read Python code aloud as a means of syntax automatization. The same way we want kids to automatize letters and phonemes and words, we want children to automatize Python code, so we just have them read source code aloud... Really, like "for i in range(4): print(i)" And it's not just language, actually, that does this, because now people will think "Well, but programming language is not like a real language." That's true, but also, of course, in mathematics education, reading aloud is used commonly as a way to improve memory. Think of when you were learning the tables of multiplication. How often did you say "One times five is five. Two times five is ten"? Hundreds of times before you've really automatized it, before you can just say "Oh, eight times five is forty." And even now that I'm saying "Eight times five is forty" in English, it definitely takes me a longer time to think of it, because I've memorized those rules in Dutch.
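
As a rough sketch of what reading a snippet aloud token by token could look like, assuming Python 3 and its standard `tokenize` module: the `SPOKEN` table and the `vocalize` function are hypothetical and are not the materials used in the study. They only show that every symbol gets a spoken word.

```python
import io
import tokenize

# Hypothetical spoken names for a few symbols; a real classroom would agree on its own.
SPOKEN = {"(": "open paren", ")": "close paren", ":": "colon", ",": "comma"}


def vocalize(source):
    """Return the words a novice might say when reading `source` token by token."""
    words = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NAME, tokenize.NUMBER):
            words.append(tok.string)  # keywords, variable names and numbers are read as written
        elif tok.type == tokenize.OP:
            words.append(SPOKEN.get(tok.string, tok.string))  # symbols get a spoken name
    return " ".join(words)


print(vocalize("for i in range(4): print(i)"))
# prints: for i in range open paren 4 close paren colon print open paren i close paren
```

Which words to use for the symbols is a teaching decision; the table above is just one possible choice.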

Felienne Hermans: So it is not just that I'm calculating it, I'm relying on my memory, even though, of course, I can calculate eight times five, but I don't; I'm relying on memory, and then I'm relying on my Dutch memory, because that's where I've trained it, and I quickly translate.

Felienne Hermans: So we would say it makes all the sense in the world to have children memorize Python code like language and math, so that once they need it, it doesn't fill up their entire working memory. Remember, the number of items working memory can hold is very low - a memorized construct fills up just one slot, because they're going to quickly retrieve it from their long-term memory.

Felienne Hermans: So that's one of the things we're doing, and we're quite excited about this idea, admittedly because it makes everyone angry. And if everyone is angry, you just know an idea is a great idea.

Stefan Tilkov: Okay. But as you mentioned it, does this create a problem for non-native English speakers?

Felienne Hermans: Yes, absolutely. We were not prepared for that. The first study we ran about this - vocalization is how it's called; Python vocalization - was with Dutch kids, because I live in the Netherlands, and the children that I have easy access to in high schools that I work with are Dutch kids. And a very interesting thing - we didn't think of this; what we thought was we will have children read aloud, and we can see how consistent they are, and probably the children that are most consistent are the best kids, because they've automated the syntax best; that actually turned out to be true, so that was good. That hypothesis was true.

Felienne Hermans: But there was another hypothesis we didn't have before, because in the Netherlands - and I think it's the same in Germany, in German - if you have the letter "i", the "i" in English, we don't call it letter "i" [eye], we call it letter "i" [e, as in keep]. That's how we say it. So we had Dutch children that read the snippet that contained the variable "i" like this "for i in range(): print(e)". So they would vocalize it within range -- in close proximity to range, they would say i, because if you see "range", that's an American word, so you get to the American or English reading line... But "print" is also a Dutch word. So what probably happens in their brains is they see "print", they say it in a Dutch way, and then the next letter is the letter "e", how we say it in Dutch... Because these were 12-year-olds, and for 6, 7, 8 years they've already practiced that letter, that in English is called "i", we call it "e".

Felienne Hermans: Of course, if you're at that level, if you're not consistently pronouncing that letter, are you really fully aware into the smallest veins of your being that that is the same variable if you don't pronounce it consistently? It's very likely that you don't really have automated the belief, the understanding that those are the same variables. So in these we've found very interesting natural language effects that we didn't anticipate... So that's another benefit of reading aloud, actually. We know the benefit of reading aloud is for the students, because if you read something aloud, you'll pay closer attention and it's good for memory. We know that.

Felienne Hermans: Another value of reading aloud that this example nicely illustrates is the value for the teacher. If you ask a kid to read aloud, you can look into their brains, which is pretty hard, especially with programming; you don't really know what they know. So if you ask a kid to read that Python aloud and you see that they're still struggling with the variable name, then obviously they are not at a level where you want them to be. So it's also very revealing for teachers.

Felienne Hermans: And then, of course, understand that Dutch - if you don't know Dutch - is very much like English, even though it's not the exact same thing. They are very close. Imagine the effects of reading aloud on learners that naturally speak Hindi, or Chinese, or Arabic... If your language is even further away from English, and "print" for example (like in Dutch) isn't even a word in your language, probably you will have even more benefit from memorizing it by the sounds, because the looks of those letters you don't commonly see don't really carry that much value to you.

Stefan Tilkov: That makes a lot of sense. It actually reminded me of the benefit that many people say, or that I've observed myself, from talking to somebody and having them type something. The seeing into the brain - it's very interesting to listen to somebody explain something, as opposed to just seeing the resulting program that they type, which is probably part of pair programming and mob programming and all of those things... It gives you more insight into those things.

Felienne Hermans: Definitely. And at a higher level, I think - our profession, programmers do understand this, because you have this concept of rubber ducking. If you're really stuck, then explain your problem to your rubber duck, or your leprechaun or what have you in your office, and it will help. So at this level - let's say the level of strategies - we definitely believe as a community already that vocalization is a good tool, because I definitely think most people believe in either rubber ducking or brainstorming with their office mates.

Felienne Hermans: So we do believe it at a higher level, but we don't really believe it at syntax level, because we sort of think syntax is easy and googleable. But the idea, of course, that expressing an idea with your voice has also comprehension benefits - that's not very revolutionary.

Stefan Tilkov: How much is this influenced by the fact that when you're in university, you're very likely to have to learn more than one language? Are you supposed to do the same kind of learning for each of the languages that you want to become proficient in?

Felienne Hermans: That's a great question. We don't know yet. We also don't know yet at what level vocalization has to take place, and when. Clearly, if you're a novice, then it is valuable to say - let's say in Python "for i in range(0, 4):", because you want to memorize where the symbols go. But probably after a while, if you're a more experienced programmer, then if you're reading code aloud, you wouldn't say "for i in range." Maybe you just say "A for loop from 0 to 4." You could still practice reading a longer program, but maybe you don't have to stress all the individual elements. Maybe you only stress the important parts. Maybe you're abstracting while reading aloud, as well.
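
The more abstract reading mentioned here ("a for loop from 0 to 4") can be sketched with Python's standard `ast` module, assuming Python 3.9 or newer for `ast.unparse`. The function `summarize_for` is hypothetical and handles only simple `range` loops; it ignores a step argument if one is given.

```python
import ast


def summarize_for(source):
    """Describe a `for <name> in range(...)` loop the way an experienced reader might."""
    node = ast.parse(source).body[0]
    is_range_loop = (
        isinstance(node, ast.For)
        and isinstance(node.iter, ast.Call)
        and isinstance(node.iter.func, ast.Name)
        and node.iter.func.id == "range"
        and len(node.iter.args) >= 1
    )
    if not is_range_loop:
        return "not a simple range loop"
    args = [ast.unparse(a) for a in node.iter.args]
    # range(stop) starts at 0; range(start, stop) names both ends
    start, stop = ("0", args[0]) if len(args) == 1 else (args[0], args[1])
    return f"a for loop from {start} to {stop}"


print(summarize_for("for i in range(0, 4): print(i)"))  # prints: a for loop from 0 to 4
print(summarize_for("for i in range(4): print(i)"))     # prints: a for loop from 0 to 4
```

The spoken summary hides one detail a novice still has to learn: in Python the stop value 4 itself is not visited.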

Felienne Hermans: In the same sense, indeed, if you are already a proficient Python programmer and then you're learning C, maybe you only need to practice this syntax level for a really short time, or maybe you do not need that at all, because you already know that there is a concept of a for loop, and you can easily retrieve it from your memory, even though in C it's different from Python... But you can sort of look that up without too much overhead, because you can retrieve the for statement as one thing, even though you don't know the syntax. These are all open questions that are very interesting. We don't know those yet.

Stefan Tilkov: I'm going to get back to this "How can we know" thing later, but I wanted to ask about more methods. We've talked about vocalization as one strategy/method; what else is there in your toolbox?

Felienne Hermans: One of the things that's also very interesting - again, coming from reading education - is deliberate explanation of strategies. This holds specifically for reading text. If you learn to read text - and now we're not talking about your first encounter with text if you're six or seven, but think of early high school, where you're really learning text comprehension. You get told specific strategies for reading text. They will tell you things like "Some of the things you can do is look at the images. Look at the headers. Try to summarize the text." So you read it -- you get two minutes for this text, so clearly you cannot read everything... So you have to look at only the important elements. Look at words that are made bold.

Felienne Hermans: If I explain it like this, you'll be like "Yeah, that is so basic. Do we really need to explain this to kids?" But then if you look at textbooks about reading comprehension, we actually do explain these things to kids. We explain to them how to read a text, and we tell them "Here are your strategies", and also evaluate those strategies. So "You looked at the pictures. What did you think the text was about? Now actually read the text. What do you think the text is about now? How helpful were the pictures?" That type of instruction I haven't really seen for programming.

Felienne Hermans: So how do people learn how to read source code? I don't really know, and I've seen third-year undergrads - if I ask them "What does this code do?", they start at the top of the program and just read from top to bottom. It makes sense, because it's basically the only strategy they know from natural language. You read it top to bottom.

Felienne Hermans: How would they know that something else is more practical? And there's been interesting research - actually, eye-tracking research - that has compared how novices versus experts read code, and of course, you know what experts do, because you are an expert... We follow the control flow of the program, the execution trace. So we start at the main loop, that's what we read first, and then there's a method and we're like "Oh, where is this method defined?" and we go to the method definition. Then we read the method, and maybe it calls another method; or maybe it accesses a field of the class, and then we're like "Oh, let's go to the definition of the field." We follow the execution of the program. Who has ever told you that that is a good strategy? How have you acquired that strategy?

Felienne Hermans: It's sort of survivorship bias, and it's good that we know this, and it's great that we practically taught ourselves, but I think we could be, again, more efficient, and also maybe more inclusive. If this is an explicit thing we tell to students, "Here's your program. Find the execution path" and "This is the order in which you read things", and also things like scanning... For natural language you just read the headers, and you get a sense of what the text means if you just have two minutes. Well, for programming, it might make sense to collapse, and many IDEs can do this. Just literally collapse all the classes or collapse all the methods, and just read the signatures. You can get something from that. That's another strategy.
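The "collapse everything and just read the signatures" strategy can be tried without an IDE. The following is a minimal Python sketch, not something shown in the episode: the `outline` function and the sample `SOURCE` program are hypothetical names made up for illustration, and only the standard-library `ast` module is used.

```python
import ast

# Hypothetical sample program to "scan"; any Python source text would do.
SOURCE = '''
class Account:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

def main():
    account = Account("Ada")
    account.deposit(10)
'''


def outline(source):
    """Return class names and function signatures with all bodies 'collapsed'."""
    lines = []

    def visit(node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                lines.append("    " * depth + f"class {child.name}")
                visit(child, depth + 1)  # look inside classes for methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Only plain positional parameters are listed (no *args, **kwargs
                # or keyword-only parameters), to keep the sketch short.
                params = ", ".join(arg.arg for arg in child.args.args)
                lines.append("    " * depth + f"def {child.name}({params})")
                # Function bodies are deliberately not visited: they stay collapsed.

    visit(ast.parse(source), 0)
    return lines


for line in outline(SOURCE):
    print(line)
```

Reading only the four lines this prints already suggests what the program is about (an account that accepts deposits) before any method body has been opened.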

Felienne Hermans: So specifically naming these strategies and telling students "Hey, these are things you do. You get this code, you have to read it..." This also, in general, is an exercise we don't really do. Most of programming education is aimed at constructing programs, and not at reading programs... But even if you would have reading, then what does the student do? We want to really make this catalog of reading code strategies, and deliberately practicing those strategies. We haven't done a big study on that yet, but we hypothesize. It's almost sort of boring, because you're just replicating results. You sort of know that'll hold, because in natural language they hold, so why wouldn't it hold for programming? We think that if you explain to students those strategies and if they practice, then ultimately they'll be more versatile code readers, because they have this library of strategies.

Stefan Tilkov: That absolutely makes a lot of sense. I can imagine some reading class; this is the class where we do nothing but read a program and summarize what it does. This is a fascinating idea.

Felienne Hermans: Yes, there was an amazing study done in the '90s... I want to pitch this paper because I've just heard about it a few months ago; it's called "The case for case studies in Pascal." It is online, we can link to it in the show notes. They compared a group of high-schoolers learning to program in Pascal. One group just programmed, like in normal/typical programming education, and then after they created the program, they got to see an expert solution. That was group one.

Felienne Hermans: Group two also programmed, they got the expert solution, but also with expert explanation. Group three did not program at all; they just read expert programs with expert explanation... And that third group did as well as the second group, and better than the first group. So a programming course in which there was no programming going on, only reading code and reading about strategies, about how experts had created those programs, was actually a very valid and efficient way to teach programming to high-schoolers. That's like "Wow...!" That is so contradicting the things we believe in programming, that in order to become a good programmer you need to do lots of programming.

Felienne Hermans: It turns out that this study - and I would love to replicate this study now, because '92 was a long time ago... But I do think it's valid still. If you just explain to kids how they should do things, then it's more efficient than having them figure out all this stuff about syntax and about strategies by themselves. It's just possible.

Felienne Hermans: I know people will now be up in your mentions or my mentions on Twitter, and they're like "Yes, but I have taught myself programming, so this is possible." I am not saying it's not possible; you have value, calm down. But it is probably more efficient to just explain this to kids.

Stefan Tilkov: But I'll bite - the figure I just postponed a few minutes ago... I'm always kind of worried by those studies, because every time I look into one of those studies in more detail, I find them not really convincing. Because what they typically do is they have super-small groups of like six students here and seven students there, and no explicit assessment of what they knew before they did this particular thing... So my problem with many of those studies is that I just don't believe them. What is your experience there? Am I just wrong, and an old man shouting at clouds type of thing? Is this all different these days, or was my perception wrong? How well-researched is the actual research?

Felienne Hermans: That's a very valid question, and of course, part of science is definitely criticizing methods, and "Can we really believe this?", and as I said, I want to replicate those studies, because of course, our confidence in those studies (and in any kind of study) increases if lots of replication is done, if lots of similar studies are run. However, it is valid and important to critique studies, but I see there's often a correlation between how much people criticize a study and how much they want the study not to be true.

Felienne Hermans: Sometimes studies that we don't like get more criticism than if it's something that validates everything we already believe. We're like "Oh yeah, but they're starting with saying this is true." So I do think it's very good to be critical; I do however also think sometimes it's just a way of people that aren't that well-versed, they aren't scientists themselves, so they aren't really sure what such a study should be about, and just say "Oh, but it doesn't have many participants. It was just seven" or "It was just 70" or "It was just 700."

Felienne Hermans: First of all, any number is small. Even if you would do a study with a million programmers, that's still not all the programmers, and that's still quite a small percentage of the world population. So in general, if you have a randomized control trial where half of the students get a certain treatment, like reading aloud, or like programming by only reading programs, and another group gets another treatment, and you do that in a valid, randomized way... So even though you don't know what prior knowledge there already is, you can either do a test before, and then distribute the knowledge across the groups, or you can say "Well, we randomly assigned everyone to groups, and if your group is reasonably sized, then you can still learn something."
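The random-assignment idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of any study mentioned in the episode: `assign_groups`, the student names and the prior-knowledge scores are all made up.

```python
import random
from statistics import mean


def assign_groups(participants, n_groups, seed=None):
    """Shuffle the participants, then deal them round-robin into n_groups groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical prior-knowledge scores (0-100) for 60 made-up students.
score_rng = random.Random(0)
prior_scores = {f"student{i}": score_rng.randint(0, 100) for i in range(60)}

groups = assign_groups(prior_scores, n_groups=2, seed=1)

# With reasonably sized groups, the average prior score of each group tends to
# land close to the other's, even though nobody looked at the scores when assigning.
for number, group in enumerate(groups, start=1):
    average = mean(prior_scores[name] for name in group)
    print(f"group {number}: {len(group)} students, mean prior score {average:.1f}")
```

The alternative she mentions, testing beforehand and spreading prior knowledge over the groups, would amount to sorting students by pre-test score before dealing them out instead of shuffling blindly.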

Felienne Hermans: So I don't think necessarily if studies are small that they don't have value. They can definitely still have value. But also - yes, more of these studies are good, and the more skeptical we are as a programming community in general, the more reason there is for us to run these studies, I think, for a field that is extremely critical. You know this, and everyone knows this - we bicker about Vim versus Emacs, "Oh, C is the best programming language!", "No, Python!", "No spaces, tabs!" Everything is under debate, and that is good. It's good that we keep talking about those things. However, the fact that you learn programming by programming a lot - which again, compared to language, and math, and music, and sports - is sort of a weird position to be in, is entirely never criticized.

Felienne Hermans: Skills like debugging, reading code - they're never taught. Not in undergrad programs. It doesn't seem - as far as I'm aware - to be taught that much in coding bootcamps, so let's say competitors for traditional programming education in universities... It's all focused on programming, programming, programming. Why aren't people more debating there, where it matters? So I hope people are interested in "Are these studies actually valid? Does that Pascal '92 story still hold?" That's good, because if more people are interested, then that's more ammunition for me to actually run those studies.

Stefan Tilkov: In my defense, I have to say that I tend to not believe any studies, no matter whether they support or don't support my personal point of view...

Felienne Hermans: Ha-ha! That's good.

Stefan Tilkov: ...but it's still true that those studies could be dismissed for all the wrong reasons, and I'm certainly not free of that, so... And I completely agree with the point that if there are more studies replicating the same results, then of course that increases the confidence in those results.

Felienne Hermans: Yes. And you will not believe this, because it's yet another study, but there's actually some evidence that people more quickly reject a study if it doesn't fit their opinion. There are studies done on gender bias, so if you believe gender bias, that it's true, and you read a study about gender bias, you'll be like "Oh, this is true!" Whereas if before you have said "No, I don't believe that", and they present you the same evidence, you say "Well, there's no evidence, because reasons."

Felienne Hermans: So you'll not believe me because it's yet another study, but studies show that it is indeed true that if you believe something, you are less likely to reject evidence showing that that is true.

Stefan Tilkov: Let me qualify that... I actually do believe you, because I do believe those kinds of studies. I have a particular problem with studies regarding programmer productivity, because I find that's incredibly hard to measure. It's sort of impossible to neutralize the effect of the participants' experience. That's the one thing that troubles me a lot. I have complete confidence in studies that are done with a number of people regarding general things like confirmation bias, or any of the fallacies that people have, and they can be replicated... I completely agree with that. It's really just programmer productivity that always drives me nuts.

Felienne Hermans: Yes, productivity is very hard to measure. It's very hard also to have a randomized control trial...

Stefan Tilkov: Exactly.

Felienne Hermans: ...in which some programmers do one thing and other programmers do another thing, because the realistic situation where you would work on a project for months - that's almost impossible to replicate. So I do agree with you that productivity -- and how do you measure it? Number of feature points, or lines of code, or whatever... So yes, productivity is specifically hard to measure. But of course, measuring if students have acquired a certain concept, something like doing an exam at the end, I would say is less hard to measure. It's pretty easy to measure if someone understands what a variable is, and this is something you do in practice, as well. I think those studies are slightly less up to discussion than things about productivity.

Stefan Tilkov: Agreed. Let's leave that and maybe talk a bit about algorithms and data structures, because clearly that's at the next level, beyond the mere syntax. I believe you called them strategies, right?

Felienne Hermans: Yeah.

Stefan Tilkov: Do you have some additional ways of teaching those?

Felienne Hermans: No, as I said earlier in the episode as well, I think once we're at that level, the way we teach is pretty good. You do get lots of experience, lots of practice with a red-black tree, or a linked list. Because the education is sort of designed for people that already know programming, even though it starts at a lower level, I think the higher the level is, let's say the more the students look like the professors, the better the teaching becomes. For me, I'm less interested in that level, because I think there we're doing quite well.

Stefan Tilkov: One question I have is what's your opinion on how much do people need to understand those things? How important is it for somebody to understand how, say, a hash map works internally, or a red-black tree, or whatever data structure, or how to do a particular algorithm?

Felienne Hermans: That's a great question. I think this question hits right in the heart of one of the struggles that computer science programs have all around the world... Because we are very (let's say) schizophrenic about who are we educating. I think many undergrad programs are secretly programs that prepare for a career as a computer scientist... A scientist as in someone that does a Ph.D, and that later goes on to work as a post-doc or a professor.

Felienne Hermans: So I do think if you were the person that is designing algorithms, if you're creating new algorithms for machine learning or AI, then it's very important to be able to create new data structures, to deeply understand how those things work. However, many programs, even though they want to be scientific, are actually creating programmers; people that stop their education after the bachelor's degree, or more common in the E.U. after a master's degree... They go into industry. And then if you go into industry and you become a programmer, then maybe it's really important that you know how to use data structures. You definitely need to know how to use a tree, how to use a hash map. But do you really need to know how to design new data structures? Do you need to deeply know how to implement those? Do you need to know how they run on a machine? Maybe... But maybe not.

Felienne Hermans: I think it's very hard that we have to cater to - "we", I'm a professor myself - these two entirely different categories of people, where people that go on to be a programmer, they absolutely need to know how GitHub works, and how an IDE works, and how to do code reviews, and how to do Scrum... And those are the things we get lots of criticism from companies where our graduates go to work, like "Why didn't you teach them Git? Why didn't you teach them stand-up meetings?" Well, because initially we were very much created as an education to educate new scientists, and that's still in our veins, and you can just see that from the programs.

Felienne Hermans: Going back to the recursion discussion, which is maybe interesting. It's very much "Yes, it is important, but other things are also important." I guess also people might be upset with this. I'm sure this can be on Hacker News, like "Oh, computer science professor says data structure doesn't matter." I'm not saying they don't matter, but I am saying that I think they get enough attention, and a deep understanding of how exactly a hash table maps to memory on the computer is not something you use that often, and you can really have a nice career in creating programs if you just know in what Python library you can get the hash table or the red-black tree, and how to work with it. You don't necessarily need to know -- how does a red-black tree stay balanced? I don't know, but I know it guarantees me quick search, so that's fine.
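To make the "know how to use it, not how it is built" point concrete: Python's built-in `dict` is a hash table, and it can be used productively without knowing how it is laid out in memory. The sketch below is illustrative only; `count_words` and the sample sentence are made up.

```python
def count_words(text):
    """Count how often each word occurs, using a dict as a ready-made hash table."""
    counts = {}
    for word in text.lower().split():
        # Lookup and insert are both single dict operations; how the table
        # hashes keys or resizes itself never has to be considered here.
        counts[word] = counts.get(word, 0) + 1
    return counts


counts = count_words("the quick fox jumps over the lazy dog the end")
print(counts["the"])    # 3
print("cat" in counts)  # False
```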

Stefan Tilkov: One counter-argument might be that this kind of knowledge is way more long-term knowledge, as opposed to the use of Git. I completely do get your point, and I do have the same ambivalent opinion regarding the litmus test effect of the whole thing... I also understand what you're saying regarding the difference between maybe computer scientists and software engineers, or whatever we might call them, if there's such a difference and if we should be maybe differentiating in education as well. The only thing is that maybe some of those things are skills that you'll be able to apply even if details in the underlying technology or the products or tools change. Would you agree with that?

Felienne Hermans: Yes. I definitely think that this is one of the things that if you teach them, you teach them for the long-term. You don't teach how red-black trees work exactly in Python 3.7, you teach them the abstract concepts, so that you can use it and recognize it in different places. So I definitely think that is true, but that also goes for let's say the transferable skills that I would advocate need more attention, like variable names. That's also a skill that stays relevant with all new technologies and with all new programming languages that could exist. So I definitely believe in teaching abstract concepts in addition to practical skills. I think I'm just saying there should be a balance, and some of the things get too little emphasis, while there are also long-term skills like organizing your code really well, or giving feedback to a colleague in a professional, proper way.

Stefan Tilkov: Completely agreed. You mentioned a few minutes ago that we sort of expect people to be able to read and write to be a member of society; at least if they want to take part in everything, that's an essential skill. Is that true for programming as well?

Felienne Hermans: Yes, I think so. It's not necessarily true that you need to be able to create a program, but I do think you need to be able to understand how hard it is to create a program for something, which is very much like knowing how to program.

Felienne Hermans: I'll give an example from the Dutch political situation - we have Airbnb in Amsterdam, and that's a problem for the city, because it creates tourism hubs in places where the city government did not envision tourism, because it was residential areas, that are now turning into tourism areas... And that's a problem, because now we get too much tourism, and people are unhappy in their neighborhoods.

Felienne Hermans: That is a political problem, but it's also a software problem, because suppose Amsterdam wants to do something. The City Council says "We are sick of Airbnb. We don't want that. We will build our own app. We will build Amsterdam BnB, so that people can still rent out their houses, but we have some control." How hard is that? Is this 100 Euros? Is this one million Euros? Is it technically impossible to build such an app? What are the ethical implications of this? In order to be able to participate in such a discussion, which I would like all citizens of my country to be able to do, you need to have some knowledge about how hard is it to create something... And that's just one example.

Felienne Hermans: If you want to participate in society, 10-20 years ago you'd write an angry letter to the government and you'd say "Hello, Government. I am angry because there are too many people parking in my street." What you could do now if you have some programming skills is you could actually download open data from the government and say "No, there are not too many cars in my street... Only in the past three months there were 427 cars in my street, while there are only 100 permits, so how can this be possible? People without a permit must be parking here."
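A small Python sketch of that kind of analysis follows. Everything in it is hypothetical: the CSV columns, the street names, the licence plates and the permit list are invented for illustration and do not correspond to any real government dataset.

```python
import csv
import io

# Hypothetical open-data extract: one row per parked car that was observed.
OBSERVATIONS_CSV = """date,street,plate
2019-03-01,Example Street,AA-11-BB
2019-03-01,Example Street,CC-22-DD
2019-03-02,Example Street,AA-11-BB
2019-03-02,Other Street,EE-33-FF
2019-03-03,Example Street,GG-44-HH
"""

# Hypothetical set of plates that hold a parking permit for Example Street.
PERMITTED_PLATES = {"AA-11-BB", "CC-22-DD"}


def cars_without_permit(csv_text, street, permitted):
    """Return the distinct plates seen in `street` that are not in `permitted`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = {row["plate"] for row in reader if row["street"] == street}
    return seen - permitted


unpermitted = cars_without_permit(OBSERVATIONS_CSV, "Example Street", PERMITTED_PLATES)
print(sorted(unpermitted))  # ['GG-44-HH']
```

With real data the same few lines would turn "there are too many cars" into a specific, checkable claim about how many distinct cars were seen versus how many permits exist.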

Felienne Hermans: Knowing some programming also enables you to participate in such discussions that are very similar to discussions that you could solve with just text or language previously. For those types of reasons I think it's very important that everyone knows a little bit about programming, so that they can participate in these discussions... And it's not just Airbnb. It's Uber, it's Uber Eats, and Deliveroo... There's so many parts of society where - they say software is eating the world - if we don't give everyone access to that type of discussion and that type of power, who already has this power and will continue to wield this power? That will be Silicon Valley, mainly middle-class, middle-aged white men that create these apps that are disrupting - for better or worse - societies all over the world... And I think everyone, especially people that aren't in that traditional group solving their own problems, everyone needs access to that, so that technology is maybe more beneficial, or at least less disruptive in places where we don't want it to be disruptive.

Felienne Hermans: It's not just "I like programming, so everyone should like programming", or "I like running, so everyone should like running marathons." It is partly that, of course, but it's also being able for everyone to participate in these enormously impactful discussions. That's why I think programming is very important for everyone to know.

Stefan Tilkov: Conversely, do you also think that there should be mandatory politics and ethics training for programmers?

Felienne Hermans: Yes.

Stefan Tilkov: I thought so. And I agree.

Felienne Hermans: It's like medicine, we need this Hippocratic Oath for programming. Yes, you should at least be aware of what the effects of what you're creating could be on society. I think very much of programming is focused on: "Oh, I can build this and it'll be really cool! Let's make a robot that can do a backflip, because it is amazing to look at!"

Stefan Tilkov: Well, it is. It's also very scary.

Felienne Hermans: It is, and it's amazing, but why do we want this, and what could be a negative effect, and what else could the robot do with those very strong and powerful limbs, that can totally move everywhere? Hm... Maybe other things than a backflip. Do we also want that...?

Stefan Tilkov: I completely agree. Let's move back to simpler questions... One that occurred to me - in your experience teaching, what would be a good programming language to start?

Felienne Hermans: Yes, because that's totally an easier question than all the other ones, right?

Stefan Tilkov: Yes... You just tell me which language is the best one, and then this is settled once and for all.

Felienne Hermans: I really like teaching Python at this point, so I would definitely say Python is one of my favorite languages. I think also it doesn't really matter, so that's two contradicting answers. I think what really matters is that you're fluent in the language and you practice a lot.

Felienne Hermans: The reason that I like Python is that it is very versatile. It can do many things. You can create a web app with Django, you can do data analysis with Python... So it allows you to be many different things. That's what I like about Python, and that's also what I like about the Python community. I definitely think also that if you're teaching someone a programming language, you should think not just that they will know the programming language, but that they will also be part of a community. Especially, again, thinking of female students, if I think of sending my students into the Python community - you know, going off to conferences wearing shirts that say "Python is for girls", that is a welcoming space that I would happily send students to.

Felienne Hermans: There might be other programming languages or platforms that are less inclusive in that way, so for me that's also an important reason to select a programming language, because you do influence the type of people and the type of events that your students will go to. So I like Python, but also I don't care that much.

Stefan Tilkov: What are some of the characteristics that a language needs to have to be a good teaching vehicle?

Felienne Hermans: What I like about Python is it is very gradual. It's a little bit like gradual typing... Also a concept that I very much like is you don't have to make many decisions in the beginning. If you're just learning Python, you don't need to think about types. Just create your variables, everything works, and then if you move on a little bit, you can actually use optional typing. You can think of types, and there are types, and you can ask Python what the type of a variable is, and you can write down what the types for a function are, but you don't have to do it. You sort of ease into it, where I think there are lots of other programming languages that might be a little bit harder to learn, where you need to get everything right from the beginning, so any language with a type system.
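The "ease into it" progression can be shown in a short Python sketch. The function names are made up for illustration, and the `list[float]` annotation syntax assumes Python 3.9 or newer.

```python
from typing import get_type_hints


# Step 1: a beginner can write this without thinking about types at all.
def total_untyped(prices):
    return sum(prices)


# Step 2: later on, the same function can be annotated. The annotations are
# optional and are not enforced when the program runs; an external checker
# such as mypy can verify them separately.
def total_typed(prices: list[float]) -> float:
    return sum(prices)


result = total_untyped([1.5, 2.5])
print(result)        # 4.0
print(type(result))  # <class 'float'> - "asking Python what the type is"

# The written-down types of a function can be inspected as well.
print(get_type_hints(total_typed))
```

Both functions behave identically at runtime; the annotated version only adds information that a learner can start caring about when they are ready.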

Felienne Hermans: I love type systems. They're empirically shown to be working, and they get fewer errors, so yes, use type systems if you're a professional. They're amazing. Lots of research about that. However, if you're a beginner, it can be overwhelming. You have to think about syntax, and also the types have to match... It is too much initially.

Felienne Hermans: So I would definitely like languages where you can not think about some things initially and add them later on. I think for teaching that is definitely the best. Again, think of language - initially, if students learn to read, they're five or six or seven, you don't need to do punctuation. They don't even need to do capital letters, they just do everything lower-case. Because first we want to focus on handwriting, we want to focus on spelling some words, and reading a bit. And then only later we say "Well, now instead of doing every sentence on another line, we can match sentences together, but then you need to have a period in between, otherwise it gets confusing."

Felienne Hermans: So you add more and more levels, and I definitely think programming languages that can mimic that experience, stepping up and creating harder and harder programs, with more elements of detail that were initially left out, I think that's the best. But again, also - it doesn't matter as much as people think it matters.

Stefan Tilkov: I'm very sure our listeners have strong opinions on that, so that's definitely going to make somebody--

Felienne Hermans: Well, that's good if they do have strong opinions on this. We're actually running a survey. It's great how you coined that. If people are interested in actually expressing their opinions on programming languages, we are running a survey at bit.ly/pl-views where you can express what you think is the best, the most worthy, the most valuable programming language. You can pick all the programming languages you want and then tell us why you think one language is better than the other language. We are still gathering results, so we'd very much like to hear from you, listeners of the CaSE Podcast, what your favorite programming language is, and why.

Stefan Tilkov: Very good. We'll make sure to put that in the show notes, as well. I'd like to loop back to the beginning of our discussion. We started off by making sure that people were aware that we're basically talking about education before people get a job, so maybe we can turn back to that topic again. How much of what we discussed is transferable to training for people who are already active programmers and software developers, and how is it different?

Felienne Hermans: I think one of the most important takeaways is that if you see colleagues struggling, for example because they're learning a new language, or platform, or API, help them with the basic stuff. It might very well be that they're an excellent C programmer, and one of the best in your team, and now you're moving to Python, and suddenly they're like babies again; they don't know anything... Because they don't. So help them really focus and practice with some syntax, and don't say stuff like "Oh, but this is easy. You already know C, so this programming language will be easy." It will not be easy, because you still need to have lots of muscle memory for the syntax.

Felienne Hermans: It's also like "Oh, but you already know Dutch... Well, then French will be easy." Yeah, I know the letters, I know the alphabet, I know some of the grammar, but it'll still be very hard. So in the same sense, transferring to something new always takes tremendous amounts of energy, and people will -- it's like, if I say something in French, I will sound a little bit dumb, because everything I say will be half off. In a similar way, if people start a new programming language, they will probably do some weird stuff, even though they're really smart. It's not because they're not smart; they're still smart, they just still have to get used to new things.

Felienne Hermans: So give people a little bit of credit, and don't think it's just syntax, and you can easily google it or pick it up. You really need some time to automate skills in any new environment that you're in. And also, focus and practice. Reading code is still something you can get better at. If you're a proficient reader of English, you still learn five new words every week because you read a lot. I'm sure if you would read lots of code, then you would also still pick up new strategies, new data structures...

Felienne Hermans: So all those people doing those open source projects on Saturdays, like "Oh, this is my GitHub page, and I have 20 projects", I'm always slightly skeptical. I mean, I'm happy you're doing things that you think are fun - by all means, have fun - but I don't think you're necessarily learning something from just applying the same things. I think if you want to learn new skills, a better way might actually be to go on GitHub and find a program that you use, that you like, and read the source code. This could be an exercise you could do with a group of developers in your company. "Let's read the Linux kernel. Let's read the OpenOffice code. What is going on there? What can we learn from there?"

Felienne Hermans: I think that might be something, if you want to do a team learning/group learning exercise, that could have lots of value, and maybe more value than just creating new things.

Stefan Tilkov: Awesome. I totally love that idea of a reading club, a reading circle for interesting source code... And I think we'll leave it at that. Felienne, it's been awesome talking to you. Thank you so much for being on the show.

Felienne Hermans: Thanks for having me.

Stefan Tilkov: Goodbye, listeners. Until next time.