Manuel Pais on Team Topologies

Transcript

Sven Johann: Welcome to a new conversation about software engineering. Today, I’m talking to Manuel Pais, the author of Team Topologies, the book about organizing business and technology teams for fast flow. Manuel is recognized as a DevOps thought leader. He’s an independent IT consultant and trainer.

Sven Johann: He focuses on team interactions, deliberate practices, and accelerating flow. He is also a LinkedIn instructor on continuous delivery. He was for a long time the DevOps lead for InfoQ and a QCon London program committee member. Welcome Manuel.

Manuel Pais: Thank you so much. I’m happy to be here.

Sven Johann: Did I forget to mention anything about you?

Manuel Pais: Probably, but I also forget many things. I think you got the main ideas. I’ve been a DevOps advocate, if you like, for a long time. And I did a lot of work with InfoQ and consulting as well. And now, since the book Team Topologies came out about, well, it’s actually today that we’re recording the two-year anniversary that the book was published.

Manuel Pais: So I do most of my work around training, consulting, and giving my opinions about team organizations, team structures, interactions, etc.

Sven Johann: When I heard about the book, almost two years ago, I was more like, yeah, it’s probably interesting if Manuel writes it, I must read it, but I just didn’t read it for a long time.

Sven Johann: And now I read it and I thought, why wasn’t this book written 15 years ago? Because almost 15 years ago, Werner Vogels, the CTO of Amazon, gave a now-famous interview to ACM Queue about how Amazon develops software, and he was talking about those two-pizza teams and “you build it, you run it”, and all those kinds of things.

Sven Johann: And I think most companies started adopting those ideas maybe a couple of years later. But, you know, ever since I adopted this way of working, I’ve been running into problems, and so have the teams I’m working with, and you gave those problems a nice, let’s say, vocabulary. The first challenge you mention in the book when you form such teams is cognitive load. So can you elaborate: what’s the problem with cognitive load and those self-organizing two-pizza teams?

Manuel Pais: Sure. So yeah, that famous interview was, I think, helpful to get people thinking about what’s the right size for teams that promotes faster flow and more autonomy, etc.

Manuel Pais: And also 15 years ago, it was more or less when agile started to gain a bit more traction. I think there was a gap there. In some of the praise we got for the book, Jeff Sussna said this is the book that does the homework for DevOps around how to organize teams. Obviously the cultural aspects and the practices, all that is super important, but there’s also this part.

Manuel Pais: And so going back to what Werner Vogels said about the two-pizza team, what we’re saying in the book is basically that there is some truth to that, but it’s sort of intuitive, right? People think, oh yeah, that sort of makes sense. There are the jokes about how big is the pizza, how hungry are the people, but, you know, we’re talking about 7, 8, 9 people. Probably not much more than that. The book builds on ideas, some of which have been around for quite long, but no one was putting them into the perspective of team organization, team structures. And so the idea of cognitive load comes from the psychology field.

Manuel Pais: Someone called John Sweller defined cognitive load in the 1980s as the total amount of mental effort being used by a person doing some task. And so it’s interesting because people working in user experience, user interaction, tend to have this idea of cognitive load, very present when they’re thinking about how much cognitive load it is for a user, for a customer to perform some task. How do we reduce that cognitive load? When you have to keep a lot of things in your mind to understand what do I need to do next? How do we make that simpler? And so what we’re saying is we can also apply the idea of cognitive load at a team level.

Manuel Pais: Because we know that there is a limit based on the capacity of the group of people in the team to how much they can have in their minds. And so that has implications on the work they can do. And with software engineering in particular, teams are being pulled in many directions.

Manuel Pais: They’re being asked to do more and more. They’re being asked to take care of infrastructure for their services, take care of security, build and run the services that they own, so also the operational side. So what this is doing is increasing the cognitive load of the teams to a point where often they’re in overload.

Manuel Pais: They’re like a processor in overload, it gets really difficult. And so what we’re saying with Team Topologies is that actually once we understand better what cognitive load is, and then we break down cognitive load into different types of cognitive load, we can have a better chance of addressing that problem that many teams have.

Manuel Pais: So there are actually three types of cognitive load. There’s intrinsic, extraneous, and germane. And the reason why this is useful is that we understand what this relates to. So intrinsic cognitive load is if you like the skills that you need as a software engineer, or as part of a software delivery team, to do your work right, to deliver the changes to software.

Manuel Pais: And so depending on the languages you use, the tools, etc., you might need different skill sets. You might be a Java programmer on a banking application. So you need to know about Java. You need to know some of the frameworks you use, but also how do we then create infrastructure for the service to run on, etc. So all these kinds of skills.

Manuel Pais: And then you have extraneous cognitive load, which is related to, if you like, the mechanics of the delivery work. And obviously the operations work as well. So whenever we need to put in the effort to remember how do I deploy this new version of the service to Kubernetes, or how do I create some new infrastructure? How do I access the test database? How do I run the acceptance tests? All these things that take up mental effort are extraneous cognitive load.

Manuel Pais: And then finally we have germane cognitive load, which is everything that’s related to the actual problem space. So what are we trying to do? Are we trying to, I don’t know, allow customers to do monthly transfers in an easier way or whatever it is, the problem we’re trying to solve, or the feature we’re trying to build for our customers. That’s all the germane part. That’s where we’re actually adding value to the organization, to the customers.

Manuel Pais: And so when we understand these different types of cognitive load, it becomes clear how we can help reduce the overall cognitive load: by minimizing intrinsic cognitive load, where you might have pair programming, mob programming, internal training sessions, traditional training, etc.

Manuel Pais: All those things can help upskill people and reduce the intrinsic cognitive load if they have some gaps in the things that they need to do on a daily basis. And then we want to reduce extraneous cognitive load as well. And that’s where one of the key things will be offloading some of that extraneous cognitive load to maybe a platform that provides useful services for us to do things like provisioning, deployment pipelines, monitoring, etc., where we’re not asking every team to do that by themselves and set up the whole stack and learn about all the different tools.

Manuel Pais: If we understand what the needs of the teams are, then maybe we have a better chance of creating some good platform services that address their needs, not just some generic services that everyone can use, but actually what specifically do we need for our teams? And so if we then minimize the extraneous and intrinsic cognitive load, that means we should end up with more capacity for germane cognitive load, to actually focus on the customer needs.

Sven Johann: I think that getting it right is really tricky. I once worked with a customer and they said, oh, we now want DevOps. And we want self-organizing teams. And then they thought about, what are we doing? And it was even something like, do we buy our own licenses for software and stuff like that?

Sven Johann: And it was like, nah, maybe not, maybe you need someone who negotiates the price for you. You know, no developer should negotiate a price with some tool vendor. So it’s hard to find the balance. How do I know if I have too much cognitive load, how do I measure it?

Manuel Pais: We have some free tools and templates on GitHub at github.com/teamtopologies. One of them is a cognitive load assessment, but it’s just an example, right? Because when we’re talking specifically about germane cognitive load, this will be different depending on which industry and which context you’re working in.

Manuel Pais: But in terms of software engineering, there are some common things. When we talk about build and run teams, when we talk about DevOps, we know what we’re asking of teams. If we want them to go faster, to be a bit more autonomous (maybe that’s too strong, but more self-sufficient), they need to understand the different aspects of the life cycle.

Manuel Pais: We want to reduce those handovers between functional areas. So we want teams to know about development, obviously, design, testing, infrastructure, security, operations, etc. So because we’re asking all these things from the teams, we need to find out how is this manageable?

Manuel Pais: And like I mentioned, we can have a platform. You can have even just the idea of a platform, even if you don’t have a dedicated platform team, just defining what is your platform. This could even be a wiki page where you say we use these AWS services, this is a default setup.

Manuel Pais: That is, it works well for this situation. So you can help other teams, even just with this kind of simple curated experience of how should we use the AWS services? Because if you just put it in front of a team and say, well, now you are using public cloud and just find out what you need to do, your cognitive load sort of explodes. So there is sort of a subjective measurement, which is people feel overwhelmed. They feel like we’re just running after things and we never have time to stop and think how to do this better. That’s one kind of more intuitive way to see it. And the reason why so many people, even individual contributors who are not deciding on team structures, relate to the book is that we talk about this problem of cognitive load.

Manuel Pais: And they feel like we’re being asked to do so much at the same time, there’s pressure to deliver features faster. And you know, how do we manage this? But you can also do, like I said, that sort of cognitive load assessment. There’s an example on GitHub that basically asks the team members questions about the engineering experience of testing, of deploying, of building, of fixing problems in production, of finding or diagnosing problems, all those things that are common to most build and run teams.

Manuel Pais: Then we can ask the teams, how does it feel? Is it easy? Is it difficult? And so we basically start to have a conversation. At the end of the day, the important thing is to have a conversation about “is our cognitive load too high?” What are the areas where we’re really struggling? Maybe deployments are very painful. Maybe it takes a week or two to do a deployment. Then definitely it looks like that’s where we should invest and maybe have a better approach to deployment. Let’s take some time to automate some things, to understand better how to do continuous delivery properly, etc. So that’s where I would start.

Manuel Pais: We are also working on having a more detailed assessment for cognitive load. So we’re working with people with a psychology background as well. So hopefully next year we will have more of a, not just an assessment example, but more of a simple tool that can help teams get a better insight on where they are.

Manuel Pais: But I would recommend that, even inside the team, you do this sort of exercise. Even if management, or anyone looking at multiple teams, isn’t involved, inside your team you could start discussing these things.

Sven Johann: We’ll put every link in the show notes. We used your questionnaire in two different ways. One was with a big team, where we just sent out a Google form or something like that. But we also did one-on-one interviews with people to really understand where they are, where they are struggling. And then the question is, once we understand where teams are struggling, what do we do then?

Sven Johann: And I think in a sense it’s kind of obvious, but as we said in the beginning, nobody ever worked it out, so to speak. You came up with four different team topologies, or four fundamental team topologies. Could you please explain each of them briefly?

Manuel Pais: I’ll try. So just as context, the four types of teams that we came up with are based on our own experience, me and Matthew Skelton, who co-wrote the book, helping clients; usually they were asking for help around DevOps and continuous delivery. And then figuring out that actually there was very little clarity on what types of teams exist.

Manuel Pais: What is their purpose? It’s more like groups of people doing work. And then somehow we expect this to magically come together. So having clarity on what is the type of team, who are the customers of this team? Are they the end customers? Are they internal customers in your organization? What are the expected behaviors from this type of team? Are we expected to have ownership of a service, of a build and run service, or are we expected to provide more of a support role to other teams, a teaching role? So the four team types come from there, having more clarity on how the different teams relate to each other and help each other as a sort of organism that is evolving and improving the way the organization works, but also being able to deliver value faster for the organization.

Manuel Pais: And so specifically, we start with stream-aligned teams. So you could say these are sort of build and run teams. But we’re also saying they should be aligned to a stream of work. So we need to identify what are the sort of streams of work that are more or less decoupled as much as possible. So obviously this goes back to the idea of identifying your value streams, but then even within the value stream, you can have multiple products, you can have multiple streams within a product.

Manuel Pais: So it’s at a more kind of fine-grained level. You might have a team that’s focused on a specific user persona. For example, if you have accounting software, a large enterprise is probably a different persona from an individual freelance customer, right? So it might make sense that your streams are aligned to those user personas. There are different ways that we can think about different streams, but effectively we’re trying to get to a service or software where one team can own the end-to-end life cycle, where they can understand the customers, they can discover what the customers of this stream need. And they can then build, test, deploy, and run this service or software as part of a larger product, more or less independently.

Manuel Pais: It’s not like there are no dependencies, but we’re trying to take a loosely coupled approach to teams, just like we try to do with services in software.

Sven Johann: So something like the two-pizza team …

Manuel Pais: Yes, that’s basically the kind of size of teams. So usually between seven and nine people, can be less, can be more, but to really have a jelled team that has a high degree of trust inside the team, you need to keep it within that sort of boundary. So we start with stream-aligned teams. Ideally, we’ll be able to identify what are the different value streams, different products, and different streams that we can then align the teams to.

Manuel Pais: But like we were saying before, the cognitive load for these teams, if we didn’t have any other types of teams, is very high. And then you get into those problems where, how do we handle everything that’s being asked of us and deliver features and learn about all this other stuff? And that’s where we then have the other types of teams.

Manuel Pais: We have platform teams, where we’re talking about the platform as more of an experience, not necessarily as much focused on services and tools. Yes, there will be services and tools. But how does the platform experience, the usage, help the teams reduce cognitive load? Because if you provide a platform service that’s very cumbersome, that’s difficult to understand, or the API doesn’t work as we expected, or the services are often unavailable or in maintenance, then this is not really helping the teams that much, right? This becomes more of a problem. Now we have to use this platform service which is complicated to understand.

Manuel Pais: So the starting point for platform teams as described in Team Topologies is to reduce cognitive load, is to provide a very high level of developer experience in the platform and user experience. So if we do that, we start to be able to offload some of that extraneous cognitive load into the platform.

Manuel Pais: And the nice thing is then, for the platform teams, this becomes their germane cognitive load, right? Because they’re working on what helps our internal customers, right? So you might have platform teams around continuous delivery or around infrastructure, things like this, but they’re helping the internal teams, the stream-aligned teams, to reduce their load.

Manuel Pais: And then the third type of team is the enabling team. Again from our consulting experience, what we saw is that many teams in an organization usually have similar gaps. So it might be that they don’t know how to do test automation in an effective way, or they don’t know how to do continuous integration, or they don’t know about user experience, it can be any sort of domain. And often organizations struggle to meet those needs of those teams because they cannot hire a user experience expert for every team. They cannot hire a test automation expert for every team. So instead of relying on hiring, we should rely on internal enabling teams, ideally, where maybe we bring in two or three experts.

Manuel Pais: And the way they work is to facilitate and mentor the other teams. They teach through pairing, through workshops, it could be video-based training, it can be anything that helps our internal teams gain the skills that they need in a certain domain. And at the same time, enabling teams are in a good position to identify what could be useful platform features or services for those stream-aligned teams.

Manuel Pais: So we start to have this sort of ecosystem, if you like, of different types of teams that understand well what their purpose is and how they’re helping the overall organization, the team of teams, if you like, become better.

Manuel Pais: We have a fourth type of team that we actually don’t recommend. It sounds weird, but it’s the complicated subsystem team. There are some situations where you have a very niche, demanding type of subsystem. For example, it could be face recognition or real-time financial trading, where you have very complicated algorithms or technology where you do need a team to be owning that subsystem, even though that’s not directly used by the customers and is maybe part of another stream. But it would be too much to ask of a stream-aligned team, with everything we talked about already, to tell them, also you have to own this face recognition module in your service.

Manuel Pais: Because these things tend to require a PhD-level type of understanding. So it’s something we have to rely on because of the cognitive load. But if we can avoid them, it’s better. So we’re not saying create a complicated subsystem around DevOps or create a complicated subsystem around Kubernetes.

Manuel Pais: No, those probably should be aligned to platform-type teams, probably the ones that are going to use that technology. So a complicated subsystem team is a team that builds and runs a subsystem that requires a very niche sort of knowledge where you can’t really find people easily that know that.

Manuel Pais: So those are the four team types.

Sven Johann: Wonderful. So now we finally have a vocabulary. You know, I found it quite helpful that, when we talk about teams, we really have this vocabulary for who is doing what. But nevertheless, for example, I’m working quite often in an enabling team. And the problem of course with enabling teams is, at least in my experience, how do you avoid becoming an ivory tower?

Sven Johann: That’s one thing, how do I avoid being an ivory tower? And the second thing is being really helpful. You can often help a little bit, you can have a teaching session, but maybe a workshop is not enough. You really have to work closely with the team.

Sven Johann: Those are the two kinds of struggles I see. So how do I avoid those?

Manuel Pais: Those are good points. There’s also another one, if I may add, where we don’t want the enabling team to become a dependency in terms of the execution of the work of the stream-aligned teams, right? So we don’t want an enabling team where, well … whenever we need new infrastructure for our application, we need to ask the enabling team. That’s not the kind of enabling work we’re talking about. So yes, you need to find what is the good way.

Manuel Pais: So in terms of how much we need to help the teams, actually what’s better is to, well, as we said in the book, orbit around the teams. So that means we’re not expecting an enabling team to … well, to make it more concrete, let’s say it’s an enabling team around test automation. Let’s say you have a stream-aligned team that really doesn’t have experience with test automation.

Manuel Pais: And the enabling team is going to help them. You don’t expect that this team is going to learn everything about test automation in a few weeks. We can try to teach everything as an enabling team. Probably they will remember 10% and be able to apply 10, 20%. That’s absolutely normal. So that’s not very effective.

Manuel Pais: It’s much more effective, and that’s where the enabling teams become powerful, to meet the team where they are now. So if we know this team has very little knowledge, let’s start with the basics. How, you know, what are some test automation tools? Why should you work with maybe higher-level scenarios, not with very detailed actions in the user interface? This kind of thing, the basics of what makes a good automated test, etc. Let’s not try to teach them everything. Maybe let’s have a couple of weeks. We help them. We teach, we maybe pair on some examples and then we come back maybe in a couple of months and see where they are.

Manuel Pais: Have you been able to apply these ideas to your everyday work and how is it going? And then at some point they’ll get into other challenges maybe because now they have too many tests and they take too long to run in the pipeline. And so we need to think about making tests more independent, more performant, etc.

Manuel Pais: So the enabling teams, especially if it’s a domain where teams need quite a lot of help, should try to help piece by piece: understand that this team is here now, they need this help. And in three months, they need help with the next step. And that’s also one of the main differences from what people ask us about: is this the center of excellence type of approach, the ivory tower type of approach?

Manuel Pais: So an enabling team is on the ground, right? It’s understanding what does this team need? Where are they now? How can we help them? It’s not from a sort of ivory tower that you were talking about of saying, well, these are the good practices that everyone should follow. And you should have this type of architecture and you should all be doing infrastructure as code.

Manuel Pais: I’m not saying that’s not helpful, but teams are already overloaded. How can they be expected to just take guidelines and then learn how this changes their work and so on? So it’s very much a hands-on approach without becoming a dependency. So we’re teaching together with the other team and we’re helping the other team, but also we’re not doing the work for them. We’re not doing the delivery work for them. We’re helping them learn and improve their knowledge. But I agree, it’s not easy to find that right balance.

Sven Johann: One experience I had, which was really good, is to really talk to them, as you said, right? So not come up with, you know, here is something, throw it over the wall, but really try to understand the problem level of each team.

Sven Johann: Because if you just say here is something, …

Manuel Pais: I have three small kids and I like this analogy. The oldest is six, going on seven, and the analogy is we’re trying to help him with his homework. If he has a problem, how much is seven plus six? Of course I can tell him it’s 13 and write it down, but that’s not going to help him the next time he has homework.

Manuel Pais: It’s not easy because it’s so obvious for us, let’s say experts, because we’ve been through it. But to understand what is the thought process in his mind, what is difficult for him to do that calculation? And so you have to start digging, do you know how to add these two numbers?

Manuel Pais: How do you do it in your mind? You have to understand where the other person or the other team is. And that’s what ivory towers usually don’t do. They just expect everyone, everything, to be the same. They just need to adopt these practices. And in reality it’s not so easy.

Sven Johann: The other thing I found, let’s say, interesting was if there are a lot of enabling teams, because, as you said, we are overloaded. If you’re a stream-aligned two-pizza team, you have a lot of things on your plate. For example, I once helped a team with performance and capacity testing and they were really far along: automated performance testing on a nightly basis and stuff like that. And then it turned out that in a lot of other areas, for example normal test automation, like integration tests or something like that, or security, they basically were at zero. They were not so much interested in it and they just blocked the other enabling teams.

Sven Johann: So basically we need an overall understanding of what the team really needs. Does it need a lot of security? Does it need a lot of this? Does it need a lot of that? Because at some point an enabling team also needs to stop and say, okay, we could be way better. There are still other areas where you need to improve.

Manuel Pais: Yes. On the one hand it’s kind of natural that the team relies more on the things they know. So in that example, we know about how to do performance testing. So we focus more on that. And so that’s why it’s, first of all, important to understand the cognitive load aspects, and the cognitive load might also mean we’re looking at areas where we’re not doing things because we don’t have the skills.

Manuel Pais: We’re not doing integration testing, because we don’t have the knowledge, or something else. So we want to improve across those different domains. I would say there’s also the other extreme. I’m remembering a thread from Twitter yesterday about some organizations that are taking “cross-functional teams” literally to mean, well, now we need every team to have one UX expert, one test automation expert, maybe one performance tester.

Manuel Pais: And they’re ending up with teams of 15 people, and that’s not the point. So to your point, what we need is to first understand that teams are different. Some teams will have gaps in testing. Others will have gaps in architecture, etc. So if we try to treat teams all the same, that’s not going to work very well.

Manuel Pais: We need to understand that different teams have different gaps, and then try to help them, step by step. We’re not saying they’re all going to become experts at everything, but you need to identify the gaps. Like in your example, this team has a gap in integration testing. Is this a gap that exists across several teams?

Manuel Pais: Maybe it makes sense to have an enabling team, or if it’s just this one individual team, then how can we address it? Maybe we can think about hiring someone that will help the team go faster, or maybe we just do some facilitating work with someone else from another stream-aligned team where they’ve used good integration testing approaches.

Manuel Pais: So there are different ways we can think about reducing the gaps. In fact, in my experience so far since the book was published, enabling teams are one of the types of teams that generate the most doubts. And some organizations don’t have clarity on why they should invest in an enabling team.

Manuel Pais: And it’s fine not creating an enabling team to start with. So in the book, we also talk about interaction modes. So you can have people who maybe are in other stream-aligned teams help another team by doing this facilitating, by spending maybe two weeks or one day per week for a couple of months, whatever makes sense, to help the other team learn about this domain.

Manuel Pais: And that’s what we call facilitating. Maybe sometimes platform teams do some enabling as well, because they are experts in, for example, infrastructure automation. So they can help these stream-aligned teams not just use the platform, but also understand what are the good practices around infrastructure automation, for example.

Manuel Pais: So it’s really about having the mindset of we need to address gaps in teams. We need to find a way to upskill that is not just relying on people to, I don’t know, learn in their free time. That’s not scalable, and then find out what makes sense for different gaps. Is it we need an enabling team, or we just need some help, some facilitation, from another stream-aligned team, or we need help from a platform team, etc.

Manuel Pais: So there are different ways we can go about it. And if there is a lack of certainty about whether we need to create an enabling team, it doesn’t necessarily need to be a dedicated team; it can just be some facilitation happening to start with.

Sven Johann: You mentioned platforms and platform teams. So it seems each and every company has a platform team, but it’s not always clear what this platform team is doing. Is it just providing raw infrastructure and creating accounts and stuff like that, and maintaining Kubernetes or something like that? So what does a platform team do? What’s the purpose?

Manuel Pais: We often see those platform teams … obviously “platform” is an overloaded term. But if we’re talking about having teams that are more self-sufficient, stream-aligned teams that can deliver more independently, then we don’t want dependencies during the execution of the work.

Manuel Pais: You don’t want them to depend on another team to provision infrastructure for them, or to open a firewall, a port on a firewall, or to do this kind of thing, this type of request-based work. So the platform team in Team Topologies is not the team that’s responding to requests or doing things on behalf of other teams or maintaining the operational side of the services.

Manuel Pais: What they’re doing is treating the platform as a product. So this idea of we’re providing a product in the form of platform services to our internal teams, who are the customers. And so they should be able to use this product to make their life easier, just like we use any kind of product to help make our life easier, without depending directly on the platform team in terms of the execution work. Because then we’re just having more dependencies, and then we have all the problems where the platform team that’s responding to requests has to prioritize, and you have all this conflict happening where teams are waiting for other teams.

Manuel Pais: So that’s what we want to move away from if we want fast flow. And so the platform-as-a-product idea, which again is not new, has been around for quite a long time. People from Pivotal Labs have talked about it and Red Hat, etc. And so it’s a platform provided in a way that has a very good user experience, a very easy experience, and addresses the needs of the internal teams, but it’s not a blocking dependency.

Manuel Pais: It’s not that we’re waiting on the platform team to do things for us. And so we go back also to the enabling teams maybe. Before a team can use the platform, they need to understand more about infrastructure. They need to understand more about deployments, but the platform service also helps them do that, with less cognitive load.

Sven Johann: If you say it should be a product, when I read it, I was like, ah, yeah. But this is really hard. I believe you have a product, you have internal customers, you need a product manager. You need probably service support and everything. So you need to think about all that you have to do.

Sven Johann: I think Nicki once talked about how this is a community effort. Again, no ivory tower. I’m not just thinking about what could be of use. I really have a product manager, an internal one, talking to my internal customers, coming up with a roadmap or something like that.

Sven Johann: And validating the roadmap with all the internal customers, implementing something. So, yeah, I think it’s really hard, but it’s probably the only thing that will work.

Manuel Pais: By the way, Nicki Watt has a great … the talk that you mentioned about platforms focusing on the community that they serve. And actually when I was still working as DevOps lead at InfoQ, one of the last articles that I sort of curated was from her, based on the talk. So I recommend that, and we can add it to the show notes. So yes, one of the main difficulties for this type of platform team is the lack of a product management approach.

Manuel Pais: It’s usually not very easy to find product people who understand the technical details of this type of platform, but it’s critical. So we often tell customers, you know, don’t put a junior product manager on the platform. You probably need your best, most experienced product managers on the platform, because they will be critical to making it a success.

Manuel Pais: And so obviously in the book we couldn’t expand too much, but now we have, we’re just about to release a video-based training called “Platform as a Product” on our Team Topologies Academy, which is possibly the first of maybe other courses around platform teams.

Manuel Pais: But the first thing is applying product thinking to the platform, right? So you need to understand there are different types of customers inside your organization. You need to understand the adoption life cycle of your product. So not everyone’s going to jump on the first version of your service in the platform. You need to understand what’s going to create friction for other teams when they adopt it.

Manuel Pais: So it’s like you said, you need strong product management to make it work, but you also need to be careful not to go too far. I’ve also seen platform teams that have this huge roadmap, and then a stream-aligned team comes to them and says, well, we need help with this.

Manuel Pais: And they say, well, we’re busy for the next six months. Then you’re not really focusing on the customers, are you? So we need to be careful. There are some smells like that. If you can’t have a conversation with stream-aligned teams, because you’re too busy as a platform team, then you’re doing something wrong.

Manuel Pais: If your roadmap is for six months or a year, and there’s no flexibility to change course, then there’s something wrong as well.

Sven Johann: We’ll put all the links in the show notes, also for your courses. And I think I will visit that one because I have so many questions.

Sven Johann: Maybe one last question on the platform. So I always think a platform is something like it’s X-as-a-service, I have self-service. Is that true? That a platform is always self-service?

Manuel Pais: Yes and no. That’s kind of the target, the way of working of the platform, right? So when you get to the platform service, let’s say per service, to a level where it’s easy to use, it’s easy to understand for people who are new to the service, you have adequate documentation. Basically the things you expect from any AWS service, right, at least the ones that are not in that anymore, that are available on Google Cloud or any of these big providers.

Manuel Pais: Your expectation is to consume it in this X-as-a-service way. I will be able to self-serve, I will be able to understand how it works, to look at the examples, and to use it without actually talking to an AWS engineer. So that’s the target, but then there’s also another interaction that needs to happen, which is collaboration between platform and stream-aligned teams. Because when there are new needs that arise and the stream-aligned team says, or maybe the enabling team tells the platform team, there’s a gap here that we think the platform could help with, with some functionality or some API, the platform team needs to collaborate with their customers, who are the stream-aligned teams, very much in an agile way.

Manuel Pais: And get fast feedback and maybe do a prototype before they just go and take a requirement and spend three months, and then come back and “is this what you need?” “Oh, no, actually, we misunderstood each other.” So we don’t want that to happen in a platform. You want the platform teams actually working in a similar way towards their customers as stream-aligned teams.

Manuel Pais: So the platform actually ends up being a group of teams. Especially if the platform grows, you tend to have different services or streams inside the platform itself. It’s almost like a Russian doll. So you also have stream-aligned teams inside the platform, where we might have a team that’s aligned to the continuous delivery service and another team aligned to the provisioning service, or whatever it might be. It can also be more data-level or business-level services.

Sven Johann: Observability, things like that.

Manuel Pais: But those are the two kinds of typical interaction modes for platform teams. Collaboration: when we need to make changes to a service or create a new service or add a new feature because teams need that. And so we need to collaborate to do it iteratively, to understand when is this really ready to be consumed by other teams in that X-as-a-service way.

Manuel Pais: And it’s a constant alternating between these two, right? You might have a service that’s pretty stable, easy to use, but now we have a new feature. So now we go into collaboration mode again, and then at least for that feature, we need to understand that there’s this cycle, this feedback that we need to get until it’s usable again.

Manuel Pais: And then we say, okay, it’s X-as-a-service. And there are ways to kind of self-monitor, if you like, how much X-as-a-service it is. If you say this feature or this service is generally available internally, and then you start getting a lot of support requests and people don’t understand how to use it, or they get a lot of errors, then, well, we probably were too early to make it X-as-a-service. We need to go back and understand better what’s going on.

Sven Johann: It’s also very hard to come up with X-as-a-service all the time. You know, when I think about platform, you know, we are using Kubernetes, but now someone wants to use AWS Lambda. And if they say, oh, we want to use it and say, okay, I’ll build you something, then it takes forever until you can use it.

Sven Johann: So probably it’s easier to do it step-wise, to say, okay, we really speed you up. We help you understand everything, and then during the collaboration we try to find out if there is a need to automate something, to offer something as a service, and if not, then not.

Sven Johann: And then, if you have a service, you come back to it, yeah, as you say, if there are too many support tickets …

Manuel Pais: The platform teams need to firstly listen to stream-aligned teams and they need to prioritize obviously, but prioritization is based on the needs. It’s not based on, oh, everyone’s using Kubernetes, now we need to use it too. Is there an actual need?

Manuel Pais: And if there is a stream-aligned team that comes with that request, the thing is first we listen, we try to understand why do they need it? But it needs to be prioritized again. We need very good product management to be able to do that. You know, if the platform is successful, you will get many, many requests from many teams.

Manuel Pais: And so sometimes another pattern might be that the stream-aligned team that thinks they need this different tool or this different approach, they should have the option to do it by themselves, where they are incurring the cost because they believe it’s going to bring value. Okay. Then we should not stop that.

Manuel Pais: We should not say, oh no, that’s something that only the platform team can do because it’s related to infrastructure. No, you can have two stream-aligned teams that are sort of pioneering if you like some new approaches. And that makes total sense. It doesn’t make sense that the platform team is going to try out a lot of different new tools just because people in other teams heard about it. That’s not the point of the platform team. They should be helping with specific needs based on the work that teams are doing, today. But yes, it’s an exercise of treating the platform as a product, prioritizing and listening to customers, but also not stopping stream-aligned teams from taking the initiative and pioneering on things that they are the first ones to have that need or that request.

Manuel Pais: I had one example, a client where they have classified ads across different industries. And so they’ve mostly organized the teams across different industries. And one of the teams needed to, it was the first one that wanted to have videos in the ads. And so you would expect, okay, usually this might make sense as part of a platform to provide support for video processing. But it doesn’t make sense now to ask a platform team to do this for one team. So that stream-aligned team should take the initiative and they do this for their own needs and then later maybe other teams will need it.

Manuel Pais: And then we’ll see how we sort of commoditize it, if you like, by having it in the platform so many other teams can use it. So at the end of the day, it’s finding better ways to achieve and do the things we need to do, without that sort of very strict boundary of saying it’s only the platform team that can do this.

Sven Johann: So, we are almost at an end. You already mentioned the interaction modes and you basically explained them, but maybe it’s a good idea, to wrap it up a little bit, what interaction modes exist and what should they be used for?

Manuel Pais: So there are three. So we’ve talked about them. The first is collaboration, but in the sense of being much more well-defined, so it’s not just open-ended collaboration. So sometimes teams say, well, yeah, we collaborate with that other team. But what they really mean is they have a relationship with that other team. Sometimes they work together. We’re defining collaboration as two teams working together for a specific period of time with a specific goal. Maybe it’s we need to find a way to automate the deployment so we don’t depend on the other team to do deployments for us. But it can be anything. So it’s collaboration to solve a specific problem.

Manuel Pais: Then we have facilitating, which I mentioned is typical for enabling teams, but it could also happen between stream-aligned teams. It could happen between stream-aligned teams and platform teams. Wherever you have one team that has expertise in some domain where another team needs help. So we can do facilitation, for a defined period of time, which can change. But we set the expectations that this is not an open-ended interaction. We expect this to take two weeks so that you teach us the basics about test automation, for example. So again, we facilitate, one team teaches the other, but the teaching team should also be open to learning. Because there are, I think, no absolute experts. Even if we’ve spent a lot of years in some domain, there is always something, or some problem, that people can tell us about that we hadn’t thought of. But basically facilitating is that.

Manuel Pais: And then we have X-as-a-service, what we were just talking about, typical for the platform where you provide a service that other teams consume. You don’t actually need the teams to interact because the services have good enough quality, reliability, usability, so that other teams can use it independently.

Manuel Pais: So those are the three ones.

Sven Johann: So now one last short question and to really wrap it up. If I’m a small organization, like a startup or something, what can I adopt? You know, a startup probably doesn’t have a platform team, doesn’t have the money for enabling teams. What can I do if I work in a small organization?

Manuel Pais: That’s a good question that people often ask, from what size do I need to worry about team topologies, from what size of the organization. You can look at the ideas in team topologies at any size, right?

Manuel Pais: The difference might be in the topologies that you find. So if we only have two teams, then we’re definitely not going to be able to have enabling, platform, and stream-aligned teams. That’s already three. But we can think about the idea. We can think of enabling and facilitating between, you know, if we have two teams, maybe one has been around longer, has more experience around the product, and they know how do we change the product, what has worked well? Why were some decisions made on the architecture? Probably they will be in a good place to facilitate knowledge to a more junior team, maybe that you’ve created to address new customers or whatever it might be. So the idea of enabling can happen even without a dedicated enabling team, and the same for platform.

Manuel Pais: Like we said earlier, a platform could even just be a wiki page where you’re documenting, you know, these are some good approaches to use these services, and this has helped us accelerate and not have every team think about, oh, how do I set up my infrastructure? Well, oh, they have here some recommendations, use Lambda for this type of job. Use other services for other types of work.

Manuel Pais: So the ideas can take place. And then finally, the other thing that is quite relevant, I think, for startups and scale-ups, is to think about the trust boundaries that we talk about in the book and a bit more in our academy training, where the work of Robin Dunbar is quite interesting. People might be familiar with Dunbar’s number, which is usually quoted as 150 but is actually between 100 and 200 people. It’s the number of people that we can keep meaningful interactions with as an individual. So that we know who these people are and what they do, and how we relate to them. And that’s also useful in the workplace, but there are other trust boundaries at smaller scales.

Manuel Pais: So when you go beyond 30, 40, 50 people, which tends to be when startups start to scale up, the dynamics change. You start having more teams, you start having less shared knowledge across all teams because the work increases. And so what worked in the beginning with 15, 20, 30 people might not work as well with 40, 50, 60.

Manuel Pais: And so understanding the trust boundaries can be quite important because that means we need to look at different ways of enabling. Do we now need a platform team? You know, we need to look into how things change when we cross different trust boundaries. And other things like Conway’s law that we talk about in Team Topologies, basically all of that is meaningful. Maybe the way that you implement it is different.

Sven Johann: All right. Thank you very much. So obviously we only had one hour, but so many questions. Where can our listeners find more answers, more information? What’s your recommendation?

Manuel Pais: We have our main website, teamtopologies.com, where you can find key concepts, free resources. We have a number of public talks and articles and things that people can use. We now have infographics as well, which I think are pretty cool. And they’re a good way to share the main ideas with other people who maybe are not as familiar with Team Topologies.

Manuel Pais: And then we have started a Team Topologies Academy. So this is video-based training. So we have, for example, a three-hour self-paced course on the kind of key ideas of team topologies, and we’re about to publish that other course I mentioned on platform as a product, which is obviously specific for everyone working with internal platforms. And there’s more to come, I’m quite excited about it because we also want to bring in other people who have expertise, for example, with team topologies and Wardley Mapping or team topologies and domain-driven design. Because there’s a lot of overlap, team topologies provides the team design and organizational aspects and those other approaches provide other things that combine quite well.

Manuel Pais: So I would say those two, teamtopologies.com and academy.teamtopologies.com. And we have those GitHub repositories that are freely accessible, under Creative Commons, which you can find on github.com/teamtopologies. And of course we’re on social media. If you look for Team Topologies, you’ll find us on Twitter and LinkedIn.

Sven Johann: Awesome. Very nice. So thank you very much Manuel, and thank you to all of our listeners.