Artur Ortega on GraphQL

Transcript

Stefan Tilkov: Welcome, listeners, to a new episode of the CaSE Podcast, another conversation about software engineering. My name is Stefan Tilkov, and today my guest is Artur Ortega. Artur, I am very pleased to have you on the show. Welcome!

Artur Ortega: Thank you so much for the invite, Stefan.

Stefan Tilkov: Our topic today is gonna be a lot of fun, I'm very sure of that, because we will be talking about REST versus GraphQL. If you've ever heard me speak about REST, then you know I'm a big fan of this. So in the same spirit as with the episode with Markus Voelter, this is going to be one of the controversial ones. I actually like them a lot, because they give me a chance to learn a lot; and Artur actually liked the one with Markus, so that's why he volunteered and tried to pick a topic that he was sure would annoy me. Is that a fair summary, Artur?

Artur Ortega: Yes, it's a very fair summary.

Stefan Tilkov: Okay, so please do start by telling us a little bit about who you are and what you do.

Artur Ortega: Yes. My name is Artur Ortega. I’ve lived in London since 2007, and I'm a software architect. At the moment, I'm working for a digital health startup, where I head the platform architecture.

Stefan Tilkov: Okay. And one of the topics you're dealing with right now is GraphQL... And you said that what we should start with is a REST intro, and then weirdly enough, you asked me to do it. So this is a little strange for our format... I will be doing stuff first; I will be doing a brief REST intro, and then we will continue by the GraphQL criticism, or maybe a brief GraphQL intro first, and then address some of the criticism that GraphQL folks have for the REST paradigm.

Artur Ortega: That’s alright.

Stefan Tilkov: Okay. So I will try to be very brief, and that is really hard for me. Probably by now everybody has heard something about REST and has some idea what that is... So I will try to summarize this very briefly. REST stands for Representational State Transfer; not that that matters in any way... And it's the title of a Ph.D. thesis written 20 years ago by Roy Fielding, who's one of the people who was very influential in standardizing the web, mostly the HTTP protocol, but also quite a few other standards, RFCs, around core web architecture protocols and formats.

Stefan Tilkov: The REST architectural style is called an architectural style because it is sort of a little bit of scientific approach to classify a certain kind of system, namely the web. So what REST tries to do is it tries to summarize a few key decisions in the design of the architecture of the web, and then explain what kinds of benefits you might derive if you apply those decisions, if you stick to those restrictions, and then it gives that thing a name. So it starts by talking a bit about different styles that you might want to use, something like pipes and filters, or client-server, and it derives this particular style that makes up the web.

Stefan Tilkov: So REST is not, as many think, a web API specification, or a style to compete with SOAP and WSDL web services. It essentially is the architectural style underlying the architecture of the web. And with that out of the way, the core idea here is that there are things that have an identification on the web - that's a URI, or a URL, if you prefer. So there are identifiable things, you interact with those things, and the way you interact with those things is by way of hypermedia. Hypermedia as H in HTTP, or hypertext, in that case.

Stefan Tilkov: The hypermedia idea is essentially that whatever you do is based on what you got from your communication partner before. So if you consider a standard, classical web application, what happens is you retrieve something from some URI, a representation of that resource identified by that URI, and that might include a few links to other things. If you follow a link, you retrieve another representation... And then that might include maybe a form, and then you add some input into one of those forms, and then you submit the form and get back a new representation... And so on and so on. So you navigate through the state of your system by following links and using other hypermedia affordances.
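The navigation described here can be sketched in a few lines. This is a minimal illustration, not a real HTTP client: the URIs, the link relation names ("appointments", "first", "appointment") and the "cancel" form are all hypothetical, and the "server" is just an in-memory dictionary. The point is that the client hard-codes only the entry point and the relation names, and takes every other URI from the previous representation.

```python
# Hypothetical in-memory "server": each URI maps to a representation
# that may carry links and forms (the hypermedia affordances).
RESOURCES = {
    "/": {"title": "Home", "links": {"appointments": "/appointments"}},
    "/appointments": {"title": "Appointments", "links": {"first": "/appointments/1"}},
    "/appointments/1": {
        "title": "Appointment 1",
        "status": "booked",
        "forms": {"cancel": {"method": "POST", "target": "/appointments/1/cancellation"}},
    },
}


def get(uri):
    """Retrieve the representation of the resource identified by a URI."""
    return RESOURCES[uri]


def follow(representation, rel):
    """Follow a link found in a representation we already received."""
    return get(representation["links"][rel])


def submit(form, data):
    """Submit a form; this toy server records the change and answers with a new representation."""
    if form["target"] == "/appointments/1/cancellation":
        RESOURCES["/appointments/1"]["status"] = "cancelled"
        return {
            "title": "Cancellation confirmed",
            "reason": data["reason"],
            "links": {"appointment": "/appointments/1"},
        }
    raise KeyError(form["target"])


# The client knows the entry point and the relation names, nothing else.
home = get("/")
listing = follow(home, "appointments")
appointment = follow(listing, "first")
result = submit(appointment["forms"]["cancel"], {"reason": "ill"})
after = follow(result, "appointment")
```

Because the client never builds a URI itself, the server in this sketch could move the cancellation target elsewhere without the client changing.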

Stefan Tilkov: That core idea is the same whether you are using a browser as a human being, or whether you're writing an API client that does that by using the same mechanisms automatically... And that in turn gives you some benefits. That's the core idea – if you do that, you get a very decoupled system, or a very loosely-coupled system, and that is evidenced by the fact that we use it every day to navigate sites, jump from Twitter to some other site, go to Facebook, and interact with lots of different applications, written by different people all over the world. So it scales pretty well, and it fulfills that promise of being decoupled. That is my attempt at a brief summary of what REST is about.

Artur Ortega: It sounds really good.

Stefan Tilkov: Is that brief enough?

Artur Ortega: Yes, it's really good, and it gives me a good segue to GraphQL.

Stefan Tilkov: Oh, perfect. So please go ahead and explain what GraphQL is then.

Artur Ortega: Okay. GraphQL was defined a few years ago - probably five years ago - by Facebook as a blueprint, and was later evolved and implemented by several other teams in different languages and so on. I think the most famous one at the moment is Apollo. But GraphQL itself basically exists now as a standard.

Artur Ortega: The idea is to look at REST and to say "Okay, is there a better way of navigating from one entity to the other, like we just described, by following hypermedia references?" And at the beginning, people were using it a little bit like a smart proxy, because in the end, people wanted to have not just one REST endpoint; they were usually interested in the data sitting behind the hypermedia references, and usually they were looking into optimizing the data fetching when using the API. So they didn't want to overfetch data, meaning fields that they were not interested in, and they didn't want to issue additional HTTP requests to fetch additional data that was sitting behind hypermedia references.

Artur Ortega: This standard comes with a strongly-typed schema definition, very data-centric, and takes this navigation from REST to the next level by specifying a data model and specifying the references to other objects or to itself, so that you would be able to explore the schema and navigate the schema... But the interesting part with the navigation is that you are navigating by cherry-picking nested data.

Artur Ortega: And that’s the biggest difference to REST... So more data-centric, a clear schema, which in REST probably you would defer to Swagger or other tools, or JSON schema, to define your REST, but is not as standardized, GraphQL standardizes it. It is a very strongly-typed way of specifying your data and the references between them, and it optimizes the query of the data. That's in essence basically GraphQL.

Stefan Tilkov: Okay. Maybe we can try to separate some issues here. One is the criticism that I often read when I see people introducing GraphQL. You actually didn't do that very much, which I sort of suspected, that you wouldn't be doing that... But many people who introduce GraphQL use the first two pages of their article to bash REST. So there's this implied criticism that something is wrong with REST, and bad about REST, and we can address that as one topic.

Stefan Tilkov: Then there's another thing, which is simply a description of what GraphQL actually is, and I think we should make a clear separation there. Because while it's fun to disagree about the criticism of REST or not, it's also interesting to simply know what GraphQL is, independent of whether you think the criticism of REST is justified or not. So maybe we can separate the two.

Stefan Tilkov: So let's spend a bit on the fun part before we dive into the actual technical description, and then maybe later on we can revisit the criticism again, in sort of a more informed way.

Artur Ortega: Let's talk about the pain points that you would have with REST, okay?

Stefan Tilkov: Yes, please.

Artur Ortega: One that I touched on briefly in the definition is overfetching. So what do you do if you have a mobile device, low bandwidth, and you want to interact with REST? The provider of the REST API usually doesn't have only one customer or one use case, and needs to find the common denominator, or the superset of the fields that someone needs. And this means that you have to provide basically the full data to the client, or create bespoke Backend for Frontend (BFF) endpoints for particular customers.

Stefan Tilkov: I disagree with that assessment. That's one of the things that drives me crazy about this discussion... And that is that I think many people who criticize REST are not really criticizing REST, what they're criticizing are bad HTTP APIs. And I have no problem accepting that GraphQL is a vast improvement over bad HTTP APIs; I just am reluctant to call them REST APIs, because that's not what they are.

Stefan Tilkov: Let me try this from a historical perspective. When this thing started out, initially people had zero respect for HTTP at all. First of all, they decided not to use it at all. Then they decided "Okay, this thing seems to be taking off, and people seem to want to use it, so let's use it in some way", and then things like SOAP got invented, which essentially tunneled everything through HTTP POST. Or these horrible HTTP query APIs got invented, which, even worse, tunneled everything through HTTP GET. And that was not only no respect for HTTP, that was basically disrespect of HTTP. It was actively ignoring everything about the thing. And we've moved quite a bit from that stage to a stage where GraphQL people, as well as REST people, as well as whatever you call the HTTP/JSON people, all acknowledge that HTTP has some benefits, and some of those benefits include caching, and the meaning of the HTTP verbs, and headers, and stuff.

Stefan Tilkov: So most people have moved to at least something like a CRUD HTTP API, where GET, PUT, POST, DELETE are used roughly mapped to the create, read, update, delete verbs known in database programming... And they've created APIs that do a little better in respecting HTTP. But they're not REST APIs, because a REST API is not something that you use to write an application. If it's supposed to be a REST API, it actually is an application; it's supposed to sit on a different layer.

Stefan Tilkov: If you have an HTTP API, and if you expose an HTTP API to the outside world that in a visualization of layers is too low, too close to the database, then you will be required to write a client that has a lot of business logic. And HTTP – or let me rephrase... REST APIs are really not about that. A REST API is supposed to encapsulate a lot of business logic. Say you have an approval process or something, then a REST API is excellent to expose that approval process to the outside world. It's not a good candidate to expose a low-level API that somebody else can use to write an approval process. It just sits at the wrong layer.

Stefan Tilkov: So if you compare GraphQL to a low-level HTTP API, what I would consider a bad REST API, then that criticism holds. That's essentially the short version of what I just ranted about. If you write a good REST API, then it will have the information needed to go to the next step, regardless of what kind of client it is that you're using there. Because essentially, they all have the same requirements in terms of the business logic on server-side they want to reuse. Does that make sense?

Artur Ortega: It makes sense, but I think GraphQL goes a little bit -- well, it splits the work up a little bit. So you don't have only queries in GraphQL, you have mutations as well...

Stefan Tilkov: Mm-hm. We'll get to that...

Artur Ortega: So basically, the approval process you described would go through a mutation, and there's a standardization there as well on if you provide the ID back, you can provide the client with all the fields that have been updated through your approval process. So you have a strongly consistent way of returning the data that has changed on that object that you basically changed the status...

Stefan Tilkov: Yes, I can understand this.

Artur Ortega: And you would get basically back on the client. So it's a slightly different – it wouldn't be the query path. The query path would be more with like navigating the data and querying the data itself.

Stefan Tilkov: Okay, I think we'll get into more detail about that... And that is a valid thing, because what you're saying now is that you could essentially achieve the same – you could put the API at the same level, at the same layer, with both... But you would have the benefit of standardization with the GraphQL approach. We can discuss that. But that is a different one than saying "The REST API will be too low-level and too chatty." That always drives me crazy; that is the standard argument that I always read from people trying to sell me GraphQL. They want to sell me GraphQL by telling me "REST sucks", and then I look at the way they use REST, and I agree it sucks, because what you're doing there is not REST. If you used it at a different level you wouldn't have the problem. That's maybe just a way of selling it. Maybe it works, so that's perfectly fine if it works for selling GraphQL to people who have had a bad experience with a RESTful approach. Okay. It doesn't work for me, because I don't have that bad experience with those things.

Artur Ortega: Sure.

Stefan Tilkov: And maybe it sort of alienates me for no good reason. It makes me criticize GraphQL simply because of the way it's sold to me, as opposed to its technical benefits.

Artur Ortega: Let's take the advantages you were explaining about hypermedia references. So in my case, for example, I work in digital health, so there would be for example an appointment. The appointment would have a hypermedia reference to the patient, a hypermedia reference to the location where the appointment is, a hypermedia reference for the practitioner who is taking care of the consultation.

Artur Ortega: If you go for REST, you would basically fetch the appointment and see when the appointment is, when it starts, when it ends, and the moment you would like to know who is taking part in the appointment and where the appointment is, you would need to follow these hypermedia references. For the example that you gave, it's like the web - you would follow them if you are interested in that.

Stefan Tilkov: Sorry for interrupting you. I'm not sure I agree … But please finish first, and then I'll see whether I disagree or not.

Artur Ortega: Okay. So the idea would be now to say "Okay, we are defining these entities, like an appointment, a patient, a practitioner, a location as their own separate resources or entities, and we live with these references." And what you would basically do when you are navigating it with GraphQL, you would get the definition for your appointment, but while following the hypermedia reference, you would get as well the definition for the nested entities, like a patient, practitioner and location, and you would be able to cherry-pick some of the values that you need there. The interesting part is then you get all the fields that you need back in one go, without looking at the hypermedia references themselves. They're basically transparent at that moment.

Artur Ortega: So the moment you go to the appointment and say "Give me the appointment with the patient name, consultant name and the location name," you would get these three additional values back on your appointment object, without the need to look at the hypermedia references themselves. They are just like pointers in C. You are not interested in the pointer, you're just interested in the values they are referencing. This is basically a big difference to REST, I think.
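The cherry-picking of nested data described here can be illustrated with a small sketch. It is not a GraphQL implementation: the stores, field names and the `select` helper are assumptions made for the example, and a nested dictionary stands in for the query's selection set.

```python
# Hypothetical data stores; in a real system these could be separate services.
APPOINTMENTS = {
    "a1": {"id": "a1", "start": "2020-03-02T09:00", "end": "2020-03-02T09:30",
           "patient": "p1", "practitioner": "d1", "location": "l1"},
}
PATIENTS = {"p1": {"id": "p1", "name": "Ada Example", "dateOfBirth": "1980-01-01"}}
PRACTITIONERS = {"d1": {"id": "d1", "name": "Dr. Sample", "speciality": "GP"}}
LOCATIONS = {"l1": {"id": "l1", "name": "Clinic North", "address": "1 Test Street"}}

# Which store each reference field points into.
REFERENCES = {"patient": PATIENTS, "practitioner": PRACTITIONERS, "location": LOCATIONS}


def select(entity, selection):
    """Return only the requested fields; a nested dict means 'follow this reference'."""
    result = {}
    for field, sub_selection in selection.items():
        value = entity[field]
        if isinstance(sub_selection, dict):
            # The stored value is an ID (the "pointer"); resolve it and recurse.
            result[field] = select(REFERENCES[field][value], sub_selection)
        else:
            result[field] = value
    return result


# Roughly the shape of this hypothetical query:
# { appointment(id: "a1") { start patient { name } practitioner { name } location { name } } }
selection = {
    "start": True,
    "patient": {"name": True},
    "practitioner": {"name": True},
    "location": {"name": True},
}
response = select(APPOINTMENTS["a1"], selection)
```

The response contains the three names but none of the IDs, dates of birth or addresses, which is the "pointer" analogy: the client sees the referenced values, not the references.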

Stefan Tilkov: I mostly disagree with that assessment as well. One of the things is that you sort of implicitly introduced a -- you could call it a normalization step there. You said "Well, in a REST world you would have the appointment, and then you point to the doctor, the patient, the medication etc." sort of implying that if I were to design this system in a RESTful way, I would be decomposing things on this fine-grained level. But of course, I have absolutely no incentive to do that if that's not what the use cases ask for.

Stefan Tilkov: If you go to a website -- let's say you buy stuff at Amazon.com, or Amazon.co.uk. Then you're not presented with this kind of interface. You're not presented with something that only has some information about one entity, and every time you want some additional information, you have to navigate to another one. And you don't have to follow a hundred links to assemble the information you need to make a purchasing decision. That's not the way a website works. The way a website works is, it presents you with everything you need to take the next step, whatever that is - maybe click a link, or submit some data. And a good REST API would follow the same principle. It would give you whatever you need now, and it would possibly give you ways to drill down into more detail, but only if you need it.

Stefan Tilkov: Let's say your use case is, in your domain, making an appointment, or maybe canceling an appointment. To cancel an appointment, I need to know whether I've found the right appointment, and maybe I need an option to not cancel, but reschedule it; or maybe I can ask whether it's still possible to cancel it, or I can get some information about what the cost would be if I canceled it now. That information would all be presented to me, along with the next possible steps that I can do, like cancel it with a reason, or cancel the cancelation, or whatever it is.

Stefan Tilkov: So in a good RESTful system, whether it's an API, or a website, or a web app, you would be presented with the contextual information that you need to make the next decision of how to move the thing forward. Now, what I do grant you is that there is a difference in who gets to decide what that is.

Artur Ortega: Yes, exactly.

Stefan Tilkov: Because in the RESTful model it's the server that decides that. Essentially, to me, that is what the client-server system in that case means. The server decides that and presents those options, which has both benefits and downsides. One of the benefits is that it's consistent across all clients.

Stefan Tilkov: I now have the same information – this is the kind of information that I need to present to my users, my clients, my consumers if they need to take that next step. And maybe I have variants of that. I have these three or four paths through the stages in my system, with maybe different stuff... But it's sort of predefined on the server-side. I determine on the server-side, with the benefit that it's consistent, and the downside that I can't deviate from that. If you as a client are just interested in just one particular aspect, you might get too much data or too little data, and with GraphQL you could influence that.

Artur Ortega: It has an influence as well on the architecture on the server-side, because... Take the example now with the appointments, where you would say "Actually, if I do an appointment, I would provide you with the data of the patient name, practitioner name and location, because I know you need it." But practically, this data in a big enterprise would sit in different places, and there are probably even other REST APIs maintained by other teams. So the way it ends up is that on the server side, the people responsible for the appointment need to understand how the patient data works, how the practitioner data works, how the location data works, and probably need to deal with requests into those to get the data into their domain... But in the end, they need to understand the patient, the practitioner and the location - the three foreign resources to the appointment - in order to provide you with that information.

Artur Ortega: GraphQL gives you a different architecture for that, because you leave the fetching of the nested data to the GraphQL service itself, by just referencing it. That means the appointment team doesn't need to know how a patient works, doesn't need to know the data model and doesn't need to fetch the data, so they don't have to deal with additional sub-requests, with what to do when the data is missing, with how the retries work, or even with subscribing to data to basically put it into their data model.

Stefan Tilkov: But that sounds a bit like, you know, if I were to write an application that has a user interface, it sounds a bit like, I would say, "Well, I don't have to understand this data and that data and that data, because I simply just give my user a SQL query panel where they can enter a SQL query and they get the raw data." I mean, of course, it's true, they get the raw data, but I haven't really solved a problem. I've just delegated the problem to somebody else, namely the user, or in your case, the client... Because now the client will need to understand the data, as opposed to the server. Somebody has to understand it to make sense of it.

Artur Ortega: But they still have to understand it, even if you provide them the API. The interesting part with GraphQL is this strongly-typed data schema, the GraphQL schema. That means you can inspect the schema and you get the full schema in one go. So the whole API that you're getting, basically, becomes one domain. It's not like an API gateway, where you would have several REST APIs behind it, and you would need to figure out how each one works if you just follow hypermedia references and leave it to the client. In that case, you get the full data model for an appointment, and that would include the data model for the patient, and so on. So you still have to understand it, regardless of whether you make it part of the appointment in the REST API or you basically get a GraphQL schema.

Stefan Tilkov: I agree to a certain degree. I will not argue that schemas don't have value; I think they have downsides as well, but they do definitely have value... So maybe now is a good point to really go into more detail about the GraphQL aspects. Why don't you start us off with a little more of a technical introduction to GraphQL? How does it work, what features does it provide?

Artur Ortega: Okay. Central to GraphQL is the GraphQL schema. This is a standardized way of defining your data model. And in addition to the data model itself, where you define different entities, you have a standardization as well about pagination, and some best practices as well. There is a specification of how you basically filter on this pagination, and how you name things -- so you have a particular kind of style guide for these data schemas, how you define them... And then an additional part is you can annotate the schemas. The annotations are called directives. Or you can even define custom directives, where you define for each of the fields basically some specifics about that field. And that could be, for example, visibility for a particular role, or you want to apply additional polymorphic options for that field. The schema itself then is basically a combination of the different entities that are referencing each other.

Artur Ortega: The other part - and this goes a little bit beyond just using GraphQL to reference the schema - would be how to navigate a graph. That would mean, for example - if you just think about pagination - the reverse search or the adjacency list. So in the case of the example I gave, an appointment references a patient; the patient itself doesn't reference the appointment, but the reverse search would be a list of appointments. So you would basically define, for example, as part of the patient definition, the list of appointments, which would be all the appointments belonging to that patient, and then you would basically act on that list of appointments.
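As a rough sketch of what such a schema carries, the snippet below models two types as plain Python data and inspects them. The type names, fields and the `visibleTo` directive are hypothetical; a real schema would be written in GraphQL's schema definition language, shown here only as a comment for orientation.

```python
# Hypothetical schema, roughly corresponding to this SDL:
#
#   type Appointment {
#     start: String!
#     patient: Patient!
#     notes: String @visibleTo(role: "PRACTITIONER")   # custom directive (assumed)
#   }
#   type Patient {
#     name: String!
#     appointments: [Appointment!]!
#   }
SCHEMA = {
    "Appointment": {
        "start": {"type": "String!"},
        "patient": {"type": "Patient!"},
        "notes": {"type": "String", "directives": {"visibleTo": "PRACTITIONER"}},
    },
    "Patient": {
        "name": {"type": "String!"},
        "appointments": {"type": "[Appointment!]!"},
    },
}


def named_type(type_ref):
    """Strip list and non-null markers: '[Appointment!]!' becomes 'Appointment'."""
    return type_ref.strip("[]!")


def references(schema, type_name):
    """Other schema types that the fields of `type_name` point to."""
    used = {named_type(field["type"]) for field in schema[type_name].values()}
    return sorted(used & set(schema))


def visible_fields(schema, type_name, role):
    """Fields a role may see, honouring the assumed `visibleTo` directive."""
    return [
        name
        for name, field in schema[type_name].items()
        if field.get("directives", {}).get("visibleTo", role) == role
    ]
```

In this sketch, `references(SCHEMA, "Appointment")` yields the `Patient` type and `visible_fields(SCHEMA, "Appointment", "PATIENT")` leaves out `notes`, which is the kind of per-field rule a directive could express.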

Artur Ortega: You get a sense of a graph by navigating in one direction, and having the pagination on the adjacency list. That gives you a different semantic about navigating the data itself.
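The reverse lookup with pagination might look like the following sketch. The argument and result names (`first`, `after`, `endCursor`, `hasNextPage`) are assumptions modelled on common cursor-style pagination, and the appointment ID doubles as the cursor purely to keep the example short.

```python
# Appointments reference their patient; patients do not store their appointments.
APPOINTMENTS = [
    {"id": "a1", "patient": "p1"},
    {"id": "a2", "patient": "p2"},
    {"id": "a3", "patient": "p1"},
    {"id": "a4", "patient": "p1"},
]


def patient_appointments(patient_id, first, after=None):
    """Resolve a hypothetical Patient.appointments field: one page of the reverse lookup."""
    matching = [a for a in APPOINTMENTS if a["patient"] == patient_id]
    start = 0
    if after is not None:
        # Continue right after the item the cursor points at.
        start = [a["id"] for a in matching].index(after) + 1
    page = matching[start:start + first]
    return {
        "nodes": page,
        "endCursor": page[-1]["id"] if page else None,
        "hasNextPage": start + first < len(matching),
    }


page1 = patient_appointments("p1", first=2)
page2 = patient_appointments("p1", first=2, after=page1["endCursor"])
```

The client walks the list by passing the previous page's `endCursor` back in, so it never needs to know how the server finds the appointments for a patient.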

Artur Ortega: And under the hood, for each entity, each field itself needs its own resolver. These resolvers can be bespoke code. Under the hood, these resolvers could even end up going to several backend services... So basically, you are fetching a patient, but the patient data could come from different services; because you are resolving it field by field, you would be able to resolve it from different resources.

Artur Ortega: The other part is you can annotate as well these fields with directives to specify which fields are being deprecated. And then the other part is because GraphQL itself doesn't allow you to say "Give me the full piece of data, and all the fields", but because you always have to specify explicitly the fields that you are interested in, it gives the server-side the option to know which fields are actually being used by the client, and you know if you are breaking something, if you are deprecating a field or adding a new field... Which allows a completely new evolution of the API without versioning, but gradually moving fields along.
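A minimal sketch of field-by-field resolution, assuming two hypothetical backend services behind a single patient type. The service names and fields are invented for the example; the `CALLS` list only exists to show which backends were touched.

```python
# Two hypothetical backends that each own part of the patient data.
DEMOGRAPHICS_SERVICE = {"p1": {"name": "Ada Example"}}
BILLING_SERVICE = {"p1": {"insurer": "Example Health"}}

CALLS = []  # records which backend each resolver touched


def resolve_name(patient_id):
    CALLS.append("demographics")
    return DEMOGRAPHICS_SERVICE[patient_id]["name"]


def resolve_insurer(patient_id):
    CALLS.append("billing")
    return BILLING_SERVICE[patient_id]["insurer"]


# One resolver per field of the patient type.
PATIENT_RESOLVERS = {"name": resolve_name, "insurer": resolve_insurer}


def resolve_patient(patient_id, requested_fields):
    """Run only the resolvers for the fields the query actually asked for."""
    return {field: PATIENT_RESOLVERS[field](patient_id) for field in requested_fields}


only_name = resolve_patient("p1", ["name"])
calls_after_first_query = list(CALLS)
both = resolve_patient("p1", ["name", "insurer"])
```

A query that asks only for `name` never reaches the billing backend in this sketch, while the client still sees one patient object.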
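Since every query names its fields explicitly, the server can count field usage. The following sketch shows one way that might inform deprecation; the field names and the `safe_to_remove` rule are assumptions, not part of any GraphQL tooling.

```python
from collections import Counter

# Hypothetical deprecated fields, e.g. fullName has been replaced by name.
DEPRECATED = {"Patient.fullName", "Patient.fax"}

usage = Counter()


def record(type_name, selection):
    """Count every field an incoming query selects on a type."""
    for field in selection:
        usage[f"{type_name}.{field}"] += 1


def safe_to_remove(field):
    """A deprecated field that no recorded query has selected can be dropped."""
    return field in DEPRECATED and usage[field] == 0


# Two incoming queries, reduced to the fields they select.
record("Patient", ["name", "dateOfBirth"])
record("Patient", ["fullName"])
```

Here `Patient.fullName` is still requested by some client, so removing it would break that client, while `Patient.fax` has no recorded usage.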

Stefan Tilkov: Let me see whether I have understood this correctly. Essentially, what you've just described is I have a data modeling language that allows me to specify the entities that make up my context (or whatever it is) with all their fields, and types, and the relation between those entities, and some meta information about all the fields and entities, like deprecation, or maybe some information about validation rules, or whatever it is...

Artur Ortega: Yes, yes.

Stefan Tilkov: That will be something that is sort of the -- well, I don't know whether GraphQL people call it the contract, but it's sort of...

Artur Ortega: Yes, GraphQL schema.

Stefan Tilkov: Okay, so it's the schema; in the WSDL times we would have called it a contract-first design. It doesn't really matter. So you basically design this schema so that you have an agreement between the users of that schema and the providers of that schema. So from the user side, if I am a GraphQL user, I can retrieve that schema from some place, I can look at it, and then I can start to actually do something with the other side without knowing what's behind that particular schema; how it's fulfilled, how it's provided.

Artur Ortega: Correct.

Stefan Tilkov: Okay. So from the client-side of things, what are the things that I can do with that schema?

Artur Ortega: The interesting part with this schema is that, because you can basically inspect the schema and it is standardized, you can use interactive GraphQL clients; there are several of them. GraphiQL is one of them. Playground... There are several of them, which explore the schema of an API. They get the documentation that it's annotated with, they know the types that you can specify... And that means you can start to explore the API. Very often, providers of an API would even provide predefined queries for that API... But in the end, you would have an interactive tool with autocompletion, where you could start to query the data in an interactive way.

Stefan Tilkov: A bit like a SQL query builder for--

Artur Ortega: Yes, exactly that.

Stefan Tilkov: Okay, I think we will get back to that.

Artur Ortega: But you basically are working with the live data. It's interesting, because otherwise you would start from a REST definition, and you would need to read the documentation, or read the Swagger documentation, to understand your API. This one is more like an explorative way of doing that, making use of the fact that JSON is as readable as it is.

Stefan Tilkov: Let me just make a brief note of the fact that I don't consider Swagger to be RESTful, or a good example of something...

Artur Ortega: Okay, fair enough.

Stefan Tilkov: But okay. Okay. Understood. So I use, for example, an interactive client. I don't have to do that, but I can do that, to explore the schema and figure out what it is that I want, and how to get that, and maybe fiddle around with some actions. You mentioned that it's not just queries. I think everybody gets that there are queries, but there are also other options, right?

Artur Ortega: Yes, there are two additional ways of interacting with GraphQL. The next one is mutation, which includes create, update and delete. This is probably the closest equivalent to what you were explaining. The interesting part with mutations is that a lot of the clients for this standard very often use local caches, local copies of the data... And that means that when you're mutating data, you can basically return the ID that has changed, or the affected data, so that the client itself can update the local copy.

Artur Ortega: Usually what happens is otherwise you would update a piece of data, you would get some additional information about how to continue with your journey, and there could be another query, for example. This is slightly different.

Stefan Tilkov: Sorry, I think you lost me here.

Artur Ortega: Basically, it's very aligned with CQRS probably, where you do the command layer with the nice advantage that you would get the effect of your command immediately back, if you want to. There are other ways as well, but basically it’s recommended that you return the result of your command, and then the query would be a separate path, very close to what you basically do with CQRS.

Artur Ortega: Then the other part is subscriptions; there are different implementations of them. Some of them are webhooks. So you are mutating something, it has an impact, but then you want to subscribe to some of the data that you have locally, to learn if it has been updated.
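The mutation pattern described here, a command that returns the affected data so the client can refresh its local copy, could be sketched as follows. The `cancel_appointment` mutation, its fields and the cache layout keyed by ID are all assumptions for illustration.

```python
# Server-side state and the client's local copy of it (both hypothetical).
SERVER = {"a1": {"id": "a1", "status": "booked", "start": "2020-03-02T09:00"}}
cache = {"a1": {"id": "a1", "status": "booked", "start": "2020-03-02T09:00"}}


def cancel_appointment(appointment_id, selection):
    """Toy mutation: apply the command, then return the ID plus the requested fields."""
    appointment = SERVER[appointment_id]
    appointment["status"] = "cancelled"
    return {"id": appointment["id"], **{field: appointment[field] for field in selection}}


def merge_into_cache(payload):
    """Client side: use the returned ID to find the local copy and update it."""
    cache.setdefault(payload["id"], {}).update(payload)


# The client asks for the fields it expects the command to change.
payload = cancel_appointment("a1", ["status"])
merge_into_cache(payload)
```

The client did not issue a follow-up query: the mutation result alone was enough to bring the cached appointment up to date, while untouched fields such as `start` stay as they were.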

Artur Ortega: So some of them basically can be implemented with webhooks, or with websockets. A lot of implementations use websockets to keep your data locally up to date, for the data that you have directly impacted.

Stefan Tilkov: Okay. So the typical model here would be a local application, let's say a native application running on some device, or maybe a single-page application, whatever it is - something that has logic, it has its own local copy of the data relevant to its scope, like maybe the users it's interested in or the transactions it's interested in. It would send commands to the GraphQL side to get stuff mutated, it would get back the results, if they're relevant for something they're subscribed to, because they're interested in that, and then they update their local copy.

Stefan Tilkov: Essentially, that is a perfectly valid and absolutely reasonable discussion, because what we're describing here is a certain architecture, and that's the architecture that GraphQL is built for. My only argument that I'm probably going to come back to multiple times during the rest of our talk is that this is specifically not something you would ever aim for with a RESTful system.

Artur Ortega: Probably not.

Stefan Tilkov: So the whole comparison that we started out with is maybe broken, but... Let's continue, because I'm just learning so much at the moment, and I'm having fun doing that. So I understood the client-side... I have standard ways of interacting with that thing; I would just assume I don't have to worry about anything implementation-wise on the server-side, because I have standardized interaction with everything that exposes this kind of GraphQL schema... So in terms of the standardization, is there one GraphQL standard in one particular version that's current, or what is the status there?

Artur Ortega: Yes, there's a working group on GraphQL that is working on a standard. Usually, the fundamental part is "No breaking changes." That is also how you evolve your schema itself. And there's a lot of tooling around that, additionally. So you would have tooling that looks at which queries you're getting in, and lets you double-check whether the changes you are making have an impact on other clients; those could be other services that you are using as well... And most of the time it's about evolving an API without breaking changes; the same goes for the schema as well.
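
The kind of check described here, comparing a schema change against the queries clients actually send, might look roughly like the following Python sketch. The schema representation (a type name mapped to a set of field names), the client names and the function names are all assumptions made for the example; real tooling works on the full GraphQL schema language.

```python
def removed_fields(old_schema, new_schema):
    """Return (type, field) pairs present in the old schema but not the new one."""
    removed = set()
    for type_name, fields in old_schema.items():
        remaining = new_schema.get(type_name, set())
        removed |= {(type_name, field) for field in fields - remaining}
    return removed


def impacted_clients(old_schema, new_schema, observed_usage):
    """Map each client to the removed fields it was still querying."""
    removed = removed_fields(old_schema, new_schema)
    impact = {}
    for client, used in observed_usage.items():
        broken = sorted(used & removed)
        if broken:
            impact[client] = broken
    return impact


old = {"Patient": {"id", "name", "email"}}
new = {"Patient": {"id", "name", "phone"}}  # "email" dropped, "phone" added
usage = {
    "mobile-app": {("Patient", "name"), ("Patient", "email")},
    "web-app": {("Patient", "name")},
}
report = impacted_clients(old, new, usage)
```

Here adding `phone` is harmless, while removing `email` is flagged because `mobile-app` still queries it.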

Artur Ortega: There are more and more directives being added over time as it evolves, this kind of thing... But at the same time, you can extend it yourself with custom directives, and so on. Otherwise, it's clearly defined and standardized.

Stefan Tilkov: Okay. I'm just asking this because with other standards, some people call something a standard, but in real life it's not that interoperable at all. That is not the case in this space?

Artur Ortega: No. You can basically use a Java client and a Python server for GraphQL, or a TypeScript client and a Scala server; it wouldn't make any difference at all.

Stefan Tilkov: Okay. Perfectly fine. So I think we've covered most of the client-side view, right? Or is there something missing?

Artur Ortega: Yes, I think we probably have to come back as well to why it became so popular... Because you mentioned basically badly-implemented REST APIs. So GraphQL became popular mostly among frontend developers and mobile developers. They were basically tired of dealing with probably badly-designed REST APIs. And especially on mobile, they wanted to make sure that the number of connections to the server is optimized. At the time, five years ago, basically HTTP/2 wasn't a thing yet on the CDN level; so you'd still speak about HTTP/1.1. If it's HTTPS, each connection would need to basically do the handshake, and the SSL certificates, and so on. So each connection on a mobile phone had a lot of impact on how fast you could interact with the REST APIs on the server-side... And they were trying to optimize the connections to have one connection, leaving it to the server-side to do all the steps necessary to deal with the REST APIs. That's usually where GraphQL came in.

Artur Ortega: Usually, you had frontend developers, mobile developers - they were saying "We don't want to deal with (as you would say) badly-implemented REST APIs", but that's basically what the reality looked like, and they were saying "Let's simplify the interaction with the REST APIs. Let's simplify the interaction with hypermedia references, and let's move the complexity to the server-side, so that on the client-side I can have a very lightweight UI, dealing just with local state and local copies, and minimizing the business logic on the client."

Stefan Tilkov: That makes perfect sense.

Artur Ortega: That's usually where it starts.

Stefan Tilkov: That makes perfect sense, and I don't even disagree. We can settle on the fact that we're having these badly-designed REST APIs, for whatever reason; maybe it's because REST is so hard, maybe it's because people did a bad job... Who knows/who cares, if we had that situation...? And it's definitely not something that I would deny that those exist; there are tons of really, really bad REST APIs. It's just the question of what strategy do you use to fix that situation. Of course, I would argue that the best way would be to fix the bad APIs and make them good APIs, thereby sort of reducing the need for something like GraphQL. But maybe that's just wishful thinking.

Stefan Tilkov: Then the other option, of course, is to invent something that can make more sense of that, or make it easier for people who design bad REST APIs to design -- well, maybe not good REST APIs, but at least good GraphQL schemas. Maybe that's a good strategy. And if you assume that that is the case, then the rest of what we've just discussed makes perfect sense. I have no disagreement on the technical side of things. We can have a debate about what the better architecture is, for what scenario, but let's move that to the end.

Stefan Tilkov: I now have a pretty good picture of what GraphQL does on the client-side, so maybe let's move to the server, to the implementation side of things. I have this thing, and now I'm responsible for implementing that schema, that contract. How do I do that?

Artur Ortega: Usually, people define the schema, and then they start to implement the resolvers for each of the fields.

Stefan Tilkov: Please explain that again to me... I do need a resolver for each of the fields? That puzzles me a bit.

Artur Ortega: Usually, it's basically a one-to-one mapping; it's not a big deal. Remember, we are speaking about the frontend here, trying to simplify some of the work that they have to do. That means usually they implement resolvers for the fields you are exposing, and these resolvers could fetch the data from a database, from a REST API, from any kind of different source, but they are unifying the data and making it consistent, even if it usually sits behind different endpoints on the server side.

Artur Ortega: Usually, you are implementing in that case resolvers for each of the fields. You are very, very local to each of the fields when you're implementing GraphQL. And that's a big difference to probably classical REST APIs, where you're not as specific on a field level.
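
To make the "one resolver per field" idea concrete, here is a minimal Python sketch. It does not use any GraphQL library; `PATIENT_DB`, `CONTACT_API` and the field names are hypothetical, and the tiny `execute` function only imitates what a GraphQL runtime does when it runs the resolvers for the requested fields.

```python
# Hypothetical upstream sources; in practice a database and a REST API.
PATIENT_DB = {"p1": {"full_name": "Ada Example", "dob": "1980-01-01"}}
CONTACT_API = {"p1": {"email": "ada@example.org"}}


# One resolver per exposed field: each one knows where its data lives
# and how to map it to the shape the schema promises.
def resolve_name(patient_id):
    return PATIENT_DB[patient_id]["full_name"]


def resolve_birth_date(patient_id):
    return PATIENT_DB[patient_id]["dob"]


def resolve_email(patient_id):
    return CONTACT_API[patient_id]["email"]


PATIENT_RESOLVERS = {
    "name": resolve_name,
    "birthDate": resolve_birth_date,
    "email": resolve_email,
}


def execute(patient_id, requested_fields):
    """Run only the resolvers for the fields the client asked for."""
    return {field: PATIENT_RESOLVERS[field](patient_id) for field in requested_fields}


result = execute("p1", ["name", "email"])
```

The client asks for `name` and `email` and gets one consistent object back, without knowing that the two fields come from different sources.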

Artur Ortega: At the beginning - and that's the interesting part - they're using it a little bit only like a smart proxy, where they can say "Can you pick these fields from these two different endpoints and make it consistent to me, and make sure that the data type is the right one?", and deal with the case if some of the APIs don't respond at all.

Artur Ortega: For example, how to deal with partial responses if some of the REST APIs are not there. How to store the data. Which of the fields you would make nullable, to say that there's no data coming back... And you can still deal with partial updates, even if the server-side is not completely available. It all goes down to the level of writing this resolver. On first encounter, you usually see GraphQL just as a smart proxy, and not as a graph data modeling tool in its own right. That's usually how it starts.
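
The partial-response behaviour can be sketched as follows. The response shape with `data` and `errors` follows the GraphQL convention; everything else (the resolver names, the `NULLABLE` set, the simulated outage) is an assumption made for this example.

```python
def fetch_name(patient_id):
    return "Ada Example"


def fetch_appointments(patient_id):
    # Simulates an upstream REST API that is currently down.
    raise ConnectionError("appointments service unavailable")


RESOLVERS = {"name": fetch_name, "appointments": fetch_appointments}
NULLABLE = {"appointments"}  # fields the schema allows to come back as null


def execute(patient_id, fields):
    """Resolve each field; a failing nullable field becomes None plus an error."""
    data, errors = {}, []
    for field in fields:
        try:
            data[field] = RESOLVERS[field](patient_id)
        except ConnectionError as exc:
            if field not in NULLABLE:
                raise  # a non-nullable field cannot be silently dropped
            data[field] = None
            errors.append({"path": [field], "message": str(exc)})
    return {"data": data, "errors": errors}


response = execute("p1", ["name", "appointments"])
```

The client still receives the `name`, sees `appointments` as null, and can inspect `errors` to find out why.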

Stefan Tilkov: Okay. So just for me to get things clear in my head - what I can imagine in a very straightforward way would be to have a moderately complex GraphQL schema, and then just map that to one existing database. I have this SQL data store, or maybe a NoSQL data store, and then I need to implement the resolvers for the fields, and entities, and relationships, and I simply map them to the appropriate queries to my data store.

Artur Ortega: You can go even beyond that. Say you have APIs that are specialized for a particular thing; you have one API that's optimized for doing static searching on some of the data. And that API just comes back with the hypermedia references, in the end the IDs for the resources themselves. GraphQL combines the resolver for the IDs with the resolvers for the objects that are sitting behind the IDs, so the client wouldn't need to look at the hypermedia references themselves; they would just fetch the fields of the search result as it comes back. And this is where it can really simplify the approach that they're using, which is usually more complicated when backend engineers are providing them with those different APIs.

Stefan Tilkov: Okay. So that would have been my next question, but it's perfectly fine. You've answered "How would I go about aggregating data from different sources?" That's a good example. I could have a search index somewhere, I can perform a search, get some result there, or some IDs; I pick some data from somewhere else, combine them so that they conform to the schema, return that to the client. I completely understand that.
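
A rough Python sketch of this aggregation: one resolver asks a search index for IDs, a second one loads the objects behind those IDs, and the client only ever sees the combined result. `SEARCH_INDEX`, `PATIENTS` and the function names are made up for the illustration.

```python
# Hypothetical search service: returns only references (IDs), not objects.
SEARCH_INDEX = {"smith": ["p2", "p7"]}

# Hypothetical patient service: owns the actual data behind the IDs.
PATIENTS = {
    "p2": {"name": "Jo Smith", "city": "London"},
    "p7": {"name": "Sam Smith", "city": "Leeds"},
}


def resolve_search(term):
    """First hop: ask the search index for matching IDs."""
    return SEARCH_INDEX.get(term.lower(), [])


def resolve_patient(patient_id, fields):
    """Second hop: load the object behind an ID, keeping only requested fields."""
    record = PATIENTS[patient_id]
    return {field: record[field] for field in fields}


def search_patients(term, fields):
    """What the client sees: one call, fully populated results."""
    return [resolve_patient(pid, fields) for pid in resolve_search(term)]


results = search_patients("Smith", ["name"])
```

The client never follows a reference itself; following the IDs is the server's job.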

Artur Ortega: You probably get a clear contract of the data as well. The moment you are defining the types for your schema - and you can do unions of them, and so on - you are specifying a contract, and this allows you to do contract validation at query time... Which is quite a difference to the usual situation -- because at this point we are speaking about frontend developers maintaining that. So they can ensure that what they are expecting from the REST API is not diverging over time.

Stefan Tilkov: Okay, understood. If I were to implement a REST API (in my terminology), I would basically do the same thing; it would just be maybe an implementation to some ad-hoc specification, or some ad-hoc format. It'd basically be the same thing. I would implement the same backend interactions with existing systems, and then return an aggregated result as well... But I would have to come up with a data format and a way to represent that stuff, because I don't have the standardization provided by the GraphQL schema. But it's the same layer in that regard.

Stefan Tilkov: How about if I have some sort of business logic? Is business logic commonly implemented in that GraphQL layer? And if so, how?

Artur Ortega: Potentially, yes. In theory, you can build business logic in each of the resolvers for each field, but in my experience of the things that I've seen - usually, the logic moves into the mutation, and the query becomes more simplified. It's similar to what you would expect with CQRS. That's what I see... Even if it would be possible to add more logic into that.

Artur Ortega: But speaking of that, the next step - what usually happens beyond just paginating simple data - would be to say "Can we model the data as a graph? Can we model the lists that we're getting as adjacency lists?" And that is a big difference after the first step of just using it as a smart proxy... Because then you would say "I don't want to just have a list of appointments", but you would navigate the graph. You would say "I want to go to this patient, I want from this patient all the appointments", and that would be an adjacency list. It would be a list of all the appointments referencing that patient. And suddenly, you are navigating your entities, and the references, and you are navigating the adjacency list of that graph. That's where the graph part comes into it.
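
As a small illustration of the adjacency-list idea, the Python sketch below starts from appointments that each reference a patient and builds the reverse view, "all appointments pointing at this patient". The data, the `first` argument and the function names are assumptions for the example, not a prescribed GraphQL pagination scheme.

```python
from collections import defaultdict

# Each appointment references exactly one patient (the forward link).
APPOINTMENTS = [
    {"id": "a1", "patient_id": "p1", "slot": "2020-03-02T09:00"},
    {"id": "a2", "patient_id": "p2", "slot": "2020-03-02T10:00"},
    {"id": "a3", "patient_id": "p1", "slot": "2020-03-09T09:00"},
]


def build_adjacency(appointments):
    """Invert the references: patient id -> ids of appointments pointing at it."""
    adjacency = defaultdict(list)
    for appointment in appointments:
        adjacency[appointment["patient_id"]].append(appointment["id"])
    return adjacency


APPOINTMENTS_BY_PATIENT = build_adjacency(APPOINTMENTS)


def resolve_patient_appointments(patient_id, first=None):
    """Resolver for a patient's appointments, optionally limited to a first page."""
    ids = APPOINTMENTS_BY_PATIENT.get(patient_id, [])
    return ids if first is None else ids[:first]
```

Navigating from patient `p1` to its appointments is then a lookup in the adjacency list rather than a scan over every appointment.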

Stefan Tilkov: Right. It sounds as if somebody's just invented hypermedia, which is awesome. I like that. Very positive development. I'm happy about that.

Artur Ortega: Yes, but think about a web page, let's say. In the web page itself, at the moment you can only have the links to the other pages. But if you have a page, you don't get a list of all the pages pointing to you. And that's usually where the pagination comes in. So if you have a patient, you would say "What are the appointment pages pointing to this patient?", and it would be slightly different to the classical web functionalities.

Stefan Tilkov: But is that something that I do on the client or on the server side?

Artur Ortega: On the server side.

Stefan Tilkov: But that's the same thing, right?

Artur Ortega: You would implement the adjacency lists.

Stefan Tilkov: You just have a convenient way to get that, because you have the schema, you know about this relationship and this direction. You know that a patient has n appointments... So your schema allows your environment, your infrastructure to give you a very easy way to get to that thing. But in the end, what you return to the client is the same thing that I would return to the client.

Artur Ortega: Correct... In a defined, or best-practice way. So how do you do pagination--

Stefan Tilkov: Understood, yes. Yes, I buy the standardization argument, with all the benefits and downsides that come with that kind of thing.

Artur Ortega: Obviously, yes.

Stefan Tilkov: That's perfectly fine. I think I also got a new epiphany when you mentioned the CQRS thing. So maybe for our listeners who don't know the acronym, CQRS is Command Query Responsibility Segregation; essentially, separating the writing, the transactional part, the changing part from the reading part. The reading part would then be very optimized, for reading, obviously. And I think that is something that you've just explained to me, so I'm rephrasing it in my words, and you can tell me whether I understood it correctly...

Artur Ortega: Yes, alright.

Stefan Tilkov: What you said was that essentially the way you end up with the schemas, in a good GraphQL design - you design the schema in such a way that it's good for reading, which makes perfect sense.

Artur Ortega: Correct.

Stefan Tilkov: So it's optimized for consumption for just read purposes, and then the writing part is sort of a different channel into the same thing. That makes a lot of sense. So it essentially standardizes that as well.

Artur Ortega: Yes.

Stefan Tilkov: Very good. What else -- well, the business logic question; I think we were addressing that, right?

Artur Ortega: Yes.

Stefan Tilkov: You mentioned that the business logic - and that makes perfect sense then - does not belong in the reading part, because you would sort of denormalize things when you handle the mutation things. You handle the mutations so that the reading then becomes easy and fast, or based on the spread-out data...

Artur Ortega: Correct.

Stefan Tilkov: So how do you implement the mutating actions on the server-side?

Artur Ortega: Most of the time you are using APIs that are available already in the company. They are obviously now already -- and that's the part where you usually would write code on the mutations and the resolvers for the fields, where you would put the logic into the mutation. I just wanted to clarify - you could write business logic in the resolvers of the queries, but it's less common.

Stefan Tilkov: So the resolvers of the queries, does that--

Artur Ortega: That could have code as well, and business logic, in theory... But it's less common.

Stefan Tilkov: But if you use the term query here... Are you just talking about the reading queries, or do you call the mutation actions queries as well?

Artur Ortega: No, no. Queries is one thing, mutation is the--

Stefan Tilkov: Reading only.

Artur Ortega: Yes, exactly. So the command side is the mutation, and the query part is the query part of the CQRS.
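
The split between the command side and the query side can be sketched like this in Python. The double-booking rule, the in-memory stores and the function names are hypothetical; the point is only that the mutation carries the logic and keeps a read-optimized view up to date, while the query is a plain lookup.

```python
APPOINTMENTS = []             # write side: the source of truth
APPOINTMENTS_BY_PATIENT = {}  # read side: denormalized for fast queries


def book_appointment(patient_id, slot):
    """Mutation (command side): business rules live here."""
    if any(existing["slot"] == slot for existing in APPOINTMENTS):
        raise ValueError("slot already taken")
    appointment = {
        "id": f"a{len(APPOINTMENTS) + 1}",
        "patient_id": patient_id,
        "slot": slot,
    }
    APPOINTMENTS.append(appointment)
    # Keep the read model in step so that queries stay trivial.
    APPOINTMENTS_BY_PATIENT.setdefault(patient_id, []).append(appointment)
    return appointment


def query_appointments(patient_id):
    """Query (read side): no business logic, just a lookup."""
    return list(APPOINTMENTS_BY_PATIENT.get(patient_id, []))


first_booking = book_appointment("p1", "2020-03-02T09:00")
```

A second booking for the same slot raises `ValueError` in the mutation, and the read side is never touched in that case.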

Stefan Tilkov: That makes sense. I'm not sure you answered my question. I understood that there's not supposed to be business logic in the queries, in the reading part; that was the CQRS discussion we just had, right?

Artur Ortega: Yes.

Stefan Tilkov: But my assumption is if I have some sort of GraphQL server or framework, whatever they're called, then I have maybe some kind of hook that I can use to implement some logic that runs when the mutation happens, right?

Artur Ortega: Yes, the resolvers and the mutation side.

Stefan Tilkov: So on the mutation side they're also called resolvers.

Artur Ortega: Yes.

Stefan Tilkov: Okay. I wasn't sure of that. Okay.

Artur Ortega: Yes, yes.

Stefan Tilkov: So what you're saying is in theory I could write business logic in the resolvers, in practice I would essentially most of the time end up delegating to some other system.

Artur Ortega: Yes. Very often it goes to APIs that exist. A lot of the APIs already have an event-driven architecture in place, which fits nicely with that. At the same time, you have situations where mutations and queries still go to the same database. You can have that as well.

Stefan Tilkov: Okay. So let me play my curmudgeon role here again... It sounds like a super-sophisticated, fancy solution to a problem that I'd rather not have...

Artur Ortega: Interesting...

Stefan Tilkov: Because what we have is we have backend APIs that aren't great, and then we put this complicated server-side infrastructure on top of that, to turn them into something that is okay, so that we have a client that can consume that stuff that is now standardized and okay to build something on the client-side, that I would at the very beginning have built on the server-side in the first application layer that doesn't suck.

Artur Ortega: Correct, yes.

Stefan Tilkov: So that's sort of an architectural thing.

Artur Ortega: It's exactly what happens in the corporations.

Stefan Tilkov: Okay. It makes sense.

Artur Ortega: You have a situation where the frontend really says "Actually, backend, you are not doing what the customer actually wants. It's not the exact thing that I need." We build, let's say, a highly sophisticated GraphQL service in front of it, and we're doing that at that point. In the meantime, the backend continues to do their work and just ignores what the frontend does, because they will build their GraphQL, and they're adding more and more endpoints to their API gateway, or to their mesh, and they're adding more and more microservices...

Artur Ortega: So we are basically probably at what is the optimized number of microservices, so you're probably at 400-something... And then you get the situation where the frontend says "We can't cope anymore with writing resolvers down to the field level for 400 endpoints, and we have to maintain it all in one big schema." It gets out of hand... That's usually what happens in that situation.

Stefan Tilkov: Big sigh... But okay. Let's move on. One thing that I was wondering about, and I think you mentioned it in passing before - obviously, this will at some point in time become unmanageable on the GraphQL server side as well, right?

Artur Ortega: Yes.

Stefan Tilkov: So how can you turn that into something more manageable when you need to scale it to bigger schemas?

Artur Ortega: Yes, the interesting part is a lot of the frontend developers that got to that point - it happened at The Economist, as we were doing that there with microservices, where the frontend became like a monolith...

Stefan Tilkov: I'm surprised...

Artur Ortega: The whole corporation data model is in one schema, and all the translation for all the APIs are in one place, and it feels like a monolith... And then you have this transition where the frontend says "Actually, can you, the one who's building the APIs, actually look at what we built, and basically follow our path and take care of your small bits?" And then the backend says, "Actually, we are not looking at such a big domain. We are only working in smaller teams, and we are only taking care of smaller microservices; we are not taking care of the whole thing." Nobody wants to be responsible on the backend.

Artur Ortega: That was a classic problem that a lot of companies ran into with GraphQL. That's why, probably a year ago roughly, people came out and said "Can we stitch GraphQL queries together, so that different backend teams can take a part of it?" That didn't really work out. Then Apollo came up with a new standardization - it's called Apollo Federation, but it's not specific to Apollo - where you specify a very lightweight gateway that inspects the data schemas of the different domain gateways that are added to the federation... And it looks them up every time something changes in any of the upstream domain gateways. It looks at their definitions, it looks at foreign references to the other domain gateways, and builds up a federated domain gateway across a company, while everyone else just provides a small subset of their domain into the federation.
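
The federation idea can be approximated in a few lines of Python. This is a deliberately simplified sketch and not Apollo Federation's actual API or composition algorithm: the gateway names, the dictionary-based schema format and the `compose` and `plan` functions are assumptions chosen for the illustration.

```python
# Each domain gateway publishes only the part of the schema it owns.
SUBGRAPHS = {
    "patients-gateway": {"Patient": ["id", "name"]},
    "appointments-gateway": {
        "Appointment": ["id", "slot"],
        "Patient": ["appointments"],  # extends a type owned by another domain
    },
}


def compose(subgraphs):
    """Build a routing table: (type, field) -> the gateway that owns it."""
    routing = {}
    for service, types in subgraphs.items():
        for type_name, fields in types.items():
            for field in fields:
                key = (type_name, field)
                if key in routing:
                    raise ValueError(f"{type_name}.{field} is defined twice")
                routing[key] = service
    return routing


def plan(routing, type_name, fields):
    """Group the requested fields by the gateway that has to answer them."""
    steps = {}
    for field in fields:
        steps.setdefault(routing[(type_name, field)], []).append(field)
    return steps


ROUTING = compose(SUBGRAPHS)
query_plan = plan(ROUTING, "Patient", ["name", "appointments"])
```

For a query on `Patient` asking for `name` and `appointments`, the plan sends `name` to the patients gateway and `appointments` to the appointments gateway, while each team only maintains its own slice of the schema.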

Artur Ortega: That's basically the next iteration, which you would basically call an API gateway on the REST side. It's basically the part that addresses the pain of the frontend in dealing with a lot of microservices and with a common data model across them, by splitting the responsibility up. And you then have the situation where you have a little bit like a shared data model between the frontend and the backend, but the backend is then responsible for their entities, with the normalization on the data model, so that you would reference the other ones.

Artur Ortega: GraphQL itself is more like a query optimizer. So the moment you ask it for some data, it has the information about the data models coming from the different domain gateways, it knows where the data is sitting and which format it has, and it optimizes the incoming queries across the different domains. And that's at the moment the new state of the art.

Stefan Tilkov: *sigh*

Artur Ortega: I can hear you're sighing, yes...

Stefan Tilkov: You know, there was a time when I would have been really happy to hear all this... My old mind can basically hear all the consulting revenue coming in. There is so much to talk about, and teach people, and there's so much to coach people... It's fantastic. I can see how some people might be really big fans of this approach.

Stefan Tilkov: It reminds me a bit of the SOAP, WSDL, WS death star world that I used to be a part of... So I'm guilty of that as well. I'm not pointing fingers at anybody; I was just as guilty of this as well... I've just stopped believing in that. All this complication, all these layers, all this complexity... I mean, things could be simpler; it would be better if they were simpler, because they would be more manageable, and you could better understand them, and you could better maintain them, and you could actually have something that moves faster, and has more value, for less of an investment... But maybe that battle is lost.

Artur Ortega: I think the interesting part is it's not too complicated if we... Let's put it this way - it's not too complicated, and you can see that because GraphQL was initially introduced in a lot of the companies to simplify the work with the REST API, and having a clear data structure. So the approach there is still very data-driven and simplification-driven for the frontend.

Stefan Tilkov: Maybe, yes.

Artur Ortega: And it's very lightweight, still. The federation tries to get these several domains we have and tries to normalize it as much as possible. But you are moving complexity in the architecture. So what you would have in your REST API where you would call out to other services, or you would subscribe to other services to get the data - that complexity moves into your data schema. You end up having to define the schema to say "I have an appointment, but this appointment needs a patient, a practitioner, a location, and I'll leave it to the other three domain gateways to provide us this data and the schemas for them." Basically, it's moving responsibility and delegating responsibility away, and it's probably more solving an organizational thing, and at the same time moving responsibility away from the teams, so they don't have to call out and fetch data and understand the data themselves.
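
The appointment example can be sketched as follows: the appointments domain stores only references, and the data behind each reference is filled in by the domain that owns it. All names and records below are hypothetical, and plain dictionaries stand in for the three other domain gateways.

```python
# The appointments domain owns the appointment and holds only references.
APPOINTMENTS = {
    "a1": {
        "id": "a1",
        "slot": "2020-03-02T09:00",
        "patient": {"id": "p1"},
        "practitioner": {"id": "d1"},
        "location": {"id": "l1"},
    }
}

# Stand-ins for the three other domain gateways, each owning its own entity.
OWNERS = {
    "patient": {"p1": {"id": "p1", "name": "Ada Example"}},
    "practitioner": {"d1": {"id": "d1", "name": "Dr. Sample"}},
    "location": {"l1": {"id": "l1", "city": "London"}},
}


def resolve_appointment(appointment_id):
    """Return the appointment with every reference filled in by its owner."""
    appointment = dict(APPOINTMENTS[appointment_id])  # shallow copy; stored record stays as-is
    for field, owner in OWNERS.items():
        reference = appointment[field]
        appointment[field] = owner[reference["id"]]
    return appointment


full = resolve_appointment("a1")
```

The appointments team never has to fetch or understand patient data itself; it only declares that an appointment needs a patient, a practitioner and a location.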

Stefan Tilkov: This all makes perfect sense. There are certain use cases where this matches perfectly, so I'm not debating whether that is the case or not. It's more of a discussion whether you wanna be in a situation where you need something like that, and maybe you can't avoid being in that situation. And if you can't, or if some other circumstances force you to deal with that kind of scenario, then that seems like a very well thought-out attempt... Which is probably evidenced by its popularity. I mean, there are always some good ideas in every approach, otherwise it would simply disappear into darkness somewhere... But this is definitely a very interesting thing.

Stefan Tilkov: So I have learned a lot about where GraphQL might be applicable. I can't say it has fundamentally changed my opinion, but I understand it a lot better, and I have in my mind a certain scenario of use cases where I could imagine GraphQL could be a very good match. Are there some additional use cases that you want to mention, where GraphQL is a particularly good fit?

Artur Ortega: The interesting part there is a little bit the discussion about ownership of domains, and how the domains relate to each other... Because especially if you're going into the federation part, where you specify what you own and you delegate the definition of other values to others, at that moment you have to figure out who owns which domain, and how they relate to each other... And this conversation is forced at that moment, whereas with a classical API gateway, the only thing people are interested in is that they don't have a colliding endpoint on the API gateway... And there's probably even less of a conversation if you just have a mesh.

Artur Ortega: I think the interesting part is more like what does it support? It supports a conversation about the data model, consistency, where are the boundaries of the domains, how are they relating to each other and so on... Which in the moment basically everyone in the company is contributing to it. Basically, you have to have a much more conscious discussion about that. And it's enabling this discussion. It doesn’t solve it obviously, because you can still mess it up, so...

Stefan Tilkov: Yes. I will refrain from comparisons to canonical data models and enterprise-wide data models. I will just not mention it. That's my new strategy, just not mentioning it by mentioning it. So that's fine...

Stefan Tilkov: Artur, this was extremely interesting. Thanks a lot for all the information. As usual, my last question is do you have some resources that you want to point our listeners to?

Artur Ortega: Yes. I think there are several interesting talks. I think Apollo has a few about how it's being used. There are some other talks by different companies that are explaining how they implemented GraphQL, or even implemented federation... And I will provide the links in the show notes.

Stefan Tilkov: Perfect. So the show notes are going to be awesome. Lots of links provided by Artur, so it's all based on his input. Listeners, thank you so much for listening. Artur, thank you so much for being with us here today.

Artur Ortega: Thank you so much for having me.

Stefan Tilkov: Have fun everyone, and make wise choices in terms of the architecture.

Artur Ortega: Always.

Stefan Tilkov: Thank you. Bye-bye!

Artur Ortega: See you later! Bye-bye!