Transcript
[00:01] And the recording has started. Welcome, Theo. Thank you very much for joining me on this podcast. It's good to be here. The podcast is actually called "Conversations About Software Engineering," and I thought, what better start to this than inviting the best software engineer that I know and have had the pleasure of working with over the course of five years: Theo.
[00:28] Flattery will get you through the interview.
[00:34] Setting the tone! I thought we could just have this conversation about software engineering and look at your long history, the lessons you've learned along the way, and how you got to the place where you are. But before we do, I want to talk a little bit about your newest endeavors in the butchery industry. So, what makes a good steak for you?
[00:58] What makes a good steak for me? That is a lot of personal preference; it depends on the day of the week. A good beefy flavor, good marbling. You know, I like ribeyes, I like skirt steaks, I like picanha. I like pretty much everything. But, you know, 80% of it is the chef, but a solid 20% of it is the cow's job.
[01:23] So, is there something about the experience of meat and the whole meal that draws you into the industry? I think there is so much craftsmanship in there and the relationship to nature that I find quite interesting. Is that kind of the package that you're also looking for?
[01:51] I think there's a complicated proxy element to that. The adventure started sitting on my porch drinking bourbon with my neighbors, and we were lamenting that the butcher shops in the area have either shut down or are not impressive. The large chain grocery shops in the United States have changed the way they deliver and cut meat. They took the butchers out of the shops and moved them to the warehouse to consolidate and pay them all part-time instead of full-time. Most of the meat that comes into a butcher shop or grocery store meat counter is already cut, making it very hard to get a custom cut of meat.
[03:48] So, is the problem really that it should be fresher when it's closer to the customer, or is it that you cannot order the right parts?
[04:00] It's the parts, more than anything. They tend not to have an extreme freshness problem. In fact, the quality of the meat from grocery stores is usually pretty good because they buy in such bulk that they can get the quality they want. The problem is, if you want a one-and-a-half-inch thick ribeye, you can't get it. You only have one-inch thick ribeyes or two-inch thick ribeyes. If you want a prime rib, it's difficult to find that because they're all cut into three-bone or four-bone parts before they come into the grocery store. So it's really about how the animals are sliced up.
[04:53] What kind of farms do you get the meat for your shop?
[05:00] We source the meat; I would say about 60% to 80% of our meat comes from the local area within 50 to 100 miles of us. The other 20% is interesting. It's hard to get New Zealand lamb local, and it's also hard to get Australian wagyu or Japanese wagyu local. Those come from afar. The chicken comes from a nice farm in Pennsylvania known for its special processing and handling after slaughter. They air dry it and air cool it instead of water cooling it, which really makes it taste like a European chicken. If you've ever had chicken in the United States and thought, "What is this trash?" this chicken tastes like the chicken you would get normally in Europe. The pork comes from a farm about 45 miles away from here. We get half a pig every week and cut it up, and we try to use every part, which is great.
[06:07] That sounds good. We have some problems with regulation in Germany. There was a very family-owned butcher shop in the town where I live that had to close down last year, or at the beginning of last year, I think, which was a tragedy. The regulation was the problem; they couldn't fulfill the new standards that were required. Before that, we really had the full value chain, really local. I could see the cow standing in the neighbor's yard. It was all local, and if you knew the right people, you could really get the exact meat that had never left the town. I always found that extremely nice, although I don't have that kind of developed taste for meat. Just the whole process and having the local experience together with some barbecue and whiskey—that's definitely a good time.
[07:10] The regulations are complicated here. I won't say that they're problematic, but one of the reasons we don't slaughter animals is that it requires US Department of Agriculture supervision and oversight. Butchery has already gone through the slaughter process. We get the animal cut in half or into six pieces, and that comes in on a truck, and then we just do the cut down from there. That's largely due to the complexities of running a slaughterhouse, which is very complicated in the US.
[07:42] Theo, I know that you have managed to sneak in some software engineering into your new role, so maybe we can talk a little bit about this because I found it quite interesting. What did you do?
[08:00] Well, I will say that I'm not sure I was ever really on the cutting edge of software engineering, but I was up there as an early adopter of a lot of languages, coding in C, C++, Rust, and JavaScript to build complex apps. If you ever want to step back in time, you can go into old-style retail, where you have scales, weights and measures, and point of sale systems. I know that places like Shopify and Square have tried to innovate there and replace a lot of those systems. But when you start doing things like deli work, where you're weighing things by the gram or the ounce, the systems don't really accommodate those needs very well.
[09:48] My expectations for automation in the industry are really not to reduce workload as much as they are to increase consistency and reduce human error. If you're receiving inventory, you don't want lots of manual steps that could lead to incorrect entries depending on who's doing it. We want to manage inventory processes, which, as you can imagine, is pretty complicated in a butcher shop because we don't get ribeyes; we get export rib plates or something like that. We get a sub-primal, and we have to cut it up into pieces. One of the things that comes out of that is a ribeye.
[10:48] You end up having inventory that you shrink out of inventory for internal use and then produce new types of inventory. If you don't have that inventory right, you can't really sell it online. It's easy in the store because someone walks up to the counter and asks, "Do you have any ribeyes?" and you can say, "No, but I can cut you some," or "Yes, they're right there." But in an online store, you don't have that conversation. You need to bridge the gap between the old-style service and accurate inventory online. I will say we haven't solved that problem, but we're a lot closer than most organizations are.
[11:48] This is a lot about automating the processes so that a butcher can avoid mistakes in data entry. The last thing you want is someone operating a knife who also has to ensure that the pounds and ounces of the items they've just produced are accurate in the system right after they cut them, which is pretty insane. We want to automate those systems. As any software engineer knows, you did not build a product for me. Hopefully, you have a back end that can accommodate what I want, but I really need to bridge that gap through API usage.
[12:12] We use four different products for that. Square, which we are starting to adopt more, has an excellent API and great documentation. The other three are tragic tire fires.
[12:16] So Square is the payment system, right?
[12:26] Square is a point of sale, payment system, and retail system. They do a lot of different stuff. I think they also handle HR, like payroll and things like that.
[12:30] We use a company called IT Retail, which powers a whole bunch of point of sale systems that deal with weights and measures. That's sort of one of their sweet spots. The product sort of works, but the API is undocumented and can change at any time. I believe the back end is written in SAP. The database is distributed, which is really interesting tech, but the product development and software engineering on their side do not adhere to modern standards for what people want to consume.
[13:48] We use another product called Local Express, which really doesn't have a public API either. They've been great to work with, but it's just difficult to do anything automated at all, and you never know when the API is going to change because it just powers their React app. We also use a set of scales from a South Korean company called CAS, which apparently has 32-bit Windows DLLs and headers that look like they're from 1980, documented in Korean.
[14:49] So those are the systems that we need to glue together in a way that reduces workload, increases consistency, reduces human error, and reduces turnaround time for change. If you change the price of an item, it should show up on the scales without waiting for tomorrow morning to reload them, which is pretty common in those environments. You've got this Windows app that you load an Excel or CSV file into, and then you flush the scales and reload them all every morning. The old-school people just punch the digits in, and it's insane. They spend hours a day reprogramming their scales and looking up codes for all their items.
[16:52] So, the CAS scales are a Win32 app, which is a pain. They were very cooperative. I wrote to them and said, "Look, I'm really trying to do something here. Can I have the code?" They sent me some DLLs. They are 32-bit, unfortunately not 64-bit. After about a month of work in Rust, I managed to get it all to work with the correct structure packing to match the Windows structures that their Windows C API expects.
[18:33] For the DLLs, did you throw them into IDA or some other disassembly tool, or how did you approach that?
[18:41] No, I used Rust to load and run the DLLs. The APIs expect the structures that are passed in to be packed in a very specific way, and the header is not clear. I don't even know what compiler they used for that.
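A minimal sketch of the shape of that approach, assuming the libloading crate and a 32-bit Windows build target; the DLL name, exported function, struct layout, and stdcall calling convention are all hypothetical stand-ins for whatever the vendor header actually defines:

```rust
// Sketch of calling into a 32-bit vendor DLL from Rust. Everything named here
// (CasScale.dll, SendItemToScale, ScaleItem) is invented for illustration --
// the real CAS header defines its own types, packing, and calling convention.
use libloading::{Library, Symbol};

// Match the C side exactly: packed layout, stdcall on 32-bit Windows.
#[repr(C, packed)]
struct ScaleItem {
    plu: u32,
    price_cents: u32,
    name: [u8; 40],
}

type SendItemFn = unsafe extern "stdcall" fn(*const ScaleItem) -> i32;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut name = [0u8; 40];
    name[..7].copy_from_slice(b"RIBEYE ");
    let item = ScaleItem { plu: 1234, price_cents: 1899, name };

    unsafe {
        let lib = Library::new("CasScale.dll")?;                     // hypothetical DLL name
        let send: Symbol<SendItemFn> = lib.get(b"SendItemToScale")?; // hypothetical export
        let rc = send(&item);
        println!("scale returned {rc}");
    }
    Ok(())
}
```

Getting the #[repr(C, packed)] layout to line up with whatever the vendor's compiler produced is exactly the fiddly part described above.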
[19:01] But if you load it in Rust, you get symbolized output, so you know the function names, at least.
[19:06] I do. I have the header. It has the C API functions. It's just that you sort of don't know what they mean.
[19:13] So I have the signatures, and then you probe it a lot and hope you can work through all the access violations and segfaults. It was my first experience writing any sort of production code on Windows, which was daunting at first, but Rust made it very comfortable.
[19:33] Nice! Windows as a production platform can be surprisingly common.
[19:43] My biggest problems with it are just really getting it to behave itself on boot. One of the things that I think most Unix systems are extremely good at is coming back up quickly to a known good, minimal state of service operation. I'm sure there's a way to do that in Windows, but everything I look at is like, "Oh, click in this menu." It's like, "No, no, not what I want."
[20:11] They just had this security problem where, after a kernel update, machines wouldn't come back up after the reboot. I forget the exact details, but that was pretty horrific.
[20:22] I have auto updates on, and somehow it updated to a point where it said "Welcome back to Windows" and wanted me to click through some screens. The system was offline for like a week while I wasn't in the shop. I'm not going to put a KVM in a butcher shop; it's just nuts. What I really want is a little Raspberry Pi, but I have 32-bit Intel DLLs that I can't easily run. I've tried to run them under Wine, and after a day of just trying to brute force it, I could not get that to work.
[21:38] Very sharp edges. Yeah, I found that Rust has been extremely good for cross-compilation. For me, it's always been possible, and I did cross-compilation with GCC and Clang in the past. It's doable; it just requires rolling up your sleeves and getting grease all over you. Whereas with Rust, you just do the two lines to add a target, and you're done. It's like, "Oh, yeah, that works. Let's do it this way." I think Go is very similar. I used to cross-compile because we ran illumos a lot, and I would cross-compile to Linux and then ship them up to production that way.
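For reference, the "two lines" of a typical Rust cross-build look roughly like this; the target triple is an example and depends on what you're shipping to (the 32-bit Windows case from earlier would use i686-pc-windows-msvc instead):

```sh
rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl
```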
[22:20] Any illumos running in the butcher shop at this point?
[22:27] No, I think I have an illumos box running somewhere, but not in the butcher shop.
[22:36] And for tying together all the services, is there some web stuff that people can operate, or is it more behind the scenes, feeding the other systems?
[22:46] So far, it's been mostly behind the scenes—automated jobs doing customer list syncs, product list syncs, and inventory syncs. For example, one of the cute things is the loyalty program that we have. We didn't like any of the loyalty programs for the products; they seemed very specific to that product. So if we were going to switch back ends from, say, IT Retail to Shopify or Square, we would have to change our loyalty program, and I was not excited about that. So we designed a loyalty program that gives you a percentage discount based on how many dollars you've spent in the last 180 days. One of the automated processes pulls all of the purchases from all the different platforms, creates a source in a Postgres database, and then updates the discount level for all the customers across the different platforms to match based on that.
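As a rough illustration of that discount logic, here is a minimal Rust sketch. The thresholds, tier percentages, and customer IDs are made up; the real job also pulls purchases from each platform into Postgres and pushes the resulting discount levels back out through each platform's API.

```rust
// Sketch of the loyalty calculation: a percentage discount keyed off trailing
// 180-day spend. Tiers here are invented for illustration.
use std::collections::HashMap;

/// Map 180-day spend (in cents) to a discount percentage.
fn discount_for_spend(spend_cents: u64) -> u8 {
    match spend_cents {
        0..=49_999 => 0,        // under $500
        50_000..=149_999 => 3,  // $500 - $1,499
        150_000..=299_999 => 5, // $1,500 - $2,999
        _ => 8,                 // $3,000 and up
    }
}

fn main() {
    // Per-customer spend over the last 180 days, aggregated upstream
    // (in the real job this would come out of the Postgres source table).
    let mut spend: HashMap<&str, u64> = HashMap::new();
    spend.insert("cust-1001", 37_500);
    spend.insert("cust-1002", 212_040);

    for (customer, cents) in &spend {
        println!("{customer}: {}% discount", discount_for_spend(*cents));
    }
}
```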
[24:23] A lot of that has been out of band, sort of behind-the-scenes routine jobs, but now I've started to add a web interface to allow the butchers to dig into records. One of those systems is for the cut-in, cut-out process. Before, they would take, say, a beef tenderloin out of the fridge, break the cryovac, weigh it, and then shrink that out of inventory. They would cut it into filets, weigh the filets, and then receive that back into inventory.
[25:00] We do that by switching the scale to print special barcodes. Unfortunately, IT Retail can't read a barcode with weight on it unless it's selling it, so you can't find the product. There are lots of weird sharp edges with these old products. One of the interfaces is a web app where you can just put it in the internal use column and scan all the barcodes. As the butchers are doing this, they'll take a couple of things out of inventory, print labels, and put them in column one for internal use. As they cut stuff up, they'll put it in column two for items to be put back in. Then, as stuff gets thrown away because it's either spoiled, too old, or doesn't look good, or due to shrinkage from cutting fat off of meats that won't be used in burgers or grind, it goes into column three for spoilage.
[26:23] This replaces the manual effort of scanning barcodes and typing in weights. The funny thing about the IT Retail system is that when you shrink things, if you scan your tenderloin and enter 14 pounds, and then you have another one, you scan that and enter 15 pounds, it doesn't add them; it replaces. If you're not careful with the things you're scanning, you will actually replace the previous entry. You'll think you have stuff in the fridge when you don't. This fixes all of those problems. The weight is encoded into the barcode, so you just snap, snap, snap, snap, snap, and hit go, and then it does everything. Those are the types of automations that you can glue together with a little bit of rough web design and some reverse engineering of people's APIs.
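For a sense of what "the weight is encoded into the barcode" can look like, here is a rough Rust sketch assuming a type-2 random-weight UPC-A layout: a leading 2, five item digits, five digits of weight in hundredths of a pound, and a check digit. Real scale configurations vary, so the offsets are illustrative only.

```rust
// Rough sketch of pulling the weight back out of a scale-printed barcode.
// Assumes a 12-digit type-2 layout: '2' + 5 item digits + 5 weight digits
// (hundredths of a pound) + check digit. Treat the offsets as an example.
fn parse_weight_barcode(code: &str) -> Option<(u32, f64)> {
    let digits = code.trim();
    if digits.len() != 12 || !digits.starts_with('2') {
        return None;
    }
    let item: u32 = digits[1..6].parse().ok()?;
    let hundredths: u32 = digits[6..11].parse().ok()?;
    Some((item, hundredths as f64 / 100.0)) // weight in pounds
}

fn main() {
    // e.g. item 01234, weighing 14.37 lb
    if let Some((item, lbs)) = parse_weight_barcode("201234014370") {
        println!("item {item:05}: {lbs:.2} lb");
    }
}
```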
[26:52] Now that's beautiful. That's kind of the sweet spot for a lot of things. Bad English is the language of business, and PHP is its equivalent in software. That kind of rough web engineering makes a lot of processes work.
[27:26] Maybe we can go a little bit back in history. When we first met, you had already founded two companies and had a ton of experience as a software engineer. I was always surprised by how well you knew all the different Unix tools and how much time you probably spent reading all those man pages. Could you walk us through your career as a software engineer, how you first got into it, and how you developed this proficiency in the Unix environment?
[27:26] Sure. I think the core of that secret power is curiosity. I started at university as a physics and electrical engineering major, but I kept taking computer science courses very aggressively and avidly. I ended up meeting the requirements for a computer science major at Hopkins and graduated without an electrical engineering or physics degree. At Hopkins, once you've satisfied the requirements for one major, you graduate and are no longer a student there. If you're going to double major, make sure you finish in the same semester.
[28:06] Interesting learning.
[28:10] Yeah, so I scrambled and went to grad school there because I had already rented an apartment and everything. They let me go to grad school to pursue a Master's and PhD, and I worked in the computing lab as a systems administrator for extra money. It was an interesting zoo being a systems admin at a university because you have people who have no idea what they're doing, people trying to exploit the system, and a wide array of users. You have users that are always smarter than you, right? When they ask you, "Can you fix this? It doesn't do what I want," you're like, "I don't even know the system could do that."
[29:15] This was in the mid-90s, and we were running Solaris 2.5, 2.6, and then 2.7. During that time, a whole bunch of the students I was with graduated and went into the booming internet industry.
[30:06] So that was the initial bubble, and there was more work than anyone could do. When they ran into problems, they would always say, "Oh, Theo's smart," and call me up. I ended up starting a consulting company to help these people out and got a tremendous amount of work solving people's problems. Someone would say, "My Oracle database is down. It's running on a Sun E4500 with a NetApp or fiber channel array attached to it. Here's the root password. Please save me." Those calls came in at all hours, and I had to get things back online.
[31:27] This included Cisco equipment, Foundry, Brocade—everything in the data center environment, all pre-cloud. You had to know how everything worked, from networking layer two and layer three to storage arrays, hardware kernels, and software on top of it. In the late '90s, stuff worked surprisingly well, but it never really did work perfectly. When someone is paying you a lot of money and anxiously waiting for something to get fixed, you can't say, "I don't know how that works" or "I don't know how to fix that." That's where curiosity kicked in, and all of the learning happened.
[31:34] Did you have any experience with computing or computers before grad school?
[31:34] I had a lot. I wrote a flight simulator in fifth grade in BASIC on an Apple IIc. My first exposure to C was with a bulletin board system that ran on MS-DOS or OS/2. I ran a bulletin board and had a couple of phone lines coming into my computer. I was a total geek in school, and that's how I learned C, which is probably why my C code was so bad for so long. Not that the C code was truly bad, but it was in a generation where good programming practices hadn't really been developed for C.
[32:06] If you look at the old K&R book, just don't write code that way. That's my answer. I don't think that's a good example of code structure. The code design is not very good. Nomenclature is not very good. People don't name things right. They over-comment things that don't need comments, like commenting a for loop that goes from zero to max. The for loop already says that, so it gives off, I don't know, not amateur vibes, just early, green vibes.
[32:26] When I went to college, all the professors thought they knew everything. The worst code I ever saw came from university professors. Everything was a global variable, everything was a static variable. Most of it was a zip file or a tar file, and every time they'd make a change, they'd copy it to another directory. There are still companies that do that, where you've got files named something.bak.bak. When you go into that, it's like stepping through a portal in time. We have such good tools now for change control, consistency, management, and integrity across systems, and it's just so easy to use them.
[35:24] After I got out of university, I started to write software and get peer reviews through open source, through the Apache Software Foundation and other projects like Postgres. I started to understand what better code looked like, what better software development practices looked like, and what better design change control looked like. Not just change control in the sense of who changed this file and what line, but how should we change these things? I have a set of changes to make. How should they be separated? Don't make cosmetic changes at the same time as functional changes.
[35:24] I think that really helped me evolve my respect for clean, concise software development practices.
[35:24] Just for the listener, I have read quite a bit of your code, including quite complicated things like the heartbeat context you wrote for the alerting engine and the replacement of the second alerting engine and message broker called FQ. You think, "Okay, this is a complicated system," and then you read the code one file at a time, and everything is pretty straightforward and simple. You always wait for the point where it gets complicated, and it never really comes. It's all relatively straightforward. The data structures are at the top, then a ton of functions are written in reverse order, so the main comes at the very bottom, and the dependencies are over the functions that use them. It's pretty verbose; it's not super DRY in the sense of "Don't Repeat Yourself." Specifically, all HTTP interactions are always like 20 lines of boilerplate, but it just integrates extremely well. It is still very easy to read, and changes are surprisingly straightforward. There's never a lot of complexity at the same place. For me, it's really an ideal of how good software engineering should look. It should not be the Perl experience, where the lines are super clever. It should be unsurprising.
[37:10] I have not ever seen your Perl code.
[37:12] I was the king of Perl one-liners. Well, no, probably not the king—one of many kings of Perl one-liners. But there are a lot of things you can do with large changes in systems using some sort of recursive perl -pi -e one-liner. I love regular expressions. I still use them, not as frequently as I used to, but I love the simplicity and directness of regular expressions.
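For anyone who hasn't seen one, that kind of recursive one-liner has roughly this shape; the file pattern and the substitution are hypothetical, just a rename applied in place across a source tree:

```sh
find . -name '*.c' -print0 | xargs -0 perl -pi -e 's/old_name/new_name/g'
```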
[37:40] Most software engineering has these tenets and rules that are constantly misapplied by engineers with less experience. The classic one is, "Premature optimization is the root of all evil." It's not the whole quote; you have to read the rest of it. The tricky part is understanding what's premature. Sure, premature optimization is a big waste of time, but if it's not premature, it's not a waste at all. In fact, it's a great gain.
[38:05] That goes for "Don't Repeat Yourself." You haven't repeated yourself until you've actually repeated yourself. So abstracting everything away when it's only used once is the dumbest thing I've ever seen. You didn't repeat yourself, but you moved it into 15 different files, and you made a generalized API that just tracks whatever changes you make because you're the only consumer of it.
[38:05] I think it was Adam Leventhal who said, "In order to generalize an API, you have to have at least two consumers." The reason is that even with one consumer, you're just serving yourself. That's not useful, which is why a lot of the code you read of mine is very straightforward. When I was writing it, there was one consumer. There might be future consumers, and I will refactor it once I hit three. When I copy it to the second one, it's probably going to change. I don't know if those changes are generalizable yet until that third person comes along, and then you start to see that really gel.
[40:05] DRY is a great example of something that people do when it's absolutely unnecessary. You could skip all of this abstraction and still not repeat yourself, because you're the only consumer of your own stuff.
[40:16] It breaks the flow in a lot of situations, and it introduces a lot of names of entities that you don't really need to name, creating so much overhead.
[40:16] I feel like the entire industry of enterprise Java is about creating files with complex, long names where nobody understands how they're used. It's just so unnecessarily abstract. One of the big problems with abstractions shows up in performance work. Most of my time was spent at OmniTI helping the Alexa top 100. I think we serviced about 15 of the Alexa top 100 during the life of the company, and it was all high-scale, distributed systems, all high performance.
[41:50] A lot of times you're in there, and you're like, "Why is this taking 48 milliseconds? It should take no more than 20." In order to ask that question, when you have abstractions in there, it makes that job so much more difficult. I always appreciate ugly code that's more direct because it ends up being easier to grasp the profiling of that code. It doesn't do things unexpectedly. I spent most of my time writing code that is very direct, and then on the occasion that my code is very useful, it ends up being abstracted away, hopefully in a way that's useful and not harder to debug.
[42:16] That's a good approach, a good way to look at it. Thanks for creating the context. I think a lot of these things are in flux. The DRY principle has come full circle; people are aware that you shouldn't be completely non-repeating. The Java world has now seen Scala, Clojure, and Kotlin, so they're trying to get away from the kind of Java culture.
[42:20] The Java world still exists, but it's not the environment I'm currently in. We have a lot of Go at the company I work for. There was this radical agility phase where they just allowed everything, which made it a lot more colorful. But we definitely don't have this kind of Java culture where everything is super architected.
[43:16] Coming back to your earlier experience, which shaped your style, the flight simulator is a good starting point. You mentioned the ASF and some projects you were involved with, including Postgres and some peer review. What would you name as the biggest influences in terms of codebases or peers that you had early in your career?
[43:16] I'm not sure that there's a big influence in particular. I ended up reading a lot of code. When I was working at OmniTI, I was doing a lot of high-performance profiling, and DTrace came out. It turned out that DTrace allowed me to ask questions I had not thought of asking before without code modification. That was the fundamental difference. Typically, you write code and think, "I wonder how long this takes." You have to add timing in, start, finish, printf, or whatever.
[44:37] DTrace allowed me to ask questions like, "I keep having slow accesses to this file. Which processes are reading which disk blocks, and on which drives are they hitting?" In software that I did not write, DTrace trivially allowed me to ask that question. The only way to get that is usually to recompile your kernel, put a lot of print statements in your virtual file system or device drivers, and then restart your machine.
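For flavor, that question maps onto DTrace's io provider with something like the following one-liner (run with root privileges on a DTrace-capable system; the aggregation prints when you hit Ctrl-C):

```sh
# Which processes are issuing reads, against which device and which file?
dtrace -n 'io:::start /args[0]->b_flags & B_READ/ {
    @[execname, args[1]->dev_statname, args[2]->fi_pathname] = count();
}'
```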
[44:56] Now eBPF allows you to do the vast majority of this stuff, if not all of it, but at the time, that didn't exist. Sadly, due to the licensing issues that are common in open source, DTrace didn't get adopted on platforms other than FreeBSD. I have a soft spot for DTrace because you could expose it to users without superuser privileges. It was such a safe language and runtime.
[45:37] I ended up writing user probes for the Apache web server and for Postgres. I provided some tooling, which I don't even think people use anymore, but I don't think you can do the stuff we did with Postgres back then today without running on a DTrace box. It's really interesting. In that process, I started instrumenting other things. I instrumented tons of code. I read C code, C++ code, did a lot of PHP code, and probably the cruelest thing I ever did was debug large Ruby apps. That was a certain place in hell.
[46:40] I ended up consuming massive amounts of code, and I think that was more of the influence. I got a feel for what was easy to debug, what was easy to instrument, and what was easy to observe. That mostly influenced my elective decisions in programming style because I wanted to write things where, when I went back to them later, I didn't hate the author.
[47:40] There's so much to unpack here, and one of the things that impressed me most about working with you is your ability to debug things. Instead of trying to restart or reinstall everything, you would drill down to the root of the problem. I remember when I first tried to install the Circonus database on my machine; I had all kinds of library resolution problems. You would look at the libraries and see exactly what was missing, what version it expected, and then you would get to the root of it.
[49:03] I think that my debugging experience is informed by two things. One is that I spent seven years in university studying computer systems. I built a kernel, I built a compiler, and I worked on multiple instruction sets. I took computer systems architecture, networking fundamentals, and all of these things. I retained a lot of that information, so I feel like I have a very good understanding of how computers work.
[49:40] The other thing is a puzzle-solving desire. I don't like not understanding something. It's a visceral urge to not let something go when I don't understand it. If I fix something and don't know why I fixed it, I can't let it go. I still have to figure out why what I did fixed it, which can be hard sometimes because the system dynamics change, and you'll never actually know.
[50:00] Those two things are fundamental components. You have to have a yearning to really understand and a curiosity to understand, and you have to put in the time to understand the full stack. You need to understand how files exist, how JavaScript gets read into a VM, how it gets just-in-time compiled, and how the garbage collector works. You have to be patient enough to get all of that in your brain.
[50:40] The approach to troubleshooting is really about asking good questions. I gave a talk in Dublin at an Apache conference called "Advanced Production Troubleshooting." It talks about how to diagnose problems with web servers. A classic problem is when you spin up a web server, configure it, and try to load a page, but you don't get the expected result.
[51:00] My first question is, "What is it doing?" You need the tooling to see what it's doing. The first thing it should do when I load the web page is accept the inbound connection. If I hit the web server, there should be a connect on my side and an accept on the other side. Is that happening? You can use tools like strace on Linux, truss on Solaris, or ktrace on FreeBSD to see the system calls that are happening.
[51:40] When you look at the output, you can see all the files it's loading, all the directories it's touching, and if you see a permission denied error, you know that's probably the issue. It's really about asking the system what it's doing and then asking why it's doing that. If you get deep enough into that, you may have to open the code, which is why open source is so important. When it misbehaves, you can go in and find some weird predicate somewhere.
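On Linux, that first pass might look something like this; the PID is a placeholder, and the exact syscall names and flags vary by strace version and by how the server accepts connections:

```sh
# Attach to the running server process and watch connections and file access.
strace -f -p <pid> -e trace=accept4,connect,openat,stat
# An openat() that returns -1 EACCES (Permission denied) on a document-root
# path is very often the whole answer.
```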
[54:47] My Windows debugging experience is sadly incompetent. Windows does have DTrace, which is kind of cool, but I don't have much experience with it.
[59:19] Five years. Okay, well, that's fairly recent for DTrace on Windows. The article is from last month.
[59:23] Have you tried using it?
[59:32] I have. It was slightly tricky. It does function boundary tracing on the kernel, which is challenging because you don't have a lot of kernel insight. It does syscalls, which are the NT OS system calls, and you can trace processes and things like that. It also does event tracing on Windows, which is actually really cool, but I don't know much about that. So, I would say people who run Windows and are really interested in learning to debug should check out DTrace; it's a very cool tool for that. I try not to have problems on Windows. I don't think I'd ever run Windows in a complex, high-performance environment where I was on the hook for making it work. The butcher shop is not such a thing.
[1:00:31] Yeah, that makes sense.
[1:00:35] I truly appreciate having the source code for the kernel. When OpenSolaris happened, that's when DTrace became like the Holy Grail because I was no longer stuck at the kernel boundary. I could actually look inside and see what the code was doing and try to debug that better. I ended up fixing some issues; I think I have a patch or two in the illumos kernel because of that.
[1:01:10] What areas of the system did you dive into first when the source code became available? Where did your curiosity lead you?
[1:01:31] I dug into network drivers a little bit because I had some consistent problems with those. I looked at the virtual memory management system because the mmap function on illumos did not do some of the things that the Linux one did, and I needed to map those in. It was actually for LuaJIT, to make it work a little bit better. One of the nuances of the illumos user space is that the process space starts at zero from a heap allocation point of view, but the stack starts very, very high in 64-bit processes. They're not contiguous segments, so it doesn't matter, but if you're using pointer packing, that's really bad because the pointers don't share the same leading bits.
[1:02:31] You need to know which bits you can mask or which you can't, right?
[1:02:35] You can't mask any bits because they're all different. I made it so that there was a—I forget how I did it. I also patched LuaJIT to work better that way. I don't think that was ever accepted back, but I remember some load scripts you gave us to make LuaJIT work on certain configurations. I had a colleague with a similar background, and we were just staring at four lines in a linker script, which we had never heard of, and suddenly it worked on illumos.
[1:03:16] That was a linker map script that forced the heap into a different spot.
[1:03:24] That was complete black magic. We thought of ourselves as fairly sophisticated engineers, and it was just like we got this four-line script, and this elusive problem went away. It was one of the more memorable situations.
[1:03:40] Again, it's just the same troubleshooting principles: "Hey, this is segfaulting. Why is it segfaulting?" It's because it's accessing a page that isn't right. "Why is it accessing this page that isn't right?" Okay, ask the same questions on Linux. "Oh, it's different pages." Now I see what the problem is. "How do I make it do what it does on Linux?" There was a lot of digging around, and I finally found that linker mapping stuff. I find the intricacies of systems fascinating. Kernel engineers are oddly brilliant, and I enjoy chasing their tails.
[1:05:24] There are some things on Solaris, for example, that allow you, at execution time, to determine your chipset capabilities and superimpose code that may or may not have been compiled in an optimized form for that extended instruction set. If you had AVX-256 or something like that on Intel, you could have a version for that. You didn't need to know about it in the binary, and you didn't need to write any code inside your binary to test it. You could just say this code should run, and it would link into that if the chipset had it. It was really neat. My first question was, "How does it do that?" That's so cool! So then you're diving into the linker and all that stuff. Curiosity drives it all.
[1:06:53] These are really the dark niches of UNIX programming—the loaders and the linkers that make these processes work.
[1:07:36] I have to say—and I won't make any friends for saying this—but I really dislike Go's choice of not adhering to ABI standards.
[1:07:50] They sprinkle data all over the place, right? All the coroutines...
[1:07:55] They use split stacks. They don't use the x86-64 ABI for doing their stuff. So it means that if you're going to call into C, for example, you have to run in C mode, or it's just like, "Yeah, I get you're being clever," but the cost of that cleverness is unwarranted, in my opinion. That particularly frustrates me about Go. Other than that, I like the language itself. I think it's a really nice language. It's easy to get stuff done, and you tend not to have the same type of bugs you have in other languages. The same reason I like Rust. Once you get the hang of the language, there's a whole class of bugs that happen in C and C++ that kind of just disappear unless you're an expert C++ programmer.
[1:08:56] I watched a podcast with Mitchell Hashimoto, the HashiCorp founder, yesterday. After leaving HashiCorp, he wrote a terminal emulator called Ghostty, and it's written in Zig. He was a big Go proponent but chose Zig for this project, which I found interesting. His reasoning was more about the enjoyment of the personal experience around it. I think Go can be a little verbose at times.
[1:09:35] I think a lot of people are just looking for something new. Everybody wants to touch something new. I think Go is a very good tool language, but it's not a great systems language.
[1:09:50] That makes sense. I think it's actually a poor systems language because, if it's a systems language, I should be able to move from user space into kernel with some effort, and that is exceptionally hard with Go. The reasons that he cited for using Zig are a lot of the reasons that people use Rust. People use Zig because they don't like the Rust community and think the Zig community is awesome, while there are a lot of people who use Rust who think the Zig community is toxic.
[1:10:24] I'll leave that one there, but I think Zig is a very interesting programming language. It has a lot of advantages like Rust does. I think the borrow checker in Rust is a lot to swallow, but once you've got it, it's cool.
[1:10:56] When you compare your experience with Solaris and Linux, reading into the source code and understanding on a deeper level how things are solved, how do you look at those two systems? How hard was it for you to move to Linux eventually?
[1:11:17] To be fair, I started on Linux, really. I started on Linux with Slackware, I think it was in 1995 or 1996. I had 38 three-and-a-half-inch floppy disks that I installed it from. It took forever. I had it on my 386 or 486, whatever it was. Then I logged into the Hopkins general-purpose machine, which ran IRIX. I thought it was the coolest thing ever because I couldn't get access to it to run on my own, and it was the thing that was over on the other side of the fence. It seemed cooler.
[1:12:30] When I joined the CS group, they had Irix running on their workstations in the lab, and they had Solaris running on their backend servers. I don't know; I just fell in love with it. I used VMs as well. Personally, I have a distaste for VMs; I don't really like them. But I fell in love with Unix. It made sense from an operational perspective. I liked being at the shell. I really liked being able to pipe things together. It felt very natural to me. I understood what I was asking it to do, so I wasn't surprised when it did weird things.
[1:13:32] I liked Unix a lot, and then I took an operating systems class where we had to write a Unix-like operating system. When I later helped teach the operating systems class at Hopkins, we taught using Linux. That was pretty cool. We had students rewrite and change the virtual memory system. Some people wrote their own file systems. It was a really cool class; it was very practical.
[1:14:32] That's amazing. One of my math professors always said there are only three times in life when you can really learn something: when you're learning it for the first time, when you're tutoring it, and when you're teaching it as a professor. I can imagine that tutoring this class really allowed you to go to the next level on this stuff, designing the exercises and so on, right?
[1:14:57] Yeah, it was interesting because I ran some of the classes. Some of the classes were out of the classic operating systems book, the Dinosaur book, which was the classic instructional operating system book from my time. I think it's Silberschatz and Galvin.
[1:15:30] Yeah, it's Silberschatz and Galvin. Oh, my memory is not so bad! That book was great, and it talks about operating system concepts, but the Linux kernel looked nothing like that. When you were teaching this and then went into the Linux kernel—come on now!
[1:15:50] One of the big differences is between looking at the Solaris source code, which I may or may not have had access to before OpenSolaris, and the Linux source code. I have a ton of respect for Linux; I have more systems running Linux than any other thing. But back in the day, a lot of it looked like someone threw stuff at a wall, and if it stuck, it was in the Linux kernel. The code is much better now, but there's still some nasty stuff in there.
[1:16:18] The Solaris code looked like it had more draconian software engineering policies and practices in place, and you could feel it when you were reading the code. It felt like stuff was where it was supposed to be, and it did what it was supposed to do. The comments were incredibly good. If you ever want to see good code comments, the OpenSolaris and illumos kernel has phenomenally good comments about the overarching design principles of certain subsystems, why they work the way they do, and how they work. The comments match the code, for God's sake! That was wild. It was the first time I'd ever seen anything like that. I was like, "Oh, it is possible to have comments that match the code."
[1:17:10] What about the Postgres codebase? I thought the comments there were pretty good, right?
[1:17:32] The comments there are good; they match the code, but they're just not as good. They're at a different level. They're either at a level where I don't know anything about databases, so I'm reading the comment, or I know everything, and it's a nuance, but it doesn't really talk about design principles and interactions with operating systems. I would expect the VM system for Postgres to discuss writing direct I/O files or non-direct I/O files—why it does it that way, why it uses the subsystems, and why it works that way on Linux or FreeBSD.
[1:17:57] I don't think, at least the last time I checked, the comments really covered the stuff that I find interesting. I guess it was just a personal alignment with the illumos kernel. When I went in and read the comments, they answered the questions that I had.
[1:17:13] I will say that I don't model that all the time. A lot of my work was consulting where someone says, "I need this fixed in three hours. I don't care what it looks like." I can either say, "Oh no, I'm not going to do that," or I can get paid. So I tend to err on the side of doing the job and getting paid, and I'm not always proud of my work, but I'm almost always proud of the outcome.
[1:17:39] Yeah, and that's also good engineering practice, right? You need to know the requirements and understand when to use duct tape and when to use a power drill.
[1:17:50] Exactly. After this conversation, I really understand better why you see things a certain way and where your experience comes from. I really like that you come from Electrical Engineering, having worked on a C64 and an Apple II, where the systems are so simple that you can literally understand the electrical layout of the components. There's this wonderful series from Ben Eater on YouTube where he builds breadboard computers. I absorbed that content, and for the first time, the dream of really understanding the machine down to the electrical wires became feasible. I now have a much better understanding of how a CPU really works.
[1:19:15] Both of us are fortunate to have witnessed a time when machines were really simplistic, and you could understand the compilers and so on. My daughters are raised with the iPad; they will have some computing experience. They already have some command line experience in Linux, typing in commands to play games, but I feel like the stack has gotten a lot more complicated. Specifically, the iPad and iPhone have more different CPUs in them than you have fingers on your hand, all interacting with each other.
[1:19:50] Yes, it's very complicated to understand that. I would say the biggest change is not so much the increased complexity in single systems, as those are effectively abstracted away by a kernel, and most people sit above the kernel. It's the distributed systems in the cloud because you don't even have visibility anymore.
[1:20:28] Right. The whole distributed tracing thing is so coarse-grained, based on function boundaries that some engineer decided should be the spans. The problem with distributed tracing is that you can't ask questions you didn't think about asking later. If you hadn't thought about asking the question, there's very little hope that you'll be able to answer it. So it fundamentally changes the system. What I've been wanting forever is something like DTrace for distributed systems.
[1:20:54] Yes, distributed DTrace.
[1:20:56] The problem with that is that I probably won't be able to quote myself very well, but I would say the reason I quit grad school was that I hated contemplating and debugging distributed systems. Then I started a company and ended up contemplating distributed systems, which is super unfortunate. Distributed systems are incredibly difficult to reason about. The debugging techniques are not just about digging and asking questions; it really is about trying to understand some concept of state across the system. You need to understand how coherent that state probably is and then reason about how that state was arrived at.
[1:21:46] Those are the challenges in distributed systems. In the simplest case, I have a distributed system modeled through a distributed finite state automaton.
[1:22:20] Yes, a finite state automaton, essentially.
[1:22:25] Right. You end up having a model of your computer program based on state transitions. All of the distributed system work I did was around consistency, coherency, and group communication—basically building coherent state across distributed systems. You have an automaton that takes you from one state to another, ideally running the same code on every machine, so you have that copied everywhere. The things that actuate the movement between states are messages sent from that state machine to other machines. The question is, how did we get here? If we sent that message and it showed up at four out of five systems but not the fifth one, it's those sorts of mental models that are mind-breaking and hard.
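A toy Rust sketch of that mental model—one state machine copied onto every node, moved along only by messages, with divergence appearing as soon as one replica misses a message. The state and message types are invented purely for illustration:

```rust
// Toy model of the "replicated automaton" view: every node runs the same
// state machine, and the only thing that moves it between states is a message.
#[derive(Clone, Debug, PartialEq)]
enum State {
    Follower { applied: u64 },
}

#[derive(Clone, Debug)]
enum Msg {
    Apply { seq: u64 },
}

fn step(state: &State, msg: &Msg) -> State {
    match (state, msg) {
        // Only the next expected message advances the state.
        (State::Follower { applied }, Msg::Apply { seq }) if *seq == applied + 1 => {
            State::Follower { applied: *seq }
        }
        // Out-of-order or duplicate message: the state does not advance.
        (s, _) => s.clone(),
    }
}

fn main() {
    // Five replicas start identical; the second message is lost to replica 4.
    let mut nodes = vec![State::Follower { applied: 0 }; 5];
    let msgs = [Msg::Apply { seq: 1 }, Msg::Apply { seq: 2 }];

    for (i, node) in nodes.iter_mut().enumerate() {
        *node = step(node, &msgs[0]);
        if i != 4 {
            *node = step(node, &msgs[1]); // replica 4 never sees seq 2
        }
    }
    println!("{nodes:?}"); // four nodes at applied=2, one stuck at applied=1
}
```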
[1:23:30] Yes, distributed DTrace would solve all of those problems, but building it would require solving all of those problems.
[1:23:52] The picture I have in mind, based on my experiences, is just a bunch of log files. You have your five database nodes communicating with each other, logging all the relevant events, and then you have five files with timestamps. You print them out or put them on the screen and try to figure out how this could possibly happen. I think of distributed tracing and the event-based observability that Honeycomb is championing as a more evolved variant of this. Instead of writing to a log file, you're centralizing it in the system and using structured text, not human-readable lines, to better index the information you get.
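A bare-bones version of that "five log files, one timeline" exercise might look like the sketch below, assuming each node writes lines that start with a fixed-width, sortable timestamp such as RFC 3339; the file names are placeholders:

```rust
// Read one log per node, tag each line with its node, and sort everything by
// timestamp to get a single interleaved view of the cluster.
use std::fs;

fn main() -> std::io::Result<()> {
    let nodes = ["node1.log", "node2.log", "node3.log", "node4.log", "node5.log"];
    let mut events: Vec<(String, &str, String)> = Vec::new();

    for node in nodes {
        for line in fs::read_to_string(node)?.lines() {
            // Assumes lines look like "2024-05-01T12:00:03Z message ..."
            if let Some((ts, rest)) = line.split_once(' ') {
                events.push((ts.to_string(), node, rest.to_string()));
            }
        }
    }

    // Lexicographic sort works because the timestamps are fixed-width RFC 3339.
    events.sort();
    for (ts, node, msg) in events {
        println!("{ts} [{node}] {msg}");
    }
    Ok(())
}
```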
[1:24:49] Conceptually, I think the industry is moving in a good direction here, but there is this boundary you described before. You have to pre-instrument, and then you're also creating all that data. With DTrace, you're not creating a lot of data; you're just consuming memory locally on the node and dynamically answering questions.
[1:25:24] I think there are different types of distributed systems problems. A lot of the classic microservices interacting with databases tend to be incredibly low in state dependency and are causal. A user makes a web request, which induces a message into Kafka, which then induces a Postgres query. That's easy to think about. But when you're doing something like an actual distributed database, and you say, "I'm going to do a search query," that depends on the nodes' perceived state of the cluster. The nodes may not see the cluster the same way due to intermittent communication issues or latency.
[1:26:36] For example, you could have a 32-node cluster where 25 nodes think there are 30 nodes, four think there are 31 nodes, and three think there are 32 nodes. You're trying to reason about which nodes thought which things at which time and why the composed answer looked like that. You're already looking at it because the answer is not right. If the answer was correct, you would be like, "Oh, it works." But you're trying to debug and reason about an incorrect or perceived incorrect state that was arrived at, and it's just really hard to understand everyone's perspective.
[1:27:24] Yes, I've seen glimpses of this while we worked together, like gossip, heartbeat context, and metadata replication between nodes. You see how all this state matters in how a query is interpreted. Working in a Kubernetes environment right now, the interesting bit is that I've never run into these problems again because everything is just REST microservices that are essentially stateless, or the state is separated into a different service. You run your memcached, Postgres, or S3 and just do the state for the tricky bit into a different service, paying a lot of overhead.
[1:28:10] If you have one Postgres server, you don't have a distributed system. You sort of do because it's multiprocessor, but those hard problems tend not to represent the same way. Most people who have more than one Postgres server don't actually run it correctly unless they're using RDS. Then you've delegated all of these distributed systems challenges to Amazon.
[1:28:36] Yes, through RDS, you've isolated those hard problems behind an abstraction.
[1:28:54] I think this is, in some sense, extremely beautiful. Kubernetes makes it great when you can do it. It makes it very hard for you to invite this kind of complexity into your code. You have to architect your application in a way that the pod can go away at any point in time and be recreated. You cannot really manage local state. I have not seen a lot of applications that do cross-talk between parts of the same application, like this heartbeat context. Most of the time, we just rely on the load balancer to decide where to put work.
[1:29:18] Most people don't tend to have problems where their working set size is larger than a single node, and they need low-latency, consistent, highly available answers. If they do, they just jam it into Amazon and pay out the wazoo for it. They don't solve the problem themselves more efficiently.
[1:29:36] Yes, and that's fine if your business model allows it. I'd argue that it's highly recommended if your business model allows it.
[1:29:50] This is the thing I'm also thinking about. The proficiency needed to debug problems in that environment effectively is relatively high, and the problems you may encounter can become mission-critical very fast if your database locks up and you're not able to understand why. I would much rather not have that problem than have to educate a larger developer population on how to debug this kind of stuff. But there's still the question of how Amazon or the platform teams operate these things and what tools they have to do so.
[1:30:24] I also think that in environments where you don't have PCAP permissions or PTRACE permissions in production, it's not really the place where you want to solve these kinds of production issues.
Yes, I'm not interested in solving people's problems when they artificially limit my toolset.
It's interesting to see the attitude back in the day. Just give somebody the root password without any checks, and they could do whatever was needed on the box to get things going.
It reminds me of a post by Fred Moyer about what an Amazon trick question would look like if it were answered by Feynman. It's hilarious because it's the classic scenario where you have three light switches and three light bulbs in another room, and you need to figure out which switch controls which bulb, but you can only enter the room once. It's a trick question because it's a horrible question that has nothing to do with computing.
The answer is to turn on two light switches, wait for a minute, turn one of them off, wait for 30 seconds, turn them all off, and then walk into the room and touch the light bulbs to see how hot they are. The five-minute response is, "Are they all UL certified? Are they wired with hot leads? Are they switched on the load side, or are some of them switched on the neutral? Do I have a current-testing kit?" The interviewer gets more frustrated with them the whole time, and it gets to the point where they say, "Well, they don't necessarily have the same bulbs, so one could be fluorescent and one could be incandescent, and they'd have different heat profiles. You can't trust that anyway."
Interestingly, this is a software engineering job. When I do my job, am I allowed to use tools, compilers, debuggers, analyzers? No, of course not. Why would you ask me an electrical question if you're not going to give me the industry best practice standards for electrical systems? It's a very funny article that was shared in the observability Slack channel.
It's a good framing of this as a craft and an engineering discipline. What are the tools you really need in this kind of environment? For me, it's frustrating. Inspecting an HTTP payload is often surprisingly hard. Try injecting a print statement in a Python script that runs in a container. I can fire up and install my Emacs in production, and that works. APT-get works, but then I want to edit the Python code and load it, and I need to restart Python. Then, boom, the container is gone.
It gets even more difficult when you have tools that implement zero-trust security systems. For example, using OpenZ, which does zero-trust mTLS security systems between nodes with a registry, you can't even offload the TLS. It's deep in the library payload, making debugging very difficult. It is a security solution, so ideally, you shouldn't be able to see the payload, but it makes debugging just very challenging.
So far, there has been very little conversation about how a debugging environment in Kubernetes should really look and what the tools should be for TLS and this kind of stuff. You can terminate in front and use proxies to have a little bit more visibility.
It does feel like the old "Have you tried turning it off and on again?"
This is what you usually do, right? You try this as often as you can. You automate the turning off and on, trying to push every possible problem into business hours so you can usually deploy and use your very slow feedback loops.
What's heart-crushing for me is that it used to be, back in the day, we had a system—I'm trying to think of one we had like this—but we always had a process at a customer that would swell in memory on Friday nights, peg the CPU, and get screwed up. Someone would have to log in and restart it. The customer was like, "I don't want to spend any time fixing this problem. You should just log in and restart it." They'd pay someone to do that every Friday for 15 minutes.
It's demeaning as a software engineer. It's very painful to accept that as a much more economical solution because it devalues your schedule flexibility.
You can also write a cron job that does the restart for you.
When it doesn't turn back on, you don't have your job anymore. You know, we jump to automation, but as they always say, you automate your fuck-ups too. The problem is that Kubernetes kind of makes that the natural, automated state of things. The whole point of it is that when it malfunctions, it restarts.
The worst-case scenario is it pegs itself and locks itself up in a way that it's still live, like a livelock. That's bad. If it just deadlocks, that's not really a problem: the health check goes off, boom, it kills it, restarts it, and everything's fine.
If you're not paying attention, you just don't see these problems, and then you have the business arguing whether it's a problem. I find those tail-chasing arguments annoying because they remove the craft from the process. You should take pride in the quality of your work product. If your work product is a piece of shit that can be restarted all the time and no one cares, then it saddens me. I don't think I can formalize that in a different way than to say it just disappoints me.
Yeah, it's similar. I would argue it's not just the auto-scaling; it's also the configuration and the amount of code I can deploy to Kubernetes without ever having to care about whether it's running or not. It's just always running. It's pretty hard to get a web front-end not to work on Kubernetes. It solves a problem, and I appreciate that. Now, it's a question we have as an industry because so many people have made those choices: how do we get debuggability and observability back in a better sense? The way we are approaching this problem right now is really about how much data we can externalize and send downstream in the hopes of it being useful.
The art of asking good questions and peeling the layers, like the fixture table debugging we talked about before, is crucial. If I have a Kubernetes deployment, how do I stop it and dissect it, especially with multiple moving parts?
I also feel like my perspective isn't that useful for everyone. Most of my time dealing with these problems in my career was spent with people pushing the envelope on scalability. They were effectively the Googles of their time. Google was a customer, as was Apple and some other really large customers. The questions they ask are ones that I think a lot of people just don't care about. For example, some of these requests take 100 milliseconds, and some take 80. Why is there a difference? Most people in a Kubernetes environment aren't anywhere close to being able to answer a question like that.
They aren't close to caring about it either.
Right, but if they did care, in 95% of cases they wouldn't have the access or the knowledge to debug it. And slowness is probably the hardest kind of problem to debug.
Aberrant slowness is particularly challenging. When one out of every 500 requests is 20% slower, and they are the same requests hitting the same server, there shouldn't be any difference at all. Those are hard questions. In my career, I was in a position where my customers cared, but I think the vast majority of customers today do not. Their business value doesn't rely on solving those issues.
You really have to understand where the business value is. There are always companies that are performance- and latency-sensitive, but that's not your typical e-commerce company, and it's not a butcher shop. Different environments have different needs.
Do you see yourself staying engaged with software in the next decade, or are you more focused on the physical aspects of butchery?
Oh, I'm not focused on the physical aspects of butchery other than consumption. I've been programming in Rust quite a bit. I've been making pull requests on GitHub, and I really enjoy solving problems with code. I think I'd like to explore embedded systems a little more. I have some interesting projects in the IoT realm that I'd like to pursue, bringing out a bit more of my electrical engineering background, which is very rusty. I'm comfortable soldering; it's the circuit layouts that I haven't done in a while.
It's fascinating what those little chips can do these days with Wi-Fi and networking.
Yes, I want to see how big of a ZigBee network I can create because I would love to have a 10,000-node network.
That would be really interesting—pushing the envelope.
Yes, I don't think you can do that with ZigBee, but there are always ways. That's how engineers think.
I will be very curious about your experiments in that direction. Maybe another company will come up, or some products we will be able to buy.
I'm advising a few companies that have some interesting products, and I'm excited about some of them. One of them is pretty interested in the OpenZiti stuff.
Ah, is that something like Tailscale?
Yes, but it's embedded in the app. It's like Tailscale, but app-embedded. You can dial into your domain and say, "I need this web service." It connects you, and it's mTLS, secure, and it doesn't matter where you are in the world.
So in this scenario, you install a desktop application that connects to your infrastructure through magic, or is it for devices?
You would build OpenZiti into something like Chrome as the socket protocol. When it wants to reach out to a service on port 443 or port 80, it uses that protocol and has service discovery, endpoint discovery, and connection facilities.
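Roughly, and only as a conceptual sketch rather than the actual OpenZiti SDK: the application dials a service by name, and the library takes care of discovery and mutual TLS. The registry entries, hostnames, and certificate files below are invented for illustration.

```python
import socket
import ssl

# Stand-in for the controller's service registry.
REGISTRY = {"billing-api": ("billing.internal.example", 443)}

def dial(service_name, client_cert, client_key, ca_file):
    """Resolve a named service and open a mutually authenticated connection."""
    host, port = REGISTRY[service_name]           # service discovery
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.load_cert_chain(client_cert, client_key)  # client identity -> mTLS
    raw = socket.create_connection((host, port))
    return ctx.wrap_socket(raw, server_hostname=host)

# Usage: conn = dial("billing-api", "client.crt", "client.key", "ca.crt")
# The caller never deals with IP addresses or inbound firewall holes; that is
# the part an app-embedded SDK takes on for you.
```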
Is it also about not tunneling and getting into places where usual TCP connections cannot, or is it more on the security side?
We could definitely traverse NAT because it's sort of like a mesh network where everybody dials out into the mesh.
Here's a complete tangent. I found it quite interesting at a high level. One of my beliefs is that the internet, like society on the internet, has problems because of identity. I'm not able to say who is making a request, for better or worse. I think that's the biggest difference between the real world and the internet. You can do bad things, but you are always tangibly identifiable in many ways. I always thought that at the root of a lot of problems lies this complete anonymity. You can create a new account, and nobody will know who you were. In certain situations, I think this is not the best way to do it. Tailscale, for the first time, had this idea of connecting to a network through an OAuth layer. I do my GitHub OAuth, and then I get access to a network, having services offered to me in a secured environment.
Who's the one trusting, right? A lot of users trust that they have privacy.
Yes, it's tricky. I think OpenZiti has a similar concept where you have an authentication system that grants you access to the fabric, to the mesh network of endpoints. You know who's connecting to whom, or at least the party being connected to knows the credentials of the party connecting, which is useful in those contexts. It's definitely useful in the corporate world. For me, all the services that I've had to put in a DMZ behind a VPN so they could talk to another system sort of disappear when you have these mesh networks, especially if you containerize them. If you containerize your services so that they're isolated, then even if they're compromised, the attacker doesn't get access to the network at all, only to an isolated network that just has a route out.
Yes, and it doesn't really matter where that runs. There are still security implications; you can't be careless with your design. But it takes away a lot of the intricacies. In the old days, you had a set of intranet apps that people needed to use at home—developers, salespeople, or marketing people. Then all of that went into the cloud, which sort of solved it through things like Okta. You have access to these services through authentication, but all the bespoke services are a pain to integrate with Okta.
True. It would be nice if I could just connect to my Postgres instance based on who I am. I don't care where it is—production, staging, or development. If I have credentials to connect to it, it doesn't matter where I am; I can just connect securely.
Yes, and not only that, because it's secure, if someone is on my machine while I'm connecting, there's still secure communication into the process. They can't sniff the network. They could dump the memory of my process, but that's about as close as they're going to get. It's a nice concept. It's not host-to-host; it's really app-to-app. My Postgres client would talk to my Postgres server, and anyone else on the box would be isolated. It's horrible for debuggability and observability: there are just these two processes talking to the mesh network over TLS, and nobody else can tell what's going on.
Yes, it's layers of indirection. The physical routing still has to happen, and you accept that it's taken care of at another layer. It was always surprising to me that this actually works, because I expected capacity issues. It also works at a large scale; Kubernetes basically works this way in many instances. I was always skeptical about routing all the traffic through particular hosts, but it turns out you can route a lot of traffic on commodity boxes without really running into capacity issues.
In the '90s, it was easy to hit the limits of your 100 megabit full duplex link, but it's very hard for most services these days. Your individual Golang app or Python app on a pod is unlikely to saturate the 10 gig or 40 gig link on the box. The capacity has outpaced the performance of people's software in most cases, so it's not a concern.
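A quick back-of-the-envelope, assuming an average response of roughly 10 KB (an invented figure) and ignoring protocol overhead, shows why:

```python
# Responses per second needed to saturate each link at ~10 KB per response.
RESPONSE_BYTES = 10 * 1024

for name, bits_per_s in [("100 Mbit/s", 100e6), ("10 Gbit/s", 10e9), ("40 Gbit/s", 40e9)]:
    per_second = bits_per_s / 8 / RESPONSE_BYTES
    print(f"{name}: ~{per_second:,.0f} responses/s")
# Prints roughly 1,221/s, 122,070/s, and 488,281/s respectively.
```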
Yes, you can get away with just eating that overhead. The additional flexibility that the layer of abstraction gives you, combined with containerization, makes all this sandboxing possible. It sounds interesting. I haven't heard of it; I will check it out. Maybe this will be our next conversation—OpenZiti, mesh networks, and IoT. Let's see.
I'm slowly getting into that. I'm looking to set up a greenhouse and would like to have a lot of things both remotely actuated and monitored in there.
We have some common interests here. My wife has a pretty elaborate gardening operation going on, so the thoughts are certainly there, but the execution time and proficiency are lacking. I think you will be ahead of me in that.
All right, let's cut it here. Theo, this has been a blast; it's so much fun. Thank you so much for coming on.
Yeah, thanks for having me, man.
Have a great time! Go eat some ribs; that sounds like a great plan. Let's stop the recording. Bye!