Building an HTTPS Model API for Cheap: AWS, Docker, and the Normconf API - Ben Labaschin

Transcript generated with OpenAI Whisper large-v2.

Picture this. It's two months until a conference for which you're an organizer and a speaker. You've got planning, a brand, community, donations, and finances to organize. You're a founding hire at a seed-stage startup, just moved to a new state, and also have life things to take care of. The smartest thing to do given all of this would be just to put your head down and get things done. That's why, when I faced this situation, the only natural thing to do was to use my time to build a production-grade API. What followed was a lesson in tool choices, overengineering, and dedication to a meme. Thank you for coming to my talk about building cheap HTTPS APIs. My name is Ben Labaschin. I'm a principal MLE at Workhelix and an organizer for NormConf. Today, I want to walk you through the infrastructure and deployment process for the NormConf goodies API. More than that, I want to speak with you about our most scarce of resources: time. But before I start, some useful info for folks who want to follow along at home. You can check out the goodies code repo above, and if you want to play around with the NormConf API, head over to the URL on the slide. If you haven't heard about the API, this is your chance to play around with the most useful, definitely-need-this-above-all-other-conference-priorities API. Vicki, please let me have this. I mean, look at this thing. With a simple curl, you can get an ASCII-ified NormConf logo. You know who hasn't made a conference API? NeurIPS. You know who hasn't been invited to speak at NeurIPS? Me. Coincidence? Almost certainly. But we can at least agree that this is all very important stuff. Still, there's a lot to cover, and I don't want to lose the forest for the trees, so I'm going to rip off the band-aid and tell you exactly what the central message of this talk is now. You ready? We don't have enough time to do all the things we want or need to do in our jobs, let alone our lives.
So many of you, as with me, feel the weight of wanting to learn it all and do it all. We push ourselves every day, hoping to accomplish more. And we achieve great things. Clearly this conference wouldn't exist were it not for its head organizer being the type of person to push themselves. If you're in this industry and you're listening to this talk, I suspect you're like this, too. This type of mentality bleeds into our jobs just as it seeps into our lives. As I said, I built the NormConf API while juggling many things, and yet I chose to build it. Having done so, I chose the tools I was familiar with. Some of those were good choices. Others held me back. If I had chosen more wisely, the work probably would have been done faster. So yes, this is a talk about having built the goodies API, but it's also a metaphor about the time we don't have, realistic or not, and the consequences of choosing the wrong tools for the job. So when I talk about building APIs for cheap, I am actually referring to the tradeoffs we make among our resources: time, cognitive energy, and cost savings. Okay. Now that we've established that, here's how it's going to work. First, I'll talk about the initial steps I took. Then I'll talk about the aspects of software I like. I've called these aspects normie software. It's completely made up, but I can do that because, again, this isn't a NeurIPS talk. Finally, I'll talk about some pains of deploying APIs to the cloud. Today we're going to be talking about AWS, but this could just as easily have applied to any of the big cloud platforms. And that'll be the talk. Okay. So let's get things started. When I sat down to build the goodies API, there was no doubt in my mind what I was going to do. That's because, as an MLE, I spend a lot of time working with and building APIs. Most models, if they're going to be any good, have to be served. Someone or something needs to be able to interact with the model and its outputs.
Now, data scientists of a certain type might build APIs for their models, but oftentimes it's the MLEs who are tasked with building or refining them to become production-ready. That's the nature of the position. We are relied upon to know a lot of things in order to solve many very different problems. That's why tooling choice is so important. As MLEs, we cannot afford to embed software into our systems that makes them unable to scale or that causes inefficiencies. Intuitively, most of us understand this. The problem is it's hard to avoid bad tool choices in practice. Why? Well, when I went about building the goodies API, I stuck with what I know: FastAPI, Docker, GitHub, AWS. These are my bread and butter. I use them every day, and being time-pressed and trying to do everything at once, I reached for them when I sat down to program. Here's a simple diagram that illustrates what I imagined the goodies API would look like. To begin, I'd start with Python. Easy, functional, well supported. When it came to an API framework, I reached for FastAPI. It's my go-to framework for reasons I'll discuss soon. Next, Docker. It's ubiquitous, straightforward, gets the job done. After that, I'd push to GitHub, which would trigger a GitHub Actions CI/CD pipeline for testing and deployment. Finally, that would deploy the container straight to AWS, where it would be hosted in Elastic Container Service, or ECS. In my mind, all of it could be done in a day or two. And for the most part, it worked out. Because if you choose your tools wisely, they make your job easier. That is, they are cheaper to use. Your tools should make your job easier. Time is expensive. We have a lot to do. This brings me to the next section of the talk: normie software, or the benefits of choosing good software tools. As I've said, when I select my tools properly, I look out for what I'm calling the qualities of normie software. There's not much to it.
To be normie software, a tool has to have one or more of the following qualities. They do the thing they say they do. They are an investment. They are easy to pick up. I'm going to be citing these tenets throughout my talk, so keep an eye out. Let's break them down one by one. They do the thing they say they do. That is, a hammer hits things; an API creation tool creates APIs. That's what it does. If you're spending an inordinate amount of time trying to get a tool to the point where it will accomplish your underlying goal, you should ask yourself: is that what this tool is meant to do? Is there a better tool to get me there? Should I be building this from scratch in Fortran, or could I just find a better way? Next, they are an investment. An investment in yourself, in the current problem you're facing, and in future problems that will also need to be solved. This one's hand-wavy, so let me clarify. We've already established that people in our field, in my case MLEs, are required to solve different time-boxed problems which entail a significant cognitive load. These are costs. It's hard to remember everything, let alone do all the things we do. If a tool is going to require us to learn how to operate it, the time spent learning it becomes an investment if the tool will also be used in future projects. You want your tools to scale. You want your time to scale. That's what an investment is, and that's why I argue you should view your time and tools as such. And most critically, they are easy to pick up. That is, they do the hard work for you, thus making you want to use them. This one I'd like to expand on a little bit more. There's an often-cited quote by computer scientist Tony Hoare. It says, "The most important property of a program is whether it accomplishes the intentions of its user." Maybe you've heard this one floating around. Well, contextually, Hoare was writing in a paper about the axiomatic qualities of programs.
And aptly, this was the first sentence of a section called "Proofs of program correctness." So people will cite this, as I just did, to make a point about the purpose of programs. And it's true. Programs should accomplish the goals of their users. But programs are made by people, at least for now. And people can accomplish these goals in many different ways. It would be nice if there were one and only one obvious way to program. But most often there isn't. Just because software does the thing you want it to do, the first rule of normie software, does not mean it does it in a way that is amenable to time-pressed people like ourselves. Far better if an equivalent tool has the resources you need to learn what's necessary to accomplish your goals. That's what I mean when I say good software is easy to pick up. Yes, first a program should do the thing. But Lord almighty, wouldn't it be nice if I didn't have to struggle to learn how to do it with every additional tool? Lower the barrier of accessibility for your tools and more people will want to use them. Now, I've been speaking at a high level about what normie qualities are and why tools with these qualities make APIs and programming cheap. But now I want to show you what I mean. So as an example of the first quality of normie software, let's take a look at FastAPI, which I used to develop the goodies API. To use FastAPI, one need not know the depths of app building or have a degree in the subject. The maintainers of FastAPI know that ultimately all you want to do is create APIs, to have a means of dealing with requests and responses. This is what I mean when I talk about tools doing the thing they say they do. FastAPI is really very good at that. In the app, as you can see from the sample code here, all I had to do was create an endpoint with a decorator over my function, get_normconf. In this case, it's a function that returns an ASCII version of the NormConf logo that you saw before.
It's a GET request, so you write get. You want a certain path, you write the path. You want to ensure a plain-text response? Well, lo and behold, in the body of the function there is an object that allows you to do that. So yes, I used FastAPI, built the functions I needed, added the endpoints, and, importantly, I was done in no time. Okay. So from there, I knew I had to create an image, which leads me to my next point about normie software being an investment. With the goodies API, I knew ahead of time that whatever I created, Docker would handle it, because it's a tool I'm deeply familiar with. This was useful not only at the micro level of programming, but at the macro level of optimizing cognitive effort. With good tools, that is, tools amenable to investment, each use builds upon itself so that your experience of the software is a feedback loop that opens up even more use cases. I'm going to keep banging on this drum. As analysts and data scientists, we are required to know a lot about a lot. If there's a single tool that can solve many different problems, that's cognitive load that would have otherwise been stratified that is now internalized. That's time I don't have to spend picking up another tool. That's why I think Docker is normie software. It's something you benefit from returning to time and again, as long as it works. With the goodies API, as can be seen here, it was a simple thing to start from a Python 3.10 base image, copy the relevant files, install the dependencies, and run a simple uvicorn command. But my point is, it could have been any number of things I needed to do, and I would have felt just as comfortable, because my experiences with the software have built upon themselves. This was a time saver. When you're choosing your tools at work, try not to reinvent the wheel, as you may lose out on current and future productivity. At this point, the image has been made, but it has nowhere to go. It has to go somewhere. And that's where GitHub Actions comes in.
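Before getting to GitHub Actions: the Docker steps just described (Python 3.10 base image, copy files, install dependencies, run uvicorn) might look roughly like this sketch. The file names and the `main:app` module path are assumptions, not the actual goodies repo layout:

```dockerfile
# Sketch only: paths and module names are hypothetical.
FROM python:3.10-slim

WORKDIR /app

# Copy dependency metadata first so the install layer caches well.
COPY pyproject.toml poetry.lock ./
RUN pip install poetry && poetry install --no-root --only main

# Copy the application code.
COPY . .

# Serve the FastAPI app with uvicorn.
CMD ["poetry", "run", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
```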
If you're not aware of GitHub Actions, it's a service that allows users to automate their workflows. That means testing code, building images, and sending them where they need to go, all with a simple push to your GitHub repository. Pretty nice. Now, because it requires a bunch of domain knowledge, some might expect me to say that GitHub Actions isn't normie software, that it's not easy to pick up. But to be honest, I think it functions just as I'd want it to. When software is this configurable, we will be required to learn a thing or two. To me, being easy to pick up is a function of the amount of effort I have to put in to accomplish my goals, and how much support is available to me to quickly solve my problems. If I have to dig around in dense documentation, that's a sign of software being too involved, of not being accessible software. GitHub Actions has great documentation. On the subject of documentation, I actually asked about this the other day in the NormConf Slack. I wanted to hear other people's thoughts about it, and I got some really great responses from the community. Basically, my message was about my contention that companies with products with broad use cases often have dense, inaccessible documentation. That they're not easy to pick up. And among the great responses I got was this one from Sarah Moyer, who, unsurprisingly, writes about this stuff. Her response was, essentially: yeah, these companies have products with endless configurability, and therefore write documentation that is super dense and detailed. But the user often doesn't need that. The user just wants to do the thing they want to do. They want to pick up the software and run. That was me. For example, I use Poetry for my dependency management. In building the API, I didn't want to have to use a different dependency manager when deploying my code from the one I use locally.
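For reference, a Poetry-based workflow of the sort I'm about to describe looks roughly like this sketch; the action versions, Python version, and test command are my assumptions, not necessarily the exact goodies pipeline:

```yaml
# Sketch of a Poetry-based GitHub Actions CI workflow; versions are assumptions.
name: CI

on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Install Poetry
        run: pip install poetry
      - name: Install dependencies
        run: poetry install
      - name: Run tests
        run: poetry run pytest
```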
So I literally just typed "poetry install github actions" into Google and found an action and the code examples I needed to build and test with Poetry in the goodies CI/CD pipeline. This was a portion of the result, almost literally a copy and paste from what I found. It was easy to find useful code. The code itself was very self-explanatory; that is, the above code doesn't seem very involved to me. Just a bunch of simple YAML. And I got what I needed done in a short amount of time. It was very easy to pick up. Altogether, the FastAPI, Docker, and GitHub Actions steps of the app development took maybe an afternoon to get a working draft done. But I told you at the beginning of the presentation that I was going to talk about the mistakes made along the way as an analogy for what we go through every day. And mistakes were made. The fact is, once I got to the cloud stage of my deployment, I found my productivity slowed significantly. My time started getting sucked up in areas I'd rather it hadn't. So how did that happen? Well, let's start with a common metaphor about what cloud providers do, as I think it will help us discuss some of the issues I ran into deploying the app. So far as I can tell, cloud providers are kind of like industrial tool manufacturers. They know that their customers will have a diverse need for parts and tools. Yes, everyone will need screwdrivers and hammers, but the size of nails, the kind of materials, the amount of specialization and specification that will be needed, that is endless. So they do a bit of everything. And while scale has allowed cloud providers to offer many different services, the experience of using these services can feel unlike normie software. The problem in the cloud is not simply that it's someone else's tools that you're paying to use. It's that it's a warehouse of tools meant to apply to almost any situation. And therefore, the tools can be as inefficient for your needs as they were for mine.
So sometimes you just want a screwdriver. But when you go to the screwdriver section of the warehouse, there are 100 different specifications of screwdrivers. Also, none of them look like screwdrivers. Also, they're called AWS Glue. For example, I was at the point of deploying the goodies API to ECS, a fully managed container orchestration service that felt fully managed by me. Now, the goal here was to host a container so that everyone could access its endpoints. As a concept, this is a very common need. We build software, containers are often the manner in which that software ships, and we want people to use that software. But it's not that easy. Instead, one needs to become pretty well acquainted with things like task definitions, service names, and clusters. Even though, in my case, this was an attempt to deploy to Fargate, which is nominally serverless and meant to make life easier. To me, this isn't just doing the thing they say they do. As a comparison of the possibilities, and I know these are not apples to apples, don't cancel me, but in a service like Streamlit Cloud, I simply connect a GitHub repo that contains a Dockerfile and my app is deployed, often very quickly. Again, it's not a one-to-one comparison, but I have a hard time believing it needs to be much more complicated than that. So to me, this is the first aspect of the project that did not do the thing it says it does. Which leads me to the next issue I experienced when deploying the goodies API. So let's say I've pushed my app successfully and it's being hosted in ECS. Well, ECS is by definition elastic, and its IP address is dynamic. So every time I make a change, a new IP address will be assigned. But as it is, that's not very useful for NormConf attendees, who would like nothing more than to ping the API to get all its really very important outputs. So what do you have to do?
Well, you have to wire together route tables, internet gateways, subnets, security groups, and load balancers. Then be sure to set up DNS in Route 53; that comes after the load balancer. And for good measure, if you want it to be a secure HTTPS API, you're going to have to set up secure port forwarding on 443, which requires listeners and target groups. And, well, it just gets tiring to list out. Especially because this wasn't even what I was supposed to be working on. Now, even if you have a general understanding of all these things, like me, you can still get tripped up by the ordering of these operations or, even more frustrating, which IAM permissions you need to set up. The issue is, as I've said, we have been given a warehouse full of tools and told to go build a house, where usually you'd have plumbers and electricians and more. We often have to be all those things at once. We didn't always have to be network engineers in addition to everything else, but to deploy an API on the big cloud services, oftentimes now we do. And if you expect us to know all these things, that's fine. But it's necessarily a trade-off against some other work that we could be doing. It's costly. Ideally, we shouldn't have to be doing all of this. We could be working on the things that really matter, like debating the usefulness of automatic code formatters or which IDE to use. But the way it is right now, I think that there should be an easier way. And so here I would argue that work on things like networking is not an investment. I'm not benefiting from working with these tools because they're not helping me achieve my core functionality. They could be abstracted away and I would not lose much. That's why I think this network engineering was not a useful software investment for me. Okay, so finally, let's talk about things more generally. To get through all of the API deployment, from ECS and Route 53 to load balancing and more, we need to rely on documentation.
The problem is, well, as Vicki says here, cloud documentation can be pretty rough. And as we've established, the more complicated the system and the more it can do, the denser the cloud documentation typically is. When you're faced with all that documentation, you're often left trying to parse through it for your relevant use case. And if there are few examples with sample code, it can get rough. That's why I'd argue that working a lot with cloud software, including what it took to deploy the goodies API, is more involved than it needs to be, even simply by trying to find the relevant information you need to accomplish your use case. This is why this, too, broke a rule of normie software. It is not easy to pick up, and it could be easier. Which brings me to where we started. Remember the original diagram I had with my ideal setup? Well, on the left is that diagram. And on the right is actually a relatively simplified version of where I ended up. It's a bit scrunched in there, so let me summarize. While making the actual app stayed the same, the actual deployment to the cloud was far less normie than I wanted. So where does that leave us? I started this talk by saying that I wanted to build a cheap API for the conference. And in some ways I did. Over two months, we've paid maybe $30 for a scalable, production-grade API. But in other ways, the API was costly. Costly in my time. True, some of the tools I reached for were beneficial to me. I argue that's because those tools had traits of normie software. They do the things they say they do. They were investments. They were easy to pick up. But once I reached the deployment to the cloud, my experience left something to be desired. It was too much work on things that distract from the core features of what I think we should be doing. So you may be thinking, okay, well, then the solution is to pick a service that specializes in hosting and deployment, Ben. And that may or may not be true.
But what I'm really trying to get at here is that if I had taken the time to think about the tradeoffs of my software choices, I would have had time for a thousand other things I'd like to be doing. And in our jobs, it's the same thing. We're often jumping from task to task, especially if you're at an early-stage startup like me. Instead of grabbing the shiny software, it behooves us to take a moment and consider which of our screwdrivers to use. Especially because if it has to do with the cloud, it will often take longer than it probably should. So my final message at the end of all of this is: seek out normie software. Tools that you're familiar with are good. But even if you're comfortable with the cloud, like me, recognize the scale of the problem you're trying to solve. Don't use a chainsaw when scissors will do. And more generally, our time is precious. We ask so much of ourselves in our jobs. Do yourself a favor. Choose your software carefully and respect your time, both now and in the future. Now, I know some of you were hoping I'd literally walk through the instructions to deploy the app to the cloud, but I thought that would be too boring to regurgitate. So I wrote a post. It's at that QR code, or you can check my website. But for everyone else, I've appreciated your time. Thank you so much. This conference has been amazing, and I'm really grateful for the opportunity to speak with you today.