Building an HTTPS Model API for Cheap: AWS, Docker, and the Normconf API - Ben Labaschin
Transcript generated with
OpenAI Whisper
large-v2
.
Picture this. It's two months until a conference for which you're an organizer and a speaker.
You've got planning a band, community, donations, and finances to organize. You're a founding
hire at a seed stage startup, just moved to a new state, and also have life things to
take care of. The smartest thing to do given all of this would be just to put your head
down and get things done. That's why when I faced this situation, the only natural thing
to do was to use my time to build a production-grade API. What preceded was a lesson in tool choices,
overengineering, and dedication to a meme. Thank you for coming to my talk about building
cheap HTTPS APIs. My name is Ben Labashian. I'm principal Emily at Work Helix and an organizer
for NormConf. Today, I want to walk you through the infrastructure and deployment process
for the NormConf goodies API. More than that, I want to speak with you about our most scarce
of resources, time. But before I start, some useful info for folks who want to follow along
at home. You can check out the goodies code repo above, and if you want to play around
with the NormConf API, head over to api.normconf.com. If you haven't heard about the API, this is
your chance to play around with the most useful, definitely need to have this above all other
conference priorities. Vicky, please let me have this API out there.
I mean, look at this thing. With a simple curl to api.normconf.com, you can get an ASCII-ified
NormConf logo. You know who hasn't made a conference API? NeurIPS. You know who hasn't
been invited to speak at NeurIPS? Me. Coincidence? Almost certainly. But we can at least agree
that this is all very important stuff. Still, there's a lot to cover, and I don't want to
lose the forest for the trees, so I'm going to rip off the band-aid and tell you exactly
what the central message of this talk is now. You ready?
We don't have enough time to do all the things we want or need to do in our jobs, let alone
our lives. So many of you, as with me, feel the weight of wanting to learn it all and
do it all. We push ourselves every day, hoping to accomplish more. And we achieve great things.
Clearly this conference wouldn't exist were it not for its head organizer being the type
of person to push themselves. If you're in this industry and you're listening to this
talk, I suspect you're like this, too. This type of mentality bleeds into our jobs just
as it seeps into our lives. As I said, I built the NormConf API while juggling many things,
and yet I chose to build it. Having done so, I chose the tools I was familiar with. Some
of those were good choices. Others held me back. If I had chosen more wisely, the work
probably would have been done faster. So yes, this is a talk about having built the goodies
API, but it's a metaphor about the time we don't have, realistically or not, and the
consequences of choosing the wrong tools for the job. So when I talk about building APIs
for cheap, I am actually referring to the tradeoffs we make among our resources, time,
cognitive energy, and cost savings. Okay. Now that we've established that, here's how
it's going to work. First, I'll talk about the initial steps I took. Then I'll talk about
the aspects of software I like. I've called these aspects normy software. It's completely
made up, but I can do that because, again, this isn't a NeurIPS talk. Finally, I'll talk
about some pains of deploying APIs to the cloud. Today we're going to be talking about
AWS, but this easily could have been applied to any of the big cloud platforms. And that'll
be the talk. Okay. So let's get things started. So when I sat down to build the goodies API,
there was no doubt in my mind what I was going to do. That's because as an MLE, I spend a
lot of time working with and building APIs. Most models, if they're going to be any good,
have to be served. Someone or something needs to be able to interact with the model and
its outputs. Now, data scientists of a certain type might build APIs for their models, but
oftentimes it's the MLEs who are tasked with building or refining them to become production
ready. That's the nature of the position. We are relied upon to know a lot of things
in order to solve many very different problems. That's why tooling choice is so important.
As MLEs, we cannot afford to embed software into our systems that makes them unable to
scale or that cause inefficiencies. Intuitively, most of us understand this. The problem is
it's hard to avoid bad tool choices in practice. Why? Well, when I went about building the
goodies API, I stuck with what I know. Fast API, Docker, GitHub, AWS. These are my bread
and butter. I use them every day and being time pressed and trying to do everything at
once, I reached for them when I sat down to program. Here's a simple diagram that illustrates
what I imagined the goodies API would look like. To begin, I'd start with Python. Easy,
functional, well supported. When it came to an API framework, I reached for fast API.
It's my go-to framework for reasons I'll discuss soon. Next, Docker. It's ubiquitous, straightforward,
gets the job done. After that, I'd push to Git, which would activate a GitHub action
CICD for testing and deployment. Finally, that would deploy the container straight to
AWS, where it would be hosted in Elastic Container Service or ECS. In my mind, all of it can
be done in a day or two. And for the most part, it worked out. Because if you choose
your tools wisely, they make your job easier. That is, they are cheaper to use. Your tools
should make your job easier. Time is expensive. We have a lot to do. This brings me to my
next section of the talk. Normie software or the benefits of choosing good software
tools. As I've said, when I select my tools properly, I look out for what I'm calling
the qualities of normie software. There's not much to it. To be a normie software, a
tool has to have one or more of the following qualities. They do the thing they say they
do. They are an investment. They are easy to pick up. I'm going to be citing these tenets
throughout my talk. Keep an eye out. Let's break these down one by one. They do the thing
they say they do. That is, a hammer hits things, an API creation tool creates APIs. That's
what it does. If you're spending an inordinate amount of time trying to get a tool to the
point that will accomplish your underlying goal, you should ask yourself, is that what
this tool is meant to do? Is there a better tool to get me there? Should I be creating
doc from scratch in Fortran or could I just find a better way? Next, they are an investment.
An investment in yourself, in the current problem you're facing, and in future problems
that will also need to be solved. This one's hand wavy, so let me clarify. We've already
established that people in our field, in my case, MLEs, are required to solve different
time box problems which entail a significant cognitive load. These are costs. It's hard
to remember everything, let alone do all the things we do. If a tool is going to require
us to learn how to operate it, the time spent learning it becomes an investment if the tool
will also be used in future projects. You want your tools to scale. You want your time
to scale. That's what an investment, that's why it's an investment, and that's why I argue
you should view your time and tools as such. And most critically, they are easy to pick
up. That is, they do the hard work for you, thus making you want to use them. This one
I'd like to expand on a little bit more. There's an often cited quote by computer scientist
Tony Hoare. It says, the most important property of a program is whether it accomplishes the
intentions of its user. Maybe you've heard this one floating around. Well, contextually,
Hoare was writing in a paper about the axiomatic qualities of programs. And aptly, this was
the first sentence of a section called proofs of program correctness. So people will cite
this, as I just did, to make a point about the purpose of programs. And it's true. Programs
should accomplish the goals of their users. But programs are made by people, at least
for now. And people can accomplish these goals in many different ways. It would be nice if
there were one and only one obvious way to program. But most often there isn't. Just
because software does the thing you want it to do, the first rule of normie software,
does not mean it does it in a way that is amenable for time-pressed people like ourselves.
Far better if an equivalent tool has the resources you need to learn what you need to accomplish
your goals. That's what I mean when I say good software is easy to pick up. Yes, first
a program should do the thing. But Lord almighty, wouldn't it be nice if I didn't have to struggle
to learn how to do it with every additional tool? Lower the barrier of accessibility for
your tools and more people will want to use them. Now, I've been speaking at a high level
about what normie qualities are and why tools with these qualities make APIs and programming
cheap. But now I want to show you what I mean. So as an example of the first quality of normie
software, let's take a look at fast API, which I use to develop the goodies API. To use fast
API, one need not know the depths of app building or have a degree on state. The maintainers
of fast API know that ultimately all you want to do is create APIs. To have a means with
which to deal with requests and responses, this is what I mean when I talk about tools
doing the thing they do. Fast API is really very good at that. In the app, as you can
see from the sample code here, all I had to do was to create an endpoint with a decorator
over my function, get norm conf. In this case, it's a function that returns an ASCII version
of the norm conf logo that you saw before. It's a get request, so you write get. You
want a certain path, you write the path. You want to ensure a plain text response, well,
lo and behold, in the body of the function, there is an object that allows you to do that.
So yes, I use fast API, built the functions I needed, added the endpoints, and importantly,
I was done in no time. Okay. So from there, I knew I had to create an image, which leads
me to my next point about normie software being an investment. With the goodies API,
I knew ahead of time that whatever I created, Docker would handle it because it's a tool
I'm deeply familiar with. This was useful, not only at the micro level of programming,
but at the macro level of optimizing cognitive effort. With good tools, that is tools amenable
to investment, each use builds upon itself so that your experience of the software is
a feedback loop that opens up even more use cases. I'm going to keep banging on this drum.
As analysts and data scientists, we are required to know a lot about a lot. If there's a single
tool that can solve many different problems, that's cognitive load that would have otherwise
been stratified that is now internalized. That's time I don't have to use picking up
another tool. That's why I think Docker is normie software. It's something you benefit
from returning to time and again, as long as it works. With the goodies API, as can
be seen here, it was a simple thing to import Python 3.10, copy the relevant files, install
the dependencies, and run a simple uvicorn command. But my point is, it could have been
any number of things I needed to do and I would have felt just as comfortable because
my experiences with the software have built upon themselves. This was a time saver. When
you're choosing your tools at work, try not to reinvent the wheel, as you may lose out
on current and future productivity. At this point, the image has been made, but it has
nowhere to go. It has to go somewhere. And that's where GitHub actions come in. If you're
not aware of GitHub actions, this is a service that allows users to automate their workflows.
That means testing code, building images, sending them where they need to go, all with
a simple push to your GitHub repository. Pretty nice. Now, because it requires a bunch of
domain knowledge, some might expect me to say that GitHub actions isn't normie software,
that it's not easy to pick up. But to be honest, I think it functions just as I'd want it to.
When software is this configurable, we will be required to learn a thing or two. To me,
being easy to pick up is a function of the amount of effort I have to put in to accomplish
my goals. And how much support is available to me to quickly solve my problems. If I have
to dig around in dense documentation, that's a sign of software being too involved, of
not being accessible software. GitHub actions has great documentation. On the subject of
documentation, I actually asked about this the other day in normcom slack. I wanted to
hear other people's thoughts about it, and I got some really great responses from the
community. Basically, my message was about my contention that companies with products
with broad use cases often have dense, inaccessible documentation. That they're not easy to pick
up. And among the great responses I got was this one from Sarah Moyer, who unsurprisingly
writes about this stuff. Her response was like, yeah, these companies have products
with endless configurability, and therefore write documentation that is super dense and
detailed. But the user often doesn't need that. The user just wants to do the thing
they want to do. They want to pick up the software and run. That was me. For example,
I use poetry for my dependency management. In building the API, I didn't want to have
to use a different dependency manager when deploying my code from that which I use locally.
So I literally just typed poetry install GitHub actions into Google and found an action and
code examples I needed to build and test with poetry in the goodies CI CD pipeline. This
was a portion of the result. Almost literally a copy and paste from what I found. It was
easy to find useful code. The code itself was very self explanatory. I.e., the above
code doesn't seem very involved to me. Just a bunch of simple YAML. And I got what I needed
to get done in a short amount of time. It was very easy to pick up.
Altogether, the fast API, Docker, and GitHub action steps of the app development took maybe
an afternoon to get a working draft done. But I told you at the beginning of the presentation
that I was going to talk about the mistakes made along the way as an analogy of what we
go through every day. And mistakes were made. The fact is, once I got to the cloud stage
of my deployment, I found my productivity slowed significantly. My time started getting
sucked up in areas I'd rather they hadn't. So how did that happen? Well, let's start
with a common metaphor about what cloud providers do as I think it will help us discuss some
of the issues I ran into deploying the app. So far as I can tell, cloud providers are
kind of like industrial tool manufacturers. They know that their customers will have a
diverse need for parts and tools. Yes, everyone will need screwdrivers and hammers, but the
size of nails, the kind of materials, the amount of specialization and specification
that will be needed, that is endless. So they do a bit of everything. And while scale has
allowed cloud providers to offer many different services, the experience of using these services
can feel unlike normie software. The problem in the cloud is not simply that it's someone
else's tools that you're paying to use. It's that it's a warehouse of tools meant to apply
to almost any situation. And therefore, for the tools are inefficient for your needs as
they were for mine. So sometimes you just want a screwdriver. But when you go to the
screwdriver section of the warehouse, there are 100 different specifications of screwdrivers.
Also, none of them look like screwdrivers. Also, they're called AWS glue. For example,
I was at the point of deploying the goodies API to ECS, a fully managed container orchestration
service that felt fully managed by me. Now, the goal here was to host a container so that
everyone can access its endpoints. As a concept, this is a very common need. We build software,
containers are often the manner in which that software ships, and we want people to use
that software. But it's not that easy. Instead, one needs to become pretty well acquainted
with things like task definitions, service names and clusters. Even though in my case,
this is an attempt to deploy to Fargate, which is nominally a serverless machine to run on
to make life easier. To me, this isn't just doing the thing they say they do. As a comparison
of the possibilities, and I know these are not apples to apples, don't cancel me, but
in a service like Streamlet cloud, I simply connect a GitHub repo that contains a Docker
file and my app is deployed. Often very quickly. Again, it's not a one to one comparison, but
I have a hard time believing it needs to be much more complicated than that. So to me,
this is the first aspect of the project that did not do the thing as it says it does.
Which leads me to the next issue I experienced when deploying the goodies API. So let's say
I've pushed my app successfully and it's being hosted in ECS. Well, ECS is by definition
elastic, and its IP address is dynamic. So every time I make a change, a new IP address
will deploy. But as it is, that's not very useful for norm conf attendees who would like
nothing more than to ping the API to get all its really very important outputs. So what
do you have to do? Well, you have to connect to route tables, network gateways, subnet,
security groups, and load balancers all in one. Now be sure to work with a DNS in route
53 right away. That comes after the load balancer. And for good measure, if you want it to be
a secure HTTPS API, you're going to have to set up secure port forwarding on 443, which
requires listeners and target groups. And well, it just gets tiring to list out. Especially
because this wasn't even what I was supposed to be working on. Now, even if you have a
general understanding of all these things like me, you can still get tripped up by the
ordering of these operations or even more frustrating, which I am permissions you need
to set up. The issue is, as I've said, we have been given a warehouse full of tools
and told to go make a house where usually you have plumbers and electricians and more.
We often have to be all those things at once. We didn't always have to be network engineers
in addition to everything else, but to deploy an API in big cloud services, oftentimes now
we do. And if you expect us to know all these things, that's fine. But it's necessarily
a trade off from some other work that we could be working on. It's costly. Ideally, we shouldn't
have to be doing all of this. We could be working on the things that really matter,
like debating the usefulness of automatic code formatters or which IDE to use. But the
way it is right now, I think that there should be an easier way. And so here I would argue
that work on things like networking is not an investment. I'm not benefiting from working
on these tools because they're not helping me achieve my core functionality. They can
be abstracted away and I would not lose much. That's why I think this network engineering
was not a useful software investment for me. Okay, so finally, let's talk about things
more generally. To get through all the API deployment from ECS and Route 53 to load balancing
and more, we need to rely on documentation. The problem is, well, as Vicky says here,
cloud documentation can be pretty rough. And as we've established, the more complicated
the system, the more it can do. The denser the cloud documentation typically is. But
when you're faced with all the documentation, you're often left trying to parse through
it for your relevant use case. And if there are a few examples with sample code, it can
get rough. That's why I'd argue when working a lot with cloud software, including what
it took to deploy the goodies API, it's more involved than it needs to be. Even simply
just by trying to find the relevant information you need to work to accomplish your use case.
This is why this too broke a rule of normie software. It is not easy to pick up and it
could be easier. Which brings me to where we started. Remember the original diagram
I had with my ideal setup? Well, on the left is that diagram. And on the right is actually
a relatively simplified version of where I ended up. It's a bunch scrunched in there.
So let me summarize. While making the actual app stayed the same, the actual deployment
to the cloud was far less normie than I wanted. So where does that leave us? I started this
talk by saying that I wanted to build a cheap API for the conference. And in some ways I
did. Over two months, we've paid maybe like $30 for a scalable production grade API. But
in other ways, the API was costly. Costly on my time. True, some of the tools I reached
for were beneficial to me. I argue that's because those tools had traits of normie software.
They do the things they say they do. They were investments. They're easy to pick up.
But once I reached the deployment to the cloud, my experience left something to be desired.
It was too much work on things that distract from the core features of what I think we
should be doing. So you may be thinking, okay, well, then the solution is to pick a service
that specializes in hosting and deployment, Ben. And that may or may not be true. But
what I'm really trying to get at here is that if I had taken the time to think about the
tradeoffs of my software choices, I would have had other time for a thousand other things
I'd like to be doing with my time. And in our jobs, it's the same thing. We're often
jumping from task to task. Especially if you're at an early stage startup like me. Instead
of grabbing the shiny software, it behooves us to take a moment and consider which of
our screwdrivers to use. Especially because if it has to do with cloud, it will often
take longer than it probably could. So my final message at the end of all of this is
seek out Normie software. Tools that you're familiar with are good. But even if you're
comfortable with the cloud like me, recognize the scale of the problem you're trying to
solve. Don't use a chainsaw when scissors will do. And more generally, our time is precious.
We are asked to do so much from ourselves and our jobs. Do yourself a favor. Choose
your software carefully and respect your time both now and in the future. Now, some of you,
I know we're hoping I literally walk through the instructions to deploy the app to the
cloud, but I thought that would be too boring to regurgitate. So I wrote a post. It's on
that QR code. You can check my website. But for everyone else, I've appreciated your time.
Thank you so much. This conference has been amazing. And it's been a great, I'm really
grateful for the opportunity to speak with you today.