Tale as old as time. True as it can be. Barely even friends. Then somebody implements the wrong bloody server architecture.

Battles have raged for years over which design pattern is best for backend architecture. Hundreds of articles have been written comparing the two, with most inconclusively concluding "it depends".
For anyone unfamiliar with the terms, these design patterns are the two most popular approaches to building a backend API in the cloud. A monolith is a single application that handles all the routing and logic for the entire API. Microservices, on the other hand, split the API into many small applications, each handling one route, typically built on on-demand compute like AWS Lambda. Their advantages and disadvantages generally boil down to something along these lines:
Monolith advantages: simpler to build, understand and debug; no cold starts, so responses are consistently fast.
Monolith disadvantages: the server runs (and costs money) all the time; one bug can take the whole API down; scaling means a bigger machine.
Microservice advantages: you only pay for what runs, which suits cloud free tiers; each function scales independently; one broken endpoint doesn't take the rest down.
Microservice disadvantages: cold starts add latency; many more moving parts to build, deploy and monitor.
Starting this project, I had experience with both approaches: most recently with microservices, and a few years ago with monoliths. Microservices have only become "fashionable" recently, as cloud computing has evolved to make this kind of fast-starting, low-compute architecture possible. Monoliths had always been my preference because they are so much simpler to build and understand. However, as discussed in a previous article, I wanted to build this for free, and microservices would make that possible: AWS have a good free tier on Lambdas, whereas an EC2 machine running a monolith API around the clock would not be free. I reinforced this decision in my mind by telling myself that this is a massively multiplayer online football manager game, and when millions of users start playing, I am going to need to be able to scale quickly.
So that's the approach I took. It should be a perfectly valid approach, and it could have worked out just fine. Unfortunately, some of my previous architecture choices worked against me, the main one being the choice to use Supabase for a PostgreSQL database with MikroORM as the database management framework in front of it. It was obvious early on that the game's performance was poor around API calls: simple requests were taking 2-4 seconds, making the entire user interface feel unresponsive and broken. After fixing some genuinely poorly optimised queries, I managed to track the issue down to the database initialisation.
Adding timing metrics to my logs revealed that the database was taking around 2 seconds to initialise. The lambdas were taking around half a second to load into memory too, compounding the issue, while the logic itself ran in a fraction of a second. As far as I could tell, there were three issues: the size of the lambda bundles, the connection to Supabase, and MikroORM's initialisation time.
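The instrumentation itself was nothing clever, by the way. A minimal sketch of the kind of timing wrapper I mean (the phase names here are hypothetical stand-ins for my real functions):

```typescript
// Minimal timing wrapper: log how long each phase of a handler takes.
import { performance } from 'node:perf_hooks';

async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    console.log(`${label} took ${Math.round(performance.now() - start)}ms`);
  }
}

// Inside a handler, wrap each phase; initDatabase and runLogic are stand-ins:
// const orm = await timed('db init', () => initDatabase());
// const result = await timed('logic', () => runLogic(orm, event));
```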
Firstly, I tried to address the large lambda size. I took all of the common code that my API lambdas shared, like Mikro and some other dependencies, and created a lambda layer. This is like a library that my lambdas can import at run time. It keeps the size of the individual lambdas down, as they use the shared layer instead of each deploying its own copy, which lets them start faster. This helped, but not much.
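For anyone curious how a layer is wired up, here's a rough sketch of the idea using the AWS CDK (my actual setup differs; the paths and names here are illustrative):

```typescript
// Illustrative CDK sketch: one layer holds the heavy shared dependencies,
// so each function only ships its own handler code.
// 'layers/shared' would contain nodejs/node_modules with MikroORM and friends.
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';

export class ApiStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    const sharedDeps = new lambda.LayerVersion(this, 'SharedDeps', {
      code: lambda.Code.fromAsset('layers/shared'),
      compatibleRuntimes: [lambda.Runtime.NODEJS_18_X],
    });

    // A hypothetical route handler that imports from the layer at run time.
    new lambda.Function(this, 'ScoutPlayers', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist/scout-players'),
      layers: [sharedDeps],
    });
  }
}
```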
So I moved on to the connection to Supabase. I looked into IPv6, but the only way to make that work with a lambda would be to run my lambdas from within a VPC (Virtual Private Cloud), and that was outside of the free tier.
Lastly, I looked at modifying the MikroORM config to try to bring down the initialisation time, but nothing I found improved things in any meaningful way.
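For reference, the sort of thing I was experimenting with looked roughly like this. Option availability varies between MikroORM versions, and the entities are hypothetical, so treat this as a sketch rather than a recipe:

```typescript
// Sketch of the kind of MikroORM init tuning I tried.
import { MikroORM } from '@mikro-orm/postgresql';
import { Player, Team } from './entities'; // hypothetical entity classes

const orm = await MikroORM.init({
  clientUrl: process.env.DATABASE_URL,
  // Point discovery at entity classes directly rather than glob paths,
  // so startup doesn't have to scan the filesystem.
  entities: [Player, Team],
  discovery: { disableDynamicFileAccess: true },
  // A single lambda invocation only ever needs one connection.
  pool: { min: 1, max: 1 },
});
```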
This is where the bullet biting started. A compromise was going to have to be made, and it was going to increase my AWS costs one way or another. The options: pay for the VPC networking needed to speed up the lambdas' connection to the database, or pay for a small EC2 instance running the API as an always-on monolith. I chose the monolith.
The next problem to solve was how to make this switch without rewriting my entire backend. Thankfully, my good friend Augment Code came to the rescue. I built a standard NodeJS Express server with routing, and AI helped write a transformation layer in my routes that converted Express requests into API Gateway style events, called the same functions my Lambdas had been invoking, and then converted the response back into Express format. This "middleware" layer meant all of my existing code just worked without any changes. Lifesaver!
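The adapter itself is conceptually simple. A stripped-down sketch of the idea, with the event shape heavily simplified and the imported handler standing in for any of my real lambda functions:

```typescript
// Simplified sketch of the Express-to-Lambda adapter: build an API Gateway
// style event from the Express request, invoke the existing lambda handler,
// then map its response back onto the Express response.
import express, { Request, Response } from 'express';
import { handler as scoutPlayers } from './lambdas/scout-players'; // hypothetical

type LambdaHandler = (event: any) => Promise<{
  statusCode: number;
  headers?: Record<string, string>;
  body?: string;
}>;

function wrap(handler: LambdaHandler) {
  return async (req: Request, res: Response) => {
    // Shape the Express request like an API Gateway proxy event.
    const event = {
      httpMethod: req.method,
      path: req.path,
      headers: req.headers,
      pathParameters: req.params,
      queryStringParameters: req.query,
      body: req.body ? JSON.stringify(req.body) : null,
    };
    const result = await handler(event);
    res.status(result.statusCode).set(result.headers ?? {}).send(result.body);
  };
}

const app = express();
app.use(express.json());
app.post('/scout-players', wrap(scoutPlayers));
app.listen(3000);
```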
The difference in API response time was night and day. Requests were being handled in fractions of a second, as you would expect, instead of seconds, and the game immediately felt better. 95% of the backend code was now routed through the monolith API, including some timed tasks like processing scouting requests, but a couple of jobs were left for lambdas to deal with. Tasks like processing the matches (300 per gameworld, twice a day) were best left to lambdas, which can run many jobs in parallel and scale independently as the game grows and more gameworlds are created. That kind of work would easily take down my small EC2 machine and make the game unplayable while it ran.
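Kicking those jobs off from the monolith is then just an asynchronous invoke per batch. Something along these lines (the function name, batch size, and payload shape are illustrative):

```typescript
// Illustrative fan-out: asynchronously invoke the match-processing lambda
// once per batch so the heavy work happens off the EC2 box.
import { LambdaClient, InvokeCommand } from '@aws-sdk/client-lambda';

const lambda = new LambdaClient({}); // region picked up from the environment

async function processMatches(matchIds: string[], batchSize = 25) {
  for (let i = 0; i < matchIds.length; i += batchSize) {
    const batch = matchIds.slice(i, i + batchSize);
    await lambda.send(new InvokeCommand({
      FunctionName: 'process-matches', // hypothetical function name
      InvocationType: 'Event',         // fire and forget; batches run in parallel
      Payload: Buffer.from(JSON.stringify({ matchIds: batch })),
    }));
  }
}
```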
Those monolith disadvantages are not to be sniffed at, though. One big problem with this approach is that if there is a problem with my monolith, the entire game stops working. Previously, with microservices, I could (and did) break the scout players endpoint and not even notice for two days (is that a good thing?). With the monolith, an error could prevent the server from even starting, leaving every route inaccessible. I had to create a custom deployment task that deploys the updated server alongside the existing one and launches it. It then runs a couple of uptime checks to make sure the APIs are reachable and the database can be connected to. Only then does it shut down the existing server, move it to one side, and replace it with the new one. Even this approach still means a minute or so of downtime. There's no easy way to avoid that (though there are methods if time and money are no object). For my use case, a minute of downtime is not ideal but not critical.
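In script form, the swap is nothing exotic. A rough sketch of the shape of it, where the ports, shell scripts, and /health endpoint are placeholders for my actual setup (and fetch assumes Node 18+):

```typescript
// Rough sketch of the deploy flow: start the new build on a spare port,
// verify it is healthy, then swap it in and stop the old process.
import { execSync } from 'node:child_process';

async function waitForHealthy(url: string, attempts = 15): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(url); // /health confirms routing and a live DB connection
      if (res.ok) return true;
    } catch {
      // server not accepting connections yet; keep polling
    }
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
  return false;
}

async function deploy() {
  execSync('./start-new-server.sh'); // hypothetical: launches the new build on port 3001
  if (!(await waitForHealthy('http://localhost:3001/health'))) {
    execSync('./stop-new-server.sh'); // hypothetical: abort, old server keeps serving
    throw new Error('New server failed health checks; deployment aborted');
  }
  // The minute or so of downtime happens here: stop the old server,
  // move it to one side, and promote the new build in its place.
  execSync('./swap-servers.sh'); // hypothetical
}

deploy();
```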
So what is the best architecture? Well... it depends. I don't think I'm going to be especially controversial in my conclusions here. Economics aside, monolith APIs are good for most cases. They're fast and easy for the developers working on them to understand, and that alone should be reason enough. You can scale a monolith for quite a long time before it becomes a problem, just by adding bigger compute. As things grow, you can spin off areas into microservices and keep the bulk of the API inside the monolith. This is what I've done for match processing and end-of-season processing.
Don't just build a microservice because you think you're going to have a million users and will therefore need the scale. You probably won't. Monolith to microservice is a well-worn path and not the impossible tech-debt task you think it would be, especially if you migrate piece by piece rather than in one big migration. And certainly don't build a microservice because you think that's the modern way to build APIs now.
If you're trying to stay within the free tier then, yeah, microservices do allow you to do that. But that only works up to a certain scale, and beyond it the architecture starts to get in the way. Is the added complexity worth it? If you're going to end up paying for event queues anyway, it might be cheaper to run a small EC2 instance.