Running Django

I’ve been professionally running and working on several Django apps, and I’d like to share a few tips that have proven useful over the years. Much of this advice also applies to backend applications in general.

Minimal Docker Container

The main purpose of a minimal Docker image is to improve build and deployment speed: transferring image layers across the cloud or the internet is often the slowest step, and smaller images simply move faster.

Keeping app dependencies minimal also makes builds more stable over the app's lifecycle. It's common for developers not to properly pin (freeze) dependencies, which causes build breakage when upstream packages are updated. There can also be implicit, undeclared dependencies between Python packages and OS-level packages, especially when a Python package wraps a C driver library, as database drivers often do. This is typically the worst case, since problems can extend to system libraries and compilers as well.
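As a sketch, a multi-stage build with pinned dependencies covers both points: the final image stays small, and the build stays reproducible. The project name and base image here are placeholders; requirements.txt is assumed to be fully pinned (e.g. via pip freeze or pip-compile).

```Dockerfile
# Build stage: compile wheels once, so the final image
# needs no compilers or development headers.
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Final stage: only the runtime and prebuilt wheels.
FROM python:3.12-slim
WORKDIR /app
COPY --from=build /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
COPY . .
CMD ["gunicorn", "myproject.wsgi"]
```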

Another factor to consider is image storage cost over time—small images simply cost less.

Configuration

Most developers don't differentiate between kinds of configuration. I learned how important this is when many older apps became impossible to run locally because of the sheer number of configuration variables and possible value combinations.

I've seen many apps use environment variables for all configuration. I decided to develop a better strategy: I use environment variables only for the settings required to start the app, usually just a few variables holding database credentials. If the app uses a cache, I apply the same approach, since database and cache drivers need credentials at startup.
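A minimal sketch of this, assuming PostgreSQL and hypothetical variable names: only the credentials needed at startup come from the environment, and a missing variable fails fast.

```python
import os

def database_config(env=os.environ):
    """Build the Django DATABASES entry from the few required
    environment variables; a missing one raises KeyError at startup."""
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": env["DB_NAME"],
        "USER": env["DB_USER"],
        "PASSWORD": env["DB_PASSWORD"],
        "HOST": env["DB_HOST"],
        "PORT": env.get("DB_PORT", "5432"),
    }

# In settings.py:
# DATABASES = {"default": database_config()}
```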

Another set of options I call “static app configuration.” I keep this configuration hardcoded in per-environment config files, which are checked into git. The app selects the correct file when it starts. This approach works because the build image is identical across all environments.
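Selection can be as simple as mapping one environment variable to a version-controlled settings module. The module names and the APP_ENV variable are assumptions for illustration:

```python
# Hypothetical per-environment settings modules, all checked into git.
SETTINGS_MODULES = {
    "dev": "config.settings.dev",
    "staging": "config.settings.staging",
    "production": "config.settings.production",
}

def resolve_settings_module(app_env):
    """Map the single APP_ENV variable to a static, version-controlled
    settings file; unknown values fail loudly at startup."""
    try:
        return SETTINGS_MODULES[app_env]
    except KeyError:
        raise RuntimeError(f"Unknown APP_ENV: {app_env!r}")

# In manage.py / wsgi.py:
# os.environ.setdefault("DJANGO_SETTINGS_MODULE",
#                       resolve_settings_module(os.environ["APP_ENV"]))
```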

The last special type of configuration is for dependencies on other apps. I recommend creating dedicated manager classes and lazily loading these configuration values from their application stack outputs or other parameter stores. Lazily loading configuration saves time and effort when updating different apps in the stack; your app will automatically update itself when restarted, without any manual intervention.
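A sketch of such a manager, with the actual lookup (e.g. reading CloudFormation stack outputs or an SSM parameter store) abstracted into an injected fetcher, since those calls are deployment-specific. Values resolve on first use and are cached for the process lifetime, so a restart picks up whatever the upstream stack currently exports:

```python
class LazyStackConfig:
    """Resolve configuration for an upstream app lazily, on first use,
    via an injected fetcher; cache each value for the process lifetime."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._fetch(key)
        return self._cache[key]

# Usage with a stub fetcher that records lookups:
calls = []
def fetch(key):
    calls.append(key)
    return f"value-of-{key}"

billing = LazyStackConfig(fetch)
assert calls == []                        # nothing fetched at startup
assert billing.get("ApiUrl") == "value-of-ApiUrl"
billing.get("ApiUrl")
assert calls == ["ApiUrl"]                # fetched once, then cached
```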

Migrations

Migrations must run before the new version of the app is deployed. I've found it useful to run the manage.py migrate command as a single one-off task in the cluster, as part of the stack update. I use CloudFormation custom resources to achieve this, ensuring the CI/CD process works smoothly in both directions. Of course, this assumes that migrations are non-destructive and well-crafted. If a deployment fails, the stack automatically rolls back, including the migrations. If rollback doesn't work as expected, I can simply delete the old app and re-run the last successful build, which redeploys the previous working version and migrates the database back to its last known good state.
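As a sketch, the custom resource can be a small handler that runs the migrations and reports the outcome back to CloudFormation, so a migration failure fails the stack update and triggers rollback. The physical resource ID and the exact migrate invocation are assumptions:

```python
import json
import subprocess
import urllib.request

def run_migrations():
    """Run manage.py migrate as a one-off task; raises on failure."""
    subprocess.run(["python", "manage.py", "migrate", "--no-input"],
                   check=True)

def build_response(event, status, reason=""):
    """Shape the reply CloudFormation expects from a custom resource."""
    return {
        "Status": status,                      # "SUCCESS" or "FAILED"
        "Reason": reason,
        "PhysicalResourceId": "django-migrations",
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
    }

def handler(event, context):
    """Entry point: migrate on Create/Update, then signal the result so
    a failure rolls the whole stack update back."""
    try:
        if event["RequestType"] in ("Create", "Update"):
            run_migrations()
        body = build_response(event, "SUCCESS")
    except Exception as exc:
        body = build_response(event, "FAILED", reason=str(exc))
    req = urllib.request.Request(event["ResponseURL"],
                                 data=json.dumps(body).encode(),
                                 method="PUT")
    urllib.request.urlopen(req)
```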

Third-party Services

Any external service the app interacts with counts as a dependency, and it’s best not to rely on them too heavily. I treat these services as stateful and wrap them like models. I prefer to log all data received from a third party to a dedicated model, either synchronously or asynchronously via a queue. This lets me review everything that comes in—essentially like a structured log. It allows me to quickly respond to the third party and handle local work in a Django signal, again asynchronously. In other words, I avoid processing work synchronously in the request/response chain unless absolutely necessary. This improves both performance and reliability: everything is captured and processed when the system is ready. I like to think of this as a restaurant analogy (similar to Andy Grove’s), where each step creates a ticket, and processing can be replayed if needed.
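The ticket pattern above can be sketched with plain Python; here an in-memory list stands in for the Django model table and another for the task queue or signal handler, both hypothetical stand-ins:

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class InboundEvent:
    """Stand-in for a Django model that logs every third-party payload
    verbatim (the 'ticket'); in the real app this is a table with a
    processed flag."""
    payload: dict
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    processed: bool = False

event_log = []      # stand-in for the model table
work_queue = []     # stand-in for a task queue / signal handler

def receive_webhook(raw_body):
    """Webhook view body: record the ticket, enqueue, respond fast."""
    event = InboundEvent(payload=json.loads(raw_body))
    event_log.append(event)
    work_queue.append(event)   # real work happens asynchronously
    return 200                 # reply to the third party immediately

def process_pending():
    """Worker loop: replayable, runs whenever the system is ready."""
    while work_queue:
        event = work_queue.pop(0)
        # ... actual business logic goes here ...
        event.processed = True
```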

For closely integrated third parties, such as AWS services, I instantiate SDK clients once when the app starts and simply reuse those instances. Compared with the naive approach of creating a new SDK client for every incoming request, this gives a significant speed improvement and keeps memory usage flat as traffic grows.
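A minimal sketch of the sharing part: memoizing the constructor gives one client per service per process. The Client class below is a counting stand-in for something like boto3.client, so the reuse is observable; creating the clients eagerly at module import achieves the same effect.

```python
from functools import lru_cache

# Hypothetical client factory standing in for an SDK client,
# e.g. boto3.client("s3"); it counts instantiations.
class Client:
    instances = 0
    def __init__(self, service):
        Client.instances += 1
        self.service = service

@lru_cache(maxsize=None)
def get_client(service):
    """One client per service per process: created on first use,
    then shared by every request instead of re-instantiated."""
    return Client(service)

# Every call site gets the same instance:
a = get_client("s3")
b = get_client("s3")
assert a is b
assert Client.instances == 1
```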

Monitoring

I've found Sentry to be the best monitoring solution of its kind: it's open source, easy to install, and requires little configuration. It gives me everything I need at a high level, including code context, queries, and performance metrics, all in one place. I highly recommend it, as Sentry supports not only frontend and backend apps but also specialized platforms such as Unreal Engine.

Finally, this list is just a top-of-mind subset of the things I pay attention to.