Move Fast, Don’t Break Things: An Inside Look at Cadre’s Engineering Operation
One of the biggest challenges for any software engineering organization is figuring out how to maximize development velocity. Some organizations can afford to prioritize velocity over quality. At Cadre, we can’t. We treat quality as the highest priority — users trust our technology and we can’t compromise their trust.
We believe that, when done correctly, development infrastructure can allow engineers to move as quickly as possible while also not breaking production, sacrificing users’ data integrity, or introducing security vulnerabilities. Here’s a brief summary of our development environment and process.
There’s a simple rule in reliability engineering: you can’t avoid mistakes, but you can avoid the consequences of mistakes. Our AWS environment follows this rule. We have three separate VPCs: Corporate, Development, and Production. The only two addresses exposed to the global internet are our production website and corporate VPN. Peering between VPCs allows only very limited types of traffic. We continuously validate this isolation through internal testing and by engaging third-party network penetration testers.
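As a rough illustration of the limited-peering idea, the sketch below models a default-deny allowlist for traffic between VPCs. Only the three VPC names come from the description above; the CIDR ranges, ports, and rules are hypothetical placeholders, not our actual configuration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

# Hypothetical CIDR blocks, chosen only for illustration.
VPCS = {
    "corporate": ipaddress.ip_network("10.0.0.0/16"),
    "development": ipaddress.ip_network("10.1.0.0/16"),
    "production": ipaddress.ip_network("10.2.0.0/16"),
}


@dataclass(frozen=True)
class PeeringRule:
    source: str       # VPC the traffic originates from
    destination: str  # VPC the traffic is addressed to
    port: int         # destination TCP port


# Assumed allowlist: any cross-VPC flow not listed here is denied.
ALLOWED = {
    PeeringRule("corporate", "development", 443),
    PeeringRule("corporate", "production", 443),
}


def vpc_of(address: str) -> Optional[str]:
    """Return the name of the VPC that contains the address, if any."""
    ip = ipaddress.ip_address(address)
    for name, network in VPCS.items():
        if ip in network:
            return name
    return None


def is_allowed(src: str, dst: str, port: int) -> bool:
    """Permit traffic within one VPC; cross-VPC traffic must match the allowlist."""
    src_vpc, dst_vpc = vpc_of(src), vpc_of(dst)
    if src_vpc is None or dst_vpc is None:
        return False  # addresses outside the known VPCs are never trusted
    if src_vpc == dst_vpc:
        return True
    return PeeringRule(src_vpc, dst_vpc, port) in ALLOWED
```

Under these assumed rules, `is_allowed("10.1.2.3", "10.2.0.9", 443)` is false: nothing in Development can reach Production, because no rule names that pair.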
This isolation extends beyond networking. No changes made in our development environment can influence production in any way. Changes to our production environment are made only through automation that has been code reviewed, rigorously tested, and continuously audited. Taking manual steps out of production changes greatly reduces the risk of human operator error.
At Cadre, we don’t build software just for the sake of building it. We build software that serves our needs. New proptech companies emerge daily, so we know it’s important to move fast to maintain our first-mover advantage. Simplicity of development is therefore a very important aspect of our engineering process. Cadre is written as a monolithic Django application with a React.js frontend, and all of our software runs in AWS.
We’re a small team of about 30 engineers and don’t see much value in overcomplicating our stack by introducing a multi-language microservices architecture. We don’t rule out the possibility of creating separate services in the future, but at our current size and load, a Django monolith allows us to move fast and keep things simple.
We like to both benefit from and contribute to the open-source community. We sponsor the Django Software Foundation and Django REST Framework, and last year we sponsored DjangoCon in San Diego. We’ve made numerous contributions to upstream dependencies and are maintainers of a popular Django model-version-tracking library called Django Simple History (which we also use at Cadre).
We leverage a lot of automated testing to ensure our code is high quality. Before hitting production, every change goes through an extensive pipeline of unit, regression and end-to-end tests. We have a custom end-to-end testing framework tightly integrated with the Django ORM, and built with pytest and the official Python Selenium bindings.
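To make the idea of a gated pipeline concrete, here is a minimal sketch in which each stage must pass before the next one runs. The stage names mirror the three kinds of tests mentioned above; the sequential ordering, the stop-on-first-failure behavior, and the placeholder checks are assumptions for illustration, not a description of our actual CI configuration.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a zero-argument check that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns whether every stage passed, plus the names of the stages that ran.
    """
    executed = []
    for name, check in stages:
        executed.append(name)
        if not check():
            return False, executed  # later stages never run
    return True, executed


# Placeholder checks standing in for real test suites.
stages = [
    ("unit", lambda: True),
    ("regression", lambda: False),
    ("end-to-end", lambda: True),
]
passed, ran = run_pipeline(stages)
```

In this example the regression stage fails, so the end-to-end stage is skipped and the change would not be eligible to ship.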
Even with rigorous automated testing, a human touch is needed to validate new features or designs. For every pull request created by a developer, we automatically spin up a separate version of cadre.com, complete with generated production-like fixture data. These lightweight environments are only accessible within our corporate network. A developer can send a link to this site to designers, product managers, and business owners so they can test the feature and provide feedback. Not only does this increase development velocity because it shortens the feedback loop between the various stages of development, it also simplifies the process of manual testing and helps us maintain the quality of the code we’re shipping to production.
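The sketch below shows one way a per-pull-request environment could be named and seeded. The hostname scheme, the `preview.internal.example` domain, and the shape of the fixture rows are all hypothetical; the point is only that seeding the generator with the pull request number gives every reviewer the same data for the same pull request.

```python
import random
import re


def preview_hostname(pr_number: int, branch: str) -> str:
    """Build a DNS-safe hostname for a pull request's preview site."""
    # Collapse anything that is not a lowercase letter or digit into a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower())[:30].strip("-")
    # Hypothetical internal domain, reachable only from the corporate network.
    return f"pr-{pr_number}-{slug}.preview.internal.example"


def generate_users(pr_number: int, count: int = 3) -> list:
    """Generate fake user rows, seeded so each preview is reproducible."""
    rng = random.Random(pr_number)  # same pull request, same data
    return [
        {
            "id": i,
            "email": f"user{i}@example.com",
            "signup_day": rng.randint(1, 365),
        }
        for i in range(1, count + 1)
    ]
```

For instance, `preview_hostname(128, "feature/New_Dashboard")` yields `pr-128-feature-new-dashboard.preview.internal.example`, a link that could be shared with designers and product managers.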
We ship to production all day, every day. Here’s why:
- First and foremost, we’re continually improving and enhancing our customer experience, internal tools, and data platform. We want these changes to hit production as soon as possible.
- We believe that every developer should be responsible for shipping quality code. Developers take this responsibility seriously — especially when hitting “merge” means their code is going straight to production. We also think it’s substantially easier to catch bugs by shipping often and in small chunks, as opposed to large daily or weekly deployments.
- Our team of designers, product managers, and engineers is motivated by seeing how their work contributes to the company’s success. Seeing our changes hit production within an hour of merging is exhilarating. Plus, it encourages completing tasks in smaller increments, at a faster pace.
One of our Platform Team’s values is to build tools and an environment that are useful for the entire company, including the ability to work remotely. We’re a relatively small team and understand that, from time to time, colleagues may need to work from home or from remote destinations. We’re okay with that. We build all of our tools and frameworks with this in mind, so that people stay productive wherever they are working.
Cadre Engineering, come work with us!