Modern applications require high availability. Our customers expect it, our customers demand it. But building a modern scalable application that has high availability is not easy and does not happen automatically. Problems happen. And when problems happen, availability suffers. Sometimes availability problems come from the simplest of places, but sometimes they can be highly complex.
In this episode, we will continue our discussion from last week with the remainder of the five strategies for keeping your modern application, highly available as well.
This is How to Improve Application Availability, on Modern Digital Applications.
Links and More Information
The following are links mentioned in this episode, and links to related information:
How to Improve Availability, Part 2
Building a scalable application that has high availability is not easy and does not come automatically. Problems can crop up in unexpected ways that can cause your application to stop working for some or all of your customers.
No one can anticipate where problems will come from, and no amount of testing will find all issues. Many of these are systemic problems, not merely code problems.
To find these availability problems, we need to step back and take a systemic look at our applications and how they works.
What follows are five things you can and should focus on when building a system to make sure that, as its use scales upwards, availability remains high. In part 1 of this series, we discussed two of these focuses. The first was building with failure in mind. The second was always think about scaling. In part 2 of this series, we conclude with the remaining three focuses.
Number 3 - Mitigate risk
Keeping a system highly available requires removing risk from the system. When a system fails, often the cause of the failure could have been identified as a risk before the failure actually occurred. Identifying risk is a key method of increasing availability.
All systems have risk in them. There is risk that:
Keeping a system available requires removing risk. But as systems become more and more complicated, this becomes less and less possible. Keeping a large system available is more about managing what your risk is, how much risk is acceptable, and what you can do to mitigate that risk.
This is Risk management, and it is at the heart of building highly available systems.
Part of risk management is risk mitigation. Risk mitigation is knowing what to do when a problem occurs in order to reduce the impact of the problem as much as possible. Mitigation is about making sure your application works as best and as completely as possible, even when services and resources fail. Risk mitigation requires thinking about the things that can go wrong, and putting a plan together now, to be able to handle the situation when it does happen.
For example, consider a typical online e-commerce store. Being able to search for product on the e-commerce store is critical to almost any online store.
Podchaser is the ultimate destination for podcast data, search, and discovery. Learn More