DNS Incidents Like Cloudflare’s Could Render Your Status Page Useless; Here Is How to Prevent It
How do I prevent DNS-related issues from affecting my status page? This is a question we often receive from customers at Statuspal.
We wrote this article to answer it and to explain the reasoning behind each recommendation. In short:
Use a separate domain for your status page. Instead of status.yourcompany.com, use yourcompany-status.com.
Use different DNS providers (nameservers) for your status page and your main infrastructure. For example, if you use Cloudflare for your primary servers, use something else for yourcompany-status.com.
You could go as far as using a completely different domain registrar.
What is DNS?
According to Cloudflare, the Domain Name System (DNS) is the Internet's phonebook. People access information online via domain names like nytimes.com and espn.com. Web browsers interact through Internet Protocol (IP) addresses. DNS translates domain names to IP addresses so browsers can load Internet resources.
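To make the translation step concrete, here is a minimal sketch in Python that asks the operating system's resolver for the addresses behind a name. The function name resolve_addresses is our own, chosen for illustration, and the result depends on the machine's resolver configuration.

```python
from __future__ import annotations

import socket


def resolve_addresses(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IP addresses found for a hostname."""
    # getaddrinfo asks the operating system's resolver, which consults DNS
    # (or local sources such as the hosts file) to map the name to addresses.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

If DNS is unavailable, a call such as resolve_addresses("status.yourcompany.com") raises socket.gaierror instead of returning addresses, which is the situation your visitors' browsers are in during a DNS outage.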
Types of DNS issues
There are three types of DNS issues that can affect your services. Below are examples of each that we've seen in the past:
Outage affecting your DNS provider
Your DNS provider (nameservers) can suffer an outage, like the one that hit Cloudflare yesterday, June 21, 2022, and brought down a significant portion of the internet.
This type of outage can cause slow response times or make your services completely unreachable.
Outage affecting your domain registrar
An outage at your domain registrar can impact your domain name and render your services unreachable or unresponsive. The results are similar to those of a DNS provider outage.
Incorrect DNS records
Finally, the third issue is a misconfigured DNS record. When your status page shares DNS management with your main domain, a bad record can take the status page down as well. This kind of error originates within your own company or team rather than upstream.
How to Prevent DNS issues from affecting your status page
Now that we've identified the types of DNS issues that can affect your status page, how can we prevent them?
1. Use a separate domain for your status page
It's common for companies to use a vanity subdomain like status.apple.com; it looks good, feels legitimate, and maintains branding. However, it can come at the expense of your status page's resilience.
Instead, opt for a completely different domain name, like apple-status.com rather than status.apple.com. This helps with issues such as incorrect DNS records: since the status page's DNS is managed separately, the chance that a wrong record on your main domain also breaks the status page is significantly reduced.
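A quick way to audit this recommendation is to check whether the status page host still lives under the main domain. The sketch below is only a naive heuristic: it compares the last two labels of each hostname, which is wrong for suffixes such as co.uk, and the function names are hypothetical ones chosen for illustration.

```python
def naive_registrable_domain(hostname: str) -> str:
    """Return the last two labels, e.g. 'status.example.com' -> 'example.com'.

    Assumption: a naive heuristic. A real check should consult the
    Public Suffix List to handle suffixes such as 'co.uk'.
    """
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def shares_domain(main_host: str, status_host: str) -> bool:
    """True if both hosts fall under the same (naively computed) domain."""
    return naive_registrable_domain(main_host) == naive_registrable_domain(status_host)


# A vanity subdomain shares the main domain; a separate domain does not.
print(shares_domain("yourcompany.com", "status.yourcompany.com"))
print(shares_domain("yourcompany.com", "yourcompany-status.com"))
```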
2. Use a different DNS nameserver for your status page
Using a different DNS provider is critical in preventing your status page from being affected by the same DNS incidents that can affect your main servers.
For example, DigitalOcean's status page and admin dashboard were inaccessible yesterday during Cloudflare's outage. Unfortunately, users could not get information about this incident from the status page during this crucial time.
So if, for example, you use Fastly for your main infrastructure, use something different for your status page domain. Recommendation #1 is a prerequisite for this.
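The same kind of audit can be applied to nameservers. The sketch below assumes you have already collected the NS hostnames of both domains (for example with a DNS lookup tool) and simply reports any provider domain that appears on both sides. The nameserver names are made up for illustration, and grouping nameservers by their last two labels is a simplifying assumption.

```python
from __future__ import annotations


def provider_of(nameserver: str) -> str:
    """Group a nameserver by its last two labels (a simplifying assumption)."""
    labels = nameserver.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def shared_dns_providers(main_ns: list[str], status_ns: list[str]) -> set[str]:
    """Return the provider domains that serve both the main and status domains."""
    return {provider_of(ns) for ns in main_ns} & {provider_of(ns) for ns in status_ns}


# Hypothetical NS records for yourcompany.com and yourcompany-status.com.
main_ns = ["ns1.provider-a.example", "ns2.provider-a.example"]
status_ns = ["ns1.provider-b.example", "ns2.provider-b.example"]

overlap = shared_dns_providers(main_ns, status_ns)
if overlap:
    print("Both domains depend on:", sorted(overlap))
else:
    print("No shared DNS provider found")
```

An empty result suggests the two domains do not depend on the same DNS provider, although it cannot detect two differently named nameserver sets that are operated by the same company.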
3. Use a different domain registrar for your status page
Finally, although uncommon, an outage affecting your domain registrar can render your services unreachable. Using a separate registrar for your status page reduces the chance that a single outage affects both at the same time.
For example, if you registered yourcompany.com with Namecheap, use Amazon Route 53 for your status page domain (yourcompany-status.com).
Status pages following these recommendations
These practices are becoming more common as companies learn from unexpected problems. Here are some examples of status pages implementing at least some of our recommendations:
- Meta status page (https://metastatus.com/)
- Twitter status page (https://api.twitterstat.us/)
- Intercom status page (https://www.intercomstatus.com/)
- Cloudflare status page (https://www.cloudflarestatus.com)
All of the above status pages use their own separately registered domain, and none of them depends on the same domain as the company's main servers.
DNS-related issues are not the most common type of incident, but they are a real threat. If you want a highly available status page, you need to guard against them. Follow our advice and prevent your status page from becoming useless when you need it the most.