Twitter has 99.17% uptime for June

According to uptime monitoring service Pingdom, social networking site Twitter had a June uptime figure of 99.17%. Although this sounds high, it is actually low by industry standards, especially for a large site like Twitter with so many resources at its disposal.
The 0.83% downtime figure equates to 5 hours and 43 minutes of lost Tweeting. Network configuration issues as well as spikes of traffic due to the World Cup and NBA Finals caused the downtime.
Unfortunately, Twitter fanatics will not be able to get the lost time back. Maybe the site will have better uptime this month?
Photo | Flickr
Tag: downtime, outage, twitter, uptime
What to do when your server goes down

First of all: do not panic. What appears to be an outage may actually be an issue with your own network connection or Internet congestion. Once you have eliminated the usual suspects, there are a few steps you can take to resolve the issue quickly and get your dedicated server back up and running.
1. Test an SSH connection. If you can still SSH into your server, you most likely just have a software issue. If your web server application (such as Apache) has crashed, a simple restart may fix the problem. If you notice it crashing routinely every day or every week, a security exploit may be to blame.
2. If you cannot SSH into your server, try to ping and traceroute the server. If the traceroute gets responses at every hop up to your server but you still cannot connect, the network is fine, but the physical server may have crashed or been shut down. Follow the normal procedure for rebooting. If your server is remote, you can ask your web host to reboot it. Some hosts also have automatic reboot switches that you can activate remotely. If something is wrong with the network, check with your host. They may already be diligently trying to fix the problem.
3. If rebooting does not fix the problem, and you cannot access your server, your host may offer you a KVM connection so that you can troubleshoot your server’s network settings.
4. If your host cannot even get the server to start in order to use KVM, they will probably have to re-image your box. This will erase everything, and you will be thankful at this point that you have kept backups of all websites on your server.
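The first two steps above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function names `port_open` and `next_step` are hypothetical, SSH is assumed to listen on the default port 22, and a plain TCP connection attempt stands in for a real SSH login. The result of your own ping and traceroute is passed in by hand.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def next_step(ssh_reachable, network_reachable):
    """Map the checks from steps 1 and 2 to a suggested next action.

    network_reachable is your own reading of ping/traceroute output
    (an assumption of this sketch; it is not measured here).
    """
    if ssh_reachable:
        return "Log in over SSH and check or restart the web server software."
    if network_reachable:
        return "Network path looks fine: reboot the server or ask your host to."
    return "Possible network problem: check with your host."
```

For example, `next_step(port_open("203.0.113.10", 22), network_reachable=True)` returns the suggested action for a server at that placeholder address, given that your traceroute reached it.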
Photo Source: stock.xchng
Tag: apache, internet, kvm, network, outage, reboot, server, ssh, traceroute
Amazon EC2 cloud service experiences power outage… again
Earlier this week, Amazon’s EC2 cloud service experienced yet another power outage. This time, a car crashed into a local utility pole and knocked out the power. The generator transfer switch failed. A number of East Coast customers lost service for about an hour.
A very similar incident occurred in 2007 at a Rackspace data center. Even so, Amazon needs to get its act together. Why didn’t the server load transfer over to the generators properly?
The cloud computing provider surely won’t be signing up very many new customers if these power outages continue. Current EC2 users, meanwhile, must be very upset about this and worried about Amazon’s long-term reliability.
Tag: amazon, cloud computing, ec2, outage, power outage
Germany's .de experiences major outage

The majority of the Internet’s 13.6 million .de domains were unavailable between 1:30pm and 2:50pm German time yesterday. DENIC, the .de operator, reports that the names went “kaputt” after empty zone files were accidentally uploaded to the DNS root system.
Information on the number of names affected varies. According to one source, every .de name starting with the letters “a” through “0” saw downtime. DENIC is still investigating the outage.
Source | The Register
Amazon addresses cloud computing power issues

After power outages on Amazon’s EC2 cloud computing service resulted in a loss of service for some users on May 4 and May 8, Amazon has announced that it is working on a change in its power distribution to address the issue. The company said the changes will “significantly reduce the number of instances that can be affected by failures like we have seen in the last week.”
The outages were caused by the failure of several electrical components as well as human error. Several disgruntled users report experiencing data loss as well.
While most EC2 users were unaffected by the power failures, this just goes to show that cloud computing isn’t perfectly reliable and that there is still a lot of progress to be made in the field of distributed computing.
Tag: amazon, cloud computing, downtime, ec2, outage, power failure
THAT caused a web host outage?
Usually when a web host goes down, the cause is something very mundane. Maybe a router went offline or a hardware upgrade didn’t go as planned. In the case of Rackspace in 2007, however, something no one could have expected knocked one of its data centers out: a truck.
In a dizzying domino effect, a truck crashed into a utility pole. The pole then crashed into a nearby transformer, blowing it up. The power went out and Rackspace’s generators couldn’t handle the equipment load. All of its dedicated server clients were taken offline.
It took around 12 hours for service to be restored. Although very costly and inconvenient, Rackspace takes the cake for the coolest data center outage cause.
Source | Randomkitty.net
Photo | jsnward
Customers of The Planet experience outages

Dedicated server owners at The Planet’s facility in Houston, Texas, experienced an outage lasting around 90 minutes last night. Customers were pleased to once again have access to their sites, only to experience more downtime this morning.
Since then, The Planet has brought all its servers back online. The hosting company says a router issue in the core network caused the two outages. While the problem may be solved now, some customers wish more had been done to update them on the situation.
Whatever happens, hosts have an obligation to keep their customers updated. It’s unclear whether The Planet did a good job of this last night and this morning, but when choosing a host, check to see what sort of communication lines it has with customers. You definitely don’t want to be left in the dark if there is ever an issue.
Source | Data Center Knowledge
Tag: downtime, houston, outage, the planet
Why uptime matters

Ever notice that most hosts have 99.9% uptime? There’s a good reason why they try so hard to keep things running. While a few percentage points might not seem like a big deal, over the course of a year they can really add up. Just take a look at the numbers:
99.9% uptime = 8.76 hours of downtime
99.5% uptime = 43.8 hours
99.0% uptime = 87.6 hours
97.0% uptime = 262.8 hours
Even a host that has 99.5% uptime still experiences nearly 2 days of downtime per year! For some, that may not be a problem. But keep in mind that falling uptime figures are a very slippery slope. A seemingly decent uptime of 97% translates to 262.8 hours of outage time, or a little under 11 days.
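The arithmetic behind these figures is easy to check. Here is a minimal Python sketch, assuming a non-leap year of 8,760 hours; the function name `downtime_hours` is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime implied by an uptime percentage over a period."""
    return (100.0 - uptime_percent) / 100.0 * period_hours


# Print the yearly downtime for the uptime levels listed above.
for pct in (99.9, 99.5, 99.0, 97.0):
    hours = downtime_hours(pct)
    print(f"{pct}% uptime = {hours:.2f} hours of downtime per year")
```

Passing a different `period_hours` (for example `30 * 24` for a 30-day month) gives the downtime for a shorter period.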
Thanks, LiquidWeb
Normally my VPS provider, LiquidWeb, does a very good job. In my one year with them, I had experienced zero downtime, until this week. While it’s true that no provider is perfect, I think the company could have done a better job of handling a recent hardware problem.
For about a week or so, the parent server my VPS is hosted on was rebooted multiple times, up to several times a day. Each reboot took my sites down for around ten minutes. This wasn’t the end of the world, but it was affecting my traffic figures. If a site isn’t reliable, people will stop visiting.
So I contacted LiquidWeb about the problem. The support staff was very friendly and responded quickly, but I wasn’t so pleased with the response:
…we’ve been experiencing some problems with the parent server your VPS is hosted on. We are aware of the reboots, and will replace any necessary hardware if it comes to that. While I agree that it’s frustrating to have the server reboots occur, we do have to balance maintenance with keeping services available for all of the customers on this parent server.
Later that day, I received an update saying the server would be offline for some time in order to replace a broken RAID controller. I’m upset about two things:
Read More >>
WordPress.com blog hosting suffers outage
The WordPress.com blog hosting service suffered a two-hour outage today. The downtime had nothing to do with the WordPress CMS itself; instead, it made the 10 million sites using the free blog hosting service unavailable.
The cause of the outage is still being investigated, but right now it seems as though one router caused all the ruckus. Apparently someone at one of the four data centers where WordPress rents space made a configuration change to a core router. This blocked access not only to the blogs at that particular facility but also to those at the other three data centers.
Photo | ozdv8
Tag: blog hosting, downtime, free web hosting, outage, wordpress