Twitter has 99.17% uptime for June

According to uptime monitoring service Pingdom, social networking site Twitter had a June uptime figure of 99.17%. Although this sounds high, it is actually low by industry standards, especially for a large site like Twitter with so many resources at its disposal.
The 0.83% downtime figure equates to 5 hours and 43 minutes of lost Tweeting. Network configuration issues as well as spikes of traffic due to the World Cup and NBA Finals caused the downtime.
Unfortunately, Twitter fanatics will not be able to get the lost time back. Maybe the site will have better uptime this month?
Photo | Flickr
Tag: downtime, outage, twitter, uptime
How to Schedule a Reboot on a Windows Server

In a previous post, I explained how to use the “at” command to schedule a reboot on a Linux server. On a Windows server, you can accomplish the same thing. Scheduling a reboot is helpful for those rare occasions when you make changes to your server that require a reboot. A major system security update is a perfect example.
In those instances, it is not wise to reboot your server in the middle of the day, at the height of website traffic. By scheduling your reboot, you can minimize the number of website visitors affected by any downtime. Also, if something goes wrong, any extended downtime will be during off hours.
To schedule a reboot, open the Windows command prompt and run the following command:
C:\> at 4:00am c:\admutils\psshutdown.exe -r -f -c -t 10
In this example, the server will reboot at 4:00AM. As with any server, make sure that your system time is correct. Otherwise, you might end up performing the reboot at an inopportune time.
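Before scheduling, it can help to confirm what the server's clock reads and when the job would actually fire. The following is only a minimal Python sketch for that sanity check. The helper name next_run is hypothetical, and the sketch assumes that a time which has already passed today rolls over to the next day.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, hour: int = 4, minute: int = 0) -> datetime:
    """Return the next occurrence of hour:minute strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Assumption: if that time has already passed today, the job runs tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    now = datetime.now()  # local system time, the same clock the scheduler uses
    run_at = next_run(now)
    print(f"System clock reads: {now:%Y-%m-%d %H:%M}")
    print(f"Reboot would fire:  {run_at:%Y-%m-%d %H:%M} (in {run_at - now})")
```

If the first line of output does not match the real local time, fix the system clock before scheduling the reboot.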
Source: nixCraft
Photo Source: Flickr
Tag: at command, downtime, reboot, schedule, server, system, windows
99% Uptime Guarantee

It seems as though nearly all web hosting providers promise 99% uptime. Therefore, the promise alone does not make the choice any easier. While there are sites that provide monitoring services that rate the actual uptime of hosts, the real question you should ask a web host is what the guarantee entails.
There is no question that even the best web host will have some downtime. That is why no host promises 100% uptime. When your server does go down, what are the consequences? Will the web host say nothing and just eventually turn it back on, pretending that nothing happened? Will they apologize after a reboot? Will they assure you that it will never happen again?
The truth is, when a website is critical to an organization or business, a web host, which is also a business, should compensate the customer for downtime. That guarantee should include a clause about compensation. It may take the form of pro-rating a monthly fee or some other form, but the ultimate outcome should satisfy the customer.
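As one illustration of what pro-rating could look like, here is a rough Python sketch. The formula (the fee multiplied by the fraction of the month spent offline), the 720-hour month, and the function name prorated_credit are all assumptions made for the example, not the terms of any particular host's guarantee.

```python
def prorated_credit(monthly_fee: float, downtime_hours: float,
                    hours_in_month: float = 720.0) -> float:
    """Credit owed for downtime, as a simple share of the monthly fee.

    Assumption: the credit is proportional to the time offline and is
    capped at the full monthly fee. Real guarantees often differ.
    """
    if downtime_hours < 0 or hours_in_month <= 0:
        raise ValueError("downtime must be >= 0 and hours_in_month > 0")
    fraction = min(downtime_hours / hours_in_month, 1.0)
    return round(monthly_fee * fraction, 2)


if __name__ == "__main__":
    # Hypothetical plan: $100 per month with 7.2 hours of downtime.
    print(prorated_credit(100.0, 7.2))
```

A purely proportional credit is small for short outages, which is one reason to read the compensation clause closely before signing up.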
Photo Source: Flickr
Germany's .de experiences major outage

The majority of the Internet’s 13.6 million .de domains were unavailable between 1:30pm and 2:50pm German time yesterday. DENIC, the .de operator, reports that the names went “kaputt” after empty zone files were accidentally uploaded to the DNS root system.
Information on the number of names affected varies. According to one source, every .de name starting with the letters “a” through “0” saw downtime. DENIC is still investigating the outage.
Source | The Register
Amazon addresses cloud computing power issues

After power outages on Amazon’s EC2 cloud computing service resulted in a loss of service for some users on May 4 and May 8, Amazon has announced that it is working on a change in its power distribution to address the issue. The company said the changes will “significantly reduce the number of instances that can be affected by failures like we have seen in the last week.”
The outages were caused by the failure of several electrical components as well as human error. Several disgruntled users report experiencing data loss as well.
While most EC2 users were unaffected by the power failures, this just goes to show that cloud computing isn’t perfectly reliable and that there is still a lot of progress to be made in the field of distributed computing.
Tag: amazon, cloud computing, downtime, ec2, outage, power failure
Customers of The Planet experience outages

Dedicated server owners at The Planet’s facility in Houston, Texas, experienced an outage lasting around 90 minutes last night. Customers were pleased to once again have access to their sites, only to experience more downtime this morning.
Since then The Planet has brought all its servers back online. The hosting company says a router issue in the core network caused the two outages. While the problem may be solved now, some customers wish more was done to update them on the situation.
Whatever happens, hosts have an obligation to keep their customers updated. It’s unclear whether The Planet did a good job of this last night and this morning, but when choosing a host, check what lines of communication it keeps open with customers. You definitely don’t want to be left in the dark if there is ever an issue.
Source | Data Center Knowledge
Tag: downtime, houston, outage, the planet
Why uptime matters

Ever notice that most hosts advertise 99.9% uptime? There’s a good reason they try so hard to keep things running. While a few fractions of a percentage point might not seem like a big deal, over the course of a year they can really add up. Just take a look at the numbers:
99.9% uptime = 8.76 hours of downtime
99.5% uptime = 43.8 hours
99.0% uptime = 87.6 hours
97.0% uptime = 262.8 hours
Even a host with 99.5% uptime still experiences nearly 2 days of downtime per year! For some, that may not be a problem. But keep in mind that falling uptime figures are a slippery slope. A seemingly decent uptime of 97% translates to 262.8 hours of outage time, or a little under 11 days.
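The figures above come from multiplying the downtime percentage by the number of hours in a year. A short Python sketch of that arithmetic, assuming a 365-day year (8,760 hours) and using the hypothetical helper name downtime_hours:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def downtime_hours(uptime_percent: float,
                   period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime implied by an uptime percentage over a period."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * period_hours


if __name__ == "__main__":
    for pct in (99.9, 99.5, 99.0, 97.0):
        hours = downtime_hours(pct)
        print(f"{pct}% uptime = {hours:.2f} hours ({hours / 24:.2f} days)")
```

Passing a different period_hours, such as 720 for a 30-day month, gives the monthly equivalent.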
Thanks, LiquidWeb
Normally my VPS provider, LiquidWeb, does a very good job. In my one year with them, I have experienced zero downtime, until this week. While it’s true that no provider is perfect, I think the company could have done a better job of handling a recent hardware problem.
For about a week, the parent server my VPS is hosted on was rebooted repeatedly, sometimes several times a day. Each reboot took my sites down for around ten minutes. This wasn’t the end of the world, but it was affecting my traffic figures. If a site isn’t reliable, people will stop visiting.
So I contacted LiquidWeb about the problem. The support staff was very friendly and responded quickly, but I wasn’t so pleased with the response:
…we’ve been experiencing some problems with the parent server your VPS is hosted on. We are aware of the reboots, and will replace any necessary hardware if it comes to that. While I agree that it’s frustrating to have the server reboots occur, we do have to balance maintenance with keeping services available for all of the customers on this parent server.
Later that day, I received an update saying the server would be offline for some time in order to replace a broken RAID controller. I’m upset about two things:
Read More >>
Overheating takes down Wikipedia

It wouldn’t be wrong to say things were heating up at Wikipedia yesterday. According to Data Center Knowledge, a server shutdown caused by overheating at a European data center sent the online encyclopedia down for several hours.
Normally, when an incident like this occurs, a failover mechanism reroutes traffic to another data center. But in Wikipedia’s case, this measure failed and the entire site went down worldwide. An announcement on the site’s blog stated:
As this impacted all Wikipedia and other projects access from European users, we were forced to move all user traffic to our Florida cluster, for which we have a standard quick failover procedure in place, that changes our DNS entries. However, shortly after we did this failover switch, it turned out that this failover mechanism was now broken, causing the DNS resolution of Wikimedia sites to stop working globally
The data center in question is located in Amsterdam and houses 50 servers. It uses energy-efficient passive air cooling. Wikipedia has not announced why this cooling system failed.
Tag: downtime, server cooling, wikipedia
WordPress.com blog hosting suffers outage
The WordPress.com blog hosting service suffered a two-hour-long outage today. The downtime had nothing to do with the WordPress CMS, but instead rendered the 10 million sites using its free blog hosting service unavailable.
The cause of the outage is still being investigated, but right now it seems as though one router caused all the ruckus. Apparently someone at one of the four data centers where WordPress rents space made a configuration change to a core router. This not only blocked off access to the blogs at that particular facility, but the other three data centers as well.
Photo | ozdv8
Tag: blog hosting, downtime, free web hosting, outage, wordpress