Gmail outage blamed on 'routine upgrade' errors

The disruption is the latest in a series of outages that analysts predict could dent users' confidence in online applications

New York, September 2 -- In an outage that affected a majority of its 150 million users, Google Inc.'s Gmail service was offline for nearly two hours on Tuesday.

Users who repeatedly tried to connect to the e-mail service received an "Unable to reach Gmail" error message.

Google detected the failure within seconds and fixed the problem by transferring traffic across the rest of its networks.

Changes in routers boomerang
According to Google, the problem occurred due to errors during a "routine upgrade". The disruption began when Google took some of its Gmail servers offline for maintenance.

To improve Gmail's ability to stay online, Google had made changes to the routers that direct traffic to the remaining servers, but these changes boomeranged. The company had also underestimated the load the changes placed on the request routers, leading to the disruption.

"At about 12:30 pm Pacific a few of the request routers became overloaded and in effect told the rest of the system 'stop sending us traffic, we're too slow!'" Google wrote in a blog post. "This transferred the load onto the remaining request routers, causing a few more of them to also become overloaded, and within minutes nearly all of the request routers were overloaded."
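The chain reaction Google describes can be illustrated with a toy simulation. This is only a sketch under simple assumptions, not Google's actual routing logic: the function name `simulate_cascade`, the capacity figures, and the rule that load is shared evenly among healthy routers are all hypothetical.

```python
def simulate_cascade(capacities, total_load):
    """Return the number of healthy routers after each round.

    Assumption: load is split evenly across healthy routers, and any
    router asked to carry more than its capacity drops out entirely.
    """
    healthy = list(capacities)
    history = [len(healthy)]
    while healthy:
        share = total_load / len(healthy)
        survivors = [c for c in healthy if c >= share]
        if len(survivors) == len(healthy):
            break  # every remaining router can carry its share
        healthy = survivors
        history.append(len(healthy))
    return history


# Ten hypothetical routers with capacities from 100 to 145 units.
capacities = list(range(100, 150, 5))

print(simulate_cascade(capacities, 900))   # enough headroom: [10]
print(simulate_cascade(capacities, 1050))  # too little headroom: [10, 9, 6, 0]
```

In the second run, a single overloaded router pushes its share onto the others, and the whole pool collapses in three rounds, which is the pattern the blog post describes.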

Disruptions can undermine confidence
The disruption is the latest in a series of outages that analysts predict could dent users' confidence in online applications. Users had experienced an outage that lasted for several hours on Feb. 24. They also witnessed minor problems on March 11, April 16 and May 8.

Analysts claim that users will wholeheartedly adopt online applications only if they are assured of uninterrupted access. But technical snags undermine confidence.

"This is one of the reasons that corporate e-mail has not moved to the cloud," said Tim Bajarin, president of Creative Strategies in Campbell. "Under no circumstances do they want to have the system taken down."

Google to implement reliability improvements
According to ComScore Inc., Gmail is the world's third most popular e-mail service, with around 149 million users in June.

Businesses are also increasingly shifting to Google's services, which they can access over the Internet rather than managing software in-house. They can also get more storage capacity than they could otherwise afford. Today more than 1.75 million businesses use Gmail as part of Google Apps.

Describing the outage as a "big deal", Google said that it will ensure that in the future the request routers have sufficient headroom to manage the load.

"We'll be hard at work over the next few weeks implementing these and other Gmail reliability improvements -- Gmail remains more than 99.9% available to all users, and we're committed to keeping events like today's notable for their rarity," said Google's Ben Treynor.
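For a sense of scale, the 99.9% figure can be converted into a downtime allowance. The sketch below is plain arithmetic; the helper name `downtime_budget_minutes` is hypothetical, and the article does not say over what period or by what method Google measures availability.

```python
def downtime_budget_minutes(availability, period_days):
    """Minutes of downtime allowed in a period at a given availability."""
    return (1 - availability) * period_days * 24 * 60


# 99.9% availability leaves 0.1% of the period as allowable downtime.
per_year = downtime_budget_minutes(0.999, 365)   # about 525.6 minutes (8.76 hours)
per_month = downtime_budget_minutes(0.999, 30)   # about 43.2 minutes

print(round(per_year, 1), round(per_month, 1))
```

Measured over a year, an outage of nearly two hours fits within that allowance; measured over a single 30-day month, it would not.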
