New Dev Blog: The Day the Items Disappeared
 
This thread is older than 90 days and has been locked due to inactivity.


 
Pages: [1] 2 3


CCP Fallout

Posted - 2010.05.13 17:24:00 - [1]
 

Friday, May 7th, was an interesting and rough day for both players and CCP. What happened, and how can we prevent future occurrences? CCP Red Button returns with a post-mortem of the events in his newest dev blog.

gerash
Gallente
Firebird Squadron
Terra-Incognita
Posted - 2010.05.13 17:31:00 - [2]
 

I was wondering what policies CCP has in place for recovering from hardware failures, specifically the loss of the database server. Is there a coherent backup strategy in place?

Agent Unknown
Caldari
Posted - 2010.05.13 17:41:00 - [3]
 

Insightful blog. Unfortunately things can go haywire, but at least it didn't affect all the items in EVE...that would be even worse.

GateScout
Posted - 2010.05.13 17:56:00 - [4]
 

Thanks for the informative dev blog. That explains a lot. More of that, please.

Thanks!


CCP Red Button

Posted - 2010.05.13 17:57:00 - [5]
 

To answer your question, Gerash: the Tranquility main database, which is at the heart of EVE, is clustered and fully redundant in every way. If needed, we can recover the database up to the minute from backup, but the problem is the time it currently takes. So what we aim to do now is to be able to switch instantly to a time-delayed mirror, in addition to drastically reducing backup restoration times.
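The idea of a time-delayed mirror can be pictured with a small sketch. This is only an illustration in Python, not CCP's implementation: `LogEntry`, `DelayedMirror`, the item keys, and the one-hour delay are all hypothetical. The mirror receives every change from the primary but applies it only after a fixed delay, so a bad change can be caught while it is still pending.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogEntry:
    timestamp: float          # when the change was committed on the primary
    key: str                  # hypothetical item identifier
    value: Optional[str]      # new location, or None for a delete


@dataclass
class DelayedMirror:
    delay: float                                  # seconds the mirror lags behind
    state: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)   # received but not yet applied

    def _apply(self, entry: LogEntry) -> None:
        if entry.value is None:
            self.state.pop(entry.key, None)
        else:
            self.state[entry.key] = entry.value

    def receive(self, entry: LogEntry) -> None:
        # Entries are assumed to arrive in timestamp order.
        self.pending.append(entry)

    def advance(self, now: float) -> None:
        # Normal operation: apply only entries older than the delay window.
        cutoff = now - self.delay
        remaining = []
        for entry in self.pending:
            if entry.timestamp <= cutoff:
                self._apply(entry)
            else:
                remaining.append(entry)
        self.pending = remaining

    def recover_until(self, stop_before: float) -> dict:
        # Incident: apply everything committed before the bad change,
        # then discard the rest and hand back the recovered state.
        for entry in self.pending:
            if entry.timestamp < stop_before:
                self._apply(entry)
        self.pending = []
        return dict(self.state)


mirror = DelayedMirror(delay=3600.0)
mirror.receive(LogEntry(0.0, "item_a", "hangar"))
mirror.receive(LogEntry(100.0, "item_b", "hangar"))
mirror.receive(LogEntry(200.0, "item_a", None))   # the faulty delete
mirror.advance(now=300.0)                         # nothing is old enough yet
recovered = mirror.recover_until(stop_before=200.0)
```

In this toy run the faulty delete at `t=200` is still inside the delay window, so the mirror can be rolled forward to just before it and both items survive. The trade-off is that the mirror is always behind by the chosen delay.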

Borgh Brainbasher
Saint Industrial Services
STEEL BROTHERHOOD
Posted - 2010.05.13 17:58:00 - [6]
 

So, CCP Red Button, is that the guy who gets hit over the head when the server goes haywire?

Darth Vapour
Posted - 2010.05.13 17:59:00 - [7]
 

Quote:
What all the scrutiny and testing of this script had failed to take into consideration was the huge difference in transaction volume on Tranquility vs. Singularity.


That's kind of amateurish. I hope this was a one-off case and some awareness exists in CCP land that your live and test systems are not comparable in how they behave.

Luke S
Zeta Corp.
Posted - 2010.05.13 18:07:00 - [8]
 

Thanks, Red, but it's still easier to blame Tuxford. Twisted Evil

Rene Sauntier
Gallente
Descendants of Hermes
Posted - 2010.05.13 18:27:00 - [9]
 

Nice blog post; it should stop some people dribbling.

+1 for informative bug fixes

Jamier Legov
Minmatar
Blue Republic
RvB - BLUE Republic
Posted - 2010.05.13 18:30:00 - [10]
 

Quote:
The main lesson learned last Friday, however, is that we need to step up our game in terms of speedy restoration of backups and options for recovery.


That's like closing the barn door after the horse has left....

The main lesson learned should be that testing needs to take into account the differences between Singularity and Tranquility. When you're testing, you need to test with the actual volume of transactions on the production environment. Load testing isn't just a good idea or a software development best practice; it's the right way to prevent problems in the first place.
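Why a script can pass on a quiet test server and still fail under production volume can be shown with a toy model. The Python sketch below is purely hypothetical: the thread does not describe the actual failure mechanism, and `cleanup`, `run_trial`, and the "junkyard"/"hangar" locations are invented for illustration. The cleanup snapshots what it intends to delete and then deletes by id without rechecking; any live transaction that lands between the two steps is lost.

```python
import random


def cleanup(store, junkyard_id, between_steps=lambda: None):
    """Hypothetical two-step cleanup with a deliberate flaw."""
    # Step 1: snapshot the ids currently sitting in the junkyard.
    doomed = [item for item, loc in store.items() if loc == junkyard_id]
    # Live traffic may run here on a busy server.
    between_steps()
    # Step 2: delete by id WITHOUT rechecking the location (the flaw).
    for item in doomed:
        store.pop(item, None)
    return doomed


def run_trial(transactions_between_steps, seed=0):
    """Return how many items in live use were wrongly deleted."""
    rng = random.Random(seed)
    store = {i: "junkyard" for i in range(100)}
    store.update({i: "hangar" for i in range(100, 200)})
    rescued = set()

    def traffic():
        # Simulated player transactions moving items back into use.
        for _ in range(transactions_between_steps):
            item = rng.randrange(100)
            store[item] = "hangar"
            rescued.add(item)

    cleanup(store, "junkyard", traffic)
    return len([item for item in rescued if item not in store])


quiet_server_losses = run_trial(0)    # test-server conditions
busy_server_losses = run_trial(50)    # production-like conditions
```

With no concurrent traffic the flaw is invisible; with traffic in the gap, every item touched is lost. A test that never generates that traffic cannot catch this class of bug, which is the argument for testing at production-like volume.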

Paddlefoot Aeon
SiN. Corp
Daisho Syndicate
Posted - 2010.05.13 18:38:00 - [11]
 

I remember reading that, in the next patch, there will be the ability to run "ghost clients" for developers, to simulate in the DEV environment the high-load conditions typically seen in the production (Tranq.) DB.

Will running these clients help avoid these high-volume-only bugs from making it "live"?

The Pricer
Posted - 2010.05.13 18:55:00 - [12]
 

Originally by: Luke S
Thanks, Red, but it's still easier to blame Tuxford. Twisted Evil


ONE DOES NOT SIMPLY PRESS CCP RED BUTTON!!!! Shocked Twisted Evil

Cinori Aluben
Minmatar
Gladiators of Rage
Intrepid Crossing
Posted - 2010.05.13 18:56:00 - [13]
 

Edited by: Cinori Aluben on 13/05/2010 19:03:43
Wow, great transparency in helping us understand what's going on. Keep it up. And keep up the work to avoid this in future.

I will give props to the GMs in their petition responses for this one. I lost a sacrilege from a buy order on this one, and they gave it back within like 2 days. Good job guys.
Quote:
We were simply faced with the option of either having EVE offline

Cinori Aluben CSM5 2010

Fix the Little Things First!

www.littlethingsfirst.com




Gustovness
Broken Cannon
Posted - 2010.05.13 18:57:00 - [14]
 

I applaud CCP for putting out this dev blog. I think I speak for the majority of your customers that we appreciate your filling us in on these catastrophic disasters and keeping us in the loop. I didn't have any problems myself with this episode, but I like hearing that you guys are taking the time to keep us all informed as to what's going on.

thumbs up Very Happy

Mynxee
Veto.
Veto Corp
Posted - 2010.05.13 19:04:00 - [15]
 

Thanks for an informative "behind the scenes" explanation...much appreciated!

Nierna
Posted - 2010.05.13 19:09:00 - [16]
 

It is these kinds of dev blogs that make me love the CCP developers. Other games would simply say "we had a problem and we fixed it"; you guys actually explain the problem and the fix implemented. Please do keep up this level of transparency.

Chribba
Otherworld Enterprises
Otherworld Empire
Posted - 2010.05.13 19:22:00 - [17]
 

Interesting read. Keep up the good work.

/c

Nareg Maxence
Gallente
Posted - 2010.05.13 19:26:00 - [18]
 

Originally by: Jamier Legov
Quote:
The main lesson learned last Friday, however, is that we need to step up our game in terms of speedy restoration of backups and options for recovery.


That's like closing the barn door after the horse has left....

The main lesson learned should be that testing needs to take into account the differences between Singularity and Tranquility. When you're testing, you need to test with the actual volume of transactions on the production environment. Load testing isn't just a good idea or a software development best practice; it's the right way to prevent problems in the first place.


I am curious as to how exactly you would arrange such a test.

DmitryEKT
Clandestine.
Posted - 2010.05.13 19:45:00 - [19]
 

Rollback would have been cool tho :o

Haven Wind
Posted - 2010.05.13 19:58:00 - [20]
 

Love CCP, Classy response in an industry where often there is no response or communication with the player base at all over these types of issues.

Grimpak
Gallente
Midnight Elites
Echelon Rising
Posted - 2010.05.13 20:13:00 - [21]
 

Originally by: DmitryEKT
Rollback would have been cool tho :o




DON'T EVEN THINK ABOUT IT!

you can't even fathom what the **** happened when the last rollback happened.

Mihara Shiharu
Posted - 2010.05.13 21:07:00 - [22]
 

Time to get a better database really :P

Derus Grobb
Minmatar
Selectus Pravus Lupus
Transmission Lost
Posted - 2010.05.13 21:08:00 - [23]
 

Interesting button.. i mean devblog

T'Amber
Garoun Investment Bank
Posted - 2010.05.13 21:21:00 - [24]
 

Edited by: T'Amber on 13/05/2010 21:23:36

Ah... some of the votes I bought off the market didn't show up that day and I was wondering what happened.
Thanks for the explanation.

-T'amber



Sul Trewtis
ANZAC ALLIANCE
IT Alliance
Posted - 2010.05.13 21:28:00 - [25]
 

Wait wait wait.....


You're telling us the logs actually showed something?

*pods himself in shock*




And unfortunately **** like this happens no matter what procedures you have in place. All you can really do is be professional and make sure it doesn't happen often or ever repeat.

Lagruna Zegata
Posted - 2010.05.13 21:42:00 - [26]
 

Always unfortunate to see these things happen to such a terrific game, especially to divert CCP's attention right before a much-awaited expansion. Kudos for explaining everything as thoroughly as possible.

Some semi-technical side questions: How come this new area of the junkyard needed to be cleaned? And why was a modified script necessary for this area?

~LZ

CCP Red Button

Posted - 2010.05.13 21:55:00 - [27]
 

Interesting question, Lagruna. To tell you the truth, I am not sure why this section of the junkyard needed cleaning at this point. Each section of the junkyard is used for a different set of things, as far as I know, and thus might fill up at a different rate, so that might explain why. I will try to find out tomorrow. The reason for the modification was to optimize performance.


Lost Hamster
Hamster Holding Corp
Posted - 2010.05.13 21:56:00 - [28]
 

It's good to read about the problem. Many of us thought that there was a rollback, as some of the items were missing after the long downtime.

Laendra
Universalis Imperium
Posted - 2010.05.13 22:12:00 - [29]
 

So what do we look for to see if we have been affected by this? At what point is it "too late" to do anything about it?

Nadarius Chrome
Celestial Horizon Corp.
Posted - 2010.05.13 22:51:00 - [30]
 

Originally by: Laendra
So what do we look for to see if we have been affected by this? At what point is it "too late" to do anything about it?


I'm in a similar situation. I have quite a few market orders, but no formal way I track them, and dread to think if there's a chance one or several of mine were affected. I suspect the chance is low but still that nagging doubt...

And I don't want to lodge a petition merely to ask "Hey, could I have been affected?" because I'm sure the GMs handling that queue are plenty busy with petitions from people who did lose items as well as petitions from people who didn't but are trying to scam free stuff.

