New Dev Blog: Fixing Lag: Well, this one doesn't really...
 
This thread is older than 90 days and has been locked due to inactivity.


 
Pages: [1] 2 3


CCP Fallout

Posted - 2010.09.10 11:50:00 - [1]
 

CCP GingerDude's newest dev blog details the work he's been doing on jumping and stuck pilots. You can read about it here.

Thunder XXV
Posted - 2010.09.10 11:56:00 - [2]
 

Edited by: Thunder XXV on 10/09/2010 11:57:04
Edited by: Thunder XXV on 10/09/2010 11:56:44
First

and IBC too :D

Also, nice read, but the title says it all really. Good to see you're making progress.

Chribba
Otherworld Enterprises
Otherworld Empire
Posted - 2010.09.10 11:56:00 - [3]
 

Cool cool.

/c

BenjaminBarker
Posted - 2010.09.10 12:04:00 - [4]
 

Very nice. Loving all the updates!

groak
Thundercats
Posted - 2010.09.10 12:22:00 - [5]
 

I have always wondered whether there should be some sort of hidden server-side feature that would effectively slow down in-game time (locally) to compensate for the lack of pure powah. I don't think anyone would mind fighting hitman-style, where actions and reactions happen in an order the pilot can control. Waiting a few minutes for a button to activate? Sure, but not while watching those drones eat me all the way through structure, or MWDing from Paris to Tokyo.

also yay for effort :)

Germaldi's sister
Amarr
Posted - 2010.09.10 12:26:00 - [6]
 

Edited by: Germaldi's sister on 10/09/2010 12:57:48
Edited by: Germaldi's sister on 10/09/2010 12:36:54
Well, what if instead of a dual-state transfer system you changed it to a tri-state transfer system: state 1, originating system (ship jumps at gate) -> state 2, hyperspace (ship accelerates to hyperspeed and flies between gates in hyperspace) -> state 3, destination system (ship drops out of hyperspace within 15km of the destination gate/cyno).

During the hyperspace flight sequence the servers hand over the client info from origin to destination.

Sort of like how warping from grid to grid works, or even how acceleration gates work in plexes. When you initiate warp you cannot stop it by logging off. The same would go for jumping systems: the ship continues to its destination, and when it comes out of hyperspace the 60-second timer (if no aggro) or 15-minute timer (if aggro'd) starts, to prevent cheating with the logoffski trick.
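
The three-state hand-over proposed above could be sketched roughly as follows. This is only an illustration of the idea as described in this post, not anything CCP has implemented; the names JumpState, advance and logoff_timer_seconds are made up for the example, and the timer values are the 60-second and 15-minute figures mentioned above.

```python
from enum import Enum, auto
from typing import Optional


class JumpState(Enum):
    ORIGIN = auto()       # ship is still in the originating system
    HYPERSPACE = auto()   # ship is in transit; servers hand over the client info
    DESTINATION = auto()  # ship has dropped out near the destination gate


# Transitions are strictly one-way, so logging off cannot cancel a jump.
NEXT_STATE = {
    JumpState.ORIGIN: JumpState.HYPERSPACE,
    JumpState.HYPERSPACE: JumpState.DESTINATION,
}


def advance(state: JumpState) -> JumpState:
    """Move a jumping ship to its next state; DESTINATION is terminal."""
    return NEXT_STATE.get(state, JumpState.DESTINATION)


def logoff_timer_seconds(state: JumpState, aggressed: bool) -> Optional[int]:
    """Return the log-off timer, which only starts once the ship has arrived."""
    if state is not JumpState.DESTINATION:
        return None  # no timer yet: the ship keeps travelling regardless
    return 15 * 60 if aggressed else 60
```

The point of the one-way transition table is that a pilot who logs off in the middle state still ends up at the destination, where the usual timer applies.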

Jason Edwards
Internet Tough Guy
Spreadsheets Online
Posted - 2010.09.10 12:29:00 - [7]
 

>I'm just doing my job.
>Let's stop doing my job just to write dev blog in order to say I'm doing my job.

GOOD JOB. Now let's discuss actual development.

Voight Kampf
OEG
Posted - 2010.09.10 12:29:00 - [8]
 

Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?

ArchenTheGreat
Caldari
Pulsar Nebulah
Army of Lovers.
Posted - 2010.09.10 12:30:00 - [9]
 

Originally by: Germaldi's sister
Well, what if instead of a dual-state transfer system you changed it to a tri-state transfer system: state 1, originating system (ship jumps at gate) -> state 2, hyperspace (ship accelerates to hyperspeed and flies between gates in hyperspace) -> state 3, destination system (ship drops out of hyperspace within 15km of the destination gate).

During the hyperspace flight sequence the servers hand over the client info from origin to destination.



That's what they want to avoid. People will log off in hyperspace to escape the enemy. You could jump through a gate and never appear on the other side.

Germaldi's sister
Amarr
Posted - 2010.09.10 12:34:00 - [10]
 

Edited by: Germaldi's sister on 10/09/2010 12:55:20
Edited by: Germaldi's sister on 10/09/2010 12:52:04
Originally by: ArchenTheGreat


That's what they want to avoid. People will log off in hyperspace to escape the enemy. You could jump through a gate and never appear on the other side.


This could easily be overcome by making the ship come out of hyperspace at the destination, much like it does when it comes out of warp now.

Also, it would add more of a feeling of immersion to jumping systems, like this: Stargate humans & wraith vs replicators battle

And dropping out of hyperspeed could look more like this: Ancient Warship

With this effect while the transfer between servers is happening: hyperspace tunnel effect

CCP GingerDude

Posted - 2010.09.10 12:49:00 - [11]
 

Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?
I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.

filingo rapongo
Vivicide
ROMANIAN-LEGION
Posted - 2010.09.10 12:54:00 - [12]
 

Originally by: CCP GingerDude
Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?
I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.


There are many cases of jump lag where the system is relatively empty and you have little out on the field. We experienced significant jump lag in a 5-man (2 v 3) fight yesterday with no one else in system. These kinds of scenarios would suggest there is an underlying problem with the system-change procedure, rather than it being a side effect of high CPU load on the nodes?

Thank you for the blog, however.

Victor Valka
Caldari
The Kairos Syndicate
Transmission Lost
Posted - 2010.09.10 13:00:00 - [13]
 

Originally by: CCP GingerDude
Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?
I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.
Nerf Caldari? :D

Nemtar Nataal
Demonic Retribution
Posted - 2010.09.10 13:06:00 - [14]
 

Originally by: CCP GingerDude
Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?
I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.
Very interesting question here, but have you guys tested how much data a pre-Dominion client sends to the server compared with a post-Dominion client performing the same operations? I seem to remember another dev blog describing a lot of changes in the client around Dominion resulting in memory leaks and other stuff. I can understand that you might not have changed anything, or anything significant, in the network layer, but there might be processes on top of the network layer that trigger additional load from the network layer.
The reason (probably obvious) is that if everything in your scalability metrics says you should be able to handle more players and you are handling fewer, well, you get the point. A small change in the client software could easily be responsible for additional load on the server; 2 additional calls on jump-in with the same number of players doesn't really scale very well -> not an N+1 problem, but still.

And another thing (which I can't quite remember: didn't you remove the low-end client after Dominion?): are there any statistics that show how many people were using the low-end EVE client to be able to run multiple clients pre-Dominion, if that really was when the old client died? Could the new client be responsible for something that you never noticed because the old client didn't have the same problem and more people were using the old client than the new one?
Again, I apologize for this question, as I can't remember when the client was retired...

CCP GingerDude

Posted - 2010.09.10 13:10:00 - [15]
 

Quote:
There are many cases of jump lag where the system is relatively empty and you have little out on the field. We experienced significant jump lag in a 5-man (2 v 3) fight yesterday with no one else in system. These kinds of scenarios would suggest there is an underlying problem with the system-change procedure, rather than it being a side effect of high CPU load on the nodes?

Yes, this does not address lag in any way. What you experienced may have been caused by another fight in a system 20 jumps away if they were on the same node. This change was squarely aimed at stopping whole fleets getting stuck on jumping with no way to recover.

To add some detail: when this happened (i.e. massive lock queuing) it affected the system's capability to do other stuff as well. Pilots were stuck while exploding, dead ships couldn't finish dying, disconnected client sessions couldn't remove the user's ship from space, etc. All because someone was hogging a shared resource. All of this generated errors, and those also add to the load.

We've started gathering statistics on how long it takes people to jump for analysis.
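
As an illustration of what such an analysis might look at, here is a minimal sketch that summarizes a list of jump durations. It is not CCP's tooling; the function name summarize_jump_times and the sample numbers in the usage note are hypothetical.

```python
import statistics


def summarize_jump_times(durations_s):
    """Summarize jump durations (in seconds): count, median, 95th percentile, max."""
    if len(durations_s) < 2:
        raise ValueError("need at least two samples")
    ordered = sorted(durations_s)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    # method="inclusive" keeps the result within the observed range.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[-1]
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": p95,
        "max": ordered[-1],
    }
```

For example, summarize_jump_times([1, 2, 3, 4, 100]) gives a median of 3 while the maximum is 100, which is the kind of gap between the typical jump and the stuck pilot that an average alone would hide.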

Dar Wento
Posted - 2010.09.10 13:16:00 - [16]
 

Thank you for the detailed blog. I appreciate you are taking your time to explain it to us.

Regards,

/Dar.

CCP GingerDude

Posted - 2010.09.10 13:18:00 - [17]
 

Originally by: Nemtar Nataal
Originally by: CCP GingerDude
Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?
I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.
Very interesting question here, but have you guys tested how much data a pre-Dominion client sends to the server compared with a post-Dominion client performing the same operations? I seem to remember another dev blog describing a lot of changes in the client around Dominion resulting in memory leaks and other stuff. I can understand that you might not have changed anything, or anything significant, in the network layer, but there might be processes on top of the network layer that trigger additional load from the network layer.
The reason (probably obvious) is that if everything in your scalability metrics says you should be able to handle more players and you are handling fewer, well, you get the point. A small change in the client software could easily be responsible for additional load on the server; 2 additional calls on jump-in with the same number of players doesn't really scale very well -> not an N+1 problem, but still.

And another thing (which I can't quite remember: didn't you remove the low-end client after Dominion?): are there any statistics that show how many people were using the low-end EVE client to be able to run multiple clients pre-Dominion, if that really was when the old client died? Could the new client be responsible for something that you never noticed because the old client didn't have the same problem and more people were using the old client than the new one?
Again, I apologize for this question, as I can't remember when the client was retired...
The change in the number of server calls for a jump was insignificant and also done 'lazily' so it doesn't block anything. However, someone firing missiles does create a bigger payload to other clients compared to someone just firing lazors.

mia mia
Caldari
Dawn of a new Empire
The Initiative.
Posted - 2010.09.10 13:29:00 - [18]
 

Edited by: mia mia on 10/09/2010 13:31:32
Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion?

This statement is incorrect. We absolutely, 100% did have serious lag issues (that manifested themselves just like they do today) before Dominion. It seemed to get worse after Dominion, no doubt, but to assert that this type of lag didn't happen before Dominion is false.

Regardless, great blog. I really appreciate a peek behind the curtains. Thanks for keeping the player base updated.

kano donn
New Path
Posted - 2010.09.10 13:41:00 - [19]
 

very interesting...
thanks for the info

Blazde
4S Corporation
Morsus Mihi
Posted - 2010.09.10 13:48:00 - [20]
 

Edited by: Blazde on 10/09/2010 13:53:20
Can we assume this queue fix also affects people already in system and logging in rather than jumping in? (When the Character Selection freezes at 90% progress.)


Originally by: CCP GingerDude
My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.
Pretty sceptical of these 3 things being the cause. Missile use didn't increase significantly in my experience (*), nor did smartbomb use at all. FBs don't explain the majority of battles (those without supercarriers) lagging. However, there was a very sharp trend towards close-range turrets, and towards BCs and HACs with medium turrets, so increased RoF in both cases. (This was a combination of people finally catching on that the new-style on-grid probing was an important tactic in fleet fights, and the removal of the AoE DD threat, for what it's worth.)

(*) Well okay, you do get more big Drake fleets recently (not immediately after Dominion), but like FBs those are only present in some battles, so they don't explain the rest lagging.

So. Plausible.

Aurora Robotnik
Caldari
Ghosts of CKSSA
Posted - 2010.09.10 13:54:00 - [21]
 

It's amazing what a bit of bad press will do to the guys in charge... (rolls eyes)

Good to see progress is being made :)

Camios
Minmatar
Sebiestor Tribe
Posted - 2010.09.10 14:02:00 - [22]
 

Edited by: Camios on 10/09/2010 14:02:11
Originally by: CCP GingerDude

I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.


This is very interesting. Hope that you can find confirmation or denial soon.

Septimus Jr
Posted - 2010.09.10 14:47:00 - [23]
 

A fabulous job done! Congratulations on those findings! Keep up the good work, thumbs up ;)

El Liptonez
V0LTA
VOLTA Corp
Posted - 2010.09.10 14:49:00 - [24]
 

Edited by: El Liptonez on 10/09/2010 14:49:55
Originally by: CCP GingerDude
Originally by: Voight Kampf
Nice blog. It still doesn't cover one question: why didn't we have this lag BEFORE Dominion? If I understood your blog right, this issue is independent of the sov system etc., and still we didn't encounter it in Apocrypha. How can you explain this?
I'm fairly sure that this issue was present before Dominion; you just needed bigger fleets to trigger it before. I can't really explain conclusively why the server pain-threshold dropped around Dominion, since all our metrics suggested that we should've been able to handle even more pilots than before. My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.


I think Voight Kampf is mixing something up here (like the majority of players do). The "Dominion lag/change" was not so much about the lag itself, it was more about nodes dying at 700 players, or not allowing 200 people jumping into 200 people. Which worked fine before, mostly. The lag itself just turned a little higher.

Lag has always been bad in big fights, and I've honestly had worse fights before Dominion than I do have now (especially in the last 2 months). The loading system/grid is the crucial point, no one really cares about 1-3 minutes module lag, as long as you can actually shoot something before you die.

That's why I don't quite understand the missile/turret point. If the hostiles are just jumping in, there's no missiles to be shot before the system loads for the jumpers (or it doesn't, but not because of the missiles).

Edit: Nice blog btw. :D

Celia Therone
Posted - 2010.09.10 14:53:00 - [25]
 

Edited by: Celia Therone on 10/09/2010 14:55:00
Originally by: CCP GingerDude

Yes, this does not address lag in any way. What you experienced may have been caused by another fight in a system 20 jumps away if they were on the same node. This change was squarely aimed at stopping whole fleets getting stuck on jumping with no way to recover.

To add some detail: when this happened (i.e. massive lock queuing) it affected the system's capability to do other stuff as well. Pilots were stuck while exploding, dead ships couldn't finish dying, disconnected client sessions couldn't remove the user's ship from space, etc. All because someone was hogging a shared resource. All of this generated errors, and those also add to the load.

We've started gathering statistics on how long it takes people to jump for analysis.

I've started seeing freezes on jumping into even empty systems since around 1.04 coming out. (I've cut way down on eve so I can't be sure exactly when.)

Seems to be that the client keeps chewing up 60 meg chunks of memory every system. Every now and again (not an obvious pattern) it releases a ton of those chunks in one go (eg 360 megs at once). However if it happens not to release them in time then the client just dies on jumping into a system. Even if the system is empty and (on re-logging in) apparently lag free. I usually hard kill the eve process at that point as I've been mostly in null or low sec and not wanted to hang out defenseless at the gate.

Really, it's just another reason not to play at this point.

Jordan Musgrat
Convergent
Posted - 2010.09.10 14:53:00 - [26]
 

Good job.

As to whether this was present before Dominion, let me be the first to tell you that it wasn't. Yes, back in '06 we'd have this jump-in lag, but with Stackless IO etc. you guys had made it so that jump-in lag started at much, much higher player counts. You would be able to jump a gang of 250 into a system that already had 500, and nobody would take more than a minute or so, which is pretty acceptable. And you'd still be able to fly around and shoot in system after that. So all that really happened is that, along with breaking whatever allows you to fly around in system, you broke whatever allowed the threshold to be so high.

Point was, this might have been present before the evil patch, but if it was, nobody ever encountered it until local was above 1200 or so and other issues were surfacing beyond jumping more people into system.

ReaperTox
Gallente
The Littlest Hobos
Posted - 2010.09.10 14:56:00 - [27]
 

Good work. Keep on going to kill the Lag beast!

What about the creation of a Fleet Jump / Wing Jump / Squad Jump option for the Fleet, Wing or Squad Commanders? Wouldn't that help by providing a single call with a bunch of simultaneous locks? Wouldn't it avoid concurrency between individual calls per pilot?

CCP Explorer

Posted - 2010.09.10 15:01:00 - [28]
 

Originally by: Blazde
Originally by: CCP GingerDude
My pet theory regarding that today is actually the change in playstyles and fitting that we saw after Dominion. Some weapons are much more CPU-intensive than others, and Dominion basically ushered in an era of much increased use of missiles, fighter bombers and smartbombs in fleet fights. These are known to be CPU-heavy compared to many other weapons and modules, so a significant increase in the % of ff players using them could very well have moved the tipping point.
Pretty sceptical of these 3 things being the cause. Missile use didn't increase significantly in my experience (*), nor did smartbomb use at all. FBs don't explain the majority of battles (those without supercarriers) lagging. However, there was a very sharp trend towards close-range turrets, and towards BCs and HACs with medium turrets, so increased RoF in both cases. (This was a combination of people finally catching on that the new-style on-grid probing was an important tactic in fleet fights, and the removal of the AoE DD threat, for what it's worth.)

(*) Well okay, you do get more big Drake fleets recently (not immediately after Dominion), but like FBs those are only present in some battles, so they don't explain the rest lagging.

So. Plausible.
We are currently looking very seriously at fighter bombers as a possible reason for post-Dominion lag, since missiles are much, much more expensive than any other type of weapon in terms of server processing, and fighter bombers spew out missiles at a high RoF like there is no tomorrow. We will also be looking at other RoF changes in and since Dominion to see if they contributed to lag.

Gnulpie
Minmatar
Miner Tech
Posted - 2010.09.10 15:10:00 - [29]
 

That is a really good blog. Explaining pretty detailed on what is going wrong and where.

However, no solution is to be found. Fine-graining the locks is a way to distribute the locks better so that it can be fairer for everyone. But ...

... not good enough :D


Maybe you should not work with individual jumps but with whole packs of jumps, 10 or so. So you wouldn't need 10 locks but only 1 for the whole pack. That might reduce the fixed-size overhead. Sure, it also has drawbacks, and I have absolutely no idea how difficult it would be to change the code etc.

Have you also considered processing the "jump" commands on their own node to preprocess some data? Especially combined with those 'packed jumps', that could help.

Just some idea, nothing else.
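
For readers unfamiliar with the term, "fine-graining the locks" can be pictured roughly like this: instead of one shared lock that everything queues behind, each resource gets its own lock. The sketch below only illustrates that general technique under assumed names (FineGrainedLocks, the "ship-A"/"ship-B" keys); it says nothing about how the EVE servers actually do it.

```python
import threading
from collections import defaultdict


class FineGrainedLocks:
    """One lock per resource key instead of a single shared lock."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table itself
        self._locks = defaultdict(threading.Lock)  # hypothetical key: a resource id

    def lock_for(self, key):
        # Look up (or create) the lock that belongs to this one resource.
        with self._guard:
            return self._locks[key]


locks = FineGrainedLocks()

# Someone holds the lock for one resource for a long time...
busy = locks.lock_for("ship-A")
busy.acquire()

# ...but work on an unrelated resource is not queued behind it.
other_acquired = locks.lock_for("ship-B").acquire(blocking=False)

# A second request for the same resource still has to wait.
same_acquired = locks.lock_for("ship-A").acquire(blocking=False)

busy.release()
```

With a single shared lock, both of the later requests would have been stuck behind the long holder; here only the request for the same resource is.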

CCP GingerDude

Posted - 2010.09.10 15:19:00 - [30]
 

Originally by: Gnulpie

...
Maybe you should not work with individual jumps but with whole packs of jumps, 10 or so. So you wouldn't need 10 locks but only 1 for the whole pack. That might reduce the fixed-size overhead. Sure, it also has drawbacks, and I have absolutely no idea how difficult it would be to change the code etc.
...


Originally by: ReaperTox

...
What about the creation of a Fleet Jump / Wing Jump / Squad Jump option for the Fleet, Wing or Squad Commanders? Wouldn't that help by providing a single call with a bunch of simultaneous locks? Wouldn't it avoid concurrency between individual calls per pilot?
...


I think these are good ideas; good for the servers, good for fleet commanders. We may yet implement this, but it isn't high on The List at the moment.
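
The "packs of jumps" and fleet-jump suggestions quoted above boil down to taking a lock once per group rather than once per pilot. A minimal sketch of that difference follows, assuming a single shared lock and a batch size of 10 as in Gnulpie's example; CountingLock, jump_individually and jump_in_batches are hypothetical names, not server code.

```python
import threading


class CountingLock:
    """A lock that counts acquisitions, to compare per-pilot and batched jumps."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def jump_individually(pilots, lock, arrived):
    # One lock acquisition per pilot.
    for pilot in pilots:
        with lock:
            arrived.append(pilot)


def jump_in_batches(pilots, lock, arrived, batch_size=10):
    # One lock acquisition per pack of up to batch_size pilots.
    for start in range(0, len(pilots), batch_size):
        with lock:
            arrived.extend(pilots[start:start + batch_size])


pilots = [f"pilot-{i}" for i in range(25)]
per_pilot_lock, batched_lock = CountingLock(), CountingLock()
arrived_a, arrived_b = [], []
jump_individually(pilots, per_pilot_lock, arrived_a)
jump_in_batches(pilots, batched_lock, arrived_b, batch_size=10)
print(per_pilot_lock.acquisitions, batched_lock.acquisitions)  # 25 versus 3
```

The trade-off, as noted in the quoted post, is that the whole pack waits together: the lock is taken less often but held for longer each time.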

