EVE Information Portal > New Dev Blog: TQ Level Up
 
This thread is older than 90 days and has been locked due to inactivity.


 
Pages: first : previous : 1 [2] 3 4 5 6 7 8 9 ... : last (11)

Author Topic

RaTTuS
BIG
Gentlemen's Agreement
Posted - 2010.06.16 14:03:00 - [31]
 

Shiney Wink

Cinori Aluben
Minmatar
Gladiators of Rage
Intrepid Crossing
Posted - 2010.06.16 14:09:00 - [32]
 

Originally by: DevBlog
do anything crazy like running servers under nitrogen pools (although that is pretty cool).

Overclock that bad boy, and 10,000 people in Jita anyone? :D

I am very grateful to see you guys are continuing to upgrade the hardware infrastructure behind EVE. As EVE continues to grow, in playerbase and in depth, it's going to be increasingly important to maintain a pattern of upgrading the server.
This should most definitely help with the lag issues though, yay!

Any chance of before and after system performance info?

Originally by: Camios

-Next Level Fleet Fights: Is the current level working? If not, why are we already at the next level?


I must say however that I agree with this statement. Don't get lazy and let the hardware do all the work for you. Code efficiency should still be a priority; even though your new hardware might 'fix' the issue, you'll take better advantage of it if you fix the code first. I trust you, just sayin... Wink

Finally, where does the old stuff go?

gtiness
Sick Tight
BricK sQuAD.
Posted - 2010.06.16 14:11:00 - [33]
 

I like your enclosed row stuff you've got going on there. We have gear like yours (but more, and not all blades)...and are not lucky enough to have such a tidy hot row/cold row configuration. So, cooling is our biggest problem.

Offsite backup for teh data?

Trick Novalight
Caldari
Instapop Industries
Posted - 2010.06.16 14:13:00 - [34]
 

Originally by: CCP Valar
Originally by: Trick Novalight

(1x72GB hd? No raid 5? Seems like setting up 15k rpm HDs in raid 5 or raid 10 would help increase the read/write of the SQL database...)


1x72GB hard disk in each application server... and those have nothing to do with the SQL Server. The EVE server does almost no I/O so disk performance on the application servers is of no concern.




Very interesting; now that makes me wonder how the server-side app and netcode are set up. Why not have a huge VM array?

Tylvern Bison
Gallente
Navy of Xoc
Posted - 2010.06.16 14:13:00 - [35]
 

Great posting. Keep up the good work. More details about the network would be awesome!

Be aware of two things. I used HS21s for over a year, starting back in 2007, and the firmware on the disk controllers was a problem for quite some time (lots of blue screens under WIN2K3). Also, IBM wasn't sure what hardware components were in our servers when we called in for service. We migrated as fast as we could to the HS22s, and they seem to be a much more stable product.

Regat Kozovv
Caldari
Alcothology
Posted - 2010.06.16 14:15:00 - [36]
 

Always love hearing what you guys are doing under the hood.

Keep up the great work, and post pics! =D

CCP Yokai

Posted - 2010.06.16 14:20:00 - [37]
 

Edited by: CCP Yokai on 16/06/2010 14:26:51
"Did you guys consider the Cisco UCS for the blade servers?"

We have a great relationship with Cisco. They have some very cool toys, and we try to keep our eyes open for anything that makes TQ better. That being said, the IBM blades have been so good, and the IBM team is working hand in hand with our team to make TQ better. We never stop looking for the best.

The UCS solution is very good for virtualized environments and for solutions where many if not most of the servers need lots of connection types (Fibre Channel, Gig-E, InfiniBand, etc.). The EVE code is quite amazing in that we can get 60,000-plus players on 64 servers with only Gig-E connectivity. Do some research and see how many servers someone like a game about secondary life needs to operate at that level.

Not to keep pimping future blogs... but the next one is all about how we map EVE's 7929 solar systems (w/wormholes) onto those 50 or so nodes that handle solar systems and make sure the one you are playing in has the correct load. That's where we'll make the most noticeable impact on performance in the short term.
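[Editor's illustration] The load-aware mapping Yokai describes can be sketched as a toy load balancer. This is a hypothetical example, not CCP's actual remapper: the system names, load figures, and the greedy least-loaded-node (LPT) heuristic are all assumptions made for illustration.

```python
# Toy sketch: assign solar systems to cluster nodes by historical load,
# heaviest first, each going to the currently least-loaded node (min-heap).
import heapq

def map_systems_to_nodes(system_loads, node_count):
    """Return a dict of node index -> list of system names.

    system_loads: dict of system name -> expected load (arbitrary units)
    """
    # Heap entries are (accumulated_load, node_index).
    heap = [(0.0, n) for n in range(node_count)]
    heapq.heapify(heap)
    assignment = {n: [] for n in range(node_count)}
    # Placing the heaviest systems first (the classic LPT heuristic)
    # means a very busy system effectively gets a node to itself.
    for system, load in sorted(system_loads.items(),
                               key=lambda kv: kv[1], reverse=True):
        total, node = heapq.heappop(heap)
        assignment[node].append(system)
        heapq.heappush(heap, (total + load, node))
    return assignment

loads = {"Jita": 95.0, "Amarr": 40.0, "Rens": 30.0,
         "Dodixie": 25.0, "M-OEE8": 20.0, "Hek": 15.0}
# Jita's load dominates, so it ends up alone on its node.
print(map_systems_to_nodes(loads, 3))
```

The real scheduler would also have to respect constraints the sketch ignores (wormhole systems, reinforced nodes, session migration costs), but the bin-packing shape of the problem is the same.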



Yuda Mann
Posted - 2010.06.16 14:22:00 - [38]
 

Originally by: Cinori Aluben
Don't get lazy and let the hardware do all the work for you. Code efficiency should still be a priority; even though your new hardware might 'fix' the issue, you'll take better advantage of it if you fix the code first. I trust you, just sayin... Wink


Obviously you haven't participated in mass testing on sisi, otherwise you wouldn't make such an insulting and ignorant statement. :)

Memorya
Posted - 2010.06.16 14:30:00 - [39]
 

/Fail

Upgrade the network framework and you might actually see 2k fleet battles, but until then... dream on....

Ruban Spangler
Caldari
24th Imperial Crusade
Posted - 2010.06.16 14:36:00 - [40]
 

The photograph clearly shows further evidence of CCP’s bias towards Caldari. Why did you pick Caldari cabinets over Gallente or Amarr? No need to explain the decision not to pick rust and duct tape.

CCP Yokai

Posted - 2010.06.16 14:38:00 - [41]
 

"why not have a huge VM array?"

We get this question a lot, and the answer is pretty simple. Think of a server, even a very big one, as a loaf of bread. Each time you make a slice you leave some crumbs behind (the overhead of VMs); no matter how small or efficient the slicing, the fact is you don't get the peak capacity you could if the server were dedicated to the one service.

In EVE we already virtualize, so to speak, by distributing solar systems onto servers based on usage data. But we don't need the overhead of the popular virtualization software providers when we do need to dedicate a node to Jita, fleet fights, etc. So, in some ways… EVE is very virtualized and very good at it.

Trick Novalight
Caldari
Instapop Industries
Posted - 2010.06.16 14:39:00 - [42]
 

Edited by: Trick Novalight on 16/06/2010 14:41:16
Originally by: CCP Yokai
Edited by: CCP Yokai on 16/06/2010 14:26:51
"Did you guys consider the Cisco UCS for the blade servers?"

We have a great relationship with Cisco. They have some very cool toys, and we try to keep our eyes open for anything that makes TQ better. That being said, the IBM blades have been so good, and the IBM team is working hand in hand with our team to make TQ better. We never stop looking for the best.

The UCS solution is very good for virtualized environments and for solutions where many if not most of the servers need lots of connection types (Fibre Channel, Gig-E, InfiniBand, etc.). The EVE code is quite amazing in that we can get 60,000-plus players on 64 servers with only Gig-E connectivity. Do some research and see how many servers someone like a game about secondary life needs to operate at that level.

Not to keep pimping future blogs... but the next one is all about how we map EVE's 7929 solar systems (w/wormholes) onto those 50 or so nodes that handle solar systems and make sure the one you are playing in has the correct load. That's where we'll make the most noticeable impact on performance in the short term.





Can't wait! Sounds like a great read.


Originally by: CCP Yokai
"why not have a huge VM array?"

We get this question a lot, and the answer is pretty simple. Think of a server, even a very big one, as a loaf of bread. Each time you make a slice you leave some crumbs behind (the overhead of VMs); no matter how small or efficient the slicing, the fact is you don't get the peak capacity you could if the server were dedicated to the one service.

In EVE we already virtualize, so to speak, by distributing solar systems onto servers based on usage data. But we don't need the overhead of the popular virtualization software providers when we do need to dedicate a node to Jita, fleet fights, etc. So, in some ways… EVE is very virtualized and very good at it.



I had a feeling it was something along these lines; that's how you can "reinforce" a node upon request. Are you using a proprietary OS to run TQ?

Bomberlocks
Minmatar
CTRL-Q
Posted - 2010.06.16 14:55:00 - [43]
 

Thanks for this blog. I for one am seriously tired of the over-the-top CCP hubris that almost never meets expectations, and it's nice to see some plain talk about what actually goes on on the data center side. More transparency will get you many more satisfied customers.

Evelgrivion
Gunpoint Diplomacy
Posted - 2010.06.16 15:12:00 - [44]
 

Originally by: Memorya
/Fail

Upgrade the network framework and you might actually see 2k fleet battles, but until then... dream on....


I'd much rather have them spruce up the state manager. Why do you think Jita is at least serviceable with almost 1400 pilots, while performance goes down the toilet with fewer than half that number in a fleet battle?

Ticondrius
United Federation Starfleet
Saints Amongst Sinners
Posted - 2010.06.16 15:37:00 - [45]
 

I've been here since the beginning; I've seen 2-3 total replacements of the TQ cluster now. I've heard the jokes about selling Oveur's car to buy the first RamSan. I've survived the InfiniBand hype (which is apparently still not used...).

Push the session timer to 10 seconds (if not handle it more elegantly and transparently to the players) and redesign the session system so as to permit multiple nodes to support a single star system; then I'll be impressed.

Software efficiency is ALWAYS better than throwing more hardware at the problem. I am not amused.

Molly Cutter
Posted - 2010.06.16 15:40:00 - [46]
 

Being mildly pessimistic (and in my line of work that is usually still too optimistic; I am an IT guy, like more than nine tenths of internet users anyway), it is obvious that CCP has found the lag source. That means back to 0.0 for me :) Keep pushing the envelope, guys (even when everyone and his dog knows better).

Kerdrak
GreenSwarm
Black Legion.
Posted - 2010.06.16 15:40:00 - [47]
 

Just a question: 1000 player battles, when?

Nye Jaran
Posted - 2010.06.16 15:41:00 - [48]
 

Talk about déjà vu. I just ordered a pair of HS21s in nearly that exact configuration; couldn't get the 3.33GHz. Love our IBM blades.

Have you guys looked at the HX5s? Same or better processing power as the 3850, in a blade form factor?


CCP Yokai

Posted - 2010.06.16 15:51:00 - [49]
 

Edited by: CCP Yokai on 16/06/2010 15:53:07
"Software efficiency is ALWAYS better than throwing more hardware at the problem."

I am a fan of this comment :)

Yes, but we try not to limit our efforts to just one source. I'm not a programmer, but the team I work with is focused on making sure the code that does get deployed does not suffer interference from limited or weak infrastructure design.

Trader Jen
Posted - 2010.06.16 15:52:00 - [50]
 

yeh yeh....we've seen these "solutions" blogs a few times. I fully expect nothing to change about lag.

Liorah
Posted - 2010.06.16 15:54:00 - [51]
 

Quote:
In Eve we already virtualize so to speak by distributing solar systems onto servers based on usage data. But we don’t need the overhead of many of those popular virtualization software providers when we do need to dedicate a node to Jita, Fleet Fights, etc. So, in some ways… Eve is very virtualized and very good at it.

I'd attempt the argument that with something like a VMware enterprise product and its automatic dynamic resource allocation, you might achieve better utilization of your overall resources with quicker response, even considering the minor overhead of the hypervisor. Then there is the advantage that you could specify conditions in which to migrate the guest VMs off of a server to automatically "reinforce" a particular node within seconds or minutes of the need becoming apparent, without having to know in advance that the NC was going to jump the SC in some obscure system.

Another advantage is at hardware upgrade time. You configure your VM guests for the virtualized hardware, so hardware upgrades are little more than: install hypervisor on new server, migrate VMs off of outdated server, plug in new server, migrate VMs to new server. You could train a hamster to do it, and it can be done without downtime for individual servers/blades.

I guess this ....

Quote:
Not to keep pimping future blogs... but, the next one is all About how we map EVE's 7929 solar systems (w/wormholes) onto those 50 or so nodes that handle solar systems and make sure the one you are playing in has the correct load. That's where we'll make the most noticeable impact on performance in the short term.

... will help make more clear what you guys are doing =)

(But allowing it to happen automatically would save dev/operator time and would reduce the opportunity for human error)

CODE RED
Caldari
Norse'Storm Battle Group
Intrepid Crossing
Posted - 2010.06.16 16:02:00 - [52]
 

I just had a nerdgasm :) Having been a die-hard IT engineer for 15 years, this is good stuff! Good luck guys!

CCP Yokai

Posted - 2010.06.16 16:03:00 - [53]
 

Liorah,

VMware and similar solutions are pretty damn cool for that kind of thing. Again, not that it's a bad idea, but there are complexities to moving sessions around on the virtual nodes, even in seconds.

Right now we have some very dedicated guys that make sure the systems get reallocated, and part of the tools I'm talking about in "Predicting Hot Spots" is all about knowing where and when to put nodes to dedicated status and making it completely automated.

The ease-of-use trade-off with virtualization is just not as high on our list as getting fights bigger... we are at or near the limits of Moore's Law and I don't expect to see CPUs get 3x faster anytime soon... so every % we can protect, we do.

ctx2007
Minmatar
Wychwood and Wells
Posted - 2010.06.16 16:05:00 - [54]
 

Doing my CCENT, I like to hear about how server rooms work. Hmmm, new Cisco switches.

Jim Luc
Caldari
Rule of Five
Vera Cruz Alliance
Posted - 2010.06.16 16:22:00 - [55]
 

Very interesting post.

I am curious - do the UI or graphic effects like brackets and smoke have any effect on server lag, or is this a client software issue?

Is there a way to create a queue for players, so you see a loading bar and then, when the grid has completely loaded, the new system fades in seamlessly? This way you don't jump into a system and wait a few seconds to see your ship or for the overview to update. I hate getting targeted before I even see my ship. I get a loading bar when undocking, why not for going through gates? Not sure if this is a server issue or software. Very Happy

Rongar Maximus
Posted - 2010.06.16 16:31:00 - [56]
 

Edited by: Rongar Maximus on 16/06/2010 16:31:48
Good luck on getting that done in 6 hours.

Me? I am setting a skill that takes at least 3 weeks on the 22nd Wink

Molly Cutter
Posted - 2010.06.16 16:39:00 - [57]
 

Edited by: Molly Cutter on 16/06/2010 16:41:33
@ Liorah

I think you are totally right. I also believe there is some sort of evolutionary problem that limits CCP's possibilities here. The cluster is virtualized already, so it may be too much to attempt changing the virtualization level. It also looks like some other early concepts may interfere as well. Maybe rewriting the code from scratch would help, but that is just ... well, lol.

@CCP Yokai

The Moore's law reference is valid if there is a similar trend in the business, but I am under the impression that CCP's business is changing outside this envelope. May I point at the 60k concurrent connections from Saturday?
Plus, 64 blades is too round a number (IT round) to be ignored. Perhaps the cluster is at its infrastructure limit already? And the next step is what - 128? Still not too big an investment if the business can support it (or if it will solve the lag issues, in game language)... But it is never that simple :) Still, looks like a good effort.
Edit: and btw, nice blog - max points from me.

CCP Yokai

Posted - 2010.06.16 16:51:00 - [58]
 

I'll set all my accounts to a 6 hour skill just to be sure ;)

glas mir
Reaction Scientific
Posted - 2010.06.16 16:52:00 - [59]
 

Originally by: Molly Cutter


...64 blades is too round a number (IT round) to be ignored. Perhaps the cluster is at its infrastructure limit already? And the next step is what - 128?


Clusters tend to grow by powers of 2 because of communication efficiency - a major bottleneck in multiprocessor programming. For instance you can reduce a value from all nodes to a single one in log base 2 time. So reducing 65 nodes takes the same time as reducing 128.
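[Editor's illustration] glas mir's point can be shown with a toy tree reduction in Python (a sketch for illustration, not how TQ actually communicates): each round, surviving nodes pair up and merge in parallel, so the node count halves per round and 65 inputs cost the same number of rounds as 128.

```python
# Toy tree reduction: counts the communication rounds needed to combine
# one value from every node into a single total.
import math

def tree_reduce(values):
    """Reduce a list of per-node values to one sum; return (sum, rounds)."""
    rounds = 0
    while len(values) > 1:
        # Pair adjacent nodes; each pair merges in parallel in one round.
        values = [sum(values[i:i + 2]) for i in range(0, len(values), 2)]
        rounds += 1
    return values[0], rounds

total, rounds = tree_reduce(list(range(65)))   # 65 nodes
print(rounds)                                  # 7 rounds: ceil(log2(65))
print(tree_reduce([1] * 128)[1])               # also 7 rounds for 128 nodes
```

Because rounds grow as ceil(log2(n)), growing a 65-node cluster to anything up to 128 nodes adds no extra reduction latency, which is one reason cluster sizes gravitate to powers of two.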

Dan O'Connor
Cerberus Network
Dignitas.
Posted - 2010.06.16 16:59:00 - [60]
 

Edited by: Dan O'Connor on 16/06/2010 17:00:18

Skill Training completed: Tranquility Cluster IV

Tranquility Cluster bonuses: 20% increase to total Tranquility node performance per level, 15% Jita stability per level, 10% lag decrease in fleet battles per level. Cannot be trained on trial accounts.

Prerequisites:
- Hacking IV
-- Electronic Upgrades V
--- Electronics V





 


The new forums are live

Please adjust your bookmarks to https://forums.eveonline.com

These forums are archived and read-only