New Dev Blog: Full Suite of Upgrades to the Tranquility DB - Yippie!
 
This thread is older than 90 days and has been locked due to inactivity.


 
Pages: 1 [2] 3 4 5

Author Topic

gtiness
Sick Tight
BricK sQuAD.
Posted - 2011.03.24 18:10:00 - [31]
 

Edited by: gtiness on 24/03/2011 18:10:39
What is the size of the EVE database?

Edit: to properly snipe page2.

theocratis
Posted - 2011.03.24 18:11:00 - [32]
 

Originally by: Ban Doga
Edited by: Ban Doga on 24/03/2011 18:02:45
Originally by: CCP Yokai
Originally by: Ban Doga
Edited by: Ban Doga on 24/03/2011 17:46:25
I remember that someone (CCP Explorer ?) said not so long ago, that the DB is not anywhere near its performance limit.
So I'm really looking forward to seeing no performance improvements from this one.

I'm happy that you are happy, tho...


Check the disk busy... CPU... um and other graphs in the blog... We dug deeper and found some bottlenecks recently.


I didn't say the DB is not running better now (it certainly is).
I was just repeating what a CCP official said: the DB is not a bottleneck itself (I'll try to find the original statement on eve-search).

If your SOL nodes hit 100% CPU load your improved DB performance won't matter that much...

*EDIT*
Reduced stress for the machines is great (especially for the ones in charge of the health of those servers), but how does this translate into perceived performance for the users?


"We have also seen another positive, albeit unplanned, side effect of the increased performance of the new database systems. Previously if our SQL Server cluster needed to fail over to the redundant server, every node in the cluster died and all players were disconnected. We recently failed over to our secondary database server on the new system and only 3 nodes out of 208 died! This means with some tweaks we may be able to fail the servers, storage, and switching environment without a single disconnect!"

CCP Yokai

Posted - 2011.03.24 18:11:00 - [33]
 

Originally by: gtiness
What is the size of the EVE database?


Lots of "stuff" in there but somewhere around 1.3TB


EliteSlave
Minmatar
Macabre Votum
Morsus Mihi
Posted - 2011.03.24 18:14:00 - [34]
 

Originally by: CCP Yokai
Edited by: CCP Yokai on 24/03/2011 17:57:15
Originally by: EliteSlave
Originally by: CCP Yokai
Originally by: EliteSlave
I wish i could post that screen cap of Stan after sneaking into that trailer with internet....

But, I have so many questions about the hardware... like specific model numbers, firmware. More so to do with possible integration here at my office as we have a pretty large database and are looking to scale up our hardware as we are approaching saturation of it.

so any **** is good ****


Ask and I'll try and answer... better yet if you are at Fanfest... I present all that tomorrow.


Wow, didn't expect a response like that...

Well, I guess the first few questions are these: we are currently doing FC/IP (Fibre Channel over IP) and we are kinda limited in the I/O factor of around 30 that we currently have attached to 12 Dell PowerVault 3610s (iSCSI / FC/IP and FC). We are already maxing out the hardware and are trying to get the best bang for our buck by going full FC, but since this will be our first foray into the "Enterprise" level we are reading "blah blah blah" and don't really understand what we should be looking for. Now, I'm not expecting you to give me a Visio flowchart of the equipment or how it's set up. But if you can describe, in broad terms, what would be acceptable for growth over the next 2-3 years and allow for 1000-1500 users (ideally hardware that supports 2500-3000 to allow for growth) concurrently hitting the database at any given time, that would be much appreciated.


First off... nothing over anything if you can manage it... FC is the best bet for "Enterprise" like you said because of the session based communications and the potential for synchronous protection to redundant SAN controllers. I am biased... this is my opinion but I avoid iSCSI unless forced by sharp object.

Next... direct connection or true SAN needs to be decided on early. Direct connecting the disks is faster (by nanoseconds only) and cheaper, because you don't have to buy switches. BUT!!! When you run out of host ports and you need to get another system attached or accessing the LUNs, boy are you gonna miss those switches.

Next... SAS unless you know better. SAS is great! I love this stuff... cheaper than FC-SCSI and much better than SATA... best mid-ground disks. SSDs rock my socks but you will pay massively for them or you will get cheap ones and hate what you did later.

Last... Find the right performance metric. Concurrent users means lots of different things to lots of people. I would suggest looking at your IOPS. A 12 drive tray of SAS usually nets 5-10K IOPS on normal workloads for a DB.

Hope that helps a little. All I can toss at you between beers at Fanfest :)

CCP Yokai
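To make the IOPS advice above concrete, here is a minimal sizing sketch. It uses only the rule of thumb quoted in the post (5-10K IOPS per 12-drive SAS tray); the per-user IOPS figure and the `trays_needed` helper are assumptions made up for illustration, and a real figure should come from measuring the actual workload.

```python
import math


def trays_needed(required_iops: float, iops_per_tray: float) -> int:
    """Return how many drive trays are needed to supply required_iops."""
    if iops_per_tray <= 0:
        raise ValueError("iops_per_tray must be positive")
    if required_iops <= 0:
        return 0
    return math.ceil(required_iops / iops_per_tray)


# Rule of thumb from the post: a 12-drive SAS tray nets roughly 5-10K IOPS.
TRAY_IOPS_LOW = 5_000
TRAY_IOPS_HIGH = 10_000

# Hypothetical workload: 3000 concurrent users (the upper growth target
# mentioned in the question) at an ASSUMED 10 IOPS per user.
users = 3_000
iops_per_user = 10  # assumption, measure your own workload
required_iops = users * iops_per_user

pessimistic = trays_needed(required_iops, TRAY_IOPS_LOW)
optimistic = trays_needed(required_iops, TRAY_IOPS_HIGH)
print(f"{required_iops} IOPS needs between {optimistic} and {pessimistic} trays")
```

The point of sizing from IOPS rather than user count is that the same number of concurrent users can generate very different disk load depending on the application.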




Hey, thanks for the advice. (Give me a virtual server farm and I can do wonders... give me a database and make me cry... but I'm a masochist and would love the time to learn it.)

I definitely agree that anything over anything is going to be a hassle and should be avoided. But the CTO before me, well... had X budget, and if he spent only Y he got a certain commission on that. The company finally learned that being cheap and rewarding cheap only got them deeper in the hole later on, and finally fired him. Now we are going into compliance and looking to stay ahead of the curve.

Can you recommend any classes to take, and any to avoid because you think they are a waste of time or offer nothing to gain?

PS: If I find you at Fanfest I will throw a beer your way, plus my resume (even though I know you probably can't hire me) haha

CCP Valar

Posted - 2011.03.24 18:17:00 - [35]
 

Perhaps the perceived performance for the users won't change much with the new database hardware, but it gives us a lot of room to grow and makes us able to perform more online maintenance without affecting users and prevents us from having to schedule extended downtimes for things we needed to do offline before.
Also, a major part of the decision to upgrade the hardware was to increase availability and options for disaster recovery.

Leet Magician
Evolution
The Initiative.
Posted - 2011.03.24 18:20:00 - [36]
 

maybe now the logs will actually show something!!

AnonyTerrorNinja
Minmatar
Atomic Geese
Posted - 2011.03.24 18:24:00 - [37]
 

If I had to be told

"you are going to work in the server room"

I swear I would take a sleeping bag, one of those funny inch-thick mattress things, a perpetual coffee machine, a portable shower and never leave.
Just being around that kind of awesome is enough. Embarassed

Sarmatiko
Posted - 2011.03.24 18:25:00 - [38]
 

*fap fap fap*

Thanks! Very Happy

Ban Doga
Posted - 2011.03.24 18:27:00 - [39]
 

Originally by: theocratis
Originally by: Ban Doga
Edited by: Ban Doga on 24/03/2011 18:02:45
Originally by: CCP Yokai
Originally by: Ban Doga
Edited by: Ban Doga on 24/03/2011 17:46:25
I remember that someone (CCP Explorer ?) said not so long ago, that the DB is not anywhere near its performance limit.
So I'm really looking forward to seeing no performance improvements from this one.

I'm happy that you are happy, tho...


Check the disk busy... CPU... um and other graphs in the blog... We dug deeper and found some bottlenecks recently.


I didn't say the DB is not running better now (it certainly is).
I was just repeating what a CCP official said: the DB is not a bottleneck itself (I'll try to find the original statement on eve-search).

If your SOL nodes hit 100% CPU load your improved DB performance won't matter that much...

*EDIT*
Reduced stress for the machines is great (especially for the ones in charge of the health of those servers), but how does this translate into perceived performance for the users?


"We have also seen another positive, albeit unplanned, side effect of the increased performance of the new database systems. Previously if our SQL Server cluster needed to fail over to the redundant server, every node in the cluster died and all players were disconnected. We recently failed over to our secondary database server on the new system and only 3 nodes out of 208 died! This means with some tweaks we may be able to fail the servers, storage, and switching environment without a single disconnect!"



I didn't miss that, but it's not about performance. It's about stability.
You don't get dropped, but you're not getting improved performance while staying online.


Also found the original statement I was referring to:
Originally by: CCP Atlas
We only have a single database and it's easier to scale that up than the sol nodes and we're already ahead of the curve in terms of what the DB can deliver. We do cache very aggressively on the server though and consolidating these character node calls onto a half a dozen nodes rather than servicing them throughout the cluster does remove a bit of the DB load since we get more cache hits, but like I said, the DB is not a big issue in this regard today.


http://www.eveonline.com/ingameboard.asp?a=topic&threadID=1371750&page=2#39
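The caching point in the CCP Atlas quote can be illustrated with a toy simulation. This is only a sketch under stated assumptions: each node keeps its own unbounded in-memory cache, calls are routed to a random node, and the character and call counts are invented. It does not describe how the real cluster routes or caches anything.

```python
import random


def hit_rate(num_nodes: int, num_characters: int = 50,
             num_calls: int = 5000, seed: int = 0) -> float:
    """Fraction of calls answered from a node-local cache.

    Every node has its own cache. A call for a character is a hit only if
    the node that receives it has already loaded that character.
    """
    rng = random.Random(seed)
    caches = [set() for _ in range(num_nodes)]
    hits = 0
    for _ in range(num_calls):
        character = rng.randrange(num_characters)
        node = rng.randrange(num_nodes)
        if character in caches[node]:
            hits += 1
        else:
            # Miss: this is where a DB read would happen, then we cache it.
            caches[node].add(character)
    return hits / num_calls


consolidated = hit_rate(num_nodes=6)    # "half a dozen nodes"
spread_out = hit_rate(num_nodes=208)    # calls serviced across the cluster
print(f"6 nodes: {consolidated:.0%} hits, 208 nodes: {spread_out:.0%} hits")
```

With few nodes, each character only has to be loaded a handful of times in total, so most calls are hits. Spread over many nodes, the same calls keep landing on nodes that have not seen that character yet, and each of those misses is a database read.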

Thunderf00t
Posted - 2011.03.24 18:32:00 - [40]
 

Edited by: Thunderf00t on 24/03/2011 18:38:39
Edited by: Thunderf00t on 24/03/2011 18:37:02

Does the storage system support active-active mode, or is it passive-active to the storage processors/controllers?

What SAN switches do you use? Brocade...Cisco?

I suppose the DB cluster is some sort of active-active setup ( something like Oracle RAC maybe)?


EDIT:

I'm dumb, the database is probably on RAW devices...

J Kunjeh
Gallente
Posted - 2011.03.24 18:35:00 - [41]
 

Yet another sultry Dev Blog for those of us who love the tech pron. So enlightening to read more in-depth about the Eve architecture. Just finished another article over at Gamasutra that went into some depth on the architecture as well (here for those who are interested). Keep up the good work CCP!

Batolemaeus
Caldari
Free-Space-Ranger
Morsus Mihi
Posted - 2011.03.24 18:41:00 - [42]
 

I came.

Ariane VoxDei
Posted - 2011.03.24 18:41:00 - [43]
 

Originally by: CCP Yokai
Check the disk busy... CPU... um and other graphs in the blog... We dug deeper and found some bottlenecks recently.
Yes, the "disk before" graph is scary. Like really, really scary.
If that translates to something like the similar graph in Windows, it gives peen-shrinking shivers of disk waits - making even the mightiest CPU/RAM/GFX combo seem like stone-age implements, choking any game into a stuttering slideshow while it desperately waits for I/O requests to complete. (Memories of logging into lagaran come to mind.)

Interesting graph of online players you had about the failover to the redundant SQL server. That smaller spike coincides well with the recent mass disconnect many of us suffered, where we got repeated disconnects after logging back in for quite some time. I think it was about 2 weeks ago.
Was that it or something else?

Quote:
Ask and I'll try and answer... better yet if you are at Fanfest... I present all that tomorrow.
Looking forward to watching that.
Anyway, if that failover was the cause, could you try to talk about that in the presentation? And talk about it anyway, so we know what to expect when it does happen.

The ugly thing about that problem was that it only dropped "some". If it drops everyone, well, no worries, your enemies dropped too. Partial drops are nastier.
I am not a titan pilot, but I think you can get the picture, and that's just one of the ugly scenarios.
Match that with the revision of the reimburse policy, as per GM blog, and that can suddenly be very expensive.

Andrea Griffin
Posted - 2011.03.24 18:42:00 - [44]
 

So, how much did this all cost (roughly)?

Being a nerd girl I love hearing about this stuff. I really enjoy CCP's tech blogs.

I'd love to play with that hardware, but being able to play ON it is almost as fun (and I don't get called at 4am when it breaks). Very Happy

Celebris Nexterra
Gallente
Jupiter Force
Posted - 2011.03.24 18:46:00 - [45]
 

Man, you guys have been churning out devblogs like it's your freaking JOB these past two weeks!

You- wait...yes...OK, I'm being told it is in fact your job to post devblogs. Nonetheless! It is still awesome! Keep up the great work, and give me more fapping material like the screen cap of 16 hyper-threaded CPU cores =D!!!

Ciaa
Gallente
The Executives
Executive Outcomes
Posted - 2011.03.24 18:48:00 - [46]
 

Nice blog, good to see some tech love/**** :D
Any chance of some photos around the office and server room?

Rambobinette
Caldari
Angels of Death Corp
Posted - 2011.03.24 18:52:00 - [47]
 


Is it really a single database? If it is, it means you have an active-passive cluster configuration, which is a waste of computing resources: with 2 or more databases you can balance them across the nodes, giving you an active/active cluster configuration. You can also add more nodes, thus reducing the workload.


Charles37
Posted - 2011.03.24 18:54:00 - [48]
 

Those are some incredibly sexy graphs. Thank you!

This also makes my trusty computer feel... rather inadequate. But then again, spec sheets aren't everything, right...? Right?

Aldariandra
Gallente
MunsterMunch
The 0rphanage
Posted - 2011.03.24 18:57:00 - [49]
 

Edited by: Aldariandra on 24/03/2011 19:04:21
This is very interesting. At our company we are currently trying out a Whiptail SSD SAN (rated theoretically up to 250,000 IOPS) and we get about 65,000 IOPS out of it on account of being limited to 4Gb FC blade switches (Brocom). This is to run VMware storage on, btw.

Didn't you guys use blades as well (IBM)? If so, it seems like a lot of ports are being taken up for both network and HBA?

What kind of SAN switches do you use?

Maybe I am picky, but an average disk queue of about 3 still seems high to me. The fact that your storage still seems to get to 100% disk use probably explains the queuing as well. I would not be happy ever seeing disks bottleneck at 100% use; it's something I would still see as a definite problem to solve.
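One way to reason about whether a queue of about 3 is "high" is Little's Law: the average number of requests in flight equals the arrival rate times the average time each request spends in the system. The sketch below applies it to disk I/O. The IOPS figure is hypothetical and is not taken from the dev blog graphs.

```python
def avg_outstanding_io(iops: float, avg_latency_s: float) -> float:
    """Little's Law: L = lambda * W (requests in flight = rate * latency)."""
    return iops * avg_latency_s


def avg_latency_s(queue_length: float, iops: float) -> float:
    """Little's Law rearranged: W = L / lambda."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return queue_length / iops


# Hypothetical example: an average queue of 3 on storage doing 6000 IOPS
# implies each request spends about half a millisecond in the system.
latency = avg_latency_s(queue_length=3, iops=6_000)
print(f"average latency: {latency * 1000:.2f} ms")
```

Read this way, the same queue length can mean very different things: a queue of 3 at a few hundred IOPS implies latencies in the tens of milliseconds, while at several thousand IOPS it implies sub-millisecond service.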

I find it very interesting that you run Eve on MS SQL. We have a lot of performance problems with some of our SQL servers and our DBAs are quite inexperienced. I would very much like to know how it's set up and how you spread the load out.
What kind of IOPS does the database eat?

Myobi Rush
Killboard Padding
Posted - 2011.03.24 19:00:00 - [50]
 

Does this affect every system or just Jita? :Unamused:

Rambobinette
Caldari
Angels of Death Corp
Posted - 2011.03.24 19:05:00 - [51]
 

Originally by: Aldariandra

I find it very interesting that you run Eve on MS SQL. We have a lot of performance problems with some of our SQL servers and our DBAs are quite inexperienced. I would very much like to know how it's set up and how you spread the load out.
What kind of IOPS does the database eat?


There is a limit to what a DBA can fix. Even with the appropriate indexes and optimizations, if the app does a full table scan because the developer doesn't know the SQL rules, well, you will have problems. I have had great experiences with MS SQL.
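A small sketch of that point: even when the DBA has created the right index, a query written so that the index cannot be used still scans the whole table. SQLite (from Python's standard library) is used here purely as a stand-in because it runs anywhere; the thread is about MS SQL, whose planner and plan output differ. The `items` table and its columns are invented for the example.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> str:
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (item_id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO items (owner_id, name) VALUES (?, ?)",
    [(i % 100, f"item{i}") for i in range(10_000)],
)
# The "DBA side": an index that matches the lookup pattern.
conn.execute("CREATE INDEX idx_items_owner ON items (owner_id)")

# Index-friendly predicate: the bare column is compared to a value.
good_plan = query_plan(conn, "SELECT name FROM items WHERE owner_id = ?", (42,))

# Same result, but the arithmetic on the column hides it from the index.
bad_plan = query_plan(conn, "SELECT name FROM items WHERE owner_id + 0 = ?", (42,))

print("index-friendly:", good_plan)
print("index-hostile: ", bad_plan)
```

The first plan is an index search and the second is a table scan, although both queries return the same rows. No amount of indexing fixes the second one; the query itself has to change.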




Koshiko Murakami
Posted - 2011.03.24 19:20:00 - [52]
 

So how many TPS does the current database hit? How do you see this expanding?

Soldarius
Caldari
Peek-A-Boo Bombers
Posted - 2011.03.24 19:23:00 - [53]
 

Wait. A business is using its income from subscribers to actually improve the business? Shocked

Impressive numbers. Great job, CCP. Keep it up.

ORCACommander
Posted - 2011.03.24 19:25:00 - [54]
 

Edited by: ORCACommander on 24/03/2011 19:29:09
Originally by: Vuk Lau
:fapfapfap:


this ^^^^

btw you can fix that GHz discrepancy easily through overclocking

DiaBlo UK
ZDK
Posted - 2011.03.24 19:33:00 - [55]
 

Edited by: DiaBlo UK on 24/03/2011 19:54:23
ball park figure on the cost of the upgrade??? YARRRR!!

I mean, do I need to win the Saturday night jackpot or wait around for a Euro millions double roll-over? Razz

Ishina Fel
Caldari
Terra Incognita
Intrepid Crossing
Posted - 2011.03.24 19:42:00 - [56]
 

Originally by: Ban Doga

Also found the original statement I was referring to:
Originally by: CCP Atlas
We only have a single database and it's easier to scale that up than the sol nodes and we're already ahead of the curve in terms of what the DB can deliver. We do cache very aggressively on the server though and consolidating these character node calls onto a half a dozen nodes rather than servicing them throughout the cluster does remove a bit of the DB load since we get more cache hits, but like I said, the DB is not a big issue in this regard today.


http://www.eveonline.com/ingameboard.asp?a=topic&threadID=1371750&page=2#39


That quote is from August 2010... I'm sure it was completely true back then. But since then, they released Incursions, activated resource depletion on planets, overhauled the whole inventory system, and did other code improvements... I'm pretty sure that it is especially the latter two things that trouble the database.

Imagine - they just released a blog where they state that they can allow for Jita's maximum population to grow by over 1000 additional people, because the new efficient inventory code allows the node CPU to handle that many more inventory operations per second. But where do all these inventory operations go? Well, they hit the database. And now there's going to be a whole lot more of them in the same amount of time. Not only in Jita, but in every system that saves CPU cycles due to this coding change.

So the very database that ended up sitting around bored because TQ couldn't generate enough requests to saturate it, suddenly had to scramble to keep up, approaching its limits. So an upgrade made sense.

(This is of course pure guesswork, I have no idea what really happened. I only know that more often than not when you improve one part of a complex system, you end up stressing a different part without even meaning to.)
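The guess above, that speeding up one stage moves the bottleneck to another, can be sketched as a chain of stages whose overall throughput is capped by the slowest one. The stage names and all capacity numbers here are hypothetical and are not measurements from Tranquility.

```python
def system_throughput(capacities: dict[str, float]) -> tuple[str, float]:
    """Return (bottleneck stage, ops/s) for stages that work in series."""
    bottleneck = min(capacities, key=capacities.get)
    return bottleneck, capacities[bottleneck]


# Hypothetical capacities in inventory operations per second.
before = {"sol_node_cpu": 1_000, "database": 1_500}
# Suppose more efficient inventory code doubles what the node CPU can push.
after = {"sol_node_cpu": 2_000, "database": 1_500}

print("before:", system_throughput(before))
print("after: ", system_throughput(after))
```

In this toy model the node CPU is the limit before the code change, so the database sits partly idle. After the change the database becomes the limit even though nothing about the database itself got worse.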

And on topic: that is a beautiful database server you got there. As a system integrator myself, consider me jealous - I can't find a single thing I would have done differently! Very Happy

Diomedes Calypso
Aetolian Armada
Posted - 2011.03.24 19:50:00 - [57]
 

cool stuff.. just a thumbs up to let you know it's being read and enjoyed even by people like me who have little clue about what some of the stuff means and use it as a learning experience.

Dian'h Might
Minmatar
Cash and Cargo Liberators Incorporated
Posted - 2011.03.24 20:14:00 - [58]
 

Awesome blog. Technical details like that are great and give me an excuse to read eve forums at work Razz

Mr LaForge
Posted - 2011.03.24 20:21:00 - [59]
 

Whoah....Dude..

So I herd u got new hamsters...

Shandir
Minmatar
Brutor Tribe
Posted - 2011.03.24 20:22:00 - [60]
 

Originally by: DiaBlo UK
Edited by: DiaBlo UK on 24/03/2011 19:54:23
ball park figure on the cost of the upgrade??? YARRRR!!

I mean, do I need to win the Saturday night jackpot or wait around for a Euro millions double roll-over? Razz


I suspect they already have to put a *lot* of effort into cooling, although this is an idea for reinforced nodes. If CCP is currently looking into multicore because they cannot push single-core processing as much as they'd like, what is stopping you from taking the highest clock-speed CPU commercially available and then overclocking it under heavy cooling for maximum-performance reinforced nodes?

