All Channels
EVE Information Portal
New Dev Blog: Power to the People...
 
This thread is older than 90 days and has been locked due to inactivity.


 
Pages: 1 2 [3] 4 5 6 7

Author Topic

Lutz Major
Posted - 2011.08.30 17:47:00 - [61]
 

Originally by: CCP Dr.EyjoG
... [snip] ...

As CCP_Stillman mentioned then there is a real technical challenge to have close to live data. That challenge will not be resolved soon so let's focus on the 24 hour+ option.

Does anyone see a problem with providing 24 - 48 hour old data?

24-48 hours would be awesome!

Stillman: An API call would be nice, but wouldn't the sheer amount of requests by - uhm - everybody (!) have an impact on the QoS of the current API server? Not everybody sticks to the cachedUntil times.

I think a zipped CSV is the best fit for all operating systems.

Dierdra Vaal
Caldari
Veto.
Veto Corp
Posted - 2011.08.30 17:56:00 - [62]
 

Originally by: CCP Dr.EyjoG
Does anyone see a problem with providing 24 - 48 hour old data?


for my purpose, no :)

CCP Stillman

Posted - 2011.08.30 17:57:00 - [63]
 

Originally by: Lutz Major


Stillman: An API call would be nice, but wouldn't the sheer amount of requests by - uhm - everybody (!) have an impact on the QoS of the current API server? Not everybody sticks to the cachedUntil times.

I think a zipped CSV is the best fit for all operating systems.

Performance is one of the main concerns we have to work with to deliver this. But there's a lot of caching that can be done. For one thing, all the data will be cached in memcached, and we can easily turn on output caching.

So while performance is definitely a concern, one of the things we won't let happen is degrading the API performance. So I wouldn't worry too much, to be honest :)
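On the client side, the cachedUntil concern raised above can be handled with a small wrapper that refuses to hit the server again before the cache expires. This is only a minimal sketch: `CachedCall` and the `fetch` callable are hypothetical names, and it assumes the fetch returns the payload together with the cachedUntil timestamp.

```python
from datetime import datetime


class CachedCall:
    """Wrap a fetch function and reuse its result until cachedUntil passes."""

    def __init__(self, fetch):
        # Hypothetical contract: fetch() returns (payload, cached_until).
        self._fetch = fetch
        self._payload = None
        self._cached_until = None

    def get(self, now: datetime):
        # Only go back to the server once the cache window has expired.
        if self._cached_until is None or now >= self._cached_until:
            self._payload, self._cached_until = self._fetch()
        return self._payload
```

Passing `now` in explicitly keeps the wrapper easy to test; a real client would pass the current UTC time.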

Marcel Devereux
Aideron Robotics
Posted - 2011.08.30 17:57:00 - [64]
 

Edited by: Marcel Devereux on 30/08/2011 18:00:07
Edited by: Marcel Devereux on 30/08/2011 17:59:49
Generating one large dump in one pass is very expensive. But dumping out the history for an individual item by region is not. You already do this and the client can request it. People have written webpages that continually cycle through the market items using JavaScript and the IGB. They then upload the cache files of both the history and the open orders to various sites. You can do this request yourselves, put the information in flat files and let us access those. This would probably reduce database load, as people would stop constantly updating orders in the client.

Again, there is NO reason to create a HUGE dump file. ONE file per ITEM by REGION. One request I have is that you generate a yearly dump file, again per item by region, so we can have access to all historical data for the items. For example, the file for 2007 Trit data in The Forge would look like: 2007-34-10000002.csv. You should not have any DB performance problems generating this file. If you do, then you have bigger problems that you should be working on ;-)

I would also like to see the number of buy/sell transactions.
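The per-item, per-region naming scheme suggested in this post (year, typeID, regionID) could be built and parsed roughly as follows. This is a sketch only: the function names are made up for illustration, and the only confirmed example is 2007-34-10000002.csv from the post itself.

```python
def dump_filename(year: int, type_id: int, region_id: int) -> str:
    """Build a yearly dump name in the year-typeID-regionID.csv pattern."""
    return f"{year}-{type_id}-{region_id}.csv"


def parse_dump_filename(name: str) -> tuple:
    """Split a dump name back into (year, type_id, region_id)."""
    stem, _, extension = name.rpartition(".")
    if extension != "csv":
        raise ValueError(f"not a CSV dump name: {name!r}")
    year, type_id, region_id = (int(part) for part in stem.split("-"))
    return year, type_id, region_id
```

With one file per item and region, a consumer interested in a single item only ever downloads the handful of files it needs.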


malaire
Posted - 2011.08.30 18:20:00 - [65]
 

Originally by: CCP Dr.EyjoG
The issue of the time lag of the information is a very interesting one for us. From the discussion so far I gather that anything else than 7 days is a nice to have, anything from 24 hours old to 7 days is interesting to have, and that less than 24 hours old would be awesome.

As CCP_Stillman mentioned then there is a real technical challenge to have close to live data. That challenge will not be resolved soon so let's focus on the 24 hour+ option.

Does anyone see a problem with providing 24 - 48 hour old data?

I oppose the idea of giving out too-fresh information too easily. That would take the advantage away from players who can use cache readers to implement this by themselves ...

So: 7+ days is acceptable, 1-7 days is worse, and less than 24 hours is just awful.

Salpun
Gallente
Paramount Commerce
Posted - 2011.08.30 18:23:00 - [66]
 

I am not a market junkie, but if you are already recalculating the numbers and do add the buy/sell numbers, please add a third data set identifying the number of buy/sell orders that are more than 40% (?) away from the median price. That way analysts have a reliability indicator built into the system, and outlier buy/sell orders can be identified as a significant factor in the price, or it becomes clear that one-centing was the prime mover.
Just a thought.
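The reliability indicator described here could be computed along these lines. A rough sketch: `outlier_count` is a hypothetical helper, and the 40% threshold is taken from the tentative figure in the post.

```python
from statistics import median


def outlier_count(prices, threshold=0.40):
    """Count prices further than threshold (as a fraction) from the median."""
    if not prices:
        return 0
    mid = median(prices)
    return sum(1 for price in prices if abs(price - mid) > threshold * mid)
```

A low count next to a large price swing would suggest ordinary undercutting moved the price, while a high count points at outlier orders.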

malaire
Posted - 2011.08.30 18:24:00 - [67]
 

Originally by: Marcel Devereux
I would also like to see the number of buy/sell transactions.

There is already transactionCount. If you mean separate counts for buy and sell transactions, then perhaps you forgot that each transaction is both buy and sell, depending on whether you ask from buyer or seller.

mini meee
Posted - 2011.08.30 18:29:00 - [68]
 

Originally by: CCP Dr.EyjoG

The issue of the time lag of the information is a very interesting one for us. From the discussion so far I gather that anything else than 7 days is a nice to have, anything from 24 hours old to 7 days is interesting to have, and that less than 24 hours old would be awesome.

Does anyone see a problem with providing 24 - 48 hour old data?


With the browser API, JavaScript and the old eve-metrics uploader, it's basically possible to get the market history data automatically from the cache at the moment (up to the previous day). This is what eve-metrics used to do; however, their code for doing that effectively took 4-5 hours to loop through the items.

Similarly, we are currently used to the in-game history showing a daily min/avg/max.

Therefore, *ideally*, accurate data once per day at midnight would provide what is currently possible.

So, a suggestion: how about a weekly/monthly CSV containing all history, and then a daily 'update CSV' file processed at midnight each day that contains the previous day's data for min/avg/max. [I'm assuming that those would be low maintenance, as I assume you don't calculate the in-game history data on the fly :)]

I'm also assuming that <24 hours data isn't that useful. If I'm looking to buy 200mil trit, and it's 3 ISK in Jita on current market orders, I'd like to know what the price was yesterday, i.e. has the price gone up or down, is today's price 'fair'. For that the previous day would be useful; prices can change a lot in a week.
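The "full history plus daily update file" idea amounts to a keyed merge on the consumer side. A minimal sketch, assuming hypothetical column names (`date`, `regionID`, `typeID`) that identify one row per day, region and item:

```python
import csv
import io


def load_rows(text):
    """Index CSV rows by (date, regionID, typeID); column names are assumed."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["date"], row["regionID"], row["typeID"]): row for row in reader}


def apply_update(history, update):
    """Return history with the daily update applied; newer rows win."""
    merged = dict(history)
    merged.update(update)
    return merged
```

Because the merge is keyed, re-applying the same daily file twice is harmless, which matters if a download has to be retried.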

Gallosek
Posted - 2011.08.30 18:31:00 - [69]
 

Edited by: Gallosek on 30/08/2011 18:32:50
never mind :)

Callean Drevus
Caldari
Icosahedron Crafts and Shipping
Silent Infinity
Posted - 2011.08.30 18:33:00 - [70]
 

Edited by: Callean Drevus on 30/08/2011 18:35:45
Originally by: malaire
Originally by: Marcel Devereux
I would also like to see the number of buy/sell transactions.

There is already transactionCount. If you mean separate counts for buy and sell transactions, then perhaps you forgot that each transaction is both buy and sell, depending on whether you ask from buyer or seller.


Though it's pretty useful to know whether it's a buy or a sell order being fulfilled :) on a different note, since when are you in favour of cache scraping?

In any case, this is completely awesome! I wouldn't mind any delay less than a week or so.

It means I'll always have the historical data available, and people will still be interested in the true orders in a region, so services like EVE Marketeer are not immediately defunct (but won't have to gather the annoying history tab information anymore, meaning more focus on true orders).

As for the format, CSV is most desired.

Data would have to be available in episodes. Probably per day or per month. The person above me describes it well.

UPDATE: As for the timeline on the data, how far back will this data go? Personally I'd be very interested in seeing the history of prices even back to the launch of EVE, even if they aren't useful for my current trading (but I'm a data freak).

Haguu
Caldari
TLA Ltd
Posted - 2011.08.30 18:37:00 - [71]
 

As many, many people have said: no Microsoft formats at all. Ever. After that, my initial reaction is a toss-up between CSV and XML, with JSON a somewhat distant third.

Until you are prepared to go [near] real-time, why bother with an API that people have to code cache timers for? If you had a static file name like EY20110830.zip, then we could download the static file without affecting the performance of your database engine. If you really cared, then put the files in Google files or EVE files, where the downloads have zero impact on CCP bandwidth. Don't go to the expense of an API transaction if we are not benefitting from it. Similarly, it could easily be that a single static file is easier to compute and serve up than a lot of players and sites each making hundreds of queries. Plus, a single static file maintains a history, so if some player wanted to audit what the price was on some past day, it is available. An API query against a database might no longer have that available.

I do see some uses for true historical data (trends of trit % of BS market basket over last few years or plex price changes prior to patches)

What I think would be awesome and yet not that expensive to do is a nightly dump of all items in Jita 4-4 (in addition to a whole Universe every 3-7 days)!!!! Worst case, the one from the prior downtime? Or release the previous DT data 12 hours after DT (so during DT you just copy and you have 12 hours afterwards to run the data.)


This is a much smaller dataset to compute; much smaller to serve up. (Again, make sure it is a static file not some API overhead.)
This is all the killboard sort of applications really need.
It does not impact the real-time trader searching the universe for arbitrage.
It is probably good enough for the lower end "what can I manufacture?" "what ore should I mine" sites?
Even if a manufacturer has special arrangements for their costs or sales, Jita is the "list price." There are many times when people want to know the reference price.

So please: a frequent static zipped CSV file of just Jita 4-4 prices???
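With predictable static file names, a consumer can work out which daily files it still needs without asking any API. A sketch under the assumption that names follow the EY20110830.zip pattern from this post; the helper names are hypothetical.

```python
from datetime import date, timedelta


def static_name(day: date) -> str:
    """Name of the daily static file, following the EYyyyymmdd.zip pattern."""
    return f"EY{day:%Y%m%d}.zip"


def missing_files(first: date, last: date, have):
    """List daily file names in [first, last] that are not in the local set."""
    day_count = (last - first).days + 1
    wanted = [static_name(first + timedelta(days=i)) for i in range(day_count)]
    return [name for name in wanted if name not in have]
```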

malaire
Posted - 2011.08.30 18:42:00 - [72]
 

Originally by: Callean Drevus
Originally by: malaire
Originally by: Marcel Devereux
I would also like to see the number of buy/sell transactions.

There is already transactionCount. If you mean separate counts for buy and sell transactions, then perhaps you forgot that each transaction is both buy and sell, depending on whether you ask from buyer or seller.


Though it's pretty useful to know whether it's a buy or a sell order being fulfilled :) on a different note, since when are you in favour of cache scraping?

True, however in some cases both are true (player making buy/sell order which is instantly fulfilled by existing sell/buy order).

And in my other thread I didn't oppose cache reading as much as IGB javascript automation to update the cache.

Salpun
Gallente
Paramount Commerce
Posted - 2011.08.30 18:53:00 - [73]
 

Are created sell orders and their mean price enough data to serve as the sanity check, yet small and simple enough that CCP might add it to the list?

Processed transactions versus outstanding transactions is the issue. Eve-Central pulls outstanding transactions, both buy and sell. CCP wants to give us processed transactions.

Palovana
Caldari
Inner Fire Inc.
Posted - 2011.08.30 19:00:00 - [74]
 

CSV format, which can be imported into spreadsheets or any (free) database we'd be using.

MSSQL backup format is far less useful to those not using MSSQL.

malaire
Posted - 2011.08.30 19:13:00 - [75]
 

It would be useful to have both min/max prices that include outliers and other min/max prices that do not.

Marcel Devereux
Aideron Robotics
Posted - 2011.08.30 19:15:00 - [76]
 

Originally by: malaire
Originally by: Marcel Devereux
I would also like to see the number of buy/sell transactions.

There is already transactionCount. If you mean separate counts for buy and sell transactions, then perhaps you forgot that each transaction is both buy and sell, depending on whether you ask from buyer or seller.


Ok Obi-Wan. How about this point of view? The counts of how many buy orders and sell orders were fulfilled.

Meissa Anunthiel
Redshift Industrial
Rooks and Kings
Posted - 2011.08.30 19:17:00 - [77]
 

Originally by: CCP Dr.EyjoG
Originally by: Amsterdam Conversations
Hey I couldn't be arsed to do the QEN, so here is your data, do it yourself.


Ahhh, you saw right through me, didn't you?

The issue of the time lag of the information is a very interesting one for us. From the discussion so far I gather that anything else than 7 days is a nice to have, anything from 24 hours old to 7 days is interesting to have, and that less than 24 hours old would be awesome.

As CCP_Stillman mentioned then there is a real technical challenge to have close to live data. That challenge will not be resolved soon so let's focus on the 24 hour+ option.

Does anyone see a problem with providing 24 - 48 hour old data?


Makes it more difficult for some traders to profit by making it clearer where the opportunities are (thereby creating competition where there was little or none). Information is power when it comes to markets Eyjo, so the shorter the span, the bigger the damage.

Then again, it may also create opportunities for enterprising individuals. For markets that have big transaction volume, 24 hours wouldn't be a big deal; for "more remote" regions, it would. Personally I think 3 days is more than enough; it gives enough information that people would have to move their arses and see what it's like "down there" right now.


Salpun
Gallente
Paramount Commerce
Posted - 2011.08.30 19:19:00 - [78]
 

Hi Meissa

Glad to see a CSM joining the discussion.

Tau Cabalander
Posted - 2011.08.30 19:22:00 - [79]
 

Edited by: Tau Cabalander on 30/08/2011 19:26:13

CSV is preferable to a proprietary (Microsoft) format. Any open format is preferred.

I'll just import it into open source packages like SQLite, or even MySQL.

If the dump isn't current, expect the API to be hammered as a result.

Marcel Devereux
Aideron Robotics
Posted - 2011.08.30 19:25:00 - [80]
 

Originally by: Meissa Anunthiel
Originally by: CCP Dr.EyjoG
Originally by: Amsterdam Conversations
Hey I couldn't be arsed to do the QEN, so here is your data, do it yourself.


Ahhh, you saw right through me, didn't you?

The issue of the time lag of the information is a very interesting one for us. From the discussion so far I gather that anything else than 7 days is a nice to have, anything from 24 hours old to 7 days is interesting to have, and that less than 24 hours old would be awesome.

As CCP_Stillman mentioned then there is a real technical challenge to have close to live data. That challenge will not be resolved soon so let's focus on the 24 hour+ option.

Does anyone see a problem with providing 24 - 48 hour old data?


Makes it more difficult for some traders to profit by making it clearer where the opportunities are (thereby creating competition where there was little or none). Information is power when it comes to markets Eyjo, so the shorter the span, the bigger the damage.

Then again, it may also create opportunities for enterprising individuals. For markets that have big transaction volume, 24 hours wouldn't be a big deal; for "more remote" regions, it would. Personally I think 3 days is more than enough; it gives enough information that people would have to move their arses and see what it's like "down there" right now.




And by bigger the damage you mean more tears. This is EVE. They will need to HTFU or do something else. Universal access to data and competition is good.

Arkady Sadik
Minmatar
Electus Matari
Posted - 2011.08.30 19:27:00 - [81]
 

Originally by: CCP Dr.EyjoG
Does anyone see a problem with providing 24 - 48 hour old data?


Depends on what you mean, exactly. Data of the "current" day is available, albeit not very useful ("so far today there have been X items traded at price Y on average"). Bargain traders want market orders, not the market history.

The interesting data are the finalized values per day.

So, let's say we have the 10th of a month (sometime during noon, say, DT). Data of the 10th is theoretically available, but not too helpful. The interesting data starts with the finalized values of the 9th, which are available in-game starting at 00:00 on the 10th. Having that data available out-of-game would be fabulous. Having data from the 8th only would still be ok. Going back further from there would get worse quickly.

To give you an idea, for our current index, we consider data older than 3 days "old", and anything older than 7 days "outdated". (Excluding rarely-traded items, of course.)


The data quantities involved here are not particularly huge. The data itself is already available on the TQ DB and accessible to any client. So the question is transfer volume. For a single day, we're talking about 6844 market types in 67 regions with a market (assuming CCP trades stuff in their jovian regions). The export contains 10 columns, assuming 64 bit each that would be under 40MB of data per day. I do not think that transferring that to a different server once per day would be a particularly huge drain on the database.

It would even be perfectly fine to get text files with the daily data on some file server. cdnsomething/markethistory/20110801.xml.gz with the full data for that day only would be perfectly ok - almost all people using this data will query daily and store it locally anyhow. Even assuming some wasteful XML encoding, this would still be under 100MB for a single day, uncompressed (assuming 20 bytes per number + separators gives 92MB).
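The size estimates in the two paragraphs above can be checked with a few lines of arithmetic. The item, region and column counts are the ones quoted in the post; the 8 and 20 bytes per value are the post's own assumptions for binary and text encodings.

```python
MARKET_TYPES = 6844   # market types, as quoted in the post
REGIONS = 67          # regions with a market, as quoted in the post
COLUMNS = 10          # columns in the export, as quoted in the post

# One value per market type, region and column for a single day.
values_per_day = MARKET_TYPES * REGIONS * COLUMNS

binary_bytes = values_per_day * 8    # 64 bits per value
text_bytes = values_per_day * 20     # ~20 bytes per number plus separators

print(f"{values_per_day:,} values per day")
print(f"binary: {binary_bytes / 1e6:.1f} MB, text: {text_bytes / 1e6:.1f} MB")
```

That works out to 4,585,480 values per day, about 36.7 MB at 64 bits each and about 91.7 MB at 20 bytes each, in line with the "under 40MB" and "92MB" figures.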


Regarding "unfairness" - giving an in-game advantage to people who happen to know how to read cache files sounds like a bad game design choice. It's enough to give an in-game advantage to those who know how to massage data into something useful already, no need to make it more complicated than that.

Matthew
Caldari
BloodStar Technologies
Posted - 2011.08.30 19:40:00 - [82]
 

This is pretty awesome.

In terms of format, .bak is fine for me. Alternatively XML would be preferred and would have the advantage of being able to be unified between static historic dumps and any potential API service for near-live data.

I would suggest that the API and static dumps are deployed as an integrated system. The API can be used to serve the most recent data (maybe last 30 days or so, sufficient to overlap with the static data cycle), with full historic series (back to the beginning of time if possible) provided in static downloads.

In terms of timeliness, 24-48 hour delay on the API calls sounds good. Any longer than that and cache-scrapers are going to remain the most popular form of acquiring this data.

API Calls

MarketDailyUpdate - Accepts a date as parameter and returns an all-items all-regions dump for that single day.

MarketItemUpdate - Accepts a typeID as parameter and returns an all-regions all-days dump.

MarketQuery - Accepts date, typeID and regionID to allow small specific pulls.
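To make the three proposed calls concrete, here is a sketch of how they relate to each other when run over rows already held in memory. Nothing here is an existing CCP API: the `Row` fields and function names are illustrative only, and the two broader calls are simply the specific query with filters left out.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One day of history for one item in one region (fields are assumed)."""
    date: str
    region_id: int
    type_id: int
    avg_price: float


def market_query(rows, date=None, type_id=None, region_id=None):
    """Small specific pull; any filter left as None matches everything."""
    return [
        row for row in rows
        if (date is None or row.date == date)
        and (type_id is None or row.type_id == type_id)
        and (region_id is None or row.region_id == region_id)
    ]


def market_daily_update(rows, date):
    """All items, all regions, for a single day."""
    return market_query(rows, date=date)


def market_item_update(rows, type_id):
    """All regions, all days, for a single item."""
    return market_query(rows, type_id=type_id)
```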

Static Dumps

I suggest a system of files that group-up over time.

Where a full year of data is available, the file is provided by year. For the incomplete year, complete months are provided by month. Potentially then a weekly file for incomplete months depending on the API/static balance that is settled on.

That way new users boot-strapping their records have a reasonable number of historic files to grab, while consumers of ongoing updates don't have large amounts of repeat data to extract.
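The grouping scheme can be sketched as a function that lists what a new user would download to bootstrap. The file names are invented for illustration, and the optional weekly files for the incomplete month are left out.

```python
from datetime import date


def bootstrap_files(first_year: int, today: date):
    """Yearly files for complete years, monthly files for the current year."""
    files = [f"{year}.csv" for year in range(first_year, today.year)]
    # Only months that have fully elapsed get a monthly file.
    files += [f"{today.year}-{month:02d}.csv" for month in range(1, today.month)]
    return files
```

For a history starting in 2009 and a download on 2011-08-30 this yields two yearly files and seven monthly ones, so nine downloads instead of hundreds of daily files.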

Data Content

Everything in there looks good, the only thing that seems to be missing compared to the info in the client are the 5 and 20 day averages. While the means can be calculated by the user, the medians cannot be derived locally, so if these could be added that would be good.
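The moving means can indeed be rebuilt locally from the daily averages, for example as below. This is a sketch that takes a plain, unweighted mean over the trailing window; whether the client weights by volume is not stated in the thread, so treat that as an assumption.

```python
def trailing_mean(values, window):
    """Unweighted mean of the last `window` values at each position."""
    means = []
    for index in range(len(values)):
        # Early positions use however many values are available so far.
        chunk = values[max(0, index - window + 1): index + 1]
        means.append(sum(chunk) / len(chunk))
    return means
```

A moving median cannot be rebuilt the same way, because the daily rows only carry aggregates rather than the individual transaction prices.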

Chruker
Posted - 2011.08.30 20:08:00 - [83]
 

Could use a ticker thingy which would show new orders as they are put up :-)

Arkady Sadik
Minmatar
Electus Matari
Posted - 2011.08.30 20:14:00 - [84]
 

Minor point: You might want to rename the stationID column to regionID in your csv file. :-P

Hel O'Ween
Men On A Mission
EVE Trade Consortium
Posted - 2011.08.30 20:30:00 - [85]
 

Please, please do not release the data in CSV format. CSV has no strong data types nor is it locale aware. It's an ugly hack, the lowest common denominator ... from a couple of decades ago.

Anything is better than CSV. If you're looking for a text-based format, go for XML. 3rd party devs working with the (official) SDE already do have MSSQL installed somewhere. If not - pick up your free copy at MS's website.

Arkady Sadik
Minmatar
Electus Matari
Posted - 2011.08.30 20:33:00 - [86]
 

Edited by: Arkady Sadik on 30/08/2011 20:34:58
Originally by: Hel O'Ween
3rd party devs working with the (official) SDE already do have MSSQL installed somewhere.


No, we don't. We wait for zofu to convert it to something we can actually use.

Quote:
If not - pick up your free copy at MS's website.


And then we have to still convert the SQL to something the actual databases we use can parse. Which requires quite some regex gymnastics.

Parsing CSV (especially as this dump primarily contains numbers only, no complex strings) is trivial and fault-free in comparison.

XML is ok (better). Just no $/!"& MSSQL .BAK file.
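As an illustration of how little is needed to parse a numbers-only CSV with explicit types, here is a sketch using Python's standard csv module. The column names in `SCHEMA` are assumptions (only typeID, regionID and transactionCount are mentioned in the thread), and it assumes prices are written with "." as the decimal separator regardless of locale.

```python
import csv
import io
from decimal import Decimal

# Assumed columns and the type each one should be converted to.
SCHEMA = {
    "regionID": int,
    "typeID": int,
    "transactionCount": int,
    "avgPrice": Decimal,
}


def parse_dump(text):
    """Parse CSV text into dicts with typed values for the SCHEMA columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: convert(row[name]) for name, convert in SCHEMA.items()}
        for row in reader
    ]
```

Keeping the types in a small schema table next to the parser covers the missing-types complaint without requiring a richer file format.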

E man Industries
Posted - 2011.08.30 21:00:00 - [87]
 

As a dumb user I like this.
Fitting tools like EFT used to calculate how much a given ship would cost. This was very helpful in fitting. Does that meta 4 item cost a stupid amount? Etc.

Also, having access out of game to Jita prices, I can make quick decisions on whether to sell locally or transport to Jita for sale, as I can compare the cost.

I do not want the data to be live, however; the data should be 1+ week old. Let in-game speculation rule rather than a database that is much easier to run through a bot to show differences in price.

CCP Stillman

Posted - 2011.08.30 21:08:00 - [88]
 

First of all, thanks to everybody who's shared their opinions and comments so far. Keep them coming, they're immensely helpful.

On the format discussion, I want to propose a suggestion. It might be silly, but hear me out:

CSV is a very simple format. I know that at least the database engine I work with on a daily basis (MSSQL) can import a CSV file into a table in about one short line of SQL. And I'd be surprised if other database engines couldn't.

I know for a fact we have players around here who have expert knowledge of other database engines, and who would know how to write a CREATE statement for the data and a query for importing the data into a database.

So how about we throw a page on EVELopedia, where everybody can share how to import a CSV file into their database engine of choice. That means we can release a single format, and if people wish to import it into a database, or any other application one could think of, then they can head on over to EVELopedia and find out how to do it easily.

Does that seem like a nice middle-ground? We can't please everybody, but at least we can make sure that the information for how to consume the information is easily available.

Just an idea. I'd be interested to hear what you all think.
-Stillman
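As an example of the kind of recipe such a page could hold, here is a sketch for SQLite using Python's standard library. The table and column names are assumptions for illustration, not the layout of the actual dump.

```python
import csv
import io
import sqlite3

# Assumed table layout; adjust the columns to match the real dump.
CREATE = """
CREATE TABLE market_history (
    date TEXT NOT NULL,
    regionID INTEGER NOT NULL,
    typeID INTEGER NOT NULL,
    avgPrice REAL,
    transactionCount INTEGER,
    PRIMARY KEY (date, regionID, typeID)
)
"""


def import_csv(conn, text):
    """Create the table and load CSV text into it; returns rows inserted."""
    conn.execute(CREATE)
    reader = csv.DictReader(io.StringIO(text))
    rows = [
        (r["date"], int(r["regionID"]), int(r["typeID"]),
         float(r["avgPrice"]), int(r["transactionCount"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO market_history VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

For a quick trial, `sqlite3.connect(":memory:")` gives a throwaway database to import into.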

Marcel Devereux
Aideron Robotics
Posted - 2011.08.30 21:09:00 - [89]
 

Originally by: Hel O'Ween
Please, please do not release the data in CSV format. CSV has no strong data types nor is it locale aware. It's an ugly hack, the lowest common denominator ... from a couple of decades ago.

Anything is better than CSV. If you're looking for a text-based format, go for XML. 3rd party devs working with the (official) SDE already do have MSSQL installed somewhere. If not - pick up your free copy at MS's website.


Not everyone has Windows. This format needs to be some text-based format.

Eybium
Hephaestus Shipbuilding
Posted - 2011.08.30 21:18:00 - [90]
 

First off, thank you very much for this information. I am already drawing useful insight from it.

I can confirm Arkady Sadik's observation is true for the SQL backup as well; the stationID field is actually populated with regionID.

Also, although I don't have an opinion on which format is ultimately used for this information, I might suggest that if SQL Backups are the future, to produce those backups in 10.00 or even 9.xx format for users running older instances of SQL Server 2008 and 2005. Currently the dump can only be restored to a SQL Server 2008 R2 instance.


