Best way to read chatlogs in Python
 
This thread is older than 90 days and has been locked due to inactivity.


 

Ajurna Jakar
Gallente
Jian Products Engineering Group
Atlas.
Posted - 2011.06.28 11:29:00 - [1]
 

Hey, I'm writing a script to trawl through chatlogs for information.

The problem is that they are encoded in UCS-2. I have a workaround for this, but it seems very "hacky". I was wondering what the best way to convert these to normal Python strings would be.

My current implementation is this:
def clean_log_line(self, line):
    out = ''
    linedec = line.decode('utf-8', errors='ignore')
    for x in [linedec[i] for i in range(1, len(linedec), 2)]:
        out += x
    return str(out.strip())

Entity
X-Factor Industries
Synthetic Existence
Posted - 2011.06.28 13:08:00 - [2]
 

Well, first off, CCP is simply appending full UCS-2 strings to the file. This causes the BOM (byte order mark) to appear several times in the file.
While this is technically not a bug, it is undesirable, as it stops you from simply loading the file and decoding it in one go. The quick fix is to remove all the BOMs.

Here's how you could go about iterating over log lines:

# Python 2
import StringIO  # don't use cStringIO, it doesn't work on unicode.

fn = r"X:\Path\to\some\eve\chatlog.txt"

with open(fn, "rb") as f:
    log = f.read()
# With the BOMs removed, name the byte order explicitly: "\xff\xfe" means little-endian.
log = log.replace("\xff\xfe", "").decode("utf_16_le")
# log is now a normal unicode string with the whole log.

# you can then iterate over the lines like this:
for line in StringIO.StringIO(log):
    line = line.rstrip("\r\n")
    print line
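
For anyone on Python 3, the same idea could be sketched roughly as below. This is only a sketch under assumptions: it takes the log to be little-endian UTF-16 (which is what the "\xff\xfe" BOM indicates), and decode_chatlog and iter_log_lines are hypothetical names made up for illustration. It decodes first and then removes the BOM as a character (U+FEFF), so a stray FF FE byte pair that straddles two characters is not removed by accident.

```python
import io

BOM = "\ufeff"  # the byte order mark, once decoded


def decode_chatlog(raw):
    """Decode raw log bytes (assumed UTF-16-LE) and drop every embedded BOM."""
    return raw.decode("utf-16-le").replace(BOM, "")


def iter_log_lines(raw):
    """Yield each log line without its trailing line ending."""
    for line in io.StringIO(decode_chatlog(raw)):
        yield line.rstrip("\r\n")
```

Given raw = open(fn, "rb").read(), looping over iter_log_lines(raw) plays the same role as the StringIO loop above.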


Alternatively, you could split on the BOMs in the log and end up with the separate sessions. Could be useful :)
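
Splitting on the BOMs might look something like the following Python 3 sketch. It assumes, as suggested above, that each BOM marks the start of a separately appended chunk, and that the file is little-endian UTF-16. split_sessions is a hypothetical helper name, not anything provided by the game client or a library.

```python
def split_sessions(raw):
    """Split raw log bytes (assumed UTF-16-LE) into the chunks that follow each BOM."""
    text = raw.decode("utf-16-le")
    # Every decoded BOM (U+FEFF) starts a new chunk; drop the empty
    # piece that appears before the very first BOM.
    return [chunk for chunk in text.split("\ufeff") if chunk]
```

Each returned chunk still contains its own line endings, so it can be split into lines separately if needed.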


Ajurna Jakar
Gallente
Jian Products Engineering Group
Atlas.
Posted - 2011.06.28 13:13:00 - [3]
 

thank you very much.


 



 

