Sunday, October 15, 2017

Packet Refinements (Update)

It turns out that I wasn't thinking properly about the problem, and it didn't take long to come up with a better solution for packet encoding. I should have tried a simpler solution to start with. I replaced the bytes.Buffer with a static buffer of the target packet length. Here are the results:

BenchmarkEncodeMsgConnectEx-8           30000000              40.6 ns/op
BenchmarkEncodeUsingReflection-8         2000000               674 ns/op

Encoding is now about 17 times faster than the reflection version (674 ns/op down to 40.6 ns/op), in line with the decode improvement. Cheers.
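For anyone curious what "a static buffer of the target packet length" could look like, here is a minimal sketch. It is not Chimera's actual writer: the Writer type, NewWriter(length), and Bytes() are hypothetical names for illustration, and little-endian byte order is an assumption about TQ's rules.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Writer is a hypothetical fixed-size packet writer. The buffer is allocated
// once at the target packet length, so writes never grow or reallocate.
type Writer struct {
	buf []byte
	pos int
}

// NewWriter allocates a zeroed buffer of exactly length bytes.
func NewWriter(length int) *Writer {
	return &Writer{buf: make([]byte, length)}
}

// Uint16 writes v at the current position (little-endian is assumed here).
func (w *Writer) Uint16(v uint16) *Writer {
	binary.LittleEndian.PutUint16(w.buf[w.pos:], v)
	w.pos += 2
	return w
}

// Uint32 writes v at the current position (little-endian is assumed here).
func (w *Writer) Uint32(v uint32) *Writer {
	binary.LittleEndian.PutUint32(w.buf[w.pos:], v)
	w.pos += 4
	return w
}

// Zero skips n bytes. The buffer starts zeroed, so nothing needs writing.
func (w *Writer) Zero(n int) *Writer {
	w.pos += n
	return w
}

// CString copies s into a fixed-width field of n bytes. Longer strings are
// truncated; shorter ones are padded by the zeroes already in the buffer.
func (w *Writer) CString(s string, n int) *Writer {
	copy(w.buf[w.pos:w.pos+n], s)
	w.pos += n
	return w
}

// Bytes returns the underlying buffer, ready to send.
func (w *Writer) Bytes() []byte { return w.buf }

func main() {
	w := NewWriter(12)
	w.Uint16(12).Uint16(1055).Uint32(7).CString("ab", 4)
	fmt.Printf("% x\n", w.Bytes()) // prints "0c 00 1f 04 07 00 00 00 61 62 00 00"
}
```

The sketch panics if a write runs past the end of the buffer, which is acceptable only because the packet length is known up front.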

Packet Refinements

The Code

Yesterday, I mentioned I would be refining Chimera's packet system to no longer use reflection. I didn't expect to get to it so soon, but I was thinking about solutions yesterday and today and finished a packet reader and writer. Packets use the new reader and writer to satisfy the decode and encode interfaces, which the client uses when receiving and sending data. The interfaces now look like this...

// Decoder defines a function that can be used to decode a byte buffer
// received by the server into a structure using TQ Digital's byte ordering
// rules. Must be met to be received by the server.
type Decoder interface { Decode(*Reader) interface{} }

// Encoder defines a function that can be used to encode a structure into a
// byte buffer to be sent to the client using TQ Digital's byte ordering
// rules. Must be met to be sent to the client.
type Encoder interface { Encode() *Writer }

The writer is implemented using a bytes.Buffer, and the reader using a fixed byte slice (split from the client's receive buffer). The methods are relatively uninteresting; what matters is how they are used:

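// Decode populates a Packet from the reader, one field at a time, in the
// order the fields appear on the wire.
func Decode(read *packet.Reader) *Packet {
    p := new(Packet)
    p.Length = read.Uint16()
    p.Identifier = read.Uint16()
    p.Account = read.Seek(4).CString(128)
    p.Server = read.CString(16)
    return p
}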

func (p *Packet) Encode() *packet.Writer {
    write := packet.NewWriter()
    write.Uint16(p.Length)
    write.Uint16(p.Identifier)
    write.Uint32(p.Identity)
    write.Uint32(p.Token)
    write.Uint32(p.Port).Zero(4)
    write.CString(p.IPAddress, 16)
    return write
}
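
Since the reader methods themselves aren't shown, here is a rough sketch of how a slice-backed reader with this call style could work. Everything in it is an assumption rather than Chimera's code: the field names, little-endian byte order, and especially Seek, which is treated here as a relative skip forward (the real method may be absolute).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Reader is a hypothetical cursor over a byte slice taken from the client's
// receive buffer. Nothing is copied; reads only advance pos.
type Reader struct {
	buf []byte
	pos int
}

// NewReader wraps b without copying it.
func NewReader(b []byte) *Reader { return &Reader{buf: b} }

// Uint16 reads two bytes (little-endian is assumed here) and advances.
func (r *Reader) Uint16() uint16 {
	v := binary.LittleEndian.Uint16(r.buf[r.pos:])
	r.pos += 2
	return v
}

// Seek skips n bytes forward from the current position (an assumption about
// the semantics) and returns the reader so calls can be chained.
func (r *Reader) Seek(n int) *Reader {
	r.pos += n
	return r
}

// CString reads a fixed-width field of n bytes and returns the text up to
// the first NUL byte, or the whole field if there is none.
func (r *Reader) CString(n int) string {
	field := r.buf[r.pos : r.pos+n]
	r.pos += n
	if i := bytes.IndexByte(field, 0); i >= 0 {
		field = field[:i]
	}
	return string(field)
}

func main() {
	// length=14, identifier=1051, 4 skipped bytes, then a 6-byte name field.
	data := []byte{0x0e, 0x00, 0x1b, 0x04, 0, 0, 0, 0, 'b', 'o', 'b', 0, 0, 0}
	r := NewReader(data)
	length := r.Uint16()
	id := r.Uint16()
	name := r.Seek(4).CString(6)
	fmt.Println(length, id, name) // prints "14 1051 bob"
}
```

There are no bounds checks in this sketch, so a short packet would panic; a real server would want to validate the length before decoding.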

Test Results

As expected, this is much more performant than using reflection. For my decode tests, I tested decoding MsgConnect. With my old reflection-based system, where structures were encoded and decoded automatically, 1000000 iterations averaged about 1298 ns/op. With the new reader, where structures are decoded using explicit instructions, 20000000 iterations averaged about 72.2 ns/op. Here are the results:

> go test warry.io/chimera/lib/packet -bench .
BenchmarkDecodeMsgConnect-8             20000000              72.2 ns/op
BenchmarkDecodeUsingReflection-8         1000000              1298 ns/op
PASS
ok      warry.io/chimera/lib/packet     2.890s

Encoding is slightly slower due to the bytes.Buffer, and thus I might get back to that later. For my encode tests, I tested encoding MsgConnectEx. These were the results:

BenchmarkEncodeMsgConnectEx-8           10000000               191 ns/op
BenchmarkEncodeUsingReflection-8         2000000               722 ns/op

Conclusion

To state the obvious: don't use reflection in a high-performance server. Removing reflection made decoding about 18 times faster (1298 ns/op down to 72.2 ns/op). Encoding is currently only about 4 times faster (722 ns/op down to 191 ns/op), so I'll work on that next (maybe). If you're interested in benchmarking in Go, check out the testing package: https://golang.org/pkg/testing
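
As a rough illustration of this kind of comparison, the sketch below benchmarks a reflection-based decode against a hand-written one. It is not the Chimera benchmark: the header struct is hypothetical, encoding/binary's binary.Read (which uses reflection for structs) stands in for the old system, and little-endian order is assumed. Benchmarks normally live in a _test.go file and run with go test -bench, as in the output above; testing.Benchmark is used here only to keep the example in a single runnable file.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"testing"
)

// header is a hypothetical fixed-layout structure used only for this test.
type header struct {
	Length     uint16
	Identifier uint16
	Identity   uint32
}

// sink keeps the compiler from optimizing the manual decode away.
var sink header

// decodeReflect fills the struct via binary.Read, which walks the struct
// fields using reflection.
func decodeReflect(b []byte) (header, error) {
	var h header
	err := binary.Read(bytes.NewReader(b), binary.LittleEndian, &h)
	return h, err
}

// decodeManual reads each field explicitly at its known offset.
func decodeManual(b []byte) header {
	return header{
		Length:     binary.LittleEndian.Uint16(b[0:]),
		Identifier: binary.LittleEndian.Uint16(b[2:]),
		Identity:   binary.LittleEndian.Uint32(b[4:]),
	}
}

func main() {
	testing.Init() // registers testing flags when not running under go test
	data := []byte{8, 0, 0x1b, 0x04, 1, 0, 0, 0}

	reflected := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			h, err := decodeReflect(data)
			if err != nil {
				panic(err)
			}
			sink = h
		}
	})
	manual := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = decodeManual(data)
		}
	})

	fmt.Println("reflection:", reflected)
	fmt.Println("manual:    ", manual)
}
```

The exact numbers depend on the machine and Go version, so no expected output is shown.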

Friday, October 13, 2017

Returning to Chimera

Introduction

Without going into excuses about my extended absence, I completed the login sequence for Chimera a few months back and then left it "as-is". This post is just to let you know that I started development on Chimera again. After speaking with a member of the development community, my interest in developing a Conquer Online server was revived, for better or for worse.

Revision

When I left off, I had a serious performance problem with the packet system that I was struggling to find a clean solution for. When decoding client data, I used reflection and TQ's byte ordering rules to populate a packet structure. This resulted in roughly 300 ns per decode. My plan is to replace the reflection system with a stream system to reduce this time (will retest). Decoding would then be as simple as wrapping a bytes.Buffer in a packet.Reader, and then using methods to manually decode the structure. Something like this:

// Reader encapsulates a byte buffer, used by packet structures to dynamically
// decode client data using TQ Digital's byte ordering rules. The base buffer
// can be assigned directly from the client's data response.
type Reader struct {
    bytes.Buffer
}

I'd use the same interfaces that I did before, but have the server events pass the bytes.Buffer down rather than a byte array. When encoding, I'd do the same thing: return a bytes.Buffer rather than an array (which would just end up wrapped in another bytes.Buffer again). Hopefully that makes sense.

Conclusion

The point is that I'm interested in keeping this project going and making it more performant before really building out Conquer's subsystems. I want to make this project as easy to develop for and understand as possible, and to work with others in the community on the tools and resources needed for Conquer development. The wiki is also back up if you'd like to contribute. Cheers.