Performance Tips

Make similar draw function calls successive

The fewer draw commands there are, the better the performance.

A call to a drawing function such as DrawImage or Fill is usually treated as one (internal) draw command, but there is an exception. Successive draw calls are merged into one draw command when all of the following conditions are satisfied:

All the functions are DrawImage or DrawTriangles
All the render targets are the same (A in A.DrawImage(B, op))
All the blends are the same
All the filter values are the same
All the address values are the same (only for DrawTriangles)

Even when all of the above conditions are satisfied, multiple draw commands may still be issued in rare cases. Ebitengine images usually share an internal automatic texture atlas, but when the atlas is used up, or when you create a huge image, those images cannot be placed on the same texture atlas. In this case, the draw commands are separated. The texture atlas size depends on the graphics device. Another case is when you use an offscreen image as a render source: an offscreen image most likely does not share the texture atlas.

examples/sprites is a good example of drawing more than 10000 sprites with one (or a few) draw command(s).
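The merging conditions listed above can be modelled in a few lines of plain Go. This is only a rough sketch to show why call order matters: drawCall, mergeable, and countDrawCommands are hypothetical names, not Ebitengine APIs, and the model ignores the texture atlas and offscreen cases described above.

```go
package main

import "fmt"

// drawCall is a hypothetical, simplified record of one drawing function call.
// It is not an Ebitengine type.
type drawCall struct {
	fn      string // "DrawImage", "DrawTriangles", "Fill", ...
	target  string // render target (A in A.DrawImage(B, op))
	blend   string
	filter  string
	address string // address mode; DrawImage calls should all carry the same placeholder value
}

// mergeable reports whether two successive calls meet the listed conditions.
func mergeable(a, b drawCall) bool {
	batchable := func(fn string) bool {
		return fn == "DrawImage" || fn == "DrawTriangles"
	}
	if !batchable(a.fn) || !batchable(b.fn) {
		return false
	}
	return a.target == b.target &&
		a.blend == b.blend &&
		a.filter == b.filter &&
		a.address == b.address
}

// countDrawCommands counts the draw commands this simplified model would issue.
func countDrawCommands(calls []drawCall) int {
	n := 0
	for i, c := range calls {
		if i == 0 || !mergeable(calls[i-1], c) {
			n++
		}
	}
	return n
}

func main() {
	sprite := drawCall{fn: "DrawImage", target: "screen", blend: "source-over", filter: "nearest"}
	glow := drawCall{fn: "DrawImage", target: "screen", blend: "lighter", filter: "nearest"}

	interleaved := []drawCall{sprite, glow, sprite, glow}
	grouped := []drawCall{sprite, sprite, glow, glow}

	fmt.Println("interleaved:", countDrawCommands(interleaved))
	fmt.Println("grouped:", countDrawCommands(grouped))
}
```

In this model, the same four calls cost fewer draw commands when the calls that share a blend are issued one after another, which is the point of keeping similar draw calls successive.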

Check the actual draw commands with the `ebitenginedebug` build tag

To see the actual draw commands, you can use the ebitenginedebug build tag (or ebitendebug for Ebitengine v2.3 or older). For example, if you run the blocks example, you will see logs like the following:

go run -tags=ebitenginedebug github.com/hajimehoshi/ebiten/v2/examples/blocks@latest

...
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
...

Avoid changing render sources' pixels

Ebitengine records almost all draw function calls so that it can restore images when the context is lost. When a render source's pixels are changed after it has been used as a render source, Ebitengine has to attempt a complicated calculation to restore them.

A.DrawImage(B, op) // B is a render source
B.DrawImage(C, op) // tries to change B's pixels. Avoid this if possible.

Cyclic drawing should also be avoided.

A.DrawImage(B, op)
B.DrawImage(A, op) // cyclic drawing! Avoid this if possible.

Avoid using the screen as a render source

The screen is a special image because it is cleared at every frame. As explained above, Ebitengine records drawing function calls, and using the screen as a render source makes the restoring calculation complicated.

Don't call `(*Image).ReplacePixels` too much

ReplacePixels is a relatively heavy function.
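One way to keep the number of calls down is to collect pixel changes in a CPU-side buffer and upload them at most once per frame. The following is a minimal sketch under that assumption. canvas, pixelReplacer, and countingImage are hypothetical names introduced for illustration; the interface only names the ReplacePixels method, which takes RGBA bytes (4 bytes per pixel), so that the sketch runs without a GPU.

```go
package main

import "fmt"

// pixelReplacer is a hypothetical interface naming the one method used here.
// *ebiten.Image is expected to satisfy it through (*Image).ReplacePixels.
type pixelReplacer interface {
	ReplacePixels(pixels []byte)
}

// canvas collects pixel changes in CPU memory (RGBA, 4 bytes per pixel).
type canvas struct {
	width, height int
	pixels        []byte
	dirty         bool
}

func newCanvas(width, height int) *canvas {
	return &canvas{width: width, height: height, pixels: make([]byte, 4*width*height)}
}

// set changes one pixel in the CPU-side buffer only.
// Out-of-range coordinates are ignored.
func (c *canvas) set(x, y int, r, g, b, a byte) {
	if x < 0 || y < 0 || x >= c.width || y >= c.height {
		return
	}
	i := 4 * (y*c.width + x)
	c.pixels[i], c.pixels[i+1], c.pixels[i+2], c.pixels[i+3] = r, g, b, a
	c.dirty = true
}

// flush uploads the buffer with a single ReplacePixels call, and only if
// something changed since the last flush. It reports whether it uploaded.
func (c *canvas) flush(dst pixelReplacer) bool {
	if !c.dirty {
		return false
	}
	dst.ReplacePixels(c.pixels)
	c.dirty = false
	return true
}

// countingImage is a stand-in for an image that only counts uploads.
type countingImage struct{ uploads int }

func (img *countingImage) ReplacePixels(pixels []byte) { img.uploads++ }

func main() {
	c := newCanvas(64, 64)
	img := &countingImage{}
	for i := 0; i < 64; i++ {
		c.set(i, i, 0xff, 0xff, 0xff, 0xff) // many cheap CPU-side writes
	}
	c.flush(img) // one upload for all of them
	c.flush(img) // nothing changed, so no upload
	fmt.Println("uploads:", img.uploads)
}
```

In a real game, flush would typically be called once from the drawing code with the actual image as dst.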

Don't call `(*Image).At` too much

At is also heavy: it has to resolve all the queued draw commands and then retrieve the pixels from the GPU.
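When the pixels you need come from a static image that was decoded at load time, one possible way to avoid calling At on the GPU-backed image is to keep a CPU-side copy of just the information you need. The sketch below assumes that situation and builds an opacity mask from a standard image.Image, for example for pixel-level hit tests. alphaMask, newAlphaMask, solidAt, and newDemoSprite are hypothetical names, not Ebitengine APIs.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// alphaMask is a CPU-side record of which pixels are not fully transparent.
type alphaMask struct {
	bounds image.Rectangle
	solid  []bool
}

// newAlphaMask scans the source image once, at load time.
func newAlphaMask(img image.Image) *alphaMask {
	b := img.Bounds()
	m := &alphaMask{bounds: b, solid: make([]bool, b.Dx()*b.Dy())}
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			_, _, _, a := img.At(x, y).RGBA()
			m.solid[(y-b.Min.Y)*b.Dx()+(x-b.Min.X)] = a > 0
		}
	}
	return m
}

// solidAt answers per-pixel queries from memory, without touching the GPU.
func (m *alphaMask) solidAt(x, y int) bool {
	if !image.Pt(x, y).In(m.bounds) {
		return false
	}
	return m.solid[(y-m.bounds.Min.Y)*m.bounds.Dx()+(x-m.bounds.Min.X)]
}

// newDemoSprite builds a 4x4 image with a single opaque pixel at (1, 2).
// It stands in for an image decoded at load time.
func newDemoSprite() image.Image {
	img := image.NewRGBA(image.Rect(0, 0, 4, 4))
	img.Set(1, 2, color.RGBA{R: 0xff, A: 0xff})
	return img
}

func main() {
	mask := newAlphaMask(newDemoSprite())
	fmt.Println(mask.solidAt(1, 2), mask.solidAt(0, 0))
}
```

This does not help when the pixels are the result of GPU rendering; in that case the only option is to call At less often.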

It is fine to create one player for one short sound effect

Creating an audio.Player is not expensive. It is fine to create one player for one short sound effect. For example, this code is totally fine:

// PlaySE plays a sound effect.
func PlaySE(bs []byte) {
    sePlayer := audioContext.NewPlayerFromBytes(bs)
    // sePlayer is never GCed as long as it plays.
    sePlayer.Play()
}

In this code, (*audio.Context).NewPlayerFromBytes is used instead of (*audio.Context).NewPlayer. (*audio.Context).NewPlayerFromBytes creates a new stream on each call, while (*audio.Context).NewPlayer accepts an existing stream. Because a stream holds both its byte data and its current position, one stream cannot be shared by multiple players. With (*audio.Context).NewPlayerFromBytes, you can play a sound effect regardless of whether the same sound is already playing.

For BGM, whose byte data can be much bigger than that of sound effects, it is recommended to reuse one audio.Player, e.g., with (*audio.Player).Rewind. This is because preparing a byte slice for the whole piece of music at once might be expensive. It should be rare to play the same BGM more than once at the same time anyway.
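Reusing a single player for BGM might look like the sketch below. bgmPlayer is a hypothetical interface that lists only the methods the sketch relies on (IsPlaying, Rewind, and Play, which *audio.Player is expected to provide), and fakePlayer is a stand-in so that the example runs without an audio device.

```go
package main

import "fmt"

// bgmPlayer is a hypothetical interface listing the methods used here.
// *audio.Player is expected to satisfy it.
type bgmPlayer interface {
	IsPlaying() bool
	Rewind() error
	Play()
}

// restartBGM rewinds the existing player and makes sure it is playing,
// instead of creating a new player from a new byte slice.
func restartBGM(p bgmPlayer) error {
	if err := p.Rewind(); err != nil {
		return err
	}
	if !p.IsPlaying() {
		p.Play()
	}
	return nil
}

// fakePlayer is a stand-in that only tracks position and play state.
type fakePlayer struct {
	position int
	playing  bool
}

func (p *fakePlayer) IsPlaying() bool { return p.playing }
func (p *fakePlayer) Rewind() error   { p.position = 0; return nil }
func (p *fakePlayer) Play()           { p.playing = true }

func main() {
	bgm := &fakePlayer{position: 1234} // stopped somewhere in the middle
	if err := restartBGM(bgm); err != nil {
		fmt.Println("rewind failed:", err)
		return
	}
	fmt.Println(bgm.position, bgm.playing)
}
```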

Encourage using the discrete GPU on Windows

On Windows, you can encourage your application to use the discrete GPU instead of the integrated GPU by exporting two symbols: NvOptimusEnablement and AmdPowerXpressRequestHighPerformance. Unfortunately, this requires Cgo, so Ebitengine does not do this by default.

Jae's preferdiscretegpu is a nice package that enables this very easily.