Performance Tips
Make similar draw function calls successive
The fewer draw commands, the better the performance.
One drawing function call such as DrawImage or Fill is usually treated as one (internal) draw command, but there is an exception. Successive drawing function calls are treated as one draw command when all the conditions below are satisfied:
- All the functions are DrawImage or DrawTriangles
- All the render targets are the same (A in A.DrawImage(B, op))
- All the blends are the same
- All the filter values are the same
- All the address values are the same (only for DrawTriangles)
Even when all the above conditions are satisfied, multiple draw commands can still be used in really rare cases. Ebitengine images usually share an internal automatic texture atlas, but when you use up the atlas, or you create a huge image, those images cannot be on the same texture atlas. In this case, the draw commands are separated. The texture atlas size depends on the graphics device. Another case is when you use an offscreen as a render source: an offscreen most likely does not share the texture atlas.
examples/sprites is a good example of drawing more than 10000 sprites with one (or a few) draw command(s).
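The merging rule above can be sketched with a toy model. This is only an illustration of why call order matters, not Ebitengine's actual implementation: the drawCall type, its fields, and the blend name "lighter" are hypothetical, and the texture atlas condition is ignored.

```go
package main

import "fmt"

// drawCall is a hypothetical record of one DrawImage or DrawTriangles call.
// It is a toy model for illustration, not an Ebitengine type.
type drawCall struct {
	target  string // render target (A in A.DrawImage(B, op))
	blend   string
	filter  string
	address string
}

// countCommands counts how many draw commands a sequence of calls would need
// if successive calls are merged only when all four fields match.
func countCommands(calls []drawCall) int {
	n := 0
	for i, c := range calls {
		if i == 0 || c != calls[i-1] {
			n++
		}
	}
	return n
}

func main() {
	sprite := drawCall{"screen", "source-over", "nearest", "clamp_to_zero"}
	glow := drawCall{"screen", "lighter", "nearest", "clamp_to_zero"}

	interleaved := []drawCall{sprite, glow, sprite, glow}
	grouped := []drawCall{sprite, sprite, glow, glow}

	fmt.Println(countCommands(interleaved)) // 4
	fmt.Println(countCommands(grouped))     // 2
}
```

In this model, drawing all the sprites first and all the glows afterwards halves the number of draw commands compared with alternating between the two blends.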
Know the actual drawing commands with the ebitenginedebug build tag
To see the actual drawing commands, you can use the ebitenginedebug build tag (or ebitendebug for Ebitengine v2.3 or older). For example, if you run the blocks example, you will see logs like the following:
go run -tags=ebitenginedebug github.com/hajimehoshi/ebiten/v2/examples/blocks@latest
...
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
draw-triangles: dst: 7 <- src: 1, colorm: <nil>, mode copy, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 7 <- src: 2, colorm: <nil>, mode source-over, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 1, colorm: <nil>, mode clear, filter: nearest, address: clamp_to_zero
draw-triangles: dst: 8 (screen) <- src: 7, colorm: <nil>, mode copy, filter: screen, address: clamp_to_zero
--
...
Avoid changing render sources' pixels
Ebitengine records almost all draw function calls so that it can restore images when a context loss happens. When a render source's pixels are changed after it has been used as a render source, Ebitengine has to do a complicated calculation to restore them.
A.DrawImage(B, op) // B is a render source
B.DrawImage(C, op) // tries to change B's pixels. Avoid this if possible.
Likewise, cyclic drawing should be avoided.
A.DrawImage(B, op)
B.DrawImage(A, op) // cyclic drawing! Avoid this if possible.
Avoid using the screen as a render source
The screen is a special image because it is cleared at every frame. As explained above, Ebitengine records drawing function calls, and using the screen as a render source makes the restoring calculation complicated.
Don't call (*Image).ReplacePixels too much
ReplacePixels is a relatively heavy function.
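One way to call it less often is to collect pixel edits in a CPU-side buffer and upload at most once per frame. The following is a minimal sketch under that assumption: pixelBuffer and its methods are hypothetical names, and the upload callback is a stand-in for a call such as img.ReplacePixels.

```go
package main

import "fmt"

// pixelBuffer is a hypothetical helper that collects pixel edits on the CPU
// and uploads them at most once per flush.
type pixelBuffer struct {
	width, height int
	pix           []byte // RGBA, 4 bytes per pixel
	dirty         bool
}

func newPixelBuffer(w, h int) *pixelBuffer {
	return &pixelBuffer{width: w, height: h, pix: make([]byte, 4*w*h)}
}

// set writes one pixel into the CPU-side buffer; nothing is uploaded yet.
func (b *pixelBuffer) set(x, y int, r, g, bl, a byte) {
	i := 4 * (y*b.width + x)
	b.pix[i], b.pix[i+1], b.pix[i+2], b.pix[i+3] = r, g, bl, a
	b.dirty = true
}

// flush calls upload once if anything changed since the last flush.
// It reports whether an upload happened.
func (b *pixelBuffer) flush(upload func([]byte)) bool {
	if !b.dirty {
		return false
	}
	upload(b.pix)
	b.dirty = false
	return true
}

func main() {
	uploads := 0
	upload := func([]byte) { uploads++ } // stand-in for img.ReplacePixels

	buf := newPixelBuffer(4, 4)
	for x := 0; x < 4; x++ {
		buf.set(x, 0, 255, 0, 0, 255) // four edits in one frame
	}
	buf.flush(upload) // one upload covers all four edits
	buf.flush(upload) // nothing changed, so no upload

	fmt.Println(uploads) // 1
}
```

The dirty flag also skips the upload entirely on frames where no pixel changed.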
Don't call (*Image).At too much
At is also heavy: it has to resolve all the queued draw commands and then retrieve pixels from the GPU.
It is fine to create one player for one short sound effect
Creating an audio.Player is not expensive. It is fine to create one player for one short sound effect. For example, this code is totally fine:
// PlaySE plays a sound effect.
func PlaySE(bs []byte) {
	sePlayer := audioContext.NewPlayerFromBytes(bs)
	// sePlayer is never GCed as long as it plays.
	sePlayer.Play()
}
In this code, (*audio.Context).NewPlayerFromBytes is used instead of (*audio.Context).NewPlayer. (*audio.Context).NewPlayerFromBytes creates a new stream on each call, while (*audio.Context).NewPlayer accepts an existing stream. As a stream has byte data and a position, one stream cannot be shared by multiple players. With (*audio.Context).NewPlayerFromBytes, you can play sound effects regardless of whether the same sound is already playing.
As for BGM, whose byte data can be much bigger than that of SEs, it is recommended to reuse one audio.Player by, e.g., (*audio.Player).Rewind. This is because preparing a byte slice for the whole piece of music at one time might be expensive. It should be rare to play the same BGM more than once at the same time anyway.
Encourage using the discrete GPU on Windows
On Windows, you can encourage your application to use the discrete GPU instead of the integrated GPU by exporting some symbols: NvOptimusEnablement and AmdPowerXpressRequestHighPerformance. Unfortunately, this requires Cgo, so Ebitengine does not do this by default.
Jae's preferdiscretegpu is a nice package that enables this very easily.