amd64: finish the tail in memset with an overlapping store
author Mateusz Guzik <mjg@FreeBSD.org>
Mon, 22 Oct 2018 06:44:20 +0000 (06:44 +0000)
committer Mateusz Guzik <mjg@FreeBSD.org>
Mon, 22 Oct 2018 06:44:20 +0000 (06:44 +0000)
commit 099c6f6d45c0b4cd50d768428d6c1cf0ca93c624
tree 4cf091d18f3b5aea29ad108ae63b8e0eb9abcc44
parent 4a8e4793ed8d066f94da92a9161dc7f2326682c5

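Instead of picking store sizes that exactly fit the remaining tail, we can
shift the target by -8 + tail and finish with a single 8-byte store that
overlaps the area already written. A blind write to an area that was just
rep stosq'ed comes with a penalty, so the store is done conditionally.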
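
The overlapping tail store can be sketched in C as follows. This is only an
illustration under stated assumptions, not the assembly in support.S: the
name memset_overlap_tail is hypothetical, the length is assumed to be at
least 8 bytes, and skipping the store when the tail is zero is one plausible
reading of "conditionally".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch: fill len bytes at dst with the byte c.
 * Assumes len >= 8 so the overlapping store never starts before dst.
 */
static void *
memset_overlap_tail(void *dst, int c, size_t len)
{
	unsigned char *p = dst;
	/* Replicate the fill byte into all 8 bytes of a 64-bit word. */
	uint64_t pattern = 0x0101010101010101ULL * (unsigned char)c;
	size_t words = len / 8;
	size_t tail = len % 8;

	assert(len >= 8);

	/* Bulk fill in 8-byte units (stands in for rep stosq). */
	for (size_t i = 0; i < words; i++) {
		memcpy(p, &pattern, 8);
		p += 8;
	}

	/*
	 * Finish the tail with one 8-byte store shifted by -8 + tail.
	 * It rewrites 8 - tail bytes that are already filled, so it is
	 * skipped when there is no tail.
	 */
	if (tail != 0)
		memcpy(p + tail - 8, &pattern, 8);

	return (dst);
}
```

For a 257-byte buffer the loop writes 32 words (256 bytes) and the final
store covers the last 8 bytes of the buffer, rewriting 7 of them.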

Sample win (roughly 3%) on EPYC when zeroing a 257-byte buffer (tail = 1)
aligned to 16 bytes:
before: 44782846 ops/s
after:  46118614 ops/s

Idea stolen from NetBSD.

Sponsored by: The FreeBSD Foundation
sys/amd64/amd64/support.S