With the removal of SSE2 runtime detection made in
golang.org/cl/344350 we can remove this mechanism as there
are no required features anymore.
For making sure CPUs running a go program support all
the minimal hardware requirements the go runtime should
do feature checks early in the runtime initialization
before it is likely any compiler emitted but unsupported
instructions are used. This is already the case for e.g.
checking MMX support on 386 arch targets.
Change-Id: If7b1cb6f43233841e917d37a18314d06a334a734
Reviewed-on: https://go-review.googlesource.com/c/go/+/354209
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
When GO386=sse2 we can assume sse2 to be present without
a runtime check. If GO386=softfloat is set we can avoid
the usage of SSE2 even if detected.
This might cause a memcpy, memclr and bytealg slowdown of Go
binaries compiled with softfloat on machines that support
SSE2. Such setups are rare and should use GO386=sse2 instead
if performance matters.
On targets that support SSE2 we avoid the runtime overhead of
dynamic cpu feature dispatch.
The removal of runtime sse2 checks also allows to simplify
internal/cpu further by removing handling of the required
feature option as a followup after this CL.
Change-Id: I90a853a8853a405cb665497c6d1a86556947ba17
Reviewed-on: https://go-review.googlesource.com/c/go/+/344350
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
To measure all instructions having been completed before reading
the time stamp counter with RDTSC an instruction sequence that
has instruction stream serializing properties which guarantee
waiting until all previous instructions have been executed is
needed. This does not necessary mean to wait for all stores to
be globally visible.
This CL aims to remove vendor specific logic for determining the
instruction sequence with CPU feature flag checks that are
CPU vendor independent.
For intel LFENCE has the wanted properties at least
since it was introduced together with SSE2 support.
On AMD instruction stream serializing LFENCE is supported by setting
an MSR C001_1029[1]=1 on AMD family 10h/12h/14h/15h/16h/17h processors.
AMD family 0Fh/11h processors support LFENCE as serializing always.
AMD plans support for this MSR and access to this bit for all future processors.
Source: https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf
Reading the MSR to determine LFENCE properties is not always possible
or reliable (hypervisors). The Linux kernel is relying on serializing
LFENCE on AMD CPUs since a commit in July 2019: https://lkml.org/lkml/2019/7/22/295
and the MSR C001_1029 to enable serialization has been set by default
with the Spectre v1 mitigations.
Using an MFENCE on AMD is waiting on previous instructions having been executed
but in addition also flushes store buffers.
To align the serialization properties without runtime detection
of CPU manufacturers we can use the newer RDTSCP instruction which
waits until all previous instructions have been executed.
RDTSCP is available on Intel since around 2008 and on AMD CPUs since
around 2006. Support for RDTSCP can be checked independently
of manufacturer by checking CPUID bits.
Using RDTSCP is the default in Linux to read TSC in program order
when the instruction is available.
e22ce8eb63/arch/x86/include/asm/msr.h (L231)
Change-Id: Ifa841843b9abb2816f8f0754a163ebf01385306d
Reviewed-on: https://go-review.googlesource.com/c/go/+/344429
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Remove all cpu features from the ARM64 struct that are not initialized
to reduce cache lines used and to avoid those features being
accidentially used without actual detection if they are present.
Add missing option to mask the CPUID feature.
Change-Id: I94bf90c0655de1af2218ac72117ac6c52adfc289
Reviewed-on: https://go-review.googlesource.com/c/go/+/267658
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Trust: Martin Möhrmann <moehrmann@google.com>
Like in x/sys/cpu, use anonymous structs to declare the CPU feature vars
instead of defining single-use types. Also, order the vars
alphabetically.
Change-Id: Iedd3ca51916e3cbb852d2aeed18b3a4c6613e778
Reviewed-on: https://go-review.googlesource.com/c/go/+/221757
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Use the following (suboptimal) script to obtain a list of possible
typos:
#!/usr/bin/env sh
set -x
git ls-files |\
grep -e '\.\(c\|cc\|go\)$' |\
xargs -n 1\
awk\
'/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\
hunspell -d en_US -l |\
grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\
grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\
sort -f |\
uniq -c |\
awk '$1 == 1 { print $2; }'
Then, go through the results manually and fix the most obvious typos in
the non-vendored code.
Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae
Reviewed-on: https://go-review.googlesource.com/c/go/+/193848
Reviewed-by: Robert Griesemer <gri@golang.org>
This CL changes the internal/cpu API to more closely match the
public version in x/sys/cpu (added in CL 163003). This will make it
easier to update the dependencies of vendored code. The most prominent
renaming is from VE1 to VXE for the vector-enhancements facility 1.
VXE is the mnemonic used for this facility in the HWCAP vector.
Change-Id: I922d6c8bb287900a4bd7af70567e22eac567b5c1
Reviewed-on: https://go-review.googlesource.com/c/164437
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In the s390x assembly implementation of NIST P-256 curve, utilize faster multiply/square
instructions introduced in the z14. These new instructions are designed for crypto
and are constant time. The algorithm is unchanged except for faster
multiplication when run on a z14 or later. On z13, the original mutiplication
(also constant time) is used.
P-256 performance is critical in many applications, such as Blockchain.
name old time new time delta
BaseMultP256 24396 ns/op 21564 ns/op 1.13x
ScalarMultP256 87546 ns/op 72813 ns/op. 1.20x
Change-Id: I7e6d8b420fac56d5f9cc13c9423e2080df854bac
Reviewed-on: https://go-review.googlesource.com/c/146022
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Change internal/cpu feature configuration to use
GODEBUG=cpu.feature1=value,cpu.feature2=value...
instead of GODEBUGCPU=feature1=value,feature2=value... .
This is not a backwards compatibility breaking change
since GODEBUGCPU was introduced in go1.11 as an
undocumented compiler experiment.
Fixes#28757
Change-Id: Ib21b3fed2334baeeb061a722ab1eb513d1137e87
Reviewed-on: https://go-review.googlesource.com/c/149578
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Minimum Go requirement for ppc64(le) architecture support is POWER8.
https://github.com/golang/go/wiki/MinimumRequirements#ppc64-big-endian
Reduce CPU features supported in internal/cpu to those needed to
test minimum requirements and cpu feature kernel support for ppc64(le).
Currently no internal/cpu feature variables are used to guard code
from using unsupported instructions. The IsPower9 feature variable
and detection is kept as it will soon be used to guard code execution.
Reducing the set of detected CPU features for ppc64(le) makes
implementing Go support for new operating systems easier as
CPU feature detection for ppc64(le) needs operating system support
(e.g. hwcap on Linux and getsystemcfg syscall on AIX).
Change-Id: Ic4c17b31610970e481cd139c657da46507391d1d
Reviewed-on: https://go-review.googlesource.com/c/145117
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This CL adds the ability to enable the cpu feature FEATURE by specifying
FEATURE=on in GODEBUGCPU. Syntax support to enable cpu features is useful
in combination with a preceeding all=off to disable all but some specific
cpu features. Example:
GODEBUGCPU=all=off,sse3=on
This CL implements printing of warnings for invalid GODEBUGCPU settings:
- requests enabling features that are not supported with the current CPU
- specifying values different than 'on' or 'off' for a feature
- settings for unkown cpu feature names
Updates #27218
Change-Id: Ic13e5c4c35426a390c50eaa4bd2a408ef2ee21be
Reviewed-on: https://go-review.googlesource.com/c/141800
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This change exposes feature flags needed to implement an FMA intrinsic
on ARM CPUs via auxv's HWCAP bits. Specifically, it exposes HasVFPv4 to
detect if an ARM processor has the fourth version of the vector floating
point unit. The relevant instruction for this CL is VFMA, emitted in Go
as FMULAD.
Updates #26630.
Change-Id: Ibbc04fb24c2b4d994f93762360f1a37bc6d83ff7
Reviewed-on: https://go-review.googlesource.com/c/126315
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Enabling GODEBUGCPU without the need to set GOEXPERIMENT=debugcpu enables
trybots and builders to run tests for GODEBUGCPU features in upcoming CLs
that will implement the new syntax and features for non-experimental
GODEBUGCPU support from proposal golang.org/issue/27218.
Updates #27218
Change-Id: Icc69e51e736711a86b02b46bd441ffc28423beba
Reviewed-on: https://go-review.googlesource.com/c/141817
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
sys here is runtime/internal/sys.
Replace uses of sys.CacheLineSize for padding by
cpu.CacheLinePad or cpu.CacheLinePadSize.
Replace other uses of sys.CacheLineSize by cpu.CacheLineSize.
Remove now unused sys.CacheLineSize.
Updates #25203
Change-Id: I1daf410fe8f6c0493471c2ceccb9ca0a5a75ed8f
Reviewed-on: https://go-review.googlesource.com/126601
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The new constant CacheLinePadSize can be used to compute best effort
alignment of structs to cache lines.
e.g. the runtime can use this in the locktab definition:
var locktab [57]struct {
l spinlock
pad [cpu.CacheLinePadSize - unsafe.Sizeof(spinlock{})]byte
}
Change-Id: I86f6fbfc5ee7436f742776a7d4a99a1d54ffccc8
Reviewed-on: https://go-review.googlesource.com/131237
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Assumes mandatory VFP and VFPv3 support to be present by default
but not IDIVA if AT_HWCAP is not available.
Adds GODEBUGCPU options to disable the use of code paths in the runtime
that use hardware support for division.
Change-Id: Ida02311bd9b9701de3fc120697e69445bf6c0853
Reviewed-on: https://go-review.googlesource.com/114826
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The runtime package already imports the internal/cpu package, so there
is no reason for it to use go:linkname comments to refer to
internal/cpu functions and variables. Since internal/cpu is internal,
we can just export those names. Removing the obscurity of go:linkname
outweighs the minor additional complexity added to the internal/cpu API.
Change-Id: Id89951b7f3fc67cd9bce67ac6d01d44a647a10ad
Reviewed-on: https://go-review.googlesource.com/128755
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Add a CacheLinePad struct type to internal/cpu that has a size of CacheLineSize.
This can be used for padding structs in order to avoid false sharing.
Updates #25203
Change-Id: Icb95ae68d3c711f5f8217140811cad1a1d5be79a
Reviewed-on: https://go-review.googlesource.com/116276
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Hardware AES support in Go on s390x currently requires ECB, CBC
and CTR modes be available. It also requires that either the
GHASH or GCM facilities are available. The existing checks missed
some of these constraints.
While we're here simplify the cpu package on s390x, moving masking
code out of assembly and into Go code. Also, update SHA-{1,256,512}
implementations to use the cpu package since that is now trivial.
Finally I also added a test for internal/cpu on s390x which loads
/proc/cpuinfo and checks it against the flags set by internal/cpu.
Updates #25822 for changes to vet whitelist.
Change-Id: Iac4183f571643209e027f730989c60a811c928eb
Reviewed-on: https://go-review.googlesource.com/114397
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I don't know why these files were not formatted. Perhaps because
their changes came from Github PRs?
Change-Id: Ida8d7b9a36f0d1064caf74ca1911696a247a9bbe
Reviewed-on: https://go-review.googlesource.com/114824
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
When the internal/cpu package was introduced, the AES package still used
the custom crypto/internal/cipherhw package for amd64 and s390x. This
change removes that package entirely in favor of directly referencing the
cpu feature flags set and exposed by the internal/cpu package. In
addition, 5 new flags have been added to the internal/cpu s390x struct
for detecting various cipher message (KM) features.
Change-Id: I77cdd8bc1b04ab0e483b21bf1879b5801a4ba5f4
GitHub-Last-Rev: a611e3ecb1f480dcbfce3cb0c8c9e4058f56c1a4
GitHub-Pull-Request: golang/go#24766
Reviewed-on: https://go-review.googlesource.com/105695
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Needs the go compiler to be build with GOEXPERIMENT=debugcpu to be active.
The GODEBUGCPU environment variable can be used to disable usage of
specific processor features in the Go standard library.
This is useful for testing and benchmarking different code paths that
are guarded by internal/cpu variable checks.
Use of processor features can not be enabled through GODEBUGCPU.
To disable usage of AVX and SSE41 cpu features on GOARCH amd64 use:
GODEBUGCPU=avx=0,sse41=0
The special "all" option can be used to disable all options:
GODEBUGCPU=all=0
Updates #12805
Updates #15403
Change-Id: I699c2e6f74d98472b6fb4b1e5ffbf29b15697aab
Reviewed-on: https://go-review.googlesource.com/91737
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
runtime.alginit needs runtime/support_{aes,ssse3,sse41} feature flag
to init aeshash function but internal/cpu.init not be called yet.
This CL will call internal/cpu.initialize before runtime.alginit, so
that we can move all cpu features related code to internal/cpu.
Change-Id: I00b8e403ace3553f8c707563d95f27dade0bc853
Reviewed-on: https://go-review.googlesource.com/104636
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.
Once this is in, the next step is to move the other functions
(Compare, Equal, ...).
Update #19792
This change seems complicated enough that we might just declare
"not worth it" and abandon. Opinions welcome.
The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.
The wrapper functions have been cleaned up as they are now actually
checked by vet.
Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
change hash/crc32 package to use cpu package instead of using
runtime internal variables to check crc32 instruction
Change-Id: I8f88d2351bde8ed4e256f9adf822a08b9a00f532
Reviewed-on: https://go-review.googlesource.com/76490
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Add support for ADX cpuid bit detection and all instructions,
implied by that bit (ADOX/ADCX). They are useful for rsa and math/big in
general.
Change-Id: Idaa93303ead48fd18b9b3da09b3e79de2f7e2193
Reviewed-on: https://go-review.googlesource.com/74850
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change replaces the current runtime capabilities check for ppc64x with the
new internal/cpu package. It also adds support for the new POWER9 ISA and
capabilities.
Updates #15403
Change-Id: I5b64a79e782f8da3603e5529600434f602986292
Reviewed-on: https://go-review.googlesource.com/53830
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Implements detection of x86 cpu features that
are used in the go standard library.
Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.
Updates: #15403
Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>