1. go-smtp is replaced by a fork that reverts StartTLS removal.
2. SASL LOGIN is no longer supported by upstream go-sasl, readded disabled by default.
3. Updated endpoint code to match new go-smtp authentication interfaces.
4. certmagic repo had some renames
5. Minimum Go version increased to 1.23 to match dependencies.
1. Apply conn_max_idle_time to each connection individually,
not pool bucket.
2. Include local_addr in some log messages to help identify
individual connections in the pool.
3. Run conn.Close outside of keysLock and asynchronously. Ensures
slow server or dead connection won't cause pool operations to hang.
4. Set 5 second timeout for QUIT call in conn.Close.
To detect dead connections faster, there is no reason for any
server to take more than 5 seconds to respond to QUIT.
See #675.
Several comments were removed since they are not worth the trouble.
A few minor issues were addressed.
Most of remaining comments got corresponding GitHub issues assigned.
Previous approach consisted of multiple independent options with unknown
interaction between each other and not offering enough flexibility for
local policy configuration.
Additionally, it was not possible to implement downgrade protection
mentioned in #178 because it was not clear what is "downgrade" since
options were not related in any linear order, this commit makes it
explicit via the "security levels" system:
MX: DNSSEC > MTA-STS > Nothing
TLS: Authenticated+Encrypted > Encrypted > Plaintext
Note DNSSEC and MTA-STS being different levels, they provide different
security guarantees. Keeping them together under "authenticated" level
would not provide enough granularity for levels-based downgrade
protection and local policies.
'common_domain' MX authentication option is removed. It was offering no
real protection and therefore is was problematic to use together with
planned downgrade protection.
All security level errors are marked as temporary to force requeueing
and allow local admin to troubleshoot them without losing messages.
'remote' tests are changed to use testTarget function to initialize
tested module instance, since security levels mapping requires some
pre-initialization.
Support for IP literals in address domain-part is disabled because it
is incompatible with the new verification logic and was broken anyway
(#176).
TLS without authentication is still better than no TLS at all.
To save latency in transactions with a misconfigured recipient server
that cannot use TLS at all but still advertises STARTTLS support,
downgrade to non-authenticated TLS is attempted only on verification
errors (x509.UnknownAuthorityError or x509.HostnameError) and malformed
certificate errors (x509.ConstraintViolationError and
x509.CertificateInvalidError). In all other cases 'remote' module
fallbacks to plaintext directly.
While rearranging code to support this, some additional changes were
made to allow simplier implementation of security levels idea from #178.
See https://tools.ietf.org/html/rfc7435.
See #178.
As revealed by latency tracing using runtime/trace, MTA-STS cache miss
essentially doubles the connection time for outbound delivery. This is
mostly because MTA-STS lookup have to estabilish a TCP+TLS connection to
obtain the policy text (shame on Google for pushing that terribly
misdesigned protocol, but, well, it is better than nothing so we adopt
it).
Additionally, there is a number of additional DNS lookups needed (e.g.
TLSA record for DANE). This commit rearranges connection code so it is
possible to run all "additional" queries in parallel with the connection
estabilishment. However, this changes the behavior of TLS requirement
checks (including MTA-STS). The connection to the candidate MX is
already estabilished and STARTTLS is always attempted if it is
available. Only after that the policy check is done, using the result of
TLS handshake attempt (if any). If for whatever reason, the candidate MX
cannot be used, the connection is then closed. This might bring
additional overhead in case of configuration errors on the recipient
side, but it is believed to not be a major problem since this should not
happen often.
runtime/trace together with 'go tool trace' provides extremely powerful
tooling for performance (latency) analysis. Since maddy prides itself on
being "optimized for concurrency", it is a good idea to actually live up
to this promise.
Closes#144. No need to reinvent the wheel. The original issue
proposed a solution to use in production to detect "performance
anomalies", it is possible to use runtime/trace in production too, but
the corresponding flag to enable profiler endpoint is hidden behind the
'debugflags' build tag at the moment.
For SMTP code, the basic latency information can be obtained from
regular logs since they include timestamps with millisecond granularity.
After the issue is apparent, it is possible to deploy the server
executable compiled with tracing support and obtain more information
... Also add missing context.Context arguments to smtpconn.C.
The intention is to keep to repo root clean while the list of packages
is slowly growing.
Additionally, a bunch of small (~30 LoC) files in the repo root is
merged into a single maddy.go file, for the same reason.
Most of the internal code is moved into the internal/ directory. Go
toolchain will make it impossible to import these packages from external
applications.
Some packages are renamed and moved into the pkg/ directory in the root.
According to https://github.com/golang-standards/project-layout this is
the de-facto standard to place "library code that's ok to use by
external applications" in.
To clearly define the purpose of top-level directories, README.md files
are added to each.