Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pedro Giffuni <pfg@freebsd.org>
hyperv/hn: Fixat RNDIS rxfilter after the successful RNDIS init.
Under certain conditions on certain versions of Hyper-V, the RNDIS
rxfilter is _not_ zero on the hypervisor side after the successful
RNDIS initialization, which breaks the assumption of any following
code (well, it breaks the RNDIS API contract actually). Clear the
RNDIS rxfilter explicitly, drain packets sneaking through, and drain
the interrupt taskqueues scheduled due to the stealth packets.
Reported by: dexuan@
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D10230
Leap-second smearing is an experimental option that may be specified in
ntp.conf(5) and the -x option on the command line to spread the effect
of a leap-second over an interval as specified by the leapsmearinterval
config file statement. Recommended values are between 7200 (2 hours) and
86400 (24 hours).
It is advised that leap-second smearing not be used for public NTP
servers (https://www.meinbergglobal.com/download/burnicki/Leap\
%20Second%20Smearing%20With%20NTP.pdf). It is also advised that NTP
clients not use a mix of NTP servers using leap-second smearing with
NTP servers not using leap-second smearing as that could cause
undefined client behaviour.
Leap-second smearing was committed to ports net/ntp and net/ntp-devel
by r426825 on 2016-11-22.
For three years now CAM does not use SIM lock, but still enforces SIM to
use it. Remove this requirement, allowing SIMs to have any locking they
prefer, if they pass no mutex to cam_sim_alloc().
Let firmware do its best first, and if it can't, try software recovery.
I would remove software timeout handler completely, but found bunch of
complains on command timeout on sparc64 mailing list few years ago, so
better be safe in case of interrupt loss.
MFC r315579, r315670: Add initial support for multiple MSI-X vectors.
For 24xx and above use 2 vectors (default and response queue).
For 26xx and above use 3 vectors (default, response and ATIO queues).
Due to global lock interrupt hardlers never run simultaneously now, but
at least this allows to save one regitster read per interrupt.
Interesting fixes which were not already merged: 0c7c611 Merge C++ demangler bug fixes from ELF Tool Chain (#40) 2b208d9 __cxa_demangle_gnu3: demangle 'z' as '...', not 'ellipsis' (#41)
mm [Fri, 31 Mar 2017 20:17:30 +0000 (20:17 +0000)]
MFC r315636,315876,316095:
Sync libarchive with vendor
Vendor changes/bugfixes (FreeBSD-related):
r315636:
PR 867 (bsdcpio): show numeric uid/gid when names are not found
PR 870 (seekable zip): accept files with valid ZIP64 EOCD headers
PR 880 (pax): Fix handling of "size" pax header keyword
PR 887 (crypto): Discard 3072 bytes instead of 1024 of first keystream
OSS-Fuzz issue 806 (mtree): rework mtree_atol10 integer parser
Break ACL read/write code into platform-specific source files
r315876:
Store extended attributes with extattr_set_link() if no fd is provided
Add extended attribute tests to libarchive and bsdtar
Fix tar's test_option_acls
Support the UF_HIDDEN file flag
r316095:
Constify variables in several places
Unify platform ACL code in a single source file
Fix unused variable if compiling on FreeBSD without NFSv4 ACL support
truckman [Fri, 31 Mar 2017 06:33:20 +0000 (06:33 +0000)]
MFC r315516
Change several constants used by the PIE algorithm from unsigned to signed.
- PIE_MAX_PROB is compared to variable of int64_t and the type promotion
rules can cause the value of that variable to be treated as unsigned.
If the value is actually negative, then the result of the comparsion
is incorrect, causing the algorithm to perform poorly in some
situations. Changing the constant to be signed cause the comparision
to work correctly.
- PIE_SCALE is also compared to signed values. Fortunately they are
also compared to zero and negative values are discarded so this is
more of a cosmetic fix.
- PIE_DQ_THRESHOLD is only compared to unsigned values, but it is small
enough that the automatic promotion to unsigned is harmless.
r314547
loader.efi: reduce the size of the staging area if necessary
The loader assumes physical memory in [2MB, 2MB + EFI_STAGING_SIZE)
is Conventional Memory, but actually it may not, e.g. in the case
of Hyper-V Generation-2 VM (i.e. UEFI VM) running on Windows
Server 2012 R2 host, there is a BootServiceData memory block at
the address 47.449MB and the memory is not writable.
Without the patch, the loader will crash in efi_copy_finish():
see PR 211746.
The patch verifies the end of the staging area, and reduces its
size if necessary. This way, the loader will not try to write into
the BootServiceData memory any longer.
Thank Marcel Moolenaar for helping me on this issue!
The patch also allocates the staging area in the first 1GB memory.
See the comment in the patch for this.
r314770
loader.efi: fix recent UEFI-boot regression on physical machines
This patch fixes my recent patch
"loader.efi: reduce the size of the staging area if necessary", which
causes EFI-boot failure on physical machines since Mar 2:
on the host there is a 1MB LoaderData memory range, which splits
the big Conventional Memory range into a small one (15MB) and a
big one: the small one is too small to hold the staging area.
We can actually use the LoaderData range safely, because when
amd64_tramp -> efi_copy_finish() starts to run, we're almost at
the very end of the efi loader code and we're going to "return"
to the kernel entry, so we're pretty sure we won't access any loader
data any more.
For people who are interested in the details: please see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746#c22
PS, some people also reported the regression happened to FreeBSD VM
running on Bhyve in EFI mode. This patch should resolve it too,
though I don't have such a setup to test.
r314828
loader.efi: fix an off-by-one bug in efi_verify_staging_size()
Also remove the warning message: it may not be unusual to see
the memory range containing 2MB is not of EfiConventionalMemory.
Sponsored by: Microsoft
r314891
loader.efi: finally fix the off-by-one bug in efi_verify_staging_size()
r314828(loader.efi: fix an off-by-one bug in efi_verify_staging_size())
doesn't really fix the bug and this patch adds the missing part.
It's a shame that I didn't make everything correct at the very beginning...
Sponsored by: Microsoft
r314956
loader.efi: only reduce the size of the staging area on Hyper-V
Doing this on physical hosts turns out to be problematic, e.g. see comment
24 and 28 in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746.
To fix the real underlying issue correctly & thoroughly, IMO we need
a relocatable kernel, but that would require a lot of complicated long
term work: https://reviews.freebsd.org/D9686?id=25414#inline-56969
For now, let's only apply efi_verify_staging_size() to VMs running on
Hyper-V, and restore the old behavior on physical machines since that
has been working for people for a long period of time, though that's
potentially unsafe...
Sponsored by: Microsoft
r314962
loader.efi: only include the machine/ header files on x86
The 2 files may not exist on other archs like aarch64 and hence we
can have a build failure there.
Reported by: lwhsu
Sponsored by: Microsoft
r315235
loader.efi: use stricter check for Hyper-V
Some other hypervisors like Xen can pretend to be Hyper-V but obviously
they can't implement all Hyper-V features. Let's make sure we're genuine
Hyper-V here.
ngie [Thu, 30 Mar 2017 05:49:46 +0000 (05:49 +0000)]
MFstable/11 r316229:
Backport mlx4{en,ib}(4) from ^/head
MFCing other pieces would be very structurally disruptive. This just
brings back the manpages so they can be used by end-users and to ease
future backports.
ngie [Thu, 30 Mar 2017 05:15:00 +0000 (05:15 +0000)]
MFC r314372:
Use "build" instead of "all" when building ports modules
"all" in ports currently means "stage the ports", which requires root today,
and brings to light other potential issues, like ENAMETOOLONG with staged
directories (bug 161481, etc).
This fixes buildkernel for me when run as a non-root user, assuming all
of the prerequisites have been installed beforehand and are up-to-date.
sevan [Thu, 30 Mar 2017 01:00:45 +0000 (01:00 +0000)]
MFC 315964
ftp.microsoft.com is dead and the document was not archived, point to the full
protocol spec document instead.
Fix spelling mistake flagged by igor.
Rephrase bad sentence flagged by igor.
mav [Wed, 29 Mar 2017 16:04:42 +0000 (16:04 +0000)]
MFC r315507: Reorganize RQSTYPE_NOTIFY handling for chips <= 23xx.
There were two copies of the code: one in generic code was half-broken, and
another in platform code was never called. Leave only one in generic code
and working.
mav [Wed, 29 Mar 2017 15:43:07 +0000 (15:43 +0000)]
MFC r315307: Refactor interrupt handling.
Instead of single isp_intr() function doing all possible magic, introduce
four different functions to handle mailbox operation completions, async
events, response and ATIO queues. The goal is to isolate different code
paths to make code more readable, and to make easier support for multiple
interrupt vectors. Even oldest hardware in many cases can identify what
code path it should run on interrupt. Contemporary hardware can assign
them to different interrupt vectors.
royger [Wed, 29 Mar 2017 15:34:52 +0000 (15:34 +0000)]
xen/netfront: release resources on removal
Current netfront code doesn't release the resources (grants and mbufs) on
removal. Add a new helper that releases the resources, so FreeBSD doesn't run
out of grants or memory when performing heavy hotplug/unplug of Xen PV nic
devices.
This is a direct commit to stable/10 because the code in newer branches has
been completely refactored and no longer has this issue.
ngie [Wed, 29 Mar 2017 08:00:11 +0000 (08:00 +0000)]
MFC r315699:
Print out name of non-dynamic sysctl in sysctl_remove_oid_locked
This will provide a slightly better smoking gun than just stating
"can't remove non-dynamic nodes!" when calling sysctl_ctx_free(9)
and sysctl_remove_{name,oid}(9) with a non-dynamic (likely
static) sysctl.
np [Wed, 29 Mar 2017 02:21:05 +0000 (02:21 +0000)]
MFC r315201, r315920, r315921, r315922, r316008, and r316062.
r315201:
cxgbe(4): Fix an always-true assertion (reported by PVS-Studio).
sys/dev/cxgbe/t4_main.c: PVS-Studio: Expression is Always True (CWE-571) (3)
r315920:
cxgbe/iw_cxgbe: c4iw_connect should always returns a -ve errno on failure.
r315921:
cxgbe/iw_cxgbe: alloc_ep expects a gfp_t, and it's always ok to sleep during
alloc_ep.
r315922:
cxgbe/iw_cxgbe: allocations that use GFP_KERNEL (which is M_WAITOK on
FreeBSD) cannot fail.
r316008:
cxgbe/iw_cxgbe: Remove unused code.
r316062:
cxgbe/iw_cxgbe: Defer the handling of error CQEs and RDMA_TERMINATE to
the thread that deals with socket state changes. This eliminates
various bad races with the ithread.
cy [Wed, 29 Mar 2017 01:32:34 +0000 (01:32 +0000)]
MFC r311103 (ian):
Update ntp.conf to use the ntpd pool feature.
Our previous ntp.conf file configured 3 servers from freebsd.pool.ntp.org
using 3 separate 'server' config lines. That is now replaced with a single
'pool' line which causes ntpd to add multiple servers from the pool.
More than just making the config smaller, the pool feature in ntpd has one
major advantage over configuring 3 separate servers from a pool: if a server
that was added using a 'pool' statement provides bad time (initially or at
some later date), ntpd automatically discards it and configures a new
different server from the pool without needing to be restarted.
These changes also add a 'tos' line to control how many pool servers get
added, a 'restrict source' line that is required to allow ntpd to add new
peers from the pool, and it deletes a 'restrict 127.127.1.0' line that does
nothing and should never have been there (127.127.1.0 is not a valid IP
address, it's a refclock identifier).
amdmi3 [Tue, 28 Mar 2017 10:43:20 +0000 (10:43 +0000)]
MFC r315242: Fix late and noauto with geli swap
With the following in /etc/fstab:
/dev/gpt/swap.eli none swap sw,late 0 0
swap will not be enabled, with `swapon -aL' complaining:
swapon: Invalid option: late
This happens because swap_on_geli_args() which parses geli arguments
out of all mount options does not expect late or noauto among them.
Fix this by explicitly allowing these arguments.
It was implemented to reduce context switches when uploading firmware to
card's RAM. But this mechanism is not used last 10 years since all mbox
operations are now polled, and it was never used for cards produced in
last 15 years. Newer cards can use DMA to upload firmware.
mav [Tue, 28 Mar 2017 10:22:55 +0000 (10:22 +0000)]
MFC r315234: Improvements around attach, reset and detach.
This change fixes DMA resource leak on driver unload. Also it removes
DMA resources allocation for hardcoded number of requests before fetching
the real number from firmware. Also it prepares ground for more flexible
IRQs allocation according to firmware capabilities.
mav [Tue, 28 Mar 2017 10:19:52 +0000 (10:19 +0000)]
MFC r313568 (by ken):
Change the isp(4) driver to not adjust the tag type for REQUEST SENSE.
The isp(4) driver was changing the tag type for REQUEST SENSE
commands to Head of Queue, when the CAM CCB flag
CAM_TAG_ACTION_VALID was NOT set. CAM_TAG_ACTION_VALID is set
when the tag action in the XPT_SCSI_IO is not CAM_TAG_ACTION_NONE
and when the target has tagged queueing turned on.
In most cases when CAM_TAG_ACTION_VALID is not set, it is because
the target is not doing tagged queueing. In those cases, trying to
send a Head of Queue tag may cause problems. Instead, default to
sending a simple tag.
IBM tape drives claim to support tagged queueing in their standard
Inquiry data, but have the DQue bit set in the control mode page
(mode page 10). CAM correctly detects that these drives do not
support tagged queueing, and clears the CAM_TAG_ACTION_VALID flag
on CCBs sent down to the drives.
This caused the isp(4) driver to go down the path of setting the
tag action to a default value, and for Request Sense commands only,
set the tag action to Head of Queue.
If an IBM tape drive does get a Head of Queue tag, it rejects it with
Invalid Message Error (0x49,0x00). (The Qlogic firmware translates that
to a Transport Error, which the driver translates to an Unrecoverable
HBA Error, or CAM_UNREC_HBA_ERROR.) So, by default, it wasn't possible
to get a good response from a REQUEST SENSE to an FC-attached IBM
tape drive with the isp(4) driver.
IBM tape drives (tested on an LTO-5 with G9N1 firmware and a TS1150
with 4470 firmware) also have a bug in that sending a command with a
non-simple tag attribute breaks the tape drive's Command Reference
Number (CRN) accounting and causes it to ignore all subsequent
commands because it and the initiator disagree about the next
expected CRN. The drives do reject the initial command with a head
of queue tag with an Invalid Message Error (0x49,0x00), but after that
they ignore any subsequent commands. IBM confirmed that it is a bug,
and sent me test firmware that fixes the bug. However tape drives in
the field will still exhibit the bug until they are upgraded.
Request Sense is not often sent to targets because most errors are
reported automatically through autosense in Fibre Channel and other
modern transports. ("Modern" meaning post SCSI-2.) So this is not
an error that would crop up frequently. But Request Sense is useful on
tape devices to report status information, aside from error reporting.
This problem is less serious without FC-Tape features turned on,
specifically precise delivery of commands (which enables Command
Reference Numbers), enabled on the target and initiator. Without
FC-Tape features turned on, the target would return an error and
things would continue on.
And it also does not cause problems for targets that do tagged
queueing, because in those cases the isp(4) driver just uses the
tag type that is specified in the CCB, assuming the
CAM_TAG_ACTION_VALID flag is set, and defaults to sending a Simple
tag action if it isn't an ordered or head of queue tag.
sys/dev/isp/isp.c:
In isp_start(), don't try to send Request Sense commands
with the Head of Queue tag attribute if the CCB doesn't
have a valid tag action. The tag action likely isn't valid
because the target doesn't support tagged queueing.
ngie [Tue, 28 Mar 2017 06:13:16 +0000 (06:13 +0000)]
MFC r313436,r313437,r313438,r314587,r315687:
r313436:
Clarify #includes for hexdump(3) vs sbuf_hexdump(9)
hexdump(3) only requires libutil.h, whereas sbuf_hexdump(9) requires
sys/types.h (for ssize_t) and sys/sbuf.h
r313437:
Create link from hexdump(3) to sbuf_hexdump(9) as the manpage describes
sbuf_hexdump(9)'s behavior
r313438:
Clean up trailing and leading whitespace for variables to make it
consistent with the rest of the file and style.Makefile(9) a bit
more
r314587:
Correct MLINKS for sbuf_hexdump(9)
sbuf_hexdump(9) should be linked to sbuf(9), not hexdump(3). Another
review will be posted to deduplicate the sbuf_hexdump reference in
in hexdump(3) or at the very least make the information less duplicative.
r315687:
Document sbuf_hexdump(9) in just sbuf(9)
- Remove duplicate references to sbuf_hexdump(9) from hexdump(3).
sbuf_hexdump(9) already pointed back to hexdump(3) for implementation
details.
- Refer to sbuf_hexdump(9) instead of sbuf(9) for completeness
ngie [Tue, 28 Mar 2017 06:05:26 +0000 (06:05 +0000)]
MFC r315686,r315688:
r315686:
kvm_geterr: handle `kd` == NULL in a deterministic/graceful manner
Return a NUL string instead of just working by accident with kvm_geterr(3)
when MALLOC_PRODUCTION is disabled (I didn't confirm the MALLOC_PRODUCTION
being enabled path).
Document the new explicit return behavior for kvm_geterr(3), as well
as the previous implicit behavior, i.e., the buffer attached to
returned via kvm_geterr(3) would be empty if a previous error hadn't been
stored in `kd`.
r315688:
kvm_write: fix -Wcast-qual warning in pointer arithmetic argument
Cast buf to `const char *` when doing arithmetic operation to match
`cp`'s type [1].
ngie [Mon, 27 Mar 2017 18:28:17 +0000 (18:28 +0000)]
MFC r314245:
Fill MK_LIBTHR as far as lib/libthr is concerned
There are other areas of the tree that will need to be evaluated for sanity
if they're supposed to be conditionally compiled out of the build/install,
like libzpool
Relnotes: yes (this might break someone's system if have the knob set)