brooks [Thu, 28 Jun 2018 21:23:05 +0000 (21:23 +0000)]
MFC r335641:
Fix a stack overflow in mount_smbfs when hostname is too long.
The local hostname was blindly copied into the to the nn_name array.
When the hostname exceeded 16 bytes, it would overflow. Truncate the
hostname to 15 bytes plus a 0 terminator which is the "workstation name"
suffix.
Use defensive strlcpy() when filling nn_name in all cases.
PR: 228354
Reported by: donald.buchholz@intel.com
Reviewed by: jpaetzel, ian (prior version)
Discussed with: Security Officer (gtetlow)
Security: Stack overflow with the hostname.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15936
avg [Fri, 22 Jun 2018 11:16:17 +0000 (11:16 +0000)]
MFC r333667: followup to r332730/r332752: set kdb_why to "trap" for fatal traps
This change updates arm, arm64 and mips achitectures. Additionally, it
removes redundant checks for kdb_active where it already results in
kdb_reenter() and adds kdb_reenter() calls where they were missing.
Some architectures check the return value of kdb_trap(), but some don't.
I haven't changed any of that.
Some trap handling routines have a return code. I am not sure if I
provided correct ones for returns after kdb_reenter(). kdb_reenter
should never return unless kdb_jmpbufp is NULL for some reason.
avg [Fri, 22 Jun 2018 09:41:13 +0000 (09:41 +0000)]
MFC r333630: Fix 'zpool create -t <tempname>'
Creating a pool with a temporary name fails when we also specify custom
dataset properties: this is because we mistakenly call
zfs_set_prop_nvlist() on the "real" pool name which, as expected,
cannot be found because the SPA is present in the namespace with the
temporary name.
dteske [Thu, 21 Jun 2018 15:02:17 +0000 (15:02 +0000)]
MFC r335308: bsdconfig: Fix a bug when editing users
The usermgmt API was stomping on a global ($user_gid to be specific)
so things would appear to work fine until you tried to make a second
pass into the API with the now-tainted variable contents.
Fixed by localizing menu-specific contents as to not leak outside API.
PR: bin/208774
Reported by: Martin Waschbuesch <martin@waschbuesch.de>
Sponsored by: Smule, Inc.
ae [Thu, 21 Jun 2018 11:24:20 +0000 (11:24 +0000)]
MFC r335133:
In m_megapullup() use m_getjcl() to allocate 9k or 16k mbuf when requested.
It is better to try allocate a big mbuf, than just silently drop a big
packet. A better solution could be reworking of libalias modules to be
able use m_copydata()/m_copyback() instead of requiring the single
contiguous buffer.
dteske [Wed, 20 Jun 2018 05:50:54 +0000 (05:50 +0000)]
dpv(3): MFC r330943, r335264
r330943:
Fix bad error messages from dpv(3)
Before = dpv: <__func__>: posix_spawnp(3): No such file or directory
After = dpv: <path/cmd>: No such file or directory
Most notably, show the 2nd argument being passed to posix_spawnp(3)
so we know what path/cmd failed.
Also, we don't need to have "posix_spawnp(3)" in the error message
nor the function because that can [a] change and [b] traversed using
a debugger if necessary.
r335264:
Fix comparison between pointer and char literal
PR: misc/204252
Reported by: David Binderman <dcb314@hotmail.com>
Sponsored by: Smule, Inc.
dim [Mon, 18 Jun 2018 20:42:53 +0000 (20:42 +0000)]
Follow-up to r335289, which merged r334948 from head, to really fix the
bxe build on i386. In the stable/10 branch, the rman functions still
use u_long instead of uintmax_t (this was changed in r294883 and
r297000), so these have to be printed using the l modifier instead.
dim [Sun, 17 Jun 2018 17:28:27 +0000 (17:28 +0000)]
MFC r334948:
Fix build of bxe with base gcc on i386
Casting from rman_res_t to a pointer results in "cast to pointer from
integer of different size" warnings with base gcc on i386, so print
these without casting. The kva field of struct bxe_bar is of type
vm_offset_t, which can be 32 or 64 bit, so cast it to uintmax_t before
printing.
kp [Sat, 16 Jun 2018 11:42:27 +0000 (11:42 +0000)]
MFC r334876:
pf: Fix deadlock with route-to
If a locally generated packet is routed (with route-to/reply-to/dup-to) out of
a different interface it's passed through the firewall again. This meant we
lost the inp pointer and if we required the pointer (e.g. for user ID matching)
we'd deadlock trying to acquire an inp lock we've already got.
Pass the inp pointer along with pf_route()/pf_route6().
rmacklem [Wed, 6 Jun 2018 22:18:24 +0000 (22:18 +0000)]
MFC: r333580
Fix a slow leak of session structures in the NFSv4.1 server.
For a fairly rare case of a client doing an ExchangeID after a hard reboot,
the old confirmed clientid still exists, but some clients use a new
co_verifier. For this case, the server was not freeing up the sessions on
the old confirmed clientid.
This patch fixes this case. It also adds two LIST_INIT() macros, which are
actually no-ops, since the structure is malloc()d with M_ZERO so the pointer
is already set to NULL.
It should have minimal impact, since the only way I could exercise this
code path was by doing a hard power cycle (pulling the plus) on a machine
running Linux with a NFSv4.1 mount on the server.
Originally spotted during testing of the ESXi 6.5 client.
rmacklem [Wed, 6 Jun 2018 01:30:48 +0000 (01:30 +0000)]
MFC: r334396
Strengthen locking for the NFSv4.1 server DestroySession operation.
If a client did a DestroySession on a session while it was still in use,
the server might try to use the session structure after it is free'd.
I think the client has violated RFC5661 if it does this, but this patch
makes DestroySession block all other nfsd threads so no thread could
be using the session when it is free'd. After the DestroySession, nfsd
threads will not be able to find the session. The patch also adds a check
for nd_sessionid being set, although if that was not the case it would have
been all 0s and unlikely to have a false match.
This might fix the crashes described in PR#228497 for the FreeNAS server.
rmacklem [Mon, 4 Jun 2018 20:55:25 +0000 (20:55 +0000)]
MFC: r334252
Fix the sleep event for layout recall.
The sleep for I/O completion during an NFSv4.1 pNFS layout recall used
the wrong event value and could result in the "[nfscl]" thread hung
for the mount.
This patch fixes the event to be the correct.
This bug will only affect NFSv4.1 pnfs mounts and only when the server
does a layout recall callback, so it won't affect many. Without the patch,
a mount without the "pnfs" option will avoid the problem.
Found during testing of the pNFS server.
rmacklem [Mon, 4 Jun 2018 20:40:22 +0000 (20:40 +0000)]
MFC: r333592
Fix the eir_server_scope reply argument for NFSv4.1 ExchangeID.
In the reply to an ExchangeID operation, the NFSv4.1 server returns a
"scope" value (eir_server_scope). If this value is the same, it indicates
that two servers share state, which is never the case for FreeBSD servers.
As such, the value needs to be unique and it was without this patch.
However, I just found out that it is not supposed to change when the
server reboots and without this patch, it did change.
This patch fixes eir_server_scope so that it does not change when the
server is rebooted.
The only affect not having this patch has is that Linux clients don't
reclaim opens and locks after a server reboot, which meant they lost
any byte range locks held before the server rebooted.
It only affects NFSv4.1 mounts and the FreeBSD NFSv4.1 client was not
affected by this bug.
delphij [Mon, 4 Jun 2018 05:47:15 +0000 (05:47 +0000)]
MFC r333098:
Don't bail out from the check if readboot() returns !FSFATAL.
This can happen when the fsinfo signature is invalid, and the
user have choose to fix it, in which case the code would return
FSBOOTMOD (not FSOK but not FSFATAL either).
All other (fatal) cases would return FSFATAL.
Obtained from: Android Open Source Project
Obtained from: https://android.googlesource.com/platform/external/fsck_msdos/+/d8775a29ea7eac2e5f1504dd21da3725b93b3036
emaste [Wed, 23 May 2018 14:05:56 +0000 (14:05 +0000)]
MFC r326074: filter all passwords (not only changed) from periodic passwd backup
The periodic 200.backup-passwd script outputs any differences it finds
in master.passwd, relative to the previous backup. It intends to elide
the encrypted password field, but previously did so only for changed
lines (i.e., those beginning with - or + in the diff).
Apply the sed expression also to unchanged lines to also elide their
passwords.
PR: 223461
Reported by: Andre Albsmeier
Sponsored by: The FreeBSD Foundation
gjb [Mon, 14 May 2018 17:44:02 +0000 (17:44 +0000)]
MFC r333473:
Add a special GCE_LICENSE variable to Makefile.gce, which when set,
will include license metadata in the resultant GCE image.
GCE_LICENSE is unset by default, as it primarily pertains to images
produced by the FreeBSD Project, but for downstream FreeBSD consumers,
it can be set in the make(1) environment in the format of:
The "license" is not a license, per se, but required metadata that
is required by the GCE marketplace. For the FreeBSD Project, the
license name is simply 'freebsd', with the description of 'FreeBSD'.
davidcs [Wed, 9 May 2018 20:18:23 +0000 (20:18 +0000)]
MFC r333004
Fix Issue with adding MUltiCast Addresses. When multicast addresses are
added/deleted, the delete the multicast addresses previously programmed
in HW and reprogram the new set of multicast addresses.
emaste [Tue, 8 May 2018 17:05:39 +0000 (17:05 +0000)]
MFC r333368: Prepare DB# handler for deferred trigger of watchpoints.
Prepare DB# handler for deferred trigger of watchpoints.
Since pop %ss/mov %ss instructions defer all interrupts and exceptions
for the next instruction, it is possible that the userspace watchpoint
trap executes on the first instruction of the kernel entry for
syscall/bpt.
In this case, DB# should be treated similarly to NMI: on amd64 we must
always load GSBASE even if the trap comes from kernel mode, and load
the kernel page table root into %cr3. Moreover, the trap must
use the dedicated stack, because we are still on the user stack when
trapped on syscall entry.
For i386, we must reload %cr3. The syscall instruction is not configured,
so there is no issue with executing on user stack when trapping.
Due to some CPU erratas it is not always possible to detect that the
userspace watchpoint triggered by inspecting %dr6. In trap(), compare the
trap %rip with the known unsafe entry points and if matched pretend that
the watchpoint did not fire at all.
Thank you to the MSRC Incident Response Team, and in particular Greg
Lenti and Nate Warfield, for coordinating the response to this issue
across multiple vendors.
Thanks to Computer Recycling at The Working Center of Kitchener for
making hardware available to allow us to test the patch on additional
CPU families.
Reviewed by: jhb
Discussed with: Matthew Dillon
Tested by: emaste
Security: CVE-2018-8897
Security: FreeBSD-SA-18:06.debugreg
Sponsored by: The FreeBSD Foundation
gjb [Mon, 7 May 2018 16:22:17 +0000 (16:22 +0000)]
MFC r333262, r333264:
r333262:
Ensure the ports and src trees are available on GCE images,
satisfying a requirement to allow FreeBSD to be considered
a top-tier supported OS in Google Compute Engine.
philip [Mon, 7 May 2018 07:02:26 +0000 (07:02 +0000)]
MFC r333247: Import tzdata 2018e
North Korea switches back to +09 on 2018-05-05.
This version more correctly models time stamps in time zones with
negative DST such as Europe/Dublin (from 1971 on), Europe/Prague
(1946/7), and Africa/Windhoek (1994/2017). This does not affect the
UT offsets, only time zone abbreviations and the tm_isdst flag.
marius [Thu, 3 May 2018 15:41:04 +0000 (15:41 +0000)]
MFC: r327312, r327842, r327865
- Add initial support for Intel Ice Lake and Cannon Lake Ethernet MACs.
- Add workaround for Intel Sky Lake and Kabby Lake Ethernet MAC erratum
1.5.4.5.
- Fix uses of 1 << 31.
avg [Thu, 3 May 2018 07:57:08 +0000 (07:57 +0000)]
MFC r332752: set kdb_why to "trap" when calling kdb_trap from trap_fatal
This will allow to hook a ddb script to "kdb.enter.trap" event.
Previously there was no specific name for this event, so it could only
be handled by either "kdb.enter.unknown" or "kdb.enter.default" hooks.
Both are very unspecific.
Having a specific event is useful because the fatal trap condition is
very similar to panic but it has an additional property that the current
stack frame is the frame where the trap occurred. So, both a register
dump and a stack bottom dump have additional information that can help
analyze the problem.
I have added the event only on architectures that have trap_fatal()
function defined. I haven't looked at other architectures. Their
maintainers can add support for the event later.
Sample script:
kdb.enter.trap=bt; show reg; x/aS $rsp,20; x/agx $rsp,20
Note: changes to powerpc/aim/trap.c and powerpc/booke/trap.c are direct
changes.
avg [Thu, 3 May 2018 07:22:24 +0000 (07:22 +0000)]
MFC r332426: allow ZFS pool to have temporary name for duration of current import
The change adds -t <name> option to zpool create and -t option to zpool
import in its form with an old name and a new name. This allows to
import (or create) a pool under a name that's different from its real,
permanent name without affecting that name. This is useful when working
with VM images or images of other physical systems if they happen to
have a ZFS pool with the same name as the host system.
pfctl: Don't break connections on skipped interfaces on reload
On reload we used to first flush everything, including the list of skipped
interfaces. This can lead to termination of these connections if they send
packets before the new configuration is applied.
Note that this doesn't currently happen on 12 or 11, because of special EACCES
handling introduced in r315514. This special behaviour in tcp_output() may
change, hence the fix in pfctl.
PR: 214613
Submitted by: Andreas Longwitz <longwitz at incore.de>
MFC 332735:
Fix two off-by-one errors when allocating MSI and MSI-X interrupts.
x86 enforces an (arbitray) limit on the number of available MSI and
MSI-X interrupts to simplify code (in particular, interrupt_source[]
is statically sized). This means that an attempt to allocate an MSI
vector needs to fail if it would go beyond the limit, but the checks
for exceeding the limit had an off-by-one error. In the case of MSI-X
which allocates interrupts one at a time this meant that IRQ 768 kept
getting handed out multiple times for msix_alloc() instead of failing
because all MSI IRQs were in use.
MFC 332733:
Workaround fixed I/O port resources encoded as I/O port ranges in _CRS.
ACPI I/O port descriptors use _MIN and _MAX fields to specify the set
of allowable base (start) addresses for an I/O port resource along with
a _LEN field specifying the length. A fixed resource is supposed to be
encoded with _MIN == _MAX, but some buggy firmwares instead set _MAX to
the end of the fixed range. Relocating I/O ranges only make sense in
_PRS (possible resource settings), not in _CRS (current resource settings),
so if an I/O port range with _MAX set set to the end of the range is
present in _CRS, treat it as a fixed I/O port resource starting at
_MIN.
Handle Programmable Early Warning for control commands in sa(4).
When the tape position is inside the Early Warning area, the tape
drive will return a sense key of NO SENSE, and an ASC/ASCQ of
0x00,0x02, which means: End-of-partition/medium detected". If
this was in response to a control command like WRITE FILEMARKS,
we correctly translate this as informational status and return
0 from saerror().
Programmable Early Warning should be handled the same way, but
we weren't handling it that way. As a result, if a PEW status
(sense key of NO SENSE, ASC/ASCQ of 0x00,0x07, "Programmable early
warning detected") came back in response to a WRITE FILEMARKS,
we returned an error.
The impact of this was that if an application was writing to a
sa(4) device, and a PEW area was set (in the Device Configuration
Extension subpage -- mode page 0x10, subpage 1), and a filemark
needed to be written on close, we could wind up returning an error
to the user on close because of a "failure" to write the filemarks.
It actually isn't a failure, but rather just a status report from
the drive, and shouldn't be treated as a failure.
sys/cam/scsi/scsi_sa.c:
For control commands in saerror(), treat asc/ascq 0x00,0x07
the same as 0x00,{0-5} -- not an error. Return 0, since
the command actually did succeed.
Reported by: Dr. Andreas Haakh <andreas@haakh.de>
Tested by: Dr. Andreas Haakh <andreas@haakh.de>
Sponsored by: Spectra Logic
------------------------------------------------------------------------
MFC r329372 and r329464:
Implement enable_irq() and disable_irq() in the LinuxKPI and add checks for
valid IRQ tag before setting up or tearing down an interrupt handler in the
LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
MFC r331355:
Clear old MSIX IRQ numbers in the LinuxKPI.
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.
r332385:
hyperv/storvsc: storvsc_io_done(): do not use CAM_SEL_TIMEOUT
CAM_SEL_TIMEOUT was introduced in
https://reviews.freebsd.org/D7521 (r304251), which claimed:
"VM shall response to CAM layer with CAM_SEL_TIMEOUT to filter those
invalid LUNs. Never use CAM_DEV_NOT_THERE which will block LUN scan
for LUN number higher than 7."
But it turns out this is not correct:
I think what really filters the invalid LUNs in r304251 is that:
before r304251, we could set the CAM_REQ_CMP without checking
vm_srb->srb_status at all:
ccb->ccb_h.status |= CAM_REQ_CMP.
r304251 checks vm_srb->srb_status and sets ccb->ccb_h.status properly,
so the invalid LUNs are filtered.
I changed my code version to r304251 but replaced the CAM_SEL_TIMEOUT
with CAM_DEV_NOT_THERE, and I confirmed the invalid LUNs can also be
filtered, and I successfully hot-added and hot-removed 8 disks to/from
the VM without any issue.
CAM_SEL_TIMEOUT has an unwanted side effect -- see cam_periph_error():
For a selection timeout, we consider all of the LUNs on
the target to be gone. If the status is CAM_DEV_NOT_THERE,
then we only get rid of the device(s) specified by the
path in the original CCB.
This means: for a VM with a valid LUN on 3:0:0:0, when the VM inquires
3:0:0:1 and the host reports 3:0:0:1 doesn't exist and storvsc returns
CAM_SEL_TIMEOUT to the CAM layer, CAM will detech 3:0:0:0 as well: this
is the bug I reported recently:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583
MFC r332674:
Increase the msdosfs partition size on arm SoC images where the
current size may not be sufficiently large for development and/or
testing.
MFC r331616: vfs_donmount: in certain cases try r/o mount if r/w mount fails
If the operation is not an update, if neither r/w nor r/o mode is
explicitly requested, if the error code hints at the possibility of the
media being read-only, and if the fallback is allowed, then we can try
to automatically downgrade to the readonly mode.
This is especially useful for auto-mounting of removable media that
sometimes can happen to be write-protected.
The fallback to r/o is not enabled by default. It can be requested on a
per-mount basis with a new mount option, 'autoro'. Or it can be
globally allowed by setting vfs.default_autoro.
stable/10 note: this branch does not have SYSCTL_BOOL, so SYSCTL_INT is
used instead.
MFC r332452: Update vt(4) "Terminus BSD Console" font to v4.46
"Terminus BSD Console" is a derivative of Terminus that is provided
by Mr. Dimitar Zhekov under the 2-clause BSD license for use by the
FreeBSD vt(4) console and other BSDs.
MFC 331466:
Add a workaround to the hypervisor detection for older versions of KVM.
Originally KVM set %eax to 0 in the cpuid leaf 0x4000000 rather than
to the highest supported leaf in the hypervisor "branch". Detect this
case and fixup the %eax value so that the hypervisor is still
detected.
growfs: Commit the changes after expanding the partition
This fix the problem in arm snapshot present since at least 6 months
where growfs was failing at firstboot and dropped you in a single
user shell.
Note: In addition to this merge, kern.geom.part.mbr.enforce_chs has
been enabled on the build machine to mitigate against the issue in
the PR referenced.
MFC r332421: vt: add three more cp437 mappings for vga textmode
In UTF-8 locales mandoc uses a number of characters outside of the Basic
Latin group, e.g. from general punctuation or miscellaneous mathematical
symbols, and these rendered as ? in text mode.
This change adds (char, replacement, code point, description):
– - U+2013 En Dash
⟨ < U+27E8 Mathematical Left Angle Bracket
⟩ > U+27E9 Mathematical Right Angle Bracket
This change addresses some common cases; there are others that still
need to be added after a more thorough review.
MFC 331324: Ensure thread library is initialized in pthread_testcancel().
Call _thr_check_init() before reading curthread in pthread_testcancel().
If a constructor in a library creates a semaphore via sem_init() and
then waits for it via sem_wait(), the program can core dump in
_pthread_testcancel() called from sem_wait(). This is because the
semaphore implementation lives in libc, so the library's constructors
can be run before libthr's constructors.
tail: fix "tail -r" for piped input that begins with '\n'
A subtle logic bug, probably introduced in r311895, caused tail to print the
first two lines of piped input in forward order, if the very first character
was a newline.
PR: 222671
Reported by: Jim Long <freebsd-bugzilla@umpquanet.com>, pprocacci@gmail.com
Sponsored by: Spectra Logic Corp
Ensure that multiplications for memory allocations cannot overflow, and
that we'll not try to allocate M_WAITOK for potentially overly large
allocations.
pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.
Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't M_WAITOK.
pf: Improve ioctl validation for DIOCIGETIFACES and DIOCXCOMMIT
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.
There's no obvious limit to the request size for these, so we limit the
requests to something which won't overflow. Change the memory allocation
to M_NOWAIT so excessive requests will fail rather than stall forever.
pf: Improve ioctl validation for DIOCRADDTABLES and DIOCRDELTABLES
The DIOCRADDTABLES and DIOCRDELTABLES ioctls can process a number of
tables at a time, and as such try to allocate <number of tables> *
sizeof(struct pfr_table). This multiplication can overflow. Thanks to
mallocarray() this is not exploitable, but an overflow does panic the
system.
Arbitrarily limit this to 65535 tables. pfctl only ever processes one
table at a time, so it presents no issues there.
Exit with usage when extra arguments are on command line
preventing mistakes such as "halt 0p" for "halt -p".
Approved by: bde (mentor, implicit), phk (mentor,implicit)
MFC after: 1 week
ifconf(): correct handling of sockaddrs smaller than struct sockaddr.
Portable programs that use SIOCGIFCONF (e.g. traceroute) assume
that each pseudo ifreq is of length MAX(sizeof(struct ifreq),
sizeof(ifr_name) + ifr_addr.sa_len). For short sockaddrs we copied
too much from the source sockaddr resulting in a heap leak.
I believe only one such sockaddr exists (struct sockaddr_sco which
is 8 bytes) and it is unclear if such sockaddrs end up on interfaces
in practice. If it did, the result would be an 8 byte heap leak on
current architectures.
If a user attempts to add two tables with the same name the duplicate table
will not be added, but we forgot to free the duplicate table, leaking memory.
Ensure we free the duplicate table in the error path.
The previous split of zeroing ifr_name and ifr_addr seperately is safe
on current architectures, but would be unsafe if pointers were larger
than 8 bytes. Combining the zeroing adds no real cost (a few
instructions) and makes the security property easier to verify.
GC never enabled support for SIOCGADDRROM and SIOCGCHIPID.
When de(4) was imported in 1997 the world was not ready for these ioctls.
In over 20 years that hasn't changed so it seems safe to assume their
time will never come.
r331654:
Don't access userspace directly from the kernel in nxge(4).
Update to what the previous code seemed to be doing via the correct
interfaces. Further issues exist in xge_ioctl_registers(), but this is
debugging code in a driver that has few users and they don't appear to
be crashes or leaks.
r331869:
Fix the build on arches with default unsigned char. Capture the fubyte()
return value in an int as well as the char, and test the full int value
for fubyte() failure.