avg [Thu, 3 May 2018 07:22:24 +0000 (07:22 +0000)]
MFC r332426: allow ZFS pool to have temporary name for duration of current import
The change adds -t <name> option to zpool create and -t option to zpool
import in its form with an old name and a new name. This allows to
import (or create) a pool under a name that's different from its real,
permanent name without affecting that name. This is useful when working
with VM images or images of other physical systems if they happen to
have a ZFS pool with the same name as the host system.
pfctl: Don't break connections on skipped interfaces on reload
On reload we used to first flush everything, including the list of skipped
interfaces. This can lead to termination of these connections if they send
packets before the new configuration is applied.
Note that this doesn't currently happen on 12 or 11, because of special EACCES
handling introduced in r315514. This special behaviour in tcp_output() may
change, hence the fix in pfctl.
PR: 214613
Submitted by: Andreas Longwitz <longwitz at incore.de>
MFC 332735:
Fix two off-by-one errors when allocating MSI and MSI-X interrupts.
x86 enforces an (arbitray) limit on the number of available MSI and
MSI-X interrupts to simplify code (in particular, interrupt_source[]
is statically sized). This means that an attempt to allocate an MSI
vector needs to fail if it would go beyond the limit, but the checks
for exceeding the limit had an off-by-one error. In the case of MSI-X
which allocates interrupts one at a time this meant that IRQ 768 kept
getting handed out multiple times for msix_alloc() instead of failing
because all MSI IRQs were in use.
MFC 332733:
Workaround fixed I/O port resources encoded as I/O port ranges in _CRS.
ACPI I/O port descriptors use _MIN and _MAX fields to specify the set
of allowable base (start) addresses for an I/O port resource along with
a _LEN field specifying the length. A fixed resource is supposed to be
encoded with _MIN == _MAX, but some buggy firmwares instead set _MAX to
the end of the fixed range. Relocating I/O ranges only make sense in
_PRS (possible resource settings), not in _CRS (current resource settings),
so if an I/O port range with _MAX set set to the end of the range is
present in _CRS, treat it as a fixed I/O port resource starting at
_MIN.
Handle Programmable Early Warning for control commands in sa(4).
When the tape position is inside the Early Warning area, the tape
drive will return a sense key of NO SENSE, and an ASC/ASCQ of
0x00,0x02, which means: End-of-partition/medium detected". If
this was in response to a control command like WRITE FILEMARKS,
we correctly translate this as informational status and return
0 from saerror().
Programmable Early Warning should be handled the same way, but
we weren't handling it that way. As a result, if a PEW status
(sense key of NO SENSE, ASC/ASCQ of 0x00,0x07, "Programmable early
warning detected") came back in response to a WRITE FILEMARKS,
we returned an error.
The impact of this was that if an application was writing to a
sa(4) device, and a PEW area was set (in the Device Configuration
Extension subpage -- mode page 0x10, subpage 1), and a filemark
needed to be written on close, we could wind up returning an error
to the user on close because of a "failure" to write the filemarks.
It actually isn't a failure, but rather just a status report from
the drive, and shouldn't be treated as a failure.
sys/cam/scsi/scsi_sa.c:
For control commands in saerror(), treat asc/ascq 0x00,0x07
the same as 0x00,{0-5} -- not an error. Return 0, since
the command actually did succeed.
Reported by: Dr. Andreas Haakh <andreas@haakh.de>
Tested by: Dr. Andreas Haakh <andreas@haakh.de>
Sponsored by: Spectra Logic
------------------------------------------------------------------------
MFC r329372 and r329464:
Implement enable_irq() and disable_irq() in the LinuxKPI and add checks for
valid IRQ tag before setting up or tearing down an interrupt handler in the
LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
MFC r331355:
Clear old MSIX IRQ numbers in the LinuxKPI.
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.
r332385:
hyperv/storvsc: storvsc_io_done(): do not use CAM_SEL_TIMEOUT
CAM_SEL_TIMEOUT was introduced in
https://reviews.freebsd.org/D7521 (r304251), which claimed:
"VM shall response to CAM layer with CAM_SEL_TIMEOUT to filter those
invalid LUNs. Never use CAM_DEV_NOT_THERE which will block LUN scan
for LUN number higher than 7."
But it turns out this is not correct:
I think what really filters the invalid LUNs in r304251 is that:
before r304251, we could set the CAM_REQ_CMP without checking
vm_srb->srb_status at all:
ccb->ccb_h.status |= CAM_REQ_CMP.
r304251 checks vm_srb->srb_status and sets ccb->ccb_h.status properly,
so the invalid LUNs are filtered.
I changed my code version to r304251 but replaced the CAM_SEL_TIMEOUT
with CAM_DEV_NOT_THERE, and I confirmed the invalid LUNs can also be
filtered, and I successfully hot-added and hot-removed 8 disks to/from
the VM without any issue.
CAM_SEL_TIMEOUT has an unwanted side effect -- see cam_periph_error():
For a selection timeout, we consider all of the LUNs on
the target to be gone. If the status is CAM_DEV_NOT_THERE,
then we only get rid of the device(s) specified by the
path in the original CCB.
This means: for a VM with a valid LUN on 3:0:0:0, when the VM inquires
3:0:0:1 and the host reports 3:0:0:1 doesn't exist and storvsc returns
CAM_SEL_TIMEOUT to the CAM layer, CAM will detech 3:0:0:0 as well: this
is the bug I reported recently:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583
MFC r332674:
Increase the msdosfs partition size on arm SoC images where the
current size may not be sufficiently large for development and/or
testing.
MFC r331616: vfs_donmount: in certain cases try r/o mount if r/w mount fails
If the operation is not an update, if neither r/w nor r/o mode is
explicitly requested, if the error code hints at the possibility of the
media being read-only, and if the fallback is allowed, then we can try
to automatically downgrade to the readonly mode.
This is especially useful for auto-mounting of removable media that
sometimes can happen to be write-protected.
The fallback to r/o is not enabled by default. It can be requested on a
per-mount basis with a new mount option, 'autoro'. Or it can be
globally allowed by setting vfs.default_autoro.
stable/10 note: this branch does not have SYSCTL_BOOL, so SYSCTL_INT is
used instead.
MFC r332452: Update vt(4) "Terminus BSD Console" font to v4.46
"Terminus BSD Console" is a derivative of Terminus that is provided
by Mr. Dimitar Zhekov under the 2-clause BSD license for use by the
FreeBSD vt(4) console and other BSDs.
MFC 331466:
Add a workaround to the hypervisor detection for older versions of KVM.
Originally KVM set %eax to 0 in the cpuid leaf 0x4000000 rather than
to the highest supported leaf in the hypervisor "branch". Detect this
case and fixup the %eax value so that the hypervisor is still
detected.
growfs: Commit the changes after expanding the partition
This fix the problem in arm snapshot present since at least 6 months
where growfs was failing at firstboot and dropped you in a single
user shell.
Note: In addition to this merge, kern.geom.part.mbr.enforce_chs has
been enabled on the build machine to mitigate against the issue in
the PR referenced.
MFC r332421: vt: add three more cp437 mappings for vga textmode
In UTF-8 locales mandoc uses a number of characters outside of the Basic
Latin group, e.g. from general punctuation or miscellaneous mathematical
symbols, and these rendered as ? in text mode.
This change adds (char, replacement, code point, description):
– - U+2013 En Dash
⟨ < U+27E8 Mathematical Left Angle Bracket
⟩ > U+27E9 Mathematical Right Angle Bracket
This change addresses some common cases; there are others that still
need to be added after a more thorough review.
MFC 331324: Ensure thread library is initialized in pthread_testcancel().
Call _thr_check_init() before reading curthread in pthread_testcancel().
If a constructor in a library creates a semaphore via sem_init() and
then waits for it via sem_wait(), the program can core dump in
_pthread_testcancel() called from sem_wait(). This is because the
semaphore implementation lives in libc, so the library's constructors
can be run before libthr's constructors.
tail: fix "tail -r" for piped input that begins with '\n'
A subtle logic bug, probably introduced in r311895, caused tail to print the
first two lines of piped input in forward order, if the very first character
was a newline.
PR: 222671
Reported by: Jim Long <freebsd-bugzilla@umpquanet.com>, pprocacci@gmail.com
Sponsored by: Spectra Logic Corp
Ensure that multiplications for memory allocations cannot overflow, and
that we'll not try to allocate M_WAITOK for potentially overly large
allocations.
pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.
Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't M_WAITOK.
pf: Improve ioctl validation for DIOCIGETIFACES and DIOCXCOMMIT
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.
There's no obvious limit to the request size for these, so we limit the
requests to something which won't overflow. Change the memory allocation
to M_NOWAIT so excessive requests will fail rather than stall forever.
pf: Improve ioctl validation for DIOCRADDTABLES and DIOCRDELTABLES
The DIOCRADDTABLES and DIOCRDELTABLES ioctls can process a number of
tables at a time, and as such try to allocate <number of tables> *
sizeof(struct pfr_table). This multiplication can overflow. Thanks to
mallocarray() this is not exploitable, but an overflow does panic the
system.
Arbitrarily limit this to 65535 tables. pfctl only ever processes one
table at a time, so it presents no issues there.
Exit with usage when extra arguments are on command line
preventing mistakes such as "halt 0p" for "halt -p".
Approved by: bde (mentor, implicit), phk (mentor,implicit)
MFC after: 1 week
ifconf(): correct handling of sockaddrs smaller than struct sockaddr.
Portable programs that use SIOCGIFCONF (e.g. traceroute) assume
that each pseudo ifreq is of length MAX(sizeof(struct ifreq),
sizeof(ifr_name) + ifr_addr.sa_len). For short sockaddrs we copied
too much from the source sockaddr resulting in a heap leak.
I believe only one such sockaddr exists (struct sockaddr_sco which
is 8 bytes) and it is unclear if such sockaddrs end up on interfaces
in practice. If it did, the result would be an 8 byte heap leak on
current architectures.
If a user attempts to add two tables with the same name the duplicate table
will not be added, but we forgot to free the duplicate table, leaking memory.
Ensure we free the duplicate table in the error path.
The previous split of zeroing ifr_name and ifr_addr seperately is safe
on current architectures, but would be unsafe if pointers were larger
than 8 bytes. Combining the zeroing adds no real cost (a few
instructions) and makes the security property easier to verify.
GC never enabled support for SIOCGADDRROM and SIOCGCHIPID.
When de(4) was imported in 1997 the world was not ready for these ioctls.
In over 20 years that hasn't changed so it seems safe to assume their
time will never come.
r331654:
Don't access userspace directly from the kernel in nxge(4).
Update to what the previous code seemed to be doing via the correct
interfaces. Further issues exist in xge_ioctl_registers(), but this is
debugging code in a driver that has few users and they don't appear to
be crashes or leaks.
r331869:
Fix the build on arches with default unsigned char. Capture the fubyte()
return value in an int as well as the char, and test the full int value
for fubyte() failure.
The original implementation used a reference to ifr_data and a cast to
do the equivalent of accessing ifr_addr. This was copied multiple
times since 1996.
MFC r332042: Fix kernel memory disclosure in linux_ioctl_socket
strlcpy is used to copy a string into a buffer to be copied to userland,
previously leaving uninitialized data after the terminating NUL. Zero
the buffer first to avoid a kernel memory disclosure.
admbugs: 765, 811
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reported by: Vlad Tsyrklevich
Sponsored by: The FreeBSD Foundation
MFC r331739
1. Add additional debug prints.
2. Break transmit when IFF_DRV_RUNNING is OFF.
3. set desc_count=0 for default case in switch in ql_rcv_isr()
gordon [Wed, 4 Apr 2018 05:26:33 +0000 (05:26 +0000)]
MFC r331981:
Limit glyph count in vtfont_load to avoid integer overflow.
Invalid font data passed to PIO_VFONT can result in an integer overflow
in glyphsize. Characters may then be drawn on the console using glyph
map entries that point beyond the end of allocated glyph memory,
resulting in a kernel memory disclosure.
Submitted by: emaste
Reported by: Dr. Silvio Cesare of InfoSect
Security: CVE-2018-6917
Security: FreeBSD-SA-18:04.vt
Sponsored by: The FreeBSD Foundation
eugen [Tue, 3 Apr 2018 14:09:34 +0000 (14:09 +0000)]
MFC r331630: Fix instructions in the zfsboot manual page.
zfsloader(8) fails to probe a slice containing ZFS pool if its second sector
contains traces of BSD label (DISKMAGIC == 0x82564557).
Fix manual page to show working example erasing such traces.
MFC r331806:
Add logic for "families" for GCE images.
This allows for GCE consumers to easily detect the latest major
version of FreeBSD when using the gcloud command line utility.
To ensure snapshot builds do not conflict with release-style
builds (ALPHA, BETA, RC, RELEASE), the '-snap' suffix is appended
to the GCE image family name.
gjb [Sat, 31 Mar 2018 01:37:14 +0000 (01:37 +0000)]
MFC r331696, r331697:
r331696:
Update the Release Engineering article URL to the modern version.
r331697:
Add an example for building SD card images for the RPI-B and RPI3.
Note, this commit manually fixes a merge conflict caused by r325096,
which does a seemingly recursive http -> https update, which this
commit was never marked for MFC.
emaste [Thu, 29 Mar 2018 22:31:14 +0000 (22:31 +0000)]
MF11 r331330: Fix kernel memory disclosure in svr4_sys_getdents64
svr4_sys_getdents64() copies a dirent structure to userland. When
calculating the record length for any given dirent entry alignment is
performed. However, the aligned bytes are not cleared, this will
trigger an info leak.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Security: Kernel memory disclosure (801)
Sponsored by: The FreeBSD Foundation
gjb [Thu, 29 Mar 2018 19:29:12 +0000 (19:29 +0000)]
MFC r331562 (manu):
release: arm: Copy boot.scr from ports
Latest u-boot update need u-boot script to load and start ubldr.
(See D14230 for more details)
Copy this file for our arm release on the fat partition.
emaste [Wed, 28 Mar 2018 13:44:02 +0000 (13:44 +0000)]
MFC r331329: Fix kernel memory disclosure in ibcs2_getdents
ibcs2_getdents() copies a dirent structure to userland. The ibcs2
dirent structure contains a 2 byte pad element. This element is never
initialized, but copied to userland none-the-less.
Note that ibcs2 has not built on HEAD since r302095.
Submitted by: Domagoj Stolfa <ds815@cam.ac.uk>
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Security: Kernel memory disclosure (803)
Sponsored by: The FreeBSD Foundation
brooks [Tue, 27 Mar 2018 17:48:39 +0000 (17:48 +0000)]
MFC r330876, r330945
r330876:
Fix ISP_FC_LIP and ISP_RESCAN on big-endian 64-bit systems.
For _IO() ioctls, addr is a pointer to uap->data which is a caddr_t.
When the caddr_t stores an int, dereferencing addr as an (int *) results
in truncation on little-endian 64-bit systems and corruption (owing to
extracting top bits) on big-endian 64-bit systems. In practice the
value of chan was probably always zero on systems of the latter type as
all such FreeBSD platforms use a register-based calling convention.
brooks [Tue, 27 Mar 2018 17:43:03 +0000 (17:43 +0000)]
MFC r330820:
Reject ioctls to SCSI enclosures from 32-bit compat processes.
The ioctl objects contain pointers and require translation and some
refactoring of the infrastructure to work. For now prevent opertion
on garbage values. This is very slightly overbroad in that ENCIOC_INIT
is safe.
brooks [Tue, 27 Mar 2018 17:42:04 +0000 (17:42 +0000)]
MFC r330819, r330885, r330934
r330819:
Reject CAMIOGET and CAMIOQUEUE ioctl's on pass(4) in 32-bit compat mode.
These take a union ccb argument which is full of kernel pointers.
Substantial translation efforts would be required to make this work.
By rejecting the request we avoid processing or returning entierly
wrong data.
sevan [Sun, 25 Mar 2018 01:33:51 +0000 (01:33 +0000)]
MFC r322665
Add caveat to kinfo_getvmmap(3) explaining high CPU utilisation.
Based on kib's reply on https://lists.freebsd.org/pipermail/freebsd-hackers/2016-July/049710.html
sevan [Sun, 25 Mar 2018 01:24:02 +0000 (01:24 +0000)]
MFC 321881
For the udp-client example, instruct user to add an entry for a udp based
service.
For tcp-client & udp-client, use the same port in configuration snippet as used
in the comment prior to remove any ambiguity on the port number which needs to
be specified.
sevan [Sun, 25 Mar 2018 01:17:40 +0000 (01:17 +0000)]
MFC r316721 r331274
Most wireless drivers don't support altq(4).
Extend the description of ALTQ to call it a system which is a framework in
altq(4) to match altq(9). This makes preserving the history section as the
author of ALTQ easier in the history section, rather than calling it a framework
in the description & a system in the history.
Add a history section to altq(4) and extend the history section in altq(9)
emaste [Fri, 23 Mar 2018 02:38:31 +0000 (02:38 +0000)]
MFC r331333: Fix kernel memory disclosure in drm_infobufs
drm_infobufs() has a structure on the stack, fills it out and copies it
to userland. There are 2 elements in the struct that are not filled out
and left uninitialized. This will leak uninitialized kernel stack data
to userland.
emaste [Fri, 23 Mar 2018 02:34:45 +0000 (02:34 +0000)]
MFC r331339: Correct signedness bug in drm_modeset_ctl
drm_modeset_ctl() takes a signed in from userland, does a boundscheck,
and then uses it to index into a structure and write to it. The
boundscheck only checks upper bound, and never checks for nagative
values. If the int coming from userland is negative [after conversion]
it will bypass the boundscheck, perform a negative index into an array
and write to it, causing memory corruption.
Note that this is in the "old" drm driver; this issue does not exist
in drm2.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
kp [Wed, 21 Mar 2018 09:57:29 +0000 (09:57 +0000)]
MFC 330105:
pf: Do not flush on reload
pfctl only takes the last '-F' argument into account, so this never did what
was intended.
Moreover, there is no reason to flush rules before reloading, because pf keeps
track of the rule which created a given state. That means that existing
connections will keep being processed according to the rule which originally
created them. Simply reloading the (new) rules suffices. The new rules will
apply to new connections.
PR: 127814
Submitted by: Andreas Longwitz <longwitz at incore.de>
kp [Wed, 21 Mar 2018 09:55:49 +0000 (09:55 +0000)]
MFC r330108:
pf: Apply $pf_flags when verifying the pf.conf file
When checking the validity of the pf.conf file also include the user supplied
pf_flags. These flags might overrule macros or specify anchors, which we will
apply when actually applying the pf.conf file, so we must also take them into
account when verifying the validity.
Submitted by: Andreas Longwitz <longwitz at incore.de>
ian [Tue, 20 Mar 2018 22:57:14 +0000 (22:57 +0000)]
MFC r330745:
Make root mount timeout logic work for filesystems other than ufs.
The vfs.mountroot.timeout tunable and .timeout directive in a mount.conf(5)
file allow specifying a wait timeout for the device(s) hosting the root
filesystem to become usable. The current mechanism for waiting for devices
and detecting their availability can't be used for zfs-hosted filesystems.
See the comment #20 in the PR for some expanded detail on these points.
This change adds retry logic to the actual root filesystem mount. That is,
insted of relying on device availability using device name lookups, it uses
the kernel_mount() call itself to detect whether the filesystem can be
mounted, and loops until it succeeds or the configured timeout is exceeded.
These changes are based on the patch attached to the PR, but it's rewritten
enough that all mistakes belong to me.
dab [Mon, 19 Mar 2018 17:38:35 +0000 (17:38 +0000)]
MFC r331015:
Modify rc.d/fsck to handle new status from fsck/fsck_ffs
r328013 introduced a new error code from fsck_ffs that indicates that
it could not completely fix the file system; this happens when it
prints the message PLEASE RERUN FSCK. However, this status can happen
when fsck is run in "preen" mode and the rc.d/fsck script does not
handle that error code. Modify rc.d/fsck so that if "fsck -p"
("preen") returns the new status code (16) it will run "fsck -y", as
it currently does for a status code of 8 (the "standard error exit").
marius [Mon, 19 Mar 2018 14:28:58 +0000 (14:28 +0000)]
MFC: r328834
o Let rtld(1) set up psABI user trap handlers prior to executing the
objects' init functions instead of doing the setup via a constructor
in libc as the init functions may already depend on these handlers
to be in place. This gets us rid of:
- the undefined order in which libc constructors as __guard_setup()
and jemalloc_constructor() are executed WRT __sparc_utrap_setup(),
- the requirement to link libc last so __sparc_utrap_setup() gets
called prior to constructors in other libraries (see r122883).
For static binaries, crt1.o still sets up the user trap handlers.
o Move misplaced prototypes for MD functions in to the MD prototype
section of rtld.h.
o Sprinkle nitems().
ae [Mon, 19 Mar 2018 09:54:16 +0000 (09:54 +0000)]
MFC r330792:
Do not try to reassemble IPv6 fragments in "reass" rule.
ip_reass() expects IPv4 packet and will just corrupt any IPv6 packets
that it gets. Until proper IPv6 fragments handling function will be
implemented, pass IPv6 packets to next rule.
kp [Sun, 18 Mar 2018 11:26:07 +0000 (11:26 +0000)]
MFC r329950:
pf: Cope with overly large net.pf.states_hashsize
If the user configures a states_hashsize or source_nodes_hashsize value we may
not have enough memory to allocate this. This used to lock up pf, because these
allocations used M_WAITOK.
Cope with this by attempting the allocation with M_NOWAIT and falling back to
the default sizes (with M_WAITOK) if these fail.
PR: 209475
Submitted by: Fehmi Noyan Isi <fnoyanisi AT yahoo.com>
marius [Thu, 15 Mar 2018 23:02:52 +0000 (23:02 +0000)]
MFC: r327929
Use the correct revision specifier (EXT_CSD revision rather than
system specification version) for deciding whether the EXT_CSD
register includes the EXT_CSD_GEN_CMD6_TIME field.
marius [Thu, 15 Mar 2018 23:01:04 +0000 (23:01 +0000)]
MFC: r327355, r327926
- Don't allow userland to switch partitions; it's next to impossible
to recover from that, especially when something goes wrong.
- When userland changes EXT_CSD, update the kernel copy before using
relevant EXT_CSD bits in mmcsd_switch_part().