Kyle Evans [Wed, 8 Aug 2018 21:51:19 +0000 (21:51 +0000)]
ls(1): Enable colors with COLORTERM is set in the environment
COLORTERM is the de facto standard, while CLICOLOR is generally specific to
FreeBSD and ls(1).
PR: 230101
Submitted by: D Green <dfrg@xsmail.com> (with manpage additions by myself)
Reviewed by: cem ("LGTM" in PR; pre-manpage changes)
MFC after: 1 week
Kyle Evans [Wed, 8 Aug 2018 21:21:28 +0000 (21:21 +0000)]
apply(1): Fix magic number substitution with magic character ' '
Using a space as the magic character would result in problems if the command
started with a number:
- For a 'valid' number n, n < size of argv, it would erroneously get
replaced with that argument; e.g. `apply -a ' ' -d 1rm x => `execxrm x`
- For an 'invalid' number n, n >= size of argv, it would segfault.
e.g. `apply -a ' ' 2to3 test.py` would try to access argv[2]
This problem occurred because apply(1) would prepend "exec " to the command
string before doing the actual magic number replacements, so it would come
across "exec 2to3 1" and assume that the " 2" is also a magic number to be
replaced.
Re-work this to instead just append "exec " to the command sbuf and
workaround the ugliness. This also simplifies stuff in the process.
Breno Leitao [Wed, 8 Aug 2018 21:19:07 +0000 (21:19 +0000)]
powerpc64/powernv: re-read RTC after polling
If OPAL_RTC_READ is busy and does not return the information on the first run,
as returning OPAL_BUSY_EVENT, the system will crash since ymd and hmsm variable
will contain junk values.
This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.
The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.
This patch fix this logic and guarantee that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS return, and guarantee the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.
Rick Macklem [Wed, 8 Aug 2018 20:30:12 +0000 (20:30 +0000)]
Fix the err() arguments for a nfssvc(8) failure.
argv has been incremented during argument handling, so elements of the
array are no longer valid. Change the err() arguments so only the
first string pointer in argv is used.
Found during code inspection.
Rick Macklem [Wed, 8 Aug 2018 20:21:45 +0000 (20:21 +0000)]
Assorted fixes to handling of LayoutRecall callbacks, mostly error handling.
After a re-read of the appropriate section of RFC5661, I decided that a
few things should be changed related to LayoutRecall callback handling.
Here are the things fixed by this patch.
- For two of the three cases that LayoutRecall is done, I now think
setting the clora_changed argument false is correct.
- All errors other than NFSERR_DELAY returned by LayoutRecall appear
permanent, so don't retry for any of them. (NFSERR_DELAY is retried by
newnfs_request(), so it is not affected by this patch.)
- Instead of waiting "forever" (actually until the process is SIGTERM'd)
for Layouts to be returned during a mirror copy, fail and return
ENXIO after about 1minute.
Waiting for a <ctrl>C made sense when pnfsdscopymr() was done by itself,
but did not make sense when done via find(1).
This patch only affects the pNFS server.
Mark Johnston [Wed, 8 Aug 2018 17:26:51 +0000 (17:26 +0000)]
Simplify compression code.
- Remove the compression suffix macros and move them directly into the
compress_type array.
- Remove the hardcoded sizes on the suffix and compression args arrays.
- Simplify the compression args arrays at the expense of a __DECONST
when calling execv().
- Rewrite do_zipwork. The COMPRESS_* macros can directly index the
compress_types array, so the outer loop is not needed. Convert
fixed-length strings into asprintf or sbuf calls.
Submitted by: Dan Nelson <dnelson_1901@yahoo.com>
Reviewed by: gad
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16518
Alan Cox [Wed, 8 Aug 2018 16:55:01 +0000 (16:55 +0000)]
Add support for pmap_enter(..., psind=1) to the armv6 pmap. In other words,
add support for explicitly requesting that pmap_enter() create a 1 MB page
mapping. (Essentially, this feature allows the machine-independent layer
to create superpage mappings preemptively, and not wait for automatic
promotion to occur.)
Export pmap_ps_enabled() to the machine-independent layer.
Add a flag to pmap_pv_insert_pte1() that specifies whether it should fail
or reclaim a PV entry when one is not available.
Refactor pmap_enter_pte1() into two functions, one by the same name, that
is a general-purpose function for creating pte1 mappings, and another,
pmap_enter_1mpage(), that is used to prefault 1 MB read- and/or execute-
only mappings for execve(2), mmap(2), and shmat(2).
In addition, as an optimization to pmap_enter(..., psind=0), eliminate the
use of pte2_is_managed() from pmap_enter(). Unlike the x86 pmap
implementations, armv6 does not have a managed bit defined within the PTE.
So, pte2_is_managed() is actually a call to PHYS_TO_VM_PAGE(), which is O(n)
in the number of vm_phys_segs[]. All but one call to PHYS_TO_VM_PAGE() in
pmap_enter() can be avoided.
These were found by the Undefined Behaviour GsoC project at NetBSD:
Do not change signedness bit with left shift.
While there avoid signed integer overflow.
Address both issues with using unsigned type.
msdosfs_fat.c:512:42, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:521:44, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:744:14, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:744:24, signed integer overflow: -2147483648 - 1 cannot be
represented in type 'int [20]'
msdosfs_fat.c:840:13, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:840:36, signed integer overflow: -2147483648 - 1 cannot be
represented in type 'int [20]'
Randall Stewart [Wed, 8 Aug 2018 13:36:49 +0000 (13:36 +0000)]
Fix a small bug in rack where it will
end up sending the FIN twice.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D16604
Fedor Uporov [Wed, 8 Aug 2018 12:07:45 +0000 (12:07 +0000)]
Fix directory blocks checksum updating logic.
The checksum updating functions were not called in case of dir index inode splitting
and in case of dir entry removing, when the entry was first in the block.
Fix and move the dir entry adding logic when i_count == 0 to new function.
Roger Pau Monné [Wed, 8 Aug 2018 07:58:29 +0000 (07:58 +0000)]
build: skip the database check for the distributeworld target
distributeworld is used to generate install media, so it makes no
sense to check the host database since the install media can be
generated from any box, regardless of the version of FreeBSD it's
running.
Eitan Adler [Wed, 8 Aug 2018 06:31:46 +0000 (06:31 +0000)]
top(1): hide THR column in separate-thread mode.
It does not make sense to show a "thread count" column when displaying
threads separately. In fact we don't, but do show the header for this
column. Fix this.
Alan Cox [Wed, 8 Aug 2018 02:30:34 +0000 (02:30 +0000)]
Defer and aggregate swap_pager_meta_build frees.
Before swp_pager_meta_build replaces an old swapblk with an new one,
it frees the old one. To allow such freeing of blocks to be
aggregated, have swp_pager_meta_build return the old swap block, and
make the caller responsible for freeing it.
Define a pair of short static functions, swp_pager_init_freerange and
swp_pager_update_freerange, to do the initialization and updating of
blk addresses and counters used in aggregating blocks to be freed.
Fix printf(1) ignores width and precision in %b format.
The precision with behavior is "unspecified" by POSIX (as of 2018), but
most implementations seem to have taken it to be treated the same as for
"s"; applied after the unescaping.
Adopt the same treatment on our printf.
Rick Macklem [Tue, 7 Aug 2018 21:29:14 +0000 (21:29 +0000)]
Allow newnfs_request() to retry all callback RPCs with an NFSERR_DELAY reply.
The code in newnfs_request() retries RPCs that get a reply of NFSERR_DELAY,
but exempts certain NFSv4 operations. However, for callback RPCs, there
should not be any exemptions at this time. The code would have erroneously
exempted the CBRECALL callback, since it has the same operation number as
the CLOSE operation.
This patch fixes this by checking for a callback RPC (indicated by clp != NULL)
and not checking for exempt operations for callbacks.
This would have only affected the NFSv4 server when delegations are enabled
(they are not enabled by default) and the client replies to CBRECALL with
NFSERR_DELAY. This may never actually happen.
Spotted during code inspection.
Kirk McKusick [Tue, 7 Aug 2018 21:17:45 +0000 (21:17 +0000)]
When getting mount information for all filesystems, mount uses the
getfsstat(2) system call using the MNT_NOWAIT flag to indicate that
it wants to use the statfs information cached in the mount structure.
When the -v (verbose) flag is specified, we need to use the MNT_WAIT
flag to getfsstat(2) so that kernel will call VFS_STATFS to get the
current statfs statistics from each filesystem.
Move description of init_shell, init_script, and init_chroot kenv
tunables from loader(8) to init(8), since it's init that actually
uses them. Add .Xrs at their old place.
Mark Johnston [Tue, 7 Aug 2018 16:36:48 +0000 (16:36 +0000)]
Improve handling of control message truncation.
If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space
for all control messages, the kernel sets MSG_CTRUNC in the message
flags to indicate truncation of the control messages. In the case
of SCM_RIGHTS messages, however, we were failing to dispose of the
rights that had already been externalized into the recipient's file
descriptor table. Add a new function and mbuf type to handle this
cleanup task, and use it any time we fail to copy control messages
out to the recipient. To simplify cleanup, control message truncation
is now only performed at control message boundaries.
The change also fixes a few related bugs:
- Rights could be leaked to the recipient process if an error occurred
while copying out a message's contents.
- We failed to set MSG_CTRUNC if the truncation occurred on a control
message boundary, e.g., if the caller received two control messages
and provided only the exact amount of buffer space needed for the
first.
PR: 131876
Reviewed by: ed (previous version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16561
These were found by the Undefined Behavious GsoC project at NetBSD:
Avoid undefined behavior in ftok(3)
Do not change the signedness bit with a left shift operation.
Cast to unsigned integer to prevent this.
ftok.c:56:10, left shift of 123456789 by 24 places cannot be represented
in type 'int'
ftok.c:56:10, left shift of 4160 by 24 places cannot be represented in
type 'int'
Avoid undefined behavior in an inet_addr.c
Do not change the signedness bit with a left shift operation.
Cast to unsigned integer to prevent this.
inet_addr.c:218:20, left shift of 131 by 24 places cannot be represented
in type 'int'
sed(1): partial fix for the case of the regex delimited with '['.
We don't generally support the weird case of regular expresions delimited
by an opening square bracket ('[') but POSIX says that inside
bracket expressions, escaping is not possible and both '[' and '\'
represent themselves.
Colin Percival [Tue, 7 Aug 2018 08:33:40 +0000 (08:33 +0000)]
Replace a pair of 8-bit writes to VGA memory with a single 16-bit write.
The VGA "text mode" buffer has a pair of bytes for each character: One
byte for the character symbol, and an "attribute" byte encoding the
foreground and background colours. When updating the screen, we were
writing these two bytes separately.
On some virtualized systems, every write results in a glyph being redrawn
into a (graphical) virtual screen; writing these two bytes separately
results in twice as much work being done to draw characters, whereas if
we perform a single 16-bit write instead, the character only needs to be
redrawn once.
On an EC2 c5.4xlarge instance, this change cuts 1.30s from the kernel boot,
speeding it up from 8.90s to 7.60s.
Cy Schubert [Tue, 7 Aug 2018 07:12:59 +0000 (07:12 +0000)]
Remove redundant and incorrect default definition of AF_INET6. AF_INET6
is defined in sys/socket.h where it's defined as 28.
A bit of trivia: On NetBSD AF_INET6 is defined as 24. On Solaris it is
defined as 26. This is probably why Darren defaulted to 26, because
ipfilter was originally written for SunOS 4 and Solaris many moons ago.
John Baldwin [Tue, 7 Aug 2018 00:10:58 +0000 (00:10 +0000)]
Remove spurious ABI tags from kdump output.
The abidump routine output an ABI tag when -A was specified for records
that were not displayed due to type or pid filtering. To fix, split
the code to lookup the ABI from the code to display the ABI, move the
code to display the ABI into dumpheader(), and move dumpheader() later
in the main loop as a simplification. Previously dumpheader() was
called under a condition that repeated conditions made later in the
main loop.
John Baldwin [Mon, 6 Aug 2018 23:51:08 +0000 (23:51 +0000)]
Make the system C11 atomics headers fully compatible with external GCC.
The <sys/cdefs.h> and <stdatomic.h> headers already included support for
C11 atomics via intrinsincs in modern versions of GCC, but these versions
tried to "hide" atomic variables inside a wrapper structure. This wrapper
is not compatible with GCC's internal <stdatomic.h> header, so that if
GCC's <stdatomic.h> was used together with <sys/cdefs.h>, use of C11
atomics would fail to compile. Fix this by not hiding atomic variables
in a structure for modern versions of GCC. The headers already avoid
using a wrapper structure on clang.
Note that this wrapper was only used if C11 was not enabled (e.g.
via -std=c99), so this also fixes compile failures if a modern version
of GCC was used with -std=c11 but with FreeBSD's <stdatomic.h> instead
of GCC's <stdatomic.h> and this change fixes that case as well.
Reported by: Mark Millard
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D16585
Navdeep Parhar [Mon, 6 Aug 2018 23:21:13 +0000 (23:21 +0000)]
cxgbe(4): Allow user-configured and driver-configured traffic classes to
be used simultaneously. Move sysctl_tc and sysctl_tc_params to
t4_sched.c while here.
Navdeep Parhar [Mon, 6 Aug 2018 21:54:51 +0000 (21:54 +0000)]
cxgbe(4): Break up sysctl_bitfield into 8 bit and 16 bit variants. Have
them display the current value of the bitfield rather than the fixed
value that was provided when the sysctl node was created.
Kirk McKusick [Mon, 6 Aug 2018 21:09:11 +0000 (21:09 +0000)]
Put in place the framework for consolodating contiguous blocks into
a smaller number of larger TRIM requests. The hope had been to have
the full TRIM consolodation in place for 12.0, but the algorithms
are still under development and need further testing. With this
framework in place it will be possible to easily add TRIM consolodation
once the optimal strategy has been found.
The only functional change with this patch is the elimination of TRIM
requests for blocks that are freed before they have been likely to
have been written.
Reviewed by: kib
Discussed with: Warner Losh and Chuck Silvers
Sponsored by: Netflix
Colin Percival [Mon, 6 Aug 2018 19:21:32 +0000 (19:21 +0000)]
Add EC2PUBLICSNAP option to EC2 builds; this passes a (recently added)
flag to bsdec2-image-upload instructing it to mark the snapshot of its
root disk as public (which is independent from marking the created AMIs
as public).
Address concerns about CPU usage while doing TCP reassembly.
Currently, the per-queue limit is a function of the receive buffer
size and the MSS. In certain cases (such as connections with large
receive buffers), the per-queue segment limit can be quite large.
Because we process segments as a linked list, large queues may not
perform acceptably.
The better long-term solution is to make the queue more efficient.
But, in the short-term, we can provide a way for a system
administrator to set the maximum queue size.
We set the default queue limit to 100. This is an effort to balance
performance with a sane resource limit. Depending on their
environment, goals, etc., an administrator may choose to modify this
limit in either direction.
Reviewed by: jhb
Approved by: so
Security: FreeBSD-SA-18:08.tcp
Security: CVE-2018-6922
Emmanuel Vadot [Mon, 6 Aug 2018 17:21:20 +0000 (17:21 +0000)]
release: arm: Copy the dtb to the fat partition
When booting via EFI on arm we have no way to know the dtb file to load
and we always use the one provided from the bootloader.
This works in most case but :
U-Boot have some really old DTB for some boards, the sync from Linux isn't done automatically for all boards
Some boards (like TI BeagleBone series) use one u-boot for all the model and it doesn't embed the DTBs
Some boards (like IMX6 based ones), don't embed the DTB
We want u-boot to load and patch the DTB with the mac address or the display
node enabled or not.
Mark Johnston [Mon, 6 Aug 2018 16:22:01 +0000 (16:22 +0000)]
dhclient: Don't chroot if we are in capability mode.
The main dhclient process is Capsicumized but also chroots to
restrict filesystem access. With r322369, pidfile(3) maintains a
directory descriptor for the pidfile, which can cause the chroot
to fail in certain cases. To minimize the problem, only chroot
if we fail to enter capability mode, and store dhclient pidfiles
in a subdirectory of /var/run, thus restricting access via
pidfile(3)'s directory descriptor.
PR: 223327
Reviewed by: cem, oshogbo
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16584
Emmanuel Vadot [Mon, 6 Aug 2018 05:36:00 +0000 (05:36 +0000)]
aw_thermal: Add nvmem and H5 support
Now that aw_sid expose nvmem interface, use that to read the calibration
data.
Add support for H5 SoC.
Fix the bindings, we used to have non-upstreamed bindings. Switch to the
one that have been sent upstream. They are not stable yet, so we switch
from custom, wrong, bindings to correct, proposed bindings
Emmanuel Vadot [Mon, 6 Aug 2018 05:35:24 +0000 (05:35 +0000)]
aw_sid: Add nvmem interface
Rework aw_sid so it can work with the nvmem interface.
Each SoC expose a set of fuses (for now rootkey/boardid and, if available,
the thermal calibration data). A fuse can be private or public, reading private
fuse needs to be done via some registers instead of reading directly.
Each fuse is exposed as a sysctl.
For now leave the possibility for a driver to read any fuse without using
the nvmem interface as the awg and emac driver use this to generate a mac
address.
Ian Lepore [Sun, 5 Aug 2018 22:24:38 +0000 (22:24 +0000)]
Document 64-bit arm in terms of arch name (aarch64) not machine (arm64).
Other architectures are documented in terms of the name that is displayed by
'uname -p', aka MACHINE_ARCH and TARGET_ARCH in the build system, now
aarch64 matches the rest of them.
Rick Macklem [Sun, 5 Aug 2018 19:21:50 +0000 (19:21 +0000)]
Copy all bits of a file handle in case there is padding in the structure.
At least on x86, fhandle_t is a packed structure, so I believe an
assignment will copy all the bits. However, for some current/future
architectures, there might be padding in the structure that doesn't get
copied via an assignment.
Since NFS assumes a file handle is an opaque blob of bits that can be
compared via memcmp()/bcmp(), all the bits including any padding must be
copied.
This patch replaces the assignments with a call to a byte copy function.
Spotted during code inspection.
Kristof Provost [Sun, 5 Aug 2018 13:54:37 +0000 (13:54 +0000)]
pf: Increase default hash table size
Now that we (by default) limit the number of states to 100.000 it makse sense
to also adjust the default size of the hash table.
Based on the benchmarking results in
https://github.com/ocochard/netbenches/blob/master/Atom_C2758_8Cores-Chelsio_T540-CR/pf-states_hashsize/results/fbsd12-head.r332390/README.md
128K entries offers a good compromise between performance and memory use.
Users may still overrule this setting with the net.pf.states_hashsize and
net.pf.source_nodes_hashsize loader(8) tunables.
Kristof Provost [Sun, 5 Aug 2018 11:15:28 +0000 (11:15 +0000)]
zfsboot: Fix startup crash
On a FreeNAS mini XL, with geli encrypted drives the loader crashed in
geli_read().
When we iterate over the list of disks and allocate the zfsdsk structures we
don’t zero out the gdev pointer. In one case that resulted in geli_read()
(called on the bogus pointer) dividing by zero.
Use calloc() to ensure the zfsdsk structure is always zeroed, so the pointer is
initialised to NULL. As a side benefit it gets rid of one #ifdef
LOADER_GELI_SUPPORT.
Emmanuel Vadot [Sun, 5 Aug 2018 06:15:35 +0000 (06:15 +0000)]
extres: clkdiv: Fix div_with_table
We didn't allowed a divider register value of 0 which can exists and
also didn't wrote the value but the divider, which result of a wrong
frequency to be selected
Emmanuel Vadot [Sun, 5 Aug 2018 06:10:13 +0000 (06:10 +0000)]
arm: allwinner: Disconnect A10/A20 HDMI driver
It doesn't work since 2 years when we stopped patching DTS.
The DTS now have the correct bindings but they are a lot different
from our hacked ones we used to have (and more representative of the
reality).
Emmanuel Vadot [Sun, 5 Aug 2018 06:08:23 +0000 (06:08 +0000)]
arm: allwinner: Remove old unused clocks
Remove the old clocks for allwinner as now all the SoCs have been converted
to clkng.
The only old clock now is the gmac clock which still lives under the /clocks
dts node.
Conrad Meyer [Sat, 4 Aug 2018 22:08:24 +0000 (22:08 +0000)]
settimeofday(2): Remove stale note about timezone
Contrary to the removed comment, the kernel does appear to use the timezone
argument of settimeofday. The comment dates to the BSD4.4 import; I assume it
is just stale.
Kyle Evans [Sat, 4 Aug 2018 21:41:10 +0000 (21:41 +0000)]
efirt: Don't enter EFI context early, convert addrs to KVA instead
efi_enter here was needed because efi_runtime dereference causes a fault
outside of EFI context, due to runtime table living in runtime service
space. This may cause problems early in boot, though, so instead access it
by converting paddr to KVA for access.
While here, remove the other direct PHYS_TO_DMAP calls and the explicit DMAP
requirement from efidev.
Swapped-out process that is WKILLED must be swapped in as soon as
possible. The reason is that such process can be killed by OOM and
its pages can be only freed if the process exits. To exit, the kernel
stack of the process must be mapped.
When allocating pages for the stack of the WKILLED process on swap in,
use VM_ALLOC_SYSTEM requests to increase the chance of the allocation
to succeed.
Add counter of the swapped out processes to avoid unneeded iteration
over the allprocs list when there is no work to do, reducing the
allproc_lock ownership.
Mark Johnston [Sat, 4 Aug 2018 20:29:58 +0000 (20:29 +0000)]
Fix the regression test for PR 181741.
With r337328, the test hangs becase the sendmsg() call will block until
the receive buffer is at least partially drained. Fix the problem by
using a non-blocking socket and allowing short writes. Also assert
that a SCM_CREDS message was received if one was expected.
PR: 181741
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16516
Mark Johnston [Sat, 4 Aug 2018 20:26:54 +0000 (20:26 +0000)]
Don't check rcv sockbuf limits when sending on a unix stream socket.
sosend_generic() performs an initial comparison of the amount of data
(including control messages) to be transmitted with the send buffer
size. When transmitting on a unix socket, we then compare the amount
of data being sent with the amount of space in the receive buffer size;
if insufficient space is available, sbappendcontrol() returns an error
and the data is lost. This is easily triggered by sending control
messages together with an amount of data roughly equal to the send
buffer size, since the control message size may change in uipc_send()
as file descriptors are internalized.
Fix the problem by removing the space check in sbappendcontrol(),
whose only consumer is the unix sockets code. The stream sockets code
uses the SB_STOP mechanism to ensure that senders will block if the
receive buffer fills up.
PR: 181741
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16515