yuripv [Wed, 12 Dec 2018 04:23:00 +0000 (04:23 +0000)]
regcomp: reduce size of bitmap for multibyte locales
This fixes the obscure endless loop seen with case-insensitive
patterns containing characters in 128-255 range; originally
found running GNU grep test suite.
Our regex implementation being kludgy translates the characters
in case-insensitive pattern to bracket expression containing both
cases for the character and doesn't correctly handle the case when
original character is in bitmap and the other case is not, falling
into the endless loop going through in p_bracket(), ordinary(),
and bothcases().
Reducing the bitmap to 0-127 range for multibyte locales solves this
as none of these characters have other case mapping outside of bitmap.
We are also safe in the case when the original character outside of
bitmap has other case mapping in the bitmap (there are several of those
in our current ctype maps having unidirectional mapping into bitmap).
mckusick [Tue, 11 Dec 2018 22:14:37 +0000 (22:14 +0000)]
Continuing efforts to provide hardening of FFS. This change adds a
check hash to the filesystem inodes. Access attempts to files
associated with an inode with an invalid check hash will fail with
EINVAL (Invalid argument). Access is reestablished after an fsck
is run to find and validate the inodes with invalid check-hashes.
This check avoids a class of filesystem panics related to corrupted
inodes. The hash is done using crc32c.
Note this check-hash is for the inode itself and not any of its
indirect blocks. Check-hash validation may be extended to also
cover indirect block pointers, but that will be a separate (and
more costly) feature.
Check hashes are added only to UFS2 and not to UFS1 as UFS1 is
primarily used in embedded systems with small memories and low-powered
processors which need as light-weight a filesystem as possible.
Reviewed by: kib
Tested by: Peter Holm
Sponsored by: Netflix
kp [Tue, 11 Dec 2018 21:49:13 +0000 (21:49 +0000)]
pf tests: Use the ATF cleanup infrastructure in the ioctl tests
Use ATF_TC_CLEANUP(), because that means the cleanup code will get
called even if a test fails. Before it would only be executed if every
test within the body succeeded.
Reported by: Marie Helene Kvello-Aune <marieheleneka@gmail.com>
MFC after: 2 weeks
kp [Tue, 11 Dec 2018 21:44:39 +0000 (21:44 +0000)]
pf: Prevent integer overflow in PF when calculating the adaptive timeout.
Mainly states of established TCP connections would be affected resulting
in immediate state removal once the number of states is bigger than
adaptive.start. Disabling adaptive timeouts is a workaround to avoid this bug.
Issue found and initial diff by Mathieu Blanc (mathieu.blanc at cea dot fr)
Reported by: Andreas Longwitz <longwitz AT incore.de>
Obtained from: OpenBSD
MFC after: 2 weeks
mav [Tue, 11 Dec 2018 20:47:00 +0000 (20:47 +0000)]
Allow CTL device specification in bhyve virtio-scsi.
There was a large refactoring done in CTL to allow multiple ioctl frontend
ports (and respective devices) to be created, particularly for bhyve.
Unfortunately, respective part of bhyve functionality got lost somehow from
the original virtio-scsi commit. This change allows wanted device path to
be specified in either of two ways:
-s 6,virtio-scsi,/dev/cam/ctl1.1
-s 6,virtio-scsi,dev=/dev/cam/ctl2.3
If neither is specified, the default /dev/cam/ctl device is used.
While there, remove per-queue CTL device opening, which makes no sense at
this point.
Reported by: wg
Reviewed by: araujo
MFC after: 3 days
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D18504
dim [Tue, 11 Dec 2018 19:05:28 +0000 (19:05 +0000)]
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
the upstream release_70 branch r348686 (effectively, 7.0.1 rc3). The
release will follow very soon, but no more functional changes are
expected.
Release notes for llvm, clang and lld 7.0.0 are available here:
<http://releases.llvm.org/7.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/7.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/7.0.0/tools/lld/docs/ReleaseNotes.html>
shurd [Tue, 11 Dec 2018 17:46:01 +0000 (17:46 +0000)]
Fix !tx_abdicate error from r336560
r336560 was supposed to restore pre-r323954 behaviour when tx_abdicate is
not set (the default case). However, it appears that rather than the drainage
check being made conditional on tx_abdicate being set, it was duplicated
so it occured twice if tx_abdicate was set and once if it was not.
Now when !tx_abdicate, drainage is only checked if the doorbell isn't
pending.
dim [Tue, 11 Dec 2018 06:45:53 +0000 (06:45 +0000)]
For arm and armv6, only enable LLVM target support for arm by default,
to shrink libllvm.a.
This is a workaround for "relocation truncated to fit" errors with BFD
ld 2.17.50 on arm and armv6, when linking executables against it.
The required range extensions are not yet supported by this very old
version of BFD ld. When arm and armv6 userland can be successfully
linked by lld, this workaround can be removed.
delphij [Tue, 11 Dec 2018 05:10:22 +0000 (05:10 +0000)]
Remove questionable initialization for ICH8M, rely on BIOS to properly
initialize the controller.
According to the datasheet, the old code checks if port 2 (P2E, 0x4) was
the only enabled port (except port 0, which was ignored by mask 0xfe),
and issue a write to the PCS register to disable all but port 0, right
before ahci_ctlr_reset.
Some other operating systems would issue a port enable to all ports, but
since the current code only does the special initialization for ICH8M,
it entirely and rely on BIOS to do the right thing (the alternative
would be https://reviews.freebsd.org/D18300?id=50922 , should we see
reports that we really need to do it).
kib [Tue, 11 Dec 2018 02:54:36 +0000 (02:54 +0000)]
Free bootstacks after AP startup.
Bootstacks are unused after APs executed sched_throw() in
init_secondary_tail() and started executing on proper idle thread
stack. Add sysinit that detects that the idle thread for each CPU was
scheduled at least once, and free corresponding bootstack.
Slight addition of the code (~200 bytes) is compensated by the saving,
because even on typical small modern desktop CPU we leak 128K of
memory otherwise (4 pages x 8 threads).
kib [Tue, 11 Dec 2018 02:48:49 +0000 (02:48 +0000)]
Remove special case handling for getfhat(fd, NULL, handle).
There is no reason for it to behave differently from openat(fd, NULL).
Also the handling did not worked because the substituted path was from
the system address space, causing EFAULT.
markj [Tue, 11 Dec 2018 02:15:56 +0000 (02:15 +0000)]
Use inline tests for individual PTE bits in the RISC-V pmap.
Inline tests for PTE_* bits are easy to read and don't really require a
predicate function, and predicates which operate on a pt_entry_t are
inconvenient when working with L1 and L2 page table entries.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18461
jhibbits [Tue, 11 Dec 2018 02:03:00 +0000 (02:03 +0000)]
powerpc/booke: Don't get and use the load offset for TOC on APs
The code was a near exact copy of the code in startup, but it doesn't need
the complexity since the kernel is already relocated. With
VM_MIN_KERNEL_ADDRESS as currently set to KERNBASE, this doesn't cause a
problem, because it's a zero offset. However, when KERNBASE is changed to a
physical load address, it then has a non-zero offset, and ends up with an
invalid stack pointer, causing the AP to hang.
cem [Tue, 11 Dec 2018 01:38:50 +0000 (01:38 +0000)]
rc.subr: Implement list_vars without using 'read'
'read' pessimistically read(2)s one byte at a time, which can be quite
silly for large environments in slow emulators.
In my boring user environment, truss shows that the number of read()
syscalls to source rc.subr and invoke list_vars is reduced by something like
3400 to 60. ministat(1) shows a significant time difference of about -71%
for my environment.
jhb [Mon, 10 Dec 2018 19:39:24 +0000 (19:39 +0000)]
Don't report stale signal information for non-signal events in ptrace_lwpinfo.
Once a signal's siginfo was copied to 'td_si' as part of the signal
exchange in issignal(), it was never cleared. This caused future
thread events that are reported as SIGTRAP events without signal
information to report the stale siginfo in 'td_si'. For example, if a
debugger created a new process and used SIGSTOP to stop it after
PT_ATTACH, future system call entry / exit events would set PL_FLAG_SI
with the SIGSTOP siginfo in pl_siginfo. This broke 'catch syscall' in
current versions of gdb as it assumed PL_FLAG_SI with SIGTRAP
indicates a breakpoint or single step trap.
ae [Mon, 10 Dec 2018 16:23:11 +0000 (16:23 +0000)]
Rework how protocol number is tracked in rule. Save it when O_PROTO
opcode will be printed. This should solve the problem, when protocol
name is not printed in `ipfw -N show`.
luporl [Mon, 10 Dec 2018 14:54:28 +0000 (14:54 +0000)]
ppc64: handle exception 0x1500 (soft patch)
This change adds a hypervisor trap handler for exception 0x1500 (soft patch),
normalizing all VSX registers and returning.
This avoids a kernel panic due to unknown exception.
Change made with the collaboration of leonardo.bianconi_eldorado.org.br,
that found out that this is a hypervisor exception and not a supervisor one,
and fixed this in the code.
emaste [Mon, 10 Dec 2018 14:50:11 +0000 (14:50 +0000)]
Clean stale wpa dependencies and objects after r341759
The wpa update added some source files with the same name as a file in
another directory (found via .PATH in the previous version). Having a
stale entry in a .depend file means the new file won't be built, so test
for this case and if found remove all of wpa's dependency files.
MFC with: r341759
Sponsored by: The FreeBSD Foundation
arybchik [Mon, 10 Dec 2018 09:36:05 +0000 (09:36 +0000)]
sfxge(4): use n Tx queues instead of n + 2 on EF10 HW
On EF10 HW we can avoid sending packets without checksum offload
or with IP-only checksum offload to dedicated queues. Instead, we
can use option descriptors to change offload policy on any queue
during runtime. Thus, we don't need to create two dedicated queues.
Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18390
arybchik [Mon, 10 Dec 2018 09:35:53 +0000 (09:35 +0000)]
sfxge(4): prepare the number of Tx queues on event queue 0 to become variable
The number of Tx queues on event queue 0 can depend on the NIC family type,
and this property will be leveraged by future patches.
This patch prepares the code for this change.
Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18389
arybchik [Mon, 10 Dec 2018 09:35:45 +0000 (09:35 +0000)]
sfxge(4): report support for Tx checksum op descriptors
FreeBSD driver needs a patch to provide a means for packets
which do not need checksum offload but have flow ID set
to avoid hitting only the first Tx queue (which has been used
for packets not needing checksum offload).
This should be possible on Huntington, Medford or Medford2 chips
since these support toggling checksum offload on any given queue
dynamically by means of pushing option descriptors.
The patch for FreeBSD driver will then need a means to figure out
whether the feature can be used, and testing adapter family might
not be a good solution.
This patch adds a feature bit specifically to indicate support
for checksum option descriptors. The new feature bits may have
more users in future, apart from the mentioned FreeBSD patch.
Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18388
arybchik [Mon, 10 Dec 2018 09:35:33 +0000 (09:35 +0000)]
sfxge(4): populate per-event queue stats in sysctl
In order to find out why the first event queue and corresponding
interrupt is triggered more frequent, it is useful to know which
events go to each event queue.
alc [Sun, 9 Dec 2018 17:55:10 +0000 (17:55 +0000)]
blst_leaf_alloc updates bighint for a leaf when an allocation is successful
and includes the last block represented by the leaf. The reasoning is that,
if the last block is included, then there must be no solution before that
one in the leaf, so the leaf cannot provide an allocation that big again;
indeed, the leaf cannot provide a solution bigger than range1.
Which is all correct, except that if the value of blk passed in did not
represent the first block of the leaf, because the cursor was pointing to
the middle of the leaf, then a possible solution before the cursor may have
been ignored, and bighint cannot be updated.
Consider the sequence allocate 63 (returning address 0), free 0,63 (freeing
that same block, and allocate 1 (returning 63). The result is that one
block is allocated from the first leaf, and the value of bighint is 0, so
that nothing can be allocated from that leaf until the only block allocated
from that leaf is freed. This change detects that skipped-over solution,
and when there is one it makes sure that the value of bighint is not changed
when the last block is allocated.
bde [Sun, 9 Dec 2018 15:34:20 +0000 (15:34 +0000)]
Fix devstat on md devices.
devstat_end_transaction() was called before the i/o was actually ended
(by delivering it to GEOM), so at least the i/o length was messed up.
It was always recorded as 0, so the average transaction size and the
average transfer rate was always displayed as 0.
devstat_end_transaction() was not called at all for the error case, so
there were sometimes multiple starts per end. I didn't observe this in
practice and don't know if it did much damage. I think it extended the
length of the i/o to the next transaction.
scottl [Sun, 9 Dec 2018 06:52:25 +0000 (06:52 +0000)]
I missed powerpcspe in the previous commit for excluding mps and mpr.
I also learned that 'mips' is overly broad and covers 64bit architectures
too. However, it's not worth the fight right now, so any refinements
will have to come another day.
scottl [Sun, 9 Dec 2018 06:10:11 +0000 (06:10 +0000)]
Copy and clear the reply descriptor atomically. This prevents concurrency
in the interrupt handlers (usually due to timeout/error recovery) from
seeing and processing the same descriptor twice.
scottl [Sun, 9 Dec 2018 06:06:06 +0000 (06:06 +0000)]
Remove the mps driver from powerpc 32bit GENERIC, and don't build it and
mpr as a module for powerpc or mips. An upcoming commit will cause these
drivers to rely on the presence of 64bit atomic operations. Discussed
with jhibbits.
kib [Sat, 8 Dec 2018 22:12:57 +0000 (22:12 +0000)]
Fix PAE boot.
With the introduction of M_EXEC support for kmem_malloc(), some kernel
mappings start having NX bit set in the paging structures early, for
PAE kernels on machines with NX support, i.e. practically on all
machines. In particular, AP trampoline and initialization needs to
access pages which translations has NX bit set, before initializecpu()
is called.
Check for CPUID NX feature and enable EFER.NXE before we enable paging
in mp boot trampoline. This allows the CPU to use the kernel page
table instead of generating page fault due to reserved bit set.
PR: 233819
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
jchandra [Sat, 8 Dec 2018 19:32:23 +0000 (19:32 +0000)]
acpica: support parsing of arm64 affinity in acpi_pxm.c
ACPI SRAT table on arm64 uses GICC entries to provide CPU locality
information. These entries use an AcpiProcessorUid to identify the
CPU (unlike on x86 where the entries have an APIC ID).
Update acpi_pxm.c to extend the cpu_add/cpu_find/cpu_get_info
functions to handle AcpiProcessorUid. Use the updated functions
while parsing ACPI_SRAT_GICC_AFFINITY entry for arm64.
Also update sys/conf/files.arm64 to build acpi_pxm.c when ACPI is
enabled.
jchandra [Sat, 8 Dec 2018 19:10:58 +0000 (19:10 +0000)]
acpica : move SRAT/SLIT parsing to sys/dev/acpica
This moves the architecture independent parts of sys/x86/acpica/srat.c
to sys/dev/acpica/acpi_pxm.c, to be used later on arm64. The function
declarations are moved to sys/dev/acpica/acpivar.h
We also need to update sys/conf/files.{i386,amd64} to use the new file.
No functional changes.
jchandra [Sat, 8 Dec 2018 18:34:05 +0000 (18:34 +0000)]
x86/acpica/srat.c: Add API for parsing proximity tables
The SLIT and SRAT ACPI tables needs to be parsed on arm64 as well, on
systems that use UEFI/ACPI firmware and support NUMA. To do this, we
need to move most of the logic of x86/acpica/srat.c to dev/acpica and
provide an API that architectures can use to parse and configure ACPI
NUMA information.
This commit adds the API in srat.c as a first step, without making any
functional changes. We will move the common code to sys/dev/acpica
as the next step.
The functions added are:
* int acpi_pxm_init(int ncpus, vm_paddr_t maxphys) - to allocate and
initialize data structures used
* void acpi_pxm_parse_tables(void) - parse SRAT/SLIT, save the cpu and
memory proximity information
* void acpi_pxm_set_mem_locality(void) - use the saved data to set
memory locality
* void acpi_pxm_set_cpu_locality(void) - use the saved data to set cpu
locality
* void acpi_pxm_free(void) - free data structures allocated by init
On arm64, we do not have an cpu APIC id that can be used as index to
store CPU data, we need to use the Processor Uid. To help with this,
define internal functions cpu_add, cpu_find, cpu_get_info to store
and get CPU proximity information.
mmel [Sat, 8 Dec 2018 14:58:17 +0000 (14:58 +0000)]
Implement R_AARCH64_TLS_DTPMOD64 and A_AARCH64_TLS_DTPREL64 relocations.
Although these are slightly obsolete in favor of R_AARCH64_TLSDESC,
gcc -mtls-dialect=trad still use them.
Please note that definition of TLS_DTPMOD64 and TLS_DTPREL64 are incorrectly
exchanged in GNU binutils. TLS_DTPREL64 should be encoded to 1028 (as is
defined in ARM ELF ABI) but binutils encode it to 1029. And vice versa,
TLS_DTPMOD64 should be encoded to 1029 but binutils encode it to 1028.
While I'm in, add also R_AARCH64_NONE. It can be produced as result of linker
relaxation.
vmaffione [Sat, 8 Dec 2018 12:52:09 +0000 (12:52 +0000)]
tools: netmap: pkt-gen: check packet length against interface MTU
Validate the value of the -l argument (packet length) against the MTU of the netmap port.
In case the netmap port does not refer to a physical interface (e.g. VALE port or pipe), then
the netmap buffer size is used as MTU.
This change also sets a better default value for the -M option, so that pkt-gen uses
the largest possible fragments in case of multi-slot packets.
mjg [Sat, 8 Dec 2018 10:22:12 +0000 (10:22 +0000)]
Fix a corner case in ID bitmap management.
If all IDs from trypid to pid_max were used as pids, the code would enter
a loop which would be infinite if none of the IDs could become free (e.g.
they all belong to processes which did not transitioned to zombie).
Fixes: r341684 ("Manage process-related IDs with bitmaps")
kib [Fri, 7 Dec 2018 23:05:12 +0000 (23:05 +0000)]
Fix expression evaluation.
Braces were put in the wrong place, causing failing EAGAIN check to
return zero result. Remove the problematic assignment from the
conditional expression at all.
While there, remove used once variable vp, and wrap too long line.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
mav [Fri, 7 Dec 2018 20:30:00 +0000 (20:30 +0000)]
Fix several iov handling bugs in bhyve virtio-scsi backend.
- buf_to_iov() does not use buflen parameter, allowing out of bound read.
- buf_to_iov() leaks memory if seek argument > 0.
- iov_to_buf() doesn't need to reallocate buffer for every segment.
- there is no point to use size_t for iov counts, int is more then enough.
- some iov function arguments can be constified.
- pci_vtscsi_request_handle() used truncate_iov() incorrectly, allowing
getting out of buffer and possibly corrupting data.
- pci_vtscsi_controlq_notify() written returned status at wrong offset.
- pci_vtscsi_controlq_notify() leaked one buffer per event.
Reported by: wg
Reviewed by: araujo
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D18465
mjg [Fri, 7 Dec 2018 12:32:25 +0000 (12:32 +0000)]
proc: when exiting move to zombproc before taking proctree
The kernel was already doing this prior to r329615. It was changed
to reduce contention on allproc. However, introduction of pidhash
locks and removal of proctree -> allproc ordering from fork thanks
to bitmaps fixed things enough to make this change pessimal.
waitpid takes proctree on each call and this change (now) causes
avoidable stalls if allproc is held.