mav [Tue, 3 May 2016 07:49:40 +0000 (07:49 +0000)]
MFC r297963: Remove watchdog timer stop check.
There are bunch of reports that this check fails at least on Nuvoton
NCT6776 chips. I don't see why this check needed there, and Linux does
not have it either. So far this check only made watchdogd unstopable.
When a user defines "jail_list" in rc.conf the jails are started in the
order defined. Currently the jails are not are stopped in reverse order
which may break dependencies between jails/services and prevent a clean
shutdown. The new parameter "jail_reverse_stop" will shutdown jails in
"jail_list" in reverse order when set to "YES".
Please note that this does not affect manual invocation of the jail rc
script. If a user runs the command
# service jail stop jail1 jail2 jail3
the jails will be stopped in exactly the order specified regardless of
jail_reverse_stop being defined in rc.conf.
MFC r295568:
Document the new jail_reverse_stop parameter
While here clean up the documentation for jail_list
Note the existence of module-specific jail paramters, starting with the
linux.* parameters when linux emulation is loaded.
MFC r298585:
Encapsulate SYSV IPC objects in jails. Define per-module parameters
sysvmsg, sysvsem, and sysvshm, with the following bahavior:
inherit: allow full access to the IPC primitives. This is the same as
the current setup with allow.sysvipc is on. Jails and the base system
can see (and moduly) each other's objects, which is generally considered
a bad thing (though may be useful in some circumstances).
disable: all no access, same as the current setup with allow.sysvipc off.
new: A jail may see use the IPC objects that it has created. It also
gets its own IPC key namespace, so different jails may have their own
objects using the same key value. The parent jail (or base system) can
see the jail's IPC objects, but not its keys.
Add a new jail OSD method, PR_METHOD_REMOVE. It's called when a jail is
removed from the user perspective, i.e. when the last pr_uref goes away,
even though the jail mail still exist in the dying state. It will also
be called if either PR_METHOD_CREATE or PR_METHOD_SET fail.
MFC r298683:
Delay removing the last jail reference in prison_proc_free, and instead
put it off into the pr_task. This is similar to prison_free, and in fact
uses the same task even though they do something slightly different.
MFC r298566:
Pass the current/new jail to PR_METHOD_CHECK, which pushes the call
until after the jail is found or created. This requires unlocking the
jail for the call and re-locking it afterward, but that works because
nothing in the jail has been changed yet, and other processes won't
change the important fields as long as allprison_lock remains held.
Keep better track of name vs namelc in kern_jail_set. Name should
always be the hierarchical name (relative to the caller), and namelc
the last component.
Remove the PR_REMOVE flag, which was meant as a temporary marker for
a jail that might be seen mid-removal. It hasn't been doing the right
thing since at least the ability to resurrect dying jails, and such
resurrection also makes it unnecessary.
msdosfs: Prevent buffer overflow when expanding win95 names
In win2unixfn() we expand Windows 95 style long names. In some cases that
requires moving the data in the nbp->nb_buf buffer backwards to make room. That
code failed to check for overflows, leading to a stack overflow in win2unixfn().
We now check for this event, and mark the entire conversion as failed in that
case. This means we present the 8 character, dos style, name instead.
MFC r297967:
Ensure the received IP header gets 32-bits aligned.
The FreeBSD's TCP/IP stack assumes that the IP-header is 32-bits aligned
when decoding it. Else unaligned 32-bit memory access can happen, which
not all processor architectures support.
When downing a mlxen network adapter we need to check the port_up variable
to ensure we don't continue to transmit data or restart timers which can
reside in freed memory.
Don't remove the /var/run/jail_name.id file if a jail fails to start.
This messes up ezjail (and possibly others), when attempting to start
a jail that already exists.
MFC r298521;
regex: prevent two improbable signed integer overflows.
In matcher() we used an integer to index nsub of type size_t.
In print() we used an integer to index nstates of type sopno,
typedef'd long.
In both cases the indexes never take negative values.
MFC 297932,298295:
Improvements for PCI passthru devices.
297932:
Handle PBA that shares a page with MSI-X table for passthrough devices.
If the PBA shares a page with the MSI-X table, map the shared page via
/dev/mem and emulate accesses to the portion of the PBA in the shared
page by accessing the mapped page.
298295:
Always emit an error message on passthru configuration errors.
Previously, many errors (such as the PCI device not being attached
to the ppt(4) driver) resulted in bhyve silently exiting without
starting the virtual machine. Now any errors encountered when
configuring a virtual slot for a PCI passthru device should be noted
on stderr.
MFC 297039,297374,297398,297484:
Poll the IPI status while waiting constantly instead of delaying
5 microseconds between checks. This avoids inserting a minimum
latency of 5 microseconds on each IPI.
297039:
Check IPI status more frequently when waiting.
An IPI cannot be sent via the local APIC if a previous IPI is still
being delivered. Attempts to send an IPI will wait for a pending IPI
to clear. Prior to r278325 these checks used a spin loop with a
hardcoded maximum count which broke AP startup on some systems.
However, r278325 also enforced a minimum latency of 5 microseconds if an
IPI was still pending which resulted in a measurable performance hit.
This change reduces that minimum latency to 1 microsecond.
297374:
Calibrate the frequency of the of the native_lapic_ipi_wait() loop,
and avoid a delay while waiting for IPI delivery acknowledgement in
xAPIC mode. This makes the loop exit immediately after the delivery
bit in APIC_ICR register is set, instead of waiting for some
microseconds.
We only need to ensure that some amount of time is allowed for the
LAPIC to react to the command, and we need that the wait time is
finite and reasonable. For that reasons, it is irrelevant if the CPU
frequency or throttling decrease the speed and make the loop,
calibrated for full CPU speed at boot time, execute somewhat slower.
297398:
Fix several bugs in r297374:
- fix UP build [1]
- do not obliterate initial reading of rdtsc by the loop counter [2]
- restore the meaning of the argument -1 to native_lapic_ipi_wait()
as wait until LAPIC acknowledge without timeout
- correct formula for calculating loop iteration count for 1us, it was
inverted, and ensure that even on unlikely slow CPUs at least one
check for ack is performed.
297484:
Style(9), use tabs for the #define LOOPS line.
Print unsigned values with %u.
Make code slightly more compact by inlining loop limit.
The default value of MINFREE is defined to be 8% in
ufs/ffs/fs.h and not 10%. The newfs(8) and tunefs(8)
man pages had this change already, but fs(5) did not.
This change makes it consistent again.
MFC r297820:
Fix the problem, when gpart(8) can't write both bootcode and partcode
in one command due to wrong file size limit. Do not use bootcode size
to calculate partsize limit.
Also add report message about successful partcode writing.
Add DEBUG_FLAGS to PROG_VARS and STRIP to PROG_OVERRIDE_VARS
This will allow the variables [*] to be overridden on a per-PROG basis,
which is useful when controlling "stripping" behavior for some tests
that require debug symbols or to be unstripped
DEBUG_FLAGS (similar to CFLAGS) supports appending, whereas STRIP is
an override
*: Due to how STRIP is defined in bsd.own.mk (in addition to
bsd.lib.mk and bsd.prog.mk), and the fact that bsd.test.mk pulls in
bsd.own.mk first, overriding STRIP doesn't work today.
A follow up commit is pending to "rectify" this after additional
testing is done.
Discussed with: bdrewery
r298013:
Commit documentation change for r298012
Requested by: bdrewery
r298014:
Regenerate the list of bsd.progs.mk supported variables
Prefix with dashes (unordered list) and put one variable on each
line (to avoid future conflicts)
Done via the following one-liner:
> sh -c 'for i in $(make -C tests/sys/aio PROG=foo -VPROG_VARS:O); do printf "\t\t- $i\n"; done'
MFC r288490: Add debug file extension to kldxref(8)
After r288176 [in head] kernel debug files have the extension .debug.
They also moved [in head] to /usr/lib/debug/boot/kernel by default so
in the normal case kldxref does not encounter them. A src.conf(5)
setting may be used to continue installing them in /boot/kernel
though, so have kldxref skip .debug files in addition to .symbols
files.
Merged this change to avoid warnings when a stable/10 kldxref runs
against a head install, perhaps on an upgrade to 11-CURRENT. The change
to kernel debug files will not be merged to stable/10.
Reserve and ignore the a new module metadata type MDT_PNP_INFO for
associating an optional PNP hint table with this module. In the
future, when these are added, these changes will silently ignore the
new type they would otherwise warn about. It will always be safe to
ignore this data. Get this into the builds today for some future
proofing.
The previous method would completely nerf CFLAGS once bsd.progs.mk had
recursed into the per-PROG logic and make the CFLAGS for tap testcases
to -O0, instead of appending to CFLAGS for all of the tap testcases.
MFC 295930:
Add support for displaying thread IDs to truss(1).
- Consolidate duplicate code for printing the metadata at the start of
each line into a shared function.
- Add an -H option which will log the thread ID of the relevant thread
for each event.
While here, remove some extraneous calls to clock_gettime() in
print_syscall() and print_syscall_ret(). The caller of print_syscall_ret()
always updates the current thread's "after" time before it is called.
MFC 295677,295678:
Fetch the current thread and it's syscall state from the trussinfo object
instead of passing some of that state as arguments to print_syscall() and
print_syscallret(). This just makes the calls of these functions shorter
and easier to read.
MFC r297873
1. Process tx completions in bxe_periodic_callout_func() and restart
transmissions if possible.
2. For SIOCSIFFLAGS call bxe_init_locked() only if !BXE_STATE_DISABLED
3. remove code not needed in bxe_init_internal_common()
Disable the NetBSD-specific EFAULT requirements test in gettimeofday_err
FreeBSD doesn't specifically list this as a supported error, and in some
configurations/versions of FreeBSD, this test will segfault as the memory
address might be evaluated in userspace, instead of in kernel space like
in NetBSD.
kqueue EVFILT_PROC: avoid collision between NOTE_CHILD and NOTE_EXIT
NOTE_CHILD and NOTE_EXIT return something in kevent.data: the parent
pid (ppid) for NOTE_CHILD and the exit status for NOTE_EXIT.
Do not let the two events be combined, since one would overwrite
the other's data.
PR: 180385
Submitted by: David A. Bright <david_a_bright@dell.com>
Sponsored by: Dell Inc.
MFC r297672: Alike to r293708 relax pool check in vdev_geom_open_by_path().
This made impossible spare disk open by known path, which kind of worked
only because the same fix was applied to vdev_geom_attach_by_guids() in
r293708.
The namesz and descsz variables need to be used in native endianness.
The sizes are in native order after swapping in the file to memory case,
and before swapping in the memory to file case.
This change is not identical to r275430 because r273443 was never merged
to stable/10, and libelf moved from lib/ to contrib/elftoolchain/.
MFC r278817: touch: Fix some subtle bugs related to NULL times fallback:
* Do not subvert vfs.timestamp_precision by reading the time and passing
that to utimensat(). Instead, pass UTIME_NOW. A fallback to a NULL times
pointer is no longer used.
* Do not ignore -a/-m if the user has write access but does not own the
file. Leave timestamps unchanged using UTIME_OMIT and do not fall back to
a NULL times pointer (which would set both timestamps) if that fails.