trasz [Mon, 3 Aug 2015 08:04:31 +0000 (08:04 +0000)]
MFC r282086:
Make setproctitle(3) work in Capsicum capability mode. This makes
ctld(8) child processes to indicate initiator address and name in
their titles, similar to what iscsid(8) child processes do.
trasz [Sun, 2 Aug 2015 10:08:57 +0000 (10:08 +0000)]
MFC r284582:
Fix off-by-one error in fstyp(8) and geom_label(4) that made them use
a single space (" ") as a CD9660 label name when no label was present.
Similar problem was also present in msdosfs label recognition.
rmacklem [Sat, 1 Aug 2015 22:56:42 +0000 (22:56 +0000)]
MFC: r286046
This patch fixes a problem where, if the NFSv4 server has a previous
unconfirmed clientid structure for the same client on the last hash list,
this old entry would not be removed/deleted. I do not think this bug would have
caused serious problems, since the new entry would have been before the old one
on the list. This old entry would have eventually been scavenged/removed.
kib [Sat, 1 Aug 2015 03:37:00 +0000 (03:37 +0000)]
MFC r285878:
Revert r173708's modifications to vm_object_page_remove().
This fixes inconsistencies encountered by vm_object_unwire() or
by the buffer cache when the file is truncated.
MFC: r285113
If a "principal" argument isn't provided for a Kerberized NFS mount,
the kernel would generate a bogus one with a ":/<path>" suffix.
This would only occur for the case where there was no explicit
"principal" argument and the getaddrinfo() call in mount_nfs.c failed to a
return a cannonical name for the server.
This patch fixes this unusual case.
Since the IETF has redefined the meaning of the tos field to accommodate
a set of differentiated services, set IPTOS_PREC_* macros using
IPTOS_DSCP_* macro definitions.
While here, add IPTOS_DSCP_VA macro according to RFC 5865.
When we allocate the struct pf_fragment in pf_fillup_fragment() we
forgot to initialise the fr_flags field. As a result we sometimes
mistakenly thought the fragment to not be a buffered fragment.
This resulted in panics because we'd end up freeing the pf_fragment
but not removing it from V_pf_fragqueue (believing it to be part of
V_pf_cachequeue). The next time we iterated V_pf_fragqueue we'd use
a freed object and panic.
While here also fix a pf_fragment use after free in pf_normalize_ip().
pf_reassemble() frees the pf_fragment, so we can't use it any more.
X-MFS-To: releng/10.2
Sponsored by: The FreeBSD Foundation
Run a shell in the jail when no command is specified.
Add a new flag, -l, for a clean environment, same as jail(8) exec.clean.
Change the GET_USER_INFO macro into a function.
marius [Thu, 30 Jul 2015 02:23:09 +0000 (02:23 +0000)]
MFC: r285843
- Since r253161, uart_intr() abuses FILTER_SCHEDULE_THREAD for signaling
uart_bus_attach() during its test that 20 iterations weren't sufficient
for clearing all pending interrupts, assuming this means that hardware
is broken and doesn't deassert interrupts. However, under pressure, 20
iterations also can be insufficient for clearing all pending interrupts,
leading to a panic as intr_event_handle() tries to schedule an interrupt
handler not registered. Solve this by introducing a flag that is set in
test mode and otherwise restores pre-r253161 behavior of uart_intr(). The
approach of additionally registering uart_intr() as handler as suggested
in PR 194979 is not taken as that in turn would abuse special pccard and
pccbb handling code of intr_event_handle(). [1]
- Const'ify uart_driver_name.
- Fix some minor style bugs.
marius [Thu, 30 Jul 2015 02:06:29 +0000 (02:06 +0000)]
MFC: r285839
o Revert the other functional half of r239864, i. e. the merge of r134227
from x86 to use smp_ipi_mtx spin lock not only for smp_rendezvous_cpus()
but also for the MD cache invalidation, TLB demapping and remote register
reading IPIs due to the following reasons:
- The cross-IPI SMP deadlock x86 otherwise is subject to can't happen on
sparc64. That's because on sparc64, spin locks don't disable interrupts
completely but only raise the processor interrupt level to PIL_TICK. This
means that IPIs still get delivered and direct dispatch IPIs such as the
cache invalidation etc. IPIs in question are still executed.
- In smp_rendezvous_cpus(), smp_ipi_mtx is held not only while sending an
IPI_RENDEZVOUS, but until all CPUs have processed smp_rendezvous_action().
Consequently, smp_ipi_mtx may be locked for an extended amount of time as
queued IPIs (as opposed to the direct ones) such as IPI_RENDEZVOUS are
scheduled via a soft interrupt. Moreover, given that this soft interrupt
is only delivered at PIL_RENDEZVOUS, processing of smp_rendezvous_action()
on a target may be interrupted by f. e. a tick interrupt at PIL_TICK, in
turn leading to the target in question trying to send an IPI by itself
while IPI_RENDEZVOUS isn't fully handled, yet, and, thus, resulting in a
deadlock.
o As mentioned in the commit message of r245850, on least some sun4u platforms
concurrent sending of IPIs by different CPUs is fatal. Therefore, hold the
reintroduced MD ipi_mtx also while delivering cross-traps via MI helpers,
i. e. ipi_{all_but_self,cpu,selected}().
o Akin to x86, let the last CPU to process cpu_mp_bootstrap() set smp_started
instead of the BSP in cpu_mp_unleash(). This ensures that all APs actually
are started, when smp_started is no longer 0.
o In all MD and MI IPI helpers, check for smp_started == 1 rather than for
smp_cpus > 1 or nothing at all. This avoids races during boot causing IPIs
trying to be delivered to APs that in fact aren't up and running, yet.
While at it, move setting of the cpu_ipi_{selected,single}() pointers to
the appropriate delivery functions from mp_init() to cpu_mp_start() where
it's better suited and allows to get rid of the global isjbus variable.
o Given that now concurrent IPI delivery no longer is possible, also nuke
the delays before completely disabling interrupts again in the CPU-specific
cross-trap delivery functions, previously giving other CPUs a window for
sending IPIs on their part. Actually, we now should be able to entirely get
rid of completely disabling interrupts in these functions. Such a change
needs more testing, though.
o In {s,}tick_get_timecount_mp(), make the {s,}tick variable static. While not
necessary for correctness, this avoids page faults when accessing the stack
of a foreign CPU as {s,}tick now is locked into the TLBs as part of static
kernel data. Hence, {s,}tick_get_timecount_mp() always execute as fast as
possible, avoiding jitter.
marius [Thu, 30 Jul 2015 00:28:27 +0000 (00:28 +0000)]
MFC: r284447, r284552
Merge from NetBSD:
o rev. 1.10: Nuke trailing whitespace.
o rev. 1.15: Fix typo in comment.
o rev. 1.16: Add the following registers from IEEE 802.3-2009 Clause 22:
- PSE control register (0x0b)
- PSE status register (0x0c)
- MMD access control register (0x0d)
- MMD access address data register (0x0e)
o rev. 1.17 (comments only): The bit location of link ability is different
between 1000Base-X and others (see Annex 28B.2 and 28D).
o rev. 1.18: Nuke dupe word.
dim [Wed, 29 Jul 2015 19:25:28 +0000 (19:25 +0000)]
Reapply r286007, modified to compile with pre-C++11 compilers:
Pull in r219009 from upstream llvm trunk (by Adam Nemet):
[ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends. During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.
However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node. In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE. Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.
Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat. Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it create a non-canonical node). We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed. On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.
I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user. Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).
So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE. This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly. (Previous patches used HandleSDNode to detect the
CSE but that's not practical here). The listener is only installed on X86.
I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc. The only thing we pay for is the
creation of the listener. The callback never ever triggers in spec2k since
this is a corner case.
dim [Wed, 29 Jul 2015 12:59:16 +0000 (12:59 +0000)]
Pull in r219009 from upstream llvm trunk (by Adam Nemet):
[ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends. During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.
However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node. In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE. Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.
Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat. Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it create a non-canonical node). We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed. On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.
I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user. Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).
So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE. This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly. (Previous patches used HandleSDNode to detect the
CSE but that's not practical here). The listener is only installed on X86.
I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc. The only thing we pay for is the
creation of the listener. The callback never ever triggers in spec2k since
this is a corner case.
The UEFI loader on the 10.1 release install disk (disc1) modifies an
existing EFI_DEVICE_PATH_PROTOCOL instance in an apparent attempt to
truncate the device path. In doing so it creates an invalid device
path.
Perform the equivalent action without modification of structures
allocated by firmware.
PR: 197641
Submitted by: Chris Ruffin <chris.ruffin at intel.com>
Merge r283106:
During module unload unlock rules before destroying UMA zones, which
may sleep in uma_drain(). It is safe to unlock here, since we are already
dehooked from pfil(9) and all pf threads had quit.
dim [Tue, 28 Jul 2015 09:19:04 +0000 (09:19 +0000)]
MFC r285340:
Fix swapped copyin(9) arguments in cxgb's iwch_arm_cq() function.
Detected by clang 3.7.0 with the warning:
sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_provider.c:309:18: error: variable
'rptr' is uninitialized when used here [-Werror,-Wuninitialized]
chp->cq.rptr = rptr;
^~~~
Merge r271458:
- Provide a sleepable lock to protect against ioctl() vs ioctl() races.
- Use the new lock to protect against simultaneous DIOCSTART and/or
DIOCSTOP ioctls.
MFC r285735:
lseek() allows an offset to be set beyond the end of file. Using
it to check that partition has enough space to write bootcode doesn't
work. Use the known size of provider instead.
retval is used to test the return of XML_Parse function which is ok if 1 is
returned and retval it directly returned to the main function and used as an
exit value.
if all the parsing part is done reset retval to 0 so that the command return 0
if everything ok
nvd: set d_delmaxsize to full capacity of NVMe namespace
The NVMe specification has no ability to specify a maximum delete size
that is less than the full capacity of the namespace - so just using the
namespace size is the correct value here.
This fixes reported issues where ZFS trim on init looked like it was
hanging the system - previously the default I/O max size (128KB on
Intel NVMe controllers) was used for delete operations which worked out
to only about 8MB/s. With this patch I can add an 800GB DC P3700
drive to a ZFS pool in about 15-20 seconds.
MFC: r285066
Alex Burlyga reported a POLA violation for the new NFS client as
compared to the old NFS client via email to the freebsd-fs@ mailing list.
For the new client, when multiple clients attempted to create a symbolic
link concurrently, more that one client would report success instead of
EEXIST. This was caused by code in the new client that mapped EEXIST to
OK assuming it was caused by a retried RPC request.
Since the old client did not do this, the patch defaults to the old
behaviour and permits the new behaviour to be enabled via a sysctl.
Partially revert r284034. In particular, revert the final change in this
MFC (281874). It broke suspend and resume on several Thinkpads (though not
all) in 10 even though it works fine on the same laptops in HEAD.
PR: 201239
Reported by: Kevin Oberman and several others
- Implement PF_IMMUTABLE flag and apply it to "name" and "jid" in
jail.conf parameters. This flag disallows redefinition of the parameter.
"name" and/or "jid" are automatically defined in jail.conf by using
the jail names at the front of jail parameter definitions. However,
one could override them by using a variable with the same name like
$name = "foo". This confused the parser and could end up with SIGSEGV.
Note that this change also affects a case when all of parameters are
defined in the command line arguments, not in jail.conf. Specifically,
"jail -c name=j1 name=j2" no longer works. This should be harmless.
- Add SOCK_SEQPACKET support in UNIX-domain socket.
- Display zoneid using % notation in an IPv6 address.
- Use nitems().
- Use sstos{in,in6,un} macros to simplify casts.
- style(9).
- Remove ND6_IFF_IGNORELOOP. This functionality was useless in practice
because a link where looped back NS messages are permanently observed
does not work with either NDP or ARP for IPv4.
Fix group membership of cloned interfaces when one is moved by
if_vmove().
In if_vmove(), if_detach_internal() and if_attach_internal() were
called in series to detach and reattach the interface. When
detaching, if_delgroup() was called and the interface leaves all of
the group membership. And then upon attachment, if_addgroup(ifp,
IFG_ALL) was called and it joined only "all" group again.
This had a problem. Normally, a cloned interface automatically joins
a group whose name is ifc_name of the cloner in addition to "all"
upon creation. However, if_vmove() removed the membership and did
not restore upon attachment.
Remove examples of gif_interfaces and gifconfig. These have already been
marked as deprecated in rc.conf(5) manual page but these examples
were still here.
* Add -x waittime and -X timeout options for feature parity. These are
equivalent to -W and -t options of ping(8). Different letters are used
because both have already been used for another purposes in ping6(8).
* Fix a problem that reply packets are not received when -i T option is set
and (T < RTT).
- Use select(2) for timeout instead of interval timer. Remove poll(2) support.
- Use sigaction(2) instead of signal(3).
- Exit in SIGINT handler when two signals are received and doing reverse DNS
lookup as ping(8) does.
- Remove redundant variables used for getaddrinfo(3).
r285722 (brd):
Add support for building VirtualBox Vagrant images.
Abstract the build, package and upload to handle building
either type.
r285733
Fix an out-of-order execution issue regarding pkg(8):
- pkg(8) cannot be removed before subsequent reinvocations
- The PKG_CACHEDIR cannot be cleaned after the repo*.sqlite
has been removed
- pkg(8) cannot be removed as a precursor to any of the other
steps involved here
Approved by: re (kib)
Sponsored by: The FreeBSD Foundation
hiren [Wed, 22 Jul 2015 15:05:45 +0000 (15:05 +0000)]
MFC r284941:
Avoid a situation where we do not set persist timer after a zero window
condition.
If you send a 0-length packet, but there is data is the socket buffer, and
neither the rexmt or persist timer is already set, then activate the persist
timer.
MFC: r285679
Add auto-detecting workaround for Lenovo GPT boot issue
Add auto-detecting workaround for "GPT Active" boot issue
Allow user to select partitioning scheme in the ufs wizard
PR: 184910
PR: 194359
Approved by: re (gjb), marcel
Relnotes: yes
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D3144
hiren [Tue, 21 Jul 2015 19:41:39 +0000 (19:41 +0000)]
Partial MFC of r285528 as full RSS support is not available in FreeBSD 10.
Expose full 32bit RSS hash from card regardless of whether RSS is defined or
not. When doing multiqueue, we are all setup to have full 32bit RSS hash from
the card. We do not need to hide that under "ifdef RSS" and should expose that
by default so others like lagg(4) can use that and avoid hashing the traffic by
themselves.
Approved by: re (gjb)
Sponsored by: Limelight Networks
Check TCP timestamp option flag so that the automatic receive buffer
scaling code does not use an uninitialized timestamp echo reply value
from the stack when timestamps are not enabled.
MFC r285663, r285664, r285667:
Ensure that locstat_nsecs() has no effect when lockstat probes are not
enabled or when the profiled lock carries the LO_NOPROFILE flag.
PR: 201642, 201517
Approved by: re (gjb)
Tested by: Jason Unovitch
MFC: r285594
New partition flag for gpart, writes the 0xee partition in the pmbr in the second slot, rather than the first.
Works around Lenovo legacy GPT boot issue
PR: 184910
Approved by: re (gjb), marcel
Relnotes: yes
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D3140