bdrewery [Tue, 22 Mar 2016 22:41:14 +0000 (22:41 +0000)]
Handle copyin failures.
Skip the log entry as there is nothing good to write out. Don't fail
the syscall though since it already succeeded. There's no reason
filemon's tracing failure should fail the already-succeeded syscall.
Record the error for later to return from close(2) on the filemon devfs
file descriptor.
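A minimal userland sketch of the policy described above, not filemon's actual code: when fetching the data for a log entry fails, the entry is skipped and the error is remembered, the traced call still reports success, and the stored error surfaces only when the tracer closes its descriptor. struct tracer, trace_syscall() and tracer_close() are hypothetical names, and keeping only the first error is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical tracer state; not the real struct filemon. */
struct tracer {
	int	entries;	/* log entries written so far */
	int	error;		/* deferred error, 0 if none */
};

/*
 * Called after a traced syscall has already succeeded.  fetch_error is the
 * result of copying in the data to log (0 on success, an errno otherwise).
 * Always returns 0: a tracing failure must not fail the syscall.
 */
int
trace_syscall(struct tracer *t, int fetch_error)
{
	assert(t != NULL);
	if (fetch_error != 0) {
		/* Nothing good to write; remember the error for close. */
		if (t->error == 0)
			t->error = fetch_error;
		return (0);
	}
	t->entries++;
	return (0);
}

/* Report the deferred error, if any, when the tracer closes its handle. */
int
tracer_close(struct tracer *t)
{
	assert(t != NULL);
	return (t->error);
}
```

A caller would see every trace_syscall() return 0 and would learn about a failed copy only from the value returned by tracer_close().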
bdrewery [Tue, 22 Mar 2016 22:41:03 +0000 (22:41 +0000)]
Follow-up r297156: Close the log in filemon_dtr rather than in the last reference.
If the tracer has decided to close the log then it should be fully
written, not getting more entries, when close(2) returns. This was
a regression in r297156 in that it allowed a traced process to continue
a traced syscall and add more entries to the log while the tracer had
already closed its fd or exited. This was only really part of the
daemonized process case which is abnormal.
np [Tue, 22 Mar 2016 18:56:23 +0000 (18:56 +0000)]
cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything
to the descriptor ring no matter what path the frame takes within the
driver's tx.
jtl [Tue, 22 Mar 2016 15:55:17 +0000 (15:55 +0000)]
to_flags is currently a 64-bit integer; however, we only use 7 bits.
Furthermore, there is no reason this needs to be a 64-bit integer
for the foreseeable future.
Also, there is an inconsistency between to_flags and the mask in
tcp_addoptions(). Before r195654, to_flags was a u_long and the mask in
tcp_addoptions() was a u_int. r195654 changed to_flags to be a u_int64_t
but left the mask in tcp_addoptions() as a u_int, meaning that these
variables will only be the same width on platforms with 64-bit integers.
Convert both to_flags and the mask in tcp_addoptions() to be explicitly
32-bit variables. This may save a few cycles on 32-bit platforms, and
avoids unnecessarily mixing types.
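As a rough illustration of keeping the flags and the mask the same width, the sketch below walks a set of option flags with a mask of the same 32-bit type. The flag names and values are made up for the example and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical option bits; only a handful are needed. */
#define	OPT_A		0x0001u
#define	OPT_B		0x0002u
#define	OPT_C		0x0040u
#define	OPT_MAX		0x0100u	/* first bit beyond the option bits */

static_assert(OPT_MAX <= UINT32_MAX, "option bits must fit the 32-bit mask");

/* Count the options set in flags, using a mask of the same width. */
int
count_options(uint32_t flags)
{
	uint32_t mask;
	int n = 0;

	for (mask = 1; mask < OPT_MAX; mask <<= 1) {
		if ((flags & mask) != 0)
			n++;
	}
	return (n);
}
```

Because both variables are uint32_t, the comparison and the AND never mix widths, whatever the platform's native integer size.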
bz [Tue, 22 Mar 2016 15:43:47 +0000 (15:43 +0000)]
Mfp4 @180378:
Factor out nd6 and in6_attach initialization to their own files.
Also move destruction into those files though still called from
the central initialization.
Sponsored by: CK Software GmbH
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D5033
trasz [Tue, 22 Mar 2016 13:46:01 +0000 (13:46 +0000)]
Wait for root mount tokens before showing the root mount prompt.
This restores the pre-r290196 behaviour, eliminating the need to manually
press '.' a couple of times to get USB to finish probing.
Note that there's still something wrong with the console (character
echoing doesn't quite work), and there's also a reported problem with
BHyVe, but those two don't seem related to the problem above.
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
gnn [Tue, 22 Mar 2016 13:16:52 +0000 (13:16 +0000)]
Add an mbuf provider to DTrace.
The mbuf provider is made up of a set of Statically Defined Tracepoints
which help us look into mbufs as they are allocated and freed. This can be
used to inspect the buffers or for a simplified mbuf leak detector.
tuexen [Tue, 22 Mar 2016 12:40:09 +0000 (12:40 +0000)]
Support checksum offloading for TCP/IPV6 and UDP/IPV6.
Support SCTP checksum offloading for SCTP/IPV6.
Support SCTP checksum offloading on all controllers except 82575.
bz [Tue, 22 Mar 2016 12:12:01 +0000 (12:12 +0000)]
Adding pci_host_generic unconditionally breaks ARM boards with a PCI(e) interface.
Make it a device option to be included in the kernel configs that request this file.
kib [Tue, 22 Mar 2016 10:51:42 +0000 (10:51 +0000)]
Apparently there are some popular programs around which assume that it
is safe to call pthread_mutex_init() on the same shared mutex several
times. POSIX claims that the behaviour in this case is undefined.
Make this work by only allowing one caller to initialize the mutex.
Other callers either see already completed initialization and do
nothing, or busy-loop yielding while the designated initializer finishes.
Also loosen the API requirements by initializing mutexes on other
pthread_mutex*() calls if they see an uninitialized shared mutex.
Only mutexes provide the hack for now, but it could also be
implemented for other process-shared primitives from libthr.
Reported and tested by: "Oleg V. Nauman" <oleg@opentransfer.com>
Sponsored by: The FreeBSD Foundation
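The "one designated initializer, everyone else waits" scheme can be sketched with C11 atomics as below. This is an illustration under stated assumptions, not libthr's implementation: struct shared_obj, its state values and shared_obj_init() are hypothetical, and the object is assumed to start out zeroed, as freshly mapped shared memory would be.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

enum { UNINIT = 0, INITIALIZING = 1, READY = 2 };

/* Hypothetical shared object; state must start out as UNINIT (zero). */
struct shared_obj {
	atomic_int	state;
	int		value;
};

/*
 * Returns 1 if this caller performed the initialization, 0 if it found the
 * object already initialized (possibly after waiting for another caller).
 */
int
shared_obj_init(struct shared_obj *o, int value)
{
	int expected = UNINIT;

	assert(o != NULL);
	if (atomic_compare_exchange_strong(&o->state, &expected,
	    INITIALIZING)) {
		o->value = value;	/* we are the designated initializer */
		atomic_store(&o->state, READY);
		return (1);
	}
	/* Someone else won the race: yield until they finish. */
	while (atomic_load(&o->state) != READY)
		sched_yield();
	return (0);
}
```

A second call on the same object leaves the first caller's value in place, which is the tolerant behaviour the commit describes for repeated initialization.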
andrew [Tue, 22 Mar 2016 08:36:25 +0000 (08:36 +0000)]
Use the saved program state register to detect when an exception frame is
from userspace. Previously we could have triggered a panic by trying to
jump to a kernel address from userland as the trap handling code thought we
received an AST in kernel mode.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
mav [Tue, 22 Mar 2016 06:24:52 +0000 (06:24 +0000)]
Optimize IPMI watchdog patting.
Set watchdog timer parameters only when they really need to be changed.
In other cases just restart the timer with a single Reset command instead
of two (Set and Reset).
On one hand, this visibly reduces the amount of CPU time burned in a tight
loop waiting while some slow BMC configures its watchdog hardware, which
seems to be a much more complicated task than just resetting the timer.
On the other hand, on some BMCs those slow Set commands sometimes tend to
time out, which leads to noisy log messages and even more CPU time burned,
so avoiding them can provide an even bigger benefit.
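The idea can be sketched as a cache of the last programmed timeout. This is a simplified model rather than the ipmi(4) code: struct wd and wd_pat() are hypothetical, and the counters merely stand in for the Set and Reset commands sent to the BMC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical watchdog state; the counters stand in for BMC commands. */
struct wd {
	int	cur_timeout;	/* last timeout programmed, -1 if none */
	int	set_cmds;	/* slow "Set" commands issued */
	int	reset_cmds;	/* cheap "Reset" commands issued */
};

/* Pat the watchdog, reprogramming it only if the timeout changed. */
void
wd_pat(struct wd *w, int timeout)
{
	assert(w != NULL);
	if (timeout != w->cur_timeout) {
		w->set_cmds++;		/* parameters changed: Set first */
		w->cur_timeout = timeout;
	}
	w->reset_cmds++;		/* then always Reset */
}
```

With an unchanged timeout, repeated pats cost one Reset each and no further Set commands.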
sephe [Tue, 22 Mar 2016 06:23:09 +0000 (06:23 +0000)]
hyperv/vmbus: Remove NULL check for taskqueue_create_fast(M_WAITOK)
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
sephe [Tue, 22 Mar 2016 06:13:27 +0000 (06:13 +0000)]
hyperv/vmbus: Use taskqueue_fast for non-performance critical messages
This gets rid of the per-cpu SWIs.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
sephe [Tue, 22 Mar 2016 05:48:51 +0000 (05:48 +0000)]
hyperv/evttimer: Use an independent message slot so that it can work
Using the same message slot as the other types of messages has
the side effect that the event timer message could be deferred to
the swi threads to run. Those threads lack a trapframe, and the original
code didn't even handle that, so the event timer was actually broken.
As of this commit we use an independent message slot for the event timer,
so that we can handle all event timer messages in the interrupt
handler directly. Note that the message slot for the event timer is still
bound to the same interrupt vector as the other types of messages.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5696
adrian [Tue, 22 Mar 2016 01:09:15 +0000 (01:09 +0000)]
[urtwn] welcome basic 11n support to urtwn.
This is a pretty good reference for teaching an almost-11n-capable
driver about 11n.
It enables HT20 operation, A-MPDU/A-MSDU RX, but no aggregate support
for transmit. That'll come later. This means that receive throughput
should be higher, but transmit throughput won't have changed much.
* Disable bgscan - for now, bgscan will interfere with AMPDU TX/RX,
so until we correctly handle it in software driven scans, disable.
* Add null 11n methods for channel width / ampdu_enable.
The firmware can apparently handle ampdu tx (and hopefully block-ack
handling and retransmission) so I'll go review the linux code and
figure it out.
* Set the number of tx/rx streams. I /hope/ that nchains == nstreams
here.
* Add 11n channels in the call to ieee80211_init_channels().
* Don't enable HT40 for now - I'll have to verify the channel set command
and tidy it up a bit first.
* Teach the RX path about M_AMPDU for 11n nodes. Kinda wonder why
we aren't just doing this in net80211 already, this is the fourth
driver I've had to do this to.
* Teach rate2ridx() about MCS rates and what hardware rates to use.
* Teach the urtwn_tx_data() routine about MCS/11ng transmission.
It doesn't know about short-gi and 40MHz modes yet; that'll come
later.
* For 8192CU firmware, teach the rate table code about MCS rates.
* Ensure that the fixed rate transmit sets the right transmit flag
so the firmware obeys the driver transmit path.
* Set the default transmit rate to MCS4 if no rate control is available.
* Add HT protection (RTS-CTS exchange) support.
* Add appropriate XXX TODO entries.
TODO:
* 40MHz, short-gi, etc - channel tuning, TX, RX;
* teach urtwn_tx_raw() about (more) 11n stuff;
* A-MPDU TX would be nice!
Thanks to Andriy (avos@) for reviewing the code and testing it on IRC.
Tested:
* RTL8188EU - STA (me)
* RTL8192CU - STA (me)
* RTL8188EU - hostap (avos)
* RTL8192CU - STA (avos)
jhb [Mon, 21 Mar 2016 21:37:33 +0000 (21:37 +0000)]
Fully handle size_t lengths in AIO requests.
First, update the return types of aio_return() and aio_waitcomplete() to
ssize_t.
POSIX requires aio_return() to return a ssize_t so that it can represent
all return values from read() and write(). aio_waitcomplete() should use
ssize_t for the same reason.
aio_return() has used ssize_t in <aio.h> since r31620 but the manpage and
system call entry were not updated. aio_waitcomplete() has always
returned int.
Note that this does not require new system call stubs as this is
effectively only an API change in how the compiler interprets the return
value.
Second, allow aio_nbytes values up to IOSIZE_MAX instead of just INT_MAX.
aio_read/write should now honor the same length limits as normal read/write.
Third, use longs instead of ints in the aio_return() and aio_waitcomplete()
system call functions so that the 64-bit size_t in the in-kernel aiocb
isn't truncated to 32 bits before being copied out to userland or
being returned.
Finally, a simple test has been added to verify the bounds checking on the
maximum read size from a file.
bdrewery [Mon, 21 Mar 2016 20:29:39 +0000 (20:29 +0000)]
Stop tracking stat(2).
None of lstat(2), fstat(2), fstatat(2) were tracked either.
The other filemon implementations also do not track stat(2), nor
does bmake utilize it. The act of opening a file for read should
be enough to decide that a file is a dependency. There could be
rare cases where just having a file would cause a dependency but it
is unlikely.
bdrewery [Mon, 21 Mar 2016 20:29:27 +0000 (20:29 +0000)]
Track filemon usage via a proc.p_filemon pointer rather than its own lists.
- proc.p_filemon is added which is protected by PROC_LOCK. This improves
performance and avoids double-fork issues, taking allproc_lock
while in syscalls, and walking the process tree in syscalls. A
particular proc.p_filemon can only be changed to NULL or another
filemon, or the filemon inherited, while the filemon->lock is held.
- Filemons are reference counted. On the last reference the log will be closed.
- When closing the devfs file handle, the filemon will be detached from all
processes and inheritance prevented.
- Disallow attaching to a process already being traced since filemon is
typically intended to be used on children only. This is allowed for
curproc as bmake relies on this behavior for rare cases when combining
.MAKE with .META.
- Detach any previously tracked process on ioctl(FILEMON_SET_PID).
- Handle error from devfs_set_cdevpriv() in filemon_open().
- The global filemon lock and lists are removed.
- A free list is no longer kept. Previously this list was
forever-expanding and never garbage collected.
- No longer loses track of double-forks. If the process holding the filemon
handle closes it, the log is closed rather than waiting on a daemonized process,
but all activity is logged until it closes its handle. The filemon
will be removed from the process and not inherited.
- A separate process count is kept only as an optimization for
forced detachment to avoid taking allproc_lock and walking the entire
process tree.
- struct filemon access is protected by sx(9) filemon->lock as it was before.
- Add more comments and KASSERTS.
ian [Mon, 21 Mar 2016 15:06:50 +0000 (15:06 +0000)]
If the dhcp server provided an interface-mtu option, transcribe the value
to the boot.netif.mtu env var, which will be picked up by pre-existing code
in nfs_mountroot() and used to configure the interface accordingly.
This should bring the same functionality when the bootp/dhcp work is done
by loader(8) as r297150 does for the in-kernel BOOTP case.
ian [Mon, 21 Mar 2016 14:58:12 +0000 (14:58 +0000)]
If the dhcp server delivers an interface-mtu option, parse it and store
the value in a new global intf_mtu for use by the application.
These changes were inspired by the patch provided by Robert Blayzor in
PR 187094, and will allow loader(8) to propagate the value to the kernel
along with the other nfs_diskless parms delivered via environment vars.
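For readers unfamiliar with the option format, here is a hedged sketch of locating the interface-mtu option in a DHCP options area. It is not the libstand parser: find_interface_mtu() is a hypothetical helper, and it only relies on the RFC 2132 encoding (option code 26, length 2, value in network byte order).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	OPT_PAD		0	/* single-byte padding option */
#define	OPT_IFACE_MTU	26	/* Interface MTU, RFC 2132 */
#define	OPT_END		255	/* end of options */

/*
 * Scan a DHCP options area (tag, length, value triples) for the
 * interface-mtu option.  Returns the MTU, or 0 if absent or malformed.
 */
unsigned int
find_interface_mtu(const uint8_t *opts, size_t len)
{
	size_t i = 0;

	assert(opts != NULL);
	while (i < len) {
		uint8_t tag = opts[i++];

		if (tag == OPT_PAD)
			continue;
		if (tag == OPT_END || i >= len)
			break;
		uint8_t olen = opts[i++];
		if (olen > len - i)
			break;			/* truncated option */
		if (tag == OPT_IFACE_MTU && olen == 2)
			return (((unsigned int)opts[i] << 8) | opts[i + 1]);
		i += olen;
	}
	return (0);
}
```

A nonzero result is the value a caller would then store (for example in a global such as intf_mtu) for later use.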
ian [Mon, 21 Mar 2016 14:51:51 +0000 (14:51 +0000)]
If the dhcp server provides an interface-mtu option, parse the value and
set that mtu on the interface.
These changes are based on the patch submitted by Robert Blayzor in the
PR, but I changed things around a bit, so the blame for any mistakes
belongs to me.
ian [Mon, 21 Mar 2016 14:21:32 +0000 (14:21 +0000)]
Garbage collect the bswap routines from libstand. The declaration was
wrapped in an i386 ifdef with a comment questioning their usefulness even
there. It turns out they aren't referenced anywhere, but their presence
prevents using sys/endian.h in libstand code.
These days, sys/endian.h provides much better support for such things, using
compiler builtins and inline functions (and creating connections between
libstand code and header files from sys/ would not be breaking new ground).
sephe [Mon, 21 Mar 2016 06:54:21 +0000 (06:54 +0000)]
hyperv: Factor out snprintf_hv_guid()
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5651
kib [Mon, 21 Mar 2016 06:46:16 +0000 (06:46 +0000)]
From libthr, remove the special and strange code that sets up the session
and controlling terminal, activated when running with pid 1. It is the
application's duty to handle this, and unsuspecting init replacements
which are linked with libthr would be broken by this.
The pre-resolving of getpid() is restored, just in case.
Reviewed by: jilles
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
emaste [Mon, 21 Mar 2016 00:59:30 +0000 (00:59 +0000)]
i915: disable GEN6_MBCTL write in gen6_init_clock_gating
This write came from Linux commit b4ae3f22d238 which has been implicated
in Sandy Bridge power consumption issues (albeit under different
conditions on Linux). Disabling it restores normal power consumption on
my Sandy Bridge laptop (Thinkpad X220).
PR: 207889
Reviewed by: cem, dumbbell
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5665
ian [Mon, 21 Mar 2016 00:52:24 +0000 (00:52 +0000)]
Fix fallout from r292180 (Dec 2015)... ensure that every driver which has
a DRIVER_MODULE() referencing mmc_driver has a MODULE_DEPEND() on mmc. This
is because the kernel linker only searches for symbols in dependent modules,
so loading sdhci_pci (and other bus-flavors of sdhci) would fail when mmc
was not compiled into the kernel (even if you hand-loaded mmc first).
(Thanks to jilles@ for providing the vital clue about the kernel linker.)
jhibbits [Sun, 20 Mar 2016 14:21:07 +0000 (14:21 +0000)]
Convert a long to rman_res_t, fixing a sign extension bug.
ahci.c had one signed long, which was passed into rman, rather than u_long.
After the switch of rman_res_t from size u_long to size uintmax_t, the sign
extension caused ranges to get messed up, and ahcich* to not attach.
There may be more signed longs used in this way, which will be fixed as they're
reported.
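The effect is easy to reproduce in isolation. In the sketch below, fixed-width types stand in for a 32-bit long and a 64-bit rman_res_t; the typedef and function names are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins: a 32-bit long and a 64-bit resource type. */
typedef int32_t		long32_t;
typedef uint32_t	ulong32_t;
typedef uint64_t	res_t;

static_assert(sizeof(res_t) > sizeof(long32_t),
    "the example needs a widening conversion");

/* Widen a range bound held in a signed variable (the buggy pattern). */
res_t
widen_signed(long32_t v)
{
	return ((res_t)v);	/* negative values sign-extend */
}

/* Widen the same bit pattern held in an unsigned variable. */
res_t
widen_unsigned(ulong32_t v)
{
	return ((res_t)v);	/* zero-extends */
}
```

An "all ones" bound stored in the signed type becomes 0xffffffffffffffff after widening, while the unsigned type yields 0x00000000ffffffff, which is why the range only broke once the resource type grew wider than long.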
dchagin [Sun, 20 Mar 2016 11:40:52 +0000 (11:40 +0000)]
Rework r296543:
1. Limit secs to INT32_MAX / 2 to avoid errors from kern_setitimer().
Assert that kern_setitimer() returns 0.
Remove bogus cast of secs.
Fix style(9) issues.
2. Increment the return value if the remaining tv_usec value is more than 500000, as Linux does.
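The two rules can be sketched as small helpers. These are hypothetical functions written from the description above rather than taken from the Linux emulation code, and the strict "more than 500000" comparison follows the wording of this message.

```c
#include <assert.h>
#include <stdint.h>

#define	ALARM_SECS_MAX	((unsigned int)INT32_MAX / 2)

/* Clamp the requested interval so the timer call cannot reject it. */
unsigned int
alarm_clamp_secs(unsigned int secs)
{
	return (secs > ALARM_SECS_MAX ? ALARM_SECS_MAX : secs);
}

/* Whole seconds left on the old timer, rounded up past the half second. */
unsigned int
alarm_remaining(unsigned int sec, unsigned int usec)
{
	assert(usec < 1000000);
	return (usec > 500000 ? sec + 1 : sec);
}
```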
adrian [Sun, 20 Mar 2016 03:54:57 +0000 (03:54 +0000)]
[urtwn] migrate urtwn out into sys/dev/urtwn/ .
There's some upcoming work to add new chipset support here and I'd
like to only add 802.11n support to one driver, instead of both
urtwn and rtwn.
There's also missing support for things like 802.11n, some powersave
work, bluetooth integration/coexistence, etc, and also newer parts
(like 8192EU, maybe some 11ac parts, not sure yet.)
So, this is hopefully the first step in a longer set of steps to unify
rtwn/urtwn and extend it with more interesting chipset and functionality
support.
pfg [Sun, 20 Mar 2016 03:27:06 +0000 (03:27 +0000)]
localedef(1): minor sorting to match Illumos.
Illumos recently included space in 'print' class. We already had
this but the code had slight sorting differences. Move it some
lines up to reduce diffs with Illumos.
jhb [Fri, 18 Mar 2016 19:48:49 +0000 (19:48 +0000)]
Check IPI status more frequently when waiting.
An IPI cannot be sent via the local APIC if a previous IPI is still
being delivered. Attempts to send an IPI will wait for a pending IPI
to clear. Prior to r278325 these checks used a spin loop with a
hardcoded maximum count which broke AP startup on some systems.
However, r278325 also enforced a minimum latency of 5 microseconds if an
IPI was still pending which resulted in a measurable performance hit.
This change reduces that minimum latency to 1 microsecond.
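Why the polling interval matters can be shown with a toy model. This is not the local APIC code: sim_pending_us is a simulated "time until the previous IPI is delivered", and subtracting from it stands in for a busy-wait delay of step_us microseconds.

```c
#include <assert.h>

/* Simulated microseconds until the pending IPI clears. */
int sim_pending_us;

/*
 * Poll every step_us microseconds until the pending state clears or
 * budget_us is used up.  Returns the microseconds spent waiting, or -1 on
 * timeout.
 */
int
wait_for_idle(int step_us, int budget_us)
{
	int waited = 0;

	assert(step_us > 0);
	while (sim_pending_us > 0) {
		if (waited >= budget_us)
			return (-1);
		sim_pending_us -= step_us;	/* stands in for the delay */
		waited += step_us;
	}
	return (waited);
}
```

In this model an IPI that clears after 3 microseconds costs 5 microseconds of waiting with a 5 microsecond step but only 3 with a 1 microsecond step, while the overall timeout budget is unchanged.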
pfg [Fri, 18 Mar 2016 19:04:01 +0000 (19:04 +0000)]
aio_qphysio(): Avoid uninitialized pointer read on error.
For the !unmap case it may happen that pbuf is read while still
uninitialized when vm_fault_quick_hold_pages() fails.
Initialize it so it doesn't cause trouble.
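The general pattern, independent of the AIO code, looks like the sketch below: a pointer that is assigned on only one branch is initialized up front so that a shared error path can inspect it safely. setup() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical two-step setup.  buf is initialized to NULL so the shared
 * cleanup path can test it even when the buffer was never allocated.
 */
int
setup(int need_buf, int fail_early)
{
	char *buf = NULL;	/* without this, cleanup could read garbage */
	int error = 0;

	if (need_buf) {
		buf = malloc(64);
		if (buf == NULL)
			return (-1);
	}
	if (fail_early) {
		error = -1;
		goto done;
	}
	/* ... the real work would use buf here ... */
done:
	assert(need_buf || buf == NULL);	/* only allocated on request */
	if (buf != NULL)
		free(buf);
	return (error);
}
```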