dchagin [Tue, 28 Feb 2017 19:49:21 +0000 (19:49 +0000)]
FreeBSD does not have an analogue for the epoll EPOLLPRI event type.
So, do not set the EPOLLPRI event accidentally.
Also, do not set the EPOLLWRNORM and EPOLLRDNORM events, as epoll
does not set these events.
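The masking described above can be sketched roughly as follows. This is a
user-space illustration only, not the Linuxulator code: the helper name
epoll_sanitize_events() and the LINUX_* macro names are hypothetical, and
the constants are defined locally with the values Linux uses for these bits.

```c
#include <assert.h>
#include <stdint.h>

/* Linux epoll event bits, defined locally so the sketch is self-contained. */
#define LINUX_EPOLLIN      0x001u
#define LINUX_EPOLLPRI     0x002u
#define LINUX_EPOLLOUT     0x004u
#define LINUX_EPOLLRDNORM  0x040u
#define LINUX_EPOLLWRNORM  0x100u

/* Bits that, per the commit message, must not be reported to the caller. */
#define LINUX_EPOLL_UNREPORTED \
    (LINUX_EPOLLPRI | LINUX_EPOLLRDNORM | LINUX_EPOLLWRNORM)

/* The filtered bits must not overlap the ones we do want to keep. */
static_assert((LINUX_EPOLL_UNREPORTED & (LINUX_EPOLLIN | LINUX_EPOLLOUT)) == 0,
    "filtered bits overlap reported bits");

/* Hypothetical helper: strip the unreported bits from an event mask. */
static uint32_t
epoll_sanitize_events(uint32_t events)
{
    return (events & ~LINUX_EPOLL_UNREPORTED);
}
```

Passing LINUX_EPOLLIN | LINUX_EPOLLPRI through the helper leaves only
LINUX_EPOLLIN set.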
np [Tue, 28 Feb 2017 19:27:41 +0000 (19:27 +0000)]
cxgbe/iw_cxgbe: fix various double-close panics with iWARP sockets.
Sockets representing the TCP endpoints for iWARP connections are
allocated by the ibcore module. Before this revision they were closed
either by the ibcore module or the iw_cxgbe hardware driver depending on
the state transitions during connection teardown. This is error prone
and there were cases where both iw_cxgbe and ibcore closed the socket
leading to double-free panics. The fix is to let ibcore close the
sockets it creates and never do it in the driver.
- Use sodisconnect instead of soclose (preceded by solinger = 0) in the
driver to tear down an RDMA connection abruptly. This does what's
intended without releasing the socket's fd reference.
- Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This
works for all kinds of sockets: clients that initiate connections,
listeners, and sockets accepted off of listeners.
Reviewed by: Steve Wise @ Open Grid Computing, hselasky@
MFC after: 3 days
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D9796
avg [Tue, 28 Feb 2017 18:48:12 +0000 (18:48 +0000)]
Local APIC: add support for extended LVT entries found in AMD processors
The extended LVT entries can be used to configure interrupt delivery
for various events that are internal to a processor and can use this
feature.
All current processors that support the feature have four such entries.
The entries are all masked upon processor reset, but it's possible
that firmware may use some of them.
BIOS and Kernel Developer's Guides for some processor models do not assign
any particular names to the extended LVTs, while other BKDGs provide names
and suggested usage for them.
However, there is no fixed mapping between the LVTs and the processor
events in any processor model that supports the feature. Any entry can be
assigned to any event. The assignment is done by programming an offset
of an entry into configuration bits corresponding to an event.
This change does not expose the flexibility that the feature offers.
The change adds just a single method to configure a hardcoded extended LVT
entry to deliver APIC_CMC_INT. The method is designed to be used with
Machine Check Error Thresholding mechanism on supported processor models.
For references please see BKDGs for families 10h - 16h and specifically
descriptions of APIC30, APIC400, APIC[530:500] registers.
For a description of the Error Thresholding mechanism see, for example,
BKDG for family 10h, section 2.12.1.6.
http://developer.amd.com/resources/developer-guides-manuals/
scottl [Tue, 28 Feb 2017 18:25:06 +0000 (18:25 +0000)]
Implement sbuf_prf(), which takes an sbuf and outputs it
to stdout in the non-kernel case and to the console+log
in the kernel case. For the kernel case it hooks the
putbuf() machinery underneath printf(9) so that the buffer
is written completely atomically and without a copy into
another temporary buffer. This is useful for fixing
compound console/log messages that become broken and
interleaved when multiple threads are competing for the
console.
manu [Tue, 28 Feb 2017 15:44:21 +0000 (15:44 +0000)]
allwinner: A31: Add ccung driver
This adds clocks support for the aw_ccung on the A31 SoC.
Newer DTS files require this.
All the clocks except two CSI are defined and exported on the clock domain.
mav [Tue, 28 Feb 2017 05:17:50 +0000 (05:17 +0000)]
Add safety check against too long CDB.
The SBP-2 specification defined the maximum CDB length as 12 bytes. The newer
SBP-3 specification allows a CDB of any size, but this driver is too old. The
proper solution would be to look at the maximum ORB size supported by the target.
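A length check of this kind might look like the sketch below. The helper
name, the macro name and the choice of EINVAL as the error value are
assumptions made for illustration; only the 12-byte limit comes from the
commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Maximum CDB length under SBP-2, as stated in the commit message. */
#define SBP2_MAX_CDB_LEN 12

/*
 * Hypothetical validation helper: return 0 if a CDB of cdb_len bytes
 * fits into an SBP-2 command ORB, or EINVAL if it is too long.
 */
static int
sbp2_cdb_length_check(size_t cdb_len)
{
    if (cdb_len > SBP2_MAX_CDB_LEN)
        return (EINVAL);
    return (0);
}
```

A caller would reject the request before copying the CDB into a
fixed-size buffer, so a 16-byte CDB fails while a 12-byte one passes.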
ngie [Tue, 28 Feb 2017 04:48:30 +0000 (04:48 +0000)]
Use "build" instead of "all" when building ports modules
"all" in ports currently means "stage the ports", which requires root today,
and brings to light other potential issues, like ENAMETOOLONG with staged
directories (bug 161481, etc).
This fixes buildkernel for me when run as a non-root user, assuming all
of the prerequisites have been installed beforehand and are up-to-date.
jhibbits [Tue, 28 Feb 2017 04:31:28 +0000 (04:31 +0000)]
Make kernel breakpoints work for book-e
Add the necessary bits to enable kernel breakpoints for Book-E. The entrypoint
for program exception is very trivial, so rather than expand it to be similar to
AIM, add it into the standard trap handler.
This wasn't blocked out as Book-E specific because it is only a minor redundancy
over AIM, which should have already called db_trap_glue() at this point. If
it's going to panic with a fatal trap anyway, it doesn't matter if it goes
through this path again.
jhibbits [Tue, 28 Feb 2017 04:13:20 +0000 (04:13 +0000)]
Unbreak kernel breakpoints, broken for ~4 years now
When committing DTrace in 2012/2013 era I inadvertently broke breakpoints, by
setting EXC_DTRACE to the same value as BKPT_INST. Change EXC_DTRACE to a
different, yet logically identical, trap (tw <all>,31,31).
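For readers unfamiliar with the encoding, the sketch below builds the
32-bit opcode of a PowerPC "tw TO,RA,RB" instruction (X-form, primary
opcode 31, extended opcode 4). The helper name is hypothetical, and reading
"<all>" as TO = 31 (all five trap conditions selected) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode "tw TO,RA,RB": primary opcode 31 in the top six bits, then the
 * 5-bit TO, RA and RB fields, then extended opcode 4 shifted left by one.
 */
static uint32_t
ppc_encode_tw(uint32_t to, uint32_t ra, uint32_t rb)
{
    /* Each field is five bits wide. */
    assert(to < 32 && ra < 32 && rb < 32);
    return ((31u << 26) | (to << 21) | (ra << 16) | (rb << 11) | (4u << 1));
}
```

Under these assumptions ppc_encode_tw(31, 31, 31) yields 0x7ffff808.
Because the register fields differ from those of other tw encodings, the
trap handler can tell the two instructions apart by their opcode words even
though both trap unconditionally.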
shurd [Tue, 28 Feb 2017 02:27:51 +0000 (02:27 +0000)]
bnxt: propagate RSS hash type to the network stack.
The RSS hash type is used to identify the CPU onto which a received packet
will be queued. This patch extracts the "RSS hash type" from the receive
completion and sends it to the stack.
davidcs [Mon, 27 Feb 2017 23:38:51 +0000 (23:38 +0000)]
1. Add state checks in bxe_tx_mq_start_locked() and bxe_tx_mq_start() to sync threads during interface down or detach.
2. Add sysctl to set pause frame parameters.
3. Increase max segs for TSO packets to BXE_TSO_MAX_SEGMENTS (32).
4. Add debug messages for PHY.
5. Restrict HW LRO support to FreeBSD versions 8.x and above.
Submitted by: Vaishali.Kulkarni@cavium.com
MFC after: 5 days
avg [Mon, 27 Feb 2017 17:36:31 +0000 (17:36 +0000)]
fix lvt_mode: edge-triggered interrupt mode is set by clearing APIC_LVT_TM
The fixed code is used only to fix up buggy MPTable information and the
trigger mode is probably ignored for the relevant interrupt types
anyway. Still, it's better to be standards compliant and have the code
do what it says it does.
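The bit manipulation involved can be sketched as below. The helper name is
hypothetical; APIC_LVT_TM is defined locally as bit 15 of an LVT register,
which is where the trigger mode bit lives (0 = edge, 1 = level).

```c
#include <stdbool.h>
#include <stdint.h>

/* Trigger mode bit of a local APIC LVT entry: 0 = edge, 1 = level. */
#define APIC_LVT_TM 0x00008000u

/*
 * Hypothetical helper: return the LVT value with the trigger mode set as
 * requested. Edge-triggered mode is selected by clearing APIC_LVT_TM,
 * level-triggered mode by setting it.
 */
static uint32_t
lvt_set_trigger_mode(uint32_t value, bool level)
{
    if (level)
        value |= APIC_LVT_TM;
    else
        value &= ~APIC_LVT_TM;
    return (value);
}
```

All other bits of the entry (vector, delivery mode, mask) pass through
unchanged.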
hselasky [Mon, 27 Feb 2017 08:36:51 +0000 (08:36 +0000)]
Fix startup race initialising ACPI CM battery structures on MacBookPro.
During acpi_cmbat_attach() the acpi_cmbat_init_battery() notification
handler is registered. It has been observed this notification handler
can be called instantly, before the attach routine has returned. In
the notification handler there is a call to device_is_attached() which
returns false. Because the softc is set we know an attach is in
progress and the fix is simply to wait and try again in this case.
oshogbo [Sun, 26 Feb 2017 22:07:26 +0000 (22:07 +0000)]
Don't try to open devices in the gettc() function, since that will always
fail in capability mode. Instead, silently fall back to the syscall
method, as is done, for example, in the gettimeofday(2) function.
jchandra [Sun, 26 Feb 2017 22:05:22 +0000 (22:05 +0000)]
Enable pl011 UART FIFOs
The pl011 UART has a 16 entry Tx FIFO and a 16 entry Rx FIFO that
have not been used so far. Update the driver to enable the FIFOs
and use them in transmit and receive.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D8819
avos [Sun, 26 Feb 2017 20:49:35 +0000 (20:49 +0000)]
net80211 drivers: fix rate setup for EAPOL frames, obtain Tx parameters
directly from the node.
- Use ni_txparms directly instead of calculating them manually every time
- Move the M_EAPOL flag check earlier; otherwise it may be skipped due to
the 'ucastrate' / 'mcastrate' check
- Use 'mgtrate' for control frames too (see ifconfig(8), mgtrate parameter)
- Add a few more M_EAPOL checks where they were missing (zyd(4), ural(4),
urtw(4))
- A few unrelated cleanups
Tested with:
- Intel 6205 (iwn(4)), STA mode;
- WUSB54GC (rum(4)), HOSTAP mode + RTL8188EU (rtwn(4)), STA mode.
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D9811
alc [Sun, 26 Feb 2017 19:54:02 +0000 (19:54 +0000)]
Refine the fix from r312954. Specifically, add a new PDE-only flag,
PG_PROMOTED, that indicates whether lingering 4KB page mappings might
need to be flushed on a PDE change that restricts or destroys a 2MB
page mapping. This flag allows the pmap to avoid range invalidations
that are both unnecessary and costly.
mav [Sun, 26 Feb 2017 19:23:03 +0000 (19:23 +0000)]
Add support for SIMs without autosense.
If we asked the SIM to send sense data by setting CAM_SEND_SENSE, but the SIM
didn't confirm transmission by setting CAM_SENT_SENSE, assume it was not sent.
Queue the I/O back to CTL for later REQUEST SENSE with ctl_queue_sense().
This is needed for error reporting on SPI HBAs like ahc(4)/ahd(4).
manu [Sun, 26 Feb 2017 16:00:20 +0000 (16:00 +0000)]
Add clkng driver for Allwinner SoC
Since Linux 4.9-4.10, the DTS no longer has clocks under /clocks, only a ccu node.
Currently only the H3 is supported, with almost the same state as HEAD
(video PLLs aren't supported for now, but we don't support video).
This driver and its clocks will also be used for other SoCs (A64, A31, H5, H2, etc.).
mav [Sun, 26 Feb 2017 12:52:44 +0000 (12:52 +0000)]
Fix residual length reporting in target mode.
This makes it possible to properly handle cases where the target wants to
receive or send more data than the initiator wants to send or receive.
Previously in such cases isp(4) returned CAM_DATA_RUN_ERR, while now it
returns resid > 0.
dchagin [Sun, 26 Feb 2017 09:40:42 +0000 (09:40 +0000)]
Return an EOVERFLOW error when the size of the tv_sec field of struct timespec
in the COMPAT_LINUX32 Linuxulator is not equal to the size of the native tv_sec.
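One plausible reading of this change is a range check applied when a native
timespec is narrowed to the 32-bit Linux layout. The sketch below
illustrates that idea in user space; the structure and function names are
hypothetical and it is not the Linuxulator's actual conversion routine.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the native and 32-bit Linux layouts (names hypothetical). */
struct native_timespec {
    int64_t tv_sec;
    long    tv_nsec;
};

struct l32_timespec {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/*
 * Narrow a native timespec to the 32-bit layout. Return EOVERFLOW, and
 * leave *dst untouched, if tv_sec does not fit into 32 bits.
 */
static int
native_to_l32_timespec(struct l32_timespec *dst,
    const struct native_timespec *src)
{
    if (src->tv_sec > INT32_MAX || src->tv_sec < INT32_MIN)
        return (EOVERFLOW);
    dst->tv_sec = (int32_t)src->tv_sec;
    dst->tv_nsec = (int32_t)src->tv_nsec;
    return (0);
}
```

Failing with an error instead of silently truncating means the 32-bit
caller never sees a wrapped timestamp.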
mav [Sun, 26 Feb 2017 06:25:55 +0000 (06:25 +0000)]
Implement use of multiple transfers per I/O.
This change removes the limitation of a single S/G list entry and the limit
on maximum I/O size, using multiple data transfers per I/O if needed. It also
removes code duplication between the send and receive paths, which are now
identical.
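The general idea of splitting one I/O into several transfers can be
sketched as follows. This is a generic chunking loop, not the driver code:
the function name, the callback type and the max_xfer parameter are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback invoked once per data transfer. */
typedef void (*xfer_fn)(size_t offset, size_t len, void *arg);

/*
 * Split an I/O of total bytes into transfers of at most max_xfer bytes,
 * calling fn (if not NULL) for each one. Returns the number of transfers.
 */
static size_t
io_split_transfers(size_t total, size_t max_xfer, xfer_fn fn, void *arg)
{
    size_t off = 0, count = 0;

    assert(max_xfer > 0);
    while (off < total) {
        size_t len = total - off;

        if (len > max_xfer)
            len = max_xfer;     /* clamp to the per-transfer limit */
        if (fn != NULL)
            fn(off, len, arg);
        off += len;
        count++;
    }
    return (count);
}
```

Because the same loop serves both directions, a single callback parameter
is enough to cover send and receive.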
jtl [Sun, 26 Feb 2017 00:19:02 +0000 (00:19 +0000)]
Do some minimal work to better conform to the 802.3ad (LACP) standard.
In particular, don't set the synchronized bit for the peer unless it truly
appears to be synchronized to us. Also, don't set our own synchronized bit
unless we have actually seen a remote system.
Prior to this change, we were seeing some strange behavior, such as:
1. We send an advertisement with the Activity, Aggregation, and Default
flags, followed by an advertisement with the Activity, Aggregation,
Synchronization, and Default flags. However, we hadn't seen an
advertisement from another peer and were still advertising the default
(NULL) peer. A closer examination of the in-kernel data structures (using
kgdb) showed that the system had added the default (NULL) peer as a valid
aggregator for the segment.
2. We were receiving an advertisement from a peer that included the
default (NULL) peer instead of including our system information. However,
we responded with an advertisement that included the Synchronization flag
for both our system and the peer. (Since the peer's advertisement did not
include our system information, we shouldn't add the synchronization bit
for the peer.)
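The two rules above can be sketched as simple predicates. This is a
simplified model for illustration, not the lagg(4) LACP code: the structure
layout, the function names and the use of 0 to represent the default (NULL)
peer are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* System ID used here to stand for the default (NULL) peer. */
#define LACP_NULL_SYSID 0u

/* Simplified view of a received advertisement (layout hypothetical). */
struct lacp_pdu_view {
    uint64_t sender_sysid;   /* system that sent the advertisement */
    uint64_t partner_sysid;  /* system the sender believes is its partner */
    bool     sync;           /* sender's Synchronization flag */
};

/*
 * The peer counts as synchronized to us only if it claims to be in sync
 * and the partner it advertises is actually our system.
 */
static bool
lacp_peer_in_sync(const struct lacp_pdu_view *pdu, uint64_t my_sysid)
{
    return (pdu->sync && pdu->partner_sysid == my_sysid);
}

/* Our own sync bit may be set only once a real remote system was seen. */
static bool
lacp_may_set_own_sync(uint64_t recorded_partner_sysid)
{
    return (recorded_partner_sysid != LACP_NULL_SYSID);
}
```

In scenario 2 above, the peer advertises the NULL partner, so
lacp_peer_in_sync() returns false and the Synchronization flag is not
echoed back for the peer.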
Based on the discovery that every unmap waits for the commit of the txn to the ZIL,
introducing a very high latency to unmap commands, this behavior was made into a
tunable zvol_unmap_sync_enabled and set to false. The net impact of this change is
that by default SCSI unmap commands will result in space being freed within the zvol
(today they are ignored and returned with good status). However, unlike the code
today, instead of 18+ms per unmap, they take about 30us.
With the testing done on NTFS against a Win2k12 target, the new behavior should work
seamlessly. Files on the zvol that have already been set with the zfree application
will continue to write 0's when deleted, and any new files created since zvol
creation will send unmap commands when deleted. This behavior exists today, but with
this change the unmap commands will be processed and result in reclaim of space.
Author: Stephen Blinick <stephen.blinick@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Robert Mustacchi <rm@joyent.com>
oshogbo [Sat, 25 Feb 2017 18:14:32 +0000 (18:14 +0000)]
Remove unused macro from common/drv.c.
When we were comparing it to the code from boot2, it also looked like
this code is buggy and boot2 was never updated to use it.
The USE_XREAD flag is unused in boot2, and common/drv.c was never
built with that flag.
avg [Sat, 25 Feb 2017 16:45:53 +0000 (16:45 +0000)]
zfs: call spa_deadman on a taskqueue thread
callout(9) prohibits callout functions from sleeping.
illumos mutexes are emulated using sx(9).
spa_deadman() calls vdev_deadman() and the latter acquires vq_lock.
As a result we can get a more confusing panic instead of a specific
panic or no panic:
sleepq_add: td 0xfffff80019669960 to sleep on wchan 0xfffff8001cff4d88 with sleeping prohibited
This change adds another level of indirection where the deadman
callout schedules spa_deadman() to be executed on taskqueue_thread.
While there, use callout_schedule() instead of callout_reset()
in spa_sync().
avg [Sat, 25 Feb 2017 16:39:21 +0000 (16:39 +0000)]
call vm_lowmem hook in uma_reclaim_worker
A comment near kmem_reclaim() implies that we already did that.
Calling the hook is useful, because some handlers, e.g. ARC,
might be able to release significant amounts of KVA.
Now that we have more than one place where vm_lowmem hook is called,
use this change as an opportunity to introduce flags that describe
a reason for calling the hook. No handler makes use of the flags yet.
The fsid of zfs filesystems might change after reboot or remount. The problem seems to
be caused by a race between unique_insert() and unique_remove(). The unique_remove()
is called from dsl_dataset_evict() which is now an asynchronous thread. In a case the
dsl_dataset_evict() thread is very slow and calls unique_remove() too late we will end
up with changed fsid on zfs mount.
This problem is very likely caused by #5056.
Steps to Reproduce
Note: I'm able to reproduce this always on a single core (virtual) machine. On multicore
machines it is not so easy to reproduce.
Impact
The persistent fsid (filesystem id) is essential for proper NFS functionality.
If the fsid of a filesystem changes on remount (or after reboot) the NFS
clients might not be able to automatically recover from such event and the
manual remount of the NFS filesystems on every NFS client might be needed.
Author: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Dan Vatca <dan.vatca@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>