sbruno [Fri, 18 Nov 2016 04:19:21 +0000 (04:19 +0000)]
iflib updates and fixes:
- reset gen on down
- initialize admin task statically
- drain mp_ring on down
- don't drop context lock on stop
- reset error stats on down
- fix typo in min_latency sysctl
- return ENOBUFS from if_transmit if the driver isn't running or the link is down
Submitted by: mmacy@nextbsd.org
Reviewed by: shurd
MFC after: 2 days
Sponsored by: Isilon and Limelight Networks
Differential Revision: https://reviews.freebsd.org/D8558
glebius [Fri, 18 Nov 2016 00:13:30 +0000 (00:13 +0000)]
If FreeBSD source tree is a subproject of a bigger project, then .git or
.hg may reside above FreeBSD sources root. Provide function findvcs()
that will climb up and seek for presence of a VCS directory.
ivadasz [Thu, 17 Nov 2016 21:52:00 +0000 (21:52 +0000)]
[net80211] Don't check bgscanidle setting in net80211 for full-offload scan.
If full-offload scan is used, the NIC driver (or rather the firmware of
the NIC) should take care of interrupting and continuing the background
scan. So net80211 should ignore the vap->iv_bgscanidle setting then, instead
the NIC driver might look at this setting and pass it on to the firmware
in some way if possible.
Since full-offload scans won't be explicitly interrupted by net80211, it
also doesn't really make sense to check the vap->iv_bgscanidle condition
in that case, before starting a background scan. If the NIC driver
advertises background scan support and full-offload scanning, the firmware
should be able to execute that scan without interfering too much with our
data traffic.
glebius [Thu, 17 Nov 2016 21:36:18 +0000 (21:36 +0000)]
Add flag SF_USER_READAHEAD to sendfile(2). When specified, the syscall won't
do any speculations about readahead, and use exactly the amount of readahead
specified by user. E.g. setting SF_FLAGS(0, SF_USER_READAHEAD) will guarantee
that no readahead at all will be performed.
glebius [Thu, 17 Nov 2016 21:02:55 +0000 (21:02 +0000)]
Use bogus_page to properly reduce number of I/Os in sendfile(2). The new
sendfile_swapin() loop works this way:
- Find first invalid page in the request.
- Do vm_pager_has_page() and get count of pages, that can be taken in
single I/O.
- Trim valid pages from the end of the request.
- Cycle through the request and substitute to bogus_page all valid
pages that are in the middle of the request.
- After I/O launched (pager copies array of pages into buf(9), it
is important to restore proper page pointers with help vm_page_lookup().
Count bogus pages used and report them in sendfile stats.
mav [Thu, 17 Nov 2016 21:01:27 +0000 (21:01 +0000)]
After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.
Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.
While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.
asomers [Thu, 17 Nov 2016 20:42:56 +0000 (20:42 +0000)]
Fix "camcontrol rescan" with SATA drives behind a SAS controller
A bug in CAM's serial number hash logic resulted in SATA drives behind a SAS
controller getting removed and readded anytime the drive was rescanned for
any reason.
glebius [Thu, 17 Nov 2016 20:32:32 +0000 (20:32 +0000)]
- If caller specifies readbehind and readahead that together with count
doesn't fit into a buf, then trim readbehind and readahead evenly. If
rbehind was limited by the previous BMAP, then roundup its trim to
block size.
- Add KASSERT to check that b_blkno has proper offset from original
blkno returned by BMAP. [1]
- Add KASSERT to check that pages in buf are consecutive.
ivadasz [Thu, 17 Nov 2016 20:00:20 +0000 (20:00 +0000)]
[iwm] Sync iwm_nvm_read_chunk() function with Linux iwlwifi.
This fixes an error handling detail in iwm_nvm_read_chunk(), where an
error response from the firmware for an NVM read shouldn't be fatal if
the offset was non-zero.
tsoome [Thu, 17 Nov 2016 19:38:30 +0000 (19:38 +0000)]
loader: zfs toplevel vdev must have spa set.
The salt based checksum mechanisms, such as skein, are storing the seed
in spa structure, and need to access the spa to use the seed. The current
mechanism for quick access to correct spa is via pointer provided by
vdev structure, but unfortunately the current code does set spa only
for the leaf vdev. This patch will fix the issue by making sure the
loader zfs reader will set spa also for top-level vdevs.
tsoome [Thu, 17 Nov 2016 18:38:35 +0000 (18:38 +0000)]
loader: beri_sdcard_disk_print() needs to return int.
The https://reviews.freebsd.org/rS308434 did change the return type for
dv_print callbacks, but the return type for beri_sdcard_disk_print()
was left unchanged, causing compile errors.
emaste [Thu, 17 Nov 2016 18:12:17 +0000 (18:12 +0000)]
crunchide: report explicit error for combined string table
Some tools produce objects with a combined strtab and shstrtab.
These objects are not supported by crunchide since it rewrites the
symtab and strtab to "hide" symbols. This invalidates section header
offsets into a combined strtab/shstrtab.
In the future we could support these objects (by ensuring that we retain
unmodified section name strings in the output .strtab, and then rewriting
each section header's sh_name).
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
br [Thu, 17 Nov 2016 16:06:53 +0000 (16:06 +0000)]
Do not reallocate driver softc for uart unnecessarily.
Do not assume that all uart drivers use uart_softc structure as is.
Some do a sensible thing and do declare their uart class and driver
properly and arrive into uart_bus_attach with suitably sized softc.
br [Thu, 17 Nov 2016 15:37:44 +0000 (15:37 +0000)]
Make gpiobus early driver at BUS_PAS_BUS.
The gpiobus driver is attached explicitly and generally should be
at the same pass as its parent. Making it use BUS_PAS_BUS ensures
that it attaches immediately after parent adds it (assuming the
parent itself attached at BUS_PAS_BUS and above).
fanf [Thu, 17 Nov 2016 15:19:06 +0000 (15:19 +0000)]
More robust handling of whois referrals from RIRs.
An example problem case is 163.1.0.0 (University of Oxford)
which is in an APNIC ERX address range. Previously we assumed
that ARIN has the correct information for all ERX allocations,
but in this case ARIN refers back to APNIC, rather than referring
to RIPE. This caused whois to loop.
Whois will no longer loop back and forth forever between two RIRs
that don't have an answer, but instead try the other RIRs in turn.
bz [Thu, 17 Nov 2016 14:03:44 +0000 (14:03 +0000)]
Writing out the L2TP control packet requires 12 bytes of
contiguous memory but in one path we did not always guarantee this,
thus do a m_pullup() there.
PR: 214385
Submitted by: Joe Jones (joeknockando googlemail.com)
MFC after: 3 days
mizhka [Thu, 17 Nov 2016 07:33:37 +0000 (07:33 +0000)]
[etherswitch] add infineon adm6996fc support on etherswitch
This is Infineon ADM6996FC/M/MX driver code on etherswitch framework.
Support PORT and DOT1Q VLAN.
This code suppose ADM6996FC SDC/SDIO connect to SOC network interface
MDC/MDIO.
This code tested on Netgear WGR614Cv7.
imp [Wed, 16 Nov 2016 17:11:05 +0000 (17:11 +0000)]
Allow installworld to be skipped as well as installkernel with -W.
Allow -B to mean -K -W.
There are times when fixing non-base elementes of the build that you
don't want to wait to get a completely clean world install. This
allows that at the cost of a little danger.
hselasky [Wed, 16 Nov 2016 14:39:03 +0000 (14:39 +0000)]
Make sure MAC address is reprogrammed when if_init() callback is
invoked. Else promiscious mode must be used to pass traffic. While at
it fix a debug print macro.
The feature enables us to pass through physical PCIe devices to FreeBSD VM
running on Hyper-V (Windows Server 2016) to get near-native performance with
low CPU utilization.
The patch implements a PCI bridge driver to support the feature:
1) The pcib driver talks to the host to discover device(s) and presents
the device(s) to FreeBSD's pci driver via PCI configuration space (note:
to access the configuration space, we don't use the standard I/O port
0xCF8/CFC method; instead, we use an MMIO-based method supplied by Hyper-V,
which is very similar to the 0xCF8/CFC method).
2) The pcib driver allocates resources for the device(s) and initialize
the related BARs, when the device driver's attach method is invoked;
3) The pcib driver talks to the host to create MSI/MSI-X interrupt
remapping between the guest and the host;
4) The pcib driver supports device hot add/remove.
dexuan [Wed, 16 Nov 2016 09:02:17 +0000 (09:02 +0000)]
hyperv/vmbus: add a new method to get vcpu_id
vcpu_id is host's representation of guest CPU.
We get the mapping between vcpu_id and FreeBSD kernel's cpu id when VMBus
driver is loaded. Later, when a driver, like the coming pcib driver, talks
to the host and needs to refer to a guest CPU, the driver must use the
vcpu_id.
jhibbits [Wed, 16 Nov 2016 05:24:42 +0000 (05:24 +0000)]
Simplify the page tracking for VA<->PA translations.
Drop the tracking down to the pmap layer, with optimizations to only track
necessary pages. This should give a (slight) performance improvement, as well
as a stability improvement, as the tracking is already mostly handled by the
pmap layer.
kan [Wed, 16 Nov 2016 03:19:36 +0000 (03:19 +0000)]
Set endianness and floating point flags explicitly for MIPS targets
The tree can be build with an external toolchain that will not
necessarily default to desired settings, so we have to specify
the required flags explicitly to force the required compilation
mode.
jhibbits [Wed, 16 Nov 2016 02:14:07 +0000 (02:14 +0000)]
Add a GPIO poweroff and reset driver.
Summary:
This implements part of the gpio-poweroff and gpio-restart device tree
bindings. Optional properties are not handled currently. It also currently
only supports level-triggered reset.
Rather than printing a warning for every time we receive a fileid > 2^32
from the NFS server, count warnings and print at most one of each warning
type per minute, e.g.,
Nov 15 05:17:34 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (24730 occurrences)
Nov 15 05:17:56 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (178 occurrences)
Nov 15 05:18:53 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (7582 occurrences)
Nov 15 05:18:58 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (23 occurrences)
A buildworld with an NFS mounted /usr/obj can otherwise result in
hundreds of thousands of lines being printed, which seems unnecessarily
verbose.
When ino_t becomes a 64-bit type, these printfs will no longer be needed
(and the problems associated with truncating 64-bit fileids to generate
32-bit inode numbers will also go away).
jmcneill [Tue, 15 Nov 2016 23:48:30 +0000 (23:48 +0000)]
On command error, reset only DMA and FIFO engines instead of the entire
controller. Fixes eMMC device detection on OrangePi Plus 2e (and likely
others).
mizhka [Tue, 15 Nov 2016 20:44:19 +0000 (20:44 +0000)]
[MIPS] Fix Config3[ULRI] printing
Bit identifier of printf %b is octal integer, but not decimal. ULRI bit is
13-th bit (starting with 0) according to MIPS Architecture Volume III v.6.
In this case the bit identifier (starts with 1) should be \16.
alc [Tue, 15 Nov 2016 18:22:50 +0000 (18:22 +0000)]
Remove most of the code for implementing PG_CACHED pages. (This change does
not remove user-space visible fields from vm_cnt or all of the references to
cached pages from comments. Those changes will come later.)
jhb [Tue, 15 Nov 2016 17:01:48 +0000 (17:01 +0000)]
Sync instruction cache's after writing user breakpoints on MIPS.
Add an implementation for pmaps_sync_icache() on MIPS that sync's the
instruction cache on all CPUs via smp_rendezvous() after a debugger
inserts a breakpoint via ptrace(PT_IO).
Tested by: kan (on Creator CI20 running Ingenic JZ4780 SOC)
MFC after: 2 weeks
Sponsored by: DARPA / AFRL
kib [Tue, 15 Nov 2016 09:43:26 +0000 (09:43 +0000)]
Pass CPUID[1] %edx (cpu_feature), %ecx (cpu_feature2) and
CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the
ifunc resolvers on x86.
It is much more clean to use CPUID instruction in usermode to retrieve
this information than to pass AT_HWCAP aux vector from kernel, on
x86. Still, the change does allow for use of AT_HWCAP on arches where it is
needed, by passing aux array to ifunc_init() initializer which should
prepare arguments for ifunc resolvers.
Current signature for resolvers on x86 is
func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2,
uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2);
where arguments have identical meaning as the kernel variables of the
same name. The ABIs allow to use resolvers with the void or shortened
list of arguments.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D8448
manu [Tue, 15 Nov 2016 07:08:33 +0000 (07:08 +0000)]
Upstream DTS provides PLL3 and PLL7 nodes (and their x2 form),
so remove them from our DTS and adapt the code to handle them correctly.
This fix HDMI video on A20.
mjg [Tue, 15 Nov 2016 03:38:05 +0000 (03:38 +0000)]
cache: fix a race between entry removal and demotion
The negative list shrinker can demote an entry with only hotlist + neglist
locks held. On the other hand entry removal possibly sets the NCF_DVDROP
without aformentioned locks held prior to detaching it from the respective
netlist., which can lose the update made by the shrinker.
sephe [Tue, 15 Nov 2016 02:36:12 +0000 (02:36 +0000)]
hyperv/vss: Add driver and tools for VSS
VSS stands for "Volume Shadow Copy Service". Unlike virtual machine
snapshot, it only takes snapshot for the virtual disks, so both
filesystem and applications have to aware of it, and cooperate the
whole VSS process.
This driver exposes two device files to the userland:
/dev/hv_fsvss_dev
Normally userland programs should _not_ mess with this device file.
It is currently used by the hv_vss_daemon(8), which freezes and
thaws the filesystem. NOTE: currently only UFS is supported, if
the system mounts _any_ other filesystems, the hv_vss_daemon(8)
will veto the VSS process.
If hv_vss_daemon(8) was disabled, then this device file must be
opened, and proper ioctls must be issued to keep the VSS working.
/dev/hv_appvss_dev
Userland application can opened this device file to receive the
VSS freeze notification, hold the VSS for a while (mainly to flush
application data to filesystem), release the VSS process, and
receive the VSS thaw notification i.e. applications can run again.
The VSS will still work, even if this device file is not opened.
However, only filesystem consistency is promised, if this device
file is not opened or is not operated properly.
hv_vss_daemon(8) is started by devd(8) by default. It can be disabled
by editting /etc/devd/hyperv.conf.
adrian [Tue, 15 Nov 2016 01:47:37 +0000 (01:47 +0000)]
[net80211] announce 11n capabilities in probe requests in IBSS mode.
The 802.11-2012 specification notes that a subset of IEs should be present
in IBSS probe requests. This is what (initially) allows nodes to discover
that other nodes are 11n capable. Notably - HTCAP, but not HTINFO.
This isn't everything required to reliably enable 11n between net80211
peers; there's more work to come.
adrian [Tue, 15 Nov 2016 01:41:45 +0000 (01:41 +0000)]
[mips] enable relbuf on mips for now to work around page aliasing in mips hardware.
Although the higher end MIPS hardware handles cache aliasing issues in
hardware, the older cores (r4k, etc) and some compile versions of the
newer cores (mips24k, mips34k, mips74k) don't have this feature.
This means we end up with some very unfortunate behaviour that was
made very obvious by some recent changes to the FFS pager by kib.
So, flip this off until we get our MIPS pmap/cache code upgraded to
handle aliased pages in software.
mizhka [Mon, 14 Nov 2016 21:38:36 +0000 (21:38 +0000)]
[MIPS] Print Config7 on boot for several MIPS architectures
Config7 contains useful fields, for instance, field AR indicating that the D-cache is configured to avoid cache aliases. This patch brings printing of config7 for MIPS 24K, 74K, 1004K.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D8514
loos [Mon, 14 Nov 2016 20:57:30 +0000 (20:57 +0000)]
Add the cpsw, the NIC driver for ti/am335x, to GENERIC kernel.
While here:
- remove 'device mii' - included by miibus;
- remove 'device smcphy' - included by miibus;
- sorted the network drivers list;
- added a comment about miibus based on amd64/GENERIC.