Zbigniew Bodek [Thu, 31 Mar 2016 13:23:43 +0000 (13:23 +0000)]
Don't omit m_dup() for non-writeable mbufs that need checksum calculation
If the driver is not active or link is down the packet could remain
non-writeable. This commit makes all mbufs enqueued to the driver's
ring buffer to have correct attributes.
Zbigniew Bodek [Thu, 31 Mar 2016 13:13:38 +0000 (13:13 +0000)]
Fix MAC address configuration for VNIC
The FDT description is as follows:
- phy-handle, reg, qlm-mode, mac-address are under nodes in bgx0/1 node
- phy nodes (pointed by phy-handle) are under MDIO even though they may
not be connected through to MDIO. In those nodes they do not contain
MAC address or etc.
This commit changes parsing of the FDT nodes for BGX so that it can
obtain correct MAC address for a given PHY.
Zbigniew Bodek [Thu, 31 Mar 2016 13:10:29 +0000 (13:10 +0000)]
Improve TX path of the VNIC driver
- Avoid memory leak when nicvf_tx_mbuf_locked() fails
- Introduce nicvf_xmit_locked() routine that uses drbr_peek(),
drbr_advance() or drbr_putback() for a specific ifnet.
This gives more clear and efficient design as well as
prevents from dropping mbufs that where not sent due to temporary
lack of descriptors.
- Add missing ETHER_BPF_MTAP() hook
Zbigniew Bodek [Thu, 31 Mar 2016 11:18:52 +0000 (11:18 +0000)]
Disable MSI-x for AHCI on Alpine plattform
Changes introduced to AHCI code adding support for MSI-x
caused interrupt storm on Alpine boards.
This is unintended behaviour so added quirk to omit this functionality.
Andrew Turner [Thu, 31 Mar 2016 11:07:24 +0000 (11:07 +0000)]
Add support for 4 level pagetables. The userland address space has been
increased to 256TiB. The kernel address space can also be increased to be
the same size, but this will be performed in a later change.
To help work with an extra level of page tables two new functions have
been added, one to file the lowest level table entry, and one to find the
block/page level. Both of these find the entry for a given pmap and virtual
address.
This has been tested with a combination of buildworld, stress2 tests, and
by using sort to consume a large amount of memory by sorting /dev/zero. No
new issues are known to be present from this change.
Reviewed by: kib
Obtained from: ABT Systems Ltd
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5720
Adrian Chadd [Thu, 31 Mar 2016 04:57:38 +0000 (04:57 +0000)]
Add support for the Nuvoton NCT5104D.
Make it compile only for i386/amd64 for now as it's been tested there.
It's quite possible it'll show up elsewhere and we can enable it
for other architectures later.
Bryan Drewery [Thu, 31 Mar 2016 03:04:26 +0000 (03:04 +0000)]
hosttools: Trim unneeded directories.
These should only be build tools that are in various Makefile.depend
as host dependencies. Anything toolchain related is handled by
toolchain and bootstrap-tools currently.
Bryan Drewery [Thu, 31 Mar 2016 00:26:40 +0000 (00:26 +0000)]
DIRDEPS_BUILD: Don't reset OBJROOT in sub-makes.
MAKEOBJDIRPREFIX is set to blank and exported from MAKELEVEL0 along
with OBJROOT exported. In sub-makes OBJROOT is recalculated with
an empty MAKEOBJDIRPREFIX though.
Bryan Drewery [Wed, 30 Mar 2016 23:50:29 +0000 (23:50 +0000)]
Fix the external GCC build after r297271 by setting -L <sysroot>/usr/lib.
GCC does add <sysroot>/usr/lib to the library search path but it comes after
/usr/local/lib which can find ports libraries such as libedit.so. The
bad path comes in as /usr/local/lib/gcc/x86_64-portbld-freebsd11.0/5.3.0/../../../
which corresponds to <prefix>/lib.
This partially reverts r297271.
Pointyhat to: bdrewery
Sponsored by: EMC / Isilon Storage Division
Ed Maste [Wed, 30 Mar 2016 14:42:09 +0000 (14:42 +0000)]
libc: stop exporting cerror
i386 stopped exporting .cerror in r240152, and likewise for amd64 in
r240178. It is not used by other libraries on any platform, so apply
the same change to the remaining architectures.
Reviewed by: jhibbits, jilles
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5774
Navdeep Parhar [Wed, 30 Mar 2016 01:08:08 +0000 (01:08 +0000)]
Remove unnecessary dequeue_mutex (added in r294610) from the iWARP
connection manager. Examining so_comp without synchronization with
iw_so_event_handler is a harmless race.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Reviewed by: Steve Wise @ Open Grid Computing
Sponsored by: Chelsio Communications
Adrian Chadd [Wed, 30 Mar 2016 00:42:18 +0000 (00:42 +0000)]
[net80211] Add fields to decode uAPSD fields.
It turns out that madwifi actually has the basics for uAPSD implemented
but it was never ported to FreeBSD. I may eventually port most of the
pieces; I'll see how it goes!
Do not access buffer if bread(9) or cluster_read(9) failed. On error,
the functions free the buffer and set the pointer to NULL. Also
remove useless call to brelse(9) on the error path.
Gleb Smirnoff [Tue, 29 Mar 2016 19:57:11 +0000 (19:57 +0000)]
The sendfile(2) allows to send extra data from userspace before the file
data (headers). Historically the size of the headers was not checked
against the socket buffer space. Application could easily overcommit the
socket buffer space.
With the new sendfile (r293439) the problem remained, but a KASSERT was
inserted that checked that amount of data written to the socket matches
its space. In case when size of headers is bigger that socket space,
KASSERT fires. Without INVARIANTS the new sendfile won't panic, but
would report incorrect amount of bytes sent.
o With this change, the headers copyin is moved down into the cycle, after
the sbspace() check. The uio size is trimmed by socket space there,
which fixes the overcommit problem and its consequences.
o The compatibility handling for FreeBSD 4 sendfile headers API is pushed
up the stack to syscall wrappers. This required a copy and paste of the
code, but in turn this allowed to remove extra stack carried parameter
from fo_sendfile_t, and embrace entire compat code into #ifdef. If in
future we got more fo_sendfile_t function, the copy and paste level would
even reduce.
Fix several bugs in r297374:
- fix UP build [1]
- do not obliterate initial reading of rdtsc by the loop counter [2]
- restore the meaning of the argument -1 to native_lapic_ipi_wait()
as wait until LAPIC acknowledge without timeout
- correct formula for calculating loop iteration count for 1us, it was
inverted, and ensure that even on unlikely slow CPUs at least one
check for ack is performed.
Reported by: Michael Butler <imb@protected-networks.net> [1], rpokala[2],
jhb[3]
Tested by: Michael Butler
Pointy hat to: kib
Sponsored by: The FreeBSD Foundation
Mark Johnston [Tue, 29 Mar 2016 19:23:00 +0000 (19:23 +0000)]
Modify nd6_llinfo_timer() to acquire the nd6 lock before the LLE lock.
When expiring a neighbour cache entry we may need to look up the associated
default router, which requires the nd6 read lock. To avoid an LOR, the nd6
lock should be acquired first.
X-MFC-With: r296063
Tested by: Larry Rosenman <ler@lerctr.org> (previous revision)
Alexander Motin [Tue, 29 Mar 2016 19:18:34 +0000 (19:18 +0000)]
Modify "4958 zdb trips assert on pools with ashift >= 0xe".
Unlike Illumos FreeBSD has concept of logical ashift, that specifies
really minimal vdev block size that can be accessed. This knowledge
allows properly pad physical I/O and correctly assert its alignment.
This change fixes L2ARC write errors when device has logical sector
size above 512 bytes.
This driver works in PIO mode for now, interrupts are available only when
FIFO is enabled. The FIFO cannot be used with arbitrary sizes which defeat
its general use.
At some point we can add DMA transfers where the FIFO can be more useful.
Bryan Drewery [Tue, 29 Mar 2016 16:07:51 +0000 (16:07 +0000)]
Reword descriptions of asserting locks held without WITNESS.
This corrects an error in r296947 in that it is not possible to assert
which thread holds a shared (or read) lock, but it is possible to assert
that one is held. Just not very useful.
Import portions of the PowerPC OF PCI implementation into new file
"ofwpci.c", common for other platforms. The files ofw_pci.c and ofw_pci.h
from sys/powerpc/ofw no longer exist. All required declarations are moved
to sys/dev/ofw/ofwpci.h. This creates a new ofw_pci_write_ivar() function
and modifies some others methods. Most functions contain existing ppc
implementations in the majority unchanged. Now there is no need to have
multiple identical copies of methods for various architectures.
Andrew Turner [Tue, 29 Mar 2016 13:51:26 +0000 (13:51 +0000)]
Read the CPU ID for the current CPU from the GIC. The GIC may have a
different ID space than the kernel. Because of this we need to read the
ID from the hardware. The hardware will provide this value to the CPU by
reading any of the first 8 Interrupt Processor Targets Registers.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5706
Zbigniew Bodek [Tue, 29 Mar 2016 13:31:09 +0000 (13:31 +0000)]
Improve HW checksums support in VNIC
- Do not mark CSUM_IP_CHECKED and CSUM_IP_VALID on IPv6 packets.
IPv6 does not have checksums by definition.
- Set SCTP packets csum_flags CSUM_SCTP_VALID instead of
CSUM_DATA_VALID and skip csum_data
- Set csum_data simply as 0xffff without byteswap
Calibrate the frequency of the of the native_lapic_ipi_wait() loop,
and avoid a delay while waiting for IPI delivery acknowledgement in
xAPIC mode. This makes the loop exit immediately after the delivery
bit in APIC_ICR register is set, instead of waiting for some
microseconds.
We only need to ensure that some amount of time is allowed for the
LAPIC to react to the command, and we need that the wait time is
finite and reasonable. For that reasons, it is irrelevant if the CPU
frequency or throttling decrease the speed and make the loop,
calibrated for full CPU speed at boot time, execute somewhat slower.
Discussed with: bde, jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
John Baldwin [Mon, 28 Mar 2016 21:51:56 +0000 (21:51 +0000)]
Don't start the random harvester process until timers are working.
This is a no-op currently, but in kernels with earlier AP startup, the
random kthread was trying to use timeouts with sleeps before timers are
working. Wait until SI_SUB_KICK_SCHEDULER to start the random kproc.
Warren Block [Mon, 28 Mar 2016 17:42:14 +0000 (17:42 +0000)]
Replace "user land", which, for any definition of the word "user",
sounds like some kind of horrific theme park. "Hey kids, want to go to
User Land?" "No! We'll be good!"
The obvious replacement is "userland", a compound word replete with
term-of-art meaning and just a hint of cautionary tale. The alternate
terms "flugelhorn" and "bullfrog", while also good, are less well-known
and were voted down in committee.
Do not load LAPIC_DCR_TIMER with an undefined value. If we are in the
deadline mode the divide configuration is not used and
lapic_timer_divisor is not set.
Reported by: dhw, mav
Tested by: mav
Sponsored by: The FreeBSD Foundation
Use TSC deadline mode for LAPIC timer, when available. The mode fires
LAPIC timer iinterrupt when TSC reaches the value written to the
IA32_TSC_DEADLINE MSR. To arm or reset the timer in deadline mode, a
single non-serializing MSR write is enough. This is an advance from
the one-shot mode of LAPIC, where timer operated with the FSB
frequency and required two (serialized in case of xAPIC) writes to the
APIC registers.
The LVT_TIMER register value is cached to avoid unneeded writes in the
deadline mode. Unused arguments to specify period (which is passed in
struct lapic as la_timer_period) and interrupt enable (which is always
enabled) are removed from lapic_timer_{oneshot,periodic,deadline}
functions. Instead, special lapic_timer_oneshot_nointr() function for
interrupt-less one-shot calibration is added.
Reviewed by: mav (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D5738
Warner Losh [Mon, 28 Mar 2016 04:22:22 +0000 (04:22 +0000)]
Sometimes, it's useful to export the entire line to an external
program without listening to the devd socket for all events. Define
two new pseudo variables $*, the entire event from devctl and $_,
the entire event without the type character, since it might be easier
to use in some circumstances.
Ian Lepore [Sun, 27 Mar 2016 23:16:37 +0000 (23:16 +0000)]
Do not try to install a default route for each interface found, because
only the first one will actually work and all the others just result in
errors (which would get printed but otherwise ignored).
Instead, wait until we make a choice of which interface will be used to
mount the rootfs, and install the default route associated with it (if any).
After doing the md_mount() call to obtain the needed info, remove the
default route again, and transcribe the route info into the nfs_diskless
structure. If the system eventually chooses to mount the nfs rootfs, the
default route will be installed again when the nfs_diskless code
re-initializes the interface.
The theory here is that since we can only have one default route, the one
most likely to be correct for mounting the rootfs is the one that was
delivered along with the rootpath option.
Ian Lepore [Sun, 27 Mar 2016 22:58:56 +0000 (22:58 +0000)]
Stop setting the default route to the IP of the interface itself when the
bootp/dhcp server doesn't provide a router option. Doing so prevents
setting defaultrouter=<ip> in rc.conf (it fails because there's already
a bogus default route installed by bootpc_init).
When an admin wants to use this style of proxy arp on an interface, the
proper mechanism is to set the "use-lease-addr-for-default-route" flag
in the dhcp server config. That causes the lease address to be delivered
in the routers option, and the normal handling of the routers option will
then install the self-ip as the default route.
Ian Lepore [Sun, 27 Mar 2016 22:36:32 +0000 (22:36 +0000)]
Switch bootpc_adjust_interface() from returning int to void. Its one caller
doesn't check for errors, and all the errors that can happen result in it
calling panic anyway, except for one that's really more of a warning (and
is going to disappear on an upcoming commit anyway).
Ian Lepore [Sun, 27 Mar 2016 22:21:34 +0000 (22:21 +0000)]
Set ifctx->gotrootpath=1 only when the root path came from the dhcp/bootp
server (and not when it came from a fallback method such as the ROOTDEVNAME
option). This makes the code in bootpc_init() choose the first interface
that provided a rootpath name. Previously it was choosing the first
interface that got an IP address, which could be on a different and
potentially unreachable subnet than the server providing the rootfs.
If the rootpath name actually does come from a fallback source, then the
code continues to use the first interface in the list that got configured.
Note that this wasn't directly reported in the PR cited below, but was
discovered while working on that PR.
Pedro F. Giffuni [Sun, 27 Mar 2016 20:02:21 +0000 (20:02 +0000)]
netstat: avoid returning uninitialized value in p_sockaddr().
In the case the width is less than 0, we are returning an uninitialized
value. For practical purposes the return value is ignored but initialize
it to avoid trouble.
Kristof Provost [Sun, 27 Mar 2016 17:22:27 +0000 (17:22 +0000)]
pf: Friendly error message for status if pf.ko is not loaded
Check if pf.ko is loaded (i.e. /dev/pf exists) before trying to use it. This
means that '/etc/rc.d/pf status' will no longer return 'pfctl: /dev/pf: No such
file or directory' but 'pf.ko is not loaded'.
PR: 205671
Submitted by: Johannes Jost Meixner <xmj@FreeBSD.org>