Ed Schouten [Thu, 31 Mar 2016 18:50:06 +0000 (18:50 +0000)]
Sync in the latest CloudABI system call definitions.
Some time ago I made a change to merge together the memory scope
definitions used by mmap (MAP_{PRIVATE,SHARED}) and lock objects
(PTHREAD_PROCESS_{PRIVATE,SHARED}). Though that sounded pretty smart
back then, it's backfiring. In the case of mmap it's used with other
flags in a bitmask, but for locking it's an enumeration. As our plan is
to automatically generate bindings for other languages, that looks a bit
sloppy.
Change all of the locking functions to use separate flags instead.
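The difference between the two uses can be illustrated with a small sketch.
All names and values below (EX_MAP_*, ex_scope_t) are hypothetical and are
not the CloudABI definitions; they only show why a bitmask member and an
enumeration member are awkward to share.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mmap-style flags: independent bits that may be OR-ed. */
#define EX_MAP_ANON    0x01u
#define EX_MAP_FIXED   0x02u
#define EX_MAP_PRIVATE 0x04u
#define EX_MAP_SHARED  0x08u

/* Hypothetical lock scope: exactly one value at a time. */
typedef enum {
	EX_SCOPE_PRIVATE = 1,
	EX_SCOPE_SHARED = 2
} ex_scope_t;

/* A bitmask is queried with a mask test. */
static bool
ex_map_is_shared(uint32_t flags)
{
	/* Sanity check: the two sharing modes are mutually exclusive. */
	assert((flags & (EX_MAP_PRIVATE | EX_MAP_SHARED)) !=
	    (EX_MAP_PRIVATE | EX_MAP_SHARED));
	return ((flags & EX_MAP_SHARED) != 0);
}

/* An enumeration is queried by equality; OR-ing members is meaningless. */
static bool
ex_scope_is_valid(ex_scope_t scope)
{
	return (scope == EX_SCOPE_PRIVATE || scope == EX_SCOPE_SHARED);
}
```

A binding generator can map the first group to a flags type and the second
to a plain enum only if the two are declared separately.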
John Baldwin [Thu, 31 Mar 2016 18:10:29 +0000 (18:10 +0000)]
Rework handling of thread sleeps before timers are working.
Previously, calls to *sleep() and cv_*wait*() immediately returned during
early boot. Instead, permit threads that request a sleep without a
timeout to sleep as wakeup() works during early boot. Sleeps with
timeouts are harder to emulate without working timers, so just punt and
panic explicitly if any thread tries to use those before timers are
working. Any threads that depend on timeouts should either wait until
SI_SUB_KICK_SCHEDULER to start or they should use DELAY() until timers
are available.
Until APs are started earlier this should be a no-op as other kthreads
shouldn't get a chance to start running until after timers are working
regardless of when they were created.
John Baldwin [Thu, 31 Mar 2016 17:27:30 +0000 (17:27 +0000)]
Tidy up the unmapped I/O code in qphysio.
- Move some blocks around to reduce the number of 'if (unmap)' checks.
- Use 'pbuf == NULL' instead of 'unmap'.
- Use nitems.
- Pull an assignment out of an if expression.
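Two of the cleanups listed above, using nitems and keeping an assignment out
of an if expression, might look like the following in isolation. This is a
stand-alone sketch, not the qphysio code; ex_find() and ex_index_of() are
hypothetical, and the macro mirrors the nitems() definition from
<sys/param.h> so the example builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors nitems() from FreeBSD's <sys/param.h>. */
#ifndef nitems
#define nitems(x) (sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical lookup standing in for any call whose result is tested. */
static const int *
ex_find(const int *tbl, size_t n, int key)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++)
		if (tbl[i] == key)
			return (&tbl[i]);
	return (NULL);
}

/* Returns the index of key in a fixed table, or -1 if it is absent. */
static int
ex_index_of(int key)
{
	static const int table[] = { 3, 5, 8, 13 };
	const int *p;

	/* The assignment is its own statement, not buried in the if (). */
	p = ex_find(table, nitems(table), key);
	if (p == NULL)
		return (-1);
	return ((int)(p - table));
}
```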
Bryan Drewery [Thu, 31 Mar 2016 17:27:17 +0000 (17:27 +0000)]
WITHOUT_TOOLCHAIN: Skip building of h_raw.
-fsanitize does not seem to work when a --sysroot is specified and there
is no <sysroot>/usr/lib/clang/3.8.0/lib/freebsd/libclang_rt.ubsan_standalone-*.a.
Bryan Drewery [Thu, 31 Mar 2016 17:27:01 +0000 (17:27 +0000)]
WITHOUT_TOOLCHAIN: Fix build of rtld.
MK_TOOLCHAIN==no disables building and installing of pic archives.
c_pic.a is still needed for rtld though so force it to build in lib/libc
and link directly to the objdir version of it for rtld.
Zbigniew Bodek [Thu, 31 Mar 2016 16:44:32 +0000 (16:44 +0000)]
Fix number of the enabled VFs in VNIC
nic->num_vf_en is set based on the number of the enabled LMACs.
This number should not be overwritten later by any routine.
Instead, the driver should fail PCI_IOV_ADD_VF() so that available VFs
with the corresponding LMACs attach, whereas the other, disabled
VFs fail with the proper error code.
Error signaling (due to improper number of VFs requested) is also moved
from PCI_IOV_INIT() to PCI_IOV_ADD_VF().
This will be reworked when multiple queue sets are enabled but for
now this is the correct behavior of the driver.
Zbigniew Bodek [Thu, 31 Mar 2016 13:23:43 +0000 (13:23 +0000)]
Don't omit m_dup() for non-writeable mbufs that need checksum calculation
If the driver is not active or the link is down, the packet could remain
non-writeable. This commit ensures that all mbufs enqueued to the driver's
ring buffer have the correct attributes.
Zbigniew Bodek [Thu, 31 Mar 2016 13:13:38 +0000 (13:13 +0000)]
Fix MAC address configuration for VNIC
The FDT description is as follows:
- phy-handle, reg, qlm-mode, mac-address are under nodes in bgx0/1 node
- phy nodes (pointed to by phy-handle) are under MDIO even though they may
not be connected through MDIO. Those nodes do not contain a
MAC address, etc.
This commit changes parsing of the FDT nodes for BGX so that it can
obtain correct MAC address for a given PHY.
Zbigniew Bodek [Thu, 31 Mar 2016 13:10:29 +0000 (13:10 +0000)]
Improve TX path of the VNIC driver
- Avoid memory leak when nicvf_tx_mbuf_locked() fails
- Introduce nicvf_xmit_locked() routine that uses drbr_peek(),
drbr_advance() or drbr_putback() for a specific ifnet.
This gives a clearer and more efficient design and also
avoids dropping mbufs that were not sent due to a temporary
lack of descriptors.
- Add missing ETHER_BPF_MTAP() hook
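The peek/advance/putback pattern named above can be modelled in user space
as follows. This is a toy sketch under stated assumptions: the ring holds
integer packet ids instead of mbufs, and every ex_* name is hypothetical. It
does not use the drbr API; it only shows why a packet that cannot be sent
stays queued instead of being dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_RING_SIZE 8u

/* Toy FIFO of packet ids standing in for a buf_ring of mbufs. */
struct ex_ring {
	int      slot[EX_RING_SIZE];
	unsigned head;    /* index of the oldest entry */
	unsigned count;   /* number of queued entries */
};

static bool
ex_enqueue(struct ex_ring *r, int pkt)
{
	if (r->count == EX_RING_SIZE)
		return (false);
	r->slot[(r->head + r->count) % EX_RING_SIZE] = pkt;
	r->count++;
	return (true);
}

/* Look at the head entry without removing it. */
static int *
ex_peek(struct ex_ring *r)
{
	return (r->count > 0 ? &r->slot[r->head] : NULL);
}

/* Remove the head entry once it has been handed to the hardware. */
static void
ex_advance(struct ex_ring *r)
{
	assert(r->count > 0);
	r->head = (r->head + 1) % EX_RING_SIZE;
	r->count--;
}

/* Leave the (possibly modified) packet at the head for a later retry. */
static void
ex_putback(struct ex_ring *r, int pkt)
{
	assert(r->count > 0);
	r->slot[r->head] = pkt;
}

/* Hypothetical transmit step: fails when no descriptors are free. */
static bool
ex_tx(int pkt, unsigned *free_desc)
{
	(void)pkt;
	if (*free_desc == 0)
		return (false);
	(*free_desc)--;
	return (true);
}

/* Drain until the ring is empty or a transmit fails; return packets sent. */
static unsigned
ex_xmit_locked(struct ex_ring *r, unsigned *free_desc)
{
	unsigned sent = 0;
	int *p;

	while ((p = ex_peek(r)) != NULL) {
		if (!ex_tx(*p, free_desc)) {
			ex_putback(r, *p);   /* keep it queued, do not drop */
			break;
		}
		ex_advance(r);
		sent++;
	}
	return (sent);
}
```

With three packets queued and two free descriptors, two packets go out and
the third remains at the head of the ring until descriptors are available
again.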
Zbigniew Bodek [Thu, 31 Mar 2016 11:18:52 +0000 (11:18 +0000)]
Disable MSI-x for AHCI on Alpine platform
Changes to the AHCI code that added support for MSI-x
caused an interrupt storm on Alpine boards.
This is unintended behaviour, so add a quirk to skip this functionality.
Andrew Turner [Thu, 31 Mar 2016 11:07:24 +0000 (11:07 +0000)]
Add support for 4 level pagetables. The userland address space has been
increased to 256TiB. The kernel address space can also be increased to be
the same size, but this will be performed in a later change.
To help work with an extra level of page tables two new functions have
been added, one to find the lowest level table entry, and one to find the
block/page level. Both of these find the entry for a given pmap and virtual
address.
This has been tested with a combination of buildworld, stress2 tests, and
by using sort to consume a large amount of memory by sorting /dev/zero. No
new issues are known to be present from this change.
Reviewed by: kib
Obtained from: ABT Systems Ltd
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5720
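The 256TiB figure in the entry above follows from the table geometry. The
sketch below assumes 4 KiB pages and 512 entries per table (9 index bits per
level); neither number is stated in the commit message, and the ex_* names
are hypothetical. Under those assumptions four levels cover 4 * 9 + 12 = 48
address bits, and 2^48 bytes is 256TiB.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12u   /* assumed 4 KiB pages */
#define EX_LEVEL_BITS 9u    /* assumed 512 entries per table */
#define EX_LEVELS     4u
#define EX_VA_BITS    (EX_PAGE_SHIFT + EX_LEVELS * EX_LEVEL_BITS)  /* 48 */

/* Index into the table at the given level (0 = top, 3 = lowest). */
static unsigned
ex_pt_index(uint64_t va, unsigned level)
{
	unsigned shift;

	assert(level < EX_LEVELS);
	shift = EX_PAGE_SHIFT + (EX_LEVELS - 1 - level) * EX_LEVEL_BITS;
	return ((unsigned)((va >> shift) & ((1u << EX_LEVEL_BITS) - 1)));
}

/* Size in bytes of the address space that the four levels can map. */
static uint64_t
ex_va_size(void)
{
	return ((uint64_t)1 << EX_VA_BITS);
}
```

Walking from level 0 down to level 3 with these indices is the kind of
lookup the two new helper functions perform for a given pmap and virtual
address.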
Adrian Chadd [Thu, 31 Mar 2016 04:57:38 +0000 (04:57 +0000)]
Add support for the Nuvoton NCT5104D.
Make it compile only for i386/amd64 for now as it's been tested there.
It's quite possible it'll show up elsewhere and we can enable it
for other architectures later.
Bryan Drewery [Thu, 31 Mar 2016 03:04:26 +0000 (03:04 +0000)]
hosttools: Trim unneeded directories.
These should only be build tools that are in various Makefile.depend
as host dependencies. Anything toolchain related is handled by
toolchain and bootstrap-tools currently.
Bryan Drewery [Thu, 31 Mar 2016 00:26:40 +0000 (00:26 +0000)]
DIRDEPS_BUILD: Don't reset OBJROOT in sub-makes.
MAKEOBJDIRPREFIX is set to blank and exported from MAKELEVEL0 along
with OBJROOT exported. In sub-makes OBJROOT is recalculated with
an empty MAKEOBJDIRPREFIX though.
Bryan Drewery [Wed, 30 Mar 2016 23:50:29 +0000 (23:50 +0000)]
Fix the external GCC build after r297271 by setting -L <sysroot>/usr/lib.
GCC does add <sysroot>/usr/lib to the library search path but it comes after
/usr/local/lib which can find ports libraries such as libedit.so. The
bad path comes in as /usr/local/lib/gcc/x86_64-portbld-freebsd11.0/5.3.0/../../../
which corresponds to <prefix>/lib.
This partially reverts r297271.
Pointyhat to: bdrewery
Sponsored by: EMC / Isilon Storage Division
Ed Maste [Wed, 30 Mar 2016 14:42:09 +0000 (14:42 +0000)]
libc: stop exporting cerror
i386 stopped exporting .cerror in r240152, and likewise for amd64 in
r240178. It is not used by other libraries on any platform, so apply
the same change to the remaining architectures.
Reviewed by: jhibbits, jilles
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5774
Navdeep Parhar [Wed, 30 Mar 2016 01:08:08 +0000 (01:08 +0000)]
Remove unnecessary dequeue_mutex (added in r294610) from the iWARP
connection manager. Examining so_comp without synchronization with
iw_so_event_handler is a harmless race.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Reviewed by: Steve Wise @ Open Grid Computing
Sponsored by: Chelsio Communications
Adrian Chadd [Wed, 30 Mar 2016 00:42:18 +0000 (00:42 +0000)]
[net80211] Add fields to decode uAPSD fields.
It turns out that madwifi actually has the basics for uAPSD implemented
but it was never ported to FreeBSD. I may eventually port most of the
pieces; I'll see how it goes!
Do not access buffer if bread(9) or cluster_read(9) failed. On error,
the functions free the buffer and set the pointer to NULL. Also
remove useless call to brelse(9) on the error path.
Gleb Smirnoff [Tue, 29 Mar 2016 19:57:11 +0000 (19:57 +0000)]
sendfile(2) allows extra data (headers) to be sent from userspace before
the file data. Historically, the size of the headers was not checked
against the socket buffer space, so an application could easily overcommit
the socket buffer space.
With the new sendfile (r293439) the problem remained, but a KASSERT was
inserted that checked that the amount of data written to the socket matches
its space. When the size of the headers is bigger than the socket space,
the KASSERT fires. Without INVARIANTS the new sendfile won't panic, but
it would report an incorrect number of bytes sent.
o With this change, the headers copyin is moved down into the cycle, after
the sbspace() check. The uio size is trimmed by socket space there,
which fixes the overcommit problem and its consequences.
o The compatibility handling for FreeBSD 4 sendfile headers API is pushed
up the stack to syscall wrappers. This required a copy and paste of the
code, but in turn this allowed removing an extra stack-carried parameter
from fo_sendfile_t and enclosing the entire compat code in #ifdef. If in
future we get more fo_sendfile_t functions, the amount of copy and paste
would even be reduced.
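The trimming described in the first item above can be modelled in user space
roughly as follows. This is a simplified sketch, not the kernel code: the
ex_* helpers are hypothetical, the free space is passed in as a plain number
instead of coming from sbspace(), and each pass is assumed to free the same
amount of space.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many header bytes may be copied in on this
 * pass, given the space currently free in the socket buffer.
 */
static size_t
ex_trim_headers(size_t hdr_resid, long space)
{
	assert(space >= 0);   /* the caller waits for space before copying */
	if ((size_t)space < hdr_resid)
		return ((size_t)space);
	return (hdr_resid);
}

/*
 * Simulated send loop: every pass has space_per_pass bytes available and
 * copies only what fits.  Returns the number of passes needed.
 */
static unsigned
ex_send_headers(size_t hdr_len, long space_per_pass)
{
	unsigned passes = 0;
	size_t chunk;

	assert(space_per_pass > 0);
	while (hdr_len > 0) {
		chunk = ex_trim_headers(hdr_len, space_per_pass);
		hdr_len -= chunk;
		passes++;
	}
	return (passes);
}
```

Because no pass ever copies more than the free space, the amount written
always matches what the socket buffer could accept.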
Fix several bugs in r297374:
- fix UP build [1]
- do not obliterate initial reading of rdtsc by the loop counter [2]
- restore the meaning of the argument -1 to native_lapic_ipi_wait()
as wait until LAPIC acknowledge without timeout
- correct formula for calculating loop iteration count for 1us, it was
inverted, and ensure that even on unlikely slow CPUs at least one
check for ack is performed.
Reported by: Michael Butler <imb@protected-networks.net> [1], rpokala[2],
jhb[3]
Tested by: Michael Butler
Pointy hat to: kib
Sponsored by: The FreeBSD Foundation
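The corrected calibration arithmetic from the entry above can be written out
as a small sketch. It assumes the calibration ran a known number of loop
iterations and measured the elapsed TSC cycles; the function and parameter
names are hypothetical and the kernel code may be structured differently.
One iteration takes cycles / (loops * tsc_hz) seconds, so the number of
iterations per microsecond is loops * tsc_hz / (cycles * 10^6), clamped to
at least one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * loops   - iterations executed during calibration
 * cycles  - TSC cycles those iterations took
 * tsc_hz  - TSC frequency in Hz
 * Assumes loops * tsc_hz does not overflow 64 bits.
 */
static uint64_t
ex_loops_per_usec(uint64_t loops, uint64_t cycles, uint64_t tsc_hz)
{
	uint64_t n;

	assert(cycles > 0);
	n = (loops * tsc_hz) / (cycles * 1000000u);
	/* Even on a very slow CPU, check for the ack at least once. */
	return (n > 0 ? n : 1);
}
```

The inverted form, cycles * 10^6 / (loops * tsc_hz), would instead yield
microseconds per iteration, which is the wrong quantity for a loop count.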
Mark Johnston [Tue, 29 Mar 2016 19:23:00 +0000 (19:23 +0000)]
Modify nd6_llinfo_timer() to acquire the nd6 lock before the LLE lock.
When expiring a neighbour cache entry we may need to look up the associated
default router, which requires the nd6 read lock. To avoid an LOR, the nd6
lock should be acquired first.
X-MFC-With: r296063
Tested by: Larry Rosenman <ler@lerctr.org> (previous revision)
Alexander Motin [Tue, 29 Mar 2016 19:18:34 +0000 (19:18 +0000)]
Modify "4958 zdb trips assert on pools with ashift >= 0xe".
Unlike Illumos, FreeBSD has the concept of logical ashift, which specifies
the true minimal vdev block size that can be accessed. This knowledge
allows physical I/O to be padded properly and its alignment to be asserted.
This change fixes L2ARC write errors when a device has a logical sector
size above 512 bytes.
This driver works in PIO mode for now; interrupts are available only when
the FIFO is enabled. The FIFO cannot be used with arbitrary sizes, which
defeats its general use.
At some point we can add DMA transfers where the FIFO can be more useful.
Bryan Drewery [Tue, 29 Mar 2016 16:07:51 +0000 (16:07 +0000)]
Reword descriptions of asserting locks held without WITNESS.
This corrects an error in r296947 in that it is not possible to assert
which thread holds a shared (or read) lock, but it is possible to assert
that one is held. Just not very useful.
Import portions of the PowerPC OF PCI implementation into new file
"ofwpci.c", common for other platforms. The files ofw_pci.c and ofw_pci.h
from sys/powerpc/ofw no longer exist. All required declarations are moved
to sys/dev/ofw/ofwpci.h. This creates a new ofw_pci_write_ivar() function
and modifies some other methods. Most functions keep the existing ppc
implementations largely unchanged. Now there is no need to have
multiple identical copies of methods for various architectures.
Andrew Turner [Tue, 29 Mar 2016 13:51:26 +0000 (13:51 +0000)]
Read the CPU ID for the current CPU from the GIC. The GIC may have a
different ID space than the kernel. Because of this we need to read the
ID from the hardware. The hardware provides this value when the CPU
reads any of the first 8 Interrupt Processor Targets Registers.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5706
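How a CPU ID might be derived from those registers is sketched below. The
registers are simulated as an array so the example runs in user space, and
the ex_* names are hypothetical. The sketch assumes each byte of a target
register reads back as a one-hot mask of the reading CPU's interface, and
that an all-zero read means interface 0; the driver itself may handle these
cases differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_NTARGET_REGS 8

/* Returns the CPU interface number seen in the simulated registers. */
static unsigned
ex_gic_cpu_id(const uint32_t itargetsr[EX_NTARGET_REGS])
{
	uint32_t mask = 0;
	unsigned id = 0;

	assert(itargetsr != NULL);

	/* Use the first register that reads back non-zero. */
	for (int i = 0; i < EX_NTARGET_REGS; i++) {
		mask = itargetsr[i];
		if (mask != 0)
			break;
	}
	if (mask == 0)
		return (0);   /* assumption: fall back to interface 0 */

	/* Fold the four byte fields into the low byte. */
	mask |= mask >> 16;
	mask |= mask >> 8;
	mask &= 0xffu;

	/* The position of the lowest set bit is the interface number. */
	while ((mask & 1u) == 0) {
		mask >>= 1;
		id++;
	}
	return (id);
}
```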
Zbigniew Bodek [Tue, 29 Mar 2016 13:31:09 +0000 (13:31 +0000)]
Improve HW checksums support in VNIC
- Do not mark CSUM_IP_CHECKED and CSUM_IP_VALID on IPv6 packets.
IPv6 does not have checksums by definition.
- For SCTP packets, set csum_flags to CSUM_SCTP_VALID instead of
CSUM_DATA_VALID and skip csum_data
- Set csum_data simply as 0xffff without byteswap
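The three rules listed above can be condensed into a small decision
function. This is a sketch only: the EX_CSUM_* values are placeholders and
not the constants from <sys/mbuf.h>, the enum and struct names are
hypothetical, and the hardware is assumed to have already verified every
checksum it reports on.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag values, NOT the real <sys/mbuf.h> constants. */
#define EX_CSUM_IP_CHECKED 0x01u
#define EX_CSUM_IP_VALID   0x02u
#define EX_CSUM_DATA_VALID 0x04u
#define EX_CSUM_SCTP_VALID 0x08u

enum ex_l3 { EX_L3_IPV4, EX_L3_IPV6 };
enum ex_l4 { EX_L4_TCP, EX_L4_UDP, EX_L4_SCTP, EX_L4_OTHER };

struct ex_csum {
	uint32_t flags;   /* models csum_flags */
	uint32_t data;    /* models csum_data */
};

static struct ex_csum
ex_rx_csum(enum ex_l3 l3, enum ex_l4 l4)
{
	struct ex_csum c = { 0, 0 };

	assert(l3 == EX_L3_IPV4 || l3 == EX_L3_IPV6);

	/* IPv6 has no header checksum, so only IPv4 gets the IP flags. */
	if (l3 == EX_L3_IPV4)
		c.flags |= EX_CSUM_IP_CHECKED | EX_CSUM_IP_VALID;

	switch (l4) {
	case EX_L4_TCP:
	case EX_L4_UDP:
		c.flags |= EX_CSUM_DATA_VALID;
		c.data = 0xffff;   /* set directly, no byteswap */
		break;
	case EX_L4_SCTP:
		c.flags |= EX_CSUM_SCTP_VALID;   /* csum_data is skipped */
		break;
	default:
		break;
	}
	return (c);
}
```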
Calibrate the frequency of the native_lapic_ipi_wait() loop,
and avoid a delay while waiting for IPI delivery acknowledgement in
xAPIC mode. This makes the loop exit immediately after the delivery
bit in APIC_ICR register is set, instead of waiting for some
microseconds.
We only need to ensure that some amount of time is allowed for the
LAPIC to react to the command, and that the wait time is
finite and reasonable. For that reason, it is irrelevant if CPU
frequency changes or throttling decrease the speed and make the loop,
calibrated for full CPU speed at boot time, execute somewhat slower.
Discussed with: bde, jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
John Baldwin [Mon, 28 Mar 2016 21:51:56 +0000 (21:51 +0000)]
Don't start the random harvester process until timers are working.
This is a no-op currently, but in kernels with earlier AP startup, the
random kthread was trying to use timeouts with sleeps before timers are
working. Wait until SI_SUB_KICK_SCHEDULER to start the random kproc.
Warren Block [Mon, 28 Mar 2016 17:42:14 +0000 (17:42 +0000)]
Replace "user land", which, for any definition of the word "user",
sounds like some kind of horrific theme park. "Hey kids, want to go to
User Land?" "No! We'll be good!"
The obvious replacement is "userland", a compound word replete with
term-of-art meaning and just a hint of cautionary tale. The alternate
terms "flugelhorn" and "bullfrog", while also good, are less well-known
and were voted down in committee.
Do not load LAPIC_DCR_TIMER with an undefined value. If we are in the
deadline mode the divide configuration is not used and
lapic_timer_divisor is not set.
Reported by: dhw, mav
Tested by: mav
Sponsored by: The FreeBSD Foundation
Use TSC deadline mode for LAPIC timer, when available. The mode fires
the LAPIC timer interrupt when the TSC reaches the value written to the
IA32_TSC_DEADLINE MSR. To arm or reset the timer in deadline mode, a
single non-serializing MSR write is enough. This is an improvement over
the one-shot mode of LAPIC, where the timer operated at the FSB
frequency and required two (serialized in case of xAPIC) writes to the
APIC registers.
The LVT_TIMER register value is cached to avoid unneeded writes in the
deadline mode. Unused arguments to specify period (which is passed in
struct lapic as la_timer_period) and interrupt enable (which is always
enabled) are removed from lapic_timer_{oneshot,periodic,deadline}
functions. Instead, special lapic_timer_oneshot_nointr() function for
interrupt-less one-shot calibration is added.
Reviewed by: mav (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D5738