Justin Hibbits [Fri, 10 Nov 2017 04:14:48 +0000 (04:14 +0000)]
Book-E pmap_mapdev_attr() improvements
* Check TLB1 in all mapdev cases, in case the memattr matches an existing
mapping (doesn't need to be MAP_DEFAULT).
* Fix mapping where the starting address is not a multiple of the widest size
base. For instance, it will now properly map 0xffffef000, size 0x11000 using
2 TLB entries, basing it at 0x****f000, instead of 0x***00000.
Bryan Drewery [Fri, 10 Nov 2017 02:09:37 +0000 (02:09 +0000)]
Deal with src.conf for top-level MAKEOBJDIRPREFIX guard.
- Don't discard SRCCONF value since it may incorrectly have MAKEOBJDIRPREFIX
in it.
- Add note about src.conf not being a suitable place for MAKEOBJDIRPREFIX.
Bryan Drewery [Fri, 10 Nov 2017 02:09:33 +0000 (02:09 +0000)]
Handle some .OBJDIR == .CURDIR cases.
- If OBJROOT is SRCTOP then don't add on TARGET.TARGET_ARCH. This
only happens at the top-level, and for sub-directories when the
user is clever with MAKEOBJDIRPREFIX=/.
- Don't bother checking 'test -w' on .CURDIR.
- Properly set OBJTOP/OBJROOT to SRCTOP in various needed cases.
- Check if the OBJDIR is writable even for *clean* targets since it
determines which .OBJDIR the user gets; If they cannot write to an
existing eligible .OBJDIR then it needs to clean in .CURDIR instead.
- Add guard to cleanworld/cleanuniverse from removing SRCTOP.
- Ensure OBJTOP is proper for .OBJDIR=.CURDIR which fixes finding
libraries since src.libnames.mk is based on OBJTOP.
- Avoid some chdir(2) for modifying .OBJDIR
Conrad Meyer [Fri, 10 Nov 2017 02:00:40 +0000 (02:00 +0000)]
systm.h: Include cdefs.h first
Ever since r143063, machine/atomic.h requires cdefs.h. So, include it
first. Weak support: style(9) tells us to include cdefs.h first.
Argument against: since code that includes systm.h still compiles,
compilation units that include systm.h must already include cdefs.h. So, an
argument could be made that the cdefs.h include could just be removed
entirely. That is maybe a bigger change and not one I am interested in
bikeshedding.
John Baldwin [Fri, 10 Nov 2017 01:17:26 +0000 (01:17 +0000)]
Some fixups to the CFI directives for PLT stub entry points.
The directives I added in r323466 and r323501 did not define a valid
CFA until several instructions into the associated functions. This
triggers an assertion in GDB when generating a stack trace while
stopped at the first instruction of PLT stub entry point since there
is no valid CFA rule for the first instruction.
This is probably just wrong on my part as the non-simple .cfi_startproc
would have defined a valid CFA. Instead, define a valid CFA as sp + 0
at the start of the functions and then use .cfa_def_offset to change the
offset when sp is adjusted later in the function.
Matt Joras [Thu, 9 Nov 2017 22:51:48 +0000 (22:51 +0000)]
Introduce EVENTHANDLER_LIST and some users.
This introduces a facility to EVENTHANDLER(9) for explicitly defining a
reference to an event handler list. This is useful since previously all
invokers of events had to do a locked traversal of the global list of
event handler lists in order to find the appropriate event handler list.
By keeping a pointer to the appropriate list an invoker can avoid this
traversal completely. The pointer is initialized with SYSINIT(9) during
the eventhandler stage. Users registering interest in events do not need
to know if the event is backed by such a list, since the list is added
to the global list of lists. As with lists that are not pre-defined it
is safe to register for the events before the list has been created.
This converts the process_* and thread_* events to using the new
facility, as these are events whose locked traversals end up showing up
significantly in ports build workflows (and presumably other workflows
with many short lived threads/procs). It may be advantageous to convert
other events to using the new facility.
The el_flags field is now unused, but leave it be so that this revision
can be MFC'd.
Bryan Drewery [Thu, 9 Nov 2017 22:08:07 +0000 (22:08 +0000)]
Mark targets .PHONY.
This avoids the obvious of not running the target when expected, but
also avoids META_MODE from showing 'Building'. This is mostly only
a problem when directly including bsd.obj.mk as many of these targets
were already .PHONY via bsd.sys.mk.
Make sure the IPv6 scope ID gets zeroed when exchanging CMA messages in ibcore.
Else the IPv6 address matching might fail. This change adds support for both
embedded and non-embedded IPv6 scope IDs when passing a IPv6 link-local socket
address to RDMA. Prior to this change only global IPv6 addresses would work
with RDMA.
Multiple fixes for using IPv6 link-local addresses with RDMA in ibcore.
1) Fail to resolve RDMA address if rtalloc1() returns the loopback
device, lo0, as the gateway interface. Currently RDMA loopback is
not supported.
2) Use ip_dev_find() and ip6_dev_find() to lookup network interfaces
with matching IPv4 and IPv6 addresses, respectivly.
3) In addr_resolve() make sure the "ifa" pointer is always set, also when
the "ifp" is NULL. Else a NULL pointer access might happen trying to
read from the "ifa" pointer later on.
4) In rdma_addr_find_dmac_by_grh() make sure the "bound_dev_if" field
gets set properly instead of passing the scope ID through the IPv6
socket address structure. This is more in line with upstream OFED
in Linux.
5) In rdma_addr_find_smac_by_sgid() there is no need to pass the
scope ID for IPv6. Either it is stored in the "bound_dev_if" field
or ip6_dev_find() will find the correct network device regardless
of the scope ID.
https://www.illumos.org/issues/7531
I found that some buffers that could be L2ARC eligible are not flagged
such, leading to some performance impact. As a test I ran the same IO
workload 10 times in a raw. It is a metadata only workload (files
listing). l2arc_noprefetch=0.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: benrubson <ben.rubson@gmail.com>
https://www.illumos.org/issues/7531
I found that some buffers that could be L2ARC eligible are not flagged
such, leading to some performance impact. As a test I ran the same IO
workload 10 times in a raw. It is a metadata only workload (files
listing). l2arc_noprefetch=0.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: benrubson <ben.rubson@gmail.com>
Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Toomas Soome <tsoome@me.com>
Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Toomas Soome <tsoome@me.com>
https://www.illumos.org/issues/8713
If we're creating a pool with version >= SPA_VERSION_DSL_SCRUB (v11) we need to
account for additional space needed by the origin dataset which will also be
snapshotted: "poolname"+"/"+"$ORIGIN"+"@"+"$ORIGIN".
Enforce this limit in pool_namecheck().
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>
https://www.illumos.org/issues/8713
If we're creating a pool with version >= SPA_VERSION_DSL_SCRUB (v11) we need to
account for additional space needed by the origin dataset which will also be
snapshotted: "poolname"+"/"+"$ORIGIN"+"@"+"$ORIGIN".
Enforce this limit in pool_namecheck().
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>
Marcin Wojtas [Thu, 9 Nov 2017 13:36:42 +0000 (13:36 +0000)]
Allow usage of more RX descriptors than 1 in ENA driver
Using only 1 descriptor on RX could be an issue, if system would be low
on resources and could not provide driver with large chunks of
contiguous memory.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12871
Marcin Wojtas [Thu, 9 Nov 2017 13:33:02 +0000 (13:33 +0000)]
Fix calculating io queues number in ENA driver
The maximum number of io_cq was the same number as maximum io_sq
indicated by the device working in normal mode (without LLQ).
It is not always true, especially when LLQ is being enabled.
Fix it.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12869
Marcin Wojtas [Thu, 9 Nov 2017 13:30:39 +0000 (13:30 +0000)]
Rework printouts and logging level in ENA driver
The driver was printing out a lot of information upon failure, which
does not have to be interested for the user.
Changing logging level required to rebuild driver with proper flags. The
proper sysctl was added, so the level now can be changed dynamically
using bitmask.
Levels of printouts were adjusted to keep on mind end user instead of
debugging purposes.
More verbose messages were added to align the driver with the Linux.
Fix building error introduced by the r325506 by casting csum_flags to
uint64_t.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12868
Marcin Wojtas [Thu, 9 Nov 2017 12:03:06 +0000 (12:03 +0000)]
Allow partial MSI-x allocation in ENA driver
The situation, where part of the MSI-x was not configured properly, was
not properly handled. Now, the driver reduces number of queues to
reflect number of existing and properly configured MSI-x vectors.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12863
Marcin Wojtas [Thu, 9 Nov 2017 11:57:02 +0000 (11:57 +0000)]
Refactor style of the ENA driver
* Change all conditional checks in "if" statement to boolean expressions
* Initialize variables with too complex values outside the declaration
* Fix indentations
* Move code associated with sysctls to ena_sysctl.c file
* For consistency, remove unnecesary "return" from void functions
* Use if_getdrvflags() function instead of accesing variable directly
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12860
Marcin Wojtas [Thu, 9 Nov 2017 11:54:32 +0000 (11:54 +0000)]
Fix error handling in the ENA driver and lock drbr_free() call
Some goto tags were renamed for consistency, and few error handling
routines were reworked.
The drbr_free() must be locked just in case code will change in the
future - for now, it should never be an issue, because drbr is being
flushed in the ena_down() call, and the lock is required only when there
are some mbufs inside.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12859
Marcin Wojtas [Thu, 9 Nov 2017 11:52:52 +0000 (11:52 +0000)]
Destroy admin queue after freeing interrupts in ENA driver
On heavy load, when interrupt handling routine was slowed down, there
could appear memory corruption, because resources were destroyed and
interrupt was still being handled.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12858
Marcin Wojtas [Thu, 9 Nov 2017 11:45:59 +0000 (11:45 +0000)]
Add RX OOO completion feature
The RX out of order completion feature, allows to complete RX
descriptors out of order, by keeping trace of all free descriptors in
the separate array.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12855
Bryan Drewery [Thu, 9 Nov 2017 02:37:49 +0000 (02:37 +0000)]
AUTO_OBJ: Fix 'old style' kernel builds using wrong .OBJDIR.
There's no way currently to automatically prevent the bad .OBJDIR from being
created but it can at least be prevented from being used. Passing
WITHOUT_AUTO_OBJ=yes or MK_AUTO_OBJ=no or -DNO_OBJ in will prevent it.
Emmanuel Vadot [Wed, 8 Nov 2017 21:24:06 +0000 (21:24 +0000)]
Allwinner A13: Add clkng support
DTS files switch from clocks under /clocks to a ccu (Clock Controller Unit)
a while ago.
Restore A13 functionality by adding a clock driver for it.
Almost every clocks are handled, the missing ones are mostly video related
clocks.
Roger Pau Monné [Wed, 8 Nov 2017 14:44:45 +0000 (14:44 +0000)]
loader: set options before including bsd.init.mk
bsd.init.mk ends up including defs.mk so the per-arch options must be
set before including defs.mk, or else the global defaults will be
used and the per-arch ones will be ignored.
Although better, note that the usage of MK_FDT before the inclusion of
bsd.init.mk is incorrect but doesn't lead to build errors. This
circular dependency must be broken in order for this to work
correctly.
Ed Schouten [Wed, 8 Nov 2017 14:21:52 +0000 (14:21 +0000)]
Upgrade to CloudABI v0.17.
Compared to the previous version, v0.16, there are a couple of minor
changes:
- CLOUDABI_AT_PID: Process identifiers for CloudABI processes.
Initially, BSD process identifiers weren't exposed inside the runtime,
due to them being pretty much useless inside of a cluster computing
environment. When jobs are scheduled across systems, the BSD process
number doesn't act as an identifier. Even on individual systems they
may recycle relatively quickly.
With this change, the kernel will now generate a UUIDv4 when executing
a process. These UUIDs can be obtained within the process using
program_getpid(). Right now, FreeBSD will not attempt to store this
value. This should of course happen at some point in time, so that it
may be printed by administration tools.
- Removal of some unused structure members for polling.
With the polling framework being simplified/redesigned, it turns out
some of the structure fields were not used by the C library. We can
remove these to keep things nice and tidy.
Xin LI [Wed, 8 Nov 2017 08:21:17 +0000 (08:21 +0000)]
Update arcmsr(4) to 1.40.00.01:
- Fix clear doorbell queue buffer for ADAPTER_TYPE_B
- Fix release memory resource when detach device
- Add support for ARC-1216, 1226 SAS 12Gb controllers
- Declare some functions as static
- Change checking dword read/write for IOP rqbuffer.
Many thanks to Areca for continuing to support FreeBSD.
Jeff Roberson [Wed, 8 Nov 2017 02:39:37 +0000 (02:39 +0000)]
Replace manyinstances of VM_WAIT with blocking page allocation flags
similar to the kernel memory allocator.
This simplifies NUMA allocation because the domain will be known at wait
time and races between failure and sleeping are eliminated. This also
reduces boilerplate code and simplifies callers.
A wait primitive is supplied for uma zones for similar reasons. This
eliminates some non-specific VM_WAIT calls in favor of more explicit
sleeps that may be satisfied without new pages.
Justin Hibbits [Wed, 8 Nov 2017 01:28:20 +0000 (01:28 +0000)]
DS1307: Add the mcp7941x enable bit
Summary:
Existing code recognizes the mcp7941x RTC, but this RTC has an
enable bit at the same location as the "Clock Halt" bit on the ds1307, with an
opposite assertion (set == on, whereas CH set == clock stopped). Thus the
current code halts the clock, with no way to enable it.
Reviewed By: ian
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D12961
Justin Hibbits [Wed, 8 Nov 2017 01:23:37 +0000 (01:23 +0000)]
Clear the WE bit in C code rather than the asm
According to EREF rlwinm is supposed to clear the upper 32 bits of the
register of 64-bit cores. However, from experience it seems there's a bug
in the e5500 which causes the result to be duplicated in the upper bits of
the register. This causes problems when applied to stashed SRR1 accessed
to retrieve context, as the upper bits are not masked out, so a
set_mcontext() fails. This causes sigreturn() to in turn return with
EINVAL, causing make(1) to exit with error.
This bit is unused in e500mc derivatives (including e5500), so could just be
conditional on non-powerpc64, but there may be other non-Freescale cores
which do use it. This is also the same as the POW bit on Book-S, so could
be cleared unconditionally with the only penalty being a few clock cycles
for these two interrupts.
Bryan Drewery [Tue, 7 Nov 2017 18:20:08 +0000 (18:20 +0000)]
Reenable AUTO_OBJ by default.
The problem with it was a bogus .OBJDIR in some cases where creation of
object directories were purposely not attempted, such as for 'make cleandir'
and in etc/ sub-directories. In these cases bmake would start with a
bogus .OBJDIR like etc/ due to MAKEOBJDIR being a dynamic value based on
.CURDIR, SRCTOP, and OBJTOP. OBJTOP would not yet be defined but is
during early src.sys.obj.mk. That file and auto.obj.mk both were not
modifying .OBJDIR unless they expected to create the objdir. Thus in
these cases the .OBJDIR was left as etc/* rather than fixed to the
proper .CURDIR.
The issues were fixed in r325404 and r325416. An assertion to avoid the
bad .OBJDIR was added in r325405.
Make sysctl_kern_proc_umask execute fast path when requested pid in
curproc->p_pid or 0, avoiding unnecessary locking. Update libc consumer
to skip calling getpid().
Marcin Wojtas [Tue, 7 Nov 2017 13:20:41 +0000 (13:20 +0000)]
Fix ENA driver error handling in attach and basic style fixes
The patch contains following changes:
* In conditional checks, always check for NULL or 0 instead of negating values
* Use malloc and free explicitely, instead of ENA_MEM_FREE and ENA_MEM_FREE (the
dmadev passed to macro is never used, and could be a little misleading)
* Always check for NULL after calling malloc (few checks were missing)
* Rework naming of the goto tags in ena_attach() for consistency
* Fix error handling in ena_attach() - few goto instructions were leading to the
wrong tag
* Destroy MMIO req read request if attach failed
* Remove checking for NULL after calling malloc with M_WAITOK flag
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon.com, Inc.
Differential Revision: https://reviews.freebsd.org/D12853
Warner Losh [Tue, 7 Nov 2017 09:47:05 +0000 (09:47 +0000)]
Correct the detection of hard float arm
* Don't test MACHINE, it's irrelevant to userland and should never be
used in userland Makefiles.
* If we match armv[67] and CPUTYPE is undefined OR it doesn't have
'soft' in it, choose armhf.
* Add a note that the soft float on armv[67] may be broken.
Use hardware timestamps to report packet timestamps for SO_TIMESTAMP
and other similar socket options.
Provide new control message SCM_TIME_INFO to supply information about
timestamp. Currently it indicates that the timestamp was
hardware-assisted and high-precision, for software timestamps the
message is not returned. Reserved fields are added to ABI to report
additional info about it, it is expected that raw hardware clock value
might be useful for some applications.
Add a place for a driver to report rx timestamps in nanoseconds from
boot for the received packets.
The rcv_tstmp field overlaps the place of Ln header length indicators,
not used by received packets. The basic pkthdr rearrangement change
in sys/mbuf.h was provided by gallatin.
There are two accompanying M_ flags: M_TSTMP means that there is the
timestamp (and it was generated by hardware).
Another flag M_TSTMP_HPREC indicates that the timestamp is
high-precision. Practically M_TSTMP_HPREC means that hardware
provided additional precision comparing with the stamps when the flag
is not set. E.g., for ConnectX all packets are stamped by hardware
when PCIe transaction to write out the completion descriptor is
performed, but PTP packet are stamped on port. For Intel cards, when
PTP assist is enabled, only PTP packets are stamped in the limited
number of registers, so if Intel cards ever start support this
mechanism, they would always set M_TSTMP | M_TSTMP_HPREC if hardware
timestamp is present for the given packet.
Add IFCAP_HWRXTSTMP interface capability to indicate the support for
hardware rx timestamping, and ifconfig(8) command to toggle it.
Based on the patch by: gallatin
Reviewed by: gallatin (previous version), hselasky
Sponsored by: Mellanox Technologies
MFC after: 2 weeks (? mbuf KBI issue)
X-Differential revision: https://reviews.freebsd.org/D12638
Enji Cooper [Tue, 7 Nov 2017 06:26:48 +0000 (06:26 +0000)]
Redo r325502
:U:Mfoo expands to :Mfoo, apparently. Explicit check for CPUTYPE being
defined, and test for it's value not containing *soft* before calling CRTARCH
armhf.
Tested, somewhat. Unfortunately recent changes appear to have affected
cross-builds where it no longer works, per my tests after universe12a being
upgraded from 07/2017 to 11/2017 sources (DESTDIR isn't being used in WORLDTMP;
MK_SYSTEM_COMPILER might be causing issues right now).
Enji Cooper [Tue, 7 Nov 2017 04:55:23 +0000 (04:55 +0000)]
Use bsd.compiler.mk instead of src.opts.mk
- MK_PROFILE is controlled in bsd.opts.mk, which is pulled in via bsd.own.mk,
which is pulled in via bsd.init.mk . All upstream Makefiles which build off
of this one use bsd.init.mk.
- COMPILER_{TYPE,VERSION} is set via bsd.compiler.mk .
This reduces the namespace pollution/complexity somewhat.
Stephen Hurd [Mon, 6 Nov 2017 16:41:29 +0000 (16:41 +0000)]
bnxt: Add support for new phy_types and speeds - Part #2
Use our ifm_list of supported media types rather than nested switch
statements to find the current media type. Find a supported type that
matches the current speed.
Remove all workarounds while updating ifmr->ifm_active.
For BNXT_IFMEDIA_ADD, added Three more speeds IFM_10G_T, IFM_2500_T & IFM_2500_KX.
Stephen Hurd [Mon, 6 Nov 2017 16:23:21 +0000 (16:23 +0000)]
Only chain non-LRO mbufs when LRO is not possible
Preserve packet order between tcp_lro_rx() and if_input() to avoid
creating extra corner cases. If no packets can be LROed, combine them
into one chain for submission via if_input(). If any packet can
potentially be LROed however, retain old behaviour and call if_input()
for each packet.
This should keep the 12% improvement for small packet forwarding intact,
but mostly avoids impacting the LRO case.