andrew [Thu, 17 Dec 2015 17:00:04 +0000 (17:00 +0000)]
Support the variant of the interrupt-map property where the parent bus has
the #address-cells property set. For this we need to read more data before
the parent interrupt description.
this is only enabled on arm64 for now as it's not quite compliant with the
ePAPR spec. We should use a default of 2 where the #address-cells property
is missing, however this will need further testing across architectures.
gnn [Thu, 17 Dec 2015 02:02:09 +0000 (02:02 +0000)]
Switch the IPsec related statistics to using the built in sysctl
variable set rather than reading from kernel memory.
This also makes the -z (zero) flag work correctly
markj [Thu, 17 Dec 2015 00:00:27 +0000 (00:00 +0000)]
Support an arbitrary number of arguments to DTrace syscall probes.
Rather than pushing all eight possible arguments into dtrace_probe()'s
stack frame, make the syscall_args struct for the current syscall available
via the current thread. Using a custom getargval method for the systrace
provider, this allows any syscall argument to be fetched, even in kernels
that have modified the maximum number of system call arguments.
markj [Wed, 16 Dec 2015 23:39:27 +0000 (23:39 +0000)]
Fix style issues around existing SDT probes.
- Use SDT_PROBE<N>() instead of SDT_PROBE(). This has no functional effect
at the moment, but will be needed for some future changes.
- Don't hardcode the module component of the probe identifier. This is
set automatically by the SDT framework.
smh [Wed, 16 Dec 2015 22:26:28 +0000 (22:26 +0000)]
Fix issues introduced by r292275
* Fix panic for etherswitches which don't have a LLADDR.
* Disabled DELAY in unsolicited NDA, which needs further work.
* Fixed missing DELAY in carp_send_na.
* style(9) fix.
glebius [Wed, 16 Dec 2015 21:30:45 +0000 (21:30 +0000)]
A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES().
o With new KPI consumers can request contiguous ranges of pages, and
unlike before, all pages will be kept busied on return, like it was
done before with the 'reqpage' only. Now the reqpage goes away. With
new interface it is easier to implement code protected from race
conditions.
Such arrayed requests for now should be preceeded by a call to
vm_pager_haspage() to make sure that request is possible. This
could be improved later, making vm_pager_haspage() obsolete.
Strenghtening the promises on the business of the array of pages
allows us to remove such hacks as swp_pager_free_nrpage() and
vm_pager_free_nonreq().
o New KPI accepts two integer pointers that may optionally point at
values for read ahead and read behind, that a pager may do, if it
can. These pages are completely owned by pager, and not controlled
by the caller.
This shifts the UFS-specific readahead logic from vm_fault.c, which
should be file system agnostic, into vnode_pager.c. It also removes
one VOP_BMAP() request per hard fault.
emaste [Wed, 16 Dec 2015 19:23:10 +0000 (19:23 +0000)]
Enable LLDB by default on amd64 and arm64
LLDB is usable for userland core file and live debugging on amd64, and
for userland core file debugging on arm64. In general it works at least
as well on FreeBSD as our in-tree gdb version, so enable it by default
to allow for broader use and testing.
An LLDB tutorial is available at http://lldb.llvm.org/tutorial.html, and
a table mapping GDB commands to LLDB commands can be found at
http://lldb.llvm.org/lldb-gdb.html .
LLDB also has some level of support for FreeBSD on arm, mips, i386,
and powerpc, but is not yet ready to have them enabled by default.
bapt [Wed, 16 Dec 2015 17:13:09 +0000 (17:13 +0000)]
pxeboot: make the tftp loader use the option root-path directive
pxeboot in tftp loader mode (when built with LOADER_TFTP_SUPPORT) now
prefix all the path to open with the path obtained via the option 'root-path'
directive.
This allows to be able to use the traditional content /boot out of box. Meaning
it now works pretty much like all other loaders. It simplifies hosting hosting
multiple version of FreeBSD on a tftp server.
As a consequence, pxeboot does not look anymore for a pxeboot.4th (which was
never provided)
Note: that pxeboot in tftp loader mode is not built by default.
emaste [Wed, 16 Dec 2015 16:19:18 +0000 (16:19 +0000)]
UEFI: combine GetMemoryMap and ExitBootServices and retry on error
The EFI memory map may change before or during the first
ExitBootServices call. In that case ExitBootServices returns an error,
and GetMemoryMap and ExitBootServices must be retried.
Glue together calls to GetMemoryMap(), ExitBootServices() and storage of
(now up-to-date) MODINFOMD_EFI_MAP metadata within a single function.
That new function - bi_add_efi_data_and_exit() - uses space previously
allocated in bi_load_efi_data() to store the memory map (it will fail if
that space is too short). It handles re-calling GetMemoryMap() once to
update the map key if necessary. Finally, if ExitBootServices() is
successful, it stores the memory map and its header as MODINFOMD_EFI_MAP
metadata.
ExitBootServices() calls are now done earlier, from within arch-
independent bi_load() code.
melifaro [Wed, 16 Dec 2015 10:14:16 +0000 (10:14 +0000)]
Provide additional lle data in IPv6 lltable dump used by ndp(8).
Before the change, things like lle state were queried via
SIOCGNBRINFO_IN6 by ndp(8) for _each_ lle entry in dump.
This ioctl was added in 1999, probably to avoid touching rtsock code.
This change maps SIOCGNBRINFO_IN6 data to standard rtsock dump the
following way:
expire (already) maps to rtm_rmx.rmx_expire
isrouter -> rtm_flags & RTF_GATEWAY
asked -> rtm_rmx.rmx_pksent
state -> rtm_rmx.rmx_state (maps to rmx_weight via define)
des [Wed, 16 Dec 2015 09:17:07 +0000 (09:17 +0000)]
Reset bufpos to 0 immediately after refilling the buffer. Otherwise, we
risk leaving the connection in an indeterminate state if the server fails
to send a chunk delimiter. Depending on the application and on the sizes
of the preceding chunks, the result can be anything from missing data to a
segfault. With this patch, it will be reported as a protocol error.
melifaro [Wed, 16 Dec 2015 09:16:06 +0000 (09:16 +0000)]
Fix ARP reply handling changed in r286955.
If source of ARP request didn't pass the routing check
(e.g. not in directly connected network), be polite and
still answer the request instead of dropping frame.
ngie [Wed, 16 Dec 2015 09:11:11 +0000 (09:11 +0000)]
Integrate a number of testcases from tools/regression/lib/msun
into the FreeBSD test suite
There's no functional change with these testcases; they're purposely
being left in TAP format for the time being
Other testcases which crash on amd64/i386 as-is have not been
integrated yet (they need to be retested on a later version of
CURRENT, as I haven't used i386 in some time)
kib [Wed, 16 Dec 2015 08:48:37 +0000 (08:48 +0000)]
Optimize vop_stdadvise(POSIX_FADV_DONTNEED). Instead of looking up a
buffer for each block number in the range with gbincore(), look up the
next instantiated buffer with the logical block number which is
greater or equal to the next lblkno. This significantly speeds up the
iteration for sparce-populated range.
Move the iteration into new helper bnoreuselist(), which is structured
similarly to flushbuflist().
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
kib [Wed, 16 Dec 2015 08:39:51 +0000 (08:39 +0000)]
Simplify the loop step in the flushbuflist() and make it independed on
the type stability of the buffers memory. Instead of memoizing
pointer to the next buffer and validating it, remember the next
logical block number in the bo list and re-lookup.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
ngie [Wed, 16 Dec 2015 08:09:03 +0000 (08:09 +0000)]
Integrate tools/regression/lib/libc/nss into the FreeBSD test suite as
lib/libc/tests/nss
- Convert the testcases to ATF
- Do some style(9) cleanups:
-- Sort headers
-- Apply indentation fixes
-- Remove superfluous parentheses
- Explicitly print out debug printfs for use with `kyua {debug,report}`; for
items that were overly noisy, they've been put behind #ifdef DEBUG
conditionals
- Fix some format strings
rrs [Wed, 16 Dec 2015 00:56:45 +0000 (00:56 +0000)]
First cut of the modularization of our TCP stack. Still
to do is to clean up the timer handling using the async-drain.
Other optimizations may be coming to go with this. Whats here
will allow differnet tcp implementations (one included).
Reviewed by: jtl, hiren, transports
Sponsored by: Netflix Inc.
Differential Revision: D4055
smh [Tue, 15 Dec 2015 21:11:41 +0000 (21:11 +0000)]
Prevent g_access calls to bad multipath members
When a multipath member is orphaned its access members are zeroed before its
removed if marked for wither, so prevent any future calls to g_access on
such members.
This prevents a panic on debug kernels which validates the resultant values
aren't negative.
bdrewery [Tue, 15 Dec 2015 18:42:30 +0000 (18:42 +0000)]
Correct comment about MAKEOBJDIRPREFIX in src-env.conf.
It may only be used with WITH_AUTO_OBJ, which the WITH_DIRDEPS_BUILD does. We
could support this in the normal build as well if we forced creating the directory
and setting .OBJDIR.
jamie [Tue, 15 Dec 2015 17:25:00 +0000 (17:25 +0000)]
Fix jail name checking that disallowed anything that starts with '0'.
The intention was to just limit leading zeroes on numeric names. That
check is now improved to also catch the leading spaces and '+' that
strtoul can pass through.
skra [Tue, 15 Dec 2015 16:04:45 +0000 (16:04 +0000)]
Local TLB flush is sufficient in pmap_remove_pages().
(1) The pmap argument passed to the function must be current pmap only.
(2) The process must be single threaded as the function is called either
when a process is exiting or from exec_new_vmspace().
Remove pmap_tlb_flush_ng() which is not used anywhere now.
smh [Tue, 15 Dec 2015 16:02:11 +0000 (16:02 +0000)]
Fix lagg failover due to missing notifications
When using lagg failover mode neither Gratuitous ARP (IPv4) or Unsolicited
Neighbour Advertisements (IPv6) are sent to notify other nodes that the
address may have moved.
This results is slow failover, dropped packets and network outages for the
lagg interface when the primary link goes down.
We now use the new if_link_state_change_cond with the force param set to
allow lagg to force through link state changes and hence fire a
ifnet_link_event which are now monitored by rip and nd6.
Upon receiving these events each protocol trigger the relevant
notifications:
* inet4 => Gratuitous ARP
* inet6 => Unsolicited Neighbour Announce
This also fixes the carp IPv6 NA's that stopped working after r251584 which
added the ipv6_route__llma route.
The new behavour can be controlled using the sysctls:
* net.link.ether.inet.arp_on_link
* net.inet6.icmp6.nd6_on_link
Also removed unused param from lagg_port_state and added descriptions for the
sysctls while here.
skra [Tue, 15 Dec 2015 15:22:33 +0000 (15:22 +0000)]
Replace all postponed TLB flushes by immediate ones except the one
in pmap_remove_pages().
Some points were considered:
(1) There is no range TLB flush cp15 function.
(2) There is no target selection for hardware TLB flush broadcasting.
(3) Some memory ranges could be mapped sparsely.
(4) Some memory ranges could be quite large.
Tested by buildworld on RPi2 and Jetson TK1, i.e. 4 core platforms.
It turned out that the buildworld time is faster. On the other hand,
when the postponed TLB flush was also removed from pmap_remove_pages(),
the result was worse. But pmap_remove_pages() is called for removing
all user mapping from a process, thus it's quite expected.
Note that the postponed TLB flushes came here from i386 pmap where
hardware TLB flush broadcasting is not available.
smh [Tue, 15 Dec 2015 14:17:07 +0000 (14:17 +0000)]
Add flag to disable inital reboot(8) userland sync
Add -N flag to reboot(8) which bypasses the userland sync(2) during
reboot(8) while still allow the kernel sync during the reboot(2) syscall
to occur.
An example use of this is when rebooting with disconnected iSCSI sessions
which would otherwise cause the reboot to hang on BIOs that will never
complete.
skra [Tue, 15 Dec 2015 13:17:40 +0000 (13:17 +0000)]
Flush intermediate TLB cache when L2 page table is unlinked.
This fixes an issue observed on Cortex A7 (RPi2) and on Cortex A15
(Jetson TK1) causing various memory corruptions. It turned out that
even L2 page table with no valid mapping might be a subject of such
caching.
Note that not all platforms have intermediate TLB caching implemented.
An open question is if this fix is sufficient for all platforms with
this feature.
royger [Tue, 15 Dec 2015 11:20:20 +0000 (11:20 +0000)]
hyperv/kvp: wake up the daemon if it's sleeping due to poll()
Without the patch, there is a race condition: when poll() is invoked(),
if kvp_globals.daemon_busy is false, the daemon won't be timely
woke up, because hv_kvp_send_msg_to_daemon() can't wake up the daemon
in this case.
Submitted by: Dexuan Cui <decui@microsoft.com>
Sponsored by: Microsoft OSTC
Reviewed by: delphij, royger
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D4258
royger [Tue, 15 Dec 2015 10:07:03 +0000 (10:07 +0000)]
x86/bounce: try to always completely fill bounce pages
Current code doesn't try to make use of the full page when bouncing because
the size is only expanded to be a multiple of the alignment. Instead try to
always create segments of PAGE_SIZE when using bounce pages.
This allows us to remove the specific casing done for
BUS_DMA_KEEP_PG_OFFSET, since the requirement is to make sure the offsets
into contiguous segments are aligned, and now this is done by default.
adrian [Tue, 15 Dec 2015 04:43:28 +0000 (04:43 +0000)]
[arge] add a comment about needing mdio busses in order to use the interface.
This is a holdover from how reset is handled in the ARGE_MDIO world.
You need to define the mdio bus device if you want to use the ethernet
device or the arge setup path doesn't bring the MAC out of reset.
bdrewery [Tue, 15 Dec 2015 02:46:14 +0000 (02:46 +0000)]
DIRDEPS_BUILD: Fix incorrectly adding in RELDIR for DIRDEPS in bootstrapping.
This is not wrong, but was unexpected. Using <empty>:H results in '.' which
then using the rest of the conversion was added in RELDIR. This was also
causing an empty _DP_DIRDEPS to resolve to SRCTOP for DIRDEPS.
jhb [Tue, 15 Dec 2015 00:05:07 +0000 (00:05 +0000)]
Start on a new library (libsysdecode) that provides routines for decoding
system call information such as system call arguments. Initially this
will consist of pulling duplicated code out of truss and kdump though it
may prove useful for other utilities in the future.
This commit moves the shared utrace(2) record parser out of kdump into
the library and updates kdump and truss to use it. One difference from
the previous version is that the library version treats unknown events
that start with the "RTLD" signature as unknown events. This simplifies
the interface and allows the consumer to decide how to handle all
non-recognized events. Instead, this function only generates a string
description for known malloc() and RTLD records.
bdrewery [Mon, 14 Dec 2015 23:25:31 +0000 (23:25 +0000)]
Follow-up r290423: Don't use CSH for buildenv shell.
It does not properly import PATH; the PATH is reset by included profile
files on startup which breaks the biggest feature of buildenv (using
sysrooted cc from WORLDTMP)
cem [Mon, 14 Dec 2015 22:01:52 +0000 (22:01 +0000)]
ioat(4): Add support for interrupt coalescing
In I/OAT, this is done through the INTRDELAY register. On supported
platforms, this register can coalesce interrupts in a set period to
avoid excessive interrupt load for small descriptor workflows. The
period is configurable anywhere from 1 microsecond to 16.38
milliseconds, in microsecond granularity.
kp [Mon, 14 Dec 2015 19:44:49 +0000 (19:44 +0000)]
inet6: Do not assume every interface has ip6 enabled.
Certain interfaces (e.g. pfsync0) do not have ip6 addresses (in other words,
ifp->if_afdata[AF_INET6] is NULL). Ensure we don't panic when the MTU is
updated.
pfsync interfaces will never have ip6 support, because it's explicitly disabled
in in6_domifattach().
asomers [Mon, 14 Dec 2015 19:40:47 +0000 (19:40 +0000)]
Don't retry SAS commands in response to protocol errors
sys/dev/mpr/mpr_sas_lsi.c
sys/dev/mps/mps_sas_lsi.c
When mp[rs]sas_get_sata_identify returns
MPI2_IOCSTATUS_SCSI_PROTOCOL_ERROR, don't bother retrying. Protocol
errors aren't likely to be fixed by sleeping.
Without this change, a system that generated may protocol errors due
to signal integrity issues was taking more than an hour to boot, due
to all the retries.
In r289315, I added new fields to res_state. This broke binary
backward compatibility. It also broke some ports (and possibly
other code) by requiring the definition of time_t and struct timespec.
Fix these problems by moving the new fields into __res_state_ext.
Suggested by: ume
Reviewed by: ume
MFC after: 3 days
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D4472
andrew [Mon, 14 Dec 2015 17:08:40 +0000 (17:08 +0000)]
Update the handling of interrupts on the generic PCIe driver:
* Use the interrupt-map property to route interrupts
* Remove the IRQ rman, it's now unneeded
* Support MSI/MSI-X interrupts
With this I'm able to use the two NICs I've tested (em and msk), however
while I can boot with an AHCI devie attached it fails when any drives are
connected.
Obtained from: ABT Systems Ltd
Sponsored by: SoftIron Inc