MFC r282809:
Add new socket ioctls SIOC[SG]TUNFIB to set FIB number of encapsulated
packets on tunnel interfaces. Add support of these ioctls to gre(4),
gif(4) and me(4) interfaces. For incoming packets M_SETFIB() should use
if_fib value from ifnet structure, use proper value in gre(4) and me(4).
MFC r276148:
Remove in_gif.h and in6_gif.h files. They only contain function
declarations used by gif(4). Instead declare these functions in C files.
Also make some variables static.
MFC r276215:
Extern declarations in C files loses compile-time checking that
the functions' calls match their definitions. Move them to header files.
Split it into two modules: if_gre(4) for GRE encapsulation and
if_me(4) for minimal encapsulation within IP.
gre(4) changes:
* convert to if_transmit;
* rework locking: protect access to softc with rmlock,
protect from concurrent ioctls with sx lock;
* correct interface accounting for outgoing datagramms (count only payload size);
* implement generic support for using IPv6 as delivery header;
* make implementation conform to the RFC 2784 and partially to RFC 2890;
* add support for GRE checksums - calculate for outgoing datagramms and check
for inconming datagramms;
* add support for sending sequence number in GRE header;
* remove support of cached routes. This fixes problem, when gre(4) doesn't
work at system startup. But this also removes support for having tunnels with
the same addresses for inner and outer header.
* deprecate support for various GREXXX ioctls, that doesn't used in FreeBSD.
Use our standard ioctls for tunnels.
me(4):
* implementation conform to RFC 2004;
* use if_transmit;
* use the same locking model as gre(4);
PR: 164475
MFC r274289 (by bz):
gcc requires variables to be initialised in two places. One of them
is correctly used only under the same conditional though.
For module builds properly check if the kernel supports INET or INET6,
as otherwise various mips kernels without IPv6 support would fail to build.
r276485 is the real change here, the rest deal with the fallout of
mp_ring's reliance on 64b atomics.
Use the incorrectly spelled 'eigth' from struct pkthdr in this branch
instead of MFC'ing r261733, which would have renamed the field of a
public structure in a -STABLE branch.
---
r276480:
Temporarily unplug cxgbe(4) from !amd64 builds.
r276485:
cxgbe(4): major tx rework.
a) Front load as much work as possible in if_transmit, before any driver
lock or software queue has to get involved.
b) Replace buf_ring with a brand new mp_ring (multiproducer ring). This
is specifically for the tx multiqueue model where one of the if_transmit
producer threads becomes the consumer and other producers carry on as
usual. mp_ring is implemented as standalone code and it should be
possible to use it in any driver with tx multiqueue. It also has:
- the ability to enqueue/dequeue multiple items. This might become
significant if packet batching is ever implemented.
- an abdication mechanism to allow a thread to give up writing tx
descriptors and have another if_transmit thread take over. A thread
that's writing tx descriptors can end up doing so for an unbounded
time period if a) there are other if_transmit threads continuously
feeding the sofware queue, and b) the chip keeps up with whatever the
thread is throwing at it.
- accurate statistics about interesting events even when the stats come
at the expense of additional branches/conditional code.
The NIC txq lock is uncontested on the fast path at this point. I've
left it there for synchronization with the control events (interface
up/down, modload/unload).
c) Add support for "type 1" coalescing work request in the normal NIC tx
path. This work request is optimized for frames with a single item in
the DMA gather list. These are very common when forwarding packets.
Note that netmap tx in cxgbe already uses these "type 1" work requests.
d) Do not request automatic cidx updates every 32 descriptors. Instead,
request updates via bits in individual work requests (still every 32
descriptors approximately). Also, request an automatic final update
when the queue idles after activity. This means NIC tx reclaim is still
performed lazily but it will catch up quickly as soon as the queue
idles. This seems to be the best middle ground and I'll probably do
something similar for netmap tx as well.
e) Implement a faster tx path for WRQs (used by TOE tx and control
queues, _not_ by the normal NIC tx). Allow work requests to be written
directly to the hardware descriptor ring if room is available. I will
convert t4_tom and iw_cxgbe modules to this faster style gradually.
r276498:
cxgbe(4): remove buf_ring specific restriction on the txq size.
r277225:
Make cxgbe(4) buildable with the gcc in base.
r277226:
Allow cxgbe(4) to be built on i386. Driver attach will succeed only on
a subset of i386 systems.
r277227:
Plug cxgbe(4) back into !powerpc && !arm builds, instead of building it
on amd64 only.
r277230:
Build cxgbe(4) on powerpc64 too.
r277637:
Make sure the compiler flag to get cxgbe(4) to compile with gcc is used
only when gcc is being used. This is what r277225 should have been.
jhb [Fri, 5 Jun 2015 20:38:22 +0000 (20:38 +0000)]
MFC 281932:
Rename the kld for oce(4) to if_oce.ko. ifconfig(8) has special knowledge
about kld filenames for network drivers that requires them to follow the
pattern of if_<foo>. This also fixes the existing documentation in the
manpage which says to use if_oce_load=YES in loader.conf.
jhb [Fri, 5 Jun 2015 17:05:09 +0000 (17:05 +0000)]
MFC 274633,274639,274663,277233-277235,281870,281871,281873,281874:
Various fixes for suspend and resume of PCI to PCI and PCI to Cardbus
bridges.
274633:
Remove stray empty comment. The code is adequately explained in the
block comment above, so there's nothing to add here.
274639:
Modernize comments about BIOSes being lame since in this detail they
aren't lame, the rules changed along the way. Catch up to 1999 or so
with the new rules.
274663:
Fix typo pointed out by avg@ and Joerg Sonnenberger. Add a clarifying
sentence too.
277233:
Suspend and resume were the only two functions not to follow the brdev
convention here, so fix that.
277234:
Move the suspsned and resume functions to the bus attachment. They
were accessing PCI config registers, which won't work for the ISA
version.
277235:
Always enable I/O, memory and dma cycles. Some BIOSes don't enable
them, sometimes they are reset for power state transitions or during
whatever happens while suspended. Also, it is good practice to always
do this.
281870:
Cosmetic change: use PCIR_SECLAT_2 rather than PCIR_SECLAT_1.
281871:
The minimim grant and maximum latency PCI config registers are only valid
for type 0 devices, not type 1 or 2 bridges. Don't read them for bridge
devices during bus scans and return an error when attempting to read them
as ivars for bridge devices.
281873:
Don't explicitly manage power states for PCI-PCI bridge devices in the
driver's suspend and resume routines. These have been redundant no-ops
since r214065 changed the PCI bus driver to manage power states for
all devices (including type 1/2 bridge devices) during suspend and resume.
281874:
Update the pci_cfg_save/restore routines to operate on bridge devices
(type 1 and type 2) as well as leaf devices (type 0). In particular,
this allows the existing PCI bus logic to save and restore capability
registers such as MSI and PCI-express work for bridge devices rather than
requiring that code to be duplicated in bridge drivers. It also means
that bridge drivers no longer need to save and restore basic registers
such as the PCI command register or BARs nor manage powerstates for the
bridge device.
While here, pci_setup_secbus() has been changed to initialize the 'sec'
and 'sub' fields in the 'secbus' structure instead of requiring the pcib
and pccbb drivers to do this in the NEW_PCIB + PCI_RES_BUS case.
hselasky [Fri, 5 Jun 2015 07:17:14 +0000 (07:17 +0000)]
MFC r283922:
Fix for control endpoint handling in the DWC OTG driver. The data
stage processing is only allowed after the setup complete event has
been received. Else a race may occur and the OUT data can be corrupted.
While at it ensure resetting a FIFO has the required wait loop.
imp [Thu, 4 Jun 2015 22:11:39 +0000 (22:11 +0000)]
MFC: Merge more of the dtb machinery
Merge 278459,278460,278461,278462 which define DTBDIR and other things
needed for install to work. Although the commit in head kinda fixed
install_as_user, it's unknown if that works in 10.x (it didn't the last
time I tried).
slm [Thu, 4 Jun 2015 16:27:18 +0000 (16:27 +0000)]
MFC: r283661
- Updated all files with 2015 Avago copyright, and updated LSI's copyright
dates.
- Changed all of the PCI device strings from LSI to Avago Technologies (LSI).
- Added a sysctl variable to control how StartStopUnit behavior works. User can
select to spin down disks based on if disk is SSD or HDD.
- Inquiry data is required to tell if a disk will support SSU at shutdown or
not. Due to the addition of mprssas_async, which gets Advanced Info but not
Inquiry data, the setting of supports_SSU was moved to the
mprsas_scsiio_complete function, which snoops for any Inquiry commands. And,
since disks are shutdown as a target and not a LUN, this process was
simplified by basing it on targets and not LUNs.
- Added a sysctl variable that sets the amount of time to retry after sending a
failed SATA ID command. This helps with some bad disks and large disks that
require a lot of time to spin up. Part of this change was to add a callout to
handle timeouts with the SATA ID command. The callout function is called
mprsas_ata_id_timeout(). (Fixes PR 191348)
- Changed the way resets work by allowing I/O to continue to devices that are
not currently under a reset condition. This uses devq's instead of simq's and
makes use of the MPSSAS_TARGET_INRESET flag. This change also adds a function
called mprsas_prepare_tm().
- Some changes were made to reduce code duplication when getting a SAS address
for a SATA disk.
rmacklem [Thu, 4 Jun 2015 12:35:00 +0000 (12:35 +0000)]
MFC: r283273
The NFS client wasn't handling getdirentries(2) requests for sizes
that are not an exact multiple of DIRBLKSIZ correctly. Fortunately
readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications
were affected. This patch fixes this problem by reducing the size
of the directory read to an exact multiple of DIRBLKSIZ.
pkelsey [Thu, 4 Jun 2015 04:54:54 +0000 (04:54 +0000)]
MFC r283641:
Add CAP_FCNTL to the lease file capsicum rights, and limit to
CAP_FCNTL_GETFL. Without CAP_FCNTL_GETFL, the lease file truncation
in rewrite_client_leases() will fail to trim old data when rewriting
the file with a lesser amount of data.
ian [Thu, 4 Jun 2015 01:52:17 +0000 (01:52 +0000)]
MFC r262955:
Add 3wire and std as terminal types/classes. These are similar to
the existing terminal types/classes that have the baudrate suffix,
but differ in that no baudrate is set/defined.
tuexen [Wed, 3 Jun 2015 17:45:45 +0000 (17:45 +0000)]
MFC r283784:
Remove trailing whitespaces.
MFC r283785:
Require the embedded packet to contain 8 bytes after the IP header instead
of only 4. This is guaranteed by RFC 792 and the verification of GRE, ICMP
and TCP packets use 8 bytes.
MFC r283786:
There is no payload anymore. So compute the minimum packet length
correctly and use 40 as the default (if the minumum allows it), as
specified in the man page.
MFC r283806:
When the packet verification fails in verbose mode, print the correct
number of words in host byte order. Also remove a stray 'x'.
hselasky [Wed, 3 Jun 2015 15:32:43 +0000 (15:32 +0000)]
MFC r282650 and r282651:
Extend the maximum number of allowed PCM channels in a PCM stream to
127 and decrease the maximum number of sub-channels to 1. These
definitions are only used inside the kernel and can be changed later
if more than one sub-channel is desired. This has been done to allow
so-called USB audio rack modules to work with FreeBSD.
Add support for more than 8 audio channels per PCM stream for USB
audio class compliant devices under FreeBSD. Tested using 16 recording
and 16 playback audio channels simultaneously.
Bump the FreeBSD version to force recompiling all external modules.
emaste [Wed, 3 Jun 2015 13:12:08 +0000 (13:12 +0000)]
MFC r261220 by csjp: Allow sigwait(2) in capabilities mode.
It's common for multi-threaded processes to create a thread for
the purpose of synchronously processing signals. Allow such processes to
utilize a capabilities sandbox.
emaste [Wed, 3 Jun 2015 11:36:47 +0000 (11:36 +0000)]
MFC r257736 (by pjd):
- Remove mac_get_fd/mac_set_fd - those are not syscalls. The
__mac_get_fd() and __mac_set_fd() syscalls are listed earlier.
- Correct typo in syscall name. It should be sched_rr_get_interval,
not sched_rr_getinterval.
MFC r283146:
In the reply to SADB_X_SPDGET message use the same sequence number that
was in the request. Some IKE deamons expect it will the same. Linux and
NetBSD also follow this behaviour.
jhb [Tue, 2 Jun 2015 15:12:33 +0000 (15:12 +0000)]
MFC 282417:
Various updates to the ftruncate(2) documentation:
- Note that ftruncate(2) can operate on shared memory objects and cross
reference shm_open(2).
- Note that ftruncate(2) does not change the file position pointer (aka
seek pointer) of the file descriptor.
- ftruncate(2) will fail with EINVAL for all sorts of other fd types than
just sockets, so instead note that it fails for all but regular files and
shared memory objects.
- Note that ftruncate(2) also appeared in 4.2BSD along with truncate(2).
(Or at least the manpage for both appeared in 4.2, I did not check the
kernel code itself to see if either predated 4.2.)
jhb [Tue, 2 Jun 2015 15:09:33 +0000 (15:09 +0000)]
MFC 282416:
Partially revert r255486, the first argument to socketpair() is a socket
domain, not a file descriptor. Use 'domain' instead of the original 'd'
for this argument to match socket(2).
jhb [Tue, 2 Jun 2015 14:55:55 +0000 (14:55 +0000)]
MFC 281601:
Remove THRMISC_VERSION. The thrmisc structure doesn't include a version
number, so this wasn't used (and can't easily be added). If at some point
we want to extend thrmisc, we will probably need to just add a new note
type and ensure that the new type includes a version number.
jhb [Tue, 2 Jun 2015 14:54:53 +0000 (14:54 +0000)]
MFC 281266:
Move the 32-bit compatible procfs types from freebsd32.h to <sys/procfs.h>
and export them to userland.
- Define __HAVE_REG32 on platforms that define a reg32 structure and check
for this in <sys/procfs.h> to control when to export prstatus32, etc.
- Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures.
libbfd looks for these types, and having them fixes 'gcore' in gdb of a
32-bit process on a 64-bit platform.
- Use the structure definitions from <sys/procfs.h> in gcore's elf32 core
dump code instead of duplicating the definitions.
jhb [Tue, 2 Jun 2015 13:07:22 +0000 (13:07 +0000)]
MFC 269128:
Create 32-bit core files for 32-bit processes on 64-bit machines.
The 64-bit machine supported right now is amd64, but it's not too
hard to add powerpc64.
tijl [Tue, 2 Jun 2015 09:42:00 +0000 (09:42 +0000)]
MFC r283406,283418:
Fix decoding of UTF-7 when a base64 encoded chunk appears at the end of
the input buffer.
_citrus_UTF7_mbtoutf16 stored the decoder state at the beginning so it
could restore this state on an incomplete character such that the next
call would restart the decoding. The problem was that "-" (end of base64
mode) at the end of a string was also treated as an incomplete character
but was also removed from the state buffer. So the initial state would be
restored (with base64 mode) and the next call would no longer see the "-"
so it continued in base64 mode.
This state saving/restoring isn't needed here. It's already handled
elsewhere (citrus_iconv_std.c:_citrus_iconv_std_iconv_convert) so just
remove it.
Also initialise *nresult.
When only 2 bytes can be read from a 4 byte UTF-16 character in a base64
encoded chunk of a UTF-7 string, treat that as an incomplete character and
return an error instead of a shift sequence and no error.
Also check that the low 2 bytes have a valid value.
hiren [Tue, 2 Jun 2015 08:03:28 +0000 (08:03 +0000)]
MFC: r282866
Fix pmcstat symbol resolution for userland processes.
When examining existing processes pmcstat fails to
correctly determine the locations of executable sections
of the process due to a miscalculated virtual load address.
This does not affect the newly launched processes as the
same value passed as a "start address" to the pmcstat_image_link()
thus nullifying the effect of it. The issue manifests itself
in processes not being reported in the pmcstat(8) output and
"dubious frames" being reported.
Fix it for now by ignoring all the sections except the executable
one. This won't fix the issue for objects with multiple
executable sections but helps in majority of real world usecases.
The real solution would be to modify the MAP-IN event to include
the appropriate load address so pmcstat(8) won't have to manually
parse object files to try to determine it.
MFC r283101:
Teach key_expire() send SADB_EXPIRE message with the SADB_EXT_LIFETIME_HARD
extension header type. The key_flush_sad() now will send SADB_EXPIRE
message when HARD lifetime expires. This is required by RFC 2367 and some
keying daemons rely on these messages. HARD lifetime messages have
precedence over SOFT lifetime messages, so now they will be checked first.
Also now SADB_EXPIRE messages will be send even the SA has not been used,
because keying daemons might want to rekey such SA.
PR: 200282, 200283
MFC r283102:
Change SA's state before sending SADB_EXPIRE message. This state will
be reported to keying daemon.
MFC r275392:
Remove route chaching support from ipsec code. It isn't used for some time.
* remove sa_route_union declaration and route_cache member from struct secashead;
* remove key_sa_routechange() call from ICMP and ICMPv6 code;
* simplify ip_ipsec_mtu();
* remove #include <net/route.h>;
MFC r283104:
Read GEOM_UNCOMPRESS metadata using several requests that fit into
MAXPHYS. For large compressed images the metadata size can be bigger
than MAXPHYS and this triggers KASSERT in g_read_data().
Also use g_free() to free memory allocated by g_read_data().
jhb [Mon, 1 Jun 2015 18:08:56 +0000 (18:08 +0000)]
MFC 283123:
Fix two bugs that could result in PMC sampling effectively stopping.
In both cases, the the effect of the bug was that a very small positive
number was written to the counter. This means that a large number of
events needed to occur before the next sampling interrupt would trigger.
Even with very frequently occurring events like clock cycles wrapping all
the way around could take a long time. Both bugs occurred when updating
the saved reload count for an outgoing thread on a context switch.
First, the counter-independent code compares the current reload count
against the count set when the thread switched in and generates a delta
to apply to the saved count. If this delta causes the reload counter
to go negative, it would add a full reload interval to wrap it around to
a positive value. The fix is to add the full reload interval if the
resulting counter is zero.
Second, occasionally the raw counter value read during a context switch
has actually wrapped, but an interrupt has not yet triggered. In this
case the existing logic would return a very large reload count (e.g.
2^48 - 2 if the counter had overflowed by a count of 2). This was seen
both for fixed-function and programmable counters on an E5-2643.
Workaround this case by returning a reload count of zero.
jhb [Mon, 1 Jun 2015 18:05:30 +0000 (18:05 +0000)]
MFC 282643:
Use the kern.bootfile sysctl to set the default kernel path rather than
hardcoding /boot/kernel. This allows pmcstat(8) to work without -k when
using nextboot -k or 'boot foo' at the loader to boot alternate kernels.
jhb [Mon, 1 Jun 2015 17:57:05 +0000 (17:57 +0000)]
MFC 282641,282658:
- Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of
the broader DEBUG option.
- Convert hwpmc(4) debug printfs over to KTR.
np [Sun, 31 May 2015 23:47:08 +0000 (23:47 +0000)]
MFC r273480, r273750, r273753, r273797, and r274461.
r273480:
cxgbe/iw_cxgbe: wake up waiters after flushing the qp.
r273750:
Some cxgbe/iw_cxgbe fixes:
- Free rt in c4iw_connect only if it is allocated.
- Call soclose instead of so_shutdown if there is an abort from the peer.
- Close socket and return failure if TOE is not enabled.
r273753:
iwcm_event status needs to be populated for close_complete_upcall
r273797:
Always request a completion for every work request for iWARP. The
initial MPA exchange must be tracked this way so that t4_tom's state for
the tid is all clean at the time the tid transitions to RDMA mode. Once
it does, t4_tom is out of the way and iw_cxgbe uses the qp endpoints
directly.
r274461:
iw_cxgbe: don't forget to close the socket in c4iw_connect if soconnect
fails.
ae [Sun, 31 May 2015 23:29:04 +0000 (23:29 +0000)]
MFC r283313:
Properly update TX statistics for wlan(4).
ieee80211_pwrsave() can fail due to queue overflow, check its return code
and increment oerrors counter when it fails. Also handle more error cases
and update oerrors counter when we don't send mbuf due to some errors.
Return ENETDOWN when parent interface isn't ready. Update obytes and omcasts
counters in corresponding places.
ngie [Sun, 31 May 2015 23:00:35 +0000 (23:00 +0000)]
MFC r283147:
Build cddl/{sbin,usr.bin,usr.sbin} in parallel as all of the applications are
freestanding (they require libraries build via make libraries in buildworld)
ae [Sun, 31 May 2015 22:58:41 +0000 (22:58 +0000)]
MFC r282965:
Add an ability accept encapsulated packets from different sources by one
gif(4) interface. Add new option "ignore_source" for gif(4) interface.
When it is enabled, gif's encapcheck function requires match only for
packet's destination address.
ngie [Sun, 31 May 2015 22:44:14 +0000 (22:44 +0000)]
MFC r283170:
Import proposed fix from upstream for
atf-sh/atf_check_test:flush_stdout_on_timeout
Many thanks for jmmv for the fix!
PR: 197060
Original commit message:
From 0e546407567ea858e261e72f75c5ed61e07d0ddf Mon Sep 17 00:00:00 2001
From: Julio Merino <jmmv@google.com>
Date: Tue, 17 Feb 2015 18:10:11 -0500
Subject: [PATCH] Fix atf-sh/atf_check_test:flush_stdout_on_death
The test atf-sh/atf_check_test:flush_stdout_on_timeout was flaky as it
was playing solely with time. Fix this by making the test more robust
and rename it while we are at it: there is nothing left about "timeouts"
in this test, considering that ATF itself does not enforce deadlines
any longer.
Add routing_test:static_ipv6_loopback_route_for_each_fib.
It tests that all FIBs get a static IPv6 loopback route.
Submitted by: asomers
Sponsored by: Spectra Logic
MFSpectraBSD: 1048456 on 2014/03/13 1114523 on 2015/01/23
r277650 (by will):
Add tests/etc/rc.d to mtree.
Submitted by: stefanf
MFC with: 277627
r282059:
Move etc/tests/rc.d to etc/rc.d/tests to match the directory layout jmmv@
documented and implemented in other areas of the FreeBSD tree
r283056:
Move all test integration pieces for etc/ from etc/ to tests/
This is being done to fix breakage with make distribution with read-only
source trees as make distribution doesn't use make obj like building
tests/ does in all cases
Reported by: Wolfgang Zenker <wolfgang@lyxys.ka.sub.org>
Suggested by: jhb
tuexen [Sun, 31 May 2015 13:01:58 +0000 (13:01 +0000)]
MFC r283665:
Take source and destination address into account when determining
the scope.
This fixes a problem when a client with a global address
connects to a server with a private address.
Thanks to Irene Ruengeler in helping me to find the issue.
tuexen [Sun, 31 May 2015 12:56:22 +0000 (12:56 +0000)]
MFC r283662:
Fix a bug where messages would not be sent in SHUTDOWN_RECEIVED state.
This problem was reported by Mark Bonnekessel and Markus Boese.
Thanks to Irene Ruengeler for helping me to fix the cause of
the problem. It can be tested with the following packetdrill script:
bapt [Sat, 30 May 2015 21:45:46 +0000 (21:45 +0000)]
MFC: r273754 (by nwhitehorn)
Use pkg-1.4-style platform identifiers based on MACHINE_ARCH (e.g.
FreeBSD:11:amd64 instead of freebsd:11:x86:64) when bootstrapping pkg.
Thanks to portmgr for providing symlinks so both styles work.
rmacklem [Sat, 30 May 2015 01:04:45 +0000 (01:04 +0000)]
MFC: r283008
Add a warning message to mountd for exported file
systems that are automounted, since that configuration
isn't supported. This still allows the export, since
two emails I received felt that this should not be
disabled. It sends the message to syslog(LOG_ERR..), so that
it goes to the same places as the other messages related
to /etc/exports problems, even though it is a warning and not an error.
tuexen [Fri, 29 May 2015 13:37:04 +0000 (13:37 +0000)]
MFC r282810:
Ensure that the COOKIE-ACK can be sent over UDP if the COOKIE-ECHO was
received over UDP.
Thanks to Felix Weinrank for makeing me aware of the problem and to
Irene Ruengeler for providing the fix.