Do not strip outer header when operating in transport mode.
Instead requeue mbuf back to IPv4 protocol handler. If there is one extra IP-IP
encapsulation, it will be handled with tunneling interface. And thus proper
interface will be exposed into mbuf's rcvif. Also, tcpdump that listens on tunneling
interface will see packets in both directions.
hrs [Thu, 2 Oct 2014 01:16:30 +0000 (01:16 +0000)]
Resurrect set_rcvar() as a function to define a rc.conf variable.
It defines a variable and its default value in load_rc_config() just after
rc.conf is loaded. "rcvar" command shows the current and the default values.
This is an attempt to solve a problem that rc.d scripts from third-party
software do not have entries in /etc/defaults/rc.conf. The fact that
load_rc_config() reads rc.conf only once and /etc/rc invokes the function
before running rc.d scripts made developers confused for a long time because
load_rc_config() just before run_rc_command() in each rc.d script overrides
variables only when the script is directly invoked, not from /etc/rc.
Variables defined in set_rcvar are always set in load_rc_config() after
loading rc.conf. An rc.d script can now be written in a self-contained
manner regarding the related variables as follows:
hrs [Thu, 2 Oct 2014 00:25:57 +0000 (00:25 +0000)]
Add an additional routing table lookup when m->m_pkthdr.fibnum is changed
at a PFIL hook in ip{,6}_output(). IPFW setfib rule did not perform
a routing table lookup when the destination address was not changed.
delphij [Thu, 2 Oct 2014 00:13:08 +0000 (00:13 +0000)]
Diff reduction with kernel code: instruct the compiler that the data of
these types may be unaligned to their "normal" alignment and exercise
caution when accessing them.
hrs [Wed, 1 Oct 2014 21:37:32 +0000 (21:37 +0000)]
Virtualize lagg(4) cloner. This change fixes a panic when tearing down
if_lagg(4) interfaces which were cloned in a vnet jail.
Sysctl nodes which are dynamically generated for each cloned interface
(net.link.lagg.N.*) have been removed, and use_flowid and flowid_shift
ifconfig(8) parameters have been added instead. Flags and per-interface
statistics counters are displayed in "ifconfig -v".
melifaro [Wed, 1 Oct 2014 21:24:58 +0000 (21:24 +0000)]
Free radix mask entries on main radix destroy.
This is a temporary commit to be merged to 10.
Another approach (like a hash table) should be used
to store different masks.
PR: 194078
Submitted by: Rumen Telbizov
MFC after: 3 days
marcel [Wed, 1 Oct 2014 21:03:17 +0000 (21:03 +0000)]
Improve performance of mkimg(1) by keeping a list of "chunks" in memory,
each of which keeps track of a particular region of the image. In particular the
image_data() function needs to return to the caller whether a region
contains data or is all zeroes. This required reading the region from
the temporary file and comparing the bytes. When image_data() is used
multiple times for the same region, this will get painful fast.
With a chunk describing a region of the image, we now also have a way
to refer to the image provided on the command line. This means we don't
need to copy the image into a temporary file. We just keep track of the
file descriptor and offset within the source file on a per-chunk basis.
For streams (pipes, sockets, fifos, etc) we now use the temporary file
as a swap file. We read from the input file and create a chunk of type
"zeroes" for each sequence of zeroes that's a multiple of the sector
size. Otherwise, we allocate from the swap file, mmap(2) it, read into
the mmap(2)'d memory and create a chunk representing data.
For regular files, we use SEEK_HOLE and SEEK_DATA to handle sparse files
efficiently and create a chunk of type zeroes for holes and a chunk of
type data for data regions. For data regions, we still compare the bytes
we read to handle differences between a file system's block size and our
sector size.
After reading all files, image_write() is used by schemes to scribble in
the reserved sectors. Since this never amounts to much, keep this data
in memory in chunks of exactly 1 sector.
The output image is created by using the chunk list to find the
data and write it out to the output file. For chunks of type "zeroes"
we prefer to seek, but fall back to writing zeroes to handle pipes.
For chunks of type "file" and "memory" we simply write.
The net effect of this is that for reasonably large images the execution
time drops from 1-2 minutes to 10-20 seconds. A typical speedup is about
5 to 8 times, depending on partition sizes, output format, and whether
input files are sparse or not.
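The zeroes-versus-data split described above can be sketched roughly as
follows. This is only an illustration, not the mkimg(1) code: the names
chunk_type, is_all_zeroes() and chunk_run() are hypothetical, and the sketch
assumes the input is already in memory and is a whole number of sectors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk kinds; the names are illustrative, not mkimg's own. */
enum chunk_type { CHUNK_ZEROES, CHUNK_DATA };

/* Return nonzero when all len bytes of buf are zero. */
static int
is_all_zeroes(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] != 0)
			return (0);
	return (1);
}

/*
 * Count the leading sectors of buf that have the same kind as the first
 * sector and report that kind through *type. The caller would create one
 * chunk for the returned run and continue with the next sector.
 */
static size_t
chunk_run(const unsigned char *buf, size_t nsectors, size_t secsz,
    enum chunk_type *type)
{
	size_t n;
	int zero;

	assert(nsectors > 0 && secsz > 0);
	zero = is_all_zeroes(buf, secsz);
	for (n = 1; n < nsectors; n++)
		if (is_all_zeroes(buf + n * secsz, secsz) != zero)
			break;
	*type = zero ? CHUNK_ZEROES : CHUNK_DATA;
	return (n);
}
```

Comparing whole sectors keeps every chunk boundary on a sector boundary,
which is why a run of zeroes only counts when it is a multiple of the
sector size.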
will [Wed, 1 Oct 2014 20:52:08 +0000 (20:52 +0000)]
Revise r272363 by collapsing the tests into a for loop.
This has the side effect of ensuring that realpath is also run for the
nominal case of PORTSDIR=/usr/ports (assuming .CURDIR is a ports directory
that relies on /usr/ports but is not rooted in it). This ensures that any
generated PORTSDIR used is always the actual location.
dteske [Wed, 1 Oct 2014 18:59:57 +0000 (18:59 +0000)]
Optimize program flow for execution speed. Also fix some more style(9) nits
while here:
+ Fix an issue when extracting small archives where dialog_mixedgauge was
not rendering; leaving the user wondering if anything happened.
+ Add #ifdef's to assuage compilation against older libarchive
NB: Minimize diff between branches; make merging easier.
+ Add missing calls to end_dialog(3)
+ Change string processing from strtok(3) to strcspn(3) (O(1) optimization)
+ Use EXIT_SUCCESS and EXIT_FAILURE instead of 0/1
+ Optimize getenv(3) use, using stored results instead of calling repeatedly
NB: Fixes copy/paste error wherein we display getenv(BSDINSTALL_DISTDIR) in
an error msgbox when chdir(2) to getenv(BSDINSTALL_CHROOT) fails
(wrong variable displayed in msgbox).
+ Use strtol(3) instead of [deprecated] atoi(3)
+ Add additional error checking (e.g., check return of archive_read_new(3))
+ Assign DECONST strings to static variables
+ Fix typo in distextract.c error message (s/Could could/Could not/)
+ Add comments and make a minor whitespace adjustment
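As an illustration of the atoi(3) to strtol(3) item above, a checked
conversion might look like the sketch below. parse_int() is a hypothetical
helper written for this example and is not taken from distextract.c.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a base-10 int from s. Returns 0 and stores the value in *out on
 * success, or -1 if s holds no digits, has trailing characters, or is out
 * of range for an int. atoi(3) cannot report any of these conditions.
 */
static int
parse_int(const char *s, int *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, 10);
	if (end == s || *end != '\0')
		return (-1);
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return (-1);
	*out = (int)val;
	return (0);
}
```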
andrew [Wed, 1 Oct 2014 16:00:21 +0000 (16:00 +0000)]
Clean up detection of big-endian ARM. In all cases we follow the pattern
arm*eb*. Check we are building for arm and if MACHINE_ARCH follows this
pattern.
melifaro [Wed, 1 Oct 2014 14:39:06 +0000 (14:39 +0000)]
Remove lock init from radix.c.
Radix has never managed its locking itself.
The only consumer using radix with embedded rwlock
is system routing table. Move per-AF lock inits there.
will [Wed, 1 Oct 2014 14:12:02 +0000 (14:12 +0000)]
zfsvfs_create(): Refuse to mount datasets whose names are too long.
This is checked for in the zfs_snapshot_004_neg STF/ATF test (currently
still in projects/zfsd rather than head).
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:
- zfsvfs_create(): Check whether the objset name fits into
statfs.f_mntfromname, and return ENAMETOOLONG if not. Although
the filesystem can be unmounted via the umount(8) command, any
interface that relies on iterating on statfs (e.g. libzfs) will
fail to find the filesystem by its objset name, and thus assume
it's not mounted. This causes "zfs unmount", "zfs destroy",
etc. to fail on these filesystems, whether or not -f is passed.
andrew [Wed, 1 Oct 2014 08:26:51 +0000 (08:26 +0000)]
Remove MK_ARM_EABI, the armeb issues have been fixed. The code to support
the oabi is still in the tree, but it is expected this will be removed
as developers work on surrounding code.
With this commit the ARM EABI is the only supported ABI by
FreeBSD on ARM 32-bit processors.
X-MFC after: never
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D876
tuexen [Wed, 1 Oct 2014 05:43:29 +0000 (05:43 +0000)]
The default for UDPLITE_RECV_CSCOV is zero. RFC 3828 recommends
that this means full checksum coverage for received packets.
If an application is willing to accept packets with partial
coverage, it is expected to use the socket option and provide
the minimum coverage it accepts.
ian [Tue, 30 Sep 2014 23:01:11 +0000 (23:01 +0000)]
Return the actual baud rate programmed in the hardware rather than 115200.
This allows the "3wire" entry in /etc/ttys (with no speed specified) to work.
Support tunable to control Tx deferred packet list limits
Also increase default for Tx queue get-list limit.
Too small a limit results in TCP packet drops, especially when many
streams are running simultaneously.
Put list may be kept small enough since it is just a temporary
location if transmit function can't get Tx queue lock.
Submitted by: Andrew Rybchenko <arybchenko at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Remove trailing whitespaces and tabs.
Enclose value in return statements in parentheses.
Use tabs after #define.
Do not skip comparison with 0/NULL in boolean expressions.
Submitted by: Andrew Rybchenko <arybchenko at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
If the checksum coverage field in the UDPLITE header is the length
of the complete UDPLITE packet, the packet has full checksum coverage.
So fix the condition.
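The condition can be sketched as below. This is a standalone illustration
and not the kernel code: classify_cscov() and the cscov_kind names are
hypothetical, and the handling of a zero or too-small coverage field follows
RFC 3828 as understood here rather than anything stated in this commit.

```c
#include <stddef.h>
#include <stdint.h>

#define UDPLITE_HDR_LEN	8	/* fixed UDP-Lite header size in bytes */

enum cscov_kind { CSCOV_INVALID, CSCOV_PARTIAL, CSCOV_FULL };

/*
 * Classify the checksum coverage field of a received packet. pkt_len is the
 * length of the complete UDP-Lite packet (header plus payload).
 */
static enum cscov_kind
classify_cscov(uint16_t cscov, size_t pkt_len)
{
	/* Zero means "whole packet"; so does a value equal to the length. */
	if (cscov == 0 || cscov == pkt_len)
		return (CSCOV_FULL);
	/* Coverage must include the header and cannot exceed the packet. */
	if (cscov < UDPLITE_HDR_LEN || cscov > pkt_len)
		return (CSCOV_INVALID);
	return (CSCOV_PARTIAL);
}
```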
xen: fix blkback pushing responses before releasing internal resources
Fix a problem where the blockback driver could run out of requests,
despite the fact that we allocate enough request and reqlist
structures to satisfy the maximum possible number of requests.
The problem was that we were sending responses back to the other
end (blockfront) before freeing resources. The Citrix Windows
driver is pretty aggressive about queueing, and would queue more I/O
to us immediately after we sent responses to it. We would run into
a resource shortage and stall out I/O until we freed resources.
It isn't clear whether the request shortage condition was an
indirect cause of the I/O hangs we've been seeing between Windows
with the Citrix PV drivers and FreeBSD's blockback, but the above
problem is certainly a bug.
Sponsored by: Spectra Logic
Submitted by: ken
Reviewed by: royger
dev/xen/blkback/blkback.c:
- Break xbb_send_response() into two sub-functions,
xbb_queue_response() and xbb_push_responses().
Remove xbb_send_response(), because it is no longer
used.
- Adjust xbb_complete_reqlist() so that it calls the
two new functions, and holds the mutex around both
calls. The mutex ensures that another context
can't come along and push responses before we've
freed our resources.
- Change xbb_release_reqlist() so that it requires
the mutex to be held instead of acquiring the mutex
itself. Both callers could easily hold the mutex
while calling it, and one really needs to hold the
mutex during the call.
- Add two new counters, accessible via sysctl
variables. The first one counts the number of
I/Os that are queued and waiting to be pushed
(reqs_queued_for_completion). The second one
(reqs_completed_with_error) counts the number of
requests we've completed with an error status.
xen/balloon: fix accounting of current memory pages on PVH
Using realmem on PVH is not reliable, since in this case the realmem value
is computed from Maxmem, which contains the highest memory address found. Use
HYPERVISOR_start_info->nr_pages instead, which is set by the hypervisor and
contains the exact number of memory pages assigned to the domain.
This device is used by the user-space daemon that runs xenstore
(xenstored). It allows xenstored to map the xenstore memory page, and
reports the event channel xenstore is using.
Sponsored by: Citrix Systems R&D
dev/xen/xenstore/xenstored_dev.c:
- Add the xenstored character device that's used to map the xenstore
memory into user-space, and to report the event channel used by
xenstore.
conf/files:
- Add the device to the build process.
xen: convert the xenstore user-space char device to a newbus device
Convert the xenstore user-space device (/dev/xen/xenstore) to a device
using the newbus interface. This allows us to make the device
initialization dependent on the initialization of xenstore itself in
the kernel.
Sponsored by: Citrix Systems R&D
dev/xen/xenstore/xenstore.c:
- Convert to a newbus device, this removes the xs_dev_init function.
xen: defer xenstore initialization until xenstored is started
The xenstore related devices in the kernel cannot be started until
xenstored is running, which will happen later in the Dom0 case. If
start_info_t doesn't contain a valid xenstore event channel, defer all
xenstore related devices attachment to later.
Sponsored by: Citrix Systems R&D
dev/xen/xenstore/xenstore.c:
- Prevent xenstore from trying to attach its descendant devices if
xenstore is not initialized.
- Add a callback in the xenstore interrupt filter that will trigger
the plug of xenstore descendant devices on the first received
interrupt. This interrupt is generated when xenstored attaches to
the event channel, and serves as a notification that xenstored is
running.
Explicitly return None for negative event indices. Prior to this,
eventat(-1) would return the next-to-last event causing the back button
to cycle back to the end of an event source instead of stopping at the
start.
xen: add the Xen implementation of pci_child_added method
Add the Xen specific implementation of pci_child_added to the Xen PCI
bus. This is needed so FreeBSD can register the devices it finds with
the hypervisor.
Sponsored by: Citrix Systems R&D
x86/xen/xen_pci.c:
- Add the Xen pci_child_added method.
This patch adds support for MSI interrupts when running on Xen. Apart
from adding the Xen related code needed in order to register MSI
interrupts this patch also makes the msi_init function a hook in
init_ops, so different MSI implementations can have different
initialization functions.
Sponsored by: Citrix Systems R&D
xen/interface/physdev.h:
- Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen
public interface.
x86/include/init.h:
- Add a hook for setting custom msi_init methods.
amd64/amd64/machdep.c:
i386/i386/machdep.c:
- Set the default msi_init hook to point to the native MSI
initialization method.
x86/xen/pv.c:
- Set the Xen MSI init hook when running as a Xen guest.
x86/x86/local_apic.c:
- Call the msi_init hook instead of directly calling msi_init.
xen/xen_intr.h:
x86/xen/xen_intr.c:
- Introduce support for registering/releasing MSI interrupts with
Xen.
- The MSI interrupts will use the same PIC as the IO APIC interrupts.
xen/xen_msi.h:
x86/xen/xen_msi.c:
- Introduce a Xen MSI implementation.
x86/xen/xen_nexus.c:
- Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI
implementation.
x86/xen/xen_pci.c:
- Introduce a Xen specific PCI bus that inherits from the ACPI PCI
bus and overwrites the native MSI methods.
- This is needed because when running under Xen the MSI messages used
to configure MSI interrupts on PCI devices are written by Xen
itself.
dev/acpica/acpi_pci.c:
- Lower the quality of the ACPI PCI bus so the newly introduced Xen
PCI bus can take over when needed.
conf/files.i386:
conf/files.amd64:
- Add the newly created files to the build process.
Fix old iSCSI initiator to work with new CAM locking.
This switches code to using xpt_scan() routine, irrelevant to locking.
Using xpt_action() directly requires knowledge about higher level locks,
that SIM does not need to have.
This code is obsoleted, but that is not a reason to crash.
- use daemon(8) to write out a pid file for processes,
and check for the existence of that file after
killing processes
- use explicit named parameters to jail(8)
Be prepared that set_dumper() might fail even when resetting it or prefix
the call with (void) to document that we intentionally ignore the return
value - no way to handle an error in case of device disappearing.
Make clear in the ipheth(4) hardware notes that this driver is for the
tethering functionality only. Add a "bugs" section to give a pointer
to usbconfig set_config if the device isn't automatically detected.
adrian [Tue, 30 Sep 2014 03:19:29 +0000 (03:19 +0000)]
Add initial support for the AR9485 CUS198 / CUS230 variants.
These variants have a few differences from the default AR9485 NIC,
namely:
* a non-default antenna switch config;
* slightly different RX gain table setup;
* an external XLNA hooked up to a GPIO pin;
* (and not yet done) RSSI threshold differences when
doing slow diversity.
To make this possible:
* Add the PCI device list from Linux ath9k, complete with vendor and
sub-vendor IDs for various things to be enabled;
* .. and until FreeBSD learns about a PCI device list like this,
write a search function inspired by the USB device enumeration code;
* add HAL_OPS_CONFIG to the HAL attach methods; the HAL can use this
to initialise its local driver parameters upon attach;
* copy these parameters over in the AR9300 HAL;
* don't default to override the antenna switch - only do it for
the chips that require it;
* I brought over ar9300_attenuation_apply() from ath9k which is cleaner
and easier to read for this particular NIC.
This is a work in progress. I'm worried that there's some post-AR9380
NIC out there which doesn't work without the antenna override set as
I currently haven't implemented bluetooth coexistence for the AR9380
and later HAL. But I'd rather have this code in the tree and fix it
up before 11.0-RELEASE happens versus having a set of newer NICs
in laptops be effectively RX deaf.
Use bzero instead of explicitly zeroing stuff in do_execve.
While strictly speaking this is not correct since some fields are pointers,
it makes no difference on all supported archs and we already rely on it doing
the right thing in other places.
When setting environment variables in the atrun script, use the
"export foo=bar" form instead of "foo=bar; export foo" since the
former allows the shell to catch variable names that are not valid
shell identifiers. This will cause /bin/sh to exit with an error
(which gets mailed to the at user) and it will not run the script.
Obtained from: OpenBSD (r1.63 millert)
MFC after: 3 days
Ensure that ixl_flush() uses a defined register on VFs
In some code that is shared between the ixl(4) and ixlv(4) drivers,
a macro hard-coded a register offset that was not valid on ixlv devices.
Fix this by having each driver define a variable that contains the correct
offset.
Reviewed by: Eric Joyner <ricera10 AT gmail.com>
MFC after: 3 days
Sponsored by: Sandvine Inc
Fix integer truncation affecting systat -ifstat
The "systat -ifstat" command was using a u_int to store byte counters.
With a 10Gbps or faster interface, this overflows within the default
5 second refresh period. Switch to using a uint64_t across the board,
which matches the size used for all counters as of r263102.
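The arithmetic behind the overflow can be checked with a short sketch. It
assumes a 32-bit u_int and a link that is saturated for the whole refresh
period; bytes_in_interval() is a hypothetical helper, not systat code.

```c
#include <stdint.h>

/* Bytes moved in `seconds` at `bits_per_sec`, computed in 64 bits. */
static uint64_t
bytes_in_interval(uint64_t bits_per_sec, unsigned int seconds)
{
	return (bits_per_sec / 8 * seconds);
}
```

At 10 Gbps over 5 seconds this gives 6,250,000,000 bytes, which is larger
than the 4,294,967,295 a 32-bit counter can hold, so the stored value wraps
around to 1,955,032,704.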
will [Mon, 29 Sep 2014 15:05:23 +0000 (15:05 +0000)]
Search for the nearest PORTSDIR where Mk/bsd.ports.mk exists, from .CURDIR.
This will only take effect if PORTSDIR is not set, as previously supported.
Use .if exists(), for four specific possibilities relative to .CURDIR:
., .., ../.., and ../../.. The fourth possibility is primarily in case
ports ever grows a third level. If none of these paths exist, fall back to
the old default of /usr/ports.
This removes the need to set PORTSDIR explicitly (or via wrapper script) if
one is running out of a ports tree that is not in /usr/ports, but in a
home directory.
des [Mon, 29 Sep 2014 08:57:36 +0000 (08:57 +0000)]
Instead of failing when neither PAM_TTY nor PAM_RHOST are available, call
login_access() with "**unknown**" as the second argument. This will allow
"ALL" rules to match.
Reported by: Tim Daneliuk <tundra@tundraware.com>
Tested by: dim@
PR: 83099 193927
MFC after: 3 days
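The fallback can be sketched as a small selection function. This is an
assumption-laden illustration: access_origin() is hypothetical, and the
order in which the remote host and the tty are preferred is a guess, not
something this commit message states.

```c
#include <stddef.h>

/*
 * Pick the string to hand to login_access() as its second argument: a
 * non-empty remote host or tty if one is available, otherwise the
 * "**unknown**" placeholder so that "ALL" rules can still match.
 */
static const char *
access_origin(const char *tty, const char *rhost)
{
	if (rhost != NULL && *rhost != '\0')
		return (rhost);
	if (tty != NULL && *tty != '\0')
		return (tty);
	return ("**unknown**");
}
```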
Use snprintf(3) in place of unbounded sprintf(3) (prevent buffer overflow).
Use adequately sized buffer for error(s) (512 -> PATH_MAX + 512).
Fix the following style(9) nits while here:
- distfetch.c uses PATH_MAX while distextract.c uses MAXPATHLEN;
standardize on one (PATH_MAX)
- Move $FreeBSD$ from comment to __FBSDID()
- Sort included headers (alphabetically, sys/* at top)
- Add missing header includes (e.g., <stdlib.h> for getenv(3),
calloc(3)/malloc(3)/free(3), and atoi(3); <string.h> for strdup(3),
strrchr(3), strsep(3), and strcmp(3); <ctype.h> for isspace(3); and
<unistd.h> for chdir(2), etc.)
- Remove rogue newline at end of distfetch.c
- Don't declare variables in if-, while-, or other statement
NB: To prevent masking of prior declarations atop function
- Perform stack alignment for variable declarations
- Add missing function prototype for count_files() in distextract.c
- Break out single-line multivariable-declarations
NB: Aligning similarly-named variables with one-char difference(s)
NB: Minimizes diffs and makes future diffs more clear
- Use err(3) family of functions (requires s/int err;/int retval;/g)
Change the /var dataset in the default ZFS layout to have the
ZFS property canmount=off so that /var/db/pkg and other such directories
are part of the / dataset, and only /var/mail, /var/log, and /var/crash
are excluded from the ZFS boot environment (beadm).
Add support for the missing POSIX-2001 %U and %W features: the
existing FreeBSD strptime code recognizes both directives and
validates that the week number lies in the permitted range,
but then simply discards the value.
Initial support for the feature was written by Paul Green.
David Carlier added the initial handling of tm_wday/tm_yday.
Major credit goes to Andrey Chernov for detecting much of the
brokenness, and rewriting/cleaning most of the code, making it
much more robust.
Tested independently with the strptime test from the GNU C
library.
tty_rel_free() can be called more than once for the same tty so make sure
that the tty is dequeued from 'tty_list' only the first time.
The panic below was seen when a revoke(2) was issued on an nmdm device.
In this case there was also a thread that was blocked on a read(2) on the
device. The revoke(2) woke up the blocked thread which would typically
return an error to userspace. In this case the reader also held the last
reference on the file descriptor so fdrop() ended up calling tty_rel_free()
via ttydev_close().
tty_rel_free() then tried to dequeue 'tp' again which led to the panic.
panic: Bad link elm 0xfffff80042602400 prev->next != elm
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f9c90460
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00f9c90510
vpanic() at vpanic+0x189/frame 0xfffffe00f9c90590
panic() at panic+0x43/frame 0xfffffe00f9c905f0
tty_rel_free() at tty_rel_free+0x29b/frame 0xfffffe00f9c90640
ttydev_close() at ttydev_close+0x1f9/frame 0xfffffe00f9c90690
devfs_close() at devfs_close+0x298/frame 0xfffffe00f9c90720
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x13c/frame 0xfffffe00f9c90770
vn_close() at vn_close+0x194/frame 0xfffffe00f9c90810
vn_closefile() at vn_closefile+0x48/frame 0xfffffe00f9c90890
devfs_close_f() at devfs_close_f+0x2c/frame 0xfffffe00f9c908c0
_fdrop() at _fdrop+0x29/frame 0xfffffe00f9c908e0
sys_read() at sys_read+0x63/frame 0xfffffe00f9c90980
amd64_syscall() at amd64_syscall+0x2b3/frame 0xfffffe00f9c90ab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00f9c90ab0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b78d8a, rsp = 0x7fffffbfdaf8, rbp = 0x7fffffbfdb30 ---
CR: https://reviews.freebsd.org/D851
Reviewed by: glebius, ed
Reported by: Leon Dang
Sponsored by: Nahanni Systems
MFC after: 1 week
release/Makefile:
Connect the virtual machine image build to the release
target if WITH_VMIMAGES is set to a non-empty value.
release/release.sh:
Add WITH_VMIMAGES to RELEASE_RMAKEFLAGS.
release/release.conf.sample:
Add commented entries for tuning the release build if the
WITH_VMIMAGES make(1) environment variable is set to
a non-empty value.
Move the unconditional #include of net/ifq.h to the very end of file.
This seems to allow us to pass a universe with either clang or gcc
after r272244 (and r272260) and probably makes it easier to untangle
these chained #includes in the future.
Remove duplicate declaration of the if_inc_counter() function after r272244.
if_var.h has the expected one and if_var.h includes ifq.h and thus we get
duplicates. It seems only one cavium ethernet file actually includes ifq.h
directly which might be another cleanup to be done but need to test first.
- Remove empty wrappers ether_poll_[de]register_drv(). [1]
- Move polling(9) declarations out of ifq.h back to if_var.h;
they are absolutely unrelated to queues.