Add a special mount option "failok" to indicate that the administrator wants
the system to proceed to boot without bailing out into single user mode,
even when the file system can not be successfully mounted.
This option is implemented in mount(8) and not passed into kernel.
MFC: r221333
Remove usr/include/nfs/krpc.h and usr/include/nfs/nfsdiskless.h from
ObsoleteFiles.inc, since these files have been reincarnated in the new
NFS implementation.
Discussed with dim@.
MFC r222808:
Sync ng_nat with recent (r222806) ipfw_nat changes:
Make a behaviour of the libalias based in-kernel NAT a bit closer to
how natd(8) does work. natd(8) drops packets only when libalias returns
PKT_ALIAS_IGNORED and "deny_incoming" option is set, but ipfw_nat
always did drop packets that were not aliased, even if they should
not be aliased and just are going through.
Also add SCTP support: mark response packets to skip firewall processing.
MFC r222806:
Make a behaviour of the libalias based in-kernel NAT a bit closer to
how natd(8) does work. natd(8) drops packets only when libalias returns
PKT_ALIAS_IGNORED and "deny_incoming" option is set, but ipfw_nat
always did drop packets that were not aliased, even if they should
not be aliased and just are going through.
PR: kern/122109, kern/129093, kern/157379
Submitted by: Alexander V. Chernikov (previous version)
MFC: r223441
Plug an mbuf leak in the new NFS client that occurred when a
server replied NFS3ERR_JUKEBOX/NFS4ERR_DELAY to an rpc.
This affected both NFSv3 and NFSv4. Found during testing
at the recent NFSv4 interoperability Bakeathon.
MFC: r223436
Fix the new NFSv4 client so that it uses the same uid as
was used for doing a mount when performing system operations
on AUTH_SYS mounts. This resolved an issue when mounting
a Linux server. Found during testing at the recent
NFSv4 interoperability Bakeathon.
MFC r222582:
O_FORWARD_IP is only action which depends from the result of lookup of
dynamic rules. We are doing forwarding in the following cases:
o For the simple ipfw fwd rule, e.g.
fwd 10.0.0.1 ip from any to any out xmit em0
fwd 127.0.0.1,3128 tcp from any to any 80 in recv em1
o For the dynamic fwd rule, e.g.
fwd 192.168.0.1 tcp from any to 10.0.0.3 3333 setup keep-state
When this rule triggers it creates a dynamic rule, but this
dynamic rule should forward packets only in forward direction.
o And the last case that does not work before - simple fwd rule which
triggers when some dynamic rule is already executed.
MFC r223660:
Initialize elements of state array when creating the GPT table.
This fixes the problem, when the secondary GPT header is not erased when
partition table destroyed. Move equal operations from g_part_gpt_create
and g_part_gpt_recover to the separate function g_gpt_set_defaults.
ALL BIND USERS ARE ENCOURAGED TO UPGRADE IMMEDIATELY
This update addresses the following vulnerability:
CVE-2011-2464
=============
Severity: High
Exploitable: Remotely
Description:
A defect in the affected BIND 9 versions allows an attacker to remotely
cause the "named" process to exit using a specially crafted packet. This
defect affects both recursive and authoritative servers. The code location
of the defect makes it impossible to protect BIND using ACLs configured
within named.conf or by disabling any features at compile-time or run-time.
MFC r223608:
Disable microcode loading for 82550 and 82550C controllers. Loading
the microcode caused SCB timeouts. Linux driver does not allow
microcode loading for these controllers and jfv also confirmed that
there is no need to do and it shouldn't.
MFC: r223382
Change the NFSv4 nfsuserd daemon so that it doesn't preload the
uid<->username mapping cache with an entry when another entry
for that uid is already loaded. This fixes a case where the
mapping of "toor" would replace "root" when the daemon was started,
resulting in no mapping for "root" until the cache entry for "toor"
timed out.
The algorithm is inefficient, but since it is only done once when
the daemon is started up, I don't think that's an issue.
MFC: r223373
Fix the new NFSv4 server so that it checks for VREAD_ACL when
a client does a Getattr for an ACL and not VREAD_ATTRIBUTES.
This was found during the recent NFSv4 interoperability Bakeathon.
MFC: r223349
Fix the new NFSv4 server so that it only allows Lookup of
directories and symbolic links when traversing non-exported
file systems. Found during the recent NFSv4 interoperability
Bakeathon.
MFC: r223348
Fix the new NFSv4 server so that it allows Access and Readlink
operations while traversing non-exported file systems. This is
required for some non-FreeBSD clients to do NFSv4 mounts. Found during
the recent NFSv4 interoperability Bakeathon.
MFC: r223312
Fix a number of places where the new NFS server did not
lock the mutex when manipulating rc_flag in the DRC cache.
This is believed to fix a hung server that was reported
to the freebsd-fs@ list on June 9 under the subject heading
"New NFS server stress test hang", where all the threads
were waiting for the RC_LOCKED flag to clear.
MFC: r223309
Fix the kgssapi so that it can be loaded as a module. Currently
the NFS subsystems use five of the rpcsec_gss/kgssapi entry points,
but since it was not obvious which others might be useful, all
nineteen were included. Basically the nineteen entry points are
set in a structure called rpc_gss_entries and inline functions
defined in sys/rpc/rpcsec_gss.h check for the entry points being
non-NULL and then call them. A default value is returned otherwise.
When dropping privileges prefer capsicum over chroot+setgid+setuid.
We can use capsicum for secondary worker processes and hastctl.
When working as primary we drop privileges using chroot+setgid+setuid
still as we need to send ioctl(2)s to ggate device, for which capsicum
doesn't allow (yet).
r221898 (pjd):
When using capsicum to sanbox, still use other methods first, just in case
one of them have some problems.
r221899 (pjd):
Currently we are unable to use capsicum for the primary worker process,
because we need to do ioctl(2)s, which are not permitted in the capability
mode. What we do now is to chroot(2) to /var/empty, which restricts access
to file system name space and we drop privileges to hast user and hast
group.
This still allows to access to other name spaces, like list of processes,
network and sysvipc.
To address that, use jail(2) instead of chroot(2). Using jail(2) will restrict
access to process table, network (we use ip-less jails) and sysvipc (if
security.jail.sysvipc_allowed is turned off). This provides much better
separation.
r222224 (pjd):
To handle BIO_FLUSH and BIO_DELETE requests in secondary worker we need
to use ioctl(2). This is why we can't use capsicum for now to sandbox
secondary. Capsicum is still used to sandbox hastctl.
r223584 (pjd):
Log a warning if we cannot sandbox using capsicum, but only under debug level 1.
It would be too noisy to log it as a proper warning as CAPABILITIES are not
compiled into GENERIC by default.
r223585 (pjd):
Compile capsicum support only if HAVE_CAPSICUM is defined.
MFC r223227: rc.subr: Eliminate about 100 forks from the boot sequence.
With the current sh, placing eval in a command substitution always results
in a fork(), even if it is the only command and only executes a single
simple command. Therefore, avoid it where it can be avoided easily.
Side effect: values starting with a hyphen and all whitespace are preserved.
The values are defaults and names for rc.conf variables and messages to be
given about obsolete ones.
The change in the _echoonce function is not included in this MFC because
stable/8 does not have this function.
Since head/ and stable/8 have different handling for geom control request
parameters, r215941 should be modified to allow use -F option in the
"gpart restore" command - "force" parameter should be ascii string.
MFC r223522: sh(1): Improve documentation of shell patterns:
* Shell patterns are also for ${var#pat} and the like.
* An '!' by itself will not trigger pathname generation so do not call it a
meta-character, even though it has a special meaning directly after an
'['.
* Character ranges are locale-dependent.
* A '^' will complement a character class like '!' but is non-standard.
yongari [Wed, 29 Jun 2011 17:18:33 +0000 (17:18 +0000)]
MFC r223405:
Remove link state change callback handler. There is no need to
register both status change and link state change callbacks.
Implement checking valid link in state change callback and poll
active link state in vr_tick(). This allows immediate detection of
lost link as well as protecting driver from frequent link flips during
link renegotiation. taskq implementation was removed because driver
now needs to poll link state in vr_tick().
While I'm here do not report current link state if interface is not
running.
dim [Wed, 29 Jun 2011 16:43:44 +0000 (16:43 +0000)]
MFC r223579:
For some reason, contrib/traceroute/traceroute.c ensures MAXHOSTNAMELEN
is defined, but then proceeds to use a hardcoded maximum hostname length
of 64 anyway. Fix this by checking against MAXHOSTNAMELEN instead.
jhb [Wed, 29 Jun 2011 16:16:59 +0000 (16:16 +0000)]
MFC 223198:
- Use a dedicated task to handle deferred transmits from the if_transmit
method instead of reusing the existing per-queue interrupt task.
Reusing the per-queue interrupt task could result in both an interrupt
thread and the taskqueue thread trying to handle received packets on a
single queue resulting in out-of-order packet processing.
- Don't define igb_start() at all on 8.0 and where if_transmit is used.
Replace last remaining call to igb_start() with a loop to kick off
transmit on each queue instead.
- Call ether_ifdetach() earlier in igb_detach().
- Drain tasks and free taskqueues during igb_detach().
jhb [Wed, 29 Jun 2011 15:58:26 +0000 (15:58 +0000)]
MFC 221393,222930:
Reimplement how PCI-PCI bridges manage their I/O windows. Previously the
driver would verify that requests for child devices were confined to any
existing I/O windows, but the driver relied on the firmware to initialize
the windows and would never grow the windows for new requests. Now the
driver actively manages the I/O windows.
This is implemented by allocating a bus resource for each I/O window from
the parent PCI bus and suballocating that resource to child devices. The
suballocations are managed by creating an rman for each I/O window. The
suballocated resources are mapped by passing the bus_activate_resource()
call up to the parent PCI bus. Windows are grown when needed by using
bus_adjust_resource() to adjust the resource allocated from the parent PCI
bus. If the adjust request succeeds, the window is adjusted and the
suballocation request for the child device is retried.
When growing a window, the rman_first_free_region() and
rman_last_free_region() routines are used to determine if the front or
end of the existing I/O window is free. From using that, the smallest
ranges that need to be added to either the front or back of the window
are computed. The driver will first try to grow the window in whichever
direction requires the smallest growth first followed by the other
direction if that fails.
Subtractive bridges will first attempt to satisfy requests for child
resources from I/O windows (including attempts to grow the windows). If
that fails, the request is passed up to the parent PCI bus directly
however.
The PCI-PCI bridge driver will try to use firmware-assigned ranges for
child BARs first and only allocate a "fresh" range if that specific range
cannot be accommodated in the I/O window. This allows systems where the
firmware assigns resources during boot but later wipes the I/O windows
(some ACPI BIOSen are known to do this) to "rediscover" the original I/O
window ranges.
The ACPI Host-PCI bridge driver has been adjusted to correctly honor
hw.acpi.host_mem_start and the I/O port equivalent when a PCI-PCI bridge
makes a wildcard request for an I/O window range.
The new PCI-PCI bridge driver is only enabled if the NEW_PCIB kernel option
is enabled.
trociny [Tue, 28 Jun 2011 19:27:34 +0000 (19:27 +0000)]
MFC r222164, r222228, r222467, r223181:
r222164 (pjd):
Recognize HIO_FLUSH requests.
r222228 (pjd):
Keep statistics on number of BIO_READ, BIO_WRITE, BIO_DELETE and BIO_FLUSH
requests as well as number of activemap updates.
Number of BIO_WRITEs and activemap updates are especially interesting, because
if those two are too close to each other, it means that your workload needs
bigger number of dirty extents. Activemap should be updated as rarely as
possible.
r222467:
If READ from the local node failed we send the request to the remote
node. There is no use in doing this for synchronization requests.
r223181:
In HAST we use two sockets - one for only sending the data and one for
only receiving the data. In r220271 the unused directions were
disabled using shutdown(2).
Unfortunately, this broke automatic receive buffer sizing, which
currently works only for connections in ETASBLISHED state. It was a
root cause of the issue reported by users, when connection between
primary and secondary could get stuck.
Disable the code introduced in r220271 until the issue with automatic
buffer sizing is not resolved.
Reported by: Daniel Kalchev <daniel@digsys.bg>, danger, sobomax
Tested by: Daniel Kalchev <daniel@digsys.bg>, danger
edwin [Tue, 28 Jun 2011 10:46:02 +0000 (10:46 +0000)]
Bring iso3166 file in sync with HEAD
MFC of recent updates on the ISO3166 file:
r223633
- Remove AN again now that tzdata2011h has been imported.
r222094
- Put AN back after finding out that tzsetup(1) will complain that
it doesn't exist. It will be removed again once the tzdata distribution
files have been updated with the replacements for AN.
r222014
- Revert change to "MF" I made in r189767. I bet that at the time of r189767
I checked with http://www.iso.org/iso/country_codes/iso_3166_code_lists.htm
and "MF" was officially spelled in English as "Saint Martin" there, but now
that "SX" exists (for "Sint Maarten (Dutch part)") (nice official "English"
spelling!) they seem to have added a "(French part)" suffix to "MF". Since
this is also in line with Newsletter VI-1 (2007-09-21), catch up.
r222011
- ISO3166: Update for newsletters VI-7 and VI-8 from 2010
- Name change for SH
- BQ, CW, and SX replace AN
ae [Tue, 28 Jun 2011 04:57:53 +0000 (04:57 +0000)]
MFC r223364:
When user specifies the bootcode with size smaller than VTOC_BOOTSIZE,
gpart_write_partcode_vtoc8 does access out of range of allocated memory.
Check size of bootcode before writing it.
hselasky [Mon, 27 Jun 2011 21:45:35 +0000 (21:45 +0000)]
MFC r223495:
- Add two new API's to libusb20 which can be used to retrive information
about the parent USB device (which is usually an USB HUB):
- libusb20_dev_get_parent_address
- libusb20_dev_get_parent_port
hselasky [Mon, 27 Jun 2011 21:04:35 +0000 (21:04 +0000)]
MFC r223467 and r223472:
- Add more USB templates for various USB device classes
- Add basic template support for USB 3.0
- Export definition of template sysctl numbers through usb_ioctl.h
- Export all USB device ID's in so-called sections.
- Fix duplicate occurence of a USB ID in if_urtw and if_zyd.
- Add new tool to autogenerate nomatch entries for devd.
- Add new usb.conf file to auto-load USB drivers.
- Fix some issues related to the nomatch notifications.
- Add support for AF_INET6 sockets for %S format character.
- Use inet_ntop(3) instead of reimplementing it.
- Use %hhu for unsigned char instead of casting it to unsigned int and
using %u.
r222108 (pjd):
In preparation for IPv6 support allow to specify multiple addresses to
listen on.
r222115 (pjd):
Rename proto_tcp4.c to proto_tcp.c in preparation for IPv6 support.
r222116 (pjd):
Rename tcp4 to tcp in preparation for IPv6 support.
r222117 (pjd):
Allow [ ] characters in strings. They might be used in IPv6 addresses.
r222118 (pjd):
Now that hell is fully frozen it is good time to add IPv6 support to HAST.
r222119 (pjd):
Rename ipv4/ipv6 to tcp4/tcp6.
r222120 (pjd):
If no listen address is specified, bind by default to:
bz [Mon, 27 Jun 2011 11:10:15 +0000 (11:10 +0000)]
MFC r223223:
gre(4) was using a field in the softc to detect possible recursion.
On MP systems this is not a usable solution anymore and could easily
lead to false positives triggering enough logging that even using
the console was no longer usable (multiple parallel ping -f can do).
Switch to the suggested solution of using mbuf tags to carry per
packet state between gre_output() invocations. Contrary to the
proposed solution modelled after gif(4) only allocate one mbuf tag
per packet rather than per packet and per gre_output() pass through.
As the sysctl to control the possible valid (gre in gre) nestings does
no sanity checks, make sure to always allocate space in the mbuf tag
for at least one, and at most 255 possible gre interfaces to detect
loops in addition to the counter.
jilles [Sun, 26 Jun 2011 10:50:11 +0000 (10:50 +0000)]
MFC r222511,r223206: posix_spawn(): Do not fail when trying to close an fd
that is not open.
As noted in Austin Group issue #370 (an interpretation has been issued),
failing posix_spawn() because an fd specified with
posix_spawn_file_actions_addclose() is not open is unnecessarily harsh, and
there are existing implementations that do not fail posix_spawn() for this
reason.
rmacklem [Fri, 24 Jun 2011 20:15:44 +0000 (20:15 +0000)]
MFC: r216587, r222623
Fix the nfs related daemons so that they don't intermittently
fail with "bind: address already in use". This problem was reported
to the freebsd-stable@ mailing list on Feb. 19 under the subject
heading "statd/lockd startup failure" by george+freebsd at m5p dot com.
The problem is that the first combination of {udp,tcp X ipv4,ipv6}
would select a port# dynamically, but one of the other three combinations
would have that port# already in use. The patch is somewhat involved
because it was requested by dougb@ that the four combinations use the
same port# wherever possible. The patch splits the create_service()
function into two functions. The first goes as far as bind(2) in a
loop for up to GETPORT_MAXTRY - 1 times, attempting to use the same port#
for all four cases. If these attempts fail, the last attempt allows
the 4 cases to use different port #s. After this function has succeeded,
the second function, called complete_service(), does the rest of what
create_service() did.
The three daemons mountd, rpc.lockd and rpc.statd all have a
create_service() function that is patched in a similar way. However,
create_service() has non-trivial differences for the three daemons
that made it impractical to share the same functions between them.
Also MFC'd r216587 so that r222623 would merge cleanly and mountd.c
would be up to date.
joerg [Fri, 24 Jun 2011 19:24:56 +0000 (19:24 +0000)]
Open the floppy disk device with O_RDONLY rather than O_RDWR. After
all, this is the fd*read* command, and thus should be able to read
even write-protected disks.
jhb [Fri, 24 Jun 2011 17:29:41 +0000 (17:29 +0000)]
MFC 221324,221324:
Add implementations of BUS_ADJUST_RESOURCE() to the PCI bus driver,
generic PCI-PCI bridge driver, x86 nexus driver, x86 Host to PCI bridge
drivers, and the drivers that sit between the x86 Host-PCI bridge drivers
and nexus.
jhb [Fri, 24 Jun 2011 16:06:50 +0000 (16:06 +0000)]
MFC 221231,221450,222600:
Add a new bus method, BUS_ADJUST_RESOURCE() that is intended to be a
wrapper around rman_adjust_resource(). Include a generic implementation,
bus_generic_adjust_resource() which passes the request up to the parent
bus. There is currently no default implementation. A
bus_adjust_resource() wrapper is provided for use in drivers.
jhb [Fri, 24 Jun 2011 13:45:14 +0000 (13:45 +0000)]
MFC 221220:
Extend the rman(9) API to support altering an existing resource.
Specifically, these changes allow a resource to back a relocatable and
resizable resource such as the I/O window decoders in PCI-PCI bridges.
- rman_adjust_resource() can adjust the start and end address of an
existing resource. It only succeeds if the newly requested address
space is already free. It also supports shrinking a resource in
which case the freed space will be marked unallocated in the rman.
- rman_first_free_region() and rman_last_free_region() return the
start and end addresses for the first or last unallocated region in
an rman, respectively. This can be used to determine by how much
the resource backing an rman must be adjusted to accomodate an
allocation request that does not fit into the existing rman.
While here, document the rm_start and rm_end fields in struct rman,
rman_is_region_manager(), the bound argument to
rman_reserve_resource_bound(), and rman_init_from_resource().
jhb [Fri, 24 Jun 2011 13:41:46 +0000 (13:41 +0000)]
MFC 221218:
Change rman_manage_region() to actually honor the rm_start and rm_end
constraints on the rman and reject attempts to manage a region that is out
of range.
- Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of
~0ul).
- To preserve existing behavior, change rman_init() to set rm_start and
rm_end to allow managing the full range (0 to ~0ul) if they are not set by
the caller when rman_init() is called.
hselasky [Wed, 22 Jun 2011 08:55:00 +0000 (08:55 +0000)]
MFC r210823, r211397, r210933, r219101, r213852, r213849 and r208020:
- Add support for LibUSB in 32-bit compatibility mode.
- Some manpage related fixes.
yongari [Wed, 22 Jun 2011 01:42:52 +0000 (01:42 +0000)]
MFC r222219,222221,222223,222226-222227,222231,222516:
Merge all relevant changes from HEAD to fix long standing
instability issues of msk(4). To get desired effect of this
merge, cold restarting is required because incorrectly programmed
registers are not reset to default value.
PR: kern/114631, kern/116853, kern/139093, kern/144206,
kern/147824, kern/151169, kern/154591, kern/155636,
kern/156493
r222219:
Do not blindly clear entire GPHY control register. It seems some
bits of the register is used for other purposes such that clearing
these bits resulted in unexpected results such as corrupted RX
frames or missing LE status updates. For old controllers like
Yukon EC it had no effect but it caused all kind of troubles on
Yukon Supreme.
This change shall improve stability of controllers like Yukon
Ultra, Ultra2, Extreme, Optima and Supreme.
r222221:
Rework store and forward configuration of TX MAC FIFO. Basically it
enables store and forward mode except for jumbo frame on Yukon
Ultra.
r222223:
Do not configure RAM registers for controllers that do not have
them. These registers are defined only for Yukon XL, Yukon EC and
Yukon FE.
r222226:
Make sure to enable all clocks before accessing registers.
Releasing PHY from power down/COMA is done after enabling all
clocks. While I'm here remove unnecessary controller reset.
r222227:
Do not touch ASF related register for controllers that do not have
these registers. Also disable Watchdog of ASF microcontroller.
r222231:
When MTU is changed, check whether driver should be reinitialized or
not. If reinitialized is required, clear driver running flag.
r222516:
Correctly check MAC running status before disabling TX/RX MACs.
yongari [Wed, 22 Jun 2011 00:48:13 +0000 (00:48 +0000)]
MFC r205091,216860:
r205091:
Implement Rx checksum offloading for Yukon EC, Yukon Ultra,
Yukon FE and Yukon Ultra2. These controllers provide very simple
checksum computation mechanism and it requires additional pseudo
header checksum computation in upper stack. Even though I couldn't
see much performance difference with/without Rx checksum offloading
it may help notebook based controllers.
Actually controller can compute two checksum value by giving
different starting position of checksum computation on received
frame. However, for long time, Marvell's checksum offloading engine
have been known to have several silicon bugs so don't blindly trust
computed partial checksum value. Instead, compute partial checksum
twice by giving the same checksum computation position and compare
the result. If the value is different it's clear indication of
hardware bug. This configuration lose IP checksum offloading
capability but I think it's better to take safe route.
Note, Rx checksum offloading for Yukon XL was still disabled due to
known silicon bug.
r216860:
Fix endianness bug introduced in r205091.
After controller updates control word in a RX LE, driver converts
it to host byte order. The checksum value in the control word is
stored in big endian form by controller. r205091 didn't account for
the host byte order conversion such that the checksum value was
incorrectly interpreted on big endian architectures which in turn
made all TCP/UDP frames dropped. Make RX checksum offload work
on any architectures by swapping the checksum value.
Reported by: Sreekanth M. ( kanthms <> netlogicmicro dot com )
Tested by: Sreekanth M. ( kanthms <> netlogicmicro dot com )
yongari [Wed, 22 Jun 2011 00:38:25 +0000 (00:38 +0000)]
MFC r222542:
If driver is not running, disable interrupts and do not try to
process received frames. Previously it was possible to handle RX
interrupts even if controller is not fully initialized. This
resulted in non-working driver after system is up and running.
yongari [Wed, 22 Jun 2011 00:35:42 +0000 (00:35 +0000)]
MFC r222581:
Poke correct GPIO pins for newer axe(4) controllers with Marvell
PHY. Newer models seem to use different LED mode that requires
enabling both GPIO1 and GPIO2.
yongari [Wed, 22 Jun 2011 00:16:40 +0000 (00:16 +0000)]
MFC r221818:
Add initial BCM5719 support. TSO and jumbo frame was intentionally
disabled for BCM5719 A0 revision due to known hardware errata.
Many thanks to Broadcom for continuing support of FreeBSD.