sam [Wed, 12 Aug 2009 21:19:19 +0000 (21:19 +0000)]
Drain link state event changes posted during vap destroy. This is a
band-aid for the general problem that if_link_state_change can be
called between if_detach and if_free leaving a task queued that has
been free'd.
Spotted by: thompsa
Reviewed by: rwatson
Approved by: re (rwatson)
qingli [Wed, 12 Aug 2009 19:15:26 +0000 (19:15 +0000)]
A piece of code was added to install a host route when an IPv6 interface
address is configured with a /128 prefix. This is no longer necessary due
to r192011. In fact that code conflicts with r192011. This patch removes
the host route installation when detecting the /128 prefix, and instead
let the code added by r192011 to install the loopback route for that IPv6
interface address.
rmacklem [Wed, 12 Aug 2009 16:27:51 +0000 (16:27 +0000)]
Add a check for a NULL mbuf ptr at the beginning of xdrmbuf_inline()
so that it returns failure instead of crashing when "m->m_len" is
executed and m == NULL. The mbuf ptr can be NULL when a call to
xdrmbuf_getbytes() gets the bytes it needs, but they are at the end
of a short RPC reply. When this happens, xdrmbuf_getbytes() returns
success, but advances the mbuf ptr (xdrs->x_private) to m_next, which
is NULL. If this is followed by a call to xdrmbuf_getlong(), it calls
xdrmbuf_inline(), which would cause a crash by accessing "m->m_len".
Tested by: pho, serenity at exscape dot org
Approved by: re (rwatson), kib (mentor)
bz [Wed, 12 Aug 2009 12:06:16 +0000 (12:06 +0000)]
Add ddb show dpcpu_off command to ease dpcpu memory debugging.
While show pcpu prints pc_dynamic this also prints the original
memory address as well as the maths.
Once dpcpu goes NUMA this is considered to help debugging as well.
cperciva [Wed, 12 Aug 2009 11:55:26 +0000 (11:55 +0000)]
Apply the ntp-related part of r195626 to the correct part of the tree --
the mkver which is used in builds is the one in usr.sbin/ntp/scripts,
not the one in contrib/ntp/scripts.
bz [Wed, 12 Aug 2009 10:26:03 +0000 (10:26 +0000)]
Put minimum alignment on the dpcpu and vnet section so that ld
when adding the __start_ symbol knows the expected section alignment
and can place the __start_ symbol correctly.
These sections will not support symbols with super-cache line alignment
requirements.
For full details, see posting to freebsd-current, 2009-08-10,
Message-ID: <20090810133111.C93661@maildrop.int.zabbadoz.net>.
Debugging and testing patches by:
Kamigishi Rei (spambox haruhiism.net),
np, lstewart, jhb, kib, rwatson
Tested by: Kamigishi Rei, lstewart
Reviewed by: kib
Approved by: re
rwatson [Sun, 2 Aug 2009 19:43:32 +0000 (19:43 +0000)]
Many network stack subsystems use a single global data structure to hold
all pertinent statatistics for the subsystem. These structures are
sometimes "borrowed" by kernel modules that require a place to store
statistics for similar events.
Add KPI accessor functions for statistics structures referenced by kernel
modules so that they no longer encode certain specifics of how the data
structures are named and stored. This change is intended to make it
easier to move to per-CPU network stats following 8.0-RELEASE.
The following modules are affected by this change:
if_bridge
if_cxgb
if_gif
ip_mroute
ipdivert
pf
In practice, most of these statistics consumers should, in fact, maintain
their own statistics data structures rather than borrowing structures
from the base network stack. However, that change is too agressive for
this point in the release cycle.
attilio [Sun, 2 Aug 2009 14:28:40 +0000 (14:28 +0000)]
Make the newbus subsystem Giant free by adding the new newbus sxlock.
The newbus lock is responsible for protecting newbus internIal structures,
device states and devclass flags. It is necessary to hold it when all
such datas are accessed. For the other operations, softc locking should
ensure enough protection to avoid races.
Newbus lock is automatically held when virtual operations on the device
and bus are invoked when loading the driver or when the suspend/resume
take place. For other 'spourious' operations trying to access/modify
the newbus topology, newbus lock needs to be automatically acquired and
dropped.
For the moment Giant is also acquired in some key point (modules subsystem)
in order to avoid problems before the 8.0 release as module handlers could
make assumptions about it. This Giant locking should go just after
the release happens.
Please keep in mind that the public interface can be expanded in order
to provide more support, if there are really necessities at some point
and also some bugs could arise as long as the patch needs a bit of
further testing.
Bump __FreeBSD_version in order to reflect the newbus lock introduction.
Reviewed by: ed, hps, jhb, imp, mav, scottl
No answer by: ariff, thompsa, yongari
Tested by: pho,
G. Trematerra <giovanni dot trematerra at gmail dot com>,
Brandon Gooch <jamesbrandongooch at gmail dot com>
Sponsored by: Yahoo! Incorporated
Approved by: re (ksmith)
rwatson [Sat, 1 Aug 2009 20:24:45 +0000 (20:24 +0000)]
Remove vnet_foreach() utility function, which previously allowed
vnet.c to iterate virtual network stacks without being aware of
the implementation details previously hidden in kern_vimage.c.
Now they are in the same file, so remove this added complexity.
rwatson [Sat, 1 Aug 2009 19:26:27 +0000 (19:26 +0000)]
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.
Make the "enforce_statfs" default 2 (most restrictive) in jail_set(2),
instead of whatever the parent/system has (which is generally 0). This
mirrors the old-style default used for jail(2) in conjunction with the
security.jail.enforce_statfs sysctl.
Handle kernels that don't have IPv6 by not sending an "ip6.addr"
parameter unless a (numeric) IPv6 address is given. Even the default
binaries built with -DINET6 will work with IPv6-less kernels. With an
eye to the future, similarly handle the possibility of an IPv4-less kernel.
Fix some LORs between vnode locks and filedescriptor table locks.
- Don't grab the filedesc lock just to read fd_cmask.
- Drop vnode locks earlier when mounting the root filesystem and before
sanitizing stdin/out/err file descriptors during execve().
Show interface name which received short CARP packet (e.g. a VRRP packet),
in order to match other codepaths nearby. This makes troubleshooting
easier.
Remove a LOR, where the the sleepable allprison_lock was being obtained
in prison_equal_ip4/6 while an inp mutex was held. Locking allprison_lock
can be avoided by making a restriction on the IP addresses associated with
jails:
Don't allow the "ip4" and "ip6" parameters to be changed after a jail is
created. Setting the "ip4.addr" and "ip6.addr" parameters is allowed,
but only if the jail was already created with either ip4/6=new or
ip4/6=disable. With this restriction, the prison flags in question
(PR_IP4_USER and PR_IP6_USER) become read-only and can be checked
without locking.
This also allows the simplification of a messy code path that was needed
to handle an existing prison gaining an IP address list.
alfred [Thu, 30 Jul 2009 00:57:54 +0000 (00:57 +0000)]
Missed this file for r195963:
USB core:
- add support for defragging of written device data.
- improve handling of alternate settings in device side mode.
- correct return value from usbd_get_no_alts() function.
- reported by: HPS
- P4 ID: 166156, 166168
- report USB device release information to devd and pnpinfo.
- reported by: MIHIRA Sanpei Yoshiro
- P4 ID: 166221
alfred [Thu, 30 Jul 2009 00:15:50 +0000 (00:15 +0000)]
USB core:
- add support for defragging of written device data.
- improve handling of alternate settings in device side mode.
- correct return value from usbd_get_no_alts() function.
- reported by: HPS
- P4 ID: 166156, 166168
- report USB device release information to devd and pnpinfo.
- reported by: MIHIRA Sanpei Yoshiro
- P4 ID: 166221
alfred [Thu, 30 Jul 2009 00:14:34 +0000 (00:14 +0000)]
USB CORE:
- Add minimum polling support to drive UMASS
and UKBD in case of panic.
- Add extra check to ukbd probe to fix problem about
mouse devices attaching like keyboards.
- P4 ID: 166148
Fix XEN build breakage, by implementing pmap_invalidate_cache_range()
and using it when appropriate. Merge analogue of the r195836
optimization to XEN.
Don't allow mixing the "vnet" and "ip4/6" jail parameters, since vnet
jails have their own IP stack and don't have access to the parent IP
addresses anyway. Note that a virtual network stack forms a break
between prisons with regard to the list of allowed IP addresses.
Change the default value of the "ip4" and "ip6" jail parameters to
"disable", which only allows access to the parent/physical system's
IP addresses when specifically directed. Change the default value of
"host" to "new", and don't copy the parent host values, to insulate
jails from the parent hostname et al.
Fix the experimental nfs client so that it only calls ncl_vinvalbuf()
for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since
nfscl_mustflush() only returns 0 when there is a valid write delegation
issued to the client, it only affects the case of an NFSv4 mount with
callbacks/delegations enabled.
Delete the descriptions of the gssname and allgssname optionss from
mount_nfs.8 since these options are not implemented in FreeBSD8.
This is content change for the man page.
Update less to v436. This is considered as a bugfix release from vendor.
Major changes from v429:
* Don't pass "-" to non-pipe LESSOPEN unless it starts with "-".
* Allow a fraction as the argument to the -# (--shift) option.
* Fix highlight bug when underlined/overstruck text matches at end of line.
* Fix non-regex searches with ctrl-R.
As was done in r195820 for amd64, use clflush for flushing cache lines
when memory page caching attributes changed, and CPU does not support
self-snoop, but implemented clflush, for i386.
Take care of possible mappings of the page by sf buffer by utilizing
the mapping for clflush, otherwise map the page transiently. Amd64
used direct map.
Proposed and reviewed by: alc
Approved by: re (kensmith)
Eliminate ARG_UPATH[12] arguments to AUDIT_ARG_UPATH() and instead
provide specific macros, AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2()
to capture path information for audit records. This allows us to
move the definitions of ARG_* out of the public audit header file,
as they are an implementation detail of our current kernel-internal
audit record, which may change.
Currently there is a problem with fscking UFS file systems created on
top of ZVOLs. The problem is that rc.d/fsck runs before rc.d/zfs. The
latter makes ZVOLs to appear in /dev/. In such case rc.d/fsck cannot
find devfs entry and aborts. We cannot simply move rc.d/zfs before
rc.d/fsck, because we first want kern.hostid to be configured (by
rc.d/hostid). If we won't wait (hostid will be 0) we can reuse disks
which are in use by different systems (eg. in SAN/NAS environment).
We also cannot move rc.d/hostid before rc.d/fsck, because rc.d/hostid on
first system start stores generated kern.hostuuid in /etc/hostid file,
so it needs root file system to be mounted read-write.
The fix is to split rc.d/hostid so that rc.d/hostid (which will now run
before rc.d/fsck) only generates hostid and sets up sysctls, but doesn't
touch root file system and rc.d/hostid_save (which is run after
rc.d/root) and only creates /etc/hostid file.
With that in place, we can move ZVOL initialization to dedicated
rc.d/zvol script which runs before rc.d/fsck.
PR: conf/120194
Reported by: James Snow <snow@teardrop.org>
Reviewed by: brooks
Approved by: re (kib)
MFC after: 2 weeks
Update to version 9.6.1-P1 which addresses a remote DoS vulnerability:
Receipt of a specially-crafted dynamic update message may
cause BIND 9 servers to exit. This vulnerability affects all
servers -- it is not limited to those that are configured to
allow dynamic updates. Access controls will not provide an
effective workaround.
More details can be found here: https://www.isc.org/node/474
All BIND users are encouraged to update to a patched version ASAP.
Rework vnode argument auditing to follow the same structure, in order
to avoid exposing ARG_ macros/flag values outside of the audit code in
order to name which one of two possible vnodes will be audited for a
system call.
Audit file descriptors passed to fooat(2) system calls, which are used
instead of the root/current working directory as the starting point for
lookups. Up to two such descriptors can be audited. Add audit record
BSM encoding for fooat(2).
Note: due to an error in the OpenBSM 1.1p1 configuration file, a
further change is required to that file in order to fix openat(2)
auditing.
The new flow table caches both the routing table entry as well as the
L2 information. For an indirect route the cached L2 entry contains the
MAC address of the gateway. Typically the default route is used to
transmit multicast packets when explicit multicast routes are not
available. The ether_output() function bypasses L2 resolution function
if it verifies the L2 cache is valid, because the cached L2 address
(a unicast MAC address) is copied into the packets as the destination
MAC address. This validation, however, does not apply to broadcast and
multicast packets because the destination MAC address is mapped
according to a standard method instead.
Submitted by: Xin Li
Reviewed by: bz
Approved by: re
Turns out that when a receiver forwards through its TNS's the
processing code holds the read lock (when processing a
FWD-TSN for pr-sctp). If it finds stranded data that
can be given to the application, it calls sctp_add_to_readq().
The readq function also grabs this lock. So if INVAR is on
we get a double recurse on a non-recursive lock and panic.
This fix will change it so that readq() function gets a
flag to tell if the lock is held, if so then it does not
get the lock.
Add INDEX-8 to the default portsnap configuration file, and remove INDEX-5.
The Portsnap buildbox now generates teh bits needed for portsnap to produce
INDEX-8; and it hasn't built INDEX-5 for a long time, although the bits are
still distributed for an INDEX-5 from when FreeBSD 5.x reached its EoL.
Approved by: re (kib)
MFC after: 3 days (INDEX-8 addition only)
- Allow loopback route to be installed for address assigned to
interface of IFF_POINTOPOINT type.
- Install loopback route for an IPv4 interface addreess when the
"useloopback" sysctl variable is enabled. Similarly, install
loopback route for an IPv6 interface address when the sysctl variable
"nd6_useloopback" is enabled. Deleting loopback routes for interface
addresses is unconditional in case these sysctl variables were
disabled after an interface address has been assigned.
Fix the freebsd32 versions of semsys(), shmsys(), and msgsys() to use the
old ABI versions of the relevant control system call (e.g.
freebsd7_freebsd32_msgctl() instead of freebsd32_msgctl() for msgsys()).
We don't support ephemeral IDs in FreeBSD and without this fix ZFS can
panic when in zfs_fuid_create_cred() when userid is negative. It is
converted to unsigned value which makes IS_EPHEMERAL() macro to
incorrectly report that this is ephemeral ID. The most reasonable
solution for now is to always report that the given ID is not ephemeral.
PR: kern/132337
Submitted by: Matthew West <freebsd@r.zeeb.org>
Tested by: Thomas Backman <serenity@exscape.org>, Michael Reifenberger <mike@reifenberger.com>
Approved by: re (kib)
MFC after: 2 weeks
Mesh fixes, namely:
* don't clobber proxy entries
* HWMP seq number processing, including discard of old frames
* flush routing table entries based on nexthop
* print route flags in ifconfig
* more debugging messages and comments
Restore PATA device probe order, broken by PMP support implementation,
requesting IDENTIFY from slave device first. This order is important
for proper cable type detection by master device.
Update epair(4) to the new netisr implementation and polish
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
from going away, under INVARIANTS as this is a general problem
of the stack and should be solved in if.c/netisr but still good
to verify the internal queuing logic.
- Permit changing the MTU to virtually everythinkg like we do for loopback.
Make the in-kernel logic for the SIOCSIFVNET, SIOCSIFRVNET ioctls
(ifconfig ifN (-)vnet <jname|jid>) work correctly.
Move vi_if_move to if.c and split it up into two functions(*),
one for each ioctl.
In the reclaim case, correctly set the vnet before calling if_vmove.
Instead of silently allowing a move of an interface from the current
vnet to the current vnet, return an error. (*)
There is some duplicate interface name checking before actually moving
the interface between network stacks without locking and thus race
prone. Ideally if_vmove will correctly and automagically handle these
in the future.
Make ifconfig ifN -vnet <jname|jid> actually work:
- fix ifconfig to ignore the non-existent interface in the current
network stack in case of '-vnet'.
- in ifconfig: actually use the local variables defined for the
vnet functions rather than modifying the global.