Nick Hibma [Thu, 25 Jun 2009 13:15:20 +0000 (13:15 +0000)]
- Make pprint print through fd 3, so it can be used in customisation
functions to print something to the screen.
- Prefix each line with the running time (bikeshed).
This adds the following functions to the acl(3) API: acl_add_flag_np,
acl_clear_flags_np, acl_create_entry_np, acl_delete_entry_np,
acl_delete_flag_np, acl_get_extended_np, acl_get_flag_np, acl_get_flagset_np,
acl_set_extended_np, acl_set_flagset_np, acl_to_text_np, acl_is_trivial_np,
acl_strip_np, acl_get_brand_np. Most of them are similar to what Darwin
does. There are no backward-incompatible changes.
John Baldwin [Thu, 25 Jun 2009 12:34:05 +0000 (12:34 +0000)]
Raise the default size of the EFI partition on ia64 from 100MB to 400MB.
A fresh install of a current 8.0 snapshot uses 156MB with a single kernel
and having the filesystem too small prevented the system from booting.
Robert Watson [Thu, 25 Jun 2009 11:52:33 +0000 (11:52 +0000)]
Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the
in_ifaddrhead and INADDR_HASH address lists.
Previously, these lists were used unsynchronized as they were effectively
never changed in steady state, but we've seen increasing reports of
writer-writer races on very busy VPN servers as core count has gone up
(and similar configurations where address lists change frequently and
concurrently).
For the time being, use rwlocks rather than rmlocks in order to take
advantage of their better lock debugging support. As a result, we don't
enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion
is complete and a performance analysis has been done. This means that one
class of reader-writer races still exists.
Robert Watson [Thu, 25 Jun 2009 08:37:38 +0000 (08:37 +0000)]
Clean up reference management in in6_update_ifa and in6_unlink_ifa, and
in particular, add a reference for in6_ifaddrhead since we do remove a
reference for it when an IPv6 address is removed. This fixes ifconfig
delete of an IPv6 alias.
Jeff Roberson [Thu, 25 Jun 2009 01:33:51 +0000 (01:33 +0000)]
- Use DPCPU for SCHED_STATS. This is somewhat awkward because the
offset of the stat is not known until link time so we must emit a
function to call SYSCTL_ADD_PROC rather than using SYSCTL_PROC
directly.
- Eliminate the atomic from SCHED_STAT_INC now that it's using per-cpu
variables. Sched stats are always incremented while we're holding
a spinlock so no further protection is required.
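A rough sketch of why per-CPU counters can drop the atomic, assuming each CPU only ever touches its own slot. NCPU, pcpu_stats, stat_inc and stat_total are hypothetical names for illustration; the real code uses the DPCPU macros.

```c
#include <stdint.h>

#define NCPU 4                      /* hypothetical CPU count */

struct sched_stats {
    uint64_t switches;
};

/* One private copy per CPU instead of one shared counter. */
static struct sched_stats pcpu_stats[NCPU];

/*
 * Plain increment, no atomic: assumed safe because the caller is pinned
 * to 'cpu' (in the kernel, by holding a spinlock) for the whole update.
 */
void
stat_inc(int cpu)
{
    pcpu_stats[cpu].switches++;
}

/* Reporting walks every CPU's copy and sums them. */
uint64_t
stat_total(void)
{
    uint64_t sum = 0;

    for (int i = 0; i < NCPU; i++)
        sum += pcpu_stats[i].switches;
    return (sum);
}
```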
Rick Macklem [Thu, 25 Jun 2009 00:28:43 +0000 (00:28 +0000)]
Fix two known problems in clnt_rc.c, plus issues w.r.t. smp noted
during reading of the code. Change the code so that it never accesses
rc_connecting, rc_closed or rc_client when the rc_lock mutex is not held.
Also, it now performs the CLNT_CLOSE(client) and CLNT_RELEASE(client)
calls after the rc_lock mutex has been released, since those calls do
msleep()s with another mutex held. Change clnt_reconnect_call() so that
releasing the reference count is delayed until after the
"if (rc->rc_client == client)" check, so that rc_client cannot have been
recycled.
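The locking rule above (touch the shared fields only under the mutex, do the potentially sleeping close only after dropping it) can be sketched in userland as follows. This is not the clnt_rc.c code; struct rc_state, client_close and rc_teardown are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

struct client {
    int closed;
};

struct rc_state {
    pthread_mutex_t lock;
    struct client *client;          /* only accessed with 'lock' held */
};

/* Stand-in for a close routine that may sleep in the real code. */
static void
client_close(struct client *c)
{
    c->closed = 1;
}

void
rc_teardown(struct rc_state *rc)
{
    struct client *c;

    /* Detach the pointer while holding the lock... */
    pthread_mutex_lock(&rc->lock);
    c = rc->client;
    rc->client = NULL;
    pthread_mutex_unlock(&rc->lock);

    /* ...and do the blocking work only after the lock is released. */
    if (c != NULL)
        client_close(c);
}
```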
Xin LI [Wed, 24 Jun 2009 23:17:16 +0000 (23:17 +0000)]
Lock around access to nc_file and netconfig_info ("ni"). The RPC
part of libc is still not thread safe but this would at least
reduce the problems we have.
Colin Percival [Wed, 24 Jun 2009 23:17:00 +0000 (23:17 +0000)]
Make sysinstall search for /dev/daXa and register such devices as USB disks.
This covers the common case of unsliced USB drives, and makes it possible to
select them as installation source media.
Oleg Bulyzhin [Wed, 24 Jun 2009 22:57:07 +0000 (22:57 +0000)]
- fix dummynet 'fast' mode for WF2Q case.
- fix printing of pipe profile data.
- introduce new pipe parameter: 'burst' - how much data can be sent through
pipe bypassing bandwidth limit.
Navdeep Parhar [Wed, 24 Jun 2009 22:28:48 +0000 (22:28 +0000)]
This adds a new "stdio" mode to cxgbtool - it's an interactive mode
meant primarily for _non_ interactive use. Scripts that run cxgbtool
repeatedly to perform register r/w or mdio will benefit from this.
Instead of fork/exec'ing a new cxgbtool for every regio/mdio you can
simply open a pair of pipes to/from cxgbtool and run cmds over them.
Navdeep Parhar [Wed, 24 Jun 2009 21:56:05 +0000 (21:56 +0000)]
Various ifmedia related fixes in cxgb(4), including:
- build ifmedia list based on phy->caps, not string comparisons.
- rebuild media list when a transceiver change is detected.
- return EOPNOTSUPP instead of ENXIO in cxgb_media_status.
Jamie Gritton [Wed, 24 Jun 2009 21:39:50 +0000 (21:39 +0000)]
In case of prisons with their own network stack, permit
additional privileges as well as not restricting the type of
sockets a user can open.
Note: the VIMAGE/vnet feature of jails is still considered
experimental and cannot guarantee that privileged users
can be kept imprisoned if enabled.
Robert Watson [Wed, 24 Jun 2009 21:36:09 +0000 (21:36 +0000)]
Use queue(9) instead of hand-crafted linked lists for the global netatalk
address list. Generally follow the style and convention of similar parts
in netinet.
John Baldwin [Wed, 24 Jun 2009 21:10:52 +0000 (21:10 +0000)]
Change the ABI of some of the structures used by the SYSV IPC API:
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
(this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
int. This removes the need for the shm_bsegsz member in struct
shmid_kernel and should allow for complete support of SYSV SHM regions
>= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
short.
- The shm_internal member of struct shmid_ds is now gone. The internal
VM object pointer for SHM regions has been moved into struct
shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
now marked COMPAT7 and new versions of those system calls which support
the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc. The
FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
symbol versions has been added to libc. Version tags are added to
system calls by adding an appropriate __sym_compat() entry to
src/lib/libc/include/compat.h. [1]
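To illustrate the shm_segsz item above: a segment size of 2GB or more cannot be represented in an int, while a size_t field can hold it. The helper names below are hypothetical and only demonstrate the range limits, not the kernel structures.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Would this segment size survive being stored in an int field? */
int
fits_in_int(uint64_t bytes)
{
    return (bytes <= (uint64_t)INT_MAX);
}

/* Would it survive being stored in a size_t field? */
int
fits_in_size_t(uint64_t bytes)
{
    return (bytes <= (uint64_t)SIZE_MAX);
}
```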
Andrew Gallatin [Wed, 24 Jun 2009 21:09:56 +0000 (21:09 +0000)]
Add a dying flag to prevent races at detach.
I tried re-ordering ether_ifdetach(), but this created a new race
where sometimes, when under heavy receive load (>1Mpps) and running
tcpdump, the machine would panic. At panic, the ithread was still in
the original (not dead) if_input() path, and was accessing stale BPF
data structs. By using a dying flag, I can close the interface prior
to if_detach() to be certain the interface cannot send packets up in
the middle of ether_ifdetach.
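A minimal sketch of the dying-flag idea, assuming a C11 atomic flag; the driver itself may well use a plain field under its own lock. struct softc, rx_intr and detach_begin are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct softc {
    atomic_bool dying;              /* set once detach has started */
    unsigned long delivered;        /* packets passed up the stack */
};

void
softc_init(struct softc *sc)
{
    atomic_init(&sc->dying, false);
    sc->delivered = 0;
}

/* Receive path: refuse to hand packets up once detach has begun. */
void
rx_intr(struct softc *sc)
{
    if (atomic_load(&sc->dying))
        return;
    sc->delivered++;
}

/* Detach path: mark the interface dying before tearing anything down. */
void
detach_begin(struct softc *sc)
{
    atomic_store(&sc->dying, true);
}
```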
Robert Watson [Wed, 24 Jun 2009 21:00:25 +0000 (21:00 +0000)]
Convert netinet6 to using queue(9) rather than hand-crafted linked lists
for the global IPv6 address list (in6_ifaddr -> in6_ifaddrhead). Adopt
the code styles and conventions present in netinet where possible.
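For readers unfamiliar with queue(9), a short sketch of a TAILQ-based list replacing a hand-rolled next pointer. struct addr, addr_head and the helper functions are hypothetical and much simpler than in6_ifaddr.

```c
#include <stddef.h>
#include <sys/queue.h>

struct addr {
    int id;
    TAILQ_ENTRY(addr) link;         /* replaces a hand-written next pointer */
};

TAILQ_HEAD(addrhead, addr);
static struct addrhead addr_head = TAILQ_HEAD_INITIALIZER(addr_head);

void
addr_insert(struct addr *a)
{
    TAILQ_INSERT_TAIL(&addr_head, a, link);
}

/* Removal needs no walk to find the predecessor, unlike a singly linked list. */
void
addr_remove(struct addr *a)
{
    TAILQ_REMOVE(&addr_head, a, link);
}

int
addr_count(void)
{
    struct addr *a;
    int n = 0;

    TAILQ_FOREACH(a, &addr_head, link)
        n++;
    return (n);
}
```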
Robert Watson [Wed, 24 Jun 2009 20:57:50 +0000 (20:57 +0000)]
Use queue(9) instead of hand-crafted linked lists for the global IPX
address list (ipx_ifaddr -> ipx_ifaddrhead), and generally adopt the
naming and usage conventions found in netinet.
Marius Strobl [Wed, 24 Jun 2009 20:56:06 +0000 (20:56 +0000)]
- Change this driver to do taskqueue(9) based TX and interrupt
handling in order to reduce interrupt overhead which results in
better performance.
- Call ether_ifdetach(9) before stopping the controller and the
callouts during detach in order to prevent active BPF listeners from
clearing promiscuous mode, which may lead to the tick callout being
restarted, which in turn will trigger a panic once it's actually gone.
- Add explicit IFF_DRV_RUNNING checking in order to prevent extra
link up/down events when using dhclient(8).
- Use the correct macro for deciding whether 2/3 of the available TX
descriptors are used.
- Wrap the RX fault printing in #ifdef CAS_DEBUG in order to not
unnecessarily frighten users and as debugging was the actual
intention. Real errors caused by these faults still will be
accumulated as input errors. It might be a good idea to later on
add driver specific counters for the faults though.
Marius Strobl [Wed, 24 Jun 2009 20:49:02 +0000 (20:49 +0000)]
o merge from amd64:
- r187144: Add a reference to the config(5) manpage and
to the "env" kernel config option.
- Add/enable the default USB drivers. Originally the USB
controller and keyboard drivers were disabled as these
interacted badly with the Open Firmware console driver,
i.e. caused the keyboard to not work with ofw_console(4).
Even after switching to uart(4) and the frame buffer drivers
most of the USB drivers still were kept disabled as
several of them, amongst others all of the drivers for
USB Ethernet controllers, weren't endian clean. With the
new USB stack these problems should be gone now so there's
no longer a reason to not include the same set of USB
drivers amd64 does.
o Remove the commented out device ofw_console; apart from it
being currently broken by some TTY changes one really needs
to know how to actually enable and make it work correctly.
John Baldwin [Wed, 24 Jun 2009 20:01:13 +0000 (20:01 +0000)]
Deprecate the msgsys(), semsys(), and shmsys() system calls by moving
them under COMPAT_FREEBSD[4567]. Starting with FreeBSD 5.0 the SYSV IPC
API was implemented via direct system calls (e.g. msgctl(), msgget(), etc.)
rather than indirecting through the var-args *sys() system calls. The
shmsys() system call was already effectively deprecated for all but
COMPAT_FREEBSD4, as its implementation for the !COMPAT_FREEBSD4 case
was to simply invoke nosys().
Joerg Wunsch [Wed, 24 Jun 2009 19:47:53 +0000 (19:47 +0000)]
Drop the defunct FDOPT_NOERRLOG option from all the floppy utilities.
The kernel does not log floppy media errors anymore.
In fdcontrol, always open the file descriptor in read-only mode so
it can operate on read-only media, as there is no longer a separate
control device to operate on.
Joerg Wunsch [Wed, 24 Jun 2009 19:30:31 +0000 (19:30 +0000)]
With the fdc control device disappearing some 5 years ago, it is no
longer useful for the FD_STYPE and FD_SOPTS ioctls to insist on being
issued on a writable file descriptor. Otherwise, there's no longer a
chance to set the drive type or options when a read-only medium is
present in the drive, as there is no way to obtain a writable fd then.
Marius Strobl [Wed, 24 Jun 2009 19:04:08 +0000 (19:04 +0000)]
Revert the part of r194763 which added a dying flag and instead
call ether_ifdetach(9) before stopping the controller and the
callouts. The consensus is that the latter is now safe to do and
should also solve the problem of active BPF listeners clearing
promiscuous mode, which can result in the tick callout being restarted,
which in turn will trigger a panic once it's actually gone.
Rick Macklem [Wed, 24 Jun 2009 18:30:14 +0000 (18:30 +0000)]
If the initial attempt to refresh credentials in the RPCSEC_GSS client
side fails, the entry in the cache is left with no valid context
(gd_ctx == GSS_C_NO_CONTEXT). As such, subsequent hits on the cache
will result in persistent authentication failure, even after the user has
done a kinit or similar and acquired a new valid TGT. This patch adds a test
for that case upon a cache hit and calls rpc_gss_init() to make another
attempt at getting valid credentials. It also moves the setting of gc_proc
to before the import of the principal name to ensure that, if that case
fails, it will be detected as a failure after going to "out:".
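The cache-hit retry can be sketched roughly as below. This is not the RPCSEC_GSS code: struct cred_entry, tgt_available, try_init and cache_hit are hypothetical, with try_init standing in for the call to rpc_gss_init() and tgt_available simulating whether the user has a valid TGT.

```c
#include <stdbool.h>
#include <stddef.h>

struct cred_entry {
    bool has_ctx;                   /* false models gd_ctx == GSS_C_NO_CONTEXT */
};

static bool tgt_available;          /* simulates a valid TGT after kinit */

/* Stand-in for the context setup; succeeds only once a TGT exists. */
static bool
try_init(struct cred_entry *e)
{
    if (tgt_available)
        e->has_ctx = true;
    return (e->has_ctx);
}

/*
 * On a cache hit, an entry left without a context by an earlier failure
 * gets another initialization attempt instead of failing forever.
 */
struct cred_entry *
cache_hit(struct cred_entry *e)
{
    if (!e->has_ctx && !try_init(e))
        return (NULL);
    return (e);
}
```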
Jack F Vogel [Wed, 24 Jun 2009 18:27:07 +0000 (18:27 +0000)]
Update the Intel 10G driver: this adds support for the newest
hardware, adds a multiqueue tx interface, and cleans up infrastructure
to allow up to 32 MSIX vectors on newer Nehalem systems.
Bug fixes, etc.
Warner Losh [Wed, 24 Jun 2009 17:23:10 +0000 (17:23 +0000)]
Remove usb. The need to have core@ approve major changes to usb has
passed now that the new usb stack is in the tree. The coordination
issues that necessitated this entry are now OBE.
Alexander Motin [Wed, 24 Jun 2009 17:03:06 +0000 (17:03 +0000)]
Some DMA related changes:
- honor parent DMA tag limitations, as man page requires,
- allow data buffer to be allocated within full 64bit address range, when
support is announced by hardware,
- add quirk, disabling 64bit addresses for broken chips, use it for MCP78.
Fix end-of-line issues that can come up when `lpq' reads information
about a queue from a remote host. That remote host may use \r, \r\n,
or \n\r as the line-ending character. In some cases the remote host
will write a single line of information without *any* EOL sequence.
Translate all the non-unix EOL's to the standard newline, and make
sure the final line includes a terminating newline. Logic is also
added to translate all unprintable characters to '?', but that is
#if-ed out for now.
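A small sketch of the EOL translation described above, written independently of the lpq sources. normalize_eol is a hypothetical helper: it rewrites the buffer in place, collapses \r, \r\n and \n\r to a single newline, and appends a final newline when there is room.

```c
#include <stddef.h>

/*
 * buf holds len bytes of data in a buffer of cap bytes.
 * Returns the new length; the result is never longer than len + 1.
 */
size_t
normalize_eol(char *buf, size_t len, size_t cap)
{
    size_t r = 0, w = 0;

    while (r < len) {
        char c = buf[r++];

        if (c == '\r' || c == '\n') {
            /* Swallow the second half of a \r\n or \n\r pair. */
            if (r < len && (buf[r] == '\r' || buf[r] == '\n') &&
                buf[r] != c)
                r++;
            buf[w++] = '\n';
        } else {
            buf[w++] = c;
        }
    }
    /* Make sure the last line is terminated, if the buffer has room. */
    if (w > 0 && buf[w - 1] != '\n' && w < cap)
        buf[w++] = '\n';
    return (w);
}
```

Two identical EOL characters in a row (for example \n\n) are kept as two line endings, so blank lines survive the translation.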
Unbreak sparc64 after the swap accounting changes: mark kernel_map
entries allocated for translations in pmap_init() as MAP_NOFAULT. This
prevents vm_map_insert from trying to account the entries for swap
usage, which is both wrong and too early to work.
While there, change FALSE to VMFS_NO_SPACE.
Reported and tested by: Florian Smeets <flo at kasimir com>
Reviewed by: marius
Andriy Gapon [Wed, 24 Jun 2009 16:03:57 +0000 (16:03 +0000)]
dtrace/amd64: fix virtual address checks
On amd64 KERNBASE/kernbase does not mean start of kernel memory.
This should fix a KASSERT panic in dtrace_copycheck when copyin*()
is used in D program.
Also make checks for user memory a bit stricter.
Robert Watson [Wed, 24 Jun 2009 14:29:40 +0000 (14:29 +0000)]
Clear 'ia' after iterating if_addrhead for unicast address matching: since
'ifa' was used as the TAILQ_FOREACH() iterator argument, and 'ia' was just
derived from it, it could be left non-NULL which confused later
conditional freeing code. This could cause kernel panics if multicast IP
packets were received. [1]
Call 'struct in_ifaddr *' in ip_rtaddr() 'ia', not 'ifa' in keeping with
normal conventions.
When 'ipstealth' is enabled and ip_input returns early, properly release
the 'ia' reference.
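The stale-pointer problem fixed by the first item can be reproduced in a few lines. The sketch below uses hypothetical types (struct demo_ifaddr, demo_head) and a plain assignment where the kernel derives 'ia' from 'ifa'.

```c
#include <stddef.h>
#include <sys/queue.h>

struct demo_ifaddr {
    int addr;
    TAILQ_ENTRY(demo_ifaddr) link;
};

TAILQ_HEAD(demo_head, demo_ifaddr);

struct demo_ifaddr *
find_addr(struct demo_head *h, int want)
{
    struct demo_ifaddr *ifa, *ia = NULL;

    TAILQ_FOREACH(ifa, h, link) {
        ia = ifa;                   /* 'ia' is derived from the iterator */
        if (ia->addr == want)
            break;
    }
    /*
     * If the loop ran to completion, 'ifa' is NULL but 'ia' still points
     * at the last entry visited; clear it so callers see "no match".
     */
    if (ifa == NULL)
        ia = NULL;
    return (ia);
}
```

Without the final check, a failed search on a non-empty list would return the last entry, which is the kind of non-NULL leftover that confused the later freeing code.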
Robert Watson [Wed, 24 Jun 2009 12:06:15 +0000 (12:06 +0000)]
Add stack_print_short() and stack_print_short_ddb() interfaces to
stack(9), which generate a more compact rendition of a stack trace
via the kernel's printf.
Robert Watson [Wed, 24 Jun 2009 10:32:44 +0000 (10:32 +0000)]
Break at_ifawithnet() into two variants:
- at_ifawithnet(), which acquires any locks it needs and returns an
at_ifaddr reference.
- at_ifawithnet_locked(), which relies on the caller locking
at_ifaddr_list, and returns a pointer rather than a reference.
Update various consumers to prefer one or the other, including ether
and fddi output, to properly release at_ifaddr references.
Rework at_control() to manage locking and references in a manner
identical to in_control().
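The locked/unlocked split follows a common pattern, sketched here in userland with a pthread mutex. struct obj, list_lock, lookup_locked, lookup and obj_release are hypothetical names, not the netatalk functions.

```c
#include <pthread.h>
#include <stddef.h>

struct obj {
    int net;
    int refs;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj table[2] = { { 10, 0 }, { 20, 0 } };

/* Caller must already hold list_lock; returns a borrowed pointer. */
struct obj *
lookup_locked(int net)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (table[i].net == net)
            return (&table[i]);
    return (NULL);
}

/* Takes the lock itself and returns a counted reference. */
struct obj *
lookup(int net)
{
    struct obj *o;

    pthread_mutex_lock(&list_lock);
    o = lookup_locked(net);
    if (o != NULL)
        o->refs++;
    pthread_mutex_unlock(&list_lock);
    return (o);
}

/* Every successful lookup() must be paired with a release. */
void
obj_release(struct obj *o)
{
    pthread_mutex_lock(&list_lock);
    o->refs--;
    pthread_mutex_unlock(&list_lock);
}
```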