Robert Watson [Mon, 5 Oct 2009 14:49:16 +0000 (14:49 +0000)]
First cut at implementing SOCK_SEQPACKET support for UNIX (local) domain
sockets. This allows for reliable bi-directional datagram communication
over UNIX domain sockets, in contrast to SOCK_DGRAM (M:N, unreliable) or
SOCK_STREAM (bi-directional bytestream). Largely, this reuses existing
UNIX domain socket code. This allows applications requiring record-
oriented semantics to get them reliably via local IPC.
Some implementation notes (also present in XXX comments):
- Currently we lack an sbappend variant able to do datagrams and control
data without doing addresses, so we mark SOCK_SEQPACKET as PR_ADDR.
Adding a new variant will solve this problem.
- UNIX domain sockets on FreeBSD provide back-pressure/flow control
notification for stream sockets by manipulating the send socket
buffer's size during pru_send and pru_rcvd. This trick works less well
for SOCK_SEQPACKET as sosend_generic() uses sb_hiwat not just to
manage blocking, but also to determine maximum datagram size. Fixing
this requires rethinking how back-pressure is done for SOCK_SEQPACKET;
in the meantime, it's possible to get EMSGSIZE when buffers fill,
instead of blocking.
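For illustration, a minimal sketch of the record-oriented behaviour described above, assuming a platform where socketpair(2) accepts SOCK_SEQPACKET for AF_UNIX. The function name seqpacket_roundtrip is hypothetical and not part of this change.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send two records over a connected SOCK_SEQPACKET pair and read them back.
 * On success, stores the size of each received record in sizes[0..1] and
 * returns 0; returns -1 on any error.
 * (Hypothetical helper, for illustration only.)
 */
int
seqpacket_roundtrip(ssize_t sizes[2])
{
	int sv[2];
	char buf[64];
	int rc = -1;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1)
		return (-1);

	if (send(sv[0], "hello", 5, 0) != 5 || send(sv[0], "bye", 3, 0) != 3)
		goto out;

	/* Each recv() returns one whole record, never a merged byte stream. */
	sizes[0] = recv(sv[1], buf, sizeof(buf), 0);
	if (sizes[0] != 5 || memcmp(buf, "hello", 5) != 0)
		goto out;
	sizes[1] = recv(sv[1], buf, sizeof(buf), 0);
	if (sizes[1] != 3 || memcmp(buf, "bye", 3) != 0)
		goto out;
	rc = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return (rc);
}
```

With SOCK_STREAM the two sends could be returned by a single recv(); with SOCK_SEQPACKET the record boundaries are preserved.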
John Baldwin [Mon, 5 Oct 2009 14:13:16 +0000 (14:13 +0000)]
When the timeout backoff hits the maximum value, leave it capped at the
maximum value rather than setting it to the result of a boolean expression
that is always true.
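A minimal sketch of a capped backoff, for illustration only. The constant BACKOFF_MAX, its value, the doubling policy, and the function name next_backoff are all assumptions; the commit message does not name the driver's actual constants.

```c
#include <assert.h>

/* Hypothetical cap; the commit does not name the constant or its value. */
#define BACKOFF_MAX 64

/*
 * Double the timeout (doubling is an assumption), clamping the result to
 * BACKOFF_MAX so that it stays at the cap once the cap is reached.
 */
unsigned int
next_backoff(unsigned int timeout)
{
	if (timeout >= BACKOFF_MAX / 2)
		return (BACKOFF_MAX);
	return (timeout == 0 ? 1 : timeout * 2);
}
```

Assigning the result of a comparison instead would store 0 or 1, which is the class of bug the commit describes.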
Stanislav Sedov [Mon, 5 Oct 2009 10:08:58 +0000 (10:08 +0000)]
- Drop unused pmap_use_l1 function and comment out currently unused
pmap_dcache_wbinv_all/pmap_copy_page functions which we might want
to take advantage of later. This fixes the build with PMAP_DEBUG
defined.
Edwin Groothuis [Mon, 5 Oct 2009 07:13:15 +0000 (07:13 +0000)]
Modified locale(1) to be able to show the altmon_X fields and the [cxX]_fmt's.
Also modify the "-k list" option to display only fields with a certain prefix.
Edwin Groothuis [Mon, 5 Oct 2009 07:11:19 +0000 (07:11 +0000)]
Modified locale(1) to be able to show the altmon_X fields and the [cxX]_fmt's.
Also modify the "-k list" option to display only fields with a certain prefix.
Matt Jacob [Mon, 5 Oct 2009 01:31:16 +0000 (01:31 +0000)]
The cylinder group tag cg_initediblk needs to match the number of inodes
actually initialized. In the growfs case for UFS2, no inodes were actually
being initialized and the number of inodes noted as initialized was the
number of inodes per group. This created a filesystem that was deemed
corrupted because the inodes thus added were full of garbage.
David Schultz [Sun, 4 Oct 2009 19:43:36 +0000 (19:43 +0000)]
Better glibc compatibility for getline/getdelim:
- Tolerate applications that pass a NULL pointer for the buffer and
claim that the capacity of the buffer is nonzero.
- If an application passes in a non-NULL buffer pointer and claims the
buffer has zero capacity, we should free (well, realloc) it
anyway. It could have been obtained from malloc(0), so failing to
free it would be a small memory leak.
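Both cases above are tolerated for compatibility rather than recommended. A minimal sketch of the conventional call, which passes a NULL buffer together with a zero capacity and so avoids both, might look like this. The helper name first_line_length is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Write `text` to a temporary file, then read its first line back with
 * getline(). Returns the line length (including any newline), or -1 on
 * error or end-of-file. (Hypothetical helper, for illustration only.)
 */
ssize_t
first_line_length(const char *text)
{
	FILE *fp;
	char *line = NULL;	/* NULL buffer ... */
	size_t cap = 0;		/* ... with zero capacity: getline() allocates. */
	ssize_t len;

	if ((fp = tmpfile()) == NULL)
		return (-1);
	if (fputs(text, fp) == EOF) {
		fclose(fp);
		return (-1);
	}
	rewind(fp);

	len = getline(&line, &cap, fp);

	free(line);	/* Free the buffer even when getline() fails. */
	fclose(fp);
	return (len);
}
```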
Attilio Rao [Sat, 3 Oct 2009 15:02:55 +0000 (15:02 +0000)]
When releasing a lockmgr lock held in shared mode we need to use a write
memory barrier in order to avoid CPU instruction reordering on
architectures which don't have strongly ordered writes.
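As an illustration of the idea, here is a minimal sketch using C11 atomics rather than the kernel's own atomic(9) primitives. The struct and function names are hypothetical and do not reflect the lockmgr(9) data layout.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reader-count lock; not the lockmgr(9) data layout. */
struct shared_lock {
	atomic_uint readers;
};

void
shared_acquire(struct shared_lock *lk)
{
	/* Acquire ordering: later accesses cannot move before the increment. */
	atomic_fetch_add_explicit(&lk->readers, 1, memory_order_acquire);
}

/* Drop one shared hold and return the number of readers remaining. */
unsigned int
shared_release(struct shared_lock *lk)
{
	/*
	 * Release ordering: accesses made while the lock was held cannot be
	 * reordered after the decrement that gives the lock up.
	 */
	return (atomic_fetch_sub_explicit(&lk->readers, 1,
	    memory_order_release) - 1);
}
```

Without release ordering on the decrement, a weakly ordered CPU could make the unlock visible before the accesses it was protecting.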
Bjoern A. Zeeb [Sat, 3 Oct 2009 11:57:21 +0000 (11:57 +0000)]
Make sure that the primary native brandinfo always gets added
first and the native ia32 compat as middle (before other things).
o(ld)brandinfo as well as third-party ones like linux, kfreebsd, etc.
stay on SI_ORDER_ANY, coming last.
The reason for this is only to make sure that even in case we would
overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo
would still be there and the system would be operational.
Fix RTS/CTS flow control, broken by the TTY overhaul. The new TTY
interface is fairly simple WRT dealing with flow control, but
needed 2 new RX buffer functions with "get-char-from-buf" separated
from "advance-buf-pointer" so that the pointer could be advanced
only when ttydisc_rint() succeeded.
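The peek/advance split can be sketched as follows. This is an illustration under assumptions, not the driver's code: the ring buffer, its size, and all names here are hypothetical, and limited_consumer merely stands in for a consumer such as ttydisc_rint() that may refuse input.

```c
#include <assert.h>
#include <stddef.h>

#define RXBUF_SIZE 16	/* hypothetical buffer size */

struct rxbuf {
	unsigned char data[RXBUF_SIZE];
	size_t head;	/* index of the next byte to consume */
	size_t count;	/* number of bytes currently stored */
};

/* Append one byte; returns -1 if the buffer is full. */
int
rxbuf_put(struct rxbuf *rb, unsigned char c)
{
	if (rb->count == RXBUF_SIZE)
		return (-1);
	rb->data[(rb->head + rb->count) % RXBUF_SIZE] = c;
	rb->count++;
	return (0);
}

/* Look at the next byte without consuming it; returns -1 when empty. */
int
rxbuf_peek(const struct rxbuf *rb)
{
	return (rb->count == 0 ? -1 : rb->data[rb->head]);
}

/* Consume the byte last returned by rxbuf_peek(). */
void
rxbuf_advance(struct rxbuf *rb)
{
	if (rb->count == 0)
		return;
	rb->head = (rb->head + 1) % RXBUF_SIZE;
	rb->count--;
}

/* Sample consumer: accepts bytes until *budget (an int) runs out. */
int
limited_consumer(void *ctx, unsigned char c)
{
	int *budget = ctx;

	(void)c;
	if (*budget <= 0)
		return (-1);
	(*budget)--;
	return (0);
}

/*
 * Hand bytes to a consumer that may refuse them. The pointer is advanced
 * only after the consumer accepts, so a refused byte stays queued.
 * Returns the number of bytes delivered.
 */
size_t
rxbuf_drain(struct rxbuf *rb, int (*consume)(void *, unsigned char),
    void *ctx)
{
	size_t n = 0;
	int c;

	while ((c = rxbuf_peek(rb)) != -1) {
		if (consume(ctx, (unsigned char)c) != 0)
			break;
		rxbuf_advance(rb);
		n++;
	}
	return (n);
}
```

With a combined get-and-advance function, a byte refused by the consumer would already have been removed from the buffer and lost.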
Bjoern A. Zeeb [Fri, 2 Oct 2009 17:48:51 +0000 (17:48 +0000)]
Add a mitigation feature that will prevent user mappings at
virtual address 0, limiting the ability to convert a kernel
NULL pointer dereference into a privilege escalation attack.
If the sysctl is set to 0 a newly started process will not be able
to map anything in the address range of the first page (0 to PAGE_SIZE).
This is the default. Already running processes are not affected by this.
You can either change the sysctl or the tunable from loader in case
you need to map at a virtual address of 0, for example when running
any of the extinct species of a set of a.out binaries, vm86 emulation, ..
In that case set security.bsd.map_at_zero="1".
Supersedes: r197537
In collaboration with: jhb, kib, alc
Hiroki Sato [Fri, 2 Oct 2009 07:00:20 +0000 (07:00 +0000)]
Enable adding a link-local address even if ND6_IFF_IFDISABLED.
Note that when the interface has ND6_IFF_IFDISABLED, a newly-added
address is always marked as IN6_IFF_TENTATIVE so that the interface
can perform DAD after the ND6_IFF_IFDISABLED is cleared.
Hiroki Sato [Fri, 2 Oct 2009 06:19:34 +0000 (06:19 +0000)]
Revert the previous afexists() change. Knobs configured explicitly by
the user should not be ignored if possible even if the kernel does not
support the prerequisite feature.
Qing Li [Fri, 2 Oct 2009 01:45:11 +0000 (01:45 +0000)]
Remove a log message from production code. This log message can be
triggered by a misconfigured host that is sending out gratuitous ARPs.
This log message can also be triggered during a network renumbering
event when multiple prefixes co-exist on a single network segment.
Qing Li [Fri, 2 Oct 2009 01:34:55 +0000 (01:34 +0000)]
Previously, if an address alias is configured on an interface, and
this address alias has a prefix matching that of another address
configured on the same interface, then the ARP entry for the alias
is not deleted from the ARP table when that address alias is removed.
This patch fixes the aforementioned issue.
Ed Maste [Thu, 1 Oct 2009 21:44:30 +0000 (21:44 +0000)]
In fill_kinfo_thread, copy the thread's name into struct kinfo_proc even
if it is empty. Otherwise the previous thread's name would remain in the
struct and then be reported for this thread.
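The failure mode is easiest to see when one output struct is reused across threads. A minimal sketch, with a hypothetical struct standing in for the relevant part of struct kinfo_proc and a hypothetical name length:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 20	/* hypothetical field size */

/* Stand-in for the relevant part of struct kinfo_proc. */
struct thread_info {
	char name[NAME_LEN];
};

/*
 * Copy unconditionally, so that an empty name overwrites whatever the
 * previous thread left in the reused struct.
 */
void
fill_thread_name(struct thread_info *out, const char *name)
{
	snprintf(out->name, sizeof(out->name), "%s", name);
}
```

Skipping the copy when the name is empty would leave the earlier thread's name in place.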
Jilles Tjoelker [Thu, 1 Oct 2009 21:40:08 +0000 (21:40 +0000)]
sh: Disallow mismatched quotes in backticks (`...`).
Due to the amount of code removed by this, it seems that allowing unmatched
quotes was a deliberate imitation of System V sh and real ksh. Most other
shells do not allow unmatched quotes (e.g. bash, zsh, pdksh, NetBSD /bin/sh,
dash).
Jung-uk Kim [Thu, 1 Oct 2009 20:56:15 +0000 (20:56 +0000)]
Compile ACPI debugger and disassembler for kernel modules unconditionally.
These files will generate almost empty object files without ACPI_DEBUG/DDB
options. As a result, size of acpi.ko will increase slightly.
Qing Li [Thu, 1 Oct 2009 20:32:29 +0000 (20:32 +0000)]
The flow-table associates TCP/UDP flows and IP destinations with
specific routes. When the routing table changes, for example,
when a new route with a more specific prefix is inserted into the
routing table, the flow-table is not updated to reflect that change.
As such existing connections cannot take advantage of the new path.
In some cases the path is broken. This patch will update the affected
flow-table entries when a more specific route is added. The route
entry is properly marked when a route is deleted from the table.
In this case, when the flow-table performs a search, the stale
entry is updated automatically. Therefore this patch is not
necessary for route deletion.
Andrew Thompson [Thu, 1 Oct 2009 18:37:16 +0000 (18:37 +0000)]
EHCI Hardware BUG workaround
The EHCI HW can use the qtd_next field instead of qtd_altnext when a short
packet is received. This contradicts what is stated in the EHCI datasheet.
Also the total-bytes field in the status field of the following TD gets
corrupted upon reception of a short packet! We work around this in software by
not queueing more than one job/TD at a time of up to 16Kbytes! The bug has been
seen on multiple INTEL based EHCI chips. Other vendors have not been tested
yet.
- Applications using /dev/usb/X.Y.Z, where Z is non-zero are affected, but not
applications using LibUSB v0.1, v1.2 and v2.0.
- Mass Storage (umass) is affected.
Submitted by: Hans Petter Selasky
MFC after: 3 days
Correct the pthread stub prototype for pthread_mutexattr_settype to allow for
the type argument. This is known to fix some pthread_mutexattr_settype()
invocations, especially when it comes to pulseaudio.
Approved by: kib
deischen (threads)
MFC after: 3 days
Provide default implementation for VOP_ACCESS(9), so that filesystems which
want to provide VOP_ACCESSX(9) don't have to implement both. Note that
this commit makes implementation of either of these two mandatory.
As a workaround, for Intel CPUs, do not use CLFLUSH in
pmap_invalidate_cache_range() when self-snoop is apparently not reported
in cpu features. We get a reserved trap when clflushing the APIC
registers window.
XEN in full system virtualization mode removes self-snoop from CPU
features, making this a problem.
Tested by: csjp
Reviewed by: alc
MFC after: 3 days
John Baldwin [Wed, 30 Sep 2009 17:05:26 +0000 (17:05 +0000)]
Split the 'video' ACPI lock up into two locks to resolve a LOR with the
sysctl lock. The 'video' lock now protects the 'bus' of video output
devices attached to a graphics adapter. It is used when iterating over
the list of outputs, etc. The 'video_output' lock is used to lock the
output-specific data similar to a driver lock for the individual video
outputs.
Don't do an IPv6 operation when the kernel doesn't have
IPv6 support.
Reported by: Alexander Best <alexbestms__at__math.uni-muenster.de>
Confirmed by: Paul B. Mahol <onemda__at__gmail.com>,
Alexander Best <alexbestms__at__math.uni-muenster.de>
Andrew Gallatin [Wed, 30 Sep 2009 14:42:06 +0000 (14:42 +0000)]
Two more mxge watchdog fixes:
1) Restore the PCI Express control register after a watchdog
reset. This is required because the device will come out
of watchdog reset with the pectl reg at its default state,
and important BIOS configuration (like max payload size)
could be lost.
2) Call mxge_start_locked() for every tx queue before dropping
the lock in the watchdog handler. This is required, as
the queue's buf ring may have filled during the reset.
Coleman Kane [Wed, 30 Sep 2009 14:28:38 +0000 (14:28 +0000)]
Correct a bug that could lead to a kernel panic if a user attempted to
perform 802.11 operations directly on the ndis0 interface before the
first VAP (wlan0) had been created. This would lead to a NULL-pointer
dereference in the kernel.
Submitted by: Paul B. Mahol <onemda@gmail.com>
MFC after: 3 days
When releasing a read/shared lock we need to use a write memory barrier
in order to avoid CPU instruction reordering on architectures which
don't have strongly ordered writes.
Diagnosed by: fabio
Reviewed by: jhb
Tested by: Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
Robert Watson [Wed, 30 Sep 2009 08:46:01 +0000 (08:46 +0000)]
Reserve system call numbers for Capsicum security framework capabilities,
capability mode, and process descriptors: cap_new, cap_getrights, cap_enter,
cap_getmode, pdfork, pdkill, pdgetpid, and pdwait.
Alexander Motin [Tue, 29 Sep 2009 09:36:38 +0000 (09:36 +0000)]
Add some bits of HDMI/DisplayPort support from later specification updates.
It may not be enough to make them work, but at least should give some
information about these beasts.
The first 96 bytes may not be zeroes. They can contain trivial boot
code that merely emits an error and waits for a key press before
rebooting. The error being that extended partitions are not
bootable. The origin is presumed to be Windows 2000; Windows XP
does not do this...
For now, ignore the first 96 bytes when checking that the EBR is
(for the most part) all zeroes.
Tested by: Mario Lobo <mlobo@digiart.art.br>
MFC after: 1 week
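The relaxed check can be sketched as below. The function name and constants are hypothetical, and the assumption that the zero check runs from byte 96 up to the partition table at offset 446 is mine; the commit message only states that the first 96 bytes are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE	512
#define PART_TABLE_OFF	446	/* start of the partition table in an MBR/EBR */
#define EBR_SKIP	96	/* leading bytes that may hold stub boot code */

/*
 * Return 1 if bytes [EBR_SKIP, PART_TABLE_OFF) are all zero, else 0.
 * The range end is an assumption made for this sketch.
 */
int
ebr_bootcode_is_blank(const unsigned char sector[SECTOR_SIZE])
{
	size_t i;

	for (i = EBR_SKIP; i < PART_TABLE_OFF; i++)
		if (sector[i] != 0)
			return (0);
	return (1);
}
```

A sector with stub boot code confined to the first 96 bytes passes this check, while stray data later in the boot code area still fails it.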
For AR8132 fast ethernet controller, do not report 1000baseT
capability to mii(4). Even though AR8132 uses the same model/
revision number as the F1 gigabit PHY, the PHY has no ability to
establish a 1000baseT link. I have no idea why Atheros uses the same
device/model id for this PHY.
With this change atphy(4) does not report 1000baseT media
capability and manual 1000baseT configuration is also disabled,
which is the more desirable behavior for a 10/100Mbps PHY.
Add DGE-560SX(Yukon XL) to the supported device list. Many thanks
to "Eugene Perevyazko <john <> dnepro dot net>" who kindly gave
remote access to a system with DGE-560SX.