The libc acl_valid(3) function validates the contents of a POSIX.1e ACL.
This change removes the requirement that an ACL contain no ACL_USER
entries with a uid the same as those of a file, or ACL_GROUP entries
with a gid the same as those of a file. This requirement is not in the
specification, and not enforced by the kernel's ACL implementation.
Reported by: Iustin Pop <iusty at k1024 dot org>
MFC after: 1 week
ed [Sun, 13 Jul 2008 07:20:14 +0000 (07:20 +0000)]
Make uart(4) the default serial port driver on i386 and amd64.
The uart(4) driver has the advantage of supporting a wider variety of
hardware on a greater amount of platforms. This driver has already been
the standard on platforms such as ia64, powerpc and sparc64.
I've decided not to change anything on pc98. I'd rather let people from
the pc98 team look at this.
Refine the changes made in SVN rev 180430. Specifically, instantiate a new
page table page only if the 2MB page mapping has been used. Also, refactor
some assertions.
In nmount(), if we see "update" in the mount options,
set MNT_UPDATE in fsflags, and delete the
"update" option from the global mount options.
MNT_UPDATE is a command, and not a property of a mount
that should persist after the command is executed.
We need to do similar things for MNT_FORCE and MNT_RELOAD.
All mount flags are prefixed by MNT_..... it would
be nice if flags which were commands were named differently
from flags which are persistent properties of a mount.
This was not such a big deal in the pre-nmount() days,
but with nmount() it is more important.
Strongly discourage the use of the query-source option, and explain why.
Give a better example if a user absolutely must use this option, and
suggest they pick something from the ephemeral port range rather than
port 53. This means that the example will not work if it is merely
uncommented, but this will hopefully encourage users to read the comment.
Merge from vendor/bind9/dist as of the 9.4.2-P1 import, including
the patch from ISC for lib/bind9/check.c and deletion of unused
files in lib/bind.
This version will by default randomize the UDP query source port
(and sequence number of course) for every query.
In order to take advantage of this randomization users MUST have an
appropriate firewall configuration to allow UDP queries to be sent and
answers to be received on random ports; and users MUST NOT specify a
port number using the query-source[-v6] options.
The avoid-v[46]-udp-ports options exist for users who wish to eliminate
certain port numbers from being chosen by named for this purpose. See
the ARM Chatper 6 for more information.
Also please note, this issue applies only to UDP query ports. A random
ephemeral port is always chosen for TCP queries.
This issue applies primarily to name servers whose main purpose is to
resolve random queries (sometimes referred to as "caching" servers, or
more properly as "resolving" servers), although even an "authoritative"
name server will make some queries, primarily at startup time.
All users of BIND are strongly encouraged to upgrade to the latest
version, and to utilize the source port randomization feature.
This update addresses issues raised in:
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1447
http://www.kb.cert.org/vuls/id/800113
http://tools.ietf.org/html/draft-ietf-dnsext-forgery-resilience
A number of significant enhancements to the ciss driver:
1. The FreeBSD driver was setting an interrupt coalesce delay of 1000us
for reasons that I can only speculate on. This was hurting everything
from lame sequential I/O "benchmarks" to legitimate filesystem metadata
operations that relied on serialized barrier writes. One of my
filesystem tests went from 35s to complete down to 6s.
2. Implemented the Performant transport method. Without the fix in
(1), I saw almost no difference. With it, my filesystem tests showed
another 5-10% improvement in speed. It was hard to measure CPU
utilization in any meaningful way, so it's not clear if there was a
benefit there, though there should have been since the interrupt handler
was reduced from 2 or more PCI reads down to 1.
3. Implemented MSI-X. Without any docs on this, I was just taking a
guess, and it appears to only work with the Performant method. This
could be a programming or understanding mistake on my part. While this
by itself made almost no difference to performance since the Performant
method already eliminated most of the synchronous reads over the PCI
bus, it did allow the CISS hardware to stop sharing its interrupt with
the USB hardware, which in turn allowed the driver to become decoupled
from the Giant-locked USB driver stack. This increased performance by
almost 20%. The MSI-X setup was done with 4 vectors allocated, but only
1 vector used since the performant method was told to only use 1 of 4
queues. Fiddling with this might make it work with the simpleq method,
not sure. I did not implement MSI since I have no MSI-specific hardware
in my test lab.
4. Improved the locking in the driver, trimmed some data structures.
This didn't improve test times in any measurable way, but it does look
like it gave a minor improvement to CPU usage when many
processes/threads were doing I/O in parallel. Again, this was hard to
accurately test.
Dust off old code for support of USB isochronous transfers.
USB isochronous transfer support is required for Bluetooth SCO.
While i'm here change u_int to uint and update TODO.
This should produce no visible changes unless the device is
broken (or really old).
Use the VM_ALLOC_INTERRUPT for the page requests when allocating memory
for the bio for swapout write. It allows the page allocator to drain
free page list deeper. As result, a deadlock where pageout deamon sleeps
waiting for bio to be allocated for swapout is no more reproducable in
practice.
Alan said that M_USE_RESERVE shall be ressurrected and used there, but
until this is implemented, M_NOWAIT does exactly what is needed.
Make it atomic for the devfs_populate_loop() to see the setting of
SI_ALIAS flag and initialization of the si_parent when alias is created.
Assert that supplied parent device is not NULL.
Both situations could cause NULL dereference in the
devfs_populate_loop() when creating a symlink for SI_ALIAS'ed device.
Namely, cdp->cdp_c.si_parent may be NULL.
id_t is a 64-bit integer and thus is passed as two arguments like off_t is.
As a result, those arguments must be recombined before calling the real
syscal implementation. This change fixes 32-bit compatibility for
cpuset_getid(), cpuset_setid(), cpuset_getaffinity(), and
cpuset_setaffinity().
Use 'CSCOPE_ARCHDIR' to change the default architecture directories to
cscope. After the addition of sys/modules/dtrace/dtrace, setting
'ARCHDIR' in /etc/src.conf breaks the build.
Apply the MAC label to an outgoing UDP packet when other inpcb properties are
processed, meaning that we avoid the cost of MAC label assignment if we're
going to drop the packet due to mbuf exhaustion, etc.
peter [Thu, 10 Jul 2008 02:08:00 +0000 (02:08 +0000)]
Merge gnu cpio 2.6 -> 2.8 changes. Unfortunately, we have massive
conflicts due to radically different approaches to security and bug fixes.
In some cases I re-started from the vendor version and reimplemented our
patches. Fortunately, this is not enabled by default in -current.
Add a new program to the multicast test suite. The mcgrab program
is used to grab and hold some number of multicast addresses in order
to test what happens when an interface goes over the number of multicast
addresses it can filter in hardware.
peter [Wed, 9 Jul 2008 19:44:37 +0000 (19:44 +0000)]
Band-aid a problem with 32 bit selector setup.
Initialize %ds, %es, and %fs during CPU startup. Otherwise a garbage
value could leak to a 32-bit process if a process migrated to a different
CPU after exec and the new CPU had never exec'd a 32-bit process.
A more complete fix is needed, but this mitigates the most frequent
manifestations.
Rather than checking for a NULL so_pcb in raw_attach(), assert that
it's non-NULL, as all callers can and should already do the required
checking. Update comments a bit more to talk about rawcb allocation
for consumers.
Add sysctl subtree net.raw for generic raw socket infrastructure;
expose default send and receive socket buffer sizes using sysctls
so that they can be administered centrally.
Improve the EEPROM parsing, based on finding a datasheet that describes
it in detail.
When setting media, don't error out when a specific media is selected.
# Note: There may be some issues still here since the EtherJet PC Card doesn't
# conform to the datasheet. Many different kinds of dongles can be plugged in
# and it is unknown how to ask which one it is.
Also, add a /* bad! */ comment to a 1/2 second delay after we set the
DC/DC parameters. This should be a *sleep of some sort for !cold.
Fortunately it is the only one and is only used when setting media, so
the benefit from removing it is small. Unfortunately, it likely
serves as an exemplar of good programming techniques, which it isn't.
1) Adds the rest of the VIMAGE change macros
2) Adds some __UserSpace__ on some of the common defines that
the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
need to a) trim off the data chunk headers, if present, and
b) make sure the frag bit is communicated properly for the
msgs coming off the stream queues... i.e. we see if some
of the msg has been taken.
Obtained from: jeli contributed the VIMAGE changes on this pass Thanks Julain!
Remove unused support for local and foreign addresses in generic raw
socket support. These utility routines are used only for routing and
pfkey sockets, neither of which have a notion of address, so were
required to mock up fake socket addresses to avoid connection
requirements for applications that did not specify their own fake
addresses (most of them).
Quite a bit of the removed code is #ifdef notdef, since raw sockets
don't support bind() or connect() in practice. Removing this
simplifies the raw socket implementation, and removes two (commented
out) uses of dtom(9).
Fake addresses passed to sendto(2) by applications are ignored for
compatibility reasons, but this is now done in a more consistent way
(and with a comment). Possibly, EINVAL could be returned here in
the future if it is determined that no applications depend on the
semantic inconsistency of specifying a destination address for a
protocol without address support, but this will require some amount
of careful surveying.
NB: This does not affect netinet, netinet6, or other wire protocol
raw sockets, which provide their own independent infrastructure with
control block address support specific to the protocol.
Add driver support for RTL8102E and RTL8102EL which is the second
generation of RTL810x PCIe fast ethernet controller. Note, Tx/Rx
descriptor format is different from that of first generation of
RTL8101E series. Jumbo frame is not supported for RTL810x
family.
Tested by: NAGATA Shinya ( maya AT negeta DOT com )
Fix a mutex LOR introduced by the conversion of if_ndis from spinlocks to
mutexes and replacing the obsolete if_watchdog interface. The ndis_ticktask
function calls into ieee80211_new_state under one condition with NDIS_LOCK
held. The ieee80211_new_state would call into ndis_start in some cases too,
resulting in the occasional case where ndis_start acquires NDIS_LOCK from
inside the NDIS_LOCK held by ndis_ticktask.
Obtained from: Paul B. Mahol <onemda@gmail.com>
MFC after: 1 week
Eliminate pmap_growkernel()'s dependence on create_pagetables() preallocating
page directory pages from VM_MIN_KERNEL_ADDRESS through the end of the
kernel's bss. Specifically, the dependence was in pmap_growkernel()'s one-
time initialization of kernel_vm_end, not in its main body. (I could not,
however, resist the urge to optimize the main body.)
Reduce the number of preallocated page directory pages to just those needed
to support NKPT page table pages. (In fact, this allows me to revert a
couple of my earlier changes to create_pagetables().)
Queue decapsulated packed instead of performing direct dispatch. Some
execution pathes might hit stack limit under certain circumstances
(e.g. ng_mppc).
- Simplify the procedure of retrieving XML-data from the kernel.
- Fix a number of potential memory leaks in libgeom related to doing realloc
without freeing old pointer if things go wrong.
- Fix a number of places in libgeom where malloc and calloc return values
were not checked.
- Check malloc return value and provide sufficient warning messages when XML
parsing fails.
PR: kern/83464
Submitted by: Dan Lukes <dan - at - obluda.cz>
Approved by: kib (mentor)
Rev 180333, ``Change create_pagetables() and pmap_init() so that many fewer
page table pages have to be preallocated ...'', violates an assumption made
by minidumpsys(): kernel_vm_end is the highest virtual address that has ever
been used by the kernel. Now, however, the kernel code, data, and bss may
reside at addresses beyond kernel_vm_end. This revision modifies the upper
bound on minidumpsys()'s two page table traversals to account for this
possibility.
Move cpuset_refroot and cpuset_refbase functions up, grouping the
cpuset_ref* functions together. Will make it easier to read and
add code without forward declarations.
No functional changes.
Add inline function ia64_fc_i() to abstract inline assembly.
Use the new inline function in ia64_invalidate_icache().
While there, add proper synchronization so that we know
the fc.i instructions have taken effect when we return.
In FreeBSD 7.0 and beyond, pmap_growkernel() should pass VM_ALLOC_INTERRUPT
to vm_page_alloc() instead of VM_ALLOC_SYSTEM. VM_ALLOC_SYSTEM was the
logical choice before FreeBSD 7.0 because VM_ALLOC_INTERRUPT could not
reclaim a cached page. Simply put, there was no ordering between
VM_ALLOC_INTERRUPT and VM_ALLOC_SYSTEM as to which "dug deeper" into the
cache and free queues. Now, there is; VM_ALLOC_INTERRUPT dominates
VM_ALLOC_SYSTEM.
While I'm here, teach pmap_growkernel() to request a prezeroed page.
When making release with NOPORTS, we'll checkout only the
mininal set of ports required to make the docs. However,
we also need ports/sysutils/cdrtools in order to make the
ISO images. When a platform doesn't have packages, the
release will fail in that case. Add ports/sysutils/cdrtools
to RELEASEPORTSMODULE for the DOMINIMALDOCPORTS case to
handle the NOPORTS release build.
Note that this change doesn't try to handle the NOPORTS with
NODOC case. For we have NOPORTSATALL set and it seems wrong
to check out a ports module in that case.
Allow udp_notify() to accept read, as well as write, locks on the passed
inpcb. When directly invoking udp_notify() from udp_ctlinput(), acquire
only a read lock; we may still see write locks in udp_notify() as the
in_pcbnotifyall() routine is shared with TCP and always uses a write lock
on the inpcb being notified.
Add additional udbinfo and inpcb locking assertions to udp_output(); for
some code paths, global or inpcb write locks are required, but for other
code paths, read locks or no locking at all are sufficient for the data
structures.
First step towards parallel transmit in UDP: if neither a specific
source or a specific destination address is requested as part of a send
on a UDP socket, read lock the inpcb rather than write lock it. This
will allow fully parallel transmit down to the IP layer when sending
simultaneously from multiple threads on a connected UDP socket.
Parallel transmit for more complex cases, such as when sendto(2) is
invoked with an address and there's already a local binding, will
follow.
Drop read lock on udbinfo earlier during delivery to the last matching
UDP socket for a datagram; the inpcb read lock is sufficient to provide
inpcb stability during udp6_append().
The kqueue_register() function assumes that it is called from the top of
the syscall code and acquires various event subsystem locks as needed.
The handling of the NOTE_TRACK for EVFILT_PROC is currently done by
calling the kqueue_register() from filt_proc() filter, causing recursive
entrance of the kqueue code. This results in the LORs and recursive
acquisition of the locks.
Implement the variant of the knote() function designed to only handle
the fork() event. It mostly copies the knote() body, but also handles
the NOTE_TRACK, removing the handling from the filt_proc(), where it
causes problems described above. The function is called from the fork1()
instead of knote().
When encountering NOTE_TRACK knote, it marks the knote as influx
and drops the knlist and kqueue lock. In this context call to
kqueue_register is safe from the problems.
An error from the kqueue_register() is reported to the observer as
NOTE_TRACKERR fflag.
Drop read lock on udbinfo earlier during delivery to the last matching
UDP socket for a datagram; the inpcb read lock is sufficient to provide
inpcb stability during udp_append().
Add a new ioctl for changing the read filter (BIOCSETFNR). This is
just like BIOCSETF but it doesn't drop all the packets buffered on
the discriptor and reset the statistics.
Also, when setting the write filter, don't drop packets waiting to
be read or reset the statistics.
The r178914 I erronously put the setting of the KQ_FLUXWAIT flag before
KQ_FLUX_WAKEUP(). Since the later macro clears the KQ_FLUXWAIT, the
kqueue_scan() thread may be not woken up.
Move the setting of KQ_FLUXWAIT after wakeup to correct the issue.