Add additional udbinfo and inpcb locking assertions to udp_output(); for
some code paths, global or inpcb write locks are required, but for other
code paths, read locks or no locking at all are sufficient for the data
structures.
First step towards parallel transmit in UDP: if neither a specific
source nor a specific destination address is requested as part of a send
on a UDP socket, read lock the inpcb rather than write lock it. This
will allow fully parallel transmit down to the IP layer when sending
simultaneously from multiple threads on a connected UDP socket.
Parallel transmit for more complex cases, such as when sendto(2) is
invoked with an address and there's already a local binding, will
follow.
Drop read lock on udbinfo earlier during delivery to the last matching
UDP socket for a datagram; the inpcb read lock is sufficient to provide
inpcb stability during udp6_append().
The kqueue_register() function assumes that it is called from the top of
the syscall code and acquires various event subsystem locks as needed.
NOTE_TRACK handling for EVFILT_PROC is currently done by calling
kqueue_register() from the filt_proc() filter, causing recursive entry
into the kqueue code. This results in LORs and recursive acquisition
of locks.
Implement a variant of the knote() function designed to handle only
the fork() event. It mostly copies the knote() body, but also handles
NOTE_TRACK, removing that handling from filt_proc(), where it causes
the problems described above. The function is called from fork1()
instead of knote().
When it encounters a NOTE_TRACK knote, it marks the knote as influx
and drops the knlist and kqueue locks. In this context the call to
kqueue_register() is safe from those problems.
An error from kqueue_register() is reported to the observer as the
NOTE_TRACKERR fflag.
Drop read lock on udbinfo earlier during delivery to the last matching
UDP socket for a datagram; the inpcb read lock is sufficient to provide
inpcb stability during udp_append().
Add a new ioctl for changing the read filter (BIOCSETFNR). This is
just like BIOCSETF but it doesn't drop all the packets buffered on
the descriptor and reset the statistics.
Also, when setting the write filter, don't drop packets waiting to
be read or reset the statistics.
In r178914 I erroneously put the setting of the KQ_FLUXWAIT flag before
KQ_FLUX_WAKEUP(). Since the latter macro clears KQ_FLUXWAIT, the
kqueue_scan() thread may not be woken up.
Move the setting of KQ_FLUXWAIT after wakeup to correct the issue.
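
The ordering problem can be modelled in a few lines of single-threaded
C. This is only a sketch: SK_FLUXWAIT, kq_sketch and the functions below
are hypothetical stand-ins, and the real macro also performs a wakeup()
under the kqueue lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_FLUXWAIT 0x01	/* hypothetical flag: a thread wants a wakeup */

struct kq_sketch {
	int state;
};

/* Like the macro described above: wake any waiter and clear the flag. */
static void
sk_flux_wakeup(struct kq_sketch *kq)
{
	assert(kq != NULL);
	if (kq->state & SK_FLUXWAIT)
		kq->state &= ~SK_FLUXWAIT;	/* a real wakeup() would go here */
}

/* Buggy order: the flag is set, then cleared again by the wakeup. */
static bool
prepare_sleep_buggy(struct kq_sketch *kq)
{
	kq->state |= SK_FLUXWAIT;
	sk_flux_wakeup(kq);
	return ((kq->state & SK_FLUXWAIT) != 0);
}

/* Fixed order: wake others first, then record that we will sleep. */
static bool
prepare_sleep_fixed(struct kq_sketch *kq)
{
	sk_flux_wakeup(kq);
	kq->state |= SK_FLUXWAIT;
	return ((kq->state & SK_FLUXWAIT) != 0);
}
```

Each function returns whether the flag is still set when the caller is
about to sleep; only in the fixed order will a later waker see it.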
As discussed on IRC and at BSDcan, move the mips32/* directories up a
level. The distinction was artificial. Some more movement around the
deck chairs is likely depending on the fallout from this one.
Paths were corrected after the svn mv. Hope that's OK.
- This code was initially obtained from NetBSD, but it is missing a
licence statement. Add the one from the current NetBSD version.
- Also bump the date to reflect the content changes I made in the
previous revision.
Merge from NetBSD's pcmciadev file (rev ~1.208 - 1.226) where
appropriate (versions not appropriate to merge omitted):
o 1.226 imp nop, save for NetBSD string (minor merging the other way)
o 1.225 jnemeth Corega LAPCCTXD
o 1.224 martin (remove 3rd and 4th clauses)
o 1.223 kiyohara (TDK bluetooth PC Card)
o 1.222 kiyohara (Anycom BlueCard)
o 1.221 ichiro (NEC Infrontia AX420N)
o 1.219 jmcneill (EDIMAX EP-4101)
o 1.213 tsutsui (TEAC IDECARDII entry fix)
Also, while I'm here, fix some tab problems that have crept in.
Use config_intrhook API to create the dev.cpu.N.temperature sysctl node.
Our hook creates the sysctl node before root is mounted, but after cpu
is probed. It seems that k8temp can be loaded before the cpu module and,
in those cases, dev.cpu.0.temperature was not created.
Increase the kernel map's size to 7GB, making room for a kmem map of size
greater than 4GB. (Auto-sizing will set the ceiling on the kmem map size
to 4.2GB.)
Make sure we are clearing the ZBUF_FLAG_IMMUTABLE any time a free buffer
is reclaimed by the kernel. This fixes a bug that resulted in the kernel
overwriting packet data while user-space was still processing it when
zerocopy is enabled (or in a panic if INVARIANTS was enabled).
- the protosw entries are used directly
- the usrreq functions are library routines, generally wrapped by
consumers rather than being used directly
- the usrreq structure entries are likewise typically wrapped
Remove the rather incorrect #if 0'd pr_input_t prototype for raw_input.
Rename raw_append() to rip_append(): the raw_ prefix is generally used
for functions in the generic raw socket library (raw_cb.c, raw_usrreq.c),
and they are not used for IPv4 raw sockets.
Rename several functions in if_lmc with potential name collisions with
global symbols, such as raw_input and raw_output, to have lmc_ prefixes.
This doesn't affect actual functionality since the functions are static,
but will limit the opportunities for current confusion and future
difficulty.
marius [Sat, 5 Jul 2008 15:44:56 +0000 (15:44 +0000)]
- Merge macros depending on the flags being preserved between calls
into a single "__asm"-statement as GCC doesn't guarantee their
consecutive output even when using consecutive "__asm __volatile"-
statements for them. Remove the otherwise unnecessary "__volatile". [1]
- The inline assembler instructions used here alter the condition
codes so add them to the clobber list accordingly.
- The inline assembler instructions used here use output operands
before all input operands are consumed so add appropriate modifiers.
marius [Sat, 5 Jul 2008 15:28:30 +0000 (15:28 +0000)]
Revert the addition of "__volatile" to "__asm" done in r180011; since
the condition codes were added to the clobber lists in r180073, the
former is unnecessary.
There's no need to announce that we're mounting local filesystems when
running in quiet mode since if we fail to mount any of them the boot
process gets interrupted.
Introduce a new lock, hostname_mtx, and use it to synchronize access
to global hostname and domainname variables. Where necessary, copy
to or from a stack-local buffer before performing copyin() or
copyout(). A few uses, such as in cd9660 and daemon_saver, remain
under-synchronized and will require further updates.
Correct a bug in which a failed copyin() of domainname would leave
domainname potentially corrupted.
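
A userland sketch of the pattern described above, assuming a pthread
mutex in place of hostname_mtx and a hypothetical fake_copyin() in place
of the kernel's copyin(). The user data is first copied into a
stack-local buffer with no lock held; the global is only touched, under
the mutex, once that copy has succeeded.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN_SKETCH 256

static pthread_mutex_t name_mtx = PTHREAD_MUTEX_INITIALIZER;
static char global_name[NAME_LEN_SKETCH] = "old.example";

/* Hypothetical stand-in for copyin(): may fail after a partial write. */
static int
fake_copyin(const char *src, char *dst, size_t len, int fail)
{
	if (fail) {
		memset(dst, 'X', len - 1);	/* simulate partial garbage */
		dst[len - 1] = '\0';
		return (-1);
	}
	snprintf(dst, len, "%s", src);
	return (0);
}

/* Set the name: a failed copy only ever damages the local buffer. */
static int
set_name(const char *src, int fail)
{
	char tmp[NAME_LEN_SKETCH];

	if (fake_copyin(src, tmp, sizeof(tmp), fail) != 0)
		return (-1);
	pthread_mutex_lock(&name_mtx);
	memcpy(global_name, tmp, sizeof(global_name));
	pthread_mutex_unlock(&name_mtx);
	return (0);
}

/* Read the name: snapshot under the lock, copy out after dropping it. */
static void
get_name(char *out, size_t len)
{
	char tmp[NAME_LEN_SKETCH];

	pthread_mutex_lock(&name_mtx);
	memcpy(tmp, global_name, sizeof(tmp));
	pthread_mutex_unlock(&name_mtx);
	snprintf(out, len, "%s", tmp);	/* copyout() would go here */
}
```

Keeping the copy to or from user space outside the locked region also
avoids holding a mutex across an operation that may fault or sleep.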
Use malloc in write_archive to allocate a 64kB buffer for holding file data
instead of using 64kB of stack space in copy_file_data and write_file_data.
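
A small sketch of the same idea using stdio streams: the outer routine
allocates one 64kB heap buffer and passes it down, so the copy helper
needs no large stack array. The names write_all() and copy_stream() are
hypothetical and do not correspond to the functions named above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE_SKETCH (64 * 1024)

/* Copy everything from in to out using a caller-supplied buffer. */
static long
copy_stream(FILE *in, FILE *out, char *buf, size_t buflen)
{
	long total = 0;
	size_t n;

	assert(buf != NULL && buflen > 0);
	while ((n = fread(buf, 1, buflen, in)) > 0) {
		if (fwrite(buf, 1, n, out) != n)
			return (-1);
		total += (long)n;
	}
	return (ferror(in) ? -1 : total);
}

/* Allocate the buffer once on the heap and free it when done. */
static long
write_all(FILE *in, FILE *out)
{
	char *buf = malloc(BUF_SIZE_SKETCH);
	long copied;

	if (buf == NULL)
		return (-1);
	copied = copy_stream(in, out, buf, BUF_SIZE_SKETCH);
	free(buf);
	return (copied);
}
```

The return value is the number of bytes copied, or -1 on an allocation
or I/O error.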
In -pl mode, only hardlink regular files. I need to test
other implementations, but it's clear that dirs and symlinks,
at least, shouldn't be hardlinked.
Revert CVS revision 1.68; it is now possible for entry to be NULL at the end
of write_entry. (This was perfectly safe, since archive_entry_free(NULL) is
a no-op, but adding the check back makes the style more consistent.)
When ARCHIVE_EXTRACT_PERM is requested (e.g., by "tar -p"), always
schedule a chmod() fixup for directories. In particular, this fixes
sgid handling on systems where the sgid bit is inherited from the
parent directory (which means that the actual mode of the dir
does not match the mode used in the mkdir() system call).
It may be possible to tighten this condition a bit. In
working through this, I also found a few other places where
it looks like we can avoid a redundant syscall or two. I've
commented those here but not yet tried to address them.
Fix my previous commit. We actually should pass evaluation args in
AcpiEvaluateObject() calls; otherwise we are not able to bring devices
back up (NULL means 0, hence always off).
While there, add missing WLAN on/off support.
Remove the sbsh(4) driver. No one responded to requests for testing the
MPSAFE patches on current@ and stable@. This driver also has a fundamental
issue in that it sleeps when sending commands to the card including in the
if_init/if_start routines (which can be called from interrupt context). As
such, the driver shouldn't be working reliably even on 4.x.
Make sbsh(4) MPSAFE:
- Add a mutex to the softc and use it to protect the softc and device
hardware.
- Set up interrupt handler after ether_ifattach().
- Remove unused sbsh_watchdog() routine.
- Protect against concurrent attempts to load firmware.
Enqueue de-capsulated packet instead of performing direct dispatch. It's
possible to exhaust and garble the stack with a packet that contains a
couple of hundred nested encapsulation levels.
Submitted by: Ming Fu <fming@borderware.com>
Reviewed by: rwatson
PR: kern/85320
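
Why queueing bounds stack use can be seen in a toy model. In the sketch
below (all names hypothetical, not the kernel's netisr API) each handler
invocation strips one encapsulation layer and re-enqueues the packet
instead of calling itself, so the call depth stays at one no matter how
deeply the packet is nested.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN_SKETCH 8	/* power of two, so index wrap-around is safe */

struct pkt_sketch {
	int layers;	/* remaining encapsulation levels */
};

struct queue_sketch {
	struct pkt_sketch *slots[QLEN_SKETCH];
	size_t head, tail;
};

static int cur_depth, max_depth;	/* handler nesting, for illustration */

static int
q_put(struct queue_sketch *q, struct pkt_sketch *p)
{
	if (q->tail - q->head == QLEN_SKETCH)
		return (-1);	/* queue full */
	q->slots[q->tail++ % QLEN_SKETCH] = p;
	return (0);
}

static struct pkt_sketch *
q_get(struct queue_sketch *q)
{
	if (q->head == q->tail)
		return (NULL);
	return (q->slots[q->head++ % QLEN_SKETCH]);
}

/* Strip one layer, then defer the inner packet rather than recursing. */
static void
input_queued(struct queue_sketch *q, struct pkt_sketch *p)
{
	cur_depth++;
	if (cur_depth > max_depth)
		max_depth = cur_depth;
	if (p->layers > 0) {
		p->layers--;
		(void)q_put(q, p);
	}
	cur_depth--;
}

/* Run handlers until the queue is empty; returns the number of calls. */
static int
drain(struct queue_sketch *q)
{
	struct pkt_sketch *p;
	int handled = 0;

	while ((p = q_get(q)) != NULL) {
		input_queued(q, p);
		handled++;
	}
	return (handled);
}
```

A direct-dispatch version would instead call the handler from inside
itself once per layer, so its stack depth would grow with the nesting.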
Make sbni(4) MPSAFE:
- Add a mutex to the softc and use it to protect the softc and device
hardware.
- Set up interrupt handler after attaching device to network stack.
- Use device_set_desc() rather than device_quiet() plus a manual printf
that simulates the normal probe printf.
- Axe next_sbni_unit and instead just leave room for two sbni devices for
each bus attachment.
- Don't bzero the already-zero'd softc.
- Add a detach method to the PCI driver.
- Add a lock to protect the list of available devices used to chain
interrupt handlers for dual port ISA cards.
- Remove unused watchdog routine.
- If if_alloc() fails, make sbni_attach() return an error rather than
panic'ing.
- Consolidate code to free bus resources into sbni_release_resources().
- Clear IFF_DRV_RUNNING|OACTIVE in stop() routine instead of in callers.
- Let ether_ioctl() handle SIOCSIFMTU.
Remove the cnw(4) driver. No one responded to calls to test it on current@
and stable@. It also is a driver for an older non-802.11 wireless PC card
that is quite slow in comparison to, say, wi(4). I know Warner wants this
driver axed as well.
Make cnw(4) MPSAFE:
- Add a mutex to the softc and use it to lock the softc and device hardware.
- Use a private timer to replace if_watchdog/if_timer.
- Use if_printf() rather than if_xname.
- Set up interrupt handler after ether_ifattach().
Remove the oltr(4) driver. No one responded to calls for testing on
current@ and stable@ for the locking patches. The driver can always be
revived if someone tests it.
This driver also sleeps in its if_init routine, so it likely doesn't really
work at all anyway in modern releases.
Make oltr(4) MPSAFE:
- Add a mutex to the softc and use it to protect the softc and device
hardware.
- Set up interrupt handler after interface attach.
- Retire 'unit' from softc and use if_printf() instead.
- Don't frob IFF_UP in the driver.
- Use callout_*() rather than timeout() and untimeout().
Make arl(4) MPSAFE:
- Add a mutex to the softc and use it to protect the softc and device
hardware.
- Set up interrupt handler after ether_ifattach().
- Use a private timer instead of if_timer/if_watchdog.
- Retire arl_unit from the softc and use if_printf() and device_printf()
instead.
Note that the unpatched driver in 6.x and later does not work with the
hardware, so the one person who had volunteered to test the patch wasn't
able to test it.
Remove NETISR_MPSAFE, which allows specific netisr handlers to be directly
dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific
netisr handlers to always be dispatched via a queue (deferred). Mark the
usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly
acquire Giant in those handlers.
Previously, any netisr handler not marked NETISR_MPSAFE would necessarily
run deferred and with Giant acquired. This change removes Giant
scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows
non-MPSAFE handlers to continue to force deferred dispatch so as to avoid
lock order reversals between their acquisition of Giant and any calling
context.
It is likely we will be able to remove NETISR_FORCEQUEUE once
IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no
longer be supported.
Reviewed by: bz
MFC after: 1 month
X-MFC note: We can't remove NETISR_MPSAFE from stable/7 for KPI reasons,
but the rest can go back.
Use bcopy instead of strlcpy in uipc_bind and unp_connect, since
soun->sun_path isn't a null-terminated string. As UNIX(4) states, "the
terminating NUL is not part of the address." Since strlcpy has to return
"the total length of the string [it] tried to create," it walks off the end
of soun->sun_path looking for a \0.
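
A minimal sketch of a length-bounded copy for a buffer that carries no
terminating NUL. It uses memcpy() rather than bcopy() for portability,
and copy_path() is a hypothetical helper, not a function from the
commit. The point is that the source length comes from the caller and
the source is never scanned for a terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PATH_LEN_SKETCH 104	/* illustrative destination size */

/*
 * Copy exactly len bytes from src and terminate the destination.
 * src[len] is never read, so src need not be NUL-terminated.
 */
static int
copy_path(char *dst, size_t dstsize, const char *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len >= dstsize)
		return (-1);	/* no room for the bytes plus a NUL */
	memcpy(dst, src, len);
	dst[len] = '\0';
	return (0);
}
```

A strlcpy() call on the same source would have to find the end of the
string to compute its return value, which is the overread described
above.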
das [Thu, 3 Jul 2008 23:06:06 +0000 (23:06 +0000)]
Add regression tests for fmin{,f,l} and fmax{,f,l}.
I wrote these to test amd64 asm functions that used
maxss, maxsd, minss, and minsd, but it turns out that
those instructions don't handle NaNs and signed zero
in the same way as fmin() and fmax() are required to,
so we're stuck with the C versions for now.
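
The difference can be illustrated in portable C. The helper select_max()
below is a hypothetical model of a compare-and-select maximum: it
returns its second operand unless the first compares greater, which is
how I understand the SSE instructions to behave. It is a sketch of the
semantics, not the asm that was tested.

```c
#include <assert.h>
#include <math.h>

/*
 * Compare-and-select maximum: returns b unless a > b. Any comparison
 * involving a NaN is false, and +0.0 > -0.0 is false, so in both cases
 * the second operand is returned as-is.
 */
static double
select_max(double a, double b)
{
	return (a > b ? a : b);
}
```

By contrast, fmax() returns the non-NaN argument regardless of operand
order, so select_max(1.0, NAN) yields NaN where fmax(1.0, NAN) yields
1.0, and select_max(0.0, -0.0) yields -0.0.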
Added support in ldd(1) for the LD_32_xxx environment variables if
the architecture of the machine is >32 bits. If we ever go to 128
bit architectures this exercise will have to be repeated but thanks
to earlier commits today it will be relatively simple.
Extract the determination of the kind of (dynamic) executable from
the main-loop into a separate function.
Instead of using hardcoded environment variables, define them in a
lookup table.
For the rest, no functionality changes.
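
A sketch of what a table-driven lookup might look like. The layout here
is an assumption, not the actual ldd(1) code: the enum, the env_prefix
table and env_name() are hypothetical, and TRACE_LOADED_OBJECTS is used
only as an example suffix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical kinds of dynamic executable. */
enum exec_kind { KIND_NATIVE, KIND_32BIT, KIND_COUNT };

/* One prefix per kind, instead of hardcoded variable names. */
static const char *const env_prefix[KIND_COUNT] = {
	[KIND_NATIVE] = "LD_",
	[KIND_32BIT]  = "LD_32_",
};

/* Build "<prefix><suffix>" into out; returns -1 if it does not fit. */
static int
env_name(enum exec_kind kind, const char *suffix, char *out, size_t outlen)
{
	int n;

	if ((unsigned)kind >= KIND_COUNT)
		return (-1);
	n = snprintf(out, outlen, "%s%s", env_prefix[kind], suffix);
	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
}
```

Supporting another word size would then mean adding one enum value and
one table entry rather than touching every place a variable is set.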
Stop building bsdlabel(8) and fdisk(8) on ia64. Both tools are
obsoleted by gpart(8). This avoids the following bugs in fdisk:
- initializing a disk without MBR bogusly emits the error:
fdisk: invalid fdisk partition table found
- initializing a disk with or without MBR bogusly emits either:
fdisk: Class not found
or
fdisk: Geom not found: "XXX"
- the default geometry for non-ATA and non-SCSI disks is either
invalid or sub-optimal.
Add NO_MAN for the static variant of geom(8). Both the RESCUE and the
RELEASE_CRUNCH builds use NO_MAN anyway, so this change is primarily
so that developers do not have to set NO_MAN manually when they build
the static variant.