emaste [Fri, 11 May 2012 01:28:25 +0000 (01:28 +0000)]
MFC r234138:
Support percent-encoded user and password
RFC 1738 specifies that any ":", "@", or "/" within a user name or
password in a URL is percent-encoded, to avoid ambiguity with the use
of those characters as URL component separators.
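A minimal sketch of the decoding step, assuming the user or password component has already been split out of the URL. The helper names hex_value() and percent_decode() are hypothetical and are not taken from the committed code.

```c
#include <assert.h>
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_value(int c)
{
	if (c >= '0' && c <= '9')
		return (c - '0');
	if (c >= 'a' && c <= 'f')
		return (c - 'a' + 10);
	if (c >= 'A' && c <= 'F')
		return (c - 'A' + 10);
	return (-1);
}

/*
 * Decode %XX escapes from src into dst, writing at most dstsize bytes
 * including the terminating NUL.  Returns 0 on success, -1 on a
 * malformed escape or when dst is too small.
 */
int
percent_decode(const char *src, char *dst, size_t dstsize)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL);
	if (dstsize == 0)
		return (-1);
	while (*src != '\0') {
		int c = (unsigned char)*src;

		if (c == '%') {
			/* Stop at the first bad digit so we never read past the NUL. */
			int hi = hex_value((unsigned char)src[1]);
			int lo = (hi < 0) ? -1 : hex_value((unsigned char)src[2]);

			if (hi < 0 || lo < 0)
				return (-1);
			c = hi * 16 + lo;
			src += 3;
		} else {
			src++;
		}
		if (n + 1 >= dstsize)
			return (-1);
		dst[n++] = (char)c;
	}
	dst[n] = '\0';
	return (0);
}
```

With this helper, a password written as "p%40ss%3Aw%2Frd" in the URL decodes to "p@ss:w/rd".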
daichi [Thu, 10 May 2012 20:37:56 +0000 (20:37 +0000)]
MFC: 234867 and 234944
- fixed a vnode lock hang-up issue.
- fixed an incorrect lock status issue.
- fixed an incorrect lock issue of unionfs root vnode removed.
(pointed out by keith)
- fixed an infinite loop issue.
(pointed out by dumbbell)
- changed to do LK_RELEASE expressly when unlocked.
- fixed a unionfs_readdir math issue
Submitted by: ozawa@ongs.co.jp, Matthew Fleming <mfleming@isilon.com>
kib [Thu, 10 May 2012 11:08:09 +0000 (11:08 +0000)]
MFC r234981:
Move the code to call the callout callback into the helper function
softclock_call_cc(). While there, move some common code to callout_cc_del().
kib [Thu, 10 May 2012 10:56:46 +0000 (10:56 +0000)]
MFC r234952:
Mark the migrating callouts with CALLOUT_DFRMIGRATION flag. The flag is
cleared by callout_stop_safe() when the function detects a migration,
besides returning the success. The softclock() rechecks the flag for
migrating callout and cancels its execution if the flag was cleared
meantime.
kib [Wed, 9 May 2012 15:57:59 +0000 (15:57 +0000)]
MFC r211706:
On shared object unload, in __cxa_finalize, call and clear all installed
atexit and __cxa_atexit handlers that were either installed by the unloaded
dso or point to functions provided by the dso.
Use _rtld_addr_phdr to locate segment information from the address of
private variable belonging to the dso, supplied by crtstuff.c. Provide
utility function __elf_phdr_match_addr to do the match of address against
dso executable segment.
Call back into libthr from __cxa_finalize using the weak
__pthread_cxa_finalize symbol to remove any atfork handler whose
function points into the unloaded object.
The rtld needs private __pthread_cxa_finalize symbol to not require
resolution of the weak undefined symbol at initialization time. This
cannot work, since rtld is relocated before sym_zero is set up.
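The handler selection described above can be modelled roughly as follows. This is only a sketch: struct exit_handler, struct text_segment, finalize_dso() and the demo handlers are hypothetical stand-ins, not the libc data structures, and the real code obtains the segment information through _rtld_addr_phdr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one registered exit handler. */
struct exit_handler {
	void (*fn)(void);	/* NULL once called and cleared */
	const void *dso;	/* handle of the object that installed it */
};

/* Hypothetical model of a dso's executable segment. */
struct text_segment {
	uintptr_t start;
	size_t len;
};

static int
segment_contains(const struct text_segment *seg, uintptr_t addr)
{
	return (addr >= seg->start && addr - seg->start < seg->len);
}

/*
 * Call and clear every handler that was installed by the unloading dso
 * or whose function lies inside that dso's executable segment.
 * Returns the number of handlers run.
 */
size_t
finalize_dso(struct exit_handler *tab, size_t n, const void *dso,
    const struct text_segment *seg)
{
	size_t called = 0;

	assert(tab != NULL || n == 0);
	for (size_t i = n; i-- > 0; ) {		/* newest first */
		void (*fn)(void) = tab[i].fn;

		if (fn == NULL)
			continue;
		if (tab[i].dso != dso && !segment_contains(seg, (uintptr_t)fn))
			continue;
		tab[i].fn = NULL;	/* clear first so it runs only once */
		fn();
		called++;
	}
	return (called);
}

/* Demo handlers used to exercise the sketch. */
static int demo_calls;
static void demo_handler_a(void) { demo_calls += 1; }
static void demo_handler_b(void) { demo_calls += 10; }
static void demo_handler_c(void) { demo_calls += 100; }
```

A handler is selected either by ownership (the dso field) or by address (the segment test), which mirrors the two conditions named in the commit message.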
MFC r211894:
Do not call __pthread_cxa_finalize with invalid struct dl_phdr_info.
Requested and tested by: Peter Jeremy <peter rulingia com>
kib [Wed, 9 May 2012 15:16:38 +0000 (15:16 +0000)]
MFC r211705:
Introduce implementation-private rtld interface _rtld_addr_phdr, which
fills struct dl_phdr_info for the shared object that contains the
specified address, if any.
Requested and tested by: Peter Jeremy <peter rulingia com>
tuexen [Wed, 9 May 2012 15:11:47 +0000 (15:11 +0000)]
MFC r235064:
Honor SCTP_ENABLE_STREAM_RESET socket option when processing incoming
requests. Fix also the provided result in the response and use names
as specified in RFC 6525.
pho [Wed, 9 May 2012 10:05:02 +0000 (10:05 +0000)]
MFC: r234932
Added D_TRACKCLOSE to sndstat_cdevsw to fix the situation when
another process is in open() or stat() for the device node, then
close() from the owning process does not result in cdevsw close
method call. This fixes the permanent "Device busy" seen.
Changed the sndstat_lock from mutex to sx. This allows extending
the region covered by the lock to include the uiomove() call in
sndstat_read() and bufptr increment. This fixes the "panic:
sbuf_put_byte called with finished or corrupt sbuf" seen.
uqs [Tue, 8 May 2012 08:19:07 +0000 (08:19 +0000)]
Sync traceroute(8) with head.
Merges r215937,216184 and r211062,215880,220968:
- Remove unused traceroute(8) contrib code from head
- make WARNS=3 clean
- fix an operator precedence bug for TCP tracerouting
- Remove unneeded struct timezone passed to gettimeofday().
- Remove clause 3 and 4 from TNF licenses.
- Check return code of setuid() in traceroute.
marius [Mon, 7 May 2012 07:04:41 +0000 (07:04 +0000)]
MFC: r225203 (partial)
Attempt to make break-to-debugger and alternative break-to-debugger more
accessible:
(1) Always compile in support for breaking into the debugger if options
KDB is present in the kernel.
(2) Disable both by default, but allow them to be enabled via tunables
and sysctls debug.kdb.break_to_debugger and
debug.kdb.alt_break_to_debugger.
(3) options BREAK_TO_DEBUGGER and options ALT_BREAK_TO_DEBUGGER continue
to behave as before -- only now instead of compiling in
break-to-debugger support, they change the default values of the
above sysctls to enable those features by default. Current kernel
configurations should, therefore, continue to behave as expected.
(4) Migrate alternative break-to-debugger state machine logic out of
individual device drivers into centralised KDB code. This has a
number of upsides, but also one downside: it's now tricky to release
sio spin locks when entering the debugger, so we don't. However,
similar logic does not exist in other device drivers, including uart.
(5) dcons requires some special handling; unlike other console types, it
allows overriding KDB's own debugger selection, so we need a new
interface to KDB to allow that to work.
GENERIC kernels will now support break-to-debugger as long as appropriate
boot/run-time options are set, which should improve the debuggability of
kernels significantly.
MFC: r225214 (partial)
Follow up to r225203 refining break-to-debugger run-time configuration
improvements:
(1) Implement new model in previously missed at91 UART driver
(2) Move BREAK_TO_DEBUGGER and ALT_BREAK_TO_DEBUGGER from opt_comconsole.h
to opt_kdb.h (spotted by np)
(3) Garbage collect now-unused opt_comconsole.h
Hide kernel option ROUTETABLES evaluations in the implementation
rather than the header file. With this also move RT_MAXFIBS and
RT_NUMFIBS into the implementation to avoid further usage in other
code. rt_numfibs is all that should be needed.
This allows users to change the number of FIBs from 1..RT_MAXFIBS(16)
dynamically using the tunable without the need to change the kernel
config for the maximum anymore. This means that the multi-FIB
feature is now fully available with GENERIC kernels.
The kernel option ROUTETABLES can still be used to set the default
numbers of FIBs in absence of the tunable.
melifaro [Sat, 5 May 2012 11:34:27 +0000 (11:34 +0000)]
MFC r234572
Do not require radix write lock to be held while dumping route table
via sysctl(4) interface. This permits router not to stop forwarding
packets while route table is being written to user-supplied buffer.
glebius [Sat, 5 May 2012 10:05:13 +0000 (10:05 +0000)]
Merge 234342 from head:
When we receive an ICMP unreach need fragmentation datagram, we take
proposed MTU value from it and update the TCP host cache. Then
tcp_mss_update() is called on the corresponding tcpcb. It finds the
just allocated entry in the TCP host cache and updates MSS on the
tcpcb. And then we do a fast retransmit of what we have in the tcp
send buffer.
This sequence breaks if the TCP host cache is exhausted. In this
case allocation fails, and the later call to tcp_mss_update() finds nothing
in the cache. The fast retransmit is done with an unreduced MSS and is
immediately answered by the remote host with new ICMP datagrams, and the
cycle repeats. This ping-pong can go up to wirespeed.
To fix this:
- tcp_mss_update() gets new parameter - mtuoffer, that is like
offer, but needs to have min_protoh subtracted.
- tcp_mtudisc() as notification method renamed to tcp_mtudisc_notify().
- tcp_mtudisc() now accepts not a useless error argument, but proposed
MTU value, that is passed to tcp_mss_update() as mtuoffer.
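The arithmetic behind mtuoffer can be sketched as below. The function mss_after_mtu_offer() is a hypothetical illustration, not the kernel's tcp_mss_update(); min_protoh is taken here as the combined minimum IP and TCP header size, which is 40 bytes for IPv4 without options.

```c
#include <assert.h>

/* Assumed IPv4 case: 20-byte IP header plus 20-byte TCP header. */
#define MIN_PROTOH_IPV4	40

/*
 * Pick the MSS after a path-MTU notification.  mtuoffer is the MTU
 * proposed by the ICMP message, or -1 when none was supplied.
 */
int
mss_after_mtu_offer(int cur_mss, int mtuoffer, int min_protoh)
{
	int offer;

	assert(min_protoh >= 0);
	if (mtuoffer < 0)
		return (cur_mss);	/* nothing proposed, keep the MSS */
	offer = mtuoffer - min_protoh;	/* convert MTU to segment size */
	if (offer <= 0)
		return (cur_mss);	/* implausible offer, ignore it */
	return (offer < cur_mss ? offer : cur_mss);	/* only ever shrink */
}
```

Because the proposed MTU is passed in directly, the reduced MSS no longer depends on a host cache entry having been allocated.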
Reported by: az
Reported by: Andrey Zonov <andrey zonov.org>
Reviewed by: andre (previous version of patch)
hselasky [Fri, 4 May 2012 15:55:31 +0000 (15:55 +0000)]
MFC r234803 and r234961:
Add support for Multi-TT mode of modern USB HUBs.
This will give you more bandwidth for isochronous
FULL speed applications connected through a
High Speed HUB.
davidxu [Thu, 3 May 2012 03:05:18 +0000 (03:05 +0000)]
Merge 233103, 233912 from head:
233103:
Some software assumes a mutex can be destroyed right after it has owned
it; for example, it uses a serialization point like the following:
pthread_mutex_lock(&mutex);
pthread_mutex_unlock(&mutex);
pthread_mutex_destroy(&mutex);
It assumes a previous lock holder has already left the mutex and
is no longer referencing it, so it destroys it. To be maximally compatible
with such code, we use the IA64 version to unlock the mutex in the kernel
and remove the two-step unlocking code.
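For context, a complete program using this serialization point might look like the sketch below. The struct handoff, worker() and wait_then_destroy() names are hypothetical, and the atomic flag is only there so the waiter knows the worker already owns the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

struct handoff {
	pthread_mutex_t mutex;
	atomic_int worker_has_lock;	/* set once the worker owns the mutex */
	int result;			/* written by the worker under the lock */
};

static void *
worker(void *arg)
{
	struct handoff *h = arg;

	pthread_mutex_lock(&h->mutex);
	atomic_store(&h->worker_has_lock, 1);
	h->result = 42;			/* the work being waited for */
	pthread_mutex_unlock(&h->mutex);
	return (NULL);			/* the worker never touches h again */
}

int
wait_then_destroy(void)
{
	struct handoff h;
	pthread_t tid;
	int rc, result;

	rc = pthread_mutex_init(&h.mutex, NULL);
	assert(rc == 0);
	atomic_init(&h.worker_has_lock, 0);
	h.result = 0;
	rc = pthread_create(&tid, NULL, worker, &h);
	assert(rc == 0);

	/* Wait until the worker is the lock holder. */
	while (atomic_load(&h.worker_has_lock) == 0)
		sched_yield();

	/* The serialization point from the commit message. */
	pthread_mutex_lock(&h.mutex);
	pthread_mutex_unlock(&h.mutex);
	pthread_mutex_destroy(&h.mutex);

	result = h.result;
	pthread_join(tid, NULL);
	return (result);
}
```

The destroy call may run while the worker is still returning from pthread_mutex_unlock(), which is why the unlock path must not touch the mutex after releasing it.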
233912:
umtx operation UMTX_OP_MUTEX_WAKE has a side effect: it accesses
a mutex after a thread has unlocked it, and it even writes data to the mutex
memory to clear the contention bit. There is a race in which other threads
can lock it, unlock it, and then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 tries to fix the problem. It
requires the thread library to clear the lock word entirely, then
call the WAKE2 operation to check whether there is any waiter in the kernel
and try to wake up a thread; if necessary, the contention bit is set again
by the operation. This also mitigates the chance that other threads find
the contention bit and try to enter the kernel to compete with each other
to wake up a sleeping thread, which is unnecessary. With this change, the
mutex owner no longer holds the mutex until it reaches a point
where the kernel umtx queue is locked; it releases the mutex as soon as
possible.
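The userland half of that protocol can be sketched roughly as follows. Everything here is a model: struct sketch_mutex, the LOCK_CONTESTED bit layout, and kernel_wake2() are assumptions standing in for the real lock word and the UMTX_OP_MUTEX_WAKE2 system call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_UNOWNED	0u
#define LOCK_CONTESTED	0x80000000u	/* assumed layout: top bit = waiters */

struct sketch_mutex {
	_Atomic uint32_t word;	/* owner id, possibly OR'ed with LOCK_CONTESTED */
};

static int wake2_calls;		/* counts calls into the stand-in "kernel" */

/* Hypothetical stand-in for the UMTX_OP_MUTEX_WAKE2 operation. */
static void
kernel_wake2(struct sketch_mutex *m)
{
	(void)m;
	wake2_calls++;
}

void
sketch_unlock(struct sketch_mutex *m)
{
	uint32_t old;

	assert(m != NULL);
	/* Step 1: release the lock by clearing the whole word at once. */
	old = atomic_exchange(&m->word, LOCK_UNOWNED);

	/*
	 * Step 2: only when waiters were recorded, ask the kernel to wake
	 * one.  The library itself does not write the word again.
	 */
	if (old & LOCK_CONTESTED)
		kernel_wake2(m);
}
```

In the uncontended case the unlock is a single atomic exchange and the kernel is never entered.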
Performance is improved when the mutex is heavily contended. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds; it is even better than UMTX_OP_MUTEX_WAKE, which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
Special code for stable/8:
And add code to detect if the UMTX_OP_MUTEX_WAKE2 is available.
mav [Wed, 2 May 2012 07:22:58 +0000 (07:22 +0000)]
MFC r234415:
Some improvements to GEOM MULTIPATH:
- Implement "configure" command to allow switching the operation mode of a
running device on the fly, without destroying and recreating it.
- Implement Active/Read mode as hybrid of Active/Active and Active/Passive.
In this mode all paths not marked FAIL may handle reads at the same time,
but unlike Active/Active only one path handles write requests at any
point in time. This allows following the original write request order more
closely if the layers above need it for data consistency (not waiting for
requisite write completion before sending a dependent write).
- Hide duplicate messages about device status change.
- Remove periodic thread wake up with 10Hz rate.
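The Active/Read dispatch rule can be illustrated with the sketch below. It is not the GEOM code: struct mp_path and pick_path() are hypothetical, and spreading reads round-robin is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

struct mp_path {
	int failed;	/* nonzero when the path is marked FAIL */
};

/*
 * Pick a path index for one request in an Active/Read-style mode.
 * Writes always use the single active path.  Reads rotate over every
 * path not marked FAIL; *next_read remembers where the rotation is.
 * Returns -1 when no usable path exists.
 */
int
pick_path(const struct mp_path *paths, size_t n, size_t active,
    int is_write, size_t *next_read)
{
	assert(next_read != NULL);
	if (n == 0)
		return (-1);
	if (is_write)
		return (active < n && !paths[active].failed ? (int)active : -1);
	for (size_t tried = 0; tried < n; tried++) {
		size_t i = (*next_read + tried) % n;

		if (!paths[i].failed) {
			*next_read = (i + 1) % n;
			return ((int)i);
		}
	}
	return (-1);
}
```

Keeping all writes on one path is what preserves their submission order, while reads are free to use any healthy path.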
mav [Wed, 2 May 2012 07:05:20 +0000 (07:05 +0000)]
MFC r222643:
When possible, join ranges of subsequent BIO_DELETE requests to handle more
(up to 2048 instead of 256 or even 64) of them with a single TRIM request.
OCZ Vertex2/Vertex3 SSDs can handle no more than 64 ranges per TRIM request.
Due to the lack of BIO_DELETE clustering now, it means that we could delete no
more than 2MB per request (on FS with 32K block) with limited request rate.
This change increases delete rate on Vertex2 from 250MB/s to 950MB/s.
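The joining step can be sketched as follows. This is an illustration under simplifying assumptions, not the driver code: struct range and join_ranges() are hypothetical, only requests that start exactly where the previous range ends are merged, and per-range size limits are ignored. The 2MB figure above follows from 64 ranges of 32K each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;		/* first block */
	uint64_t count;		/* number of blocks */
};

/*
 * Merge each request into the previous output range when it starts
 * exactly where that range ends; otherwise open a new range.  Stops
 * when max_out ranges are in use and the next request cannot be merged.
 * Returns the number of output ranges and stores the number of input
 * requests consumed in *consumed.
 */
size_t
join_ranges(const struct range *in, size_t n, struct range *out,
    size_t max_out, size_t *consumed)
{
	size_t used = 0, i;

	assert(consumed != NULL);
	for (i = 0; i < n; i++) {
		if (used > 0 &&
		    out[used - 1].start + out[used - 1].count == in[i].start) {
			out[used - 1].count += in[i].count;
			continue;
		}
		if (used == max_out)
			break;
		out[used++] = in[i];
	}
	*consumed = i;
	return (used);
}
```

With a limit of 64 ranges per TRIM, merging lets one command cover far more than 64 delete requests whenever the requests are contiguous.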
MFC r222643:
Increase maximum supported number of ranges per TRIM command from 256 to 512
to use full potential of Intel X25-M SSDs. On synthetic test with 32K ranges
it gives about 20% speedup, which probably costs more than 2K of RAM.
mav [Wed, 2 May 2012 06:52:00 +0000 (06:52 +0000)]
Merge ATA_CAM compatibility shims. While 8-STABLE doesn't have ATA_CAM
enabled by default, this should make migration easier for users enabling
it manually.
r221071:
Add shim to simplify migration to the CAM-based ATA. For each new adaX
device in /dev/ create symbolic link with adY name, trying to mimic old ATA
numbering. Imitation is not complete, but should be enough in most cases to
mount file systems without touching /etc/fstab.
r221384:
Do not report legacy unit numbers (do not create legacy aliases) for disks
on port multiplier ports above first two. They don't fit into ATA_STATIC_ID
scheme and so can't be mapped properly. No need to pollute dev.
delphij [Wed, 2 May 2012 00:31:09 +0000 (00:31 +0000)]
MFC r233770:
Eliminate two cases of unwanted strncpy(). The name is not required
by the current code, and the results would get overwritten anyway
by subsequent memset().
kib [Tue, 1 May 2012 11:45:16 +0000 (11:45 +0000)]
MFC r234657:
Take the spinlock around clearing of the fp->_flags in fclose(3), which
indicates the availability of FILE, to prevent possible reordering of
the writes as seen by other CPUs.
MFC r233507:
Use program exit status as pam_exec return code (optional)
pam_exec(8) now accepts a new option "return_prog_exit_status". When
set, the program exit status is used as the pam_exec return code. It
allows the program to tell why the step failed (eg. user unknown).
However, if it exits with a code not allowed by the calling PAM service
module function (see $PAM_SM_FUNC below), a warning is logged and
PAM_SERVICE_ERR is returned.
The following changes are related to this new feature but they apply no
matter if the "return_prog_exit_status" option is set or not.
The environment passed to the program is extended:
o $PAM_SM_FUNC contains the name of the PAM service module function
(eg. pam_sm_authenticate).
o All valid PAM return codes' numerical values are available
through variables named after the return code name. For instance,
$PAM_SUCCESS, $PAM_USER_UNKNOWN or $PAM_PERM_DENIED.
pam_exec return code better reflects what went on:
o If the program exits with !0, the return code is now
PAM_PERM_DENIED, not PAM_SYSTEM_ERR.
o If the program fails because of a signal (WIFSIGNALED) or doesn't
terminate normally (!WIFEXITED), the return code is now
PAM_SERVICE_ERR, not PAM_SYSTEM_ERR.
o If a syscall in pam_exec fails, the return code remains
PAM_SYSTEM_ERR.
waitpid(2) is called in a loop. If it returns because of EINTR, do it
again. Before, it would return PAM_SYSTEM_ERR without waiting for the
child to exit.
Several log messages now include the PAM service module function name.
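The waiting and classification logic described above might be sketched like this. The enum values are stand-ins for the PAM return codes named in the message (the real constants come from the PAM headers), run_and_classify() is a hypothetical helper, and the sketch models the default behaviour without "return_prog_exit_status".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-ins for the PAM return codes named in the commit message. */
enum sketch_rc {
	RC_SUCCESS,
	RC_PERM_DENIED,
	RC_SERVICE_ERR,
	RC_SYSTEM_ERR
};

/* Run "sh -c cmd" and map how it ended onto a return code. */
enum sketch_rc
run_and_classify(const char *cmd)
{
	int status;
	pid_t pid, w;

	assert(cmd != NULL);
	pid = fork();
	if (pid == -1)
		return (RC_SYSTEM_ERR);		/* syscall failure */
	if (pid == 0) {
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);			/* exec failed */
	}
	/* Retry when a signal interrupts the wait instead of giving up. */
	do {
		w = waitpid(pid, &status, 0);
	} while (w == -1 && errno == EINTR);
	if (w == -1)
		return (RC_SYSTEM_ERR);
	if (WIFSIGNALED(status) || !WIFEXITED(status))
		return (RC_SERVICE_ERR);
	return (WEXITSTATUS(status) == 0 ? RC_SUCCESS : RC_PERM_DENIED);
}
```

The do/while loop is the part that corresponds to calling waitpid(2) again after EINTR, so the child is always reaped before a result is reported.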
MFC r234459:
Fix a bug where we copy out more data from a mbuf chain than is
actually in it. This happens when SCTP receives an unknown chunk, which
requires the sending of an ERROR chunk, and there is no final padding but
the chunk is not 4-byte aligned.
Reported by yueting via rwatson@
MFC r234556:
When MAP_STACK mapping is created, the map entry is created only to
cover the initial stack size. For MCL_WIREFUTURE maps, the subsequent
call to vm_map_wire() to wire the whole stack region fails due to
VM_MAP_WIRE_NOHOLES flag.
Use the VM_MAP_WIRE_HOLESOK to only wire mapped part of the stack.
Export the udp_cksum sysctl for upcoming SCTP work. Rather than always
doing so, SCTP will only do IPv4 UDP checksum calculation as defined by the
host policy. When tunneling, SCTP always calculates the inner checksum
already, so skipping the outer UDP checksum can save cycles.
MFC r234038
If a page belonging to a reservation is cached, then mark the reservation so
that it will be freed to the cache pool rather than the default pool.
Otherwise, the cached pages within the reservation may be recycled sooner
than necessary.
MFC r234692:
Read backup GPT header from the last LBA only when primary GPT header and
table aren't valid. If they are ok, use hdr_lba_alt value to read backup
header. This will make gptboot happy when GPT is used on top of some GEOM
provider, e.g. GEOM_MIRROR.
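The selection rule can be sketched as follows. struct sketch_gpt_hdr and backup_hdr_lba() are hypothetical and carry only the one field the rule needs; the real GPT header has many more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the field this sketch needs. */
struct sketch_gpt_hdr {
	uint64_t hdr_lba_alt;	/* LBA of the backup header */
};

/*
 * Choose where to read the backup GPT header from.  primary_ok says
 * whether the primary header and table passed validation.
 */
uint64_t
backup_hdr_lba(const struct sketch_gpt_hdr *primary, int primary_ok,
    uint64_t last_lba)
{
	assert(!primary_ok || primary != NULL);
	if (primary_ok)
		return (primary->hdr_lba_alt);	/* trust the recorded location */
	return (last_lba);			/* fall back to the last LBA */
}
```

When the GPT sits on a provider that is smaller than the underlying disk, the recorded hdr_lba_alt differs from the disk's last LBA, which is the case this rule handles.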
Free the mbuf when the protocol is unknown in ng_ipfw_rcvdata().
This change fixes a (theoretically) possible mbuf leak introduced in
r225586. Reorder code a bit and change return codes to be more specific.
- Add ipfw eXtended tables permitting radix to be used for any kind of keys.
- Add support for IPv6 and interface extended tables
- Allow the number of tables to be changed at runtime in the range 0..65534.
- Use IP_FW3 opcode for all new extended table cmds
No ABI changes are introduced. Old userland will see valid tables for
IPv4 tables and no entries otherwise. Flush works for any table.
IP_FW3 socket option is used to encapsulate all new opcodes:
/* IP_FW3 header/opcodes */
typedef struct _ip_fw3_opheader {
	uint16_t opcode;	/* Operation opcode */
	uint16_t reserved[3];	/* Align to 64-bit boundary */
} ip_fw3_opheader;
New opcodes added:
IP_FW_TABLE_XADD, IP_FW_TABLE_XDEL, IP_FW_TABLE_XGETSIZE, IP_FW_TABLE_XLIST
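As a rough illustration of how such a header can prefix a request, the sketch below lays out the header followed by an opaque payload. build_fw3_request() is hypothetical, the payload format is not modelled, and placing the opcode-specific data directly after the header is an assumption rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct _ip_fw3_opheader {
	uint16_t opcode;	/* Operation opcode */
	uint16_t reserved[3];	/* Align to 64-bit boundary */
} ip_fw3_opheader;

/* Four 16-bit fields, so the payload would start at byte 8. */
_Static_assert(sizeof(ip_fw3_opheader) == 8, "header must be 8 bytes");

/*
 * Lay out header plus payload in buf as one request.  Returns the
 * total length, or 0 if buf is too small.
 */
size_t
build_fw3_request(void *buf, size_t bufsize, uint16_t opcode,
    const void *payload, size_t payload_len)
{
	ip_fw3_opheader hdr;

	assert(buf != NULL);
	if (bufsize < sizeof(hdr) || payload_len > bufsize - sizeof(hdr))
		return (0);
	memset(&hdr, 0, sizeof(hdr));	/* reserved fields stay zero */
	hdr.opcode = opcode;
	memcpy(buf, &hdr, sizeof(hdr));
	if (payload_len > 0)
		memcpy((char *)buf + sizeof(hdr), payload, payload_len);
	return (sizeof(hdr) + payload_len);
}
```

The reserved array pads the header to 8 bytes, so whatever follows it is 64-bit aligned when the buffer itself is.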
ipfw(8) table argument parsing behavior is changed:
'ipfw table 999 add some-unqualified-host' now assumes
'some-unqualified-host' to be interface name instead of hostname.
New tunable:
net.inet.ip.fw.tables_max controls number of tables supported by ipfw in a given
VNET instance. 128 is still the default value.
Sysctl change:
net.inet.ip.fw.tables_max is now read-write.
New syntax:
ipfw add skipto tablearg ip from any to any via table(42) in
ipfw add skipto tablearg ip from any to any via table(4242) out
This is a bit hackish: the special interface name '\1' is used to signal that
the interface table number is passed in the p.glob field.
MFC r234121:
Back out r228476.
r228476 fixed superfluous link UP/DOWN messages but broke IPMI
access during boot. It's not clear why r228476 breaks IPMI and
should be revisited.
Reported by: Paul Guyot <paulguyot <> ieee dot org >