- Propagate the largest set of interface capabilities supported by all lagg
ports to the lagg interface.
- Use the MTU from the first interface as the lagg MTU, all extra interfaces
must be the same.
This fixes using a lagg interface for a vlan or enabling jumbo frames, etc.
Dynamically choose the quality of the ACPI timer depending on whether
the fast or safe/slow method is in use. Fast remains at 1000, slow is
now at 850 (always preferred to TSC). Since the HPET has proven slower
than ACPI-fast on some systems, drop its quality to 900. In the future,
it is hoped that HPET performance will improve as it is the main
timer Intel supports. HPET may move back to 2000 in -current once RELENG_7
is branched to ensure that it gets tested.
des [Mon, 30 Jul 2007 11:06:42 +0000 (11:06 +0000)]
Make tcpstates[] static, and make sure TCPSTATES is defined before
<netinet/tcp_fsm.h> is included into any compilation unit that needs
tcpstates[]. Also remove incorrect extern declarations and TCPDEBUG
conditionals. This allows kernels both with and without TCPDEBUG to
build, and unbreaks the tinderbox.
Mfi386 revision 1.239 of src/sys/i386/isa/clock.c. Seemingly some
pc98 motherboards do not provide us with the correct day of week
either. Ignore the day of week when setting the clock here too.
In pci_alloc_map(), restore the original value of the BAR for
the duration of the function. The device we would otherwise
have left in an useless state may just as well be the low-level
console. When booting verbose, we do need it addressable if we
want to avoid a MCA.
Print integer-typed arguments as integers. This makes sure that
on 64-bit platforms the result is more reliable. For example,
-1 was previously printed as 0xffffffff.
Fix handling of Quad-type arguments. Previously, syscalls
containing 64-bit arguments would have explicit padding.
On 64-bit platforms there was no padding, so the dummy
argument was not covering anything. On 32-bit platforms
with weak alignment (i.e. i386) the 64-bit argument did
not need to be aligned, so there too an aditional argument
was introduced. On 32-bit platforms with strong alignment
(i.e. PowerPC) the dummy argument in fact cover the padding.
By elimininating the dummy argument, 64-bit platforms now
have 1 argument less. This also applies to 32-bit platforms
with weak alignment. On PowerPC this doesn't matter, because
the padding is still there. We just don't "name" it.
Deal with those 3 cases.
Syscalls have at most 6 argument, not 5. See mmap(2) for example.
Previously the offset argument to mmap(2) would be bogus as we
weren't reading it in.
Fix acpidump(8) on ia64. Revision 1.13 introduced an uninitialized
variable bug that's hidden by the precense of the hint_acpi_0_rsdp
hint on 386 and amd64. There's never a need for such hint on ia64.
andre [Sat, 28 Jul 2007 12:20:39 +0000 (12:20 +0000)]
Provide a sysctl to toggle reporting of TCP debug logging:
sys.net.inet.tcp.log_debug = 1
It defaults to enabled for the moment and is to be turned off for
the next release like other diagnostics from development branches.
It is important to note that sysctl sys.net.inet.tcp.log_in_vain
uses the same logging function as log_debug. Enabling of the former
also causes the latter to engage, but not vice versa.
Use consistent terminology in tcp log messages:
"ignored" means a segment contains invalid flags/information and
is dropped without changing state or issuing a reply.
"rejected" means a segments contains invalid flags/information but
is causing a reply (usually RST) and may cause a state change.
andre [Sat, 28 Jul 2007 12:02:05 +0000 (12:02 +0000)]
o Move setting/resetting logic of syncache timer from macro
SYNCACHE_TIMEOUT to new function syncache_timeout().
o Fix inverted timeout callout engagement logic to actually
enable the timer for the bucket row. Before SYN|ACK was
not retransmitted.
o Simplify SYN|ACK retransmit timeout backoff calculation.
o Improve logging of retransmit and timeout events.
o Reset timeout when duplicate SYN arrives.
o Add comments.
o Rearrange SYN cookie statistics counting.
Bug found by: silby
Submitted by: silby (different version)
Approved by: re (rwatson)
andre [Sat, 28 Jul 2007 11:51:44 +0000 (11:51 +0000)]
o Move all detailed checks for RST in LISTEN state from tcp_input() to
syncache_rst().
o Fix tests for flag combinations of RST and SYN, ACK, FIN. Before
a RST for a connection in syncache did not properly free the entry.
o Add more detailed logging.
Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and remove
definition of NET_CALLOUT_MPSAFE, which is no longer required now that
debug.mpsafenet has been removed.
Update the sysctl_ctx_init(9) manual page with the following
information from the submitter:
Starting value for OID_AUTO was changed from 100 to 256 (0x100) in
kern/kern_sysctl.c#rev1.112 on 2001-07-25, and defined as
CTL_AUTO_START in sys/sysctl.h#rev1.98.
Submitted by: cnst
Silence from: #bsddocs on efnet
MFC After: 3 days
Approved by: re (bmah)
Bring in two bandaids to get the elf trampoline to work again, until I find
a proper solution.
- Add a dummy entry point which just calls the C entry points, and try to make
sure it's the first code in the binary.
- Copy a bit more than func_end to try to copy the whole load_kernel()
function. gcc4 puts code behind the func_end symbol.
First in a series of changes to remove the now-unused Giant compatibility
framework for non-MPSAFE network protocols:
- Remove debug_mpsafenet variable, sysctl, and tunable.
- Remove NET_NEEDS_GIANT() and associate SYSINITSs used by it to force
debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel.
- Remove logic to automatically flag interrupt handlers as non-MPSAFE if
debug.mpsafenet is set for an INTR_TYPE_NET handler.
- Remove logic to automatically flag netisr handlers as non-MPSAFE if
debug.mpsafenet is set.
- Remove references in a few subsystems, including NFS and Cronyx drivers,
which keyed off debug_mpsafenet to determine various aspects of their own
locking behavior.
- Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into
no-op's, as their entire behavior was determined by the value in
debug_mpsafenet.
- Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE.
Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still
present in subsystems, and will be removed in followup commits.
It seems that some i386 mothermoards either do not implement the
day of week field correctly, or they remember bad values that are
written into the day of week field. For this reason, ignore the day
of week field when reading the clock on i386 rather than bailing if
it is set incorrectly.
Problems were seen on a number of platforms, including VMWare, qemu,
EPIA ME6000, Epox-3PTA and ABIT-SL30T.
This is a slightly different fix to that proposed by Ted in his PR,
but the same basic idea.
Actually, upcalls cannot be freed while destroying the thread because we
should call uma_zfree() with various spinlock helds. Rearranging the
code would not help here because we cannot break atomicity respect
prcess spinlock, so the only one choice we have is to defer the operation.
In order to do this use a global queue synchronized through the kse_lock
spinlock which is freed at any thread_alloc() / thread_wait() through a
call to thread_reap().
Note that this approach is not ideal as we should want a per-process
list of zombie upcalls, but it follows initial guidelines of KSE authors.
Continue effort to improve parity between UDPv4 and UDPv6: add a missing
scope security check for the UDPv6 socket credential lookup service,
allowing security policies to bound access to credential information.
While not an immediate issue for Jail, which doesn't allow use of UDPv6,
this may be relevant to other security policies that may wish to control
ident lookups.
While here, eliminate a very unlikely panic case, in which a socket in
the process of being freed is inspected by the sysctl.
Fix up ndis interaction with net80211
- make NDIS_DEBUG a sysctl
- default to IEEE80211_MODE_11B if the card doesnt tell us the channels
- dont mess with ic_des_chan when we assosciate
- Allow a directed scan by setting the ESSID before scanning (verified
with wireshark). Hidden APs probably wouldnt have worked before.
- Grab the channel type and use it to look up the correct curchan for
the scan results (mistakenly used 11B before)
- Fix memory leak in the ndis_scan_results
Tested by: matteo
Reviewed by: sam
Approved by: re (rwatson)
When we do open, we should lock the vnode exclusively. This fixes few races:
- fifo race, where two threads assign v_fifoinfo,
- v_writecount modifications,
- v_object modifications,
- and probably more...
Discussed with: kib, ups
Approved by: re (rwatson)
If the trap number stored in the trapframe is corrupted into a negative
value, then we would use a negative index into the trap_msg[] array
resulting in a nested page fault. Make the 'type' variable holding the
trap number unsigned to avoid this.
Require 'cleanvar' so that files and sockets created in /var/run by
wpa_supplicant and other programs started by 'netif' don't get erased
by a subsequent 'cleanvar'.
Vendor import of 9.4.1-P1, which has fixes for the following:
1. The default access control lists (acls) are not being
correctly set. If not set anyone can make recursive queries
and/or query the cache contents.
See also:
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-2925
2. The DNS query id generation is vulnerable to cryptographic
analysis which provides a 1 in 8 chance of guessing the next
query id for 50% of the query ids. This can be used to perform
cache poisoning by an attacker.
This bug only affects outgoing queries, generated by BIND 9 to
answer questions as a resolver, or when it is looking up data
for internal uses, such as when sending NOTIFYs to slave name
servers.
All users are encouraged to upgrade.
See also:
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-2926
Fix absolutely maddening autorepeat bug that would cause the last key
to repeat if you had more than two keys down at any given time (which
happened to me all the time with emacs).
This is taken from PR 110681, although what URATAN Shigenobu describes
there is different than the pathology that I have been seeing. I'm
seeing this only in X, while he sees it on his console, yet I think
the two problems are related. I've also reworked the patch slightly
to conform to the coding standards of adjacent code.
It is unclear to me if this merely masks the maddening bug that I have
seen, or if this is a real fix. I typically see the problem when I'm
typing fast in emacs and using lots of motion keys (meta and control).
In either case, my workstation at work again is finally useful with
this patch.
PR: 110681
Submitted by: URATAN Shigenobu
Approved by: re (blanket)
ums(4) does not work if the mouse defaults to boot protocol. Force
the protocol to be report on each open, but ignore any errors as set
protocol for mice that don't implement the boot protocol can generate
an error. Evidentally, the Gyration GyroPoint RF Technology Receiver
(Gyration Ultra Cordless) device has this problem.
Submitted by: Eugene M. Kim
PR: 106565
Approved by: re (blanket)
- take out a needless panic under invariants for sctp_output.c
- Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than
SCTP_SMALL_IOVEC_SIZE
- re-add back inpcb_bind local address check bypass capability
- Fix it so sctp_opt_info is independant of assoc_id postion.
- Fix cookie life set to use MSEC_TO_TICKS() macro.
- asconf changes
o More comment changes/clarifications related to the old local address
"not" list which is now an explicit restricted list.
o Rename some functions for clarity:
- sctp_add/del_local_addr_assoc to xxx_local_addr_restricted()
- asconf related iterator functions to sctp_asconf_iterator_xxx()
o Fix bug when the same address is deleted and added (and removed from
the asconf queue) where the ifa is "freed" twice refcount wise,
possibly freeing it completely.
o Fix bug in output where the first ASCONF would not go out after the
last address is changed (e.g. only goes out when retransmitted).
o Fix bug where multiple ASCONFs can be bundled in the same packet with
the and with the same serial numbers.
o Fix asconf stcb iterator to not send ASCONF until after all work
queue entries have been processed.
o Change behavior so that when the last address is deleted (auto asconf
on a bound all endpoint) no action is taken until an address is
added; at that time, an ASCONF add+delete is sent (if the assoc
is still up).
o Fix local address counting so that address scoping is taken into
account.
o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending
of ASCONF (after an RTO). The default now is to send
ASCONF immediately (except for the case of changing/deleting the
last usable address).
Approved by: re(ken smith)@freebsd.org
Introduce Danny Braniss' iSCSI initiator, version 2.0.99. Please read the
included man pages on how to use it. This code is still somewhat experimental
but has been successfully tested on a number of targets. Many thanks to
Danny for contributing this.
simon [Tue, 24 Jul 2007 13:06:08 +0000 (13:06 +0000)]
Set timeout for all NIS RPC requests to 1 second and not just for
yp_next as revision 1.50 did. This should fix, or at least very much
reduce the risk of, NIS timing out due to UDP packet loss for NIS
functions.
See also revision 1.50 for more details about the general problem.
Add MSI support.
Ever since switching to adaptive polling re(4) occasionally spews
watchdog timeouts on systems with MSI capability. This change is
minimal one for supporting MSI and re(4) also needs MSIX support
for RTL8111C in future. Because softc structure of re(4) is shared
with rl(4), rl(4) was touched to use the modified softc.
Reported by: cnst
Tested by: cnst
Approved by: re (kensmith)
Don't fail on device attach if jumbo frame support was unsuccessful.
Because nfe(4) hardware doesn't support SG on Rx path, supporting
jumbo frame requires very large contiguous kernel memory(i.e. several
mega bytes). In case of lack of contiguous kernel memory that
allocation request may always fail. However nfe(4) can operate on normal
sized MTU frames, so go ahead and just disable jumbo frame support.
While I'm here add a new tunable "hw.nfe.jumbo_disable" to disable
jumbo frame support.
In nfe_poll, make sure to invoke correct Rx handler.
upcall_free() was only used in kse_GC() which has been removed so it now
results unused; this, with -Werror option of gcc, rise a warning for gcc
which let the buildkernel to be busted.
Fix this removing upcall_free().
Reported by: various
Approved by: jeff
Approved by: re
Pointy hat to: attilio
Actually, KSE kernel bits locking is broken and can lead likely to
dangerous races.
Fix this problems adding correct locking for the members of 'struct
kse_upcall' and other struct proc/struct thread related members.
For the moment, just leave ku_mflag and ku_flags "lazy" locked.
While here, cleanup the code removing the function kse_GC() (unused),
and merging upcall_link(), upcall_unlink(), upcall_stash() in their
respective callers (static functions, very short and only called in one
place).
Reported by: pav
Tested by: pav (on some pointyhat cluster nodes)
Approved by: jeff
Approved by: re
Sponsorized by: NGX Italy (http://www.ngx.it)
If clock_ct_to_ts fails to convert time time from the real time clock,
print a one line error message. Add some comments on not being able to
trust the day of week field (I'll act on these comments in a follow up
commit).
Continue effort to align UDPv4 and UDPv6 implementations by merging
udp6_output() from udp6_output.c to udp6_usrreq.c, matching the UDPv4
structure, and allowing us to remove udp6_output.c.
Make using msdosfs as the root file system sort of work:
o Initialize ownerships and permissions. They were garbage (0) for
root mounts since vfs_mountroot_try() doesn't ask for them to be set
and msdosfs's old incomplete code to set them was removed. The
garbage happened to give the correct ownerships root:wheel, but it
gave permissions 000 so init could not be execed. Use the macros
for root: wheel and 0755. (The removed code gave 0:0 and 0777. 0755
is more normal and secure, thought wrong for /tmp.)
o Check the readonly flag for initial (non-MNT_UPDATE) mounts in the
correct place, as in ffs. For root mounts, it is only passed in
mp->mnt_flags, since vfs_mountroot_try() only passes it as a flag
and nothing translates the flag to the "ro" option string. msdosfs
only looked for it in the string, so it gave a rw mount for root
mounts without even clearing the flag in mp->mnt_flags, so the final
state was inconsistent. Checking the flag only in mp->mnt_flags
works for initial userland mounts too. The MNT_UPDATE case is
messier.
The main point that should work but doesn't is fsck of msdosfs root
while it is mounted ro. This needs mainly MNT_RELOAD support to work.
It should be possible to run fsck -p and succeed provided the fs is
consistent, not just for msdosfs, but this fails because fsck -p always
tries to open the device rw. The hack that allows open for writing
in ffs is not implemented in msdosfs, since without MNT_RELOAD support
writing could only be harmful. So fsck must be turned off to use
msdosfs as root. This is quite dangerous, since msdosfs is still missing
actually using its fs-dirty flag internally, so it is happy to mount
dirty fileystems rw.
Unrelated changes:
- Fix missing error handling for MNT_UPDATE from rw to ro.
- Catch up with renaming msdos to msdosfs in a string.
Preprocessing stub "KSE" breaks ABI either with modules and userspace
consumers.
This patch makes KSE no more an optionally stub for kernel structures
fixing the breakage.
As a tail note, this bug has broken kqemu for a long period now.
Tested by: Ulf Lilleengen <lulf@FreeBSD.org>
Discussed with: rwatson, jeff
Approved by: jeff (mentor)
Approved by: re
ndis will signal the kthread to exit and then sleep on the proc pointer to
be woken up by kthread_exit. This is racey and in some cases the kthread will
exit before ndis gets around to sleep so it will be stuck indefinitely. This
change reuses the kq_exit variable to indicate that the thread has gone and
will loop on tsleep with a timeout waiting for it. If the kthread has already
exited then it will not sleep at all.
The HPET appears to be broken on silby's Acer Pentium M system, never
advancing. Read from the timer before attaching to be sure it advances
in 1 us. Since the slowest rate allowed by the spec is 10 MHz, the
timer is guaranteed to change in this interval if it is working.
Tested by: Rui Paulo
Approved by: re
MFC after: 3 days
Merge OpenBSM 1.0 alpha 15 changes to src/sys/bsm:
- Synchronized audit event list to Solaris, picking up the *at(2) system call
definitions, now required for FreeBSD and Linux. Added additional events
for *at(2) system calls not present in Solaris.
Obtained from: TrustedBSD Project
Approved by: re (hrs)
Vendor import TrustedBSD OpenBSM 1.0 alpha 15, with the following change
history since the last import:
OpenBSM 1.0 alpha 15
- Fix bug when processing in_addr_ex tokens.
- Restore the behavior of printing the string/text specified while
auditing arg32 tokens.
- Synchronized audit event list to Solaris, picking up the *at(2) system call
definitions, now required for FreeBSD and Linux. Added additional events
for *at(2) system calls not present in Solaris.
- Bugs in auditreduce(8) fixed allowing partial date strings to be used in
filtering events.