yongari [Tue, 2 Nov 2010 23:35:08 +0000 (23:35 +0000)]
MFC r213485,213710,213812:
r213485:
Overhaul MII register access routine and remove unnecessary
BGE_MI_MODE register accesses. Previously bge(4) used to read
BGE_MI_MODE register to detect whether it needs to disable
autopolling feature or not. Because we don't touch autopolling in
other part of driver there is no reason to read BGE_MI_MODE
register given that we know default value in advance. In order to
achieve the goal, check whether the controller has CPMU(Central
Power Management Unit) capability. If controller has CPMU feature,
use 500KHz MII management interface(mdio/mdc) frequency regardless
core clock frequency. Otherwise use default MII clock. While I'm
here, add CPMU register definition.
In bge_miibus_readreg(), rearrange code a bit and remove goto
statement. In bge_miibus_writereg(), make sure to restore
autopolling even if MII write failed. The delay time inserted after
accessing BGE_MI_MODE register increased from 40us to 80us.
The default PHY address is now stored in softc. All PHYs supported
by bge(4) currently use PHY address 1 but it will be changed when
we add newer controllers. This change will make it easier to change
default PHY address depending on PHY models.
Submitted by: davidch
r213710:
Remove one last reference of BGE_MI_MODE register for auto polling.
Previously bge(4) always enabled auto polling for non-BGE_FLAG_TBI
controllers. With this change, auto polling is not used anymore so
polling through mii(4) was introduced.
Reviewed by: davidch
r213812:
Fix a regression introduced in r213710. r213710 removed the use of
auto polling such that it made all controllers obtain link status
information from the state of the LNKRDY input signal. Broadcom
recommends disabling auto polling such that driver should rely on
PHY interrupts for link status change indications. Unfortunately it
seems some controllers(BCM5703, BCM5704 and BCM5705) have PHY
related issues so Linux took other approach to workaround it.
bge(4) didn't follow that and it used to enable auto polling to
workaround it. Restore this old behavior for BCM5700 family
controllers and BCM5705 to use auto polling. For BCM5700 and
BCM5701, it seems it does not need to enable auto polling but I
restored it for safety.
Special thanks to marius who tried lots of patches with patience.
yongari [Tue, 2 Nov 2010 23:23:48 +0000 (23:23 +0000)]
MFC r213411,213464-213465,213468:
r213411:
Enable fix for read DMA FIFO overruns on controllers that have this
fix. Note, we still need workaround for controllers that lack this
fix and it needs more work in RX BD updating.
Submitted by: davidch
r213464:
Separate common flags into controller specific and PHY related
flags. There should be no functional changes. This change will make
it easy to add more quirk/flags in future.
Reviewed by: davidch
r213465:
Rearrange code a bit to correctly set PHY flags. This change makes
it easier to add newer ASICs.
Obtained from: OpenBSD
r213468:
Fix bge(4) build breakage when BGE_REGISTER_DEBUG is defined.
yongari [Tue, 2 Nov 2010 23:04:23 +0000 (23:04 +0000)]
MFC r213316,213333-213334:
r213316:
Fix IFCAP_TXCSUM/IFCAP_RXCSUM handling. Previously bge(4) used
IFCAP_HWCSUM to know which capability should be changed such that
disabling RX checksum offloading resulted in disabling TX checksum
offloading.
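A minimal sketch of the per-capability toggle, not the committed code.
The CAP_* bits and update_csum_caps() are hypothetical stand-ins for the
IFCAP_* handling; the point is only that RX and TX are tested separately
instead of through one combined mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits; the real IFCAP_* values live in <net/if.h>. */
#define CAP_RXCSUM	0x1u
#define CAP_TXCSUM	0x2u

/*
 * Flip only the capabilities that differ between the enabled and the
 * requested set.  RX and TX are checked one at a time, so changing one
 * does not drag the other along.
 */
uint32_t
update_csum_caps(uint32_t enabled, uint32_t requested, uint32_t supported)
{
	uint32_t mask = enabled ^ requested;	/* bits the caller wants flipped */

	if ((mask & CAP_TXCSUM) != 0 && (supported & CAP_TXCSUM) != 0)
		enabled ^= CAP_TXCSUM;
	if ((mask & CAP_RXCSUM) != 0 && (supported & CAP_RXCSUM) != 0)
		enabled ^= CAP_RXCSUM;
	return (enabled);
}
```

With both offloads enabled, requesting only CAP_TXCSUM clears the RX bit
and leaves TX checksum offloading on.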
r213333:
Allow write DMA to request larger DMA burst size to get better
performance on BCM5785.
yongari [Tue, 2 Nov 2010 22:57:20 +0000 (22:57 +0000)]
MFC r213283,213410:
r213283:
Implement hardware MAC statistics for BCM5705 or newer Broadcom
controllers. bge(4) exported MAC statistics on controllers that
maintain the statistics in the NIC's internal memory. Newer
controllers require register access to fetch these values. These
counters provide useful information to diagnose driver issues.
r213410:
Consistently use ifHCOutOctets/ifHCInOctets instead of Octets as
these names are used in data sheet. Also use UnicastPkts,
MulticastPkts and BroadcastPkts instead of UcastPkts, McastPkts
and BcastPkts to clarify its meaning.
Move all NV defines into nv.c, they are not used externally thus there is
no need to make them visible from outside.
r214283:
Implement nv_exists() function that returns true if argument of the given
name exists.
r214284:
Before this change on first connect between primary and secondary we
initialize all the data. This is huge waste of time and resources if
there were no writes yet, as there is no real data to synchronize.
Optimize this by sending "virgin" argument to secondary, which gives it a hint
that synchronization is not needed.
In the common case (where both nodes are configured at the same time) instead
of synchronizing everything, we don't synchronize at all.
r214692:
Send packets to remote node only via the send thread to avoid possible
races - in this case a keepalive packet was sent from the wrong thread, which
led to the connection being dropped because of a corrupted packet.
Fix it by sending keepalive packets directly from the send thread.
As a bonus we now send keepalive packets only when connection is idle.
yongari [Tue, 2 Nov 2010 22:44:51 +0000 (22:44 +0000)]
MFC r213081,213225,213280:
r213081:
Always show asic/chip revision in device attach phase. There are
too many bge(4) controllers there and model name does not
necessarily match asic/chip revision. Relying on VPD string made
it hard to identify exact asic/chip revision so the first step to
debug bge(4) was getting exact asic/chip information with verbose
boot which may not be available on production server.
r213255:
Set the number of RX frames to receive after RX MBUF low watermark
has reached. This reduced number of dropped frames when
flow-control is enabled. Previously it dropped incoming frames once
RX MBUF low watermark has reached. The value used in MAC RX MBUF
low watermark is greater than or equal to 4 so receiving two more
RX frames should not be a problem.
Obtained from: OpenBSD
r213280:
After r207391, brgphy(4) passes resolved flow-control settings to
parent driver. Use that information to configure flow-control.
One drawback is there is no way to disable flow-control as we still
don't have proper way to not advertise RX/TX pause capability to
link partner. But I don't think it would cause severe problems and
users can selectively disable flow-control in switch port.
pjd [Tue, 2 Nov 2010 22:30:19 +0000 (22:30 +0000)]
MFC r211854:
- When VFS_VGET() is not supported, switch to VOP_LOOKUP().
- We are fine by only share-locking the vnode.
- Remove assertion that doesn't hold for ZFS where we cross mount points
boundaries by going into .zfs/snapshot/<name>/.
marius [Tue, 2 Nov 2010 22:12:06 +0000 (22:12 +0000)]
MFC: r214526
Partially revert r203829 (MFC'ed to stable/7 in r205920); as it turns out
what the PowerPC OFW loader did was incorrect as further down the road
cons_probe() calls malloc() so the former can't be called before init_heap()
has succeeded. Instead just exit to the firmware in case init_heap() fails
like OF_init() does when hitting a problem as we're then likely running in
a very broken environment where hardly anything can be trusted to work.
marius [Tue, 2 Nov 2010 20:06:46 +0000 (20:06 +0000)]
MFC: r213878
Add a NetBSD-compatible mii_attach(), which is intended to eventually
replace mii_phy_probe() altogether. Compared to the latter the advantages
of mii_attach() are:
- intended to be called multiple times in order to attach PHYs in multiple
passes (f.e. in order to only use sub-ranges of the 0 to MII_NPHY - 1
range)
- being able to pass along the capability mask from the NIC to the PHY
drivers
- being able to specify at which address (phyloc) to probe for a PHY
(instead of always probing at all addresses from 0 to MII_NPHY - 1)
- being able to specify which PHY instance (offloc) to attach
- being able to pass along MIIF_* flags from the NIC to the PHY drivers
(f.e. as required to indicate to the PHY drivers that flow control is
supported by the NIC driver, which actually is the motivation for this
change).
While at it, I used the opportunity to get rid of some hacks in mii(4)
like miibus_probe() generally doing work besides sheer probing and the
"EVIL HACK" (which will vanish entirely along with mii_phy_probe()) by
passing the struct ifnet pointer via an argument of mii_attach() as well
as to fix some resource leaks in mii(4) in case something fails.
Commits which will update the PHY drivers to honor the MII flags passed
down from the NIC drivers and take advantage of mii_attach() to get rid
of certain types of hacks in NIC and PHY drivers as well as a conversion
of the remaining uses of mii_phy_probe() will follow shortly.
mav [Tue, 2 Nov 2010 09:26:12 +0000 (09:26 +0000)]
MFC r214016:
Set of legacy mode SATA enhancements:
- Implement proper combined mode decoding for Intel controllers to properly
identify SATA and PATA channels and associate ATA channels with SATA ports.
This fixes wrong reporting and in some cases hard resets to wrong SATA ports.
- Improve SATA registers support to handle hot-plug events and potentially
interface errors. For ICH5/6300ESB chipsets these registers accessible via
PCI config space. For later ones they may be accessible via PCI BAR(5).
- For controllers not generating interrupts on hot-plug events, implement
periodic status polling. Use it to detect hot-plug on Intel and VIA
controllers. Same probably could also be used for Serverworks and SIS.
mav [Tue, 2 Nov 2010 09:15:27 +0000 (09:15 +0000)]
MFC r213301:
Revert r132291.
Restore setting PIO/WDMA timings for VIA UDMA133 controllers.
Linux disables only AST register writing there, but not all timings.
mav [Tue, 2 Nov 2010 09:05:40 +0000 (09:05 +0000)]
MFC r214102:
Workaround strange situation when EDMA_RESQIP register returns zero instead
of proper value. It caused bunch of "EMPTY CRPB" messages and potentially
may cause premature requests completion, which could cause data corruption.
For most cases it seems enough to just reread register to get proper value.
To protect against worse cases - erase processed queue entries with
impossible values and ignore them if problem still happens.
bschmidt [Mon, 1 Nov 2010 19:05:38 +0000 (19:05 +0000)]
MFC r214160,214162,214236
r214236 & r214160:
The firmware does pad notifications to an even number of bytes (at least
the association notification), the included information though always
contains an elem block with an odd number of bytes. We handle the last
byte as if it might contain a whole elem block, this of course is not
true as one byte is not enough to hold a block, we therefore discard the
complete frame. The solution here is to subtract one from the actual
notification length, this is also what the Linux driver does. With this
change the frame ends exactly where the last elem block ends.
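A sketch of why the pad byte breaks parsing, assuming the usual elem
layout of id, length and payload (the layout and count_elems() are
assumptions made for illustration, not the driver code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a buffer of elem blocks, each laid out as: id (1 byte),
 * length (1 byte), payload (length bytes).  Returns the number of
 * blocks, or -1 if the buffer ends in the middle of a block.
 */
int
count_elems(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	int n = 0;

	assert(buf != NULL);
	while (off < len) {
		if (len - off < 2)			/* no room for id + length */
			return (-1);
		if (len - off - 2 < buf[off + 1])	/* payload truncated */
			return (-1);
		off += 2 + (size_t)buf[off + 1];
		n++;
	}
	return (n);
}
```

For a 3-byte block padded to 4 bytes, count_elems(buf, 4) fails on the
trailing pad byte while count_elems(buf, 4 - 1) ends exactly at the
last block.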
r214262:
The firmware always sets bit 14 and 15, to get the real associd we need
to clear those bits.
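In C the masking amounts to the following; real_associd() and the macro
name are hypothetical, only the bit positions come from the log above.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 14 and 15, which the firmware always sets. */
#define ASSOCID_FW_BITS	0xc000u

/* Hypothetical helper: recover the association ID from the raw value. */
uint16_t
real_associd(uint16_t raw)
{
	return (raw & (uint16_t)~ASSOCID_FW_BITS);
}
```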
trasz [Mon, 1 Nov 2010 15:36:47 +0000 (15:36 +0000)]
MFC r212906:
First step at adapting FreeBSD to support PSARC/2010/029. This makes
acl_is_trivial_np(3) properly recognize the new trivial ACLs. From
the user point of view, that means "ls -l" no longer shows plus signs
for all the files when running ZFS v28.
rmacklem [Mon, 1 Nov 2010 02:21:35 +0000 (02:21 +0000)]
MFC: r214224
Modify the file handle hash function in the experimental NFS
server so that it will work better for non-UFS file systems.
The new function simply sums the bytes of the fh_fid field
of fhandle_t.
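A sketch of a byte-sum hash of this kind. struct fid_bytes is a
hypothetical stand-in for the fh_fid field, and the 16-byte size is an
assumption; the real server code may fold the sum differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fh_fid bytes of a file handle. */
struct fid_bytes {
	uint8_t data[16];
};

/* Sum every byte of the fid; the caller reduces it to a table index. */
uint32_t
fh_hash(const struct fid_bytes *fid)
{
	uint32_t sum = 0;
	size_t i;

	assert(fid != NULL);
	for (i = 0; i < sizeof(fid->data); i++)
		sum += fid->data[i];
	return (sum);
}
```

Because every byte contributes, the hash does not depend on where a
particular file system happens to store its inode number inside the fid.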
rmacklem [Mon, 1 Nov 2010 01:55:15 +0000 (01:55 +0000)]
MFC: r214149
Modify the experimental NFS server in a manner analogous to
r214049 for the regular NFS server, so that it will not do
a VOP_LOOKUP() of ".." when at the root of a file system
when performing a ReaddirPlus RPC.
rmacklem [Mon, 1 Nov 2010 01:03:05 +0000 (01:03 +0000)]
MFC: r214048, r214053
Modify the NFS clients and the NLM so that the NLM can be used
by both clients. Since the NLM uses various fields of the
nfsmount structure, those fields were extracted and put in a
separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h.
This structure also has a function pointer for a function that
extracts the required information from the mount point and nfs vnode
for that particular client. Also, fix the type of the 3rd argument for
this function.
nyan [Sun, 31 Oct 2010 08:14:52 +0000 (08:14 +0000)]
MFC: revision 208638
- Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
- Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
suspended in cpu specific states. This function can fail and cause the
scheduler to fall back to another mechanism (ipi).
- Implement support for mwait in cpu_idle() on i386/amd64 machines that
support it. mwait is a higher performance way to synchronize cpus
as compared to hlt & ipis.
- Allow selecting the idle routine by name via sysctl machdep.idle. This
replaces machdep.cpu_idle_hlt. Only idle routines supported by the
current machine are permitted.
bz [Sat, 30 Oct 2010 12:05:20 +0000 (12:05 +0000)]
MFC r213932:
MfP4 CH182763 (original version):
Make it harder to exploit certain in_control() related races between the
initial lookup at the beginning and the time we will remove the entry
from the lists by re-checking that entry is still in the list before
trying to remove it.
Reported by: Nima Misaghian (nima_misa hotmail.com) on net@ 20100817
Tested by: Nima Misaghian (nima_misa hotmail.com) (original version)
PR: kern/146250
Submitted by: Mikolaj Golub (to.my.trociny gmail.com) (different version)
bz [Sat, 30 Oct 2010 11:54:55 +0000 (11:54 +0000)]
MFC r213930:
Close a race acquiring the IF_ADDR_LOCK() for each entry while iterating
over all interfaces to make sure the address will neither change nor be
freed while we are working on it.
alc [Sat, 30 Oct 2010 04:53:50 +0000 (04:53 +0000)]
MFC r213408
If vm_map_find() is asked to allocate a superpage-aligned region of
virtual addresses that is greater than a superpage in size but not a
multiple of the superpage size, then vm_map_find() is not always
expanding the kernel pmap to support the last few small pages being
allocated. Previously, we grew the kernel page table in
vm_map_findspace() when we found the first available virtual address.
Now, instead, we defer the call to pmap_growkernel() until we are
committed to a range of virtual addresses in vm_map_insert().
kib [Sat, 30 Oct 2010 01:19:15 +0000 (01:19 +0000)]
MFC r213916:
Provide vfs.ncsizefactor instead of hard-coding namecache ratio.
Move debug.ncnegfactor to vfs.ncnegfactor.
Provide some descriptions for the namecache related sysctls.
tuexen [Thu, 28 Oct 2010 19:10:31 +0000 (19:10 +0000)]
MFC r212799:
* Implement initial version of send buffer splitting.
* Make send/recv buffer splitting switchable via sysctl.
* While there: Fix some comments.
rrs [Thu, 28 Oct 2010 17:17:45 +0000 (17:17 +0000)]
MFC of 212225
Fix some CLANG warnings. One clang warning is left
due to the fact that it's bogus. nam->sa_family will
not change from AF_INET6 to AF_INET (but clang
thinks it does ;-D)
tuexen [Thu, 28 Oct 2010 17:04:32 +0000 (17:04 +0000)]
MFC 212099:
Fix the SCTP_WITH_NO_CSUM option when used in combination with
interface supporting CRC offload. While at it, make use of the
feature that the loopback interface provides CRC offloading.
tuexen [Thu, 28 Oct 2010 17:02:36 +0000 (17:02 +0000)]
MFC 211969:
Fix the SCTP_WITH_NO_CSUM option when used in combination with
interface supporting CRC offload. While at it, make use of the
feature that the loopback interface provides CRC offloading.
tuexen [Thu, 28 Oct 2010 16:58:12 +0000 (16:58 +0000)]
MFC 211944:
Fix the switching on/off of CMT using sysctl and socket option.
Fix the switching on/off of PF and NR-SACKs using sysctl.
Add minor improvement in handling malloc failures.
Improve the address checks when sending.
tuexen [Thu, 28 Oct 2010 16:53:54 +0000 (16:53 +0000)]
MFC r211030:
Fix a bug where MSG_TRUNC was not returned in all necessary cases for
SOCK_DGRAM socket. MSG_TRUNC was only returned when some mbufs could
not be copied to the application. If some data was left in the last
mbuf, it was correctly discarded, but MSG_TRUNC was not set.
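From the application side, the flag is read from msg_flags after
recvmsg(2). The sketch below uses an AF_UNIX socket pair purely for
convenience (an assumption; the commit does not say which protocol was
affected), and datagram_truncated() is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send a datagram of send_len bytes over a local socket pair, receive
 * it into a buffer of recv_len bytes, and report whether MSG_TRUNC was
 * set.  Returns 1 if truncated, 0 if not, -1 on error.
 */
int
datagram_truncated(size_t send_len, size_t recv_len)
{
	char out[64], in[64];
	struct iovec iov;
	struct msghdr msg;
	int sv[2], rv = -1;

	assert(send_len <= sizeof(out) && recv_len <= sizeof(in));
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
		return (-1);
	memset(out, 'x', sizeof(out));
	if (send(sv[0], out, send_len, 0) == (ssize_t)send_len) {
		memset(&msg, 0, sizeof(msg));
		iov.iov_base = in;
		iov.iov_len = recv_len;
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;
		if (recvmsg(sv[1], &msg, 0) >= 0)
			rv = (msg.msg_flags & MSG_TRUNC) != 0;
	}
	close(sv[0]);
	close(sv[1]);
	return (rv);
}
```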
edwin [Thu, 28 Oct 2010 00:54:18 +0000 (00:54 +0000)]
MFC of 214124
Fix printing of files located on ZFS filesystem with an st_dev or
st_ino larger than 2**31.
From the PR:
Printing from a ZFS filesystem using 'lp' fails and returns an
email reporting "Your printer job was not printed because it was
not linked to the original file".
In order to protect against files being switched when files
are printed using 'lp' or 'lpr -s', the st_dev and st_ino
values for the original file are saved by lpr and verified
by lpd before the file is printed. Unfortunately, lpr prints
both values using '%d' (although both fields are unsigned)
and lpd(8) assumes a string of decimal digits.
ZFS (at least) generates st_dev values greater than 2^31-1,
resulting in negative values being printed - which lpd cannot
parse, leading it to report that the file has been switched.
A similar problem would occur with large inode numbers.
How-To-Repeat:
Find a file with either st_dev or st_ino greater than 2^31-1
(stat(1) will report both numbers) and print it with 'lpr -s'.
This should generate an email reporting that the file could
not be printed because it was not linked to the original file
PR: bin/151567
Submitted by: Peter Jeremy <Peter.Jeremy@alcatel-lucent.com>
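The failure mode can be reproduced outside lpr. This is a sketch with
hypothetical helper names; whether the committed fix uses exactly this
conversion is not stated in the log.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if s is a non-empty string of decimal digits, as lpd expects. */
int
all_digits(const char *s)
{
	assert(s != NULL);
	if (*s == '\0')
		return (0);
	for (; *s != '\0'; s++)
		if (!isdigit((unsigned char)*s))
			return (0);
	return (1);
}

/* Format an identifier the old way, as a signed int. */
void
fmt_signed(char *buf, size_t len, uint32_t id)
{
	/*
	 * The cast mirrors what '%d' did to an unsigned field; the result
	 * for values above INT_MAX is implementation-defined and wraps to
	 * a negative number on common platforms.
	 */
	snprintf(buf, len, "%d", (int)id);
}

/* Format it as an unsigned value, wide enough for any field size. */
void
fmt_unsigned(char *buf, size_t len, uint32_t id)
{
	snprintf(buf, len, "%ju", (uintmax_t)id);
}
```

An id of 2^31 formatted with fmt_signed() starts with a minus sign and
fails all_digits(); the same id formatted with fmt_unsigned() passes.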
kib [Wed, 27 Oct 2010 16:01:57 +0000 (16:01 +0000)]
MFC r213983:
Document vunref(9), add some important notes for vrele(9) and vput(9).
Merge all three manpages to one, removing separate file for vput(9).
kib [Wed, 27 Oct 2010 15:57:17 +0000 (15:57 +0000)]
MFC r213716:
Add macro DECLARE_MODULE_TIED to denote a module as requiring the
kernel of exactly the same __FreeBSD_version as the headers module was
compiled against.
Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules
use kernel interfaces that the Release Engineering Team feel are not
stable enough to guarantee they will not change during the life cycle
of a STABLE branch. In particular, the layout of struct sysentvec is
declared to be not part of the STABLE KBI.
kib [Wed, 27 Oct 2010 15:44:49 +0000 (15:44 +0000)]
MFC r213664:
The r184588 changed the layout of struct export_args, causing an ABI
breakage for old mount(2) syscall, since most struct <filesystem>_args
embed export_args. The mount(2) is supposed to provide ABI
compatibility for pre-nmount mount(8) binaries, so restore ABI to
pre-r184588.
rmacklem [Wed, 27 Oct 2010 13:10:08 +0000 (13:10 +0000)]
MFC: r213756
Fix the krpc so that it can handle NFSv3,UDP mounts with a read/write
data size greater than 8192. Since soreserve(so, 256*1024, 256*1024)
would always fail for the default value of sb_max, modify clnt_dg.c
so that it uses the calculated values and checks for an error return
from soreserve(). Also, add a check for error return from soreserve()
to clnt_vc.c and change __rpc_get_t_size() to use sb_max_adj instead of
the bogus maxsize == 256*1024.
yongari [Wed, 27 Oct 2010 02:04:24 +0000 (02:04 +0000)]
MFC r213796:
Rewrite interrupt handler to give fairness for both RX and TX.
Previously rl(4) continuously checked whether there are RX events
or TX completions in forever loop. This caused TX starvation under
high RX load as well as consuming too much CPU cycles in the
interrupt handler. If interrupt was shared with other devices which
may be always true due to USB devices in these days, rl(4) also
tried to process the interrupt. This means polling(4) was the only
way to mitigate these issues.
To address these issues, rl(4) now disables interrupts when it
knows the interrupt is ours and limits the number of iterations of
the loop to 16. The interrupt would be enabled again before exiting
interrupt handler if the driver is still running. Because RX buffer
is 64KB in size, the number of iterations in the loop has nothing
to do with number of RX packets being processed. This change
ensures sending TX frames under high RX load.
RX handler drops a driver lock to pass received frames to upper
stack such that there is a window that user can down the interface.
So rl(4) now checks whether driver is still running before serving
RX or TX completion in the loop.
While I'm here, exit interrupt handler when driver initialized
controller.
With this change, now rl(4) can send frames under high RX load even
though the TX performance is still not good(rl(4) controllers can't
queue more than 4 frames at a time so low TX performance was one of
design issue of rl(4) controllers). It's much better than previous
TX starvation and you should not notice RX performance drop with
this change. Controller still shows poor performance under high
network load but for many cases it's now usable without resorting
to polling(4).
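The bounded loop can be modelled roughly as below. This is a sketch
only: struct fake_dev and bounded_intr() are hypothetical, pending
counters stand in for the status register bits, and interrupt masking
is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOOPS	16	/* iteration limit named in the log above */

/* Hypothetical device model: pending counts stand in for status bits. */
struct fake_dev {
	int rx_pending;
	int tx_pending;
	int running;
};

/*
 * Service RX and TX in the same round, for at most MAX_LOOPS rounds
 * per interrupt, so a steady stream of RX events cannot starve TX
 * completions.  Returns the number of rounds executed.
 */
int
bounded_intr(struct fake_dev *dev)
{
	int count;

	assert(dev != NULL);
	for (count = 0; count < MAX_LOOPS; count++) {
		if (!dev->running)
			break;		/* interface went down meanwhile */
		if (dev->rx_pending == 0 && dev->tx_pending == 0)
			break;		/* nothing left to do */
		if (dev->rx_pending > 0)
			dev->rx_pending--;
		if (dev->tx_pending > 0)
			dev->tx_pending--;
	}
	return (count);
}
```

Even with far more RX work queued than the loop can drain, the pending
TX completions are handled within the first few rounds and the handler
still returns after 16 rounds.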
Correct offset conversion to little endian. It was implemented in version 2,
but because of a bug it was a no-op, so we were still using offsets in native
byte order for the host. Do it properly this time, bump version to 4 and set
the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4.
Reported by: ivoras
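A host-independent conversion of this kind can be sketched as follows.
offset_to_le() and uses_native_byte_order() are hypothetical names; only
the version threshold of 4 comes from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit offset in little-endian byte order on any host. */
void
offset_to_le(uint8_t out[8], uint64_t offset)
{
	int i;

	assert(out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(offset >> (8 * i));
}

/* Metadata older than version 4 keeps offsets in native byte order. */
int
uses_native_byte_order(unsigned int version)
{
	return (version < 4);
}
```

Shifting instead of copying memory gives the same bytes on big-endian
and little-endian hosts, which is the property the earlier no-op
conversion failed to provide.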
r212845 (by brian):
Support attaching version 4 metadata
Reviewed by: pjd
r212846:
Fix indent.
r212934 (by brian):
Add a geli resize subcommand to resize encrypted filesystems prior
to growing the filesystem.
Refuse to attach providers where the metadata provider size is
wrong. This makes post-boot attaches behave consistently with
pre-boot attaches. Also refuse to restore metadata to a provider
of the wrong size without the new -f switch. The new -f switch
forces the metadata restoration despite the provider size, and
updates the provider size in the restored metadata to the correct
value.
Helped by: pjd
Reviewed by: pjd
r213055:
When trashing metadata, flush after each write.
r213056:
Simplify code a bit by using g_*() API from libgeom.
r213057:
- Make use of g_*() API.
- Flush cache after writing metadata.
r213058:
Because we first write metadata into new place and then trash old place we
don't want situation where old size is equal to new size, as we will trash
newly written metadata.
r213059:
- Use g_*() API when doing backups.
- fsync() created files.
r213060:
- When trashing metadata, repeat overwrite kern.geom.eli.overwrites times.
- Flush write cache after each write.
r213062:
Define default overwrite count, so that userland can use it.
r213063:
Make the code similar to the code in g_eli_integrity.c.
r213067:
Implement switching of data encryption key every 2^20 blocks.
This ensures the same encryption key won't be used for more than
2^20 blocks (sectors). This will be the default now.
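The key covering a given block follows from the block number alone. A
sketch, with key_index() as a hypothetical helper; how the per-range
key itself is derived is not described here.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_SHIFT	20	/* one key per 2^20 blocks */

/* Hypothetical helper: index of the data key covering a given block. */
uint64_t
key_index(uint64_t blockno)
{
	return (blockno >> KEY_SHIFT);
}
```

Blocks 0 through 2^20 - 1 map to key 0, block 2^20 is the first one
encrypted with key 1.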
r213070:
Add support for AES-XTS. This will be the default now.
r213071:
Document AES-XTS.
r213072:
Update copyright years.
r213073:
Update copyright years.
r213164:
Ignore errors from BIO_FLUSH. It might confuse users that provider wasn't
really killed. What we really care about are write errors only.
r213165:
Change g_eli_debug to int, so one can turn off any GELI output by setting
kern.geom.eli.debug sysctl to -1.
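Why the type matters, as a sketch: assuming messages carry a level of 0
or higher and are printed when the knob is at least that level (an
assumption about the filter, should_log() is hypothetical), only a
signed knob can sit below level 0.

```c
#include <assert.h>

/*
 * Hypothetical filter: print a message when the debug knob is at or
 * above the level of the message.  With a signed knob, -1 is below
 * every level and silences all output.
 */
int
should_log(int debug_level, int msg_level)
{
	assert(msg_level >= 0);
	return (debug_level >= msg_level);
}
```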
r213172:
- Add support for loading passphrase from a file (-J and -j options).
This is especially useful for things like installers, where regular
geli prompt can't be used.
- Add support for specifying multiple -K or -k options, so there is no
need to cat all keyfiles and read them from standard input.
Requested by: Kris Moore <kris@pcbsd.org>, thompsa
r214116:
- Add missing comments.
- Make a comment consistent with others.
r214118:
Bring in geli suspend/resume functionality (finally).
Before this change if you wanted to suspend your laptop and be sure that your
encryption keys are safe, you had to stop all processes that use file system
stored on encrypted device, unmount the file system and detach geli provider.
This isn't very handy. If you are a lucky user of a laptop where suspend/resume
actually works with FreeBSD (I'm not!) you most likely want to suspend your
laptop, because you don't want to start everything over again when you turn
your laptop back on.
And this is where geli suspend/resume steps in. When you execute:
# geli suspend -a
geli will wait for all in-flight I/O requests, suspend new I/O requests, remove
all geli sensitive data from the kernel memory (like encryption keys) and will
wait for either 'geli resume' or 'geli detach'.
Now with no keys in memory you can suspend your laptop without stopping any
processes or unmounting any file systems.
When you resume your laptop you have to resume geli devices using 'geli resume'
command. You need to provide your passphrase, etc. again so the keys can be
restored and suspended I/O requests released.
Of course you need to remember that 'geli suspend' won't clear file system
cache and other places where data from your geli-encrypted file system might be
present. But to get rid of those stopping processes and unmounting file system
won't help either - you have to turn your laptop off. Be warned.
Also note, that suspending geli device which contains file system with geli
utility (or anything used by 'geli resume') is not very good idea, as you won't
be able to resume it - when you execute geli(8), the kernel will try to read it
and this read I/O request will be suspended.
r214133:
Fix a bug introduced in r213067 where we use authentication key before
initializing it.
r214163:
Free opencrypto sessions on suspend, as they also might keep encryption keys.
r214225:
Move sc_akeyctx and sc_ivctx initialization to the g_eli_mkey_propagate()
function which eliminates code duplication and will ensure proper order
of operation.
r214226:
Encryption keys array might be NULL if device is suspended. Check for this, so
we don't panic when we detach suspended device.
r214227:
Add State tag, so 'geli status' will report active/suspended status, eg:
# geli status
Name Status Components
da0.eli SUSPENDED da0
da1.eli ACTIVE da1
r214228:
Close a race between checking if device is already suspended and suspending it.
r214229:
- Improve error messages, so instead of 'Not fully done', the user will get
information that device is already suspended or that device is using
one-time key and suspend is not supported.
- 'geli suspend -a' silently skips devices that use one-time key, this is fine,
but because we log which devices were suspended on the console, log also which
devices were skipped.
r214404:
Use fprintf(stderr) instead of gctl_error() to print a warning about too
big sector size. When gctl error is set gctl_has_param() always returns
'false', which prevents geli(8) from finding some arguments and also masks
an error, which is generated in such a case.
bschmidt [Tue, 26 Oct 2010 20:23:29 +0000 (20:23 +0000)]
MFC r214069:
Fix an undefined behaviour if the desired ratectl algo is not available.
This can happen if the algos are built as modules but are not loaded. If
the selected ratectl algo is not available, try to load it (The load
module functions does nothing currently). Add a dummy ratectl algo which
always selects the first available rate. Use that one if the desired algo
is not available.
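The fallback can be sketched as a lookup that never returns NULL. All
names here (struct ratectl, ratectl_lookup(), RATECTL_MAX) are
hypothetical and do not mirror the net80211 interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define RATECTL_MAX	4	/* hypothetical number of algorithm slots */

struct ratectl {
	const char *name;
	int (*pick)(const int *rates, int nrates);
};

/* Dummy algorithm: always select the first available rate. */
int
pick_first(const int *rates, int nrates)
{
	assert(rates != NULL && nrates > 0);
	return (rates[0]);
}

const struct ratectl ratectl_none = { "none", pick_first };

/* Slots stay NULL until the matching module registers itself. */
static const struct ratectl *ratectls[RATECTL_MAX];

void
ratectl_register(int type, const struct ratectl *rc)
{
	assert(type >= 0 && type < RATECTL_MAX);
	ratectls[type] = rc;
}

/* Never return NULL: fall back to the dummy when the slot is empty. */
const struct ratectl *
ratectl_lookup(int type)
{
	if (type < 0 || type >= RATECTL_MAX || ratectls[type] == NULL)
		return (&ratectl_none);
	return (ratectls[type]);
}
```

Callers can then invoke the returned pick function unconditionally,
which removes the undefined behaviour of calling through an empty slot.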
rrs [Tue, 26 Oct 2010 19:08:26 +0000 (19:08 +0000)]
MFC:210599
PR SCTP Bugs. Basically a full sized frame of
PR SCTP FWD-TSN's would not be sent and thus
cause a stalled connection. Also the rwnd
Calculation was also off on the receiver side for
PR-SCTP.
rrs [Tue, 26 Oct 2010 19:06:31 +0000 (19:06 +0000)]
MFC:210494
Make sure that we report chunks if a socket
still exists that were not sent. In either
case carefully remove the data if it does not
get taken by the reporting routines.
rrs [Tue, 26 Oct 2010 19:04:05 +0000 (19:04 +0000)]
MFC:210493
When counting the number of chunks in the
retransmission queue to validate the retran count, we
need to include the chunks in the control send queue
too. Otherwise the count will not match and you will get
the invariant warning if invariants are on.
rrs [Tue, 26 Oct 2010 18:59:36 +0000 (18:59 +0000)]
MFC of 209663
This fixes a crash in SCTP. It was possible to have a
large number of packets queued to a crashing process.
In a specific case you may get 2 ABORT's back (from
say two packets in flight). If the aborts happened to
be processed at the same time it's possible to have
one free the association while the other is trying
to report all the outbound packets. When this occurred
it could lead to a crash.
rrs [Tue, 26 Oct 2010 18:56:55 +0000 (18:56 +0000)]
MFC of 209644
Log is:
Fix a bug that will cause a panic. Basically
a read-lock is being called to check the vtag-timewait cache.
Then in two cases (where a vtag is bad i.e. in the time-wait
state) the write-unlock is called NOT the read-unlock. Under
conditions where lots of associations are coming and going
this will cause the system to panic with invariants on.
bschmidt [Tue, 26 Oct 2010 17:30:34 +0000 (17:30 +0000)]
MFC r213729:
Fix monitor mode which is implemented by doing a firmware scan. This
is a port from stable/6, seems like the code got lost during the
background scan changes in r170530.
nwhitehorn [Tue, 26 Oct 2010 14:59:35 +0000 (14:59 +0000)]
MFC r212360:
On architectures with non-tree-based page tables like PowerPC, every page
in a range must be checked when calling pmap_remove(). Calling
pmap_remove() from vm_pageout_map_deactivate_pages() with the entire range
of the map could result in attempting to demap an extraordinary number
of pages (> 10^15), so iterate through each map entry and unmap each of
them individually.
nwhitehorn [Tue, 26 Oct 2010 14:56:46 +0000 (14:56 +0000)]
MFC r213456:
Handle vector assist traps without a kernel panic, by setting denormalized
values to zero. A correct solution would involve emulating vector
operations on denormalized values, but this has little effect on accuracy
and is much less complicated for now.
davidxu [Tue, 26 Oct 2010 09:25:29 +0000 (09:25 +0000)]
MFC r213241, r213257:
In current code, statically initialized and destroyed objects have
the same null value, so the code cannot distinguish between them. To
fix the problem, a destroyed object is now assigned a non-null
value, and it will be rejected by some pthread functions.
PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP is changed to number 1, so that
adaptive mutex can be statically initialized correctly.
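The idea of separate sentinels can be sketched as below. This is not
the libthr code: struct obj, the OBJ_* values and both functions are
hypothetical, and the value chosen for the destroyed marker is
arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct obj {
	int locked;
};

/*
 * Hypothetical sentinels: NULL means "statically initialized, not yet
 * allocated"; a distinct non-null value means "destroyed".
 */
#define OBJ_INITIALIZER	((struct obj *)NULL)
#define OBJ_DESTROYED	((struct obj *)2)

int
obj_lock(struct obj **op)
{
	assert(op != NULL);
	if (*op == OBJ_DESTROYED)
		return (EINVAL);	/* reject use after destroy */
	if (*op == OBJ_INITIALIZER) {	/* lazy allocation on first use */
		*op = calloc(1, sizeof(**op));
		if (*op == NULL)
			return (ENOMEM);
	}
	(*op)->locked = 1;
	return (0);
}

int
obj_destroy(struct obj **op)
{
	assert(op != NULL);
	if (*op == OBJ_DESTROYED)
		return (EINVAL);
	free(*op);			/* free(NULL) is a no-op */
	*op = OBJ_DESTROYED;
	return (0);
}
```

A statically initialized object is still allocated on first use, while
any call on a destroyed object returns EINVAL instead of silently
reinitializing it.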