andre [Mon, 19 Aug 2013 09:49:51 +0000 (09:49 +0000)]
MFC a bundle of commits that bring autotuning to mbufs, maxfiles/sockets
and maxusers to the 9-stable branch. It is committed as a bundle because
these patches build on each other and only provide the functionality in
their entirety. Some are bug fixes to aspects of earlier commits.
MFC r242029 (alfred):
Allow autotune maxusers > 384 on 64 bit machines.
MFC r242847 (alfred):
Allow maxusers to scale on machines with large address space.
MFC r243631 (andre):
Base the mbuf related limits on the available physical memory or
kernel memory, whichever is lower. The overall mbuf related memory
limit must be set so that mbufs (and clusters of various sizes)
can't exhaust physical RAM or KVM.
At the same time divorce maxfiles from maxusers and set maxfiles to
physpages / 8 with a floor based on maxusers. This way busy servers
can make use of the significantly increased mbuf limits with a much
larger number of open sockets.
MFC r243639 (andre):
Complete r243631 by applying the remainder of kern_mbuf.c that got
lost while merging into the commit tree.
MFC r243668 (andre):
A long is the wrong type to represent the realmem and maxmbufmem
variables, as they may overflow on i386/PAE and i386 with > 2GB RAM.
MFC r243995, r243996, r243997 (pjd):
Style cleanups. Make use of the fact that uma_zone_set_max(9) already
returns the actual limit set.
MFC r244080 (andre):
Prevent long type overflow of realmem calculation on ILP32 by forcing
calculation to be in quad_t space. Fix style issue with second parameter
to qmin().
MFC r245469 (alfred):
Do not autotune ncallout to be greater than 18508.
MFC r245575 (andre):
Move the mbuf memory limit calculations from init_param2() to
tunable_mbinit() where it is next to where it is used later.
MFC r246207 (andre):
Remove unused VM_MAX_AUTOTUNE_NMBCLUSTERS define.
MFC r249843 (andre):
Base the calculation of maxmbufmem in part on kmem_map size
instead of kernel_map size to prevent kernel memory exhaustion
by mbufs and a subsequent panic on physical page allocation
failure.
MFC r253204 (andre):
Fix style issues, a typo in "kern.ipc.nmbufs" and correctly place and
expose the value of the tunable maxmbufmem as "kern.ipc.maxmbufmem"
through sysctl.
MFC r253207 (andre):
Make use of the fact that uma_zone_set_max(9) already returns the
rounded limit making a call to uma_zone_get_max(9) unnecessary.
emaste [Sun, 18 Aug 2013 08:24:58 +0000 (08:24 +0000)]
MFC r240970:
- Make C11 atomic macros usable in expressions:
- Replace do-while statements with void expressions.
- Wrap __asm statements in statement expressions.
- Make the macros function-like:
- Evaluate all arguments exactly once.
- Make sure there's a sequence point between evaluation of the
arguments and the function body. Arguments should be evaluated
before any memory barriers.
- Fix use of __atomic_is_lock_free built-in. It requires the address
of an atomic variable as second argument. Use this built-in on clang
as well because clang's __c11_atomic_is_lock_free only takes the size
of the variable into account.
- In atomic_exchange_explicit put the barrier before instead of after
the __sync_lock_test_and_set call.
emaste [Sun, 18 Aug 2013 08:18:49 +0000 (08:18 +0000)]
MFC r239960:
Properly enable Clang-style atomics when available.
In addition to testing against cxx_atomic, we must check c_atomic. The
former is only set when building C++ code. Also use __has_extension
instead of __has_feature. This allows us to use the atomics outside of
C11.
erwin [Fri, 16 Aug 2013 07:11:13 +0000 (07:11 +0000)]
MFC 253983, 253984:
Update Bind to 9.8.5-P2
New Features
Adds a new configuration option, "check-spf"; valid values are
"warn" (default) and "ignore". When set to "warn", checks SPF
and TXT records in spf format, warning if either resource record
type occurs without a corresponding record of the other resource
record type. [RT #33355]
Adds support for Uniform Resource Identifier (URI) resource
records. [RT #23386]
Adds support for the EUI48 and EUI64 RR types. [RT #33082]
Adds support for the RFC 6742 ILNP record types (NID, LP, L32,
and L64). [RT #31836]
Feature Changes
Changes timing of when slave zones send NOTIFY messages after
loading a new copy of the zone. They now send the NOTIFY before
writing the zone data to disk. This will result in quicker
propagation of updates in multi-level server structures. [RT #27242]
"named -V" can now report a source ID string. (This will be
of most interest to developers and troubleshooters). The source
ID for ISC's production versions of BIND is defined in the "srcid"
file in the build tree and is normally set to the most recent
git hash. [RT #31494]
Response Policy Zone performance enhancements. New "response-policy"
option "min-ns-dots". "nsip" and "nsdname" now enabled by default
with RPZ. [RT #32251]
Implement syscall clock_getcpuclockid2, so we can get a clock id
for process, thread or others we want to support.
Use the syscall to implement POSIX API clock_getcpuclockid and
pthread_getcpuclockid.
jfv [Thu, 15 Aug 2013 21:06:38 +0000 (21:06 +0000)]
MFC r254262 Further improve the msix setup, make sure pci_alloc_msix() gives us
the vectors we requested, and fall back to MSI when not, also release
any allocated resources before the fallback.
gjb [Thu, 15 Aug 2013 10:31:31 +0000 (10:31 +0000)]
MFC r254265:
Make sure bootonly.iso for -BETAs and -RCs use the releases/
directory on the FTP mirrors to fetch distributions, since
these are always pushed to releases/ during the release cycle.
ache [Thu, 15 Aug 2013 04:27:10 +0000 (04:27 +0000)]
MFC: r254091
According to POSIX \ in the fnmatch(3) pattern should escape
any character including '\0', but our version replaces an escaped '\0'
with '\\'.
I.e. fnmatch("\\", "\\", 0) should not match while fnmatch("\\", "", 0)
should (Linux and NetBSD do the same). It was the other way around.
tuexen [Thu, 15 Aug 2013 04:25:16 +0000 (04:25 +0000)]
MFC r254338:
Don't send uninitialized memory (two instances of 4 bytes) in
every cookie on the wire. This bug was reported in
https://bugzilla.mozilla.org/show_bug.cgi?id=905080
gshapiro [Thu, 15 Aug 2013 01:32:48 +0000 (01:32 +0000)]
MFC: Temporarily revert sendmail 8.14.7 change to getipnodebyname() flags
to prevent problems between the resolver and Microsoft DNS servers with
AAAA lookups. The upstream open source project will work on a more
permanent fix for the next release. Issue noted by Pavel Timofeev.
dteske [Wed, 14 Aug 2013 16:15:14 +0000 (16:15 +0000)]
MFC r254237:
Add optional support for default override of standard setup; but only if
corresponding functions are provided. If override function does not exist,
boot remains unmodified. This patch should not result in any changes.
scottl [Tue, 13 Aug 2013 22:05:50 +0000 (22:05 +0000)]
Merge r254263:
Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI
command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.
This change fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.
pfg [Sun, 11 Aug 2013 02:53:18 +0000 (02:53 +0000)]
MFC r252890, 252906, r252907, r253861, r254104:
Implementation of the HTree directory index.
This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing
by Vyacheslav Matyushin. It was cleaned up and enhanced for FreeBSD
by Zheng Liu (lz@).
This is an excellent example of work shared among different projects:
Vyacheslav was able to look at an early prototype from Zheng Liu who
was also able to check the code from Haiku (with permission).
As in Linux, the feature is not available by default and must be
enabled explicitly with tune2fs. We still do not support the
workarounds required in readdir for NFS.
Submitted by: Zheng Liu
Tested by: Mike Ma
Sponsored by: Google Inc.
marius [Sat, 10 Aug 2013 00:06:56 +0000 (00:06 +0000)]
MFC: r251782, r251783, r253994
- Remove conflicting macros from SPARC64's atomic(9) header.
- Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate()
to get the semantics when setting the PMAP right.
marius [Sat, 10 Aug 2013 00:00:19 +0000 (00:00 +0000)]
MFC: r241374
Add a unified macro to deny the compiler the ability to reorder
instruction loads/stores at will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.
marius [Fri, 9 Aug 2013 20:58:06 +0000 (20:58 +0000)]
MFC: r253899, r253920
- Implement iclear methods for QUICC and SAB 82532. With r253161 in place,
this is crucial at least for the latter.
What happens is that attaching uart(4) to scc(4) causes the SAB 82532 to
"receive" something and trigger a SER_INT_RXREADY interrupt, given that
at least fast/filter interrupts are already enabled. Prior to r253161,
uart_bus_ihand() was set up at this point and handled that condition,
i. e. read the RX FIFO and issued a Receive Message Complete.
Now, uart_bus_ihand() and uart_intr() are setup after attaching uart(4),
leaving the SER_INT_RXREADY interrupt triggered during the latter to
be handled by the iclear method. However, with that method not implemented,
this in turn causes SAB 82532 to not issue any further SER_INT_RXREADY
interrupts until the RX FIFO is full again. Thus, 15 received bytes go
to nowhere, given that "the other half" of the RX FIFO is used for status
information. Hence, implementing sab82532_bfe_iclear() fixes things again.
Potentially, the same problem exists for QUICC.
- Remove unnecessary __RMAN_RESOURCE_VISIBLE.
- Remove a superfluous header.
- Use KOBJMETHOD_END.
- Mark unused arguments as such.
- Remove variables unused after initialization.
marius [Fri, 9 Aug 2013 19:45:55 +0000 (19:45 +0000)]
MFC: r253742
- Add const-qualifiers to the arguments of isonum_*().
- According to ISO 9660 7.1.2, isonum_712() should return a signed value.
- Try to get isonum_*() closer to style(9).
marius [Fri, 9 Aug 2013 18:54:27 +0000 (18:54 +0000)]
MFC: 254004
As it turns out, MSIs are broken with 2820SA so introduce an AAC_FLAGS_NOMSI
quirk and apply it to these controllers [1]. The same problem was reported
for 2230S, in which case it wasn't actually clear whether the culprit is the
controller or the mainboard, though. In order to be on the safe side, flag
MSIs as being broken with the latter type of controller as well. Given that
these are the only reports of MSI-related breakage with aac(4) so far and
OSes like OpenSolaris unconditionally employ MSIs for all adapters of this
family, however, it doesn't seem warranted to generally disable the use of
MSIs in aac(4).
While at it, simplify the MSI allocation logic a bit; there's no need to
check for the presence of the MSI capability on our own as pci_alloc_msi(9)
will just fail when this kind of interrupt is not available.
Reported and tested by: David Boyd [1]
pfg [Fri, 9 Aug 2013 17:52:56 +0000 (17:52 +0000)]
MFC r252435, r252437, r253163:
Change i_gen in UFS to an unsigned type.
In UFS, i_gen is a randomly generated value and there is no way for
it to be negative. Actually, the value of i_gen is just used to
match bit patterns and it is of no consequence if the values are
signed or not. Following other filesystems, set it to unsigned.
Calculation for older filesystems remains untouched.
kib [Fri, 9 Aug 2013 06:01:52 +0000 (06:01 +0000)]
MFC r253527:
Move the convert_sigevent32() utility function into freebsd32_misc.c
for consumption outside the vfs_aio.c.
For SIGEV_THREAD_ID and SIGEV_SIGNAL notification delivery methods,
also copy in the sigev_value, since librt event pumping loop compares
note generation number with the value passed through sigev_value.
kib [Thu, 8 Aug 2013 06:15:58 +0000 (06:15 +0000)]
MFC r253191:
The vm_fault() should not be allowed to proceed on the map entry which
is being wired now. The entry wired count is changed to non-zero in
advance, before the map lock is dropped. This makes vm_fault()
perceive the entry as wired, and breaks the fragment which moves the
wire count from the shadowed page to the upper page, causing the code
to unwire a non-wired page.
On the other hand, the vm_fault() calls from vm_fault_wire() should be
allowed to proceed, so only drain MAP_ENTRY_IN_TRANSITION from
vm_fault() when wiring_thread is not current.
kib [Thu, 8 Aug 2013 06:12:29 +0000 (06:12 +0000)]
MFC r253190:
Add the thread owner of the MAP_ENTRY_IN_TRANSITION flag to struct
vm_map_entry. In vm_map_wire() and vm_map_unwire(), only process the
entries whose transition owner is the current thread.
kib [Thu, 8 Aug 2013 06:07:28 +0000 (06:07 +0000)]
MFC r253189:
Never remove user-wired pages from an object when doing
msync(MS_INVALIDATE). The vm_fault_copy_entry() requires that object
range which corresponds to the user-wired vm_map_entry, is always
fully populated.
Add OBJPR_NOTWIRED flag for vm_object_page_remove() to request the
preserving behaviour, use it when calling vm_object_page_remove() from
vm_object_sync().
kib [Thu, 8 Aug 2013 06:03:34 +0000 (06:03 +0000)]
MFC r253188:
In the vm_page_set_invalid() function, do not assert that the page is
not busy, since its only caller brelse() can legitimately call it on
busy page.
kib [Wed, 7 Aug 2013 09:18:21 +0000 (09:18 +0000)]
Revert the MFC of the r244237, done as r244806. There are indeed bugs
in XEN pmap. The revert hides a panic at the cost of non-working
vfork(2), which means more obscure misbehaviour in the usermode.
Revert is only done on the stable branch to maintain the consistent
erratic behaviour.
jhb [Tue, 6 Aug 2013 19:23:57 +0000 (19:23 +0000)]
MFC 253048,253423,253449,253653,253774,253785:
- Allow mlx4 devices to switch between Ethernet and Infiniband:
- Fix sysfs attribute handling by using sysctl_handle_string() and
properly handling trailing newlines in attribute values.
- Remove check forbidding requests that would result in one port being
set to Ethernet and the subsequent port being set to IB.
- Avoid trashing IP fragments by correctly managing hardware checksumming.
- Fix panics when downing or unloading the mlx4 driver.
jfv [Tue, 6 Aug 2013 18:20:31 +0000 (18:20 +0000)]
When the igb driver is static there are cases when early interrupts occur,
resulting in a panic in refresh_mbufs. To prevent this, add a check in the
interrupt handler for DRV_RUNNING.
jfv [Tue, 6 Aug 2013 17:11:12 +0000 (17:11 +0000)]
MFC r253865: Fixes to RX_COPY optimization code allowing the removal of the rearm_queues
routine used in local_timer.
r253965: Correct the queue mask bit clearing in the link interrupt handler.
hrs [Fri, 2 Aug 2013 03:46:45 +0000 (03:46 +0000)]
MFC 253751 and 253843:
- Relax the restriction on the member interfaces with LLAs. Two or more
LLAs on the member interfaces are actually harmless when the parent
interface does not have a LLA.
- Add net.link.bridge.allow_llz_overlap. This is a knob to allow LLAs on
a bridge and the member interfaces at the same time. The default is 0.
mav [Thu, 1 Aug 2013 09:42:17 +0000 (09:42 +0000)]
MFC r253754:
Partially close a race between calls of the orphan() method from GEOM and the
close() method from ZFS core, which reliably causes a use-after-free panic if
an SSD vdev is detached during the initial erase.
marius [Wed, 31 Jul 2013 11:36:20 +0000 (11:36 +0000)]
Revert r249530 and re-enable compilation of ctl.ko for all configurations
except i386 XEN, for which it still doesn't build so far.
This is a direct commit to stable/9.
+ Add "-f" to also output filemon(4) information.
+ Add d, p and r switches for recording script sessions with timing data
and playing sessions back with or without time delays.
+ Remove contractions.
MFC r253554:
Fix a panic in the racct code when munlock(2) is called with incorrect values.
The racct code in sys_munlock() assumed that the boundaries provided by
the userland were correct as long as vm_map_unwire() returned
successfully. However the latter contains its own logic and sometimes
manages to do something out of those boundaries, even if they are buggy.
This change makes the racct code use the accounting done by the vm
layer, as it is done in other places such as vm_mlock().
Despite fixing the panic, Alan Cox pointed out that this code is still
racy, though: two simultaneous callers will produce incorrect values.
Reviewed by: alc
MFC r253556:
Fix previous commit when option RACCT is not used.