The vm_pageout_flush() functions sbusies pages in the passed pages
run. After that, the pager put method is called, usually translated
to VOP_WRITE(). For the filesystems which use buffer cache,
bufwrite() sbusies the buffer pages again, waiting for the xbusy state
to drain. The later is done in vfs_drain_busy_pages(), which is
called with the buffer pages already sbusied (by vm_pageout_flush()).
Since vfs_drain_busy_pages() can only wait for one page at the time,
and during the wait, the object lock is dropped, previous pages in the
buffer must be protected from other threads busying them. Up to the
moment, it was done by xbusying the pages, that is incompatible with
the sbusy state in the new implementation of busy. Switch to sbusy.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
The vm_page_trysbusy() should not fail when shared busy counter or
VPB_BIT_WAITERS flag were changed between reading of busy_lock and the
cas. The vm_page_sbusy(), which is the only user of
vm_page_trysbusy() in the tree, panics on the failure, which in these
cases is transient and do not mean that the current page state
prevents sbusying.
Retry the operation inside vm_page_trysbusy() if cas failed, only
return a failure when VPB_BIT_SHARED is cleared.
Reported and tested by: pho
Reviewed by: attilio
Sponsored by: The FreeBSD Foundation
Fix file selection logic for the RCS/SCCS case, as was done for the simple
file case before. Bump version because of the changed behavior, which now
matches the documentation.
Remove fallback to fork(2) if pdfork(2) is not available. If the parent
process dies, the process descriptor will be closed and pdfork(2)ed child
will be killed, which is not the case when regular fork(2) is used.
The PROCDESC option is now part of the GENERIC kernel configuration, so we
can start depending on it.
Add UPDATING entry to inform that this option is now required and log
detailed instruction to syslog if pdfork(2) is not available:
The pdfork(2) system call is not available; recompile the kernel with options PROCDESC
Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org>
Sponsored by: Google Summer of Code 2013
Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.
The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.
The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.
Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:
Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:
Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.
This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.
Correct blkback handling of the BLKIF_OP_FLUSH_DISKCACHE opcode.
Properly round-trip the "operation code" for client requests.
sys/dev/xen/blkback/blkback.c:
In xbb_dispatch_dev() when processing a flush request,
correctly set bio->bio_caller1 to the request list (not
bare request) for the operation, as is expected by the
completion handler xbb_bio_done().
In xbb_get_resources(), initialize "operation" in the
driver's internal request object from the client's "ring
request", so it is correct when used to populate the reply
when this operation completes.
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Reviewed by: gibbs
- Restore the pre-PCID TLB shootdown handlers for whole address space
and single page invalidation asm code, and assign the IPI handler to
them when PCID is not supported or disabled. Old handlers have
linear control flow. But, still use the common return sequence.
- Stop using pcpu for INVPCID descriptors in the invlrg handler. It
is enough to allocate descriptors on the stack. As result, two
SWAPGS instructions are shaved off from the code for Haswell+.
- Fix the reverted condition in invlrng for checking of the PCID
support [1], also in invlrng check that pmap is kernel pmap before
performing other tests. For the kernel pmap, which provides global
mappings, the INVLPG must be used for invalidation always.
- Save the pre-computed pmap' %CR3 register in the struct pmap. This
allows to remove several checks for pm_pcid validity when %CR3 is
reloaded [2].
Noted by: gibbs [1]
Discussed with: alc [2]
Tested by: pho, flo
Sponsored by: The FreeBSD Foundation
Crashes have been observed for NFSv4.1 mounts when the system
is being shut down which were caused by the nfscbd_pool being
destroyed before the backchannel is disabled. This patch is
believed to fix the problem, by simply avoiding ever destroying
the nfscbd_pool. Since the NFS client module cannot be unloaded,
this should not cause a memory leak.
sh: Make return return from the closest function or dot script.
Formerly, return always returned from a function if it was called from a
function, even if there was a closer dot script. This was for compatibility
with the Bourne shell which only allowed returning from functions.
Other modern shells and POSIX return from the function or the dot script,
whichever is closest.
Git 1.8.4's rebase --continue depends on the POSIX behaviour.
Rework the timeout code to use actual time rather than a DELAY() loop and
to use both typical and maximum to allow logging of timeout failures.
Also correct the erase timeout, it is specified in milliseconds not
microseconds like the other timeouts. Do not invoke DELAY() between
status queries as this adds significant latency which in turn reduced
write performance substantially.
Sanity check timeout values from the hardware.
Implement support for buffered writes (only enabled on Intel/Sharp parts
for now). This yields an order of magnitude speedup on the 64MB Intel
StrataFlash parts we use.
When making a copy of the block to modify, also keep a clean copy around
until we are ready to commit the block and use it to avoid unnecessary
erases. In the non-buffer write case, also use it to avoid
unnecessary writes when the block has not been erased. This yields a
significant speedup when doing things like zeroing a block.
Add a c++/v1/tr1 include directory containing symlinks to all of the standard
headrs.
Lots of third-party code expects to find C++03 headers under tr1 because that's
where GNU decided to hide them. This should fix ports that expect them there.
For TOE connections, the window scale factor in CPL_PASS_ACCEPT_REQ is
set to 15 to indicate that the peer did not send a window scale option
with its SYN. Do not send a window scale option in the SYN|ACK reply
in that case.
The old (2.1) GNU patch has outlived its days. The major
local changes have been moved into the less restrictedly
licensed patch(1) we adopted in usr.bin/ .
A much newer version of GNU patch is available in the
ports tree (devel/patch).
Use the fact that the AES-NI instructions can be pipelined to improve
performance... Use SSE2 instructions for calculating the XTS tweek
factor... Let the compiler do more work and handle register allocation
by using intrinsics, now only the key schedule is in assembly...
Replace .byte hard coded instructions w/ the proper instructions now
that both clang and gcc support them...
On my machine, pulling the code to userland I saw performance go from
~150MB/sec to 2GB/sec in XTS mode. GELI on GNOP saw a more modest
increase of about 3x due to other system overhead (geom and
opencrypto)...
These changes allow almost full disk io rate w/ geli...
Reviewed by: -current, -security
Thanks to: Mike Hamburg for the XTS tweek algorithm
add support to gcc for AES and PCLMUL intrinsics... This addes the
-maes option, but not the -mpclmul option as I ran out of bits in
the 32 bit flags field... You can -D__PCLMUL__ to get this, but it
won't be compatible w/ clang and modern gcc...
sys/dev/xen/blkback/blkback.c:
Initialize the request id for requests in xbb_get_resources()
instead of its previous location in xbb_dispatch_io(). This
guarantees that all request types (e.g. BLKIF_OP_FLUSH_DISKCACHE)
have the front-end specified id recorded.
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Include the calling context in the mail subject, if any.
More concretely, periodic security scripts defaults to being
called from daily ones -- daily context -- so the mail subject
will now be "${HOST} daily security run output" instead of
"{HOST} security run output".
If you switch the period of some security checks to weekly, you
will receive another email "${HOST} weekly security run output".
Since r254974, periodic scripts' period can be configured
independently. There is no reason to leave their options
with the daily ones, so move them to their own section.
Since r254974, periodic scripts' period can be configured
independently. There is no reason to leave their options
with the daily ones, so move them to their own section.
Move periodic scripts' options into their own section. Since r254974,
Refactor PowerPC hwpmc(4) driver into generic and specific. More refactoring
will likely be done as more drivers are added, since AIM-compatible processors
have similar PMC configuration logic.
Create the default router last. This allows using an static
interface route for default routes, which seems to be common
among many dedicated hosting providers.
All changes affect only SCTP-AUTH:
* Remove non working code related to SHA224.
* Remove support for non-standardised HMAC-IDs using SHA384 and SHA512.
* Prefer SHA256 over SHA1.
* Minor cleanup.
sh: Fix race condition with signals and wait or set -T.
The change in r238888 was incomplete. It was still possible for a trapped
signal to arrive before the shell went to sleep (sigsuspend()) because a
check was missing or because the signal arrived before in_waitcmd was set.
On SMP, this bug sometimes caused the builtins/wait4.0 test to take 1 second
to execute; it then might or might not fail. On UP, the test almost always
failed.
Make ELI destruction (including orphanization) less aggressive, making it
always wait for provider close. Old algorithm was reported to cause NULL
dereference panic on attempt to close provider after softc destruction.
If not global workaroung in GEOM, that could even cause destruction with
requests still in flight.
Merge 1.12 of pf_lb.c from OpenBSD, with some changes. Original commit:
date: 2010/02/04 14:10:12; author: sthen; state: Exp; lines: +24 -19;
pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.
- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.
- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.
Help/ok claudio@.
Some additional changes to 1.12:
- We also need to bzero() the key to zero padding, otherwise key
won't match.
- Collapse two if blocks into one with ||, since both conditions
lead to the same processing.
- Only naddr changes in the cycle, so move initialization of other
fields above the cycle.
- s/u_intXX_t/uintXX_t/g
Fix socket buffer timeouts precision using the new sbintime_t KPI instead
of relying on the tvtohz() workaround. The latter has been introduced
lately by jhb@ (r254699) in order to have a fix that can be backported
to STABLE.
Forced dismounts of NFS mounts can fail when thread(s) are stuck
waiting for an RPC reply from the server while holding the mount
point busy (mnt_lockref incremented). This happens because dounmount()
msleep()s waiting for mnt_lockref to become 0, before calling
VFS_UNMOUNT(). This patch adds a new VFS operation called VFS_PURGE(),
which the NFS client implements as purging RPCs in progress. Making
this call before checking mnt_lockref fixes the problem, by ensuring
that the VOP_xxx() calls will fail and unbusy the mount point.
Use single underscore for all parameters name and local variables in
bintime_* related functions. This commit completes what was already done
by theraven@ for bintime_shift, and just uses a single underscore instead
of two (which is a style bug according to Bruce). See r251855 for reference.
system(): Restore behaviour for SIGINT and SIGQUIT.
As mentioned in r16117 and the book "Advanced Programming in the Unix
Environment" by W. Richard Stevens, we should ignore SIGINT and SIGQUIT
before forking, since it is not guaranteed that the parent process starts
running soon enough.
To avoid calling sigaction() in the vforked child, instead block SIGINT and
SIGQUIT before vfork() and keep the sigaction() to ignore after vfork(). The
FreeBSD kernel discards ignored signals, even if they are blocked;
therefore, it is not necessary to unblock SIGINT and SIGQUIT earlier.
Bring legacy CAM target implementation back into API/KPI-coherent and even
functional state. While CTL is much more superior target from all points,
there is no reason why this code should not work.
Import multiqueue VirtIO net driver from my user/bryanv/vtnetmq branch
This is a significant rewrite of much of the previous driver; lots of
misc. cleanup was also performed, and support for a few other minor
features was also added.