markm [Mon, 17 Aug 2015 07:36:12 +0000 (07:36 +0000)]
Add DEV_RANDOM pseudo-option and use it to "include out" random(4)
if desired.
Retire randomdev_none.c and introduce random_infra.c for resident
infrastructure. Completely stub out random(4) calls in the "without
DEV_RANDOM" case.
Add RANDOM_LOADABLE option to allow loadable Yarrow/Fortuna/LocallyWritten
algorithm. Add a skeleton "other" algorithm framework for folks
to add their own processing code. NIST, anyone?
Retire the RANDOM_DUMMY option.
Build modules for Yarrow, Fortuna and "other".
Use atomics for the live entropy rate-tracking.
Convert ints to bools for the 'seeded' logic.
Move _write() function from the algorithm-specific areas to randomdev.c
sbruno [Sun, 16 Aug 2015 19:43:44 +0000 (19:43 +0000)]
Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::segs[EM_MAX_SCATTER]
doesn't get overrun by things like NFS that can and do shove more than 32 segs when
being used with em(4) and TSO4.
Update tso handling code in em_xmit() with update from jhb@ in email thread:
https://lists.freebsd.org/pipermail/freebsd-net/2014-July/039306.html
set ifp->if_hw_tsomax, ifp->if_hw_tsomaxsegcount & ifp->if_hw_tsomaxsegsize
to appropriate values.
Define a TSO workaround "magic" number of 4 that is used to avoid an
alignment issue in hardware.
Change a couple of integer values that were used as booleans to actual
bool types.
Ensure that em_enable_intr() enables the appropriate mask of interrupts
and not just a hardcoded define of values.
jilles [Sun, 16 Aug 2015 19:42:15 +0000 (19:42 +0000)]
wordexp(): Stop using the undocumented wordexp builtin.
The functionality of the wordexp builtin is easily replaced using normal
shell code, although performance is slightly worse.
This does not mean that wordexp() will remain shell-independent -- a fully
reliable implementation of WRDE_NOCMD is really only possible using
extensions to the shell, or by adding much of the shell's code to libc.
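For readers unfamiliar with the interface being discussed, a minimal sketch of a wordexp() caller follows. The helper name count_expanded_words is hypothetical and is not part of this commit; only wordexp(), wordfree() and WRDE_NOCMD come from POSIX.

```c
#include <stdio.h>
#include <wordexp.h>

/*
 * Expand `words` the way a shell would and return the number of
 * resulting fields, or -1 on error. WRDE_NOCMD asks the implementation
 * to reject command substitution; as the commit message notes, that
 * check is hard to make fully reliable.
 */
static int
count_expanded_words(const char *words)
{
	wordexp_t we;
	int n;

	if (wordexp(words, &we, WRDE_NOCMD) != 0)
		return (-1);
	n = (int)we.we_wordc;
	for (size_t i = 0; i < we.we_wordc; i++)
		printf("word[%zu] = %s\n", i, we.we_wordv[i]);
	wordfree(&we);
	return (n);
}
```

Calling count_expanded_words("alpha beta gamma") should report three fields on a system with a working wordexp().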
alc [Sun, 16 Aug 2015 17:07:53 +0000 (17:07 +0000)]
As another piece of PG_CACHE page elimination, remove an LRU-defeating call
to vm_page_try_to_cache() from vm_pageout_flush(). Other changes, most
recently r286814, have made this call unnecessary.
ed [Sun, 16 Aug 2015 13:59:11 +0000 (13:59 +0000)]
Pick UINT_MAX / 100 as an upperbound.
The fix that I applied in r286798 is already good, but it assumes that
sizeof(int) > sizeof(short). Express the upperbound in terms of
UINT_MAX. By dividing that by 100, we're sure that the resulting value
is never larger than approximately UINT_MAX / 10, which is safe.
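The bound described above can be sketched as follows. This is an illustration under assumptions, not the code from the commit: accumulate_digit and parse_capped are hypothetical names, and the real parser may differ in detail.

```c
#include <limits.h>

/*
 * Accumulate one decimal digit into *value, but only while the running
 * value is below UINT_MAX / 100. Below that bound, value * 10 + 9 stays
 * around UINT_MAX / 10 and cannot overflow an unsigned int; at or above
 * it, further digits are ignored.
 */
static void
accumulate_digit(unsigned int *value, char c)
{
	if (c < '0' || c > '9')
		return;
	if (*value < UINT_MAX / 100)
		*value = *value * 10 + (unsigned int)(c - '0');
}

/* Parse a digit string with the cap applied at every step. */
static unsigned int
parse_capped(const char *s)
{
	unsigned int v = 0;

	for (; *s != '\0'; s++)
		accumulate_digit(&v, *s);
	return (v);
}
```

Because the check is written in terms of UINT_MAX alone, it does not depend on how sizeof(int) compares to sizeof(short).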
melifaro [Sun, 16 Aug 2015 12:23:58 +0000 (12:23 +0000)]
Split arpresolve() into fast/slow path.
This change isolates the most common case (e.g. successful lookup)
from more complicated scenarios. It also (tries to) make the code
simpler by avoiding the retry: cycle.
The actual goal is to prepare the code for the upcoming change that will
allow LL address retrieval without acquiring LLE lock at all.
gonzo [Sat, 15 Aug 2015 21:47:07 +0000 (21:47 +0000)]
Make dtb file configurable via loader(8) variable. ubldr already checks
"fdt_file" and "fdtfile" U-Boot variables. Add one more check for
"fdt_file" loader(8) variable.
The loader(8) variable takes precedence over the U-Boot environment one.
mav [Sat, 15 Aug 2015 21:46:02 +0000 (21:46 +0000)]
Remove UMA allocation of ATA requests.
After CAM replaced the old ATA stack, this driver processes no more than one
request at a time per channel. Using UMA after that is overkill, so
replace it with simple preallocation of one request per channel.
marcel [Sat, 15 Aug 2015 16:13:28 +0000 (16:13 +0000)]
Improve support for Macs that have a stride not equal to the
horizontal resolution (width). In those cases fb_bpp ended up
completely wrong -- as in 6 bytes per pixel or something like
that. Since we already have a way to calculate fb_depth given
the masks and fb_bpp is effectively the same as fb_depth, all
we need to do is make sure fb_bpp is rounded to the next
multiple of the number of bits in a byte -- we assume we can
divide by the number of bits in a byte throughout vt(4).
While here:
- simplify how we calculate fb_depth.
- use fb_bpp instead of fb_depth to calculate fb_stride;
we know we can divide fb_bpp.
- don't limit fb_width and fb_height by VT_FB_DEFAULT_WIDTH
and VT_FB_DEFAULT_HEIGHT (resp.). Those constants have
no relation to the size of the frame buffer.
This at least fixes "lower-resolution" Macs. We're talking
1280x1024 or so. There still is a problem with 27" Macs,
which typically have a horizontal resolution over 2K.
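The rounding described in this commit can be sketched in a few lines. Everything here is an assumption made for illustration: the helper names are hypothetical, the depth is taken to be the position of the highest bit set in any color mask, and the stride is taken to be given in pixels per scan line.

```c
#include <stdint.h>

#define BITS_PER_BYTE 8	/* NBBY in the kernel; redefined to stay standalone */

/* Depth in bits: position of the highest set bit across all masks. */
static unsigned int
depth_from_masks(uint32_t red, uint32_t green, uint32_t blue, uint32_t rsvd)
{
	uint32_t all = red | green | blue | rsvd;
	unsigned int depth = 0;

	while (all != 0) {
		depth++;
		all >>= 1;
	}
	return (depth);
}

/* Round the depth up to the next multiple of the bits in a byte. */
static unsigned int
bpp_from_depth(unsigned int depth)
{
	return ((depth + BITS_PER_BYTE - 1) / BITS_PER_BYTE * BITS_PER_BYTE);
}

/* Bytes per scan line; safe because bpp is a whole number of bytes. */
static unsigned int
stride_bytes(unsigned int pixels_per_line, unsigned int bpp)
{
	return (pixels_per_line * (bpp / BITS_PER_BYTE));
}
```

With this shape, the bits-per-pixel value is derived from the masks only, so a stride larger than the visible width no longer distorts it.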
marcel [Sat, 15 Aug 2015 15:44:09 +0000 (15:44 +0000)]
Improve the VT initialization message: have it say what the
resolution is. For text mode this is the number of columns
by the number of rows. Include the name of the driver in a
much less prominent way.
mav [Sat, 15 Aug 2015 15:42:21 +0000 (15:42 +0000)]
Move "ioctl" CAM frontend into separate file.
It has nothing to share with the overly large ctl.c other than the device
descriptor, but even that may be counted as a design error that may be fixed later.
At some point we may even want to have several ioctl ports.
mav [Sat, 15 Aug 2015 13:34:38 +0000 (13:34 +0000)]
Drop "internal" CTL frontend.
Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had a consumer for that functionality,
while its implementation polluted many unrelated places.
hselasky [Sat, 15 Aug 2015 12:06:15 +0000 (12:06 +0000)]
Fixes for HIGH speed ISOCHRONOUS traffic. HS ISOCHRONOUS traffic at
intervals less than 250us was not handled properly. Add support for
high-bandwidth ISOCHRONOUS packets. USB webcams, USB audio and USB DVB
devices are expected to work better. High-bandwidth INTERRUPT
endpoints are not yet supported.
hselasky [Sat, 15 Aug 2015 09:00:36 +0000 (09:00 +0000)]
Fix race in USB PF which can happen if we stop tracing exactly when
the kernel is tapping a USB transfer. This leads to a NULL pointer
access. The solution is to only trace while the USB bus lock is
locked.
ed [Sat, 15 Aug 2015 08:42:33 +0000 (08:42 +0000)]
Stop parsing digits if the value already exceeds USHRT_MAX.
There is no need for us to support parsing values that are larger than
the maximum terminal window size. In this case that would be the maximum
of unsigned short.
The problem with parsing larger values is that they can cause integer
overflows when adjusting the cursor position, leading to all sorts of
failing assertions.
oshogbo [Sat, 15 Aug 2015 06:34:49 +0000 (06:34 +0000)]
Add support for arrays in the nvlist library.
- Add
nvlist_{add,get,take,move,exists,free}_{number,bool,string,nvlist,
descriptor} functions.
- Add support for (un)packing arrays.
- Add the nvl_array_next field to the nvlist structure.
If an array is added by the nvlist_{move,add}_nvlist_array function
this field will contain the next element in the array.
- Add the nitems field to the nvpair and nvpair_header structure.
This field contains the number of elements in the array.
- Add special flag (NV_FLAG_IN_ARRAY) which is set if nvlist is a part of
an array.
- Add special type (NV_TYPE_NVLIST_ARRAY_NEXT). This type is used only
on packing/unpacking.
- Add new API for traversing arrays (nvlist_get_array_next).
- Add the nvlist_get_pararr function which combines the
nvlist_get_array_next and nvlist_get_parent functions. If the nvlist is in
an array it will return the next element from the array. If the nvlist is the
last element in the array or it isn't in an array it will return its
container (parent). This function should simplify traversing an nvlist.
- Add tests for new features.
- Add documentation for new functions.
- Add my copyright.
- Regenerate the sys/cddl/compat/opensolaris/sys/nvpair.h file.
rpaulo [Fri, 14 Aug 2015 22:54:52 +0000 (22:54 +0000)]
Introduce a new make variable: NMFLAGS.
As the name indicates, these are flags to pass to nm(1). The newer
binutils have a plugin mechanism so, to build something with LLVM's
LTO, we need to pass flags to nm(1). This commit also extends
lorder(1) to pass NMFLAGS to nm(1).
rmacklem [Fri, 14 Aug 2015 22:02:14 +0000 (22:02 +0000)]
For the case where an NFSv4.1 ExchangeID operation has the client identifier
that already has a confirmed ClientID, the nfsrv_setclient() function would
not fill in the clientidp being returned. As such, the value of ClientID
returned would be whatever garbage was on the stack.
An NFSv4.1 client would not normally do this, but it appears that it can
happen for certain Linux clients. When it happens, the client persistently
retries the ExchangeID and Create_session after Create_session fails when
it uses the bogus clientid. With this patch, the correct clientid is replied.
This problem was identified in a packet trace supplied by
Ahmed Kamal via email.
jah [Fri, 14 Aug 2015 20:08:16 +0000 (20:08 +0000)]
Use pmap_quick_enter_page() to handle bouncing of unmapped buffers in the x86 busdma_bounce implementation. Also treat user buffers as unmapped.
This allows two things:
1. Sync'ing bounced maps in non-sleepable contexts. The physcopy* calls previously used could sleep on sf_buf operations in some cases.
2. Sync'ing user buffers outside the context of the owning process.
alc [Fri, 14 Aug 2015 17:49:03 +0000 (17:49 +0000)]
Stop describing an acquire operation as a read barrier and a release
operation as a write barrier. That description has never been correct,
and it has caused confusion. An acquire operation orders writes as well
as reads, and a release operation orders reads as well as writes.
Also, explicitly say that a thread doesn't see its own accesses being
reordered. The reordering of a thread's accesses is only (potentially)
visible to another thread. Thus, memory barriers need only be used to
control the ordering of accesses between threads, not within a thread.
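The acquire/release pairing described above can be sketched with C11 atomics. This uses <stdatomic.h> rather than the kernel's atomic(9) interface, which is a substitution made here so the example stands alone; publish and try_consume are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;		/* ordinary data, written before publication */
static atomic_bool ready;	/* publication flag, initially false */

/*
 * Writer side. The release store keeps every earlier access of this
 * thread, reads as well as writes, from being reordered after it.
 */
static void
publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Reader side. The acquire load keeps every later access of this
 * thread, writes as well as reads, from being reordered before it.
 */
static bool
try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return (false);
	*out = payload;
	return (true);
}
```

Within a single thread the plain accesses would appear in program order even without the barriers; the ordering only matters when publish() and try_consume() run on different threads.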
ian [Fri, 14 Aug 2015 16:48:07 +0000 (16:48 +0000)]
Use simple fixed name strings for these timecounters and eventtimers which
are tied to fixed pieces of hardware; dynamic string formatting isn't needed.
pfg [Fri, 14 Aug 2015 14:58:04 +0000 (14:58 +0000)]
Remove a stale comment and clarify the original where it was taken from
The comment in the libc/sys symbol map referenced the generated symbols
for the syscall trampolines. Such comment was out of place in the secure
symbol map so remove the stale comment and attempt to clarify the old one
to avoid risks of confusion.
mav [Fri, 14 Aug 2015 13:10:30 +0000 (13:10 +0000)]
2618 arc.c mistypes in the comments
Reviewed by: Jason King <jason.brian.king@gmail.com>
Reviewed by: Josef Sipek <jeffpc@josefsipek.net>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Bart Coddens <bart.coddens@gmail.com>
hselasky [Fri, 14 Aug 2015 12:57:53 +0000 (12:57 +0000)]
Improve the realtime properties of USB transfers for embedded systems
like RPI-B and RPI-2.
Description of problem:
USB transfers can process data in their callbacks sometimes causing
unacceptable latency for other USB transfers. Separate BULK completion
callbacks from CONTROL, INTERRUPT and ISOCHRONOUS callbacks, and give
BULK completion callbacks lesser execution priority than the
others. This way USB audio won't be interfered with by heavy USB ethernet
usage for example.
Further serve USB transfer completion in a round robin fashion,
instead of only serving the most CPU hungry. This has been done by
adding a third flag to USB transfer queue structure which keeps track
of looping callbacks. The "command" callback function then decides
what to do when looping.
As a way to make it more difficult to introduce bugs into the ARC, and to
make it easier to diagnose issues when bugs do creep in, it would be
beneficial to change the type of the arc_state_t's arcs_size field to be
a refcount_t instead of a uint64_t. This would allow us to make stricter
checks when incrementing and decrementing the value with debugging enabled,
but still fallback to simple, fast atomic operations when debugging is
disabled.
https://www.illumos.org/issues/6033
When we're looking for the list containing oldest buffer we never
actually look at the MFU lists even when we try to evict from MFU.
looks like a copy paste error, the fix is here:
mav [Fri, 14 Aug 2015 09:31:07 +0000 (09:31 +0000)]
MFV r277431: 5497 lock contention on arcs_mtx
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
This patch attempts to reduce lock contention on the current arc_state_t
mutexes. These mutexes are used liberally to protect the number of LRU
lists within the ARC (e.g. ARC_mru, ARC_mfu, etc). The granularity at
which these locks are acquired has been shown to greatly affect the
performance of highly concurrent, cached workloads.
pfg [Fri, 14 Aug 2015 03:03:13 +0000 (03:03 +0000)]
Move the stack protector to a new "secure" directory
As part of the code refactoring to support FORTIFY_SOURCE we want
a new subdirectory "secure" to keep the files related to security.
Move the stack protector functions to this new directory.
edwin [Thu, 13 Aug 2015 23:57:44 +0000 (23:57 +0000)]
MFV of 286748,tzdata2015f
Update to tzdata2015f:
Changes affecting future time stamps
North Korea switches to +0830 on 2015-08-15. (Thanks to Steffen Thorsen.)
The abbreviation remains "KST". (Thanks to Robert Elz.)
Uruguay no longer observes DST. (Thanks to Steffen Thorsen and Pablo Camargo.)
Changes affecting past and future time stamps
Moldova starts and ends DST at 00:00 UTC, not at 01:00 UTC. (Thanks to Roman Tudos.)
emaste [Thu, 13 Aug 2015 17:50:47 +0000 (17:50 +0000)]
Roll WITHOUT_ELFTOOLCHAIN_TOOLS into WITHOUT_TOOLCHAIN
The option was added only to ease the transition from GNU Binutils to
ELF Tool Chain tools, and that process is now complete (for the viable
replacements). Noting the removal in UPDATING is sufficient as we have
not shipped a release with the option.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3240
marcel [Thu, 13 Aug 2015 15:16:34 +0000 (15:16 +0000)]
Change md(4) to use weak symbols as start, end and size for the embedded
root disk. The embedded image is linked into the kernel in the .mfs
section.
Add rules and variables to kern.pre.mk and kern.post.mk that handle the
linking of the image. First objcopy is used to generate an object file.
Then, the object file is linked into the kernel.
Submitted by: Steve Kiernan <stevek@juniper.net>
Reviewed by: brooks@
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D2903
ian [Thu, 13 Aug 2015 14:43:25 +0000 (14:43 +0000)]
Constify the pointers to eventtimer and timecounter name strings.
The need for this appears as soon as you try to set the names to something
that isn't a "quoted literal". (I'm actually confused why quoted strings
aren't a problem as well, we must have some warning disabled.)
marcel [Thu, 13 Aug 2015 14:43:11 +0000 (14:43 +0000)]
Fix text mode operation.
We first map 64KB at 0xA0000 and then determine whether to work
in text or graphics mode. In graphics mode, the mapping is
precisely what we need and everything is fine. But text mode
has the frame buffer relocated to 0xB8000, and the 64KB mapping
is too small to safely add 0x18000 bytes to the base address.
Now we first check whether to work in text or graphics mode and
then map the frame buffer at the right address and with the
right size (0xA0000+64KB for graphics, 0xB8000+32KB for text).
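The address arithmetic behind this fix can be sketched as below. The struct and function names are hypothetical; only the base addresses and sizes come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* A physical frame buffer window: start address and length in bytes. */
struct fb_window {
	uint32_t base;
	uint32_t size;
};

/*
 * Decide the mode first, then pick the window to map. The old order
 * mapped 0xA0000..0xAFFFF unconditionally, which ends before the
 * text buffer at 0xB8000 even begins.
 */
static struct fb_window
vga_fb_window(bool textmode)
{
	struct fb_window w;

	if (textmode) {
		w.base = 0xB8000;
		w.size = 32 * 1024;
	} else {
		w.base = 0xA0000;
		w.size = 64 * 1024;
	}
	return (w);
}
```

The graphics window ends at 0xB0000, so adding 0x18000 to its base lands 0x8000 bytes past the end of what was mapped.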
melifaro [Thu, 13 Aug 2015 13:38:09 +0000 (13:38 +0000)]
Move lle update code from the gigantic ip_arpinput() to a
separate bunch of functions. The goal is to isolate actual lle
updates to permit more fine-grained locking.
emaste [Thu, 13 Aug 2015 13:21:00 +0000 (13:21 +0000)]
arm64: turn unknown el0 exception into a SIGILL
It seems we get EXCP_UNKNOWN from QEMU when executing zeroed memory.
Print a register dump here and signal illegal instruction. Also print
a register dump for other invalid exceptions, before panic.
Reviewed by: andrew
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3370
mav [Thu, 13 Aug 2015 00:13:55 +0000 (00:13 +0000)]
MFV 286711: 6096 ZFS_SMB_ACL_RENAME needs to cleanup better
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
mav [Thu, 13 Aug 2015 00:10:36 +0000 (00:10 +0000)]
MFV 286709:
6093 zfsctl_shares_lookup should only VN_RELE() on zfs_zget() success
Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Dan McDonald <danmcd@omniti.com>
mav [Wed, 12 Aug 2015 23:59:17 +0000 (23:59 +0000)]
MFV 286707: 5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
A ZFS feature flag (large blocks) tracks its refcount as the number of
datasets that have ever used the feature. Several features of this type
are planned to be added (new checksum functions). This code should be made
common infrastructure rather than duplicating the code for each feature.
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>
While running 'zfs recv' we noticed that every 128th 8K block required a
read. We were seeing that restore_write() was calling dmu_tx_hold_write()
and the indirect block was not cached. We should prefetch upcoming indirect
blocks to avoid having to go to disk and blocking the restore_write().
Allow an incremental send stream to be received as a clone, even if the
stream does not mark it as a clone.
np [Wed, 12 Aug 2015 22:09:58 +0000 (22:09 +0000)]
Reinstate unify_tcp_port_space and associated code that was lost during
the last OFED update (r278886).
iWARP on FreeBSD is properly integrated with the network stack and the
iWARP drivers _never_ operate out of any private TCP port-space that is
invisible to the kernel. Instead, an iWARP connection shows up as a TCP
socket (which is what it is) fully visible to the kernel and standard
tools like netstat, sockstat, etc.
ian [Wed, 12 Aug 2015 20:50:20 +0000 (20:50 +0000)]
If a specific timecounter has been chosen via sysctl, and a new timecounter
with higher quality registers (presumably in a module that has just been
loaded), do not undo the user's choice by switching to the new timecounter.
Document that behavior, and also the fact that there is no way to unregister
a timecounter (and thus no way to unload a module containing one).
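The selection rule described above can be modeled in a few lines. This is a simplified sketch under assumptions, not the kernel's timecounter code: the struct, the globals and both function names are hypothetical, and details such as negative quality values are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a timecounter: a name and a quality rating. */
struct tc {
	const char *name;
	int quality;
};

static const struct tc *active;	/* currently selected timecounter */
static bool user_chose;		/* set once the user picks one explicitly */

/* Model of a sysctl-style explicit choice. */
static void
tc_select_by_user(const struct tc *tc)
{
	active = tc;
	user_chose = true;
}

/*
 * Model of registration: a higher-quality newcomer takes over only
 * while the selection is still automatic.
 */
static void
tc_register(const struct tc *tc)
{
	if (user_chose)
		return;
	if (active == NULL || tc->quality > active->quality)
		active = tc;
}
```

Since there is no unregister operation in this model either, a registered entry must stay valid for the lifetime of the program, which mirrors why a module containing a timecounter cannot be unloaded.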
dim [Wed, 12 Aug 2015 20:16:13 +0000 (20:16 +0000)]
In gcc's libcpp, stop using the INTTYPE_MAXIMUM() macro, which relies on
undefined behavior. The code used this macro to avoid problems on some
broken systems which define SSIZE_MAX incorrectly, but this is not
needed on FreeBSD, obviously.
ian [Wed, 12 Aug 2015 19:40:32 +0000 (19:40 +0000)]
Remove all dregs of the old PPS driver from this code, in preparation for
redoing it as a separate driver. Now that each hardware timer is handled by
a separate instance of the timer driver, it no longer makes sense to bundle
the pps driver with the regular timecounter code. (When all 8 timers were
handled by one driver there was no choice about this.)
Split the hardware register definitions out to their own file, so that the
new pps driver (coming in a separate commit later) can share them.
With the PPS driver gone, the question of which hardware timer to use for
what purpose becomes much easier (some instances can't do the PPS capture).
Now we can just hardcode timer2 for eventtimer and timer3 for timecounter.
This also now only instantiates devices for the 2 hardware timers actually
used to implement eventtimer and timecounter. This is required so that
other drivers can come along and attach to other hardware timers to provide
other functionality. (In addition to PPS, this hardware can also do PWM
stuff, general pulse width and frequency measurements, etc. Maybe some
day we'll have drivers for those things.)