Bruce Evans [Sun, 26 Mar 2017 10:31:48 +0000 (10:31 +0000)]
Use inline asm instead of unportable intrinsics for the SSE4 crc32
optimization.
This fixes building with gcc-4.2.1 (it doesn't support SSE4).
gas-2.17.50 [FreeBSD] supports SSE4 instructions, so this doesn't
need using .byte directives.
This fixes depending on host user headers in the kernel.
Fix user includes (don't depend on namespace pollution in <nmmintrin.h>
that is not included now).
The instrinsics had no advantages except to sometimes avoid compiler
pessimixations. clang understands them a bit better than inline asm,
and generates better looking code which also runs better for cem, but
for me it just at the same speed or slower by doing excessive
unrollowing in all the wrong places. gcc-4.2.1 also doesn't understand
what it is doing with unrolling, but with -O3 somehow it does more
unrolling that helps.
Reduce 1 of the the compiler pessimizations (copying a variable which
already satisfies an "rm" constraint in a good way by being in memory
and not used again, to different memory and accessing it there. Force
copying it to a register instead).
Try to optimize the inner loops significantly, so as to run at full
speed on smaller inputs. The algorithm is already very MD, and was
tuned for the throughput of 3 crc32 instructions per cycle found on
at least Sandybridge through Haswell. Now it is even more tuned for
this, so depends more on the compiler not rearranging or unrolling
things too much. The main inner loop for should have no difficulty
runing at full speed on these CPUs unless the compiler unrolls it too
much. However, the main inner loop wasn't even used for buffers smaller
than 24K. Now it is used for buffers larger than 384 bytes. Now it
is not so long, and the main outer loop is used more. The new
optimization is to try to arrange that the outer loop runs in parallel
with the next inner loop except for the final iteration; then reduce
the loop sizes significantly to take advantage of this.
Michal Meloun [Sun, 26 Mar 2017 08:36:56 +0000 (08:36 +0000)]
Preserve VFP state across signal delivery.
We don't have enouch space to store full VFP context within mcontext
stucture. Due to this:
- follow i386/amd64 way and store VFP state outside of the mcontext_t
but point to it. Use the size of VFP state structure as an 'magic'
indicator of the saved VFP state presence.
- teach set_mcontext() about this external storage.
- for signal delivery, store VFP state to expanded 'struct sigframe'.
Provide less laborius way to enable busdma DMAR to only short list of devices.
Kernel environment variable hw.busdma.default can take values 'bounce'
and 'dmar' and selects corresponding busdma backend as default.
Per-device environment variable hw.busdma.pci<domain>.<bus>.<slot>.<func>
takes the same values and overrides hw.busdma.default for the given device.
Note that even with hw.busdma.default=bounce, DMA translation engines
are still started if DMARs are enabled, to disable them use
hw.dmar.dma tunable, as before.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
iwn: fix return code conflict in iwn_init_locked()
Do not try to use errno(2) codes here; instead, just return unique
value (1) when radio is disabled via hardware switch and another
one (-1) for any other error in initialization path.
Sevan Janiyan [Sat, 25 Mar 2017 21:33:48 +0000 (21:33 +0000)]
ftp.microsoft.com is dead and the document was not archived, point to the full
protocol spec document instead.
Fix spelling mistake flagged by igor.
Rephrase bad sentence flagged by igor.
Andriy Gapon [Sat, 25 Mar 2017 18:45:09 +0000 (18:45 +0000)]
specific end of interrupt implementation for AMD Local APIC
The change is more intrusive than I would like because the feature
requires that a vector number is written to a special register.
Thus, now the vector number has to be provided to lapic_eoi().
It was readily available in the IO-APIC and MSI cases, but the IPI
handlers required more work.
Also, we now store the VMM IPI number in a global variable, so that it
is available to the justreturn handler for the same reason.
Mike Karels [Sat, 25 Mar 2017 15:06:28 +0000 (15:06 +0000)]
Fix reference count leak with L2 caching.
ip_forward, TCP/IPv6, and probably SCTP leaked references to L2 cache
entry because they used their own routes on the stack, not in_pcb routes.
The original model for route caching was callers that provided a route
structure to ip{,6}input() would keep the route, and this model was used
for L2 caching as well. Instead, change L2 caching to be done by default
only when using a route structure in the in_pcb; the pcb deallocation
code frees L2 as well as L3 cacches. A separate change will add route
caching to TCP/IPv6.
Another suggestion was to have the transport protocols indicate willingness
to use L2 caching, but this approach keeps the changes in the network
level
Interesting fixes which were not already merged: 0c7c611 Merge C++ demangler bug fixes from ELF Tool Chain (#40) 2b208d9 __cxa_demangle_gnu3: demangle 'z' as '...', not 'ellipsis' (#41)
Avoid leaking allocated but unused context after creation race.
As noted in the comment, nothing special needs to be done to destroy
the unneeded context after the allocation race, but the context memory
itself still should to be freed.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Do not create RMRR entries for identity-mapped domains.
It does not make sense since identity mapping already provides the
required mapping for RMRR ranges. More, since identity page tables do
not reflect content of map entries for id domains, creating RMRR
entries makes domain data inconsistent.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Adrian Chadd [Sat, 25 Mar 2017 02:49:20 +0000 (02:49 +0000)]
[iwm] Enable Energy Based Scan (EBS).
This can significantly reduce scan duration thus saving time and power.
EBS failure reported by FW disables EBS for current connection. It is
re-enabled upon new connection attempt on any WLAN interface.
net80211: fix possible panic when wlan(4) interface is destroyed.
If this is the last running vap wait until device will be powered off
(fixes panic when 'ifconfig wlan0 destroy' is executed for running iwn(4)
interface).
Tested with:
- Intel 6205, STA mode.
- RTL8188EU, STA / IBSS modes.
- RTL8821AU, STA / HOSTAP modes.
Bruce Evans [Fri, 24 Mar 2017 17:34:55 +0000 (17:34 +0000)]
Remove buggy adjustment of page tables in db_write_bytes().
Long ago, perhaps only on i386, kernel text was mapped read-only and
it was necessary to change the mapping to read-write to set breakpoints
in kernel text. Other writes by ddb to kernel text were also allowed.
This write protection is harder to implement with 4MB pages, and was
lost even for 4K pages when 4MB pages were implemented. So changing
the mapping became useless. It was actually worse than useless since
it followed followed various null and otherwise garbage pointers to
not change random memory instead of the mapping. (On i386s, the
pointers became good in pmap_bootstrap(), and on amd64 the pointers
became bad in pmap_bootstrap() if not before.)
Another bug broke detection of following of null pointers on i386,
except early in boot where not detecting this was a feature. When
I fixed the bug, I accidentally broke the feature and soon got traps
in db_write_bytes(). Setting breakpoints early in ddb was broken.
kib pointed out that a clean way to do the adjustment would be to use
a special [sub]map giving a small window on the bytes to be written.
The trap handler didn't know how to fix up errors for pagefaults
accessing the map itself. Such errors rarely need fixups, since most
traps for the map are for the first access which is a read.
Gleb Smirnoff [Fri, 24 Mar 2017 16:01:19 +0000 (16:01 +0000)]
Make sendfile(2) more robust against file change. This fixes a possible
crash when the file shrinks. This also fixes sendfile(2) not sending more
data in a case when the file grows, and the request is open-ended or
specifies a size that is greater than old file size.
PR: 217789
Reviewed by: gallatin
MFC after: 10 days
Michal Meloun [Fri, 24 Mar 2017 11:46:49 +0000 (11:46 +0000)]
Cleanup structures related to VFP and/or mcontext_t.
- in mcontext_t, rename newer used 'union __vfp' to equaly sized 'mc_spare'.
Space allocated by 'union __vfp' is too small and cannot hold full
VFP context.
- move structures defined in fp.h to more appropriate headers.
- remove all unused VFP structures.
This change introduces a new weighting algorithm to improve metaslab selection.
The new weighting algorithm relies on the SPACEMAP_HISTOGRAM feature. As a result,
the metaslab weight now encodes the type of weighting algorithm used
(size-based vs segment-based).
This also introduce a new allocation tracing facility and two new dcmds to help
debug allocation problems. Each zio now contains a zio_alloc_list_t structure
that is populated as the zio goes through the allocations stage. Here's an
example of how to use the tracing facility:
If the metaslab is using segment-based weighting then the WEIGHT column will
display the number of segments available in the bucket where the allocation
attempt was made.
Author: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Chris Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ed Schouten [Fri, 24 Mar 2017 07:09:33 +0000 (07:09 +0000)]
Include <sys/systm.h> to obtain the memcpy() prototype.
I got a report of this source file not building on Raspberry Pi. It's
interesting that this only fails for that target and not for others.
Again, that's no reason not to include the right headers.
Kevin Lo [Fri, 24 Mar 2017 01:23:07 +0000 (01:23 +0000)]
Don't initialize if_output to ether_output(), ether_ifattach() does it for
us already. While here, remove NOTYET code since if_watchdog is no longer
used.
Martin Matuska [Fri, 24 Mar 2017 00:02:12 +0000 (00:02 +0000)]
MFV r315875:
Sync libarchive with vendor.
Vendor changes (FreeBSD-related):
- store extended attributes with extattr_set_link() if no fd is provided
- add extended attribute tests to libarchive and bsdtar
- fix tar's test_option_acls
- support the UF_HIDDEN file flag
Vendor changes (FreeBSD-related):
- store extended attributes with extattr_set_link() if no fd is provided
- add extended attribute tests to libarchive and bsdtar
- support the UF_HIDDEN file flag
Landon J. Fuller [Thu, 23 Mar 2017 19:29:12 +0000 (19:29 +0000)]
[mips/broadcom]: Early boot NVRAM support
Add support for early boot access to NVRAM variables, using a new
bhnd_nvram_data_getvar_direct() API to support zero-allocation direct
reading of NVRAM variables from a bhnd_nvram_io instance backed by the
CFE NVRAM device.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9913
Robert Watson [Thu, 23 Mar 2017 14:35:21 +0000 (14:35 +0000)]
In libcasper, prefer to send a function index or service name over the IPC
channel to a zygote process, rather than sending a function pointer or
service pointer. This avoids transfering pointers between address spaces,
which while robust in this case (due to the zygote being forked() from the
parent) is not generally a good idea, especially in the presence of
increasingly popular control-flow integrity and pointer protection
mitigation schemes. With this change, ping(8) and other sandboxed tools
using libcasper for DNS resolution now work on architectures with tagged
memory again.
Ed Schouten [Thu, 23 Mar 2017 14:09:45 +0000 (14:09 +0000)]
Don't require the presence of the compat_3_brand.
The existing ELF image activator requires the brandinfo to provide such
a string unconditionally, even if the executable format in question
doesn't use this type of branding. Skip matching when it's a null
pointer.
Andriy Gapon [Thu, 23 Mar 2017 11:59:17 +0000 (11:59 +0000)]
aacraid: rework r315083 for a clean build with and without AACRAID_DEBUG
r315083 essentially reverted r263954 which was made for a good reason,
but didn't take into account AACRAID_DEBUG.
Now both types of build should be clean.
Alexander Motin [Thu, 23 Mar 2017 10:50:45 +0000 (10:50 +0000)]
Remove "UNMAPPED" messages printed on da periph attach.
I think this message is not very useful for end user. Also its formatting
does not match other messages printed at that time. Those who really need
this information can always find it in `camcontrol negotiate daX -v`.
Andriy Gapon [Thu, 23 Mar 2017 08:59:17 +0000 (08:59 +0000)]
zfs: add zio_buf_alloc_nowait and use it in vdev_queue_aggregate
This way we can avoid blocking the whole queue in the low memory
situations. It's better to sacrifice some I/O performance by not doing
the aggregation than to add an indefinite wait for more memory.
Andriy Gapon [Thu, 23 Mar 2017 08:57:04 +0000 (08:57 +0000)]
move thread switch tracing from mi_switch to sched_switch
This is done so that the thread state changes during the switch
are not confused with the thread state changes reported when the thread
spins on a lock.
Here is an example, three consecutive entries for the same thread (from top to
bottom):
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"sleep", attributes: prio:84, wmesg:"-", lockname:"(null)"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"spinning", attributes: lockname:"sched lock 1"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"running", attributes: none
The above trace could leave an impression that the final state of
the thread was "running".
After this change the sleep state will be reported after the "spinning"
and "running" states reported for the sched lock.
The original author abused Nd (one-line description, used by makewhatis)
for its side effect of producing an en-dash. This broke whatis with
newer versions of mdocml. Use \(en instead.
Michal Meloun [Thu, 23 Mar 2017 08:16:53 +0000 (08:16 +0000)]
Restore original (pre r315760) naming for Tegra SDHCI device.
Newbus handles multiple equally named device classes without problems,
so there is no reason to use slightly cryptic "<foo>_shdci" for them.
In contrast, the driver module name must be unique, so "<foo>_shdci"
is the right name for it.
Enji Cooper [Thu, 23 Mar 2017 07:36:38 +0000 (07:36 +0000)]
Try polishing up iflib manpages a bit (basically all the low hanging fruit)
igor:
- Fix typos.
- Delete trailing whitespace.
manlint:
- Use .Fo/.Fc/.Fa when describing functions.
- Use .Xr.
- Fill in SEE ALSO section.
- Fix .Dt use: the section was specified incorrectly and the name
had a lowercase character.
- Continue new sentences on new lines.
Miscellaneous:
- Remove unnecessary quotes around "SEE ALSO" section headers.
- Sprinkle .Dv use in spots with constants.
Reported by: igor, make manlint
Sponsored by: Dell EMC Isilon
Enji Cooper [Thu, 23 Mar 2017 05:26:44 +0000 (05:26 +0000)]
intro(3): fix markup
- Use `Em` with `.It` macro when referring to other libraries, instead of
`Xr`.
- Use `.Em` instead of `.Xr` when referring to libraries.
- Remove commented out lines.
Adrian Chadd [Thu, 23 Mar 2017 04:43:04 +0000 (04:43 +0000)]
[iwm] Remove a couple of unneeded IWM_UCODE_TLV_FLAGS_* flags.
* All the supported firmwares have these flags set.
* This removes the following flags:
IWM_UCODE_TLV_FLAGS_PM_CMD_SUPPORT,
IWM_UCODE_TLV_FLAGS_NEWBT_COEX,
IWM_UCODE_TLV_FLAGS_BF_UPDATED,
IWM_UCODE_TLV_FLAGS_D3_CONTINUITY_API,
IWM_UCODE_TLV_FLAGS_STA_KEY_CMD,
IWM_UCODE_TLV_FLAGS_DEVICE_PS_CMD,
IWM_UCODE_TLV_FLAGS_SCHED_SCAN,
IWM_UCODE_TLV_FLAGS_RX_ENERGY_API,
IWM_UCODE_TLV_FLAGS_TIME_EVENT_API_V2
* Also remove definitions and code for dealing with the v1 time-event api.