Fix a panic during boot caused by inadequate locking of some vt(4) driver
data structures.
vt_change_font() calls vtbuf_grow() to change some vt driver data
structures. It uses TF_MUTE to prevent the console from trying to use those
data structures while it changes them.
During the early stage of the boot process, the vt driver's tc_done routine
uses those data structures; however, it is currently called outside the
TF_MUTE check.
Move the tc_done routine inside the locked TF_MUTE check.
John Baldwin [Thu, 23 Feb 2017 00:02:49 +0000 (00:02 +0000)]
Fully handle the special encoding of GOT[1] on mips64.
The MIPS ABI does not require the second GOT entry to be reserved for use
by the runtime linker as on other architectures. Instead, static linkers
use a special value in the second GOT entry to indicate if the entry is
reserved. This value is supposed to consist of an address with the MSB
set and the rest of the bits all zero which is an invalid user address.
However, the old binutils currently in the tree uses the 32-bit mask value
(2^31) on 64-bit MIPS instead of 2^63. This was fixed in upstream
binutils in 2008 to use 2^63 on 64-bit MIPS.
The first part of this change changes the runtime check in init_pltgot()
to check for both values (2^31 and 2^63) when deciding whether to store
the current object pointer in GOT[1] which fixes dynamic N64 binaries
compiled with modern binutils.
However, the initial version of this fix exposed another related bug in
that _rtld_relocate_nonplt_self() was only checking for the new value
(2^63) in GOT[1] and incorrectly treated GOT[1] as a local GOT entry
(and did not relocate the final local GOT entry). To handle this, fix
all of the places that check for GOT[1]'s status to use the same macro
that checks for both values on N64.
Toomas Soome [Wed, 22 Feb 2017 22:00:50 +0000 (22:00 +0000)]
loader: update symlink support in zfs reader
As the current zfs file system is providing symlink via system attributes, need
to update the code accordingly.
Note, as the zfsboot code does not free the memory at this time, the
object list will put some stress on the boot2 heap, eventually we should
address the issue.
Kurt Lidl [Wed, 22 Feb 2017 21:50:37 +0000 (21:50 +0000)]
Improve ipfw rule creation for blacklist-helper script
When blocking an address, the blacklist-helper script
needs to do the following things for the ipfw packet
filter:
- create a table to hold the addresses to be blocked,
so lookups can be done quickly, and place the address
to be blocked in that table
- create rule that does the lookup in the table and
blocks the packet
The ipfw system allows multiple rules to be inserted for
a given rule number. There only needs to be one rule
to do the lookup per port. Modify the script to probe
for the existence of the rule before attempting to create
it, so only one rule is inserted, rather than one rule per
blocked address.
PR: 214980
Reported by: azhegalov (at) gmail.com
Reviewed by: emaste
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D9681
When allocating unmapped pages, take advantage of the direct map on
AMD64 to get the virtual address corresponding to a page. Else all
pages allocated must be mapped because sometimes the virtual address
of a page is requested.
Move all page allocation and deallocation code into an own C-file.
Add support for GFP_DMA32, GFP_KERNEL, GFP_ATOMIC and __GFP_ZERO
allocation flags.
Make a clear separation between mapped and unmapped allocations.
The i915kms driver in Linux 4.9 reimplement parts of the scatter list
functions with regards to performance. In other words there is not so
much room for changing structure layouts and functionality if the
i915kms should be built AS-IS. This patch aligns the scatter list
support to what is expected by the i915kms driver. Remove some
comments not needed while at it.
Dimitry Andric [Wed, 22 Feb 2017 18:44:57 +0000 (18:44 +0000)]
Surround any unmangled C++ names in libcxxrt's version map with 'extern
"C++"', otherwise ld refuses to make the symbols global in the final
library. This causes the __int128-related symbols to go missing when
the library is stripped during installation.
Andriy Gapon [Wed, 22 Feb 2017 17:20:18 +0000 (17:20 +0000)]
don't use C99 static array indices with older GCC versions
For example, the FreeBSD GCC (4.2.1) has a spotty support for that
feature. If the static keyword is used with an unnamed array parameter
in a function declaration, then the compilation fails with:
error: static or type qualifiers in abstract declarator
The feature does work if the parameter is named.
So, the restriction introduced in this commit can be removed when all
affected function prototypes have the workaround.
The actual issue was the fact that if - was used then some restriction were
already set to stdin when we were applying caph_limit_stdio which was failing
due to the fact the fd was the fd was already restricted to lower rights.
Restricting stdio before actually opening the files prevent trying to raise the
right and fixes the issue.
And this allows to keep failing the program if restriction failed
Marius Strobl [Wed, 22 Feb 2017 10:21:39 +0000 (10:21 +0000)]
- Allow different slicers for different flash types to be registered
with geom_flashmap(4) and teach it about MMC for slicing enhanced
user data area partitions. The FDT slicer still is the default for
CFI, NAND and SPI flash on FDT-enabled platforms.
- In addition to a device_t, also pass the name of the GEOM provider
in question to the slicers as a single device may provide more than
provider.
- Build a geom_flashmap.ko.
- Use MODULE_VERSION() so other modules can depend on geom_flashmap(4).
- Remove redundant/superfluous GEOM routines that either do nothing
or provide/just call default GEOM (slice) functionality.
- Trim/adjust includes
Roger Pau Monné [Wed, 22 Feb 2017 09:22:17 +0000 (09:22 +0000)]
xen/timer: mark the Xen PV timer as not safe for suspension
Note that the timer itself fully supports suspension, but due to the lack of
ordering during the resume process FreeBSD cannot guarantee that the timer is
resumed before any device attempts to use it.
Alexander Motin [Wed, 22 Feb 2017 06:43:49 +0000 (06:43 +0000)]
Fix multiple problems around LUN disable under load.
- Move private data about ATIOs/INOTs from per-LUN to per-channel data.
This allows active commands to continue operation after LUN destruction.
This also simplifies lookup of the data by tag in some situations.
- Unify three restart_queue processing implementations.
- Complete all ATIOs from restart_queue on LUN disable.
- Delete ATIO private data when command completed or aborted, not depending
on the ATIO being requeued, that was ugly hack and could never happen. CAM
should always call ether XPT_CONT_TARGET_IO with status or XPT_ABORT.
- Implement XPT_ABORT for queued ATIOs/INOTs to allow CAM do graceful
shutdown, not depending on LUN disable, as it is done in ahd(4)/targ(4).
- Unify isp_endcmd() arguments to make it more usable in generic code.
- Remove never really used LUN state reference counter.
Ian Lepore [Wed, 22 Feb 2017 03:49:46 +0000 (03:49 +0000)]
Revert to this driver's historic behavior: assume an sd card is writable
if the fdt data doesn't provide a gpio pin for reading the write protect
switch and also doesn't contain a "wp-disable" property.
In r311735 the long-bitrotted code in this driver for using the non-
standard fdt "mmchs-wp-gpio-pin" property was replaced with new common
support code for handling write-protect and card-detect gpio pins. The
old code never found a property with that name, and the logic was to
assume that no gpio pin meant that the card was not write protected.
The new common code behaves differently. If there is no fdt data saying
what to do about sensing write protect, the value in the standard SDHCI
PRESENT_STATE register is used. On this hardware, if there is no signal
for write protect muxed into the sd controller then that bit in the
register indicates write protect.
The real problem here is the fdt data, which should contain "wp-disable"
properties for eMMC and micro-sd slots where write protect is not even
an option in the hardware, but we are not in control of that data, it
comes from linux. So we have to make the same flawed assumption in our
driver that the corresponding linux driver has: no info means no protect.
Reported by: several users on the arm@ list
Pointy hat: me, for not testing enough before committing r311735
Andriy Gapon [Tue, 21 Feb 2017 21:09:21 +0000 (21:09 +0000)]
zfs: lower priority of zio_write_issue threads by four
The difference of one was insignificant because zio_write_issue threads
ended up on the same run queues as other zio threads.
See sys/priority.h and sys/runq.h for more details.
Add a comment describing FreeBSD priority considerations and restore
the illumos variant of the code for comparison.
Ed Maste [Tue, 21 Feb 2017 18:59:17 +0000 (18:59 +0000)]
Exclude -flto when building *genassym.o
The build process generates *assym.h using nm from *genassym.o (which is
in turn created from *genassym.c).
When compiling with link-time optimization (LTO) using -flto, .o files
are LLVM bitcode, not ELF objects. This is not usable by genassym.sh,
so remove -flto from those ${CC} invocations.
Submitted by: George Rimar
Reviewed by: dim
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D9659
Replace dummy implementation of RCU in the LinuxKPI with one based on
the in-kernel concurrency kit's ck_epoch API. Factor RCU hlist_xxx()
functions into own rculist.h header file.
Andriy Gapon [Tue, 21 Feb 2017 17:47:08 +0000 (17:47 +0000)]
reimplement zfsctl (.zfs) support
The current code is written on top of GFS, a library with the generic
support for writing filesystems, which was ported from illumos.
Because of significant differences between illumos VFS and FreeBSD
VFS models, both the GFS and zfsctl code were heavily modified to
work on FreeBSD. Nonetheless, they still contain quite a few ugly
hacks and bugs.
This is a reimplementation of the zfsctl code where the VFS-specific
bits are written from scratch and only the code that interacts with
the rest of ZFS is reused.
Some highlights.
We use two types of nodes, static and on-demand. The static nodes
are used for permanent directories like .zfs, .zfs/snapshot, etc. The
on-demand nodes are used for ephemeral directories that act as snapshot
mount points.
Initially only static nodes are created. Their vnodes are instantiated
when they are looked up. The on-demand nodes and vnodes are instantiated
as needed and the nodes are destroyed as soon as the corresponding
vnodes are reclaimed.
We also try very hard to ensure that uncovered snapshot vnodes do not
linger. They are supposed to become inactive as soon as they are
uncovered and we try to recycle them immediately.
When a filesystem is unmounted all snapshots under .zfs are unmounted
first, then all vnodes are flushed and finally the static .zfs nodes
are destroyed.
There are some changes outside of zfsctl code too.
z_ctldir is never used directly (as it is an opaque pointer),
zfsctl_root() has to be used instead. The function returns a locked
vnode now, so it accepts a lock flags parameter. The function can
also fail now, e.g. during force unmounting, whereas previously it
was infallible.
zfsctl_root_lookup() is retired, instead of it VOP_LOOKUP() on the .zfs
vnode (obtained with zfsctl_root) is used.
Some ideas are picked from an independent work by will.
Get rid of foo_sys() in linuxulator code. It was commented out, and it
would be useless anyway - there is no point in pretending to have block
devices; our "block" devices are in fact character ones, and can only
be accessed as such.
Alexander Motin [Tue, 21 Feb 2017 14:31:58 +0000 (14:31 +0000)]
Remove duplicate INOT allocation.
For some reason isp_handle_platform_notify_fc() allocated INOT just
before calling isp_handle_platform_target_tmf(), which also allocates
INOT. It seems to be a braino introduced in r196008.
1) Add better spinlock debug names when WITNESS_ALL is defined.
2) Make sure that the calling thread gets bound to the current CPU
while a spinlock is locked. Some Linux kernel code depends on that the
CPU ID doesn't change while a spinlock is locked.
3) Add support for using LinuxKPI spinlocks during a panic().
Tasklets are implemented using a taskqueue and a small statemachine on
top. The additional statemachine is required to ensure all LinuxKPI
tasklets get serialized. FreeBSD taskqueues do not guarantee
serialisation of its tasks, except when there is only one worker
thread configured.
Make the LinuxKPI task struct persistent accross system calls.
A set of helper functions have been added to manage the life of the
LinuxKPI task struct. When an external system call or task is invoked,
a check is made to create the task struct by demand. A thread
destructor callback is registered to free the task struct when a
thread exits to avoid memory leaks.
This change lays the ground for emulating the Linux kernel more
closely which is a dependency by the code using the LinuxKPI APIs.
Add new dedicated td_lkpi_task field has been added to struct thread
instead of abusing td_retval[1].
Fix some header file inclusions to make LINT kernel build properly
after this change.
Bump the __FreeBSD_version to force a rebuild of all kernel modules.
Bartek Rutkowski [Tue, 21 Feb 2017 09:37:33 +0000 (09:37 +0000)]
Enable bsdinstall hardening options by default.
As discussed previously, in order to introduce new OS hardening
defaults, we've added them to bsdinstall in 'off by default' mode.
It has been there for a while, so the next step is to change them
to 'on by defaul' mode, so that in future we could simply enable
them in base OS.
Reviewed by: brd
Approved by: adrian
Differential Revision: https://reviews.freebsd.org/D9641
Alexander Motin [Tue, 21 Feb 2017 06:10:11 +0000 (06:10 +0000)]
Do not blindly free completed ATIOs/INOTs on invalidation.
When LUN is disabled, SIM starts returning queued ATIOs/INOTs. But at the
same time there can be some ATIOs/INOTs still carrying real new requests.
If we free those, SIM may leak some resources, forever expecting for any
response from us. So try to be careful, separating ATIOs/INOTs carrying
requests which still must be processed, from ATIOs/INOTs completed with
errors which can be freed.
John Baldwin [Mon, 20 Feb 2017 20:37:25 +0000 (20:37 +0000)]
Consolidate statements to initialize files.
Previously, the first lines of various generated files from system call
tables were generated in two sections. Some of the initialization was
done in BEGIN, and the rest was done when the first line was encountered.
The main reason for this split before r313564 was that most of the
initialization done in the second section depended on the $FreeBSD$ tag
extracted from the system call table. Now that the $FreeBSD$ tag is no
longer used, consolidate all of the file initialization in the BEGIN
section.
This change was tested by confirming that the content of generated files
did not change.
Eric Badger [Mon, 20 Feb 2017 15:53:16 +0000 (15:53 +0000)]
Defer ptracestop() signals that cannot be delivered immediately
When a thread is stopped in ptracestop(), the ptrace(2) user may request
a signal be delivered upon resumption of the thread. Heretofore, those signals
were discarded unless ptracestop()'s caller was issignal(). Fix this by
modifying ptracestop() to queue up signals requested by the ptrace user that
will be delivered when possible. Take special care when the signal is SIGKILL
(usually generated from a PT_KILL request); no new stop events should be
triggered after a PT_KILL.
Add a number of tests for the new functionality. Several tests were authored
by jhb.
Adrian Chadd [Mon, 20 Feb 2017 02:08:08 +0000 (02:08 +0000)]
[net80211] RX parameter shuffle in net80211 in preparation for 4x4 NICs and 160MHz channels.
* Migrate the rx_params stuff out from ieee80211_freebsd.h where it doesn't belong -
this isn't freebsd specific anymore.
* Don't use a hard-coded number of chains in the ioctl header; now we can shuffle
MAX_CHAINS around so it can be used in the right spot.
* Extend the signal/noisefloor levels in the mimo stats struct to userland to include
the signal and noisefloor levels for each 20MHz slice of a 160MHz channel.
* Bump the number of EVM pilots in preparation for 4x4 and 160MHz channels.
Tested:
* ath(4), STA mode
* iwn(4), STA mode
* local ath10k port, STA mode
TODO:
* 11ax chips will come with 5GHz 8x8 hardware for lots of MU-MIMO - I'll re-bump it
at that point.
Note:
* This breaks the driver and ifconfig ABI; please recompile the kernel,
ifconfig and wpa_supplicant/hostapd.
When using libfetch in an application that drops privileges when fetching
like pkg(8) then user complain because the application does not read anymore
${HOME}/.netrc. Now a caller can prepare a fd to the said file and manually
assign it to the structure.
It is also a first step to allow to capsicumize libfetch applications
Reviewed by: allanjude, des
Approved by: des
Differential Revision: https://reviews.freebsd.org/D9678
Enji Cooper [Sun, 19 Feb 2017 22:00:11 +0000 (22:00 +0000)]
A forced commit to note other portion of the Makefile change accidentally
committed in r313972
The code committed in r313962 implicitly relies on python 2.x to generate
testvect.h . There are a handful of issues with this approach:
- python is not an explicit build dependency for FreeBSD
- python 2.x is deprecated and will be removed sometime in the future
(potentially before 11.x's EOL), and the script does not function with
python 3.5 (it uses deprecated idioms and incompatible function calls).
- python(1) (by default) lives in /usr/local/bin (${LOCALBASE}/bin) and
gentestvect.py is a dependency of testvect.h (prior to r313972) which
means that if the mtime of the generator script was newer than the
mtime of the test vector, it could cause a spurious build failure in
build time or at install time.
Right now the noexec mount option disallows image activators to try
execve the files on the mount point. Also, after r127187, noexec
also limits max_prot map entries permissions for mappings of files
from such mounts, but not the actual mapping permissions.
As result, the API behaviour is inconsistent. The files from noexec
mount can be mapped with PROT_EXEC, but if mprotect(2) drops execution
permission, it cannot be re-enabled later. Make this consistent
logically and aligned with behaviour of other systems, by disallowing
PROT_EXEC for mmap(2).
Note that this change only ensures aligned results from mmap(2) and
mprotect(2), it does not prevent actual code execution from files
coming from noexec mount. Such files can always be read into
anonymous executable memory and executed from there.
Reported by: shamaz.mazum@gmail.com
PR: 217062
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Allan Jude [Sun, 19 Feb 2017 19:30:31 +0000 (19:30 +0000)]
improve PBKDF2 performance
The PBKDF2 in sys/geom/eli/pkcs5v2.c is around half the speed it could be
GELI's PBKDF2 uses a simple benchmark to determine a number of iterations
that will takes approximately 2 seconds. The security provided is actually
half what is expected, because an attacker could use the optimized
algorithm to brute force the key in half the expected time.
With this change, all newly generated GELI keys will be approximately 2x
as strong. Previously generated keys will talk half as long to calculate,
resulting in faster mounting of encrypted volumes. Users may choose to
rekey, to generate a new key with the larger default number of iterations
using the geli(8) setkey command.
Security of existing data is not compromised, as ~1 second per brute force
attempt is still a very high threshold.