ian [Tue, 4 Dec 2018 16:43:50 +0000 (16:43 +0000)]
Fix args cross-threading between gptboot(8) and loader(8) with zfs support.
When loader(8) is built with zfs support enabled, it assumes that any extarg
data present is a zfs_boot_args struct, but if the first-stage loader was
gptboot(8) the extarg data is actually a geli_boot_args struct. Luckily,
zfsboot(8) and gptzfsboot(8) have always passed KARGS_FLAGS_ZFS along with
KARGS_FLAGS_EXTARG, so we can use KARGS_FLAGS_ZFS to decide whether the
extarg data is a zfs_boot_args struct.
To avoid similar problems in the future, gptboot(8) now passes a new
KARGS_FLAGS_GELI to indicate that extarg data is geli_boot_args. In
loader(8), if the neither KARGS_FLAGS_ZFS nor KARGS_FLAGS_GELI is set but
extarg data is present (which will be the case for gptboot compiled before
this change), we now check for the known size of the geli_boot_args struct
passed by the older versions of gptboot as a way of confirming what type of
extarg data is present.
In a semi-related tidying up, since loader's main() has already decided
what type of extarg data is present and set the global 'zargs' var
accordingly, don't repeat the check in extract_currdev, just check whether
zargs is NULL or not.
X-MFC after: a few days, along with prior related changes.
Add ability to request listing and deleting only for dynamic states.
This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but
after rules reloading some state must be deleted. Added new flag '-D'
for such purpose.
Retire '-e' flag, since there can not be expired states in the meaning
that this flag historically had.
Also add "verbose" mode for listing of dynamic states, it can be enabled
with '-v' flag and adds additional information to states list. This can
be useful for debugging.
Reimplement how net.inet.ip.fw.dyn_keep_states works.
Turning on of this feature allows to keep dynamic states when parent
rule is deleted. But it works only when the default rule is
"allow from any to any".
Now when rule with dynamic opcode is going to be deleted, and
net.inet.ip.fw.dyn_keep_states is enabled, existing states will reference
named objects corresponding to this rule, and also reference the rule.
And when ipfw_dyn_lookup_state() will find state for deleted parent rule,
it will return the pointer to the deleted rule, that is still valid.
This implementation doesn't support O_LIMIT_PARENT rules.
The refcnt field was added to struct ip_fw to keep reference, also
next pointer added to be able iterate rules and not damage the content
when deleted rules are chained.
Named objects are referenced only when states are going to be deleted to
be able reuse kidx of named objects when new parent rules will be
installed.
ipfw_dyn_get_count() function was modified and now it also looks into
dynamic states and constructs maps of existing named objects. This is
needed to correctly export orphaned states into userland.
ipfw_free_rule() was changed to be global, since now dynamic state can
free rule, when it is expired and references counters becomes 1.
External actions subsystem also modified, since external actions can be
deregisterd and instances can be destroyed. In these cases deleted rules,
that are referenced by orphaned states, must be modified to prevent access
to freed memory. ipfw_dyn_reset_eaction(), ipfw_reset_eaction_instance()
functions added for these purposes.
As part of the general cleanup of the ipfilter code, special cases
are committed separately to document fixing them separately from
the general cleanup. In this case we don't want to hide the utter
brokenness of what is being fixed.
Clean up a discombobulated block of #if's, with one block unreachable.
ip_fil.c is used in ipftest which is used to dry-run test ipfilter
rules in userspace without loading them in the kernel. The call to
(*ifp->if_output) matches that in the FreeBSD kernel.
Further testing and work will be required to make ipftest fully
functional.
jhibbits [Tue, 4 Dec 2018 04:55:49 +0000 (04:55 +0000)]
Sprinkle EARLY_DRIVER_MODULE around the tree
Mark some buses as BUS_PASS_BUS, and some resources as BUS_PASS_RESOURCE.
This also decouples some resource attachment orderings from being races by
device tree ordering, instead relying on the bus pass to provide the
ordering.
This was originally intended to support multipass suspend/resume, but it's
also needed on PowerMacs when using fdt, as the device tree seems to get
created in reverse of the OFW tree.
Reviewed by: nwhitehorn (long ago)
Differential Revision: https://reviews.freebsd.org/D918
jhibbits [Tue, 4 Dec 2018 03:51:10 +0000 (03:51 +0000)]
powerpc: preload_addr_relocate is no longer necessary for booke
The same behavior was moved to machdep.c, paired with AIM's relocation,
making this redundant. With this, it's now possible to boot FreeBSD with
ubldr on a uboot Book-E platform, even with a
KERNBASE != VM_MIN_KERNEL_ADDRESS.
sbruno [Tue, 4 Dec 2018 03:23:14 +0000 (03:23 +0000)]
Revert r340997 at the request of multiple users.
- breaks ports-mgmt/pkg build for mips64, powerpc64 and i386 for some users.
--- pkg-static ---
/usr/lib/liblzma.a(stream_encoder_mt.o): In function `mythread_cond_init':
/usr/local/poudriere/jails/ppc64/usr/src/contrib/xz/src/common/mythread.h:230:
undefined reference to `pthread_condattr_init'
brooks [Tue, 4 Dec 2018 00:15:47 +0000 (00:15 +0000)]
Remove a needlessly clever hack to start init with sys_exec().
Construct a struct image_args with the help of new exec_args_*() helper
functions and call kern_execve().
The previous code mapped a page in userspace, copied arguments out
to it one at a time, and then constructed a struct execve_args all so
that sys_execve() can call exec_copyin_args() to copy the data back in
to a struct image_args.
Opencode the part of pre_execve()/post_execve() that releases a
reference to the initial vmspace. We don't need to stop threads like
they do.
kib [Mon, 3 Dec 2018 23:39:45 +0000 (23:39 +0000)]
Improve procstat reporting for the linux cdev file descriptors.
If there is a vnode attached to the linux file, use it to fill
kinfo_file. Otherwise, report a new KF_TYPE_DEV file type, without
supplying any type-specific information.
KF_TYPE_DEV is supposed to be used by most devfs-specific file types.
markj [Mon, 3 Dec 2018 20:54:17 +0000 (20:54 +0000)]
Plug memory disclosures via ptrace(2).
On some architectures, the structures returned by PT_GET*REGS were not
fully populated and could contain uninitialized stack memory. The same
issue existed with the register files in procfs.
Reported by: Thomas Barabosch, Fraunhofer FKIE
Reviewed by: kib
MFC after: 3 days
Security: kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18421
kib [Mon, 3 Dec 2018 20:03:43 +0000 (20:03 +0000)]
Some fixes for LD_BIND_NOW + ifuncs.
- Do not perform ifunc relocations together with other PLT relocations
in PLT. Instead, do it during an additional pass over the init
list, so that ifuncs are resolved in the order of dso
dependencies. This allows the ifuncs resolvers to call into depended
libs. Init list now includes all objects instead of only objects
with init/fini callables.
- Disable relro protection around bind_now ifunc relocations.
I considered calling ifunc resolvers of dso after initializers of all
dependencies are processed, and decided that this is wrong/should not
be supported. The order now is normal relocations for all
objects->ifunc resolution in init order->initializers, where each step
does complete pass over all loaded objects before moving to the next
step.
Reported, tested and reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18400
emaste [Mon, 3 Dec 2018 19:16:34 +0000 (19:16 +0000)]
stand/i386: rename .s to .S to use Clang IAS
As part of the migration away from obsolete binutils we want to retire
GNU as. Most assembly files used on amd64 have a .S extension and
(via rules in share/mk/bsd.suffixes.mk) are assembled with Clang's
Integrated Assembler (IAS). Rename files in stand/i386 to .S to use
the integrated assembler.
Clang's IAS supports the defsym option (via -Wa,) but only with one
dash, not two. As both -defsym and --defsym are accepted by GNU as,
use the former.
PR: 233611
Reviewed by: tsoome
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18369
imp [Mon, 3 Dec 2018 17:51:10 +0000 (17:51 +0000)]
Move inclusion of src.opts.mk later.
src.opts.mk includes bsd.own.mk. This in turn defines CTFCONVERT_CMD
depending on the MK_CTF value. We then set MK_CTF to no, which has no
real effect. The solution is to set all the MK_foo values before
including src.opts.mk.
This should stop the cdboot binary from exploding in size for releases
built WITH_CTF=yes in src.conf.
jhibbits [Mon, 3 Dec 2018 04:56:06 +0000 (04:56 +0000)]
powerpc: Check for a fdt in the metadata if it doesn't already exist
It's possible the fdt pointer was passed in via the metadata, as is done in
ubldr. Check for the fdt here, instead of working with a NULL fdt, and
panicking.
jhibbits [Mon, 3 Dec 2018 04:47:28 +0000 (04:47 +0000)]
powerpc/booke: Check for the metadata address by physical address
The metadata pointer will almost never be at or above 'btext', as btext is a
relocated symbol, so will be based at VM_MIN_KERNEL_ADDRESS, not at
KERNBASE. Check the address against kernload, where the kernel is
physically loaded.
ian [Mon, 3 Dec 2018 03:58:30 +0000 (03:58 +0000)]
Eliminate duplicated code and struct member definitions in the handoff
of args data between gptboot/zfsboot and loader(8).
Despite what seems like a lot of changes here, there are no actual
changes in behavior, or in the data layout in the structures involved.
This is just eliminating identical code pasted into multiple locations.
In detail, the changes are...
- Move struct zfs_boot_args definition from libsa/zfs/libzfs.h to
i386/common/bootargs.h because it is specific to x86 booting and the
handoff between zfsboot and loader, and has no relation to the zfs
library code in general.
- The geli_boot_args and zfs_boot_args structs both contain an identical
set of member variables containing geli information. Extract this out
to a new geli_boot_data struct, and embed it in the arg-passing structs.
- Provide new routines geli_import_boot_data() and geli_export_boot_data()
that can be shared between gptboot, zfsboot, and loader instead of
pasting identical code into several different .c files.
- Remove some checks for a NULL pointer that can never be true because the
pointer being tested was set using pointer math (kargs + 1) and that can
never result in NULL in this code.
imp [Sun, 2 Dec 2018 23:13:35 +0000 (23:13 +0000)]
Delete the undocumented alias 'wds'.
This was a typo for wdc. Eliminate it since it was in error. People
should use either 'wdc' or 'hgst' for the vendor from now on. 'hgst'
works for all versions this functionality is present for.
imp [Sun, 2 Dec 2018 23:13:24 +0000 (23:13 +0000)]
Move Intel specific log pages to intel.c
Move the Intel specific log pages (including the one that samsung
implements) to intel.c. Add comment to the samsung vendor that it will
be going away soon.
imp [Sun, 2 Dec 2018 23:13:12 +0000 (23:13 +0000)]
Usage cleanup pt 2
Eliminage redundant spaces and nvmecontrol at start of all the usage
strings. Update the usage printing code to add them back when
presenting to the user. Allow multi-line usage messages and print
proper leading spaces for lines starting with a space.
imp [Sun, 2 Dec 2018 23:12:58 +0000 (23:12 +0000)]
Usage cleanup pt 1
Provide a usage() function that takes a struct nvme_function pointer
and produces a usage mssage. Eliminate all now-redundant usage
functions. Propigate the new argument through the program as needed.
Use common routine to print usage.
imp [Sun, 2 Dec 2018 23:12:16 +0000 (23:12 +0000)]
Make logpage functions a linker set.
Move logpage function def to header. Convert all the logpage_function
elements to elements of the linker set. Leave them all in logpage.c
for the moment.
imp [Sun, 2 Dec 2018 23:10:55 +0000 (23:10 +0000)]
Move nvmecontrol to using linker sets for commands
More commands will be added to nvmecontrol. Also, there will be a few
more vendor commands (some of which may need to remain private to
companies writing them). The first step on that journey is to move to
using linker sets to dispatch commands. The next step will be using
dlopen to bring in the .so's that have the command that might need
to remain private for seamless integration.
Similar changes to this will be needed for vendor specific log pages.
kib [Sun, 2 Dec 2018 13:16:46 +0000 (13:16 +0000)]
Change the vm_ooffset_t type to unsigned.
The type represents byte offset in the vm_object_t data space, which
does not span negative offsets in FreeBSD VM. The change matches byte
offset signess with the unsignedness of the vm_pindex_t which
represents the type of the page indexes in the objects.
This allows to remove the UOFF_TO_IDX() macro which was used when we
have to forcibly interpret the type as unsigned anyway. Also it fixes
a lot of implicit bugs in the device drivers d_mmap methods.
Reviewed by: alc, markj (previous version)
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
asomers [Sun, 2 Dec 2018 05:06:37 +0000 (05:06 +0000)]
Unbreak geli/gmirror testcases if their geom classes cannot be loaded
The problem with the logic prior to this commit was twofold:
1. The wrong set of idioms (TAP-compatible) were being applied to the ATF
testcases when run, resulting in confusing ATF failure results on setup.
2. The cleanup subroutines were broken when the geom classes could not be
loaded as they exited with 0 unexpectedly.
This commit changes the test code to source the class-specific configuration
(conf.sh) once globally, instead of sourcing it per testcase and per cleanup
subroutine, and to call the ATF-specific setup subroutine(s) inline in
the testcases.
The refactoring done is effectively a no-op for the TAP testcases, modulo
any refactoring done to create common code between the ATF and TAP
testcases.
This unbreaks the geli testcases converted to ATF in r327662 and r327683,
and the gmirror testcases added in r327780, respectively, when the geom
class could not be loaded.
tests/sys/geom/class/mirror/...
While here, ignore errors when turning debug failpoint sysctl off, which
could occur if the gmirror class was not loaded.
cem [Sat, 1 Dec 2018 21:37:47 +0000 (21:37 +0000)]
pmcr: Fix pstate setting on Power8
Fix p-state setting on Power8 by removing the accidental double-indirection of
the pstate_ids table.
The pstate_ids table comes from the OF property "ibm,pstate-ids." On Power9,
the values happen to be identical to the indices, so the extra indirection was
harmless. On Power8, the values were out of the range [0, npstates], so
pmcr_set() would fail the spec[0] range check with EINVAL.
While here, include both the value and index in the driver-specific register
array as spec[0] and spec[1] respectively. They're redundant, but relatively
harmless, and it may aid debugging.
While here, fix the range check to exclude the index npstates, which is one
past the last valid index.
PR: 233693
Reported and tested by: sbruno
Reviewed by: jhibbits
jhibbits [Sat, 1 Dec 2018 20:39:20 +0000 (20:39 +0000)]
Fix PowerPC64 ELFv1-specific problem in __elf_phdr_match_addr() leading to crash
in threaded programs that unload libraries.
Summary:
The GNOME update to 3.28 exposed a bug in __elf_phdr_match_addr(), which leads
to a crash when building devel/libsoup on powerpc64.
Due to __elf_phdr_match_addr() limiting its search to PF_X sections, on the
PPC64 ELFv1 ABI, it was never matching function pointers properly.
This meant that libthr was never cleaning up its atfork list in
__pthread_cxa_finalize(), so if a library with an atfork handler was unloaded,
libthr would crash on the next fork.
Normally, the null pointer check it does before calling the handler would avoid
this crash, but, due to PPC64 ELFv1 using function descriptors instead of raw
function pointers, a null check against the pointer itself is insufficient, as
the pointer itself was not null, it was just pointing at a function descriptor
that had been zeroed. (Which is an ABI violation.)
Calling a zeroed function descriptor on PPC64 ELFv1 causes a jump to address 0
with a zeroed r2 and r11.
Remove IFF_DRVRLOCK as it is used in IRIX only (and we all know IRIX
is dead). This includes collaterally removing code shared by HP/UX,
SGI, and Linux, where IP Filter will in all likelihood for various
reasons never run again.
manu [Sat, 1 Dec 2018 20:26:59 +0000 (20:26 +0000)]
arm64: rockchip: Add RK3399_CLK_PLL
PLLs on the RK3399 are different than the ones on the RK3328.
Add a new type and some dedicated recalc and set_freq functions.
Rename the RK3328 dedicated rk_clk_pll function with rk3328_ prefix.
Restore handling of PMTU discovery, removed through an unifdef(1)
following the MFV of r254219 into r255332. In addition the 'FreeBSD'
macro was never defined in ipfilter 5.1.2 thus it never would have
been enabled in the first place.
This work is prompted by a general cleanup of the IP Filter code
prompted by working to resolve a PR. More to follow.
kib [Sat, 1 Dec 2018 16:50:12 +0000 (16:50 +0000)]
Allow to create swap zone larger than v_page_count / 2.
If user configured the maxswapzone tunable, just take the literal
value for the initial zone sizing attempt. Before, it was only
possible to reduce the zone by the tunable.
While there, correct the message which was not correct when zone
creation rounded the size up.
Reported by: jmg
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18381
In rare situations[*] it's possible for two different interfaces to have
the same name. This confuses pf, because kifs are indexed by name (which
is assumed to be unique). As a result we can end up trying to
if_rele(NULL), which panics.
Explicitly checking the ifp pointer before if_rele() prevents the panic.
Note pf will likely behave in unexpected ways on the the overlapping
interfaces.
[*] Insert an interface in a vnet jail. Rename it to an interface which
exists on the host. Remove the jail. There are now two interfaces with
the same name in the host.
ae [Fri, 30 Nov 2018 10:36:14 +0000 (10:36 +0000)]
Adapt the fix in r341008 to correctly work with EBR.
IFNET_RLOCK_NOSLEEP() is epoch_enter_preempt() in FreeBSD 12+. Holding
it in sysctl_rtsock() doesn't protect us from ifnet unlinking, because
unlinking occurs with IFNET_WLOCK(), that is rw_wlock+sx_xlock, and it
doesn check that concurrent code is running in epoch section. But while
we are in epoch section, we should be able to do access to ifnet's
fields, even it was unlinked. Thus do not change if_addr and if_hw_addr
fields in ifnet_detach_internal() to NULL, since rtsock code can do
access to these fields and this is allowed while it is running in epoch
section.
This should fix the race, when ifnet_detach_internal() unlinks ifnet
after we checked it for IFF_DYING in sysctl_dumpentry.
Move free(ifp->if_hw_addr) into ifnet_free_internal(). Also remove the
NULL check for ifp->if_description, since free(9) can correctly handle
NULL pointer.
manu [Fri, 30 Nov 2018 10:31:30 +0000 (10:31 +0000)]
arm64: allwinner: Add 792Mhz frequency to sun50i-a64-opp
This is the frequency of the cpu on the Pinebook so add it to make
cpufreq find the current setting.
Note that this dtbo on the Pinebook doesn't work right now as u-boot
dtb doesn't have symbols and so it fails to apply. Linux 4.20 have
the dts and will be imported once taggued.
tsoome [Fri, 30 Nov 2018 08:42:14 +0000 (08:42 +0000)]
loader.efi: fix EFI getchar() for multiple consoles
This fix is ported from illumos (issue #9970), the analysis and initial
implementation was done by John Levon.
See also: https://www.illumos.org/issues/9970
Currently, efi_cons_getchar() will wait for a key. While this seems to make
sense, the implementation of getchar() in common/console.c will loop across
getchar() for all consoles without doing ischar() first.
This means that if we've configured multiple consoles, we can't input into
the serial, as getchar() will be sat waiting for input only from efi_console.c
This patch does implement a bit more generic key buffer to support
translation of input keys, and we use generic efi_readkey() to reduce
duplication from calls from getchar() and poll().
tsoome [Fri, 30 Nov 2018 08:01:11 +0000 (08:01 +0000)]
loader: create separate lists for fd, cd and hd, merge bioscd with biosdisk
Create unified block IO implementation in BIOS version, like it is done in UEFI
side. Implement fd, disk and cd device lists, this will split floppy devices
from disks and will allow us to have consistent, predictable device naming
(modulo BIOS issues).
arybchik [Fri, 30 Nov 2018 07:11:05 +0000 (07:11 +0000)]
sfxge(4): rollback last seen VLAN TCI if Tx packet is dropped
Early processing of a packet on transmit may change last seen
VLAN TCI in the queue context. If such a packet is eventually
dropped, last seen VLAN TCI must be set to its previous value.
Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18288
arybchik [Fri, 30 Nov 2018 07:10:32 +0000 (07:10 +0000)]
sfxge(4): update external port number calculation
Revise the external port calculation to support all
X2 port modes. The previous algorithm could not
handle different port numbering schemes on each cage.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D18285
arybchik [Fri, 30 Nov 2018 07:10:20 +0000 (07:10 +0000)]
sfxge(4): correct annotations where NULL input is OK
Correct annotations where NULL input can be permitted
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D18284
arybchik [Fri, 30 Nov 2018 07:10:09 +0000 (07:10 +0000)]
sfxge(4): support new link modes in the driver
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D18283
arybchik [Fri, 30 Nov 2018 07:09:58 +0000 (07:09 +0000)]
sfxge(4): use transceiver ID when reading info
In efx_mcdi_phy_module_get_info() probe the
transceiver identification byte rather than assume
the module matches the fixed port type. This
supports scenarios such as a SFP mounted in a QSFP
port via a QSA module.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D18282