Alexander Motin [Thu, 8 Aug 2019 02:21:30 +0000 (02:21 +0000)]
MFC r350020 (by imp): Use a different approach to range check.
gcc hates dt < CC_DT_NONE since it can never be true when dt is an unsigned
type. Since that's a compiler choice and may be affected by weird stuff, instead
use (unsigned)dt > CC_DT_UNKNOWN to test for bounds error since that will work
regardless of the signedness of dt.
Alexander Motin [Thu, 8 Aug 2019 02:20:42 +0000 (02:20 +0000)]
MFC r350018: Implement a devtype command.
List the device's protocol. The returned value is one of the following:
ata direct attach ATA or SATA device
satl a SATA device attached via SAS
scsi A parallel SCSI or SAS
nvme A direct attached NVMe device
mmcsd A MMC or SD attached device
Alexander Motin [Thu, 8 Aug 2019 02:18:14 +0000 (02:18 +0000)]
MFC r350008 (by imp):
Use the more proper term of SATL instead of ATA_BEHIND_SCSI.
Most people know SAS attached SATA devices by the name SAT or SATL
(with the latter being a little more common). Change the device type
ATA_BEHIND_SCSI to SATL since it's more specific and meaningful.
Alexander Motin [Thu, 8 Aug 2019 02:11:08 +0000 (02:11 +0000)]
MFC r349339 (by imp):
Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
Alexander Motin [Thu, 8 Aug 2019 02:08:09 +0000 (02:08 +0000)]
MFC r349283 (by scottl):
Refactor xpt_getattr() to make it more readable. No outwardly
visible functional changes, though code flow was modified a bit
internally to lessen the need for goto jumps and chained if
conditionals.
Alexander Motin [Thu, 8 Aug 2019 02:04:29 +0000 (02:04 +0000)]
MFC r348786 (by chuck): Fix nda(4) PCIe link status output
Differentiate between PCI Express Endpoint devices and Root Complex
Integrated Endpoints in the nda driver. The Link Status and Capability
registers are not valid for Integrated Endpoints and should not be
displayed. The bhyve emulated NVMe device will advertise as being an
Integrated Endpoint.
Alexander Motin [Thu, 8 Aug 2019 00:33:23 +0000 (00:33 +0000)]
MFC r345363 (by imp): Make WD and WDC aliases for HGST.
HGST was bought by WDC. Over the years, it has sold different drives
branded as HGST, WD or WDC. All of them need the HGST workaround of
sending 4k-sized packets (or multiples of 4k). And the ones that don't
really need this aren't broken by this change. Submitter is the vendor
who has tested these changes on a number of drives. I've simplified it
slightly, since we don't need additional vendors for this at this
time.
Alexander Motin [Thu, 8 Aug 2019 00:31:00 +0000 (00:31 +0000)]
MFC r345051 (by imp): Add -l to camcontrol readcap.
The -l flag sends only the READ CAPACITY (16) sevice action. Normally
we send the READ CAPACITY (10) command, and only send RC16 when the
capacity is larger than 2TB (since that's the max RC10 can
report). However, some badly programmed drives report different
numbers for RC10 and RC16. This can be hard to diagnose, but generally
there's a "Logical block address out of range" error when RC16 reports
a larger number than RC10 and the RC10 number is the correct one. By
comparing the output of readcap with and without the -l argmuent, one
can determine if there's a mismatch and if the DA_Q_NO_RC16 quirk is
needed.
Alexander Motin [Thu, 8 Aug 2019 00:29:39 +0000 (00:29 +0000)]
MFC r344070 (by imp): Fix panic message.
The panic message lead people to believe some userland CAM request had
caused a problem when in reallity it was for a kernel request (eg the
USER bit was cleared). Reword message. Also, improve a couple of
comments to reflect that the periph shouldn't be completely torn down
before we get here (so the path and sim pointers should be valid, but
aren't and the code is designed to be robust enough in the face of
that to give a specific panic message).
Alexander Motin [Thu, 8 Aug 2019 00:27:26 +0000 (00:27 +0000)]
MFC r343814 (by imp): Add quirk for Sansisk X400 drives
Certain versions of Sandisk x400 firmware can hang under extremely
heavly load of large I/Os for prolonged periods of time. Newer /
current versions work fine, and should be used where possible. Where
not possible, this quirk ensures that I/O requests are limited to 128k
to avoids the bug, even under extreme load. Since MAXPHYS is 128k,
only users with custom kernels are at risk on the older firmware.
Once all known users of the older firmware have upgraded, this quirk
will be removed.
Alexander Motin [Thu, 8 Aug 2019 00:22:15 +0000 (00:22 +0000)]
MFC r341769 (by imp):
Send a START UNIT command when a disk responds with an ASC of 04/1C.
This will hopefully spin up a disk that's in low-power mode.
Emmanuel Vadot [Wed, 7 Aug 2019 18:35:59 +0000 (18:35 +0000)]
MFC r341404-r341405, r344699, r347024, r347442
r341404 by andreast:
Add rule to build the dtb for the rock64 board.
Reviewed by: manu@
r341405 by andreast:
Build the dtb for the rock64 board.
Reviewed by: manu@
r344699:
arm64: rockchip: rk3399_pll: Fix the recalc function
The plls frequency are now correctly calculated in fractional mode
and integer mode.
While here add some debug printfs (disabled by default)
Tested with powerd on the little cluster on a RockPro64.
r347024:
dtb: Include RK3399 RockPro64 DTS in kernel build
The DTS for this board is already present in sys/gnu/dts/arm64/rockchip/
and just needs to be enabled.
r347442:
arm64: rockchip: Don't always put PLL to normal mode
We used to put every PLL in normal mode (meaning that the output would
be the result of the PLL configuration) instead of slow mode (the output
is equal to the external oscillator frequency, 24-26Mhz) but this doesn't
work for most of the PLLs as when we put them into normal mode the registers
configuring the output frequency haven't been set.
Add a normal_mode member in clk_pll_def/clk_pll_sc struct and if it's true
we then set the PLL to normal mode.
For now only set it to the LPLL and BPLL (Little cluster PLL and Big cluster
PLL respectively).
PLLs on the RK3399 are different than the ones on the RK3328.
Add a new type and some dedicated recalc and set_freq functions.
Rename the RK3328 dedicated rk_clk_pll function with rk3328_ prefix.
r341382:
arm64/rockchip: add RK3399 support
Add CRU (Clock and Reset Unit) driver for RK3399.
Add support in rk_pinctrl driver.
Submitted by: Greg V <greg@unrelenting.technology> (Original version)
Differential Revision: https://reviews.freebsd.org/D16732
r341383:
arm64: rockchip: rk_i2c: Use correct clock
While here add RK3399 support and call clk_set_assigned to set the correct
clock set in the DTS.
r341385:
arm64: rockchip: rk805: Add basic support for RK808 PMIC
RK808 PMIC is the companion chip for RK3399 SoC.
Add basic regulator support in RK805 since they are similar.
r343950:
arm64: Fix compile when removing SOC_ROCKCHIP_* options
Make every rockchip file depend on the multiple soc_rockchip options
While here make rk_i2c and rk_gpio depend on their device options.
Reported by: sbruno
r344527:
arm64: rockchip: clk: Set the write mask when setting the clock mux
RockChip clocks have a write mask in the upper 16bits of the mux register
which wasn't set in the set_mux function.
Also the wrong parent was tested instead of the real current one, when
switch parent, test with the current one before.
Pointy Hat: manu
r344576:
arm64: rockchip: clk: rk_clk_composite: Properly use the mask bits
RockChip clocks register have a write mask in the upper 16 bits, if a 1
is present the corresponding bit in the lower 16 ones is set.
Use this instead of always setting the mask to 0xFFFF0000.
This avoids a read of the register.
While here add some debug printf useful for debuging clock problems
r344577:
arm64: rockchip: clk: ARM CLK improvement
RockChip clocks register have a write mask in the upper 16 bits, if a 1
is present the corresponding bit in the lower 16 ones is set.
Use this instead of always setting the mask to 0xFFFF0000.
This avoids a read of the register.
While here set the parent after changing its freqeuncy, this reduce the time
between changing the parent and changing the divider for the arm clock.
RockChip clocks register have a write mask in the upper 16 bits, if a 1
is present the corresponding bit in the lower 16 ones is set.
Use this instead of always setting the mask to 0xFFFF0000.
This avoids a read of the register.
While here, when switching PLL frequency, first switch it to slow mode.
When set to slow mode the PLL clock will be the external oscillator.
Changing the PLL parameters while its output is used can cause hang (sometimes).
Add the 3 LDO regulator found in the RK805 Power Management IC.
r344580:
arm64: rockchip: rk805: Map the regulator
No map function was provided before so every regulator lookup resolved
the regulator with id 1, as it uses the default mapper, which is wrong.
Correctly map the regulators.
While here remove some debug printfs and make them disable by default.
r344585:
arm64: rockchip: rk_pinctrl: Fix two banks in RK3328
The last two banks don't have 3 bits for the pin function but only 2.
This fixes eMMC on the Rock64.
r344589:
arm64: rockchip: rk3399_pll: Switch to slow mode when changing the freq
Like r344578 but for RK3399.
This solve some hangs when switching between frequency.
Remove the mode_val from the clock definition as it's a bit unreadable.
Use mode_shift to represent which bit control the mode in the register.
Simplify some case where we can avoid a register read before changing it.
Set the PLL back to normal mode after the PLL have stabilized.
r344627:
mmc: dwmmc: Match on "rockchip,rk3288-dw-mshc" compatible
This is the common denominator for rockchip compatible from RK3288 to RK3399.
The other compatible are generally present in the DTS but the controllers
are the same.
Cy Schubert [Wed, 7 Aug 2019 01:08:57 +0000 (01:08 +0000)]
MFC r350568:
Resolve ipfilter kld unload issues related to VNET jails.
When the ipfilter kld is loaded, used within VNET jail, and unloaded,
then subsequent loading, use, and unloading of another packet filters
will cause the subsequently loaded netpfil kld's to panic.
The scenario is as follows:
cd /usr/tests/sys/netpfil/common
kldunload ipl
kldunload pfsync
kldunload ipfw
kyua test pass_block
kldload ipl
kyua test pass_block
kldunload ipl
kldload pfsync
kyua test pass_block
kldunload pfsync
-- page fault panic occurs here --
Emmanuel Vadot [Tue, 6 Aug 2019 12:19:09 +0000 (12:19 +0000)]
MFC r348179-r348182
r348179:
allwinner: aw_ccu: Add some debug printfs (disabled by default)
Also print information about setting frequency at boot under bootverbose
r348180:
arm: allwinner: clk: Add new clock aw_clk_frac
Add a clock driver for clock that can either be used in integer mode
with one N factor and one M divider or in fractional mode where the
output frequency is chosen between two predifined output.
r348181:
arm: allwinner: clk: Use the new frac clock
Some clocks used the NM type but this clock is for the ones with the
formula "clk = clkin / n / m" and not "clk = clkin * n / m"
Use the new frac clock for them.
r348182:
arm: allwinner: Remove frac mode from NM clk
We have a correct clock type aw_clk_frac now for this.
Emmanuel Vadot [Tue, 6 Aug 2019 12:12:29 +0000 (12:12 +0000)]
MFC r347489-r347491, r347512
r347489:
allwinner: clk: prediv_mux: Init the current parent
Do not init the first parent but read the clock register to find
it's current parent and init this one.
r347490:
allwinner: clk: sun8i_r: Correct resets
The i2c reset wasn't defined and some bits where wrong, correct them.
r347491:
twsi: Calculate the clock param based on the bus frequency
Instead of precalculating the different speed, respect the bus frequency
and calculate the clock register parameter based on it.
If the platform didn't register the core clk, fallback on the precomputed
values (This is likely do be the case on Marvell boards).
r347512:
arm: allwinner: aw_clk_nm: Don't reparent the clock if we didn't ask
When looking for the best frequency don't change the clock parent if the
clock wasn't configured to do that.
John Baldwin [Mon, 5 Aug 2019 22:04:16 +0000 (22:04 +0000)]
MFC 350618:
Validate guest-supplied length of headers for TSO transmit requests.
When transmitting a large TCP packet, the final transmit descriptor
includes the length of the protocol headers to be duplicated on each
segment. The device model was trusting the guest-supplied value
without validating it. A value of zero would result in the guest
being able to indirect a garbage pointer on the stack to overwrite
arbitrary memory in the bhyve process. A value that was non-zero but
too small for the requested parameters resulted in the device model
reading and writing values beyond the end of the on-stack buffer used
to hold the template header.
To fix, validate the supplied length and drop requests to transmit
packets that would overflow the header buffer. While here, initialize
the header pointer to NULL as a preventive measure so that any access
to an unallocated template header crashes they hypervisor
deterministically.
While here, only read the TCP sequence number if the packet being
split is a TCP packet. The e1000 logic supports a segmentation of UDP
frames, and while UDP segmentation requires this part of the header to
be valid (so there is no buffer overflow), only reading the field when
needed is cleaner.
admbugs: 918
Reported by: Reno Robert <renorobert@gmail.com>
Approved by: so
Security: CVE-2019-5609
Emmanuel Vadot [Mon, 5 Aug 2019 18:27:25 +0000 (18:27 +0000)]
MFC r346334, r346787-r346789, r347017
r346334:
arm: allwinner: Fix audio for Allwinner H3/H5
Due to three conditions the codec driver for Allwinner A10/A20 and H3/H5 did not work properly here:
Wrong bit position for the analog audio reset
Hardware Reset of codec was not de-asserted correctly
Linux DTS file did not contain the address of the analog register the way as the driver was expecting it.
r346787:
arm64: allwinner: Add compatible strings for clock devices used on both Allwinner H3 and H5
Allwinner H3 and H5 share many internal components, that's why they can
use the same drivers.
This patch adds the compatible strings to enable clock drivers
probing on Allwinner NanoPI NEO2 device.
Tested on: NanoPi NEO2 (by submitter), OrangePi PC2 (by manu)
Submitted by: Manuel Stühn (freebsdnewbie@freenet.de)
Differential Revision: https://reviews.freebsd.org/D20069
PB20 and PB21 alternate function 1 is i2c2 not i2c1
Reported by: Horiki Mori (yamori813@yahoo.co.jp)
PR: 237401
r347017:
arm64: Add support for NanoPI NEO2
Add overlay files and activate devicetree file for NanoPi NEO2 featuring
Allwinner H5 ARM64 core.
To enable sound, dma and codec drivers are enabled for build.
Submitted by: Manuel Stühn (freebsdnewbie@freenet.de)
Differential Revision: https://reviews.freebsd.org/D20129
Emmanuel Vadot [Mon, 5 Aug 2019 18:17:03 +0000 (18:17 +0000)]
MFC r346298:
config: Only warn if duplicate option/device comes from the same file
This is useful for arm (possibly other arches too) where we want to have
a GENERIC kernel that only include files for the different SoC. Since
multiple SoCs/Board needs the same device we would need to do either :
Include the device in a generic file
Include the device in each file that really needs it
Option 1 works but if someone wants to create a specific kernel config
(which isn't uncommon for embedded system), he will need to add a lots
of nodevice to it.
Option 2 also works but produce a lots of warnings.
Emmanuel Vadot [Mon, 5 Aug 2019 17:54:08 +0000 (17:54 +0000)]
MFC r344633-r344634, r344638
r344633:
usb_nop_xceiv: Add support for this pseudo device
This is a "fake" phy that handle regulator, clocks and reset gpio.
Only clock and regulator is supported for now.
Sponsored-by: Rubicon Communications, LCC ("Netgate")
r344634:
xhci_mv: Move the driver to generic_xhci
Marvell XHCI is in fact generic-xhci, so move the driver and
add the compatible string.
While here, get and enable the phy if the dtb provide one.
The xhci bindings state that phys should be in a 'phys' property but
Marvell DTS uses 'usb-phy', only add support for 'usb-phy' for now.
Sponsored-by: Rubicon Communications, LCC ("Netgate")
r344638:
Fix armv6/armv7 build after the move from xhci_mv to generic_xhci
r342012:
arm64: marvell: Add driver for Marvell Ap806 System Controller
The first two clocks are for the clusters and their frequencies can be
found reading a register. Then a fixed 1200Mhz clock is present and two
fixed clocks, 'mss' which is 1200 / 6 and 'sdio' which is 1200 / 3.
r342016:
arm64: Add mv_cp110_icu and mv_cp110_gicp
icu is a interrupt concentrator in the CP110 block and gicp
is a gic extension to allow interrupts in the CP block to be turned
into GIC SPI interrupts
Emmanuel Vadot [Mon, 5 Aug 2019 17:32:16 +0000 (17:32 +0000)]
MFC r346293:
allwinner: clk: Garbage collect old clock implementation
The old clocks are disconneted from the build since r337344.
Remove all those pseudo drivers. The only one remaining is for gmac
(the ethernet controller) so move it to sys/arm/allwinner.
While here remove a83t support from gmacclk as it is unneeded since r326114.
Emmanuel Vadot [Mon, 5 Aug 2019 17:23:23 +0000 (17:23 +0000)]
MFC r346092, r346271-r346272
r346092:
Import DTS files from Linux 5.0
r346271:
aw_rtc: Register the clocks
Since latest DTS update the rtc is supposed to register two clocks :
- osc32k (the 32k oscillator on the board that the RTC uses directly and
that other peripheral can use)
- iosc (the internal oscillator of the RTC when available which frequency
depend on the SoC revision)
Since we need the RTC before the proper clock control unit (because it uses
those clocks) attach it a BUS_PASS_BUS + MIDDLE and attach the clock control
unit at BUS_PASS_BUS + LAST for the SoC that requires it.
Tested On: A20, H3, A64
r346272:
aw_syscon: Add a new compatible
Since 5.0 DTS the syscon controller have a new compatible as it
exports new subnodes, we currently only use it as a syscon provider
so just add the new compatible.
Emmanuel Vadot [Mon, 5 Aug 2019 17:06:20 +0000 (17:06 +0000)]
MFC r345948, r345951
r345948:
twsi: Add interrupt mode
Add the ability to use interrupts for i2c message.
We still use polling for early boot i2c transfer (for PMIC
for example) but as soon as interrupts are available use them.
On Allwinner SoC >A20 is seems that polling mode is broken for some
reason, this is now fixed by using interrupt mode.
For Allwinner also fix the frequency calculation, the one in the code
was for when the APB frequency is at 48Mhz while it is at 24Mhz on most
(all?) Allwinner SoCs. We now support both cases.
While here add more debug info when it's compiled in.
Tested On: A20, H3, A64
r345951:
twsi: Use config_intrhook_oneshot instead of config_intrhook_establish
r342924:
dtb: allwinner: Add orangepi-pc to the build
PR: 226011
Submitted by: Greg V <greg@unrelenting.technology>
r343749:
release: arm64: rpi3: Install the RPI3B+ DTB file
We should use the correct DTB file otherwise the firmware uses
the RPI3B one.
r343750:
release: arm64: pine64-lts: Use the newly created u-boot-pine64-lts port
In U-Boot 2019.01 there is now a config for this board, use it for the
release image.
r343874:
mtree: Add dtb subdir to the mtree file
makefs will fails otherwise
Reported by: emaste
r344893:
arm: allwinner: Fix NM clock recalc
If the NM clock is using a fractional divider the formula isn't the same.
r344894:
arm64: allwinner: Add CCU DE2
The Display Engine 2 have it's own Clock and Control Unit, add support
for it.
r344895:
arm64: allwinner: a64: Add TCON clock
The tcon clock need a mux table for it's parent, for now just
list the parents twice.
r345711:
arm: allwinner: clk: Fix nm_recalc
When comparing best frequencies use the absolute value.
If we do not do that we end up choosing an always lower value than
the best one if the exact freq cannot be met.
Emmanuel Vadot [Mon, 5 Aug 2019 16:48:16 +0000 (16:48 +0000)]
MFC r340987, r340989, r341254, r341269, r341333
r340987:
arm64: Add evdev support to GENERIC
r340989:
regulator_fixed: Do not disable fixed regulator at probe
If the regulator is unused it will be disabled by the regulator_shutdown sysinit.
Tested on pinebook where the backlight is controlled by a fixed-regulator.
The regulator doesn't have a regulator-boot-on param (I'm gonna upstream this) and so we disable it at probe.
We later enable it but this cause the screen to go black.
Linux doesn't disable regulator at boot (at least for fixed-regulator) so better match this to have the same UX.
ofw_bus_parse_xref_list_get_length doesn't returns the number of elements, fix this.
While here when setting the clock to the assigned freqeuncy, allow the clock
driver to round down or up the frequency as sometimes the exact frequency cannot
be obtain.
r341269:
release: arm64: Add opp dtbo to PINE* boards
r341333:
arm64: allwinner: Add 792Mhz frequency to sun50i-a64-opp
This is the frequency of the cpu on the Pinebook so add it to make
cpufreq find the current setting.
Note that this dtbo on the Pinebook doesn't work right now as u-boot
dtb doesn't have symbols and so it fails to apply. Linux 4.20 have
the dts and will be imported once taggued.
Emmanuel Vadot [Mon, 5 Aug 2019 16:36:11 +0000 (16:36 +0000)]
MFC r340845-r340848, r340971, r340981, r342076
r340845:
Derive PHY class to new one specialized for USB PHY functions.
Submitted by: mmel
r340846:
aw_usbphy: Convert to usbphy subclass
Instead of routing the phy when enabling it, do the configuration
and routing in the phynode_usb_set_mode function.
While here, if we don't have a vbus detection method, enable the phy
if requested.
r340847:
a10_ehci: Always set the phy to host mode
r340848:
axp8xx: Rework the enable part and add the GPIOXLDO regulators
r340971:
aw_usbphy: Do not error if it's not phy 0
Only phy0 can switch between host/otg, do not error if we request
host mode on phy != 0.
X-MFC with: r340846
r340981:
release: arm64: Add PINEBOOK config
Add a configuration for PINEBOOK image.
Pinebook is a arm64 laptop based on a Pine64 board.
Since the usb trackpad need a quirk, add a common function for adding
quirk for arm board.
A default one is supplied as most board to not need quirks.
r342076:
arm64: allwinner: axp81x: Fix double invertion for FLDO1
This fix booting on A64 boards when disabling the unused regulators at boot.
We did disable all the regulator handled by register 0x13 which of course contain
mandatory regulators for the board to be up.
Reported by: Mark Millard <marklmi@yahoo.com>
X-MFC-With: r340848
Kristof Provost [Mon, 5 Aug 2019 15:24:05 +0000 (15:24 +0000)]
MFC r350416:
riscv: Fix copyin/copyout
r343275 introduced a performance optimisation to the copyin/copyout
routines by attempting to copy word-per-word rather than byte-per-byte
where possible.
This optimisation failed to account for cases where the buffer is longer
than XLEN_BYTES, but due to misalignment does not not allow for any
word-sized copies. E.g. a 9 byte buffer (with XLEN_BYTES == 8) which is
misaligned by 2 bytes. The code nevertheless did a single full-word
copy, which meant we copied too much data. This potentially clobbered
other data.
This is most easily demonstrated by a simple `sysctl -a`.
Fix it by not assuming that we'll always have at least one full-word
copy to do, but instead checking the remaining length first.
MFC r350417:
Add ipfw_get_action() function to get the pointer to action opcode.
ACTION_PTR() returns pointer to the start of rule action section,
but rule can keep several rule modifiers like O_LOG, O_TAG and O_ALTQ,
and only then real action opcode is stored.
ipfw_get_action() function inspects the rule action section, skips
all modifiers and returns action opcode.
Use this function in ipfw_reset_eaction() and flush_nat_ptrs().
Ed Maste [Sun, 4 Aug 2019 20:40:47 +0000 (20:40 +0000)]
MFC r350518: as: add deprecation notice to the man page
In the future FreeBSD will ship without GNU binutils 2.17.50. Add a
note advising users who require GNU as to install the binutils port
or package.
Note that on armv7, arm64, amd64, i386 we currently ship only two
binutils tools (as and objdump). A deprecation notice was added to
objdump's man page some time ago.
Ed Maste [Sun, 4 Aug 2019 01:18:50 +0000 (01:18 +0000)]
objdump: update deprecation notice
MFC r350503: objdump: move deprecation notice to indended spot
r335217 added a deprecation notice to the source file for the objdump
man page, and r335219 added it to the rendered objdump.1, but in the
wrong spot.
MFC r350505: objdump: be explicit that GNU objdump that will be removed
We may install llvm-objdump as objdump (see review D18307) or just
provide no /usr/bin/objdump, but either way GNU objdump won't be
installed in the future.
Brooks Davis [Fri, 2 Aug 2019 20:31:02 +0000 (20:31 +0000)]
MFC r350228:
ata_xpt: Use the correct union member when accessing valid.
In principle this should not matter as it's a union and they point to
the same memory location but based on the code above we should be
accessing .sata and not .ata.
Rick Macklem [Fri, 2 Aug 2019 01:59:58 +0000 (01:59 +0000)]
MFC: r350367
Lock the vnode before calling ufs_bmap_seekdata().
r346932 replaced a call to vn_bmap_seekhole() with a call to
ufs_bmap_seekdata(). Although vn_bmap_seekhole() locks the vnode,
ufs_bmap_seekdata() assumes it is already locked.
This patch adds locking of the vnode before the ufs_bmap_seekdata() call.
If the vn_lock() call fails, it returns EBADF since that is the normal
error returned when a file system is forced dismounted and is already
listed as an error return in the lseek(2) man page.
Thanks go to Harry Schmalzbauer (freebsd@omnilan.de) for noting that
this MFC was required.
Doug Moore [Thu, 1 Aug 2019 05:30:31 +0000 (05:30 +0000)]
MFC r350183, r350359
In trimming on startup, define swapon_trim() to invoke swapon before
closing the fd used for trimming so that a geli device isn't detached
before swapon is invoked. Add comments to explain the reason for the
ordering of events.
This is effectively a direct commit to stable branches as tun/tap have been
merged in head. The code here is identical, just in a slightly different
context.
Make filesystem-full messages limited per filesystem rather than systemwide
Add "untrusted" option to mount command
FS-14-UFS-3: when untrusted, valididate block pointers
In fsck_ffs, treat any inode with bad content as unknown
As of upstream fil.c CVS r1.53 (March 1, 2009), prior to the import of
ipfilter 5.1.2 into FreeBSD-10, the fix for, 2580062 from/to targets
should be able to use any interface name, moved frentry.fr_cksum to
prior to frentry.fr_func thereby making this code redundant. After
investigating whether this fix to move fr_cksum was correct and if it
broke anything, it has been determined that the fix is correct and this
code is redundant. We remove it here.
Pull in r366369 from upstream llvm trunk (by Francis Visoiu Mistrih):
[CodeGen][NFC] Simplify checks for stack protector index checking
Use `hasStackProtectorIndex()` instead of `getStackProtectorIndex()
>= 0`.
Pull in r366371 from upstream llvm trunk (by Francis Visoiu Mistrih):
[PEI] Don't re-allocate a pre-allocated stack protector slot
The LocalStackSlotPass pre-allocates a stack protector and makes sure
that it comes before the local variables on the stack.
We need to make sure that later during PEI we don't re-allocate a new
stack protector slot. If that happens, the new stack protector slot
will end up being **after** the local variables that it should be
protecting.
Therefore, we would have two slots assigned for two different stack
protectors, one at the top of the stack, and one at the bottom. Since
PEI will overwrite the assigned slot for the stack protector, the
load that is used to compare the value of the stack protector will
use the slot assigned by PEI, which is wrong.
For this, we need to check if the object is pre-allocated, and re-use
that pre-allocated slot.
Pull in r367068 from upstream llvm trunk (by Francis Visoiu Mistrih):
[CodeGen] Don't resolve the stack protector frame accesses until PEI
Currently, stack protector loads and stores are resolved during
LocalStackSlotAllocation (if the pass needs to run). When this is the
case, the base register assigned to the frame access is going to be
one of the vregs created during LocalStackSlotAllocation. This means
that we are keeping a pointer to the stack protector slot, and we're
using this pointer to load and store to it.
In case register pressure goes up, we may end up spilling this
pointer to the stack, which can be a security concern.
Instead, leave it to PEI to resolve the frame accesses. In order to
do that, we make all stack protector accesses go through frame index
operands, then PEI will resolve this using an offset from sp/fp/bp.
Together, these fix a issue where the stack protection feature in LLVM's
ARM backend can be rendered ineffective when the stack protector slot is
re-allocated so that it appears after the local variables that it is
meant to protect, leaving the function potentially vulnerable to a
stack-based buffer overflow.
Reported by: andrew
Security: https://kb.cert.org/vuls/id/129209/
commit log from upstream:
Fix race in parallel mount's thread dispatching algorithm
Strategy of parallel mount is as follows.
1) Initial thread dispatching is to select sets of mount points that
don't have dependencies on other sets, hence threads can/should run
lock-less and shouldn't race with other threads for other sets. Each
thread dispatched corresponds to top level directory which may or may
not have datasets to be mounted on sub directories.
2) Subsequent recursive thread dispatching for each thread from 1)
is to mount datasets for each set of mount points. The mount points
within each set have dependencies (i.e. child directories), so child
directories are processed only after parent directory completes.
The problem is that the initial thread dispatching in
zfs_foreach_mountpoint() can be multi-threaded when it needs to be
single-threaded, and this puts threads under race condition. This race
appeared as mount/unmount issues on ZoL for ZoL having different
timing regarding mount(2) execution due to fork(2)/exec(2) of mount(8).
`zfs unmount -a` which expects proper mount order can't unmount if the
mounts were reordered by the race condition.
There are currently two known patterns of input list `handles` in
`zfs_foreach_mountpoint(..,handles,..)` which cause the race condition.
1) #8833 case where input is `/a /a /a/b` after sorting.
The problem is that libzfs_path_contains() can't correctly handle an
input list with two same top level directories.
There is a race between two POSIX threads A and B,
* ThreadA for "/a" for test1 and "/a/b"
* ThreadB for "/a" for test0/a
and in case of #8833, ThreadA won the race. Two threads were created
because "/a" wasn't considered as `"/a" contains "/a"`.
2) #8450 case where input is `/ /var/data /var/data/test` after sorting.
The problem is that libzfs_path_contains() can't correctly handle an
input list containing "/".
There is a race between two POSIX threads A and B,
* ThreadA for "/" and "/var/data/test"
* ThreadB for "/var/data"
and in case of #8450, ThreadA won the race. Two threads were created
because "/var/data" wasn't considered as `"/" contains "/var/data"`.
In other words, if there is (at least one) "/" in the input list,
the initial thread dispatching must be single-threaded since every
directory is a child of "/", meaning they all directly or indirectly
depend on "/".
In both cases, the first non_descendant_idx() call fails to correctly
determine "path1-contains-path2", and as a result the initial thread
dispatching creates another thread when it needs to be single-threaded.
Fix a conditional in libzfs_path_contains() to consider above two.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
PR: 237517, 237397, 239243
Submitted by: Matthew D. Fuller <fullermd@over-yonder.net> (by email)
Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert of Fraunhofer FKIE
Reported as: FS-22-EXT2-9: Denial of service in ftruncate-0 (ext2_balloc)
FS-11-EXT2-6: Denial Of Service in write-1 (ext2_balloc)
Accept an IEEE Extended Unique Identifier (EUI-64) from the command
line for each NVMe namespace. If one isn't provided, it will create one
based on the CRC16 of:
- the FreeBSD IEEE OUI
- PCI bus, device/slot, function values
- Namespace ID
The NVMe specification defines bits 13:4 of BAR0 as Reserved (i.e. 0x0).
Most drivers do not enforce this, but the Windows NVMe driver does and
will refuse to start the device (i.e. error 10) if any of these bits are
set.
bhyve's NVMe emulation was transferring Identify data back to the guest
incorrectly causing memory corruptions. These corruptions resulted in
core dumps and other system level errors in the guest.
Pedro F. Giffuni [Fri, 26 Jul 2019 21:08:01 +0000 (21:08 +0000)]
MFC r349802 (from fsu@):
Add additional check for 'blocks per group' and 'fragments per group'
superblock fields.
These fields will not be equal only in case if bigalloc filesystem feature is
turned on. This feature is not supported for now.
Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert of Fraunhofer FKIE
Reported as: FS-27-EXT2-12: Denial of Service in openat-0 (vm_fault_hold/ext2_clusteracct)
MFC r344120:
Unify i386 and amd64 getcontextx.c, and use ifuncs while there.
This is yet another attempt of the merge, previously done as r344436 and
reverted in r344463. It is redone since ld was changed to ifunc-capable
linker on i386.
r349380:
libbe(3): mount: the BE dataset is mounted at /
Other parts of libbe(3) were fairly strict on the mountpoint property of the
BE dataset, and be_mount was not much better. It was improved in r347027 to
allow mountpoint=none for depth==0, but this bit was still sensitive to
mountpoint != / and mountpoint != none. Given that other parts of libbe(3)
no longer restrict the mountpoint property here, and the rest of the base
system is generally OK and will assume that a BE is mounted at /, let's do
the same.
r349383:
libbe(3): restructure be_mount, skip canmount check for BE dataset
Further cleanup after r349380; loader and kernel will both ignore canmount
on the root dataset as well, so we should not be so strict about it when
mounting it. be_mount is restructured to make it more clear that depth==0 is
special, and to not try fetching these properties that we won't care about.
bectl advertises that it has the ability to create recursive and
non-recursive boot environments. This patch implements that functionality
using the be_create_depth API provided by libbe. With this patch, bectl now
works as bectl(8) describes in regards to creating recursive/non-recursive
boot environments.
r344226:
Fix memory corruption bug introduced in r325310
The bug occurred when a bounce buffer was used and the requested read
size was greater than the size of the bounce buffer. This commit also
rewrites the read logic so that it is easier to systematically verify
all alignment and size cases.
r344234:
It turns out r344226 narrowed the overrun bug but did not eliminate it entirely
This commit fixes a remaining output buffer overrun in the
single-sector case when there is a non-zero tail.
CID 1400451: case 0 is missing a break/return and falling through to the
default case. waitpid(0, ...) makes little sense in the child, we likely
wanted to terminate immediately.
CID 1400453: size argument uses sizeof(char **) instead of sizeof(char *)
and is assigned to a char **; sizeof's match but "this isn't a portable
assumption".
Brooks Davis [Thu, 25 Jul 2019 17:33:44 +0000 (17:33 +0000)]
MFC r350116:
Document that setmode(3) is not thread safe.
In some circumstances, setmode(3) may call umask(2) twice to retrieve
the current mode and then restore it. Between calls, the process will
have a umask of 0.
Simon J. Gerraty [Thu, 25 Jul 2019 00:07:10 +0000 (00:07 +0000)]
loader: ignore some variable settings if input unverified
libsecureboot can tell us if the most recent file opened was
verfied or not.
If it's state is VE_UNVERIFIED_OK, skip if variable
matches one of the restricted prefixes.