cem [Wed, 30 Sep 2020 14:55:54 +0000 (14:55 +0000)]
gdb(4): Don't escape GDB special characters at application layer
In r351368, we introduced this XML- and GDB-encoded data. The protocol
'offset' should reflex the logical XML data offset, but unfortunately we
counted the GDB escapes as well.
In fact, we cannot safely do GDB character escaping at this layer at
all, because we don't know what will be flushed in a packet. It is
bogus to send only the first character of a two-character escape
sequence.
This patch "corrects" the problem by squashing these characters in the
transmitted XML document. It would be nice to transmit the characters
faithfully, but that is a more complicated change. Thread names are a
nice convenience feature for the GDB client, but one can always inspect
td_name or p_comm directly to find the true name.
Reported by: Ka Ho Ng <khng300 AT gmail.com>
Tested by: Ka Ho Ng
Reviewed by: emaste, markj, rlibby
Differential Revision: https://reviews.freebsd.org/D26599
Load/store/fetch access exceptions always indicate a violation of a PMP
rule. We can't treat those as page faults, because updating the page
table and trying again will only result in exactly the same access
exception recurring. This leaves us in an endless exception loop.
We cannot recover from these exceptions, so panic instead.
Every other architecture defines this and this is required for
interrupts to work when using QEMU's PCI VirtIO devices (which all
report an interrupt line of 0) for two reasons.
Firstly, interrupt line 0 is wrong; they use one of 0x20-0x23 with the
lines being cycled across devices like normal. Moreover, RISC-V uses
INTRNG, whose IRQs are virtual as indices into its irq_map, so even if
we have the right interrupt line we still need to try and route the
interrupt in order to ultimately call into intr_map_irq and get back a
unique index into the map for the given line, otherwise we will use
whatever happens to be in irq_map[line] (which for QEMU where the line
is initialised to 0 results in using the first allocated interrupt,
namely the RTC on IRQ 11 at time of commit).
Note that pci_assign_interrupt will still do the wrong thing for INTRNG
when using a tunable, as it will bypass INTRNG entirely and use the
tunable's value as the index into irq_map, when it should instead
(indirectly) call intr_map_irq to allocate a new entry for the given
IRQ and treat the tunable as stating the physical line in use, which is
what one would expect. This, however, is a problem shared by all INTRNG
architectures, and not exclusive to RISC-V.
Make copy_file_range(2) Linux compatible for overflow of offset + len.
Without this patch, if a call to copy_file_range(2) specifies an input file
offset + len that would wrap around, EINVAL is returned.
I thought that was the Linux behaviour, but recent testing showed that
Linux accepts this case and does the copy_file_range() to EOF.
This patch changes the FreeBSD code to exhibit the same behaviour as
Linux for this case.
This appears to be a typo. The AdvSIMD field encodes support for
half-precision floating point SIMD instructions, which corresponds to
HWCAP_ASIMDHP, not HWCAP_ASIMDDP.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Fallback to software for more GCM and CCM requests.
ccr(4) uses software to handle GCM and CCM requests not supported by
the crypto engine (e.g. with only AAD and no payload). This change
adds a fallback for a few more requests such as those with more SGL
entries than can fit in a work request (this can happen for GCM when
decrypting a TLS record split across 15 or more packets).
Rather than placing the epoch around the entire receive loop which
might call into rtwn_rx_frame() and USB and sleep, split the loop
into two[1] and leave us with one unlock/lock cycle as well.
PR: 249925
Reported by: thj, (rkoberman gmail.com)
Tested by: thj
Suggested by: adrian [1]
Reviewed by: adrian
MFC after: 3 days
Sponsored by: The FreeBSD Foundation (initially, paniced my iwl lab host)
Differential Revision: https://reviews.freebsd.org/D26554
Add EXAMPLES section to the man page showing the use of all flags except for
-S.
While here, clarify -f description. It not only suppresses diagnostic messages
but it also affects the exit status of the command itself. This is shown in two
of the examples.
OpenZFS will start using some of the kernel timekeeping bits
shortly. This implements the bare minimum of that which currently
is just the time_seconds variable.
o Rename acpi_iommu_get_dma_tag() -> iommu_get_dma_tag().
This function isn't ACPI dependent and we may use it on FDT systems
as well.
o Don't repeat the function declaration, include iommu.h instead.
LLVM 11 changed the meaning of '-O' from '-O2' to '-O1', which resulted
in debug kernels (with 'makeoptions DEBUG=-g') being built with inlining
disabled, causing severe performance hit.
The -O2 was already being used for building amd64, powerpc, and powerpcspe.
Improve the input validation and processing of cookies.
This avoids setting the association in an inconsistent
state, which could result in a use-after-free situation.
This can be triggered by a malicious peer, if the peer
can modify the cookie without the local endpoint recognizing
it.
Thanks to Ned Williamson for reporting the issue.
cxgbe(4): Avoid unnecessary work in the firmware during netmap tx.
Bind the netmap tx queues to a special '0xff' scheduling class which
makes the firmware skip some processing related to rate limiting on the
outgoing traffic. Future firmwares will do this automatically.
cxgbe(4): fixes for netmap operation with only some queues active.
- Only active netmap receive queues should be in the RSS lookup table.
- The RSS table should be restored for NIC operation when the last
active netmap queue is switched off, not the first one.
- Support repeated netmap ON/OFF on a subset of the queues. This works
whether the the queues being enabled and disabled are the only ones
active or not. Some kring indexes have to be reset in the driver for
the second case.
For interfaces that do not support SIOCGIFMEDIA (for which there are
quite a few) the only fallback is to query the interface for
if_data->ifi_link_state. While it's possible to get at if_data for an
interface via getifaddrs(3) or sysctl, both are heavy weight mechanisms.
SIOCGIFDATA is a simple ioctl to retrieve this fast with very little
resource use in comparison. This implementation mirrors that of other
similar ioctls in FreeBSD.
For mulitcons boot, report it and which console is primary
Until we can do proper /etc/rc output on both consoles in multicons
boot (or all of them if we ever generalize), report when we are
booting multicons. Also report the primary console. This will be a big
hint why output stops after this line (though some slow USB discovery
still happens after mountroot / init starts).
Report what console the boot loader is telling the kernel to use:
o Dual (Serial Primary)
o Dual (Video Primary)
o Serial
o Video
This allows the user to interrupt the boot and tweak the cosnole, if
needed, in a trivial way. Useful for installs where the default
selected may not be quite what you want, or when you are running a
dual setup and need to toggle over to the other console being primary.
The 'c'/'C' keys will do the cycling through the consoles. Note:
you'll still have to drop into the loader to set details about serial
consoles. And this doesn't change the console the loader is using.
Reviewed by: kevans@
MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D26573
Fix booting arm64 EFI with LINUX_BOOT_ABI enabled.
Use address of the pointer passed to kernel to determine whether the pointer
is a FDT block (physical address) or a module pointer (virtual kernel address).
This fragment was supposed to be committed before r366196, but I accidentally
skipped it in a patch series.
Rather than hard coding ada0 everywhere, use ${dev}. Also, set
dev=vtbd0 since both qemu and bhyve support this. More work
should be done to use labels instead for fstab.
qemu scripts likely need adjustment. And we should also
likely generate byhve scripts too.
The video on PCI heuristic was broken. It was supposed to infer a
video device when the last element of the path was a PCI DEVICE PATH
node. However, the last node in the device path is an END node, so
this heuristic never fired.
This leads, among other things, to bhyve only producing output in the
serial connection once we leave the boot loader. This restores the
dual headed boot on bhyve + UEFI (as we did in 11.2), but will favor
serial in the absence of other config which may be a change from 11.2.
MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D26572
In order to prioritize iSCSI traffic across a network,
DSCP can be used. In order not to rely on "ipfw setdscp"
or in-network reclassification, this adds the dscp value
directly to the portal group (where TCP sessions are accepted).
The incoming iSCSI session is first handled by ctld for any
CHAP authentication and the socket is then handed off to the
in-kernel iscsi driver without modification of the socket
parameters. Simply setting up the socket in ctld is sufficient
to keep sending outgoing iSCSI related traffic with the
configured DSCP value.
Improve the handling of receiving unordered and unreliable user
messages using DATA chunks. Don't use fsn_included when not being
sure that it is set to an appropriate value. If the default is
used, which is -1, this can result in SCTP associaitons not
making any user visible progress.
Thanks to Yutaka Takeda for reporting this issue for the the
userland stack in https://github.com/pion/sctp/issues/138.
Don't send a signal with uninitialized 'sig' and 'code' fields.
We have a few shortcuts in the arm trap code to speed up obvious "must fail"
cases. In these situations, make sure that we fill in the "sig" and "code"
fields of the generated signal.
It was removed in r355289 but forgot to return it back when new u-boot booti
support was committed. Although booti is not the preferred method of
booting the kernel, it is very useful for the initial phase of porting
FreeBSD to a new platform or booting the kernel on various embedded boards
in an industrial environment.
Don't map same physical memory multiple times with different cache attributes.
This is explicitly stated as architectural undefined behavior, leading to
coherency issues sooner or later.
Don't map same physical memory multiple times with different cache attributes.
This is explicitly stated as architectural undefined behavior, leadint to
coherencz issues sonner or later.
Bjorn reported a problem where the Linux NFSv4.1 client is
using an open_to_lock_owner4 when that lock_owner4 has already
been created by a previous open_to_lock_owner4. This caused the NFS server
to reply NFSERR_INVAL.
For NFSv4.0, this is an error, although the updated NFSv4.0 RFC7530 notes
that the correct error reply is NFSERR_BADSEQID (RFC3530 did not specify
what error to return).
For NFSv4.1, it is not obvious whether or not this is allowed by RFC5661,
but the NFSv4.1 server can handle this case without error.
This patch changes the NFSv4.1 (and NFSv4.2) server to handle multiple
uses of the same lock_owner in open_to_lock_owner so that it now correctly
interoperates with the Linux NFS client.
It also changes the error returned for NFSv4.0 to be NFSERR_BADSEQID.
Thanks go to Bjorn for diagnosing this and testing the patch.
He also provided a program that I could use to reproduce the problem.
Check for the only 32-bit MIPS ABIs we support, rather than !n64
There may be additional 64-bit ABIs supported, so use a positive check rather
than a negative check.
Suggested by: imp
MFC after: 1 week
Sponsored by: Juniper Networks, Inc
I had failed to notice that sgsendccb() was using cam_periph_mapmem()
and thus was not passing down user pointers directly to drivers. In
practice this broke requests submitted from userland.
Adjustments to includes for openzfs in _STANDALONE
Allow the necessary parts of systm.h to be visible in the _STANDALONE
environnment. Limit the reset to only being visible for _KERNEL
builds. Map KASSERT, etc to printf on failure in the bootloader until
we have more confidence things won't break and leave systems
unbootable. Eventually, this should map to a full panic in the
bootloader, but that also needs some enhancement to be more useful.
Original patch was against FreeBSD 12, and a test compile wasn't run against
head. md_tls_tcb_offset field was moved from mdthread to mdproc in the
meantime.
MFC after: 1 week
Sponsored by: Juniper Networks, Inc.
Dont let kernel and standalone both be defined at the same time
_KERNEL and _STANDALONE are different things. They cannot both be true
at the same time. If things that are normally visible only to _KERNEL
are needed for the _STANDALONE environment, you need to also make them
visible to _STANDALONE. Often times, this will be just a subset of the
required things for _KERNEL (eg global variables are but one example).
sys/cdefs.h is included by pretty much everything in both the loader
and the kernel, so is the ideal choke point.
ng_l2tp: Fix callout synchronization in the rexmit timeout handler
A received control packet may cause the transmit queue to be flushed, in
which case ng_l2tp_seq_recv_nr() cancels the transmit timeout handler.
The handler checks to see if it was cancelled before doing anything, but
did so before acquiring the node lock, so a small race window could
cause ng_l2tp_seq_rack_timeout() to attempt to flush an empty queue,
ultimately causing a null pointer dereference.
Summary:
Two bugs:
* Elf32_Auxinfo is broken, using pointers in the union, which are 64-bits not
32.
* freebsd32_sysarch() doesn't update the 'user local' register when handling
MIPS_SET_TLS, leading to a NULL pointer dereference in the 32-bit
application.
Refine locking inside of syscon driver.
In some cases, the syscon driver may be used by consumer requiring better
control about locking (ie. it may be used as registe file provider for clock
driver which needs locked access to multiple registers).
Add fine locking protocol methods together with bunch of helper functions
in syscon driver and implement this functionality in syscon_generic driver.
Correctly handle nodes compatible with "syscon", "simple-bus".
Syscon can also have child nodes that share a registration file with it.
To do this correctly, follow these steps:
- subclass syscon from simplebus and expose it if the node is also
"simple-bus" compatible.
- block simplebus probe for this compatible string, so it's priority
(bus pass) doesn't colide with syscon driver.
While I'm in, also block "syscon", "simple-mfd" for the same reason.
TCP: send full initial window when timestamps are in use
The fastpath in tcp_output tries to send out
full segments, and avoid sending partial segments by
comparing against the static t_maxseg variable.
That value does not consider tcp options like timestamps,
while the initial window calculation is using
the correct dynamic tcp_maxseg() function.
Due to this interaction, the last, full size segment
is considered too short and not sent out immediately.
Adjust ssthresh in after_idle to the maximum of
the prior ssthresh, or 3/4 of the prior cwnd. See
RFC2861 section 2 for an in depth explanation for
the rationale around this.
As newreno is the default "fall-through" reaction,
most tcp variants will benefit from this.
vi(1): Add URL to the vi/ex reference manual in the man page
Reported in PR 241985
The manual page references the "vi/ex reference manual" but there is no
information about where to find that document. Add a reference to the manual in
the SEE ALSO section since the Project hosts a copy of it[1].
Change sent upstream[2]
If D26158 gets reviewed and committed, we could close that PR.
pwm(8): fix potential duty overflow, use unsigneds for period and duty
For a long period value and the duty specified as a percentage,
there could be an overflow.
Using unsigned integers aligns the code with struct pwm_state and allows
to safely use periods up to 4 seconds where supported by drivers.
If a FUSE server returns FOPEN_DIRECT_IO in response to FUSE_OPEN, that
instructs the kernel to bypass the page cache for that file. This feature
is also known by libfuse's name: "direct_io".
However, when accessing a file via mmap, there is no possible way to bypass
the cache completely. This change fixes a deadlock that would happen when
an mmap'd write tried to invalidate a portion of the cache, wrongly assuming
that a write couldn't possibly come from cache if direct_io were set.
Arguably, we could instead disable mmap for files with FOPEN_DIRECT_IO set.
But allowing it is less likely to cause user complaints, and is more in
keeping with the spirit of open(2), where O_DIRECT instructs the kernel to
"reduce", not "eliminate" cache effects.
PR: 247276
Reported by: trapexit@spawn.link
Reviewed by: cem
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D26485