CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

pkgbase: Create a FreeBSD-utilities package and make it the default one

The default package use to be FreeBSD-runtime but it should only contain
binaries and libs enough to boot to single user and repair the system, it
is also very handy to have a package that can be tranform to a small mfsroot.
So create a new package named FreeBSD-utilities and make it the default one.
Also move a few binaries and lib into this package when it make sense.
Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21506

pkgbase: Add TAG for evdev and veriexec headers

Reviewed by: bapt
Differential Revision: https://reviews.freebsd.org/D21505

pkgbase: Put the sys/common test into the tests package

Every other test is there so do the same for those.

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21504

pkgbase: Put a lot of binaries and lib in FreeBSD-runtime

All of them are needed to be able to boot to single user and be able
to repair a existing FreeBSD installation so put them directly into
FreeBSD-runtime.

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21503

pkgbase: Put libbluetooth in the bluetooth package

It make sense to have everything bluetooth related in the same package.
Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21502

pkgbase: Move libcap_ to FreeBSD-runtime

A lot of binaries present in FreeBSD-runtime depend on it so move
the libs there.

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21501

pkgbase: Tag passwd related file to be in FreeBSD-runtime package.

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21500

pkgbase: Move rc scripts and related files to their own packages

It doesn't need to be in runtime and might help people who want to
experiment with other rc system or don't use one (like in small
embedded mfsroot).

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21499

pkgbase: Force zfs(8) and zpool(8) to be in the runtime package

Those commands are needed to repair a FreeBSD installation so add them
to the runtime package

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21498

pkgbase: lib80211 is needed by ifconfig(8) so put it in FreeBSD-runtime

Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21497

pkgbase: Move the bootloader related files to a new FreeBSD-bootloader package

Bootloader file isn't needed for jails so don't include it in FreeBSD-runtime.

Reviewed by: bapt, delphij, gjb
Differential Revision: https://reviews.freebsd.org/D21496

Decrease the default audio playback latency to a maximum of 21.3ms.
This significantly improves the audio playback response time.

Discussed with: mav@
MFC after: 1 week
Sponsored by: Mellanox Technologies

ficl: add uIsGreater word

For some reason we have u< but not u>, fix it.

patch(1): add some basic tests

Summary:
- basic: test application of patches created by diff -u at the
  beginning/middle/end of file, which have differing amounts of context
  before and after chunks being added
- limited_ctx: stems from PR 74127 in which a rogue line was getting added
  when the patch should have been rejected. Similar behavior was
  reproducible with larger contexts near the beginning/end of a file. See
  r326084 for details
- file_creation: patch sourced from /dev/null should create the file
- file_nodupe: said patch sourced from /dev/null shouldn't dupe the contents
  when re-applied (personal vendetta, WIP, see comment)
- file_removal: this follows from nodupe; the reverse of a patch sourced
  from /dev/null is most naturally deleting the file, as is expected based
  on GNU patch behavior (WIP)

sys/mount.h: Comment on distinction between vfs_{c,}mount

Hope to save someone else a little future effort in ugly duplicated code.

No functional change.

Delete the unused "nd" argument for nfsrv_checkdsattr().

The "nd" argument for nfsrv_checkdsattr() is no longer used by the function.
This patch deletes it. This allows subsequent patches to delete the "nd"
argument from nfsrv_proxyds(), since it's only use of "nd" was to pass it
to nfsrv_checkdsattr(). The same will then be true for nfsvno_getattr(),
which passes "nd" to nfsrv_proxyds().
Getting rid of the "nd" argument from nfsvno_getattr() avoids confusion
over why it might need "nd".

This patch is trivial and does not have any semantic effect.
Found by inspection while working on the NFSv4.2 server.

The efifat files are no longer used: remove the code to build them

Reviewed by: imp, tsoome, emaste
Differential Revision: https://reviews.freebsd.org/D20562

madvise(MADV_FREE): Quick fix to time rewind.

Don't free pages in a shadowing object. While this degrades MADV_FREE
to a no-op (and we could, instead, choose to fall back to
MADV_DONTNEED, at the cost of changing pmap_madvise), this is
presently considered a temporary fix. We may prefer to risk a little
fragmentation of the map by creating a zero/OBJT_DEFAULT entry over
top of the existing object and, simultaneously, revert to the existing
marking any pages in the former shadowing object in the advised region
as reclaimable. At least one consumer of MADV_FREE (snmalloc) may use
mmap() to construct zeroed pages "eventually" here anyway, so the
fragmentation may be coming anyway.

Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
PR: 240061
Reviewed by: markj
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21517

Support doorbell strides != 0.

The NVMe standard (1.4) states

>>> 8.6 Doorbell Stride for Software Emulation
>>> The doorbell stride,...is useful in software emulation of an NVM
>>> Express controller. ... For hardware implementations of the NVM
>>> Express interface, the expected doorbell stride value is 0h.

However, hardware in the wild exists with a doorbell stride of 1
(meaning 8 byte separation). This change supports that hardware, as
well as software emulators as envisioned in Section 8.6. Since this is
the fast path, care has been taken to make this computation
efficient. The bit of math to compute an offset for each is replaced
by a memory load from cache of a pre-computed value.

MFC After: 3 days
Reviewed by: scottl@
Differential Revision: https://reviews.freebsd.org/D21514

vfs: fully hold vnodes in vnlru_free_locked

Currently the code only bumps holdcnt and clears the VI_FREE flag, not
performing actual vhold. Since the vnode is still visible elsewhere, a
potential new user can find it and incorrectly assume it is properly held.

Use vholdl instead to correctly hold the vnode. Another place recycling
(vlrureclaim) does this already.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21522

Report the Host Buffer Memory minimum and preferred sizes.

The Host Buffer feature (NVMe 1.4 section 89) allows for the NVMe card
request the host provide it buffer for lookaside tables and maybe
other things. Report the card's minimum and preferred sizes with
nvmecontrol/camcontrol identify.

PROGS: Build common sources before recursed PROGS_TARGETS as well when building.

MFC after: 2 weeks
Sponsored by: DellEMC

Fix /proc/mounts for autofs(5) mounts.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

Improve debugging output.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

- correct HISTORY section
- while here clarify wording

PR: 240260 (based on)
Submitted by: gbergling@gmail.com
MFC after: after 1 week

procstat/tests: Fix flakiness by waiting for program to start

Some of the procstat tests start a program "while1" and examine the process
using procstat, but did not wait properly for it to start (kill -0 will
succeed immediately after the child process has been created).

Instead, have "while1" write something when it starts, and use a fifo to
wait for that.

PR: 233587, 233588
Reviewed by: ngie
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21519

Include dwgpio to the build.

Sponsored by: DARPA, AFRL

o Add support for multi-port instances of Synopsys DesignWare APB GPIO
Controller.
o Rename the driver to dwgpio.

Sponsored by: DARPA, AFRL

Back out r351799

empty does not appear to work like I thought it did and it actively breaks
real LOCAL_MODULES usage, of which I have none at the moment...

pseudofs: make readdir work without a pid again

Specifically, the following was broken:

$ mount -t procfs procfs /proc
$ ls -l /proc

r351741 reworked readdir slightly to avoid pfs_node/pidhash LOR, but
inadvertently regressed pid == NO_PID; new pfs_lookup_proc() fails for the
obvious reasons, and later pfs_visible_proc doesn't capture the
pid == NO_PID -> return 1 aspect of pfs_visible. We can infact skip this
whole block if we're operating on a directory w/ NO_PID, as it's always
visible.

Reported by: trasz
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D21518

bectl(8): implement sorting for 'bectl list' output

Allow 'bectl list' to sort output by a given property name. The property
name is passed in using a command-line flag, '-c' for ascending order and
'-C' for descending order. The properties allowed to sort by are:

- name (the default output, even if '-c' or '-C' are not used)
- creation
- origin
- used
- usedds
- usedsnap
- usedrefreserv

The default output for 'bectl list' is now ascending alphabetical order of
BE name.

To sort by creation time from earliest to latest, the command would be
'bectl list -c creation'

Submitted by: Rob Fairbanks <rob.fx907 gmail com>
Reviewed by: ler
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20818

mpsutil slot set status

This code has been written as a proof of concept, but I think that it
can be useful in general.  It allows to set the status of an enclosure
slot.  Practically, this means controlling whatever slot status LEDs the
enclosure provides.  At present, the new command does not have sanity
checks or any conveniences.  That means that it is possible to issue the
command for an invalid slot and an enclosure.  But the worst I have seen
happening is either the command failing or simply being ignored.  Also,
at the moment, the status has to be specified as a numeric bit mask.
The bit definitions can be found in sys/dev/mps/mpi/mpi2_init.h, they
are prefixed with MPI2_SEP_REQ_SLOTSTATUS_.  The only way to address a
slot is by the enclosure handle and the slot number.  Both are readily
available from mpsutil show commands.

So, future enhancements could include alternative ways to address a slot
(e.g., by a disk handle or a disk device name) and human friendly names
for slot statuses.

The new command is useful alternative to 'sas2ircu locate' command.
First, sas2ircu is a proprietary blob.  Second, it supports setting only
locate / identify status bit.

Tested on HP H220 running LSI IT firmware 20.x.

Reviewed by: bapt
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D20535

Adjust history, info source from v1's manuals
https://www.bell-labs.com/usr/dmr/www/1stEdman.html

MFC after: 5 days

shutdown_halt: make sure that watchdog timer is stopped

The point of halt is to keep the machine in limbo.

Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21222

ZFS: Always refuse receving non-resume stream when resume state exists

This fixes a hole in the situation where the resume state is left from
receiving a new dataset and, so, the state is set on the dataset itself
(as opposed to %recv child).

Additionally, distinguish incremental and resume streams in error
messages.

This was also committed to ZoL:
zfsonlinux/zfs@ebeb6f23bf7e8fe6732a05267ed1cab4c38d3b23

MFC after: 2 weeks
Sponsored by: CyberSecure

Correct overflow logic in fullpath().

Obtained from: OpenBSD
MFC after: 3 days

Fix the SACK block generation in the base TCP stack by bringing it in
sync with the RACK stack.

Reviewed by: rrs@
MFC after: 5 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21513

Fix some nits in pmap_page_array_startup().

- Use ptoa() instead of the archaic ctob().
- Use pagezero() to zero a PDP page.
- Remove PA_MIN_ADDRESS, orphaned by r351742.
- Remove unneeded parens and an unnecessary control flow statement.

Reported by: alc
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21495

LOCAL_MODULES: Allow LOCAL_MODULES="" in src.conf to work

Currently LOCAL_MODULES= works, but LOCAL_MODULES="" causes build errors as
.for still has the empty string to loop over. An .if empty prior to the loop
was considered, but LOCAL_MODULES has empty quotes at that point and thus,
isn't empty. A better solution likely exists, but this floats us by for
now...

Allow more nesting of GEOM partitioning schemes

GEOM is supposed to be topology-agnostic, but the GPT and BSD partition code
has arbitrary restrictions on nesting that are annoying in cases such as
running VMs on raw partitions (since the VM's partitioning scheme is not
visible to the host).

This patch adds sysctls to disable the restrictions except in the case of
BSD label (and similar) partitions with offset 0 (where we need to avoid
recursively recognizing the label).

Submitted by: Andrew Gierth
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21350

posixshm: start counting writeable mappings

r351650 switched posixshm to using OBJT_SWAP for shm_object

r351795 added support to the swap_pager for tracking writeable mappings

Take advantage of this and start tracking writeable mappings; fd sealing
will use this to reject a seal on writing with EBUSY if any such mapping
exist.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21456

vm pager: writemapping accounting for OBJT_SWAP

Currently writemapping accounting is only done for vnode_pager which does
some accounting on the underlying vnode.

Extend this to allow accounting to be possible for any of the pager types.
New pageops are added to update/release writecount that need to be
implemented for any pager wishing to do said accounting, and we implement
these methods now for both vnode_pager (unchanged) and swap_pager.

The primary motivation for this is to allow other systems with OBJT_SWAP
objects to check if their objects have any write mappings and reject
operations with EBUSY if so. posixshm will be the first to do so in order to
reject adding write seals to the shmfd if any writable mappings exist.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21456

Unbreak Linux binaries linked against new glibc, such as the ones
from recent Ubuntu versions. Without it they segfault on startup.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20687

Fix two TCP RACK issues:
* Convert the TCP delayed ACK timer from ms to ticks as required.
This fixes the timer on platforms with hz != 1000.
* Don't delay acknowledgements which report duplicate data using
DSACKs.

Reviewed by: rrs@
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21512

- Retire pc-sysinstall(8)

https://reviews.freebsd.org/D21094

Submitted by: kmoore@FreeBSD.org
Approved by: imp@FreeBSD.org

Add stackgap control mode to proccontrol(1).

PR: 239894
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21352

Add procctl(PROC_STACKGAP_CTL)

It allows a process to request that stack gap was not applied to its
stacks, retroactively. Also it is possible to control the gaps in the
process after exec.

PR: 239894
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21352

Use makefs -t msdos in make_esp_file

With this last piece in place, make -C /usr/src/release release.iso is
finally able to run in a jail. This was not possible before because
msdosfs cannot be mounted inside a jail.

Submitted by: ryan@ixsystems.com
Reviewed by: emaste@, imp@, gjb@
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D21385

Add conv=fsync flag to dd

The fsync flag performs an fsync(2) on the output file before closing it.
This will be useful for the ZFS test suite.

Submitted by: ryan@ixsystems.com
Reviewed by: jilles@, imp@
MFC after: 1 week
Sponsored by: iXsystems, Inc.

bsdgrep(1): add some basic tests for some GNU Extension support

These will be expanded later as I come up with good test cases; for now,
these seem to be enough to trigger bugs in base gnugrep and expose missing
features in bsdgrep.

Make linprocfs(4) report Tgid, Linux ltrace(1) needs it.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

vfs: implement usecount implying holdcnt

vnodes have 2 reference counts - holdcnt to keep the vnode itself from getting
freed and usecount to denote it is actively used.

Previously all operations bumping usecount would also bump holdcnt, which is
not necessary. We can detect if usecount is already > 1 (in which case holdcnt
is also > 1) and utilize it to avoid bumping holdcnt on our own. This saves
on atomic ops.

Reviewed by: kib
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21471

Implement nvme suspend / resume for pci attachment

When we suspend, we need to properly shutdown the NVME controller. The
controller may go into D3 state (or may have the power removed), and
to properly flush the metadata to non-volatile RAM, we must complete a
normal shutdown. This consists of deleting the I/O queues and setting
the shutodown bit. We have to do some extra stuff to make sure we
reset the software state of the queues as well.

On resume, we have to reset the card twice, for reasons described in
the attach funcion. Once we've done that, we can restart the card. If
any of this fails, we'll fail the NVMe card, just like we do when a
reset fails.

Set is_resetting for the duration of the suspend / resume. This keeps
the reset taskqueue from running a concurrent reset, and also is
needed to prevent any hw completions from queueing more I/O to the
card. Pass resetting flag to nvme_ctrlr_start. It doesn't need to get
that from the global state of the ctrlr. Wait for any pending reset to
finish. All queued I/O will get sent to the hardware as part of
nvme_ctrlr_start(), though the upper layers shouldn't send any
down. Disabling the qpairs is the other failsafe to ensure all I/O is
queued.

Rename nvme_ctrlr_destory_qpairs to nvme_ctrlr_delete_qpairs to avoid
confusion with all the other destroy functions. It just removes the
queues in hardware, while the other _destroy_ functions tear down
driver data structures.

Split parts of the hardware reset function up so that I can
do part of the reset in suspsend. Split out the software disabling
of the qpairs into nvme_ctrlr_disable_qpairs.

Finally, fix a couple of spelling errors in comments related to
this.

Relnotes: Yes
MFC After: 1 week
Reviewed by: scottl@ (prior version)
Differential Revision: https://reviews.freebsd.org/D21493

Revert a portion of r351628 that I did not mean to commit.

Reported by: mjg
MFC with: r351628

Add preliminary support for atomic updates of per-page queue state.

Queue operations on a page use the page lock when updating the page to
reflect the desired queue state, and the page queue lock when physically
enqueuing or dequeuing a page.  Multiple pages share a given page lock,
but queue state is per-page; this false sharing results in heavy lock
contention.

Take a small step towards the use of atomic_cmpset to synchronize
updates to per-page queue state by introducing vm_page_pqstate_cmpset()
and using it in the page daemon.  In the longer term the plan is to stop
using the page lock to protect page identity and rely only on the object
and page busy locks.  However, since the page daemon avoids acquiring
the object lock except when necessary, some synchronization with a
concurrent free of the page is required.  vm_page_pqstate_cmpset() can
be used to ensure that queue state updates are successful only if the
page is not scheduled for a dequeue, which is sufficient for the page
daemon.

Add vm_page_swapqueue(), which moves a page from one queue to another
using vm_page_pqstate_cmpset().  Use it in the active queue scan, which
does not use the object lock.  Modify vm_page_dequeue_deferred() to
use vm_page_pqstate_cmpset() as well.

Reviewed by: kib
Discussed with: jeff
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21257

Map the vm_page array into KVA on amd64.

r351198 allows the kernel to use domain-local memory to back the vm_page
array (up to 2MB boundaries) and reserves a separate PML4 entry for that
purpose. One consequence of that change is that the vm_page array is no
longer present in minidumps, which only adds pages mapped above
VM_MIN_KERNEL_ADDRESS.

To avoid the friction caused by having kernel data structures mapped
below VM_MIN_KERNEL_ADDRESS, map the vm_page array starting at
VM_MIN_KERNEL_ADDRESS instead of using a dedicated PML4 entry.

Reviewed by: kib
Discussed with: jeff
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21491

pseudofs: fix a LOR pfs_node vs pidhash (sleepable after non-sleepable)

Sponsored by: The FreeBSD Foundation

superio: fix the copyright block and update the year

MFC after: 2 weeks

Temporarily skip sys.sys.qmath_test.qdivq_s64q in CI because it is unstable

PR: 240219
Discussed with: trasz
Sponsored by: The FreeBSD Foundation

Add sysctlbyname system call

Previously userspace would issue one syscall to resolve the sysctl and then
another one to actually use it. Do it all in one trip.

Fallback is provided in case newer libc happens to be running on an older
kernel.

Submitted by: Pawel Biernacki
Reported by: kib, brooks
Differential Revision: https://reviews.freebsd.org/D17282

Add a sysctl to dump kernel mappings and their properties on amd64.

The sysctl is called vm.pmap.kernel_maps. It dumps address ranges
and their corresponding protection and mapping mode, as well as
counts of 2MB and 1GB pages in the range.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21380

Replace PMAP_LARGEMAP_MAX_ADDRESS() with a more general predicate.

No functional change intended.

Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation

This patch improves the DSACK handling to conform with RFC 2883.
The lowest SACK block is used when multiple Blocks would be elegible as
DSACK blocks ACK blocks get reordered - while maintaining the ordering of
SACK blocks not relevant in the DSACK context is maintained.

Reviewed by: rrs@, tuexen@
Obtained from: Richard Scheffenegger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21038

Fix the name of the devicetree bindings document file cited in the manpage.

Reported by: thj@

Bump Linux version to 3.2.0. Without it, binaries linked against
glibc 2.24 and up (eg Ubuntu 19.04) fail with "FATAL: kernel too old".

This alone is not enough to make newer binaries actually work;
fix/hack/workaround is pending review at https://reviews.freebsd.org/D20687.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20757

In nvme_completion_poll, add a sanity check to make sure that we complete the
polling within a second. Panic if we don't. All the commands that use this
interface should typically complete within a few tens to hundreds of
microseconds. Panic rather than return ETIMEDOUT because if the command somehow
does later complete, it will randomly corrupt memory. Also, it helps to get a
traceback from where the unexpected failure happens, rather than an infinite
loop.

In all the places that we use the polled for completion interface, except crash
dump support code, move the while loop into an inline function. These aren't
done in the fast path, so if the compiler choses to not inline, any performance
hit is tiny.

Add a brief comment explaining why we can return ETIMEDOUT from the call to the
polled interface. Normally this would have the potential to corrupt stack memory
because the completion routines would run after we return. In this case,
however, we're doing a dump so it's safe for reasons explained in the comment.

Relax compat.linux.osrelease checks. This way one can do eg
'compat.linux.osrelease=3.10.0-957.12.1.el7.x86_64', which
corresponds to CentOS 7.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20685

vfs: restore mp null check in vop_stdgetwritemount

The initially read mount point can already be NULL.

Reported by: markj
Fixes: r351656 ("vfs: stop refing freed mount points in vop_stdgetwritemount")
Sponsored by: The FreeBSD Foundation

LinuxKPI: Add sysfs create/remove functions that handles multiple files in one call.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: D21475

libc: Use musl's optimized memchr

Parentheses added to HASZERO macro to avoid a GCC warning.

Reviewed by: kib, mjg
Obtained from: musl (snapshot at commit 4d0a82170a)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17631

libutil: remove SIGSYS handling from setusercontext

It was a workaround for cases where the kernel lacks setloginclass(2),
added in the 9.x era.

Submitted by: Pawel Biernacki

Belatedly bump __FreeBSD_version for r351659, gets(3) removal

Reported by: linimon

proc: clear pid bitmap entry after dropping proctree lock

There is no correctness change here, but the procid lock is contended in
the fork path and taking it while holding proctree avoidably extends its
hold time.

Note that there are other ids which can end up getting cleared with the
lock.

Sponsored by: The FreeBSD Foundation

loader.efi: use and prefer coninex interface

Add support for EFI_SIMPLE_TEXT_INPUT_EX_PROTOCOL.

loader.efi: some systems do not translate scan code 0x8 to backspace

Add scancode translation for backspace.

Use DEVICE memory instead of UNCACHEABLE on aarch64 in ioremap() in the LinuxKPI.
This fixes system hangs on reading device registers on aarch64.

Tested with: Marvell MACCHIATObin (Armada8k) + mlx4en, amdgpu
Submitted by: Greg V <greg@unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D20789
MFC after: 1 week
Sponsored by: Mellanox Technologies

Fix regression issue after r351616. Make sure the mbuf queue gets initialized.

Found by: gonzo@
MFC after: 1 week
Sponsored by: Mellanox Technologies

Remove remnants of optimization for > pagesize allocations.

In the past, this allocator seems to have allocated things larger than
a page seperately. Much of this code was removed at some point (perhaps
along with sbrk() used) so remove the rest. Instead, keep allocating in
power-of-two bins up to FIRST_BUCKET_SIZE << (NBUCKETS - 1). If we want
something more efficent, we should use a fancier allocator.

While here, remove some vestages of sbrk() use. Most importantly, don't
try to page align the pagepool since it's always page aligned by mmap().

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D21453

mips: fix some mcount nits

The symbol version for _mcount was removed 12 years ago in r169525 from
gmon/Symbol.map, to be added to the per-arch Symbol.map. mips was overlooked
in this, so _mcount has no symver. Add it back to where it should have been,
rather than where it would go if it were added today, since we're correcting
a historical mistake.

Additionally, _mcount is getting thrown into .mdebug.abi32 in the llvm80/90
world as it's not getting explicitly thrown into .text, so do this now. This
fixes the libc build that was previously failing due to relocations in
.mdebug.abi32. This is specifically due to the way clang's integrated AS
works and that they emit the .mdebug.abiNN section early in the process. An
LLVM bug has been submitted[0] and an agreement has been made that the
mips backend should switch to .text following .mdebug.abiNN for
compatibility.

[0] https://bugs.llvm.org/show_bug.cgi?id=43119

Reviewed by: imp, arichardson
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21435

Extend uma_reclaim() to permit different reclamation targets.

The page daemon periodically invokes uma_reclaim() to reclaim cached
items from each zone when the system is under memory pressure.  This
is important since the size of these caches is unbounded by default.
However it also results in bursts of high latency when allocating from
heavily used zones as threads miss in the per-CPU caches and must
access the keg in order to allocate new items.

With r340405 we maintain an estimate of each zone's usage of its
(per-NUMA domain) cache of full buckets.  Start making use of this
estimate to avoid reclaiming the entire cache when under memory
pressure.  In particular, introduce TRIM, DRAIN and DRAIN_CPU
verbs for uma_reclaim() and uma_zone_reclaim().  When trimming, only
items in excess of the estimate are reclaimed.  Draining a zone
reclaims all of the cached full buckets (the previous behaviour of
uma_reclaim()), and may further drain the per-CPU caches in extreme
cases.

Now, when under memory pressure, the page daemon will trim zones
rather than draining them.  As a result, heavily used zones do not incur
bursts of bucket cache misses following reclamation, but large, unused
caches will be reclaimed as before.

Reviewed by: jeff
Tested by: pho (an earlier version)
MFC after: 2 months
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D16667

Restrict the input domain set in cpuset_setdomain(2) to all_domains.

To permit larger values of MAXMEMDOM, which is currently 8 on amd64,
cpuset_setdomain(2) accepts a mask of size 256. In the kernel, domain
set masks are 64 bits wide, but can only represent a set of MAXMEMDOM
domains due to the use of the ds_order table.

Domain sets passed to cpuset_setdomain(2) are restricted to a subset
of their parent set, which is typically the root set, but before this
happens we modify the input set to exclude empty domains.
domainset_empty_vm() and other code which manipulates domain sets
expect the mask to be a subset of all_domains, so enforce that when
performing validation of cpuset_setdomain(2) parameters.

Reported and tested by: pho
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21477

Fix an off-by-one bug in the CPU and domain ID parser.

The "size" parameter is the size of the corresponding bit set, so the
maximum CPU or domain index is size - 1.

MFC after: 1 week

Move CAS check in powerpc64 ofw loader until after the PVR check.

This unbreaks using the powerpc64 loader on a 32-bit processor.

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21297

makefs: share msdosfsmount.h between kernel msdosfs and makefs

Sponsored by: The FreeBSD Foundation

vnic: correct and simplify SIOCSIFFLAGS

PR: 223573, 223575
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D13028

ar: use more correct size_t type for loop index

Submitted by: cem
MFC after: 1 week

lldb: shorten thread names to make logs easier to follow

lldb prepends the thread name to log entries, and the existing thread
name for the FreeBSD ProcessMonitor thread was longer than the kernel's
supported thread name length, and so was truncated. This made logs hard
to read, as the truncated thread name ran into the log message. Shorten
"lldb.process.freebsd.operation" to just "freebsd.op" so that logs are
more readable.

(Upstreaming to lldb still to be done).

Remove CLANG_NO_IAS definition

CLANG_NO_IAS is not used anywhere in the tree.

Sponsored by: The FreeBSD Foundation

libstdc++: remove gets

Removed from libc in r351659

libc: remove gets

gets is unsafe and shouldn't be used (for many years now). Leave it in
the existing symbol version so anything that previously linked aginst it
still runs, but do not allow new software to link against it.

(The compatability/legacy implementation must not be static so that
the symbol and in particular the compat sym gets@FBSD_1.0 make it
into libc.)

PR: 222796 (exp-run)
Reported by: Paul Vixie
Reviewed by: allanjude, cy, eadler, gnn, jhb, kib, ngie (some earlier)
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12298

Add myself (fox) and update the mentor information.

Approved by: philip (mentor)

netmap: import changes from upstream (SHA 137f537eae513)

- Rework option processing.
- Use larger integers for memory size values in the
memory management code.

MFC after: 2 weeks

vfs: stop refing freed mount points in vop_stdgetwritemount

The code used blindly ref based on an unsafely red address and then would
backpedal if necessary. This was safe in terms of memory access since
mounts are type-stable, but made for a potential a bug where the mount
was reused and had the count reset to 0 before this code decreased it.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21411

Fix initialization of top_fsn.

MFC after: 3 days

Improve the handling of state cookie parameters in INIT-ACK chunks.
This fixes problem with parameters indicating a zero length or partial
parameters after an unknown parameter indicating to stop processing. It
also fixes a problem with state cookie parameters after unknown
parametes indicating to stop porcessing.
Thanks to Mark Wodrich from Google for finding two of these issues
by fuzz testing the userland stack and reporting them in
https://github.com/sctplab/usrsctp/issues/355
and
https://github.com/sctplab/usrsctp/issues/352

MFC after: 3 days

Add support for TP-Link Archer T2U Nano.

MFC after: 2 weeks

nullfs: reduce areas protected by vnode interlock in null_lock

Similarly to the other routine stop taking the interlock for the lower
vnode. The interlock for nullfs vnode is still taken to ensure
stability of ->v_data.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21480

posixshm: switch to OBJT_SWAP in advance of other changes

Future changes to posixshm will start tracking writeable mappings in order
to support file sealing. Tracking writeable mappings for an OBJT_DEFAULT
object is complicated as it may be swapped out and converted to an
OBJT_SWAP. One may generically add this tracking for vm_object, but this is
difficult to do without increasing memory footprint of vm_object and blowing
up memory usage by a significant amount.

On the other hand, the swap pager can be expanded to track writeable
mappings without increasing vm_object size. This change is currently in
D21456. Switch over to OBJT_SWAP in advance of the other changes to the
swap pager and posixshm.