kib [Fri, 4 Jan 2019 19:10:46 +0000 (19:10 +0000)]
Fix i386 LINT build after r342769.
It seems that libkern/mcount.c is the only consumer of vm/pmap.h that
does not include machine/atomic.h. Make it work by bringing
machine/atomic.h when pmap.h is used for kernel non-asm .c file.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
gallatin [Fri, 4 Jan 2019 18:38:27 +0000 (18:38 +0000)]
Limit git history searches in newvers.sh
newvers.sh takes upwards of 4-5 seconds to complete on trees checked
out from github, due to searching the entire history for non-existent
git-svn metadata. Similarly, if one does not check out notes, we
again search the entire history for notes. That makes newvers.sh very
slow for many github users.
To fix this in a fair way, limit the history search to the last 10K
commits: if you're more than 10K commits out of sync, then you've
forked the project, and our SVN rev is no longer very important to you.
Due to how git implements --grep in conjunction with -n, --grep has been
removed for performance reasons (git does not seem to limit its search
to the -n limit in this case, and takes just as long as it did with no
limit).
emaste [Fri, 4 Jan 2019 18:35:25 +0000 (18:35 +0000)]
Add explicit csu test dependency
lib/csu/tests/dynamiclib requires libh_csu.so be built first. I'm not
sure this is the most correct/best way to address this but it solves
the issue in my testing.
cem [Fri, 4 Jan 2019 18:31:17 +0000 (18:31 +0000)]
Expose threads-per-core and physical core count information
With new sysctls (to the best of our ability do detect them). Restructured
smp.4 slightly for clarity (keep relevant stuff closer to the top) while
documenting.
kib [Fri, 4 Jan 2019 17:33:07 +0000 (17:33 +0000)]
i386: Use atomic 64bit load to read PDE value from PAE pagetables in
pmap_kextract().
pmap_kextract() can race with promotion/demotion on the kernel page
table, in which case current non-atomic 64bit read would see torn
value, breaking pmap_kextract(). pmap_kextract() would correctly
handle either promoted or demoted PDE, but not a mix where one word
is from a different state.
It requires PAE and > 4G memory to reproduce. We observed this in
real loads, both for intensive use of malloc(9)/free(9) where
vtoslab() returned invalid pointer to the slab, and with the use of
busdma_bounce, where incorrect page was bounced.
In collaboration with: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18714
markj [Fri, 4 Jan 2019 17:31:50 +0000 (17:31 +0000)]
Support MSG_DONTWAIT in send*(2).
As it does for recv*(2), MSG_DONTWAIT indicates that the call should
not block, returning EAGAIN instead. Linux and OpenBSD both implement
this, so the change makes porting easier, especially since we do not
return EINVAL or so when unrecognized flags are specified.
markj [Fri, 4 Jan 2019 17:14:50 +0000 (17:14 +0000)]
Don't enable interrupts in init_secondary().
The MI kernel assumes that interrupts will not be enabled on APs until
after the first context switch. In particular, the problem was causing
occasional deadlocks during boot.
Remove an unneeded intr_disable() added in r335005.
Reviewed by: jhb (previous version)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18738
emaste [Fri, 4 Jan 2019 16:47:35 +0000 (16:47 +0000)]
newvers: retire p4 version support
Perforce no longer offers a FreeBSD client and it not a viable VCS for
FreeBSD development. Remove p4 version logic to simplify newvers.sh in
advance of other changes.
chuck [Fri, 4 Jan 2019 15:03:35 +0000 (15:03 +0000)]
Fix bhyve's NVMe Completion Queue entry values
The function which processes Admin commands was not returning the
Command Specific value in Completion Queue Entry, Dword 0 (CDW0). This
effects commands such as Set Features, Number of Queues which returns
the number of queues supported by the device in CDW0. In this case, the
host will only create 1 queue pair (Number of Queues is zero based).
This also masked a bug in the queue counting logic.
chuck [Fri, 4 Jan 2019 15:03:30 +0000 (15:03 +0000)]
Fix bhyve's NVMe queue bookkeeping
Many size / length parameters in NVMe are "0's based", meaning, a value
of 0x0 represents 1, 0x1 represents 2, etc.. While this leads to an
efficient encoding, it can lead to subtle bugs. With respect to queues,
these parameters include:
- Maximum number of queue entries
- Maximum number of queues
- Number of Completion Queues
- Number of Submission Queues
To be consistent, convert all 0's based values from the host to 1's
based value internally. Likewise, covert internal 1's based values to
0's based values when returned to the host. This fixes an off-by-one bug
when creating IO queues and simplifies some of the code. Note that this
bug is masked by another bug.
While in the neighborhood,
- fix an erroneous queue ID check (checking CQ count when deleting SQ)
- check for queue ID of 0x0 in a few places where this is illegal
- clean up the Set Features, Number of Queues command and check for
illegal values
emaste [Fri, 4 Jan 2019 14:42:36 +0000 (14:42 +0000)]
newvers: avoid clearing svn revision information with nested VCS dirs
Consider the case where FreeBSD is checked out via Subversion with a
(perhaps unrelated) .git or .hg directory at a higher level - for
example,
.../.git
.../src/freebsd
Previously newvers obtained the SVN revision information via svnversion,
and then tried to obtain the SVN revision corresponding to the git or hg
commit, overwriting the existing information.
As a short term fix use a different variable for hg-svn or git-svn
information, setting $svn from hg or git info only if not empty.
Reported by: Matthias Apitz
Sponsored by: The FreeBSD Foundation
kevans [Fri, 4 Jan 2019 03:13:24 +0000 (03:13 +0000)]
getopt_long(3): fix case of malformed long opt
When presented with an arg string like '-l-', getopt_long will successfully
parse out the 'l' short option, then proceed to match '--' against the first
longopts entry as it later does a strncmp with len=0. This latter bit is
arguably another bug in itself, but presumably not a practical issue as all
callers of parse_long_options are already doing the right thing (except this
one pointed out).
An opt string like '-l-' should be considered malformed and throw a bad
argument rather than behaving as if '--' were passed. It cannot possibly do
what the invoker expects, and it's probably the result of a typo (ls -l- a)
rather than any intent.
mmacy [Thu, 3 Jan 2019 22:49:11 +0000 (22:49 +0000)]
zfsboot: support newer ZFS versions
declare v3 objset size/layout to fix userboot and possibly other loader issues
- fix for userboot assertion failure in zfs_dev_close in free due to out of bounds write
- fix for zfs_alloc / zfs_free mismatch assertion failure when booting GPT on BIOS
markj [Thu, 3 Jan 2019 16:21:44 +0000 (16:21 +0000)]
Fix some issues with the riscv pmap_protect() implementation.
- Handle VM_PROT_EXECUTE.
- Clear PTE_D and mark the page dirty when removing write access
from a mapping.
- Atomically clear PTE_W to avoid clobbering a hardware PTE update.
Reviewed by: jhb, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18719
avos [Wed, 2 Jan 2019 18:30:22 +0000 (18:30 +0000)]
rtwn(4): refresh manpages.
- Add 'device rtwn' to rtwn_pci(4) and rtwn_usb(4) config sample;
kernel will not compile otherwise.
- Refresh devices list in rtwn_usb(4); add 'chipset' column.
- Bump Dd after this commit and r342682.
markj [Wed, 2 Jan 2019 17:09:35 +0000 (17:09 +0000)]
Capsicumize savecore(8).
- Use cap_fileargs(3) to open dump devices after entering capability
mode, and use cap_syslog(3) to log messages.
- Use a relative directory fd to open output files.
- Use zdopen(3) to compress kernel dumps in capability mode.
Reviewed by: cem, oshogbo
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18458
markj [Wed, 2 Jan 2019 15:52:16 +0000 (15:52 +0000)]
Use g_handleattr() to reply to GEOM::candelete queries.
g_handleattr() fills out bp->bio_completed; otherwise, g_getattr()
returns an error in response to the query. This caused BIO_DELETE
support to not be propagated through stacked configurations, e.g.,
a gconcat of gmirror volumes would not handle BIO_DELETE even when
the gmirrors do. g_io_getattr() was not affected by the problem.
markj [Wed, 2 Jan 2019 15:36:35 +0000 (15:36 +0000)]
Avoid setting PG_U unconditionally in pmap_enter_quick_locked().
This KPI may in principle be used to create kernel mappings, in which
case we certainly should not be setting PG_U. In any case, PG_U must be
set on all layers in the page tables to grant user mode access, and we
were only setting it on leaf entries. Thus, this change should have no
functional impact.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
We need to subtract the TLS_TCB_SIZE to get to the real data pointer, since
r13 points to the end of the TCB structure. Prior to this, devel/protobuf-c
port broke with recent update to devel/protobuf, which exposed this issue.
cem [Tue, 1 Jan 2019 19:56:49 +0000 (19:56 +0000)]
linuxkpi: Remove extraneous NULL check on M_WAITOK allocation
The check was not introduced in r342628, but the subsequent unchecked access to
refs was added then, prompting a Coverity warning about "Null pointer
dereferences (FORWARD_NULL)." The warning is bogus due to M_WAITOK, but so is
the NULL check that hints it, so just remove it.
vmaffione [Mon, 31 Dec 2018 11:17:58 +0000 (11:17 +0000)]
netmap: add suite of unit tests
Import the unit tests from upstream (https://github.com/luigirizzo/netmap ba02539859d46d33), and make them ready for use with Kyua.
There are currently 38 regression tests, which test the kernel control ABI
exposed by netmap to userspace applications:
1: test for port info get
2-5: tests for basic port registration
6-9: tests for VALE
10-11: tests for getting netmap allocator info
12-15: tests for netmap pipes
16: test on polling mode
17-18: tests on options
19-27: tests for sync-kloop subsystem
28-39: tests for null ports
31-38: tests for the legacy NIOCREGIF registers
ian [Mon, 31 Dec 2018 01:09:23 +0000 (01:09 +0000)]
When allocating a new keyboard at vt_upgrade() time, unwind any cngrabs
done on the old keyboard and then do the corresponding number of grabs
on the new keyboard.
This fixes a race that can leave the system with a non-functioning
keyboard. It goes like this...
- The bios claims there is an AT keyboard, atkbd attaches.
- SI_SUB_INT_CONFIG_HOOKS runs.
- USB probes devices. Devices begin attaching, including disks.
- GELI prompts for a password for a just-attached disk, which results
in a cngrab() while atkbd is the keyboard.
- A USB keyboard attaches.
- vt_upgrade() runs and switches the keyboard to the new USB keyboard,
but because cngrab was never called for it, it's not activated and
keystrokes are ignored.
- Now there is no functional keyboard and no way to get one; even
plugging in a different USB keyboard doesn't help, because the console
is still grabbed, still waiting for a GELI pw.
bcran [Mon, 31 Dec 2018 00:20:58 +0000 (00:20 +0000)]
Fix ESP generation when using a gmirror, and when booting from RO medium
When using a gmirror, entries in /dev can be removed. So instead of using
kern.disks, get the list of disks from "gpart status -sg" instead.
We assume that any 'efi' partition that can't be mounted as msdosfs should
be used as an ESP. However, the ESP on the CD/DVD can't be mounted read-write
and so was being treated as if unformatted. Try the mount as read-only
instead, to catch cases like this.
marius [Sun, 30 Dec 2018 23:08:06 +0000 (23:08 +0000)]
o Don't allocate resources for SDMA in sdhci(4) if the controller or the
front-end doesn't support SDMA or the latter implements a platform-
specific transfer method instead. While at it, factor out allocation
and freeing of SDMA resources to sdhci_dma_{alloc,free}() in order to
keep the code more readable when adding support for ADMA variants.
o Base the size of the SDMA bounce buffer on MAXPHYS up to the maximum
of 512 KiB instead of using a fixed 4-KiB-buffer. With the default
MAXPHYS of 128 KiB and depending on the controller and medium, this
reduces the number of SDHCI interrupts by a factor of ~16 to ~32 on
sequential reads while an increase of throughput of up to ~84 % was
seen.
Front-ends for broken controllers that only support an SDMA buffer
boundary of a specific size may set SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY
and supply a size via struct sdhci_slot. According to Linux, only
Qualcomm MSM-type SDHCI controllers are affected by this, though.
Requested by: Shreyank Amartya (unconditional bump to 512 KiB)
o Introduce a SDHCI_DEPEND macro for specifying the dependency of the
front-end modules on the sdhci(4) one and bump the module version
of sdhci(4) to 2 via an also newly introduced SDHCI_VERSION in order
to ensure that all components are in sync WRT struct sdhci_slot.
o In sdhci(4):
- Make pointers const were applicable,
- replace a few device_printf(9) calls with slot_printf() for
consistency, and
- sync some local functions with their prototypes WRT static.
delphij [Sun, 30 Dec 2018 23:04:02 +0000 (23:04 +0000)]
Fix various issues with Chinese locales:
- Change short weekday names to use only one Chinese character.
(note: this is a somewhat misunfortunate compromise due to the fact
that some applications are using short buffer for weekday names,
and in ~1905 when 星期 system was created to replace the traditional
七曜 system, which can use 日月火水木金土 to represent Sunday through
Saturday with just one character without any confusion).
- for zh_CN locales: use Arabic numerals for month names, matching the
practice of all other CJK locales
- Regenerate zh_CN.{GB2312,GBK} locales from zh_CN.UTF-8.
kib [Sun, 30 Dec 2018 15:46:45 +0000 (15:46 +0000)]
Fix linux_destroy_dev() behaviour when there are still files open from
the destroying cdev.
Currently linux_destroy_dev() waits for the reference count on the
linux cdev to drain, and each open file hold the reference.
Practically it means that linux_destroy_dev() is blocked until all
userspace processes that have the cdev open, exit. FreeBSD devfs does
not have such problem, because device refcount only prevents freeing
of the cdev memory, and separate 'active methods' counter blocks
destroy_dev() until all threads leave the cdevsw methods. After that,
attempts to enter cdevsw methods are refused with an error.
Implement somewhat similar mechanism for LinuxKPI cdevs. Demote cdev
refcount to only mean a hold on the linux cdev memory. Add sirefs
count to track both number of threads inside the cdev methods, and for
single-bit indicator that cdev is being destroyed. In the later case,
the call is redirected to the dummy cdev.
tsoome [Sun, 30 Dec 2018 09:35:47 +0000 (09:35 +0000)]
loader: create bio_alloc and bio_free for bios bounce buffer
We do have 16KB buffer space defined in pxe.c, move it to bio.c and implement
bio_alloc()/bio_free() interface to make it possible to use this space for
other BIOS calls (notably, from biosdisk.c).
mckusick [Sun, 30 Dec 2018 05:03:41 +0000 (05:03 +0000)]
For consistency with FFS2's fifoops2 and both versions of FFS's
vnodeops make FFS1's fifoops1 use ffs_lock. Also delete ffs_reallocblks
from fifoops1 which is needed only for fifoops2 because of its
support for extended attributes that need to allocate blocks.
cy [Sun, 30 Dec 2018 04:25:48 +0000 (04:25 +0000)]
TCP_PAWS_IDLE is does not exist in NetBSD and illumos. In FreeBSD
TCP_PAWS_IDLE is defined in netinet/tcp_seq.h, however this header
isn't included explicitly or implicitly at this point therefore
as far ipfilter is concerned TCP_PAWS_IDLE is not defined. Remove
the #ifdef and include netinet/tcp.h unconditionally.