Update crashinfo to work with newer gdb from ports.
If gdb from ports is installed, use it instead of the base system gdb
to extract variables from a kernel. Note that base gdb and ports gdb
do not support the same options for invoking a single command in batch
mode, so a wrapper shell function is used. In addition, prefer kgdb
from ports when generating a backtrace if present.
andrew [Wed, 20 Jul 2016 17:19:47 +0000 (17:19 +0000)]
We will be switching to a new arm64 uart cpu driver that handles both FDT
and ACPI. As such pull out what will be the common parts of the FDT cpu
detection to a new function that can be shared between them.
Reviewed by: manu
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7262
cem [Wed, 20 Jul 2016 16:59:36 +0000 (16:59 +0000)]
Extend ELF coredump to support more than 65535 segments
The ELF e_phnum field is only 16 bits wide. To support more than 65535 segments
(program headers), Sun's "Linker and Libraries Guide" table 7-7 (or 12-7,
depending on document version) prescribes a special first section header where
sh_info represents the real number of program headers.
Ensure that the UFS directory vnode' vm_object is properly sized
before UFS_BALLOC() is called. I do not believe that this caused any
real issue on FreeBSD because the exclusive vnode lock is held over
the balloc/resize, the change is to make formally correct KPI use.
Based on: the Matthew Dillon' patch from DragonFly BSD
PR: 93942
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
manu [Wed, 20 Jul 2016 11:23:06 +0000 (11:23 +0000)]
Add support for the SID (Security ID Module) on Allwinner A10 and A20.
The rootkey is burnt at production and can't be changed, thus is can be used
as a device unique ID or to generate a MAC address (This is was u-boot does).
The rootkey is exposed as a sysctl (dev.aw_sid.<unit>.rootkey).
https://www.illumos.org/issues/7164
If the pool/dataset command-line argument is specified with a trailing
slash, for example, "tank/", we should interpret it as the topmost
dataset (rather than the whole pool)
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Tim Chase <tim@chase2k.com>
PR: 204661
MFC after: 1 week
Relnotes: yes
https://www.illumos.org/issues/6391
When using zdb with non-default SPA config file it is not convenient
to add -U <non-default-config-file-path> all the time. This commit
introduces support for setting/overriding SPA config location via
environment variable 'SPA_CONFIG_PATH'.
If -U flag is specified in the command line it will override any other
value as usual.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Yao <ryao@gentoo.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Cyril Plisko <cyril.plisko@mountall.com>
MFC after: 1 week
Fix the following panic seen when migrating a FreeBSD guest on Xen:
panic: mtx_lock() of destroyed mutex @ /usr/src/sys/dev/fb/vesa.c:541
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001d2fa4f0
vpanic() at vpanic+0x182/frame 0xfffffe001d2fa570
kassert_panic() at kassert_panic+0x126/frame 0xfffffe001d2fa5e0
__mtx_lock_flags() at __mtx_lock_flags+0x15b/frame 0xfffffe001d2fa630
vesa_bios_save_restore() at vesa_bios_save_restore+0x78/frame 0xfffffe001d2fa680
vga_suspend() at vga_suspend+0xa3/frame 0xfffffe001d2fa6b0
isavga_suspend() at isavga_suspend+0x1d/frame 0xfffffe001d2fa6d0
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame
[...]
This is caused because vga_sub_configure (which is called if the VGA adapter
is attached after VESA tried to initialize), points to vesa_configure, which
doesn't initialize the VESA mutex. In order to fix it, make sure
vga_sub_configure points to vesa_load, so that all the needed vesa
components are properly initialized.
Sponsored by: Citrix Systems R&D
MFC after: 3 days
PR: 209203
Reviewed by: dumbbell
Differential revision: https://reviews.freebsd.org/D7196
1) Per POSIX (and glibc) GLOB_NOCHECK should return original pattern,
unmodified, if no matches found. But our original code strips all '\'
returning it. Rewrite the code to allow to reconstruct exact the
original pattern with backslashes for this case.
2) Prevent to use truncated pattern if MAXPATHLEN exceeded, return
GLOB_NOMATCH instead.
3) Fix few end loop conditions filling Char arrays with mbrtowc(),
MB_CUR_MAX is unneeded in two places and condition is less by one
in other place.
4) Prevent to use truncated filenames match if MAXPATHLEN exceeded,
skip such directory entries.
5) Don't end *pathend with L'/' in glob3() if limit is reached, this
change will be not visible since error is returned.
6) If error happens in (*readdirfunc)(), do the same GLOB_ABORTED
processing as for g_opendir() as POSIX requires.
_Unwind_Exception is required to be double word aligned. GCC has
interpreted this to mean "use the maximum useful alignment for the
target" so follow that lead.
Release the second critical section in uma_zfree_arg() slightly earlier.
It is only needed when removing a full bucket from the per-CPU cache. The
bucket cache (uz_buckets) is protected by the zone mutex and thus the
critical section can be released before inserting into that list.
Merge {amd64,i386}/instr_size.c into x86_instr_size.c.
Also reduce the diff between us and upstream: the input data model will
always be DATAMODEL_NATIVE because of a bug (p_model is never set but is
always initialized to 0), so we don't need to override the caller anyway.
This change is also necessary to support the pid provider for 32-bit
processes on amd64.
libc: tag the Rune initialization function prototypes visibility as hidden.
It is good practice to export as few symbols as possible from your shared
libraries, so use the GCC visibility attribute in this case, matching what
Apple's libc does.
Supporting flushing the dump before returning, and simplify/combine the
logic. Switch to a 5us delay since most NVME devices can easily do 200,000
iops.
Submitted by: imp
MFC after: 3 days
Sponsored by: Netflix, Inc.
makefs: sync NetBSD IDs with upstream for changes that we already have
May 22 21:51:39 2011 +0000 (christos):
From Nathan Whitehorn (nwhitehorn at freebsd dot org):
Add code to generate bootable ISOs on Powermac and CHRP systems.
Synthesize some partition maps (APM and MBR, respectively) pointing
to (a) the whole disk, and (b) relevant El Torito boot images that
have been added by other code. These partition maps are a little
bit funny looking, but they seem to work. FreeBSD has been using
this successfully in their release generation on powerpc, as well
as generating all non-SPARC install media. SPARC support could
probably be added as an extension of this patch.
makefs.8 1.33
Tue Aug 23 17:09:11 2011 +0000 (christos):
PR/45285: Martin Matuska: makefs does not properly convert ISO level 1 and 2
filenames (buffer overflow)
makefs does not properly verify the maximum filename length in the
special "." case for both ISO level 1 and ISO level 2 filename
conversion. This creates broken images or causes a buffer overflow
(ISO level 2).
ISO level 1:
If a filename contains only dots or up to 8 characters followed by
dots the 8+3 limit check doesn't work.
ISO level 2:
If a filename contains a dot in the first 30 characters and a dot
on the 30th character, the length limit check doesn't work and the
buffer is overflowed.
This reverts out Gleb's changes and adds three small
fixes that I think closes up the races Gleb was
looking for. This is running quite nicely in Netflix and
now no longer causes TCP-tcb leaks.
Re-order `usage' alphabetically;
rename option arguments in the manpage's `SYNOPSIS' section to
match those from `usage' (not the other way around; the `usage'-line
(and other parts of makefs.c) contain the correct names);
minor punctuation improvements.
Random bit generator (RBG) driver for RPi and RPi2.
Summary:
This driver supports the following methods to trigger gathering random bits from the hardware:
1. interrupt when the FIFO is full (default) fed into the harvest queue
2. callout (when BCM2835_RNG_USE_CALLOUT is defined) every second if hz is less than 100, otherwise hz / 100, feeding the random bits into the harvest queue
If the kernel is booted with verbose enabled, the contents of the registers will be dumped after the RBG is started during the attach routine.
Author: hackagadget_gmail.com (Stephen J. Kiernan)
Test Plan: Built RPI2 kernel and booted on board. Tested the different methods to feed the harvest queue (callout, interrupt) and the interrupt driven approach seems best. However, keeping the other method for people to be able to experiment with.
Include makewhatis in ITOOLS when MK_MAN_UTILS is true
Previously it was conditional on MK_MAN. It's possible to build
FreeBSD with man pages but without man page tools. MK_MAN_UTILS
is the conditional used in share/man/Makefile for determining whether
makewhatis is executed at install time, so it is the proper one for
ITOOLS as well.
clang++: Always use --eh-frame-hdr on FreeBSD, even for -static
FreeBSD uses LLVM's libunwind on FreeBSD/arm64 today (and we expect to
use it more widely in the future) and it requires the EH frame segment
in static binaries.
Reviewed by: dim
Obtained from: Clang commit r266123
MFC after: 3 days
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7250
andrew [Tue, 19 Jul 2016 16:02:07 +0000 (16:02 +0000)]
Add missing flags from acpidump. These are defined in the header, but not
printed. The HW_REDUCED flag is useful as it should be set on arm64 to
comply with the ARM Server Base Boot Requirements.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Use g_resize_provider() to change the size of GEOM_DISK provider,
when it is being opened. This should fix the possible loss of a resize
event when disk capacity changed.
PR: 211028
Reported by: Dexuan Cui <decui at microsoft dot com>
MFC after: 3 weeks
The keep-state, limit and check-state now will have additional argument
flowname. This flowname will be assigned to dynamic rule by keep-state
or limit opcode. And then can be matched by check-state opcode or
O_PROBE_STATE internal opcode. To reduce possible breakage and to maximize
compatibility with old rulesets default flowname introduced.
It will be assigned to the rules when user has omitted state name in
keep-state and check-state opcodes. Also if name is ambiguous (can be
evaluated as rule opcode) it will be replaced to default.
llvm-libunwind: use conventional (non-Darwin) X86 register numbers
For historical reasons Darwin/i386 has ebp and esp swapped in the
eh_frame register numbering. That is:
Darwin Other
Reg # eh_frame eh_frame DWARF
===== ======== ======== =====
4 ebp esp esp
5 esp ebp ebp
Although the UNW_X86_* constants are not supposed to be coupled to
DWARF / eh_frame numbering they are currently conflated in LLVM
libunwind, and thus we require the non-Darwin numbering.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
adrian [Tue, 19 Jul 2016 00:27:17 +0000 (00:27 +0000)]
[ath] don't do LDPC, STBC or short-gi for locationing frames.
The 11n duration calculation function in net80211 and the HAL round /up/
the duration calculation for short-gi, so we can't use that.
The 11n duration calculation doesn't know about the extra symbol time
needed for STBC, nor the LDPC encoding duration, so we can't use
that.
This (along with other, local hacks) allow the locationing services to
get down to around 200nS (yes, nanoseconds) of variance when speaking
to a "good" AP.
g_Ctoc() conversion buffers are smaller than needed up to MB_CUR_MAX - 1
since whole conversion needs a room for (len >= MB_CUR_MAX). It is no
difference when MB_CUR_MAX == 1, but for multi-byte locales last few chars
('\0' and before) may need just one byte, and the rest of MB_CUR_MAX - 1
space becomes unavailable in the MAXPATHLEN-sized buffer, which cause
conversion error on near MAXPATHLEN long pathes.
Increase g_Ctoc() conversion buffers to MB_LEN_MAX - 1.
Add ipfw_nptv6 module that implements Network Prefix Translation for IPv6
as defined in RFC 6296. The module works together with ipfw(4) and
implemented as its external action module. When it is loaded, it registers
as eaction and can be used in rules. The usage pattern is similar to
ipfw_nat(4). All matched by rule traffic goes to the NPT module.
1) Add all characters from ~ expansion as protected to be not interpreted
as pattern meta chars.
2) GLOB_ERR and gl_errfunc are supposed to work only for real directories
per POSIX, so don't act on missing or plain files, for ENOENT or ENOTDIR
(as TODO in the code suggested).
3) Remove the hack in the manpage describing how to skip ENOENT and ENOTDIR
in gl_errfunc, it is unneeded now.
4) Set errno to ENAMETOOLONG if g_Ctoc() expansion fails in g_opendir(),
as in other places in the code which are wrappers around system functions.
1) POSIX defines well when GLOB_ABORTED can be returned (only for directory
open/read errors and with GLOB_ERR and gl_errfunc processing), so we can't
blindly return it on any MAXPATHLEN overflow. Even our manpage disagrees
with such GLOB_ABORTED usage. Use GLOB_NOSPACE for that now with errno is
set to 0 as for limits.
2) Return GLOB_NOSPACE when valid ~ expansion can't happens due to
MAXPATHLEN overflow too.
3) POSIX (and our manpage) says, if GLOB_ERR is set, GLOB_ABORTED should
be returned immediatelly, without using gl_errfunc. Implement it now.
When threads were added to the kernel, the pr_pid member of the
NT_PRSTATUS note was repurposed to store LWP IDs instead of process
IDs. However, the process ID was no longer recorded in core dumps.
This change adds a pr_pid field to prpsinfo (NT_PRSINFO). Rather than
bumping the prpsinfo version number, note parsers can use the note's
payload size to determine if pr_pid is present.
First, PL_FLAG_FORKED events now also set a PL_FLAG_VFORKED flag when
the new child was created via vfork() rather than fork(). Second, a
new PL_FLAG_VFORK_DONE event can now be enabled via the PTRACE_VFORK
event mask. This new stop is reported after the vfork parent resumes
due to the child calling exit or exec. Debuggers can use this stop to
reinsert breakpoints in the vfork parent process before it resumes.
The assertion re-added in r302614 was triggered when stopping signal
is delivered to vforked child. Issue is that we avoid stopping such
children in issignal() to not block parents. But executed AST, which
ignored stops, leaves the child with the signal pending but no AST
pending.
On first exec after vfork(), call signotify() to handle pending
reenabled signals. Adjust the assert to not check vfork children
until exec.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Relax checking if the privider size matches size recorded in the
superblock, allowing provider to be bit bigger, i.e. have some
extra padding after the FS image. That in some cases might be
a side-effect of using CLOOP format which enforces certain block
size and trying to compress image that is not exactly the number
of those blocks in size. The UFS itself does not have any issues
mounting such padded file systems, so it's what GEOM_LABEL should
do.
Break up vm_fault()'s implementation of the read-ahead and delete-behind
optimizations into two distinct pieces. The first piece consists of the
code that should only be performed once per page fault and requires the map
to be locked. The second piece consists of the code that should be
performed each time a pager is called on an object in the shadow chain.
(This second piece expects the map to be unlocked.)
Previously, the entire implementation could be executed multiple times.
Moreover, the second and subsequent executions would occur with the map
unlocked. Usually, the ensuing unsynchronized accesses to the map were
harmless because the map was not changing. Nonetheless, it was possible for
a use-after-free error to occur, where vm_fault() wrote to a freed map
entry. This change corrects that problem.
Don't print same value twice, one in decimal once in hex. This makes
output more cryptic than it needs to be and wastes cpu cycles and
console bandwidth.
will [Mon, 18 Jul 2016 02:13:57 +0000 (02:13 +0000)]
Add my beinstall script.
This is meant to install a new BE (boot environment) given a fully built
world/kernel. In addition to installing world and kernel in the new BE,
it also automatically performs /etc updates (using etcupdate or mergemaster)
and package updates (using pkg).
Because this process is performed in a new BE, it reduces the need for a
second reboot. It also means a reboot into a partially updated system (due
to install or hardware failure) can't happen.
Inspired by and similar in function to Solaris/illumos-style upgrades.
will [Mon, 18 Jul 2016 01:55:25 +0000 (01:55 +0000)]
libkvm: Improve physical address lookup scaling.
Instead of using a hash table to convert physical page addresses to offsets
in the sparse page array, cache the number of bits set for each 4MB chunk of
physical pages. Upon lookup, find the nearest cached population count, then
add/subtract the number of bits from that point to the page's PTE bit.
Then multiply by page size and add to the sparse page map's base offset.
This replaces O(n) worst-case lookup with O(1) (plus a small number of bits
to scan in the bitmap). Also, for a 128GB system, a typical kernel core of
about 8GB will now only require ~4.5MB of RAM for this approach instead of
~48MB as with the hash table.
More concretely, /usr/sbin/crashinfo against the same core improves from a
max RSS of 188MB and wall time of 43.72s (33.25 user 2.94 sys) to 135MB and
9.43s (2.58 user 1.47 sys). Running "thread apply all bt" in kgdb has a
similar RSS improvement, and wall time drops from 4.44s to 1.93s.
Remove booke_enable_l3_cache declaration and remaining definition.
L3 cache is not defined by Book-E, so is platform specific. Since it was
already moved for e500-based devices into mpc85xx in r292903, just eliminate it
altogether. Any device that supports L3 cache should have its own platform
means to enable it.
When we change nvl_array_next to NULL it means that we want to destroy or
take nvlist_array. The nvpair, which stores next nvlist of nvlist_array element
is no longer needed and can be freed.
Submitted by: Adam Starak <starak.adam@gmail.com>
MFC after: 1 week