Gleb Smirnoff [Mon, 6 Mar 2017 19:14:08 +0000 (19:14 +0000)]
In panic(), print the current timestamp, which matches the timestamp in the
dump header. This will help to correlate console server logs with dump files,
no matter how precise the clock on a console server appliance is, or how
buggy the appliance is.
Give LinuxKPI Read-Write semaphores better debug names when
WITNESS_ALL is defined. The lock name is based on the filename and
line number where the initialisation happens.
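A minimal sketch of deriving a lock name from the initialisation site.
LOCK_DEBUG_NAME() and example_lock_name() are hypothetical names used only
for illustration; the real LinuxKPI macro may format or truncate the name
differently.

```c
/* Two-step stringification so that __LINE__ expands to its number first. */
#define STR_(x)	#x
#define STR(x)	STR_(x)

/*
 * Hypothetical helper (not the LinuxKPI macro): build a string literal of
 * the form "<base>@<file>:<line>" at compile time.
 */
#define LOCK_DEBUG_NAME(base)	base "@" __FILE__ ":" STR(__LINE__)

/* Return the debug name for a lock initialised on the line below. */
const char *
example_lock_name(void)
{
	return (LOCK_DEBUG_NAME("rwsem"));
}
```

Because the name is a compile-time string literal, two locks initialised on
different lines get distinct names without any runtime cost.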
Dexuan Cui [Mon, 6 Mar 2017 09:34:31 +0000 (09:34 +0000)]
loader.efi: fix recent UEFI-boot regression on physical machines
This patch fixes my recent patch
"loader.efi: reduce the size of the staging area if necessary", which
causes EFI-boot failure on physical machines since Mar 2:
on the host there is a 1MB LoaderData memory range, which splits
the big Conventional Memory range into a small one (15MB) and a
big one: the small one is too small to hold the staging area.
We can actually use the LoaderData range safely, because when
amd64_tramp -> efi_copy_finish() starts to run, we're almost at
the very end of the efi loader code and we're going to "return"
to the kernel entry, so we're pretty sure we won't access any loader
data any more.
For people who are interested in the details: please see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746#c22
PS, some people also reported the regression happened to FreeBSD VM
running on Bhyve in EFI mode. This patch should resolve it too,
though I don't have such a setup to test.
Ermal Luçi [Mon, 6 Mar 2017 04:01:58 +0000 (04:01 +0000)]
The patch provides the same socket option as Linux's IP_ORIGDSTADDR.
Unfortunately, the two have different integer values because the Linux value is already assigned in FreeBSD.
The option is similar to IP_RECVDSTADDR but also provides the destination port to the application.
This allows or improves the implementation of transparent proxies on UDP sockets, since the complete destination information of forwarded packets is available.
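A sketch of how an application might consume the option, assuming the
control message carries a struct sockaddr_in as on Linux. The helper names
enable_orig_dst() and find_orig_dst() are made up for this example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/* Ask the stack to attach the original destination to each datagram. */
int
enable_orig_dst(int sock)
{
	int on = 1;

	return (setsockopt(sock, IPPROTO_IP, IP_ORIGDSTADDR, &on, sizeof(on)));
}

/*
 * Scan the control messages returned by recvmsg(2) for IP_ORIGDSTADDR.
 * Returns 0 and fills *dst on success, -1 if no such message is present.
 */
int
find_orig_dst(struct msghdr *msg, struct sockaddr_in *dst)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP &&
		    cm->cmsg_type == IP_ORIGDSTADDR &&
		    cm->cmsg_len >= CMSG_LEN(sizeof(*dst))) {
			memcpy(dst, CMSG_DATA(cm), sizeof(*dst));
			return (0);
		}
	}
	return (-1);
}
```

Using the symbolic name rather than a number keeps the code portable
despite the differing integer values.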
Revert r314669, r314670:
Bring back the i486 option in GENERIC by default.
The code related to i386 CPU variants configuration has received many
changes in the last years: most of the features are detected automatically,
so there are no performance penalties from keeping the 486 support enabled.
Re-instate the 486 support: while the general configuration could still be
cleaned a bit, there is no advantage in removing it.
o check the size of O_IP_SRC_LOOKUP opcode, it can not exceed the size of
ipfw_insn_u32;
o rename ipfw_lookup_table_extended() function into ipfw_lookup_table() and
remove old ipfw_lookup_table();
o use args->f_id.flow_id6 that is in host byte order to get DSCP value;
o add SCTP ports support to 'lookup src/dst-port' opcode;
o add IPv6 support to 'lookup src/dst-ip' opcode.
Reject invalid object types that cannot be used with specific opcodes.
When doing reference counting of named objects in the new rule,
check for existing objects that the opcode references the correct object;
otherwise return EINVAL.
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
reallocarray(3) is a non-portable extension from OpenBSD. Given that it is
already in FreeBSD, make future merges easier by adopting it in some cases
where the code shares some heritage with OpenBSD.
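For readers unfamiliar with the interface, the following sketch shows the
kind of overflow check that reallocarray(3) adds over a plain
realloc(p, nmemb * size). checked_reallocarray() is a hypothetical stand-in
written for this note, not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in illustrating the check: refuse the request when
 * nmemb * size would overflow size_t instead of allocating a short buffer.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return (NULL);
	}
	return (realloc(ptr, nmemb * size));
}
```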
Fix partial requests (used by fetch -r) when the requested file is
already complete.
Since 416 is an error code, any Content-Range header in the response
would refer to the error message, not the requested document, so
relying on the value of size when we know we got a 416 is wrong.
Instead, just verify that offset == 0 and assume that we've reached
the end of the document (if offset > 0, we did not request a range,
and the server is screwing with us). Note that we cannot distinguish
between reaching the end and going past it, but that is a flaw in the
protocol, not in the code, so we just have to assume that the caller
knows what it's doing. A smart caller would request an offset
slightly before what it believes is the end and compare the result to
what is already in the file.
Andriy Gapon [Sun, 5 Mar 2017 07:46:48 +0000 (07:46 +0000)]
mca: fix up a couple of issues introduced with AMD thresholding in r314636
1. There was a typo in one place where the processor family is
checked (16 vs 0x16). Now the checks are consolidated in a single
function.
2. Instead of an array of struct amd_et_state objects the code allocated
an array of pointers. That was no problem on amd64 where the sizes
are the same, but could be a problem on i386.
Reported by: tuexen and others
Tested by: tuexen (earlier version of the fix)
Pointyhat to: avg
MFC after: 5 days
X-MFC with: r314636
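Both classes of bug can be illustrated with a short sketch. The struct
layout, the function names and the exact family range below are assumptions
made for the example, not the code from the commit.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-bank state; the real struct amd_et_state differs. */
struct et_state {
	int	cur_threshold;
	long	last_intr;
};

/*
 * Families are conventionally written in hex.  Writing 16 where 0x16 is
 * meant silently selects family 10h instead, so keep the check in one place.
 */
bool
family_has_thresholding(unsigned int family)
{
	return (family >= 0x10 && family <= 0x16);
}

/*
 * Size the array from the element via sizeof(*states), so the allocation
 * cannot accidentally be sized for pointers instead of structures.
 */
struct et_state *
alloc_states(size_t nbanks)
{
	struct et_state *states;

	states = calloc(nbanks, sizeof(*states));
	return (states);
}
```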
Emmanuel Vadot [Sun, 5 Mar 2017 07:13:29 +0000 (07:13 +0000)]
Export a sysctl dev.<clkdom>.<unit>.clocks for each clock domain containing
all the clocks that it provides.
Each clock is exported under the node 'clock.<clkname>' and has the following
child nodes:
- frequency
- parent (The selected parent, if any)
- parents (The list of parents, if any)
- childrens (The list of children, if any)
- enable_cnt (The enabled counter)
This gives us the possibility to examine clocks at runtime and make a graph of
the clock flow.
Jilles Tjoelker [Sat, 4 Mar 2017 22:58:34 +0000 (22:58 +0000)]
sh: Fix crash if a -T trap is taken during command substitution.
Code like t=$(stat -f %m "$file") segfaulted if -T was active and a trap
was taken while the shell was waiting for the child process to finish.
What happened was that the dotrap() call in waitforjob() was hit. This
re-entered command execution (including expand.c) at a point not expected by
expbackq(), and global state (unallocated stack string and argbackq) was
corrupted.
To fix this, change expbackq() to prepare for command execution to be
re-entered.
Conrad Meyer [Sat, 4 Mar 2017 22:38:10 +0000 (22:38 +0000)]
ps(1): Only detect terminal width if stdout is a tty
If stdout isn't a tty, use unlimited width output rather than truncating to
79 characters. This is helpful for shell scripts or e.g., 'ps | grep foo'.
This hardcoded width has some history: In The Beginning of History[0], the
width of ps was hardcoded as 80 bytes. In 1985, Bloom@ added detection
using TIOCGWINSZ on stdin.[1] In 1986, Kirk merged a change to check
stdout's window size instead. In 1990, the fallback checks to stderr and
stdin's TIOCGWINSZ were added by Marc@, with the commit message "new
version."[2]
OS X Darwin has a very similar modification to ps(1), which simply sets
UNLIMITED for all non-tty outputs.[3] I've chosen to respect COLUMNS
instead of behaving identically to Darwin here, but I don't feel strongly
about that. We could match OS X for parity if that is desired.
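The resulting decision order can be sketched roughly as follows. The
function name pick_termwidth(), the UNLIMITED sentinel value and the exact
precedence are assumptions for illustration and may not match ps.c line for
line.

```c
#include <sys/ioctl.h>
#include <stdlib.h>
#include <unistd.h>

#define UNLIMITED	0	/* sentinel meaning "do not truncate" */

/*
 * Choose an output width: COLUMNS wins if set, a non-tty gets unlimited
 * width, and a tty is asked for its window size with a 79-column fallback.
 */
int
pick_termwidth(int fd, const char *columns)
{
	struct winsize ws;

	if (columns != NULL && *columns != '\0')
		return (atoi(columns));
	if (!isatty(fd))
		return (UNLIMITED);
	if (ioctl(fd, TIOCGWINSZ, &ws) == -1 || ws.ws_col == 0)
		return (79);
	return (ws.ws_col - 1);
}
```

A caller would pass STDOUT_FILENO and getenv("COLUMNS"); with a pipe on
stdout and COLUMNS unset, this yields UNLIMITED.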
Ian Lepore [Sat, 4 Mar 2017 21:47:43 +0000 (21:47 +0000)]
Fix bugs exposed by the recent enabling of FIFOs in the pl011 uart. These
have been in the code all along, but were masked by having a fifo depth of
one byte at the hardware level, so everything kinda worked by accident.
The hardware interrupts when the TX fifo is half empty, so set
sc->sc_txfifosz to 8 bytes (half the hardware fifo size) to match. This
eliminates dropped characters on output.
Restructure the read loop to consume all the bytes in the fifo by using
the "rx fifo empty" bit of the flags register rather than the "rx ready"
bit of the interrupt status register. The rx-ready interrupt is cleared
when the number of bytes in the fifo falls below the interrupt trigger
level, leaving the fifo half full every time the receive routine was called.
Now it loops until the fifo is completely empty every time (including
when the function is called due to a receive timeout as well as for
fifo-full).
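The restructured read loop follows roughly the pattern below. The
struct fake_uart software model and the helper names are hypothetical and
stand in for the real register accesses; only the RXFE bit position is taken
from the PL011 flag register layout.

```c
#include <stddef.h>
#include <stdint.h>

#define FR_RXFE	(1u << 4)	/* PL011 flag register: receive FIFO empty */

/* Hypothetical software model of the receive FIFO, for illustration. */
struct fake_uart {
	uint8_t	fifo[16];
	size_t	head;	/* next byte to read */
	size_t	count;	/* bytes currently queued */
};

/* Model of reading the flag register. */
uint32_t
read_flags(const struct fake_uart *u)
{
	return (u->count == 0 ? FR_RXFE : 0);
}

/* Model of reading the data register: pops one byte from the FIFO. */
uint8_t
read_data(struct fake_uart *u)
{
	uint8_t c = u->fifo[u->head];

	u->head = (u->head + 1) % sizeof(u->fifo);
	u->count--;
	return (c);
}

/* Drain until the hardware reports an empty FIFO; returns bytes read. */
size_t
drain_rx(struct fake_uart *u, uint8_t *out, size_t outlen)
{
	size_t n = 0;

	while (n < outlen && (read_flags(u) & FR_RXFE) == 0)
		out[n++] = read_data(u);
	return (n);
}
```

Keying the loop on the FIFO state rather than on the interrupt status means
the same routine works for both the fifo-full and the receive-timeout case.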
Conrad Meyer [Sat, 4 Mar 2017 20:46:57 +0000 (20:46 +0000)]
fts: Fix a potential memory leak in error case
Dan Krejsa reports a potential memory leak in an fts_build error case,
detected by Coverity. (It doesn't seem to show up in Coverity Scan, so I
don't have a CID to point to.)
I don't know whether it is actually possible to arrive in this case with a
non-empty 'head' list. The cost is low, though. One additional branch in a
terminal error case isn't the end of the world.
PR: 217125
Submitted by: Dan Krejsa <dan.krejsa at gmail.com>
Enji Cooper [Sat, 4 Mar 2017 20:35:34 +0000 (20:35 +0000)]
Fix build after r314656
Some of the changes I introduced to use .ALLSRC were correct in spirit,
but incorrect in reality -- in particular, ../Makefile.inc hadn't been
pulled in via bsd.init.mk (via bsd.lib.mk, bsd.prog.mk), so the value
of .ALLSRC (evaluated immediately) was empty. .include bsd.init.mk
explicitly so we can be certain that the values used as dependencies in
the targets are defined when the target recipe has been evaluated.
Reminder: thou shalt separate out separate functional changes before
committing them.
(YUGE) Pointyhat to: ngie
In collaboration with: bdrewery
MFC after: 1 month
Reported by: Jenkins, cy, ler, O. Hartmann, Michael Butler
Sponsored by: Dell EMC Isilon
[rpi] rpi3 should use the same cpufreq logic as rpi2, not rpi-b
RPi3 cpufreq is more like that on RPi2. Setting arm frequency
above min (say, "sysctl hw.cpufreq.arm_freq=600000001") turns on
turbo mode, and the firmware automatically raises the voltage, sets the
frequency to a maximum of 1200MHz, throttles on overheating, etc.
Swap if/else parts and use SOC_BCM2835 def so RPi3 can share the
same cpufreq logic as RPi2, instead of falling back to that for RPi.
Drop i486 from the default i386 GENERIC kernel configuration.
80486 production was stopped by Intel in September 2007. Dropping the 486
configuration option from the GENERIC kernel improves performance
slightly.
Removing I486_CPU is consistent at this time: we don't support any
processor without an FPU and the PC-98 arch, which frequently involved i486
CPUs, is also gone, so we don't test such platforms anymore.
Dmitry Chagin [Sat, 4 Mar 2017 08:57:39 +0000 (08:57 +0000)]
Remove attribute __packed from some IPC struct definitions since
Linuxulator is x86 only.
The only notable difference in alignment for an LP64 64-bit system
when compared to a 32-bit system is the alignment of types that are
eight bytes or larger.
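To make the alignment point concrete, here is a small sketch comparing a
naturally aligned record with a packed one. Both structs and the accessor
functions are hypothetical and are not taken from the Linuxulator headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IPC-style record, laid out with natural alignment. */
struct natural_rec {
	uint32_t	key;
	uint64_t	size;
};

/* The same fields with padding suppressed. */
struct packed_rec {
	uint32_t	key;
	uint64_t	size;
} __attribute__((packed));

/* Offset of the 64-bit member in the naturally aligned layout. */
size_t
natural_size_offset(void)
{
	return (offsetof(struct natural_rec, size));
}

/* Offset of the 64-bit member in the packed layout. */
size_t
packed_size_offset(void)
{
	return (offsetof(struct packed_rec, size));
}
```

On amd64 the natural offset is 8 because the 64-bit member is aligned to
eight bytes, while the packed offset is 4; on i386 the psABI aligns 64-bit
integers to four bytes inside structures, so both layouts coincide there.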
Bruce Evans [Sat, 4 Mar 2017 08:47:31 +0000 (08:47 +0000)]
Implement ec_putc() (emergency kernel [syscons] console putc()) and use
it in emergency in sc_cnputc().
Locking fixes in sc_cnputc() previously turned off normal output in
near-deadlock conditions and added deferred output which might never
be completed. Emergency output goes to the frame buffer using
sufficiently atomic non-blocking writes if the console is in text
mode (in graphics mode, nothing is done, modulo races setting the
graphics mode bit). Screen updates overwrite the emergency output
if the emergency condition clears enough to reach them.
ec_putc() also works for "early" console output in normal x86 text
mode as soon as this mode is initialized (if ever). This uses a
hard-coded x86 frame buffer address before cninit() and a hopefully
MI address after cninit(). But non-x86 is more likely to not support
text mode, in which case ec_putc() will be null. ec_putc() has no dependencies
on syscons before cninit(), and only has them later to track syscons'
mode changes. This commit doesn't attach ec_putc() for early use.
To test emergency use, put a breakpoint in central syscons output code
like sc_puts() and do some user output. The system used to race or
deadlock in ddb output soon after entry to ddb. The locking fixes
deferred the output until after leaving ddb, so ddb was unusable and
you had to try typing c[ontinue] blindly until it exited, or better use
a serial console in parallel. Now the output goes to a window in the
middle 2/3 of the screen. Scrolling is circular and there is no cursor,
but otherwise ec_putc() provides full dumb terminal functionality and
very fast output that hides artifacts from dumb overwrites.
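The circular window can be pictured with the following simplified sketch.
The geometry constants, struct ec_window and ec_putc_sketch() are
assumptions for illustration; the committed ec_putc() handles more control
characters and obtains the frame buffer address differently.

```c
#include <stdint.h>

#define EC_COLS		80
#define EC_ROWS		25
#define EC_YBORDER	(EC_ROWS / 6)	/* rows skipped at top and bottom */
#define EC_WINSIZE	(EC_COLS * (EC_ROWS - 2 * EC_YBORDER))

/* Hypothetical state: a text-mode buffer and a circular write offset. */
struct ec_window {
	volatile uint16_t *fb;	/* EC_COLS * EC_ROWS character cells */
	unsigned int	offset;	/* next cell, relative to the window */
};

/* Write one character; a newline jumps to the start of the next row. */
void
ec_putc_sketch(struct ec_window *w, int c, uint8_t attr)
{
	if (c == '\n') {
		w->offset += EC_COLS - w->offset % EC_COLS;
	} else {
		/* Text-mode cell: character in the low byte, attribute above. */
		w->fb[EC_COLS * EC_YBORDER + w->offset] =
		    (uint16_t)((attr << 8) | (uint8_t)c);
		w->offset++;
	}
	w->offset %= EC_WINSIZE;	/* wrap to the top of the window */
}
```

Each character costs a single 16-bit store and no state outside the window
is touched, which is what makes this approach usable when normal console
locking cannot be taken.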
Enji Cooper [Sat, 4 Mar 2017 06:19:41 +0000 (06:19 +0000)]
Correct nuance of -a :service -> "*" in r314563, r314585
My attempt to correct the sender/receiver behavior was incorrect.
The source port of the sender for forwarded datagrams is filtered
with -a, and my change in r314585 didn't clarify that point at all.
Bruce Evans [Sat, 4 Mar 2017 06:19:12 +0000 (06:19 +0000)]
Colorize syscons kernel console output according to a table indexed
by the CPU number.
This was originally for debugging near-deadlock conditions where
multiple CPUs either deadlock or scramble each other's output trying
to report the problem, but I found it interesting and sometimes
useful for ordinary kernel messages. Ordinary kernel messages
shouldn't be interleaved, but if they are then the colorization
makes them readable even if the interleaving is for every character
(provided the CPU printing each message doesn't change).
The default colors are 8-15 starting at 15 (bright white on black)
for CPU 0 and repeating every 8 CPUs. This works best with 8 CPUs.
Non-bright colors and nonzero background colors need special
configuration to avoid unreadable and ugly combinations so are not
configured by default. The next bright color after 15 is 8 (bright
black = dark gray), which is not very readable but is the only other color
used with 2 CPUs. After that the next bright color is 9 (bright
blue), which is not much brighter than bright black, but is used with
3+ CPUs. Other bright colors are brighter.
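The default mapping described above can be written as a one-line function.
kernel_fg_color() is a hypothetical name; the commit itself uses a table
indexed by the CPU number rather than a formula.

```c
/*
 * Foreground color for kernel output from a given CPU: bright colors
 * 8-15, starting at 15 for CPU 0 and repeating every 8 CPUs.
 */
int
kernel_fg_color(int cpu)
{
	return (8 + (cpu + 7) % 8);
}
```

This gives 15 for CPU 0, 8 for CPU 1, 9 for CPU 2 and so on up to 14 for
CPU 7, after which the sequence repeats.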
Colorization is configured by default so that it gets tested. It can
only be turned off by configuring SC_KERNEL_CONS_ATTR to anything other
than FG_WHITE. After booting, all colors can be changed using the
syscons.kattr sysctl. This is a SYSCTL_OPAQUE, and no utility is
provided to change it (sysctl only displays it).
The default colors work in all VGA modes that I could test. In 2-color
graphics modes, all 8 bright colors are displayed as bright white, so
the colorization has no effect, but anything with a nonzero background
gives white on white unless the foreground is zero. I don't have any
mono or VGA grayscale hardware to test on. Support for mono mode seems
to have never worked right in syscons (I think bright white gives white
underline with either bold or bright), but VGA grayscale should work
better than 2-color graphics.
Bruce Evans [Sat, 4 Mar 2017 04:06:33 +0000 (04:06 +0000)]
Fix formatting. ruptime output on FreeBSD cluster machines annoyed me
by usually being double-spaced due to auto-wrap at column 80.
r212771 increased width of the hostname field from 12 to 25. This was
supposed to allow for 80-column output with all 3 load averages taking
5 characters each, but it actually gave width exactly 80 and thus worse
than useless auto-wrap in that case. 3 wide load average fields are
unusual, but later expansion of another field gave the auto-wrap with
just 2 wide load average fields.
Change to dynamic field widths for all fields except the uptime. This
also fixes the formatting of high (above 9999) user counts and not
very high (above 9.99) load averages. The formatting for numbers now
breaks at 99999.99, but scientific notation should be used starting
well below that.
The field width for the uptime remains hard-coded to work consistently
for uptimes less than 10000 days, but this gives too much space for
small uptimes. Punctuation between fields could be improved in many
ways, for example by removing it.
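Dynamic field widths of this kind are usually computed in two passes, as in
the sketch below. The function names and the "users" column are illustrative
assumptions and are not lifted from ruptime.c.

```c
#include <stdio.h>

/* Number of characters needed to print v in decimal. */
int
int_width(int v)
{
	return (snprintf(NULL, 0, "%d", v));
}

/* First pass: the widest value in the column decides the field width. */
int
column_width(const int *vals, int n)
{
	int i, w, max = 1;

	for (i = 0; i < n; i++) {
		w = int_width(vals[i]);
		if (w > max)
			max = w;
	}
	return (max);
}

/* Second pass: print every row using the computed width. */
void
print_users(const int *vals, int n)
{
	int i, w = column_width(vals, n);

	for (i = 0; i < n; i++)
		printf("%*d users\n", w, vals[i]);
}
```

The "%*d" conversion takes the width as an argument, so no field is wider
than the largest value actually present.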
Andriy Gapon [Fri, 3 Mar 2017 22:51:04 +0000 (22:51 +0000)]
add a module that provides support for DRAM ECC error injection on AMD CPUs
I imagine that the module would be useful only to a very limited number
of developers, so that's my excuse for not writing any documentation.
On a more serious note, please see DRAM Error Injection section of BKDGs
for families 10h - 16h. E.g. section 2.13.3.1 of BKDG for AMD Family 15h
Models 00h-0Fh Processors.
Many thanks to kib for his suggestions and comments.
Andriy Gapon [Fri, 3 Mar 2017 22:42:43 +0000 (22:42 +0000)]
MCA: add AMD Error Thresholding support
Currently the feature is implemented only for a subset of errors
reported via Bank 4. The subset includes only DRAM-related errors.
The new code builds upon and reuses the Intel CMC (Correctable MCE
Counters) support code. However, the AMD feature is quite different
and, unfortunately, much less regular.
For references please see AMD BKDGs for models 10h - 16h.
Specifically, see MSR0000_0413 NB Machine Check Misc (Thresholding)
Register (MC4_MISC0).
http://developer.amd.com/resources/developer-guides-manuals/
Mark Johnston [Fri, 3 Mar 2017 20:57:40 +0000 (20:57 +0000)]
Fix a ticks comparison in sched_pctcpu_update().
We may fail to reset the %CPU tracking window if a thread does not run
for over half of the ticks rollover period, resulting in a bogus %CPU
value for the thread until ticks fully rolls over. Handle this by comparing
the unsigned difference ticks - ts_ltick with SCHED_TICK_TARG instead.
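The difference between the two styles of comparison can be shown with a
small sketch. Both functions are hypothetical, use 32-bit counters for
illustration, and the signed variant is only an assumed example of the
failure mode rather than the code that was replaced.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Signed comparison of the elapsed ticks.  Once more than half of the
 * rollover period has passed, the difference looks negative and the
 * window is not reset.  (The conversion to int32_t assumes the usual
 * two's complement wraparound.)
 */
bool
window_expired_signed(uint32_t now, uint32_t last, uint32_t target)
{
	return ((int32_t)(now - last) >= (int32_t)target);
}

/* Unsigned difference: correct for any gap shorter than a full rollover. */
bool
window_expired_unsigned(uint32_t now, uint32_t last, uint32_t target)
{
	return (now - last >= target);
}
```

With last = 0 and now = 0x90000000 the unsigned form reports the window as
expired, while the signed form does not; across a counter wrap both agree.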
Fix matching table entry value. Use real table value instead of its index
in valuestate array.
When opcode has size equal to ipfw_insn_u32, this means that it should
additionally match value specified in d[0] with table entry value.
ipfw_table_lookup() returns the table value index; use the TARG_VAL() macro to
convert it to its value. The actual 32-bit value is stored in the tag field
of the table_value structure, where all unspecified u32 values are kept.
Enji Cooper [Fri, 3 Mar 2017 20:15:22 +0000 (20:15 +0000)]
Integrate indent tests added in r313544 into ATF/Kyua and the FreeBSD
test suite
This change does the following:
- Introduces symmetry in the test inputs/outputs by adding the exit
code to the files. This simplified the test driver notably by
requiring less filename/test name manipulation.
- Adds a test driver for the testcases added in r313544, patterned
after bin/sh/tests/functional_test.sh . The driver calls indent as
noted in r313544, with an exception: The $FreeBSD$ RCS keyword's
expansion is reindented with indent, which means that the output
differs from the expected output. Thus, all lines with $FreeBSD$
in them are deleted on the fly, both in the input file and the
output file.
The test inputs/outputs are copied to the kyua sandbox before the
test is run as the pathing in some of the files relies on pathing
normalized to the current directory (copying the files is the
easiest way to resolve the issue).
Enji Cooper [Fri, 3 Mar 2017 18:44:20 +0000 (18:44 +0000)]
Clean up ddb(4) slightly
- Delete empty Li macro uses [1]. This removes some spaces between
the optional command/subcommand arguments.
- Attempt to clarify "show lock" subcommand by being more
terse/direct. This addresses an issue with a contraction [2].
Update the LinuxKPI RCU and SRCU wrappers for the concurrency kit, CK.
- Optimise the RCU implementation to not allocate and free
ck_epoch_records during runtime. Instead allocate two sets of
ck_epoch_records per CPU for general purpose use. The first set is
only used for reader locks and the second set is only used for
synchronization and barriers and is protected with a regular mutex to
prevent simultaneous issues.
- Move the task structure away from the rcu_head structure and into
the per-CPU structures. This allows the size of the rcu_head structure
to be reduced down to the size of two pointers.
- Fix a bug where the linux_rcu_barrier() function only waited for one
per-CPU epoch record to be completed instead of all.
- Use a critical section or a mutex to protect ck_epoch_begin() and
ck_epoch_end() depending on RCU or SRCU type. All the ck_epoch_xxx()
functions, except ck_epoch_register(), ck_epoch_unregister() and
ck_epoch_recycle(), are not re-entrant and need a critical section or
a mutex to operate in the LinuxKPI; this follows from inspecting the CK
implementation of the above mentioned functions. The simultaneous
issues arise from per-CPU epoch records being shared between multiple
threads depending on the amount of task switching and how many threads
are involved with the RCU and SRCU operations.
- Properly free all epoch records by using safe list traversal at
LinuxKPI module unload. It turns out that ck_epoch_recycle() always
keeps the records on an internal list and uses a flag in the epoch
record to track allocated and free entries. This would lead to use
after free during module unload.
- Remove redundant synchronize_rcu() call from the
linux_compat_uninit() function. Let the linux_rcu_runtime_uninit()
function do the final rcu_barrier() instead.
Ed Maste [Fri, 3 Mar 2017 16:07:46 +0000 (16:07 +0000)]
regen src.conf.5 for clang-4.0.0 merge
Note that makeman's use of 'make showconfig' interacts poorly with
the COMPILER_FEATURES test in share/mk/src.opts.mk, because it tests the
host compiler, not the bootstrap compiler that will actually be used to
build world. This causes it to report that Clang is enabled by default
on MIPS and PowerPC.
For example:
% make TARGET_ARCH=mips64 showconfig | grep CLANG
MK_CLANG = yes
MK_CLANG_BOOTSTRAP = no
MK_CLANG_EXTRAS = no
MK_CLANG_FULL = yes
MK_CLANG_IS_CC = no
I am committing this version anyway to avoid extraneous diffs in
src.conf.5 after every other WITH_/WITHOUT_FOO change.
In addition, we intend to switch to a C++11 compiler for all archs for
12.0 (either by fixing Clang for those archs, or by requiring an
external toolchain), and then src.conf.5 will be correct.
Re-apply part of r311585 which was inadvertently reverted in the upgrade
to 7.3p1. The other part (which adds -DLIBWRAP to sshd's CFLAGS) is
still in place.