Kirk McKusick [Wed, 27 Jan 2016 21:27:05 +0000 (21:27 +0000)]
This fixes a bug in UFS2 exported NFS volumes. An NFS client can
crash a server that has exported UFS2 by presenting a filehandle
with an inode number that references an uninitialized inode in a
cylinder group. The problem is that UFS2 only initializes blocks
of inodes as they are first allocated and ffs_fhtovp() does not
validate that the inode is in a range of inodes that have been
initialized. Attempting to read an uninitialized inode gets random
data from the disk. When the kernel tries to interpret it as an
inode, panics often arise.
Kirk McKusick [Wed, 27 Jan 2016 21:23:01 +0000 (21:23 +0000)]
The bread() function was inconsistent about whether it would return
a buffer pointer in the event of an error (for some errors it would
return a buffer pointer and for other errors it would not return a
buffer pointer). The cluster_read() function was similarly inconsistent.
Clients of these functions were inconsistent in handling errors.
Some would assume that no buffer was returned after an error and
would thus lose buffers under certain error conditions. Others would
assume that brelse() should always be called after an error and
would thus panic the system under certain error conditions.
To correct both of these problems with minimal code churn, bread()
and cluster_write() now always free the buffer when returning an
error thus ensuring that buffers will never be lost. The brelse()
routine checks for being passed a NULL buffer pointer and silently
returns to avoid panics. Thus both approaches to handling error
returns from bread() and cluster_read() will work correctly.
Future code should be written assuming that bread() and cluster_read()
will never return a buffer with an error, so should not attempt to
brelse() the buffer when an error is returned.
Alexander Kabaev [Wed, 27 Jan 2016 20:20:37 +0000 (20:20 +0000)]
Do not unlock rtld_phdr_lock over callback invocations.
The dl_iterate_phdr consumer code in libgcc does not expect multiple
callbacks running concurrently. This was fixed once already in r178807,
but accidentally got reverted in r294373.
John Baldwin [Wed, 27 Jan 2016 17:55:01 +0000 (17:55 +0000)]
Convert ss_sp in stack_t and sigstack to void *.
POSIX requires these members to be of type void * rather than the
char * inherited from 4BSD. NetBSD and OpenBSD both changed their
fields to void * back in 1998. No new build failures were reported
via an exp-run.
Andrew Turner [Wed, 27 Jan 2016 17:33:31 +0000 (17:33 +0000)]
When finding the physical address of a device allow intermediate addresses
to be 64-bit on 32-bit architectures. It is not uncommon for device trees
to use the upper 32-bits to store what effectively is an index into the
parent ranges property. In this case, when running with a 32-bit bus_addr_t
and bus_size_t, we would previously truncate the address, this may then
incorrectly match the wrong range, and return the wrong address.
Allan Jude [Wed, 27 Jan 2016 16:45:23 +0000 (16:45 +0000)]
ficl on i386 should cast to unsigned char output to support efi i386
make it possible for efi_console to recognize and translate box characters
on i386 build (unsigned versus signed char passed as int issue).
Submitted by: Toomas Soome <tsoome at me.com>
Reviewed by: emaste, smh, dteske
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D4993
Warner Losh [Wed, 27 Jan 2016 16:36:18 +0000 (16:36 +0000)]
Fix mistake when transitioning to the new defines with ZFS loader. I
hate adding yet another define, but it is the lessor of the evil
choices available. Kill another evil by removing PATH_BOOT3 and
replacing it with PATH_LOADER or PATH_LOADER_ZFS as appropriate.
Alan Somers [Wed, 27 Jan 2016 16:17:15 +0000 (16:17 +0000)]
syslogd: Enable repeated line compression for lines of any length.
Enable repeated line compression for lines of any length, instead of only
short lines. AFAICT repeated line compression was limited to short lines as
a RAM optimization, which made sense when karels added it in 1988, but no
longer. The penalty is a paltry 904B of RAM per file logged.
Alan Somers [Wed, 27 Jan 2016 16:13:10 +0000 (16:13 +0000)]
Fix grep_test:recurse with ZFS and TMPFS tmpdirs
contrib/netbsd-tests/usr.bin/grep/t_grep.sh
Fix grep_test:recurse when /tmp is either zfs or tmpfs. The test was
relying on an implicit ordering of directory recursion which happens
to be true when using UFS. grep's specification requires no such
ordering. The solution is to ignore the order of grep's results.
Gleb Smirnoff [Wed, 27 Jan 2016 07:34:00 +0000 (07:34 +0000)]
Fix issues with TCP_CONGESTION handling after r294540:
o Return back the buf[TCP_CA_NAME_MAX] for TCP_CONGESTION,
for TCP_CCALGOOPT use dynamically allocated *pbuf.
o For SOPT_SET TCP_CONGESTION do NULL terminating of string
taking from userland.
o For SOPT_SET TCP_CONGESTION do the search for the algorithm
keeping the inpcb lock.
o For SOPT_GET TCP_CONGESTION first strlcpy() the name
holding the inpcb lock into temporary buffer, then copyout.
Xin LI [Wed, 27 Jan 2016 07:20:55 +0000 (07:20 +0000)]
Implement AT_SECURE properly.
AT_SECURE auxv entry has been added to the Linux 2.5 kernel to pass a
boolean flag indicating whether secure mode should be enabled. 1 means
that the program has changes its credentials during the execution.
Being exported AT_SECURE used by glibc issetugid() call.
Enji Cooper [Wed, 27 Jan 2016 06:24:19 +0000 (06:24 +0000)]
Adjust vm.max_wired in order to avoid hitting EAGAIN artificially
Set vm.max_wired to INT_MAX in :mlock_err, :mlock_mmap, and :mlock_nested to
avoid hitting EAGAIN artificially on the system when running the tests
Require root privileges in order to set the sysctl
Add allow_sysctl_side_effects to require.config as this test is now adjusting
sysctls that can affect the global system state
Unlike the version submitted by cem in OneFS, this version uses a scratch file
to save/restore the previous value of the sysctl. I _really_, _really_ wish
there were better hooks in atf/kyua for per test suite setup/teardown -- using
a file is kludgy, but it's the best I can do to avoid situations where (for
instance), sysctl(3) may fail and drop a core outside the kyua sandbox.
Based on a patch submitted by cem, but modified to take business logic out of
ATF_TP_ADD_TCS(3).
Devin Teske [Wed, 27 Jan 2016 06:21:35 +0000 (06:21 +0000)]
Fix a crash if `-D' is used without `-t title'
dialog(3)'s dlg_reallocate_gauge(), used both by dialog(3)'s dialog_gauge()
and dialog(1)'s `--gauge', will segmentation fault in strlen(3) if no title
is set for the widget. Reproducible with `dialog --gauge hi 6 20' (adding
`--title ""' is enough to prevent segmentation fault).
Sepherosa Ziehau [Wed, 27 Jan 2016 03:53:30 +0000 (03:53 +0000)]
hyperv/vmbus: Event handling code refactor.
- Use taskqueue instead of swi for event handling.
- Scan the interrupt flags in filter
- Disable ringbuffer interrupt mask in filter to ensure no unnecessary
interrupts.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe, Dexuan <decui microsoft com>
Approved by: adrian (mentor)
MFC after: 2 weeks
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4920
Justin Hibbits [Wed, 27 Jan 2016 02:23:54 +0000 (02:23 +0000)]
Convert rman to use rman_res_t instead of u_long
Summary:
Migrate to using the semi-opaque type rman_res_t to specify rman resources. For
now, this is still compatible with u_long.
This is step one in migrating rman to use uintmax_t for resources instead of
u_long.
Going forward, this could feasibly be used to specify architecture-specific
definitions of resource ranges, rather than baking a specific integer type into
the API.
This change has been broken out to facilitate MFC'ing drivers back to 10 without
breaking ABI.
Luigi Rizzo [Wed, 27 Jan 2016 02:08:30 +0000 (02:08 +0000)]
bugfix: the scheduler template (dn_schk) for the round robin scheduler
is followed by another structure (rr_schk) whose size must be set
in the schk_datalen field of the descriptor.
Not allocating the memory may cause other memory to be overwritten
(though dn_schk is 192 bytes and rr_schk only 12 so we may be lucky
and end up in the padding after the dn_schk).
Bryan Drewery [Wed, 27 Jan 2016 01:33:26 +0000 (01:33 +0000)]
Revert yacc dependency back to pre-r241298.
Several attempts to fix this logic was done after r241298, which were
all reverted, yet this change was not.
The .h file does not depend on the .c file, so do not impose such a
dependency on it. They are generated by the same command but do not
depend on each other. Restore the .ORDER which should handle parallel build
issues. This fixes an actual bug where the .h file is not recreated
when missing [1]. For example:
cd lib/libc
make cleanobj
make nsparser.h
rm nsparser.h
make nsparser.h # will not rebuild nsparser.h
I have been trying to track down a build problem where nsparser.h is
missing when nslexer.o is built. It is possible this is related.
Bryan Drewery [Wed, 27 Jan 2016 01:24:05 +0000 (01:24 +0000)]
Fix DIRDEPS_BUILD after r294752.
DIRDEPS_BUILD does not yet support PROGS having their own dependency
file.
Overriding .MAKE.DEPENDFILE here causes major problems with the meta
mode logic since it creates the Makefile.depend as '.depend' resulting
in infinite loops in make due to dirdeps.mk including .depend endlessly.
Gleb Smirnoff [Wed, 27 Jan 2016 00:45:46 +0000 (00:45 +0000)]
Augment struct tcpstat with tcps_states[], which is used for book-keeping
the amount of TCP connections by state. Provides a cheap way to get
connection count without traversing the whole pcb list.
Devin Teske [Tue, 26 Jan 2016 23:59:30 +0000 (23:59 +0000)]
Add `-k' for dpv(3) `keep_tite' config option
For scripts using dialog(1) several times, it can be visually distracting
running dpv(1) several times amidst other dialogs. The `-k' option, similar
to dialog(1) `--keep-tite', enables the same functionality to smooth ti/te.
Devin Teske [Tue, 26 Jan 2016 23:56:27 +0000 (23:56 +0000)]
Add keep_tite configuration option
Similar to dialog(3) keep_tite option used to prevent visually disturbing
initialization or exit that could occur when run from a script using
dpv(3) by way of dpv(1) in sequence with other dialog(1) invocations.
John Baldwin [Tue, 26 Jan 2016 19:07:09 +0000 (19:07 +0000)]
Add support to libsysdecode for decoding system call names.
A new sysdecode_syscallname() function accepts a system call code and
returns a string of the corresponding name (or NULL if the code is
unknown). To support different process ABIs, the new function accepts a
value from a new sysdecode_abi enum as its first argument to select the
ABI in use. Current ABIs supported include FREEBSD (native binaries),
FREEBSD32, LINUX, LINUX32, and CLOUDABI64. Note that not all ABIs are
supported by all platforms. In general, a given ABI is only supported
if a platform can execute binaries for that ABI.
To simplify the implementation, libsysdecode's build reuses the
existing pre-generated files from the kernel source tree rather than
duplicating new copies of said files during the build.
kdump(1) and truss(1) now use these functions to map system call
identifiers to names. For kdump(1), a new 'syscallname()' function
consolidates duplicated code from ktrsyscall() and ktrsyscallret().
The Linux ABI no longer requires custom handling for ktrsyscall() and
linux_ktrsyscall() has been removed as a result.
Warner Losh [Tue, 26 Jan 2016 18:39:31 +0000 (18:39 +0000)]
Default NANO_DRIVE to ada0 not ad0. This shouldn't affect working
configs (since they'd have to change NANO_DRIVE to be ada0 to work),
but will fix old ones that used to work.
- Start vap(s) (via ieee80211_start_all()) only when initialization
succeeds; stop the first vap otherwise (via ieee80211_stop());
- Do not try to stop a device multiple times
(move (sc->sc_flags & RTWN_RUNNING) check to urtwn_stop_locked()).
Hiren Panchasara [Tue, 26 Jan 2016 16:33:38 +0000 (16:33 +0000)]
Persist timers TCPTV_PERSMIN and TCPTV_PERSMAX are hardcoded with 5 seconds and
60 seconds, respectively. Turn them into sysctls that can be tuned live. The
default values of 5 seconds and 60 seconds have been retained.
LinuxKPI list updates:
- Add some new hlist macros.
- Update existing hlist macros removing the need for a temporary
iteration variable.
- Properly define the RCU hlist macros to be SMP safe with regard
to RCU.
- Safe list macro arguments by adding a pair of parentheses.
- Prefix the _list_add() and _list_splice() functions with "linux"
to reflect they are LinuxKPI internal functions.
Don't clear the software flow control flag before draining for last
close or assert the bug that it is clear when leaving.
Remove an unrelated rotted comment that was attached to the buggy
clearing.
Since draining is not done in more cases, flushing is needed in more
cases, so start fixing flushing:
- do a full flush in ttydisc_close(). State what POSIX requires more
clearly. This was missing ttydevsw_pktnotify() calls to tell the
devsw layer to flush. Hardware tty drivers don't actually flush
since they don't understand this API.
- fix 2 missing wakeups in tty_flush(). Most of the wakeups here are
unnecessary for last close. But ttydisc_close() did one of the
missing ones.
This flow control bug ameliorated the design bug of requiring
potentially unbounded waits in draining. Software flow control is the
easiest way to get an unbounded wait, and a long wait is sometimes
actually useful. Users can type the xoff character on the receiver
and (if ixon is set on the sender) expect the output to be held until
the user is ready for more.
Hardware flow control can also give the unbounded wait, and this bug
didn't affect hardware flow control. Unbounded waits from hardware
flow control take a more unusual configuration. E.g., a terminal
program that controls the modem status lines, or unplugging the cable
in a configuration where this doesn't break the connection.
The design bug is still ameliorated by a newer bug in draining for
last close -- the 1 second timeout. E.g., if the user types the
xoff character and the sender reaches last close, then output is
not resumed and the wait times out after just 1 second. This is
broken, but preferable to an unbounded wait. Before this change,
the output was resumed immediately and usually completed.
Ruslan Bukin [Tue, 26 Jan 2016 14:34:40 +0000 (14:34 +0000)]
Remove uathload from build due to issue with GCC 5.2.0:
"ld: --relax and -r may not be used together."
Requires fixing ld command line arguments and testing.
Svatopluk Kraus [Tue, 26 Jan 2016 13:50:44 +0000 (13:50 +0000)]
Make pmap_fault() return values vm subsystem compliant to
simplify their handling in abort_handler(). While here,
remove one extra initialization of pcb variable.
Alexander Motin [Tue, 26 Jan 2016 13:45:41 +0000 (13:45 +0000)]
MFV r294819: 6495 Fix mutex leak in dmu_objset_find_dp
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Albert Lee <trisk@omniti.com>
Author: Steven Hartland <steven.hartland@multiplay.co.uk>
Alexander Motin [Tue, 26 Jan 2016 13:44:47 +0000 (13:44 +0000)]
6495 Fix mutex leak in dmu_objset_find_dp
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Albert Lee <trisk@omniti.com>
Author: Steven Hartland <steven.hartland@multiplay.co.uk>
Alexander Motin [Tue, 26 Jan 2016 13:40:22 +0000 (13:40 +0000)]
6494 ASSERT supported zio_types for file and disk vdevs
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Albert Lee <trisk@omniti.com>
Author: Steven Hartland <steven.hartland@multiplay.co.uk>
Alexander Motin [Tue, 26 Jan 2016 13:37:30 +0000 (13:37 +0000)]
MFV r294816: 4986 receiving replication stream fails if any snapshot
exceeds refquota
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: Dan McDonald <danmcd@omniti.com>
Alexander Motin [Tue, 26 Jan 2016 13:20:31 +0000 (13:20 +0000)]
4986 receiving replication stream fails if any snapshot exceeds refquota
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: Dan McDonald <danmcd@omniti.com>
This allows to do a full (non-incremental send) and receive it as a clone
of an existing dataset. It can leverage nopwrite to share blocks with the
origin. This can be used to change the relationship of datasets on the
target. For example, maybe on the source you have:
A ---- B ---- C
And you have sent to the target a full of B, and the incremental B->C:
B ---- C
You later realize that you want to have A on the target. You will have to
do a full send of A, but nopwrite can save you space on the target if you
receive it as a clone of B, assuming that A and B have some blocks inxi
common: