jhb [Thu, 12 Nov 2015 23:49:47 +0000 (23:49 +0000)]
MFC 285783:
Various changes to the registers displayed in DDB for x86.
- Fix segment registers to only display the low 16 bits.
- Remove unused handlers and entries for the debug registers.
- Display xcr0 (if valid) in 'show sysregs'.
- Add '0x' prefix to MSR values to match other values in 'show sysregs'.
- MFamd64: Display various MSRs in 'show sysregs'.
- Add a 'show dbregs' to display the value of debug registers.
- Dynamically size the column width for register values to properly
align columns on 64-bit platforms.
- Display %gs for i386 in 'show registers'.
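As a rough illustration of the segment-register and column-width items above, the sketch below masks a saved register word down to its 16-bit selector and derives the hex column width from the register size. The helper names (toy_seg_selector, toy_reg_hex_width, toy_format_reg) are hypothetical and are not taken from the DDB sources.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: keep only the 16-bit selector from a saved word. */
static uint16_t
toy_seg_selector(uint32_t saved_word)
{
    return (uint16_t)(saved_word & 0xffff);
}

/* Hypothetical helper: hex digits needed to print a register of this size. */
static int
toy_reg_hex_width(size_t reg_bytes)
{
    return (int)(reg_bytes * 2);
}

/* Format "name 0x<value>", zero-padding the value to the register width. */
static int
toy_format_reg(char *buf, size_t len, const char *name, uint64_t val,
    size_t reg_bytes)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "%-8s 0x%0*llx", name,
        toy_reg_hex_width(reg_bytes), (unsigned long long)val);
}
```

Passing 4 for reg_bytes gives 8 hex digits and passing 8 gives 16, so the value column stays aligned on both 32-bit and 64-bit platforms.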
jhb [Thu, 12 Nov 2015 22:45:51 +0000 (22:45 +0000)]
MFC 285773,285775,285776:
Various fixes for stack unwinding in DDB on x86.
285773:
Remove some dead code from DDB's amd64 stack unwinder.
The amd64 port copied some code from i386 to fetch function arguments and
display them in backtraces. However, it was commented out and can't easily
be implemented since the function arguments are passed in
registers rather than on the stack in amd64. Remove it in preparation for
some bug fixes in this area.
285775:
Improve stack unwinding on i386 and amd64 after an IP fault.
If we can't find a symbol corresponding to the faulting instruction, assume
that the previously-executed instruction is a call and attempt to find the
calling function using the return address on the stack. Otherwise we end
up associating the last stack frame with the current call, which is
incorrect and causes the unwinder to skip printing of the calling function,
resulting in a confusing backtrace.
285776:
Let the unwinder handle faults during function prologues or epilogues.
The i386 and amd64 DDB stack unwinders contain code to detect and handle
the case where the first frame is not completely set up or torn down. This
code was accidentally unused however, since db_backtrace() was never called
with a non-NULL trap frame. This change fixes that.
Also remove get_rsp() from the amd64 code. It appears to have come from
i386, which needs to take into account whether the exception triggered a
CPL switch, since SS:ESP is only pushed onto the stack if so. On amd64,
SS:RSP is pushed regardless, so get_rsp() was doing the wrong thing for
kernel-mode exceptions. As a result, we can also remove custom print
functions for these registers.
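The difference between the two architectures can be sketched as follows. This is a simplified model, assuming toy trap frame layouts and hypothetical names (toy_trapframe, toy_get_esp, toy_get_rsp); it is not the DDB code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tail of an i386-style trap frame. */
struct toy_trapframe {
    uint32_t tf_eip;
    uint32_t tf_cs;
    uint32_t tf_eflags;
    /* The CPU pushes these two only when the exception changed CPL. */
    uint32_t tf_esp;
    uint32_t tf_ss;
};

#define TOY_RPL_MASK 3  /* low two selector bits hold the privilege level */

/* i386-style: the saved ESP is only valid if we came from user mode. */
static uintptr_t
toy_get_esp(const struct toy_trapframe *tf)
{
    assert(tf != NULL);
    if ((tf->tf_cs & TOY_RPL_MASK) != 0)
        return (uintptr_t)tf->tf_esp;
    /* Kernel mode: nothing was pushed, so the interrupted stack pointer
     * is the address where tf_esp would have been stored. */
    return (uintptr_t)tf + offsetof(struct toy_trapframe, tf_esp);
}

/* amd64-style: SS:RSP is always pushed, so no special case is needed. */
struct toy_trapframe64 {
    uint64_t tf_rip, tf_cs, tf_rflags, tf_rsp, tf_ss;
};

static uint64_t
toy_get_rsp(const struct toy_trapframe64 *tf)
{
    assert(tf != NULL);
    return tf->tf_rsp;
}
```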
hselasky [Thu, 12 Nov 2015 08:47:10 +0000 (08:47 +0000)]
MFC r290140:
Add missing NULL check in physio().
When destroying a character device the si_devsw field is set to NULL
before all references are gone, to indicate the character device is
going away. This can cause a NULL-dereference fault inside physio().
The callers of physio() should own a thread reference on the cdev and
if si_devsw is seen as non-NULL, it is usable during the execution of
the function. Else an ENXIO error code is returned.
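A minimal sketch of the check described above, using hypothetical stand-in types (toy_cdev, toy_cdevsw) rather than the kernel's real structures:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's cdevsw and cdev structures. */
struct toy_cdevsw {
    int d_flags;
};

struct toy_cdev {
    struct toy_cdevsw *si_devsw;  /* set to NULL while the device goes away */
};

static int
toy_physio(struct toy_cdev *dev)
{
    struct toy_cdevsw *csw;

    assert(dev != NULL);
    /* Read the pointer once. The caller is assumed to hold a thread
     * reference, so a non-NULL value stays usable for this call. */
    csw = dev->si_devsw;
    if (csw == NULL)
        return (ENXIO);
    (void)csw->d_flags;  /* the real function would use the devsw here */
    return (0);
}
```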
edwin [Thu, 12 Nov 2015 03:26:05 +0000 (03:26 +0000)]
MFC of 290697,tzdata10:
Update to tzdata2015g:
Turkey's 2015 fall-back transition is scheduled for Nov. 8, not Oct. 25.
Norfolk moves from +1130 to +1100 on 2015-10-04 at 02:00 local time.
Fiji's 2016 fall-back transition is scheduled for January 17, not 24.
Fort Nelson, British Columbia will not fall back on 2015-11-01. It has
effectively been on MST (-0700) since it advanced its clocks on 2015-03-08.
New zone America/Fort_Nelson.
kp [Wed, 11 Nov 2015 12:36:42 +0000 (12:36 +0000)]
MFC r290161:
pf: Fix IPv6 checksums with route-to.
When using route-to (or reply-to) pf sends the packet directly to the output
interface. If that interface doesn't support checksum offloading the checksum
has to be calculated in software.
That was already done in the IPv4 case, but not for the IPv6 case. As a result
we'd emit packets with pseudo-header checksums (i.e. incorrect checksums).
This issue was exposed by the changes in r289316 when pf stopped performing full
checksum calculations for all packets.
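To make the distinction concrete, here is a hedged sketch of a full software checksum over the IPv6 pseudo-header plus the upper-layer data, next to the pseudo-header-only sum that an offload-capable interface would expect to finish itself. The function names are hypothetical and this is not pf's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add bytes to a running ones'-complement sum as big-endian 16-bit words. */
static uint32_t
toy_sum_words(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;  /* pad the odd byte with zero */
    return sum;
}

/* Fold the carries back in and return the ones' complement. */
static uint16_t
toy_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Sum of the IPv6 pseudo-header only: addresses, length, next header. */
static uint32_t
toy_pseudo_sum(const uint8_t src[16], const uint8_t dst[16], uint32_t plen,
    uint8_t nxt)
{
    uint32_t sum = 0;

    sum = toy_sum_words(sum, src, 16);
    sum = toy_sum_words(sum, dst, 16);
    sum += plen >> 16;
    sum += plen & 0xffff;
    sum += nxt;
    return sum;
}

/* Full software checksum. The checksum field in data must be zero when
 * computing, and the result is zero when verifying a correct packet. */
static uint16_t
toy_in6_cksum(const uint8_t src[16], const uint8_t dst[16], uint8_t nxt,
    const uint8_t *data, size_t len)
{
    return toy_fold(toy_sum_words(toy_pseudo_sum(src, dst, (uint32_t)len,
        nxt), data, len));
}
```

A packet that carries only the pseudo-header sum fails verification at the receiver, which is the symptom described above.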
jhb [Wed, 11 Nov 2015 01:32:35 +0000 (01:32 +0000)]
MFC 284324,290164:
Work around debuggers that try to read the full 32-bit words holding
selectors in trapframes.
284324:
Ensure that the upper 16 bits of segment registers manually saved in
trapframes are cleared by explicitly pushing a zero and then moving
the segment register into the low 16 bits. Certain Intel processors
treat a push of a segment register as a move of the segment register
into the low 16 bits leaving the upper 16 bits of the word in the
stack unchanged.
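The effect can be modelled in plain C. This is only a simulation of the behaviour described above, with hypothetical function names; the actual change is in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the quirk: the push writes only the low 16 bits of the slot,
 * so whatever was in the upper half stays there. */
static void
toy_push_seg_quirky(uint32_t *slot, uint16_t sel)
{
    assert(slot != NULL);
    *slot = (*slot & 0xffff0000u) | sel;
}

/* Model of the fix: store a zero word first, then move the selector
 * into the low 16 bits. */
static void
toy_push_seg_fixed(uint32_t *slot, uint16_t sel)
{
    assert(slot != NULL);
    *slot = 0;
    *slot |= sel;
}
```

A debugger that reads the whole 32-bit slot sees stale upper bits in the first case and a clean selector in the second.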
290164:
Use movw instead of movl (or plain mov) when moving segment registers
into memory. This is a nop on clang's assembler, but some assemblers
complain if the size suffix is incorrect.
bapt [Tue, 10 Nov 2015 07:17:38 +0000 (07:17 +0000)]
MFC r290480
Protecting against rm -rf / is now POSIXLY_CORRECT per POSIX 1003.1,
2013 edition. There is no longer any need to disable the protection when
the POSIXLY_CORRECT environment variable is set.
hselasky [Mon, 9 Nov 2015 11:24:59 +0000 (11:24 +0000)]
MFC r290441:
Fix for unaligned IP-header.
The mbuf length fields must be set before m_adj() is called else
m_adj() will not always adjust the mbuf and an unaligned read
exception can trigger inside the network stack. This can happen on
platforms where unaligned reads are not supported. Adjust a length
check to include the 2-byte ethernet alignment while at it.
ngie [Mon, 9 Nov 2015 09:28:34 +0000 (09:28 +0000)]
MFC r266930,r289225:
r266930 (by jmg):
convert to using the _daddr_t types like newfs was...
Put the superblock in the correct position for UFS2... There is a bug
in FFS that if we don't put it here (for UFS2), it will forcefully
relocate the superblock, and I believe cause data loss..
I have a fix for that, but w/ how many releases are broken, we won't be
able to switch to the better _FLOPPY (block 0) for this for a while..
r289225 (by sbruno):
makefs(8) leaves sblock.fs_providersize uninitialized (zero), which can be easily
checked with dumpfs(8). This may lead to other problems, e.g. the geom_label kernel
module's sanity checks do not like a zero fs_old_size value and skip such a UFS1
file system while tasting (fs_old_size derives from sblock.fs_providersize).
ngie [Mon, 9 Nov 2015 09:20:01 +0000 (09:20 +0000)]
MFC r290265,r290267,r290270:
r290265:
Add testcases for -t cd9660 -o isolevel=[1-3]
-- -o isolevel=1 currently fails because of path comparison issues,
so mark it as an expected failure.
-- -o isolevel=3 is not implemented, so expect it to fail as an out
of bounds value [*].
r290267:
Clean up mtree keyword support a slight bit and add a few more default keywords
- Parameterize the mtree keywords as $DEFAULT_MTREE_KEYWORDS
- Test with the extra mtree keywords, `mode,gid,uid`.
- Add a note about mtrees with time support not working with makefs right now
Sponsored by: EMC / Isilon Storage Division
r290270:
Add testcases for -t ffs -o version=[12]
Verify the filesystem type using dumpfs. Add preliminary support
for NetBSD (needs to be validated)
ngie [Mon, 9 Nov 2015 08:59:55 +0000 (08:59 +0000)]
MFC r289203,r290180:
r289203 (by adrian):
makefs: introduce a new option to specify what to round the resulting
image up to.
From ticket:
While trying to run FreeBSD/mips on some device having very small flash media,
one is forced to compress file system with mkulzma(8) utility. It is desirable
to specify small UFS block/fragment sizes like 4096/512 bytes for makefs(8)
and big compression block size like 65535 bytes to mkulzma at the same time.
Then one obtains very good compression ratios (like 75% and more) but faces
the following problem.
geom_uncompress kernel module reports GEOM provider size rounded up to its
compression block size. Generally, this changes original media size and now
it fails to match the size of embedded UFS file system that leads to other
problems, f.e. geom_label kernel module does not like this and skips the
file system while tasting the GEOM and looking for UFS label.
This makes it impossible to refer to the file system using known UFS label
instead of something like /dev/map/rootfs.uncompress.
The following patch introduces new command line option "-r roundup" for makefs
that makes it round up the image to specified block size. Hence, geom_uncompress
does not change GEOM media size for images rounded that way and geom_label
accepts such GEOMs just fine.
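The rounding itself amounts to the following arithmetic. This is a minimal sketch with a hypothetical helper name, not the makefs implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of roundup (hypothetical helper). */
static uint64_t
toy_round_up_size(uint64_t size, uint64_t roundup)
{
    assert(roundup != 0);
    return ((size + roundup - 1) / roundup) * roundup;
}
```

For example, a 100000-byte image rounded up to a 65536-byte block becomes 131072 bytes, so a provider that reports its size in whole compression blocks reports the same size as the file system expects.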
With the patch applied, one can use following commands:
r290180:
- Rename -r to -R to avoid the clash with makefs -r in NetBSD
- Note that -R is an FFS-specific option because it's not implemented
in cd9660 today
- Rename the roundup variable to "roundup-size" in the manpage and help
text for consistency with other variables.
- Bump .Dd (missed in r289203)
Detected by jemalloc, i.e. running makefs failed the arena assert
because my copy of malloc on CURRENT is compiled with the default
!MALLOC_PRODUCTION asserts on
ngie [Mon, 9 Nov 2015 07:56:06 +0000 (07:56 +0000)]
MFC r289739,r289743,r289897,r289901:
r289739:
Correctly reintroduce the rudimentary smoke tests I botched up
in r289684
Sponsored by: EMC / Isilon Storage Division
r289743:
Revise "create_test_inputs" to simplify the file structure as
these testcases don't need to be nested as much as bin/ls/ls_tests.sh
do when verifying ls -a, ls -A, etc. This allows the tests to make
all paths relative to the top of the temporary directory instead of
always tacking on $ATF_TMPDIR, thus complicating things unnecessarily
Create non-empty files in create_test_inputs as well now, similar to
create_test_inputs2 in bin/ls/ls_tests.sh
Compare the input files to the output file contents using diff where
possible:
- Skip over the fifo comparison for now because it always fails
- Skip over the symlink comparison on cd9660 because it always fails
today
Sponsored by: EMC / Isilon Storage Division
r289897:
Add more cd9660/FFS makefs testcases
General changes:
- Parameterize out the mount command.
- Use mtree to verify the contents of an image (check_image_contents) instead
of using diff (diff verifies content, but not file metadata).
- Move common logic out to functions (common_cleanup, mount_image,
check_image_contents)
- Add stub testcases for makefs -D (crashes with SIGBUS, similar to bug # 192839)
- Add a note about the ISO-9660 and rockridge specs
- Add testcases that exercise:
-- Creating disk images from an mtree and multiple directories.
-- -F flag use (not really an extensive testcase right now)
cd9660-specific test changes:
- Remove an XXX comment about symlinks; I forgot that non-rockridge images turn
symlinks into hardlinks.
- Add testcases that exercise:
-- -o allow-deep-trees
-- -o allow-max-name stub testcase (doesn't seem to be implemented in makefs)
-- -o preparer (existence in image; not conformance to spec)
-- -o publisher (existence in image; not conformance to spec)
-- -o rockridge (basic)
ngie [Mon, 9 Nov 2015 07:49:39 +0000 (07:49 +0000)]
MFC r289441:
Integrate tools/test/posixshm and tools/regression/posixshm into the FreeBSD
test suite as tests/sys/posixshm
Some other highlights:
- Convert the testcases over to ATF
- Don't use hardcoded paths to /tmp (which violate the ATF/kyua sandbox); use
mkstemp to generate temporary paths for non-SHM_ANON shm objects.
ngie [Mon, 9 Nov 2015 07:26:34 +0000 (07:26 +0000)]
MFC r290190,r290251:
r290190:
Fix compiler warnings with open_to_operation.c
Other sidenotes:
- Remove unused variables with main(..)
- Convert errx/exit with -1 to errx/exit with 1
- Fix a bogus test in try_directory_open
(expected_errno == expected_errno -> errno == expected_errno) [*]
- Fix some warnings related to discarded qualifiers
- Remove a bogus else-statement at the end of check_mmap_exec(..) in the
successful case. mmap(2), POSIX, Linux, etc all don't state what the
behavior is when mixing O_WRONLY + PROT_EXEC, so assume success for now to
get the test program to pass again.
ngie [Mon, 9 Nov 2015 07:07:25 +0000 (07:07 +0000)]
MFC r290182:
Fix rtsold's usage message
- Remove -a from the usage message example dealing with specific
interfaces. -a only makes sense when not specifying an interface,
such that it's to be run on all interfaces
- Fix the pidfile option (it's -p, not -P)
- Change `interfaces` to `interface` to match the manpage
delphij [Mon, 9 Nov 2015 01:53:54 +0000 (01:53 +0000)]
MFC r290024,290073:
In gunzip(1), treat trailing garbage as a warning and not an error. This
allows scripts to distinguish it from real fatal errors, for instance a
CRC mismatch.
Addition to prev. commit.
In some edge cases fp->_p can be changed in _sseek(), recalculate.
r290230:
Don't seek to the end if write buffer is empty (in append modes).
PR: 204156
r290110:
Add _flags2 per jhb@ suggestion since no room left in _flags.
Rewrite O_APPEND flag checking using new __S2OAP flag.
r289931:
According to POSIX, a write operation shall start at the current size of
the stream (if mode had 'a' as the first character).
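The POSIX behaviour referred to here can be observed from userland with standard stdio calls. The sketch below uses hypothetical helper names. It opens a file in append mode, seeks to the start, and writes; the data should still land at the end of the file.

```c
#include <assert.h>
#include <stdio.h>

/* Open path in append mode, seek to the start, then write text.
 * Returns 0 on success, -1 on any stdio failure. */
static int
toy_append_after_seek(const char *path, const char *text)
{
    FILE *fp;
    int rc = 0;

    assert(path != NULL && text != NULL);
    fp = fopen(path, "a");
    if (fp == NULL)
        return -1;
    /* In append mode the seek must not change where writes go. */
    if (fseek(fp, 0L, SEEK_SET) != 0 || fputs(text, fp) == EOF)
        rc = -1;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}

/* Read up to len - 1 bytes of path into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
static long
toy_read_file(const char *path, char *buf, size_t len)
{
    FILE *fp;
    size_t n;

    assert(path != NULL && buf != NULL && len > 0);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    n = fread(buf, 1, len - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return (long)n;
}
```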
r289863:
Since no room left in the _flags, reuse __SALC for O_APPEND.
It helps to remove _fcntl() call from _ftello() and optimize seek position
calculation in _swrite().
jhb [Fri, 6 Nov 2015 16:57:23 +0000 (16:57 +0000)]
MFC 288902:
Include additional info in ptrace(2) KTR traces:
- The new PC value and signal passed to PT_CONTINUE, PT_DETACH, PT_SYSCALL,
and PT_TO_SC[EX].
- The system call code returned via PT_LWPINFO.
jhb [Fri, 6 Nov 2015 16:48:33 +0000 (16:48 +0000)]
MFC 288452,289719:
288452:
Most error cases in i915_gem_do_execbuffer() jump to one of two labels to
release resources (such as unholding pages) when errors occur. Some
recently added error checks return immediately instead of jumping to a
label resulting in leaks. Fix these to jump to a label to do cleanup
instead.
Note that stable/9 does not have the "recently added" error checks, but
it does have some older error checks (that are no longer present
in stable/10 and head) that have the same bug and this fixes those
instead.
289719:
i915_gem_do_execbuffer() holds the pages backing each relocation region for
various reasons while executing user commands. After these commands are
completed, the pages backing the relocation regions are unheld.
Since relocation regions do not have to be page aligned, the code in
validate_exec_list() allocates 2 extra page pointers in the array of
held pages populated by vm_fault_quick_hold_pages(). However, the cleanup
code that unheld the pages always assumed that only the buffer size /
PAGE_SIZE pages were used. This meant that non-page aligned buffers would
not unheld the last 1 or 2 pages in the list. Fix this by saving the
number of held pages returned by vm_fault_quick_hold_pages() for each
relocation region and using this count during cleanup.
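The off-by-one-or-two effect is easy to see with a little arithmetic. The sketch below assumes a 4096-byte page and hypothetical helper names; it compares the number of pages a buffer actually touches with the count the old cleanup code assumed.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Number of pages touched by the byte range [addr, addr + len). */
static uint64_t
toy_pages_spanned(uint64_t addr, uint64_t len)
{
    uint64_t first, last;

    assert(len > 0);
    first = addr / TOY_PAGE_SIZE;
    last = (addr + len - 1) / TOY_PAGE_SIZE;
    return last - first + 1;
}

/* What the old cleanup assumed: buffer size / PAGE_SIZE. */
static uint64_t
toy_pages_assumed(uint64_t len)
{
    return len / TOY_PAGE_SIZE;
}
```

An 8192-byte buffer starting at offset 0 touches 2 pages and both counts agree, but the same buffer starting at offset 100 touches 3 pages, and an 8392-byte buffer starting at offset 4000 touches 4 while the assumed count is still 2.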
hselasky [Fri, 6 Nov 2015 13:34:55 +0000 (13:34 +0000)]
MFC r290195:
Reduce the DWC OTG interrupt load by not reading all the host channel
status registers for every interrupt. Check a common host channel
status interrupt register first, then conditionally read the
individual host channel status registers.
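The idea can be sketched with a simulated controller. The structure, field names, and the read counter are illustrative assumptions, not the driver's real register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_CHANNELS 16

/* Simulated controller state (hypothetical layout). */
struct toy_otg {
    uint32_t haint;                    /* one pending bit per host channel */
    uint32_t hcint[TOY_MAX_CHANNELS];  /* per-channel status registers */
    unsigned reads;                    /* counts simulated register reads */
};

/* Simulated register read that records how often the bus is touched. */
static uint32_t
toy_read(struct toy_otg *sc, const uint32_t *reg)
{
    sc->reads++;
    return *reg;
}

/* Read the common status register once, then read only the per-channel
 * registers whose bit is set. Returns the number of channels serviced. */
static unsigned
toy_host_intr(struct toy_otg *sc, unsigned nchan)
{
    uint32_t pending;
    unsigned ch, serviced = 0;

    assert(sc != NULL && nchan <= TOY_MAX_CHANNELS);
    pending = toy_read(sc, &sc->haint);
    for (ch = 0; ch < nchan; ch++) {
        if ((pending & (1u << ch)) == 0)
            continue;
        (void)toy_read(sc, &sc->hcint[ch]);
        serviced++;
    }
    return serviced;
}
```

With two of eight channels pending, this performs three register reads per interrupt instead of eight.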
jhb [Thu, 5 Nov 2015 21:22:23 +0000 (21:22 +0000)]
MFC 288371:
When XSAVE support was added on amd64, the FPU save area was moved
out of 'struct pcb' and into a variable-sized region after the
structure. The kgdb code currently only reads the pcb. It does not
read in the FPU save area but instead passes stack garbage as the
FPU's saved context. Fixing this would mean determining the proper
size of the area and fetching it. However, this state is not saved
for running CPUs in stoppcbs[], so the callback would also have to
know to ignore those pcbs. Instead, just remove the call since it is
of limited usefulness. It results in kgdb reporting the state of the
FPU/SIMD registers in userland, not their current values in the kernel.
In particular, it does not report the correct state for any code in
the kernel which does use the FPU and would report incorrect values
in that case.
jhb [Thu, 5 Nov 2015 19:55:45 +0000 (19:55 +0000)]
MFC 287934:
The EFI boot loader allocates a single chunk of contiguous memory to
hold the kernel, modules, and any other loaded data. This memory block
is relocated to the kernel's expected location during the transfer of
control from the loader to the kernel.
The GENERIC kernel on amd64 has recently grown such that a kernel + zfs.ko
no longer fits in the default staging size. Bump the default size from
32MB to 48MB to provide more breathing room.
ngie [Thu, 5 Nov 2015 07:48:48 +0000 (07:48 +0000)]
MFC r289913,r289916:
r289913:
Use 't' (bits) not 'i' (bytes) for describing MRIE (aka
"Method of Reporting Informational Exceptions") in the SCSI mode database as
the field described in X3T10/94-190 (revision 4; page 2, table 1) [1.] is
4 bits wide, not 4 bytes wide
Bug 200619
Reported by: Michael Baptist <mbaptist@isilon.com>
Submitted by: Lars Skodje <lskodje@isilon.com>
Sponsored by: EMC / Isilon Storage Division
r289916:
Limit RESOLUTION_MAX to INT_MAX, not UINT_MAX (all spelled out) so the
mode value isn't always clipped to -1 when (resolution * size) == 32, which
would have been the case with values => {4i,32b,32t}.
This seems to have been broken in r64382.
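The clipping problem can be illustrated as follows. This is a hedged sketch that assumes a 32-bit int and uses a hypothetical helper; it is not the actual RESOLUTION_MAX macro.

```c
#include <assert.h>
#include <limits.h>

/* Largest value accepted for a field of the given width when the value
 * is held in an int (hypothetical helper, assumes a 32-bit int). */
static int
toy_field_max(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    if (bits >= 32)
        return INT_MAX;  /* UINT_MAX stored in an int would read as -1 */
    return (int)((1u << bits) - 1u);
}
```

With UINT_MAX as the limit, every value for a full 32-bit field would be compared against -1 and clipped to it. With INT_MAX the limit stays positive.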
PR: 200619
Reported by: Michael Baptist
Submitted by: Lars Skodje
Sponsored by: EMC / Isilon Storage Division
hrs [Wed, 4 Nov 2015 01:00:42 +0000 (01:00 +0000)]
MFC r288600:
- Schedule DAD for IN6_IFF_TENTATIVE addresses in nd6_timer(). This
catches cases that DAD probes cannot be sent because of
IFF_UP && !IFF_DRV_RUNNING.
- nd6_dad_starttimer() now calls nd6_dad_ns_output(), instead of
calling it before nd6_dad_starttimer().
- Do not release an entry in dadq when a duplicate entry is being
added.
hselasky [Tue, 3 Nov 2015 10:24:54 +0000 (10:24 +0000)]
MFC r285914, r289029 and r289560:
- Move the remainder of host controller capability registers reading from
xhci_start_controller() to xhci_init(). These values don't change at
run-time so there's no point in acquiring them on every USB_HW_POWER_RESUME
instead of only once during initialization. In r276717, reading the first
couple of registers in question already had been moved as a prerequisite
for the changes in that revision.
- Identify ASMedia ASM1042A controllers.
- Use NULL instead of 0 for pointers.
- Add quirks for USB 3.0 PCI devices.
kib [Tue, 3 Nov 2015 08:31:01 +0000 (08:31 +0000)]
MFC r289660,r289664:
Do not allow ptrace(PT_TRACE_ME) to be executed when the process is
already traced or when there is no parent which can trace the process.
dteske [Mon, 2 Nov 2015 21:46:58 +0000 (21:46 +0000)]
MFC r287696:
The <arch>/mkisoimages.sh script in release knows how to add
extra bits from an "xtra-bits-dir". This feature is unusable
from release/Makefile. Add an XTRADIR setting to use it.
MFC r287697: Whitespace alignment
wollman [Fri, 30 Oct 2015 19:26:55 +0000 (19:26 +0000)]
Long-overdue MFC of r280930:
Fix overflow bugs in and remove obsolete limit from kernel RPC
implementation.
The kernel RPC code, which is responsible for the low-level scheduling
of incoming NFS requests, contains a throttling mechanism that
prevents too much kernel memory from being tied up by NFS requests
that are being serviced. When the throttle is engaged, the RPC layer
stops servicing incoming NFS sockets, resulting ultimately in
backpressure on the clients (if they're using TCP). However, this is
a very heavy-handed mechanism as it prevents all clients from making
any requests, regardless of how heavy or light they are. (Thus, when
engaged, the throttle often prevents clients from even mounting the
filesystem.) The throttle mechanism applies specifically to requests
that have been received by the RPC layer (from a TCP or UDP socket)
and are queued waiting to be serviced by one of the nfsd threads; it
does not limit the amount of backlog in the socket buffers.
The original implementation limited the total bytes of queued requests
to the minimum of a quarter of (nmbclusters * MCLBYTES) and 45 MiB.
The former limit seems reasonable, since requests queued in the socket
buffers and replies being constructed to the requests in progress will
all require some amount of network memory, but the 45 MiB limit is
plainly ridiculous for modern memory sizes: when running 256 service
threads on a busy server, 45 MiB would result in just a single
maximum-sized NFS3PROC_WRITE queued per thread before throttling.
Removing this limit exposed integer-overflow bugs in the original
computation, and related bugs in the routines that actually account
for the amount of traffic enqueued for service threads. The old
implementation also attempted to reduce accounting overhead by
batching updates until each queue is fully drained, but this is prone
to livelock, resulting in repeated accumulate-throttle-drain cycles on
a busy server. Various data types are changed to long or unsigned
long; explicit 64-bit types are not used due to the unavailability of
64-bit atomics on many 32-bit platforms, but those platforms also
cannot support nmbclusters large enough to cause overflow.
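The limit computation and the overflow hazard can be sketched like this. The function names are hypothetical, and fixed-width types are used only so the wraparound is visible on any host; the commit itself uses long and unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MIB (1024u * 1024u)

/* Original policy: min(quarter of cluster memory, 45 MiB). */
static uint64_t
toy_old_limit(uint64_t nmbclusters, uint64_t mclbytes)
{
    uint64_t quarter = nmbclusters * mclbytes / 4;
    uint64_t cap = 45ull * TOY_MIB;

    return quarter < cap ? quarter : cap;
}

/* New policy: only the quarter-of-cluster-memory limit remains. */
static uint64_t
toy_new_limit(uint64_t nmbclusters, uint64_t mclbytes)
{
    return nmbclusters * mclbytes / 4;
}

/* The same product in 32-bit arithmetic wraps for large nmbclusters. */
static uint32_t
toy_product32(uint32_t nmbclusters, uint32_t mclbytes)
{
    return nmbclusters * mclbytes;
}

/* Queue budget per service thread under a given limit. */
static uint64_t
toy_per_thread(uint64_t limit, unsigned threads)
{
    assert(threads > 0);
    return limit / threads;
}
```

With 256 threads, the 45 MiB cap leaves 184320 bytes (180 KiB) of queued request data per thread. With 2097152 clusters of 2048 bytes, the 32-bit product wraps to 0 while the wide computation gives a 1 GiB limit.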
This code (in a 10.1 kernel) is presently running on production NFS
servers at CSAIL.
Summary of this revision:
* Removes 45 MiB limit on requests queued for nfsd service threads
* Fixes integer-overflow and signedness bugs
* Avoids unnecessary throttling by not deferring accounting for
completed requests
delphij [Thu, 29 Oct 2015 17:00:51 +0000 (17:00 +0000)]
MFC r289038,r289041:
Add encoding for mime-types.
Fix short month names and replace %b with %_m in date_fmt for Chinese
locales.
When using a Chinese locale, such as zh_TW.UTF-8 or zh_CN.UTF-8,
nl_langinfo(ABMON_*) only returned numbers. For instance,
nl_langinfo(ABMON_1) returns 1, nl_langinfo(ABMON_2) returns 2, and
so on.
This causes problems in applications that put the short month name
and the day of the month together. For example, 'Apr 14' in English
becomes '414日' in Chinese on the top bar of GNOME Shell.
This problem may be resolved by appending '月' to all short month
names and replacing %b with %_m in date_fmt. ja_JP.UTF-8 already
does this, and this matches the en_US.ISO8859-1 behavior, which
returns 'Oct'. The GNU C Library also returns values with '月'
appended.
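For reference, this is roughly how an application ends up combining the abbreviated month name with the day. The sketch uses the standard nl_langinfo(3) interface; the toy_short_date helper and its "%s %d" layout are assumptions for illustration.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Format "<abbreviated month> <day>" using the current locale's ABMON_*
 * strings. month_index is 0 for January through 11 for December. */
static int
toy_short_date(char *buf, size_t len, int month_index, int day)
{
    static const nl_item abmon[12] = {
        ABMON_1, ABMON_2, ABMON_3, ABMON_4, ABMON_5, ABMON_6,
        ABMON_7, ABMON_8, ABMON_9, ABMON_10, ABMON_11, ABMON_12
    };

    assert(buf != NULL && month_index >= 0 && month_index < 12);
    return snprintf(buf, len, "%s %d", nl_langinfo(abmon[month_index]), day);
}
```

In the C locale, index 3 and day 14 give "Apr 14". A locale whose ABMON_4 is just a bare number puts two numbers side by side, which is the ambiguity the change addresses.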
PR: 199441
Submitted by: Ting-Wei Lan <lantw44 gmail com>