Improvements to the MDXFileChunk() template function:
- Remove unneeded fstat()/lseek() calls.
- Return NULL and set errno to EINVAL on negative length.
- Fix small style problems and expand variable names.
After this change, it is possible to use this code for some irregular
files. For example, 'md5 /dev/md0' should now succeed.
Ian Lepore [Thu, 14 Jan 2016 19:33:13 +0000 (19:33 +0000)]
Fix the handling of the "PDC write transfer length" erratum for at91. The
problem affects revision 1xx hardware as well as later versions. Also, the
recommended workaround is to set the PDC count register for a 12-byte
transfer when the actual size is less than that, but there is no need to
extend or zero-out the data buffer, because the blklen register contains
the real transfer size and only that many bytes will be transferred.
Also add a sysctl to turn debugging printfs on or off on the fly.
Andrew Turner [Thu, 14 Jan 2016 19:00:13 +0000 (19:00 +0000)]
Set -mlong-calls where needed to get a static clang and lldb 3.8.0
linking. These are too large for a branch instruction to branch from an
earlier point in the code to somewhere later.
This will also allow these to be build with Thumb-2 when we get this
infrastructure.
Reviewed by: dim
Differential Revision: https://reviews.freebsd.org/D4855
Alan Somers [Thu, 14 Jan 2016 18:19:05 +0000 (18:19 +0000)]
Fix race condition involving ZFS remove events
When a ZFS drive disappears, ZFS sends a resource.fs.zfs.removed event to
userland. A userland program like zfsd(8) can use that event, for example to
activate a hotspare. The current code contains a race condition: vdev_geom
will sent the sysevent _before_ spa.c would update the vdev's status,
causing userland processes to see pool state that does not reflect the
device removal. This change moves the sysevent to spa.c, closing the race.
Reviewed by: delphij, Sean Eric Fagan
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D4902
Fix the code to retry mount attempt in mountcritlocal if there are
any root mount holds. The previous one used a wrong conditional - the
"err=$?" assignment resets "$?" to 0.
Submitted by: jilles@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Warner Losh [Thu, 14 Jan 2016 16:23:07 +0000 (16:23 +0000)]
Document how to enter the debugger here. I'm sure there's some better
canonical place, and the nit-pickers are welcome to move this
information there with a cross reference.
Netflow module is supposed to store (along with fields like
gateway address and interface index) matched netmask for each record.
This (currently) requires returning individual route entries, instead
of optimized next-hop structure. Given that, use control-plane
rib_lookup_info() function to avoid accessing rtentries directly.
While rib_lookup_info() might be slower, than fibX_lookup() flavours,
it is more scalable than rtalloc1_fib(), because rtentry mutex is
not acquired.
Gleb Smirnoff [Thu, 14 Jan 2016 10:22:45 +0000 (10:22 +0000)]
There is a bug in tcp_output()'s implementation of the TCP_SIGNATURE
(RFC 2385/TCP-MD5) kernel option.
If a tcpcb has TF_NOOPT flag, then tcp_addoptions() is not called,
and to.to_signature is an uninitialized stack variable. The value
is later used as write offset, which leads to writing to random
address.
Gleb Smirnoff [Thu, 14 Jan 2016 10:16:25 +0000 (10:16 +0000)]
Call crextend() before copying old credentials to the new credentials
and replace crcopysafe by crcopy as crcopysafe is is not intended to be
safe in a threaded environment, it drops PROC_LOCK() in while() that
can lead to unexpected results, such as overwrite kernel memory.
In my POV crcopysafe() needs special attention. For now I do not see
any problems with this function, but who knows.
Submitted by: dchagin
Found by: trinity
Security: SA-16:04.linux
Gleb Smirnoff [Thu, 14 Jan 2016 10:13:58 +0000 (10:13 +0000)]
Change linux get_robust_list system call to match actual linux one.
The set_robust_list system call request the kernel to record the head
of the list of robust futexes owned by the calling thread. The head
argument is the list head to record.
The get_robust_list system call should return the head of the robust
list of the thread whose thread id is specified in pid argument.
The list head should be stored in the location pointed to by head
argument.
In contrast, our implemenattion of get_robust_list system call copies
the known portion of memory pointed by recorded in set_robust_list
system call pointer to the head of the robust list to the location
pointed by head argument.
So, it is possible for a local attacker to read portions of kernel
memory, which may result in a privilege escalation.
Gleb Smirnoff [Thu, 14 Jan 2016 10:11:10 +0000 (10:11 +0000)]
Verify the packet length in sctp6_input().
The sctp6_ctlinput() function does not properly check the length of the packet
it receives from the ICMP6 input routine. This means that an attacker can craft
a packet that will cause a kernel panic.
When the kernel receives an ICMP6 error message with one of the types/codes
it handles, it calls icmp6_notify_error() to deliver it to the upper-level
protocol. icmp6_notify_error() cycles through the extension headers (if any)
to find the protocol number of the first non-extension header. It does NOT
verify the length of the non-extension header.
It passes information about the packet (including the actual packet) to the
upper-level protocol's pr_ctlinput function. In the case of SCTP for IPv6,
icmp6_notify_error() calls sctp6_ctlinput().
sctp6_ctlinput() assumes that the incoming packet contains a sufficiently-long
SCTP header and calls m_copydata() to extract a copy of that header. In turn,
m_copydata() assumes that the caller has already verified that the offset and
length parameters are correct. If they are incorrect, it will dereference a
NULL pointer and cause a kernel panic.
In short, no one is sufficiently verifying the input, and the result is a
kernel panic.
Steven Hartland [Thu, 14 Jan 2016 09:22:01 +0000 (09:22 +0000)]
Fix GCC warnings causing build failure after r293724
Disable some compiler warnings for GCC (non-standard compiler) fixing
build failures introduced by r293724, which enabled WARNS in the EFI boot
code, when compiling with none standard compiler (GCC).
Andrew Rybchenko [Thu, 14 Jan 2016 09:19:28 +0000 (09:19 +0000)]
sfxge: add accessors for license-related MCDI calls to common code
Add support for Huntington MCDI licensing interface to common code.
Ported from Linux net driver IOCTL functions with restructuring for
initial support for V3 licensing API.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4918
Andrew Rybchenko [Thu, 14 Jan 2016 09:11:20 +0000 (09:11 +0000)]
sfxge: fix common code VPD iterator and duplicate tag verification
Fix efx_vpd_hunk_next() which has -- since its inception -- failed to
correctly iterate over the tags and keywords contained in the VPD data.
Only the first tag or keyword would be returned and the next call with
*contp == 1 would walk to the end of the data and finish.
This was spotted when fixing up errors spotted by Prefast code analysis
(which neglected to set all of the out parameters in all successful cases)
Also fix efx_vpd_verify() on Siena and EF10 which (as a side effect of
correctly iterating over all the tags and keywords) was failing as it
detected that both the static VPD and dynamic VPD storage contained an
RV keyword in the VPD-R tag. This is intentional as the static VPD and
dynamic VPD are stored separately (firmware merges their contents and
computes a new RV keyword checksum for the data readable from the VPD
capability in PCIe configuration space).
Submitted by: Andrew Lee <alee at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4915
Andrew Rybchenko [Thu, 14 Jan 2016 09:07:40 +0000 (09:07 +0000)]
sfxge: use correct register definitions for setting interrupt moderation on Medford
The only value which has changed is the number of rows
(ER_DZ_EVQ_TMR_REG_ROWS is 2048 vs 1024 for FR_BZ_TIMER_COMMAND_REGP0_ROWS)
but that isn't used, so this shouldn't change behaviour.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4913
Ed Schouten [Thu, 14 Jan 2016 07:27:42 +0000 (07:27 +0000)]
Remove an unneeded assignment of the return value.
tdelete() is supposed to return the address of the parent node that has
been deleted. We already keep track of this node in the loop between
lines 94-107. The GO_LEFT()/GO_RIGHT() macros are used later on as well,
so we must make sure not to change it to something else.
Sepherosa Ziehau [Thu, 14 Jan 2016 03:16:29 +0000 (03:16 +0000)]
hyperv: set receive buffer size according to NVSP protocol version
If the NVSP protocol version is not greater than NVSP_PROTOCOL_VERSION_2,
then the recv buffer size is 15MB, otherwise the buffer size is 16MB.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reviewed by: royger, Dexuan Cui <decui microsoft com>, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4814
Sepherosa Ziehau [Thu, 14 Jan 2016 03:11:35 +0000 (03:11 +0000)]
hyperv: add interrupt counters
Submitted by: Howard Su <howard0su gmail com>
Reviewed by: royger, Dexuan Cui <decui microsoft com>, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4693
Sepherosa Ziehau [Thu, 14 Jan 2016 03:05:10 +0000 (03:05 +0000)]
hyperv: implement an event timer
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: delphij, royger, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4676
Sepherosa Ziehau [Thu, 14 Jan 2016 02:50:13 +0000 (02:50 +0000)]
hyperv: use x86 generic code to do the hypervisor detection
This is first step to move the generic part of HV code into kernel instead
of module, so that it is possible to use hypercall to implement some other
paravirtualization code in the kernel.
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: royger, delphij, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D3072
Eric van Gyzen [Thu, 14 Jan 2016 00:31:00 +0000 (00:31 +0000)]
bsdinstall: Suggest the GPT+Active workaround on Dell T5810
The Dell Precision Tower 5810 fails to boot from GPT in Legacy/BIOS mode
without the Active flag in the Protective MBR. Suggest the workaround
during installation.
Since an increasing number of Dell systems exhibit this behavior,
I imagine all Dells past a certain date will do so. I would like
to suggest the workaround for all Dells with a BIOS date of, say,
2014 or later, but I would need to test a variety of systems before
committing such a change.
Reviewed by: allanjude, dteske
MFC after: 5 days
Relnotes: We should probably suggest using GPT+Active on "recent" Dells.
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D4075
Marius Strobl [Wed, 13 Jan 2016 21:47:27 +0000 (21:47 +0000)]
Given that em(4), lem(4) and igb(4) hardware doesn't require the
alignment guarantees provided by m_defrag(9), use m_collapse(9)
instead for performance reasons.
While at it, sanitize the statistics softc members, i. e. retire
unused ones and add SYSCTL nodes missing for actually used ones.
Andrew Turner [Wed, 13 Jan 2016 21:34:15 +0000 (21:34 +0000)]
Add support for relocating AArch64 modules to kldxref. This fixes an error
message where it fails to read the module as the unrelocated addresses
are zero.
Steven Hartland [Wed, 13 Jan 2016 18:33:12 +0000 (18:33 +0000)]
Improve non-interactive forth cmd error reporting
Non-interactive forth command errors where silent even for critical issues
e.g. failing to load a required kernel module or mfs_root.
This resulted in later unexplained and hard to trace errors such as mount
root failures.
This introduces additional command return codes that are treated
appropriately by the non-interactive command processor (bf_command).
* CMD_CRIT = print error
* CMD_FATAL = panic
Also fix minor style(9) issues with command_load return codes.
Alan Somers [Wed, 13 Jan 2016 17:33:50 +0000 (17:33 +0000)]
Fix Coverity warnings regarding r293229
rpcbind/check_bound.c
Fix CID1347798, a memory leak in mergeaddr.
rpcbind/tests/addrmerge_test.c
Fix CID1347800 through CID1347803, memory leaks in ATF tests. They
are harmless because each ATF test case runs in its own process, but
they are trivial to fix. Fix a few other leaks that Coverity didn't
detect, too.
Andrew Turner [Wed, 13 Jan 2016 15:54:17 +0000 (15:54 +0000)]
Remove the compat code to handle the kernel passing us an unalinged
stackpointer. Userland expects the kernel to pass it an aligned sp and
pass a pointer to the arguments in x0. The kernel side was updated in
r289502, 3 months ago.
Steven Hartland [Wed, 13 Jan 2016 14:47:13 +0000 (14:47 +0000)]
Increase efiboot.img size used in ISO creation
Due to recent and upcoming changes to add additional functionality to
the EFI loader its now bigger than the space allocates for efiboot.img
so increase this in line with boot1.efifat.
Move the funsetown(9) call from audit_pipe_close() to cdevpriv
destructor. As result, close method becomes trivial and removed.
Final cdevsw close method might be called without file
context (e.g. in vn_open_vnode() if the vnode is reclaimed meantime),
which leaves ap_sigio registered for notification, despite cdevpriv
destructor frees the memory later.
Call destructor instead of doing a cleanup inline, for
devfs_set_cdevpriv() failure in open. This adds missed funsetown(9)
call and locks ap to satisfy audit_pipe_free() invariants.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Switch legacy pty clone handler to use make_dev_s(9). Add
MAKEDEV_CHECKNAME flag to the call, this is required to not panic on
race between the clone and destructing the closed master.
Reported by and discussed with: bde
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Enji Cooper [Wed, 13 Jan 2016 09:14:27 +0000 (09:14 +0000)]
Integrate
tools/regression/geom_{concat,eli,gate,mirror,nop,raid3,shsec,stripe,uzip}
in to the FreeBSD test suite as
tests/sys/geom/class/{concat,eli,gate,mirror,nop,raid3,shsec,stripe,uzip}
The tools/regression/geom and tools/regression/geom_part testcases are being
left alone because both test sets are both currently broken.
The majority of this work was done on ^/user/ngie/more-tests2 . The differences
are as follows:
- tests/sys/geom/class/Makefile.inc is not present; it was
inlined into the class's Makefiles for explicitness.
- The testcases officially require root via kyua
- The geom_gate(4) tests don't use the pidfile changes proposed in
https://reviews.freebsd.org/D4836 .
Andrew Rybchenko [Wed, 13 Jan 2016 06:37:45 +0000 (06:37 +0000)]
sfxge: remove unused common code EFSYS_OPT_RX_HDR_SPLIT
The EFSYS_OPT_RX_HDR_SPLIT optional feature in the common code
implemented the Lookahead Split feature of Windows. This split
received packets at a preconfigured byte offset, and delivered
the header and payload portions to separate receive queues.
Now the common code interface has no callers, so remove it.
Note that this should not be confused with the Header Data Split
feature of Windows, which splits packets at a header boundary.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4888
Andrew Rybchenko [Wed, 13 Jan 2016 06:34:51 +0000 (06:34 +0000)]
sfxge: rename common hunt NIC methods to ef10
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4880
Marcelo Araujo [Wed, 13 Jan 2016 01:49:35 +0000 (01:49 +0000)]
ypldap(8) is a feature ready to be used to translate nis(8) database to ldap(3).
This commit, fix a core dump on ypldap(8) related with memory allocation.
Also an example of how to set the ypldap.conf(5) properly is added to
examples files.
A new user _ypldap is required to be able to run ypldap(8) as well as
in a chroot mode.
Ian Lepore [Wed, 13 Jan 2016 00:22:12 +0000 (00:22 +0000)]
Go back to using uintptr_t, because code that actually compiles is
infinitely less buggy than code that is theoretically correct in some
alternate universe.
The uintfptr_t type is apparently a freebsd invention, and exists only when
compiling the kernel. It's a little hard to say for sure, since it doesn't
seem to be documented anywhere except in email advice to unsuspecting and
overly-trusting souls, who then get to wear the pointy hat for blindly
following advice without investigating or testing it first.
Ian Lepore [Tue, 12 Jan 2016 18:42:00 +0000 (18:42 +0000)]
Restore uart PPS signal capture polarity to its historical norm, and add an
option to invert the polarity in software. Also add an option to capture
very narrow pulses by using the hardware's MSR delta-bit capability of
latching line state changes.
This effectively reverts the mistake I made in r286595 which was based on
empirical measurements made on hardware using TTL-level signaling, in which
the logic levels are inverted from RS-232. Thus, this re-syncs the polarity
with the requirements of RFC 2783, which is writen in terms of RS-232
signaling.
Narrow-pulse mode uses the ability of most ns8250 and similar chips to
provide a delta indication in the modem status register. The hardware is
able to notice and latch the change when the pulse width is shorter than
interrupt latency, which results in the signal no longer being asserted by
time the interrupt service code runs. When running in this mode we get
notified only that "a pulse happened" so the driver synthesizes both an
ASSERT and a CLEAR event (with the same timestamp for each). When the pulse
width is about equal to the interrupt latency the driver may intermittantly
see both edges of the pulse. To prevent generating spurious events, the
driver implements a half-second lockout period after generating an event
before it will generate another.
Ian Lepore [Tue, 12 Jan 2016 16:31:07 +0000 (16:31 +0000)]
Cast using uintfptr_t and eliminate the cast to uint64_t which is uneeded
because rounding down cannot increase the number of bits needed to express
the result.
I had no idea there was such a thing as uintfptr_t.