oshogbo [Thu, 25 Feb 2016 18:23:40 +0000 (18:23 +0000)]
Convert casperd(8) daemon to the libcasper.
After calling the cap_init(3) function Casper will fork from its original
process, using pdfork(2). Forking from a process has a lot of advantages:
1. We have the same cwd as the original process.
2. The same uid, gid and groups.
3. The same MAC labels.
4. The same descriptor table.
5. The same routing table.
6. The same umask.
7. The same cpuset(1).
From now on, services also take the form of libraries.
We also removed libcapsicum entirely and converted existing programs using Casper
to the new architecture.
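The inherited attributes listed above follow from ordinary fork semantics. As a minimal sketch, the code below checks two of them (cwd and umask). It uses portable fork(2) as a stand-in for pdfork(2), and the helper name child_inherits_cwd_and_umask() is hypothetical, not part of libcasper.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a forked child sees the same cwd
 * and umask as its parent, 0 if it does not, -1 on error.
 * fork(2) is used here as a portable stand-in for pdfork(2).
 */
int
child_inherits_cwd_and_umask(void)
{
    char parent_cwd[4096], child_cwd[4096];
    mode_t parent_mask, child_mask;
    pid_t pid;
    int status;

    if (getcwd(parent_cwd, sizeof(parent_cwd)) == NULL)
        return (-1);
    /* umask(2) can only be read by setting it; restore it right away. */
    parent_mask = umask(0);
    umask(parent_mask);

    pid = fork();
    if (pid == -1)
        return (-1);
    if (pid == 0) {
        /* Child: compare its own view with the values copied at fork. */
        if (getcwd(child_cwd, sizeof(child_cwd)) == NULL)
            _exit(2);
        child_mask = umask(0);
        umask(child_mask);
        _exit((strcmp(parent_cwd, child_cwd) == 0 &&
            child_mask == parent_mask) ? 0 : 1);
    }
    assert(pid > 0);
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return (-1);
    if (WEXITSTATUS(status) == 2)
        return (-1);
    return (WEXITSTATUS(status) == 0);
}
```

The child reports through its exit status so that the sketch needs no extra IPC.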
andrew [Thu, 25 Feb 2016 16:50:36 +0000 (16:50 +0000)]
Add support for the Allwinner A31 watchdog to the existing A10 watchdog
driver. This mostly involves selecting the register offsets to use at
runtime based on the hardware we are talking to.
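Selecting register offsets at runtime can be sketched as a per-variant layout table. The type names, function names and offset values below are illustrative placeholders and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware variants handled by a single driver. */
enum wdog_type { WDOG_A10, WDOG_A31 };

/* Per-variant register layout; the offsets are placeholders. */
struct wdog_regs {
    uint32_t ctrl;  /* restart ("kick") register offset */
    uint32_t mode;  /* enable and interval register offset */
};

static const struct wdog_regs wdog_layouts[] = {
    [WDOG_A10] = { .ctrl = 0x00, .mode = 0x04 },
    [WDOG_A31] = { .ctrl = 0x10, .mode = 0x18 },
};

/* Pick the layout once, based on the detected hardware. */
const struct wdog_regs *
wdog_select_regs(enum wdog_type type)
{
    assert((size_t)type < sizeof(wdog_layouts) / sizeof(wdog_layouts[0]));
    return (&wdog_layouts[type]);
}

/* Simulated register write that goes through the selected layout. */
void
wdog_kick(uint32_t *regs_base, const struct wdog_regs *layout, uint32_t key)
{
    regs_base[layout->ctrl / sizeof(uint32_t)] = key;
}
```

With this shape the rest of the driver never names a variant-specific offset directly; it only dereferences the layout chosen at attach time.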
zbb [Thu, 25 Feb 2016 14:12:51 +0000 (14:12 +0000)]
Add support for hardware Tx and Rx checksums to VNIC driver
- The network controller verifies Rx TCP/UDP/SCTP checksums by default.
Communicate this to the stack when the packet is not marked as erroneous
to avoid redundant checksum calculation in kernel.
- It is not uncommon to get an mbuf with m_len that is less than
the minimal size for the IP, TCP, UDP, etc. headers when HW checksumming
is enabled. To avoid data corruption by the HW, which writes the IP
and TCP/UDP/SCTP checksums to the data segment,
the mbuf needs to be pulled up by the required number of bytes.
- Make sure that one can modify the mbufs that require checksum calculation
rather than check for NULL mbuf on each transmission.
kp [Thu, 25 Feb 2016 07:33:59 +0000 (07:33 +0000)]
pf: Fix possible out-of-bounds write
In the DIOCRSETADDRS ioctl() handler we allocate a table for struct pfr_addrs,
which is processed in pfr_set_addrs(). At the user's request we also provide
feedback on the deleted addresses, by storing them after the new list
('bcopy(&ad, addr + size + i, sizeof(ad));' in pfr_set_addrs()).
This means we write outside the bounds of the buffer we've just allocated.
We need to look at pfrio_size2 instead (i.e. the size the user reserved for our
feedback). That'd allow a malicious user to specify a smaller pfrio_size2 than
pfrio_size though, in which case we'd still read outside of the allocated
buffer. Instead we allocate the largest of the two values.
Reported By: Paul J Murphy <paul@inetstat.net>
PR: 207463
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D5426
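The "allocate the largest of the two values" idea from the pf fix above can be sketched in userland as follows. struct addr_entry is a stand-in for struct pfr_addr, alloc_addr_table() is a hypothetical name, and the overflow and sign checks are assumptions added for the sketch rather than a copy of the kernel change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct pfr_addr; the real layout is not reproduced. */
struct addr_entry {
    unsigned char bytes[32];
};

/*
 * Allocate room for max(size, size2) entries so that both the input
 * list and the feedback area fit. Negative counts, a zero count and
 * multiplication overflow are refused. Returns NULL on failure.
 */
struct addr_entry *
alloc_addr_table(int size, int size2, size_t *countp)
{
    size_t count;

    assert(countp != NULL);
    if (size < 0 || size2 < 0)
        return (NULL);
    count = (size_t)(size > size2 ? size : size2);
    if (count == 0 || count > SIZE_MAX / sizeof(struct addr_entry))
        return (NULL);
    *countp = count;
    return (calloc(count, sizeof(struct addr_entry)));
}
```

Sizing by the larger value keeps both the read of the user's list and the write of the feedback entries inside one buffer, whichever of the two counts the caller made smaller.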
jmcneill [Thu, 25 Feb 2016 01:24:02 +0000 (01:24 +0000)]
Fix dedicated DMA transfers.
For sources and destinations marked "noincr", the previous code was
incorrectly programming the dedicated DMA channel control register
using bit definitions for normal DMA channels. This code path is not
currently used, but will be used by the HDMI audio driver in review.
Reviewed by: andrew
Approved by: gonzo (mentor)
Differential Revision: https://reviews.freebsd.org/D5382
bdrewery [Wed, 24 Feb 2016 17:20:31 +0000 (17:20 +0000)]
Show full DIRPRFX in subdir parallel target name.
For example when building, from buildworld, lib/atf/libatf-c++/tests/detail:
--- all_subdir_atf ---
is now:
--- all_subdir_lib/atf/libatf-c++/tests/detail ---
bdrewery [Wed, 24 Feb 2016 17:20:11 +0000 (17:20 +0000)]
DIRDEPS_BUILD: Regenerate without local dependencies.
These are no longer needed after the recent 'beforebuild: depend' changes
and hooking DIRDEPS_BUILD into a subset of FAST_DEPEND which supports
skipping 'make depend'.
bdrewery [Wed, 24 Feb 2016 17:19:18 +0000 (17:19 +0000)]
FAST_DEPEND: Always run depend via beforebuild which removes many hacks.
This will generate dependencies rather than depending on the previous behavior
of depending on the guessed OBJS: *.h dependencies or a user running
'make depend'.
Experimentation showed that depending only on headers was not enough and
prone to .ORDER errors. Downstream users may also have added
dependencies into beforedepend or afterdepend targets. The safest way to
ensure dependencies are generated before build is to run 'make depend'
beforehand rather than just depending on DPSRCS+SRCS.
Note that the OBJS_DEPEND_GUESS mechanism (a.k.a .if !exists(.depend) then
foo.o: *.h) is still useful as it improves incremental builds with missing
.depend.* files and allows 'make foo.o' to usually work, while this
'beforebuild: depend' ensures that the build will always find all dependencies.
The 'make foo.o' case has no means of a 'beforebuild' hook.
This also removes several hacks in the DIRDEPS_BUILD:
- NO_INSTALL_INCLUDES is no longer needed as it mostly was to work around
.ORDER problems with building the needed headers early.
- DIRDEPS_BUILD: It is no longer necessary to track "local dependencies" in
Makefile.depend.
These were only in Makefile.depend for 'clean builds' since nothing would
generate the files due to skipping 'make depend' and early dependency
bugs that have been fixed, such as adding headers into SRCS for the
OBJS_DEPEND_GUESS mechanism. Normally if a .depend file does not exist then
a dependency is added by bsd.lib.mk/bsd.prog.mk from OBJS: *.h. However,
meta.autodep.mk creates a .depend file from created meta files and inserts
that into Makefile.depend. It also only tracks *.[ch] files though which can
miss some dependencies that are hooked into 'make depend'. This .depend
that is created then breaks incremental builds due to the !exists(.depend)
checks for OBJS_DEPEND_GUESS. The goal was to skip 'make depend' yet it only
really works the first time. After that files are not generated as expected,
which r288966 tried to address but was using buildfiles: rather than
beforebuild: and was reverted in r291725. As noted previously,
depending only on headers in beforebuild: would create .ORDER errors
in some cases.
meta.autodep.mk is still used to generate Makefile.depend though via:
gendirdeps: Makefile.depend
.END: gendirdeps
This commit allows removing all of the "local dependencies" in
Makefile.depend which cuts down on churn and removes some of the
arch-dependent Makefile.depend files.
The "local dependencies" were also problematic for bootstrapping.
bdrewery [Wed, 24 Feb 2016 17:19:13 +0000 (17:19 +0000)]
Hook the meta/nofilemon build into using FAST_DEPEND.
FAST_DEPEND is intended to be the "skip 'make depend' and mkdep"
feature. Since DIRDEPS_BUILD does this already with some of its own
hacks, and filemon doesn't need this, and nofilemon does, teach it how
to handle each of these cases.
In meta+filemon mode filemon will handle dependencies itself via the
meta mode logic in bmake. We still want to set MK_FAST_DEPEND=yes to
enable some logic that indicates that 'make depend' is skipped in the
traditional sense. The actual .depend.* files will be skipped.
When nofilemon is set though we still need to track and generate dependencies.
bdrewery [Wed, 24 Feb 2016 17:19:09 +0000 (17:19 +0000)]
FAST_DEPEND: Don't waste time generating an empty .depend file.
The .depend file will still be generated if _EXTRADEPEND is used. The target
is kept with a dependency on DPSRCS though so that 'make depend' will generate
all files.
bdrewery [Wed, 24 Feb 2016 17:19:05 +0000 (17:19 +0000)]
FAST_DEPEND: Rework how guessed dependencies are handled.
Rather than depend on .depend not existing, check the actual
.depend.OBJ file that will be used for that object. If it doesn't
exist then use the guessed dependencies.
FAST_DEPEND may never have a .depend file. Not having one means all of the
previous logic would over-depend all object files on all headers which is not
what we wanted. It also means that if a .depend is generated before a build
is done for _EXTRADEPEND (such as for PROG or LIB) then all of these
dependencies would not be used since the .depend wasn't generated from mkdep
and the real .depend.* files are not generated until the build.
bdrewery [Wed, 24 Feb 2016 17:18:55 +0000 (17:18 +0000)]
Follow-up r295667 with fixes for SRCS defined.
cleandepend should always remove CLEANDEPEND* if they are not empty,
but bsd.dep.mk should not add the tags entries unless SRCS is defined
as it did before. The .depend file itself is still always removed
to avoid accidentally keeping a stale one around as done in r295666.
ed [Wed, 24 Feb 2016 17:10:32 +0000 (17:10 +0000)]
Make asynchronous connection failures on UNIX sockets fail with ECONNRESET.
While making CloudABI work well on Linux, I discovered that I had a
FreeBSD-ism in one of my unit tests. The test did the following:
- Create UNIX socket 1, bind it, make it listen.
- Create UNIX socket 2, connect it to UNIX socket 1.
- Close UNIX socket 1.
- Obtain SO_ERROR from socket 2.
On FreeBSD this returns ECONNABORTED, while on Linux it returns
ECONNRESET. I dug through some of the relevant specifications[1] and it
looks like Linux is all right here. ECONNABORTED should only be returned
when the local connection (socket 2) is aborted; not the peer (socket 1).
It is of course slightly misleading: the function in which we set this
error is called uipc_abort(), but keep in mind that we're aborting the
peer, thus resetting the local socket.
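The test sequence described above can be reproduced with a short program. This is a sketch, not the CloudABI unit test itself: the function name and the temporary path are assumptions, and the value returned depends on the kernel it runs on.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Returns the SO_ERROR value seen on the connecting socket after the
 * listening socket has been closed, or -1 if a setup step fails.
 */
int
so_error_after_listener_close(void)
{
    char dir[] = "/tmp/sockerrXXXXXX";
    struct sockaddr_un addr;
    socklen_t len;
    int s1, s2, soerr = 0, result = -1;

    if (mkdtemp(dir) == NULL)
        return (-1);
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof(addr.sun_path), "%s/sock", dir);

    s1 = socket(AF_UNIX, SOCK_STREAM, 0);   /* socket 1: listener */
    s2 = socket(AF_UNIX, SOCK_STREAM, 0);   /* socket 2: client */
    if (s1 == -1 || s2 == -1)
        goto out;
    if (bind(s1, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(s1, 1) == -1)
        goto out;
    if (connect(s2, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;
    close(s1);                              /* abort the peer */
    s1 = -1;
    len = sizeof(soerr);
    if (getsockopt(s2, SOL_SOCKET, SO_ERROR, &soerr, &len) == 0)
        result = soerr;
out:
    if (s1 != -1)
        close(s1);
    if (s2 != -1)
        close(s2);
    unlink(addr.sun_path);
    rmdir(dir);
    return (result);
}
```

Comparing the return value against ECONNRESET and ECONNABORTED shows which of the two behaviours a given system implements.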
kib [Wed, 24 Feb 2016 15:15:46 +0000 (15:15 +0000)]
Provide more correct sizing of the KVA consumed by a vnode, used by
the virtvnodes calculation. Include the size of the fs-specific v_data,
taken as the NFS nclnode size defined inline, since the NFS nclnode is
bigger than either the ZFS znode or the UFS inode. Include the size of
namecache_ts and a short cache path element, multiplied by the name
cache population factor, again defined inline.
Inline defines are used to avoid pollution of the vnode.h with the
subsystem-private objects. Non-significant unsynchronized changes of
the definitions are fine, we do not care about that precision, and
e.g. ZFS consumes much malloced memory per vnode for reasons
unaccounted in the formula.
Lower the partition of kmem dedicated to vnodes, from 1/7 to 1/10.
The measures reduce vnode cache pressure on kmem and bring the vnode
cache memory use below some apparent thresholds that were exceeded by
r291244 due to more robust vnode reuse.
Reported and tested by: marius (i386, previous version)
Reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
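The sizing described above amounts to dividing a share of kmem by an estimated per-vnode footprint. The sketch below shows only that arithmetic. The function name, the parameter list and any sizes passed in are assumptions, and the kernel's actual formula may include further terms.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in a 1/divisor share of kmem, given an
 * estimated per-vnode footprint made of the vnode itself, the
 * fs-specific node, and name cache entries scaled by a population
 * factor. All inputs are caller-supplied assumptions.
 */
uint64_t
vnodes_for_kmem(uint64_t kmem_size, uint64_t divisor, uint64_t vnode_sz,
    uint64_t fs_node_sz, uint64_t nc_entry_sz, uint64_t nc_factor)
{
    uint64_t per_vnode;

    per_vnode = vnode_sz + fs_node_sz + nc_entry_sz * nc_factor;
    assert(divisor != 0 && per_vnode != 0);
    return (kmem_size / divisor / per_vnode);
}
```

Calling it with divisor 7 and then 10 on the same inputs shows the effect of lowering the kmem share: the vnode count drops in proportion.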
wma [Wed, 24 Feb 2016 06:05:30 +0000 (06:05 +0000)]
Make pci_host_generic and thunderx_pci common
* provided OFW interface for pci_host_generic (for handling devices which are present in DTS under the PCI node)
* removed support for internal PCI from arm64/cavium
* cleaned up and made most of the code common
mckusick [Wed, 24 Feb 2016 01:58:40 +0000 (01:58 +0000)]
The UFS filesystem requires that the last block of a file always be
allocated. When shortening the length of a file in which the new end
of the file contains a hole, the hole must have a block allocated.
erj [Wed, 24 Feb 2016 00:42:43 +0000 (00:42 +0000)]
ixl(4): Fix potential driver interrupt setup issues and startup crash.
- Limit queue autoconfiguration to 8 queues to prevent the driver from
requesting a large number of MSI-X vectors at boot.
- Fix potential kernel panic that occurs when the driver loads and cannot
get all requested MSI-X vectors. Instead, attach() will fail with an error.
- Move taskqueue setup to later in attach() to prevent having to free
taskqueues if some other error in attach() occurs.
sobomax [Tue, 23 Feb 2016 23:59:08 +0000 (23:59 +0000)]
Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8)
and geom_uncompress(4):
1. mkuzip(8):
- Proper support for eliminating all-zero blocks when compressing an
image. This feature is already supported by the geom_uzip(4) module
and CLOOP format in general, so it's just a matter of making mkuzip(8)
match. It should be noted, however that this feature while it sounds
great, results in very slight improvement in the overall compression
ratio, since compressing default 16k all-zero block produces only 39
bytes compressed output block, which is 99.8% compression ratio. With
typical average compression ratio of amd64 binaries and data being
around 60-70% the difference between 99.8% and 100.0% is not that
great further diluted by the ratio of number of zero blocks in the
uncompressed image to the overall number of blocks being less than
0.5 (typically). However, this may be important from performance
standpoint, so that the kernel is not spinning its wheels decompressing
those empty blocks every time this zero region is read. It could also
be important when you create a huge image mostly filled with zero
blocks for testing purposes.
- New feature allowing to de-duplicate output image. It turns out that
if you twist CLOOP format a bit you can do that as well. And unlike
zero-blocks elimination, this gives a noticeable improvement in the
overall compression ratio, reducing output image by something like
3-4% on my test UFS2 3GB image consisting of full FreeBSD base system
plus some of the packages (openjdk, apache etc), about 2.3GB worth of
file data (800+MB compressed). The only caveat is that images created
with this feature "on" would not work on older versions of the FreeBSD
kernel, hence it's turned off by default.
- provide options to control both features and document them in manual
page.
- merge in all relevant LZMA compression support from the mkulzma(8),
add new option to select between both.
- switch license from ad-hoc beerware into standard 2-clause BSD.
2. geom_uzip(4):
- implement support for de-duplicated images;
- optimize some code paths to handle "all-zero" blocks without reading
any compressed data;
- beef up manual page to explain that geom_uzip(4) is not limited only
to md(4) images. The compressed data can be written to the block
device and accessed directly via magic of GEOM(4) and devfs(4),
including to mount root fs from a compressed drive.
- convert debug log code from being compiled in conditionally into
being present all the time and provide two sysctls to turn it on or
off. Given the intended use of the module, it may run in
environments where installing a new kernel with debug code enabled
is not an option. Having those options handy allows debugging issues
with nothing more than serial console or
network shell access to a box/appliance. The resulting additional
CPU cycles are just few int comparisons and branches, and those are
minuscule when compared to data decompression which is the main
feature of the module.
- hopefully improve robustness and resiliency of the geom_uzip(4) by
performing some of the data validation / range checking on the TOC
entries and rejecting to attach to an image if those checks fail.
- merge in all relevant LZMA decompression support from the
geom_uncompress(4), enable automatically when appropriate format is
indicated in the header.
- move compilation work into its own worker thread so that it does not
clog g_up. This allows multiple instances to work in parallel, utilizing
SMP cores.
- document new knobs in the manual page.
Reviewed by: adrian
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D5333
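The zero-block figures quoted above for mkuzip(8) can be checked with a small sketch. The two helpers are hypothetical names, not functions from mkuzip(8); the 16k block size and the 39-byte output come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every byte of the block is zero, 0 otherwise. */
int
block_is_all_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (buf[i] != 0)
            return (0);
    return (1);
}

/* Space saved by compression, as a percentage of the input size. */
double
space_saved_pct(size_t in_len, size_t out_len)
{
    assert(in_len != 0);
    return (100.0 * (1.0 - (double)out_len / (double)in_len));
}
```

For a 16384-byte block compressed to 39 bytes, space_saved_pct() yields roughly 99.76, which rounds to the 99.8% quoted in the message; eliminating the block entirely only recovers the remaining fraction of a percent.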
avos [Tue, 23 Feb 2016 21:11:42 +0000 (21:11 +0000)]
net80211: fix TIM cleanup.
Remove duplicate 'ni->ni_associd = 0' assignment from
ieee80211_node_leave(), since it breaks iv_set_tim() in
ic->ic_node_cleanup() (associd is cleared right after this call).
Tested with RTL8188EU (HOSTAP mode) and
WUSB54GC (STA mode, with powersaving enabled).
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5398
jhb [Tue, 23 Feb 2016 20:00:55 +0000 (20:00 +0000)]
Add handling for non-native error values to libsysdecode.
Add two new functions, sysdecode_abi_to_freebsd_errno() and
sysdecode_freebsd_to_abi_errno(), which convert errno values between
the native FreeBSD ABI and other supported ABIs. Note that the
mappings are not necessarily perfect, meaning that in some cases multiple
errors in one ABI might map to a single error in another ABI. In that
case, the reverse mapping will return one of the errors that maps, but
which error is returned is non-deterministic.
Change truss to always report the raw error value to the user but
use libsysdecode to map it to a native errno value that can be used
with strerror() to generate a description. Previously truss reported
the "converted" error value. Now the user will always see the exact
error value that the application sees.
Change kdump to report the truly raw error value to the user. Previously
kdump would report the absolute value of the raw error value (so for
Linux binaries it didn't output the FreeBSD error value, but the positive
value of the Linux error). Now it reports the real (i.e. negative) error
value for Linux binaries. Also, use libsysdecode to convert the native
FreeBSD error reported in the ktrace record to the raw error used by the
ABI. This means that the Linux ABI can now be handled directly in
ktrsysret() and removes the need for linux_ktrsysret().
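The imperfect mapping described above can be illustrated with a small lookup table. This sketch does not use libsysdecode and does not reproduce its tables: the function names are hypothetical, the first three rows are real Linux and FreeBSD errno numbers, and the last two rows are invented solely to show a many-to-one case.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a hypothetical ABI-to-FreeBSD errno table. */
struct errno_map {
    int abi_errno;
    int freebsd_errno;
};

static const struct errno_map linux_map[] = {
    { 11, 35 },     /* Linux EAGAIN  -> FreeBSD EAGAIN */
    { 35, 11 },     /* Linux EDEADLK -> FreeBSD EDEADLK */
    { 38, 78 },     /* Linux ENOSYS  -> FreeBSD ENOSYS */
    { 200, 22 },    /* invented row  -> FreeBSD EINVAL */
    { 201, 22 },    /* invented row  -> FreeBSD EINVAL */
};
#define NMAP (sizeof(linux_map) / sizeof(linux_map[0]))

/* Forward mapping; returns -1 for an unknown ABI error. */
int
abi_to_native(int abi_errno)
{
    size_t i;

    assert(abi_errno > 0);
    for (i = 0; i < NMAP; i++)
        if (linux_map[i].abi_errno == abi_errno)
            return (linux_map[i].freebsd_errno);
    return (-1);
}

/*
 * Reverse mapping; returns -1 for an unknown native error. When
 * several ABI errors share one native error, this sketch returns the
 * first match, so callers must not rely on which one they get.
 */
int
native_to_abi(int freebsd_errno)
{
    size_t i;

    assert(freebsd_errno > 0);
    for (i = 0; i < NMAP; i++)
        if (linux_map[i].freebsd_errno == freebsd_errno)
            return (linux_map[i].abi_errno);
    return (-1);
}
```

A round trip from native to ABI and back always returns the original native value, while a round trip starting from an ABI value may not.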
jhb [Tue, 23 Feb 2016 19:56:29 +0000 (19:56 +0000)]
Add support for displaying thread IDs to truss(1).
- Consolidate duplicate code for printing the metadata at the start of
each line into a shared function.
- Add an -H option which will log the thread ID of the relevant thread
for each event.
While here, remove some extraneous calls to clock_gettime() in
print_syscall() and print_syscall_ret(). The caller of print_syscall_ret()
always updates the current thread's "after" time before it is called.
hselasky [Tue, 23 Feb 2016 18:17:01 +0000 (18:17 +0000)]
Configure the correct bMaxPacketSize for control endpoints before
requesting the initial complete device descriptor and not as part of
the subsequent babble error recovery. Babble means that the received
USB packet was bigger than the configured maximum packet size. This
only affects enumeration of FULL speed USB devices which use a
bMaxPacketSize different from 8 bytes. This patch might help fix
enumeration of USB devices which exhibit USB I/O errors in dmesg
during boot.
rrs [Tue, 23 Feb 2016 17:53:39 +0000 (17:53 +0000)]
This fixes the fastpath code to have a better module initialization sequence when
included in loader.conf. It also ensures that even if someone specifies an
incorrect load order, the lists and such will be initialized on demand at that
time, so no one can make that mistake.
dwmalone [Tue, 23 Feb 2016 15:28:13 +0000 (15:28 +0000)]
Following revision r295924, the changes to a db file should be fsynced
before the file is closed. Consequently, it shouldn't be necessary to
open the file with O_SYNC any more.
This improves the performance of building large .db files for large
password files a lot and should resolve this problem:
dwmalone [Tue, 23 Feb 2016 15:21:13 +0000 (15:21 +0000)]
If we close or sync a hash-based db file, make sure to call fsync to
make sure the changes are on disk. The people at pfSense noticed that
it didn't always make it to the disk soon enough with soft updates.
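The pattern behind the two db commits above, writing without O_SYNC and calling fsync(2) once before close, can be sketched in plain POSIX C. This is not the db(3) code; write_then_fsync() is a hypothetical helper.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write buf to path without O_SYNC, then fsync(2) once before close.
 * Returns 0 on success, -1 on failure.
 */
int
write_then_fsync(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    ssize_t n;
    int fd;

    assert(path != NULL && (buf != NULL || len == 0));
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return (-1);
    /* Buffered writes: no per-write synchronous flush. */
    while (len > 0) {
        n = write(fd, p, len);
        if (n <= 0) {
            close(fd);
            return (-1);
        }
        p += n;
        len -= (size_t)n;
    }
    /* One flush at the end puts the data on disk. */
    if (fsync(fd) == -1) {
        close(fd);
        return (-1);
    }
    return (close(fd));
}
```

Many small writes followed by a single fsync(2) avoid the per-write synchronous flush that O_SYNC implies, which is where the speedup for large files would come from.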