gshapiro [Sat, 29 Dec 2012 19:12:38 +0000 (19:12 +0000)]
MFC: Properly define true/false when defining __bool_true_false_are_defined
for filters which pull in mfapi.h before stdbool.h. Issue reported by
Petr Rehor, maintainer of amavisd-milter port.
dim [Wed, 19 Dec 2012 12:19:56 +0000 (12:19 +0000)]
MFC r243907:
Fix an old bug in devd, where it uses std::sort() to sort the various
lists it reads from its configuration files on the priority field.
Because some items in the lists have the same priority, and std::sort()
is not stable, the exact order in which the items are enumerated does
not have to correspond to the order they appear in the configuration
files.
Apparently this was never noticed with libstdc++, but with libc++ it
could cause the "uhid" entry from /etc/devd/usb.conf to be used instead
of the "ums" entry (which is earlier in the file). This caused the
problem described in the PR: the USB mouse module was never loaded, and
the other actions (such as starting moused) were not executed.
To fix the problem, make devd use std:stable_sort() instead.
Reported by: Jan Beich <jbeich@tormail.org>
PR: bin/172958
delphij [Mon, 17 Dec 2012 06:38:22 +0000 (06:38 +0000)]
MFC r243834:
Note that the manual page of less(1) says:
Note that a preprocessor cannot output an empty file, since that
is interpreted as meaning there is no replacement, and the origi-
nal file is used. To avoid this, if LESSOPEN starts with two ver-
tical bars, the exit status of the script becomes meaningful. If
the exit status is zero, the output is considered to be replace-
ment text, even if it empty. If the exit status is nonzero, any
output is ignored and the original file is used. For compatibil-
ity with previous versions of less, if LESSOPEN starts with only
one vertical bar, the exit status of the preprocessor is ignored.
Use two pipe symbols for zless, so that zless'ing a compressed empty
file will give output rather than being interpreted as its compressed
form, which is typically a binary.
Thanks Mark Nudelman for pointing out this difference and the
suggested solution.
jimharris [Wed, 12 Dec 2012 00:39:32 +0000 (00:39 +0000)]
MFC r243904:
Don't call bus_dmamap_load in CAM_DIR_NONE case, since there is nothing
to map, and technically this isn't allowed.
Functionally, it works OK (at least on x86) to call bus_dmamap_load with
a NULL data pointer and zero length, so this is primarily for correctness
and consistency with other drivers.
While here, remove check in isci_io_request_construct for nseg==0.
Previously, bus_dmamap_load would pass nseg==1, even for case where
buffer is NULL and length = 0, which allowed CAM_DIR_NONE CCBs
to get processed. This check is not correct though, and needed to be
removed both for the changes elsewhere in this patch, as well as jeff's
preliminary bus_dmamap_load_ccb patch (which uncovered all of this in
the first place).
MFC r243503:
Illumos 13879:4eac7a87eff2
3329 spa_sync() spends 10-20% of its time in spa_free_sync_cb()
3330 space_seg_t should have its own kmem_cache
3331 deferred frees should happen after sync_pass 1
3335 make SYNC_PASS_* constants tunable
New loader-only tunables:
vfs.zfs.sync_pass_deferred_free
vfs.zfs.sync_pass_dont_compress
vfs.zfs.sync_pass_rewrite
MFC r243524:
Import the zio nop-write improvement from Illumos. To reduce I/O,
nop-write omits overwriting data if the checksum (cryptographically
secure) of new data matches the checksum of existing data.
It also saves space if snapshots are in use.
It currently works only on datasets with enabled compression, disabled
deduplication and sha256 checksums.
eadler [Mon, 10 Dec 2012 02:33:16 +0000 (02:33 +0000)]
MFC r243084:
Add support for a -q flag. While here make the custom argument parsing
use getopt instead of hacking on it more. This change also fixes the
method of silencing the compiler warning about gfn being used
uninitialized.
eadler [Sat, 8 Dec 2012 00:28:16 +0000 (00:28 +0000)]
MFC r243893:
Remove hack to emulate effective uid and just use the EUID's name in the
first place. I was unaware of this option when originally committing
this change.
I accidently merged this to only one directory last time :(
eadler [Sat, 8 Dec 2012 00:25:51 +0000 (00:25 +0000)]
MFC r243893:
Remove hack to emulate effective uid and just use the EUID's name in the
first place. I was unaware of this option when originally committing
this change.
rwatson [Thu, 6 Dec 2012 11:54:25 +0000 (11:54 +0000)]
Early MFC of portions of r243752 adding an auditdistd user to stable/8
in order to ease future upgrades; the remainder of r243752 is left for
a future MFC of the OpenBSM upgrade:
Merge a number of changes required to hook up OpenBSM 1.2-alpha2's
auditdistd (distributed audit daemon) to the build:
- Manual cross references
- Makefile for auditdistd
- rc.d script, rc.conf entrie
- New group and user for auditdistd; associated aliases, etc.
The audit trail distribution daemon provides reliable,
cryptographically protected (and sandboxed) delivery of audit tails
from live clients to audit server hosts in order to both allow
centralised analysis, and improve resilience in the event of client
compromises: clients are not permitted to change trail contents
after submission.
Submitted by: pjd
Sponsored by: The FreeBSD Foundation (auditdistd)
delphij [Mon, 3 Dec 2012 18:47:25 +0000 (18:47 +0000)]
MFC r242681 (ambrisko):
- Extend the prior commit to use the generic SCSI command building
function use that for JBOD and Thunderbolt disk write command. Now
we only have one implementation in mfi.
- Fix dumping on Thunderbolt cards. Polled IO commands do not seem to
be normally acknowledged by changing cmd_status to MFI_STAT_OK.
In order to get acknowledgement of the IO is complete, the Thunderbolt
command queue needs to be run through. I added a flag MFI_CMD_SCSI
to indicate this command is being polled and to complete the
Thunderbolt wrapper and indicate the result. This flag needs to be
set in the JBOD case in case if that us using Thunderbolt card.
When in the polling loop check for completed commands.
- Remove mfi_tbolt_is_ldio and just do the check when needed.
- Fix an issue when attaching of disk device happens when a device is
already scheduled to be attached but hasn't attached.
- add a tunable to allow raw disk attachment to CAM via:
hw.mfi.allow_cam_disk_passthrough=1
- fixup aborting of commands (AEN and LD state change). Use a generic
abort function and only wait the command being aborted not both.
Thunderbolt cards don't seem to abort commands so the abort times
out.
delphij [Mon, 3 Dec 2012 18:39:29 +0000 (18:39 +0000)]
MFC r242497:
Copy code from scsi_read_write() as mfi_build_syspd_cdb() to build SCSI
command properly. Without this change, mfi(4) always sends 10 byte READ
and WRITE commands, which will cause data corruption when device is
larger than 2^32 sectors.
MFC r236884:
Introduce "feature flags" for ZFS pools (bump SPA version to 5000).
Add first feature "com.delphix:async_destroy" (asynchronous destroy
of ZFS datasets).
Implement features support in ZFS boot code.
Illumos revisions merged:
13700:2889e2596bd6
13701:1949b688d5fb
2619 asynchronous destruction of ZFS file systems
2747 SPA versioning with zfs feature flags
1796 "ZFS HOLD" should not be used when doing "ZFS SEND" froma read-only pool
2871 support for __ZFS_POOL_RESTRICT used by ZFS test suite
2903 zfs destroy -d does not work
2957 zfs destroy -R/r sometimes fails when removing defer-destroyed snapshot
MFC r238926:
Partial MFV (illumos-gate 13753:2aba784c276b)
2762 zpool command should have better support for feature flags
References:
https://www.illumos.org/issues/2762
MFC r238950:
Fix reporting of root pool upgrade notice.
MFC r238951:
Fix wrong indent according to style(9)
MFC r239389:
Backport fix for vendor issue #3085
3085 zfs diff panics, then panics in a loop on booting
References:
https://www.illumos.org/issues/3085
MFC r239394:
Update zfs(8) manpage with illumos version of "zfs diff"
Illumos issue:
2399 zfs manual page does not document use of "zfs diff"
References:
https://www.illumos.org/issues/2399
MFC r239620 [2]:
Merge recent vendor changes:
3086 unnecessarily setting DS_FLAG_INCONSISTENT on async destroyed datasets
3090 vdev_reopen() during reguid causes vdev to be treated as corrupt
3102 vdev_uberblock_load() and vdev_validate() may read the wrong label
MFC r240153 (gjb) [3]:
Typo fix and minor word swap.
MFC r240303:
Add assfail() and assfail3() to the opensolaris module.
Remove obsoleted intermediate cddl/compat/opensolaris/sys/debug.h.
MFC r240345 (avg):
zfs: fix sa_modify_attrs handling of variable-sized attributes
- skip length_idx index for a replaced variable-sized attribute
- skip length_idx index for a removed variable-sized attribute
- also re-arranged code to make sure that length_idx is always
incremented for variable-sized attributes
- additionally add an assertion that the number of actually produced
attributes is the same as the expected number of resulting
attributes
Illumos issued covered:
1884 Empty "used" field for zfs *space commands
3006 VERIFY[S,U,P] and ASSERT[S,U,P] frequently check if first argument
is zero
3028 zfs {group,user}space -n prints (null) instead of numeric GID/UID
3048 zfs {user,group}space [-s|-S] is broken
3049 zfs {user,group}space -t doesn't really filter the results
3060 zfs {user,group}space -H output isn't tab-delimited
3061 zfs {user,group}space -o doesn't use specified fields order
3064 usr/src/cmd/zpool/zpool_main.c misspells "successful"
3093 zfs {user,group}space's -i is noop
3098 zfs userspace/groupspace fail without saying why when run as non-root
MFC r240955 (partial):
Merge recent vendor changes in ZFS.
Illumos issued covered:
3139 zdb dies when it tries to determine path of unlinked file
3189 kernel panic in ZFS test suite during hotspare_onoffline_004_neg
3208 moving zpool cross-endian results in incorrect user/group accounting
MFC r241655:
Add missing initialization for do_prefix.
Corrects porting error in r238391
Vendor issue and changeset reference:
2883 changing "canmount" property to "on" should not always remount dataset
https://www.illumos.org/issues/2883
Changeset 13743:95aba6e49b9f
MFC r243014:
Move zpool-features manual page from section 5 to section 7
and fix references
mav [Thu, 29 Nov 2012 18:23:21 +0000 (18:23 +0000)]
MFC r242323, r242328:
Add basic BIO_DELETE support to GEOM RAID class for all RAID levels.
If at least one subdisk in the volume supports it, BIO_DELETE requests
will be propagated down. Unfortunatelly, for RAID levels with redundancy
unmapped blocks will be mapped back during first rebuild/resync process.
melifaro [Tue, 27 Nov 2012 20:16:37 +0000 (20:16 +0000)]
MFC r241406, r241502, r241884.
Do not check if found IPv4 rte is dynamic if net.inet.icmp.drop_redirect is
enabled. This eliminates one mtx_lock() per each routing lookup thus improving
performance in several cases (routing to directly connected interface or routing
to default gateway).
Icmp redirects should not be used to provide routing direction nowadays, even
for end hosts. Routers should not use them too (and this is explicitly restricted
in IPv6, see RFC 4861, clause 8.2).
Current commit changes rnh_machaddr function to 'stock' rn_match (and back) for every
AF_INET routing table in given VNET instance on drop_redirect sysctl change.
Eliminate code checking if found IPv6 rte is dynamic. IPv6 redirects
are using (different) ND-based approach described in RFC 4861. This change
is similar to r241406 which conditionally skips the same check in IPv4.
Cleanup documentation: cloning route support has been removed in r186119.
This change is part of bigger patch eliminating rte locking.
mav [Mon, 26 Nov 2012 16:19:27 +0000 (16:19 +0000)]
MFC r238943:
Add several performance optimizations to acpi_cpu_idle().
For C1 and C2 states use cpu_ticks() to measure sleep time instead of much
slower ACPI timer. We can't do it for C3, as TSC may stop there. But it is
less important there as wake up latency is high any way.
For C1 and C2 states do not check/clear bus mastering activity status, as
it is important only for C3. As side effect it can make CPU enter C2 instead
of C3 if last BM activity was two sleeps back (unlike one before), but
that may be even good because of collecting more statistics. Premature BM
wakeup from C3, entered because of overestimation, can easily be worse then
entering C2 from both performance and power consumption points of view.
Together on dual Xeon E5645 system on sequential 512 bytes read test this
change makes cpu_idle_acpi() as fast as simplest cpu_idle_hlt() and only
few percents slower then cpu_idle_mwait(), while deeper states are still
actively used during idle periods.
To help with diagnostics, add C-state type into dev.cpu.X.cx_supported.
yongari [Mon, 26 Nov 2012 04:40:26 +0000 (04:40 +0000)]
MFC r242426:
TCP/UDP checksum offloading feature for IP fragmented datagram was
removed in r99417. bge(4) controllers can do TCP checksum offload
for IP fragmented datagrams but unlike ti(4), it lacks UDP checksum
offloading for IP fragmented datagrams. The problem was bge(4)
blindly requested TCP/UDP checksum for IP fragmented datagrams such
that it resulted in corrupted UDP datagrams before r99417.
Remove remaining code for TCP checksum offloading for IP fragmented
datagrams which should have been removed in r99417.
yongari [Mon, 26 Nov 2012 04:34:53 +0000 (04:34 +0000)]
MFC r241983-241985:
r241983:
Do not hardcode phy address. Multi-port controllers use different phy
address.
r241984:
Ethernet@WireSpeed is defined for 1000baseT adapter to establish a
link at a lower speed so enabling it for fiber adapters is wrong.
Fix the issue by setting BGE_PHY_NO_WIRESPEED such that brgphy(4)
wouldn't enable the feature.
While I'm here move PHY specific feature/bug configuration to new
location(just before mii attach) for readability.
r241985:
For fast ethernet controllers, Ethernet@WireSpeed is not defined so
explicitly set BGE_PHY_NO_WIRESPEED flag.
yongari [Mon, 26 Nov 2012 04:26:27 +0000 (04:26 +0000)]
MFC r241438:
Add APE firmware support and improve firmware handshake procedure.
This change will enable IPMI access on 5717/5718/5719/5720 and 5761
controllers. Because ASF is not available when APE firmware is
present, bge_allow_asf tunable is ignored when driver detects APE
firmware. Also bge(4) no longer performs two resets(one blind
reset and the other reset with firmware in mind) in device attach.
Now bge(4) performs a reset with enough information in bge_reset().
The APE firmware also needs special handling to make suspend/resume
work but it was not implemented yet.
With this change, bge(4) should work on any 5717/5718/5719/5720
controllers. Special thanks to Mike Hibler at Emulab who setup
remote debugging on Dell R820. Without his help I couldn't be able
to address several issues happened on Dell Rx20 systems. And many
thanks to Broadcom for continuing to support FreeBSD!
Submitted by: davidch (initial version)
H/W donated by: Broadcom
Tested by: many
Tested on: Del R820/R720/R620/R420/R320 and HP Proliant DL 360 G8
yongari [Mon, 26 Nov 2012 04:20:59 +0000 (04:20 +0000)]
MFC r241437:
For 5717C/5719C/5720C and 57765 PHYs, do not perform any special
handling(jumbo, wire speed etc) in brgphy_reset(). Touching
BRGPHY_MII_AUXCTL register seems to confuse APE firmware such that
it couldn't establish a link.
yongari [Mon, 26 Nov 2012 04:11:12 +0000 (04:11 +0000)]
MFC r241436:
Rework controller reset procedure. Previously driver saved
BGE_PCI_PCISTATE register before issuing global reset. After
issuing reset, it reads BGE_PCI_PCISTATE register again and
compares the saved register value and current value. It was used to
know whether the global reset operation was completed or not.
Unfortunately, this logic caused several issues on recent BCM5717/
5718/5719 and BCM5720 controllers. It seems APE firmware accesses
some registers while global reset is in progress such that reading
BGE_PCI_PCISTATE register after reset does not yield old pre-reset
state value. This resulted in consuming too much time in global
reset and sometimes it couldn't successfully complete reset.
The BGE_MISCCFG_RESET_CORE_CLOCKS of BGE_MISC_CFG register is
self-clearing bit so driver is able to know the reset completion.
But the core-lock reset will disable indirect/flat/standard access
modes such that driver cannot poll BGE_MISCCFG_RESET_CORE_CLOCKS
bit of BGE_MISC_CFG register. So just wait enough time for
core-clock reset to complete.
Data sheet says driver should wait 100us for PCI/PCI-X devices and
100ms for PCIe devices. I chose 1ms for PCI/PCI-X since this value
was used for many years in bge(4). For PCIe devices, use 100ms as
recommended by data sheet.
bge_chipinit() also cleared BGE_MAC_MODE register which shall clear
firmware configured mode information. I think this will result in
losing ASF/IPMI link in device attachment. Let bge_reset() honor
firmware configured BGE_MAC_MODE register and don't announce driver
is UP in bge_reset(). Firmware should have control over driver until
it's fully initialized by driver.
While I'm here, enable workaround for PCI-X BCM5704 A0 in
bge_reset(). This will prevent internal arbitration logic from
switching to the other DMA engine after a retry cycle.
yongari [Mon, 26 Nov 2012 02:42:19 +0000 (02:42 +0000)]
MFC r241388-241393:
r241388:
If the maximum payload size is 256 bytes or more, set the DMA write
water mark to 256 bytes. Otherwise controller will encounter DMA
write under run errors and would result in RX DMA hang. If the
maximum payload size is 128 bytes, the water mark is set to 128
bytes as usual.
While here, set maximum read request size to 2048 for BCM5719/BCM5720.
For other PCIe devices, use 4096. And reprogram the maximum read
request size whenever device reset is performed.
r241389:
On PHY write error use hex number to show the value.
Add more comments.
r241390:
Honor PHY type fiber for BCM5717/BCM5718/BCM5719/BCM5720.
r241391:
Do not force PCIe 1.0a mode in device reset on BCM5717 and newer
controllers. BCM5785 does not require PCI 1.0a mode as well during
reset.
r241392:
Fix a long standing VCPU reset sequence bug on BCM5906.
The VCPU(Virtual CPU) of BCM5906 is used to provide a mechanism to
control the bootcode execution and to pick up configuration data
stored inside the EEPROM.
The bootcode of BCM5906 will check the BGE_VCPU_STATUS_DRV_RESET
bit to decide which booting procedure to choose.
Data sheet indicates the VCPU of BCM5906 should set
BGE_VCPU_STATUS_DRV_RESET bit *before* VCPU reset or global reset.
r241393:
Remove unnecessary delay. I don't see any comments in data sheet
that requires 10ms delay after device reset. Because that code was
there from day 1, I guess it was added to give enough settlement
time after updating BGE_MAC_MODE register.
The recommended delay time for BGE_MAC_MODE after updating is 40us
and it was already done in r241219.