MFC r233711:
Major update to the driver to add support for Drake Skinny and ThunderBolt cards.
MFC r233768:
Change typedef atomic_t to struct mfi_atomic to avoid a namespace
collision, plus a couple more style changes.
MFC r233805:
Move struct megasas_sge from mfi_ioctl.h to mfivar.h so we can
remove including machine/bus.h. Add some more mfi_ prefixes to
avoid name space pollution.
MFC r233877:
- Do not include machine/atomic.h. It is no longer necessary since r233768.
- Remove bogus "atomic" macros and a read-only variable from softc.
MFC r233176:
Add new GEOM_PART_LDM module that implements the Logical Disk Manager
scheme. The LDM is a logical volume manager for MS Windows NT and it
is also known as dynamic volumes. It supports about 2000 partitions
and also provides the capability for software RAID implementations.
This version implements only the partitioning scheme capability and is
based on the linux-ntfs project documentation and several publications
across the Web. NOTE: JBOD, RAID0 and RAID5 volumes aren't supported.
Access to the LDM metadata is read-only. When LDM is on a disk
partitioned with MBR we can also destroy the metadata. For GPT
partitioned disks the destroy action is not supported.
MFC r233177:
Connect geom_part_ldm module to the build.
MFC r233178:
Connect geom_part_ldm to the kernel build.
MFC r233181:
Add CTLFLAG_TUN to sysctls.
MFC r233651:
Do proper cleanup for the GPT case when an error occurs.
MFC r233652:
VMDB offset should be greater than logical volume size only for MBR.
233103:
Some software assumes a mutex can be destroyed right after it has owned
it; for example, it uses a serialization point like the following:
pthread_mutex_lock(&mutex);
pthread_mutex_unlock(&mutex);
pthread_mutex_destroy(&mutex);
Such code assumes that a previous lock holder has already left the mutex
and is no longer referencing it, so it destroys it. To be maximally
compatible with such code, we use the IA64 version to unlock the mutex in
the kernel and remove the two-step unlocking code.
233912:
The umtx operation UMTX_OP_MUTEX_WAKE has a side effect: it accesses
a mutex after a thread has unlocked it, and it even writes data to the
mutex memory to clear the contention bit. There is a race in which other
threads can lock it, unlock it, and then destroy it, so it should not
write data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 tries to fix the problem. It
requires the thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in the kernel
and try to wake up a thread; if necessary, the contention bit is set
again by the operation. This also mitigates the chance that other threads
find the contention bit and unnecessarily enter the kernel to compete
with each other to wake up the sleeping thread. With this change, the
mutex owner no longer holds the mutex until it reaches a point
where the kernel umtx queue is locked; it releases the mutex as soon as
possible.
Performance is improved when the mutex is heavily contended. On an Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds; it is even better than UMTX_OP_MUTEX_WAKE, which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
Special code for stable/9:
Add code to detect if UMTX_OP_MUTEX_WAKE2 is available.
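The WAKE2 unlock protocol described in this message can be sketched in
userland C. This is a single-threaded simulation under stated assumptions,
not the libthr or kernel code: the sketch_* names, the lock-word layout and
the waiter counter standing in for the kernel umtx queue are all
hypothetical, and the rule used to decide when the contention bit is set
again is a simplification.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock-word layout, for illustration only. */
#define SKETCH_UNOWNED   0u
#define SKETCH_CONTESTED 0x80000000u

struct sketch_mutex {
	_Atomic uint32_t owner;	/* owner id, possibly OR'ed with CONTESTED */
};

/* Hypothetical stand-in for the kernel sleep queue: a waiter count. */
struct sketch_kqueue {
	int waiters;
};

/*
 * Stand-in for the WAKE2 operation. It writes to the mutex memory only
 * when a waiter still exists, so a mutex that nobody waits on is never
 * touched after it has been released. Returns 1 if a thread was woken.
 */
static int
sketch_wake2(struct sketch_mutex *m, struct sketch_kqueue *q)
{
	assert(q->waiters >= 0);
	if (q->waiters == 0)
		return (0);
	q->waiters--;			/* wake up one sleeping thread */
	if (q->waiters > 0)		/* others still sleep: re-mark */
		atomic_fetch_or(&m->owner, SKETCH_CONTESTED);
	return (1);
}

/*
 * Thread-library side: clear the lock word entirely first, and enter
 * the "kernel" only if the old value carried the contention bit.
 */
static int
sketch_unlock(struct sketch_mutex *m, struct sketch_kqueue *q)
{
	uint32_t old = atomic_exchange(&m->owner, SKETCH_UNOWNED);

	if ((old & SKETCH_CONTESTED) == 0)
		return (0);		/* uncontended: no kernel entry */
	return (sketch_wake2(m, q));
}
```

The point of the ordering is that the mutex is already free when the wake
call runs, so another thread may take it without waiting for the queue to
be locked.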
Merge r232779 from head:
Move determination of socket buffer sizes from startup to the first time
a socket is used. The previous code structure assumed that AF_INET
sockets were always available, which is an invalid assumption on
IPv6-only systems.
This merges the following revisions from NetBSD:
src/usr.bin/ftp/main.c 1.120
src/usr.bin/ftp/util.c 1.156
MFC r232799:
- add comments to syscalls.master and linux(32)_dummy about which linux
kernel version introduced the sysctl (based upon a linux man-page)
- add comments to syscalls.master regarding some names of sysctls which are
different than the linux-names (based upon the linux unistd.h)
- add some dummy sysctls
- name an unimplemented sysctl
If hastd is invoked with the "-P pidfile" option, always create the
pidfile regardless of whether the -F (foreground) option is set or not.
Also, if the -P option is specified, ignore the pidfile setting from the
configuration not only on start but on reload too. This fixes the issue
where a reload changed the pidfile of a hastd run with the -P option.
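The pidfile precedence described above can be written as a small selection
function. This is a sketch, not the hastd source: sk_effective_pidfile()
and its parameters are hypothetical names, and the branch that returns no
pidfile in foreground mode without -P is an assumption about the
pre-existing behaviour.

```c
#include <stddef.h>

/*
 * Pick the pidfile path to use, or NULL for "do not create one".
 * cli_pidfile is the -P argument (NULL if -P was not given),
 * cfg_pidfile is the setting from the configuration file.
 * The same function is meant to be used on start and on reload.
 */
static const char *
sk_effective_pidfile(const char *cli_pidfile, const char *cfg_pidfile,
    int foreground)
{
	if (cli_pidfile != NULL)
		return (cli_pidfile);	/* -P always wins, even with -F */
	if (foreground)
		return (NULL);		/* assumption: -F alone, no pidfile */
	return (cfg_pidfile);
}
```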
Merge 233272:
in6_pcblookup_local() can still return a pcb with a NULL
inp_socket. To avoid a panic, do not dereference inp_socket,
but obtain the reuse port option from inp_flags2, as is done
after the next call to in_pcblookup_local() a few lines
below.
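A minimal illustration of the idea, assuming simplified stand-in
structures: the sk_* types and the SK_INP_REUSEPORT bit value are
hypothetical and only mirror the field names mentioned in the message.

```c
#include <stddef.h>

#define SK_INP_REUSEPORT 0x00000008	/* hypothetical bit value */

struct sk_socket {
	int so_options;
};

struct sk_inpcb {
	struct sk_socket *inp_socket;	/* may be NULL for a dying pcb */
	int inp_flags2;			/* option bits cached in the pcb */
};

/*
 * Read the reuse-port state from the pcb's own flags. inp_socket is
 * never dereferenced, so a NULL socket pointer cannot cause a fault.
 */
static int
sk_pcb_reuseport(const struct sk_inpcb *inp)
{
	return ((inp->inp_flags2 & SK_INP_REUSEPORT) != 0);
}
```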
Merge 231831:
Refactor the name hash and the ID hash, that are used to address nodes:
- Make hash sizes growable, to satisfy users running large mpd
installations, having thousands of nodes.
- NG_NAMEHASH() proved to give a very bad distribution in real life
name sets, while generic hash32_str(name, HASHINIT) proved to give
an even one, so use the latter for name hash.
- Do not store unnamed nodes in slot 0 of name hash, no reason for that.
- Use the ID hash in cases when we need to run through all nodes: the
NGM_LISTNODES command and in the vnet_netgraph_uninit().
- Implement NGM_LISTNODES and NGM_LISTNAMES as separate code, the former
iterates through the ID hash, and the latter through the name hash.
- Keep count of all nodes and of named nodes, so that we don't need
to count nodes in NGM_LISTNODES and NGM_LISTNAMES. The counters are
also used to estimate whether we need to grow hashes.
- Close a race between two threads running ng_name_node() assigning the same
name to different nodes.
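The name-hash points in the list above (string hash, growable table,
unnamed nodes left out, a counter of named nodes) can be sketched in
userland C. This is not the netgraph code: every sk_* name is
hypothetical, the hash is a local copy of the shift-and-add string hash so
the example is self-contained, and the "grow when there are more nodes
than slots" rule is an assumed threshold.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SK_HASHINIT 5381u

/* Local shift-and-add string hash (hash * 33 + c per character). */
static uint32_t
sk_hash32_str(const char *s, uint32_t hash)
{
	const unsigned char *p = (const unsigned char *)s;

	while (*p != '\0')
		hash = ((hash << 5) + hash) + *p++;
	return (hash);
}

struct sk_node {
	struct sk_node *next;
	char name[32];
};

struct sk_table {
	struct sk_node **slots;
	uint32_t mask;		/* slot count - 1 (count is a power of 2) */
	uint32_t named;		/* number of named nodes stored */
};

static int
sk_table_init(struct sk_table *t, uint32_t nslots)
{
	assert(nslots != 0 && (nslots & (nslots - 1)) == 0);
	t->slots = calloc(nslots, sizeof(*t->slots));
	t->mask = nslots - 1;
	t->named = 0;
	return (t->slots != NULL ? 0 : -1);
}

/* Double the slot array and rehash every node into it. */
static void
sk_table_grow(struct sk_table *t)
{
	uint32_t i, h, nslots = (t->mask + 1) * 2;
	struct sk_node **ns = calloc(nslots, sizeof(*ns));
	struct sk_node *n, *next;

	if (ns == NULL)
		return;		/* keep using the old, smaller table */
	for (i = 0; i <= t->mask; i++) {
		for (n = t->slots[i]; n != NULL; n = next) {
			next = n->next;
			h = sk_hash32_str(n->name, SK_HASHINIT) & (nslots - 1);
			n->next = ns[h];
			ns[h] = n;
		}
	}
	free(t->slots);
	t->slots = ns;
	t->mask = nslots - 1;
}

static struct sk_node *
sk_lookup(const struct sk_table *t, const char *name)
{
	struct sk_node *n;

	n = t->slots[sk_hash32_str(name, SK_HASHINIT) & t->mask];
	while (n != NULL && strcmp(n->name, name) != 0)
		n = n->next;
	return (n);
}

/* Insert a named node; unnamed nodes and duplicate names are refused. */
static int
sk_insert(struct sk_table *t, struct sk_node *n)
{
	uint32_t h;

	if (n->name[0] == '\0' || sk_lookup(t, n->name) != NULL)
		return (-1);
	if (t->named > t->mask)	/* assumed growth threshold */
		sk_table_grow(t);
	h = sk_hash32_str(n->name, SK_HASHINIT) & t->mask;
	n->next = t->slots[h];
	t->slots[h] = n;
	t->named++;
	return (0);
}
```

In a concurrent setting the lookup and the insert would have to happen
under one lock to close the duplicate-name race; the sketch is
single-threaded and omits locking.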
Merge 231760,231761,231764,231765,231766,231823,231830 from head:
231760,231766:
style(9): sort includes
231761:
In ng_bypass() add more protection against potential race
with ng_rmnode() and its followers.
231764:
Remove testing stuff, reducing kernel memory footprint by 1 Kb.
231765:
Trim double empty lines.
231823:
In ng_getsockaddr() allocate memory prior to obtaining lock.
231830:
Specify correct loading order for core of netgraph(4).
dim [Fri, 13 Apr 2012 21:50:14 +0000 (21:50 +0000)]
MFC r233710:
Fix the following compilation warning with clang trunk in isci(4):
sys/dev/isci/isci_task_request.c:198:7: error: case value not in enumerated type 'SCI_TASK_STATUS' (aka 'enum _SCI_TASK_STATUS') [-Werror,-Wswitch]
case SCI_FAILURE_TIMEOUT:
^
This is because the switch is done on a SCI_TASK_STATUS enum type, but
the SCI_FAILURE_TIMEOUT value belongs to SCI_STATUS instead.
Because the list of SCI_TASK_STATUS values cannot be modified at this
time, use the simplest way to get rid of this warning, which is to cast
the switch argument to int. No functional change.
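The workaround can be reproduced in a few lines. The enums below are
hypothetical stand-ins, not the isci(4) definitions; the sketch only shows
why casting the switch argument to int lets a case label from a different
enum through without a -Wswitch diagnostic.

```c
/* Hypothetical stand-ins for two unrelated status enums. */
enum sk_status { SK_SUCCESS = 0, SK_FAILURE_TIMEOUT = 42 };
enum sk_task_status { SK_TASK_SUCCESS = 0, SK_TASK_FAILURE = 1 };

static const char *
sk_describe(enum sk_task_status status)
{
	/*
	 * Switching on the enum type itself would make the compiler
	 * check every case label against sk_task_status. The cast turns
	 * the controlling expression into a plain int, so a label taken
	 * from enum sk_status is accepted. Behaviour is unchanged.
	 */
	switch ((int)status) {
	case SK_TASK_SUCCESS:
		return ("success");
	case SK_TASK_FAILURE:
		return ("failure");
	case SK_FAILURE_TIMEOUT:	/* belongs to enum sk_status */
		return ("timeout");
	default:
		return ("unknown");
	}
}
```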
dim [Fri, 13 Apr 2012 21:47:14 +0000 (21:47 +0000)]
MFC r233354:
Work around the following clang warning in mps(4):
sys/dev/mps/mps_sas.c:861:1: error: function 'mpssas_discovery_timeout' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
mpssas_discovery_timeout(void *data)
^
Because the driver is obtained from upstream, we don't want to modify
it; just silence the warning instead, it is harmless.
dim [Fri, 13 Apr 2012 21:35:24 +0000 (21:35 +0000)]
MFC r233052:
Change the style of share/mk/bsd.sys.mk to that of the other bsd.*.mk
files, and style.Makefile(5), where applicable. While here, update the
link to the gcc warning documentation.
MFC r233000:
Add MODULE_DEPEND() to geom_part modules.
MFC r233342:
Check that the scheme is not already registered. This may happen when a
KLD is preloaded with loader(8) and leads to an infinite loop.
Also do not return the EEXIST error code from the MOD_LOAD handler, because
we have an undocumented(?) ability to replace the kernel's module with a
preloaded one. In that case the preloaded module is initialized first,
so the error in the MOD_LOAD handler would be triggered for the kernel.
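Why a double registration can loop forever, and what the check looks like,
can be shown with a singly linked list. This is a sketch with hypothetical
sk_* names, not the geom_part code: linking an element that is already the
list head makes its next pointer refer to itself, so any later walk never
terminates.

```c
#include <stddef.h>

struct sk_scheme {
	const char *name;
	struct sk_scheme *next;
};

static struct sk_scheme *sk_schemes;	/* head of the registered list */

/*
 * Register a scheme. A second registration of the same object is
 * reported as success (0) rather than as an error, and the list is
 * left untouched so that no cycle is created.
 */
static int
sk_scheme_register(struct sk_scheme *s)
{
	struct sk_scheme *p;

	for (p = sk_schemes; p != NULL; p = p->next)
		if (p == s)
			return (0);	/* already registered */
	s->next = sk_schemes;
	sk_schemes = s;
	return (0);
}
```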
MFC 233547:
Use VM_MEMATTR_UNCACHEABLE instead of VM_MEMATTR_UNCACHED for UC mappings.
VM_MEMATTR_UNCACHED is actually the x86-specific UC- mode (where a WC
MTRR can override the PAT setting).
MFC 233670,233671:
- Use VM_MEMATTR_UNCACHEABLE for the constant for UC memory rather than
VM_MEMATTR_UNCACHED on mips.
- Rename VM_MEMATTR_UNCACHED to VM_MEMATTR_WEAK_UNCACHEABLE on x86 to
be less ambiguous and more clearly identify what it means. An alias
from VM_MEMATTR_WEAK_UNCACHEABLE to VM_MEMATTR_WEAK_UNCACHED remains
on x86 to preserve the KPI.
- Remove the VM_MEMATTR_UNCACHED alias from powerpc.
MFC 232919:
Add kern.eventtimer.activetick tunable/sysctl, specifying whether each
hardclock() tick should be run on every active CPU, or on only one.
In my tests, avoiding extra interrupts because of this on an 8-CPU Core i7
system with HZ=10000 saves about 2% of performance. At this moment the
option is implemented only for global timers, as reprogramming per-CPU
timers is too expensive now to be compensated by this benefit, especially
since we still have to regularly run hardclock() on at least one active CPU
to update system uptime. For a global timer it is quite trivial: the timer
always runs, but we just skip IPIs to other CPUs when possible.
The option is enabled by default now, keeping previous behavior, as periodic
hardclock() calls are still used at least to implement setitimer(2) with
ITIMER_VIRTUAL and ITIMER_PROF arguments. But since default schedulers don't
depend on it since r232917, we are much more free to experiment with it.
MFC r232917:
Rewrite thread CPU usage percentage math to not depend on periodic calls
with HZ rate through the sched_tick() calls from hardclock().
Potentially it can be used to improve precision, but for now it just removes
one more reason to call hardclock() for every HZ tick on every active CPU.
SCHED_4BSD never used sched_tick(), but keep it in place for now, as at
least SCHED_FBFS existing in patches out of the tree depends on it.
MFC r233018:
Make ofw_bus_get_node() consistently return -1 when there is no associated
OF node, instead of a random mixture of 0 and -1. Update all checks for 0
to check for -1 instead.
MFC 233676:
Use a more proper fix for enabling HT MSI mapping windows on Host-PCI
bridges. Rather than blindly enabling the windows on all of them, only
enable the window when an MSI interrupt is enabled for a device behind
the bridge, similar to what already happens for HT PCI-PCI bridges.
MFC 233305,233623:
- Mark the 'lapics' and 'ioapics' arrays here static since they are
private to this file. The 'lapics' array was actually shadowing a
completely different 'lapics' array that is private to local_apic.c.
- Allocate the ioapics[] array dynamically since it is only needed for the
duration of madt_setup_io(). This avoids having the array take up
permanent space in the BSS.
MFC 232228,233613:
Move the DTrace return IDT vector back up from 0x20 to 0x92. The 0x20
vector is currently dedicated to servicing IRQ 0 from the 8259A's, so
it shouldn't be overloaded for DTrace.
MFC 232744,232747,233031:
- Allow a native i386 kernel to be built with 'nodevice atpic'. Just as on
amd64, if 'device isa' is present quiesce the 8259A's during boot and
resume from suspend.
- Move i386's intr_machdep.c to the x86 tree and share it with amd64.
- Merge r232744 changes to pc98.
(Allow a kernel to be built with 'nodevice atpic'.)
- Move ICU related defines from x86/isa/atpic.c to x86/isa/icu.h and
use them in x86/x86/intr_machdep.c.
Note, I normally would have merged 232747 separately, but 233031 assumed
232747 was already merged and 232744 needs to be merged with 233031.
MFC 232742:
MFamd64:
- Return failure for a suspend attempt if we have no wake address.
- Use intr_disable()/intr_restore() instead of ACPI_DISABLE_IRQS().
- Invoke intr_suspend() earlier and call intr_resume() if suspend
fails.
- Use pause in the loop waiting for CPU to suspend.
- Restore PAT MSR, switchtime, switchticks, and MTRRs on resume.
MFC r233688-233689:
r233688:
Remove task queue based link state change handler. Driver no longer
needs to defer link state handling.
While I'm here, mark IFF_DRV_RUNNING before changing media. If
link is established without any delay, that link state change
handling could be lost.
r233689:
Do not report current link status if driver is not running.
This change also works around dhclient's link state handling bug by
not giving current link status.
Unlike other controllers, ale(4)'s PHY hibernation works perfectly,
such that the driver does not see a valid link if the controller is not
brought up. If dhclient(8) runs on ale(4) it will blindly wait
until link UP and then give up after 10 seconds. Because
dhclient(8) still thinks the interface got a valid link when IFM_AVALID
is not set for the selected media, this change makes dhclient initiate
DHCP without waiting for link UP.
MFC r233585-233587:
r233585:
Partially revert r223608 and selectively allow microcode loading
for 82550C. For 82550 controllers this change restores CPUSaver
microcode loading. Due to silicon bug on 82550 and 82550C with
server extension, these controllers seem to require CPUSaver
microcode to receive fragmented UDP datagrams. However the
microcode shouldn't be used on client featured 82550C as it locks
up the controller. In addition, client featured 82550C does not
have the silicon bug. Also clear temporary memory used for
microcode loading since the same memory area is used for other
commands.
While I'm here use 82550C in probe message instead of generic
82550.
Reported by: Andreas Longwitz <longwitz <> incore de>
Tested by: Andreas Longwitz <longwitz <> incore de>
r233586:
Load entire EEPROM contents in device attach time and verify
whether the checksum of EEPROM is valid or not. Because driver
heavily relies on EEPROM information when it selectively enables
features/workarounds, it would be helpful to know whether driver
sees valid EEPROM.
While I'm here remove all other EEPROM accesses since the entire
EEPROM is loaded at device attach time.
r233587:
Remove unnecessary #if as the software workaround for PCI protocol
violation should be activated unless the system is cold-booted
after updating EEPROM.
The PCI protocol violation happens only when established link is
10Mbps so the workaround should be updated whenever link state
change is detected. Previously the workaround was activated only
when user checks current media status with ifconfig(8).
MFC r233948:
Give the kernel pmap lock a different name than user pmap locks. It has
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.
Properly resolve the _ctx_start function descriptor (the symbol _ctx_start
is a descriptor, not a code address), which prevents crashes when starting
a context. This fixes QEMU on powerpc64.
- Write the ISO9660 descriptor after the apm partition entries.
- Fill the needed pmPartStatus flags. At least the OpenBIOS
implementation relies on these flags.
This commit fixes the panic seen on OS-X when inserting a FreeBSD/ppc disc.
Additionally OpenBIOS recognizes the partition where the boot code is located.
This lets us load a FreeBSD/ppc PowerMac kernel inside qemu.
- Add support for the Intel Sandy Bridge microarchitecture (both core and
uncore counting events)
- New manpages with event lists.
- Add MSRs for the Intel Sandy Bridge microarchitecture
marius [Sat, 7 Apr 2012 12:46:27 +0000 (12:46 +0000)]
MFC: r233827
Fix probing of SAS1068E with a device ID of 0x0059 after r232411 (MFC'ed
to stable/9 in r232562).
Reported by: infofarmer
MFC: r233886
Refine r233827; as it turns out, controllers with a device ID of 0x0059
can be upgraded to MegaRAID mode, in which case mfi(4) should attach to
these based on the sub-vendor and -device ID instead (not currently done).
Therefore, let mpt_pci_probe() return BUS_PROBE_LOW_PRIORITY.
While here, let mpt_pci_probe() return BUS_PROBE_DEFAULT instead of 0 in
the default case.
MFC 233675:
Restore proper use of bounce buffers for ISA DMA. When locking was
added, the call to pmap_kextract() was moved up, and as a result the
code never updated the physical address to use for DMA if a bounce
buffer was used. Restore the earlier location of pmap_kextract() so
it takes bounce buffers into account.
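The ordering issue can be reduced to a two-step function. This is an
illustrative sketch only: sk_kextract() is a hypothetical stand-in for
pmap_kextract() that pretends physical and virtual addresses are equal,
and sk_dma_phys() is not the ISA DMA code.

```c
#include <stdint.h>

/* Hypothetical stand-in for pmap_kextract(): identity translation. */
static uintptr_t
sk_kextract(const void *va)
{
	return ((uintptr_t)va);
}

/*
 * Return the physical address to program into the DMA controller.
 * The translation happens only after the transfer has possibly been
 * redirected to the bounce buffer. Translating first would return the
 * address of the original buffer even when the bounce buffer is used.
 */
static uintptr_t
sk_dma_phys(const void *addr, const void *bounce, int use_bounce)
{
	if (use_bounce)
		addr = bounce;
	return (sk_kextract(addr));
}
```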
MFC r233809:
When a process exits, not only shall the children be reparented to
init, but also the orphans shall be removed from the orphan list,
because the list header is destroyed.
Major pmap performance, concurrency, and correctness improvements, mostly
for the 64-bit PMAP module (64-bit-capable CPUs with either a 32-bit or
64-bit kernel). Thanks to alc for his help and prodding.
marius [Wed, 4 Apr 2012 21:19:19 +0000 (21:19 +0000)]
MFC: r233747, r233748
- Fix panic on kernel traps having a mapping in trap_sig b0rked in r206086.
Reported by: David E. Cross
- Remove checks that are redundant due to tf_type being unsigned.
MFC r233692:
Reenable unsolicited responses on CODEC if hdaa_sense_init() called again.
This fixes jack connection events handling after suspend/resume.
229988: Fix prototype formatting (indentation, long lines, and
continued lines).
230011: More prototype formatting fixes, struct member formatting fixes,
and namespace fix for property_find() prototype.
230037: Move struct pidfh definition into pidfile.c, and leave a forward
declaration for pidfh in libutil.h in its place.
This allows us to hide the contents of the pidfh structure, and also
allowed removal of the "#ifdef _SYS_PARAM_H" guard from around the
pidfile_* function prototypes.
230233: Fix more disorder in prototypes and constants.
Fix header comments for each section of constants.
Fix whitespace in #define lines.
Fix unnecessary parenthesis in constants.
230599: Restore the parenthesis that are necessary around the
constant values.
230600: Make the comments consistent (capitalization, punctuation, and
format).
230601: Consensus between bde and pjd seemed to be that if the function
names are lined up, then any * after a long type should appear after
the type instead of being in front of the function name on the
following line.
Prevent tmpfs_rename() deadlock in a way similar to UFS. Unlock
vnodes and try to lock them one by one. Relookup fvp and tvp.
Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp().
Doomed vnode is hardly of any use here, besides all callers handle
error case. vfs_hash_get() does the same. Don't mess with vnode
holdcount, vget() takes care of it already.
MFC r233231:
Fix several problems with our ELF filters implementation.
Do not relocate twice an object which happens to be needed by the loaded
binary (or dso) and some filtee opened due to symbol resolution when
relocating needed objects. Record the state of the relocation
processing in Obj_Entry and short-circuit relocate_objects() if the
current object has already been processed.
Do not call constructors for filtees loaded during the early
relocation processing before image is initialized enough to run
user-provided code. Filtees are loaded using dlopen_object(), which
normally performs relocation and initialization. If a filtee is
lazy-loaded during the relocation of a dso needed by the main object,
dlopen_object() runs too early, when most runtime services are not
yet ready.
Postpone the constructor calls to the time when the main binary's and
dependent libraries' constructors are run, passing the new flag
RTLD_LO_EARLY to dlopen_object(). Symbol lookup callers inform
symlook_* functions about early stage of initialization with
SYMLOOK_EARLY. Pass flags through all functions participating in
object relocation.
Use the opportunity to fix the flags argument to find_symdef() in
arch-specific reloc.c to use the proper name SYMLOOK_IN_PLT instead of
true, which happens to have the same numeric value.
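The "record the relocation state and short-circuit" part of this change
follows a common run-once pattern, sketched below. The struct and function
are hypothetical and much simpler than rtld's Obj_Entry and
relocate_objects(); the counter exists only to make the effect observable.

```c
struct sk_obj {
	int relocated;	/* set once relocation has been processed */
	int reloc_runs;	/* how many times the work was really done */
};

/*
 * Relocate one object. An object reachable both as a needed dso and
 * through a filtee is visited twice, but processed only once.
 */
static void
sk_relocate(struct sk_obj *obj)
{
	if (obj->relocated)
		return;		/* already processed: short-circuit */
	obj->relocated = 1;
	obj->reloc_runs++;	/* the actual relocation work goes here */
}
```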
MFC r233777 (by kan):
Do not try to adjust stacks if dlopen_object is called too early.
MFC r233778 (by kan):
Remove extra blank line from previous commit.
MFC note: the ARM and MIPS TLS support is not merged back, so the chunks
from r233231 which fix misuse of flags in calls to find_symdef() in
the corresponding relocation type handlers were not applied. When TLS
support is merged, the rest of r233231 should be applied too.
MFC r233746:
Be more conservative in using the READ CAPACITY(16) command. The previous
code checked the PROTECT bit in INQUIRY data for all SPC devices, while it
is defined only since SPC-3. But some SPC-2 USB devices were reported that
have the PROTECT bit set and return no error for the READ CAPACITY(16)
command, but return a wrong sector count value in the response.
MFC 232700:
Add a new sched_clear_name() method to the scheduler interface to clear
the cached name used for KTR_SCHED traces when a thread's name changes.
This way KTR_SCHED traces (and thus schedgraph) will notice when a thread's
name changes, most commonly via execve().