Provide the unique implementation for the VOP_GETPAGES() method used
by ffs and ext2fs. Remove duplicated call to vm_page_zero_invalid(),
done by VOP and by vm_pager_getpages(). Use vm_pager_free_nonreq().
Reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 6 weeks (after r271596)
ian [Sun, 14 Sep 2014 23:48:18 +0000 (23:48 +0000)]
Use gic_decode_fdt() rather than a local routine to parse fdt interrupt
properties. Move fdt_pic_table and fdt_fixup_table into imx6_machdep.c,
which means imx6 doesn't need imx_common.c anymore.
Add a couple of memory barriers to serialize tdq_cpu_idle and tdq_load accesses.
This change fixes transient performance drops in some of my benchmarks,
which vanished as soon as I tried to collect any stats from the scheduler.
It looks like reordered access to those variables sometimes caused loss of
IPI_PREEMPT, that delayed thread execution until some later interrupt.
Fix PowerPC backtraces. Since kernel and user have completely separate address
spaces, rather than a split address, we actually can't check for being within
the kernel's address range. Instead, do what other backtraces do, and use
trapexit()/asttrapexit() as the stack sentinel.
ian [Sun, 14 Sep 2014 21:21:03 +0000 (21:21 +0000)]
Add a common routine for parsing FDT data describing an ARM GIC interrupt.
In the fdt data we've written for ourselves, the interrupt properties
for GIC interrupts have just been a bare interrupt number. In standard
data that conforms to the published bindings, GIC interrupt properties
contain 3-tuples that describe the interrupt as shared vs private, the
interrupt number within the shared/private address space, and configuration
info such as level vs edge triggered.
The new gic_decode_fdt() function parses both types of data, based on the
#interrupt-cells property. Previously, each platform implemented a decode
routine and put a pointer to it into fdt_pic_table. Now they can just
list this function in their table instead if they use arm/gic.c.
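A minimal sketch of the two specifier layouts described above, not the actual gic_decode_fdt() code. The names sketch_gic_decode, GIC_FIRST_PPI and GIC_FIRST_SPI are hypothetical; the 3-cell layout (type, number, flags) and the 16/32 offsets for private/shared interrupts are taken from the published GIC binding, and passing a 1-cell value through unchanged is an assumption about the legacy data.

```c
#include <stdint.h>

/* Hypothetical names; only the cell layout comes from the GIC binding. */
enum { GIC_FIRST_PPI = 16, GIC_FIRST_SPI = 32 };

/*
 * Decode an interrupt specifier of ncells cells into a global IRQ number.
 * Returns 0 on success, -1 if the cell count or type is not understood.
 */
static int
sketch_gic_decode(const uint32_t *cells, int ncells, int *irq)
{
	if (ncells == 1) {
		/* Legacy data: a bare interrupt number (assumed pass-through). */
		*irq = (int)cells[0];
		return (0);
	}
	if (ncells == 3) {
		/* cells[0]: 0 = shared (SPI), 1 = private (PPI). */
		if (cells[0] == 0)
			*irq = (int)cells[1] + GIC_FIRST_SPI;
		else if (cells[0] == 1)
			*irq = (int)cells[1] + GIC_FIRST_PPI;
		else
			return (-1);
		/* cells[2] carries trigger/level flags; ignored in this sketch. */
		return (0);
	}
	return (-1);
}
```

The cell count would come from the #interrupt-cells property of the interrupt parent; here it is simply passed in by the caller.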
dim [Sun, 14 Sep 2014 18:50:38 +0000 (18:50 +0000)]
Pull in r217410 from upstream llvm trunk (by Bob Wilson):
Set trunc store action to Expand for all X86 targets.
When compiling without SSE2, isTruncStoreLegal(F64, F32) would return
Legal, whereas with SSE2 it would return Expand. And since the Target
doesn't seem to actually handle a truncstore for double -> float, it
would just output a store of a full double in the space for a float
hence overwriting other bits on the stack.
Patch by Luqman Aden!
This should fix clang -O0 on i386 assigning garbage to floats, in
certain scenarios.
Avoid an exclusive acquisition of the object lock on the expected execution
path through the NFS clients' getpages functions.
Introduce vm_pager_free_nonreq(). This function can be used to eliminate
code that is duplicated in many getpages functions. Also, in contrast to
the code that currently appears in those getpages functions,
vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one
case.
ian [Sun, 14 Sep 2014 17:47:04 +0000 (17:47 +0000)]
Add compat strings for all the flavors of GIC this driver should support.
Also allow the driver to attach to ofwbus as well as simplebus, since some
FDT data puts the root interrupt controller on the root bus.
ian [Sun, 14 Sep 2014 17:36:57 +0000 (17:36 +0000)]
Fix an undefined variable that was accidentally not causing an error.
The code had references to both intr_offset and intr_parent variable names
as referring to the parent interrupt node. The intr_parent variable
wasn't actually defined anywhere, but the only references to it were as
an argument to a macro that didn't use that argument in expansion, so
the undefined variable accidentally didn't cause an error.
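The effect described above can be reproduced in a few lines. This is an illustrative sketch only; SKETCH_SETUP, sketch_setup_node and sketch_caller are hypothetical names, not the macro or functions touched by this commit.

```c
/* Hypothetical macro: the first parameter never appears in the expansion. */
#define SKETCH_SETUP(parent, node)	sketch_setup_node(node)

static int
sketch_setup_node(int node)
{
	return (node * 2);
}

static int
sketch_caller(void)
{
	int intr_offset = 21;

	/*
	 * intr_parent is not declared anywhere, yet this compiles: the
	 * preprocessor drops the unused argument before the compiler
	 * ever sees the identifier.
	 */
	return (SKETCH_SETUP(intr_parent, intr_offset));
}
```

If the macro were later changed to use its first parameter, the build would start failing at every such call site.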
The intr_parent name makes more sense in context, so change all occurrences
of intr_offset to intr_parent.
The devq_openings counter lost its meaning after allocation queues were
removed. The held counter is still meaningful, but problematic to update due
to separate locking of CCB allocation and queuing.
To fix that, replace the devq_openings counter with an allocated counter. held
is now calculated on request as the difference between the number of allocated,
queued and active CCBs.
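As a rough illustration of the on-request calculation, assuming a structure that carries the three counts (struct sketch_ccbq and its field names are hypothetical, not the real CAM queue structure):

```c
/* Hypothetical per-device counters; field names are illustrative only. */
struct sketch_ccbq {
	int allocated;	/* CCBs handed out by the allocator */
	int queued;	/* CCBs waiting in the device queue */
	int active;	/* CCBs currently being executed */
};

/* held = allocated CCBs that are neither queued nor active. */
static int
sketch_ccbq_held(const struct sketch_ccbq *q)
{
	return (q->allocated - q->queued - q->active);
}
```

Deriving the value on demand means no separate counter has to be kept consistent across the allocation and queuing locks.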
Fix mis-spelling of bits and types names in the vnode_pager_putpages().
The changes should not modify the generated code.
The pager->pgo_putpages() method takes int flags as its fourth
argument, while vnode_pager_putpages() used boolean_t (which is
typedef'ed to int). The flags are from VM_PAGER_* namespace, while
vnode_pager_putpages() passed TRUE and OBJPC_SYNC to VOP_PUTPAGES(),
which both are numerically equal to VM_PAGER_PUT_SYNC.
Noted and reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kan [Sun, 14 Sep 2014 00:02:37 +0000 (00:02 +0000)]
Add delay to Octeon MDIO access routines.
Prevent saturation of the bus by constant polling which in
extreme cases can cause interface lockup. This makes FreeBSD
match similar case in the executive.
Rename the choices in the partitioning methods dialog to reflect current
reality. In particular, draw a connection between the auto ZFS script and
the auto UFS one, since they fulfill similar functions. I'm not sure the
auto ZFS code is actually experimental anymore, so it might be worth
changing that label still.
Make the default choice for the chroot shell at the end be "No". This allows
just pressing enter repeatedly to successfully install a reasonable system.
When asked to find a hole, the DMU sees that there are no holes in the
object, and returns ESRCH. The ZPL interprets this as "no holes before
the end of the file", and therefore inserts the "virtual hole" at the
end of the file. Because DMU and ZPL have different ideas of where the
end of an object/file is, we will end up returning the end of file,
which is generally larger, instead of returning the end of object.
The fix is to handle the "virtual hole" in the DMU. If no hole is found,
the DMU will return a hole at the end of the file, rather than an error.
Illumos issue:
5139 SEEK_HOLE failed to report a hole at end of file
ian [Sat, 13 Sep 2014 17:38:26 +0000 (17:38 +0000)]
Make inclusion of fdt clock support conditional on fdt_clock, not just fdt.
There are plenty of platforms that use fdt without needing the overhead of
the new clock support routines.
Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to
limit how many blocks can be freed before a new transaction group is
created. The default is no limit (infinite), but we should probably have
a lower default, e.g. 100,000.
With this limit, we can guard against the case where ZFS could run out of
memory when destroying large numbers of blocks in a single transaction
group, as the entire DDT needs to be brought into memory.
Illumos issue:
5138 add tunable for maximum number of blocks freed in one txg
Implement control over command reordering via options and control mode page.
This allows bypassing range checks between UNMAP and READ/WRITE commands,
which may introduce additional delays while waiting for UNMAP parameters.
READ and WRITE commands are always processed in safe order since their
range checks are almost free.
Improve the TCP segmentation offload (TSO) algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware are limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Otherwise some kinds of hardware might have to drop completely valid
mbuf chains because they cannot be loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible, following input from other FreeBSD developers, and will use
defaults for values not set.
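A hedged sketch of what checking all three limits could look like, with an array of fragment lengths standing in for an mbuf chain. The structure, field names and function below are hypothetical and do not reproduce the actual network stack code.

```c
/* Hypothetical limit set; names are illustrative, not the ifnet fields. */
struct sketch_tso_limits {
	unsigned max_bytes;	/* total payload the hardware accepts */
	unsigned max_segs;	/* maximum number of fragments */
	unsigned max_seg_size;	/* maximum length of one fragment */
};

/* Return 1 if the fragment list satisfies all three limits, else 0. */
static int
sketch_tso_fits(const unsigned *frag_len, unsigned nfrags,
    const struct sketch_tso_limits *lim)
{
	unsigned total = 0;
	unsigned i;

	if (nfrags > lim->max_segs)
		return (0);
	for (i = 0; i < nfrags; i++) {
		if (frag_len[i] > lim->max_seg_size)
			return (0);
		total += frag_len[i];
	}
	return (total <= lim->max_bytes);
}
```

A chain that passes a byte-only check can still fail here on fragment count or fragment size, which is the case the old limitation could not express.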
Implement range checks between UNMAP and READ/WRITE commands.
Before this change UNMAP completely blocked other I/Os while running.
Now it blocks only colliding ones, slowing down others only due to ZFS
locks collisions.
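The collision test itself is a plain interval-overlap check. A minimal sketch, assuming each command is described by a starting LBA and a block count (sketch_ranges_collide is a hypothetical helper, and overflow of lba + len is not handled):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [lba1, lba1 + len1) and [lba2, lba2 + len2) share at
 * least one block. Zero-length ranges never collide.
 */
static bool
sketch_ranges_collide(uint64_t lba1, uint64_t len1,
    uint64_t lba2, uint64_t len2)
{
	if (len1 == 0 || len2 == 0)
		return (false);
	return (lba1 < lba2 + len2 && lba2 < lba1 + len1);
}
```

Only commands for which such a check returns true would need to wait for the running UNMAP; all others can proceed.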
Workaround for receiving Voice Calls using the E1750 dongle from
Huawei. It might appear as if the firmware is allocating memory blocks
according to the USB transfer size and if there is initially a lot of
data, like at the answering machine prompt, it simply dies without any
apparent reason. The simple workaround for this is to force a zero
length packet at hardware level after every 512 bytes of data. This
will force the other side to use smaller memory blocks as well.
Fix various issues with invalid file operations:
- Add invfo_rdwr() (for read and write), invfo_ioctl(), invfo_poll(),
and invfo_kqfilter() for use by file types that do not support the
respective operations. Home-grown versions of invfo_poll() were
universally broken (they returned an errno value, invfo_poll()
uses poll_no_poll() to return an appropriate event mask). Home-grown
ioctl routines also tended to return an incorrect errno (invfo_ioctl
returns ENOTTY).
- Use the invfo_*() functions instead of local versions for
unsupported file operations.
- Reorder fileops members to match the order in the structure definition
to make it easier to spot missing members.
- Add several missing methods to linuxfileops used by the OFED shim
layer: fo_write(), fo_truncate(), fo_kqfilter(), and fo_stat(). Most
of these used invfo_*(), but a dummy fo_stat() implementation was
added.
Simplify vntype_to_kinfo() by returning when the desired value is found
instead of breaking out of the loop and then immediately checking the loop
index so that if it was broken out of the proper value can be returned.
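The shape of the simplification, shown on a made-up lookup table rather than the real vntype_to_kinfo() mapping (all names and values below are hypothetical):

```c
#include <stddef.h>

/* Hypothetical mapping table; the values are placeholders. */
struct sketch_map {
	int from;
	int to;
};

static const struct sketch_map sketch_table[] = {
	{ 1, 10 },
	{ 2, 20 },
	{ 3, 30 },
};

#define SKETCH_UNKNOWN	(-1)

/* Return the mapped value as soon as it is found; no post-loop index test. */
static int
sketch_lookup(int from)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++)
		if (sketch_table[i].from == from)
			return (sketch_table[i].to);
	return (SKETCH_UNKNOWN);
}
```

Falling out of the loop now has a single meaning, "not found", so the default return needs no extra condition.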
- Don't let rman_reserve_resource() activate the resource in
nexus_alloc_resource() and don't set a bushandle.
nexus_activate_resource() will set a proper bushandle.
- Implement a proper nexus_release_resource().
- Fix ixppcib_activate_resource() to call rman_activate_resource()
before creating a mapping for the resource.
Add support for adding empty partition entries. I.e. skip partition
numbers or names. This gives more control over the actual layout and
helps to construct BSD disklabels with /usr or /var at dedicated
partitions.
Obtained from: Juniper Networks, Inc.
MFC after: 3 days
Relnotes: yes
- Provide a sleepable lock to protect against ioctl() vs ioctl() races.
- Use the new lock to protect against simultaneous DIOCSTART and/or
DIOCSTOP ioctls.
Reported & tested by: jmallett
Sponsored by: Nginx, Inc.
Optimize the common case of injecting an interrupt into a vcpu after a HLT
by explicitly moving it out of the interrupt shadow. The hypervisor is done
"executing" the HLT and by definition this moves the vcpu out of the
1-instruction interrupt shadow.
Prior to this change the interrupt would be held pending because the VMCS
guest-interruptibility-state would indicate that "blocking by STI" was in
effect. This resulted in an unnecessary round trip into the guest before
the pending interrupt could be injected.
Be compatible with boot code that starts right after the disk label in
the second sector by only clearing the number of bytes needed for the
disklabel in the second sector. Previously we were clearing exactly 1
sector worth of bytes and as such writing over boot code that may have
been there.
Since we do support more than 8 partitions, make sure to set all fields
in d_partitions. For the first 8 partitions this is unneeded, but for
partitions 9 and up this compensates for the fact that we don't clear
an entire sector anymore.
Obviously, one cannot use more than 8 partitions when using boot code
that starts right after the disk label.
Relevant GRNs:
107879 - Employ unused bytes after the disklabel in the second sector.
189500 - Revert the part of change 107879 that employs the unused bytes
after the disklabel in the 2nd sector for boot code.
Obtained from: Juniper Networks, Inc.
MFC after: 3 days
Fix checksum calculation:
1. Iterate over all partitions counted in the label, which can be more
than the number of partitions given to mkimg(1).
2. Start the checksum from the beginning of the label; not the beginning
of the bootarea.
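A sketch of the two points above, assuming the checksum is the XOR of 16-bit little-endian words over the label header plus every counted partition entry. The function name and parameters are hypothetical; the header and entry sizes are passed in rather than taken from the real struct disklabel, and the stored checksum field is assumed to be zero while the sum is computed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * XOR all 16-bit words from the start of the label (not the start of the
 * boot area) through the last partition entry counted in the label.
 */
static uint16_t
sketch_label_checksum(const uint8_t *label, size_t hdrlen, size_t partlen,
    unsigned npartitions)
{
	size_t len = hdrlen + (size_t)npartitions * partlen;
	uint16_t sum = 0;
	size_t off;

	for (off = 0; off + 1 < len; off += 2)
		sum ^= (uint16_t)(label[off] | (label[off + 1] << 8));
	return (sum);
}
```

Passing the partition count recorded in the label, rather than the number of partitions supplied by the user, is what makes the covered length match what a reader of the label will verify.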
Attach the ISO to an ahci-cd emulated device. The
ISO will appear to be mounted on a /dev/cd device
instead of /dev/vtbd. This is similar to how other
virtualization environments handle mounting ISO images.