Rafal Jaworowski [Mon, 19 Aug 2013 15:58:39 +0000 (15:58 +0000)]
Simplify and clean up pmap_clearbit()
There is no need for calling vm_page_dirty() when clearing "modified" flag as
it is already set for that page in pmap_fault_fixup() or pmap_enter() thanks
to "modified" bit emulation.
Also, there is no need for checking PTE "referenced" or "writeable" flags. If
there is a request to clear a particular flag we should just do it.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
Rafal Jaworowski [Mon, 19 Aug 2013 15:36:23 +0000 (15:36 +0000)]
Fix ARMv6/v7 mapping's wired status.
Last input argument in pmap_modify_pv() should be a mask of flags to be set.
In pmap_change_wiring() however, the straight wired status was used, which
does not represent valid flags (and is of type boolean).
This commit fixes the issue so that wired flag is passed to pmap_modify_pv()
properly.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
Rafal Jaworowski [Mon, 19 Aug 2013 15:12:36 +0000 (15:12 +0000)]
Clear all L2 PTE protection bits before their configuration.
Revise L2_S_PROT_MASK to include all of the protection bits. Notice that
clearing these bits does not always take away the corresponding permissions
(for example, permission is granted when the bit is cleared). The bits are
cleared but are to be set or left cleared accordingly in pmap_set_prot(),
pmap_enter_locked(), etc.
Clear L2_XN along with L2_S_PROT_MASK in pmap_set_prot() so that all
permissions related bits are cleared before actual configuration.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
Rafal Jaworowski [Mon, 19 Aug 2013 14:56:17 +0000 (14:56 +0000)]
Simplify pv_entry removal or ARMv6/v7:
- PGA_WRITEABLE indicates that there *might be* a writable mapping for the
particular page, so to avoid frequent sweeping of the pv_entries whenever
pmap_nuke_pv(), pmap_modify_pv(), etc. is called, it is sufficient to
clear that flag if there are no managed mappings for that page anymore
(notice that only pmap_enter is authorized to set this flag).
- Avoid redundant checking for PVF_WIRED flag when this flag cannot be set
anyway.
- Clear PGA_WRITEABLE only once for each vm_page instead of multiple,
redundant clearing it in loop when there are no writeable mappings
to that page anymore.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: gber
Sponsored by: The FreeBSD Foundation, Semihalf
Andre Oppermann [Mon, 19 Aug 2013 12:30:18 +0000 (12:30 +0000)]
Move the SCTP specific definition of M_NOTIFICATION onto a protocol
specific mbuf flag from sys/mbuf.h to netinet/sctp_os_bsd.h. It is
only relevant within SCTP.
Andre Oppermann [Mon, 19 Aug 2013 11:16:53 +0000 (11:16 +0000)]
Remove the unused M_NOFREE mbuf flag. It didn't have any in-tree users
for a very long time, if ever.
Should such a functionality ever be needed again the appropriate and
much better way to do it is through a custom EXT_SOMETHING external mbuf
type together with a dedicated *ext_free function.
Andre Oppermann [Mon, 19 Aug 2013 10:34:10 +0000 (10:34 +0000)]
Move ip_reassemble()'s use of the global M_FRAG mbuf flag to a protocol layer
specific flag instead. The flag is only relevant while the packet stays in
the IP reassembly queue.
Andre Oppermann [Mon, 19 Aug 2013 10:30:15 +0000 (10:30 +0000)]
Remove unused M_FRAG, M_FIRSTFRAG and M_LASTFRAG tagging from ip_fragment().
There wasn't any real driver (and hardware) support for it. Modern hardware
does full fragmentation/segmentation offload instead.
Ian Lepore [Mon, 19 Aug 2013 01:29:13 +0000 (01:29 +0000)]
Allow a hardware driver to pass clock frequencies into the sdhci driver.
The sdhci spec says that if the base or timeout clock frequency in the
capabilities register is zero, the driver must obtain the frequency "from
another source." This change defines that other source to be the low-level
hardware driver, which can pre-set the frequencies in slot.max_clk and
slot.timeout_clk before calling sdhci_init_slot().
This helps with a growing number of SoCs that have sdhci base clock
frequencies that either won't fit into the range allowed by the number of
bits available in the capabilities register, or the frequency is runtime-
configurable.
When code from r254064 in pmap_ts_referenced() drops pv lock and
blocks on a pmap lock, pmap_release() might proceed in parallel and
destroy the pmap mutex, since unlocked pv lock allows to remove pv
entry owned by the pmap.
For now, gate the pmap_release() on write-locked pvh_global_lock.
Since pmap_ts_release() does not unlock the global lock,
pmap_release() would not destroy pmap mutex until the
pmap_ts_referenced() finished. We cannot enter pmap_ts_referenced()
and encounter a pv entry for the destroyed pmap if pmap_release()
passed the global lock gate, since pmap_remove_pages() would finish
earlier.
Reported by: jeff, pho
Reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
Ian Lepore [Sun, 18 Aug 2013 19:08:53 +0000 (19:08 +0000)]
Add a new SDHCI_QUIRK_DONT_SHIFT_RESPONSE for hardware that pre-shifts
the response bits the way we do in software. While the hardware is just
doing the sensible thing rather than leaving it to the software, it's in
violation of the spec by doing so. Grrrr.
The cap_rights_limit(2) system calls needs a wrapper for 32bit binaries
running under 64bit kernels as the 'rights' argument has to be split
into two registers or the half of the rights will disappear.
Reported by: jilles
Sponsored by: The FreeBSD Foundation
Consistently use 'af' as an argument name for address family.
Now both gethostbyname2(3) and gethostbyaddr(3) use the same argument name.
The same argument name is also used in implementations of those functions.
Add process descriptors support to the GENERIC kernel. It is already being
used by the tools in base systems and with sandboxing more and more tools
the usage should only increase.
Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org>
Sponsored by: Google Summer of Code 2013
MFC after: 1 month
Adrian Chadd [Sun, 18 Aug 2013 06:08:52 +0000 (06:08 +0000)]
Add in missing events for Sandy Bridge Xeon.
* Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy
bridge Xeon. Right now it only is enabled for Sandy Bridge.
* D2/0F is actually a combination rather than a separate counter, so
just flip that on for the CPU types that support it.
There's an errata for using this on SB Xeon hardware - I've documented
it in kern/181346.
Mark Johnston [Sat, 17 Aug 2013 22:06:30 +0000 (22:06 +0000)]
Update the SDT(9) man page with the macros added in 254468. Also change the
existing examples to not pass an mbuf as a probe argument. There's no
obvious reason to have it there, and it doesn't really jibe with the example
added in this revision.
Mark Johnston [Sat, 17 Aug 2013 22:02:26 +0000 (22:02 +0000)]
Add a "translated type" argument to SDT_PROBE_ARGTYPE() and add some macros
which allow one to define SDT probes that specify translated types. The idea
is to make it easy to write SDT probe definitions that can work across
multiple operating systems. In particular, this makes it possible to port
illumos SDT probes to FreeBSD without changing their argument types, so long
as the appropriate translators are defined. Then DTrace scripts written for
Solaris/illumos will work on FreeBSD without any changes.
Neel Natu [Sat, 17 Aug 2013 19:49:08 +0000 (19:49 +0000)]
Bump up the maximum addressable memory on amd64 systems from 1TB to 4TB.
Bump up the KVA size proportionally from 512GB to 2TB.
The number of page table pages used by the direct map is now calculated at
run time based on 'Maxmem'. This means the small memory systems will not
see any additional tax in terms of page table pages for the direct map.
However all amd64 systems, regardless of the memory size, will use 3 more
pages to accomodate the bump in the KVA size.
More details available here:
http://lists.freebsd.org/pipermail/freebsd-hackers/2013-June/043015.html
http://lists.freebsd.org/pipermail/freebsd-current/2013-July/043143.html
Tested with the following configurations:
- Sandybridge server with 64GB of memory.
- bhyve VM with 64MB of memory.
- bhyve VM with a 8GB of memory with the memory segment above 4GB cuddling
right up against the 4TB maximum memory limit.
Discussed on: hackers@, current@
Submitted by: Chris Torek (torek@torek.net)
Jilles Tjoelker [Sat, 17 Aug 2013 19:24:58 +0000 (19:24 +0000)]
libc: Access _logname_valid more efficiently.
The variable _logname_valid is not exported via the version script;
therefore, change C and i386/amd64 assembler code to remove indirection
(which allowed interposition). This makes the code slightly smaller and
faster.
Also, remove #define PIC_GOT from i386/amd64 in !PIC mode. Without PIC,
there is no place containing the address of each variable, so there is no
possible definition for PIC_GOT.
Andrew Turner [Sat, 17 Aug 2013 18:51:38 +0000 (18:51 +0000)]
Rename device vfp to option VFP and retire the ARM_VFP_SUPPORT option. This
simplifies enabling as previously both options were required to be enabled,
now we only need a single option.
Bryan Venteicher [Sat, 17 Aug 2013 17:02:43 +0000 (17:02 +0000)]
Do not use potentially stale thread in kthread_add()
When an existing process is provided, the thread selected to use
to initialize the new thread could have exited and be reaped.
Acquire the proc lock earlier to ensure the thread remains valid.
Reviewed by: jhb, julian (previous version)
MFC after: 3 days
Andrew Turner [Sat, 17 Aug 2013 14:52:19 +0000 (14:52 +0000)]
Remove unused FPE code. This is not enabled anywhere as it is the only
file I can find containing FAST_FPE. It appears this would not work as
want_resched is not defined anywhere.
Andrew Turner [Sat, 17 Aug 2013 14:36:32 +0000 (14:36 +0000)]
Silence a warning that is incorrect on ARMv6 and later. In the smull, umull,
smlal, and umlal the output registers are allowed to be the same as either
input registers, where in ARMv4 and ARMv5 they could only be the same as the
last input register.
Hiroki Sato [Sat, 17 Aug 2013 07:12:52 +0000 (07:12 +0000)]
Unbreak rwhod(8):
- It did not work with GENERIC kernel after r250603 because
options PROCDESC was required for pdfork(2). It now just uses fork(2)
instead when this syscall is not available.
- Fix verify(). This function was broken in r250602 because the outermost
"()" was removed from the condition !(isalnum() || ispunct()).
It prevented hostnames including "-", for example.
Ian Lepore [Fri, 16 Aug 2013 23:05:34 +0000 (23:05 +0000)]
Handle command retries for commands originating at the mmc layer, and
ensure that all such commands have a non-zero retry count except for those
that are expected to fail (for example, because they are used to probe for
feature support).
While it is possible to pass a retry count down to the hardware driver in
the command request structure, no hardware driver currently implements any
retry logic. The hardware doesn't know much about the context of a single
request, so it makes more sense to handle retries at a layer that does.
This adds retry loops to the mmc_wait_for_cmd() and mmc_wait_for_app_cmd()
functions. These functions are the gateway from other code within mmc.c
to the hardware. App commands are a sequence of two commands and a retry
has to rerun both of them in order, so it needs its own retry loop.
Retry looping is specifically NOT implemented in mmc_wait_for_request()
because it is the gateway for children on the bus, and they have to
implement their own retry logic depending on what makes sense for them.
John Baldwin [Fri, 16 Aug 2013 21:13:55 +0000 (21:13 +0000)]
Add new mmap(2) flags to permit applications to request specific virtual
address alignment of mappings.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
Requests for n >= number of bits in a pointer or less than the size of
a page fail with EINVAL. This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used
to optimize the chances of using large pages. By default it will align
the mapping on a large page boundary (the system is free to choose any
large page size to align to that seems best for the mapping request).
However, if the object being mapped is already using large pages, then
it will align the virtual mapping to match the existing large pages in
the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
explicitly using VMFS_SUPER_SPACE. All device objects are forced to
use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
equivalent.
Ian Lepore [Fri, 16 Aug 2013 20:32:56 +0000 (20:32 +0000)]
During card identification, run the bus at 400KHz, not the minimum
speed the bus claims to be capable of. The 400KHz speed is dictated
by the SD and MMC standards.
Ian Lepore [Fri, 16 Aug 2013 19:44:49 +0000 (19:44 +0000)]
Add named constants for 8-bit bus support. The sdhci and mmc drivers
don't have support for this yet, but some low-level hardware is ready
for it when the higher layers catch up.
Ian Lepore [Fri, 16 Aug 2013 19:40:00 +0000 (19:40 +0000)]
When the timeout clock is based on the SD clock, the timeout counter
has to be recalculated every time the SD clock frequency changes.
Also, tidy up the counter calculation... it makes no sense to calculate
a value one larger than the limit, then whine that it's too large and
truncate it to the limit. If the BROKEN_TIMEOUT quirk is set, don't
calculate the counter at all, just set it to the limit value.
Simon J. Gerraty [Fri, 16 Aug 2013 16:26:23 +0000 (16:26 +0000)]
When we need to build using the in-tree make,
switch at the earliest opportunity.
In the case of fmake vs bmake, this helps ensure correct load handling.
Kenneth D. Merry [Fri, 16 Aug 2013 16:14:32 +0000 (16:14 +0000)]
Add unmapped I/O and larger I/O support to the sa(4) driver.
We now pay attention to the maxio field in the XPT_PATH_INQ CCB,
and if it is set, propagate it up to physio via the si_iosize_max
field in the cdev structure.
We also now pay attention to the PIM_UNMAPPED capability bit in the
XPT_PATH_INQ CCB, and set the new SI_UNMAPPED cdev flag when the
underlying SIM supports unmapped I/O.
scsi_sa.c: Add unmapped I/O support and propagate the SIM's
maximum I/O size up.
Adjust scsi_tape_read_write() in the same way that
scsi_read_write() was changed to support unmapped
I/O. We overload the readop parameter with bits
that tell us whether it's an unmapped I/O, and we
need to set the CAM_DATA_BIO CCB flag. This change
should be backwards compatible in source and
binary forms.