* new macro to remove magic number - IPV6_ADDR_SCOPES_COUNT;
* sa6_checkzone() - this function checks sockaddr_in6 structure
for correctness of sin6_scope_id. It also can fill correct
value sometimes.
* in6_getscopezone() - this function returns scope zone id for
specified interface and scope.
* in6_getlinkifnet() - this function returns struct ifnet for
corresponding zone id of link-local scope.
- Add $netif_ipexpand_max to specify the upper limit for the number of
addresses generated by an address range specification. The default
value is 2048. This can be increased by setting $netif_ipexpand_max
in rc.conf.
- Fix warning messages when an address range spec exceeds the upper limit.
* constify argument of in6_addrscope();
* use IN6_IS_ADDR_XXX() macro instead of hardcoded values;
* for multicast addresses just return scope value, the only exception
is addresses with 0x0F scope value (RFC 4291 p2.7.0);
Add new a M_START() mbuf macro that returns a pointer to the start of
an mbuf's storage (internal or external).
Add a new M_SIZE() mbuf macro that returns the size of an mbuf's
storage (internal or external).
These contrast with m_data and m_len, which are with respect to data
in the buffer, rather than the buffer itself.
Rewrite M_LEADINGSPACE() and M_TRAILINGSPACE() in terms of M_START()
and M_SIZE().
This is done as we currently have many instances of using mbuf flags
to generate pointers or lengths for internal storage in header and
regular mbufs, as well as to external storage. Rather than replicate
this logic throughout the network stack, centralising the
implementation will make it easier for us to refine mbuf storage.
This should also help reduce bugs by limiting the amount of
mbuf-type-specific pointer arithmetic. Followup changes will
propagate use of the macros throughout the stack.
M_SIZE() conflicts with one macro in the Chelsio driver; rename that
macro in a slightly unsatisfying way to eliminate the collision.
Use the linker to perform relocations in the SUNW_dof section rather than
doing them in drti during startup. This fixes a number of problems with
using USDT probes in stripped executables and shared libraries, and with
USDT probes in static functions.
Fix a bug which could break extended attributes in a dump output.
This occurred when a file was >892kB long and had a large data (>1kB)
in the extended attributes.
MFamd64: Use initializecpu() to set various model-specific registers on
AP startup and AP resume (it was already used for BSP startup and BSP
resume).
- Split code to do one-time probing of cache properties out of
initializecpu() and into initializecpucache(). This is called once on
the BSP during boot.
- Move enable_sse() into initializecpu().
- Call initializecpu() for AP startup instead of enable_sse() and
manually frobbing MSR_EFER to enable PG_NX.
- Call initializecpu() when an AP resumes. In theory this will now
properly re-enable PG_NX in MSR_EFER when resuming a PAE kernel on
APs.
To workaround an errata on certain Pentium Pro CPUs, i386 disables
the local APIC in initializecpu() and re-enables it if the APIC code
decides to use the local APIC after all. Rework this workaround
slightly so that initializecpu() won't re-disable the local APIC if
it is called after the APIC code re-enables the local APIC.
None of existing STEC devices need UNMAP or even support it well, having
many limitations and even hanging sometimes executing those commands.
New devices that may use UNMAP going to be released under HGST name.
Add support for calling pcibios routines from the
bootloader. Implement the following routines:
pcibios-device-count count the number of instances of a devid
pcibios-read-config read pci config space
pcibios-write-config write pci config space
pcibios-find-devclass find the nth device with a given devclass
pcibios-find-device find the nth device with a given devid
pcibios-locator convert bus device function ti pcibios locator
These commands are thin wrappers over their PCI BIOS 2.1 counterparts. More
informaiton, such as it is, can be found in the standard.
Export a nunmber of pcibios.X variables into the environment to report
what the PCI IDENTIFY command returned.
Also implmenet a new command line primitive (pci-device-count), but don't
include it by default just yet, since it depends on the recently added
words and any errors here can render a system unbootable.
This is intended to allow the boot loader to do special things based
on the hardware it finds. This could be have special settings that are
optimized for the specific cards, or even loading special drivers. It
goes without saying that writing to pci config space should not be
done without a just cause and a sound mind.
Conditionalize build of etcupdate(8) on MK_RCS. Since etcupdate calls
merge(1), which is part of the RCS package, it must not be installed if
WITHOUT_RCS update is set. Otherwise, it will produce confusing errors.
Add scope zone id to the in_endpoints and hc_metrics structures.
A non-global IPv6 address can be used in more than one zone of the same
scope. This zone index is used to identify to which zone a non-global
address belongs.
Also we can have many foreign hosts with equal non-global addresses,
but from different zones. So, they can have different metrics in the
host cache.
Add additional checks for IPV6_PKTINFO handling (RFC 3542):
* Return ENETDOWN when interface specified by ipi6_ifindex is not
enabled for IPv6 use.
* Return EADDRNOTAVAIL when ipi6_ifindex specifies an interface, but the
address ipi6_addr is not available for use on that interface.
* Return EINVAL when ipi6_addr is multicast address.
Make it possible to disable NOP-In PDUs by the iSCSI initiator by setting
kern.cam.ctl.iscsi.ping_timeout to 0. This fixes interoperability with
some initiators that don't properly support NOP-Ins, namely iPXE/gPXE.
Submitted by: Chen Wen <pokkys@gmail.com>
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
ray [Wed, 10 Sep 2014 11:13:13 +0000 (11:13 +0000)]
o Add sysctls to enable/disable potentially dengerous key combinations, like
reboot/halt/debug.
o Add support for most key combinations supported by syscons(4).
Reviewed by: dumbbell, emaste (prev revision of D747)
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
Replace local copy-and-paste implementations of printmbuf() in several
device drivers with calls to the centralised m_print() implementation.
While the formatting and output details differ a little, the content
is essentially the same, and it is unlikely anyone has used this
debugging output in some time.
This change reduces awareness of mbuf cluster allocation (and,
especially, the M_EXT flag) outside of the mbuf allocator, which will
make it easier to refine the external storage mechanism without
disrupting drivers in the future.
- Reduce DPADD and LDADD in checkdpadd to -l<foo>
- Skip over -Wl,[es]*-group because -Wl,--end-group and
-Wl,--start-group might be required to properly link objects (see
usr.bin/clang/lldb as an example)
This caveat has been present for a while with some components of
the build. However, these false positives were made more more apparent
after r269648.
Fix a boundary case error in vm_reserv_alloc_contig(): If a reservation
isn't being allocated for the last of the requested pages, because a
reservation won't fit in the gap between allocated pages, then the
reservation structure shouldn't be initialized.
Fix issue with nmdm and leading zeros in device name.
The nmdm code enforces a number between the 'nmdm' and 'A|B' portions
of the device name. This is then used as a unit number, and sprintf'd
back into the tty name. If leading zeros were used in the name,
the created device name is different than the string used for the
clone-open (e.g. /dev/nmdm0001A will result in /dev/nmdm1A).
Since unit numbers are no longer required with the updated tty
code, there seems to be no reason to force the string to be a
number. The fix is to allow an arbitrary string between
'nmdm' and 'A|B', within the constraints of devfs names. This allows
all existing user of numeric strings to continue to work, and also
allows more meaningful names to be used, such as bhyve VM names.
Tested on amd64, i386 and ppc64.
Reported by: Dave Smith
PR: 192281
Reviewed by: neel, glebius
Phabric: D729
MFC after: 3 days
NetBSD's virtio-net implementation doesn't negotiate
the merged rx-buffers feature. To support this, check
to see if the feature was negotiated, and then adjust
the operation of the receive path accordingly by using
a larger iovec, and a smaller rx header.
In addition, ignore writes to the (read-only) status byte.
Tested with NetBSD/amd64 5.2.2, 6.1.4 and 7-beta.
Reviewed by: neel, tychon
Phabric: D745
MFC after: 3 days
Change how the recommended mailing list to track is
added to the footer of the release/doc/ pages by
moving a hard-coded value (that is subject to human
error to change) to release.ent where other values
are regularly changed, and adding parsing logic to
release.xsl.
Approved by: re (implicit)
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Fix ctld(8) to not forget to send TargetPortalGroupTag and TargetAlias
when the initiator skips security negotiation. This fixes interoperability
with Xtend SAN initiator.
PR: 193021
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
ian [Tue, 9 Sep 2014 13:50:21 +0000 (13:50 +0000)]
Rename new to newval in inline asm code, to avoid clashes with C++ new.
Also rename cmp to cmpval just to keep the asm variable names similar to
the C variable names.
Add the ability to set `prefer_source' flag to an IPv6 address.
It affects the IPv6 source address selection algorithm (RFC 6724)
and allows override the last rule ("longest matching prefix") for
choosing among equivalent addresses. The address with `prefer_source'
will be preferred source address.
adrian [Tue, 9 Sep 2014 04:20:53 +0000 (04:20 +0000)]
Add basic RSS awareness for the UDPv6 send path.
This doesn't include the same kind of userland overriding that the IPv4
path has; nor does it yet know about 2-tuple versus 4-tuple hashing.
That'll come later.
adrian [Tue, 9 Sep 2014 04:18:20 +0000 (04:18 +0000)]
Update the IPv4 input path to handle reassembled frames and incoming frames
with no RSS hash.
When doing RSS:
* Create a new IPv4 netisr which expects the frames to have been verified;
it just directly dispatches to the IPv4 input path.
* Once IPv4 reassembly is done, re-calculate the RSS hash with the new
IP and L3 header; then reinject it as appropriate.
* Update the IPv4 netisr to be a CPU affinity netisr with the RSS hash
function (rss_soft_m2cpuid) - this will do a software hash if the
hardware doesn't provide one.
NICs that don't implement hardware RSS hashing will now benefit from RSS
distribution - it'll inject into the correct destination netisr.
Note: the netisr distribution doesn't work out of the box - netisr doesn't
query RSS for how many CPUs and the affinity setup. Yes, netisr likely
shouldn't really be doing CPU stuff anymore and should be "some kind of
'thing' that is a workqueue that may or may not have any CPU affinity";
that's for a later commit.
adrian [Tue, 9 Sep 2014 03:10:21 +0000 (03:10 +0000)]
Implement IPv4 RSS software hash functions to use during packet ingress
and egress.
* rss_mbuf_software_hash_v4 - look at the IPv4 mbuf to fetch the IPv4 details
+ direction to calculate a hash.
* rss_proto_software_hash_v4 - hash the given source/destination IPv4 address,
port and direction.
* rss_soft_m2cpuid - map the given mbuf to an RSS CPU ("bucket" for now)
These functions are intended to be used by the stack to support
the following:
* Not all NICs do RSS hashing, so we should support some way of doing
a hash in software;
* The NIC / driver may not hash frames the way we want (eg UDP 4-tuple
hashing when the stack is only doing 2-tuple hashing for UDP); so we
may need to re-hash frames;
* .. same with IPv4 fragments - they will need to be re-hashed after
reassembly;
* .. and same with things like IP tunneling and such;
* The transmit path for things like UDP, RAW and ICMP don't currently
have any RSS information attached to them - so they'll need an
RSS calculation performed before transmit.
TODO:
* Counters! Everywhere!
* Add a debug mode that software hashes received frames and compares them
to the hardware hash provided by the hardware to ensure they match.
The IPv6 part of this is missing - I'm going to do some re-juggling of
where various parts of the RSS framework live before I add the IPv6
code (read: the IPv6 code is going to go into netinet6/in6_rss.[ch],
rather than living here.)
Note: This API is still fluid. Please keep that in mind.
adrian [Tue, 9 Sep 2014 00:19:02 +0000 (00:19 +0000)]
Add a flag to ip_output() - IP_NODEFAULTFLOWID - which prevents it from
overriding an existing flowid/flowtype field in the outbound mbuf with
the inp_flowid/inp_flowtype details.
The upcoming RSS UDP support calculates a valid RSS value for outbound
mbufs and since it may change per send, it doesn't cache it in the inpcb.
So overriding it here would be wrong.
ian [Mon, 8 Sep 2014 19:19:10 +0000 (19:19 +0000)]
Add a 'ubenv import' command to import environment variables from the
u-boot env into the loader(8) env (which also gets them into the kernel
env). You can import selected variables or the whole environment. Each
u-boot var=value becomes uboot.var=value in the loader env. You can also
use 'ubenv show' to display uboot vars without importing them.
Note that r271282 contains only the src change from Clang rev 200797.
This patch file includes two follow-on changes to the test case, which
do not apply to the copy in the FreeBSD tree.
Upstream Clang revisions:
200797:
Debug info: fix a crasher when when emitting debug info for
not-yet-completed templated types. getTypeSize() needs a complete type.
Make it possible to use empty user name ("-U ''") for mount_smbfs(8).
It's just like "-U guest", except that it actually works, at least
with Samba 4, which seems to return authentication failure for "-U guest".
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
- Make hhook_run_socket() vnet-aware instead of adding CURVNET_SET() around
the function calls.
- Fix a memory leak and stats in the case that hhook_run_socket() fails
in soalloc().
vt(4): Change the terminal and buffer sizes, even without a font
This fixes a bug where scroll lock would not work for tty #0 when using
vt_vga's textmode. The reason was that this window is created with a
static 256x100 buffer, larger than the real size of 80x25.
Now, in vt_change_font() and vt_compute_drawable_area(), we still
perform operations even of the window has no font loaded (this is the
case in textmode here vw->vw_font == NULL). One of these operation
resizes the buffer accordingly.
In vt_compute_drawable_area(), we take the terminal size as is (ie.
80x25) for the drawable area.
The font argument to vt_set_border() is removed (it was never used) and
the code now uses the computed drawable area instead of re-doing its own
calculation.
Reported by: Harald Schmalzbauer <h.schmalzbauer_omnilan.de>
Tested by: Harald Schmalzbauer <h.schmalzbauer_omnilan.de>
MFC after: 3 days
peter [Mon, 8 Sep 2014 05:14:58 +0000 (05:14 +0000)]
Temporarily remove the warning added r270781 - it prints the warning
regardless of whether the usage is correct or not and this generates a
LOT of noise, even when you have specified a mask.
adrian [Mon, 8 Sep 2014 03:16:28 +0000 (03:16 +0000)]
(more) correctly account TX completion status for A-MPDU session frames.
The rules turn out to be:
* for non-aggregation session TX queues - it's either sent or not sent.
* for aggregation session TX queues - if nframes=1, then the status reflects
the completed transmission.
* however, for nframes > 1, then this is just a status reflecting what
the initial transmission did. The compressed BA (immediate or delayed)
may not have yet been received, so the actual frame status is in the
compressed BA updates.
Whilst here, I fiddled with debugging and formatting a bit.
There's also RTS attempts (what the atheros chips call "short retries")
which weren't being logged and they aren't yet being used in the rate
control statistics updates. For now, at least log them.
TODO:
* This still isn't 100% correct! So I have to tinker with this some more.
(The failures aren't always failures..)
* Extend the rate control API in net80211 so it can take both short and
long retry counts.