For channel0, it will never be processed on event handling path,
so there is no need to install it. After skipping in the channel0
installation, we could discard the channel0 check on event
handling hot code path.
r300892:
Rename function to be less generic.
r300893:
Don't truncate existing error when writing the log.
r301130:
Enable filemon on all architectures.
r301404:
Support all architectures by just using sysent.
r301414:
Fix build after r301404.
r301460:
Cleanup COMPAT_FREEBSD32 support.
r297156:
Track filemon usage via a proc.p_filemon pointer rather than its own lists.
r297157:
Stop tracking stat(2).
r297158:
Consolidate open(2) and openat(2) code.
r297159:
Use curthread for vn_fullpath.
r297161:
Attempt to use the namecache for openat(2) path resolution.
r297172:
Consolidate common link(2) logic.
r297200:
Follow-up r297156: Close the log in filemon_dtr rather than in the last
reference.
r297201:
Return any log write failure encountered when closing the filemon fd.
r297202:
Remove unused done argument to copyinstr(9).
r297203:
Handle copyin failures.
r297256:
Remove unneeded return left from refactoring.
Support for compression has been available from July 2007 but it
was never imported due to concerns with patents once held by
STAC/HiFn. The issues have clearly been resolved so bring it
in now.
Special thanks to Brett Glass for preserving the code and
pointing documentation for the expiration case.
sephe [Tue, 21 Jun 2016 05:33:26 +0000 (05:33 +0000)]
MFC 298449,298568
298449
hyperv/et: Make Hyper-V event timer a device.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5957
298568
hyperv/et: Strip extra white space in function name
sephe [Tue, 21 Jun 2016 04:51:55 +0000 (04:51 +0000)]
MFC 297931,298022
297931
Expose doreti as a global symbol on amd64 and i386.
doreti provides the common code path for returning from interrupt
andlers on x86. Exposing doreti as a global symbol allows kernel
modules to include low-level interrupt handlers instead of requiring
all low-level handlers to be statically compiled into the kernel.
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: kib
298022
hyperv: Deprecate HYPERV option by moving Hyper-V IDT vector into vmbus
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: jhb, kib, sephe
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5910
truckman [Mon, 20 Jun 2016 19:00:47 +0000 (19:00 +0000)]
MFC r300240
Change net.inet.tcp.ecn.enable sysctl mib from a binary off/on
control to a three way setting.
0 - Totally disable ECN. (no change)
1 - Enable ECN if incoming connections request it. Outgoing
connections will request ECN. (no change from present != 0 setting)
2 - Enable ECN if incoming connections request it. Outgoing
conections will not request ECN.
Change the default value of net.inet.tcp.ecn.enable from 0 to 2.
Linux version 2.4.20 and newer, Solaris, and Mac OS X 10.5 and newer have
similar capabilities. The actual values above match Linux, and the default
matches the current Linux default.
cy [Sun, 19 Jun 2016 00:39:23 +0000 (00:39 +0000)]
MFC r300259:
Enable the two ip_frag tuneables. The code is there but the two
ip_frag tuneables aren't registered in the ipf_tuners linked list.
This commmit enables the two existing ip_frag tuneables by registering
them.
ed [Sat, 18 Jun 2016 12:44:14 +0000 (12:44 +0000)]
MFC r301406:
Don't test for INKERNEL to check whether we're in kernel space.
It turns out that <machine/param.h> actually defines a macro under this
name, even when we're not in kernelspace. This causes us to suppress
some macro definitions that are used by userspace apps.
emaste [Sat, 18 Jun 2016 01:23:38 +0000 (01:23 +0000)]
MFC r300231: elf_common.h: add section header flag and dynamic types
SHF_COMPRESSED section contains compressed data
DT_TLSDESC_PLT Location of PLT entry for TLS descriptor resolver calls
DT_TLSDESC_GOT Location of GOT entry used by resolver PLT entry
mm [Fri, 17 Jun 2016 22:40:10 +0000 (22:40 +0000)]
MFC r299529,r299540,r299576,r299896:
r299529,r299540:
Update libarchive to 3.2.0
New features:
- new bsdcat command-line utility
- LZ4 compression (in src only via external utility from ports)
- Warc format support
- 'Raw' format writer
- Zip: Support archives >4GB, entries >4GB
- Zip: Support encrypting and decrypting entries
- Zip: Support experimental streaming extension
- Identify encrypted entries in several formats
- New --clear-nochange-flags option to bsdtar tries to remove noschg and
similar flags before deleting files
- New --ignore-zeros option to bsdtar to handle concatenated tar archives
- Use multi-threaded LZMA decompression if liblzma supports it
- Expose version info for libraries used by libarchive
arybchik [Fri, 17 Jun 2016 09:04:06 +0000 (09:04 +0000)]
MFC r301427
sfxge(4): allow firmware to auto-configure event queues on Medford
On Medford, licenses are required to enable RX and event cut through and to
disable RX batching. To avoid the need for the driver to make decisions based on
the licensing state, the MC_CMD_INIT_EVQ has been extended to allow us to leave
the decision to the firmware. If the adapter is licensed for low-latency use,
the firmware will choose the optimal settings for latency, otherwise it will use
the best settings for throughput.
For Huntington we still need to choose the settings ourselves.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D6717
arybchik [Fri, 17 Jun 2016 09:02:51 +0000 (09:02 +0000)]
MFC r301309
sfxge(4): always be ready to receive batched events
When the low-latency firmware variant is running, it is reported as not
being capable of batching RX events, but it can still do so if the
FORCE_EV_MERGING flag is set on an RXQ. Therefore we need to handle
batched RX events even if the capability isn't set.
If this bug is fixed in the firmware such that the capability is set
even when running the low-latency firmware variant, it will almost
always be reported so I don't think we lose much by removing the check.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D6705
arybchik [Fri, 17 Jun 2016 09:01:11 +0000 (09:01 +0000)]
MFC r301308
sfxge(4): add helper to compute timer quantum
This also adjusts the timer values used to match the Linux net
driver implementation:
a) non-zero time intervals should result in at least one quantum
b) timer load/reload values are only zero biased for Falcon/Siena
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D6704
arybchik [Fri, 17 Jun 2016 08:59:08 +0000 (08:59 +0000)]
MFC r301237
sfxge(4): support EVQ timer workaround via MCDI
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/6675
arybchik [Fri, 17 Jun 2016 08:56:01 +0000 (08:56 +0000)]
MFC r301122
sfxge(4): set moderation in efx_ev_qcreate
This simplifies setting an initial interrupt moderation value, and
avoids most calls to evx_ev_qmoderate from contexts where MCDI is
not allowed (MCDI is need for an EVQ timer workaround in a later patch).
Submitted by: Andy Moreton <amoreton at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D6673
sephe [Thu, 16 Jun 2016 03:04:16 +0000 (03:04 +0000)]
MFC 297142,297143,297176,297177,297178,297221
297142
hyperv: Factor out snprinf_hv_guid()
Submitted by: Ju Sun <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5651
Submitted by: Jun Su <junsu microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5669
297176
hyperv/evttimer: Use an independent message slot so that it can work
Using the same message slot as the other types of the messages has
the side effect that the event timer message could be deferred to
the swi threads to run (lacking of trapframe and the original code
didn't even handle that, so the event timer was actually broken).
As of this commit we use an independent message slot for event timer,
so that we could handle all of event timer messages in the interrupt
handler directly. Note, the message slot for event timer is still
bind to the same interrupt vector as the other types of messages.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5696
297177
hyperv/vmbus: Use taskqueue_fast for non-performance critical messages
This gets rid of the per-cpu SWIs.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
297178
hyperv/vmbus: Remove NULL check for taskqueue_create_fast(M_WAITOK)
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
297221
hyperv/vmbus: Create per-cpu fast taskqueue for msg handling
Using one taskqueue does not work, since the EOM MSR must be written
on the msg's owner CPU.
Noticed by: Jun Su <junsu microsoft com>
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
sephe [Thu, 16 Jun 2016 01:57:16 +0000 (01:57 +0000)]
MFC 297802,297803(297481),297804
297802
hyperv: Identify Hyper-V features and recommends properly
Features bits will be used to detect devices, e.g. timers, which
do not have corresponding event channels.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
Rearranged by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
279803(297481)
hyperv: Register Hyper-V timer early enough for TSC freq calibration
The i8254 simulation in Hyper-V is kinda broken and is not available
in Generation 2 Hyper-V VMs, so Hyper-V timer must be registered early
enough so that it can be used to do the TSC freq calibration.
This fixes the notorious warning like this:
calcru: runtime went backwards from 50 usec to 25 usec for pid 0 (kernel)
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: kib, sephe
Tested by: kib, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5778
=================
hyperv: Resurrect r297481
This time we make sure that the TIME_REF_COUNT MSR exists.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
vangyzen [Wed, 15 Jun 2016 14:11:49 +0000 (14:11 +0000)]
MFC r301532
newsyslog: Eliminate unnecessary sleep(10) when -R and -s are specified
After going through the signal work list, during which do_sigwork()
is called and essentially does nothing because -s and -R were
specified on the command line, newsyslog will sleep for 10 seconds
as the (verbose) code says: "Pause 10 seconds to allow daemon(s)
to close log file(s)".
However, the man page verbiage for -R (and -s) seems quite clear
that this sleep() is unnecessary because the daemon was expected
to have already closed the log file before calling newsyslog.
PR: 210020
Submitted by: David A. Bright <david_a_bright@dell.com>
Sponsored by: Dell Inc.
sephe [Wed, 15 Jun 2016 09:52:01 +0000 (09:52 +0000)]
MFC 297635
hyperv/vmbus: Use default mtx for channel message queue
First of all sema_post() can't be called w/ spinlock, and the channel
message queue processing is not on hot code path, i.e. spinlock is not
necessary.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5812
sephe [Wed, 15 Jun 2016 09:39:41 +0000 (09:39 +0000)]
MFC 297219
hyperv/vmbus: use a better retry method in hv_vmbus_post_message()
Most often, hv_vmbus_post_message() doesn't fail. However, it fails
intermittently when GPADLs of large shared memory is to be established
with the host, e.g. on the hn(4) attach path: a GPADL of 15MB sendbuf
is created, for which lots of messages will be flooded to the host.
The host side tries to throttle the message rate by returning
HV_STATUS_INSUFFICIENT_BUFFERS.
Before this commit, we do several retries for failed messages, but the
delay between each retry is pretty/too low, which will cause sporadic
message posting failure. We now use large delay (>=1ms) between each
retry to fix the message posting failure.
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5715
truckman [Wed, 15 Jun 2016 06:40:30 +0000 (06:40 +0000)]
MFC r301592
Don't leak addrinfo if ai->ai_addrlen <= minsiz test fails.
If the ai->ai_addrlen <= minsiz test fails, then freeaddrinfo()
does not get called to free the memory just allocated by getaddrinfo().
Fix by moving ai->ai_addrlen <= minsiz to a separate nested if
block, and keep freeaddrinfo() in the outer block so that freeaddrinfo()
will be called whenever getaddrinfo() succeeds.
truckman [Wed, 15 Jun 2016 06:33:40 +0000 (06:33 +0000)]
MFC r301582
Explicitly NUL terminate the buffer filled by fread().
The fix in r300649 was not sufficient to convince Coverity that the
buffer was NUL terminated, even with the buffer pre-zeroed. Swap
the size and nmemb arguments to fread() so that a valid lenght is
returned, which we can use to terminate the string in the buffer
at the correct location. This should also quiet the complaint about
the return value of fread() not being checked.
sephe [Wed, 15 Jun 2016 06:32:00 +0000 (06:32 +0000)]
MFC 296293,296296,296297,296305
296293
hyperv/hn: Pass channel to hv_nv_on_receive_completion()
While I'm here, staticize this function.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Modified by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
296296
hyperv/hn: Make read buffer per-channel
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reorganized by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
296297
hyperv/hn: Fix typo in comment
MFC after: 1 week
Sponsored by: Microsoft OSTC
296305
hyperv/hn: Make # of rings configurable
And since the host may not being able to allocate the # of rings
requested by us, save the # of rings allocated by the host in the
ring_inuse counters; use ring_inuse counters for run time operation.
truckman [Wed, 15 Jun 2016 06:27:43 +0000 (06:27 +0000)]
MFC r299484, r301574
r299484 | cem | 2016-05-11 15:04:28 -0700 (Wed, 11 May 2016) | 13 lines
random(6): Fix double-close
In the case where a file lacks a trailing newline, there is some "evil" code to
reverse goto the tokenizing code ("make_token") for the final token in the
file. In this case, 'fd' is closed more than once. Use a negative sentinel
value to guard close(2), preventing the double close.
Ideally, this code would be restructured to avoid this ugly construction.
Fix a (false positive?) Argument cannot be negative coverity defect.
Rather than guarding close(fd) with an fd >= 0 test and setting fd
to -1 when it is closed to avoid a potential double-close, just
move the close() call after the conditional "goto make_token". This
moves the close() call totally outside the loop to avoid the
possibility of calling it twice. This should also prevent a Coverity
warning about checking fd for validity after it was previously passed
to read().
sephe [Wed, 15 Jun 2016 06:08:05 +0000 (06:08 +0000)]
MFC 296291,301109
296291
hyperv/chan: Factor out the vcpu setting
And use it for cpu0 assignment; it does not sound right to assume that
cpu0 maps to vcpu0. And this factored out function will be exposed to
drivers, if driver specific CPU binding is needed, e.g. hn(4).
Move default cpu select after saving channel offer message. This makes
sure that all useful information of the channel has been setup.
sephe [Wed, 15 Jun 2016 05:31:35 +0000 (05:31 +0000)]
MFC 296180,297634
296180
hyperv: Use proper fence function to keep store-load order for msgs
sfence only makes sure about the store-store order, which is not
sufficient here. Use atomic_thread_fence_seq_cst() as suggested
jhb and kib (a locked op in the nutshell, which should have the
Reviewed by: jhb, kib, Jun Su <junsu microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5436
297634
hyperv: Use mb() instead of atomic_thread_fence_seq_cst()
Since atomic_thread_fence_seq_cst() will become compiler fence on UP kernel.
Reviewed by: kib, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5852
sephe [Wed, 15 Jun 2016 05:16:37 +0000 (05:16 +0000)]
MFC 296178
buf_ring/drbr: Add buf_ring_peek_clear_sc and use it in drbr_peek
Unlike buf_ring_peek, it only supports single consumer mode, and it
clears the cons_head if DEBUG_BUFRING/INVARIANTS is defined.
The normal use case of drbr_peek for network drivers is:
m = drbr_peek(br);
err = hw_spec_encap(&m); /* could m_defrag/m_collapse */
(*)
if (err) {
if (m == NULL)
drbr_advance(br);
else
drbr_putback(br, m);
/* break the loop */
}
drbr_advance(br);
The race is:
If hw_spec_encap() m_defrag or m_collapse the mbuf, i.e. the old mbuf
was freed, or like the Hyper-V's network driver, that transmission-
done does not even require the TX lock; then on the other CPU at the
(*) time, the freed mbuf could be recycled and being drbr_enqueue even
before the current CPU had the chance to call drbr_{advance,putback}.
This triggers a panic in drbr_enqueue duplicated element check, if
DEBUG_BUFRING/INVARIANTS is defined.
Use buf_ring_peek_clear_sc() in drbr_peek() to fix the above race.
This change is a NO-OP, if neither DEBUG_BUFRING nor INVARIANTS are
defined.
jamie [Wed, 15 Jun 2016 01:56:20 +0000 (01:56 +0000)]
MFC r301745:
Make sure the OSD methods for jail set and remove can't run concurrently,
by holding allprison_lock exclusively (even if only for a moment before
downgrading) on all paths that call PR_METHOD_REMOVE. Since they may run
on a downgraded lock, it's still possible for them to run concurrently
with PR_METHOD_GET, which will need to use the prison lock.
mav [Wed, 15 Jun 2016 01:42:53 +0000 (01:42 +0000)]
MFC r301293:
When negotiating NTB_SB01BASE_LOCKUP workaround, don't try to limit the
BAR size to 1MB. According to Xeon v3 specifications and my tests, that
size register is write-once and so not writeable after BIOS written it.
Instead of that, make the code work with BAR of any sufficient size,
properly calculating offset within its base. It also simplifies the code.
295918
hyperv/hn: Use IFQ_DRV_PREPEND instead of IF_PREPEND
IF_PREPEND promises out-of-order packet sending when the TX desc list
is depleted. It was overlooked and copied blindly when the transmission
path was partially rewritten.
sephe [Mon, 13 Jun 2016 08:03:53 +0000 (08:03 +0000)]
MFC 295748,295792,295793,295794
295748
hyperv/hn: Use buf_ring for txdesc list
So one spinlock is avoided, which would be potentially dangerous for
virtual machine, if the spinlock holder was scheduled out by the host,
as noted by royger.
Old spinlock based txdesc list is still kept around, so we could have
a safe fallback.
No performance regression nor improvement is observed.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5290
295792
hyperv/hn: Add option to bind TX taskqueues to the specified CPU
It will be used to help tracking host side transmission ring selection
issue; and it will be turned on by default, once we have concrete result.
Reviewed by: adrian, Jun Su <junsu microsoft com>
Approved by: adrian (mento)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5316
295793
hyperv/hn: Enable IP header checksum offloading for WIN8 (WinServ2012)
Tested on Windows Server 2012.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5317
295794
hyperv/hn: Free the txdesc buf_ring when the TX ring is destroyed
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5318
sephe [Mon, 13 Jun 2016 07:30:54 +0000 (07:30 +0000)]
MFC 295743,295744,295745,295746,295747
295743
hyperv/hn: Change global tunable prefix to hw.hn
And use SYSCTL+CTLFLAG_RDTUN for them.
Suggested by: adrian
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5274
295744
hyperv/hn: Split RX ring data structure out of softc
This paves the way for upcoming vRSS stuffs and eases more code cleanup.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5275
295745
hyperv/hn: Use taskqueue_enqueue()
This also eases experiment on the non-fast taskqueue.
Reviewed by: adrian, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5276
295746
hyperv/hn: Use non-fast taskqueue for transmission
Performance stays same; so no need to use fast taskqueue here.
295747
hyperv/hn: Split TX ring data structure out of softc
This paves the way for upcoming vRSS stuffs and eases more code cleanup.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5283
sephe [Mon, 13 Jun 2016 07:03:00 +0000 (07:03 +0000)]
MFC 295740,295741,295742
295740
hyperv/hn: Set the TCP ACK/data segment aggregation limit
Set TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most. Aggregating
anything more than 2 hurts TCP sending performance in hyperv. This
significantly improves the TCP sending performance when the number of
concurrent connetion is low (2~8). And it greatly stabilizes the TCP
sending performance in other cases.
Set TCP data segments aggregation length limit to 37500. Without this
limitation, hn(4) could aggregate ~45 TCP data segments for each
connection (even at 64 or more connections) before dispatching them to
socket code; large aggregation slows down ACK sending and eventually
hurts/destabilizes TCP reception performance. This setting stabilizes
and improves TCP reception performance for >4 concurrent connections
significantly.
sephe [Mon, 13 Jun 2016 06:38:46 +0000 (06:38 +0000)]
MFC 295307,295308,295309,295606
295307
hyperv: Use standard taskqueue instead of hv_work_queue
HyperV code was ported from Linux. There is an implementation of
work queue called hv_work_queue. In FreeBSD, taskqueue could be
used for the same purpose. Convert all the consumer of hv_work_queue
to use taskqueue, and remove work queue implementation.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4963
295308
hyperv: Use WAITOK in the places where we can wait
And convert rndis non-hot path spinlock to mutex.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5081
295309
hyperv: Use malloc for page allocation.
We will eventually convert them to use busdma.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe, Dexuan Cui <decui microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5087
295606
hyperv/hn: Fix typo in comment
Noticed by: avos
Reviewed by: adrian, avos, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5199
sephe [Mon, 13 Jun 2016 05:43:42 +0000 (05:43 +0000)]
MFC 295296,295297,295298,295299,295300,295301
295296
hyperv/hn: Avoid duplicate csum features settings
- Record csum features in softc, so we don't need to duplicate the
logic from attach path to ioctl path.
- Protect if_capenable and if_hwassist changes by main lock.
- Prefer turn on/off bits in if_hwassist explicitly instead of using
XOR.
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5085
295297
hyperv/hn: Reorganize TX csum offloading
- For non-TSO offloading, we don't need to access mbuf to know
which csum offloading is requested, we can just use the
CSUM_{IP,TCP,UDP} in the csum_flags.
- For TSO offloading, we still can depend on CSUM_{TSO4,TSO6}
in the csum_flags to tell whether the TSO packet is an IPv4
TSO packet or an IPv6 TSO packet.
This streamlines csum offloading handling (remove the two goto)
and allows us the nuke the unnecessary get_transport_proto_type().
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5098
295298
hyperv/hn: Enable IP header checksum offloading
So that:
- TCP/IP stack will not do unnecessary IP header checksum for TSO
packets.
- Reduce guest load for non-TSO IP packets.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5099
295299
hyperv/hn: Enable UDP RXCSUM
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5102
295300
hyperv/hn: Add sysctls to trust host side UDP and IP csum verification
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5103
295301
hyperv/hn: Obey IFCAP_RXCSUM configure
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5104
sephe [Mon, 13 Jun 2016 05:06:07 +0000 (05:06 +0000)]
MFC 294886
hyperv/vmbus: Event handling code refactor.
- Use taskqueue instead of swi for event handling.
- Scan the interrupt flags in filter
- Disable ringbuffer interrupt mask in filter to ensure no unnecessary
interrupts.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe, Dexuan <decui microsoft com>
Approved by: adrian (mentor)
MFC after: 2 weeks
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4920
sephe [Mon, 13 Jun 2016 03:28:37 +0000 (03:28 +0000)]
MFC 294701,294702,294703,294705,294788
294701
hyperv/hn: Use m_copydata for chimney sending.
While I'm here, move stack variables near their usage.
Reviewed by: adrian, delphij, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4977
294702
hyperv/hn: Remove unnecessary zeroing out the netvsc_packet
All used fields are setup one by one, so there is no need to zero
out this large struct.
While I'm here, move the stack variable near its usage.
Reviewed by: adrian, delphij, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4978
294703
hyperv/hn: Trust host TCP segment checksum verification by default.
According to all available information, VMSWITCH always does the
TCP segment checksum verification before sending the segment to
guest.
Reviewed by: adrian, delphij, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4991
294705
hyperv/vmbus: Avoid extra copy of page information.
The page information array could contain up to 32 elements (i.e. 512B).
And on network side w/ TSO, 11+ (176B+) elements, i.e. ~44K TSO packet,
in the page information array is quite common.
This saves us some cpu cycles.
Reviewed by: adrian, delphij
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4992
294788
hyperv/hn: Improve sending performance
- Avoid main lock contention by trylock for if_start, if that fails,
schedule TX taskqueue for if_start
- Don't do direct sending if the packet to be sent is large, e.g.
TSO packet.
This change gives me stable 9.1Gbps TCP sending performance w/ TSO
over a 10Gbe directly connected network (the performance fluctuated
between 4Gbps and 9Gbps before this commit). It also improves non-
TSO TCP sending performance a lot.
Reviewed by: adrian, royger
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5074
sephe [Mon, 13 Jun 2016 02:54:58 +0000 (02:54 +0000)]
MFC 293653
hyperv/kvp_daemon: Make poll(2) block indefinitely
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, me, adrain
Approved by: adrian
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4762
dim [Sun, 12 Jun 2016 11:45:45 +0000 (11:45 +0000)]
MFC r300967:
Stop exposing the C11 _Atomic() macro in <sys/cdefs.h>, when compiling
for C++. It clashes with the one in libc++'s <atomic> header.
(Previously, the _Atomic() macro was defined in <stdatomic.h>, which is
only for use with C11, but for various reasons it was moved to its
current location in r251804.)