It appears that some UVISOR devices do not handle when the clear stall command
is issued at the beginning of the initial IN/OUT data transfers. Reason
unknown, probably firmware fault. Now the stall is only cleared on data
transfer errors.
- make the usb_temp_setup() and usb_temp_unsetup() functions public so that
other modules can generate USB descriptors.
- extend the vendor specific request function by one length pointer argument,
because not all descriptors store the length in the first byte. For example
HID descriptors.
MFC: r205562
When the regular NFS server replied to a UDP client out of the replay
cache, it did not free the request argument mbuf list, resulting in a leak.
This patch fixes that leak.
When receiving a management frame, pass the mbuf to bpf before calling
iv_recv_mgmt(). iv_recv_mgmt() will generate management frame
responses
and pass them to bpf before the management frame that triggered the
response.
PR: 144323
Submitted by: Alexander Egorenkov <egorenar at gmail.com>
Sponsored by: iXsystems, inc.
r204271:
Accessing an mbuf after it has been handed off to the hardware is a bad
race as it could already have been tx'd and freed by that time. Place
the bpf tap just _before_ writing the gen bit.
This fixes a panic when running tcpdump on a cxgb interface.
r204274:
There is no need to test __FreeBSD_version for features that have
been around for a long time now (7.1-ish or even earlier); assume
they are present. These includes MSI, TSO, LRO, VLAN, INTR_FILTERS,
FIRMWARE, etc.
Also, eliminate some dead code and clean up in other places as part
of this quick once-over.
r204348:
Support IFCAP_VLANHWTSO in cxgb(4). It works with or without vlanhwtag.
While here, remove old DPRINTFs and tidy up the capability code a bit.
r204921:
Better TwinAx transceiver detection.
Originally submitted by: <Bruno dot Bittner at isilon dot com>
(This is a rewritten, corrected version of that patch)
r205944:
Refresh the firmware version immediately after it is upgraded (or downgraded).
r205945:
Improved PHY EDC settings.
r205946:
Do not attempt to retrieve interrupt information before it is available.
r205947:
Fix build with "nooptions INET"
r205948:
Fix tx drop statistics.
r205949:
Fix signed/unsigned mix-up that allowed txq->in_use to grow beyond txq->size.
r205950:
Multiple fixes related to queue set sizing and resources:
- Only the tunnelq (TXQ_ETH) requires a buf_ring, an ifq, and the watchdog/timer
callouts. Do not allocate these for the other tx queues.
- Use 16k jumbo clusters only on offload capable cards by default.
- Do not allocate a full tx ring for the offload queue if the card is not
offload capable.
- Slightly better freelist size calculation.
- Fix nmbjumbo4 typo, remove unneeded global variables.
r206109:
Increase response queue size to avoid starvation, add a counter
to track it when it does occur.
Clean up some printing stuff so that we can have a bit finer control
on debug output. Add a new platform function requirement to allow
for printing based upon the ITL nexus instead of the isp unit plus
channel, target and lun. This allows some printouts and error messages
from the core code to appear in the same format as the platform's
subsystem (in FreeBSD's case, CAM path).
MFC of 2 items to fix the csum for v6 issue:
Revision 205075 and 205104:
---------205075----------
With the recent change of the sctp checksum to support offload,
no delayed checksum was added to the ip6 output code. This
causes cards that do not support SCTP checksum offload to
have SCTP packets that are IPv6 NOT have the sctp checksum
performed. Thus you could not communicate with a peer. This
adds the missing bits to make the checksum happen for these cards.
-------------------------
---------205104----------
The proper fix for the delayed SCTP checksum is to
have the delayed function take an argument as to the offset
to the SCTP header. This allows it to work for V4 and V6.
This of course means changing all callers of the function
to either pass the header len, if they have it, or create
it (ip_hl << 2 or sizeof(ip6_hdr)).
-------------------------
PR: 144529
MFC of 204670:
-------------------------
sched_getparam was just plain broke for time-share
processes. It did not return an error but instead
just let garbage be passed back. This I fix so
it actually properly translates the priority the
process is at to a posix's high means more priority.
I also fix it so that if the ULE scheduler has bumped
it up to a realtime process you get back a sane value
i.e. the highest priority (63 for time-share).
sched_setscheduler() had the setting of the
timeshare class priority disabled. With some notes
about rejecting the posix high numbers is greater
priority and use nice instead. This fix also
adjusts that to work, with the cavet that a t-s
process may well get bumped up or down i.e. the
setscheduler() will NOT change the nice value only
the current priority. I think this is reasonable
considering if the user wants to play with nice then
he can. At least all the posix'ish interfaces now
respond sanely.
-----------------------
marius [Sun, 4 Apr 2010 14:57:46 +0000 (14:57 +0000)]
MFC: r205269
o Add support for UltraSparc-IV+:
- Swap the configuration of the first and second large dTLB as with
US-IV+ these can only hold entries of certain page sizes each, which
we happened to chose the non-working way around.
- Additionally ensure that the large iTLB is set up to hold 8k pages
(currently this happens to be a NOP though).
- Add a workaround for US-IV+ erratum #2.
- Turn off dTLB parity error reporting as otherwise we get seemingly
false positives when copying in the user window by simulating a
fill trap on return to usermode. Given that these parity errors can
be avoided by disabling multi issue mode and the problem could be
reproduced with a second machine this appears to be a silicon bug of
some sort.
- Add a membar #Sync also before the stores to ASI_DCACHE_TAG. While
at it, turn of interrupts across the whole cheetah_cache_flush() for
simplicity instead of around every flush. This should have next to no
impact as for cheetah-class machines we typically only need to flush
the caches a few times during boot when recovering from peeking/poking
non-existent PCI devices, if at all.
- Just use KERNBASE for FLUSH as we also do elsewhere as the US-IV+
documentation doesn't seem to mention that these CPUs also ignore the
address like previous cheetah-class CPUs do. Again the code changing
LSU_IC is executed seldom enough that the negligible optimization of
using %g0 instead should have no real impact.
With these changes FreeBSD runs stable on V890 equipped with US-IV+
and -j128 buildworlds in a loop for days are no problem. Unfortunately,
the performance isn't were it should be as a buildworld on a 4x1.5GHz
US-IV+ V890 takes nearly 3h while on a V440 with (theoretically) less
powerfull 4x1.5GHz US-IIIi it takes just over 1h. It's unclear whether
this is related to the supposed silicon bug mentioned above or due to
another issue. The documentation (which contains a sever bug in the
description of the bits added to the context registers though) at least
doesn't mention any requirements for changes in the CPU handling besides
those implemented and the cache as well as the TLB configurations and
handling look fine.
o Re-arrange cheetah_init() so it's easier to add support for SPARC64
V up to VIIIfx CPUs, which only require parts of this initialization.
MFC r205652
A ptrace(2) by one process may trigger a page size promotion in the
address space of another process. Modify pmap_promote_pde() to handle
this.
MFC r205998:
If there is multiple PMCs for the same interrupt ignore new post.
This will indirectly fix a bug where the thread will be pinned
forever if the assert is not compiled.
When tearing down IPsec as part of a (virtual) network stack,
do not try to free the same list twice but free both the
acquiring list and the security policy acquiring list.
Verify interface up status using its link state only
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another
commit.
Also updated the ifconfig utility to show the LINKSTATE
capability if present.
The if_tap interface is of IFT_ETHERNET type, but it
does not set or update the if_link_state variable.
As such RT_LINK_IS_UP() fails for the if_tap interface.
Also, the RT_LINK_IS_UP() needs to bypass all loopback
interfaces because loopback interfaces are considered
up logically as long as the system is running.
This patch fixes the above issues by setting and updating
the if_link_state variable when the tap interface is
opened or closed respectively. Similary approach is
already done in the if_tun device.
One of the advantages of enabling ECMP (a.k.a RADIX_MPATH) is to
allow for connection load balancing across interfaces. Currently
the address alias handling method is colliding with the ECMP code.
For example, when two interfaces are configured on the same prefix,
only one prefix route is installed. So connection load balancing
among the available interfaces is not possible.
The other advantage of ECMP is for failover. The issue with the
current code, is that the interface link-state is not reflected
in the route entry. For example, if there are two interfaces on
the same prefix, the cable on one interface is unplugged, new and
existing connections should switch over to the other interface.
This is not done today and packets go into a black hole.
Also, there is a small bug in the kernel where deleting ECMP routes
in the userland will always return an error even though the command
is successfully executed.
introduce a local variable rte acting as a cache of ro->ro_rt
within ip_output, achieving (in random order of importance):
- a reduction of the number of 'r's in the source code;
- improved legibility;
- a reduction of 64 bytes in the .text
The flow-table module retrieves the destination and source
address as well as the transport protocol port information
from the outbound packets. The routing code is generic and
compares every byte in the given sockaddr object. Therefore
the temporary sockaddr objects must be cleared due to padding
bytes. In addition, the port information must be stripped
or the route search will either fail or return the incorrect
route entry.
Unit testing is done using OpenVPN over the if_tun interface.
marius [Thu, 1 Apr 2010 15:17:50 +0000 (15:17 +0000)]
MFC: r205409
- The firmware of Sun Fire V1280 has a misfeature of setting %wstate to
7 which corresponds to WSTATE_KMIX in OpenSolaris whenever calling into
it which totally screws us even when restoring %wstate afterwards as
spill/fill traps can happen while in OFW. The rather hackish OpenBSD
approach of just setting the equivalent of WSTATE_KERNEL to 7 also is
no option as we treat %wstate as a bit field. So in order to deal with
this problem actually implement spill/fill handlers for %wstate 7 which
just act as the WSTATE_KERNEL ones except of theoretically also handling
32-bit, turn off interrupts completely so we don't even take IPIs while
in OFW which should ensure we only take spill/fill traps at most and
restore %wstate after calling into OFW once we have taken over the trap
table. While at it, actually set WSTATE_{,PROM}_KMIX before calling into
OFW just like OpenSolaris does, which should at least help testing this
change on non-V1280.
- Remove comments referring to the %wstate usage in BSD/OS.
- Remove the no longer used RSF_ALIGN_RETRY macro.
- Correct some trap table addresses in comments.
- Ensure %wstate is set to WSTATE_KERNEL when taking over the trap table.
- Ensure PSTATE_AM is off when entering or exiting to OFW as well as that
interrupts are also completely off when exiting to OFW as the firmware
trap table shouldn't be used to handle our interrupts.
Update the page table locking for the 64-bit PMAP. One of these revisions
largely reverted the other, so there is a small amount of churn and the
addition of some mtx_assert()s.
Fix two small bugs. The PowerPC 970 does not support non-coherent memory
access, and reflects this by autonomously writing LPTE_M into PTE entries.
As such, we should not panic if LPTE_M changes by itself. While here,
fix a harmless typo in moea64_sync_icache().
MFC: r197542:
- When we run our trap cleanup handler, echo that we are running this
handler to make it more clear why we are 'suddenly' running df,
umount, and mdconfig.
- Remove trap handler again after we have unconfigured the memory
device etc. Before we could end up running the trap handler if a
later stage failed, which was a bit confusing and not really useful.
MFC after: 2 weeks
Log:
- restructure flowtable to support ipv6
- add a name argument to flowtable_alloc for printing with ddb commands
- extend ddb commands to print destination address or 4-tuples
- don't parse ports in ulp header if FL_HASH_ALL is not passed
- add kern_flowtable_insert to enable more generic use of flowtable
(e.g. system calls for adding entries)
- don't hash loopback addresses
- cleanup whitespace
- keep statistics per-cpu for per-cpu flowtables to avoid cache line contention
- add sysctls to accumulate stats and report aggregate
r205069:
Log:
fix stats reporting sysctl
r205093:
Log:
re-update copyright to 2010
pointed out by danfe@
r205097:
Log:
flowtable_get_hashkey is only used by a DDB function - move under #ifdef DDB
pointed out by jkim@
r205488:
Log:
- boot-time size the ipv4 flowtable and the maximum number of flows
- increase flow cleaning frequency and decrease flow caching time
when near the flow limit
- stop allocating new flows when within 3% of maxflows don't start
allocating again until below 12.5%
marius [Wed, 31 Mar 2010 22:05:49 +0000 (22:05 +0000)]
MFC: r205399
Improve the KVA space sizing of r186682; on machines with large dTLBs we
can actually use all of the available lockable entries of the tiny dTLB
for the kernel TSB. With this change the KVA space sizing happens to be
more in line with the MI one so up to at least 24GB machines KVA doesn't
need to be limited manually. This is just another stopgap though, the
real solution is to take advantage of ASI_ATOMIC_QUAD_LDD_PHYS on CPUs
providing it so we don't need to lock the kernel TSB pages into the dTLB
in the first place.
marius [Wed, 31 Mar 2010 21:57:48 +0000 (21:57 +0000)]
MFC: r205258
- Add TTE and context register bits for the additional page sizes supported
by UltraSparc-IV and -IV+ as well as SPARC64 V, VI, VII and VIIIfx CPUs.
- Replace TLB_PCXR_PGSZ_MASK and TLB_SCXR_PGSZ_MASK with TLB_CXR_PGSZ_MASK
which just is the complement of TLB_CXR_CTX_MASK instead of trying to
assemble it from the page size bits which vary across CPUs.
- Add macros for the remainder of the SFSR bits, which are useful for at
least debugging purposes.
marius [Wed, 31 Mar 2010 21:32:52 +0000 (21:32 +0000)]
MFC: r204152, r204164
Some machines can not only consist of CPUs running at different speeds
but also of different types, f.e. Sun Fire V890 can be equipped with a
mix of UltraSPARC IV and IV+ CPUs, requiring different MMU initialization
and different workarounds for model specific errata. Therefore move the
CPU implementation number from a global variable to the per-CPU data.
Functions which are called before the latter is available are passed the
implementation number as a parameter now.
jkim [Wed, 31 Mar 2010 16:01:48 +0000 (16:01 +0000)]
MFC: r205855
Print memory model of the video mode except for planar memory model.
'P', 'D', 'C', 'H', and 'V' mean packed pixel, direct color, CGA, Hercules,
and VGA X memory models respectively where they have fixed number of planes.
Sync. pixel mode support for VESA and VGA frame buffers with HEAD.
- Map entire video memory again. Although we do not use them all directly,
it seems VGA renderer may access unmapped memory region and cause kernel
panic.
- Fall back to VGA palette functions if VESA function failed and DAC is
still in 6-bit mode. Although we have to check non-VGA compatibility bit
here, it seems there are too many broken VESA BIOSes out to rely on it.
- Be careful when we determine bytes per scan line information. We compare
mode table data against minimum value. If the mode table does not make
sense, we set the minimum in the mode info.
- Teach VGA framebuffer about 8-bit palette format for VESA.
- Add my copyright here.
jkim [Wed, 31 Mar 2010 15:39:46 +0000 (15:39 +0000)]
MFC: r205550, r205605, r205865
Sync. pixel mode support for syscons(4) with HEAD.
- Separate 24-bit pixel draw from 32-bit case. Although it is slower, we do
not want to write a useless zero to inaccessible memory region.
- We only want the dummy palette for direct color mode.
- Detect illegal access to unmapped memory within real mode emulator.
- Map EBDA if available and support memory wraparound above 1MB as VM86 does.
- Set initial %ds to 0x40 as X.org int10 handler does.
- Print the initial memory map when bootverbose is set.
- Optimize real mode page table lookup.
- Add strictly aligned memory access for distant future.
- Update copyright date.