jh [Sun, 28 Mar 2010 11:22:38 +0000 (11:22 +0000)]
MFC r198175:
- If lstat()/stat() fails with an error other than ENOENT, don't ignore
the error and assume that the file doesn't exist. Touch could return
success with -c option even if the file existed and time was not set.
- If the first utimes_f() call fails with -A option, give up and don't
continue trying to set times to current time. [1]
- Set exit status to 1 when setting of timestamps fails for a directory
or symbolic link even though lstat()/stat() would succeed.
- Don't print bogus error message when rw() succeeds.
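A minimal sketch of the first point, assuming a hypothetical helper probe_path() and result codes that are not taken from touch(1): only ENOENT is treated as "file does not exist", and every other stat() failure is reported as an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch; not taken from touch(1). */
enum probe_result { PROBE_EXISTS, PROBE_MISSING, PROBE_ERROR };

/*
 * Classify a path. Only ENOENT means "does not exist"; any other
 * stat() failure (ENOTDIR, EACCES, ...) is surfaced as an error
 * instead of being silently treated as a missing file.
 */
static enum probe_result
probe_path(const char *path)
{
	struct stat sb;

	assert(path != NULL);
	if (stat(path, &sb) == 0)
		return (PROBE_EXISTS);
	if (errno == ENOENT)
		return (PROBE_MISSING);
	return (PROBE_ERROR);
}
```

A caller implementing something like -c would skip creation only on PROBE_MISSING and set a non-zero exit status on PROBE_ERROR.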
trasz [Sat, 27 Mar 2010 18:45:53 +0000 (18:45 +0000)]
MFC r203122:
Improve descriptions, remove turnstiles (since, from what I understand,
they are only used to implement other synchronization primitives), tweak
formatting.
MFC r203127:
Add description of bounded sleep vs unbounded sleep (aka blocking). Move
rules into their own section.
MFC r203131:
Cosmetic fixes.
MFC r203759:
Improve description for Giant and mention blocking inside interrupt threads.
MFC r203762:
Start sentences with a new line.
Submitted by: brueffer
MFC r203825:
Remove list of locking primitives, which is kind of redundant, move
information about witness(9) to the section about interactions, and
expand 'contexts' table.
MFC r203929:
Some rewording and language fixes.
PR: docs/136918, docs/134074
Submitted by: Ben Kaduk <kaduk at mit dot edu>, Haven Hash <havenster at gmail dot com>
trasz [Sat, 27 Mar 2010 18:09:40 +0000 (18:09 +0000)]
MFC r200273:
Don't add VAPPEND if the file is not being opened for writing. Note that this
only affects cases where open(2) is being used improperly - i.e. when the user
specifies O_APPEND without O_WRONLY or O_RDWR.
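The rule can be sketched in userland C as follows. The REQ_* bits and open_flags_to_request() are hypothetical names for illustration and are not the kernel's VREAD/VWRITE/VAPPEND values or code.

```c
#include <fcntl.h>

/* Hypothetical access-request bits for this sketch (not the kernel's). */
#define REQ_READ	0x1
#define REQ_WRITE	0x2
#define REQ_APPEND	0x4

/* Translate open(2) flags into the access being requested. */
static int
open_flags_to_request(int oflags)
{
	int acc = oflags & O_ACCMODE;
	int req = 0;

	if (acc == O_RDONLY || acc == O_RDWR)
		req |= REQ_READ;
	if (acc == O_WRONLY || acc == O_RDWR)
		req |= REQ_WRITE;
	/* Only ask for append access when the open is for writing. */
	if ((oflags & O_APPEND) != 0 && (req & REQ_WRITE) != 0)
		req |= REQ_APPEND;
	return (req);
}
```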
trasz [Sat, 27 Mar 2010 18:08:14 +0000 (18:08 +0000)]
MFC r200058:
Add change that was somehow missed in r192586. It could manifest by
incorrectly returning EINVAL from acl_valid(3) for applications linked
against pre-8.0 libc.
trasz [Sat, 27 Mar 2010 18:04:33 +0000 (18:04 +0000)]
MFC r199875:
Provide a set of sysctls and tunables to disable device node creation
for specific "kinds" of disk labels - for example, GPT UUIDs. Reason
for this is that sometimes, other GEOM classes attach to these device
nodes instead of the proper ones - e.g. they attach to /dev/gptid/XXX
instead of /dev/ada0p2, which is annoying.
bz [Sat, 27 Mar 2010 17:57:17 +0000 (17:57 +0000)]
MFC r204840:
As statfs.f_flags are uint64_t the local variables should be as well.
We'll start noticing this with the next flag introduced as the lower
32 bits are all used.
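A small sketch of the truncation this guards against. SKETCH_FLAG_HIGH is a hypothetical flag above bit 31 and is not a real mount flag.

```c
#include <stdint.h>

/* Hypothetical flag above bit 31, for illustration only. */
#define SKETCH_FLAG_HIGH	(UINT64_C(1) << 32)

/* Buggy pattern: a 32-bit local silently drops the upper half. */
static int
has_high_flag_narrow(uint64_t f_flags)
{
	uint32_t flags = (uint32_t)f_flags;

	return ((flags & SKETCH_FLAG_HIGH) != 0);
}

/* Fixed pattern: the local has the same width as the field. */
static int
has_high_flag_wide(uint64_t f_flags)
{
	uint64_t flags = f_flags;

	return ((flags & SKETCH_FLAG_HIGH) != 0);
}
```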
bz [Sat, 27 Mar 2010 17:54:44 +0000 (17:54 +0000)]
MFC r205626:
Print the pointer to the lock with the panic message. The previous
panic: rw lock not unlocked
was not really helpful for debugging. Now one can at least call
show lock <ptr>
from ddb to learn more about the lock.
bz [Sat, 27 Mar 2010 17:52:56 +0000 (17:52 +0000)]
MFC r205276:
Add ddb support to the "new" link layer code ("new-arp"):
- show all lltables [1] (optional flag to also show the llentries)
- show lltable <struct lltable *>
- show llentry <struct llentry *>
bz [Sat, 27 Mar 2010 17:50:02 +0000 (17:50 +0000)]
MFC r204838:
Destroy TCP UMA zones (empty or not) upon network stack teardown
to not leak them, otherwise making UMA/vmstat unhappy with every
stopped vnet.
We will still leak pages (especially for zones marked NOFREE).
Reshuffle cleanup order in tcp_destroy() to get rid of what we can
easily free first.
bz [Sat, 27 Mar 2010 17:48:13 +0000 (17:48 +0000)]
MFC r204805:
Rework reference counting in case we queue into the netisr,
or overflow the netisr queue and fall back to the interface
queue so that we can guarantee that the ifnet pointer stays
valid. Formerly we ended up with reference counts <= 0 in
case the netisr had returned ENOBUFS. The idea is to track
any packet in the netisr queue and only change the refcount
on edge operations for the fallback interface queue. This
also avoids problems in case the if_snd.ifq_len lies to us.
Also rework refcount assertions to make sure they trigger if
we go below 1. Formerly a negative reference count did not
trigger the assert as the refcount variable is u_int.
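The assertion issue can be illustrated with a short sketch. struct sketch_ifnet and the helper names are hypothetical and are not the kernel's ifnet code. Because the counter is unsigned, decrementing past zero wraps to a large positive value, so the check has to happen before the decrement and compare against 1.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for an object with an unsigned refcount. */
struct sketch_ifnet {
	unsigned int refcount;
};

/* What an unsigned counter holds after one decrement. */
static unsigned int
after_decrement(unsigned int refcount)
{
	return (refcount - 1);	/* 0 wraps to UINT_MAX, never negative */
}

static void
sketch_ref(struct sketch_ifnet *ifp)
{
	ifp->refcount++;
}

/* Drop a reference; returns 1 when the last reference is gone. */
static int
sketch_rele(struct sketch_ifnet *ifp)
{
	/* Assert before decrementing so an underflow cannot slip through. */
	assert(ifp->refcount >= 1);
	return (--ifp->refcount == 0);
}
```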
bz [Sat, 27 Mar 2010 17:46:06 +0000 (17:46 +0000)]
MFC r204807:
Destroy UDP UMA zones (empty or not) upon network stack teardown
to not leak them making UMA/vmstat -z unhappy with every stopped vnet.
We will still leak pages (especially as zones are marked NOFREE).
bz [Sat, 27 Mar 2010 17:40:28 +0000 (17:40 +0000)]
MFC r204279:
Use the DB_SHOW_ALL_COMMAND() macro to register the formerly 'show ifnets'
in the db_show_all_table as 'show all ifnets' and with that follow the
convention for showing complete lists.
bz [Sat, 27 Mar 2010 17:39:02 +0000 (17:39 +0000)]
MFC r204145:
Start to implement ifnet DDB support:
- 'show ifnets' prints a list of ifnet *s per virtual network stack,
- 'show ifnet <struct ifnet *>' prints fields matching the given ifp.
We do not yet print the complete set of fields and might want to
factor this out to an extra if_debug.c file in case this grows
a lot[1]. We may also want to grow 'show ifnet <if_xname>' support[1].
bz [Sat, 27 Mar 2010 17:34:57 +0000 (17:34 +0000)]
MFC r204140:
Split up ip_drain() into an outer lock and iterator part and
a "locked" version that will only handle a single network stack
instance. The latter is called directly from ip_destroy().
Hook up an ip_destroy() function to release resources from the
legacy IP network layer upon virtual network stack teardown.
bz [Sat, 27 Mar 2010 17:31:54 +0000 (17:31 +0000)]
MFC r203729:
Add DDB support for printing vnet_sysinit and vnet_sysuninit
ordered call lists. Try to lookup function/symbol names and print
those in addition to the pointers, along with the constants for
subsystem and order.
This is useful for debugging vnet teardown ordering issues.
Make it possible to call the actual printing function from normal
code at runtime, i.e. from vnet_sysuninit(), if DDB support is there.
bz [Sat, 27 Mar 2010 17:29:50 +0000 (17:29 +0000)]
MFC r203727:
Add an SDT provider for "vnet"s along with probes for vnet_alloc
and vnet_destroy.
Use the line number rather than NULL as dummy argument.
Note: the fbt provider does not reliably provide :return probes
(depending on optimization levels used at compile time) making
it unusable for scripts to generate complete call-traces with
well defined boundaries over allocations or destructions of
virtual network stacks.
trasz [Sat, 27 Mar 2010 17:22:11 +0000 (17:22 +0000)]
MFC r197680:
Provide default implementation for VOP_ACCESS(9), so that filesystems which
want to provide VOP_ACCESSX(9) don't have to implement both. Note that
this commit makes implementation of either of these two mandatory.
bz [Sat, 27 Mar 2010 17:22:08 +0000 (17:22 +0000)]
MFC r201815:
To avoid hardcoding further kernel configuration names for
make universe, split the logic into two parts:
- 1st to build worlds and generate kernel configs like LINT.
- 2nd to build kernels for a given TARGET architecture correctly
finding all newly generated configs, not knowing anything about
LINT anymore.
MFC r201960:
Use uname -m [1] and rename BUILD_ARCH to XMACHINE[2].
Submitted by: nyan[1], imp[2]
MFC r202095:
Rather than using an extra variable, only call uname if really needed and
then directly assign the result.
bz [Sat, 27 Mar 2010 17:17:11 +0000 (17:17 +0000)]
MFC r201814:
Generate a second LINT configuration for i386 and amd64 in
sys/conf/makeLINT.mk, which includes LINT and sets options VIMAGE
so that we will have VIMAGE LINT builds. For now only do it for
those two architectures to avoid massive universe times for archs
where people are less likely to use VIMAGE, if at all.
bz [Sat, 27 Mar 2010 17:14:55 +0000 (17:14 +0000)]
MFC r201813:
In sys/<arch>/conf/Makefile set TARGET to <arch>. That allows
sys/conf/makeLINT.mk to only do certain things for certain
architectures.
Note that neither arm nor mips have the Makefile there, thus
essentially not (yet) supporting LINT. This would enable them
to add special treatment to sys/conf/makeLINT.mk as well as choosing
one of the many configurations as LINT.
bz [Sat, 27 Mar 2010 17:11:06 +0000 (17:11 +0000)]
MFC r202123:
Change DDB show prison:
- name some columns more closely to the user space variables,
as we do for host.* or allow.* (in the listing) already.
- print pr_childmax (children.max).
- prefix hex values with 0x.
trasz [Sat, 27 Mar 2010 14:58:28 +0000 (14:58 +0000)]
MFC r202919:
Fix array overflow. This routine is only called from procfs,
which is not mounted by default, and I've been unable to trigger
a panic without this fix applied anyway.
jhb [Fri, 26 Mar 2010 18:58:22 +0000 (18:58 +0000)]
MFC 205332:
Use the same policy for rejecting / not rejecting ACPI tables with incorrect
checksums as the base acpi(4) driver. This fixes a problem where the MADT
parser would reject the MADT table during early boot causing the MP Table
to be used, but then the acpi(4) driver would attach and use non-SMP interrupt
routing.
jhb [Fri, 26 Mar 2010 13:49:46 +0000 (13:49 +0000)]
MFC 205214:
- Extend the machine check record structure to include several fields useful
for parsing model-specific and other fields in machine check events
including the global machine check capabilities and status registers,
CPU identification, and the FreeBSD CPU ID.
- Report these added fields in the console log of a machine check so that
a record structure can be reconstituted from the console messages.
- Parse new architectural errors including memory controller errors.
delphij [Thu, 25 Mar 2010 20:07:30 +0000 (20:07 +0000)]
MFC r205654:
The rmt client in GNU cpio could have a heap overflow when a malicious
remote tape service returns deliberately crafted packets containing
more data than requested.
Fix this by checking the returned amount of data and bailing out when it
is more than what we requested.
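A minimal sketch of the kind of check described, using a hypothetical copy_reply() helper rather than the actual cpio rmt code: a reply that reports more data than was requested is rejected before anything is copied.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a reply payload into buf. 'requested' is the number of bytes
 * the caller asked for (and the capacity of buf); 'reported' is the
 * length claimed by the remote side. Returns the number of bytes
 * copied, or -1 if the remote claims more than was requested.
 */
static long
copy_reply(char *buf, size_t requested, const char *payload, size_t reported)
{
	if (reported > requested)
		return (-1);	/* bail out instead of overflowing buf */
	memcpy(buf, payload, reported);
	return ((long)reported);
}
```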
jhb [Thu, 25 Mar 2010 15:48:23 +0000 (15:48 +0000)]
MFC 205013:
Print out the family and model from the cpu_id. This is especially useful
given the advent of the extended family and extended model fields. The
values are printed in hex to match their common usage in documentation.
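As an illustration of why the extended fields matter, here is a sketch that decodes family and model from a CPUID signature. It assumes the commonly documented display rule (extended family added when the base family is 0xf, extended model used when the base family is 0x6 or 0xf); the function names are hypothetical and this is not necessarily how the kernel computes the values.

```c
#include <stdint.h>

/* Display family: base family, plus extended family if base is 0xf. */
static uint32_t
cpuid_family(uint32_t id)
{
	uint32_t family = (id >> 8) & 0xf;

	if (family == 0xf)
		family += (id >> 20) & 0xff;
	return (family);
}

/* Display model: extended model forms the high nibble for 0x6/0xf. */
static uint32_t
cpuid_model(uint32_t id)
{
	uint32_t base_family = (id >> 8) & 0xf;
	uint32_t model = (id >> 4) & 0xf;

	if (base_family == 0x6 || base_family == 0xf)
		model |= ((id >> 16) & 0xf) << 4;
	return (model);
}
```

Printing both with "%x" matches the hex notation the commit message refers to.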
gavin [Thu, 25 Mar 2010 12:56:20 +0000 (12:56 +0000)]
Merge r204165 from head:
Add a "-x" option to chown(8)/chgrp(1) similar to the same option in
du(1), cp(1) etc, to prevent the crossing of mountpoints whilst using the
commands recursively.
ed [Thu, 25 Mar 2010 08:33:56 +0000 (08:33 +0000)]
MFC r205008 and 205009:
Make script(1) a little less broken.
Close the file descriptor to the TTY. There is no reason why the parent
process should keep track of the descriptor. This ensures that the
application inside properly drains the TTY during exit(2).
yongari [Wed, 24 Mar 2010 17:36:56 +0000 (17:36 +0000)]
MFC r205161:
It seems PCI_OUR_REG_[1-5] registers are not mapped on PCI
configuration space on Yukon Ultra(88E8056) such that accesses to
these registers were NOPs which in turn made msk(4) unstable on
this controller. Use indirect access method to access
PCI_OUR_REG_[1-5] registers. This should fix a long standing
instability bug which prevented msk(4) working on Yukon Ultra.
Special thanks to koitsu who gave me remote access to his system.
yongari [Wed, 24 Mar 2010 17:29:32 +0000 (17:29 +0000)]
MFC r204975,204978-204979,204981:
r204975:
Enable hardware fixes for BCM5704 B0 as recommended by data sheet.
r204978:
Set maximum read byte count to 2048 for PCI-X BCM5703/5704 devices.
Also disable relaxed ordering as recommended by data sheet for
PCI-X devices. For PCI-X BCM5704, set maximum outstanding split
transactions to 0 as indicated by data sheet.
For BCM5703 in PCI-X mode, DMA read watermark should be less than
or equal to maximum read byte count configuration. Enforce this
limitation in DMA read watermark configuration.
r204979:
Fix typo in r204975.
r204981:
Fix typo in r204978.
yongari [Wed, 24 Mar 2010 17:18:44 +0000 (17:18 +0000)]
MFC r204545:
Remove taskqueue based interrupt handling. After r204541 msk(4)
does not generate excessive interrupts any more so we don't need
to have two copies of interrupt handler.
While I'm here remove two STAT_PUT_IDX register accesses in LE
status event handler. After r204539 msk(4) always sync status LEs
so there is no need to resort to reading STAT_PUT_IDX register to
know the end of status LE processing. Just trust status LE's
ownership bit.
yongari [Wed, 24 Mar 2010 17:11:01 +0000 (17:11 +0000)]
MFC r204541:
Implement rudimentary interrupt moderation with programmable
countdown timer register. The timer resolution may vary among
controllers but the value would be represented by core clock
cycles. msk(4) will automatically compute the number of required clock
cycles from the given value in microseconds.
The default interrupt holdoff timer value is 100us which will
ensure less than 10k interrupts under load. The timer value can be
changed with dev.mskc.0.int_holdoff sysctl node.
Note, the interrupt moderation is shared resource on dual-port
controllers so you can't use separate interrupt moderation value
for each port. This means we can't stop interrupt moderation in
driver stop routine. Also have msk_tick() reclaim transmitted Tx
buffers as safety belt. With this change there is no need to check
missing Tx completion interrupt in watchdog handler, so remove it.
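The arithmetic behind the holdoff value can be sketched as below. The helper names and the core clock frequency used in the usage note are assumptions for illustration, not values taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a holdoff time in microseconds to core clock cycles. */
static uint32_t
usecs_to_cycles(uint32_t usecs, uint32_t core_clock_mhz)
{
	/* One microsecond lasts core_clock_mhz cycles. */
	return (usecs * core_clock_mhz);
}

/* Upper bound on interrupts per second for a given holdoff time. */
static uint32_t
max_intr_per_sec(uint32_t holdoff_usecs)
{
	assert(holdoff_usecs > 0);
	return (1000000 / holdoff_usecs);
}
```

With the default of 100us, max_intr_per_sec(100) gives the 10k bound quoted above; for a hypothetical 125 MHz core clock, usecs_to_cycles(100, 125) is the value that would be loaded into the timer.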
luigi [Wed, 24 Mar 2010 15:19:47 +0000 (15:19 +0000)]
MFC 205602:
Honor ip.fw.one_pass when a packet comes out of a pipe without being delayed.
I forgot to handle this case when I did the mtag cleanup three months ago.
I am merging immediately because this bugfix is important for
people using RELENG_8.
yongari [Tue, 23 Mar 2010 22:22:26 +0000 (22:22 +0000)]
MFC r204378:
Add TSO support on VLANs. While I'm here remove unnecessary check
of VLAN hardware checksum offloading. vlan(4) already takes care of
this.
yongari [Tue, 23 Mar 2010 22:19:27 +0000 (22:19 +0000)]
MFC r204377:
Add TSO support on VLANs. While I'm here remove unnecessary check
of VLAN hardware checksum offloading. vlan(4) already takes care of
this.
yongari [Tue, 23 Mar 2010 22:16:12 +0000 (22:16 +0000)]
MFC r204376:
Disable TSO on BCM5755M controller until I better understand
the issue. I still have no idea why TSO does not work on this
controller. davidch@ also confirmed there are no known TSO-related
issues for this controller.
yongari [Tue, 23 Mar 2010 22:11:39 +0000 (22:11 +0000)]
MFC r204373-204374:
r204373:
Move TSO setup to new function bce_tso_setup(). Also remove VLAN
parsing code in TSO path as the controller requires VLAN hardware
tagging to make TSO work over VLANs.
While parsing the mbuf in TSO path, always perform check for
writable mbuf as bce(4) has to reset IP length and IP checksum
field of IP header, and make sure the buffer is contiguous before
accessing IP/TCP headers. While I'm here replace magic number 40 with
more readable sizeof(struct ip) + sizeof(struct tcphdr).
r204374:
Add TSO support on VLANs. bce(4) controllers require VLAN hardware
tagging to make TSO work on VLANs so explicitly disable TSO on VLAN
if VLAN hardware tagging is disabled.
yongari [Tue, 23 Mar 2010 22:04:18 +0000 (22:04 +0000)]
MFC r204368,204370-204372:
r204368:
Allow disabling VLAN hardware tag stripping with software work
around. Management firmware(ASF/IPMI/UMP) requires the VLAN
hardware tag stripping so don't actually disable VLAN hardware tag
stripping. If VLAN hardware tag stripping was disabled, bce(4)
manually reconstructs VLAN frame by appending stripped VLAN tag.
Also remove unnecessary IFCAP_VLAN_MTU message.
r204370:
Make sure to stop controller first before changing MTU. And if
interface is not running don't initialize controller.
While here remove unnecessary update of error variable.
r204371:
Make toggling TSO, VLAN hardware checksum offloading work. Also fix
TX/RX checksum handler to set/clear relevant assist bits which
used to cause unexpected results.
With this change, bce(4) can be bridged with other interfaces that
lack TSO, VLAN checksum offloading.
yongari [Tue, 23 Mar 2010 21:51:31 +0000 (21:51 +0000)]
MFC r204363,204365-204367,204539-204540:
r204363:
Optimize inserting LE for TX checksum computation. Controller does
not require checksum LE configuration if checksum start and write
position is the same as before. So keep track last checksum start
and write position and insert new LE whenever the position is
changed. This reduces number of LEs used in TX path as well as
slightly enhance TX performance.
r204365:
Don't hardcode register offset to set PCIe max read request size.
The register offset is not valid on 88E8072 controller. Also don't
blindly increase max read request size to 4096, instead, use 2048
which seems to be more sane value and only change the value if the
hardware default size(512) was used on that register.
For PCIX controllers, use system defined constant rather than using
magic value.
While I'm here stop showing negotiated link width.
r204366:
Allocate single MSI message. msk(4) used to allocate 2 MSI messages
for controllers like 88E8053 which reports two MSI messages.
Because we don't get anything useful with 2 MSI messages,
allocating 1 MSI message would be more sane approach.
While I'm here, enable MSI for dual-port controllers too. Because
status block is shared for dual-port controllers, I don't think
msk(4) will encounter problem for using MSI on dual-port
controllers.
r204367:
Remove trailing white spaces.
r204539:
Properly sync status LEs after processing.
r204540:
Make sure to enable flow-control only if established link is
full-duplex. Previously msk(4) used to allow flow-control on
1000baseT half-duplex media. Also GMAC pause is enabled if link
partner is capable of handling it.
While I'm here use IFM_OPTIONS instead of using IFM_GMASK to check
optional flags of link.
yongari [Tue, 23 Mar 2010 21:38:25 +0000 (21:38 +0000)]
MFC r204361-204362:
r204361:
Reuse the configured LE for VLAN if new LE was created for TSO.
Only old controllers need to create new LE for TSO. This change
makes TSO work over VLANs.
r204362:
Add TSO support on VLANs. Controller requires VLAN hardware tagging
to make TSO work over VLANs.
yongari [Tue, 23 Mar 2010 19:41:43 +0000 (19:41 +0000)]
MFC r204228,204230:
r204228:
Add TSO support on VLANs. Also make sure to update TSO capability
whenever jumbo frame is configured.
While I'm here remove unnecessary check of VLAN hardware checksum
offloading. vlan(4) already takes care of this.
r204230:
Remove Tx mbuf parsing code for VLAN in TSO path. Controller does
not support TSO over VLAN if VLAN hardware tagging is disabled so
there is no need to check VLAN here.
yongari [Tue, 23 Mar 2010 19:30:15 +0000 (19:30 +0000)]
MFC r204155,204219:
r204155:
Increase PCIe maximum read request size to 2048. Because re(4) uses
Tx DMA burst size 2048, I believe PCIe maximum read request size
also should match to the value of Tx DMA burst size. With this
change I can get more than 800Mbps for TCP bulk transfers.
Previously I was not able to get more than 700Mbps. If I enable TSO
it now shows 927Mbps.
r204219:
Add TSO on VLANs. Because re(4) has a TSO limitation for jumbo
frame, make sure to update VLAN capabilities whenever jumbo frame
is configured.
While I'm here rearrange interface capabilities configuration. The
controller requires VLAN hardware tagging to make TSO work on VLANs
so explicitly check this requirement.
yongari [Tue, 23 Mar 2010 19:16:35 +0000 (19:16 +0000)]
MFC r204151,204223:
r204151:
Add TSO support on VLAN. Controller requires VLAN hardware tagging
to make TSO work on VLAN. So if VLAN hardware tagging is disabled
explicitly clear TSO on VLAN. While I'm here remove duplicated
VLAN_CAPABILITIES call.
r204223:
Remove Tx mbuf parsing code for VLAN in TSO path. Controller does
not support TSO over VLAN if VLAN hardware tagging is disabled so
there is no need to check VLAN here.
While I'm here make sure to pullup IP/TCP headers in the first
buffer.
jh [Tue, 23 Mar 2010 16:45:29 +0000 (16:45 +0000)]
MFC r205121:
Use a unique directory name instead of hardcoded /tmp/.diskless.
A malicious user could create a file named /tmp/.diskless and cause
the script to misbehave.
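The change itself is to a script; the same idea expressed in C, as a sketch only, uses mkdtemp(3) so the directory name is unpredictable and creation fails if the name already exists. The "/tmp/diskless." prefix and make_private_dir() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a uniquely named private directory under /tmp and store its
 * path in 'path'. Returns 0 on success, -1 on failure.
 */
static int
make_private_dir(char *path, size_t len)
{
	/* Hypothetical prefix; the trailing X's are replaced by mkdtemp(). */
	static const char tmpl[] = "/tmp/diskless.XXXXXX";

	if (len < sizeof(tmpl))
		return (-1);
	memcpy(path, tmpl, sizeof(tmpl));
	return (mkdtemp(path) == NULL ? -1 : 0);
}
```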
luigi [Tue, 23 Mar 2010 09:58:59 +0000 (09:58 +0000)]
MFC of a large number of ipfw and dummynet fixes and enhancements
done in CURRENT over the last 4 months.
HEAD and RELENG_8 are almost in sync now for ipfw, dummynet,
the pfil hooks and related components.
Among the most noticeable changes:
- r200855 more efficient lookup of skipto rules, and remove O(N)
blocks from critical sections in the kernel;
- r204591 large restructuring of the dummynet module, with support
for multiple scheduling algorithms (4 available so far)
See the original commit logs for details.
Changes in the kernel/userland ABI should be harmless because the
kernel is able to understand previous requests from RELENG_8 and
RELENG_7. For this reason, this changeset would be applicable
to RELENG_7 as well, but I am not sure if it is worthwhile.
hrs [Mon, 22 Mar 2010 22:07:19 +0000 (22:07 +0000)]
MFC r203272:
- Fix a bug where adding an interface with an invalid MTU sets the
bridge's MTU if it is the firstly-added one while the addition
itself fails.
- Allow SIOCSIFMTU only when all members have the same MTU.
- Remove IFT_GIF check when defining the bridge MTU by the
firstly-added interface's one. The MTU of the gif interface
has to be the same as the bridge's one.
edwin [Mon, 22 Mar 2010 21:35:54 +0000 (21:35 +0000)]
MFC of r205475, tzdata2010f:
The Australian Antarctic Division:
- Macquarie Island will stay on UTC+11 for winter and not switch back from DST.
- Casey station reverted to its normal time of UTC+8 on 5 March 2010.
- Davis station will revert to its normal time of UTC+7 on 10 March 2010.
- Mawson station stays on UTC+5.
Syria will start DST on Thursday 1 April 2010 at midnight.
jkim [Mon, 22 Mar 2010 20:36:35 +0000 (20:36 +0000)]
MFC: r205223
Fix a long standing regression of readdir(3) in fdescfs(5) introduced
in r1.48. We were stopping at the first null pointer when multiple file
descriptors were opened and one in the middle was closed. This restores
traditional behaviour of fdescfs.
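The iteration bug can be illustrated with a sketch over a generic table in which closed slots are NULL. list_open_slots() is a hypothetical helper, not the fdescfs code: the loop skips empty slots with continue rather than stopping at the first one.

```c
#include <stddef.h>

/*
 * Collect the indices of all non-NULL slots in 'table' into 'out'
 * (at most 'outlen' entries). Returns the number of indices stored.
 */
static size_t
list_open_slots(void *const *table, size_t nslots, size_t *out, size_t outlen)
{
	size_t i, n = 0;

	for (i = 0; i < nslots && n < outlen; i++) {
		if (table[i] == NULL)
			continue;	/* closed slot: skip it, do not stop */
		out[n++] = i;
	}
	return (n);
}
```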
jkim [Mon, 22 Mar 2010 19:59:00 +0000 (19:59 +0000)]
MFC: r205092
Tidy up callout for select(2) and read timeout.
- Add a missing callout_drain(9) before the descriptor deallocation.[1]
- Prefer callout_init_mtx(9) over callout_init(9) and let the callout
subsystem handle the mutex for callout function.
PR: kern/144453
Submitted by: Alexander Sack (asack at niksun dot com)[1]