Warner Losh [Sun, 7 Dec 2008 18:34:27 +0000 (18:34 +0000)]
Use '0' rather than PZERO to not change the priority that I'm waiting
at. I don't think this will make a huge difference, but I have
received a report of a interrupt storm on one 16-bit card that this
might fix (chances are it won't, since I think that we may need to
check both the CBB registers for the 16-bit card as well as the PCIC
registers for power state change).
Warner Losh [Sun, 7 Dec 2008 18:32:09 +0000 (18:32 +0000)]
Use atomic_add_int rather than a simple ++ to ensure no cache races if
the power interrupt and init code waiting for the interrupt are
running on different CPUs. I haven't seen this make any real
difference, but I've also had some reports of odd behavior I can't
otherwise explain. It is an infrequent operation, and certainly
wouldn't hurt.
Improve usefulness of the panic by printing the pointer to the problematic
dquot. In-tree gdb is often unable to get the dq value, so supply it in
panic message.
Peter Wemm [Sun, 7 Dec 2008 02:32:49 +0000 (02:32 +0000)]
When libthr and rtld start up, there are a number of magic spells cast
in order to get the symbol binding state "just so". This is to allow
locking to be activated and not run into recursion problems later.
However, one of the magic bits involves an explicit call to _umtx_op()
to force symbol resolution. It does a wakeup operation on a fake,
uninitialized (ie: random contents) umtx. Since libthr isn't active, this
is harmless. Nothing can match the random wakeup.
However, valgrind finds this and is not amused. Normally I'd just
write a suppression record for it, but the idea of passing random
args to syscalls (on purpose) just doesn't feel right.
Add support for automated reboot after power failure on Apple Core99 machines
(G3 laptops, all G4 machines, early G5s, G5 Xserves). The relevant sysctl
is named dev.pmu.0.server_mode for mental compatibility with Linux.
Fix some nasty race conditions in the VIA-CUDA driver that ended up preventing
my right mouse button and keyboard LEDs from working due to mangled
configuration packets. Fixed several other races and associated problems in the
main ADB stack that were exposed while fixing this.
Alexander Motin [Sat, 6 Dec 2008 23:00:48 +0000 (23:00 +0000)]
Carefully handle memory errors to keep peers compression/encryption state
consistent. There are some cases reported where peers fatally getting out
of sync without any visible reason. I hope this solve the problem.
Alexander Motin [Sat, 6 Dec 2008 21:41:27 +0000 (21:41 +0000)]
Implement suspend/resume for mmc and mmcsd drivers.
Now it is possible to suspend/resume with inserted and active card.
To reinitialize card on resume and to detect card change while suspended,
implement bus rescan routines. It can also be used by controllers without
card presence detection signals or with multiple cards per slot support.
While there, cleanup msleep() usage. We have no any rights to exit without
"request done" signal from driver as it could lead to modify after free.
Add a -L option to wc(1), for finger compatibility with the GNU
wc utility. The -L option can be used to report the length of
the longest line wc has seen in one or more files. It is
disabled by default, and wc uses the standard `-lwc'.
in_rtalloc1(9) returns a locked route, so make sure that we use
RTFREE_LOCKED() here. This macro makes sure the reference count
on the route is being managed properly. This elimates another
case which results in the following message being printed to the
console:
Randall Stewart [Sat, 6 Dec 2008 13:19:54 +0000 (13:19 +0000)]
Code from the hack-session known as the IETF (and a
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
o don't pre-hash them when you issue one in a cookie.
o Allow duplicates and use addresses and ports to
discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
is still experimental and needs more extensive testing with the
Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
if the adv-peer-ack point progressed. However we still make sure
a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
unreliable but not abandoned yet.
With the help of: Michael Teuxen and Peter Lei :-)
MFC after: 4 weeks
Tim Kientzle [Sat, 6 Dec 2008 07:08:08 +0000 (07:08 +0000)]
New tests:
* support for bzip2 file with multiple concatenated bzip2 streams
* support for bzip2 file with junk after bzip2 stream
* support for gzip file with junk after gzip stream
* "fuzz" tester randomly modifies a bunch of input files in order to try
to crash libarchive (this found an amusing hang in the ISO9660 code
when trying to read images that advertised a zero blocksize).
This test is implemented, but commented out for now:
* support for gzip file with multiple concatenated gzip streams
Tim Kientzle [Sat, 6 Dec 2008 06:45:15 +0000 (06:45 +0000)]
MfP4: Big read filter refactoring.
This is an attempt to eliminate a lot of redundant
code from the read ("decompression") filters by
changing them to juggle arbitrary-sized blocks
and consolidate reblocking code at a single point
in archive_read.c.
Along the way, I've changed the internal read/consume
API used by the format handlers to a slightly
different style originally suggested by des@. It
does seem to simplify a lot of common cases.
The most dramatic change is, of course, to
archive_read_support_compression_none(), which
has just evaporated into a no-op as the blocking
code this used to hold has all been moved up
a level.
There's at least one more big round of refactoring
yet to come before the individual filters are as
straightforward as I think they should be...
Tim Kientzle [Sat, 6 Dec 2008 06:17:18 +0000 (06:17 +0000)]
Style fixes:
* Wrap long declarations to fit 80 chars
* #undef macros that shouldn't be exported
* Organize the version-dependent conditionals a
bit more consistently
Speculative:
* libarchive 3.0 will (eventually) use int64_t
instead of off_t. This is an attempt to avoid
some the headaches caused by Linux LFS. (I'll
still have to do ugly things for the struct stat
references in archive_entry.h, of course.)
John Baldwin [Fri, 5 Dec 2008 21:19:24 +0000 (21:19 +0000)]
Add simple locking for the in-kernel iconv code. Translation operations
do not need any locking. Opening and closing translators is serialized
using an sx lock.
Note: This depends on the earlier fix to kern_module.c to properly order
MOD_UNLOAD events.
Unconditionally use locked addition of zero to tip of the stack for
memory barriers on i386. It works as a serialization instruction on
all IA32 CPUs.
Alternative solution of using {s,l,}fence requires run-time checking
of the presense of the corresponding SSE or SSE2 extensions, and
possible boot-time patching of the kernel text.
Several threads in a process may do vfork() simultaneously. Then, all
parent threads sleep on the parent' struct proc until corresponding
child releases the vmspace. Each sleep is interlocked with proc mutex of
the child, that triggers assertion in the sleepq_add(). The assertion
requires that at any time, all simultaneous sleepers for the channel use
the same interlock.
Silent the assertion by using conditional variable allocated in the
child. Broadcast the variable event on exec() and exit().
Since struct proc * sleep wait channel is overloaded for several
unrelated events, I was unable to remove wakeups from the places where
cv_broadcast() is added, except exec().
Reported and tested by: ganbold
Suggested and reviewed by: jhb
MFC after: 2 week
Luigi Rizzo [Fri, 5 Dec 2008 17:13:40 +0000 (17:13 +0000)]
Some libstand/bootp.c extension (written by Danny Braniss, slightly
revised/modified by me) to store dhcp options into kenv variables,
so the information is available to the boot loader and can be used
to customize the boot process.
The change is totally unintrusive, essentially made of a single
function to be called while parsing a dhcp response, and a couple
of tables to classify options. The values extracted from dhcp
options are stored in the kenv environment in one of these forms:
+ options whose name and type is known are saved as
dhcp.name = value (string, or number/ip addresses lists)
+ unknown options are assumed to be strings and saved as
dhcp.option-NNN = "value"
+ options listed as '__INDIR' and sent on the wire as e.g.
option unknown-252 "some.name=the actual value"
are saved as
some.name = "the actual value"
+ options listed as '__ILIST' and sent on the wire as e.g.
option unknown-249 "a.b=foo bar; c.d= 123; e.f=done"
are saved as multiple values
a.b="foo bar"
c.d="123"
e.f="done"
As you can see there is quite a bit of flexibility on what can
be passed to the loader or the kernel.
For the time being the vendor-specific table is mostly disabled,
because there is no standard set of options for FreeBSD, and I don't
know all the pxe-specific vendor options.
Also, applications using libstand may live in memory-constrained
environments, so it makes sense to keep these tables as small as
possible, especially considering that one can generate arbitrary
name=value pairs using site-specific options of type __INDIR or
__ILIST (there are 4 __ILIST and 5 __INDIR in the table, numbered
246..249 and 250..254).
Actually, considering that probably 75% of the standard dhcp options
are totally useless, it might make sense to remove them as well.
John Baldwin [Fri, 5 Dec 2008 16:47:30 +0000 (16:47 +0000)]
When the SYSINIT() to load a module invokes the MOD_LOAD event successfully,
move that module to the head of the associated linker file's list of modules.
The end result is that once all the modules are loaded, they are sorted in
the reverse of their load order. This causes the kernel linker to invoke
the MOD_QUIESCE and MOD_UNLOAD events in the reverse of the order that
MOD_LOAD was invoked. This means that the ordering of MOD_LOAD events that
is set by the SI_* paramters to DECLARE_MODULE() are now honored in the same
order they would be for SYSUNINIT() for the MOD_QUIESCE and MOD_UNLOAD
events.
John Baldwin [Fri, 5 Dec 2008 13:40:25 +0000 (13:40 +0000)]
- Invoke MOD_QUIESCE on all modules in a linker file (kld) before
unloading any modules. As a result, if any module veto's an unload
request via MOD_QUIESCE, the entire set of modules for that linker
file will remain loaded and active now rather than leaving the kld
in a weird state where some modules are loaded and some are unloaded.
- This also moves the logic for handling the "forced" unload flag out of
kern_module.c and into kern_linker.c which is a bit cleaner.
- Add a module_name() routine that returns the name of a module and use that
instead of printing pointer values in debug messages when a module fails
MOD_QUIESCE or MOD_UNLOAD.
Warner Losh [Fri, 5 Dec 2008 05:20:08 +0000 (05:20 +0000)]
Move to using filter for the change interrupts. Also rework the power
interrupt code to be more robust. I've been running these changes for
over a year... With these changes, I don't see the ath card going
into reset like the code in the tree.
Fix a bug with the ael1006 PHY. The bug shows up as persistent but incomplete
packet loss, of between 10-30%. The fix is to put the PHY into
and take it out of local loopback mode when resetting the interface.
Pyun YongHyeon [Thu, 4 Dec 2008 01:58:40 +0000 (01:58 +0000)]
Add support for newer JMC250/JMC260 revisions.
o Chip full mask revision 2 or later controllers have to
set correct Tx MAC and Tx offload clock depending on negotiated
link speed.
o JMC260 chip full mask revision 2 has a silicon bug that can't
handle 64bit DMA addressing. Add workaround to the bug by
limiting DMA address space to be within 32bit.
o Valid FIFO space of receive control and status register was
changed on chip full mask revision 2 or later controllers. For
these controllers, use default 16QW as it's supposed to be the
safest value for maximum PCIe compatibility. JMicron confirmed
performance will not be reduced even if the FIFO space is set
to 16QW.
o When interface is put into suspend/shutdown state, remove Tx MAC
and Tx offload clock to save more power. We don't need Tx clock
at all in this state.
o Added new register definition for chip full mask revision 2 or
later controllers.
Thanks to JMicron for their continuous support of FreeBSD.
Xin LI [Wed, 3 Dec 2008 23:00:00 +0000 (23:00 +0000)]
Don't attempt to clear status updates if we did not do a link state
change. As a side effect, this makes the excessive interrupts to
disappear which has been observed as a regression in recent stable/7.
Reported by: many (on -stable@)
Reviewed by: davidch
Luigi Rizzo [Wed, 3 Dec 2008 18:36:59 +0000 (18:36 +0000)]
Enable operation of newfs on plain files, which is useful when you
want to prepare disk images for emulators (though 'makefs' in port
can do something similar).
This relies on:
+ minor changes to pass the consistency checks even when working on a file;
+ an additional option, '-p partition' , to specify the disk partition to
initialize;
+ some changes on the I/O routines to deal with partition offsets.
The latter was a bit tricky to implement, see the details in newfs.h:
in newfs, I/O is done through libufs which assumes that the file
descriptor refers to the whole partition. Introducing support for
the offset in libufs would require a non-backward compatible change
in the library, to be dealt with a version bump or with symbol
versioning.
I felt both approaches to be overkill for this specific application,
especially because there might be other changes to libufs that might
become necessary in the near future.
So I used the following trick:
- read access is always done by calling bread() directly, so we just add
the offset in the (few) places that call bread();
- write access is done through bwrite() and sbwrite(), which in turn
calls bwrite(). To avoid rewriting sbwrite(), we supply our own version
of bwrite() here, which takes precedence over the version in libufs.
Luigi Rizzo [Wed, 3 Dec 2008 18:22:36 +0000 (18:22 +0000)]
Some useful operational extensions to newfs_msdos, especially
when preparing images for emulators or flash devices:
+ option '-C size' to create the underlying image file with given size.
Saves doing a 'dd' before, and especially it creates a sparse file
+ option '-@ offset' to build the FAT image at the specified offset
in the image file or device;
+ make the cluster size adaptive on the filesystem size.
Previously the default was 4k which is really unconvenient with
large media; now it goes from 512 bytes to 32k depending on
filesystem size (i still need to check whether it makes sense
to go further up, to 64k or above);
+ fix default geometry when not specified on the command line,
use 63 sectors/255 heads by default.
Also trim the size so it exactly a multiple of a track, to avoid
complaints in some filesystem code.
+ document all the above, plus some manual page clarifications.
Change nfsserver slightly so that it does not trip over the timestamp
validation code on ZFS.
Problem: when opening file with O_CREAT|O_EXCL NFS has to jump through
extra hoops to ensure O_EXCL semantics. Namely, client supplies of 8
bytes (NFSX_V3CREATEVERF) bytes of verification data to uniquely
identify this create request. Server then creates a new file with access
mode 0, copies received 8 bytes into va_atime member of struct vattr and
attempt to set the atime on file using VOP_SETATTR. If that succeeds, it
fetches file attributes with VOP_GETATTR and verifies that atime
timestamps match. If timestamps do not match, NFS server concludes it
has probbaly lost the race to another process creating the file with the
same name and bails with EEXIST.
This scheme works OK when exported FS is FFS, but if underlying
filesystem is ZFS _and_ server is running 64bit kernel, it breaks down
due to sanity checking in zfs_setattr function, which refuses to accept
any timestamps which have tv_sec that cannot be represented as 32bit
int. Since struct timespec fields are 64 bit integers on 64bit platforms
and server just copies NFSX_V3CREATEVERF bytes info va_atime, all eight
bytes supplied by client end up in va_atime.tv_sec, forcing it out of
valid 32bit range.
The solution this change implements is simple: it treats
NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into
va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing
that tv_sec remains in 32 bit range and ZFS remains happy.
Bjoern A. Zeeb [Wed, 3 Dec 2008 15:54:35 +0000 (15:54 +0000)]
Fix a credential reference leak. [1]
Close subtle but relatively unlikely race conditions when
propagating the vnode write error to other active sessions
tracing to the same vnode, without holding a reference on
the vnode anymore. [2]
Luigi Rizzo [Wed, 3 Dec 2008 14:53:59 +0000 (14:53 +0000)]
Another, hopefully final set of changes to boot0 and boot0cfg.
boot0.S changes:
+ import a patch from Christoph Mallon to rearrange the various
print functions and save another couple of bytes;
+ implement the suggestion in PR 70531 to enable booting from
any valid partition because even the extended partitions that
were previously in our kill list may contain a valid boot loader.
This simplifies the code and saves some bytes;
+ followwing up PR 127764, implement conditional code to preserve
the 'Volume ID' which might be used by other OS (NT, XP, Vista)
and is located at offset 0x1b8. This requires a relocation of the
parameter block within the boot sector -- there is no other
possible workaround.
To address this, boot0cfg has been updated to handle both
versions of the boot code;
+ slightly rearrange the strings printed in the menus to make
the code buildable with all options. Given the tight memory
budget, this means that with certain options we need to
shrink or remove certain labels.
and especially:
make -DVOLUME_LABEL -DPXE the default options.
This means that the newly built boot0 block will preserve the
Volume ID, and has the (hidden) option F6 to boot from INT18/PXE.
I think the extra functionality is well worth the change.
The most visible difference here is that the 'Default: ' string
now becomes 'Boot: ' (it can be reverted to the old value
but then we need to nuke 1/2 partition name or entries to
make up for the extra room).
boot0cfg changes:
+ modify the code to recognise the new boot0 structure (with the
relocated options block to make room for the Volume id).
+ add two options, '-i xxxx-xxxx' to set the volume ID, -e c
to modify the character printed in case of bad input
Pyun YongHyeon [Wed, 3 Dec 2008 08:56:01 +0000 (08:56 +0000)]
Add some PHY magic to enable PHY hibernation and 1000baseT/10baseT
power adjustment. This change is required to guarantee correct
operation on certain switches.
Submitted by: Jie Yang < Jie.Yang <> Atheros com >
Robert Watson [Tue, 2 Dec 2008 23:26:43 +0000 (23:26 +0000)]
Merge OpenBSM 1.1 alpha 2 from the OpenBSM vendor branch to head, both
contrib/openbsm (svn merge) and sys/{bsm,security/audit} (manual merge).
- Add OpenBSM contrib tree to include paths for audit(8) and auditd(8).
- Merge support for new tokens, fixes to existing token generation to
audit_bsm_token.c.
- Synchronize bsm includes and definitions.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
--
OpenBSM 1.1 alpha 2
- Include files in OpenBSM are now broken out into two parts: library builds
required solely for user space, and system includes, which may also be
required for use in the kernels of systems integrating OpenBSM. Submitted
by Stacey Son.
- Configure option --with-native-includes allows forcing the use of native
include for system includes, rather than the versions bundled with OpenBSM.
This is intended specifically for platforms that ship OpenBSM, have adapted
versions of the system includes in a kernel source tree, and will use the
OpenBSM build infrastructure with an unmodified OpenBSM distribution,
allowing the customized system includes to be used with the OpenBSM build.
Submitted by Stacey Son.
- Various strcpy()'s/strcat()'s have been changed to strlcpy()'s/strlcat()'s
or asprintf(). Added compat/strlcpy.h for Linux.
- Remove compatibility defines for old Darwin token constant names; now only
BSM token names are provided and used.
- Add support for extended header tokens, which contain space for information
on the host generating the record.
- Add support for setting extended host information in the kernel, which is
used for setting host information in extended header tokens. The
audit_control file now supports a "host" parameter which can be used by
auditd to set the information; if not present, the kernel parameters won't
be set and auditd uses unextended headers for records that it generates.
OpenBSM 1.1 alpha 1
- Add option to auditreduce(1) which allows users to invert sense of
matching, such that BSM records that do not match, are selected.
- Fix bug in audit_write() where we commit an incomplete record in the
event there is an error writing the subject token. This was submitted
by Diego Giagio.
- Build support for Mac OS X 10.5.1 submitted by Eric Hall.
- Fix a bug which resulted in host XML attributes not being arguments so
that const strings can be passed as arguments to tokens. This patch was
submitted by Xin LI.
- Modify the -m option so users can select more then one audit event.
- For Mac OS X, added Mach IPC support for audit trigger messages.
- Fixed a bug in getacna() which resulted in a locking problem on Mac OS X.
- Added LOG_PERROR flag to openlog when -d option is used with auditd.
- AUE events added for Mac OS X Leopard system calls.
Bjoern A. Zeeb [Tue, 2 Dec 2008 21:37:28 +0000 (21:37 +0000)]
Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.
For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.
Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation
Ed Schouten [Tue, 2 Dec 2008 19:09:08 +0000 (19:09 +0000)]
Remove "[KEEP THIS!]" from COMPAT_43TTY. It's not really that important.
Sgtty is a programming interface that has been replaced by termios over
the years. In June we already removed <sgtty.h>, which exposes the
ioctl()'s that are implemented by this interface. The importance of this
flag is overrated right now.
Ken Smith [Tue, 2 Dec 2008 18:13:29 +0000 (18:13 +0000)]
Remove slip.log. Slip got removed as part of the MPSAFE tty work. If
it does come back it would probably be better if users who were interested
in slip added appropriate lines instead of this being here unconditionally.
Ken Smith [Tue, 2 Dec 2008 16:46:01 +0000 (16:46 +0000)]
The slip.log file got removed along with the MPSAFE tty work. If slip
does ever come back it's probably best if its log file be something that
gets added if the user decided they want to run slip instead of having
it here unconditionally.
Bug fix from Chelsio which addresses the issue of the device resetting
when it sees only received packets. In some cases where a device only
recieves data it mistakenly thinks that its transmitting side is broken
and resets the device.
Luigi Rizzo [Tue, 2 Dec 2008 14:57:48 +0000 (14:57 +0000)]
This commits brings in a lot of documentation and some enhancement
of the boot0.S code, with a number of compile-time selectable options,
the most interesting one being the ability to select PXE booting.
The code is completely compatible with the previous one, and with
the boot0cfg program. Even the actual code is largely unmodified,
with only minor rearrangements or fixes to make room for the new
features.
The behaviour of the standard build differs from the previous
version in the following, minor things:
+ 'noupdate' is the default, which means the code does not
write back the selection to disk. You can enable the feature
at runtime with boot0cfg, or changing the flags in the Makefile.
+ a drive number of 0x00 (floppy, or USB in floppy emulation) is
now accepted as valid. Previously, it was overridden with 0x80,
meaning that the partition table coming from the media was
used to access sectors on a possibly different media.
You can revert to the previous mode building with -DCHECK_DRIVE,
and you can always use the 'setdrv' option in boot0cfg
+ certain FAT or NTFS partitions are listed as WIN instead of DOS.
+ the 'bel' character on a bad selection is replaced by a '#' to
make it clear that the system is not hang even if the machine
does not have a speaker. This can be reverted back at compile
time, or at runtime with an upcoming boot0cfg option.
Additional features are available as compile time options,
and may be become the default if deemed useful. In particular:
+ INT18/PXE boot (make -DPXE)
This option enables booting through INT 18h (which on certain
BIOSes can be hooked to PXE) by pressing F6. There is unfortunately
no room to print the additional menu option.
Also, to make room for the code, the 'Default: ' string is
changed to 'Boot: '
+ print current drive number (make -DTEST)
Prints a line indicating the current drive number.
This is useful to figure out what is going on for machines/bioses
which remap drives in sometimes surprising ways.
+ disable numeric keys in console mode (make -DONLY_F_KEYS)
Not really a significant option, but it is needed to make
room for the -DTEST mode.
+ disable floppy support (make -DCHECK_DRIVE)
Revert to the old behaviour of only accepting 0x80 and above
as valid drive numbers.