Qing Li [Tue, 20 Oct 2009 21:27:03 +0000 (21:27 +0000)]
The flow-table function flowtable_route_flush() may be called
during system initialization time. Since the flow-table is
designed to maintain per CPU flow cache, the existing code
did not check whether "smp_started" is true before calling
sched_bind() and sched_unbind(), which triggers a page fault.
Andrew Gallatin [Tue, 20 Oct 2009 18:58:28 +0000 (18:58 +0000)]
Make mxge do a better job recovering from NIC h/w faults
by checking PCI config space when the NIC is not
transmitting. Previously, a h/w fault would not have been
detected if the NIC was down, or handling an RX only
workload.
Qing Li [Tue, 20 Oct 2009 17:55:42 +0000 (17:55 +0000)]
In the ARP callout timer expiration function, the current time_second
is compared against the entry expiration time value (that was set based
on time_second) to check if the current time is larger than the set
expiration time. Due to the +/- timer granularity value, the comparison
returns false, causing the alternative code to be executed. The
alternative code path freed the memory without removing that entry
from the table list, causing a use-after-free bug.
Ruslan Ermilov [Tue, 20 Oct 2009 16:36:51 +0000 (16:36 +0000)]
Random number generator initialization cleanup:
- Introduce new SI_SUB_RANDOM point in boot sequence to make it
clear from where one may start using random(9). It should be as
early as possible, so place it just after SI_SUB_CPU where we
have some randomness on most platforms via get_cyclecount().
- Move stack protector initialization to be after SI_SUB_RANDOM
as before this point we have no randomness at all. This fixes
stack protector to actually protect stack with some random guard
value instead of a well-known one.
Note that this patch doesn't try to address arc4random(9) issues.
With current code, it will be implicitly seeded by stack protector
and hence will get the same entropy as random(9). It will be
securely reseeded once /dev/random is feeded by some entropy from
userland.
Submitted by: Maxim Dounin <mdounin@mdounin.ru>
MFC after: 3 days
Jaakko Heinonen [Tue, 20 Oct 2009 15:06:18 +0000 (15:06 +0000)]
Unloading of the nfscl module is unsupported because newnfslock doesn't
support unloading. It's not trivial to implement newnfslock unloading so
for now just admit that unloading is unsupported and refuse to attempt
unload in all nfscl module event handlers.
Jaakko Heinonen [Tue, 20 Oct 2009 15:01:46 +0000 (15:01 +0000)]
Fix ordering of nfscl_modevent() and ncl_uninit(). nfscl_modevent() must
be called after ncl_uninit() when unloading the nfscl module because
ncl_uninit() uses ncl_iod_mutex which is destroyed in nfscl_modevent().
Edwin Groothuis [Tue, 20 Oct 2009 06:54:31 +0000 (06:54 +0000)]
Instead of having to know which timezone was picked last time, you
now can run "tzsetup -r" which will reinstall the last choice. This
data is recorded in /var/db/zoneinfo.
Alexander Kabaev [Tue, 20 Oct 2009 02:35:12 +0000 (02:35 +0000)]
Use callout_init_mtx on FreeBSD versions recent enough. This closes
the race where interrupt thread can complete the request for which
timeout has fired and while mpt_timeout has blocked on mpt_lock.
Do a best effort to keep 4.x ang Giant-locked configurartions
compiling still.
Jung-uk Kim [Mon, 19 Oct 2009 20:58:10 +0000 (20:58 +0000)]
Rewrite x86bios and update its dependent drivers.
- Do not map entire real mode memory (1MB). Instead, we map IVT/BDA and
ROM area separately. Most notably, ROM area is mapped as device memory
(uncacheable) as it should be. User memory is dynamically allocated and
free'ed with contigmalloc(9) and contigfree(9). Remove now redundant and
potentially dangerous x86bios_alloc.c. If this emulator ever grows to
support non-PC hardware, we may implement it with rman(9) later.
- Move all host-specific initializations from x86emu_util.c to x86bios.c and
remove now unnecessary x86emu_util.c. Currently, non-PC hardware is not
supported. We may use bus_space(9) later when the KPI is fixed.
- Replace all bzero() calls for emulated registers with more obviously named
x86bios_init_regs(). This function also initializes DS and SS properly.
- Add x86bios_get_intr(). This function checks if the interrupt vector is
available for the platform. It is not necessary for PC-compatible hardware
but it may be needed later. ;-)
- Do not try turning off monitor if DPMS does not support the state.
- Allocate stable memory for VESA OEM strings instead of just holding
pointers to them. They may or may not be accessible always. Fix a memory
leak of video mode table while I am here.
- Add (experimental) BIOS POST call for vesa(4). This function calls VGA
BIOS POST code from the current VGA option ROM. Some video controllers
cannot save and restore the state properly even if it is claimed to be
supported. Usually the symptom is blank display after resuming from suspend
state. If the video mode does not match the previous mode after restoring,
we try BIOS POST and force the known good initial state. Some magic was
taken from NetBSD (and it was taken from vbetool, I believe.)
- Add a loader tunable for vgapci(4) to give a hint to dpms(4) and vesa(4)
to identify who owns the VESA BIOS. This is very useful for multi-display
adapter setup. By default, the POST video controller is automatically
probed and the tunable "hw.pci.default_vgapci_unit" is set to corresponding
vgapci unit number. You may override it from loader but it is very unlikely
to be necessary. Unfortunately only AGP/PCI/PCI-E controllers can be
matched because ISA controller does not have necessary device IDs.
- Fix a long standing bug in state save/restore function. The state buffer
pointer should be ES:BX, not ES:DI according to VBE 3.0. If it ever worked,
that's because BX was always zero. :-)
- Clean up register initializations more clearer per VBE 3.0.
- Fix a lot of style issues with vesa(4).
Powercrypt and NetSec seem to be defunct (webpages point to link farms
and a google search yields no alternative). Remove the links but
keep the entries around for reference.
PR: 139756
Submitted by: Patrick Oonk <patrick@pine.nl>
MFC after: 3 days
Rui Paulo [Mon, 19 Oct 2009 11:17:46 +0000 (11:17 +0000)]
HWMP fixes, namely:
* fix the processing of RANN frames
* the originator and target addresses were swapped and while it worked
fine, it was not spec compliant.
Ed Schouten [Mon, 19 Oct 2009 11:10:44 +0000 (11:10 +0000)]
Partially revert the change to the gettytab made in r198214.
By misinterpreting some data, I thought that getty wouldn't apply any
baud rate to the syscons devices, but it uses the default entry instead.
This means that the baud rate is set to 1200. This isn't too bad, except
when using canonical mode. Make it use 9600 baud by default.
Ed Schouten [Mon, 19 Oct 2009 07:17:37 +0000 (07:17 +0000)]
Properly set the low watermarks when reducing the baud rate.
Now that buffers are deallocated lazily, we should not use
tty*q_getsize() to obtain the buffer size to calculate the low
watermarks. Doing this may cause the watermark to be placed outside the
typical buffer size.
This caused some regressions after my previous commit to the TTY code,
which allows pseudo-devices to resize the buffers as well.
Ed Schouten [Sun, 18 Oct 2009 19:48:53 +0000 (19:48 +0000)]
Allow the buffer size to be configured for pseudo-like TTY devices.
Devices that don't implement param() (which means they don't support
hardware parameters such as flow control, baud rate) hardcode the baud
rate to TTYDEF_SPEED. This means the buffer size cannot be configured,
which is a little inconvenient when using canonical mode with big lines
of input, etc.
Make it adjustable, but do clamp it between B50 and B115200 to prevent
awkward buffer sizes. Remove the baud rate assignment from
/etc/gettytab. Trust the kernel to fill in a proper value.
Ed Schouten [Sun, 18 Oct 2009 19:45:44 +0000 (19:45 +0000)]
Make lock devices work properly.
It turned out I did add the code to use the init state devices to set
the termios structure when opening the device, but it seems I totally
forgot to add the bits required to force the actual locking of flags
through the lock state devices.
Reported by: ru
MFC after: 1 week (to be discussed)
Nathan Whitehorn [Sun, 18 Oct 2009 17:22:08 +0000 (17:22 +0000)]
Don't assume that physical addresses are identity mapped. This allows
the second processor on G5 systems to start. Note that SMP is still
non-functional on these systems because of IPI delivery problems.
Marius Strobl [Sun, 18 Oct 2009 13:08:15 +0000 (13:08 +0000)]
Change the load base to below 2GB so PIE binaries work including when
compiled to use the Medium/Low code model, which we currently default
to for the userland. GNU/Linux has moved their default to Medium/Middle
some time ago, which probably explains why the current GNU ld(1) uses
a base in the range between 32 and 44 bits instead.
If ET_DYN binary has non-zero base address for some reason, honour it
and do not relocate the binary to ET_DYN_LOAD_ADDR. This allows for the
binary author to influence address map of the process. In particular,
when the binary is actually an interpeter, this allows to have almost
usual process address map.
Communicate the relocation bias of the mapping for interpeter-less
ET_DYN binary, that is interperter itself, in AT_BASE aux entry. This
way, rtld is able to find its dynamic structure and relocate itself.
Note that mapbase in the rtld is still wrong and requires further
fixing.
Reported and tested by: rwatson
Discussed with: kan
MFC after: 3 days
Max Khon [Sun, 18 Oct 2009 11:28:31 +0000 (11:28 +0000)]
Reset UPTODATE gnodes after remaking makefiles when make
is not going to be restarted: such nodes could be marked UPTODATE
without doing rebuild due to remakingMakefiles being TRUE.
Weongyo Jeong [Sun, 18 Oct 2009 00:11:49 +0000 (00:11 +0000)]
overhauls urtw(4) for supporting RTL8187B devices properly that there
was major changes to initialize RF chipset and set H/W registers and
removed a lot of magic numbers on code. Details are as follows:
- uses the endpoint 0x89 to get TX status information which used to
get TX complete or retry numbers or get a beacon interrupt. It's
only valuable for RTL8187B.
- removes urtw_write[8|16|32]_i functions that it's useless now.
- uses ic->ic_updateslot to set SLOT, SIFS, DIES, EIFS, CW_VAL
registers that doesn't set these whenever the channel is changed.
- code for initializing RF chipset for RTL8187B changed a lot that
there was many problems on TX transfers so it doesn't work properly
even if just for a ping/pong. Now it becomes more stable than
before that TX throughputs using netperf(1) were about 15 ~ 17Mbps/s
though sometimes it encounters packet losses.
- removes a lot of magic numbers that in the previous all of
representing RX and TX descriptors were consisted of magic numbers
and structures. It'd be more readable rather than before.
- calculates TX duration more accurately for urtw(4) devices.
- style(9)
Ed Schouten [Sat, 17 Oct 2009 08:59:41 +0000 (08:59 +0000)]
Print backspaces after echoing an EOF.
Applications like shells expect EOF to give no graphical output, while
our implementation prints ^D by default (tunable with stty echoctl).
Make the new implementation behave like the old TTY code. Print two
backspaces afterwards.
Jaakko Heinonen [Fri, 16 Oct 2009 20:52:45 +0000 (20:52 +0000)]
- If lstat()/stat() fails with an error other than ENOENT, don't ignore
the error and assume that the file doesn't exist. Touch could return
success with -c option even if the file existed and time was not set.
- If the first utimes_f() call fails with -A option, give up and don't
continue trying to set times to current time. [1]
- Set exit status to 1 when setting of timestamps fails for a directory
or symbolic link even though lstat()/stat() would succeed.
- Don't print bogus error message when rw() succeeds.
John Baldwin [Fri, 16 Oct 2009 19:30:48 +0000 (19:30 +0000)]
Close a race with caching of -ve name lookups in the NFS client.
Specifically, clients only trust -ve cache entries while the directory
remains unchanged and discard any -ve cache entries for a directory when
they notice that the modification time of a directory entry changes. The
race involves two concurrent lookups as follows:
- Thread A does a lookup for file 'foo' which sends a lookup RPC to the
server. The lookup fails and the server replies.
- The 'foo' file is created (either by the same client or a different
client) updating the modification time on the parent directory of 'foo'.
- Thread B does a lookup for a different file 'bar' which updates the
cached attributes of the parent directory of 'foo' to reflect the new
modification time after 'foo' was created.
- Thread A finally resumes execution to parse the reply from the NFS
server. It adds a -ve cache entry and sets the cached value of the
directory's modification time that is used for invalidating -ve cached
lookups to the new modification time set by thread B.
At this point, future lookups of 'foo' will honor the -ve cached entry
until the cached entry is pushed out of the name cache's LRU or the
modification time of the parent directory is changed again by some other
change. The fix is to read the directory's modification time before
sending the lookup RPC and use that cached modification time when setting
the directory's cached modification time. Also, we do not add a -ve cache
entry if another thread has added -ve cache entry that set the directory's
cached modification time to a newer value than the value we read before
sending the lookup RPC.
Jilles Tjoelker [Fri, 16 Oct 2009 16:17:57 +0000 (16:17 +0000)]
sh: Show more information about syntax errors in command substitution:
the line number where the command substitution started.
This applies to both the $() and `` forms but is most useful for ``
because the other line number is relative to the enclosed text there.
(For older versions, -v can be used as a workaround.)
Andrew Thompson [Thu, 15 Oct 2009 20:07:08 +0000 (20:07 +0000)]
Workaround buggy BIOS code in USB regard. By doing the BIOS to OS handover for
all host controllers at the same time, we avoid problems where the BIOS will
actually write to the USB registers of all the USB host controllers every time
we handover one of them, and consequently reset the OS programmed values.
This is useful to test the behaviour of systems that do some kind
of flow classifications and so exhibit different behaviour depending
on the number of flows that hit them.
I plan to add a similar extension to sweep on a range of IP addresses,
so we can issue a single command to flood (obviously, for testing
purposes!) a number of different destinations.
When there is only one destination, we do a preliminary connect()
of the socket so we can use send() instead of sendto().
When we have multiple ports, the socket is not connect()'ed and we
do a sendto() instead. There is a performance hit in this case,
as the throughput on the loopback interface (with a firewall rule
that blocks the transmission) goes down from 900kpps to 490kpps on
my test machine.
If the number of different destinations is limited, one option to
explore is to have multiple connect()ed sockets.
John Baldwin [Thu, 15 Oct 2009 14:54:35 +0000 (14:54 +0000)]
Add a facility for associating optional descriptions with active interrupt
handlers. This is primarily intended as a way to allow devices that use
multiple interrupts (e.g. MSI) to meaningfully distinguish the various
interrupt handlers.
- Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate
a description with an active interrupt handler setup by BUS_SETUP_INTR.
It has a default method (bus_generic_describe_intr()) which simply passes
the request up to the parent device.
- Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports
printf(9) style formatting using var args.
- Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of
an interrupt handler and copy the name passed to intr_event_add_handler()
into that buffer instead of just saving the pointer to the name.
- Add a new intr_event_describe_handler() which appends a description string
to an interrupt handler's name.
- Implement support for interrupt descriptions on amd64 and i386 by having
the nexus(4) driver supply a custom bus_describe_intr method that invokes
a new intr_describe() MD routine which in turn looks up the associated
interrupt event and invokes intr_event_describe_handler().
Requested by: many
Reviewed by: scottl
MFC after: 2 weeks
Luigi Rizzo [Thu, 15 Oct 2009 14:18:35 +0000 (14:18 +0000)]
A small change to avoid calling gettimeofday() too often
(hardwired to once every 20us at most).
I found out that on many machines round here, i could only get
300-400kpps with netsend even on loopback and a 'deny' rule in
the firewall, while reducing the number of calls to gettimeofday()
brings the value to 900kpps and more.
This code is just a quick fix for the problem. Of course it could be
done better, with proper getopt() parsing and the like, but since
this applies to the entire program i'll postpone that to when i have
more time.
Robert Watson [Thu, 15 Oct 2009 10:31:24 +0000 (10:31 +0000)]
Print routing statistics as unsigned short rather than unsigned int,
otherwise sign extension leads to unlikely values when in the negative
range of the signed short structure fields that hold the statistics.
The type used to hold routing statistics is arguably also incorrect.
Hiroki Sato [Thu, 15 Oct 2009 07:58:01 +0000 (07:58 +0000)]
Bump version numbers and update descriptions for the 9-CURRENT
world. The %[no]include.historic knobs are removed because they
are not used for a long time.
Qing Li [Thu, 15 Oct 2009 06:12:04 +0000 (06:12 +0000)]
This patch fixes the following issues in the ARP operation:
1. There is a regression issue in the ARP code. The incomplete
ARP entry was timing out too quickly (1 second timeout), as
such, a new entry is created each time arpresolve() is called.
Therefore the maximum attempts made is always 1. Consequently
the error code returned to the application is always 0.
2. Set the expiration of each incomplete entry to a 20-second
lifetime.
3. Return "incomplete" entries to the application.
Weongyo Jeong [Wed, 14 Oct 2009 20:30:27 +0000 (20:30 +0000)]
fixes a TX hang that could be possible to happen when the trasfers are
in the high speed that some drivers don't call if_start callback after
marking ~IFF_DRV_OACTIVE.
John Baldwin [Wed, 14 Oct 2009 14:13:42 +0000 (14:13 +0000)]
Use zfs_read() instead of xfsread() to read /boot.config. xfsread() fails
short read requests, so the result was that a /boot.config smaller than 512
bytes was ignored. boot2 uses fsread() instead of xfsread() to read
/boot.config already, so this makes zfsboot more like boot2.
Submitted by: Johny Mattsson johny-freebsd of earthmagic org
Reviewed by: dfr
MFC after: 3 days
Bjoern A. Zeeb [Wed, 14 Oct 2009 11:55:55 +0000 (11:55 +0000)]
Unbreak the VIMAGE build with IPSEC, broken with r197952 by
virtualizing the pfil hooks.
For consistency add the V_ to virtualize the pfil hooks in here as well.
MFC after: 55 days
X-MFC after: julian MFCed r197952.
Jilles Tjoelker [Tue, 13 Oct 2009 21:51:50 +0000 (21:51 +0000)]
ls: Make -p not inhibit following symlinks.
According to the man page, when neither -H/-L nor -F/-d/-l are given, -H is
implied. This agrees with POSIX, GNU ls and Solaris ls. This means that -p,
although it is very similar to -F, does not prevent the implicit following
of symlinks.
Bjoern A. Zeeb [Tue, 13 Oct 2009 20:22:12 +0000 (20:22 +0000)]
Immediately after clearing a pending callout that didn't make it due
to the lock we hold, disable interrupts, and announce to the firmware
that we are shutting down. Especially do this before disabling blocks.
This makes some types of machines with asf enabled no longer hang upon
boot, when we start configuring the interface.
John Baldwin [Tue, 13 Oct 2009 19:04:01 +0000 (19:04 +0000)]
Sync with other GENERIC kernel configs:
- Move USB serial drivers earlier to match their placement in other kernel
configs.
- Add descriptions to various USB drivers.
- Move the USB wireless drivers into a new section.
- Add ulscom to the list of USB serial drivers.
John Baldwin [Tue, 13 Oct 2009 18:07:56 +0000 (18:07 +0000)]
Fix this module so it at least builds. Note that it isn't hooked up to
the build however, and ubser(4) is also not present in any kernel configs
(including NOTES).