David E. O'Brien [Sat, 15 Dec 2001 06:02:15 +0000 (06:02 +0000)]
Add some granularity to the WARNS levels.
1: add -Werror
2: -Wall [only], as this is the most used warnings setting by developers
3: our old `1'
4: our old `2'
John Baldwin [Fri, 14 Dec 2001 23:37:35 +0000 (23:37 +0000)]
Fix some nits in fork_exit() so it more properly duplicates the backend
of mi_switch:
- Set the oncpu value for the current thread.
- Always set switchticks, not just in the SMP case.
- Add a KTR entry for fork_exit that is the same as the "new proc"
entry in mi_switch().
- Release sched_lock a bit later like we do with mi_switch().
Sheldon Hearn [Fri, 14 Dec 2001 23:11:45 +0000 (23:11 +0000)]
Kernel support for smbfs is only built on the i386 at the moment, so
limit the building and installation of the userland utilities to that
architecture for now.
John Polstra [Fri, 14 Dec 2001 22:17:54 +0000 (22:17 +0000)]
Make bpf's read timeout feature work more correctly with
select/poll, and therefore with pthreads. I doubt there is any way
to make this 100% semantically identical to the way it behaves in
unthreaded programs with blocking reads, but the solution here
should do the right thing for all reasonable usage patterns.
The basic idea is to schedule a callout for the read timeout when a
select/poll is done. When the callout fires, it ends the select if
it is still in progress, or marks the state as "timed out" if the
select has already ended for some other reason. Additional logic in
bpfread then does the right thing in the case where the timeout has
fired.
Note, I co-opted the bd_state member of the bpf_d structure. It has
been present in the structure since the initial import of 4.4-lite,
but as far as I can tell it has never been used.
David Greenman [Fri, 14 Dec 2001 22:04:58 +0000 (22:04 +0000)]
Disabled input hardware checksum due to it being calculated incorrected
for some packets, in particular small (0 byte payload) packets. May also
be related to TCP options.
At least once mention the long names of WF2Q+ (Worst-case Fair Weighted
Fair Queueing) and RED (Random Early Detection) to both give the reader
a hint what they are and to make it easier to find out more information
about them.
Mike Silbersack [Fri, 14 Dec 2001 18:26:52 +0000 (18:26 +0000)]
Reduce the local network slowstart flightsize from infinity to 4 packets.
Now that we've increased the size of our send / receive buffers, bursting
an entire window onto the network may cause congestion. As a result,
we will slow start beginning with a flightsize of 4 packets.
Problem reported by: Thomas Zenker <thz@Lennartz-electronic.de>
Luigi Rizzo [Fri, 14 Dec 2001 17:56:12 +0000 (17:56 +0000)]
Device Polling code for -current.
Non-SMP, i386-only, no polling in the idle loop at the moment.
To use this code you must compile a kernel with
options DEVICE_POLLING
and at runtime enable polling with
sysctl kern.polling.enable=1
The percentage of CPU reserved to userland can be set with
sysctl kern.polling.user_frac=NN (default is 50)
while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.
Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at
http://info.iet.unipi.it/~luigi/polling/
and also supports polling in the idle loop.
NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.
NOTE to other developers:
sure some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.
Quick description of files touched by this commit:
sys/conf/files.i386
new file kern/kern_poll.c
sys/conf/options.i386
new option
sys/i386/i386/trap.c
poll in trap (disabled by default)
sys/kern/kern_clock.c
initialization and hardclock hooks.
sys/kern/kern_intr.c
minor swi_net changes
sys/kern/kern_poll.c
the bulk of the code.
sys/net/if.h
new flag
sys/net/if_var.h
declaration for functions used in device drivers.
sys/net/netisr.h
NETISR_POLL
sys/dev/fxp/if_fxp.c
sys/dev/fxp/if_fxpvar.h
sys/pci/if_dc.c
sys/pci/if_dcreg.h
sys/pci/if_sis.c
sys/pci/if_sisreg.h
device driver modifications
Luigi Rizzo [Fri, 14 Dec 2001 17:31:58 +0000 (17:31 +0000)]
Let M_LEADINGSPACE write into non-shared mbufs.
A similar thing has been in -stable for weeks and is completely safe.
This has very good performance implications as it saves some data
copying, and sometimes avoids triggering performance bugs in devices
(such as the "dc" and other Tulip clones) which do not like scattered
data.
Luigi Rizzo [Fri, 14 Dec 2001 16:22:41 +0000 (16:22 +0000)]
Add prototypes for main() so that these programs compile with -Werror
(which somehow now seems to be the default for compiling -current).
This error popped up while doing a PicoBSD cross-compile on a 4.3-ish system,
it may well be that there are other apps which have similar problems,
but I did not spot them as they are not included in my picobsd config.
Whether adding prototypes for main() is the correct solution or not
I have no idea, a request to -current on the matter went basically
unanswered. Those who have better ideas are welcome to back this out
and replace it with the correct fix.
Sheldon Hearn [Fri, 14 Dec 2001 12:17:03 +0000 (12:17 +0000)]
Arrange for the smbfs examples to be installed.
We don't install dot.nsmbrc or smbfs.sh.sample, since we already install
the former as /etc/nsmb.conf and the latter is unnecessary, since
boot-time mounts can be arranged directly within /etc/fstab without fear
of breaking the boot when the smbfs port (now unnecessary is removed).
The MFC reminder below is subject to <re@FreeBSD.org> approval
priod to 4.5-RELEASE.
Robert Watson [Fri, 14 Dec 2001 11:21:16 +0000 (11:21 +0000)]
o Clarify the comments on AIO to note that yes, AIO really is unsuitable
for use on machines with untrusted local users, for security as well
as stability reasons.
o Lack of clarity pointed out by: David Rufino <dr@soniq.net> via bugtraq.
Kirk McKusick [Fri, 14 Dec 2001 10:49:15 +0000 (10:49 +0000)]
Add disk I/O scheduling for positively niced processes.
When a positively niced process requests a disk I/O, make
it wait for its nice value of ticks before scheduling its
I/O request if there are any other processes with I/O
requests in the disk queue. For all the gory details, see
the ``Running fsck in the Background'' paper in the Usenix
BSDCon 2002 Conference Proceedings, pages 55-64.
Kirk McKusick [Fri, 14 Dec 2001 05:50:44 +0000 (05:50 +0000)]
Add disk I/O scheduling for positively niced processes.
When a positively niced process requests a disk I/O, make
it wait for its nice value of ticks before scheduling its
I/O request if there are any other processes with I/O
requests in the disk queue. For all the gory details, see
the ``Running fsck in the Background'' paper in the Usenix
BSDCon 2002 Conference Proceedings, pages 55-64.
David Greenman [Fri, 14 Dec 2001 04:41:07 +0000 (04:41 +0000)]
Moved the updating of if_ibytes from ether_demux() to ether_input() to fix
a bug where the interface input bytes count wasn't updated when bridging
is enabled.
Matthew Dillon [Fri, 14 Dec 2001 01:16:57 +0000 (01:16 +0000)]
This fixes a large number of bugs in our NFS client side code. A recent
commit by Kirk also fixed a softupdates bug that could easily be triggered
by server side NFS.
* An edge case with shared R+W mmap()'s and truncate whereby
the system would inappropriately clear the dirty bits on
still-dirty data. (applicable to all filesystems)
THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING.
see vm/vm_page.c line 1641
* The straddle case for VM pages and buffer cache buffers when
truncating. (applicable to NFS client side)
* Possible SMP database corruption due to vm_pager_unmap_page()
not clearing the TLB for the other cpu's. (applicable to NFS
client side but could effect all filesystems). Note: not
considered serious since the corruption occurs beyond the file
EOF.
* When flusing a dirty buffer due to B_CACHE getting cleared,
we were accidently setting B_CACHE again (that is, bwrite() sets
B_CACHE), when we really want it to stay clear after the write
is complete. This resulted in a corrupt buffer. (applicable
to all filesystems but probably only triggered by NFS)
* We have to call vtruncbuf() when ftruncate()ing to remove
any buffer cache buffers. This is still tentitive, I may
be able to remove it due to the second bug fix. (applicable
to NFS client side)
* vnode_pager_setsize() race against nfs_vinvalbuf()... we have
to set n_size before calling nfs_vinvalbuf or the NFS code
may recursively vnode_pager_setsize() to the original value
before the truncate. This is what was causing the user mmap
bus faults in the nfs tester program. (applicable to NFS
client side)
* Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made
by Kirk).
Testing program written by: Avadis Tevanian, Jr.
Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step')
MFC after: 1 week
Kirk McKusick [Fri, 14 Dec 2001 00:15:06 +0000 (00:15 +0000)]
Minimize the time necessary to suspend operations on a filesystem
when taking a snapshot. The two time consuming operations are
scanning all the filesystem bitmaps to determine which blocks
are in use and scanning all the other snapshots so as to be able
to expunge their blocks from the view of the current snapshot.
The bitmap scanning is broken into two passes. Before suspending
the filesystem all bitmaps are scanned. After the suspension,
those bitmaps that changed after being scanned the first time
are rescanned. Typically there are few bitmaps that need to be
rescanned. The expunging of other snapshots is now done after
the suspension is released by observing that we can easily
identify any blocks that were allocated to them after the
suspension (they will be maked as `not needing to be copied'
in the just created snapshot). For all the gory details, see
the ``Running fsck in the Background'' paper in the Usenix
BSDCon 2002 Conference Proceedings, pages 55-64.
Mike Heffner [Thu, 13 Dec 2001 23:46:44 +0000 (23:46 +0000)]
Connect lukemftp to the build as the default ftp client. Lukemftp
supports most of the previous features of FreeBSD ftp, but has been
better maintained and includes new features.
Robert Watson [Thu, 13 Dec 2001 22:09:37 +0000 (22:09 +0000)]
o Back out portions of 1.50 and 1.47, eliminating sonewconn3() and
always deriving the credential for a newly accepted connection from
the listen socket. Previously, the selection of the credential
depended on the protocol: UNIX domain sockets would use the
connecting process's credential, and protocols supporting a creation
of the socket before the receiving end called accept() would use
the listening socket. After this change, it is always the listening
credential.
Alexey Zelkin [Thu, 13 Dec 2001 21:05:27 +0000 (21:05 +0000)]
Also fix cases when thousands separator should be put before number. For
example before for grouping sequence "\003\003" number 123456 was formated
as ",123,456", now "123,456".
Mike Silbersack [Thu, 13 Dec 2001 20:00:45 +0000 (20:00 +0000)]
Limit maxprocperuid to 9/10 maxproc, and limit maxfilesperproc to 9/10
maxfiles. This should make local resource exhaustion attacks easier
to handle with a non-tweaked setup.
Alexey Zelkin [Thu, 13 Dec 2001 19:45:41 +0000 (19:45 +0000)]
Respect locale while handling of \' flag.
In original version grouping was hardcoded. It assumed that thousands
separator should be inserted to separate each 3 numbers. I.e. grouping
string "\003" was assumed for all cases. In correct case (per POSIX)
vfprintf should respect locale defined non-monetary (LC_NUMERIC
category) grouping sequence.
Sheldon Hearn [Thu, 13 Dec 2001 13:08:34 +0000 (13:08 +0000)]
Add module dependency on libmchain.
With this change, mounting an smb share (using mount_smb, which is not
yet included in the tree) without any of smbfs, libiconv or libmchain
compiled into the kernel or loaded works.
John Baldwin [Thu, 13 Dec 2001 10:33:20 +0000 (10:33 +0000)]
Use a per-thread variable for keeping state when a thread is processing
a KTR log entry. Any KTR requests made while working on an entry are
ignored/discarded to prevent recursion. This is a better fix for the
hack to futz with the CPU mask and call getnanotime() if KTR_LOCK or
KTR_WITNESS was on. It also covers the actual formatting of the log entry
including dumping it to the display which the earlier hacks did not.