Reclaim directory vnodes held in namecache if few free vnodes are
available.
Only directory vnodes holding no child directory vnodes held in
v_cache_src are recycled, so that directory vnodes near the root of
the filesystem hierarchy remain in namecache and directory vnodes are
not reclaimed in cascade.
The period of vnode reclaiming attempt and the number of vnodes
attempted to reclaim can be tuned via sysctl(2).
Ian Dowse [Wed, 18 Apr 2001 00:28:37 +0000 (00:28 +0000)]
A few more mountd cleanups:
- Remove some horrible code that faked a "struct addrinfo" to be
later passed to freeaddrinfo(). Instead, add a new group type
"GT_DEFAULT" used to denote that the filesystem is exported to the
world, and treat this case separately.
- Don't clear the AI_CANONNAME flag in a struct addrinfo returned
by getaddrinfo. There's still a bit more struct addrinfo abuse
left in here.
- Simplify do_mount() slightly by using an addrinfo pointer to keep
track of the current address.
Warner Losh [Tue, 17 Apr 2001 23:15:00 +0000 (23:15 +0000)]
When booting, turn on the 3E0 compatibility address for ricoh cardbus
parts. This is based on the newcard code that turns it off :-). We
can now reboot after NEWCARD or Windows and have OLDCARD work. Add
support for the RL5C466 while I'm at it.
Treat TI1031 the same as the CLPD6832. It doesn't work yet, but sucks
less than it did before.
Also add a few #defines for other changes in the pipe.
Ian Dowse [Tue, 17 Apr 2001 22:25:48 +0000 (22:25 +0000)]
Various bugfixes and cleanups, mainly from Martin Blapp:
- Revert del_mlist() to its pre-tirpc prototype. Unlike NetBSD's version,
ours lets the caller generate any syslog() messages, so that it
can include the service name in the message.
- Initialise a few local variables to clarify the logic and avoid some
compiler warnings.
- Remove a few unused functions and local variables, and fix some
whitespace issues.
- Reinstate the logic for avoiding duplicate host entries that got
removed accidentally in revision 1.41 (added in r1.5). This bit
was submitted in a slightly different form by Thomas Quinot.
Submitted by: Martin Blapp <mb@imp.ch>,
Thomas Quinot <quinot@inf.enst.fr>
PR: bin/26148
John Baldwin [Tue, 17 Apr 2001 18:27:55 +0000 (18:27 +0000)]
Save are floating point state in cpu_switch() if needed instead of relying
completely on lazy floating point state saving. This is needed for the
SMP case since processes can migrate to other CPUs.
Note that the previous commit also restored some historical behaviour
in the TCP_COMPAT_42 case (e.g. choosing '1' as the initial sequence
number at boot-time, instead of randomizing it). TCP_COMPAT_42 is the
repository for old security holes, too :-)
John Baldwin [Tue, 17 Apr 2001 17:53:36 +0000 (17:53 +0000)]
Fix an old bug related to BETTER_CLOCK. Call forward_*clock if SMP
and __i386__ are defined rather than if SMP and BETTER_CLOCK are defined.
The removal of BETTER_CLOCK would have broken this except that kern_clock.c
doesn't include <machine/smptests.h>, so it doesn't see the definition of
BETTER_CLOCK, and forward_*clock aren't called, even on 4.x. This seems to
fix the problem where a n-way SMP system would see 100 * n clk interrupts
and 128 * n rtc interrupts.
Andrew Gallatin [Tue, 17 Apr 2001 14:55:09 +0000 (14:55 +0000)]
Improved support for alpha SMP. The following commit gets dual AS2100s
and AS4100s into single user mode. This work was done jointly by jhb and
myself, and builds on dfr's earlier work.
smp_init_secondary() / smp_start_secondary()
- use the uniq val to pass the globalp (me)
- fancy footwork to take any pending machine checks (me)
- doing things the FreeBSD way and getting the per-cpu idleproc created
correctly, and synchronizing the startup of secondaries (jhb)
mp_start()
- better recognition of available cpus (jhb)
smp_rendezvous()
- if smp hasn't started, only run the rendezvous function on the current
cpu. Sleuthing and (prior) incorrect fix by me, correct fix by jhb
smp_handle_ipi()
- more verbose handling of console messages (jhb)
- grab sched lock around setting PS_ASTPENDING (jhb)
forward_*clock()
- commented out. Joint decision by dfr, jhb and myself
General synchronization improvements (more mb()s, etc) (jhb)
Andrew Gallatin [Tue, 17 Apr 2001 14:20:33 +0000 (14:20 +0000)]
Changes to support SMP:
- don't do the stack overflow sanity check on MP systems -- p->p_addr
will be malloc'ed memory (not K0SEG) and the check will fail.
- don't ignore clock interrupts on secondaries. Alphas apparently
roundrobin clock interrupts to all cpus, so we're going to take clock
interrupts on all CPUS and not forward them.
Andrew Gallatin [Tue, 17 Apr 2001 14:15:12 +0000 (14:15 +0000)]
changes to smp_init_secondary_glue():
- use the unique value to save the per-cpu globalp struct like the
comment says
- don't lower the ipl to ALPHA_PSL_IPL_HIGH: we may have a pending machine
check to take and we're not prepared for that yet, as we haven't setup
our interrupt entry points. (this may only happen on sable/lynx)
- indicate the fact that the working version of smp_init_secondary() doesn't
return (this is tied up in other changes and hasn't yet been committed).
By popular demand, have adduser preserve comments at the top of the
group file. Because of the way the group sorting works while printing
out the new file it's not possible at this time to restore comments
in other locations, but at least they won't just disappear altogether.
VOP_BWRITE() was a hack which made it possible for NFS client
side to use struct buf with non-bio backing.
This patch takes a more general approach and adds a bp->b_op
vector where more methods can be added.
The success of this patch depends on bp->b_op being initialized
all relevant places for some value of "relevant" which is not
easy to determine. For now the buffers have grown a b_magic
element which will make such issues a tiny bit easier to debug.
Add fmtcheck(), a function for checking consistency of format string
arguments where the format string is obtained from user data, or
otherwise difficult to verify statically.
checks the format string user_format for consistency (same number/order/
type of format operators) with standard_format. If they differ,
standard_format is used instead to avoid potential crashes or security
violations.
Add debugging option to always read/write cylinder groups as full
sized blocks. To enable this option, use: `sysctl -w debug.bigcgs=1'.
Add debugging option to disable background writes of cylinder
groups. To enable this option, use: `sysctl -w debug.dobkgrdwrite=0'.
These debugging options should be tried on systems that are panicing
with corrupted cylinder group maps to see if it makes the problem
go away. The set of panics in question are:
ffs_clusteralloc: map mismatch
ffs_nodealloccg: map corrupted
ffs_nodealloccg: block not in map
ffs_alloccg: map corrupted
ffs_alloccg: block not in map
ffs_alloccgblk: cyl groups corrupted
ffs_alloccgblk: can't find blk in cyl
ffs_checkblk: partially free fragment
The following panics are less likely to be related to this problem,
but might be helped by these debugging options:
If you try these options, please report whether they helped reduce your
bitmap corruption panics to Kirk McKusick at <mckusick@mckusick.com>
and to Matt Dillon <dillon@earth.backplane.com>.
Robert Watson [Tue, 17 Apr 2001 04:33:34 +0000 (04:33 +0000)]
In my first reading of POSIX.1e, I misinterpreted handling of the
ACL_USER_OBJ and ACL_GROUP_OBJ fields, believing that modification of the
access ACL could be used by privileged processes to change file/directory
ownership. In fact, this is incorrect; ACL_*_OBJ (+ ACL_MASK and
ACL_OTHER) should have undefined ae_id fields; this commit attempts
to correct that misunderstanding.
o Modify arguments to vaccess_acl_posix1e() to accept the uid and gid
associated with the vnode, as those can no longer be extracted from
the ACL passed as an argument. Perform all comparisons against
the passed arguments. This actually has the effect of simplifying
a number of components of this call, as well as reducing the indent
level, but now seperates handling of ACL_GROUP_OBJ from ACL_GROUP.
o Modify acl_posix1e_check() to return EINVAL if the ae_id field of
any of the ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} entries is a value
other than ACL_UNDEFINED_ID. As a temporary work-around to allow
clean upgrades, set the ae_id field to ACL_UNDEFINED_ID before
each check so that this cannot cause a failure in the short term
(this work-around will be removed when the userland libraries and
utilities are updated to take this change into account).
o Modify ufs_sync_acl_from_inode() so that it forces
ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} ae_id fields to ACL_UNDEFINED_ID
when synchronizing the ACL from the inode.
o Modify ufs_sync_inode_from_acl to not propagate uid and gid
information to the inode from the ACL during ACL update. Also
modify the masking of permission bits that may be set from
ALLPERMS to (S_IRWXU|S_IRWXG|S_IRWXO), as ACLs currently do not
carry none-ACCESSPERMS (S_ISUID, S_ISGID, S_ISTXT).
o Modify ufs_getacl() so that when it emulates an access ACL from
the inode, it initializes the ae_id fields to ACL_UNDEFINED_ID.
o Clean up ufs_setacl() substantially since it is no longer possible
to perform chown/chgrp operations using vop_setacl(), so all the
access control for that can be eliminated.
o Modify ufs_access() so that it passes owner uid and gid information
into vaccess_acl_posix1e().
Pointed out by: jedger
Obtained from: TrustedBSD Project
John Baldwin [Tue, 17 Apr 2001 04:18:08 +0000 (04:18 +0000)]
Blow away the panic mutex in favor of using a single atomic_cmpset() on a
panic_cpu shared variable. I used a simple atomic operation here instead
of a spin lock as it seemed to be excessive overhead. Also, this can avoid
recursive panics if, for example, witness is broken.
John Baldwin [Tue, 17 Apr 2001 03:35:38 +0000 (03:35 +0000)]
Check to see if enroll() returns NULL in the witness initialization. This
can happen if witness runs out of resources during initialization or if
witness_skipspin is enabled.
Sleuthing by: Peter Jeremy <peter.jeremy@alcatel.com.au>
Peter Wemm [Tue, 17 Apr 2001 03:03:45 +0000 (03:03 +0000)]
Previous clobbered a work-in-progress. Here is the merged result:
Limit the "pathname" glob to one item, as that is what all users of it
are expecting, except for LIST.
Always glob, instead of when the first character is a ~. For example,
if you had directories ~/x1, and ~/x2, then "cwd x[1]" would fail, but
"cwd ~/x[1]" would work since it was globbed due to the ~ character.
Also, "cwd ~/x[12]" used to arbitarily work as it used the first
expansion (ie: x1) without an error. Make it return '550 ambiguous'
instead of '550 not found' so that the user can see the difference.
For LIST, just use the user supplied string as the popen does the glob.
Problem noticed by: Ajay Mittal <amittal@iprg.nokia.com>
John Baldwin [Tue, 17 Apr 2001 02:51:28 +0000 (02:51 +0000)]
- Add appropriate #ifndef/#define/#endif to protect against multiple
inclusions.
- Blow away all evidence of a static curpcb as curpcb is a per-CPU variable
and this definition is now bogus.
John Baldwin [Tue, 17 Apr 2001 02:50:05 +0000 (02:50 +0000)]
- Fix memory barriers in atomic operations so that the barriers are always
"inside" of locked regions. That is, an acquire atomic operation will
always enforce a memory barrier after the atomic operation and a release
operation will always enforce a memory barrier before the atomic
operation.
- Explicitly use 'mb' instead of 'wmb' in release atomic operations. The
'wmb' memory barrier is not strong enough to guarantee coherence with
other processors. This is effectively a nop since alpha_wmb() actually
performs a 'mb' and not a 'wmb', but I wanted the code to be more
correct since at some point in the future alpha_wmb()'s implementation
may switch to being a real 'wmb'.
John Baldwin [Tue, 17 Apr 2001 02:44:35 +0000 (02:44 +0000)]
In exception_return(), test for usermode before testing the IPL to see if
we should call ast(). This allows us to branch to a separate Lkernelret
label so we can fixup the saved t7 register in the trapframe. Otherwise
we can run into a problem on SMP systems where a process is interrupted by
a trap or interrupt on one CPU, migrates to another CPU, and then returns
with the t7 in the stack clobbering the CPU's t7. As a result, two CPU's
would both point to the same per-CPU data and things would go downhill from
there.
John Baldwin [Tue, 17 Apr 2001 02:41:41 +0000 (02:41 +0000)]
- Stop other CPU's in the SMP case when we enter ddb.
- Add a new ddb command: 'show pcpu' similar to the i386 command added
recently. By default it displays the current CPU's info, but an optional
argument can specify the logical ID of a specific CPU to examine.
Minor background cleanups:
1) Set the FS_NEEDSFSCK flag when unexpected problems are encountered.
2) Clear the FS_NEEDSFSCK flag after a successful foreground cleanup.
3) Refuse to run in background when the FS_NEEDSFSCK flag is set.
4) Avoid taking and removing a snapshot when the filesystem is already clean.
5) Properly implement the force cleaning (-f) flag when in preen mode.
Note that you need to have revision 1.21 (date: 2001/04/14 05:26:28) of
fs.h installed in <ufs/ffs/fs.h> defining FS_NEEDSFSCK for this to compile.
Fix an off-by-2 error in periphdriver_register(). The read side of the
bcopy would go off the end of the array by two elements, which sometimes
causes a panic if it happens to cross into a page that isn't mapped.
Luigi Rizzo [Mon, 16 Apr 2001 06:37:03 +0000 (06:37 +0000)]
New script to help creation of shared readonly diskless partition.
It also has some instructions on how to setup the client and
the server. I have been using this code for over 2 years
on RELENG_3 and later RELENG_4. Have not tried on CURRENT, but
in case there are any issues these are in /etc/rc and
/etc/rc.diskless{12}
This allows you to determine if the file on the other side is the same
as the one you have without transferring the entire file to compare.
Needless to say, if the server end lies to you this check doesn't work,
but on the other hand, if it lies to you about the files checksum,
what can you trust from it ?
Add a more useful solution to the problem of password files with more than
one user who differs only by case. The other perl tools assume (or enforce)
the all lowercase requirement, therefore making the search through
master.passwd case insensitive seemed a reasonable optimization, IMO.
I understand, although I do not sympathize with, the argument that someone
might want to do this on purpose, and might subsequently want to use the
wrong tool for the job. So, this fix should hopefully satisfy both camps.