Peter Wemm [Sat, 20 Jul 2002 02:56:12 +0000 (02:56 +0000)]
Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable
handler in the kernel at the same time. Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment. This is a big help for execing i386 binaries
on ia64. The ELF exec code grows the ability to map partial pages when
there is a page size difference, e.g. emulating 4K pages on 8K or 16K
hardware pages.
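A rough sketch of the partial-page arithmetic, not taken from the ELF exec
code: it assumes a hypothetical 8K hardware page and splits a byte range
into the part covered by whole hardware pages and the head/tail fragments
that would need separate handling. All names (EX_HW_PAGE_SIZE,
ex_split_range, and so on) are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EX_HW_PAGE_SIZE	8192u	/* hypothetical hardware page size */

struct ex_split {
	uint64_t head_len;	/* bytes before the first whole hardware page */
	uint64_t body_len;	/* bytes covered by whole hardware pages */
	uint64_t tail_len;	/* bytes after the last whole hardware page */
};

/* Round an address down to a hardware page boundary. */
static uint64_t
ex_trunc_page(uint64_t addr)
{
	return (addr & ~(uint64_t)(EX_HW_PAGE_SIZE - 1));
}

/* Round an address up to a hardware page boundary. */
static uint64_t
ex_round_page(uint64_t addr)
{
	return (ex_trunc_page(addr + EX_HW_PAGE_SIZE - 1));
}

/* Split [start, end) into partial head, whole-page body, partial tail. */
static struct ex_split
ex_split_range(uint64_t start, uint64_t end)
{
	struct ex_split s = { 0, 0, 0 };
	uint64_t body_start, body_end;

	assert(start <= end);
	body_start = ex_round_page(start);
	body_end = ex_trunc_page(end);
	if (body_start > body_end) {
		/* The range sits inside one hardware page: all partial. */
		s.head_len = end - start;
		return (s);
	}
	s.head_len = body_start - start;
	s.body_len = body_end - body_start;
	s.tail_len = end - body_end;
	return (s);
}
```

For a segment at 0x1000-0x5000 (4K aligned), this reports a 4K head, an 8K
body, and a 4K tail under the assumed 8K hardware page.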
Flesh out the i386 emulation support for ia64. At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.
Peter Wemm [Fri, 19 Jul 2002 21:06:01 +0000 (21:06 +0000)]
Set P_NOLOAD on the pagezero kthread so that it doesn't artificially skew
the loadav. This is not real load. If you have a nice process running in
the background, pagezero may sit in the run queue for ages and add one to
the loadav, thereby affecting other scheduling decisions.
Maxime Henrion [Fri, 19 Jul 2002 16:05:31 +0000 (16:05 +0000)]
- Merge the mount options at MNT_UPDATE time with vfs_mergeopts().
- Sanity check the mount options list (remove duplicates) with
vfs_sanitizeopts().
- Fix some malloc(0)/free(NULL) bugs.
Mark Murray [Fri, 19 Jul 2002 13:38:43 +0000 (13:38 +0000)]
"inline" fixing. Replace "inline" with "__inline" to make more BSD
standard (and easier to define away with support in cdefs.h).
Also convert two function-like macros to static inline functions
for lint and the debugger.
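As an illustration of the macro-to-function conversion (the names below are
made up and are not the two macros touched by this commit), the change has
roughly this shape:

```c
/* Before: function-like macro; no type checking, opaque to a debugger. */
#define EX_ROUNDUP8(x)	(((x) + 7) & ~7)

/* After: the same computation as a static inline function. */
static __inline int
ex_roundup8(int x)
{
	return ((x + 7) & ~7);
}
```

The function form gives lint a typed signature to check and gives the
debugger a symbol to step into, while "__inline" can still be defined away
where the compiler does not support it.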
Ruslan Ermilov [Fri, 19 Jul 2002 07:51:58 +0000 (07:51 +0000)]
Don't install any old cruft present in the tree, including
editor backups, .orig or .rej files, etc. Make transition
from SHARED=symlinks to SHARED=copies and vice versa work.
Add support to UFS2 to provide storage for extended attributes.
As this code is not actually used by any of the existing
interfaces, it seems unlikely to break anything (famous
last words).
The internal kernel interface to manipulate these attributes
is invoked using two new IO_ flags: IO_NORMAL and IO_EXT.
These flags may be specified in the ioflags word of VOP_READ,
VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that
you want to do I/O to the normal data part of the file and
IO_EXT means that you want to do I/O to the extended attributes
part of the file. IO_NORMAL and IO_EXT are mutually exclusive
for VOP_READ and VOP_WRITE, but may be specified individually
or together in the case of VOP_TRUNCATE. For example, when
removing a file, VOP_TRUNCATE is called with both IO_NORMAL
and IO_EXT set. For backward compatibility, if neither IO_NORMAL
nor IO_EXT is set, then IO_NORMAL is assumed.
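The rules above can be sketched as a small validity check. The flag values
and the names (EX_IO_NORMAL, EX_IO_EXT, ex_flags_valid) are placeholders
chosen for this example and are not the kernel's definitions.

```c
#include <stdbool.h>

#define EX_IO_NORMAL	0x0800	/* hypothetical value */
#define EX_IO_EXT	0x0400	/* hypothetical value */

enum ex_op { EX_READ, EX_WRITE, EX_TRUNCATE };

/* Backward compatibility: neither flag set means IO_NORMAL. */
static int
ex_normalize(int ioflags)
{
	if ((ioflags & (EX_IO_NORMAL | EX_IO_EXT)) == 0)
		ioflags |= EX_IO_NORMAL;
	return (ioflags);
}

/* Both flags together are only acceptable for truncate. */
static bool
ex_flags_valid(enum ex_op op, int ioflags)
{
	const int both = EX_IO_NORMAL | EX_IO_EXT;

	ioflags = ex_normalize(ioflags);
	if (op != EX_TRUNCATE && (ioflags & both) == both)
		return (false);
	return (true);
}
```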
Note that the BA_ and IO_ flags have been `merged' so that they
may both be used in the same flags word. This merger is possible
by assigning the IO_ flags to the low sixteen bits and the BA_
flags to the high sixteen bits. This works because the high sixteen
bits of the IO_ word are reserved for read-ahead and help with
write clustering, so will never be used for flags. This merge
lets us get away from code of the form:
	if (ioflags & IO_SYNC)
		flags |= BA_SYNC;
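A minimal sketch of why the merged layout removes that translation step,
using invented flag values (EX_IO_SYNC, EX_BA_CLRBUF) rather than the real
ones: with the two families in disjoint halves of one word, a callee can
test either kind of flag directly.

```c
#include <stdint.h>

#define EX_IO_MASK	0x0000ffffu	/* low sixteen bits: IO_ flags */
#define EX_BA_MASK	0xffff0000u	/* high sixteen bits: BA_ flags */

#define EX_IO_SYNC	0x00000080u	/* hypothetical IO_ flag */
#define EX_BA_CLRBUF	0x00010000u	/* hypothetical BA_ flag */

/* Combine both families into a single flags word. */
static uint32_t
ex_merge_flags(uint32_t ioflags, uint32_t baflags)
{
	return ((ioflags & EX_IO_MASK) | (baflags & EX_BA_MASK));
}

/* The callee checks the IO_ bit as-is; no BA_ equivalent is needed. */
static int
ex_wants_sync(uint32_t flags)
{
	return ((flags & EX_IO_SYNC) != 0);
}
```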
For the future, I have considered adding a new field to the
vattr structure, va_extsize. This addition could then be
exported through the stat structure to allow applications to
find out the size of the extended attribute storage and also
would provide a more standard interface for truncating them
(via VOP_SETATTR rather than VOP_TRUNCATE).
I am also contemplating adding a pathconf parameter (for
concreteness, let's call it _PC_MAX_EXTSIZE) which would
let an application determine the maximum size of the extended
attribute storage.
Alan Cox [Fri, 19 Jul 2002 03:33:04 +0000 (03:33 +0000)]
o Duplicate an odd side-effect of vm_page_wire() in vm_page_allocate()
when VM_ALLOC_WIRED is specified: set the PG_MAPPED bit in flags.
o In both vm_page_wire() and vm_page_allocate() add a comment saying
that setting PG_MAPPED does not belong there.
Clear up confusion in ugly code. ^T gave wrong results for RSS.
I misinterpreted this code when changing it to handle threads.
(there are still issues here)
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
Try to give a more descriptive error message for the pilot error of
attempting to export the non-root of a filesystem with -alldirs. This
pilot error seems to be very common, and the "could not remount" error
message doesn't give many hints about the real reason. See the old PR
below for an example.
While I was at it, make it possible to entirely omit the often
annoying error message in that case by specifying the "quiet" exports
flag. This allows specifying something like
/cdrom -alldirs,ro,quiet <where to export to>
which will silently fail if nothing is mounted under /cdrom, but do
the right thing as soon as you mount something.
While doing this, I've put the embedded example in the exports(5) man
page into a subsection of its own as it ought to be.
Thanks to Paul Southworth for reminding me about this problem.
Matthew Dillon [Thu, 18 Jul 2002 19:06:12 +0000 (19:06 +0000)]
Introduce two new sysctl's:
net.inet.tcp.rexmit_min (default 3 ticks equiv)
This sysctl is the retransmit timer RTO minimum,
specified in milliseconds. This value is
designed for algorithmic stability only.
net.inet.tcp.rexmit_slop (default 200ms)
This sysctl is the retransmit timer RTO slop
which is added to every retransmit timeout and
is designed to handle protocol stack overheads
and delayed ack issues.
Note that the *original* code applied a 1-second
RTO minimum but never applied real slop to the RTO
calculation, so any RTO calculation over one second
would have no slop and thus not account for
protocol stack overheads (TCP timestamps are not
a measure of protocol turnaround!). Essentially,
the original code made the RTO calculation almost
completely irrelevant.
Please note that the 200ms slop is debatable.
This commit is not meant to be a line in the sand,
and if the community winds up deciding that increasing
it is the correct solution then it's easy to do.
Note that larger values will destroy performance
on lossy networks while smaller values may result in
a greater number of unnecessary retransmits.
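How the two knobs combine can be sketched as below. This is an
illustration only: the function name is invented, and the order of
operations (clamp to the minimum first, then add slop) is an assumption
made for the example, not a statement of what the TCP code does.

```c
/*
 * Hypothetical helper: apply the RTO minimum, then add the slop.
 * All values are in milliseconds.
 */
static unsigned int
ex_rto_ms(unsigned int computed_ms, unsigned int min_ms,
    unsigned int slop_ms)
{
	unsigned int base;

	base = (computed_ms < min_ms) ? min_ms : computed_ms;
	return (base + slop_ms);
}
```

With a 200ms slop and an assumed 30ms minimum, a computed RTO of 10ms
becomes 230ms and a computed RTO of 1500ms becomes 1700ms, so the slop is
present even above the old one-second floor.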
Alan Cox [Thu, 18 Jul 2002 17:40:07 +0000 (17:40 +0000)]
o Remove the acquisition and release of Giant from the idle priority thread
that pre-zeroes free pages.
o Remove GIANT_REQUIRED from some low-level page queue functions. (Instead
assertions on the page queue lock are being added to the higher-level
functions, like vm_page_wire(), etc.)
Ruslan Ermilov [Thu, 18 Jul 2002 12:54:55 +0000 (12:54 +0000)]
To force install(1) to always compare files before installing, one
now needs to set COPY=-C as -C is no longer compatible with the -d
option. It is also likely to be renamed to INSTALL_COPY soon.
Update documentation to reflect this change.
Remove the statically allocated array that holds OpenFirmware memory mappings
during pmap_bootstrap. Instead, temporarily help ourselves to some memory
from phys_avail since we won't need it post-bootstrap.
Add an entry for the AMD Elan SC520 hostbridge. I do not believe we can
identify this gadget from the CPUID result alone, so I intend to activate
the necessary magic (i8254 frequency for instance) for it based on the
presence of the on-chip host to PCI bridge.
Peter Wemm [Thu, 18 Jul 2002 10:28:00 +0000 (10:28 +0000)]
(VM_MAX_KERNEL_ADDRESS - KERNBASE) / PAGE_SIZE may not fit in an integer.
Use lmin(long, long), not min(u_int, u_int). This is a problem here on
ia64 which has *way* more than 2^32 pages of KVA. 281474976710655 pages
to be precise.
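The truncation can be reproduced with stand-ins for the two helpers. The
sketch uses fixed-width types so it behaves the same on any host; the
function names and EX_LIMIT are invented for the example, and only the page
count comes from the message above.

```c
#include <stdint.h>

/* Page count quoted above: 2^48 - 1. */
#define EX_KVA_PAGES	INT64_C(281474976710655)
/* Hypothetical second operand, also wider than 32 bits. */
#define EX_LIMIT	(INT64_C(1) << 33)

/* Stand-in for min(u_int, u_int): arguments are cut to 32 bits. */
static uint32_t
ex_min_u32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/* Stand-in for lmin(long, long) on a 64-bit platform. */
static int64_t
ex_lmin(int64_t a, int64_t b)
{
	return (a < b ? a : b);
}
```

Passing EX_KVA_PAGES and EX_LIMIT through the 32-bit version compares
0xffffffff with 0 and returns 0, while the 64-bit version returns EX_LIMIT
as intended.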
Tim J. Robbins [Thu, 18 Jul 2002 10:22:42 +0000 (10:22 +0000)]
Avoid using ints or shorts to store process IDs; use pid_t instead.
The pgrp member of struct job was declared as a short and could not store
every possible process group ID value. The rest of them were benign because
pid_t happens to be an int.
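A small check makes the range problem concrete. The helper name is
invented, and 40000 is just an example ID above SHRT_MAX on the usual
16-bit short.

```c
#include <limits.h>
#include <stdbool.h>
#include <sys/types.h>

/* Report whether a process (group) ID survives storage in a short. */
static bool
ex_fits_in_short(pid_t id)
{
	return (id >= SHRT_MIN && id <= SHRT_MAX);
}
```

With a 16-bit short, ex_fits_in_short(40000) is false, so such a process
group ID could not be stored in the old field without being altered.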
Warner Losh [Thu, 18 Jul 2002 08:13:45 +0000 (08:13 +0000)]
Integrate the hw.pcic.pd6722_vsense tunable from the nomads list.
This allows one to select the method of 3.3V card detection from the
three possible choices (none (0), the "6710 way" (1) and the "6729
way" (2)). The default is the 6710 way, since it works in most
cases. The datasheets for the 6722 suggest that the '29 way is more
correct, but experience has shown this method to cause some laptops to
hang solid. See source code for details until I update the man page.
Warner Losh [Thu, 18 Jul 2002 08:05:00 +0000 (08:05 +0000)]
Some strange hacks for the clpd6729:
o It needs to have the pcic_isa_intr interrupt handler
o for pci interrupts, in the func interrupt handler it needs to check the isa
registers rather than the pci ones for card present.
o better commentary for some of the strangeness of the 6729 on pci
o fix some crunchy comments to better reflect reality.
With this I almost have the WL200 working, but an interrupt storm
after attach is causing problems for reasons unknown. This code
doesn't seem to break the normal clpd6729 case, and I'd like others
with 6729 cards to try to test it (there were some that were used for
external pccard slots in pci only systems).
Warner Losh [Thu, 18 Jul 2002 06:01:35 +0000 (06:01 +0000)]
The Compaq WL200 is a CL-PD6729 based pci card with a prism 2 pcmcia
card behind it (without the pcmcia form factor). This entry gets to
the point of attaching, but there's something wrong with the '29
support, so it doesn't quite work yet.
Alan Cox [Thu, 18 Jul 2002 04:08:10 +0000 (04:08 +0000)]
o Introduce an argument, VM_ALLOC_WIRED, that requests vm_page_alloc()
to return a wired page.
o Use VM_ALLOC_WIRED within Alpha's pmap_growkernel(). Also, because
Alpha's pmap_growkernel() calls vm_page_alloc() from within a critical
section, specify VM_ALLOC_INTERRUPT instead of VM_ALLOC_SYSTEM. (Only
VM_ALLOC_INTERRUPT is implemented entirely with a spin mutex.)
o Assert that the page queues mutex is held in vm_page_wire()
on Alpha, just like the other platforms.
Peter Wemm [Wed, 17 Jul 2002 23:43:55 +0000 (23:43 +0000)]
ia64 does not have the same degree of stealth include file nesting,
so it needs an explicit #include <machine/frame.h> to get 'struct
trapframe'. The fact that it needs this at this level is rather bogus
but it will not compile without it.
Peter Wemm [Wed, 17 Jul 2002 23:21:59 +0000 (23:21 +0000)]
Cap the initial PV and PTE table preallocations. Otherwise we explode
on the Itanium2 system I have when we use up *all* of the initial 256MB
direct mapped region before we are ready to dynamically expand it.
The machine that I have has 4 cpus and a very big hole in the middle.
This makes the bogus '(last_address - first_address) / PAGE_SIZE'
calculations especially dangerous and caused many millions of initial
PV/PTE's to be preallocated.
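The shape of the fix can be sketched as a clamp on the naive estimate. The
function and parameter names are hypothetical, and the cap value is left to
the caller since the message does not state one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the initial entry count from the address span, then cap it.
 * A large hole between first_addr and last_addr inflates the estimate,
 * which is why the cap matters.
 */
static uint64_t
ex_initial_entries(uint64_t first_addr, uint64_t last_addr,
    uint64_t page_size, uint64_t cap)
{
	uint64_t n;

	assert(page_size != 0 && first_addr <= last_addr);
	n = (last_addr - first_addr) / page_size;
	return (n > cap ? cap : n);
}
```

For a 1TB span with 8K pages the raw estimate is 134217728 entries; with an
illustrative cap of 100000 the function returns 100000 instead.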
Peter Wemm [Wed, 17 Jul 2002 23:17:49 +0000 (23:17 +0000)]
Be sure to use a logical address for the SAL table. For some reason the
physical address is still mapped at this stage of boot on the Itanium1
SDV boxes we have. But Itanium2 does *not* let us get away with this.
Peter Wemm [Wed, 17 Jul 2002 21:47:05 +0000 (21:47 +0000)]
Avoid trying to set PG_G on the first 4MB when we set up the 4MB page.
This solves the SMP panic for at least one system. I'd still like to know
why my xeon works though.
Fix setting parameters for getipnodebyaddr(3):
o "struct addrinfo" contains a pointer to "struct sockaddr,"
not "struct sockaddr" itself
o the function takes a pointer to "struct in*_addr", not to
"struct sockaddr," so the address length must correspond
Warner Losh [Wed, 17 Jul 2002 06:02:07 +0000 (06:02 +0000)]
o Remove workaround that I put in to mask the BadVcc problem.
o Add preliminary support for Cirrus Logic CL-PD6729 using PCI
interrupts. To use it you need to set hw.pcic.pd6729_intr_path
to 2. This allows us to still default to the ISA interrupt path for
this part (which is found much more often in laptops using ISA IRQs).
But some PCI cards have this part on them and this should allow them
to be used. It is untested on PCI, but it seems to not break the ISA
case.
o Better sysctl descriptions (I hope).
Warner Losh [Wed, 17 Jul 2002 05:50:06 +0000 (05:50 +0000)]
Be more conservative about the address ranges we assign. Some
machines don't like the more liberal default, so be more conservative
about what we do by default.
Matthew Dillon [Wed, 17 Jul 2002 05:41:43 +0000 (05:41 +0000)]
Qualify comment on machdep.cpu_idle_hlt. Turning this on on an SMP
machine will result in approximately a 4.2% loss of performance (buildworld)
and approximately a 5% reduction in power consumption (when idle). Add XXX
note on how to really make hlt work (send an IPI to wakeup HLTed cpus on
a thread-schedule event? Generate an interrupt somehow?).