Bruce Evans [Wed, 5 Mar 2008 11:21:14 +0000 (11:21 +0000)]
Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before. Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.
I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision. However, this was too hard for now. In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed. One
thing that is completely broken now is LDBL_MAX. This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
Bruce Evans [Wed, 5 Mar 2008 11:11:53 +0000 (11:11 +0000)]
Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before. Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.
I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision. However, this was too hard for now. In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed. One
thing that is completely broken now is LDBL_MAX. This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
Craig Rodrigues [Wed, 5 Mar 2008 10:09:29 +0000 (10:09 +0000)]
Expand the nfs_opts array to include all possible string
mount options that mount_nfs could pass down, if it passed
down string mount options. Right now, mount_nfs jut passes
down a single mount option named "nfs_args" with a fully
initialized 'struct nfs_args'.
In future commits, we will add code to the kernel for parsing stringified
NFS mount options, so that we can convert mount_nfs to pass string options
from userspace to kernel, instead of an initialized struct nfs_args.
Craig Rodrigues [Wed, 5 Mar 2008 09:41:22 +0000 (09:41 +0000)]
In nfs_mount(), default initialize struct nfs_args
the same way that it is default initialized in revision 1.77 of mount_nfs.c.
Right now, this is a no-op, because currently we initialize
struct nfs_args in mount_nfs in userspace, and pass it
down into the kernel via nmount(), so we overwrite whatever we initialize
here with the value passed in from userspace.
However, this lays the groundwork for moving away from passing
struct nfs_args from userspace to kernel via nmount(), so that we
can instead pass string mount options via nmount() which can be parsed in
the kernel. This will make it easier to add new NFS mount options.
Craig Rodrigues [Wed, 5 Mar 2008 08:25:49 +0000 (08:25 +0000)]
For a mounted file system which is read-only, when
doing the MNT_RELOAD, pass in "ro" and "update"
string mount options to nmount() instead of MNT_RDONLY and MNT_UPDATE flags.
Due to the complexity of the mount parsing code especially
with respect to the root file system, passing in MNT_RDONLY and MNT_UPDATE
flags would do weird things and would cause fsck to convert the root
file system from a read-only mount to read-write.
To test:
- boot into single user mode
- show mounted file systems with: mount
- root file system should be mounted read-only
- fsck /
- show mounted file systems with: mount
- root file system should still be mounted read-only
Jeff Roberson [Wed, 5 Mar 2008 08:08:32 +0000 (08:08 +0000)]
- Don't overwrite the recently allocated 'nset' in cpuset_setthread() by
passing it to cpuset_which(). Pass in 'set' instead. This argument
is not used but for convenience cpuset_which() nulls all incoming
parameters.
Craig Rodrigues [Wed, 5 Mar 2008 07:55:07 +0000 (07:55 +0000)]
Remove hacks which filter out MNT_ROOTFS.
They are no longer needed now that we filter out MNT_ROOTFS
inside the nmount() call in revision 1.267 of vfs_mount.c.
David Xu [Wed, 5 Mar 2008 07:01:20 +0000 (07:01 +0000)]
Use cpuset defined in pthread_attr for newly created thread, for now,
we set scheduling parameters and cpu binding fully in userland, and
because default scheduling policy is SCHED_RR (time-sharing), we set
default sched_inherit to PTHREAD_SCHED_INHERIT, this saves a system
call.
Pyun YongHyeon [Wed, 5 Mar 2008 05:36:09 +0000 (05:36 +0000)]
Plug memory leak in jumbo buffer allocation failure path.
Patch in the PR was modified to check active jumbo buffers in use
and other possible jumbo buffer leak.
Jumbo buffer usage in lge(4) still wouldn't be reliable due to lack
of driver lock in local jumbo buffer allocator. Either introduce
a new lock to protect jumbo buffer or switch to UMA backed page
allocator for jumbo frame is required.
Jeff Roberson [Wed, 5 Mar 2008 02:10:43 +0000 (02:10 +0000)]
- Remove the -i argument when running a command to simplify things a
little bit and to prevent users from specifying a private mask that may
later restrict other group changes.
- Add a man page which brueffer generously contributed to.
Jeff Roberson [Wed, 5 Mar 2008 01:49:20 +0000 (01:49 +0000)]
- Verify that when a user supplies a mask that is bigger than the kernel
mask none of the upper bits are set.
- Be more careful about enforcing the boundaries of masks and child sets.
- Introduce a few more CPU_* macros for implementing these tests.
- Change the cpusetsize argument to be bytes rather than bits to match
other apis.
Temporarily back out revision 1.98 to give Portmgr some time to
address PR ports/121363 (current day re-opening of PR ports/73797)
to make ports CFLAGS more independent of src/'s CFLAGS WRT aliasing.
Rui Paulo [Tue, 4 Mar 2008 19:16:21 +0000 (19:16 +0000)]
Change the default port range for outgoing connections by introducing
IPPORT_EPHEMERALFIRST and IPPORT_EPHEMERALLAST with values
10000 and 65535 respectively.
The rationale behind is that it makes the attacker's life more
difficult if he/she wants to guess the ephemeral port range and
also lowers the probability of a port colision (described in
draft-ietf-tsvwg-port-randomization-01.txt).
While there, remove code duplication in in_pcbbind_setup().
Submitted by: Fernando Gont <fernando at gont.com.ar>
Approved by: njl (mentor)
Reviewed by: silby, bms
Discussed on: freebsd-net
Back out revision 1.97, which backed out part of revision 1.96.
Change the default CFLAGS to match the simple defaults that the
tinderboxes use. By using -fno-strict-aliasing by default we are
choosing to ignore problems in code which had the potential to
shoot ourselves in the foot.
Alan Cox [Tue, 4 Mar 2008 18:50:15 +0000 (18:50 +0000)]
Add support for automatic promotion of 4KB page mappings to 2MB page
mappings. Automatic promotion can be enabled by setting the tunable
"vm.pmap.pg_ps_enabled" to a non-zero value. By default, automatic
promotion is disabled. (Expect this to change.)
John Baldwin [Tue, 4 Mar 2008 16:54:31 +0000 (16:54 +0000)]
Force an explicit dependency on opt_global.h for all module object files
when building modules as part of a kernel build just as we do for kernel
object files.
Robert Watson [Tue, 4 Mar 2008 12:50:11 +0000 (12:50 +0000)]
Continue on-going campaign to replace lockmgr locks with sx locks where
the specific semantics of ockmgr aren't required: update UFS1 extended
attributes to protect its data structures using an sx lock.
Robert Watson [Tue, 4 Mar 2008 12:10:03 +0000 (12:10 +0000)]
Move setting of MNTK_MPSAFE flag before UFS1 extended attribute
auto-start so that the flag is set before we start performing I/O
in the auto-start routine.
Yaroslav Tykhiy [Tue, 4 Mar 2008 11:25:23 +0000 (11:25 +0000)]
Revise the description of how .Ev MAKEFILE and .Va .MAKEFILE relate.
The most important point is that -f option(s) are never copied from
.Ev MAKEFILE to .Va .MAKEFILE by make(1), which is consistent with
handling the command line. (-f silently sit in .Ev MAKEFILE and go
to make's children unless overwritten via .Va .MAKEFILE)
Warner Losh [Tue, 4 Mar 2008 05:35:27 +0000 (05:35 +0000)]
Linux requires -D__dead2= and -D__unused= to get rid of the
sys/cdef.h-isms in the make source. The variant of linux I tried it
on doesn't have arc4random, so -Darc4random=random too.
David Xu [Tue, 4 Mar 2008 04:28:59 +0000 (04:28 +0000)]
If a new thread is created, it inherits current thread's signal masks,
however if current thread is executing cancellation handler, signal
SIGCANCEL may have already been blocked, this is unexpected, unblock the
signal in new thread if this happens.
Alexander Motin [Mon, 3 Mar 2008 19:36:03 +0000 (19:36 +0000)]
Use more compact LIST instead of TAILQ for session hash.
Add all listening hooks into LIST to simplify searches.
Use ng_findhook() instead of own equal implementation.
Support for Freescale integrated Three-Speed Ethernet Controller (TSEC).
TSEC is the MAC engine offering 10, 100 or 1000 Mbps speed and is found on
different Freescale parts (MPC83xx, MPC85xx). Depending on the silicon version
there are up to four TSEC units integrated on the chip.
This driver also works with the enhanced version of the controller (eTSEC),
which is backwards compatible, but doesn't take advantage of its additional
features (various off-loading mechanisms) at the moment.
Support for Freescale QUad Integrated Communications Controller.
The QUICC engine is found on various Freescale parts including MPC85xx, and
provides multiple generic time-division serial channel resources, which are in
turn muxed/demuxed by the Serial Communications Controller (SCC).
Along with core QUICC/SCC functionality a uart(4)-compliant device driver is
provided which allows for serial ports over QUICC/SCC.
Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family.
The PQ3 is a high performance integrated communications processing system
based on the e500 core, which is an embedded RISC processor that implements
the 32-bit Book E definition of the PowerPC architecture. For details refer
to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E
This port was tested and successfully run on the following members of the PQ3
family: MPC8533, MPC8541, MPC8548, MPC8555.
The following major integrated peripherals are supported:
Pyun YongHyeon [Mon, 3 Mar 2008 04:15:08 +0000 (04:15 +0000)]
Don't map memory/IO resource in device probe and just use PCI
vendor/revision/sub device id of the hardware to probe it.
This is the same way as NetBSD does and it enhances readabilty
a lot.
Robert Watson [Sun, 2 Mar 2008 22:52:14 +0000 (22:52 +0000)]
Don't auto-start or allow extattrctl for UFS2 file systems, as UFS2 has
native extended attributes. This didn't interfere with the operation of
UFS2 extended attributes, but the code shouldn't be running for UFS2.
Robert Watson [Sun, 2 Mar 2008 21:34:17 +0000 (21:34 +0000)]
Rather than copying out the full audit trigger record, which includes
a queue entry field, just copy out the unsigned int that is the trigger
message. In practice, auditd always requested sizeof(unsigned int), so
the extra bytes were ignored, but copying them out was not the intent.
Unify and generalize PowerPC headers, adjust AIM code accordingly.
Rework of this area is a pre-requirement for importing e500 support (and
other PowerPC core variations in the future). Mainly the following
headers are refactored so that we can cover for low-level differences between
various machines within PowerPC architecture:
Bjoern A. Zeeb [Sun, 2 Mar 2008 08:40:47 +0000 (08:40 +0000)]
Some "cleanup" of tcp_mss():
- Move the assigment of the socket down before we first need it.
No need to do it at the beginning and then drop out the function
by one of the returns before using it 100 lines further down.
- Use t_maxopd which was assigned the "tcp_mssdflt" for the corrrect
AF already instead of another #ifdef ? : #endif block doing the same.
- Remove an unneeded (duplicate) assignment of mss to t_maxseg just before
we possibly change mss and re-do the assignment without using t_maxseg
in between.
Reviewed by: silby
No objections: net@ (silence)
MFC after: 5 days
Jeff Roberson [Sun, 2 Mar 2008 08:20:59 +0000 (08:20 +0000)]
Add support for the new cpu topology api:
- When searching for affinity search backwards in the tree from the last
cpu we ran on while the thread still has affinity for the group. This
can take advantage of knowledge of shared L2 or L3 caches among a
group of cores.
- When searching for the least loaded cpu find the least loaded cpu via
the least loaded path through the tree. This load balances system bus
links, individual cache levels, and hyper-threaded/SMT cores.
- Make the periodic balancer recursively balance the highest and lowest
loaded cpu across each link.
Add support for cpusets:
- Convert the cpuset to a simple native cpumask_t while the kernel still
only supports cpumask.
- Pass the derived cpumask down through the cpu_search functions to
restrict the result cpus.
- Make the various steal functions resilient to failure since all threads
can not run on all cpus any longer.
General improvements:
- Precisely track the lowest priority thread on every runq with
tdq_setlowpri(). Before it was more advisory but this ended up having
pathological behaviors.
- Remove many #ifdef SMP conditions to simplify the code.
- Get rid of the old cumbersome tdq_group. This is more naturally
expressed via the cpu_group tree.
Jeff Roberson [Sun, 2 Mar 2008 07:58:42 +0000 (07:58 +0000)]
- Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
properties.
- Provide several convenience functions for creating one and two level
cpu trees as well as a default flat topology. The system now always
has some topology.
- On i386 and amd64 create a seperate level in the hierarchy for HTT
and multi-core cpus. This will allow the scheduler to intelligently
load balance non-uniform cores. Presently we don't detect what level
of the cache hierarchy is shared at each level in the topology.
- Add a mechanism for testing common topologies that have more information
than the MD code is able to provide via the kern.smp.topology tunable.
This should be considered a debugging tool only and not a stable api.
Jeff Roberson [Sun, 2 Mar 2008 07:51:29 +0000 (07:51 +0000)]
Add a simple utility for manipulating cpusets. Man page will be available
soon.
- Lists of cpus may be specified with -l with ranges specified as low-high and
commas between individual cpus and ranges. ie -l 0-2,4,6-8.
- cpuset can modified -p pids, -t tids, or -s cpusetids.
- cpuset can -g get the current mask for any of the above.
Jeff Roberson [Sun, 2 Mar 2008 07:39:22 +0000 (07:39 +0000)]
Add cpuset, an api for thread to cpu binding and cpu resource grouping
and assignment.
- Add a reference to a struct cpuset in each thread that is inherited from
the thread that created it.
- Release the reference when the thread is destroyed.
- Add prototypes for syscalls and macros for manipulating cpusets in
sys/cpuset.h
- Add syscalls to create, get, and set new numbered cpusets:
cpuset(), cpuset_{get,set}id()
- Add syscalls for getting and setting affinity masks for cpusets or
individual threads: cpuid_{get,set}affinity()
- Add types for the 'level' and 'which' parameters for the cpuset. This
will permit expansion of the api to cover cpu masks for other objects
identifiable with an id_t integer. For example, IRQs and Jails may be
coming soon.
- The root set 0 contains all valid cpus. All thread initially belong to
cpuset 1. This permits migrating all threads off of certain cpus to
reserve them for special applications.
Sponsored by: Nokia
Discussed with: arch, rwatson, brooks, davidxu, deischen
Reviewed by: antoine
Kai Wang [Sun, 2 Mar 2008 07:01:01 +0000 (07:01 +0000)]
- Do not malloc buffer for 0-size member when reading from archive.
- Fix a malloc buffer overrun: Use a while loop to check whether
the string buffer is big enough after resizing, since doubling
once might not be enough when a very long member name or symbol
name is provided.
- Fix typo.
Reported by: Michael Plass <mfp49_freebsd@plass-family.net>
Tested by: Michael Plass <mfp49_freebsd@plass-family.net>
Reviewed by: jkoshy
Approved by: jkoshy
Add support for VTOC8 labels (aka sun disk labels). When a label does
not have VTOC information about the partitions, it will be created.
This is because the VTOC information is used for the partition type
and FreeBSD's sunlabel(8) does not create nor use VTOC information.
For this purpose, new tags have been added to support FreeBSD's
partition types.
Make the vm_pmap field of struct vmspace the last field in the
structure. This allows per-CPU variations of struct pmap on a
single architecture without affecting the machine-independent
fields. As such, the PMAP variations don't affect the ABI. They
become part of it.