marius [Wed, 1 Feb 2012 21:14:04 +0000 (21:14 +0000)]
MFC: r230632
- Now that we have a working OF_printf() since r230631 (MFC'ed to stable/9
in r230884), use it for implementing a simple OF_panic() that may be used
during the early cycles when panic() isn't available, yet.
- Mark cpu_{exit,shutdown}() as __dead2 as appropriate.
marius [Wed, 1 Feb 2012 21:09:59 +0000 (21:09 +0000)]
MFC: r230630
For machines where the kernel address space is unrestricted increase
VM_KMEM_SIZE_SCALE to 2, awaiting more insight from alc@. As it turns
out, the VM apparently has problems with machines that have large holes
in the physical address space, causing the kmem_suballoc() call in
kmeminit() to fail with a VM_KMEM_SIZE_SCALE of 1. Using a value of 2
allows these, namely Blade 1500 with 2GB of RAM, to boot.
mav [Wed, 1 Feb 2012 17:56:38 +0000 (17:56 +0000)]
MFC r228820, r228851:
Merge to da driver quirks hinting 4K physical sector sizes for SATA disks
connected via SAS or USB. Unluckily I've found that SAS (mps) and USB-SATA
I have translate models in different ways, requiring twice more quirks.
Unluckily for Hitachi, their model names are trimmed on SAS, making
impossible to identify 4K sector drives that way.
glebius [Wed, 1 Feb 2012 15:57:49 +0000 (15:57 +0000)]
Merge some cleanups and bugfixes to pfsync(4) and pf(4) from head. Merged
revisions: r229773,229777,229849-229853,229857,229959,229961-229964,229976.
r229777:
Merge from OpenBSD:
revision 1.170
date: 2011/10/30 23:04:38; author: mikeb; state: Exp; lines: +6 -7
Allow setting big MTU values on the pfsync interface but not larger
than the syncdev MTU. Prompted by the discussion with and tested
by Maxim Bourmistrov; ok dlg, mpf
Consistently use sc_ifp->if_mtu in the MTU check throughout the
module. This backs out r228813.
r229849:
o Fix panic on module unload, that happened due to mutex being
destroyed prior to pfsync_uninit(). To do this, move all the
initialization to the module_t method, instead of SYSINIT(9).
o Fix another panic after module unload, due to not clearing the
m_addr_chg_pf_p pointer.
o Refuse to unload module, unless being unloaded forcibly.
o Revert the sub argument to MODULE_DECLARE, to the stable/8 value.
r229850:
Bunch of fixes to pfsync(4) module load/unload:
o Make the pfsync.ko actually usable. Before this change loading it
didn't register protosw, so was a nop. However, a module /boot/kernel
did confused users.
o Rewrite the way we are joining multicast group:
- Move multicast initialization/destruction to separate functions.
- Don't allocate memory if we aren't going to join a multicast group.
- Use modern API for joining/leaving multicast group.
- Now the utterly wrong pfsync_ifdetach() isn't needed.
o Move module initialization from SYSINIT(9) to moduledata_t method.
o Refuse to unload module, unless asked forcibly.
o Improve a bit some FreeBSD porting code:
- Use separate malloc type.
- Simplify swi sheduling.
r229857:
Can't pass MSIZE to m_cljget(), an mbuf can't be attached as external storage
to another mbuf.
r229963:
Add necessary locking in pfsync_in_ureq().
r229976:
Redo r226660:
- Define schednetisr() to swi_sched.
- In the swi handler check if there is some data prepared,
and if true, then call pfsync_sendout(), however tell it
not to schedule swi again.
- Since now we don't obtain the pfsync lock in the swi handler,
don't use ifqueue mutex to synchronize queue access.
r229773, r229851, r229959, r229961, r229962, r229964 - minor cleanups.
ken [Tue, 31 Jan 2012 23:04:58 +0000 (23:04 +0000)]
MFC: 230000, 230544
Fix a race condition in CAM peripheral free handling, locking
in the CAM XPT bus traversal code, and a number of other periph level
issues.
r230544 | ken | 2012-01-25 10:58:47 -0700 (Wed, 25 Jan 2012) | 9 lines
Fix a bug introduced in r230000. We were eliminating all LUNs on a target
in response to CAM_DEV_NOT_THERE, instead of just the LUN in question.
This will now just eliminate the specified LUN in response to
CAM_DEV_NOT_THERE.
Reported by: Richard Todd <rmtodd@servalan.servalan.com>
r230000 | ken | 2012-01-11 17:41:48 -0700 (Wed, 11 Jan 2012) | 72 lines
Fix a race condition in CAM peripheral free handling, locking
in the CAM XPT bus traversal code, and a number of other periph level
issues.
cam_periph.h,
cam_periph.c: Modify cam_periph_acquire() to test the CAM_PERIPH_INVALID
flag prior to allowing a reference count to be gained
on a peripheral. Callers of this function will receive
CAM_REQ_CMP_ERR status in the situation of attempting to
reference an invalidated periph. This guarantees that
a peripheral scheduled for a deferred free will not
be accessed during its wait for destruction.
Panic during attempts to drop a reference count on
a peripheral that already has a zero reference count.
In cam_periph_list(), use a local sbuf with SBUF_FIXEDLEN
set so that mallocs do not occur while the xpt topology
lock is held, regardless of the allocation policy of the
passed in sbuf.
Add a new routine, cam_periph_release_locked_buses(),
that can be called when the caller already holds
the CAM topology lock.
Add some extra debugging for duplicate peripheral
allocations in cam_periph_alloc().
Treat CAM_DEV_NOT_THERE much the same as a selection
timeout (AC_LOST_DEVICE is emitted), but forgo retries.
cam_xpt.c: Revamp the way the EDT traversal code does locking
and reference counting. This was broken, since it
assumed that the EDT would not change during
traversal, but that assumption is no longer valid.
So, to prevent devices from going away while we
traverse the EDT, make sure we properly lock
everything and hold references on devices that
we are using.
The two peripheral driver traversal routines should
be examined. xptpdperiphtraverse() holds the
topology lock for the entire time it runs.
xptperiphtraverse() is now locked properly, but
only holds the topology lock while it is traversing
the list, and not while the traversal function is
running.
The bus locking code in xptbustraverse() should
also be revisited at a later time, since it is
complex and should probably be simplified.
scsi_da.c: Pay attention to the return value from cam_periph_acquire().
Return 0 always from daclose() even if the disk is now gone.
Add some rudimentary error injection support.
scsi_sg.c: Fix reference counting in the sg(4) driver.
The sg driver was calling cam_periph_release() on close,
but never called cam_periph_acquire() (which increments
the reference count) on open.
The periph code correctly complained that the sg(4)
driver was trying to decrement the refcount when it
was already 0.
attilio [Tue, 31 Jan 2012 01:45:20 +0000 (01:45 +0000)]
MFC r227758,227759,227788:
Introduce macro stubs in the mutex and sxlock implementation that will
be always defined and will allow consumers, willing to provide options,
file and line to locking requests, to not worry about options
redefining the interfaces.
This is typically useful when there is the need to build another
locking interface on top of the mutex one.
trociny [Mon, 30 Jan 2012 19:32:33 +0000 (19:32 +0000)]
MFC r227839, r230146:
r227839:
Now kvm_getenvv() and kvm_getargv() don't need procfs(5).
r230146:
In kvm_argv(), the case when the supplied buffer was too short to hold the
requested value was handled incorrectly, and the function retuned NULL
instead of the truncated result.
Fix this and also remove unnecessary check for buf != NULL, which alway
retuns true.
truckman [Mon, 30 Jan 2012 07:20:52 +0000 (07:20 +0000)]
MFC r230064:
Allow an MBR primary or extended Linux swap partition to be specified
as the system dump device. This was already allowed for GPT. The Linux
swap metadata at the beginning of the partition should not be disturbed
because the crash dump is written at the end.
In procfs_doproccmdline() if arguments are not cashed read them from
the process stack.
Suggested by: kib
Reviewed by: kib
Tested by: pho
MFC r227836:
Retire linprocfs_doargv(). Instead use new functions, proc_getargv()
and proc_getenvv(), which were implemented using linprocfs_doargv() as
a reference.
In sysctl_kern_proc_auxv the process was released too early: we still
need to hold it when checking process sv_flags.
r228030, r228046:
Add sysctl to retrieve ps_strings structure location of another process.
Suggested by: kib
Reviewed by: kib
r228264:
In sysctl_kern_proc_ps_strings() there is no much sense in checking
for P_WEXIT and P_SYSTEM flags.
Reviewed by: kib
r228288, r228302:
Protect kern.proc.auxv and kern.proc.ps_strings sysctls with p_candebug().
Citing jilles:
If we are ever going to do ASLR, the AUXV information tells an attacker
where the stack, executable and RTLD are located, which defeats much of
the point of randomizing the addresses in the first place.
Given that the AUXV information seems to be used by debuggers only anyway,
I think it would be good to move it to p_candebug() now.
The full virtual memory maps (KERN_PROC_VMMAP, procstat -v) are already
under p_candebug().
Suggested by: jilles
Discussed with: rwatson
r228648:
On start most of sysctl_kern_proc functions use the same pattern:
locate a process calling pfind() and do some additional checks like
p_candebug(). To reduce this code duplication a new function pget() is
introduced and used.
As the function may be useful not only in kern_proc.c it is in the
kernel name space.
Suggested by: kib
Reviewed by: kib
r228666:
Fix style and white spaces.
MFC r230145:
Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want
to read strings completely to know the actual size.
As a side effect it fixes the issue with kern.proc.args and kern.proc.env
sysctls, which didn't return the size of available data when calling
sysctl(3) with the NULL argument for oldp.
Note, in get_ps_strings(), which does actual work for proc_getargv() and
proc_getenvv(), we still have a safety limit on the size of data read in
case of a corrupted procces stack.
Suggested by: kib
r230470:
Change kern.proc.rlimit sysctl to:
- retrive only one, specified limit for a process, not the whole
array, as it was previously (the sysctl has been added recently and
has not been backported to stable yet, so this change is ok);
- allow to set a resource limit for another process.
marius [Sun, 29 Jan 2012 14:55:20 +0000 (14:55 +0000)]
MFC: r228857
On FreeBSD just use the MD5 implementation of libmd rather than that of
libcrypto so we don't need to relinquish csup when world is built without
OpenSSL.
marius [Sun, 29 Jan 2012 12:58:06 +0000 (12:58 +0000)]
MFC: r228211
It doesn't make much sense to check whether child is NULL after already
having dereferenced it. We either should generally check the device_t's
supplied to bus functions before using them (which we seem to virtually
never do) or just assume that they are not NULL.
While at it make this code fit 78 columns.
marius [Sun, 29 Jan 2012 12:56:18 +0000 (12:56 +0000)]
MFC: r228209
- In device_probe_child(9) check the return value of device_set_driver(9)
when actually setting a driver as especially ENOMEM is fatal in these
cases.
- Annotate other calls to device_set_devclass(9) and device_set_driver(9)
without the return value being checked and that are okay to fail.
marius [Sun, 29 Jan 2012 12:54:31 +0000 (12:54 +0000)]
MFC: r228027
Move the scsi_da_bios_params() prototype from pc98_machdep.h to md_var.h
where the prototype for pc98_ata_disk_firmware_geom_adjust() also lives
in order to avoid an #ifdef'ed include in cam(4).
mckusick [Sun, 29 Jan 2012 08:03:45 +0000 (08:03 +0000)]
MFC r230249:
Make sure all intermediate variables holding mount flags (mnt_flag)
and that all internal kernel calls passing mount flags are declared
as uint64_t so that flags in the top 32-bits are not lost.
MFC r230250:
There are several bugs/hangs when trying to take a snapshot on a UFS/FFS
filesystem running with journaled soft updates. Until these problems
have been tracked down, return ENOTSUPP when an attempt is made to
take a snapshot on a filesystem running with journaled soft updates.
marius [Sun, 29 Jan 2012 01:32:24 +0000 (01:32 +0000)]
MFC: r227907, r22791 (for diff reduction)
Add BCM5785 but wrap it in #ifdef notyet for now. According to yongari@ there
are issues probably needing workarounds in bge(4) when brgphy(4) handles this
PHY. Letting ukphy(4) handle it instead results in a working configuration,
although likely with performance penalties.
marius [Sun, 29 Jan 2012 01:00:11 +0000 (01:00 +0000)]
MFC: r227687, r228290
- Add a hint.miibus.X.phymask hint, allowing do individually exclude PHY
addresses from being probed and attaching something including ukphy(4)
to it. This is mainly necessarily for PHY switches that create duplicate
or fake PHYs on the bus that can corrupt the PHY state when accessed or
simply cause problems when ukphy(4) isolates the additional instances.
- Change miibus(4) to be a hinted bus, allowing to add child devices via
hints and to set their attach arguments (including for automatically
probed PHYs). This is mainly needed for PHY switches that violate IEEE
802.3 and don't even implement the basic register set so we can't probe
them automatically. However, the ability to alter the attach arguments
for automatically probed PHYs is also useful as for example it allows
to test (or tell a user to test) new variant of a PHY with a specific
driver by letting an existing driver attach to it via manipulating the
IDs without the need to touch the source code or to limit a Gigabit
Ethernet PHY to only announce up to Fast Ethernet in order to save
energy by limiting the capability mask. Generally, a driver has to
be hinted via hint.phydrv.X.at="miibusY" and hint.phydrv.X.phyno="Z"
(which already is sufficient to add phydrvX at miibusY at PHY address
Z). Then optionally the following attach arguments additionally can
be configured:
hint.phydrv.X.id1
hint.phydrv.X.id2
hint.phydrv.X.capmask
- Some minor cleanup.
marius [Sun, 29 Jan 2012 00:50:41 +0000 (00:50 +0000)]
MFC: r227685
- There's no need to ignore the return value of mii_attach(9) when attaching
dcphy(4) (CID 9283).
- In dc_detach(), check whether ifp is NULL as dc_attach() may call the
former without ifp being allocated (CID 4288).
marius [Sun, 29 Jan 2012 00:32:37 +0000 (00:32 +0000)]
MFC: r226057
- Currently, sched_balance_pair() may cause a CPU to send an IPI_PREEMPT to
itself, which sparc64 hardware doesn't support. One way to solve this
would be to directly call sched_preempt() instead of issuing a self-IPI.
However, quoting jhb@:
"On the other hand, you can probably just skip the IPI entirely if we are
going to send it to the current CPU. Presumably, once this routine
finishes, the current CPU will exit softlock (or will do so "soon") and
will then pick the next thread to run based on the adjustments made in
this routine, so there's no need to IPI the CPU running this routine
anyway. I think this is the better solution. Right now what is probably
happening on other platforms is as soon as this routine finishes the CPU
processes its self-IPI and causes mi_switch() which will just switch back
to the softclock thread it is already running."
- With r226054 (MFC'ed to stable/9 in r230690) and the the above change in
place, sparc64 now no longer is incompatible with ULE and vice versa.
However, powerpc/E500 still is.
marius [Sun, 29 Jan 2012 00:24:46 +0000 (00:24 +0000)]
MFC: r226054
- Use atomic operations rather than sched_lock for safely assigning pm_active
and pc_pmap for SMP. This is key to allowing adding support for SCHED_ULE.
Thanks go to Peter Jeremy for additional testing.
- Add support for SCHED_ULE to cpu_switch().
marius [Sat, 28 Jan 2012 23:53:06 +0000 (23:53 +0000)]
MFC: r225931, r225932, r227000
Make sparc64 compatible with NEW_PCIB and enable it:
- Implement bus_adjust_resource() methods as far as necessary and in non-PCI
bridge drivers as far as feasible without rototilling them.
- As NEW_PCIB does a layering violation by activating resources at layers
above pci(4) without previously bubbling up their allocation there, move
the assignment of bus tags and handles from the bus_alloc_resource() to
the bus_activate_resource() methods like at least the other NEW_PCIB
enabled architectures do. This is somewhat unfortunate as previously
sparc64 (ab)used resource activation to indicate whether SYS_RES_MEMORY
resources should be mapped into KVA, which is only necessary if their
going to be accessed via the pointer returned from rman_get_virtual() but
not for bus_space(9) as the later always uses physical access on sparc64.
Besides wasting KVA if we always map in SYS_RES_MEMORY resources, a driver
also may deliberately not map them in if the firmware already has done so,
possibly in a special way. So in order to still allow a driver to decide
whether a SYS_RES_MEMORY resource should be mapped into KVA we let it
indicate that by calling bus_space_map(9) with BUS_SPACE_MAP_LINEAR as
actually documented in the bus_space(9) page. This is implemented by
allocating a separate bus tag per SYS_RES_MEMORY resource and passing the
resource via the previously unused bus tag cookie so we later on can call
rman_set_virtual() in sparc64_bus_mem_map(). As a side effect this now
also allows to actually indicate that a SYS_RES_MEMORY resource should be
mapped in as cacheable and/or read-only via BUS_SPACE_MAP_CACHEABLE and
BUS_SPACE_MAP_READONLY respectively.
- Do some minor cleanup like taking advantage of rman_init_from_resource(),
factor out the common part of bus tag allocation into a newly added
sparc64_alloc_bus_tag(), hook up some missing newbus methods and replace
some homegrown versions with the generic counterparts etc.
- While at it, let apb_attach() (which can't use the generic NEW_PCIB code
as APB bridges just don't have the base and limit registers implemented)
regarding the config space registers cached in pcib_softc and the SYSCTL
reporting nodes set up.
marius [Sat, 28 Jan 2012 23:26:50 +0000 (23:26 +0000)]
MFC: r225891
Re-reading the Schizo errata suggests that it's actually tolerable to
also use the streaming buffer of pre version 5/revision 2.3 hardware as
long as we stay away from context flushes (which iommu(4) so far doesn't
take advantage of). OpenSolaris does the same.
marius [Sat, 28 Jan 2012 23:25:28 +0000 (23:25 +0000)]
MFC: r225890
- Add protective parentheses to macros as far as possible.
- Move {r,w,}mb() to the top of this file where they live on most of the
other architectures.
marius [Sat, 28 Jan 2012 23:24:03 +0000 (23:24 +0000)]
MFC: r225889, r228222
In total store which we use for running the kernel and all of the userland
atomic operations behave as if they were followed by a CPU memory barrier
so there's no need to include ones in the acquire variants of atomic(9) and
it's sufficient to just use include compiler memory barriers to satisfy
the requirements of atomic(9). Removing the CPU memory barriers results in
a small performance improvement, specifically this is sufficient to
compensate the performance loss seen in the worldstone benchmark seen when
using SCHED_ULE instead of SCHED_4BSD.
This change is inspired by Linux even more radically doing the equivalent
thing some time ago.
Thanks go to Peter Jeremy for additional testing.
marius [Sat, 28 Jan 2012 23:15:02 +0000 (23:15 +0000)]
MFC: r225886
- Right-justify backslashes as suggested by style(9).
- Rename ATOMIC_INC_ULONG to ATOMIC_INC_LONG in order to be consistent with
the names of the other macros in this file an adjust accordingly.
marius [Sat, 28 Jan 2012 23:12:55 +0000 (23:12 +0000)]
MFC: r228022, r228026
For sparc64 also adjust the geometry of da(4) driven disks to not overflow
the 16-bit cylinders field of the VTOC8 disk label (at around 502GB). The
geometry chosen for disks above that limit allows to use disks up to 2TB,
which is the limit of the extended VTOC8 format. The geometry used for
disks smaller than the 16-bit cylinders limit stays the same as used by
cam_calc_geometry(9) for extended translation.
Thanks to Hans-Joerg Sirtl for providing hardware for testing this change.
rmacklem [Sat, 28 Jan 2012 01:45:19 +0000 (01:45 +0000)]
MFC: r230100
Tai Horgan reported via email that there were two places in
the new NFSv4 server where the code follows the wrong list.
Fortunately, for these fairly rare cases, the lc_stateid[]
lists are normally empty. This patch fixes the code to
follow the correct list.
qingli [Fri, 27 Jan 2012 02:13:27 +0000 (02:13 +0000)]
MFC 227460
A default route learned from the RAs could be deleted manually
after its installation. This removal may be accidental and can
prevent the default route from being installed in the future if
the associated default router has the best preference. The cause
is the lack of status update in the default router on the state
of its route installation in the kernel FIB. This patch fixes
the described problem.
dumbbell [Thu, 26 Jan 2012 19:46:13 +0000 (19:46 +0000)]
MFC r228259:
Support domain-search in dhclient(8)
The "domain-search" option (option 119) allows a DHCP server to publish
a list of implicit domain suffixes used during name lookup. This option
is described in RFC 3397.
For instance, if the domain-search option says:
".example.org .example.com"
and one wants to resolve "foobar", the resolver will try:
1. "foobar.example.org"
2. "foobar.example.com"
The file /etc/resolv.conf is updated with a "search" directive if the
DHCP server provides "domain-search".
A regression test suite is included in this patch under
tools/regression/sbin/dhclient.
MFC r229000:
Invalid Domain Search option isn't considered as a fatal error
In the original Domain Search option patch, an invalid option value
would cause the whole lease to be rejected. However, DHCP servers who
emit such an invalid value are more common than I thought. With this new
patch, just the option is rejected, not the entire lease.
glebius [Wed, 25 Jan 2012 13:47:55 +0000 (13:47 +0000)]
Merge r230127 from head/:
Restore functionality to pack several kernels into release. All
kernels specified by KERNCONF are built and packed into release.
The first one is packed into kernel.txz, all others to
kernel.CONFIG.txz.
rmacklem [Wed, 25 Jan 2012 01:45:19 +0000 (01:45 +0000)]
MFC: r229956
jwd@ reported via email that the "CacheSize" field reported by "nfsstat -e -s"
would go negative after using the "-z" option to zero out the stats.
This patch fixes that by not zeroing out the srvcache_size field
for "-z", since it is the size of the cache and not a counter
pluknet [Tue, 24 Jan 2012 10:28:19 +0000 (10:28 +0000)]
MFC r230256:
Fix the "lock &zrl->zr_mtx already initialized" assertion by initializing
the allocated memory before calling mtx_init(9) on mtx pointing to it.
Otherwize, random contents of uninitialized memory might occasionally
trigger the assertion.
Reported by: Pavel Polyakov <bsd kobyla org>
Reviewed by: pjd
jh [Mon, 23 Jan 2012 16:28:35 +0000 (16:28 +0000)]
MFC r229694:
r222004 changed sbuf_finish() to not clear the buffer error status. As a
consequence sbuf_len() will return -1 for buffers which had the error
status set prior to sbuf_finish() call. This causes a problem in
pfs_read() which purposely uses a fixed size sbuf to discard bytes which
are not needed to fulfill the read request.
Work around the problem by using the full buffer length when
sbuf_finish() indicates an overflow. An overflowed sbuf with fixed size
is always full.
pho [Sun, 22 Jan 2012 18:27:24 +0000 (18:27 +0000)]
MFC: r228360
Move cpu_set_upcall(newtd, td) up before the first call of
thread_free(newtd). This to avoid a possible page fault in
cpu_thread_clean() as seen on amd64 with syscall fuzzing.
rmacklem [Sun, 22 Jan 2012 05:16:31 +0000 (05:16 +0000)]
MFC: r229802
opt_inet6.h was missing from some files in the new NFS subsystem.
The effect of this was, for clients mounted via inet6 addresses,
that the DRC cache would never have a hit in the server. It also
broke NFSv4 callbacks when an inet6 address was the only one available
in the client. This patch fixes the above, plus deletes opt_inet6.h
from a couple of files it is not needed for.
alc [Sat, 21 Jan 2012 19:21:42 +0000 (19:21 +0000)]
MFC r228923, r228935, and r229007
Eliminate many of the unnecessary differences between the native and
paravirtualized pmap implementations for i386.
Fix a bug in the Xen pmap's implementation of
pmap_extract_and_hold(): If the page lock acquisition is retried,
then the underlying thread is not unpinned.
Wrap nearby lines that exceed 80 columns.
Merge r216333 and r216555 from the native pmap
When r207410 eliminated the acquisition and release of the page
queues lock from pmap_extract_and_hold(), it didn't take into
account that pmap_pte_quick() sometimes requires the page queues
lock to be held. This change reimplements pmap_extract_and_hold()
such that it no longer uses pmap_pte_quick(), and thus never
requires the page queues lock.
Merge r177525 from the native pmap
Prevent the overflow in the calculation of the next page
directory. The overflow causes the wraparound with consequent
corruption of the (almost) whole address space mapping.
Strictly speaking, r177525 is not required by the Xen pmap because
the hypervisor steals the uppermost region of the normal kernel
address space. I am nonetheless merging it in order to reduce the
number of unnecessary differences between the native and Xen pmap
implementations.
alc [Sat, 21 Jan 2012 18:11:12 +0000 (18:11 +0000)]
MFC r228746
The Xen pmap doesn't support superpages. So, there is no point in it
initializing structures, like the pv table, that are only used to
implement superpages. In fact, some of the unnecessary code in
pmap_init() was actually doing harm. It was preventing the kernel from
booting on virtual machines with more than 768 MB of memory.
rmh [Sat, 21 Jan 2012 17:22:50 +0000 (17:22 +0000)]
MFC r227827
Define __FreeBSD_kernel__ macro in sys/param.h.
__FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD,
which by definition is always true on FreeBSD. This macro is also defined
on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD.
It is tempting to use this macro in userland code when we want to enable
kernel-specific routines, and in fact it's fine to do this in code that
is part of FreeBSD itself. However, be aware that as presence of this
macro is still not widespread (e.g. older FreeBSD versions, 3rd party
compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in
external applications without also checking for __FreeBSD__ as an
alternative.
alc [Sat, 21 Jan 2012 08:26:41 +0000 (08:26 +0000)]
MFC r228398
Avoid the possibility of integer overflow in the calculation of
VM_KMEM_SIZE_MAX. Specifically, if the user/kernel address space split
was changed such that the kernel address space was greater than or equal
to 2 GB, then overflow would occur.
alc [Sat, 21 Jan 2012 05:03:10 +0000 (05:03 +0000)]
MFC r226163, r228317, and r228324
Fix the handling of an empty kmem map by sysctl_kmem_map_free().
Eliminate the possibility of 32-bit arithmetic overflow in the
calculation of vm_kmem_size that may occur if the system
administrator has specified a vm.vm_kmem_size tunable value that
exceeds the hard cap.
lstewart [Sat, 21 Jan 2012 03:59:31 +0000 (03:59 +0000)]
MFC r229898:
Consumers of bpfdetach() expect it to remove all bpf_if structs from the
bpf_iflist list which reference the specified ifnet. The existing implementation
only removes the first matching bpf_if found in the list, effectively leaking
list entries if an ifnet has been bpfattach()ed multiple times with different
DLTs.
Fix the leak by performing the detach logic in a loop, stopping when all bpf_if
structs referencing the specified ifnet have been detached and removed from the
bpf_iflist list.
Whilst here, also:
- Remove the unnecessary "bp->bif_ifp == NULL" check, as a bpf_if should never
exist in the list with a NULL ifnet pointer.
- Except when INVARIANTS is in the kernel config, silently ignore the case where
no bpf_if referencing the specified ifnet is found, as it is harmless and does
not require user attention.