MFC r236959:
Add a IP_RECVTOS socket option to receive for received UDP/IPv4
packets a cmsg of type IP_RECVTOS which contains the TOS byte.
Much like IP_RECVTTL does for TTL. This allows to implement a
protocol on top of UDP and implementing ECN.
MFC r236958:
Deliver IPV6_TCLASS, IPV6_HOPLIMIT and IPV6_PKTINFO cmsgs (if
requested) on IPV6 sockets, which have been marked to be not IPV6_V6ONLY,
for each received IPV4 packet.
MFC r235360:
Provide in the association change notification the received ABORT chunk
if case of SCTP_COMM_LOST or SCTP_CANT_STR_ASSOC as required by RFC 6458.
ken [Sun, 1 Jul 2012 05:22:45 +0000 (05:22 +0000)]
MFC 237683:
r237683 | ken | 2012-06-27 21:48:54 -0600 (Wed, 27 Jun 2012) | 129 lines
Bring in LSI's latest mps(4) 6Gb SAS and WarpDrive driver, version
14.00.00.01-fbsd.
Their description of the changes is as follows:
1. Copyright contents has been changed in all respective .c
and .h files
2. Support for WRITE12 and READ12 for direct-io (warpdrive only)
has been added.
3. Driver has added checks to see if Drive has READ_CAP_16
support before sending it down to the device.
If SPC3_SID_PROTECT flag is set in the inquiry data, the
device supports protection information, and must support
the 16 byte read capacity command, otherwise continue without
sending read cap 16. This will optimize driver performance,
since it will not send READ_CAP_16 to the drive which does
not have support of READ_CAP_16.
4. With new approach, "MPTIOCTL_RESET_ADAPTER" IOCTL will not
use DELAY() which is busy loop implementation.
It will use <msleep> (Better way to sleep without busy
loop). Also from the HBA reset code path and some other
places, DELAY() is replaced with msleep() or "pause()",
which is based on sleep/wakeup style calls. Driver use
msleep()/pause() instead of DELAY based on CAN_SLEEP/NO_SLEEP
flags to avoid busy loop which is not required all the
time.e.a
a. While driver is getting loaded, driver calls most of the
commands with NO_SLEEP.
b. When Driver is functional and it needs Reinit of HBA,
CAN_SLEEP flag is used.
5. <mpslsi> driver is not Endian safe. It will not work on Big
Endian machines like Sparc and PowerPC platforms because it
assumes it is running on a Little Endian machine.
Driver code is modified such way that it does not assume CPU
arch is Little Endian.
a. All places where Driver interacts from HBA to Host, it
converts Little Endian format to CPU format.
b. All places where Driver interacts from Host to HBA, it
converts CPU format to Little Endian.
6. Findout memory leaks in FreeBSD Driver and resolve those,
such as memory leak in targ's luns creation/deletion.
Also added additional checks to see memory allocation
success/fail.
7. Add loginfo prints as debug message, i.e. When FW sends any
loginfo, Driver should print those as debug message.
This will help for debugging purpose.
8. There is possibility to get config request timeout. Current
driver is able to detect config request timetout, but it does
not do anything on config_request timeout. Driver should
call mps_reinit() if any request_poll (which is called as
part of config_request) is time out.
9. cdb length check is required for 32 byte CDB. Add correct mpi
control value for 32 bit CDB as below while submitting SCSI IO
Request to controller.
mpi_control |= 4 << MPI2_SCSIIO_CONTROL_ADDCDBLEN_SHIFT;
10. Check the actual status of Message unit reset
(mps_message_unit_reset).Previously FreeBSD Driver just writes
MPI2_FUNCTION_IOC_MESSAGE_UNIT_RESET and never check the ack
(it just wait for 50 millisecond). So, Driver now check the
status of "MPI2_FUNCTION_IOC_MESSAGE_UNIT_RESET" after writing
it to the FW.
Now it also checking for whether doorbell ack uses msleep with
proper sleep flags, instead of <DELAY>.
11. Previously CAM does not detect Multi-Lun Devices. In order to
detect Multi-Lun Devices by CAM the driver needs following change
set:
a. There is "max_lun" field which Driver need to set based on
hw/fw support. Currently LSI released driver does not set
this field.
b. Default of "max_lun" should not be 0 in OS, but it is
currently set to 0 in CAM layer.
c. Export max_lun capacity to 255
12. Driver will not reset target info after port enable complete and
also do Device removal when Device remove from FW. The detail
description is as follows
a. When Driver receive WD PD add events, it will add all
information in driver local data structure.
b. Only for WD, we have below checks after port enable
completes, where driver clear off all information retrieved
at #1.
if ((sc->WD_available &&
(sc->WD_hide_expose == MPS_WD_HIDE_ALWAYS)) ||
(sc->WD_valid_config && (sc->WD_hide_expose ==
MPS_WD_HIDE_IF_VOLUME)) {
// clear off target data structure.
}
It is mainly not to attach PDs to OS.
FreeBSD does bus rescan as older Parallel scsi style. So Driver
needs to handle which Drive is visible to OS. That is a reason
we have to clear off targ information for PDs.
Again, above logic was implemented long time ago. Similar concept
we have for non-wd also. For that, LSI have introduced different
logic to hide PDs.
Eventually, because of above gap, when Phy goes offline, we
observe below failure. That is what Driver is not doing complete
removal of device with FW. (which was pointed by Scott)
Apr 5 02:39:24 Freebsd7 kernel: mpslsi0: mpssas_prepare_remove
Apr 5 02:39:24 Freebsd7 kernel: mpssas_prepare_remove 497 : invalid handle 0xe
Now Driver will not reset target info after port enable complete
and also will do Device removal when Device remove from FW.
13. Returning "CAM_SEL_TIMEOUT" instead of "CAM_TID_INVALID"
error code on request to the Target IDs that have no devices
conected at that moment. As if "CAM_TID_INVALID" error code
is returned to the CAM Layaer then it results in a huge chain
of errors in verbose kernel messages on boot and every
hot-plug event.
ken [Sun, 1 Jul 2012 05:13:50 +0000 (05:13 +0000)]
MFC 237518, 237545, 237648:
r237518 | ken | 2012-06-23 22:29:03 -0600 (Sat, 23 Jun 2012) | 72 lines
Fix a bug which causes a panic in daopen(). The panic is caused by
a da(4) instance going away while GEOM is still probing it.
In this case, the GEOM disk class instance has been created by
disk_create(), and the taste of the disk is queued in the GEOM
event queue.
While that event is queued, the da(4) instance goes away. When the
open call comes into the da(4) driver, it dereferences the freed
(but non-NULL) peripheral pointer provided by GEOM, which results
in a panic.
The solution is to add a callback to the GEOM disk code that is
called when all of its resources are cleaned up. This is
implemented inside GEOM by adding an optional callback that is
called when all consumers have detached from a provider, and the
provider is about to be deleted.
scsi_cd.c,
scsi_da.c: In the register routine for the cd(4) and da(4)
routines, acquire a reference to the CAM peripheral
instance just before we call disk_create().
Use the new GEOM disk d_gone() callback to register
a callback (dadiskgonecb()/cddiskgonecb()) that
decrements the peripheral reference count once GEOM
has finished cleaning up its resources.
In the cd(4) driver, clean up open and close
behavior slightly. GEOM makes sure we only get one
open() and one close call, so there is no need to
set an open flag and decrement the reference count
if we are not the first open.
In the cd(4) driver, use cam_periph_release_locked()
in a couple of error scenarios to avoid extra mutex
calls.
geom.h: Add a new, optional, providergone callback that
is called when a provider is about to be deleted.
geom_disk.h: Add a new d_gone() callback to the GEOM disk
interface.
Bump the DISK_VERSION to version 2. This probably
should have been done after a couple of previous
changes, especially the addition of the d_getattr()
callback.
geom_disk.c: Add a providergone callback for the disk class,
g_disk_providergone(), that calls the user's
d_gone() callback if it exists.
Bump the DISK_VERSION to 2.
geom_subr.c: In g_destroy_provider(), call the providergone
callback if it has been provided.
In g_new_geomf(), propagate the class's
providergone callback to the new geom instance.
blkfront.c: Callers of disk_create() are supposed to pass in
DISK_VERSION, not an explicit disk API version
number. Update the blkfront driver to do that.
disk.9: Update the disk(9) man page to include information
on the new d_gone() callback, as well as the
previously added d_getattr() callback, d_descr
field, and HBA PCI ID fields.
r237545 | ken | 2012-06-24 22:26:10 -0600 (Sun, 24 Jun 2012) | 7 lines
Consume spare fields for the providergone pointers added to the g_class and
g_geom structures in change 237518. The original change would have broken
the ABI.
Suggested by: ae
r237648 | ken | 2012-06-27 10:05:09 -0600 (Wed, 27 Jun 2012) | 6 lines
In g_disk_providergone(), don't continue if the softc is NULL. This may be
the case if we've already gone through g_disk_destroy().
Reported by: Michael Butler <imb@protected-networks.net>
MFC r237178:
attach_generic causes missing devices in /dev when the driver interacts with some non-highpoint controollers. Change attach_generic to be off by default.
ken [Fri, 29 Jun 2012 22:33:48 +0000 (22:33 +0000)]
MFC r237452:
r237452 | ken | 2012-06-22 12:57:06 -0600 (Fri, 22 Jun 2012) | 25 lines
Change 'camcontrol defects' to first probe a drive to find out how much
defect information it has before grabbing the full defect list.
This works around a bug with some Hitachi drives that generate data overrun
errors when they are asked for more defect data than they have.
The change is done in a spec-compliant way, so it should have no negative
impact on drives that don't have this issue.
This is based on work originally done at Sandvine.
scsi_da.h: Add a define for the maximum amount of data that can be
contained in a defect list.
camcontrol.c: Update the readdefects() function to issue an initial
command to determine the length of the defect list, and
then use that length in the request for the full defect
list.
camcontrol.8: Add a note that some drives will report 0 defects available
if you don't request either the PLIST or GLIST.
Submitted by: Mark Johnston <markjdb@gmail.com> (original version)
ken [Fri, 29 Jun 2012 22:28:31 +0000 (22:28 +0000)]
MFC r237328, except for the scsi_enc.c changes:
r237328 | ken | 2012-06-20 11:08:00 -0600 (Wed, 20 Jun 2012) | 69 lines
Fix several reference counting and object lifetime issues between
the pass(4) and enc(4) drivers and devfs.
The pass(4) driver uses the destroy_dev_sched() routine to
schedule its device node for destruction in a separate thread
context. It does this because the passcleanup() routine can get
called indirectly from the passclose() routine, and that would
cause a deadlock if the close routine tried to destroy its own
device node.
In any case, once a particular passthrough driver number, e.g.
pass3, is destroyed, CAM considers that unit number (3 in this
case) available for reuse.
The problem is that devfs may not be done cleaning up the previous
instance of pass3, and will panic if isn't done cleaning up the
previous instance.
The solution is to get a callback from devfs when the device node
is removed, and make sure we hold a reference to the peripheral
until that happens.
Testing exposed some other cases where we have reference counting
issues, and those were also fixed in the pass(4) driver.
cam_periph.c: In camperiphfree(), reorder some of the operations.
The peripheral destructor needs to be called before
the peripheral is removed from the peripheral is
removed from the list. This is because once we
remove the peripheral from the list, and drop the
topology lock, the peripheral number may be reused.
But if the destructor hasn't been called yet, there
may still be resources hanging around (like devfs
nodes) that haven't been fully cleaned up.
cam_xpt.c: Add an argument to xpt_remove_periph() to indicate
whether the topology lock is already held.
scsi_enc.c: Acquire an extra reference to the peripheral during
registration, and release it once we get a callback
from devfs indicating that the device node is gone.
Call destroy_dev_sched_cb() in enc_oninvalidate()
instead of calling destroy_dev() in the cleanup
routine.
scsi_pass.c: Add reference counting to handle peripheral and
devfs object lifetime issues.
Add a reference to the peripheral and the devfs
node in the peripheral registration.
Don't attempt to add a physical path alias if the
peripheral has been marked invalid.
Release the devfs reference once the initial
physical path alias taskqueue run has completed.
Schedule devfs node destruction in the
passoninvalidate(), and release our peripheral
reference in a new routine, passdevgonecb() once
the devfs node is gone. This allows the peripheral
to fully go away, and the peripheral destructor,
passcleanup(), will get called.
ken [Fri, 29 Jun 2012 22:00:30 +0000 (22:00 +0000)]
MFC r236138, with the exception of scsi_enc.c:
r236138 | ken | 2012-05-27 00:11:09 -0600 (Sun, 27 May 2012) | 64 lines
Work around a race condition in devfs by changing the way closes
are handled in most CAM peripheral drivers that are not handled by
GEOM's disk class.
The usual character driver open and close semantics are that the
driver gets N open calls, but only one close, when the last caller
closes the device.
CAM peripheral drivers expect that behavior to be honored to the
letter, and the CAM peripheral driver code (specifically
cam_periph_release_locked_busses()) panics if it is done incorrectly.
Since devfs has to drop its locks while it calls a driver's close
routine, and it does not have a way to delay or prevent open calls
while it is calling the close routine, there is a race.
The sequence of events, simplified a bit, is:
- devfs acquires a lock
- devfs checks the reference count, and if it is 1, continues to close.
- devfs releases the lock
- 2nd process open call on the device happens here
- devfs calls the driver's close routine
- devfs acquires a lock
- devfs decrements the reference count
- devfs releases the lock
- 2nd process close call on the device happens here
At the second close, we get a panic in
cam_periph_release_locked_busses(), complaining that peripheral
has been released when the reference count is already 0. This is
because we have gotten two closes in a row, which should not
happen.
The fix is to add the D_TRACKCLOSE flag to the driver's cdevsw, so
that we get a close() call for each open(). That does happen
reliably, so we can make sure that our reference counts are
correct.
Note that the sa(4) and pt(4) drivers only allow one context
through the open routine. So these drivers aren't exposed to the
same race condition.
scsi_ch.c,
scsi_enc.c,
scsi_enc_internal.h,
scsi_pass.c,
scsi_sg.c:
For these drivers, change the open() routine to
increment the reference count for every open, and
just decrement the reference count in the close.
Call cam_periph_release_locked() in some scenarios
to avoid additional lock and unlock calls.
scsi_pt.c: Call cam_periph_release_locked() in some scenarios
to avoid additional lock and unlock calls.
ken [Fri, 29 Jun 2012 21:33:36 +0000 (21:33 +0000)]
MFC r237601:
r237601 | ken | 2012-06-26 08:51:35 -0600 (Tue, 26 Jun 2012) | 38 lines
Fix an issue that caused the kernel to panic inside CTL when trying
to attach to target capable HBAs that implement the old immediate
notify (XPT_IMMED_NOTIFY) and notify acknowledge (XPT_NOTIFY_ACK)
CCBs. The new API has been in place since SVN change 196008 in
2009.
The solution is two-fold: fix CTL to handle the responses from the
HBAs, and convert the HBA drivers in question to use the new API.
These drivers have not been tested with CTL, so how well they will
interoperate with CTL is unknown.
scsi_target.c:Update the userland target example code to use the
new immediate notify API.
scsi_ctl.c: Detect when an immediate notify CCB is returned
with CAM_REQ_INVALID or CAM_PROVIDE_FAIL status,
and just free it.
Fix a duplicate assignment.
aic79xx.c,
aic79xx_osm.c:Update the aic79xx driver to use the new API.
Target mode is not enabled on for this driver, so
the changes will have no practical effect.
aic7xxx.c,
aic7xxx_osm.c:Update the aic7xxx driver to use the new API.
sbp_targ.c: Update the firewire target code to work with the
new API.
mpt_cam.c: Update the mpt(4) driver to work with the new API.
Target mode is only enabled for Fibre Channel
mpt(4) devices.
Change the SCSI INQUIRY peripheral qualifier that CTL reports for LUNs
that don't exist.
Anecdotal evidence indicates that it is better to return 011b (bad LUN)
than 001b (LUN offline). However, this change also gives the user a
sysctl/tunable, kern.cam.ctl.inquiry_pq_no_lun, to override the change
and return to the previous behavior. (The previous behavior was to
return 001b, or LUN offline.)
ctl.c: Change the default inquiry peripheral qualifier to 011b,
and add a sysctl and tunable to allow the user to change
it back to 001b if needed.
Don't insert a Copan copyright statement in the inquiry
data. The copyright statements on the files are
sufficient.
ctl_private.h:Add sysctl variable context to the CTL softc.
ctl_cmd_table.c,
ctl_frontend_internal.c,
ctl_frontend.c,
ctl_backend.c,
ctl_error.c: Include sys/sysctl.h.
jhb [Fri, 29 Jun 2012 21:24:56 +0000 (21:24 +0000)]
MFC 235024,235029,235556,235834,235845:
Use MADT to match ACPI Processor objects to CPUs. MADT and DSDT/SSDTs may
list CPUs in different orders, especially for disabled logical cores. Now
we match ACPI IDs from the MADT with Processor objects, strictly order CPUs
accordingly, and ignore disabled cores. This prevents us from executing
methods for other CPUs, e. g., _PSS for disabled logical core, which may not
exist. Unfortunately, it is known that there are a few systems with buggy
BIOSes that do not have unique ACPI IDs for MADT and Processor objects. To
work around these problems, 'debug.acpi.cpu_unordered' tunable is added.
Set this to a non-zero value to restore the old behavior.
marius [Fri, 29 Jun 2012 18:39:22 +0000 (18:39 +0000)]
MFC: r236581
The loaddev environment variable is not modifiable once set, so it is not
update for ZFS. It seems that this does not really affect anything except
the help command. Nevertheless, rearrange things so loaddev is set only
once in all cases in order to get it right.
Pointed out by: avg
dim [Fri, 29 Jun 2012 18:09:39 +0000 (18:09 +0000)]
MFC r235281:
Fix sys/boot/i386/cdboot/cdboot.S compilation with clang after r235219.
This file uses .code16 directives, which are not yet supported by
clang's integrated assembler.
jhb [Fri, 29 Jun 2012 17:21:19 +0000 (17:21 +0000)]
MFC 233191:
Fix madvise(MADV_WILLNEED) to properly handle individual mappings larger
than 4GB. Specifically, the inlined version of 'ptoa' of the the 'int'
count of pages overflowed on 64-bit platforms. While here, change
vm_object_madvise() to accept two vm_pindex_t parameters (start and end)
rather than a (start, count) tuple to match other VM APIs as suggested
by alc@.
jhb [Fri, 29 Jun 2012 17:12:03 +0000 (17:12 +0000)]
MFC 237334:
Move the per-thread deferred user map entries list into a private list
in vm_map_process_deferred() which is then iterated to release map entries.
This avoids having a nested vm map unlock operation called from the loop
body attempt to recuse into vm_map_process_deferred(). This can happen if
the vm_map_remove() triggers the OOM killer.
jhb [Fri, 29 Jun 2012 16:29:38 +0000 (16:29 +0000)]
MFC 237008,237271,237272,237673:
- Fix a couple of bugs that prevented windows in PCI-PCI bridges from
growing "downward" (moving the start address down). First, an off by
one error caused the end address to be moved down an extra alignment
chunk unnecessarily. Second, when aligning the new candidate starting
address, the wrong bits were masked off.
- Add a 'wmask' variable to hold the expression '(1ul << w->step) - 1' in
pcib_grow_window().
- For subtractively decoding bridges, don't try to grow windows but pass
the request up the tree in order to be on the safe side. Growing windows
in this case would mean to switch resources to positive decoding and
it's unclear how to correctly handle this. At least with ALi/ULi M5249
PCI-PCI bridges, this also just doesn't work out of the box.
glebius [Fri, 29 Jun 2012 12:11:31 +0000 (12:11 +0000)]
Merge r236364 from head by eri@:
Correct table counter functionality to not panic.
This was caused by not proper initialization of necessary parameters.
glebius [Fri, 29 Jun 2012 12:05:19 +0000 (12:05 +0000)]
Merge r233773 from head:
Historically arp(8) did a route lookup for the entry it is
about to add, and failed if it exist and had invalid data
link type.
Later on, in r201282, this check morphed to other code, but
message "proxy entry exists for non 802 device" still left,
and now it is printed in a case if route prefix found is
equal to current address being added. In other words, when
we are trying to add ARP entry for a network address. The
message is absolutely unrelated and disappointing in this
case.
I don't see anything bad with setting ARP entries for
network addresses. While useless in usual network,
in a /31 RFC3021 it may be necessary. This, remove this code.
pfg [Fri, 29 Jun 2012 03:01:38 +0000 (03:01 +0000)]
MFC r237448:
Merge changes from upstream libedit.
Here we update most of the files to at least match the
version available in NetBSD's snapshot of 20091228. This
version was chosen because it still doesn't include wide
character support (UTF-8), which involves many changes and
new files.
jhb [Thu, 28 Jun 2012 21:24:09 +0000 (21:24 +0000)]
MFC 228161,230774,230822,236415:
Add a new -e flag to pciconf(8)'s list mode to display PCI error details.
Currently this dumps the status of any error bits in the PCI status register
and PCI-express device status register. It also lists any errors indicated
by version 1 of PCI-express Advanced Error Reporting (AER).
jhb [Thu, 28 Jun 2012 19:34:23 +0000 (19:34 +0000)]
MFC 236404:
Extend VERBOSE_SYSINIT to also print out the name of variables passed
to SYSINIT routines if they can be resolved via symbol look up in DDB.
To avoid false positives, only honor a name if the symbol resolves
exactly to the pointer value (no offset).
kib [Thu, 28 Jun 2012 14:26:55 +0000 (14:26 +0000)]
Fix unbounded-length malloc, controlled from usermode. The added check
is performed before exact size of the buffer is calculated, but the
buffer cannot have size greater then the total space allocated for
extended attributes. The existing check is executing with precise
size, but it is too late, since buffer needs to be allocated in
advance.
Also, adapt to uio_resid being of ssize_t type. Use lblktosize instead of
multiplying by fs block size by hand as well.
kib [Thu, 28 Jun 2012 14:13:45 +0000 (14:13 +0000)]
MFC r237058:
Eliminate the static buffer used to read the first page of the mapped
object, and eliminate the pread(2) call as well. Mmap the first
page of the object temporaly, and unmap it on error or last use.
Potentially, this leaves one-page gap between succeeding dlopen(3),
but there are other mmap(2) consumers as well.
Fix several cases were the whole mapping of the object leaked on error.
Use MAP_PREFAULT_READ for mmap(2) calls which map real object pages
pfg [Thu, 28 Jun 2012 01:02:50 +0000 (01:02 +0000)]
MFC r237406:
Bring a couple of fixes for gcc optimizations.
The GCC4.3 branch contains some optimization fixes
that were not considered regressions and therefore
were never backported. We are bringing a couple of
them that are under GPLv2 since they were made
before the license switch upstream.
jhb [Wed, 27 Jun 2012 21:12:15 +0000 (21:12 +0000)]
MFC 233925,236357:
Add new ktrace records for the start and end of VM faults. This gives
a pair of records similar to syscall entry and return that a user can
use to determine how long page faults take. The new ktrace records are
enabled via the 'p' trace type, but are not enabled in the default set of
trace points.
mm [Wed, 27 Jun 2012 11:59:57 +0000 (11:59 +0000)]
MFC r236823 (pjd):
ds_guid of 0 is special, as it is used by snapshot receive code to
differentiate between an incremental and full stream.
Be sure not to generate guid equal to 0.
Reported by: someone who saw 0 being generated as 64bit random guid
mav [Wed, 27 Jun 2012 11:02:35 +0000 (11:02 +0000)]
MFC r237398:
In camisr() clear CAM_SIM_ON_DONEQ flag after camisr_runqueue() purged SIM
done queue. Clearing it before caused extra SIM queueing in some cases.
It was invisible during normal operation, but during USB device unplug and
respective SIM destruction it could keep pointer on SIM without having
counted reference and as result crash the system by use afer free.
delphij [Wed, 27 Jun 2012 00:31:30 +0000 (00:31 +0000)]
MFC r237339:
Polish previous revision: if the fts_* routines have lstat()'ed the
directory entry then use the struct stat from that instead of doing
it again, and skip the rm_overwrite() call if fts_read() indicated
that the entry couldn't be a regular file.
Obtained from: OpenBSD
MFC r237284 (kevlo):
Fix potential symlink race condition in "rm -P" by adding a check
that the file we have opened is the one we expected. Also open in
non-blocking mode to avoid a potential hang with FIFOs.
eadler [Tue, 26 Jun 2012 03:05:17 +0000 (03:05 +0000)]
MFC r237259 r237260 r237329:
Allow users with RO privilege to the device to read the RO attributes. [0]
Add __unused macros to appropriate places in order to allow building
with WARNS=6 on base gcc, gcc46, and clang