MFC r220376: Allow strerror(0) and strerror_r(0, ...).
Of course, strerror_r() may still fail with ERANGE.
Although the POSIX specification said this could fail with EINVAL and
doing this likely indicates invalid use of errno, most other
implementations permitted it, various POSIX testsuites require it to
work (matching the older sys_errlist array) and apparently some
applications depend on it.
MFC r220461:
Remove setting of PCB_FULL_IRET at the places where we are going to call
update_gdt_{f,g}sbase. The functions set the flag when td == curthread,
and sysarch is always called with curthread.
* Add the readline(3) API to libedit. The libedit versions of
{readline,history}.h are in /usr/include/edit so as to not conflict with
the GNU libreadline versions. To use the libedit readline(3) one should
add "-I/usr/include/edit" to their Makefile
(spelled "-I${DESTDIR}/${INCLUDEDIR}/edit" within the FreeBSD source tree).
* Enable its use in the BSD licensed utilities that support readline(3).
* histedit.h is moved into libedit's directory
MFC: 220152
This patch fixes the Experimental NFS client to properly deal with 32 bit or
64 bit fileid's in NFSv2 and NFSv3. Without this fix, invalid casting (and sign
extension) was creating problems for any fileid greater than 2^31.
We discovered this because we have test clusters with more than 2 billion
allocated files and 64-bit ino_t's (and friend structures).
Handle the special ruleset 0 in devfs_ruleset_use(). An attempt set the
current ruleset to 0 with command "devfs ruleset 0" triggered a KASSERT
in devfs_ruleset_create().
In g_eli_read_done() and g_eli_write_done(), for a bio with
bio_children > 1, g_destroy_bio() is never called and the bio
leaks. Fix this by calling g_destroy_bio() earlier, before the check.
Submitted by: Victor Balada Diaz <victor@bsdes.net> (initial version)
Use timeout from configuration file not only when sending and receiving,
but also when establishing connection.
r220007 (pjd):
Add mapsize to the header just before sending the packet.
Before it could change later and we were sending invalid mapsize.
Some time ago I added optimization where when nodes are connected for the
first time and there were no writes to them yet, there is no initial full
synchronization. This bug prevented it from working.
r220266 (pjd):
Handle the problem described in r220264 by using GEOM GATE queue of unlimited
length. This should fix deadlocks reported by HAST users.
r220270 (pjd):
Allow to disable sends or receives on a socket using shutdown(2) by
interpreting NULL 'data' argument passed to proto_common_send() or
proto_common_recv() as a will to do so.
r220271 (pjd):
Declare directions for sockets between primary and secondary.
In HAST we use two sockets - one for only sending the data and one for only
receiving the data.
r220272 (pjd):
When we are operating on blocking socket and get EAGAIN on send(2) or recv(2)
this means that request timed out. Translate the meaningless EAGAIN to
ETIMEDOUT to give administrator a hint that he might need to increase timeout
in configuration file.
r220273 (pjd):
Handle ENOBUFS on send(2) by retrying for a while and logging the problem.
r220274 (pjd):
Increase default timeout from 5 seconds to 20 seconds. 5 seconds is definitely
to short under heavy load and I was experiencing those timeouts in my recent
tests.
GEOM has an internal mechanism to deal with ENOMEM errors returned via
g_io_deliver(). In such case it increases 'pace' counter on each ENOMEM and
reschedules the request. The 'pace' counter is decreased for each request going
down, but until 'pace' is greater than zero, GEOM will handle at most 10
requests per second. For GEOM GATE users that are proxy to local GEOM providers
(like ggatel(8) and HAST) we can end up with almost permanent slow down of GEOM
down queue. This is because once we reach GEOM GATE queue limit, we return
ENOMEM to the GEOM. This means that we have, eg. 1024 I/O requests in the GEOM
GATE queue. To make room in the queue and stop returning ENOMEM we need to
proceed the requests of course, but those requests are handled by userland
daemons that handle them by reading/writing also from/to local GEOM providers.
For example with HAST, a new requests comes to /dev/hast/data, which is GEOM
GATE provider. GEOM GATE passes the request to hastd(8) and hastd(8)
reads/writes from/to /dev/da0. Once we reach GEOM GATE queue limit, to free up
a slot in GEOM GATE queue, hastd(8) has to read/write from/to /dev/da0, but
this request will also be very slow, because GEOM now slows down all the
requests. We end up with full queue that we can unload at the speed of 10
requests per second. This simply looks like a deadlock.
Fix it by allowing userland daemons that work with both GEOM GATE and local
GEOM providers to specify unlimited queue size, so GEOM GATE will never return
ENOMEM to the GEOM.
MFC 220382:
Correct 'list scan' description in the examples. The previous description
was incorrect - 'list scan' does not actually do a scan, but instead lists
the results of the background 'scan' cache.
Make `make tinderbox` work with MAKEOBJDIRPREFIX set (or in possibly other
combinations) by forcing FAILFILE into .CURDIR as we do for all other
universe output files. [1] Similarly make FAILFILE start with "_." as well.
Push a possible "unbind" in some situation from in6_pcbsetport() to
callers. This also fixes a problem when the prison call could set
the inp->in6p_laddr (laddr) and a following priv_check_cred() call
would return an error and will allow us to merge the IPv4 and IPv6
implementation.
Make sure the locally cached value of rt->rt_gateway stays stable,
even after dropping the reference and unlocking. Previously we
have dereferenced a NULL pointer (after r121765).
Simply unlocking after the block does not work either because of
lock ordering (see r121765) and in addition we would still hold
a pointer to something that might be gone by the time we access it.
Thus take a copy of the value rather than just caching the pointer.
For now remove options FLOWTABLE from the remaining GENERIC kernel
configurations and make it opt-in for those who want it. LINT will
still build it.
While it may be a perfect win in some scenarios, it still troubles users
(see PRs) in general cases. In addition we are still allocating resources
even if disabled by sysctl and still leak arp/nd6 entries in case of
interface destruction.
Discussed with: qingli (2010-11-24, just never executed)
Discussed with: juli (OCTEON1)
PR: kern/148018, kern/155604, kern/144917, kern/146792
MFC r220249,220252:
r220249:
64bit DMA caused data corruption. Unfortunately there is no known
workaround to use 64bit DMA.
Disable 64bit DMA on Attansic L1 controller.
- Remove bsdlabel.5
- Remove bsdlabel test-script that was full of broken assumptions
- Remove dead code depending on __alpha__
- Widen fields that display partition offset/length.
MFhead 220317:
When removing ifnets, we should first remove the reference to ifnet
from the interface index, then decrease refcount, not vice versa.
Otherwise there is a race (reproducible) when if_free_internal()
contests on IFNET_WLOCK(), and we got a zero-refed ifnet in the
index for a long time. It may be picked by some other thread,
that runs ifnet_byindex_ref(), who takes the ifnet from index,
and bumps refcount. When reader drops the lock, if_free_internal()
proceeds with free. Then reader tries to free it a second time.
Improve locking of creating and dropping links in the graph, acquiring
the topology mutex in the following functions, that manipulate pointers
to peer nodes:
- ng_bypass()
- ng_path2noderef() when switching to the next node in sequence.
Rewrite the function a bit.
- ng_address_hook()
- ng_address_path()
This patch improves stability of large mpd5 installations.
Redo r166423. It is important not only skip freeing multicast
entires when underlying interface is detached, but also purge
pointers to them, to avoid double-free in future.
- add static and const where appropriate
- check pointers against NULL
- minor styling nits
- it is actually WARNS=6 clean for non-strict alignment platforms
MFC r220103:
Normally fxp(4) does not receive bad frames but promiscuous mode
makes controller to receive bad frames and i82557 will also receive
bad frames since fxp(4) have to receive VLAN oversized frames. If
fxp(4) encounter DMA overrun error, the received frame size would
be 0 so the actual frame size after checksum field extraction the
length would be negative(-2). Due to signed/unsigned comparison
used in driver, frame length check did not work for DMA overrun
frames. Correct this by casting it to int.
While I'm here explicitly check DMA overrun error and discard the
frame regardless of result of received frame length check.
MFC r219845, r219930, r219949 and r219983.
- Use software to compute EHCI data toggle instead of hardware.
- Fix EHCI initialisation order with regard to debug prints.
MFC r219048, r219004, r218475 and r204632.
- The NetBSD Foundation has granted permission to remove clause 3 and 4 from
their software.
- use device_printf() instead of printf() to give more accurate warnings.
- use memcpy() instead of bcopy().
- add missing #if's for non-FreeBSD compilation.
- Add missing xhci(4) manual page.
- Minor update in some USB manual pages.
- Correct USB 3.0 wire-speed to 5.0Gbps
In g_gate_create() there is a window between when g_gate_softc is
registered in g_gate_units array and when its sc_provider field is
filled. If during this period g_gate_units is accessed by another
thread that is checking for provider name collision the crash is
possible.
Fix this by adding sc_name field to struct g_gate_softc. In
g_gate_create() when g_gate_softc is created but sc_provider is still
not sc_name points to provider name stored in the local array.
Reported by: Freddie Cash <fjwcash@gmail.com>
r220173:
Increase debug level on g_gate device destruction and add message on
device creation.
Change for Africa/Casablanca:
- The 3rd april 2011 at 00:00:00, [it] will be 3rd april 1:00:00
- The 31th july 2011 at 00:59:59, [it] will be 31th July 00:00:00
Update for SouthAmerica/Chili:
- Chile's clocks will go back an hour this year on the 7th of May instead
of this Saturday. They will go forward again the 3rd Saturday in
August, not in October as they have since 1968. This is a pilot plan
which will be reevaluated in 2012.
ktrace_resize_pool() locking slightly reworked:
1) do not take a lock around the single atomic operation.
2) do not lose the invariant of lock by dropping/acquiring
ktrace_mtx around free() or malloc().
MFC r219042:
Introduce preliminary support of the show description of the ABI of
traced process by adding two new events which records value of process
sv_flags to the trace file at process creation/execing/exiting time.
MFC r219311:
Partially rework r219042.
The reason for this is a bug at ktrops() where process dereferenced
without having a lock. This might cause a panic if ktrace was runned
with -p flag and the specified process exited between the dropping
a lock and writing sv_flags.
Since it is impossible to acquire sx lock while holding mtx switch
to use asynchronous enqueuerequest() instead of writerequest().
Rename ktr_getrequest_ne() to more understandable name.
MFC r219312:
Fix indentation in comment, double ';' in variable declaration.
edwin [Thu, 31 Mar 2011 06:29:15 +0000 (06:29 +0000)]
MFC of 198254, 198255, 198350, 198267, 209190, 208831, 208830, 210243
198254:
When tzsetup is run as non-root and the "CMOS clock question on
UTC" is answered as No, it would abort without properly ending the
dialog session.
198255:
Make the usage of the default zoneinfo file to install clearer.
198350:
- Add support for chrooted installs.
- Add examples to the man-page.
198267:
Instead of having to know which timezone was picked last time, you
now can run "tzsetup -r" which will reinstall the last choice. This
data is recorded in /var/db/zoneinfo.
209190:
Use literal format strings. Found by clang.
208831:
Add comment that this value is unused.
It is obvious that it isn't used, but both clang and Coverity talk about it.
208830:
When there is a problem with writing, also bail out.
Allow to checksum on-the-wire data using either CRC32 or SHA256.
r219354 (pjd):
Allow to compress on-the-wire data using two algorithms:
- HOLE - it simply turns all-zero blocks into few bytes header;
it is extremely fast, so it is turned on by default;
it is mostly intended to speed up initial synchronization
where we expect many zeros;
- LZF - very fast algorithm by Marc Alexander Lehmann, which shows
very decent compression ratio and has BSD license.
r219369 (pjd):
Provides three states for pjdlog_initialized, so we can also tell that
this is fist initialization ever.
r219370 (pjd), r219385 (pjd):
- Turn on printf extentions.
- Load support for %T for pritning time.
- Add support for %N for printing number in human readable form.
- Add support for %S for printing sockaddr structure (currently only AF_INET
family is supported, as this is all we need in HAST).
- Disable gcc compile-time format checking as this will no longer work.
r219371 (pjd):
Use %S to print IP address and port number.
r219372 (pjd):
- Log size of data to synchronize in human readable form (using %N).
- Log synchronization time (using %T).
- Log synchronization speed in human readable form (using %N).
r219373 (pjd):
Print some of the numbers in human readable form (using %N).
r219482:
Make workers inherit debug level from the main process.
r219620 (pjd):
In command line options allow size to be specified using k/M/G/T
suffixes.
r219669 (pjd):
Remove #include needed for debugging.
r219721:
For secondary, set 2 * HAST_KEEPALIVE seconds timeout for incoming
connection so the worker will exit if it does not receive packets from
the primary during this interval.
Reported by: Christian Vogt <Christian.Vogt@haw-hamburg.de>
Tested by: Christian Vogt <Christian.Vogt@haw-hamburg.de>
r219813 (pjd):
If there is any traffic on one of out descriptors, we were not checking for
long running hooks. Fix it by not using select(2) timeout to decide if we want
to check hooks or not.
r219814 (pjd):
When creating connection on behalf of primary worker, set pjdlog prefix
to resource name and role, so that any logs related to that can be identified
properly.
r219815 (pjd):
Add snprlcat() and vsnprlcat() - the functions I'm always missing.
They work as a combination of snprintf(3) and strlcat(3) - the caller
can append a string build based on the given format.
r219816 (pjd):
Use snprlcat() instead of two strlcat(3)s.
r219817 (pjd):
Log when we start hooks checking and when we execute a hook.
r219818 (pjd), r219821 (pjd):
In hast.conf we define the other node's address in 'remote' variable.
This way we know how to connect to secondary node when we are primary.
The same variable is used by the secondary node - it only accepts
connections from the address stored in 'remote' variable.
In cluster configurations it is common that each node has its individual
IP address and there is one addtional shared IP address which is assigned
to primary node. It seems it is possible that if the shared IP address is
from the same network as the individual IP address it might be choosen by
the kernel as a source address for connection with the secondary node.
Such connection will be rejected by secondary, as it doesn't come from
primary node individual IP.
Add 'source' variable that allows to specify source IP address we want to
bind to before connecting to the secondary node.
r219821 (pjd):
Forgot to commit this as a part of r219818.
r219830 (pjd):
Detect situation where resource internal identifier differs.
This means that both nodes have separately managed resources that don't
have the same data.
r219831 (pjd):
Be pedantic and free nvout before exiting.
r219832 (pjd):
Increase debug level of "Checking hooks." message.
r219833 (pjd):
Remove stale comment. Yes, it is valid to set role back to init.
r219837 (pjd):
Before handling any events on descriptors check signals so we can update
our info about worker processes if any of them was terminated in the meantime.
This fixes the problem with 'hastctl status' running from a hook called on
split-brain:
1. Secondary calls a hooks and terminates.
2. Hook asks for resource status via 'hastctl status'.
3. The main hastd handles the status request by sending it to the secondary
worker who is already dead, but because signals weren't checked yet he
doesn't know that and we get EPIPE.
r219843 (pjd):
Fix typo.
r219844 (pjd):
Initialize localcnt on first write. This fixes assertion when we create
resource, set role to primary, do no writes, then sent it to secondary
and accept connection from primary.
r219864 (pjd):
White space cleanups.
r219873 (pjd), r219873 (pjd):
The proto API is a general purpose API, so don't use 'hast' in structures or
function names. It can now be used outside of HAST.
r219879:
For requests that are sent only to remote component use the
error from remote.
r219882:
After synchronization is complete we should make primary counters be
equal to secondary counters:
trociny [Tue, 29 Mar 2011 17:59:30 +0000 (17:59 +0000)]
MFC r219368:
r219368 (pjd):
To be able to use printf extensions we need to turn off gcc format checking.
Following the convention of NO_WERROR and NO_WCAST_ALIGN add NO_WFORMAT,
which, when defined in Makefile, turns off compile-time format checking
(by adding -Wno-format), but still allows to use high WARNS level.
trociny [Tue, 29 Mar 2011 17:52:45 +0000 (17:52 +0000)]
MFC r219342, r219346:
r219342 (pjd):
Fix various issues in how %#T is handled:
- If precision is 0, don't print period followed by no digits.
- If precision is 0 stop printing units as soon as possible
(eg. if we have three years and five days and precision is 0
print only 3y5d).
- If precision is not 0, print all units (eg. 3y0d0h0m0s.00).
r219346 (pjd):
Because we call __printf_out() with a on-stack buffer, also call
__printf_flush() so we are sure it won't be referenced after we return.