Ar cannot handle UIDs with more than 6 digits, and storing the mtime,
uid, gid and mode provides little to negative value anyhow for ar's
uses. Turn on deterministic (-D) mode by default; it can be disabled by
the user with -U.
PR: 196929
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3190
ed [Wed, 29 Jul 2015 12:42:45 +0000 (12:42 +0000)]
Split up Capsicum to CloudABI rights conversion into two separate routines.
CloudABI's openat() ensures that files are opened with the smallest set
of relevant rights. For example, when opening a FIFO, unrelated rights
like CAP_RECV are automatically removed. To remove unrelated rights, we
can just reuse the code for this that was already present in the rights
conversion function.
Limit the number of supported device IDs to 0x100000
in order to decrease the size of the ITS device table so
that it matches with the HW capabilities.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3131
Skip checks for IPv6 multicast addresses.
Use in6_localip() for global unicast.
And for IPv6 link-local addresses do search in the IPv6 addresses list.
Since LLA are stored in the kernel internal form, use
IN6_ARE_MASKED_ADDR_EQUAL() macro with lla_mask for addresses comparison.
lla_mask has zero bits in the second word, where we keep sin6_scope_id.
Convert in_ifaddr_lock and in6_ifaddr_lock to rmlock.
Both are used to protect access to IP addresses lists and they can be
acquired for reading several times per packet. To reduce lock contention
it is better to use rmlock here.
RFC4868 section 2.3 requires that the output be half... This fixes
problems that was introduced in r285336... I have verified that
HMAC-SHA2-256 both ah only and w/ AES-CBC interoperate w/ a NetBSD
6.1.5 vm...
When we allocate the struct pf_fragment in pf_fillup_fragment() we forgot to
initialise the fr_flags field. As a result we sometimes mistakenly thought the
fragment to not be a buffered fragment. This resulted in panics because we'd end
up freeing the pf_fragment but not removing it from V_pf_fragqueue (believing it
to be part of V_pf_cachequeue).
The next time we iterated V_pf_fragqueue we'd use a freed object and panic.
While here also fix a pf_fragment use after free in pf_normalize_ip().
pf_reassemble() frees the pf_fragment, so we can't use it any more.
ed [Wed, 29 Jul 2015 06:31:44 +0000 (06:31 +0000)]
Implement CloudABI's readdir().
Summary:
CloudABI's readdir() system call could be thought of as a mixture
between FreeBSD's getdents(2) and pread(). Instead of using the file
descriptor offset, userspace provides a 64-bit cloudabi_dircookie_t
continue reading at a given point. CLOUDABI_DIRCOOKIE_START, having
value 0, can be used to return entries at the start of the directory.
The file descriptor offset is not used to store the cookie for the
reason that in a file descriptor centric environment, it would make
sense to allow concurrent use of a single file descriptor.
The remaining space returned by the system call should be filled with a
partially truncated copy of the next entry. The advantage of doing this
is that it gracefully deals with long filenames. If the C library
provides a buffer that is too small to hold a single entry, it can still
extract the directory entry header, meaning that it can retry the read
with a larger buffer or skip it using the cookie.
Test Plan:
This implementation passes the cloudlibc unit tests at:
jeff [Wed, 29 Jul 2015 02:26:57 +0000 (02:26 +0000)]
- Make 'struct buf *buf' private to vfs_bio.c. Having a global variable
'buf' is inconvenient and has lead me to some irritating to discover
bugs over the years. It also makes it more challenging to refactor
the buf allocation system.
- Move swbuf and declare it as an extern in vfs_bio.c. This is still
not perfect but better than it was before.
- Eliminate the unused ffs function that relied on knowledge of the buf
array.
- Move the shutdown code that iterates over the buf array into vfs_bio.c.
jeff [Tue, 28 Jul 2015 20:24:09 +0000 (20:24 +0000)]
- Eliminate the EMPTYKVA queue. It served as a cache of KVA allocations
attached to bufs to avoid the overhead of the vm. This purposes is now
better served by vmem. Freeing the kva immediately when a buf is
destroyed leads to lower fragmentation and a much simpler scan algorithm.
- Avoid lock contention in the if_transmit callback by using trylock and
enqueueing the frames when it fails. This way there is some latency
removed from the transmitting path.
- If IFF_DRV_OACTIVE is set (and also if IFF_DRV_RUNNING is not) just
enqueue the desired frames and return successful transmit. This way we
avoid to return errors on transmit side and resulting in
possible out-of-order frames. Please note that IFF_DRV_OACTIVE is set
everytime we get the threshold ring hit, so this can be happening quite
often.
Submitted by: Attilio.Rao@isilon.com
MFC after:5 days
On some platforms, the /cpus node contains cpu-to-cluster
map which deffinitely is not a CPU node. Its presence was
causing incrementing of "id" variable and reporting more
CPUs available than it should.
To make "id" valid, increment it only when an entry really
is a CPU device.
Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3216
ed [Tue, 28 Jul 2015 12:57:19 +0000 (12:57 +0000)]
Implement file attribute modification system calls for CloudABI.
CloudABI uses a system call interface to modify file attributes that is
more similar to KPI's/FUSE, namely where a stat structure is passed back
to the kernel, together with a bitmask of attributes that should be
changed. This would allow us to update any set of attributes atomically.
That said, I'd rather not go as far as to actually implement it that
way, as it would require us to duplicate more code than strictly needed.
Let's just stick to the combinations that are actually used by
cloudlibc.
As ZFS requires a more kernel stack pages than is the default on some
architectures e.g. i386, warn if KSTACK_PAGES is less than
ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing).
Optimise the DWC OTG host mode driver's receive path:
Remove NAKing limit and pause IN and OUT transactions for 125us in
case of NAK response for BULK and CONTROL endpoints. This gets the
receive latency down and improves USB network throughput at the cost
of some CPU usage.
Remove full barrier from the amd64 atomic_load_acq_*(). Strong
ordering semantic of x86 CPUs makes only the compiler barrier
neccessary to give the acquire behaviour.
Existing implementation ensured sequentially consistent semantic for
load_acq, making much stronger guarantee than required by standard's
definition of the load acquire. Consumers which depend on the barrier
are believed to be identified and already fixed to use proper
operations.
Noted by: alc (long time ago)
Reviewed by: alc, bde
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Remove useless acquire semantic from the atomic_add operation before
sosend(). The only release on the xp_snt_cnt is done after sosend(),
with an intent to synchronize with load_acq in svc_vc_ack().
Reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
ed [Tue, 28 Jul 2015 06:50:47 +0000 (06:50 +0000)]
Implement directory and FIFO creation.
The file_create() system call can be used to create files of a given
type. Right now it can only be used to create directories and FIFOs. As
CloudABI does not expose filesystem permissions, this system call lacks
a mode argument. Simply use 0777 or 0666 depending on the file type.
ed [Tue, 28 Jul 2015 06:36:49 +0000 (06:36 +0000)]
Make fstat() and friends work.
Summary:
CloudABI provides access to two different stat structures:
- fdstat, containing file descriptor level status: oflags, file
descriptor type and Capsicum rights, used by cap_rights_get(),
fcntl(F_GETFL), getsockopt(SO_TYPE).
- filestat, containing your regular file status: timestamps, inode
number, used by fstat().
Unlike FreeBSD's stat::st_mode, CloudABI file descriptor types don't
have overloaded meanings (e.g., returning S_ISCHR() for kqueues). Add a
utility function to extract the type of a file descriptor accurately.
CloudABI does not work with O_ACCMODEs. File descriptors have two sets
of Capsicum-style rights: rights that apply to the file descriptor
itself ('base') and rights that apply to any new file descriptors
yielded through openat() ('inheriting'). Though not perfect, we can
pretty safely decompose Capsicum rights to such a pair. This is done in
convert_capabilities().
Test Plan: Tests for these system calls are fairly extensive in cloudlibc.
Rewrite scan procedure with a FSM. This improves code readability by
making clear transits between different states, and avoids bug with
handling repeated $'s.
marius [Mon, 27 Jul 2015 15:26:50 +0000 (15:26 +0000)]
- Move the remainder of host controller capability registers reading from
xhci_start_controller() to xhci_init(). These values don't change at run-
time so there's no point of acquiring them on every USB_HW_POWER_RESUME
instead of only once during initialization. In r276717, reading the first
couple of registers in question already had been moved as a prerequisite
for the changes in that revision.
- Identify ASMedia ASM1042A controllers.
- Use NULL instead of 0 for pointers.
ed [Mon, 27 Jul 2015 13:17:57 +0000 (13:17 +0000)]
Make shutdown() return ENOTCONN as required by POSIX, part deux.
Summary:
Back in 2005, maxim@ attempted to fix shutdown() to return ENOTCONN in case the socket was not connected (r150152). This had to be rolled back (r150155), as it broke some of the existing programs that depend on this behavior. I reapplied this change on my system and indeed, syslogd failed to start up. I fixed this back in February (279016) and MFC'ed it to the supported stable branches. Apart from that, things seem to work out all right.
Since at least Linux and Mac OS X do the right thing, I'd like to go ahead and give this another try. To keep old copies of syslogd working, only start returning ENOTCONN for recent binaries.
I took a look at the XNU sources and they seem to test against both SS_ISCONNECTED, SS_ISCONNECTING and SS_ISDISCONNECTING, instead of just SS_ISCONNECTED. That seams reasonable, so let's do the same.
Test Plan:
This issue was uncovered while writing tests for shutdown() in CloudABI:
marius [Mon, 27 Jul 2015 12:14:14 +0000 (12:14 +0000)]
- Probe UICLASS_CDC/UISUBCLASS_ABSTRACT_CONTROL_MODEL/0xff again. This
variant of Microsoft RNDIS, i. e. their unofficial version of CDC ACM,
has been disabled in r261544 for resolving a conflict with umodem(4).
Eventually, in r275790 that problem was dealt with in the right way.
However, r275790 failed to put probing of RNDIS devices in question
back.
- Initialize the device prior to querying it, as required by the RNDIS
specification. Otherwise already determining the MAC address may fail
rightfully.
- On detach, halt the device again.
- Use UCDC_SEND_ENCAPSULATED_{COMMAND,RESPONSE}. While these macros are
resolving to the same values as UR_{CLEAR_FEATURE,GET_STATUS}, the
former set is way more appropriate in this context.
- Report unknown - rather: unimplemented - events unconditionally and
not just in debug mode. This ensures that we'll get some hint of what
is going wrong instead of the driver silently failing.
- Deal with the Microsoft ActiveSync requirement of using an input buffer
the size of the expected reply or larger - except for variably sized
replies - when querying a device.
- Fix some pointless NULL checks, style bugs etc.
This changes allow urndis(4) to communicate with a Microsoft-certified
USB RNDIS test token.
ed [Mon, 27 Jul 2015 10:07:29 +0000 (10:07 +0000)]
Add a futex implementation for CloudABI.
Summary:
CloudABI provides two different types of futex objects: read-write locks
and condition variables. There is no need to provide separate support
for once objects and thread joining, as these are efficiently simulated
by blocking on a read-write lock. Mutexes simply use read-write locks.
Condition variables always have a lock object associated to them. They
always know to which lock a thread needs to be migrated if woken up.
This allows us to implement requeueing. A broadcast on a condition
variable will never cause multiple threads to be woken up at once. They
will be woken up iteratively.
This implementation still has lots of room for improvement. Locking is
coarse and right now we use linked lists to store all of the locks and
condition variables, instead of using a hash table. The primary goal of
this implementation was to behave correctly. Performance will be
improved as we go.
Test Plan:
This futex implementation has been in use for the last couple of months
and seems to work pretty well. All of the cloudlibc and libc++ unit
tests seem to pass.
ed [Mon, 27 Jul 2015 10:04:06 +0000 (10:04 +0000)]
Sync in latest upstream system call definitions.
Futex object scopes have been renamed from using their own constants to
simply reusing the existing CLOUDABI_MAP_{PRIVATE,SHARED} flags, as they
are more accurate in this context.
Change the dev argument from a full path to just the device
identification (e.g. isa:0x3f0 or pci0:2:1:0). In libbus,
the device is turned into a path name. For bus_space_map(),
the resource is now specified in a second argument.
Before:
bus.map('/dev/proto/pci0:2:1:0/pcicfg')
busdma.tag_create('/dev/proto/pci0:2:1:0/busdma', ...)
Now:
bus.map('pci0:2:1:0', 'pcicfg')
busdma.tag_create('pci0:2:1:0', ...)
Document r285679, bsdinstall(8) updates to workaround various
problematic BIOSes when booting from GPT, and partition scheme
selection in the UFS partition menu.
o make sure the boundary is a power of 2, when not zero.
o don't convert 0 to ~0 just so that we can use MIN. ~0 is not a
valid boundary. Introduce BNDRY_MIN that deals with 0 values
that mean no boundary.
Build debug version of rmlock's methods only when LOCK_DEBUG > 0.
Currently LOCK_DEBUG is always defined in sys/lock.h (0 or 1).
This means that debugging code always built. In addition the kernel
modules have always defined LOCK_DEBUG as 1. So, debugging rmlock code
is always used by kernel modules.
Bump GCC max-inline-insns-single in libiconv_modules and grep
This is required by our FORTIFY_SOURCE implementation as it
does more inlining. As a rule of thumb, FORTIFY_SOURCE doubles
the number of inlines except that in grep inlining
blows up for some reason.
Revert r173708's modifications to vm_object_page_remove().
Assume that a vnode is mapped shared and mlocked(), and then the vnode
is truncated, or truncated and then again extended past the mapping
point EOF. Truncation removes the pages past the truncation point,
and if pages are later created at this range, they are not properly
mapped into the mlocked region, and their wiring count is wrong.
The revert leaves the invalidated but wired pages on the object queue,
which means that the pages are found by vm_object_unwire() when the
mapped range is munlock()ed, and reused by the buffer cache when the
vnode is extended again.
The changes in r173708 were required since then vm_map_unwire() looked
at the page tables to find the page to unwire. This is no longer
needed with the vm_object_unwire() introduction, which follows the
objects shadow chain.
Also eliminate OBJPR_NOTWIRED flag for vm_object_page_remove(), which
is now redundand, we do not remove wired pages.
Reported by: trasz, Dmitry Sivachenko <trtrmitya@gmail.com>
Suggested and reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Move including netinet/icmp6.h around to avoid a problem when including
netinet/icmp6.h and net/netmap.h. Both use ni_flags...
This allows to build multistack with SCTP support.
With the removal of b_saveaddr in the r285819, b_data must be reset to
b_kvabase when the buffer is reclaimed. Otherwise, if b_data for the
mapped buffer was adjusted with the page-offset portion of b_offset,
nothing would re-adjust the b_data, which breaks buffer management
code which expects page-aligned b_data (see e.g. bpman_qenter(), which
skips partial pages).
Fix a minor issue with the GB_KVAALLOC requests, which could result in
returning the mapped buffer if the reused buffer is mapped and have
the right amount of KVA reserved.
Improve assertion in the vfs_buf_check_mapped() to catch unmapped
buffers which have their b_data incorrectly adjusted with offset.
Reported and tested by: pho (previous version)
Reviewed by: jeff (previous version)
Sponsored by: The FreeBSD Foundation
Synchronize PIN input/output modes with gnu/dts/include/dt-bindings/pinctrl/am33xx.h
gpio driver requires exact value to match SoC pin mode with GPIO pin direction
OF_getencprop_alloc shouldn't be used to get string value. If string
length + 1 is not divisible by 4 this function returns NULL property
value. Otherwise - string with each 4 letters inverted
Document the fact that system(3) can easily be misused due to shell meta
characters are honored. While I'm there also mention posix_spawn in the
SEE ALSO section.
Add a comment discussing the appropriate use of the atomic_*() functions
with acquire and release semantics versus the *mb() functions on amd64
processors.