Rework rtld's TLS Variant I implementation to match r326794
The above commit fixed handling overaligned TLS segments in libc's
TLS Variant I implementation, but rtld provides its own implementation
for dynamically-linked executables which lacks these fixes. Thus,
port these changes to rtld.
This was previously commited as r337978 and reverted in r338149 due to
exposing a bug the ARM rtld. This bug was fixed in r338317 by mmel.