The hijacked load node does not need to be a FIFO. A FIFO is only
required for blocking init's control flow, which is already achieved
by hijacking the enforce node.
Previously `read_string()` calls `std::string.resize()` with a int read from remote process. When I/O error occurs, -1 will be used for resizing the string, `std::bad_alloc` is thrown and since magisk is compiled with `-fno-exceptions`, it will crash the whole daemon process.
May fix topjohnwu#5681
Since Android 13, sepolicy are also loaded from APEX modules. Part
of the change is to run restorecon before SELinux is set to enforce.
In order to support this situation, we also hijack plat_file_contexts
if necessary to properly order our operations.
Original idea credits to @yujincheng08, close#5603
On older Android versions, pre-mounting selinuxfs will lead to errors,
so we have to use a different method to block init's control flow.
Since all devices that falls in this catagory must both:
1. Be Android 8.0 - 9.0
2. Have early mount fstab in its device tree
We can actually use the same FIFO trick, but this time not on selinuxfs,
but on the read-only device tree nodes in sysfs or procfs. By mocking
the fstab/compatible node in the device tree, we can block init when
it attempts to do early mount; at that point, we can then mock selinuxfs
as we normally would, successfully hijack and inject patched sepolicy.
In the current implementation, Magisk will either have to recreate
all early mount implementation (for legacy SAR and rootfs devices) or
delegate early mount to first stage init (for 2SI devices) to access
required partitions for loading sepolicy. It then has to recreate the
split sepolicy loading implementation in-house, apply patches, then
dump the compiled + patched policies into monolithic format somewhere.
Finally, it patches the original init to force it to load the sepolicy
file we just created.
With the increasing complexity involved in early mount and split
sepolicy (there is even APEX module involved in the future!),
it is about time to rethink Magisk's sepolicy strategy as rebuilding
init's functionality is not scalable and easy to maintain.
In this commit, instead of building sepolicy ourselves, we mock
selinuxfs with FIFO files connected to a pre-init daemon, waiting
for the actual init process to directly write the sepolicy file into
MagiskInit. We then patch the file and load it into the kernel. Some
FIFO tricks has to be used to hijack the original init process's
control flow and prevent race conditions, details are directly in the
comments in code.
At the moment, only system-as-root (read-only root) support is added.
Support for legacy rootfs devices will come with a follow up commit.
Design credit to @yujincheng08
Close#5146. Fix#5491, fix#3752
Previously, Magisk changes the mount point from /system to /system_root
by patching fstab to prevent the original init from changing root.
The reason why we want to prevent the original init from switching the
root directory is because it will then be read-only, making patching
and injecting magiskinit into the boot chain difficult.
This commit (ab)uses the fact that the /data folder will never be part
of early mount (because it is handled very late in the boot by vold),
so that we can use it as the mount point of tmpfs to store files.
Some advantages of this method:
- No need to switch root manually
- No need to modify fstab, which significantly improves compatibility
e.g. avoid hacks for weird devices like those using oplus.fstab,
and avoid hacking init to bypass fstab in device trees
- Supports skip_mount.cfg
- Support DSU
In the constructor of mmap_data, there are two possible values when fails: nullptr if fstat() fails, and MAP_FAILED if mmap() fails, but mmap_data treated MAP_FAILED as valid address and crashes.
Samsung FDE devices with the "persist.sys.zygote.early=true" property will cause Zygote to start before post-fs-data. According to Magisk's document, the post-fs-data phase should always happen before Zygote is started. Features assuming this behavior (like Zygisk and modules that need to control zygote) will not work. To avoid breaking existing modules, we simply invalidate this property to prevent this non-standard behavior from happening
Fix#5299, fix#5328, fix#5308
Co-authored-by: LoveSy <shana@zju.edu.cn>
* Further fix `oplus.fstab` support
In some oneplus devices, `oplus.fstab` does exists but `init` never
loaded it and those entries in `oplus.fstab` are written directly to
`fstab.qcom`. Previous implementation will introduce duplicate entries
to `fstab.qcom` and brick the device. This commit filters those entries
from `oplus.fstab` that are already in `fstab.qcom` and further filters
duplicated entries in `oplus.fstab` (keep only the last entry).
Fix#5016
* Fix UB
Since we moved entry, we need to explicitly copy its member.
For c++23 we can use `auto{}`.
- Use ftruncate64 instead of ftruncate to workaround seccomp
- Cast uint32_t to off64_t before making it negative
Note: Using ftruncate with a modern NDK libc should actually be
fine as the syscall wrapper in bionic will use ftruncate64 internally.
However, since we are using the libc.a from r10e built for Gingerbread,
seccomp wasn't a thing back then, and also the ftruncate64 symbol is
missing; we have to create our own wrapper and call it instead on
32-bit ABIs.
Props to @jnotuo for discovering the overflow bug and seccomp issue
Fix#3703, close#4915
`operator==` of string_view will create a tmp `string_view`.
It's an UB if the `const char *` is a nullptr.
`fdt_get_name` however will return a nullptr.
Samsung Galaxy A21S and Galaxy M12, probably others, are hdr_v2 boot.img with 2SI judging by the ramdisk contents, but the dtb contains an extra cmdline with skip_initramfs present, even though this shouldn't exist on 2SI and the kernel apparently doesn't even contain a skip_initramfs function
I can't find examples of other devices where skip_initramfs is present in the dtb other than these so patch it out like we do the kernel
Co-authored-by: topjohnwu <topjohnwu@gmail.com>
Custom ROM bring-ups of legacy Sony devices contain the following:
/init (symlink to /bin/init_sony)
/init.real (the "real" Android init)
/bin/init_sony (this was /sbin/init_sony on Android <11)
Kernel loads the ramdisk and starts /init -> /bin/init_sony
/bin/init_sony does low-level device setup (see: https://github.com/LineageOS/android_device_sony_common/blob/lineage-18.1/init/init_main.cpp)
/bin/init_sony unlinks /init and renames /init.real to /init
/bin/init_sony starts /init
Since init_sony needs to run first magiskinit needs to replace init.real instead, so add workarounds based on detection of init.real to boot patcher and uninstaller
Thanks @115ek and @bleckdeth
Fixes#3636
Co-authored-by: topjohnwu <topjohnwu@gmail.com>
Fix topjohnwu#4810
> [ 2.927463] [1: init: 1] magiskinit: Replace [/system/etc/selinux/plat_sepolicy.cil] -> [xxx]
[ 2.936801] [1: init: 1] magiskinit: write failed with 14: Bad address
Since topjohnwu#4596, magisk fails to patch `/init`, xwrite() fails with EFAULT, break the original `/init` file and make the device unbootable. Reverting this commit for legacy rootfs devices fixes the problem. I think this is a Samsung kernel magic since currently I can't reproduce this on other devices or find something special in the log currently we have.
- The lambda here infers its return type as `std::string`,
and since `info` is `const`, the labmda copies `info.name`
and returns a `std::string&&`. After captured by the
`std::string_view`, the `std::string&&` return value
deconstructs and makes `std::string_view` refers to a
dangling pointer.
libselinux.so will be unmounted when magiskd starts. If magiskd restarts (like it died before boot completed), the files we want to unmount is the original files because the modified files is unmounted in previous start, which will causes many crashes due to missing libselinux.so.
- In `unmap_all`, replace readable pages atomically with mmap + mremap
- Create new function `remap_all` to replace pages with equivalent
anonymous copies to prevent simple maps name scanning
* This seems to be a logic that has been abandoned for a
long time. Now we automatically choose which partition
to store sepolicy.rule. Furthermore, touching /persist is
what we should avoid doing whenever possible.
Fix#4204
`_root` is uninitialized for non-root nodes. And it will cause `module_node::mount` fail because it uses `root()`. Once the bug is triggered, signal 11 is received but Magisk catch all signals and therefore stuck forever.
* Support deodexed ROM: This should not be done and dexpreopt is mandatory since P
Xposed: Xposed handles them just fine, at least in the latest version 89.3
suMiscL6: For whatever audio mods, a leftover of phh time
Liveboot and suBackL6: Was for CF.lumen and LiveBoot, not needed now
* Also cleanup binder sepolicies since we allow all binder transactions.
faccessat() should return 0 when success, but it returns random number with errno == 0 in x86 platform.
It’s a side effect of commit bf80b08b5f when magisk binaries ‘corretly’ linked with library of API16 .. lol
Co-authored-by: John Wu <topjohnwu@gmail.com>
* There will be garbage output when executing `su` (#4016)
* Failed to check root status and showing N/A in status (#4005)
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
- Block signals in logging routine (fix#3976)
- Prevent possible deadlock after fork (stdio locks internally)
by creating a new FILE pointer per logging call (thread/stack local)
Magisk's policy is to never allow 3rd party code to be loaded in the
zygote daemon process so we have 100% control over injection and hiding.
However, this makes it impossible for 3rd party modules to run anything
before process specialization, which includes the ability to modify the
arguments being sent to these original nativeForkAndXXX methods.
The trick here is to fork before calling the original nativeForkAndXXX
methods, and hook `fork` in libandroid_runtime.so to skip the next
invocation; basically, we're moving the responsibility of process
forking to our own hands.
On devices where the primary storage is slow to probe it makes sense to
wait forever for the system partition to mount, this emulates the
kernel's behaviour when waiting for rootfs on SAR if the rootwait
parameter is supplied.
This issue was encountered with some SD cards on the Nintendo Switch.
Previously, Magisk uses persist or cache for storing modules' custom
sepolicy rules. In this commit, we significantly broaden its
compatibility and also prevent mounting errors.
The persist partition is non-standard and also critical for Snapdragon
devices, so we prefer not to use it by default.
We will go through the following logic to find the best suitable
non-volatile, writable location to store and load sepolicy.rule files:
Unencrypted data -> FBE data unencrypted dir -> cache -> metadata -> persist
This should cover almost all possible cases: very old devices have
cache partitions; newer devices will use FBE; latest devices will use
metadata FBE (which guarantees a metadata parition); and finally,
all Snapdragon devices have the persist partition (as a last resort).
Fix#3179
This commit adds support for kernel initialized dm-verity on legacy SAR
devices.
Tested on a Pixel 2 XL with a kernel patch to initialize mappings
specified via the `dm=` kernel parameter even when an initramfs is used.
Due to changes in ec3705f2ed, the app can
no longer communicate with the dameon through a socket opened on the
daemon side due to SELinux restrictions. The workaround here is to have
the daemon decide a socket name, send it to the app, have the app create
the socket server, then finally the daemon connects to the app through
the socket.
Introduce new domain `magisk_client` and new file type `magisk_exec`.
Connection to magiskd's always-on socket is restricted to magisk_client
only. Whitelisted process domains can transit to magisk_client through
executing files labelled magisk_exec. The main magisk binary shall be
the only file labelled as magisk_exec throughout the whole system.
All processes thus are no longer allowed to connect to magiskd directly
without going through the proper magisk binary.
Connection failures are silenced from audit logs with dontaudit rules,
so crazy processes which traverse through all unix domain sockets to try
connection can no longer check logcat to know the actual reason behind
EACCES, leaking the denied process policy (which is u:r:magisk:s0).
This also allows us to remove many rules that open up holes in
untrusted_app domains that were used to make remote shell work properly.
Since all processes establishing the remote shell are now restricted to
the magisk_client domain, all these rules are moved to magisk_client.
This makes Magisk require fewer compromises in Android's security model.
Note: as of this commit, requesting new root access via Magisk Manager
will stop working as Magisk Manager can no longer communicate with
magiskd directly. This will be addressed in a future commit that
involves changes in both native and application side.
- legacy devices brought up to Android 10 may now use a compressed dt in a hdr_v0 AOSP dt variant extra section, so detect, decompress and recompress this
- so far these have only been done using lz4 compression (latest format revision magic), e.g. LOS 17.1 victara (Moto X)
For match-all-type rules (e.g. "allow magisk * * *" used in Magisk),
we used to iterate and apply rules on all existing types. However, this
is actually unnecessary as all selinux types should have at least 1
attributes assigned to it (process types "domain", file context types
"file_type" etc.). This means in order to create rules that applies to
all types, we actually only need to create rules for all attributes.
This optimization SIGNIFICANTLY reduces the patched sepolicy that is
loaded into the kernel when running Magisk. For example on Pixel 4 XL
running Android R DP4, the sepolicy sizes are
patched (before) : 3455948
patched (after) : 843176
stock : 630229
The active sepolicy size actually impacts the performance of every single
operation in the operating system, because the larger the policies gets,
the longer it takes for the kernel to lookup and match rules.
It is possible that a module is breaking the device so bad that zygote
cannot even be started. In this case, system_server cannot start and
detect the safe mode key combo, set the persist property, and reboot.
Also on old Android versions, the system directly goes to safe mode
after detecting a key combo without rebooting, defeating the purpose of
Magisk's safe mode protection if we only check for the persist property.
Directly adding key combo check natively in magiskd allows us to enter
Magisk safe mode before the system is even aware of it.
When detecting device is booting as Safe Mode, disable all modules and
MagiskHide and skip all operations. The only thing that'll be available
in this state is root (Magisk Manager will also be disabled by system).
Since the next normal boot will also have all modules disabled, this can
be used to rescue a device in the case when a rogue module causes
bootloop and no custom recovery is available (or recoveries without
the ability to decrypt data).
Since we no longer need to add new properties in the device tree, and
all the patches we do removes strings, we can just directly patch
the flat device tree in-place, ignoring basically all the higher level
DTB structure and format to accomplish 100% compatibility.
Patching DTBs is proven to be difficult and problematic as there are
tons of different formats out there. Adding support for all the formats
in magiskboot has been quite an headache in the past year, and it still
definitely does not cover all possible cases of them out there.
There is another issue: fake dt fstabs. Some super old devices do not
have device trees in their boot images, so some custom ROM developers
had came up with a "genius" solution: hardcode fstab entries directly
in the kernel source code and create fake device tree nodes even if
Android 10+ init can graciously take fstab files instead (-_-) 。。。
And there is YET another issue: DTBs are not always in boot images!
Google is crazy enough to litter DTBs all over the place, it is like
they cannot make up their minds (duh). This means the dt fstabs can be
either concatnated after the kernel (1), in the DTB partition (2), in
the DTBO partition (3), in the recovery_dtbo section in boot images (4),
or in the dtb section in boot images (5). FIVE f**king places, how can
anyone keep up with that!
With Android 10+ that uses 2 stage inits, it is crutual for Magisk to
be able to modify fstab mount points in order to let the original init
mount partitions for us, but NOT switch root and continue booting. For
devices using dt for early mount fstab, we used to patch the DTB at
install time with magiskboot. However these changes are permanent and
cannot be restored back at reinstallation.
With this commit, Magisk will read dt fstabs and write them to ramdisk
at boot time. And in that case, the init binary will also be patched
to force it to NEVER use fstabs in device-tree. By doing so, we can
unify ramdisk based 2SI fstab patching as basically we are just patching
fstab files. This also means we can manipulate fstab whatever Magisk
needs in the future without the need to going through the headache that
is patching DTBs at installation.