When detecting device is booting as Safe Mode, disable all modules and
MagiskHide and skip all operations. The only thing that'll be available
in this state is root (Magisk Manager will also be disabled by system).
Since the next normal boot will also have all modules disabled, this can
be used to rescue a device in the case when a rogue module causes
bootloop and no custom recovery is available (or recoveries without
the ability to decrypt data).
Rewrite the whole module mounting logic from scratch.
Even the algorithm is different compared to the old one.
This new design focuses on a few key points:
- Modular: Custom nodes can be injected into the mount tree.
It's the main reason for starting the rewrite (needed for Android 11)
- Efficient: Compared to the existing implementation, this is the most
efficient (both in terms of computation and memory usage) design I
currently can come up with.
- Accurate: The old mounting logic relies on handling specifically every
edge case I can think of. During this rewrite I actually found some
cases that the old design does not handle properly. This new design is
architected in a way (node types and its rankings) that it should
handle edge cases all by itself when constructing mount trees.
Since SafetyNet CTS is impossible to achieve, leaving MagiskHide on
by default no longer serves a purpose.
For more details regarding the latest SafetyNet changes, please check:
https://twitter.com/topjohnwu/status/1237656703929180160https://twitter.com/topjohnwu/status/1237830555523149824
MagiskHide's functionality will continue to exist within the Magisk
project as it is still extremely effective to hide modifications in
userspace (including SafetyNet's basicIntegrity check).
Future MagiskHide improvements _may_ come, but since the holy grail
has been taken, any form of improvement is now a very low priority.
Vendors are always adding “extra libraries” in /vendor/lib* for their own sake, in this case AS*S loaded with customized `libicuuc.so` for Zenf*ne 5z and led to the failure of dynamic loading libsqlite.so:
<quote>
db: dlopen failed: cannot locate symbol "UCNV_FROM_U_CALLBACK_ESCAPE_63" referenced by "/apex/com.android.runtime/lib64/libandroidicu.so"...
</quote>
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
* Minor optimizations
Co-authored-by: John Wu <topjohnwu@gmail.com>
Previously, we use either BroadcastReceivers or Activities to receive
messages from our native daemon, but both have their own downsides.
Some OEMs blocks broadcasts if the app is not running in the background,
regardless of who the caller is. Activities on the other hand, despite
working 100% of the time, will steal the focus of the current foreground
app, even though we are just doing some logging and showing a toast.
In addition, since stubs for hiding Magisk Manager is introduced, our
only communication method is left with the broadcast option, as
only broadcasting allows targeting a specific package name, not a
component name (which will be obfuscated in the case of stubs).
To make sure root requests will work on all devices, Magisk had to do
some experiments every boot to test whether broadcast is deliverable or
not. This makes the whole thing even more complicated then ever.
So lets take a look at another kind of component in Android apps:
ContentProviders. It is a vital part of Android's ecosystem, and as far
as I know no OEMs will block requests to ContentProviders (or else
tons of functionality will break catastrophically). Starting at API 11,
the system supports calling a specific method in ContentProviders,
optionally sending extra data along with the method call. This is
perfect for the native daemon to start a communication with Magisk
Manager. Another cool thing is that we no longer need to know the
component name of the reciever, as ContentProviders identify themselves
with an "authority" name, which in Magisk Manager's case is tied to the
package name. We already have a mechanism to keep track of our current
manager package name, so this works out of the box.
So yay! No more flaky broadcast tests, no more stupid OEMs blocking
broadcasts for some bizzare reasons. This method should in theory
work on almost all devices and situations.
According to this comment in #1880:
https://github.com/topjohnwu/Magisk/issues/1880#issuecomment-546657588
If Linux recycled our PPID, and coincidentally the process that reused
the PPID is root, AND init wants to kill the whole process group,
magiskd will get killed as a result.
There is no real way to block a SIGKILL signal, so we simply make sure
our daemon PID is the process group leader by renaming the directory.
Close#1880
Usually, the communication between native and the app is done via
sending intents to either broadcast or activity. These communication
channels are for launching root requests dialogs, sending root request
notifications (the toast you see when an app gained root access), and
root request logging.
Sending intents by am (activity manager) usually requires specifying
the component name in the format of <pkg>/<class name>. This means parts
of Magisk Manager cannot be randomized or else the native daemon is
unable to know where to send data to the app.
On modern Android (not sure which API is it introduced), it is possible
to send broadcasts to a package, not a specific component. Which
component will receive the intent depends on the intent filter declared
in AndroidManifest.xml. Since we already have a mechanism in native code
to keep track of the package name of Magisk Manager, this makes it
perfect to pass intents to Magisk Manager that have components being
randomly obfuscated (stub APKs).
There are a few caveats though. Although this broadcasting method works
perfectly fine on AOSP and most systems, there are OEMs out there
shipping ROMs blocking broadcasts unexpectedly. In order to make sure
Magisk works in all kinds of scenarios, we run actual tests every boot
to determine which communication method should be used.
We have 3 methods in total, ordered in preference:
1. Broadcasting to a package
2. Broadcasting to a specific component
3. Starting a specific activity component
Method 3 will always work on any device, but the downside is anytime
a communication happens, Magisk Manager will steal foreground focus
regardless of whether UI is drawn. Method 1 is the only way to support
obfuscated stub APKs. The communication test will test method 1 and 2,
and if Magisk Manager is able to receive the messages, it will then
update the daemon configuration to use whichever is preferable. If none
of the broadcasts can be delivered, then the fallback method 3 will be
used.
In commit 8d4c407, native Magisk always launches an activity for
communicating with Magisk Manager. While this works extremely well,
since it also workaround stupid OEMs that blocks broadcasts, it has a
problem: launching an activity will claim the focus of the device,
which could be super annoying in some circumstances.
This commit adds a new feature to run a broadcast test on boot complete.
If Magisk Manager successfully receives the broadcast, it will toggle
a setting in magiskd so all future su loggings and notifies will always
use broadcasts instead of launching activities.
Fix#1412
For devices come with two /data mount points, magisk will bind the one in tmpfs and failed to load modules since this partition is empty.
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
We used to construct /sbin tmpfs overlay in early-init stage after
SELinux is properly initialized. However the way it is implemented
(forking daemon from magiskinit with complicated file waiting triggers)
is extremely complicated and error prone.
This commit moves the construction of the sbin overlay to pre-init
stage. The catch is that since SELinux is not present at that point,
proper selabel has to be reconstructed afterwards. Some additional
SEPolicy rules are added to make sure init can access magisk binaries,
and the secontext relabeling task is assigned to the main Magisk daemon.
The root nodes are /system and /vendor. Adding new files into these
directories, although works on some devices, mostly bootloops on many
devices out there. So don't allow it, which also makes the whole magic
mounting logic much easier and extensible.
Samsung does not like running cmd before system services are started.
Instead of failing, it will enter an infinite wait on binder.
Move APK installation to boot complete to make sure pm can be run
without blocking process.
Forseeing the future that more and more A only system-as-root devices
would have similar bootloader behavior as the latest Samsung devices
(that is, no ramdisk will be loaded into memory when booting from
the boot partition), a solution/workaround has to be made when Magisk
is installed to the recovery partition, making custom recoveries
unable to co-exist with Magisk.
This commit allows magiskinit to read input device events from the
kernel to detect when a user holds volume key up to toggle whether
system-as-root mode is enabled. When system-as-root mode is disabled,
magiskinit will boot with ramdisk instead of cloning rootfs from system,
which in this case will boot to the recovery.
Some devices (mainly new Samsung phones we're talking here...) using
A only system-as-root refuse to load ramdisk when booted with boot
no matter what we do. With many A only system-as-root devices, even
though their boot image is kernel only, we can still be able to add
a ramdisk section into the image and force the kernel to use it as
rootfs. However the bootloader on devices like the S10 simply does
not load anything within boot image into memory other than the kernel.
This gives as the only option is to install Magisk on the recovery
partition. This commits adds proper support for these kind of scenarios.
95%+ of existing modules enables auto mount (obviously).
Switching auto mount to opt-out makes more sense than opt-in as
in previous module format. The file 'auto_mount' will be ignored, and
the file 'skip_mount' will be checked to toggle the mounting behavior.
After scanning through the current Magisk Module Repo modules, no
modules are using custom bind mounting; all modules with auto mount
disabled have empty system folder, which means this change will not
affect any existing module.
It seems both Android cancers, Samsung and Huawei devices, don't
like preloading sepolicy. For a temporary solution now is to limit
the sepolicy loading to Android Q only.
Of course, the cancer of Android, Huawei, has to do some f**king weird
modifications to the Linux kernel. Its kernel only accepts 1 single
policy load in its lifetime, a second load will result in ENOMEM error.
Since Huawei devices always use their own stupid ramdisk setup and not
system-as-root, not loading sepolicy is not a concern (for now).
Android Q init assumes rootfs to always be on EXT4 images, thus
never runs restorecon on the whole root directory. This is an issue
because some folders in rootfs were set with special selabels in
the system partition, but when copying over to initramfs by magiskinit,
these labels will not be preserved.
So the solution is to relabel the files in rootfs with the original
context right? Yes, but rootfs does not allow security xattr to be set
on files before the kernel SELinux initializes with genfs_contexts.
We have to load our sepolicy to the kernel before we clone the root
directory from system partition, which we will also restore the selabel
in the meantime.
Unfortunately this means that for each reboot, the exact same policy
will be loaded to the kernel twice: once in magiskinit so we can label
rootfs properly, and once by the original init, which is part of the
boot procedure. There is no easy way to prevent init from loading
sepolicy, as init will refuse to continue if policy loading has failed.
No matter if we use the old, buggy, error prone am_proc_start monitoring,
or the new APK inotify method, both methods rely on MagiskHide 'reacting'
fast enough to hijack the process before any detection has been done.
However, this is not reliable and practical. There are apps that utilize
native libraries to start detects and register SIGCONT signal handlers
to mitigate all existing MagiskHide process monitoring mechanism. So
our only solution is to hijack an app BEFORE it is started.
All Android apps' process is forked from zygote, so it is easily the
target to be monitored. All forks will be notified, and subsequent
thread spawning (Android apps are heaviliy multithreaded) from children
are also closely monitored to find the earliest possible point to
identify what the process will eventually be (before am_proc_bound).
ptrace is extremely complicated and very difficult to get right. The
current code is heaviliy tested on a stock Android 9.0 Pixel system,
so in theory it should work fine on most devices, but more tests and
potentially fixes are expected to follow this commit.
Before switching to the new MagiskHide implementation (APK inotify),
logcat parsing provides us lots of information to target a process.
We were targeting components so that apps with multi-processes
can still be hidden properly.
After switching to the new implementation, our granularity is limited
to the UID of the process. This is especially dangerous since Android
allow apps signed with the same signature to share UIDs, and many system
apps utilize this for elevated permissions for some services.
This commit introduces process name matching. We could not blanketly
target an UID, so the workaround is to verify its process name before
unmounting.
The tricky thing is that any app developer is allowed to name the
process of its component to whatever they want; there is no 'one
rule to catch them all' to target a specific package. As a result,
Magisk Manager is updated to scan through all components of all apps,
and show different processes of the same app, each as a separate
hide target in the list.
The hide target database also has to be updated accordingly.
Each hide target is now a <package name, process name> pair. The
magiskhide CLI and Magisk Manager is updated to support this new
target format.
Since we switched to imageless Magisk, module files are directly
stored in /data. However, /data is mounted with nosuid, which also
prevents SELinux typetransition to work (auto transition from one
domain to another when executing files with specific context).
This could cause serious issues when we are replacing system critical
components (e.g. app_process for Xposed), because most of them
are daemons that run in special process domains.
This commit introduced /data mirror. Using similar mirroring technique
we used for system and vendor, we mount another mirror that mounts
/data without nosuid flag. All module files are then mounted from this
mirror mountpoint instead of directly from /data.
Close#1080
Since we are parsing through /data/app/ to find target APKs for
monitoring, system apps will not be covered in this case.
Automatically reinstall system apps as if they received an update
and refresh the monitor target after it's done.
As a bonus, use RAII idioms for locking pthread_mutex_t.
Previous MagiskHide detects new app launches via listening through logcat
and filtering launch info messages.
This is extremely inefficient and prone to cause multiple issues both
theoratically and practically.
Rework this by using inotify to detect open() syscalls to target APKs.
This also solves issues related to Zygote-forked caching mechanisms such as
OnePlus OxygenOS' embryo.
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Mounting ext4 images causes tons of issues, such as unmountable with broken F2FS drivers.
Resizing is also very complicated and does not work properly on all devices.
Each step in either measuring free space, resizing, and shrinking the image is a
point of failure, and either step's failure could cause the module system completely broken.
The new method is to directly store modules into /data/adb/modules, and for module installation
on boot /data/adb/modules_update. Several compatibility layers has been done: the new path is
bind mounted to the old path (/sbin/.magisk/img), and the helper functions in util_functions.sh
will now transparently make existing modules install to the new location without any changes.
MagiskHide is also updated to unmount module files stored in this new location.
Introduce a new communication method between Magisk and Magisk Manager.
Magisk used to hardcode classnames and send broadcast/start activities to
specific components. This new method makes no assumption of any class names,
so Magisk Manager can easily be fully obfuscated.
In addition, the new method connects Magisk and Magisk Manager with random
abstract Linux sockets instead of socket files in filesystems, bypassing
file system complexities (selinux, permissions and such)
Boot services tend to fail in the middle when the kernel loads a sepolicy live.
It seems that moving full patch (allow magisk * * *) to late_start is still not enough to fix service startup failures.
So screw it, apply all patched in magiskinit, which makes sure that all rules are only loaded in a single step.
The only down side is that some OEM with a HUGE set of secontexts (e.g. Samsung) might suffer a slightly longer boot time, which IS the reason why the rules are split to 2 parts in the first place.
In previous versions, magiskinit will not early mount if /sepolicy is detected. However on OP5/5T latest betas, the devices are fully trebelized,
but for some reason the file /sepolicy still exists, making magiskinit think it is NOT a treble device and doesn't work properly.
So to properly fix this issue, I will have to use the "official" way - check fstab in device trees. Any block mentioned in the fstab in device trees
are supposed to be early mounted. Currently magiskinit will only mount system and vendor even if other partitions exists in the dtb fstab, since other
partitions are not used to construct sepolicy (currently).
These changes can also fix#373, since we dynamically detect PARTNAME from device trees.