Due to changes in ec3705f2ed187863efc34af5415495e1ee7775d2, the app can
no longer communicate with the dameon through a socket opened on the
daemon side due to SELinux restrictions. The workaround here is to have
the daemon decide a socket name, send it to the app, have the app create
the socket server, then finally the daemon connects to the app through
the socket.
Introduce new domain `magisk_client` and new file type `magisk_exec`.
Connection to magiskd's always-on socket is restricted to magisk_client
only. Whitelisted process domains can transit to magisk_client through
executing files labelled magisk_exec. The main magisk binary shall be
the only file labelled as magisk_exec throughout the whole system.
All processes thus are no longer allowed to connect to magiskd directly
without going through the proper magisk binary.
Connection failures are silenced from audit logs with dontaudit rules,
so crazy processes which traverse through all unix domain sockets to try
connection can no longer check logcat to know the actual reason behind
EACCES, leaking the denied process policy (which is u:r:magisk:s0).
This also allows us to remove many rules that open up holes in
untrusted_app domains that were used to make remote shell work properly.
Since all processes establishing the remote shell are now restricted to
the magisk_client domain, all these rules are moved to magisk_client.
This makes Magisk require fewer compromises in Android's security model.
Note: as of this commit, requesting new root access via Magisk Manager
will stop working as Magisk Manager can no longer communicate with
magiskd directly. This will be addressed in a future commit that
involves changes in both native and application side.
- legacy devices brought up to Android 10 may now use a compressed dt in a hdr_v0 AOSP dt variant extra section, so detect, decompress and recompress this
- so far these have only been done using lz4 compression (latest format revision magic), e.g. LOS 17.1 victara (Moto X)
For match-all-type rules (e.g. "allow magisk * * *" used in Magisk),
we used to iterate and apply rules on all existing types. However, this
is actually unnecessary as all selinux types should have at least 1
attributes assigned to it (process types "domain", file context types
"file_type" etc.). This means in order to create rules that applies to
all types, we actually only need to create rules for all attributes.
This optimization SIGNIFICANTLY reduces the patched sepolicy that is
loaded into the kernel when running Magisk. For example on Pixel 4 XL
running Android R DP4, the sepolicy sizes are
patched (before) : 3455948
patched (after) : 843176
stock : 630229
The active sepolicy size actually impacts the performance of every single
operation in the operating system, because the larger the policies gets,
the longer it takes for the kernel to lookup and match rules.
It is possible that a module is breaking the device so bad that zygote
cannot even be started. In this case, system_server cannot start and
detect the safe mode key combo, set the persist property, and reboot.
Also on old Android versions, the system directly goes to safe mode
after detecting a key combo without rebooting, defeating the purpose of
Magisk's safe mode protection if we only check for the persist property.
Directly adding key combo check natively in magiskd allows us to enter
Magisk safe mode before the system is even aware of it.
When detecting device is booting as Safe Mode, disable all modules and
MagiskHide and skip all operations. The only thing that'll be available
in this state is root (Magisk Manager will also be disabled by system).
Since the next normal boot will also have all modules disabled, this can
be used to rescue a device in the case when a rogue module causes
bootloop and no custom recovery is available (or recoveries without
the ability to decrypt data).
Since we no longer need to add new properties in the device tree, and
all the patches we do removes strings, we can just directly patch
the flat device tree in-place, ignoring basically all the higher level
DTB structure and format to accomplish 100% compatibility.
Patching DTBs is proven to be difficult and problematic as there are
tons of different formats out there. Adding support for all the formats
in magiskboot has been quite an headache in the past year, and it still
definitely does not cover all possible cases of them out there.
There is another issue: fake dt fstabs. Some super old devices do not
have device trees in their boot images, so some custom ROM developers
had came up with a "genius" solution: hardcode fstab entries directly
in the kernel source code and create fake device tree nodes even if
Android 10+ init can graciously take fstab files instead (-_-) 。。。
And there is YET another issue: DTBs are not always in boot images!
Google is crazy enough to litter DTBs all over the place, it is like
they cannot make up their minds (duh). This means the dt fstabs can be
either concatnated after the kernel (1), in the DTB partition (2), in
the DTBO partition (3), in the recovery_dtbo section in boot images (4),
or in the dtb section in boot images (5). FIVE f**king places, how can
anyone keep up with that!
With Android 10+ that uses 2 stage inits, it is crutual for Magisk to
be able to modify fstab mount points in order to let the original init
mount partitions for us, but NOT switch root and continue booting. For
devices using dt for early mount fstab, we used to patch the DTB at
install time with magiskboot. However these changes are permanent and
cannot be restored back at reinstallation.
With this commit, Magisk will read dt fstabs and write them to ramdisk
at boot time. And in that case, the init binary will also be patched
to force it to NEVER use fstabs in device-tree. By doing so, we can
unify ramdisk based 2SI fstab patching as basically we are just patching
fstab files. This also means we can manipulate fstab whatever Magisk
needs in the future without the need to going through the headache that
is patching DTBs at installation.
Rewrite the whole module mounting logic from scratch.
Even the algorithm is different compared to the old one.
This new design focuses on a few key points:
- Modular: Custom nodes can be injected into the mount tree.
It's the main reason for starting the rewrite (needed for Android 11)
- Efficient: Compared to the existing implementation, this is the most
efficient (both in terms of computation and memory usage) design I
currently can come up with.
- Accurate: The old mounting logic relies on handling specifically every
edge case I can think of. During this rewrite I actually found some
cases that the old design does not handle properly. This new design is
architected in a way (node types and its rankings) that it should
handle edge cases all by itself when constructing mount trees.
Value of <dt>/fstab/<partition>/dev and <dt>/fstab/<partition>/type in official Android emulator ends with newline instead of \0, Magisk won’t be able to patch sepolicy and crash the system.
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
The existing method for handling legacy SAR is:
1. Mount /sbin tmpfs overlay
2. Dump all patched/new files into /sbin
3. Magic mount root dir and re-exec patched stock init
With Android 11 removing the /sbin folder, it is quite obvious that
things completely break down right in step 1.
To overcome this issue, we have to find a way to swap out the init
binary AFTER we re-exec stock init. This is where 2SI comes to rescue!
2SI normal boot procedure is:
1st stage -> Load sepolicy -> 2nd stage -> boot continue...
2SI Magisk boot procedure is:
MagiskInit 1st stage -> Stock 1st stage -> MagiskInit 2nd Stage ->
-> Stock init load sepolicy -> Stock 2nd stage -> boot continue...
As you can see, the trick is to make stock 1st stage init re-exec back
into MagiskInit so we can do our setup. This is possible by manipulating
some ramdisk files on initramfs based 2SI devices (old ass non SAR
devices AND super modern devices like Pixel 3/4), but not possible
on device that are stuck using legacy SAR (device that are not that
modern but not too old, like Pixel 1/2. Fucking Google logic!!)
This commit introduces a new way to intercept stock init re-exec flow:
ptrace init with forked tracer, monitor PTRACE_EVENT_EXEC, then swap
out the init file with bind mounts right before execv returns!
Going through this flow however will lose some necessary backup files,
so some bookkeeping has to be done by making the tracer hold these
files in memory and act as a daemon. 2nd stage MagiskInit will ack the
daemon to release these files at the correct time.
It just works™ ¯\_(ツ)_/¯
Since SafetyNet CTS is impossible to achieve, leaving MagiskHide on
by default no longer serves a purpose.
For more details regarding the latest SafetyNet changes, please check:
https://twitter.com/topjohnwu/status/1237656703929180160https://twitter.com/topjohnwu/status/1237830555523149824
MagiskHide's functionality will continue to exist within the Magisk
project as it is still extremely effective to hide modifications in
userspace (including SafetyNet's basicIntegrity check).
Future MagiskHide improvements _may_ come, but since the holy grail
has been taken, any form of improvement is now a very low priority.
readlinkat() may return random value instead of the number of bytes placed in buf and crashing the system in two ways:
1. segmentation fault (buf[-7633350] = ‘\0’)
2. wrong link of watchdogd, resulting dog timeout
Confirmed working in ZenFone 2 x86 series, may fix#2247 and #2356
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
Vendors are always adding “extra libraries” in /vendor/lib* for their own sake, in this case AS*S loaded with customized `libicuuc.so` for Zenf*ne 5z and led to the failure of dynamic loading libsqlite.so:
<quote>
db: dlopen failed: cannot locate symbol "UCNV_FROM_U_CALLBACK_ESCAPE_63" referenced by "/apex/com.android.runtime/lib64/libandroidicu.so"...
</quote>
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
* Minor optimizations
Co-authored-by: John Wu <topjohnwu@gmail.com>
Some Motorola devices (Qualcomm kernel with CONFIG_MMI_DEVICE_DTBS
configuration enabled) need 1k of padding to the DTBs to allow for
environment variables to be runtime added by the bootloader.
Those extra paddings will be removed during the process of dtb patch,
devices won’t be able to boot-up and return to fastboot mode immediately
after flashed the flawed boot.img.
Credits to @shakalaca, close#2273
- some Samsung devices (e.g. Galaxy S5 SMG-900H) use a slightly different AOSP bootimg.h variant with `#define BOOT_NAME_SIZE 20` instead of 16
- since all known examples of these device images do not have anything in the NAME or CMDLINE fields, and the bootloader also accepts standard AOSP images, simply offset the SHA1/SHA256 detection by 4 bytes to avoid false positives from these images, remain an equally effective detection shortcut, and ensure a proper SHA1 checksum on repack
aosp-dtbhdt2-4offhash-seandroid-256sig-samsung_gs5-smg900h-boot.img
UNPACK CHECKSUM [00000000b11580f7d20f70297cdc31e02626def0356c82b90000000000000000]
REPACK CHECKSUM [73b18751202e56c433f89dfd1902c290eaf4eef3e167fcf03b814b59a5e984b6]
AIK CHECKSUM [b11580f7d20f70297cdc31e02626def0356c82b9000000000000000000000000]
This patch should result in a `magiskboot unpack -n boot.img; magiskboot repack boot.img` new-boot.img matching the AIK CHECKSUM above.
Previously, we use either BroadcastReceivers or Activities to receive
messages from our native daemon, but both have their own downsides.
Some OEMs blocks broadcasts if the app is not running in the background,
regardless of who the caller is. Activities on the other hand, despite
working 100% of the time, will steal the focus of the current foreground
app, even though we are just doing some logging and showing a toast.
In addition, since stubs for hiding Magisk Manager is introduced, our
only communication method is left with the broadcast option, as
only broadcasting allows targeting a specific package name, not a
component name (which will be obfuscated in the case of stubs).
To make sure root requests will work on all devices, Magisk had to do
some experiments every boot to test whether broadcast is deliverable or
not. This makes the whole thing even more complicated then ever.
So lets take a look at another kind of component in Android apps:
ContentProviders. It is a vital part of Android's ecosystem, and as far
as I know no OEMs will block requests to ContentProviders (or else
tons of functionality will break catastrophically). Starting at API 11,
the system supports calling a specific method in ContentProviders,
optionally sending extra data along with the method call. This is
perfect for the native daemon to start a communication with Magisk
Manager. Another cool thing is that we no longer need to know the
component name of the reciever, as ContentProviders identify themselves
with an "authority" name, which in Magisk Manager's case is tied to the
package name. We already have a mechanism to keep track of our current
manager package name, so this works out of the box.
So yay! No more flaky broadcast tests, no more stupid OEMs blocking
broadcasts for some bizzare reasons. This method should in theory
work on almost all devices and situations.
- support unpack without decompression to allow easy testing of magiskboot's header, structure and hashing handling by comparing repack checksum versus origbootimg
- make -n first to match repack
According to this comment in #1880:
https://github.com/topjohnwu/Magisk/issues/1880#issuecomment-546657588
If Linux recycled our PPID, and coincidentally the process that reused
the PPID is root, AND init wants to kill the whole process group,
magiskd will get killed as a result.
There is no real way to block a SIGKILL signal, so we simply make sure
our daemon PID is the process group leader by renaming the directory.
Close#1880
Usually, the communication between native and the app is done via
sending intents to either broadcast or activity. These communication
channels are for launching root requests dialogs, sending root request
notifications (the toast you see when an app gained root access), and
root request logging.
Sending intents by am (activity manager) usually requires specifying
the component name in the format of <pkg>/<class name>. This means parts
of Magisk Manager cannot be randomized or else the native daemon is
unable to know where to send data to the app.
On modern Android (not sure which API is it introduced), it is possible
to send broadcasts to a package, not a specific component. Which
component will receive the intent depends on the intent filter declared
in AndroidManifest.xml. Since we already have a mechanism in native code
to keep track of the package name of Magisk Manager, this makes it
perfect to pass intents to Magisk Manager that have components being
randomly obfuscated (stub APKs).
There are a few caveats though. Although this broadcasting method works
perfectly fine on AOSP and most systems, there are OEMs out there
shipping ROMs blocking broadcasts unexpectedly. In order to make sure
Magisk works in all kinds of scenarios, we run actual tests every boot
to determine which communication method should be used.
We have 3 methods in total, ordered in preference:
1. Broadcasting to a package
2. Broadcasting to a specific component
3. Starting a specific activity component
Method 3 will always work on any device, but the downside is anytime
a communication happens, Magisk Manager will steal foreground focus
regardless of whether UI is drawn. Method 1 is the only way to support
obfuscated stub APKs. The communication test will test method 1 and 2,
and if Magisk Manager is able to receive the messages, it will then
update the daemon configuration to use whichever is preferable. If none
of the broadcasts can be delivered, then the fallback method 3 will be
used.
Old Qualcomn devices have their own special QC table of DTB to
store device trees. Since patching fstab is now mandatory on Android 10,
and for older devices all early mount devices have to be included into
the fstab in DTBs, patching QCDT is crucial for rooting Android 10
on legacy devices.
Close#1876 (Thanks for getting me aware of this issue!)
The state of ROM A/B OTA addon.d-v2 support is an inconsistent mess currently:
- LineageOS builds userdebug with permissive update_engine domain, OmniROM builds userdebug with a more restricted update_engine domain, and CarbonROM builds user with a hybrid closer to Omni's
- addon.d-v2 scripts cannot function to the full extent they should when there is a more restricted update_engine domain sepolicy in place, which is likely why Lineage made update_engine completely permissive
Evidence for the above:
- many addon.d-v2 scripts only work (or fully work) on Lineage, see below
- Magisk's addon.d-v2 script would work on Lineage without issue, but would work on Carbon and Omni only if further allow rules were added for basic things like "file read" and "dir search" suggesting these ROMs' addon.d-v2 is severely limited
- Omni includes a /system/addon.d/69-gapps.sh script with the ROM itself (despite shipping without GApps), and with Magisk's more permissive sepolicy and no GApps installed it will remove important ROM files during OTA, resulting in a bootloop; the issue with shipping this script was therefore masked by Omni's overly restrictive update_engine sepolicy not allowing the script to function as intended
The solution:
- guarantee a consistent addon.d-v2 experience for users across ROMs when rooted with Magisk by making update_engine permissive as Lineage has
- hopefully ROMs can work together to come up with something standard for unrooted addon.d-v2 function
Directly read from urandom instead of using std::random_device.
libc++ will use iostream under-the-hood, which brings significant
binary size increase that is not welcomed, especially in magiskinit.
- while many newer devices cannot allow / (system partition) to be mounted rw due to compressed fs (e.g. erofs) or logical partitions, it should remain possible to alter rootfs files/directories on those that previously allowed it
The way how logical partition, or "Logical Resizable Android Partitions"
as they say in AOSP source code, is setup makes it impossible to early
mount the partitions from the shared super partition with just
a few lines of code; in fact, AOSP has a whole "fs_mgr" folder which
consist of multiple complex libraries, with 15K lines of code just
to deal with the device mapper shenanigans.
In order to keep the already overly complicated MagiskInit more
managable, I chose NOT to go the route of including fs_mgr directly
into MagiskInit. Luckily, starting from Android Q, Google decided to
split init startup into 3 stages, with the first stage doing _only_
early mount. This is great news, because we can simply let the stock
init do its own thing for us, and we intercept the bootup sequence.
So the workflow can be visualized roughly below:
Magisk First Stage --> First Stage Mount --> Magisk Second Stage --+
(MagiskInit) (Original Init) (MagiskInit) +
+
+
...Rest of the boot... <-- Second Stage <-- Selinux Setup <--+
(__________________ Original Init ____________________)
The catch here is that after doing all the first stage mounting, /init
will pivot /system as root directory (/), leaving us impossible to
regain control after we hand it over. So the solution here is to patch
fstab in /first_stage_ramdisk on-the-fly to redirect /system to
/system_root, making the original init do all the hard work for
us and mount required early mount partitions, but skips the step of
switching root directory. It will also conveniently hand over execution
back to MagiskInit, which we will reuse the routine for patching
root directory in normal system-as-root situations.
- Magisk "dirty" flashes would remove the /overlay directory which might have been put there by a custom kernel or other mod
- this is a leftover from when Magisk itself used /overlay for placing init.magisk.rc, so just remove this file specifically and leave the rest intact
The current system-as-root magiskinit implementation (converting
root directory in system partition to legacy rootfs setup) is now
considered as backwards compatible only.
The new implementation that is hide and Android Q friendly is coming soon.
- when input image had a compressed ramdisk magiskboot had no way to force the repack with the uncompressed ramdisk.cpio since it does not formally recognize cpio as its own format, so add a switch to support forcing repacking to any possible ramdisk format regardless of input image
- when input image had a different supported format (e.g. gzip) magiskboot would not accept a manually compressed ramdisk or kernel in an unsupported format (e.g. lzop) despite being able to recognize it, so instead would double compress using whatever the input format was, breaking the image with, in effect, a ramdisk.cpio.lzo.gz
In commit 8d4c407, native Magisk always launches an activity for
communicating with Magisk Manager. While this works extremely well,
since it also workaround stupid OEMs that blocks broadcasts, it has a
problem: launching an activity will claim the focus of the device,
which could be super annoying in some circumstances.
This commit adds a new feature to run a broadcast test on boot complete.
If Magisk Manager successfully receives the broadcast, it will toggle
a setting in magiskd so all future su loggings and notifies will always
use broadcasts instead of launching activities.
Fix#1412
For devices come with two /data mount points, magisk will bind the one in tmpfs and failed to load modules since this partition is empty.
Signed-off-by: Shaka Huang <shakalaca@gmail.com>
We used to construct /sbin tmpfs overlay in early-init stage after
SELinux is properly initialized. However the way it is implemented
(forking daemon from magiskinit with complicated file waiting triggers)
is extremely complicated and error prone.
This commit moves the construction of the sbin overlay to pre-init
stage. The catch is that since SELinux is not present at that point,
proper selabel has to be reconstructed afterwards. Some additional
SEPolicy rules are added to make sure init can access magisk binaries,
and the secontext relabeling task is assigned to the main Magisk daemon.
Some stupid Samsung ROMs will spawn multiple zygote daemons. Since we
switched to ptrace based process monitoring, we have to know all zygote
processes to trace. This is an attempt to fix this issue.
Close#1272
Since Android Q does not allow launching activities from the background
(Services/BroadcastReceivers) and our native process is root, directly
launch activities and use it for communication between native and app.
The target activity is not exported, so non-root apps cannot send an
intent to fool Magisk Manager. This is as safe as the previous
implementation, which uses protected system broadcasts.
This also workaround broadcast limitations in many ROMs (especially
in Chinese ROMs) which blocks the su request dialog if the app is
frozen/force stopped by the system.
Close#1326
The root nodes are /system and /vendor. Adding new files into these
directories, although works on some devices, mostly bootloops on many
devices out there. So don't allow it, which also makes the whole magic
mounting logic much easier and extensible.
Samsung does not like running cmd before system services are started.
Instead of failing, it will enter an infinite wait on binder.
Move APK installation to boot complete to make sure pm can be run
without blocking process.
Forseeing the future that more and more A only system-as-root devices
would have similar bootloader behavior as the latest Samsung devices
(that is, no ramdisk will be loaded into memory when booting from
the boot partition), a solution/workaround has to be made when Magisk
is installed to the recovery partition, making custom recoveries
unable to co-exist with Magisk.
This commit allows magiskinit to read input device events from the
kernel to detect when a user holds volume key up to toggle whether
system-as-root mode is enabled. When system-as-root mode is disabled,
magiskinit will boot with ramdisk instead of cloning rootfs from system,
which in this case will boot to the recovery.
Some devices (mainly new Samsung phones we're talking here...) using
A only system-as-root refuse to load ramdisk when booted with boot
no matter what we do. With many A only system-as-root devices, even
though their boot image is kernel only, we can still be able to add
a ramdisk section into the image and force the kernel to use it as
rootfs. However the bootloader on devices like the S10 simply does
not load anything within boot image into memory other than the kernel.
This gives as the only option is to install Magisk on the recovery
partition. This commits adds proper support for these kind of scenarios.
It seems that even adding this to the list doesn't 100% works on all
devices out there, and some even reported crashes on several Google
services. Disable it for now and do further investigations in the future.
95%+ of existing modules enables auto mount (obviously).
Switching auto mount to opt-out makes more sense than opt-in as
in previous module format. The file 'auto_mount' will be ignored, and
the file 'skip_mount' will be checked to toggle the mounting behavior.
After scanning through the current Magisk Module Repo modules, no
modules are using custom bind mounting; all modules with auto mount
disabled have empty system folder, which means this change will not
affect any existing module.
It seems both Android cancers, Samsung and Huawei devices, don't
like preloading sepolicy. For a temporary solution now is to limit
the sepolicy loading to Android Q only.
Of course, the cancer of Android, Huawei, has to do some f**king weird
modifications to the Linux kernel. Its kernel only accepts 1 single
policy load in its lifetime, a second load will result in ENOMEM error.
Since Huawei devices always use their own stupid ramdisk setup and not
system-as-root, not loading sepolicy is not a concern (for now).
Android Q init assumes rootfs to always be on EXT4 images, thus
never runs restorecon on the whole root directory. This is an issue
because some folders in rootfs were set with special selabels in
the system partition, but when copying over to initramfs by magiskinit,
these labels will not be preserved.
So the solution is to relabel the files in rootfs with the original
context right? Yes, but rootfs does not allow security xattr to be set
on files before the kernel SELinux initializes with genfs_contexts.
We have to load our sepolicy to the kernel before we clone the root
directory from system partition, which we will also restore the selabel
in the meantime.
Unfortunately this means that for each reboot, the exact same policy
will be loaded to the kernel twice: once in magiskinit so we can label
rootfs properly, and once by the original init, which is part of the
boot procedure. There is no easy way to prevent init from loading
sepolicy, as init will refuse to continue if policy loading has failed.
Allow zygote to execute other programs (such as dex2oat).
This fixes the bug that cause ART framework boot images failed to load
and result to extremely serious performance degradation.
Fix#1195
vector<bool> uses bitsets, so we actually only use 12k memory to
store all 3 possible PID info tables. PID checkup will be now become
O(1) instead of O(logn).
P.S. The reason why we don't use unordered_map is because including it
will result in significant binary size increase (might be due to the
complex hash table STL implementation? I really don't know).
MicroG uses a different package to handle DroidGuard service (SafetyNet),
but still uses the same com.google.android.gms.unstable process name.
Thanks to the changes in 4e53ebfe, we can target both official GMS
and MicroG SafetyNet services at the same time.
No matter if we use the old, buggy, error prone am_proc_start monitoring,
or the new APK inotify method, both methods rely on MagiskHide 'reacting'
fast enough to hijack the process before any detection has been done.
However, this is not reliable and practical. There are apps that utilize
native libraries to start detects and register SIGCONT signal handlers
to mitigate all existing MagiskHide process monitoring mechanism. So
our only solution is to hijack an app BEFORE it is started.
All Android apps' process is forked from zygote, so it is easily the
target to be monitored. All forks will be notified, and subsequent
thread spawning (Android apps are heaviliy multithreaded) from children
are also closely monitored to find the earliest possible point to
identify what the process will eventually be (before am_proc_bound).
ptrace is extremely complicated and very difficult to get right. The
current code is heaviliy tested on a stock Android 9.0 Pixel system,
so in theory it should work fine on most devices, but more tests and
potentially fixes are expected to follow this commit.