I. Introduction
As the name suggests, a watchdog stands guard over something. Watchdogs first appeared in microcontroller systems. A microcontroller's working environment is easily disturbed by external electromagnetic fields, which can make the program "run away" and leave the whole system unable to work properly. A watchdog was therefore introduced to monitor the microcontroller's running state in real time and take protective action on failure, for example by rebooting the system. This kind of watchdog lives at the hardware level and needs dedicated circuitry.
Linux also has a watchdog. In the Linux kernel, once the watchdog is started a timer is armed; if nothing writes to /dev/watchdog before the timeout expires, the system reboots. A watchdog implemented with a timer like this belongs to the software level.
Android also has a software-level Watchdog, which protects important system services such as AMS, WMS, and PMS. These core services run inside the system_server process, so when one of them misbehaves the usual response is to kill system_server, which restarts the Android system. Because of this mechanism, you will occasionally see Android restart after system_server was killed by Watchdog.
That was a brief introduction. So how does Watchdog work in Android, and how does it check the system services? This article analyzes the Watchdog source code and mechanism based on Android 8.1.
II. Watchdog registration and startup
a. Registration
The framework source shows that AMS and WMS register the corresponding Watchdog check types inside their constructors:
1. Monitor registration
A service registers itself as a monitor like this:
Watchdog.getInstance().addMonitor(this);
The callback looks like this:
public void monitor() {
synchronized (this) { }
}
This checks that the core service's object lock is not held for too long; a lock held that long blocks many other calls.
2. Handler registration
A service registers its Handler like this:
Watchdog.getInstance().addThread(mHandler);
This checks whether some message is hogging the Handler for a long time, which would keep other messages from being processed.
Either situation leaves the system stuck (System Not Responding); both are analyzed later.
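To see why an empty synchronized block is enough to detect a stuck lock, here is a minimal plain-Java sketch. It is not Android code: LockProbeSketch, probe(), and holdLock() are hypothetical names made up for this illustration, and the timeout is passed in by the caller rather than taken from Watchdog.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a core service that exposes a monitor() callback. */
class LockProbeSketch {
    private final Object serviceLock = new Object();

    /** Same shape as the monitor() callback: returns only once the lock is free. */
    void monitor() {
        synchronized (serviceLock) { }
    }

    /** Runs monitor() on a helper thread and reports whether it returned in time. */
    boolean probe(long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        Thread t = new Thread(() -> { monitor(); done.countDown(); }, "probe");
        t.setDaemon(true);
        t.start();
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Holds the service lock on another thread until the returned latch is counted down. */
    CountDownLatch holdLock() {
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch acquired = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (serviceLock) {
                acquired.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        holder.setDaemon(true);
        holder.start();
        try {
            acquired.await(); // make sure the lock is really held before returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return release;
    }
}
```

While another thread holds the lock, probe() times out; once the lock is released, it returns true again. That is the whole signal a Monitor check relies on.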
b. Startup
Watchdog itself is a Thread. Registration has to finish before it starts, otherwise an exception is thrown (covered in the source analysis below). So where is Watchdog started?
Anyone familiar with the framework knows the system_server process: all of Android's core services run in it, and if it fails the Android system restarts. Watchdog is started in system_server as well. Take a look:
try {
traceBeginAndSlog("StartServices");
startBootstrapServices();
startCoreServices();
startOtherServices();
SystemServerInitThreadPool.shutdown();
}
When system_server starts, it launches the core system services through the calls above. Watchdog is started inside startOtherServices(); here is the relevant code:
private void startOtherServices() {
........
........
traceBeginAndSlog("StartWatchdog");
Watchdog.getInstance().start();
traceEnd();
........
........
}
Watchdog.getInstance().start() runs only after the core system services have been started, which matches what was said earlier. Next, the source code shows how Watchdog works.
III. Watchdog source code analysis
The Watchdog source lives at frameworks/base/services/core/java/com/android/server/Watchdog.java. As mentioned, Watchdog itself is a Thread:
a. Initialization
public class Watchdog extends Thread {
........
final ArrayList<HandlerChecker> mHandlerCheckers = new ArrayList<>();
final HandlerChecker mMonitorChecker;
.......
.......
private Watchdog() {
super("watchdog");
// The shared foreground thread is the main checker. It is where we
// will also dispatch monitor checks and do other work.
mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
"foreground thread", DEFAULT_TIMEOUT);
mHandlerCheckers.add(mMonitorChecker);
// Add checker for main thread. We only do a quick check since there
// can be UI running on the thread.
mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
"main thread", DEFAULT_TIMEOUT));
// Add checker for shared UI thread.
mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
"ui thread", DEFAULT_TIMEOUT));
// And also check IO thread.
mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
"i/o thread", DEFAULT_TIMEOUT));
// And the display thread.
mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
"display thread", DEFAULT_TIMEOUT));
// Initialize monitor for Binder threads.
addMonitor(new BinderThreadMonitor());
mOpenFdMonitor = OpenFdMonitor.create();
// See the notes on DEFAULT_TIMEOUT.
assert DB ||
DEFAULT_TIMEOUT > ZygoteConnectionConstants.WRAPPED_PID_TIMEOUT_MILLIS;
}
.......
.......
}
The constructor builds several HandlerChecker objects and adds them to the mHandlerCheckers list; pay particular attention to mMonitorChecker. HandlerChecker is what checks the system services, and the checks fall into two kinds:
Monitor Checker: detects possible deadlocks on Monitor objects. Core system services such as AMS and WMS are all Monitor objects.
Handler Checker: checks whether a thread's message queue has been busy for too long. Watchdog's own message queue and the global Ui, Io, and Display message queues are all checked. In addition, the message queues of important threads in core services such as AMS and PMS are added as Handler Checkers when the corresponding objects are initialized.
The constructor also calls addMonitor(new BinderThreadMonitor()), which checks Binder:
// Initialize monitor for Binder threads.
addMonitor(new BinderThreadMonitor());
private static final class BinderThreadMonitor implements Watchdog.Monitor {
@Override
public void monitor() {
Binder.blockUntilThreadAvailable();
}
}
During the periodic check this calls down into the native layer; the corresponding code is in frameworks/native/libs/binder/IPCThreadState.cpp:
void IPCThreadState::blockUntilThreadAvailable()
{
pthread_mutex_lock(&mProcess->mThreadCountLock);
while (mProcess->mExecutingThreadsCount >= mProcess->mMaxThreads) {
ALOGW("Waiting for thread to be free. mExecutingThreadsCount=%lu mMaxThreads=%lu\n",
static_cast<unsigned long>(mProcess->mExecutingThreadsCount),
static_cast<unsigned long>(mProcess->mMaxThreads));
pthread_cond_wait(&mProcess->mThreadCountDecrement, &mProcess->mThreadCountLock);
}
pthread_mutex_unlock(&mProcess->mThreadCountLock);
}
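BinderThreadMonitor is also added to mMonitorChecker. Its job is to confirm whether the Binder thread pool has run out. For example, suppose mMaxThreads is 15: once the number of executing threads reaches that limit, the monitor waits for a thread to become free, so a long wait here indicates that Binder is blocked.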
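The waiting pattern in the native code above can be sketched in plain Java with a lock and a condition. This is only an illustration: ThreadBudgetSketch and its methods are hypothetical names, and unlike the native function the wait here takes a timeout so that it can be exercised without hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of a bounded pool of worker threads. */
class ThreadBudgetSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition decremented = lock.newCondition();
    private final int maxThreads;
    private int executing;

    ThreadBudgetSketch(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    /** A worker starts handling a call. */
    void beginWork() {
        lock.lock();
        try {
            executing++;
        } finally {
            lock.unlock();
        }
    }

    /** A worker finishes; waiters are woken so they can re-check the count. */
    void endWork() {
        lock.lock();
        try {
            executing--;
            decremented.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Waits until fewer than maxThreads workers are busy; false if still saturated at timeout. */
    boolean awaitThreadAvailable(long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (executing >= maxThreads) {
                if (nanos <= 0L) {
                    return false;
                }
                try {
                    nanos = decremented.awaitNanos(nanos); // returns the time left
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

In the real monitor there is no timeout inside the wait; the time limit comes from Watchdog noticing that monitor() has not returned.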
b. HandlerChecker
Initialization creates several HandlerChecker objects and adds them to a list. Here is what HandlerChecker is:
/**
* Used for checking status of handle threads and scheduling monitor callbacks.
*/
public final class HandlerChecker implements Runnable {
private final Handler mHandler;
private final String mName;
private final long mWaitMax;
private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>();
private boolean mCompleted;
private Monitor mCurrentMonitor;
private long mStartTime;
HandlerChecker(Handler handler, String name, long waitMaxMillis) {
mHandler = handler;
mName = name;
mWaitMax = waitMaxMillis;
mCompleted = true;
}
public void addMonitor(Monitor monitor) {
mMonitors.add(monitor);
}
public void scheduleCheckLocked() {
if (mMonitors.size() == 0 && mHandler.getLooper().getQueue().isPolling()) {
mCompleted = true;
return;
}
if (!mCompleted) {
// we already have a check in flight, so no need
return;
}
mCompleted = false;
mCurrentMonitor = null;
mStartTime = SystemClock.uptimeMillis();
mHandler.postAtFrontOfQueue(this);
}
public boolean isOverdueLocked() {
return (!mCompleted) && (SystemClock.uptimeMillis() > mStartTime + mWaitMax);
}
public int getCompletionStateLocked() {
if (mCompleted) {
return COMPLETED;
} else {
long latency = SystemClock.uptimeMillis() - mStartTime;
if (latency < mWaitMax/2) {
return WAITING;
} else if (latency < mWaitMax) {
return WAITED_HALF;
}
}
return OVERDUE;
}
.......
........
public String describeBlockedStateLocked() {
if (mCurrentMonitor == null) {
return "Blocked in handler on " + mName + " (" + getThread().getName() + ")";
} else {
return "Blocked in monitor " + mCurrentMonitor.getClass().getName()
+ " on " + mName + " (" + getThread().getName() + ")";
}
}
@Override
public void run() {
final int size = mMonitors.size();
for (int i = 0 ; i < size ; i++) {
synchronized (Watchdog.this) {
mCurrentMonitor = mMonitors.get(i);
}
mCurrentMonitor.monitor();
}
synchronized (Watchdog.this) {
mCompleted = true;
mCurrentMonitor = null;
}
}
}
As you can see, HandlerChecker is a Runnable used to check the running state of a Handler and to drive the Monitor callbacks. Its main methods are:
addMonitor(): adds a Monitor object to the mMonitors list;
scheduleCheckLocked(): entry point of a check; posts the checker itself to the message queue;
isOverdueLocked(): reports whether the check has timed out;
getCompletionStateLocked(): returns the completion state;
describeBlockedStateLocked(): returns the failure description, which tells whether a Monitor or a Handler timed out;
run(): performs the check;
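The state logic of getCompletionStateLocked() is easier to follow as a pure function of time. The sketch below restates it outside the class; CompletionStateSketch and stateFor() are hypothetical names, and the numeric values of the four constants are an assumption (the excerpt above does not show them).

```java
/** Pure-function restatement of the completion-state logic, for illustration only. */
class CompletionStateSketch {
    // Assumed values; only their ordering by severity matters here.
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;

    /**
     * @param completed     whether the posted check has already run to the end
     * @param startMillis   when the check was scheduled
     * @param waitMaxMillis the checker's timeout
     * @param nowMillis     the current time on the same clock as startMillis
     */
    static int stateFor(boolean completed, long startMillis, long waitMaxMillis, long nowMillis) {
        if (completed) {
            return COMPLETED;
        }
        long latency = nowMillis - startMillis;
        if (latency < waitMaxMillis / 2) {
            return WAITING;       // less than half the timeout has elapsed
        } else if (latency < waitMaxMillis) {
            return WAITED_HALF;   // past the halfway mark, not yet timed out
        }
        return OVERDUE;           // the full timeout has elapsed
    }
}
```

With a 60-second timeout, an unfinished check is WAITING for the first 30 seconds, WAITED_HALF for the next 30, and OVERDUE from the 60-second mark on.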
c. Registration
Watchdog checks consist of Monitor checks and Handler checks, each with its own registration entry point:
c.1: Monitor registration
public void addMonitor(Monitor monitor) {
synchronized (this) {
if (isAlive()) {
throw new RuntimeException("Monitors can't be added once the Watchdog is running");
}
mMonitorChecker.addMonitor(monitor);
}
}
As noted earlier, registration must happen before Watchdog starts. addMonitor() enforces this: if the thread is already running, it throws an exception. Otherwise it calls mMonitorChecker.addMonitor() to add the monitor to mMonitorChecker's mMonitors list;
Note: every core system service registers itself through addMonitor(), which ends up in mMonitorChecker's addMonitor(). In other words, they are all checked through mMonitorChecker, whose run() iterates over mMonitors. mMonitorChecker is created in the Watchdog constructor and then added to the mHandlerCheckers list;
c.2: Handler registration
public void addThread(Handler thread) {
addThread(thread, DEFAULT_TIMEOUT);
}
public void addThread(Handler thread, long timeoutMillis) {
synchronized (this) {
if (isAlive()) {
throw new RuntimeException("Threads can't be added once the Watchdog is running");
}
final String name = thread.getLooper().getThread().getName();
mHandlerCheckers.add(new HandlerChecker(thread, name, timeoutMillis));
}
}
addThread() registers a Handler check. DEFAULT_TIMEOUT is 1 minute. The service's own Handler is passed in to create a HandlerChecker, which is then added to the mHandlerCheckers list;
Comparing the two registration methods shows that Monitor and Handler checks are carried out separately. The Monitor checks of all core system services run inside a single HandlerChecker, namely mMonitorChecker, whereas Handler checks run in separate HandlerCheckers, one created for each registering service.
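The shape of the two registration paths can be modelled with a few lines of plain Java. This is a simplified sketch, not the framework class: RegistrySketch, Checker, and the counting helpers are hypothetical, monitors are represented as Runnable, and a boolean flag stands in for the isAlive() check.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: one shared monitor checker plus one checker per registered thread. */
class RegistrySketch {
    static final class Checker {
        final String name;
        final List<Runnable> monitors = new ArrayList<>();

        Checker(String name) {
            this.name = name;
        }
    }

    private final Checker monitorChecker = new Checker("foreground thread");
    private final List<Checker> checkers = new ArrayList<>();
    private boolean started; // stands in for Thread.isAlive()

    RegistrySketch() {
        checkers.add(monitorChecker); // the shared checker is itself in the list
    }

    /** All monitors land in the single shared checker. */
    synchronized void addMonitor(Runnable monitor) {
        if (started) {
            throw new IllegalStateException("Monitors can't be added once running");
        }
        monitorChecker.monitors.add(monitor);
    }

    /** Each registered thread gets a checker of its own. */
    synchronized void addThread(String threadName) {
        if (started) {
            throw new IllegalStateException("Threads can't be added once running");
        }
        checkers.add(new Checker(threadName));
    }

    synchronized void start() {
        started = true;
    }

    synchronized int checkerCount() {
        return checkers.size();
    }

    synchronized int monitorCount() {
        return monitorChecker.monitors.size();
    }
}
```

Registering two monitors and one thread leaves two checkers in total: the shared one holding both monitors, and one for the thread.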
d. Starting the checks
Watchdog is a Thread, so the checks begin as soon as the thread starts. Here is its run() method:
@Override
public void run() {
........
while (true) {
.......
synchronized (this) {
long timeout = CHECK_INTERVAL;
//-----------------Analysis 1------------------------
for (int i=0; i<mHandlerCheckers.size(); i++) {
HandlerChecker hc = mHandlerCheckers.get(i);
hc.scheduleCheckLocked();
}
........
//-----------------Analysis 2------------------------
long start = SystemClock.uptimeMillis();
while (timeout > 0) {
......
try {
wait(timeout);
} catch (InterruptedException e) {
Log.wtf(TAG, e);
}
.........
timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
}
boolean fdLimitTriggered = false;
.......
if (!fdLimitTriggered) {
//-----------------Analysis 3------------------------
final int waitState = evaluateCheckerCompletionLocked();
if (waitState == COMPLETED) {
waitedHalf = false;
continue;
} else if (waitState == WAITING) {
continue;
} else if (waitState == WAITED_HALF) {
........
continue;
}
// something is overdue!
//-----------------Analysis 4------------------------
blockedCheckers = getBlockedCheckersLocked();
subject = describeCheckersLocked(blockedCheckers);
} else {
blockedCheckers = Collections.emptyList();
subject = "Open FD high water mark reached";
}
allowRestart = mAllowRestart;
}
..........
.........
//------------------------Analysis 5--------------------------------
Slog.w(TAG, "*** WATCHDOG KILLING SYSTEM PROCESS: " + subject);
for (int i=0; i<blockedCheckers.size(); i++) {
Slog.w(TAG, blockedCheckers.get(i).getName() + " stack trace:");
StackTraceElement[] stackTrace
= blockedCheckers.get(i).getThread().getStackTrace();
for (StackTraceElement element: stackTrace) {
Slog.w(TAG, " at " + element);
}
}
Slog.w(TAG, "*** GOODBYE!");
Process.killProcess(Process.myPid());
System.exit(10);
}
waitedHalf = false;
}
}
The logic inside run() is somewhat involved, so it is split into five parts:
Analysis 1: iterate over mHandlerCheckers and call scheduleCheckLocked() on each to start a check;
Analysis 2: wait out the check period; the interval between checks is set by the CHECK_INTERVAL constant, 30 seconds by default;
Analysis 3: evaluate the completion state of the HandlerCheckers: COMPLETED means finished; WAITING and WAITED_HALF mean still waiting but not yet timed out; OVERDUE means timed out;
Analysis 4: if any HandlerChecker is overdue, collect the blocked HandlerCheckers and build a description;
Analysis 5: save logs, print the stack traces, then kill the system process;
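Two small pieces of arithmetic sit behind parts 2 and 3. The sketch below shows them in isolation, under assumptions: RoundEvaluationSketch and its method names are hypothetical, the body of evaluateCheckerCompletionLocked() is not shown in the excerpt above, and taking the most severe state across checkers is my reading of how the result is used, with the states ordered COMPLETED < WAITING < WAITED_HALF < OVERDUE.

```java
/** Illustration of the per-round bookkeeping; names and constant values are assumptions. */
class RoundEvaluationSketch {
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;

    /** The round is as bad as its worst checker. */
    static int evaluate(int[] checkerStates) {
        int state = COMPLETED;
        for (int s : checkerStates) {
            state = Math.max(state, s);
        }
        return state;
    }

    /**
     * Time still to wait in this round. Recomputing it after every wake-up is what
     * lets the loop go back to wait() when it is woken early.
     */
    static long remaining(long intervalMillis, long startMillis, long nowMillis) {
        return intervalMillis - (nowMillis - startMillis);
    }
}
```

For example, if the round started at uptime 1000 ms and the thread wakes at 11000 ms with a 30000 ms interval, 20000 ms remain, so the inner loop waits again rather than evaluating early.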
run() is a while(true) loop: as long as the system process has not been killed, it keeps calling scheduleCheckLocked() on each HandlerChecker. Here is that method again:
public void scheduleCheckLocked() {
if (mMonitors.size() == 0 && mHandler.getLooper().getQueue().isPolling()) {
mCompleted = true;
return;
}
// If the previous check has not completed yet, return without scheduling another one
if (!mCompleted) {
// we already have a check in flight, so no need
return;
}
mCompleted = false;
mCurrentMonitor = null;
mStartTime = SystemClock.uptimeMillis();
// Post this checker at the front of the message queue so that it runs first
mHandler.postAtFrontOfQueue(this);
}
For Monitor checks of the core services, mHandler is always the Handler provided by FgThread (thread name android.fg). For Handler checks, mHandler is the service's own Handler. The two checks look at different things, so they use different Handlers.
@Override
public void run() {
final int size = mMonitors.size();
for (int i = 0 ; i < size ; i++) {
synchronized (Watchdog.this) {
mCurrentMonitor = mMonitors.get(i);
}
mCurrentMonitor.monitor();
}
synchronized (Watchdog.this) {
mCompleted = true;
mCurrentMonitor = null;
}
}
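Inside HandlerChecker's run(), for the Monitor check (mMonitorChecker) mMonitors is non-empty, so it calls each monitor() in turn, which acquires the core service's lock, and sets mCompleted once they have all returned. For a Handler check mMonitors is empty, so no monitor() is called and mCompleted is set right away;
The difference between the two: a Monitor check has to run monitor() to acquire a lock, and if the lock cannot be acquired it blocks until the timeout, which points to a deadlock or a lock held for too long by someone else. A Handler check only needs run() to execute: that alone shows the service's Handler is working and not clogged by other messages. If mCompleted is still false, the runnable never ran, probably because a long-running message in the Handler is blocking the queue;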
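The Handler check can be imitated in plain Java with a single-threaded executor playing the role of the service's message queue. This is a sketch under stated assumptions: QueueProbeSketch and its methods are hypothetical, and an executor cannot post at the front of its queue the way postAtFrontOfQueue() does, so the probe simply queues behind whatever is already there.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of a Handler check: post a probe and see whether it ever runs. */
class QueueProbeSketch {
    private final ExecutorService queue = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "service-handler");
        t.setDaemon(true);
        return t;
    });
    private volatile boolean completed = true;

    /** Posts the probe unless one is already in flight. */
    void scheduleCheck() {
        if (!completed) {
            return;
        }
        completed = false;
        queue.execute(() -> completed = true);
    }

    /** Posts a message that occupies the queue until the returned latch is counted down. */
    CountDownLatch postBlockingMessage() {
        CountDownLatch release = new CountDownLatch(1);
        queue.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return release;
    }

    /** Polls the flag until it is set or the timeout elapses. */
    boolean awaitCompleted(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!completed) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

On an idle queue the flag flips back almost immediately. With a long-running message ahead of it, the probe never runs and the flag stays false, which is exactly what Watchdog reads as "Blocked in handler".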
IV. Case study
When a core system service misbehaves and trips the Watchdog check, the stack traces are written to a file named in the following format: 20210810014838_traces_SystemServer_WDT10_8月_01_48_37.661_pid1107. If a Monitor is blocked, the following EventLog entry is printed:
08-10 01:47:56.394 1107 1440 I watchdog: Blocked in monitor com.android.server.wm.WindowManagerService on foreground thread (android.fg)
If a Handler is blocked, the following EventLog entry is printed instead:
08-10 01:47:56.394 1107 1440 I watchdog: Blocked in handler on xx (xx)
Then the warning log entries follow:
08-10 01:48:42.967 1107 1440 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in monitor com.android.server.wm.WindowManagerService on foreground thread (android.fg)
08-10 01:48:46.255 1107 1440 W Watchdog: *** GOODBYE!
At this point system_server has been killed and the system restarts.
In the anr directory, find the file 20210810014838_traces_SystemServer_WDT10_8月_01_48_37.661_pid1107, unpack it, and locate the android.fg thread:
"android.fg" prio=5 tid=16 Blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x134011c0 self=0x7a8077c600
| sysTid=1147 nice=0 cgrp=default sched=0/0 handle=0x7a70bfc4f0
| state=S schedstat=( 1586113377 2192768722 6889 ) utm=70 stm=88 core=1 HZ=100
| stack=0x7a70afa000-0x7a70afc000 stackSize=1037KB
| held mutexes=
at com.android.server.wm.WindowManagerService.monitor(WindowManagerService.java:7069)
- waiting to lock <0x05f70fd8> (a com.android.server.wm.WindowHashMap)
at com.android.server.Watchdog$HandlerChecker.run(Watchdog.java:211)
at android.os.Handler.handleCallback(Handler.java:790)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:164)
at android.os.HandlerThread.run(HandlerThread.java:65)
at com.android.server.ServiceThread.run(ServiceThread.java:46)
The monitor is waiting for lock 0x05f70fd8, which is a WindowHashMap. Next, search for who holds 0x05f70fd8:
"android.anim" prio=5 tid=26 Runnable
| group="main" sCount=0 dsCount=0 flags=0 obj=0x133c40c0 self=0x7a70e19600
| sysTid=1199 nice=-4 cgrp=default sched=0/0 handle=0x7a700cd4f0
| state=R schedstat=( 299015045884 448343750812 840845 ) utm=22091 stm=7810 core=3 HZ=100
| stack=0x7a6ffcb000-0x7a6ffcd000 stackSize=1037KB
| held mutexes= "mutator lock"(shared held)
at com.android.server.wm.WindowContainer.forAllWindows(WindowContainer.java:-1)
at com.android.server.wm.AppWindowToken.forAllWindowsUnchecked(AppWindowToken.java:1549)
at com.android.server.wm.AppWindowToken.forAllWindows(AppWindowToken.java:1544)
at com.android.server.wm.WindowContainer.forAllWindows(WindowContainer.java:616)
at com.android.server.wm.WindowContainer.forAllWindows(WindowContainer.java:616)
at com.android.server.wm.WindowContainer.forAllWindows(WindowContainer.java:616)
at com.android.server.wm.DisplayContent$TaskStackContainers.forAllWindows(DisplayContent.java:3434)
at com.android.server.wm.DisplayContent.forAllWindows(DisplayContent.java:1556)
at com.android.server.wm.WindowContainer.forAllWindows(WindowContainer.java:633)
at com.android.server.wm.DisplayContent.updateWallpaperForAnimator(DisplayContent.java:2655)
at com.android.server.wm.WindowAnimator.animate(WindowAnimator.java:202)
- locked <0x05f70fd8> (a com.android.server.wm.WindowHashMap)
at com.android.server.wm.WindowAnimator.lambda$-com_android_server_wm_WindowAnimator_3951(WindowAnimator.java:105)
at com.android.server.wm.-$Lambda$OQfQhd_xsxt9hoLAjIbVfOwa-jY.$m$0(unavailable:-1)
at com.android.server.wm.-$Lambda$OQfQhd_xsxt9hoLAjIbVfOwa-jY.doFrame(unavailable:-1)
at android.view.Choreographer$CallbackRecord.run(Choreographer.java:964)
at android.view.Choreographer.doCallbacks(Choreographer.java:778)
at android.view.Choreographer.doFrame(Choreographer.java:710)
at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:952)
at android.os.Handler.handleCallback(Handler.java:790)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:164)
at android.os.HandlerThread.run(HandlerThread.java:65)
at com.android.server.ServiceThread.run(ServiceThread.java:46)
As you can see, the lock is held by the animate() method of WindowAnimator, so the next step is to find out why that method keeps holding it.
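The manual search above (find the "waiting to lock" address, then look for the thread that has it "locked") can be scripted. The sketch below is a hypothetical helper, not an existing tool: LockOwnerSketch and findOwner() are made-up names, and it assumes the trace layout shown in this article, where each thread starts with its quoted name and held locks appear as "- locked <address>" lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that finds which thread in a trace dump holds a given lock. */
class LockOwnerSketch {
    // A thread section starts with a line such as: "android.anim" prio=5 tid=26 Runnable
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"([^\"]+)\"");

    /** Returns the name of the thread holding lockAddress, or null if none is found. */
    static String findOwner(String trace, String lockAddress) {
        String needle = "- locked <" + lockAddress + ">";
        String currentThread = null;
        for (String line : trace.split("\n")) {
            String trimmed = line.trim();
            Matcher m = THREAD_HEADER.matcher(trimmed);
            if (m.find()) {
                currentThread = m.group(1); // remember which section we are in
            } else if (trimmed.startsWith(needle)) {
                return currentThread;
            }
        }
        return null;
    }
}
```

Run against the two thread dumps in this case, a lookup of 0x05f70fd8 should point at android.anim, the same conclusion reached by hand.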
That covers how Watchdog is used and how it works. system_server is a critical Android process that runs the core services. If those services cannot run properly, whether because of a deadlock or a message queue that stays busy, there is no point in keeping the system running. Watchdog takes on the job of checking system_server, and if the process is in trouble it kills it so that the system restarts.