Backdoors: Definition, Deniability and Detection

[Research in Attacks, Intrusions, and Defenses]21st International Symposium, RAID 2018 Heraklion, Crete, Greece, September 10–12, 2018 Proceedings.

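These are my reading notes on the paper. It took me quite a while to understand it, so the notes below are not a translation of the original text, and parts of them quote or paraphrase the paper's English directly. I hope they help fellow beginners like me grow together.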

Prerequisites

FSM ( Finite-State Machine )

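According to Wikipedia:
A finite-state machine (FSM), also called a finite-state automaton or simply a state machine, is a mathematical model of a finite number of states together with the transitions between those states and the associated actions.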

Finite-State Machine

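Basically, an FSM gives you a high-level, easy-to-maintain way to implement an algorithm. In this paper, FSMs are used to express the various situations in which a backdoor arises.
Take the state transition table below as an example: from State B, on input Y, the machine moves to State C.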

State transition table

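FSMs can be roughly divided into:

  • Acceptors (recognizers/sequence detectors):
    Produce a binary output (accepting or not accepting).
    Mathematical model: a quintuple θ = (S,i,F,Σ,δ)
  • Transducers:
    Produce output based on the input and/or the current state, using actions.
    Can be divided into Moore machines and Mealy machines. Typically used in control applications.
    Mathematical model: a sextuple θ = (Σ,Γ,S,S0,δ,ω)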
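To make the two kinds concrete, here is a minimal Python sketch. The alphabet, state names, and behaviour are all made up for illustration (they come from neither the paper nor Wikipedia): an acceptor that answers yes/no for strings ending in "ab", and a Mealy-style transducer whose output depends on the current state and the input.

```python
# Acceptor: hypothetical machine over {"a", "b"} that accepts strings ending in "ab".
ACCEPT_DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}
ACCEPT_FINAL = {"q2"}


def accepts(word, start="q0"):
    """Return True if the run ends in a final state (the binary output)."""
    state = start
    for symbol in word:
        state = ACCEPT_DELTA[(state, symbol)]
    return state in ACCEPT_FINAL


# Mealy transducer: hypothetical machine that emits 1 whenever the
# current input differs from the previous one, and 0 otherwise.
MEALY_DELTA = {
    ("start", "a"): ("saw_a", 0), ("start", "b"): ("saw_b", 0),
    ("saw_a", "a"): ("saw_a", 0), ("saw_a", "b"): ("saw_b", 1),
    ("saw_b", "a"): ("saw_a", 1), ("saw_b", "b"): ("saw_b", 0),
}


def transduce(word, start="start"):
    """Return the list of outputs produced while consuming the word."""
    state, outputs = start, []
    for symbol in word:
        state, emitted = MEALY_DELTA[(state, symbol)]
        outputs.append(emitted)
    return outputs
```

The acceptor only reports where the run ends, while the transducer produces an output at every step; that is the practical difference between the two models.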

CFG

Is there anyone who does not know this one? 😳 I will add it anyway.
Control flow graph(CFG): graph notation to represent paths traversed through a program during its execution.
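As a small illustration, a CFG can be written as an adjacency mapping from basic blocks to their possible successors. The block names below are hypothetical and simply stand for an if/else inside a function.

```python
# Hypothetical CFG of a small function: nodes are basic blocks,
# edges are possible control transfers.
CFG = {
    "entry": ["check"],
    "check": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}


def all_paths(cfg, node, goal, path=()):
    """Enumerate acyclic paths from node to goal with a depth-first search."""
    path = path + (node,)
    if node == goal:
        return [path]
    paths = []
    for succ in cfg[node]:
        if succ not in path:  # skip blocks already visited on this path
            paths.extend(all_paths(cfg, succ, goal, path))
    return paths
```

Calling `all_paths(CFG, "entry", "exit")` lists the two routes an execution can take through this function.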

ROP

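Return-Oriented Programming (ROP): an exploitation technique, commonly launched through a buffer overflow, that lets an attacker execute code even when security defenses are in place, and thereby control the program's flow.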

Reference

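The authors cite many papers. Below are the ones I think are worth studying in more detail; I list them here so that readers can go through them first.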

  1. Reference [15] 
    -
    Thomas F. Dullien 
    [Weird machines, exploitability, and provable unexploitability ]
    19 December 2017
    IEEE Transactions on Emerging Topics in Computing


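    Part of this paper gives clear definitions of both exploit and weird machine, and explains under what circumstances a weird machine leads to an exploit.
    The content relevant to this backdoor paper is in its first section:
    1 THE INTENDED FINITE-STATE MACHINE (IFSM)
    1.1 Software as emulators for the IFSM
    “Since any real-world software can be modelled as an IFSM, but has to execute on a real-world general-purpose machine, an emulator for the IFSM needs to be constructed.”
    This passage explains why an emulator is needed.
    So why study the emulator of the software rather than the software itself?
    Consider how a bug or security vulnerability is defined:
    When the security issue arises from a software flaw, it is impossible to even define ’flaw’ without taking into account what a bug-free version of the software would have been.
    So the software is treated as a potentially flawed emulator of the IFSM, and in the state-space sense its flaws can be observed in magnified form.
    In other words, if a problem can be observed at the IFSM level, the real-world software should have a corresponding vulnerability.
    The above concerns software vulnerabilities; for hardware vulnerabilities, see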
    -
    Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. 
    Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors. 
    June 2014
    SIGARCH Comput. Archit. News 42, 3 (June 2014), 361–372.

    -
  2. Reference [14]
    -
    Dennis Andriesse, Herbert Bos
    Instruction-Level Steganography for Covert Trigger-Based Malware 2014 
    Detection of Intrusions and Malware, and Vulnerability Assessment(DIMVA), pp 41–50

    -
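    First, note that trigger-based software stays dormant until its trigger occurs.
    The paper explains that malicious code used to be hidden in rarely executed paths of benign host binaries, which made it hard to find. In recent years, however, automated backdoor detection has made that hiding place unreliable, so the paper proposes a new way to hide trigger-based software: placing the malicious code in spurious code fragments.
    With this approach, neither decompilation nor static analysis reveals the malicious code; only the correct trigger jumps to the hidden code.
    The abstract puts it this way:
    “…implement stealthy control transfers to the hidden code by crafting trigger-dependent bugs, which jump to the hidden code only if provided with the correct trigger.”
    So as long as execution never jumps to the hidden code, nothing is detected.
    The authors demonstrate feasibility with an implementation: a hidden backdoor built using the Nginx HTTP server module.
    It is worth reading up on Nginx and on this backdoor implementation first.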

As for the other references in the paper, I will note them when they come up in these notes, but please look them up in the original paper's reference list.


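For the main body that follows, I still recommend reading alongside the original paper.
I will present key points and discussion, with occasional bits translated from the original.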

Abstract

Key points, as a list:

  • Detecting backdoors is a difficult task because of…
  1. The lack of automated tooling.
  2. Reliance on laborious manual analysis.
  3. The absence of a concrete or rigorous definition.
  • This paper will…
  1. Provide definitions of backdoor, backdoor detection, and backdoor deniability.
  2. Present a framework that decomposes a backdoor into 4 components.
  3. Show how current backdoor detection methodologies are

Introduction

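  • The potential presence of backdoors from third parties.
    Here, a third party is whoever deploys the software or hardware device.
    According to prior work, the third party may be a competitor [3] or the manufacturer of a consumer device [2,5]. In the latter case, a backdoor may arise from: an accident, left-over "debug" functionality, software configuration updates that lack user authentication [5], or third-party software, brought in by a manufacturer that only handles the firmware, which itself carries one of the first three problems.
    Configuration is an important topic here: configuration files (files that store settings) are frequently a source of security concerns.
  • Backdoors vs. traditional vulnerabilities
    A traditional vulnerability shows up as a strange, unintended program state.
    A backdoor should appear explicit, intentional, and essentially like normal program functionality.
    🔅 The rest of the paper emphasises whether a backdoor is composed of explicit components.
    🔅 In this paper, ordinary program behaviour is called normal; the opposite is abnormal.
  • “Backdoor” is hard to define because it…
     — Takes many forms.
    So it cannot be captured by generalization, yet without a definition there is no way to define a detection methodology either.
     — Can be, for example, a hardware component, a dedicated program, or a malicious program fragment.
     — Lacks real-world samples of complex backdoors.
  • Documented real-world backdoors are simplistic, amenable to analysis of static input data, and have been studied in the literature [19,21].
    Here, simplistic is relative to complex samples that have never been documented.
    Static input data usually means the values fed to the trigger function (to see whether they satisfy the condition that activates the backdoor, e.g. hard-coded credentials).
    🔅 Hard-coded credentials are authentication values written directly into the code.
  • This work focuses on…
     — Software-based backdoors primarily.
     — The technical aspects of backdoor-like functionality. I guess the authors want to stress that there is no political angle, haha.
  • Both academic and real-world backdoors, once expressed in terms of these definitions, can be reasoned about with respect to their deniability and detectability.
    With this paper's definition of a backdoor, however simple or complex the backdoor is (academic vs. real-world), one can assess both how good a backdoor detection methodology is and the backdoor's deniability (whether it can be judged to be a backdoor at all).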

Preliminaries

  • Platform
    the highest level of abstraction for a device that a given backdoor targets.
    We abstract the device that carries the backdoor as the Platform.
  • System
     the highest level of abstraction required to model a given backdoor, within a platform.
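    Within the Platform, we abstract whatever is needed to model the backdoor as the System.
    Depending on the level of abstraction, the System may be: a dedicated program, a hardware component, or something embedded as part of another program.

The point of abstraction is to be able to see, or pin down, the details of the device (Platform) and of the backdoor (System).
E.g. a router with a backdoor is abstracted as the platform. We find that the backdoor lives in the router's software, so we abstract a dedicated program as the system and model it as an FSM (the highest level of the system). A process in the OS can likewise be modelled by FSMs at arbitrary levels.

This paper builds on that abstraction and expresses the System as an FSM.
🔅 From here on these two terms are shown in italics, as a reminder that they have special definitions in this paper.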

FSM

Because end-users, implementers, and others look at a backdoor from different angles, the paper uses four FSMs:

  1. Developer FSM(DFSM)
     developer’s view of the system
  2. Actual FSM(AFSM)
     a real manifestation of the system
  3. Expected FSM(EFSM)
     the end-user’s expectations of the system
  4. Reverse-engineered FSM(RFSM)
     a refinement of the EFSM obtained by reverse-engineering the actual system

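The purpose of these four FSMs is as follows. The DFSM represents the "real" system: apart from accidental vulnerabilities, the developer should know which components of the system give rise to a backdoor. Those components may be well hidden (these are non-explicit components, and a backdoor containing them is called a bug-based backdoor), so on the surface the system appears as the AFSM. As non-developers, we reverse-engineer the AFSM into an RFSM, hoping to find the non-explicit parts and from them judge the intent behind the backdoor (this relates to backdoor deniability). What the ordinary user sees is the EFSM.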

In this paper, an FSM is expressed as a quintuple θ = (S,i,F,Σ,δ), defined as follows:

  • S: the set of its states
    A state is a particular piece of functionality in the system.
  • i: its initial state
  • F: the set of its final states
  • Σ: the set of its state transition conditions
  • δ: its state transitions; the transition labelling function has the form S × Σ → S
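A rough sketch of this quintuple as a Python data structure follows. The class name, field names, and the tiny login system are my own assumptions for illustration, not something defined in the paper; the only thing taken from the paper is the shape (S, i, F, Σ, δ) with δ of the form S × Σ → S.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FSM:
    states: frozenset      # S
    initial: str           # i
    finals: frozenset      # F
    conditions: frozenset  # Sigma
    delta: tuple           # pairs of ((state, condition), next_state)

    def transitions(self):
        """Return delta as a dict keyed by (state, condition)."""
        return dict(self.delta)

    def is_well_formed(self):
        """Check i in S, F subset of S, and that delta maps S x Sigma into S."""
        if self.initial not in self.states or not self.finals <= self.states:
            return False
        return all(
            src in self.states and cond in self.conditions and dst in self.states
            for (src, cond), dst in self.delta
        )

    def step(self, state, condition):
        """Follow one transition; raises KeyError if none is defined."""
        return self.transitions()[(state, condition)]


# Hypothetical login system, not taken from the paper.
login = FSM(
    states=frozenset({"prompt", "check", "shell"}),
    initial="prompt",
    finals=frozenset({"shell"}),
    conditions=frozenset({"submit", "ok", "fail"}),
    delta=(
        (("prompt", "submit"), "check"),
        (("check", "ok"), "shell"),
        (("check", "fail"), "prompt"),
    ),
)
```

Keeping δ as explicit (state, condition) → state pairs makes it easy to compare two machines later, for example an RFSM against the trigger and payload FSMs.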

So the whole system is like a composition of smaller systems: as you look more closely, you can find a sub-system inside a state, as in the figure below.

An emulator for the AFSM

As mentioned in the prerequisites above, this paper follows Dullien's approach in Reference [15]: it models the real system as an FSM and uses that FSM as the emulator of the AFSM.
In other words, the EFSM may suggest there is no backdoor, yet the system behaves abnormally. To decide whether there really is a backdoor, we need an emulator. The steps are:

  • Derive the RFSM (θR) by reverse-engineering the real system, that is, by making perceptions and observations of its concrete implementation.
    A program analyst observes the states and transitions and produces the RFSM.
    These states and transitions live in the platform, e.g. the states of the CPU. Since the target actual system is software, the emulator is examined through the program binary.
  • Use it as the emulator for the AFSM (θA).
    The RFSM serves as the emulator of the AFSM.
  • Map the concrete states and transitions of the platform to the level of abstraction modelled by the states and transitions of the FSM.
    That is, put the states and transitions of the FSM and of the platform in correspondence.
  • The granularity of θR depends on how the system is analysed.
     E.g.
    Use IDA Pro to perceive S and the δ between those states of the real system.
    Use a debugger to observe S and δ of the real system.
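To give a feel for the "observe S and δ" step, here is a small sketch that accumulates an RFSM from observed runs. It assumes the analyst has already written down each step as a (state, condition, next_state) triple, for instance while stepping through a debugger; the trace format and the state names are hypothetical and are not the output of any real tool.

```python
def build_rfsm(traces):
    """Collect the states S and transitions delta seen in the observed runs.

    Each trace is a list of (state, condition, next_state) triples.
    """
    states, delta = set(), {}
    for trace in traces:
        for src, cond, dst in trace:
            states.update((src, dst))
            delta[(src, cond)] = dst
    return states, delta


# Hypothetical observations from two debugging sessions.
traces = [
    [("read_input", "got_line", "auth"), ("auth", "bad_password", "read_input")],
    [("read_input", "got_line", "auth"), ("auth", "good_password", "shell")],
]
states, delta = build_rfsm(traces)
```

The granularity point shows up directly: the RFSM only contains what the traces cover, so a path that is never exercised (such as a backdoor trigger) is simply missing until some analysis reaches it.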

Definition

Because backdoors show up in so many forms, we first split a backdoor into four components in order to understand it:

  • A distinguishing feature of every backdoor is its trigger mechanism, a pivotal component.
  • Another component accounts for how the trigger condition gets satisfied: the input source.
  • The eventual system state upon trigger activation is the backdoor-activated state, a privileged state.
  • An intermediate component that facilitates the transition from the normal system state to the backdoor-activated state is the payload.

With the gist of these four components in mind, we can define a backdoor formally:

Definition 1 Backdoor.
An intentional construct contained within a system that serves to compromise its expected security by facilitating access to otherwise privileged functionality or information. Its implementation is identifiable by its decomposition into four components: input source, trigger, payload, and privileged state, and the intention of that implementation is reflected in its complete or partial (e.g., in the case of bug-based backdoors) presence within the DFSM and AFSM, but not the EFSM of the system containing it.

Definition 1 in my own words; the key points are:

  1. It is an intentional construct that compromises the system's expected security by escalating privileges.
  2. Its implementation must be identifiable by its decomposition into four components: input source, trigger, payload, and privileged state.
  3. The backdoor is known to the system's implementer (present in the DFSM and AFSM) and unknown to its end-user (absent from the EFSM).
    Whether it appears in the RFSM depends on whether the analysis uncovers it.

All four components can be modelled as FSMs; two of these FSMs are relevant here:

  1. θtrigger: represents the trigger
  2. θpayload: represents the payload and Fpayload
     (Fpayload: the set of possible privileged states)

These two help define backdoor detection.

Definition 2 Backdoor Detection.
A backdoor is detected by obtaining:
Within θR, the states and transitions of both the trigger and payload must exist:
The privileged states reachable as a result of the payload are either final states of θR, or states that can be transitioned from to some state of θR:
The payload must be reachable from the trigger, and there must exist a transition to the trigger within θR:

Definition 2 in my own words, in the same order as above:

  1. The states and transitions of both the trigger FSM and the payload FSM must exist within θR.
  2. The privileged states that result from the payload are either final states of θR or states from which some state of θR can be reached.
  3. The payload must be reachable from the trigger.
  4. There must exist a transition to the trigger within θR.
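The four conditions can be turned into a rough checker. This is only my reading of the paraphrase above, not the paper's formal statement; the dictionary layout (keys "states", "initial", "finals", "delta") and the example machine are assumptions made for illustration.

```python
def is_detected(rfsm, trigger, payload):
    """Check the four paraphrased conditions of Definition 2."""
    r_delta = rfsm["delta"]
    # 1. Trigger and payload states and transitions exist within the RFSM.
    for part in (trigger, payload):
        if not part["states"] <= rfsm["states"]:
            return False
        if any(r_delta.get(key) != dst for key, dst in part["delta"].items()):
            return False
    # 2. Each privileged state is final in the RFSM or leads back into it.
    for priv in payload["finals"]:
        leads_back = any(
            src == priv and dst in rfsm["states"]
            for (src, _), dst in r_delta.items()
        )
        if priv not in rfsm["finals"] and not leads_back:
            return False
    # 3. The payload is reachable from the trigger.
    reaches_payload = any(
        src in trigger["states"] and dst == payload["initial"]
        for (src, _), dst in r_delta.items()
    )
    # 4. Some transition from outside the trigger leads into it.
    enters_trigger = any(
        src not in trigger["states"] and dst == trigger["initial"]
        for (src, _), dst in r_delta.items()
    )
    return reaches_payload and enters_trigger


# Hypothetical RFSM of a login program with a hard-coded user check.
rfsm = {
    "states": {"login", "check_user", "auth", "grant_admin", "shell"},
    "initial": "login",
    "finals": {"shell"},
    "delta": {
        ("login", "submit"): "check_user",
        ("check_user", "is_backdoor_user"): "grant_admin",
        ("check_user", "other_user"): "auth",
        ("auth", "ok"): "shell",
        ("auth", "fail"): "login",
        ("grant_admin", "spawn"): "shell",
    },
}
trigger = {"states": {"check_user"}, "initial": "check_user", "finals": set(), "delta": {}}
payload = {
    "states": {"grant_admin", "shell"},
    "initial": "grant_admin",
    "finals": {"shell"},
    "delta": {("grant_admin", "spawn"): "shell"},
}
```

If the analyst's RFSM does not yet contain the `grant_admin` state, condition 1 fails and the backdoor counts as not detected, which matches the idea that detection depends on what the reverse-engineering has uncovered.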

A framework for modelling backdoors

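Before building the framework, keep in mind that its purpose is to make analysis easier (producing the RFSM as an emulator of the AFSM). So let us first discuss the target of the analysis, the process of building the framework, and how the RFSM is produced.

What is being analysed? What you analyse depends on the target.

  • Open source: analyse the source code version control logs
    The logs may show what was deleted and what was changed each time, which reveals how the states and transitions changed.
  • Closed-source software: analyse the differences between software versions

Once the target is fixed, the framework can be built. The process is usually:

  • A construct consisting of an input source, trigger, payload, and privileged state would be part of the DFSM of the system.
    The ultimate target is the DFSM, and the DFSM will contain all four components.
  • Functionality modelled by the EFSM or RFSM expresses what their holders have learnt about the system.
    States and transitions in the system are usually found through the RFSM.
    The end-user relies on the EFSM.
  • Discover a backdoor through analysis of the emulator for the AFSM.
    Then look for the backdoor by analysing the emulator and the RFSM (post-analysis).
  • If a backdoor is present in the emulator of the AFSM, then it is in the AFSM and might be found in the RFSM (or EFSM).
    If the emulator of the AFSM has it, the AFSM essentially has it too, so there is a chance of finding it in the RFSM (if you have the skill to find it).

During analysis you will keep finding new states and transitions to add to the RFSM. They fall into two classes:

  1. Discovered (not newly created): that is, explicit.
    ➤ Exist in the AFSM.
    ➤ Explicit states and transitions will always exist within the backdoor implementer’s DFSM.
  2. Created: non-explicit.
    ➤ Might not be in the AFSM.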

An example to make this clearer:
An RFSM models a program. New states and state transitions are added to it after analysis.
Basic blocks and branches that are explicitly part of the program’s code are part of the AFSM and DFSM.
Shellcode that is not explicit amounts, in a sense, to weird states and state transitions.
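The discovered/created split can be sketched as a simple set comparison. In practice the AFSM is not directly available to an analyst, so the `explicit_transitions` set below is a stand-in assumption (think of it as the branches explicitly present in the program's code); all names are hypothetical.

```python
def classify_additions(new_transitions, explicit_transitions):
    """Split transitions newly added to the RFSM into discovered and created."""
    discovered = {t for t in new_transitions if t in explicit_transitions}
    created = set(new_transitions) - discovered
    return discovered, created


# Hypothetical transitions written as (source, condition, destination).
explicit_transitions = {
    ("parse", "ok", "dispatch"),
    ("dispatch", "cmd", "handler"),
}
new_transitions = [
    ("parse", "ok", "dispatch"),         # a branch that is in the code
    ("parse", "overflow", "shellcode"),  # only exists once a bug is exploited
]
discovered, created = classify_additions(new_transitions, explicit_transitions)
```

Here the overflow edge ends up in `created`: it corresponds to the weird states and transitions in the example above, which are not explicitly part of the program.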

Input source

  • The trigger function takes at least one parameter, which is the input source.
    (By itself it does not cause the activation of the backdoor trigger.)
     E.g. possible forms of input source:
    a string entered by an attacker wishing to activate the trigger
    the system clock, where the trigger becomes active during a specific time period.
  • It decides which state transition is made as a result of executing the function.
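As a sketch of the two example input sources, the functions below feed a string and a clock reading into a trigger check. The magic string, the time window, and the state names are invented for illustration.

```python
from datetime import datetime, time


def string_trigger(user_input: str) -> bool:
    """Trigger fed by attacker-supplied text (hypothetical magic value)."""
    return user_input == "open-sesame"


def clock_trigger(now: datetime) -> bool:
    """Trigger fed by the system clock: active between 02:00 and 03:00."""
    return time(2, 0) <= now.time() < time(3, 0)


def next_state(trigger_fired: bool) -> str:
    """The input source only selects which transition is taken."""
    return "payload" if trigger_fired else "normal_auth"
```

In both cases the input source is just the parameter; it is the trigger function's evaluation of it that decides the transition.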

Trigger mechanism

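  • Core concept: the collection of checks is treated as a single function that decides whether the payload executes.
    This is because in practice there are multiple branch conditions and multiple basic blocks being executed.
  • The way the FSM transitions from satisfying the trigger conditions to the payload is modelled with 2 cases:
    You may want to come back to this table later; first look at the Visual cases a little further down.
    Explanation of the table:
    Cell 1 [State transition is]: whether the transition in the figure is explicit and obvious.
    Cell 2 [Trigger is added to the RFSM by states and state transitions satisfying the backdoor trigger conditions are]: once the trigger has been found, the states and state transitions traversed on the way from the trigger to the next state are added to the RFSM, and we judge whether they are explicit.
    Cell 3 [Transitions of RFSM are]: when those states and state transitions are to be added to the RFSM, check whether they were already there (which depends on whether they were explicit). If so, they are discovered transitions; otherwise they are created transitions.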
  • Visual cases:

#1
Within a valid CFG, a transition to the payload constitutes normal control-flow.
➤ Anything that is not a bug-based system counts as normal in this paper.
➤ In the figure, the top state is the initial trigger function. Once the trigger condition is satisfied, whether or not the backdoor is activated, the figure shows explicitly that the next state can be reached.

# 2
A program bug that allows control-flow hijacking constitutes abnormal control-flow.
➤ Again, this is a bug-based system: the figure contains non-explicit states and state transitions, hence abnormal control-flow.
➤ The problem in this figure is that the initial authentication itself is the vulnerability (the strcpy line causes a buffer overflow), so it is not an explicit trigger function. What satisfies the trigger condition is still a set of explicit states and state transitions; it is the transition to the payload that can only be found through analysis, and it is therefore non-explicit.

# 3 (a more complex example)
➤ The backdoor trigger relies both on explicit checks and on a bug.
➤ Explicit check: a hard-coded credential check.
 — True: guards access to a vulnerable password check
 — False: executes the standard authentication routine

Payload

  • The payload is the answer to how a privileged state is reached once the trigger conditions are satisfied.
  • In practice, a payload component can take many forms, depending on how it is modelled as part of an RFSM.
    As before, come back to this table later; first look at the Visual cases a little further down.
    Explanation of the table:
    Cell 1 [Transition to the payload]: whether the transition to the payload is explicit and obvious.
    Cell 2 [The creation of new states and transitions]: if the transition in cell 1 is not explicit, are new states or state transitions allowed to be created? (If it is explicit, they are of course simply discovered states and state transitions.)
    Cell 3 [Payload added to the RFSM through states and transitions facilitating to privileged state are]: when those states and state transitions are to be added to the RFSM, check whether they were already there (which depends on whether they were explicit). If so, they are discovered states and discovered state transitions; otherwise they are created.
    Cell 4 [States and transitions is contained in the backdoor implementer’s DFSM or not]: if those states and state transitions are discovered, they all exist in the DFSM. Case #2 is a bit special, though, and is explained below.
  • Visual cases:

# 1
Explicit transition to payload, where payload has explicit components.
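 ➤ state1 is the trigger condition (the hard-coded credential check, strcmp(user._name, “backdoor”)==0 in the sample code). If it is satisfied, execution enters state2, where obtaining admin privilege is the first privilege escalation; it then opens a shell and enters state4 (the privileged state).
➤ The privileged state here is like an undocumented backdoor shell (for example, entering a particular username lets the attacker run extra functionality).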
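Case #1 can be mimicked in a few lines of Python. This is a sketch of the idea only, not the paper's sample code: the function names, the session dictionary, and the returned state labels are assumptions, and no real shell is started.

```python
def open_shell(session: dict) -> str:
    """Stand-in for spawning a shell; only returns a label."""
    return "admin_shell" if session["is_admin"] else "user_shell"


def handle_login(username: str, password: str, check_password) -> str:
    """Return the name of the state the session ends in."""
    # State 1: the trigger, an explicit hard-coded credential check.
    if username == "backdoor":
        # State 2: first privilege escalation, the session becomes admin.
        session = {"user": username, "is_admin": True}
        # Privileged state: an undocumented shell for the admin session.
        return open_shell(session)
    # Normal path: the standard authentication routine.
    if check_password(username, password):
        return "user_session"
    return "rejected"
```

Every branch here is visible in the code, which is what makes both the transition to the payload and the payload itself explicit in this case.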

# 2
Explicit transition to payload with both explicit and non-explicit components.
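➤ The trigger condition is state1, which decides whether the user next takes the special path (to reach the privileged state).
🔅 If Yes: the trigger transition passes the request (req._data) as input to an interpreter.
➤ In state2, the interpreter runs the input and the privileged state is constructed dynamically. Because they only come into being through run, these states and transitions were not present in the DFSM originally; now that they have been found, they are added to the RFSM.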

# 3
Non-explicit transition to payload, where payload has both explicit and non-explicit components.
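➤ The trigger mechanism is bug-based.
The system formed by state3 and state4 is something like a control panel; the code shows that by the time state3 is reached, the session already has administrative privilege.
 ➤ In the table cell [The creation of new states and transitions]: all states are explicit, so no new state needs to be created; but some transitions are non-explicit, so new transitions are created.
 ➤ Normally the three systems in the figure do not affect one another, and their interaction was not anticipated in the DFSM, so the new states and transitions (in this example only new transitions) do not exist in the DFSM; because they have been found, they are added to the RFSM.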

# 4 Single Transition Payloads
 A backdoor payload composed solely of a state transition.
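➤ state1 is called a trapdoor, because a trapdoor lets the attacker bypass complex user authentication (rather than directly satisfying a trigger condition).
➤ The form of the payload is identical for states and transitions before.
In addition, the payload is explicit (it is directly the transition from the trigger to the privileged state).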

Beyond the payload examples above, the paper also introduces the concept of Payload Obfuscation.

Payload Obfuscation
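This term is about how a backdoor implementer can obfuscate the payload so that other attackers cannot use the backdoor for their own attacks.
Besides using a bug-based trigger mechanism, how else can a backdoor implementer hide the existence of the backdoor?

  • First, the problem with using a bug-based trigger mechanism:
    Because the trigger is so simple, it gives the implementer no way to fix how the backdoor will ultimately be used, and control can be regained after the trigger.
    🔅“Control is regained” means limiting the computational freedom of newly created states.
  • Execute an obfuscated payload under some degree of abnormal control-flow.
    The obfuscated payload may come from…
     — Reusing components.
    With this method, an attacker must either know in advance that the payload exists and what it is for, or slowly work it out from the original system.
    E.g. for a program, through static analysis methods, or from the code fragments executed in sequence when the backdoor is triggered.
    🔅“Code fragments” are embedded and distributed throughout a binary.
     — Attacker-controlled data.
    Controlled data is like the shellcode of a successful buffer overflow exploit: it does not exist among the program's own components.

There is also a more complex case in which the payload takes a hybrid approach: user data needs to be interpreted, or program components are used together with input. See the figure: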

Privileged state

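Considering how the privileged state came to be added to the RFSM, look at whether the privileged state is explicit:
 — explicit: the privilege may or may not be obtainable under normal system execution.
 — non-explicit: the privilege cannot be obtained under normal system execution.

That is a bit of a tongue-twister, so put simply:
There are two ways a privileged state can be reached:
1. Under normal system execution

E.g.
Suppose there is a trigger that is a hard-coded credential check, and a state called is_valid_password.
The privileged state can be reached through either of the two.

2. Reachable only when guarded by activation of the backdoor
That is, even for a legitimate user, the privileged state can only be reached by going through the trigger.
➤ When the privileged state is explicit

privileged state manifests as an undocumented backdoor shell

A specific input must be entered before the extra functionality can be used.

➤ When the privileged state is non-explicit

A legitimate user cannot use it.

A special input must be supplied and passed through run(&req_data) to reach the privileged state.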

Detection and Deniability

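On backdoor detection

If a vulnerability was deliberately planted, we can say with certainty that it is a backdoor. If it is less obvious, a backdoor is generally judged by non-technical means and intuition, for example by manually reverse-engineering the binary or by observing anomalous network activity.
Readers may wonder: the paper proposes a framework, so can we find a backdoor by decomposing the program according to the framework?
The answer is no. The authors themselves admit that the framework is too simple for that XD.
But the framework can be used to judge whether each of a backdoor's four components is explicit.
Cases where the framework applies:

  • The transition from trigger to payload is explicit.
    ➤ For example, we can find the trigger condition (e.g. a hard-coded credential check) and from it establish that the intent is explicit.
  • The transition is non-explicit (bug-based).
    We can use the software's version control logs to judge where a backdoor was planted, or compare versions of the binary to judge whether every change has a legitimate reason, and so decide whether a backdoor was inserted.
    ➤ For example, a code fragment that does nothing but escalate privileges and cannot be reached in the normal program. (The next section gives a large case study.)

So although we cannot detect backdoors outright, what we can do is discuss how to tell backdoor-like constructs from accidental vulnerabilities. (This is what the term backdoor deniability is about.)

On backdoor deniability

As a reminder: any backdoor-like construct shows up in both the emulator of the AFSM and the RFSM; if it is definitely a backdoor, it shows up in the DFSM as well.
From a technical standpoint, the authors distinguish three cases: a genuine backdoor, something that looks like a backdoor (a grey-area backdoor), and a merely accidental vulnerability.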

Definition 3 Intentional backdoor.
Those constructs that can be unambiguously identified as backdoors: the transition from their trigger satisfaction to their payload is explicit. Will be present in the DFSM, AFSM, and if found, the RFSM, but not the EFSM.

Key points:

  • The construct is explicitly identified as a backdoor.
  • It can be present in the DFSM, AFSM, and RFSM because the transition to the payload is explicit.

Definition 4 Deniable backdoor.
Those constructs that fall into a grey area, where the transition from their trigger satisfaction to their payload is non-explicit (i.e., it appears to be a bug), but from a non-technical perspective can be argued to be intentional. Will be present in the AFSM, if found, the RFSM, but not the EFSM; we cannot definitively tell if it is in the DFSM.

Key points:

  • The construct is bug-based and can be argued to be intentional from a non-technical perspective.
  • It can be present in the AFSM and RFSM, but not definitively in the DFSM.

Definition 5 Accidental vulnerability.
Those constructs where there is no evidence — technical, or otherwise — to suggest any intent, and the transition from their trigger satisfaction to their payload is non-explicit. Will be present in the AFSM, and if found, the RFSM, but not the DFSM or EFSM.

Key points:

  • There is no evidence to identify the construct as a backdoor.
  • It can be present in the AFSM and RFSM.

Backdoor deniability is decided using the definitions above. If the outcome is Definition 3, the implementer must take responsibility for the backdoor.
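Definitions 3 to 5 boil down to two questions, which can be sketched as a small decision function. The function name, the parameter names, and the returned dictionary are my own framing for illustration; `None` stands for "cannot be determined".

```python
def classify_construct(transition_is_explicit: bool,
                       non_technical_evidence_of_intent: bool) -> dict:
    """Map the two questions onto Definitions 3, 4, and 5."""
    if transition_is_explicit:
        kind, in_dfsm = "intentional backdoor", True
    elif non_technical_evidence_of_intent:
        kind, in_dfsm = "deniable backdoor", None  # cannot tell
    else:
        kind, in_dfsm = "accidental vulnerability", False
    return {
        "kind": kind,
        "DFSM": in_dfsm,
        "AFSM": True,        # present in all three cases
        "RFSM": "if found",  # depends on the analysis
        "EFSM": False,       # never part of the end-user's expectations
    }
```

The only technical input is whether the transition from trigger satisfaction to payload is explicit; the grey area appears exactly when that answer is "no" and the argument for intent has to come from non-technical evidence.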
After all these definitions and the framework, the authors' conclusion is that the proposed framework can (really, "can only" XD) be used to judge how good backdoor detection methodologies are.

Case Study

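Backdoors of the Definition 4 kind sit in the grey area, are more complex, and are hard to judge, so the case study below is about Definition 4.
In Reference [14] (covered in the prerequisites above), Andriesse and Bos use Nginx as their implementation example. They describe a binary with an embedded backdoor that is triggered through a bug in the program plus a hard-coded payload. (The payload exploits a property of the x86 instruction set: decoding the same bytes from a different offset yields different instructions. This is a “misaligned instruction sequence fragment”, and when used as a payload it counts as an obfuscated payload.) In other words, with the modified Nginx, as long as the input source satisfies the trigger conditions, the backdoor can be exploited remotely.
🔅 The input source is a network socket; a malformed HTTP packet is enough to trigger it.

From the figure, state1 is the trigger. Everything inside the dashed box is non-explicit: the stack frames of two functions (ngx_http_finalize_request and hash in ngx_http_parse_header_line) overlap (if you do not know what that means, think of it as a buffer overflow for now), which causes have_err and err_handler to end up initialized and so satisfies the bug-based trigger condition (have_err == 1 and err_handler != NULL). The payload is state2, and state3 is the final privileged state.

You might think: drawing such a figure from that huge pile of source code is really hard, so how do you do the componentisation?
The answer is to use this paper's framework!

  • Use symbolic execution to find the bug-based trigger condition
  • Use the principle of misaligned instruction sequences to scan for other instruction sequences. Check whether those instruction sequences can be used to escalate privileges; if so, they are the payload

The framework not only copes with source code; it can also be used to judge how good a backdoor detection methodology is.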

Current backdoor detection methodologies

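Now let us use the framework's four components to decompose four state-of-the-art backdoor detection methods. Each tool claims to detect a specific subset of backdoor types. Although these tools are all effective, the authors argue that none of them considers a complete model of a backdoor, which limits their effectiveness.

Let us go through them one by one and find each method's weakness, haha.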

Firmalice

is designed to detect authentication bypass vulnerabilities.

  • It can detect a privileged state by modification of the input security policy.
  • Problem: It requires the same amount of manual analysis to detect the entire backdoor as it would to identify the privileged state. In other words, because it has no notion of a payload, you have to keep changing the input source by hand to see how the privileged state changes.

HumIDIFy

aims to detect if a program can execute functionality it should never execute under normal circumstances.

  • It does not consider the notion of a trigger.
  • Problem: A backdoor is detected only when the functionality it exposes is anomalous compared with what the program does for a legitimate user.

Stringer

attempts to detect static data used as program.

  • It uses a scoring metric to rank static data.
  • It uses heuristics for identifying payload-like constructs.
  • Problem: It is unable to meaningfully score data that leads to states that are actually privileged higher than those that are not.

Weasel

detects both authentication bypass vulnerabilities and undocumented commands in server-like program binaries.

  • It is assumed to reveal all deciders and handlers when processed.
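  • Problem: Take the Tenda web-server backdoor as an example (see Table 1 in the original paper for the case). Weasel will be unable to detect such a backdoor because it uses an input source separate from the standard input to the program. The privileged state reached by the backdoor user differs from the one reached by a legitimate user, but without a notion of input source the tool cannot tell the users apart.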

Future Work

This paper does not intend to provide a direct means to detect backdoors; rather, it serves as a general means to decompose backdoors in an abstract way.

  • The deficiencies in those methods are due to them not fully capturing the rigorous definition of a backdoor.
  • A backdoor detection methodology based upon our proposed framework would be a natural extension of this work. (So the hard part is left for someone else, huh.)
  • A deliberate side-channel vulnerability would prove difficult to model using our FSM-based abstraction; we view this as an additional area for investigation.

Conclusion

This paper provides

  • Definitions: backdoor, backdoor detection, deniable backdoors.
  • Means to discern: intentional backdoors and accidental vulnerabilities.
  • Framework serves as a basis for identifying backdoor-like construct and the reasoning about detection.

Discussion

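  1. It is a pity that the paper does not show in practice how to go from code (or from differences in version control logs) to a CFG and then to an FSM. It does sketch how to find the trigger condition in the Nginx case study, but does not explain that part any further.
  2. Beyond backdoor detection, the paper offers a broader idea: when detecting any kind of malware, try using FSMs to clarify your own thinking.

If you have any questions or suggestions about the above, please do not hesitate to share them 😊 Thank you all.
Feel free to email emily971133@gmail.com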
Emily Tseng