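OSPF, covered last time, is an IGP that runs inside a single AS. How, then, is routing done between different ASes? That is the job of BGP. (Each AS has a unique number called an AS number, or ASN for short.)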


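Unlike RIP, which is a distance-vector protocol that routes on a metric (cost), BGP is a path-vector protocol. When a router wants to reach a network N1, the routing information it receives is more than just which router (the next hop) leads to N1 and at what cost: it also describes the path to N1, in the form of path attributes.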
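To make the contrast concrete, here is a minimal sketch of what each kind of protocol might keep per route. The class names, field names, addresses, and ASNs are hypothetical illustrations, not BGP's actual data structures, and next-hop rewriting is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DistanceVectorRoute:
    """What a RIP-style router learns: where to send traffic and at what cost."""
    prefix: str
    next_hop: str
    cost: int


@dataclass(frozen=True)
class PathVectorRoute:
    """What a BGP speaker learns: the next hop plus the AS-level path."""
    prefix: str
    next_hop: str
    as_path: Tuple[int, ...]  # ASNs the advertisement has crossed, nearest first


def prepend_own_asn(route: PathVectorRoute, own_asn: int) -> PathVectorRoute:
    """Model a speaker adding its own ASN before advertising to another AS."""
    return PathVectorRoute(route.prefix, route.next_hop, (own_asn,) + route.as_path)


# Hypothetical values: documentation prefixes and private-use ASNs.
rip_view = DistanceVectorRoute("203.0.113.0/24", "192.0.2.1", cost=3)
bgp_view = PathVectorRoute("203.0.113.0/24", "192.0.2.1", as_path=(64501, 64502))
advertised = prepend_own_asn(bgp_view, own_asn=64500)
```

The distance-vector entry only says "3 hops via 192.0.2.1", while the path-vector entry records which ASes the route passes through.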

Where is this path information stored? In the RIB (Routing Information Base), again as path attributes. How path attributes are stored, processed, sent, and received therefore matters a great deal, and it is worth knowing the most common and important ones.

Path attributes are carried in Update messages and fall into four categories:

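  1. well-known mandatory: every BGP speaker must recognize these, and they must be included in every Update message that advertises routes to peers.
  2. well-known discretionary: every BGP speaker must recognize these, but they do not have to be included in an Update message. The sending router may leave them out, but a router that receives them must process them.
  3. optional transitive: a BGP speaker may not recognize these, but it still passes them on to its peers when it receives them.
  4. optional non-transitive: a BGP speaker may not recognize these; an unrecognized attribute of this type is ignored and not passed on to the next router.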
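On the wire, each path attribute carries a flags byte whose top bits encode most of this classification. The sketch below assumes the RFC 4271 bit layout (Optional, Transitive, Partial); the function names and the returned action strings are hypothetical. Note that the flags alone cannot tell mandatory from discretionary, since that depends on the attribute type.

```python
OPTIONAL = 0x80    # high-order flag bit: 1 = optional, 0 = well-known
TRANSITIVE = 0x40  # second flag bit: 1 = transitive
PARTIAL = 0x20     # third flag bit: set when an unrecognized transitive attribute is forwarded


def classify(flags: int) -> str:
    """Classify an attribute from its flags byte."""
    if not flags & OPTIONAL:
        return "well-known"
    return "optional transitive" if flags & TRANSITIVE else "optional non-transitive"


def handle_unrecognized(flags: int):
    """Return (action, flags_to_forward) for an attribute this speaker does not recognize."""
    kind = classify(flags)
    if kind == "well-known":
        return ("error", None)               # every speaker must recognize well-known attributes
    if kind == "optional transitive":
        return ("forward", flags | PARTIAL)  # pass it on, marked as partial
    return ("drop", None)                    # optional non-transitive: ignore and do not pass on
```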

Well-known mandatory path attributes:

  1. origin: indicates where the routing information originated
  2. AS_path
  3. next_hop

Well-known discretionary path attributes:

  1. local_pref
  2. atomic_aggregate

Optional transitive path attributes:

  1. aggregator

Optional non-transitive path attributes:

  1. multi_exit_disc (MED)
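The grouping above can be written down as a small lookup table. This is only a sketch: the type codes are the ones assigned in RFC 4271, while the dictionary layout and the `missing_mandatory` helper are hypothetical.

```python
# name -> (type code, category); type codes as assigned in RFC 4271
PATH_ATTRIBUTES = {
    "ORIGIN": (1, "well-known mandatory"),
    "AS_PATH": (2, "well-known mandatory"),
    "NEXT_HOP": (3, "well-known mandatory"),
    "MULTI_EXIT_DISC": (4, "optional non-transitive"),
    "LOCAL_PREF": (5, "well-known discretionary"),
    "ATOMIC_AGGREGATE": (6, "well-known discretionary"),
    "AGGREGATOR": (7, "optional transitive"),
}


def missing_mandatory(present):
    """Return the well-known mandatory attributes absent from a list of attribute names."""
    mandatory = {
        name for name, (_, category) in PATH_ATTRIBUTES.items()
        if category == "well-known mandatory"
    }
    return sorted(mandatory - set(present))
```

For example, `missing_mandatory(["ORIGIN", "AS_PATH", "LOCAL_PREF"])` reports that `NEXT_HOP` is missing.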


BGP's work can be divided into route storage, update, selection, and advertisement.

The RIB stores path information, and path information is made up of path attributes.


BGP peers in the same AS run IBGP (internal BGP); BGP peers in different ASes run EBGP (external BGP).

The three phases of the BGP decision process (in -> local -> out, that is, Adj-RIBs-In -> Loc-RIB -> Adj-RIBs-Out):

Phase 1

Each route received from a BGP speaker in a neighboring AS is analyzed and assigned a preference level. The routes are then ranked according to preference and the best one for each network advertised to other BGP speakers within the autonomous system.

Phase 2

The best route for each destination is selected from the incoming data based on preference levels, and used to update the local routing information base (the Loc-RIB).

Phase 3

Routes in the Loc-RIB are selected to be sent to neighboring BGP speakers in other ASes.
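The three phases can be sketched as a small pipeline. This is a simplified illustration, not a real implementation: routes are plain dictionaries, and the `preference` and `export_allowed` policy functions are hypothetical stand-ins for whatever policy an operator configures.

```python
def decision_process(adj_rib_in, preference, export_allowed):
    """Run a toy three-phase decision process.

    adj_rib_in: list of route dicts with at least a 'prefix' key.
    preference: hypothetical policy function, route -> number (higher is better).
    export_allowed: hypothetical policy function, route -> bool.
    """
    # Phase 1: assign a preference level to every received route.
    ranked = [(preference(route), route) for route in adj_rib_in]

    # Phase 2: keep the most preferred route per destination in the Loc-RIB.
    best = {}
    for pref, route in ranked:
        current = best.get(route["prefix"])
        if current is None or pref > current[0]:
            best[route["prefix"]] = (pref, route)
    loc_rib = {prefix: route for prefix, (_, route) in best.items()}

    # Phase 3: choose which Loc-RIB routes may be advertised to neighbors.
    adj_rib_out = [route for route in loc_rib.values() if export_allowed(route)]
    return loc_rib, adj_rib_out


# Hypothetical routes using documentation prefixes and private-use ASNs.
routes = [
    {"prefix": "203.0.113.0/24", "as_path": [64501, 64502]},
    {"prefix": "203.0.113.0/24", "as_path": [64503]},
    {"prefix": "198.51.100.0/24", "as_path": [64501]},
]
loc_rib, adj_rib_out = decision_process(
    routes,
    preference=lambda r: -len(r["as_path"]),           # shorter AS path preferred
    export_allowed=lambda r: 64503 not in r["as_path"],
)
```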


Criteria that can influence a route's preference include:

  1. The number of autonomous systems between the router and the network (fewer generally being better).
  2. The existence of certain policies that may make certain routes unusable; for example, a route may pass through an AS that this AS is not willing to trust with its data.
  3. The origin of the path, that is, where it came from.

The third criterion is the origin attribute from the well-known mandatory group: different origins carry different preferences when the best path is selected.
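As a rough illustration of how these three criteria could combine, the sketch below filters out routes through untrusted ASes, then prefers the shortest AS path, then the better origin. It assumes the usual origin ordering (IGP preferred over EGP, EGP over INCOMPLETE); real BGP implementations apply more tie-breakers than this, and all names here are hypothetical.

```python
ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}  # lower rank is preferred


def usable(route, untrusted_asns):
    """A route is unusable if its path crosses an AS that policy does not trust."""
    return not set(route["as_path"]) & set(untrusted_asns)


def best_route(routes, untrusted_asns=()):
    """Pick the best usable route: shortest AS path first, then best origin."""
    candidates = [r for r in routes if usable(r, untrusted_asns)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (len(r["as_path"]), ORIGIN_RANK[r["origin"]]))


# Hypothetical candidate routes to the same destination.
candidates = [
    {"as_path": [64501, 64666], "origin": "IGP"},
    {"as_path": [64502, 64503], "origin": "INCOMPLETE"},
    {"as_path": [64504, 64505], "origin": "IGP"},
]
chosen = best_route(candidates, untrusted_asns=[64666])
```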

From a BGP speaker's point of view, the network is a collection of ASes; what happens inside each AS is invisible to it and not its concern.

How BGP peers are established

BGP operation begins with BGP peers forming a transport protocol connection. BGP uses TCP for its reliable transport layer, so the two BGP speakers establish a TCP session that remains in place during the course of the subsequent message exchange. When this (TCP connection) is done, each BGP speaker sends a BGP Open message. This message is like an “invitation to dance”, and begins the process of setting up the BGP link between the devices. In this message, each router identifies itself and its autonomous system, and also tells its peer what parameters it would like to use for the link. This includes an exchange of authentication parameters. Assuming that each device finds the contents of its peer’s Open message acceptable, it acknowledges it with a Keepalive message and the BGP session begins.

TCP connection (three-way handshake) -> TCP session established -> Open message sent -> Keepalive message received -> BGP session established
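To show what the Open and Keepalive messages look like as bytes, here is a minimal sketch that follows the RFC 4271 layout: a 19-byte header (16-byte marker, length, type) and, for Open, the version, AS number, hold time, BGP identifier, and an empty optional-parameters field. The function names and example values are hypothetical, and nothing is sent over a socket.

```python
import socket
import struct

MARKER = b"\xff" * 16          # header marker: 16 bytes of all ones
TYPE_OPEN, TYPE_KEEPALIVE = 1, 4


def bgp_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prepend the 19-byte BGP header (marker, total length, type) to a body."""
    return MARKER + struct.pack("!HB", 19 + len(body), msg_type) + body


def open_message(asn: int, hold_time: int, router_id: str) -> bytes:
    """Build an Open message with no optional parameters (2-byte AS field only)."""
    body = struct.pack(
        "!BHH4sB",
        4,                            # BGP version
        asn,                          # "My Autonomous System"
        hold_time,                    # proposed hold time in seconds
        socket.inet_aton(router_id),  # BGP identifier as 4 bytes
        0,                            # optional parameters length
    )
    return bgp_message(TYPE_OPEN, body)


def keepalive_message() -> bytes:
    """A Keepalive is just the header with no body."""
    return bgp_message(TYPE_KEEPALIVE)


# Hypothetical speaker: private-use ASN and a documentation address as router ID.
open_bytes = open_message(asn=64500, hold_time=90, router_id="192.0.2.1")
```

An Open built this way is 29 bytes long, and the Keepalive that acknowledges it is 19 bytes.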

Each BGP speaker encodes information from its Routing Information Bases (RIBs) into BGP Update messages.

How do BGP peers keep the BGP session alive?

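Because Update messages are sent relatively infrequently, peers periodically send Keepalive messages to maintain the BGP session between the two routers. A Keepalive carries no real information; it only lets the peers know that their BGP session has not been interrupted.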

The TCP session between BGP speakers can be kept open for a very long time, but Updates need to be sent only when changes occur to routes, which are usually infrequent. This means many seconds may elapse between the transmission of Update messages. To ensure that the peers maintain contact with each other, they both send Keepalive messages on a regular basis when they don’t have other information to send. These are null messages that contain no data and just tell the peer device “I’m still here”. These messages are sent infrequently — no more often than one per second — but regularly enough that the peers won’t think the session was interrupted.
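The timing logic might look like the sketch below. The hold time and the one-third ratio are not stated in the passage above; they are assumptions taken from the suggestion in RFC 4271, where a hold time of 0 disables keepalives. The function names are hypothetical.

```python
def keepalive_interval(hold_time: float) -> float:
    """Suggested spacing between Keepalives: one third of the hold time.

    A hold time of 0 disables keepalives, modeled here as an infinite interval.
    """
    if hold_time == 0:
        return float("inf")
    return max(hold_time / 3, 1.0)  # never more often than once per second


def session_expired(last_heard: float, now: float, hold_time: float) -> bool:
    """True if nothing was received from the peer within the hold time."""
    return hold_time > 0 and (now - last_heard) > hold_time
```

With a 90-second hold time, a peer would send a Keepalive roughly every 30 seconds and treat the session as dead after 90 seconds of silence.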

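Besides the Open and Keepalive messages used to establish a BGP session, there are Update messages for exchanging routing information and Notification messages for telling a peer that an error has occurred.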

After sending a BGP Notification message, the device that sent it will terminate the BGP connection between the peers. A new connection will then need to be negotiated, possibly after the problem that led to the Notification has been corrected.


To summarize, the BGP message types are:

  1. open (starts building the BGP session)
  2. keepalive (confirms the Open while the session is being set up, then keeps the session alive)
  3. update (the most important one)
  4. notification (tells the peer that something has gone wrong)
  5. route-refresh (asks a peer to resend its routing information after a routing policy change; only BGP devices that support the route-refresh capability send and respond to this message)
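Every one of these messages starts with the same 19-byte header, and the type field tells them apart. The sketch below uses the type codes from RFC 4271 and, for route-refresh, RFC 2918; the `MessageType` enum and `parse_header` function are hypothetical names.

```python
import struct
from enum import IntEnum


class MessageType(IntEnum):
    OPEN = 1
    UPDATE = 2
    NOTIFICATION = 3
    KEEPALIVE = 4
    ROUTE_REFRESH = 5


HEADER_LEN = 19


def parse_header(data: bytes):
    """Return (MessageType, total_length) from the first 19 bytes of a message."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 bytes for a BGP header")
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("marker must be all ones")
    return MessageType(msg_type), length


# A Keepalive is a bare header: marker, length 19, type 4.
keepalive = b"\xff" * 16 + b"\x00\x13\x04"
parsed = parse_header(keepalive)
```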

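The Update message has a complex structure and serves two purposes: route advertisement and route withdrawal. A single Update can advertise only one route, that is, one set of path attributes (although several network destinations may share the same path and path attributes, in which case they can be carried together in that one route), whereas it can withdraw many routes at once.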
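The shape of an Update can be sketched as a small data class: a list of withdrawn prefixes, one shared set of path attributes, and the list of prefixes reachable via that path. The class and field names are hypothetical, and the wire encoding is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UpdateMessage:
    withdrawn_routes: List[str] = field(default_factory=list)         # prefixes no longer reachable
    path_attributes: Dict[str, object] = field(default_factory=dict)  # one shared set of attributes
    nlri: List[str] = field(default_factory=list)                     # prefixes reachable via that path

    def __post_init__(self):
        # Advertised prefixes make no sense without attributes describing their path.
        if self.nlri and not self.path_attributes:
            raise ValueError("advertised prefixes need path attributes")


# Hypothetical Update: withdraws two prefixes and advertises two others
# that share a single set of path attributes.
update = UpdateMessage(
    withdrawn_routes=["192.0.2.0/24", "198.51.100.0/24"],
    path_attributes={"ORIGIN": "IGP", "AS_PATH": [64501], "NEXT_HOP": "10.0.0.1"},
    nlri=["203.0.113.0/25", "203.0.113.128/25"],
)
```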

The six states (BGP Finite State Machine, FSM for short)

A BGP peer can be in one of six states:

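  1. Idle (the word means "not employed"): the initial state; on a start event the router prepares to open a TCP connection
  2. Connect: the router tries to establish the TCP session; on success it sends an Open message, on failure it moves to the Active state
  3. Active: the router keeps trying to get a TCP connection; if none is established before the connect retry timer expires, it moves back to the Connect state
  4. OpenSent: the router has sent its Open message and is waiting for the peer's Open
  5. OpenConfirm: the router has received the peer's Open and is waiting for the Keepalive that acknowledges its own Open
  6. Established: the BGP session is up and the peers can exchange Update messages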
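The six states can be modeled as a small transition table. This is a heavily simplified sketch: the event names are hypothetical labels, timers are reduced to single events, and any unexpected event simply drops the session back to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_connected"): State.OPEN_SENT,      # Open is sent on success
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_connected"): State.OPEN_SENT,
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Unknown events fall back to Idle, a simplification of real error handling."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events, state: State = State.IDLE) -> State:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state
```

For example, `run(["start", "tcp_connected", "open_received", "keepalive_received"])` ends in `State.ESTABLISHED`.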

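Some of the English passages in this post were copied from another website. I personally think it is a good place to learn about TCP/IP, so if you are interested, go take a look and support its author.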


