@zhongdao
2018-11-21T18:46:15.000000Z
BitTorrent Protocol Specification (from a Chinese-English parallel edition)
Original URL: http://www.bittorrent.org/beps/bep_0003.html
| BEP: | 3 |
|---|---|
| Title: | The BitTorrent Protocol Specification |
| Version: | 0e08ddf84d8d3bf101cdf897fc312f2774588c9e |
| Last-Modified: | Sat Feb 4 12:58:40 2017 +0100 |
| Author: | Bram Cohen <bram@bittorrent.com> |
| Status: | Final |
| Type: | Standard |
| Created: | 10-Jan-2008 |
| Post-History: | 24-Jun-2009 (arvid@bittorrent.com), clarified the encoding of strings in torrent files. 20-Oct-2012 (arvid@bittorrent.com), clarified that info-hash is the digest of the bencoding found in the .torrent file; introduced some references to new BEPs and cleaned up formatting. 11-Oct-2013 (arvid@bittorrent.com), corrected the accepted and de-facto sizes for request messages. 04-Feb-2017 (the8472.bep@infinite-source.de), further info-hash clarifications, added resources for new implementors. |
BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load.
There are ideally many end users for a single file.
Bencoding encodes four kinds of values:
- Strings are encoded as a base-ten length prefix followed by a colon and the string. For example, 4:spam corresponds to 'spam'.
- Integers are represented by an 'i' followed by the number in base 10 followed by an 'e'. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limitation. i-0e is invalid. All encodings with a leading zero, such as i03e, are invalid, other than i0e, which of course corresponds to 0.
- Lists are encoded as an 'l' followed by their elements (also bencoded) followed by an 'e'. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
- Dictionaries are encoded as a 'd' followed by a list of alternating keys and their corresponding values followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).
Metainfo files (also known as .torrent files) are bencoded dictionaries with the following keys:
announce
The URL of the tracker.
info
This maps to a dictionary, with keys described below.
All strings in a .torrent file that contains text must be UTF-8 encoded.
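The bencoding rules that metainfo files rely on are compact enough to sketch in a few lines. The following is a minimal illustration, not a reference implementation: the names `bencode` and `bdecode` are hypothetical, strings are handled as raw bytes, and the decoder only checks the two validity rules stated in the text (integer leading zeros and key ordering).

```python
def bencode(value):
    """Encode ints, bytes/str, lists and dicts into bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be emitted in raw byte order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def bdecode(data, pos=0):
    """Decode one value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        digits = data[pos + 1:end]
        body = digits[1:] if digits.startswith(b"-") else digits
        if (not body.isdigit() or digits == b"-0"
                or (body[:1] == b"0" and len(body) > 1)):
            raise ValueError("invalid integer")
        return int(digits), end + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        if start + length > len(data):
            raise ValueError("truncated string")
        return data[start:start + length], start + length
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result, last_key = {}, None
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be strings")
            if last_key is not None and key <= last_key:
                raise ValueError("dictionary keys out of order")
            last_key = key
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    raise ValueError("unexpected data at offset %d" % pos)
```

Decoded strings stay as bytes here because some values, such as the piece hashes, are binary rather than UTF-8 text.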
The 'name' key maps to a UTF-8 encoded string which is the suggested name to save the file (or directory) as. It is purely advisory.
'piece length' maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length, except possibly the last one, which may be truncated. 'piece length' is almost always a power of two, most commonly 2^18 = 256 K (BitTorrent prior to version 3.2 uses 2^20 = 1 M as default).
'pieces' maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.
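As an illustration of how the 'pieces' string and 'piece length' fit together, here is a small sketch that splits the hash string into 20-byte digests, checks a downloaded piece against its digest, and computes the size of a possibly truncated last piece. The function names are hypothetical and the inputs are assumed to be already-decoded values from the info dictionary.

```python
import hashlib


def split_piece_hashes(pieces):
    """Split the 'pieces' byte string into a list of 20-byte SHA1 digests."""
    if len(pieces) % 20 != 0:
        raise ValueError("'pieces' length must be a multiple of 20")
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def piece_is_valid(index, data, hashes):
    """Return True if data hashes to the digest stored for this piece index."""
    return hashlib.sha1(data).digest() == hashes[index]


def expected_piece_size(index, piece_length, total_length):
    """Every piece is piece_length bytes, except that the last may be shorter."""
    start = index * piece_length
    return min(piece_length, total_length - start)
```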
There is also a key 'length' or a key 'files', but not both or neither. If 'length' is present then the download represents a single file; otherwise it represents a set of files which go in a directory structure.
In the single file case, 'length' maps to the length of the file in bytes.
For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value 'files' maps to, and is a list of dictionaries containing the following keys:
length - The length of the file, in bytes.
path - A list of UTF-8 encoded strings corresponding to subdirectory names, the last of which is the actual file name (a zero length list is an error case).
In the single file case, the 'name' key is the name of a file; in the multiple file case, it's the name of a directory.
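To make the concatenation rule concrete, the sketch below maps each file to its starting offset in the single virtual byte stream that pieces are cut from. It assumes an info dictionary that has already been decoded into Python values with string keys; `file_layout` is a hypothetical helper name.

```python
def file_layout(info):
    """Return a list of (relative_path, start_offset, length) tuples."""
    name = info["name"]
    if "length" in info:
        # Single file case: 'name' is the file name.
        return [(name, 0, info["length"])]
    layout, offset = [], 0
    for entry in info["files"]:
        if not entry["path"]:
            raise ValueError("a zero length path list is an error")
        # Multiple file case: 'name' is the top-level directory.
        path = "/".join([name] + entry["path"])
        layout.append((path, offset, entry["length"]))
        offset += entry["length"]
    return layout
```

With these offsets, piece index i covers bytes i * piece_length onward of the concatenated stream, so a single piece may span the end of one file and the start of the next.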
Tracker GET requests have the following keys:
info_hash
The 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. This value will almost certainly have to be escaped. Note that this is a substring of the metainfo file. The info-hash must be the hash of the encoded form as found in the .torrent file, which is identical to bdecoding the metainfo file, extracting the info dictionary and encoding it if and only if the bdecoder fully validated the input (e.g. key ordering, absence of leading zeros). Conversely, that means clients must either reject invalid metainfo files or extract the substring directly. They must not perform a decode-encode roundtrip on invalid data.
peer_id
A string of length 20 which this downloader uses as its id. Each downloader generates its own id at random at the start of a new download. This value will also almost certainly have to be escaped.
ip
An optional parameter giving the IP (or dns name) which this peer is at. Generally used for the origin if it's on the same machine as the tracker.
port
The port number this peer is listening on. Common behavior is for a downloader to try to listen on port 6881 and if that port is taken try 6882, then 6883, etc. and give up after 6889.
uploaded
The total amount uploaded so far, encoded in base ten ascii.
downloaded
The total amount downloaded so far, encoded in base ten ascii.
left
The number of bytes this peer still has to download, encoded in base ten ascii. Note that this can't be computed from downloaded and the file length since it might be a resume, and there's a chance that some of the downloaded data failed an integrity check and had to be re-downloaded.
event
This is an optional key which maps to 'started', 'completed', or 'stopped' (or 'empty', which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using 'started' is sent when a download first begins, and one using 'completed' is sent when the download is complete. No 'completed' is sent if the file was complete when started. Downloaders send an announcement using 'stopped' when they cease downloading.
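The request keys above can be assembled into an announce URL as sketched below. This is an assumption-laden illustration: `build_announce_url` is a hypothetical helper, the tracker URL in the usage lines is a placeholder, and escaping is delegated to `urllib.parse.urlencode`, which percent-encodes the raw bytes of info_hash and peer_id.

```python
import os
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=""):
    """Build a tracker GET URL; info_hash and peer_id are 20 raw bytes each."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes long")
    params = {
        "info_hash": info_hash,   # bytes values are percent-encoded
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,     # integers are sent as base ten ASCII
        "downloaded": downloaded,
        "left": left,
    }
    if event:                     # omitted for regular interval announcements
        params["event"] = event
    separator = "&" if "?" in announce else "?"
    return announce + separator + urlencode(params)


# Hypothetical usage: a random peer id generated at the start of a download.
peer_id = os.urandom(20)
url = build_announce_url("http://tracker.example/announce", bytes(20),
                         peer_id, 6881, 0, 0, 1024, event="started")
```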
Tracker responses are bencoded dictionaries. If a tracker response has a key 'failure reason', then that maps to a human readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: 'interval', which maps to the number of seconds the downloader should wait between regular rerequests, and 'peers'. 'peers' maps to a list of dictionaries corresponding to peers, each of which contains the keys 'peer id', 'ip', and 'port', which map to the peer's self-selected ID, IP address or dns name as a string, and port number, respectively. Note that downloaders may rerequest on nonscheduled times if an event happens or they need more peers.
More commonly, trackers return a compact representation of the peer list; see BEP 23.
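For reference, the compact representation described in BEP 23 packs each IPv4 peer into six bytes: four bytes of address followed by a two-byte port in network byte order. That layout comes from BEP 23 rather than from this document, and `parse_compact_peers` is a hypothetical name, so treat the following as a sketch under those assumptions.

```python
import socket
import struct


def parse_compact_peers(blob):
    """Decode a compact 'peers' byte string into (ip, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack(">H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers
```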
If you want to make any extensions to metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are done compatibly.
It is common to announce over a UDP tracker protocol as well.
BitTorrent's peer protocol operates over TCP or uTP.
Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.
The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and checks that the hash matches, it announces that it has that piece to all of its peers.
Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later in this document.
Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader doesn't have something they currently would ask a peer for if unchoked, they must express lack of interest, despite being choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.
Connections start out choked and not interested.
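The two bits of state held at each end of a connection can be modeled as four flags when seen from one client. The field names below (`am_choking`, `peer_interested`, and so on) are a common convention assumed for this sketch, not something this document defines.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    """Per-connection flags; connections start out choked and not interested."""
    am_choking: bool = True        # we are choking the remote peer
    am_interested: bool = False    # we want something the remote peer has
    peer_choking: bool = True      # the remote peer is choking us
    peer_interested: bool = False  # the remote peer wants something we have

    def can_download(self):
        # Data flows to us when we are interested and the peer is not choking.
        return self.am_interested and not self.peer_choking

    def can_upload(self):
        # Data flows from us when the peer is interested and we are not choking.
        return self.peer_interested and not self.am_choking
```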
When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called 'pipelining'.) On the other side, requests which can't be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.
The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.
All later integers sent in the protocol are encoded as four bytes big-endian.
After the fixed headers come eight reserved bytes, which are all zero in all current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure all extensions are done compatibly.
Next comes the 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as info_hash to the tracker, only here it's raw instead of quoted.) If the two sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port: it may wait for incoming connections to give a download hash first, and respond with the same one if it's in its list.
After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.
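Putting the handshake fields together gives a fixed 68-byte message: one length byte, the 19-byte protocol string, eight reserved bytes, the 20-byte info hash and the 20-byte peer id. The helper names below are hypothetical; the layout follows the description above.

```python
PROTOCOL_STRING = b"BitTorrent protocol"


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Serialize a handshake; reserved bytes default to all zeros."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("unexpected field length")
    return (bytes([len(PROTOCOL_STRING)]) + PROTOCOL_STRING
            + reserved + info_hash + peer_id)


def parse_handshake(data):
    """Return (reserved, info_hash, peer_id) from a 68-byte handshake."""
    if len(data) != 68 or data[0] != 19 or data[1:20] != PROTOCOL_STRING:
        raise ValueError("not a BitTorrent handshake")
    return data[20:28], data[28:48], data[48:68]
```

A receiver would compare the parsed info hash with the downloads it serves and close the connection on a mismatch.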
That's it for handshaking. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.
All non-keepalive messages start with a single byte which gives their type.
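The possible values are:
- 0: choke
- 1: unchoke
- 2: interested
- 3: not interested
- 4: have
- 5: bitfield
- 6: request
- 7: piece
- 8: cancel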
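The framing just described (a four-byte big-endian length, then a one-byte type, then the payload) can be sketched as follows. The numeric type values in `MESSAGE_NAMES` are the ones BEP 3 assigns; the function names are hypothetical, and the decoder simply returns whatever bytes do not yet form a complete message.

```python
import struct

MESSAGE_NAMES = {0: "choke", 1: "unchoke", 2: "interested", 3: "not interested",
                 4: "have", 5: "bitfield", 6: "request", 7: "piece", 8: "cancel"}


def encode_message(msg_id=None, payload=b""):
    """Frame a message; msg_id=None produces a zero-length keepalive."""
    if msg_id is None:
        return struct.pack(">I", 0)
    return struct.pack(">IB", len(payload) + 1, msg_id) + payload


def decode_messages(buffer):
    """Return ([(msg_id, payload), ...], leftover_bytes) from a byte stream."""
    messages, pos = [], 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if len(buffer) - pos - 4 < length:
            break                      # wait for the rest of this message
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if length == 0:
            continue                   # keepalive: ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[pos:]
```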
'choke', 'unchoke', 'interested', and 'not interested' have no payload.
'bitfield' is only ever sent as the first message. Its payload is a bitfield in which the bit for each piece index the downloader has is set to one and the rest are set to zero. Downloaders which don't have anything yet may skip the 'bitfield' message. The first byte of the bitfield corresponds to indices 0 - 7 from high bit to low bit, respectively, the next one to 8 - 15, etc. Spare bits at the end are set to zero.
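The bit ordering of the bitfield payload is easy to get backwards, so here is a short sketch of packing and unpacking it. The helper names are hypothetical; index 0 maps to the high bit of the first byte as described above.

```python
def encode_bitfield(have, num_pieces):
    """Pack a set of piece indices into a bitfield payload."""
    field = bytearray((num_pieces + 7) // 8)   # spare bits stay zero
    for index in have:
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field, num_pieces):
    """Return the set of piece indices whose bit is set."""
    return {i for i in range(num_pieces)
            if field[i // 8] & (0x80 >> (i % 8))}
```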
The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of.
'request' messages contain an index, begin, and length. The last two are byte offsets. Length is generally a power of two unless it gets truncated by the end of the file. All current implementations use 2^14 (16 kiB), and close connections which request an amount greater than that.
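Since a piece is usually larger than the 2^14 byte request size, a downloader splits each piece into several block requests. The sketch below builds the framed 'request' messages for one piece; `request_messages` is a hypothetical helper and it assumes message type 6 for 'request', as assigned by BEP 3.

```python
import struct

BLOCK_SIZE = 2 ** 14  # 16 kiB, the size the text says implementations use


def request_messages(index, piece_size):
    """Return framed 'request' messages covering one piece."""
    messages = []
    for begin in range(0, piece_size, BLOCK_SIZE):
        # The final block is truncated at the end of the piece.
        length = min(BLOCK_SIZE, piece_size - begin)
        payload = struct.pack(">III", index, begin, length)
        # Length prefix counts the type byte plus the 12-byte payload.
        messages.append(struct.pack(">IB", len(payload) + 1, 6) + payload)
    return messages
```

Keeping several of these requests outstanding at once is what the pipelining advice earlier in the text refers to.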
'cancel' messages have the same payload as request messages. They are generally only sent towards the end of a download, during what's called 'endgame mode'. When a download is almost complete, there's a tendency for the last few pieces to all be downloaded off a single hosed modem line, taking a very long time. To make sure the last few pieces come in quickly, once requests for all pieces a given downloader doesn't have yet are currently pending, it sends requests for everything to everyone it's downloading from. To keep this from becoming horribly inefficient, it sends cancels to everyone else every time a piece arrives.
'piece' messages contain an index, begin, and piece. Note that they are correlated with request messages implicitly. It's possible for an unexpected piece to arrive if choke and unchoke messages are sent in quick succession and/or transfer is going very slowly.
Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers.
Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-ish algorithm to ensure that they get a consistent download rate.
The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one.
There are several criteria a good choking algorithm should meet. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as 'fibrillation'. It should reciprocate to peers who let it download. Finally, it should try out unused connections once in a while to find out if they might be better than the currently used ones, known as optimistic unchoking.
The currently deployed choking algorithm avoids fibrillation by only changing who's choked once every ten seconds. It reciprocates and caps the number of uploads by unchoking the four interested peers from which it has the best download rates. Peers which have a better upload rate but aren't interested get unchoked, and if they become interested the worst uploader gets choked. If a downloader has a complete file, it uses its upload rate rather than its download rate to decide who to unchoke.
For optimistic unchoking, at any one time there is a single peer which is unchoked regardless of its upload rate (if interested, it counts as one of the four allowed downloaders.) Which peer is optimistically unchoked rotates every 30 seconds. To give them a decent chance of getting a complete piece to upload, new connections are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.
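One way to read the selection rule is as a single pass over peers sorted by rate, run once per ten-second round. The sketch below is an interpretation under stated assumptions, not the deployed code: `choose_unchoked` and its parameters are hypothetical, rates are whatever measure the client uses (download rate, or upload rate once the file is complete), and rotating the optimistic peer every 30 seconds is left to the caller.

```python
def choose_unchoked(rates, interested, optimistic=None, slots=4):
    """Return the set of peers to unchoke for the next round.

    rates: dict mapping peer -> measured transfer rate
    interested: set of peers currently interested in our data
    optimistic: the peer currently holding the optimistic unchoke, if any
    """
    unchoked, used = set(), 0
    if optimistic is not None:
        unchoked.add(optimistic)       # unchoked regardless of its rate
        if optimistic in interested:
            used = 1                   # counts as one of the allowed downloaders
    for peer in sorted(rates, key=rates.get, reverse=True):
        if used >= slots:
            break
        if peer in unchoked:
            continue
        # Faster peers that are not interested are unchoked without
        # consuming a slot; interested ones consume one slot each.
        unchoked.add(peer)
        if peer in interested:
            used += 1
    return unchoked
```

Because uninterested fast peers are already unchoked, one of them becoming interested changes the result of the next round so that the slowest interested peer drops out, which matches the behavior described above.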
This document has been placed in the public domain.
Introduction to the BitTorrent protocol (with diagrams, worth a look)
https://www.jianshu.com/p/84202c4f11d3?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation