@zhongdao
2019-09-01T17:13:35.000000Z
字数 17753
阅读 2510
未分类
IO::Socket::Timeout: socket timeout made easy
IO::Socket::Timeout: Socket Timeout 变得简单
原文链接:
https://medium.com/booking-com-development/io-socket-timeout-socket-timeout-made-easy-4dfd43e777f7
Without network operations, running a website for booking accommodation online would be nearly impossible. Network operations can be anything from simple actions like talking to a browser with a user at the other end, to providing an API for our affiliates, or even writing the internal services that help maintain our systems.
如果没有网络运营,运行一个在线预订住宿的网站几乎是不可能的。 网络操作可以是任何简单的动作,比如与另一端的浏览器用户通讯,为我们的分支机构提供 API,甚至是编写内部服务来帮助维护我们的系统
dams Feb 26, 2015 2015年2月26日
Network operations are everywhere, and these are only a few examples of where we use them.
网络操作无处不在,这些只是我们使用它们的几个例子。
Network communication typically happens through the use of sockets.
网络通信通常通过使用套接字进行。
A network socket is one of the core software components of all modern operating systems. It is a software interface for network services provided by the operating system. It provides a uniformed way of opening a connection to this service and sends and receives data.
网络套接字network socket是所有现代操作系统的核心软件组件之一。 它是操作系统提供的网络服务的软件接口。 它提供了一种统一的方式来打开到此服务的连接,并发送和接收数据。
Network sockets are used everywhere in the IT world, and they allow us to communicate between different hosts or different programs on the same host. Despite having different kinds of network services (TCP and **UDP**being prominent examples), network sockets provide a common way to interact with them.
网络套接字在 IT 世界中无处不在,它们允许我们在同一台主机上的不同主机或不同程序之间进行通信。 尽管有不同种类的网络服务(TCP 和 UDP 是突出的例子) ,网络套接字提供了与它们交互的通用方式。
Here is an example of interacting with the Google.com website using IO::Socket::INET, the standard Perl socket module. (IO means Input/Output, and INET means Internet.)
下面是使用标准的 Perl 套接字模块 IO: : Socket: INET 与 google 网站进行交互的一个示例。 (IO 表示输入 / 输出,INET 表示互联网。)
# example 1my $socket = IO::Socket::INET->new('google.com:80');
print {$socket} "GET / \n";
my $html = join '', <$socket>;
Interestingly, IO::Socket::INET
is mostly used for its Object Oriented capable interface. The following example performs the same operations as the previous one, but in an object oriented way:
有趣的是,IO: : Socket: INET 主要用于其面向对象的接口。 下面的示例执行与前一个示例相同的操作,但是是以面向对象的方式:
# example 2
my $socket = IO::Socket::INET->new(
PeerHost => 'www.booking.com',
PeerPort => 80,);
$socket->print("GET / \n");
my $html = join '', $socket->getlines();
At Booking.com, the handling of requests in a timely manner is critical to the user experience, to operations, and ultimately to our business. To achieve speed and low latency, our platform involves different subsystems and resources are constantly requested.
在预订网站上,及时处理请求对于用户体验、运营以及最终对于我们的业务都是至关重要的。 为了实现速度和低延迟,我们的平台涉及到不同的子系统和资源,需要不断地请求。
It is essential that these systems reply quickly. It is also vital that we detect when one of them isn’t replying fast enough, that way it can be flagged and a mitigating strategy can be found (such as using an alternative subsystem).
这些系统必须迅速作出反应。 同样重要的是,当其中一个回复速度不够快时,我们能够检测到它,这样就可以标记它,并找到一个缓解策略(比如使用一个替代子系统)。
We use Redis in a lot of places: as a cache layer, queue system, and for storage. Redis offers very low latency, and we make use of this feature. However, with sockets, we don’t always know immediately when a connection has been lost. Knowing a Redis server is unreachable 30 seconds after the fact — is 30 seconds too late. Our goal is to know this in under a second. For other cases it might be possible (or even mandatory) to allow a longer timeout. It really depends on the subsystems involved as well as the usage.
我们在很多地方使用 Redis: 作为缓存层、队列系统和存储。 Redis 提供了非常低的延迟,我们利用了这个特性。 但是,使用 sockets 时,我们并不总是能够立即知道何时断开了连接。 知道 Redis 服务器在事后30秒后无法访问,那就晚了30秒。 我们的目标是在一秒钟之内了解这一点。 对于其他情况,允许更长的超时是可能的(甚至是强制的)。 这实际上取决于所涉及的子系统以及使用情况。
Most of the time these subsystems are queried using a network socket. So being able to detect that a subsystem is not reachable implies that the sockets provide a way to specify timeouts.
大多数情况下,使用网络套接字查询这些子系统。 因此,能够检测到子系统是不可到达的,这意味着套接字提供了一种指定超时的方法。
This is why having a fast and reliable platform relies on having sockets that support timeouts. Using a socket involves three main steps: connecting to the external server, reading and writing data from and to it, and, at some point, closing the connection. A socket timeout implementation should allow for setting the timeout at connections, and both reading and writing steps at the very least.
这就是为什么拥有一个快速可靠的平台依赖于拥有支持超时的套接字。 使用套接字涉及三个主要步骤: 连接到外部服务器,从外部服务器读取和写入数据,以及在某个时候关闭连接。 套接字超时实现应该允许在连接处设置超时,以及至少读写步骤。
IO::Socket provides a timeout
method, and IO::Socket::INET provides a Timeout
option. The Timeout
option can be used to set a timeout on the connection to the server. For example, this is how we connect to a local HTTP server on port 80 with a connection timeout of 3 seconds:
Socket 提供了一个超时方法,IO: : Socket: INET 提供了一个超时选项。 可以使用 Timeout 选项设置到服务器的连接的超时。 例如,我们在端口80上连接本地 HTTP 服务器,连接超时时间为3秒:
my $socket = IO::Socket::INET->new( PeerHost => '127.0.0.1', PeerPort => 80, Timeout => 3,);
So far so good, but how do we deal with read or write timeouts? What if the server accepts the connection, but then at some point stops communicating? The client socket needs to realize this quickly. We need timeouts for this.
到目前为止还不错,但是我们如何处理读或写超时呢? 如果服务器接受了这个连接,但是在某个时刻停止了通信,那该怎么办? 客户端套接字需要快速实现这一点。 我们需要暂停。
It is relatively easy to change the option of a socket to supply these timeouts. This is an example that works on GNU/Linux, given $timeout
in (optionally fractional) seconds:
更改套接字的选项来提供这些超时是相对容易的。 这是一个在 gnu / linux 上运行的示例,给定 $timeout (可选的小数)秒:
my $seconds = int($timeout);my $useconds = int( 1_000_000 * ( $timeout - $seconds ) );my $timeout = pack( 'l!l!', $seconds, $useconds );$socket->setsockopt( SOL_SOCKET, SO_RCVTIMEO, $timeout )# then use $socket as usual
The only problem is that it only works on some architecture and operating systems. A generic solution is better. Let’s look at the available options on systems that do not support setsockopt
.
唯一的问题是它只适用于某些架构和操作系统。 一般的解决方案更好。 让我们看看不支持 setsockopt 的系统上的可用选项。
Another more portable (albeit slower) way to simulate a timeout on a socket is to check if the socket is readable/writable with a timeout in a non-blocking way. select(2)
can do this, and the Perl select()
function can provide access to it.
在套接字上模拟超时的另一种更便携(尽管较慢)的方法是以非阻塞的方式检查套接字是否具有可读 / 可写超时。 Select (2)可以做到这一点,并且 Perl select ()函数可以提供对它的访问。
Here is a simplified version of a function that returns true if we can read the socket with the given timeout:
下面是一个函数的简化版本,如果我们可以用给定的超时读取套接字,它将返回 true:
sub _can_read { my ( $file_desc, $timeout ) = @_; vec( my $fdset = '', $file_desc, 1 ) = 1; my $nfound = select( $fdset, undef, undef, $timeout );}
Yet another way is to use external modules or system calls, like epoll (via IO::Poll), libevent, or libev. To simplify things, it’s common to use higher-level event-based modules like AnyEvent and POE. They make it easy to specify a timeout to any IO (Input/Output) operations.
还有一种方法是使用外部模块或系统调用,比如 epoll (通过 IO: : Poll)、 libevent 或 libev。 为了简化事情,通常使用基于事件的高级模块,如 AnyEvent 和 POE。 它们使得很容易为任何 IO (输入 / 输出)操作指定超时。
This is an example using AnyEvent
, which will set a connection timeout of 0.5 second and a read or write timeout of 0.01 second:
这是一个使用 AnyEvent 的例子,它将连接超时设置为0.5秒,读写超时设置为0.01秒:
my $handle = AnyEvent::Handle->new ( connect => [ $host, $port ], on_prepare => sub { 0.5 }, # ...);$handle->on_timeout( sub { say 'timeout occurred' } );$handle->timeout(0.01);
While completely valid, it applies only to programs that use these event-based modules. It is useless to standard imperative programs. We need a method for providing timeout features to the standard socket API without changing the operation needed to require an event loop.
虽然完全有效,但它只适用于使用这些基于事件的模块的程序。 对于标准的命令式程序来说是没有用的。 我们需要一种方法来为标准套接字 API 提供超时特性,而不需要更改需要事件循环的操作。
Let’s step back for a moment. We have two ways to setup a timeout on a socket:
让我们退后一步。 我们有两种方法来设置套接字超时:
setsocket
.select
.We need to abstract these two ways of setting timeouts behind a simple and easy-to-use API. Let’s consider this example:
我们需要抽象出在简单易用的 API 背后设置超时的这两种方法。 让我们来看看这个例子:
my $socket = IO::Socket::INET->new( ... );print {$socket} 'something';
(Please note that we don’t use object-oriented notations on the socket.)
(请注意,我们不在套接字上使用面向对象的符号。)
What we want is an easier way to set timeout on the $socket
. For example this:
我们需要的是一种在 $socket 上设置超时的简单方法。 例如:
my $socket = IO::Socket::INET->new( ... );# set timeouts$socket->read_timeout(0.5);# use the socket as beforeprint {$socket} 'something';# later, get the timeout valuemy $timeout = $socket->read_timeout();
If we can use setsockopt
, setting the timeout using ->read_timeout(0.5)
is easy. It can be implemented as a method that we add to IO::Socket::INET
class, possibly by using a Role.
如果我们可以使用 setsockopt,使用-read timeout (0.5)设置超时很容易。 它可以作为我们添加到 IO: : Socket: INET 类的方法实现,可以使用一个角色。
This method would just fire setsockopt
with the right parameters, and save the timeout value into $socket
for later retrieval. Then we can carry on using $socket
as before.
这个方法只需使用正确的参数触发 setsockopt,并将超时值保存到 socket。
One subtlety is that, because the $socket
is not a classic hash reference instance, but an anonymous typeglob on a hash reference, instead of doing $socket->{ReadTimeout} = 0.5
we need to do ${*$socket}{ReadTimeout} = 0.5
... but that's just an implementation detail.
一个微妙之处在于,因为 { * $socket }{ ReadTimeout }0.5... ... 但这只是一个实现细节。
If however the program is running in a situation where setsockopt
can't be used, we have to resort to using the select
method. That poses a problem. Because we're not using object oriented programming, the operation on the socket is not done via a method we could easily override, but directly using the built-in function print
.
但是,如果程序在 setsockopt 不能使用的情况下运行,我们必须使用 select 方法。 这就产生了一个问题。 因为我们没有使用面向对象编程,所以套接字上的操作不是通过我们可以轻松覆盖的方法来完成的,而是直接使用内置函数 print。
Overwriting a core function is not a good practice for various reasons. Luckily, Perl provides a clean way to implement custom behavior in the IO layer.
由于各种原因,覆盖核心函数并不是一个好的做法。 幸运的是,Perl 提供了一种在 IO 层中实现自定义行为的干净方法。
Perl Input/Output mechanism is based on a system of layers. It is documented in the perliol(1) man page.
Perl 输入 / 输出机制基于一个层系统。 它在 perliol (1)手册页中有记录。
What’s the PerlIO API? It’s a stack of layers that live between the system and the perl generic file-handle API. Perl provides core layers (such as :unix
, :perlio
, :stdio
, and :crlf
). It also provides extension layers (such as :encoding
and :via
).
什么是 PerlIO API? 它是一个位于系统和 perl 通用文件句柄 API 之间的层堆栈。 Perl 提供了核心层(例如: unix、 : perlio、 : stdio 和: crlf)。 它还提供了扩展层(例如: encoding 和: via)。
These layers can be stacked and removed in order to provide more features (when layers are added) or more performance (when layers are removed).
这些层可以堆叠和移除,以提供更多的功能(当层被添加)或更多的性能(当层被移除)。
The huge benefit is that no matter which layers are setup on a file handle or socket, the API doesn’t change and the read/write operations are the same. Calls to them will go through the specified layers attached to the handle until they potentially reach the system calls.
最大的好处是,不管在文件句柄或套接字上设置了哪些层,API 都不会改变,读 / 写操作也是相同的。 对它们的调用将通过附加在句柄上的指定层,直到它们可能到达系统调用。
Here is an example:
下面是一个例子:
open my $fh, 'filename';# for direct binary non-buffered accessbinmode $fh, ':raw';# specify that the file is in utf8, and enforce validationbinmode $fh, ':encoding(UTF-8)';my $line = <$fh>;
The :via
layer is a special layer that allows anyone to implement a PerlIO layer in pure Perl. Contrary to implementing a PerlIO layer in C, using the :via
layer is rather easy: it is just a Perl class, with some specific methods. The name of the class is given when setting the layer:
Via 层是一个特殊层,任何人都可以使用纯 Perl 实现 PerlIO 层。 与在 c 语言中实现 PerlIO 层相反,使用: via 层相当简单: 它只是一个 Perl 类,带有一些特定的方法。 在设置图层时给出类的名称:
binmode $fh, ':via(MyOwnLayer)';
Many :via
layers already exist. They all start with PerlIO::via::
and are available on CPAN. For instance, PerlIO::via::json will automatically and transparently decode and encode the content of a file or a socket from and to JSON.
许多: 通过层已经存在。 它们都是以 PerlIO: : via: 开始的,并且可以在 CPAN 上获得。 例如,PerlIO: : via: JSON 将自动、透明地解码并编码来自 JSON 和来自 JSON 的文件或套接字的内容。
Back to the problem. We could implement a :via
layer that makes sure that read and write operations on the underlying handle are performed within the given timeout.
回到问题上来。 我们可以实现一个: via 层,确保在给定的超时内对底层句柄执行读写操作。
A :via
layer is a class that should start with PerlIO::via::
and implement a set of methods, like READ
, WRITE
, PUSHED
, and POPPED
- (see the PerlIO::viamanual for more details).
Via 层是一个类,它应该从 PerlIO: : via: 开始,并实现一组方法,比如 READ、 WRITE、 PUSHED 和 POPPED-(参见 PerlIO: : via 手册获得更多详细信息)。
Let’s take the READ
method as an illustration. This is a very simplified version. The real version handles EINTR
and other corner cases.
让我们以 READ 方法为例。 这是一个非常简化的版本。 真正的版本处理 EINTR 和其他角落情况。
package PerlIO::via::Timeout;sub READ { my ( $self, $buf, $len, $fh ) = @_; my $fd = fileno($fh); # we use the same can_read as previously can_read( $fd, $timeout ) or return 0; return sysread( $fh, $buf, $len, 0 );}
The idea is to check if we can read on the filesystem using select
in the given timeout. If not, return 0. If yes, call the normal sysread operation. It's simple and it works great.
这个想法是检查我们是否可以在给定的超时中使用 select 读取文件系统。 如果没有,返回0。 如果是,调用正常的 sysread 操作。 它很简单,而且效果很好。
We’ve just implemented a new PerlIO layer using the :via
mechanism! A PerlIO layer works on any handle, including file and socket. Let's try it on a file-handle:
我们刚刚使用: via 机制实现了一个新的 PerlIO 层! Perlio 层适用于任何句柄,包括文件和套接字。 让我们在文件句柄上试一试:
use PerlIO::via::Timeout;open my $fh, '<:via(Timeout)', 'foo.html';my $line = <$fh>;if ( $line == undef && 0+$! == ETIMEDOUT ) { # timed out reading ...} else { # we read one line fast enough, success! ...}
I’m sure you can see that there is an issue in the code above. At no point do we set the read timeout value. The :via
pseudo-layer doesn't allow us to easily pass a parameter to the layer creation. Though we can technically, we would not be able to change the parameter afterwards. If we want to be able to set, change, or remove the timeout on the handle at any time, we need to somehow attach this information to the handle, and we need to be able to change it.
我相信您可以看到在上面的代码中有一个问题。 在任何时候,我们都不设置读取超时值。 Via 的伪层不允许我们轻松地传递一个参数到层创建。 虽然我们可以在技术上,我们将不能改变的参数之后。 如果我们希望能够在任何时候设置、更改或删除句柄上的超时,我们需要以某种方式将此信息附加到句柄,并且我们需要能够更改它。
A handle is not an object. We can’t just add a new timeout attribute to a handle and then set or get it.
句柄不是对象。 我们不能只是向句柄添加一个新的 timeout 属性,然后设置或获取它。
Luckily, the moment a handle is opened it receives a unique ID: its file descriptor. A file descriptor is not always unique because they are recycled and reused. Yet, if we know when a handle is opened and closed we can be sure that between these actions a file descriptor is given that uniquely identifies it.
幸运的是,当一个句柄被打开时,它会收到一个惟一的 ID: 它的文件描述符。 文件描述符并不总是唯一的,因为它们被回收和重用。 然而,如果我们知道一个句柄何时打开或关闭,我们可以确定在这些操作之间会给出一个文件描述符来唯一地标识它。
The :via
PerlIO layer allows us to implement PUSHED
, POPPED
, and CLOSE
. These functions are called when the layer is added to the handle, when it's removed, and when the handle is closed. We can use these functions to detect if and when to consider the file descriptor as a unique ID for the given handle.
通过 PerlIO 层允许我们实现 push、 POPPED 和 CLOSE。 当将层添加到句柄时、删除该层时以及关闭句柄时,都会调用这些函数。 我们可以使用这些函数来检测是否以及何时将文件描述符视为给定句柄的唯一 ID。
We can create a hash table as a class attribute of our new layer. Here the keys are file descriptors and the values are a set of properties on the associated handle — essentially a basic implementation of Inside-Out OO — with the object not being its data structure only an ID. Using this hash table, we can associate a set of properties to a file descriptor and set the timeout value when the PerlIO layer is added. Like this:
我们可以创建一个哈希表作为新层的类属性。 这里的键是文件描述符,值是相关句柄上的一组属性(实质上是 Inside-Out OO 的基本实现) ,对象不是它的数据结构,而只是一个 ID。 使用这个哈希表,我们可以将一组属性关联到文件描述符,并在添加 PerlIO 层时设置超时值。 像这样:
my %fd_properties;sub PUSHED { my ( $class, $mode, $fh ) = @_; $fd_properties{ fileno($fh) } = { read_timeout => 0.5 }; # ...}
By doing this when we remove the layer too, we can also implement a way to associate timeout values to the file-handle.
通过这样做,当我们删除层也,我们还可以实现一种方法来关联超时值的文件句柄。
Wrapping up all the bits of code and features, the full package that implements this timeout layer, PerlIO::via::Timeout, is available on Github and CPAN.
包装所有的代码和特性,实现这个超时层的完整包 PerlIO: : via: : Timeout 可以在 Github 和 CPAN 上使用。
We now have all the ingredients we need to implement the desired behavior. enable_timeouts_on
will receive the socket and modify its class (which should be IO::Socket::INET
or inherited from it) to implement these methods:
我们现在已经具备了实现所需行为的所有要素。 启用超时将接收套接字并修改其类(应该是 IO: : Socket: INET 或从其继承)来实现这些方法:
read_timeout
: get/set the read timeout : 获取 / 设置读取超时write_timeout
: get/set the write timeout : 获取 / 设置写入超时disable_timeout
: switch off timeouts (while remembering their values) : 关闭超时(同时记住其值)enable_timeout
: switch back on the timeouts : 重新打开暂停timeout_enabled
: returns whether the timeouts are enabled : 返回是否启用了超时In order to modify the IO::Socket::INET
class in a clean way, let's create a role and apply it to the class. In fact, let's create two roles: one that implements the various methods using setsockopt
and another role that uses select
(with PerlIO::via::Timeout
).
为了以一种干净的方式修改 IO: : Socket: INET 类,让我们创建一个角色并将其应用于该类。 实际上,让我们创建两个角色: 一个使用 setsockopt 实现各种方法,另一个角色使用 select (使用 PerlIO: : via: Timeout)。
A Role (sometimes known as Trait) provides additional behavior to a class in the form of composition. Roles provide introspection, mutual exclusion capabilities, and horizontal composition instead of the more widely used inheritance model. A class simply consumes a Role, receiving any and all behavior the Role provides, whether these are attributes, methods, method modifiers, or even constraints on the consuming class.
角色(有时称为 Trait)以组合的形式为类提供附加行为。 角色提供自省、互斥锁和水平组合功能,而不是使用更广泛的继承模型。 一个类仅仅使用一个 Role,接收 Role 提供的所有行为,不管这些行为是属性、方法、方法修饰符,还是消费类上的约束。
Detailing the implementation of the role mechanism here is a bit out of the scope, but it’s still interesting to note that to keep IO::Socket::Timeout
lightweight, we don't use Moose::Role
or Moo::Role
, but instead we apply a stripped down variant of Role::Tiny
, which uses a single inheritance of a special class crafted in real time specifically for the targeted class. The code is short and can be seen here.
这里详细说明角色机制的实现有点超出范围,但是值得注意的是,为了保持 IO: : Socket: : Timeout 轻量级,我们不使用 Moose: : Role 或 Moo: : Role,而是应用一个精简版的 Role: : Tiny,它使用一个特殊类的实时继承,这个特殊类是专门为目标类制作的。 代码很短,可以在这里看到。
Use IO::Socket::Timeout to add read/write timeouts to any network socket created with IO::Socket::INET
, on any platform:
使用 IO: : Socket: : Timeout 在任何平台上向使用 IO: : Socket: INET 创建的任何网络套接字添加读 / 写超时:
# 1. Creates a socket as usualmy $socket = IO::Socket::INET->new( ... );# 2. Enable read and write timeouts on the socketIO::Socket::Timeout->enable_timeouts_on($socket);# 3. Setup the timeouts$socket->read_timeout(0.5);$socket->write_timeout(0.5);# 4. Use the socket as usualmy $data = <$socket>;# 5. Profit!
IO::Socket::Timeout
provides a lightweight, generic, and portable way of applying timeouts on sockets, and it plays an important role in the stability of the interaction between our subsystems at Booking.com. This wouldn't be possible without Perl's extreme flexibility.
1: : Socket: : Timeout 提供了一种轻量级的、通用的、可移植的方式来应用套接字超时,它在 booking. com 的子系统之间的交互的稳定性中扮演着重要的角色。 如果没有 Perl 的极端灵活性,这是不可能的。
Please be aware that there is a performance penalty associated with implementing IO layers in pure Perl. If you are worried about this, we recommend benchmarking when making the decision on whether to use it.
请注意,在纯 Perl 中实现 IO 层会带来性能损失。 如果您对此感到担心,我们建议您在决定是否使用基准测试时使用它。
IO::Socket::Timeout
is available on GitHub and CPAN.
在 GitHub 和 CPAN 上可以使用 Socket: : : Timeout。
Would you like to be a Developer at Booking.com? Work with us!
你想成为预订网站的开发者吗? 和我们一起工作吧!