CS144-Lab1

实验目的

实现一个TCPreceiver,接收datagram,重排,输出正确顺序的字节流。

要实现的类:StreamReassembler

1
2
3
4
5
6
7
8
9
10
11
12
13
class StreamReassembler {
private:
ByteStream _output; //!< The reassembled in-order byte stream
size_t _capacity; //!< The maximum number of bytes

public:
StreamReassembler(const size_t capacity);
void push_substring(const std::string &data, const uint64_t index, const bool eof);
const ByteStream &stream_out() const { return _output; }
ByteStream &stream_out() { return _output; }
size_t unassembled_bytes() const;
bool empty() const;
};

TCPreceiver的容量:字节流中,已经存储在内存中但未被读取的字节流,加上读取了一部分但是还没有接收完毕的字节流,这两部分之和。

不能修改public interface,可以自行添加private interface。

实验思路(踩坑记录)

  1. 参考手册中给出的图,设置了3个变量:firstUnread, firstUnassembled, firstUnacceptable。
  2. firstUnread到firstUnacceptable中间的部分是buffer,其大小不超过capacity,用一整个std::string实现。
  3. 关键的函数是push_substring,大致采用如下的流程:
1
2
3
4
5
6
7
8
void push_substring(data, index, eof) { // 注意data可能有重复
判断真实的起始位置;
将起始位置中间的字节扔给buffer;
更新firstUnassembled;
将firstUnread到firstUnassembled中间的部分扔给_output;
更新firstUnread, firstUnassembled, firstUnacceptable;
判断并处理EOF;
}
  1. 处理EOF必须用“已经写入output的字节数”和“当前EOF下的最后一个字节index”比较!如果当前输入eof为真,则需要更新indexEOF,并且与已经写入的字节数比较。使用其他方式获取真实EOF都有可能漏掉字节。
  2. buffer必须用“循环”的方式实现。如果不循环,就需要在buffer内部移动字符,这样可能造成输出字符串与输入字符串不一致。尽量不要在buffer内部移动字符。
  3. 用一个采用了deque实现的bitMap指示当前buffer内部有哪些字节是已经写入的,哪些字节还没有写入。这样做是因为可能第一个字节是最后到达的,因此必须记录哪些字节已到达,哪些未到达。
  4. unassembled_bytes()用一个单独的变量记录,不要尝试根据bitMap计算,容易出错。

代码

下面列出类StreamReassembler的私有变量。注意涉及指针的用智能指针。

1
2
3
4
5
6
7
8
9
10
ByteStream _output;  //!< The reassembled in-order byte stream
size_t _capacity; //!< The maximum number of bytes
size_t _unassempledBytes;
size_t _eofIndex;
std::unique_ptr<std::string> _buffer; //! Bytes stored in buffer
std::unique_ptr<std::deque<bool>> _bitMap; //Bit map, which indicates the storage usage
uint64_t _firstUnread;
uint64_t _firstUnassembled;
uint64_t _firstUnacceptable;
bool _eof;

下面给出该类的方法的实现。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
StreamReassembler::StreamReassembler(const size_t capacity)
: _output(capacity)
, _capacity(capacity)
, _unassempledBytes(0)
, _eofIndex(0)
, _buffer()
, _bitMap()
, _firstUnread(0)
, _firstUnassembled(0)
, _firstUnacceptable(capacity)
, _eof(false) {
_buffer.reset(new std::string(capacity, 0));
_bitMap.reset(new std::deque<bool>);
for (size_t i = 0; i < capacity; i++) {
_bitMap->emplace_back(false);
}
}

//! \details This function accepts a substring (aka a segment) of bytes,
//! possibly out-of-order, from the logical stream, and assembles any newly
//! contiguous substrings and writes them into the output stream in order.
void StreamReassembler::push_substring(const string &data, const size_t index, const bool eof) {
size_t start = max(index, _firstUnassembled);
size_t end = min(min(index + data.size(), _firstUnacceptable), _firstUnread + _output.remaining_capacity());
end = max(start, end);
if (end - start > 0) {
for (size_t i = 0; i < end - start; i++) {
_buffer->at((start + i) % _capacity) = data[start - index + i];
if (!(_bitMap->at(start - _firstUnread + i))) {
_unassempledBytes++;
_bitMap->at(start - _firstUnread + i) = true;
}
}

// Change tags
for (size_t i = _firstUnassembled - _firstUnread; i < _capacity && _bitMap->at(i); i++, _firstUnassembled++)
;

// Copy _buffer[_firstUnread.._firstUnassembled] to _output
if (_firstUnread != _firstUnassembled) {
string dataSubmit(_firstUnassembled - _firstUnread, 0);
for (size_t i = _firstUnread; i < _firstUnassembled; i++) {
dataSubmit[i - _firstUnread] = _buffer->at(i % _capacity);
}
size_t writtenBytes = _output.write(dataSubmit);
_unassempledBytes -= writtenBytes;
for (size_t i = 0; i < min(_firstUnassembled - _firstUnread, _capacity) &&
_firstUnassembled - _firstUnread + i < _capacity;
i++) {
_bitMap->pop_front();
}
for (size_t i = _firstUnassembled - _firstUnread; i < _capacity; i++) {
_bitMap->emplace_back(false);
}
_firstUnread = _firstUnassembled;
_firstUnacceptable = _firstUnread + _capacity;
}
}

// Handle EOF
if (eof) {
_eofIndex = index + data.size() + 1;
}
if (_output.bytes_written() + 1 == _eofIndex) {
_output.end_input();
}
}

size_t StreamReassembler::unassembled_bytes() const {
return _unassempledBytes;
}

bool StreamReassembler::empty() const { return unassembled_bytes() == 0; }

通过了

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
[100%] Testing the stream reassembler...
Test project /media/sf_share/lab0/sponge/build
Start 15: t_strm_reassem_single
1/16 Test #15: t_strm_reassem_single ............ Passed 0.01 sec
Start 16: t_strm_reassem_seq
2/16 Test #16: t_strm_reassem_seq ............... Passed 0.12 sec
Start 17: t_strm_reassem_dup
3/16 Test #17: t_strm_reassem_dup ............... Passed 0.02 sec
Start 18: t_strm_reassem_holes
4/16 Test #18: t_strm_reassem_holes ............. Passed 0.01 sec
Start 19: t_strm_reassem_many
5/16 Test #19: t_strm_reassem_many .............. Passed 0.61 sec
Start 20: t_strm_reassem_overlapping
6/16 Test #20: t_strm_reassem_overlapping ....... Passed 0.01 sec
Start 21: t_strm_reassem_win
7/16 Test #21: t_strm_reassem_win ............... Passed 0.68 sec
Start 22: t_strm_reassem_cap
8/16 Test #22: t_strm_reassem_cap ............... Passed 0.26 sec
Start 23: t_byte_stream_construction
9/16 Test #23: t_byte_stream_construction ....... Passed 0.00 sec
Start 24: t_byte_stream_one_write
10/16 Test #24: t_byte_stream_one_write .......... Passed 0.00 sec
Start 25: t_byte_stream_two_writes
11/16 Test #25: t_byte_stream_two_writes ......... Passed 0.00 sec
Start 26: t_byte_stream_capacity
12/16 Test #26: t_byte_stream_capacity ........... Passed 0.93 sec
Start 27: t_byte_stream_many_writes
13/16 Test #27: t_byte_stream_many_writes ........ Passed 0.01 sec
Start 50: t_address_dt
14/16 Test #50: t_address_dt ..................... Passed 5.02 sec
Start 51: t_parser_dt
15/16 Test #51: t_parser_dt ...................... Passed 0.01 sec
Start 52: t_socket_dt
16/16 Test #52: t_socket_dt ...................... Passed 0.01 sec

100% tests passed, 0 tests failed out of 16

Total Test time (real) = 7.88 sec
[100%] Built target check_lab1

总结

这个lab是CS144的正经的第一个lab。整体思路没有太大问题,但是细节上有很多需要注意的地方,需要考虑很多特殊的case。buffer需要用循环的这一点确实没有想到,可能是对buffer内部的修改比较容易出错。