跳转至

How to write Unit Test in Python

Here I take an example in skypilot to show how to write unit tests in Python.

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
from unittest.mock import patch

import pytest

from sky.serve import serve_utils

class TestValidateLogFile:

    @pytest.fixture
    def mock_exists(self):
        with patch('os.path.exists') as mock:
            yield mock

    def test_file_not_exists(self, mock_exists):
        # File does not exist, return False.
        mock_exists.return_value = False
        assert not serve_utils.validate_log_file('replica_1.log', 1)

    def test_target_id_none(self, mock_exists):
        # File exists, Target ID is None, always return True.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('replica_1.log', None)

    def test_valid_replica_id(self, mock_exists):
        # File exists, ReplicaID matches, return True.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('replica_1.log', 1)

    def test_mismatched_replica_id(self, mock_exists):
        # ReplicaID does not match, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('replica_1.log', 2)

    def test_no_replica_id_in_filename(self, mock_exists):
        # File name is illegal, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('service.log', 1)

    def test_invalid_replica_id_format(self, mock_exists):
        # FileReplicaID raises ValueError, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('replica_abc.log', 1)

    def test_empty_filename(self, mock_exists):
        # Remote filename is empty, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('', 1)

    def test_valid_replica_id_with_underscores(self, mock_exists):
        # file_replica_id == target_replica_id.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('test_replica_1_launch.log', 1)

    def test_complex_filename_with_valid_id(self, mock_exists):
        # file_replica_id == target_replica_id.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('service-name_replica_123.log', 123)

We need to attach some importance to the usage of unittest library and some basic syntax of pytest.

Use encapsulated class to test

Python
1
2
from sky.serve import serve_utils
class TestValidateLogFile:

We set a test class called TestValidateLogFile to test the function serve_utils.validate_log_file().

What and how to use mock

Python
1
2
3
4
@pytest.fixture
def mock_exists(self):
    with patch('os.path.exists') as mock:
        yield mock

Why do we need mock

The main reasons for using mock in this test class are:

  • Isolated test: avoid test dependence on the actual file system, making the test more reliable and repeatable1.
  • Control behavior: can precisely control the return value of os.path.exists(), simulating the existence or non-existence of the file

It is just a simulation, instead of actual running.

Syntax and components

Python
1
@pytest.fixture

A decorator, showcases this function is a fixture in pytest (a fixed component in the test environment).

Python
1
def mock_exists(self):

A function that returns a mock object.

Tip
  1. Use self to indicate that this is a method of the TestValidateLogFile class.
  2. This fixture can be referred by other test methods in the same class.
  3. Others can just use mock_exists as a parameter to use this fixture.
Python
1
with patch('os.path.exists') as mock:

A context manager that temporarily replaces os.path.exists() with a mock object.

Patch
  1. patch is a function provided by the unittest.mock library.
  2. patch can temporarily replace a function or method with a mock object.
  3. The mock object can be used to control the behavior of the function or method.

You can roughly regard patch as #Define a = b in C++.

  1. os.path.exists(): the function to be replaced.
  2. as mock: the mock object that will replace os.path.exists().
  3. with: mock object is only used within the with block, it will automatically clear this function after this code block and recover.
Python
1
yield mock
  1. yield: you can roughly regard yield as return.
  2. The difference is yield can be implemented clear function after the code block which is not provided by return.
yield vs return

yieldreturn 的主要区别在于:

  • yield 用于创建生成器(generator),可以逐个产生值
  • return 用于直接返回结果并结束函数执行

yield 用法示例:

Python
1
2
3
4
5
6
7
8
def number_list():
    numbers = []
    for i in range(3):
        numbers.append(i)
    return numbers  # 一次性返回所有结果

result = number_list()
print(result)  # 输出: [0, 1, 2]

return 用法示例:

Python
1
2
3
4
5
6
def number_generator():
    for i in range(3):
        yield i  # 每次产生一个值

for num in number_generator():
    print(num)  # 依次输出: 0, 1, 2

小结(by perplexity)

执行方式:

  • yield 会暂停函数执行,保持状态,下次调用时从暂停处继续
  • return 会直接结束函数执行,返回所有结果

内存使用:

  • yield 适合处理大量数据,因为它不会一次性加载所有数据到内存
  • return 会将所有结果一次性加载到内存

调用次数:

  • yield 可以多次调用,每次产生一个值
  • return 只能调用一次,立即返回所有结果
Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
# 使用 return 的方式(不推荐)
def read_file_return():
    with open('large_file.txt') as f:
        return f.readlines()  # 一次性读取整个文件

# 使用 yield 的方式(推荐)
def read_file_yield():
    with open('large_file.txt') as f:
        for line in f:
            yield line  # 逐行读取文件

How to write test

When file isn't exist, we don't need to see replica anymore, we can just return False.

Python
1
2
3
4
5
6
7
8
def test_file_not_exists(self, mock_exists):
    mock_exists.return_value = False
    # This means os.path.exists() will return False
    # Which is: the file does not exist
    assert not serve_utils.validate_log_file('replica_1.log', 1)
    # In this situation, we don't need to see replica anymore, we can just return False
    # Hence serve_utils.validate_log_file('replica_1.log', 1) should return False now
    # Therefore, use `assert not`.

When file exists, but the replica id doesn't match, we should return False.

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
def test_valid_replica_id(self, mock_exists):
    mock_exists.return_value = True
    # The file does exist!
    assert serve_utils.validate_log_file('replica_1.log', 1)
    # We have to check for replica now.
    # We find 1 matches 1, so replcia is fine and it should return True.
    # Therefore, use `assert`.

def test_mismatched_replica_id(self, mock_exists):
    mock_exists.return_value = True
    # The file does exist!
    assert not serve_utils.validate_log_file('replica_1.log', 2)
    # We have to check for replica now.
    # We find 1 doesn't match 2, so replcia is not fine and it should return False.
    # Therefore, use `assert not`.