How to write Unit Test in Python¶

Here I take an example in skypilot to show how to write unit tests in Python.

Python
from unittest.mock import patch

import pytest

from sky.serve import serve_utils

class TestValidateLogFile:

    @pytest.fixture
    def mock_exists(self):
        with patch('os.path.exists') as mock:
            yield mock

    def test_file_not_exists(self, mock_exists):
        # File does not exist, return False.
        mock_exists.return_value = False
        assert not serve_utils.validate_log_file('replica_1.log', 1)

    def test_target_id_none(self, mock_exists):
        # File exists, Target ID is None, always return True.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('replica_1.log', None)

    def test_valid_replica_id(self, mock_exists):
        # File exists, ReplicaID matches, return True.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('replica_1.log', 1)

    def test_mismatched_replica_id(self, mock_exists):
        # ReplicaID does not match, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('replica_1.log', 2)

    def test_no_replica_id_in_filename(self, mock_exists):
        # File name is illegal, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('service.log', 1)

    def test_invalid_replica_id_format(self, mock_exists):
        # FileReplicaID raises ValueError, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('replica_abc.log', 1)

    def test_empty_filename(self, mock_exists):
        # Remote filename is empty, return False.
        mock_exists.return_value = True
        assert not serve_utils.validate_log_file('', 1)

    def test_valid_replica_id_with_underscores(self, mock_exists):
        # file_replica_id == target_replica_id.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('test_replica_1_launch.log', 1)

    def test_complex_filename_with_valid_id(self, mock_exists):
        # file_replica_id == target_replica_id.
        mock_exists.return_value = True
        assert serve_utils.validate_log_file('service-name_replica_123.log', 123)

We need to attach some importance to the usage of unittest library and some basic syntax of pytest.

Use encapsulated class to test¶

Python
from sky.serve import serve_utils
class TestValidateLogFile:

We set a test class called TestValidateLogFile to test the function serve_utils.validate_log_file().

What and how to use mock¶

Python
@pytest.fixture
def mock_exists(self):
    with patch('os.path.exists') as mock:
        yield mock

Why do we need mock¶

The main reasons for using mock in this test class are:

Isolated test: avoid test dependence on the actual file system, making the test more reliable and repeatable1.
Control behavior: can precisely control the return value of os.path.exists(), simulating the existence or non-existence of the file

It is just a simulation, instead of actual running.

Syntax and components¶

Python
@pytest.fixture

A decorator, showcases this function is a fixture in pytest (a fixed component in the test environment).

Python
def mock_exists(self):

A function that returns a mock object.

Tip

Use self to indicate that this is a method of the TestValidateLogFile class.
This fixture can be referred by other test methods in the same class.
Others can just use mock_exists as a parameter to use this fixture.

Python
with patch('os.path.exists') as mock:

A context manager that temporarily replaces os.path.exists() with a mock object.

Patch

patch is a function provided by the unittest.mock library.
patch can temporarily replace a function or method with a mock object.
The mock object can be used to control the behavior of the function or method.

You can roughly regard patch as #Define a = b in C++.

os.path.exists(): the function to be replaced.
as mock: the mock object that will replace os.path.exists().
with: mock object is only used within the with block, it will automatically clear this function after this code block and recover.

Python
yield mock

yield: you can roughly regard yield as return.
The difference is yield can be implemented clear function after the code block which is not provided by return.

yield vs return

yield 和 return 的主要区别在于：

yield 用于创建生成器(generator)，可以逐个产生值
return 用于直接返回结果并结束函数执行

yield 用法示例：

Python
def number_list():
    numbers = []
    for i in range(3):
        numbers.append(i)
    return numbers  # 一次性返回所有结果

result = number_list()
print(result)  # 输出: [0, 1, 2]

return 用法示例：

Python
def number_generator():
    for i in range(3):
        yield i  # 每次产生一个值

for num in number_generator():
    print(num)  # 依次输出: 0, 1, 2

小结（by perplexity）：

执行方式:

yield 会暂停函数执行，保持状态，下次调用时从暂停处继续
return 会直接结束函数执行，返回所有结果

内存使用:

yield 适合处理大量数据，因为它不会一次性加载所有数据到内存
return 会将所有结果一次性加载到内存

调用次数:

yield 可以多次调用，每次产生一个值
return 只能调用一次，立即返回所有结果

Python
# 使用 return 的方式（不推荐）
def read_file_return():
    with open('large_file.txt') as f:
        return f.readlines()  # 一次性读取整个文件

# 使用 yield 的方式（推荐）
def read_file_yield():
    with open('large_file.txt') as f:
        for line in f:
            yield line  # 逐行读取文件

How to write test¶

When file isn't exist, we don't need to see replica anymore, we can just return False.

Python
def test_file_not_exists(self, mock_exists):
    mock_exists.return_value = False
    # This means os.path.exists() will return False
    # Which is: the file does not exist
    assert not serve_utils.validate_log_file('replica_1.log', 1)
    # In this situation, we don't need to see replica anymore, we can just return False
    # Hence serve_utils.validate_log_file('replica_1.log', 1) should return False now
    # Therefore, use `assert not`.

When file exists, but the replica id doesn't match, we should return False.

Python
def test_valid_replica_id(self, mock_exists):
    mock_exists.return_value = True
    # The file does exist!
    assert serve_utils.validate_log_file('replica_1.log', 1)
    # We have to check for replica now.
    # We find 1 matches 1, so replcia is fine and it should return True.
    # Therefore, use `assert`.

def test_mismatched_replica_id(self, mock_exists):
    mock_exists.return_value = True
    # The file does exist!
    assert not serve_utils.validate_log_file('replica_1.log', 2)
    # We have to check for replica now.
    # We find 1 doesn't match 2, so replcia is not fine and it should return False.
    # Therefore, use `assert not`.