Collection of data files for automated tests

Automated testing for a formatter like Format Express is as straightforward as you'd expect: given a JSON or XML extract, check that the formatter returns the expected formatted output. The more tests there are, the more robust the application. In this article, I show how to build a large collection of tests by putting each test in its own data file.

The issue

A Ruby test for the JSON {"single-quote": "'", "double-quote":"\""} would look like this:


          test "Test JSON with single and double quotes in values" do
            input = '{"single-quote":"\'","double-quote":"\""}'                         # ' must be escaped
            expected = "{\n  \"single-quote\": \"'\",\n  \"double-quote\": \"\\\"\"\n}" # " and \ are escaped
            assert_equal expected, FormatExpressFormatter.new.format(input)
          end

I'm sure it bothers you too! The input and output strings are hard to read and understand.
Let's see how it can be better...

First try: improve Ruby strings

The 2 main issues to tackle are:

special characters like ', " or \ must be escaped;
line breaks must be written as \n inside a single-line string, where a multi-line string would be easier to read.

Ruby offers several notations for string literals, let's take advantage of them.

One improvement is to use the %q(...) notation for Ruby strings. With this notation, I can choose another delimiter instead of quotes, so I don't have to escape the quotes. Here is an example using |:


          input = %q|{"single-quote":"'","double-quote":"\""}|    # Neither ' nor " have to be escaped
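To check that the %q literal really denotes the same string as the escaped version, the two can be compared directly. This is a small sketch; the variable names are only for illustration.

```ruby
require "json"

# Single-quoted literal: the inner ' has to be escaped
single_quoted = '{"single-quote":"\'","double-quote":"\""}'

# %q literal with | as delimiter: no quote needs escaping
percent_q = %q|{"single-quote":"'","double-quote":"\""}|

# Both literals denote the same string, and it is valid JSON
puts single_quoted == percent_q
puts JSON.parse(percent_q).keys.inspect
```

The remaining backslash belongs to the JSON itself (an escaped double quote inside a JSON string), not to Ruby. The chosen delimiter must not appear unescaped in the content, so | works here only because the JSON contains none.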

For multi-line strings, I use a heredoc << with the following attributes:

single quotes to disable interpolation, so I don't have to escape ', " or \
the "squiggly" heredoc <<~ that keeps a nice indentation
the .chomp to remove the final \n


          expected = <<~'EOS'.chomp
            {
              "single-quote": "'",
              "double-quote": "\""
            }
          EOS
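The same kind of check works for the heredoc. The sketch below compares it with the escaped one-line string from the first test, and shows what .chomp is for.

```ruby
# Escaped one-line version, as in the first test
escaped = "{\n  \"single-quote\": \"'\",\n  \"double-quote\": \"\\\"\"\n}"

# Squiggly heredoc with a single-quoted identifier: no interpolation, no escaping
heredoc = <<~'EOS'.chomp
  {
    "single-quote": "'",
    "double-quote": "\""
  }
EOS

# Same heredoc without .chomp, to show the trailing newline it would keep
unchomped = <<~'EOS'
  {}
EOS

puts escaped == heredoc
puts unchomped.end_with?("\n")
```

The <<~ form strips the indentation of the least indented line, so the heredoc body can follow the indentation of the surrounding test code.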

The test now looks like this:


          test "Test JSON with single and double quotes in values" do
            input = %q|{"single-quote":"'","double-quote":"\""}|
            expected = <<~'EOS'.chomp
              {
                "single-quote": "'",
                "double-quote": "\""
              }
            EOS
            assert_equal expected, FormatExpressFormatter.new.format(input)
          end

That's undeniably better, yet it's not as readable as I'd like, and it will get even harder to read with larger JSON/XML extracts.
That's why I went in a different direction: put each test in its own test data file.

Better solution: using text files as test data

For each test, I create a file with the following 3 sections:

a description of the test
the input to give to the formatter
the expected formatted output


          # Test JSON with single and double quotes in values
          ====INPUT====
          {"single-quote":"'","double-quote":"\""}
          ====EXPECTED====
          {
            "single-quote": "'",
            "double-quote": "\""
          }

The file is easy to understand by itself, and there is no special character to escape. I can create as many files as I want, group them into directories, ...
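Grouping files into directories means the lookup has to descend into subdirectories. Here is a minimal sketch of one way to do that with Dir.glob and a ** pattern. The json/ and xml/ folders and the file names are hypothetical, created in a temporary directory just for the demonstration.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical layout, built in a temporary directory for the demo
root = Dir.mktmpdir
FileUtils.mkdir_p(File.join(root, "json"))
FileUtils.mkdir_p(File.join(root, "xml"))
File.write(File.join(root, "json", "quotes.testdata.txt"), "")
File.write(File.join(root, "xml", "attributes.testdata.txt"), "")
File.write(File.join(root, "top_level.testdata.txt"), "")

# "**" matches zero or more directory levels, so files at the root are found too
test_files = Dir.glob(File.join(root, "**", "*.testdata.txt")).sort

# Print paths relative to the root to keep the output short
relative = test_files.map { |path| path.delete_prefix("#{root}/") }
puts relative
```

Sorting the result keeps the order of the files stable from one run to the next.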

The next step is an automated test that finds all the test data files, reads the input and expected output from each, passes the input to the formatter and checks that the returned string matches the expected output.

The naive implementation would be a single test that loops over the test data files. That's not good because it would stop at the first failure, and I would not know how many tests are broken (did my last change break only one file, or 80% of them? Those are 2 totally different situations).
I want every file checked on each test run, so instead I dynamically create a test for each file, using instance_eval:


          class TestDataSuiteTest < ActionDispatch::IntegrationTest
           
            # Test data files delimiters
            INPUT_DELIMITER = "====INPUT====\n"
            EXPECTED_DELIMITER = "\n====EXPECTED====\n"
           
            setup do
              @formatter = FormatExpressFormatter.new
            end
           
            # Generate a test for each file in the test_data directory
            Dir['test/test_data/*.testdata.txt'].each do |filename|
              instance_eval do
                test "Should format #{filename}" do
                  execute_test_data(filename)
                end
              end
            end
           
            private
           
            def execute_test_data(filename)
              # Extract input and expected result from the test data file
              text = File.read(filename)
              comment, _, content = text.partition(INPUT_DELIMITER)
              input, _, expected = content.partition(EXPECTED_DELIMITER)
           
              # Test
              assert_equal expected, @formatter.format(input), 
                           "Not the expected result for #{filename} #{comment}; The input was #{input}"
            end
          end
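For readers who want to try the idea outside a Rails application, here is a rough sketch with plain Minitest. Several things in it are assumptions made for illustration and not part of Format Express: the StandInFormatter class (which simply delegates to JSON.pretty_generate), the temporary directory, and the use of define_method to create one test method per file.

```ruby
require "minitest/autorun"
require "json"
require "tmpdir"

INPUT_DELIMITER = "====INPUT====\n"
EXPECTED_DELIMITER = "\n====EXPECTED====\n"

# Hypothetical stand-in for the real formatter, only here to make the sketch runnable
class StandInFormatter
  def format(input)
    JSON.pretty_generate(JSON.parse(input))
  end
end

# Write one sample test data file (without a trailing newline) in a temporary directory
DATA_DIR = Dir.mktmpdir
File.write(File.join(DATA_DIR, "quotes.testdata.txt"), <<~'FILE'.chomp)
  # Test JSON with single and double quotes in values
  ====INPUT====
  {"single-quote":"'","double-quote":"\""}
  ====EXPECTED====
  {
    "single-quote": "'",
    "double-quote": "\""
  }
FILE

class TestDataSuiteTest < Minitest::Test
  # Minitest treats every method whose name starts with "test_" as a test,
  # so one method is defined per data file
  Dir[File.join(DATA_DIR, "*.testdata.txt")].sort.each do |filename|
    define_method("test_format_#{File.basename(filename, '.testdata.txt')}") do
      text = File.read(filename)
      comment, _, content = text.partition(INPUT_DELIMITER)
      input, _, expected = content.partition(EXPECTED_DELIMITER)

      assert_equal expected, StandInFormatter.new.format(input),
                   "Not the expected result for #{filename} #{comment}"
    end
  end
end
```

One detail to watch with this parsing: everything after the EXPECTED delimiter is compared as-is. If an editor appends a final newline to the data file, the expected string ends with "\n" and the comparison fails unless the formatter output ends with one too.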

With this, there is no excuse not to apply TDD: when a new feature is implemented, or someone reports unexpected formatting, I create a new test data file (or several) covering the new case, and work until all tests are green.