
CSV Parsing: The Property

Explore how to apply property-based testing to CSV parsing by creating generators for CSV data and implementing encoding-decoding roundtrip properties. Understand how to handle CSV specification nuances and write tests ensuring robust parsers in Erlang.

CSV format

CSV is a loosely defined format, and hardly any two tools implement it the same way. This causes confusion even though RFC 4180 tries to provide a simple specification:

  • Each record is on a separate line, separated by CRLF (a \r followed by a \n).

  • The last record of the file may or may not have a CRLF after it. This is optional.

  • The first line of the file may be a header line, ending with a CRLF. In this case, the problem description includes a header, which will be assumed to always be there.

  • Commas go between fields of a record.

  • Any spaces are considered to be part of the field. The example in the problem description doesn’t respect that, since it adds a space after each comma even though the space is clearly not part of the data.

  • Double quotes (") can be used to wrap a given field. Fields that contain line breaks (CRLF), double quotes, or commas must be wrapped in double quotes.

  • All records in a document contain ...
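To make the rules about CRLF, commas, spaces, and quoting concrete, here is a minimal encoder sketch in Erlang. The module name csv_sketch and its functions are hypothetical and are not the parser this lesson builds. It assumes every field is a plain string, it always ends the last record with a CRLF (the specification allows either choice), and it doubles embedded double quotes, which is how RFC 4180 escapes them.

```erlang
-module(csv_sketch).
-export([encode/1, encode_field/1]).

%% Hypothetical illustration module; names are assumptions, not the
%% lesson's actual code.

%% Encode a list of records (each a list of string fields) as iodata.
%% Every record is terminated by CRLF, including the last one.
encode(Records) ->
    [[encode_record(Record), "\r\n"] || Record <- Records].

%% Commas go between fields. No padding is added, because any space
%% would count as part of the field.
encode_record(Fields) ->
    lists:join(",", [encode_field(Field) || Field <- Fields]).

%% Wrap a field in double quotes only when it contains a comma,
%% a double quote, a CR, or an LF.
encode_field(Field) ->
    case lists:any(fun needs_quoting/1, Field) of
        true -> [$", escape(Field), $"];
        false -> Field
    end.

needs_quoting(Char) ->
    lists:member(Char, [$,, $", $\r, $\n]).

%% RFC 4180 escapes an embedded double quote by doubling it.
escape(Field) ->
    lists:flatmap(fun($") -> [$", $"]; (C) -> [C] end, Field).
```

For example, iolist_to_binary(csv_sketch:encode([["name","note"],["a b","x,y"]])) should produce the two lines name,note and a b,"x,y", each followed by a CRLF. The space inside "a b" is kept as data, and only the field containing a comma is quoted.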