Blender V5.0
blender::csv_parse Namespace Reference

Namespaces

namespace  detail
namespace  tests

Classes

class  CsvRecord
class  CsvRecords
struct  CsvParseOptions

Functions

std::optional< Vector< Any<> > > parse_csv_in_chunks (const Span< char > buffer, const CsvParseOptions &options, FunctionRef< void(const CsvRecord &record)> process_header, FunctionRef< Any<>(const CsvRecords &records)> process_records)
template<typename ChunkT>
std::optional< Vector< ChunkT > > parse_csv_in_chunks (const Span< char > buffer, const CsvParseOptions &options, FunctionRef< void(const CsvRecord &record)> process_header, FunctionRef< ChunkT(const CsvRecords &records)> process_records)
StringRef unescape_field (const StringRef str, const CsvParseOptions &options, LinearAllocator<> &allocator)
static int64_t guess_next_record_start (const Span< char > buffer, const int64_t start)
static Vector< Span< char > > split_into_aligned_chunks (const Span< char > buffer, int64_t approximate_chunk_size)
static std::optional< CsvRecords > parse_records (const Span< char > buffer, const CsvParseOptions &options, Vector< int64_t > &r_data_offsets, Vector< Span< char > > &r_data_fields)

Function Documentation

◆ guess_next_record_start()

int64_t blender::csv_parse::guess_next_record_start ( const Span< char > buffer,
const int64_t start )
static

Returns a guess for the start of the next record. Note that this could split up quoted fields. This case needs to be detected at a higher level.

Definition at line 21 of file csv_parse.cc.

References i, and blender::Span< T >::size().

Referenced by split_into_aligned_chunks().
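
The behaviour described above can be illustrated with a minimal standalone sketch. It is not the Blender implementation: it uses std::string_view in place of blender::Span< char >, the name guess_next_record_start_sketch is hypothetical, and it assumes that a record boundary is guessed by scanning for the next newline character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for guess_next_record_start(), written against
// std::string_view instead of blender::Span<char>.
// Assumption: the guess is "the position right after the next newline".
inline int64_t guess_next_record_start_sketch(std::string_view buffer, int64_t start)
{
  assert(start >= 0);
  const int64_t size = static_cast<int64_t>(buffer.size());
  for (int64_t i = start; i < size; i++) {
    if (buffer[static_cast<std::size_t>(i)] == '\n') {
      /* The next record is assumed to begin right after the newline. */
      return i + 1;
    }
  }
  /* No newline found: the rest of the buffer belongs to the current record. */
  return size;
}
```

Because the scan ignores quote characters, a buffer such as `a,"x\ny"\n` scanned from index 0 yields 5, which lies inside the quoted field. This is the kind of incorrect split that the caller has to detect.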

◆ parse_csv_in_chunks() [1/2]

std::optional< Vector< Any<> > > blender::csv_parse::parse_csv_in_chunks ( const Span< char > buffer,
const CsvParseOptions & options,
FunctionRef< void(const CsvRecord &record)> process_header,
FunctionRef< Any<>(const CsvRecords &records)> process_records )

Parses a .csv file. There are two important aspects to the way this interface is designed:

  1. It allows the file to be split into chunks that can be parsed in parallel.
  2. Splitting the file into individual records and fields is separated from parsing the actual content into e.g. floats. This simplifies the implementation of both parts because the logical parsing does not have to worry about e.g. the delimiter or quote characters. It also simplifies unit testing.

Parameters
  buffer  The buffer containing the .csv file.
  options  Options that control how the file is parsed.
  process_header  A function that is called at most once and receives the fields of the first row/record.
  process_records  A function that is called potentially many times in parallel and that processes a chunk of parsed records. Typically this function parses raw byte fields into e.g. ints or floats. The result of the parsing process has to be returned. Note that under specific circumstances, this function may be called twice for the same records. That can happen when the .csv file contains multi-line fields which were split incorrectly at first.

Returns
  A vector containing the return values of the process_records function in the correct order. std::nullopt is returned if the file was malformed, e.g. if it has a quoted field that is not closed.

Definition at line 91 of file csv_parse.cc.

References blender::Span< T >::drop_front(), blender::Vector< T, InlineBufferCapacity, Allocator >::index_range(), options, blender::threading::parallel_for(), blender::Vector< T, InlineBufferCapacity, Allocator >::size(), and split_into_aligned_chunks().

Referenced by blender::io::csv::import_csv_as_pointcloud(), blender::csv_parse::tests::parse_csv_fields(), and parse_csv_in_chunks().
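
The second design point, separating the structural split from content parsing, can be sketched in standard C++. This is a simplified, single-threaded illustration under stated assumptions, not the Blender API: Record, split_records_sketch, parse_int_field and parse_csv_sketch are hypothetical names, quoted fields are not handled, and the whole buffer is treated as one chunk.

```cpp
#include <charconv>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical record type: one std::string_view per field.
using Record = std::vector<std::string_view>;

// Structural step: only knows about newlines and the delimiter.
// Assumption: no quoted fields, so every newline ends a record.
inline std::vector<Record> split_records_sketch(std::string_view buffer, char delimiter = ',')
{
  std::vector<Record> records;
  while (!buffer.empty()) {
    const std::size_t line_end = buffer.find('\n');
    std::string_view line = buffer.substr(0, line_end);
    buffer.remove_prefix(line_end == std::string_view::npos ? buffer.size() : line_end + 1);
    Record fields;
    while (true) {
      const std::size_t field_end = line.find(delimiter);
      fields.push_back(line.substr(0, field_end));
      if (field_end == std::string_view::npos) {
        break;
      }
      line.remove_prefix(field_end + 1);
    }
    records.push_back(std::move(fields));
  }
  return records;
}

// Content step: knows nothing about delimiters or quotes.
inline std::optional<int> parse_int_field(std::string_view field)
{
  int value = 0;
  const char *end = field.data() + field.size();
  const auto [ptr, ec] = std::from_chars(field.data(), end, value);
  if (ec != std::errc() || ptr != end) {
    return std::nullopt;
  }
  return value;
}

// Driver: the first record goes to process_header, the rest to process_records.
template<typename ChunkT>
ChunkT parse_csv_sketch(std::string_view buffer,
                        const std::function<void(const Record &)> &process_header,
                        const std::function<ChunkT(const std::vector<Record> &)> &process_records)
{
  std::vector<Record> records = split_records_sketch(buffer);
  if (!records.empty()) {
    process_header(records.front());
    records.erase(records.begin());
  }
  return process_records(records);
}
```

The callbacks only ever see already-split fields, so a content parser like parse_int_field can be tested without constructing a full .csv buffer.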

◆ parse_csv_in_chunks() [2/2]

template<typename ChunkT>
std::optional< Vector< ChunkT > > blender::csv_parse::parse_csv_in_chunks ( const Span< char > buffer,
const CsvParseOptions & options,
FunctionRef< void(const CsvRecord &record)> process_header,
FunctionRef< ChunkT(const CsvRecords &records)> process_records )
inline

Same as above, but uses a templated chunk type instead of Any, which can be more convenient for callers.

Definition at line 106 of file BLI_csv_parse.hh.

References blender::Vector< T, InlineBufferCapacity, Allocator >::append(), options, parse_csv_in_chunks(), blender::Vector< T, InlineBufferCapacity, Allocator >::reserve(), and result.
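
The relationship between the two overloads can be sketched as a typed wrapper around a type-erased driver. The sketch below is an assumption about the general pattern, not the code in BLI_csv_parse.hh: it uses std::any and std::function in place of Any and FunctionRef, and run_chunks_erased and run_chunks_typed are hypothetical names.

```cpp
#include <any>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical type-erased driver: calls process_chunk once per chunk index
// and collects the results. A negative count stands in for "malformed input".
inline std::optional<std::vector<std::any>> run_chunks_erased(
    int chunk_count, const std::function<std::any(int)> &process_chunk)
{
  if (chunk_count < 0) {
    return std::nullopt;
  }
  std::vector<std::any> results;
  results.reserve(static_cast<std::size_t>(chunk_count));
  for (int i = 0; i < chunk_count; i++) {
    results.push_back(process_chunk(i));
  }
  return results;
}

// Typed convenience wrapper: wraps each result in std::any on the way in and
// unwraps it on the way out. ChunkT must be copy-constructible for std::any.
template<typename ChunkT>
std::optional<std::vector<ChunkT>> run_chunks_typed(
    int chunk_count, const std::function<ChunkT(int)> &process_chunk)
{
  std::optional<std::vector<std::any>> erased = run_chunks_erased(
      chunk_count, [&](int i) { return std::any(process_chunk(i)); });
  if (!erased.has_value()) {
    return std::nullopt;
  }
  std::vector<ChunkT> result;
  result.reserve(erased->size());
  for (std::any &value : *erased) {
    result.push_back(std::any_cast<ChunkT>(std::move(value)));
  }
  return result;
}
```

With this split, the heavy driver only has to be compiled once for the type-erased result, while callers still get a vector of their own chunk type.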

◆ parse_records()

std::optional< CsvRecords > blender::csv_parse::parse_records ( const Span< char > buffer,
const CsvParseOptions & options,
Vector< int64_t > & r_data_offsets,
Vector< Span< char > > & r_data_fields )
static

Parses the given buffer into records and their fields.

r_data_offsets and r_data_fields are passed in so that their memory can be reused.

Definition at line 59 of file csv_parse.cc.

References blender::Vector< T, InlineBufferCapacity, Allocator >::append(), blender::Vector< T, InlineBufferCapacity, Allocator >::clear(), blender::Vector< T, InlineBufferCapacity, Allocator >::last(), options, and blender::Span< T >::size().

◆ split_into_aligned_chunks()

Vector< Span< char > > blender::csv_parse::split_into_aligned_chunks ( const Span< char > buffer,
int64_t approximate_chunk_size )
static

Split the buffer into chunks of approximately the given size. The function attempts to align the chunks so that records are not split. This works in the majority of cases, but can fail with multi-line fields. This has to be detected at a higher level.

Definition at line 39 of file csv_parse.cc.

References blender::Vector< T, InlineBufferCapacity, Allocator >::append(), blender::IndexRange::from_begin_end(), guess_next_record_start(), blender::Span< T >::size(), and blender::Span< T >::slice().

Referenced by parse_csv_in_chunks().
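
A minimal standalone sketch of this kind of aligned splitting is shown below. It is an illustration under assumptions rather than the Blender code: it uses std::string_view and std::vector instead of Span and Vector, the name split_into_aligned_chunks_sketch is hypothetical, and alignment is assumed to mean "extend each chunk to just after the next newline".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for split_into_aligned_chunks().
// Each chunk is at least approximate_chunk_size bytes (except possibly the
// last one) and is extended so that it ends right after a newline.
inline std::vector<std::string_view> split_into_aligned_chunks_sketch(
    std::string_view buffer, std::size_t approximate_chunk_size)
{
  assert(approximate_chunk_size > 0);
  std::vector<std::string_view> chunks;
  std::size_t start = 0;
  while (start < buffer.size()) {
    /* Tentative end of the chunk, ignoring record boundaries. */
    std::size_t end = std::min(start + approximate_chunk_size, buffer.size());
    /* Move the end forward to the guessed start of the next record. */
    const std::size_t newline = buffer.find('\n', end);
    end = (newline == std::string_view::npos) ? buffer.size() : newline + 1;
    chunks.push_back(buffer.substr(start, end - start));
    start = end;
  }
  return chunks;
}
```

Since the newline search does not track quotes, a newline inside a quoted multi-line field is treated as a record boundary here, which is the failure case the description mentions.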

◆ unescape_field()

StringRef blender::csv_parse::unescape_field ( const StringRef str,
const CsvParseOptions & options,
LinearAllocator<> & allocator )

Fields in a CSV file may contain escaped quote characters (e.g. "" or \"). This function replaces these with just the quote character. The returned string may reference the input string if nothing had to be unescaped. Otherwise the returned string is allocated in the given allocator.

Definition at line 169 of file csv_parse.cc.

References blender::LinearAllocator< Allocator >::allocate_array(), i, blender::StringRefBase::not_found, options, str, blender::MutableSpan< T >::take_front(), and unescape_field().

Referenced by blender::io::csv::import_csv_as_pointcloud(), blender::csv_parse::tests::TEST(), and unescape_field().
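
The unescaping rule can be illustrated with a short standalone sketch. It differs from the documented function in several labeled ways: it returns an owning std::string instead of a StringRef backed by a LinearAllocator, the name unescape_field_sketch is hypothetical, and the quote and escape characters are passed as plain parameters with assumed defaults rather than read from CsvParseOptions.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for unescape_field().
// Assumption: any character in escape_chars that is directly followed by the
// quote character forms an escape sequence and is replaced by a single quote.
inline std::string unescape_field_sketch(std::string_view str,
                                         char quote = '"',
                                         std::string_view escape_chars = "\"\\")
{
  std::string result;
  result.reserve(str.size());
  std::size_t i = 0;
  while (i < str.size()) {
    const char c = str[i];
    const bool is_escape = escape_chars.find(c) != std::string_view::npos;
    if (is_escape && i + 1 < str.size() && str[i + 1] == quote) {
      /* Drop the escape character and keep only the quote. */
      result.push_back(quote);
      i += 2;
    }
    else {
      result.push_back(c);
      i += 1;
    }
  }
  return result;
}
```

Always copying keeps the sketch short. The documented function avoids that copy when the field contains nothing to unescape by returning a reference to the input.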