Reference for unit 'LazUTF8' (#lazutils)

UTF8CharacterLength (deprecated)

Returns the number of bytes of the codepoint starting at p

Declaration

Source position: lazutf8.pas line 77

function UTF8CharacterLength(
  p: PChar
): Integer;

Description

Returns 0 if p is nil. Returns 1 if p points to a single-byte UTF-8 codepoint or to an invalid UTF-8 sequence. Otherwise it returns a value in the range 2..4. It does not check for malicious encodings such as the overlong sequence #$c0#$80, nor for undefined codepoints such as #$f3#$a0#$87#$b9. Use UTF8CharacterLength to step through a string with a simple loop:
while p^ <> #0 do begin
  // advance by the byte length of the codepoint at p
  Inc(p, UTF8CharacterLength(p));
end;
Even if the string at p contains invalid UTF-8, the loop stops at the terminating #0 and does not run past the end of the string.
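
The loop above can be wrapped in a small program that counts the codepoints in a string. This is only a sketch: CountUTF8Codepoints is a hypothetical helper, not part of LazUTF8, and the program assumes the LazUtils package is on the unit search path. Because UTF8CharacterLength is marked deprecated, the compiler may emit a deprecation warning.

```pascal
program CountCodepoints;

{$mode objfpc}{$H+}

uses
  LazUTF8; // assumption: the LazUtils package is available to the compiler

// Hypothetical helper (not part of LazUTF8): counts the codepoints in a
// null-terminated UTF-8 string by stepping with UTF8CharacterLength.
function CountUTF8Codepoints(p: PChar): Integer;
begin
  Result := 0;
  if p = nil then
    Exit;
  while p^ <> #0 do
  begin
    Inc(Result);
    // Advances by 1..4 bytes; an invalid sequence advances by 1 byte.
    Inc(p, UTF8CharacterLength(p));
  end;
end;

var
  s: String;
begin
  // Six bytes: 'a' (1 byte), U+00E4 (2 bytes), U+20AC (3 bytes).
  s := 'a' + #$C3#$A4 + #$E2#$82#$AC;
  WriteLn(CountUTF8Codepoints(PChar(s)));
end.
```

Each invalid byte is stepped over as a 1-byte unit, so on malformed input the result is the number of loop steps rather than a count of valid codepoints.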

The latest version of this document can be found at lazarus-ccr.sourceforge.net.