3.4 Byte Strings
See Bytes and Byte Strings in The Racket Guide for an introduction to byte strings.
A byte string is a fixed-length array of bytes. A byte is an exact
integer between 0 and 255 inclusive.
A byte string can be
mutable or immutable. When an immutable byte
string is provided to a procedure like bytes-set!, the
exn:fail:contract exception is raised. Byte-string constants generated by the
default reader (see Reading Strings) are immutable.
Two byte strings are equal? when they have the same length
and contain the same sequence of bytes.
A byte string can be used as a single-valued sequence (see
Sequences). The bytes of the string serve as elements
of the sequence. See also in-bytes.
See also: immutable?.
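As a minimal sketch of the points above, the following module contrasts a reader-generated constant with a freshly allocated byte string. The names `lit`, `fresh`, and the other definitions are hypothetical and exist only for illustration.

```racket
#lang racket/base

;; A byte-string constant produced by the default reader is immutable.
(define lit #"Apple")
;; `bytes` allocates a new mutable byte string with the same content.
(define fresh (bytes 65 112 112 108 101))

(define lit-immutable? (immutable? lit))
(define fresh-immutable? (immutable? fresh))

;; equal? compares length and content, not identity or mutability.
(define same-content? (equal? lit fresh))

;; A byte string is a sequence of its bytes; in-bytes makes that explicit.
(define byte-sum (for/sum ([b (in-bytes lit)]) b))

;; Mutating an immutable byte string raises exn:fail:contract.
(define set-on-literal-raised?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (bytes-set! lit 0 66)
    #f))
```

Here `lit-immutable?` and `same-content?` should be `#t`, `fresh-immutable?` should be `#f`, and `byte-sum` adds the five byte values.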
3.4.1 Byte String Constructors, Selectors, and Mutators
 Returns #t if v
 is a byte string, #f otherwise.
 Returns a new mutable byte string of length k where each
position in the byte string is initialized with the byte b.
 Returns a new mutable byte
string whose length is the number of provided bs, and whose
positions are initialized with the given bs.
Example:
> (bytes 65 112 112 108 101)
#"Apple"
Returns an immutable byte string with the same content
 as bstr, returning bstr itself if bstr is
 immutable.
 Returns #t if v is
 a byte (i.e., an exact integer between 0 and 255
 inclusive), #f otherwise.
Returns the length of bstr.
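The signature lines for the entries above are not shown in this section, so the sketch below assumes they describe `bytes?`, `make-bytes`, `bytes->immutable-bytes`, `byte?`, and `bytes-length` from `racket/base`. The defined names are hypothetical.

```racket
#lang racket/base

;; Four bytes, each initialized to 0.
(define zeros (make-bytes 4 0))

;; An immutable copy of the same content.
(define frozen (bytes->immutable-bytes zeros))

;; An already-immutable byte string is returned as-is.
(define same-object? (eq? frozen (bytes->immutable-bytes frozen)))

(define len (bytes-length zeros))

;; Only exact integers from 0 to 255 are bytes.
(define byte-checks
  (list (byte? 0) (byte? 255) (byte? 256) (byte? -1) (byte? 1.0)))

(define bytes-checks (list (bytes? zeros) (bytes? "not bytes")))
```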
Returns the byte at position k in bstr. The first position in the
byte string corresponds to 0, so the position k must be less than the
length of the byte string; otherwise, the exn:fail:contract exception
is raised.
Changes the byte at position k in bstr to b. The first position in
the byte string corresponds to 0, so the position k must be less than
the length of the byte string; otherwise, the exn:fail:contract
exception is raised.
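A short sketch of indexed access and mutation, assuming the two entries above document `bytes-ref` and `bytes-set!` (their signature lines are not shown here). The names `buf`, `middle`, and `out-of-range?` are hypothetical.

```racket
#lang racket/base

;; Start with three copies of the byte 65 (ASCII "A").
(define buf (make-bytes 3 65))

;; Overwrite position 1 with 66 (ASCII "B"); buf now holds #"ABA".
(bytes-set! buf 1 66)

(define middle (bytes-ref buf 1))

;; Position 3 equals the length, so it is out of range.
(define out-of-range?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (bytes-ref buf 3)
    #f))
```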
Returns a new mutable byte string that is (- end start) bytes long,
and that contains the same bytes as bstr from start inclusive to end
exclusive. The start and end arguments must be less than or equal to
the length of bstr, and end must be greater than or equal to start,
otherwise the exn:fail:contract exception is raised.
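The range rules can be sketched as follows, assuming this entry documents `subbytes`. The names `src`, `hello`, `world`, and `bad-range?` are hypothetical.

```racket
#lang racket/base

(define src #"Hello, world")

;; Positions 0 (inclusive) to 5 (exclusive).
(define hello (subbytes src 0 5))

;; Positions 7 to 12; 12 equals the length, which is allowed for end.
(define world (subbytes src 7 12))

;; end smaller than start violates the contract.
(define bad-range?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (subbytes src 3 2)
    #f))
```

The result is a fresh mutable byte string even though `src` is an immutable constant.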
Changes the bytes of dest starting at position dest-start to match
the bytes in src from src-start (inclusive) to src-end (exclusive).
The byte strings dest and src can be the same byte string, and in
that case the destination region can overlap with the source region;
the destination bytes after the copy match the source bytes from
before the copy. If any of dest-start, src-start, or src-end are out
of range (taking into account the sizes of the byte strings and the
source and destination regions), the exn:fail:contract exception is
raised.
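The overlapping case is easiest to see with a small buffer. This sketch assumes the entry above documents `bytes-copy!`; `buf` and `too-long?` are hypothetical names.

```racket
#lang racket/base

(define buf (bytes 1 2 3 4 5))

;; Copy positions 0..3 of buf onto positions 1..4 of the same buffer.
;; The destination ends up matching the source as it was before the copy.
(bytes-copy! buf 1 buf 0 4)

;; Four source bytes do not fit when the destination starts at position 3.
(define too-long?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (bytes-copy! buf 3 buf 0 4)
    #f))
```

After the first call, `buf` holds the bytes 1 1 2 3 4 rather than a smeared run of 1s.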
 Changes dest so that every position in the
 bytes is filled with b.
Returns a new mutable byte string that is as long as the sum of the
given bstrs’ lengths, and that contains the concatenated bytes of the
given bstrs. If no bstrs are provided, the result is a zero-length
byte string.
Returns a new list of bytes corresponding to the content of bstr.
That is, the length of the list is (bytes-length bstr), and the
sequence of bytes in bstr is the same sequence in the result list.
Returns a new mutable byte string whose content is the list of bytes
in lst. That is, the length of the byte string is (length lst), and
the sequence of bytes in lst is the same sequence in the result byte
string.
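The following sketch combines the filling, concatenation, and list conversions described above, assuming the entries document `bytes-fill!`, `bytes-append`, `bytes->list`, and `list->bytes`. All defined names are hypothetical.

```racket
#lang racket/base

;; Concatenate three byte strings into a fresh mutable one.
(define joined (bytes-append #"ab" #"cd" #"e"))

;; With no arguments the result is a zero-length byte string.
(define nothing (bytes-append))

;; Convert to a list of bytes and back again.
(define as-list (bytes->list joined))
(define round-trip (list->bytes as-list))

;; Overwrite every position of a scratch buffer with 42.
(define scratch (make-bytes 3 0))
(bytes-fill! scratch 42)
```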
Returns a new mutable byte string of length k where each position in
the byte string is initialized with the byte b. When places are
enabled, the new byte string is allocated in the shared memory space.
Returns a new mutable byte string whose length is the number of
provided bs, and whose positions are initialized with the given bs.
When places are enabled, the new byte string is allocated in the
shared memory space.
3.4.2 Byte String Comparisons
Returns #t if all of the arguments have the same length and contain
the same sequence of bytes, #f otherwise.
Returns #t if the arguments are lexicographically sorted increasing,
where individual bytes are ordered by <, #f otherwise.
Like bytes<?, but checks whether the arguments are decreasing.
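A brief sketch of the comparison procedures, assuming the three entries above document `bytes=?`, `bytes<?`, and `bytes>?`. The defined names are hypothetical.

```racket
#lang racket/base

;; Content comparison: a fresh byte string equals a constant with the same bytes.
(define same? (bytes=? (bytes 65) #"A"))

;; Ordering is by byte value, so uppercase ASCII sorts before lowercase.
(define upper-first? (bytes<? #"Z" #"a"))

;; A proper prefix sorts before the longer byte string.
(define prefix-first? (bytes<? #"app" #"apple"))

;; More than two arguments are checked as a chain.
(define increasing? (bytes<? #"apple" #"banana" #"cherry"))
(define decreasing? (bytes>? #"cherry" #"banana" #"apple"))
```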
3.4.3 Bytes to/from Characters, Decoding and Encoding
Produces a string by decoding the start to end substring of bstr as
a UTF-8 encoding of Unicode code points. If err-char is not #f, then
it is used for bytes that fall in the range 128 to 255 but are not
part of a valid encoding sequence. (This rule is consistent with
reading characters from a port; see Encodings and Locales for more
details.) If err-char is #f, and if the start to end substring of
bstr is not a valid UTF-8 encoding overall, then the
exn:fail:contract exception is raised.
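The effect of err-char can be sketched as follows, assuming this entry documents `bytes->string/utf-8` with optional arguments in the order err-char, start, end. The defined names are hypothetical.

```racket
#lang racket/base

;; 206 187 is the UTF-8 encoding of U+03BB (Greek small letter lambda).
(define lam (bytes->string/utf-8 (bytes 206 187)))

;; 255 never appears in valid UTF-8; with err-char it is replaced.
(define patched (bytes->string/utf-8 (bytes 65 255 66) #\?))

;; Without err-char the same input raises exn:fail:contract.
(define strict-raised?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (bytes->string/utf-8 (bytes 65 255 66))
    #f))

;; Decode only positions 1 (inclusive) to 3 (exclusive).
(define middle (bytes->string/utf-8 #"hello" #f 1 3))
```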
Produces a string by decoding the start to end substring of bstr
using the current locale’s encoding (see also Encodings and Locales).
If err-char is not #f, it is used for each byte in bstr that is not
part of a valid encoding; if err-char is #f, and if the start to end
substring of bstr is not a valid encoding overall, then the
exn:fail:contract exception is raised.
Produces a string by decoding the start to end substring of bstr as
a Latin-1 encoding of Unicode code points; i.e., each byte is
translated directly to a character using integer->char, so the
decoding always succeeds. The err-char argument is ignored, but
present for consistency with the other operations.
Produces a byte string by encoding the start to end
 substring of str via UTF-8 (always succeeding). The
 err-byte argument is ignored, but included for consistency with
 the other operations.
Produces a byte string by encoding the start to end substring of str
using the current locale’s encoding (see also Encodings and Locales).
If err-byte is not #f, it is used for each character in str that
cannot be encoded for the current locale; if err-byte is #f, and if
the start to end substring of str cannot be encoded, then the
exn:fail:contract exception is raised.
Produces a byte string by encoding the start to end substring of str
using Latin-1; i.e., each character is translated directly to a byte
using char->integer. If err-byte is not #f, it is used for each
character in str whose value is greater than 255. If err-byte is #f,
and if the start to end substring of str has a character with a value
greater than 255, then the exn:fail:contract exception is raised.
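A sketch contrasting the UTF-8 and Latin-1 encoders, assuming the entries above document `string->bytes/utf-8`, `string->bytes/latin-1`, and `bytes->string/latin-1`. The locale-based procedures are left out because their results depend on the environment. The defined names are hypothetical.

```racket
#lang racket/base

;; U+00E9 fits in Latin-1 as the single byte 233.
(define cafe (string->bytes/latin-1 "caf\u00E9"))

;; The same character needs two bytes in UTF-8.
(define cafe-utf8 (string->bytes/utf-8 "caf\u00E9"))

;; U+03BB is above 255; err-byte 63 (ASCII "?") stands in for it.
(define patched (string->bytes/latin-1 "\u03BB" 63))

;; Without err-byte, the unencodable character raises exn:fail:contract.
(define strict-raised?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (string->bytes/latin-1 "\u03BB")
    #f))

;; Latin-1 decoding maps every byte directly to a character.
(define decoded (bytes->string/latin-1 (bytes 233)))
```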
Returns the length in bytes of the UTF-8 encoding of str’s
 substring from start to end, but without actually
 generating the encoded bytes.
Returns the length in characters of the UTF-8 decoding of bstr’s
substring from start to end, but without actually generating the
decoded characters. If err-char is #f and the substring is not a
UTF-8 encoding overall, the result is #f. Otherwise, err-char is used
to resolve decoding errors as in bytes->string/utf-8.
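The two length procedures can be compared on a small input. This sketch assumes the entries above document `string-utf-8-length` and `bytes-utf-8-length`; the defined names are hypothetical.

```racket
#lang racket/base

;; Lambda takes two bytes in UTF-8, "x" takes one.
(define encoded-len (string-utf-8-length "\u03BBx"))

;; Three bytes decode to two characters.
(define decoded-len (bytes-utf-8-length (bytes 206 187 120)))

;; Invalid input yields #f when err-char is #f (the default).
(define invalid-len (bytes-utf-8-length (bytes 255)))

;; With err-char, the bad byte counts as one replacement character.
(define patched-len (bytes-utf-8-length (bytes 255) #\?))
```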
Returns the skip-th character in the UTF-8 decoding of bstr’s
substring from start to end, but without actually generating the
other decoded characters. If the substring is not a UTF-8 encoding up
to the skip-th character (when err-char is #f), or if the substring
decoding produces fewer than skip characters, the result is #f. If
err-char is not #f, it is used to resolve decoding errors as in
bytes->string/utf-8.
Returns the offset in bytes into bstr at which the skip-th
character’s encoding starts in the UTF-8 decoding of bstr’s substring
from start to end (but without actually generating the other decoded
characters). The result is relative to the start of bstr, not to
start. If the substring is not a UTF-8 encoding up to the skip-th
character (when err-char is #f), or if the substring decoding
produces fewer than skip characters, the result is #f. If err-char is
not #f, it is used to resolve decoding errors as in
bytes->string/utf-8.
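Character-level access into UTF-8 bytes can be sketched as below, assuming the two entries above document `bytes-utf-8-ref` and `bytes-utf-8-index` with optional arguments in the order skip, err-char, start, end. The defined names are hypothetical.

```racket
#lang racket/base

;; Lambda (two bytes), then "x" and "y" (one byte each).
(define encoded (bytes 206 187 120 121))

;; Character 0 is lambda; character 2 is "y".
(define first-char (bytes-utf-8-ref encoded 0))
(define third-char (bytes-utf-8-ref encoded 2))

;; Only three characters exist, so skipping four yields #f.
(define past-end (bytes-utf-8-ref encoded 4))

;; Character 1 starts at byte offset 2, character 2 at byte offset 3.
(define second-offset (bytes-utf-8-index encoded 1))
(define third-offset (bytes-utf-8-index encoded 2))

;; With err-char, an invalid leading byte counts as one character.
(define after-bad (bytes-utf-8-ref (bytes 255 120) 1 #\?))
```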
3.4.4 Bytes to Bytes Encoding Conversion
Produces a byte converter to go from the encoding named by from-name
to the encoding named by to-name. If the requested conversion pair is
not available, #f is returned instead of a converter.
Certain encoding combinations are always available:
(bytes-open-converter "UTF-8" "UTF-8") — the
identity conversion, except that encoding errors in the input lead
to a decoding failure.
(bytes-open-converter "UTF-8-permissive" "UTF-8") —
the identity conversion, except that
any input byte that is not part of a valid encoding sequence is
effectively replaced by the UTF-8 encoding sequence for
#\uFFFD.  (This handling of invalid sequences is
consistent with the interpretation of port byte streams into
characters; see Ports.)
(bytes-open-converter "" "UTF-8") — converts from
the current locale’s default encoding (see Encodings and Locales)
to UTF-8.
(bytes-open-converter "UTF-8" "") — converts from
UTF-8 to the current locale’s default encoding (see
Encodings and Locales).
(bytes-open-converter "platform-UTF-8" "platform-UTF-16")
— converts UTF-8 to UTF-16 under Unix and Mac OS X, where each UTF-16
code unit is a sequence of two bytes ordered by the current
platform’s endianness. Under Windows, the input can include
encodings that are not valid UTF-8, but which naturally extend the
UTF-8 encoding to support unpaired surrogate code units, and the
output is a sequence of UTF-16 code units (as little-endian byte
pairs), potentially including unpaired surrogates.
(bytes-open-converter "platform-UTF-8-permissive" "platform-UTF-16")
— like (bytes-open-converter "platform-UTF-8" "platform-UTF-16"),
but an input byte that is not part of a valid UTF-8 encoding
sequence (or valid for the unpaired-surrogate extension under
Windows) is effectively replaced with (char->integer #\?).
(bytes-open-converter "platform-UTF-16" "platform-UTF-8")
— converts UTF-16 (bytes ordered by the current platform’s
endianness) to UTF-8 under Unix and Mac OS X. Under Windows, the input can
include UTF-16 code units that are unpaired surrogates, and the
corresponding output includes an encoding of each surrogate in a
natural extension of UTF-8. Under Unix and Mac OS X, surrogates are
assumed to be paired: a pair of bytes with the bits 55296 (#xD800)
starts a surrogate pair, and the 1023 (#x3FF) bits are used from
the pair and following pair (independent of the value of the
56320 (#xDC00) bits). On all platforms, performance may be poor
when decoding from an odd offset within an input byte string.
A newly opened byte converter is registered with the current custodian
(see Custodians), so that the converter is closed when
the custodian is shut down. A converter is not registered with a
custodian (and does not need to be closed) if it is one of the
guaranteed combinations not involving "" under Unix, or if it
is any of the guaranteed combinations (including "") under
Windows and Mac OS X.
In the Racket software distributions for Windows, a suitable
"iconv.dll" is included with "libmzschVERS.dll".
The set of available encodings and combinations varies by platform,
depending on the iconv library that is installed; the
from-name and to-name arguments are passed on to
iconv_open. Under Windows, "iconv.dll" or
"libiconv.dll" must be in the same directory as
"libmzschVERS.dll" (where VERS is a version
number), in the user’s path, in the system directory, or in the
current executable’s directory at run time, and the DLL must either
supply _errno or link to "msvcrt.dll" for _errno;
otherwise, only the guaranteed combinations are available.
Use bytes-convert with the result to convert byte strings.
Converts the bytes from src-start-pos to src-end-pos
in src-bstr.
If dest-bstr is not #f, the converted bytes are
written into dest-bstr from dest-start-pos to
dest-end-pos. If dest-bstr is #f, then a
newly allocated byte string holds the conversion results, and if
dest-end-pos is not #f, the size of the result byte
string is no more than (- dest-end-pos dest-start-pos).
The result of bytes-convert is three values:
result-bstr or dest-wrote-amt — a byte
string if dest-bstr is #f or not provided, or the
number of bytes written into dest-bstr otherwise.
src-read-amt — the number of bytes successfully converted
from src-bstr.
'complete, 'continues,
'aborts, or 'error — indicates
how conversion terminated:
'complete: The entire input was processed, and
src-read-amt will be equal to (- src-end-pos src-start-pos).
'continues: Conversion stopped due to the limit on
the result size or the space in dest-bstr; in this case,
fewer than (- dest-end-pos dest-start-pos) bytes may be
returned if more space is needed to process the next complete
encoding sequence in src-bstr.
'aborts: The input stopped part-way through an
encoding sequence, and more input bytes are necessary to continue.
For example, if the last byte of input is 195 for a
"UTF-8-permissive" decoding, the result is
'aborts, because another byte is needed to determine how to
use the 195 byte.
'error: The bytes starting at (+ src-start-pos src-read-amt) in src-bstr do not form
a legal encoding sequence. This result is never produced for some
encodings, where all byte sequences are valid encodings. For
example, since "UTF-8-permissive" handles an invalid UTF-8
sequence by substituting the encoding of #\uFFFD, every byte
sequence is effectively valid.
Applying a converter accumulates state in the converter (even when the
third result of bytes-convert is 'complete). This
state can affect both further processing of input and further
generation of output, but only for conversions that involve “shift
sequences” to change modes within a stream. To terminate an input
sequence and reset the converter, use bytes-convert-end.
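The three results and the status symbols can be sketched with two of the always-available converters. This assumes the entries above document `bytes-open-converter` and `bytes-convert`; the defined names are hypothetical, and the exact replacement bytes produced by the permissive converter are deliberately not checked.

```racket
#lang racket/base

;; Permissive converter: the invalid byte 255 is replaced, never rejected.
(define permissive (bytes-open-converter "UTF-8-permissive" "UTF-8"))
(define-values (p-out p-read p-status)
  (bytes-convert permissive (bytes 65 255 66)))

;; Strict converter: conversion stops at the invalid byte.
(define strict (bytes-open-converter "UTF-8" "UTF-8"))
(define-values (s-out s-read s-status)
  (bytes-convert strict (bytes 65 255 66)))

;; Input ending in 195 stops part-way through an encoding sequence.
(define partial (bytes-open-converter "UTF-8-permissive" "UTF-8"))
(define-values (a-out a-read a-status)
  (bytes-convert partial (bytes 65 195)))
```

In this sketch `p-status` should be `'complete` with all three input bytes read, `s-status` should be `'error` with `s-read` pointing at the offending byte, and `a-status` should be `'aborts`.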
Like bytes-convert, but instead of converting bytes, this procedure
generates an ending sequence for the conversion (sometimes called a
“shift sequence”), if any. Few encodings use shift sequences, so this
function will succeed with no output for most encodings. In any case,
successful output of a (possibly empty) shift sequence resets the
converter to its initial state.
The result of bytes-convert-end is two values:
result-bstr or dest-wrote-amt — a byte string if
dest-bstr is #f or not provided, or the number of
bytes written into dest-bstr otherwise.
'complete or 'continues —
indicates whether conversion completed. If 'complete, then
an entire ending sequence was produced. If 'continues, then
the conversion could not complete due to the limit on the result
size or the space in dest-bstr, and the first result is
either an empty byte string or 0.
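For an encoding without shift sequences, ending a conversion produces no output. A minimal sketch, assuming this entry documents `bytes-convert-end`; the defined names are hypothetical.

```racket
#lang racket/base

(define conv (bytes-open-converter "UTF-8" "UTF-8"))

;; Convert some input first, then terminate the sequence.
(define-values (out read-amt status) (bytes-convert conv #"abc"))
(define-values (ending end-status) (bytes-convert-end conv))
```

UTF-8 has no shift sequences, so `ending` is expected to be an empty byte string and `end-status` to be `'complete`.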
Returns a string for the current locale’s encoding (i.e., the
encoding normally identified by ""). See also
system-language+country.
3.4.5 Additional Byte String Functions
Examples:
> (bytes-append* #"a" #"b" '(#"c" #"d"))
#"abcd"
#"Alpha, Beta, Gamma"
Appends the byte strings in strs, inserting sep between each pair of
byte strings in strs.
Example:
> (bytes-join '(#"one" #"two" #"three" #"four") #" potato ")
#"one potato two potato three potato four"