% -*- Mode: TeX -*-

% Character Comparison
% Character Types
% Character Roles
% Character Casification
% Character Codes
% Character Names

%-------------------- Character Types --------------------

%%% ========== CHARACTER (System Class)
\begincom{character}\ftype{System Class}

\label Class Precedence List::
\typeref{character},
\typeref{t}

\label Description::

A \term{character} is an \term{object} that
represents a unitary token in an aggregate quantity of text;
\seesection\CharacterConcepts.

\issue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}
\issue{CHARACTER-PROPOSAL:2-3-1}
\Thetypes{base-char} and \typeref{extended-char}
form an \term{exhaustive partition} of \thetype{character}.
\endissue{CHARACTER-PROPOSAL:2-3-1}
\endissue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}

\label See Also::

{\secref\CharacterConcepts},
{\secref\SharpsignBackslash},
{\secref\PrintingCharacters}

\endcom%{character}\ftype{System Class}

\begincom{base-char}\ftype{Type}

\issue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}

\label Supertypes::

\typeref{base-char},
\typeref{character},
\typeref{t}

\label Description::

\Thetype{base-char} is defined as the \term{upgraded array element type}
of \typeref{standard-char}.
An \term{implementation} can support additional \subtypesof{character}
(besides the ones listed in this standard)
that might or might not be \supertypesof{base-char}.
In addition, an \term{implementation} can define \typeref{base-char}
to be the \term{same} \term{type} as \typeref{character}.

\term{Base characters} are distinguished in the following respects:
\beginlist
\itemitem{1.}
  \Thetype{standard-char} is a \term{subrepertoire} of \thetype{base-char}.
\itemitem{2.}
  The selection of \term{base characters} that are not \term{standard characters}
  is implementation defined.
\itemitem{3.}
  Only \term{objects} of \thetype{base-char} can be
  \term{elements} of a \term{base string}.
\itemitem{4.}
  No upper bound is specified for the number of characters in the
  \typeref{base-char} \term{repertoire};
  the size of that \term{repertoire} is
  %\term{implementation-dependent}
  \term{implementation-defined}.
  The lower bound is~96, the number of \term{standard characters}.
  %defined for \clisp.
\endlist

%The distinction of base characters is largely a pragmatic
%choice. It permits efficient handling of common situations, may
%be privileged for host system I/O, and can serve as an
%intermediate basis for portability, less general than the standard
%characters, but possibly more useful across a narrower range of
%implementations.
%
%Many computers have some "base" character representation which
%is a function of hardware instructions for dealing with characters,
%as well as the organization of the file system. The base character
%representation is likely to be the smallest transaction unit permitted
%for text file and terminal I/O operations. On a system with a record
%based I/O paradigm, the base character representation is likely to
%be the smallest record quantum. On many computer systems,
%this representation is a byte.

Whether a character is a \term{base character} depends on the way
that an \term{implementation} represents \term{strings},
and not on any other properties of the \term{implementation} or the host operating system.
For example, one implementation might encode all \term{strings}
as characters having 16-bit encodings, and another might have
two kinds of \term{strings}: those with characters having 8-bit
encodings and those with characters having 16-bit encodings. In the
first \term{implementation}, \thetype{base-char} is equivalent to
\thetype{character}: there is only one kind of \term{string}.
In the second \term{implementation}, the \term{base characters} might be
those \term{characters} that could be stored in a \term{string}
of \term{characters} having 8-bit encodings.
In such an implementation,
\thetype{base-char} is a \term{proper subtype} of \thetype{character}.

%KMP: Note that I think there could be implementations in which the 8-bit strings
%are -not- base characters, if all the standard-chars were not
%representable using the 8-bit encoding scheme. In such a case,
%it might be that (upgraded-array-element-type 'standard-char) returned
%the 16-bit representation. It might be that the 8-bit representation
%was something else entirely.

%% 2.15.0 15
\Thetype{standard-char} is a
\issue{CHARACTER-PROPOSAL:2-3-1}
\subtypeof{base-char}.
%This text will be deleted:
%subtype of \typeref{character}.
\endissue{CHARACTER-PROPOSAL:2-3-1}
%\thetype{string-char} is a \subtypeof{character}.

\endissue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}

\endcom%{base-char}\ftype{Type}

\begincom{standard-char}\ftype{Type}

\issue{STANDARD-REPERTOIRE-GRATUITOUS:RENAME}

\label Supertypes::

\typeref{standard-char},
\issue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}
\typeref{base-char},
\endissue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}
\typeref{character},
\typeref{t}

\label Description::

A fixed set of 96 \term{characters} required to be present in all
\term{conforming implementations}. \term{Standard characters} are
defined in \secref\StandardChars.

%%% 13.2.0 4
\issue{CHARACTER-PROPOSAL:2-1-1}
Any \term{character} that is not \term{simple} is not a \term{standard character}.
\endissue{CHARACTER-PROPOSAL:2-1-1}

\endissue{STANDARD-REPERTOIRE-GRATUITOUS:RENAME}

\label See Also::

{\secref\StandardChars}

\endcom%{standard-char}\ftype{Type}

\begincom{extended-char}\ftype{Type}

\issue{CHARACTER-PROPOSAL:2-3-1}
\issue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}

\label Supertypes::

\typeref{extended-char},
\typeref{character},
\typeref{t}

\label Description::

\Thetype{extended-char} is equivalent to the \term{type} \f{(and character (not base-char))}.
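
The equivalence stated above can be illustrated with a minimal sketch.
It is not part of the original entry, and the function name
\f{char-partition} is hypothetical, introduced here only for illustration.
Because \typeref{base-char} and \typeref{extended-char} partition
\thetype{character}, exactly one clause applies to any \term{character};
which clause applies to a \term{character} that is not a
\term{standard character} depends on the \term{implementation}.

\code
 ;; Hypothetical helper: classify a character by the partition.
 (defun char-partition (c)
   (etypecase c
     (base-char :base)
     (extended-char :extended)))

 ;; #\\a is a standard character, and standard-char is a subtype of base-char.
 (char-partition #\\a) \EV :BASE
 (typep #\\a 'extended-char) \EV \term{false}
\endcode
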
%% 2.3.0 1
%% 2.3.0 2

\label Notes::

%This next paragraph as added per Barrett's suggestion:
\Thetype{extended-char} might
%be equivalent to \thetype{nil}
%% Replaced as controversial. -kmp 4-Feb-92
have no \term{elements}\meaning{4}
in \term{implementations} in which all \term{characters} are \oftype{base-char}.

\endissue{CHARACTER-VS-CHAR:LESS-INCONSISTENT-SHORT}
\endissue{CHARACTER-PROPOSAL:2-3-1}

\endcom%{extended-char}\ftype{Type}

%-------------------- Character Comparison --------------------

%%% ========== CHAR-EQUAL
%%% ========== CHAR-NOT-EQUAL
%%% ========== CHAR-LESSP
%%% ========== CHAR-GREATERP
%%% ========== CHAR-NOT-LESSP
%%% ========== CHAR-NOT-GREATERP
%%% ========== CHAR=
%%% ========== CHAR/=
%%% ========== CHAR<
%%% ========== CHAR>
%%% ========== CHAR>=
%%% ========== CHAR<=
\begincom{char=, char/=, char<, char>, char<=, char>=, char-equal, char-not-equal, char-lessp, char-greaterp, char-not-greaterp, char-not-lessp}\ftype{Function}

\label Syntax::

\DefunMultiWithValues {{\rest} \plus{characters}} {generalized-boolean}
 {\entry{{char$=$}}
  \entry{{char$/=$}}
  \entry{{char$<$}}
  \entry{{char$>$}}
  \entry{{char$<=$}}
  \entry{{char$>=$}}
  \noalign{\vskip 5pt}
  \entry{char-equal}
  \entry{char-not-equal}
  \entry{char-lessp}
  \entry{char-greaterp}
  \entry{char-not-greaterp}
  \entry{char-not-lessp}}

\label Arguments and Values::

\param{character}---a \term{character}.

\param{generalized-boolean}---a \term{generalized boolean}.

\label Description::

These predicates compare \term{characters}.

\funref{char=} returns \term{true} if all \param{characters} are the \term{same};
otherwise, it returns \term{false}.
\issue{CHARACTER-PROPOSAL:2-1-1}
If two \param{characters} differ in any \term{implementation-defined} \term{attributes},
then they are not \funref{char=}.
\endissue{CHARACTER-PROPOSAL:2-1-1}

\funref{char/=} returns \term{true} if all \param{characters} are different;
otherwise, it returns \term{false}.
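
As a brief illustration of these two definitions (a sketch restricted to
\term{standard characters}, which have no \term{implementation-defined}
\term{attributes}): \funref{char/=} requires every pair of arguments to
differ, not merely adjacent ones, so \funref{char=} and \funref{char/=}
can both be \term{false} for the same arguments.

\code
 (char= #\\d #\\d #\\d) \EV \term{true}
 (char= #\\d #\\d #\\x) \EV \term{false}
 (char/= #\\d #\\x #\\c) \EV \term{true}
 (char= #\\d #\\c #\\d) \EV \term{false}
 (char/= #\\d #\\c #\\d) \EV \term{false} ;first and last are the same
\endcode
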

\funref{char<} returns \term{true} if the \param{characters} are monotonically increasing;
otherwise, it returns \term{false}.
\issue{CHARACTER-PROPOSAL:2-1-1}
If two \term{characters} have \term{identical} \term{implementation-defined} \term{attributes},
then their ordering by \funref{char<} is consistent with the numerical ordering
by the predicate \f{<} on their \term{codes}.
\endissue{CHARACTER-PROPOSAL:2-1-1}

\funref{char>} returns \term{true} if the \param{characters} are monotonically decreasing;
otherwise, it returns \term{false}.
\issue{CHARACTER-PROPOSAL:2-1-1}
If two \term{characters} have \term{identical} \term{implementation-defined} \term{attributes},
then their ordering by \funref{char>} is consistent with the numerical ordering
by the predicate \f{>} on their \term{codes}.
\endissue{CHARACTER-PROPOSAL:2-1-1}

\funref{char<=} returns \term{true} if the \param{characters} are monotonically nondecreasing;
otherwise, it returns \term{false}.
\issue{CHARACTER-PROPOSAL:2-1-1}
If two \term{characters} have \term{identical} \term{implementation-defined} \term{attributes},
then their ordering by \funref{char<=} is consistent with the numerical ordering
by the predicate \f{<=} on their \term{codes}.
\endissue{CHARACTER-PROPOSAL:2-1-1}

\funref{char>=} returns \term{true} if the \param{characters} are monotonically nonincreasing;
otherwise, it returns \term{false}.
\issue{CHARACTER-PROPOSAL:2-1-1}
If two \term{characters} have \term{identical} \term{implementation-defined} \term{attributes},
then their ordering by \funref{char>=} is consistent with the numerical ordering
by the predicate \f{>=} on their \term{codes}.
\endissue{CHARACTER-PROPOSAL:2-1-1}

\funref{char-equal},
\funref{char-not-equal},
\funref{char-lessp},
\funref{char-greaterp},
\funref{char-not-greaterp},
and \funref{char-not-lessp}
are similar to
\funref{char=},
\funref{char/=},
\funref{char<},
\funref{char>},
\funref{char<=},
and \funref{char>=},
respectively,
except that they ignore differences in \term{case} and
\issue{CHARACTER-PROPOSAL:2-1-1}
% that the effect, if any, of each \term{implementation-defined} \term{attribute}
% must be specified as part of the definition of that \term{attribute}.
%% Sandra thought the above was awkward. Trying again. -kmp 26-Jan-92
might have an \term{implementation-defined} behavior
for \term{non-simple} \term{characters}.
% For example, an implementation might define that certain
% \term{implementation-defined} \term{attributes} are ignored by
% \funref{char-equal}, \i{etc.}
%% More rewording to soothe awkwardness. -kmp 26-Jan-92
For example, an \term{implementation} might define that
\funref{char-equal}, \i{etc.} ignore certain
\term{implementation-defined} \term{attributes}.
The effect, if any,
of each \term{implementation-defined} \term{attribute}
upon these functions must be specified as part of the definition of that \term{attribute}.

%% This part has been moved to the notes.
% %% 13.2.0 char-equal
% This means that for the \term{standard characters}, the ordering used by
% \funref{char-equal}, \etc.
% is such that
% \f{A=a}, \f{B=b}, and so on, up to \f{Z=z}, and furthermore either
% \f{9<A} or \f{Z<0}.

\label Examples::

\code
 (char> #\\e #\\d) \EV \term{true}
 (char>= #\\e #\\d) \EV \term{true}
 (char> #\\d #\\c #\\b #\\a) \EV \term{true}
 (char>= #\\d #\\c #\\b #\\a) \EV \term{true}
 (char> #\\d #\\d #\\c #\\a) \EV \term{false}
 (char>= #\\d #\\d #\\c #\\a) \EV \term{true}
 (char> #\\e #\\d #\\b #\\c #\\a) \EV \term{false}
 (char>= #\\e #\\d #\\b #\\c #\\a) \EV \term{false}
 (char> #\\z #\\A) \EV \term{implementation-dependent}
 (char> #\\Z #\\a) \EV \term{implementation-dependent}
 (char-equal #\\A #\\a) \EV \term{true}
 (stable-sort (list #\\b #\\A #\\B #\\a #\\c #\\C) #'char-lessp)
\EV (#\\A #\\a #\\b #\\B #\\c #\\C)
 (stable-sort (list #\\b #\\A #\\B #\\a #\\c #\\C) #'char<)
\EV (#\\A #\\B #\\C #\\a #\\b #\\c) ;Implementation A
\EV (#\\a #\\b #\\c #\\A #\\B #\\C) ;Implementation B
\EV (#\\a #\\A #\\b #\\B #\\c #\\C) ;Implementation C
\EV (#\\A #\\a #\\B #\\b #\\C #\\c) ;Implementation D
\EV (#\\A #\\B #\\a #\\b #\\C #\\c) ;Implementation E
\endcode

%GLS observes that there are 15 other possibilities here:
% (#\\A #\\a #\\b #\\c #\\B #\\C)
% (#\\A #\\B #\\a #\\b #\\c #\\C)
% (#\\a #\\A #\\b #\\B #\\C #\\c)
% (#\\a #\\A #\\B #\\b #\\C #\\c)
% (#\\a #\\A #\\b #\\c #\\B #\\C)
% (#\\a #\\A #\\B #\\b #\\c #\\C)
% (#\\a #\\A #\\B #\\C #\\b #\\c)
% (#\\A #\\a #\\B #\\b #\\c #\\C)
% (#\\A #\\a #\\B #\\C #\\b #\\c)
% (#\\A #\\B #\\a #\\C #\\b #\\c)
% (#\\a #\\b #\\A #\\c #\\B #\\C)
% (#\\a #\\b #\\A #\\B #\\c #\\C)
% (#\\a #\\b #\\A #\\B #\\C #\\c)
% (#\\A #\\a #\\b #\\B #\\c #\\C)
% (#\\A #\\a #\\b #\\B #\\C #\\c)
%But I think readers will get the idea from those I've specified. -kmp 27-May-91

\endissue{CHARACTER-PROPOSAL:2-1-1}

\label Affected By:\None.

\label Exceptional Situations::

\Shouldcheckplus{character}

\label See Also::

{\secref\CharacterSyntax},
{\secref\ImplementationDefinedScripts}

\label Notes::

If characters differ in their \term{code} \term{attribute}
or any \term{implementation-defined} \term{attribute},
they are considered to be different by \funref{char=}.

%% 13.2.0 33
There is no requirement that \f{(eq c1 c2)} be true merely because
\f{(char= c1 c2)} is \term{true}. While \funref{eq} can distinguish two
\term{characters} that \funref{char=} does not, it is distinguishing them
not as \term{characters}, but in some sense on the basis of a lower-level
implementation characteristic.
If \f{(eq c1 c2)} is \term{true},
then \f{(char= c1 c2)} is also true.
\funref{eql} and \funref{equal}
compare \term{characters} in the same
way that \funref{char=} does.

The manner in which \term{case} is used by
\funref{char-equal},
\funref{char-not-equal},
\funref{char-lessp},
\funref{char-greaterp},
\funref{char-not-greaterp},
and \funref{char-not-lessp}
implies an ordering for \term{standard characters} such that
\f{A=a}, \f{B=b}, and so on, up to \f{Z=z}, and furthermore either
\f{9<A} or \f{Z<0}.