Homepage: http://github.com/rolandwalker/ucs-utils
Author: Roland Walker
Updated:
Utilities for Unicode characters

Quickstart

    (require 'ucs-utils)

    (ucs-utils-char "Middle Dot"         ; character to return
                    ?.                   ; fallback if unavailable
                    'char-displayable-p) ; test for character to pass

    (ucs-utils-first-existing-char '("White Bullet"
                                     "Bullet Operator"
                                     "Circled Bullet"
                                     "Middle Dot"
                                     ?.) 'cdp)

    (ucs-utils-string "Horizontal Ellipsis" '[["..."]])

Explanation

This library provides utilities for manipulating Unicode characters, with
an integrated ability to return fallback characters when Unicode display is
not possible.

Some ambiguities in Emacs' built-in Unicode data are resolved, and character
support is updated to Unicode 8.0.

There are three interactive commands:

    `ucs-utils-ucs-insert'       ; `ucs-insert' workalike using ido
    `ucs-utils-eval'             ; the inverse of `ucs-insert'
    `ucs-utils-install-aliases'  ; install shorter aliases

The other functions are only useful from other Lisp code:

    `ucs-utils-char'
    `ucs-utils-first-existing-char'
    `ucs-utils-vector'
    `ucs-utils-string'
    `ucs-utils-intact-string'
    `ucs-utils-pretty-name'
    `ucs-utils-read-char-by-name'
    `ucs-utils-subst-char-in-region'

To use ucs-utils, place the ucs-utils.el library somewhere Emacs can find
it, and add the following to your ~/.emacs file:

    (require 'ucs-utils)

and optionally

    (ucs-utils-install-aliases)

See Also

    M-x customize-group RET ucs-utils RET

    http://en.wikipedia.org/wiki/Universal_Character_Set

Notes

Compatibility and Requirements

    GNU Emacs version 25.1-devel     : not tested
    GNU Emacs version 24.5           : not tested
    GNU Emacs version 24.4           : yes
    GNU Emacs version 24.3           : yes
    GNU Emacs version 23.3           : yes (*)
    GNU Emacs version 22.3 and lower : no

    (*) For full Emacs 23.x support, the library ucs-utils-6.0-delta.el
        should also be installed.

    Uses if present: persistent-soft.el (Recommended)

Bugs

TODO

    Accept synonyms on inputs? At least Tab would be nice. There is an
    official list of aliases at
        http://www.unicode.org/Public/8.0.0/ucd/NameAliases.txt

    Generated names for CJK blocks added in Unicode 6.2:
        CJK Unified Ideographs
        CJK Unified Ideographs Extension A
        CJK Unified Ideographs Extension C

    Support alternate naming schemes for CJK ideographs.

    Support helm or other choosers which are able to cope with the entire
    set of character names, including CJK ideographs.

    Spin out older portions of ucs-utils-names-corrections which are not
    needed in recent Emacs releases (as with ucs-utils-6.0-delta.el).

    Namespace cache keys as with font-utils and unicode-utils.

    Separate test run without persistent-soft.el.

License

Simplified BSD License:

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

    1. Redistributions of source code must retain the above copyright
       notice, this list of conditions and the following disclaimer.

    2. Redistributions in binary form must reproduce the above copyright
       notice, this list of conditions and the following disclaimer in the
       documentation and/or other materials provided with the distribution.

This software is provided by Roland Walker "AS IS" and any express or
implied warranties, including, but not limited to, the implied warranties of
merchantability and fitness for a particular purpose are disclaimed. In no
event shall Roland Walker or contributors be liable for any direct,
indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused and
on any theory of liability, whether in contract, strict liability, or tort
(including negligence or otherwise) arising in any way out of the use of
this software, even if advised of the possibility of such damage.

The views and conclusions contained in the software and documentation are
those of the authors and should not be interpreted as representing official
policies, either expressed or implied, of Roland Walker.

No rights are claimed over data created by the Unicode Consortium, which are
included here under the terms of the Unicode Terms of Use.