News

MathML on the Clipboard

David Carlisle's Blog - Fri, 2010/01/22 - 01:31
I got a new machine at work with Windows 7 on it.

One of the more interesting applications coming with Windows 7 is the Math Input Panel. This is designed for pen input on a tablet-style device and performs pretty impressively accurate recognition of mathematical expressions. While designed for a tablet, it also works pretty well if you are just “writing” the expression with a finger on a small laptop trackpad, which is how I have been using it.

The Math Input Panel has a very simple interface with virtually no customisation options. It offers no way of saving the expressions it generates, just a simple Insert button that tries to insert the math expression at the insertion point in a currently open application. This works well for Word 2007, which accepts MathML from the clipboard, transparently converts it to its internal form and renders it, but more generic tools such as XML editors that could use the MathML do not accept MathML from the clipboard in this way. Unlike MathPlayer or Word, the Math Input Panel doesn't offer fallback text representations of the XML markup on the clipboard. Marko Panic, the program manager for the development of this tool, confirmed to me that this was a design decision, as they didn't want the end user to be faced with raw XML. This is not unreasonable, but not what I wanted personally (I like to see my XML raw :-). Marko confirmed that the MathML is on the clipboard and that it should be possible to extract it with a few lines of code, or, if I wanted more extensive customisation, there is documentation of the API offered by the underlying DLL available at
http://msdn.microsoft.com/en-us/library/dd317324(VS.85).aspx
and
http://msdn.microsoft.com/en-us/library/dd317311(VS.85).aspx.

I decided to brush up my C# forms programming and produced a small form that shows any MathML on the clipboard. The main code (everything apart from the boilerplate Visual Studio files) is available on Google Code. While it's particularly useful for seeing the MathML generated by the Math Input Panel, it also works with other applications, notably MathPlayer and Word, that place MathML on the clipboard.

While looking via Google for some programming tips on my form, I came across a very similar blog posting from last year. That form had some differences though (displaying the IE folding tree view of the XML) so I completed my form here. The screenshot shows the Math Input Panel interpreting my appalling handwriting, and the mmlclipboard form displaying the generated MathML.



Categories: Planet JEM

Joint Conference "Interaktive Kulturen" (Interactive Cultures)

eLearning Europa - Thu, 2010/01/21 - 01:00
Under the motto "Interactive Cultures", the tenth Mensch & Computer conference, together with DeLFI 2010 and a German UPA track on usability practice, explores the many-layered questions of human-technology interaction, learning with digital media, and digital networking in groups and communities. The conference takes place in the context of the European Capital of Culture Ruhr.2010...
Categories: eLearning

EUROSCOLA 2010 Competition - 16th Edition

eLearning Europa - Thu, 2010/01/21 - 01:00
Categories: eLearning

2010-01-18: Prof. Sherry Mantyka will visit ActiveMath from March 15 to April 2, 2010

ActiveMath - Mon, 2010/01/18 - 11:10
The Canadian mathematics educator Prof. Sherry Mantyka, who organizes bridging courses at Memorial University of Newfoundland, is visiting Saarbrücken to help design a remedial scenario for ActiveMath.
Categories: Technology

tristanhuntjuricek

xbeta - Sat, 2010/01/16 - 17:52
Updated by tristanhuntjuricek on 2010-01-16 at 15:52:45Z.
Categories: Technology

2009-12-24: Sergey has received EU Marie Curie International Incoming Fellowship

ActiveMath - Fri, 2010/01/15 - 16:11
Sergey Sosnovsky's proposal for an EU Marie Curie Fellowship has been approved by the EU Research Executive Agency. The funding will start in April 2010 and last until April 2012. The project "Intelligent Support for Authoring Semantic Learning Content" will focus on implementing author-friendly technologies for learning content development, including collaborative authoring support, metadata authoring support, open-corpus content discovery, interactivity authoring, and gap detection.
Categories: Technology

2009-10-15: Visiting Researcher from Russia

ActiveMath - Fri, 2010/01/15 - 16:11
Anatoly Belchusov from Russia is a DAAD visiting researcher at DFKI. The purpose of his visit is to develop domain reasoning services for intelligent diagnosis and feedback generation in the domain of integrals. He is using the YACAS platform to encode the domain reasoner, which is connected to the ITS module of the ActiveMath Learning Environment. Anatoly is in Saarbrücken from 15.10 until 15.12.2009.
Categories: Technology

2009-10-01: A new researcher joined ActiveMath lab

ActiveMath - Fri, 2010/01/15 - 16:10
Sergey Sosnovsky, a PhD candidate from the School of Information Sciences, University of Pittsburgh (Pittsburgh, PA, USA), has joined our group. Sergey received his MSc and BSc from Kazan State Technological University, Kazan, Russia. His research focuses on combining new trends in Web development with adaptive and user-modelling technologies. Sergey has co-authored about 60 peer-reviewed research publications and served on the program committees of several workshops in the area of Semantic Web for Adaptation.
Categories: Technology

And here I am, not approving of some link sites…

Otro blog mas - Fri, 2010/01/15 - 00:21

I have had this post in my head for a long time. As long as the time that has passed since I signed the manifesto in defense of fundamental rights on the internet (about a month and a half ago now)… Today I have finally decided to talk about what I think should be punishable on the net. Let's go step by step…

First of all, I have nothing against P2P, mainly for two reasons:

  1. Whoever shares something digital, whatever it is, on a P2P network does not do it for profit (and, in practice, spends on the effort upload bandwidth that, in this country at least, costs a fortune). [In fact, some people do try to get a cut: those who share password-protected files and try to extract a ransom for the password... but the 'community' takes care of 'lynching' them appropriately (or at least the communities I move in, or have moved in, do).]
  2. While I do think that record labels and film distributors lose revenue through P2P,
    1. I am also sure that nobody believes their loss figures (after all, if Pixbox offers its entire catalogue for 6 euros a month, the industry can hardly argue that anyone who downloads music harms it by more than those 6 euros a month, unless they sue Pixbox in the same way, minus whatever cut they take)
    2. the ones I worry about are the creators, not the intermediaries. And the creators do not seem to be doing so badly lately
    3. although the record and DVD industries do not like to be reminded of it, there are industries that suffer more from the effects of 'piracy': at the very least the video game industry and the triple-X sub-industry. And curiously, you do not hear them hiding behind the poor excuse of P2P to ask for help from the public purse or from the executive, the legislature or the judiciary: they devote their efforts, rather more intelligently, to finding new distribution channels, new business models… and to pursuing the industrial pirates.

And that is where the issue of link sites hurts me (sites which, as Miquel Peguera pointed out, are not a crime, and will remain so as long as Spanish intellectual property legislation is not changed).

  1. Link sites are not P2P: they are centralized, not peer-to-peer at all, and they have one or more people responsible for them.
  2. With link sites there is profit (or, at the very least, it is easy to see how there could be).
  3. No single P2P user can do enough damage to the industry for it to flinch, but the activity of a link site can (or at least so believes your humble and poorly informed servant).

Are all link sites therefore so harmful that they deserve administrative closure? No, of course not. To begin with, it is essential to respect the rights guaranteed to us by the Constitution and the other laws in force. And nothing that involves closing a website should be done without going through the judicial system. Naturally. However pig-headed ("cerril", third sense in the DRAE) certain legislators may turn out to be. And getting worked up because someone might try to slip something like that into a supposedly harmless law seems very natural to me.

That said, I don't know who said that if you have to choose between incompetence and bad faith when something looks like it was done with real malice you should always lean towards the first option, but they were absolutely right. In this case, I have no doubt, there was a more than notable dose of bad faith, supplied by the 'lobby' of the "cultural industries" (if those two words together are not the best possible example of an oxymoron, I don't know which are (I am sure intelligent military men exist)). But that was the bad faith (and the necessary ignorance) of trying to put an end to P2P, not that of attacking freedom of expression: that the wording of the famous "Anteproyecto de Ley de Economía sostenible" (the draft Sustainable Economy Bill) allows it to be used to attack that fundamental right is an accident caused by the incompetence of (almost?) everyone involved in the mess. I know perfectly well that this cannot be proven (the culprits will be the first to defend their competence, demonstrating their lack of it along the way), but since everyone is entitled to an opinion, I'll keep mine ;-).

So then… how do we solve it? Confessing once again my almost total ignorance of the subject (which I fear is not much greater than that of many who have already given their opinion on the matter, especially those who have made a lot of noise), I am strongly attracted by the concept of "safe harbor" included in Title II of the much-criticized (rightly so) Digital Millennium Copyright Act, which shields service providers from the legislation if they commit to behaving as 'safe harbors' and diligently block content that infringes intellectual property law when notified of the infringement (with the expected guarantees for contesting the claim). By introducing something like this [properly] into Spanish law, link sites would quickly split into the 'specialists in material beyond intellectual property legislation' (who would be playing with fire) and the rest of the world (allow me to opine, again, that the rest of the world would turn out to be very small). And to make life miserable for the webmaster in question, the industry would only need to post a sniper (a reasonably trained clerk on a thousand euros a month will do) over the browser's F5 key: our hypothetical webmaster is no fool and knows very well when the 'torrent' in question is Alejandro Sanz's latest album (and, therefore, it suits him to remove the link right away) and when it is material that is potentially more harmful but beyond the reach of copyright law.

Legislation like this (that is, outlawing a certain type of link site and protecting the "safe harbors") would not stop P2P (I believe I have already said I have nothing against it), nor [assuming good drafting and subsequent good enforcement, which is no small assumption] would it attack freedom of expression. But it would deflate the business of the "industrial pirates". And that, what can I say, does not seem bad to me…

Categories: Planet JEM

RichEdit Versions 1.0 through 3.0

Murray Sargent: Math in Office - Wed, 2010/01/13 - 03:47

Digging through old doc files, I ran across the following summary of RichEdit up through Version 3.0. It’s more detailed than my post on RichEdit Versions, so it might be of interest to history buffs, anyhow. And it does describe the riched20.dll that still ships with Windows, mostly for purposes of backward compatibility. I wrote this document back in 1998 in preparing for an internal seminar on RichEdit 3.0. It even mentions that RichEdit 3.x would be an ideal development environment for WYSIWYG editing of built-up mathematical expressions! Sure hit that nail on the head. Naturally the statement “there are three main versions of RichEdit” is quite out of date.

What is RichEdit?

There are three main versions of RichEdit: 1.x, 2.x, and 3.0. Since all are being used, it makes sense to group the RichEdit features as they were introduced by these three versions. In general, RichEdit adds selective character and paragraph formatting along with embedded objects to the plain text editing facilities well-known in system edit controls.

A RichEdit instance consists of a single story, galley-like text that can be exported and imported using plain text or RTF.  Each version of RichEdit is a superset of the preceding one, except that only FE builds of RE 1.0 have a vertical text option (a relatively elegant vertical option could be added to RE 3.0 if there’s sufficient demand).

RichEdit 1.0 was originally developed for rich-text email. Major differences between the various builds of RE 1.x and RE 2.0 are that the latter is based on Unicode, is a single world-wide binary (not including BiDi, Thai or Indic scripts), has multilevel undo, has a powerful set of COM interfaces, and is substantially more Word compatible. RE 2.1 adds BiDi capabilities.

Major differences between RE 2.x and RE 3.0 include the latter's better performance, richer text, outline view, zoom, font binding, more powerful IME support, and rich complex script support (BiDi, Indic, and Thai).  RE 3.0 is a single, scalable, world-wide binary that offers high performance and substantial Word compatibility in a small package.

RichEdit 2.0 also includes simpler plain-text and single-line controls. RE 3.0 adds rich/plain ListBox and ComboBox controls.

RichEdit 1.0 Features

1.     Text Entry/Selection. Mostly standard (system-edit control) selection and entry of text. Selection bar support. Word-wrap and auto-word-select options. Single, double, and triple click selection.

2.     ANSI (SBCS and MBCS) editing.  No Unicode

3.     Basic set of character/paragraph formatting properties

4.     Character formatting properties: font facename and size, bold, italic, solid underline, strikeout, protected, link, offset, and text color.

5.     Paragraph formatting properties: start indent, right indent, subsequent line offset, bullet, alignment (left, center, right), and tabs.

6.     Find forward: includes case-insensitive and match-whole-word options.

7.     Message-based interface: almost a superset of the system edit-control message set plus the two OLE interfaces, IRichEditOle and IRichEditOleCallback.

8.     OLE embedded objects:  requires client collaboration based on IRichEditOle and IRichEditOleCallback interfaces.

9.     Right-button menu support: needs IRichEditOleCallback interface.

10.  Drag & Drop editing.

11.  Notifications: WM_COMMAND messages sent to client plus a number of others. Superset of common-control notifications

12.  Single-level undo/redo.

13.  Simple vertical text (Far East builds only)

14.  IME support. (Far East builds only)

15.  WYSIWYG editing using printer metrics. This is needed for WordPad, in particular.

16.  Cut/Copy/Paste/StreamIn/StreamOut with plain text (CF_TEXT) or RTF with and without objects.

17.  C code base

18.   Different builds for different scripts.
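
The two find options in item 6 can be illustrated with a small portable sketch. This is not RichEdit's implementation: the names `FindOptions` and `find_forward` are hypothetical, the code handles ASCII text only, and it assumes a "word" is a run of letters and digits.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical options mirroring the two search flags named in the list.
struct FindOptions {
    bool match_case = false;   // when false, compare case-insensitively
    bool whole_word = false;   // when true, the match must not touch letters/digits
};

static bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

static bool chars_equal(char a, char b, bool match_case) {
    if (match_case) return a == b;
    return std::tolower(static_cast<unsigned char>(a)) ==
           std::tolower(static_cast<unsigned char>(b));
}

// Returns the index of the first match at or after `start`,
// or std::string::npos if there is none.
std::size_t find_forward(const std::string& text, const std::string& pattern,
                         std::size_t start, const FindOptions& opt) {
    if (pattern.empty() || pattern.size() > text.size()) return std::string::npos;
    for (std::size_t i = start; i + pattern.size() <= text.size(); ++i) {
        std::size_t k = 0;
        while (k < pattern.size() &&
               chars_equal(text[i + k], pattern[k], opt.match_case)) ++k;
        if (k != pattern.size()) continue;          // characters differ here
        if (opt.whole_word) {
            std::size_t end = i + pattern.size();
            bool left_ok  = (i == 0) || !is_word_char(text[i - 1]);
            bool right_ok = (end == text.size()) || !is_word_char(text[end]);
            if (!left_ok || !right_ok) continue;    // match is inside a longer word
        }
        return i;
    }
    return std::string::npos;
}
```

With default options, searching "The cat concatenates" for "CAT" matches at index 4; with `whole_word` set, a search for "cat" skips the occurrence inside "concatenate".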

RichEdit 2.x Additions

1.     Unicode. Big effort needed to maintain compatibility with existing nonUnicode documents, i.e., ability to convert to/from nonUnicode plain and rich text. Substantial effort needed to run correctly on Win95.

2.     General international support. General line breaking algorithm (extension of Kinsoku rules), simple font linking, keyboard font switching.

3.     FE support. E.g., Level 2 and 3 IME support

4.     Find Up as well as down.

5.     BiDi support (RichEdit 2.1)

6.     Multilevel undo. Extensible undo architecture that allows client to participate in app-wide undo model.

7.     Magellan mouse support

8.     Dual-font support. Keyboard can automatically switch fonts when active font is inappropriate for current keyboard, e.g., Kanji characters in Times New Roman.

9.     Smart font apply.  Font change request doesn’t apply Western fonts to FE characters.

10.  Improved display.  An off-screen bitmap is used when multiple fonts occur on the same line.  This allows, for example, the last letter of the word “cool” not to be chopped off.

11.  Transparency support.  Also in windowless mode.

12.  System selection colors. Used for selecting text

13.  AutoURL recognition

14.  Word edit UI compatibility. Selection, cursor-keypad semantics.

15.  Word standard EOP (end-of-paragraph mark: CR). Can also handle CRLF

16.  Plain-text controls as well as rich-text. Single character format and single paragraph format.

17.  Single-line controls as well as multiline.  Truncate at first end-of-paragraph and no word wrap.

18.  Accelerator and Password Controls.

19.  Scalable architecture to reduce instance size.

20.  Windowless operation and interfaces (ITextHost/ITextServices). Added primarily for Forms^3.

21.  COM dual interfaces: TOM (Text Object Model).  This powerful set of interfaces is described separately.

22.  CHARFORMAT2. Added font weight, background color, locale ID, underline type, superscript/subscript (in addition to offset), disabled effect. For RTF roundtripping only, added amount to space between letters, twip size above which to kern character pair, animated-text type, various effects: font shadow/outline, all caps, small caps, hidden, embossed, imprint, and revised.

23.  PARAFORMAT2. Added space before/after and Word line spacings. For RTF roundtripping only, added shading weight/style, numbering start/style/tab, border space/width/sides, tab alignment/leaders, various Word paragraph effects: RTL paragraph, keep, keep-next, page-break-before, no-line-number, no-widow-control, do-not-hyphenate, side-by-side.

24.  More RTF roundtripping. All of Word’s FormatFont and FormatParagraph properties.

25.  Improved OLE support.

26.  Code Stability and stabilization. E.g., parameter and object validation, function invariants, re-entrancy guards, object stabilization, etc.

27.  Strong testing infrastructure including extensive regression tests and Genesis testing. Shipped with no priority 1 or 2 bugs and not many postponed bugs.

28.  Improved Performance.  Smaller working set, faster load and redisplay times, etc.

29.   C++ code base. The code is written in C++. Provided a solid foundation on which to build RichEdit 3.0.
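
Item 15 says that paragraphs end with a single CR while CRLF input is also accepted. One way to picture that is a normalization pass over incoming text. The sketch below is only an illustration under that reading: `normalize_eop` and `count_paragraphs` are hypothetical names, and a lone LF is deliberately left untouched because the list does not say how it is treated.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: collapse every CRLF pair to a single CR so that each
// paragraph ends with exactly one end-of-paragraph (EOP) character.
std::string normalize_eop(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out.push_back(in[i]);
        // Skip the LF that immediately follows a CR.
        if (in[i] == '\r' && i + 1 < in.size() && in[i + 1] == '\n') ++i;
    }
    return out;
}

// Count paragraphs in normalized text: one per CR, plus a final
// unterminated paragraph if the text does not end with a CR.
std::size_t count_paragraphs(const std::string& normalized) {
    std::size_t n = 0;
    for (char c : normalized) {
        if (c == '\r') ++n;
    }
    if (!normalized.empty() && normalized.back() != '\r') ++n;
    return n;
}
```

After normalization, character positions count one unit per paragraph mark, whichever convention the incoming text used.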

RichEdit 3.0 Feature Additions

1.     Zoom.  The zoom factor is given by ratio of two longs.

2.     Paragraph numbering (single-level). Numeric, upper/lower alphabetic or Roman numeral.

3.     Simple tables (no wrap inside cells). Limited UI: no resizing, but can delete/insert rows. With LineServices, can align columns centered, flush right, and decimal.  Cells are simulated by tabs, so text tabs and carriage returns are replaced by blanks.

4.     Normal and heading styles. Built-in normal style and heading styles 1 through 9 are supported by the EM_SETPARAFORMAT and TOM APIs.

5.     Outline view (similar to Word’s).  Supports normal style and headings 1 through 9.  Can collapse to heading level n, promote/demote headings/text, move paragraphs up/down.  Can persist collapse status.

6.     More underline types (dashed, dash-dot, dash-dot-dot, dot)

7.     Underline coloring. Underlined text can be tagged with one of 15 document choices for underline colors.

8.     Hidden text. Marked by CHARFORMAT2 attribute. Handy for roundtripping of information that ordinarily shouldn’t be displayed.

9.     More default hot keys, which act as Word’s default hot keys act.  E.g., European accent dead keys (US keyboards only) and outline-view hot keys. Number hot key (Ctrl+L) cycles through numbering options available, starting with bullet.

10.  Smart-quotes (toggled on/off by Ctrl+") for US keyboards.

11.  Soft hyphens. (0xAD in plain text; \- in RTF).

12.  Italics Caret/Cursor.  Also hand cursor over URLs.

13.  LineServices Option: RichEdit 3.0 can use Office’s LineServices component for line breaking and display.  This elegant option was added primarily to facilitate handling complex scripts (BiDi, Indic, and Thai).  In addition a number of improvements occur for simple scripts, e.g., center, right, and decimal tabs, fully justified text, underline averaging giving a uniform underline even when adjacent text runs have different font sizes.  It opens the door to incorporating LineServices FE enhancements, such as Ruby, Warichu, Tatenakayoko, and vertical text. LineServices also paves the way for WYSIWYG editing of built-up mathematical expressions and RichEdit 3.x looks like the ideal development environment for this.

14.  Complex Script Support: RichEdit 3.0 will support BiDi (text with Arabic and/or Hebrew mixed with other scripts), Indic (Indian scripts like Devanagari), and Thai.  For support of these complex scripts, the LineServices and NT Uniscribe components are used, which run on Win95 and later OSs.

15.  Font binding: RichEdit 3.0 will automatically choose an appropriate font for characters that clearly do not belong to the current charset stamp.  This is done by assigning charsets to runs and associating fonts with those charsets.  Please see the section on Font Binding below.

16.  Charset-specific plain-text read/write options, notably ability to read a file using one charset and write it with a different one.

17.  UTF-8 RTF. Used preferentially for cut/copy/paste and optionally externally, this file format is substantially more compact than ordinary RTF, faster, and is completely faithful to Unicode.

18.   Office 9 IME support (MSIME98). This more powerful IME capability has been factored out into an independent module (see RichEdit Architectural Improvements). Features include:
a. Reconversion - In the past, the user needed to delete the final string first and then type in a new string to get to the correct candidate. This feature enables the user to convert the final string back to composition mode, allowing easy selection of a different candidate string.
b. Document feed - This feature provides IME98 with the text for the current paragraph, which helps IME98 to do more accurate conversion during typing.
c. Mouse Operation - This feature allows the user to have better control over the candidate and UI windows during typing.
d. Caret position - This feature provides the current caret and line information, which IME98 uses to position UI windows (e.g., candidate list).

19.  AIMM support. Users can invoke the IE/AIMM object, which enables users to enter Far East characters on US systems (NT4.0 & Win95).

20.  More RTF round tripping.

21.  Improved 1.0 compatibility mode, e.g., MBCS to/from Unicode character-position (cp) mappings. Is being used to emulate RE 1.0 in NT 5.

22.  Increased Freeze Control. The display can be frozen over multiple API calls and then unfrozen to display the updates.

23.  Increased Undo Control. Undo can be suspended and resumed (needed for IME).

24.   Increase/Decrease Font Size. Increases or decreases font size to one of six standard values (12, 28, 36, 48, 72, 80 pts).
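
Item 1 describes the zoom factor as a ratio of two longs. A ratio can express factors such as 2/3 exactly, without floating point. The sketch below shows how a length might be scaled by such a ratio; the `Zoom` struct and `apply_zoom` function are hypothetical and are not part of the RichEdit API.

```cpp
// Hypothetical representation of a zoom factor as numerator/denominator.
struct Zoom {
    long numerator;
    long denominator;
};

// Scale a non-negative length by the zoom ratio, rounding to the nearest unit.
// Returns the input unchanged if the ratio is not a positive fraction.
long apply_zoom(long length, Zoom z) {
    if (z.numerator <= 0 || z.denominator <= 0) return length;
    // Widen before multiplying so large lengths do not overflow a 32-bit long.
    long long scaled = static_cast<long long>(length) * z.numerator;
    return static_cast<long>((scaled + z.denominator / 2) / z.denominator);
}
```

For example, `apply_zoom(240, Zoom{3, 2})` scales 240 units by 150% and yields 360.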

RichEdit 3.0 Architectural Improvements

1.     Input module: IME has been factored out into separate generally usable input module that supports the latest Office 9 IMEs.  RichEdit 3.0 itself knows nothing of IMEs!  In principle other IME clients can use this input module.  Did need to add some methods to RichEdit’s object model (the approach is discussed in a separate section).

2.     Virtual Win32 Environment: OS-dependent calls have been separated out into a class of their own. RE 3.0 works in a virtual Win32 with some multilingual enhancements. Most calls are static, so no runtime overhead is encountered.  Facilitates building RichEdit with different OSs, e.g., Windows CE.

3.     Factored Rich Text status: allows aspects of rich text to be used with plain-text semantics.  E.g., multiple fonts, coloring, and underlining.  Useful for font binding and IME highlighting. Plain text UI remains the same, so EM_SETCHARFORMAT and EM_SETPARAFORMAT apply to whole control.

4.      Dual Line Methods. Lines can be broken, queried, and displayed with or without LineServices. Simple text can be handled with small instance size and higher speed. More sophisticated text can use the elegant LineServices component.

RichEdit 3.0 Performance Improvements and Maintenance

1.     Many performance/size improvements.
a) reduced size of (to 1/3) and generalized internal versions of RichEdit 2.0’s character and paragraph formatting structures (CHARFORMAT2 and PARAFORMAT2).  Easy to add properties to these important structures, although the additions typically won't be available to the message interface.
b) reduced size of many other structures as well.
c) declared constant data structures const, so that they are included in the code segment and are shared by all active processes.
d) reduced the number of system calls by more caching of frequently used data
e) eliminated redundant code.

2.     Faster startup time: most initialization is postponed to the creation of the first control.  C runtime is no longer needed.

3.     Cleaned up code base. Used the same notation (Hungarian, etc.) for local variable names throughout. Added many new comments and improved many old comments.  Counts are now LONGs rather than the nefarious DWORDs, which might be described as “wishful thinking”! Eliminated evolutionary dead code. Simplified C++ model: no more multiple inheritance and almost no operator overloading (except for new and assignment).

4.      Numerous bug fixes. Eliminated some memory leaks and reference counting errors. Fixed various bugs postponed from RichEdit 2.x.
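
The remark in item 3 about counts being LONGs rather than DWORDs can be read as a comment on unsigned wrap-around. That is one plausible interpretation, not something the text spells out. In the sketch below, `DWORD_like` and `LONG_like` are stand-in typedefs so the code compiles without Windows headers, and both function names are hypothetical.

```cpp
#include <cstdint>

// Stand-ins for the Win32 typedefs so the sketch compiles anywhere:
// DWORD is a 32-bit unsigned integer and LONG a 32-bit signed one.
typedef std::uint32_t DWORD_like;
typedef std::int32_t  LONG_like;

// Characters remaining after removing `removed` from `count`, unsigned version.
// If removed > count, the result wraps around to a huge positive value.
DWORD_like remaining_unsigned(DWORD_like count, DWORD_like removed) {
    return count - removed;
}

// Signed version: the same mistake yields a negative number, which a simple
// `< 0` check or assertion can catch.
LONG_like remaining_signed(LONG_like count, LONG_like removed) {
    return count - removed;
}
```

Removing 3 characters from a count of 2 gives 4294967295 in the unsigned version and -1 in the signed one, so only the signed result is recognizable as an error.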

RichEdit 3.0 Rich System Controls

1.     System edit-control mode that emulates the OS edit controls more accurately.

2.      ListBox and ComboBox controls similar to system versions, but supporting Unicode and font binding on Win95 as well as on NT.  These controls can be made rich, opening the door to substantially more elegant dialogs.

What RichEdit 3.0 Isn't

1.     Native HTML control. There are HTML ↔ RTF converters that can be used with RichEdit. There’s the Trident control, which is substantially bigger.... We have a prototype for direct HTML I/O that uses the TOM interfaces, but it hasn’t been tested adequately for general use. This prototype only roundtrips HTML that RichEdit understands.

2.     Active X control. We have a prototype RichEdit Active X control (ATL), but it too hasn’t undergone testing. Note there is a RichEdit 1.0 Active X control and in the future there may be a VB control based on RichEdit 3.0.

3.     MFC RichEdit class. Note there is a RichEdit 1.0 MFC class.

4.     Multistory editor (like Word).  Each RichEdit instance corresponds to a single story.  Word has many stories, e.g., body text, header, footer, footnote, textbox.  A RichEdit instance can be used for any one of those, but to handle more, you need one instance for each story.

RichEdit Clients

RichEdit Client                            Version

Office 97 SDM                              2.x
Office 9 SDM (3.0)                         3.0
Office Binder                              2.0, 3.0
Office 9 Command Bars (3.0)                3.0
Word 97 (non-SDM dialogs)                  2.x
Default Exchange Client                    1.0
Outlook 97 body/to/from/subject/notes      2.x
Outlook 9 body/to/from/subject/notes       3.0
Pocket Word 2.0                            3.0-
WordPad (Win95)                            1.x
WordPad (Win98)                            2.0
WordPad (NT 5.0)                           3.0
MFC RichText Control                       1.0
VB RichText Control                        1.0
Forms^3 97 edit engine                     2.0
Forms^3 9 edit engine                      3.0
Layout Control Pack for IE                 2.0
FrontPage source viewer                    2.0
Windows SDK                                1.0
Project 98                                 2.0
Publisher 98                               ???
Comic Chat                                 1.0?

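How to Create a RichEdit Instance (1)

      // Loading riched20.dll registers the RichEdit20W window class.
      // LoadLibrary returns a module handle (HINSTANCE), not an HRESULT.
      HINSTANCE hRE = LoadLibrary(TEXT("RICHED20"));

      hwndRE = CreateWindow(TEXT("RichEdit20W"), TEXT(""),
                            dwStyle,
                            rc.left, rc.top,
                            rc.right - rc.left, rc.bottom - rc.top, hwndParent,
                            NULL, hinst, NULL);

      ...                // Send messages to hwndRE

      FreeLibrary(hRE);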

How to Create a RichEdit Instance (2)

A RichEdit control is based on an ITextHost object interacting with an ITextServices object.  The latter doesn’t have a window of its own. The CreateWindow() call above creates an ITextHost object, which, in turn, creates an ITextServices object.

Alternatively, you can create an ITextHost object directly that, in turn, creates as many ITextServices objects as you desire.  This is the way Forms^3 uses RichEdit for dialogs.  It’d also be a great way to make a table object, for which each cell would have its own ITextServices object.

The way to create an ITextServices object is to call the following function (it’s a bit complicated, since it allows the object to be aggregated):

STDAPI CreateTextServices(
      IUnknown *punkOuter,      // Outer unknown, may be NULL
      ITextHost *phost,         // Client's ITextHost; must be valid
      IUnknown **ppUnk);        // Private IUnknown of text services engine

For example,

      if(FAILED(CreateTextServices(NULL, this, &pUnk)))
            return FALSE;
      hr = pUnk->QueryInterface(IID_ITextServices, (void **)&_pserv);
      pUnk->Release();

You can then use the _pserv pointer to call any ITextServices method, including TxSendMessage(), which is a faster way to send messages to the control than the system SendMessage().  But be warned: CreateWindow() and the usual message interface are substantially easier to implement, since you don’t have to create an ITextHost object. As shown below, if all you want to do is use some ITextServices methods, you can get an ITextServices interface to a control created by CreateWindow().

How to use RichEdit

There are five main ways to use a RichEdit 2.x or 3.0 control:

 

1.     Messages

2.     ITextServices methods

3.     Keyboard input including cut/copy/paste

4.     File read/write (plain text or RTF)

5.     TOM (Text Object Model) methods

 

The most familiar ways (messages and keyboard) are useful, but may not have the performance or functionality that you need.  We describe each of these approaches in the remainder of this talk.

For ordinary keyboard input (not IME), RichEdit acts very similarly to Word.  Word has more hot keys, but the cursor keypad and letter/punctuation keys work essentially the same way.  Ditto for mouse operations.

RichEdit Message Interface

There are many RichEdit messages.  In addition to the system edit control messages defined in winuser.h, there are many new messages defined in richedit.h. All edit messages handled by RichEdit (specifically by ITextServices::TxSendMessage()) are listed below.  System edit and RichEdit 1.0 messages are defined in the system SDK.  RichEdit 2.0 and 3.0 messages aren’t documented in my copy of the SDK, but should be documented on http://richedit sometime soon, and in the SDK sometime later.  Note that a number of RichEdit 1.0 messages have been generalized in later versions.  E.g., EM_STREAMIN/OUT take an optional codepage value (which can be 1200, i.e., Unicode, or CP_UTF8, i.e., UTF-8).  RichEdit only understands enough about IME messages to know to invoke the IME input module (see Input Module). Hence not all IME messages are listed below.
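
To make the two code page values mentioned above concrete, the sketch below serializes the same text once as code page 1200 (UTF-16, little-endian) and once as CP_UTF8 (65001). It only shows what the two encodings produce; it does not call RichEdit or any Windows API, and `encode_for_codepage` and the two constants are hypothetical names introduced for the example.

```cpp
#include <string>

// Code page identifiers as used by Win32: 1200 is UTF-16LE, 65001 is CP_UTF8.
const unsigned kCodePageUtf16 = 1200;
const unsigned kCodePageUtf8  = 65001;

// Hypothetical helper: serialize Unicode code points to bytes for one of the
// two code pages above. Returns an empty string for any other code page.
std::string encode_for_codepage(const std::u32string& text, unsigned codepage) {
    std::string out;
    for (char32_t cp : text) {
        if (codepage == kCodePageUtf8) {
            if (cp < 0x80) {                       // 1 byte: ASCII
                out += static_cast<char>(cp);
            } else if (cp < 0x800) {               // 2 bytes
                out += static_cast<char>(0xC0 | (cp >> 6));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {             // 3 bytes
                out += static_cast<char>(0xE0 | (cp >> 12));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else {                               // 4 bytes
                out += static_cast<char>(0xF0 | (cp >> 18));
                out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            }
        } else if (codepage == kCodePageUtf16) {
            if (cp < 0x10000) {                    // one 16-bit unit, low byte first
                out += static_cast<char>(cp & 0xFF);
                out += static_cast<char>(cp >> 8);
            } else {                               // surrogate pair
                char32_t v  = cp - 0x10000;
                char32_t hi = 0xD800 | (v >> 10);
                char32_t lo = 0xDC00 | (v & 0x3FF);
                out += static_cast<char>(hi & 0xFF);
                out += static_cast<char>(hi >> 8);
                out += static_cast<char>(lo & 0xFF);
                out += static_cast<char>(lo >> 8);
            }
        }
    }
    return out;
}
```

For mostly-ASCII text the UTF-8 form is the shorter of the two: the two-character string "a" followed by U+00E9 takes 3 bytes as UTF-8 and 4 bytes as UTF-16.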

 

System edit control messages not handled by RichEdit

EM_GETHANDLE EM_SETHANDLE

EM_FMTLINES     EM_SETTABSTOPS

WM_GETFONT

 

System edit control messages handled by RichEdit

EM_GETFIRSTVISIBLELINE    EM_GETLINE

EM_GETLINECOUNT               EM_GETMODIFY

EM_GETSEL        EM_GETTHUMB

EM_GETWORDBREAKPROC  EM_LIMITTEXT

EM_LINEFROMCHAR               EM_LINEINDEX

EM_LINELENGTH                     EM_LINESCROLL

EM_REPLACESEL                    EM_SCROLL

EM_SETMODIFY  EM_SETSEL

EM_SETTARGETDEVICE         EM_SETWORDBREAKPROC

EM_UNDO

 

WM_CHAR           WM_CLEAR

WM_CONTEXTMENU               WM_COPY

WM_CUT              WM_DESTROYCLIPBOARD

WM_DROPFILES WM_ERASEBKGND

WM_GETTEXT     WM_GETTEXTLENGTH

WM_HSCROLL     WM_IME_CHAR

WM_INPUTLANGCHANGE       WM_INPUTLANGCHANGEREQUEST

WM_KEYDOWN   WM_KEYUP

WM_KILLFOCUS  WM_LBUTTONDBLCLK

WM_LBUTTONDOWN              WM_LBUTTONUP

WM_MBUTTONDBLCLK           WM_MBUTTONDOWN

WM_MBUTTONUP                    WM_MOUSEACTIVATE

WM_MOUSEMOVE                   WM_MOUSEWHEEL

WM_NCMBUTTONDOWN        WM_PASTE

WM_RBUTTONDBLCLK           WM_RBUTTONDOWN

WM_RBUTTONUP                    WM_RENDERALLFORMATS

WM_RENDERFORMAT            WM_SETFOCUS

WM_SETFONT     WM_SETTEXT

WM_SETTINGCHANGE            WM_SIZE

WM_SYSCHAR    WM_SYSCOLORCHANGE

WM_SYSKEYDOWN                 WM_TIMER

WM_UNDO           WM_VSCROLL

 

RichEdit 1.0 messages

EM_CANPASTE   EM_CHARFROMPOS                             

EM_DISPLAYBAND                  EM_EXGETSEL                                          

EM_EXLIMITTEXT                    EM_EXLINEFROMCHAR               

EM_EXSETSEL    EM_FINDTEXT                                        

EM_FINDTEXTEX                     EM_FINDWORDBREAK                 

EM_FORMATRANGE                EM_GETEVENTMASK

EM_GETCHARFORMAT           EM_GETLIMITTEXT                                   

EM_GETOLEINTERFACE         EM_GETOPTIONS                         

EM_GETPARAFORMAT           EM_GETSELTEXT                         

EM_GETTEXTRANGE              EM_GETWORDBREAKPROCEX  

EM_HIDESELECTION               EM_PASTESPECIAL                                  

EM_POSFROMCHAR               EM_REQUESTRESIZE                  

EM_SCROLLCARET                 EM_SELECTIONTYPE                   

EM_SETBKGNDCOLOR           EM_SETCHARFORMAT                

EM_SETEVENTMASK              EM_SETOLECALLBACK                

EM_SETOPTIONS                    EM_SETPARAFORMAT                 

EM_SETTARGETDEVICE         EM_SETWORDBREAKPROCEX   

EM_STREAMIN    EM_STREAMOUT                                   

 

RichEdit 2.0 messages

EM_SETUNDOLIMIT                 EM_REDO                                                  

EM_CANREDO     EM_GETUNDONAME                             

EM_GETREDONAME               EM_STOPGROUPTYPING            

EM_SETTEXTMODE                 EM_GETTEXTMODE                                 

EM_AUTOURLDETECT            EM_GETAUTOURLDETECT                      

EM_SETPALETTE                     EM_GETTEXTEX                           

EM_GETTEXTLENGTHEX        EM_SHOWSCROLLBAR                

EM_FINDTEXTW  EM_FINDTEXTEXW                                

 

Far East specific messages (some are RE 1.0)

EM_GETPUNCTUATION          EM_SETPUNCTUATION    

EM_GETWORDWRAPMODE   EM_SETWORDWRAPMODE                     

EM_GETIMECOLOR                 EM_SETIMECOLOR                                  

EM_GETIMEOPTIONS              EM_SETIMEOPTIONS                   

EM_GETLANGOPTIONS          EM_SETLANGOPTIONS                

EM_CONVPOSITION                EM_GETIMECOMPMODE             

 

RichEdit 3.0 messages

FE messages

EM_GETIMEMODEBIAS           EM_SETIMEMODEBIAS                

EM_RECONVERSION                                  

 

BiDi specific messages

EM_GETBIDIOPTIONS             EM_SETBIDIOPTIONS                  

 

Extended edit style specific messages

 EM_GETEDITSTYLE                EM_SETEDITSTYLE                                  

 

Outline view message

EM_OUTLINE

 

Message for getting and restoring scroll pos

EM_GETSCROLLPOS              EM_SETSCROLLPOS

 

Zoom and increment/decrement fontsize

EM_GETZOOM    EM_SETZOOM

EM_SETFONTSIZE

 

LineServices messages

EM_GETTYPOGRAPHYOPTIONS    EM_SETTYPOGRAPHYOPTIONS   

 

RichEdit RTF

The RTF control words recognized by RichEdit are given below. Not all of these control words are fully implemented, but almost all are round tripped.

 

adeff, animtext, ansi, ansicpg, b, bgbdiag, bgcross, bgdcross, bgdkbdiag, bgdkcross, bgdkdcross, bgdkfdiag, bgdkhoriz, bgdkvert, bgfdiag, bghoriz, bgvert, bin, blue, box, brdrb, brdrbar, brdrbtw, brdrcf, brdrdash, brdrdashsm, brdrdb, brdrdot, brdrhair, brdrl, brdrr, brdrs, brdrsh, brdrt, brdrth, brdrtriple, brdrw, brsp, bullet, caps, cbpat, cell, cellx, cf, cfpat, clbrdrb, clbrdrl, clbrdrr, clbrdrt, collapsed, colortbl, cpg, cs, deff, deflang, deflangfe, deftab, deleted, dibitmap, disabled, dn, embo, emdash, emspace, endash, enspace, emdash, expndtw, f, fbidi, fchars, fcharset, fdecor, fi, field, fldinst, fldrslt, fmodern, fname, fnil, fonttbl, footer, footerf, footerl, footerr, footnote, fprq, froman, fs, fscript, fswiss, ftech, ftncn, ftnsep, ftnsepc, green, header, headerf, headerl, headerr, highlight, hyphpar, i, impr, info, intbl, keep, keepn, kerning, lang, lchars, ldblquote, li, line, lnkd, lquote, ltrch, ltrdoc, ltrmark, ltrpar, macpict, noline, nosupersub, nowidctlpar, objattph, objautlink, objclass, objcropb, objcropl, objcropr, objcropt, objdata, object, objemb, objh, objicemb, objlink, objname, objpub, objscalex, objscaley, objsetsize, objsub, objw, outl, page, pagebb, par, pard, piccropb, piccropl, piccropr, piccropt, pich, pichgoal, picscalex, picscaley, pict, picw, picwgoal, plain, pmmetafile, pn, pndec, pnindent, pnlcltr, pnlcrm, pnlvlblt, pnlvlbody, pnlvlcont, pnqc, pnqr, pnstart, pntext, pntxta, pntxtb, pnucltr, pnucrm, protect, pwd, qc, qj, ql, qr, rdblquote, red, result, revauth, revised, ri, row, rquote, rtf, rtlch, rtldoc, rtlmark, rtlpar, s, sa, sb, sbys, scaps, sect, sectd, shad, shading, sl, slmult, strike, stylesheet, sub, super, tab, tb, tc, tldot, tleq, tlhyph, tlth, tlul, tqc, tqdec, tqr, trbrdrb, trbrdrl, trbrdrr, trbrdrt, trgaph, trleft, trowd, trqc, trqr, tx, u, uc, ul, uld, uldash, uldashd, uldashdd, uldb, ulhair, ulnone, ulth, ulw, ulwave, up, utf, v, viewkind, viewscale, wbitmap, wbmbitspixel, wbmplanes, wbmwidthbytes, wmetafile, 
xe, zwj, zwnj.

ITextServices Windowless Interface

As described above, you can get an ITextServices interface using CreateTextServices(), but this requires that you implement your own ITextHost object.  If you use CreateWindow() instead, you can still use ITextServices methods by using the following code:

 

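      IUnknown *punk = NULL;

      SendMessage(hedit, EM_GETOLEINTERFACE, 0, (LPARAM)&punk);
      if(punk)
      {
            // Ask the control's OLE object for its ITextServices interface
            hr = punk->QueryInterface(IID_ITextServices, (void **)&_pserv);
            punk->Release();
            if(hr == NOERROR)
            {
                  ....               // Use _pserv methods
                  _pserv->Release();
            }
      }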

 

All ITextServices methods are typed simply as HRESULT. This differs from standard COM interface functions, which are typed HRESULT STDMETHODCALLTYPE. The methods are:

 

TxSendMessage(msg, wparam, lparam, plresult)

TxDraw(dwDrawAspect, lindex, pvAspect,ptd, hdcDraw,

                hicTargetDev, lprcBounds, lprcWBounds, lprcUpdate,

                pfnContinue, dwContinue, lViewId)

TxGetHScroll(plMin, plMax, plPos, plPage, pfEnabled)

TxGetVScroll(plMin, plMax, plPos, plPage, pfEnabled)

OnTxSetCursor(dwDrawAspect, lindex, pvAspect, ptd,

                hdcDraw, hicTargetDev, lprcClient, x, y)

TxQueryHitPoint(dwDrawAspect, lindex, pvAspect, ptd,

                hdcDraw, hicTargetDev, lprcClient, x, y, pHitResult)

OnTxInPlaceActivate(prcClient)

OnTxInPlaceDeactivate()

OnTxUIActivate()

OnTxUIDeactivate()

TxGetText(pbstrText)

TxSetText(pszText)

TxGetCurTargetX(pX)

TxGetBaseLinePos(pPos)

TxGetNaturalSize(dwAspect, hdcDraw, hicTargetDev, ptd, dwMode,

                psizelExtent, pwidth, pheight)

TxGetDropTarget(ppDropTarget)

OnTxPropertyBitsChange(dwMask, dwBits)

TxGetCachedSize(pdwWidth, pdwHeight)

Getting to the TOM Interfaces

 

// Skeleton function to manipulate text using TOM ITextRange interface
// (EM_GETOLEINTERFACE is defined in richedit.h, the TOM interfaces in tom.h)
HRESULT Manipulate(HWND hedit)
{
         IUnknown *      punk = NULL;
         ITextDocument * pdoc = NULL;
         ITextRange *    prg  = NULL;

         // punk stays NULL if the control doesn't handle the message
         SendMessage(hedit, EM_GETOLEINTERFACE, 0, (LPARAM)&punk);
         if(punk)
         {
                  HRESULT hr;
                  hr = punk->QueryInterface(IID_ITextDocument, (void **)&pdoc);
                  if(pdoc)
                  {
                        hr = pdoc->Range(0, 0, &prg);   // Empty range at start of document
                        if(prg)
                        {
                                    ...

                                    prg->Release();
                        }
                        pdoc->Release();
                  }
                  punk->Release();
                  return hr;
         }
         return E_NOINTERFACE;
}

 

Font Binding

RichEdit 3.0 will assign a charset to plain-text characters depending on their context.  E.g., Hangul symbols get HANGUL_CHARSET, nonneutral ANSI characters get ANSI_CHARSET in any event, Chinese characters get SHIFTJIS_CHARSET if kana characters are found nearby and GB2312_CHARSET if no kana are found nearby.  Greek characters get GREEK_CHARSET, etc.  Note that we’re using Unicode internally, so this use of charset differs from the original one used in font specifications.  But charset seems to be a pretty good match with what we want, which is a script, and our CHARFORMAT has a well-defined place for the charset.  It also helps with some anomalies in Win95, where we can't always use Unicode. Neutral characters like blanks and digits get assigned a charset depending on their context.  For example, a blank surrounded by characters of the same charset gets that charset.  More generally neutrals/digits for BiDi text are assigned charsets in a way based on the Unicode BiDi algorithm.  Once charsets are assigned, we scan the text around the insertion point forward and backward to find the nearest fonts that have been used for the charsets.  If no font is found for a charset, we use the font chosen by the client for that charset.  If the client hasn’t specified a font for the charset, we use the default Office 9 font for that charset. If the client wants some other font, it can always change it, but the hope is that this approach will work most of the time.  Our current default font choices are based on the following table:

 

CodePage | Languages | Font facename | Size
125x | Western, CE, ME... | Times New Roman | 10
932 | Japanese | MS Mincho | 10.5
949 | Korean | Batang | 10
936 | Simplified Chinese | MS Song | 10
950 | Traditional Chinese | New MingLiU | 10
874 | Thai | Tahoma | 8

 

Hence in our default font-binding table (entries have charset, facename, size), we allow ANSI_CHARSET to match all 8 125x charsets, while the appropriate charset matches other fonts on a one-to-one basis.  More precisely, we use the ANSI_CHARSET choice whenever no other alternative is found.  The client will be able to specify a finer granularity than this, e.g., assign a specific ARABIC_CHARSET for Arabic runs, a specific Greek font for Greek runs, etc.  This finer granularity will also be used if a font with the desired charset stamp is found somewhere in the document before the area being font-bound.
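To make the fallback rule concrete, here is a minimal sketch of such a table lookup. It is not RichEdit's code: the type and function names (FontBinding, LookupDefaultFont, CharsetForHan) are invented for illustration, the charset constants are local stand-ins that carry the same values as the GDI ones so the sketch compiles without windows.h, and “nearby” is simplified to “anywhere in the context string passed in”.

```cpp
#include <cassert>
#include <string>

// Local stand-ins for the GDI charset constants (same values as in wingdi.h).
enum Charset : unsigned char {
    kAnsi = 0, kShiftJis = 128, kHangul = 129, kGb2312 = 134,
    kChineseBig5 = 136, kGreek = 161, kThai = 222
};

// One entry of a hypothetical default font-binding table: charset, facename, size.
struct FontBinding {
    Charset     charset;
    std::string facename;
    double      pointSize;
};

// Defaults copied from the table above; the ANSI entry doubles as the fallback.
static const FontBinding kDefaultBindings[] = {
    { kAnsi,        "Times New Roman", 10.0 },
    { kShiftJis,    "MS Mincho",       10.5 },
    { kHangul,      "Batang",          10.0 },
    { kGb2312,      "MS Song",         10.0 },
    { kChineseBig5, "New MingLiU",     10.0 },
    { kThai,        "Tahoma",           8.0 },
};

// Return the binding for a charset, or the ANSI_CHARSET entry when no exact
// match exists (which is what happens for 125x charsets such as Greek).
const FontBinding &LookupDefaultFont(Charset cs)
{
    for (const FontBinding &b : kDefaultBindings)
        if (b.charset == cs)
            return b;
    assert(kDefaultBindings[0].charset == kAnsi);
    return kDefaultBindings[0];
}

// Hiragana and Katakana blocks, U+3040 to U+30FF.
bool IsKana(char32_t ch) { return ch >= 0x3040 && ch <= 0x30FF; }

// Charset for a Chinese character given its surrounding text: SHIFTJIS if any
// kana is present in the context, GB2312 otherwise.
Charset CharsetForHan(const std::u32string &context)
{
    for (char32_t ch : context)
        if (IsKana(ch))
            return kShiftJis;
    return kGb2312;
}
```

With these definitions, LookupDefaultFont(kGreek) returns the Times New Roman entry, while LookupDefaultFont(kShiftJis) returns MS Mincho at 10.5 points. A client-supplied table with finer granularity would simply be searched before this one.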

 

 

Categories: Technology

XSS in some WordPress themes

Otro blog mas - Tue, 2010/01/12 - 23:27

A couple of days ago a coworker warned me that the Mosaic website, for which I act “more or less” as technical lead, had an XSS (code injection) problem in its search form.

Alarmed, I quickly updated WordPress to version 2.9.1, but that did not solve the problem. The test was easy: entering this simple script in the search form

<script>alert("hola");</script>

opened an alert dialog box.

Today, with more time, I set about investigating. The error occurs only in some WordPress blogs, not in all of them, so it is not a problem with the content management system itself.

After some tests and a few changes, the cause turned up. It is a problem in some WordPress themes and is very easy to fix. In the search form of the vulnerable themes we can see something like this:

<label for="s"><input type="text" name="s" id="s" size="50" maxlength="200" value="<?php echo get_search_query(); ?>" /></label>

The problem is the echo in the PHP code. Removing it removes the problem. Easy :)

Update: As Javier and Oscar point out in the comments, the problem is not so much the echo (which displays the search string) as the fact that the output of get_search_query() is not properly filtered.

So, as Javier proposes, rather than removing the echo, the more elegant solution is <?php echo htmlentities(get_search_query()); ?>

Categories: Planet JEM

Special Capabilities of a Math Font

Murray Sargent: Math in Office - Tue, 2010/01/12 - 04:31

A fairly common inquiry is how a program can use and access the many special glyph variants of a math font. It’s clearly a much more intricate interaction than encountered in most text applications. This post outlines how the Office math layout software interacts with the Cambria Math font and, in principle, with any other math font that has similar capabilities. More specifically, this post describes the functionality of the special library, mathfont.dll, which is shipped with Office 2007/2010. This library, in turn, interacts with the OpenType and OpenType-like tables in a math font.

Cambria Math and the math tables were developed together with the Office 2007 math software, each influencing the other to obtain high quality results. Some history is given in the post High-Quality Editing and Display of Mathematical Text in Office 2007. The font contains extensive math tables, glyph variants and glyphs for much of the Unicode math character set. It was designed with ClearType and excellent screen readability in mind and enables the best screen-resolution display of math text available today.

 

The specialized math tables include values that control glyph placements in math zones. Many math constants are defined to handle displacements such as axis height, fraction rule thickness, etc. The math tables are formalized as OpenType tables, although they are not yet part of the OpenType standard. Refinements include entries for positioning subscripts/superscripts horizontally using cut-ins and italic corrections. The cut-in tables allow automatic positioning of subscripts and superscripts horizontally better than un-tweaked TeX. Math characters have four cut-in values, one for each corner, allowing sub/superscripts to be kerned with their bases. Other table entries give larger glyph variants for operators like the integral sign, square root, and stretchy characters such as brackets and arrows.

 

The math tables are organized as a hierarchy accessed via the OpenType ID “MATH”. The names of the tables in the hierarchy are MathConstants, MathGlyphInfo, MathItalicsCorrectionInfo, MathTopAccentAttachment, ExtendedShapeCoverage, MathKernInfo, MathKern, MathVariants, MathGlyphConstruction, and GlyphAssembly. The MathConstants table includes parameters such as the em-size-dependent sub/superscript values

 

LONG lSubscriptShiftDown;

LONG lSubscriptTopMax;

LONG lSubscriptBottomDropMin;

LONG lSuperscriptShiftUp;

LONG lSuperscriptShiftUpCramped;

LONG lSuperscriptBottomMin;

LONG lSuperscriptTopRiseMin;

LONG lSubSuperscriptMinGap;

LONG lSuperscriptBottomMaxWithSubscript;

LONG lSpaceAfterScript;

 

Cambria Math contains full sets of glyph variants that have heavier weights so that when scaled down to the script and scriptscript levels the stem widths match those of the text-level glyphs. The prime (U+2032) and multiple prime characters need to be superscripted and scaled down accordingly. The dotless i and j glyph variants are used in the bases of accent objects. Accents over larger bases are given by special flattened and/or widened glyph variants.

 

Brackets, braces, parentheses and other stretchy characters have a number of larger glyph variants as well as arbitrarily large size created using glyph assemblies. When the assemblies are displayed, the pieces are clipped to prevent overlap, since overlaps create ClearType artifacts.

 

One choice not handled by the math font tables is that for the italic open-face characters 0x2145 - 0x2149 (differential D, d, and e, i, j).  According to a document setting, software can display these characters as themselves (useful for patent applications) or with the corresponding math italic or corresponding ASCII letters. Serif italic glyphs are used for these in most math publications, but serif upright glyphs are used in some European math publications and math calculation engines. The use of the differential d (U+2146) automatically introduces a small space between it and the preceding character if that character is alphabetic.

 

An OpenType table or feature is identified by a 32-bit constant equal to the contents of a four-byte string read as a little-endian integer. For example, the “MATH” table is identified by the value 0x4854414D. In C/C++, you can use the macro

 

#define MakeTag(a, b, c, d)   (((d)<<24) | ((c)<<16) | ((b)<<8) | (a))

#define tagMATH   MakeTag('M','A','T','H')

 

to create such IDs if you don’t want to type the ASCII values of the letters directly. Note that these IDs are case sensitive. In particular, “MATH” identifies the overall math table hierarchy, and “math” identifies the math script, which is used for math glyph-variant features such as subscripts, superscripts, and dotless i's.
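As a quick check of the tag arithmetic, the following sketch rebuilds the macro as a constexpr function and verifies the value quoted above at compile time. The names make_tag, tag_to_string, tagmath and check_tags are my own for this illustration; they are not part of any Office or OpenType header.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Pack four ASCII characters into a 32-bit tag with the first character in the
// low byte, i.e. the value obtained by reading the bytes "MATH" as a
// little-endian integer. Equivalent to the MakeTag macro.
constexpr std::uint32_t make_tag(char a, char b, char c, char d)
{
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8)  |
            static_cast<std::uint32_t>(static_cast<unsigned char>(a));
}

// 'M' = 0x4D, 'A' = 0x41, 'T' = 0x54, 'H' = 0x48
static_assert(make_tag('M', 'A', 'T', 'H') == 0x4854414Du, "MATH tag value");

constexpr std::uint32_t tagMATH = make_tag('M', 'A', 'T', 'H'); // table hierarchy
constexpr std::uint32_t tagmath = make_tag('m', 'a', 't', 'h'); // math script

// Recover the four characters from a tag, low byte first.
std::string tag_to_string(std::uint32_t tag)
{
    std::string s(4, ' ');
    for (int i = 0; i < 4; ++i)
        s[i] = static_cast<char>((tag >> (8 * i)) & 0xFF);
    return s;
}

// Small self-check that can be called from any main().
void check_tags()
{
    assert(tag_to_string(tagMATH) == "MATH");
    assert(tag_to_string(tagmath) == "math");
    assert(tagMATH != tagmath);   // the IDs are case sensitive
}
```

A constexpr function gives the same constant as the macro but adds type checking, and the static_assert catches a byte-order mistake at compile time rather than at font-load time.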

mathfont.dll functions

 

The following table describes the functions exported by mathfont.dll. All functions return an HRESULT. Some entries in the table refer to the “current font metrics”. These metrics depend on the font height (point size), the script level (0 for text size, 1 for script size, and 2 for scriptscript size or higher-level nestings), and the device mode (reference or presentation).

mathfont.dll function | Purpose | OpenType table used
GetMathConstants | Get pointer to math constants | MATH
GetMathGlyphItalicsCorrection | Get italic correction for a glyph at current font metrics | MATH
GetMathGlyphTopAccentAttachment | Get top accent attachment displacement for a glyph at current font metrics | MATH
GetMathGlyphIsExtendedShape | In [left]sub/sup math objects, determine whether adjacent base glyph is extended, i.e., stretched vertically | MATH
GetMathGlyphKerning | Get kerning for a given corner and height of a glyph at current font metrics | MATH
GetMathGlyphVariant | Get possibly stretched glyph variant or set of glyphs for a glyph of desired size at current font metrics | MATH
GetMathGlyphVariantItalicsCorrection | Get italic correction for a vertically stretched glyph (or set of glyphs) at current font metrics | MATH
GetMathGlyphScriptShape | Get glyph variant for script or scriptscript size (use “ssty” feature for “math” script and “dflt” language) | GDEF, GSUB
GetMathGlyphDotlessForm | Get dotless glyph variant (for i or j like glyphs) (use “dtls” feature for “math” script and “dflt” language) | GDEF, GSUB
GetMathGlyphAccentFlattenedShape | Get flattened accent glyph variant if base height exceeds x height (use “flac” feature for “math” script and “dflt” language) | GDEF, GSUB
GetMathFontTextMetrics | If font is a math font, get math font text ascent, descent, and linegap at current font metrics | OS/2

Right to Left Math Zone Considerations

 

Right-to-left math requires mirroring the images of parentheses, integrals, square roots, arrows, etc. Many such mirror images can be obtained by using corresponding Unicode characters. For example the mirror image of a left parenthesis is a right parenthesis and vice versa. Such glyph variants are automatically returned by the Uniscribe function ShapeString() if SCRIPT_ANALYSIS::fRtl = TRUE. But Unicode doesn’t have many characters that are mirror images of other characters, such as integral signs and square roots. Furthermore it seems that using glyph variants for these characters makes more sense than adding characters to serve as the mirror images. Other approaches include using world transforms and mirrored bitmaps. But these approaches don’t solve the problem that the right-to-left character desired sometimes isn’t a perfect mirror image, e.g., the contour integral.

 

In principle (and in a prototype I’m working on), the glyph variant approach works by following the ShapeString() call with a call to Uniscribe’s ScriptSubstituteSingleGlyph() specifying tagScript as "math", tagLangSys as "dflt", and tagFeature as "rtlm". Here "math" identifies the script as math, "dflt" specifies the default language, and "rtlm" requests right-to-left mirroring. If no such special mirrored glyph exists, the call does nothing. In particular, if the appropriate mirrored glyph is given by a Unicode character, the call does nothing, so the ShapeString() call can be followed by the ScriptSubstituteSingleGlyph() call and never result in “double mirroring”.
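The reason the two calls compose without double mirroring can be seen in a toy model. The sketch below does not call Uniscribe at all: the glyph IDs, the two lookup tables, and the function names (ShapeRtl, SubstituteRtlm, MirrorForRtlMath) are hypothetical stand-ins for the shaping step and for the "rtlm" single-glyph substitution, under the assumption that a font's "rtlm" feature lists only glyphs that have no Unicode mirror partner.

```cpp
#include <cassert>
#include <map>

using Glyph = unsigned;   // hypothetical glyph ID type

// Hypothetical glyph IDs in a toy math font.
enum : Glyph { gLeftParen = 1, gRightParen = 2, gIntegral = 3, gIntegralRtl = 4 };

// Stand-in for the shaping step with the right-to-left flag set: characters
// that have a Unicode mirror partner come back as the partner's glyph, and
// everything else comes back unchanged.
Glyph ShapeRtl(Glyph g)
{
    static const std::map<Glyph, Glyph> unicodeMirror = {
        { gLeftParen, gRightParen }, { gRightParen, gLeftParen } };
    auto it = unicodeMirror.find(g);
    return it == unicodeMirror.end() ? g : it->second;
}

// Stand-in for the "rtlm" substitution: only glyphs without a Unicode mirror
// partner have an entry, so for all other glyphs the call does nothing.
Glyph SubstituteRtlm(Glyph g)
{
    static const std::map<Glyph, Glyph> rtlm = { { gIntegral, gIntegralRtl } };
    auto it = rtlm.find(g);
    return it == rtlm.end() ? g : it->second;
}

// The two steps in sequence, as described in the text.
Glyph MirrorForRtlMath(Glyph g) { return SubstituteRtlm(ShapeRtl(g)); }

// Small self-check that can be called from any main().
void check_mirroring()
{
    assert(MirrorForRtlMath(gLeftParen) == gRightParen);  // mirrored once, by Unicode
    assert(MirrorForRtlMath(gIntegral) == gIntegralRtl);  // mirrored once, by rtlm
}
```

Because the parenthesis glyphs have no "rtlm" entry in this model, the second step leaves them alone, and because the integral has no Unicode partner, the first step leaves it alone. Each glyph is therefore mirrored exactly once.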

 

If you want a complete specification of the math tables, please email me. Hopefully someday the specification will be available as part of the official OpenType standard. The mathfont.dll code was written by Sergey Malkin.

Categories: Technology
