Wednesday, 7 December 2011

Robert Miner

Robert Miner, my friend and colleague on the W3C Math Working Group, died yesterday from liver cancer.

Robert has chaired or co-chaired the group since it started, and has been active in developing standards for mathematics on the web at the W3C since at least 1996. He will be sorely missed.

Friday, 23 September 2011

html5mathml


As some of you may have seen on Google+, with a bit of help I have MathML working in Firefox Mobile on Android. The key observation (from Karl Tomlinson) was that I needed a recent version of the font I was using (DejaVu Serif), as older ones did not have the necessary glyphs to build up “stretchy” characters such as the large brackets needed for mathematical display.

So I have added support for Firefox/Android to my javascript library for enabling mathml-in-html5 (Firefox all versions, IE+MathPlayer 6-10, Chrome, Safari, Opera are all supported to some extent).

html5mathml on googlecode

This reminded me that I never actually announced its existence, hence this posting. The JavaScript is maintained in a Google Code svn repository at html5mathml, although you might want to start with the test/example file at test1.html.

The code is all freely available under the MIT licence, except for the copy of the DejaVu Serif font, which is available under its own custom free licence, included in the distribution.

Usage

The idea is that you write a conforming html5+mathml document and just add the line:

<script src="html5mathml.js"></script>

into the head of the document, adjusting the src attribute to point to a local copy of the files downloaded from the above location. Please do use local copies: Google Code is a source repository, not intended for serving production code. Depending on the client browser, this JavaScript file may use one of two CSS files or the above-mentioned DejaVu font, all of which are available from the source repository.

Comments, fixes and experiences with more browsers are all welcome, either here, by email or on the www-math list.

Comparison with MathJax

Currently the primary JavaScript library for enabling MathML support in browsers is MathJax, so perhaps I should offer some comparison. MathJax does far more than html5mathml and, generally speaking, produces better MathML display; however, it is somewhat slower and perhaps harder to set up locally (although it now has a public CDN server distribution, which simplifies things greatly if access to the server can be assumed).

MathJax includes parsers for TeX-like syntax as well as MathML; this library assumes the input is MathML and relies on the browser to parse it (although it includes some fixup for pre-HTML5 legacy browser parsing). MathJax can be configured to do its own MathML rendering using CSS, or to use the native MathML rendering in the browser if available. html5mathml essentially just assumes that the browser has MathML support, although it does include CSS similar to that used in the CSS profile for MathML. As such it loads much more quickly and reveals the MathML capabilities of the browser.

Future Plans

I try to track browser developments as far as possible (IE10 preview + MathPlayer, for example, is supported) but this is a spare-time activity and I don't have access to all platforms, so any comments or code contributions for other platforms are always welcome. Currently the largest component is the DejaVu font; I should probably subset that to just the characters needed for stretchy symbols. IE10 will no doubt require some changes once the full version comes out, and hopefully there will be a version of Chrome using the WebKit MathML rendering (as now used in Safari). Similar techniques may be used to enable svg-in-html in legacy browsers that only support SVG in XML. An early version of this code, handling MathML and SVG on a smaller range of browsers, was discussed in a posting I made to the NAG blog last year.


Monday, 18 July 2011

slinky canvas

The NAG Blog, to which I contribute from time to time, has a permanent link to Mike Croucher's blog, Walking Randomly. I happened to notice his current article on parametric plotting. The parametric plots are demonstrated with an applet providing Mathematica generated plots and slider controls. Mathematica is a nice piece of software but I couldn't help but think that a modern web browser ought to be able to do this without help from an external application.

HTML (5) provides a canvas drawing API that ought to be able to plot a curve or two, and also built in slider form controls.

I've not used canvas before, but it turns out to be pretty trivial to use, and my attempt is shown below. If you have a canvas-enabled browser the plot should appear, looking like the plot shown in Mike Croucher's article. Below that should appear some input forms that allow you to change the parameters. These appear as slider controls if a browser understands the HTML5 markup for this (Chrome does in its dev channel version at least). In other browsers they appear as text boxes, which allow you to change the values. (Firefox (4+) and IE (9+) work this way.) The plot should be redrawn if you move the mouse off the form controls.

Actually I prefer the text entry control to the slider, as it is easier to control the value exactly. I'd hoped that the arrow keys could be used to step the slider, but as implemented in Chrome at least, that isn't the case. No doubt with a bit more scripting such keyboard shortcuts could be added, but this will do as a first attempt.




[Interactive canvas plot, followed by input controls for the parameters e, f, g, h and i, each ranging from 0 to 80.]
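
For readers who want to try something similar, here is a minimal sketch of the approach described above. It is not the code behind the plot on this page: the curve formula, the single parameter k, the element ids plot and k, and the function names sampleCurve and drawCurve are all assumptions made for illustration.

```javascript
// Sample an illustrative parametric curve. The formula is an assumption
// chosen for this sketch, not necessarily the curve plotted in the post:
//   x(t) = cos(t) - cos(k t) sin(t),   y(t) = 2 sin(t) - sin(k t)
function sampleCurve(k, steps) {
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = (2 * Math.PI * i) / steps;
    points.push({
      x: Math.cos(t) - Math.cos(k * t) * Math.sin(t),
      y: 2 * Math.sin(t) - Math.sin(k * t),
    });
  }
  return points;
}

// Draw the sampled points on a 2D canvas context. The curve always lies
// inside [-2, 2] x [-3, 3], so that box is scaled to the canvas size.
function drawCurve(ctx, width, height, points) {
  ctx.clearRect(0, 0, width, height);
  ctx.beginPath();
  points.forEach((p, i) => {
    const px = ((p.x + 2) / 4) * width;
    const py = height - ((p.y + 3) / 6) * height; // canvas y grows downwards
    if (i === 0) ctx.moveTo(px, py);
    else ctx.lineTo(px, py);
  });
  ctx.stroke();
}

// In a browser, wire the plot to an input control. The ids "plot" and "k"
// are hypothetical: <canvas id="plot"> and <input type="range" id="k">.
if (typeof document !== "undefined") {
  const canvas = document.getElementById("plot");
  const input = document.getElementById("k");
  if (canvas && input) {
    const redraw = () =>
      drawCurve(canvas.getContext("2d"), canvas.width, canvas.height,
                sampleCurve(Number(input.value), 2000));
    input.addEventListener("change", redraw);
    redraw();
  }
}
```

An input with type="range" falls back to a plain text box in browsers that do not implement the slider, which matches the behaviour described above.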

Friday, 8 July 2011

Converting to Strict Content MathML

Some OpenMath History

MathML and OpenMath have always had a shared history and more or less documented ways of converting between them. Conceptually the conversion is very simple: the OpenMath symbol abc could be expressed as the Content MathML symbol <csymbol>abc</csymbol>. Successive versions of MathML have in fact added features that made this conversion simpler and better at preserving information. MathML2 added csymbol, which is a better fit for OpenMath symbols than ci. In MathML1 and MathML2 further information about the OpenMath symbol would have to be packaged into the definitionURL attribute. In MathML3 we added explicit support for recording Content Dictionaries by adding a cd attribute to csymbol.

Although the basic idea of the transformation was simple, the details were complicated by a desire to map to the pre-defined MathML elements where available. So <OMS name="sin" cd="transc1"/> should map to the MathML element <sin/> rather than <csymbol>sin</csymbol>. The relationship between the predefined Content MathML forms and the simpler, more regular, but much more verbose OpenMath syntax was not formally specified by MathML, so it had to be specified as part of the transformation to OpenMath. An early version of such a transformation description is this ten-year-old document, still available from the OpenMath site: Conversion between MathML and OpenMath. Around the same time, the conversions were implemented in XSLT. The original versions predated XSLT 1, although the versions currently available from the OpenMath site use XSLT 2. The stylesheets are om2cmml, converting from OpenMath to MathML, and cmml2om, converting from MathML to OpenMath.
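
The mapping just described can be sketched in a few lines of JavaScript. This is only an illustration and not how om2cmml works internally: the function name omsToContentMathML, its strict option and the three-entry lookup table are assumptions, and the real correspondence between OpenMath symbols and predefined MathML elements is far larger.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
}

// A tiny illustrative subset of OpenMath symbols (keyed by "cd/name") that
// have a predefined Content MathML element. Hypothetical table for this sketch.
const PREDEFINED = new Map([
  ["transc1/sin", "sin"],
  ["transc1/cos", "cos"],
  ["arith1/plus", "plus"],
]);

// Convert one OpenMath symbol to Content MathML markup.
// By default a predefined element is used where the table has one;
// with { strict: true } the result is always a csymbol with a cd attribute.
function omsToContentMathML(cd, name, { strict = false } = {}) {
  const element = PREDEFINED.get(`${cd}/${name}`);
  if (!strict && element !== undefined) {
    return `<${element}/>`;
  }
  return `<csymbol cd="${escapeXml(cd)}">${escapeXml(name)}</csymbol>`;
}

console.log(omsToContentMathML("transc1", "sin"));
console.log(omsToContentMathML("transc1", "sin", { strict: true }));
console.log(omsToContentMathML("mycd", "abc"));
```

The strict branch shows why the cd attribute matters: without it, the content dictionary would have to be carried some other way, such as the definitionURL attribute mentioned above.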

MathML3 Strict Content MathML

The relationship between <sin/> and <csymbol>sin</csymbol> may rightfully be seen as a purely MathML issue, and not something that should be a by-product of converting to the OpenMath form <OMS name="sin"/>. So MathML3 introduced, for each of its Content layout forms, an explicit rewrite rule expressing the construct in Strict Content MathML, which is a restricted form just using csymbol. Section 4.6 of Chapter 4 of the MathML 3 spec, The Strict Content MathML Transformation, specifies an explicit multi-pass algorithm applying these rewrite rules to convert any valid Content MathML expression into an expression just using the restricted Strict Content MathML vocabulary.

Conceptually a conversion to Strict Content MathML could be made which first converted to OpenMath, then converted back to MathML using a stylesheet that removed all the special case rules originally added to om2cmml and documented in the OpenMath report referenced above. This was in fact implemented and is how the majority of the Strict Content MathML examples in Chapter 4 of the MathML spec were constructed. However, there were some choices to be made in the mapping specification, and in some cases the old OpenMath stylesheets made different choices. Where feasible I updated the stylesheets to match, but one essential structural difference remained. cmml2om implements a typical XSLT depth-first walk over the input document, applying whichever template matches at that point. However, the working group felt that the exposition of the algorithm was clearer if it was expressed as a multi-pass algorithm where each rule is applied in order over the whole tree, with the result being passed on to the next stage, to be rewritten by the next rewrite rule. These two approaches usually produce the same result, but in edge cases where the order of transformations matters they produce results that are (barring bugs) mathematically equivalent but structurally different.

As a requirement for MathML3 to proceed to W3C Recommendation status, we needed to show that the algorithm in Section 4.6 was implementable and did the right thing. It was clear that, while my conversion via OpenMath was fairly reasonable, it wasn't implementing the algorithm as specified and didn't produce the specified results in all cases.
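
The difference between the two strategies is easiest to see on a toy example. The sketch below uses two made-up rewrite rules on a tiny expression tree; they are not rules from the MathML spec, and the names node, singleWalk, onePass and multiPass are hypothetical. Each rule receives a recur function for processing the original children, in the way an XSLT template calls apply-templates.

```javascript
// Tree nodes are { op, args }; leaves are plain strings.
const node = (op, ...args) => ({ op, args });
const isNode = (t) => typeof t === "object" && t !== null;
const show = (t) => (isNode(t) ? `${t.op}(${t.args.map(show).join(", ")})` : String(t));

// Toy rule A: minus(a, b) -> plus(a, neg(b))
const minusToPlus = (t, recur) =>
  t.op === "minus" && t.args.length === 2
    ? node("plus", recur(t.args[0]), node("neg", recur(t.args[1])))
    : null;

// Toy rule B: neg(neg(x)) -> x
const dropDoubleNeg = (t, recur) =>
  t.op === "neg" && isNode(t.args[0]) && t.args[0].op === "neg"
    ? recur(t.args[0].args[0])
    : null;

// Depth-first walk: at each input node the first matching rule fires once.
// The skeleton a rule emits is not examined again by the other rules.
function singleWalk(rules, t) {
  if (!isNode(t)) return t;
  const recur = (child) => singleWalk(rules, child);
  for (const rule of rules) {
    const out = rule(t, recur);
    if (out !== null) return out;
  }
  return node(t.op, ...t.args.map(recur)); // identity for unmatched nodes
}

// One pass: apply a single rule over the whole tree.
function onePass(rule, t) {
  if (!isNode(t)) return t;
  const recur = (child) => onePass(rule, child);
  const out = rule(t, recur);
  return out !== null ? out : node(t.op, ...t.args.map(recur));
}

// Multi-pass: each rule in order rewrites the result of the previous pass.
const multiPass = (rules, t) => rules.reduce((acc, rule) => onePass(rule, acc), t);

const rules = [minusToPlus, dropDoubleNeg];
const input = node("minus", "a", node("neg", "b"));
console.log(show(singleWalk(rules, input))); // plus(a, neg(neg(b)))
console.log(show(multiPass(rules, input)));  // plus(a, b)
```

Both outputs denote the same value, but the multi-pass version gets a second look at the neg that rule A introduced, so the trees differ in structure.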

C2S Implementation

Fortunately Robert Miner (co-chair of the Math WG) stepped up and offered to implement the algorithm as specified. It was good that he did, as inevitably, implementation experience showed some gaps or inconsistencies in the first drafts of the algorithm, and the final published form is much improved as a result of this implementation.

The initial home of the new stylesheet was in the W3C member area of the W3C CVS repository. Recently Robert suggested that we make it public, and asked if I'd host it at my google code web-xslt project site, since it is there and hosts other MathML related XSLT stylesheets.

So Robert's implementation is now available (under W3C or MIT licen[cs]e) from google code: c2s.

Comments on the stylesheet are probably best addressed to the www-math mailing list, but comments may also be dropped here on this blog or on the google code wiki pages.

Wednesday, 8 June 2011

MathML for CSS Profile

I'm delighted to announce that the MathML for CSS profile has been published as a W3C Recommendation.

http://www.w3.org/TR/2011/REC-mathml-for-css-20110607/

This had a dependency on CSS 2.1 which was also published today.

This completes the three REC-track documents produced by the MathML3-era Working Group, joining:

http://www.w3.org/TR/2010/REC-MathML3-20101021/

and

http://www.w3.org/TR/2010/REC-xml-entity-names-20100401/

The MathML for CSS profile (most of the technical work for which was done by George Chavchanidze of Opera) defines how a subset of MathML may be implemented using CSS. This is useful for two reasons. First, it highlights possible places where CSS could be extended to improve its expressive power (so allowing a larger subset of MathML to be implemented). Second, it may be viewed directly as an implementation of a useful subset of MathML (and is similar to the CSS stylesheet for MathML which is used for MathML support in Opera). As usual, comments are welcome here or on the www-math@w3.org list.

Sunday, 7 November 2010

Unicode 6: XML Entities draft

Unicode 6.0 was published last month, and the proposals for Unicode 6.1 are firming up. Both of these releases have significant new characters for mathematical use, so I have updated the Editors' draft of “XML Entity Definitions for Characters”.

The main source file, unicode.xml, has been updated to contain information for all characters in Unicode 6.0, and the provisional allocations for the Arabic Math Alphabets in Unicode 6.1. There is no change to the set of entity names or the MathML or HTML DTD derived from these sources.

Although this document is styled as an editors' draft for an update to the current recommendation, there are no immediate plans to publish a formal update to the W3C recommendation. However I hope to track changes to Unicode in this editors' draft, and perhaps once the proposals to add Arabic mathematical characters to Unicode are all processed, we may try to submit this for formal review as a Proposed Edited Recommendation.

Unicode 6.0

Most of the new characters in Unicode 6.0 are not directly related to Mathematics, although the large collection of “emoji” derived from characters used in the Japanese mobile phone industry provides some interesting characters that I'm sure could be used for mathematical operators (U+1F4A9 perhaps?). However there are some specifically mathematical characters including new heavy (ultra bold) plus and minus (U+2795 and U+2796) which may find use either in display contexts or as additional operators distinct from the usual plus and minus.

Unicode 6.1 (proposals)

The mathematical alphabets (bold, fraktur, double-struck, etc.) that are in Unicode, and available as values of MathML 2's mathvariant attribute, fit well with the mathematical traditions using the Roman and Greek alphabets but don't really work with other alphabets, notably Arabic.

Azzeddine Lazrek proposed that MathML and Unicode be extended with additional math alphabets corresponding to conventions used in Arabic typeset mathematics (initial, tailed, looped, stretched). These were added to MathML in the recently finalised MathML 3.0, and the corresponding code points have been allocated in Unicode (all in the block 1EE??) and are planned to be standardised in Unicode 6.1. I have provisionally added this data to unicode.xml, and added a table showing the characters to the entities draft.

Thursday, 21 October 2010

MathML 3.0 Recommendation

I'm pleased to report that MathML 3.0 is today published as a W3C Recommendation. That is seven years to the day since the last MathML Recommendation, MathML 2.0 2nd edition, was published on 21st October 2003.

So, what have we been doing for seven years?

My Working Group colleague, Neil Soiffer, has just posted a blog entry describing some of the main features and linking to a nice summary of what is new in MathML3, so I won't list them all in detail here; however, some of the headline features are listed below.

Additions to control bi-directional layout, for Arabic styles in particular

The Arabic Note detailed some extensions to MathML 2.0 that would enable a richer variety of right-to-left layouts as used in typesetting Arabic mathematics. In MathML 3.0 this bi-directional control is fully integrated into the language, allowing the effect of each of the presentation layout elements to be specified in RTL and LTR modes.

Elementary math layouts (long division, etc)

Previously, long division, long multiplication, etc. could be typeset using table layout and a lot of spacing adjustment, but this was difficult to produce and very hard to process in an accessible way, for example by a speech renderer. MathML 3 introduces new elements (mstack and mlongdiv, principally) which enable a much more natural and accessible encoding of these layout schemes.

Linebreaking of mathematics

For the first time MathML 3 specifies control over automatic linebreaking and provides improved features for manual forced linebreaking and alignment of expressions.

Officially registered MIME Types

The MIME type application/mathml+xml has been unofficially used for some time, but it is now officially registered (along with two more MIME types specific to Presentation and Content MathML). These MIME types might prove particularly useful in systems that use MIME to label clipboard fragments.

Closer alignment between Content MathML and OpenMath

Chapter 4, describing Content MathML, has been totally rewritten to provide a direct, explicit alignment with OpenMath. This alignment was always in the background (the two languages being developed at roughly the same time, by overlapping groups of people), but making the alignment explicit, and making small changes on both the MathML and OpenMath sides, has allowed many rough edges to be removed, and I think gives a much clearer presentation of the “semantics” underlying the Content MathML elements.

RelaxNG Schema

The normative DTD used for MathML 1 and 2 is replaced in MathML 3 by a Relax NG schema. This is much more expressive than a DTD, and allows many of the constraints that previously could only be expressed in words to be built into the grammar. A DTD and an XSD schema are still provided in the Math Working Group pages, as a convenience.

Many clarifications and improvements throughout the spec

If you stare at the MathML2 spec for 7 years you may notice that some parts are clearer than others.

Integration with HTML5

This is briefly mentioned in Chapter 6, but mainly specified in the HTML 5 draft specification. One of the main difficulties of using MathML on the web has always been that it was designed as an XML application to fit with XHTML, and using XHTML has proved to be far more difficult than envisaged in 1998 when XML started. Some notable browsers are only now starting to support XHTML in beta releases, and even in browsers with good XML/XHTML support, it is difficult to integrate XHTML with HTML-based document systems (such as the Blogger system hosting this blog). HTML5, by allowing MathML (and SVG) in text/html systems, will be a massive boost to getting mathematics into web-based systems.