
Ash Syntax Highlighter v0.0.3

Ash aims to solve the file size problem of client-side syntax highlighters, which usually grow as you add more languages to highlight. With Ash, you load the main file only once, and the associated language files are loaded automatically as needed.


Features

Limitations

Usage

<!DOCTYPE html>
<html dir="ltr">
  <head>
    <meta charset="utf-8">
    <link href="ash.min.css" rel="stylesheet">
  </head>
  <body>
    <pre><code class="json">{&quot;foo&quot;:1}</code></pre>
    <script src="ash.min.js"></script>
    <script>
    document.querySelectorAll('pre > code[class]').forEach(code => {
        let ash = new ASH(code);
    });
    </script>
  </body>
</html>

Examples

Settings

let ash = new ASH(source, prefix);
let ash = new ASH(source, state = {
        class: 'ash'
    });

Methods and Properties

ASH.esc(content)

Escapes the regular expression special characters in a string.
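As an illustration of what this kind of escaping does, here is a minimal sketch. The escapeRegExp function is a hypothetical stand-in written for this example; the actual implementation of ASH.esc is not shown in this document and may cover a different set of characters.

```javascript
// Hypothetical stand-in for ASH.esc; the library's own implementation may differ.
function escapeRegExp(content) {
    // Prefix every regular expression metacharacter with a backslash.
    return content.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// The escaped string can be embedded safely in a larger pattern.
const exact = new RegExp('^' + escapeRegExp('a.b*c') + '$');
const matchesLiteral = exact.test('a.b*c'); // the literal text matches
const matchesOther = exact.test('axbbc');   // "." and "*" no longer act as wildcards
```

Escaping matters whenever a token comes from plain text, for example a keyword list, and must be matched literally.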

ASH.h(type, content, fn)

Static method to highlight a string input. The highlighted markup is passed to the fn callback.

let code = document.querySelector('code');
ASH.h('xml', code.textContent, result => {
    code.className = 'ash ash-xml';
    code.innerHTML = result;
});

ASH.instances

Return the syntax highlighter instances.

for (let key in ASH.instances) {
    console.log(key);
    console.log(ASH.instances[key]);
}

ASH.state

This property stores the initial values of ash.state.

let ash = new ASH(source, {
        foo: ['bar', 'baz', 'qux']
    });

console.log([ASH.state, ash.state]);

ASH.version

Return the syntax highlighter version.

let version = ASH.version,
    major = version.split('.')[0];

if (+major < 2) { … }

ASH.x

The list of regular expression special characters.

ash.state

Return the modified syntax highlighter states.

ash.source

Return the syntax highlighter source element that holds the initial code snippet.

ash.source.addEventListener('click', function() {
    console.log(this.nodeName);
}, false);

ash.pop()

Remove syntax highlighting from the source element.

Languages

The following languages are currently supported:

Regular expressions and class naming specifications are heavily inspired by the Highlight.js project. The API may not be pretty, but performance and file size are the priority for now.

Regular Expression Specifications

If you already know that some tokens follow a consistent pattern, you can define that pattern as a constant and reuse it across the languages that benefit from it:

ASH.LOG = '\\b(?:false|null|true)\\b';
ASH.NUM = '\\b-?(?:\\d+?\\.)?\\d+\\b';
ASH.STR = '"(?:\\\\.|[^"])*"|\'(?:\\\\.|[^\'])*\'';
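Because these constants are plain strings, they can be concatenated and compiled like any other pattern source. The sketch below uses local copies of the three strings so that it runs without ASH loaded; the literal variable and the sample input are made up for illustration.

```javascript
// Local copies of the constants shown above, so the snippet runs without ASH.
const LOG = '\\b(?:false|null|true)\\b';
const NUM = '\\b-?(?:\\d+?\\.)?\\d+\\b';
const STR = '"(?:\\\\.|[^"])*"|\'(?:\\\\.|[^\'])*\'';

// Join the pattern sources into one alternation and compile it once.
const literal = new RegExp(STR + '|' + LOG + '|' + NUM, 'g');

// Collect every string, logic value, and number in a small JSON snippet.
const found = '{"a": true, "b": 1.5, "c": null}'.match(literal);
console.log(found); // [ '"a"', 'true', '"b"', '1.5', '"c"', 'null' ]
```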

Class Naming Specifications

Every language syntax has a structure, and most languages share the same kinds of categories. For example, although CSS and JavaScript are two different languages, they have token types in common, such as function, keyword, number, and string. The following are standard class names that are likely to apply to all kinds of language syntax:

Name  Description
cla   Class.
com   Comment.
con   Constant.
exp   Expression. Regular expression.
fun   Function. A function declaration.
key   Key. Should be paired with val.
lib   Library. Built-in objects or classes.
log   Three-valued logic. Includes false, null, true.
mar   Markup. As in HTML tags or Markdown syntax.
nam   Name. Name of function, HTML tag, etc.
num   Number. Including units and modifiers, if any.
str   String. Literal string.
sym   Symbol. Smiley, HTML entities, etc.
typ   Type. Document type or identifier.
val   Value. Should be paired with key.
var   Variable. In general, it should not be highlighted unless it has a special pattern, such as PHP variables.
wor   Word. Special words that are reserved by the language parser, such as do, else, function, if, var, while, etc.

Other class names are free to be defined by developers. Each one must be compatible with the existing classes and consist of at most three letters. For example, to distinguish between float and integer numbers, you can add a flo class together with num. Or, to make a sub-category for numbers written in HEX format, you can add a hex class together with num. Theme designers can optionally add colors for sub-categories; by default, the color is inherited from the num class.

Name  Description
att   Attribute. As in attribute selector in CSS.
bul   Bullet. As for list and YAML array sequence.
cod   Code. As in Markdown syntax for codes.
id    ID. As in ID selector in CSS.
ima   Image. As in Markdown syntax for images.
lin   Link. As in Markdown syntax for links.
pse   Pseudo. As in pseudo class or element selector in CSS.
quo   Quote. As in Markdown syntax for quotes.
que   Query. As in CSS selector.
sec   Section. As in INI syntax for sections, or in Markdown syntax for heading elements.
uri   URI. Just in case you want to highlight URI.

Adding Your Own Syntax Highlighter

There is nothing fancy here. The main feature of this syntax highlighter is the ability to load language definitions asynchronously. How you mark the tokens is up to you. All you have to do is define a language category based on the file extension, like so:

ASH.token.css = function(content) {};
ASH.token.html = function(content) {};
ASH.token.js = function(content) {};

// You can also define aliases
ASH.token.jsx = 'js';
ASH.token.ts = ASH.token.js;
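The two alias styles above behave differently: a string alias has to be looked up again, while a reference alias already points at the definition. The sketch below shows one way a string alias could be resolved. The token object and the resolve function are hypothetical and only mirror the shape of ASH.token; how ASH itself resolves aliases is not described in this document.

```javascript
// Hypothetical registry mirroring the shape of ASH.token; not the library's own code.
const token = {};
token.js = function (content) { return content; };
token.jsx = 'js';     // string alias, resolved at lookup time
token.ts = token.js;  // reference alias, already points at the definition

// Follow string aliases until a real definition (function or array) is found.
function resolve(type, seen = new Set()) {
    const value = token[type];
    if (typeof value !== 'string') {
        return value; // function, array, or undefined when nothing is registered
    }
    if (seen.has(type)) {
        return undefined; // guard against circular aliases such as a -> b -> a
    }
    seen.add(type);
    return resolve(value, seen);
}

console.log(resolve('jsx') === token.js); // true
```

One practical difference under these assumptions: a string alias does not require the target definition to exist yet when the alias is declared.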

Defining languages together with the core will make the highlighting work synchronously. To make it asynchronous, you will need to store them as separate files:

.\
├── ash\
│   ├── css.js
│   ├── html.js
│   └── js.js
└── ash.js
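With a layout like the one above, each language file only needs to be requested the first time its language is used. The following is a minimal sketch of that idea, not ASH's actual loader. The createLoader function and its load parameter are hypothetical; in a browser, load would typically inject a script element for ash/<type>.js, but here it is a fake so the snippet runs anywhere.

```javascript
// Hypothetical loader factory; `load` stands in for whatever fetches `ash/<type>.js`.
function createLoader(load) {
    const pending = new Map();
    return function get(type) {
        // Start the request only on first use, then reuse the same promise.
        if (!pending.has(type)) {
            pending.set(type, Promise.resolve(load(type)));
        }
        return pending.get(type);
    };
}

// Usage with a fake loader that records which files were requested.
const requested = [];
const get = createLoader(type => {
    requested.push('ash/' + type + '.js');
    return function (content) { return content; };
});

get('css');
get('css'); // served from the cache, no second request
get('js');
console.log(requested); // [ 'ash/css.js', 'ash/js.js' ]
```

Caching the promise rather than the result means that two snippets of the same language highlighted at the same time still trigger a single request.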

The first function parameter contains the plain text version of the source element contents. this refers to the ASH instance. You can get the available methods and properties from there:

ASH.token.json = function(content) {
    // Mark the desired parts of your syntax here
    content = content.replace(/[\{\}\[\]:,]/g, '<b>$&</b>');
    // Then return the modified content
    return content;
};
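Since the function receives plain text and returns markup, characters such as < and & in the source may need to be escaped before the tokens are wrapped. Whether ASH already does this for you is not stated here, so treat the sketch below as an assumption; escapeHTML and markJSON are hypothetical names.

```javascript
// Hypothetical helper; ASH may already escape the content for you.
function escapeHTML(content) {
    return content
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Same marking rule as the JSON example above, applied after escaping
// so that the inserted <b> tags are the only real markup in the result.
function markJSON(content) {
    return escapeHTML(content).replace(/[{}\[\]:,]/g, '<b>$&</b>');
}

console.log(markJSON('{"a":"<i>"}'));
// <b>{</b>"a"<b>:</b>"&lt;i&gt;"<b>}</b>
```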

Below is a simple example of using the array method to mark portions of a JSON file using regular expressions:

ASH.token.json = [
    ['(' + ASH.STR + ')(\\s*:)', [0, 'key', 0]],
    [ASH.STR, ['val.str']],
    [ASH.LOG, ['val.log']],
    [ASH.NUM, ['val.num']]
];

The order in which the patterns are given is very important: put more specific patterns before more general ones. You can also nest patterns to process a matched portion further:

ASH.token.css = [
    ['(#)((?:[a-fA-F\\d]{1,2}){3,4})', ['val', 0, 'num.hex']],
    ['(\\brgba?\\([^)]+\\))', ['val', [
        ['([\\w-]+)(\\()', [0, 'fun', 'pun']],
        ['-?\\d*\\.\\d+', ['num.flo']],
        ['-?\\d+', ['num.int']],
        ['[,)]', ['pun']]
    ]],
    [ … ],
    [ … ],
    …
    …
];
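To see why pattern order matters, consider the sketch below. It is not ASH's own engine and does not use ASH's rule format: the mark function is hypothetical, and each rule is simplified to a [pattern, className] pair. The patterns are joined into a single alternation, where the regular expression engine tries alternatives from left to right at each position.

```javascript
// Hypothetical helper, not ASH's own engine. Rules are simplified to
// [pattern, className] pairs; patterns must not contain named groups.
function mark(content, rules) {
    // Wrap every pattern in a named group and join them into one alternation.
    // Earlier rules win over later ones at the same position.
    const source = rules
        .map(([pattern], i) => '(?<r' + i + '>' + pattern + ')')
        .join('|');
    return content.replace(new RegExp(source, 'g'), (...args) => {
        const groups = args[args.length - 1]; // named groups are the last argument
        const index = rules.findIndex((rule, i) => groups['r' + i] !== undefined);
        return '<span class="' + rules[index][1] + '">' + args[0] + '</span>';
    });
}

const STR = '"(?:\\\\.|[^"])*"';
const KEY = [STR + '(?=\\s*:)', 'key']; // a string followed by a colon
const VAL = [STR, 'str'];               // any string

const input = '{"a": "b"}';
const specificFirst = mark(input, [KEY, VAL]);
const generalFirst = mark(input, [VAL, KEY]);

console.log(specificFirst); // {<span class="key">"a"</span>: <span class="str">"b"</span>}
console.log(generalFirst);  // {<span class="str">"a"</span>: <span class="str">"b"</span>}
```

When the general string rule comes first, it consumes "a" before the key rule gets a chance, so the key is never marked.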

The current implementation relies on regular expressions. This approach has its limits, but it should be enough for short code snippets. If you need to handle more complex cases, a syntax highlighter library such as Highlight.js will be more appropriate.

License

Use it for free, pay if you get paid. So, you’ve just benefited financially after using this project? It’s a good idea to share a little financial support with this open source project too. Your support will motivate me to do any further development, as well as to provide voluntary support to overcome problems related to this project.

Thank you! ❤️