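Monarch is the syntax highlighting library that ships with Monaco Editor. It lets you implement syntax highlighting for a custom language using a JSON-like syntax. This article introduces Monarch by writing a simple custom highlighter for MIPS assembly.
1. Initialization
First we need to define a language. Here we name it asm.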
// Register a new language
monaco.languages.register({ id: "asm" });
The official Monaco documentation describes it as follows:
### register
register(language: ILanguageExtensionPoint): void
Defined in monaco.d.ts:4659
Register information about a new language.
#### Parameters
* language: ILanguageExtensionPoint
#### Returns void
where ILanguageExtensionPoint is the following object:
{
  aliases?: string[],
  configuration?: Uri,
  extensions?: string[], // source file extensions
  filenamePatterns?: string[],
  filenames?: string[],
  firstLine?: string,
  id: string, // name of the language
  mimetypes?: string[]
}
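As a rough sketch of how the optional fields might be used, the descriptor below also declares file extensions and aliases. The `.asm` and `.s` extensions, the alias strings, and the `registerAsmLanguage` helper are assumptions made for illustration. The helper takes the `monaco` namespace as a parameter so that it can be exercised outside the browser.

```javascript
// Language descriptor; only `id` is required by ILanguageExtensionPoint.
const asmLanguage = {
  id: "asm",
  extensions: [".asm", ".s"],        // assumed file extensions
  aliases: ["MIPS Assembly", "asm"]  // assumed display names
};

// Hypothetical helper: registers the descriptor unless the id is already known.
// Returns true when a registration actually happened.
function registerAsmLanguage(monacoApi) {
  const known = monacoApi.languages
    .getLanguages()
    .some((language) => language.id === asmLanguage.id);
  if (!known) {
    monacoApi.languages.register(asmLanguage);
  }
  return !known;
}
```

In the browser this would be called as `registerAsmLanguage(monaco)` before the editor is created.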
2. Monarch Tokens Provider
Next we register the tokens provider for the language. Here we mark the language as case-sensitive and give it a tokenizer.
// Register a tokens provider for the language
monaco.languages.setMonarchTokensProvider("asm", {
  ignoreCase: false,
  tokenizer: {...}
});
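Before moving to the full definition, a deliberately tiny provider may help to show the shape of the object. The sketch below is not the article's tokenizer: it only recognizes comments, decimal integers, and whitespace, and the `installMinimalAsm` helper and `minimalAsmTokens` name are hypothetical.

```javascript
// A minimal Monarch definition: comments, decimal integers, and whitespace only.
const minimalAsmTokens = {
  ignoreCase: false,
  tokenizer: {
    root: [
      [/#.*$/, "comment"],  // '#' starts a comment that runs to the end of the line
      [/-?\d+/, "number"],  // decimal integer, optionally negative
      [/\s+/, "white"]      // whitespace
    ]
  }
};

// Hypothetical helper: registers the language and installs the provider.
// The monaco namespace is passed in so the function does not rely on a global.
function installMinimalAsm(monacoApi) {
  monacoApi.languages.register({ id: "asm" });
  return monacoApi.languages.setMonarchTokensProvider("asm", minimalAsmTokens);
}
```

In the browser, `installMinimalAsm(monaco)` would be enough to get comments and numbers colored by the active theme.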
Tokenizer
The official documentation describes it as follows:
(object with states) This defines the tokenization rules. The tokenizer attribute describes how lexical analysis takes place, and how the input is divided into tokens. Each token is given a CSS class name which is used to render each token in the editor.
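In other words, these are the rules that turn source code into tokens (keywords, strings, comments). Concretely, the tokenizer describes a set of states and their rules, and can be viewed as a state machine for lexical analysis. Each rule specifies the pattern to match in that state, the action to take, and the next state.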
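The state-machine idea can be sketched in a few lines of plain JavaScript. This is a simplified model and not Monarch's actual implementation: rules are `[regex, token, nextState]` triples, transitions jump directly to the named state (Monarch itself keeps a state stack), and the `states` table and `tokenizeLine` function are hypothetical names.

```javascript
// Simplified model of a rule-driven tokenizer. The first matching rule wins.
const states = {
  root: [
    [/#.*$/, "comment"],
    [/"/, "string.quote", "string"],  // an opening quote switches to the string state
    [/\s+/, "white"],
    [/[^\s"#]+/, "source"]
  ],
  string: [
    [/[^"]+/, "string"],
    [/"/, "string.quote", "root"]     // a closing quote returns to root
  ]
};

function tokenizeLine(line, startState = "root") {
  const tokens = [];
  let state = startState;
  let pos = 0;
  while (pos < line.length) {
    const rest = line.slice(pos);
    let matched = false;
    for (const [regex, token, next] of states[state]) {
      // Anchor each rule at the current position.
      const m = new RegExp("^(?:" + regex.source + ")").exec(rest);
      if (m && m[0].length > 0) {
        tokens.push({ text: m[0], token, state });
        pos += m[0].length;
        if (next) state = next;
        matched = true;
        break;
      }
    }
    if (!matched) {
      // No rule applied: consume one character so the loop always advances.
      tokens.push({ text: rest[0], token: "", state });
      pos += 1;
    }
  }
  return { tokens, endState: state };
}

console.log(tokenizeLine('li "a b" # c').tokens.map((t) => t.token));
```

The end state matters because a line can finish inside a construct: in this sketch, `tokenizeLine('"abc').endState` is `"string"`, which is the state the next line would start in.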
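There are many samples at https://microsoft.github.io/monaco-editor/monarch.html, so this article does not explain what each option means and goes straight to the tokenizer for the asm language.
Without further ado, here is the final result: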
{
  // Named regex, referenced from rules as @storage_type_kw
  storage_type_kw: /\.(ascii|asciiz|byte|data|double|float|half|kdata|ktext|space|text|word|set\s*(noat|at|noreorder|reorder))\b/,
  // Keyword tables, referenced from `cases` as @function_normal and @function_pseudo
  function_normal: ["abs.d", "abs.s", "add", "add.d", "add.s", ..., "xor", "xori"],
  function_pseudo: ["mul", "abs", "div", "divu", ..., "sd", "ush", "usw", "move", "mfc1.d", "l.d", "l.s", "s.d", "s.s"],
  tokenizer: {
    // Entry state: at the start of a line, switch to line_pre
    root: [
      [/^\s*?/, "line.line", "@line_pre"],
      { include: "@normal" }
    ],
    // Comments, strings, numbers, and registers
    normal: [
      [/#.*$/, "comment", "@popall"],
      [/"/, { token: "string.quote", bracket: "@open", next: "@string" }],
      [/[\w\.\-]+/, {
        cases: {
          "-?\\d+": { token: "number", next: "@popall" },
          "-?\\d+\\.\\d+": { token: "number.float", next: "@popall" },
          "0[xX]([0-9a-fA-F]*)": { token: "number.hex", next: "@popall" },
          "0[bB]([01]*)": { token: "number.binary", next: "@popall" },
          "@default": { token: "source", next: "@popall" },
          "@eos": { token: "line.line", next: "@popall" }
        }
      }],
      { include: "@register" }
    ],
    // Optional label at the start of a line, followed by an instruction
    line_pre: [
      [/([a-zA-Z_]\w*):/, "tag.label.$1", "@line_fun"],
      { include: "@line_fun" },
      { include: "@normal" },
    ],
    // Instruction mnemonics and assembler directives
    line_fun: [
      [/[a-z][\w\.]*/, {
        cases: {
          "@function_normal": { token: "function.normal.$0", next: "@popall" },
          "@function_pseudo": { token: "function.pseudo.$0", next: "@popall" },
          "@default": { token: "source", next: "@popall" },
          "@eos": { token: "line.line", next: "@popall" }
        }
      }],
      [/@storage_type_kw/, "constructor.storage.type", "@popall"],
      [/\.(align|extern|globl)\b/, "constructor.storage.modifier", "@popall"],
      { include: "@normal" },
    ],
    register: [
      [/(\$)(0|[2-9]|1[0-9]|2[0-589]|3[0-1])\b/, "variable.register.by-number", "@popall"],
      [/(\$)(zero|v[01]|a[0-3]|t[0-9]|s[0-7]|gp|sp|fp|ra)\b/, "variable.register.by-name", "@popall"],
      [/(\$)(at|k[01]|1|2[67])\b/, "variable.register.reserved", "@popall"],
      [/(\$)f([0-9]|1[0-9]|2[0-9]|3[0-1])\b/, "variable.register.floating-point", "@popall"]
    ],
    string: [
      [/[^\\"&]+/, "string"],
      { include: "@string_common" },
      [/"/, { token: 'string.quote', bracket: '@close', next: '@popall' }]
    ],
    string_common: [
      [/\\[rnt\\']/, "string.escape"],
      [/&\w+;/, 'string.escape'],
      [/[\\&]/, 'string']
    ]
  }
}
The entry point of the rules is tokenizer.root. The attributes at the same level as tokenizer are the keyword tables, and the children of tokenizer are the rule tables.
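The four rules of the register state in the listing above are tried in order, so it can be useful to check which token a given register ends up with. The sketch below copies those regexes into plain JavaScript and approximates "match at the current position" by anchoring each one with `^`. The `classifyRegister` helper and the sample inputs are hypothetical.

```javascript
// The four register rules from the listing, in the same order.
const registerRules = [
  [/(\$)(0|[2-9]|1[0-9]|2[0-589]|3[0-1])\b/, "variable.register.by-number"],
  [/(\$)(zero|v[01]|a[0-3]|t[0-9]|s[0-7]|gp|sp|fp|ra)\b/, "variable.register.by-name"],
  [/(\$)(at|k[01]|1|2[67])\b/, "variable.register.reserved"],
  [/(\$)f([0-9]|1[0-9]|2[0-9]|3[0-1])\b/, "variable.register.floating-point"]
];

// Hypothetical helper: returns the token of the first rule that matches
// the whole text, or null when no rule matches.
function classifyRegister(text) {
  for (const [regex, token] of registerRules) {
    const m = new RegExp("^(?:" + regex.source + ")").exec(text);
    if (m && m[0] === text) {
      return token;
    }
  }
  return null;
}

for (const reg of ["$8", "$t0", "$1", "$26", "$f12", "$32"]) {
  console.log(reg, "->", classifyRegister(reg));
}
```

With these rules, `$1` and `$26` are skipped by the by-number rule and land in the reserved rule, while `$32` matches none of them.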
include
Includes the rules of another state under tokenizer, for example:
root: [ { include: "@normal" } ]
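One way to think about include is as a splice: the rules of the referenced state are inserted at the position of the include entry. The sketch below models that in plain JavaScript. It is an illustration of the idea, not Monarch's own code, and `expandIncludes` and the `demo` table are hypothetical.

```javascript
// Simplified model: replace each { include: "@name" } entry with the rules of that state.
function expandIncludes(tokenizer, stateName, seen = new Set()) {
  if (seen.has(stateName)) {
    throw new Error("circular include: " + stateName);
  }
  const inProgress = new Set(seen).add(stateName);
  const expanded = [];
  for (const rule of tokenizer[stateName]) {
    if (!Array.isArray(rule) && rule.include) {
      // In this sketch the leading "@" is optional.
      const target = rule.include.replace(/^@/, "");
      expanded.push(...expandIncludes(tokenizer, target, inProgress));
    } else {
      expanded.push(rule);
    }
  }
  return expanded;
}

const demo = {
  root: [[/^\s+/, "white"], { include: "@normal" }],
  normal: [[/#.*$/, "comment"], { include: "@register" }],
  register: [[/\$\w+/, "variable.register"]]
};

// Lists the token names of root after nested includes are flattened.
console.log(expandIncludes(demo, "root").map((rule) => rule[1]));
```

Because the splice preserves position, the order of include entries relative to ordinary rules decides which rule gets the first chance to match.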
Inspecting Tokens
Monaco provides an Inspect Tokens tool in browsers to help identify the tokens parsed from source code.
To activate:
- Press F1 while focused on a Monaco instance (or right-click and choose Command Palette).
- Trigger the Developer: Inspect Tokens option.
This will show a display over the currently selected token for its language, token type, basic font style and colors, and selector you can target in your editor themes.
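Token types can also be listed from code with `monaco.editor.tokenize`, which returns one array of tokens per line. The sketch below formats that result as text. The `describeTokens` helper is hypothetical, and it takes the `monaco` namespace as a parameter rather than assuming a global.

```javascript
// Hypothetical helper: returns "line:column type" for every token in the source.
function describeTokens(monacoApi, source, languageId) {
  const lines = monacoApi.editor.tokenize(source, languageId);
  const out = [];
  lines.forEach((tokens, lineIndex) => {
    for (const token of tokens) {
      // Token offsets are zero-based; report one-based columns.
      out.push(`${lineIndex + 1}:${token.offset + 1} ${token.type}`);
    }
  });
  return out;
}
```

In the browser, `describeTokens(monaco, "li $t0, 1", "asm")` would list whatever token types Monaco reports for that line.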

3. Theme
4. Completion Item Provider
[To be continued]