YLS: First Step Towards YARA Development Environment

Welcome to part one of a series exploring our recently open-sourced YARA tools. In this article, we take a look at YARA Language Server (YLS).

YARA is a tool and a language for classifying and identifying malware samples. The language consists of rules that describe static and dynamic traits, which are then matched against the input. YARA is a well-known name and the de facto industry standard in the security world. The critical part of the definition is that it is a language. Users (in this case, malware analysts) would expect at least adequate tooling for writing YARA rules. In reality, the situation is quite different: the few existing solutions provide limited features, and many of them are unmaintained or paid.

When I first started working with YARA and meeting analysts, many of them were using a bare text editor, sometimes even without syntax highlighting. Users were left on their own, and creating rules was a cumbersome task. This lack of tooling took a toll not only on users but also on the quality of the rules they produced. We wanted to change that and provide a modern and efficient way to read and write YARA rules, preferably in your favorite editor.

This is where YLS comes into the picture. It’s the core of the solution that we are open-sourcing today. You can check out the sources at avast/yls. It has been developed and successfully used inside Avast by malware analysts for over two years. It provides many long-missing features that make creating, editing, and reading YARA rules more pleasant and efficient.

Language Server Protocol

Language Server Protocol (LSP) is an open-source protocol specification from Microsoft. It is used to guide the implementation of clients (editors) and servers, independent of one another. This architecture allows the implementation of a single program compatible with most modern editors and provides language IntelliSense for them. The single program, in this case, is YLS. It is a Python package based on the great pygls library. Under the hood, YLS uses Yaramod to parse the rules and provide information about YARA modules.

Image: Comparison of different ways to add language support using LSP and without using LSP (https://medium.com/@malintha1996/understanding-the-language-server-protocol-5c0ba3ac83d2)
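
To make the client/server split more concrete, here is a minimal sketch of how a single LSP message travels between an editor and a server. LSP's base protocol sends JSON-RPC 2.0 payloads preceded by a Content-Length header. The helper names, the file URI, and the cursor position below are made up for illustration. In YLS this framing is handled by pygls, so this is not YLS code.

```python
import json


def encode_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header LSP expects."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_lsp_message(raw: bytes) -> dict:
    """Parse one framed message back into a dictionary."""
    header, _, body = raw.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(body[:length].decode("utf-8"))


# A completion request as an editor might send it (URI and position are invented).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/completion",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.yar"},
        "position": {"line": 4, "character": 11},
    },
}

framed = encode_lsp_message(request)
assert decode_lsp_message(framed) == request
```

Because every editor speaks this same wire format, the server only has to be written once.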

Features

Here are some of the main features of YLS.

Code completion

Arguably the most frequently used feature is code completion. YLS leverages Yaramod capabilities and provides completion for all modules currently available in YARA. Furthermore, it can complete many additional things, from basic snippets to references to other rules in the file.

Image: Code completion dialog showing the code completion for PE module.

Symbols from YARA modules are enriched with parameters, types, and a documentation string. After the completion item is accepted, the editor triggers a signature help feature that shows the arguments of the currently typed function call.

Image: Signature help feature displaying information about the currently typed function call. Users can see that the first expected parameter is a regular expression, the function returns an integer, and the popup contains the function docstring.
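
As a rough illustration of what happens behind a completion popup, the sketch below filters a symbol table by the prefix typed so far. The table is a hand-written stand-in for the data YLS obtains from Yaramod, the `complete` function is hypothetical, and the `detail` strings are illustrative. The numeric `kind` values follow LSP's CompletionItemKind (3 for Function, 5 for Field).

```python
# Hand-written stand-in for module symbols; YLS gets the real data from Yaramod.
PE_SYMBOLS = [
    {"label": "pe.number_of_sections", "kind": 5, "detail": "integer"},
    {"label": "pe.entry_point", "kind": 5, "detail": "integer"},
    {"label": "pe.imphash", "kind": 3, "detail": "imphash() -> string"},
    {"label": "pe.is_dll", "kind": 3, "detail": "is_dll() -> integer"},
]


def complete(prefix: str, symbols=PE_SYMBOLS) -> list:
    """Return completion items whose label starts with the typed prefix."""
    matches = [item for item in symbols if item["label"].startswith(prefix)]
    return sorted(matches, key=lambda item: item["label"])


for item in complete("pe.i"):
    print(item["label"], "-", item["detail"])
```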

Linting

To make the writing of rules efficient, YLS provides a fast feedback loop returning diagnostic messages regarding the currently edited ruleset. Diagnostic messages highlight where in the source the problem is present. Currently, we support two linting sources: yara-python and Yaramod. Every time the document is saved, it is inspected with both tools to check for possible problems.

Image: The rule is missing the required condition section. The editor shows a diagnostic message.
Image: Rule containing invalid meta value. Floats are not allowed.
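
The sketch below shows one way a compiler error could be turned into an LSP diagnostic that the editor can underline. It assumes the error text has the shape "line N: message", which is how yara-python typically reports errors for rules compiled from a string. The function name and the whole-line range are our own simplifications rather than the YLS implementation.

```python
import re

# Assumed message shape, e.g. "line 3: syntax error, unexpected '}'".
_ERROR_RE = re.compile(r"^line (?P<line>\d+): (?P<message>.+)$")


def error_to_diagnostic(error_text: str, source: str) -> dict:
    """Convert an error message into an LSP-style diagnostic dictionary."""
    match = _ERROR_RE.match(error_text.strip())
    if match is None:
        # Unknown format: attach the whole message to the first line.
        line_no, message = 0, error_text.strip()
    else:
        line_no = max(int(match.group("line")) - 1, 0)  # LSP lines are 0-based
        message = match.group("message")
    lines = source.splitlines()
    line_len = len(lines[line_no]) if line_no < len(lines) else 0
    return {
        "range": {
            "start": {"line": line_no, "character": 0},
            "end": {"line": line_no, "character": line_len},
        },
        "severity": 1,  # 1 means Error in LSP
        "source": "yara-python",
        "message": message,
    }
```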

Formatting

Automatic code formatting is a convenient feature that allows you to keep code nicely formatted and consistent across the rulesets. Yaramod is responsible for formatting and has sensible defaults. Consider the following rule and notice the inconsistent code formatting.

Image: Inconsistently formatted rule.

In Visual Studio Code, users can format the document using Shift+Alt+F (Ctrl+Shift+I on Linux) or by selecting the Format Document command from the Command Palette. Most editors also support automatic formatting when the document is saved. After formatting, the document has a consistent style:

Image: Rule reformatted using YLS.

Go to definition and references

Quickly navigating through source files is another helpful feature. YLS can construct links between symbols used in rules. This way, we can jump from the symbol to its definition. It is useful when a rule or a string is referenced in the condition of another rule. Using a single keybinding, we can see the original definition. Links also work the other way around, and we can determine, for example, where the current rule is referenced within the document.

This information is also used to highlight linked symbols under the cursor. With a quick glance, the user can determine what lines are related to the current symbol.

Image: Placing the cursor over a symbol highlights its related occurrences. This makes it easy to quickly identify the parts of the code that are related.
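
The linking idea can be sketched with a small index that records, for each string identifier, where it is defined and where it is referenced. This is a deliberately simplified, regex-based approximation. YLS relies on Yaramod's parser instead, and the function name is hypothetical. Among other limitations, the sketch would mistake a `$` sequence inside a string literal for an identifier.

```python
import re

IDENT_RE = re.compile(r"\$\w+")


def index_string_symbols(rule_text: str) -> dict:
    """Map each $identifier to its definition and reference positions."""
    index = {}
    section = None
    for line_no, line in enumerate(rule_text.splitlines()):
        stripped = line.strip()
        if stripped.startswith("strings:"):
            section = "strings"
        elif stripped.startswith("condition:"):
            section = "condition"
        for match in IDENT_RE.finditer(line):
            entry = index.setdefault(
                match.group(), {"definition": None, "references": []}
            )
            position = (line_no, match.start())
            # In the strings section, "$name =" introduces a definition.
            defines = section == "strings" and re.match(r"\s*=", line[match.end():])
            if defines and entry["definition"] is None:
                entry["definition"] = position
            else:
                entry["references"].append(position)
    return index


example = '''rule example
{
    strings:
        $mz = "MZ"
        $pe = "PE"
    condition:
        $mz at 0 and $pe
}'''

for name, info in index_string_symbols(example).items():
    print(name, "defined at", info["definition"], "used at", info["references"])
```

With such an index, "go to definition" is a lookup of the definition position, and "find references" (or highlighting) is a lookup of the reference list.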

Hovering

Hover popups show more detailed information about the symbol under the mouse cursor. This improves the readability of the already existing rules. The hover popup usually contains a documentation string for a particular function or the value for a given string/rule in the condition section.

Image: Hover popup showing documentation and function signature.

Document hierarchy

YARA documents can be represented using a hierarchy of symbols. YLS parses the current document and can create the necessary hierarchy. This information is then used to display the current “symbol path” in the editor. Furthermore, it can be used by other views to better visualize the code structure in the side window.

Image: Outline view of the current document. The outline contains information about the rules and strings present in the document.
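
A hierarchy like the one in the outline view can be approximated with a small line-based scan: every rule becomes a top-level node, and every string defined inside it becomes a child. This is only a sketch under simplifying assumptions (one declaration per line, no comments containing rule-like text). The regexes and the `build_outline` function are hypothetical and not how YLS or Yaramod parse documents.

```python
import re

RULE_RE = re.compile(r"^\s*(?:(?:private|global)\s+)*rule\s+(\w+)")
STRING_RE = re.compile(r"^\s*(\$\w*)\s*=")


def build_outline(text: str) -> list:
    """Return a list of rules, each with the strings it defines as children."""
    outline = []
    for line_no, line in enumerate(text.splitlines()):
        rule = RULE_RE.match(line)
        if rule:
            outline.append({"name": rule.group(1), "line": line_no, "children": []})
            continue
        string = STRING_RE.match(line)
        if string and outline:
            outline[-1]["children"].append({"name": string.group(1), "line": line_no})
    return outline
```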

Code actions

Code actions allow users to attach an action to a location in the code. This action can transform the code, aiding the development process. YLS implements a code action to help users add new strings to a rule. It works by pasting the raw strings on the lines in the “strings” section. After the user invokes the “extract strings” code action, YLS integrates the strings into the YARA rule. This includes generating the string names and wrapping the string contents into quotes if necessary.

Image: Pasting strings in the “strings” section adds a code action (left). Invoked code action integrates the strings with identifiers into the rule (right).
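
The transformation behind such a code action can be sketched as follows. The `$sNN` naming scheme, the function names, and the rule for deciding when quotes are needed are assumptions made for this example. The actual YLS behavior may differ.

```python
def _needs_quotes(text: str) -> bool:
    """Treat "..." and { ... } as already valid YARA strings."""
    if len(text) >= 2:
        closing = {'"': '"', "{": "}"}.get(text[0])
        if closing is not None and text.endswith(closing):
            return False
    return True


def extract_strings(raw_lines, existing_names=()):
    """Turn pasted raw lines into '$name = value' string definitions."""
    taken = set(existing_names)
    counter = 0
    result = []
    for raw in raw_lines:
        text = raw.strip()
        if not text:
            continue  # skip blank lines in the pasted block
        # Pick the next identifier that is not already used in the rule.
        while f"$s{counter:02}" in taken:
            counter += 1
        name = f"$s{counter:02}"
        taken.add(name)
        if _needs_quotes(text):
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            text = f'"{escaped}"'
        result.append(f"{name} = {text}")
    return result


pasted = ["cmd.exe /c", '"already quoted"', "{ 4D 5A }"]
for definition in extract_strings(pasted, existing_names={"$s00"}):
    print(definition)
```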

Installation

Currently, Visual Studio Code is the editor with the best YLS support. YLS comes with a YARA VS Code extension that acts as a wrapper around YLS. It also provides syntax highlighting and YARA language definitions to tell the editor how to interact with YARA files.

For the full installation guide, check out the YLS wiki. The installation currently consists of two steps:

  1. Install yls-yara from PyPI. After this step, ensure that the yls executable is available on your PATH (how to do this depends on your system). You can either install the package globally or install it into a virtual environment and symlink the executable into /usr/local/bin, for example.
  2. Install the YARA Language Server extension from the VS Code Marketplace, which is available in the editor.
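
After the first step, a quick way to confirm that the executable is reachable is to look it up on the PATH. The snippet below is a small convenience sketch that uses only the Python standard library. The `find_executable` helper is our own and not part of YLS.

```python
import shutil


def find_executable(name: str = "yls"):
    """Return the absolute path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path:
        print(f"yls found at {path}")
    else:
        print("yls is not on PATH; check your installation or virtual environment")
```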

Another tested editor is Vim using CoC. It should be enough to install the yls-yara package from PyPI and then add the following snippet to ~/.config/vim/coc-settings.json:

{
  "languageserver": {
    "yara": {
      "command": "yls",
      "args": ["-vv"],
      "filetypes": ["yara"]
    }
  }
}

Feel free to add support for your favorite editor if it’s not yet supported. We will be happy to help!

Conclusion

To summarize, in this blog post, we introduced a new addition to YARA tooling called YLS, available at avast/yls. It is a Language Server that provides features like auto-completion, formatting, go-to definition, etc., and is compatible with most modern editors. We are also looking for feedback from the community. Feel free to open issues or pull requests – we really appreciate it. Also, if you have any ideas on how to improve this project, don’t hesitate to contact us.