CSS Selector Parsers

tutorial javascript tool

source on gitlab

intro

Handling many different CSS selectors can be tedious, especially when the UI changes often. One common use case is UI testing.

The usual approach is to hard-code the selectors into the tests themselves and check that the matching elements are displayed. For example, a username field in a login form may have the selector #app #login form input[name="user"]. If a new login form is introduced, the selector may change, and manually finding and altering every instance of it can be a hassle.

Given the way elements are structured, it makes sense to store these selectors in a nested structure. This is similar to what we have in Sass, where indentation is used to indicate nested levels.

example

This is a short example of the structure the input takes.

mainApplication #app
    loginModal #login
        loginForm form
            username input[name="user"]
            password input[name="pass"]
        confirmationBox #confirm
            errText .status .content.err
            confirmButton button

If all goes well, it returns a mapping like this:

{
  'mainApplication':
    '#app',
  'mainApplication loginModal':
    '#app #login',
  'mainApplication loginModal loginForm':
    '#app #login form',
  'mainApplication loginModal loginForm username':
    '#app #login form input[name="user"]',
  'mainApplication loginModal loginForm password':
    '#app #login form input[name="pass"]',
  'mainApplication loginModal confirmationBox':
    '#app #login #confirm',
  'mainApplication loginModal confirmationBox errText':
    '#app #login #confirm .status .content.err',
  'mainApplication loginModal confirmationBox confirmButton':
    '#app #login #confirm button'
}

Without the nesting, everything is expanded out and can be rather hard to use. But this is exactly how CSS selectors work, and the information from the nested input is faithfully replicated in this mapping.

nested

Given how objects (dictionaries) work in JavaScript, it can be a little troublesome to nest objects and give them a value at the same time.

For example, nesting an object into mainApplication.loginModal means that mainApplication.loginModal cannot point to a string value.

To get past this limitation, we use the _ key to access the final string value of an object (if it was set in the initialization).

Given the same input as the previous example, we get a nested structure like so:

{
  "mainApplication": {
    "_": "#app",
    "loginModal": {
      "_": "#app #login",
      "loginForm": {
        "_": "#app #login form",
        "username": {
          "_": "#app #login form input[name=\"user\"]"
        },
        "password": {
          "_": "#app #login form input[name=\"pass\"]"
        }
      },
      "confirmationBox": {
        "_": "#app #login #confirm",
        "errText": {
          "_": "#app #login #confirm .status .content.err"
        },
        "confirmButton": {
          "_": "#app #login #confirm button"
        }
      }
    }
  }
}
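As a rough illustration of how the "_" convention might be consumed, here is a minimal sketch. The lookup helper and its dotted-path argument are hypothetical and not part of the project; only the shape of the object comes from the output above.

```javascript
// Hypothetical helper, not part of the project: resolve a dotted path
// such as "mainApplication.loginModal" to the selector stored under "_".
function lookup(tree, path) {
  let node = tree;
  for (const key of path.split('.')) {
    if (node === null || typeof node !== 'object' || !(key in node)) {
      return undefined; // unknown path
    }
    node = node[key];
  }
  return node._;
}

// A trimmed-down copy of the nested output shown above.
const selectors = {
  mainApplication: {
    _: '#app',
    loginModal: {
      _: '#app #login',
      loginForm: { _: '#app #login form' },
    },
  },
};

console.log(lookup(selectors, 'mainApplication.loginModal'));
// prints: #app #login
console.log(selectors.mainApplication.loginModal.loginForm._);
// prints: #app #login form
```

Plain property access works just as well; the helper is only convenient when the path arrives as a string, for example from a test description.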

description

There are two main parts to this project: first, figuring out the indentation of the layout; second, using the indentation to recreate the nesting of the objects.

indentation problem

The first problem the parser needs to solve is indentation, and there are two parts to it. First, we have to figure out which character is used for the indentation, and how many of them make up one level.

We use the first indentation we encounter as the standard, and the rest of the document has to follow suit. If “4 spaces” is used first, it becomes the unit for every level of indentation: 8 spaces for the second level, 12 spaces for the third, and so on.

The second part is checking the indentation level, and there is just one main rule to it. When the indentation increases, it may only go one level deeper at a time; when it decreases, it may drop several levels at once. That is basically all we need to figure out the indentation levels of the document.
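To make the "first indentation wins" rule concrete, here is a minimal sketch. The names detectIndentUnit and levelOf are hypothetical and are not taken from the project's source; the sketch also assumes the document contains at least one indented line.

```javascript
// Hypothetical helper: find the indentation unit used by the document,
// taken from the first non-blank line that starts with spaces or tabs.
function detectIndentUnit(lines) {
  for (const line of lines) {
    if (line.trim().length === 0) continue; // ignore blank lines
    const match = line.match(/^( +|\t+)/);
    if (match) return match[1];
  }
  return ''; // no indented line found
}

const unit = detectIndentUnit([
  'mainApplication #app',
  '    loginModal #login',
  '        loginForm form',
]);

// Every deeper level is expected to be a whole multiple of the unit.
const levelOf = (indent) => indent.length / unit.length;

console.log(JSON.stringify(unit)); // prints: "    "
console.log(levelOf('        '));  // prints: 2
```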
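The jump rule can be sketched as a small check over the list of levels. The function findBadJumps is a hypothetical name, and the sketch assumes blank lines have already been filtered out, so line numbers count non-blank lines only.

```javascript
// Hypothetical helper: given the indentation level of each non-blank line,
// return the 1-based positions that jump more than one level deeper.
function findBadJumps(levels) {
  const bad = [];
  let previous = -1; // so the first line must sit at level 0
  levels.forEach((level, index) => {
    if (level > previous + 1) bad.push(index + 1);
    previous = level;
  });
  return bad;
}

// Dropping several levels at once (2 -> 0) is fine.
console.log(findBadJumps([0, 1, 2, 2, 1, 2, 0])); // prints: []
// Jumping from 0 to 2, or from 1 to 3, is not.
console.log(findBadJumps([0, 2, 3, 0, 1, 3]));    // prints: [ 2, 6 ]
```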

representing the nesting

To represent the nesting, we use a structure similar to a stack. For each increase in indentation, we append another level of selector to the “stack”.

Each time the indentation increases, we “add” a new selector; each time it decreases, we simply “roll back” the selectors. This gives us the right prefix to prepend to each of the CSS objects.
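The add and roll back idea could look roughly like the sketch below. It is not the project's code: flatten and the { level, name, selector } entry shape are assumptions made for illustration, and the input is assumed to have passed the indentation checks already.

```javascript
// Sketch only: `entries` is assumed to be already parsed into
// { level, name, selector } objects, one per non-blank line.
function flatten(entries) {
  const names = [];     // stack of object names, one per open level
  const selectors = []; // stack of selectors, one per open level
  const result = {};

  for (const { level, name, selector } of entries) {
    // Roll back: drop everything at this level or deeper.
    names.length = level;
    selectors.length = level;

    // Add the current line on top of the stack.
    names.push(name);
    selectors.push(selector);

    result[names.join(' ')] = selectors.join(' ');
  }
  return result;
}

const flat = flatten([
  { level: 0, name: 'mainApplication', selector: '#app' },
  { level: 1, name: 'loginModal', selector: '#login' },
  { level: 2, name: 'loginForm', selector: 'form' },
  { level: 1, name: 'confirmationBox', selector: '#confirm' },
]);

console.log(flat['mainApplication confirmationBox']);
// prints: #app #confirm
```

Truncating the arrays by setting length is what makes a multi-level decrease cost nothing extra: dropping from level 2 to level 0 discards both deeper entries in one step.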

explaining the code

The code for this project is not terribly complicated, but it took me a few tries to get there, each time implementing one portion of the functionality. The project can be viewed on gitlab, with the full commit history.

Here, we extract some portions of the code and explain them in greater detail.

get indentation

In the source code, the getIdentation function goes through the whole input and figures out which level of indentation each line is on.

It happens in a few steps. The first step is to identify the type of indentation in use. The snippet below finds the type of indentation (space or tab) and the run of that character that precedes the content.

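// Assumption: `ws` lists the whitespace characters accepted as indentation.
// Its definition is not part of the excerpt.
const ws = [' ', '\t'];

// Return the leading run of a single whitespace character on a line,
// or '' for blank and unindented lines.
function getIndent(line) {
	for (const type of ws) {
		if (line.trimEnd().length && line.startsWith(type)) {
			// Count how many times the character repeats at the start.
			let count = 0;
			while (line[count] === type) {
				count += 1;
			}

			return type.repeat(count);
		}
	}

	return '';
}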

Then, for each of the subsequent lines, we figure out the level of indentation for that line and push it to the indentation array. The array represents the level of indentation for each line, the first item being the indentation level of the first line, and so on. This is accompanied by a few checks to ensure the input is valid:

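// In this excerpt, `indentation` appears to be the leading whitespace of
// the current line, and `firstIndent` the first indentation in the document.
if (firstIndent[0] !== indentation[0]) {
    // The line is indented with a different character than the first one.
    error = true;
    console.log(`ERR - indentation of spaces and tabs mixed (L${lnum})`);
}
if (indentation.length % firstIndent.length !== 0) {
    // The line's indentation is not a whole number of levels.
    error = true;
    console.log('ERR - indentation should be multiple of '
        + `${firstIndent.length} (L${lnum})`);
}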

getting the selectors

After getting the array of indentation levels, the whole job gets a lot easier. Knowing the indentation, we can just track the corresponding selectors and the objects’ names.

To do that, we keep an array whose items represent the name of the object at each level. The first item of the array corresponds to the outermost (unindented) level. When the scope gets deeper, we just push a new item to represent the new indentation level.

The following snippet shows how it is implemented, with levelsName as the array used for determining the object’s name, and levelsRest used for determining the selectors.

for (const lnum in lines) {
    const line = lines[lnum].trim();
    const indent = indentation[lnum];

    // Skip blank lines.
    if (line.length === 0) { continue; }

    // The first word is the object's name; the rest is its selector.
    const [name, ...parts] = line.split(' ');
    const rest = parts.join(' ');

    // Going one level deeper: make room for the new level.
    if (indent >= levelsName.length) {
        levelsName.push('');
        levelsRest.push('');
    }

    // Prefix the name and the selector with those of the parent level.
    let past = indent ? levelsName[indent-1] : '';
    levelsName[indent] = past ? past + ' ' + name : name;

    past = indent ? levelsRest[indent-1] : '';
    levelsRest[indent] = past ? past + ' ' + rest : rest;

    if (!nested) {
        // Flat output: the full name maps straight to the full selector.
        res[levelsName[indent]] = levelsRest[indent];
    } else {
        // Nested output: walk (or create) one object per name segment,
        // then store the selector under the "_" key.
        let temp = res;
        for (const key of levelsName[indent].split(' ')) {
            if (!temp[key]) {
                temp[key] = {};
            }
            temp = temp[key];
        }
        temp._ = levelsRest[indent];
    }
}

conclusion

That is all that is required to produce a nested structure for CSS selectors. The source is available on gitlab.