HTML Data Attributes: Don’t Overuse Them



The other day, while reviewing a colleague’s UI update, I saw this:

<span className={styles.userRating} data-status={liked}>
    {/* Thumbs-up icon goes in here */}
</span>

And the following in the CSS:

.userRating {
    opacity: 0.5;
}
.userRating[data-status="true"] {
    opacity: 1;
}


When I saw this, it made me cringe, but I couldn’t explain why at that moment. The first thing that came to mind was: sure, it works, but why do it this way?

Raises Questions

It is valid HTML/CSS, but it seems to be doing what classes should be doing. Hmm… a code smell. Someone looking at this CSS for the first time would think, “What does data-status even mean when set to true? What does this data attribute control?”

Just changing the attribute name from data-status to data-selected would clear up some of the confusion, but another issue remains: separation of concerns.

Separation of Concerns

We all want to keep things tidy and make sure the boundaries between different components and layers are properly set. But this is sometimes challenging, especially in web development. To me, HTML, CSS, and JS have always had fragile boundaries, where each realm easily trespasses on another’s territory (for example, JavaScript updating DOM elements, which is a totally valid thing to do).

Ideally, JavaScript should control the behavior, HTML the representation, and CSS the styling. Frameworks help strengthen these boundaries, but they are not perfect.
Data attributes sit more on the HTML/JS side, as the name suggests. It may make sense to use data attributes inside CSS on some occasions, but it is definitely not justified for a simple true/false value.

In the code snippet above, you don’t know what data-status does or where it is used unless you do a find-all in your IDE. There is just no way of telling what effect(s) this attribute has.

But can’t you say the same thing about classes? Can’t classes be used elsewhere too, such as in a jQuery selector? Yes, but the general consensus is that classes are mostly used for styling, and because the code above uses CSS Modules, it is very clear from the syntax that the class gives a style to the element:

styles.userRating

There Are Pros and Cons To Everything

There is one benefit to using a data attribute, though: the HTML is a lot cleaner than it would be with CSS classes conditionally applied in className. But that is all I can come up with, and I am not sure it outweighs the drawbacks.
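For comparison, here is a minimal sketch of what the class-based version could look like. The `classNames` helper is a hypothetical hand-rolled function, `userRatingClass` is a made-up name, and the `styles` object with its generated-looking values is only a stand-in for what a CSS Modules import would provide:

```typescript
// Hypothetical stand-in for `import styles from "./UserRating.module.css"`.
// With CSS Modules, the values would be generated, scoped class names.
const styles = {
  userRating: "userRating_a1b2",
  selected: "selected_c3d4",
};

// Joins class names and skips falsy entries, so that
// `condition && className` can be written inline.
function classNames(...names: Array<string | false | null | undefined>): string {
  return names.filter((name): name is string => Boolean(name)).join(" ");
}

// Builds the className for the rating icon from the `liked` flag.
function userRatingClass(liked: boolean): string {
  return classNames(styles.userRating, liked && styles.selected);
}
```

The element would then use className={userRatingClass(liked)}, and the CSS selector becomes .userRating.selected. The conditional moves into one small function instead of cluttering the markup, which narrows the cleanliness gap somewhat.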

The Looks

Also, .userRating[data-status="true"] is just plain ugly. It is long, has brackets, and it hurts my eyes when there are dozens of these in HTML and CSS files (and there were indeed dozens of them found during the code review. I removed all of them later for cleanup :p). I would much rather see .userRating.selected; it is concise and shows the intention clearly.

I agree that aesthetics are somewhat subjective, but when they directly impact the readability of the code, you should not dismiss them.
Note that attribute selectors are also said to be slower than class selectors, but I have not verified this myself.

Conclusion

You should use data attributes only when they make sense, like passing data between a script and an element. If it is just a styling thing, use classes!
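As a rough sketch of the kind of use that does make sense, here is a script reading a value off an element. The data-user-id attribute and the `readUserId` function are made up for illustration. The parameter is typed structurally, so the function accepts any object with a `dataset`, such as an HTMLElement in the browser:

```typescript
// Minimal structural type: anything with a `dataset`, e.g. an HTMLElement.
// An attribute like data-user-id="42" is exposed as dataset.userId.
interface HasDataset {
  dataset: { [key: string]: string | undefined };
}

// Reads the hypothetical data-user-id attribute and parses it as an integer.
// Returns null when the attribute is missing or is not a whole number.
function readUserId(el: HasDataset): number | null {
  const raw = el.dataset["userId"];
  if (raw === undefined || !/^\d+$/.test(raw)) {
    return null;
  }
  return Number.parseInt(raw, 10);
}
```

Here the attribute carries actual data that the script needs, rather than a true/false flag whose only job is to switch a style on.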

Web development is hard because there is so much to learn from the beginning. Backend is hard too, but in my experience backend has a somewhat solid path that you can follow. Frontend, not so much; the learning path diverges the moment you step onto it, and each path leads to its own deep rabbit hole.

But the principles of software engineering still apply to web dev: keep things as separate as possible, and let each component do what it’s designed to do.