Commit d0cc527

lastgenre: Docs for genre normalization (aliases)
1 parent 383491c commit d0cc527

1 file changed, +49 -0 lines changed

docs/plugins/lastgenre.rst

Lines changed: 49 additions & 0 deletions
@@ -212,6 +212,55 @@ plain ``metal`` will not match ``heavy metal`` unless you write a regex like
double-escape backslashes in unquoted or single-quoted strings (e.g., use
``\w``, not ``\\w``).

Genre Normalization (Aliases)
-----------------------------

Last.fm tags often contain variant spellings, abbreviations, or inconsistent
formatting (e.g., "hip-hop", "hiphop", and "hip hop"). The normalization feature
uses an ordered list of regular expression aliases to map these variants to a
single canonical name *before* any other filtering or canonicalization takes
place.

This feature is enabled by default (``aliases: yes``) and uses a bundled
``aliases.yaml`` file which covers many common cases, such as mapping "dnb" to
"drum and bass" or "r&b" to "rhythm and blues".

You can extend or override these aliases in your configuration. The keys are the
canonical genre names (which support ``\g<1>`` back-references to regex capture
groups) and the values are lists of regex patterns:


::

    lastgenre:
        aliases:
            drum and bass:
                - d(rum)?[ &n/]*b(ass)?
            \g<1> hop:
                - (glitch|hip|jazz|trip)y?[ /-]*hop

.. note::
242+
243+
The same formatting and quoting rules regarding YAML special characters and
244+
backslashes apply here as well. See the **Attention** box in the **Genre
245+
Ignorelist** section above for details.
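
The alias matching described in this section can be illustrated with a minimal
Python sketch. It is not the plugin's implementation: the ``normalize`` function
and the ``ALIASES`` table are hypothetical names, and the sketch assumes that
patterns are tried in order, must match the whole tag, and ignore case.

.. code-block:: python

    import re

    # Hypothetical alias table mirroring the configuration example:
    # canonical-name template -> list of regex patterns, in order.
    ALIASES = {
        "drum and bass": [r"d(rum)?[ &n/]*b(ass)?"],
        r"\g<1> hop": [r"(glitch|hip|jazz|trip)y?[ /-]*hop"],
    }


    def normalize(tag, aliases=ALIASES):
        """Return the canonical name for ``tag``, or ``tag`` unchanged."""
        for template, patterns in aliases.items():
            for pattern in patterns:
                # Assumption: the pattern must cover the entire tag.
                match = re.fullmatch(pattern, tag, flags=re.IGNORECASE)
                if match:
                    # expand() fills in \g<1>-style back-references.
                    return match.expand(template)
        return tag


    for raw in ["dnb", "Drum & Bass", "hip-hop", "hiphop", "jazzy hop", "rock"]:
        print(f"{raw!r} -> {normalize(raw)!r}")

Under these assumptions the first matching alias wins, so the order of entries
matters, and a tag that matches no pattern (``rock`` here) passes through
untouched.
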

Choosing the Right Tool
-----------------------

With multiple ways to filter and map genres, here is a quick guide on when to
use what:

- **Aliases**: Use these first to fix spelling variants and abbreviations (e.g.,
  ``dnb`` → ``drum and bass``).
- **Ignorelist**: Use this for error correction when Last.fm results are not
  accurate, or for precise per-artist or global exclusions (e.g., rejecting
  ``Metal`` for specific electronic artists).
- **Whitelist**: Use this to strictly limit your library to a predefined set of
  genres. When combined with canonicalization, the plugin will try to map a
  sub-genre to its closest whitelisted parent. Anything else is dropped.
- **Canonicalization**: Use this to automatically map specific sub-genres to
  broader categories (e.g., ``Grindcore`` → ``Metal``).
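
The interaction between the whitelist and canonicalization can be sketched as a
walk up a genre tree. This is an illustration only, not the plugin's code: the
``PARENTS`` mapping and the ``closest_whitelisted`` helper are hypothetical, and
the tiny tree below merely stands in for the real genre tree.

.. code-block:: python

    # Hypothetical child -> parent links; the real genre tree is much larger
    # and may be shaped differently.
    PARENTS = {
        "grindcore": "metal",
        "metal": "rock",
    }


    def closest_whitelisted(genre, whitelist, parents=PARENTS):
        """Return the nearest whitelisted ancestor of ``genre``, or None."""
        current = genre
        seen = set()  # guards against accidental cycles in the mapping
        while current is not None and current not in seen:
            if current in whitelist:
                return current
            seen.add(current)
            current = parents.get(current)
        return None  # nothing on the path is whitelisted: the genre is dropped


    print(closest_whitelisted("grindcore", {"metal"}))
    print(closest_whitelisted("grindcore", {"rock"}))
    print(closest_whitelisted("polka", {"rock"}))

In this sketch a genre that is itself whitelisted is returned as-is, a sub-genre
resolves to the first whitelisted name found on the way up, and a genre with no
whitelisted ancestor yields ``None``.
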

Configuration
-------------
