Preventing the reproduction of songs whose textual content is offensive or inappropriate for kids is an important issue in the music industry. In this paper, we investigate the problem of assessing whether music lyrics contain content unsuitable for children (a.k.a. explicit content). Previous works that have tackled this problem computationally have dealt with English or Korean songs, comparing the performance of various machine learning approaches. We investigate the automatic detection of explicit lyrics for Italian songs, complementing previous analyses performed on other languages. We assess the performance of many classifiers, including those leveraging neural language models, which have not been fully exploited so far for this task. Neural language models are rich language representations built from textual corpora in an unsupervised way, and they can be fine-tuned on various natural language processing tasks, including text classification. To compare the different systems, we use a novel dataset that we contribute, consisting of approximately 34K songs annotated with labels indicating explicit content. The evaluation shows that, on this dataset, most of the classifiers built on top of neural language models perform substantially better than non-neural approaches. We also provide further analyses, including a qualitative assessment of the predictions produced by the classifiers, an assessment of the best-performing classifier in a few-shot learning scenario, and a study of the impact of dataset balancing.

An important moral duty of our modern society is to prevent the exposure of young people to content (e.g., language, images, movies) that may be offensive or unsuitable for them, typically referred to as explicit content. When dealing with language, such content includes: “strong language; references to violence, physical, or mental abuse; references to sexual behaviour; discriminatory language.” (Footnote 1) In the music business too, organizations such as the Recording Industry Association of America (RIAA) have over the years recommended the use of labels, such as the Parental Advisory Label (PAL) (Footnote 2), to mark that the content of a music product (mainly lyrics, but also booklets or related products) may be hurtful or inappropriate for children. Protecting kids from exposure to such content is even more pressing nowadays, given the widespread diffusion on the Web of easy-to-access digital content providers (e.g., Amazon Music, Spotify, YouTube) that deliver millions of songs. Indeed, many of these online platforms apply a dedicated tag (e.g., “explicit lyrics” or “E”) to songs that should not be played to kids. However, this labelling mainly results from human work (from “record labels, industry partners and our users”, Footnote 3), and platforms explicitly state that tagging all content that is actually explicit is infeasible.