Females in media

A data-based exploration of female representation in media. Let's dive in.

"Improving diversity and pluralism in the media also means providing greater opportunities for people."

- UNESCO, 2019

As gender equality is debated across a wide variety of domains, a data-backed analysis can help society understand where it currently stands and which steps it needs to take to improve. This project analyses the status of females through the lens of the media.

Using the Quotebank dataset, a corpus of English quotations from a decade of news, the project provides insights into how gender is represented. The data in this project covers quotes published between 2015 and 2020. The name of the site each quote originated from was extracted from the article URLs provided in the original dataset. From a list of sites ranked using Google PageRank and other independent web metrics for various search engines, 116 sites were selected based on their web ranking scores. Only quotes whose source was among these sites were kept for the study. This filtering narrowed the media sources down to the best-known and most common journals and sites.
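The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each quote record carries a `urls` list, it reduces a URL to its host without the leading `www.`, and the `selected_sites` set is a hypothetical two-entry stand-in for the 116 ranked sites. The helper names `site_name` and `keep_quote` are made up for this example.

```python
from urllib.parse import urlparse


def site_name(url):
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def keep_quote(record, selected_sites):
    """True if at least one article URL of the quote belongs to a selected site."""
    return any(site_name(u) in selected_sites for u in record.get("urls", []))


# Hypothetical stand-in for the list of 116 selected sites.
selected_sites = {"nytimes.com", "bbc.co.uk"}

# Toy records; the field names are assumptions made for this sketch.
quotes = [
    {"quotation": "First example quote.",
     "urls": ["https://www.nytimes.com/2017/01/02/a.html"]},
    {"quotation": "Second example quote.",
     "urls": ["http://smallblog.example/post/1"]},
]

# Keep only the quotes published by at least one selected site.
kept = [q for q in quotes if keep_quote(q, selected_sites)]
```

Matching on the full host is a deliberately simple choice: a subdomain such as `edition.example.com` would not match `example.com`, so a real pipeline would probably need a more careful domain normalisation.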

Key figures: years of data (2015 to 2020), millions of quotes in total, media countries represented, media kept, categories analysed.

Figure: Annual percentage of quotes

With this data, an in-depth analysis is performed to understand:

How does female representation in media vary in different countries?

How does female representation evolve in time?

Do different types of media sources represent females equally?

What topics are females quoted in?


Is gender equality in the media a universal concept?