Women only said 27% of the words in 2016’s biggest movies

by Amber Thomas

Movie trailers in 2016 promised viewers so many strong female characters. Jyn Erso. Dory. Harley Quinn. Judy Hopps. Wonder Woman. I felt like this could be the year for gender equality in Hollywood’s biggest films.

I was wrong.

And I don’t make this statement lightly.

As a scientist, I turn to data to answer questions I have about the world. And I’ve got the data to back up my claim. In fact, you can have the data, code, and resulting data visualization that I made trying to better understand this topic. But first, let me tell you how I became so interested.

It all started when I went to see Rogue One: A Star Wars Story. All promotional materials for the movie indicated that Jyn Erso (played by Felicity Jones) was the main character. I mean, just look at the poster.

When your picture is several times larger than everyone else’s, you’re probably the main character.

What I didn’t notice at first was that Jyn is the only woman on that poster.

I went into the movie theater expecting to see men and women fighting side by side. I left feeling certain that I could count every female character from the movie on one hand. While Jyn was the main character, I was profoundly aware that she was often the only woman in any scene.

It felt strangely familiar to have a lead female character be so outnumbered. Then I realized that Jyn and Princess Leia suffered the same inequality 39 years apart. I was overwhelmed with a need to know exactly how female representation in Star Wars movies has changed. But it seemed unfair to compare movies made today with movies made decades ago.

So instead, I decided to look for female equality across the Top 10 Worldwide Highest Grossing Films of 2016. They were:

  • Captain America: Civil War
  • Finding Dory
  • Zootopia
  • The Jungle Book
  • The Secret Life of Pets
  • Batman v Superman: Dawn of Justice
  • Rogue One: A Star Wars Story
  • Deadpool
  • Fantastic Beasts and Where to Find Them
  • Suicide Squad

With so many powerful women in these films, some of them must be gender-equal, right?

The Data

Now that I decided what I wanted to investigate, I needed to figure out how to do it. Similar data exploration projects have focused on dialogue or screen-time equality. Both seemed like good options, but I wanted the ability to report on equality at the movie and character level.

In the end, I decided to explore the movies’ dialogue. This choice gave me the ability to focus on characters with an active role in the story and to cut non-speaking characters from my analysis.

Luckily for me, dedicated movie fans often transcribe a movie’s dialogue and make it freely available online. If I couldn’t find a transcript, I used closed-caption files instead. For those, I re-watched the movie and manually assigned characters to their spoken lines.

This process was a labor of love. It was time consuming, but I have no regrets.

Analysis

Once I had all of the transcripts, I just needed to read the .txt files into R and separate the characters from their lines. For the Rogue One transcript, that process looked like this:
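
A minimal sketch of this step, written in Python rather than R and not the code used for the analysis. It assumes a transcript in which each spoken line reads `CHARACTER: dialogue`; real fan transcripts vary. The function name `parse_transcript` and the sample lines are invented for illustration.

```python
import re

# Assumed line format: "CHARACTER NAME: spoken words".
# This format is an assumption; real transcripts may differ.
LINE_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.+)$")


def parse_transcript(text):
    """Return one {"Character", "Words"} row per spoken line."""
    rows = []
    for raw_line in text.splitlines():
        match = LINE_PATTERN.match(raw_line)
        if match is None:
            # Blank lines and stage directions have no "NAME:" prefix.
            continue
        character, words = match.groups()
        # Upper-case names so "Jyn" and "JYN" end up as one character.
        rows.append({"Character": character.upper(), "Words": words.strip()})
    return rows


# Invented placeholder lines, not quotes from any movie.
sample = (
    "JYN: This is a placeholder line.\n"
    "[door opens]\n"
    "Cassian: Another placeholder line.\n"
)
rows = parse_transcript(sample)
```

Each row keeps the speaker and the spoken text side by side, which matches the Character and Words columns used in the rest of the analysis.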

Now that I had a data frame with both Character and Words columns, I had to assign genders to each Character. To remain consistent with my categorizations, I came up with a few simple rules:

  1. When possible, assign gender according to the pronouns that other characters use. For example, if a character is referred to by others as “he” or “him”, then he is categorized as “male”.
  2. If there is no pronoun used throughout the movie but the character is named or credited (on IMDb), use the gender of the actor or actress. Note that the gender of an actor or actress was assumed based on publicly available information as of January 2017.
  3. If no pronoun is used for the character and the character is not named or credited, refer to the closed captions. Sometimes they will identify the character that spoke.
  4. If all else fails, make an educated guess based on the character’s voice.
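
The four rules form an ordered fallback chain, which can be sketched as follows. This is an illustration in Python, not the procedure as it was actually carried out (the labelling was done by hand); the function name `assign_gender` and its parameters are hypothetical.

```python
def assign_gender(pronoun_gender=None, credited_actor_gender=None,
                  caption_gender=None, voice_guess=None):
    """Apply the four rules in order and return (gender, rule_number).

    Each argument is the evidence available for one rule, or None when
    that kind of evidence is missing. All names here are hypothetical.
    """
    evidence_in_rule_order = [
        pronoun_gender,         # rule 1: pronouns used by other characters
        credited_actor_gender,  # rule 2: gender of the credited actor
        caption_gender,         # rule 3: closed-caption identification
        voice_guess,            # rule 4: educated guess from the voice
    ]
    for rule_number, value in enumerate(evidence_in_rule_order, start=1):
        if value is not None:
            return value, rule_number
    # No evidence at all: leave the character unlabelled.
    return "unknown", None
```

Returning the rule number alongside the label makes it easy to see afterwards how many characters were labelled by the weaker rules.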

I’ll be the first to say that these methods are not perfect. In fact, here are some caveats:

  1. If a male character was voiced by a female actor (or vice versa) and other characters never referred to that character with pronouns, the character may be incorrectly labelled. (I don’t think this happened, but anything is possible.)
  2. Voices that are not associated with a physical embodiment of a character (e.g., the voice of a computer) were categorized according to the gender of their voice actor/actress.
  3. I can never really know the gender of any character, but I’m using the cues and information that I have at my disposal.

Again, I am far from infallible, so if you caught a mistake on my part, please let me know.

So now I just needed to count the number of words spoken by each character. Again, I was able to do this in R using the dplyr and stringi packages.
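
As a rough Python equivalent of this counting step (not the dplyr and stringi code itself), the sketch below sums words per character and then computes each gender’s share of the total. Splitting on whitespace is a simplifying assumption and may count slightly differently from a proper word-boundary counter; the sample rows and labels are invented.

```python
from collections import Counter


def count_words_by_character(rows):
    """Sum the number of whitespace-separated words per character."""
    totals = Counter()
    for row in rows:
        totals[row["Character"]] += len(row["Words"].split())
    return totals


def share_by_gender(word_totals, genders):
    """Return each gender's fraction of all spoken words."""
    by_gender = Counter()
    for character, n_words in word_totals.items():
        by_gender[genders.get(character, "unknown")] += n_words
    total = sum(by_gender.values())
    if total == 0:
        return {}
    return {gender: n / total for gender, n in by_gender.items()}


# Invented rows and labels, for illustration only.
example_rows = [
    {"Character": "A", "Words": "one two three"},
    {"Character": "B", "Words": "four"},
    {"Character": "A", "Words": "five six"},
]
example_genders = {"A": "female", "B": "male"}

word_totals = count_words_by_character(example_rows)
shares = share_by_gender(word_totals, example_genders)
```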

It’s worth noting that I included every speaking character in this analysis. So yes, every stormtrooper who shouts a simple “Wait, stop!” before getting shot is included.

Data Visualization

I had my data. Unfortunately, tables upon tables of word counts and character names don’t give anyone much insight. Like any good data exploration project, it was time to visualize my results. I had to work through a few iterations before I found the best one.

Scatterplots and bar charts both masked characters with small roles.

A simple bubble chart was better but it became difficult to identify individual characters. It was also challenging to understand movie-level statistics.

In the end, I decided to learn enough d3.js to make an interactive graphic. Here, each bubble represents a character, and the bubble’s area is scaled based on the number of words spoken. Female and male bubbles can be separated for better insight. The stacked bars below indicate movie-level information.
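
Scaling a bubble’s area (rather than its radius) by word count means the radius has to grow with the square root of the count. The sketch below shows that relationship in Python; it is an illustration of the idea, not the d3.js code behind the graphic. In d3 a square-root scale such as `scaleSqrt` is the usual tool for this, though whether the graphic uses it is not stated here. The name `bubble_radius` and the default `max_radius` are hypothetical.

```python
import math


def bubble_radius(word_count, max_words, max_radius=40.0):
    """Radius whose circle area is proportional to word_count.

    The character with max_words gets max_radius; everyone else is
    scaled by the square root of their share of that maximum.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    return max_radius * math.sqrt(word_count / max_words)


# A character with a quarter of the words gets half the radius,
# and therefore a quarter of the area.
small = bubble_radius(100, 400)
large = bubble_radius(400, 400)
area_ratio = (large ** 2) / (small ** 2)
```

Scaling the radius directly would make a character with four times the words look sixteen times as large, which overstates the gap.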

Go ahead, check out the full interactive version.

Interested in exploring the raw word-count data for yourself? I’ve made all of the data and code used to generate these visualizations open source. It’s available here:

ProQuestionAsker/2016MovieDialogue (github.com)

Takeaways

Ok, so the analysis is done. I’ve got a fancy (and fun-to-play-with) visualization. What did I find?

I recommend taking a quick second to look at something “a-Dory-ble” before going on, because this post is about to get real depressing real fast.

Aw, so cute. Feeling good?

All right, here we go.

This is a static version of what the visualization for all 10 movies looks like:

(If you’d like to check out the interactive visualization, go here.)

There are a couple of things here that I need to point out:

Not one of the top 10 movies of 2016 had a 50% speaking, female cast.

Finding Dory was the closest to this level of equality with 43% female characters. To be equal, the movie would have needed 8 more speaking, female roles.

Rogue One was the worst. Only 9% of its speaking characters were female. Of those 10 characters, 1 was a computer voice, 1 appeared on screen for no more than 5 seconds, and 1 was a CGI cameo that said 1 word.
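
The arithmetic behind statements like “43% female” and “8 more roles” can be sketched as below. The counts used are hypothetical, chosen only because they are consistent with the Finding Dory figures quoted above; the actual counts are in the published dataset, not reproduced here. Both function names are invented.

```python
def female_share(female_roles, male_roles):
    """Fraction of speaking roles that are female."""
    return female_roles / (female_roles + male_roles)


def roles_needed_for_parity(female_roles, male_roles):
    """Extra female speaking roles needed to reach a 50/50 cast."""
    return max(male_roles - female_roles, 0)


# Hypothetical counts, NOT taken from the dataset.
share = female_share(25, 33)
extra = roles_needed_for_parity(25, 33)
share_after = female_share(25 + extra, 33)
```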

Only 1 of 2016’s top 10 movies had at least 50% of its dialogue spoken by female characters.

Finding Dory comes out on top here too with 53% female dialogue. But, 76% of that dialogue came from Dory alone.

Trailing at the end was The Jungle Book with only 10% of its dialogue spoken by a female character. Keep in mind, this is after casting Scarlett Johansson as the voice of the historically-male snake, Kaa.

Here are a few more:

  • Finding Dory and Zootopia were the only 2 movies in 2016’s top 10 in which a female character had the most dialogue.
  • Female characters were outnumbered 5:1 in Captain America: Civil War’s final battle. Throughout the movie, they only contributed 16% of the dialogue.
  • Batman spoke 2.4 times more than Superman and 6 times more than Wonder Woman in Batman v Superman.
  • 78% of the female-spoken lines in Rogue One came from Jyn Erso.
  • While Harley Quinn was a highly advertised character in Suicide Squad, she only spoke 42% as many words as Floyd/Deadshot (played by Will Smith). Notably, Amanda Waller (played by Viola Davis) spoke frequently: her total was just 222 words (16%) short of Deadshot’s word count.

I started this project because I had a feeling that Rogue One’s cast and dialogue were not equally divided between male and female characters. I was shocked (and saddened) to find that almost none of the top 10 movies from last year were gender equal.

We can do better.

Added: If you’re looking for more studies and data explorations like this, check out:

  • Inequality in 800 popular films from 2007–2015 (includes gender, race/ethnicity, sexual orientation, and disability)
  • This exploration of 2000 randomly selected movie scripts from the 1980s to the 2010s
  • This research on the 200 biggest movies from 2014 & 2015
  • Female representations in 2014’s biggest movies
  • This Twitter thread about gender equality in 2016’s animated films

TL;DR Version: Women represent (on average) 30–35% of speaking roles across each of these investigations.

Added: Have questions or comments about my methodology or conclusions? Check out my follow-up article featuring the most frequently asked questions.

I analyzed the dialogue in 2016’s biggest movies and it started a lot of conversations. (medium.com)

If you liked this article and want to see more like it, please click the green heart below and share away on your social media network of choice.

I am currently spending my time working on personal projects and data visualizations like this while I look for a data science job. So, if you have a fun project idea (or a job inquiry) you’d like to discuss with me, please reach out to me on Twitter or by email.

Thank you!

Translated from: https://www.freecodecamp.org/news/women-only-said-27-of-the-words-in-2016s-biggest-movies-955cb480c3c4/
