NCMEC, Google and Image Hashing Technology


In the United States, the National Center for Missing & Exploited Children (NCMEC) receives millions of reports of online child sexual abuse material (CSAM) every year. NCMEC’s Senior Vice President and Chief Operating Officer, Michelle DeLaune, talks about the organisation’s evolution, how tech companies are stepping up to tackle CSAM, and Google’s Hash Matching API.

Can you tell us about NCMEC and what your role is?


I’ve been at NCMEC for more than 20 years, so I’ve witnessed first-hand the evolution of the organisation and the challenges and threats to our children and their safety. I began my career here as a CyberTipline analyst.

The CyberTipline was created and launched in 1998 as a way for members of the public to report potential incidents of child exploitation. At that time we were receiving reports from parents who were concerned that an adult was talking inappropriately to their child online and people who encountered websites that contained CSAM. Then a federal law was passed in the United States that required US tech companies to report to the CyberTipline any apparent incidents of CSAM on their systems.

In the early days, we might receive just over 100 reports of child exploitation in a week. We received our first report from a tech company in 2001. Fast-forward to 2021, and we receive approximately 70,000 new reports every day. Some of these are from the public, but the majority of our reports are submitted by tech companies.

How does NCMEC help online companies fight CSAM?


The law does not require that there be any proactive effort made by companies. Simply, if they detect CSAM content or become aware of it, they must report it. That is really the impetus for the tremendous growth that we’ve seen in the CyberTipline over the years. But within the last five years there’s been the most significant jump in reports. That explosion can be attributed to the efforts that many tech companies are voluntarily taking to proactively detect, remove, and report CSAM.

Among the flagship programmes we operate at the National Center for Missing & Exploited Children are our hash sharing platforms: one for industry to contribute to and another for select NGOs. Via the NGO hash sharing platform, NCMEC provides interested tech companies with more than five million hash values of confirmed, triple-vetted CSAM to assist them with their efforts to combat CSAM on their networks. Many large companies, including Google, have availed themselves of this list and are taking proactive steps to remove CSAM from their platforms. The platform also enables other reputable NGOs who serve children to provide their hashes to the tech industry through NCMEC, minimising the need for a tech company to go individually to each NGO.

We also offer an Industry Hash Sharing platform, which enables select companies to share their own CSAM hashes with each other. This ensures that any company that is willing and able to proactively detect this material has all of the tools it needs to do so. Google is the largest contributor to this platform, with approximately 74% of the total number of hashes on the list.

As you can imagine with the volume of reports we get now, we are seeing many of the same pictures being reported multiple times. That is completely understandable, as companies are using hash values to detect known material, but as known material increases, it becomes even more important for NCMEC to be able to identify new material that has been produced and shared online.
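To make the idea of hash-based detection of known material concrete, here is a minimal sketch of how a service might check a file against a shared list of known hash values. The hash list, file paths, and function names are hypothetical placeholders for illustration, not NCMEC’s actual data or interfaces.

```python
# Minimal sketch: checking a file against a shared list of hash values for
# known material. The digest below is a made-up placeholder, not a real entry.
import hashlib

# Hypothetical set of hex digests distributed through a hash sharing platform.
KNOWN_HASHES = {
    "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
}

def sha256_of_file(path: str) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_known_material(path: str) -> bool:
    """True if the file's hash appears on the shared list of known hashes."""
    return sha256_of_file(path) in KNOWN_HASHES
```

Exact matching of this kind only catches byte-identical copies, which is why the visually similar matching discussed below matters so much.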

Google’s Hash Matching API has helped NCMEC prioritise CyberTipline reports. Can you tell us more about how this project began?


The success of the hash sharing programme has created a completely new challenge: an immense volume of reports. A non-profit like NCMEC doesn’t have the computational power to scale to that volume. That’s why we were so eager and grateful for Google’s assistance in helping build the Hash Matching API tool.

In 2020 we received 21 million CyberTipline reports, but within each one of those reports you may have multiple images and videos. Those 21 million reports actually included close to 70 million child sexual abuse images and videos. Clearly there is duplication within that volume, and while it’s easy for NCMEC to detect exact matches, we would be unable to detect visually similar matches at scale and in real time in order to identify and prioritise never-before-seen images. And that is key when we’re trying to identify children who are being actively sexually abused.
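To illustrate the distinction drawn here between exact and visually similar matches, below is a minimal sketch of perceptual-hash comparison. A cryptographic hash only matches byte-identical files, whereas a perceptual hash, a short fingerprint derived from image content, can be compared by Hamming distance so that re-encoded or slightly altered copies still match. The fingerprint values and threshold are illustrative assumptions, not values used by Google or NCMEC.

```python
# Minimal sketch of "visually similar" matching with 64-bit perceptual hashes.
# The fingerprint values and the similarity threshold are made up for
# illustration only.

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two 64-bit fingerprints."""
    return bin(a ^ b).count("1")

def is_visually_similar(hash_a: int, hash_b: int, threshold: int = 8) -> bool:
    """Treat two images as visually similar if their fingerprints differ in
    at most `threshold` bits (an assumed, tunable cutoff)."""
    return hamming_distance(hash_a, hash_b) <= threshold

# A re-encoded copy would fail an exact-hash check, but its perceptual
# fingerprint typically differs by only a few bits.
original_fp = 0xF2A4C41B9E0D7733
reencoded_fp = 0xF2A4C41B9E0D7731  # differs in one bit
print(is_visually_similar(original_fp, reencoded_fp))  # True
```

Running comparisons like this across tens of millions of files in real time is the scale problem the Hash Matching API was built to address.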

What benefits has the Hash Matching API brought to NCMEC?


We have a really important job, which is to take this critical information and turn it around as quickly as possible to law enforcement. One of the advantages of this tool is that it gives us a new way of adding tremendous value to the CyberTipline reports.

We have a programme of work where we’re going through every child sexual abuse image and video and labelling it. For example, ‘This is CSAM’, ‘This is not CSAM’, or ‘It is hard to identify the age of the child or person.’ But, as you can imagine, with 70 million files last year alone, we’re never going to be able to label them all. This API enables us to do a comparison. When we tag one file, the API allows us to identify all visually similar files, which we then tag accordingly in real time. As a result, we’ve been able to tag more than 26 million images.
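The tagging workflow described above, labelling one file and then applying the same label to every visually similar file a lookup returns, can be sketched as follows. The find_visually_similar callable stands in for a similarity service such as the Hash Matching API; its name and interface are assumptions made for illustration, not the real API.

```python
# Minimal sketch of label propagation via a similarity lookup. The lookup
# callable is a stand-in for a real matching service; its interface is an
# assumption for illustration.
from typing import Callable, Dict, Iterable

def propagate_label(
    file_id: str,
    label: str,                      # e.g. 'This is CSAM', 'This is not CSAM'
    labels: Dict[str, str],          # file id -> label assigned by an analyst
    find_visually_similar: Callable[[str], Iterable[str]],
) -> int:
    """Label one file, then apply the same label to every visually similar
    file returned by the lookup. Returns how many files were newly labelled."""
    newly_labelled = 0
    labels[file_id] = label
    for similar_id in find_visually_similar(file_id):
        if similar_id not in labels:
            labels[similar_id] = label
            newly_labelled += 1
    return newly_labelled
```

Propagating a single analyst decision in this way is what makes it possible to tag tens of millions of files without a human reviewing each one.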

This helps us add more value to reports we’re sending to law enforcement so they can prioritise which reports they’re going to review first. It also helps us identify which images have never been seen before. Those images often contain a child somewhere in the world who is being sexually abused. If we’re looking for the proverbial needle in the haystack, in this case that needle is a child who needs to be rescued. Google’s tool has allowed us to zero in on those images that contain children who need immediate help.

And how has it impacted the well-being of NCMEC human reviewers who process reports from the CyberTipline and analyse CSAM content?


This CSAM detection tool has reduced the need for our staff to look at the same images over and over again. There are images of children being sexually abused where the children may now be well into their adult years. These images live online perpetually and contribute to the ongoing victimisation of those individuals. So being able to tag those images allows our staff to focus on images depicting recently abused children, whilst at the same time removing the illegal images from view.

That’s why our staff is here; they want to help those children. This was a ground-breaking improvement in the ability of our staff to practise wellness and not be confronted with the same harmful known material over and over again.

How does this work help tech companies as a whole fight this type of material online?


We know that Google provides CSAM detection technology to companies to help support the global fight against CSAM, and the Hash Matching API itself has a direct impact on many beyond NCMEC. All tech companies are enjoying the benefit of a more streamlined, efficient process at the National Center. CyberTipline reports are being addressed and handled in a timelier manner and with more value added than if we didn’t have this tool.

NCMEC is a central resource for tech companies, law enforcement, survivors, and their families. We have an incredibly unique lens through which we look at problems and solutions. Because of the CyberTipline, we are very aware of newly-created and existing CSAM that is circulating online. All of these reports are made available to law enforcement. We should never lose sight that, at the end of this, we have real children who have been sexually victimised and exploited.

We know of more than 20,000 identified children who have been sexually abused and whose abuse has been memorialised, whether in a video or an image. These survivors, some of whom are still children of course and some of whom are now adults, are keenly aware of the ongoing victimisation they’re facing. That’s why it’s so important for us to do what we can to minimise and reduce the circulation of these images.

One thing that may not be clear to the public is that there can be a tendency to dismiss known CSAM, because the images may be considered “old” or “recirculated”. We constantly beat the drum to remind people that these are real children – that those more than 20,000 individuals are trying to heal and regain control of their lives. They take great solace in knowing that companies like Google are making every effort to remove images depicting the worst moments of their lives.

If you encounter child sexual abuse images or material online, you can report it to the National Center for Missing and Exploited Children (NCMEC), or to an appropriate authority around the world.

Google is committed to fighting online child sexual abuse and exploitation and preventing our services from being used to spread child sexual abuse material (CSAM). You can learn more about this on our Protecting Children website.
