Graham Mudd, Vice President of Product Marketing at Facebook, announced on Facebook's official website that the company will develop a series of Privacy-Enhancing Technologies (PETs) that enable it to measure and optimize advertising effectiveness while protecting personal privacy. Neither advertisers nor the platform will be able to access users' personal information.
PETs mainly involve cryptographic and statistical techniques. Overall, these techniques protect personal information by minimizing the amount of data processed, while retaining core functions such as advertising measurement and personalization. Facebook highlights three common techniques: Secure Multi-Party Computation (MPC), On-device Learning, and Differential Privacy.
1. Secure Multi-Party Computation (MPC)
Secure Multi-Party Computation (MPC) is a cryptographic technique in which multiple parties encrypt their respective data and then jointly compute over it, making it possible to measure and optimize advertising effectiveness. The data remains encrypted throughout transmission, storage, and computation, and no party ever sees another party's raw data.
Consider advertising measurement. In the past, an advertiser would encrypt its data and transmit it to the platform or a third party, which could then decrypt the data to gain insights. This means at least one party could see the user's full data trail, from click to purchase.
With MPC, one party sees only the click data it owns, and the other party sees only the purchase data it owns. Both parties exchange their encrypted data packets and then encrypt the other party's packets again with their own keys. Each packet is thus locked twice, ensuring that neither party learns anything beyond the matching result. Since neither party has to hand its raw data to the other, the risk of privacy disclosure is reduced.
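The double-locking idea can be sketched with a toy commutative cipher based on modular exponentiation, in the style of Diffie-Hellman private matching. Everything here (the prime, the keys, the identifiers) is illustrative, not Facebook's actual protocol:

```python
import hashlib

# Toy commutative "encryption": E_k(x) = x^k mod P.
# Because E_a(E_b(x)) == E_b(E_a(x)), both parties can lock each
# other's packets a second time and compare only doubly locked values.
P = 2**61 - 1  # a prime (toy size; real systems use elliptic curves)

def to_group(identifier: str) -> int:
    # Hash an identifier into the group before encrypting it.
    return int(hashlib.sha256(identifier.encode()).hexdigest(), 16) % P

def encrypt(value: int, key: int) -> int:
    return pow(value, key, P)

# Party A holds click IDs; Party B holds purchase IDs (hypothetical data).
clicks = {"alice", "bob", "carol"}
purchases = {"bob", "dave"}
key_a, key_b = 123457, 654323  # each party's private key

# Round 1: each party encrypts its own IDs and sends the packets over.
a_once = {encrypt(to_group(x), key_a) for x in clicks}
b_once = {encrypt(to_group(x), key_b) for x in purchases}

# Round 2: each party locks the *other* party's packets with its own key.
a_twice = {encrypt(c, key_b) for c in a_once}   # B locks A's packets
b_twice = {encrypt(c, key_a) for c in b_once}   # A locks B's packets

# Only the matching result is revealed, never the raw identifiers.
matches = len(a_twice & b_twice)
print(matches)  # → 1 ("bob" both clicked and purchased)
```

Because exponentiation commutes, the doubly locked values coincide exactly when the underlying identifiers match, so both parties learn the match count and nothing else.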
Consider advertising optimization. Suppose an advertiser wants to know the average amount spent by users who clicked its ad. MPC can deliver the final result to the advertiser without disclosing any individual's data. The general principle is to split each converted user's purchase amount into random pieces, shuffle and pool the pieces, add them up again, and finally divide by the total number of users to get the average. The advertiser obtains the effectiveness data it needs to optimize subsequent campaigns, while each individual's purchase data stays secure.
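The split-shuffle-aggregate step can be illustrated with simple additive secret sharing. This is a minimal sketch, assuming three shares per value and made-up purchase amounts; a real MPC deployment would distribute the shares across non-colluding servers:

```python
import random

def additive_shares(value: float, n_shares: int = 3) -> list[float]:
    """Split a value into random pieces that sum back to the value."""
    cuts = [random.uniform(-100, 100) for _ in range(n_shares - 1)]
    return cuts + [value - sum(cuts)]

# Purchase amounts of converted users (never revealed individually).
amounts = [25.0, 40.0, 15.0, 60.0]

# Split every amount, pool all the pieces, and shuffle them so that
# no single piece can be traced back to a specific user.
pool = [share for amt in amounts for share in additive_shares(amt)]
random.shuffle(pool)

# Aggregation: the shuffled pieces still sum to the true total,
# so only the average ever becomes visible.
average = sum(pool) / len(amounts)
print(round(average, 2))  # → 35.0
```

Each individual share is a uniformly random number that reveals nothing on its own; only the final sum, and hence the average, carries information.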
Facebook revealed that MPC is already in use. Last year it began testing a solution called Private Lift Measurement, which uses MPC to help advertisers measure advertising lift; it is expected to be available to all advertisers next year. Facebook has also released an open-source framework for privacy-preserving computation, so any developer can use MPC to build privacy-centric measurement products.
2. On-device Learning
With on-device learning, the system finds useful patterns directly on the user's device based on historical data, and continuously trains and refines the model there. Predictions can be made without sending any personal data to a remote server or the cloud. For example, if people who love working out are also likely buyers of protein shakes, on-device learning can discover that correlation through local training. Throughout this process, the user's personal information stays on the device and is never uploaded to Facebook's servers, avoiding the risk of privacy disclosure.
So how does on-device learning actually protect privacy?
Generally speaking, each device has an isolated, secure "small house" that collects the user's data from each app; it is usually called a sandbox.
Data such as app downloads and purchase records are saved there and never shared with other parties. Inside the sandbox, the system can learn patterns from the user's behavior on the device: for example, that a person likes rock music or often shops online at night. These patterns are then summarized in a way that ensures they cannot be traced back to the individual.

When an update is needed, the system can update the model directly from these summaries, without re-learning from personal data. Although each device completes only a small model update, thousands of devices together produce a secure, unidentifiable report and complete one round of model optimization. Facebook only needs to learn from the final aggregated model to recommend better-matched advertisements to each user.
In general, through this cycle of learning, summarizing, and predicting, on-device learning achieves more accurate targeted recommendations without transmitting user data.
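This cycle of local learning plus aggregated updates is the idea behind federated averaging. Below is a minimal simulation, assuming a one-parameter linear model and synthetic per-device data; it sketches the principle, not Facebook's implementation:

```python
import random

def local_update(weight: float, local_data: list[tuple[float, float]],
                 lr: float = 0.01) -> float:
    """One gradient step of the model y ≈ w*x on the device's own data."""
    grad = sum(2 * (weight * x - y) * x for x, y in local_data) / len(local_data)
    return weight - lr * grad

# Synthetic on-device data: each device observes y = 3x plus small noise.
# The raw (x, y) pairs never leave the simulated devices.
devices = [[(x, 3 * x + random.gauss(0, 0.1)) for x in (1.0, 2.0)]
           for _ in range(100)]

global_weight = 0.0
for _ in range(200):
    # Each device improves the model locally and shares only its
    # updated weight; the server averages the updates.
    updates = [local_update(global_weight, data) for data in devices]
    global_weight = sum(updates) / len(updates)

print(round(global_weight, 1))  # converges close to the true slope 3.0
```

The server never sees any device's data, yet the averaged model recovers the underlying pattern shared across devices.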
This technique is already widely used on Apple devices. However, according to Graham, the biggest challenge for on-device learning today is the operating system: whether a platform can use the computing resources it needs is still in Apple's hands. "It would be meaningful if a series of standards could be established in the future around the access and use of these resources (in a way of fair competition)."
3. Differential Privacy
Differential privacy is a technique for protecting personal data from being reverse-engineered. It can be used on its own or combined with other techniques. The basic principle is to mix a certain amount of "noise" into a dataset, making it difficult to combine the dataset with third-party data and infer any individual's information. For example, if 118 people made a purchase after clicking an ad on Facebook, differential privacy adds or subtracts a random number, so the figure reported to users of the system might be 120 or 114.
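The noise addition described above can be sketched with the standard Laplace mechanism. The epsilon value and the count below are illustrative; Facebook has not published the exact parameters it uses:

```python
import random

def laplace_noise(scale: float) -> float:
    # The difference of two i.i.d. exponential draws is
    # Laplace(0, scale)-distributed.
    rate = 1.0 / scale
    return random.expovariate(rate) - random.expovariate(rate)

def dp_count(true_count: int, epsilon: float = 0.5) -> int:
    # A counting query has sensitivity 1, so Laplace noise with
    # scale 1/epsilon gives epsilon-differential privacy.
    return round(true_count + laplace_noise(1.0 / epsilon))

true_purchasers = 118
print(dp_count(true_purchasers))  # randomized; e.g. 120 or 114
```

Any single reported figure is slightly off, but over many queries the noise averages out, so aggregate measurement stays useful while no individual's presence in the count can be confirmed.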