Privacy And Zoom's AI | WNYLRC - Western New York Library Resources Council

Submission Date:

Thursday, August 24, 2023

Question:

Recently, Zoom introduced new AI features and updated their terms of service agreement, indicating that any user data can be used to train their AI products (TOS 10.4: https://explore.zoom.us/en/terms/). There was a backlash and Zoom quickly put out a clarification and stated that these features are opt-in only (https://blog.zoom.us/zooms-term-service-ai/). Despite this clarification, I am wondering if there are any privacy or FERPA concerns that librarians and educators need to be worried about since Zoom is still used heavily in both library and school worlds. Should we be looking for alternatives or is this just the way of the world now?

Answer:

The day this story really broke (August 7, 2023, a day that will live in minor infamy), Nathan in my office pointed this issue out to me.

"Did you see that Zoom is going to use customer content to train AI?" he asked (this is what passes for casual morning conversation in my office).

My eyebrows went up, mostly because Zoom was being upfront about it, rather than because it was being done at all (because yes, this is the way of the world now). That said, there are some tricks libraries and educators—and any business that cares about use of personal data—can employ to resist it.

Not surprisingly, this comes down to two simple things: awareness, and language.

We'll use the recent Zoom scenario to illustrate:

I am not sure how awareness of the new clause first broke (I am going to outsource that research to Nathan, and if he finds out, he'll put it in a footnote, here[1]). But it is clear that fairly soon, consumers were unambiguously aware of the privacy and use concerns posed by the "we'll suck you into our AI" Terms of Use.

Here is the language Zoom used[2] (and has since retracted) to announce it would use our conferences, etc. to train AI:

"[You agree Zoom can use your Content] ... for the purpose of product and service development, marketing, analytics, quality assurance, machine learning, artificial intelligence, training, testing, improvement of the Services, Software, or Zoom's other products, services, and software, or any combination thereof..."

This is where language comes in.

As the world soon knew, this "old" language listed "artificial intelligence" as well as "training" (although the Terms' dubious use of commas suggests to me that Zoom could use our Content for not just "training" AI, but humans, too... actually an even more terrifying prospect, from some perspectives).[3] So yes, lots to be concerned about when it comes to "Customer Content" (which is Zoom’s term for the recordings/data/analytics that come from "Customer Input", which is the raw content you put into Zoom[4]).

Now let's use our awareness of the current Terms of Use (current as of August 24, 2023, at least), and see what the language says:

"10.2 Permitted Uses and Customer License Grant. Zoom will only access, process or use Customer Content for the following reasons (the “Permitted Uses”): (i) consistent with this Agreement and as required to perform our obligations and provide the Services; (ii) in accordance with our Privacy Statement; (iii) as authorized or instructed by you; (iv) as required by Law; or (v) for legal, safety or security purposes, including enforcing our Acceptable Use Guidelines. You grant Zoom a perpetual, worldwide, non-exclusive, royalty-free, sublicensable, and transferable license and all other rights required or necessary for the Permitted Uses."

Although not as stark as the old language, this still leaves plenty of wiggle room for blending Customer Content with AI. What if Zoom is "obligated" to provide a service, and decides to use AI to do it? What if Zoom decides AI is needed for "enforcing Acceptable Use Guidelines"? What if Zoom decides that AI is needed for your safety, and that, also for your safety, Customer Content must be used to train that AI?

Of course, right now, the Terms also say (in bold, so you know they mean it[5]):

"Zoom does not use any of your audio, video, chat, screen sharing, attachments or other communications-like Customer Content (such as poll results, whiteboard and reactions) to train Zoom or third-party artificial intelligence models".

So can this assurance be trusted? This brings us back to language.

Back in the day, of course, computer systems were not "trained" (as one would train a dog, or a small child to use the toilet) but rather, "programmed."

However, even in the (relatively) slow-moving world of the law, this is no longer the case.

Here is an excerpt from a recent case[6] where lawyers were squabbling over how to gather "Electronically Stored Information" ("ESI"):

Defendants propose the following method for searching and producing relevant ESI:

1) Narrow the existing universe of approximately 27,000 documents...

2) Undersigned counsel reviews a statistically significant sample of the remaining e-mails at issue and marks them relevant/irrelevant to create a "training set;"

3) That training set is then used to "train" the eDiscovery vendor's artificial intelligence/predictive coding tool, which "reviews" the remaining e-mails and assigns each a percentage-based score that measures likelihood to be responsive...

So even in the law, computer systems are being "trained", and there is a precise meaning to the term (which in plain[7] terms is "repeatedly using data and parameters to create patterns desired by the user").
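For readers who want to see what that "training" looks like in practice, here is a minimal sketch in Python. It is not the eDiscovery vendor's tool from the case, and it is far simpler than any commercial product. It assumes a basic word-counting approach (in the style of a naive Bayes classifier) as a stand-in, and the function names `train` and `score`, along with the sample e-mails, are hypothetical and invented purely for illustration. The shape matches the three steps quoted above: a human marks a sample relevant/irrelevant, the marked sample "trains" the tool, and the tool then assigns each remaining e-mail a percentage-style score.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately crude: lowercase the text and split on whitespace.
    return text.lower().split()


def train(training_set):
    """Build a model from (text, relevant) pairs marked by a human reviewer."""
    counts = {True: Counter(), False: Counter()}  # word counts per label
    docs = {True: 0, False: 0}                    # number of examples per label
    for text, relevant in training_set:
        counts[relevant].update(tokenize(text))
        docs[relevant] += 1
    return counts, docs


def score(model, text):
    """Return a 0-100 score estimating how likely the text is to be relevant."""
    counts, docs = model
    vocab = set(counts[True]) | set(counts[False])
    total_rel = sum(counts[True].values())
    total_irr = sum(counts[False].values())
    # Start from the (smoothed) ratio of relevant to irrelevant examples.
    log_odds = math.log((docs[True] + 1) / (docs[False] + 1))
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing keeps unseen word/label pairs from zeroing out.
        p_rel = (counts[True][word] + 1) / (total_rel + len(vocab))
        p_irr = (counts[False][word] + 1) / (total_irr + len(vocab))
        log_odds += math.log(p_rel / p_irr)
    return 100 / (1 + math.exp(-log_odds))


# Hypothetical "training set": a reviewed sample marked relevant (True) or not.
training_set = [
    ("delivery route schedule overtime", True),
    ("overtime pay dispute driver", True),
    ("lunch menu friday", False),
    ("holiday party friday", False),
]
model = train(training_set)

# The "trained" model scores the remaining, unreviewed e-mails.
for email in ["driver overtime schedule", "friday lunch party"]:
    print(email, "->", round(score(model, email), 1))
```

The point for contract purposes is that the model is built entirely from the content it is fed, which is why what a vendor may feed into its "training" matters so much.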

So, with all that said, let's look at the member's questions:

Question 1: I am wondering if there are any privacy or FERPA concerns that librarians and educators need to be worried about since Zoom is still used heavily in both library and school worlds.

The short answer is: yes.

Question 2: Should we be looking for alternatives or is this just the way of the world now?

The short answer is: yes.

Here is the reason for my first short answer: Many contracts have what I call a "we were just kidding" clause that allows the contractor to change their terms at will, and without notice. Here is the one in the current version of Zoom:

15.2 Other Changes. You agree that Zoom may modify, delete, and make additions to its guides, statements, policies, and notices, with or without notice to you, and for similar guides, statements, policies, and notices applicable to your use of the Services by posting an updated version on the applicable webpage. In most instances, you may subscribe to these webpages using an authorized email in order to receive certain updates to policies and notices.

What does this mean? Even though the assurance on AI is in bold, Zoom can change it at any time.

The reason for my second short answer is this: Libraries and educational institutions have incredible commercial leverage when they work together. For this reason, libraries and educational institutions should always be using their awareness of data, ethics, use, and privacy issues to demand contract language that meets their expectations.

Those expectations will change from product to product. With a product like Zoom, which can generate audio/video/text/analytics/+, including content that later may be part of a student file (FERPA) or a library record (various), the assurances should be:

All content entered is property of the customer (library or school);
At all times, all content entered into the service, or content generated with the use of customer-supplied content, may only be used to provide the current service(s) specifically authorized by the customer;
Any other use of data (for product improvement, for marketing) must be via a specific opt-in;
Terms cannot change without notice and terms in effect at the time content was generated will govern such content, regardless of future changes;
Customers can receive assurance that all data is purged upon request; and
Customers can verify that they can enforce and comply with all their own internal policies and obligations regarding data creation, use, and storage.

In addition, libraries and educational institutions should have a clear set of policies for how they, as the potential owners of recordings and other data associated with the use, will use their ownership and control of the content. It would be unfortunate, to say the least, for a student to find that their college disciplinary hearing for underage drinking is now available on YouTube.[8]

Many public library groups and academic consortia are already working to develop these kinds of criteria[9] (which should focus more on isolating aspirations and expectations than on legal wording, since legal wording will vary from state to state). And some institutions are designing their own services[10] in order to avoid contract terms that don't meet their criteria.

At the individual institutional level, this means building assessment of such services, and bargaining time, into the procurement process. It also means thinking through that institution's own particular ethics and responsibilities and developing internal policies to promote them.

So, while this is the world we live in, libraries and educational institutions are well-situated to make a better one.

Thanks for an important question.

[1] It may have been first pointed out by an anonymous user of the Reddit-like website Hacker News (https://news.ycombinator.com/item?id=37021160). This story (https://stackdiary.com/zoom-terms-now-allow-training-ai-on-user-content-with-no-opt-out/), published the same day, was shared on Twitter the next day.

[2] We didn't Wayback this. On the day Nathan informed me of this, I asked him to pull the Terms off the site, so I could review. We got the question to "Ask the Lawyer" about a week later. Sometimes things just work out.

[3] What perspectives? Ethical, moral, psychological, legal, to name a few.

[4] Definition is from paragraph "10" of the Zoom Terms of Use in effect on 8/7/2023.

[5] Like all things in law, the rules on use and interpretation of bold, underline, and italics vary from state to state. I am not kidding. For a great book on typography and legal writing, check out Matthew Butterick's "Typography for Lawyers."

[6] Maurer v. Sysco Albany, LLC, 2021 U.S. Dist. LEXIS 100351

[7] I trust it is painfully obvious I am not a programmer.

[8] An extreme example...then again, think of the use people have tried to make of old letters, files, and yearbooks. Also, do we think YouTube will make it to 2033?

[9] Examples include: https://www.ala.org/advocacy/privacy/checklists/ebook-digital-content and https://icolc.net/statements/joint-statement-metadata-rights-libraries

[10] Like this: https://bigbluebutton.org/open-source-project/about/.

Tag:

Privacy, AI, Zoom, FERPA, Ethics
