Endangering Data Interview with Sarah Lamdan

Sarah Lamdan is a Professor of Law at CUNY School of Law in Long Island City, NY. She has a master’s degree in library science and legal information management. She also has a law certificate in environmental law. Her work focuses on information law and policy.

Professor Lamdan works on issues across the spectrum from open government to personal privacy. She is currently writing a book about data control and access called Data Cartels, which will be published by Stanford University Press. Sarah is a member of the Environmental Data & Governance Initiative and works with immigration groups on government surveillance issues. Lamdan’s book, Environmental Information: Research, Access & Environmental Decisionmaking (Environmental Law Institute 2017), serves as a resource for journalists, scientists, and researchers who use government science information in their work.


Tell us a bit about your projects and how you became interested in issues of data privacy, collection, and surveillance.

I became interested in the topic after seeing a news article in 2017 about ICE’s “extreme vetting” social media surveillance program, and noticing that Thomson Reuters and LexisNexis reps had attended an ICE event to learn about how to win gov’t contracts to participate in the invasive immigrant surveillance program. Thomson Reuters and LexisNexis (part of the data analytics giant RELX Group) are the main suppliers of legal research products for the legal profession. Their products, Westlaw and Lexis, are considered the “gold standard” legal research products, and together, the companies have a legal information duopoly. I was concerned about the ethical implications of immigration lawyers using products that may ultimately be participating in ICE surveillance programs that harm their clients.

 

You’ve written several pieces[1] detailing how many vendor business models go far beyond licensing scholarly journals to academic researchers and law firms, and include selling mailing addresses, social media data, credit and criminal records, and much more to marketing firms, political consultants, and law enforcement. How did those companies develop?

So, as I started researching Thomson Reuters and LexisNexis’s relationships with ICE, it became clear that these companies weren’t the companies that I thought they were. To me as a librarian, these companies had been marketed as publishers. I knew Reed Elsevier (the RE of RELX) as a publisher of scholarly journals, and LexisNexis (the LX) as a publisher of legal resources and news. Thomson Reuters supplied financial and legal search platforms to business and law firm libraries that I’d worked in.

I learned that, over the past decade, these companies have morphed from being “publishers” to being “data analytics corporations.” Library markets are changing as more information becomes open access and freely available online, especially when it comes to legal resources. Government websites and nonprofit groups have pushed to make laws more accessible on the internet. At the same time, data analytics seems to be the future profit source – collecting huge amounts of data and using algorithms, AI, and machine learning to “slice and dice” data to build informational resources for clients. Since the ’90s, Thomson Reuters and RELX Group have acquired hundreds of companies and tons of data to position themselves as the premier data analytics firms.

Although vendors like Thomson Reuters and RELX are notoriously secretive about the library data they collect and how they use it, do members of the library community have any idea about how that data is used in their broader data broker ecosystem? How might data collected from users of LexisNexis, Scopus, Elsevier journals, etc. be of value to non-library audiences? How might it be aggregated with other data?

It seems that Thomson Reuters, RELX Group, and other online research platforms benefit from using library data to market their products, and create new products, for those same users. Sam Moore describes how these platforms use “seamless access” (Get Full Text Research, for example[2]) to gather data about their users, so that the companies can monetize researchers’ searches and tailor services for those, and other, users. Wolfie Christl similarly noticed that when you do research using Elsevier, ThreatMetrix, a RELX surveillance data product, stores a personal identifier in your browser to track your searching.

We can’t be sure what the companies are doing with this data (i.e., we don’t know whether they are using it internally or selling or sharing it externally), but we do know that our research is being tracked by the companies whose platforms we, and our patrons, rely on to do our research.

 

You’re working on a book manuscript about data cartels. Can you share a little bit more about that project, and what the larger ecosystem of data cartels looks like?

As I tried to figure out what these data analytics companies do and how their different products connected, I learned that there isn’t much research on these publishers-turned-data analytics corporations. Information science tends to focus more on communication technologies and platforms (algorithms, machine learning, social media, search engines) and not as much on the duller, less-dynamic data vendor side. It’s like focusing on modems, themselves, instead of the Internet – boooring. Because there isn’t much discussion of these companies beyond librarianship, we haven’t seen the full picture of these companies: they don’t just sell platforms to libraries, they also sell platforms to financial firms, cops, news orgs, and more. Several companies are simultaneously academic research oligarchies, legal research duopolies, and federal and state police surveillance monopolies. These companies have consolidated control over informational flows in libraries and beyond, restricting and stratifying informational access and data privacy in all of our communities.

 

In Librarianship at the Crossroads of ICE Surveillance, you write that we must not pass privacy protections on to patrons, or donate the labor of erasing our patrons’ data to vendors, but rather demand “privacy by design” from vendors. Have you seen any progress on this front?

“Privacy by design” is an idea described by Ann Cavoukian, the former Information and Privacy Commissioner for the Canadian province of Ontario. I bought into this idea in an article I wrote in 2015 (Social Media Privacy: A Rallying Cry to Librarians), and tried to incorporate it into librarians’ work with vendors and the resources we use in our research and reference work. I haven’t seen any data analytics corporations affirm privacy by design concepts lately, and in fact, it seems that, based on research like Moore’s and Christl’s, they are expanding the surveillance in their own products. 

 

In Librarianship at the Crossroads of ICE Surveillance, you also wrote that librarians are information technology’s early adopters, and often information technology’s first critics. As information professionals, what do you think our role is outside of the library to advocate for data justice?

While I think that librarians have a lot of leverage as the gatekeepers for research platform products, the people who sign the contracts, teach the patrons how to use the products, etc., I am always cognizant of Fobazi Ettarh’s work on “vocational awe.” Librarians can harm ourselves as workers by assuming the huge societal burdens sometimes foisted on libraries and their employees. So, I think we have power, but I don’t think it’s our job, alone, to save the world. We can use our power as we choose, and there have been some really thoughtful and excellent library initiatives around data privacy including the open access movement, ideas around baking privacy guarantees into contracts with data analytics companies, and other negotiations with these data platform giants. We’ve seen how libraries can even choose to walk away from “big deal” contracts, which is very empowering.

 

Is there anything else you want to add, or any work or other projects you want readers to know about?

There is so much awesome librarian work going on right now. Information access and the products we use are changing all the time, and I think that there is no group more aware of how the changing data privacy and access universe impacts our lives than librarians. So, stay strong and keep going! 

  1. Defund the Police, and Defund Big Data Policing, Too (2020), Librarianship at the Crossroads of Big Data & Corporate Surveillance (2019), When Westlaw Fuels ICE Surveillance: Ethics in the Big Data Policing Era (2019).
  2. Individuation through infrastructure: Get Full Text Research, data extraction and the academic publishing oligopoly (2020).