LinkedData Stories

This piece inaugurates an occasional series by or about linked data practitioners that will be cross-posted on the DLF site and LOD-LAM.net. The first post in the series is a personal reflection on the linked data landscape written by Jerry Persons, technology analyst at Knowledge Motifs, Chief Information Architect emeritus at Stanford, and author of the CLIR-commissioned Literature survey in support of Stanford Linked Data Workshop.

The ecosystem in which both library-generated metadata and vendor-generated search environments are players has changed radically and with unprecedented swiftness.

Richard Wallis (late of Talis, now OCLC) recently summarized these trends in terms of web-wide factors in his post A data 7th wave approaching:

With the advent of many data associated advances, variously labelled Big Data, Social Networking, Open Data, Cloud Services, Linked Data, Microformats, Microdata, Semantic Web, Enterprise Data, it is now venturing beyond those closed systems into the wider world.

Well this is nothing new, you might say, these trends have been around for a while – why does this constitute the seventh wave of which you foretell?

and

It is precisely because these trends have been around for a while, and are starting to mature and influence each other, that they are building to form something really significant ….

Indeed, those in pursuit of a broader-than-library take on what’s going on in the web-wide world of structured data should take advantage of Richard’s experience, which includes a deep understanding of libraries gained as a member of the Talis library systems group and spans the company’s evolution toward its present-day provision of Kasabi, “a startup business spun out from and backed by Talis. Our aim is to unlock the value in the World’s data by enabling new business models for producers and consumers of structured data at all scales.” Among his posts and presentations worth close review are those at his Data Liberate site, for example:

  • Create data not records
  • Libraries through the linked data telescope
  • Who will be mostly right – Wikidata, Schema.org

My own views on the potential benefits to be had from a rapidly evolving web that is increasingly dominated by well-structured and well-curated data were shaped in large part by exposure to the vision, concepts, and people involved in a set of antecedents to the current flurry of activity and developments. The thread leads from a turn-of-the-century piece written by Danny Hillis, through his Applied Minds and Metaweb companies, to Freebase and John Giannandrea, and onward from there to the recent Wall Street Journal interview with Amit Singhal and the subsequent discussions surrounding Knowledge Graph and things not strings:

Hillis: With the knowledge web, humanity’s accumulated store of information will become more accessible, more manageable, and more useful. Anyone who wants to learn will be able to find the best and the most meaningful explanations of what they want to know. Anyone with something to teach will have a way to reach those who want to learn. Teachers will move beyond their present role as dispensers of information and become guides, mentors, facilitators, and authors. The knowledge web will make us all smarter. The knowledge web is an idea whose time has come.  Hillis, W. Daniel. “Aristotle”: (The knowledge web), 2000, published in The Edge (138) in 2004.

 Freebase:  A new company founded by a longtime technologist is setting out to create a vast public database intended to be read by computers rather than people, paving the way for a more automated Internet in which machines will routinely share information.  Markoff, John. Start-up aims for database to automate web searching. NYT (9 March 2007).

Giannandrea:  Freebase is an open database of the world’s information, built by a global community and free for anyone to query, contribute to, and build applications on. … Part of what makes this open database unique is that it spans domains, but requires that a particular topic exist only once in Freebase. Thus freebase is an identity database with a user contributed schema which spans multiple domains. For example, Arnold Schwarzenegger may appear in a movie database as an actor, a political database as a governor, and in a bodybuilder database as Mr. Universe. In Freebase, however, there is only one topic for Arnold Schwarzenegger that brings all these facets together. The unified topic is a single reconciled identity, which makes it easier to find and contribute information about the linked world we live in. Giannandrea, John. Freebase: an open, writable database of the world’s information (a one-hour lecture delivered in October 2008).

[Amit Singhal] said in a recent interview that the search engine [Google] will better match search queries with a database containing hundreds of millions of “entities”—people, places and things—which the company has quietly amassed in the past two years. Semantic search can help associate different words with one another. Efrati, Amir. Google gives search a refresh. WSJ (15 March 2012).

Knowledge Graph: [W]e’re focused on comprehensive breadth and depth. It currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects. And it’s tuned based on what people search for, and what we find out on the web.  Britt, Phil.  Google unveils knowledge graph. (24 May 2012).

Taken together, these and other suggestive developments in the linked-data ecosystem represent a confluence of tools, data, and methodologies of sufficient potential to warrant efforts that pursue:

new opportunities for addressing the traditional and prevailing problems of too many silos of content, too many disparate modes of search and access, and too little precision and too much ambiguity in search results in the extreme environments of academic information resources intended to support and report on the research and teaching in large research enterprises. Keller, Michael A. Linked data: a way out of the information chaos and toward the semantic web. EDUCAUSE Review 46 (4): July/August 2011.

Such opportunities are inextricably bound up with linked data’s potential for (1) reshaping the infrastructure that supports web-wide management of information, knowledge, and data, and for (2) fueling unprecedented improvements in the efficiency and efficacy of navigation and discovery capabilities. It’s long past being a matter of if; now it’s about when. The game that’s afoot is about finding roles that libraries can play in aiding and abetting the creation of an increasingly dense tapestry of facts and links woven together from the flows of intellectual resources that the global academic community consumes and produces in the course of its research, teaching, and learning.

Chelcie on 18 June 2012 / 4 Comments

The Digital Library Federation, together with LITA’s Linked Library Data Interest Group, is pleased to announce an open Zotero group for LOD-LAM tools and resources. The LOD-LAM Zotero group is intended to serve as a space both for practitioners seeking an entry point into the world of cultural heritage linked data and for practitioners seeking to share the tools and resources they have come to rely upon.

Members of LITA’s Linked Library Data Interest Group and other contributors have added many resources to the LOD-LAM Zotero group to date. In order to increase the usefulness of the group, we are asking for community involvement in two ways:

  • As you come across new tools and resources—either from conferences or in the course of your professional reading—please add them to the LOD-LAM Zotero group.
  • As you use the LOD-LAM Zotero group—either as a contributor or as a browser—please send us any feedback you may have.

Through collective effort, we hope the LOD-LAM Zotero group will become the “go to” place for information about linked data and its particular uses by libraries, archives, and museums.

Items added to the LOD-LAM Zotero group can be viewed in the group’s library. Alternatively, you may view the group’s library or collections within the group’s library through your feed reader. Click “Subscribe to this Feed” on the page of the library or collection that you wish to follow via RSS.

A Zotero account is not required for “read” access to the group’s library, but it is required for “write” access. To contribute, simply create a Zotero account, download either the Zotero browser plugin or standalone client, and begin adding items. More information about getting started and tips for contributing resources can be found in the README document in the group’s library.

We hope the LOD-LAM Zotero group will create more opportunities for DLF and LOD-LAM community members to learn from one another. We especially encourage community members interested in playing in the linked data sandbox to browse the collection titled LOD 101: Primers, Tutorials, etc. In addition, we encourage contributors to use the “Notes” field to share information about tools from their experience when adding new resources to the group’s library.

Management of the LOD-LAM Zotero group is shared by the DLF and LITA’s Linked Library Data Interest Group. For more information, or to send feedback, please email lodlamzotgrp -at- yahoogroups -dot- com.

Chelcie on 24 May 2012 / Comment

In a recent post at danbri.org, Dan Brickley documents some of his work on NoTube (a European research project exploring Semantic Web and TV), reflects on the possibilities of linking bibliographic data with other web content, and calls for a contest to engage researchers in linked TV and bibliographic data.

Responding to Brickley’s call via the DPLA listserv, Karen Coyle observes:

“A big and powerful chunk of knowledge organization that is just begging for exploitation is the fact that library records have classification numbers and subject headings from thesauri. All of this could now be correlated with an analysis of the full text. It’s only another step to associate this same information with non-library materials. The classifications have the advantage of being organized knowledge with implicit class membership and lots of interesting sibling relationships. What libraries have is not complete nor perfect, but it’s a seed to be built on, something that doesn’t exist when you do keyword indexing without any semantics.”

Join the conversation by commenting on Brickley’s post or chiming in on the listserv thread!

Chelcie on 17 October 2011 / Comments Off

As part of LODLAM-DC, Jon Voss will deliver a free talk called “An Introduction to Linked Open Data in Libraries, Archives, & Museums” on Friday, September 16.

Based on an earlier talk given at NYPL Labs, Voss’s presentation will “explore the fundamental elements of Linked Open Data and discover how rapidly growing access to metadata within the world’s libraries, archives and museums is opening exciting new possibilities for understanding our past, and may help in predicting our future.”

This event is free and open to the public, so register soon. For a sneak peek, check out this slideshow from Voss’s earlier talk.

Chelcie on 26 August 2011 / Comments Off

This just in: the New York Times recently launched Longitude, an interactive map of the day’s news leveraging Linked Open Data, as a featured project of its larger beta620 website.

As described by Evan Sandhaus, its developer, Longitude links NYT subject headings to geographic and corporate or biographical data from Geonames and Freebase:

“When you open Longitude you’ll see a number of “Times T” pins plotted out in a Google Map. The locations for these pins were all derived from Geonames. Click on any pin and you’ll be presented with a pop-up balloon containing a list of the ten most recent, relevant Times articles. But wait, there’s more! For some locations such as Missouri, your balloon will have one or two additional tabs: “Natives” and/or “Companies.” Click on one of these tabs and you’ll be presented with list of locally-born people and locally-headquartered organizations. You can even view Times articles for these people and organizations.”

Read Sandhaus’s pitch for Longitude, in which he also promises future posts about technical details of the app.

Chelcie on 16 August 2011 / Comments Off

In an article for Wired Magazine titled “Why Open Data Alone is Not Enough,” Jesse Lichtenstein acknowledges the data divide and suggests how it could be bridged:

“The concern that open data may simply empower the empowered is not an argument against open data; it’s an argument against looking at open data as an end in itself. Massive data dumps and even friendly online government portals are insufficient. Ordinary people need to know what information is available, and they need the training to be conversant in it. And if people are to have anything more than theoretical access to the information, it needs to be easy and cheap to use. That means investing in the kinds of organizations doing outreach, advocacy, and education in the communities least familiar with the benefits of data transparency. If we want truly open government, we still have to do the hard work of addressing basic and stubborn inequalities. However freely it flows, the data alone isn’t enough.”

Chelcie on 19 July 2011 / Comments Off

Linked Data and Libraries 2011 was held at the British Library Conference Centre in London on Thursday, July 14, 2011. Below find a selection of sessions and slideshows.

View the afternoon session below, or view the morning session here.

On the day of the conference, the British Library also introduced their new approach to publishing the British National Bibliography using linked data practices. Users can now preview the first subset of the LOD BNB, including books published or distributed in the UK since 2005, via the search service, the describe endpoint, and the SPARQL endpoint. Below are slides from a presentation by Neil Wilson, who heads the British Library’s Metadata Services, outlining the process behind creating the library’s LOD model.

Establishing the Connection: Creating a Linked Data Version of the BNB

Play with example records of an organization and a publication from the BNB preview, or check out the data model. To view more slides from Linked Data and Libraries 2011—including contributions from the Library of Congress and the University of Münster—visit the conference’s resource page.
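For readers curious what a request to a SPARQL endpoint such as the one in the BNB preview might look like, here is a minimal sketch in Python. It only builds the request URL and does not contact any server. The endpoint address is a hypothetical placeholder, and the use of the Dublin Core `dct:title` property is an assumption about the data model rather than something confirmed by the preview.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address, used only for illustration; substitute the
# address published with the BNB preview.
ENDPOINT = "https://example.org/bnb/sparql"

# Dublin Core Terms is a real vocabulary, but whether the BNB model exposes
# titles through dct:title is an assumption made for this sketch.
QUERY = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?book ?title
WHERE {
  ?book dct:title ?title .
}
LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL that passes the query in the 'query' parameter,
    as described by the SPARQL protocol."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

Pasting the resulting URL into a browser, or fetching it with any HTTP client, is how one would typically send the query once the real endpoint address is filled in.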

Chelcie on 16 July 2011 / Comments Off

Laura Campbell, CIO of the Library of Congress, delivered the keynote address at the 2011 SemTech Conference in San Francisco, CA. Her talk focused on “how linked data is helping us to do more with less” while managing the Library’s existing collections; maintaining its role as a leader in the distribution of canonical information; and following its mission to collect, preserve, and provide access to a born digital collection.

During her talk, Campbell said she hoped the takeaway would be that “We need to get very clever about new methods of doing our mission, new methods of executing both getting the material and managing it and providing access to it.”

Chelcie on 16 July 2011 / Comments Off

David Weinberger, senior researcher at the Berkman Center, filmed these interviews at the LOD-LAM Summit in San Francisco on June 2-3, 2011.

Want to join the conversation? Respond to the W3C Linked Library Data Incubator Group’s call for public comment on the draft of their report. Feedback can be sent as comments on individual sections posted on their dedicated blog or by email to an archived public mailing list at public-lld@w3.org using descriptive subject lines such as ‘[COMMENTS] “Benefits” section.’

http://youtu.be/swQYX4oqfB4

http://youtu.be/cY-aEuFLryo

Videos via the Harvard Library Innovation Laboratory Blog.

Chelcie on 15 July 2011 / 1 Comment

Via The Signal, LC’s compulsively readable blog on digital preservation:

“We have a vast amount of information on the internet, but we are missing the relationships needed to reach, discover and use this information to its fullest potential. Cultural heritage institutions and gatekeepers of knowledge are looking to provide open, linked data and help to build a better internet. Ed Summers, an Information Technology Specialist for the Office of Strategic Initiatives here at the Library maintains, ‘Linking makes the provenance of the items explicit, which will continue to be important to researchers on the Web. But perhaps more importantly it gives institutions a reason to participate in the project as a whole.’”

Read the full post here.

Chelcie on 15 July 2011 / Comments Off