Posts tagged ‘Linked Data’

A sixth-century mosaic map on the floor of Saint George church in Madaba, Jordan. Photo via Wikimedia Commons.

In the sixth-century city of Madaba, Jordan, a mosaic floor in Saint George’s church visually unites sacred space with the wider world: Jerusalem, Bethlehem, the Nile River, and the Dead Sea are all represented in stone tesserae. The map suggests both devotion and action. Christian visitors who caught a glimpse of this twenty-foot-wide mosaic could direct their prayers to God, and the church itself would direct the pilgrims to the Holy Land. Its scenes include cityscapes and houses, bodies of water with fish and boats, and people at work. The mosaic was designed from a combination of bird’s-eye views and non-linear perspective, letting the visitor’s visual experience shift between omniscient narrator and active participant. According to tradition, Moses gazed at the Promised Land from the top of nearby Mount Nebo. Art historian Antony Eastmond surmises that the Madaba map allowed pilgrims to reenact Moses’ experience by standing in the church and gazing down at the floor map’s image of the Holy Land spread out before them.[1] Maps like this are participatory by nature: they forge an intangible link between the viewer’s body and physical experience and a wider, imagined world.

Bridging the gap between tangible, ancient cartographic arts and digital manifestations of them is Measuring and Mapping Space: Geographic Knowledge in Greco-Roman Antiquity, curated by Roberta Casagrande-Kim and Tom Elliott. This exhibition at ISAW (NYU’s Institute for the Study of the Ancient World) explores the work of Greek and Roman geographers, known to us mostly through written sources and copies in medieval and Renaissance manuscripts. The show also includes other objects, such as coins with depictions of places, and juxtaposes them with digital components, resulting in a thoughtful and multi-faceted exploration of humans’ relationships with physical space.

Screenshot of The Digital Map of the Roman Empire on Pelagios.

Fast forward a millennium or so, and one of my favorite online maps is Johan Åhlfeldt’s Digital Map of the Roman Empire on Pelagios. This is the map that made me love digital maps. It includes the Madaba sites and many more. There’s also Antiquity À-la-carte, an interactive GIS application of the Ancient World Mapping Center. In Antiquity À-la-carte, features such as settlements and churches are elegantly placed on a map of the ancient Mediterranean terrain, demonstrating at a glance how ancient communities were connected by roads and rivers. Zoomable, beautifully crafted with open data, and well-documented, this map represents the kind of project that drives both scholarship and public knowledge. I should note that both of these are part of the Pelagios network, a collaboration of around 30 ancient and medieval studies websites that rely on Linked Open Data.

Screen shot of the "Mapping ORBIS" tab of the website.

A similar project, ORBIS, moves beyond map representation to interactive tool. Essentially a travel planner for the ancient world, ORBIS was created by Walter Scheidel and Elijah Meeks with a team at Stanford University. Rather than simply visualizing the ancient world, it allows users to set variables and make calculations. I’ve used it in my own research to estimate how long it would take a monk to get from Caesarea, Cappadocia, to Constantinople (15.8 days by horseback and sea in the month of June, in case you’re wondering). It’s also a good pedagogical tool, as evidenced by John Muccigrosso’s blog post on teaching archaeology.

These were the maps that hurled me into the digital cartography rabbit hole, making me want to understand how these projects actually work. In a juxtaposition of art and science, cartographic projects reflect the thoughtful decisions and interpretative nuances behind representations of geospatial data.

In future posts, I’ll consider issues and research questions as well as tools for making maps. 


[1] Antony Eastmond, The Glory of Byzantium and Early Christendom (London: Phaidon Press Ltd., 2013), 108.

This two-part post is my follow-up to LAWDI 2012, officially known as the first Linked Ancient World Data Institute. It brought together a multi-disciplinary group of digital scholars at NYU’s Institute for the Study of the Ancient World (ISAW) whose interests encompass the Ancient Mediterranean and Near East. This essay is cross-posted on the GC Digital Fellows blog.*

The Linked Data Cloud as of September 2011.

In preparation for LAWDI 2012, I wrote a post called “Linked Data: A Theory,” pondering the concepts behind Linked Data, but it was clear to me from the beginning that I needed a more sturdy vocabulary and concrete skills in order to put these ideas into practice. This essay explores how Linked Data can be useful to digital scholars with any level of technical experience and, ultimately, why it’s worth the trouble to tackle a new skill set while building a digital project.

Linked Data is a philosophy applied to web development. It incorporates best practices through links (that are both human- and machine-readable) to build connections between projects and data sets. The most effective scholarship acts as a springboard for other researchers who cite the work and build on its ideas. To maintain its relevance, research must be published and shared. The same goes for data collected in support of that scholarship. Linked Data allows institutions and individuals to share resources in order to make data available to many users, all remixing or reinterpreting it to produce new scholarship.

Linked Data is often incorporated into conversations about Open Access, a crucial movement intended to counteract academia’s traditional exclusionary practices by making scholarship freely available to the public. It is also frequently associated with the Open Source movement, referring to projects in which the source code is freely available so that another developer can use that project as the basis for another, often tweaking or adapting it in new ways in a process called forking. This code is often deposited on sites like GitHub for sharing.

The inherent collaborative nature of Linked Data underscores the fact that “links” and “networks” are most useful when they refer to people as well as data. LAWDI has been particularly productive because it brings together people and organizations whose data sets have a good chance of being useful to one another.

LAWDI reading assignments provide a good overview of Linked Data concepts for readers who are familiar with developing or overseeing digital projects. The World Wide Web Consortium (W3C) is a network of organizations, led by Tim Berners-Lee (inventor of the World Wide Web), that develops and recommends standards for best practice in building the Semantic (linked) Web. The W3C website is considered by many to be the gold standard for web-related definitions and explanations. The information below is intended to precede those readings with a more general overview of how those projects are constructed.

Open Data

Linked Data connections are built with Open Data that has been made freely available to reuse or remix. For instance, Europeana and the Digital Public Library of America are huge repositories of data that have made their content available to everyone. (The opposite of Linked Data, by the way, is a “data silo,” a repository that isn’t linked or shared and is unavailable except to those with exclusive access).

Making data open is the first step toward creating Linked Data. It’s essential, of course, to determine rights information before publishing or sharing data. Archaeologists usually have permission to collect and publish their own findings; bibliographies can generally be shared; museums and archives can choose to share their own collections. Work still under copyright, however, generally shouldn’t be shared. Adopting a Creative Commons license is one way to signal that data is available to others who may want to use it. Beyond being available, data must also be structured in such a way that others can use it, preferably from the initial stage of data entry or digitization.

How the Web Operates

At the front end of a website, when the user types a URL (Uniform Resource Locator) into the browser’s address bar, a website is displayed in the browser window. The browser does this by calling up the website’s data from a server and translating it into the visual elements displayed on the page.

To fully integrate Linked Data into a project, it is necessary to understand how a digital project is constructed from the ground up. In a nutshell, URIs identify things; RDF describes those things; RDF works within a framework called XML; XML works with HTML; HTML sends information to your browser. In more complex terms, each of these elements operates at the back end of the website to form a series of relationships.

The building blocks of the Semantic Web are URIs (Uniform Resource Identifiers). These are the names of things described on a website. A URI is a string of characters that identifies a resource, often expressing the thing’s filename and/or the path to its directory.

A URI always begins with a scheme name followed by a colon and then the remainder of the URI. The scheme name identifies the protocol used to access the resource. For instance, a URI whose scheme is “http:” points to a web resource, and one that starts with “ftp:” points to a resource on an FTP site. Ideally, these URIs should be created from scratch, and not automatically generated by a Content Management System.
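To see scheme names in practice, here’s a minimal Python sketch using the standard library’s urllib.parse (the example URIs are made up for illustration):

```python
from urllib.parse import urlparse

# The scheme (the part before the first colon) tells a client how
# to retrieve the resource; the rest of the URI locates it.
examples = [
    "http://www.example.org/index.html",
    "ftp://ftp.example.org/pub/data.csv",
]

for uri in examples:
    parts = urlparse(uri)
    print(parts.scheme, parts.netloc, parts.path)
# → http www.example.org /index.html
# → ftp ftp.example.org /pub/data.csv
```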

A “Cool URI” is an identifier that never changes: the domain name of the website is stable, and the data isn’t moved, erased, or altered. This is important because anyone who links to the data needs to be certain that the link will remain useful in perpetuity. A “clean URI” describes the item as simply as possible, without superfluous characters or confusing symbols. The Digital Classicist website lists Very Clean URIs with no “cruft,” a catch-all term for strings such as “.cgi”, “.php”, “.asp”, “?”, “&”, “=” and similar characters.
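A quick way to spot cruft programmatically is a simple substring check. This is just an illustrative Python sketch, not an official test (the cruft list follows the characters mentioned above, and the example URIs are hypothetical):

```python
# Strings that mark a URI as "crufty": implementation details and
# query-string noise that may change when the site's software changes.
CRUFT = [".cgi", ".php", ".asp", "?", "&", "="]

def is_clean(uri: str) -> bool:
    """Rough check: a clean URI contains none of the cruft strings."""
    return not any(token in uri for token in CRUFT)

print(is_clean("http://example.org/places/madaba"))           # → True
print(is_clean("http://example.org/show.php?id=42&lang=en"))  # → False
```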

URIs are the basis of a data model called RDF (Resource Description Framework). In RDF, data is modeled as triples (statements expressing three ideas) and serialized in formats that expose it to machine readers as Linked Data. This data is intended to express relationships between people and/or things, using a controlled vocabulary for consistency among various projects and institutions.

Here’s an example from W3C’s RDF primer: An object whose URI is http://www.example.org/index.html has a creator named John Smith.
The RDF expression of this sentence would structure the data into three ideas:

a subject       http://www.example.org/index.html
a predicate     http://purl.org/dc/elements/1.1/creator
an object       http://www.example.org/staffid/85740

Note that each of these ideas can be expressed with a URI.
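To make the triple structure concrete, here’s a minimal Python sketch that stores the statement above as a (subject, predicate, object) tuple and queries it. In a real project a dedicated RDF library would handle this; the point here is just the shape of the data.

```python
# Each RDF statement is a (subject, predicate, object) triple of URIs.
triples = [
    ("http://www.example.org/index.html",
     "http://purl.org/dc/elements/1.1/creator",
     "http://www.example.org/staffid/85740"),
]

def objects_for(subject, predicate):
    """Return every object linked to a subject by a given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects_for("http://www.example.org/index.html",
                  "http://purl.org/dc/elements/1.1/creator"))
# → ['http://www.example.org/staffid/85740']
```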

RDF is a framework written in XML** (Extensible Markup Language). Markup languages are systems for annotating data that convey information about an item or instruct the software or web browser on what to display. All RDF triples written in XML are designed to describe data (by marking it up with machine-readable tags) to work in tandem with HTML (HyperText Markup Language), which displays data on the web. Technically speaking, a webpage is an HTML document.
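For a sense of what that markup looks like, here’s the creator triple serialized as RDF/XML and read back with Python’s standard-library XML parser. The exact markup is my own illustrative rendering, not quoted from the W3C primer:

```python
import xml.etree.ElementTree as ET

# The creator triple serialized as RDF/XML: machine-readable tags
# describe the resource rather than display it.
rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
  </rdf:Description>
</rdf:RDF>"""

# ElementTree expands namespaced names into {namespace}local form.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DC = "{http://purl.org/dc/elements/1.1/}"

root = ET.fromstring(rdf_xml)
desc = root.find(RDF + "Description")
print(desc.get(RDF + "about"))                          # the subject
print(desc.find(DC + "creator").get(RDF + "resource"))  # the object
```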

By thoughtfully crafting clean URIs and incorporating them into RDF, a developer can facilitate Linked Data according to Tim Berners-Lee’s four “expectations of behavior” that are nicknamed the Four Rules for Linked Data. Quoted from his site, they are as follows:

  • Use URIs as names for things
  • Use HTTP URIs so that people can look up those names.
  • When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
  • Include links to other URIs. so that they can discover more things.

Berners-Lee calls these “simple.” Perhaps. But they’re not common sense. This leads to the question of whether creating Linked Data is advisable or even possible for a non-developer, an institution with limited resources, or a solo researcher.

In Part 2, I’ll discuss some options for creating Linked Data on a small scale and ways that existing Content Management Systems can be tweaked to be more Linked Data-friendly.

 

*This post represents a year of stumbling through data in ongoing efforts to become more digitally literate, an adventure supported by a GC Digital Fellowship and participation in the New Media Lab at the CUNY Graduate Center. My heartfelt thanks go out to all of the LAWDI 2012 presenters and participants, particularly Sebastian Heath and Chuck Jones of ISAW who have continued to help me aim for a LAWDI-friendly dissertation, and to Andrew Reinhard of ASCSA for keeping up Lawdite momentum. I’m also grateful to Aaron Knoll, former project advisor in the New Media Lab and overall good egg, for helping with an early draft of this post. Stephen Klein, Digital Services Librarian at the CUNY Graduate Center, has provided links and advice about sustainability. Conversations with Jared Simard about Mapping Mythology and Omeka are always helpful. Matt Rossi is an excellent writing consultant. Flaws and omissions are mine, of course, but it does, indeed, take a village to link data.

**UPDATE 9/13: In the original post, I stated that “RDF works within a framework called XML,” which could be considered an outdated view because linked data can be produced in a variety of formats. The RDF/XML model explained in this post is one example, but not the only option. For instance, RDF can also be written in JSON (JavaScript Object Notation). For a more technical account of RDF use and best practice, see W3C’s RDF primer.

Thanks to Kingsley Uyi Idehen (@kidehen), Hugh Cayless (@hcayless), and Sebastian Heath (@sebhth) for a lively twitter exchange on this topic when Kingsley pointed out that “Linked Data is format agnostic. Basically semantics, syntax, and encoding notations are loosely coupled. No XML or JSON specificity,” (tweet: 6 June from @kidehen).


Çariklı Kilise in Göreme. Photo: Horst Hallensleben, via University of Vienna/Europeana

Europeana, a database incorporating many of the European Union’s cultural heritage collections, has added a number of Cappadocian monuments to its vast holdings.

A search for “Göreme” yields over 200 results. Note that you do need to use the Turkish spelling (i.e. include the umlaut over the ö). Most of these are recent additions from the Hallensleben collection, provided by the University of Vienna.*

 

*Many thanks to Fani Gargova, Byzantine Research Associate at Dumbarton Oaks for bringing this to my attention.

 

 

Today marks the second annual Day of Archaeology, an online event dedicated to sharing the diverse activities of archaeologists throughout the world. Hundreds of them will share their own “day in the life” stories. These posts demonstrate the wide range of activities that archaeologists and their teams do to gather data and publish their findings. From digging in the dirt to building databases and from making coffee to organizing museum exhibitions, these tasks bring to light the myriad ways that basic archaeological information makes its way into public knowledge.

Ostensibly, ancient and medieval dwellings in Cappadocia would be excellent archaeological sites. But most complexes were carved directly out of the rocky landscape, so there’s no real need to dig. Instead, archaeological surveys have been employed to great effect. For instance, Robert Ousterhout and his team surveyed the area near Çanlı Kilise, overturning the long-held belief that it had been a monastery. Veronica Kalas has published a survey of the settlement at Selime-Yaprakhisar, also arguing that secular settlements were a part of Byzantine society in Cappadocia. See the Documenting Cappadocia bibliography for a list of publications.

One of the more useful ways of viewing archaeology in terms of Cappadocia, then, is its inherent interdisciplinary nature. Archaeologists may be trained as art historians, anthropologists, historians, and IT practitioners. They are also quick to point out that there’s a difference between sharing data and producing scholarly work, and they are often involved in interpreting and disseminating their findings to both academic readers and the public at large. As I develop this site, archaeological projects continue to be an inspiration. At the Linked Ancient World Data Institute, archaeologists proved to be innovative participants in the Linked Data network, advancing digital publishing and database building in ways that build foundations for further research. I hope that the data shared on this site will launch others into creating scholarly work as well.

I am excited that Documenting Cappadocia and I have scored an invitation to the Linked Ancient World Data Institute (LAWDI) this weekend, hosted by New York University’s Institute for the Study of the Ancient World. Linked Open Data is a method of structuring digitally published information so that it is stable and easily linked, enabling better-networked scholarly resources.

Initial reading assignments for the event reveal that data on the web are interrelated. Among the discussions of programmers, academics, librarians, and other practitioners of Linked Open Data is the question of how those relationships are best conveyed by stable identifiers. Tom Scott has considered the difference between using identifiers such as URIs to distinguish web documents from the real-world things those documents represent. Ed Summers responded with a call for common sense–of course people know the difference between a real-world object and a document about it on the web, but it’s still possible to maintain that distinction. Mike Bergman has taken the debate into the philosophical realm, pondering the responsibilities and implications of the practice of naming, of developing a vocabulary in the real world or in terms of identifying data.

The debate within the Linked Data community revolves around relationships between things and documents. There’s an ontological element to this discussion–these objects seem to have a life of their own, making their way around the internet, holding on to metadata and links that people attach to the documents before sending them on their way. From a medieval art historian’s perspective, this is similar to thing theory, the study of the life and existence of objects that have an agency of their own. Relics and icons functioned this way in Byzantium, working miracles, for instance. Pilgrimage souvenirs were thought to actively protect the pilgrim. Gems could conquer thirst, or blue charms could ward off the evil eye, without necessarily needing to be activated by a person every time. Some of these objects were made by people, but all of them had an agency that continued separately from their creators.

So this is my theory: Linked Data has a life of its own. Although humans initially put the data out into the world, it can then demonstrate relationships between objects and documents, and it can convey information through metadata without further human intervention. As practitioners of Linked Data, it’s up to us to send that data out into the world with adequate links and thoughtful, stable identifiers.