Taking Care of Data

A couple of things have had me thinking recently about data management in archaeology.

You might have seen the Atlantic’s recent article on the digital collection, curation, and analysis of archaeological data. The article emphasizes the massive size of the datasets being collected, particularly with digital methods, and it highlights a few points that will be familiar to archaeologists: we work and think at various scales, many of us are invested in new technological approaches to data, and often whatever documentation we can produce and preserve is all that will remain when the original record is destroyed by the process of our research (or by war, terrorism, or climate change). The article cites projects with data points that are apparently in the billions because of digital techniques, but of course our datasets can become unwieldy even with traditional methods once you take into account decades of research at a site or investigate questions across broad geographic areas. The article speaks to both the research potential of massive datasets and the logistical challenges they can pose at all levels.

I thought of this article during a meeting of a class in “Responsible Conduct of Research and Scholarship.” The class is a new department requirement related to federal research funding, so the inclusion of data management is no surprise if you consider the increasing attention paid to this component of NSF grant proposals. In the first session we touched on ways to plan for data management early on in a research project, whether that means selecting stable file formats or anonymizing informant information (for those anthropologists who work with the living). This can be especially challenging for graduate students; many of us are planning the first project that we will execute independently and bring from its earliest stages through to the end. What steps do we need to take to anticipate the management of data that we have yet to collect, and which will likely end up taking a different form than we expect when we first formulate the project?

Very thoroughly supervised excavation at Weeden Island, FL, Dec 2015

The third thing that brings me to this topic is my own research. Having finished my fieldwork in December, I am now committing most of my time to lab-based sorting and analysis, along with organizing field notes, photographs, and databases. At the same time I am writing proposals, revising my three-year life plan every other week, and otherwise trying to stay in touch with the big picture. Trying to balance these drastically different conceptual and practical scales really makes it clear how much effort can go into managing all the details of a project and the data it generates, and how critical it is to do that well in order to transition smoothly to analyzing and synthesizing those results, and then to making them available in a form that could be useful to others.

If I were starting all over tomorrow, I can think of (at least) a few things I would do differently with regard to record keeping and planning for database management. I think some big challenges for graduate students directing research are accurately estimating the scale and volume of data that will result, and developing systems of organization that will continue to make sense if strategies for sampling evolve over different phases of the project. Most of my work in this area has depended not on formal training but on observing the practices of other projects, remembering things that were difficult when I’ve worked with other datasets, and spending hours fiddling around with my tables in Access.

Boxes of excavated material on their way to becoming data

I have been thinking a lot about revision in writing lately, and perhaps there are some relevant comparisons and contrasts between writing and building databases. A first draft of a written work very often needs to be “re-envisioned” to be improved, perhaps through reworking its structure and reconsidering what information it is meant to convey. Many writers benefit from the feedback of readers as they move through revisions of written work; is this true for “data work” too? I know that each time I have had reason to share some portion of my preliminary dissertation data, it has forced me to refine the organization a bit, to check that my coding and conventions are accessible to another person, and to otherwise revise my database. But the structure of a database can be difficult or impossible to change once a project is really underway, in part because strategies for data collection are usually conceived along with plans for data management.

Data collection teamwork at Weeden Island, FL, Dec 2015

As for making my data accessible after I finish my current work, I expect to include many appendices in my dissertation, but also to archive materials digitally with The Digital Archaeological Record (tDAR). I initially looked into the terms and requirements for tDAR to fulfill a requirement, but doing so prompted me to think about how archived data really gets used. I haven’t personally undertaken any serious work with the archaeological data that has recently been archived digitally and made accessible (e.g. the Digital Index of North American Archaeology), although I have made use of other relevant types of data available online, like NOAA’s coastal LiDAR. Seeking out and incorporating more of these available data resources will be a future goal of mine.

Archaeologists are always thinking about long time scales, the durability of materials, and the transmission of knowledge. Even so, there can be some disconnect when it comes to maintaining our own records in a way that will be readily accessible and understandable for future researchers. Graduate students out there, is this something you’re being trained in before delving into your research? What experiences do you have working with more novel forms of data collection, management, or archiving? Looking beyond the data you collect yourself, what ways have you found to work with the data that’s already available in digital archives?

SEAC Aboveground: Ethical Conduct Panel & Other SAC Events in Nashville

At last year’s meetings, the SEAC Student Affairs Committee (SAC) hosted a panel discussion on issues related to gender bias. The group discussed sexual harassment in archaeology; publication rates for men and women; workplace expectations about women, pregnancy, and child care; and other topics.

I was impressed by the honesty and public nature of this event. I have loved SEAC meetings since I first joined, but this panel said a lot about the community and was one reason I wanted to get more involved by joining the Student Affairs Committee. The stories and data that people shared were often terrible and, as a whole, disappointing—but the existence of the panel and the participation of SEAC leadership and so many of its members was also encouraging. Between the panel and active audience participation, we heard from faculty who were permanent and temporary, junior and senior, male and female; students with varied experiences and concerns; government and private CRM archaeologists at different stages in their careers. This panel was a prominent event at last year’s conference, and the discussions it promoted are still ongoing.

In particular, SAC will host a follow-up panel at this year’s conference, on Friday, November 20 (1:45-4pm). The purpose will be to work towards specific policies regarding appropriate behavior in the various settings where archaeologists work and learn. What constitutes sexual harassment in the field, classroom, or laboratory? Should SEAC adopt a code of ethical conduct that makes specific reference to sexual harassment and gender disparities? What is the role of our organization in preventing harassment and discrimination? This year’s event will be a collaborative discussion of these questions and others, featuring presentations from speakers who are knowledgeable about different aspects of these issues.

On behalf of SAC I want to encourage you to attend this year’s event. I also want you to know that you can start participating in this discussion today. SAC has created an anonymous Google survey as one way that SEAC members can submit questions and topics for the panelists to consider. You can also post to the SAC Facebook page or via Twitter (@SEAC_SAC – use #SEACethics).

There will also be a few events in Nashville focused specifically on students. As always, SAC will host a student reception (4-6pm on Thursday, November 19), but this year we have also organized an event-within-the-event: a student meet-and-greet where undergraduates can meet up with graduate students to ask questions and receive some spontaneous mentoring. Sign up now if you’re interested! No sign-up is required for the reception itself; just show up at 4 for free snacks and beverages.

The student luncheon (12-1:30pm on Friday, November 20) will focus on how to craft and polish your CV and cover letter for the public and private sectors and for different types of academic jobs. As of this writing, we are very close to capacity for this event. You may still be able to sign up now, and for those of you who are already attending, I’ll see you there!

I just put together my own SEAC 2015 schedule to make sure I’d have time to fit everything in. Of course I’m planning to attend all of the SAC events, but I’m also excited for a lot of talks about shells, fish, and pottery!

What are you looking forward to at this year’s conference?