Digital Preservation and Copyright by Peter Hirtle 2021

If all information in the world were written on clay tablets or carved into marble, its preservation would be greatly simplified. Even paper, when manufactured and stored properly, can have a life measured in hundreds of years. Today, however, much of the information being produced is digital, and digital formats are notoriously fragile. Either the media on which the information is stored becomes unreadable, or the hardware and software needed to read the work becomes obsolete.

Think of that old 8″ floppy disk in the back of the drawer with your attempt from twenty years ago to write the Great American Novel (in WordStar). 

The magnetic data might not still be readable; drives that can read the disk are scarce; and few word processing packages today can understand WordStar documents.

To preserve analog information resources, it is often sufficient to house them in a benign environment. In particularly bad cases, it might be necessary to make a microfilm or xerographic copy of the original, but copying is the exception rather than the rule. 

Digital preservation, however, starts with copying. At a minimum, files need to be copied from obsolete or decaying media, such as 8″ or 5¼″ floppy disks, to current storage media. Good preservation practice requires much more, including making multiple copies of files. Digital documents may need to be changed from WordStar to WordPerfect to Word format, or perhaps even converted to PDF or XML format.
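The basic step of copying a file to more than one location, and confirming that each copy is identical to the original, can be sketched in a few lines of Python. This is only an illustration: the function names, the file name, and the use of SHA-256 checksums are assumptions made for the example, not a description of any particular preservation system.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_preservation_copies(source: Path, targets: list[Path]) -> str:
    """Copy source into each target directory and verify every copy.

    Returns the checksum of the source; raises IOError on a mismatch.
    """
    expected = sha256_of(source)
    for directory in targets:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)  # copy2 also tries to preserve timestamps
        if sha256_of(copy) != expected:
            raise IOError(f"copy at {copy} does not match the original")
    return expected


if __name__ == "__main__":
    # Demonstration with throwaway files (hypothetical names); a real
    # workflow would point at separate physical storage locations.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "novel.ws"
        original.write_bytes(b"Chapter 1. It was a dark and stormy night.")
        checksum = make_preservation_copies(
            original, [root / "copy_a", root / "copy_b"]
        )
        print("verified copies, sha256 =", checksum)
```

Recording the checksum alongside the copies lets you rerun the comparison later and detect silent decay of the storage media. Note that every one of these steps is a "reproduction" in the copyright sense.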

Every time you use a digital file, you must copy it. When digital documents are displayed on a computer, they are copied from the storage medium into the computer's RAM, from which they are then displayed. Digital preservation and access are all about copying.

In copyright law, copying is known as “reproduction,” and it’s one of the exclusive rights of the copyright owner.[2] The right to publicly display a work is also an exclusive right of the copyright owner,[3] as is the right to make an adaptation, known as a “derivative work.”[4] Our desire to keep digital information around for the future runs smack into the exclusive rights of the copyright owner.


Fortunately, while there is no general exemption for preservation activities in copyright law, there are exemptions that can help individuals and especially libraries and archives legally preserve expressive works for the future. There are some specific exemptions for certain types of actions and for certain actors. Furthermore, in the absence of a specific exemption, one can always consider fair use as a defense when making a preservation copy.


Is it copyrighted? Do you own the copyright?


Even before you start looking for exemptions in the copyright law, it is always a good idea to check first to determine if an item really is copyrighted. Since there have not been any registration or notice requirements for copyright protection since 1989, most digital information is copyrighted as soon as it is created. But there are exceptions. Works created by the federal government are in the public domain, as are works whose copyright has expired. Digital copies of public domain works may themselves be in the public domain.

You also don’t need to worry about legal restrictions on preservation if you own the copyright in the work. The copyright in your draft of the Great American Novel most likely belongs to you, and you can do with it what you want. The same goes for the digital photographs you took on vacation last summer.


Let’s assume, though, that what you are interested in preserving is copyrighted and that you do not own the copyright in the work. 

What then? There are at least three specific sections of the copyright law that may be of assistance.

If the digital file you are interested in saving is a computer program, 17 USC § 117 of the United States copyright law can help. 

This section states that in spite of the copyright owner’s exclusive rights, it is permissible for you to make a copy for archival purposes of a copyrighted computer program. 

A computer program is defined in the law as “a set of statements or instructions to be used directly or indirectly in a computer in order to bring about a certain result.”

The law allows you to make a copy of the WordStar program (if you legally own it), and even adapt it to run on your Windows XP or Linux machine (if you can), but not share the file with anyone else. The section only applies to the computer program itself. 

It does not authorize the reproduction or adaptation of documents created with WordStar when the copyright in those documents is owned by someone other than you.

Libraries and archives have additional preservation options under 17 USC § 108 of United States copyright law. 

One of the few good things included in the Digital Millennium Copyright Act (“DMCA”) was a provision that explicitly allows libraries and archives to make up to three copies of a work for preservation purposes. Unlike the rest of the provisions of Section 108, the items being preserved can be in any format (text, images, sound, etc.). Furthermore, the copies can be digital, so long as they are neither distributed digitally nor made available to the public in a digital format outside the premises of the library or archives.

In order to take advantage of the exception, libraries and archives must follow certain ground rules. They must be either open to the public or allow access to non-affiliated researchers; the copying cannot be for “direct or indirect commercial advantage”; the library or archives must own a legal copy of the original item; and any copies made must carry with them a notice of copyright.

If the work is unpublished, preservation copies can be made for the purpose of preservation or security.

If the work is published, preservation copies can be made to replace an original that is “damaged, deteriorating, lost, or stolen, or if the existing format in which the work is stored has become obsolete.” The law stipulates that a format is obsolete “if the machine or device necessary to render perceptible a work stored in that format is no longer manufactured or is no longer reasonably available in the commercial marketplace.”

The library or archives must also conduct a reasonable investigation to confirm that an unused copy cannot be obtained at a fair price. If digital copies are made, access to the digital version must be limited to the premises of the library or archives.

Using Section 108, libraries and archives can start preserving old digital files in their collections. It does not help them, however, preserve materials that they do not own, such as networked resources or Web sites. Nor does Section 108 help individuals who want to preserve digital files they may have legally acquired or obtained from the Internet. For this sort of preservation, we must rely on fair use.

Since individuals cannot use Section 108 to make copies, even for preservation purposes, they must turn to the Fair Use provision in US copyright law. Mary Minow provides a highly readable overview of fair use in “How I learned to love FAIR USE…”

At the heart of the fair use exemption is the assessment of the four factors that constitute fair use: Purpose of the use, Nature of the work, Amount or substantiality used, and Market impact (PNAM). What might a fair use argument for digital preservation look like?

It is likely that most preservation copying would meet Minow’s PNAM test. As Robert Oakley has noted about preservation copying in general:

Virtually everyone views preservation copying as socially beneficial. It is consistent with the Constitutional purposes for copyright since the preservation of printed knowledge is necessary for the progress of science and the useful arts.

If preservation is being done for non-commercial, socially beneficial reasons, it seems likely that the “Purpose” factor would lean towards fair use.

The nature of digital works, the second fair use factor, can vary greatly, but Congress seems open to preserving a wide variety of material when preservation is at stake.

The “Nature” factor, then, might also support a fair use.

The third factor, the Amount and substantiality copied, might normally weigh against a finding of fair use, since the item is being copied in its entirety. But the Supreme Court has noted, “the extent of permissible copying varies with the purpose and character of the use.”

Obviously, if the purpose is to preserve a work, then the entire work must be copied. The amount copied is appropriate for the purpose, and so a court might even find this use fair.

The fourth factor, the Market impact of making a preservation copy, is likely to be the most important in any fair use assessment, and unfortunately it is almost impossible to guess how a court might rule on this. Would the courts conclude that digital information is like the computer programs protected by 17 USC § 117, which can be migrated and adapted to run on new platforms without compensation to the copyright owner? Or would the courts conclude that purchasing a copy of a work does not give you the right to copy it onto new media or transform it into new formats in perpetuity?

Would they decide that individuals, like libraries copying under 17 USC § 108, must first determine if an unused copy can be purchased before a preservation copy can be made? Unfortunately, there have been no cases involving digital preservation that can serve as indicators of how the courts might rule.

As is always the case with fair use, you can’t really know if your use is fair until a court determines if it is fair. Nevertheless, when considering the preservation problems of motion pictures, the Senate concluded that given the great danger of loss, “making of duplicate copies for purposes of archival preservation certainly falls within the scope of ‘fair use.’”

We can hope that the courts might accept a similar argument for equally fragile digital information.

As the World Wide Web has become an ever-more important information resource, there has been growing interest in preserving parts of it. Web site preservation projects have documented flooding in the Red River Valley, the national election in 2000, and the response to the events of 11 September 2001, and have captured the web pages of the Clinton White House.

The Internet Archive has sought to capture and preserve a sizeable portion of the entire World Wide Web. Other projects have sought to preserve the national Web of Sweden and the Nordic countries. On a more local scale, many universities and other organizations are beginning to wonder how they might capture and preserve Web pages associated with, but not necessarily owned by, them.

Most information found on the Web is automatically copyrighted when created. The groups that want to preserve Web pages, however, are often not the copyright owners of those Web pages. We can presume that the copyright owner has granted an implied license to allow people to copy a Web page to a local machine and display it there; after all, if they did not want people to be able to read a page (which in the Web environment means making a temporary copy on your local machine), they would not have put the document up on the Web. But is there implied permission to copy and preserve Web pages whose copyright you do not own? If not, can such actions qualify as a fair use?

The most ambitious attempt to preserve the Web is the Internet Archive, whose Wayback Machine allows you to retrieve outdated Web pages from multiple points in time.

The Internet Archive has attempted to bolster a possible fair use defense in a number of ways. First, it allows Web page producers to opt out of the archives. It does this by offering instruction on how to use a robots.txt file to prevent its crawlers from retrieving new pages. The presence of a robots.txt file will also prevent Wayback Machine users from accessing previously harvested pages. In addition, under certain conditions the Internet Archive will remove material from its holdings.
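To make the robots.txt opt-out concrete, here is a minimal sketch of how a crawler might check a site's robots.txt before retrieving a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the `ia_archiver` user-agent name, the `may_archive` helper, and the example URL are all assumptions made for illustration; this is not the Internet Archive's own code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a site owner might publish to opt out.
# The "ia_archiver" user-agent name is an assumption for this example.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in-memory text, no network
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"
    # The crawler named in the file is excluded from the whole site.
    print(may_archive(ROBOTS_TXT, "ia_archiver", page))
    # A crawler not named in the file is unaffected by this rule.
    print(may_archive(ROBOTS_TXT, "SomeOtherBot", page))
```

A well-behaved archiving crawler would skip any page for which such a check returns `False`. The check is purely voluntary: robots.txt expresses the site owner's wishes but does nothing technically to stop a crawler that ignores it.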

The Internet Archive’s willingness to respect the wishes of those copyright owners who want to limit and control the reproduction of their copyrighted works reduces the Archive’s risk of infringement suits. At the same time, it diminishes the utility of the archive as a whole by excluding important parts of the Web. For example, eBay is one of the most successful commerce initiatives in Internet history.

The use policy for eBay stipulates that users must not “use any robot, spider, scraper or other automated means to access the Site for any purpose without our express written permission.”

Any Web archiving initiative that respects the terms of eBay’s policy will not capture the eBay site, diminishing the value of the archive as documentation of the history of the Internet.

It is unclear how much protection the Internet Archive’s policies really provide. As one recent analysis concluded:

In short, the Internet Archive largely ignores copyright law in the process of collecting its material, provides only a limited (and arguably effectively valueless) protection for the material once stored, and in effect disclaims any responsibility for what is done with the material by the end user, as well as any liability that the end user may incur in accessing the material.

Given the litigious nature of the US, it will be interesting to see if the Internet Archive’s success in avoiding litigation over its activities will continue for much longer.
