Respondents who use Greenstone to develop digital library collections were asked to describe (a) how access to the collections is provided, (b) the contents of the collections, and (c) the primary target audiences for the collections.
Access to Collections
Sixty-seven respondents indicated how access is provided to the collections they build using Greenstone. Of these, 76.1% (51 respondents) provide access via the Internet; 61.2% (41 respondents) through local networks; 26.9% (18 respondents) through DVD/CD-ROM; and 4.5% (3 respondents) through other means. The other means indicated by respondents were intranets (2 respondents) and local client stations (1 respondent). One respondent also indicated that a CD had been planned but was awaiting funding, and another that access provision had not yet commenced.
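The percentages above can be recomputed from the reported counts. The short Python sketch below does so, assuming each percentage is the count divided by the 67 respondents who answered the question. The names `access_counts` and `percent_of_respondents` are illustrative and do not come from the survey. Because the counts sum to more than 67, respondents could evidently select more than one access method, so the percentages are not expected to sum to 100.

```python
# Counts reported in the text for the access question (67 respondents answered).
N_RESPONDENTS = 67
access_counts = {
    "Internet": 51,
    "Local networks": 41,
    "DVD/CD-ROM": 18,
    "Other": 3,
}


def percent_of_respondents(count, n):
    """Return count as a percentage of n respondents, rounded to one decimal."""
    return round(100 * count / n, 1)


# Percentage of respondents selecting each access method.
access_percents = {
    method: percent_of_respondents(count, N_RESPONDENTS)
    for method, count in access_counts.items()
}

# Total number of selections; exceeds N_RESPONDENTS when multiple choices are made.
total_selections = sum(access_counts.values())

for method, pct in access_percents.items():
    print(f"{method}: {pct}%")
print(f"Total selections: {total_selections} from {N_RESPONDENTS} respondents")
```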
Respondents were also asked where their collections are hosted. Of the 69 participants who answered this question, 69.6% (48 respondents) indicated that their collections are hosted on one or more local servers within their organization; 10.1% (7 respondents) on an external server; 14.5% (10 respondents) on both local and external servers; and 5.8% (4 respondents) selected “other”. Other hosting configurations included local client machines (2 respondents).
Number of Collections
Respondents were asked, in a free-text question, how many production and test collections they (or their organizations) had developed. Forty-eight respondents gave quantifiable answers for production collections, and thirty-nine for test collections. While the number of collections can be interpreted as a reflection of the content made available via digital libraries by an organization or individual, it may also be seen in part as a function of administrative and software implementation decisions.
A wide range in the number of both test and production collections was reported. The number of production collections reported by respondents ranged from 0 to “greater than 50”. The number of test collections reported ranged from 0 to “greater than 65”. The median number of production collections was 4.0, as was the median number of test collections. The mode for each type of collection was 1.
The survey included several questions that asked respondents to describe the content of their collections. The descriptions requested include a broad subject classification; format type (i.e., video, audio, text, photographic images, other images, other); content language(s); file types; and broad collection types (i.e., catalog/index to other resources, special collections, institutional repository, electronic theses and dissertations, journal/other publications, archival collections, museum collections, other).
Collection Subject Classifications
Sixty-seven respondents provided subject-based classifications of their collections. Twenty-one respondents indicated subjects other than those provided in the survey, either in addition to or instead of the provided subjects. These write-in subjects were coded to correspond with the provided subjects in order to achieve a broad subject classification across respondent answers. Although Local interest (30 responses), Social Sciences (30 responses), and Humanities (29 responses) were represented more often than other subjects, all subjects were represented. Frequencies and percentages of responses are shown in Table 1 below. Note that respondents were instructed to select as many subject categories as necessary to describe their collections.
Table 1. Subject Classifications of Collection Contents (N=67).
|Computer Science / Engineering||11||16.4%|
Language(s) of Collection Content
Sixty-two respondents indicated the language(s) of collection content. Of these respondents, 59.7% (37 respondents) indicated that the collections they build are in one language, and 40.3% (25 respondents) that they build collections in two or more languages. Sixty-one of the sixty-two respondents who answered this question indicated which language(s) their collection content is in. Of these, most (83.6%) have English language content. Spanish (19.7%) and French (13.1%) were indicated fairly frequently. It is important to note that this survey was only available in English. In total, 33 unique languages were mentioned, which are listed in alphabetical order in English below.
Collection Content Format Types
Text was the most prevalent format type indicated by respondents (92.6%; N=68). About two-thirds (69.1%) of respondents indicated that their collections contain photographic image files, and about a third indicated each of three other format types: “images other than photos” (35.3%), audio (35.3%), and video (30.9%).
Collection Content File Types
The majority of respondents indicated inclusion of the following file types in their collections (N=68):
- PDF files: 85.3%;
- Image files (JPEG, GIF, etc.): 75.0%; and
- HTML files: 51.5%.
In addition to these very common file types, respondents indicated that many other file types are also included in their collections. A full list of file types selected by respondents appears in Table 2 below. Other file types that were not listed but were indicated in a text response option were: mail files, OGG video files (Theora/Vorbis), OpenOffice.org files, MPEG video files, Flash (video and audio) files, Dublin Core metadata files, and METS/ALTO XML metadata.
Table 2. File Types Included in Collections (N=68). The frequency and percent of respondents who indicated each file type as included in one or more of their collections is listed in decreasing order.
|Image files (JPEG, GIF, etc.)||51||75.0%|
|MS Office files (Excel, PowerPoint, Word, etc.)||28||41.2%|
|MP3 audio files||26||38.2%|
|Plain text (.txt)||23||33.8%|
|ISIS database files||13||19.1%|
|Open Archive data||9||13.2%|
|Compressed files (tar, jar, zip, gzip or bz)||7||10.3%|
|METS files (Greenstone format)||6||8.8%|
|DSpace archive format||3||4.4%|
|Source code (C/C++, Perl, Shell)||3||4.4%|
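As a consistency check on the rows of Table 2 shown above, the following Python sketch recomputes each percentage from its frequency and confirms that the rows are in decreasing order. It assumes that each percentage is the frequency divided by N=68 and rounded to one decimal place. The variable and function names are illustrative only.

```python
# Rows of Table 2 as they appear in the text: (file type, frequency, reported percent).
N_RESPONDENTS = 68
table_rows = [
    ("Image files (JPEG, GIF, etc.)", 51, 75.0),
    ("MS Office files (Excel, PowerPoint, Word, etc.)", 28, 41.2),
    ("MP3 audio files", 26, 38.2),
    ("Plain text (.txt)", 23, 33.8),
    ("ISIS database files", 13, 19.1),
    ("Open Archive data", 9, 13.2),
    ("Compressed files (tar, jar, zip, gzip or bz)", 7, 10.3),
    ("METS files (Greenstone format)", 6, 8.8),
    ("DSpace archive format", 3, 4.4),
    ("Source code (C/C++, Perl, Shell)", 3, 4.4),
]


def recomputed_percent(frequency, n):
    """Return frequency as a percentage of n respondents, rounded to one decimal."""
    return round(100 * frequency / n, 1)


# Collect any row whose reported percent differs from the recomputed value.
mismatches = [
    (label, reported, recomputed_percent(freq, N_RESPONDENTS))
    for label, freq, reported in table_rows
    if abs(reported - recomputed_percent(freq, N_RESPONDENTS)) > 0.05
]

# Check that frequencies never increase from one row to the next.
is_descending = all(
    earlier[1] >= later[1] for earlier, later in zip(table_rows, table_rows[1:])
)

print("Mismatched rows:", mismatches)
print("Rows in decreasing order:", is_descending)
```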
Respondents were asked to describe what type(s) of collection(s) they had developed or were currently developing using Greenstone. Respondents were instructed to select from a set of broad descriptive terms and/or provide their own terms to describe the type(s) of collection(s). The table below indicates the frequency with which respondents selected the provided terms to describe their collections. A list of respondent-provided terms follows the table.
Table 3. Terms used to describe types of collections (N=66). Reported frequencies and percents are the number of respondents who indicated a given term to describe collection type.
|Collection Type: Terms||Frequency||Percent|
|Electronic theses and dissertations||20||30.3%|
|Catalog / Index to other resources||18||27.3%|
“Other” terms: Digital Assets Management (for commercial organizations); Student project reports; Teaching and learning documents; News; Articles and Reports; Links to Knowledge objects; Conferences; Question papers; Training modules; eBooks; Bibliographic collections; Specialized web content; Multilingual; Public domain texts; and Original source legal documents.
When asked to identify the broadest audience to whom access to digital library materials is provided, the majority of respondents (62.1%; N=66) indicated the “general public”. Others provide access to affiliates of their organizations (18.2%), to organizational staff only (9.1%), or to themselves and/or other private parties (4.5%). Of those who indicated another level (4 respondents, 6.1%), one provided access to a regional international group of libraries and librarians, one to faculty and students, and two had not yet defined or implemented access policies. One respondent indicated that access is provided for one specific government department only.
Respondents were also asked to select from a set of broad terms to describe characteristics of the primary target communities for their collections or the collections they support. The terms related to educational contexts, age groups (i.e., children/teens, adults, elderly), geographical setting (i.e., urban, rural, suburban), and whether the collections are intended for multilingual audiences. Sixty-eight respondents indicated one or more terms to describe the primary target communities.
Audiences: Collections for Academics and Researchers
Of those who indicated terms to describe the primary target communities for their collections, 60.3% indicated “Academic: Researchers”, 57.4% “Academic: Students”, and 45.6% “Academic: Educators” (N=68). Additionally, two respondents wrote in responses indicating researchers and scientists/engineers as the primary target user communities. Overall, 50 of the 68 respondents (73.5%) described the target audience as related to academia by indicating that academic researchers, students, and/or educators were considered to be one of the primary target audiences.
Audiences: Life Stage Groups
Just over half of the respondents (51.5%) described the primary target audience as “Adults”. The more specific term “Elderly” was selected by 10.3% of respondents, and “Children/Teens” was selected by 7.4% of respondents. In total, 37 of the 68 respondents indicated at least one age group/life stage of the primary target audience(s).
Audiences: Geographical Setting
Fewer respondents (20 of 68) indicated a geographical setting for their primary target audiences. Frequencies and percentages of those who did are as follows (N=68):
- Urban: 19.1% (13 respondents)
- Rural: 16.2% (11 respondents)
- Suburban: 13.2% (9 respondents)
Respondents were also asked to indicate the country or countries in which the primary target audience(s) is/are located. The majority of respondents who answered this question identified one country (70.7%; N=58); and 29.3% indicated multiple countries.
Audiences: Other Demographic Characteristics
The “general public” was indicated as a primary target audience by somewhat under half of the respondents (42.6%; N=68). Primary target audiences were described as “multilingual” by just over a third of respondents (35.3%; N=68).