iGNACiO GARCiA

CACI Int. - Software Developer (Level III)

Web Archiving Management System

The aim of this project is to create a web-based management system (Digiboard) to help the Library of Congress Web Archiving team manage its collections and all the records within them. The Digiboard is a crucial addition that allows automated, application-based management of all the information the Web Archiving team uses daily.

Responsibilities:
   · Design and implementation of the underlying database. (MySQL)
   · Design and development of application code. (PHP, Ajax and JavaScript)
   · Design and development of the application layout. (XHTML and CSS)
   · Maintenance of code and creation of add-on modules.

Creation, Maintenance, and Dissemination of XML-based Format Sustainability Assessment Documents

The Library of Congress wishes to take the growing collection of format description documents (FDDs) currently maintained in HTML and establish a workflow for creation, maintenance and dissemination of FDDs using XML Schema markup as the master encoding. The aim of the project is to develop an XML-based representation for FDDs that will support both efficient creation of FDDs and the current Web Site functionality. Based on this XML Schema, transformations from XML to both HTML and PDF will be developed to assist the Library of Congress in evaluating the integration of newly created FDDs into the Web Site and compliance with Library policies. http://www.digitalpreservation.gov/formats

Responsibilities:
   · Design and implementation of XML schema. (XML)
   · Design and development of XSLT transformations. (XML and XSLT)
   · Design and development of the initial document transformation. (XML and Perl)
   · Maintenance of code.

Netpreserve (IIPC) Web Management and Support

The aim of this project is to assess, install and maintain key Web 2.0 technologies for the International Internet Preservation Consortium's website. These technologies facilitate the exchange of ideas and documentation, and the interaction between the many institutions around the world, including several national libraries, that are part of the IIPC. http://www.netpreserve.org

Responsibilities:
   · Research, study and testing of Web 2.0 technologies and applications.
   · Installation and maintenance of selected applications and the main website.

Web Archive Tools

The aim of this project is to set up and test a suite of open-source tools, identify their requirements, and make recommendations for deploying them within the Library of Congress technical environment. The tools involved in the project are The Wayback Machine, Heritrix, NutchWAX, Hadoop and the 20th Century Search.

Responsibilities:
   · Research, study and testing of multiple applications.
   · Installation and maintenance of multiple Web Archiving applications.
   · Creation of reports for developers on desired future updates.

Archive Collections Metadata Extraction

The aim of this project is to create a suite of tools that will allow metadata extraction and compilation from several archived collections to support the Library of Congress' Metadata Object Description Schema cataloging efforts.

Responsibilities:
   · Implementation of metadata extraction application. (XML and Perl)
   · Regular modification due to requirement changes and additions.

Web Archiving and Retrieval Appliance (WARA)

WARA is a VMware appliance for web archiving and retrieval featuring Apache Tomcat, Wayback and Heritrix on Ubuntu.
WARA offers several of the leading web harvest, capture, and preservation software components from the Internet Archive, conveniently rolled up into one easy-to-use appliance.

Old Dominion University - Graduate Research Assistant

Library of Congress: Harvest Streaming Media With Heritrix and Retriever Tools

The aim of this project is to integrate the Web harvesting tool Heritrix with other retrieval software tools (MPlayer). The retrieval tool has the potential to download a variety of files that Heritrix currently has difficulty getting. In this phase of the project the focus is on downloading audio/video files, or streaming media. The tool will be responsible for listing and downloading all audio/video files that Heritrix did not get, and packing them into ARC files in the same format used by Heritrix.
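The packing step can be illustrated with a minimal Python sketch. It assumes the version 1 ARC record layout (a space-separated header line with URL, IP address, 14-digit archive date, content type and payload length, followed by the payload). The function name and sample values are hypothetical and are not taken from the project's code.

```python
from datetime import datetime, timezone


def arc_v1_record(url, ip, fetched_at, content_type, payload):
    """Build one ARC v1 style record (assumed layout, illustration only).

    Header line: URL, IP address, archive date (YYYYMMDDhhmmss),
    content type and payload length in bytes, separated by spaces.
    """
    timestamp = fetched_at.strftime("%Y%m%d%H%M%S")
    header = f"{url} {ip} {timestamp} {content_type} {len(payload)}\n"
    # The payload follows the header; a newline separates records.
    return header.encode("utf-8") + payload + b"\n"


# Hypothetical media file that the crawler missed.
record = arc_v1_record(
    "http://example.org/clip.mp3",
    "192.0.2.1",
    datetime(2006, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    "audio/mpeg",
    b"abc",
)
```

A real ARC file also begins with a file-description record and is usually gzip-compressed per record; both are omitted here.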

mod_oai - http://modoai.org

The aim of this project is to create the mod_oai Apache software module, which exposes content accessible from Apache Web servers via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The mod_oai project (Old Dominion University/Los Alamos National Laboratory) is funded by the Andrew W. Mellon Foundation. The Apache Web server defines an extensible module format that allows specific functionality to be incorporated directly into the Web server. The mod_oai module is able to respond to OAI-PMH requests pertaining to files made accessible by the Apache server.
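As an illustration of what such a request looks like, here is a minimal Python sketch that builds an OAI-PMH request URL. The six verbs and the metadataPrefix argument come from the OAI-PMH specification; the base URL and the helper name are hypothetical and do not describe how a particular mod_oai installation is configured.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url, verb, arguments=None):
    """Return an OAI-PMH GET request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # Arguments are passed as a dict because "from" is a Python keyword.
    query.update(arguments or {})
    return f"{base_url}?{urlencode(query)}"


# Hypothetical base URL for a server running the module.
request_url = build_oai_request(
    "http://www.example.org/modoai",
    "ListRecords",
    {"metadataPrefix": "oai_dc"},
)
```

A harvester would send this URL as an HTTP GET and parse the XML response.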

Teaching Assistant cs350

Course: cs350 - Introduction to Software Engineering
In charge of recitation lectures, grading assignments and creating the course website. http://www.cs.odu.edu/~cs350/sp05

Fence