The Internet Archive (IA) is a non-profit digital library, a top-200 website, and a repository of over 60PB (unique) of digital information. It runs across an integrated cluster of over 1200 VMs on over 700 “bare-metal” physical machines in multiple self-owned and operated data centers, all serving to advance our goal of “Universal Access to All Knowledge.”

The Internet Archive is seeking a Web Application Developer for its Archive-It Group. The Archive-It team is responsible for maintaining a web application which automates high-quality captures of content from the web. An ideal candidate demonstrates independence and initiative, is a problem solver, works well autonomously, and has experience with the Unix/Linux command line and broad experience designing and executing web application features. Additionally, the ideal candidate is open to helping advance the state of preserving web-published content, working on the platform which drives a large portion of global web capture.

We are seeking a full-time Web Application Developer to help grow our suite of services for collecting, preserving, and providing access to the massive trove of historically important data now published on the web, while working in partnership with a global set of institutions to provide web, data, access, research, and preservation services to users.

Skills and responsibilities

The successful candidate will work in the Archive-It Group in support of building and maintaining high-quality software for the collection, preservation, and accessibility of web content. The role will help maintain a toolset and APIs which automate web capture using open source technologies and platforms. An ideal candidate is interested in helping user interface designs come to life, developing harvest techniques and tools to enable archival capture, and re-rendering rich media, streaming content, social media, and traditional web page content. This role contributes to defining deployment architectures and workflows, managing data at scale, and monitoring production systems.

Essential Job Functions:
  • Maintaining and adding new features to an existing web application, written in AngularJS (1.x) and Python/Django
  • Opportunities to contribute to crawling technologies and distributed database systems
  • Delivering on commitments with deadlines and project timelines and working in a collaborative team of engineers and project/product managers

  • Location: San Francisco
  • This job is remote friendly.

Minimum qualifications

  • Bachelor's Degree in Computer Science or a related field or equivalent experience: 3 to 4 years of experience in software development
  • Strong experience with Python and related debugging tools (pdb, etc.) strongly preferred
  • Strong knowledge of HTML, JavaScript and Web technologies in general
  • Knowledge of building and deploying web applications, databases, web-hosted services
  • Ability to work in, and enjoy, a loosely structured work environment

Preferred qualifications

  • Cluster computing experience is preferred, especially familiarity with Hadoop and related technologies and tools
  • Experience working with JavaScript and HTML in a large-scale application preferred
  • Experience or familiarity with Java preferred
  • Experience with applications designed to display archived web content
  • Experience with development environments and system monitoring/administration tools
  • Experience with open source practices, version control, and code review
  • Experience with Atlassian tool sets
  • Flexibility and a sense of humor are a plus