I'm currently looking for software tools to collect and document different sources of (more or less) open data. It seems like CKAN (the software and/or the Datahub) could be the right choice.

The data I'm going to collect involves one central problem: the sources change their URLs etc. quite often and go offline after a few years. Therefore some kind of permanent cache or local storage where the original sources can be saved would be useful. The solution should include:

- a link to the original resource
- a link to the local (cached?) version
- all kinds of metadata

Is there a solution for CKAN or a similar platform?

asked 16 Aug '11, 18:37

FloE


The basic answer is: yes, there is a solution. We have been running into this problem in CKAN for a while (many links to data eventually break), so we have been looking for a caching method along the lines you suggest and have built an archiver as part of our 'QA' extension, ckanext-qa (we plan to factor out the archiver part).

This extension automatically downloads resources and caches them in a storage system such as S3 or Google Storage. We haven't turned this on fully on ckan.net / thedatahub yet because we need to fine-tune what we store (we don't need to cache TBs of Wikipedia dumps, for example), but we want to do so soon. If you are interested, please get in touch on the ckan-discuss mailing list.
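
To make the idea concrete, here is a minimal sketch of the download-and-cache step. It is not the ckanext-qa code: the function names `fetch` and `archive`, the record fields, the local-directory cache (instead of S3 or Google Storage) and the 50 MB default limit are all assumptions made for illustration. It keeps the three things asked for in the question: the original link, the cached copy, and free-form metadata.

```python
import hashlib
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path


def fetch(url, max_bytes=50 * 1024 * 1024):
    """Download a resource, refusing anything larger than max_bytes.

    The size limit is a hypothetical stand-in for "fine-tuning what we store".
    """
    with urllib.request.urlopen(url, timeout=30) as response:
        # Read one byte more than allowed so oversized bodies can be detected.
        data = response.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError(f"{url} exceeds the {max_bytes}-byte cache limit")
    return data


def archive(url, data, cache_dir, metadata=None):
    """Store data under its SHA-256 hash and write a JSON record next to it."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Naming the file by content hash means identical content is stored once.
    digest = hashlib.sha256(data).hexdigest()
    cached_path = cache_dir / digest
    cached_path.write_bytes(data)

    record = {
        "original_url": url,               # link to the original resource
        "cached_path": str(cached_path),   # link to the local copy
        "sha256": digest,
        "size": len(data),
        "archived_at": datetime.now(timezone.utc).isoformat(),
        "metadata": metadata or {},        # any further metadata
    }
    (cache_dir / (digest + ".json")).write_text(json.dumps(record, indent=2))
    return record
```

Usage would look like `record = archive(url, fetch(url), "cache", {"title": "..."})`. Because the cached copy is kept separately from the original URL, the record stays useful after the source moves or goes offline.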


answered 16 Aug '11, 19:59

rgrp ♦♦

edited 16 Aug '11, 20:00

thx for the perfect answer && solution

(17 Aug '11, 13:37) FloE
Last updated: 20 Jul, 10:55
