Terminology

It can get confusing to work with all these names. The whole thing sounds like some very odd farming project, with terms like harvests and seeds...

Sources

Sources are a sort of publication. A source can have multiple seeds (URLs), it has a publisher, and it needs an assigned contract, otherwise it might not be harvested.

Seeds

Seeds are just a weird way of saying URL. Each seed belongs to a source, and a source can have multiple seeds. Seeds have different rules for how they can be harvested, based on technical necessities.
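
The relationship between publishers, sources and seeds can be sketched as a small data model. This is only an illustration, not the project's actual code: the class names, field names and example URLs below are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publisher:
    name: str


@dataclass
class Seed:
    url: str  # a seed is just a URL


@dataclass
class Source:
    name: str
    publisher: Publisher          # every source has a publisher
    has_contract: bool = False    # hypothetical flag for the assigned contract
    seeds: List[Seed] = field(default_factory=list)

    def add_seed(self, url: str) -> Seed:
        """Attach one more URL to this source."""
        seed = Seed(url=url)
        self.seeds.append(seed)
        return seed


# Hypothetical example data: one source with two seeds.
publisher = Publisher(name="Example Publisher")
source = Source(name="Example Magazine", publisher=publisher)
source.add_seed("https://example.com/")
source.add_seed("https://example.com/archive/")
```

One source holds a list of seeds, while each seed lives in exactly one source's list, which mirrors the one-to-many relationship described above.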

Voting round

The process of deciding whether a source should be archived or not. This process is sometimes repeated.

Curator

Somebody who checks the content of the sources being archived. Masters of the archive.

QA check

A quality assurance check that happens after a source has been accepted into the archive. It mainly looks at content changes and the technical side of harvesting.

Publishers

They publish sources. They need to sign a contract unless they have an open source licence.

Harvests

A single instance of downloading seeds and archiving them. Might happen automatically in the future.

Harvest blacklist

Some publishers don’t want their resources to be harvested, so they are blacklisted. Miserable people, those are.

Visibility blacklist

Some sites are harvested but don’t have a contract yet, so they must never be displayed on the web.
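
The two blacklists answer different questions: may a site be downloaded at all, and may the downloaded copy be shown. Here is a minimal sketch of that distinction. The names `HARVEST_BLACKLIST`, `VISIBILITY_BLACKLIST`, `can_harvest` and `can_display` are hypothetical, and matching by hostname is an assumption; the real system may decide this differently.

```python
from urllib.parse import urlparse

# Hypothetical example entries, keyed by hostname.
HARVEST_BLACKLIST = {"no-harvest.example"}      # publisher refused harvesting
VISIBILITY_BLACKLIST = {"no-contract.example"}  # harvested, but no contract yet


def _host(url: str) -> str:
    """Return the hostname of a URL, or an empty string if there is none."""
    return urlparse(url).hostname or ""


def can_harvest(url: str) -> bool:
    """A seed may be downloaded unless its host is on the harvest blacklist."""
    return _host(url) not in HARVEST_BLACKLIST


def can_display(url: str) -> bool:
    """An archived copy may be shown unless its host is on the visibility blacklist."""
    return _host(url) not in VISIBILITY_BLACKLIST
```

With these example entries, `https://no-contract.example/page` can be harvested but not displayed, while `https://no-harvest.example/` is never downloaded in the first place.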