Many 'works' (an article, a presentation, etc.) in Citebase are duplicated across or within the source archives. This causes problems for users when searching, as the same work may appear multiple times, and makes it harder to identify the true citation and download impact of works (within the limits of Citebase).
Records are harvested from source archives and do not contain specific metadata to identify them as duplicates of other records. Therefore, Citebase uses a simple rule to de-duplicate (tie multiple records together as a single work): records are treated as a single work if they share the same first author and a similar title.
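The rule above can be sketched roughly as follows. This is only an illustration, not Citebase's actual implementation: the page does not say how "similar" titles are detected, so the sketch assumes two titles are similar when they match after lowercasing and removing punctuation and extra whitespace. The record fields ('id', 'authors', 'title') and the function names are hypothetical.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def group_into_works(records):
    """Group records that share a first author and a normalized title.

    Assumption: each record is a dict with the hypothetical keys
    'id', 'authors' (a list, first author first) and 'title'.
    Returns a list of works, each a list of record ids.
    """
    works = defaultdict(list)
    for record in records:
        key = (normalize(record["authors"][0]), normalize(record["title"]))
        works[key].append(record["id"])
    return list(works.values())


records = [
    {"id": "a1", "authors": ["Brody, T.", "Other, A."], "title": "Citation  Impact: A Study"},
    {"id": "b7", "authors": ["brody, t"], "title": "Citation impact - a study"},
    {"id": "c3", "authors": ["Smith, J."], "title": "Citation impact: a study"},
]
works = group_into_works(records)
```

In this example the first two records fall into one work, while the third stays separate because its first author differs. A stricter or looser notion of title similarity (for example an edit-distance threshold) could be substituted in normalize without changing the grouping logic.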
The total citation and download counts for a work are inherited by all duplicates of that work. These totals are used when ranking search results, and are displayed in the summary table on the abstract page.
If duplicates are not identified, references from the same work may be counted multiple times, or the same reference may be incorrectly linked to multiple works. As a result, the citation impact of each record is overstated.
To count citations (links between works) more accurately, Citebase counts repeated references from the same work only once.
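One way to picture this counting is to map every record to its work and keep each (citing work, cited work) pair only once. The sketch below is a minimal illustration under that assumption; the function name, the `work_of` mapping and the identifiers are hypothetical and not taken from Citebase.

```python
def count_citations(references, work_of):
    """Count citations per work, counting each citing work only once.

    references: iterable of (citing_record_id, cited_record_id) pairs.
    work_of: dict mapping a record id to the id of its de-duplicated work
             (hypothetical structure, assumed for this sketch).
    """
    # A set keeps one link per (citing work, cited work) pair, so repeated
    # references from duplicates of the same work collapse into one.
    links = {(work_of[citing], work_of[cited]) for citing, cited in references}

    counts = {}
    for _, cited_work in links:
        counts[cited_work] = counts.get(cited_work, 0) + 1
    return counts


work_of = {"a1": "W1", "a2": "W1", "b1": "W2", "c1": "W3"}
references = [
    ("a1", "b1"),  # W1 cites W2
    ("a2", "b1"),  # duplicate record of W1 cites W2 again
    ("a1", "b1"),  # repeated reference within the same record
    ("c1", "b1"),  # W3 cites W2
    ("c1", "a2"),  # W3 cites W1
]
citation_counts = count_citations(references, work_of)
```

Here W2 receives two citations (from W1 and W3) rather than four, and W1 receives one.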
Where a work is available from multiple sources, user access is spread across those instances of the work. Although appearing in many places may increase a work's download impact, each record represents only part of the total download impact. As a result, the download impact of each record is understated.
Citebase therefore counts all downloads of all duplicates towards the total full-text downloads for the work.
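As a rough sketch of this aggregation, per-record download counts can be summed by work, and each record can then report its work's total. The data structures and names below are assumptions made for illustration only.

```python
def total_downloads(downloads, work_of):
    """Sum per-record download counts into a total per work.

    downloads: dict mapping record id to its own download count.
    work_of: dict mapping record id to work id (hypothetical structure).
    """
    totals = {}
    for record_id, count in downloads.items():
        work = work_of[record_id]
        totals[work] = totals.get(work, 0) + count
    return totals


def inherited_downloads(downloads, work_of):
    """Give every record the download total of the work it belongs to."""
    totals = total_downloads(downloads, work_of)
    return {record_id: totals[work_of[record_id]] for record_id in downloads}


work_of = {"a1": "W1", "a2": "W1", "b1": "W2"}
downloads = {"a1": 40, "a2": 15, "b1": 7}
per_work = total_downloads(downloads, work_of)
per_record = inherited_downloads(downloads, work_of)
```

In this example both duplicates of W1 report 55 downloads, even though neither record reached that figure on its own.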
Citebase Search is Copyright 2005-2012 Tim Brody <tdb01r@ecs.soton.ac.uk>, University of Southampton. Got a comment/question about Citebase? Please email me!
Full-texts, references and metadata are the copyright of the named author(s) and/or the respective publisher(s).