Table of Contents
- Advanced search
- Advanced query syntax
- Optical Character Recognition
- How to correct OCR text
- Technical requirements
You can perform a simple search by typing keywords in the search box and clicking "Search". The search engine will return results that include all of your search terms.
You can search for an exact phrase by placing quotation marks around your search terms, for example "new plymouth".
Boolean operators AND, OR and NOT can be used to refine your search results. AND (include all of the words) and NOT (without the words) narrow your search; OR (with at least one of the words) broadens your search. For example, plymouth NOT new will retrieve articles about Plymouth but not New Plymouth. You can group clauses using parentheses, for example (hamilton OR waikato) AND river.
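The effect of the Boolean operators can be sketched with ordinary set logic. The snippet below is only an illustration: the article snippets are invented, the helper `words` is hypothetical, and the collection's real search engine is not implemented this way.

```python
import re

def words(text):
    """Return the set of lowercase words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))

# Hypothetical article snippets, invented for illustration only.
articles = [
    "A ship arrived at New Plymouth",
    "The Plymouth docks were busy",
    "The Waikato river flooded near Hamilton",
    "Hamilton council met on Tuesday",
]

# plymouth NOT new
plymouth_not_new = [
    a for a in articles
    if "plymouth" in words(a) and "new" not in words(a)
]

# (hamilton OR waikato) AND river
grouped = [
    a for a in articles
    if ("hamilton" in words(a) or "waikato" in words(a)) and "river" in words(a)
]

print(plymouth_not_new)
print(grouped)
```

NOT and AND shrink the list of matches, while OR on its own would enlarge it, which is why the parentheses matter in the second query.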
On the search results page, the "Search limited to" area shows any filters that were applied to the search. You can remove these by clicking the "x" icons. The "Refine search" area shows the most common values occurring in various categories in the search results. Selecting one of these facets applies it as a search filter.
Advanced search
The Advanced search tab allows you to limit your search results by:
- One or more publications
- A date range
It also allows you to search within full text, comments, or tags, choose the number of search results you want displayed on each page, and choose whether you would like text or image previews displayed with your search results.
Advanced query syntax
Query terms can be boosted to increase their importance in the search, changing the order of the search results. This is done by adding "^" and a boost factor at the end of the term, e.g. hamilton river^2 will treat "river" as more important than "hamilton" when ranking the search results returned.
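As a rough illustration of how a boost factor can change ranking, the toy scorer below multiplies each term's occurrence count by its boost. The functions `parse_boosts` and `toy_score` are hypothetical, and the formula is an assumption made for this sketch; the real search engine's scoring is more sophisticated and is not described here.

```python
def parse_boosts(query):
    """Split a query into (term, boost) pairs; terms without '^' get boost 1.0."""
    pairs = []
    for token in query.split():
        term, sep, factor = token.partition("^")
        pairs.append((term.lower(), float(factor) if sep else 1.0))
    return pairs

def toy_score(text, query):
    """Toy relevance score: sum of (boost x number of occurrences) per term."""
    tokens = text.lower().split()
    return sum(boost * tokens.count(term) for term, boost in parse_boosts(query))

# Hypothetical snippets, invented for illustration only.
docs = ["hamilton council meeting", "the river rose overnight"]

for query in ("hamilton river", "hamilton river^2"):
    ranked = sorted(docs, key=lambda d: toy_score(d, query), reverse=True)
    print(query, "->", ranked)
```

With the plain query both snippets score the same; with river^2 the snippet mentioning "river" moves to the top.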
Wildcard searches can be performed by including "?" (single character wildcard) or "*" (multiple character wildcard) in the query term. For example, hamilt* will match all words starting with "hamilt".
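The two wildcards behave much like shell-style patterns, so their effect can be tried out with Python's standard `fnmatch` module. This is only an approximation for experimenting: `wildcard_match` is a hypothetical helper, and `fnmatch` also gives special meaning to square brackets, which the search syntax described here does not mention.

```python
from fnmatch import fnmatchcase

def wildcard_match(word, pattern):
    """Case-insensitive match where '?' is one character and '*' is any run of characters."""
    return fnmatchcase(word.lower(), pattern.lower())

print(wildcard_match("Hamilton", "hamilt*"))      # '*' covers "on"
print(wildcard_match("Hamiltonian", "hamilt*"))   # '*' covers "onian"
print(wildcard_match("roam", "r?am"))             # '?' covers exactly one letter
print(wildcard_match("ram", "r?am"))              # no letter for '?' to cover
```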
Fuzzy searching can be done by adding "~1" at the end of individual terms, e.g. roam~1 will find terms like "foam" and "roams" as well as "roam". This can help to compensate for errors in the text due to the Optical Character Recognition process.
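One way to think about "~1" is as an edit distance of at most one: a single inserted, deleted, or substituted character. The sketch below computes the classic Levenshtein distance to show why "foam" and "roams" qualify. It assumes this is the measure used; the exact measure the search engine applies is not stated here and may differ (for example, by also counting swapped adjacent letters as one edit).

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]

def fuzzy_match(term, candidate, max_edits=1):
    """True when candidate is within max_edits edits of term."""
    return edit_distance(term, candidate) <= max_edits

for candidate in ("roam", "foam", "roams", "rooms"):
    print(candidate, edit_distance("roam", candidate), fuzzy_match("roam", candidate))
```

An OCR misreading such as a single wrong letter usually falls within one edit of the intended word, which is why fuzzy searching helps with uncorrected text.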
Proximity searching allows you to search for words that appear close together in the text. For example, "John Smith"~3 will find results containing both the words "John" and "Smith" where they are no more than 3 words apart. So as well as finding "John Smith" it will also find "John J. Smith", "John Frederick Smith", "John Fullerton-Smith", and even "Smith, John".
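The idea behind proximity searching can be sketched as a check on word positions. The function below is a simplified interpretation, not the search engine's actual algorithm: it assumes the two words may appear in either order with at most `max_between` other words between them, and a real engine may count the distance differently, particularly when the words are reversed. The name `within` is hypothetical.

```python
import re

def within(text, first, second, max_between):
    """True if first and second occur with at most max_between words between them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos_first = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_second = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(
        abs(i - j) - 1 <= max_between
        for i in pos_first
        for j in pos_second
        if i != j
    )

samples = [
    "John Smith",
    "John J. Smith",
    "John Fullerton-Smith",
    "Smith, John",
    "John went to town to see Mr Smith",
]
for text in samples:
    print(text, "->", within(text, "John", "Smith", 3))
```

The last sample has six words between the two names, so it falls outside a distance of 3 under this interpretation.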
Optical Character Recognition
Optical Character Recognition, or OCR, is a process by which software reads a page image and translates it into a text file by recognising the shapes of the letters (The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials).
OCR enables searching of large quantities of full-text data, but it is never 100% accurate. The level of accuracy depends on the print quality of the original document, its condition at the time of microfilming, the level of detail captured by the microfilm scanner, and the quality of the OCR software. Documents with poor quality paper, small print, mixed fonts, multiple column layouts, or damaged pages may have poor OCR accuracy.
The searchable text and titles in this collection have been automatically generated using OCR software. They may not have been manually reviewed or corrected.
To look at the OCR text, select the page or article and click the "Text of this page" or "Text of this article" link.
How to correct OCR text
The text correction interface is accessed by clicking the "Correct this text" link when viewing section text. This interface is split into two parts: the right side shows the page images that make up the document, and the left side is used for editing the lines of text.
When you move your mouse over the page images in the right pane, the blocks making up the pages will highlight. You can scroll this view by dragging with the mouse, or zoom in/out using the buttons above the viewer. Clicking a highlighted block will select it and load a form for editing that block into the left pane.
Correct the text line by line. A red box is displayed in the right pane to help you determine what text should be included in the line. Once you have finished correcting text, click "Save". The changes you make will take effect immediately. Alternatively, clicking the "Cancel" button will discard any unsaved changes you have made.
You can then make further corrections to the same block, move on to the next block by clicking the "Next" button, select another block in the right pane, or exit the text correction view by clicking the "Return to viewing mode" link. Clicking "Save & exit" instead of "Save" will save the changes and then return you to the normal viewing mode automatically.
Hint: Many web browsers include spell checking functionality and this can assist with your text correction by identifying misspelt words. If your web browser does not have this functionality, it's likely there is a spell checking add-on available (see your web browser's help for information on how to install add-ons).
Articles can be printed directly from your web browser, after selecting the article and clicking the "Clip this article" link.
If available, PDF versions of documents and pages can be downloaded for printing.
Technical requirements
In general, you only need a common web browser like Chrome, Firefox, Internet Explorer, Safari, Opera or Microsoft Edge to search and browse this collection. To view or print PDFs, you will also need a PDF viewer like Adobe Reader.