Wednesday, 21 July 2010

Legal Māori Text Corpus released

Victoria University’s Faculty of Law is proud to announce that researchers and staff involved with the Legal Māori Project have just completed two of their major funded works: the Legal Māori Corpus and the Legal Māori Lexicon.

The Legal Māori Corpus is an unprecedented collection of modern and historical Māori language texts, comprising 8 million words in total. “When we started the project two years ago we had no idea the final size of our corpus would be so great, and to our knowledge, it is the largest structured corpus of Māori language texts ever compiled,” says project co-leader and Faculty lecturer, Māmari Stephens.

All pre-1910 corpus files are now publicly available for researchers to analyse patterns of language use and vocabulary, and digitised versions of the source documents are available for download and re-use. The post-1910 texts will be made available by the end of the Project once copyright permissions are gained.

The Legal Māori Lexicon is a glossary of all legal terms identified during the course of the project so far. Almost 2000 terms have been collated with their English translations and will also soon be publicly available. These terms, and their frequency of appearance in the Corpus, will form the basis of the final dictionary, due for completion in early 2012.

Says Māmari Stephens: “I would like to take this opportunity to thank the hard work put in by all involved with getting these outputs produced on time and in accordance with our FRST agreement. Many of these contributors are either current or former students of the Law Faculty, and I am grateful beyond words to all of them.”

They are: Assistant Professor Mary Boyce, University of Hawai’i; Tai Ahu; Dulce Piacentini; Paranihia Walker; Max Sullivan; Phoebe Monk; Emma Kuperus; Ed Willis; Rachael Hoare; Debbie Broughton; Joeliee Seed-Pihema; Rama Chadwick; Hannah Northover; Harvey Buchman; Dave Moskovitz.

Ka nui aku mihi matakuikui ki a rātou. Ka haere tonu te mahi, ka puta mai tonu ngā hua. (My heartfelt thanks go to them. The work continues, and its fruits continue to emerge.)

For further information, please contact

Nicky Saker
Communications Adviser
Faculty of Law
(04) 463 6310

Tuesday, 13 July 2010


Linkypedia is a site that provides data on how web content is used in Wikipedia.

“linkypedia aims to help reveal the connections between digital curation communities. To let cultural institutions get a handle on the rich metadata and contextual information found in wikipedia and to serve as a sign post for rich seams of primary resource material on the web.”

It is of particular interest to us as the Library’s New Zealand Electronic Text Centre collection is listed with 1,752 links coming from Wikipedia. That’s fewer than the Library of Congress (4,750) but more than the British Library (1,379). We already knew from our web stats that the collection receives a lot of referrals from Wikipedia, but this site lets us delve a bit further into the detail.

Not unexpectedly, the largest number of links from Wikipedia to content in the NZETC collection comes from pages about the Second World War. The Wikipedia page on Operation Crusader, the operation to relieve the 1941 siege of Tobruk, contains 49 footnote links to digitised editions of The Official History of New Zealand in the Second World War, many to paragraphs and sections in “The Relief of Tobruk” by W. E. Murphy.

Most Wikipedia pages only have one or two links to NZETC digital texts, but a couple of other highly linked topics include early New Zealand politics and the 1860s Pai Marire movement. The page on Pai Marire is a nice example of Wikipedia authors drawing on a range of source material. Because that material is publicly available online, they can link to it directly in their footnotes and offer the reader opportunities for further reading and research. The article uses footnotes to reference two books in the NZETC collection, one in Auckland’s ENZB collection and an article in the digitised edition of Te Ao Hou.

Great to see people making use of the resources we provide.