2006-01-09-library-text-mining.md

excerpt

categories

layout

title

created

permalink

<a href="http://www.csc.liv.ac.uk/~azaroth/">Rob Sanderson</a> Using the TeraGrid [1] and the SRB DataGrid [2], we have sufficient computational and storage facilities to run processing tasks that would normally be prohibitively expensive. By integrating text and data mining tools [3][4] within the Cheshire3 [5] information architecture, we can parse the natural language present in 20 million MARC records (the University of California’s MELVYL collection) and extract information to provide to search/retrieve applications. In this talk, we’ll discuss the results of applying new techniques to ‘old’ data.

conferences

code4lib 2006

post

Library Text Mining

1136872693

/conference/2006/sanderson/

Rob Sanderson

Using the TeraGrid [1] and the SRB DataGrid [2], we have sufficient computational and storage facilities to run processing tasks that would normally be prohibitively expensive. By integrating text and data mining tools [3][4] within the Cheshire3 [5] information architecture, we can parse the natural language present in 20 million MARC records (the University of California’s MELVYL collection) and extract information to provide to search/retrieve applications. In this talk, we’ll discuss the results of applying new techniques to ‘old’ data.

1: http://www.teragrid.org

2: http://www.sdsc.edu/srb

3: http://www.ailab.si/orange

4: http://www-tsujii.is.s.u-tokyo.ac.jp/

5: http://www.cheshire3.org/

Rob Sanderson, ([email protected])
